
I was watching a video which uses integration to show that the area under the standard normal distribution function is equal to $1$.

The function was squared which resulted in two variables $x$ and $y$.

This was converted to polar coordinates by $x=r\cos\theta$ and $y=r\sin\theta$.

The next line was $$dx\,dy=r\,dr\,d\theta$$

I have no idea where that came from.

I would have thought that $dx=-r\sin\theta\ d\theta$ and $dy=r\cos\theta\ d\theta$

And hence $$dx\,dy=-r^2\sin\theta\cos\theta\ d\theta\ d\theta$$

Where did that expression come from?

DMcMor
Kantura

2 Answers


The trouble here is that $dx\,dy$ is not a multiplication, but rather a wedge product more properly written as $dx\wedge dy$ - which basically means "measure area as projected upon the $xy$ plane."

Intuitively, wedge products make an algebra of area: given two vectors, you consider the parallelogram with those sides, together with some notion of sign (or, in three or more dimensions, of the "direction" of the area). This gets combined with differential forms to create a notion of "infinitesimal" area.

It turns out that calculating how wedge products change under substitution is just calculating the Jacobian determinant of the relevant map, but you can get at this by elementary means too, just by noting that $dx\wedge dx = 0$ (since a parallelogram built from two of the same vector has no area) and $dx\wedge dy = -dy\wedge dx$ (which is a consequence of the previous rule, but can also be imagined as flipping the orientation, and thus the signed area, of a parallelogram). This product is also distributive, like multiplication.

As for manipulating these things, it's easy enough to do by hand using the above rules. Start out by taking total derivatives: $$dx=-r\sin(\theta)\,d\theta+\cos(\theta)\,dr$$ $$dy=r\cos(\theta)\,d\theta+\sin(\theta)\,dr$$ Note that this captures that $x$ and $y$ depend on both parameters - you cannot say anything useful about the rate of change of $x$ only knowing the rate of change of $\theta$. You need to know how $r$ is changing too.

You can then take the wedge product of these two equations to say: $$dx\wedge dy = (-r\sin(\theta)\,d\theta+\cos(\theta)\,dr) \wedge (r\cos(\theta)\,d\theta+\sin(\theta)\,dr).$$ One may then use distributivity to say: $$dx\wedge dy = -r^2\sin(\theta)\cos(\theta)\,d\theta \wedge d\theta + r\cos(\theta)^2\,dr\wedge d\theta - r\sin(\theta)^2\,d\theta\wedge dr + \cos(\theta)\sin(\theta)\,dr\wedge dr$$ We can eliminate the $d\theta\wedge d\theta$ and $dr\wedge dr$ terms, since they are zero, as well as replace $d\theta \wedge dr$ by $-dr \wedge d\theta$ to simplify this as: $$dx\wedge dy = (r\cos(\theta)^2 + r\sin(\theta)^2)dr\wedge d\theta=r\cdot dr\wedge d\theta$$ which is as desired.
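The coefficient of $dr\wedge d\theta$ obtained above is $x_r y_\theta - x_\theta y_r$, so the result can be sanity-checked numerically. The following is a minimal sketch, not part of the original answer; the function names `polar_to_cartesian` and `wedge_coefficient` and the step size `h` are illustrative choices, and the partial derivatives are only approximated by central differences.

```python
import math

def polar_to_cartesian(r, theta):
    """The substitution x = r cos(theta), y = r sin(theta)."""
    return r * math.cos(theta), r * math.sin(theta)

def wedge_coefficient(r, theta, h=1e-6):
    """Central-difference estimate of x_r * y_theta - x_theta * y_r,
    i.e. the coefficient of dr ^ dtheta in dx ^ dy."""
    x_rp, y_rp = polar_to_cartesian(r + h, theta)
    x_rm, y_rm = polar_to_cartesian(r - h, theta)
    x_tp, y_tp = polar_to_cartesian(r, theta + h)
    x_tm, y_tm = polar_to_cartesian(r, theta - h)
    x_r = (x_rp - x_rm) / (2 * h)
    y_r = (y_rp - y_rm) / (2 * h)
    x_t = (x_tp - x_tm) / (2 * h)
    y_t = (y_tp - y_tm) / (2 * h)
    return x_r * y_t - x_t * y_r

# The estimate should be close to r at every sample point, whatever theta is.
for r, theta in [(0.5, 0.3), (2.0, 1.2), (3.0, 4.0)]:
    print(r, theta, wedge_coefficient(r, theta))
```

Each printed estimate should agree with the corresponding $r$ up to finite-difference error, independently of $\theta$.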

Although this is mostly algebraic, it's worth remembering that there is a sensible way to read this equation:

If you take a small region near any point, the area of that region when projected onto the $(x,y)$ plane is $r$ times the area of the projection on the $(r,\theta)$ plane.

This fact is extremely relevant if we're trying to do something like integration, where the task is "sum up a bunch of little parts weighted by area", and it is precisely the right thing to reason about for substitution. The fact that ordinary multiplication can't be interpreted like this (and doesn't really make sense here) is why wedge products are needed in the first place. (There are other ways you might interpret the equation, but this is the most literal.)

Milo Brandt

There is a theorem that explains how to change the integration variables. It states:

Theorem: Let $D$ and $D^*$ be elementary regions of the plane, and let $T: D^* \to D$ be a bijective, continuously differentiable function. Then for any integrable function $f:D\to \mathbb{R}$ it holds that $$ \iint_D f(x,y) \, dx\, dy = \iint_{D^*} f(x(u,v),y(u,v)) \left|\frac{\partial(x,y)}{\partial(u,v)} \right| \, du \, dv $$ where $T$ is the transformation given by $x = x(u,v)$ and $y = y(u,v)$.
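To connect the theorem with the integral from the question, here is a rough numerical sketch (an illustration added here, not taken from the video or the original answer). It estimates $\iint e^{-(x^2+y^2)/2}\,dx\,dy$ in polar coordinates with the factor $r$ included. The function name `gaussian_polar_integral`, the cutoff `r_max`, and the number of steps `n` are assumptions chosen for illustration; the infinite plane is truncated to a disc, which changes the value only negligibly.

```python
import math

def gaussian_polar_integral(r_max=10.0, n=10_000):
    """Midpoint-rule estimate of the integral of exp(-(x^2 + y^2)/2)
    over the disc r <= r_max, computed in polar coordinates."""
    dr = r_max / n
    radial = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        # integrand exp(-r^2/2) times the Jacobian factor r
        radial += math.exp(-r * r / 2) * r * dr
    # the integrand does not depend on theta, so the theta integral is 2*pi
    return 2 * math.pi * radial

total = gaussian_polar_integral()
print(total, 2 * math.pi)                        # squared integral vs. 2*pi
print(math.sqrt(total), math.sqrt(2 * math.pi))  # one-dimensional integral
```

The estimate should be close to $2\pi$, so the one-dimensional integral of $e^{-x^2/2}$ is close to $\sqrt{2\pi}$, which is exactly the normalizing constant of the standard normal density. Dropping the factor `r` from the loop gives a different value, which is one way to see that the Jacobian cannot be omitted.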

This theorem and the definition given in the last paragraph of this answer (with some determinant calculation on your part) explain why the $r$ term appears in that integral. However, in the next paragraphs I will try to give a more in-depth understanding of what the theorem is saying.

The $dx\, dy$ that you see at the end of a double integral means something like "move along $x$ and then move along $y$". Said differently, you are filling the plane with a grid along both of these axes. Now, when you use polar coordinates you are dividing the plane into a different grid, the "polar" grid.

What the integral does is "count" the volume of the bar whose base is a grid cell, produced by moving along the axes, and whose height is the value of the function. When you use rectangular coordinates the area of a grid cell is always the same (i.e. moving one unit in $x$ and one in $y$ always gives area 1). However, this is not the case when you use polar coordinates, as the area of a polar grid cell does depend on the radius. To compensate for this, you put the $r$ beside the differentials, indicating that a "square" in polar coordinates has an area that depends on the radius.
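The dependence of a polar grid cell's area on the radius can be made concrete with a small sketch (added for illustration; the name `polar_cell_area` and the sample step sizes are assumptions). The region between radii $r$ and $r+dr$ and angles $\theta$ and $\theta+d\theta$ has exact area $\tfrac12\left((r+dr)^2-r^2\right)d\theta$, which the code compares with $r\,dr\,d\theta$.

```python
import math

def polar_cell_area(r, dr, dtheta):
    """Exact area of the cell between radii r and r + dr
    and between angles theta and theta + dtheta."""
    return 0.5 * ((r + dr) ** 2 - r ** 2) * dtheta

dr, dtheta = 1e-3, 1e-3
for r in [0.5, 1.0, 2.0, 4.0]:
    exact = polar_cell_area(r, dr, dtheta)
    approx = r * dr * dtheta
    # same dr and dtheta every time, yet the area grows with r
    print(r, exact, approx, exact / approx)
```

The ratio equals $1 + \frac{dr}{2r}$, so it tends to 1 as $dr$ shrinks, while the cell area itself scales with $r$ even though $dr$ and $d\theta$ are held fixed.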

The above explanation can be generalized and formalized via the Jacobian determinant. The transformation to polar coordinates is given by $x = r\cos \theta$ and $y=r\sin \theta$, but more generally we can consider an arbitrary transformation $x = x(u,v),\ y = y(u,v)$ and define the following determinant: $$ \frac{\partial(x,y)}{\partial(u,v)} := \det\begin{pmatrix} \frac{\partial x}{\partial u}&\frac{\partial x}{\partial v}\\ \frac{\partial y}{\partial u}&\frac{\partial y}{\partial v}\\ \end{pmatrix} $$ Remembering that the determinant of a $2\times 2$ matrix represents a (signed) area, we have that $\left|\frac{\partial(x,y)}{\partial(u,v)} \right|$ accounts for the "change" of area mentioned earlier. Namely, this factor expresses how the area of the new "square" under the transformation compares to the area of the original square in Cartesian coordinates. This is a very loose explanation and a sketch of the proof of the theorem stated at the beginning.
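As a worked instance for the polar case (added for illustration), take $(u,v)=(r,\theta)$, so that the rows of the matrix are $\left(\frac{\partial x}{\partial r}, \frac{\partial x}{\partial \theta}\right)$ and $\left(\frac{\partial y}{\partial r}, \frac{\partial y}{\partial \theta}\right)$. With $x=r\cos\theta$ and $y=r\sin\theta$ this gives $$ \frac{\partial(x,y)}{\partial(r,\theta)} = \det\begin{pmatrix} \cos\theta & -r\sin\theta\\ \sin\theta & r\cos\theta \end{pmatrix} = r\cos^2\theta + r\sin^2\theta = r $$ Since $r\ge 0$, the absolute value is also $r$, which is where $dx\,dy = r\,dr\,d\theta$ comes from.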