14

Suppose $M$ is a $(2k+1)$-dimensional manifold on which a 1-form $\alpha$ is defined. $M$ is called a contact manifold if the distribution arising from $\alpha$ is nowhere integrable, i.e. if: $$\xi_q=\{v\in T_qM:\alpha(v)=0\}$$ is a distribution that admits no integral manifold through any point of $M$. I have read that this is equivalent to: $$\alpha\wedge(d\alpha)^k\neq0.$$ How do I prove this?
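A concrete sanity check may help before attempting the proof: for the standard contact form $\alpha=dz-y\,dx$ on $\mathbb{R}^3$ (so $k=1$), one can evaluate the permutation-sum expansion of $\alpha\wedge d\alpha$ on the standard basis. The Python sketch below is illustrative (helper names are my own); it uses the unnormalized expansion, since the combinatorial normalization constant does not affect whether the result vanishes:

```python
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation given as a tuple of 0-based values."""
    s, seen = 1, set()
    for i in range(len(p)):
        if i in seen:
            continue
        j, length = i, 0
        while j not in seen:
            seen.add(j)
            j = p[j]
            length += 1
        if length % 2 == 0:  # even-length cycle = odd number of transpositions
            s = -s
    return s

# alpha = dz - y dx on R^3; components in the basis (dx, dy, dz) at a point with y = y0:
y0 = 5.0
alpha = [-y0, 0.0, 1.0]
# d(alpha) = dx ^ dy, stored as the antisymmetric matrix D[i][j] = d(alpha)(e_i, e_j):
D = [[0.0, 1.0, 0.0],
     [-1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]

# Unnormalized expansion of alpha ^ d(alpha) on (e_1, e_2, e_3):
#   sum over sigma in S_3 of sgn(sigma) * alpha(v_sigma(1)) * d(alpha)(v_sigma(2), v_sigma(3))
total = sum(perm_sign(p) * alpha[p[0]] * D[p[1]][p[2]] for p in permutations(range(3)))
print(total)  # -> 2.0, independent of y0, so alpha is a contact form
```

Only the permutations feeding $e_3$ (the vector off the kernel of $\alpha$) to $\alpha$ survive, which is the pattern the proof below exploits.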

MickG
  • 9,085

3 Answers

10

This is false as stated. The contact condition is stronger than non-integrability: it is often described as being maximally non-integrable.

Given an integrable distribution $\xi$ defined by a 1-form $\alpha$, use that $$d\alpha(X,Y) = X\alpha(Y) - Y\alpha(X) - \alpha([X,Y]).$$ So if $X,Y \in \xi$, then $d\alpha(X,Y) = 0$, since integrability gives $[X,Y] \in \xi$ and so all three terms vanish. So in particular $\alpha \wedge d\alpha = 0$ everywhere.

For 3-manifolds, the contact condition is equivalent to non-integrability, because non-integrability is the same thing as $\alpha \wedge d\alpha \neq 0$. For higher-dimensional manifolds the contact condition can be reinterpreted as "$d\alpha|_{\xi}$ is non-degenerate" - about as non-integrable as you can get, given that integrability is equivalent to $d\alpha|_{\xi} = 0$.
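The coordinate identity $d\alpha(X,Y)=X\alpha(Y)-Y\alpha(X)-\alpha([X,Y])$ can also be checked symbolically, assuming sympy is available; the following sketch (with illustrative names) verifies it on $\mathbb{R}^3$ for arbitrary smooth components:

```python
import sympy as sp

n = 3
xs = sp.symbols('x1:4')
# Arbitrary smooth components for the 1-form alpha and the vector fields X, Y:
a = [sp.Function('a%d' % i)(*xs) for i in range(n)]
X = [sp.Function('X%d' % i)(*xs) for i in range(n)]
Y = [sp.Function('Y%d' % i)(*xs) for i in range(n)]

def vec(V, f):
    """Apply the vector field with components V to the function f."""
    return sum(V[i] * sp.diff(f, xs[i]) for i in range(n))

# LHS: d(alpha)(X, Y) = sum_{i<j} (d a_j/d x_i - d a_i/d x_j)(X_i Y_j - X_j Y_i)
lhs = sum((sp.diff(a[j], xs[i]) - sp.diff(a[i], xs[j])) * (X[i] * Y[j] - X[j] * Y[i])
          for i in range(n) for j in range(i + 1, n))

# RHS: X(alpha(Y)) - Y(alpha(X)) - alpha([X, Y])
alpha_Y = sum(a[i] * Y[i] for i in range(n))
alpha_X = sum(a[i] * X[i] for i in range(n))
bracket = [vec(X, Y[j]) - vec(Y, X[j]) for j in range(n)]  # components of [X, Y]
rhs = vec(X, alpha_Y) - vec(Y, alpha_X) - sum(a[j] * bracket[j] for j in range(n))

print(sp.expand(lhs - rhs))  # -> 0: the identity holds
```

The cancellation is purely algebraic, so `expand` suffices; no simplification heuristics are needed.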

    OK, I forgot the maximality, my bad. Now up there you are assuming an integrable distribution is involutive, as you are saying that if $\xi$ is integrable, and $X,Y\in\xi$, then $[X,Y]\in\xi$. Wasn't complete integrability (not just integrability) equivalent to involutiveness by Frobenius's theorem? That is, didn't that theorem state a distribution is involutive if and only if it admits, at each point of the manifold, an integral manifold of dimension precisely the dimension of the distribution? I thought I had a proof of that… am I misreading you? – MickG Aug 13 '15 at 17:07
  • @MickG Completely integrable $\implies$ integrable $\implies$ involutive. The hard direction is the proof that involutive $\implies$ completely integrable. But all three conditions are equivalent. –  Aug 13 '15 at 17:08
  • Oh. So integrable is equivalent to completely integrable. Which means it is impossible for a distribution to admit integral manifolds at every point of the ambient manifold and all of dimension less than that of the ambient manifold. Interesting. – MickG Aug 13 '15 at 17:11
  • @MickG: Huh? I don't understand that comment. That's certainly possible - that's precisely what a foliation is. –  Aug 13 '15 at 17:12
  • Just one last thing. It is curious that in dimension three the contact condition is equivalent to $\alpha\wedge(d\alpha)\neq0$ and in higher dimensions it is just $d\alpha|_\xi\neq0$. I can now see the reason for the thing in the higher dimensions, but why is the condition different in 3 dimensions? And what has all that got to do with $\alpha\wedge(d\alpha)^k$? – MickG Aug 13 '15 at 17:14
  • Whoops, I meant dimension less than that of the distribution. – MickG Aug 13 '15 at 17:14
  • @MickG: The contact condition isn't that $d\alpha|\xi \neq 0$, it's that $d\alpha|\xi$ is nondegenerate. (This is equivalent in every dimension, including 3.) Think back to the symplectic case - a symplectic form can be defined either as a 2-form such that $\omega^k \neq 0$ everywhere, or a nondegenerate 2-form. These two notions are equivalent. And you can't have integral manifolds of dimension less than the distribution simply because the tangent spaces of these manifolds would be dimension less than the distribution. No big theorems there. –  Aug 13 '15 at 17:16
  • Perhaps it depends on the definition of integral manifolds. Boothby defines an i.m. as a manifold $N$ which is immersed into the manifold $M$ where the distribution is given via a 1-to-1 immersion $F$, and such that $dF(T_qN)\subseteq\Delta_{F(q)}$ for all $q$, where $\Delta$ is the distribution. And the book explicitly adds «Note that an integral manifold may be of lower dimension than $\Delta$». Perhaps it is more standard to impose an equals sign in that inclusion when defining an i.m.. With the $=$ sign there the i.m.s are forced to have the distribution's dimension, naturally. – MickG Aug 13 '15 at 17:23
  • If integrability is the same as $d\alpha|\xi=0$, then non-integrability is the same as $d\alpha|\xi\neq0$, isn't it? And why is nondegeneracy equivalent to $\omega^k\neq0$? – MickG Aug 13 '15 at 17:25
  • @MickG: Boothby's definition is bizarre, I've never heard of another author doing such a thing. Yes, non-integrability is the same as $d\alpha|_\xi \neq 0$. You should be able to prove the last statement yourself. –  Aug 13 '15 at 17:27
  • I think I am missing the sense of "maximally" in "maximally non-integrable". Also, it is curious that the condition $\alpha\wedge(d\alpha)^k\neq0$ didn't pop up in our discussion yet, when Wikipedia says it is a way of giving the non-integrability, and this thing defines maximally non-integrable by precisely that condition.… – MickG Aug 13 '15 at 17:32
  • This is precisely equivalent to $d\alpha|_\xi$ being nondegenerate, which is how I prefer to conceptualize it. Being nondegenerate is just about as far as you can get from being zero, hence maximal. –  Aug 13 '15 at 17:34
  • I can write out $\omega^n(v_1,\dotsc,v_{2n})$ as $\sum_{\sigma\in S_{2n}} \operatorname{sgn}\sigma\cdot\omega(v_{\sigma(1)},v_{\sigma(2)})\cdot\dotso\cdot\omega(v_{\sigma(2n-1)},v_{\sigma(2n)})$, but does that help me? Similarly, I can write $\alpha\wedge(d\alpha)^k(v_1,\dotsc,v_{2k+1})=\sum_{\sigma\in S_{2k+1}} \operatorname{sgn}\sigma\cdot\alpha(v_{\sigma(1)})d\alpha(v_{\sigma(2)},v_{\sigma(3)})\cdot\dotso\cdot d\alpha(v_{\sigma(2k)},v_{\sigma(2k+1)})$, to show $d\alpha$ nondegenerate is equivalent to $\alpha\wedge(d\alpha)^k\neq0$, but does that help me? – MickG Aug 13 '15 at 19:36
  • For the symplectic one, I know than one can prove by induction that there is a basis which satisfies $\omega(e_i,e_j)=0$ for $j\neq i+n,i-n$ and $\omega(e_i,e_{i+n})=1$, so the form is represented by a matrix with a zero block in the top-left, the identity in the top-right block, minus the identity on the bottom-left, and zero again on the bottom-right. With that basis, by feeding in the arguments in the appropriate order, one should obtain that $\omega^n$ is not 0. Then again, the reverse implication is still unproven. Is there a similar result for the other implication, that I can use… – MickG Aug 13 '15 at 19:42
  • …for one implication of $\alpha\wedge(d\alpha)^k\neq0\iff d\alpha$ is nondegenerate? – MickG Aug 13 '15 at 19:43
  • @MickG You're close. You should do this on your own; it will help you more than me showing you the last steps. –  Aug 13 '15 at 20:00
  • For the $\omega^k\neq0$ following from nondegeneracy of $\omega$, the only terms surviving from the sum above when $v_i$ is chosen to be the symplectic basis are those where the indices of the arguments to each $\omega$ differ by $n$. One is $\omega(1,n+1)\cdot\dotso\cdot\omega(n,2n)$, where I abbreviated $v_i$ by just the index. The others all have the same sign since all other nonzero terms come from permutations of $\{1,\dotsc,2n\}$ that are obtained from that one by either swapping arguments in a factor or swapping factors. – MickG Aug 14 '15 at 08:43
  • Swapping factors is like composing the permutation with two transpositions, which adds two minuses to the sign of the permutation, and doesn't change the value of the product without the permutation's sign. Swapping arguments changes the sign of the term we swap arguments in, but it is like composing the original permutation with a transposition, thus changing the permutation's sign and canceling out the factor's sign change. So all those terms have the same signs, and are of modulus one, ergo they can't cancel out and voilà one implication. – MickG Aug 14 '15 at 08:44
  • The "discussion in chat" comment was an accidental tapping. – MickG Aug 14 '15 at 08:45
  • As for the reverse implication, if $\omega^k\neq0$, then there are at least $2n$ vectors which given as arguments to $\omega^k$ give a nonzero output, and nonzero output implies linear independence, by skew-symmetry/alternation. So we have a basis of the space for every vector of which there exists another vector such that $\omega$ applied to them is nonzero, since $\omega^k$ applied to the basis gives the above sum, where at least one term must be nonzero. By "normalizing" we can get a symplectic basis again, so we reduce $\omega$ to the standard symplectic form. – MickG Aug 14 '15 at 09:14
  • Now for the $\alpha\wedge(d\alpha)^k\neq0$. If that holds, in particular $\alpha,(d\alpha)^k\neq0$. If $(d\alpha)^k\neq0$, then by arguments such as the above there must exist a $2k$-dimensional space on which $d\alpha$ is nondegenerate. – MickG Aug 14 '15 at 09:17
  • It is easier to start from the wedge product directly. I write it out, and I have a sum of terms. There exist $2k+1$ linearly independent vectors to which the wedge product assigns a nonzero result. Linear independence is deduced as above. The space we are in is $(2k+1)$-dimensional, so those vectors form a basis. If $\sigma(1)$ is an index whose vector lies in $\xi_q$, then the term is zero. So the only term that survives is the one for which $\alpha$ has an argument outside $\xi_q$, and the rest will form a symplectic basis for $d\alpha$ on $\xi_q$, showing… – MickG Aug 14 '15 at 09:38
  • …$d\alpha$ is nondegenerate on $\xi_q$. If this condition is our starting point, we find a symplectic basis for $d\alpha$ on $\xi_q$, complete it to a basis of the tangent space at $q$, then we apply the wedge product to that basis, and the term where the vector outside $\xi_q$ is provided to $\alpha$… the termS, in fact, are the surviving ones, and by the same argument as above they are all equal, proving the wedge product is nonzero. Will soon post a sum-up of this discussion as an answer, to have all this in a single place. – MickG Aug 14 '15 at 09:41
  • I posted my answer. Would you care to check it's correct? I will soon attempt to prove the formula you gave for $d\alpha$. – MickG Aug 14 '15 at 10:21
  • Also, if you know how to fix that nested list, please let me know. – MickG Aug 14 '15 at 10:25
  • I have tried proving your formula for $d\alpha(X,Y)$, but I failed. See the answer below. What am I doing wrong? – MickG Aug 14 '15 at 12:52
6

Posting this to sum up all the stuff that came out in the huge comment discussion under Mike's answer, a discussion which needs 4 screenshots, 1, 2, 3 and 4, to fit. I will accept Mike's answer for the patience he must have had to keep that discussion going :). What emerged was the following.

  1. I forgot a "maximally" in my contact condition. "Maximally non-integrable" means "as far as possible from being integrable", in a sense that will be made clear in the following points.
  2. If you define integral manifolds as those whose tangent space equals the distribution -- and not just a subspace of it, as Boothby does, Boothby being apparently the only one doing that -- then completely integrable, integrable and involutive are all equivalent: the equivalence of completely integrable and involutive is Frobenius's theorem, proved in Boothby and in Lee, and that integrable implies involutive can be proven -- see Proposition 14.3 on p. 358 of Lee.
  3. Thus if the distribution given by the zeros of $\alpha$ is integrable, it is involutive, and $d\alpha$ vanishes on $\xi$, as: $$d\alpha(X,Y)=X\alpha(Y)-Y\alpha(X)-\alpha([X,Y]).$$ I actually haven't yet seen a proof of this, as far as I remember, but it shouldn't be hard. Will try soon. Of course, if $d\alpha|_\xi=0$, the distribution is involutive, thus integrable.
  4. So integrability is equivalent to $d\alpha|_\xi=0$, and non-integrability to $d\alpha|_\xi\neq0$. How far can you get from $d\alpha|_\xi$ being zero? By having it nondegenerate on $\xi$. That is why this is described as $\xi$ being maximally non-integrable. This is the definition of the contact condition, and of a contact form.
  5. And now the main serving of the meal: this is equivalent to $\alpha\wedge(d\alpha)^k\neq0$. Mike told me to try proving this myself, and I did. First of all, we write the wedge product out explicitly: $$\alpha\wedge(d\alpha)^k(v_1,\dotsc,v_{2k+1})=\sum_{\sigma\in S_{2k+1}}\operatorname{sgn}\sigma\cdot\alpha(v_{\sigma(1)})d\alpha(v_{\sigma(2)},v_{\sigma(3)})\cdot\dotso\cdot d\alpha(v_{\sigma(2k)},v_{\sigma(2k+1)}).$$ Now assume $d\alpha|_\xi$ is nondegenerate. Then it is easy to prove by induction that we can find a symplectic basis for $d\alpha$ on $\xi$, so $d\alpha$ is represented by the matrix $J_0$ having blocks of zeros in the top-left and bottom-right corners, the identity in the bottom-left corner and minus the identity in the top-right corner, where all blocks are $k\times k$. We then complete this symplectic basis to a basis for the whole tangent space by adding a vector outside of $\xi$. Plug these into that wedge product, and the surviving terms all have $\sigma(1)=2k+1$, where $v_{2k+1}$ is the vector outside $\xi_q$ and $v_1,\dotsc,v_{2k}$ are the symplectic basis. So the result of plugging these vectors into the wedge product is $\alpha(v_{2k+1})$ times the $k$th power of the canonical symplectic form applied to the symplectic basis. Now applying the $k$th power of the canonical symplectic form to the symplectic basis yields, as is easily seen, a sum of terms that are all either $1$ or $-1$. The terms in question differ from each other in three possible ways:

    1. The sign of $\sigma$;

    2. The order of the arguments inside the factors;

    3. The order of the factors.

    Let us see how altering the last two alters the first one. If I swap factors (3), the permutation $\sigma$ is altered by way of composing with the two transpositions that swap the factors. To be more explicit, if I have $\omega(v_1,v_3)\omega(v_2,v_4)$ and I want to swap those factors, I need but compose $\sigma$ with the permutation $(1,2)(3,4)$. This has even sign, so $\sigma$ keeps its sign, and the factor also does, so no change. If I swap arguments (2), I get a minus sign from the factor, but another one from the sign of $\sigma$, which is composed with a transposition. Again, no sign change. So they all have the same sign, and we are done. Next, suppose the wedge product is nonzero. This implies we have $2k+1$ linearly independent vectors for which the wedge product applied to them gives a nonzero result. Exactly one of them is outside $\xi$, so again by the above expression we have a sum of terms with the same argument given to $\alpha$. One of those terms is nonzero, which means that if $v_i$ are those vectors and $v_{2k+1}\notin\xi$, then for each $i\leq2k$ there exists $j\leq2k$ such that $\omega(v_i,v_j)\neq0$. This is not a symplectic basis, but almost: with a couple of normalizations it becomes one. So $d\alpha|_\xi$ is nondegenerate, as it admits a symplectic basis.

  6. As a bonus, if $\omega$ is a 2-form, nondegeneracy is equivalent to $\omega^k\neq0$. The argument is similar to the above: use a similar expression for $\omega^k$ applied to $2k$ vectors, if $\omega^k$ is nonzero then there exist $2k$ vectors for which one term is nonzero, which means almost a symplectic basis, and if $\omega$ is nondegenerate then we have the symplectic basis, and for the canonical symplectic form the $k$th power is nonzero simply by applying it to the basis. The expression for the $k$th power is: $$\omega^k(v_1,\dotsc,v_{2k})=\sum_{\sigma\in S_{2k}}\operatorname{sgn}\sigma\cdot\omega(v_{\sigma(1)},v_{\sigma(2)})\cdot\dotso\cdot\omega(v_{\sigma(2k-1)},v_{\sigma(2k)}).$$
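Point 6 can be illustrated numerically on $\mathbb{R}^4$ (a sketch with illustrative names, not part of the original discussion): the unnormalized permutation-sum for $\omega^2$ is nonzero for the standard symplectic form and vanishes for a 2-form of rank 2:

```python
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation given as a tuple of 0-based values."""
    s, seen = 1, set()
    for i in range(len(p)):
        if i in seen:
            continue
        j, length = i, 0
        while j not in seen:
            seen.add(j)
            j = p[j]
            length += 1
        if length % 2 == 0:  # even-length cycle = odd number of transpositions
            s = -s
    return s

def omega_power_on_basis(M):
    """Unnormalized sum over S_{2k}: sgn(sigma) * prod_i omega(e_{sigma(2i-1)}, e_{sigma(2i)}),
    where omega is the 2-form given by the antisymmetric matrix M."""
    m = len(M)
    total = 0
    for p in permutations(range(m)):
        term = perm_sign(p)
        for i in range(0, m, 2):
            term *= M[p[i]][p[i + 1]]
        total += term
    return total

# Standard symplectic form on R^4, block matrix [[0, I], [-I, 0]]:
J = [[0, 0, 1, 0],
     [0, 0, 0, 1],
     [-1, 0, 0, 0],
     [0, -1, 0, 0]]
# A degenerate 2-form of rank 2: only the (e_1, e_3) pairing survives.
deg = [[0, 0, 1, 0],
       [0, 0, 0, 0],
       [-1, 0, 0, 0],
       [0, 0, 0, 0]]

print(omega_power_on_basis(J))    # -> -8 (nonzero: the standard form is nondegenerate)
print(omega_power_on_basis(deg))  # -> 0 (degenerate: omega^2 vanishes)
```

All eight surviving terms for `J` carry the same sign, exactly as the swapping argument in point 5 predicts.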

Update: I tried to prove the formula for $\alpha([X,Y])$, but I seem to have disproven it. I am sure there must be something wrong in what I've done, but I just can't see what. I did everything locally. Locally, I have a chart, a basis of the tangent space which is $\partial_i$, the "canonical" coordinate basis, and a basis of the dual of the tangent space, $dx_i$, with $dx_i(\partial_j)=\delta_{ij}$. Locally, $\alpha=\alpha_idx_i$, with the repeated-index summation convention. $d\alpha$ can be written as: $$d\alpha=d\alpha_i\wedge dx_i=(\partial_j\alpha_i-\partial_i\alpha_j)dx_j\wedge dx_i.$$ Now, if I plug in $X,Y$, I get: \begin{align*} d\alpha(X,Y)={}&(\partial_j\alpha_i-\partial_i\alpha_j)(dx_j(X)dx_i(Y)-dx_i(X)dx_j(Y))={} \\ {}={}&\partial_j\alpha_idx_j(X)dx_i(Y)-\partial_j\alpha_idx_i(X)dx_j(Y)-\partial_i\alpha_jdx_j(X)dx_i(Y)+\partial_i\alpha_jdx_i(X)dx_j(Y)={} \\ {}={}&\partial_j\alpha_iX_jY_i-\partial_j\alpha_iX_iY_j-\partial_i\alpha_jX_jY_i+\partial_i\alpha_jX_iY_j. \end{align*} The first term up there is $X\alpha(Y)$, the second one is $-Y\alpha(X)$, so the rest should be $-\alpha([X,Y])$. So I wrote the commutator out: $$[X,Y]=[X_i\partial_i,Y_j\partial_j]=X_i\partial_i(Y_j\partial_j)-Y_j\partial_j(X_i\partial_i)=X_iY_j\partial_i\partial_j+X_i(\partial_iY_j)\partial_j-Y_jX_i\partial_j\partial_i-Y_j(\partial_jX_i)\partial_i.$$ The mixed derivatives cancel out, and the rest is: $$[X,Y]=X_i(\partial_iY_j)\partial_j-Y_j(\partial_jX_i)\partial_i.$$ Apply $\alpha$ to it: \begin{align*} \alpha([X,Y])={}&\alpha_kdx_k[X_i(\partial_iY_j)\partial_j-Y_j(\partial_jX_i)\partial_i]={} \\ {}={}&\alpha_kX_i(\partial_iY_j)dx_k(\partial_j)-\alpha_kY_j(\partial_jX_i)dx_k(\partial_i)={} \\ {}={}&\alpha_kX_i(\partial_iY_j)\delta_{jk}-\alpha_kY_j(\partial_jX_i)\delta_{ik}={} \\ {}={}&\alpha_jX_i(\partial_iY_j)-\alpha_iY_j(\partial_jX_i). \end{align*} Which is evidently not the same as above. What am I doing wrong here?

Update 2: I tried an altogether different approach, and failed again. I am copying it for the record, and also because the terrible habit I have of using $i,j$ as indices might have had me mess indices up and get a wrong result, which of course won't happen on the computer. I tried using Cartan's formula: $$\mathcal{L}_X\alpha=\iota_Xd\alpha+d(\iota_X\alpha),$$ since evidently: $$d\alpha(X,Y)=(\iota_Xd\alpha)(Y)=(\mathcal{L}_X\alpha-d(\iota_X\alpha))(Y).$$ Let us write out the commutator. Suppose $X=X_i\partial_i,Y=Y_i\partial_i$. Then: \begin{align*} [X,Y]={}&[X_i\partial_i,Y_j\partial_j]=X_i(\partial_iY_j)\partial_j+X_iY_j\partial_i\partial_j-Y_j(\partial_jX_i)\partial_i-Y_jX_i\partial_j\partial_i=X(Y_j)\partial_j-Y(X_i)\partial_i={} \\ {}={}&(X(Y_i)-Y(X_i))\partial_i. \end{align*} Let us start from the second term. Suppose $\alpha=\alpha_idx_i$. Then: $$d(\iota_X\alpha)(Y)=d(\alpha(X))(Y)=\partial_j(\alpha_iX_i)dx_j(Y)=(\partial_j\alpha_i)X_iY_j+(\partial_jX_i)\alpha_iY_j=Y(\alpha(X)).$$ OK, I had a wrong minus sign over here. I had gotten $Y(\alpha(X))-2\alpha_iY(X_i)$. But then there must be something wrong in the next bit too. Let me see. $$\mathcal{L}_X\alpha=X_i\partial_i(\alpha_jdx_j)=X_i\partial_i(\alpha_j)dx_j+X_i\alpha_j\partial_i(dx_j).$$ Interpreting $\partial_i$ as a vector field, $\partial_i(dx_j)$ would be a Lie derivative, so I use Cartan's formula once more: $$\mathcal{L}_X\alpha=X_i\partial_i(\alpha_j)dx_j+X_i\alpha_j(\iota_{\partial_i}ddx_j+d(\iota_{\partial_i}dx_j)).$$ Now $ddx_j=0$, and $\iota_{\partial_i}dx_j=dx_j(\partial_i)=\delta_{ij}$, so: $$\mathcal{L}_X\alpha=X_i\partial_i(\alpha_j)dx_j+X_i\alpha_jd(\delta_{ij}),$$ OK, that can't be right. Or maybe it is. Let us go on and see what we get. That means the second term is 0. Now we finally insert $Y$: $$(\mathcal{L}_X\alpha)(Y)=X_i(\partial_i\alpha_j)Y_j=X(\alpha_j)Y_j=X(\alpha_jY_j)-X(Y_j)\alpha_j.$$ Is that last term $\alpha([X,Y])$? Remember how $[X,Y]=(X(Y_i)-Y(X_i))\partial_i$. 
Then: $$\alpha([X,Y])=\alpha((X(Y_i)-Y(X_i))\partial_i)=\alpha_jdx_j((X(Y_i)-Y(X_i))\partial_i)=\alpha_j(X(Y_j)-Y(X_j)).$$ So I am missing half of this. What is wrong above?

Update 3: Chi la dura, la vince (He conquers who endures). I was stubborn enough to try a third time. We have written before that: $$\alpha([X,Y])=\alpha_i(X(Y_i)-Y(X_i)).$$ We can easily see the following: \begin{align*} X(\alpha(Y))={}&X(\alpha_i)Y_i+X(Y_i)\alpha_i, \\ Y(\alpha(X))={}&Y(\alpha_i)X_i+Y(X_i)\alpha_i; \end{align*} this boils down to writing the arguments of $X,Y$ and $X,Y$ themselves explicitly; I think we've done that above as well. Let us then compute the RHS of our claim: \begin{align*} X(\alpha(Y))-Y(\alpha(X))-\alpha([X,Y])={}&X(\alpha_i)Y_i+\underline{X(Y_i)\alpha_i}-Y(\alpha_i)X_i-\overline{Y(X_i)\alpha_i}-\alpha_i(\underline{X(Y_i)}-\overline{Y(X_i)})={} \\ {}={}&X(\alpha_i)Y_i-Y(\alpha_i)X_i. \end{align*} For the LHS, I must first stress I have an erroneous definition of $d\alpha$. $d\alpha\neq(\partial_i\alpha_j-\partial_j\alpha_i)dx_i\wedge dx_j$. It is NOT a sum over all combinations of $i,j$, but a sum over $i<j$. To have all possible combinations, I must add a half in front of everything. I will now compute the LHS and finally prove the equality. Let us see: \begin{align*} 2d\alpha(X,Y)={}&(\partial_j\alpha_i-\partial_i\alpha_j)(dx_j(X)dx_i(Y)-dx_i(X)dx_j(Y))={} \\ {}={}&X_jY_i\partial_j\alpha_i-X_jY_i\partial_i\alpha_j-X_iY_j\partial_j\alpha_i+X_iY_j\partial_i\alpha_j={} \\ {}={}&Y_iX(\alpha_i)-X_jY(\alpha_j)-X_iY(\alpha_i)+Y_jX(\alpha_j)={} \\ {}={}&2Y_iX(\alpha_i)-2X_iY(\alpha_i), \end{align*} which unless I'm much mistaken is exactly twice the RHS.
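The factor-of-2 bookkeeping can be double-checked symbolically (assuming sympy; names illustrative): summing $(\partial_j\alpha_i-\partial_i\alpha_j)(X_jY_i-X_iY_j)$ over all ordered pairs $(i,j)$ gives exactly twice the sum over $i<j$ alone:

```python
import sympy as sp

n = 3
xs = sp.symbols('x1:4')
a = [sp.Function('a%d' % i)(*xs) for i in range(n)]
X = [sp.Function('X%d' % i)(*xs) for i in range(n)]
Y = [sp.Function('Y%d' % i)(*xs) for i in range(n)]

def term(i, j):
    # (d a_i/d x_j - d a_j/d x_i) * (X_j Y_i - X_i Y_j)
    return (sp.diff(a[i], xs[j]) - sp.diff(a[j], xs[i])) * (X[j] * Y[i] - X[i] * Y[j])

# Sum over ALL ordered pairs (the double-counted expression) ...
full = sum(term(i, j) for i in range(n) for j in range(n))
# ... versus the sum over i < j only, which is the correct d(alpha)(X, Y):
half = sum(term(i, j) for i in range(n) for j in range(i + 1, n))

print(sp.expand(full - 2 * half))  # -> 0: the full sum is exactly twice d(alpha)(X, Y)
```

Each unordered pair $\{i,j\}$ appears twice in the full sum with equal contributions (both sign flips cancel), which is exactly where the stray factor of 2 came from.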

We try and we fail, we try and we fail, but the only true failure is when we stop trying.

Says the gypsy in the sphere in "The Haunted Mansion". Well, lucky I didn't stop trying :).

MickG
  • 9,085
2

Thanks for the post--it was very helpful! If you're interested, the following is another way to prove the formula (in case you're like me and are allergic to computing things in coordinates). Recall that there is a general formula for how the Lie derivative acts on tensors. Namely, if $T$ is a tensor of type $(p,q)$ then for all $Y_1,\dots,Y_p\in \mathfrak{X}(M)$ and $\eta_1,\dots,\eta_q \in \Omega^1(M)$ we have $$ (\mathcal{L}_X T)(Y_1,\dots,Y_p,\eta_1,\dots,\eta_q) = X( T(Y_1,\dots,Y_p,\eta_1,\dots,\eta_q)) -T(\mathcal{L}_XY_1,\dots,Y_p,\eta_1,\dots,\eta_q) - \dots -T(Y_1,\dots,Y_p,\eta_1,\dots,\mathcal{L}_X\eta_q) $$ (see https://en.wikipedia.org/wiki/Lie_derivative).

Using this, we have $$(\mathcal{L}_X\alpha)(Y) = X(\alpha(Y)) - \alpha(\mathcal{L}_XY)= X(\alpha(Y))-\alpha([X,Y]) $$

and now we use Cartan's magic formula to obtain $$d\alpha(X,Y) = (\iota_Xd\alpha)(Y) = (\mathcal{L}_X\alpha)(Y) - (d\iota_X\alpha)(Y) \\ = X(\alpha(Y))-\alpha([X,Y]) - (d(\alpha(X)))(Y)\\ = X(\alpha(Y)) - \alpha([X,Y]) - Y(\alpha(X))$$ as desired.