I'm trying to understand how vectors, differential forms, and multilinear maps in general transform under a change of coordinates. So I start with the simplest case of vectors. Here's my own attempt; please bear with me:
Suppose that $V$ is a vector space and $\alpha=\{v_1,\cdots,v_n\}$ and $\beta = \{w_1,\cdots,w_n\}$ are two ordered bases for $V$. $\alpha$ and $\beta$ give rise to the dual bases $\alpha^*=\{v^1,\cdots,v^n\}$ and $\beta^*=\{w^1,\cdots,w^n\}$ for $V^*$ respectively.
If $[T]_{\beta}^{\alpha}=[\lambda_{i}^{j}]$ is the matrix of the change of basis from $\alpha$ to $\beta$, i.e.
$$\begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}= \begin{bmatrix} \lambda_1^1 & \lambda_1^2 & \dots &\lambda_1^n \\ \vdots & \vdots & \ddots & \vdots \\ \lambda_n^1 & \lambda_n^2 & \cdots & \lambda_n^n\end{bmatrix} \begin{bmatrix} v_1 \\ \vdots \\ v_n \end{bmatrix}$$
What is the matrix of coordinate transformation from $\alpha^*$ to $\beta^*$?
We can write $w^j \in \beta^*$ as a linear combination of basis elements in $\alpha^*$:
$$w^j=\mu_{1}^{j}v^1+\cdots+\mu_n^{j}v^n$$
We get a matrix representation $[S]_{\beta^*}^{\alpha^*}=[\mu_{i}^{j}]$ as the following:
$$\begin{bmatrix} w^1 & \cdots & w^n \end{bmatrix}= \begin{bmatrix} v^1 & \cdots & v^n \end{bmatrix}\begin{bmatrix} \mu_1^1 & \mu_1^2 & \dots &\mu_1^n \\ \vdots & \vdots & \ddots & \vdots \\ \mu_n^1 & \mu_n^2 & \cdots & \mu_n^n\end{bmatrix} $$
We know that $w_i = \lambda_{i}^1v_1+\cdots+\lambda_{i}^nv_n$. Evaluating the functional $w^j$ at $w_i \in V$ we get:
$$w^j(w_i)=\mu_{1}^{j}v^1(w_i)+\cdots+\mu_n^{j}v^n(w_i)=\delta_{i}^j$$ $$w^j(w_i)=\mu_{1}^{j}v^1(\lambda_{i}^1v_1+\cdots+\lambda_{i}^nv_n)+\cdots+\mu_n^{j}v^n(\lambda_{i}^1v_1+\cdots+\lambda_{i}^nv_n)=\delta_{i}^j$$ $$w^j(w_i)=\mu_{1}^{j}\lambda_{i}^1+\cdots+\mu_n^{j}\lambda_{i}^n=\sum_{k=1}^n\mu_{k}^j \lambda_{i}^k=\delta_{i}^j$$
But $\sum_{k=1}^n\mu_{k}^j \lambda_{i}^k$ is the $(i,j)$ entry of the matrix product $TS$. Therefore $TS=I_n$ and $S=T^{-1}$.
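Since the argument above is abstract, here is a quick numerical sanity check I ran (a sketch in numpy; taking $\alpha$ to be the standard basis of $\mathbb{R}^n$ and using a random $T$ are my own assumptions, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Assume alpha is the standard basis of R^n, so the alpha-coordinates of the
# new basis vectors w_i are exactly the rows of T (a random invertible T here).
T = rng.standard_normal((n, n))

# Columns of B are the vectors w_i written in alpha-coordinates.
B = T.T

# The dual basis w^j, written as row covectors, must satisfy w^j(w_i) = delta_i^j,
# i.e. R @ B = I, so R = B^{-1}.
R = np.linalg.inv(B)

# Row j of R lists the coefficients mu_k^j of w^j in alpha^*, so R = S^t.
S = R.T

# The derived relation TS = I_n, i.e. S = T^{-1}:
print(np.allclose(T @ S, np.eye(n)))  # True
```

So at least numerically the dual-basis matrix really is the inverse of the basis-change matrix.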
If we instead want to write the transformation from $\alpha^*$ to $\beta^*$ with the dual bases arranged as column vectors rather than row vectors, and call the matrix representing this transformation $U$, we observe that $U=S^{t}$ and therefore $U=(T^{-1})^t$.
Therefore if $T$ represents the transformation from $\alpha$ to $\beta$ by the equation $\mathbf{w}=T\mathbf{v}$, then $\mathbf{w^*}=U\mathbf{v^*}$.
The important case is when $T=(T^{-1})^t$, which happens if and only if $T$ is orthogonal (i.e. $T^tT=I_n$).
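To convince myself of this last point, I also checked numerically that $U=(T^{-1})^t$ coincides with $T$ exactly when $T^tT=I_n$ (again numpy; the QR factorization is just my way of manufacturing an orthogonal matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# Manufacture an orthogonal T via QR factorization of a random matrix.
T, _ = np.linalg.qr(rng.standard_normal((n, n)))

U = np.linalg.inv(T).T  # U = (T^{-1})^t

# For orthogonal T, basis and dual basis transform by the same matrix.
print(np.allclose(U, T))  # True

# A generic (non-orthogonal) T does not have this property.
T2 = rng.standard_normal((n, n))
print(np.allclose(np.linalg.inv(T2).T, T2))  # False
```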
What I don't understand is why the transformation from $\alpha$ to $\beta$ is called a contravariant transformation while the transformation from $\alpha^*$ to $\beta^*$ is called a covariant transformation. Would you please elaborate on this important point? It's been driving me crazy for the last two days.
Thanks in advance.