Whatever your notion of $\frac{\partial y}{\partial W}$, part of the data carried by this object is the set of all partial derivatives $\frac{\partial y}{\partial W_{ij}}$, and these derivatives should form all "entries" of $\frac{\partial y}{\partial W}$. In this wiki page, the author(s) use only these partial derivatives and do not make any reference to a "total" derivative $\frac{\partial y}{\partial W}$.
Let $e_1,e_2$ denote the canonical basis of $\Bbb R^2$, i.e. the columns of the $2 \times 2$ identity matrix. We can see that these partial derivatives are given by
$$
\frac{\partial y}{\partial W_{ij}} = x_j e_i.
$$
To put things in terms of scalar entries, we would say that
$$
\frac{\partial y_k}{\partial W_{ij}} = \delta_{ik} x_j,
$$
where $y_k$ denotes the $k$th entry of $y$ and $\delta_{ik}$ denotes the Kronecker delta.
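As a quick worked check of these formulas, assuming $y = Wx$ for a fixed $x \in \Bbb R^2$, one can write out the entries of $y$:
$$
y_1 = W_{11}x_1 + W_{12}x_2, \qquad y_2 = W_{21}x_1 + W_{22}x_2.
$$
Differentiating with respect to $W_{12}$, for example, gives
$$
\frac{\partial y}{\partial W_{12}} = \begin{pmatrix} x_2 \\ 0 \end{pmatrix} = x_2 e_1,
$$
which is the case $i = 1$, $j = 2$ of the formula $\frac{\partial y}{\partial W_{ij}} = x_j e_i$.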
Now, in terms of the total (Fréchet) derivative, we could say the following. The map $W \mapsto y(W) = Wx$ is a function from $\Bbb R^{2 \times 2}$ to $\Bbb R^2$, so for any $X \in \Bbb R^{2 \times 2}$, $D_Wy(X) = Dy(X)$ is a linear map from $\Bbb R^{2 \times 2}$ to $\Bbb R^2$; specifically, because $y$ is linear in $W$, for any $H \in \Bbb R^{2 \times 2}$ we have
$$
Dy(X)(H) = y(H) = Hx.
$$
Although it is not an array of entries, the linear map $Dy(X)$ is the operator that the array/tensor $\frac{\partial y}{\partial W}$ would represent.
We can recover the partial derivatives by evaluating the "directional derivatives" $Dy(X)(E_{ij})$, where $E_{ij} = e_ie_j^T$ is the matrix with a $1$ in the $i,j$ entry and zeros elsewhere. Indeed, we have
$$
Dy(X)(E_{ij}) = E_{ij} x = e_i (e_j^Tx) = x_j e_i.
$$
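The identity above can be sanity-checked numerically. The following is a minimal sketch in plain Python, assuming $y(W) = Wx$; the particular values of `x` and `X`, the step size `eps`, and the helper names `y` and `directional_derivative` are arbitrary illustrative choices, not part of the original discussion.

```python
# Sketch: compare a finite-difference estimate of Dy(X)(E_ij) with x_j e_i.
# All numeric values and helper names here are illustrative assumptions.

def y(W, x):
    """Matrix-vector product W x for a 2x2 matrix W given as a list of rows."""
    return [sum(W[k][j] * x[j] for j in range(2)) for k in range(2)]

def directional_derivative(X, H, x, eps=1e-6):
    """Forward-difference approximation of Dy(X)(H)."""
    X_shifted = [[X[i][j] + eps * H[i][j] for j in range(2)] for i in range(2)]
    y0, y1 = y(X, x), y(X_shifted, x)
    return [(y1[k] - y0[k]) / eps for k in range(2)]

x = [3.0, -2.0]
X = [[1.0, 4.0], [0.5, -1.0]]

max_error = 0.0
for i in range(2):
    for j in range(2):
        # E_ij has a 1 in entry (i, j) and zeros elsewhere.
        E_ij = [[1.0 if (r, c) == (i, j) else 0.0 for c in range(2)]
                for r in range(2)]
        numeric = directional_derivative(X, E_ij, x)
        expected = [x[j] if k == i else 0.0 for k in range(2)]  # x_j e_i
        max_error = max(max_error,
                        max(abs(numeric[k] - expected[k]) for k in range(2)))

print("largest deviation from x_j e_i:", max_error)
```

Because $y$ is linear in $W$, the finite difference agrees with $x_j e_i$ up to floating-point rounding, and the result does not depend on the base point `X`.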
The chain rule tells us the following: for any function $g:\mathcal Z \to \Bbb R^{2 \times 2}$, we may compute the total derivative of $y \circ g$ as follows. For any $z \in \mathcal Z$, the derivative (a linear map from $\mathcal Z$ to $\Bbb R^{2}$) is given by
$$
D(y \circ g)(z) = Dy(g(z)) \circ Dg(z),
$$
where $Dy(g(z))$ is a linear map from $\Bbb R^{2 \times 2}$ to $\Bbb R^2$ and $Dg(z)$ is a linear map from $\mathcal Z$ to $\Bbb R^{2 \times 2}$. More concretely, if $h \in \mathcal Z$, then the directional derivative "along" $h$ should be given by
$$
D(y \circ g)(z)(h) = [Dy(g(z)) \circ Dg(z)](h) = [Dg(z)(h)] x.
$$
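To illustrate this formula, here is a small sketch that takes $\mathcal Z = \Bbb R$ and a hypothetical matrix-valued function `g` (chosen only for the example, with its derivative `Dg` computed by hand), then compares the chain-rule prediction $[Dg(z)(h)]x$ against a central difference of $y \circ g$.

```python
import math

# Sketch: check D(y o g)(t)(h) = [Dg(t)(h)] x for an illustrative g: R -> R^{2x2}.
# The function g, the vector x, and the values of t, h, eps are assumptions
# made for this example only.

def matvec(W, x):
    """Matrix-vector product W x for a 2x2 matrix W given as a list of rows."""
    return [sum(W[k][j] * x[j] for j in range(2)) for k in range(2)]

def g(t):
    """Hypothetical matrix-valued function of a scalar t."""
    return [[t, t * t], [math.sin(t), 1.0]]

def Dg(t, h):
    """Derivative of g at t applied to the direction h (computed by hand)."""
    return [[h, 2.0 * t * h], [math.cos(t) * h, 0.0]]

x = [3.0, -2.0]
t, h, eps = 0.7, 1.5, 1e-6

# Chain-rule prediction: Dy(g(t)) applied to Dg(t)(h), i.e. [Dg(t)(h)] x.
predicted = matvec(Dg(t, h), x)

# Central difference of t -> y(g(t)) in the direction h.
plus = matvec(g(t + eps * h), x)
minus = matvec(g(t - eps * h), x)
numeric = [(plus[k] - minus[k]) / (2.0 * eps) for k in range(2)]

chain_error = max(abs(numeric[k] - predicted[k]) for k in range(2))
print("largest deviation from the chain-rule prediction:", chain_error)
```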
Similarly, for any function $p: \Bbb R^2 \to \mathcal Z$, we may compute the total derivative of $p \circ y$
as follows. For any $X \in \Bbb R^{2 \times 2}$, the derivative (a linear map from $\Bbb R^{2 \times 2}$ to $\mathcal Z$) is given by
$$
D(p \circ y)(X) = Dp(y(X)) \circ Dy(X),
$$
where $Dp(y(X))$ is a linear map from $\Bbb R^2$ to $\mathcal Z$ and $Dy(X)$ is a linear map from $\Bbb R^{2 \times 2}$ to $\Bbb R^2$. More concretely, if $H \in \Bbb R^{2 \times 2}$, then the directional derivative "along" $H$ should be given by
$$
D(p \circ y)(X)(H) = [Dp(y(X)) \circ Dy(X)](H)
= Dp(y(X))(Hx).
$$
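As a concrete instance, one might take $\mathcal Z = \Bbb R$ and $p(v) = \|v\|^2$, for which $Dp(v)(u) = 2\, v^T u$, so the formula predicts $D(p \circ y)(X)(H) = 2\,(Xx)^T(Hx)$. The sketch below checks this numerically; the choice of `p` and all numeric values are assumptions made purely for illustration.

```python
# Sketch: check D(p o y)(X)(H) = Dp(y(X))(Hx) for the illustrative choice
# p(v) = |v|^2, whose derivative is Dp(v)(u) = 2 v.u.
# The function p and the values of x, X, H, eps are assumptions for this example.

def matvec(W, x):
    """Matrix-vector product W x for a 2x2 matrix W given as a list of rows."""
    return [sum(W[k][j] * x[j] for j in range(2)) for k in range(2)]

def p(v):
    """Squared Euclidean norm of a vector in R^2."""
    return v[0] ** 2 + v[1] ** 2

def Dp(v, u):
    """Derivative of p at v applied to the direction u."""
    return 2.0 * (v[0] * u[0] + v[1] * u[1])

x = [3.0, -2.0]
X = [[1.0, 4.0], [0.5, -1.0]]
H = [[0.2, -1.0], [2.0, 0.3]]
eps = 1e-6

def shifted(s):
    """Return the matrix X + s * H."""
    return [[X[i][j] + s * H[i][j] for j in range(2)] for i in range(2)]

# Chain-rule prediction: Dp(y(X)) applied to Dy(X)(H) = Hx.
predicted = Dp(matvec(X, x), matvec(H, x))

# Central difference of W -> p(Wx) at X in the direction H.
numeric = (p(matvec(shifted(eps), x)) - p(matvec(shifted(-eps), x))) / (2.0 * eps)

composite_error = abs(numeric - predicted)
print("deviation from the chain-rule prediction:", composite_error)
```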