
Why would the dimension of $w^{[2]}$ be $(n^{[2]}, n^{[1]})$ ?

This is a simple linear equation, $z^{[l]} = W^{[l]}a^{[l-1]} + b^{[l]}$, where $l$ is the layer index.

There seems to be an error in the screenshot: the weight matrix $W$ should be transposed. Please correct me if I am wrong.

$W^{[2]}$ is the matrix of weights assigned to the neurons in layer 2

$n^{[1]}$ is the number of neurons in layer 1

Screenshot from Andrew Ng's deeplearning.ai Coursera course video:

[Screenshot: backpropagation algorithm]

Stephen Rauch
kevin

1 Answer


"There seems to be an error in the screenshot: the weight matrix $W$ should be transposed. Please correct me if I am wrong."

You are wrong.

Matrix multiplication works so that if you multiply two matrices together, $C = AB$, where $A$ is an $i \times j$ matrix and $B$ is a $j \times k$ matrix, then $C$ will be an $i \times k$ matrix. Note that $A$'s column count must equal $B$'s row count ($j$).
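A quick NumPy sketch of this dimension rule, using arbitrary example sizes $i=4$, $j=3$, $k=2$:

```python
import numpy as np

i, j, k = 4, 3, 2          # example dimensions, chosen for illustration
A = np.ones((i, j))        # i x j matrix
B = np.ones((j, k))        # j x k matrix
C = A @ B                  # valid: A's columns (j) match B's rows (j)
print(C.shape)             # (4, 2), i.e. i x k

# Transposing A breaks the inner-dimension match and raises an error:
try:
    A.T @ B                # (j x i) @ (j x k): i != j, so this fails
except ValueError as e:
    print("shape mismatch:", e)
```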

In the neural network, $a^{[1]}$ is an $n^{[1]} \times 1$ matrix (column vector), and $z^{[2]}$ needs to be an $n^{[2]} \times 1$ matrix, to match the number of neurons in each layer.

Therefore $W^{[2]}$ has to have dimensions $n^{[2]} \times n^{[1]}$ in order to produce an $n^{[2]} \times 1$ matrix from $W^{[2]}a^{[1]}$.
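You can verify the layer shapes directly in NumPy. This is a minimal sketch with hypothetical layer sizes $n^{[1]} = 3$ and $n^{[2]} = 5$:

```python
import numpy as np

n1, n2 = 3, 5                    # hypothetical sizes: n^[1] = 3, n^[2] = 5
a1 = np.random.randn(n1, 1)      # layer-1 activations, shape (n^[1], 1)
W2 = np.random.randn(n2, n1)     # layer-2 weights, shape (n^[2], n^[1]) -- NOT transposed
b2 = np.random.randn(n2, 1)      # layer-2 biases, shape (n^[2], 1)

z2 = W2 @ a1 + b2                # (n^[2] x n^[1]) @ (n^[1] x 1) + (n^[2] x 1)
print(z2.shape)                  # (5, 1), one pre-activation per layer-2 neuron
```

If $W^{[2]}$ were stored as $n^{[1]} \times n^{[2]}$ instead, you would indeed need `W2.T @ a1`; conventions differ between texts, but with the $n^{[2]} \times n^{[1]}$ layout in the screenshot, no transpose is needed.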

Neil Slater