
I'm currently learning about neural networks, and I see conflicting descriptions of the dimensions of weight and input matrices on the internet. I just wanted to know whether there is a convention that more people use than the others.

I currently define my input matrix X with the dimensions of:

(m x n)

Where m is the number of samples and n is the number of features.

And I define my weight matrices with the dimensions:

(a x b)

Where a is the number of neurons in the layer and b is the number of neurons in the previous layer.
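
For concreteness, here is a minimal NumPy sketch of what I mean (the sizes are made up for illustration):

```python
import numpy as np

m, n, a = 4, 3, 5            # made-up sizes: 4 samples, 3 features, 5 neurons

X = np.random.randn(m, n)    # input matrix: (m x n)
W = np.random.randn(a, n)    # weights: (a x b); for the first layer, b = n

out = X @ W.T                # (m x n) @ (n x a) -> (m x a)
print(out.shape)             # (4, 5)
```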

Is that conventional or should I change something?

Tim von Känel

1 Answer


I would not say there is such a convention per se (if anyone has anything to add on this, I would also like to know).

I think that, to make it clearer how the layer's input x interacts with the weights W, it might be better to define the dimensions as follows:

  • x: (m x n)
  • W: (n x k)
  • bias term b: (k)

m remains the number of examples, n represents the number of input features, and k represents the number of neurons in the layer.

With these dimensions, we compute the output of the layer y as xW + b. Therefore, the resulting output matrix will be (m x k).
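
A minimal NumPy sketch of this layout (the sizes are chosen arbitrarily for illustration):

```python
import numpy as np

m, n, k = 4, 3, 5            # arbitrary sizes: 4 examples, 3 features, 5 neurons

x = np.random.randn(m, n)    # layer input: (m x n)
W = np.random.randn(n, k)    # weights: (n x k)
b = np.zeros(k)              # bias term: (k,), broadcast over the m examples

y = x @ W + b                # (m x n) @ (n x k) -> (m x k)
print(y.shape)               # (4, 5)
```

The nice part of this layout is that the forward pass is a plain matrix product x @ W with no transposes, and stacking layers just chains these products.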

shepan6