
I'm currently learning about neural networks, and I see conflicting descriptions of the dimensions of weight and input matrices on the internet. I just wanted to know whether there is a convention that more people use than the others.

I currently define my input matrix X with the dimensions of:

(m x n)

Where m is the number of samples and n is the number of features.

And I define my weight matrices with the dimensions:

(a x b)

Where a is the number of neurons in the layer and b is the number of neurons in the previous layer.
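
For concreteness, here is a minimal NumPy sketch of what I mean (the sizes are made up for illustration):

```python
import numpy as np

m, n, a = 4, 3, 5            # made-up sizes: 4 samples, 3 features, 5 neurons

X = np.random.randn(m, n)    # input matrix: (m x n)
W = np.random.randn(a, n)    # weights: (a x b); for the first layer, b = n

out = X @ W.T                # (m x n) @ (n x a) -> (m x a)
print(out.shape)             # (4, 5)
```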

Is that conventional or should I change something?

Tim von Känel

1 Answer


I would not say there is such a convention per se (if anyone has anything to add on this, I would also like to know).

I think that, to make it clearer how the layer's input x interacts with the weights W, it might be better to define the dimensions as follows:

  • x: (m x n)
  • W: (n x k)
  • bias term b: (k)

m remains the number of examples, n represents the number of input features, and k represents the number of neurons in the layer.

With these dimensions, we compute the output of the layer y as xW + b. Therefore, the resulting output matrix will be (m x k).
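
A minimal NumPy sketch of this layout (the sizes are chosen arbitrarily for illustration):

```python
import numpy as np

m, n, k = 4, 3, 5            # arbitrary sizes: 4 examples, 3 features, 5 neurons

x = np.random.randn(m, n)    # layer input: (m x n)
W = np.random.randn(n, k)    # weights: (n x k)
b = np.zeros(k)              # bias term: (k,), broadcast over the m examples

y = x @ W + b                # (m x n) @ (n x k) -> (m x k)
print(y.shape)               # (4, 5)
```

The nice part of this layout is that the forward pass is a plain matrix product x @ W with no transposes, and stacking layers just chains these products.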

shepan6