Number of parameters for convolution layers

Question

In this highly cited paper, the authors give the following discussion of the number of weight parameters. It is not clear to me why the layer has $49C^2$ parameters. I think it should be $49C$, since each of the $C$ input channels shares the same filter, which has $49$ parameters.

Icyblade · Accepted Answer · 2017-02-21T10:11:54.997

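Actually it's $49C \cdot C = 49C^2$: the first $C$ is the number of input channels, and the second $C$ is the number of filters. A filter is not shared across input channels. Each filter has its own $49$ weights for every one of the $C$ input channels, so a single filter has $49C$ weights, and the layer has $C$ such filters.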

Quote from CS231n:

To summarize, the Conv Layer:

Accepts a volume of size $W_1 \times H_1 \times D_1$

Requires four hyperparameters:

Number of filters $K$,

their spatial extent $F$,

the stride $S$,

the amount of zero padding $P$.

Produces a volume of size $W_2 \times H_2 \times D_2$ where:

$W_2 = (W_1 - F + 2P)/S + 1$

$H_2 = (H_1 - F + 2P)/S + 1$ (i.e. width and height are computed equally by symmetry)

$D_2 = K$

With parameter sharing, it introduces $F \cdot F \cdot D_1$ weights per filter, for a total of $(F \cdot F \cdot D_1) \cdot K$ weights and $K$ biases.

In the output volume, the $d$-th depth slice (of size $W_2 \times H_2$) is the result of performing a valid convolution of the $d$-th filter over the input volume with a stride of $S$, and then offset by $d$-th bias.

A common setting of the hyperparameters is $F = 3, S = 1, P = 1$. However, there are common conventions and rules of thumb that motivate these hyperparameters. See the ConvNet architectures section below.
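As a quick check of the weight formula quoted above, here is a minimal Python sketch. The function names and the example value `C = 64` are illustrative assumptions, not taken from the paper or from CS231n. The 7×7 filter size is inferred from the $49$ weights per channel mentioned in the question.

```python
def conv_layer_params(F, D1, K, bias=True):
    """Count learnable parameters of a conv layer.

    F  : spatial extent of each (square) filter
    D1 : depth (number of channels) of the input volume
    K  : number of filters (= depth of the output volume)
    """
    weights = (F * F * D1) * K   # F*F*D1 weights per filter, K filters
    biases = K if bias else 0    # one bias per filter
    return weights + biases


def conv_output_size(W1, F, S, P):
    """Output width (or height) for input size W1, filter F, stride S, padding P."""
    return (W1 - F + 2 * P) // S + 1


C = 64  # hypothetical channel count, chosen only for illustration

# The case from the question: C input channels, C filters, 7x7 kernels, biases ignored.
weights_7x7 = conv_layer_params(F=7, D1=C, K=C, bias=False)
print(weights_7x7, 49 * C * C)  # both values are 49 * C^2
```

With `D1 = K = C`, the count reduces to $49 \cdot C \cdot C$, which is the $49C^2$ from the paper. A count of $49C$ would only arise if each filter had a single 7×7 kernel shared by all input channels, which is not how a standard convolution layer is defined.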
