
In this highly cited paper, the authors give the following discussion of the number of weight parameters. I don't understand why the layer has $49C^2$ parameters. I think it should be $49C$, since each of the $C$ input channels shares the same filter, which has $49$ parameters.

[Image: excerpt from the paper]

Icyblade
user297850

1 Answer


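Actually, it's $49C \cdot C = 49C^2$: the first $C$ is the number of input channels, and the second $C$ is the number of filters (output channels). Each filter spans all $C$ input channels, so a single filter has $49C$ weights, and the layer has $C$ such filters.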
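As a worked count, assuming the $49$ comes from a $7 \times 7$ kernel and that the layer has $C$ input channels and $C$ output channels (biases ignored):

$$
\underbrace{7 \cdot 7}_{\text{spatial extent}} \cdot \underbrace{C}_{\text{input channels}} \cdot \underbrace{C}_{\text{filters}} = 49C^2
$$

For instance, with $C = 64$ this gives $49 \cdot 64^2 = 200{,}704$ weights, whereas $49C = 3{,}136$ would only cover the weights of one filter.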

Quote from CS231n:

To summarize, the Conv Layer:

  • Accepts a volume of size $W_1 \times H_1 \times D_1$
  • Requires four hyperparameters:
    • Number of filters $K$,
    • their spatial extent $F$,
    • the stride $S$,
    • the amount of zero padding $P$.
  • Produces a volume of size $W_2 \times H_2 \times D_2$ where:
    • $W_2 = (W_1 - F + 2P)/S + 1$
    • $H_2 = (H_1 - F + 2P)/S + 1$ (i.e. width and height are computed equally by symmetry)
    • $D_2 = K$
  • With parameter sharing, it introduces $F \cdot F \cdot D_1$ weights per filter, for a total of $(F \cdot F \cdot D_1) \cdot K$ weights and $K$ biases.
  • In the output volume, the $d$-th depth slice (of size $W_2 \times H_2$) is the result of performing a valid convolution of the $d$-th filter over the input volume with a stride of $S$, and then offset by $d$-th bias.

A common setting of the hyperparameters is $F = 3, S = 1, P = 1$. However, there are common conventions and rules of thumb that motivate these hyperparameters. See the ConvNet architectures section below.
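The formulas in the quote can be checked with a few lines of Python. This is only a minimal sketch: the function names `conv_layer_params` and `conv_output_size` are made up for illustration, and the $7 \times 7$ kernel with $C = 256$ channels is an assumed example, not a value taken from the paper.

```python
def conv_layer_params(filter_size, in_channels, num_filters):
    """Count weights and biases of a conv layer with square F x F filters.

    Hypothetical helper: follows the quoted formula (F * F * D_1) * K
    for the weights and K for the biases.
    """
    weights_per_filter = filter_size * filter_size * in_channels
    weights = weights_per_filter * num_filters
    biases = num_filters
    return weights, biases


def conv_output_size(input_size, filter_size, stride, padding):
    """Spatial output size along one axis: (W_1 - F + 2P) / S + 1.

    Integer division is used, so this assumes the hyperparameters
    are chosen such that the division is exact.
    """
    return (input_size - filter_size + 2 * padding) // stride + 1


# Assumed example: a 7x7 layer with C input channels and C filters.
C = 256
weights, biases = conv_layer_params(filter_size=7, in_channels=C, num_filters=C)
print(weights == 49 * C * C)  # the weight count matches 49 * C^2
print(biases == C)            # one bias per filter

# The common setting F = 3, S = 1, P = 1 keeps the spatial size unchanged.
print(conv_output_size(input_size=224, filter_size=3, stride=1, padding=1))
```

The count depends on both the input depth and the number of filters, which is why the result is quadratic in $C$ when the two are equal.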

Icyblade