
In the AlexNet model, after the convolutional (encoder) steps are completed, you end up with a 6x6x256 tensor. This needs to be flattened before we move on to the fully connected (ANN) part of the network. However, the flattening apparently results in a vector of length 4096. How did the size of the tensor shrink? In the tutorials I have read about flatten steps, there is no loss of size when you flatten a tensor, so I was expecting the length of the flattened vector to be 6 * 6 * 256, i.e. 9216. Why does AlexNet's flatten step end up with length 4096 and not 9216?

The AlexNet paper does not go into the details of the individual layers of the network.

Thanks


1 Answer


At first I was also confused, but after looking at some images of the model architecture the answer becomes quite clear:

[AlexNet architecture diagram] (source: https://media5.datahacker.rs/2018/11/alexnet_ispravljeno.png)

The last convolutional layer's output of 6x6x256 is unfolded into a 1D vector with n = 9216 (as you already said correctly). But these 9216 neurons are identical to the neurons in the convolution block; they are not new neurons. The unfolding just makes it easier to see how each of those 9216 values is connected to the actual fully connected layer, which has 4096 neurons. So flattening itself does not reduce the size at all: the reduction from 9216 to 4096 happens inside the first fully connected layer, whose learned 4096x9216 weight matrix maps the 9216 inputs to 4096 outputs.
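To make the shapes concrete, here is a minimal sketch in PyTorch (the framework choice is just for illustration; torchvision's AlexNet uses the same shapes). It shows that flattening keeps all 9216 values, and that it is the fully connected layer that maps them down to 4096:

    import torch
    import torch.nn as nn

    # Output of AlexNet's last conv/pool stage: (batch, 256, 6, 6)
    x = torch.randn(1, 256, 6, 6)

    # Flattening is lossless: it only reshapes, so 256 * 6 * 6 = 9216 values remain.
    flat = torch.flatten(x, start_dim=1)
    print(flat.shape)  # torch.Size([1, 9216])

    # The first fully connected layer maps the 9216 inputs to 4096 neurons
    # via a learned 4096 x 9216 weight matrix (plus a bias of size 4096).
    fc1 = nn.Linear(in_features=9216, out_features=4096)
    out = fc1(flat)
    print(out.shape)  # torch.Size([1, 4096])

In other words, 4096 is the number of neurons in the first fully connected layer, not the length of the flattened vector.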