
Is applying dropout equivalent to zeroing the output of random neurons in each mini-batch iteration, while leaving the rest of the forward and backward steps of back-propagation unchanged? I'm implementing a network from scratch in NumPy.

Qbik

1 Answer


Indeed. To be precise, the dropout operation randomly zeroes elements of the input tensor with probability $p$, and the remaining (non-dropped) elements are scaled by a factor of $\frac{1}{1-p}$ during training.

For example, see below how elements of the input tensor (top tensor in the output) are zeroed in the output tensor (bottom tensor in the output) using PyTorch.

import torch
import torch.nn as nn

m = nn.Dropout(p=0.5)        # each element is zeroed with probability 0.5
input = torch.randn(3, 4)
output = m(input)            # surviving elements are scaled by 1/(1 - 0.5) = 2

print(input, '\n', output)

tensor([[-0.9698, -0.9397,  1.0711, -1.4557],
        [-0.0249, -0.9614, -0.7848, -0.8345],
        [ 0.9420,  0.6565,  0.4437, -0.2312]])
tensor([[-0.0000, -0.0000,  2.1423, -0.0000],
        [-0.0000, -0.0000, -1.5695, -1.6690],
        [ 0.0000,  0.0000,  0.0000, -0.0000]])
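Since the question is about implementing this from scratch in NumPy, here is a minimal sketch of the same (inverted) dropout for one forward/backward pass. The helper names dropout_forward and dropout_backward are my own, not from any library; the key points are that the backward pass multiplies the incoming gradient by the same mask (and scale) used in the forward pass, and that dropout is a no-op at evaluation time.

import numpy as np

def dropout_forward(x, p=0.5, training=True, rng=None):
    # Inverted dropout: zero each element with probability p and scale the
    # survivors by 1/(1 - p) so the expected activation stays the same.
    if not training or p == 0.0:
        return x, None                                   # evaluation mode: no-op
    rng = np.random.default_rng() if rng is None else rng
    mask = (rng.random(x.shape) >= p) / (1.0 - p)        # entries are 0 or 1/(1 - p)
    return x * mask, mask

def dropout_backward(dout, mask):
    # Gradient flows only through the kept units, with the same scaling.
    return dout if mask is None else dout * mask

# Example: one mini-batch
x = np.random.randn(3, 4)
out, mask = dropout_forward(x, p=0.5)
dx = dropout_backward(np.ones_like(out), mask)

Because the scaling is applied during training ("inverted" dropout), nothing needs to change at test time, which matches the PyTorch behaviour shown above.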

EDIT: please note the post has been updated to reflect Todd Sewell's addition in the comments.

hH1sG0n3