I have been reading the early paper on pre-training in NLP (https://arxiv.org/abs/1511.01432) and I can't understand what random word dropout means. The authors don't explain this method at all, as if it were a standard thing. Can someone explain what they actually do and what the purpose of it is?
1 Answer
It is not uncommon that we can make sense of a sentence without reading every word of it. Likewise, when you skim a document, you skip over some words and still understand the main point. This is the intuition behind word dropout.
Generally this is done by randomly dropping each word in a sequence, for example with a Bernoulli mask:

$X \leftarrow X \odot \vec{e}, \quad e_i \sim \mathrm{Bernoulli}(1 - p)$

where $X$ is the sequence of word tokens (or their embeddings), $n$ is the length of the sequence, $\vec{e} \in \{0, 1\}^n$ is the dropout mask, and $p$ is the probability of dropping each word ($e_i = 1$ keeps the word, $e_i = 0$ drops it).
This is usually done after the word embeddings are computed, and the embeddings of the words selected to be dropped are normally replaced with the embedding of the <UNK> (unknown) token.
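To make that concrete, here is a minimal sketch in PyTorch (my choice of framework, not necessarily what the paper used); the function name `word_dropout`, the `unk_id`, and the tensor shapes are hypothetical. It replaces the token id with the <UNK> index *before* the embedding lookup, which is equivalent to swapping in the <UNK> embedding afterwards.

```python
# Minimal word-dropout sketch (not the authors' code): corrupt token ids
# before the embedding lookup, replacing dropped positions with <UNK>.
import torch

def word_dropout(token_ids: torch.Tensor, p: float, unk_id: int,
                 training: bool = True) -> torch.Tensor:
    """Randomly replace each token id with `unk_id` with probability `p`.

    token_ids : LongTensor of shape (batch, seq_len)  -- assumed layout
    p         : dropout rate (probability of dropping a word)
    unk_id    : vocabulary index of the <UNK> token   -- assumed to exist
    """
    if not training or p <= 0.0:
        return token_ids
    # e_i ~ Bernoulli(1 - p): True = keep the word, False = drop it
    keep_mask = torch.rand_like(token_ids, dtype=torch.float) >= p
    return torch.where(keep_mask, token_ids,
                       torch.full_like(token_ids, unk_id))

# Usage: corrupt the input ids, then embed as usual.
vocab_size, emb_dim, unk_id = 10_000, 128, 1      # hypothetical values
embedding = torch.nn.Embedding(vocab_size, emb_dim)
ids = torch.randint(2, vocab_size, (4, 12))       # fake batch of token ids
dropped = word_dropout(ids, p=0.25, unk_id=unk_id)
vectors = embedding(dropped)                      # shape (4, 12, 128)
```

At test time (`training=False`) no words are dropped, mirroring how standard dropout is disabled at inference.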
By doing this, we allow our model to learn more flexible ways of conveying meaning and make it more robust to missing or unseen words, much like regular dropout does for hidden units.