We are all familiar with the famous DeepMind paper on Spatial Transformer Networks (STN).
When implementing an STN, such as here, does anyone still use input data augmentation such as affine transformations?
These are normally used to make a CNN robust to rotation, translation, etc. I wonder how using them, or leaving them out, affects the output of an STN.
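
To make the question concrete, here is a minimal sketch of the kind of affine augmentation I mean. It samples a random 2x3 affine matrix (rotation, isotropic scale, translation) in normalized [-1, 1] image coordinates, which is the same shape of parameter matrix an affine STN predicts. The function names, parameter ranges, and the use of plain Python lists are my own assumptions for illustration and are not taken from the paper or from any particular implementation.

```python
import math
import random


def random_affine_theta(rng, max_deg=15.0, max_shift=0.1, scale_range=(0.9, 1.1)):
    """Sample a hypothetical 2x3 affine matrix [A | t] for augmentation.

    Coordinates are assumed to be normalized to [-1, 1]; the ranges are
    illustrative defaults, not recommended values.
    """
    angle = math.radians(rng.uniform(-max_deg, max_deg))
    scale = rng.uniform(*scale_range)
    tx = rng.uniform(-max_shift, max_shift)
    ty = rng.uniform(-max_shift, max_shift)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [
        [scale * cos_a, -scale * sin_a, tx],
        [scale * sin_a, scale * cos_a, ty],
    ]


def apply_theta(theta, x, y):
    """Map a single (x, y) point through the 2x3 affine matrix."""
    return (
        theta[0][0] * x + theta[0][1] * y + theta[0][2],
        theta[1][0] * x + theta[1][1] * y + theta[1][2],
    )


# Warp the four corners of the normalized image square with one random draw.
rng = random.Random(0)
theta = random_affine_theta(rng)
corners = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
warped_corners = [apply_theta(theta, x, y) for x, y in corners]
```

In a real pipeline the sampled matrix would be used to resample the input image before it reaches the network; the question is whether that step is still worth doing when the STN can learn a similar warp itself.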