Ascon has an interesting state construction: linear diffusion occurs only within words, and the only non-linear layer is also the only source of diffusion between words.
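
To make the structure concrete, here is a minimal Python sketch of one round of the Ascon permutation on five 64-bit words. The function names (`rotr`, `sbox_layer`, `linear_layer`, `ascon_round`) are mine, and the bitsliced s-box and rotation amounts are written from my reading of the Ascon specification, so check them against the spec before relying on them. The round constant is passed in as a parameter rather than taken from the official schedule.

```python
MASK = (1 << 64) - 1  # state words are 64 bits wide


def rotr(x, n):
    """Rotate a 64-bit word right by n bits."""
    return ((x >> n) | (x << (64 - n))) & MASK


def sbox_layer(s):
    """Bitsliced 5-bit s-box: bit j of every word forms one s-box input.

    Every operation is a bitwise AND/XOR/NOT between words, so column j of
    the output depends only on column j of the input. This is the only
    step that mixes the five words with each other.
    """
    x0, x1, x2, x3, x4 = s
    x0 ^= x4; x4 ^= x3; x2 ^= x1
    t0 = ~x0 & x1
    t1 = ~x1 & x2
    t2 = ~x2 & x3
    t3 = ~x3 & x4
    t4 = ~x4 & x0
    x0 ^= t1; x1 ^= t2; x2 ^= t3; x3 ^= t4; x4 ^= t0
    x1 ^= x0; x0 ^= x4; x3 ^= x2
    x2 = ~x2 & MASK
    return [x0, x1, x2, x3, x4]


# Rotation pairs per word (assumed from the specification).
ROTATIONS = [(19, 28), (61, 39), (1, 6), (10, 17), (7, 41)]


def linear_layer(s):
    """Each word is XORed with two rotations of itself; words never mix here."""
    return [x ^ rotr(x, a) ^ rotr(x, b) for x, (a, b) in zip(s, ROTATIONS)]


def ascon_round(s, round_constant):
    """One round: constant addition into word 2, s-box layer, linear layer."""
    s = list(s)
    s[2] ^= round_constant
    return linear_layer(sbox_layer(s))
```

In this sketch a single-bit difference passes through `sbox_layer` without leaving its bit column, while `linear_layer` spreads a bit to three positions inside its own word and to no other word.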


As far as I know, it's the only standardized cipher where an s-box serves as the only diffusion mechanism between parts of the state.
The rationale stated in the designers' publications does not address this seemingly novel approach.
There are a number of permutations based on four 128-bit vectors utilizing AES (e.g. Haraka), but they rely on shuffling bytes around. Would using 4-bit s-boxes not have been a better choice? Is there something wrong with the approach? Ascon was selected either despite it or because of it.