Is there a standard way of scrambling the output of AES?

Question

So I needed symmetric encryption for my program. I landed on AES 192 bits in the CTR mode, because of some Computerphile videos on YouTube.

After using it with the Node.js "crypto" lib implementation, I noticed that some of the outputs are very similar. The output is created from a UTF-8 SQL syntax string input and encoded as base64. Based on what I know so far, this makes sense, since a lot of SQL strings would start with the same text, e.g. "SELECT ...", and AES works with independent blocks of data. I also append a random integer to the end of each of these inputs, but cannot do so at the start (because of the specific situation).

Is it a problem, that the start of a crypt can be easily guessed? If so, is there a way to scramble the output, such that it can be unscrambled later with the same key and IV?

Are there alternative algorithms or modes that do this kind of thing? I need the output to be unintelligible and unalterable.

fgrieu · Accepted Answer · 2022-12-30T11:55:10.040

Is it a problem, that the start of a ~~crypt~~ ciphertext can be easily guessed?

That happens by design for excellent encryption systems, e.g. because every ciphertext starts with a version and key identifier. But in the case at hand, that's the symptom of a devastating error: AES-CTR is being used for different records with the same constant IV, therefore the cipher degenerates to XOR with a constant bitstring, which is very poor encryption.
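To make the failure concrete, here is a minimal Node.js sketch, assuming the question's setup is roughly "AES-192-CTR with one fixed IV for every record". The key, the all-zero IV, the two SQL strings, and the helper names `encryptWithFixedIv` and `xor` are illustrative assumptions, not taken from the question's code. It shows that XORing two ciphertexts cancels the keystream and yields the XOR of the two plaintexts, without any knowledge of the key.

```javascript
const crypto = require("crypto");

const key = crypto.randomBytes(24);   // 192-bit AES key
const fixedIv = Buffer.alloc(16, 0);  // the mistake: the same IV for every record

// Encrypts with AES-192-CTR, reusing the constant IV above (do NOT do this).
function encryptWithFixedIv(plaintext) {
  const cipher = crypto.createCipheriv("aes-192-ctr", key, fixedIv);
  return Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
}

// Byte-wise XOR of two buffers, truncated to the shorter one.
function xor(a, b) {
  const n = Math.min(a.length, b.length);
  const out = Buffer.alloc(n);
  for (let i = 0; i < n; i++) out[i] = a[i] ^ b[i];
  return out;
}

const p1 = "SELECT name FROM users WHERE id = 1";
const p2 = "SELECT mail FROM staff WHERE id = 2";
const c1 = encryptWithFixedIv(p1);
const c2 = encryptWithFixedIv(p2);

// Same key and IV give the same keystream, so it cancels out:
// c1 XOR c2 equals p1 XOR p2.
const leak = xor(c1, c2);
const expected = xor(Buffer.from(p1, "utf8"), Buffer.from(p2, "utf8"));
console.log(leak.equals(expected));
```

An attacker who knows or guesses one plaintext can therefore recover the matching part of any other, which is why the similar-looking outputs are a symptom rather than the problem itself.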

AES-CTR mode is designed to be used as follows:

- At encryption of each cryptogram, a fresh IV is chosen, usually 8 bytes long, by some process that makes it very unlikely that the same IV will be chosen again for a given key. An incremental counter might do, if there's no way it can be reset†.
- That IV is placed as the first bytes of the ciphertext. These bytes are used at decryption to recover the IV.
- That IV is extended to 128 bits, typically internally to the implementation of the CTR-mode cipher.
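The steps above can be sketched in Node.js as follows. This is only an illustration under stated assumptions: the IV is drawn at random (one of the options discussed in the footnote), the function names `encryptCtr` and `decryptCtr` are hypothetical, and the extension to 128 bits is done explicitly by appending a zeroed 64-bit block counter, because Node's `createCipheriv` expects a full 16-byte IV for `aes-192-ctr`.

```javascript
const crypto = require("crypto");

const NONCE_LEN = 8; // 8-byte IV, as in the description above

// Extends the 8-byte IV to a 128-bit initial counter block (assumed layout:
// IV followed by a 64-bit block counter starting at zero).
function counterBlock(nonce) {
  return Buffer.concat([nonce, Buffer.alloc(16 - NONCE_LEN)]);
}

// Returns IV || ciphertext, with a fresh random IV for every call.
function encryptCtr(key, plaintext) {
  const nonce = crypto.randomBytes(NONCE_LEN);
  const cipher = crypto.createCipheriv("aes-192-ctr", key, counterBlock(nonce));
  return Buffer.concat([nonce, cipher.update(plaintext, "utf8"), cipher.final()]);
}

// Reads the IV back from the first bytes, then decrypts the rest.
function decryptCtr(key, message) {
  const nonce = message.subarray(0, NONCE_LEN);
  const decipher = crypto.createDecipheriv("aes-192-ctr", key, counterBlock(nonce));
  const body = message.subarray(NONCE_LEN);
  return Buffer.concat([decipher.update(body), decipher.final()]).toString("utf8");
}

const key = crypto.randomBytes(24); // 192-bit AES key
const a = encryptCtr(key, "SELECT * FROM t");
const b = encryptCtr(key, "SELECT * FROM t");
console.log(a.toString("base64"), b.toString("base64")); // differ, including the shared prefix
console.log(decryptCtr(key, a));
```

This fixes the confidentiality problem only; the ciphertext can still be modified undetected.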

I need the output to be unintelligible and unalterable.

Then do not use AES-CTR. It aims only at confidentiality of the data, not integrity, which typically is also an operational requirement, and one we read in "unalterable". For this we have authenticated encryption, e.g. AES-GCM, and variants of that which make nonce (aka IV) reuse a lesser disaster, e.g. AES-GCM-SIV.
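As a rough sketch of what authenticated encryption looks like with Node's `crypto` module, the following uses `aes-192-gcm` with a random 12-byte IV and a 16-byte tag. The message layout (IV, then tag, then ciphertext) and the names `encryptGcm` and `decryptGcm` are assumptions made for this example, not a standard format. AES-GCM-SIV is not shown.

```javascript
const crypto = require("crypto");

const IV_LEN = 12;  // 96-bit IV, the usual size for GCM
const TAG_LEN = 16; // 128-bit authentication tag

// Returns IV || tag || ciphertext (layout chosen for this sketch).
function encryptGcm(key, plaintext) {
  const iv = crypto.randomBytes(IV_LEN);
  const cipher = crypto.createCipheriv("aes-192-gcm", key, iv, { authTagLength: TAG_LEN });
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]);
}

// Throws if the IV, tag, or ciphertext was altered.
function decryptGcm(key, message) {
  const iv = message.subarray(0, IV_LEN);
  const tag = message.subarray(IV_LEN, IV_LEN + TAG_LEN);
  const body = message.subarray(IV_LEN + TAG_LEN);
  const decipher = crypto.createDecipheriv("aes-192-gcm", key, iv, { authTagLength: TAG_LEN });
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(body), decipher.final()]).toString("utf8");
}

const key = crypto.randomBytes(24); // 192-bit AES key
const sealed = encryptGcm(key, "SELECT 1");
console.log(decryptGcm(key, sealed));

sealed[sealed.length - 1] ^= 1; // flip one ciphertext bit
try {
  decryptGcm(key, sealed);
} catch (err) {
  console.log("tampering detected");
}
```

The plaintext should only be used after `decipher.final()` has returned without throwing, since that call is where the tag is verified.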

Caution: defining the operational requirements of cryptography in database applications is hard. For example, when encrypting the answer to a secret question used for user authentication purposes, authenticated encryption of that data in isolation is not enough (because it still allows substituting the unknown answer with a known one taken from another cell). One solution to that is to supply the identification of the encrypted cell as GCM Additional Authenticated Data.
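A minimal sketch of binding a ciphertext to its cell via Additional Authenticated Data, using Node's `cipher.setAAD`. The cell identifier format (a JSON array of table, column, and row key), the message layout, and the names `cellId`, `encryptCell`, and `decryptCell` are hypothetical choices for illustration.

```javascript
const crypto = require("crypto");

// Hypothetical cell identifier; JSON keeps the three parts unambiguous.
function cellId(table, column, rowKey) {
  return Buffer.from(JSON.stringify([table, column, rowKey]), "utf8");
}

// Returns IV || tag || ciphertext; the cell id is authenticated but not stored.
function encryptCell(key, id, plaintext) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv("aes-192-gcm", key, iv, { authTagLength: 16 });
  cipher.setAAD(id);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]);
}

// Throws unless the message was produced for exactly this cell id.
function decryptCell(key, id, message) {
  const decipher = crypto.createDecipheriv("aes-192-gcm", key, message.subarray(0, 12), { authTagLength: 16 });
  decipher.setAuthTag(message.subarray(12, 28));
  decipher.setAAD(id);
  return Buffer.concat([decipher.update(message.subarray(28)), decipher.final()]).toString("utf8");
}

const key = crypto.randomBytes(24);
const alice = cellId("users", "secret_answer", 1);
const mallory = cellId("users", "secret_answer", 2);

// Mallory's own (known) answer, encrypted for Mallory's row.
const stored = encryptCell(key, mallory, "blue");
console.log(decryptCell(key, mallory, stored));

// Copying it into Alice's row fails authentication.
try {
  decryptCell(key, alice, stored);
} catch (err) {
  console.log("ciphertext does not belong to this cell");
}
```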

† It's often difficult to keep track of which IVs have been used. One strategy then is to generate the IVs at random: the probability of two identical $b$-bit IVs after $n$ are drawn is no more than $n(n-1)/2^{b+1}$ if a working true random number generator is used. For $n$ up to a few million, that's fine for the usual $b=64$. Above that, an option is to use $b=80$ or $b=96$, noting that no more than $2^{132-b}$ bytes should be encrypted with the same IV.
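To get a feel for these numbers, the two formulas in the footnote can be evaluated directly. The function names `ivCollisionBound` and `maxBytesPerIv` and the choice of one million IVs are assumptions for this example only.

```javascript
// Upper bound n(n-1)/2^(b+1) on the probability that n random b-bit IVs
// contain at least one repeat.
function ivCollisionBound(n, b) {
  return (n * (n - 1)) / 2 ** (b + 1);
}

// Limit 2^(132-b) on the bytes encrypted under one IV, as an exact BigInt.
function maxBytesPerIv(b) {
  return 2n ** BigInt(132 - b);
}

for (const b of [64, 80, 96]) {
  console.log(
    `b=${b}: collision bound for 1e6 IVs = ${ivCollisionBound(1e6, b)}, ` +
    `max bytes per IV = ${maxBytesPerIv(b)}`
  );
}
```

For $b=96$ the per-IV limit works out to $2^{36}$ bytes, i.e. 64 GiB.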
