I need a deterministic encryption scheme. Objectives:
- The same plaintext is always encrypted to the same ciphertext.
- Related plaintexts are encrypted to unrelated ciphertexts. (For example, an attacker who knows two ciphertexts cannot learn whether the corresponding plaintexts start with the same prefix.)
- Minimum space overhead.
- Authenticated encryption, of course.
There are also some conditions that can make it easier:
- I will encrypt small chunks of data (a few dozen bytes). It is fine to hold a whole chunk in RAM before starting the encryption.
- The data will always be padded to the same length, so the length of the data can be treated as a constant.
- No high performance requirements.
My first idea was to derive the IV from the plaintext using HMAC and then use some authenticated encryption that requires just a unique (potentially predictable) IV, like AES-GCM. This requires both the IV and the authentication tag to be stored alongside the ciphertext.
But I have an idea how to reduce the space overhead even more: use the (H)MAC of the plaintext as both the IV and the authentication tag of the message. I know this is a generally dangerous scheme, close to encrypt-and-MAC, so I specify further requirements for the primitives to mitigate its risks:
- The encryption will not leak anything when decrypting attacker-manipulated data. Stream ciphers like AES-CTR look suitable here, because there are no risks related to padding oracles. It just “decrypts” any attacker-provided garbage to other garbage (rejected by the MAC check later) without any side channel, provided the AES-CTR implementation is correct.
- The encryption requires only that the IV is unique and places no other requirements on it. Again, AES-CTR looks suitable here.
- The MAC does not leak any data about the plaintext. HMAC (potentially truncated to 128 bits) looks suitable here, because a leak would at least weaken the preimage resistance of the underlying hash function.
- Encryption uses a key completely unrelated to the authentication key, of course. (This requirement might not be strictly needed when using, say, AES-CTR with HMAC-SHA256. However, the requirement makes reasoning about security easier.)
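To make the proposal concrete, here is a minimal Python sketch of the construction described above. The names `encrypt`, `decrypt`, `_keystream` and `TAG_LEN` are made up for illustration. Python's standard library has no AES, so the keystream below is HMAC-SHA256 run in counter mode as a stand-in for AES-CTR; a real implementation would use AES-CTR with the truncated MAC as the IV. It is only a sketch under these assumptions, not a vetted implementation.

```python
import hashlib
import hmac

TAG_LEN = 16  # HMAC-SHA256 truncated to 128 bits; serves as both IV and tag


def _keystream(enc_key: bytes, iv: bytes, length: int) -> bytes:
    """Stand-in for AES-CTR (assumption): HMAC-SHA256(enc_key, iv || counter)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block_input = iv + counter.to_bytes(4, "big")
        out += hmac.new(enc_key, block_input, hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    # The MAC is computed over the plaintext and then reused as the IV.
    tag = hmac.new(mac_key, plaintext, hashlib.sha256).digest()[:TAG_LEN]
    stream = _keystream(enc_key, tag, len(plaintext))
    body = bytes(p ^ k for p, k in zip(plaintext, stream))
    return tag + body  # total overhead: TAG_LEN bytes


def decrypt(enc_key: bytes, mac_key: bytes, ciphertext: bytes) -> bytes:
    if len(ciphertext) < TAG_LEN:
        raise ValueError("ciphertext too short")
    tag, body = ciphertext[:TAG_LEN], ciphertext[TAG_LEN:]
    stream = _keystream(enc_key, tag, len(body))
    plaintext = bytes(c ^ k for c, k in zip(body, stream))
    # Recompute the MAC over the candidate plaintext; compare in constant time.
    expected = hmac.new(mac_key, plaintext, hashlib.sha256).digest()[:TAG_LEN]
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return plaintext  # released only after the MAC check succeeds
```

With two independent keys, `encrypt(enc_key, mac_key, m)` returns the same bytes every time for the same `m`, and the output is exactly `TAG_LEN` (16) bytes longer than the padded plaintext, compared with IV plus tag in the AES-GCM variant.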
The whole scheme looks somewhat exotic to me, but I believe that I have resolved the risks arising from the unusual design.
Is there any existing work that aims to do the same? Is there any research that would tell me anything about security of this scheme?