
For a project, I need to store a short string in encrypted form. For my purposes, it would be ideal to encrypt it using itself as the key.

# This would be the usual way:
result = AES_ENCRYPT(string, key)

# but I would like to do it this way:
result = AES_ENCRYPT(string, string)

The same string itself would be used as the key!

The 'string' is always a short text, such as 20 characters of letters and numbers. This solution seems better to me than using a hash function, since a hash can produce collisions: different strings could potentially give the same result. I know the probability is low, but it is not zero, so I want to avoid that.

My question is: is my approach safe? Does encrypting the data with the data itself lead to any security issue? Would the result be easier to decrypt if the attacker knows it was produced this way? Thank you.

Tomas M

1 Answer


Your scheme turns AES into a one-way function. As you already found out from the comments, this scheme doesn't preclude collisions. There is a good reason why hash functions have a larger output size than the block size of most block ciphers: the birthday problem applies to this newly built PRF just as it does to normal hash functions.

The chance of a collision is so low for SHA-2 that you can easily drop some of the rightmost bytes of the output and still have a large security margin in case you need a smaller output. The number of hashes required to find a collision in SHA-256 is about $2^{128}$, i.e. about the same number of tries as brute-forcing AES-128. In the end, the output needs to have at least twice as many bits as the required security strength.
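
To make the truncation point concrete, here is a minimal sketch using Python's standard `hashlib`. The helper names `truncated_sha256` and `collision_work_bits` and the 20-byte output length are my own choices for illustration, not anything from the question; the second helper only applies the "half the output bits" birthday rule of thumb described above.

```python
import hashlib


def truncated_sha256(data: bytes, out_len: int = 20) -> bytes:
    """Return the leftmost out_len bytes of SHA-256(data).

    Hypothetical helper: it keeps the left part of the digest and
    drops the rightmost bytes.
    """
    if not 1 <= out_len <= 32:
        raise ValueError("out_len must be between 1 and 32 bytes")
    return hashlib.sha256(data).digest()[:out_len]


def collision_work_bits(output_bits: int) -> float:
    """Rough log2 of the number of hashes needed to find a collision.

    Birthday bound rule of thumb: about half the output size in bits.
    """
    return output_bits / 2


tag = truncated_sha256(b"some short string", 20)
print(len(tag) * 8, "output bits, collision work about 2^%d"
      % collision_work_bits(len(tag) * 8))
```

With a 20-byte (160-bit) output the collision work is still around $2^{80}$; whether that is enough depends on the strength you actually need.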

Your key is a problem as well. You say you have 20 characters of letters and numbers. That's not a good AES key. If you just encode it using ASCII you get a 20-byte key, and AES only accepts keys of 16, 24 or 32 bytes. So you can either think of some weird encoding method, or you could use a KDF. But the KDF is likely built from a block cipher or a hash function...
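
The key-size mismatch is easy to check for yourself. The sketch below uses only built-in Python; the sample string and the helper `is_valid_aes_key` are made up for illustration.

```python
# AES key sizes in bytes: AES-128, AES-192 and AES-256.
AES_KEY_SIZES = (16, 24, 32)


def is_valid_aes_key(key: bytes) -> bool:
    """Return True only if the key length is one AES accepts."""
    return len(key) in AES_KEY_SIZES


# Hypothetical 20-character string of letters and digits.
candidate = "Ab3dE6gH9jK2mN5pQ8rT".encode("ascii")

print(len(candidate), is_valid_aes_key(candidate))  # prints: 20 False
```

Padding or cutting the string to 16 or 24 bytes would make it fit, but that is exactly the kind of ad hoc encoding you would have to invent and justify yourself.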

Finally, a hash can take input of any size, and it has been designed to be a one-way function. That is its purpose, so there is simply no need for a block cipher here. If you need the function to be irreversible, you may want to use a password hash such as PBKDF2 to counter the lack of entropy in the input.
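
As a rough sketch of the PBKDF2 route, Python's standard library offers `hashlib.pbkdf2_hmac`. The function name `hash_short_string`, the iteration count, the 16-byte salt and the 32-byte output are assumptions chosen for illustration; tune them to your own requirements.

```python
import hashlib
import os


def hash_short_string(text: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Derive a 32-byte one-way value from text with PBKDF2-HMAC-SHA256.

    Hypothetical helper; the iteration count is an assumed default.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", text.encode("utf-8"), salt, iterations, dklen=32
    )


# The salt has to be stored next to the result so the value can be recomputed.
salt = os.urandom(16)
stored = hash_short_string("Ab3dE6gH9jK2mN5pQ8rT", salt)
print(salt.hex(), stored.hex())
```

Note that a random salt makes the output differ between records even for equal inputs. If you need equal strings to map to equal outputs, you would have to use a fixed salt instead, which gives up part of the protection the salt provides.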

Maarten Bodewes