
I need to encrypt file and directory names and paths, but the encryption must be deterministic: I need to sync local files with a cloud storage provider, so I can't use probabilistic encryption.

Knowing that you should not use a static IV when encrypting text, would this be an acceptable workaround?

  • Run passphrase through scrypt and store resulting output
  • Take the resulting output from scrypt and hash it (using MD5 for example)
  • Take the first 16 bytes of the hash and use it as the IV to encrypt the directory and file name
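
A minimal sketch of the derivation steps above, using only Python's standard library. The scrypt cost parameters, the 32-byte key length, and the `APP_SALT` placeholder are assumptions for illustration; the encryption step itself is left out.

```python
import hashlib

# Hypothetical placeholder for the salt that ships with the application.
APP_SALT = b"example-hard-coded-salt"

def derive_key_and_iv(passphrase: str, salt: bytes = APP_SALT):
    """Return (key, iv) derived from the passphrase as in the steps above."""
    # Step 1: stretch the passphrase with scrypt (cost parameters are assumed).
    key = hashlib.scrypt(
        passphrase.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32
    )
    # Steps 2-3: hash the scrypt output and keep 16 bytes as the IV.
    # An MD5 digest is exactly 16 bytes, so the slice keeps all of it.
    iv = hashlib.md5(key).digest()[:16]
    return key, iv
```

Because the IV here depends only on the passphrase and the salt, every file and directory name encrypted under the same passphrase gets the same IV.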

One important point: the salt used for scrypt is hard-coded most of the time. This is a local application that doesn't require a login, and users must be able to decrypt the files on multiple computers, so I can't store the salts in a database as I normally would. Users can specify their own salt, but usually it will be the one hard-coded in the application.

The only other thing I can think of:

  • Use probabilistic encryption
  • Read the directory/file structure from the cloud service provider and local directory
  • Map all the encrypted cloud provider names to their decrypted values
  • Map all the encrypted local names to their decrypted values
  • Sync based on the mappings found above

The only issue with that is that it is time-consuming and really difficult to implement across different cloud service providers.
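
A rough sketch of what the mapping approach could look like. The names `build_mapping` and `plan_sync` are hypothetical, and `decrypt` stands for whatever probabilistic scheme ends up being used; it is passed in as a callable so the sketch does not assume a particular cipher.

```python
from typing import Callable, Dict, Iterable, List, Tuple

def build_mapping(
    encrypted_names: Iterable[str], decrypt: Callable[[str], str]
) -> Dict[str, str]:
    """Map each decrypted name to the encrypted name it is stored under."""
    return {decrypt(enc): enc for enc in encrypted_names}

def plan_sync(
    local_names: Iterable[str],
    remote_names: Iterable[str],
    decrypt: Callable[[str], str],
) -> Tuple[List[str], List[str]]:
    """Return (names to upload, names to download), compared by plaintext name."""
    local_map = build_mapping(local_names, decrypt)
    remote_map = build_mapping(remote_names, decrypt)
    to_upload = sorted(set(local_map) - set(remote_map))
    to_download = sorted(set(remote_map) - set(local_map))
    return to_upload, to_download
```

Every name on both sides has to be listed and decrypted before anything can be compared, which is where the extra time goes.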

stackiee

1 Answer

From my understanding of your question, what you want is to hide the file name while still enabling syncing. You don't actually need encryption per se, since the real name of the file could be encrypted together with the entire file (probabilistically). As such, a potential solution is to just compute an HMAC of the file name and use that as the value for syncing. No IV or encryption is needed.
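
As a minimal sketch of the HMAC idea, assuming HMAC-SHA256, UTF-8 encoded names, and a secret key that has already been derived from the user's passphrase (for example with scrypt); the function name `sync_name` is hypothetical:

```python
import hashlib
import hmac

def sync_name(key: bytes, name: str) -> str:
    """Deterministic, non-invertible token for a file or directory name."""
    # The same key and name always give the same token, so it can be used
    # to match local and remote entries without revealing the real name.
    return hmac.new(key, name.encode("utf-8"), hashlib.sha256).hexdigest()
```

The token is a 64-character hex string, so it is safe to use as a file name on most storage providers. Names should be normalized the same way on every machine (path separators, Unicode form) before hashing, otherwise the tokens will not match.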

If you want to be able to invert this (if that's important), then you should use something like AES-SIV or AES-GCM-SIV (with a fixed IV of 0; in general, this isn't recommended, but if these are really your constraints then it works). This ensures that two file names that share a common part cannot be recognized as such from their ciphertexts. You can of course also do this by deriving an IV and encrypting as you say, but that's more complicated and not standard. As such, it's not recommended.

My recommendation would be an HMAC of the name for syncing, and regular probabilistic encryption of the name and the full file.

Yehuda Lindell