Hash Comparison to Detect Ransomware File Encryption

Question

As detailed in a separate question, I thought I had a way to detect the type of ransomware that encrypts files silently, and then decrypts them on the fly, so as to prevent the user from realizing that the files have been encrypted. I thought that a comparison of present vs. past file hashes would detect file changes: if many files were unexpectedly changed, maybe those changes were due to ransomware encryption.

A comment on that question seems to say that my concept fails because a file must be read in order to be hashed. The ransomware would make the file's contents available to the hashing tool; that tool would find that the contents appeared unchanged; therefore I would get the same hash value as before.

I don't understand that. It seems I need to address it in this separate question. If hashing only takes account of the file's contents, wouldn't it be impossible to hash, say, a file that the user has securely encrypted?

A Cryptography discussion seems to say that hash values, for a file, may vary according to the timing of encryption with a public key. I interpret that as meaning that variations in the encryption process can produce variations in hash values. That seems incompatible with a general claim that hashing would not detect any difference between an encrypted file (even if decrypted on the fly) and its previously unencrypted form.

What am I missing here?

mentallurg · Accepted Answer · 2022-02-21T22:15:18.087

I don't understand that

If your system is infected, there is no guarantee that what you read is the real file contents as stored on the disk. The file may have been encrypted by ransomware. When you ask an application to read it, the application calls the operating system. If ransomware has infected the system, it will read the encrypted contents, decrypt them, and pass the result to the OS and thus to your application. As long as you use an infected system, you cannot know what the real contents on the disk are.
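To make the mechanism concrete, here is a minimal simulation, not real ransomware behavior. The `Disk` and `HookedOS` classes, the file name, and the single-byte XOR "cipher" are all hypothetical stand-ins chosen for illustration. It only shows why a hash computed through an infected read path can match the baseline while the bytes on disk differ.

```python
import hashlib

KEY = 0x5A  # hypothetical toy key; a real attacker would use a strong cipher


def xor_bytes(data: bytes, key: int = KEY) -> bytes:
    """Toy reversible 'encryption' standing in for a real cipher."""
    return bytes(b ^ key for b in data)


class Disk:
    """Hypothetical raw storage: what an uninfected system would see."""

    def __init__(self):
        self.blocks = {}

    def raw_read(self, name: str) -> bytes:
        return self.blocks[name]


class HookedOS:
    """Hypothetical infected read path that decrypts on the fly."""

    def __init__(self, disk: Disk, encrypted_names: set):
        self.disk = disk
        self.encrypted_names = encrypted_names

    def read(self, name: str) -> bytes:
        data = self.disk.raw_read(name)
        if name in self.encrypted_names:
            return xor_bytes(data)  # the caller never sees the ciphertext
        return data


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


original = b"quarterly report, final version"
baseline = sha256_hex(original)  # hash recorded before the infection

disk = Disk()
disk.blocks["report.txt"] = xor_bytes(original)  # silently encrypted on disk
infected = HookedOS(disk, {"report.txt"})

hash_via_infected_os = sha256_hex(infected.read("report.txt"))
hash_of_raw_disk = sha256_hex(disk.raw_read("report.txt"))

print(hash_via_infected_os == baseline)  # True: the check is fooled
print(hash_of_raw_disk == baseline)      # False: the raw bytes have changed
```

The hashing tool itself behaves correctly in both cases; the difference is only in which bytes it is handed.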

The only reliable way to detect the encryption is to read the files from another, uninfected system. Boot from a USB stick, compute hashes of the files, and repeat this from time to time, e.g. daily or weekly. Of course, this differs from your wish to detect changes immediately.
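A rough sketch of such a periodic check, assuming it is run from the trusted (USB-booted) system and that the manifest is stored outside the machine being checked. The function names and the choice of SHA-256 and JSON are illustrative assumptions, not a prescribed tool.

```python
import hashlib
import json
import os


def hash_file(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: str) -> dict:
    """Map each file's path (relative to root) to its hash."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = hash_file(full)
    return manifest


def save_manifest(manifest: dict, path: str) -> None:
    with open(path, "w", encoding="utf-8") as f:
        json.dump(manifest, f, indent=2, sort_keys=True)


def load_manifest(path: str) -> dict:
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def changed_files(old: dict, new: dict) -> list:
    """Files present in both manifests whose hashes differ."""
    return sorted(p for p in old if p in new and old[p] != new[p])


def changed_fraction(old: dict, new: dict) -> float:
    """Share of common files that changed; a high value is suspicious."""
    common = [p for p in old if p in new]
    if not common:
        return 0.0
    return len(changed_files(old, new)) / len(common)
```

On each run one would load the previous manifest, build a fresh one, and look at `changed_fraction`. What counts as "too many" changes depends on how the files are normally used, so any threshold is a judgment call.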

wouldn't it be impossible to hash, say, a file that the user has securely encrypted?

You can hash any file. Only you know whether you have encrypted the file or not. For the operating system and for ransomware there is no difference: any file is just a sequence of bytes. If you encrypt the file, compute its hash, write the file to the disk, then read it back, you will get exactly what you wrote (your encrypted file). But you will not know whether it was encrypted by ransomware before being saved to the disk and decrypted after being read.

that hash values, for a file, may vary according to the timing of encryption with a public key

It is not the hash of the plain file that may vary, but the encryption result. As a consequence, the different encrypted files will have different hashes.
Encryption results of the same file may vary, but not because of timing. For instance, you can run AES-GCM encryption of the same file with the same password on 100 parallel threads on the same computer at the same time, and all of them will produce different results, because each run normally uses a fresh random nonce. But when decrypted, they will all produce the same original file.
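The effect can be reproduced with the standard library alone. The sketch below does not use AES-GCM: it swaps in a toy stream cipher whose keystream comes from SHA-256, purely to show the role of a fresh random nonce. The function names are hypothetical, the key is derived from the password with a bare hash rather than a proper key derivation function, and none of this should be used to protect real data.

```python
import hashlib
import os

NONCE_LEN = 12


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a block counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(NONCE_LEN)  # fresh randomness on every call
    ks = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ k for p, k in zip(plaintext, ks))


def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:NONCE_LEN], blob[NONCE_LEN:]
    ks = keystream(key, nonce, len(body))
    return bytes(c ^ k for c, k in zip(body, ks))


# Toy key derivation (assumption): real tools use a proper KDF.
key = hashlib.sha256(b"same password every time").digest()
plaintext = b"the same file contents"

blobs = [encrypt(key, plaintext) for _ in range(5)]
ciphertext_hashes = {hashlib.sha256(b).hexdigest() for b in blobs}
recovered = {decrypt(key, b) for b in blobs}

print(len(ciphertext_hashes))    # 5, barring a negligible-probability nonce collision
print(recovered == {plaintext})  # True: every ciphertext decrypts to the same file
```

Same key and same plaintext give five different ciphertexts with five different hashes, yet the hash of the decrypted data is always the hash of the original file.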

fgrieu · Answer 2 · 2022-02-22T12:30:57.057

Hashing detects (with overwhelming probability) any difference between two pieces of data, including one being an encrypted version of the other. Thus the principle of comparing hashes of files to detect that many have changed is sound.

There are however a few ways a program systematically encrypting the files on disk could evade detection from a program checking that hashes of files on disk do not change. They include:

- Disabling the check by seizing all CPU resources during the encryption.
- Hooking into the read code of all programs (including the one performing the hash check) so as to present them unmodified data until all the files have been encrypted, even though the data has already been physically encrypted on disk. This is possible if the encryption key is used to decrypt until the encryption is complete.

Update: the cryptoransomware need not be tailored to the hash, or to the hash comparison program; all it needs to do is correctly implement either of the above two bullet points. On the other hand, the few actual file-encrypting ransomware samples that I studied (in a VM) only partially implemented the first strategy (as a side effect of their main strategy: encrypt as fast as they can), and not the second one, which in modern OSes requires a privilege escalation.
