
One thing cryptography articles never seem to explain is how the message actually gets encrypted. You get this long-winded lecture on number theory which ends with "Ta-da!" and we have a public and private key. Then, they never explain the process of exactly how some 100,000-byte Word document gets encrypted using the public key.

Other times the explanations seem downright misleading. For example, my book Cryptography (by Meyer and Matyas) says that for RSA the ciphertext is the plaintext to the power of the key. How do you exponentiate a Word document? I don't get it.

Minkus CNB
Tyler Durden

4 Answers


Rick Demer already wrote the answer in the very first comment, but without explanation: Hybrid encryption.

But since you asked for a real practical example of encrypting your Word document, this is how: your file is on your disk, and it is 100,000 bytes long. You can then do the following:

  • First, start up a random number generator. Preferably you have either true randomness or at least a cryptographically secure RNG.
  • Let the RNG generate a random 256-bit number.
  • Start up your AES-256 engine/program/hardware module and choose an appropriate mode of operation, such as CBC. Use the random number as the key.
  • Use AES to encrypt the entire content of the file.
  • Get your public key or generate a private/public key pair, depending on what you want to do.
  • Encrypt the random number (used as the symmetric key) with your asymmetric encryption scheme. In the case of RSA, do not just use the textbook variant; use RSA-OAEP or a similar padding scheme.
  • Create a new file: put relevant information in the file header (whatever you want to be accessible without decryption). In the data section of the file, write the encrypted random number (it was encrypted with the public key), and then append the entire encrypted file content (encrypted with AES).
  • Optional: you can add delimiters inside your file to make it easier to see where the key ends and where the ciphertext starts. If you want, you can also note which algorithms you actually used to encrypt this specific file.

AES can obviously be replaced with any other symmetric cipher. You can also pick a different mode of operation, or use a different asymmetric encryption scheme to encrypt the random key. If you do so, recording the choice in the file might help if you ever forget which file was encrypted with which algorithm.
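The steps above can be sketched in a few lines of Python. This is only an illustration of the file layout, under loud assumptions: the SHA-256 keystream is a stand-in for AES, the key wrap is textbook RSA rather than RSA-OAEP, the RSA key is a toy built from two publicly known Mersenne primes, and the `HYB1` header and all function names are hypothetical. None of it is fit for real use.

```python
import hashlib
import secrets
import struct

# ASSUMPTION: toy RSA key from two publicly known Mersenne primes (insecure).
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent (Python 3.8+)
KEY_FIELD = (N.bit_length() + 7) // 8    # bytes needed to store a number < N

MAGIC = b"HYB1"                          # hypothetical file header tag


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Stand-in for AES: XOR data with a SHA-256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = struct.pack(">Q", offset // 32)
        pad = hashlib.sha256(key + nonce + counter).digest()
        out.extend(a ^ b for a, b in zip(data[offset:offset + 32], pad))
    return bytes(out)


def hybrid_encrypt(plaintext: bytes) -> bytes:
    sym_key = secrets.token_bytes(32)    # random 256-bit symmetric key
    nonce = secrets.token_bytes(16)
    body = keystream_xor(sym_key, nonce, plaintext)
    # Textbook RSA on the key only; a real system would use RSA-OAEP here.
    wrapped = pow(int.from_bytes(sym_key, "big"), E, N).to_bytes(KEY_FIELD, "big")
    # Layout: header | length of wrapped key | wrapped key | nonce | ciphertext
    return MAGIC + struct.pack(">H", len(wrapped)) + wrapped + nonce + body


def hybrid_decrypt(blob: bytes) -> bytes:
    if blob[:4] != MAGIC:
        raise ValueError("unknown file format")
    (key_len,) = struct.unpack(">H", blob[4:6])
    wrapped = int.from_bytes(blob[6:6 + key_len], "big")
    sym_key = pow(wrapped, D, N).to_bytes(32, "big")
    nonce = blob[6 + key_len:6 + key_len + 16]
    return keystream_xor(sym_key, nonce, blob[6 + key_len + 16:])


document = secrets.token_bytes(100_000)  # stand-in for the 100,000-byte file
blob = hybrid_encrypt(document)
recovered = hybrid_decrypt(blob)
```

The slow public-key operation touches only the 32-byte key, while the bulk of the file goes through the fast symmetric step. The explicit length field plays the role of the optional delimiter.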

Note on using asymmetric encryption directly

It is also possible to use, for instance, RSA directly on your $100,000$ byte document. If your RSA key is $1240$ bits long, you can split the document into $800$ blocks of $1000$ bits each (or, e.g., $650$ blocks of $\approx 1231$ bits, whatever suits you), apply RSA directly to each block, and be done with it. The problem is that this is extremely slow: the number of bits processed per second is orders of magnitude lower for asymmetric encryption than for symmetric encryption. On top of that, RSA shouldn't be used in its textbook variant, and RSA-OAEP increases the ciphertext length, because the padding leaves less room for data in each block.
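The block counts above, and the size cost of padding, can be checked with a little arithmetic. The sketch below assumes each ciphertext block occupies the full modulus size ($1240$ bits, i.e. $155$ bytes) and, for the OAEP case, assumes SHA-256 as the hash, so that the payload per block is the modulus size in bytes minus $2 \cdot 32 + 2$. The function names are made up for illustration.

```python
DOC_BYTES = 100_000
KEY_BITS = 1240                      # key size used in the text
OUT_BYTES = KEY_BITS // 8            # one ciphertext block: 155 bytes


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def textbook_cost(block_bits: int) -> tuple:
    """Return (number of blocks, total ciphertext bytes) for a block size."""
    blocks = ceil_div(DOC_BYTES * 8, block_bits)
    return blocks, blocks * OUT_BYTES


def oaep_cost(hash_bytes: int = 32) -> tuple:
    """Same, with OAEP padding. ASSUMPTION: SHA-256 (32-byte hash)."""
    payload = OUT_BYTES - 2 * hash_bytes - 2      # usable bytes per block
    blocks = ceil_div(DOC_BYTES, payload)
    return payload, blocks, blocks * OUT_BYTES


print(textbook_cost(1000))   # (800, 124000)
print(textbook_cost(1231))   # (650, 100750)
print(oaep_cost())           # (89, 1124, 174220)
```

Under these assumptions the padded variant needs over a thousand private-key operations to decrypt and grows the file by roughly three quarters, which is the length increase mentioned above.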

tylo

Well, exponentiating a Word document is rather easy. As you've said, it's $100,000$ bytes, or $800,000$ bits. This Word document can thus be interpreted as a number between $0$ and $2^{800,000}$. Sure, this number may be large, but it can be exponentiated (although for RSA the result is reduced modulo the key's modulus, so the modulus would have to be larger than this number).

However, more commonly symmetric encryption with a $128$ or $256$ bit key is used to encrypt the Word document, and then public-key encryption is used on that key, treated as a number between $0$ and $2^{256}$. This approach is much faster.
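To make the "a document is just a number" point concrete, here is a minimal sketch. The random bytes stand in for the file contents, and the exponent $3$ is an arbitrary choice for illustration, not something any scheme prescribes.

```python
import secrets

document = secrets.token_bytes(100_000)        # stand-in for the file's raw bytes
as_number = int.from_bytes(document, "big")    # the same bytes read as one integer

# 100,000 bytes are 800,000 bits, so the number lies below 2**800,000.
assert 0 <= as_number < 2**800_000

cubed = as_number ** 3                         # plain exponentiation, no modulus

# Converting back to bytes recovers the original document exactly.
restored = as_number.to_bytes(len(document), "big")
```

The fixed length passed to `to_bytes` matters: leading zero bytes of the file are not visible in the integer, so the length has to be stored separately.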

orlp

This post covers RSA; other algorithms will likely have similar considerations, but the details may be different.

For textbook RSA the "message" is a number. Due to the use of modular arithmetic this number must be smaller than the modulus.

We can obviously take a sufficiently short sequence of bytes and encode it as a number. However, textbook RSA has a number of security problems. The most obvious is that, since it is deterministic, an attacker can guess the message and check whether the guess was correct, but there are also some more subtle cases.
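The guessing attack takes only a few lines to demonstrate. This sketch assumes a toy modulus built from two publicly known Mersenne primes and a made-up list of candidate messages; the attacker uses nothing but the public key.

```python
# ASSUMPTION: toy public key (insecure), modulus from two known Mersenne primes.
N = (2**89 - 1) * (2**107 - 1)
E = 65537


def textbook_encrypt(message: bytes) -> int:
    """Textbook RSA: encode the bytes as a number and compute m^e mod n."""
    m = int.from_bytes(message, "big")
    if m >= N:
        raise ValueError("message too long for this modulus")
    return pow(m, E, N)


# The sender encrypts one of a few plausible messages.
intercepted = textbook_encrypt(b"attack at dawn")

# The attacker encrypts each guess with the public key and compares.
guesses = [b"retreat at noon", b"attack at dawn", b"hold position"]
matches = [g for g in guesses if textbook_encrypt(g) == intercepted]
print(matches)   # [b'attack at dawn']
```

Because the same plaintext always yields the same ciphertext, no private key is needed to confirm a guess.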

To avoid these issues, a randomised padding scheme is used. There are several such schemes, but I believe OAEP is considered the modern standard. OAEP combines the message with a random number and then subjects the combination to an all-or-nothing transform.

That doesn't work for large messages, though: a typical RSA key has a 1024 to 4096 bit modulus. Add some overhead for padding, and even with a 4096-bit key (a 512-byte modulus) you are talking about a message length limit of somewhat under 500 bytes.
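For OAEP specifically, the limit follows from the PKCS #1 rule that a message may be at most $k - 2h - 2$ bytes, where $k$ is the modulus length in bytes and $h$ the hash output length in bytes. A small sketch, with a hypothetical function name:

```python
def oaep_max_message_bytes(modulus_bits: int, hash_bytes: int) -> int:
    """Largest message RSA-OAEP can carry: k - 2*hLen - 2 bytes."""
    k = (modulus_bits + 7) // 8          # modulus length in bytes
    return k - 2 * hash_bytes - 2


for bits in (1024, 2048, 4096):
    sha1_limit = oaep_max_message_bytes(bits, 20)     # SHA-1: 20-byte hash
    sha256_limit = oaep_max_message_bytes(bits, 32)   # SHA-256: 32-byte hash
    print(bits, sha1_limit, sha256_limit)
# 1024 86 62
# 2048 214 190
# 4096 470 446
```

Even the largest common key size leaves room for only a few hundred bytes, which is plenty for a symmetric key but nowhere near a whole document.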

You could split a large message into chunks, encrypt them separately, and then chain together the results, but this would be space-inefficient because each chunk carries its own randomised padding. It would also be slow.

So in practice we encrypt the message with a symmetric encryption algorithm and a randomly generated key. We then apply the random padding to the key and encrypt the padded key using RSA with the public key.

Peter Green

"One thing cryptography articles never seem to explain is how the message actually gets encrypted."

Either you're reading really bad articles or you're missing a step. Try reading better articles and follow along very carefully.

"Then, they never explain the process of exactly how some 100,000-byte Word document gets encrypted using the public key."

You break up the document into blocks of a particular byte size. Treat each block as a series of bits. That's a number. Feed the number into the crypto algorithm, get a number back out. That's also a series of bits. Now you have turned a series of plaintext blocks into a series of encrypted blocks.
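As a rough illustration of that block-by-block process, here is a sketch using textbook RSA with no padding. The key is a toy built from two publicly known Mersenne primes, and the names (`IN_BLOCK`, `encrypt_blocks`, and so on) are hypothetical, not BouncyCastle's API. Input blocks are one byte shorter than the modulus so that every block, read as a number, is guaranteed to be smaller than the modulus.

```python
import secrets

# ASSUMPTION: toy RSA key from two publicly known Mersenne primes (insecure).
P, Q = 2**521 - 1, 2**607 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent (Python 3.8+)

IN_BLOCK = (N.bit_length() - 1) // 8     # plaintext bytes per block, number < N
OUT_BLOCK = (N.bit_length() + 7) // 8    # ciphertext bytes per block


def encrypt_blocks(data: bytes) -> bytes:
    out = bytearray()
    for i in range(0, len(data), IN_BLOCK):
        m = int.from_bytes(data[i:i + IN_BLOCK], "big")   # block -> number
        out += pow(m, E, N).to_bytes(OUT_BLOCK, "big")    # number -> block
    return bytes(out)


def decrypt_blocks(data: bytes, original_length: int) -> bytes:
    out = bytearray()
    for i in range(0, len(data), OUT_BLOCK):
        c = int.from_bytes(data[i:i + OUT_BLOCK], "big")
        # The last block may be shorter than IN_BLOCK.
        size = min(IN_BLOCK, original_length - len(out))
        out += pow(c, D, N).to_bytes(size, "big")
    return bytes(out)


plaintext = secrets.token_bytes(1000)
ciphertext = encrypt_blocks(plaintext)
roundtrip = decrypt_blocks(ciphertext, len(plaintext))
```

Note that the ciphertext is longer than the plaintext, since each output block has to be able to hold any number below the modulus.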

"Other times the explanations seem downright misleading. For example, my book Cryptography (by Meyer and Matyas) says that for RSA the ciphertext is the plaintext to the power of the key. How do you exponentiate a Word document?"

If you really want to see how it works in practice then I encourage you to read the source code. For example, download the source code for BouncyCastle and take a look at the GetInputBlockSize, GetOutputBlockSize, ConvertInput, ConvertOutput and ProcessBlock methods in the RSACoreEngine.cs file. This is only 150 lines of pretty easy-to-follow code but it answers all your questions.

As others have mentioned, it is often impractical to use public-key crypto on large documents because the math is slow. Usually what you do is choose a symmetric cryptosystem, choose a random key in that cryptosystem, encrypt the document in that cryptosystem, and then encrypt the symmetric key using the public key. Now the document cannot be decrypted without the symmetric key, and the symmetric key cannot be decrypted without the private key.

Eric Lippert