
I am familiar with the following method for a chosen-plaintext injection attack on ECB ciphers, where I am allowed to append a block of bytes to the packet being encrypted:

I inject a string of known bytes that is one byte shorter than the block size and work out the first byte of the unknown data by brute-forcing all 256 possibilities. I then carry this information over to extract the next byte.

I can understand why this could be used to extract a blocksize worth of the original plaintext. However, how do I extend this method to get bytes beyond the blocksize?

2 Answers


Suppose we have a block cipher that takes a 16 byte plaintext and produces a 16 byte ciphertext (that is to say $\mathcal{Enc}: \{0,1\}^{128} \rightarrow \{0, 1\}^{128}$). We use this block cipher to encrypt two blocks worth of unknown data, call them $m1$ and $m2$. Additionally we are allowed to prepend some data to these two blocks, let's call it $m0$ (we control this data).

Note that nothing in this scheme prevents us from choosing an $m0$ that is exactly 16 bytes long. If ECB mode is being used, the first block returned in that case is $\mathcal{Enc}(m0)$, so we effectively have an encryption oracle for arbitrary single blocks of data, which will come in handy.
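To make the "encryption oracle for a full block" observation concrete, here is a minimal sketch. Everything named in it (`KEY`, `SECRET`, `enc_block`, `oracle`, `encrypt_one_block`) is hypothetical, and a truncated HMAC stands in for $\mathcal{Enc}$. That stand-in is not a real block cipher since it cannot be inverted, but it is keyed and deterministic, which is all this attack relies on. PKCS#7-style padding is also an assumption.

```python
import hashlib
import hmac

BLOCK = 16
KEY = b"hypothetical key"                    # unknown to the attacker
SECRET = b"we attack at dawn, bring snacks"  # m1 || m2, unknown to the attacker

def enc_block(block: bytes) -> bytes:
    # Stand-in for Enc: keyed, deterministic, 16 bytes in and 16 bytes out.
    # Truncated HMAC is not invertible, so this is NOT a real block cipher.
    return hmac.new(KEY, block, hashlib.sha256).digest()[:BLOCK]

def oracle(m0: bytes) -> bytes:
    """ECB-encrypt m0 || SECRET one block at a time (padding is an assumption)."""
    data = m0 + SECRET
    pad = BLOCK - len(data) % BLOCK
    data += bytes([pad]) * pad
    return b"".join(enc_block(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK))

def encrypt_one_block(block: bytes) -> bytes:
    """Use the prefix oracle as an encryption oracle for one chosen block."""
    if len(block) != BLOCK:
        raise ValueError("need exactly one block")
    # With a 16-byte m0, the first ciphertext block is Enc(m0).
    return oracle(block)[:BLOCK]
```

Calling `encrypt_one_block(b"A" * 16)` returns the same bytes as encrypting that block directly under the key, even though the caller never sees `KEY`.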

You correctly state that we can set $m0$ equal to 15 known bytes, and if we have an encryption oracle we can brute force the last byte:

 Block 1          Block 2  Block 3
|AAAAAAAAAAAAAAA?|?......?|?......?|
 |----known----||--m1---|

We just have to send all 256 possible guesses for Block 1 to the encryption oracle and see which one matches the output. Let's say we get a match on the byte encoding "w". We then repeat the process with a one byte shorter $m0$ to get the next byte in the same fashion:

 Block 1          Block 2  Block 3
|AAAAAAAAAAAAAAw?|?......?|?......?|
 |----known----|
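The two guessing rounds pictured above might look like this in code. This is only a sketch: the oracle is a toy with a hypothetical key and secret, a truncated HMAC stands in for the block cipher, and `recover_byte` is a name made up for illustration.

```python
import hashlib
import hmac

BLOCK = 16
KEY = b"hypothetical key"                    # unknown to the attacker
SECRET = b"we attack at dawn, bring snacks"  # unknown to the attacker

def enc_block(block: bytes) -> bytes:
    # Stand-in for the block cipher (assumption): deterministic and keyed.
    return hmac.new(KEY, block, hashlib.sha256).digest()[:BLOCK]

def oracle(m0: bytes) -> bytes:
    """ECB-encrypt m0 || SECRET (PKCS#7-style padding assumed)."""
    data = m0 + SECRET
    pad = BLOCK - len(data) % BLOCK
    data += bytes([pad]) * pad
    return b"".join(enc_block(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK))

def recover_byte(oracle, known: bytes) -> int:
    """Recover the next byte of m1, given the bytes of m1 found so far."""
    # Shorten m0 so that exactly one unknown byte lands at the end of Block 1.
    pad = b"A" * (BLOCK - 1 - len(known))
    target = oracle(pad)[:BLOCK]
    # Try all 256 candidates for that last byte and compare Block 1.
    for guess in range(256):
        if oracle(pad + known + bytes([guess]))[:BLOCK] == target:
            return guess
    raise ValueError("no candidate matched; is the mode really ECB?")

first = recover_byte(oracle, b"")               # m0 is 15 bytes long
second = recover_byte(oracle, bytes([first]))   # m0 is 14 bytes long
```

Each call costs at most 257 oracle queries: one for the target block and up to 256 guesses.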

We can repeat this process for each byte until we have the whole first block $m1$, which let's say is "we attack at daw". Unfortunately at this point we can't reduce $m0$ by any more bytes since $m0$ would be 0 bytes and we would simply get:

 M1               M2
|we attack at daw|?......?|
 |----known-----|

But since we now know all of $m1$, we can use the same sort of attack we used to recover the first byte of $m1$ to recover the first byte of $m2$. Suppose we again choose $m0$ to be 15 bytes long:

 Block 1          Block 2          Block 3
|AAAAAAAAAAAAAAAw|e attack at daw?|?......?|
 |------------known-------------|

There's only one unknown byte in Block 2 so all we have to do is again submit all 256 guesses to the encryption oracle, except this time for Block 2 instead of Block 1! This process can be repeated to decrypt an arbitrary amount of ciphertext that is ECB encrypted as long as we can prepend data to the plaintext and have access to an encryption oracle.
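Putting the pieces together, a sketch of the whole byte-at-a-time loop could look like the following. The assumptions are the same as before: a toy oracle, a hypothetical key and secret, a truncated HMAC in place of the cipher, and a suffix length that is taken as known. One small liberty: the guesses are submitted as a full first block (using the full-block oracle trick from earlier) rather than being placed in Block 2, which compares the same ciphertext block either way.

```python
import hashlib
import hmac

BLOCK = 16
KEY = b"hypothetical key"                    # unknown to the attacker
SECRET = b"we attack at dawn, bring snacks"  # unknown to the attacker

def enc_block(block: bytes) -> bytes:
    # Stand-in for the block cipher (assumption): deterministic and keyed.
    return hmac.new(KEY, block, hashlib.sha256).digest()[:BLOCK]

def oracle(m0: bytes) -> bytes:
    """ECB-encrypt m0 || SECRET (PKCS#7-style padding assumed)."""
    data = m0 + SECRET
    pad = BLOCK - len(data) % BLOCK
    data += bytes([pad]) * pad
    return b"".join(enc_block(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK))

def recover_suffix(oracle, length: int) -> bytes:
    """Recover `length` unknown bytes, crossing block boundaries as needed."""
    known = b""
    for i in range(length):
        # Pad so that unknown byte i is the last byte of block number i // 16.
        pad = b"A" * (BLOCK - 1 - i % BLOCK)
        blk = i // BLOCK
        target = oracle(pad)[blk * BLOCK:(blk + 1) * BLOCK]
        # The 15 bytes preceding byte i are all known by now.
        context = (pad + known)[-(BLOCK - 1):]
        for guess in range(256):
            if oracle(context + bytes([guess]))[:BLOCK] == target:
                known += bytes([guess])
                break
        else:
            raise ValueError(f"no match for byte {i}")
    return known

recovered = recover_suffix(oracle, len(SECRET))
```

The only change from the single-block case is the `blk` index: once `i` reaches 16, the padding length wraps back to 15 and the comparison moves to the next ciphertext block.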

puzzlepalace

One way to carry out the chosen-prefix ECB mode attack you describe is as follows:

Step 1: Obtain the ciphertexts corresponding to the following 16 plaintexts, where each X denotes a known byte (typically, part of our chosen prefix), each ? denotes a byte of the (fixed) unknown suffix that we wish to decrypt, and spaces show the boundaries of the 16-byte AES cipher blocks:

XXXXXXXXXXXXXXX? ???????????????? ???????????????? ???????????????...
XXXXXXXXXXXXXX?? ???????????????? ???????????????? ??????????????...
XXXXXXXXXXXXX??? ???????????????? ???????????????? ?????????????...
XXXXXXXXXXXX???? ???????????????? ???????????????? ????????????...
XXXXXXXXXXX????? ???????????????? ???????????????? ???????????...
XXXXXXXXXX?????? ???????????????? ???????????????? ??????????...
XXXXXXXXX??????? ???????????????? ???????????????? ?????????...
XXXXXXXX???????? ???????????????? ???????????????? ????????...
XXXXXXX????????? ???????????????? ???????????????? ???????...
XXXXXX?????????? ???????????????? ???????????????? ??????...
XXXXX??????????? ???????????????? ???????????????? ?????...
XXXX???????????? ???????????????? ???????????????? ????...
XXX????????????? ???????????????? ???????????????? ???...
XX?????????????? ???????????????? ???????????????? ??...
X??????????????? ???????????????? ???????????????? ?...
???????????????? ???????????????? ???????????????? ...

Note that, for this step, it doesn't matter what the known prefix bytes denoted by X actually are. The prefixes can all be different, and they don't even have to be chosen by us, as long as all of them are known to us and have distinct lengths (modulo 16).

Step 2: From the ciphertexts obtained in step 1, find a block for which we already know all but one of the plaintext bytes (e.g. XXXXXXXXXXXXXXX?). Obtain the encryptions of all 256 possible values of that plaintext block, i.e. with the known bytes kept fixed and the missing byte set to each of its 256 possible values, and check which of the resulting ciphertext blocks matches the ciphertext obtained in step 1. (Note that you can do this with a single 256-block chosen-plaintext query to the encryption oracle.) This will tell us the value of the missing byte.
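As a rough illustration of the single 256-block query, the sketch below concatenates all 256 candidate blocks into one chosen prefix and reads the answer out of a lookup table. The oracle, key, secret, and the helper name `guess_table` are all hypothetical, and a truncated HMAC stands in for AES.

```python
import hashlib
import hmac

BLOCK = 16
KEY = b"hypothetical key"                    # unknown to the attacker
SECRET = b"we attack at dawn, bring snacks"  # the unknown suffix

def enc_block(block: bytes) -> bytes:
    # Stand-in for AES (assumption): deterministic and keyed.
    return hmac.new(KEY, block, hashlib.sha256).digest()[:BLOCK]

def oracle(prefix: bytes) -> bytes:
    """ECB-encrypt prefix || SECRET (PKCS#7-style padding assumed)."""
    data = prefix + SECRET
    pad = BLOCK - len(data) % BLOCK
    data += bytes([pad]) * pad
    return b"".join(enc_block(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK))

def guess_table(oracle, known15: bytes) -> dict:
    """Map ciphertext block -> guessed byte, using ONE oracle query."""
    # 256 candidate blocks back to back: known15 + 0x00, known15 + 0x01, ...
    query = b"".join(known15 + bytes([g]) for g in range(256))
    ct = oracle(query)
    return {ct[g * BLOCK:(g + 1) * BLOCK]: g for g in range(256)}

# Step 1 ciphertext for the plaintext block XXXXXXXXXXXXXXX?
step1_block = oracle(b"X" * 15)[:BLOCK]
# Step 2: one query, then a dictionary lookup.
first_byte = guess_table(oracle, b"X" * 15)[step1_block]
```

Because the prefix is a whole number of blocks long, the suffix starts on a block boundary and does not disturb the first 256 ciphertext blocks.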

Now that we've determined the value of the missing byte (say, a), it becomes part of the known prefix. Thus, we now know that the ciphertexts from step 1 actually correspond to the following partially known plaintexts:

XXXXXXXXXXXXXXXa ???????????????? ???????????????? ???????????????...
XXXXXXXXXXXXXXa? ???????????????? ???????????????? ??????????????...
XXXXXXXXXXXXXa?? ???????????????? ???????????????? ?????????????...
XXXXXXXXXXXXa??? ???????????????? ???????????????? ????????????...
XXXXXXXXXXXa???? ???????????????? ???????????????? ???????????...
XXXXXXXXXXa????? ???????????????? ???????????????? ??????????...
XXXXXXXXXa?????? ???????????????? ???????????????? ?????????...
XXXXXXXXa??????? ???????????????? ???????????????? ????????...
XXXXXXXa???????? ???????????????? ???????????????? ???????...
XXXXXXa????????? ???????????????? ???????????????? ??????...
XXXXXa?????????? ???????????????? ???????????????? ?????...
XXXXa??????????? ???????????????? ???????????????? ????...
XXXa???????????? ???????????????? ???????????????? ???...
XXa????????????? ???????????????? ???????????????? ??...
Xa?????????????? ???????????????? ???????????????? ?...
a??????????????? ???????????????? ???????????????? ...

Now, one of these plaintext blocks (e.g. XXXXXXXXXXXXXXa?) again contains only one unknown byte. Thus, we can repeat step 2 above, following the same procedure as before to determine the value of the next unknown byte (let's say it's b), which again becomes a part of the known prefix. Then we can again repeat step 2, this time with the partially known plaintext block XXXXXXXXXXXXXab?, and thus obtain the third byte of the suffix, and so on.

In particular, after repeating step 2 sixteen times, we will have determined the first 16 bytes of the suffix (say, abcdefghijklmnop). Thus, we now know that the ciphertexts obtained in step 1 correspond to the following partially known plaintexts:

XXXXXXXXXXXXXXXa bcdefghijklmnop? ???????????????? ???????????????...
XXXXXXXXXXXXXXab cdefghijklmnop?? ???????????????? ??????????????...
XXXXXXXXXXXXXabc defghijklmnop??? ???????????????? ?????????????...
XXXXXXXXXXXXabcd efghijklmnop???? ???????????????? ????????????...
XXXXXXXXXXXabcde fghijklmnop????? ???????????????? ???????????...
XXXXXXXXXXabcdef ghijklmnop?????? ???????????????? ??????????...
XXXXXXXXXabcdefg hijklmnop??????? ???????????????? ?????????...
XXXXXXXXabcdefgh ijklmnop???????? ???????????????? ????????...
XXXXXXXabcdefghi jklmnop????????? ???????????????? ???????...
XXXXXXabcdefghij klmnop?????????? ???????????????? ??????...
XXXXXabcdefghijk lmnop??????????? ???????????????? ?????...
XXXXabcdefghijkl mnop???????????? ???????????????? ????...
XXXabcdefghijklm nop????????????? ???????????????? ???...
XXabcdefghijklmn op?????????????? ???????????????? ??...
Xabcdefghijklmno p??????????????? ???????????????? ?...
abcdefghijklmnop ???????????????? ???????????????? ...

Note that now we again have a ciphertext block corresponding to a plaintext block with just one unknown byte: this time, it's bcdefghijklmnop?, consisting of the now known bytes 2 to 16 of the suffix, and the yet unknown 17th byte. Thus, we can again repeat step 2 above to find out the value of the 17th byte, and we can keep repeating the same process over and over until we've finished decrypting the entire suffix.
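The full two-step procedure can be sketched as below: 16 queries for step 1, then one 256-block query per recovered byte. As before, this is a toy setup under stated assumptions. The key, secret, and function names are hypothetical, a truncated HMAC replaces AES, and the suffix length is assumed to be known.

```python
import hashlib
import hmac

BLOCK = 16
KEY = b"hypothetical key"                    # unknown to the attacker
SECRET = b"we attack at dawn, bring snacks"  # the unknown suffix

def enc_block(block: bytes) -> bytes:
    # Stand-in for AES (assumption): deterministic and keyed.
    return hmac.new(KEY, block, hashlib.sha256).digest()[:BLOCK]

def oracle(prefix: bytes) -> bytes:
    """ECB-encrypt prefix || SECRET (PKCS#7-style padding assumed)."""
    data = prefix + SECRET
    pad = BLOCK - len(data) % BLOCK
    data += bytes([pad]) * pad
    return b"".join(enc_block(data[i:i + BLOCK]) for i in range(0, len(data), BLOCK))

def attack(oracle, length: int) -> bytes:
    # Step 1: one ciphertext per prefix length 0..15 (distinct modulo 16).
    step1 = {p: oracle(b"X" * p) for p in range(BLOCK)}
    known = b""
    for i in range(length):
        # Pick the step 1 ciphertext in which byte i ends a block.
        p = BLOCK - 1 - i % BLOCK
        blk = i // BLOCK
        target = step1[p][blk * BLOCK:(blk + 1) * BLOCK]
        # Step 2: the 15 known bytes before byte i, plus all 256 guesses,
        # sent as a single 256-block chosen-plaintext query.
        context = (b"X" * p + known)[-(BLOCK - 1):]
        ct = oracle(b"".join(context + bytes([g]) for g in range(256)))
        table = {ct[g * BLOCK:(g + 1) * BLOCK]: g for g in range(256)}
        known += bytes([table[target]])
    return known

plaintext = attack(oracle, len(SECRET))
```

In this sketch the oracle is called 16 times in step 1 and once per suffix byte after that, so recovering an n-byte suffix takes 16 + n queries in total.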

Ilmari Karonen