
This question came up as I tried to answer an earlier question I asked: Cryptographic data structure: sparse array without membership test. I still have not resolved that question to my satisfaction, but this question may be appreciated independently.

I want an error-correcting code with the additional property that when encoding a string that cannot itself be feasibly distinguished from a random string, the encoding cannot be feasibly distinguished from a random string either. So, for example, Alice could generate a random 1000-bit string, encode it using the algorithm to get a 2000-bit string, and send it over a noisy channel to Bob. Bob could recover Alice's original 1000-bit string (even if many bits were corrupted) but would have no way of knowing that Alice did not just send a random 2000-bit string.

I realize that "deniable" may not be the best word to use for this property since this word already has a different technical use in cryptography. But I am not sure what to call the property.

Also, are there any tags I should be using besides "algorithm-design"? For both my previous question and this one, I felt that there was no completely appropriate tag.

gmr

4 Answers

Here's an approach that corrects some errors (a runnable sketch follows the list):

  • Alice has a bitstring S that is indistinguishable from random that she wants Trent to know. (Perhaps it's actually a ciphertext that only Bob knows how to decrypt).
  • Alice generates 5 fresh random bit-strings pM, pN, pP, pQ, and pR, each just as long as the original bit-string S.
  • For each bit of S, Alice checks the corresponding 5 bits of the random strings pM, pN, pP, pQ, and pR. If the majority of those 5 bits matches that bit of S, Alice keeps them unchanged; otherwise Alice flips all 5 bits. This yields 5 modified strings M, N, P, Q, and R.
  • Alice sends M, N, P, Q, and R to Trent.
  • For each bit of the messages M, N, P, Q, and R, Trent uses 5-modular redundancy -- the majority of the 5 bits -- to attempt to recover the string S.
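
A minimal Python sketch of these steps (the function names and the bit-list representation are mine, not part of the scheme's description):

```python
import secrets

def encode(s_bits):
    """Encode S into 5 strings M, N, P, Q, R whose per-position
    majority always equals the corresponding bit of S."""
    streams = [[secrets.randbelow(2) for _ in s_bits] for _ in range(5)]
    for i, s in enumerate(s_bits):
        majority = int(sum(st[i] for st in streams) >= 3)
        if majority != s:
            for st in streams:
                st[i] ^= 1  # flipping all 5 bits flips the majority
    return streams

def decode(streams):
    """Recover S by majority vote at each position (5-modular redundancy)."""
    return [int(sum(st[i] for st in streams) >= 3)
            for i in range(len(streams[0]))]

s = [secrets.randbelow(2) for _ in range(1000)]
assert decode(encode(s)) == s
```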

Because pM, pN, pP, pQ, and pR are completely randomly generated, and S is indistinguishable from random, Trent is still unable to distinguish M, N, P, Q, and R from random bit-strings.

For any one bit-position, the 5 transmitted bits can be in any one of 32 possible patterns with equal probability. If there is no bit error at some particular position, Trent always recovers the correct bit of S there. Trent can usually, but not always, correct a single-bit-flip error across a row of 5 bits: given any 1 of the 5 bit positions fixed ahead of time to suffer a single-bit-flip error, 20 of the 32 possible patterns result in Trent recovering the correct bit of S (probability 5/8), but the remaining 12 patterns result in Trent decoding an erroneous bit.
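
The 20-of-32 count is easy to verify by brute force, enumerating every equally likely transmitted pattern:

```python
from itertools import product

flip_pos = 0  # the position known ahead of time to suffer the bit flip
good = 0
for pattern in product((0, 1), repeat=5):
    s = int(sum(pattern) >= 3)   # after encoding, the majority equals S's bit
    received = list(pattern)
    received[flip_pos] ^= 1      # single-bit-flip error
    good += int(sum(received) >= 3) == s
print(good, "of 32 patterns still decode correctly")  # prints 20
```

(By symmetry the result is the same for any choice of flip_pos.)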

I suspect there is a better encoding that allows Trent to always correct any single-bit-flip error. Perhaps one could somehow forbid the 12 vulnerable patterns, even though that would let statistical tests conclude that the collection of strings M, N, P, Q, and R is not completely random?

David Cary

Take a look at the Fujisaki-Okamoto CCA2-security conversion. This is what you need.

In short, it is a conversion that makes the ciphertext of the McEliece encryption scheme (which is built from error-correcting codes) indistinguishable from random.
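
To give a rough feel for its shape, here is a sketch of the hybrid variant of the conversion in Python. The functions pke_encrypt/pke_decrypt stand in for the underlying scheme (e.g. McEliece with explicit encryption coins), and instantiating the random oracles G and H with SHAKE-256 is my own placeholder choice; this is an illustration, not a faithful rendering of the paper:

```python
import hashlib

def G(data, n):  # "random oracle" used as a one-time-pad key stream
    return hashlib.shake_256(b"G" + data).digest(n)

def H(data, n):  # "random oracle" used to derive encryption coins
    return hashlib.shake_256(b"H" + data).digest(n)

def fo_encrypt(pk, m, sigma, pke_encrypt):
    # sigma is fresh randomness; the PKE's coins are derived from (sigma, m)
    c1 = pke_encrypt(pk, sigma, H(sigma + m, 32))
    c2 = bytes(a ^ b for a, b in zip(G(sigma, len(m)), m))
    return c1, c2

def fo_decrypt(sk, pk, c1, c2, pke_encrypt, pke_decrypt):
    sigma = pke_decrypt(sk, c1)
    m = bytes(a ^ b for a, b in zip(G(sigma, len(c2)), c2))
    # Re-encrypt and compare: this consistency check is what buys CCA2 security
    if pke_encrypt(pk, sigma, H(sigma + m, 32)) != c1:
        raise ValueError("invalid ciphertext")
    return m
```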

mczraf

In general, an error-correcting code $C$ is simply a collection of codewords of a given length $n$ that meets some desired minimum (Hamming) distance property. Any tell-tale structure that is present is not intrinsic to the error-correcting properties; it is there to ease encoding and decoding and to allow you to prove that the desired properties hold.

Since $C$ can be arbitrarily chosen, there is no way to tell whether a single word is a member. But if Trent sees a large number of messages (and let's assume he sees them before the noisy channel), he can check whether there is anything unusual about their distribution, namely whether their minimum pairwise distance is larger than expected for that number of messages. The larger the minimum distance between codewords, the more quickly he can conclude (with statistical certainty) that such a scheme is in use.
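
Concretely, the test Trent would run might look like this (a sketch; representing messages as equal-length bit tuples is my own choice):

```python
from itertools import combinations

def min_pairwise_distance(messages):
    """Minimum Hamming distance over all pairs of observed messages."""
    return min(sum(a != b for a, b in zip(x, y))
               for x, y in combinations(messages, 2))

# For random n-bit strings this minimum drops well below n/2 as the number
# of observed messages grows; codewords of a code with minimum distance d
# never get closer than d, which eventually stands out statistically.
```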

This can be fixed by changing codes for each message based upon some pre-arranged scheme. Note that for any code $C$ the code $C_k = \{ c + k | c \in C \}$ is an error correcting code with the same power for any $k$. You can probably use a construct analogous to CBC mode to make this secure.
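
A sketch of the translated-code idea, where each message is masked with a fresh offset $k$; the keystream function named here is a hypothetical placeholder for whatever pre-arranged pseudorandom source Alice and Bob share:

```python
def translate(codeword, k):
    """Map c to c XOR k: the code C_k = {c + k : c in C} has exactly
    the same distance (hence error-correcting) properties as C."""
    return bytes(c ^ x for c, x in zip(codeword, k))

# If k = keystream(shared_key, message_index) is fresh per message, Trent
# never sees two codewords drawn from the same translated code C_k, so the
# pairwise-distance test above no longer applies.
```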

Part of your question seems to assume that Trent knows $C$ but only sees the data after it has passed over the noisy channel. His ability to detect whether the error-correcting code was used will then depend on the channel. At one extreme, if the channel never corrupts data, he can always tell whether the sent message is in $C$. In general, this will depend on the balance between $d$, the error-correcting ability of the code, and the noise properties of the channel: Alice wants $d$ large enough to correct the expected errors, but the larger it is, the more distinguishing power Trent has.

bmm6o

I'm assuming your deniability/indistinguishability definition requires a random piece of data and an error-correction-encoded piece of data to look the same to a decoder, because that seems to be a requirement in the application you linked. In that case, any error-correcting code that can fix all one-bit errors is necessarily distinguishable from random data. Sketch of proof:

Suppose you have an indistinguishable ECC capable of correcting all one-bit errors. Encode some message with it. That codeword and all strings that differ from it by one bit must decode to the same message. However, the same cannot be true of all those modified strings in turn, or else the ECC would be capable of correcting all two-bit errors; apply induction to show that it would then have to correct all n-bit errors, which is impossible in general. So there must exist "corrupted codewords" whose one-bit modifications do not all decode alike, and a random piece of data has a noticeable chance of being one of them. It can therefore be distinguished by checking whether all of its one-bit changes get corrected to the same message or not.
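
To make the distinguisher concrete, here is a toy demonstration against the plain 5-fold repetition code (my own example of a code correcting all one-bit errors per block): genuine codeword blocks decode consistently under every one-bit flip, while a uniformly random 5-bit block passes that test only 12 times out of 32.

```python
from itertools import product

def decode(block):
    return int(sum(block) >= 3)  # majority vote over a 5-bit block

def flips_consistent(block):
    """The test from the sketch: does every one-bit flip of the block
    decode to the same message as the block itself?"""
    ref = decode(block)
    return all(decode(block[:i] + (block[i] ^ 1,) + block[i + 1:]) == ref
               for i in range(len(block)))

assert flips_consistent((0,) * 5) and flips_consistent((1,) * 5)  # codewords
passing = sum(flips_consistent(b) for b in product((0, 1), repeat=5))
print(passing, "of 32 uniformly random blocks pass")  # prints 12
```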

A scheme such as that detailed in David Cary's answer is probably the best you can do.

otus