First, some premises:

  • Sufficient hardware-generated one-time pad key material.
  • No pad reuse.
  • Messages of 160 characters length (think Twitter).
  • Only 28 characters in use (A-Z, space and full stop; think Morse).
  • The vast majority of messages follow English grammatical form.
  • The messages would be typed on a computer and read via computer.

Would it be possible to authenticate a one-time pad message by hand, using say pen and paper? Perhaps even a spreadsheet, but certainly no programming languages. Hence the typical HMAC is out. The character set and message length are not unmanageable with some determination.

But most importantly, the authentication only has to be manually provable to work. The user will not be authenticating every message, just those he chooses to test. That means that the user has to be able to verify to himself that the algorithm works on the messaging system. So he can take an arbitrary message, authenticate it and satisfy himself of its correctness. The purpose is to create confidence, in a manually verifiable way, that the system is working correctly and is not a fraud or the victim of a Man in the Middle attack.

High security is of secondary importance, subservient to the need for provable manual verification of authentication. Whilst a one time pad is totally secure, the authentication algorithm needn't be. Any degree of authentication will be of interest for this question. It would be nice if any proposed solution could have the security level quantified in bits.

I appreciate that there is a spectrum of levels of security for the authentication mechanism. Clearly it ranges from no security (no authentication) to very secure (HMAC calculation). I think that unfortunately we're looking at the lower end of the scale towards the no security end, as the overriding criterion is human computation.

The most basic and brute force approach that occurs is simply sending the same message in multiples. So m ⊕ k1 | m ⊕ k2 | ... and so on. I believe that you'd get 4.8 bits of security for every concatenated message, but I'm probably wrong.

David Cary
Paul Uszak

8 Answers

I'll use $m\boxplus k$ for combining characters $m$ and $k$ by addition (modulo $n=28$) within the character set, or strings of individual characters of equal-length, rather than " m ⊕ k " in the question. I consider alphabetical ordering of $n$ (e.g. 28) characters with letters first, such that $\;\mathtt{'A'}\boxplus\mathtt{'A'}\;=\;\mathtt{'A'}\;$ and $\;\mathtt{'B'}\boxplus\mathtt{'C'}\;=\;\mathtt{'D'}\;$.

The question's method of sending $M\boxplus K_1\;\|\;M\boxplus K_2\;\|\;\dots$ is extremely insecure: replacing each ciphertext character $x$ by $x\boxplus\mathtt{'B'}$ is not detected, but correspondingly alters the message. There is a partial fix: for each character $m$ of the message, append to the message the character $m\boxplus m\boxplus m$, then encipher with OTP; since $\gcd(3,28)=1$, this makes altering a plaintext character precisely as hard (or easy) as guessing it. Even with that improvement, if the message (or a segment thereof) is known, it is possible to change it (or that segment) to anything desired, with certainty that the change goes undetected. That's unsatisfactory.
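The undetected shift against the question's repeated-message scheme can be sketched as follows. This is only an illustration: the alphabet ordering (letters, then space, then full stop), the sample message and the helper names `combine`, `random_pad` and `shift_attack` are assumptions of mine, and Python's `secrets` module stands in for the hardware pad.

```python
import secrets

# Assumed ordering: letters first, then space, then full stop.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ."
N = len(ALPHABET)  # 28


def combine(text, key, sign=1):
    """Add (sign=1) or subtract (sign=-1) key from text, character-wise modulo N."""
    return "".join(
        ALPHABET[(ALPHABET.index(t) + sign * ALPHABET.index(k)) % N]
        for t, k in zip(text, key)
    )


def random_pad(length):
    """Stand-in for hardware-generated pad material."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def shift_attack(ciphertext):
    """Attacker adds 'B' (value 1) to every ciphertext character, knowing no key."""
    return combine(ciphertext, "B" * len(ciphertext))


message = "MEET AT NOON."
k1, k2 = random_pad(len(message)), random_pad(len(message))
c1, c2 = combine(message, k1), combine(message, k2)  # M + K1 and M + K2

# The receiver deciphers both tampered copies and compares them.
d1 = combine(shift_attack(c1), k1, sign=-1)
d2 = combine(shift_attack(c2), k2, sign=-1)
print(d1 == d2, d1 != message)  # the copies still agree, yet the text changed
```

Both deciphered copies agree with each other, so comparing them detects nothing, even though every character of the message has been shifted by one.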

Goals

  1. Integrity check should be based on the plaintext, thus catching decryption errors (accidental or not) introduced by whatever deciphers, and avoiding the need to keep the full ciphertext for integrity check.
  2. It shall remain convenient to decipher first (including while receiving), postponing integrity check.
  3. Same unconditional plaintext confidentiality as OTP in ciphertext-only attacks, and only practically negligible leak under an attack model where the MitM gets to know if an attempted forgery succeeds or not. The receiver does not know the message length in advance. We posit integrity of the random pad, that it is synchronized between sender and receiver at the beginning of each message, and is never reused by sender or receiver.
  4. In order to keep the integrity check feasible by hand, we restrict ourselves to the same $\boxplus$ as used for encryption, public latin square(s), random characters supplied by the random pad (after those used by en/decryption), and a moderate number of character (or integer) variables (perhaps selection among these, and simple decimal arithmetic). Paper and pencil is OK (even if plaintext-leaking material gets written on it), a calculator is not.
  5. The amount of work for integrity check should be minimized, as well as the transmission overhead, and drain of the random pad.
  6. Quantifiable level of security against forgery, including with known plaintext (which makes sense if we do not fully trust whatever deciphers, see 1).

Circular buffer with latin square and addition (2017-06-03b)

We reserve the first $s\ge2$ characters of ciphertext for integrity check, computed from the plaintext and the first $t\ge s$ characters of the random pad (or stream generator); think of $s=4$, $t=10$. The $s$ integrity check characters are computed by the sender, and optionally recomputed and verified by the receiver. All the rest is standard One Time Pad (or stream generator) and we do not further discuss encryption and decryption.

The system uses a single public $n\times n$ latin square; assume for now a random one.

For manual computation, the first $s$ characters of the random pad are written down (with - for space and + for the extra character). For each character of the plaintext, and then for each of the $t-s$ next characters of the random pad:

  • fetch the latin square at the line defined by that character, and the column defined by the last character that was written;
  • combine the result using $\boxplus$ (addition modulo $n$ as used in the OTP) with the first character not struck out (which is also the $s^\text{th}$ from the end), and strike that character out;
  • write the resulting character following the $s-1$ characters not struck out.

The $s$ characters remaining are the check characters.
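The manual procedure above can be mirrored in a short sketch. Everything named here is hypothetical: `placeholder_latin_square` builds a square by shuffling the rows, columns and symbols of the addition table, which is enough to show the mechanics but is not a square chosen for forgery resistance, and the alphabet ordering (letters, space, full stop) is assumed.

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ."  # assumed ordering
N = len(ALPHABET)


def placeholder_latin_square(seed=2017):
    """Hypothetical public square: shuffled rows, columns and symbols of the
    modulo-N addition table. Illustrative only, not vetted for security."""
    rng = random.Random(seed)
    rows, cols, syms = list(range(N)), list(range(N)), list(range(N))
    rng.shuffle(rows)
    rng.shuffle(cols)
    rng.shuffle(syms)
    return [[syms[(rows[i] + cols[j]) % N] for j in range(N)] for i in range(N)]


def check_characters(plaintext, pad, square, s=4, t=10):
    """Compute the s check characters from the plaintext and the first t pad characters."""
    if len(pad) < t or t < s or s < 2:
        raise ValueError("need t >= s >= 2 and at least t pad characters")
    # The s characters not yet struck out, oldest first.
    buf = [ALPHABET.index(c) for c in pad[:s]]
    # Process every plaintext character, then the next t - s pad characters.
    for ch in plaintext + pad[s:t]:
        looked_up = square[ALPHABET.index(ch)][buf[-1]]  # line: ch, column: last written
        new = (looked_up + buf[0]) % N                   # combine with oldest character
        buf = buf[1:] + [new]                            # strike the oldest, write the new
    return "".join(ALPHABET[v] for v in buf)


square = placeholder_latin_square()
print(check_characters("HELLO WORLD.", "QWERTYUIOPAS", square))
```

The list `buf` plays the role of the characters on paper that have not been struck out, so its length stays at $s$ throughout.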

(In)Security:

  • Each step combines the newly added character, leftmost and rightmost check characters, into a new check character, in a manner such that knowing 3 out of these 4 characters the other can be uniquely determined. It follows that the $s$ check characters remain uniformly distributed during the whole computation, for constant plaintext. No information about the plaintext can leak from the check characters, and information leak from acceptance/rejection is limited by the forgery rate.
  • The tight alternation of the latin square and $\boxplus$ combiners quickly and non-linearly diffuses changes; this is critical (forgery is trivial if we replace the latin square with $\boxplus$).
  • There is a forgery with $(n-1)^{-2}$ odds of success ($2\log_2(n-1)\approx9.5$-bit security): alter randomly two consecutive characters, then a third with $n-2$ unaltered characters in-between. Slightly better strategies tailored to the latin square (and known/chosen plaintext if applicable) are possible. For the desired $n=28$ it is possible to objectively rank latin squares for their resistance to such attacks.
  • Security can't be better than $s\log_2(n)$-bit (forgery by random choice of check characters); nor than $(t-s)\log_2(n)$-bit for known plaintext (guess of the $t-s$ characters at end of computation).

That's calling for future work.


Methods in Mark N. Wegman and J. Lawrence Carter's New Hash Functions and Their Use in Authentication and Set Equality, in Journal of Computer and System Sciences 22 (1981) would be great, but I did not find something easy to perform without a computer.

fgrieu

I've recently been interested in adding message integrity and authentication to hand ciphers, as I belong to a group of hobbyist classical cipher puzzle solvers, so discussions in this vein get people thinking.

I put together a wiki page on several approaches to adding message integrity as well as a couple of ideas for adding authentication, including the answer by @fgrieu. Probably the most practical approach without complicating the hand cipher too much is to:

  1. Use the socialist millionaire protocol for authentication in the plaintext.
  2. Calculate message integrity with something like the Damm algorithm.
  3. Encrypt the message.

Notice that this is a MAC-then-Encrypt approach.

The recipient then:

  1. Decrypts the message.
  2. Looks for the authentication secrets.
  3. If the secrets do not exist or are unknown, the message may be fraudulent.
  4. Calculate the checksum.
  5. If the calculated checksum does not match the sent checksum, the message might have been modified.

The goal is to produce an authentication and integrity solution that is at least as difficult to break as the security of the hand cipher. Just remember that Bruce Schneier said it best:

There are two kinds of cryptography in this world: cryptography that will stop your kid sister from reading your files, and cryptography that will stop major governments from reading your files.

This answer is about the former.

David Cary
Aaron Toponce

Create an "alphabet" containing every single possible tweet character exactly once, and construct a tabula recta based on this "alphabet".

Let a plaintext character be a checksum. Calculate the checksum with your tabula recta like this:

Find the first plaintext character in the first row of the table, then:

  • go down until you find the second character,
  • left or right until you find the third,
  • up or down until you find the fourth,
  • left or right until you find the fifth,
  • up or down until you find the sixth, etc.

When you hit the last character, make a 90 degree turn and keep going until you hit the edge of the tabula recta. The character you land on is the checksum.

Insert the checksum at the same position in the message as the position of the first key character in your "alphabet".

If the receiver calculates the checksum of the message (ignoring the character at the position indicated by the first key character, of course) and finds it matches the decrypted checksum, he can be {100-[(1 ÷ alphabet length) × 100]}% sure the message was not modified by an adversary.

To give credit where it is due, I got the idea of this snaking operation on the tabula recta from prgomez.com.

Meler Lawler

You could use a slight variant of Encrypt-last-block CBC-MAC (ECBC-MAC) with a random single-character permutation plus a one-time pad character per MAC tag "digit". ECBC-MAC is easy to compute by hand and you can make MAC tags arbitrarily large — I'll explain how below.

Generating MAC Keys

You and your partner agree to use one-time pads (OTPs) to communicate. Plaintext and ciphertext use the same alphabet $A$. (In your example, $|A|=28$.)

To generate a MAC key:

  1. Decide how many symbols ("digits") the MAC tag should have. Call that number $N$. Each symbol increases confidence but also increases the amount of work you have to do; thus:

    | MAC Size (Symbols) | Probability a One-Character Change Goes Unnoticed (When $|A|=28$) | Probability a Multi-Character Change Goes Unnoticed (When $|A|=28$) |
    | 1 | $1/(|A|-1)$ (3.7037%) | $1/|A|$ (3.5714%) |
    | 2 | $1/(|A|-1)^2$ (0.1372%) | $1/|A|^2$ (0.1276%) |
    | 3 | $1/(|A|-1)^3$ (0.0051%) | $1/|A|^3$ (0.0046%) |
    | ... | ... | ... |
    | $N$ | $1/(|A|-1)^N$ | $1/|A|^N$ |
  2. For each MAC symbol $n$ in $1,...,N$, generate a random permutation of all $A$ alphabet symbols. All permutations must be equally likely — consider a Fisher-Yates shuffle, which you can do by hand, perhaps with the aid of dice or playing cards.

Additionally, for each OTP key you use for encryption, generate $N$ random symbols from $A$, chosen uniformly at random with replacement, which will function like a OTP key.
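If you want to cross-check a hand-made key, the key generation described above could look like the following sketch. The function names are hypothetical, the alphabet ordering (letters, space, full stop) is an assumption, and Python's `secrets` module stands in for dice or playing cards.

```python
import secrets

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ."  # assumed ordering, |A| = 28


def fisher_yates(symbols):
    """Return a uniformly random permutation of symbols (Fisher-Yates shuffle)."""
    items = list(symbols)
    for i in range(len(items) - 1, 0, -1):
        j = secrets.randbelow(i + 1)  # uniform in 0..i inclusive
        items[i], items[j] = items[j], items[i]
    return "".join(items)


def generate_mac_key(tag_length):
    """One independent permutation per MAC tag symbol; reusable across messages."""
    return [fisher_yates(ALPHABET) for _ in range(tag_length)]


def generate_extra_symbols(tag_length):
    """Per-message extra symbols, drawn uniformly with replacement; use once."""
    return "".join(secrets.choice(ALPHABET) for _ in range(tag_length))


mac_key = generate_mac_key(2)
extra = generate_extra_symbols(2)
print(mac_key, extra)
```

Drawing `j` from the range 0 to `i` inclusive is what keeps all permutations equally likely; a common manual mistake is to draw from the whole alphabet at every step.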

Example MAC Key

Example MAC key for your alphabet ($|A|=28$) and MAC length $N=2$:

| MAC Symbol Number | Permutation |
| 1 | QCKJGEYMTBHPVZ DRL.UWIOXSFNA |
| 2 | MFR.QVOETHIUKSBLAC DZWXYJGNP |

Each OTP key will have a group of $N=2$ random symbols. For example, one key might have "TG"; another might have "Y." (the letter Y followed by a full stop).

MAC Key Requirements

These MAC keys share some OTP key requirements — namely:

  1. Generate MAC keys uniformly at random.
  2. Keep MAC keys absolutely secret.
  3. Securely distribute MAC keys.
  4. Destroy each OTP key's additional $N$ symbols after use. If you're verifying a message's MAC tag, you MUST destroy these extra symbols afterwards regardless of whether verification succeeds or fails. (If you reject the message but keep the OTP key and its extra symbols, you could inadvertently give the adversary an oracle as well as permit replay attacks.)

Calculating MACs

MAC Plaintext or Ciphertext?

You can MAC plaintext or ciphertext — there's no difference in security because you'll essentially OTP-encrypt the MAC tag.

  • MACing plaintext will help message recipients detect encryption and decryption errors but force them to decrypt messages first.
  • MACing ciphertext won't catch decryption errors, but recipients can verify messages before decrypting.

Pick one and make sure your recipients know which to use.

Calculating a MAC

Let's assume (for demonstration purposes) that you MAC ciphertext. Encrypt your message using a OTP key; then compute the ciphertext's MAC thus:

  1. Let $N$ be the number of symbols in the MAC tag. For each $n$ in $1,...,N$, do the following:
    1. Set $x \leftarrow 0$.
    2. For each ciphertext symbol $s_i$ (reading the ciphertext left-to-right):
      1. Let $f_n(x)$ be the symbol at zero-based index $x$ in the $n$th permutation. Set $x \leftarrow f_n(x + s_i \mod |A|)$. This assumes that the first symbol in your alphabet has numerical value $0$, the second $1$, and so on.
    3. Set $x \leftarrow x + e_n \mod |A|$, where $e_n$ is the $n$th extra random symbol from the OTP key. (See "Generating MAC Keys" above for more info about these extra OTP symbols.) This effectively encrypts $x$ using a OTP. Note: These aren't the same symbols you used for encrypting the plaintext!
    4. $x$ is now the $n$th symbol of the encrypted message's MAC tag.
  2. Destroy the message's OTP key, including its $N$ extra random symbols you used for authentication — it's a one-time key. But you can keep the MAC key (the permutations).

Message verification uses the same process.

Example

Let's say your ciphertext is AUTXQ, $A$ is the English alphabet plus space and full stop ($|A|=28$), and the MAC key is the example one above ($N=2$). The OTP key you use to encrypt your message has extra symbols TG. Then:

  • For the first symbol in the MAC tag ($n=1$), the permutation is QCKJGEYMTBHPVZ DRL.UWIOXSFNA and the extra random symbol $e_1$ is 'T'.
    1. Set $x \leftarrow 0$.
    2. $s_0 = 0$ (the symbol 'A'). Set $x \leftarrow f_1(x + s_0 \mod 28) = f_1(0) = 16$ (the symbol 'Q').
    3. $s_1 = 20$ (the symbol 'U'). Set $x \leftarrow f_1(x + s_1 \mod 28) = f_1(8) = 19$ (the symbol 'T').
    4. $s_2 = 19$ (the symbol 'T'). Set $x \leftarrow f_1(x + s_2 \mod 28) = f_1(10) = 7$ (the symbol 'H').
    5. $s_3 = 23$ (the symbol 'X'). Set $x \leftarrow f_1(x + s_3 \mod 28) = f_1(2) = 10$ (the symbol 'K').
    6. $s_4 = 16$ (the symbol 'Q'). Set $x \leftarrow f_1(x + s_4 \mod 28) = f_1(26) = 13$ (the symbol 'N').
    7. Finally, set $x \leftarrow x + e_1 \mod 28 = 13 + 19\mod28 = 4$, which is the symbol 'E'. Thus the first symbol of the MAC tag is 'E'.
  • Do a similar thing for the second MAC tag symbol ($n=2$). The permutation is MFR.QVOETHIUKSBLAC DZWXYJGNP and the extra random symbol $e_2$ is 'G'.
    1. Set $x \leftarrow 0$.
    2. $s_0 = 0$ (the symbol 'A'). Set $x \leftarrow f_2(x + s_0 \mod 28) = f_2(0) = 12$ (the symbol 'M').
    3. $s_1 = 20$ (the symbol 'U'). Set $x \leftarrow f_2(x + s_1 \mod 28) = f_2(4) = 16$ (the symbol 'Q').
    4. $s_2 = 19$ (the symbol 'T'). Set $x \leftarrow f_2(x + s_2 \mod 28) = f_2(7) = 4$ (the symbol 'E').
    5. $s_3 = 23$ (the symbol 'X'). Set $x \leftarrow f_2(x + s_3 \mod 28) = f_2(27) = 15$ (the symbol 'P').
    6. $s_4 = 16$ (the symbol 'Q'). Set $x \leftarrow f_2(x + s_4 \mod 28) = f_2(3) = 27$ (the symbol '.').
    7. Finally, set $x \leftarrow x + e_2 \mod 28 = 27 + 6\mod28 = 5$, which is the symbol 'F'. Thus the second symbol of the MAC tag is 'F'.
  • The complete MAC tag is "EF".
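The worked example can be reproduced with a few lines of code. This is a sketch of the steps listed under "Calculating a MAC", with a hypothetical function name `mac_tag`; the alphabet ordering (A = 0 through Z = 25, space = 26, full stop = 27) is the one implied by the example's numbers.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ."  # A=0 ... Z=25, space=26, full stop=27


def mac_tag(ciphertext, permutations, extra_symbols):
    """Compute one tag symbol per permutation, then mask it with an extra pad symbol."""
    size = len(ALPHABET)
    tag = []
    for perm, extra in zip(permutations, extra_symbols):
        x = 0
        for symbol in ciphertext:
            # x <- f_n(x + s_i mod |A|): look up the permutation at that index
            x = ALPHABET.index(perm[(x + ALPHABET.index(symbol)) % size])
        # x <- x + e_n mod |A|: one-time-pad the final value
        x = (x + ALPHABET.index(extra)) % size
        tag.append(ALPHABET[x])
    return "".join(tag)


permutations = [
    "QCKJGEYMTBHPVZ DRL.UWIOXSFNA",
    "MFR.QVOETHIUKSBLAC DZWXYJGNP",
]
print(mac_tag("AUTXQ", permutations, "TG"))  # EF
```

Verification runs the same function on the received ciphertext and compares the result with the received tag.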

Security

Each of the $N$ symbols of a MAC tag is an instance of ECBC-MAC using a random one-symbol permutation as a block cipher and encrypting the last "block" (the final $x$ value) using a OTP.

  1. There are $(|A|\cdot |A|!)^N$ unique MAC keys and $|A|^N$ unique MAC tags.
  2. Encrypting the MAC tag using an OTP provides perfect secrecy for the tag. Adversaries cannot know the tag you calculated prior to encrypting it with the OTP just by looking at it; therefore, they have to rely on structural weaknesses in CBC-MAC to guess MAC tags.
  3. However, several papers (see 1 and 2 for examples) proved that:
    1. if the block cipher used in CBC-MAC is a pseudorandom function, then CBC-MAC is a pseudorandom function; and
    2. the advantage a computationally unbounded adversary has for forging a CBC-MAC that uses a random function as a block cipher over guessing that random function is less than or equal to $3q^{2} m^{2}/2^{l+1}$, where $q$ is the number of MAC-generating oracle queries the adversary makes, $m$ is the message length (in blocks), and $l$ is the number of bits in the MAC tag. (A similar bound applies if the random function is a random permutation, as in our construction.) $m$ is the length of our encrypted message (because our "block cipher" is a one-symbol permutation) and $l$ is very small in our construction, $\log_2(|A|)$, so this seems like a terrible MAC to use. But note that the bound is proportional to $q^2$.
  4. However, because we encrypt our CBC-MAC tag using a OTP that we destroy whether verification succeeds or fails, adversaries cannot make oracle queries: $q = 0$. (Readers might object that eavesdropping on a message provides one oracle query — one example of a message with a valid MAC tag — but OTP-encrypting the MAC tag hides the CBC-MAC value, in effect denying the adversary even one oracle query.) Adversaries can try to forge a valid message, but they get only one chance because the recipient always destroys his/her copy of the $N$ symbols used to encrypt the CBC-MAC tag.
  5. Therefore, adversaries have no advantage over simply guessing MAC keys or MAC tags. MAC keys are larger than MAC tags, so intelligent adversaries would try to guess MAC tags; thus the probability that an adversary can forge a valid message in a man-in-the-middle attack is $1/(|A|-1)^N$ for a single-symbol forgery and $1/|A|^N$ for a multi-symbol forgery.

If any of the assumptions above or MAC key requirements outlined earlier is violated, these guarantees no longer hold.

Also, you cannot use this scheme if multiple recipients share a MAC key. If there are $R>1$ recipients, an adversary could treat $R-1$ of them like oracles (if the recipients act in ways that reveal whether forgeries succeed or fail) and possibly forge a valid message for recipient $R$.

Update: If $|A|=2$, then this scheme reduces to bit-by-bit XOR, which allows adversaries to arbitrarily permute ciphertext symbols without affecting MAC tags. If $|A|=2$, create a two-bit block permutation (block cipher) — a permutation of $\{0,1,2,3\}$ — and process ciphertext bits two at a time, adding padding if necessary.

Update: Using OTP to encrypt the CBC-MAC tag achieves two things: It prevents adversaries from cryptanalyzing the tag directly or using it to cryptanalyze CBC-MAC (perfect secrecy) and it prevents message extension attacks by transforming CBC-MAC into ECBC-MAC.

Update: This used to suggest using MAC keys only once. I updated my answer so that MAC keys can be reused indefinitely. (However, each OTP key used for encryption needs $N$ extra random symbols.)

I also removed discussion about simplifying MAC keys and using random functions instead of random permutations. The scheme above is sufficient.

A Note on Practicality

Large $|A|$ (such as $28$ as in the OP's question) makes generating and using random permutations by hand more difficult. Transforming plaintexts to and from a decimal code (even A = $0$, B = $1$, and so on) reduces $|A|$ to $10$, creating MAC keys that are easier to use.

user103480

I didn't think it has to be this complicated, but the complexity is slowly creeping up.

We make every message the same length: 160 characters, from your question. My method adds an extra message (the header) into the sender's message, plus the message again, so in transit the message is 327 characters. The sender pads the end of their text with white space. This may eat your key faster if you typically send 1- or 2-character messages, but it stops someone fingerprinting you by matching message lengths and allows the rest to work. Spaces at the end of the message will not be seen by the receiver.

Now, before sending each message, we build the header, which needs to include an identifying string, a message checksum of some type, the entire message encrypted again using more of the OTP, and a new random start index of the message. Begin the message at the index and wrap around, and place this header at a random location in the message. Your receiving program would take out this header, then fix the message's order. Something like:

original message:"this is the message but one six zero chars"
padded message:"this is the message but one six zero chars                                                                                                                      "
identifying string:"zzzz"
checksum (my guess at the LRC character):"f"
random header position(0 to 160+7-1):7
random start index (1 to 28^2): 2 : "ab"
header:"zzzzfab"
next 160 OTP characters XORed with the message:"(160 rnd chars)"
message in transit:"(160 rnd chars) this izzzzfabs the message but one six zero chars                                                                                                                     "
message at receive:"this is the message but one six zero chars"

The identifying string should be strange and long enough not to appear in your messages. When initially transferring the OTP key also agree on this identifying string. The sender should avoid sending the identifying string in their plain message as it would goof up the receiver (but there are methods to deal with this).

The checksum, random header position and random message start index are there to limit the effectiveness of MITM attacks by forcing the attacker to guess at values. The checksum is the LRC of the other 326 characters in the message (it mainly serves to stop a MITM adding 1 to each character). The way I would implement the message start index is to have at both ends a table with 28^2 = 784 different start locations, but you could make it more complex by changing the direction of text or in other ways. For example: "aa" means start at 1 (no change), "ab" means start at 2, "ft" means start at 160, "zz" means start at 86. This table does not have to be secret; the added security comes from the index being sent with the message.

If you want to decrypt and verify, then skip 160 chars of your OTP, decrypt the whole message, skip back to the start of your OTP and decrypt the first half of the message. To check whether it is authenticated, make sure the values match. If you do not need to validate, you can just throw away the first half and skip along in the OTP.

Re-encrypting the whole message has a cost: it eats 3 bits of your OTP for each effective bit of message you send. And the message is now twice as long as the information you are sending.

Because of the scrambled order a MITM has a 1/167 chance of guessing the character position if they were attempting to change one character from the first 160 and the matching character from the remaining.

daniel

Authentication in one-time pad systems was a problem in WW2. How it was solved historically in the SOE at least: an agent has a pre-agreed error pattern (like making a spelling mistake in the seventh word). If he/she omitted to do this, it meant that he/she had been captured and was sending under duress. The captor did not know this pattern presumably, and would not suspect a correctly sent message. So the authentication key was the pattern of error, essentially.

Henno Brandsma

Use a secret permutation of the alphabet as the top and side of the tabula recta used to encrypt and decrypt messages.

Meler Lawler

Encrypt then MAC. We will compute a MAC of $N$ characters from the ciphertext. Pad the message to a multiple of $N$ characters, if needed (since all transmitted messages are $160$ characters long, just choose $N$ to be an integer such that there exists an integer $k$ where $N \times (k + 1) = 160$).

Divide the ciphertext into groups of $N$ characters. Set the MAC to an arbitrary initial value, such as a group of $N$ 'A' characters. For each group of ciphertext, use the group as if it were one-time pad key material to encrypt the MAC, making the result the new MAC, until all groups have been used in this way. Then use $N$ characters of actual one-time pad key material to encrypt the MAC, and this is the final MAC. Append the MAC to the message.

Weakness: A change in one character of the ciphertext will only change one character of the MAC.

Attack: Changes can be made to particular characters in the ciphertext, which will balance out and produce an unchanged MAC.
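The balancing attack is easy to see in a small sketch. I assume here that "encrypt" means character-wise addition modulo 28 over the alphabet ordered letters, space, full stop; the function names, the sample ciphertext and the pad symbols are made up for illustration.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ."  # assumed ordering
SIZE = len(ALPHABET)


def add_groups(a, b):
    """Character-wise addition modulo 28 (the assumed OTP combiner)."""
    return "".join(
        ALPHABET[(ALPHABET.index(x) + ALPHABET.index(y)) % SIZE]
        for x, y in zip(a, b)
    )


def chained_mac(ciphertext, pad_symbols):
    """Fold each ciphertext group into the MAC, then mask with real pad material."""
    n = len(pad_symbols)
    if len(ciphertext) % n:
        raise ValueError("ciphertext length must be a multiple of the MAC length")
    mac = "A" * n  # arbitrary initial value
    for start in range(0, len(ciphertext), n):
        mac = add_groups(mac, ciphertext[start:start + n])
    return add_groups(mac, pad_symbols)


def shift(ch, delta):
    return ALPHABET[(ALPHABET.index(ch) + delta) % SIZE]


ciphertext = "QZ.KHW BXAMR"  # 12 characters, three groups of N = 4
forged = list(ciphertext)
forged[0] = shift(forged[0], +1)  # first character of group 1
forged[4] = shift(forged[4], -1)  # first character of group 2 cancels it
forged = "".join(forged)

print(chained_mac(ciphertext, "TG.Y") == chained_mac(forged, "TG.Y"))  # True
```

Because each MAC character is just the sum of one column of ciphertext characters plus a pad character, any pair of opposite changes within a column cancels, and the attacker needs no key material to arrange that.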

AleksanderCH