Would this simple encrypted chat program be feasible using One Time Pads?

Question

Let's say I want to build a simple chat program with encryption that is impossible for anyone to crack, even a theoretical government with a massive supercomputer. OK, here's my idea. You'll need to read all the points to get the whole picture, I think.

Only two people can communicate with each other with the chat program. No group conversations.
The people will be communicating over the internet and darknets.
The chat program will just handle basic characters, numbers and symbols that are on a standard US keyboard. This is to keep things simple.
We'll say there's a limit of 160 characters per single chat message. But obviously being a real-time chat program over the internet you can type up more than one message. Chat person #1 (Bob) generates a large list of secret random keys (or pads). Each one time pad is 160 characters long. So we end up with a long list of pads. These get put in a SQLite database or something with primary keys numbered from say 1 to 5000.
Bob copies the database to a USB drive/CD/DVD/Blu-ray and meets chat person #2 (Alice) in person and gives them the database. Alice loads that database up on her computer. Then they secure erase the USB drive or burn/smash the CD/DVD/Blu-ray. Anyway now they both have the same list of one time pads on their computers.
Ok Bob, seeing he generated the list, he uses the odd numbered primary key pads to encrypt and send messages to Alice (#1, #3, #5 etc). Alice being the person who loaded the list, uses the even numbered primary key pads (#2, #4, #6 etc) to encrypt and send messages back to Bob.
Each person's program always keep a record of which numbered pad they've used, so that a message doesn't get encrypted twice with the same pad. There's no mixup in one person using the same pad as the other because they're using odd and even numbered pads. The number of the pad to be used for decrypting the message could be tacked onto the end of the message. The number of the pad to be decrypted wouldn't necessarily need to be encrypted either it could just be on the end of the message eg "#123".
Now to secure the session, and provide a layer of protection against MITM attacks, the whole chat session could be encrypted with SSH or SSL maybe.
After each message is sent by the user, it and the pad are secure-erased from their device. Once the receiver has read the message, it is secure-erased from their device as well.
Pads are stored on the device using a combination of 3 strong block ciphers and a strong user password that is required to be entered before the chat program will run.
If a user thinks they will very soon become compromised they can secure-wipe the pads on their device quickly with a button. This will also notify the other chat user that they should wipe their pads as well. Secure wipe algorithm probably something like Gutmann 35 pass.
Now suppose the chat session gets intercepted by an attacker (Eve) who wants to interfere with the communications, maybe trick Bob into sending another message again with the same pad so they can figure out what was said. Well she won't be able to because Bob is maintaining a list of all the pads he's used and all the numbered pads that Alice has sent him. If attacker Eve tries to send a different message to Bob to pretend it's from Alice then she can't because an HMAC of the message and pad is sent along with the ciphertext. The receiver will verify the HMAC matches at their end to ensure no tampering has occurred.
If Eve tampers with the number on the end specifying the pad to decrypt with then the receiver will try and decrypt that message and realise the message is garbled so they will know something is up. If she tried to re-use an old pad then Bob would know too as that message has already been received indicating tampering. Also if the number increased significantly from the last message received eg last message received was #200 and the next one received was #230 then you'd know there's been missing messages or someone playing man in the middle.
When the two parties have used up their 5000 messages then the chat session is ended and they'll need to regenerate a list of pads again and give them to the other chat party in person so they can continue chatting. Obviously 5000 pads is an arbitrary number. You could in theory generate a million pads and have communication with them for a life time.

Ok so that's my idea in a nutshell. Are there any potential flaws or weaknesses, or possible attack vectors you can see? What improvements would you make to the idea?

score 14 · Accepted Answer · answered Oct 26 '12 at 06:33

Only two people can communicate with each other with the chat program. No group conversations.

This is fairly limited, but let's accept it.

The people will be communicating over the internet.

So, an insecure channel. OK.

The chat program will just handle basic characters, numbers and symbols that are on a standard US keyboard. This is to keep things simple.

This doesn't really matter - all that stuff is just bits and the actual contents of the data is irrelevant for encryption and security. It does make implementation easier, though.

We'll say there's a limit of 160 characters per single chat message. But obviously being a real-time chat program over the internet you can type up more than one message. Chat person #1 (Bob) generates a large list of secret random keys (or pads). Each one time pad is 160 characters long. So we end up with a long list of pads. These get put in a SQLite database or something with primary keys numbered from say 1 to 5000.

OK. This is where it starts to go downhill. First, you need to generate completely random pads. These can't be pseudo-random! Random only. So you're already putting 100% of your security into your random number generator, which can either be a hardware device, or an entropy pool provided by your operating system (like /dev/random). This can also be slow!
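As a rough sketch of the pad-generation step, the following fills an SQLite table with 5000 pads of 160 bytes each. The table name `pads`, the function name `generate_pads` and the in-memory database are assumptions made for illustration. Note also that `secrets.token_bytes` draws from the operating system's cryptographically secure generator, which does not strictly satisfy the "random only" requirement above; a hardware source would have to replace that call.

```python
import secrets
import sqlite3

PAD_LEN = 160     # bytes per pad, matching the 160-character message limit
PAD_COUNT = 5000  # arbitrary, as in the question

def generate_pads(conn, count=PAD_COUNT, pad_len=PAD_LEN):
    """Fill a (hypothetical) `pads` table with `count` pads keyed 1..count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pads (id INTEGER PRIMARY KEY, pad BLOB NOT NULL)"
    )
    # Assumption: the OS CSPRNG stands in for a true hardware random source.
    rows = ((i, secrets.token_bytes(pad_len)) for i in range(1, count + 1))
    conn.executemany("INSERT INTO pads (id, pad) VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # a real program would use a file on disk
generate_pads(conn)
```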

Bob copies the database to a USB drive/CD/DVD/Blu-ray and meets chat person #2 (Alice) in person and gives them the database. Alice loads that database up on her computer. Then they secure erase the USB drive or burn/smash the CD/DVD/Blu-ray. Anyway now they both have the same list of one time pads on their computers.

OK, why not. Now the attacker needs physical access to either computer to get hold of the pads; that is acceptable.

Ok Bob, seeing he generated the list, he uses the odd numbered primary key pads to encrypt and send messages to Alice (#1, #3, #5 etc). Alice being the person who loaded the list, uses the even numbered primary key pads (#2, #4, #6 etc) to encrypt and send messages back to Bob.

This works, yes.
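A minimal sketch of the odd/even split, assuming a hypothetical `PadAllocator` class that each side runs locally. In a real program the counter would have to be persisted, so that a restart can never hand out the same pad number twice.

```python
class PadAllocator:
    """Hands out pad numbers for one sender and never repeats one."""

    def __init__(self, is_generator: bool, total: int):
        # The party who generated the list (Bob) takes odd numbers,
        # the other party (Alice) takes even numbers.
        self.next_id = 1 if is_generator else 2
        self.total = total

    def take(self) -> int:
        if self.next_id > self.total:
            raise RuntimeError("pads exhausted; a new in-person exchange is needed")
        pad_id = self.next_id
        self.next_id += 2
        return pad_id

bob = PadAllocator(is_generator=True, total=5000)
alice = PadAllocator(is_generator=False, total=5000)
```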

Each person's program always keep a record of which numbered pad they've used, so that a message doesn't get encrypted twice with the same pad. There's no mixup in one person using the same pad as the other because they're using odd and even numbered pads. The number of the pad to be used for decrypting the message could be tacked onto the end of the message. The number of the pad to be decrypted wouldn't necessarily need to be encrypted either it could just be on the end of the message eg "#123".

If you send the pad number each time, it leaks some information about how many messages have been sent so far.

Now to secure the session, and provide a layer of protection against MITM attacks, the whole chat session could be encrypted with SSH or SSL maybe.

Well, if you're going to be using SSL or SSH, there is no point in using the one-time pad, as I noted in comments, but let's assume this is for fun. Note that SSH/SSL does not prevent MITM by itself - trusted certificates prevent it. You don't even need it in this case, anyway, since you have a shared secret in the message pad which you can use to authenticate both parties securely.

Now suppose the chat session gets intercepted by an attacker (Eve) who wants to interfere with the communications, maybe trick Bob into sending another message again with the same pad so they can figure out what was said. Well she won't be able to because Bob is maintaining a list of all the pads he's used and all the numbered pads that Alice has sent him. If attacker Eve tries to send a different message to Bob to pretend it's from Alice then she can't because she doesn't have any pads to encrypt the message with.

This isn't quite true - see the next point.

If Eve tampers with the number on the end specifying the pad to decrypt with then the receiver will try and decrypt that message and realise the message is garbled so they will know something is up. If she tried to re-use an old pad then Bob would know too as that message has already been received indicating tampering. Also if the number increased significantly from the last message received eg last message received was #200 and the next one received was #230 then you'd know there's been missing messages or someone playing man in the middle.

You need to objectively quantify "realise the message is garbled". This is generally done by providing a message authentication code (MAC) along with the ciphertext. This MAC can only be generated by whoever possesses the shared secret (in this case, the pad). If the MAC that arrives with the message (computed by the sender over the original ciphertext) does not match the MAC the receiver computes over the ciphertext it actually received, then the ciphertext has been modified in transit (even a single bit flip will be detected).
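A minimal sketch of such a check using HMAC-SHA256 from the Python standard library. The function names `make_tag` and `verify_tag` are hypothetical, and the sketch assumes a separate shared `mac_key` (extra secret bytes exchanged alongside the pads) rather than reusing the pad bytes that encrypt the message. The pad number is included in the MAC input so that tampering with it is detected too.

```python
import hashlib
import hmac

def make_tag(mac_key: bytes, pad_id: int, ciphertext: bytes) -> bytes:
    """MAC over the pad number and the ciphertext (encrypt-then-MAC)."""
    message = pad_id.to_bytes(4, "big") + ciphertext
    return hmac.new(mac_key, message, hashlib.sha256).digest()

def verify_tag(mac_key: bytes, pad_id: int, ciphertext: bytes, tag: bytes) -> bool:
    """Recompute the MAC over what was received and compare in constant time."""
    return hmac.compare_digest(make_tag(mac_key, pad_id, ciphertext), tag)
```

Note that HMAC is only computationally secure, so this step falls outside the "unbreakable" guarantee of the pad itself.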

This is particularly important because OTP ciphertext is malleable, which makes it very weak against a known-plaintext attack. If the attacker knows the message is something along the lines of "meet me at 10 baker street", he can, without knowing the pad, change this to "meet me at 04 baker street" using the properties of XOR (or whatever combining operation you use).
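The bit-flipping attack can be sketched as follows, assuming XOR as the combining operation and assuming the attacker has correctly guessed that "10" sits at byte offset 11. The attacker's steps only touch the ciphertext; the pad is used solely to show what the receiver would decrypt.

```python
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

plaintext = b"meet me at 10 baker street"
pad = secrets.token_bytes(len(plaintext))
ciphertext = xor(plaintext, pad)

# Attacker's side: knows (or guesses) the text at OFFSET, never sees the pad.
OFFSET = 11
delta = xor(b"10", b"04")  # difference between known and desired text
forged = bytearray(ciphertext)
for i, d in enumerate(delta):
    forged[OFFSET + i] ^= d

# Receiver's side: decrypts the forged ciphertext with the genuine pad.
received = xor(bytes(forged), pad)
print(received)
```

Without a MAC, the receiver has no way to tell that `received` is not what the sender wrote.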

When the two parties have used up their 5000 messages then the chat session is ended and they'll need to regenerate a list of pads again and give them to the other chat party in person so they can continue chatting. Obviously 5000 pads is an arbitrary number. You could in theory generate a million pads and have communication with them for a life time.

Well, the problem is obviously keeping those pads safe. If you can ensure that, then sure, there is no problem. The pads also have to be wiped when you're done with them, but in practice this isn't too hard if you are diligent about it.
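One possible way to consume and delete a pad in a single step is sketched below, assuming the pads live in an SQLite table named `pads` (a hypothetical name) and using a hypothetical helper `take_pad`. `PRAGMA secure_delete` asks SQLite to overwrite deleted content with zeros, but this alone does not guarantee physical erasure on SSDs or journaling filesystems.

```python
import sqlite3

def take_pad(conn: sqlite3.Connection, pad_id: int) -> bytes:
    """Return pad `pad_id` and delete it so it cannot be used a second time."""
    with conn:  # one transaction: commit on success, roll back on error
        row = conn.execute("SELECT pad FROM pads WHERE id = ?", (pad_id,)).fetchone()
        if row is None:
            raise KeyError(f"pad {pad_id} is missing or already used")
        conn.execute("DELETE FROM pads WHERE id = ?", (pad_id,))
    return row[0]

# Demo setup with a dummy pad; real pads would come from the shared database.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA secure_delete = ON")
conn.execute("CREATE TABLE pads (id INTEGER PRIMARY KEY, pad BLOB NOT NULL)")
conn.execute("INSERT INTO pads (id, pad) VALUES (1, ?)", (b"\x00" * 160,))
conn.commit()
```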

So in total, there are a few flaws with your scheme:

- it is limited to two people
- the pads have to be generated randomly
- the pads have to be kept safe
- once the pads run out, a secure physical meeting has to be arranged to exchange pads
- the pads have to be utterly destroyed once used, to make sure previous messages can never be read
- there is no authentication/integrity checking which severely compromises the system should the conversation be guessable (cribs like "hello", etc... can all be compromised)

So it would work if you added authentication/integrity checking (such as HMAC-SHA256, or one of those unconditionally secure authentication schemes if you want to stay in the "perfect security" theme), but it would be relatively awkward, unwieldy, and tedious to use and maintain in real life. The killer being, of course, the random number generation.

Compare this with a scheme using some block cipher like AES - you only need to exchange a 128-bit key, which can be used essentially forever.
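To put rough numbers on that comparison, here is the arithmetic for the 5000 pads of 160 characters from the question, assuming one byte per character:

```python
PAD_COUNT = 5000
PAD_LEN_BYTES = 160        # assumes one byte per character
AES_KEY_BYTES = 128 // 8   # a 128-bit key

otp_material = PAD_COUNT * PAD_LEN_BYTES
ratio = otp_material // AES_KEY_BYTES

print(otp_material)   # 800000 bytes to hand over in person, good for 5000 messages
print(AES_KEY_BYTES)  # 16 bytes
print(ratio)          # 50000 times more key material
```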

All in all, as you would expect, the flaws are not in the cryptography itself - after all, this is unconditionally secure cryptography. The real issues are in managing all the pads, which is more of a practical problem than a theoretical one. Because in the real world, nothing is perfect - not even cryptography.

score 7 · Answer 2 · answered Oct 26 '12 at 16:17

Currently, your protocol has a fatal design flaw: it only considers an attacker who can manipulate the communication. You mention two security properties, confidentiality and integrity, and you implicitly expect some kind of deniability from using an OTP and its properties. But your protocol provides neither entity authentication nor perfect forward secrecy, and without the latter, compromising a person's database discloses every conversation that person has had. For that reason, your protocol has to ensure the irretrievable deletion of every used key!

I think the Off-the-Record (OTR) messaging protocol provides all the security properties you would like to have.

score 3 · Answer 3 · answered Oct 26 '12 at 08:46

As for the conspiracy theory issue: you should consider that the NSA deemed AES fit to protect National Security Systems and National Security Information (http://www.nsa.gov/ia/programs/suiteb_cryptography/). No one knows whether AES will ever be broken, apart from anyone who has already broken it, so either the NSA believes AES is secure, or it knows AES is not and that it is only a matter of time before someone else figures it out. Bruce Schneier has a short post on this (http://www.schneier.com/blog/archives/2012/03/can_the_nsa_bre.html), with which I tend to agree: factoring 1024-bit numbers is probably a better target, since SSL key exchange is mostly done through RSA.

score 3 · Answer 4 · answered Jun 14 '13 at 00:34

You want "a simple chat program that has encryption that is impossible to crack for anyone". This is not a cryptography problem; it is an information security problem. It cannot be solved by cryptography alone. Cryptography might be one tool, but cryptography alone cannot solve this problem. You cannot solve this problem through mere cryptography. Cryptography isn't enough. (Should I say it a fourth way?)

For instance, if Alice and Bob are chatting and the attacker compromises Alice's or Bob's computer, then the attacker can see their messages, no matter how much crypto you use. Before you scoff, this is probably a much greater risk. The security of the endpoints is far more likely to be the weakest link than the crypto. It's unlikely that anyone will crack modern crypto, but fairly likely that a sophisticated attacker like the NSA will be able to get spyware or malware onto the computer of an average end user. Just look at the Aurora attacks, where Chinese spies were able to compromise the computers of dozens of very sophisticated large companies, including even companies like Google and Microsoft. Their employees are far more knowledgeable and sophisticated about cryptography than the average end user, and their computers are probably far better administered and monitored than the average end user's, but they still got hacked.

So, you can't solve this problem through crypto alone. Your design is bad engineering, because it provides a false sense of security and doesn't address the greatest risks that are most likely to be an issue in practice. There are better-engineered solutions out there, but they involve taking a holistic look at all the risks, not just the ones that can be solved through straightforward application of cryptography.

Kind Contributor · Answer 5 · 2017-07-06T14:28:17.057

Enhancements:

@Thomas' answer said

…OTP is very weak to a known plaintext attack…

First, that's like saying sumo wrestlers drown underwater. Simply don't repeat the use of the OTP. Use then discard. Keep your sumos in a dry wrestling ring. Simple engineering, not a conceptual flaw.

(Bonus: Look at how most symmetric cryptography algorithms work, they're basically creating a continuous pseudorandom OTP stream with varying shuffling)

Second, I recommend that you still compress your message before sending. There are many options for that, each with varying benefits.

related: http://blog.alivate.com.au/improve-security-with-compression/

It doesn't eliminate the plaintext attack risk but does reduce it, increasing the difficulty of implementing an attack. I believe streaming compression is feasible which further reduces the risk because the tree state cannot be known midway through communication.

OP's question said

…no group conversations…

It's feasible to implement group conversation support:
- Multi-cast: Becomes increasingly less feasible as the group size increases. Each endpoint in the group needs a separate OTP with every other endpoint. 1:1 = 1 OTP, Group of 3 = 3 OTPs, Group of 4 = 6 OTPs, Group of 5 = 10 OTPs... (in general, a group of n needs n(n-1)/2 OTPs)
- Your Hub: You only need to maintain an OTP with a central trusted party. (That trusted party could be your own server used within your organisation). You never have an OTP to communicate directly with another member. Such a hub should be physical (not a VM). With a central Hub it also makes commercial sense to make it a channel bonding system - enabling the combination of multiple internet links for speed and redundancy.
- Third Party Hub: Much harder to guarantee security (physical security). But such a company can become very good at the generation of purely random OTP and also secure delivery of OTP (secured with AES). The hub could have physical restrictions requiring multiple people to access. Clear physical wiring rather than software routing makes it clear what is connected. Standard hardware (a simple OTP module) makes it clearer that it's not (very) susceptible to hacking.
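For the multi-cast option, the number of pairwise OTPs is the number of distinct pairs in the group, which can be checked quickly. The function name `pairwise_pads` is only illustrative.

```python
from math import comb

def pairwise_pads(group_size: int) -> int:
    """Number of distinct pairs in a group, i.e. n * (n - 1) / 2."""
    return comb(group_size, 2)

for n in range(2, 6):
    print(n, pairwise_pads(n))
```

With a hub, by contrast, each member holds a single OTP, so a group of n needs only n of them.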

Hacking

Even though you distribute the OTP physically, you are still susceptible to hacking, whereby the OTP stored on the USB drive can be read. You are not limited to "physical theft" only. There is one option for mitigating this risk:
- Simple Encryption Co-Processor: This could be a network attached device, USB, or a physical co-processor which has isolated routing to a single SATA port. The idea being, that the co-processor is much more difficult to hack. It could be theoretically impossible to hack if the command set is very small (Encrypt:[Message], Decrypt:[Message]) going to the co-processor. Remember it would need to be shielded against EMR spying (including over the power supply lines).

Other Symmetric Encryption

If your goal is security (rather than speed), it doesn't hurt to also have AES (or another cipher) over the top. This is particularly true of physical delivery - the OTP should be encrypted using AES and a shared key. But you could also use that key, or another one, for the actual communication too. Up to you.

Quantum Encryption

You must recognize that your closest competitor is QE. While QE offers theoretically perfect security, it suffers from a gap in practical implementation (like OTP). But unlike OTP, QE is much more complicated to implement in practice. Furthermore, QE is generally slower and lower-bandwidth, and is therefore used for distributing symmetric keys. It is also limited in distance (~100 km, generally, though there have been recent advances in quantum repeaters).

OTP with XOR is beautifully simple, and with multi-TB of tape or hard drive space has the benefit of much higher bandwidth. You can also split your delivery of OTP via different delivery routes and strategies (including AES encryption) and then you can combine the split OTPs with XOR to generate the final OTP.
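The split-delivery idea can be sketched like this, assuming each route carries one independently generated share of the same length and the final pad is the XOR of all shares. The names `make_shares` and `combine` are hypothetical.

```python
import secrets
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def make_shares(pad_len: int, routes: int) -> list[bytes]:
    """One independent random share per delivery route."""
    return [secrets.token_bytes(pad_len) for _ in range(routes)]

def combine(shares: list[bytes]) -> bytes:
    """XOR all shares together to obtain the final pad."""
    return reduce(xor, shares)

shares = make_shares(pad_len=160, routes=3)  # e.g. courier, post, in person
final_pad = combine(shares)
```

Both parties compute the same `final_pad` once every share has arrived, while someone who intercepts all but one route learns nothing about it, because the missing share is uniformly random.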

It would be pretty fun to operate an OTP encryption hub for the most secretive enterprises.
