0

For multiple OTP keys, why not do this:

Key = Hash[(EC)DH Shared Secret + Counter]

The counter will increase per message, and the output of the hash will be random for each new unique input into it. What are the problems or weaknesses with this design?

EDIT: For clarification: both parties will have the shared secret from an ECDH exchange, with the ECC keys generated from random bytes produced by Python's os.urandom.

SamG101

3 Answers

3

This principle is commonplace: it is key derivation from a shared secret. The academic thing to do is to use a Key Derivation Function, or a MAC in place of the hash of a concatenation, perhaps $$\operatorname{HMAC-Hash}\;[\;\text{Key}=\text{(EC)DH Shared Secret},\,\text{Message}=\text{Counter}\;]$$

We call the output a derived key, and it can be a one-use key.
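
A minimal Python sketch of the HMAC form above, using only the standard library. The choice of SHA-256, the 8-byte big-endian counter encoding, the function name `derive_key`, and the placeholder secret are assumptions made for illustration; none of them come from the question.

```python
import hashlib
import hmac


def derive_key(shared_secret: bytes, counter: int) -> bytes:
    """Return HMAC-SHA256(key=shared_secret, message=counter) as a 32-byte derived key."""
    if counter < 0:
        raise ValueError("counter must be non-negative")
    # Fixed-width big-endian encoding, so distinct counters give distinct messages.
    message = counter.to_bytes(8, "big")
    return hmac.new(shared_secret, message, hashlib.sha256).digest()


# Placeholder secret for illustration only; a real one would come from (EC)DH.
shared_secret = bytes(range(32))
key_0 = derive_key(shared_secret, 0)
key_1 = derive_key(shared_secret, 1)
```

Both parties compute the same `key_0`, `key_1`, ... as long as they agree on the shared secret and on the counter encoding.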

I would not call it a One Time Pad key, even if it is used only once to mask some payload with XOR, because the usual assumption is that a One Time Pad (key) is truly random, and this is only pseudo-random.

fgrieu
2

We call this idea a key derivation function (hashing the ECDH secret) for a stream cipher (expanding a short secret and a counter into a long pad).

This idea is not problematic or weak; in fact, it is ubiquitous. For example, a part of the TLS protocol essentially works as follows:

  1. Agree on an ECDH secret $s$.
  2. Derive a session key $k = H(s)$ by hashing $s$, where $H$ is the TLS PRF with a particular label.
  3. For the $i^{\mathit{th}}$ message, expand $k$ into a per-message pad $p_i = H'(k, i)$, where $H'$ is ChaCha or AES-CTR.
  4. Encrypt the $i^{\mathit{th}}$ message $m_i$ with the ciphertext $c_i = m_i \oplus p_i$.
  5. (Also authenticate the ciphertext with $k$ and $i$, because preventing forgery is practically always at least as important as keeping secrets.)
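
The five steps above can be sketched in Python as follows. This is not TLS: HMAC-SHA256 stands in for both $H$ and $H'$ (instead of the TLS PRF and ChaCha or AES-CTR), the labels `b"session key"`, `b"pad"` and `b"tag"` are made up for domain separation, and the ECDH secret is a placeholder. It only illustrates the shape of the construction.

```python
import hashlib
import hmac


def _hmac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()


def session_key(s: bytes) -> bytes:
    # Step 2: k = H(s), with a fixed (hypothetical) label.
    return _hmac(s, b"session key")


def message_pad(k: bytes, i: int, length: int) -> bytes:
    # Step 3: p_i = H'(k, i), expanded block by block to the message length.
    out = b""
    block = 0
    while len(out) < length:
        out += _hmac(k, b"pad" + i.to_bytes(8, "big") + block.to_bytes(4, "big"))
        block += 1
    return out[:length]


def auth_tag(k: bytes, i: int, c: bytes) -> bytes:
    # Step 5: authenticate the ciphertext together with the message index.
    return _hmac(k, b"tag" + i.to_bytes(8, "big") + c)


def encrypt(k: bytes, i: int, m: bytes) -> tuple[bytes, bytes]:
    # Step 4: c_i = m_i XOR p_i, then compute the tag over c_i.
    p = message_pad(k, i, len(m))
    c = bytes(a ^ b for a, b in zip(m, p))
    return c, auth_tag(k, i, c)


def decrypt(k: bytes, i: int, c: bytes, t: bytes) -> bytes:
    # Verify the tag in constant time before touching the ciphertext.
    if not hmac.compare_digest(t, auth_tag(k, i, c)):
        raise ValueError("authentication failed")
    p = message_pad(k, i, len(c))
    return bytes(a ^ b for a, b in zip(c, p))


# Step 1 is assumed to have happened already; this secret is a placeholder.
k = session_key(bytes(range(32)))
ciphertext, tag = encrypt(k, 0, b"attack at dawn")
plaintext = decrypt(k, 0, ciphertext, tag)
```

Each message index must be used at most once per session key; reusing $i$ reuses the pad.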

Of course, the security is only as good as the security of the ECDH system, the KDF, and the stream cipher—if you use a tiny curve and a hash function with a 32-bit pipe and RC4 as the stream cipher, it won't provide much security.

But if you use a reasonable choice like HKDF-SHA256 to hash an X25519 ECDH secret and use ChaCha/Poly1305 to encrypt and authenticate messages, you'll be fine—any security problems you have won't arise as a consequence of this construction. You could even use HKDF-SHA256 to expand the secret into a long pad directly without ChaCha (but it wouldn't be very fast).
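
As a rough illustration of that last remark, here is HKDF-SHA256 (extract and expand as defined in RFC 5869) written with the Python standard library and used directly as a pad. The salt, the `info` layout (a made-up label plus an 8-byte message counter), and the placeholder secret are assumptions; the sketch also leaves out authentication, which a real design would still need.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # RFC 5869: an absent salt is replaced by HASH_LEN zero bytes.
    if not salt:
        salt = bytes(HASH_LEN)
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    # RFC 5869 caps the output at 255 hash blocks (8160 bytes for SHA-256).
    if length > 255 * HASH_LEN:
        raise ValueError("requested pad is too long for a single HKDF expand")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) | info | i)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


# Placeholder secret; a real one would be the X25519 output.
prk = hkdf_extract(b"", bytes(range(32)))
message = b"hello, world"
# Hypothetical per-message info: a label plus the message counter.
info = b"pad" + (0).to_bytes(8, "big")
ciphertext = xor_bytes(message, hkdf_expand(prk, info, len(message)))
recovered = xor_bytes(ciphertext, hkdf_expand(prk, info, len(ciphertext)))
```

Because of the 8160-byte cap per expand call, longer traffic needs a fresh `info` value (for example the next counter) for each message.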


P.S. There is a funny social phenomenon whereby you're not allowed to say the words ‘one-time pad’ in this context. For some reason, the term ‘one-time pad’ is only allowed either for the idealized model $c_i = m_i \oplus p_i$ when the $p_i$ are exactly uniformly distributed, or for historical use of that model (however biased or reused the pad may have been historically)—it is never allowed when you use the model in modern cryptography as you just described with $p_i$ chosen as a pseudorandom function of a secret key, even though the modern notion of stream cipher is obviously inspired and justified by the security of the one-time pad model.

Only the term ‘one-time pad’ works like this; nobody bats an eye when a cryptographer studies an idealized model like $f(\cdots f(f(\mathit{iv}, m_1), m_2) \cdots, m_n)$ for uniform random $f$ and then instantiates it with $f = \operatorname{AES}_k$ but calls both the idealized model and the instantiation ‘CBC’—it is only forbidden to call the idealized model $c = m \oplus p$ instantiated with $p = \operatorname{AES}_k(0)$ a ‘one-time pad’. Perhaps it's because cranks are attracted to red herrings like the value-laden name of the technical property ‘perfect secrecy’ of the one-time pad model, so obsession with one-time pads is a useful indicator for cranks.

Squeamish Ossifrage
-3

The main problem is that it's not truly random, and therefore unsuitable for a real-by-definition one time pad. Addressing your comment, which forms the crux:

"The shared secret will be calculated from keys that were generated using Python's os.urandom function, and whilst not truly random, they are unseeded and generated from many different sources that are unpredictable."

Rewriting your formula, you have an iteration $k' = \text{HMAC}[\text{seed} \,\|\, c]$. Even without maths we can see that it's not truly random, since I just wrote it down algebraically. No matter how $\text{seed}$ is generated, it becomes fixed once it is incorporated into your formula. You are then simply incrementing the counter $c$. And as for "each new unique input": unique, yes, but entirely predictable.

As $c$ increments, the cumulative output will become longer than the program that created it. That breaks the fundamental definition of Kolmogorov randomness, and breaks the perfect secrecy paradigm of the one time pad.
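
One way to write that argument down, as a sketch: let $K(\cdot)$ denote Kolmogorov complexity, $p_c = \text{HMAC}[\text{seed} \,\|\, c]$ the $c$-th output, and $\ell$ the output length in bits (these symbols are introduced here for illustration). A program that stores the seed and the number of outputs $n$ reproduces the whole stream, so

$$K(p_1 \,\|\, p_2 \,\|\, \cdots \,\|\, p_n) \;\le\; |\text{seed}| + O(\log n) + O(1), \qquad \text{while} \qquad |p_1 \,\|\, p_2 \,\|\, \cdots \,\|\, p_n| = n\ell .$$

The left-hand bound grows only logarithmically in $n$ and the stream length grows linearly, so for large enough $n$ the stream is far from incompressible.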

Whenever you describe the output of an OTP-generating process, the description must be longer, not shorter, than the pad itself. Even if only by a teensy-weensy little bit, like a print statement or full stop. It's another way of saying that you can't write down an OTP in algebraic terms. What you have is a mutually agreed upon pseudorandom generator.


The Wikipedia article on Kolmogorov complexity is pretty good.

Paul Uszak