
I am looking for a proof-of-work scheme which cannot be effectively parallelized.

For example, in hashcash (and by extension Bitcoin) you have some collision-resistant hash function $f$, a target $T$, and some constant $C$. You obtain the proof of work by computing $P=f(C, N)$ for some nonce $N$ that is incremented with each iteration, until $P\lt T$ (intuitive definition). The prover then publishes $C, N$ and the verifier checks the condition by running the hash function once.

This can be easily parallelized by using multiple processors and assigning a disjoint portion of the nonce space to each one.

I know that scrypt aims to be memory-intensive and thus expensive, and that bcrypt does something similar.

My approach so far is to chain a secondary proof: obtain $P$ as described above, then set $C' \gets P$ and run $P'=f(C',N')$ until $P'\lt T$ as before. This forces a parallel environment to do the same job as the first scheme, provided the adversary can only afford $\max(N)$ processors. It also penalizes single-processor systems (and it's kind of a dumb solution anyway).
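For concreteness, here is a rough sketch of this chained construction in Python (the names are my own, with SHA-256 standing in for $f$):

```python
# Two chained hashcash searches: the second search cannot begin
# until the first proof exists, so it must be done sequentially.
import hashlib

def pow_search(c: bytes, target: int) -> tuple[int, bytes]:
    """Increment the nonce until f(C, N) falls below the target."""
    n = 0
    while True:
        p = hashlib.sha256(c + n.to_bytes(8, "big")).digest()
        if int.from_bytes(p, "big") < target:
            return n, p
        n += 1

target = 1 << 250                          # toy difficulty: hash must fall below this
n1, p1 = pow_search(b"challenge", target)  # first proof: parallelizable over the nonce space
n2, p2 = pow_search(p1, target)            # second proof: must wait for p1
```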

I tried to look into literature but unfortunately I'm not mature enough for it. I also understand my question is borderline reference-request but I think I could argue it is acceptable. I leave judgement to you.

To conclude: Is there a proof-of-work scheme based on some hard-to-parallelize (P-complete) problem? Intuition tells me that a problem based on work (repetition) is inherently parallelizable.

rath

3 Answers


What I will describe is "the RSA timelock proof of work protocol",
which does not seem to have a canonical reference.

The server generates large random primes $p$ and $q$, then calculates $M = p \cdot q$, $\; L = \left\lfloor \sqrt{M} \right\rfloor + 1$, and $\; \lambda = \operatorname{lcm}(p-1,\, q-1)$.
The server publishes $M$, stores $L$ (not necessarily secretly), and keeps $\lambda$ secret.
To issue a challenge with difficulty parameter $T$, the server samples $n$ from $\{L,\, L+1,\, L+2,\, \ldots,\, M-L\}$ and sends $T$ and $n$ to the challenge's recipient.
To solve a challenge, the recipient iterates $\: n \mapsto n^2 \bmod M \:$ a total of $T$ times and then sends the result to the server.
The client (if honest) will end up with $\: n^{(2^T \bmod \lambda)} \bmod M \:$, which the server can easily calculate with a single modular exponentiation, using its knowledge of $\lambda$.
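A minimal sketch of this protocol in Python, assuming small hard-coded primes purely for illustration (a real server would generate large random primes):

```python
# Sketch of the RSA timelock proof of work described above.
from math import gcd, isqrt
import random

# --- Server setup ---
p, q = 10007, 10009              # toy primes; a real server uses large random ones
M = p * q                        # published modulus
L = isqrt(M) + 1                 # stored, not necessarily secret
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)   # lcm(p-1, q-1), kept secret

# --- Server issues a challenge ---
T = 100_000                      # difficulty parameter: number of squarings
while True:
    n = random.randint(L, M - L)  # sample n from {L, ..., M-L}
    if gcd(n, M) == 1:            # guard needed only because the toy primes are small
        break

# --- Recipient: T sequential squarings; no known shortcut without lambda ---
x = n
for _ in range(T):
    x = (x * x) % M

# --- Server verifies with one exponentiation, reducing the exponent mod lambda
#     (valid because gcd(n, M) = 1) ---
assert x == pow(n, pow(2, T, lam), M)
```

Each squaring needs the previous result, so the $T$ iterations are inherently sequential; the server's verification shortcut works only because it knows $\lambda$.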


Any chained task in which each iteration takes as input the output of the prior one cannot be parallelized. PBKDF2 is an example of such an algorithm.
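For instance, here is a minimal sketch of such a chain in Python; the helper is my own, but `hashlib.pbkdf2_hmac` chains HMAC invocations the same way internally:

```python
# Sequential hash chain: step i consumes the output of step i-1,
# so the iterations cannot be distributed across cores.
import hashlib

def hash_chain(seed: bytes, iterations: int) -> bytes:
    digest = seed
    for _ in range(iterations):
        digest = hashlib.sha256(digest).digest()
    return digest

proof = hash_chain(b"challenge", 1_000_000)

# PBKDF2 relies on the same kind of chaining:
key = hashlib.pbkdf2_hmac("sha256", b"password", b"salt", 100_000)
```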

John Deters

I played with an idea for an asymmetric proof-of-work scheme that I put a small amount of thought into making parallel-resistant.

I will explain the basic idea of the proof-of-work scheme first, followed by how I attempted to make it parallel-resistant.

  • The challenge creator hashes a small number of random bytes with a random IV.
  • A desired number of challenges of a given difficulty weight are generated this way (e.g. sixteen 1-byte hashes, four 2-byte hashes, one 3-byte hash, $x$ $y$-byte hashes, ...) and sent to the solver along with the random IV.
  • The solver cracks each hash individually. The time required to crack a hash scales exponentially with the number of unknown bytes.

Cracking the few unknown bytes takes significantly more work than hashing them did. Creating and verifying should be effectively constant time, as hashing $N$ bytes is effectively a constant-time operation, while cracking should be an exponential-time operation.
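A minimal sketch of that asymmetry in Python (the names and parameters are my own, with SHA-256 standing in for the hash):

```python
# Creating a challenge costs one hash; solving it costs up to 256^n hashes.
import hashlib
import itertools
import os

def make_challenge(num_bytes: int, iv: bytes) -> bytes:
    secret = os.urandom(num_bytes)          # the few random bytes to recover
    return hashlib.sha256(secret + iv).digest()

def crack(challenge: bytes, num_bytes: int, iv: bytes) -> bytes:
    # Brute-force every possible preimage of the given length.
    for candidate in itertools.product(range(256), repeat=num_bytes):
        guess = bytes(candidate)
        if hashlib.sha256(guess + iv).digest() == challenge:
            return guess
    raise ValueError("no preimage found")

iv = os.urandom(16)
challenge = make_challenge(2, iv)    # creation: one hash
solution = crack(challenge, 2, iv)   # solving: up to 2^16 hashes
```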

Now, to make it parallel-resistant. There are certain block cipher modes of operation which are not parallel-friendly. The idea was to adapt and integrate these modes of operation with the series of hash inputs and outputs, as if the hash input were the plaintext and the hash output were the ciphertext.

A really simple example to demonstrate, not necessarily based on a block cipher mode of operation: when generating successive hashes, the recovered bytes of the previous hashes can be included as input to the current one:

  • challenge1 = hash(random_bytes_1 + iv)
  • challenge2 = hash(random_bytes_1 + random_bytes_2 + iv)
  • challenge3 = hash(random_bytes_1 + random_bytes_2 + random_bytes_3 + iv)

Thus, in order for the solver to begin cracking challenge2 without accumulating extra work, challenge1 must be cracked first.
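A short sketch of that chained generation (again with SHA-256 and names of my own choosing):

```python
# Each challenge's input includes the secret bytes of all previous
# challenges, so challenge i+1 cannot be attacked cheaply until
# challenge i has been cracked.
import hashlib
import os

iv = os.urandom(16)
prefix = b""
challenges = []
for _ in range(3):
    prefix += os.urandom(1)             # accumulate this round's secret byte
    challenges.append(hashlib.sha256(prefix + iv).digest())
```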

Note that with that example, if the solver is willing to forgo the work reduction, they can start cracking whichever hash they want. So that particular example might not be very effective against GPUs, as brute-forcing a 2-byte hash probably takes more or less the same time as brute-forcing a 4-byte hash on such hardware.

Ella Rose