8

Bcrypt uses Blowfish to encrypt a key derived from the passphrase, and Blowfish is a cryptographic algorithm. But here it is said that:

Note that Salsa20/8 Core is not a cryptographic hash function since it is not collision-resistant.

So how is this useful in scrypt?

Abdelouahab Pp

4 Answers

6

Salsa20/8 is not used to add cryptographic strength. Its job is to produce pseudorandomly ordered requests to RAM, which slows down FPGA/ASIC implementations of scrypt. For cryptographic strength, scrypt relies on PBKDF2-HMAC-SHA-256 (PBKDF2 instantiated with HMAC-SHA-256).

Here is a simplified variant of scrypt for p=1 (the parallelization parameter), for example with N=16384 and r=8. It is taken from the linked draft and reduced to the p=1 case:

Algorithm scrypt

Input:
        P       Passphrase, an octet string.
        S       Salt, an octet string.
        N       CPU/Memory cost parameter, must be larger than 1,
                a power of 2 and less than 2^(128 * r / 8).
        r       Block size parameter.
        p       Parallelization parameter, a positive integer
                less than or equal to ((2^32-1) * hLen) / MFLen
                where hLen is 32 and MFlen is 128 * r.
        dkLen   Intended output length in octets of the derived
                key; a positive integer less than or equal to
                (2^32 - 1) * hLen where hLen is 32.

Output:
        DK      Derived key, of length dkLen octets.

Steps:

 1. B[0] = PBKDF2-HMAC-SHA256 (P, S, 1, 128 * r)

 2. B[0] = scryptROMix (r, B[0], N)

 3. DK = PBKDF2-HMAC-SHA256 (P, B[0], 1, dkLen)

We can see that there are two PBKDF2-HMAC-SHA256 calls, one before ROMix and one after. These are what provide collision resistance for scrypt.
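As a rough illustration of the parameters mentioned above, Python's standard library exposes scrypt as `hashlib.scrypt` (available when Python is built against OpenSSL 1.1 or later). The password and salt below are made-up placeholders, and the `maxmem` value is an assumption chosen to leave headroom above the roughly 128 * r * N bytes that ROMix keeps in memory:

```python
import hashlib

# Hypothetical inputs, for illustration only.
password = b"correct horse battery staple"
salt = b"example-salt-16b"

N, r, p = 16384, 8, 1

# Step 1 of the outline: a single PBKDF2 iteration stretches (P, S)
# into 128 * r bytes, which is the input block for ROMix.
b0 = hashlib.pbkdf2_hmac("sha256", password, salt, 1, 128 * r)

# ROMix stores N blocks of 128 * r bytes each.
romix_bytes = 128 * r * N

# The full scrypt computation; maxmem is raised to 64 MiB as headroom.
dk = hashlib.scrypt(password, salt=salt, n=N, r=r, p=p,
                    maxmem=64 * 1024 * 1024, dklen=32)

print(len(b0))                # 1024
print(romix_bytes // 2**20)   # 16 (MiB)
print(dk.hex())
```

With these settings the memory cost comes entirely from ROMix, while both ends of the computation are plain PBKDF2-HMAC-SHA256 calls.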

And here is scryptROMix. It fills an N-element array V in which every element is scryptBlockMix of the previous element (step 2). Salsa20/8 is used inside scryptBlockMix, so within scryptROMix it determines both the transformations of X and the order of the read accesses to the V array (step 3):

Algorithm scryptROMix

Input:
        r       Block size parameter.
        B       Input octet vector of length 128 * r octets.
        N       CPU/Memory cost parameter, must be larger than 1,
                a power of 2 and less than 2^(128 * r / 8).

Output:
        B'      Output octet vector of length 128 * r octets.

Steps:

 1. X = B

 2. for i = 0 to N - 1 do
      V[i] = X
      X = scryptBlockMix (X)
    end for

 3. for i = 0 to N - 1 do
      j = Integerify (X) mod N
             where Integerify (B[0] ... B[2 * r - 1]) is defined
             as the result of interpreting B[2 * r - 1] as a
             little-endian integer.
      T = X xor V[j]
      X = scryptBlockMix (T)
    end for

 4. B' = X
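
The two algorithms above can be sketched in pure Python as follows. This is a minimal, slow, illustrative sketch for p = 1 and not a production implementation. The function names (`salsa20_8`, `block_mix`, `romix`, `scrypt_p1`) are made up for this example, and scryptBlockMix, which is not listed above, is written out here as I understand it from RFC 7914:

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

# Word indices for the four column quarter-rounds, then the four row ones.
QUARTERS = (
    (0, 4, 8, 12), (5, 9, 13, 1), (10, 14, 2, 6), (15, 3, 7, 11),
    (0, 1, 2, 3), (5, 6, 7, 4), (10, 11, 8, 9), (15, 12, 13, 14),
)

def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK) | (x >> (32 - n))

def salsa20_8(block):
    """Salsa20/8 core: 64 bytes in, 64 bytes out (4 double rounds)."""
    x = list(struct.unpack("<16I", block))
    z = list(x)
    for _ in range(4):
        for a, b, c, d in QUARTERS:
            z[b] ^= rotl((z[a] + z[d]) & MASK, 7)
            z[c] ^= rotl((z[b] + z[a]) & MASK, 9)
            z[d] ^= rotl((z[c] + z[b]) & MASK, 13)
            z[a] ^= rotl((z[d] + z[c]) & MASK, 18)
    # Feed-forward: add the input words to the permuted words.
    return struct.pack("<16I", *((xi + zi) & MASK for xi, zi in zip(x, z)))

def xor_bytes(a, b):
    return bytes(p ^ q for p, q in zip(a, b))

def block_mix(B, r):
    """scryptBlockMix over 2*r blocks of 64 bytes each."""
    blocks = [B[i:i + 64] for i in range(0, 128 * r, 64)]
    X = blocks[-1]
    Y = []
    for blk in blocks:
        X = salsa20_8(xor_bytes(X, blk))
        Y.append(X)
    # Even-indexed outputs first, then odd-indexed ones.
    return b"".join(Y[0::2] + Y[1::2])

def romix(B, N, r):
    """scryptROMix: fill V sequentially, then read it in a data-dependent order."""
    X = B
    V = []
    for _ in range(N):                      # step 2
        V.append(X)
        X = block_mix(X, r)
    for _ in range(N):                      # step 3
        j = int.from_bytes(X[-64:], "little") % N   # Integerify(X) mod N
        X = block_mix(xor_bytes(X, V[j]), r)
    return X

def scrypt_p1(P, S, N, r, dklen):
    """scrypt restricted to p = 1, following the three steps listed earlier."""
    B = hashlib.pbkdf2_hmac("sha256", P, S, 1, 128 * r)
    B = romix(B, N, r)
    return hashlib.pbkdf2_hmac("sha256", P, B, 1, dklen)

print(scrypt_p1(b"password", b"NaCl", 16, 2, 32).hex())
```

With small parameters such as N=16, r=2, the result can be compared against `hashlib.scrypt(b"password", salt=b"NaCl", n=16, r=2, p=1, dklen=32)`, assuming your Python build provides that function.
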
osgx
4

The Salsa20 core is not a collision-resistant hash function; see DJB's own webpage:

http://cr.yp.to/salsa20.html

For example, Salsa20core(x) = Salsa20core(x + c) for c = "0000008000000080..." (the top bit set in every little-endian 32-bit word), which gives trivial collisions.

To be concrete, try computing Salsa20core for the following two inputs:

00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000 00000000000000000000000000000000

and

00000080000000800000008000000080 00000080000000800000008000000080 00000080000000800000008000000080 00000080000000800000008000000080

the output for both inputs should be all zeros.
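
This can be checked with a short Python sketch of the Salsa20 core. The function and variable names are my own; `rounds=20` gives the full Salsa20 core and `rounds=8` the Salsa20/8 variant used in scrypt. The reason it works: adding 0x80000000 to two words leaves their sum unchanged modulo 2^32, so the top-bit difference passes through every round untouched and then cancels in the final addition.

```python
import struct

MASK = 0xFFFFFFFF

# Word indices for the four column quarter-rounds, then the four row ones.
QUARTERS = (
    (0, 4, 8, 12), (5, 9, 13, 1), (10, 14, 2, 6), (15, 3, 7, 11),
    (0, 1, 2, 3), (5, 6, 7, 4), (10, 11, 8, 9), (15, 12, 13, 14),
)

def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) & MASK) | (x >> (32 - n))

def salsa20_core(block, rounds=20):
    """Salsa20 core on a 64-byte block, with the final feed-forward addition."""
    x = list(struct.unpack("<16I", block))
    z = list(x)
    for _ in range(rounds // 2):
        for a, b, c, d in QUARTERS:
            z[b] ^= rotl((z[a] + z[d]) & MASK, 7)
            z[c] ^= rotl((z[b] + z[a]) & MASK, 9)
            z[d] ^= rotl((z[c] + z[b]) & MASK, 13)
            z[a] ^= rotl((z[d] + z[c]) & MASK, 18)
    return struct.pack("<16I", *((xi + zi) & MASK for xi, zi in zip(x, z)))

zeros = bytes(64)
top_bits = bytes.fromhex("00000080" * 16)   # 0x80000000 in every word

print(salsa20_core(zeros) == bytes(64))      # True
print(salsa20_core(top_bits) == bytes(64))   # True

# The same collision for an arbitrary input: flip the top bit of every word.
x = bytes(range(64))
x_flipped = bytes(p ^ q for p, q in zip(x, top_bits))
print(salsa20_core(x, rounds=8) == salsa20_core(x_flipped, rounds=8))  # True
```
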

In what way do you think this property weakens scrypt?

3

Yes, the Salsa20 core is not meant to be collision resistant. But that is not relevant to the intended use case of scrypt: password hashing.

Password hashing is an unfortunate name, as "hashing" has so many specific meanings depending on the context. Two scenarios where you use password hashing are:

  • Password storage for online services. Imagine your users log in to your site using passwords and you have to store some information per user to check that the supplied password is correct. On the other hand, if that stored information is stolen (SQL injections etc.) you don't want the passwords to be recoverable.
  • Key derivation for symmetric encryption. TrueCrypt and other full disk encryption software encrypt your data using a key that is derived from the password you type in at boot time. Usually that password is not as good as a 128-bit random key would be. For example, if you use 5 alphanumeric characters you might only have 36^5=60,466,176 possible passwords. If the time it takes to check whether one password is correct is small, brute-forcing becomes quite feasible.

What both have in common is that the input data has low entropy, so you want computing the hash of a password to take a guaranteed, relatively long time. This makes brute-forcing the small number of likely passwords difficult.
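
To put rough numbers on this, here is a small Python sketch using the 36^5 keyspace from the example above. The per-guess costs are hypothetical values picked only to show the scale of the effect, and the helper name is made up for this illustration:

```python
def exhaustive_search_seconds(alphabet_size, length, seconds_per_guess):
    """Worst-case time to try every password of the given length."""
    return alphabet_size ** length * seconds_per_guess

keyspace = 36 ** 5  # 5 characters drawn from 26 letters + 10 digits

# Hypothetical per-guess costs (assumptions, not measurements).
fast = exhaustive_search_seconds(36, 5, 1e-6)  # ~1 microsecond per guess
slow = exhaustive_search_seconds(36, 5, 0.1)   # ~100 ms per guess

print(keyspace)                    # 60466176
print(f"{fast:.0f} s")             # 60 s
print(f"{slow / 86400:.0f} days")  # 70 days
```

The keyspace is the same in both cases; only the cost per guess changes, which is exactly the knob a password hash like scrypt is meant to turn.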

Of the classical security criteria for a cryptographic hash function (collision resistance, pre-image resistance and second pre-image resistance), only pre-image resistance is thus of interest for password hashing: in the password storage scenario, you don't want an attacker to be able to compute valid login data from the password hash.

Perseids
2

IMHO it's just a warning to the reader that this is not a standard hash-based design like BSD-crypt or PBKDF2, which are the traditional choices. They use the Salsa20/8 Core mixing function because it is faster than the first mixing function used in the paper that defines scrypt (which is referenced in the RFC you linked to). There the author uses the SHA-1 compress function (not SHA-1 as a hash), which is a function that transforms 20 bytes to 20 bytes (it's really a block cipher, using the message blocks as a key). In the same paper he then suggests Salsa20/8 as a mixing function.

Henno Brandsma