I'm having a hard time implementing Osvik's S-boxes from the paper "Speeding up Serpent". All the S-boxes are listed at the end of the paper, so I simply implemented them as given. Here is my implementation of $S_0$ as an example:

// Bit-sliced S0: X[0..3] correspond to registers r0..r3 in the paper,
// and X4 is the scratch register r4.
UInt32Vector Serpent::S0(const UInt32Vector &Y)
{
   UInt32Vector X = Y;
   X[3] ^= X[0]; uint32_t X4 = X[1];
   X[1] &= X[3]; X4 ^= X[2];
   X[1] ^= X[0]; X[0] |= X[3];
   X[0] ^= X4;   X4 ^= X[3];
   X[3] ^= X[2]; X[2] |= X[1];
   X[2] ^= X4;   X4 = ~X4;
   X4 |= X[1];   X[1] ^= X[3];
   X[1] ^= X4;   X[3] |= X[0];
   X[1] ^= X[3]; X4 ^= X[3];

   // Output register order for S0: r1, r4, r2, r0.
   return {X[1], X4, X[2], X[0]};
}

As you can see, this is exactly the $S_0$ from the paper. I should point out that I checked my S-boxes three times to be sure they are the same as the ones in the paper. Here is how I use $S_i$ for $i=0,\ldots,7$:

subkeys.push_back(S[k & 7]({W[j], W[j+1], W[j+2], W[j+3]}));

where $k$ starts at 3 and decreases by 1 at each step, as stated in the algorithm specification.
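For reference, here is a minimal sketch of the S-box order this rule produces, assuming the group of four prekey words is numbered $i = 0, 1, \ldots, 32$. The helper name `sboxIndexForGroup` is hypothetical and is not part of my class or of Osvik's code.

```cpp
#include <cstdint>

// Hypothetical helper: index of the S-box applied to the i-th group of
// four prekey words. It starts at 3 and decreases by 1 per group,
// wrapping around modulo 8: 3, 2, 1, 0, 7, 6, 5, 4, 3, ...
constexpr std::uint32_t sboxIndexForGroup(std::uint32_t i)
{
    // Unsigned subtraction wraps modulo 2^32; since 2^32 is a multiple
    // of 8, masking with 7 still gives (3 - i) mod 8.
    return (std::uint32_t{3} - i) & std::uint32_t{7};
}
```

With this numbering, groups 0 and 32 both use $S_3$, and group 4 is the first one to use $S_7$.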

What confuses me is that Osvik's implementation of his S-boxes seems to differ from the ones described in his paper. Moreover, in his key schedule, $i$ starts at 3 and increases instead of decreasing.

Here are my questions:

  1. Where can I find test vectors for my S-boxes? I found one for the key schedule on Floppy 4 (ecb_iv.txt) of the full submission package, but nothing for the S-boxes.

  2. Why does his S-box implementation differ from the S-boxes in his paper?

  3. Are my $S_0$ implementation and its usage correct as given, or did I miss something important?

Thanks a lot for your help.

Gabriel L.

1 Answer

To answer your questions in order:

  1. You won't find test vectors for the s-boxes in the submission. The s-box functions are implementation-specific optimisations, especially bit-sliced s-box functions such as Osvik's and Gladman/Simpson's, which compute multiple s-box lookups in parallel.
    If you need to test your s-box implementations, I would take the s-box functions from the implementation you're trying to replicate and run them in isolation to generate test vectors yourself.
  2. The difference between the s-boxes in the paper and those in the C source file is explained on the homepage that hosts both resources: "...Since then I have made further algorithm improvements, and results optimized for 3-way parallel execution are used in my implementation...".
    It looks like Osvik did further searching and testing after publishing the initial functions and came up with better s-box functions (probably for the x86/x64 architecture).
  3. I can't see anything obviously wrong with your approach, but I would recommend picking one of the implementations to compare against and generating some test vectors directly from its s-box functions to be certain. This is the approach I was planning to use to integrate the Osvik s-box functions into a library that currently uses the Gladman/Simpson ones. I'm glad you pointed out the inconsistency between the paper and the source.
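
As a rough sketch of what such a check could look like, the code below runs the $S_0$ instruction sequence from the question on all 16 possible 4-bit inputs at once and compares each result with the $S_0$ lookup table from the Serpent specification. The names `kS0Table`, `s0Bitsliced` and `countS0Mismatches` are hypothetical, and the bit ordering is an assumption: word 0 carries the least significant bit of each 4-bit value, and the returned words are read in the order (r1, r4, r2, r0).

```cpp
#include <array>
#include <cstdint>

// Serpent S0 as a 4-bit lookup table, as listed in the Serpent specification.
constexpr std::array<std::uint8_t, 16> kS0Table = {
    3, 8, 15, 1, 10, 6, 5, 11, 14, 13, 4, 2, 7, 0, 9, 12};

// Bit-sliced S0, using the same instruction sequence as in the question.
std::array<std::uint32_t, 4> s0Bitsliced(const std::array<std::uint32_t, 4> &x)
{
    std::uint32_t r0 = x[0], r1 = x[1], r2 = x[2], r3 = x[3], r4;
    r3 ^= r0; r4 = r1;
    r1 &= r3; r4 ^= r2;
    r1 ^= r0; r0 |= r3;
    r0 ^= r4; r4 ^= r3;
    r3 ^= r2; r2 |= r1;
    r2 ^= r4; r4 = ~r4;
    r4 |= r1; r1 ^= r3;
    r1 ^= r4; r3 |= r0;
    r1 ^= r3; r4 ^= r3;
    return {r1, r4, r2, r0};
}

// Counts the 4-bit inputs for which the bit-sliced result disagrees
// with the lookup table. Zero means the two agree on all 16 inputs.
int countS0Mismatches()
{
    // Pack all 16 inputs in parallel: bit j of word i holds bit i of value j
    // (assumed convention: word 0 is the least significant bit).
    std::array<std::uint32_t, 4> in{};
    for (unsigned j = 0; j < 16; ++j)
        for (unsigned i = 0; i < 4; ++i)
            in[i] |= static_cast<std::uint32_t>((j >> i) & 1u) << j;

    const std::array<std::uint32_t, 4> out = s0Bitsliced(in);

    int mismatches = 0;
    for (unsigned j = 0; j < 16; ++j) {
        // Reassemble the 4-bit output for input j from bit j of each word.
        unsigned y = 0;
        for (unsigned i = 0; i < 4; ++i)
            y |= ((out[i] >> j) & 1u) << i;
        if (y != kS0Table[j])
            ++mismatches;
    }
    return mismatches;
}
```

Under the stated bit ordering, `countS0Mismatches()` should return 0 for the sequence in the question. The same harness could be pointed at any other bit-sliced s-box by swapping in its table and instruction sequence.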
archie