12

Modern computers (which crypto programs are usually run on) have a 64-bit multiply, and it only takes one cycle. It's pretty decent mixing at next to no cost.

For block ciphers:

Multiplication by a constant is nonlinear (when combined with other mixing operations), provides diffusion in one direction (easily amended with a rotate), is bijective (necessarily), and is as fast in both directions (with precomputed inverse).
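To make the "bijective, with precomputed inverse" point concrete, here is a minimal sketch. It assumes 64-bit words (emulated in Python by masking), an arbitrary odd constant `K`, and a rotation by 32; the names `mix`, `unmix` and `inverse_mod_2_64` are made up for illustration and do not come from any real cipher.

```python
MASK64 = (1 << 64) - 1  # emulate a 64-bit register

def rotl64(x, r):
    """Rotate a 64-bit word left by r bits (0 <= r < 64)."""
    return ((x << r) | (x >> (64 - r))) & MASK64

def rotr64(x, r):
    """Rotate a 64-bit word right by r bits (0 <= r < 64)."""
    return ((x >> r) | (x << (64 - r))) & MASK64

def inverse_mod_2_64(a):
    """Multiplicative inverse of an odd a modulo 2**64 (Newton iteration)."""
    if a & 1 == 0:
        raise ValueError("only odd constants are invertible modulo 2**64")
    x = a  # a*a == 1 (mod 8) for odd a, so x is already correct to 3 bits
    for _ in range(5):  # correct bits double: 3 -> 6 -> 12 -> 24 -> 48 -> 96
        x = (x * (2 - a * x)) & MASK64
    return x

K = 0x9E3779B97F4A7C15       # assumption: any odd 64-bit constant would do
K_INV = inverse_mod_2_64(K)  # precomputed once

def mix(x):
    """Multiply by K, then rotate so the well-mixed high bits move down."""
    return rotl64((x * K) & MASK64, 32)

def unmix(y):
    """Exact inverse of mix: undo the rotate, multiply by the inverse of K."""
    return (rotr64(y, 32) * K_INV) & MASK64
```

Both directions cost one multiply and one rotate, which is the symmetry the paragraph above refers to.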

For hashes:

With no need to efficiently reverse the permutation, it is safe to use polynomials (ones that aren't linear). They provide even better mixing and can mix more than one value at a time. If bijectivity is not important, the full 128-bit multiply result can be used, for example by xor-ing the low and high words. Otherwise, rotates should be in there somewhere so the high bits can affect the low bits. It is somewhat costly, but still less costly than what would provide equivalently good mixing with only "lower level" operations. Simple example: a^=(b&~c)<<<7 can be replaced with a+=(b*~c)<<<7, which imposes no extra cost and mixes better most of the time.
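As a rough sketch of the two ideas in this paragraph, assuming 64-bit words emulated in Python: `fold_mul` xors the low and high halves of the full 128-bit product, and the two `step_*` functions spell out the `a^=(b&~c)<<<7` versus `a+=(b*~c)<<<7` comparison. The function names are invented for illustration.

```python
MASK64 = (1 << 64) - 1  # emulate a 64-bit register

def rotl64(x, r):
    """Rotate a 64-bit word left by r bits (0 <= r < 64)."""
    return ((x << r) | (x >> (64 - r))) & MASK64

def fold_mul(a, b):
    """XOR of the low and high 64-bit halves of the 128-bit product a*b.

    Not a bijection: for example, any input paired with 0 maps to 0.
    """
    p = (a & MASK64) * (b & MASK64)  # full 128-bit product
    return (p & MASK64) ^ (p >> 64)

def step_bitwise(a, b, c):
    """a ^= (b & ~c) <<< 7, using only bitwise operations."""
    return a ^ rotl64(b & ~c & MASK64, 7)

def step_multiply(a, b, c):
    """a += (b * ~c) <<< 7, the multiply-based replacement."""
    return (a + rotl64((b * (~c & MASK64)) & MASK64, 7)) & MASK64
```

Whether the multiply-based step is really free depends on the target CPU, so treat the "no extra cost" part as something to measure rather than assume.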

I'm sure there are other potential uses for multiply, but those are the first two I came up with.

So, why is multiplication uncommon in cryptographic primitives? Sure, AES and SHA3 are designed to be hardware friendly, but what about all the others? Especially the ones designed for software implementations.


RC6 uses a multiply, though not for the usual purposes, "usual" meaning "like every other block cipher".

Mike Edward Moras
EPICI

2 Answers

18

There are at least five reasons why multiplication is not used more often in symmetric ciphers and hashes:

  1. For use as a mixer, multiplication requires more hardware/energy/time than other hardware constructs of comparable cryptographic interest. This is an argument mostly for hardware implementations, but many ciphers (and a few hashes) are designed so as to be fast in hardware (that was the explicit implementation target for DES, and an explicit target for AES, which is often implemented in hardware nowadays).
  2. It is far from universal (and used to be uncommon) that wide multiplication is single-cycle, and the number of cycles listed might not be the end of the story: often there is latency, and that costs extra cycles if the result is used in the next few instructions (for x64, otus pointed to this source; the same applies to some ARM cores).

    Worse, there is often a timing dependency on one operand (opening an avenue for timing attacks), and no way to control which operand from the comfort of a high-level language (other than by forcing some high-order bits in both operands). For example, the common 32-bit ARM Cortex M3 has UMULL (32x32->64-bit result) documented as requiring 3 to 5 cycles, with "early termination depending on the size of the source values".

  3. Multiplication only does bit mixing to the left: if $C=A\;B$ then bit $C_j$ is independent of bits $A_i$ and $B_i$ for any $i>j$. Fixing this requires computing the product with more precision than its operands (and some post-processing), which might not be fast/easy in a high-level language.

  4. Multiplication by a constant is linear (over the integers modulo $2^k$), and ciphers require non-linearity. While some derivatives of multiplication (like squaring) are not linear, multiplication as the sole source of non-linearity could open the door to algebraic attacks.
  5. As pointed out in a comment, when truncating the result of a multiplication (effectively working modulo some power of two), $F_A: B\mapsto F_A(B)=A\;B\bmod 2^k$ is a bijection only for odd $A$, and can lose a lot of entropy for other $A$, which is undesirable in a normal mixer.
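Points 3 and 5 are easy to check experimentally. The following is a small sketch with made-up helper names (`changed_bits`, `image_size`) and arbitrary test values; it uses a 32-bit truncated product for point 3 and an 8-bit one for point 5 so that the exhaustive count stays tiny.

```python
def changed_bits(a, b, i, k=32):
    """Bits of a*b mod 2**k that change when bit i of b is flipped."""
    mask = (1 << k) - 1
    return ((a * b) & mask) ^ ((a * (b ^ (1 << i))) & mask)

def image_size(a, k=8):
    """Number of distinct values of a*b mod 2**k over all k-bit b."""
    return len({(a * b) % (1 << k) for b in range(1 << k)})

A, B = 0x9E3779B1, 0x12345678  # arbitrary values, A odd

# Point 3: flipping bit i of B never changes product bits below position i.
for i in range(32):
    assert changed_bits(A, B, i) & ((1 << i) - 1) == 0

# Extreme case: the top input bit influences only the top output bit.
assert changed_bits(A, B, 31) == 1 << 31

# Point 5: an odd multiplier permutes the 256 byte values, an even one
# collapses them onto fewer outputs.
assert image_size(3) == 256
assert image_size(2) == 128
assert image_size(16) == 16
```

The last two assertions show the entropy loss: multiplying by 16 modulo 256 leaves only 16 possible outputs.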

The only widely used cipher I can name that heavily uses multiplication other than by a constant is IDEA, where it is followed by reduction modulo the prime $2^{16}+1$, in order to fix issues 3 and 5.
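For illustration, IDEA-style multiplication can be sketched as below, using the usual convention that the 16-bit word 0 stands for $2^{16}$. The function name `idea_mul` is mine, and this is a plain, non-constant-time sketch, not production code.

```python
def idea_mul(a, b):
    """Multiply two 16-bit words modulo 2**16 + 1 (0 represents 2**16)."""
    a = a or 0x10000  # map the word 0 to 2**16
    b = b or 0x10000
    # The result lies in 1..2**16; masking maps 2**16 back to the word 0.
    return ((a * b) % 0x10001) & 0xFFFF

# Issue 5: even with an even multiplier, the map is still a bijection.
assert len({idea_mul(2, b) for b in range(1 << 16)}) == 1 << 16

# Issue 3: the top bit of an operand now influences low output bits.
assert idea_mul(3, 0x8000) == 0x7FFF
```

Because $2^{16}+1$ is prime, every operand (with 0 read as $2^{16}$) is invertible, which is why the bijection holds for every multiplier and not just odd ones.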

Multiplication is in wide use in asymmetric cryptography: RSA, DSA, elliptic curve cryptography over a prime field.

fgrieu
1

Simply put, I think: however fast multiplication is, simple bitwise operations are even faster, and can sometimes be executed out of order and in parallel by multiple execution units in the CPU. Moreover, a major criterion in any algorithm design is the ability to implement it in hardware. Simplicity and the ability to parallelise operations, thus lowering the cycles-per-byte figure, are key here.