
It is commonly understood that CRC satisfies the linear identity with respect to the $\oplus$ (XOR) operation:

$\operatorname{CRC}(a) \oplus \operatorname{CRC}(b) = \operatorname{CRC}(a \oplus b)$

But after some experimentation and research it appears that this is not generally true.

The particular algorithm in question is the one used in HDLC, ANSI X3.66, ITU-T V.42, Ethernet, Serial ATA, MPEG-2, PKZIP, Gzip, Bzip2, and PNG (see Wikipedia), which uses the polynomial $\mathtt{0x04C11DB7}$.

In what sense is CRC linear? Is this a misconception?

2 Answers


In practice, CRC operations are often started with a nonzero state. Because of this, the actual equation is usually of the form:

$$crc(a) \oplus crc(b) = crc( a \oplus b ) \oplus c$$

for some constant $c$ that depends only on the (common) length of $a$ and $b$.

An alternative way of expressing this is that, for any three equal-length bitstrings $a, b, c$, we have:

$$crc(a) \oplus crc(b) \oplus crc(c) = crc( a \oplus b \oplus c ) $$

The technical term for this relationship is affine; in cryptography, we treat it as linear because, for attacks that assume linearity, affine works just as well.
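A minimal sketch of how one might check this affine behaviour, assuming Python's `zlib.crc32` as the CRC-32 implementation (nonzero initial state plus a final XOR). The helper names `xor_bytes` and `affine_constant` are made up for illustration. The constant is taken to be the CRC of an all-zero string of the same length, which follows from setting $a = b$ in the two-string equation above.

```python
import zlib


def xor_bytes(x: bytes, y: bytes) -> bytes:
    """XOR two equal-length byte strings (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return bytes(p ^ q for p, q in zip(x, y))


def affine_constant(n: int) -> int:
    """CRC-32 of n zero bytes; plays the role of c for length-n inputs."""
    return zlib.crc32(bytes(n))


a = b"hello, world"
b = b"HELLO, CRC32"
c = b"third string"  # all three inputs are 12 bytes long

k = affine_constant(len(a))

# Two-string form: crc(a) ^ crc(b) == crc(a ^ b) ^ c
assert zlib.crc32(a) ^ zlib.crc32(b) == zlib.crc32(xor_bytes(a, b)) ^ k

# Three-string form: the length-dependent constant cancels out
abc = xor_bytes(xor_bytes(a, b), c)
assert zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c) == zlib.crc32(abc)

# The naive "linear" identity only holds when the constant happens to be zero
print("constant for this length:", hex(k))
print("naive identity holds:", zlib.crc32(a) ^ zlib.crc32(b) == zlib.crc32(xor_bytes(a, b)))
```

The three-string form works because XORing an odd number of CRCs leaves exactly one copy of the constant on each side.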

poncho

My answer to "how to recalculate a CRC32 on a large byte array" and the comment that follows it may explain this.

The linearity comes from the fact that CRC is a remainder of dividing a high degree polynomial with binary coefficients (=data) by a fixed degree polynomial with binary coefficients (=crc polynomial).

Adding polynomials with binary coefficients is equivalent to an XOR operation (which is obviously linear). So if the data changes, and you know the XOR between the old data and the new data, you can calculate the CRC of the new data from the CRC of the old data, and vice versa.
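To illustrate the remainder view, here is a rough sketch of a "pure" CRC: zero initial state, no final XOR, no bit reflection, with the polynomial $\mathtt{0x04C11DB7}$ from the question. This is deliberately not the full CRC-32 used by zlib or Ethernet, and the name `crc_pure` is invented for this example. Under those assumptions the function is strictly linear, so the CRC of changed data can be derived from the old CRC and the XOR difference alone.

```python
POLY = 0x04C11DB7  # generator polynomial (the implicit x^32 term is omitted)


def crc_pure(data: bytes) -> int:
    """Remainder of data(x) * x^32 divided by POLY over GF(2).

    Assumed simplifications: initial register 0, no final XOR, MSB-first.
    """
    reg = 0
    for byte in data:
        reg ^= byte << 24  # feed the next 8 message bits into the top of the register
        for _ in range(8):
            if reg & 0x80000000:
                # top bit set: shift it out and subtract (XOR) the polynomial
                reg = ((reg << 1) ^ POLY) & 0xFFFFFFFF
            else:
                reg = (reg << 1) & 0xFFFFFFFF
    return reg


def xor_bytes(x: bytes, y: bytes) -> bytes:
    """XOR two equal-length byte strings (hypothetical helper)."""
    return bytes(p ^ q for p, q in zip(x, y))


old = b"balance=0000100;"
new = b"balance=9999999;"
delta = xor_bytes(old, new)  # what changed, expressed as an XOR mask

# Strict linearity: the all-zero message has remainder 0 ...
assert crc_pure(bytes(len(old))) == 0
# ... so the new CRC follows from the old CRC and the CRC of the difference.
assert crc_pure(new) == crc_pure(old) ^ crc_pure(delta)
```

With a nonzero initial state or a final XOR, the same update works after additionally XORing in the CRC of an all-zero string of the same length.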

From a security perspective, this makes CRC an unreliable way to tell whether the data has changed if the data has padding or even some unused bits in the middle. Those bits can easily be adjusted to produce the desired "remainder" polynomial by calculating the CRC contribution of each free-to-be-adjusted bit and then solving the resulting system of simultaneous linear equations (to produce an intentional CRC collision).

This makes producing a CRC collision a trivial problem, and so CRC is unsuitable for detecting malicious changes.
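The procedure described above could be sketched roughly as follows, assuming Python's `zlib.crc32` and exactly 32 contiguous free bits (4 bytes) somewhere in the message. The function name `forge` and the sample messages are hypothetical. The code measures how each free bit changes the CRC, then runs Gaussian elimination over GF(2) to pick the bits to flip.

```python
import zlib


def forge(msg: bytes, offset: int, target: int) -> bytes:
    """Rewrite msg[offset:offset+4] so that zlib.crc32 of the result equals target."""
    n = len(msg)
    if offset < 0 or offset + 4 > n:
        raise ValueError("need 4 free bytes inside the message")
    zero_crc = zlib.crc32(bytes(n))  # affine constant for this length

    # basis maps "highest set bit" -> (CRC effect, bitmask of free bits causing it)
    basis = {}
    for i in range(32):
        probe = bytearray(n)
        probe[offset + i // 8] = 1 << (i % 8)
        vec = zlib.crc32(bytes(probe)) ^ zero_crc  # linear effect of free bit i
        combo = 1 << i
        while vec:  # reduce against the basis built so far
            top = vec.bit_length() - 1
            if top not in basis:
                basis[top] = (vec, combo)
                break
            bvec, bcombo = basis[top]
            vec ^= bvec
            combo ^= bcombo

    # Express the required CRC change as an XOR of basis vectors.
    need = zlib.crc32(msg) ^ target
    flips = 0
    while need:
        top = need.bit_length() - 1
        if top not in basis:
            raise ValueError("target CRC not reachable with these free bits")
        bvec, bcombo = basis[top]
        need ^= bvec
        flips ^= bcombo

    out = bytearray(msg)
    for i in range(32):
        if (flips >> i) & 1:
            out[offset + i // 8] ^= 1 << (i % 8)
    return bytes(out)


original = b"pay alice 100 USD [pad:AAAA]"
tampered = b"pay mallory 9999 USD [pad:AAAA]"
forged = forge(tampered, tampered.index(b"AAAA"), zlib.crc32(original))
assert zlib.crc32(forged) == zlib.crc32(original)
```

Only the four padding bytes differ between `tampered` and `forged`, yet the forged message carries the same CRC-32 as the original. The whole attack costs 34 CRC evaluations and a 32-by-32 elimination.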