1

I was learning the Polynomial Hash function in python, the one used in Rabin Karp Algorithm

This is the implementation I was taught:

def GivenHash(S,p=113,x=10):
  hash = 0
  for i in range(len(S)-1,-1,-1):
    hash = (hash*x + S[i]) % p
  return hash

Here's the (naive) implementation I came up with:

def MyHash(S,p=113,x=10):
  hash = 0
  for i in range(len(S)):
    hash += S[i]*(x**i)
  return hash % p

These equations operate on a list of integers S. I realize that they are equivalent (for eg. they give the same ans 27 on the list S = [1,2,3,4]).

My question is essentially that for 2 integer lists, how do i show that:

(S[0] + (S[1] mod p)*x) mod p = (S[0] + S[1]*x) mod p

I'm new to the concept of modular arithmetic, so I'm sure there's something I'm missing there, but I can't find a way of manipulating the expression using identities to make the left hand side equal to the right.

Similarly, for the rolling hash function, how do i show that:

if h(S[i..i+|P|-1]) = [S[i]*x^0 + S[i+1]*x^1 + .. S[i+|P|-1]*x^(|P|-1)] mod p

then h(S[i+1..i+|P|]) = [ (h(S[i..i+|P|-1]) - S[i]x^0)/x + S[i+|P|]x^|P| ] mod p

Why does taking mod p at the end work instead of having to use it everywhere due to the modulo distributive properties?

(Here h(S[i..i+n]) is the rolling hash of S[i..i+n] and |P| is the length of subarray to be hashed)

1 Answers1

2

It's a Math question, you need to learn modular arithmetic. I will use '%' to denote mod operation, with the same priority as '/':

1. (a+b) % p = (a%p + b%p) % p
2. a%p % p = a%p
3. (a%p * x) % p = (a * x) % p

These rules should be enough to prove your first equation.

Bulat
  • 2,113
  • 1
  • 11
  • 17