I don't understand why the additive term δ is needed in the definition of differential privacy. Moreover, several papers and blog posts say that because of the δ term the mechanism is "broken" (whatever that means).

I would really appreciate some help in understanding the role and the effect of the δ term.

Many thanks in advance!

primef

1 Answer

The $\delta$ term is a relaxation of the $\epsilon$-differential privacy notion. The latter is a strong security notion because it requires an algorithm $\mathcal{A}$ to have very close output distributions on "neighboring" datasets $D_1,D_2$ that differ in a single record. From its formal definition

$\Pr[\mathcal{A}(D_1) \in S] \leq e^{\epsilon} \times \Pr[\mathcal{A}(D_2) \in S]$,

we can see that the two probabilities must stay within a multiplicative factor $e^{\epsilon}$ of each other for every output set $S$, even if the probabilities $\Pr[\mathcal{A}(D_1) \in S]$ and $\Pr[\mathcal{A}(D_2) \in S]$ are negligible. In other words, the requirement has to hold even for very unlikely events. Sometimes we want to relax this definition a little bit to cover more general cases as discussed here, so the $\delta$ term is introduced:

$\Pr[\mathcal{A}(D_1) \in S] \leq e^{\epsilon} \times \Pr[\mathcal{A}(D_2) \in S]+\delta$.

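Informally, this means that the highly unlikely "bad" events, on which the multiplicative bound fails, carry a total probability of at most $\delta$. These "bad" events may break $\epsilon$-differential privacy, which is presumably what is meant by "broken", but their probability is bounded by $\delta$.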
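To make this concrete, here is a minimal sketch of a toy mechanism (my own illustration, not taken from any particular paper) that checks both inequalities exactly. It assumes a dataset consisting of a single bit: with probability $\delta$ the mechanism leaks the bit, and otherwise it answers with randomized response. The names `output_distribution` and `satisfies_dp`, and the choice $e^{\epsilon}=3$, $\delta=1/100$, are hypothetical; `Fraction` is used so the comparisons are exact rather than subject to floating-point rounding.

```python
from fractions import Fraction
from itertools import chain, combinations


def output_distribution(bit, exp_eps, delta):
    """Exact output distribution of a toy mechanism on a one-bit dataset.

    With probability delta the bit is leaked outright (the "bad" event);
    otherwise randomized response reports the true bit with probability
    exp_eps / (1 + exp_eps). exp_eps stands for e^epsilon.
    """
    p_truth = exp_eps / (1 + exp_eps)
    other = 1 - bit
    return {
        ("leak", bit): delta,
        ("leak", other): Fraction(0),
        ("rr", bit): (1 - delta) * p_truth,
        ("rr", other): (1 - delta) * (1 - p_truth),
    }


def satisfies_dp(dist1, dist2, exp_eps, delta):
    """Check Pr[A(D1) in S] <= e^eps * Pr[A(D2) in S] + delta for every
    subset S of the (finite) output space, in both directions."""
    outcomes = list(dist1)
    subsets = chain.from_iterable(
        combinations(outcomes, r) for r in range(len(outcomes) + 1)
    )
    for s in subsets:
        p1 = sum(dist1[o] for o in s)
        p2 = sum(dist2[o] for o in s)
        if p1 > exp_eps * p2 + delta or p2 > exp_eps * p1 + delta:
            return False
    return True


# Hypothetical parameters: e^eps = 3 (eps = ln 3) and delta = 1/100.
exp_eps = Fraction(3)
delta = Fraction(1, 100)
d0 = output_distribution(0, exp_eps, delta)  # neighboring datasets: bit 0 ...
d1 = output_distribution(1, exp_eps, delta)  # ... and bit 1

print(satisfies_dp(d0, d1, exp_eps, delta))             # (eps, delta)-DP holds
print(satisfies_dp(d0, d1, exp_eps, Fraction(0)))       # pure eps-DP fails
print(satisfies_dp(d0, d1, exp_eps, Fraction(1, 200)))  # a smaller delta fails too
```

The pure $\epsilon$ check fails on the set $S$ containing only the leak of bit 0: it has probability $\delta$ under one dataset and probability 0 under the other, so no finite $\epsilon$ can satisfy the multiplicative bound. The additive $\delta$ absorbs exactly that event.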

Shan Chen