For a given classifier h, how is the true error over a distribution D defined? \begin{align*} L_D(h) &= \operatorname*{\mathbb{E}}_{x,y \sim D} \Pr[h(x) \neq y] \\ &= \operatorname*{\mathbb{E}}_{x,y \sim D} \begin{cases} \Pr[y \neq 0|x] & \text{if } h(x) = 0, \\ \Pr[y \neq 1|x] & \text{if } h(x) = 1. \end{cases} \end{align*}

I saw these two formulae in the question "Showing that Bayes classifier is optimal". Are these two equivalent?

Shiv Tavker

1 Answer

The first line should read $$ \Pr_{x,y \sim D}[h(x) \neq y]. $$ (This is assuming that the classifier is deterministic.)

We can sample $x,y \sim D$ in two steps: first sample $x \sim D'$, and then sample $y \sim C_x$, where $C_x$ is a distribution depending on $x$.

The second line expresses the idea $$ \operatorname*{\mathbb{E}}_{x \sim D'} \Pr_{y \sim C_x}[y \neq h(x)] $$ in an elaborate way.

Both of these are the same, and equal to $$ \operatorname*{\mathbb{E}}_{x \sim D'} \operatorname*{\mathbb{E}}_{y \sim C_x} 1_{h(x) \neq y}. $$
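As a sanity check, the equality can be verified exactly on a small finite example. The sketch below is only an illustration: the joint distribution `joint` and the classifier `h` are made-up assumptions, not anything from the question. It computes the error once directly from the joint distribution, and once by first taking the marginal of $x$ (playing the role of $D'$) and then the conditional error under $C_x$.

```python
from fractions import Fraction as F

# Hypothetical joint distribution D over pairs (x, y); values chosen for illustration.
joint = {
    (0, 0): F(1, 10), (0, 1): F(2, 10),
    (1, 0): F(3, 10), (1, 1): F(1, 10),
    (2, 0): F(1, 10), (2, 1): F(2, 10),
}

def h(x):
    # Hypothetical deterministic classifier.
    return 1 if x == 0 else 0

def error_joint(joint, h):
    # Pr_{(x,y) ~ D}[h(x) != y], summed directly over the joint distribution.
    return sum(p for (x, y), p in joint.items() if h(x) != y)

def error_two_step(joint, h):
    # E_{x ~ D'} Pr_{y ~ C_x}[y != h(x)], using the marginal D' and conditionals C_x.
    marginal = {}
    for (x, _), p in joint.items():
        marginal[x] = marginal.get(x, 0) + p
    total = 0
    for x, px in marginal.items():
        # Conditional probability of misclassification given this x.
        cond_err = sum(p / px for (x2, y), p in joint.items()
                       if x2 == x and y != h(x))
        total += px * cond_err
    return total

print(error_joint(joint, h), error_two_step(joint, h))  # prints "2/5 2/5"
```

Exact rational arithmetic is used so that the two quantities can be compared with `==` rather than up to floating-point tolerance.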

Yuval Filmus