21

While studying discriminant functions for linear classification, I encountered the following:

.. if $\textbf{x}$ is a point on the decision surface, then $y(\textbf{x}) = 0$, and so the normal distance from the origin to the decision surface is given by:

$$
\frac{\textbf{w}^T \textbf{x}}{\lVert \textbf{w} \rVert} = -\frac{w_0}{\lVert \textbf{w} \rVert} \tag 1
$$

where $\textbf{w}$ is a weight vector and $w_0$ is a bias. In an attempt to derive the above formula I tried the following:

\begin{align*} & \textbf{w}^T \textbf{x} + w_0 = 0 \tag 2\\ & \textbf{w}^T \textbf{x} = -w_0 \tag 3 \end{align*}

After which I am basically stuck. I think that the author gets from equation $(3)$ to equation $(1)$ by normalising. But isn't calculating the normal (perpendicular) distance quite separate from normalising a vector? Secondly, how does equation $(1)$ translate into the normal distance being $-\frac{w_0}{\lVert \textbf{w} \rVert}$, i.e. how is the quantity $\frac{\textbf{w}^T \textbf{x}}{\lVert \textbf{w} \rVert}$ the normal distance?

grayQuant
  • 2,717
BitRiver
  • 413

4 Answers

15

I encountered the same confusion - it's one of the few places where Bishop is unclear. I derived the distance from the origin to the hyperplane in a different way. Since we know that $w$ is orthogonal to the hyperplane, the point $x'$ on the hyperplane that is closest to the origin can be written as $x'=\alpha w$ for some scalar $\alpha$. Then, since $x'$ is on the hyperplane, we know that $w^T x' + w_0=0 \Rightarrow \alpha w^Tw+w_0=0 \Rightarrow \alpha=\frac{-w_0}{||w||^2}$. The distance from $x'$ to the origin is then $||x'||=||\alpha w||=\alpha\,||w||=\frac{-w_0}{||w||^2}||w||=\frac{-w_0}{||w||}$. This assumes that $w_0$ is negative (so that $\alpha > 0$), but if you want signed distances, you can modify things to fit your convention.
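As a quick numerical sanity check of this derivation (the values of $w$ and $w_0$ below are made up; any choice defining a hyperplane works):

```python
import numpy as np

# Hypothetical example: w^T x + w_0 = 0 with ||w|| = 5 and a negative bias,
# matching the sign convention assumed in the answer above.
w = np.array([3.0, 4.0])
w0 = -10.0

alpha = -w0 / np.dot(w, w)   # solves alpha * w^T w + w_0 = 0
x_closest = alpha * w        # closest point on the hyperplane to the origin

# x_closest lies on the hyperplane ...
print(np.dot(w, x_closest) + w0)        # ~0.0
# ... and its norm matches -w_0 / ||w||
print(np.linalg.norm(x_closest))        # 2.0
print(-w0 / np.linalg.norm(w))          # 2.0
```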

Leland Stirner
  • 755
  • 1
  • 6
  • 16
5

There is a simple proof which I think is what C. Bishop was hinting at. We have already established that the weight vector $\vec{w}$ is orthogonal to the decision boundary. Now take a vector from the origin to a point $x$ on the boundary (let's call that vector $\vec{x}$). The projection of $\vec{x}$ onto $\vec{w}$ has magnitude equal to the orthogonal distance to the decision boundary. This projection, which we write $proj_{\vec{w}} \vec{x}$, is given by $\frac{\vec{w} \cdot \vec{x}}{\|\vec{w}\|^2} \vec{w}$, so $$ \|proj_{\vec{w}} \vec{x}\| =\frac{\vec{w} \cdot \vec{x}}{\|\vec{w}\|} $$ (see https://en.wikibooks.org/wiki/Linear_Algebra/Orthogonal_Projection_Onto_a_Line). Since $x$ is on the boundary, $\vec{w} \cdot \vec{x} + w_0 =0$, so in the end the orthogonal distance is $$r = \frac{\vec{w} \cdot \vec{x}}{\|\vec{w}\|} = -\frac{w_0}{\|\vec{w}\|}$$
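A small numerical check of the projection argument (the random weight vector and bias below are arbitrary placeholders): for *any* point $x$ on the boundary, the signed length of the projection of $\vec{x}$ onto $\vec{w}$ comes out to $-w_0/\|\vec{w}\|$.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)   # arbitrary weight vector
w0 = 1.5                 # arbitrary bias

# Build a point x on the boundary w . x + w0 = 0: start from a random
# point and shift it along w until the constraint holds.
x = rng.normal(size=3)
x -= (np.dot(w, x) + w0) / np.dot(w, w) * w

# Signed length of the projection of x onto w ...
proj_len = np.dot(w, x) / np.linalg.norm(w)
# ... equals -w0 / ||w||, independent of which boundary point was chosen.
print(np.isclose(proj_len, -w0 / np.linalg.norm(w)))   # True
```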

[Figure: projection of the vector from the origin to a point on the decision boundary onto the weight vector]

MrHat
  • 303
1

Main principle: to find the distance from a line to a point, we simply project the point onto a vector perpendicular to the line and find the length of this projection.

Look at the picture (here): we need to find the distance between line1 and line2. To do it, select a point $x$ on line1 and find the length of the projection of the vector $\bar{x}$ onto the vector $\bar{w}$, which is $||proj_{\bar{w}}\bar{x}|| = ||\bar{x}|| \cos(\bar{x},\bar{w}) = ||\bar{x}||\cdot\frac{\bar{w}\cdot\bar{x}}{||\bar{w}||\,||\bar{x}||} = \frac{\bar{w}\cdot\bar{x}}{||\bar{w}||}$

The numerator is the same as $w^Tx$, so the distance is $\frac{w^Tx}{||w||}$. In Bishop's book, $\frac{w^Tx}{||w||} = -\frac{w_0}{||w||}$ is not an independent identity; it is the result of a short derivation.

We know that our point $x$ lies on the decision line, so the equality $w^Tx + w_0 = 0$ holds, or $w^Tx = -w_0$. Just substitute this into the distance formula $\frac{w^Tx}{||w||}$ to get $\frac{-w_0}{||w||}$.

  • "if we want to find distance from line to point"- I think this needs to be fixed. Here we are actually looking for the distance from the origin to the line so the point would be zero. – Undertherainbow Feb 27 '19 at 07:03
  • Thanks for the explanation. Can anybody say how we get $\cos(\bar{x},\bar{w}) = \frac{\bar{w}\cdot\bar{x}}{||\bar{w}||\,||\bar{x}||}$? – MItrajyoti Jul 30 '22 at 07:41
0

The distance from any point $y$ to a plane, given a normal vector $w$ to the plane and any point $x$ on the plane, is $d=\left|\frac{w^T(y-x)}{\|w\|}\right|$

So in your case, $y$ is the origin, hence $y-x=-x$. Therefore, here, $d=|\frac{w^Tx}{\|w\|}|$.
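This general formula is easy to verify numerically (the plane below is a made-up example): with $y$ at the origin, $d$ reduces to $|w_0|/\|w\|$.

```python
import numpy as np

w = np.array([1.0, 2.0, 2.0])   # normal to the plane, ||w|| = 3 (example values)
w0 = 6.0
x = np.array([-6.0, 0.0, 0.0])  # lies on the plane: w^T x + w0 = -6 + 6 = 0
y = np.zeros(3)                 # the origin

d = abs(np.dot(w, y - x)) / np.linalg.norm(w)
print(d)                        # 2.0, which equals |w0| / ||w||
```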

Sam
  • 78