I have a conceptual question about the Delta Rule:
$\Delta w_i = (y - \hat{y}) \times x_i$
Why does the difference have to be multiplied by $x_i$ again? If the input is $0$, the product of $w_i$ and $x_i$ is $0$ anyway, so it should not matter whether that weight changes when its input is $0$.
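To make sure we mean the same rule, here is a minimal sketch of the update as I understand it (plain Python, step activation, no learning rate; the helper names and the threshold at $0$ are my own assumptions):

```python
def step(z):
    # Step activation; I assume the threshold is at 0
    return 1 if z > 0 else 0

def delta_rule_update(w, x, y):
    # y_hat is the actual output of the neuron for this learning example
    y_hat = step(sum(wi * xi for wi, xi in zip(w, x)))
    # delta_w_i = (y - y_hat) * x_i, added to every weight
    return [wi + (y - y_hat) * xi for wi, xi in zip(w, x)]
```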
Let us take the following example:
- $w_i$ = the $i$-th weight of the weight vector
- $\Delta w_i$ = the change in weight $w_i$
- $y$ = the desired output of the neuron for the learning example
- $\hat{y}$ = the actual, calculated output for the learning example
- $x_i$ = the $i$-th input of the learning example
Step $t$:
- Input $x = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}$
- Random weights = $\begin{bmatrix} 0.1 & 0.1 & 0.1 \end{bmatrix}$
- Scalar product = $0.1$
- The step function outputs $1$, so $\hat{y} = 1$
But we want the output to be $y = 0$, so the weights have to be adjusted as follows:
Step $t+1$:
- $\Delta w_0 = (0 - 1) \times 1 = -1$
- $\Delta w_1 = (0 - 1) \times 0 = 0$
- $\Delta w_2 = (0 - 1) \times 0 = 0$
$w_i^{new} = w_i^{old} + \Delta w_i$
- New weights: $\begin{bmatrix} -0.9 & 0.1 & 0.1 \end{bmatrix}$
- Scalar product = $-0.9$
- Output $\hat{y} = 0$, which matches the desired $y = 0$
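For concreteness, the same two steps as a small sketch (plain Python; the names are mine, and the step threshold at $0$ is an assumption):

```python
def step(z):
    # Step activation, assumed to output 1 for z > 0 and 0 otherwise
    return 1 if z > 0 else 0

x = [1, 0, 0]          # input
w = [0.1, 0.1, 0.1]    # random initial weights
y = 0                  # desired output

# Step t: scalar product is 0.1, so the step function gives y_hat = 1
y_hat = step(sum(wi * xi for wi, xi in zip(w, x)))

# Step t+1: delta_w_i = (y - y_hat) * x_i, added to each weight
w = [wi + (y - y_hat) * xi for wi, xi in zip(w, x)]   # roughly [-0.9, 0.1, 0.1]

# New scalar product is -0.9, so the output is now 0, as desired
print(w, step(sum(wi * xi for wi, xi in zip(w, x))))
```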
Couldn't you skip the multiplication by $x_i$ entirely? At the latest when the scalar product is computed, a weight that is multiplied by $x_i = 0$ does not contribute anyway. So why multiply by $x_i$ in $\Delta w_i$? A sketch of what I mean follows below.
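In this hypothetical variant (same data as above, my own names), the factor $x_i$ is dropped and every weight gets the full error added, yet the scalar product, and therefore the output, comes out the same, because the weights at positions with $x_i = 0$ never enter the scalar product:

```python
# Hypothetical variant WITHOUT the factor x_i: add the raw error to every weight
x = [1, 0, 0]
w = [0.1, 0.1, 0.1]
y, y_hat = 0, 1

w_no_x = [wi + (y - y_hat) for wi in w]            # roughly [-0.9, -0.9, -0.9]
s = sum(wi * xi for wi, xi in zip(w_no_x, x))      # still -0.9
print(w_no_x, 1 if s > 0 else 0)                   # output is 0, same as before
```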