I'm studying support vector machines, and along the way I've run into Lagrange multipliers with multiple constraints and the Karush–Kuhn–Tucker conditions.
I've been trying to study the subject, but I still can't get a good enough grasp of it. Wikipedia
(http://en.wikipedia.org/wiki/Lagrange_multiplier#Multiple_constraints)
says that in order to find the extremum points of a function $f$ subject to constraints $g_1, \dots, g_m$, we must find a point $\text{x}$ and multipliers $\lambda_1, \dots, \lambda_m$ such that
$$\sum_{i=1}^{m}\lambda_{i}\nabla g_i(\text{x}) = \nabla f(\text{x})$$
I understand Lagrange multipliers when there is only one constraint, but the multi-constraint version is hard to grasp for some reason... :(
Could anyone give me an easy-to-understand explanation of why the equation above holds?
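In case it helps to show how I currently read the condition, here is a minimal numerical sketch on a toy problem I made up (it is not from the Wikipedia article): minimize $f(x,y,z) = x^2 + y^2 + z^2$ subject to $g_1 = x + y + z - 1 = 0$ and $g_2 = x - z - 1 = 0$. The helper names `grad_f`, `multipliers`, and `residual` are my own. The idea is to fit $\lambda_1, \lambda_2$ by least squares at a given point and see how much of $\nabla f$ is left over. If I understand correctly, nothing should be left over at the constrained minimum $(5/6, 1/3, -1/6)$, while at another feasible point such as $(1, 0, 0)$ the leftover should be nonzero.

```python
import math

# Toy problem (an assumption for illustration, not from the source):
#   f(x, y, z) = x^2 + y^2 + z^2
#   g1 = x + y + z - 1 = 0,  g2 = x - z - 1 = 0

def grad_f(p):
    x, y, z = p
    return [2.0 * x, 2.0 * y, 2.0 * z]

def grad_g1(p):
    return [1.0, 1.0, 1.0]   # constant, since g1 is linear

def grad_g2(p):
    return [1.0, 0.0, -1.0]  # constant, since g2 is linear

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def multipliers(p):
    """Least-squares fit of grad f = l1 * grad g1 + l2 * grad g2 at p."""
    a, b, c = grad_g1(p), grad_g2(p), grad_f(p)
    # 2x2 normal equations, solved with Cramer's rule.
    aa, ab, bb = dot(a, a), dot(a, b), dot(b, b)
    ac, bc = dot(a, c), dot(b, c)
    det = aa * bb - ab * ab
    l1 = (ac * bb - ab * bc) / det
    l2 = (aa * bc - ab * ac) / det
    return l1, l2

def residual(p):
    """Norm of grad f minus its best combination of the constraint gradients."""
    l1, l2 = multipliers(p)
    a, b, c = grad_g1(p), grad_g2(p), grad_f(p)
    r = [ci - l1 * ai - l2 * bi for ai, bi, ci in zip(a, b, c)]
    return math.sqrt(dot(r, r))

optimum = (5.0 / 6.0, 1.0 / 3.0, -1.0 / 6.0)  # constrained minimum, worked out by hand
other = (1.0, 0.0, 0.0)                       # feasible, but not the minimum

if __name__ == "__main__":
    print("multipliers at optimum:", multipliers(optimum))
    print("residual at optimum:", residual(optimum))
    print("residual at other feasible point:", residual(other))
```

So numerically the condition does seem to single out the right point; what I am missing is the reason why $\nabla f$ has to be a combination of all the $\nabla g_i$ there.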
Thank you for any guidance :)
P.S.
If it is not too much work, I'd be very grateful if someone could also explain the Karush–Kuhn–Tucker conditions, which generalize my question :) That would be super!