
I am teaching myself artificial intelligence from scratch, without libraries.

I have a decent handle on most of it.

UPDATE-EDIT

However, I am lost on the next mathematical step after differentiating the softmax activation function.

As an example, to hopefully clarify:

Let's call the softmax derivative dSM. If that is the name of the function and i is the index of the value it outputs, then that value is dSM_i.

Treating the case where the index i is equal to k (which I define as the ground-truth vector index) separately, the matrix would look like:

(dSM_i * (1 - dSM_i)) (-dSM_i * dSM_k) (-dSM_i * dSM_k)
(-dSM_i * dSM_k) (dSM_i * (1 - dSM_i)) (-dSM_i * dSM_k)
(-dSM_i * dSM_k) (-dSM_i * dSM_k) (dSM_i * (1 - dSM_i))

But I don't know what to do from there.

How do I go from there to the equation:

derivative of the summed loss w.r.t. the activation
multiplied by
derivative of the activation w.r.t. the input
multiplied by
derivative of the input w.r.t. the weight

Each row of the Jacobian matrix has 3 values, when all I need is 1.

Please, can someone help? Thanks. I can't find anything so far, only explanations of how to get to the point I can already reach.
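
For reference, here is a hedged sketch of how the three values in a Jacobian row are usually reduced to one. The symbols are assumptions, not taken from the question: $s_i$ is the softmax output, $z_j$ the softmax input, $y$ a one-hot ground-truth vector, $L$ a cross-entropy loss, and $z_j = \sum_m w_{jm} x_m$ a plain linear layer. Under those assumptions the chain rule sums over the output index $i$, so each input $z_j$ ends up with a single number:

$$\frac{\partial L}{\partial z_j} = \sum_{i=1}^{3} \frac{\partial L}{\partial s_i}\,\frac{\partial s_i}{\partial z_j}, \qquad \frac{\partial s_i}{\partial z_j} = s_i\,(\delta_{ij} - s_j)$$

$$\frac{\partial L}{\partial s_i} = -\frac{y_i}{s_i} \quad\Longrightarrow\quad \frac{\partial L}{\partial z_j} = \sum_{i=1}^{3} -y_i\,(\delta_{ij} - s_j) = s_j - y_j \quad \text{(using } \textstyle\sum_i y_i = 1\text{)}$$

$$\frac{\partial L}{\partial w_{jm}} = \frac{\partial L}{\partial z_j}\,\frac{\partial z_j}{\partial w_{jm}} = (s_j - y_j)\,x_m$$

With a different loss, only the $\partial L / \partial s_i$ factor changes; the sum over $i$ is still what turns the matrix into one value per input.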

  • Can you edit your question? It is hardly readable. – Dominique Sep 11 '23 at 13:19
  • I'll try to edit it right now. – The Thinkrium Sep 11 '23 at 13:20
  • What about the question can be improved to help you understand it better? – The Thinkrium Sep 11 '23 at 13:21
  • That Jacobian doesn't look right to me. Why is it a $3\times 3$? Does your activation function have only 3 inputs? In any case, it's odd that all of the diagonal entries should be the same; don't we need to swap out the indices on each row? – Eric Nathan Stucky Sep 11 '23 at 16:50
  • Almost everything I have read, and the small amount of deriving I have done by hand, matches:

    if i == k then it's sm_i * (1 - sm_i), and if i != k then it's -sm_i * sm_k.

    There are versions that use the Kronecker delta, but it all amounts to the same thing, and the diagonal is the same because it represents a matrix of different indices,

    where the indices go from 1 to 3.

    – The Thinkrium Sep 11 '23 at 17:12
  • And yes,

    I'm still learning, so the current example only has 3 inputs to the softmax activation.

    – The Thinkrium Sep 11 '23 at 17:19
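
The Kronecker-delta form mentioned in the comments can be checked numerically. The following is a minimal library-free sketch, not a definitive implementation: the function names, the sample input `z`, the one-hot target `y`, and the choice of cross-entropy loss are all assumptions made for illustration. It builds the 3×3 Jacobian with entries `s[i] * (delta_ij - s[j])`, so each diagonal entry uses its own index, and then sums over the output index to get one gradient value per softmax input.

```python
import math

def softmax(z):
    """Numerically stable softmax of a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(s):
    """jac[i][j] = d s_i / d z_j = s_i * (delta_ij - s_j)."""
    n = len(s)
    return [[s[i] * ((1.0 if i == j else 0.0) - s[j]) for j in range(n)]
            for i in range(n)]

def backprop_through_softmax(dL_ds, jac):
    """dL/dz_j = sum_i dL/ds_i * jac[i][j]: one number per input z_j."""
    n = len(dL_ds)
    return [sum(dL_ds[i] * jac[i][j] for i in range(n)) for j in range(n)]

# Hypothetical example values (not from the question).
z = [1.0, 2.0, 3.0]   # softmax inputs
y = [0.0, 0.0, 1.0]   # one-hot ground truth, class index 2

s = softmax(z)
jac = softmax_jacobian(s)

# Assumed loss: cross-entropy L = -sum_i y_i * log(s_i), so dL/ds_i = -y_i / s_i.
dL_ds = [-y[i] / s[i] for i in range(len(s))]
dL_dz = backprop_through_softmax(dL_ds, jac)

for row in jac:
    print(row)
print(dL_dz)
```

Under the cross-entropy assumption, `dL_dz` should agree with `s[j] - y[j]` for every `j`, which is the single value per input that then gets multiplied by the derivative of the input with respect to the weight.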
