Questions tagged [backpropagation]

Backpropagation, or "backward propagation of errors," is an algorithm for the supervised learning of artificial neural networks using gradient descent.

Given an artificial neural network and an error function, the backpropagation method calculates the gradient of the error function with respect to the neural network's weights. It is a generalization of the delta rule for perceptrons to multilayer feedforward neural networks.
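As a rough sketch of how that gradient is computed (using one common notation, with pre-activations $z^{(\ell)} = W^{(\ell)} a^{(\ell-1)} + b^{(\ell)}$, activations $a^{(\ell)} = \sigma(z^{(\ell)})$, and error $E$): the chain rule is applied backwards, layer by layer,
$$\delta^{(L)} = \nabla_{a^{(L)}} E \odot \sigma'\!\left(z^{(L)}\right), \qquad \delta^{(\ell)} = \left(W^{(\ell+1)}\right)^{\!\top} \delta^{(\ell+1)} \odot \sigma'\!\left(z^{(\ell)}\right), \qquad \frac{\partial E}{\partial W^{(\ell)}} = \delta^{(\ell)} \left(a^{(\ell-1)}\right)^{\!\top}.$$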

31 questions
6 votes · 3 answers

How to add the derivative of a matrix to the chain rule?

In machine learning, I'm optimizing a parameter matrix $W$. The loss function is $$L=f(y),$$ where $L$ is a scalar, $y=Wx$, $x\in \mathbb{R}^n$, $y\in \mathbb{R}^m$, and $W$ is an $m\times n$ matrix. In all math textbooks, it is…
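For reference, the standard result being asked about (a sketch, treating $\partial L/\partial y \in \mathbb{R}^m$ as a column vector) is
$$\frac{\partial L}{\partial W} = \frac{\partial L}{\partial y}\, x^{\top} \in \mathbb{R}^{m \times n},$$
since $\partial L/\partial W_{ij} = (\partial L/\partial y_i)\, x_j$; the input $x$ enters the chain rule through this outer product rather than as an extra Jacobian factor.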
3 votes · 0 answers

Estimating the computational complexity of an algorithm with more than one choice

The problem: We have a weighted graph $G=(V, E, W)$ with $|V| = n$, $|E| = n-1$, where $W$ is the set of edge weights. The graph $G$ includes one ring on $n_1 \geq 3$ nodes and $n_2$ isolated nodes, $n_1+ n_2 = n$. We want to connect the isolated nodes to the…
3 votes · 1 answer

Derivative of Mean Square Error Function with respect to output

I'm trying to understand the gradient derivation for the back-propagation algorithm. I'm having trouble computing the explicit derivative of the mean squared error loss function with respect to the output value in a regression setting. I only have…
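For reference, the standard result (a sketch, for a single prediction $\hat{y}$ with target $y$ and the common $\tfrac{1}{2}$ convention):
$$L = \tfrac{1}{2}\left(\hat{y} - y\right)^2 \;\Longrightarrow\; \frac{\partial L}{\partial \hat{y}} = \hat{y} - y;$$
without the $\tfrac{1}{2}$ the derivative is $2(\hat{y} - y)$, and averaging over a batch of $N$ samples just divides by $N$.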
2 votes · 0 answers

What do I do once I have the Jacobian Matrix from Softmax Derivative

I am teaching myself artificial intelligence from scratch, without libraries, and I have a decent handle on most of it. UPDATE/EDIT: I am lost, however, on the next step mathematically after deriving the softmax activation function, as an example, to hopefully…
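A minimal numpy sketch of the usual next step (the question mentions working without libraries; numpy is used here purely for illustration, and the array values are hypothetical): the softmax Jacobian $J$, with $J_{ij} = p_i(\delta_{ij} - p_j)$, is not used on its own, you multiply the upstream gradient $\partial L/\partial \mathbf{a}$ by it (a vector-Jacobian product) to get $\partial L/\partial \mathbf{z}$ for the rest of backpropagation.

```python
import numpy as np

def softmax(z):
    # Shift by the max for numerical stability; this does not change the output.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def softmax_jacobian(p):
    # J[i, j] = p_i * (delta_ij - p_j), built from the softmax output p.
    return np.diag(p) - np.outer(p, p)

# The "next step" is a vector-Jacobian product: chain the upstream gradient
# dL/da (gradient of the loss w.r.t. the softmax output a) through J to get
# the gradient w.r.t. the pre-activations z. J is symmetric, so J.T == J.
z = np.array([1.0, 2.0, 0.5])          # hypothetical pre-activations
p = softmax(z)
dL_da = np.array([0.2, -0.7, 0.5])     # hypothetical upstream gradient dL/da
dL_dz = softmax_jacobian(p) @ dL_da    # feeds backpropagation into earlier layers
print(dL_dz)
```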
2 votes · 2 answers

Partial derivative with respect to a matrix in RNN backpropagation

I have an issue with the following problem. I am trying to derive the gradients with respect to $x_t, h_{t-1}, W_x, W_h$. Here $x_t$ is $N \times D$, $h_t$ is $N \times H$, $W_h$ is an $H \times H$ matrix, and $W_x$ is a $D \times H$ matrix. The function is…
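As a sketch, assuming the truncated function is the usual vanilla-RNN step $h_t = \tanh(x_t W_x + h_{t-1} W_h)$ (an assumption, not necessarily the asker's exact setup) and writing $G = \partial L/\partial h_t$ (an $N \times H$ matrix) and $\tilde{G} = G \odot (1 - h_t^2)$, the dimension-consistent gradients are
$$\frac{\partial L}{\partial W_x} = x_t^{\top}\tilde{G}, \qquad \frac{\partial L}{\partial W_h} = h_{t-1}^{\top}\tilde{G}, \qquad \frac{\partial L}{\partial x_t} = \tilde{G}\, W_x^{\top}, \qquad \frac{\partial L}{\partial h_{t-1}} = \tilde{G}\, W_h^{\top}.$$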
2 votes · 0 answers

How to calculate the upper bound of the gradient of a multi-layer ReLU neural network?

Layers: We shall denote in the following the layer number by the superscript $\ell$. We have $\ell=0$ for the input layer, $\ell=1$ for the first hidden layer, and $\ell=L$ for the output layer. The number of neurons in layer $\ell$ is…
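One common bound, sketched with generic notation since the excerpt is cut off: with ReLU activations each backward step multiplies by $(W^{\ell})^{\top}$ and a 0/1 diagonal mask of operator norm at most $1$, so for a scalar network output $f(x)$
$$\left\lVert \nabla_x f(x) \right\rVert_2 \;\le\; \prod_{\ell=1}^{L} \left\lVert W^{\ell} \right\rVert_2,$$
i.e. the product of the spectral norms of the weight matrices upper-bounds the gradient.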
2 votes · 1 answer

Root finding and automatic differentiation

Consider the equation $z = f (z, x)$. We would like to find $z^{\star}$ for $f$ such that $z^{\star} = f (z^{\star}, x)$. One way to do this problem is through naive iteration: $z^{(k + 1)} = f (z^{(k)}, x)$; stop when $z^{(k + 1)} \approx…
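The standard way to differentiate through such a fixed point without unrolling the iteration is implicit differentiation (a sketch, assuming $I - \partial f/\partial z$ is invertible at the solution): differentiating $z^{\star}(x) = f(z^{\star}(x), x)$ gives
$$\frac{\mathrm{d} z^{\star}}{\mathrm{d} x} = \left( I - \frac{\partial f}{\partial z}\bigg|_{(z^{\star}, x)} \right)^{-1} \frac{\partial f}{\partial x}\bigg|_{(z^{\star}, x)},$$
so only the converged $z^{\star}$ is needed for the backward pass, not every iterate $z^{(k)}$.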
2 votes · 1 answer

Jacobian Matrix of an Elementwise Operation on a Matrix

From ref. 1 it is clear that when you have an elementwise operation on a vector, the Jacobian matrix of the function with respect to its input vector is a diagonal matrix. For an input vector $\textbf{x} = \{x_1, x_2, \dots, x_n\}$ on which an elementwise…
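For the matrix case the same structure carries over (a sketch): if $Y = \phi(X)$ is applied elementwise to $X \in \mathbb{R}^{m \times n}$, then with respect to the flattened inputs the Jacobian is the $mn \times mn$ diagonal matrix
$$\frac{\partial\, \mathrm{vec}(Y)}{\partial\, \mathrm{vec}(X)} = \mathrm{diag}\!\left(\phi'\!\left(\mathrm{vec}(X)\right)\right),$$
which is why the Jacobian is never materialized in practice and the backward pass is just the elementwise product $\partial L/\partial X = \partial L/\partial Y \odot \phi'(X)$.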
2 votes · 0 answers

Matrix Derivation for Neural Network Formula

I am learning some of the inner workings of neural networks, but I have a problem with the matrix derivation for backpropagation. On the assumption that the formula for calculating one node in a neural network, which has been vectorized, is $Z^{[i]} =…
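Assuming the truncated formula is the usual $Z^{[i]} = W^{[i]} A^{[i-1]} + b^{[i]}$ with $A^{[i]} = g\!\left(Z^{[i]}\right)$ and examples stacked as columns (an assumption, sketched here for reference), the matrix derivatives used in backpropagation are
$$\frac{\partial L}{\partial W^{[i]}} = \frac{\partial L}{\partial Z^{[i]}} \left(A^{[i-1]}\right)^{\!\top}, \qquad \frac{\partial L}{\partial A^{[i-1]}} = \left(W^{[i]}\right)^{\!\top} \frac{\partial L}{\partial Z^{[i]}}, \qquad \frac{\partial L}{\partial b^{[i]}} = \sum_{\text{columns}} \frac{\partial L}{\partial Z^{[i]}}.$$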
2 votes · 2 answers

Backpropagate through stochastic node

It's commonly said that in a VAE we use the reparameterization trick because "we can't backpropagate through a stochastic node." It makes sense from the picture, but I found it hard to understand exactly what it means and why. Let's say $X \sim N(\mu, 1)$. And we…
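The usual resolution, sketched for this one-dimensional case: a sample drawn as $X \sim \mathcal{N}(\mu, 1)$ by an opaque sampler has no usable derivative $\partial X/\partial \mu$, but rewriting the same sample as a deterministic function of $\mu$ and independent noise,
$$X = \mu + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, 1), \qquad \frac{\partial X}{\partial \mu} = 1,$$
moves the randomness off the path from the parameter to the loss, so gradients flow through $\mu$ while $\varepsilon$ is treated as a fixed input.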
1 vote · 1 answer

Derivative of the Cross Entropy loss function with the Softmax function

I am currently teaching myself the basics of neural networks and backpropagation, but I do not understand some steps in the derivation of the derivative of the cross-entropy loss function with the softmax activation function. Given the loss…
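For reference, the end result of that derivation (a sketch, assuming a one-hot target $y$, softmax output $p_i = e^{z_i}/\sum_k e^{z_k}$, and loss $L = -\sum_i y_i \log p_i$): chaining the two derivatives collapses to
$$\frac{\partial L}{\partial z_j} = p_j - y_j,$$
which is why the softmax and cross-entropy derivatives are almost always derived together.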
1 vote · 1 answer

Why is the numerator-layout Jacobian transposed in backpropagation calculation?

In the derivation of the backpropagation algorithm in Neural Network Design by Hagan et al., we consider the derivative of the scalar-valued sample loss function $\hat{F}$ with respect to a vector of "sensitivities" $\mathbf{n}^{m}$ in layer $m$ of…
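The short version of the usual answer, sketched in that book's notation with sensitivities $\mathbf{s}^{m} = \partial \hat{F}/\partial \mathbf{n}^{m}$: because $\hat{F}$ is a scalar, its gradient is pulled back through each layer by a vector-Jacobian product, and writing that product with column-vector gradients forces the transpose,
$$\mathbf{s}^{m} = \left( \frac{\partial \mathbf{n}^{m+1}}{\partial \mathbf{n}^{m}} \right)^{\!\top} \mathbf{s}^{m+1}.$$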
1 vote · 1 answer

How to derive the expression for the gradient in BPTT

I have the following problem: I am trying to derive final expressions for the error gradients in a simple recurrent neural network (backpropagation through time, BPTT). The parameters and state-update equations are the following: $\mathbf{x}_t \in R^n,…
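As a sketch, assuming the truncated update is a standard vanilla RNN, $\mathbf{h}_t = \tanh\!\left(W_{hx}\mathbf{x}_t + W_{hh}\mathbf{h}_{t-1}\right)$ with per-step losses $L_t$ (an assumption about the cut-off equations), the BPTT gradient for the recurrent weights sums over time steps and over the paths from each later loss back to step $k$:
$$\frac{\partial L}{\partial W_{hh}} = \sum_{t} \sum_{k \le t} \frac{\partial L_t}{\partial \mathbf{h}_t} \left( \prod_{j=k+1}^{t} \frac{\partial \mathbf{h}_j}{\partial \mathbf{h}_{j-1}} \right) \frac{\partial^{+} \mathbf{h}_k}{\partial W_{hh}},$$
where $\partial^{+}$ denotes the immediate partial derivative that treats $\mathbf{h}_{k-1}$ as a constant.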
1 vote · 1 answer

Deriving backpropagation equations - vectorization (regression)

I have a huge problem trying to derive the backpropagation equations. All the solutions I've found online are not as detailed as I'd like, hence I'm here asking for your help. First of all, sorry for this long preface, but I think it's necessary in order to…
1 vote · 0 answers

What is the derivative of the Softmax function AFTER subtracting the maximum value from each input?

I'm using the Softmax function as the activation function for the last layer of a neural network I am trying to code up. The function takes in a vector of elements, $\vec{z}$, where the length of $\vec{z}$ is $L$. The function returns the…
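For reference (a sketch): subtracting the maximum only improves numerical stability and does not change the function, because the softmax is invariant to adding any constant $c$ to every input,
$$\frac{e^{z_i - c}}{\sum_{k} e^{z_k - c}} = \frac{e^{-c}\, e^{z_i}}{e^{-c} \sum_{k} e^{z_k}} = \frac{e^{z_i}}{\sum_{k} e^{z_k}},$$
and this cancellation holds even when $c = \max_k z_k$ depends on $\vec{z}$, so the Jacobian is the same as for the un-shifted softmax: $\partial p_i/\partial z_j = p_i\left(\delta_{ij} - p_j\right)$.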