
I am trying to implement an LSTM in plain numpy for didactic reasons. I understand how to feed in the data, but not how to produce the output. Suppose I give as input a tensor of dimension (n, b, d), where:

  • n is the length of the sequence
  • b is the batch size (timestamps in my case)
  • d is the number of features for each example

Each example (row) in the dataset is labelled 0 or 1. However, when I feed the data to the LSTM, the result I obtain is the hidden state h_out, whose dimension equals the hidden size of the network. How can I obtain a single number that can be compared to my labels and properly backpropagated? I have read that some people add another dense layer on top of the LSTM, but it's not clear to me what dimensions such a layer and its weight matrix should have.

Alexbrini

1 Answer


What you are getting as the output is the internal LSTM state. To obtain a value comparable to your labels, add a dense layer on top of it. The output dimension of the dense layer should be the number of classes you want to predict:

  1. If the labels are 0 and 1, a single output neuron with a sigmoid activation works (see the sketch below).
  2. If there are 5 label classes, then the output dimension of the dense layer should also be 5, typically followed by a softmax.
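
Here is a minimal numpy sketch of the binary case: a dense head with a weight matrix of shape (hidden_size, 1) applied to the final hidden state, followed by a sigmoid and binary cross-entropy. The names (h_out, W_y, b_y, hidden_size) and the random stand-in data are illustrative, not taken from your code:

    import numpy as np

    rng = np.random.default_rng(0)

    b, hidden_size = 32, 64                             # batch size and LSTM hidden size (example values)
    h_out = rng.standard_normal((b, hidden_size))       # stand-in for the LSTM's final hidden state
    y = rng.integers(0, 2, size=(b, 1)).astype(float)   # 0/1 labels, one per example

    # Dense layer: one output neuron for binary classification.
    W_y = rng.standard_normal((hidden_size, 1)) * 0.01  # weight matrix, shape (hidden_size, 1)
    b_y = np.zeros(1)                                   # bias, shape (1,)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Forward pass: (b, hidden_size) @ (hidden_size, 1) -> (b, 1)
    z = h_out @ W_y + b_y
    p = sigmoid(z)                                      # predicted probability of label 1

    # Binary cross-entropy loss, averaged over the batch.
    eps = 1e-12
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

    # Backward pass. For sigmoid + cross-entropy, dL/dz simplifies to (p - y) / b.
    dz = (p - y) / b                                    # shape (b, 1)
    dW_y = h_out.T @ dz                                 # shape (hidden_size, 1)
    db_y = dz.sum(axis=0)                               # shape (1,)
    dh_out = dz @ W_y.T                                 # shape (b, hidden_size)

    print(loss, dW_y.shape, dh_out.shape)

dh_out is the gradient you then backpropagate through the LSTM's own backward pass. For the 5-class case, W_y would instead have shape (hidden_size, 5), with a softmax and categorical cross-entropy in place of the sigmoid and binary cross-entropy.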
user5722540