Questions tagged [encoder]
24 questions
4 votes, 2 answers
What is the difference between the BERT architecture and the vanilla Transformer architecture?
I'm doing some research on the summarization task and found out that BERT is derived from the Transformer model. Every blog about BERT that I have read focuses on explaining what a bidirectional encoder is, so I think this is what made BERT…
Luong Minh Tam (143)
3 votes, 1 answer
Why transform embedding dimension in sin-cos positional encoding?
Positional encoding using sine-cosine functions is often used in transformer models.
Assume that $X \in R^{l\times d}$ is the embedding of an example, where $l$ is the sequence length and $d$ is the embedding size. This positional encoding layer…
kyc12 (165)
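For readers unfamiliar with the encoding mentioned above, here is a minimal NumPy sketch of the standard sine-cosine scheme (my own illustration, assuming an even embedding size $d$; shapes follow the question's setup with sequence length $l$):

```python
import numpy as np

def sinusoidal_positional_encoding(l, d):
    # P[i, 2j]   = sin(i / 10000^(2j / d))
    # P[i, 2j+1] = cos(i / 10000^(2j / d))
    positions = np.arange(l)[:, None]            # (l, 1)
    div = 10000 ** (np.arange(0, d, 2) / d)      # (d/2,)
    P = np.zeros((l, d))
    P[:, 0::2] = np.sin(positions / div)
    P[:, 1::2] = np.cos(positions / div)
    return P

# The encoding is added to the embedding X of shape (l, d):
# X = X + sinusoidal_positional_encoding(l, d)
```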
3 votes, 1 answer
What to do with Transformer Encoder output?
I'm in the middle of learning about Transformer layers, and I feel like I've got enough of the general idea behind them to be dangerous. I'm designing a neural network and my team would like to include them, but we're unsure how to proceed with the…
Rstan (33)
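One common answer, sketched below in PyTorch (the dimensions and the mean-pooling choice are assumptions, not the only option): pool the per-position encoder outputs into one vector per example and feed that to a task head.

```python
import torch
import torch.nn as nn

encoder_layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)

x = torch.randn(8, 20, 64)           # (batch, seq_len, d_model), hypothetical inputs
h = encoder(x)                       # same shape: one vector per position
pooled = h.mean(dim=1)               # (8, 64): fixed-size summary per example
logits = nn.Linear(64, 3)(pooled)    # e.g. a 3-class classification head
```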
2 votes, 2 answers
Role of decoder in Transformer?
I understand the mechanics of the Encoder-Decoder architecture used in the Attention Is All You Need paper. My question is more high-level, about the role of the decoder. Say we have a sentence translation task: Je suis étudiant -> I am a student
The…
kyc12 (165)
2 votes, 1 answer
Encoding correlation
I have a rather theory-based question, as I'm not that experienced with encoders, embeddings, etc. Scientifically, I'm mostly oriented around novel evolutionary model-based methods.
Let's assume we have a data set with highly correlated attributes. Usually…
Piotr Rarus (854)
2 votes, 1 answer
Doubts regarding function used for positional encoding
In positional encoding for the transformer, we usually use a sinusoidal encoding rather than a binary encoding, even though a binary encoding could capture the positional information very similarly to a sinusoidal encoding (with multiple…
Ashwin Prasad (21)
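To make the comparison in the question concrete, here is a small illustration (mine, not from the question) of a binary positional code, where bit $j$ of position $i$ alternates with period $2^{j+1}$; the sinusoids can be viewed as a smooth, continuous analogue of these alternating bits.

```python
import numpy as np

def binary_positional_encoding(l, d):
    # Row i holds the d lowest-order bits of the position index i.
    positions = np.arange(l)[:, None]        # (l, 1)
    bits = (positions >> np.arange(d)) & 1   # (l, d); bit j flips every 2**j steps
    return bits.astype(float)

print(binary_positional_encoding(8, 3))
```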
2 votes, 2 answers
Is it vital to do label encoding on the target variable?
Should I always use label encoding while doing binary classification?
Rus Pylypyuk (21)
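For a binary target with string labels, a small sklearn example (the labels here are made up) of what label encoding does:

```python
from sklearn.preprocessing import LabelEncoder

y = ["spam", "ham", "ham", "spam"]   # hypothetical string targets
le = LabelEncoder()
y_encoded = le.fit_transform(y)      # array([1, 0, 0, 1])
print(le.classes_)                   # ['ham' 'spam']
```

If the target is already stored as 0/1 integers, this step adds nothing; it only matters when the labels are strings or otherwise non-numeric.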
1 vote, 1 answer
Encode time-series of different lengths with keras
I have time-series as my data (one time-series per training example). I would like to encode the data within these series into a fixed-length vector of features using a Keras model.
The problem is that my different examples' time-series don't have the…
Contestosis (191)
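A common way to handle this, sketched below with tf.keras (shapes and layer sizes are assumptions): pad the series to a common length and use a Masking layer so the recurrent encoder ignores the padded steps while still returning one fixed-length vector per example.

```python
import numpy as np
import tensorflow as tf

# Hypothetical ragged examples, each of shape (timesteps, n_features)
series = [np.random.rand(t, 4) for t in (10, 25, 17)]

padded = tf.keras.preprocessing.sequence.pad_sequences(
    series, padding="post", dtype="float32")        # (3, 25, 4), zero-padded

encoder = tf.keras.Sequential([
    tf.keras.layers.Masking(mask_value=0.0),        # skip all-zero (padded) timesteps
    tf.keras.layers.LSTM(32),                       # one fixed-length vector per series
])
fixed_vectors = encoder(padded)                     # shape (3, 32)
```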
1 vote, 1 answer
How to add a Decoder & Attention Layer to a Bidirectional Encoder with TensorFlow 2.0
I am a beginner in machine learning and I'm trying to create a spelling-correction model that spell-checks a small vocabulary (approximately 1000 phrases). Currently, I am referring to the TensorFlow 2.0 tutorials for 1. NMT with Attention,…
Dom (11)
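A rough tf.keras sketch of the wiring (vocabulary size, dimensions, and the choice of the built-in AdditiveAttention layer are my assumptions; if I recall correctly, the referenced tutorial instead defines its own Bahdanau-style attention layer):

```python
import tensorflow as tf

vocab_size, emb_dim, units = 1000, 64, 128

enc_in = tf.keras.Input(shape=(None,))
enc_emb = tf.keras.layers.Embedding(vocab_size, emb_dim)(enc_in)
enc_seq = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(units, return_sequences=True))(enc_emb)           # (batch, T_enc, 2*units)

dec_in = tf.keras.Input(shape=(None,))
dec_emb = tf.keras.layers.Embedding(vocab_size, emb_dim)(dec_in)
dec_seq = tf.keras.layers.LSTM(2 * units, return_sequences=True)(dec_emb)  # (batch, T_dec, 2*units)

# Attention: decoder states query the encoder states
context = tf.keras.layers.AdditiveAttention()([dec_seq, enc_seq])          # (batch, T_dec, 2*units)
logits = tf.keras.layers.Dense(vocab_size)(
    tf.keras.layers.Concatenate()([dec_seq, context]))

model = tf.keras.Model([enc_in, dec_in], logits)
```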
1 vote, 1 answer
sklearn serialize label encoder for multiple categorical columns
I have a model with several categorical features that need to be converted to numeric format. I am using a combination of LabelEncoder and OneHotEncoder to achieve this.
Once in production, I need to apply the same encoding to new incoming data…
revy (133)
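A minimal sketch of one way to do this (column names and the file path are hypothetical): fit the encoder once on the training data, persist it with joblib, and reload the same object in production so new data gets the identical mapping.

```python
import joblib
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training data with two categorical columns
train_df = pd.DataFrame({"color": ["red", "blue"], "country": ["US", "FR"]})

encoder = OneHotEncoder(handle_unknown="ignore")    # unseen categories -> all-zero row
encoder.fit(train_df)
joblib.dump(encoder, "categorical_encoder.joblib")  # persist the fitted encoder

# In the serving process: reload and apply the exact same encoding
encoder = joblib.load("categorical_encoder.joblib")
new_df = pd.DataFrame({"color": ["blue"], "country": ["DE"]})
X_new = encoder.transform(new_df)                   # same column layout as at training time
```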
1 vote, 1 answer
How do I implement a dual-encoder model in PyTorch?
I am trying to implement the paper titled Learning Cross-lingual Sentence Representations via a Multi-task Dual-Encoder Model.
Here the encoder and decoder share the same weights, but I am unable to put it into code. Any links?
gaurus (351)
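Weight sharing in PyTorch usually just means instantiating one module and calling it on both inputs. A toy sketch under that assumption (the GRU encoder and dot-product scoring are placeholders, not the paper's exact architecture):

```python
import torch
import torch.nn as nn

class DualEncoder(nn.Module):
    """One shared encoder applied to both sides, so all weights are tied."""
    def __init__(self, vocab_size=10000, emb_dim=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden, batch_first=True)

    def encode(self, tokens):
        _, h = self.rnn(self.embed(tokens))     # h: (1, batch, hidden)
        return h.squeeze(0)

    def forward(self, source_tokens, target_tokens):
        u = self.encode(source_tokens)          # the same parameters encode both inputs
        v = self.encode(target_tokens)
        return (u * v).sum(dim=-1)              # dot-product similarity per pair

model = DualEncoder()
src = torch.randint(0, 10000, (4, 12))
tgt = torch.randint(0, 10000, (4, 9))
scores = model(src, tgt)                        # shape (4,)
```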
1 vote, 1 answer
How is RNN decoder output calculated?
I was trying to read the RNN Encoder-Decoder paper.
RNN (plain RNN, i.e. a non-encoder-decoder RNN)
It starts by giving the equation for an RNN: the hidden state is given as
$$h_{\langle t \rangle} = f\left(h_{\langle t-1 \rangle}, x_t\right) \qquad \text{(1)}$$
where $f$ is a non-linear activation function.
The output is a…
Mahesha999 (299)
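For reference, the decoder side of the same paper (Cho et al., 2014), written out as I recall it; the "output" the question asks about is the conditional distribution produced by $g$:

$$h_{\langle t \rangle} = f\left(h_{\langle t-1 \rangle}, y_{t-1}, c\right), \qquad
P(y_t \mid y_{t-1}, \dots, y_1, c) = g\left(h_{\langle t \rangle}, y_{t-1}, c\right),$$

where $c$ is the fixed-length summary vector produced by the encoder and $g$ yields a valid probability distribution (e.g. via a softmax).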
1 vote, 2 answers
What does the output of an encoder in an encoder-decoder model represent?
So in most blogs or books touching upon the topic of encoder-decoder architectures the authors usually say that the last hidden state(s) of the encoder is passed as input to the decoder and the encoder output is discarded. They skim over that topic…
Marek M. (63)
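A tiny PyTorch illustration of the distinction the question is after (sizes are arbitrary): an RNN encoder returns both the per-step outputs and the final hidden state, and a plain seq2seq passes only the latter to the decoder.

```python
import torch
import torch.nn as nn

rnn = nn.GRU(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(2, 5, 8)          # (batch, time, features)

outputs, h_n = rnn(x)
print(outputs.shape)              # (2, 5, 16): hidden state at every time step
print(h_n.shape)                  # (1, 2, 16): hidden state of the last step only
# A plain seq2seq initialises the decoder with h_n; `outputs` is what
# attention mechanisms later attend over instead of discarding it.
```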
1 vote, 1 answer
Encode categorical data for unsupervised learning
What is the best encoder for categorical data in unsupervised learning?
I am using unsupervised learning on mixed data (such as K-means).
Before running my unsupervised algorithm, I reduce the dimensionality of my data with FAMD (PCA for mixed…
Julien PETOT (11)
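Setting FAMD aside, a common baseline is to one-hot encode the categorical columns and scale the numeric ones before clustering; a small sklearn sketch with made-up data:

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical mixed numeric/categorical data
df = pd.DataFrame({
    "age": [23, 45, 31, 52],
    "income": [30_000, 80_000, 52_000, 61_000],
    "city": ["Paris", "Lyon", "Paris", "Nice"],
})

preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age", "income"]),
    ("cat", OneHotEncoder(), ["city"]),
])
clustering = make_pipeline(preprocess, KMeans(n_clusters=2, n_init=10))
labels = clustering.fit_predict(df)     # one cluster label per row
```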
1 vote, 0 answers
Motivation of LSTM with no Input
I have read this paper, where the authors use an LSTM to learn the attention applied to several sets. They use the LSTM without input or output; the LSTM just uses the hidden state and evolves it:
My question is: what is the motivation for using an LSTM without any…
Oculu (11)
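What such a "no input" LSTM can look like in practice, as a hedged PyTorch sketch (a constant zero vector is fed at every step, so only the state evolves; this mirrors the "process" block of set-to-set style models, which may be what the paper refers to):

```python
import torch
import torch.nn as nn

hidden = 32
cell = nn.LSTMCell(input_size=1, hidden_size=hidden)

h = torch.zeros(1, hidden)
c = torch.zeros(1, hidden)
zero_input = torch.zeros(1, 1)     # "no input": a constant placeholder

states = []
for _ in range(5):                 # a few processing steps
    h, c = cell(zero_input, (h, c))
    states.append(h)               # the evolving h can parameterise attention at each step
```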