Highest Voted Questions - Data Science Stack Exchange

8

votes

1 answer

Which of the NIPS 2014 papers are most significant, and why?

As a newcomer to the field, I find many of the NIPS 2014 papers fascinating, but it is difficult for me to evaluate which ones represent real progress over current approaches. Which papers do you think are most significant and are likely to have a…

machine-learning research state-of-the-art

asked Aug 21 '15 at 18:10

Michael R. Bernstein

189
2

8

votes

2 answers

What are some standard ways of computing the distance between individual search queries?

I asked a similar question about the distance between "documents" (Wikipedia articles, news stories, etc.). I made this a separate question because search queries are considerably smaller and noisier than documents. I hence…

machine-learning nlp search

asked Jul 05 '14 at 16:20

Matt

821
1
8
12

8

votes

1 answer

Why does gradient boosting use sampling without replacement?

In Random Forest, each tree is built from a sample drawn with replacement (bootstrap), and I assumed that Gradient Boosting's trees were built with the same sampling technique (@BenReiniger corrected me). Here are the sampling techniques…

machine-learning random-forest decision-trees xgboost sampling

asked Feb 07 '20 at 06:59

Carlos Mougan

6,430
2
20
51
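To make the two sampling schemes in the question above concrete, here is a minimal sketch using only the Python standard library. The helper names `bootstrap_sample` and `subsample_without_replacement` and the 0.5 fraction are illustrative assumptions, not part of any boosting library's API.

```python
import random

def bootstrap_sample(rows, rng):
    # Sampling WITH replacement: same size as the input, duplicates allowed.
    return rng.choices(rows, k=len(rows))

def subsample_without_replacement(rows, fraction, rng):
    # Sampling WITHOUT replacement: a smaller set in which every row is distinct.
    k = int(len(rows) * fraction)
    return rng.sample(rows, k)

rng = random.Random(0)   # fixed seed so the draw is repeatable
rows = list(range(10))   # hypothetical row indices
boot = bootstrap_sample(rows, rng)
sub = subsample_without_replacement(rows, 0.5, rng)
```

A bootstrap draw keeps the sample size equal to the data size and may repeat rows, while the subsample is strictly smaller and never repeats a row.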

8

votes

2 answers

How can word2vec handle unseen / new words to bypass this for new classifications?

In simple terms, if my classification is based on word2vec features, what am I supposed to do when a new word arrives that has no word2vec vector? I am trying to use word2vec or word vectors for classification based on entity. For example: I…

machine-learning nlp deep-learning word-embeddings

asked Aug 11 '15 at 05:04

Sarath

81
1
2

8

votes

2 answers

Linearly increasing data with manual reset

I have a linearly increasing time series dataset from a sensor, with values ranging between 50 and 150. I've implemented a Simple Linear Regression algorithm to fit a regression line to this data, and I'm predicting the date when the series would reach…

machine-learning statistics time-series

asked Jul 04 '14 at 05:12

ArunDhaJ

183
6

8

votes

2 answers

How does one derive the modified tanh activation proposed by LeCun?

In "Efficient Backprop" (http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf), LeCun and others propose a modified tanh activation function of the form: $$ f(x) = 1.7159 \tanh\left(\frac{2}{3} x\right) $$ They argue that: It is easier to approximate with…

neural-network activation-function mathematics

asked Jan 25 '20 at 14:17

Lucas Morin

2,775
5
25
47
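As a quick numerical check of the formula in the question above, the sketch below evaluates the scaled tanh at a few points. The function name `scaled_tanh` is an illustrative choice; the constants are the ones quoted in the question.

```python
import math

def scaled_tanh(x: float) -> float:
    # f(x) = 1.7159 * tanh(2x / 3), with the constants quoted in the question.
    return 1.7159 * math.tanh(2.0 / 3.0 * x)

for x in (-1.0, 0.0, 1.0):
    print(x, scaled_tanh(x))
```

With these constants the function is odd and maps x = 1 to approximately 1 and x = -1 to approximately -1.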

8

votes

1 answer

sklearn SimpleImputer too slow for categorical data represented as string values

I have a data set with categorical features represented as string values and I want to fill in the missing values. I’ve tried to use sklearn’s SimpleImputer, but it takes much longer than pandas to complete the task. Both methods produce…

python scikit-learn pandas preprocessing

asked Jan 07 '20 at 12:43

vlc146543

83
1
4

8

votes

1 answer

TensorFlow / Keras: What is stateful = True in LSTM layers?

Could you elaborate on this argument? I found the brief explanation from the docs unsatisfying: stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i…

deep-learning tensorflow lstm rnn gru

asked Jan 07 '20 at 12:15

Leevo

6,445
3
18
52

8

votes

2 answers

NLP: variations of a text without modifying its meaning

I am currently working on the automation of recurring reports (weekly 30-50 page reports for around 100 districts). Those reports have a mostly fixed form: maps, graphs, data tables and small zones of text. Apart from some discussion around colors…

nlp neural-style-transfer

asked Jan 04 '20 at 16:53

Lucas Morin

2,775
5
25
47

8

votes

1 answer

F1 score vs accuracy, which metric is more important?

I have two multiclass classification models for making predictions (three classes, to be precise). One is a Keras neural network, the other is a Gradient Boosted Classifier from the Scikit Learn library. I have noticed that after training on the same…

classification

asked Dec 23 '19 at 16:50

Ach113

255
1
2
7

8

votes

1 answer

What is Continuous Ranked Probability Score (CRPS)?

I came across an evaluation metric at Kaggle: Continuous Ranked Probability Score (CRPS): Mathematically, $C = \frac{1}{199N} \sum_{m=1}^{N} \sum_{n=-99}^{99} \left(P(y \le n) - H(n - Y_m)\right)^2,$ where P is the predicted distribution, N is the number of…

python model-evaluations metric

asked Nov 28 '19 at 09:58

user86099
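The CRPS formula quoted in the question above can be sketched in plain Python as follows. Since the excerpt is cut off before H is defined, this assumes H is the Heaviside step function (1 when its argument is non-negative, 0 otherwise) and that each sample m has its own predicted CDF over the 199 thresholds n = -99, ..., 99. The names `heaviside`, `crps`, `pred_cdfs` and `y_true` are illustrative.

```python
def heaviside(x):
    # Assumed definition: H(x) = 1 for x >= 0, else 0.
    return 1.0 if x >= 0 else 0.0

def crps(pred_cdfs, y_true):
    # pred_cdfs[m][i] is the predicted P(y <= n) for sample m at the
    # i-th threshold, where the thresholds run from -99 to 99 (199 values).
    thresholds = range(-99, 100)
    total = 0.0
    for cdf, y in zip(pred_cdfs, y_true):
        for p, n in zip(cdf, thresholds):
            total += (p - heaviside(n - y)) ** 2
    return total / (199 * len(y_true))

# A forecast that puts all probability mass on the true value scores 0.
perfect = [[heaviside(n - 3) for n in range(-99, 100)]]
print(crps(perfect, [3]))
```

Lower is better: the score is the mean squared gap between the predicted CDF and the step function located at the observed value.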

8

votes

3 answers

Pivoting a two-column feature table in Pandas

How can I transform the following DataFrame into one with cities as rows and each cuisine as a column, and 1 or 0 as values (1 if the city has that kind of cuisine)? I think this turns out to be a very common problem in transforming data into…

data-mining feature-extraction pandas

asked Jul 05 '15 at 15:10

blue-dino

383
2
3
11
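One possible way to build the city-by-cuisine indicator table described in the question above is `pandas.crosstab`, sketched below. The column names `city` and `cuisine` and the sample rows are made up for illustration; the question's actual DataFrame is not shown in the excerpt.

```python
import pandas as pd

# Hypothetical two-column input: one row per (city, cuisine) pair.
df = pd.DataFrame({
    "city": ["Paris", "Paris", "Rome", "Tokyo"],
    "cuisine": ["French", "Italian", "Italian", "Japanese"],
})

# crosstab counts each (city, cuisine) pair; clipping at 1 turns the
# counts into 0/1 indicators even if a pair appears more than once.
indicators = pd.crosstab(df["city"], df["cuisine"]).clip(upper=1)
print(indicators)
```

The result has one row per city and one column per cuisine, with 1 where the pair occurs in the input and 0 elsewhere.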

8

votes

3 answers

How to find similarity between different factors in a dataset

Introduction: Let's say I have a dataset of different observations of different people, and I want to group people together to know which person is closest to which other. I also want a measure of how close they are to each other and…

machine-learning r similarity correlation

asked Jun 26 '15 at 20:48

zipp

183
1
4

8

votes

2 answers

Data anonymization in Python

I am working on an industrial project that uses real data. The data contains sensitive information about company operations which cannot be disclosed publicly. As a result, I need to anonymize the original data first before…

machine-learning python data data-cleaning anonymization

asked Oct 23 '19 at 23:40

Muhammad Ali

2,509
5
21
22

8

votes

1 answer

Why is word prediction an obsession in Natural Language Processing?

I have heard how great BERT is at masked word prediction, i.e. predicting a missing word from a sentence. In a Medium post about BERT, it says: The basic task of a language model is to predict words in a blank, or it predicts the probability that a…

nlp bert

asked Oct 16 '19 at 14:52

SamR

183
1
5
