Highest Voted Questions - Data Science Stack Exchange

10

votes

3 answers

Questions about LSTM cells, units and inputs

I'm trying to learn how LSTM networks work, and although I get the basics, the details of the internal structure are not clear to me. On this blog link, I found this scheme of an LSTM architecture, where apparently every circle should correspond to…

machine-learning neural-network deep-learning lstm

asked Dec 07 '17 at 02:01

BBB

101
1
3

10

votes

3 answers

Log file analysis: extracting information part from value part

I'm trying to build a data set on several log files of one of our products. The different log files have their own layout and own content; I successfully grouped them together, only one step remaining... Indeed, the log "messages" are the best…

text-mining clustering

asked Nov 20 '14 at 14:26

Michael Hooreman

813
2
10
21

10

votes

3 answers

Clustering of documents using the topics derived from Latent Dirichlet Allocation

I want to use Latent Dirichlet Allocation for a project and I am using Python with the gensim library. After finding the topics I would like to cluster the documents using an algorithm such as k-means (ideally I would like to use a good one for…

python clustering lda

asked Nov 13 '14 at 09:19

Swan87

221
1
2
4

10

votes

1 answer

What is the "novel reinforcement learning algorithm" in AlphaGo Zero?

For some reason, AlphaGo Zero isn't getting as much publicity as the original AlphaGo, despite its incredible results. Starting from scratch, it's already beaten AlphaGo Master and has passed numerous other benchmarks. Even more incredibly, it's…

machine-learning deep-learning

asked Oct 19 '17 at 23:38

Dubukay

203
1
7

10

votes

1 answer

A clear visualization of a two-way ANOVA

To provide a full yet simple picture of a 3-level, one-way ANOVA, I use the following visualization, where variation within each group (the filled circles) and variation between the groups (black arrows) are easy to understand. But I'm wondering…

r statistics visualization

asked Oct 02 '17 at 17:34

Reza Norouzian

101
2

10

votes

3 answers

Public dataset for news articles with their associated categories

I am wondering if there are any public datasets of Google News with various news categories such as politics, entertainment, lifestyle, general news, sports, etc. I want to use such a dataset for topic detection of various sentences or paragraphs. I…

machine-learning data-mining nlp dataset text-mining

asked Sep 26 '17 at 08:56

utengr

213
1
2
10

10

votes

2 answers

How to get an aggregate confusion matrix from n different classifications

I want to test the accuracy of a methodology. I ran it ~400 times, and I got a different classification for each run. I also have the ground truth, i.e., the real classification to test against. For each classification I computed a confusion matrix.…

classification confusion-matrix accuracy

asked Jun 05 '14 at 09:00

gc5

879
2
9
17

10

votes

2 answers

Scalable Outlier/Anomaly Detection

I am trying to set up a big data infrastructure using Hadoop, Hive, Elasticsearch (amongst others), and I would like to run some algorithms over certain datasets. I would like the algorithms themselves to be scalable, so this excludes using tools…

data-mining bigdata algorithms outlier

asked Oct 17 '14 at 10:47

doublebyte

430
3
9

10

votes

1 answer

Convolutional network for classification, extremely sensitive to lighting

I trained a convolutional network to classify images of a mechanical component as good or defective. Though the test accuracy was high, I realized that the model performed poorly on images which had slightly different lighting. The features that…

machine-learning classification deep-learning image-classification

asked Sep 03 '17 at 15:04

Effective_cellist

201
1
5

10

votes

1 answer

Is it valuable to normalize/rescale labels in neural network regression?

Have there been any papers, or does anyone have any specific experience to know whether normalizing labels in a regression problem is likely to improve the performance of a neural network? I have labels that are in the range (0,1000) applying square…

neural-network normalization labels

asked Sep 01 '17 at 16:36

davidparks21

433
1
4
18

10

votes

1 answer

How to use Embedding() with 3D tensor in Keras?

I have a list of stock price sequences with 20 timesteps each. That's a 2D array of shape (total_seq, 20). I can reshape it into (total_seq, 20, 1) for concatenation with other features. I also have a news title with 10 words for each timestep. So I…

python tensorflow keras rnn lstm

asked Aug 11 '17 at 11:20

offchan

305
3
12

10

votes

2 answers

Why does Q Learning diverge?

My Q-Learning algorithm's state values keep on diverging to infinity, which means my weights are diverging too. I use a neural network for my value-mapping. I've tried: Clipping the "reward + discount * maximum value of action" (max/min set to…

machine-learning python reinforcement-learning q-learning

asked Aug 11 '17 at 01:11

nedward

414
5
13

10

votes

4 answers

Why does it speed up gradient descent if the function is smooth?

I am reading the book "Hands-On Machine Learning with Scikit-Learn and TensorFlow", and chapter 11 gives the following explanation of ELU (Exponential Linear Unit): Third, the function is smooth everywhere, including around z…

deep-learning gradient-descent

asked Aug 07 '17 at 14:58

Blaszard

911
1
13
30

10

votes

1 answer

Can linearly non-separable data be learned using polynomial features with logistic regression?

I know that Polynomial Logistic Regression can easily learn typical data like that in the following image: I was wondering whether the following two data sets can also be learned using Polynomial Logistic Regression. I guess I have to add more…

machine-learning classification

asked Aug 02 '17 at 10:47

Green Falcon

14,308
10
59
98

10

votes

1 answer

How should one deal with implicit data in recommendation

A recommendation system keeps a log of what recommendations have been made to a particular user and whether that user accepts the recommendation. It's like

user_id  item_id  result
1        4        1
1        7        -1
5        19       1
5        80       …

recommender-system

asked May 25 '14 at 13:57

wdg

203
1
6
