Highest Voted Questions - Data Science Stack Exchange

9

votes

3 answers

Does DTW (Dynamic Time Warping) require prior normalization?

I'm trying DTW from mlpy to check the similarity between time series. Should I normalize the series before processing them with DTW, or is it tolerant enough that I can use the series as they are? All time series are stored in a pandas DataFrame, each in…

time-series

asked Jan 02 '17 at 19:45

KcFnMi

353
1
4
8

9

votes

1 answer

Machine Learning: Writing Poems

I'm a student of machine learning, and lately I have been trying to learn how to use the TensorFlow library. I've gone through various tutorials and a lot of trial and error with TensorFlow, and I thought the best way to really learn it would be to make use of…

machine-learning neural-network tensorflow

asked Dec 16 '16 at 01:43

Daniel

181
1
11

9

votes

3 answers

How do you evaluate an ML model already deployed in production?

To be clearer, let's consider the problem of loan default prediction. Let's say I have trained and tested multiple classifiers offline and ensembled them. Then I put this model into production. But because people change, data and many other…

machine-learning model-evaluations

asked Dec 06 '16 at 00:00

tomtom

247
3
5

9

votes

1 answer

Is time series multi-step ahead forecasting a sequence to sequence problem?

I'm using the Keras package to train an LSTM on a univariate time series of numeric (float) values. Performing a 1-step-ahead forecast is trivial, but I'm not sure how to perform, say, a 10-step-ahead forecast. Two questions: 1) I read…

time-series keras

asked Dec 05 '16 at 12:56

sevelf

91
1
3

9

votes

1 answer

Sigmoid vs ReLU function in ConvNets

The question is simple: is there any advantage in using a sigmoid function in a convolutional neural network? I ask because every website that talks about CNNs uses the ReLU function.

convolutional-neural-network

asked Dec 02 '16 at 18:33

Malvrok

105
1
4

9

votes

1 answer

How do I pass data into Keras?

I am currently struggling to understand how I should train my regression network using Keras. I am not sure how I should pass my input data to the network. Both the input data and the output data are stored as lists of NumPy arrays. Each input numpy…

python regression tensorflow keras

asked Nov 13 '16 at 12:38

Loser

165
1
2
7

9

votes

1 answer

How to extract paragraphs from a text document?

I have extracted text data from PDF files of companies' annual reports using pdftotext. The extracted file content looks like: Sample PDF file is here FORWARD-LOOKING STATEMENTS In this Annual Report, we have disclosed forward-looking…

data-mining text-mining data-cleaning

asked Nov 11 '16 at 06:06

Sanjeev

191
1
1
4

9

votes

1 answer

After the training phase, is it better to run neural networks on a GPU or CPU?

My understanding is that GPUs are more efficient for running neural nets, but someone recently suggested to me that GPUs are only needed for the training phase. Once trained, it's actually more efficient to run them on CPUs. Is this true?

neural-network deep-learning gpu

asked Nov 06 '16 at 00:34

Crashalot

223
2
5

9

votes

1 answer

Neural network with flexible number of inputs?

Is it possible to create a neural network that provides a consistent output when the input vectors can have different lengths? I am currently in a situation where I have sampled a lot of audio files, which are of different lengths, and have to…

neural-network regression tensorflow supervised-learning audio-recognition

asked Oct 20 '16 at 18:46

Carlton Banks

619
1
6
26

9

votes

1 answer

How can I get the ImageNet ILSVRC 2012 data used for the classification challenge?

I would like to see if I can reproduce some of the ImageNet results. However, I could not find the data (the list of URLs) used for training / testing in the ILSVRC 2012 (or later) classification challenges. I only found…

dataset image-classification

asked Sep 05 '16 at 15:52

Martin Thoma

19,540
36
98
170

9

votes

4 answers

Build a tool for manually classifying training data images

I have a large number of images that I need to classify for training a clustering algorithm, and I would like to do so offline (the data is proprietary). Basically, I'd like to build a desktop survey tool that enables me to rapidly place each image…

machine-learning image-classification training

asked Aug 09 '16 at 20:07

atkat12

278
2
5

9

votes

1 answer

Minimum number of trees for Random Forest classifier

I am searching for a theoretical or experimental estimation of the lower bound for the number of trees in a Random Forest classifier. I usually test different combinations and select the one that (using cross-validation) provides the median best…

random-forest decision-trees

asked Aug 09 '16 at 09:28

gc5

879
2
9
17

9

votes

2 answers

What's an LSTM-LM formulation?

I am reading the paper "Sequence to Sequence Learning with Neural Networks" (http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf). Under "2. The Model" it says: The LSTM computes this conditional probability by…

machine-learning neural-network nlp rnn machine-translation

asked Aug 04 '16 at 08:25

Taivanbat Badamdorj

211
2
4

9

votes

2 answers

Neural network obfuscation

Neural networks are trained to minimize some error function over the weights of the neural connections. In some applications, these weights could be considered intellectual property. Is there a way to encrypt these weights and still have an…

neural-network

asked Aug 03 '16 at 19:15

Maxwell Johansen

91
3

9

votes

3 answers

Regression model R2 drops when I remove outliers: is that even possible?

I'm analyzing how outliers in my dataset of size 8x8000 affect regression models. I have three scenarios: raw dataset (with outliers), Winsorized dataset (2% of the extreme outliers adjusted), and dataset without outliers (rows with outliers…

regression svm outlier

asked Feb 03 '25 at 16:35

ml.freak

103
4
