Highest Voted Questions - Data Science Stack Exchange

59 votes

3 answers

How to set batch_size, steps_per_epoch, and validation_steps?

I am starting to learn CNNs using Keras. I am using the Theano backend. I don't understand how to set values for batch_size, steps_per_epoch, and validation_steps. What values should be set for batch_size, steps_per_epoch, and validation_steps, if I…

machine-learning keras cnn theano

asked Mar 30 '18 at 06:53

Ermene


59 votes

5 answers

Number of parameters in an LSTM model

How many parameters does a single stacked LSTM have? The number of parameters imposes a lower bound on the number of training examples required and also influences the training time. Hence knowing the number of parameters is useful for training…

deep-learning rnn

asked Mar 09 '16 at 11:14

wabbit


58 votes

8 answers

Why do internet companies prefer Java/Python for data scientist job?

I often see job descriptions for data scientists that ask for Python/Java experience and disregard R. Below is a personal email I received from the chief data scientist of a company I applied to through LinkedIn. X, Thanks for connecting and…

beginner tools career reference-request

asked Aug 18 '16 at 05:05

StatguyUser


58 votes

2 answers

How to interpret the output of XGBoost importance?

I ran an XGBoost model. I don't know exactly how to interpret the output of xgb.importance. What is the meaning of Gain, Cover, and Frequency and how do we interpret them? Also, what do Split, RealCover, and RealCover% mean? I have some extra…

machine-learning xgboost

asked Jun 21 '16 at 06:02

user14204

57 votes

8 answers

Why Is Overfitting Bad in Machine Learning?

Logic often states that by overfitting a model, its capacity to generalize is limited, though this might only mean that overfitting stops a model from improving after a certain complexity. Does overfitting cause models to become worse regardless of…

machine-learning predictive-modeling

asked May 14 '14 at 18:09

blunders


56 votes

5 answers

How do subsequent convolution layers work?

This question boils down to "how exactly do convolution layers work?" Suppose I have an $n \times m$ greyscale image. So the image has one channel. In the first layer, I apply a $3\times 3$ convolution with $k_1$ filters and padding. Then I have…

neural-network convolutional-neural-network

asked Dec 02 '15 at 21:53

Martin Thoma


56 votes

2 answers

train_test_split() error: Found input variables with inconsistent numbers of samples

Fairly new to Python but building out my first RF model based on some classification data. I've converted all of the labels into int64 numerical data and loaded into X and Y as a numpy array, but I am hitting an error when I am trying to train the…

python scikit-learn sampling

asked Jul 06 '17 at 05:17

josh_gray


55 votes

9 answers

Is the R language suitable for Big Data

R has many libraries aimed at data analysis (e.g. JAGS, BUGS, ARULES, etc.), and is mentioned in popular textbooks such as J. Kruschke, "Doing Bayesian Data Analysis", and B. Lantz, "Machine Learning with R". I've seen a guideline of 5TB for a…

bigdata r

asked May 14 '14 at 11:15

akellyirl


55 votes

9 answers

How do I compare columns in different data frames?

I would like to compare one column of a data frame with columns in other data frames. The columns are first names and last names. I'd like to check if a person in one data frame is in another one.

pandas dataframe

asked Jun 12 '18 at 22:34

a_a_a


54 votes

2 answers

Why not always use the ADAM optimization technique?

It seems the Adaptive Moment Estimation (Adam) optimizer nearly always works better (faster and more reliably reaching a global minimum) when minimising the cost function in training neural nets. Why not always use Adam? Why even bother using…

neural-network optimization

asked Apr 15 '18 at 16:55

PyRsquared


52 votes

3 answers

What loss function to use for imbalanced classes (using PyTorch)?

I have a dataset with 3 classes with the following items: Class 1: 900 elements; Class 2: 15000 elements; Class 3: 800 elements. I need to predict class 1 and class 3, which signal important deviations from the norm. Class 2 is the default “normal”…

neural-network pytorch

asked Apr 01 '19 at 19:00

Muppet


52 votes

3 answers

What is "experience replay" and what are its benefits?

I've been reading Google's DeepMind Atari paper and I'm trying to understand the concept of "experience replay". Experience replay comes up in a lot of other reinforcement learning papers (particularly, the AlphaGo paper), so I want to understand…

reinforcement-learning q-learning

asked Jul 19 '17 at 04:15

Ryan Zotti


52 votes

3 answers

What does the notation mAP@[.5:.95] mean?

For detection, a common way to determine if one object proposal was right is Intersection over Union (IoU, IU). This takes the set $A$ of proposed object pixels and the set of true object pixels $B$ and calculates: $$IoU(A, B) = \frac{A \cap B}{A…

computer-vision

asked Feb 07 '17 at 09:09

Martin Thoma

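The IoU definition quoted in the mAP@[.5:.95] question above is cut off in this listing. As a rough illustration only, the sketch below assumes the standard form, the size of the intersection divided by the size of the union, applied to sets of pixel coordinates. The function name `iou`, the empty-set convention, and the example boxes are made up for illustration and are not taken from the question.

```python
def iou(proposed, truth):
    """Intersection over Union of two sets of pixel coordinates.

    Assumes the standard definition |A ∩ B| / |A ∪ B|.
    """
    proposed, truth = set(proposed), set(truth)
    union = proposed | truth
    if not union:
        # Assumption: treat the IoU of two empty sets as 0.0
        # to avoid dividing by zero.
        return 0.0
    return len(proposed & truth) / len(union)


# Two overlapping 2x2 pixel boxes (hypothetical data).
A = {(x, y) for x in range(0, 2) for y in range(0, 2)}  # proposed pixels
B = {(x, y) for x in range(1, 3) for y in range(0, 2)}  # true pixels

print(iou(A, B))  # 2 shared pixels out of 6 in the union
```

Working on explicit pixel sets keeps the sketch close to the set notation in the excerpt; practical detection code usually computes the same ratio from bounding-box coordinates instead.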

51 votes

3 answers

StandardScaler before or after splitting data - which is better?

When I was reading about using StandardScaler, most of the recommendations said that you should use StandardScaler before splitting the data into train/test, but when I was checking some of the code posted online (using sklearn) there were…

machine-learning scikit-learn preprocessing

asked Sep 18 '18 at 02:35

tsumaranaina


51 votes

8 answers

Deep Learning vs gradient boosting: When to use what?

I have a big data problem with a large dataset (take for example 50 million rows and 200 columns). The dataset consists of about 100 numerical columns and 100 categorical columns and a response column that represents a binary class problem. The…

machine-learning classification deep-learning

asked Nov 20 '14 at 06:49

Nitesh

