Most Popular

1500 questions
63 votes, 10 answers

Machine learning - feature engineering from date/time data

What are the common/best practices for handling time data in machine learning applications? For example, if a data set has a column with the timestamp of an event, such as "2014-05-05", how can you extract useful features from this column, if any? Thanks…
Igor Bobriakov • 1,071 • 2 • 9 • 11
63 votes, 6 answers

Should a model be re-trained if new observations are available?

So, I have not been able to find any literature on this subject, but it seems worth some thought: What are the best practices in model training and optimization if new observations are available? Is there any way to determine the…
neural-nut • 1,803 • 3 • 18 • 28
62 votes, 9 answers

Is there any domain where Bayesian Networks outperform neural networks?

Neural networks get top results in Computer Vision tasks (see MNIST, ILSVRC, Kaggle Galaxy Challenge). They seem to outperform every other approach in Computer Vision. But there are also other tasks: Kaggle Molecular Activity Challenge Regression:…
Martin Thoma • 19,540 • 36 • 98 • 170
62 votes, 4 answers

What is the advantage of keeping batch size a power of 2?

While training models in machine learning, why is it sometimes advantageous to keep the batch size to a power of 2? I thought it would be best to use a size that is the largest fit in your GPU memory / RAM. This answer claims that for some packages,…
James Bond • 1,265 • 2 • 12 • 13
62 votes, 3 answers

LeakyReLU vs PReLU

I thought both PReLU and Leaky ReLU are: $$f(x) = \max(x, \alpha x) \qquad \text{ with } \alpha \in (0, 1)$$ Keras, however, has both functions in the docs. Leaky ReLU Source of LeakyReLU: return K.relu(inputs, alpha=self.alpha) Hence (see relu…
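A minimal sketch of the formula quoted in this excerpt, $f(x) = \max(x, \alpha x)$ with $\alpha \in (0, 1)$, may help when reading it. The function name `leaky_relu` and the default `alpha=0.3` are illustrative assumptions, not taken from the Keras source, and `alpha` is treated here as a plain argument.

```python
def leaky_relu(x: float, alpha: float = 0.3) -> float:
    """Evaluate f(x) = max(x, alpha * x), assuming 0 < alpha < 1.

    Hypothetical helper for illustration only; it is not the Keras layer.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in the open interval (0, 1)")
    # For x >= 0 the first argument wins; for x < 0 the scaled value wins.
    return max(x, alpha * x)


# Positive inputs pass through unchanged; negative inputs are scaled by alpha.
print(leaky_relu(2.0))               # 2.0
print(leaky_relu(-2.0, alpha=0.25))  # -0.5
```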
62 votes, 6 answers

Latent Dirichlet Allocation vs Hierarchical Dirichlet Process

Latent Dirichlet Allocation (LDA) and Hierarchical Dirichlet Process (HDP) are both topic modeling processes. The major difference is LDA requires the specification of the number of topics, and HDP doesn't. Why is that so? And what are the…
alvas • 2,510 • 7 • 28 • 40
62 votes, 5 answers

RNN vs CNN at a high level

I've been thinking about the Recurrent Neural Networks (RNN) and their varieties and Convolutional Neural Networks (CNN) and their varieties. Would these two points be fair to say: Use CNNs to break a component (such as an image) into subcomponents…
Larry Freeman • 745 • 1 • 6 • 8
61 votes, 3 answers

How to fight underfitting in a deep neural net

When I started with artificial neural networks (NN) I thought I'd have to fight overfitting as the main problem. But in practice I can't even get my NN to pass the 20% error rate barrier. I can't even beat my score on random forest! I'm seeking some…
lithuak • 733 • 1 • 6 • 8
61 votes, 10 answers

IDE alternatives for R programming (RStudio, IntelliJ IDEA, Eclipse, Visual Studio)

I use RStudio for R programming. I remember solid IDEs from other technology stacks, like Visual Studio or Eclipse. I have two questions: What IDEs other than RStudio are used (please consider providing a brief description of them)? Does…
IgorS • 5,474 • 11 • 34 • 43
61 votes, 6 answers

Does XGBoost handle multicollinearity by itself?

I'm currently using XGBoost on a dataset with 21 features (selected from a list of some 150 features), which I then one-hot encoded to obtain ~98 features. A few of these 98 features are somewhat redundant, for example: a variable (feature) $A$ also…
neural-nut • 1,803 • 3 • 18 • 28
60 votes, 5 answers

Neural networks: which cost function to use?

I am using TensorFlow for experiments, mainly with neural networks. Although I have done quite a few experiments by now (XOR problem, MNIST, some regression stuff, ...), I struggle with choosing the "correct" cost function for specific problems because…
60 votes, 8 answers

Does scikit-learn have a forward selection/stepwise regression algorithm?

I am working on a problem with too many features, and training my models takes way too long. I implemented a forward selection algorithm to choose features. However, I was wondering whether scikit-learn has a forward selection/stepwise regression…
Maksud • 725 • 1 • 7 • 6
60 votes, 3 answers

What does "logits" mean in machine learning?

"One common mistake that I would make is adding a non-linearity to my logits output." What does the term "logit" mean here, or what does it represent?
Rajat • 1,167 • 2 • 10 • 10
59 votes, 6 answers

Should I go for a 'balanced' dataset or a 'representative' dataset?

My 'machine learning' task is to separate benign Internet traffic from malicious traffic. In a real-world scenario, most (say 90% or more) Internet traffic is benign. Thus I felt that I should choose a similar data setup for training my…
pnp • 693 • 1 • 6 • 10
59 votes, 4 answers

What is the difference between bootstrapping and cross-validation?

I used to apply K-fold cross-validation for robust evaluation of my machine learning models. But I'm aware of the existence of the bootstrapping method for this purpose as well. However, I cannot see the main difference between them in terms of…
Fredrik • 1,047 • 3 • 10 • 12