Highest Voted 'machine-learning-model' Questions

46

votes

8 answers

What would I prefer - an over-fitted model or a less accurate model?

Let's say we have trained two models, and let's say we are looking for good accuracy. The first has an accuracy of 100% on the training set and 84% on the test set. Clearly over-fitted. The second has an accuracy of 83% on the training set and 83% on the test set.…

asked Jan 12 '20 at 13:48

EitanT

569
4
3

16

votes

2 answers

Why should we use (or not) dropout on the input layer?

People generally avoid using dropout at the input layer itself. But wouldn't it be better to use it? Adding dropout (given that it's randomized it will probably end up acting like another regularizer) should make the model more robust. It will make…

machine-learning machine-learning-model dropout deep-learning

asked Sep 19 '18 at 19:59

Aditya

2,520
2
17
35

15

votes

3 answers

What are the disadvantages of accuracy?

I have been reading about evaluating a model with accuracy only and I have found some disadvantages. Among them, I read that it equates all errors. How could this problem be solved? Maybe assigning costs to each type of failure? Thank you very much…

machine-learning machine-learning-model accuracy model-evaluations

asked Apr 18 '22 at 08:35

PicaR

334
2
13
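
The idea raised in the excerpt above (assigning a cost to each type of failure) can be sketched in a few lines. This is only an illustrative sketch, not the asker's or any answerer's method: the function names and the cost values (`fp_cost=1.0`, `fn_cost=5.0`) are assumptions chosen for the example, and the labels are made up.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels encoded as 0 and 1."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Fraction of correct predictions; every error counts the same."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def average_cost(y_true, y_pred, fp_cost=1.0, fn_cost=5.0):
    """Mean misclassification cost; the cost values are hypothetical."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (fp * fp_cost + fn * fn_cost) / len(y_true)


# Made-up labels: two positives among ten samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
pred_a = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # two false negatives
pred_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # two false positives

acc_a, acc_b = accuracy(y_true, pred_a), accuracy(y_true, pred_b)
cost_a, cost_b = average_cost(y_true, pred_a), average_cost(y_true, pred_b)
```

Both sets of predictions have the same accuracy, yet their average cost differs once false negatives are weighted more heavily than false positives, which is the limitation the question describes.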

11

votes

3 answers

LightGBM - Why Exclusive Feature Bundling (EFB)?

I'm currently studying GBDT and started reading LightGBM's research paper. In section 4, they explain the Exclusive Feature Bundling algorithm, which aims at reducing the number of features by regrouping mutually exclusive features into bundles,…

feature-selection decision-trees xgboost machine-learning-model gbm

asked Nov 30 '18 at 14:36

Tom

113
1
5

11

votes

8 answers

I got 100% accuracy on my test set, is there something wrong?

I got 100% accuracy on my test set using a decision tree algorithm, but only got 85% accuracy with a random forest. Is there something wrong with my model, or is a decision tree best suited for the dataset provided? Code: from sklearn.model_selection…

scikit-learn random-forest decision-trees accuracy machine-learning-model

asked Jul 19 '18 at 08:16

Harigovind Valsakumar

113
1
1
8

11

votes

2 answers

Why should I understand AI architectures?

Why should I understand what is happening deep down in some AI architecture? For example, LSTM, BERT, Partial Conv... architectures like these. Why should I understand what is going on when I can find any model on the Internet or any implementations…

machine-learning deep-learning cnn machine-learning-model bert

asked Nov 07 '21 at 13:20

CanP

127
1
3

10

votes

2 answers

Optimising for Brier objective function directly gives worse Brier score than optimising with custom objective - what does it tell me?

I am training an XGBoost model, and as I care most about the resulting probabilities, not the classification itself, I have chosen Brier score as the metric for my model, so that the probabilities would be well calibrated. I tuned my hyperparameters using…

xgboost machine-learning-model optimization objective-function

asked Apr 06 '20 at 07:27

Xaume

212
3
14
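
For readers unfamiliar with the metric named in the excerpt above, the Brier score for binary outcomes is the mean squared difference between the predicted probability and the 0/1 outcome. The sketch below only illustrates the metric on made-up numbers; it does not reproduce the asker's XGBoost setup, and the function name is an assumption.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    Lower is better; 0.0 means every probability matched its outcome exactly.
    """
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Made-up outcomes and two hypothetical sets of predicted probabilities.
y_true = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]  # close to the true outcomes
hedged = [0.5, 0.5, 0.5, 0.5]     # uninformative predictions

score_confident = brier_score(y_true, confident)
score_hedged = brier_score(y_true, hedged)
```

In this toy case the confident, mostly correct probabilities score about 0.025, while always predicting 0.5 scores 0.25. scikit-learn provides the same metric as `sklearn.metrics.brier_score_loss`.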

10

votes

3 answers

Chi-square as evaluation metrics for nonlinear machine learning regression models

I am using machine learning models to predict an ordinal variable (values: 1,2,3,4, and 5) using 7 different features. I posed this as a regression problem, so the final outputs of a model are continuous variables. So an evaluation box plot looks…

machine-learning-model model-evaluations metric

asked Aug 06 '18 at 18:08

Alex

201
1
3

9

votes

2 answers

How to Use SHAP Kernel Explainer with Pipeline models?

I have a pandas DataFrame X. I would like to find the prediction explanation of a particular model. My model is given below: pipeline = Pipeline(steps= [ ('imputer', imputer_function()), ('classifier', RandomForestClassifier() …

machine-learning machine-learning-model data-science-model ipython

asked May 23 '19 at 14:57

Nayana Madhu

436
1
3
8

9

votes

3 answers

Encoding before vs after train test split?

I am new to ML and working on a dataset with a lot of high-cardinality categorical variables. I observed that in a lot of tutorials for encoding, like here, the encoding is applied after the train-test split. Can I check why it is done so? Why…

machine-learning deep-learning neural-network classification machine-learning-model

asked Feb 01 '22 at 07:50

The Great

2,725
3
23
49
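
One common reading of the practice described in the excerpt above is that the encoder should learn its categories from the training split only, so that nothing about the test split leaks into fitting. The sketch below illustrates that pattern with a hand-rolled ordinal encoder; the function names, the `unknown=-1` convention, and the data are all assumptions made for the example, not taken from the question or its answers.

```python
def fit_category_index(train_values):
    """Learn a category -> integer mapping from the training values only."""
    mapping = {}
    for value in train_values:
        if value not in mapping:
            mapping[value] = len(mapping)
    return mapping


def encode(values, mapping, unknown=-1):
    """Apply a fitted mapping; categories unseen during fitting get `unknown`."""
    return [mapping.get(value, unknown) for value in values]


# Made-up data, already split into train and test.
train = ["red", "green", "red", "blue"]
test = ["green", "purple"]  # "purple" never appears in the training split

mapping = fit_category_index(train)      # fitted on the training split only
train_encoded = encode(train, mapping)
test_encoded = encode(test, mapping)     # reuses the training mapping
```

Here `train_encoded` is `[0, 1, 0, 2]` and `test_encoded` is `[1, -1]`: the unseen category is handled explicitly rather than silently shaping the encoding, which is how new categories would also arrive at prediction time.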

8

votes

1 answer

How could I estimate slope of lines on a scatter plot?

I have a list of coordinate pairs. To the human eye, they form lines with a constant slope: This is how I generated that image above: import numpy as np np.random.seed(42) slope = 1.2 # all lines have the same slope offsets = np.arange(10) # we…

machine-learning tensorflow regression machine-learning-model

asked Dec 29 '21 at 16:09

zabop

235
2
8

7

votes

3 answers

Alternatives with better GPU than Google Colab Pro

I am currently training machine learning models that are very GPU-intensive, and Google Colab Pro is not giving me enough GPU/RAM. Are there any alternatives with a better GPU and more RAM than Google Colab Pro?

machine-learning machine-learning-model training gpu colab

asked May 05 '21 at 15:50

The Dan

221
1
2
8

7

votes

6 answers

Is it advisable to combine two datasets?

I have two datasets on the heart rate of subjects that were recorded in two different places (two different continents, to be exact). The two research experiments aimed to find the subjects' emotions based on how much their heart rate changes over time. I…

data machine-learning-model

asked Sep 30 '18 at 16:43

Lapatrie

145
2
9

7

votes

1 answer

Machine learning model for ranking that outputs probabilities

Traditionally, ML algorithms for ranking take the features as input and then output a "ranking score" which does not have a natural probabilistic interpretation. For example, suppose we have three laptops: "macbookAir", "macbookPro", "msSurface", and a…

deep-learning machine-learning-model xgboost ranking learning-to-rank

asked Feb 12 '25 at 13:48

Ishigami

173
5

6

votes

3 answers

Which models can handle null values?

Unfortunately trying to google or research null values in machine learning always brings up pages trying to teach you how to impute the values instead, but I'm trying to find models that can handle null values as input. The only one I've found…

decision-trees machine-learning-model gradient-descent

asked Jan 28 '20 at 19:41

user1777900

171
1
2
