Questions tagged [self-study]

55 questions
16
votes
3 answers

How to self-learn data science?

I am a self-taught web developer and am interested in teaching myself data science, but I'm unsure of how to begin. In particular, I'm wondering: What fields are there within data science? (e.g., Artificial Intelligence, machine learning, data…
xyhhx
  • 263
  • 2
  • 6
12
votes
1 answer

Gradient Boosting Tree: "the more variables the better"?

From the XGBoost tutorial, I understand that as each tree grows, all the variables are scanned as candidates for splitting a node, and the split with the maximum gain is chosen. So my question is: what if I add some noise variables into the…
WCMC
  • 465
  • 1
  • 5
  • 11
9
votes
2 answers

Relationship between VC dimension and degrees of freedom

I'm studying machine learning and I feel there is a strong relationship between the concept of VC dimension and the more classical (statistical) concept of degrees of freedom. Can anyone explain such a connection?
stochazesthai
  • 543
  • 4
  • 5
8
votes
2 answers

What is init_score in lightGBM?

In the tutorial "boosting from existing prediction" in lightGBM R, there is an init_score parameter in the function setinfo. I am wondering what init_score means. The help page says: init_score: initial score is the base prediction lightgbm will…
WCMC
  • 465
  • 1
  • 5
  • 11
7
votes
4 answers

What are some good books on Machine Learning and AI like Krugman, Wells and Graddy's "Essentials of Economics"?

I am a Logistics student. I like the book "Essentials of Economics" by Krugman, Wells and Graddy in that it is concise, easygoing and not only a beginner's book (it gradually approaches advanced subjects, thus paving the way for further rigorous…
user36339
  • 131
  • 1
  • 5
7
votes
5 answers

Does ensembling (bagging, boosting, stacking, etc.) always at least increase performance?

Ensembling is getting more and more popular. I understand that there are, in general, three main families of ensembling: bagging, boosting and stacking. My question is: does ensembling always at least increase performance in practice? I…
WCMC
  • 465
  • 1
  • 5
  • 11
6
votes
2 answers

How do I test the difference between two proportions representing the fatality rate for COVID-19 in the Philippines and the world (excluding the Philippines)?

I'm trying to analyse whether the fatality rate in my country (a third world country) differs significantly from the world's fatality rate. So I'd basically have two samples, labeled (Philippines) and (World excluding the Philippines), then I can compute…
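One common way to compare two such rates is a pooled two-proportion z-test. The following is a minimal sketch under that assumption, not necessarily the approach the answers recommend; the function name `two_proportion_z_test` and the counts in the usage lines are hypothetical placeholders, not real COVID-19 figures.

```python
import math

def two_proportion_z_test(deaths_a, cases_a, deaths_b, cases_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = deaths_a / cases_a
    p_b = deaths_b / cases_b
    # Pooled proportion under the null hypothesis that both rates are equal
    pooled = (deaths_a + deaths_b) / (cases_a + cases_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / cases_a + 1 / cases_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts, for illustration only
z, p = two_proportion_z_test(50, 1000, 30, 1000)
print(f"z = {z:.3f}, p = {p:.4f}")
```

With very large case counts, almost any difference comes out statistically significant, so the size of the difference in rates is worth reporting alongside the p-value.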
6
votes
4 answers

What courses / subjects are most important to the field of Data Science?

I've taken it upon myself to begin a career change. I have a decent background in mathematics, but lack programming and data-science-specific skills (such as data munging). I have been looking through data science curricula and already feel…
TheRealFakeNews
  • 161
  • 1
  • 1
  • 2
5
votes
1 answer

Early stopping and bounds

Say I am training a neural network using a training set and set aside a validation set $V$. I obtain a model $h_i$ after each epoch, along with the validation losses (0-1 loss) $\hat{L}(h_1,V)$, $\hat{L}(h_2,V)$, ... If I use the early stopping rule suggested…
ChuckP
  • 153
  • 2
4
votes
2 answers

Depth of a Neural network

I am teaching myself. I understand why the depth of a neural network affects learning and how it differs from width, but I am looking for some theoretical justification for it. Papers I could come up with, e.g., Benefits of depth…
ARAT
  • 273
  • 4
  • 13
4
votes
2 answers

How to make sense of confusion matrix

Consider a binary classification problem with label 0 denoting normal and label 1 denoting abnormal or rare. There are more instances of class 0 than of class 1. In general, 1) Does 0 always refer to positive or negative depending on what we…
Srishti M
  • 471
  • 4
  • 9
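To make the positive/negative question concrete, here is a minimal sketch that counts the four confusion-matrix cells for a chosen positive label. Treating label 1 (abnormal/rare) as the positive class is a convention assumed here, not something the question settles; the helper `confusion_counts` and the toy labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN, treating `positive` as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if truth == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}

# Toy labels: 0 = normal (majority), 1 = abnormal (rare)
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 1, 0, 1, 0]
print(confusion_counts(y_true, y_pred, positive=1))
print(confusion_counts(y_true, y_pred, positive=0))
```

Switching `positive` from 1 to 0 swaps TP with TN and FP with FN, which is why the choice of positive class has to be stated before reading precision or recall off the matrix.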
3
votes
1 answer

Resources on on-line machine learning

I am wondering if there are any books/articles/tutorials about "on-line machine learning"? For example, this website has nice lecture notes (from lec16) on some of the aspects: https://web.eecs.umich.edu/~jabernet/eecs598course/fall2015/web/ or this…
Slim Shady
  • 137
  • 5
2
votes
1 answer

Studying and choosing between different neural network structures

I would like to develop a model that uses convolutional neural networks for image classification. From the many different network structures described in papers and articles online, I would like to choose, as a starting point, the one that better…
2
votes
0 answers

Binary classification with unexplained data

My apologies for cross-posting to Stack Overflow and Cross Validated; I'm not sure which is the more relevant place. Please shed some light on this task. Description: Assuming the training data looks like 0 |…
Barabbas
  • 29
  • 4
2
votes
0 answers

Four parameter self-starting function based on SSfpl

I am currently working with a self-starting function for four parameters which I based on SSfpl but with a different formula. This is the formula for my self-starting function: (b1 * ((b2 * x)^b4)) / (1 + ((b2 * x)^b4))^(b3 / b4) The code below is…
HYDR0GEN
  • 21
  • 2
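For reference, the formula quoted in the SSfpl question above can be written as a plain function. This is only a sketch of the curve itself in Python, not the asker's R self-starting code (which is elided in the excerpt); the name `four_param_curve` is hypothetical.

```python
def four_param_curve(x, b1, b2, b3, b4):
    """Evaluate (b1 * (b2*x)^b4) / (1 + (b2*x)^b4)^(b3 / b4)."""
    t = (b2 * x) ** b4  # shared term (b2 * x)^b4
    return (b1 * t) / (1 + t) ** (b3 / b4)

# With b1=2 and b2=b3=b4=1 this reduces to 2x / (1 + x)
print(four_param_curve(1.0, 2.0, 1.0, 1.0, 1.0))
```

Writing the shared term once makes it easier to check the expression against whatever initial-value routine the self-starting model uses.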