Questions tagged [esl]

The Elements of Statistical Learning (ESL) is a popular text on statistical learning written by Trevor Hastie, Robert Tibshirani, and Jerome H. Friedman. It covers data mining, statistical inference, and prediction.

It is available through the Stanford website at:

https://web.stanford.edu/~hastie/Papers/ESLII.pdf

5 questions
16 votes, 5 answers

Beginner math books for Machine Learning

I'm a Computer Science engineer with no background in statistics or advanced math. I'm studying the book Python Machine Learning by Raschka and Mirjalili, but when I tried to understand the math behind machine learning, I wasn't able to understand…
2 votes, 2 answers

Why do decision trees have low accuracy?

It seems to be generally acknowledged that decision trees have low prediction accuracy. Is there a concise explanation for why they have low accuracy? I've read this so much, I've accepted it to be true, but I realize I don't have any intuition as…
2 votes, 2 answers

Prerequisites for Elements of Statistical Learning

I am working through Elements of Statistical Learning, and unfortunately have found great difficulty in following the math. I have taken the standard series of Calculus courses (i.e., up to multi-variable calculus) and linear algebra, and have taken…
1 vote, 1 answer

How to compute Hessian matrix for log-likelihood function for Logistic Regression

I am currently studying the Elements of Statistical Learning book. The following equation is on page 120. It calculates the Hessian matrix for the log-likelihood function as follows \begin{equation} \dfrac{\partial^2…
1 vote, 1 answer

Interpreting the variance of parameter estimates in linear regression

I am reading through ESL and came across equation (3.6), where the variance of the parameter estimates is given as $$Var(\hat{\beta}) = (X^TX)^{-1}{\sigma}^2$$ I can understand the mathematics through which this equation is obtained, but I…