
Currently, I am studying the third chapter of An Introduction to Statistical Learning with Applications in R, which discusses linear regression. In Section 3.6.5: Non-linear Transformations of the Predictors, the poly() function is used to create a polynomial regression model. After that, the authors write:

By default, the poly() function orthogonalizes the predictors: this means that the features output by this function are not simply a sequence of powers of the argument. However, a linear model applied to the output of the poly() function will have the same fitted values as a linear model applied to the raw polynomials (although the coefficient estimates, standard errors, and p-values will differ). In order to obtain the raw polynomials from the poly() function, the argument raw = TRUE must be used.

Here I can't understand what the authors meant by "a linear model applied to the output of the poly() function ...".

Can anyone please help me?

Peter

1 Answer


Orthogonal polynomials are different from "normal" (raw) polynomials. Thus the regression output (coefficients, standard errors, etc.) will be different.

library(ISLR)

df = ISLR::Auto

reg1 = lm(mpg ~ poly(weight, 3, raw = F), data = df)
summary(reg1)

reg2 = lm(mpg ~ poly(weight, 3, raw = T), data = df)
summary(reg2)

For reg1 the result is:

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                 23.4459     0.2112 111.008  < 2e-16 ***
poly(weight, 3, raw = F)1 -128.4436     4.1817 -30.716  < 2e-16 ***
poly(weight, 3, raw = F)2   23.1589     4.1817   5.538 5.65e-08 ***
poly(weight, 3, raw = F)3    0.2204     4.1817   0.053    0.958   

For reg2 the result is:

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)                6.170e+01  1.104e+01   5.587 4.36e-08 ***
poly(weight, 3, raw = T)1 -1.793e-02  1.091e-02  -1.643    0.101    
poly(weight, 3, raw = T)2  1.515e-06  3.450e-06   0.439    0.661    
poly(weight, 3, raw = T)3  1.846e-11  3.503e-10   0.053    0.958 

However, the predicted (aka "fitted") values from both models will be the same:

predict(reg1, newdata = df)[1:3]
       1        2        3 
18.26982 17.07799 18.72854 

predict(reg2, newdata = df)[1:3]
       1        2        3 
18.26982 17.07799 18.72854
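The equality of the fitted values can also be checked directly. Here is a minimal, self-contained sketch using the built-in mtcars data set (chosen here only so that no ISLR install is needed): because the orthogonal and raw polynomial columns span the same column space, the two fits agree to numerical precision.

```r
# Orthogonal vs. raw cubic polynomial in weight (wt), built-in mtcars data
fit_orth <- lm(mpg ~ poly(wt, 3), data = mtcars)              # orthogonal basis
fit_raw  <- lm(mpg ~ poly(wt, 3, raw = TRUE), data = mtcars)  # raw powers wt, wt^2, wt^3
all.equal(fitted(fit_orth), fitted(fit_raw))                  # TRUE
```

The individual coefficients of the two models differ, but the fitted values (and residuals, R-squared, overall F-test) are identical.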

You can also look at the actual numbers behind the polynomials:

poly(df$weight, 3, raw = F)[1:3]
[1] 0.03134202 0.04259480 0.02729340

poly(df$weight, 3, raw = T)[1:3]
[1] 3504 3693 3436

The poly() function returns "normal" (raw) polynomials when raw = T. But since these columns are correlated (not orthogonal), and since they are simply different numbers than the orthogonal polynomials, the estimated coefficients, standard errors, etc. in the two models above differ, even though the fitted values are identical.
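You can verify this orthogonality claim directly. A small sketch with made-up inputs 1:20 (any numeric vector would do): the cross-product of the orthogonal columns is essentially the identity matrix, while the raw columns are strongly correlated with each other.

```r
x <- 1:20
P_orth <- poly(x, 3)             # orthogonal, unit-length columns
round(crossprod(P_orth), 10)     # t(P) %*% P is (numerically) the 3x3 identity
P_raw <- poly(x, 3, raw = TRUE)  # columns are x, x^2, x^3
round(cor(P_raw), 3)             # off-diagonal correlations close to 1
```

The near-perfect correlation among the raw columns is also why their coefficient estimates and p-values look so different from the orthogonal version: each raw term's t-test is conditional on highly collinear companions.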

poly(c(1,2,3,4),degree = 2,raw=T)
     1  2
[1,] 1  1
[2,] 2  4
[3,] 3  9
[4,] 4 16
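For comparison, the orthogonal version of the same small example has centered, mutually orthogonal, unit-length columns (the exact values come from poly()'s internal QR-based construction, so only their properties are checked here rather than the printed numbers):

```r
P <- poly(c(1, 2, 3, 4), degree = 2)  # orthogonal version of the same inputs
colSums(P)                            # each column sums to (numerically) zero
zapsmall(crossprod(P))                # 2x2 identity: orthogonal, unit-length columns
```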
Peter