
I've fitted a logistic regression in R to combine two independent variables (I'm also using the pROC package), and I obtained this:

> summary(fit)

Call: glm(formula = Case ~ X + Y, family = "binomial", data = data)

Deviance Residuals: 
  Min       1Q     Median     3Q      Max  
-1.5751  -0.8277  -0.6095   1.0701   2.3080  

Coefficients:
             Estimate  Std. Error z value Pr(>|z|)    
(Intercept) -0.153731   0.538511  -0.285 0.775281    
X           -0.048843   0.012856  -3.799 0.000145 ***
Y            0.028364   0.009077   3.125 0.001780 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 287.44  on 241  degrees of freedom
Residual deviance: 260.34  on 239  degrees of freedom
AIC: 266.34

Number of Fisher Scoring iterations: 4

> fit

Call:  glm(formula = Case ~ X + Y, family = "binomial", data = data)

Coefficients:
  (Intercept)       X            Y  
   -0.15373     -0.04884      0.02836  

Degrees of Freedom: 241 Total (i.e. Null);  239 Residual
Null Deviance:      287.4 
Residual Deviance:  260.3        AIC: 266.3

Now I need to extract some information from this model, and I'm not sure how to do it. First, I need the model equation: if the fitted model defines a combined predictor called CP, would it be CP = -0.15 - 0.05X + 0.03Y?

Second, the combined predictor from the regression should have a median value, so that I can compare its median between the two groups, Cases and Controls, that I used to fit the regression. (In other words, my X and Y variables each have N = N1 + N2 observations, where N1 is the number of Controls, for which Case = 0, and N2 is the number of Cases, for which Case = 1.)

Aleksandr Blekh
Ciochi

1 Answer


To extract data from a fitted glm model object, you first need to find out where that data resides; the documentation (?glm, ?summary.glm) and str() help with that. Some data is available from the summary.glm object, while more detailed data is available from the glm object itself. To extract the model parameters, you can use the coef() function or access the object's structure directly (for example, fit$coefficients).
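As a minimal sketch of the above: the asker's data is not shown, so the data frame below is simulated and purely hypothetical (its column names simply mirror the question), and the printed coefficients will differ from those in the question. The sketch inspects the fitted object, pulls out the coefficients, and prints the model equation on the logit (log-odds) scale, which is the scale on which an expression like CP = b0 + b1*X + b2*Y holds.

```r
# Hypothetical stand-in for the asker's `data` (the real data is not shown).
set.seed(1)
n <- 242
data <- data.frame(X = rnorm(n, mean = 50, sd = 20),
                   Y = rnorm(n, mean = 60, sd = 30))
data$Case <- rbinom(n, size = 1,
                    prob = plogis(-0.15 - 0.05 * data$X + 0.03 * data$Y))

fit <- glm(Case ~ X + Y, family = "binomial", data = data)

# See which components the glm object holds.
str(fit, max.level = 1)

# Coefficients as a named numeric vector: (Intercept), X, Y.
b <- coef(fit)

# Full coefficient table (estimate, std. error, z value, p-value)
# lives in the summary.glm object.
coef(summary(fit))

# Model equation for the linear predictor (log-odds of Case = 1).
cat(sprintf("CP = %.3f %+.3f*X %+.3f*Y\n",
            b["(Intercept)"], b["X"], b["Y"]))
```

Using the full-precision values from coef() rather than the rounded ones from the printed summary avoids rounding error when the equation is applied to new values of X and Y.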

UPDATE:

From the GLM section of Princeton's* introduction to R course website (see it for details and examples):

The functions that can be used to extract results from the fit include

- 'residuals' or 'resid', for the deviance residuals
- 'fitted' or 'fitted.values', for the fitted values (estimated probabilities)
- 'predict', for the linear predictor (estimated logits)
- 'coef' or 'coefficients', for the coefficients, and
- 'deviance', for the deviance. 

Some of these functions have optional arguments; for example, you can extract five different types of residuals, called "deviance", "pearson", "response" (response - fitted value), "working" (the working dependent variable in the IRLS algorithm - linear predictor), and "partial" (a matrix of working residuals formed by omitting each term in the model). You specify the one you want using the type argument, for example residuals(lrfit,type="pearson").

*) More accurately, this website is by Germán Rodríguez from Princeton University.
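Applied to the question, the extractor functions listed above could be used roughly as follows. This is only a sketch under assumptions: the data is again simulated (the asker's data is not available), and the combined predictor is taken to be the linear predictor returned by predict(). It checks that predict() agrees with the hand-written equation and then computes the median of the combined predictor separately for Controls (Case = 0) and Cases (Case = 1).

```r
# Hypothetical stand-in for the asker's `data` (the real data is not shown).
set.seed(1)
n <- 242
data <- data.frame(X = rnorm(n, mean = 50, sd = 20),
                   Y = rnorm(n, mean = 60, sd = 30))
data$Case <- rbinom(n, size = 1,
                    prob = plogis(-0.15 - 0.05 * data$X + 0.03 * data$Y))

fit <- glm(Case ~ X + Y, family = "binomial", data = data)
b <- coef(fit)

# Combined predictor on the logit scale (one value per observation).
lp <- predict(fit, type = "link")

# The same quantity computed by hand from the coefficients.
manual <- b[1] + b[2] * data$X + b[3] * data$Y
same_lp <- isTRUE(all.equal(as.numeric(lp), manual))

# Estimated probabilities; these are the inverse logit of `lp`.
p <- fitted(fit)
same_p <- isTRUE(all.equal(as.numeric(p), as.numeric(plogis(lp))))

# Median of the combined predictor in each group ("0" = Controls, "1" = Cases).
med <- tapply(lp, data$Case, median)
print(med)
```

If the medians are wanted on the probability scale instead, tapply(p, data$Case, median) gives the analogous summary of the fitted probabilities.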

Aleksandr Blekh