Conditional Logistic Regression | Computational Chemistry Resources

Conditional logistic regression is a form of logistic regression that takes stratification and matching into account, as in matched case-control data.
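
As a rough illustration of what matched data might look like, the sketch below builds a small made-up data frame and checks that every matched set contains at least one case and one control. The values are hypothetical; only the column names (matched_sets, STATUS, VariableA) follow the ones used on this page. Sets in which every member has the same outcome carry no information for the conditional likelihood, so a check like this can be worth running before fitting.

```r
# Hypothetical toy data: three matched sets, each with one case (STATUS = 1)
# and two controls (STATUS = 0). The numbers are invented for illustration.
toy <- data.frame(
  matched_sets = rep(1:3, each = 3),
  STATUS       = rep(c(1, 0, 0), times = 3),
  VariableA    = c(5.1, 4.8, 5.0, 6.2, 6.0, 5.9, 4.4, 4.9, 4.6)
)

# Count cases and controls within each matched set.
cases    <- tapply(toy$STATUS == 1, toy$matched_sets, sum)
controls <- tapply(toy$STATUS == 0, toy$matched_sets, sum)

# A set needs at least one case and one control to be informative.
informative <- cases > 0 & controls > 0
print(data.frame(cases, controls, informative))
```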

One form of conditional logistic regression can be performed by loading the Epi package. An additional form can be performed by loading the survival package.

This example uses the Epi package. Each argument of the clogistic command is named explicitly with an equals sign (formula =, strata =, data =). The data$ prefix is not necessary on the column names because the data frame is passed through the data argument (which was also an option for glm). Conditional logistic regression requires stratified data, and the column that identifies the strata is given in the strata argument (here matched_sets). Unlike before, the results are displayed by entering the name of the fitted object after it has been defined; summary will not work.

> library(Epi)
> data = read.table("/path/to/text/file/with/data", header=TRUE, na.strings = "NA")
> EPI.clogistic <- clogistic(formula = STATUS ~ VariableA + VariableB + VariableC + VariableD, strata = matched_sets, data = data)
> EPI.clogistic

Call:
clogistic(formula = STATUS ~ VariableA + VariableB + VariableC + VariableD,
    strata = matched_sets, data = data)




                  coef exp(coef) se(coef)      z       p
VariableA     -0.01002      0.99  0.03261 -0.307 7.6e-01
VariableB      0.50098      1.65  0.36442  1.375 1.7e-01
VariableC      0.15458      1.17  0.27884  0.554 5.8e-01
VariableD      0.00486      1.00  0.00112  4.343 1.4e-05

Likelihood ratio test=26.5  on 4 df, p=2.51e-05, n=292
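
The p-value on the last line of the output can be checked by hand. The sketch below assumes the usual large-sample approximation, in which the likelihood ratio statistic is compared with a chi-squared distribution whose degrees of freedom equal the number of fitted coefficients. The variable names are illustrative, and because the printed statistic is rounded to 26.5, the result should agree with the reported p-value only approximately.

```r
# Values copied from the clogistic output above.
lr_statistic <- 26.5
deg_free     <- 4

# Upper-tail probability of a chi-squared distribution with 4 df.
p_value <- pchisq(lr_statistic, df = deg_free, lower.tail = FALSE)
print(signif(p_value, 3))
```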

This example uses the survival package. Its clogit function differs from Epi's clogistic in two ways. First, the strata are included in the formula through strata(), as opposed to being passed as a separate argument. Second, the summary function can be used.

> library(survival)
> data = read.table("/path/to/text/file/with/data", header=TRUE, na.strings = "NA")
> survival.clogit <- clogit(formula = STATUS ~ VariableA + VariableB + VariableC + VariableD + strata(matched_sets), data = data)
> summary(survival.clogit)
Call:
coxph(formula = Surv(rep(1, 295L), STATUS) ~ VariableA + VariableB + VariableC +
   VariableD + strata(matched_sets), data = data, method = "exact")

  n= 294, number of events= 93
   (1 observation deleted due to missingness)

                   coef exp(coef)  se(coef)      z Pr(>|z|)
VariableA     -0.010025  0.990025  0.032612 -0.307    0.759
VariableB      0.500983  1.650342  0.364415  1.375    0.169
VariableC      0.154577  1.167164  0.278843  0.554    0.579
VariableD      0.004864  1.004876  0.001120  4.343 1.41e-05 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

          exp(coef) exp(-coef) lower .95 upper .95
VariableA     0.990     1.0101    0.9287     1.055
VariableB     1.650     0.6059    0.8079     3.371
VariableC     1.167     0.8568    0.6757     2.016
VariableD     1.005     0.9951    1.0027     1.007

Rsquare= 0.086   (max possible= 0.494 )
Likelihood ratio test= 26.5   on 4 df,   p=2.511e-05
Wald test            = 21.21  on 4 df,   p=0.0002873
Score (logrank) test = 24.77  on 4 df,   p=5.607e-05
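
The exp(coef), lower .95 and upper .95 columns above are odds ratios with Wald-type 95% confidence intervals, and they can be reproduced from the coef and se(coef) columns. The sketch below copies those numbers from the summary output by hand (the object names are illustrative) and uses only base R. The same arithmetic could be applied to the coef and se(coef) columns printed by clogistic, which does not report intervals.

```r
# Coefficients and standard errors copied from the summary output above.
coefs <- c(VariableA = -0.010025, VariableB = 0.500983,
           VariableC = 0.154577, VariableD = 0.004864)
ses   <- c(VariableA = 0.032612, VariableB = 0.364415,
           VariableC = 0.278843, VariableD = 0.001120)

# Critical value for a two-sided 95% interval (about 1.96).
z_crit <- qnorm(0.975)

# Exponentiate the coefficient and its interval limits to get odds ratios.
odds_ratios <- data.frame(
  OR    = exp(coefs),
  lower = exp(coefs - z_crit * ses),
  upper = exp(coefs + z_crit * ses)
)
print(round(odds_ratios, 4))
```

An interval that excludes 1, as for VariableD here, corresponds to a coefficient that differs from zero at the 5% level.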

The level of significance of each p-value is indicated by the number of asterisks next to it. Three asterisks mean that the p-value for that result lies between 0 and 0.001. Significant results allow the null hypothesis to be rejected, and the significance code specifies whether this is done at the 90% (.), 95% (*), 99% (**), or 99.9% (***) level.
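
The mapping from p-values to significance codes can be sketched in base R with cut, using the cutpoints shown in the "Signif. codes" line of the output. This is an illustration of the rule, not the code that summary itself runs; the p-values are copied from the table above and the object names are illustrative.

```r
# p-values copied from the Pr(>|z|) column of the summary output.
p_values <- c(VariableA = 0.759, VariableB = 0.169,
              VariableC = 0.579, VariableD = 1.41e-05)

# Assign each p-value to an interval and label it with its code.
codes <- cut(p_values,
             breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
             labels = c("***", "**", "*", ".", " "),
             include.lowest = TRUE)

print(data.frame(p = p_values, code = as.character(codes)))
```

Only VariableD receives a code, which agrees with the single row marked *** in the output.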