Courtney Brown, Ph.D.

Evaluating Multicollinearity with Multiple Regression in R

Below is computer code written in the R programming language that evaluates multicollinearity in a multiple regression model using TOL (for "tolerance") and VIF (for "variance inflation factor"). The program also runs a ridge regression to test the stability of the estimated parameters in the presence of multicollinearity. It uses the car and MASS packages. Just copy and paste it into R and watch it rip. The data set for this R program (the file panel80.txt that the code reads) can be found HERE.
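As a quick illustration of what TOL and VIF measure, here is a minimal sketch that computes them by hand. It assumes R's built-in mtcars data rather than the panel data used in the program, and the four predictors are an arbitrary choice made only for illustration. Each predictor is regressed on the others; VIF is 1/(1 - R-squared) from that auxiliary regression, and TOL is its reciprocal.

```r
# Hand-computed VIF and TOL on R's built-in mtcars data (illustration only).
predictors <- c("disp", "hp", "wt", "drat") # hypothetical choice of predictors

vif.by.hand <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  # Auxiliary regression: predictor p on all of the remaining predictors.
  aux <- lm(reformulate(others, response = p), data = mtcars)
  1 / (1 - summary(aux)$r.squared)
})
tol.by.hand <- 1 / vif.by.hand # Same as 1 - R-squared of the auxiliary regression.

round(cbind(VIF = vif.by.hand, TOL = tol.by.hand), 3)
```

For a model with only main effects and numeric predictors, these values should agree with what vif() from the car package reports. A large VIF (equivalently, a TOL near zero) means that predictor is nearly a linear combination of the others.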

# First we get our data.
library(car)
mydata <- read.table("panel80.txt")
# attach(mydata) # In case you want to work with the variables directly
names(mydata) # This shows us all the variable names.
# options(scipen=20) # suppress "scientific" notation
options(scipen=0) # Brings things back to normal (0 is the default)
reagan.model <- lm(REAFEEL3 ~ INC + AGE + PARTYID + REPPART3 + INC:AGE + INC*PARTYID*REPPART3, data=mydata)
summary(reagan.model)
layout(matrix(c(1,2,3,4),2,2)) # optional 4 graphs/page
plot(reagan.model) # These are diagnostic plots.
vif(reagan.model) # Variance inflation factors for the model fitted above.
tol <- 1/vif(reagan.model) # Tolerance is the reciprocal of VIF.
tol
mysubsetdata <- subset(mydata, select=c(REAFEEL3, REPPART3, INC, AGE, PARTYID)) # This keeps only the variables that we are using.
cor(mysubsetdata, use = "pairwise.complete.obs") # A correlation matrix for the variables in the regression

dev.new() # Opens a new graphics window on any platform (windows() works only on Windows).

library(MASS)
x <- lm.ridge(REAFEEL3 ~ INC + AGE + PARTYID + REPPART3 + INC:AGE + INC*PARTYID*REPPART3, data=mydata, lambda=seq(0,100,by=1))
plot(x)
title("Ridge Regression")
abline(h=0)
abline(v=50,lty=3)
x # This prints out the values of the ridge estimates as lambda increases.
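
Reading a ridge trace by eye can be supplemented with the lambda suggestions that the MASS package computes. The sketch below is one way to do that, again assuming the built-in mtcars data and an arbitrary set of predictors rather than the panel data above. It also checks that the ridge estimates at lambda = 0 match ordinary least squares.

```r
library(MASS) # provides lm.ridge() and select()

# Illustrative model on built-in data; the predictors are an arbitrary choice.
ols.fit <- lm(mpg ~ disp + hp + wt + drat, data = mtcars)
ridge.fit <- lm.ridge(mpg ~ disp + hp + wt + drat, data = mtcars,
                      lambda = seq(0, 100, by = 1))

# At lambda = 0 the ridge estimates coincide with ordinary least squares.
rbind(OLS = coef(ols.fit), ridge0 = coef(ridge.fit)[1, ])

# Three data-driven suggestions for lambda: HKB, L-W, and minimum GCV.
select(ridge.fit)

# The lambda on the grid with the smallest generalized cross-validation score.
best.lambda <- ridge.fit$lambda[which.min(ridge.fit$GCV)]
best.lambda
```

If the coefficients change sign or size sharply between lambda = 0 and the suggested lambda, that is a hint that multicollinearity is making the least-squares estimates unstable.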