Chapter 30 Linear Regression
30.1 Linear Regression in R
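#load the MASS library, which provides the function mvrnorm
library(MASS)
#mvrnorm draws samples from a multivariate normal distribution; you specify the means (mu) and the covariance matrix (Sigma), here variances of 100 and a covariance of 85
xy <- mvrnorm(10, mu=c(180,180), Sigma=matrix(c(100,85,85,100),2))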
#check the sample variances and covariance of the simulated data
var(xy)
##          [,1]     [,2]
## [1,] 170.6260 174.7535
## [2,] 174.7535 204.7070
x <- xy[,1]
y <- xy[,2]
#draw a scatter plot of the points and add the least-squares line
plot(x,y, pch=16, xlab="mid-parental height (cm)", ylab="child's height (cm)")
abline(lm(y~x), col="red", lwd=4)
#add lines depicting the residuals
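#fit the model once and store the fitted values (named y_hat to avoid masking the function fitted)
fit <- lm(y~x)
y_hat <- predict(fit)
for(i in seq_along(x)) {lines(c(x[i],x[i]),c(y[i],y_hat[i]))}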
#print the model summary
summary(lm(y~x))
## Call:
## lm(formula = y ~ x)
## Residuals:
##    Min     1Q Median     3Q    Max
## -5.491 -2.882 -1.077  1.151 10.494
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -2.4346    24.8950  -0.098    0.925
## x             1.0242     0.1373   7.460 7.19e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Residual standard error: 5.38 on 8 degrees of freedom
## Multiple R-squared:  0.8743, Adjusted R-squared:  0.8586
## F-statistic: 55.66 on 1 and 8 DF,  p-value: 7.193e-05
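The slope and R-squared reported above can be traced back to the 2x2 matrix printed after the simulation step. Its entries match the sample variances and covariance of x and y: 174.7535 / 170.6260 ≈ 1.0242 is the slope, and 174.7535^2 / (170.6260 × 204.7070) ≈ 0.8743 is the R-squared. The sketch below checks this relationship on freshly simulated data. The seed and the variable names (S, slope, intercept, r_squared) are assumptions added for illustration, so the numbers will differ from the output shown above.

```r
library(MASS)

# Assumption: a fixed seed so the sketch is reproducible; the original run used no stated seed
set.seed(1)
xy <- mvrnorm(10, mu = c(180, 180), Sigma = matrix(c(100, 85, 85, 100), 2))
x <- xy[, 1]
y <- xy[, 2]

# Sample covariance matrix: variances on the diagonal, covariance off the diagonal
S <- var(xy)

# Least-squares estimates for simple linear regression
slope <- S[1, 2] / S[1, 1]                    # cov(x, y) / var(x)
intercept <- mean(y) - slope * mean(x)        # the line passes through (mean(x), mean(y))
r_squared <- S[1, 2]^2 / (S[1, 1] * S[2, 2])  # squared sample correlation

# Compare with the estimates from lm
fit <- lm(y ~ x)
print(c(intercept = intercept, slope = slope))
print(coef(fit))
print(c(by_hand = r_squared, from_lm = summary(fit)$r.squared))
```

The hand-computed values should agree with coef(fit) and summary(fit)$r.squared up to floating-point error.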
30.2 Diagnostics
Often we want to run some diagnostics on the model to assess whether our assumptions are justified, e.g. whether the residuals are normally distributed, or whether specific points have undue influence on the fit. R has built-in functions for this: simply call plot(lm(y~x)). This produces four diagnostic plots: residuals vs fitted values, a normal Q-Q plot of the residuals, a scale-location plot, and residuals vs leverage.
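A minimal sketch of how these diagnostics might be run in practice. It re-simulates the data with an assumed seed so that it is self-contained, arranges the four plots in a 2x2 grid, and adds two numerical checks that complement the plots. The Shapiro-Wilk test and the 4/n cutoff for Cook's distance are common conventions chosen here for illustration, not part of the original text.

```r
library(MASS)

# Assumption: a fixed seed so the sketch is reproducible
set.seed(1)
xy <- mvrnorm(10, mu = c(180, 180), Sigma = matrix(c(100, 85, 85, 100), 2))
x <- xy[, 1]
y <- xy[, 2]

fit <- lm(y ~ x)

# Show the four diagnostic plots on one page, then restore the old layout
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)

# Numerical check of normality of the residuals (Shapiro-Wilk test)
res <- residuals(fit)
print(shapiro.test(res))

# Cook's distance measures the influence of each point on the fit;
# 4/n is a commonly used rule-of-thumb cutoff, not a strict criterion
cd <- cooks.distance(fit)
print(which(cd > 4 / length(cd)))
```

With only 10 observations these checks have little power, so the plots are usually more informative than the test result.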