Robust Linear Models

R Statistics

Use lm_robust() to regress housing prices on their predictors and report heteroskedasticity-robust standard errors

Mia Forsline
1/20/2022

Learning Goals

Use the estimatr package to fit linear models with lm_robust(), which reports heteroskedasticity-robust standard errors

Set up

library(estimatr)
library(kableExtra)
library(tidyverse)
library(modelsummary)

HPRICE2 <- read.csv("HPRICE2.csv")

kbl(head(HPRICE2)) %>%
  kable_classic(full_width = FALSE, html_font = "Cambria")
price  nox  rooms  stratio
24000    5      7       15
21599    5      6       18
34700    5      7       18
33400    5      7       19
36199    5      7       19
28701    5      6       19

Use lm_robust() to run a bivariate regression with heteroskedasticity-robust standard errors

model1 <- lm_robust(formula = price ~ nox, data = HPRICE2)
summary(model1)

Call:
lm_robust(formula = price ~ nox, data = HPRICE2)

Standard error type:  HC2 

Coefficients:
            Estimate Std. Error t value   Pr(>|t|) CI Lower CI Upper  DF
(Intercept)    39232     1451.2   27.03 3.858e-100    36381    42083 504
nox            -3060      260.5  -11.75  2.536e-28    -3572    -2548 504

Multiple R-squared:  0.1731 ,   Adjusted R-squared:  0.1715 
F-statistic: 137.9 on 1 and 504 DF,  p-value: < 2.2e-16
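
The output above reports the standard error type as HC2. As a rough illustration of what that label means, the sketch below computes HC2 standard errors by hand in base R using the sandwich formula (X'X)^-1 X' diag(e_i^2 / (1 - h_ii)) X (X'X)^-1. The data are simulated for illustration only (they are not the HPRICE2 data), and all variable names here are assumptions rather than anything taken from estimatr's internals.

```r
# Simulated data (illustrative only; not the HPRICE2 data)
set.seed(42)
n <- 200
x <- runif(n, 4, 9)
# The error spread grows with x, so the errors are heteroskedastic
y <- 40000 - 3000 * x + rnorm(n, sd = 500 * x)

X <- cbind(1, x)                       # design matrix with an intercept column
XtX_inv <- solve(crossprod(X))         # (X'X)^-1
beta <- XtX_inv %*% crossprod(X, y)    # OLS coefficients
e <- as.vector(y - X %*% beta)         # residuals
h <- rowSums((X %*% XtX_inv) * X)      # leverages h_ii (diagonal of the hat matrix)

# HC2 sandwich: each squared residual is inflated by 1 / (1 - h_ii)
meat <- crossprod(X * (e^2 / (1 - h)), X)
vcov_hc2 <- XtX_inv %*% meat %*% XtX_inv
se_hc2 <- sqrt(diag(vcov_hc2))

# Classical (homoskedastic) standard errors for comparison
se_classical <- sqrt(diag(sum(e^2) / (n - ncol(X)) * XtX_inv))
print(cbind(estimate = as.vector(beta), se_hc2, se_classical))
```

The coefficient estimates are identical to ordinary least squares; only the standard errors change. With real data, lm_robust() handles this calculation, so the hand-rolled version is useful mainly for seeing where the numbers come from.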

Predict the housing price when nox = 7

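# New data: a single observation with nox = 7
predicted_price <- data.frame(nox = 7)
output <- predict(model1, newdata = predicted_price, se.fit = TRUE, interval = "confidence")

# With interval set, output$fit holds the point prediction, lower bound, and upper bound, in that order
avg_price <- round(output$fit[1], digits = 2)
ci_lower <- round(output$fit[2], digits = 2)
ci_upper <- round(output$fit[3], digits = 2)
se <- round(output$se.fit[1], digits = 2)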

If nox = 7, the model predicts an average housing price of approximately $17,812.74, with a 95% confidence interval of $16,709.01 to $18,916.48 and a standard error of $561.79.
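
As a sanity check on the prediction, the arithmetic can be reproduced from the rounded values printed in the summary. The sketch below assumes the confidence interval is the point prediction plus or minus a t critical value (504 degrees of freedom, as reported in the DF column) times the standard error of the fit. Because the printed coefficients are rounded, the results match the reported figures only approximately.

```r
# Rounded coefficients from the model1 summary
b0 <- 39232   # intercept
b1 <- -3060   # slope on nox

# Point prediction at nox = 7
fit <- b0 + b1 * 7
fit  # 17812

# Approximate 95% confidence interval from the reported standard error
se <- 561.79
t_crit <- qt(0.975, df = 504)
ci <- fit + c(-1, 1) * t_crit * se
ci
```

The interval lands within about a dollar of the reported bounds, which is consistent with rounding in the printed coefficients.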

Use lm_robust() to run a multiple regression with heteroskedasticity-robust standard errors

model2 <- lm_robust(formula = price ~ nox + rooms, data = HPRICE2)
summary(model2)

Call:
lm_robust(formula = price ~ nox + rooms, data = HPRICE2)

Standard error type:  HC2 

Coefficients:
            Estimate Std. Error t value  Pr(>|t|) CI Lower CI Upper  DF
(Intercept)   -16343     4809.5  -3.398 7.327e-04   -25792    -6893 503
nox            -1646      265.1  -6.207 1.133e-09    -2166    -1125 503
rooms           7635      650.4  11.738 2.749e-28     6357     8913 503

Multiple R-squared:  0.5023 ,   Adjusted R-squared:  0.5003 
F-statistic: 150.4 on 2 and 503 DF,  p-value: < 2.2e-16

Predict the housing price when nox = 5 and rooms = 6

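# New data: a single observation with nox = 5 and rooms = 6
predicted_price <- data.frame(nox = 5, rooms = 6)
output <- predict(model2, newdata = predicted_price, se.fit = TRUE, interval = "confidence")

# With interval set, output$fit holds the point prediction, lower bound, and upper bound, in that order
avg_price <- round(output$fit[1], digits = 2)
ci_lower <- round(output$fit[2], digits = 2)
ci_upper <- round(output$fit[3], digits = 2)
se <- round(output$se.fit[1], digits = 2)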

If nox = 5 and the house has 6 rooms, the model predicts an average housing price of approximately $21,238.77, with a 95% confidence interval of $20,648.73 to $21,828.81 and a standard error of $300.32.
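
With more than one predictor, the standard error of a prediction depends on the full coefficient covariance matrix, not just the individual standard errors. A minimal sketch, assuming the standard error of the fit at a point x0 is sqrt(x0' V x0), where V is the robust covariance matrix of the coefficients: the data below are simulated for illustration only (not HPRICE2), V is an HC2 matrix computed by hand, and every name is hypothetical.

```r
# Simulated data (illustrative only; not the HPRICE2 data)
set.seed(1)
n <- 200
x1 <- runif(n, 4, 9)   # stands in for nox
x2 <- runif(n, 4, 8)   # stands in for rooms
y <- -16000 - 1600 * x1 + 7600 * x2 + rnorm(n, sd = 400 * x1)

# OLS fit and HC2 covariance matrix computed by hand
X <- cbind(1, x1, x2)
XtX_inv <- solve(crossprod(X))
beta <- XtX_inv %*% crossprod(X, y)
e <- as.vector(y - X %*% beta)
h <- rowSums((X %*% XtX_inv) * X)
V <- XtX_inv %*% crossprod(X * (e^2 / (1 - h)), X) %*% XtX_inv

# Prediction at x1 = 5, x2 = 6 (the leading 1 is the intercept term)
x0 <- c(1, 5, 6)
fit0 <- sum(x0 * beta)
se0 <- sqrt(as.numeric(t(x0) %*% V %*% x0))
ci0 <- fit0 + c(-1, 1) * qt(0.975, df = n - ncol(X)) * se0
c(fit = fit0, lwr = ci0[1], upr = ci0[2], se = se0)
```

The quadratic form accounts for covariance between the coefficient estimates, which is why the standard error of a prediction cannot be assembled from the coefficient standard errors alone.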

Citation

For attribution, please cite this work as

Forsline (2022, Jan. 20). Mia Forsline: Robust Linear Models. Retrieved from miaforsline.github.io/

BibTeX citation

@misc{forsline_lm,
  author = {Forsline, Mia},
  title = {Mia Forsline: Robust Linear Models},
  url = {miaforsline.github.io/},
  year = {2022}
}