Regression Models: Teachers & Class Size / Caffeine & Exercise

test	csize	tqual	ses
504	38	3.793	0.7947
651.3	23	5.277	2.301
623.6	42	6.084	0.4609
539	21	2.654	2.635
673.1	20	5.429	2.703
584.4	46	5.506	0.09412

Data

test - average classroom score on a standardized test
csize - classroom size; number of students
tqual - teacher quality evaluated by third-party experts on a scale of 1 to 100
ses - socioeconomic status of students measured as median household income
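As a quick way to get a feel for these variables, the six rows shown above can be typed into a small data frame. This is only a sketch: the name `dat6` is hypothetical, and the six rows are a preview, not the full `dat` used in the rest of the analysis.

```r
# Hypothetical mini data frame holding only the six preview rows shown above
dat6 <- data.frame(
  test  = c(504, 651.3, 623.6, 539, 673.1, 584.4),
  csize = c(38, 23, 42, 21, 20, 46),
  tqual = c(3.793, 5.277, 6.084, 2.654, 5.429, 5.506),
  ses   = c(0.7947, 2.301, 0.4609, 2.635, 2.703, 0.09412)
)

# Pairwise correlations among the four variables (six rows only)
cor.mat <- round( cor(dat6), 2 )
print( cor.mat )
```

Even in these few rows, class size and SES move in opposite directions, which matters later when SES is used as a control.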

Scatterplot

Class Size (x-axis) and Test Score (y-axis). This will serve as a visual representation of our baseline model of the relationship between Class Size and Test Score.

#plot( dat$csize, dat$test, 
#     xlab="Class Size", ylab="Test Scores",
#     main="Relationship Between Class Size and Test Scores" )
#     abline( lm(test~csize,data=dat), col="firebrick", lwd=2 )

# CHANGE THE PLOT STYLE: 
#
# pch=19             # change point style
# cex=1.5            # change point size
# col="firebrick"    # change color *points?*
# bty="n"            # remove bounding box


# Also changed the abline to black so that it is visible;
# made the others black instead.

plot( dat$csize, dat$test, col="firebrick", pch=19, bty="n", cex=1.5, xlab="Class Size", ylab="Test Scores",main="Relationship Between Class Size and Test Scores" )
abline( lm(test~csize,data=dat), col="black", lwd=2 )

Regression & Plot of Class Size & Test Scores

\(test = b_0 + b_1 \cdot tqual + e_1\)

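# Regress test scores on teacher quality
model.01 <- lm( test ~ tqual, data=dat )
# Residuals: the part of test scores not explained by teacher quality
e1.test.score <- model.01$residuals
# Plot the residuals against class size and add the fitted line
plot( dat$csize, e1.test.score )
abline( lm(e1.test.score~dat$csize), col="firebrick", lwd=2 )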

The relationship between class size and test scores becomes much stronger here, because regressing test scores on teacher quality removes the variance that teacher quality explains and leaves less unrelated noise in the residuals.
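The residual approach used here is closely related to the partialling-out (Frisch-Waugh-Lovell) result: if both the outcome and the policy variable are residualized on the control, the slope between the two sets of residuals equals the slope from the multiple regression. The sketch below illustrates this on simulated data; the data frame `sim`, its coefficients, and the sample size are all assumptions made for illustration, not values from `dat`.

```r
# Simulated data (assumed values, for illustration only)
set.seed(1)
n     <- 200
tqual <- rnorm( n, mean=50, sd=10 )
csize <- 40 - 0.2*tqual + rnorm( n, mean=0, sd=5 )  # mildly related to tqual
test  <- 600 - 2*csize + 3*tqual + rnorm( n, mean=0, sd=15 )
sim   <- data.frame( test, csize, tqual )

# Residualize both the outcome and the policy variable on the control
e.test  <- residuals( lm( test  ~ tqual, data=sim ) )
e.csize <- residuals( lm( csize ~ tqual, data=sim ) )

# Slope from the residual-on-residual regression
b.partial <- coef( lm( e.test ~ e.csize ) )[[ "e.csize" ]]

# Slope on csize from the multiple regression with the control included
b.full <- coef( lm( test ~ csize + tqual, data=sim ) )[[ "csize" ]]

# The two slopes agree up to numerical precision
print( all.equal( b.partial, b.full ) )

# The residualized outcome has less spread than the raw outcome
print( c( raw=sd(sim$test), residual=sd(e.test) ) )
```

Plotting the residuals against raw class size, as done above, gives the same slope only when class size and the control are uncorrelated in the sample; residualizing both sides handles the general case.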

Regression of Test Scores on SES

\(test = b_0 + b_1 \cdot ses + e_2\)

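# Regress test scores on socioeconomic status
model.02 <- lm( test ~ ses, data=dat )
# Residuals: the part of test scores not explained by SES
e2.test.score <- model.02$residuals
# Plot the residuals against class size and add the fitted line
plot( dat$csize, e2.test.score )
abline( lm(e2.test.score~dat$csize), col="firebrick", lwd=2 )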

The relationship disappears here because class size and SES are highly correlated: once the variance in test scores that SES explains is removed, the residuals have little left to share with class size, and the slope falls flat.

Analysis

These graphs demonstrate the effects of adding the control variables Teacher Quality and SES to the baseline model of the relationship between Class Size and Test Score. Conceptually, a control variable improves a model by accounting for variance in the dependent variable, which makes the estimate either more precise (smaller standard errors) or less biased (an adjusted slope on the policy variable).
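The two roles of a control variable can be seen side by side in a small simulation. This is a sketch under stated assumptions: the variables `x`, `z1`, `z2`, the true slopes, and the sample size are invented for illustration and are unrelated to `dat`.

```r
# Simulated data (assumed values, for illustration only)
set.seed(42)
n  <- 1000
x  <- rnorm( n )                      # policy variable
z1 <- rnorm( n )                      # control unrelated to x
z2 <- 0.8*x + rnorm( n, sd=0.6 )      # control correlated with x

y.a <- 1 + 2*x + 3*z1 + rnorm( n )    # outcome driven by x and z1
y.b <- 1 + 2*x + 3*z2 + rnorm( n )    # outcome driven by x and z2

# Helper: slope and standard error on x from a fitted model
get.x <- function( m ) summary( m )$coefficients[ "x", c("Estimate","Std. Error") ]

# Case A: control uncorrelated with x -> similar slope, smaller standard error
a.naive <- get.x( lm( y.a ~ x ) )
a.ctrl  <- get.x( lm( y.a ~ x + z1 ) )

# Case B: control correlated with x -> the slope itself is corrected
b.naive <- get.x( lm( y.b ~ x ) )
b.ctrl  <- get.x( lm( y.b ~ x + z2 ) )

print( rbind( a.naive, a.ctrl, b.naive, b.ctrl ) )
```

In case A the slope on `x` stays near its true value of 2 in both models while the standard error shrinks; in case B the naive slope absorbs part of the effect of `z2` and moves back toward 2 once `z2` is included.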

Teacher Quality has reduced part of the unexplained variance in test scores in our main model.

Socioeconomic Status has removed the portion of the variance that class size appeared to explain, as the flat slope in our second graph shows.

Caffeine and Exercise

To change the slope on caffeine in this model, the control variable selected would need to be correlated with caffeine consumption. The stress index variable is the most closely correlated with caffeine consumption, so adding it should pull the caffeine slope toward zero. If the two variables were uncorrelated, the slope would not change.
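The reasoning above can be written out with the standard omitted-variable identity for ordinary least squares. The symbols \(a_1\), \(b_1\), \(b_2\), and \(d_1\) are notation introduced here for illustration, and the sketch considers only caffeine and the stress index, leaving any other regressors aside for simplicity.

```latex
\[
\begin{aligned}
\text{heart.rate}   &= a_0 + a_1 \cdot \text{caffeine} + u
  && \text{(model without the control)} \\
\text{heart.rate}   &= b_0 + b_1 \cdot \text{caffeine} + b_2 \cdot \text{stress.index} + e
  && \text{(model with the control)} \\
\text{stress.index} &= d_0 + d_1 \cdot \text{caffeine} + v
  && \text{(control regressed on caffeine)} \\[4pt]
a_1 &= b_1 + b_2 \cdot d_1
\end{aligned}
\]
```

When the control is uncorrelated with caffeine, \(d_1 = 0\) and the two slopes coincide, \(a_1 = b_1\). The slope changes only when \(d_1 \neq 0\) and the control also matters for the outcome, \(b_2 \neq 0\).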


Dependent variable: heart.rate

Variable	Coefficient	Std. Error
caffeine	0.037	(0.047)
stress.index	0.228	(0.246)
gym.time	-1.440***	(0.062)
Constant	116.022***	(2.982)

Observations	100
R²	0.873
Residual Std. Error	11.016 (df = 96)

Note:	*p<0.1; **p<0.05; ***p<0.01
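The significance stars in the table follow from the ratio of each coefficient to its standard error. As a rough check, the sketch below recomputes the t-statistics and two-sided p-values from the rounded values printed in the table, using the reported 96 residual degrees of freedom; because the inputs are rounded, the results are approximate.

```r
# Coefficients and standard errors copied from the table above (rounded)
est <- c( caffeine=0.037, stress.index=0.228, gym.time=-1.440, Constant=116.022 )
se  <- c( caffeine=0.047, stress.index=0.246, gym.time=0.062,  Constant=2.982 )
df  <- 96    # residual degrees of freedom reported in the table

# t-statistic: estimate divided by its standard error
t.stat <- est / se

# Two-sided p-value from the t distribution
p.val <- 2 * pt( -abs(t.stat), df=df )

print( round( cbind( est, se, t.stat, p.val ), 3 ) )
```

Only gym.time and the constant clear the 0.01 threshold, which matches the stars shown; the caffeine and stress.index estimates are each smaller than their standard errors.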
