Use the data set CPS2015, described in Empirical Exercise 8.2, to answer the following questions.

a. Discuss the internal validity of the regressions that you used to answer Empirical Exercise 8.2(l). Include a discussion of possible omitted variable bias, misspecification of the functional form of the regression, errors in variables, sample selection, simultaneous causality, and inconsistency of the OLS standard errors.

b. The data set CPS96_15 described in Empirical Exercise 3.1 includes data from 1996 and 2015. Use these data to investigate the (temporal) external validity of the conclusions that you reached in Empirical Exercise 8.2(l). [Note: Remember to adjust for inflation, as explained in Empirical Exercise 3.1(b).]

a. The internal validity of the regression used to answer Empirical Exercise 8.2(l) depends on several factors.

Possible omitted variable bias: Omitted variable bias arises when a variable that helps determine the dependent variable (wage) and is correlated with an included regressor (education) is left out of the regression. For example, if work experience affects wages, is correlated with education, and is not included in the regression, then the coefficient on education picks up part of the effect of experience and is biased.
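The mechanism can be illustrated with a small simulation. This is only a sketch on made-up data, not the CPS2015 sample: the variable names, the coefficients (1.0 and 2.0), and the strength of the correlation between the included and omitted regressor are all assumptions chosen for illustration.

```python
import random

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def simulate_short_regression(n=20000, beta_educ=1.0, beta_omitted=2.0, seed=0):
    """Regress wage on educ only, leaving out a correlated determinant of wage."""
    rng = random.Random(seed)
    # Hypothetical omitted variable (think of it as standardized experience).
    omitted = [rng.gauss(0, 1) for _ in range(n)]
    # educ is correlated with the omitted variable by construction.
    educ = [0.5 * z + rng.gauss(0, 1) for z in omitted]
    wage = [beta_educ * e + beta_omitted * z + rng.gauss(0, 1)
            for e, z in zip(educ, omitted)]
    return ols_slope(educ, wage)

if __name__ == "__main__":
    # The true coefficient on educ is 1.0; the short regression overstates it.
    print(round(simulate_short_regression(), 2))
```

In this setup the short-regression slope converges to beta_educ + beta_omitted * cov(educ, omitted) / var(educ) = 1.0 + 2.0 * 0.5 / 1.25 = 1.8, so the estimate stays well above the true value of 1.0 no matter how large the sample is.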

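Misspecification of the functional form of the regression: If the true relationship between the dependent variable and a regressor is nonlinear, then a regression that is linear in that regressor is misspecified. The estimated effect is then wrong for at least some values of the regressor, because the true effect varies with the regressor while the linear specification forces it to be constant. Logarithms, polynomial terms, and interaction terms are ways to address this.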

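Errors in variables: If a regressor is measured with error, the measurement error becomes part of the regression error and is correlated with the measured regressor, so the coefficient estimates are biased and inconsistent. With classical measurement error, the coefficient is biased toward zero. Measurement error in the dependent variable (for example, misreported earnings) that is unrelated to the regressors does not bias the coefficients, but it increases the variance of the estimates.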
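A short simulation shows the bias toward zero under classical measurement error. It is a sketch under stated assumptions, not a statement about the CPS data: the true slope of 1.0 and the equal variances of the true regressor and of the measurement noise are invented for illustration.

```python
import random

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def simulate_attenuation(n=20000, beta=1.0, noise_sd=1.0, seed=0):
    """Return (slope using the true regressor, slope using the mismeasured one)."""
    rng = random.Random(seed)
    true_x = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta * xi + rng.gauss(0, 1) for xi in true_x]
    # Classical measurement error: noise independent of true_x and of y's error.
    observed_x = [xi + rng.gauss(0, noise_sd) for xi in true_x]
    return ols_slope(true_x, y), ols_slope(observed_x, y)

if __name__ == "__main__":
    clean, noisy = simulate_attenuation()
    print(round(clean, 2), round(noisy, 2))
```

With these settings the slope on the mismeasured regressor converges to beta * var(x) / (var(x) + var(noise)) = 1.0 * 1 / (1 + 1) = 0.5, half the true value.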

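Sample selection: Sample selection bias arises when the process that determines whether an observation is in the sample is related to the dependent variable. If the sample includes only people who are working, for example, and the decision to work depends on the wage a person could earn, then the sample is not representative of the population and the coefficient estimates may be biased.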

Simultaneous causality: There is simultaneous causality if the dependent variable also affects a regressor. For example, if wages affect education, then education is correlated with the regression error and the coefficient on education is biased.

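Inconsistency of the OLS standard errors: The OLS coefficient estimates remain unbiased under heteroskedasticity, but the homoskedasticity-only standard errors are inconsistent if the errors are heteroskedastic, and both the homoskedasticity-only and the heteroskedasticity-robust standard errors are inconsistent if the errors are correlated across observations. In either case the t-statistics do not have their usual distribution, so hypothesis tests and confidence intervals are invalid. Heteroskedasticity-robust standard errors address the first problem.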
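The gap between the two standard error formulas can be seen in a simple regression. The sketch below computes the homoskedasticity-only and the heteroskedasticity-robust standard error of the slope on simulated data; the data-generating process (error standard deviation proportional to |x|) and the function names are assumptions made for illustration.

```python
import math
import random

def slope_standard_errors(x, y):
    """Return (slope, homoskedasticity-only SE, heteroskedasticity-robust SE)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [xi - mean_x for xi in x]
    sxx = sum(d * d for d in dx)
    b1 = sum(d * (yi - mean_y) for d, yi in zip(dx, y)) / sxx
    b0 = mean_y - b1 * mean_x
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # Homoskedasticity-only: one error variance for every observation.
    se_homo = math.sqrt(sum(u * u for u in resid) / (n - 2) / sxx)
    # Robust: weights each squared residual by its squared deviation in x,
    # with an n / (n - 2) degrees-of-freedom adjustment.
    meat = sum((d * u) ** 2 for d, u in zip(dx, resid))
    se_robust = math.sqrt(n / (n - 2) * meat) / sxx
    return b1, se_homo, se_robust

def simulate_heteroskedastic(n=5000, beta=2.0, seed=0):
    """Simulated data whose error spread grows with the distance of x from 0."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta * xi + abs(xi) * rng.gauss(0, 1) for xi in x]
    return x, y

if __name__ == "__main__":
    b1, se_homo, se_robust = slope_standard_errors(*simulate_heteroskedastic())
    print(round(b1, 3), round(se_homo, 4), round(se_robust, 4))
```

In this design the homoskedasticity-only standard error understates the sampling uncertainty of the slope, so t-statistics based on it would be too large in absolute value.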

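b. To investigate the temporal external validity of the conclusions reached in Empirical Exercise 8.2(l), we can use the CPS96_15 data set, which contains observations from both 1996 and 2015. We first need to adjust for inflation so that earnings are comparable across the two years, for example by converting 1996 earnings to 2015 dollars using the price index values given in Empirical Exercise 3.1(b).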
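The inflation adjustment is a simple rescaling by the ratio of price index values. The sketch below uses a hypothetical helper, `to_base_year_dollars`, and placeholder index values of 100 and 150; the actual CPI values should be taken from Empirical Exercise 3.1(b).

```python
def to_base_year_dollars(nominal_earnings, cpi_observed_year, cpi_base_year):
    """Convert earnings from observed-year dollars to base-year dollars."""
    if cpi_observed_year <= 0 or cpi_base_year <= 0:
        raise ValueError("price index values must be positive")
    return nominal_earnings * cpi_base_year / cpi_observed_year

if __name__ == "__main__":
    # Placeholder index values, NOT the values from the exercise.
    cpi_1996_placeholder = 100.0
    cpi_2015_placeholder = 150.0
    # An hourly wage of 10.00 in 1996 dollars, expressed in 2015 dollars.
    print(to_base_year_dollars(10.0, cpi_1996_placeholder, cpi_2015_placeholder))
```

Observations from 2015 need no adjustment when 2015 is the base year, since the ratio of index values is then 1.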

We can estimate the same regression model as in Empirical Exercise 8.2(l) separately on the 1996 observations and on the 2015 observations in CPS96_15, using inflation-adjusted earnings, and compare the coefficient estimates across the two years. If the estimates are similar, taking their standard errors into account, this is evidence that the conclusions reached in Empirical Exercise 8.2(l) are externally valid over time.

However, if the coefficient estimates differ by a statistically significant and economically meaningful amount between the two years, then we need to investigate the reasons for the differences. For example, if the coefficient on education is much larger in the 2015 sample than in the 1996 sample, then we may need to investigate whether the returns to education have changed over time, or whether other factors that affect the relationship between education and wages have changed.
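One way to judge whether two estimates "differ significantly" is a t-statistic for the difference between coefficients estimated on independent samples. The sketch below does this for the slope of a simple regression with heteroskedasticity-robust standard errors. It simplifies the multiple regression from Exercise 8.2, and the simulated samples, slopes, and function names are illustrative assumptions rather than CPS results.

```python
import math
import random

def slope_and_robust_se(x, y):
    """Slope of a simple OLS regression and its heteroskedasticity-robust SE."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [xi - mean_x for xi in x]
    sxx = sum(d * d for d in dx)
    b1 = sum(d * (yi - mean_y) for d, yi in zip(dx, y)) / sxx
    b0 = mean_y - b1 * mean_x
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    meat = sum((d * u) ** 2 for d, u in zip(dx, resid))
    return b1, math.sqrt(n / (n - 2) * meat) / sxx

def t_stat_for_difference(sample_early, sample_late):
    """t-statistic for (late slope - early slope), assuming independent samples."""
    b_early, se_early = slope_and_robust_se(*sample_early)
    b_late, se_late = slope_and_robust_se(*sample_late)
    return (b_late - b_early) / math.sqrt(se_early ** 2 + se_late ** 2)

def make_sample(slope, n, rng):
    """Simulated (x, y) data with a known slope; stands in for one survey year."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [slope * xi + rng.gauss(0, 1) for xi in x]
    return x, y

if __name__ == "__main__":
    rng = random.Random(1)
    early = make_sample(1.0, 2000, rng)   # stands in for the 1996 observations
    late = make_sample(1.5, 2000, rng)    # stands in for the 2015 observations
    print(round(t_stat_for_difference(early, late), 1))
```

A |t| above 1.96 would reject equality of the two slopes at the 5% level. An alternative with the same purpose is to pool both years and interact the regressors with a 2015 indicator, which also allows a joint test of several coefficients.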