Exercises lecture 1
Simple Linear Regression
Adapted from J. Wooldridge,
Introductory Econometrics: A Modern Approach
2.1 Let kids denote the number of children ever born to a woman, and let educ denote
years of education for the woman.
A simple model relating fertility to years of education is:
kids = β0 + β1 educ + u
where u is the unobserved error.
(i) What kinds of factors are contained in u? Are these likely to be correlated with level
of education?
(ii) Will a simple regression analysis uncover the ceteris paribus effect of education
on fertility? Explain.
2.3 The following table contains the ACT scores and the GPA (grade point average) for
eight college students. Grade point average is based on a four-point scale and has been
rounded to one digit after the decimal.
(i) Estimate the relationship between GPA and ACT using OLS; that is, obtain the
intercept and slope estimates in the equation
predicted GPA = β̂0 + β̂1 ACT
• Comment on the direction of the relationship.
• Does the intercept have a useful interpretation here? Explain.
• How much higher is the GPA predicted to be if the ACT score is increased by five
points?
(ii) Compute the fitted values and residuals for each observation, and verify that the
residuals (approximately) sum to zero.
(iii) What is the predicted value of GPA when ACT = 20?
(iv) How much of the variation in GPA for these eight students is explained by ACT?
Explain.
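The steps in 2.3 (slope and intercept, fitted values, residuals, share of variation explained) can be sketched in a few lines of Python. This is a minimal sketch using the textbook OLS formulas for a single regressor. The function names simple_ols and fit_summary are made up for illustration, and the numbers in act and gpa are placeholders, not the table from the exercise.

```python
from statistics import mean


def simple_ols(x, y):
    """Return (intercept, slope) from an OLS regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope = sample covariance of (x, y) divided by sample variance of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line passes through the point of means.
    intercept = y_bar - slope * x_bar
    return intercept, slope


def fit_summary(x, y):
    """Fitted values, residuals and R-squared for the simple regression."""
    b0, b1 = simple_ols(x, y)
    fitted = [b0 + b1 * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    y_bar = mean(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)   # total variation in y
    ssr = sum(r ** 2 for r in residuals)       # unexplained variation
    return {
        "intercept": b0,
        "slope": b1,
        "fitted": fitted,
        "residuals": residuals,
        "r_squared": 1 - ssr / sst,
    }


# Placeholder numbers, NOT the data from the exercise table.
act = [20, 22, 24, 26, 28]
gpa = [2.6, 2.9, 3.0, 3.4, 3.5]

result = fit_summary(act, gpa)
print("intercept:", result["intercept"])
print("slope:", result["slope"])
print("sum of residuals:", sum(result["residuals"]))
print("R-squared:", result["r_squared"])
```

With the actual table in place of the placeholder lists, the predicted change in GPA for a five-point increase in ACT is five times the slope, and the prediction at ACT = 20 is the intercept plus 20 times the slope.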
2.4 The data set BWGHT.RAW contains data on births to women in the United States. Two
variables of interest are
• the dependent variable, infant birth weight in ounces (bwght), and
• an explanatory variable, average number of cigarettes the mother smoked per day
during pregnancy (cigs).
The following simple regression was estimated using data on n = 1,388 births:
(i) What is the predicted birth weight when cigs = 0? What about when cigs = 20
(one pack per day)? Comment on the difference.
(ii) Does this simple regression necessarily capture a causal relationship between the
child’s birth weight and the mother’s smoking habits? Explain.
(iii) To predict a birth weight of 125 ounces (about 3.5 kg), what would cigs have to be?
Comment.
(iv) The proportion of women in the sample who do not smoke while pregnant is about
0.85 (about 85%). Does this help reconcile your finding from part (iii)?
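Parts (i) and (iii) of 2.4 both work from a fitted line: one plugs in a value of the regressor, the other solves the line for the regressor. The sketch below shows both operations. The helper names and the coefficients b0_hyp and b1_hyp are placeholders chosen for illustration, not the estimates from the exercise.

```python
def predict(b0, b1, x):
    """Predicted value of the dependent variable at regressor value x."""
    return b0 + b1 * x


def x_for_target(b0, b1, target):
    """Regressor value at which the fitted line equals `target`."""
    if b1 == 0:
        raise ValueError("a flat line cannot be solved for x")
    return (target - b0) / b1


# Placeholder coefficients, NOT the estimates from the exercise.
b0_hyp, b1_hyp = 100.0, -0.5

print(predict(b0_hyp, b1_hyp, 0))         # prediction at x = 0
print(predict(b0_hyp, b1_hyp, 20))        # prediction at x = 20
print(x_for_target(b0_hyp, b1_hyp, 105))  # x needed to hit a target of 105
```

The value returned by x_for_target is only an algebraic solution. Whether it is a feasible value of the regressor has to be judged separately.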
INTRODUCTION TO FUNCTIONAL FORMS INVOLVING LOGARITHMS
2.6 Using data from 1988 for houses sold in Andover, Massachusetts, from Kiel and
McClain (1995), the following equation relates housing price (price) to the distance from
a recently built garbage incinerator (dist):
(i) Interpret the coefficient on log(dist). Is the sign of this estimate what you expect it to
be?
(ii) Do you think simple regression provides an unbiased estimator of the ceteris
paribus elasticity of price with respect to dist? (Think about the city’s decision on
where to put the incinerator.)
(iii) What other factors about a house affect its price? Might these be correlated with
distance from the incinerator?
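When both the dependent variable and the regressor enter in logs, the slope is read as an elasticity: a 1% change in the regressor goes with roughly a slope-percent change in the dependent variable. The sketch below compares that approximation with the exact implied change. The function name and the value of b1_hyp are placeholders for illustration, not the estimate from the exercise.

```python
def pct_change_loglog(elasticity, pct_change_x):
    """Exact % change in y implied by log(y) = b0 + b1*log(x)
    when x changes by pct_change_x percent."""
    return ((1 + pct_change_x / 100) ** elasticity - 1) * 100


b1_hyp = 0.5  # placeholder elasticity, NOT the estimate from the exercise

print("approximate % change in y:", b1_hyp * 1.0)
print("exact % change in y:", pct_change_loglog(b1_hyp, 1.0))
```

For small percentage changes the two numbers are close. The gap widens as the change in the regressor grows.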
COMPUTER EXERCISES
DATA: CEOSAL2.RAW
C2.2
The data set in CEOSAL2.RAW contains information on chief executive officers for U.S.
corporations. The variable salary is annual compensation, in thousands of dollars, and
ceoten is prior number of years as company CEO.
(i) Find the average salary and the average tenure in the sample.
(ii) How many CEOs are in their first year as CEO (that is, ceoten = 0)? What is the
longest tenure as a CEO?
(iii) Estimate the simple regression model
log(salary) = β0 + β1 ceoten + u
and report your results in the usual form. What is the (approximate) predicted percentage
increase in salary given one more year as a CEO?
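In a model with log(salary) on the left and ceoten in levels on the right, 100 times the slope approximates the percentage change in salary for one more year of tenure. The sketch below shows that approximation next to the exact figure. The function names and the slope b1_hyp are placeholders for illustration, not an estimate from CEOSAL2.RAW.

```python
import math


def approx_pct_effect(b1, delta_x=1.0):
    """Approximate % change in y for a change delta_x in x (log-level model)."""
    return 100 * b1 * delta_x


def exact_pct_effect(b1, delta_x=1.0):
    """Exact % change in y implied by log(y) = b0 + b1*x."""
    return 100 * (math.exp(b1 * delta_x) - 1)


b1_hyp = 0.01  # placeholder slope, NOT an estimate from the data

print("approximate:", approx_pct_effect(b1_hyp))
print("exact:", exact_pct_effect(b1_hyp))
```

The approximation is adequate when the slope times the change in the regressor is small. For larger changes the exact formula is safer.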
DATA: SLEEP75.RAW
C2.3
Use the data in SLEEP75.RAW from Biddle and Hamermesh (1990) to study whether
there is a tradeoff between the time spent sleeping per week and the time spent in paid
work. We could use either variable as the dependent variable. For concreteness, estimate
the model:
sleep = β0 + β1 totwrk + u
where sleep is minutes spent sleeping at night per week and totwrk is total minutes
worked during the week.
(i) Report your results in equation form along with the number of observations and R².
What does the intercept in this equation mean?
(ii) If totwrk increases by 2 hours, by how much is sleep estimated to fall? Do you find
this to be a large effect?
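The computer exercises start by reading two columns out of a data file. The sketch below assumes the .RAW files are plain text with whitespace-separated columns, no header row, and "." marking a missing value. That layout is an assumption and should be checked against the file and its codebook. The function name read_columns is made up for illustration, and the column positions are supplied by the caller rather than guessed.

```python
import io


def read_columns(lines, col_indices, missing="."):
    """Parse whitespace-delimited rows and return one list of floats per
    requested column. Rows with a missing marker in any requested column
    are skipped, as are blank lines."""
    columns = [[] for _ in col_indices]
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        picked = [fields[i] for i in col_indices]
        if missing in picked:
            continue
        for col, value in zip(columns, picked):
            col.append(float(value))
    return columns


# Tiny in-memory stand-in for a data file (made-up numbers).
sample = io.StringIO("1 10 .\n2 20 5\n\n3 . 7\n4 40 9\n")
first, second = read_columns(sample, [0, 1])
print(first)
print(second)
```

An open file handle can be passed in place of the in-memory sample. Because totwrk is measured in minutes, a 2-hour increase in part (ii) corresponds to a change of 120 in totwrk.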
DATA: WAGE2.RAW
C2.4
Use the data in WAGE2.RAW to estimate a simple regression explaining monthly salary
(wage) in terms of IQ score (IQ).
(i) Find the average salary and average IQ in the sample. What is the sample
standard deviation of IQ? (IQ scores are standardized so that the average in the
population is 100 with a standard deviation equal to 15.)
(ii) Estimate a simple regression model where a one-point increase in IQ changes
wage by a constant dollar amount. Use this model to find the predicted increase in
wage for an increase in IQ of 15 points. Does IQ explain most of the variation in
wage?
(iii) Now, estimate a model where each one-point increase in IQ has the same percentage
effect on wage. If IQ increases by 15 points, what is the approximate percentage increase
in predicted wage?
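Part (i) of C2.4 asks for the sample standard deviation, which divides by n - 1 rather than n. A minimal sketch with Python's statistics module follows. The function name describe and the values in iq_hyp are placeholders for illustration, not data from WAGE2.RAW.

```python
from statistics import mean, pstdev, stdev


def describe(values):
    """Return (mean, sample standard deviation) of a list of numbers."""
    return mean(values), stdev(values)  # stdev divides by n - 1


iq_hyp = [85, 100, 115, 90, 110]  # placeholder values, NOT from the data set

avg, sd = describe(iq_hyp)
print("mean:", avg)
print("sample sd (n - 1):", sd)
print("population sd (n):", pstdev(iq_hyp))
```

In a large sample the two standard deviations are nearly identical. The distinction matters mainly for small samples like the placeholder list above.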
DATA: RDCHEM.RAW
C2.5 For the population of firms in the chemical industry, let rd denote annual
expenditures on research and development, and let sales denote annual sales (both are in
millions of dollars).
(i) Write down a model (not an estimated equation) that implies a constant elasticity
between rd and sales. Which parameter is the elasticity?
(ii) Now, estimate the model using the data in RDCHEM.RAW. Write out the estimated
equation in the usual form. What is the estimated elasticity of rd with respect to sales?
Explain in words what this elasticity means.
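One common way to obtain a constant elasticity is to take logs of both variables, as in the log(price) and log(dist) equation of 2.6, and run OLS on the transformed data. The sketch below does this on made-up numbers generated with a known elasticity of 0.8, so the fitted slope can be checked. The function name fit_log_log and the lists sales_hyp and rd_hyp are placeholders, not data from RDCHEM.RAW.

```python
import math
from statistics import mean


def fit_log_log(x, y):
    """OLS of log(y) on log(x); returns (intercept, slope).
    The slope is the estimated elasticity of y with respect to x."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    lx_bar, ly_bar = mean(lx), mean(ly)
    sxy = sum((a - lx_bar) * (b - ly_bar) for a, b in zip(lx, ly))
    sxx = sum((a - lx_bar) ** 2 for a in lx)
    slope = sxy / sxx
    intercept = ly_bar - slope * lx_bar
    return intercept, slope


# Made-up data built so that the true elasticity is 0.8 (no error term).
sales_hyp = [10.0, 100.0, 1000.0, 10000.0]
rd_hyp = [2.0 * s ** 0.8 for s in sales_hyp]

b0, b1 = fit_log_log(sales_hyp, rd_hyp)
print("intercept:", b0)
print("elasticity:", b1)
```

Both variables must be strictly positive for the logs to exist, so observations with zero or negative values would need separate handling.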