Homework 1 - Solutions

Question 1
Choose the correct statement regarding the sum of residuals calculated using Ordinary Least Squares (OLS).
A. The sum of residuals will always be nonzero, whatever the form of the linear regression, as long as you are using OLS to estimate the coefficients.
B. The sum of residuals will always be equal to zero if you include an intercept term in your model and you are using OLS to estimate the coefficients.
C. The sum of residuals may or may not be zero when using OLS, and the R software makes mathematical adjustments to make it zero.
D. The sum of residuals may or may not be zero when using OLS, and the R software makes no mathematical adjustments because none are needed.
Sol: B
Explanation: The intercept is the catch-all term that absorbs whatever is not accounted for by the independent variables. When the model includes an intercept, the OLS normal equation for that term requires the residuals to sum to zero.

Question 2
Choose the correct statement regarding the error terms in the assumptions of Ordinary Least Squares (OLS).
A. The error terms are normally distributed with a constant non-zero mean and constant variance.
B. The variance of the error terms may or may not be constant as long as the terms are normally distributed with mean equal to zero.
C. The error terms follow a lognormal distribution with mean equal to zero and constant variance.
D. The error terms are normally distributed with mean equal to zero and a constant variance.
Sol: D
Explanation: The OLS model assumes that the error terms (which the residuals estimate) are normally distributed with mean equal to zero and constant variance (homoscedasticity).

Questions 3 - 6
The National Traffic Study Institute is conducting a study to find the relationship between the speed at which a car is moving and the distance it takes to stop after the brakes are applied. You were hired as a statistician to work on this problem. The data can be accessed as follows:

    install.packages("Ecdat")
    library(Ecdat)
    data(cars)

You can see the variables present in the dataset and their units using the help command in the R console: speed (in mph) and dist (in ft).
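The answer to Question 1 can be checked numerically on this same dataset. The following is a minimal sketch, assuming the `cars` data frame that ships with base R's `datasets` package; the object names `fit_with`, `fit_without`, `sum_with`, and `sum_without` are illustrative and not part of the assignment.

```r
# `cars` is available from base R's datasets package.
data(cars)

# Model with an intercept (the default in lm).
fit_with <- lm(dist ~ speed, data = cars)

# Same predictor, but with the intercept removed (regression through the origin).
fit_without <- lm(dist ~ speed - 1, data = cars)

# Sum the residuals of each fit.
sum_with <- sum(residuals(fit_with))
sum_without <- sum(residuals(fit_without))

cat("Sum of residuals, with intercept:   ", sum_with, "\n")
cat("Sum of residuals, without intercept:", sum_without, "\n")
```

With the intercept included, the sum should differ from zero only by floating-point rounding error. Without the intercept, nothing forces the residuals to cancel, so the sum is generally nonzero.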
Use this dataset for the following questions.

Question 3
Let's find out whether there is a correlation between the distance needed to stop and the speed at which the car is moving. What correlation value do you find when doing this in R?
A. 0
B. 0.72
C. 0.81
D. 1
Ans: C
Explanation: cor(cars$speed, cars$dist) = 0.807, which rounds to 0.81.

Question 4
Would you say that distance to stop and speed of the car are:
A. Not correlated
B. Inversely correlated
C. Well correlated
D. Perfectly correlated
Ans: C
Explanation: Well correlated, because the value is close to 1 (perfect correlation) but not exactly 1.

Question 5
Now, let's fit a linear model with distance needed to stop as the response and speed as the predictor. What are the percent variation explained by speed, the intercept, and the coefficient of speed?
A. 0.65, -17.58 and 3.93
B. 0.65, 17.58 and 3.93
C. 0.65, 8.28 and 0.16
D. 0.89, 0 and 0.31
Ans: A
Explanation: The variation explained by speed is the R-squared value = 0.65 (about 65%); the intercept (from the regression summary table) = -17.58; the coefficient of speed (from the same table) = 3.93.

Question 6
Now suppose we need to change the units of distance needed to stop from feet to meters and of speed from mph to meters per second, because we need the results to be in standard units. What would be the results for percent variation explained by speed, intercept, and coefficient of speed?
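The values quoted in Questions 3 and 5, and the unit change described in Question 6, can be reproduced with a short script. This is a sketch under stated assumptions rather than the official solution: it uses the `cars` data frame from base R's `datasets` package and the exact unit definitions 1 ft = 0.3048 m and 1 mph = 0.44704 m/s. The names `FT_TO_M`, `MPH_TO_MPS`, `cars_si`, and `fit_si` are illustrative.

```r
data(cars)

# Question 3: Pearson correlation between speed and stopping distance.
r <- cor(cars$speed, cars$dist)

# Question 5: simple linear regression of distance on speed.
fit <- lm(dist ~ speed, data = cars)
r_squared <- summary(fit)$r.squared
coefs <- coef(fit)  # named vector with "(Intercept)" and "speed"

# Question 6: refit after converting to SI units.
FT_TO_M <- 0.3048      # feet to meters (exact definition)
MPH_TO_MPS <- 0.44704  # miles per hour to meters per second (exact definition)
cars_si <- data.frame(
  speed = cars$speed * MPH_TO_MPS,
  dist  = cars$dist * FT_TO_M
)
fit_si <- lm(dist ~ speed, data = cars_si)
r_squared_si <- summary(fit_si)$r.squared
coefs_si <- coef(fit_si)

# Original units (mph, ft), rounded as in the answer choices.
print(round(c(correlation = r, r_squared = r_squared, coefs), 2))

# SI units (m/s, m).
print(c(r_squared = r_squared_si, coefs_si))
```

Because both conversions are pure rescalings, R-squared should be unchanged. The intercept should scale by the feet-to-meters factor, and the slope by the ratio of the two factors (0.3048 / 0.44704).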