Summer MGT 6203 FINAL EXAM
PART 2 – CODING
Week 4
Use the abalone.csv dataset to answer questions from 1 to 3:
1. Using the function lm, create a linear regression that regresses “Rings” onto “Diameter” and
“Height” (i.e., “Rings” is the response variable and “Diameter”, “Height” are the
independent variables). Which one of the following statements is FALSE?
a. Diameter is significant at the 5% significance level
b. R-Squared is 0.35
c. The intercept is 11.71
d. One unit of change in height causes the number of rings to increase by 19.81 on
average keeping "Height" constant
Explanation: The intercept is 2.3939 (Lesson 1, Video 7, Slide 4)
2. Using the function lm, create a linear regression that regresses “Rings” onto all the features
except “Type”. From this new model, which three features have the highest VIF score?
a. LongestShell, WholeWeight, Diameter
b. Height, Diameter, ShellWeight
c. ShellWeight, ShuckedWeight, VisceraWeight
d. VisceraWeight, Height, LongestShell
Explanation: See code below (Lesson 1 / Video 8 / Slides 12-15)
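Independently of the exam's own code, the following is a minimal sketch of how a VIF can be computed: regress each feature on the remaining features and take 1 / (1 - R^2). It uses only the Python standard library rather than R, and the three columns are made-up numbers that borrow abalone column names for illustration. They are not taken from abalone.csv, so the values do not answer the question.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(y, xs):
    """R^2 of an OLS fit of y on the columns in xs plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + xs
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def vif(features):
    """VIF per feature: 1 / (1 - R^2) from regressing it on the others."""
    result = {}
    for name, col in features.items():
        others = [c for k, c in features.items() if k != name]
        result[name] = 1 / (1 - r_squared(col, others))
    return result


# Made-up illustrative values (NOT from abalone.csv).
# Diameter is deliberately almost a linear function of LongestShell.
features = {
    "LongestShell": [0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60, 0.65],
    "Diameter": [0.23, 0.27, 0.31, 0.35, 0.40, 0.43, 0.47, 0.52],
    "Height": [0.10, 0.08, 0.13, 0.09, 0.15, 0.11, 0.12, 0.14],
}
vifs = vif(features)
for name, value in sorted(vifs.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.1f}")
```

In this toy data the two nearly collinear columns get much larger VIFs than the third, which is the pattern the question asks you to look for in the real model.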
3. Create two separate datasets from abalone.csv. The first data set will contain Male (M)
abalones and infant (I) abalones. The second dataset will contain Female (F) abalones and
infant (I) abalones. Now use a linear regression model to compute the difference estimator
(average difference in diameter) for each dataset (taking infant as the reference in each
case).
a. 0.125, 0.118
b. 0.113, 0.128
c. 0.128, 0.113
d. 0.118, 0.125
Explanation: The b1 coefficient for each model is 0.113 and 0.128 respectively. See
summaries below (Lesson 5 / Video 3 / Slide 4 and Lesson 5 / Video 4 / Slides 1-10)
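Separately from the summaries the explanation refers to, here is a minimal sketch of why the b1 coefficient is the difference estimator: when the only regressor is a 0/1 dummy (0 for the reference group), the OLS slope equals the difference between the two group means and the intercept equals the reference-group mean. The diameters below are made-up values, not rows from abalone.csv, and the helper name is hypothetical.

```python
from statistics import mean


def simple_ols(x, y):
    """Return (intercept, slope) of a one-regressor least-squares fit."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# Made-up diameters (NOT from abalone.csv).
infant = [0.30, 0.32, 0.35, 0.33]       # reference group, dummy = 0
male = [0.42, 0.45, 0.44, 0.47, 0.43]   # comparison group, dummy = 1

dummy = [0] * len(infant) + [1] * len(male)
diameter = infant + male

b0, b1 = simple_ols(dummy, diameter)
mean_diff = mean(male) - mean(infant)

print("intercept (infant mean):", round(b0, 4))
print("slope b1 (difference estimator):", round(b1, 4))
print("difference of group means:", round(mean_diff, 4))
```

The same reasoning applies to the Female/Infant dataset: only the group coded as 1 changes.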
Use the Admissions.csv dataset to answer Q4-5:
Fit a logistic regression model using Admitted as the response variable and all the other variables as
independent variables. Once the model is fitted, predict the probability of admission for every
data point. Using a threshold of 0.75, classify students as admitted based on the predicted
probabilities (i.e., if probability > 0.75, classify the student as admitted).
4. What is the Accuracy of the logistic regression model?
a. 0.84
b. 0.88
c. 0.76
d. None of the above
Explanation: Lesson 3 / Video 6 / Slides 3 – 8
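As a minimal sketch of the thresholding step described above, the following turns predicted probabilities into admitted / not admitted labels with the rule probability > 0.75 and then computes accuracy as the share of correct labels. The probabilities and true labels are made-up values, not model output on Admissions.csv, and the function name is hypothetical.

```python
def accuracy_at_threshold(probs, labels, threshold=0.75):
    """Share of cases where (prob > threshold) matches the true 0/1 label."""
    preds = [1 if p > threshold else 0 for p in probs]
    correct = sum(1 for p, y in zip(preds, labels) if p == y)
    return correct / len(labels)


# Made-up predicted probabilities and true outcomes (NOT from Admissions.csv).
probs = [0.91, 0.80, 0.76, 0.74, 0.60, 0.35, 0.20, 0.78]
labels = [1, 1, 0, 1, 0, 0, 0, 1]

acc = accuracy_at_threshold(probs, labels)
print("accuracy at 0.75:", acc)
```

Note that the comparison is strict, so a probability of exactly 0.75 is classified as not admitted, which matches the rule stated in the question.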
5. What is the AUC of the ROC curve for our predicted admissions against the true admissions?
a. 0.867
b. 0.824
c. 0.782
d. None of the above
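One way to illustrate the AUC, as a sketch rather than the exam's method, is the rank interpretation: the AUC equals the fraction of (admitted, not admitted) pairs in which the admitted case receives the higher score, with ties counted as one half. The data below are made-up values, not from Admissions.csv. The example computes the AUC both from the raw probabilities and from the 0/1 predictions at the 0.75 threshold, because the two generally differ and the question's wording ("predicted admissions") suggests the latter.

```python
def auc(scores, labels):
    """AUC as the share of positive/negative pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up predicted probabilities and true outcomes (NOT from Admissions.csv).
probs = [0.91, 0.80, 0.76, 0.74, 0.60, 0.35, 0.20, 0.78]
labels = [1, 1, 0, 1, 0, 0, 0, 1]

# Hard 0/1 predictions using the rule probability > 0.75.
preds = [1 if p > 0.75 else 0 for p in probs]

auc_probs = auc(probs, labels)
auc_preds = auc(preds, labels)
print("AUC from probabilities:", auc_probs)
print("AUC from thresholded predictions:", auc_preds)
```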