ISYE 6414 FINAL EXAM REVIEW - NOVEMBER 2023 QUESTIONS WITH COMPLETE SOLUTIONS
Least Squares Estimation (LSE) cannot be applied to GLM models. Ans***False - it is applicable, but it does not fully use the distributional information in the data, which is why MLE is preferred for GLMs.
In multiple linear regression with iid errors and equal variance, the least squares estimators of the regression coefficients are always unbiased. Ans***True - the least squares estimates are BLUE (Best Linear Unbiased Estimators) in multiple linear regression.
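A minimal Python simulation sketch of this unbiasedness claim (the true coefficients, sample size, and seed are arbitrary illustration values, not from the source):

    import numpy as np

    rng = np.random.default_rng(0)
    beta_true = np.array([1.0, 2.0, -0.5])      # assumed true coefficients for the demo
    n, reps = 100, 2000
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # fixed design with intercept
    est = np.empty((reps, 3))
    for r in range(reps):
        y = X @ beta_true + rng.normal(size=n)  # iid normal errors, equal variance
        est[r] = np.linalg.lstsq(X, y, rcond=None)[0]
    print(est.mean(axis=0))                     # close to [1.0, 2.0, -0.5]: unbiased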
Maximum Likelihood Estimation is not applicable for simple linear regression or multiple linear regression. Ans***False - in SLR and MLR, the LSE and MLE coincide under iid normal data.
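A one-step sketch of why the two coincide: with iid normal errors the log-likelihood is

\ell(\beta, \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - x_i^\top\beta)^2,

so maximizing over \beta is exactly minimizing the sum of squared errors; the MLE of \beta equals the LSE.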
Backward elimination requires a pre-set probability of type II error. Ans***False - it requires a pre-set probability of type I error (the significance level for removing a variable).
The first degree of freedom in the F distribution for any of the three stepwise procedures is always equal to one. Ans***True - each step adds or removes a single variable, so the partial F-test has a numerator degree of freedom of one.
MLE is used for GLMs to handle the complicated link functions that model the X-Y relationship. Ans***True
In GLMs the link function cannot be nonlinear. Ans***False - the link function can be linear or nonlinear, as long as it is a known parametric function.
When the p-value of the slope estimate in SLR is small, the R-squared becomes smaller too. Ans***False - when the p-value is small, the model fit is more significant and the R-squared becomes larger.
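A worked identity behind this answer: in SLR the slope's t-statistic satisfies F = t^2, and

F = (n-2)\frac{R^2}{1-R^2} \quad\Longleftrightarrow\quad R^2 = \frac{F}{F+(n-2)},

so a larger F (a smaller p-value) corresponds to a larger R-squared.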
In GLMs, the main reason one does not use LSE to estimate model parameters is the potential constraints on the parameters. Ans***False - the potential constraints on the parameters of GLMs are handled by the link function.
The R-squared and adjusted R-squared are not appropriate model comparison measures for nonlinear regression, but they are for linear regression models. Ans***True - the underlying assumption of the R-squared calculation is that you are fitting a linear model.
The decision from the ANOVA table F-test of whether a model is significant depends on the response variable being normally distributed. Ans***True
When the data may not be normally distributed, AIC is more appropriate for variable selection than adjusted R-squared. Ans***True
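A minimal Python sketch of an AIC comparison (simulated data; the variable names and the extra predictor x3 are invented for the demo):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(1)
    df = pd.DataFrame(rng.normal(size=(200, 3)), columns=["x1", "x2", "x3"])
    df["y"] = 2 * df["x1"] - df["x2"] + rng.normal(size=200)  # x3 is pure noise

    m_small = smf.ols("y ~ x1 + x2", data=df).fit()
    m_big = smf.ols("y ~ x1 + x2 + x3", data=df).fit()
    print(m_small.aic, m_big.aic)  # smaller AIC is better; here it usually favors m_small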
The slope of a linear regression equation is an example of a correlation coefficient. Ans***False - the correlation coefficient is the r value; it has the same sign (+ or -) as the slope, but it is not the slope.
In multiple linear regression, as the value of R-squared increases, the relationship between predictors becomes stronger. Ans***False - R-squared measures how much variability in the response is explained by the model, NOT how strongly the predictors are related to one another.
When dealing with a multiple linear regression model, the adjusted R-squared can be greater than the corresponding unadjusted R-squared value. Ans***False - the adjusted R-squared takes the number of predictors into account and is never larger than the unadjusted R-squared.
In a multiple regression problem, a quantitative input variable x is replaced by x - mean(x). The R-squared for the fitted model will be the same. Ans***True
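A quick Python check of this invariance (simulated data, arbitrary seed):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    x = rng.normal(size=100)
    y = 3 * x + rng.normal(size=100)

    r2_raw = sm.OLS(y, sm.add_constant(x)).fit().rsquared
    r2_centered = sm.OLS(y, sm.add_constant(x - x.mean())).fit().rsquared
    print(r2_raw, r2_centered)  # identical: centering only shifts the intercept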
The estimated coefficients of a regression line are positive when the coefficient of determination is positive. Ans***False - R-squared is always non-negative, so it says nothing about the sign of the coefficients.
If the outcome variable is quantitative and all explanatory variables take values 0 or 1, a logistic regression model is most appropriate. Ans***False - logistic regression requires a binary (or categorical) response; with a quantitative outcome, more investigation is needed to choose the right model.
After fitting a logistic regression model, a plot of residuals versus fitted values is useful for checking if model assumptions are violated. Ans***False - for logistic regression, use deviance residuals instead.
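A sketch of pulling deviance residuals from a fitted logistic model in Python (simulated data; the coefficients are invented for the demo):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    x = rng.normal(size=300)
    y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.5 * x))))

    fit = sm.GLM(y, sm.add_constant(x), family=sm.families.Binomial()).fit()
    print(fit.resid_deviance[:5])  # deviance residuals, the right tool for diagnostics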
In a greenhouse experiment with several predictors, the response variable is the number of seeds that germinate out of 60 that are planted with different treatment combinations. A Poisson regression model is most appropriate for modeling this data. Ans***False - the response is a count bounded above by the 60 seeds planted, so a binomial (logistic) regression is more appropriate; Poisson regression is for unbounded counts or rates.
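A sketch of the more appropriate model, a binomial GLM, since the count of germinating seeds is bounded above by the 60 planted (the data and treatment variable are simulated for illustration):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    treatment = rng.normal(size=50)                        # hypothetical treatment variable
    p = 1 / (1 + np.exp(-(0.2 + 0.8 * treatment)))
    germinated = rng.binomial(60, p)                       # seeds germinating out of 60

    resp = np.column_stack([germinated, 60 - germinated])  # (successes, failures)
    fit = sm.GLM(resp, sm.add_constant(treatment), family=sm.families.Binomial()).fit()
    print(fit.params)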
For Poisson regression, we can reduce type I errors of identifying statistical significance in the regression coefficients by increasing the sample size. Ans***True - inference on the coefficients relies on the asymptotic normality of the MLE, and the approximation improves as the sample size grows.
Both LASSO and ridge regression always provide a greater residual sum of squares than that of ordinary multiple linear regression. Ans***True - least squares minimizes the residual sum of squares, so any penalized (LASSO or ridge) solution has an RSS at least as large.
If data on (Y, X) are available at only two values of X, then the model Y = \beta_1 X + \beta_2 X^2 + \epsilon provides a better fit than Y = \beta_0 + \beta_1 X + \epsilon. Ans***False - both models have two parameters and can fit two points exactly, so there is nothing in the data to show that a quadratic term is necessary.
If the Cook's distance for any particular observation is greater than one, that data point is definitely a record error and thus needs to be discarded. Ans***False - a large Cook's distance flags an influential observation, not necessarily a recording error; compare it with the other observations and investigate before discarding anything.
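A sketch of computing Cook's distances and comparing them across observations (simulated data with one influential point planted on purpose):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(5)
    x = rng.normal(size=80)
    y = 1 + 2 * x + rng.normal(size=80)
    y[0] += 10                                  # plant one influential point for the demo

    fit = sm.OLS(y, sm.add_constant(x)).fit()
    cooks_d = fit.get_influence().cooks_distance[0]
    print(np.argsort(cooks_d)[-3:])             # inspect the most influential points; investigate, don't auto-delete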
We can use residual analysis to conclusively determine the assumption of independence. Ans***False - residual analysis can only detect correlated errors; it cannot establish independence.
It is possible to apply logistic regression when the response variable Y has 3 classes. Ans***True - via multinomial (or ordinal) logistic regression.
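A minimal multinomial sketch in Python (simulated data with three arbitrary classes):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(6)
    X = sm.add_constant(rng.normal(size=(300, 2)))
    y = rng.integers(0, 3, size=300)            # three classes: 0, 1, 2

    fit = sm.MNLogit(y, X).fit(disp=0)
    print(fit.params)                           # one coefficient column per non-baseline class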
A correlation coefficient close to 1 is evidence of a cause-and-effect relationship between the two variables. Ans***False - cause and effect can only be established by a well-designed experiment.
Multiplying a variable by 10 in LASSO regression decreases the chance that the coefficient of this variable is nonzero. Ans***False - scaling a variable up by 10 shrinks its coefficient by a factor of 10, so the L1 penalty taxes it less and it becomes MORE likely to stay nonzero; this sensitivity to scale is why predictors are standardized before fitting LASSO.
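A sketch of this scale sensitivity (simulated data; the penalty alpha=0.35 is an arbitrary choice, and the exact coefficients depend on the random draw):

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(7)
    X = rng.normal(size=(200, 2))
    y = 0.3 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(size=200)

    X_scaled = X.copy()
    X_scaled[:, 0] *= 10                            # rescale the first predictor only

    print(Lasso(alpha=0.35).fit(X, y).coef_)        # both coefficients shrunk to/near zero
    print(Lasso(alpha=0.35).fit(X_scaled, y).coef_) # the rescaled predictor now survives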
In regression inference, the 99% confidence interval of coefficient \beta_0 is always wider than the 95% confidence interval of \beta_1. Ans***False - intervals for different coefficients are not comparable; a 99% interval is wider than a 95% interval only for the same coefficient (\beta_0 with \beta_0, \beta_1 with \beta_1).
The regression coefficients for the Poisson regression model can be estimated in exact/closed form. Ans***False - the MLE has no closed form and must be computed iteratively.
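To see why, the Poisson log-likelihood with the log link leads to the score equations

\sum_{i=1}^{n}\left(y_i - e^{x_i^\top\beta}\right)x_i = 0,

which are nonlinear in \beta and are solved iteratively (e.g., by Newton-Raphson / iteratively reweighted least squares).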
Mean square error is commonly used in statistics to obtain estimators that may be biased but are less uncertain than unbiased ones, and that trade-off is often preferred. Ans***True
Regression models are only appropriate for continuous response variables. Ans***False - logistic regression models the probability of a binary response and Poisson regression models counts or rates.
The assumptions in logistic regression are linearity, independence of the response observations, and that the link function is the logit function. Ans***True - linearity holds on the link scale: g(p), the log-odds of the probability of success, is linear in the predictors.
The log-odds function, also called the logit function, is the log of the ratio between the probability of success and the probability of failure. Ans***True
In logistic regression we interpret the betas in terms of the response variable. Ans***False - we interpret them in terms of the odds of success, or the log-odds of success.
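The two preceding answers in one pair of equations:

\text{logit}(p) = \log\frac{p}{1-p} = \beta_0 + \beta_1 x \quad\Longrightarrow\quad \frac{p}{1-p} = e^{\beta_0 + \beta_1 x},

so a one-unit increase in x multiplies the odds of success by e^{\beta_1}.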
In logistic regression we have an additional error term to estimate. Ans***False - there is no error term in logistic regression; the randomness comes from the Bernoulli/binomial distribution of the response.
The least squares estimation for the standard regression model is equivalent to Maximum Likelihood Estimation under the assumption of normality. Ans***True
The variance estimator in logistic regression has a closed form expression. Ans***False - use statistical software to obtain the variance-covariance matrix.
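A sketch of pulling that matrix from software in Python (simulated data, arbitrary seed):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(8)
    x = rng.normal(size=200)
    y = rng.binomial(1, 1 / (1 + np.exp(-x)))

    fit = sm.GLM(y, sm.add_constant(x), family=sm.families.Binomial()).fit()
    print(fit.cov_params())                     # estimated variance-covariance matrix of beta-hat
    print(np.sqrt(np.diag(fit.cov_params())))   # standard errors of the coefficients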