Getting Started with HW 2
DSAN 5300: Statistical Learning
Full Text for HW-2.5: ISLR-6.8 #8(a-d)
Here, for ease of access (since the problem in this case is from the previous edition of ISLR, pdf here—thank you Prof. James for the PDF link!), is the full text of Section 6.8 Problem 8. Remember that you only need to do (a) through (d)! The full problem is here just for completeness (you can think about how you’d approach parts (e) and (f)).
In this exercise, we will generate simulated data, and will then use this data to perform best subset selection.
Use the
rnorm()
function to generate a predictor \(X\) of length \(n = 100\), as well as a noise vector \(\epsilon\) of length \(n = 100\).Generate a response vector \(Y\) of length \(n = 100\) according to the model
\[ Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \epsilon, \]
where \(\beta_0\), \(\beta_1\), \(\beta_2\), and \(\beta_3\) are constants of your choice.
Use the
regsubsets()
function to perform best subset selection in order to choose the best model containing the predictors \(X, X^2, \ldots, X^{10}\). What is the best model obtained according to \(C_p\), BIC, and adjusted \(R^2\)? Show some plots to provide evidence for your answer, and report the coefficients of the best model obtained. Note you will need to use thedata.frame()
function to create a single data set containing both \(X\) and \(Y\).Repeat (c), using forward stepwise selection and also using backwards stepwise selection. How does your answer compare to the results in (c)?
Now fit a lasso model to the simulated data, again using \(X, X^2, \ldots, X^{10}\) as predictors. Use cross-validation to select the optimal value of \(\lambda\). Create plots of the cross-validation error as a function of \(\lambda\). Report the resulting coefficient estimates, and discuss the results obtained.
Now generate a response vector \(Y\) according to the model
\[ Y = \beta_0 + \beta_7 X^7 + \epsilon, \]
and perform best subset selection and the lasso. Discuss the results obtained.