Students are encouraged to work together on homework. However, sharing, copying, or providing any part of a homework solution or code is an infraction of the University’s rules on Academic Integrity. Any violation will be punished as severely as possible. Final submissions must be uploaded to compass2g. No email or hardcopy will be accepted. For late submission policy and grading rubrics, please refer to the course website.
What is expected for the submission to Gradescope
Your submission should be named HWx_yourNetID.pdf. For example, HW01_rqzhu.pdf. Please note that this must be a .pdf file generated by a .Rmd file; .html format cannot be accepted. Your homework file should be a readable PDF report, not a messy collection of R code. This report should include:
- Your name and NetID (replace Ruoqing Zhu (rqzhu) by your name and NetID if you are using this template).
- All R code chunks, visible for grading.
- R code chunks and output that support your answers.
- Written answers to the questions, for example: "Answer: I fit SVM with the following choice of tuning parameters ..."
Requirements regarding the .Rmd file: you are not required to submit the .Rmd file itself; however, your PDF file should be rendered directly from it.

Question 1: Local Linear Regression

We have implemented the Nadaraya-Watson kernel estimator in HW 6. In this question, we will investigate a local linear regression: \[ \widehat{f}\left(x\right)=\widehat{\beta}_{0}\left(x\right)+ \widehat{\beta}_{1}\left(x\right) x, \] where \(x\) is a testing point. The local coefficients \(\widehat{\beta}_{r}\left(x \right)\) for \(r=0, 1\) are obtained by minimizing the objective function \[ \underset{\beta_{0}(x), \, \beta_{1}(x)}{\operatorname{minimize}} \quad \sum_{i=1}^{n} K_{\lambda} \left(x, x_{i}\right) \Big[y_{i}-\beta_{0}(x) - \beta_1(x) x_{i} \Big]^{2}. \]
In this question, we will use the Gaussian kernel \(K(u) = \frac{1}{\sqrt{2 \pi}} e^{- \frac{u^2}{2}}\).
[20 pts] Write a function myLocLinear(trainX, trainY, testX, lambda), where lambda is the bandwidth and testX contains all of the testing samples. This function should return the predictions on testX. The solutions for \(\beta_{0}(x)\) and \(\beta_{1}(x)\) can be obtained by fitting a weighted linear regression; the formula is provided on page 25 of our lecture note.
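For illustration, here is a minimal sketch of how such a function could look, assuming trainX and testX are numeric vectors and that the solution is computed by plain weighted least squares; any constant scaling of the kernel cancels in the fit, so only the relative weights matter. This is not the official solution from the lecture note.

# One possible implementation sketch (not the official solution).
# trainX, trainY: training data; testX: testing points; lambda: bandwidth.
myLocLinear = function(trainX, trainY, testX, lambda) {
  pred = rep(NA, length(testX))
  X = cbind(1, trainX)                     # design matrix with intercept
  for (j in seq_along(testX)) {
    x0 = testX[j]
    w = dnorm((trainX - x0) / lambda)      # Gaussian kernel weights K_lambda(x0, x_i)
    W = diag(w)
    # weighted least squares: beta = (X' W X)^{-1} X' W y
    beta = solve(t(X) %*% W %*% X, t(X) %*% W %*% trainY)
    pred[j] = beta[1] + beta[2] * x0       # f_hat(x0) = beta0(x0) + beta1(x0) * x0
  }
  return(pred)
}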
[15 pts] Fit a local linear regression with our given training data. The testing data are generated using the code given below. Try a set of bandwidths \(\lambda = 0.05, 0.1, \ldots, 0.55, 0.6\) when calculating the kernel function.
train = read.csv('hw7_Q1_train.csv')
testX = 2 * pi * seq(0, 1, by = 0.01)
testY = sin(testX)
plot(train$x, train$y, pch = 19, cex = 0.3, xlab = "x", ylab = "y")
lines(testX, testY, col = "darkorange", lwd=2.0)
legend("topright", legend = c("Truth"),
col = c("darkorange"), lty = 1, cex = 2)
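Continuing the sketch above, one possible way to compare the candidate bandwidths is to loop over them and evaluate the fit on testX; the mean squared error against the true curve testY is used here only as an illustration, since the assignment may ask for a different summary.

# Illustrative bandwidth comparison (assumes the myLocLinear sketch above).
lambda_all = seq(0.05, 0.6, by = 0.05)
err_all = sapply(lambda_all, function(lam) {
  pred = myLocLinear(train$x, train$y, testX, lam)
  mean((pred - testY)^2)                   # error against the true sin() curve
})
cbind(lambda = lambda_all, error = err_all)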
Question 2: Linear Discriminant Analysis

For both questions 2 and 3, you need to write your own code. We will use the handwritten digit recognition data from the ElemStatLearn package. We only consider the train-test split, with the pre-defined zip.train and zip.test. Simply use zip.train as the training data and zip.test as the testing data for all evaluations and tuning. No cross-validation is needed in the training process.
More information about this dataset can be found via help(zip.train).

library(ElemStatLearn)
# load training and testing data
dim(zip.train)
## [1] 7291 257
dim(zip.test)
## [1] 2007 257
# number of each digit
table(zip.train[, 1])
##
## 0 1 2 3 4 5 6 7 8 9
## 1194 1005 731 658 652 556 664 645 542 644
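For reference, one way to separate the labels from the pixel covariates is sketched below; the object names Xtrain, Ytrain, Xtest, and Ytest are my own and are not required by the assignment.

# The first column of zip.train / zip.test is the digit label;
# the remaining 256 columns are the pixel covariates.
Ytrain = zip.train[, 1]
Xtrain = zip.train[, -1]
Ytest  = zip.test[, 1]
Xtest  = zip.test[, -1]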
Write your own LDA code; basic R functions such as cov are allowed. Do NOT print your results. You are not required to write a single function to perform LDA, but you could consider defining a function such as myLDA(testX, mu_list, sigma_pool), where mu_list is the estimated mean vector for each class and sigma_pool is the pooled covariance estimate. This function should return the predicted class based on comparing the discriminant functions \(\delta_k(x) = w_k^T x + b_k\) given on page 32 of the lecture note.
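For illustration only, the following sketch shows how the quantities described above could be estimated and used, assuming the Xtrain and Ytrain objects from the earlier snippet; the class priors are estimated and passed as an extra argument, which goes slightly beyond the suggested signature. This is not the official solution.

# Estimate class means, pooled covariance, and priors (nothing is printed).
digits = 0:9
mu_list = t(sapply(digits, function(k) colMeans(Xtrain[Ytrain == k, ])))
centered = Xtrain - mu_list[match(Ytrain, digits), ]
sigma_pool = crossprod(centered) / (nrow(Xtrain) - length(digits))
prior = as.numeric(table(Ytrain)) / length(Ytrain)

# Sketch of the LDA classifier.
myLDA = function(testX, mu_list, sigma_pool, prior) {
  sigma_inv = solve(sigma_pool)
  # delta_k(x) = w_k^T x + b_k, with w_k = Sigma^{-1} mu_k and
  # b_k = -0.5 * mu_k^T Sigma^{-1} mu_k + log(pi_k)
  W = sigma_inv %*% t(mu_list)                       # p x K
  b = -0.5 * colSums(t(mu_list) * W) + log(prior)    # length K
  scores = sweep(as.matrix(testX) %*% W, 2, b, "+")  # n x K discriminant values
  max.col(scores) - 1                                # rows of mu_list are digits 0..9
}

Predictions on the testing data could then be obtained with myLDA(Xtest, mu_list, sigma_pool, prior) and compared with Ytest.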
Apply your LDA to the testing data and summarize the predicted versus true digits, for example with the table() function in R.

Question 3: Regularized Quadratic Discriminant Analysis

QDA uses a quadratic discriminant function. However, QDA does not work directly in this example because we do not have enough samples to provide an invertible sample covariance matrix for each digit. An alternative idea to fix this issue is to consider a regularized QDA method, which uses \[\widehat \Sigma_k(\alpha) = \alpha \widehat \Sigma_k + (1-\alpha) \widehat \Sigma \] instead of \(\widehat \Sigma_k\). These regularized covariance matrices are then used in the decision rules given on page 36 of the lecture notes. Complete the following questions.
Write a function myRQDA(testX, mu_list, sigma_list, sigma_pool, alpha), where alpha is a scalar regularization parameter and testX is your testing covariate matrix. You may also need a new sigma_list holding all of the \(\widehat\Sigma_k\). This function should return a vector of predicted digits.

Use the following set of \(\alpha\) values when tuning the regularized QDA on the testing data:

alpha_all = seq(0, 0.3, by = 0.05)
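For illustration, one possible sketch of such a function is given below, assuming mu_list, sigma_pool, and prior from the LDA sketch above; the per-digit covariance list shown for sigma_list is one possible choice, and the explicit prior argument is my own addition beyond the signature stated in the question. This is not the official solution.

# Per-digit sample covariance matrices (one possible choice for sigma_list).
sigma_list = lapply(0:9, function(k) cov(Xtrain[Ytrain == k, ]))

# Sketch of a regularized QDA classifier.
myRQDA = function(testX, mu_list, sigma_list, sigma_pool, alpha, prior) {
  testX = as.matrix(testX)
  K = nrow(mu_list)
  scores = matrix(NA, nrow(testX), K)
  for (k in 1:K) {
    # regularized covariance: alpha * Sigma_k + (1 - alpha) * Sigma_pool
    Sk = alpha * sigma_list[[k]] + (1 - alpha) * sigma_pool
    Sk_inv = solve(Sk)
    centered = sweep(testX, 2, mu_list[k, ])
    # quadratic discriminant:
    # -0.5 * log|Sigma_k(alpha)| - 0.5 * (x - mu_k)' Sigma_k(alpha)^{-1} (x - mu_k) + log(pi_k)
    logdet = as.numeric(determinant(Sk, logarithm = TRUE)$modulus)
    quad = rowSums((centered %*% Sk_inv) * centered)
    scores[, k] = -0.5 * logdet - 0.5 * quad + log(prior[k])
  }
  max.col(scores) - 1                      # rows of mu_list are digits 0..9
}

For example, sapply(alpha_all, function(a) mean(myRQDA(Xtest, mu_list, sigma_list, sigma_pool, a, prior) == Ytest)) would give the testing accuracy for each candidate \(\alpha\).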