Machine Learning
LEC 05: Regularizers, basis functions and cross-validation
Instructor: Mohammad Akbari
Outline of the lecture

This lecture will teach you how to fit nonlinear functions using basis functions and how to control model complexity. The goal is for you to:
• Learn how to derive ridge regression.
• Understand the trade-off between fitting the data and regularizing it.
• Learn polynomial regression.
• Understand that, if the basis functions are given, the problem of learning the parameters is still linear.
• Learn cross-validation.
• Understand model complexity and generalization.
Regularization
Derivation
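The original slide content is not preserved in this extraction; a brief sketch of the standard derivation, consistent with the cost J(θ) = (y − Φθ)ᵀ(y − Φθ) + δ²θᵀθ used later in this lecture:

```latex
J(\theta) = (y - \Phi\theta)^{\top}(y - \Phi\theta) + \delta^{2}\,\theta^{\top}\theta
\]
Setting the gradient to zero:
\[
\nabla_{\theta} J = -2\,\Phi^{\top}(y - \Phi\theta) + 2\,\delta^{2}\theta = 0
\;\Rightarrow\; (\Phi^{\top}\Phi + \delta^{2} I)\,\theta = \Phi^{\top} y
\;\Rightarrow\; \hat{\theta}_{\mathrm{ridge}} = (\Phi^{\top}\Phi + \delta^{2} I)^{-1}\Phi^{\top} y
```

Note that for δ > 0 the matrix ΦᵀΦ + δ²I is always invertible, even when ΦᵀΦ is singular.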
Ridge regression as constrained optimization
Regularization paths

[Hastie, Tibshirani & Friedman book]

[Figure: regularization paths. Each coefficient θ₁, θ₂, ..., θ₈ is plotted against t(δ); as δ increases, t(δ) decreases and each θᵢ goes to zero.]
Ridge regression and Maximum a Posteriori (MAP) learning
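The slide content is not preserved here; a short sketch of the standard argument (assuming a Gaussian likelihood with noise variance σ² and a zero-mean Gaussian prior on θ with variance τ²):

```latex
\hat{\theta}_{\mathrm{MAP}}
  = \arg\max_{\theta}\; \log p(y \mid \Phi, \theta) + \log p(\theta)
  = \arg\min_{\theta}\; \frac{1}{2\sigma^{2}}\,\|y - \Phi\theta\|^{2}
    + \frac{1}{2\tau^{2}}\,\|\theta\|^{2}
```

This is the ridge objective up to a constant scale, with regularization coefficient δ² = σ²/τ²: a tighter prior (smaller τ) means stronger regularization.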
Going nonlinear via basis functions

We introduce basis functions to deal with nonlinearity. For example:
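A minimal sketch (the toy data and the degree are illustrative): expand scalar inputs into polynomial features and solve ordinary least squares. The fitted function is nonlinear in x, but the learning problem is still linear in θ.

```python
import numpy as np

def poly_features(x, degree):
    """Build the design matrix Phi = [1, x, x^2, ..., x^degree]."""
    return np.vander(x, degree + 1, increasing=True)

# Toy data: a noisy quadratic (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.1 * rng.standard_normal(x.size)

Phi = poly_features(x, degree=2)
# Ordinary least squares in the expanded features: linear in theta.
theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(theta)  # close to the true coefficients [1, 2, -3]
```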
Going nonlinear via basis functions
Effect of data when we have the right model

yᵢ = θ₀ + xᵢθ₁ + xᵢ²θ₂ + N(0, σ²)
Effect of data when the model is too simple

yᵢ = θ₀ + xᵢθ₁ + xᵢ²θ₂ + N(0, σ²)
Effect of data when the model is very complex

yᵢ = θ₀ + xᵢθ₁ + xᵢ²θ₂ + N(0, σ²)
Example: Ridge regression with a polynomial of degree 14

y(xᵢ) = θ₀ + xᵢθ₁ + xᵢ²θ₂ + ... + xᵢ¹³θ₁₃ + xᵢ¹⁴θ₁₄

Φ = [1  xᵢ  xᵢ²  ...  xᵢ¹³  xᵢ¹⁴]

J(θ) = (y − Φθ)ᵀ(y − Φθ) + δ²θᵀθ

[Figure: three fits of y against x, for small δ, medium δ and large δ.]
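A sketch of the closed-form ridge fit for the degree-14 example (the data and the δ grid are illustrative): as δ grows, the penalty δ²θᵀθ shrinks the coefficient vector.

```python
import numpy as np

def ridge_fit(Phi, y, delta):
    """Minimize (y - Phi theta)^T (y - Phi theta) + delta^2 theta^T theta."""
    d = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + delta**2 * np.eye(d), Phi.T @ y)

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 21)
y = np.sin(3 * x) + 0.1 * rng.standard_normal(x.size)
Phi = np.vander(x, 15, increasing=True)  # degree-14 polynomial features

# The norm of theta shrinks monotonically as delta increases.
norms = [np.linalg.norm(ridge_fit(Phi, y, delta)) for delta in (0.01, 0.1, 1.0)]
print(norms)
```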
Kernel regression and RBFs

We can use kernels or radial basis functions (RBFs) as features:
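A minimal sketch of RBF features (Gaussian bumps; the width λ and centre placement below are assumptions): each feature is φⱼ(x) = exp(−(x − µⱼ)²/(2λ²)), and the fit remains linear in θ.

```python
import numpy as np

def rbf_features(x, centers, width):
    """Design matrix of Gaussian bumps: Phi[i, j] = exp(-(x_i - mu_j)^2 / (2 width^2))."""
    return np.exp(-(x[:, None] - centers[None, :])**2 / (2 * width**2))

x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x)

# Kernel choice: one basis function on each input, mu_i = x_i.
Phi = rbf_features(x, centers=x, width=0.1)
theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
y_hat = Phi @ theta
print(np.max(np.abs(y_hat - y)))  # small training residual at a reasonable width
```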
We can choose the locations µ of the basis functions to be the inputs; that is, µᵢ = xᵢ. These basis functions are then known as kernels. The choice of width λ is tricky, as illustrated below.

[Figure: kernel fits with λ too small, λ right, and λ too large.]
The big question is: how do we choose the regularization coefficient, the width of the kernels, or the polynomial order?
Simple solution: cross-validation
K-fold cross-validation
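A minimal sketch of K-fold cross-validation for choosing δ in ridge regression (K = 5, the data, and the candidate grid are assumptions): split the data into K folds, train on K − 1 of them, measure error on the held-out fold, and average over folds.

```python
import numpy as np

def ridge_fit(Phi, y, delta):
    """Closed-form ridge solution theta = (Phi^T Phi + delta^2 I)^{-1} Phi^T y."""
    d = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + delta**2 * np.eye(d), Phi.T @ y)

def kfold_cv_error(Phi, y, delta, K=5):
    """Average held-out squared error over K folds."""
    folds = np.array_split(np.arange(len(y)), K)
    errors = []
    for fold in folds:
        train = np.setdiff1d(np.arange(len(y)), fold)
        theta = ridge_fit(Phi[train], y[train], delta)
        errors.append(np.mean((y[fold] - Phi[fold] @ theta)**2))
    return np.mean(errors)

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 60)
y = np.sin(3 * x) + 0.1 * rng.standard_normal(x.size)
Phi = np.vander(x, 15, increasing=True)  # degree-14 polynomial features

deltas = [1e-3, 1e-2, 1e-1, 1.0, 10.0]
cv = [kfold_cv_error(Phi, y, d) for d in deltas]
best = deltas[int(np.argmin(cv))]
print(best)  # the delta with the lowest cross-validation error
```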
Example: Ridge regression with a polynomial of degree 14
Where cross-validation fails (K-means)
Next lecture

In the next lecture, we delve into the world of optimization. Please revise your multivariable calculus, and in particular the definition of the gradient.