Machine Learning — WS2019 — Module IN2064 · Sheet 04
Exercise 04: Linear Regression

1 Linear regression

Problem 1: Consider the weighted sum-of-squares error function

$$E_D(\mathbf{w}) = \frac{1}{2} \sum_{i=1}^{N} t_i \left( \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_i) - y_i \right)^2,$$

where each $t_i > 0$ is a weighting factor for the corresponding data point.
Find the equation for the value of $\mathbf{w}$ that minimizes this error function. Furthermore, explain how this weighting factor, $t_i$, can be interpreted in terms of

1) the variance of the noise on the data and
2) data points for which there are exact copies in the dataset.
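The closed form is what the problem asks you to derive, so it is not spelled out here; the short numpy sketch below only checks the duplication interpretation numerically, under the assumption that the weighted minimizer solves the weighted normal equations $\boldsymbol{\Phi}^T \mathbf{T} \boldsymbol{\Phi} \mathbf{w} = \boldsymbol{\Phi}^T \mathbf{T} \mathbf{y}$ with $\mathbf{T} = \operatorname{diag}(t_1, \dots, t_N)$. The toy data are an assumption of the sketch, not part of the sheet.

```python
import numpy as np

# Sanity-check sketch (numpy, toy data): a point with weight t_i = 3 in the
# weighted error should behave exactly like 3 exact copies of that point in
# the unweighted error. The weighted normal equations are assumed here.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(10, 3))              # design matrix, N = 10, M = 3
y = rng.normal(size=10)

t = np.ones(10)
t[0] = 3.0                                  # triple weight on the first point
T = np.diag(t)
w_weighted = np.linalg.solve(Phi.T @ T @ Phi, Phi.T @ T @ y)

# Unweighted least squares with the first point appearing 3 times in total
Phi_dup = np.vstack([Phi, Phi[0], Phi[0]])
y_dup = np.concatenate([y, y[:1], y[:1]])
w_dup = np.linalg.lstsq(Phi_dup, y_dup, rcond=None)[0]

print(np.allclose(w_weighted, w_dup))       # True
```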
2 Ridge regression
Problem 2: Show that the following holds: the ridge regression estimates can be obtained by ordinary least squares regression on an augmented dataset. Augment the design matrix $\boldsymbol{\Phi} \in \mathbb{R}^{N \times M}$ with $M$ additional rows $\sqrt{\lambda}\, \mathbf{I}_{M \times M}$, and augment $\mathbf{y}$ with $M$ zeros.
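Before writing the proof, you can convince yourself numerically that the claim holds. In the sketch below, numpy, the toy data, and the standard ridge solution $(\boldsymbol{\Phi}^T \boldsymbol{\Phi} + \lambda \mathbf{I})^{-1} \boldsymbol{\Phi}^T \mathbf{y}$ are assumptions of the sketch, not part of the problem statement.

```python
import numpy as np

# Sketch: OLS on the augmented dataset [Phi; sqrt(lam) * I], [y; 0] should
# reproduce the ridge solution (Phi^T Phi + lam * I)^{-1} Phi^T y.
rng = np.random.default_rng(1)
N, M, lam = 20, 5, 0.7
Phi = rng.normal(size=(N, M))
y = rng.normal(size=N)

w_ridge = np.linalg.solve(Phi.T @ Phi + lam * np.eye(M), Phi.T @ y)

Phi_aug = np.vstack([Phi, np.sqrt(lam) * np.eye(M)])   # N + M rows
y_aug = np.concatenate([y, np.zeros(M)])
w_ols_aug = np.linalg.lstsq(Phi_aug, y_aug, rcond=None)[0]

print(np.allclose(w_ridge, w_ols_aug))      # True
```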
Problem 3: Derive the closed-form solution for the ridge regression error function

$$E_{\text{ridge}}(\mathbf{w}) = \frac{1}{2} \sum_{i=1}^{N} \left( \mathbf{w}^T \boldsymbol{\phi}(\mathbf{x}_i) - y_i \right)^2 + \frac{\lambda}{2} \mathbf{w}^T \mathbf{w}.$$
Additionally, discuss the scenario when the number of training samples $N$ is smaller than the number of basis functions $M$. What computational issues arise in this case? How does regularization address them?
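For the $N < M$ discussion, the following sketch makes the rank deficiency concrete; numpy and the toy dimensions are illustrative assumptions.

```python
import numpy as np

# With fewer samples than basis functions, Phi^T Phi is rank-deficient and
# has no inverse; adding lam * I restores full rank.
rng = np.random.default_rng(2)
N, M, lam = 5, 8, 0.1                       # N < M
Phi = rng.normal(size=(N, M))
y = rng.normal(size=N)

A = Phi.T @ Phi                             # M x M, rank at most N < M
print(np.linalg.matrix_rank(A))             # 5 -> singular, no unique inverse

A_reg = A + lam * np.eye(M)
print(np.linalg.matrix_rank(A_reg))         # 8 -> invertible
w = np.linalg.solve(A_reg, Phi.T @ y)       # now well-defined
```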
Problem 4: In class, we only considered functions of the form $f : \mathbb{R}^n \to \mathbb{R}$. What about the general case of $f : \mathbb{R}^n \to \mathbb{R}^m$? For linear regression with multiple outputs, write down the log-likelihood formulation and derive the MLE of the parameters.
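As a numerical hint (not the requested derivation): if the noise is modeled as i.i.d. Gaussian per output, one would expect the multi-output MLE to decouple into one ordinary least-squares problem per output column. The numpy sketch below checks this expectation on toy data; the data and dimensions are assumptions.

```python
import numpy as np

# Fit all m outputs jointly, then one output column at a time, and compare.
rng = np.random.default_rng(3)
N, M, m = 50, 4, 3
Phi = rng.normal(size=(N, M))
Y = rng.normal(size=(N, m))                 # m regression targets per sample

W_joint = np.linalg.lstsq(Phi, Y, rcond=None)[0]          # all outputs at once
W_cols = np.stack([np.linalg.lstsq(Phi, Y[:, j], rcond=None)[0]
                   for j in range(m)], axis=1)
print(np.allclose(W_joint, W_cols))         # True: the outputs decouple
```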
Problem 5: We want to perform regression on a dataset consisting of $N$ samples $\mathbf{x}_i \in \mathbb{R}^D$ with corresponding targets $y_i \in \mathbb{R}$ (represented compactly as $\mathbf{X} \in \mathbb{R}^{N \times D}$ and $\mathbf{y} \in \mathbb{R}^N$).

Assume that we have fitted an $L_2$-regularized linear regression model and obtained the optimal weight vector $\mathbf{w}^* \in \mathbb{R}^D$ as

$$\mathbf{w}^* = \arg\min_{\mathbf{w}} \frac{1}{2} \sum_{i=1}^{N} \left( \mathbf{w}^T \mathbf{x}_i - y_i \right)^2 + \frac{\lambda}{2} \mathbf{w}^T \mathbf{w}.$$

Note that there is no bias term.

Now, assume that we obtained a new data matrix $\mathbf{X}_{\text{new}}$ by scaling all samples by the same positive factor $a \in (0, 1)$. That is, $\mathbf{X}_{\text{new}} = a\mathbf{X}$ (and respectively $\mathbf{x}_i^{\text{new}} = a\mathbf{x}_i$).
a) Find the weight vector $\mathbf{w}_{\text{new}}$ that will produce the same predictions on $\mathbf{X}_{\text{new}}$ as $\mathbf{w}^*$ produces on $\mathbf{X}$.
b) Find the regularization factor $\lambda_{\text{new}} \in \mathbb{R}$ such that the solution $\mathbf{w}^*_{\text{new}}$ of the new $L_2$-regularized linear regression problem

$$\mathbf{w}^*_{\text{new}} = \arg\min_{\mathbf{w}} \frac{1}{2} \sum_{i=1}^{N} \left( \mathbf{w}^T \mathbf{x}_i^{\text{new}} - y_i \right)^2 + \frac{\lambda_{\text{new}}}{2} \mathbf{w}^T \mathbf{w}$$

is equal to the vector $\mathbf{w}_{\text{new}}$ from part a).
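A small numpy experiment can guide your answer to part a). The compensating rescaling it tests is a candidate solution to check, and the toy data are an assumption of the sketch, not part of the problem.

```python
import numpy as np

# Predictions X @ w are invariant if scaling the inputs by a is compensated
# by rescaling the weights accordingly.
rng = np.random.default_rng(4)
N, D, a, lam = 30, 4, 0.5, 1.0
X = rng.normal(size=(N, D))
y = rng.normal(size=N)

w_star = np.linalg.solve(X.T @ X + lam * np.eye(D), X.T @ y)

X_new = a * X
w_new = w_star / a                          # candidate for part a)
print(np.allclose(X_new @ w_new, X @ w_star))  # True: identical predictions
```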
In-class Exercises
Problem 7: Assume that we are given a dataset, where each sample $x_i$ and regression target $y_i$ is generated according to the following process. The 4 regression algorithms below are applied to the given data. Your task is to say whether the bias and the variance of these models are low or high. Provide a 1-2 sentence explanation for each of your answers.
a) Linear regression
b) Polynomial regression with degree 3
c) Polynomial regression with degree 10
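Since the generating process itself is not reproduced above, the sketch below uses a hypothetical stand-in, $y = \sin(2\pi x) + \varepsilon$, purely to illustrate how the spread of refitted models (their variance) grows with polynomial degree. Everything in the sketch is an assumption, not part of the problem.

```python
import numpy as np

# Illustration only: y = sin(2*pi*x) + noise stands in for the generating
# process. Refitting each model class on independent datasets and watching
# the spread of its predictions at a fixed point exposes its variance.
rng = np.random.default_rng(5)

def fit_on_fresh_sample(degree, n=30):
    x = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n)
    return np.polyfit(x, y, degree)          # polynomial regression

for degree in (1, 3, 10):
    # Prediction at x = 0.5 across 200 independent training sets
    preds = [np.polyval(fit_on_fresh_sample(degree), 0.5) for _ in range(200)]
    print(degree, round(float(np.var(preds)), 4))
```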
Problem 8: Given is a training set consisting of samples $\mathbf{X} = \{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ with respective regression targets $\mathbf{y} = \{y_1, y_2, \dots, y_N\}$, where $\mathbf{x}_i \in \mathbb{R}^D$ and $y_i \in \mathbb{R}$.

Alice fits a linear regression model $f(\mathbf{x}_i) = \mathbf{w}^T \mathbf{x}_i$ to the dataset using the closed-form solution for linear regression (normal equations).

Bob has heard that by transforming the inputs $\mathbf{x}_i$ with a vector-valued function $\boldsymbol{\phi}$, he can fit an alternative function, $g(\mathbf{x}_i) = \mathbf{v}^T \boldsymbol{\phi}(\mathbf{x}_i)$, using the same procedure (solving the normal equations). He decides to use a linear transformation $\boldsymbol{\phi}(\mathbf{x}_i) = \mathbf{A}^T \mathbf{x}_i$, where $\mathbf{A} \in \mathbb{R}^{D \times D}$ has full rank.
a) Show that Bob's procedure will fit the same function as Alice's original procedure, that is, $f(\mathbf{x}) = g(\mathbf{x})$ for all $\mathbf{x} \in \mathbb{R}^D$ (given that $\mathbf{w}$ and $\mathbf{v}$ minimize the training set error).

b) Can Bob's procedure lead to a lower training set error than Alice's if the matrix $\mathbf{A}$ is not invertible? Explain your answer.
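For part a), a quick numerical check (an illustration under toy-data assumptions, not the requested proof) is to fit both models and compare their fitted values. Note that with $\boldsymbol{\phi}(\mathbf{x}_i) = \mathbf{A}^T \mathbf{x}_i$, the rows of Bob's design matrix are $\mathbf{x}_i^T \mathbf{A}$, i.e. his design matrix is $\mathbf{X}\mathbf{A}$.

```python
import numpy as np

# With an invertible A, fitting on phi(x) = A^T x yields the same fitted
# values as fitting on x directly, with v = A^{-1} w.
rng = np.random.default_rng(6)
N, D = 40, 3
X = rng.normal(size=(N, D))
y = rng.normal(size=N)
A = rng.normal(size=(D, D))                 # full rank with probability 1

w = np.linalg.lstsq(X, y, rcond=None)[0]            # Alice
v = np.linalg.lstsq(X @ A, y, rcond=None)[0]        # Bob: design matrix X A
print(np.allclose(X @ w, (X @ A) @ v))              # True: same function
print(np.allclose(v, np.linalg.solve(A, w)))        # True: v = A^{-1} w
```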
Upload a single PDF file with your homework solution to Moodle by 10.11.2019, 23:59 CET. We recommend typesetting your solution (using LaTeX or Word), but handwritten solutions are also accepted. If your handwritten solution is illegible, it won't be graded and you waive your right to dispute that.