Midtermsols Sp2010
L. El Ghaoui
Midterm Solutions
1. Consider the set
P := {x ∈ R3 : x1 + 2x2 + 3x3 = 1}.
(a) Show that the set P is an affine subspace of dimension 2. To this end, express it
as x0 + span(x1 , x2 ), where x0 ∈ P, and x1 , x2 are independent vectors.
(b) Find the minimum Euclidean distance from 0 to the set P. Find a point that
achieves the minimum distance. (Hint: either apply a formula if you know it, or
prove that the minimum-distance point is proportional to the vector a := (1, 2, 3).)
Solutions:
(a) A point x belongs to P if and only if x1 = 1 − 2x2 − 3x3 . Hence every x ∈ P can be written as
x = x0 + λx1 + µx2 , with x0 := (1, 0, 0) ∈ P, x1 := (−2, 1, 0), x2 := (−3, 0, 1).
We check that the two vectors x1 , x2 are indeed independent: λx1 + µx2 = 0, read on the last two components, implies λ = µ = 0. Thus P = x0 + span(x1 , x2 ) is an affine subspace of dimension 2.
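As a quick numerical sanity check (a sketch, not part of the original solutions), one can verify that every point of the form x0 + λx1 + µx2 satisfies the defining equation of P, and that the two spanning vectors are independent:

```python
import numpy as np

# Particular point and spanning vectors found in part (a).
x0 = np.array([1.0, 0.0, 0.0])
x1 = np.array([-2.0, 1.0, 0.0])
x2 = np.array([-3.0, 0.0, 1.0])
a = np.array([1.0, 2.0, 3.0])

# Any affine combination x0 + lam*x1 + mu*x2 must satisfy a^T x = 1.
rng = np.random.default_rng(0)
ok = True
for _ in range(100):
    lam, mu = rng.standard_normal(2)
    x = x0 + lam * x1 + mu * x2
    ok = ok and abs(a @ x - 1.0) < 1e-9

# Independence: the 3x2 matrix [x1 x2] has rank 2.
rank = np.linalg.matrix_rank(np.column_stack([x1, x2]))
```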
(b) The minimum distance from 0 to the affine set {x : Ax = b}, with b ∈ Rm , and A ∈ Rm×n full row rank (that is, AAT is positive-definite), is achieved at the point
x∗ = AT (AAT )−1 b.
Applying this formula to A = aT , b = 1, yields
x∗ = a/(aT a) = (1/14) (1, 2, 3).
The minimum distance is ‖x∗‖₂ = 1/√(aT a) = 1/√14.
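This closed-form answer is easy to confirm numerically (a sketch, not part of the original solutions): the claimed minimizer is feasible, and moving within P along the spanning directions of part (a) can only increase the norm.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
x_star = a / (a @ a)            # claimed minimizer, (1/14)*(1, 2, 3)
dist = np.linalg.norm(x_star)   # claimed minimum distance, 1/sqrt(14)

# x_star lies on P ...
feasible = abs(a @ x_star - 1.0) < 1e-12

# ... and perturbing within P (directions orthogonal to a) only increases the norm.
worse = True
for z in [np.array([-2.0, 1.0, 0.0]), np.array([-3.0, 0.0, 1.0])]:
    for eps in (-0.1, 0.1):
        worse = worse and np.linalg.norm(x_star + eps * z) >= dist
```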
Alternatively, we notice that any vector x ∈ R3 can be decomposed as x = ta + z, with t ∈ R and z ∈ R3 such that z T a = 0. The condition x ∈ P then implies t = 1/aT a. Since ‖x‖₂² = t² aT a + z T z = (1/aT a) + z T z, the minimum Euclidean distance problem reads
min_{x∈P} ‖x‖₂ = min_z { √( (1/aT a) + z T z ) : aT z = 0 },
whose optimum is attained at z = 0, recovering x∗ = a/(aT a) and the minimum distance 1/√(aT a) = 1/√14.
2. (8 points) Consider the operation of finding the point symmetric to a given point about
a given line L in Rn .
[Figure 1: a point x and its symmetric about the line through x0 .]
We define the line as L := {x0 + tu, t ∈ R}, where x0 is a point on the line, and u its
direction, which we assume is normalized: ‖u‖₂ = 1. For a given point x ∈ Rn , we
denote by f (x) ∈ Rn the point that is symmetric to x about the line (see Fig. 1).
That is, f (x) = 2p(x) − x, where p(x) is the projection of x on the line.
(a) Show that the mapping f is affine. Describe it in terms of an n × n matrix A and
an n × 1 vector b, such that f (x) = Ax + b for every x. (It will be useful to use the
notation P := uuT .)
(b) What is the geometric interpretation of the vector b?
(c) Show that the mapping f is linear if and only if the line passes through 0.
(d) Show that f (f (x)) = x for every x. What is the geometric meaning of this
property?
(e) What are the range and nullspace of the matrix A? What is the rank of A? Is A
invertible?
(f) Show that A is symmetric, and find its eigenvalue decomposition (EVD). Hint: define
u2 , . . . , un to be an orthonormal basis for the subspace orthogonal to u, and show
that the orthogonal matrix U := [u, u2 , . . . , un ] contains eigenvectors of A.
(g) Find a singular value decomposition (SVD) of A. What is the relationship between
the EVD of A and its SVD?
(h) Assume that the input is bounded: ‖x‖₂ ≤ 1. Find a bound on the Euclidean
norm of the output f (x). Find an input x that achieves the bound.
Solutions:
(a) The projection of x on the line is p(x) = x0 + P (x − x0 ), with P := uuT . Hence
f (x) = 2p(x) − x = (2P − I)(x − x0 ) + x0 = Ax + b,
where A = 2P − I, b = 2(I − P )x0 .
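A small numerical sketch (with an arbitrarily chosen line; not part of the original solutions) confirms that A = 2P − I and b = 2(I − P )x0 reproduce the definition f (x) = 2p(x) − x:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
u = rng.standard_normal(n)
u /= np.linalg.norm(u)            # unit direction of the line
x0 = rng.standard_normal(n)       # a point on the line

P = np.outer(u, u)                # P := u u^T, projection on span(u)
A = 2 * P - np.eye(n)
b = 2 * (np.eye(n) - P) @ x0

x = rng.standard_normal(n)
p_x = x0 + P @ (x - x0)           # projection of x on the line
f_direct = 2 * p_x - x            # definition of the symmetric point
f_affine = A @ x + b              # affine form derived in (a)
err = np.linalg.norm(f_direct - f_affine)
```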
(b) Since f (0) = b, the latter is simply the point symmetric to the origin about the line.
(c) The mapping is linear if and only if b = 0, that is, when x0 satisfies P x0 = x0 . In
that case x0 = uuT x0 = (uT x0 )u is proportional to u, and the line goes through 0,
since 0 = x0 + tu with t = −(uT x0 ). Conversely, if the line passes through 0, then
x0 = tu for some t ∈ R, hence P x0 = x0 and b = 0.
(d) We have P u = u. Further, P 2 = (uuT )(uuT ) = (uT u)uuT = uuT = P . The latter
implies
A2 = (2P − I)(2P − I) = 4P 2 − 4P + I = I.
In addition, P b = 2P (I − P )x0 = 0, so that Ab = (2P − I)b = −b. It follows that
f (f (x)) = A(Ax + b) + b = A2 x + Ab + b = x − b + b = x.
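The three identities used above (A² = I, Ab = −b, and the involution f (f (x)) = x) can be checked numerically on a randomly chosen line (a sketch, not part of the original solutions):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
u = rng.standard_normal(n)
u /= np.linalg.norm(u)
x0 = rng.standard_normal(n)

P = np.outer(u, u)
A = 2 * P - np.eye(n)
b = 2 * (np.eye(n) - P) @ x0
f = lambda x: A @ x + b

x = rng.standard_normal(n)
A2_err = np.linalg.norm(A @ A - np.eye(n))   # A^2 = I
Ab_err = np.linalg.norm(A @ b + b)           # A b = -b
inv_err = np.linalg.norm(f(f(x)) - x)        # f(f(x)) = x
```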
The geometric meaning is simply that the symmetric of the symmetric point is the point itself.
(e) The nullspace of A is the set of vectors x with Ax = 0, that is, 2P x = x. Any such
x = 2(uT x)u is proportional to u; but Au = u ≠ 0, so Ax = 0 forces uT x = 0, hence
x = 2(uT x)u = 0. We conclude that the nullspace is {0}, the range is Rn , and A is
full rank, hence invertible since it is also square. (This also follows from A2 = I,
which shows that A is its own inverse.)
(f) Since A = 2uuT − I, it is symmetric.
Let u2 , . . . , un be an orthonormal basis for the subspace orthogonal to u; we have
uTi u = 0, i = 2, . . . , n. We have Au = u and Aui = −ui , i = 2, . . . , n. Hence the
vector u is an eigenvector associated with the eigenvalue 1, and the ui , i = 2, . . . , n,
are eigenvectors all associated with the eigenvalue −1. Writing these conditions
compactly as AU = U Λ, with U = [u, u2 , . . . , un ] an orthogonal matrix and
Λ = diag(1, −1, . . . , −1), we obtain that A admits the symmetric eigenvalue de-
composition A = U ΛU T .
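The claimed spectrum, eigenvalue 1 with eigenvector u and eigenvalue −1 with multiplicity n − 1, can be verified numerically (a sketch with a random unit direction, not part of the original solutions):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
u = rng.standard_normal(n)
u /= np.linalg.norm(u)
A = 2 * np.outer(u, u) - np.eye(n)

# Symmetric EVD; eigenvalues should be -1 (n-1 times) and +1 (once).
eigvals, eigvecs = np.linalg.eigh(A)
spectrum = [int(round(v)) for v in eigvals]   # ascending order

# The eigenvector for eigenvalue 1 is u, up to sign.
v = eigvecs[:, -1]
align = abs(v @ u)   # should equal 1
```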
(g) We have P ui = (uT ui )u = 0, i = 2, . . . , n. With U := [u, u2 , . . . , un ], we get
P U = [P u, P u2 , . . . , P un ] = [u, 0, . . . , 0], therefore
AU = [Au, Au2 , . . . , Aun ] = [u, −u2 , . . . , −un ] =: V.
Both U, V are orthogonal matrices. Post-multiplying the above relation by U T =
U −1 , we obtain A = V U T , which is the SVD of A, with V the left singular vectors,
and U the right singular vectors. Note that every singular value of A is one, which
is consistent with A2 = AAT = AT A = I.
The relationship with the SVD is simply that the eigenvectors u, u2 , . . . , un are
the right singular vectors as well. Flipping the signs on the last n − 1 eigenvectors
provides the left singular vectors.
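That all singular values equal one (equivalently, that A is orthogonal) is also quick to confirm numerically (a sketch, not part of the original solutions):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
u = rng.standard_normal(n)
u /= np.linalg.norm(u)
A = 2 * np.outer(u, u) - np.eye(n)

s = np.linalg.svd(A, compute_uv=False)           # singular values of A
sv_err = np.max(np.abs(s - 1.0))                 # all equal to one
orth_err = np.linalg.norm(A.T @ A - np.eye(n))   # A^T A = I
```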
(h) We want to solve
max { ‖Ax + b‖₂ : ‖x‖₂ ≤ 1 }.
Since A is orthogonal, ‖Ax‖₂ = ‖x‖₂ ≤ 1, so the triangle inequality gives
‖Ax + b‖₂ ≤ ‖Ax‖₂ + ‖b‖₂ ≤ 1 + ‖b‖₂ . If b ≠ 0, the bound is attained at
x = −b/‖b‖₂ : since Ab = −b, we get Ax = b/‖b‖₂ , so that Ax + b = (1 + ‖b‖₂ )(b/‖b‖₂ )
has norm 1 + ‖b‖₂ . (If b = 0, any unit-norm x attains the bound, equal to 1.)
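A numerical sketch (with an arbitrarily chosen line; not part of the original solutions) confirms that the optimal value of this maximization is 1 + ‖b‖₂ , attained at x = −b/‖b‖₂ :

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
u = rng.standard_normal(n)
u /= np.linalg.norm(u)
x0 = rng.standard_normal(n)

P = np.outer(u, u)
A = 2 * P - np.eye(n)
b = 2 * (np.eye(n) - P) @ x0       # nonzero for a generic x0

bound = 1 + np.linalg.norm(b)

# Random unit-norm inputs never exceed the bound ...
vals = []
for _ in range(1000):
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)
    vals.append(np.linalg.norm(A @ x + b))
max_sampled = max(vals)

# ... and x = -b/||b||_2 attains it (A b = -b, so A x = b/||b||_2).
x_opt = -b / np.linalg.norm(b)
attained = np.linalg.norm(A @ x_opt + b)
```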
3. Consider data points xi ∈ Rn , i = 1, . . . , m, and, for a unit-norm vector w ∈ Rn ,
the line through the origin L(w) := {tw : t ∈ R}. We then consider the projection of
the points xi , i = 1, . . . , m, on the line L(w), and look at the associated coordinates
of the points on the line. These projected values are given by
ti (w) := arg min_t ‖tw − xi ‖₂ , i = 1, . . . , m.
We assume that for any w, the empirical average t̂(w) of the projected values ti (w),
i = 1, . . . , m, and their empirical variance σ 2 (w), are both constant, independent of
the direction w (with ‖w‖₂ = 1). Denote by t̂ and σ 2 the (constant) empirical average
and variance. Justify your answers to the following as carefully as you can.
(a) Show that ti (w) = wT xi , i = 1, . . . , m.
(b) Show that the empirical average x̂ of the data points xi , i = 1, . . . , m, is zero.
(c) Show that the empirical covariance matrix of the data points,
Σ := (1/m) ∑_{i=1}^m (xi − x̂)(xi − x̂)T ,
is of the form σ 2 · I, where I is the identity matrix of order n. (Hint: the largest
eigenvalue λmax of the matrix Σ can be written as λmax = max_w {wT Σw : wT w =
1}, and a similar expression holds for the smallest eigenvalue.)
Solutions:
(a) Let us drop the index i for a moment, and solve the least-squares problem with
variable t ∈ R:
p∗ := min_t ‖tw − x‖₂² .
One can apply the closed-form solution of the least-squares problem, in which the ma-
trix involved is the full column-rank matrix w. This leads to t(w) = (wT w)−1 wT x =
wT x. (Recall that ‖w‖₂ = 1.)
Alternatively, we can solve the above problem directly, exploiting again ‖w‖₂ = 1:
p∗ = min_t { t² − 2(wT x)t + ‖x‖₂² } = min_t { (t − wT x)² + C }, with C := ‖x‖₂² − (wT x)².
The quantity C is constant (independent of the variable t). The first term in
the objective function above is non-negative, hence p∗ ≥ C. This lower bound is
attained with t = wT x.
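The closed form t(w) = wT x can be checked against a generic least-squares solver (a sketch, not part of the original solutions):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5
w = rng.standard_normal(n)
w /= np.linalg.norm(w)          # unit-norm direction
x = rng.standard_normal(n)

t_closed = w @ x                # claimed minimizer of ||t w - x||_2

# Compare with the generic least-squares solution for the n x 1 "matrix" w.
t_lstsq = np.linalg.lstsq(w.reshape(-1, 1), x, rcond=None)[0][0]
err = abs(t_closed - t_lstsq)
```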
(b) The empirical average of the numbers ti (w), i = 1, . . . , m, is
t̂(w) = (1/m) ∑_{i=1}^m ti (w) = (1/m) ∑_{i=1}^m wT xi = wT x̂,
where x̂ is the empirical average of the data points. We obtain that there is a
constant α ∈ R such that
∀ w, ‖w‖₂ = 1 : wT x̂ = α.
Expressing the condition above for both w and −w, we obtain that α = 0. This
means that x̂ is orthogonal to any (unit-norm) vector, hence it is zero.
(c) The empirical variance of the numbers ti (w), i = 1, . . . , m, is given by
σ²(w) = (1/m) ∑_{i=1}^m (ti (w) − t̂(w))² = (1/m) ∑_{i=1}^m (wT (xi − x̂))² = wT Σw,
where Σ is the empirical covariance matrix of the data points. The property of
constant variance is thus equivalent to the fact that the quadratic form w →
wT Σw is a constant function on the unit sphere {w : ‖w‖₂ = 1}. We have denoted
by σ² this constant.
The largest and smallest eigenvalues of Σ admit the variational representations
λmax (Σ) = max_w {wT Σw : ‖w‖₂ = 1}, λmin (Σ) = min_w {wT Σw : ‖w‖₂ = 1}.
Since the objective function of these problems is the same constant function, we
obtain λmax (Σ) = λmin (Σ) = σ². Hence all eigenvalues of Σ are equal to σ²;
that is, the diagonal matrix of eigenvalues is Λ = σ² I. The eigenvalue
decomposition of Σ is of the form Σ = U T ΛU , with U an orthogonal matrix of
eigenvectors. Since Λ = σ² I, we obtain Σ = σ² U T U = σ² I, as claimed.
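The key identity underlying part (c), σ²(w) = wT Σw for every unit-norm w, can be verified on random data (a sketch, not part of the original solutions):

```python
import numpy as np

rng = np.random.default_rng(7)
m, n = 200, 3
X = rng.standard_normal((m, n))            # rows are the data points x_i
x_hat = X.mean(axis=0)                     # empirical average
Sigma = (X - x_hat).T @ (X - x_hat) / m    # empirical covariance (1/m convention)

# For a unit-norm direction w, the variance of the projected
# coordinates t_i(w) = w^T x_i equals w^T Sigma w.
w = rng.standard_normal(n)
w /= np.linalg.norm(w)
t = X @ w
var_t = np.mean((t - t.mean()) ** 2)
err = abs(var_t - w @ Sigma @ w)
```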