HW 3

ECEN 689: Fall 2020

Due Date: Thursday, Sep 17, 2020.

Reading Assignment: Please read Appendices A and B, and then Chapters 2, 3, and 4 of the
textbook, in this order.

1. Let X = R, the real line. Let H = {(a, +∞) : a is a real number}. That is, H is the set of
semi-infinite intervals which are unbounded on the right. Determine VC-dim(H).

2. Let X = R, the real line. Let H = {(a1, a2) : a1 and a2 are real numbers}. That is, H is the
set of bounded open intervals. Determine VC-dim(H).
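For Problems 1 and 2, small cases can be explored numerically before writing a proof. The following is a minimal sketch, assuming distinct real points; the function names `labelings_by_rays`, `labelings_by_intervals`, and `is_shattered` are illustrative and not from the textbook. It enumerates the subsets of a finite point set that each class can cut out and compares the count with 2^k. A numerical check on particular point sets is only a sanity check and does not replace the argument that no larger set can be shattered.

```python
def _cuts(pts):
    """Thresholds below, between, and above the sorted distinct points."""
    mids = [(p + q) / 2 for p, q in zip(pts, pts[1:])]
    return [pts[0] - 1.0] + mids + [pts[-1] + 1.0]


def labelings_by_rays(points):
    """All subsets of `points` of the form points ∩ (a, +inf)."""
    pts = sorted(points)
    return {frozenset(p for p in pts if p > a) for a in _cuts(pts)}


def labelings_by_intervals(points):
    """All subsets of `points` of the form points ∩ (a1, a2)."""
    pts = sorted(points)
    cuts = _cuts(pts)
    # a1 >= a2 gives the empty interval, which realizes the empty subset.
    return {
        frozenset(p for p in pts if a1 < p < a2)
        for a1 in cuts
        for a2 in cuts
    }


def is_shattered(points, labelings):
    """True if the class realizes all 2^k subsets of the k given points."""
    return len(labelings(points)) == 2 ** len(points)
```

For example, `is_shattered([0.0, 1.0, 2.0], labelings_by_intervals)` tests one particular three-point set; trying several point sets of each size suggests a conjecture that the proof then has to establish for all point sets.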

3. (i) Let X = R^2, the two-dimensional plane. Let R(a1, a2) := {(x1, x2) : x1 ≥ a1, x2 ≥ a2}
denote the two-dimensional semi-infinite rectangle with “southwest” corner at the
point (a1, a2). Let H consist of all such semi-infinite rectangles, i.e., H = {R(a1, a2) :
a1 and a2 are real numbers}. Determine VC-dim(H).
(ii) Generalize the above result to the higher-dimensional case where X = R^n,
R(a1, a2, ..., an) := {(x1, x2, ..., xn) : x1 ≥ a1, x2 ≥ a2, ..., xn ≥ an}, and H = {R(a1, a2, ..., an) :
a1, a2, ..., an are real numbers}.
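For Problem 3, a shattering check for semi-infinite rectangles can be sketched as follows. It relies on one observation: if any rectangle R(a) contains a subset T, then so does the smallest one, whose corner is the coordinatewise minimum of T, and every rectangle containing T contains that smallest one. So T is cut out exactly when the smallest rectangle contains no other point of the set. The function names below are hypothetical, and points are assumed to be distinct tuples of equal length.

```python
from itertools import combinations


def quadrant_realizes(points, subset):
    """True if some R(a) contains exactly `subset` among `points`."""
    if not subset:
        return True  # move the corner beyond every point
    dim = len(points[0])
    # Corner of the smallest semi-infinite rectangle containing `subset`.
    corner = tuple(min(p[i] for p in subset) for i in range(dim))
    rest = [p for p in points if p not in subset]
    return not any(
        all(p[i] >= corner[i] for i in range(dim)) for p in rest
    )


def is_shattered_by_quadrants(points):
    """True if every subset of `points` is cut out by some R(a)."""
    pts = list(points)
    return all(
        quadrant_realizes(pts, list(sub))
        for r in range(len(pts) + 1)
        for sub in combinations(pts, r)
    )
```

Calling `is_shattered_by_quadrants` on a few hand-picked configurations in R^2 and R^3 is a quick way to test a conjectured value of VC-dim(H) before proving it.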

4. In this problem, you can use the fact that there is uniform convergence in the weak law of
large numbers for a class H of sets if and only if VC-dim(H) is finite. The Glivenko-Cantelli
Theorem says that empirical distribution functions converge in the L∞-norm to the true
distribution function in probability. Here is what it means. Let F(x) be the distribution
function of a random variable X, i.e., P(X ≤ x) = F(x). We wish to estimate this distribution
function. For this purpose we obtain m i.i.d. samples {x1, x2, ..., xm} where each xi ∼ P.
Then we construct the empirical distribution function Gm(x) := (1/m) Σ_{i=1}^{m} 1(xi ≤ x). Show
that, for every ε > 0, P(sup_x |Gm(x) − F(x)| > ε) → 0 as m → ∞.
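The statement in Problem 4 can be illustrated numerically. The sketch below assumes a continuous F (here the Uniform(0, 1) distribution function, chosen only for illustration) and computes sup_x |Gm(x) − F(x)| exactly, using the fact that Gm is a step function that jumps only at the sample points. The function names are hypothetical. This is an illustration of the claim, not a proof of it.

```python
import random


def empirical_cdf_sup_deviation(samples, cdf):
    """sup_x |Gm(x) - F(x)| for a continuous distribution function `cdf`."""
    xs = sorted(samples)
    m = len(xs)
    # At the i-th sorted sample (0-based), Gm jumps from i/m to (i+1)/m,
    # so the supremum is attained at or just before one of the jumps.
    return max(
        max((i + 1) / m - cdf(x), cdf(x) - i / m)
        for i, x in enumerate(xs)
    )


def uniform_cdf(x):
    """Distribution function of Uniform(0, 1)."""
    return min(max(x, 0.0), 1.0)


def deviations_for_sizes(sizes, seed=0):
    """Sup deviation for one simulated sample of each size in `sizes`."""
    rng = random.Random(seed)
    return [
        empirical_cdf_sup_deviation(
            [rng.random() for _ in range(m)], uniform_cdf
        )
        for m in sizes
    ]
```

Running `deviations_for_sizes([10, 100, 1000, 10000])` shows how the deviation behaves as m grows for one simulated sequence of samples.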

5. Let X = {(x1, x2) : 0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 1} be the two-dimensional unit square. Let H0
consist of all sets consisting of a finite number of points from X. Let H contain all the sets
in H0 plus one more set, the square B := {(x1, x2) : 0 ≤ x1 ≤ 1/2, 0 ≤ x2 ≤ 1/2} of area 1/4 that
is inside X. Let D be the uniform probability distribution on X. Let the true hypothesis be
B. Suppose a sample of m points x(1), x(2), ..., x(m) is drawn i.i.d. according to D. Let their
labels be y1, y2, ..., ym, where yi = 1 if both coordinates of x(i) are in [0, 1/2], and yi = 0 otherwise.
(i) Now consider the following estimator ĥm = {x(i) : 1 ≤ i ≤ m, yi = 1} that simply consists
of all the points that were labeled 1. Is ĥm an empirical risk minimizer?
(ii) For a fixed ε = 1/4, what can we say about the probability that the error of ĥm exceeds
ε as m → ∞?
(iii) Now consider the following estimator ĥ'm ≡ B, i.e., whatever the labeled sample is, it
simply outputs the square B of area 1/4. Is ĥ'm an empirical risk minimizer?
(iv) For any fixed ε > 0, what can we say about the probability that the error of ĥ'm exceeds ε?
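The setup of Problem 5 can be simulated as a numerical sanity check. The sketch below is a hedged illustration with hypothetical function names: it draws a labeled sample from the uniform distribution on the unit square, builds the estimator from part (i) that keeps only the points labeled 1, and compares its empirical risk on the sample with a Monte Carlo estimate of its true risk on fresh points. The Monte Carlo estimate is an approximation introduced here and is not part of the assignment; the simulation does not replace the reasoning the problem asks for.

```python
import random


def in_B(x):
    """True label: 1 exactly when x lies in the square B = [0, 1/2]^2."""
    return 0.0 <= x[0] <= 0.5 and 0.0 <= x[1] <= 0.5


def draw(m, rng):
    """m i.i.d. points from the uniform distribution on the unit square."""
    return [(rng.random(), rng.random()) for _ in range(m)]


def memorizer(sample):
    """Estimator from part (i): the finite set of points labeled 1."""
    positives = {x for x in sample if in_B(x)}
    return lambda x: x in positives


def empirical_risk(h, sample):
    """Fraction of points in `sample` on which h disagrees with the label."""
    return sum(h(x) != in_B(x) for x in sample) / len(sample)


def estimated_true_risk(h, rng, n_test=20000):
    """Monte Carlo estimate of the risk of h under D, using fresh points."""
    return empirical_risk(h, draw(n_test, rng))
```

A typical use is to create `rng = random.Random(0)`, draw `sample = draw(m, rng)` for several values of m, and compare `empirical_risk(memorizer(sample), sample)` with `estimated_true_risk(memorizer(sample), rng)`; the same two quantities can be computed for the constant estimator `in_B` from part (iii).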

