Convex Optimisation Solutions

3
Convex functions
[Figure: level curves of the first function f, labeled 1, 2, 3.]
Could f be convex (concave, quasiconvex, quasiconcave)? Explain your answer. Repeat for the level curves shown below.
[Figure: level curves of the second function.]
Solution. The first function could be quasiconvex because the sublevel sets appear to be
convex. It is definitely not concave or quasiconcave because the superlevel sets are not
convex.
It is also not convex, for the following reason. We plot the function values along the
dashed line labeled I.
[Figure: level curves 1, 2, 3 of the first function, with dashed lines labeled I and II.]
Along this line the function passes through the points marked as black dots in the figure
below. Clearly along this line segment, the function is not convex.
[Figure: function values (1, 2, 3) along line I.]
If we repeat the same analysis for the second function, we see that it could be concave
(and therefore it could be quasiconcave). It cannot be convex or quasiconvex, because
the sublevel sets are not convex.
3.3 Inverse of an increasing convex function. Suppose $f : \mathbf{R} \to \mathbf{R}$ is increasing and convex
on its domain $(a, b)$. Let $g$ denote its inverse, i.e., the function with domain $(f(a), f(b))$
and $g(f(x)) = x$ for $a < x < b$. What can you say about convexity or concavity of $g$?
Solution. $g$ is concave. Its hypograph is
\[
\begin{aligned}
\mathbf{hypo}\, g &= \{(y, t) \mid t \leq g(y)\} \\
&= \{(y, t) \mid f(t) \leq f(g(y))\} \quad \text{(because $f$ is increasing)} \\
&= \{(y, t) \mid f(t) \leq y\} \\
&= \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \mathbf{epi}\, f.
\end{aligned}
\]
For differentiable $g$, $f$, we can also prove the result as follows. Differentiate $g(f(x)) = x$
once to get
\[ g'(f(x)) = 1/f'(x), \]
so $g$ is increasing. Differentiate again to get
\[ g''(f(x)) = -\frac{f''(x)}{f'(x)^3}, \]
so $g$ is concave.
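As a quick numerical illustration of exercise 3.3, the sketch below takes $f(x) = e^x$ on $(0, 3)$ as an assumed example (it is not from the text), so that $g = \log$, and tests midpoint concavity of $g$ on a grid. The helper name `is_midpoint_concave` is hypothetical.

```python
import math

# Assumed example (not from the text): f(x) = exp(x) on (a, b) = (0, 3).
def f(x):
    return math.exp(x)

def g(y):
    # Inverse of f on (f(0), f(3)) = (1, e^3).
    return math.log(y)

def is_midpoint_concave(func, lo, hi, n=200, tol=1e-12):
    """Check func((y1+y2)/2) >= (func(y1)+func(y2))/2 for all pairs on a grid in (lo, hi)."""
    ys = [lo + (hi - lo) * k / n for k in range(1, n)]
    return all(
        func(0.5 * (y1 + y2)) >= 0.5 * (func(y1) + func(y2)) - tol
        for y1 in ys for y2 in ys
    )

inverse_is_concave = is_midpoint_concave(g, f(0.0), f(3.0))
print(inverse_is_concave)
```

A grid test like this can only refute concavity, not prove it; the hypograph argument above is the actual proof.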
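3.4 [RV73, page 15] Show that a continuous function $f : \mathbf{R}^n \to \mathbf{R}$ is convex if and only if for
every line segment, its average value on the segment is less than or equal to the average
of its values at the endpoints of the segment: For every $x, y \in \mathbf{R}^n$,
\[ \int_0^1 f(x + \lambda(y - x))\, d\lambda \leq \frac{f(x) + f(y)}{2}. \]
Solution. First suppose that $f$ is convex. Jensen's inequality can be written as
\[ f(x + \lambda(y - x)) \leq f(x) + \lambda(f(y) - f(x)) \]
for $0 \leq \lambda \leq 1$. Integrating both sides from 0 to 1 we get
\[ \int_0^1 f(x + \lambda(y - x))\, d\lambda \leq \int_0^1 \bigl(f(x) + \lambda(f(y) - f(x))\bigr)\, d\lambda = \frac{f(x) + f(y)}{2}. \]
Now we show the converse. Suppose $f$ is not convex. Then there are $x$ and $y$ and
$\theta_0 \in (0, 1)$ such that
\[ f(\theta_0 x + (1 - \theta_0) y) > \theta_0 f(x) + (1 - \theta_0) f(y). \]
Consider the function of $\theta$ given by
\[ F(\theta) = f(\theta x + (1 - \theta) y) - \theta f(x) - (1 - \theta) f(y), \]
which is continuous since $f$ is. Note that $F$ is zero for $\theta = 0$ and $\theta = 1$, and positive at $\theta_0$.
Let $\alpha$ be the largest zero crossing of $F$ below $\theta_0$ and let $\beta$ be the smallest zero crossing
of $F$ above $\theta_0$. Define $u = \alpha x + (1 - \alpha) y$ and $v = \beta x + (1 - \beta) y$. On the interval $(\alpha, \beta)$,
we have $F(\theta) > 0$, i.e.,
\[ f(\theta x + (1 - \theta) y) > \theta f(x) + (1 - \theta) f(y). \]
Since $F(\alpha) = F(\beta) = 0$, the affine function on the right-hand side agrees with $f$ at $u$ and $v$,
so for $\theta \in (0, 1)$,
\[ f(\theta u + (1 - \theta) v) > \theta f(u) + (1 - \theta) f(v). \]
Integrating this expression from $\theta = 0$ to $\theta = 1$ yields
\[ \int_0^1 f(v + \theta(u - v))\, d\theta > \int_0^1 \bigl(f(v) + \theta(f(u) - f(v))\bigr)\, d\theta = \frac{f(u) + f(v)}{2}. \]
In other words, the average of $f$ over the interval $[u, v]$ exceeds the average of its values
at the endpoints. This proves the converse.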
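The averaging criterion of exercise 3.4 is easy to try numerically in the scalar case. The sketch below uses a midpoint rule and two assumed test functions, $t^2$ (convex) and $-t^2$ (not convex); the function names are hypothetical.

```python
def segment_average(f, x, y, n=10_000):
    """Midpoint-rule estimate of the integral of f(x + lam*(y - x)) over lam in [0, 1]."""
    h = 1.0 / n
    return sum(f(x + (k + 0.5) * h * (y - x)) for k in range(n)) * h

def endpoint_average(f, x, y):
    return 0.5 * (f(x) + f(y))

convex = lambda t: t * t        # assumed convex test function
nonconvex = lambda t: -t * t    # assumed nonconvex test function

x, y = -1.0, 2.0
convex_ok = segment_average(convex, x, y) <= endpoint_average(convex, x, y)
nonconvex_ok = segment_average(nonconvex, x, y) <= endpoint_average(nonconvex, x, y)
print(convex_ok, nonconvex_ok)
```

For $t^2$ on $[-1, 2]$ the segment average is 1 and the endpoint average is 2.5, so the inequality holds; for $-t^2$ both signs flip and it fails.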
3.5 [RV73, page 22] Running average of a convex function. Suppose $f : \mathbf{R} \to \mathbf{R}$ is convex,
with $\mathbf{R}_+ \subseteq \mathbf{dom}\, f$. Show that its running average $F$, defined as
\[ F(x) = \frac{1}{x} \int_0^x f(t)\, dt, \qquad \mathbf{dom}\, F = \mathbf{R}_{++}, \]
is convex. You can assume $f$ is differentiable.

Solution. $F$ is differentiable with
\[
\begin{aligned}
F'(x) &= -(1/x^2) \int_0^x f(t)\, dt + f(x)/x, \\
F''(x) &= (2/x^3) \int_0^x f(t)\, dt - 2 f(x)/x^2 + f'(x)/x \\
&= (2/x^3) \int_0^x \bigl(f(t) - f(x) - f'(x)(t - x)\bigr)\, dt.
\end{aligned}
\]
Convexity now follows from the fact that
\[ f(t) \geq f(x) + f'(x)(t - x) \]
for all $x, t \in \mathbf{dom}\, f$, which implies $F''(x) \geq 0$.
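A small numerical sanity check of exercise 3.5, assuming $f = \exp$ as the convex example (not from the text): the sketch estimates the running average by a midpoint rule and checks that its second differences on a uniform grid are nonnegative. Function names and the tolerance are illustrative choices.

```python
import math

def running_average(f, x, n=2000):
    """Midpoint-rule estimate of F(x) = (1/x) * integral_0^x f(t) dt, for x > 0."""
    h = x / n
    return sum(f((k + 0.5) * h) for k in range(n)) * h / x

def second_differences(func, xs):
    """Second differences func(x-d) - 2 func(x) + func(x+d) on a uniform grid xs."""
    vals = [func(x) for x in xs]
    return [vals[i - 1] - 2 * vals[i] + vals[i + 1] for i in range(1, len(vals) - 1)]

f = math.exp  # assumed convex example
xs = [0.1 * k for k in range(1, 41)]  # uniform grid on (0, 4]
diffs = second_differences(lambda x: running_average(f, x), xs)
running_average_is_convex = all(d >= -1e-9 for d in diffs)
print(running_average_is_convex)
```

Here the exact running average is $(e^x - 1)/x$, which gives an independent check on the quadrature.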

3.6 Functions and epigraphs. When is the epigraph of a function a halfspace? When is the
epigraph of a function a convex cone? When is the epigraph of a function a polyhedron?
Solution. If the function is convex and, respectively, affine, positively homogeneous
($f(\alpha x) = \alpha f(x)$ for $\alpha \geq 0$), and piecewise-affine.
3.7 Suppose $f : \mathbf{R}^n \to \mathbf{R}$ is convex with $\mathbf{dom}\, f = \mathbf{R}^n$, and bounded above on $\mathbf{R}^n$. Show that
$f$ is constant.
Solution. Suppose $f$ is not constant, i.e., there exist $x$, $y$ with $f(x) < f(y)$. The function
\[ g(t) = f(x + t(y - x)) \]
is convex, with $g(0) < g(1)$. By Jensen's inequality
\[ g(1) \leq \frac{t - 1}{t}\, g(0) + \frac{1}{t}\, g(t) \]
for all $t > 1$, and therefore
\[ g(t) \geq t g(1) - (t - 1) g(0) = g(0) + t(g(1) - g(0)), \]
so $g$ grows unboundedly as $t \to \infty$. This contradicts our assumption that $f$ is bounded.
(a) The Hessian of $\tilde f$ must be positive semidefinite everywhere:
\[ \nabla^2 \tilde f(z) = F^T \nabla^2 f(Fz + \hat x) F \succeq 0. \]
(b) The condition in (a) means that $v^T \nabla^2 f(Fz + \hat x) v \geq 0$ for all $v$ with $Av = 0$, i.e.,
\[ v^T A^T A v = 0 \implies v^T \nabla^2 f(Fz + \hat x) v \geq 0. \]
The result immediately follows from the hint.
3.10 An extension of Jensen's inequality. One interpretation of Jensen's inequality is that
randomization or dithering hurts, i.e., raises the average value of a convex function: For
$f$ convex and $v$ a zero mean random variable, we have $\mathbf{E} f(x_0 + v) \geq f(x_0)$. This leads
to the following conjecture. If $f$ is convex, then the larger the variance of $v$, the larger
$\mathbf{E} f(x_0 + v)$.
(a) Give a counterexample that shows that this conjecture is false. Find zero mean
random variables $v$ and $w$, with $\mathbf{var}(v) > \mathbf{var}(w)$, a convex function $f$, and a point
$x_0$, such that $\mathbf{E} f(x_0 + v) < \mathbf{E} f(x_0 + w)$.
(b) The conjecture is true when $v$ and $w$ are scaled versions of each other. Show that
$\mathbf{E} f(x_0 + tv)$ is monotone increasing in $t \geq 0$, when $f$ is convex and $v$ is zero mean.
Solution.
(a) Define $f : \mathbf{R} \to \mathbf{R}$ as
\[ f(x) = \begin{cases} 0, & x \leq 0, \\ x, & x > 0, \end{cases} \]
$x_0 = 0$, and scalar random variables
\[
w = \begin{cases} 1 & \text{with probability } 1/2 \\ -1 & \text{with probability } 1/2, \end{cases}
\qquad
v = \begin{cases} 4 & \text{with probability } 1/10 \\ -4/9 & \text{with probability } 9/10. \end{cases}
\]
$w$ and $v$ are zero-mean and
\[ \mathbf{var}(v) = 16/9 > 1 = \mathbf{var}(w). \]
However,
\[ \mathbf{E} f(v) = 2/5 < 1/2 = \mathbf{E} f(w). \]
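The numbers in the counterexample of part (a) can be reproduced exactly with rational arithmetic. The sketch below encodes $w$ and $v$ as lists of (value, probability) pairs; the helper names `mean`, `variance` and `relu` are illustrative, not from the text.

```python
from fractions import Fraction as Fr

def relu(x):
    # f(x) = max(x, 0), the convex function used in part (a)
    return x if x > 0 else Fr(0)

def mean(dist, func=lambda x: x):
    """Expectation of func over a finite distribution given as (value, probability) pairs."""
    return sum(p * func(x) for x, p in dist)

def variance(dist):
    m = mean(dist)
    return mean(dist, lambda x: (x - m) ** 2)

w = [(Fr(1), Fr(1, 2)), (Fr(-1), Fr(1, 2))]
v = [(Fr(4), Fr(1, 10)), (Fr(-4, 9), Fr(9, 10))]

print(mean(v), mean(w))              # both zero
print(variance(v), variance(w))      # 16/9 and 1
print(mean(v, relu), mean(w, relu))  # 2/5 and 1/2
```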
(b) $f(x_0 + tv)$ is convex in $t$ for fixed $v$, hence if $v$ is a random variable, $g(t) = \mathbf{E} f(x_0 + tv)$
is a convex function of $t$. From Jensen's inequality,
\[ g(t) = \mathbf{E} f(x_0 + tv) \geq f(x_0) = g(0). \]
Now consider two points $a$, $b$, with $0 < a < b$. If $g(b) < g(a)$, then
\[ \frac{b - a}{b}\, g(0) + \frac{a}{b}\, g(b) < \frac{b - a}{b}\, g(a) + \frac{a}{b}\, g(a) = g(a), \]
which contradicts Jensen's inequality. Therefore we must have $g(b) \geq g(a)$.
3.11 Monotone mappings. A function $\psi : \mathbf{R}^n \to \mathbf{R}^n$ is called monotone if for all $x, y \in \mathbf{dom}\, \psi$,
\[ (\psi(x) - \psi(y))^T (x - y) \geq 0. \]
(Note that 'monotone' as defined here is not the same as the definition given in §3.6.1.
Both definitions are widely used.) Suppose $f : \mathbf{R}^n \to \mathbf{R}$ is a differentiable convex function.
Show that its gradient $\nabla f$ is monotone. Is the converse true, i.e., is every monotone
mapping the gradient of a convex function?
Examples
3.15 A family of concave utility functions. For $0 < \alpha \leq 1$ let
\[ u_\alpha(x) = \frac{x^\alpha - 1}{\alpha}, \]
with $\mathbf{dom}\, u_\alpha = \mathbf{R}_+$. We also define $u_0(x) = \log x$ (with $\mathbf{dom}\, u_0 = \mathbf{R}_{++}$).

(a) Show that for $x > 0$, $u_0(x) = \lim_{\alpha \to 0} u_\alpha(x)$.
(b) Show that $u_\alpha$ are concave, monotone increasing, and all satisfy $u_\alpha(1) = 0$.
These functions are often used in economics to model the benefit or utility of some quantity
of goods or money. Concavity of $u_\alpha$ means that the marginal utility (i.e., the increase
in utility obtained for a fixed increase in the goods) decreases as the amount of goods
increases. In other words, concavity models the effect of satiation.
Solution.
(a) In this limit, both the numerator and denominator go to zero, so we use l'Hôpital's
rule:
\[ \lim_{\alpha \to 0} u_\alpha(x) = \lim_{\alpha \to 0} \frac{(d/d\alpha)(x^\alpha - 1)}{(d/d\alpha)\alpha} = \lim_{\alpha \to 0} \frac{x^\alpha \log x}{1} = \log x. \]
(b) By inspection we have
\[ u_\alpha(1) = \frac{1^\alpha - 1}{\alpha} = 0. \]
The derivative is given by
\[ u_\alpha'(x) = x^{\alpha - 1}, \]
which is positive for all $x > 0$, so these functions are increasing. To
show concavity, we examine the second derivative:
\[ u_\alpha''(x) = (\alpha - 1) x^{\alpha - 2}. \]
For $0 < \alpha < 1$ this is negative for all $x > 0$, so $u_\alpha$ is strictly concave; for $\alpha = 1$ it is zero and $u_1$ is affine.
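The limit in part (a) can be observed numerically. This is a minimal sketch at an assumed sample point $x = 2.5$; the function name `u` mirrors $u_\alpha$ and the list of $\alpha$ values is arbitrary.

```python
import math

def u(alpha, x):
    """u_alpha(x) = (x**alpha - 1) / alpha for 0 < alpha <= 1."""
    return (x ** alpha - 1.0) / alpha

x = 2.5  # assumed sample point; any x > 0 works
for alpha in (1.0, 0.1, 0.01, 0.001, 1e-6):
    # the last column is the distance to log x, which shrinks as alpha -> 0
    print(alpha, u(alpha, x), abs(u(alpha, x) - math.log(x)))
```

The gap behaves like $\alpha (\log x)^2 / 2$ for small $\alpha$, consistent with a first-order expansion of $x^\alpha = e^{\alpha \log x}$.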
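3.16 For each of the following functions determine whether it is convex, concave, quasiconvex,
or quasiconcave.
(a) $f(x) = e^x - 1$ on $\mathbf{R}$.
Solution. Strictly convex, and therefore quasiconvex. Also quasiconcave but not
concave.
(b) $f(x_1, x_2) = x_1 x_2$ on $\mathbf{R}^2_{++}$.
Solution. The Hessian of $f$ is
\[ \nabla^2 f(x) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \]
which is neither positive semidefinite nor negative semidefinite. Therefore, $f$ is
neither convex nor concave. It is quasiconcave, since its superlevel sets
\[ \{(x_1, x_2) \in \mathbf{R}^2_{++} \mid x_1 x_2 \geq \alpha\} \]
are convex. It is not quasiconvex.
(c) $f(x_1, x_2) = 1/(x_1 x_2)$ on $\mathbf{R}^2_{++}$.
Solution. The Hessian of $f$ is
\[ \nabla^2 f(x) = \frac{1}{x_1 x_2} \begin{bmatrix} 2/x_1^2 & 1/(x_1 x_2) \\ 1/(x_1 x_2) & 2/x_2^2 \end{bmatrix} \succeq 0. \]
Therefore, $f$ is convex and quasiconvex. It is not quasiconcave or concave.
(d) $f(x_1, x_2) = x_1/x_2$ on $\mathbf{R}^2_{++}$.
Solution. The Hessian of $f$ is
\[ \nabla^2 f(x) = \begin{bmatrix} 0 & -1/x_2^2 \\ -1/x_2^2 & 2x_1/x_2^3 \end{bmatrix}, \]
which is not positive or negative semidefinite. Therefore, $f$ is not convex or concave.
It is quasiconvex and quasiconcave (i.e., quasilinear), since the sublevel and superlevel sets are halfspaces.
(e) $f(x_1, x_2) = x_1^2/x_2$ on $\mathbf{R} \times \mathbf{R}_{++}$.
Solution. $f$ is convex, as mentioned on page 72. (See also figure 3.3.) This is easily
verified by working out the Hessian:
\[ \nabla^2 f(x) = \begin{bmatrix} 2/x_2 & -2x_1/x_2^2 \\ -2x_1/x_2^2 & 2x_1^2/x_2^3 \end{bmatrix} = (2/x_2) \begin{bmatrix} 1 \\ -x_1/x_2 \end{bmatrix} \begin{bmatrix} 1 & -x_1/x_2 \end{bmatrix} \succeq 0. \]
Therefore, $f$ is convex and quasiconvex. It is not concave or quasiconcave (see the
figure).
(f) $f(x_1, x_2) = x_1^\alpha x_2^{1-\alpha}$, where $0 \leq \alpha \leq 1$, on $\mathbf{R}^2_{++}$.
Solution. Concave and quasiconcave. The Hessian is
\[
\begin{aligned}
\nabla^2 f(x) &= \begin{bmatrix} \alpha(\alpha - 1) x_1^{\alpha - 2} x_2^{1 - \alpha} & \alpha(1 - \alpha) x_1^{\alpha - 1} x_2^{-\alpha} \\ \alpha(1 - \alpha) x_1^{\alpha - 1} x_2^{-\alpha} & (1 - \alpha)(-\alpha) x_1^{\alpha} x_2^{-\alpha - 1} \end{bmatrix} \\
&= \alpha(1 - \alpha) x_1^\alpha x_2^{1 - \alpha} \begin{bmatrix} -1/x_1^2 & 1/(x_1 x_2) \\ 1/(x_1 x_2) & -1/x_2^2 \end{bmatrix} \\
&= -\alpha(1 - \alpha) x_1^\alpha x_2^{1 - \alpha} \begin{bmatrix} 1/x_1 \\ -1/x_2 \end{bmatrix} \begin{bmatrix} 1/x_1 \\ -1/x_2 \end{bmatrix}^T \preceq 0.
\end{aligned}
\]
$f$ is not convex or quasiconvex.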
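The Hessian sign patterns claimed in parts (b) through (f) can be spot-checked at a sample point using the closed-form eigenvalues of a symmetric 2×2 matrix. This is only a sketch: the point $(1.5, 0.7)$, the value $\alpha = 0.3$ for part (f), and the helper names are assumptions chosen for illustration, and a check at one point does not replace the argument above.

```python
import math

def eig2(a, b, d):
    """Eigenvalues (ascending) of the symmetric 2x2 matrix [[a, b], [b, d]]."""
    mid, rad = 0.5 * (a + d), math.hypot(0.5 * (a - d), b)
    return mid - rad, mid + rad

def hessians(x1, x2, alpha=0.3):
    """Hessian entries (h11, h12, h22) for parts (b)-(f) at a point with x1, x2 > 0."""
    return {
        "b": (0.0, 1.0, 0.0),
        "c": (2 / (x1**3 * x2), 1 / (x1**2 * x2**2), 2 / (x1 * x2**3)),
        "d": (0.0, -1 / x2**2, 2 * x1 / x2**3),
        "e": (2 / x2, -2 * x1 / x2**2, 2 * x1**2 / x2**3),
        "f": (alpha * (alpha - 1) * x1**(alpha - 2) * x2**(1 - alpha),
              alpha * (1 - alpha) * x1**(alpha - 1) * x2**(-alpha),
              -alpha * (1 - alpha) * x1**alpha * x2**(-alpha - 1)),
    }

def classify(lo, hi, tol=1e-9):
    """Label a matrix by its smallest and largest eigenvalue, up to a small tolerance."""
    if lo >= -tol:
        return "psd"
    if hi <= tol:
        return "nsd"
    return "indefinite"

signs = {k: classify(*eig2(*h)) for k, h in hessians(1.5, 0.7).items()}
print(signs)
```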

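3.17 Suppose $p < 1$, $p \neq 0$. Show that the function
\[ f(x) = \left( \sum_{i=1}^n x_i^p \right)^{1/p} \]
with $\mathbf{dom}\, f = \mathbf{R}^n_{++}$ is concave. This includes as special cases $f(x) = \bigl(\sum_{i=1}^n x_i^{1/2}\bigr)^2$ and
the harmonic mean $f(x) = \bigl(\sum_{i=1}^n 1/x_i\bigr)^{-1}$. Hint. Adapt the proofs for the log-sum-exp
function and the geometric mean in §3.1.5.
Solution. The first derivatives of $f$ are given by
\[ \frac{\partial f(x)}{\partial x_i} = \left( \sum_{k=1}^n x_k^p \right)^{(1-p)/p} x_i^{p-1} = \left( \frac{f(x)}{x_i} \right)^{1-p}. \]
The second derivatives are
\[ \frac{\partial^2 f(x)}{\partial x_i \partial x_j} = \frac{1-p}{x_i} \left( \frac{f(x)}{x_i} \right)^{-p} \left( \frac{f(x)}{x_j} \right)^{1-p} = \frac{1-p}{f(x)} \left( \frac{f(x)^2}{x_i x_j} \right)^{1-p} \]
for $i \neq j$, and
\[ \frac{\partial^2 f(x)}{\partial x_i^2} = \frac{1-p}{f(x)} \left( \frac{f(x)^2}{x_i^2} \right)^{1-p} - \frac{1-p}{x_i} \left( \frac{f(x)}{x_i} \right)^{1-p}. \]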
(e) $f(x, t) = -\log(t^p - \|x\|_p^p)$ where $p > 1$ and $\mathbf{dom}\, f = \{(x, t) \mid t > \|x\|_p\}$. You can
use the fact that $\|x\|_p^p / u^{p-1}$ is convex in $(x, u)$ for $u > 0$ (see exercise 3.23).
Solution. Express $f$ as
\[
\begin{aligned}
f(x, t) &= -\log t^{p-1} - \log(t - \|x\|_p^p / t^{p-1}) \\
&= -(p - 1) \log t - \log(t - \|x\|_p^p / t^{p-1}).
\end{aligned}
\]
The first term is convex. The second term is the composition of a decreasing convex
function and a concave function, and is also convex.
3.23 Perspective of a function.
(a) Show that for $p > 1$,
\[ f(x, t) = \frac{|x_1|^p + \cdots + |x_n|^p}{t^{p-1}} = \frac{\|x\|_p^p}{t^{p-1}} \]
is convex on $\{(x, t) \mid t > 0\}$.

Solution. This is the perspective function of $\|x\|_p^p = |x_1|^p + \cdots + |x_n|^p$.
(b) Show that
\[ f(x) = \frac{\|Ax + b\|_2^2}{c^T x + d} \]
is convex on $\{x \mid c^T x + d > 0\}$, where $A \in \mathbf{R}^{m \times n}$, $b \in \mathbf{R}^m$, $c \in \mathbf{R}^n$ and $d \in \mathbf{R}$.

Solution. This function is the composition of the function $g(y, t) = y^T y / t$ with an
affine transformation $(y, t) = (Ax + b, c^T x + d)$. Therefore convexity of $f$ follows
from the fact that $g$ is convex on $\{(y, t) \mid t > 0\}$.
For convexity of $g$ one can note that it is the perspective of $x^T x$, or directly verify
that the Hessian
\[ \nabla^2 g(y, t) = 2 \begin{bmatrix} I/t & -y/t^2 \\ -y^T/t^2 & y^T y/t^3 \end{bmatrix} \]
is positive semidefinite, since
\[ \begin{bmatrix} v \\ w \end{bmatrix}^T \begin{bmatrix} I/t & -y/t^2 \\ -y^T/t^2 & y^T y/t^3 \end{bmatrix} \begin{bmatrix} v \\ w \end{bmatrix} = \|tv - yw\|_2^2 / t^3 \geq 0 \]
for all $v$ and $w$.
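The quadratic-form identity used for part (b) can be checked on random data. In the sketch below, `quad_form` expands $[v; w]^T M [v; w]$ for the block matrix $M$ with blocks $I/t$, $-y/t^2$, $-y^T/t^2$, $y^T y/t^3$, and `closed_form` evaluates $\|tv - yw\|_2^2/t^3$; both names and the sampling ranges are illustrative assumptions.

```python
import random

def quad_form(y, t, v, w):
    """[v; w]^T M [v; w] with M = [[I/t, -y/t^2], [-y^T/t^2, y^T y/t^3]]."""
    vv = sum(vi * vi for vi in v)
    yv = sum(yi * vi for yi, vi in zip(y, v))
    yy = sum(yi * yi for yi in y)
    return vv / t - 2 * w * yv / t**2 + w * w * yy / t**3

def closed_form(y, t, v, w):
    """||t v - y w||_2^2 / t^3."""
    return sum((t * vi - yi * w) ** 2 for yi, vi in zip(y, v)) / t**3

rng = random.Random(0)  # fixed seed so the run is repeatable
max_gap = 0.0
for _ in range(1000):
    n = rng.randint(1, 5)
    y = [rng.uniform(-3, 3) for _ in range(n)]
    v = [rng.uniform(-3, 3) for _ in range(n)]
    t, w = rng.uniform(0.1, 3), rng.uniform(-3, 3)
    max_gap = max(max_gap, abs(quad_form(y, t, v, w) - closed_form(y, t, v, w)))
print(max_gap)  # rounding-level discrepancy only
```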

3.24 Some functions on the probability simplex. Let $x$ be a real-valued random variable which
takes values in $\{a_1, \ldots, a_n\}$ where $a_1 < a_2 < \cdots < a_n$, with $\mathbf{prob}(x = a_i) = p_i$,
$i = 1, \ldots, n$. For each of the following functions of $p$ (on the probability simplex
$\{p \in \mathbf{R}^n_+ \mid \mathbf{1}^T p = 1\}$), determine if the function is convex, concave, quasiconvex, or quasiconcave.
(a) $\mathbf{E}\, x$.
Solution. $\mathbf{E}\, x = p_1 a_1 + \cdots + p_n a_n$ is linear, hence convex, concave, quasiconvex,
and quasiconcave.
(b) $\mathbf{prob}(x \geq \alpha)$.
Solution. Let $j = \min\{i \mid a_i \geq \alpha\}$. Then $\mathbf{prob}(x \geq \alpha) = \sum_{i=j}^n p_i$. This is a linear
function of $p$, hence convex, concave, quasiconvex, and quasiconcave.
(c) $\mathbf{prob}(\alpha \leq x \leq \beta)$.
Solution. Let $j = \min\{i \mid a_i \geq \alpha\}$ and $k = \max\{i \mid a_i \leq \beta\}$. Then
$\mathbf{prob}(\alpha \leq x \leq \beta) = \sum_{i=j}^k p_i$. This is a linear function of $p$, hence convex, concave, quasiconvex,
and quasiconcave.
(d) $\sum_{i=1}^n p_i \log p_i$, the negative entropy of the distribution.
Solution. $p \log p$ is a convex function on $\mathbf{R}_+$ (assuming $0 \log 0 = 0$), so
$\sum_i p_i \log p_i$ is convex (and hence quasiconvex).
The function is not concave or quasiconcave. Consider, for example, $n = 2$, $p^1 = (1, 0)$
and $p^2 = (0, 1)$. Both $p^1$ and $p^2$ have function value zero, but the convex combination
$(0.5, 0.5)$ has function value $\log(1/2) < 0$. This shows that the superlevel
sets are not convex.
(e) $\mathbf{var}\, x = \mathbf{E}(x - \mathbf{E}\, x)^2$.
Solution. We have
\[ \mathbf{var}\, x = \mathbf{E}\, x^2 - (\mathbf{E}\, x)^2 = \sum_{i=1}^n p_i a_i^2 - \Bigl( \sum_{i=1}^n p_i a_i \Bigr)^2, \]
so $\mathbf{var}\, x$ is a concave quadratic function of $p$.

The function is not convex or quasiconvex. Consider the example with $n = 2$, $a_1 = 0$,
$a_2 = 1$. Both $(p_1, p_2) = (1/4, 3/4)$ and $(p_1, p_2) = (3/4, 1/4)$ lie in the probability
simplex and have $\mathbf{var}\, x = 3/16$, but the convex combination $(p_1, p_2) = (1/2, 1/2)$ has
a variance $\mathbf{var}\, x = 1/4 > 3/16$. This shows that the sublevel sets are not convex.
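The variance example in part (e) can be reproduced exactly with rational arithmetic. A minimal sketch, with the helper name `variance` chosen for illustration:

```python
from fractions import Fraction as Fr

def variance(p, a):
    """var x = sum_i p_i a_i^2 - (sum_i p_i a_i)^2 for prob(x = a_i) = p_i."""
    mean = sum(pi * ai for pi, ai in zip(p, a))
    return sum(pi * ai * ai for pi, ai in zip(p, a)) - mean * mean

a = (0, 1)
p = (Fr(1, 4), Fr(3, 4))
q = (Fr(3, 4), Fr(1, 4))
mid = tuple((pi + qi) / 2 for pi, qi in zip(p, q))  # the convex combination (1/2, 1/2)

print(variance(p, a), variance(q, a), variance(mid, a))  # 3/16 3/16 1/4
```

The midpoint has a strictly larger variance than both endpoints, so the sublevel set $\{p \mid \mathbf{var}\, x \leq 3/16\}$ is not convex.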
(f) $\mathbf{quartile}(x) = \inf\{\beta \mid \mathbf{prob}(x \leq \beta) \geq 0.25\}$.
Solution. The sublevel and the superlevel sets of $\mathbf{quartile}(x)$ are convex (see
problem 2.15), so it is quasiconvex and quasiconcave.
$\mathbf{quartile}(x)$ is not continuous (it takes values in a discrete set $\{a_1, \ldots, a_n\}$), so it is
not convex or concave. (A convex or a concave function is always continuous on the
relative interior of its domain.)
(g) The cardinality of the smallest set $A \subseteq \{a_1, \ldots, a_n\}$ with probability $\geq 90\%$. (By
cardinality we mean the number of elements in $A$.)
Solution. $f$ is integer-valued, so it cannot be convex or concave. (A convex or a
concave function is always continuous on the relative interior of its domain.)
$f$ is quasiconcave because its superlevel sets are convex. We have $f(p) \geq \alpha$ if and
only if
\[ \sum_{i=1}^k p_{[i]} < 0.9, \]
where $k = \max\{i = 1, \ldots, n \mid i < \alpha\}$ is the largest integer less than $\alpha$, and $p_{[i]}$ is
the $i$th largest component of $p$. We know that $\sum_{i=1}^k p_{[i]}$ is a convex function of $p$,
so the inequality $\sum_{i=1}^k p_{[i]} < 0.9$ defines a convex set.
In general, $f(p)$ is not quasiconvex. For example, we can take $n = 2$, $a_1 = 0$ and
$a_2 = 1$, and $p^1 = (0.1, 0.9)$ and $p^2 = (0.9, 0.1)$. Then $f(p^1) = f(p^2) = 1$, but
$f((p^1 + p^2)/2) = f(0.5, 0.5) = 2$.
(h) The minimum width interval that contains 90% of the probability, i.e.,
\[ \inf\{\beta - \alpha \mid \mathbf{prob}(\alpha \leq x \leq \beta) \geq 0.9\}. \]
Solution. The minimum width interval that contains 90% of the probability must
be of the form $[a_i, a_j]$ with $1 \leq i \leq j \leq n$, because
\[ \mathbf{prob}(\alpha \leq x \leq \beta) = \sum_{k=i}^j p_k = \mathbf{prob}(a_i \leq x \leq a_j) \]
where $i = \min\{k \mid a_k \geq \alpha\}$, and $j = \max\{k \mid a_k \leq \beta\}$.
We show that the function is quasiconcave. We have $f(p) \geq \gamma$ if and only if all
intervals of width less than $\gamma$ have a probability less than 90%,
\[ \sum_{k=i}^j p_k < 0.9 \]
for all $i$, $j$ that satisfy $a_j - a_i < \gamma$. This defines a convex set.

The function is not convex, concave nor quasiconvex in general. Consider the example
with $n = 3$, $a_1 = 0$, $a_2 = 0.5$ and $a_3 = 1$. On the line $p_1 + p_3 = 0.95$ (so $p_2 = 0.05$ and $0 \leq p_1 \leq 0.95$), we
have
\[
f(p) = \begin{cases}
0 & p_1 \in [0, 0.05] \cup [0.9, 0.95] \\
0.5 & p_1 \in (0.05, 0.1] \cup [0.85, 0.9) \\
1 & p_1 \in (0.1, 0.85).
\end{cases}
\]
It is clear that $f$ is not convex, concave nor quasiconvex on the line.
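The behaviour of the minimum-width function along the line $p_1 + p_3 = 0.95$ can be tabulated by brute force over index ranges. The sketch below uses exact fractions for the example $a = (0, 0.5, 1)$; the names `min_width` and `on_line` and the sampled values of $p_1$ are illustrative choices.

```python
from fractions import Fraction as Fr

def min_width(p, a, level=Fr(9, 10)):
    """Smallest a[j] - a[i] over index ranges i <= j with p[i] + ... + p[j] >= level."""
    n = len(a)
    return min(
        a[j] - a[i]
        for i in range(n) for j in range(i, n)
        if sum(p[i:j + 1]) >= level
    )

a = (Fr(0), Fr(1, 2), Fr(1))

def on_line(p1):
    # points of the simplex with p2 = 0.05, i.e. p1 + p3 = 0.95
    return (p1, Fr(1, 20), Fr(19, 20) - p1)

for p1 in (Fr(0), Fr(1, 20), Fr(1, 10), Fr(1, 2), Fr(17, 20), Fr(9, 10), Fr(19, 20)):
    print(float(p1), min_width(on_line(p1), a))
```

The value drops to 0 near both ends of the line and equals 1 in the middle, which is the pattern that rules out quasiconvexity while remaining compatible with quasiconcavity.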
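3.25 Maximum probability distance between distributions. Let $p, q \in \mathbf{R}^n$ represent two probability
distributions on $\{1, \ldots, n\}$ (so $p, q \succeq 0$, $\mathbf{1}^T p = \mathbf{1}^T q = 1$). We define the maximum
probability distance $d_{\mathrm{mp}}(p, q)$ between $p$ and $q$ as the maximum difference in probability
assigned by $p$ and $q$, over all events:
\[ d_{\mathrm{mp}}(p, q) = \max\{ |\mathbf{prob}(p, C) - \mathbf{prob}(q, C)| \mid C \subseteq \{1, \ldots, n\} \}. \]
Here $\mathbf{prob}(p, C)$ is the probability of $C$, under the distribution $p$, i.e., $\mathbf{prob}(p, C) = \sum_{i \in C} p_i$.
Find a simple expression for $d_{\mathrm{mp}}$, involving $\|p - q\|_1 = \sum_{i=1}^n |p_i - q_i|$, and show that $d_{\mathrm{mp}}$
is a convex function on $\mathbf{R}^n \times \mathbf{R}^n$. (Its domain is $\{(p, q) \mid p, q \succeq 0, \ \mathbf{1}^T p = \mathbf{1}^T q = 1\}$, but
it has a natural extension to all of $\mathbf{R}^n \times \mathbf{R}^n$.)
Solution. Noting that
\[ \mathbf{prob}(p, C) - \mathbf{prob}(q, C) = -(\mathbf{prob}(p, \tilde C) - \mathbf{prob}(q, \tilde C)), \]
where $\tilde C = \{1, \ldots, n\} \setminus C$, we can just as well express $d_{\mathrm{mp}}$ as
\[ d_{\mathrm{mp}}(p, q) = \max\{ \mathbf{prob}(p, C) - \mathbf{prob}(q, C) \mid C \subseteq \{1, \ldots, n\} \}. \]
This shows that $d_{\mathrm{mp}}$ is convex, since it is the maximum of $2^n$ linear functions of $(p, q)$.
Let's now identify the (or a) subset $C$ that maximizes
\[ \mathbf{prob}(p, C) - \mathbf{prob}(q, C) = \sum_{i \in C} (p_i - q_i). \]
The solution is
\[ C^\star = \{ i \in \{1, \ldots, n\} \mid p_i > q_i \}. \]
Let's show this. The indices for which $p_i = q_i$ clearly don't matter, so we will ignore
them, and assume without loss of generality that for each index, $p_i > q_i$ or $p_i < q_i$. Now
consider any other subset $C$. If there is an element $k$ in $C^\star$ but not $C$, then by adding
$k$ to $C$ we increase $\mathbf{prob}(p, C) - \mathbf{prob}(q, C)$ by $p_k - q_k > 0$, so $C$ could not have been
optimal. Conversely, suppose that $k \in C \setminus C^\star$, so $p_k - q_k < 0$. If we remove $k$ from $C$,
we'd increase $\mathbf{prob}(p, C) - \mathbf{prob}(q, C)$ by $q_k - p_k > 0$, so $C$ could not have been optimal.
Thus, we have $d_{\mathrm{mp}}(p, q) = \sum_{p_i > q_i} (p_i - q_i)$. Now let's express this in terms of $\|p - q\|_1$.
Using
\[ \sum_{p_i > q_i} (p_i - q_i) + \sum_{p_i \leq q_i} (p_i - q_i) = \mathbf{1}^T p - \mathbf{1}^T q = 0, \]
the positive and negative parts of $p - q$ have equal total magnitude, so $\|p - q\|_1 = 2 \sum_{p_i > q_i} (p_i - q_i)$ and
\[ d_{\mathrm{mp}}(p, q) = \tfrac{1}{2} \|p - q\|_1. \]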
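For small $n$ the maximum over all $2^n$ events can be enumerated directly and compared with the sum over $C^\star = \{i \mid p_i > q_i\}$. The sketch below also evaluates $\tfrac{1}{2}\|p - q\|_1$, the standard closed form for this distance. The two distributions are made-up test data and the function names are hypothetical.

```python
from fractions import Fraction as Fr
from itertools import combinations

def d_mp_bruteforce(p, q):
    """max over all events C of |prob(p, C) - prob(q, C)|, by enumerating 2^n subsets."""
    idx = range(len(p))
    return max(
        abs(sum(p[i] - q[i] for i in C))
        for r in range(len(p) + 1) for C in combinations(idx, r)
    )

def d_mp_positive_part(p, q):
    """Sum of p_i - q_i over the set C* = {i : p_i > q_i}."""
    return sum(pi - qi for pi, qi in zip(p, q) if pi > qi)

def half_l1(p, q):
    return sum(abs(pi - qi) for pi, qi in zip(p, q)) / 2

# Made-up distributions on {1, 2, 3, 4}; each sums to 1.
p = (Fr(1, 2), Fr(1, 4), Fr(1, 8), Fr(1, 8))
q = (Fr(1, 10), Fr(2, 5), Fr(3, 10), Fr(1, 5))
print(d_mp_bruteforce(p, q), d_mp_positive_part(p, q), half_l1(p, q))  # 2/5 2/5 2/5
```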
(a) Let $A = \mathbf{conv}\, B$. Since $B \subseteq A$, we obviously have $S_B(y) \leq S_A(y)$. Suppose we have
strict inequality for some $y$, i.e.,
\[ y^T u < y^T v \]
for all $u \in B$ and some $v \in A$. This leads to a contradiction, because by definition $v$
is the convex combination of a set of points $u_i \in B$, i.e., $v = \sum_i \theta_i u_i$, with $\theta_i \geq 0$,
$\sum_i \theta_i = 1$. Since
\[ y^T u_i < y^T v \]
for all $i$, this would imply
\[ y^T v = \sum_i \theta_i y^T u_i < \sum_i \theta_i y^T v = y^T v. \]
We conclude that we must have equality $S_B(y) = S_A(y)$.

(b) Follows from
\[
\begin{aligned}
S_{A+B}(y) &= \sup\{ y^T (u + v) \mid u \in A, \ v \in B \} \\
&= \sup\{ y^T u \mid u \in A \} + \sup\{ y^T v \mid v \in B \} \\
&= S_A(y) + S_B(y).
\end{aligned}
\]
(c) Follows from
\[
\begin{aligned}
S_{A \cup B}(y) &= \sup\{ y^T u \mid u \in A \cup B \} \\
&= \max\{ \sup\{ y^T u \mid u \in A \}, \ \sup\{ y^T v \mid v \in B \} \} \\
&= \max\{ S_A(y), S_B(y) \}.
\end{aligned}
\]
(d) Obviously, if $A \subseteq B$, then $S_A(y) \leq S_B(y)$ for all $y$. We need to show that if $A \not\subseteq B$,
then $S_A(y) > S_B(y)$ for some $y$.
Suppose $A \not\subseteq B$. Consider a point $\bar x \in A$, $\bar x \notin B$. Since $B$ is closed and convex, $\bar x$
can be strictly separated from $B$ by a hyperplane, i.e., there is a $y \neq 0$ such that
\[ y^T \bar x > y^T x \]
for all $x \in B$. It follows that $S_B(y) < y^T \bar x \leq S_A(y)$.
Conjugate functions
3.36 Derive the conjugates of the following functions.
(a) Max function. $f(x) = \max_{i=1,\ldots,n} x_i$ on $\mathbf{R}^n$.
Solution. We will show that
\[ f^*(y) = \begin{cases} 0 & \text{if } y \succeq 0, \ \mathbf{1}^T y = 1 \\ \infty & \text{otherwise.} \end{cases} \]
We first verify the domain of $f^*$. First suppose $y$ has a negative component, say
$y_k < 0$. If we choose a vector $x$ with $x_k = -t$, $x_i = 0$ for $i \neq k$, and let $t$ go to
infinity, we see that
\[ x^T y - \max_i x_i = -t y_k \to \infty, \]
so $y$ is not in $\mathbf{dom}\, f^*$. Next, assume $y \succeq 0$ but $\mathbf{1}^T y > 1$. We choose $x = t\mathbf{1}$ and let
$t$ go to infinity, to show that
\[ x^T y - \max_i x_i = t \mathbf{1}^T y - t \]
is unbounded above. Similarly, when $y \succeq 0$ and $\mathbf{1}^T y < 1$, we choose $x = -t\mathbf{1}$ and
let $t$ go to infinity.
The remaining case for $y$ is $y \succeq 0$ and $\mathbf{1}^T y = 1$. In this case we have
\[ x^T y \leq \max_i x_i \]
for all $x$, and therefore $x^T y - \max_i x_i \leq 0$ for all $x$, with equality for $x = 0$. Therefore
$f^*(y) = 0$.
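The two cases in part (a) can be illustrated numerically: for $y$ in the probability simplex the quantity $x^T y - \max_i x_i$ never exceeds zero, while a negative component of $y$ lets it grow without bound. The vectors `y_in` and `y_out` and the helper name are assumptions made for this sketch.

```python
import random

def conj_objective(x, y):
    """x^T y - max_i x_i, the quantity maximized over x in the conjugate of the max function."""
    return sum(xi * yi for xi, yi in zip(x, y)) - max(x)

rng = random.Random(1)  # fixed seed for repeatability

# y in the probability simplex: the objective never exceeds 0 (and equals 0 at x = 0).
y_in = [0.2, 0.5, 0.3]
worst_in = max(
    conj_objective([rng.uniform(-10, 10) for _ in y_in], y_in) for _ in range(10_000)
)

# y with a negative first entry: x = -t e_1 gives objective -t * y_1, growing with t.
y_out = [-0.1, 0.6, 0.5]
growth = [conj_objective([-t, 0.0, 0.0], y_out) for t in (1.0, 10.0, 100.0)]

print(worst_in, growth)
```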
(b) Sum of largest elements. $f(x) = \sum_{i=1}^r x_{[i]}$ on $\mathbf{R}^n$.
Solution. The conjugate is
\[ f^*(y) = \begin{cases} 0 & 0 \preceq y \preceq \mathbf{1}, \ \mathbf{1}^T y = r \\ \infty & \text{otherwise.} \end{cases} \]
We first verify the domain of $f^*$. Suppose $y$ has a negative component, say $y_k < 0$.
If we choose a vector $x$ with $x_k = -t$, $x_i = 0$ for $i \neq k$, and let $t$ go to infinity, we
see that
\[ x^T y - f(x) = -t y_k \to \infty, \]
so $y$ is not in $\mathbf{dom}\, f^*$.
Next, suppose $y$ has a component greater than 1, say $y_k > 1$. If we choose a vector
$x$ with $x_k = t$, $x_i = 0$ for $i \neq k$, and let $t$ go to infinity, we see that
\[ x^T y - f(x) = t y_k - t \to \infty, \]
so $y$ is not in $\mathbf{dom}\, f^*$.
Finally, assume that $\mathbf{1}^T y \neq r$. We choose $x = t\mathbf{1}$ and find that
\[ x^T y - f(x) = t \mathbf{1}^T y - tr \]
is unbounded above, as $t \to \infty$ or $t \to -\infty$.
If $y$ satisfies all the conditions we have
\[ x^T y \leq f(x) \]
for all $x$, with equality for $x = 0$. Therefore $f^*(y) = 0$.
(c) Piecewise-linear function on $\mathbf{R}$. $f(x) = \max_{i=1,\ldots,m} (a_i x + b_i)$ on $\mathbf{R}$. You can
assume that the $a_i$ are sorted in increasing order, i.e., $a_1 \leq \cdots \leq a_m$, and that none
of the functions $a_i x + b_i$ is redundant, i.e., for each $k$ there is at least one $x$ with
$f(x) = a_k x + b_k$.
Solution. Under the assumption, the graph of $f$ is piecewise-linear, with breakpoints
$(b_i - b_{i+1})/(a_{i+1} - a_i)$, $i = 1, \ldots, m - 1$. We can write $f^*$ as
\[ f^*(y) = \sup_x \Bigl( xy - \max_{i=1,\ldots,m} (a_i x + b_i) \Bigr). \]
We see that $\mathbf{dom}\, f^* = [a_1, a_m]$, since for $y$ outside that range, the expression inside
the supremum is unbounded above. For $a_i \leq y \leq a_{i+1}$, the supremum in the
definition of $f^*$ is reached at the breakpoint between the segments $i$ and $i + 1$, i.e.,
at the point $-(b_{i+1} - b_i)/(a_{i+1} - a_i)$, so we obtain
\[ f^*(y) = -b_i - (b_{i+1} - b_i) \frac{y - a_i}{a_{i+1} - a_i} \]
where $i$ is defined by $a_i \leq y \leq a_{i+1}$. Hence the graph of $f^*$ is also a piecewise-linear
curve connecting the points $(a_i, -b_i)$ for $i = 1, \ldots, m$. Geometrically, the epigraph
of $f^*$ is the epigraphical hull of the points $(a_i, -b_i)$.
(d) Power function. $f(x) = x^p$ on $\mathbf{R}_{++}$, where $p > 1$. Repeat for $p < 0$.
Solution. We'll use standard notation: we define $q$ by the equation $1/p + 1/q = 1$,
i.e., $q = p/(p - 1)$.
We start with the case $p > 1$. Then $x^p$ is strictly convex on $\mathbf{R}_+$. For $y \leq 0$ the
supremum of $yx - x^p$ over $x > 0$ is approached as $x \to 0$, so $f^*(y) = 0$. For $y > 0$
the function achieves its maximum at $x = (y/p)^{1/(p-1)}$, where it has value
\[ y (y/p)^{1/(p-1)} - (y/p)^{p/(p-1)} = (p - 1)(y/p)^q. \]
Therefore we have
\[ f^*(y) = \begin{cases} 0 & y \leq 0 \\ (p - 1)(y/p)^q & y > 0. \end{cases} \]
For $p < 0$ similar arguments show that $\mathbf{dom}\, f^* = -\mathbf{R}_+$ and
\[ f^*(y) = \frac{p}{q} (y/p)^q = (p - 1)(y/p)^q, \]
where $y/p \geq 0$ because $y \leq 0$ and $p < 0$.
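For $p > 1$ the closed form $(p - 1)(y/p)^q$ can be compared with a crude grid search for $\sup_{x > 0} (yx - x^p)$. This is only a sketch: the grid bounds, the $(p, y)$ pairs and both function names are illustrative assumptions.

```python
def conj_power_formula(y, p):
    """Closed form for p > 1: 0 for y <= 0, (p - 1) * (y / p)**q for y > 0, q = p / (p - 1)."""
    if y <= 0:
        return 0.0
    q = p / (p - 1)
    return (p - 1) * (y / p) ** q

def conj_power_grid(y, p, x_max=10.0, n=200_000):
    """Crude estimate of sup_{x > 0} (y * x - x**p) on a uniform grid in (0, x_max]."""
    h = x_max / n
    return max(y * (k * h) - (k * h) ** p for k in range(1, n + 1))

for p, y in ((2.0, 3.0), (3.0, 2.0), (1.5, 1.0)):
    print(p, y, conj_power_formula(y, p), conj_power_grid(y, p))
```

For $p = 2$ the formula reduces to $y^2/4$, the familiar conjugate of $x^2$ restricted to $y > 0$.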
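(e) Geometric mean. $f(x) = -\bigl(\prod_i x_i\bigr)^{1/n}$ on $\mathbf{R}^n_{++}$.
Solution. The conjugate function is
\[ f^*(y) = \begin{cases} 0 & \text{if } y \preceq 0, \ \bigl(\prod_i (-y_i)\bigr)^{1/n} \geq 1/n \\ \infty & \text{otherwise.} \end{cases} \]
We first verify the domain of $f^*$. Assume $y$ has a positive component, say $y_k > 0$.
Then we can choose $x_k = t$ and $x_i = 1$, $i \neq k$, to show that
\[ x^T y - f(x) = t y_k + \sum_{i \neq k} y_i + t^{1/n} \]
is unbounded above as a function of $t > 0$. Hence the condition $y \preceq 0$ is indeed
required.
Next assume that $y \prec 0$, but $\bigl(\prod_i (-y_i)\bigr)^{1/n} < 1/n$. We choose $x_i = -t/y_i$, and
obtain
\[ x^T y - f(x) = -tn + t \Bigl( \prod_i \Bigl(-\frac{1}{y_i}\Bigr) \Bigr)^{1/n} \to \infty \]
as $t \to \infty$. This demonstrates that the second condition for the domain of $f^*$ is also
needed.
Now assume that $y \preceq 0$ and $\bigl(\prod_i (-y_i)\bigr)^{1/n} \geq 1/n$, and $x \succeq 0$. The arithmetic-geometric
mean inequality states that
\[ -\frac{x^T y}{n} \geq \Bigl( \prod_i (-y_i x_i) \Bigr)^{1/n} \geq \frac{1}{n} \Bigl( \prod_i x_i \Bigr)^{1/n}, \]
i.e., $x^T y \leq f(x)$ with equality for $x_i = -1/y_i$. Hence, $f^*(y) = 0$.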
(f) Negative generalized logarithm for second-order cone. $f(x, t) = -\log(t^2 - x^T x)$ on
$\{(x, t) \in \mathbf{R}^n \times \mathbf{R} \mid \|x\|_2 < t\}$.
Solution.
\[ f^*(y, u) = -2 + \log 4 - \log(u^2 - y^T y), \qquad \mathbf{dom}\, f^* = \{(y, u) \mid \|y\|_2 < -u\}. \]
We first verify the domain. Suppose kyk2 u. Choose x = sy, t = s(kxk2 + 1) >
skyk2 su, with s 0. Then
y T x + tu > sy T y su2 = s(u2 y T y) 0,
3.52 [MO79, §3.E.2] Log-convexity of moment functions. Suppose $f : \mathbf{R} \to \mathbf{R}$ is nonnegative
with $\mathbf{R}_+ \subseteq \mathbf{dom}\, f$. For $x \geq 0$ define
\[ \phi(x) = \int_0^\infty u^x f(u)\, du. \]
Show that $\phi$ is a log-convex function. (If $x$ is a positive integer, and $f$ is a probability
density function, then $\phi(x)$ is the $x$th moment of the distribution.)
Use this to show that the Gamma function,
\[ \Gamma(x) = \int_0^\infty u^{x-1} e^{-u}\, du, \]
is log-convex for $x \geq 1$.
Solution. $g(x, u) = u^x f(u)$ is log-convex (as well as log-concave) in $x$ for all $u > 0$. It
follows directly from the property on page 106 that
\[ \phi(x) = \int_0^\infty g(x, u)\, du = \int_0^\infty u^x f(u)\, du \]
is log-convex.
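Log-convexity of the Gamma function on $x \geq 1$ can be spot-checked with the standard library's `math.lgamma`, which returns $\log |\Gamma(x)|$. The sketch tests the midpoint inequality on an assumed grid over $[1, 10]$; the helper name is hypothetical.

```python
import math

def log_gamma_midpoint_gap(x1, x2):
    """(lgamma(x1) + lgamma(x2)) / 2 - lgamma((x1 + x2) / 2); nonnegative if log Gamma is convex."""
    return 0.5 * (math.lgamma(x1) + math.lgamma(x2)) - math.lgamma(0.5 * (x1 + x2))

grid = [1.0 + 0.25 * k for k in range(37)]  # x from 1 to 10
smallest_gap = min(log_gamma_midpoint_gap(a, b) for a in grid for b in grid)
print(smallest_gap)  # zero is attained on the diagonal x1 == x2
```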
3.53 Suppose $x$ and $y$ are independent random vectors in $\mathbf{R}^n$, with log-concave probability
density functions $f$ and $g$, respectively. Show that the probability density function of the
sum $z = x + y$ is log-concave.
Solution. The probability density function of $x + y$ is the convolution $f * g$, which is log-concave because the convolution of two log-concave functions is log-concave.
3.54 Log-concavity of Gaussian cumulative distribution function. The cumulative distribution
function of a Gaussian random variable,
\[ f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^x e^{-t^2/2}\, dt, \]
is log-concave. This follows from the general result that the convolution of two log-concave
functions is log-concave. In this problem we guide you through a simple self-contained
proof that $f$ is log-concave. Recall that $f$ is log-concave if and only if $f''(x) f(x) \leq f'(x)^2$
for all $x$.
(a) Verify that $f''(x) f(x) \leq f'(x)^2$ for $x \geq 0$. That leaves us the hard part, which is to
show the inequality for $x < 0$.
(b) Verify that for any $t$ and $x$ we have $t^2/2 \geq -x^2/2 + xt$.
(c) Using part (b) show that $e^{-t^2/2} \leq e^{x^2/2 - xt}$. Conclude that
\[ \int_{-\infty}^x e^{-t^2/2}\, dt \leq e^{x^2/2} \int_{-\infty}^x e^{-xt}\, dt. \]
(d) Use part (c) to verify that $f''(x) f(x) \leq f'(x)^2$ for $x \leq 0$.
Solution. The derivatives of $f$ are
\[ f'(x) = e^{-x^2/2} / \sqrt{2\pi}, \qquad f''(x) = -x e^{-x^2/2} / \sqrt{2\pi}. \]
(a) $f''(x) \leq 0$ for $x \geq 0$.
(b) Since $t^2/2$ is convex we have
\[ t^2/2 \geq x^2/2 + x(t - x) = xt - x^2/2. \]
This is the general inequality
\[ g(t) \geq g(x) + g'(x)(t - x), \]
which holds for any differentiable convex function, applied to $g(t) = t^2/2$.
(c) Take exponentials and integrate.

(d) This basic inequality reduces to
\[ -x e^{-x^2/2} \int_{-\infty}^x e^{-t^2/2}\, dt \leq e^{-x^2}, \]
i.e.,
\[ \int_{-\infty}^x e^{-t^2/2}\, dt \leq \frac{e^{-x^2/2}}{-x}. \]
This follows from part (c) because
\[ \int_{-\infty}^x e^{-xt}\, dt = \frac{e^{-x^2}}{-x}. \]
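The inequality $f''(x) f(x) \leq f'(x)^2$ for the Gaussian cumulative distribution function can be checked on a grid using `math.erfc` from the standard library. The sketch below is a numerical illustration under an assumed grid on $[-6, 6]$, not a substitute for the proof; the function names are hypothetical.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)

def cdf(x):
    # erfc avoids the cancellation in 0.5 * (1 + erf(...)) for very negative x
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def pdf(x):
    return math.exp(-0.5 * x * x) / SQRT_2PI

def log_concavity_gap(x):
    """f'(x)^2 - f''(x) f(x) with f' = pdf and f'' = -x * pdf; nonnegative iff the test holds at x."""
    return pdf(x) ** 2 + x * pdf(x) * cdf(x)

xs = [-6.0 + 0.05 * k for k in range(241)]  # grid on [-6, 6]
all_ok = all(log_concavity_gap(x) >= 0.0 for x in xs)
print(all_ok)
```

For $x \geq 0$ both terms of the gap are nonnegative, which mirrors part (a); the delicate region is $x < 0$, where the two terms nearly cancel.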
3.55 Log-concavity of the cumulative distribution function of a log-concave probability density.
In this problem we extend the result of exercise 3.54. Let $g(t) = \exp(-h(t))$ be a differentiable
log-concave probability density function, and let
\[ f(x) = \int_{-\infty}^x g(t)\, dt = \int_{-\infty}^x e^{-h(t)}\, dt \]
be its cumulative distribution. We will show that $f$ is log-concave, i.e., it satisfies
$f''(x) f(x) \leq (f'(x))^2$ for all $x$.
(a) Express the derivatives of $f$ in terms of the function $h$. Verify that $f''(x) f(x) \leq (f'(x))^2$ if $h'(x) \geq 0$.
(b) Assume that $h'(x) < 0$. Use the inequality
\[ h(t) \geq h(x) + h'(x)(t - x) \]
(which follows from convexity of $h$), to show that
\[ \int_{-\infty}^x e^{-h(t)}\, dt \leq \frac{e^{-h(x)}}{-h'(x)}. \]
Use this inequality to verify that $f''(x) f(x) \leq (f'(x))^2$ if $h'(x) < 0$.

Solution.
(a) $f(x) = \int_{-\infty}^x e^{-h(t)}\, dt$, $f'(x) = e^{-h(x)}$, $f''(x) = -h'(x) e^{-h(x)}$. Log-concavity means
\[ -h'(x) e^{-h(x)} \int_{-\infty}^x e^{-h(t)}\, dt \leq e^{-2h(x)}, \]
which is obviously true if $h'(x) \geq 0$.

(b) Take exponentials and integrate both sides of $-h(t) \leq -h(x) - h'(x)(t - x)$:
\[
\begin{aligned}
\int_{-\infty}^x e^{-h(t)}\, dt &\leq e^{x h'(x) - h(x)} \int_{-\infty}^x e^{-t h'(x)}\, dt \\
&= e^{x h'(x) - h(x)}\, \frac{e^{-x h'(x)}}{-h'(x)} \\
&= \frac{e^{-h(x)}}{-h'(x)},
\end{aligned}
\]
so
\[ (-h'(x)) \int_{-\infty}^x e^{-h(t)}\, dt \leq e^{-h(x)}. \]
Multiplying both sides by $e^{-h(x)}$ gives the log-concavity condition in (a).
