A NEW POLYNOMIAL-TIME ALGORITHM FOR LINEAR PROGRAMMING

N. KARMARKAR
We present a new polynomial-time algorithm for linear programming. In the worst case, the algorithm requires $O(n^{3.5}L)$ arithmetic operations on $O(L)$-bit numbers, where $n$ is the number of variables and $L$ is the number of bits in the input. The running time of this algorithm is better than the ellipsoid algorithm by a factor of $O(n^{2.5})$. We prove that given a polytope $P$ and a strictly interior point $a \in P$, there is a projective transformation of the space that maps $P, a$ to $P', a'$ having the following property: the ratio of the radius of the smallest sphere with center $a'$ containing $P'$ to the radius of the largest sphere with center $a'$ contained in $P'$ is $O(n)$. The algorithm consists of repeated application of such projective transformations, each followed by optimization over an inscribed sphere, to create a sequence of points which converges to the optimal solution in polynomial time.
1. Informal outline
We shall reduce the general linear programming problem over the rationals to the following particular case:
$$\text{minimize } c^T x \quad \text{subject to} \quad Ax = 0, \quad \sum_i x_i = 1, \quad x \ge 0.$$
Here $x = (x_1, \ldots, x_n)^T \in \mathbf{R}^n$, $c \in \mathbf{Z}^n$, $A \in \mathbf{Z}^{m \times n}$.
This is a substantially revised version of the paper presented at the Symposium on Theory
of Computing, Washington D. C., April 1984.
AMS subject classification (1980): 90 C 05
Suppose the polytope $P$ contains an ellipsoid $E$ with center $a_0$ and is contained in the concentric ellipsoid $E'$ obtained from $E$ by magnification by a factor $\nu$. Let $f_E, f_P, f_{E'}$ denote the minimum values of the objective function $f(x)$ on $E$, $P$, and $E'$ respectively. Then
$$f(a_0) - f_E \le f(a_0) - f_P \le f(a_0) - f_{E'} = \nu\,[f(a_0) - f_E].$$
The last equation follows from the linearity of $f(x)$. Hence
$$\frac{f(a_0) - f_E}{f(a_0) - f_P} \ge \frac{1}{\nu}, \qquad \frac{f(a') - f_P}{f(a_0) - f_P} \le \left(1 - \frac{1}{\nu}\right).$$
Thus by going from $a_0$ to the point, say $a'$, that minimizes $f(x)$ over $E$, we come closer to the minimum value of the objective function by a factor $1 - 1/\nu$. We can repeat the same process with $a'$ as the center. The rate of convergence of this method depends on $\nu$: the smaller the value of $\nu$, the faster the convergence.
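As a quick supplementary calculation (assuming, for illustration, that the gap contracts by exactly the factor $1 - 1/\nu$ in every step):
$$f(a^{(k)}) - f_P \le \left(1 - \frac{1}{\nu}\right)^k \left[f(a^{(0)}) - f_P\right] \le e^{-k/\nu}\left[f(a^{(0)}) - f_P\right],$$
so $k = \lceil \nu q \ln 2 \rceil$ steps suffice to reduce the gap by a factor of $2^{-q}$. With $\nu = O(n)$, as achieved below, this is the source of the $O(n(q + \log n))$ step bound of Theorem 1.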
We shall be interested in projective transformations of the hyperplane $\Sigma = \{x \mid \sum_i x_i = 1\}$. Given a strictly interior point $a$ of the simplex, define the transformation $x' = T(a, a_0)(x)$ by
$$x_i' = \frac{x_i/a_i}{\sum_j x_j/a_j}, \qquad i = 1, 2, \ldots, n.$$
This transformation has the following properties (a numerical sketch follows the list):
1. $T$ is one-to-one and maps the simplex onto itself. Its inverse is given by
$$x_i = \frac{a_i x_i'}{\sum_j a_j x_j'}, \qquad i = 1, 2, \ldots, n.$$
2. Each facet of the simplex given by $x_i = 0$ is mapped onto the corresponding facet $x_i' = 0$.
3. The image of the point $x = a$ is the center of the simplex, given by $x_i' = \dfrac{1}{n}$.
4. Let $A_i$ denote the $i$th column of $A$. Then the system of equations
$$\sum_i A_i x_i = 0$$
now becomes
$$\frac{\sum_i A_i a_i x_i'}{\sum_i a_i x_i'} = 0,$$
or
$$\sum_i A_i' x_i' = 0, \qquad \text{where } A_i' = a_i A_i.$$
We denote the affine subspace $\{x' \mid A'x' = 0\}$ by $\Omega'$.
5. If $a \in \Omega$ then the center of the simplex $a_0 \in \Omega'$.
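The following Python sketch (our illustration, with hypothetical helper names; not part of the original paper) checks properties 1 and 3 numerically:

```python
import numpy as np

def T(a, x):
    """Projective map T(a, a0): x'_i = (x_i / a_i) / sum_j (x_j / a_j)."""
    y = x / a
    return y / y.sum()

def T_inv(a, xp):
    """Inverse map: x_i = a_i x'_i / sum_j a_j x'_j."""
    y = a * xp
    return y / y.sum()

rng = np.random.default_rng(0)
a = rng.random(4); a /= a.sum()   # a strictly interior point of the simplex
x = rng.random(4); x /= x.sum()   # an arbitrary point of the simplex
assert np.allclose(T_inv(a, T(a, x)), x)        # property 1: T is one-to-one
assert np.allclose(T(a, a), np.full(4, 0.25))   # property 3: a maps to the center
```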
Let $B(a_0, r)$ be the largest ball with center $a_0$ (the center of the simplex) and radius $r$ that is contained in the simplex. Similarly, let $B(a_0, R)$ be the smallest ball containing $\Delta$. It is easy to show that $R/r = n - 1$. Then
$$B(a_0, r) \subseteq \Delta \subseteq B(a_0, R),$$
$$B(a_0, r) \cap \Omega' \subseteq \Delta \cap \Omega' \subseteq B(a_0, R) \cap \Omega'.$$
But $\Delta \cap \Omega'$ is the image $\Pi'$ of the polytope $\Pi = \Delta \cap \Omega$. The intersection of a ball and an affine subspace is a ball of lower dimension in that subspace, and it has the same radius if the center of the original ball lies in the affine subspace, which it does in our case. Hence we get
$$B'(a_0, r) \subseteq \Pi' \subseteq B'(a_0, R), \qquad \frac{R}{r} = n - 1,$$
which proves that $\nu = $ dimension of the space can always be achieved by this method. (Note: $\Delta$ is an $(n-1)$-dimensional simplex.)
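The radii can be computed explicitly (a quick verification of the claim $R/r = n - 1$; the computation is standard and not spelled out in the text): for the simplex $\Delta = \{x \ge 0, \sum_i x_i = 1\}$ with center $a_0 = e/n$, the distance from $a_0$ to a vertex $e_i$ is $R = \sqrt{(n-1)/n}$, while the distance from $a_0$ to a facet $x_i = 0$ within the hyperplane $\sum_i x_i = 1$ is $r = 1/\sqrt{n(n-1)}$, so $R/r = n - 1$.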
2. Main algorithm
2.1. Problem definition
In order to focus on the ideas that are new and most important in our algorithm, we make several simplifying assumptions. In Section 5 we remove all these restrictions by showing how a linear programming problem in standard form can be transformed into our canonical form.
2.2. Comments
1. Note that the feasible region is the intersection of an affine space with a simplex, rather than the positive orthant. Initially it takes one projective transformation to bring a linear programming problem to this form.
2. The linear system of equations defining $\Omega$ is homogeneous, i.e., the right-hand side is zero. The projective transformation that transforms the positive orthant into a simplex also transforms an inhomogeneous system of equations into a homogeneous one.
3. The target minimum value of the objective function is zero. This will be
justified by combining primal and dual into a single problem.
The algorithm creates a sequence of points $x^{(0)}, x^{(1)}, \ldots, x^{(m)}$ by these steps:

Step 0. Initialize.
$$x^{(0)} = \text{center of the simplex}.$$

Step 1. Compute the next point in the sequence.
$$x^{(k+1)} = \varphi(x^{(k)}).$$
The function $\varphi$ is described in detail below; its final substep applies the inverse projective transformation
$$b = \frac{Db'}{e^T Db'}$$
and returns $b$.
Step 2. Check for infeasibility.
We define a "potential" function by
$$f(x) = \sum_i \ln \frac{c^T x}{x_i}.$$
We expect a certain improvement $\delta$ in the potential function at each step. The value of $\delta$ depends on the choice of the parameter $\alpha$ in Step 1, Substep 4; for example, if $\alpha = 1/4$ then $\delta = 1/8$. If we do not observe the expected improvement, i.e., if $f(x^{(k+1)}) > f(x^{(k)}) - \delta$, then we stop and conclude that the minimum value of the objective function must be strictly positive. When the canonical form of the problem was obtained by transformation of a standard linear program, this situation corresponds to the case that the original problem does not have a finite optimum, i.e., it is either infeasible or unbounded.
Step 3. Check for optimality.
This check is carried out periodically. It involves going from the current interior point to an extreme point without increasing the value of the objective function, and testing the extreme point for optimality. This is done only when the time spent since the last check exceeds the time required for checking.
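As an illustration, a minimal Python skeleton of Steps 0-2 might look as follows (the function names are ours; `phi` stands for the map $\varphi$ of Step 1, described in detail below):

```python
import numpy as np

def potential(c, x):
    """Karmarkar's potential function f(x) = sum_i ln(c^T x / x_i)."""
    return np.sum(np.log(c @ x / x))

def solve(c, A, phi, steps, delta=1/8):
    """Skeleton of the main loop (Steps 0-2); delta = 1/8 matches alpha = 1/4."""
    n = A.shape[1]
    x = np.full(n, 1.0 / n)             # Step 0: center of the simplex
    for _ in range(steps):
        x_new = phi(c, A, x)            # Step 1: next point in the sequence
        # Step 2: if the expected improvement is not observed, the minimum
        # value of the objective function must be strictly positive.
        if potential(c, x_new) > potential(c, x) - delta:
            return None
        x = x_new
    return x
```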
Theorem 1. In $O(n(q + \log n))$ steps the algorithm finds a feasible point $x$ such that
$$\frac{c^T x}{c^T a_0} \le 2^{-q}.$$
The proof of this theorem is based on the "potential" function given by
$$f(x) = \sum_j \ln \frac{c^T x}{x_j}.$$
Theorem 2. Either (i) $f(x^{(k+1)}) \le f(x^{(k)}) - \delta$, or
(ii) the minimum value of the objective function is strictly positive, where $\delta$ is a constant that depends on the choice of the value of the parameter $\alpha$ of Step 1, Substep 4.
A particular choice that works: if $\alpha = 1/4$, then $\delta = 1/8$.
One step of the algorithm is of the form $b = \varphi(a)$ and consists of a sequence of three steps:
1. Perform the projective transformation $T(a, a_0)$ of the simplex $\Delta$ that maps the input point $a$ to the center $a_0$.
2. Optimize (approximately) the transformed objective function over an inscribed sphere to find a point $b'$.
3. Apply the inverse of $T$ to the point $b'$ to obtain the output $b$.
We describe each of these steps in detail.
Let $r$ be the radius of the largest ball with center $a_0$ that can be inscribed in the simplex $\Delta$. Then
$$r = \frac{1}{\sqrt{n(n-1)}}.$$
We optimize over a smaller ball $B(a_0, \alpha r)$, $0 < \alpha < 1$, for two reasons:
1. It allows optimization of $f'(y)$ to be approximated very closely by optimization of a linear function.
2. If we wish to perform arithmetic operations approximately rather than by exact rational arithmetic, it gives us a margin to absorb round-off errors without going outside the simplex.
One choice of the value of $\alpha$ that works is $\alpha = 1/4$, corresponding to $\delta = 1/8$ in Theorem 2. We restrict the affine space $\Omega' = \{y \mid ADy = 0\}$ by the equation $\sum_i y_i = 1$. Let $\Omega''$ be the resulting affine space. Let $B$ be the matrix obtained by adding a row of all 1's to $AD$. Then any displacement in $\Omega''$ lies in the null space of $B$, i.e., for $u \in \Omega''$, $\Omega'' = u + \operatorname{Ker} B$.
We are interested in optimizing $f'(y)$ over $B(a_0, \alpha r) \cap \Omega''$. First, we prove the existence of a point that achieves a constant reduction in the potential function.

Theorem 3. There exists a point $b' \in B(a_0, \alpha r) \cap \Omega''$ such that $f'(b') \le f'(a_0) - \delta$, where $\delta$ is a constant.
Then we prove that minimization of $f'(x)$ can be approximated by minimization of the linear function $c'^T x$.

Theorem 4. Let $b''$ be the point that minimizes $c'^T x$ over $B(a_0, \alpha r) \cap \Omega''$. Then $f'(b'') \le f'(a_0) - \delta$, where $\delta$ is a positive constant depending on $\alpha$. For $\alpha = 1/4$ we may take $\delta = 1/8$.
Algorithm A.
1. Project $c'$ orthogonally into the null space of $B$:
$$c_p = [I - B^T(BB^T)^{-1}B]\,c'.$$
2. Normalize $c_p$:
$$\hat{c}_p = \frac{c_p}{|c_p|}.$$
3. Take a step of length $\alpha r$ along the direction $-\hat{c}_p$:
$$b' = a_0 - \alpha r\, \hat{c}_p.$$
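A direct transcription of Algorithm A (a sketch under our naming conventions: `c_prime` is the transformed objective $c' = Dc$ and `D` is $\operatorname{Diag}(a)$ for the current iterate $a$):

```python
import numpy as np

def algorithm_A(c_prime, A, D, alpha=0.25):
    """Minimize c'^T x over B(a0, alpha*r) intersected with the null space of B."""
    n = D.shape[0]
    AD = A @ D
    B = np.vstack([AD, np.ones(n)])      # add a row of all 1's to AD
    # 1. Project c' orthogonally into the null space of B.
    w = np.linalg.solve(B @ B.T, B @ c_prime)
    c_p = c_prime - B.T @ w              # [I - B^T (B B^T)^{-1} B] c'
    # 2. Normalize c_p.
    c_hat = c_p / np.linalg.norm(c_p)
    # 3. Take a step of length alpha*r along -c_hat from the center a0.
    a0 = np.full(n, 1.0 / n)
    r = 1.0 / np.sqrt(n * (n - 1))       # inradius of the simplex
    return a0 - alpha * r * c_hat
```

Applying the inverse transformation $b = Db'/(e^T Db')$ to the returned point $b'$ then yields the next iterate.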
Theorem 5. The point $b'$ returned by Algorithm A minimizes $c'^T x$ over $B(a_0, \alpha r) \cap \Omega''$.
Proof of Theorem 5. Let $z$ be any point such that $z \in B(a_0, \alpha r) \cap \Omega''$. Then $B(b' - z) = 0$, so
$$B^T(BB^T)^{-1}B(b' - z) = 0 \;\Rightarrow\; (c' - c_p)^T(b' - z) = 0.$$
Hence
$$c'^T(b' - z) = c_p^T(b' - z) = |c_p|\left[\hat{c}_p^T(a_0 - z) - \alpha r\right] \le 0,$$
since $\hat{c}_p^T(a_0 - z) \le |a_0 - z| \le \alpha r$. Therefore $c'^T b' \le c'^T z$. $\blacksquare$

Proof of Theorem 3. Let $x^*$ be a point of the feasible region at which the objective function attains its minimum value, zero, and let $b' = (1 - \lambda)a_0 + \lambda x^*$ for the largest $\lambda$ such that $b'$ remains in $B(a_0, \alpha r)$. Then
$$\frac{c'^T b'}{c'^T a_0} = 1 - \lambda$$
and
$$b_j' = (1 - \lambda)a_{0j} + \lambda x_j^*.$$
Hence
$$f'(a_0) - f'(b') = \sum_j \ln \frac{c'^T a_0\, b_j'}{c'^T b'\, a_{0j}} = \sum_j \ln \frac{(1 - \lambda)a_{0j} + \lambda x_j^*}{(1 - \lambda)a_{0j}} = \sum_j \ln \left[1 + \frac{\lambda}{1 - \lambda} \frac{x_j^*}{a_{0j}}\right].$$
Now we use the following inequality: if $P_i \ge 0$ then $\prod_i (1 + P_i) \ge 1 + \sum_i P_i$; therefore
$$\sum_i \ln(1 + P_i) \ge \ln\left(1 + \sum_i P_i\right).$$
Since
$$b' = (1 - \lambda)a_0 + \lambda x^*, \qquad b' - a_0 = \lambda(x^* - a_0),$$
we have
$$\alpha r = |b' - a_0| = \lambda|x^* - a_0| \le \lambda R,$$
where $R$ is the radius of the smallest circumscribing sphere and $R/r = n - 1$. Hence
$$\lambda \ge \alpha \frac{r}{R} = \frac{\alpha}{n - 1}.$$
Since $\sum_j x_j^*/a_{0j} = n \sum_j x_j^* = n$,
$$1 + \frac{\lambda}{1 - \lambda}\, n \ge 1 + \frac{n\alpha/(n-1)}{1 - \alpha/(n-1)} = 1 + \frac{n\alpha}{n - 1 - \alpha} \ge 1 + \alpha,$$
and therefore
$$f'(a_0) - f'(b') \ge \ln(1 + \alpha).$$
Taking $\delta = \ln(1 + \alpha)$, $f'(b') \le f'(a_0) - \delta$. $\blacksquare$
Proof of Theorem 4. First we prove a few lemmas.

Lemma 4.1. If $|x| \le \beta < 1$ then
$$|\ln(1 + x) - x| \le \frac{x^2}{2(1 - \beta)}.$$
Proof. A simple exercise in calculus. $\blacksquare$
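For completeness, here is one way the exercise can be carried out (a sketch using the alternating series of $\ln(1 + x)$, valid for $|x| \le \beta < 1$):
$$|\ln(1 + x) - x| = \left|\sum_{k \ge 2} \frac{(-1)^{k-1} x^k}{k}\right| \le \frac{1}{2}\sum_{k \ge 2} |x|^k = \frac{x^2}{2} \cdot \frac{1}{1 - |x|} \le \frac{x^2}{2(1 - \beta)}.$$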
Lemma 4.2. Let $x \in B(a_0, \alpha r)$ and let
$$\beta = \frac{\alpha r}{1/n} = \frac{\alpha n}{\sqrt{n(n-1)}} = \alpha\sqrt{\frac{n}{n-1}}.$$
Then
$$\left|\sum_j \ln \frac{x_j}{a_{0j}}\right| \le \frac{\beta^2}{2(1 - \beta)}.$$

Proof. Since $|x_j - a_{0j}| \le |x - a_0| \le \alpha r$ and $a_{0j} = 1/n$,
$$\left|\frac{x_j - a_{0j}}{a_{0j}}\right| \le \beta \qquad \text{for } j = 1, \ldots, n,$$
and hence by Lemma 4.1
$$\left|\ln \frac{x_j}{a_{0j}} - \frac{x_j - a_{0j}}{a_{0j}}\right| = \left|\ln\left(1 + \frac{x_j - a_{0j}}{a_{0j}}\right) - \frac{x_j - a_{0j}}{a_{0j}}\right| \le \frac{1}{2(1 - \beta)}\left(\frac{x_j - a_{0j}}{a_{0j}}\right)^2.$$
Summing over $j$ and using $\sum_j (x_j - a_{0j})^2 \le \alpha^2 r^2$,
$$\left|\sum_j \ln \frac{x_j}{a_{0j}} - \sum_j \frac{x_j - a_{0j}}{a_{0j}}\right| \le \frac{1}{2(1 - \beta)} \sum_j \left(\frac{x_j - a_{0j}}{a_{0j}}\right)^2 \le \frac{\beta^2}{2(1 - \beta)},$$
but
$$\sum_j \frac{x_j - a_{0j}}{a_{0j}} = n\left[\sum_j x_j - \sum_j a_{0j}\right] = n[1 - 1] = 0.$$
Hence $\left|\sum_j \ln(x_j/a_{0j})\right| \le \dfrac{\beta^2}{2(1 - \beta)}$. $\blacksquare$
By the proof of Theorem 3,

(1) $\qquad f'(a_0) - f'(b') \ge \ln(1 + \alpha).$

Write $f'(x) = f'(a_0) + g(x) - \sum_j \ln \dfrac{x_j}{a_{0j}}$ with $g(x) = n \ln \dfrac{c'^T x}{c'^T a_0}$, so that

(2) $\qquad \left|f'(x) - (f'(a_0) + g(x))\right| = \left|\sum_j \ln \dfrac{x_j}{a_{0j}}\right| \le \dfrac{\beta^2}{2(1 - \beta)}$

by Lemma 4.2. Since $g(x)$ depends on $c'^T x$ in a monotonically increasing manner, $g(x)$ and $c'^T x$ achieve their minimum values over $B(a_0, \alpha r) \cap \Omega''$ at the same point, viz. $b''$, so

(3) $\qquad g(b'') \le g(b').$
By (1), (2) and (3),
$$f'(a_0) - f'(b'') \ge \ln(1 + \alpha) - \frac{\beta^2}{1 - \beta}, \qquad \beta = \alpha\sqrt{\frac{n}{n-1}}.$$
Define
$$\delta(n) = \ln(1 + \alpha) - \frac{\alpha^2 n}{(n-1)\left[1 - \alpha\sqrt{n/(n-1)}\right]}.$$
Then
$$f'(a_0) - f'(b'') \ge \delta.$$
Observe that:
1. $\lim_{n \to \infty} \delta(n) = \ln(1 + \alpha) - \dfrac{\alpha^2}{1 - \alpha}$. If $\alpha = 1/4$, then $\lim_{n \to \infty} \delta(n) \ge 1/8$.
2. If $n \ge 4$ and $\alpha = 1/4$, then $\delta \ge 1/10$. $\blacksquare$
Proof of Theorem 1. By Theorem 2, $f(x^{(k)}) \le f(x^{(0)}) - k\delta$, i.e.,
$$n \ln \frac{c^T x^{(k)}}{c^T a_0} \le \sum_j \ln x_j^{(k)} - \sum_j \ln a_{0j} - k\delta.$$
But $x_j^{(k)} \le 1$ and $a_{0j} = 1/n$, so
$$\ln \frac{c^T x^{(k)}}{c^T a_0} \le \ln(n) - \frac{k\delta}{n}.$$
Therefore there exists a constant $k'$ such that after $m = k'(n(q + \ln n))$ steps,
$$\frac{c^T x}{c^T a_0} \le 2^{-q}. \qquad \blacksquare$$
$$x \ge 0, \quad u \ge 0.$$
By duality theory, this combined problem is feasible if and only if the original problem has a finite optimum solution.
Step 2. Introduce slack variables.
$$Ax - y = b$$
$$A^T u + v = c$$
$$c^T x - b^T u = 0$$
$$x \ge 0, \quad u \ge 0, \quad y \ge 0, \quad v \ge 0.$$
Step 3. Introduce an artificial variable to create an interior starting point. Let $x_0, y_0, u_0, v_0$ be strictly interior points of the positive orthant.
$$\text{minimize } \lambda$$
subject to
$$Ax - y + (b - Ax_0 + y_0)\lambda = b$$
$$A^T u + v + (c - A^T u_0 - v_0)\lambda = c$$
$$c^T x - b^T u + (-c^T x_0 + b^T u_0)\lambda = 0$$
$$x \ge 0, \quad y \ge 0, \quad u \ge 0, \quad v \ge 0, \quad \lambda \ge 0.$$
Note that $x = x_0$, $y = y_0$, $u = u_0$, $v = v_0$, $\lambda = 1$ is a strictly interior feasible solution which we can take as a starting point. The minimum value of $\lambda$ is zero if and only if the problem in Step 2 is feasible.
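To make the construction concrete, the following Python sketch (our own function name and variable layout) assembles the Step 3 constraint matrix in the variables $(x, y, u, v, \lambda)$ and verifies that the given point with $\lambda = 1$ is feasible:

```python
import numpy as np

def combined_system(A, b, c, x0, y0, u0, v0):
    """Build the Step-3 system; columns are ordered (x, y, u, v, lambda)."""
    m, n = A.shape
    M = np.vstack([
        # Ax - y + (b - A x0 + y0) lambda = b
        np.hstack([A, -np.eye(m), np.zeros((m, m)), np.zeros((m, n)),
                   (b - A @ x0 + y0)[:, None]]),
        # A^T u + v + (c - A^T u0 - v0) lambda = c
        np.hstack([np.zeros((n, n)), np.zeros((n, m)), A.T, np.eye(n),
                   (c - A.T @ u0 - v0)[:, None]]),
        # c^T x - b^T u + (-c^T x0 + b^T u0) lambda = 0
        np.hstack([c, np.zeros(m), -b, np.zeros(n),
                   [-c @ x0 + b @ u0]])[None, :],
    ])
    rhs = np.concatenate([b, c, [0.0]])
    start = np.concatenate([x0, y0, u0, v0, [1.0]])
    assert np.allclose(M @ start, rhs)   # the interior point is feasible
    return M, rhs, start

rng = np.random.default_rng(0)
m, n = 3, 5
A, b, c = rng.random((m, n)), rng.random(m), rng.random(n)
x0, y0 = rng.random(n) + 1, rng.random(m) + 1
u0, v0 = rng.random(m) + 1, rng.random(n) + 1
M, rhs, start = combined_system(A, b, c, x0, y0, u0, v0)
```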
Step 4. Change of notation. For convenience of description we rewrite the problem in Step 3 as
$$\text{minimize } c^T x, \qquad c, x \in \mathbf{R}^n,$$
$$\text{subject to } Ax = b, \quad x \ge 0,$$
where $x = a$ is a known strictly interior starting point and the target value of the objective function we are interested in is zero.
Step 5. A projective transformation of the positive orthant into a simplex. Consider the transformation $x' = T(x)$, where $x \in \mathbf{R}^n$ and $x' \in \mathbf{R}^{n+1}$ are related by
$$x_i' = \frac{x_i/a_i}{\sum_j (x_j/a_j) + 1}, \qquad i = 1, 2, \ldots, n,$$
$$x_{n+1}' = 1 - \sum_{i=1}^n x_i'.$$
The inverse transformation is given by
$$x_i = \frac{a_i x_i'}{x_{n+1}'}, \qquad i = 1, 2, \ldots, n.$$
If the original constraints are
$$\sum_{i=1}^n A_i x_i = b,$$
then, substituting $x_i = a_i x_i'/x_{n+1}'$,
$$\sum_{i=1}^n \frac{A_i a_i x_i'}{x_{n+1}'} = b,$$
or
$$\sum_{i=1}^n A_i a_i x_i' - b\, x_{n+1}' = 0.$$
Defining $A_i' = a_i A_i$ for $i = 1, \ldots, n$ and $A_{n+1}' = -b$, the constraints become $A'x' = 0$. Making the same substitution in the equation $c^T x = 0$ defining the target set $Z$, we get
$$\sum_{i=1}^{n} \frac{c_i a_i x_i'}{x_{n+1}'} = 0.$$
Define $c' \in \mathbf{R}^{n+1}$ by
$$c_i' = a_i c_i, \quad i = 1, 2, \ldots, n, \qquad c_{n+1}' = 0.$$
Then $c^T x = 0$ implies $c'^T x' = 0$. Therefore the image of $Z$ is given by $Z' = \{x' \in \mathbf{R}^{n+1} \mid c'^T x' = 0\}$.
The transformed problem is
$$\text{minimize } c'^T x'$$
$$\text{subject to } A'x' = 0, \qquad \sum_{i=1}^{n+1} x_i' = 1, \qquad x' \ge 0.$$
The center of the simplex is a feasible starting point. Thus we get a problem in the canonical form.
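A small numerical sketch of this transformation and its inverse (Python; the function names are ours):

```python
import numpy as np

def to_simplex(a, x):
    """Step-5 map T: the positive orthant of R^n into the simplex in R^{n+1}."""
    y = x / a
    denom = y.sum() + 1.0
    return np.append(y / denom, 1.0 / denom)   # last entry is x'_{n+1}

def from_simplex(a, xp):
    """Inverse map: x_i = a_i x'_i / x'_{n+1}."""
    return a * xp[:-1] / xp[-1]

rng = np.random.default_rng(1)
a = rng.random(5) + 0.5   # the known strictly interior point
x = rng.random(5) + 0.1
assert np.allclose(from_simplex(a, to_simplex(a, x)), x)
assert np.allclose(to_simplex(a, a), np.full(6, 1 / 6))   # a maps to the center
```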
Application of our main algorithm to this problem either gives a solution $x'$ such that $c'^T x' = 0$, or we conclude that the minimum value of $c'^T x'$ is strictly positive. In the first case, the inverse image of $x'$ gives an optimal solution to the original problem. In the latter case we conclude that the original problem does not have a finite optimum, i.e., it is either infeasible or unbounded.
The only quantity that changes from step to step is the diagonal matrix $D$, since $D_{ii}^{(k)} = x_i^{(k)}$. In order to take advantage of the computations done in previous steps, we exploit the following facts about matrix inverses. Let $D$, $D'$ be $n \times n$ diagonal matrices.
1. If $D$ and $D'$ are "close" in some suitable norm, the inverse of $AD'^2A^T$ can be used in place of the inverse of $AD^2A^T$.
2. If $D$ and $D'$ differ in only one entry, the inverse of $AD'^2A^T$ can be computed in $O(n^2)$ arithmetic operations, given the inverse of $AD^2A^T$.
(Note: Instead of finding the inverse, one could also devise an algorithm based on other techniques such as LU decomposition. In practice, it may be better to update the LU decomposition directly, but this will not change the theoretical worst-case complexity of the algorithm.)
We define a diagonal matrix $D'^{(k)}$, a "working approximation" to $D^{(k)}$ in step $k$, such that

(2) $\qquad \dfrac{1}{2} \le \dfrac{D_{ii}'^{(k)}}{D_{ii}^{(k)}} \le 2 \qquad$ for $i = 1, 2, \ldots, n$.
Suppose a matrix $M$ is modified by a rank-one matrix expressed as the outer product of two vectors $u$ and $v$. Then its inverse can be modified by the equation

(3) $\qquad (M + uv^T)^{-1} = M^{-1} - \dfrac{(M^{-1}u)(v^T M^{-1})}{1 + v^T M^{-1} u}.$
Given $M^{-1}$, computation of $(M + uv^T)^{-1}$ can be done in $O(n^2)$ steps. In order to apply equation (3), note that if $D'$ and $D''$ differ only in the $i$th entry, then

(4) $\qquad AD'^2A^T = AD''^2A^T + \left[D_{ii}'^2 - D_{ii}''^2\right] a_i a_i^T,$

where $a_i$ is the $i$th column of $A$. This is clearly a rank-one modification. If $D'$ and $D''$ differ in $l$ entries, we can perform $l$ successive rank-one updates in $O(n^2 l)$ time to obtain the new inverse. All other operations required for computing $c_p$, the orthogonal projection of $c'$, involve either multiplication of a vector by a matrix or a linear combination of two vectors, and hence can be performed in $O(n^2)$ time.
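The following Python sketch illustrates equations (3) and (4) (our naming; the update is checked against a direct inverse):

```python
import numpy as np

def rank_one_update(M_inv, u, v):
    """Equation (3): inverse of M + u v^T from M^{-1}, in O(n^2) operations."""
    Mu = M_inv @ u
    vM = v @ M_inv
    return M_inv - np.outer(Mu, vM) / (1.0 + v @ Mu)

def update_for_entry(M_inv, A, d_new, d_old, i):
    """Equation (4): D' and D'' differ only in entry i; update (A D^2 A^T)^{-1}."""
    a_i = A[:, i]
    return rank_one_update(M_inv, (d_new**2 - d_old**2) * a_i, a_i)

rng = np.random.default_rng(2)
A = rng.random((3, 5))
d = rng.random(5) + 0.5
M_inv = np.linalg.inv(A @ np.diag(d**2) @ A.T)
d2 = d.copy(); d2[1] = 2.0                      # change one diagonal entry
M2_inv = update_for_entry(M_inv, A, d2[1], d[1], 1)
assert np.allclose(M2_inv, np.linalg.inv(A @ np.diag(d2**2) @ A.T))
```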
(9) $\qquad BQ^{-1}B^T = \begin{pmatrix} ADQ^{-1}DA^T & ADQ^{-1}e \\ (ADQ^{-1}e)^T & e^TQ^{-1}e \end{pmatrix}.$
The last inequality follows from Theorem 3. Thus for the modified algorithm, the claim of Theorem 3 is modified accordingly. Because of the second inclusion, Lemma 4.2 continues to be valid, and we can approximate minimization of $f''(x)$ by minimization of a linear function. The claim (as in Theorem 2) about one step of the algorithm, $f(x^{(k+1)}) \le f(x^{(k)}) - \delta$, also remains valid with $\delta$ redefined appropriately. This affects the number of steps by only a constant factor and the algorithm still works correctly.
In this subsection we show that the total number of rank-one updating operations in $m$ steps of the modified algorithm is $O(m\sqrt{n})$. Since each rank-one modification requires $O(n^2)$ arithmetic operations, the average work per step is $O(n^{2.5})$, as compared to $O(n^3)$ in the simpler form of the algorithm.
In each step $|b' - a_0| \le \alpha r$. Substituting $b' = T(x^{(k+1)})$, $a_0 = \frac{1}{n}e$, and $T(x) = \dfrac{D^{-1}x}{e^T D^{-1}x}$, and writing $\Delta_i^{(k)} = n\,[T(x^{(k+1)})]_i$ for the resulting coordinate ratios, we obtain
$$\sum_i \left[\Delta_i^{(k)} - 1\right]^2 \le \beta^2.$$
Recall that $D^{(k)} = \operatorname{Diag}\{x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)}\}$ and $D^{(k+1)} = \operatorname{Diag}\{x_1^{(k+1)}, x_2^{(k+1)}, \ldots, x_n^{(k+1)}\}$. $D'^{(k)}$ is updated in two stages. First we scale $D'^{(k)}$ by the factor $a^{(k)}$ defined in equation (16):

(18) $\qquad D'^{(k+1)} = a^{(k)} D'^{(k)}.$

Then, for every entry that violates condition (2), we reset $D_{ii}'^{(k+1)} = D_{ii}^{(k+1)}$.
Define the discrepancy between $D^{(k)}$ and $D'^{(k)}$ in terms of the ratios $\Delta_i^{(k)}$. Let $n_k$, $k = 1, \ldots, m$, be the number of updating operations performed in the individual steps of the algorithm, and let $N = \sum_{k=1}^m n_k$ be the total number of updating operations in $m$ steps.
By equations (20) and (21),
$$\sum_{i=1}^n \left|\ln \Delta_i^{(k)}\right| = \sum_{i=1}^n \left|\ln \Delta_i^{(k)} - (\Delta_i^{(k)} - 1) + (\Delta_i^{(k)} - 1)\right|$$
$$\le \sum_{i=1}^n \left|\ln \Delta_i^{(k)} - (\Delta_i^{(k)} - 1)\right| + \sum_{i=1}^n \left|\Delta_i^{(k)} - 1\right| \le \frac{\beta^2}{2(1 - \beta)} + \beta\sqrt{n},$$
by Lemma 4.1 and the Cauchy--Schwarz inequality. Summed over $m$ steps, the total accumulated discrepancy is $O(m\sqrt{n})$; since each reset operation is triggered only after the discrepancy in some coordinate has grown by a constant, the total number of updating operations is $N = O(m\sqrt{n})$.
7. Concluding remarks
7.1. Global analysis of optimization algorithms
While theoretical methods for analyzing local convergence of non-linear programming and other geometric algorithms are well-developed, the state of the art in global convergence analysis is rather unsatisfactory. The algorithmic and analytical techniques introduced in this paper may turn out to be valuable in designing geometric algorithms with provably good global convergence properties. Our method can be thought of as a steepest descent method with respect to a particular metric space over a simplex defined in terms of the "cross-ratio", a projective invariant. The global nature of our result was made possible because any (strictly interior) point in the feasible region can be mapped to any other such point by a transformation that preserves affine spaces as well as the metric. This metric can easily be generalized to arbitrary convex sets with "well-behaved" boundary and to intersections of such convex sets. This metric effectively transforms the feasible region so that the boundary of the region is at infinite distance from the interior points. Furthermore, this transformation is independent of the objective function being optimized. Contrast this with penalty function methods, which require an ad hoc mixture of the objective function and a penalty function: it is not clear a priori in what proportion the two functions should be mixed, and the right proportion depends on both the objective function and the feasible region.
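The cross-ratio construction can be made concrete (a supplementary note: the formula below is the classical Hilbert projective metric, the standard metric built from the cross-ratio; the paper does not state it explicitly). For interior points $x, y$ of a bounded convex region, let $p$ and $q$ be the points where the chord through $x$ and $y$ meets the boundary, with $p$ on the side of $x$. Then
$$d(x, y) = \ln \frac{|p - y| \cdot |q - x|}{|p - x| \cdot |q - y|},$$
and $d(x, y) \to \infty$ as either point approaches the boundary, which is exactly the sense in which the boundary is at infinite distance from the interior points.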
N. Karmarkar
A T & T Bell Laboratories
Murray Hill, N J 07974
U.S.A.