Lecture 17/18: Dynamic Programming - Matrix
Chain Parenthesization
COMS10007 - Algorithms
Dr. Christian Konrad
01.04.2019 and 02.04.2019
Dr. Christian Konrad Lecture 17/18: Matrix Chain Parenthesization 1 / 18

Matrix Multiplication
Problem: Matrix-Multiplication
1 Input: Matrices A, B with A.columns = B.rows
2 Output: Matrix product A × B
Example: [matrix figure not recoverable from extraction: a p × q matrix multiplied by a q × r matrix, giving a p × r matrix]
Notation: p × q matrix: p rows and q columns
p × q matrix times q × r matrix gives a p × r matrix

$(A \times B)_{i,j}$ = row $i$ of $A$ times column $j$ of $B$

Algorithm for Matrix-Multiplication
Algorithm: $(A \times B)_{i,j}$ = row $i$ of $A$ times column $j$ of $B$
Require: Matrices A, B with A.columns = B.rows
Let C be a new A.rows × B.columns matrix
for i ← 1 . . . A.rows do
    for j ← 1 . . . B.columns do
        C[i, j] ← 0
        for k ← 1 . . . A.columns do
            C[i, j] ← C[i, j] + A[i, k] · B[k, j]
return C
Algorithm Matrix-Multiply(A, B)
Runtime:
Three nested loops: O(A.rows · B.columns · A.columns)
Number of Multiplications: A.rows · B.columns · A.columns
Multiplying two n × n matrices: runtime $O(n^3)$
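The pseudocode above translates almost line by line into Python. The following is a minimal sketch, assuming matrices are given as lists of rows; the function name `matrix_multiply` and the extra multiplication counter are illustrative additions, not part of the lecture.

```python
def matrix_multiply(a, b):
    """Return (A x B, number of scalar multiplications) for list-of-rows matrices."""
    if len(a[0]) != len(b):
        raise ValueError("A.columns must equal B.rows")
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0] * cols for _ in range(rows)]
    mults = 0  # illustrative counter, not in the original pseudocode
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                c[i][j] += a[i][k] * b[k][j]
                mults += 1
    return c, mults


# A 2 x 3 matrix times a 3 x 2 matrix: 2 * 2 * 3 = 12 scalar multiplications.
product, mults = matrix_multiply([[1, 2, 3], [4, 5, 6]], [[1, 0], [0, 1], [2, 2]])
print(product, mults)  # [[7, 8], [16, 17]] 12
```

The counter confirms the formula A.rows · B.columns · A.columns for this small input.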
Background: Faster Matrix Multiplication
History: Multiplying two n × n matrices
before 1969: $O(n^3)$
1969: Strassen $O(n^{2.8074})$ (divide-and-conquer)
1990: Coppersmith-Winograd $O(n^{2.3755})$
2010: Stothers $O(n^{2.374})$
2011: Virginia Williams $O(n^{2.3728642})$
2014: Le Gall $O(n^{2.3728639})$
Important Problem:
Many algorithms rely on fast matrix multiplication
A better bound for matrix multiplication improves many algorithms

The Matrix-chain Multiplication Problem
Problem: Matrix-Chain-Multiplication
1 Input: A sequence (chain) of n matrices $A_1, A_2, A_3, \dots, A_n$
2 Output: The product $A_1 \times A_2 \times A_3 \times \dots \times A_n$
Discussion:
$A_i$.columns = $A_{i+1}$.rows for every $1 \le i < n$
Assume $A_i$ has dimension $p_{i-1} \times p_i$, for a vector $p[0 \dots n]$
Matrix product is associative:
$(A_1 \times A_2) \times A_3 = A_1 \times (A_2 \times A_3)$
Exploit Associativity: Parenthesize $A_1 \times A_2 \times A_3 \times \dots \times A_n$ so as to minimize the number of scalar multiplications (and thus the runtime)

Order matters
Example: Three matrices $A_1, A_2, A_3$ with dimensions
$A_1$: 10 × 100, $A_2$: 100 × 5, $A_3$: 5 × 50
($p_0 = 10$, $p_1 = 100$, $p_2 = 5$, $p_3 = 50$)
Computation of $(A_1 \times A_2) \times A_3$:
$A_1 \times A_2 = A_{12}$ requires 10 · 100 · 5 = 5000 multiplications
$A_{12} \times A_3$ requires 10 · 5 · 50 = 2500 multiplications
Total: 7500 multiplications
Computation of $A_1 \times (A_2 \times A_3)$:
$A_2 \times A_3 = A_{23}$ requires 100 · 5 · 50 = 25000 multiplications
$A_1 \times A_{23}$ requires 10 · 100 · 50 = 50000 multiplications
Total: 75000 multiplications
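The two totals can be re-derived mechanically from the dimension vector p. This is a small sketch; the helper name `mult_cost` is an assumption introduced here for illustration.

```python
def mult_cost(rows, inner, cols):
    """Scalar multiplications to multiply a rows x inner by an inner x cols matrix."""
    return rows * inner * cols


p = [10, 100, 5, 50]  # A1: 10 x 100, A2: 100 x 5, A3: 5 x 50

# (A1 x A2) x A3: first a 10 x 100 by 100 x 5 product, then 10 x 5 by 5 x 50.
left_first = mult_cost(p[0], p[1], p[2]) + mult_cost(p[0], p[2], p[3])

# A1 x (A2 x A3): first a 100 x 5 by 5 x 50 product, then 10 x 100 by 100 x 50.
right_first = mult_cost(p[1], p[2], p[3]) + mult_cost(p[0], p[1], p[3])

print(left_first, right_first)  # 7500 75000
```

The same three matrices cost ten times more scalar multiplications under the second parenthesization.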
The Matrix-Chain-Parenthesization Problem
Problem: Matrix-Chain-Parenthesization
1 Input: A sequence (chain) of n matrices $A_1, A_2, A_3, \dots, A_n$
2 Output: A parenthesization of $A_1 \times A_2 \times A_3 \times \dots \times A_n$ that minimizes the number of scalar multiplications
How many Parenthesizations P(n) are there?

We write $A_{ij}$ for the product $A_i \times A_{i+1} \times \dots \times A_j$
There is a final matrix multiplication: $A_{1k} \times A_{(k+1)n}$, for some $1 \le k \le n - 1$. Hence:
$$P(n) = \begin{cases} 1 & \text{if } n = 1, \\ \sum_{k=1}^{n-1} P(k)\,P(n-k) & \text{if } n \ge 2. \end{cases}$$
Example: Four matrices $A_1, A_2, A_3, A_4$
Possible final products: $A_1 \times A_{24}$, $A_{12} \times A_{34}$, $A_{13} \times A_4$

Number of Parenthesizations
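Example (continued): Four matrices $A_1, A_2, A_3, A_4$
Possible final products: $A_1 \times A_{24}$, $A_{12} \times A_{34}$, $A_{13} \times A_4$
$$P(3) = \sum_{k=1}^{2} P(k)\,P(3-k) = P(1)P(2) + P(2)P(1) = 2$$
$$P(4) = \sum_{k=1}^{3} P(k)\,P(4-k) = P(1)P(3) + P(2)P(2) + P(3)P(1) = P(3) + 1 + P(3) = 2P(3) + 1 = 5.$$
1 $A_1 \times ((A_2 \times A_3) \times A_4)$
2 $A_1 \times (A_2 \times (A_3 \times A_4))$
3 $(A_1 \times A_2) \times (A_3 \times A_4)$
4 $((A_1 \times A_2) \times A_3) \times A_4$
5 $(A_1 \times (A_2 \times A_3)) \times A_4$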
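The five parenthesizations can also be generated by following the same case distinction on the final product. The sketch below is illustrative only; the function name `parenthesizations` and the string format (with an extra outermost pair of parentheses and `x` for ×) are assumptions made here.

```python
def parenthesizations(i, j):
    """List every full parenthesization of the chain A_i x ... x A_j as a string."""
    if i == j:
        return [f"A{i}"]
    result = []
    for k in range(i, j):  # the final product is A_{ik} x A_{(k+1)j}
        for left in parenthesizations(i, k):
            for right in parenthesizations(k + 1, j):
                result.append(f"({left} x {right})")
    return result


for text in parenthesizations(1, 4):
    print(text)
print(len(parenthesizations(1, 4)))  # 5
```

Enumerating like this is only feasible for very short chains, since the number of results grows exponentially in the chain length.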

Number of Parenthesizations (2)
A Bound on the Number of Parenthesizations:

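$$P(n) = \begin{cases} 1 & \text{if } n = 1, \\ \sum_{k=1}^{n-1} P(k)\,P(n-k) & \text{if } n \ge 2. \end{cases}$$
P(1), P(2), P(3), . . . : 1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796, 58786, 208012, 742900, . . .
For $n \ge 3$, the terms $k = 1$ and $k = n - 1$ alone give $P(n) \ge 2P(n-1)$, hence $P(n) \ge 2^{n-2}$ and there are $\Omega(2^n)$ possibilities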

An efficient algorithm thus cannot try out all possibilities
We will give a dynamic programming algorithm
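The recurrence for P(n) can be evaluated directly to reproduce the sequence listed above. This is a small sketch; the function name `num_parenthesizations` and the use of `functools.lru_cache` for memoization are choices made here, not part of the lecture.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def num_parenthesizations(n):
    """P(n): number of ways to fully parenthesize a chain of n matrices."""
    if n == 1:
        return 1
    return sum(num_parenthesizations(k) * num_parenthesizations(n - k)
               for k in range(1, n))


values = [num_parenthesizations(n) for n in range(1, 15)]
print(values)

# Exponential growth: P(n) is at least 2^(n-2) for every n >= 2.
print(all(num_parenthesizations(n) >= 2 ** (n - 2) for n in range(2, 40)))  # True
```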

Optimal Substructure
We say that a problem P exhibits optimal substructure if:
An optimal solution to P contains within it optimal solutions to subproblems of P.
Optimal Substructure in Matrix-Chain-Parenthesization

Consider an optimal solution to an instance of size n
Suppose that the last product is $A_{1k} \times A_{(k+1)n}$
Then the optimal solution contains optimal parenthesizations of $A_1 \times A_2 \times \dots \times A_k$ and $A_{k+1} \times A_{k+2} \times \dots \times A_n$
Proof. Suppose it did not contain optimal parenthesizations of $A_1 \times A_2 \times \dots \times A_k$ and of $A_{k+1} \times A_{k+2} \times \dots \times A_n$. The last product costs $p_0 p_k p_n$ scalar multiplications no matter how the two subchains are parenthesized. Then replacing a non-optimal parenthesization of a subchain by an optimal one would give a strictly better solution to the initial instance, a contradiction.
Recursive Solution
Optimal Solution to Subproblem:
$m[i, j]$: minimum number of scalar multiplications needed to compute $A_i \times A_{i+1} \times \dots \times A_j = A_{ij}$
Observe that $m[i, i] = 0$ (chain consists of the single matrix $A_i$)
Suppose $j > i$. Suppose the last multiplication in an optimal solution is $A_{ik} \times A_{(k+1)j}$, for some $k$ with $i \le k < j$
Then:
$$m[i, j] = m[i, k] + m[k+1, j] + \underbrace{p_{i-1}\, p_k\, p_j}_{\text{cost of multiplying } A_{ik} \times A_{(k+1)j}}$$
($A_{ik}$: $p_{i-1} \times p_k$ matrix, $A_{(k+1)j}$: $p_k \times p_j$ matrix)
Since we do not know k, we try out all possibilities and choose the best solution:
$$m[i, j] = \begin{cases} 0 & \text{if } i = j, \\ \min_{i \le k < j} \{ m[i, k] + m[k+1, j] + p_{i-1}\, p_k\, p_j \} & \text{if } i < j. \end{cases}$$

Computing the Optimal Costs
$$m[i, j] = \begin{cases} 0 & \text{if } i = j, \\ \min_{i \le k < j} \{ m[i, k] + m[k+1, j] + p_{i-1}\, p_k\, p_j \} & \text{if } i < j. \end{cases}$$
Algorithmic Considerations:
As in Pole-Cutting, we could implement this recursive formula directly. → exponential runtime
Instead, we compute the table m[i, j] bottom up
Observe that there are at most $n^2$ subproblems m[i, j] (i and j take values in {1, . . . , n})
We will see that computing one value m[i, j] takes O(n) time
This yields an $O(n^3)$ time algorithm
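To see why the direct recursive implementation is exponential, one can count its calls. The sketch below evaluates the recurrence for m[i, j] without any table; the function name `min_cost_naive` and the one-element list used as a call counter are illustrative assumptions. The dimension vector is the three-matrix example from earlier in the lecture.

```python
def min_cost_naive(p, i, j, calls):
    """Evaluate m[i, j] straight from the recurrence; calls[0] counts invocations."""
    calls[0] += 1
    if i == j:
        return 0
    return min(
        min_cost_naive(p, i, k, calls)
        + min_cost_naive(p, k + 1, j, calls)
        + p[i - 1] * p[k] * p[j]
        for k in range(i, j)
    )


p = [10, 100, 5, 50]
calls = [0]
print(min_cost_naive(p, 1, len(p) - 1, calls), calls[0])  # 7500 9

# The number of calls depends only on the chain length n and equals 3^(n-1).
for n in range(1, 11):
    calls = [0]
    min_cost_naive([1] * (n + 1), 1, n, calls)
    print(n, calls[0])
```

The call count satisfies T(1) = 1 and T(n) = 1 + 2 · (T(1) + . . . + T(n − 1)), which solves to 3^(n−1); the same subproblems are recomputed over and over, which is what the bottom-up table avoids.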

Dynamic Programming Algorithm
Require: Integer n, vector p of matrix dimensions so that matrix $A_i$ has dimensions $p_{i-1} \times p_i$
Let m[1 . . . n, 1 . . . n] be a new array
for i ← 1 . . . n do
    m[i, i] ← 0
for l ← 2 . . . n do {chain length}
    for i ← 1 . . . n − l + 1 do {left position}
        j ← i + l − 1 {right position}
        m[i, j] ← ∞
        for k ← i . . . j − 1 do
            m[i, j] ← min{m[i, j], m[i, k] + m[k + 1, j] + $p_{i-1}\, p_k\, p_j$}
return m
Algorithm Matrix-Chain-Value(n, p)
Runtime: $O(n^3)$ (by evaluating $\sum_{l=2}^{n} \sum_{i=1}^{n-l+1} \sum_{k=1}^{i+l-2} O(1)$)
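A possible Python rendering of Matrix-Chain-Value is sketched below. It assumes the dimension vector is passed as a list `p` of length n + 1 (so n is derived from it rather than passed separately) and keeps the 1-based indices of the pseudocode by leaving row and column 0 of the table unused; both are choices made for this sketch.

```python
import math


def matrix_chain_value(p):
    """Return the table m, where m[i][j] is the minimum cost of computing A_i ... A_j."""
    n = len(p) - 1
    # 1-based indexing as in the pseudocode; index 0 is unused. m[i][i] = 0 already.
    m = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):          # chain length
        for i in range(1, n - length + 2):  # left position
            j = i + length - 1              # right position
            m[i][j] = math.inf
            for k in range(i, j):
                m[i][j] = min(m[i][j], m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j])
    return m


m = matrix_chain_value([10, 100, 5, 50])
print(m[1][2], m[2][3], m[1][3])  # 5000 25000 7500
```

For the three-matrix example from earlier, the table entry m[1][3] agrees with the cheaper of the two hand-computed orders.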

Runtime Evaluation
Useful Formula:
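$$\sum_{i=a}^{b} 1 = b - a + 1$$
$$\sum_{l=2}^{n} \sum_{i=1}^{n-l+1} \sum_{k=1}^{i+l-2} O(1) = O(1) \cdot \sum_{l=2}^{n} \sum_{i=1}^{n-l+1} \sum_{k=1}^{i+l-2} 1$$
$$\le O(1) \cdot \sum_{l=1}^{n} \sum_{i=1}^{n} \sum_{k=1}^{n} 1 = O(1) \cdot \sum_{l=1}^{n} \sum_{i=1}^{n} n = O(1) \cdot n \sum_{l=1}^{n} \sum_{i=1}^{n} 1$$
$$= O(1) \cdot n \sum_{l=1}^{n} n = O(1) \cdot n^2 \sum_{l=1}^{n} 1 = O(1) \cdot n^2 \cdot n = O(1)\, n^3$$
$$= O(n^3).$$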
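The bound above is deliberately generous. As a sanity check that is not part of the lecture, the sketch below counts how often the innermost line of Matrix-Chain-Value runs when k ranges over i . . . j − 1 as in the algorithm. The function name `inner_iterations` is hypothetical; the closed form (n³ − n)/6 follows from summing (n − l + 1)(l − 1) over l = 2 . . . n.

```python
def inner_iterations(n):
    """Count executions of the innermost loop body of Matrix-Chain-Value."""
    count = 0
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            for k in range(i, j):
                count += 1
    return count


for n in (1, 2, 3, 4, 10, 100):
    print(n, inner_iterations(n), (n ** 3 - n) // 6)
```

The exact count is cubic in n, so the O(n³) bound is tight up to a constant factor.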

Example n = 4 and p = 3 7 6 2 9
($A_1$: 3 × 7, $A_2$: 7 × 6, $A_3$: 6 × 2, $A_4$: 2 × 9; the table entry in row j, column i holds m[i, j])
Initialization (for i ← 1 . . . n do m[i, i] ← 0): m[1, 1] = m[2, 2] = m[3, 3] = m[4, 4] = 0
l = 2, i = 1, j = 2:
m[1, 2] = m[1, 1] + m[2, 2] + p0 p1 p2 = 0 + 0 + 3 · 7 · 6 = 126
l = 2, i = 2, j = 3:
m[2, 3] = m[2, 2] + m[3, 3] + p1 p2 p3 = 0 + 0 + 7 · 6 · 2 = 84
l = 2, i = 3, j = 4:
m[3, 4] = m[3, 3] + m[4, 4] + p2 p3 p4 = 0 + 0 + 6 · 2 · 9 = 108
l = 3, i = 1, j = 3:
k = 1: m[1, 1] + m[2, 3] + p0 p1 p3 = 0 + 84 + 3 · 7 · 2 = 84 + 42 = 126
k = 2: m[1, 2] + m[3, 3] + p0 p2 p3 = 126 + 0 + 3 · 6 · 2 = 126 + 36 = 162
m[1, 3] = 126
l = 3, i = 2, j = 4:
k = 2: m[2, 2] + m[3, 4] + p1 p2 p4 = 0 + 108 + 7 · 6 · 9 = 108 + 378 = 486
k = 3: m[2, 3] + m[4, 4] + p1 p3 p4 = 84 + 0 + 7 · 2 · 9 = 84 + 126 = 210
m[2, 4] = 210
l = 4, i = 1, j = 4:
k = 1: m[1, 1] + m[2, 4] + p0 p1 p4 = 0 + 210 + 3 · 7 · 9 = 210 + 189 = 399
k = 2: m[1, 2] + m[3, 4] + p0 p2 p4 = 126 + 108 + 3 · 6 · 9 = 234 + 162 = 396
k = 3: m[1, 3] + m[4, 4] + p0 p3 p4 = 126 + 0 + 3 · 2 · 9 = 126 + 54 = 180
m[1, 4] = 180
Final table:
       1     2     3     4
1      0
2    126     0
3    126    84     0
4    180   210   108     0

Optimal Solution of Example
Example: n = 4 and p = 3 7 6 2 9
Algorithm outputs value of optimal solution: m[1, 4] = 180
We would like to know the optimal parenthesization as well:
$(A_1 \times (A_2 \times A_3)) \times A_4$
→ Modify algorithm to keep track of parameters that give
minimum in array s

Keep Track of Optimal Choices
Let m[1 . . . n, 1 . . . n] and s[1 . . . n, 2 . . . n] be new arrays
for i ← 1 . . . n do
    m[i, i] ← 0
for l ← 2 . . . n do {chain length}
    for i ← 1 . . . n − l + 1 do {left position}
        j ← i + l − 1 {right position}
        m[i, j] ← ∞
        for k ← i . . . j − 1 do
            q ← m[i, k] + m[k + 1, j] + $p_{i-1}\, p_k\, p_j$
            if q < m[i, j] then
                m[i, j] ← q
                s[i, j] ← k
return m, s
Algorithm Matrix-Chain-Order(n, p)
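A Python sketch of the version that also records the minimizing split points is given below. As assumptions of this sketch, the dimension vector is passed as a list `p` of length n + 1, and both tables are allocated with an unused row and column 0 so that the pseudocode's 1-based indices carry over unchanged.

```python
import math


def matrix_chain_order(p):
    """Return (m, s): optimal costs m[i][j] and the split points s[i][j] achieving them."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]  # s[i][j] is only meaningful for i < j
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = math.inf
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k  # remember which split gave the minimum
    return m, s


m, s = matrix_chain_order([10, 100, 5, 50])
print(m[1][3], s[1][3])  # 7500 2
```

For the three-matrix example with p = (10, 100, 5, 50), the recorded split s[1][3] = 2 means the last product is $A_{12} \times A_3$.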
Print Optimal Parenthesization
Using s to find Optimal Parenthesization
Require: Array s, positions i, j
if i = j then
    print “$A_i$”
else
    print “(”
    Print-Optimal-Parens(s, i, s[i, j])
    Print-Optimal-Parens(s, s[i, j] + 1, j)
    print “)”
Algorithm Print-Optimal-Parens(s, i, j)
Call Print-Optimal-Parens(s, 1, n) to obtain the parenthesization
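The recursion can be sketched in Python as follows. To keep the block self-contained, it builds and returns a string instead of printing piece by piece, and it uses a hand-written split table for the three-matrix example $A_1$: 10 × 100, $A_2$: 100 × 5, $A_3$: 5 × 50; the function name `optimal_parens` and the nested-list layout of `s` (index 0 unused) are assumptions of this sketch.

```python
def optimal_parens(s, i, j):
    """Return the parenthesization of A_i ... A_j encoded by the split table s."""
    if i == j:
        return f"A{i}"
    k = s[i][j]  # the last product is A_{ik} x A_{(k+1)j}
    return "(" + optimal_parens(s, i, k) + optimal_parens(s, k + 1, j) + ")"


# Split table for p = (10, 100, 5, 50), 1-based with row and column 0 unused:
# s[1][2] = 1, s[2][3] = 2, and s[1][3] = 2 because (A1 x A2) x A3 is cheaper.
s = [
    [0, 0, 0, 0],
    [0, 0, 1, 2],
    [0, 0, 0, 2],
    [0, 0, 0, 0],
]
print(optimal_parens(s, 1, 3))  # ((A1A2)A3)
```

Each matrix and each split point is visited once, so reconstructing the parenthesization takes O(n) recursive calls once s is available.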
