Quantum Information Theory Tutorial

Mark M. Wilde
Hearne Institute for Theoretical Physics,

Department of Physics and Astronomy,
Center for Computation and Technology,
Louisiana State University,
Baton Rouge, Louisiana, USA
Reference: Quantum Information Theory,
published by Cambridge University Press (2nd edition)
QuILT’s Eve, November 1, 2018, Louisiana State University, Baton Rouge, Louisiana

Main questions
What are the ultimate limitations on communication imposed by
physical laws?
What are methods for achieving these limits?
To address these questions, we need to consider quantum mechanics,
and so we are naturally led to an intersection of information theory
and quantum mechanics called quantum information theory.
What is different about quantum and “classical” information theory?
What tasks can we achieve with quantum mechanics that we cannot
without it? (long list: Bell inequalities, super-dense coding,
teleportation, data locking, data hiding, quantum cryptography, etc.)

Prehistory of quantum information theory
1927 Heisenberg uncertainty principle
1935 Einstein–Podolsky–Rosen paper questioning compatibility of
uncertainty principle and phenomenon of quantum entanglement /
1964 Bell’s theorem as an answer / 2009 Berta et al. entropic
uncertainty relation as another answer
1932 von Neumann quantum entropy / 1962 Umegaki quantum
relative entropy / 1973 Lieb–Ruskai strong subadditivity of quantum
entropy / 1975 Lindblad data-processing for quantum relative entropy
1970s theory of quantum measurements and similarity measures for
quantum states: Helstrom, Holevo (Shannon Award 2016), Ozawa,
Bures, Uhlmann, etc.

Information theory
1948: Shannon set the foundations of information theory, defining
notions like data compression and channel capacity and giving
answers in terms of entropy and mutual information, respectively.
Shannon considered only classical physics (without quantum effects).
His work (and that of others) ultimately led to questions like:
“How do quantum effects enhance communication capacity?”
“How do quantum effects enhance communication security?”
“What are some quantum communication tasks that do not have a
counterpart in the classical world?”

Tutorial overview
Quantum states and channels
Fundamental protocols: Bell / CHSH game, entanglement
distribution, super-dense coding, quantum teleportation
Distance measures for quantum states
Information measures
Quantum data compression
Communication over quantum channels

Review of quantum formalism
Let’s begin by reviewing some basics of quantum information
All we need to start understanding quantum information is how to
represent states and evolutions of quantum systems.
We do this by using density matrices and quantum channels.
These ideas extend how we represent states of a classical system with
probability distributions and evolutions of these classical systems with
classical channels (conditional probability distributions).
We’ll find that the set of quantum states contains all classical states
and is far richer, which is suggestive of why we can do things that are
not possible in classical information theory.

Quantum states

Quantum states
The state of a quantum system is given by a square matrix called the
density matrix, usually denoted by ρ, σ, τ, ω, etc.
It should be positive semi-definite and have trace equal to one. That
is, all of its eigenvalues should be non-negative and sum up to one.
We write these conditions symbolically as ρ ≥ 0 and Tr{ρ} = 1. Can
abbreviate more simply as ρ ∈ D(H), to be read as “ρ is in the set of
density matrices.”
The dimension of the matrix indicates the number of distinguishable
states of the quantum system.
For example, a physical qubit is a quantum system with dimension
two. A classical bit, which has two distinguishable states, can be
embedded into a qubit.

Interpretation of density matrix
The density matrix, in addition to a description of an experimental
procedure, is all that one requires to predict the (probabilistic)
outcomes of a given experiment performed on a quantum system.
It is a generalization of (and subsumes) a probability distribution,
which describes the state of a classical system. All probability
distributions can be embedded into a quantum state by placing the
entries along the diagonal of the density matrix.

Let’s talk about qubits...
Superconducting phase qubit, from
http://web.physics.ucsb.edu/~martinisgroup/photos.shtml,
taken by Erik Lucero
Examples of quantum states
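Let
\[
|0\rangle \equiv \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad
\langle 0| \equiv \begin{bmatrix} 1 & 0 \end{bmatrix},
\]
so that density matrix
\[
\rho_0 \equiv |0\rangle\langle 0| = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}.
\]
Similarly, let
\[
|1\rangle \equiv \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \qquad
\langle 1| \equiv \begin{bmatrix} 0 & 1 \end{bmatrix},
\]
so that density matrix
\[
\rho_1 \equiv |1\rangle\langle 1| = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}.
\]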
Then ρ0 ρ1 = 0. The states ρ0 and ρ1 are orthogonal to each other,
and, physically, this means that they are perfectly distinguishable.
What we have done here is to embed classical bits into quantum bits.
We can think of ρ0 as ‘0’ and ρ1 as ‘1.’
Mixtures of quantum states
Any probabilistic mixture of two quantum states is also a quantum
state. That is, for σ0, σ1 ∈ D(H) and p ∈ [0, 1], we have
pσ0 + (1 − p)σ1 ∈ D(H).
The set of density matrices is thus convex.
For our classical example, we find
\[
p\rho_0 + (1-p)\rho_1 = p|0\rangle\langle 0| + (1-p)|1\rangle\langle 1|
= \begin{bmatrix} p & 0 \\ 0 & 1-p \end{bmatrix}.
\]
This is the statement that probabilistic classical bits can be embedded
into quantum bits, and the probabilities appear along the diagonal of
the matrix. Can we have other kinds of quantum states?
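The embedding of a classical bit and the convexity statement can be checked numerically. The following is a minimal sketch, assuming matrices are stored as plain nested Python lists; the helper names (`outer`, `mix`, `matmul`, `trace`) and the value p = 0.25 are illustrative choices, not part of the tutorial.

```python
def outer(ket):
    """Return the density matrix |ket><ket| as a nested list."""
    return [[a * complex(b).conjugate() for b in ket] for a in ket]

def mix(p, rho, sigma):
    """Convex combination p*rho + (1-p)*sigma, entry by entry."""
    return [[p * r + (1 - p) * s for r, s in zip(row_r, row_s)]
            for row_r, row_s in zip(rho, sigma)]

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

rho0 = outer([1, 0])   # |0><0|
rho1 = outer([0, 1])   # |1><1|
p = 0.25               # illustrative mixing probability
rho = mix(p, rho0, rho1)

print(rho)                 # p and 1-p appear on the diagonal
print(trace(rho))          # the trace stays equal to one
print(matmul(rho0, rho1))  # zero matrix: rho0 and rho1 are orthogonal
```

The off-diagonal entries of `rho` are zero, which is what makes this state a purely classical mixture.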

Superpositions of quantum states
Construct the following unit vector as a superposition of |0⟩ and |1⟩:
\[
|\psi\rangle \equiv \alpha|0\rangle + \beta|1\rangle = \begin{bmatrix} \alpha \\ \beta \end{bmatrix},
\]
where α, β ∈ C and |α|² + |β|² = 1. Note that ⟨ψ| = [α* β*] and
⟨ϕ|ψ⟩ denotes the inner product of vectors |ψ⟩ and |ϕ⟩.
The unit vector |ψ⟩ leads to the following quantum state:
\[
|\psi\rangle\langle\psi| = \begin{bmatrix} |\alpha|^2 & \alpha\beta^* \\ \beta\alpha^* & |\beta|^2 \end{bmatrix}.
\]
The difference between this quantum state and the others we’ve
considered so far is the presence of off-diagonal elements in the
density matrix (called quantum coherences).
This state is physically distinct from
\[
\begin{bmatrix} |\alpha|^2 & 0 \\ 0 & |\beta|^2 \end{bmatrix}.
\]
Bloch sphere
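We can visualize the state of a qubit using the Bloch sphere. To see
this, consider the Pauli matrices
\[
I \equiv \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad
X \equiv \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad
Y \equiv \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}, \quad
Z \equiv \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}.
\]
The last three Pauli matrices have eigenvalues ±1 and respective eigenvectors
\[
X:\ |\pm\rangle \equiv \frac{1}{\sqrt{2}}(|0\rangle \pm |1\rangle), \qquad
Y:\ |\pm_Y\rangle \equiv \frac{1}{\sqrt{2}}(|0\rangle \pm i|1\rangle), \qquad
Z:\ |0\rangle,\ |1\rangle.
\]
We can write the density matrix ρ of a qubit in terms of three
real parameters r_x, r_y, and r_z:
\[
\rho = \frac{1}{2}\left(I + r_x X + r_y Y + r_z Z\right),
\]
where r_x² + r_y² + r_z² ≤ 1, which describes the unit ball in R³
(its surface is the unit sphere).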
Bloch sphere
We can visualize the state of a qubit using the Bloch sphere:
The maximally mixed state I/2 = (|0⟩⟨0| + |1⟩⟨1|)/2 is at the center.
Classical states are on the line going from |0⟩ to |1⟩.
A quantum state is pure if it is on the surface and otherwise mixed.
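The Bloch-vector picture can be made concrete with a short computation. The sketch below assumes the standard relation r_k = Tr{P_k ρ} for P_k ∈ {X, Y, Z}, which follows from the parametrization ρ = (I + r_x X + r_y Y + r_z Z)/2; the function names are illustrative.

```python
import math

X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def bloch_vector(rho):
    """Return (r_x, r_y, r_z) with r_k = Tr{P_k rho}."""
    return tuple(complex(trace(matmul(P, rho))).real for P in (X, Y, Z))

def radius(rho):
    """Length of the Bloch vector: 1 on the surface (pure), < 1 inside (mixed)."""
    return math.sqrt(sum(r * r for r in bloch_vector(rho)))

plus = [[0.5, 0.5], [0.5, 0.5]]         # |+><+|, has off-diagonal coherences
maximally_mixed = [[0.5, 0], [0, 0.5]]  # I/2, same diagonal but no coherences

print(bloch_vector(plus), radius(plus))
print(bloch_vector(maximally_mixed), radius(maximally_mixed))
```

Both states have the same diagonal, yet |+⟩⟨+| sits on the surface along the x axis while I/2 sits at the center, which illustrates why the coherences matter physically.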
Higher dimensional quantum systems
A density matrix can have dimension ≥ 2 and can be written as
\[
\rho = \sum_{i,j} \rho_{i,j}\, |i\rangle\langle j|,
\]
where {|i⟩ ≡ e_i} is the standard basis and ρ_{i,j} are the matrix elements.
Since every density matrix is positive semi-definite and has trace
equal to one, it has a spectral decomposition as
\[
\rho = \sum_x p_X(x)\, |\phi_x\rangle\langle\phi_x|,
\]
where {p_X(x)} are the non-negative eigenvalues, summing to one,
and {|φ_x⟩} is a set of orthonormal eigenvectors.
A density matrix ρ is pure if there exists a unit vector |ψ⟩ such that
ρ = |ψ⟩⟨ψ| and otherwise it is mixed.
Multiple qubits...
IBM five-qubit universal quantum computer (released May 2016)

Composite quantum systems
Just as we need more than one bit for information processing to
become interesting, quantum information really only becomes
interesting when multiple quantum systems can interact.
We use the Cartesian product to represent the state of two or more bits:
(0, 0), (0, 1), (1, 0), (1, 1) ∈ Z_2 × Z_2,
but the Cartesian product is not rich enough to capture quantum states.
Consider that before we constructed a quantum state from a
superposition of two unit vectors. So we could imagine constructing a
quantum state from a superposition of vectors as
α|0, 0⟩ + β|0, 1⟩ + γ|1, 0⟩ + δ|1, 1⟩,
where |α|² + |β|² + |γ|² + |δ|² = 1. But what are |i, j⟩?

Tensor product
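We use the tensor product to represent multiple quantum systems.
For vectors, it is defined as
\[
\begin{bmatrix} a_1 \\ b_1 \end{bmatrix} \otimes \begin{bmatrix} a_2 \\ b_2 \end{bmatrix}
\equiv \begin{bmatrix} a_1 \begin{bmatrix} a_2 \\ b_2 \end{bmatrix} \\ b_1 \begin{bmatrix} a_2 \\ b_2 \end{bmatrix} \end{bmatrix}
= \begin{bmatrix} a_1 a_2 \\ a_1 b_2 \\ b_1 a_2 \\ b_1 b_2 \end{bmatrix}.
\]
So, then with this definition, we have
\[
|\varphi\rangle \equiv \alpha|0\rangle \otimes |0\rangle + \beta|0\rangle \otimes |1\rangle
+ \gamma|1\rangle \otimes |0\rangle + \delta|1\rangle \otimes |1\rangle
= \begin{bmatrix} \alpha \\ \beta \\ \gamma \\ \delta \end{bmatrix},
\]
which leads to a two-qubit density operator |ϕ⟩⟨ϕ|.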

System labels
Often it can be helpful to write system labels, which indicate which
qubit Alice possesses and which Bob possesses:
|ϕ⟩_AB ≡ α|0⟩_A ⊗ |0⟩_B + β|0⟩_A ⊗ |1⟩_B + γ|1⟩_A ⊗ |0⟩_B + δ|1⟩_A ⊗ |1⟩_B.
We can also write the labels on the two-qubit density operator:
|ϕ⟩⟨ϕ|_AB.
Often we abbreviate the above more simply as
α|00⟩_AB + β|01⟩_AB + γ|10⟩_AB + δ|11⟩_AB.

Tensor product for matrices
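For matrices K and L, the tensor product is defined in a similar way:
\[
K \otimes L \equiv
\begin{bmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{bmatrix} \otimes
\begin{bmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \end{bmatrix}
\equiv
\begin{bmatrix}
k_{11} \begin{bmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \end{bmatrix} &
k_{12} \begin{bmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \end{bmatrix} \\
k_{21} \begin{bmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \end{bmatrix} &
k_{22} \begin{bmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \end{bmatrix}
\end{bmatrix}
=
\begin{bmatrix}
k_{11} l_{11} & k_{11} l_{12} & k_{12} l_{11} & k_{12} l_{12} \\
k_{11} l_{21} & k_{11} l_{22} & k_{12} l_{21} & k_{12} l_{22} \\
k_{21} l_{11} & k_{21} l_{12} & k_{22} l_{11} & k_{22} l_{12} \\
k_{21} l_{21} & k_{21} l_{22} & k_{22} l_{21} & k_{22} l_{22}
\end{bmatrix}.
\]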

Properties of tensor product
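For vectors:
\[
\begin{aligned}
z(|\phi\rangle \otimes |\psi\rangle) &= (z|\phi\rangle) \otimes |\psi\rangle = |\phi\rangle \otimes (z|\psi\rangle), \\
(|\phi_1\rangle + |\phi_2\rangle) \otimes |\psi\rangle &= |\phi_1\rangle \otimes |\psi\rangle + |\phi_2\rangle \otimes |\psi\rangle, \\
|\phi\rangle \otimes (|\psi_1\rangle + |\psi_2\rangle) &= |\phi\rangle \otimes |\psi_1\rangle + |\phi\rangle \otimes |\psi_2\rangle.
\end{aligned}
\]
Matrices acting on vectors:
\[
\begin{aligned}
(K \otimes L)(|\phi\rangle \otimes |\psi\rangle) &= K|\phi\rangle \otimes L|\psi\rangle, \\
(K \otimes L)\Big(\sum_x \lambda_x |\phi_x\rangle \otimes |\psi_x\rangle\Big) &= \sum_x \lambda_x K|\phi_x\rangle \otimes L|\psi_x\rangle, \\
\Big(\sum_x \mu_x K_x \otimes L_x\Big)(|\phi\rangle \otimes |\psi\rangle) &= \sum_x \mu_x K_x|\phi\rangle \otimes L_x|\psi\rangle.
\end{aligned}
\]
Inner product: (⟨φ1| ⊗ ⟨ψ1|)(|φ2⟩ ⊗ |ψ2⟩) = ⟨φ1|φ2⟩⟨ψ1|ψ2⟩.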
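The tensor product definitions and the rule (K ⊗ L)(|φ⟩ ⊗ |ψ⟩) = K|φ⟩ ⊗ L|ψ⟩ can be checked on small integer examples. This is a sketch assuming nested lists as matrices and column vectors; `kron`, `matmul`, and `col` are hypothetical helper names, and the numbers are arbitrary.

```python
def kron(a, b):
    """Tensor (Kronecker) product of two nested-list matrices."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def col(values):
    """Turn a flat list into a column vector (n x 1 matrix)."""
    return [[v] for v in values]

K = [[1, 2], [3, 4]]
L = [[0, 1], [1, 0]]
phi = col([1, 2])
psi = col([3, 5])

lhs = matmul(kron(K, L), kron(phi, psi))     # (K ⊗ L)(|phi> ⊗ |psi>)
rhs = kron(matmul(K, phi), matmul(L, psi))   # K|phi> ⊗ L|psi>

print(kron(phi, psi))  # entries a1*a2, a1*b2, b1*a2, b1*b2
print(lhs == rhs)
```

Using integers keeps the comparison exact, so the equality test does not need a floating-point tolerance.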
Composite quantum systems
If the state of Alice’s system is ρ and the state of Bob’s system is σ
and they have never interacted in the past, then the state of the joint
Alice-Bob system is
ρA ⊗ σB .
We use the system labels to say who has what.
For example, their state could be
|0⟩⟨0|_A ⊗ |0⟩⟨0|_B, or
|1⟩⟨1|_A ⊗ |1⟩⟨1|_B,
or a mixture of both, with p ∈ [0, 1]:
p|0⟩⟨0|_A ⊗ |0⟩⟨0|_B + (1 − p)|1⟩⟨1|_A ⊗ |1⟩⟨1|_B.

Quantum entanglement...
Depiction of quantum entanglement taken from

http://thelifeofpsi.com/2013/10/28/bertlmanns-socks/

Separable states and entangled states
If Alice and Bob prepare states ρ_A^x and σ_B^x based on a random
variable X with distribution p_X, then the state of their systems is
\[
\sum_x p_X(x)\, \rho_A^x \otimes \sigma_B^x.
\]
Such states are called separable states and can be prepared using
local operations and classical communication (no need for a quantum
interaction between A and B to prepare these states).
By spectral decomposition, every separable state can be written as
\[
\sum_z p_Z(z)\, |\psi^z\rangle\langle\psi^z|_A \otimes |\phi^z\rangle\langle\phi^z|_B,
\]
where, for each z, |ψ^z⟩_A and |φ^z⟩_B are unit vectors.
Entangled states are states that cannot be written in the above form.

Example of entangled state
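A prominent example of an entangled state is the ebit (pronounced “ee-bit”):
|Φ⟩⟨Φ|_AB,
where |Φ⟩_AB ≡ (|00⟩_AB + |11⟩_AB)/√2.
In matrix form, this is
\[
|\Phi\rangle\langle\Phi|_{AB} = \frac{1}{2}
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}.
\]
To see that this is entangled, consider that for every |ψ⟩_A and |φ⟩_B,
\[
\big|\langle\Phi|_{AB}\, |\psi\rangle_A \otimes |\phi\rangle_B\big|^2 \le \frac{1}{2}.
\]
Every separable state σ_AB is a mixture of such product states, so
⟨Φ|σ_AB|Φ⟩ ≤ 1/2, whereas the ebit itself has overlap 1 with |Φ⟩
⇒ impossible to write |Φ⟩⟨Φ|_AB as a separable state.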
Tool: Schmidt decomposition
Schmidt decomposition theorem

Given a two-party unit vector |ψ⟩_AB ∈ H_A ⊗ H_B, we can express it as
\[
|\psi\rangle_{AB} \equiv \sum_{i=0}^{d-1} \sqrt{p_i}\, |i\rangle_A |i\rangle_B, \quad \text{where}
\]
probabilities p_i are real, strictly positive, and normalized: Σ_i p_i = 1.
{|i⟩_A} and {|i⟩_B} are orthonormal bases for systems A and B.
[√p_i]_{i∈{0,...,d−1}} is the vector of Schmidt coefficients.
Schmidt rank d of |ψ⟩_AB is equal to the number of Schmidt
coefficients √p_i in its Schmidt decomposition and satisfies
d ≤ min{dim(H_A), dim(H_B)}.
State |ψ⟩⟨ψ|_AB is entangled iff d ≥ 2.

Tool: Partial trace
The trace of a matrix X can be realized as
\[
\operatorname{Tr}\{X\} = \sum_i \langle i|X|i\rangle,
\]
where {|i⟩} is an orthonormal basis.
Partial trace of a matrix Y_AB acting on H_A ⊗ H_B can be realized as
\[
\operatorname{Tr}_A\{Y_{AB}\} = \sum_i (\langle i|_A \otimes I_B)\, Y_{AB}\, (|i\rangle_A \otimes I_B),
\]
where {|i⟩_A} is an orthonormal basis for H_A and I_B is the identity
matrix acting on H_B.
Both trace and partial trace are linear operations.

Interpretation of partial trace
Suppose Alice and Bob possess quantum systems in the state ρ_AB.
We calculate the density matrix for Alice’s system by taking the
partial trace over Bob’s system:
ρ_A ≡ Tr_B{ρ_AB}.
We can then use ρ_A to predict the outcome of any experiment
performed on Alice’s system alone.
Partial trace generalizes marginalizing a probability distribution:
\[
\begin{aligned}
\operatorname{Tr}_Y\Big\{ \sum_{x,y} p_{X,Y}(x,y)\, |x\rangle\langle x|_X \otimes |y\rangle\langle y|_Y \Big\}
&= \sum_{x,y} p_{X,Y}(x,y)\, |x\rangle\langle x|_X \operatorname{Tr}\{|y\rangle\langle y|_Y\} \\
&= \sum_x \Big[ \sum_y p_{X,Y}(x,y) \Big] |x\rangle\langle x|_X
= \sum_x p_X(x)\, |x\rangle\langle x|_X,
\end{aligned}
\]
where p_X(x) ≡ Σ_y p_{X,Y}(x, y).
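Partial trace is easy to implement directly from its index form. The sketch below assumes a bipartite matrix stored as a nested list with row index i·d_B + k (A index i, B index k); the function names and the example distribution are illustrative. It reproduces the marginalization example above and also shows that each half of an ebit is maximally mixed.

```python
import math

def outer(ket):
    """Density matrix |ket><ket| as a nested list."""
    return [[a * complex(b).conjugate() for b in ket] for a in ket]

def partial_trace_A(rho, dA, dB):
    """Tr_A: sum over the A index, leaving a dB x dB matrix on B."""
    return [[sum(rho[i * dB + k][i * dB + l] for i in range(dA))
             for l in range(dB)] for k in range(dB)]

def partial_trace_B(rho, dA, dB):
    """Tr_B: sum over the B index, leaving a dA x dA matrix on A."""
    return [[sum(rho[i * dB + k][j * dB + k] for k in range(dB))
             for j in range(dA)] for i in range(dA)]

# Classical joint distribution p(x, y) on the diagonal,
# ordered as (x, y) = (0,0), (0,1), (1,0), (1,1).
p_xy = [0.1, 0.2, 0.3, 0.4]
classical = [[p_xy[i] if i == j else 0 for j in range(4)] for i in range(4)]
rho_X = partial_trace_B(classical, 2, 2)  # tracing out Y gives the marginal p_X

# Ebit (|00> + |11>)/sqrt(2): each marginal is I/2.
s = 1 / math.sqrt(2)
ebit = outer([s, 0, 0, s])
rho_A = partial_trace_B(ebit, 2, 2)
rho_B = partial_trace_A(ebit, 2, 2)

print(rho_X)
print(rho_A)
print(rho_B)
```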
Purification of quantum noise...
Artistic rendering of the notion of purification

(Image courtesy of seaskylab at FreeDigitalPhotos.net)
Tool: Purification of quantum states
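A purification of a state ρ_S on system S is a pure quantum state
|ψ⟩⟨ψ|_RS on systems R and S, such that
ρ_S = Tr_R{|ψ⟩⟨ψ|_RS}.
Simple construction: take |ψ⟩_RS = Σ_x √p(x) |x⟩_R ⊗ |x⟩_S if ρ_S has
spectral decomposition Σ_x p(x)|x⟩⟨x|_S.
Two different states |ψ⟩⟨ψ|_RS and |φ⟩⟨φ|_RS purify ρ_S iff they are
related by a unitary U_R acting on the reference system. One direction
(applying U_R to a purification gives another purification) follows
from cyclicity of the partial trace on system R:
\[
\begin{aligned}
\operatorname{Tr}_R\{(U_R \otimes I_S)|\psi\rangle\langle\psi|_{RS}(U_R^\dagger \otimes I_S)\}
&= \operatorname{Tr}_R\{(U_R^\dagger U_R \otimes I_S)|\psi\rangle\langle\psi|_{RS}\} \\
&= \operatorname{Tr}_R\{|\psi\rangle\langle\psi|_{RS}\} \\
&= \rho_S.
\end{aligned}
\]
To prove the other direction, use Schmidt decomposition.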

Uses and interpretations of purification
The concept of purification is one of the most often used tools in
quantum information theory.
This concept does not exist in classical information theory and
represents a radical departure (i.e., in classical information theory it is
not possible to have a definite state of two systems such that the
reduced systems are individually indefinite).
Physical interpretation: Noise or mixedness in a quantum state is due
to entanglement with an inaccessible reference / environment system.
Cryptographic interpretation: In the setting of quantum cryptography,
we assume that an eavesdropper Eve has access to the full
purification of a state ρAB that Alice and Bob share. This means
physically that Eve has access to every other system in the universe
that Alice and Bob do not have access to!
Advantage: only need to characterize Alice and Bob’s state in order
to understand what Eve has.
Quantum channels

Classical channels
Classical channels model evolutions of classical systems.
What are the requirements that we make for classical channels?
1) They should be linear maps, which means they respect convexity.
2) They should take probability distributions to probability
distributions (i.e., they should output a legitimate state of a classical
system when a classical state is input).
These requirements imply that the evolution of a classical system is
specified by a conditional probability matrix N with entries p_{Y|X}(y|x),
so that the input-output relationship of a classical channel is given by
\[
p_Y = N p_X \iff p_Y(y) = \sum_x p_{Y|X}(y|x)\, p_X(x).
\]

Quantum channels
Quantum channels model evolutions of quantum systems.

We make similar requirements:
A quantum channel N is a linear map acting on the space of
(density) matrices:
N (pρ + (1 − p)σ) = pN (ρ) + (1 − p)N (σ),
where p ∈ [0, 1] and ρ, σ ∈ D(H).

We demand that a quantum channel should take quantum states to
quantum states.
This means that it should be trace (probability) preserving:
Tr{N (X )} = Tr{X }
for all X ∈ L(H) (linear operators, i.e., matrices).

Complete positivity
The other requirement is complete positivity.
We can always expand X_RS ∈ L(H_R ⊗ H_S) as
\[
X_{RS} = \sum_{i,j} |i\rangle\langle j|_R \otimes X_S^{i,j},
\]
and then define
\[
(\mathrm{id}_R \otimes \mathcal{N}_S)(X_{RS}) = \sum_{i,j} |i\rangle\langle j|_R \otimes \mathcal{N}_S(X_S^{i,j}),
\]
with the interpretation being that “nothing (identity channel)
happens on system R while the channel N acts on system S.”
A quantum channel should also be completely positive:
(id_R ⊗ N_S)(X_RS) ≥ 0,
where id_R denotes the identity channel acting on system R of
arbitrary size and X_RS ∈ L(H_R ⊗ H_S) is such that X_RS ≥ 0.
Quantum channels: completely positive, trace-preserving
A map N satisfying the requirements of linearity, trace preservation,
and complete positivity takes all density matrices to density matrices
and is called a quantum channel.
To check whether a given map is completely positive, it suffices to
check whether
(id_R ⊗ N_S)(|Φ⟩⟨Φ|_RS) ≥ 0,
where
\[
|\Phi\rangle_{RS} = \frac{1}{\sqrt{d}} \sum_i |i\rangle_R \otimes |i\rangle_S
\]
and d = dim(H_R) = dim(H_S).
Interpretation: the state resulting from a channel acting on one share
of a maximally entangled state completely characterizes the channel.

Choi-Kraus representation theorem
Structure theorem for quantum channels

Every quantum channel N can be written in the following form:
\[
\mathcal{N}(X) = \sum_i K_i X K_i^\dagger, \tag{1}
\]
where {K_i} is a set of Kraus operators, with the property that
\[
\sum_i K_i^\dagger K_i = I. \tag{2}
\]
The form given in (1) corresponds to complete positivity and the condition
in (2) to trace (probability) preservation. This decomposition is not
unique, but one can find a minimal decomposition by taking a spectral
decomposition of (id_R ⊗ N_S)(|Φ⟩⟨Φ|_RS).

Examples of quantum channels
Quantum bit-flip channel for p ∈ [0, 1]:
ρ → (1 − p)ρ + pX ρX .
Quantum depolarizing channel for p ∈ [0, 1]:
ρ → (1 − p)ρ + p Tr{ρ}π,
where π ≡ I /d (maximally mixed state).
Quantum erasure channel for p ∈ [0, 1]:
ρ → (1 − p)ρ + p Tr{ρ}|e⟩⟨e|,
where ⟨e|ρ|e⟩ = 0 for all inputs ρ (the erasure flag |e⟩ is orthogonal
to every input state).
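As an illustration of the Kraus form, the bit-flip channel above can be written with the two operators √(1 − p)·I and √p·X. This particular Kraus choice and the helper names below are assumptions made for the sketch (the slides give only the channel's action); the code checks the completeness condition and applies the channel to |0⟩⟨0|.

```python
import math

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def bit_flip_kraus(p):
    """Assumed Kraus operators for rho -> (1-p) rho + p X rho X."""
    return [scale(math.sqrt(1 - p), I2), scale(math.sqrt(p), X)]

def apply_channel(kraus_ops, rho):
    """Compute sum_i K_i rho K_i^dagger."""
    out = [[0] * len(rho) for _ in rho]
    for K in kraus_ops:
        out = add(out, matmul(matmul(K, rho), dagger(K)))
    return out

def completeness(kraus_ops):
    """Compute sum_i K_i^dagger K_i, which should equal the identity."""
    n = len(kraus_ops[0][0])
    out = [[0] * n for _ in range(n)]
    for K in kraus_ops:
        out = add(out, matmul(dagger(K), K))
    return out

p = 0.1                      # illustrative flip probability
kraus = bit_flip_kraus(p)
rho_in = [[1, 0], [0, 0]]    # |0><0|
rho_out = apply_channel(kraus, rho_in)

print(completeness(kraus))   # approximately the identity matrix
print(rho_out)               # approximately diag(1 - p, p)
```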

Unitary channels
If a channel has one Kraus operator (call it U), then it satisfies
U†U = I and is thus a unitary matrix. [1]
[1] It could also be part of a unitary matrix, in which case it is called an “isometry.”
Unitary channels are ideal, reversible channels.
Instruction sequences for quantum algorithms (to be run on quantum
computers) are composed of ideal, unitary channels.
So if a quantum channel has more than one Kraus operator (in a
minimal decomposition), then it is non-unitary and irreversible.
Preparation channels
Preparation channels take classical systems as input and produce
quantum systems as output.
A preparation channel P has the following form:
\[
\mathcal{P}(\rho) = \sum_x \langle x|\rho|x\rangle\, \sigma^x,
\]
where {|x⟩} is an orthonormal basis and {σ^x} is a set of states.
Inputting the classical state |x⟩⟨x| leads to quantum output σ^x, i.e.,
it is just the map
x → σ^x,
where x is a classical letter. Sometimes called “cq” channel, short for
“classical-to-quantum” channel.

Measurement channels
Measurement channels take quantum systems as input and produce
classical systems as output.
A measurement channel M has the following form:
\[
\mathcal{M}(\rho) = \sum_x \operatorname{Tr}\{M^x \rho\}\, |x\rangle\langle x|,
\]
where M^x ≥ 0 for all x and Σ_x M^x = I.
Can also interpret a measurement channel as returning the classical
value x with probability Tr{M^x ρ}.
[Figure: circuit depiction of a measurement channel]

“Measuring an operator”
Let G be a Hermitian operator with spectral decomposition
\[
G = \sum_x \mu_x \Pi_x,
\]
where μ_x are real eigenvalues and Π_x are projections onto
corresponding eigensubspaces.
We say that an experimenter “measures an operator G” by
performing the following measurement channel:
\[
\rho \to \sum_x \operatorname{Tr}\{\Pi_x \rho\}\, |x\rangle\langle x|,
\]
where {|x⟩} is an orthonormal basis.

Entanglement-breaking channels
An entanglement-breaking channel N is defined such that for every
input state ρ_RS, the output
(id_R ⊗ N_S)(ρ_RS)
is a separable state.
To determine whether a given channel is entanglement-breaking, it
suffices to check whether the following state is separable:
(id_R ⊗ N_S)(|Φ⟩⟨Φ|_RS).

Entanglement-breaking channels
Every entanglement-breaking (EB) channel N can be written as a
composition of a measurement M followed by a preparation P:
N = P ◦ M.
Thus, internally, every EB channel transforms a quantum system to a
classical one and then back: q → c → q. In this sense, such channels
are one step up from classical channels and inherit some properties of
classical channels.

Purifications of quantum channels
Recall that we can purify quantum states and understand noise as
arising due to entanglement with an inaccessible reference system.
We can also purify quantum channels and understand a noisy process
as arising from a unitary interaction with an inaccessible environment.
Stinespring’s theorem
For every quantum channel N_{A→B}, there exists a pure state |0⟩⟨0|_E and a
unitary matrix U_{AE→BE'}, acting on input systems A and E and producing
output systems B and E', such that
\[
\mathcal{N}_{A\to B}(\rho_A) = \operatorname{Tr}_{E'}\{ U_{AE\to BE'} (\rho_A \otimes |0\rangle\langle 0|_E) (U_{AE\to BE'})^\dagger \}.
\]

Construction of a unitary extension
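Standard construction of a unitary extension of a quantum channel:
Given Kraus operators {K_i} for N such that N(ρ) = Σ_i K_i ρ K_i†, take
\[
V = \sum_i K_i \otimes |i\rangle_{E'}\langle 0|_E.
\]
V†V = I_A ⊗ |0⟩⟨0|_E, i.e., the columns of V associated with |0⟩_E are
orthonormal, so we can fill in other columns such that matrix is unitary
(call the result U).
Then
\[
U(\rho_A \otimes |0\rangle\langle 0|_E)U^\dagger = \sum_{i,j} K_i \rho K_j^\dagger \otimes |i\rangle\langle j|_{E'},
\]
and
\[
\operatorname{Tr}_{E'}\{U(\rho_A \otimes |0\rangle\langle 0|_E)U^\dagger\}
= \operatorname{Tr}_{E'}\Big\{ \sum_{i,j} K_i \rho K_j^\dagger \otimes |i\rangle\langle j|_{E'} \Big\}
= \sum_i K_i \rho K_i^\dagger = \mathcal{N}(\rho).
\]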

Summary of quantum states and channels
Every quantum state is a positive semi-definite matrix with trace
equal to one.
Quantum states of multiple systems can be separable or entangled.
Quantum states can be purified (this notion does not exist in classical
information theory).
Quantum channels are completely positive, trace-preserving maps.
Preparation channels take classical systems to quantum systems, and
measurement channels take quantum systems to classical systems.
Quantum channels can also be purified (i.e., every quantum channel
can be realized by a unitary interaction with an environment, followed
by partial trace). This notion also does not exist in classical
information theory.

Fundamental protocols

Bell experiment / CHSH game
How is quantum information different from classical information?
One way to answer this question is to devise operational tasks for
which a quantum strategy outperforms a classical one.
The most famous is the Bell experiment / CHSH game. [2]
The game involves two spatially separated parties (the players Alice
and Bob) and a referee.
[2] A “loophole-free” implementation of this experiment was conducted in 2015 (see
arXiv:1508.05949).
Bell experiment / CHSH game
Game begins with referee randomly picking bits x and y .

Referee sends x and y to Alice and Bob, respectively.
Alice replies with a bit a and Bob with a bit b.
They win if and only if a ⊕ b = x ∧ y .
Classical strategies
The most general classical strategy allows for Alice and Bob to
possess shared randomness before the game begins.
However, can show that shared randomness does not help them win.
Thus, to compute the winning probability with classical strategies, it
suffices to consider deterministic classical strategies.

Deterministic classical strategies
General deterministic strategy: x → ax for Alice and y → by for Bob.

The following table presents the winning conditions for the four
different values of x and y using this deterministic strategy:

  x   y   x ∧ y       a_x ⊕ b_y
  0   0     0     =   a_0 ⊕ b_0
  0   1     0     =   a_0 ⊕ b_1
  1   0     0     =   a_1 ⊕ b_0
  1   1     1     =   a_1 ⊕ b_1

They cannot always win. (If they could, there would be a
contradiction, because adding up the 3rd column gives 1 while adding up
the 4th column gives 0, with sums taken modulo 2.)
The best they can do is to win only 3/4 = 0.75 of the time!
Strategy achieving this: Alice and Bob each always report back zero.
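The classical bound can be confirmed by brute force, since there are only 16 deterministic strategies (a_0, a_1, b_0, b_1). The following sketch assumes the referee picks x and y uniformly at random; the function name is illustrative.

```python
from itertools import product

def win_probability(a0, a1, b0, b1):
    """Fraction of the four (x, y) questions for which a_x XOR b_y == x AND y."""
    a = (a0, a1)
    b = (b0, b1)
    wins = sum((a[x] ^ b[y]) == (x & y) for x, y in product((0, 1), repeat=2))
    return wins / 4

# Enumerate every deterministic strategy and keep the best score.
best = max(win_probability(*s) for s in product((0, 1), repeat=4))

print(best)                         # best classical winning probability
print(win_probability(0, 0, 0, 0))  # "always answer zero" strategy
```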
Quantum strategy
Allow Alice and Bob to share two qubits in the state |Φ⟩⟨Φ|_AB before
the game starts.
If Alice receives x = 0, then she performs a measurement of Z. If she
receives x = 1, then she performs a measurement of X. In each case,
she reports the outcome as a.
If Bob receives y = 0, then he performs a measurement of
(X + Z)/√2. If he receives y = 1, then he performs a measurement
of (Z − X)/√2. In each case, he reports the outcome as b.
This quantum strategy has a winning probability of
cos²(π/8) ≈ 0.85 > 0.75 and thus represents a significant separation
between classical and quantum information theory.
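The quantum winning probability can be reproduced numerically. This sketch assumes that the reported bit is 0 for eigenvalue +1 and 1 for eigenvalue −1, so that for questions (x, y) the probability of a ⊕ b = x ∧ y is (1 + (−1)^{x∧y} ⟨Φ|A_x ⊗ B_y|Φ⟩)/2; the helper names are illustrative.

```python
import math
from itertools import product

s = 1 / math.sqrt(2)
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

# Alice measures Z for x = 0 and X for x = 1.
alice = [Z, X]
# Bob measures (X + Z)/sqrt(2) for y = 0 and (Z - X)/sqrt(2) for y = 1.
bob = [
    [[s * (X[i][j] + Z[i][j]) for j in range(2)] for i in range(2)],
    [[s * (Z[i][j] - X[i][j]) for j in range(2)] for i in range(2)],
]
phi = [s, 0, 0, s]  # ebit (|00> + |11>)/sqrt(2), real amplitudes

def kron(a, b):
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def expectation(op, vec):
    """<vec|op|vec> for a real vector."""
    n = len(vec)
    return sum(vec[i] * op[i][j] * vec[j] for i in range(n) for j in range(n))

win = 0.0
for x, y in product((0, 1), repeat=2):
    corr = expectation(kron(alice[x], bob[y]), phi)
    sign = -1 if (x & y) else 1
    win += 0.25 * (1 + sign * corr) / 2  # each question pair has probability 1/4

print(win, math.cos(math.pi / 8) ** 2)
```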

Loophole-free Bell test...
Picture of loophole-free Bell test at TU Delft

(Image taken from http://hansonlab.tudelft.nl/loophole-free-bell-test/)

Three fundamental protocols
The three important noiseless protocols in quantum information
theory are entanglement distribution, super-dense coding, and
quantum teleportation.
They are the building blocks for later core quantum communication
protocols, in which we replace a noiseless resource with a noisy one.

Communication resources
Resources
Let [c → c] denote a noiseless classical bit channel from Alice
(sender) to Bob (receiver), which performs the following mapping on
a qubit density matrix:
\[
\rho = \begin{bmatrix} \rho_{00} & \rho_{01} \\ \rho_{10} & \rho_{11} \end{bmatrix}
\to \frac{1}{2}\rho + \frac{1}{2} Z\rho Z
= \begin{bmatrix} \rho_{00} & 0 \\ 0 & \rho_{11} \end{bmatrix}.
\]
Let [q → q] denote a noiseless quantum bit channel from Alice to
Bob, which perfectly preserves a qubit density matrix.
Let [qq] denote a noiseless ebit shared between Alice and Bob, which
is a maximally entangled state |Φ⟩⟨Φ|_AB.
Entanglement distribution, super-dense coding, and teleportation are
non-trivial protocols for combining these resources.

Preparing a maximally entangled state of two qubits
How to prepare a maximally entangled state?
Alice begins by preparing two qubits in the tensor-product state:
|0⟩⟨0|_A ⊗ |0⟩⟨0|_A'.
Let
\[
H = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix},
\]
which is a unitary matrix. Alice performs the
unitary channel H(·)H† on her system A, leading to the global state
H_A |0⟩⟨0|_A H_A† ⊗ |0⟩⟨0|_A'.
Alice performs CNOT = |0⟩⟨0|_A ⊗ I_A' + |1⟩⟨1|_A ⊗ X_A'. This is a
unitary called controlled-NOT, because it flips the second bit if and
only if the first bit is one (these actions are done in superposition).
After doing this, the state on AA' becomes |Φ⟩⟨Φ|_AA'.

Entanglement distribution
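[Figure: entanglement distribution circuit. Alice applies a Hadamard H to |0⟩_A and a CNOT from A to A', then sends A' to Bob over the noiseless qubit channel id_{A'→B}.]
Alice performs local operations (the Hadamard and CNOT) and
consumes one use of a noiseless qubit channel to generate one
noiseless ebit |Φ⟩⟨Φ|_AB shared with Bob.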
Resource inequality: [q → q] ≥ [qq].

Bell states
Consider that, for a 2 × 2 matrix M_B,
\[
\langle\Phi|_{AB}\, I_A \otimes M_B\, |\Phi\rangle_{AB} = \frac{1}{2}\operatorname{Tr}\{M_B\}.
\]
I has trace 2 and Pauli matrices X, Y, and Z are traceless.
Multiplying any two of them gives another Pauli matrix (up to a phase).
These facts imply that the following set forms an orthonormal basis:
{|Φ⟩_AB, X_A|Φ⟩_AB, Z_A|Φ⟩_AB, Z_A X_A|Φ⟩_AB}.
So the following states are perfectly distinguishable:
{|Φ⟩⟨Φ|_AB, X_A|Φ⟩⟨Φ|_AB X_A, Z_A|Φ⟩⟨Φ|_AB Z_A, Z_A X_A|Φ⟩⟨Φ|_AB X_A Z_A}.

Bell measurement
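The measurement channel that distinguishes these states is called the
Bell measurement:
\[
\begin{aligned}
\rho_{AB} \to{} & \operatorname{Tr}\{|\Phi\rangle\langle\Phi|_{AB}\, \rho_{AB}\}\, |00\rangle\langle 00| \\
& + \operatorname{Tr}\{X_A|\Phi\rangle\langle\Phi|_{AB} X_A\, \rho_{AB}\}\, |01\rangle\langle 01| \\
& + \operatorname{Tr}\{Z_A|\Phi\rangle\langle\Phi|_{AB} Z_A\, \rho_{AB}\}\, |10\rangle\langle 10| \\
& + \operatorname{Tr}\{Z_A X_A|\Phi\rangle\langle\Phi|_{AB} X_A Z_A\, \rho_{AB}\}\, |11\rangle\langle 11|.
\end{aligned}
\]
This measurement can be implemented on a quantum computer by
performing controlled-NOT from A to B, Hadamard on A, and then
measuring A and B in the standard basis.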

Super-dense coding
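[Figure: super-dense coding circuit. Alice applies conditional operations X and Z, controlled on x1 x2, to her share of |Φ+⟩_AB and sends it over the qubit channel; Bob performs a Bell measurement to recover x1 x2.]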
Alice and Bob share an ebit. Alice would like to transmit two classical
bits x1 x2 to Bob. She performs a Pauli rotation conditioned on x1 x2
and sends her share of the ebit over a noiseless qubit channel. Bob
then performs a Bell measurement to get x1 x2 .
Resource inequality: [q → q] + [qq] ≥ 2[c → c].
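The whole protocol fits in a small state-vector simulation. The sketch below assumes the encoding Z^{x1} X^{x2} on Alice's qubit (matching the Bell-measurement labels 00, 01, 10, 11 given earlier) and implements the Bell measurement as CNOT from A to B followed by a Hadamard on A; function and variable names are illustrative.

```python
import math
from itertools import product

s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
H = [[s, s], [s, -s]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]  # control A, target B

def kron(a, b):
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def apply(m, v):
    """Matrix-vector product."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def superdense(x1, x2):
    """Send bits (x1, x2) using one ebit and one qubit channel use."""
    state = [1, 0, 0, 0]                     # |00>_AB
    state = apply(kron(H, I2), state)        # Hadamard on A
    state = apply(CNOT, state)               # now the ebit |Phi>_AB
    if x2:
        state = apply(kron(X, I2), state)    # Alice's conditional X
    if x1:
        state = apply(kron(Z, I2), state)    # Alice's conditional Z
    # Alice sends her qubit; Bob performs the Bell measurement.
    state = apply(CNOT, state)
    state = apply(kron(H, I2), state)
    probs = [abs(a) ** 2 for a in state]
    outcome = max(range(4), key=probs.__getitem__)
    return outcome >> 1, outcome & 1, probs[outcome]

for bits in product((0, 1), repeat=2):
    print(bits, superdense(*bits))
```

In each case the decoded pair equals the transmitted pair with probability (numerically) one, as the orthogonality of the four Bell states predicts.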

Algebraic trick for quantum teleportation
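Let |ψ⟩⟨ψ| be the state of a qubit where |ψ⟩ = α|0⟩ + β|1⟩.
By using the algebra of the tensor product, can show that
\[
\begin{aligned}
|\psi\rangle_{A'}|\Phi\rangle_{AB} \propto{} & |\Phi\rangle_{A'A}|\psi\rangle_B + X_A|\Phi\rangle_{A'A}\, X_B|\psi\rangle_B \\
& + Z_A|\Phi\rangle_{A'A}\, Z_B|\psi\rangle_B + Z_A X_A|\Phi\rangle_{A'A}\, Z_B X_B|\psi\rangle_B.
\end{aligned}
\]
Performing the Bell measurement channel on systems A'A leads to
the following state:
\[
\begin{aligned}
\frac{1}{4}\Big[ & |00\rangle\langle 00|_{A'A} \otimes |\psi\rangle\langle\psi|_B
+ |01\rangle\langle 01|_{A'A} \otimes X_B|\psi\rangle\langle\psi|_B X_B \\
& + |10\rangle\langle 10|_{A'A} \otimes Z_B|\psi\rangle\langle\psi|_B Z_B
+ |11\rangle\langle 11|_{A'A} \otimes Z_B X_B|\psi\rangle\langle\psi|_B X_B Z_B \Big].
\end{aligned}
\]
Alice then sends the two classical bits in A'A to Bob. Bob can then
undo the Pauli rotations and recover the state |ψ⟩⟨ψ|_B.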
Teleportation
[Figure: teleportation circuit. Alice performs a Bell measurement on |ψ⟩_A' and her share of |Φ+⟩_AB, sends the outcome over two classical channels, and Bob applies conditional operations X and Z to obtain |ψ⟩_B.]
Alice would like to transmit an arbitrary quantum state |ψ⟩⟨ψ|_A' to
Bob. Alice and Bob share an ebit before the protocol begins. Alice
can “teleport” her quantum state to Bob by consuming the
entanglement and two uses of a noiseless classical bit channel.
Resource inequality: 2[c → c] + [qq] ≥ [q → q].
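Teleportation can likewise be simulated on three qubits. This sketch assumes amplitude ordering 4·a' + 2·a + b for systems A', A, B, implements the Bell measurement as CNOT (A' controls A) followed by a Hadamard on A', and lets Bob apply Z when the first outcome bit is 1 and then X when the second is 1. The input state and all names are illustrative; success is checked through the fidelity |⟨ψ|output⟩|², which ignores global phases.

```python
import math
from itertools import product

s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
H = [[s, s], [s, -s]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

def kron(a, b):
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def teleport(psi):
    """Return {(m1, m2): (outcome probability, fidelity of Bob's corrected state)}."""
    ebit = [s, 0, 0, s]
    state = [x * y for x in psi for y in ebit]        # |psi>_A' ⊗ |Phi>_AB
    state = apply(kron(CNOT, I2), state)              # CNOT on A'A
    state = apply(kron(kron(H, I2), I2), state)       # Hadamard on A'
    results = {}
    for m1, m2 in product((0, 1), repeat=2):
        # Unnormalized state of B given measurement outcome (m1, m2) on A'A.
        branch = [state[4 * m1 + 2 * m2 + b] for b in (0, 1)]
        prob = sum(abs(c) ** 2 for c in branch)
        bob = [c / math.sqrt(prob) for c in branch]
        if m1:
            bob = apply(Z, bob)                        # undo Z
        if m2:
            bob = apply(X, bob)                        # undo X
        overlap = sum(complex(a).conjugate() * b for a, b in zip(psi, bob))
        results[(m1, m2)] = (prob, abs(overlap) ** 2)
    return results

psi = [0.6, 0.8j]   # illustrative qubit state with |alpha|^2 + |beta|^2 = 1
outcomes = teleport(psi)
for key, value in outcomes.items():
    print(key, value)
```

Each of the four outcomes occurs with probability 1/4 regardless of |ψ⟩, so the classical bits alone reveal nothing about the teleported state.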

Teleportation between Canary Islands...
Teleportation between two Canary Islands 143 km apart. Green lasers were
used only for stabilization—invisible infrared photons were teleported
(Image taken from http://www.ing.iac.es/PR/press/quantum.html)
Distance measures

Function of a diagonalizable matrix
If an n × n matrix D is diagonal with entries d_1, ..., d_n, then for a
function f, we define
\[
f(D) = \begin{bmatrix}
g(d_1) & 0 & \cdots & 0 \\
0 & g(d_2) & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & g(d_n)
\end{bmatrix},
\]
where g(x) = f(x) if x ≠ 0 and g(x) = 0 otherwise.
If a matrix A is diagonalizable as A = K D K⁻¹, then for a function f,
we define
f(A) = K f(D) K⁻¹.
Evaluating the function only on the support of the matrix allows for
functions such as f(x) = x⁻¹ and f(x) = log x.

Trace distance
Define the trace norm of a matrix X by ‖X‖₁ ≡ Tr{√(X†X)}.
Trace norm induces trace distance between two matrices X and Y:
‖X − Y‖₁.
For two density matrices ρ and σ, the following bounds hold:
0 ≤ ‖ρ − σ‖₁ ≤ 2.
LHS saturated iff ρ = σ and RHS iff ρ is orthogonal to σ.
For commuting ρ and σ, trace distance reduces to variational distance
between probability distributions along diagonals.
Has an operational meaning as the bias of the optimal success
probability in a hypothesis test to distinguish ρ from σ.
Does not increase under the action of a quantum channel:
‖ρ − σ‖₁ ≥ ‖N(ρ) − N(σ)‖₁.

Fidelity
Fidelity F(ρ, σ) between density matrices ρ and σ is
F(ρ, σ) ≡ ‖√ρ √σ‖₁².
For pure states |ψ⟩⟨ψ| and |φ⟩⟨φ|, reduces to squared overlap:
F(|ψ⟩⟨ψ|, |φ⟩⟨φ|) = |⟨ψ|φ⟩|².
For commuting ρ and σ, reduces to the squared Bhattacharyya coefficient
of probability distributions along diagonals.
For density matrices ρ and σ, the following bounds hold:
0 ≤ F(ρ, σ) ≤ 1.
LHS saturated iff ρ and σ are orthogonal and RHS iff ρ = σ.
Fidelity does not decrease under the action of a quantum channel N:
F(ρ, σ) ≤ F(N(ρ), N(σ)).

Uhlmann’s theorem
Uhlmann’s theorem states that
\[
F(\rho_S, \sigma_S) = \max_{U_R} \big|\langle\psi|_{RS}\, U_R \otimes I_S\, |\phi\rangle_{RS}\big|^2,
\]
where |ψ⟩_RS and |φ⟩_RS purify ρ_S and σ_S, respectively.
A core theorem used in quantum Shannon theory, and in other areas
such as quantum complexity theory and quantum error correction.
Since it involves purifications, this theorem has no analog in classical
information theory.

Relations between fidelity and trace distance
Trace distance is useful because it obeys the triangle inequality, and
fidelity is useful because we have Uhlmann’s theorem.
The following inequalities relate the two measures, which allows for
going back and forth between them:
\[
1 - \sqrt{F(\rho, \sigma)} \le \frac{1}{2}\|\rho - \sigma\|_1 \le \sqrt{1 - F(\rho, \sigma)}.
\]
A distance measure which has both properties (triangle inequality and
Uhlmann’s theorem) is √(1 − F(ρ, σ)).
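For commuting states the two measures reduce to classical quantities, which makes the inequalities easy to test. The sketch below restricts to diagonal density matrices, represented only by their diagonals p and q; that restriction, the function names, and the example distributions are assumptions made for illustration.

```python
import math

def trace_distance(p, q):
    """||rho - sigma||_1 for diagonal states: the variational distance sum |p - q|."""
    return sum(abs(a - b) for a, b in zip(p, q))

def fidelity(p, q):
    """F(rho, sigma) for diagonal states: (sum sqrt(p q))^2."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q)) ** 2

p = [0.5, 0.5]   # diagonal of rho
q = [0.9, 0.1]   # diagonal of sigma

F = fidelity(p, q)
half_T = 0.5 * trace_distance(p, q)
lower = 1 - math.sqrt(F)
upper = math.sqrt(1 - F)

print(lower, half_T, upper)  # expected ordering: lower <= half_T <= upper
```

The same functions return a trace distance of 0 and fidelity of 1 when p = q, and a trace distance of 2 and fidelity of 0 for distributions with disjoint support, matching the saturation conditions stated on the earlier slides.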

Information measures

Entropy and information...
Entropy and information can be discomforting...

Quantum relative entropy
One of the most fundamental information measures is the quantum
relative entropy, defined for a state ρ and a positive semi-definite
matrix σ as
D(ρ‖σ) ≡ Tr{ρ[log₂ ρ − log₂ σ]},
when supp(ρ) ⊆ supp(σ) and as +∞ otherwise.
It does not increase under the action of a quantum channel N:
D(ρ‖σ) ≥ D(N(ρ)‖N(σ)).
If Tr{ρ} ≥ Tr{σ}, then
D(ρ‖σ) ≥ 0,
with equality holding iff ρ = σ.
Quantum Pinsker inequality:
\[
D(\rho\|\sigma) \ge \frac{1}{2\ln 2}\, \|\rho - \sigma\|_1^2.
\]

Children of quantum relative entropy
Relative entropy as “parent” entropy

Many entropies can be written in terms of relative entropy:
H(A)_ρ ≡ −D(ρ_A ‖ I_A) = −Tr{ρ_A log₂ ρ_A} (entropy)
H(A|B)_ρ ≡ −D(ρ_AB ‖ I_A ⊗ ρ_B) (conditional entropy)
I(A; B)_ρ ≡ D(ρ_AB ‖ ρ_A ⊗ ρ_B) (mutual information)
I(A⟩B)_ρ ≡ D(ρ_AB ‖ I_A ⊗ ρ_B) (coherent information)
Equalities
H(A|B)_ρ = H(AB)_ρ − H(B)_ρ
I(A⟩B)_ρ = −H(A|B)_ρ
I(A; B)_ρ = H(A)_ρ + H(B)_ρ − H(AB)_ρ
I(A; B|C)_ρ ≡ H(AC)_ρ + H(BC)_ρ − H(ABC)_ρ − H(C)_ρ
I(A; B|C)_ρ = H(B|C)_ρ − H(B|AC)_ρ

Evaluating quantum entropy
How do we evaluate the formula for quantum entropy of a state $\rho_A$?
Consider spectral decomposition:
\[ \rho_A = \sum_x p_X(x)|x\rangle\langle x|_A. \]
Then, with $\eta(x) = -x\log_2(x)$,
\[ H(A)_\rho = \operatorname{Tr}\{\eta(\rho_A)\} = \operatorname{Tr}\Big\{\sum_x \eta(p_X(x))|x\rangle\langle x|_A\Big\} = \sum_x \eta(p_X(x))\operatorname{Tr}\{|x\rangle\langle x|_A\} = \sum_x \eta(p_X(x)) = H(p_X). \]
Quantum entropy of $\rho_A$ is equal to Shannon entropy of eigenvalues.

⇒ Entropy of a pure state is equal to zero.
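
For a single qubit this recipe can be sketched in a few lines. The example assumes the state is given by its Bloch vector $r$, in which case the eigenvalues are $(1 \pm |r|)/2$; the helper names are illustrative.

```python
import math

def shannon_entropy(probs):
    """H(p) in bits, using the convention 0 * log 0 = 0."""
    return sum(-p * math.log2(p) for p in probs if p > 0.0)

def qubit_entropy(bloch):
    """Quantum entropy of a qubit state from its Bloch vector (rx, ry, rz)."""
    r = min(1.0, math.sqrt(sum(c * c for c in bloch)))  # guard against rounding above 1
    return shannon_entropy([(1.0 + r) / 2.0, (1.0 - r) / 2.0])

pure_state_entropy = qubit_entropy((0.0, 0.0, 1.0))   # |0><0|, on the sphere
mixed_state_entropy = qubit_entropy((0.0, 0.0, 0.0))  # I/2, at the center
print(pure_state_entropy, mixed_state_entropy)
```

A pure state sits on the surface of the Bloch sphere and has zero entropy, while the maximally mixed state at the center has one bit.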

Bipartite pure-state entanglement
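Let $|\psi\rangle\langle\psi|_{AB}$ be a pure state.
By Schmidt decomposition theorem, we know that
\[ |\psi\rangle_{AB} = \sum_x \sqrt{p_X(x)}\,|x\rangle_A \otimes |x\rangle_B, \]
for prob. distribution $p_X$ and orthonormal bases $\{|x\rangle_A\}$ and $\{|x\rangle_B\}$.
⇒ Eigenvalues of marginal states $\operatorname{Tr}_B\{|\psi\rangle\langle\psi|_{AB}\}$ and $\operatorname{Tr}_A\{|\psi\rangle\langle\psi|_{AB}\}$ are equal.
Thus, $H(A)_\rho = H(B)_\rho$ if $\rho_{AB}$ is a pure state.
Exercise: For a tripartite pure state $|\phi\rangle\langle\phi|_{ABC}$,
\[ H(A|B)_\phi + H(A|C)_\phi = 0. \]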

Conditional quantum entropy can be negative
One of the most striking differences between classical and quantum

information theory: conditional quantum entropy can be negative.
Consider the conditional quantum entropy of the ebit $|\Phi\rangle\langle\Phi|_{AB}$.
The global state is pure, while the marginal $\operatorname{Tr}_A\{|\Phi\rangle\langle\Phi|_{AB}\}$ is maximally mixed.
This implies that $H(AB)_\Phi = 0$ and $H(B)_\Phi = 1$, and thus
\[ H(A|B)_\Phi = -1. \]
If a state σAB is separable, then one can show that H(A|B)σ ≥ 0. So

a negative conditional entropy implies that a state is entangled
(signature of entanglement).
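
The ebit calculation generalizes to any bipartite pure state: the joint entropy vanishes and the marginal entropy is the Shannon entropy of the squared Schmidt coefficients. A minimal sketch, assuming the state is specified by that probability list (the function names are hypothetical):

```python
import math

def shannon_entropy(probs):
    """H(p) in bits, using the convention 0 * log 0 = 0."""
    return sum(-p * math.log2(p) for p in probs if p > 0.0)

def conditional_entropy_pure(schmidt_probs):
    """H(A|B) = H(AB) - H(B) for a pure state with the given Schmidt probabilities."""
    h_ab = 0.0                             # a pure global state has zero entropy
    h_b = shannon_entropy(schmidt_probs)   # the marginal on B has these eigenvalues
    return h_ab - h_b

ebit = conditional_entropy_pure([0.5, 0.5])   # maximally entangled qubit pair
product = conditional_entropy_pure([1.0])     # product state, no entanglement
print(ebit, product)
```

Any pure state with more than one non-zero Schmidt coefficient gives a strictly negative value.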

Strong subadditivity
Strong subadditivity
Let $\rho_{ABC}$ be a tripartite quantum state. Then
\[ I(A;B|C)_\rho \ge 0. \]
Equivalent statements (by definition)

Entropy sum of two individual systems is larger than entropy sum of their union and intersection:
\[ H(AC)_\rho + H(BC)_\rho \ge H(ABC)_\rho + H(C)_\rho. \]
Conditional entropy does not decrease under the loss of system A:
\[ H(B|C)_\rho \ge H(B|AC)_\rho. \]

Monogamy of entanglement
By employing strong subadditivity and the Schmidt decomposition,

we see that
H(A|B)ρ + H(A|C )ρ ≥ 0.
This is a nontrivial statement for quantum states, given that H(A|B)ρ

can be negative.
Thus, if H(A|B)ρ < 0, implying that Alice is entangled with Bob,

then it must be the case that H(A|C )ρ is large enough such that the
sum is non-negative.
Often called “monogamy of entanglement,” because it says that Alice

cannot be strongly entangled with both Bob and Charlie.
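
One way to fill in the steps behind this inequality is sketched below. It assumes a purification $|\psi\rangle_{ABCD}$ of $\rho_{ABC}$, where the purifying system $D$ is introduced here for the argument and does not appear on the slide. The second line uses the fact that complementary parts of a pure state have equal entropy (Schmidt decomposition), and the last step is strong subadditivity.

```latex
% |\psi\rangle_{ABCD} purifies \rho_{ABC}; D is a hypothetical purifying system.
\begin{align}
H(A|B)_\psi &= H(AB)_\psi - H(B)_\psi \\
            &= H(CD)_\psi - H(ACD)_\psi = -H(A|CD)_\psi, \\
H(A|B)_\psi + H(A|C)_\psi &= H(A|C)_\psi - H(A|CD)_\psi
                           = I(A;D|C)_\psi \ge 0.
\end{align}
```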


Quantum information source
We model a quantum information source as an ensemble of pure states: $\{p_X(x), |\phi_x\rangle\langle\phi_x|\}$.
The source has expected density matrix
\[ \rho = \sum_x p_X(x)|\phi_x\rangle\langle\phi_x|. \qquad (3) \]
Every density matrix has a spectral decomposition:
\[ \rho = \sum_z p_Z(z)|z\rangle\langle z|, \]
where $p_Z$ is a probability distribution and $\{|z\rangle\}$ is an orthonormal basis. This decomposition in general is different from the one in (3).
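
To see how the two decompositions can differ, consider a source that emits $|0\rangle$ or $|+\rangle$ with equal probability. This particular ensemble is an assumption chosen for illustration. The sketch restricts to real qubit amplitudes so that a closed-form 2x2 eigenvalue formula suffices.

```python
import math

def shannon_entropy(probs):
    """H(p) in bits, using the convention 0 * log 0 = 0."""
    return sum(-p * math.log2(p) for p in probs if p > 0.0)

def density_matrix(ensemble):
    """ensemble: list of (probability, (a, b)) with real qubit amplitudes a, b."""
    rho = [[0.0, 0.0], [0.0, 0.0]]
    for prob, psi in ensemble:
        for i in range(2):
            for j in range(2):
                rho[i][j] += prob * psi[i] * psi[j]
    return rho

def eigenvalues_2x2_symmetric(m):
    """Eigenvalues of a real symmetric 2x2 matrix."""
    mean = (m[0][0] + m[1][1]) / 2.0
    radius = math.hypot((m[0][0] - m[1][1]) / 2.0, m[0][1])
    return [mean + radius, mean - radius]

s = 1.0 / math.sqrt(2.0)
source = [(0.5, (1.0, 0.0)), (0.5, (s, s))]   # |0> and |+>, each with probability 1/2
eigs = eigenvalues_2x2_symmetric(density_matrix(source))
entropy = shannon_entropy(eigs)
print(eigs, entropy)
```

The ensemble probabilities are (1/2, 1/2), while the eigenvalues $p_Z$ are roughly 0.854 and 0.146, so the quantum entropy of $\rho$ (about 0.60 bits) is smaller than the one bit of Shannon entropy in $p_X$.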

Quantum data compression protocols
Inspired by Shannon, we consider independent calls of the quantum

information source and allow for compression schemes that have
slight error which vanishes in the limit of many calls of the source.
An $(n, R, \varepsilon)$ quantum data compression scheme consists of an encoding channel $\mathcal{E}^n$, with output system $W$, and a decoding channel $\mathcal{D}^n$ such that
\[ \frac{1}{n}\log_2 \dim(\mathcal{H}_W) \le R, \]
and
\[ \sum_{x^n} p_{X^n}(x^n)\, F(|\phi_{x^n}\rangle\langle\phi_{x^n}|, (\mathcal{D}^n \circ \mathcal{E}^n)[|\phi_{x^n}\rangle\langle\phi_{x^n}|]) \ge 1 - \varepsilon. \]
A rate $R$ is achievable if for all $\varepsilon \in (0,1)$ and sufficiently large $n$, there exists an $(n, R, \varepsilon)$ quantum compression scheme.
Quantum data compression limit = infimum of achievable rates.

Quantum data compression theorem
The quantum data compression limit of a source $\{p_X(x), |\phi_x\rangle\langle\phi_x|\}$ is equal to the quantum entropy of $\rho = \sum_x p_X(x)|\phi_x\rangle\langle\phi_x|$.
Focus on achievability part. To prove it, we use the notion of

quantum typicality.

Quantum typicality
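Given a density matrix $\rho$ with spectral decomposition $\sum_z p_Z(z)|z\rangle\langle z|$, define its $(n,\delta)$-typical subspace by
\[ T^{\rho}_{n,\delta} \equiv \operatorname{span}\left\{|z^n\rangle : \left|-\frac{1}{n}\log_2 p_{Z^n}(z^n) - H(\rho)\right| \le \delta\right\}, \]
where $p_{Z^n}(z^n) \equiv p_Z(z_1)\cdots p_Z(z_n)$ and $|z^n\rangle \equiv |z_1\rangle \otimes \cdots \otimes |z_n\rangle$.
Let $\Pi^{\rho}_{n,\delta}$ denote the projection onto $T^{\rho}_{n,\delta}$.
Then,
\[ \operatorname{Tr}\{\Pi^{\rho}_{n,\delta}\,\rho^{\otimes n}\} \ge 1 - \varepsilon, \]
\[ (1-\varepsilon)\,2^{n[H(\rho)-\delta]} \le \operatorname{Tr}\{\Pi^{\rho}_{n,\delta}\} \le 2^{n[H(\rho)+\delta]}, \]
\[ 2^{-n[H(\rho)+\delta]}\,\Pi^{\rho}_{n,\delta} \le \Pi^{\rho}_{n,\delta}\,\rho^{\otimes n}\,\Pi^{\rho}_{n,\delta} \le 2^{-n[H(\rho)-\delta]}\,\Pi^{\rho}_{n,\delta}. \]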
Inequalities with ε are true for all ε ∈ (0, 1) and sufficiently large n.
Main idea for quantum data compression: measure typical subspace.

Successful with probability 1 − ε.
If successful, perform a unitary that rotates typical subspace to space of dimension $\le 2^{n[H(\rho)+\delta]}$ (represented with $n[H(\rho)+\delta]$ qubits).
Send qubits to Bob, who then undoes the compression unitary.
Scheme is guaranteed to meet the fidelity criterion.
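
The typical-subspace properties can be checked numerically for a qubit source. The sketch below assumes $\rho$ has eigenvalues $(1-p, p)$ with $0 < p < 1$, so that a basis vector $|z^n\rangle$ is characterized by how many of its $n$ positions carry the eigenvalue-$p$ eigenvector; the function names and the parameter values are illustrative.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a qubit state with eigenvalues (1 - p, p)."""
    return sum(-q * math.log2(q) for q in (p, 1.0 - p) if q > 0.0)

def typical_subspace_stats(p, n, delta):
    """Return (dimension, weight) of the (n, delta)-typical subspace, 0 < p < 1.

    dimension = Tr{Pi} and weight = Tr{Pi rho^{tensor n}}.
    """
    h = binary_entropy(p)
    dim, weight = 0, 0.0
    for k in range(n + 1):  # k positions hold the eigenvalue-p eigenvector
        log2_prob = (n - k) * math.log2(1.0 - p) + k * math.log2(p)
        if abs(-log2_prob / n - h) <= delta:
            count = math.comb(n, k)  # number of basis vectors of this type
            dim += count
            weight += 2.0 ** (math.log2(count) + log2_prob)
    return dim, weight

p, n, delta = 0.1, 1000, 0.1
dim, weight = typical_subspace_stats(p, n, delta)
qubits_needed = math.log2(dim)
print(weight, qubits_needed, n * (binary_entropy(p) + delta))
```

With these values the typical subspace captures almost all of the weight of $\rho^{\otimes n}$ while needing well under 1000 qubits, in line with the compression scheme described above.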
Classical communication

Classical communication code
Suppose that Alice and Bob are connected by a quantum channel $\mathcal{N}_{A\to B}$ and that they are allowed to use it $n$ times. The resulting channel is $\mathcal{N}^{\otimes n}_{A\to B}$, with Kraus operators that are tensor products of the individual Kraus operators.
An $(n, R, \varepsilon)$ classical comm. code consists of an encoding channel $\mathcal{E}_{M'\to A^n}$ and a decoding measurement channel $\mathcal{D}_{B^n\to\hat{M}}$ such that:
\[ F(\Phi_{M\hat{M}}, (\mathcal{D}_{B^n\to\hat{M}} \circ \mathcal{N}^{\otimes n}_{A\to B} \circ \mathcal{E}_{M'\to A^n})(\Phi_{MM'})) \ge 1 - \varepsilon, \]
where
\[ \Phi_{M\hat{M}} \equiv \frac{1}{\dim(\mathcal{H}_M)} \sum_m |m\rangle\langle m|_M \otimes |m\rangle\langle m|_{\hat{M}}, \]
and $\frac{1}{n}\log_2(\dim(\mathcal{H}_M)) \ge R$.
Note that $\Phi_{M\hat{M}}$ represents a classical state, and the goal is for the coding scheme to preserve the classical correlations in this state.
Schematic of a classical communication code

Classical capacity
A rate R for classical communication is achievable if for all ε ∈ (0, 1)

and sufficiently large n, there exists an (n, R, ε) classical
communication code.
The classical capacity C (N ) of a quantum channel N is equal to the

supremum of all achievable rates.

What is known about classical capacity
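Lower bound on classical capacity:
\[ \chi(\mathcal{N}) \le C(\mathcal{N}), \]
where
\[ \chi(\mathcal{N}) = \max_{\{p_X(x),\,\rho_A^x\}} I(X;B)_\omega, \qquad \omega_{XB} \equiv \sum_x p_X(x)|x\rangle\langle x|_X \otimes \mathcal{N}(\rho_A^x). \]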
For some special channels, we know that χ(N ) = C (N ).

But it is also known that there exists a channel for which
χ(N ) < C (N ).
This superadditivity phenomenon is due to quantum entanglement.
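
For a fixed input ensemble, the quantity being maximized is $I(X;B)_\omega = H(B) - H(B|X)$, which for qubit outputs depends only on Bloch-vector lengths. The sketch below assumes a toy channel that shrinks every Bloch vector by a factor `shrink` (a depolarizing-type channel chosen for illustration). It evaluates one ensemble only, so it yields a lower bound on $\chi(\mathcal{N})$ rather than the maximum.

```python
import math

def shannon_entropy(probs):
    """H(p) in bits, using the convention 0 * log 0 = 0."""
    return sum(-p * math.log2(p) for p in probs if p > 0.0)

def qubit_entropy(bloch):
    """Entropy of a qubit state with Bloch vector `bloch`."""
    r = min(1.0, math.sqrt(sum(c * c for c in bloch)))
    return shannon_entropy([(1.0 + r) / 2.0, (1.0 - r) / 2.0])

def holevo_information(ensemble, shrink=1.0):
    """I(X;B) for an ensemble [(prob, bloch_vector)] sent through the toy channel."""
    outputs = [(p, tuple(shrink * c for c in v)) for p, v in ensemble]
    average = tuple(sum(p * v[i] for p, v in outputs) for i in range(3))
    h_b = qubit_entropy(average)                              # H(B)
    h_b_given_x = sum(p * qubit_entropy(v) for p, v in outputs)  # H(B|X)
    return h_b - h_b_given_x

ensemble = [(0.5, (0.0, 0.0, 1.0)), (0.5, (0.0, 0.0, -1.0))]  # |0> and |1>
noiseless = holevo_information(ensemble, shrink=1.0)
noisy = holevo_information(ensemble, shrink=0.5)
print(noiseless, noisy)
```

With no noise the ensemble carries one bit per channel use, and the value drops as the Bloch vectors shrink.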

Achievability part: Random coding
Borrow the idea of random coding from Shannon, but then we need
to figure out a decoding channel.
Consider an ensemble $\{p_X(x), \rho_A^x\}$ that Alice can pick at the channel input. This leads to the output ensemble
\[ \{p_X(x), \sigma_B^x \equiv \mathcal{N}_{A\to B}(\rho_A^x)\}. \]
So pick classical codewords randomly according to $p_X(x)$. This leads to a codebook $\{x^n(m) \equiv x_1(m)\cdots x_n(m)\}_{m\in[\dim(\mathcal{H}_M)]}$.
The channel output after sending the $m$th message is
\[ \sigma^{x^n(m)}_{B^n} \equiv \sigma^{x_1(m)}_{B_1} \otimes \cdots \otimes \sigma^{x_n(m)}_{B_n}. \]
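
The classical part of this construction, drawing each codeword symbol independently from $p_X$, can be sketched as follows. The function name, the fixed seed, and the toy alphabet are assumptions for the example.

```python
import random

def random_codebook(symbols, probs, n, num_messages, seed=0):
    """Draw num_messages codewords of length n, each symbol i.i.d. from p_X."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [tuple(rng.choices(symbols, weights=probs, k=n))
            for _ in range(num_messages)]

codebook = random_codebook(symbols=["a", "b"], probs=[0.5, 0.5],
                           n=8, num_messages=4)
for m, codeword in enumerate(codebook, start=1):
    print(m, "".join(codeword))
```

Each classical codeword $x^n(m)$ then labels the tensor-product state that Alice feeds into the $n$ channel uses.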

Achievability part: Sequential decoding
To every channel output $\sigma^{x^n(m)}_{B^n}$, there exists a conditionally typical projector $\Pi_m$, with properties similar to those of the typical projector.
A sequential decoding strategy consists of performing a sequence of binary tests using conditionally typical projectors, asking “Is it the first message? Is it the second message? etc.” until there is a “hit.”
When sending the $m$th message, the success probability in decoding it using this strategy is
\[ \operatorname{Tr}\{\Pi_m \hat{\Pi}_{m-1}\cdots\hat{\Pi}_1\, \sigma^{x^n(m)}_{B^n}\, \hat{\Pi}_1\cdots\hat{\Pi}_{m-1}\Pi_m\}, \]
where $\hat{\Pi}_i \equiv I - \Pi_i$.
This implies that the error probability is
\[ 1 - \operatorname{Tr}\{\Pi_m \hat{\Pi}_{m-1}\cdots\hat{\Pi}_1\, \sigma^{x^n(m)}_{B^n}\, \hat{\Pi}_1\cdots\hat{\Pi}_{m-1}\Pi_m\}. \]

Error Analysis
The expected channel output with respect to the code distribution is $\sigma_B = \sum_x p_X(x)\sigma_B^x$, which has a typical projection $\Pi_\sigma$.
The error probability will ultimately change just slightly by incorporating this projection into the analysis:
\[ \operatorname{Tr}\{\Pi_\sigma \sigma^{x^n(m)}_{B^n} \Pi_\sigma\} - \operatorname{Tr}\{\Pi_m \hat{\Pi}_{m-1}\cdots\hat{\Pi}_1 \Pi_\sigma \sigma^{x^n(m)}_{B^n} \Pi_\sigma \hat{\Pi}_1\cdots\hat{\Pi}_{m-1}\Pi_m\}. \]
Using a quantum version of the union bound, this can be bounded from above by
\[ 2\sqrt{\operatorname{Tr}\{(I - \Pi_m)\Pi_\sigma \sigma^{x^n(m)}_{B^n} \Pi_\sigma\} + \sum_{i=1}^{m-1}\operatorname{Tr}\{\Pi_i \Pi_\sigma \sigma^{x^n(m)}_{B^n} \Pi_\sigma\}}. \]
The two terms above are exactly analogous to similar error terms that arise in the analysis of Shannon’s channel coding theorem.
By taking an expectation with respect to the code distribution, we can then analyze this error.
Error to bound:
\[ 2\sqrt{\mathbb{E}_{\mathcal{C}}\{\operatorname{Tr}\{(I - \Pi_m)\Pi_\sigma \sigma^{X^n(m)}_{B^n} \Pi_\sigma\}\} + \sum_{i=1}^{m-1}\mathbb{E}_{\mathcal{C}}\{\operatorname{Tr}\{\Pi_i \Pi_\sigma \sigma^{X^n(m)}_{B^n} \Pi_\sigma\}\}}. \]
The first term can be made small using properties of typicality.

The second term can be made small by choosing the code rate to be
smaller than the mutual information I (X ; B) = H(B) − H(B|X ).
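Consider that, because the codewords for $i \neq m$ are chosen independently,
\begin{align}
\mathbb{E}_{\mathcal{C}}\{\operatorname{Tr}\{\Pi_i \Pi_\sigma \sigma^{X^n(m)}_{B^n} \Pi_\sigma\}\} &= \operatorname{Tr}\{\mathbb{E}_{X^n(i)}\{\Pi_i\}\Pi_\sigma\, \mathbb{E}_{X^n(m)}\{\sigma^{X^n(m)}_{B^n}\}\Pi_\sigma\} \\
&= \operatorname{Tr}\{\mathbb{E}_{X^n(i)}\{\Pi_i\}\Pi_\sigma \sigma^{\otimes n} \Pi_\sigma\} \\
&\le 2^{-n[H(B)-\delta]}\operatorname{Tr}\{\mathbb{E}_{X^n(i)}\{\Pi_i\}\Pi_\sigma\} \\
&\le 2^{-n[H(B)-\delta]}\,\mathbb{E}_{X^n(i)}\{\operatorname{Tr}\{\Pi_i\}\} \\
&\le 2^{-n[H(B)-\delta]}\, 2^{n[H(B|X)+\delta]} \\
&= 2^{-n[I(X;B)-2\delta]}.
\end{align}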

Conclusion of achievability part
As long as we pick $\dim(\mathcal{H}_M) = 2^{n[I(X;B)-3\delta]}$, then there exists a code with small error probability, which we can make approach zero by picking $n$ larger and larger.
We can then expurgate the code if we wish to go from average to

maximal error probability (throw away the worse half of the
codewords, as in the classical case).
So the Holevo information I (X ; B) is an achievable rate.
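
The role of the rate choice can be seen with a little arithmetic: summing the per-codeword bound $2^{-n[I(X;B)-2\delta]}$ over at most $2^{n[I(X;B)-3\delta]}$ competing messages leaves $2^{-n\delta}$, which vanishes as $n$ grows. The sketch below works with base-2 logarithms to avoid overflow; the function name and the numerical values are illustrative only.

```python
def log2_union_term_bound(n, mutual_info, delta):
    """log2 of (number of messages) * (per-pair error bound)."""
    log2_num_messages = n * (mutual_info - 3.0 * delta)   # log2 dim(H_M)
    log2_pair_bound = -n * (mutual_info - 2.0 * delta)    # log2 of 2^{-n[I - 2 delta]}
    return log2_num_messages + log2_pair_bound            # equals -n * delta

for n in (100, 1000, 10000):
    print(n, log2_union_term_bound(n, mutual_info=0.5, delta=0.01))
```

The exponent is $-n\delta$ regardless of the value of $I(X;B)$, which is why any rate strictly below the mutual information works.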

Converse theorem
The converse part of the theorem establishes the regularized Holevo information as an upper bound on classical capacity:
\[ C(\mathcal{N}) \le \lim_{n\to\infty} \frac{1}{n}\chi(\mathcal{N}^{\otimes n}). \]
For some channels, such as entanglement-breaking channels, the following collapse happens for all $n$:
\[ \frac{1}{n}\chi(\mathcal{N}^{\otimes n}) = \chi(\mathcal{N}). \]
But we know it does not happen in general. That is, it is known that there exists a channel for which
\[ \chi(\mathcal{N}) < \lim_{n\to\infty} \frac{1}{n}\chi(\mathcal{N}^{\otimes n}). \]
So there still remains quite a bit to understand about classical

capacity.
Entanglement-assisted comm.

Entanglement-assisted classical communication code
Now allow for Alice and Bob to share entanglement before
communication begins. From super-dense coding, we know that
entanglement can double the classical capacity of a noiseless qubit
channel. What about in general?
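An $(n, R, \varepsilon)$ entanglement-assisted classical comm. code consists of an encoding channel $\mathcal{E}_{M'T_A\to A^n}$, a decoding measurement channel $\mathcal{D}_{B^nT_B\to\hat{M}}$, and an entangled state $\Psi_{T_AT_B}$ such that:
\[ F(\Phi_{M\hat{M}}, (\mathcal{D}_{B^nT_B\to\hat{M}} \circ \mathcal{N}^{\otimes n}_{A\to B} \circ \mathcal{E}_{M'T_A\to A^n})(\Phi_{MM'} \otimes \Psi_{T_AT_B})) \ge 1 - \varepsilon, \]
where
\[ \Phi_{M\hat{M}} \equiv \frac{1}{\dim(\mathcal{H}_M)} \sum_m |m\rangle\langle m|_M \otimes |m\rangle\langle m|_{\hat{M}}, \]
and $\frac{1}{n}\log_2(\dim(\mathcal{H}_M)) \ge R$.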

The goal again is for the coding scheme to preserve the classical
correlations in the state ΦM M̂ .
Schematic of an EA classical communication code

Entanglement-assisted classical capacity
A rate R for entanglement-assisted (EA) classical communication is

achievable if for all ε ∈ (0, 1) and sufficiently large n, there exists an
(n, R, ε) EA classical communication code.
The EA classical capacity CEA (N ) of a quantum channel N is equal

to the supremum of all achievable rates.

What is known about entanglement-assisted capacity
Entanglement-assisted capacity theorem:
\[ C_{EA}(\mathcal{N}) = I(\mathcal{N}), \]
where
\[ I(\mathcal{N}) = \max_{\phi_{RA}} I(R;B)_\omega, \qquad \omega_{RB} \equiv \mathcal{N}_{A\to B}(\phi_{RA}). \]
Thus, this problem is completely solved!

CEA (N ) does not change if there is a quantum feedback channel from
Bob to Alice. We even know strong converse theorems for this setting
as well. In these senses, the entanglement-assisted capacity represents
the fully quantum analog of Shannon’s channel capacity theorem.

Entanglement-assisted coding (simple version)
Allow Alice and Bob to share a maximally entangled state $|\Phi\rangle\langle\Phi|_{AB}$.
They then induce the following ensemble by Alice applying a randomly selected, generalized Pauli operator to her input:
\[ \{d^{-2}, (\mathcal{N}_{A\to B'} \otimes \mathrm{id}_B)(|\Phi^{x,z}\rangle\langle\Phi^{x,z}|_{AB})\}, \]
where $|\Phi^{x,z}\rangle_{AB} = X(x)_A Z(z)_A |\Phi\rangle_{AB}$. (This is the same ensemble from super-dense coding if $\mathcal{N}$ is the identity channel.)
By previous achievability result and some entropy manipulations, we can conclude that the mutual information $I(B';B)_{\mathcal{N}(\Phi)}$ is achievable.
More general argument establishes that $I(B';B)_{\mathcal{N}(\phi)}$ is achievable, where $\phi_{AB}$ is a pure bipartite state. So then $C_{EA}(\mathcal{N}) \ge I(\mathcal{N})$.
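
As a concrete instance of the rate $I(B';B)_{\mathcal{N}(\Phi)}$, the sketch below assumes a qubit depolarizing channel $\rho \mapsto (1-p)\rho + p\,I/2$, which is an example channel not taken from the slides. Acting on half of an ebit it yields $(1-p)\Phi + p\,I/4$, with eigenvalues $1 - 3p/4$ and $p/4$ (three times), while both marginals stay maximally mixed.

```python
import math

def shannon_entropy(probs):
    """H(p) in bits, using the convention 0 * log 0 = 0."""
    return sum(-q * math.log2(q) for q in probs if q > 0.0)

def mutual_information_depolarized_ebit(p):
    """I(B';B) for an ebit with one half sent through a depolarizing channel."""
    joint_eigenvalues = [1.0 - 3.0 * p / 4.0, p / 4.0, p / 4.0, p / 4.0]
    h_marginal = 1.0  # each qubit marginal is maximally mixed
    return 2.0 * h_marginal - shannon_entropy(joint_eigenvalues)

for p in (0.0, 0.25, 0.5, 1.0):
    print(p, mutual_information_depolarized_ebit(p))
```

At $p = 0$ the value is 2 bits, matching super-dense coding over a noiseless qubit channel, and at $p = 1$ it is zero.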

Entanglement-assisted converse theorem
Employ data processing and the chain rule for conditional mutual
information to conclude that
CEA (N ) ≤ I (N ).
Can even establish this bound when there is a quantum feedback

channel of unlimited dimension connecting Bob to Alice, a setup like

Quantum communication

Quantum communication code
Now Alice would like to transmit quantum information intact to or

generate entanglement with Bob, perhaps for some distributed
quantum computation.
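An $(n, R, \varepsilon)$ quantum communication code consists of an encoding channel $\mathcal{E}_{M'\to A^n}$ and a decoding channel $\mathcal{D}_{B^n\to\hat{M}}$ such that:
\[ F(\Phi_{M\hat{M}}, (\mathcal{D}_{B^n\to\hat{M}} \circ \mathcal{N}^{\otimes n}_{A\to B} \circ \mathcal{E}_{M'\to A^n})(\Phi_{MM'})) \ge 1 - \varepsilon, \]
where $\Phi_{M\hat{M}}$ is the maximally entangled state:
\[ \Phi_{M\hat{M}} \equiv \frac{1}{\dim(\mathcal{H}_M)} \sum_{m,m'} |m\rangle\langle m'|_M \otimes |m\rangle\langle m'|_{\hat{M}}, \]
and $\frac{1}{n}\log_2(\dim(\mathcal{H}_M)) \ge R$.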

The goal now is for the coding scheme to preserve the quantum
correlations in the state ΦM M̂ .
Quantum capacity
A rate R for quantum communication is achievable if for all ε ∈ (0, 1)

and sufficiently large n, there exists an (n, R, ε) quantum
communication code.
The quantum capacity Q(N ) of a quantum channel N is equal to the

supremum of all achievable rates.

What is known about quantum capacity
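Coherent information lower bound on quantum capacity:
\[ I_c(\mathcal{N}) \le Q(\mathcal{N}), \]
where
\[ I_c(\mathcal{N}) = \max_{\phi_{RA}} I(R\rangle B)_\omega, \qquad \omega_{RB} \equiv \mathcal{N}_{A\to B}(\phi_{RA}). \]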

If a quantum channel is degradable (meaning that the receiver can
simulate the channel from the input to the environment), then
Ic (N ) = Q(N ).
A number of interesting quantum channels have this property.
Quantum capacity is not known for most non-degradable channels. It also exhibits a striking effect called superactivation: there exist channels that each have zero quantum capacity but that can be combined to have a non-zero quantum capacity. (This does not occur for the basic setups in classical information theory.)
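
For a feel of the coherent information, the sketch below assumes a qubit erasure channel with erasure probability $p$ (an example channel not discussed on the slide) and a maximally entangled input. The output is $(1-p)\Phi_{RB} + p\,(I_R/2) \otimes |e\rangle\langle e|_B$, whose eigenvalues can be written down directly because the two terms have orthogonal supports.

```python
import math

def shannon_entropy(probs):
    """H(p) in bits, using the convention 0 * log 0 = 0."""
    return sum(-q * math.log2(q) for q in probs if q > 0.0)

def coherent_information_erasure(p):
    """I(R>B) = H(B) - H(RB) for an ebit through a qubit erasure channel."""
    h_b = shannon_entropy([(1.0 - p) / 2.0, (1.0 - p) / 2.0, p])  # Bob's marginal
    h_rb = shannon_entropy([1.0 - p, p / 2.0, p / 2.0])           # joint state
    return h_b - h_rb

for p in (0.0, 0.25, 0.5):
    print(p, coherent_information_erasure(p))
```

The value works out to $1 - 2p$, which reaches zero at $p = 1/2$ and becomes negative beyond that.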
Achieving the coherent information
There are now many coding methods known for achieving the
coherent information rate.
Perhaps the most prominent is known as the decoupling method.
Suppose that Alice, Bob, and Eve share a tripartite pure entangled state $|\psi\rangle\langle\psi|_{RBE}$ after Alice transmits her share of the entanglement with the reference through a noisy channel.
Then if the reduced state $\psi_{RE}$ on the reference system and Eve’s system is approximately decoupled, meaning that
\[ \|\psi_{RE} - \psi_R \otimes \sigma_E\|_1 \le \varepsilon, \]
where $\sigma_E$ is an arbitrary state, this implies that Bob can decode quantum information that Alice intended to send to him. Can show that decoupling is possible as long as qubit rate ≈ coherent information.
Decoupling method
Why does this work? Suppose the state is exactly decoupled. Then one purification of the state $\psi_{RE}$ is the state $|\psi\rangle\langle\psi|_{RBE}$ that they share after the channel acts.
Another purification of $\psi_{RE} = \psi_R \otimes \sigma_E$ is $|\psi\rangle\langle\psi|_{RB_1} \otimes |\sigma\rangle\langle\sigma|_{B_2E}$, where $|\psi\rangle\langle\psi|_{RB_1}$ is the original state that Alice sent through the channel and $|\sigma\rangle\langle\sigma|_{B_2E}$ is some other state that purifies the state $\sigma_E$ of the environment.
All purifications are related by isometries and Bob possesses the purification of $R$ and $E$,
⇒ There exists some unitary $U_{B\to B_1B_2}$ such that
\[ U_{B\to B_1B_2}|\psi\rangle_{RBE} = |\psi\rangle_{RB_1} \otimes |\sigma\rangle_{B_2E}. \]
This unitary is then Bob’s decoder!
Thus, the decoupling condition implies the existence of a decoder for
Bob, so that it is only necessary to show the existence of an encoder
that decouples the reference from the environment.
Future directions

Open questions
It might be difficult to find a general formula for quantum capacity.

Some suspect that the quantity is uncomputable.
Other capacities: private capacity, locking capacity, data hiding
capacity (some results known but many questions remain).
Constructing codes for quantum channels. Major open question for
quantum polar codes is to find an efficiently implementable decoder.
Network quantum information theory: Some results known for
multiple access, broadcast, interference, relay channels. Full
characterization of some of these capacities is still open.
Strong converses and 2nd-order asymptotics. Some results known.
Major open question to establish strong converse property for
quantum capacity of degradable channels. Open: 2nd-order
asymptotics for entanglement-assisted capacity of all channels.

Other topics
Capacities of Gaussian quantum channels. These model practical

communication channels. A number of open questions remain here.
Covert communication over quantum channels.
Quantum channels with memory.
Security of quantum cryptography (bringing theoretical security
proofs closer to experimental implementations).
Reformulating thermodynamics in the quantum regime using some
tools of quantum information theory.
Quantifying entanglement (resource theory of entanglement).
Strengthenings of fundamental quantum entropy inequalities.
