A New Quantum f-Divergence for Trace Class Operators in Hilbert Spaces

Dragomir, Silvestru Sever

doi:10.3390/e16115853


by Silvestru Sever Dragomir ¹,²

¹ Mathematics, School of Engineering & Science, Victoria University, PO Box 14428, Melbourne City, MC 8001, Australia

² School of Computational & Applied Mathematics, University of the Witwatersrand, Private Bag 3, Johannesburg 2050, South Africa

Entropy 2014, 16(11), 5853-5875; https://doi.org/10.3390/e16115853

Submission received: 13 October 2014 / Revised: 30 October 2014 / Accepted: 3 November 2014 / Published: 6 November 2014

(This article belongs to the Section Information Theory, Probability and Statistics)


Abstract: A new quantum f-divergence for trace class operators in Hilbert spaces is introduced. It is shown that for normalised convex functions it is nonnegative. Some upper bounds are provided. Applications for some classes of convex functions of interest are also given.

Keywords: self-adjoint bounded linear operators; functions of operators; trace of operators; quantum divergence

1. Introduction

Let (X, 𝒜) be a measurable space satisfying | 𝒜 | > 2 and μ be a σ-finite measure on (X, 𝒜). Let 𝒫 be the set of all probability measures on (X, 𝒜) which are absolutely continuous with respect to μ. For P, Q ∈ 𝒫, let $p = \frac{d P}{d μ}$ and $q = \frac{d Q}{d μ}$ denote the Radon-Nikodym derivatives of P and Q with respect to μ.

Two probability measures P, Q ∈ 𝒫 are said to be orthogonal, and we denote this by Q ⊥ P, if:

P ({q = 0}) = Q ({p = 0}) = 1.

Let f : [0, ∞) → (−∞, ∞] be a convex function that is continuous at zero, i.e., $f (0) = \lim_{u ↓ 0} f (u)$.

In 1963, I. Csiszár [1] introduced the concept of f-divergence as follows.

Definition 1. Let P, Q ∈ 𝒫. Then:

I_{f} (Q, P) = \int_{X} p (x) f [\frac{q (x)}{p (x)}] d μ (x)

(1)

is called the f-divergence of the probability distributions Q and P.

Remark 1. Observe that the integrand in formula (1) is undefined when p (x) = 0. To overcome this problem, we postulate for f as above that:

0 f [\frac{q (x)}{0}] = q (x) \lim_{u ↓ 0} [u f (\frac{1}{u})], x \in X .

(2)

For f continuous convex on [0, ∞) we obtain the *-conjugate function of f by:

f^{*} (u) = u f (\frac{1}{u}), u \in (0, \infty)

and:

f^{*} (0) = \lim_{u ↓ 0} f^{*} (u) .

It is also known that if f is continuous convex on [0, ∞), then so is f*.
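As an illustration of Definition 1 and convention (2), the following is a minimal sketch for the special case of a finite set X with counting measure, so that the integral becomes a sum. The function name `f_divergence` and the parameter `f_star_at_zero` are hypothetical names chosen here; the caller is assumed to supply the value f*(0).

```python
import math

def f_divergence(f, q, p, f_star_at_zero=math.inf):
    """I_f(Q, P) = sum_x p(x) f(q(x)/p(x)) for discrete densities q, p.

    Where p(x) = 0, convention (2) contributes q(x) * f*(0), with
    f*(0) = lim_{u -> 0+} u f(1/u) supplied by the caller.
    """
    total = 0.0
    for qx, px in zip(q, p):
        if px > 0:
            total += px * f(qx / px)
        elif qx > 0:
            total += qx * f_star_at_zero
    return total

def abs_dev(u):
    # Generator f(u) = |u - 1|.
    return abs(u - 1.0)

p = [0.5, 0.5, 0.0]
q = [0.25, 0.25, 0.5]
# For f(u) = |u - 1| we have f*(u) = u |1/u - 1| = |1 - u|, hence f*(0) = 1.
tv = f_divergence(abs_dev, q, p, f_star_at_zero=1.0)
```

With this generator the sum agrees with the sum of |q(x) − p(x)| over X, including the point where p vanishes.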

The following two theorems contain the most basic properties of f-divergences. For their proofs we refer the reader to Chapter 1 of [2] (see also [3]).

Theorem 1 (Uniqueness and symmetry theorem). Let f, f₁ be continuous convex on [0, ∞). We have:

I_{f_{1}} (Q, P) = I_{f} (Q, P),

for all P, Q ∈ 𝒫 if and only if there exists a constant c ∈ ℝ, such that:

f_{1} (u) = f (u) + c (u - 1),

for any u ∈ [0, ∞).

Theorem 2 (Range of values theorem). Let f : [0, ∞) → ℝ be a continuous convex function on [0, ∞). For any P, Q ∈ 𝒫, we have the double inequality:

f (1) \leq I_{f} (Q, P) \leq f (0) + f^{*} (0) .

(3)

If P = Q, then the equality holds in the first part of (3).
If f is strictly convex at one, then the equality holds in the first part of (3) if and only if P = Q.
If Q ⊥ P, then the equality holds in the second part of (3).
If f (0) + f* (0) < ∞, then equality holds in the second part of (3) if and only if Q ⊥ P.

The following result is a refinement of the second inequality in Theorem 2 (see Theorem 3 in [3]).

A function f defined on [0, ∞) is called normalised if f (1) = 0.

Theorem 3. Let f be a continuous convex and normalised function on [0, ∞) with f (0) + f* (0) < ∞. Then

0 \leq I_{f} (Q, P) \leq \frac{1}{2} [f (0) + f^{*} (0)] V (Q, P)

(4)

for any Q, P ∈ 𝒫, where $V (Q, P) = \int_{X} | q - p | d μ$ is the total variation distance.
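The bounds (3) and (4) can be checked numerically on a small example. The sketch below assumes a three-point space with counting measure and the normalised generator f(u) = (√u − 1)², for which f(0) = 1 and f*(0) = lim u f(1/u) = 1; the distributions are arbitrary illustrative values.

```python
import math

def f(u):
    # Normalised convex generator: f(1) = 0, f(0) = 1 and f*(0) = 1.
    return (math.sqrt(u) - 1.0) ** 2

def divergence(q, p):
    # All p(x) > 0 here, so convention (2) is not needed.
    return sum(px * f(qx / px) for qx, px in zip(q, p))

def total_variation(q, p):
    return sum(abs(qx - px) for qx, px in zip(q, p))

p = [0.2, 0.3, 0.5]
q = [0.6, 0.3, 0.1]
i_f = divergence(q, p)
v = total_variation(q, p)
bound_3 = 1.0 + 1.0           # f(0) + f*(0), the upper bound in (3)
bound_4 = 0.5 * bound_3 * v   # the refined upper bound in (4)
```

Since V (Q, P) ≤ 2, the bound in (4) is never worse than the one in (3); here the two differ noticeably.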

For other inequalities for f-divergence see [4–15].

We now give some examples of f-divergences that are well-known and often used in the literature (see also [3]).

(1) The class of χ^α-divergences. The f-divergences of this class, which is generated by the function χ^α, α ∈ [1, ∞), defined by:

χ^{α} (u) = | u - 1 |^{α}, u \in [0, \infty)

have the form:

I_{f} (Q, P) = \int_{X} p {| \frac{q}{p} - 1 |}^{α} d μ = \int_{X} p^{1 - α} {| q - p |}^{α} d μ .

(5)

From this class only the parameter α = 1 provides a distance in the topological sense, namely the total variation distance $V (Q, P) = \int_{X} | q - p | d μ$. The most prominent special case of this class is, however, Karl Pearson’s χ²-divergence:

χ^{2} (Q, P) = \int_{X} \frac{q^{2}}{p} d μ - 1

that is obtained for α = 2.

(2) Dichotomy class. From this class, generated by the function f_α : [0, ∞) → ℝ:

f_{α} (u) = \begin{cases} u - 1 - \ln u & for α = 0; \\ \frac{1}{α (1 - α)} [α u + 1 - α - u^{α}] & for α \in ℝ \setminus \{0, 1\}; \\ 1 - u + u \ln u & for α = 1; \end{cases}

only the parameter $α = \frac{1}{2}$ (for which $f_{\frac{1}{2}} (u) = 2 {(\sqrt{u} - 1)}^{2}$) provides a distance, namely, the Hellinger distance:

H (Q, P) = {[\int_{X} {(\sqrt{q} - \sqrt{p})}^{2} d μ]}^{\frac{1}{2}} .

Another important divergence is the Kullback–Leibler divergence obtained for α = 1,

K L (Q, P) = \int_{X} q \ln (\frac{q}{p}) d μ .
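The two special cases of the dichotomy class mentioned above can be verified on a finite example. The following sketch assumes strictly positive discrete densities (so no boundary conventions are needed); `f_alpha` and `divergence` are illustrative names. It compares the f-divergence for α = 1/2 with 2H² and the one for α = 1 with KL.

```python
import math

def f_alpha(a, u):
    """Generator of the dichotomy class, evaluated at u > 0."""
    if a == 0:
        return u - 1.0 - math.log(u)
    if a == 1:
        return 1.0 - u + u * math.log(u)
    return (a * u + 1.0 - a - u ** a) / (a * (1.0 - a))

def divergence(a, q, p):
    return sum(px * f_alpha(a, qx / px) for qx, px in zip(q, p))

p = [0.2, 0.3, 0.5]
q = [0.6, 0.3, 0.1]
hellinger_sq = sum((math.sqrt(qx) - math.sqrt(px)) ** 2 for qx, px in zip(q, p))
kl = sum(qx * math.log(qx / px) for qx, px in zip(q, p))
half_case = divergence(0.5, q, p)   # expected to equal 2 * H(Q, P)^2
one_case = divergence(1, q, p)      # expected to equal KL(Q, P)
```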

(3) Matsushita’s divergences. The elements of this class, which is generated by the function φ_α, α ∈ (0, 1], given by:

φ_{α} (u) : = | 1 - u^{α} |^{\frac{1}{α}}, u \in [0, \infty),

are prototypes of metric divergences, providing the distances $[I_{φ_{α}} (Q, P)]^{α}$.

(4) Puri-Vincze divergences. This class is generated by the functions Φ_α, α ∈ [1, ∞), given by:

Φ_{α} (u) : = \frac{| 1 - u |^{α}}{{(u + 1)}^{α - 1}}, u \in [0, \infty) .

It has been shown in [16] that this class provides the distances ${[I_{Φ_{α}} (Q, P)]}^{\frac{1}{α}}$.

(5) Divergences of Arimoto-type. This class is generated by the functions:

ψ_{α} (u) : = \begin{cases} \frac{α}{α - 1} [{(1 + u^{α})}^{\frac{1}{α}} - 2^{\frac{1}{α} - 1} (1 + u)] & for α \in (0, \infty) \setminus \{1\}; \\ (1 + u) \ln 2 + u \ln u - (1 + u) \ln (1 + u) & for α = 1; \\ \frac{1}{2} | 1 - u | & for α = \infty . \end{cases}

It has been shown in [17] that this class provides the distances ${[I_{ψ_{α}} (Q, P)]}^{\min (α, \frac{1}{α})}$ for α ∈ (0, ∞) and $\frac{1}{2} V (Q, P)$ for α = ∞.

In order to introduce a quantum f-divergence for trace class operators in Hilbert spaces and study its properties we need some preliminary facts as follows.

2. Trace of Operators

Let (H, ⟨·, ·⟩) be a complex Hilbert space and {e_i}_{i∈I} an orthonormal basis of H. We say that A ∈ ℬ (H) is a Hilbert–Schmidt operator if:

\sum_{i \in I} {‖ A e_{i} ‖}^{2} < \infty .

(6)

It is well known that, if {e_i}_{i∈I} and {f_j}_{j∈J} are orthonormal bases for H and A ∈ ℬ (H), then:

\sum_{i \in I} {‖ A e_{i} ‖}^{2} = \sum_{j \in J} {‖ A f_{j} ‖}^{2} = \sum_{j \in J} {‖ A^{*} f_{j} ‖}^{2}

(7)

showing that the definition (6) is independent of the orthonormal basis and A is a Hilbert–Schmidt operator iff A* is a Hilbert–Schmidt operator.

Let ℬ₂ (H) be the set of Hilbert–Schmidt operators in ℬ (H). For A ∈ ℬ₂ (H) we define:

{‖ A ‖}_{2} : = {(\sum_{i \in I} {‖ A e_{i} ‖}^{2})}^{1 / 2}

(8)

for {e_i}_{i∈I} an orthonormal basis of H. This definition does not depend on the choice of the orthonormal basis.

Using the triangle inequality in l² (I), one checks that ℬ₂ (H) is a vector space and that ‖·‖₂ is a norm on ℬ₂ (H), which is usually called the Hilbert–Schmidt norm in the literature.

Denote the modulus of an operator A ∈ ℬ (H) by |A| := (A*A)^{1/2}.

Because ‖ |A| x ‖ = ‖ A x ‖ for all x ∈ H, A is Hilbert–Schmidt iff |A| is Hilbert–Schmidt and ‖A‖₂ = ‖ |A| ‖₂. From (7) we have that if A ∈ ℬ₂ (H), then A* ∈ ℬ₂ (H) and ‖A‖₂ = ‖A*‖₂.

The following theorem collects some of the most important properties of Hilbert–Schmidt operators:

Theorem 4. We have:

(ℬ₂ (H), ‖·‖₂) is a Hilbert space with inner product:

{〈 A, B 〉}_{2} : = \sum_{i \in I} 〈 A e_{i}, B e_{i} 〉 = \sum_{i \in I} 〈 B^{*} A e_{i}, e_{i} 〉

(9)

and the definition does not depend on the choice of the orthonormal basis {e_i}_{i∈I};

We have the inequalities:

‖ A ‖ \leq {‖ A ‖}_{2}

(10)

for any A ∈ ℬ₂ (H) and:

{‖ AT ‖}_{2}, {‖ TA ‖}_{2} \leq ‖ T ‖ {‖ A ‖}_{2}

(11)

for any A ∈ ℬ₂ (H) and T ∈ ℬ (H);

ℬ₂ (H) is an operator ideal in ℬ (H), i.e.,

ℬ (H) ℬ_{2} (H) ℬ (H) \subseteq ℬ_{2} (H);

ℬ_fin (H), the space of operators of finite rank, is a dense subspace of ℬ₂ (H);
ℬ₂ (H) ⊆ K (H), where K (H) denotes the algebra of compact operators on H.

If {e_i}_{i∈I} is an orthonormal basis of H, we say that A ∈ ℬ (H) is trace class if:

{‖ A ‖}_{1} : = \sum_{i \in I} 〈 | A | e_{i}, e_{i} 〉 < \infty .

(12)

The definition of ‖A‖₁ does not depend on the choice of the orthonormal basis {e_i}_{i∈I}. We denote by ℬ₁ (H) the set of trace class operators in ℬ (H).

The following proposition holds:

Proposition 1. If A ∈ ℬ (H), then the following are equivalent:

A ∈ $ℬ_{1}$ (H);
|A|^1/2 ∈ $ℬ_{2}$ (H);
A (or |A|) is the product of two elements of $ℬ_{2}$ (H).

The following properties are also well known:

Theorem 5. With the above notations:

We have

{‖ A ‖}_{1} = {‖ A^{*} ‖}_{1} and {‖ A ‖}_{2} \leq {‖ A ‖}_{1}

(13)

for any A ∈ ℬ₁ (H);

ℬ₁ (H) is an operator ideal in ℬ (H), i.e.,

ℬ (H) ℬ_{1} (H) ℬ (H) \subseteq ℬ_{1} (H);

We have:

ℬ_{2} (H) ℬ_{2} (H) = ℬ_{1} (H);

We have:

{‖ A ‖}_{1} = \sup \{{〈 A, B 〉}_{2} | B \in ℬ_{2} (H), ‖ B ‖ \leq 1\};

(ℬ₁ (H), ‖·‖₁) is a Banach space.
We have the following isometric isomorphisms:

ℬ_{1} (H) ≅ K (H)^{*} and ℬ_{1} (H)^{*} ≅ ℬ (H),

where K (H)* is the dual space of K (H) and ℬ₁ (H)* is the dual space of ℬ₁ (H).

We define the trace of a trace class operator A ∈ ℬ₁ (H) to be:

tr (A) : = \sum_{i \in I} 〈 A e_{i}, e_{i} 〉,

(14)

where {e_i}_{i∈I} is an orthonormal basis of H. Note that this coincides with the usual definition of the trace if H is finite-dimensional. We observe that the series (14) converges absolutely and is independent of the choice of basis.

The following result collects some properties of the trace:

Theorem 6. We have:

If A ∈ $ℬ_{1}$ (H), then A* ∈ $ℬ_{1}$ (H) and:

tr (A^{*}) = \overline{tr (A)};

(15)

If A ∈ $ℬ_{1}$ (H) and T ∈ $ℬ$ (H), then AT, TA ∈ $ℬ_{1}$ (H) and:

tr (AT) = tr (TA) and | tr (AT) | \leq {‖ A ‖}_{1} ‖ T ‖;

(16)

tr (·) is a bounded linear functional on $ℬ_{1}$ (H) with ||tr|| = 1;
If A, B ∈ $ℬ_{2}$ (H), then AB, BA ∈ $ℬ_{1}$ (H) and tr (AB) = tr (BA);
$ℬ$ _fin (H) is a dense subspace of $ℬ_{1}$ (H).

Utilising the trace notation, we obviously have that:

{〈 A, B 〉}_{2} = tr (B^{*} A) = tr (A B^{*}) and {‖ A ‖}_{2}^{2} = tr (| A |^{2})

for any A, B ∈ ℬ₂ (H).

The following Hölder-type inequality has been obtained by Ruskai in [18]:

| tr (A B) | \leq tr (| A B |) \leq {[tr (| A |^{1 / α})]}^{α} {[tr (| B |^{1 / (1 - α)})]}^{1 - α}

(17)

where α ∈ (0, 1) and A, B ∈ ℬ (H) with $| A |^{1 / α}, | B |^{1 / (1 - α)} \in ℬ_{1} (H)$.

In particular, for $α = \frac{1}{2}$ we get the Schwarz inequality:

| tr (A B) | \leq tr (| A B |) \leq {[tr (| A |^{2})]}^{1 / 2} {[tr (| B |^{2})]}^{1 / 2}

(18)

with A, B ∈ ℬ₂ (H).

If A ≥ 0 and P ∈ ℬ₁ (H) with P ≥ 0, then:

0 \leq tr (P A) \leq ‖ A ‖ tr (P) .

(19)

Indeed, since A ≥ 0, then ⟨Ax, x⟩ ≥ 0 for any x ∈ H. If {e_i}_{i∈I} is an orthonormal basis of H, then:

0 \leq 〈 A P^{1 / 2} e_{i}, P^{1 / 2} e_{i} 〉 \leq ‖ A ‖ {‖ P^{1 / 2} e_{i} ‖}^{2} = ‖ A ‖ 〈 P e_{i}, e_{i} 〉

for any i ∈ I. Summing over i ∈ I, we get:

0 \leq \sum_{i \in I} 〈 A P^{1 / 2} e_{i}, P^{1 / 2} e_{i} 〉 \leq ‖ A ‖ \sum_{i \in I} 〈 P e_{i}, e_{i} 〉 = ‖ A ‖ tr (P)

and since:

\sum_{i \in I} 〈 A P^{1 / 2} e_{i}, P^{1 / 2} e_{i} 〉 = \sum_{i \in I} 〈 P^{1 / 2} A P^{1 / 2} e_{i}, e_{i} 〉 = tr (P^{1 / 2} A P^{1 / 2}) = tr (P A)

we obtain the desired result (19).
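Inequality (19) can be illustrated in the finite-dimensional case H = ℝ², where every operator is trace class. The sketch below uses plain nested lists for 2×2 real symmetric matrices; the helper names `mul`, `tr` and `op_norm` and the matrices are illustrative choices, not taken from the paper.

```python
import math

def mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def tr(a):
    return a[0][0] + a[1][1]

def op_norm(a):
    """Largest eigenvalue of a positive semidefinite symmetric 2x2 matrix."""
    mean = (a[0][0] + a[1][1]) / 2
    return mean + math.hypot((a[0][0] - a[1][1]) / 2, a[0][1])

A = [[2.0, 1.0], [1.0, 1.0]]   # A >= 0 (trace 3, determinant 1)
P = [[0.7, 0.2], [0.2, 0.3]]   # P >= 0 with tr(P) = 1
lhs = tr(mul(P, A))            # tr(PA)
rhs = op_norm(A) * tr(P)       # ||A|| tr(P)
```

Note that A and P do not commute here, so the bound is not simply a statement about products of eigenvalues.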

This obviously implies that, if A and B are self-adjoint operators with A ≤ B and P ∈ ℬ₁ (H) with P ≥ 0, then:

tr (P A) \leq tr (P B) .

(20)

Now, if A is a self-adjoint operator, then we know that:

| 〈 A x, x 〉 | \leq 〈 | A | x, x 〉 for any x \in H .

This inequality follows by Jensen’s inequality for the convex function f (t) = |t| defined on a closed interval containing the spectrum of A.

If {e_i}_{i∈I} is an orthonormal basis of H, then:

\begin{matrix} | tr (P A) | = | \sum_{i \in I} 〈 A P^{1 / 2} e_{i}, P^{1 / 2} e_{i} 〉 | \leq \sum_{i \in I} | 〈 A P^{1 / 2} e_{i}, P^{1 / 2} e_{i} 〉 | \\ \leq \sum_{i \in I} 〈 | A | P^{1 / 2} e_{i}, P^{1 / 2} e_{i} 〉 = tr (P | A |), \end{matrix}

(21)

for any self-adjoint operator A and any $P \in ℬ_{1}^{+} (H) : = \{P \in ℬ_{1} (H) with P \geq 0\}$.

For the theory of trace functionals and their applications the reader is referred to [19].

For some classical trace inequalities see [20–4], which are continuations of the work of Bellman [25]. For related works the reader can refer to [20,24,26–2].

3. Classical Quantum f-Divergence

On the complex Hilbert space (ℬ₂ (H), ⟨·, ·⟩₂), where the Hilbert–Schmidt inner product is defined by:

{〈 U, V 〉}_{2} : = tr (V^{*} U),

for A, B ∈ ℬ⁺ (H) consider the operators L_A : ℬ₂ (H) → ℬ₂ (H) and ℜ_B : ℬ₂ (H) → ℬ₂ (H) defined by:

L_{A} T : = AT and ℜ_{B} T : = T B .

We observe that they are well defined and since:

{〈 L_{A} T, T 〉}_{2} = {〈 AT, T 〉}_{2} = tr (T^{*} A T) = tr (| T^{*} |^{2} A) \geq 0

and:

{〈 ℜ_{B} T, T 〉}_{2} = {〈 T B, T 〉}_{2} = tr (T^{*} T B) = tr (| T |^{2} B) \geq 0

for any T ∈ ℬ₂ (H), they are also positive in the operator order of ℬ (ℬ₂ (H)), the Banach algebra of all bounded operators on ℬ₂ (H) with the norm ‖·‖₂, where ‖T‖₂ = [tr (|T|²)]^{1/2}, T ∈ ℬ₂ (H).

Since tr (|X*|²) = tr (|X|²) for any X ∈ ℬ₂ (H), then also:

\begin{array}{l} tr (T^{*} A T) = tr (T^{*} A^{1 / 2} A^{1 / 2} T) = tr ({(A^{1 / 2} T)}^{*} A^{1 / 2} T) \\ = tr ({| A^{1 / 2} T |}^{2}) = tr ({| {(A^{1 / 2} T)}^{*} |}^{2}) = tr ({| T^{*} A^{1 / 2} |}^{2}) \end{array}

for A ≥ 0 and T ∈ ℬ₂ (H).

We observe that L_A and ℜ_B commute, therefore the product L_A ℜ_B is a self-adjoint positive operator on ℬ₂ (H) for any positive operators A, B ∈ ℬ (H).

For A, B ∈ ℬ⁺ (H) with B invertible, we define the Araki transform $\mathfrak{A}_{A, B} : ℬ_{2} (H) \to ℬ_{2} (H)$ by $\mathfrak{A}_{A, B} : = L_{A} ℜ_{B^{- 1}}$. We observe that for T ∈ ℬ₂ (H) we have $\mathfrak{A}_{A, B} T = A T B^{- 1}$ and:

{〈 \mathfrak{A}_{A, B} T, T 〉}_{2} = {〈 A T B^{- 1}, T 〉}_{2} = tr (T^{*} A T B^{- 1}) .

Observe also, by the properties of trace, that:

\begin{matrix} tr (T^{*} A T B^{- 1}) = tr (B^{- 1 / 2} T^{*} A^{1 / 2} A^{1 / 2} T B^{- 1 / 2}) \\ = tr ({(A^{1 / 2} T B^{- 1 / 2})}^{*} (A^{1 / 2} T B^{- 1 / 2})) = tr ({| A^{1 / 2} T B^{- 1 / 2} |}^{2}) \end{matrix}

giving that:

{〈 \mathfrak{A}_{A, B} T, T 〉}_{2} = tr ({| A^{1 / 2} T B^{- 1 / 2} |}^{2}) \geq 0

(22)

for any T ∈ ℬ₂ (H).

Let U be a self-adjoint linear operator on a complex Hilbert space (K; ⟨·, ·⟩). The Gelfand map establishes an isometric ∗-isomorphism Φ between the set C (Sp (U)) of all continuous functions defined on the spectrum of U, denoted Sp (U), and the C*-algebra C* (U) generated by U and the identity operator 1_K on K as follows:

For any f, g ∈ C (Sp (U)) and any α, β ∈ ℂ we have

Φ (αf + βg) = αΦ (f) + βΦ (g);
Φ (fg) = Φ (f) Φ (g) and $Φ (\bar{f}) = Φ (f)^{*}$;
‖Φ (f)‖ = ‖f‖ := sup_{t ∈ Sp (U)} |f (t)|;
Φ (f₀) = 1_K and Φ (f₁) = U, where f₀ (t) = 1 and f₁ (t) = t, for t ∈ Sp (U).

With this notation we define:

f (U) : = Φ (f) for all f \in C (Sp (U))

and we call it the continuous functional calculus for a self-adjoint operator U.

If U is a self-adjoint operator and f is a real valued continuous function on Sp (U), then f (t) ≥ 0 for any t ∈ Sp (U) implies that f (U) ≥ 0, i.e., f (U) is a positive operator on K. Moreover, if both f and g are real valued functions on Sp (U), then the following important property holds:

f (t) \geq g (t) for any t \in Sp (U) implies that f (U) \geq g (U)

(P)

in the operator order of B (K).

Let f : [0, ∞) → ℝ be a continuous function. Utilising the continuous functional calculus for the Araki self-adjoint operator $\mathfrak{A}_{Q, P} \in ℬ (ℬ_{2} (H))$, we can define the quantum f-divergence for Q, P ∈ S (H) := {P ∈ ℬ₁ (H), P ≥ 0 with tr (P) = 1} and P invertible, by:

S_{f} (Q, P) : = {〈 f (\mathfrak{A}_{Q, P}) P^{1 / 2}, P^{1 / 2} 〉}_{2} = tr (P^{1 / 2} f (\mathfrak{A}_{Q, P}) P^{1 / 2}) .

If we consider the continuous convex function f : [0, ∞) → ℝ, with f (0) := 0 and f (t) = t ln t for t > 0, then for Q, P ∈ S (H) and Q, P invertible, we have:

S_{f} (Q, P) = tr [Q (\ln Q - \ln P)] = : U (Q, P),

which is the Umegaki relative entropy.

If we take the continuous convex function f : [0, ∞) → ℝ, f (t) = |t − 1| for t ≥ 0, then for Q, P ∈ S (H) with P invertible, we have:

S_{f} (Q, P) = tr (P^{1 / 2} | \mathfrak{A}_{Q, P} - 1_{ℬ_{2} (H)} | P^{1 / 2}) = tr (| Q - P |) = : V (Q, P),

where V (Q, P) is the variational distance.

If we take f : [0, ∞) → ℝ, f (t) = t² − 1 for t ≥ 0, then for Q, P ∈ S (H) with P invertible, we have:

S_{f} (Q, P) = tr (Q^{2} P^{- 1}) - 1 = : χ^{2} (Q, P),

(23)

which is called the χ²-distance.

Let q ∈ (0, 1) and define the convex function f_q : [0, ∞) → ℝ by $f_{q} (t) = \frac{1 - t^{q}}{1 - q}$. Then:

S_{f_{q}} (Q, P) = \frac{1 - tr (Q^{q} P^{1 - q})}{1 - q},

which is the Tsallis relative entropy.

If we consider the convex function f : [0, ∞) → ℝ given by $f (t) = \frac{1}{2} {(\sqrt{t} - 1)}^{2}$, then:

S_{f} (Q, P) = 1 - tr (Q^{1 / 2} P^{1 / 2}) = : h^{2} (Q, P),

which is known as the Hellinger discrimination.

If we take f : (0, ∞) → ℝ, f (t) = − ln t, then for Q, P ∈ S (H) and Q, P invertible, we have:

S_{f} (Q, P) = tr [P (\ln P - \ln Q)] = U (P, Q) .

In the important case of finite dimensional space H and the generalized inverse P⁻¹, numerous properties of the quantum f-divergence have been obtained in the recent papers [33–36] and the references therein. We omit the details.

4. A New Quantum f-Divergence

In order to simplify the writing, we denote by S₁ (H) the set of all density operators, i.e., the elements of ℬ₁⁺ (H) having unit trace.

We observe that, if P, Q are self-adjoint with P, Q ≥ 0 and P is invertible, then $P^{- \frac{1}{2}} Q P^{- \frac{1}{2}} \geq 0$.

Let f : [0, ∞) → ℝ be a continuous convex function on [0, ∞). We can define the following new quantum f-divergence functional:

D_{f} (Q, P) : = tr [P f (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}})]

(D)

for Q, P ∈ S₁ (H) with P invertible. The definition can be extended for any continuous function.
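Definition (D) can be made concrete when H = ℝ², where density operators are 2×2 positive matrices of unit trace. The following is a minimal sketch under that assumption, using only the standard library; the helpers `mul`, `tr`, `spectrum`, `apply` and `d_f` and the matrices P, Q are illustrative. For a symmetric 2×2 matrix, f agrees on the two eigenvalues with a straight line, so f of the matrix is an affine function of it.

```python
import math

def mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def tr(a):
    return a[0][0] + a[1][1]

def spectrum(a):
    """Eigenvalues (lo, hi) of a real symmetric 2x2 matrix."""
    mean = (a[0][0] + a[1][1]) / 2
    half = math.hypot((a[0][0] - a[1][1]) / 2, a[0][1])
    return mean - half, mean + half

def apply(f, a):
    """f(a) for symmetric 2x2 a: slope * a + intercept * identity, where the
    line passes through (lo, f(lo)) and (hi, f(hi))."""
    lo, hi = spectrum(a)
    slope = 0.0 if hi == lo else (f(hi) - f(lo)) / (hi - lo)
    intercept = f(lo) - slope * lo
    return [[slope * a[i][j] + (intercept if i == j else 0.0)
             for j in range(2)] for i in range(2)]

def d_f(f, q, p):
    """D_f(Q, P) = tr[P f(P^{-1/2} Q P^{-1/2})] for invertible P."""
    root = apply(lambda t: t ** -0.5, p)
    s = mul(mul(root, q), root)
    # Symmetrise to remove rounding asymmetry before the spectral step.
    s[0][1] = s[1][0] = (s[0][1] + s[1][0]) / 2
    return tr(mul(p, apply(f, s)))

P = [[0.7, 0.2], [0.2, 0.3]]
Q = [[0.4, -0.1], [-0.1, 0.6]]
chi2 = d_f(lambda t: t * t - 1.0, Q, P)
```

For f (t) = t² − 1 the value can be compared with tr (Q² P⁻¹) − 1 computed directly, which is a useful sanity check on the functional calculus helper.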

If we take the convex function f (t) = t² − 1, t ≥ 0, then we get:

D_{f} (Q, P) : = tr [P [{(P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}})}^{2} - 1]] = tr (Q^{2} P^{- 1}) - 1 = : χ^{2} (Q, P),

for Q, P ∈ S₁ (H) with P invertible, which is the Karl Pearson’s χ²-divergence version for trace class operators. This divergence is the same as the one generated by the classical f-divergence, see (23).

More generally, if we take the convex function f (t) = tⁿ − 1, t ≥ 0, with n a natural number, n ≥ 2, then we get:

\begin{array}{l} D_{f} (Q, P) : = tr [P [{(P^{- \frac{1}{2}} Q P^{- \frac{1}{2}})}^{n} - 1_{H}]] = tr (Q {(Q P^{- 1})}^{n - 1}) - 1 \\ = : D_{{\tilde{χ}}^{n}} (Q, P) \end{array}

for Q, P ∈ S₁ (H) with P invertible.

If we take the convex function f (t) = t ln t for t > 0 and f (0) := 0, then we get:

\begin{array}{l} D_{f} (Q, P) = tr [P [P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}} \ln (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}})]] \\ = tr [P^{\frac{1}{2}} {QP}^{- \frac{1}{2}} \ln (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}})] = : D_{KL} (Q, P) \end{array}

for Q, P ∈ S₁ (H) with P and Q invertible. We observe that this is not the same as Umegaki relative entropy introduced above.

If we take the convex function f (t) = − ln t for t > 0, then we get:

\begin{array}{l} D_{f} (Q, P) = - tr [P \ln (P^{- \frac{1}{2}} Q P^{- \frac{1}{2}})] = tr [P \ln {(P^{- \frac{1}{2}} Q P^{- \frac{1}{2}})}^{- 1}] \\ = tr [P \ln (P^{\frac{1}{2}} Q^{- 1} P^{\frac{1}{2}})] = : {\tilde{D}}_{KL} (Q, P) \end{array}

for Q, P ∈ S₁ (H) with P and Q invertible.

If we take the convex function f (t) = |t − 1| , t ≥ 0, then we get:

D_{f} (Q, P) = tr [P | P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}} - 1_{H} |] = : D_{V} (Q, P)

for Q, P ∈ S₁ (H) with P invertible.

If we consider the convex function $f (t) = \frac{1}{t} - 1$, t > 0, then:

\begin{array}{l} D_{f} (Q, P) = tr [P [{(P^{- \frac{1}{2}} Q P^{- \frac{1}{2}})}^{- 1} - 1_{H}]] \\ = tr [P (P^{\frac{1}{2}} Q^{- 1} P^{\frac{1}{2}})] - 1 = tr (P^{2} Q^{- 1}) - 1 = χ^{2} (P, Q) \end{array}

for Q, P ∈ S₁ (H) with P and Q invertible.

If we take the convex function $f (t) = f_{q} (t) = \frac{1 - t^{q}}{1 - q}$, q ∈ (0, 1), then we get:

D_{f_{q}} (Q, P) = \frac{1}{1 - q} (1 - tr [P {(P^{- \frac{1}{2}} Q P^{- \frac{1}{2}})}^{q}]),

which is different, in general, from the Tsallis relative entropy introduced above.

Other examples may be considered by taking the convex functions from the introduction. The details are omitted.

Suppose that I is an interval of real numbers with interior $\overset{°}{I}$ and f : I → ℝ is a convex function on I. Then, f is continuous on $\overset{°}{I}$ and has finite left and right derivatives at each point of $\overset{°}{I}$. Moreover, if $x, y \in \overset{°}{I}$ and x < y, then

{f^{'}}_{-} (x) \leq {f^{'}}_{+} (x) \leq {f^{'}}_{-} (y) \leq {f^{'}}_{+} (y)

which shows that both ${f^{'}}_{-}$ and ${f^{'}}_{+}$ are nondecreasing functions on $\overset{°}{I}$. It is also known that a convex function must be differentiable except for at most countably many points.

For a convex function f : I → ℝ, the subdifferential of f, denoted by ∂f, is the set of all functions φ : I → [−∞, ∞] such that $φ (\overset{°}{I}) \subset ℝ$ and:

f (x) \geq f (a) + (x - a) φ (a) for any x, a \in I .

(24)

It is also well known that if f is convex on I, then ∂f is nonempty, ${f^{'}}_{-}, {f^{'}}_{+} \in \partial f$ and if φ ∈ ∂f, then:

{f^{'}}_{-} (x) \leq φ (x) \leq {f^{'}}_{+} (x) for any x \in \overset{°}{I} .

In particular, φ is a nondecreasing function.

If f is differentiable and convex on $\overset{°}{I}$, then ∂f = {f′}.

Theorem 7. Let f be a continuous convex function on [0, ∞) with f (1) = 0. Then, we have:

0 \leq D_{f} (Q, P)

(25)

for any Q, P ∈ S₁ (H) with P invertible.

If f is continuously differentiable on (0, ∞), then we also have:

D_{f} (Q, P) \leq D_{(\cdot) f^{'} (\cdot)} (Q, P) - D_{f'} (Q, P) .

(26)

Proof. For any x ≥ 0, we have from the gradient inequality (24) that:

f (x) \geq f (1) + (x - 1) {f^{'}}_{+} (1)

and since f is normalised, then:

f (x) \geq (x - 1) {f^{'}}_{+} (1) .

(27)

Utilising the property (P) for the positive operator $A = P^{- \frac{1}{2}} Q P^{- \frac{1}{2}}$, where Q, P ∈ S₁ (H) with P invertible, we have the inequality in the operator order:

f (P^{- \frac{1}{2}} Q P^{- \frac{1}{2}}) \geq (P^{- \frac{1}{2}} Q P^{- \frac{1}{2}} - 1_{H}) {f^{'}}_{+} (1) .

(28)

Utilising the property (20) for the inequality (28), we have:

\begin{array}{l} tr [P f (P^{- \frac{1}{2}} Q P^{- \frac{1}{2}})] \geq tr [P (P^{- \frac{1}{2}} Q P^{- \frac{1}{2}} - 1_{H}) {f^{'}}_{+} (1)] \\ = {f^{'}}_{+} (1) [tr (P P^{- \frac{1}{2}} Q P^{- \frac{1}{2}}) - tr (P)] \\ = {f^{'}}_{+} (1) [tr (Q) - tr (P)] = 0 \end{array}

and the inequality (25) is proven.

From the gradient inequality, we also have for any x ≥ 0:

(x - 1) f^{'} (x) + f (1) \geq f (x)

and since f is normalised, then:

(x - 1) f^{'} (x) \geq f (x)

which, as above, implies that:

(P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}} - 1_{H}) f^{'} (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}}) \geq f (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}}) .

(29)

Making use of the property (20) for the inequality (29), then we get:

tr [P (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}} - 1_{H}) f^{'} (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}})] \geq tr [P f (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}})],

(30)

which is the required inequality (26). ❚

Remark 2. If we take f (t) = −ln t, t > 0 in Theorem 7, then we get:

0 \leq {\tilde{D}}_{KL} (Q, P) \leq χ^{2} (P, Q)

(31)

for any Q, P ∈ S₁ (H) with P and Q invertible.
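Inequality (31) lends itself to a quick numerical check in the case H = ℝ². The sketch below is self-contained and relies on the same assumptions as any 2×2 illustration: density operators are symmetric positive matrices of unit trace, and f of such a matrix is obtained from its two eigenvalues. All helper names and the matrices are illustrative.

```python
import math

def mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def tr(a):
    return a[0][0] + a[1][1]

def apply(f, a):
    """f(a) for a real symmetric 2x2 matrix a, via its two eigenvalues."""
    mean = (a[0][0] + a[1][1]) / 2
    half = math.hypot((a[0][0] - a[1][1]) / 2, a[0][1])
    lo, hi = mean - half, mean + half
    slope = 0.0 if hi == lo else (f(hi) - f(lo)) / (hi - lo)
    intercept = f(lo) - slope * lo
    return [[slope * a[i][j] + (intercept if i == j else 0.0)
             for j in range(2)] for i in range(2)]

def d_f(f, q, p):
    """D_f(Q, P) = tr[P f(P^{-1/2} Q P^{-1/2})] for invertible P."""
    root = apply(lambda t: t ** -0.5, p)
    s = mul(mul(root, q), root)
    s[0][1] = s[1][0] = (s[0][1] + s[1][0]) / 2   # remove rounding asymmetry
    return tr(mul(p, apply(f, s)))

P = [[0.7, 0.2], [0.2, 0.3]]
Q = [[0.4, -0.1], [-0.1, 0.6]]
d_kl_tilde = d_f(lambda t: -math.log(t), Q, P)
chi2_pq = d_f(lambda t: t * t - 1.0, P, Q)   # chi^2(P, Q): roles swapped
```

The second quantity is χ² (P, Q) = tr (P² Q⁻¹) − 1, obtained from the same functional with the arguments exchanged.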

If we take the convex function ε (t) = e^{t−1} − 1, then:

D_{ε} (Q, P) = tr [P \exp (P^{- \frac{1}{2}} Q P^{- \frac{1}{2}} - 1_{H})] - 1,

where Q, P ∈ S₁ (H) with P invertible.

By Theorem 7, we get:

0 \leq D_{ε} (Q, P) \leq tr [P (P^{- \frac{1}{2}} Q P^{- \frac{1}{2}} - 1_{H}) \exp (P^{- \frac{1}{2}} Q P^{- \frac{1}{2}} - 1_{H})],

(32)

where Q, P ∈ S₁ (H) with P invertible.

Since the right-hand side of (32) equals $tr [P^{\frac{1}{2}} Q P^{- \frac{1}{2}} \exp (P^{- \frac{1}{2}} Q P^{- \frac{1}{2}} - 1_{H})] - D_{ε} (Q, P) - 1$, the inequality in (32) is equivalent to:

0 \leq D_{ε} (Q, P) \leq \frac{1}{2} [tr [P^{\frac{1}{2}} Q P^{- \frac{1}{2}} \exp (P^{- \frac{1}{2}} Q P^{- \frac{1}{2}} - 1_{H})] - 1],

where Q, P ∈ S₁ (H) with P invertible.

The following lemma is of interest in itself:

Lemma 1. Let S be a self-adjoint operator such that γ1_H ≤ S ≤ Γ1_H for some real constants Γ ≥ γ. Then, for any $P \in ℬ_{1}^{+} (H) \setminus \{0\}$, we have:

\begin{array}{l} 0 \leq \frac{tr (P S^{2})}{tr (P)} - {(\frac{tr (PS)}{tr (P)})}^{2} \\ \leq \frac{1}{2} (Γ - γ) \frac{1}{tr (P)} tr (P | S - \frac{tr (PS)}{tr (P)} 1_{H} |) \\ \leq \frac{1}{2} (Γ - γ) {[\frac{tr (P S^{2})}{tr (P)} - {(\frac{tr (PS)}{tr (P)})}^{2}]}^{1 / 2} \leq \frac{1}{4} {(Γ - γ)}^{2} . \end{array}

(33)

Proof. Observe that:

\begin{array}{l} \frac{1}{tr (P)} tr (P (S - \frac{Γ + γ}{2} 1_{H}) (S - \frac{tr (PS)}{tr (P)} 1_{H})) \\ = \frac{1}{tr (P)} tr (PS (S - \frac{tr (PS)}{tr (P)} 1_{H})) \\ - \frac{Γ + γ}{2} \frac{1}{tr (P)} tr (P (S - \frac{tr (PS)}{tr (P)} 1_{H})) \\ = \frac{tr ({PS}^{2})}{tr (P)} - {(\frac{tr (PS)}{tr (P)})}^{2} \end{array}

(34)

since, obviously:

tr (P (S - \frac{tr (PS)}{tr (P)} 1_{H})) = 0.

Now, since γ1_H ≤ S ≤ Γ1_H, then:

| S - \frac{Γ + γ}{2} 1_{H} | \leq \frac{1}{2} (Γ - γ) .

Taking the modulus in (34) and using the properties of trace, we have:

\begin{array}{l} \frac{tr ({PS}^{2})}{tr (P)} - {(\frac{tr (PS)}{tr (P)})}^{2} \\ = \frac{1}{tr (P)} | tr (P (S - \frac{Γ + γ}{2} 1_{H}) (S - \frac{tr (PS)}{tr (P)} 1_{H})) | \\ \leq \frac{1}{tr (P)} tr (P | (S - \frac{Γ + γ}{2} 1_{H}) (S - \frac{tr (PS)}{tr (P)} 1_{H}) |) \\ \leq \frac{1}{2} (Γ - γ) \frac{1}{tr (P)} tr (P | S - \frac{tr (PS)}{tr (P)} 1_{H} |), \end{array}

(35)

which proves the first part of (33).

By Schwarz inequality for trace, we also have:

\begin{array}{l} \frac{1}{tr (P)} tr (P | S - \frac{tr (PS)}{tr (P)} 1_{H} |) \\ \leq {[\frac{1}{tr (P)} tr (P {(S - \frac{tr (PS)}{tr (P)} 1_{H})}^{2})]}^{1 / 2} \\ = {[\frac{tr ({PS}^{2})}{tr (P)} - {(\frac{tr (PS)}{tr (P)})}^{2}]}^{1 / 2} . \end{array}

(36)

From (35) and (36), we get:

\begin{array}{l} \frac{tr ({PS}^{2})}{tr (P)} - {(\frac{tr (PS)}{tr (P)})}^{2} \\ \leq \frac{1}{2} (Γ - γ) {[\frac{tr ({PS}^{2})}{tr (P)} - {(\frac{tr (PS)}{tr (P)})}^{2}]}^{1 / 2}, \end{array}

which implies that:

{[\frac{tr ({PS}^{2})}{tr (P)} - {(\frac{tr (PS)}{tr (P)})}^{2}]}^{1 / 2} \leq \frac{1}{2} (Γ - γ) .

By (36), we then obtain:

\begin{array}{l} \frac{1}{tr (P)} tr (P | S - \frac{tr (PS)}{tr (P)} 1_{H} |) \\ \leq {[\frac{tr ({PS}^{2})}{tr (P)} - {(\frac{tr (PS)}{tr (P)})}^{2}]}^{1 / 2} \leq \frac{1}{2} (Γ - γ) \end{array}

that proves the last part of (33). ❚
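The chain (33) involves only functions of S, so it can be checked in an eigenbasis of S. The sketch below assumes H = ℝ³ and S diagonal in the standard basis, in which case tr (P g (S)) depends only on the diagonal entries of P; the helper `tr_p_g` and the numerical values are illustrative.

```python
import math

def tr_p_g(p, s_diag, g):
    """tr(P g(S)) when S = diag(s_diag): only the diagonal of P contributes."""
    return sum(p[i][i] * g(s) for i, s in enumerate(s_diag))

# P is positive (strictly diagonally dominant with positive diagonal).
P = [[0.5, 0.1, 0.0],
     [0.1, 0.3, 0.1],
     [0.0, 0.1, 0.2]]
s_diag = [1.0, 2.0, 4.0]            # spectrum of S
gamma, Gamma = min(s_diag), max(s_diag)

trace_p = tr_p_g(P, s_diag, lambda s: 1.0)
mean = tr_p_g(P, s_diag, lambda s: s) / trace_p
variance = tr_p_g(P, s_diag, lambda s: s * s) / trace_p - mean ** 2
abs_dev = tr_p_g(P, s_diag, lambda s: abs(s - mean)) / trace_p

# The five terms of (33), from left to right.
chain = [0.0,
         variance,
         0.5 * (Gamma - gamma) * abs_dev,
         0.5 * (Gamma - gamma) * math.sqrt(variance),
         0.25 * (Gamma - gamma) ** 2]
```

A three-point spectrum is used on purpose: with only two eigenvalues the first inequality of (33) becomes an equality, which would make the check less informative.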

Corollary 1. Let Q, P ∈ S₁ (H) with P invertible and such that there exist 0 < r ≤ 1 ≤ R satisfying the condition (38). Then, with χ (Q, P) := [χ² (Q, P)]^{1/2}:

\begin{array}{l} 0 \leq χ^{2} (Q, P) \leq \frac{1}{2} (R - r) D_{V} (Q, P) \leq \frac{1}{2} (R - r) χ (Q, P) \\ \leq \frac{1}{4} {(R - r)}^{2} . \end{array}

(37)

Proof. Utilising the inequality (33) for $S = P^{- \frac{1}{2}} Q P^{- \frac{1}{2}}$, for which tr (PS) = tr (Q) = 1 and tr (PS²) − 1 = χ² (Q, P), we have:

\begin{array}{l} (0 \leq) χ^{2} (Q, P) \leq \frac{1}{2} (R - r) tr (P | P^{- \frac{1}{2}} Q P^{- \frac{1}{2}} - 1_{H} |) \\ \leq \frac{1}{2} (R - r) χ (Q, P) \leq \frac{1}{4} {(R - r)}^{2}, \end{array}

and the inequality (37) is proved. ❚

We observe that if Q, P ∈ S₁ (H) with P invertible and there exist r, R > 0 with:

r 1_{H} \leq P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}} \leq R 1_{H},

(38)

then by the property (20), we get:

r tr (P) \leq tr ({PP}^{- \frac{1}{2}} {QP}^{- \frac{1}{2}}) \leq R t r (P)

showing that r ≤ 1 ≤ R.

The following result provides a simple upper bound for the quantum f-divergence D_f (Q, P).

Theorem 8. Let f be a continuous convex function on [0, ∞) with f (1) = 0. Then, we have:

\begin{array}{l} 0 \leq D_{f} (Q, P) \leq \frac{1}{2} [{f^{'}}_{-} (R) - {f^{'}}_{+} (r)] D_{V} (Q, P) \\ \leq \frac{1}{2} [{f^{'}}_{-} (R) - {f^{'}}_{+} (r)] χ (Q, P) \\ \leq \frac{1}{4} (R - r) [{f^{'}}_{-} (R) - {f^{'}}_{+} (r)] \end{array}

(39)

for any Q, P ∈ S₁ (H) with P invertible and satisfying the condition (38).

Proof. Without loss of generality, we prove the inequality in the case when f is continuously differentiable on (0, ∞).

We have:

\begin{array}{l} tr [P (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}} - 1_{H}) [f^{'} (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}}) - λ 1_{H}]] \\ = tr [P (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}} - 1_{H}) f^{'} (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}})] \end{array}

(40)

for any λ ∈ ℝ and for any Q, P ∈ S₁ (H) with P invertible.

Since f′ is monotonic nondecreasing on [r, R], then:

{f^{'}}_{+} (r) \leq f^{'} (x) \leq {f^{'}}_{-} (R) for any x \in [r, R] .

This implies in the operator order that:

{f^{'}}_{+} (r) 1_{H} \leq f^{'} (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}}) \leq {f^{'}}_{-} (R) 1_{H},

therefore:

| f^{'} (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}}) - \frac{{f^{'}}_{-} (R) + {f^{'}}_{+} (r)}{2} 1_{H} | \leq \frac{1}{2} [{f^{'}}_{-} (R) - {f^{'}}_{+} (r)] 1_{H} .

(41)

From (30) and (40), writing $S : = P^{- \frac{1}{2}} Q P^{- \frac{1}{2}}$ and $m : = \frac{{f^{'}}_{-} (R) + {f^{'}}_{+} (r)}{2}$ for brevity, we have:

\begin{array}{l} 0 \leq tr [P f (S)] \leq tr [P (S - 1_{H}) f^{'} (S)] \\ = tr [P (S - 1_{H}) [f^{'} (S) - m 1_{H}]] \\ = | tr [P (S - 1_{H}) [f^{'} (S) - m 1_{H}]] | \\ \leq tr [P | (S - 1_{H}) [f^{'} (S) - m 1_{H}] |] \\ = tr [P | S - 1_{H} | | f^{'} (S) - m 1_{H} |] \\ \leq \frac{1}{2} [{f^{'}}_{-} (R) - {f^{'}}_{+} (r)] tr [P | S - 1_{H} |] \\ = \frac{1}{2} [{f^{'}}_{-} (R) - {f^{'}}_{+} (r)] D_{V} (Q, P), \end{array}

which proves the first inequality in (39).

The rest follows by (37). ❚

Example 1. 1) If we take f (t) = − ln t, t > 0 in Theorem 8, then we get:

\begin{array}{l} 0 \leq {\tilde{D}}_{KL} (Q, P) \leq \frac{R - r}{2 r R} D_{V} (Q, P) \\ \leq \frac{R - r}{2 r R} χ (Q, P) \leq \frac{{(R - r)}^{2}}{4 r R} \end{array}

(42)

for any Q, P ∈ S₁ (H) with P, Q invertible and satisfying the condition:

r 1_{H} \leq P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}} \leq R 1_{H},

(43)

with r > 0.

2) With the same conditions as in 1) for Q, P and if we take f (t) = t ln t, t > 0 in Theorem 8, then we get:

\begin{array}{l} 0 \leq D_{KL} (Q, P) \leq \frac{1}{2} \ln (\frac{R}{r}) D_{V} (Q, P) \\ \leq \frac{1}{2} \ln (\frac{R}{r}) χ (Q, P) \leq \frac{1}{4} (R - r) \ln (\frac{R}{r}) . \end{array}

(44)

3) If we take in (39) $f (t) = f_{q} (t) = \frac{1 - t^{q}}{1 - q}$, then we get:

\begin{array}{l} 0 \leq D_{f_{q}} (Q, P) \leq \frac{q}{2 (1 - q)} (\frac{R^{1 - q} - r^{1 - q}}{R^{1 - q} r^{1 - q}}) D_{V} (Q, P) \\ \leq \frac{q}{2 (1 - q)} (\frac{R^{1 - q} - r^{1 - q}}{R^{1 - q} r^{1 - q}}) χ (Q, P) \\ \leq \frac{q}{4 (1 - q)} (\frac{R^{1 - q} - r^{1 - q}}{R^{1 - q} r^{1 - q}}) (R - r) \end{array}

(45)

provided that Q, P ∈ S₁ (H), with P, Q invertible and satisfying the condition (43).

We have the following upper bound, as well:

Theorem 9. Let f : [0, ∞) → ℝ be a continuous convex function that is normalized. If Q, P ∈ S₁ (H), with P invertible, and there exists R ≥ 1 ≥ r ≥ 0 such that the condition (38) is satisfied, then:

0 \leq D_{f} (Q, P) \leq \frac{(R - 1) f (r) + (1 - r) f (R)}{R - r} .

(46)

Proof. By the convexity of f, we have:

f (t) = f (\frac{(R - t) r + (t - r) R}{R - r}) \leq \frac{(R - t) f (r) + (t - r) f (R)}{R - r}

for any t ∈ [r, R].

This inequality implies the following inequality in the operator order of B (H):

f (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}}) \leq \frac{(R 1_{H} - P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}}) f (r) + (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}} - r 1_{H}) f (R)}{R - r},

(47)

for Q, P ∈ S₁ (H), with P invertible, and R ≥ 1 ≥ r ≥ 0 such that the condition (38) is satisfied.

Utilising the property (20), we get from (47) that:

\begin{array}{l} tr [P f (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}})] \\ \leq \frac{f (r)}{R - r} tr [P (R 1_{H} - P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}})] + \frac{f (R)}{R - r} tr [P (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}} - r 1_{H})] \\ = \frac{(R - 1) f (r) + (1 - r) f (R)}{R - r}, \end{array}

(48)

and the inequality (46) is thus proven. ❚
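The bound (46) is the chord of f over [r, R] evaluated at t = 1. A minimal numerical sketch follows, again assuming commuting (diagonal) P and Q, so that the divergence is the sum of p_i f(q_i / p_i) and condition (38) is read as r ≤ q_i / p_i ≤ R. The data and the names `d_f` and `chord_bound` are hypothetical. The two closed forms are the logarithmic expressions that (46) yields for f(t) = t ln t and f(t) = − ln t.

```python
import math

def d_f(f, q, p):
    """tr[P f(P^{-1/2} Q P^{-1/2})] for diagonal P, Q with eigenvalues p_i, q_i."""
    return sum(pi * f(qi / pi) for qi, pi in zip(q, p))

def chord_bound(f, r, R):
    """Right-hand side of (46): the chord of f over [r, R], evaluated at t = 1."""
    return ((R - 1) * f(r) + (1 - r) * f(R)) / (R - r)

def f_kl(t):
    return t * math.log(t)

def f_kl_tilde(t):
    return -math.log(t)

# Hypothetical commuting density operators, given by their eigenvalues.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
ratios = [qi / pi for qi, pi in zip(q, p)]
r, R = min(ratios), max(ratios)   # here R > 1 > r > 0

d_kl, bound_kl = d_f(f_kl, q, p), chord_bound(f_kl, r, R)
d_tilde, bound_tilde = d_f(f_kl_tilde, q, p), chord_bound(f_kl_tilde, r, R)

# Closed forms of the same two bounds, written as logarithms of power products.
closed_kl = math.log(r ** ((R - 1) * r / (R - r)) * R ** (R * (1 - r) / (R - r)))
closed_tilde = math.log(r ** ((1 - R) / (R - r)) * R ** ((r - 1) / (R - r)))

print(d_kl, bound_kl, closed_kl)
print(d_tilde, bound_tilde, closed_tilde)
```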

Remark 3. If we take in (46) f (t) = t² − 1, then we get:

0 \leq χ^{2} (Q, P) \leq (R - 1) (1 - r) \frac{R + r + 2}{R - r}

(49)

for Q, P ∈ S₁ (H), with P invertible and satisfying the condition (38).

If we take in (46) f (t) = t ln t, then we get the inequality:

0 \leq D_{KL} (Q, P) \leq \ln [r^{\frac{(R - 1) r}{R - r}} R^{\frac{R (1 - r)}{R - r}}]

(50)

provided that Q, P ∈ S₁ (H), with P, Q invertible and satisfying the condition (38).

With the same assumptions for P, Q, if we take in (46) f (t) = − ln t, then we get the inequality:

0 \leq {\tilde{D}}_{KL} (Q, P) \leq \ln [r^{\frac{1 - R}{R - r}} R^{\frac{r - 1}{R - r}}] .

(51)

5. Further Upper Bounds

We also have:

Theorem 10. Let f : [0, ∞) → ℝ be a continuous convex function that is normalized. If Q, P ∈ S₁ (H), with P invertible, and there exists R > 1 > r ≥ 0 such that the condition (38) is satisfied, then:

\begin{array}{l} 0 \leq D_{f} (Q, P) \leq \frac{(R - 1) (1 - r)}{R - r} Ψ_{f} (1; r, R) \\ \leq \frac{(R - 1) (1 - r)}{R - r} \sup_{t \in (r, R)} Ψ_{f} (t; r, R) \\ \leq (R - 1) (1 - r) \frac{{f^{'}}_{-} (R) - {f^{'}}_{+} (r)}{R - r} \\ \leq \frac{1}{4} (R - r) [{f^{'}}_{-} (R) - {f^{'}}_{+} (r)] \end{array}

(52)

where Ψ_f (·; r, R) : (r, R) → ℝ is defined by:

Ψ_{f} (t; r, R) = \frac{f (R) - f (t)}{R - t} - \frac{f (t) - f (r)}{t - r} .

(53)

We also have:

\begin{array}{l} 0 \leq D_{f} (Q, P) \leq \frac{(R - 1) (1 - r)}{R - r} Ψ_{f} (1; r, R) \\ \leq \frac{1}{4} (R - r) Ψ_{f} (1; r, R) \\ \leq \frac{1}{4} (R - r) \sup_{t \in (r, R)} Ψ_{f} (t; r, R) \\ \leq \frac{1}{4} (R - r) [{f^{'}}_{-} (R) - {f^{'}}_{+} (r)] . \end{array}

(54)

Proof. By denoting:

Δ_{f} (t; r, R) : = \frac{(t - r) f (R) + (R - t) f (r)}{R - r} - f (t), t \in [r, R]

we have:

\begin{array}{l} Δ_{f} (t; r, R) = \frac{(t - r) f (R) + (R - t) f (r) - (R - r) f (t)}{R - r} \\ = \frac{(t - r) f (R) + (R - t) f (r) - (R - t + t - r) f (t)}{R - r} \\ = \frac{(t - r) [f (R) - f (t)] - (R - t) [f (t) - f (r)]}{R - r} \\ = \frac{(R - t) (t - r)}{R - r} Ψ_{f} (t; r, R) \end{array}

(55)

for any t ∈ (r, R).

From the proof of Theorem 9 and since f (1) = 0, we have:

\begin{array}{l} tr [P f (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}})] \leq \frac{(R - 1) f (r) + (1 - r) f (R)}{R - r} - f (1) \\ = Δ_{f} (1; r, R) = \frac{(R - 1) (1 - r)}{R - r} Ψ_{f} (1; r, R) \end{array}

for any Q, P ∈ S₁ (H), with P invertible, and R > 1 > r ≥ 0 such that the condition (38) is valid.

Since:

\begin{array}{l} Ψ_{f} (1; r, R) \leq \sup_{t \in (r, R)} Ψ_{f} (t; r, R) \\ = \sup_{t \in (r, R)} [\frac{f (R) - f (t)}{R - t} - \frac{f (t) - f (r)}{t - r}] \\ \leq \sup_{t \in (r, R)} [\frac{f (R) - f (t)}{R - t}] + \sup_{t \in (r, R)} [- \frac{f (t) - f (r)}{t - r}] \\ = \sup_{t \in (r, R)} [\frac{f (R) - f (t)}{R - t}] - \inf_{t \in (r, R)} [\frac{f (t) - f (r)}{t - r}] \\ = {f^{'}}_{-} (R) - {f^{'}}_{+} (r), \end{array}

(56)

and, obviously:

\frac{1}{R - r} (R - 1) (1 - r) \leq \frac{1}{4} (R - r),

(57)

then by (55)–(57), we have the desired result (52).

The rest is obvious. ❚
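The identity (55) and the ordering of the four bounds in (52) can be illustrated numerically. The sketch below takes f(t) = t ln t with illustrative constants r = 0.5 and R = 2, and approximates the supremum of Ψ_f on a finite grid, which is an approximation rather than the exact supremum. The function names `psi` and `delta` are hypothetical labels for (53) and for Δ_f.

```python
import math

def psi(f, t, r, R):
    """Psi_f(t; r, R) from (53): difference of the two chord slopes at t."""
    return (f(R) - f(t)) / (R - t) - (f(t) - f(r)) / (t - r)

def delta(f, t, r, R):
    """Delta_f(t; r, R): gap between the chord of f over [r, R] and f at t."""
    return ((t - r) * f(R) + (R - t) * f(r)) / (R - r) - f(t)

def f(t):
    return t * math.log(t)       # normalized convex function, f(1) = 0

def f_prime(t):
    return math.log(t) + 1.0     # derivative of t ln t

r, R = 0.5, 2.0                  # illustrative constants with R > 1 > r > 0
n = 300
# Interior grid points of (r, R); t = 1 is added explicitly.
grid = [r + k * (R - r) / n for k in range(1, n)] + [1.0]

# Identity (55), checked pointwise on the grid.
identity_error = max(
    abs(delta(f, t, r, R) - (R - t) * (t - r) / (R - r) * psi(f, t, r, R))
    for t in grid
)

# The four upper bounds in (52), with the supremum approximated on the grid.
factor = (R - 1) * (1 - r) / (R - r)
slope_gap = f_prime(R) - f_prime(r)
chain = [
    factor * psi(f, 1.0, r, R),
    factor * max(psi(f, t, r, R) for t in grid),
    factor * slope_gap,
    0.25 * (R - r) * slope_gap,
]
print(identity_error, chain)
```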

Remark 4. If we consider the convex normalized function f (t) = t² − 1, then:

Ψ_{f} (t; r, R) = \frac{R^{2} - t^{2}}{R - t} - \frac{t^{2} - r^{2}}{t - r} = R - r, t \in (r, R)

and we get from (52) the simple inequality:

0 \leq χ^{2} (Q, P) \leq (R - 1) (1 - r)

(58)

for Q, P ∈ S₁ (H), with P invertible and satisfying the condition (38), which is better than (49).

If we take the convex normalized function f (t) = t⁻¹ − 1, then we have:

Ψ_{f} (t; r, R) = \frac{R^{- 1} - t^{- 1}}{R - t} - \frac{t^{- 1} - r^{- 1}}{t - r} = \frac{R - r}{r R t}, t \in [r, R]

Furthermore:

D_{f} (Q, P) = χ^{2} (P, Q) .

Using (52), we get:

(0 \leq) χ^{2} (P, Q) \leq \frac{(R - 1) (1 - r)}{R r}

(59)

for Q, P ∈ S₁ (H), with Q invertible and satisfying the condition (38).

If we consider the convex function f (t) = − ln t defined on [r, R] ⊂ (0, ∞), then:

\begin{array}{l} Ψ_{f} (t; r, R) = \frac{- \ln R + \ln t}{R - t} - \frac{- \ln t + \ln r}{t - r} \\ = \frac{(R - r) \ln t - (R - t) \ln r - (t - r) \ln R}{(R - t) (t - r)} \\ = \ln {(\frac{t^{R - r}}{r^{R - t} R^{t - r}})}^{\frac{1}{(R - t) (t - r)}}, t \in (r, R) . \end{array}

Then, by (52), we have:

(0 \leq) {\tilde{D}}_{KL} (Q, P) \leq \ln [r^{\frac{1 - R}{R - r}} R^{\frac{r - 1}{R - r}}] \leq \frac{(R - 1) (1 - r)}{r R}

(60)

for Q, P ∈ S₁ (H), with P, Q invertible and satisfying the condition (38).

If we consider the convex function f (t) = t ln t defined on [r, R] ⊂ (0, ∞), then:

Ψ_{f} (t; r, R) = \frac{R \ln R - t \ln t}{R - t} - \frac{t \ln t - r \ln r}{t - r}, t \in (r, R),

which gives that:

Ψ_{f} (1; r, R) = \frac{R \ln R}{R - 1} + \frac{r \ln r}{1 - r} .

Using (52), we get:

\begin{array}{l} (0 \leq) D_{KL} (Q, P) \leq \ln [R^{\frac{(1 - r) R}{R - r}} r^{\frac{(R - 1) r}{R - r}}] \\ \leq (R - 1) (1 - r) \ln [{(\frac{R}{r})}^{\frac{1}{R - r}}] \end{array}

(61)

for Q, P ∈ S₁ (H), with P, Q invertible and satisfying the condition (38).
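The closed forms of Ψ_f for f(t) = t² − 1 and f(t) = t⁻¹ − 1, together with the bounds (58), (59) and (60), can be checked on a small example. This is a sketch under the assumption that P and Q commute (diagonal case), with condition (38) read as r ≤ q_i / p_i ≤ R. The eigenvalues and helper names are illustrative only.

```python
import math

def psi(f, t, r, R):
    """Psi_f(t; r, R) from (53)."""
    return (f(R) - f(t)) / (R - t) - (f(t) - f(r)) / (t - r)

def d_f(f, q, p):
    """tr[P f(P^{-1/2} Q P^{-1/2})] for diagonal P, Q with eigenvalues p_i, q_i."""
    return sum(pi * f(qi / pi) for qi, pi in zip(q, p))

def f_sq(t):
    return t * t - 1.0           # gives chi^2(Q, P)

def f_inv(t):
    return 1.0 / t - 1.0         # gives chi^2(P, Q)

def f_log(t):
    return -math.log(t)          # gives the reversed Kullback-Leibler term

# Hypothetical commuting density operators, given by their eigenvalues.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
ratios = [qi / pi for qi, pi in zip(q, p)]
r, R = min(ratios), max(ratios)

# Closed forms of Psi_f from Remark 4, compared at a few interior points.
ts = [0.9, 1.0, 1.2]
psi_sq_error = max(abs(psi(f_sq, t, r, R) - (R - r)) for t in ts)
psi_inv_error = max(abs(psi(f_inv, t, r, R) - (R - r) / (r * R * t)) for t in ts)

bound_58 = (R - 1) * (1 - r)
bound_59 = (R - 1) * (1 - r) / (R * r)
bound_60 = math.log(r ** ((1 - R) / (R - r)) * R ** ((r - 1) / (R - r)))

print(d_f(f_sq, q, p), bound_58)
print(d_f(f_inv, q, p), bound_59)
print(d_f(f_log, q, p), bound_60, bound_59)
```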

We also have:

Theorem 11. Let f : [0, ∞) → ℝ be a continuous convex function that is normalized. If Q, P ∈ S₁ (H), with P invertible, and there exists R > 1 > r ≥ 0 such that the condition (38) is satisfied, then:

\begin{array}{l} 0 \leq D_{f} (Q, P) \\ \leq 2 \max \{\frac{R - 1}{R - r}, \frac{1 - r}{R - r}\} [\frac{f (r) + f (R)}{2} - f (\frac{r + R}{2})] \\ \leq 2 [\frac{f (r) + f (R)}{2} - f (\frac{r + R}{2})] . \end{array}

(62)

Proof. We recall the following result (see for instance [37]) that provides a refinement and a reverse for the weighted Jensen’s discrete inequality:

\begin{array}{l} n \min_{i \in \{1, \dots, n\}} \{p_{i}\} [\frac{1}{n} \sum_{i = 1}^{n} f (x_{i}) - f (\frac{1}{n} \sum_{i = 1}^{n} x_{i})] \\ \leq \frac{1}{P_{n}} \sum_{i = 1}^{n} p_{i} f (x_{i}) - f (\frac{1}{P_{n}} \sum_{i = 1}^{n} p_{i} x_{i}) \\ \leq n \max_{i \in \{1, \dots, n\}} \{p_{i}\} [\frac{1}{n} \sum_{i = 1}^{n} f (x_{i}) - f (\frac{1}{n} \sum_{i = 1}^{n} x_{i})], \end{array}

(63)

where f : C → ℝ is a convex function defined on the convex subset C of the linear space X, {x_i}_{i ∈ {1,…,n}} ⊂ C are vectors and {p_i}_{i ∈ {1,…,n}} are nonnegative numbers with

P_{n} : = \sum_{i = 1}^{n} p_{i} > 0

.

For n = 2, we deduce from (63) that:

\begin{array}{l} 2 \min \{s, 1 - s\} [\frac{f (x) + f (y)}{2} - f (\frac{x + y}{2})] \\ \leq s f (x) + (1 - s) f (y) - f (s x + (1 - s) y) \\ \leq 2 \max \{s, 1 - s\} [\frac{f (x) + f (y)}{2} - f (\frac{x + y}{2})] \end{array}

(64)

for any x, y ∈ C and s ∈ [0, 1].

Now, if we use the second inequality in (64) for x = r, y = R,

s = \frac{R - t}{R - r}

with t ∈ [r, R], then we have:

\begin{array}{l} \frac{(R - t) f (r) + (t - r) f (R)}{R - r} - f (t) \\ \leq 2 \max \{\frac{R - t}{R - r}, \frac{t - r}{R - r}\} [\frac{f (r) + f (R)}{2} - f (\frac{r + R}{2})] \\ \leq 2 [\frac{f (r) + f (R)}{2} - f (\frac{r + R}{2})] \end{array}

(65)

for any t ∈ [r, R].

This implies that:

\begin{array}{l} tr [P f (P^{- \frac{1}{2}} {QP}^{- \frac{1}{2}})] \\ \leq \frac{(R - 1) f (r) + (1 - r) f (R)}{R - r} \\ \leq 2 \max \{\frac{R - 1}{R - r}, \frac{1 - r}{R - r}\} [\frac{f (r) + f (R)}{2} - f (\frac{r + R}{2})] \\ \leq 2 [\frac{f (r) + f (R)}{2} - f (\frac{r + R}{2})] \end{array}

and the proof is completed. ❚
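The two-point inequality (64) and the resulting bound (62) lend themselves to a quick numerical check. The sketch below uses f(t) = − ln t and the same hypothetical commuting (diagonal) operators as a test case, so the divergence is the sum of p_i f(q_i / p_i). The helper names `jensen_gap` and `midpoint_gap` are illustrative.

```python
import math

def jensen_gap(f, x, y, s):
    """s f(x) + (1 - s) f(y) - f(s x + (1 - s) y), the middle term of (64)."""
    return s * f(x) + (1 - s) * f(y) - f(s * x + (1 - s) * y)

def midpoint_gap(f, x, y):
    """[f(x) + f(y)] / 2 - f((x + y) / 2), the bracket in (62) and (64)."""
    return (f(x) + f(y)) / 2 - f((x + y) / 2)

def d_f(f, q, p):
    """tr[P f(P^{-1/2} Q P^{-1/2})] for diagonal P, Q with eigenvalues p_i, q_i."""
    return sum(pi * f(qi / pi) for qi, pi in zip(q, p))

def f(t):
    return -math.log(t)

# Hypothetical commuting density operators, given by their eigenvalues.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
ratios = [qi / pi for qi, pi in zip(q, p)]
r, R = min(ratios), max(ratios)

mid = midpoint_gap(f, r, R)

# Inequality (64) with x = r, y = R on a grid of weights s in [0, 1].
tol = 1e-12
holds_64 = all(
    2 * min(s, 1 - s) * mid - tol
    <= jensen_gap(f, r, R, s)
    <= 2 * max(s, 1 - s) * mid + tol
    for s in (k / 20 for k in range(21))
)

# The two upper bounds in (62).
weight = max((R - 1) / (R - r), (1 - r) / (R - r))
divergence = d_f(f, q, p)
bounds_62 = (2 * weight * mid, 2 * mid)
print(holds_64, divergence, bounds_62)
```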

Remark 5. If we take in (62) f (t) = t⁻¹ − 1, then we have:

0 \leq χ^{2} (P, Q) \leq \max \{R - 1, 1 - r\} \frac{R - r}{r R (r + R)}

(66)

for Q, P ∈ S₁ (H), with P invertible and satisfying the condition (38).

If we take in (62) f (t) = − ln t, then we have:

\begin{array}{l} 0 \leq {\tilde{D}}_{KL} (Q, P) \leq \max \{\frac{R - 1}{R - r}, \frac{1 - r}{R - r}\} \ln (\frac{{(R + r)}^{2}}{4 r R}) \\ \leq \ln (\frac{{(R + r)}^{2}}{4 r R}) \end{array}

(67)

for Q, P ∈ S₁ (H), with P invertible and satisfying the condition (38).

From (42), we have the following absolute upper bound:

0 \leq {\tilde{D}}_{KL} (Q, P) \leq \frac{{(R - r)}^{2}}{4 r R}

(68)

for Q, P ∈ S₁ (H), with P invertible and satisfying the condition (38).

Utilising the elementary inequality ln x ≤ x − 1, x > 0, we have that:

\ln (\frac{{(R + r)}^{2}}{4 r R}) \leq \frac{{(R - r)}^{2}}{4 r R},

which shows that (67) is better than (68).
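To give a sense of how the two absolute bounds compare, the short sketch below tabulates the right-hand sides of (67) and (68) for a few illustrative pairs (r, R). The pairs are arbitrary choices, not values from the paper.

```python
import math

def bound_67(r, R):
    """Absolute bound in (67): ln((R + r)^2 / (4 r R))."""
    return math.log((R + r) ** 2 / (4 * r * R))

def bound_68(r, R):
    """Absolute bound in (68): (R - r)^2 / (4 r R)."""
    return (R - r) ** 2 / (4 * r * R)

# Illustrative pairs with R > 1 > r > 0, from nearly equal to far apart.
pairs = [(0.9, 1.1), (0.5, 2.0), (0.1, 10.0)]
table = [(r, R, bound_67(r, R), bound_68(r, R)) for r, R in pairs]

for r, R, b67, b68 in table:
    print(f"r={r}, R={R}: (67) gives {b67:.6f}, (68) gives {b68:.6f}")
```

The gap between the two bounds widens as R / r grows, since (67) grows logarithmically in that ratio while (68) grows linearly.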

6. Conclusions

In this paper we have introduced a new quantum f-divergence for trace class operators in Hilbert Spaces. It is shown that for normalised convex functions it is nonnegative. Some upper bounds are provided. Applications for some classes of convex functions of interest are also given.

Acknowledgments

The author would like to thank the anonymous referees for valuable comments that have been implemented in the final version of the paper.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Csiszár, I. Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten. Magyar Tud. Akad. Mat. Kutató Int. Közl 1963, 8, 85–108. (in German).
2. Liese, F.; Vajda, I. Convex Statistical Distances. In Texte zur Mathematik, Band 95; Teubner: Leipzig, Germany, 1987.
3. Cerone, P.; Dragomir, S.S.; Österreicher, F. Bounds on extended f-divergences for a variety of classes. Kybernetika 2004, 40, 745–756.
4. Cerone, P.; Dragomir, S.S. Approximation of the integral mean divergence and f-divergence via mean results. Math. Comput. Modelling 2005, 42, 207–219.
5. Dragomir, S.S. Some inequalities for (m, M)-convex mappings and applications for the Csiszár Φ-divergence in information theory. Math. J. Ibaraki Univ 2001, 33, 35–50.
6. Dragomir, S.S. Some inequalities for two Csiszár divergences and applications. Mat. Bilten 2001, 25, 73–90.
7. Dragomir, S.S. An upper bound for the Csiszár f-divergence in terms of the variational distance and applications. Panamer. Math. J 2002, 12, 43–54.
8. Dragomir, S.S. Upper and lower bounds for Csiszár f-divergence in terms of Hellinger discrimination and applications. Nonlinear Anal. Forum 2002, 7, 1–13.
9. Dragomir, S.S. Bounds for f-divergences under likelihood ratio constraints. Appl. Math 2003, 48, 205–223.
10. Dragomir, S.S. New inequalities for Csiszár divergence and applications. Acta Math. Vietnam 2003, 28, 123–134.
11. Dragomir, S.S. A generalized f-divergence for probability vectors and applications. Panamer. Math. J 2003, 13, 61–69.
12. Dragomir, S.S. Some inequalities for the Csiszár φ -divergence when φ is an L-Lipschitzian function and applications. Ital. J. Pure Appl. Math No. 2004, 15, 57–76.
13. Dragomir, S.S. A converse inequality for the Csiszár Φ-divergence. Tamsui Oxf. J. Math. Sci 2004, 20, 35–53.
14. Dragomir, S.S. Some general divergence measures for probability distributions. Acta Math. Hung 2005, 109, 331–345.
15. Dragomir, S.S. A refinement of Jensen’s inequality with applications for f-divergence measures. Taiwan. J. Math 2010, 14, 153–164.
16. Kafka, P.; Österreicher, F.; Vincze, I. On powers of f-divergence defining a distance. Stud. Sci. Math. Hung. 1991, 26, 415–422.
17. Österreicher, F.; Vajda, I. A new class of metric divergences on probability spaces and its applicability in statistics. Ann. Inst. Stat. Math 2003, 55, 639–653.
18. Ruskai, M.B. Inequalities for traces on von Neumann algebras. Commun. Math. Phys 1972, 26, 280–289.
19. Simon, B. Trace Ideals and Their Applications; Cambridge University Press: Cambridge, UK, 1979.
20. Chang, D. A matrix trace inequality for products of Hermitian matrices. J. Math. Anal. Appl 1999, 237, 721–725.
21. Coop, I.D. On matrix trace inequalities and related topics for products of Hermitian matrix. J. Math. Anal. Appl 1994, 188, 999–1001.
22. Neudecker, H. A matrix trace inequality. J. Math. Anal. Appl 1992, 166, 302–303.
23. Yang, Y. A matrix trace inequality. J. Math. Anal. Appl 1988, 133, 573–574.
24. Ando, T. Matrix Young inequalities. Oper. Theory Adv. Appl 1995, 75, 33–38.
25. Bellman, R. Some inequalities for positive definite matrices. In General Inequalities 2; Beckenbach, E.F., Ed.; Birkhäuser Basel: Basel, Switzerland, 1980; pp. 89–90.
26. Belmega, E.V.; Jungers, M.; Lasaulce, S. A generalization of a trace inequality for positive definite matrices. Aust. J. Math. Anal. Appl. 2010, 7, 5.
27. Furuichi, S.; Lin, M. Refinements of the trace inequality of Belmega, Lasaulce and Debbah. Aust. J. Math. Anal. Appl. 2010, 7, 4.
28. Lee, H.D. On some matrix inequalities. Korean J. Math 2008, 16, 565–571.
29. Liu, L. A trace class operator inequality. J. Math. Anal. Appl 2007, 328, 1484–1486.
30. Shebrawi, K.; Albadawi, H. Operator norm inequalities of Minkowski type. J. Inequal. Pure Appl. Math 2008, 9, 1–10.
31. Ulukök, Z.; Türkmen, R. On some matrix trace inequalities. J. Inequal. Appl 2010, 201486, 1–201486–8.
32. Manjegani, S. Hölder and Young inequalities for the trace of operators. Positivity 2007, 11, 239–250.
33. Hiai, F.; Petz, D. From quasi-entropy to various quantum information quantities. Publ. Res. Inst. Math. Sci 2012, 48, 525–542.
34. Hiai, F.; Mosonyi, M.; Petz, D.; Bény, C. Quantum f-divergences and error correction. Rev. Math. Phys 2011, 23, 691–747.
35. Petz, D. From quasi-entropy. Ann. Univ. Sci. Bp. Eötvös Sect. Math 2012, 55, 81–92.
36. Petz, D. From f-divergence to quantum quasi-entropies and their use. Entropy 2010, 12, 304–325.
37. Dragomir, S.S. Bounds for the normalized Jensen functional. Bull. Aust. Math. Soc. 2006, 74, 471–476.
38. De Barra, G. Measure Theory and Integration; Ellis Horwood Ltd.: Chichester, UK, 1981.
39. Chen, L.; Wong, C. Inequalities for singular values and traces. Linear Algebra Appl 1992, 171, 109–120.

© 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).
