Quantum key distribution rates from non-symmetric conic optimization

Andrés González Lorente Departamento de Física Teórica, Atómica y Óptica, Universidad de Valladolid, 47011 Valladolid, Spain    Pablo V. Parellada Departamento de Física Teórica, Atómica y Óptica, Universidad de Valladolid, 47011 Valladolid, Spain    Miguel Castillo-Celeita Departamento de Física Teórica, Atómica y Óptica, Universidad de Valladolid, 47011 Valladolid, Spain    Mateus Araújo Departamento de Física Teórica, Atómica y Óptica, Universidad de Valladolid, 47011 Valladolid, Spain
(24th July 2024)
Abstract

Computing key rates in quantum key distribution (QKD) numerically is essential to unlock more powerful protocols, that use more sophisticated measurement bases or quantum systems of higher dimension. It is a difficult optimization problem, that depends on minimizing a convex non-linear function: the (quantum) relative entropy. Standard conic optimization techniques have for a long time been unable to handle the relative entropy cone, as it is a non-symmetric cone, and the standard algorithms can only handle symmetric ones. Recently, however, a practical algorithm has been discovered for optimizing over non-symmetric cones, including the relative entropy. Here we adapt this algorithm to the problem of computation of key rates, obtaining an efficient technique for lower bounding them. In comparison to previous techniques it has the advantages of flexibility, ease of use, and above all performance.

1 Introduction

Secret key rates in QKD are usually calculated analytically [gisin2002, scarani2009, xu2020, pirandola2020]. This is only tractable for simple protocols with highly symmetric measurement bases, such as BB84 [bennett1984], the six-state protocol [bruss1998], or their generalizations to mutually unbiased bases (MUBs) in higher dimensions [cerf2002, sheridan2010]. Recently there has been intense interest in numerical techniques for the computation of key rates in order to unlock more sophisticated protocols, that use more parameters, arbitrary measurement bases, and higher dimensions.

The best existing techniques are the Gauss-Newton technique from [hu2022], and the Gauss-Radau technique from [araujo22]. The former consists of an implementation of a bespoke Gauss-Newton solver for the key rate problem, and the latter consists of reducing the problem to a convergent hierarchy of semidefinite programs. Both allow for the computation of optimal key rates for arbitrary protocols, but suffer from several disadvantages. The main one, common to both techniques, is low performance, which limits their applicability to protocols with low dimension. Specific problems of the Gauss-Newton technique are that it doesn’t use standard conic optimization techniques, which makes it challenging to implement and inflexible. For example, it cannot handle protocols that use anything other than equality constraints. A specific problem of the Gauss-Radau technique is that for any fixed level of the hierarchy it only delivers an approximation of the key rate, and the cost of achieving a given precision can be prohibitive.

Here we introduce a new technique that is several times faster than both techniques, while also solving their specific issues. Unlike the Gauss-Newton technique, it uses standard conic optimization, making it easy to implement and combine with other kinds of constraints that might be required by the protocols. Unlike the Gauss-Radau technique, it doesn’t rely on reformulating the problem as a sequence of semidefinite programs, but instead solves it directly.

We use the framework from [winick2018] to formulate the key rate problem as a convex optimization problem, the minimization of a (quantum) relative entropy over a convex domain. Using the relative entropy cone, which is a non-symmetric cone, it becomes a conic optimization problem. Although for a long time algorithms for optimizing over non-symmetric cones have been known [nesterov1999, tuncel2001, nesterov2012], they were hardly practical, as they required for instance a tractable barrier function for the dual cone. This changed with the discovery of Skajaa and Ye’s algorithm [skajaa2015, papp2017], and with it solving the key rate problem becomes in principle possible. There is, however, a technical difficulty: this formulation of the key rate problem is often not strictly feasible, and such problems cannot be reliably solved with conic optimization methods. The standard technique to deal with this, called facial reduction [drusvyatskiy2017], does apply, but the resulting reduced problem is no longer a minimization of the relative entropy. For this reason we introduce a new convex cone to deal directly with the reduced problem.

We implemented our new cone in the programming language Julia [bezanson2017] as an extension to the solver Hypatia [coey2022], which implements an improved version of the Skajaa and Ye algorithm [coey2023]. This solver is interfaced by the modeller JuMP [lubin2023], making it easy to use, and takes advantage of Julia’s flexible type system, allowing the user to solve the optimization problems using double, double-double, quadruple, or arbitrary precision.

The paper is organized as follows. In Section 2 we reformulate the key rate problem as a conic problem and introduce the QKD cone. In Section 3 we propose a barrier function and obtain the derivatives necessary to implement the cone. In Section 4 we show how to formulate some QKD protocols to compute the key rates with our technique. In Section 5 we benchmark our technique against the Gauss-Newton and Gauss-Radau techniques.

2 QKD rate as a conic problem

In quantum key distribution the asymptotic key rate is given by the Devetak-Winter rate [devetak2005]:

KH(A|E)H(A|B),𝐾𝐻conditional𝐴𝐸𝐻conditional𝐴𝐵K\geq H(A|E)-H(A|B),italic_K ≥ italic_H ( italic_A | italic_E ) - italic_H ( italic_A | italic_B ) , (1)

where H(A|E)𝐻conditional𝐴𝐸H(A|E)italic_H ( italic_A | italic_E ) is the conditional von Neumann entropy between Alice and Eve, and H(A|B)𝐻conditional𝐴𝐵H(A|B)italic_H ( italic_A | italic_B ) the conditional von Neumann entropy between Alice and Bob. Since H(A|B)𝐻conditional𝐴𝐵H(A|B)italic_H ( italic_A | italic_B ) is completely determined by the measured data, the challenge is to compute H(A|E)𝐻conditional𝐴𝐸H(A|E)italic_H ( italic_A | italic_E ), which has to be minimized over all quantum states compatible with the measured statistics. Following [winick2018], it can be expressed in terms of the relative entropy as

H(A|E)ρABE=D(𝒢(ρAB)𝒵(𝒢(ρAB))),𝐻subscriptconditional𝐴𝐸subscript𝜌𝐴𝐵𝐸𝐷conditional𝒢subscript𝜌𝐴𝐵𝒵𝒢subscript𝜌𝐴𝐵H(A|E)_{\rho_{ABE}}=D(\mathcal{G}(\rho_{AB})\|\mathcal{Z}(\mathcal{G}(\rho_{AB% }))),italic_H ( italic_A | italic_E ) start_POSTSUBSCRIPT italic_ρ start_POSTSUBSCRIPT italic_A italic_B italic_E end_POSTSUBSCRIPT end_POSTSUBSCRIPT = italic_D ( caligraphic_G ( italic_ρ start_POSTSUBSCRIPT italic_A italic_B end_POSTSUBSCRIPT ) ∥ caligraphic_Z ( caligraphic_G ( italic_ρ start_POSTSUBSCRIPT italic_A italic_B end_POSTSUBSCRIPT ) ) ) , (2)

where D()D(\cdot\|\cdot)italic_D ( ⋅ ∥ ⋅ ) is the relative entropy, 𝒢𝒢\mathcal{G}caligraphic_G is a CP map that takes ρABsubscript𝜌𝐴𝐵\rho_{AB}italic_ρ start_POSTSUBSCRIPT italic_A italic_B end_POSTSUBSCRIPT to an extended quantum state that includes the secret key as a subsystem, and 𝒵𝒵\mathcal{Z}caligraphic_Z is a CPTP map that decoheres the key subsystem. 𝒵𝒵\mathcal{Z}caligraphic_Z is necessarily of the form 𝒵(ρ)=iΠiρΠi𝒵𝜌subscript𝑖subscriptΠ𝑖𝜌subscriptΠ𝑖\mathcal{Z}(\rho)=\sum_{i}\Pi_{i}\rho\Pi_{i}caligraphic_Z ( italic_ρ ) = ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT roman_Π start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ρ roman_Π start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT for some projectors ΠisubscriptΠ𝑖\Pi_{i}roman_Π start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT that sum to identity. The optimization problem is thus

minρD(𝒢(ρ)𝒵(𝒢(ρ)))s.t.ρ0,tr(ρ)=1,tr(Ekρ)=pkk\begin{gathered}\min_{\rho}\ D(\mathcal{G}(\rho)\|\mathcal{Z}(\mathcal{G}(\rho% )))\\ \text{s.t.}\quad\rho\succeq 0,\quad\operatorname{tr}(\rho)=1,\\ \operatorname{tr}(E_{k}\rho)=p_{k}\ \forall k\end{gathered}start_ROW start_CELL roman_min start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_D ( caligraphic_G ( italic_ρ ) ∥ caligraphic_Z ( caligraphic_G ( italic_ρ ) ) ) end_CELL end_ROW start_ROW start_CELL s.t. italic_ρ ⪰ 0 , roman_tr ( italic_ρ ) = 1 , end_CELL end_ROW start_ROW start_CELL roman_tr ( italic_E start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT italic_ρ ) = italic_p start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ∀ italic_k end_CELL end_ROW (3)

where Eksubscript𝐸𝑘E_{k}italic_E start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT are the POVMs encoding the QKD protocol, and pksubscript𝑝𝑘p_{k}italic_p start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT are the estimated probabilities. This formulation makes the computation of key rates a convex optimization problem. To make it a conic optimization problem, however, we need to write it in terms of the relative entropy convex cone, namely

𝒦RE=cl{(h,ρ,σ)×n×n;ρ0,σ0,hD(ρσ)},subscript𝒦REclformulae-sequence𝜌𝜎superscript𝑛superscript𝑛formulae-sequencesucceeds𝜌0formulae-sequencesucceeds𝜎0𝐷conditional𝜌𝜎\mathcal{K}_{\text{RE}}=\text{cl}\,\{(h,\rho,\sigma)\in\mathbb{R}\times\mathbb% {H}^{n}\times\mathbb{H}^{n};\,\rho\succ 0,\ \sigma\succ 0,\ h\geq D(\rho\|% \sigma)\},caligraphic_K start_POSTSUBSCRIPT RE end_POSTSUBSCRIPT = cl { ( italic_h , italic_ρ , italic_σ ) ∈ blackboard_R × blackboard_H start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT × blackboard_H start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT ; italic_ρ ≻ 0 , italic_σ ≻ 0 , italic_h ≥ italic_D ( italic_ρ ∥ italic_σ ) } , (4)

where nsuperscript𝑛\mathbb{H}^{n}blackboard_H start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT denotes the space of complex Hermitian matrices. It becomes then

minh,ρhs.t.ρ0,tr(ρ)=1,tr(Ekρ)=pkk,(h,𝒢(ρ),𝒵(𝒢(ρ)))𝒦RE\begin{gathered}\min_{h,\rho}\ h\\ \text{s.t.}\quad\rho\succeq 0,\quad\operatorname{tr}(\rho)=1,\\ \operatorname{tr}(E_{k}\rho)=p_{k}\ \forall k,\\ \Big{(}h,\mathcal{G}(\rho),\mathcal{Z}(\mathcal{G}(\rho))\Big{)}\in\mathcal{K}% _{\text{RE}}\end{gathered}start_ROW start_CELL roman_min start_POSTSUBSCRIPT italic_h , italic_ρ end_POSTSUBSCRIPT italic_h end_CELL end_ROW start_ROW start_CELL s.t. italic_ρ ⪰ 0 , roman_tr ( italic_ρ ) = 1 , end_CELL end_ROW start_ROW start_CELL roman_tr ( italic_E start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT italic_ρ ) = italic_p start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ∀ italic_k , end_CELL end_ROW start_ROW start_CELL ( italic_h , caligraphic_G ( italic_ρ ) , caligraphic_Z ( caligraphic_G ( italic_ρ ) ) ) ∈ caligraphic_K start_POSTSUBSCRIPT RE end_POSTSUBSCRIPT end_CELL end_ROW (5)

In principle it can now be solved by using Skajaa and Ye’s algorithm, but there’s a difficulty: as with any interior-point algorithm, the problem needs to be strictly feasible for the solution to be found reliably and efficiently. In other words, there must exist a point (h,ρ)𝜌(h,\rho)( italic_h , italic_ρ ) that satisfies the equality constraints of the conic problem and is in the interior of all cones we use. This can fail here in two places: the equality constraints tr(Ekρ)=pktrsubscript𝐸𝑘𝜌subscript𝑝𝑘\operatorname{tr}(E_{k}\rho)=p_{k}roman_tr ( italic_E start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT italic_ρ ) = italic_p start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT might be only satisfiable by a quantum state with null eigenvalues, and the map 𝒢𝒢\mathcal{G}caligraphic_G might take a positive definite quantum state to one with null eigenvalues. The map 𝒵𝒵\mathcal{Z}caligraphic_Z, on the other hand, preserves positive definiteness111To see that, assume for the sake of contradiction that |ψket𝜓\mathopen{}\mathclose{{}\left|\psi}\right\rangle| italic_ψ ⟩ is a null eigenvector of 𝒵(ρ)𝒵𝜌\mathcal{Z}(\rho)caligraphic_Z ( italic_ρ ) for some positive definite ρ𝜌\rhoitalic_ρ. Then ψ|𝒵(ρ)|ψ=iψ|ΠiρΠi|ψ=0quantum-operator-product𝜓𝒵𝜌𝜓subscript𝑖quantum-operator-product𝜓subscriptΠ𝑖𝜌subscriptΠ𝑖𝜓0\mathopen{}\mathclose{{}\left\langle\psi}\right|\mathcal{Z}(\rho)\mathopen{}% \mathclose{{}\left|\psi}\right\rangle=\sum_{i}\mathopen{}\mathclose{{}\left% \langle\psi}\right|\Pi_{i}\rho\Pi_{i}\mathopen{}\mathclose{{}\left|\psi}\right% \rangle=0⟨ italic_ψ | caligraphic_Z ( italic_ρ ) | italic_ψ ⟩ = ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ⟨ italic_ψ | roman_Π start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ρ roman_Π start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | italic_ψ ⟩ = 0. Since ρ𝜌\rhoitalic_ρ is positive definite, ψ|ΠiρΠi|ψ>0quantum-operator-product𝜓subscriptΠ𝑖𝜌subscriptΠ𝑖𝜓0\mathopen{}\mathclose{{}\left\langle\psi}\right|\Pi_{i}\rho\Pi_{i}\mathopen{}% \mathclose{{}\left|\psi}\right\rangle>0⟨ italic_ψ | roman_Π start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ρ roman_Π start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | italic_ψ ⟩ > 0 for all nonzero Πi|ψsubscriptΠ𝑖ket𝜓\Pi_{i}\mathopen{}\mathclose{{}\left|\psi}\right\rangleroman_Π start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | italic_ψ ⟩. Therefore, to satisfy the equation we must have Πi|ψ=0isubscriptΠ𝑖ket𝜓0for-all𝑖\Pi_{i}\mathopen{}\mathclose{{}\left|\psi}\right\rangle=0\ \forall iroman_Π start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | italic_ψ ⟩ = 0 ∀ italic_i. Summing over i𝑖iitalic_i and using the fact that iΠi=𝟙subscript𝑖subscriptΠ𝑖1\sum_{i}\Pi_{i}=\mathds{1}∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT roman_Π start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = blackboard_1, we arrive at |ψ=0ket𝜓0\mathopen{}\mathclose{{}\left|\psi}\right\rangle=0| italic_ψ ⟩ = 0, contradiction. [hu2022].

We note that this same difficulty affects other techniques [winick2018, hu2022, araujo22]: in [winick2018] they perturb the singular matrices with identity to make them full rank, whereas in [hu2022, araujo22] they reformulate the problem in terms of the support of the relevant matrices, a procedure known as facial reduction [drusvyatskiy2017]. Facial reduction is more reliable, elegant, and results in a smaller problem than perturbing with identity, so we adopt this approach here as well.

We would thus like to perform facial reduction along the lines of [hu2022]: first find an isometry V𝑉Vitalic_V that encodes the support of the feasible ρ𝜌\rhoitalic_ρ, such that ρ=VσV𝜌𝑉𝜎superscript𝑉\rho=V\sigma V^{\dagger}italic_ρ = italic_V italic_σ italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT for full-rank σ𝜎\sigmaitalic_σ. Then replace the equality constraints with tr(VEkVσ)=:tr(Fkσ)=pk\operatorname{tr}(V^{\dagger}E_{k}V\sigma)=:\operatorname{tr}(F_{k}\sigma)=p_{k}roman_tr ( italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_E start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT italic_V italic_σ ) = : roman_tr ( italic_F start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT italic_σ ) = italic_p start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT, dropping the redundant ones. Finally, find maps 𝒢^^𝒢\widehat{\mathcal{G}}over^ start_ARG caligraphic_G end_ARG and 𝒵^^𝒵\widehat{\mathcal{Z}}over^ start_ARG caligraphic_Z end_ARG such that 𝒢^(σ)^𝒢𝜎\widehat{\mathcal{G}}(\sigma)over^ start_ARG caligraphic_G end_ARG ( italic_σ ) and 𝒵^(σ)^𝒵𝜎\widehat{\mathcal{Z}}(\sigma)over^ start_ARG caligraphic_Z end_ARG ( italic_σ ) are the restrictions of the maps σ𝒢(VσV)maps-to𝜎𝒢𝑉𝜎superscript𝑉\sigma\mapsto\mathcal{G}(V\sigma V^{\dagger})italic_σ ↦ caligraphic_G ( italic_V italic_σ italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) and σ𝒵(𝒢(VσV))maps-to𝜎𝒵𝒢𝑉𝜎superscript𝑉\sigma\mapsto\mathcal{Z}(\mathcal{G}(V\sigma V^{\dagger}))italic_σ ↦ caligraphic_Z ( caligraphic_G ( italic_V italic_σ italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) ) to the support of their ranges.

While that’s easy to do, the resulting problem is no longer an optimization over the relative entropy; in particular D(𝒢^(σ)𝒵^(σ))D(𝒢(ρ)𝒵(𝒢(ρ)))𝐷conditional^𝒢𝜎^𝒵𝜎𝐷conditional𝒢𝜌𝒵𝒢𝜌D(\widehat{\mathcal{G}}(\sigma)\|\widehat{\mathcal{Z}}(\sigma))\neq D(\mathcal% {G}(\rho)\|\mathcal{Z}(\mathcal{G}(\rho)))italic_D ( over^ start_ARG caligraphic_G end_ARG ( italic_σ ) ∥ over^ start_ARG caligraphic_Z end_ARG ( italic_σ ) ) ≠ italic_D ( caligraphic_G ( italic_ρ ) ∥ caligraphic_Z ( caligraphic_G ( italic_ρ ) ) ). Instead, to obtain the reformulated problem we use the following identity [coles2012]:

D(𝒢(ρ)𝒵(𝒢(ρ)))=H(𝒢(ρ))+H(𝒵(𝒢(ρ))),𝐷conditional𝒢𝜌𝒵𝒢𝜌𝐻𝒢𝜌𝐻𝒵𝒢𝜌D(\mathcal{G}(\rho)\|\mathcal{Z}(\mathcal{G}(\rho)))=-H(\mathcal{G}(\rho))+H(% \mathcal{Z}(\mathcal{G}(\rho))),italic_D ( caligraphic_G ( italic_ρ ) ∥ caligraphic_Z ( caligraphic_G ( italic_ρ ) ) ) = - italic_H ( caligraphic_G ( italic_ρ ) ) + italic_H ( caligraphic_Z ( caligraphic_G ( italic_ρ ) ) ) , (6)

where H(ρ)=tr(ρlogρ)𝐻𝜌tr𝜌𝜌H(\rho)=-\operatorname{tr}(\rho\log\rho)italic_H ( italic_ρ ) = - roman_tr ( italic_ρ roman_log italic_ρ ) is the von Neumann entropy, together with the simple observation that H(𝒢(ρ))=H(𝒢^(σ))𝐻𝒢𝜌𝐻^𝒢𝜎H(\mathcal{G}(\rho))=H(\widehat{\mathcal{G}}(\sigma))italic_H ( caligraphic_G ( italic_ρ ) ) = italic_H ( over^ start_ARG caligraphic_G end_ARG ( italic_σ ) ) and H(𝒵(𝒢(ρ)))=H(𝒵^(σ))𝐻𝒵𝒢𝜌𝐻^𝒵𝜎H(\mathcal{Z}(\mathcal{G}(\rho)))=H(\widehat{\mathcal{Z}}(\sigma))italic_H ( caligraphic_Z ( caligraphic_G ( italic_ρ ) ) ) = italic_H ( over^ start_ARG caligraphic_Z end_ARG ( italic_σ ) ). This motivates the definition of a new convex cone:

𝒦QKD𝒢^,𝒵^={(h,σ)×n;σ0,hH(𝒢^(σ))+H(𝒵^(σ))}.superscriptsubscript𝒦QKD^𝒢^𝒵formulae-sequence𝜎superscript𝑛formulae-sequencesucceeds-or-equals𝜎0𝐻^𝒢𝜎𝐻^𝒵𝜎\mathcal{K}_{\text{QKD}}^{\widehat{\mathcal{G}},\widehat{\mathcal{Z}}}=\{(h,% \sigma)\in\mathbb{R}\times\mathbb{H}^{n};\,\sigma\succeq 0,\ h\geq-H(\widehat{% \mathcal{G}}(\sigma))+H(\widehat{\mathcal{Z}}(\sigma))\}.caligraphic_K start_POSTSUBSCRIPT QKD end_POSTSUBSCRIPT start_POSTSUPERSCRIPT over^ start_ARG caligraphic_G end_ARG , over^ start_ARG caligraphic_Z end_ARG end_POSTSUPERSCRIPT = { ( italic_h , italic_σ ) ∈ blackboard_R × blackboard_H start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT ; italic_σ ⪰ 0 , italic_h ≥ - italic_H ( over^ start_ARG caligraphic_G end_ARG ( italic_σ ) ) + italic_H ( over^ start_ARG caligraphic_Z end_ARG ( italic_σ ) ) } . (7)

We emphasize that 𝒢^^𝒢\widehat{\mathcal{G}}over^ start_ARG caligraphic_G end_ARG and 𝒵^^𝒵\widehat{\mathcal{Z}}over^ start_ARG caligraphic_Z end_ARG must come from the reduction of 𝒢𝒢\mathcal{G}caligraphic_G and 𝒵𝒵\mathcal{Z}caligraphic_Z; if we let them be arbitrary CP maps 𝒦QKD𝒢^,𝒵^superscriptsubscript𝒦QKD^𝒢^𝒵\mathcal{K}_{\text{QKD}}^{\widehat{\mathcal{G}},\widehat{\mathcal{Z}}}caligraphic_K start_POSTSUBSCRIPT QKD end_POSTSUBSCRIPT start_POSTSUPERSCRIPT over^ start_ARG caligraphic_G end_ARG , over^ start_ARG caligraphic_Z end_ARG end_POSTSUPERSCRIPT is not even a convex cone in general.

With the cone in hand, we can finally state the reduced problem:

minh,σhs.t.tr(σ)=1,tr(Fkσ)=pkk,(h,σ)𝒦QKD𝒢^,𝒵^\begin{gathered}\min_{h,\sigma}\ h\\ \text{s.t.}\quad\operatorname{tr}(\sigma)=1,\quad\operatorname{tr}(F_{k}\sigma% )=p_{k}\ \forall k,\\ (h,\sigma)\in\mathcal{K}_{\text{QKD}}^{\widehat{\mathcal{G}},\widehat{\mathcal% {Z}}}\end{gathered}start_ROW start_CELL roman_min start_POSTSUBSCRIPT italic_h , italic_σ end_POSTSUBSCRIPT italic_h end_CELL end_ROW start_ROW start_CELL s.t. roman_tr ( italic_σ ) = 1 , roman_tr ( italic_F start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT italic_σ ) = italic_p start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ∀ italic_k , end_CELL end_ROW start_ROW start_CELL ( italic_h , italic_σ ) ∈ caligraphic_K start_POSTSUBSCRIPT QKD end_POSTSUBSCRIPT start_POSTSUPERSCRIPT over^ start_ARG caligraphic_G end_ARG , over^ start_ARG caligraphic_Z end_ARG end_POSTSUPERSCRIPT end_CELL end_ROW (8)

Note that the constraint ρ0succeeds-or-equals𝜌0\rho\succeq 0italic_ρ ⪰ 0 has been dropped; although it’s equivalent to σ0succeeds-or-equals𝜎0\sigma\succeq 0italic_σ ⪰ 0, the latter is redundant with the definition of 𝒦QKD𝒢^,𝒵^superscriptsubscript𝒦QKD^𝒢^𝒵\mathcal{K}_{\text{QKD}}^{\widehat{\mathcal{G}},\widehat{\mathcal{Z}}}caligraphic_K start_POSTSUBSCRIPT QKD end_POSTSUBSCRIPT start_POSTSUPERSCRIPT over^ start_ARG caligraphic_G end_ARG , over^ start_ARG caligraphic_Z end_ARG end_POSTSUPERSCRIPT.

We emphasize that even in the case where no facial reduction is necessary, and 𝒢^=𝒢^𝒢𝒢\widehat{\mathcal{G}}=\mathcal{G}over^ start_ARG caligraphic_G end_ARG = caligraphic_G and 𝒵^=𝒵𝒢^𝒵𝒵𝒢\widehat{\mathcal{Z}}=\mathcal{Z}\circ\mathcal{G}over^ start_ARG caligraphic_Z end_ARG = caligraphic_Z ∘ caligraphic_G, it is still advantageous to use the QKD cone (7) instead of the relative entropy cone (4) for four reasons: (i) the constraint ρ0succeeds-or-equals𝜌0\rho\succeq 0italic_ρ ⪰ 0 can not be dropped in the relative entropy cone unless 𝒢𝒢\mathcal{G}caligraphic_G has a positive inverse, (ii) the QKD cone has one less matrix variable to optimize over, (iii) the derivatives of the barrier function of the QKD cone, that we introduce below, are less computationally demanding, (iv) as the QKD cone explicitly depends on 𝒢^^𝒢\widehat{\mathcal{G}}over^ start_ARG caligraphic_G end_ARG and 𝒵^^𝒵\widehat{\mathcal{Z}}over^ start_ARG caligraphic_Z end_ARG, it can exploit their structure; specifically, we use the fact that 𝒵^^𝒵\widehat{\mathcal{Z}}over^ start_ARG caligraphic_Z end_ARG necessarily has a block diagonal structure to drastically speed up the computation of several derivatives involving it.

3 Implementation of the QKD cone

In order to successfully optimize over a given convex cone 𝒦𝒦\mathcal{K}caligraphic_K, both using Skajaa and Ye’s algorithm or traditional methods for symmetric cones, it’s essential to provide a logarithmically homogeneous self-concordant barrier (LHSCB) function for it [nesterov1997]. It is defined as a function f:int(𝒦):𝑓int𝒦f:\operatorname{int}(\mathcal{K})\to\mathbb{R}italic_f : roman_int ( caligraphic_K ) → blackboard_R such that f(xi)𝑓subscript𝑥𝑖f(x_{i})\to\inftyitalic_f ( italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) → ∞ along every sequence converging to the boundary of 𝒦𝒦\mathcal{K}caligraphic_K, is three times continuously differentiable, strictly convex, and

f(τx)=f(x)νlogτxint(𝒦),τ>0,formulae-sequence𝑓𝜏𝑥𝑓𝑥𝜈𝜏formulae-sequencefor-all𝑥int𝒦for-all𝜏0\displaystyle f(\tau x)=f(x)-\nu\log\tau\quad\forall x\in\operatorname{int}(% \mathcal{K}),\forall\tau>0,italic_f ( italic_τ italic_x ) = italic_f ( italic_x ) - italic_ν roman_log italic_τ ∀ italic_x ∈ roman_int ( caligraphic_K ) , ∀ italic_τ > 0 , (9)
|3f(x)[ξ,ξ,ξ]|2(2f(x)[ξ,ξ])3/2xint(𝒦),ξ.formulae-sequencesuperscript3𝑓𝑥𝜉𝜉𝜉2superscriptsuperscript2𝑓𝑥𝜉𝜉32for-all𝑥int𝒦for-all𝜉\displaystyle|\nabla^{3}f(x)[\xi,\xi,\xi]|\leq 2(\nabla^{2}f(x)[\xi,\xi])^{3/2% }\quad\forall x\in\operatorname{int}(\mathcal{K}),\forall\xi.| ∇ start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_f ( italic_x ) [ italic_ξ , italic_ξ , italic_ξ ] | ≤ 2 ( ∇ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_f ( italic_x ) [ italic_ξ , italic_ξ ] ) start_POSTSUPERSCRIPT 3 / 2 end_POSTSUPERSCRIPT ∀ italic_x ∈ roman_int ( caligraphic_K ) , ∀ italic_ξ . (10)

While most of these properties are straightforward to verify, self-concordance (10) is not, and has only recently been proven for the following barrier function for the relative entropy cone (4) [fawzi2022]:

f(h,ρ,σ)=log(hD(ρσ))logdet(ρ)logdet(σ).𝑓𝜌𝜎𝐷conditional𝜌𝜎logdet𝜌logdet𝜎f(h,\rho,\sigma)=-\log(h-D(\rho\|\sigma))-\operatorname{logdet}(\rho)-% \operatorname{logdet}(\sigma).italic_f ( italic_h , italic_ρ , italic_σ ) = - roman_log ( italic_h - italic_D ( italic_ρ ∥ italic_σ ) ) - roman_logdet ( italic_ρ ) - roman_logdet ( italic_σ ) . (11)

We propose as the barrier function for the QKD cone (7) its obvious analogue:

f(h,ρ)=log(h+H(𝒢^(ρ))H(𝒵^(ρ)))logdet(ρ).𝑓𝜌𝐻^𝒢𝜌𝐻^𝒵𝜌logdet𝜌f(h,\rho)=-\log(h+H(\widehat{\mathcal{G}}(\rho))-H(\widehat{\mathcal{Z}}(\rho)% ))-\operatorname{logdet}(\rho).italic_f ( italic_h , italic_ρ ) = - roman_log ( italic_h + italic_H ( over^ start_ARG caligraphic_G end_ARG ( italic_ρ ) ) - italic_H ( over^ start_ARG caligraphic_Z end_ARG ( italic_ρ ) ) ) - roman_logdet ( italic_ρ ) . (12)

We conjecture that it is also self-concordant based on the similarity and the fact that our extensive numerical tests were successful.

In order to implement the cone (7) in the solver Hypatia, we need to provide the gradient, Hessian, and third-order derivative of the barrier function (12), and choose an initial point [coey2023].

3.1 Gradient, Hessian, and third-order derivative

To obtain the gradient, Hessian, and third-order derivative of the barrier function (12) we adapt the results of [faybusovich2022] to complex matrices, and combine them with the results of [coey2024] for the relative entropy cone. In the code they are needed in vectorized form, as explained in Appendix A. Here we present them in the usual notation for clarity.

Let g(ρ)=H((ρ))𝑔𝜌𝐻𝜌g(\rho)=-H(\mathcal{L}(\rho))italic_g ( italic_ρ ) = - italic_H ( caligraphic_L ( italic_ρ ) ) for some positive map \mathcal{L}caligraphic_L. Then its gradient is

ρg(ρ)=(𝟙+log((ρ))),subscript𝜌𝑔𝜌superscript1𝜌\nabla_{\rho}g(\rho)=\mathcal{L}^{\dagger}(\mathds{1}+\log(\mathcal{L}(\rho))),∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_g ( italic_ρ ) = caligraphic_L start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( blackboard_1 + roman_log ( caligraphic_L ( italic_ρ ) ) ) , (13)

where superscript\mathcal{L}^{\dagger}caligraphic_L start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT is the adjoint of \mathcal{L}caligraphic_L. Its Hessian is the linear operator

ρ,ρ2g(ρ)=ξ(U(Γ[1](Λ)(U(ξ)U))U),subscriptsuperscript2𝜌𝜌𝑔𝜌𝜉maps-tosuperscript𝑈direct-productsuperscriptΓdelimited-[]1Λsuperscript𝑈𝜉𝑈superscript𝑈\nabla^{2}_{\rho,\rho}g(\rho)=\xi\mapsto\mathcal{L}^{\dagger}\mathopen{}% \mathclose{{}\left(U(\Gamma^{[1]}(\Lambda)\odot(U^{\dagger}\mathcal{L}(\xi)U))% U^{\dagger}}\right),∇ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_ρ , italic_ρ end_POSTSUBSCRIPT italic_g ( italic_ρ ) = italic_ξ ↦ caligraphic_L start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( italic_U ( roman_Γ start_POSTSUPERSCRIPT [ 1 ] end_POSTSUPERSCRIPT ( roman_Λ ) ⊙ ( italic_U start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT caligraphic_L ( italic_ξ ) italic_U ) ) italic_U start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) , (14)

where direct-product\odot is the elementwise (Schur) product, (ρ)=UΛU𝜌𝑈Λsuperscript𝑈\mathcal{L}(\rho)=U\Lambda U^{\dagger}caligraphic_L ( italic_ρ ) = italic_U roman_Λ italic_U start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT is a diagonalization of (ρ)𝜌\mathcal{L}(\rho)caligraphic_L ( italic_ρ ), and Γ[1](Λ)superscriptΓdelimited-[]1Λ\Gamma^{[1]}(\Lambda)roman_Γ start_POSTSUPERSCRIPT [ 1 ] end_POSTSUPERSCRIPT ( roman_Λ ) is defined as

Γ[1](Λ)ij={log(λi)log(λj)λiλj,λiλjλi1,λi=λj,superscriptΓdelimited-[]1subscriptΛ𝑖𝑗casessubscript𝜆𝑖subscript𝜆𝑗subscript𝜆𝑖subscript𝜆𝑗subscript𝜆𝑖subscript𝜆𝑗superscriptsubscript𝜆𝑖1subscript𝜆𝑖subscript𝜆𝑗\Gamma^{[1]}(\Lambda)_{ij}=\begin{cases}\frac{\log(\lambda_{i})-\log(\lambda_{% j})}{\lambda_{i}-\lambda_{j}},&\lambda_{i}\neq\lambda_{j}\\ \lambda_{i}^{-1},&\lambda_{i}=\lambda_{j}\,,\end{cases}roman_Γ start_POSTSUPERSCRIPT [ 1 ] end_POSTSUPERSCRIPT ( roman_Λ ) start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = { start_ROW start_CELL divide start_ARG roman_log ( italic_λ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) - roman_log ( italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) end_ARG start_ARG italic_λ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG , end_CELL start_CELL italic_λ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ≠ italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL italic_λ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT , end_CELL start_CELL italic_λ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , end_CELL end_ROW (15)

where λi=Λiisubscript𝜆𝑖subscriptΛ𝑖𝑖\lambda_{i}=\Lambda_{ii}italic_λ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = roman_Λ start_POSTSUBSCRIPT italic_i italic_i end_POSTSUBSCRIPT. Its third-order derivative applied to ξ,ξ𝜉𝜉\xi,\xiitalic_ξ , italic_ξ is

ρ,ρ,ρ3g(ρ)[ξ,ξ]=(UM(ξ)U),superscriptsubscript𝜌𝜌𝜌3𝑔𝜌𝜉𝜉superscriptsubscript𝑈subscript𝑀𝜉superscriptsubscript𝑈\nabla_{\rho,\rho,\rho}^{3}g(\rho)[\xi,\xi]=\mathcal{L}^{\dagger}(U_{\mathcal{% L}}M_{\mathcal{L}}(\xi)U_{\mathcal{L}}^{\dagger})\,,∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_g ( italic_ρ ) [ italic_ξ , italic_ξ ] = caligraphic_L start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( italic_U start_POSTSUBSCRIPT caligraphic_L end_POSTSUBSCRIPT italic_M start_POSTSUBSCRIPT caligraphic_L end_POSTSUBSCRIPT ( italic_ξ ) italic_U start_POSTSUBSCRIPT caligraphic_L end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) , (16)

where

M(ξ)i,j=2kξ~i,kξ~k,jΓi,j,k[2](Λ),subscript𝑀subscript𝜉𝑖𝑗2subscript𝑘subscript~𝜉𝑖𝑘subscript~𝜉𝑘𝑗subscriptsuperscriptΓdelimited-[]2𝑖𝑗𝑘subscriptΛM_{\mathcal{L}}(\xi)_{\,i,j}=2\sum_{k}\tilde{\xi}_{i,k}\tilde{\xi}_{k,j}\Gamma% ^{[2]}_{i,j,k}(\Lambda_{\mathcal{L}})\,,italic_M start_POSTSUBSCRIPT caligraphic_L end_POSTSUBSCRIPT ( italic_ξ ) start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT = 2 ∑ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT over~ start_ARG italic_ξ end_ARG start_POSTSUBSCRIPT italic_i , italic_k end_POSTSUBSCRIPT over~ start_ARG italic_ξ end_ARG start_POSTSUBSCRIPT italic_k , italic_j end_POSTSUBSCRIPT roman_Γ start_POSTSUPERSCRIPT [ 2 ] end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i , italic_j , italic_k end_POSTSUBSCRIPT ( roman_Λ start_POSTSUBSCRIPT caligraphic_L end_POSTSUBSCRIPT ) , (17)

with ξ~=U(ξ)U~𝜉superscriptsubscript𝑈𝜉subscript𝑈\tilde{\xi}=U_{\mathcal{L}}^{\dagger}\mathcal{L}(\xi)U_{\mathcal{L}}over~ start_ARG italic_ξ end_ARG = italic_U start_POSTSUBSCRIPT caligraphic_L end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT caligraphic_L ( italic_ξ ) italic_U start_POSTSUBSCRIPT caligraphic_L end_POSTSUBSCRIPT and

Γi,j,k[2](Λ)={Γi,j[1](Λ)Γi,k[1](Λ)λjλk,λjλkΓi,j[1](Λ)Γj,j[1](Λ)λiλj,λiλj=λkλi2/2,λi=λj=λk.subscriptsuperscriptΓdelimited-[]2𝑖𝑗𝑘ΛcasessubscriptsuperscriptΓdelimited-[]1𝑖𝑗ΛsubscriptsuperscriptΓdelimited-[]1𝑖𝑘Λsubscript𝜆𝑗subscript𝜆𝑘subscript𝜆𝑗subscript𝜆𝑘subscriptsuperscriptΓdelimited-[]1𝑖𝑗ΛsubscriptsuperscriptΓdelimited-[]1𝑗𝑗Λsubscript𝜆𝑖subscript𝜆𝑗subscript𝜆𝑖subscript𝜆𝑗subscript𝜆𝑘superscriptsubscript𝜆𝑖22subscript𝜆𝑖subscript𝜆𝑗subscript𝜆𝑘\Gamma^{[2]}_{i,j,k}(\Lambda)=\begin{cases}\frac{\Gamma^{[1]}_{i,j}(\Lambda)-% \Gamma^{[1]}_{i,k}(\Lambda)}{\lambda_{j}-\lambda_{k}},&\lambda_{j}\neq\lambda_% {k}\\ \frac{\Gamma^{[1]}_{i,j}(\Lambda)-\Gamma^{[1]}_{j,j}(\Lambda)}{\lambda_{i}-% \lambda_{j}},&\lambda_{i}\neq\lambda_{j}=\lambda_{k}\\ -\lambda_{i}^{-2}/2,&\lambda_{i}=\lambda_{j}=\lambda_{k}\>.\end{cases}roman_Γ start_POSTSUPERSCRIPT [ 2 ] end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i , italic_j , italic_k end_POSTSUBSCRIPT ( roman_Λ ) = { start_ROW start_CELL divide start_ARG roman_Γ start_POSTSUPERSCRIPT [ 1 ] end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT ( roman_Λ ) - roman_Γ start_POSTSUPERSCRIPT [ 1 ] end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i , italic_k end_POSTSUBSCRIPT ( roman_Λ ) end_ARG start_ARG italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT - italic_λ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG , end_CELL start_CELL italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ≠ italic_λ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL divide start_ARG roman_Γ start_POSTSUPERSCRIPT [ 1 ] end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i , italic_j end_POSTSUBSCRIPT ( roman_Λ ) - roman_Γ start_POSTSUPERSCRIPT [ 1 ] end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j , italic_j end_POSTSUBSCRIPT ( roman_Λ ) end_ARG start_ARG italic_λ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG , end_CELL start_CELL italic_λ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ≠ italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = italic_λ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL - italic_λ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT / 2 , end_CELL start_CELL italic_λ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = italic_λ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT . end_CELL end_ROW (18)

The gradient and Hessian of the term logdet(ρ)logdet𝜌-\operatorname{logdet}(\rho)- roman_logdet ( italic_ρ ) are well-known to be ρ1superscript𝜌1-\rho^{-1}- italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT and ξρ1ξρ1maps-to𝜉superscript𝜌1𝜉superscript𝜌1\xi\mapsto\rho^{-1}\xi\rho^{-1}italic_ξ ↦ italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_ξ italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT, respectively [nesterov1997]. The third order derivative can be easily calculated to be

(ξ,ζ)ρ1ξρ1ζρ1ρ1ζρ1ξρ1.maps-to𝜉𝜁superscript𝜌1𝜉superscript𝜌1𝜁superscript𝜌1superscript𝜌1𝜁superscript𝜌1𝜉superscript𝜌1(\xi,\zeta)\mapsto-\rho^{-1}\xi\rho^{-1}\zeta\rho^{-1}-\rho^{-1}\zeta\rho^{-1}% \xi\rho^{-1}.( italic_ξ , italic_ζ ) ↦ - italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_ξ italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_ζ italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT - italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_ζ italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_ξ italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT . (19)

Putting everything together, the gradient of the barrier f(h,ρ)𝑓𝜌f(h,\rho)italic_f ( italic_h , italic_ρ ) is given by

hf(h,ρ)subscript𝑓𝜌\displaystyle\nabla_{h}f(h,\rho)∇ start_POSTSUBSCRIPT italic_h end_POSTSUBSCRIPT italic_f ( italic_h , italic_ρ ) =1u,absent1𝑢\displaystyle=-\frac{1}{u},= - divide start_ARG 1 end_ARG start_ARG italic_u end_ARG , (20)
ρf(h,ρ)subscript𝜌𝑓𝜌\displaystyle\nabla_{\rho}f(h,\rho)∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_f ( italic_h , italic_ρ ) =1uρuρ1,absent1𝑢subscript𝜌𝑢superscript𝜌1\displaystyle=-\frac{1}{u}\nabla_{\rho}\,u-\rho^{-1},= - divide start_ARG 1 end_ARG start_ARG italic_u end_ARG ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u - italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT , (21)

where u=h+H(𝒢^(ρ))H(𝒵^(ρ))𝑢𝐻^𝒢𝜌𝐻^𝒵𝜌u=h+H(\widehat{\mathcal{G}}(\rho))-H(\widehat{\mathcal{Z}}(\rho))italic_u = italic_h + italic_H ( over^ start_ARG caligraphic_G end_ARG ( italic_ρ ) ) - italic_H ( over^ start_ARG caligraphic_Z end_ARG ( italic_ρ ) ) and ρu=𝒢^(𝟙+log(𝒢^(ρ)))+𝒵^(𝟙+log(𝒵^(ρ)))subscript𝜌𝑢superscript^𝒢1^𝒢𝜌superscript^𝒵1^𝒵𝜌\nabla_{\rho}\,u=-\widehat{\mathcal{G}}^{\dagger}(\mathds{1}+\log(\widehat{% \mathcal{G}}(\rho)))+\widehat{\mathcal{Z}}^{\dagger}(\mathds{1}+\log(\widehat{% \mathcal{Z}}(\rho)))∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u = - over^ start_ARG caligraphic_G end_ARG start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( blackboard_1 + roman_log ( over^ start_ARG caligraphic_G end_ARG ( italic_ρ ) ) ) + over^ start_ARG caligraphic_Z end_ARG start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( blackboard_1 + roman_log ( over^ start_ARG caligraphic_Z end_ARG ( italic_ρ ) ) ), and the Hessian by

h,h2f(h,ρ)superscriptsubscript2𝑓𝜌\displaystyle\nabla_{h,h}^{2}f(h,\rho)∇ start_POSTSUBSCRIPT italic_h , italic_h end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_f ( italic_h , italic_ρ ) =1u2,absent1superscript𝑢2\displaystyle=\frac{1}{u^{2}},= divide start_ARG 1 end_ARG start_ARG italic_u start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG , (22)
h,ρ2f(h,ρ)superscriptsubscript𝜌2𝑓𝜌\displaystyle\nabla_{h,\rho}^{2}f(h,\rho)∇ start_POSTSUBSCRIPT italic_h , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_f ( italic_h , italic_ρ ) =1u2ρu,absent1superscript𝑢2subscript𝜌𝑢\displaystyle=\frac{1}{u^{2}}\nabla_{\rho}\,u,= divide start_ARG 1 end_ARG start_ARG italic_u start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u , (23)
ρ,ρ2f(h,ρ)superscriptsubscript𝜌𝜌2𝑓𝜌\displaystyle\nabla_{\rho,\rho}^{2}f(h,\rho)∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_f ( italic_h , italic_ρ ) =ξ1u2tr[(ρu)ξ]ρu1uρ,ρ2u[ξ]+ρ1ξρ1,absent𝜉maps-to1superscript𝑢2trsubscript𝜌𝑢𝜉subscript𝜌𝑢1𝑢superscriptsubscript𝜌𝜌2𝑢delimited-[]𝜉superscript𝜌1𝜉superscript𝜌1\displaystyle=\xi\mapsto\frac{1}{u^{2}}\operatorname{tr}\big{[}(\nabla_{\rho}% \,u)\xi\big{]}\nabla_{\rho}\,u-\frac{1}{u}\nabla_{\rho,\rho}^{2}\,u[\xi]+\rho^% {-1}\xi\rho^{-1},= italic_ξ ↦ divide start_ARG 1 end_ARG start_ARG italic_u start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG roman_tr [ ( ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u ) italic_ξ ] ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u - divide start_ARG 1 end_ARG start_ARG italic_u end_ARG ∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_u [ italic_ξ ] + italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_ξ italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT , (24)

where

ρ,ρ2u[ξ]=𝒢^(U𝒢(Γ[1](Λ𝒢)(U𝒢𝒢^(ξ)U𝒢))U𝒢)+𝒵^(U𝒵(Γ[1](Λ𝒵)(U𝒵𝒵^(ξ)U𝒵))U𝒵),superscriptsubscript𝜌𝜌2𝑢delimited-[]𝜉superscript^𝒢subscript𝑈𝒢direct-productsuperscriptΓdelimited-[]1subscriptΛ𝒢superscriptsubscript𝑈𝒢^𝒢𝜉subscript𝑈𝒢superscriptsubscript𝑈𝒢superscript^𝒵subscript𝑈𝒵direct-productsuperscriptΓdelimited-[]1subscriptΛ𝒵superscriptsubscript𝑈𝒵^𝒵𝜉subscript𝑈𝒵superscriptsubscript𝑈𝒵\nabla_{\rho,\rho}^{2}\,u[\xi]=-\widehat{\mathcal{G}}^{\dagger}\mathopen{}% \mathclose{{}\left(U_{\mathcal{G}}(\Gamma^{[1]}(\Lambda_{\mathcal{G}})\odot(U_% {\mathcal{G}}^{\dagger}\widehat{\mathcal{G}}(\xi)U_{\mathcal{G}}))U_{\mathcal{% G}}^{\dagger}}\right)+\widehat{\mathcal{Z}}^{\dagger}\mathopen{}\mathclose{{}% \left(U_{\mathcal{Z}}(\Gamma^{[1]}(\Lambda_{\mathcal{Z}})\odot(U_{\mathcal{Z}}% ^{\dagger}\widehat{\mathcal{Z}}(\xi)U_{\mathcal{Z}}))U_{\mathcal{Z}}^{\dagger}% }\right),∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_u [ italic_ξ ] = - over^ start_ARG caligraphic_G end_ARG start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( italic_U start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT ( roman_Γ start_POSTSUPERSCRIPT [ 1 ] end_POSTSUPERSCRIPT ( roman_Λ start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT ) ⊙ ( italic_U start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT over^ start_ARG caligraphic_G end_ARG ( italic_ξ ) italic_U start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT ) ) italic_U start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) + over^ start_ARG caligraphic_Z end_ARG start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( italic_U start_POSTSUBSCRIPT caligraphic_Z end_POSTSUBSCRIPT ( roman_Γ start_POSTSUPERSCRIPT [ 1 ] end_POSTSUPERSCRIPT ( roman_Λ start_POSTSUBSCRIPT caligraphic_Z end_POSTSUBSCRIPT ) ⊙ ( italic_U start_POSTSUBSCRIPT caligraphic_Z end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT over^ start_ARG caligraphic_Z end_ARG ( italic_ξ ) italic_U start_POSTSUBSCRIPT caligraphic_Z end_POSTSUBSCRIPT ) ) italic_U start_POSTSUBSCRIPT caligraphic_Z end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) , (25)

and 𝒢^(ρ)=U𝒢Λ𝒢U𝒢^𝒢𝜌subscript𝑈𝒢subscriptΛ𝒢superscriptsubscript𝑈𝒢\widehat{\mathcal{G}}(\rho)=U_{\mathcal{G}}\Lambda_{\mathcal{G}}U_{\mathcal{G}% }^{\dagger}over^ start_ARG caligraphic_G end_ARG ( italic_ρ ) = italic_U start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT roman_Λ start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT italic_U start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT and 𝒵^(ρ)=U𝒵Λ𝒵U𝒵^𝒵𝜌subscript𝑈𝒵subscriptΛ𝒵superscriptsubscript𝑈𝒵\widehat{\mathcal{Z}}(\rho)=U_{\mathcal{Z}}\Lambda_{\mathcal{Z}}U_{\mathcal{Z}% }^{\dagger}over^ start_ARG caligraphic_Z end_ARG ( italic_ρ ) = italic_U start_POSTSUBSCRIPT caligraphic_Z end_POSTSUBSCRIPT roman_Λ start_POSTSUBSCRIPT caligraphic_Z end_POSTSUBSCRIPT italic_U start_POSTSUBSCRIPT caligraphic_Z end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT are diagonalizations of 𝒢^(ρ)^𝒢𝜌\widehat{\mathcal{G}}(\rho)over^ start_ARG caligraphic_G end_ARG ( italic_ρ ) and 𝒵^(ρ)^𝒵𝜌\widehat{\mathcal{Z}}(\rho)over^ start_ARG caligraphic_Z end_ARG ( italic_ρ ).

The third order derivatives are given by:

h,h,h3f(h,ρ)superscriptsubscript3𝑓𝜌\displaystyle\nabla_{h,h,h}^{3}f(h,\rho)∇ start_POSTSUBSCRIPT italic_h , italic_h , italic_h end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_f ( italic_h , italic_ρ ) =2u3,absent2superscript𝑢3\displaystyle=-\frac{2}{u^{3}},= - divide start_ARG 2 end_ARG start_ARG italic_u start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT end_ARG , (26)
h,h,ρ3f(h,ρ)superscriptsubscript𝜌3𝑓𝜌\displaystyle\nabla_{h,h,\rho}^{3}f(h,\rho)∇ start_POSTSUBSCRIPT italic_h , italic_h , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_f ( italic_h , italic_ρ ) =2u3ρu,absent2superscript𝑢3subscript𝜌𝑢\displaystyle=-\frac{2}{u^{3}}\nabla_{\rho}u,= - divide start_ARG 2 end_ARG start_ARG italic_u start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT end_ARG ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u , (27)
h,ρ,ρ3f(h,ρ)superscriptsubscript𝜌𝜌3𝑓𝜌\displaystyle\nabla_{h,\rho,\rho}^{3}f(h,\rho)∇ start_POSTSUBSCRIPT italic_h , italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_f ( italic_h , italic_ρ ) =ξ2u3tr[(ρu)ξ]ρu+1u2ρ,ρ2u[ξ],absent𝜉maps-to2superscript𝑢3trsubscript𝜌𝑢𝜉subscript𝜌𝑢1superscript𝑢2superscriptsubscript𝜌𝜌2𝑢delimited-[]𝜉\displaystyle=\xi\mapsto-\frac{2}{u^{3}}\operatorname{tr}\big{[}(\nabla_{\rho}% \,u)\xi\big{]}\nabla_{\rho}u+\frac{1}{u^{2}}\nabla_{\rho,\rho}^{2}\,u[\xi],= italic_ξ ↦ - divide start_ARG 2 end_ARG start_ARG italic_u start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT end_ARG roman_tr [ ( ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u ) italic_ξ ] ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u + divide start_ARG 1 end_ARG start_ARG italic_u start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG ∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_u [ italic_ξ ] , (28)
ρ,ρ,ρ3f(h,ρ)superscriptsubscript𝜌𝜌𝜌3𝑓𝜌\displaystyle\nabla_{\rho,\rho,\rho}^{3}f(h,\rho)∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_f ( italic_h , italic_ρ ) =(ξ,ζ)2u3tr[(ρu)ξ]tr[(ρu)ζ]ρu+1u2tr[(ρ,ρ2u[ζ])ξ]ρuabsent𝜉𝜁maps-to2superscript𝑢3trsubscript𝜌𝑢𝜉trsubscript𝜌𝑢𝜁subscript𝜌𝑢1superscript𝑢2trsuperscriptsubscript𝜌𝜌2𝑢delimited-[]𝜁𝜉subscript𝜌𝑢\displaystyle=(\xi,\zeta)\mapsto-\frac{2}{u^{3}}\operatorname{tr}\big{[}(% \nabla_{\rho}\,u)\xi\big{]}\operatorname{tr}\big{[}(\nabla_{\rho}\,u)\zeta\big% {]}\nabla_{\rho}u+\frac{1}{u^{2}}\operatorname{tr}\big{[}(\nabla_{\rho,\rho}^{% 2}\,u[\zeta])\xi\big{]}\nabla_{\rho}u= ( italic_ξ , italic_ζ ) ↦ - divide start_ARG 2 end_ARG start_ARG italic_u start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT end_ARG roman_tr [ ( ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u ) italic_ξ ] roman_tr [ ( ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u ) italic_ζ ] ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u + divide start_ARG 1 end_ARG start_ARG italic_u start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG roman_tr [ ( ∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_u [ italic_ζ ] ) italic_ξ ] ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u
+1u2tr[(ρu)ξ]ρ,ρ2u[ζ]+1u2tr[(ρu)ζ]ρ,ρ2u[ξ]1superscript𝑢2trsubscript𝜌𝑢𝜉superscriptsubscript𝜌𝜌2𝑢delimited-[]𝜁1superscript𝑢2trsubscript𝜌𝑢𝜁superscriptsubscript𝜌𝜌2𝑢delimited-[]𝜉\displaystyle\qquad\qquad\quad\,+\frac{1}{u^{2}}\operatorname{tr}\big{[}(% \nabla_{\rho}\,u)\xi\big{]}\nabla_{\rho,\rho}^{2}u[\zeta]+\frac{1}{u^{2}}% \operatorname{tr}\big{[}(\nabla_{\rho}\,u)\zeta\big{]}\nabla_{\rho,\rho}^{2}u[\xi]+ divide start_ARG 1 end_ARG start_ARG italic_u start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG roman_tr [ ( ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u ) italic_ξ ] ∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_u [ italic_ζ ] + divide start_ARG 1 end_ARG start_ARG italic_u start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG roman_tr [ ( ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u ) italic_ζ ] ∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_u [ italic_ξ ]
1uρ,ρ,ρ3u[ξ,ζ](ρ1ξρ1ζρ1+ρ1ζρ1ξρ1).1𝑢superscriptsubscript𝜌𝜌𝜌3𝑢𝜉𝜁superscript𝜌1𝜉superscript𝜌1𝜁superscript𝜌1superscript𝜌1𝜁superscript𝜌1𝜉superscript𝜌1\displaystyle\qquad\qquad\quad\,-\frac{1}{u}\nabla_{\rho,\rho,\rho}^{3}u[\xi,% \zeta]-(\rho^{-1}\xi\rho^{-1}\zeta\rho^{-1}+\rho^{-1}\zeta\rho^{-1}\xi\rho^{-1% }).- divide start_ARG 1 end_ARG start_ARG italic_u end_ARG ∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_u [ italic_ξ , italic_ζ ] - ( italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_ξ italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_ζ italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT + italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_ζ italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_ξ italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ) . (29)

In particular, applying ρ,ρ,ρ3f(h,ρ)superscriptsubscript𝜌𝜌𝜌3𝑓𝜌\nabla_{\rho,\rho,\rho}^{3}f(h,\rho)∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_f ( italic_h , italic_ρ ) to the point [ξ,ξ]𝜉𝜉[\xi,\xi][ italic_ξ , italic_ξ ] gives

ρ,ρ,ρ3f(h,ρ)[ξ,ξ]superscriptsubscript𝜌𝜌𝜌3𝑓𝜌𝜉𝜉\displaystyle\nabla_{\rho,\rho,\rho}^{3}f(h,\rho)[\xi,\xi]∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_f ( italic_h , italic_ρ ) [ italic_ξ , italic_ξ ] =2u3(tr[(ρu)ξ])2ρu+1u2tr[(ρ,ρ2u[ξ])ξ]ρuabsent2superscript𝑢3superscripttrsubscript𝜌𝑢𝜉2subscript𝜌𝑢1superscript𝑢2trsuperscriptsubscript𝜌𝜌2𝑢delimited-[]𝜉𝜉subscript𝜌𝑢\displaystyle=-\frac{2}{u^{3}}\big{(}\operatorname{tr}\big{[}(\nabla_{\rho}\,u% )\xi\big{]}\big{)}^{2}\nabla_{\rho}u+\frac{1}{u^{2}}\operatorname{tr}\big{[}(% \nabla_{\rho,\rho}^{2}\,u[\xi])\xi\big{]}\nabla_{\rho}u= - divide start_ARG 2 end_ARG start_ARG italic_u start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT end_ARG ( roman_tr [ ( ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u ) italic_ξ ] ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u + divide start_ARG 1 end_ARG start_ARG italic_u start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG roman_tr [ ( ∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_u [ italic_ξ ] ) italic_ξ ] ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u
+2u2tr[(ρu)ξ]ρ,ρ2u[ξ]1uρ,ρ,ρ3u[ξ,ξ]2superscript𝑢2trsubscript𝜌𝑢𝜉superscriptsubscript𝜌𝜌2𝑢delimited-[]𝜉1𝑢superscriptsubscript𝜌𝜌𝜌3𝑢𝜉𝜉\displaystyle\quad+\frac{2}{u^{2}}\operatorname{tr}\big{[}(\nabla_{\rho}\,u)% \xi\big{]}\nabla_{\rho,\rho}^{2}u[\xi]-\frac{1}{u}\nabla_{\rho,\rho,\rho}^{3}u% [\xi,\xi]+ divide start_ARG 2 end_ARG start_ARG italic_u start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG roman_tr [ ( ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u ) italic_ξ ] ∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_u [ italic_ξ ] - divide start_ARG 1 end_ARG start_ARG italic_u end_ARG ∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_u [ italic_ξ , italic_ξ ]
2ρ1ξρ1ξρ1,2superscript𝜌1𝜉superscript𝜌1𝜉superscript𝜌1\displaystyle\quad-2\rho^{-1}\xi\rho^{-1}\xi\rho^{-1},- 2 italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_ξ italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT italic_ξ italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT , (30)

where

ρ,ρ,ρ3u[ξ,ξ]=𝒢^(U𝒢M𝒢(ξ)U𝒢)+𝒵^(U𝒵M𝒵(ξ)U𝒵),superscriptsubscript𝜌𝜌𝜌3𝑢𝜉𝜉superscript^𝒢subscript𝑈𝒢subscript𝑀𝒢𝜉superscriptsubscript𝑈𝒢superscript^𝒵subscript𝑈𝒵subscript𝑀𝒵𝜉superscriptsubscript𝑈𝒵\nabla_{\rho,\rho,\rho}^{3}u[\xi,\xi]=-{\widehat{\mathcal{G}}}^{\dagger}(U_{% \mathcal{G}}M_{\mathcal{G}}(\xi)U_{\mathcal{G}}^{\dagger})+\widehat{\mathcal{Z% }}^{\dagger}(U_{\mathcal{Z}}M_{\mathcal{Z}}(\xi)U_{\mathcal{Z}}^{\dagger}),∇ start_POSTSUBSCRIPT italic_ρ , italic_ρ , italic_ρ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT italic_u [ italic_ξ , italic_ξ ] = - over^ start_ARG caligraphic_G end_ARG start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( italic_U start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT italic_M start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT ( italic_ξ ) italic_U start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) + over^ start_ARG caligraphic_Z end_ARG start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( italic_U start_POSTSUBSCRIPT caligraphic_Z end_POSTSUBSCRIPT italic_M start_POSTSUBSCRIPT caligraphic_Z end_POSTSUBSCRIPT ( italic_ξ ) italic_U start_POSTSUBSCRIPT caligraphic_Z end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) , (31)

and M𝒢(ξ)subscript𝑀𝒢𝜉M_{\mathcal{G}}(\xi)italic_M start_POSTSUBSCRIPT caligraphic_G end_POSTSUBSCRIPT ( italic_ξ ) and M𝒵(ξ)subscript𝑀𝒵𝜉M_{\mathcal{Z}}(\xi)italic_M start_POSTSUBSCRIPT caligraphic_Z end_POSTSUBSCRIPT ( italic_ξ ) are the analogues of (17).

3.2 Initial point

The initial point can in principle be chosen to be any point in the interior of the cone [skajaa2015], but following [coey2023, dahl2022] we’d like to use the central point, as it provides better performance. It is defined as the minimizer of

minh,ρf(h,ρ)+12(h,ρ)22,subscript𝜌𝑓𝜌12subscriptsuperscriptnorm𝜌22\min\limits_{h,\rho}f(h,\rho)+\frac{1}{2}\|(h,\rho)\|^{2}_{2}\,,roman_min start_POSTSUBSCRIPT italic_h , italic_ρ end_POSTSUBSCRIPT italic_f ( italic_h , italic_ρ ) + divide start_ARG 1 end_ARG start_ARG 2 end_ARG ∥ ( italic_h , italic_ρ ) ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , (32)

where f(h,ρ)𝑓𝜌f(h,\rho)italic_f ( italic_h , italic_ρ ) is our barrier function (12), and (h,ρ)22=h2+ρ,ρsubscriptsuperscriptnorm𝜌22superscript2𝜌𝜌\|(h,\rho)\|^{2}_{2}=h^{2}+\langle\rho,\rho\rangle∥ ( italic_h , italic_ρ ) ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = italic_h start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + ⟨ italic_ρ , italic_ρ ⟩. The minimizer is obtained by taking the gradient of the objective and setting it to zero:

(h,ρ)=f(h,ρ).𝜌𝑓𝜌\displaystyle(h,\rho)=-\nabla f(h,\rho).( italic_h , italic_ρ ) = - ∇ italic_f ( italic_h , italic_ρ ) . (33)

This results in the pair of equations

h\displaystyle hitalic_h =1u,absent1𝑢\displaystyle=\frac{1}{u},= divide start_ARG 1 end_ARG start_ARG italic_u end_ARG , (34)
ρ𝜌\displaystyle\rhoitalic_ρ =1uρu+ρ1,absent1𝑢subscript𝜌𝑢superscript𝜌1\displaystyle=\frac{1}{u}\nabla_{\rho}\,u+\rho^{-1},= divide start_ARG 1 end_ARG start_ARG italic_u end_ARG ∇ start_POSTSUBSCRIPT italic_ρ end_POSTSUBSCRIPT italic_u + italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT , (35)

where we are using the gradient from equations (20)–(21).

Let now u=hD𝑢𝐷u=h-Ditalic_u = italic_h - italic_D, where D=H(𝒢^(ρ))+H(𝒵^(ρ))𝐷𝐻^𝒢𝜌𝐻^𝒵𝜌D=-H(\widehat{\mathcal{G}}(\rho))+H(\widehat{\mathcal{Z}}(\rho))italic_D = - italic_H ( over^ start_ARG caligraphic_G end_ARG ( italic_ρ ) ) + italic_H ( over^ start_ARG caligraphic_Z end_ARG ( italic_ρ ) ). We get then222The negative solution for hhitalic_h is excluded as it is outside the cone.

h=D2+1+D24,𝐷21superscript𝐷24\displaystyle h=\frac{D}{2}+\sqrt{1+\frac{D^{2}}{4}},italic_h = divide start_ARG italic_D end_ARG start_ARG 2 end_ARG + square-root start_ARG 1 + divide start_ARG italic_D start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 4 end_ARG end_ARG , (36)
ρρ1=h(𝒢^(𝟙+log(𝒢^(ρ)))+𝒵^(𝟙+log(𝒵^(ρ)))).𝜌superscript𝜌1superscript^𝒢1^𝒢𝜌superscript^𝒵1^𝒵𝜌\displaystyle\rho-\rho^{-1}=h\mathopen{}\mathclose{{}\left(-\widehat{\mathcal{% G}}^{\dagger}(\mathds{1}+\log(\widehat{\mathcal{G}}(\rho)))+\widehat{\mathcal{% Z}}^{\dagger}(\mathds{1}+\log(\widehat{\mathcal{Z}}(\rho)))}\right).italic_ρ - italic_ρ start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT = italic_h ( - over^ start_ARG caligraphic_G end_ARG start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( blackboard_1 + roman_log ( over^ start_ARG caligraphic_G end_ARG ( italic_ρ ) ) ) + over^ start_ARG caligraphic_Z end_ARG start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( blackboard_1 + roman_log ( over^ start_ARG caligraphic_Z end_ARG ( italic_ρ ) ) ) ) . (37)

While we have been unable to obtain a general solution of these equations, we note that often the maps satisfy 𝒢^(𝟙+log(𝒢^(𝟙)))=𝒵^(𝟙+log(𝒵^(𝟙)))superscript^𝒢1^𝒢1superscript^𝒵1^𝒵1\widehat{\mathcal{G}}^{\dagger}(\mathds{1}+\log(\widehat{\mathcal{G}}(\mathds{% 1})))=\widehat{\mathcal{Z}}^{\dagger}(\mathds{1}+\log(\widehat{\mathcal{Z}}(% \mathds{1})))over^ start_ARG caligraphic_G end_ARG start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( blackboard_1 + roman_log ( over^ start_ARG caligraphic_G end_ARG ( blackboard_1 ) ) ) = over^ start_ARG caligraphic_Z end_ARG start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( blackboard_1 + roman_log ( over^ start_ARG caligraphic_Z end_ARG ( blackboard_1 ) ) ). In this case ρ=𝟙𝜌1\rho=\mathds{1}italic_ρ = blackboard_1 will satisfy equation (37) for any value of hhitalic_h, which we can then choose to be given by equation (36). We use therefore these values as our initial point.

Note that even when equation (37) is not satisfied this point is still in the interior of the cone, as 𝟙1\mathds{1}blackboard_1 is full rank and D2+1+D24>D𝐷21superscript𝐷24𝐷\frac{D}{2}+\sqrt{1+\frac{D^{2}}{4}}>Ddivide start_ARG italic_D end_ARG start_ARG 2 end_ARG + square-root start_ARG 1 + divide start_ARG italic_D start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 4 end_ARG end_ARG > italic_D, and thus is a valid starting point.

4 Examples

4.1 BB84

In order to illustrate the technique let us start with a toy example, the computation of the key rate for the entanglement based version of the BB84 protocol, using as estimated parameters the qubit error rates (QBERs) in the X𝑋Xitalic_X and Z𝑍Zitalic_Z bases qxsubscript𝑞𝑥q_{x}italic_q start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT and qzsubscript𝑞𝑧q_{z}italic_q start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT. If we use only the Z𝑍Zitalic_Z basis to generate key, the key map 𝒢𝒢\mathcal{G}caligraphic_G can be taken to be identity, and the decoherence map 𝒵𝒵\mathcal{Z}caligraphic_Z is ρi=01(|ii|𝟙)ρ(|ii|𝟙)\rho\mapsto\sum_{i=0}^{1}(\mathopen{}\mathclose{{}\left|i\middle\rangle\middle% \langle i}\right|\otimes\mathds{1})\rho(\mathopen{}\mathclose{{}\left|i\middle% \rangle\middle\langle i}\right|\otimes\mathds{1})italic_ρ ↦ ∑ start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( | italic_i ⟩ ⟨ italic_i | ⊗ blackboard_1 ) italic_ρ ( | italic_i ⟩ ⟨ italic_i | ⊗ blackboard_1 ). The analytical expression for H(A|E)𝐻conditional𝐴𝐸H(A|E)italic_H ( italic_A | italic_E ) is 1h(qx)1subscript𝑞𝑥1-h(q_{x})1 - italic_h ( italic_q start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT ), where hhitalic_h is the binary entropy.

In the case where (qx,qz)(0,1)×2subscript𝑞𝑥subscript𝑞𝑧superscript01absent2(q_{x},q_{z})\in(0,1)^{\times 2}( italic_q start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT , italic_q start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT ) ∈ ( 0 , 1 ) start_POSTSUPERSCRIPT × 2 end_POSTSUPERSCRIPT there is a full rank state compatible with the constraints and the support of the range of 𝒢𝒢\mathcal{G}caligraphic_G and 𝒵𝒵\mathcal{Z}caligraphic_Z is the full space, so no facial reduction is needed. Then 𝒢^=𝒢^𝒢𝒢\widehat{\mathcal{G}}=\mathcal{G}over^ start_ARG caligraphic_G end_ARG = caligraphic_G and 𝒵^=𝒵𝒢^𝒵𝒵𝒢\widehat{\mathcal{Z}}=\mathcal{Z}\circ\mathcal{G}over^ start_ARG caligraphic_Z end_ARG = caligraphic_Z ∘ caligraphic_G, and the conic program is

minh,ρhs.t.tr(ρ)=1,tr(Qxρ)=qx,tr(Qzρ)=qz,(h,ρ)𝒦QKD𝒢^,𝒵^,\begin{gathered}\min_{h,\rho}\ h\\ \text{s.t.}\quad\operatorname{tr}(\rho)=1,\quad\operatorname{tr}(Q_{x}\rho)=q_% {x},\quad\operatorname{tr}(Q_{z}\rho)=q_{z},\\ (h,\rho)\in\mathcal{K}_{\text{QKD}}^{\widehat{\mathcal{G}},\widehat{\mathcal{Z% }}},\end{gathered}start_ROW start_CELL roman_min start_POSTSUBSCRIPT italic_h , italic_ρ end_POSTSUBSCRIPT italic_h end_CELL end_ROW start_ROW start_CELL s.t. roman_tr ( italic_ρ ) = 1 , roman_tr ( italic_Q start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT italic_ρ ) = italic_q start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT , roman_tr ( italic_Q start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT italic_ρ ) = italic_q start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT , end_CELL end_ROW start_ROW start_CELL ( italic_h , italic_ρ ) ∈ caligraphic_K start_POSTSUBSCRIPT QKD end_POSTSUBSCRIPT start_POSTSUPERSCRIPT over^ start_ARG caligraphic_G end_ARG , over^ start_ARG caligraphic_Z end_ARG end_POSTSUPERSCRIPT , end_CELL end_ROW (38)

where Qxsubscript𝑄𝑥Q_{x}italic_Q start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT and Qzsubscript𝑄𝑧Q_{z}italic_Q start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT are the projectors that produce the QBERs.

In the case where qz=0subscript𝑞𝑧0q_{z}=0italic_q start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT = 0 and qx(0,1)subscript𝑞𝑥01q_{x}\in(0,1)italic_q start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT ∈ ( 0 , 1 ) the support of the feasible ρ𝜌\rhoitalic_ρ is span{|ϕ+,|ϕ}spanketsuperscriptitalic-ϕketsuperscriptitalic-ϕ\operatorname{span}\{\mathopen{}\mathclose{{}\left|\phi^{+}}\right\rangle,% \mathopen{}\mathclose{{}\left|\phi^{-}}\right\rangle\}roman_span { | italic_ϕ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT ⟩ , | italic_ϕ start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT ⟩ }. Letting V=|ϕ+0|+|ϕ1|V=\mathopen{}\mathclose{{}\left|\phi^{+}\middle\rangle\middle\langle 0}\right|% +\mathopen{}\mathclose{{}\left|\phi^{-}\middle\rangle\middle\langle 1}\right|italic_V = | italic_ϕ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT ⟩ ⟨ 0 | + | italic_ϕ start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT ⟩ ⟨ 1 | we can write ρ=VσV𝜌𝑉𝜎superscript𝑉\rho=V\sigma V^{\dagger}italic_ρ = italic_V italic_σ italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT for a 2×2222\times 22 × 2 matrix σ𝜎\sigmaitalic_σ. The constraint tr(VQzVσ)=0trsuperscript𝑉subscript𝑄𝑧𝑉𝜎0\operatorname{tr}(V^{\dagger}Q_{z}V\sigma)=0roman_tr ( italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_Q start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT italic_V italic_σ ) = 0 becomes tautological and is dropped, and the constraint tr(VQxVσ)=qxtrsuperscript𝑉subscript𝑄𝑥𝑉𝜎subscript𝑞𝑥\operatorname{tr}(V^{\dagger}Q_{x}V\sigma)=q_{x}roman_tr ( italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_Q start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT italic_V italic_σ ) = italic_q start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT is reduced to tr(|11|σ)=qx\operatorname{tr}(\mathopen{}\mathclose{{}\left|1\middle\rangle\middle\langle 1% }\right|\sigma)=q_{x}roman_tr ( | 1 ⟩ ⟨ 1 | italic_σ ) = italic_q start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT. The map σVσVmaps-to𝜎𝑉𝜎superscript𝑉\sigma\mapsto V\sigma V^{\dagger}italic_σ ↦ italic_V italic_σ italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT has range supported on the range of V𝑉Vitalic_V, namely span{|ϕ+,|ϕ}spanketsuperscriptitalic-ϕketsuperscriptitalic-ϕ\operatorname{span}\{\mathopen{}\mathclose{{}\left|\phi^{+}}\right\rangle,% \mathopen{}\mathclose{{}\left|\phi^{-}}\right\rangle\}roman_span { | italic_ϕ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT ⟩ , | italic_ϕ start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT ⟩ }, so to restrict it there we simply remove the isometry, obtaining 𝒢^(σ)=σ^𝒢𝜎𝜎\widehat{\mathcal{G}}(\sigma)=\sigmaover^ start_ARG caligraphic_G end_ARG ( italic_σ ) = italic_σ. The map σ𝒵(VσV)maps-to𝜎𝒵𝑉𝜎superscript𝑉\sigma\mapsto\mathcal{Z}(V\sigma V^{\dagger})italic_σ ↦ caligraphic_Z ( italic_V italic_σ italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) has range supported on span{|00,|11}spanket00ket11\operatorname{span}\{\mathopen{}\mathclose{{}\left|00}\right\rangle,\mathopen{% }\mathclose{{}\left|11}\right\rangle\}roman_span { | 00 ⟩ , | 11 ⟩ }, so we restrict it there, choosing the Kraus operators of 𝒵^^𝒵\widehat{\mathcal{Z}}over^ start_ARG caligraphic_Z end_ARG to be {|0+|,|1|}\{\mathopen{}\mathclose{{}\left|0\middle\rangle\middle\langle+}\right|,% \mathopen{}\mathclose{{}\left|1\middle\rangle\middle\langle-}\right|\}{ | 0 ⟩ ⟨ + | , | 1 ⟩ ⟨ - | }.

In the case where qx=0subscript𝑞𝑥0q_{x}=0italic_q start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT = 0 and qz(0,1)subscript𝑞𝑧01q_{z}\in(0,1)italic_q start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT ∈ ( 0 , 1 ) the support of the feasible ρ𝜌\rhoitalic_ρ is span{|ϕ+,|ψ+}spanketsuperscriptitalic-ϕketsuperscript𝜓\operatorname{span}\{\mathopen{}\mathclose{{}\left|\phi^{+}}\right\rangle,% \mathopen{}\mathclose{{}\left|\psi^{+}}\right\rangle\}roman_span { | italic_ϕ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT ⟩ , | italic_ψ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT ⟩ }. Letting V=|ϕ+0|+|ψ+1|V=\mathopen{}\mathclose{{}\left|\phi^{+}\middle\rangle\middle\langle 0}\right|% +\mathopen{}\mathclose{{}\left|\psi^{+}\middle\rangle\middle\langle 1}\right|italic_V = | italic_ϕ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT ⟩ ⟨ 0 | + | italic_ψ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT ⟩ ⟨ 1 | we can write ρ=VσV𝜌𝑉𝜎superscript𝑉\rho=V\sigma V^{\dagger}italic_ρ = italic_V italic_σ italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT for a 2×2222\times 22 × 2 matrix σ𝜎\sigmaitalic_σ. The constraint tr(VQxVσ)=0trsuperscript𝑉subscript𝑄𝑥𝑉𝜎0\operatorname{tr}(V^{\dagger}Q_{x}V\sigma)=0roman_tr ( italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_Q start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT italic_V italic_σ ) = 0 becomes tautological and is dropped, and the constraint tr(VQzVσ)=qztrsuperscript𝑉subscript𝑄𝑧𝑉𝜎subscript𝑞𝑧\operatorname{tr}(V^{\dagger}Q_{z}V\sigma)=q_{z}roman_tr ( italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_Q start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT italic_V italic_σ ) = italic_q start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT is reduced to tr(|11|σ)=qz\operatorname{tr}(\mathopen{}\mathclose{{}\left|1\middle\rangle\middle\langle 1% }\right|\sigma)=q_{z}roman_tr ( | 1 ⟩ ⟨ 1 | italic_σ ) = italic_q start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT. To reduce the map σVσVmaps-to𝜎𝑉𝜎superscript𝑉\sigma\mapsto V\sigma V^{\dagger}italic_σ ↦ italic_V italic_σ italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT we again simply remove the isometry, obtaining 𝒢^(σ)=σ^𝒢𝜎𝜎\widehat{\mathcal{G}}(\sigma)=\sigmaover^ start_ARG caligraphic_G end_ARG ( italic_σ ) = italic_σ. The map σ𝒵(VσV)maps-to𝜎𝒵𝑉𝜎superscript𝑉\sigma\mapsto\mathcal{Z}(V\sigma V^{\dagger})italic_σ ↦ caligraphic_Z ( italic_V italic_σ italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) this time has full rank range, so its reduction is simply itself, 𝒵^(σ)=𝒵(VσV)^𝒵𝜎𝒵𝑉𝜎superscript𝑉\widehat{\mathcal{Z}}(\sigma)=\mathcal{Z}(V\sigma V^{\dagger})over^ start_ARG caligraphic_Z end_ARG ( italic_σ ) = caligraphic_Z ( italic_V italic_σ italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ).

In the case where qx=qz=0subscript𝑞𝑥subscript𝑞𝑧0q_{x}=q_{z}=0italic_q start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT = italic_q start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT = 0 the only feasible ρ𝜌\rhoitalic_ρ is |ϕ+ϕ+|\mathopen{}\mathclose{{}\left|\phi^{+}\middle\rangle\middle\langle\phi^{+}}\right|| italic_ϕ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT ⟩ ⟨ italic_ϕ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT | so there’s nothing to optimize.

4.2 DMCV

A more interesting case of facial reduction is when a POVM {Ei}i=0n1superscriptsubscriptsubscript𝐸𝑖𝑖0𝑛1\{E_{i}\}_{i=0}^{n-1}{ italic_E start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT } start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n - 1 end_POSTSUPERSCRIPT is used to generate key. This is the case for example in discrete-modulated continuous variable (DMCV) QKD [lin2019]. For simplicity, assume that the estimated probabilities are compatible with a full-rank state, so that facial reduction is only needed for the maps 𝒢𝒢\mathcal{G}caligraphic_G and 𝒵𝒵\mathcal{Z}caligraphic_Z.

Let V=i=0n1𝟙AEiB|ia𝑉superscriptsubscript𝑖0𝑛1tensor-productsubscript1𝐴subscriptsubscript𝐸𝑖𝐵subscriptket𝑖𝑎V=\sum_{i=0}^{n-1}\mathds{1}_{A}\otimes\sqrt{E_{i}}_{B}\otimes\mathopen{}% \mathclose{{}\left|i}\right\rangle_{a}italic_V = ∑ start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n - 1 end_POSTSUPERSCRIPT blackboard_1 start_POSTSUBSCRIPT italic_A end_POSTSUBSCRIPT ⊗ square-root start_ARG italic_E start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT ⊗ | italic_i ⟩ start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT be the standard Naimark dilation of the POVM333Note that here we are applying the POVMs on Bob’s side, as is standard in DMCV., where a𝑎aitalic_a is an ancilla subsystem. Then 𝒢(ρ)=VρV𝒢𝜌𝑉𝜌superscript𝑉\mathcal{G}(\rho)=V\rho V^{\dagger}caligraphic_G ( italic_ρ ) = italic_V italic_ρ italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT, and 𝒵𝒵\mathcal{Z}caligraphic_Z has Kraus operators {𝟙AB|ii|a}i=0n1\{\mathds{1}_{AB}\otimes\mathopen{}\mathclose{{}\left|i\middle\rangle\middle% \langle i}\right|_{a}\}_{i=0}^{n-1}{ blackboard_1 start_POSTSUBSCRIPT italic_A italic_B end_POSTSUBSCRIPT ⊗ | italic_i ⟩ ⟨ italic_i | start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT } start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n - 1 end_POSTSUPERSCRIPT. As before, since 𝒢𝒢\mathcal{G}caligraphic_G is just an isometry, the reduction consists simply of removing it, and 𝒢^(ρ)=ρ^𝒢𝜌𝜌\widehat{\mathcal{G}}(\rho)=\rhoover^ start_ARG caligraphic_G end_ARG ( italic_ρ ) = italic_ρ. The Kraus operators of 𝒵𝒢𝒵𝒢\mathcal{Z}\circ\mathcal{G}caligraphic_Z ∘ caligraphic_G are {𝟙AEiB|ia}i=0n1superscriptsubscripttensor-productsubscript1𝐴subscriptsubscript𝐸𝑖𝐵subscriptket𝑖𝑎𝑖0𝑛1\{\mathds{1}_{A}\otimes\sqrt{E_{i}}_{B}\otimes\mathopen{}\mathclose{{}\left|i}% \right\rangle_{a}\}_{i=0}^{n-1}{ blackboard_1 start_POSTSUBSCRIPT italic_A end_POSTSUBSCRIPT ⊗ square-root start_ARG italic_E start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT ⊗ | italic_i ⟩ start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT } start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n - 1 end_POSTSUPERSCRIPT. It is easy to see that if all Eisubscript𝐸𝑖E_{i}italic_E start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT are full rank then 𝒵𝒢𝒵𝒢\mathcal{Z}\circ\mathcal{G}caligraphic_Z ∘ caligraphic_G has full rank range, and 𝒵^=𝒵𝒢^𝒵𝒵𝒢\widehat{\mathcal{Z}}=\mathcal{Z}\circ\mathcal{G}over^ start_ARG caligraphic_Z end_ARG = caligraphic_Z ∘ caligraphic_G. Otherwise further reduction is needed (or starting with a more appropriate Naimark dilation).

For the purpose of benchmarking we implemented here the heterodyne protocol from [lin2019] (a more sophisticated application of our technique to DMCV is presented in [pascualgarcia2024]). In it the POVMs are in fact full rank and the facial reduction is as described in the previous paragraph. The main parameter of the protocol is the number of photons at which the Fock space is cut off, Ncsubscript𝑁𝑐N_{c}italic_N start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT. The quantum state ρ𝜌\rhoitalic_ρ has dimension 4(Nc+1)4subscript𝑁𝑐14(N_{c}+1)4 ( italic_N start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT + 1 ), and the Kraus operators of 𝒵^^𝒵\widehat{\mathcal{Z}}over^ start_ARG caligraphic_Z end_ARG are 16(Nc+1)×4(Nc+1)16subscript𝑁𝑐14subscript𝑁𝑐116(N_{c}+1)\times 4(N_{c}+1)16 ( italic_N start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT + 1 ) × 4 ( italic_N start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT + 1 ). As it is a prepare-and-measure protocol, it also has constraints on the partial trace of ρ𝜌\rhoitalic_ρ, which in general may introduce null eigenvalues and necessitate further facial reduction. Here it is not the case.

4.3 MUB protocol

In the MUB protocol introduced in [cerf2002, sheridan2010], Alice and Bob measure a complete set of MUBs for some prime d𝑑ditalic_d, with Bob’s bases being the transpose of Alice’s, and estimate the probability of getting equal results. As in [araujo22], here there is no such limitations of using prime dimensions, equal results, or even exact MUBs.

As it is an entanglement-based protocol and we are generating key in a single basis, the key map 𝒢𝒢\mathcal{G}caligraphic_G can be taken to be identity, and the decoherence map 𝒵𝒵\mathcal{Z}caligraphic_Z to have Kraus operators {|ii|𝟙}i=0d1\{\mathopen{}\mathclose{{}\left|i\middle\rangle\middle\langle i}\right|\otimes% \mathds{1}\}_{i=0}^{d-1}{ | italic_i ⟩ ⟨ italic_i | ⊗ blackboard_1 } start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d - 1 end_POSTSUPERSCRIPT. When the estimated statistics are compatible with a full-rank state the facial reduction is trivial: 𝒢^=𝒢^𝒢𝒢\widehat{\mathcal{G}}=\mathcal{G}over^ start_ARG caligraphic_G end_ARG = caligraphic_G and 𝒵^=𝒵𝒢^𝒵𝒵𝒢\widehat{\mathcal{Z}}=\mathcal{Z}\circ\mathcal{G}over^ start_ARG caligraphic_Z end_ARG = caligraphic_Z ∘ caligraphic_G.

4.4 Overlapping bases protocol

In the overlapping bases protocol introduced in [araujo22], Alice and Bob measure only nearest-neighbour superpositions, with again Bob’s bases being the transpose of Alice’s. 𝒢𝒢\mathcal{G}caligraphic_G and 𝒵𝒵\mathcal{Z}caligraphic_Z are the same as in the MUB protocol, and the facial reduction is also trivial the estimated statistics are compatible with a full-rank state: 𝒢^=𝒢^𝒢𝒢\widehat{\mathcal{G}}=\mathcal{G}over^ start_ARG caligraphic_G end_ARG = caligraphic_G and 𝒵^=𝒵𝒢^𝒵𝒵𝒢\widehat{\mathcal{Z}}=\mathcal{Z}\circ\mathcal{G}over^ start_ARG caligraphic_Z end_ARG = caligraphic_Z ∘ caligraphic_G.

We note, however, that in [araujo22] the protocol was restricted to even d2𝑑2d\geq 2italic_d ≥ 2, whereas here we use it also for odd d𝑑ditalic_d.

4.5 Dealing with experimental data

As argued in [araujo22], in order to deal with experimental data one cannot naïvely set the probabilities 𝐩𝐩\mathbf{p}bold_p to be equal to the measured relative frequencies 𝐟𝐟\mathbf{f}bold_f. Instead, one should estimate the covariance matrix ΣΣ\Sigmaroman_Σ, and minimize H(A|E)𝐻conditional𝐴𝐸H(A|E)italic_H ( italic_A | italic_E ) over the desired confidence region, approximated as the intersection of the ellipsoid Σ12(𝐩𝐟)2χsubscriptnormsuperscriptΣ12𝐩𝐟2𝜒\big{\|}\Sigma^{-\frac{1}{2}}(\mathbf{p}-\mathbf{f})\big{\|}_{2}\leq\chi∥ roman_Σ start_POSTSUPERSCRIPT - divide start_ARG 1 end_ARG start_ARG 2 end_ARG end_POSTSUPERSCRIPT ( bold_p - bold_f ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ italic_χ with the set of parameters corresponding to valid quantum states.

One can do that with a simple modification of our conic program (8):

minh,σ,𝐩hs.t.tr(σ)=1,tr(𝐅σ)=𝐩,Σ12(𝐩𝐟)2χ,(h,σ)𝒦QKD𝒢^,𝒵^\begin{gathered}\min_{h,\sigma,\mathbf{p}}\ h\\ \text{s.t.}\quad\operatorname{tr}(\sigma)=1,\quad\operatorname{tr}(\mathbf{F}% \sigma)=\mathbf{p},\quad\big{\|}\Sigma^{-\frac{1}{2}}(\mathbf{p}-\mathbf{f})% \big{\|}_{2}\leq\chi,\\ (h,\sigma)\in\mathcal{K}_{\text{QKD}}^{\widehat{\mathcal{G}},\widehat{\mathcal% {Z}}}\end{gathered}start_ROW start_CELL roman_min start_POSTSUBSCRIPT italic_h , italic_σ , bold_p end_POSTSUBSCRIPT italic_h end_CELL end_ROW start_ROW start_CELL s.t. roman_tr ( italic_σ ) = 1 , roman_tr ( bold_F italic_σ ) = bold_p , ∥ roman_Σ start_POSTSUPERSCRIPT - divide start_ARG 1 end_ARG start_ARG 2 end_ARG end_POSTSUPERSCRIPT ( bold_p - bold_f ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ≤ italic_χ , end_CELL end_ROW start_ROW start_CELL ( italic_h , italic_σ ) ∈ caligraphic_K start_POSTSUBSCRIPT QKD end_POSTSUBSCRIPT start_POSTSUPERSCRIPT over^ start_ARG caligraphic_G end_ARG , over^ start_ARG caligraphic_Z end_ARG end_POSTSUPERSCRIPT end_CELL end_ROW (39)

where the constraint we added is a second-order-cone constraint, which is supported by almost every conic solver in existence, and in particular the one we are using, Hypatia.

5 Numerical results

5.1 Performance

In order to illustrate the performance of our technique, we ran the conic program (8) and compared the running time to the Gauss-Newton technique [hu2022] and the Gauss-Radau technique [araujo22]. We didn’t compare to the Frank-Wolfe technique from [winick2018] because [hu2022] already demonstrated superior performance. We didn’t compare to the technique from [karimi2024] because they only use the vanilla relative entropy cone; as such they cannot perform facial reduction and are unable to handle the QKD problem in full generality.

The protocols we benchmarked are the DMCV, MUB and overlap protocols described in subsections 4.2, 4.3 and 4.4. For the DMCV protocol we used the same parameters as in [hu2022]: amplitude of the coherent states α=0.35𝛼0.35\alpha=0.35italic_α = 0.35, noise ξ=0.05𝜉0.05\xi=0.05italic_ξ = 0.05, distance L=60𝐿60L=60italic_L = 60, and transmittance η=100.02L𝜂superscript100.02𝐿\eta=10^{-0.02L}italic_η = 10 start_POSTSUPERSCRIPT - 0.02 italic_L end_POSTSUPERSCRIPT. For the MUB and overlap protocols we used the probabilities from the isotropic state ρiso(v)=v|ϕ+ϕ+|+(1v)𝟙/d2\rho_{\text{iso}}(v)=v\mathopen{}\mathclose{{}\left|\phi^{+}\middle\rangle% \middle\langle\phi^{+}}\right|+(1-v)\mathds{1}/d^{2}italic_ρ start_POSTSUBSCRIPT iso end_POSTSUBSCRIPT ( italic_v ) = italic_v | italic_ϕ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT ⟩ ⟨ italic_ϕ start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT | + ( 1 - italic_v ) blackboard_1 / italic_d start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT with visibility v=0.95𝑣0.95v=0.95italic_v = 0.95. For the MUB protocol we computed the probabilities using all bases, included the complex ones, whereas for the overlap protocol we restricted ourselves to the real ones. We did this in order to illustrate an additional advantage of our technique: since it uses standard conic optimization, we can avail ourselves of standard symmetrization techniques to show that we can optimize over real variables only, as done in [araujo22]. This provides an additional boost in performance.

The calculations were done on an AMD Ryzen Threadripper Pro 5955WX with 4 GHz and 16 cores, on a machine with 512 GiB RAM running Ubuntu Linux 22.04. Our code was run with Julia 1.10 and the modeller JuMP [lubin2023], and the two other techniques with MATLAB 2023b. For the Gauss-Radau technique we used the solver MOSEK 10.1 [mosek] and the modeller YALMIP [yalmip]. In all cases we have only reported the time taken by the solver, discounting the time taken to set up the problem, and in the Gauss-Newton case the time taken to do facial reduction numerically.

The results are shown in Figures 1, 2, and 3.

4444666688881010101012121212100superscript10010^{0}10 start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT101superscript10110^{1}10 start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT102superscript10210^{2}10 start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT103superscript10310^{3}10 start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPTNcsubscript𝑁𝑐N_{c}italic_N start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPTtime (s)QKD coneGauss-NewtonGauss-Radau
Figure 1: Running time in seconds (logarithmic scale) as a function of the photon cut-off number for the DMCV protocol.
2222444466668888101010101212121214141414103superscript10310^{-3}10 start_POSTSUPERSCRIPT - 3 end_POSTSUPERSCRIPT102superscript10210^{-2}10 start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT101superscript10110^{-1}10 start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT100superscript10010^{0}10 start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT101superscript10110^{1}10 start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT102superscript10210^{2}10 start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT103superscript10310^{3}10 start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT104superscript10410^{4}10 start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPTd𝑑ditalic_dtime (s)QKD coneGauss-NewtonGauss-Radau
Figure 2: Running time in seconds (logarithmic scale) as a function of the local state dimension for the MUB protocol. Note that for d=6,10𝑑610d=6,10italic_d = 6 , 10, and 12121212 the bases used are only roughly unbiased.
2222444466668888101010101212121214141414102superscript10210^{-2}10 start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT101superscript10110^{-1}10 start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT100superscript10010^{0}10 start_POSTSUPERSCRIPT 0 end_POSTSUPERSCRIPT101superscript10110^{1}10 start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT102superscript10210^{2}10 start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT103superscript10310^{3}10 start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT104superscript10410^{4}10 start_POSTSUPERSCRIPT 4 end_POSTSUPERSCRIPTd𝑑ditalic_dtime (s)QKD coneGauss-NewtonGauss-Radau
Figure 3: Running time in seconds (logarithmic scale) as a function of the local state dimension for the overlap protocol.

5.2 Precision

In order to obtain results with higher precision, one might try setting tighter tolerances for the conic solver. This is however not fruitful, as very quickly one meets fundamental obstacles due to the limited precision with which the standard 64-bit floating point numbers can perform computations or even represent the problem parameters. For this reason we have used instead different implementations of floating point numbers that provide more bits of precision. This is easy due to Julia’s flexible type system, which is fully taken advantage of by the solver Hypatia and the modeller JuMP; in order to use a different type we simply need to give it as a parameter. Hypatia automatically sets the appropriate tolerances for each type. We have used the following four types:

  • Float64

    Default 64-bit floating point number implementing the IEEE 754 standard, known as double. Has 53 bits of precision.

  • Float128

    128-bit floating point number implementing the IEEE 754 standard, known as quadruple. Has 113 bits of precision. Provided by the package Quadmath [quadmath2024], which wraps the GCC library libquadmath.

  • Double64

    Non-standard implementation of a 128-bit floating point number as a pair of 64-bit floating point numbers, known as double-double. Much faster than Float128, but has smaller precision and smaller range of exponent. Has 106 bits of precision. Provided by the package DoubleFloats [sarnoff2024].

  • BigFloat

    Arbitrary precision floating-point number. We used the default settings, with which it occupies 640 bits of memory and provides 256 bits of precision. Provided by the Julia standard library, wrapping the GNU MPFR library [fousse2007].

In order to evaluate the error for which type, we solved the conic program (8) for the protocols we know the analytical answers: BB84 4.1 and MUB 4.3. The results are shown in Table 1.

Although the analytical answer is also known for the DMCV protocol 4.2 for the case ξ=0𝜉0\xi=0italic_ξ = 0, a comparison is not meaningful: after facial reduction there is a unique state compatible with the constraints, so there’s nothing to optimize. We have nevertheless provided the analytical answer and the facial reduction in the accompanying source code for the interested reader.

BB84 MUB
Float64 5.4×1085.4superscript1085.4\times 10^{-8}5.4 × 10 start_POSTSUPERSCRIPT - 8 end_POSTSUPERSCRIPT 5.7×1085.7superscript1085.7\times 10^{-8}5.7 × 10 start_POSTSUPERSCRIPT - 8 end_POSTSUPERSCRIPT
Double64 1.4×10131.4superscript10131.4\times 10^{-13}1.4 × 10 start_POSTSUPERSCRIPT - 13 end_POSTSUPERSCRIPT 1.6×10121.6superscript10121.6\times 10^{-12}1.6 × 10 start_POSTSUPERSCRIPT - 12 end_POSTSUPERSCRIPT
Float128 1.7×10141.7superscript10141.7\times 10^{-14}1.7 × 10 start_POSTSUPERSCRIPT - 14 end_POSTSUPERSCRIPT 7.7×10147.7superscript10147.7\times 10^{-14}7.7 × 10 start_POSTSUPERSCRIPT - 14 end_POSTSUPERSCRIPT
BigFloat 7.9×10327.9superscript10327.9\times 10^{-32}7.9 × 10 start_POSTSUPERSCRIPT - 32 end_POSTSUPERSCRIPT 1.6×10311.6superscript10311.6\times 10^{-31}1.6 × 10 start_POSTSUPERSCRIPT - 31 end_POSTSUPERSCRIPT
Table 1: Absolute difference between analytical H(A|E)𝐻conditional𝐴𝐸H(A|E)italic_H ( italic_A | italic_E ) and the one computed via the conic program (8) for various floating point implementations, for the BB84 and MUB protocols. For the BB84 protocol we used the parameters qx=qz=0.025subscript𝑞𝑥subscript𝑞𝑧0.025q_{x}=q_{z}=0.025italic_q start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT = italic_q start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT = 0.025, and for the MUB protocol we used dimension 3 with 4 bases and the probabilities of an isotropic state with visibility v=0.95𝑣0.95v=0.95italic_v = 0.95.

Note that in general a number that has a finite decimal expansion does not have a finite binary expansion. For example, 0.9510=0.11110011¯2subscript0.95100.1111subscript¯001120.95_{10}=0.1111\overline{0011}_{2}0.95 start_POSTSUBSCRIPT 10 end_POSTSUBSCRIPT = 0.1111 over¯ start_ARG 0011 end_ARG start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT. Therefore, even when giving input parameters that at first sight seem exact, one often incurs in truncation errors. Therefore, in order to take advantage of higher precision types, the input parameters also need to use them444One must also be careful to write 0.950.950.950.95 as (e.g.) Double64(95)/100 instead of Double64(0.95), as the latter first computes 0.950.950.950.95 as a Float64 before converting it to Double64.. For this reason our example codes don’t require the type desired for the computation to be specified, but rather read it from the type of the input parameters.

It’s important to emphasize that the increased precision comes at the cost of a much longer running time. Not only the elementary operations take longer, but also the number of iterations increases in order to meet the tighter tolerances.

6 Conclusion

In this work we have introduced a new technique for obtaining the asymptotic key rate in quantum key distribution. It is based on defining a new convex cone for the key rate problem and optimizing over it with standard conic optimization methods. It provides a dramatic improvement in performance with respect to previous techniques. Moreover, it is flexible and easy to use, as one can freely combine it with other convex cones, and it is interfaced through a convenient modelling language, JuMP.

As a primal-dual method, it directly provides a lower bound through the value of the dual objective. However, we lack an explicit expression for the dual cone, which stops us from converting the dual minimizer into an independent witness for the lower bound. One can nevertheless construct such a witness from the primal minimizer as described in [winick2018] and done for example in [pascualgarcia2024].

The most pressing issue is to adapt the method to finding finite key rates. Although one can use the asymptotic key rate to obtain a finite key rate via the (generalized) entropy accumulation theorem [dupuis2020, metger2022], as done for example in [pascualgarcia2024], such bounds are known to be loose. Tight bounds are provided by more recent techniques [vanhimbeeck2024, arqand2024], that however require optimization over Rényi entropies. It would be interesting to apply conic methods to this case.

7 Code availability

The implementation of the cone introduced in this paper and the code for the examples shown are available in https://github.com/araujoms/ConicQKD.jl.

8 Related work

While conducting this research, we found that He et al. [he2024] had independently applied the Skajaa and Ye algorithm to the problem of QKD using a specialized cone, and proved the self-concordance of the barrier function (12).

9 Acknowledgements

The research of A.G.L., P.V.P., M.C.C. and M.A. was supported by the European Union–Next Generation UE/MICIU/Plan de Recuperación, Transformación y Resiliencia/Junta de Castilla y León. We thank Chris Coey and Lea Kapelevich for help with Hypatia’s code and useful discussions.

\printbibliography

Appendix A Vectorization

Let vec:dO×dIdOdI:vecsuperscriptsubscript𝑑𝑂subscript𝑑𝐼superscriptsubscript𝑑𝑂subscript𝑑𝐼\operatorname{vec}:\mathbb{C}^{d_{O}\times d_{I}}\to\mathbb{C}^{d_{O}d_{I}}roman_vec : blackboard_C start_POSTSUPERSCRIPT italic_d start_POSTSUBSCRIPT italic_O end_POSTSUBSCRIPT × italic_d start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT end_POSTSUPERSCRIPT → blackboard_C start_POSTSUPERSCRIPT italic_d start_POSTSUBSCRIPT italic_O end_POSTSUBSCRIPT italic_d start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT end_POSTSUPERSCRIPT be the col-major vectorization of a real or complex rectangular matrix. It is useful to represent it as

vec(X)=𝟙dIXvec(𝟙dI),vec𝑋tensor-productsubscript1subscript𝑑𝐼𝑋vecsubscript1subscript𝑑𝐼\operatorname{vec}(X)=\mathds{1}_{d_{I}}\otimes X\operatorname{vec}(\mathds{1}% _{d_{I}}),roman_vec ( italic_X ) = blackboard_1 start_POSTSUBSCRIPT italic_d start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT end_POSTSUBSCRIPT ⊗ italic_X roman_vec ( blackboard_1 start_POSTSUBSCRIPT italic_d start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) , (40)

where vec(𝟙dI)=i=0dI1|iivecsubscript1subscript𝑑𝐼superscriptsubscript𝑖0subscript𝑑𝐼1ket𝑖𝑖\operatorname{vec}(\mathds{1}_{d_{I}})=\sum_{i=0}^{d_{I}-1}\mathopen{}% \mathclose{{}\left|ii}\right\rangleroman_vec ( blackboard_1 start_POSTSUBSCRIPT italic_d start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) = ∑ start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT - 1 end_POSTSUPERSCRIPT | italic_i italic_i ⟩. It is common to write vec(X)vec𝑋\operatorname{vec}(X)roman_vec ( italic_X ) as |X{|{X}\rangle\!\rangle}| italic_X ⟩ ⟩. A useful identity is

vec(ABC)=CTAvec(B).vec𝐴𝐵𝐶tensor-productsuperscript𝐶𝑇𝐴vec𝐵\operatorname{vec}(ABC)=C^{T}\otimes A\operatorname{vec}(B).roman_vec ( italic_A italic_B italic_C ) = italic_C start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ⊗ italic_A roman_vec ( italic_B ) . (41)

For implementation purposes we work with a non-redundant vectorization for Hermitian matrices svec(X)svec𝑋\operatorname{svec}(X)roman_svec ( italic_X ). The real and complex cases differ. In the real case, svec:d×dd(d+1)/2:svecsuperscript𝑑𝑑superscript𝑑𝑑12\operatorname{svec}:\mathbb{R}^{d\times d}\to\mathbb{R}^{d(d+1)/2}roman_svec : blackboard_R start_POSTSUPERSCRIPT italic_d × italic_d end_POSTSUPERSCRIPT → blackboard_R start_POSTSUPERSCRIPT italic_d ( italic_d + 1 ) / 2 end_POSTSUPERSCRIPT is the col-major vectorization of the upper triangle, with the off-diagonals multiplied by 22\sqrt{2}square-root start_ARG 2 end_ARG. In the complex case, svec:d×dd2:svecsuperscript𝑑𝑑superscriptsuperscript𝑑2\operatorname{svec}:\mathbb{C}^{d\times d}\to\mathbb{R}^{d^{2}}roman_svec : blackboard_C start_POSTSUPERSCRIPT italic_d × italic_d end_POSTSUPERSCRIPT → blackboard_R start_POSTSUPERSCRIPT italic_d start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT additionally splits the complex numbers into the real part and minus the imaginary part555This is an arbitrary choice, made to coincide with the code., storing them consecutively.

The factor of 22\sqrt{2}square-root start_ARG 2 end_ARG multiplying the off-diagonals is necessary to ensure that

X,Y=vec(X),vec(Y)=svec(X),svec(Y)X,Y.formulae-sequence𝑋𝑌vec𝑋vec𝑌svec𝑋svec𝑌for-all𝑋𝑌\langle X,Y\rangle=\langle\operatorname{vec}(X),\operatorname{vec}(Y)\rangle=% \langle\operatorname{svec}(X),\operatorname{svec}(Y)\rangle\quad\forall X,Y.⟨ italic_X , italic_Y ⟩ = ⟨ roman_vec ( italic_X ) , roman_vec ( italic_Y ) ⟩ = ⟨ roman_svec ( italic_X ) , roman_svec ( italic_Y ) ⟩ ∀ italic_X , italic_Y . (42)

Let V𝑉Vitalic_V be the isometry666This is an abuse of notation since the operator is different in the real and complex cases. Also note that in the complex case V𝑉Vitalic_V is additionally unitary. such that

vec(X)=Vsvec(X)vec𝑋𝑉svec𝑋\operatorname{vec}(X)=V\operatorname{svec}(X)roman_vec ( italic_X ) = italic_V roman_svec ( italic_X ) (43)

for all Hermitian X𝑋Xitalic_X. Suppose you are interested in representing a linear function f𝑓fitalic_f as a matrix F𝐹Fitalic_F such that

Fvec(X)=vec(f(X))X.𝐹vec𝑋vec𝑓𝑋for-all𝑋F\operatorname{vec}(X)=\operatorname{vec}(f(X))\quad\forall X.italic_F roman_vec ( italic_X ) = roman_vec ( italic_f ( italic_X ) ) ∀ italic_X . (44)

This is easy to do using identity (41) and linearity. If this function is additionally Hermitian-preserving we have that

VFVsvec(X)=svec(f(X))superscript𝑉𝐹𝑉svec𝑋svec𝑓𝑋V^{\dagger}FV\operatorname{svec}(X)=\operatorname{svec}(f(X))italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_F italic_V roman_svec ( italic_X ) = roman_svec ( italic_f ( italic_X ) ) (45)

for all Hermitian X𝑋Xitalic_X.

This is mainly useful for proving theorems, as a direct computation of VFVsuperscript𝑉𝐹𝑉V^{\dagger}FVitalic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_F italic_V via this formula is inefficient. For the particular case of f(X)=KXK𝑓𝑋𝐾𝑋superscript𝐾f(X)=KXK^{\dagger}italic_f ( italic_X ) = italic_K italic_X italic_K start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT we implemented an efficient function to compute it, skron(K)=V(K¯K)Vskron𝐾superscript𝑉tensor-product¯𝐾𝐾𝑉\operatorname{skron}(K)=V^{\dagger}(\bar{K}\otimes K)Vroman_skron ( italic_K ) = italic_V start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( over¯ start_ARG italic_K end_ARG ⊗ italic_K ) italic_V.