Abstract
Computing key rates in quantum key distribution (QKD) numerically is essential to unlock more powerful protocols, that use more sophisticated measurement bases or quantum systems of higher dimension. It is a difficult optimization problem, that depends on minimizing a convex non-linear function: the (quantum) relative entropy. Standard conic optimization techniques have for a long time been unable to handle the relative entropy cone, as it is a non-symmetric cone, and the standard algorithms can only handle symmetric ones. Recently, however, a practical algorithm has been discovered for optimizing over non-symmetric cones, including the relative entropy. Here we adapt this algorithm to the problem of computation of key rates, obtaining an efficient technique for lower bounding them. In comparison to previous techniques it has the advantages of flexibility, ease of use, and above all performance.
1 Introduction
Secret key rates in QKD are usually calculated analytically [gisin2002, scarani2009, xu2020, pirandola2020]. This is only tractable for simple protocols with highly symmetric measurement bases, such as BB84 [bennett1984], the six-state protocol [bruss1998], or their generalizations to mutually unbiased bases (MUBs) in higher dimensions [cerf2002, sheridan2010]. Recently there has been intense interest in numerical techniques for the computation of key rates in order to unlock more sophisticated protocols, that use more parameters, arbitrary measurement bases, and higher dimensions.
The best existing techniques are the Gauss-Newton technique from [hu2022], and the Gauss-Radau technique from [araujo22]. The former consists of an implementation of a bespoke Gauss-Newton solver for the key rate problem, and the latter consists of reducing the problem to a convergent hierarchy of semidefinite programs. Both allow for the computation of optimal key rates for arbitrary protocols, but suffer from several disadvantages. The main one, common to both techniques, is low performance, which limits their applicability to protocols with low dimension. Specific problems of the Gauss-Newton technique are that it doesn’t use standard conic optimization techniques, which makes it challenging to implement and inflexible. For example, it cannot handle protocols that use anything other than equality constraints. A specific problem of the Gauss-Radau technique is that for any fixed level of the hierarchy it only delivers an approximation of the key rate, and the cost of achieving a given precision can be prohibitive.
Here we introduce a new technique that is several times faster than both techniques, while also solving their specific issues. Unlike the Gauss-Newton technique, it uses standard conic optimization, making it easy to implement and combine with other kinds of constraints that might be required by the protocols. Unlike the Gauss-Radau technique, it doesn’t rely on reformulating the problem as a sequence of semidefinite programs, but instead solves it directly.
We use the framework from [winick2018] to formulate the key rate problem as a convex optimization problem, the minimization of a (quantum) relative entropy over a convex domain. Using the relative entropy cone, which is a non-symmetric cone, it becomes a conic optimization problem. Although for a long time algorithms for optimizing over non-symmetric cones have been known [nesterov1999, tuncel2001, nesterov2012], they were hardly practical, as they required for instance a tractable barrier function for the dual cone. This changed with the discovery of Skajaa and Ye’s algorithm [skajaa2015, papp2017], and with it solving the key rate problem becomes in principle possible. There is, however, a technical difficulty: this formulation of the key rate problem is often not strictly feasible, and such problems cannot be reliably solved with conic optimization methods. The standard technique to deal with this, called facial reduction [drusvyatskiy2017], does apply, but the resulting reduced problem is no longer a minimization of the relative entropy. For this reason we introduce a new convex cone to deal directly with the reduced problem.
We implemented our new cone in the programming language Julia [bezanson2017] as an extension to the solver Hypatia [coey2022], which implements an improved version of the Skajaa and Ye algorithm [coey2023]. This solver is interfaced by the modeller JuMP [lubin2023], making it easy to use, and takes advantage of Julia’s flexible type system, allowing the user to solve the optimization problems using double, double-double, quadruple, or arbitrary precision.
The paper is organized as follows. In Section 2 we reformulate the key rate problem as a conic problem and introduce the QKD cone. In Section 3 we propose a barrier function and obtain the derivatives necessary to implement the cone. In Section 4 we show how to formulate some QKD protocols to compute the key rates with our technique. In Section 5 we benchmark our technique against the Gauss-Newton and Gauss-Radau techniques.
2 QKD rate as a conic problem
In quantum key distribution the asymptotic key rate is given by the Devetak-Winter rate [devetak2005]:
|
|
|
(1) |
where is the conditional von Neumann entropy between Alice and Eve, and the conditional von Neumann entropy between Alice and Bob. Since is completely determined by the measured data, the challenge is to compute , which has to be minimized over all quantum states compatible with the measured statistics. Following [winick2018], it can be expressed in terms of the relative entropy as
|
|
|
(2) |
where is the relative entropy, is a CP map that takes to an extended quantum state that includes the secret key as a subsystem, and is a CPTP map that decoheres the key subsystem. is necessarily of the form for some projectors that sum to identity. The optimization problem is thus
|
|
|
(3) |
where are the POVMs encoding the QKD protocol, and are the estimated probabilities. This formulation makes the computation of key rates a convex optimization problem. To make it a conic optimization problem, however, we need to write it in terms of the relative entropy convex cone, namely
|
|
|
(4) |
where denotes the space of complex Hermitian matrices. It becomes then
|
|
|
(5) |
In principle it can now be solved by using Skajaa and Ye’s algorithm, but there’s a difficulty: as with any interior-point algorithm, the problem needs to be strictly feasible for the solution to be found reliably and efficiently. In other words, there must exist a point that satisfies the equality constraints of the conic problem and is in the interior of all cones we use. This can fail here in two places: the equality constraints might be only satisfiable by a quantum state with null eigenvalues, and the map might take a positive definite quantum state to one with null eigenvalues. The map , on the other hand, preserves positive definiteness [hu2022].
We note that this same difficulty affects other techniques [winick2018, hu2022, araujo22]: in [winick2018] they perturb the singular matrices with identity to make them full rank, whereas in [hu2022, araujo22] they reformulate the problem in terms of the support of the relevant matrices, a procedure known as facial reduction [drusvyatskiy2017]. Facial reduction is more reliable, elegant, and results in a smaller problem than perturbing with identity, so we adopt this approach here as well.
We would thus like to perform facial reduction along the lines of [hu2022]: first find an isometry that encodes the support of the feasible , such that for full-rank . Then replace the equality constraints with , dropping the redundant ones. Finally, find maps and such that and are the restrictions of the maps and to the support of their ranges.
While that’s easy to do, the resulting problem is no longer an optimization over the relative entropy; in particular . Instead, to obtain the reformulated problem we use the following identity [coles2012]:
|
|
|
(6) |
where is the von Neumann entropy, together with the simple observation that and . This motivates the definition of a new convex cone:
|
|
|
(7) |
We emphasize that and must come from the reduction of and ; if we let them be arbitrary CP maps is not even a convex cone in general.
With the cone in hand, we can finally state the reduced problem:
|
|
|
(8) |
Note that the constraint has been dropped; although it’s equivalent to , the latter is redundant with the definition of .
We emphasize that even in the case where no facial reduction is necessary, and and , it is still advantageous to use the QKD cone (7) instead of the relative entropy cone (4) for four reasons: (i) the constraint can not be dropped in the relative entropy cone unless has a positive inverse, (ii) the QKD cone has one less matrix variable to optimize over, (iii) the derivatives of the barrier function of the QKD cone, that we introduce below, are less computationally demanding, (iv) as the QKD cone explicitly depends on and , it can exploit their structure; specifically, we use the fact that necessarily has a block diagonal structure to drastically speed up the computation of several derivatives involving it.
3 Implementation of the QKD cone
In order to successfully optimize over a given convex cone , both using Skajaa and Ye’s algorithm or traditional methods for symmetric cones, it’s essential to provide a logarithmically homogeneous self-concordant barrier (LHSCB) function for it [nesterov1997]. It is defined as a function such that along every sequence converging to the boundary of , is three times continuously differentiable, strictly convex, and
|
|
|
(9) |
|
|
|
(10) |
While most of these properties are straightforward to verify, self-concordance (10) is not, and has only recently been proven for the following barrier function for the relative entropy cone (4) [fawzi2022]:
|
|
|
(11) |
We propose as the barrier function for the QKD cone (7) its obvious analogue:
|
|
|
(12) |
We conjecture that it is also self-concordant based on the similarity and the fact that our extensive numerical tests were successful.
In order to implement the cone (7) in the solver Hypatia, we need to provide the gradient, Hessian, and third-order derivative of the barrier function (12), and choose an initial point [coey2023].
3.1 Gradient, Hessian, and third-order derivative
To obtain the gradient, Hessian, and third-order derivative of the barrier function (12) we adapt the results of [faybusovich2022] to complex matrices, and combine them with the results of [coey2024] for the relative entropy cone. In the code they are needed in vectorized form, as explained in Appendix A. Here we present them in the usual notation for clarity.
Let for some positive map . Then its gradient is
|
|
|
(13) |
where is the adjoint of . Its Hessian is the linear operator
|
|
|
(14) |
where is the elementwise (Schur) product, is a diagonalization of , and is defined as
|
|
|
(15) |
where . Its third-order derivative applied to is
|
|
|
(16) |
where
|
|
|
(17) |
with and
|
|
|
(18) |
The gradient and Hessian of the term are well-known to be and , respectively [nesterov1997]. The third order derivative can be easily calculated to be
|
|
|
(19) |
Putting everything together, the gradient of the barrier is given by
|
|
|
|
(20) |
|
|
|
|
(21) |
where and , and the Hessian by
|
|
|
|
(22) |
|
|
|
|
(23) |
|
|
|
|
(24) |
where
|
|
|
(25) |
and and are diagonalizations of and .
The third order derivatives are given by:
|
|
|
|
(26) |
|
|
|
|
(27) |
|
|
|
|
(28) |
|
|
|
|
|
|
|
|
|
|
|
|
(29) |
In particular, applying to the point gives
|
|
|
|
|
|
|
|
|
|
|
|
(30) |
where
|
|
|
(31) |
and and are the analogues of (17).
3.2 Initial point
The initial point can in principle be chosen to be any point in the interior of the cone [skajaa2015], but following [coey2023, dahl2022] we’d like to use the central point, as it provides better performance. It is defined as the minimizer of
|
|
|
(32) |
where is our barrier function (12), and . The minimizer is obtained by taking the gradient of the objective and setting it to zero:
|
|
|
(33) |
This results in the pair of equations
|
|
|
|
(34) |
|
|
|
|
(35) |
where we are using the gradient from equations (20)–(21).
Let now , where . We get then
|
|
|
(36) |
|
|
|
(37) |
While we have been unable to obtain a general solution of these equations, we note that often the maps satisfy . In this case will satisfy equation (37) for any value of , which we can then choose to be given by equation (36). We use therefore these values as our initial point.
Note that even when equation (37) is not satisfied this point is still in the interior of the cone, as is full rank and , and thus is a valid starting point.
6 Conclusion
In this work we have introduced a new technique for obtaining the asymptotic key rate in quantum key distribution. It is based on defining a new convex cone for the key rate problem and optimizing over it with standard conic optimization methods. It provides a dramatic improvement in performance with respect to previous techniques. Moreover, it is flexible and easy to use, as one can freely combine it with other convex cones, and it is interfaced through a convenient modelling language, JuMP.
As a primal-dual method, it directly provides a lower bound through the value of the dual objective. However, we lack an explicit expression for the dual cone, which stops us from converting the dual minimizer into an independent witness for the lower bound. One can nevertheless construct such a witness from the primal minimizer as described in [winick2018] and done for example in [pascualgarcia2024].
The most pressing issue is to adapt the method to finding finite key rates. Although one can use the asymptotic key rate to obtain a finite key rate via the (generalized) entropy accumulation theorem [dupuis2020, metger2022], as done for example in [pascualgarcia2024], such bounds are known to be loose. Tight bounds are provided by more recent techniques [vanhimbeeck2024, arqand2024], that however require optimization over Rényi entropies. It would be interesting to apply conic methods to this case.
Appendix A Vectorization
Let be the col-major vectorization of a real or complex rectangular matrix. It is useful to represent it as
|
|
|
(40) |
where . It is common to write as . A useful identity is
|
|
|
(41) |
For implementation purposes we work with a non-redundant vectorization for Hermitian matrices . The real and complex cases differ. In the real case, is the col-major vectorization of the upper triangle, with the off-diagonals multiplied by . In the complex case, additionally splits the complex numbers into the real part and minus the imaginary part, storing them consecutively.
The factor of multiplying the off-diagonals is necessary to ensure that
|
|
|
(42) |
Let be the isometry such that
|
|
|
(43) |
for all Hermitian . Suppose you are interested in representing a linear function as a matrix such that
|
|
|
(44) |
This is easy to do using identity (41) and linearity. If this function is additionally Hermitian-preserving we have that
|
|
|
(45) |
for all Hermitian .
This is mainly useful for proving theorems, as a direct computation of via this formula is inefficient. For the particular case of we implemented an efficient function to compute it, .