Quantum key distribution rates from non-symmetric conic optimization

Andrés González Lorente, Pablo V. Parellada, Miguel Castillo-Celeita, and Mateus Araújo

Departamento de Física Teórica, Atómica y Óptica, Universidad de Valladolid, 47011 Valladolid, Spain

(24th July 2024)

Abstract

Computing key rates in quantum key distribution (QKD) numerically is essential to unlock more powerful protocols that use more sophisticated measurement bases or quantum systems of higher dimension. It is a difficult optimization problem that depends on minimizing a convex non-linear function: the (quantum) relative entropy. Standard conic optimization techniques have for a long time been unable to handle the relative entropy cone, as it is a non-symmetric cone, and the standard algorithms can only handle symmetric ones. Recently, however, a practical algorithm has been discovered for optimizing over non-symmetric cones, including the relative entropy cone. Here we adapt this algorithm to the problem of computing key rates, obtaining an efficient technique for lower bounding them. In comparison to previous techniques it has the advantages of flexibility, ease of use, and above all performance.

1 Introduction

Secret key rates in QKD are usually calculated analytically [gisin2002, scarani2009, xu2020, pirandola2020]. This is only tractable for simple protocols with highly symmetric measurement bases, such as BB84 [bennett1984], the six-state protocol [bruss1998], or their generalizations to mutually unbiased bases (MUBs) in higher dimensions [cerf2002, sheridan2010]. Recently there has been intense interest in numerical techniques for the computation of key rates in order to unlock more sophisticated protocols that use more parameters, arbitrary measurement bases, and higher dimensions.

The best existing techniques are the Gauss-Newton technique from [hu2022], and the Gauss-Radau technique from [araujo22]. The former consists of an implementation of a bespoke Gauss-Newton solver for the key rate problem, and the latter consists of reducing the problem to a convergent hierarchy of semidefinite programs. Both allow for the computation of optimal key rates for arbitrary protocols, but suffer from several disadvantages. The main one, common to both techniques, is low performance, which limits their applicability to protocols with low dimension. A specific problem of the Gauss-Newton technique is that it doesn’t use standard conic optimization techniques, which makes it challenging to implement and inflexible. For example, it cannot handle protocols that use anything other than equality constraints. A specific problem of the Gauss-Radau technique is that for any fixed level of the hierarchy it only delivers an approximation of the key rate, and the cost of achieving a given precision can be prohibitive.

Here we introduce a new technique that is several times faster than both techniques, while also solving their specific issues. Unlike the Gauss-Newton technique, it uses standard conic optimization, making it easy to implement and combine with other kinds of constraints that might be required by the protocols. Unlike the Gauss-Radau technique, it doesn’t rely on reformulating the problem as a sequence of semidefinite programs, but instead solves it directly.

We use the framework from [winick2018] to formulate the key rate problem as a convex optimization problem, the minimization of a (quantum) relative entropy over a convex domain. Using the relative entropy cone, which is a non-symmetric cone, it becomes a conic optimization problem. Although for a long time algorithms for optimizing over non-symmetric cones have been known [nesterov1999, tuncel2001, nesterov2012], they were hardly practical, as they required for instance a tractable barrier function for the dual cone. This changed with the discovery of Skajaa and Ye’s algorithm [skajaa2015, papp2017], and with it solving the key rate problem becomes in principle possible. There is, however, a technical difficulty: this formulation of the key rate problem is often not strictly feasible, and such problems cannot be reliably solved with conic optimization methods. The standard technique to deal with this, called facial reduction [drusvyatskiy2017], does apply, but the resulting reduced problem is no longer a minimization of the relative entropy. For this reason we introduce a new convex cone to deal directly with the reduced problem.

We implemented our new cone in the programming language Julia [bezanson2017] as an extension to the solver Hypatia [coey2022], which implements an improved version of the Skajaa and Ye algorithm [coey2023]. This solver is interfaced by the modeller JuMP [lubin2023], making it easy to use, and takes advantage of Julia’s flexible type system, allowing the user to solve the optimization problems using double, double-double, quadruple, or arbitrary precision.

The paper is organized as follows. In Section 2 we reformulate the key rate problem as a conic problem and introduce the QKD cone. In Section 3 we propose a barrier function and obtain the derivatives necessary to implement the cone. In Section 4 we show how to formulate some QKD protocols to compute the key rates with our technique. In Section 5 we benchmark our technique against the Gauss-Newton and Gauss-Radau techniques.

2 QKD rate as a conic problem

In quantum key distribution the asymptotic key rate is given by the Devetak-Winter rate [devetak2005]:

K\geq H(A|E)-H(A|B),

(1)

where $H(A|E)$ is the conditional von Neumann entropy between Alice and Eve, and $H(A|B)$ the conditional von Neumann entropy between Alice and Bob. Since $H(A|B)$ is completely determined by the measured data, the challenge is to compute $H(A|E)$ , which has to be minimized over all quantum states compatible with the measured statistics. Following [winick2018], it can be expressed in terms of the relative entropy as

H(A|E)_{\rho_{ABE}}=D(\mathcal{G}(\rho_{AB})\|\mathcal{Z}(\mathcal{G}(\rho_{AB}))),

(2)

where $D(\cdot\|\cdot)$ is the relative entropy, $\mathcal{G}$ is a CP map that takes $\rho_{AB}$ to an extended quantum state that includes the secret key as a subsystem, and $\mathcal{Z}$ is a CPTP map that decoheres the key subsystem. $\mathcal{Z}$ is necessarily of the form $\mathcal{Z}(\rho)=\sum_{i}\Pi_{i}\rho\Pi_{i}$ for some projectors $\Pi_{i}$ that sum to identity. The optimization problem is thus

\begin{gathered}\min_{\rho}\ D(\mathcal{G}(\rho)\|\mathcal{Z}(\mathcal{G}(\rho)))\\ \text{s.t.}\quad\rho\succeq 0,\quad\operatorname{tr}(\rho)=1,\\ \operatorname{tr}(E_{k}\rho)=p_{k}\ \forall k\end{gathered}

(3)

where $E_{k}$ are the POVMs encoding the QKD protocol, and $p_{k}$ are the estimated probabilities. This formulation makes the computation of key rates a convex optimization problem. To make it a conic optimization problem, however, we need to write it in terms of the relative entropy convex cone, namely

\mathcal{K}_{\text{RE}}=\text{cl}\,\{(h,\rho,\sigma)\in\mathbb{R}\times\mathbb{H}^{n}\times\mathbb{H}^{n};\,\rho\succ 0,\ \sigma\succ 0,\ h\geq D(\rho\|\sigma)\},

(4)

where $\mathbb{H}^{n}$ denotes the space of complex Hermitian matrices. It becomes then

\begin{gathered}\min_{h,\rho}\ h\\ \text{s.t.}\quad\rho\succeq 0,\quad\operatorname{tr}(\rho)=1,\\ \operatorname{tr}(E_{k}\rho)=p_{k}\ \forall k,\\ \Big(h,\mathcal{G}(\rho),\mathcal{Z}(\mathcal{G}(\rho))\Big)\in\mathcal{K}_{\text{RE}}\end{gathered}

(5)

In principle it can now be solved by using Skajaa and Ye’s algorithm, but there’s a difficulty: as with any interior-point algorithm, the problem needs to be strictly feasible for the solution to be found reliably and efficiently. In other words, there must exist a point $(h,\rho)$ that satisfies the equality constraints of the conic problem and is in the interior of all cones we use. This can fail here in two places: the equality constraints $\operatorname{tr}(E_{k}\rho)=p_{k}$ might be only satisfiable by a quantum state with null eigenvalues, and the map $\mathcal{G}$ might take a positive definite quantum state to one with null eigenvalues. The map $\mathcal{Z}$, on the other hand, preserves positive definiteness [hu2022].¹

¹ To see that, assume for the sake of contradiction that $|\psi\rangle$ is a null eigenvector of $\mathcal{Z}(\rho)$ for some positive definite $\rho$. Then $\langle\psi|\mathcal{Z}(\rho)|\psi\rangle=\sum_{i}\langle\psi|\Pi_{i}\rho\Pi_{i}|\psi\rangle=0$. Since $\rho$ is positive definite, $\langle\psi|\Pi_{i}\rho\Pi_{i}|\psi\rangle>0$ for all nonzero $\Pi_{i}|\psi\rangle$. Therefore, to satisfy the equation we must have $\Pi_{i}|\psi\rangle=0\ \forall i$. Summing over $i$ and using the fact that $\sum_{i}\Pi_{i}=\mathds{1}$, we arrive at $|\psi\rangle=0$, a contradiction.

We note that this same difficulty affects other techniques [winick2018, hu2022, araujo22]: in [winick2018] they perturb the singular matrices with identity to make them full rank, whereas in [hu2022, araujo22] they reformulate the problem in terms of the support of the relevant matrices, a procedure known as facial reduction [drusvyatskiy2017]. Facial reduction is more reliable, elegant, and results in a smaller problem than perturbing with identity, so we adopt this approach here as well.

We would thus like to perform facial reduction along the lines of [hu2022]: first find an isometry $V$ that encodes the support of the feasible $\rho$ , such that $\rho=V\sigma V^{\dagger}$ for full-rank $\sigma$ . Then replace the equality constraints with $\operatorname{tr}(V^{\dagger}E_{k}V\sigma)=:\operatorname{tr}(F_{k}\sigma)=p_{k}$ , dropping the redundant ones. Finally, find maps $\widehat{\mathcal{G}}$ and $\widehat{\mathcal{Z}}$ such that $\widehat{\mathcal{G}}(\sigma)$ and $\widehat{\mathcal{Z}}(\sigma)$ are the restrictions of the maps $\sigma\mapsto\mathcal{G}(V\sigma V^{\dagger})$ and $\sigma\mapsto\mathcal{Z}(\mathcal{G}(V\sigma V^{\dagger}))$ to the support of their ranges.

While that’s easy to do, the resulting problem is no longer an optimization over the relative entropy; in particular $D(\widehat{\mathcal{G}}(\sigma)\|\widehat{\mathcal{Z}}(\sigma))\neq D(\mathcal{G}(\rho)\|\mathcal{Z}(\mathcal{G}(\rho)))$. Instead, to obtain the reformulated problem we use the following identity [coles2012]:

D(\mathcal{G}(\rho)\|\mathcal{Z}(\mathcal{G}(\rho)))=-H(\mathcal{G}(\rho))+H(\mathcal{Z}(\mathcal{G}(\rho))),

(6)

where $H(\rho)=-\operatorname{tr}(\rho\log\rho)$ is the von Neumann entropy, together with the simple observation that $H(\mathcal{G}(\rho))=H(\widehat{\mathcal{G}}(\sigma))$ and $H(\mathcal{Z}(\mathcal{G}(\rho)))=H(\widehat{\mathcal{Z}}(\sigma))$ . This motivates the definition of a new convex cone:

\mathcal{K}_{\text{QKD}}^{\widehat{\mathcal{G}},\widehat{\mathcal{Z}}}=\{(h,\sigma)\in\mathbb{R}\times\mathbb{H}^{n};\,\sigma\succeq 0,\ h\geq-H(\widehat{\mathcal{G}}(\sigma))+H(\widehat{\mathcal{Z}}(\sigma))\}.

(7)
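
Identity (6), on which the cone (7) rests, is easy to check numerically. The following is a minimal sketch in Python with NumPy, not part of the Julia implementation discussed in this paper; it assumes $\mathcal{G}$ is the identity, takes the projectors $|i\rangle\langle i|\otimes\mathds{1}$ on a two-qubit system as an illustrative choice of $\mathcal{Z}$, and uses a random full-rank state. All function names are hypothetical.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy H(rho) = -tr(rho log rho), natural logarithm."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-14]  # convention: 0 log 0 = 0
    return float(-np.sum(lam * np.log(lam)))

def logm_pd(rho):
    """Matrix logarithm of a positive definite Hermitian matrix."""
    lam, U = np.linalg.eigh(rho)
    return (U * np.log(lam)) @ U.conj().T

def relative_entropy(rho, sigma):
    """D(rho||sigma) = tr(rho (log rho - log sigma)) for full-rank inputs."""
    return float(np.real(np.trace(rho @ (logm_pd(rho) - logm_pd(sigma)))))

def decohere(rho, projectors):
    """Z(rho) = sum_i P_i rho P_i."""
    return sum(P @ rho @ P for P in projectors)

# Random full-rank two-qubit state (illustrative input).
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rho = A @ A.conj().T
rho /= np.trace(rho).real

# Assumed decoherence map: projectors |i><i| (x) 1 on the first qubit.
projectors = [np.kron(np.diag([1.0, 0.0]), np.eye(2)),
              np.kron(np.diag([0.0, 1.0]), np.eye(2))]
z_rho = decohere(rho, projectors)

lhs = relative_entropy(rho, z_rho)        # D(rho || Z(rho))
rhs = -entropy(rho) + entropy(z_rho)      # -H(rho) + H(Z(rho))
print(lhs, rhs)
```

The two printed values should agree up to floating-point error. The same run also illustrates the footnote argument: the smallest eigenvalue of `z_rho` stays strictly positive for a positive definite input.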

We emphasize that $\widehat{\mathcal{G}}$ and $\widehat{\mathcal{Z}}$ must come from the reduction of $\mathcal{G}$ and $\mathcal{Z}$ ; if we let them be arbitrary CP maps $\mathcal{K}_{\text{QKD}}^{\widehat{\mathcal{G}},\widehat{\mathcal{Z}}}$ is not even a convex cone in general.

With the cone in hand, we can finally state the reduced problem:

\begin{gathered}\min_{h,\sigma}\ h\\ \text{s.t.}\quad\operatorname{tr}(\sigma)=1,\quad\operatorname{tr}(F_{k}\sigma)=p_{k}\ \forall k,\\ (h,\sigma)\in\mathcal{K}_{\text{QKD}}^{\widehat{\mathcal{G}},\widehat{\mathcal{Z}}}\end{gathered}

(8)

Note that the constraint $\rho\succeq 0$ has been dropped; although it’s equivalent to $\sigma\succeq 0$ , the latter is redundant with the definition of $\mathcal{K}_{\text{QKD}}^{\widehat{\mathcal{G}},\widehat{\mathcal{Z}}}$ .

We emphasize that even in the case where no facial reduction is necessary, and $\widehat{\mathcal{G}}=\mathcal{G}$ and $\widehat{\mathcal{Z}}=\mathcal{Z}\circ\mathcal{G}$ , it is still advantageous to use the QKD cone (7) instead of the relative entropy cone (4) for four reasons: (i) the constraint $\rho\succeq 0$ can not be dropped in the relative entropy cone unless $\mathcal{G}$ has a positive inverse, (ii) the QKD cone has one less matrix variable to optimize over, (iii) the derivatives of the barrier function of the QKD cone, that we introduce below, are less computationally demanding, (iv) as the QKD cone explicitly depends on $\widehat{\mathcal{G}}$ and $\widehat{\mathcal{Z}}$ , it can exploit their structure; specifically, we use the fact that $\widehat{\mathcal{Z}}$ necessarily has a block diagonal structure to drastically speed up the computation of several derivatives involving it.

3 Implementation of the QKD cone

In order to successfully optimize over a given convex cone $\mathcal{K}$, whether using Skajaa and Ye’s algorithm or traditional methods for symmetric cones, it’s essential to provide a logarithmically homogeneous self-concordant barrier (LHSCB) function for it [nesterov1997]. It is defined as a function $f:\operatorname{int}(\mathcal{K})\to\mathbb{R}$ that is three times continuously differentiable and strictly convex, satisfies $f(x_{i})\to\infty$ along every sequence $x_{i}$ converging to the boundary of $\mathcal{K}$, and, for some barrier parameter $\nu>0$, satisfies

	$\displaystyle f(\tau x)=f(x)-\nu\log\tau\quad\forall x\in\operatorname{int}(\mathcal{K}),\forall\tau>0,$		(9)
	$\displaystyle\|\nabla^{3}f(x)[\xi,\xi,\xi]\|\leq 2(\nabla^{2}f(x)[\xi,\xi])^{3/2}\quad\forall x\in\operatorname{int}(\mathcal{K}),\forall\xi.$		(10)

While most of these properties are straightforward to verify, self-concordance (10) is not, and has only recently been proven for the following barrier function for the relative entropy cone (4) [fawzi2022]:

f(h,\rho,\sigma)=-\log(h-D(\rho\|\sigma))-\operatorname{logdet}(\rho)-\operatorname{logdet}(\sigma).

(11)

We propose as the barrier function for the QKD cone (7) its obvious analogue:

f(h,\rho)=-\log(h+H(\widehat{\mathcal{G}}(\rho))-H(\widehat{\mathcal{Z}}(\rho)))-\operatorname{logdet}(\rho).

(12)

We conjecture that it is also self-concordant based on the similarity and the fact that our extensive numerical tests were successful.
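
Unlike self-concordance, logarithmic homogeneity (9) of the barrier (12) can be verified directly. The sketch below (Python with NumPy, hypothetical function names) evaluates the barrier for maps given by Kraus operators and checks $f(\tau x)=f(x)-\nu\log\tau$ with $\nu=1+n$, where $n$ is the dimension of $\rho$. The value of $\nu$ is our own derivation under the assumption $\operatorname{tr}\widehat{\mathcal{G}}(\rho)=\operatorname{tr}\widehat{\mathcal{Z}}(\rho)$, which holds when $\mathcal{Z}$ is trace preserving; the maps and test point are illustrative.

```python
import numpy as np

def entropy(rho):
    """H(rho) = -tr(rho log rho); rho need not have unit trace."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-14]
    return float(-np.sum(lam * np.log(lam)))

def apply_map(kraus, x):
    """Apply the CP map with the given Kraus operators."""
    return sum(K @ x @ K.conj().T for K in kraus)

def barrier(h, rho, kraus_g, kraus_z):
    """Barrier (12): -log(h + H(G(rho)) - H(Z(rho))) - logdet(rho)."""
    u = h + entropy(apply_map(kraus_g, rho)) - entropy(apply_map(kraus_z, rho))
    if u <= 0 or np.linalg.eigvalsh(rho).min() <= 0:
        raise ValueError("point is not in the interior of the cone")
    _, logdet = np.linalg.slogdet(rho)
    return float(-np.log(u) - logdet)

# Illustrative maps: G is the identity, Z decoheres the first qubit.
kraus_g = [np.eye(4)]
kraus_z = [np.kron(np.diag([1.0, 0.0]), np.eye(2)),
           np.kron(np.diag([0.0, 1.0]), np.eye(2))]

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rho = A @ A.conj().T + np.eye(4)          # positive definite, not normalized
D = -entropy(apply_map(kraus_g, rho)) + entropy(apply_map(kraus_z, rho))
h = D + 1.0                               # strictly inside the cone (u = 1)

tau = 2.5
nu = 1 + rho.shape[0]                     # assumed barrier parameter
f_base = barrier(h, rho, kraus_g, kraus_z)
f_scaled = barrier(tau * h, tau * rho, kraus_g, kraus_z)
print(f_scaled, f_base - nu * np.log(tau))
```

The check works because $H(\tau X)=\tau H(X)-\tau\log\tau\operatorname{tr}(X)$, so the trace terms of the two entropies cancel and the argument of the logarithm scales linearly with $\tau$.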

In order to implement the cone (7) in the solver Hypatia, we need to provide the gradient, Hessian, and third-order derivative of the barrier function (12), and choose an initial point [coey2023].

3.1 Gradient, Hessian, and third-order derivative

To obtain the gradient, Hessian, and third-order derivative of the barrier function (12) we adapt the results of [faybusovich2022] to complex matrices, and combine them with the results of [coey2024] for the relative entropy cone. In the code they are needed in vectorized form, as explained in Appendix A. Here we present them in the usual notation for clarity.

Let $g(\rho)=-H(\mathcal{L}(\rho))$ for some positive map $\mathcal{L}$ . Then its gradient is

\nabla_{\rho}g(\rho)=\mathcal{L}^{\dagger}(\mathds{1}+\log(\mathcal{L}(\rho))),

(13)

where $\mathcal{L}^{\dagger}$ is the adjoint of $\mathcal{L}$ . Its Hessian is the linear operator

\nabla^{2}_{\rho,\rho}g(\rho)=\xi\mapsto\mathcal{L}^{\dagger}\left(U(\Gamma^{[1]}(\Lambda)\odot(U^{\dagger}\mathcal{L}(\xi)U))U^{\dagger}\right),

(14)

where $\odot$ is the elementwise (Schur) product, $\mathcal{L}(\rho)=U\Lambda U^{\dagger}$ is a diagonalization of $\mathcal{L}(\rho)$ , and $\Gamma^{[1]}(\Lambda)$ is defined as

\Gamma^{[1]}(\Lambda)_{ij}=\begin{cases}\frac{\log(\lambda_{i})-\log(\lambda_{j})}{\lambda_{i}-\lambda_{j}},&\lambda_{i}\neq\lambda_{j}\\ \lambda_{i}^{-1},&\lambda_{i}=\lambda_{j}\,,\end{cases}

(15)

where $\lambda_{i}=\Lambda_{ii}$ . Its third-order derivative applied to $\xi,\xi$ is

\nabla_{\rho,\rho,\rho}^{3}g(\rho)[\xi,\xi]=\mathcal{L}^{\dagger}(U_{\mathcal{L}}M_{\mathcal{L}}(\xi)U_{\mathcal{L}}^{\dagger})\,,

(16)

where

M_{\mathcal{L}}(\xi)_{i,j}=2\sum_{k}\tilde{\xi}_{i,k}\tilde{\xi}_{k,j}\Gamma^{[2]}_{i,j,k}(\Lambda_{\mathcal{L}})\,,

(17)

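with $\tilde{\xi}=U_{\mathcal{L}}^{\dagger}\mathcal{L}(\xi)U_{\mathcal{L}}$, where $U_{\mathcal{L}}$ and $\Lambda_{\mathcal{L}}$ denote the matrices $U$ and $\Lambda$ from the diagonalization $\mathcal{L}(\rho)=U\Lambda U^{\dagger}$, and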

\Gamma^{[2]}_{i,j,k}(\Lambda)=\begin{cases}\frac{\Gamma^{[1]}_{i,j}(\Lambda)-\Gamma^{[1]}_{i,k}(\Lambda)}{\lambda_{j}-\lambda_{k}},&\lambda_{j}\neq\lambda_{k}\\ \frac{\Gamma^{[1]}_{i,j}(\Lambda)-\Gamma^{[1]}_{j,j}(\Lambda)}{\lambda_{i}-\lambda_{j}},&\lambda_{i}\neq\lambda_{j}=\lambda_{k}\\ -\lambda_{i}^{-2}/2,&\lambda_{i}=\lambda_{j}=\lambda_{k}\>.\end{cases}

(18)
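
The gradient (13) and Hessian (14)–(15) can be validated against central finite differences. The following Python/NumPy sketch does so for a map $\mathcal{L}$ specified by Kraus operators; it uses the plain matrix form rather than the vectorized form needed by the solver, and the map, test point, direction, and step size are illustrative assumptions. Function names are hypothetical.

```python
import numpy as np

def apply_map(kraus, x):
    """L(x) = sum_k K x K^dagger."""
    return sum(K @ x @ K.conj().T for K in kraus)

def apply_adjoint(kraus, y):
    """L^dagger(y) = sum_k K^dagger y K."""
    return sum(K.conj().T @ y @ K for K in kraus)

def g(kraus, rho):
    """g(rho) = -H(L(rho)) = tr(L(rho) log L(rho))."""
    lam = np.linalg.eigvalsh(apply_map(kraus, rho))
    return float(np.sum(lam * np.log(lam)))

def gamma1(lam, tol=1e-10):
    """First divided differences of log, equation (15)."""
    diff = lam[:, None] - lam[None, :]
    dlog = np.log(lam)[:, None] - np.log(lam)[None, :]
    close = np.abs(diff) < tol
    return np.where(close, 1.0 / lam[:, None], dlog / np.where(close, 1.0, diff))

def grad_g(kraus, rho):
    """Equation (13): L^dagger(1 + log L(rho))."""
    lam, U = np.linalg.eigh(apply_map(kraus, rho))
    log_l = (U * np.log(lam)) @ U.conj().T
    return apply_adjoint(kraus, np.eye(len(lam)) + log_l)

def hess_g(kraus, rho, xi):
    """Equation (14): the Hessian applied to the direction xi."""
    lam, U = np.linalg.eigh(apply_map(kraus, rho))
    xi_t = U.conj().T @ apply_map(kraus, xi) @ U
    return apply_adjoint(kraus, U @ (gamma1(lam) * xi_t) @ U.conj().T)

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
rho = A @ A.conj().T
rho = 0.5 * rho / np.trace(rho).real + 0.5 * np.eye(4) / 4   # well-conditioned state
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
xi = B + B.conj().T                                          # Hermitian direction
kraus = [np.kron(np.diag([1.0, 0.0]), np.eye(2)),
         np.kron(np.diag([0.0, 1.0]), np.eye(2))]            # assumed map L

eps = 1e-5
fd_grad = (g(kraus, rho + eps * xi) - g(kraus, rho - eps * xi)) / (2 * eps)
an_grad = float(np.real(np.trace(grad_g(kraus, rho) @ xi)))
fd_hess = (grad_g(kraus, rho + eps * xi) - grad_g(kraus, rho - eps * xi)) / (2 * eps)
an_hess = hess_g(kraus, rho, xi)
print(abs(fd_grad - an_grad), np.linalg.norm(fd_hess - an_hess))
```

Both printed discrepancies should be at the level of the finite-difference error. The tolerance `tol` used to decide when two eigenvalues count as equal is a choice of this sketch, not something prescribed by the text.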

The gradient and Hessian of the term $-\operatorname{logdet}(\rho)$ are well-known to be $-\rho^{-1}$ and $\xi\mapsto\rho^{-1}\xi\rho^{-1}$ , respectively [nesterov1997]. The third order derivative can be easily calculated to be

(\xi,\zeta)\mapsto-\rho^{-1}\xi\rho^{-1}\zeta\rho^{-1}-\rho^{-1}\zeta\rho^{-1}\xi\rho^{-1}.

(19)

Putting everything together, the gradient of the barrier $f(h,\rho)$ is given by

	$\displaystyle\nabla_{h}f(h,\rho)$	$\displaystyle=-\frac{1}{u},$		(20)
	$\displaystyle\nabla_{\rho}f(h,\rho)$	$\displaystyle=-\frac{1}{u}\nabla_{\rho}\,u-\rho^{-1},$		(21)

where $u=h+H(\widehat{\mathcal{G}}(\rho))-H(\widehat{\mathcal{Z}}(\rho))$ and $\nabla_{\rho}\,u=-\widehat{\mathcal{G}}^{\dagger}(\mathds{1}+\log(\widehat{\mathcal{G}}(\rho)))+\widehat{\mathcal{Z}}^{\dagger}(\mathds{1}+\log(\widehat{\mathcal{Z}}(\rho)))$, and the Hessian by

$\displaystyle\nabla_{h,h}^{2}f(h,\rho)$	$\displaystyle=\frac{1}{u^{2}},$	(22)
$\displaystyle\nabla_{h,\rho}^{2}f(h,\rho)$	$\displaystyle=\frac{1}{u^{2}}\nabla_{\rho}\,u,$	(23)
$\displaystyle\nabla_{\rho,\rho}^{2}f(h,\rho)$	$\displaystyle=\xi\mapsto\frac{1}{u^{2}}\operatorname{tr}\big[(\nabla_{\rho}\,u)\xi\big]\nabla_{\rho}\,u-\frac{1}{u}\nabla_{\rho,\rho}^{2}\,u[\xi]+\rho^{-1}\xi\rho^{-1},$	(24)

where

\nabla_{\rho,\rho}^{2}\,u[\xi]=-\widehat{\mathcal{G}}^{\dagger}\left(U_{\mathcal{G}}(\Gamma^{[1]}(\Lambda_{\mathcal{G}})\odot(U_{\mathcal{G}}^{\dagger}\widehat{\mathcal{G}}(\xi)U_{\mathcal{G}}))U_{\mathcal{G}}^{\dagger}\right)+\widehat{\mathcal{Z}}^{\dagger}\left(U_{\mathcal{Z}}(\Gamma^{[1]}(\Lambda_{\mathcal{Z}})\odot(U_{\mathcal{Z}}^{\dagger}\widehat{\mathcal{Z}}(\xi)U_{\mathcal{Z}}))U_{\mathcal{Z}}^{\dagger}\right),

(25)

and $\widehat{\mathcal{G}}(\rho)=U_{\mathcal{G}}\Lambda_{\mathcal{G}}U_{\mathcal{G}}^{\dagger}$ and $\widehat{\mathcal{Z}}(\rho)=U_{\mathcal{Z}}\Lambda_{\mathcal{Z}}U_{\mathcal{Z}}^{\dagger}$ are diagonalizations of $\widehat{\mathcal{G}}(\rho)$ and $\widehat{\mathcal{Z}}(\rho)$.

The third order derivatives are given by:

$\displaystyle\nabla_{h,h,h}^{3}f(h,\rho)$	$\displaystyle=-\frac{2}{u^{3}},$	(26)
$\displaystyle\nabla_{h,h,\rho}^{3}f(h,\rho)$	$\displaystyle=-\frac{2}{u^{3}}\nabla_{\rho}u,$	(27)
$\displaystyle\nabla_{h,\rho,\rho}^{3}f(h,\rho)$	$\displaystyle=\xi\mapsto-\frac{2}{u^{3}}\operatorname{tr}\big{[}(\nabla_{\rho}% \,u)\xi\big{]}\nabla_{\rho}u+\frac{1}{u^{2}}\nabla_{\rho,\rho}^{2}\,u[\xi],$	(28)
$\displaystyle\nabla_{\rho,\rho,\rho}^{3}f(h,\rho)$	$\displaystyle=(\xi,\zeta)\mapsto-\frac{2}{u^{3}}\operatorname{tr}\big[(\nabla_{\rho}\,u)\xi\big]\operatorname{tr}\big[(\nabla_{\rho}\,u)\zeta\big]\nabla_{\rho}u+\frac{1}{u^{2}}\operatorname{tr}\big[(\nabla_{\rho,\rho}^{2}\,u[\zeta])\xi\big]\nabla_{\rho}u$
	$\displaystyle\qquad\qquad\quad\,+\frac{1}{u^{2}}\operatorname{tr}\big[(\nabla_{\rho}\,u)\xi\big]\nabla_{\rho,\rho}^{2}u[\zeta]+\frac{1}{u^{2}}\operatorname{tr}\big[(\nabla_{\rho}\,u)\zeta\big]\nabla_{\rho,\rho}^{2}u[\xi]$
	$\displaystyle\qquad\qquad\quad\,-\frac{1}{u}\nabla_{\rho,\rho,\rho}^{3}u[\xi,\zeta]-(\rho^{-1}\xi\rho^{-1}\zeta\rho^{-1}+\rho^{-1}\zeta\rho^{-1}\xi\rho^{-1}).$	(29)

In particular, applying $\nabla_{\rho,\rho,\rho}^{3}f(h,\rho)$ to the point $[\xi,\xi]$ gives

$\displaystyle\nabla_{\rho,\rho,\rho}^{3}f(h,\rho)[\xi,\xi]$	$\displaystyle=-\frac{2}{u^{3}}\big(\operatorname{tr}\big[(\nabla_{\rho}\,u)\xi\big]\big)^{2}\nabla_{\rho}u+\frac{1}{u^{2}}\operatorname{tr}\big[(\nabla_{\rho,\rho}^{2}\,u[\xi])\xi\big]\nabla_{\rho}u$
	$\displaystyle\quad+\frac{2}{u^{2}}\operatorname{tr}\big[(\nabla_{\rho}\,u)\xi\big]\nabla_{\rho,\rho}^{2}u[\xi]-\frac{1}{u}\nabla_{\rho,\rho,\rho}^{3}u[\xi,\xi]$
	$\displaystyle\quad-2\rho^{-1}\xi\rho^{-1}\xi\rho^{-1},$	(30)

where

\nabla_{\rho,\rho,\rho}^{3}u[\xi,\xi]=-\widehat{\mathcal{G}}^{\dagger}(U_{\mathcal{G}}M_{\mathcal{G}}(\xi)U_{\mathcal{G}}^{\dagger})+\widehat{\mathcal{Z}}^{\dagger}(U_{\mathcal{Z}}M_{\mathcal{Z}}(\xi)U_{\mathcal{Z}}^{\dagger}),

(31)

and $M_{\mathcal{G}}(\xi)$ and $M_{\mathcal{Z}}(\xi)$ are the analogues of (17).

3.2 Initial point

The initial point can in principle be chosen to be any point in the interior of the cone [skajaa2015], but following [coey2023, dahl2022] we’d like to use the central point, as it provides better performance. It is defined as the minimizer of

\min\limits_{h,\rho}f(h,\rho)+\frac{1}{2}\|(h,\rho)\|^{2}_{2}\,,

(32)

where $f(h,\rho)$ is our barrier function (12), and $\|(h,\rho)\|^{2}_{2}=h^{2}+\langle\rho,\rho\rangle$ . The minimizer is obtained by taking the gradient of the objective and setting it to zero:

\displaystyle(h,\rho)=-\nabla f(h,\rho).

(33)

This results in the pair of equations

	$\displaystyle h$	$\displaystyle=\frac{1}{u},$		(34)
	$\displaystyle\rho$	$\displaystyle=\frac{1}{u}\nabla_{\rho}\,u+\rho^{-1},$		(35)

where we are using the gradient from equations (20)–(21).

Let now $u=h-D$, where $D=-H(\widehat{\mathcal{G}}(\rho))+H(\widehat{\mathcal{Z}}(\rho))$. We then get (the negative solution for $h$ is excluded, as it lies outside the cone)

	$\displaystyle h=\frac{D}{2}+\sqrt{1+\frac{D^{2}}{4}},$		(36)
	$\displaystyle\rho-\rho^{-1}=h\left(-\widehat{\mathcal{G}}^{\dagger}(\mathds{1}+\log(\widehat{\mathcal{G}}(\rho)))+\widehat{\mathcal{Z}}^{\dagger}(\mathds{1}+\log(\widehat{\mathcal{Z}}(\rho)))\right).$		(37)

While we have been unable to obtain a general solution of these equations, we note that often the maps satisfy $\widehat{\mathcal{G}}^{\dagger}(\mathds{1}+\log(\widehat{\mathcal{G}}(\mathds{1})))=\widehat{\mathcal{Z}}^{\dagger}(\mathds{1}+\log(\widehat{\mathcal{Z}}(\mathds{1})))$. In this case $\rho=\mathds{1}$ will satisfy equation (37) for any value of $h$, which we can then choose to be given by equation (36) with $D$ evaluated at $\rho=\mathds{1}$. We therefore use these values as our initial point.

Note that even when equation (37) is not satisfied this point is still in the interior of the cone, as $\mathds{1}$ is full rank and $\frac{D}{2}+\sqrt{1+\frac{D^{2}}{4}}>D$ , and thus is a valid starting point.
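
The choice of initial point can be illustrated with a short Python/NumPy sketch. It sets $\rho=\mathds{1}$, evaluates $D$ there, computes $h$ from equation (36), and checks both that $h=1/u$ as required by equation (34) and that the point is in the interior of the cone. The two-outcome POVM used to build $\widehat{\mathcal{Z}}$ is a made-up example for which the simplifying condition on the maps does not hold, so only interior membership is demonstrated; all names are hypothetical.

```python
import numpy as np

def entropy(rho):
    """H(rho) = -tr(rho log rho); rho need not have unit trace."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-14]
    return float(-np.sum(lam * np.log(lam)))

def apply_map(kraus, x):
    return sum(K @ x @ K.conj().T for K in kraus)

def psd_sqrt(E):
    """Square root of a positive semidefinite Hermitian matrix."""
    lam, U = np.linalg.eigh(E)
    return (U * np.sqrt(lam)) @ U.conj().T

# Hypothetical full-rank POVM on a qubit; Kraus operators sqrt(E_i) (x) |i>.
povm = [np.diag([0.7, 0.2]), np.diag([0.3, 0.8])]
kets = np.eye(2)
kraus_g = [np.eye(2)]
kraus_z = [np.kron(psd_sqrt(E), kets[:, [i]]) for i, E in enumerate(povm)]

rho0 = np.eye(2)                                   # candidate initial rho
D = -entropy(apply_map(kraus_g, rho0)) + entropy(apply_map(kraus_z, rho0))
h0 = D / 2 + np.sqrt(1 + D**2 / 4)                 # equation (36)
u0 = h0 - D                                        # argument of the logarithm
print(D, h0, u0)
```

Since $h$ solves $h^{2}-Dh-1=0$, the product `h0 * u0` equals one up to rounding, and `u0 > 0` confirms that the point is strictly inside the cone.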

4 Examples

4.1 BB84

In order to illustrate the technique let us start with a toy example, the computation of the key rate for the entanglement-based version of the BB84 protocol, using as estimated parameters the quantum bit error rates (QBERs) $q_{x}$ and $q_{z}$ in the $X$ and $Z$ bases. If we use only the $Z$ basis to generate key, the key map $\mathcal{G}$ can be taken to be the identity, and the decoherence map $\mathcal{Z}$ is $\rho\mapsto\sum_{i=0}^{1}(|i\rangle\langle i|\otimes\mathds{1})\rho(|i\rangle\langle i|\otimes\mathds{1})$. The analytical expression for $H(A|E)$ is $1-h(q_{x})$, where $h$ is the binary entropy.

In the case where $(q_{x},q_{z})\in(0,1)^{\times 2}$ there is a full rank state compatible with the constraints and the support of the range of $\mathcal{G}$ and $\mathcal{Z}$ is the full space, so no facial reduction is needed. Then $\widehat{\mathcal{G}}=\mathcal{G}$ and $\widehat{\mathcal{Z}}=\mathcal{Z}\circ\mathcal{G}$ , and the conic program is

\begin{gathered}\min_{h,\rho}\ h\\ \text{s.t.}\quad\operatorname{tr}(\rho)=1,\quad\operatorname{tr}(Q_{x}\rho)=q_{x},\quad\operatorname{tr}(Q_{z}\rho)=q_{z},\\ (h,\rho)\in\mathcal{K}_{\text{QKD}}^{\widehat{\mathcal{G}},\widehat{\mathcal{Z}}},\end{gathered}

(38)

where $Q_{x}$ and $Q_{z}$ are the projectors that produce the QBERs.
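
Without solving the conic program (38), one can at least evaluate its objective at a feasible point and compare with the analytical value. The Python/NumPy sketch below does this for a Bell-diagonal state with independent $X$ and $Z$ errors. Taking that state as the candidate minimizer is an assumption of this sketch, as are the error rates; entropies are computed in bits so that the result is comparable with $1-h(q_{x})$. Function names are hypothetical.

```python
import numpy as np

def entropy_bits(rho):
    """Von Neumann entropy in bits."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-14]
    return float(-np.sum(lam * np.log2(lam)))

def binary_entropy(q):
    return float(-q * np.log2(q) - (1 - q) * np.log2(1 - q))

def bell_diagonal(qx, qz):
    """Bell-diagonal state with independent X and Z error rates."""
    s = 1 / np.sqrt(2)
    bells = [s * np.array([1, 0, 0, 1.0]),    # phi+: no error
             s * np.array([1, 0, 0, -1.0]),   # phi-: X error
             s * np.array([0, 1, 1, 0.0]),    # psi+: Z error
             s * np.array([0, 1, -1, 0.0])]   # psi-: both errors
    weights = [(1 - qx) * (1 - qz), qx * (1 - qz), (1 - qx) * qz, qx * qz]
    return sum(w * np.outer(b, b) for w, b in zip(weights, bells))

def decohere_z(rho):
    """Z(rho) with projectors |i><i| (x) 1."""
    projs = [np.kron(np.diag([1.0, 0.0]), np.eye(2)),
             np.kron(np.diag([0.0, 1.0]), np.eye(2))]
    return sum(P @ rho @ P for P in projs)

# Projectors onto unequal outcomes in the Z and X bases.
plus = np.array([1.0, 1.0]) / np.sqrt(2)
minus = np.array([1.0, -1.0]) / np.sqrt(2)
pm, mp = np.kron(plus, minus), np.kron(minus, plus)
Qz = np.diag([0.0, 1.0, 1.0, 0.0])
Qx = np.outer(pm, pm) + np.outer(mp, mp)

qx, qz = 0.05, 0.03                            # illustrative error rates
rho = bell_diagonal(qx, qz)
objective = -entropy_bits(rho) + entropy_bits(decohere_z(rho))
print(objective, 1 - binary_entropy(qx))
```

The state satisfies both QBER constraints, and the objective evaluates to $1-h(q_{x})$, in agreement with the analytical expression quoted above.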

In the case where $q_{z}=0$ and $q_{x}\in(0,1)$ the support of the feasible $\rho$ is $\operatorname{span}\{|\phi^{+}\rangle,|\phi^{-}\rangle\}$. Letting $V=|\phi^{+}\rangle\langle 0|+|\phi^{-}\rangle\langle 1|$ we can write $\rho=V\sigma V^{\dagger}$ for a $2\times 2$ matrix $\sigma$. The constraint $\operatorname{tr}(V^{\dagger}Q_{z}V\sigma)=0$ becomes tautological and is dropped, and the constraint $\operatorname{tr}(V^{\dagger}Q_{x}V\sigma)=q_{x}$ is reduced to $\operatorname{tr}(|1\rangle\langle 1|\sigma)=q_{x}$. The map $\sigma\mapsto V\sigma V^{\dagger}$ has range supported on the range of $V$, namely $\operatorname{span}\{|\phi^{+}\rangle,|\phi^{-}\rangle\}$, so to restrict it there we simply remove the isometry, obtaining $\widehat{\mathcal{G}}(\sigma)=\sigma$. The map $\sigma\mapsto\mathcal{Z}(V\sigma V^{\dagger})$ has range supported on $\operatorname{span}\{|00\rangle,|11\rangle\}$, so we restrict it there, choosing the Kraus operators of $\widehat{\mathcal{Z}}$ to be $\{|0\rangle\langle+|,|1\rangle\langle-|\}$.
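
The reduction for $q_{z}=0$ can be reproduced step by step. The Python/NumPy sketch below builds $V$, checks that the $Q_{z}$ constraint vanishes identically while the $Q_{x}$ constraint becomes $|1\rangle\langle 1|$, and restricts the Kraus operators of $\sigma\mapsto\mathcal{Z}(V\sigma V^{\dagger})$ to $\operatorname{span}\{|00\rangle,|11\rangle\}$. The auxiliary isometry `W` and all variable names are introduced here for illustration only.

```python
import numpy as np

s = 1 / np.sqrt(2)
phi_plus = s * np.array([1, 0, 0, 1.0])
phi_minus = s * np.array([1, 0, 0, -1.0])
e0, e1 = np.eye(2)
plus, minus = s * np.array([1.0, 1.0]), s * np.array([1.0, -1.0])

# Isometry V = |phi+><0| + |phi-><1| onto the support of the feasible states.
V = np.outer(phi_plus, e0) + np.outer(phi_minus, e1)

# Projectors onto unequal outcomes in the Z and X bases.
pm, mp = np.kron(plus, minus), np.kron(minus, plus)
Qz = np.diag([0.0, 1.0, 1.0, 0.0])
Qx = np.outer(pm, pm) + np.outer(mp, mp)

Fz = V.T @ Qz @ V      # reduced Z constraint
Fx = V.T @ Qx @ V      # reduced X constraint

# Kraus operators of sigma -> Z(V sigma V^dagger), before restriction.
full_kraus = [np.kron(np.outer(e, e), np.eye(2)) @ V for e in (e0, e1)]

# Hypothetical isometry W = |00><0| + |11><1| onto the support of the range.
W = np.zeros((4, 2))
W[0, 0] = 1.0
W[3, 1] = 1.0
reduced_kraus = [W.T @ K for K in full_kraus]
expected_kraus = [np.outer(e0, plus), np.outer(e1, minus)]
print(Fz, Fx, reduced_kraus, sep="\n")
```

The reduced operators coincide with $|0\rangle\langle+|$ and $|1\rangle\langle-|$, and projecting with `W @ W.T` leaves the unreduced Kraus operators unchanged, which confirms that nothing is lost by the restriction.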

In the case where $q_{x}=0$ and $q_{z}\in(0,1)$ the support of the feasible $\rho$ is $\operatorname{span}\{|\phi^{+}\rangle,|\psi^{+}\rangle\}$. Letting $V=|\phi^{+}\rangle\langle 0|+|\psi^{+}\rangle\langle 1|$ we can write $\rho=V\sigma V^{\dagger}$ for a $2\times 2$ matrix $\sigma$. The constraint $\operatorname{tr}(V^{\dagger}Q_{x}V\sigma)=0$ becomes tautological and is dropped, and the constraint $\operatorname{tr}(V^{\dagger}Q_{z}V\sigma)=q_{z}$ is reduced to $\operatorname{tr}(|1\rangle\langle 1|\sigma)=q_{z}$. To reduce the map $\sigma\mapsto V\sigma V^{\dagger}$ we again simply remove the isometry, obtaining $\widehat{\mathcal{G}}(\sigma)=\sigma$. The map $\sigma\mapsto\mathcal{Z}(V\sigma V^{\dagger})$ this time has full rank range, so its reduction is simply itself, $\widehat{\mathcal{Z}}(\sigma)=\mathcal{Z}(V\sigma V^{\dagger})$.

In the case where $q_{x}=q_{z}=0$ the only feasible $\rho$ is $|\phi^{+}\rangle\langle\phi^{+}|$, so there’s nothing to optimize.

4.2 DMCV

A more interesting case of facial reduction is when a POVM $\{E_{i}\}_{i=0}^{n-1}$ is used to generate key. This is the case for example in discrete-modulated continuous variable (DMCV) QKD [lin2019]. For simplicity, assume that the estimated probabilities are compatible with a full-rank state, so that facial reduction is only needed for the maps $\mathcal{G}$ and $\mathcal{Z}$ .

Let $V=\sum_{i=0}^{n-1}\mathds{1}_{A}\otimes\sqrt{E_{i}}_{B}\otimes|i\rangle_{a}$ be the standard Naimark dilation of the POVM,³ where $a$ is an ancilla subsystem. Then $\mathcal{G}(\rho)=V\rho V^{\dagger}$, and $\mathcal{Z}$ has Kraus operators $\{\mathds{1}_{AB}\otimes|i\rangle\langle i|_{a}\}_{i=0}^{n-1}$. As before, since $\mathcal{G}$ is just an isometry, the reduction consists simply of removing it, and $\widehat{\mathcal{G}}(\rho)=\rho$. The Kraus operators of $\mathcal{Z}\circ\mathcal{G}$ are $\{\mathds{1}_{A}\otimes\sqrt{E_{i}}_{B}\otimes|i\rangle_{a}\}_{i=0}^{n-1}$. It is easy to see that if all $E_{i}$ are full rank then $\mathcal{Z}\circ\mathcal{G}$ has full rank range, and $\widehat{\mathcal{Z}}=\mathcal{Z}\circ\mathcal{G}$. Otherwise further reduction is needed, or one must start with a more appropriate Naimark dilation.

³ Note that here we are applying the POVMs on Bob’s side, as is standard in DMCV.
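
The construction of the Naimark dilation and of the Kraus operators of $\mathcal{Z}\circ\mathcal{G}$ can be sketched as follows in Python with NumPy. The two-outcome qubit POVM and Alice’s dimension are made-up values chosen only to keep the matrices small; they are unrelated to the heterodyne protocol benchmarked in this paper, and the function names are hypothetical.

```python
import numpy as np

def psd_sqrt(E):
    """Square root of a positive semidefinite Hermitian matrix."""
    lam, U = np.linalg.eigh(E)
    return (U * np.sqrt(lam)) @ U.conj().T

def naimark_isometry(povm, dim_a):
    """V = sum_i 1_A (x) sqrt(E_i)_B (x) |i>_a."""
    kets = np.eye(len(povm))
    return sum(np.kron(np.kron(np.eye(dim_a), psd_sqrt(E)), kets[:, [i]])
               for i, E in enumerate(povm))

# Hypothetical full-rank POVM on Bob's qubit.
E0 = np.array([[0.6, 0.1], [0.1, 0.3]])
povm = [E0, np.eye(2) - E0]
dim_a, dim_b, n = 2, 2, len(povm)

V = naimark_isometry(povm, dim_a)                  # shape (8, 4)

# Kraus operators of Z o G: (1_AB (x) |i><i|_a) V.
kets = np.eye(n)
kraus_zg = [np.kron(np.eye(dim_a * dim_b), np.outer(kets[i], kets[i])) @ V
            for i in range(n)]
expected = [np.kron(np.kron(np.eye(dim_a), psd_sqrt(E)), kets[:, [i]])
            for i, E in enumerate(povm)]

# Image of the identity under Z o G; full rank iff all E_i are full rank.
range_op = sum(K @ K.conj().T for K in kraus_zg)
print(np.linalg.eigvalsh(range_op).min())
```

For this POVM the smallest eigenvalue of `range_op` is strictly positive, so no further reduction is required. Replacing one of the $E_{i}$ by a rank-deficient operator would drive it to zero.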

For the purpose of benchmarking we implemented here the heterodyne protocol from [lin2019] (a more sophisticated application of our technique to DMCV is presented in [pascualgarcia2024]). In it the POVMs are in fact full rank and the facial reduction is as described in the previous paragraph. The main parameter of the protocol is the number of photons at which the Fock space is cut off, $N_{c}$ . The quantum state $\rho$ has dimension $4(N_{c}+1)$ , and the Kraus operators of $\widehat{\mathcal{Z}}$ are $16(N_{c}+1)\times 4(N_{c}+1)$ . As it is a prepare-and-measure protocol, it also has constraints on the partial trace of $\rho$ , which in general may introduce null eigenvalues and necessitate further facial reduction. Here it is not the case.

4.3 MUB protocol

In the MUB protocol introduced in [cerf2002, sheridan2010], Alice and Bob measure a complete set of MUBs for some prime $d$, with Bob’s bases being the transpose of Alice’s, and estimate the probability of getting equal results. As in [araujo22], here there are no such limitations to prime dimensions, equal results, or even exact MUBs.

As it is an entanglement-based protocol and we are generating key in a single basis, the key map $\mathcal{G}$ can be taken to be the identity, and the decoherence map $\mathcal{Z}$ to have Kraus operators $\{|i\rangle\langle i|\otimes\mathds{1}\}_{i=0}^{d-1}$. When the estimated statistics are compatible with a full-rank state the facial reduction is trivial: $\widehat{\mathcal{G}}=\mathcal{G}$ and $\widehat{\mathcal{Z}}=\mathcal{Z}\circ\mathcal{G}$.

4.4 Overlapping bases protocol

In the overlapping bases protocol introduced in [araujo22], Alice and Bob measure only nearest-neighbour superpositions, with again Bob’s bases being the transpose of Alice’s. $\mathcal{G}$ and $\mathcal{Z}$ are the same as in the MUB protocol, and the facial reduction is also trivial when the estimated statistics are compatible with a full-rank state: $\widehat{\mathcal{G}}=\mathcal{G}$ and $\widehat{\mathcal{Z}}=\mathcal{Z}\circ\mathcal{G}$.

We note, however, that in [araujo22] the protocol was restricted to even $d\geq 2$ , whereas here we use it also for odd $d$ .

4.5 Dealing with experimental data

As argued in [araujo22], in order to deal with experimental data one cannot naïvely set the probabilities $\mathbf{p}$ to be equal to the measured relative frequencies $\mathbf{f}$. Instead, one should estimate the covariance matrix $\Sigma$, and minimize $H(A|E)$ over the desired confidence region, approximated as the intersection of the ellipsoid $\big\|\Sigma^{-\frac{1}{2}}(\mathbf{p}-\mathbf{f})\big\|_{2}\leq\chi$ with the set of parameters corresponding to valid quantum states.

One can do that with a simple modification of our conic program (8):

\begin{gathered}\min_{h,\sigma,\mathbf{p}}\ h\\ \text{s.t.}\quad\operatorname{tr}(\sigma)=1,\quad\operatorname{tr}(\mathbf{F}\sigma)=\mathbf{p},\quad\big\|\Sigma^{-\frac{1}{2}}(\mathbf{p}-\mathbf{f})\big\|_{2}\leq\chi,\\ (h,\sigma)\in\mathcal{K}_{\text{QKD}}^{\widehat{\mathcal{G}},\widehat{\mathcal{Z}}}\end{gathered}

(39)

where the constraint we added is a second-order-cone constraint, which is supported by almost every conic solver in existence, and in particular the one we are using, Hypatia.
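To make the ellipsoid constraint concrete, the following is a minimal sketch of how one might check whether a candidate $\mathbf{p}$ lies inside the ellipsoid for given $\mathbf{f}$, $\Sigma$ and $\chi$. It is not part of the conic program: it uses the Cholesky factor $\Sigma=LL^{T}$ in place of $\Sigma^{-\frac{1}{2}}$, which gives the same norm because $\|\Sigma^{-\frac{1}{2}}\mathbf{r}\|_{2}^{2}=\mathbf{r}^{T}\Sigma^{-1}\mathbf{r}=\|L^{-1}\mathbf{r}\|_{2}^{2}$. The function names and the numerical values are hypothetical, and the check says nothing about whether $\mathbf{p}$ corresponds to a valid quantum state.

```python
import math

def cholesky(S):
    """Lower-triangular L with S = L L^T, for a symmetric positive-definite S."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def whitened_norm(Sigma, p, f):
    """Return ||L^{-1}(p - f)||_2, which equals ||Sigma^{-1/2}(p - f)||_2."""
    L = cholesky(Sigma)
    r = [pi - fi for pi, fi in zip(p, f)]
    y = []
    for i in range(len(r)):  # forward substitution: solve L y = r
        y.append((r[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return math.sqrt(sum(v * v for v in y))

def in_confidence_region(Sigma, p, f, chi):
    """Ellipsoid membership only; positivity of the state is not checked here."""
    return whitened_norm(Sigma, p, f) <= chi

# Hypothetical two-parameter example (values chosen for illustration only).
Sigma = [[4e-4, 1e-4], [1e-4, 9e-4]]
f = [0.5, 0.3]
p = [0.51, 0.32]
print(whitened_norm(Sigma, p, f))            # sqrt(0.6), about 0.7746
print(in_confidence_region(Sigma, p, f, chi=1.0))
```

In the conic program itself the same quantity appears as a second-order-cone constraint on $\mathbf{p}$, so no explicit membership test is needed there.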

5 Numerical results

5.1 Performance

In order to illustrate the performance of our technique, we ran the conic program (8) and compared the running time to the Gauss-Newton technique [hu2022] and the Gauss-Radau technique [araujo22]. We didn’t compare to the Frank-Wolfe technique from [winick2018] because [hu2022] already demonstrated superior performance. We didn’t compare to the technique from [karimi2024] because they only use the vanilla relative entropy cone; as such they cannot perform facial reduction and are unable to handle the QKD problem in full generality.

The protocols we benchmarked are the DMCV, MUB and overlap protocols described in subsections 4.2, 4.3 and 4.4. For the DMCV protocol we used the same parameters as in [hu2022]: amplitude of the coherent states $\alpha=0.35$ , noise $\xi=0.05$ , distance $L=60$ , and transmittance $\eta=10^{-0.02L}$ . For the MUB and overlap protocols we used the probabilities from the isotropic state $\rho_{\text{iso}}(v)=v|\phi^{+}\rangle\langle\phi^{+}|+(1-v)\mathds{1}/d^{2}$ with visibility $v=0.95$ . For the MUB protocol we computed the probabilities using all bases, including the complex ones, whereas for the overlap protocol we restricted ourselves to the real ones. We did this in order to illustrate an additional advantage of our technique: since it uses standard conic optimization, we can avail ourselves of standard symmetrization techniques to show that we can optimize over real variables only, as done in [araujo22]. This provides an additional boost in performance.
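As a small illustration of these parameters, the sketch below evaluates the transmittance at $L=60$ and the joint outcome probabilities of $\rho_{\text{iso}}(v)$ when both parties measure in the computational basis. It assumes that $|\phi^{+}\rangle=\frac{1}{\sqrt{d}}\sum_{i}|ii\rangle$ is the maximally entangled state, and it does not reproduce the MUB or overlap measurements used in the benchmarks; the function names are hypothetical.

```python
from fractions import Fraction

def transmittance(L):
    """eta = 10^(-0.02 L), the expression used for the DMCV benchmark."""
    return 10 ** (-0.02 * L)

def isotropic_probabilities(d, v):
    """p[i][j] = <ij| rho_iso(v) |ij>, assuming |phi+> = sum_i |ii> / sqrt(d).

    The maximally entangled part contributes v/d on the diagonal (i == j),
    and the white-noise part contributes (1 - v)/d^2 everywhere.
    """
    return [[v * (1 if i == j else 0) / d + (1 - v) / d**2 for j in range(d)]
            for i in range(d)]

print(transmittance(60))                      # about 0.0631

v = Fraction(95, 100)                         # exact rational visibility
p = isotropic_probabilities(3, v)
print(sum(map(sum, p)))                       # 1: the probabilities are normalized
print(sum(p[i][i] for i in range(3)))         # 29/30, i.e. v + (1 - v)/d
```

Using `Fraction` for $v$ keeps the example exact; with floating point inputs the same function returns floating point probabilities.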

The calculations were done on an AMD Ryzen Threadripper Pro 5955WX with 4 GHz and 16 cores, on a machine with 512 GiB RAM running Ubuntu Linux 22.04. Our code was run with Julia 1.10 and the modeller JuMP [lubin2023], and the two other techniques with MATLAB 2023b. For the Gauss-Radau technique we used the solver MOSEK 10.1 [mosek] and the modeller YALMIP [yalmip]. In all cases we have only reported the time taken by the solver, discounting the time taken to set up the problem, and in the Gauss-Newton case the time taken to do facial reduction numerically.

The results are shown in Figures 1, 2, and 3.

Figure 1: Running time in seconds (logarithmic scale) as a function of the photon cut-off number for the DMCV protocol.

Figure 2: Running time in seconds (logarithmic scale) as a function of the local state dimension for the MUB protocol. Note that for $d=6,10$ , and $12$ the bases used are only roughly unbiased.

Figure 3: Running time in seconds (logarithmic scale) as a function of the local state dimension for the overlap protocol.

5.2 Precision

In order to obtain results with higher precision, one might try setting tighter tolerances for the conic solver. This is however not fruitful, as very quickly one meets fundamental obstacles due to the limited precision with which the standard 64-bit floating point numbers can perform computations or even represent the problem parameters. For this reason we have used instead different implementations of floating point numbers that provide more bits of precision. This is easy due to Julia’s flexible type system, which is fully taken advantage of by the solver Hypatia and the modeller JuMP; in order to use a different type we simply need to give it as a parameter. Hypatia automatically sets the appropriate tolerances for each type. We have used the following four types:

Float64

Default 64-bit floating point number implementing the IEEE 754 standard, known as double. Has 53 bits of precision.
Float128

128-bit floating point number implementing the IEEE 754 standard, known as quadruple. Has 113 bits of precision. Provided by the package Quadmath [quadmath2024], which wraps the GCC library libquadmath.
Double64

Non-standard implementation of a 128-bit floating point number as a pair of 64-bit floating point numbers, known as double-double. Much faster than Float128, but has smaller precision and smaller range of exponent. Has 106 bits of precision. Provided by the package DoubleFloats [sarnoff2024].
BigFloat

Arbitrary precision floating-point number. We used the default settings, with which it occupies 640 bits of memory and provides 256 bits of precision. Provided by the Julia standard library, wrapping the GNU MPFR library [fousse2007].
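For orientation, the bits of precision listed above can be translated into a machine epsilon, the spacing between 1 and the next representable number, which is $2^{1-b}$ for $b$ bits of precision. The following sketch tabulates it for the four types; the dictionary simply restates the numbers given above, and for Double64 the value is only nominal, since double-double arithmetic is not a correctly rounded format.

```python
import sys
from fractions import Fraction

# Bits of precision as listed in the text for each floating point type.
PRECISION_BITS = {"Float64": 53, "Double64": 106, "Float128": 113, "BigFloat": 256}

def machine_epsilon(bits):
    """Spacing between 1 and the next representable number: 2^(1 - bits)."""
    return Fraction(1, 2 ** (bits - 1))

for name, bits in PRECISION_BITS.items():
    print(f"{name:9s} {bits:4d} bits   eps ~ {float(machine_epsilon(bits)):.3e}")

# Sanity check against the platform's own 64-bit floating point type.
assert float(machine_epsilon(53)) == sys.float_info.epsilon
```

These values only bound the rounding error of a single operation; the accuracy actually reached by the solver also depends on its tolerances and on the conditioning of the problem.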

In order to evaluate the error for each type, we solved the conic program (8) for the protocols for which we know the analytical answers: BB84 (subsection 4.1) and MUB (subsection 4.3). The results are shown in Table 1.

Although the analytical answer is also known for the DMCV protocol (subsection 4.2) for the case $\xi=0$ , a comparison is not meaningful: after facial reduction there is a unique state compatible with the constraints, so there’s nothing to optimize. We have nevertheless provided the analytical answer and the facial reduction in the accompanying source code for the interested reader.

	BB84	MUB
Float64	$5.4\times 10^{-8}$	$5.7\times 10^{-8}$
Double64	$1.4\times 10^{-13}$	$1.6\times 10^{-12}$
Float128	$1.7\times 10^{-14}$	$7.7\times 10^{-14}$
BigFloat	$7.9\times 10^{-32}$	$1.6\times 10^{-31}$

Table 1: Absolute difference between analytical $H(A|E)$ and the one computed via the conic program (8) for various floating point implementations, for the BB84 and MUB protocols. For the BB84 protocol we used the parameters $q_{x}=q_{z}=0.025$ , and for the MUB protocol we used dimension 3 with 4 bases and the probabilities of an isotropic state with visibility $v=0.95$ .

Note that in general a number that has a finite decimal expansion does not have a finite binary expansion. For example, $0.95_{10}=0.1111\overline{0011}_{2}$ . Therefore, even when giving input parameters that at first sight seem exact, one often incurs rounding errors. In order to take advantage of higher precision types, the input parameters thus also need to use them. One must also be careful to write $0.95$ as (e.g.) Double64(95)/100 instead of Double64(0.95), as the latter first computes $0.95$ as a Float64 before converting it to Double64. For this reason our example codes don’t require the type desired for the computation to be specified, but rather read it from the type of the input parameters.
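The effect can be reproduced in any language with 64-bit floating point numbers. The sketch below uses Python's exact rationals as a stand-in for a higher precision type (an assumption made for illustration; it is not the Julia code accompanying the paper): it prints the binary expansion of $0.95$ and shows that building the number from a 64-bit literal bakes in a rounding error, whereas building it from the integers 95 and 100 does not.

```python
from fractions import Fraction

def binary_digits(x, n):
    """First n binary digits after the point of a rational number 0 <= x < 1."""
    digits = []
    for _ in range(n):
        x *= 2
        bit = int(x >= 1)
        digits.append(str(bit))
        x -= bit
    return "".join(digits)

exact = Fraction(95, 100)    # analogous to Double64(95)/100: exactly 19/20
rounded = Fraction(0.95)     # analogous to Double64(0.95): a Float64 value, converted

print(binary_digits(exact, 16))      # 1111001100110011
print(rounded == exact)              # False
print(float(abs(rounded - exact)))   # rounding error, at most 2^-54
```

The error is tiny on the scale of Float64, but it is many orders of magnitude larger than the accuracy that the higher precision types can otherwise deliver.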

It’s important to emphasize that the increased precision comes at the cost of a much longer running time. Not only do the elementary operations take longer, but the number of iterations also increases in order to meet the tighter tolerances.

6 Conclusion

In this work we have introduced a new technique for obtaining the asymptotic key rate in quantum key distribution. It is based on defining a new convex cone for the key rate problem and optimizing over it with standard conic optimization methods. It provides a dramatic improvement in performance with respect to previous techniques. Moreover, it is flexible and easy to use, as one can freely combine it with other convex cones, and it is interfaced through a convenient modelling language, JuMP.

As a primal-dual method, it directly provides a lower bound through the value of the dual objective. However, we lack an explicit expression for the dual cone, which stops us from converting the dual minimizer into an independent witness for the lower bound. One can nevertheless construct such a witness from the primal minimizer as described in [winick2018] and done for example in [pascualgarcia2024].

The most pressing issue is to adapt the method to finding finite key rates. Although one can use the asymptotic key rate to obtain a finite key rate via the (generalized) entropy accumulation theorem [dupuis2020, metger2022], as done for example in [pascualgarcia2024], such bounds are known to be loose. Tight bounds are provided by more recent techniques [vanhimbeeck2024, arqand2024], which however require optimization over Rényi entropies. It would be interesting to apply conic methods to this case.

7 Code availability

The implementation of the cone introduced in this paper and the code for the examples shown are available in https://github.com/araujoms/ConicQKD.jl.

8 Related work

While conducting this research, we found that He et al. [he2024] had independently applied the Skajaa and Ye algorithm to the problem of QKD using a specialized cone, and proved the self-concordance of the barrier function (12).

9 Acknowledgements

The research of A.G.L., P.V.P., M.C.C. and M.A. was supported by the European Union–Next Generation UE/MICIU/Plan de Recuperación, Transformación y Resiliencia/Junta de Castilla y León. We thank Chris Coey and Lea Kapelevich for help with Hypatia’s code and useful discussions.


Appendix A Vectorization

Let $\operatorname{vec}:\mathbb{C}^{d_{O}\times d_{I}}\to\mathbb{C}^{d_{O}d_{I}}$ be the column-major vectorization of a real or complex rectangular matrix. It is useful to represent it as

\operatorname{vec}(X)=(\mathds{1}_{d_{I}}\otimes X)\operatorname{vec}(\mathds{1}_{d_{I}}),

(40)

where $\operatorname{vec}(\mathds{1}_{d_{I}})=\sum_{i=0}^{d_{I}-1}|ii\rangle$ . It is common to write $\operatorname{vec}(X)$ as ${|{X}\rangle\!\rangle}$ . A useful identity is

\operatorname{vec}(ABC)=(C^{T}\otimes A)\operatorname{vec}(B).

(41)
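Both the representation (40) and the identity (41) are easy to check numerically. The following is a minimal sketch with plain nested lists, where all helper names are hypothetical and the matrices are arbitrary small examples with integer and Gaussian-integer entries, so that the comparisons are exact.

```python
def matmul(A, B):
    """Matrix product of nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def kron(A, B):
    """Kronecker product: rows indexed by (row of A, row of B)."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def vec(X):
    """Column-major vectorization: stack the columns of X."""
    return [X[i][j] for j in range(len(X[0])) for i in range(len(X))]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2j, 0], [3, -1, 1 + 1j]]        # 2 x 3
B = [[2, 1], [0, 1j], [1 - 1j, 4]]       # 3 x 2
C = [[1, 2], [3j, -1]]                   # 2 x 2

# Identity (41): vec(ABC) = (C^T kron A) vec(B)
lhs = vec(matmul(matmul(A, B), C))
rhs = matvec(kron(transpose(C), A), vec(B))
print(lhs == rhs)                        # True

# Representation (40): vec(X) = (1 kron X) vec(1), with X = A and d_I = 3
print(vec(A) == matvec(kron(identity(3), A), vec(identity(3))))   # True
```

Note that the transpose in (41) is a plain transpose, not a conjugate transpose, which is why the complex entries of `C` are not conjugated.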

For implementation purposes we work with a non-redundant vectorization for Hermitian matrices $\operatorname{svec}(X)$ . The real and complex cases differ. In the real case, $\operatorname{svec}:\mathbb{R}^{d\times d}\to\mathbb{R}^{d(d+1)/2}$ is the column-major vectorization of the upper triangle, with the off-diagonals multiplied by $\sqrt{2}$ . In the complex case, $\operatorname{svec}:\mathbb{C}^{d\times d}\to\mathbb{R}^{d^{2}}$ additionally splits the complex numbers into the real part and minus the imaginary part, storing them consecutively. The sign of the imaginary part is an arbitrary choice, made to coincide with the code.

The factor of $\sqrt{2}$ multiplying the off-diagonals is necessary to ensure that

\langle X,Y\rangle=\langle\operatorname{vec}(X),\operatorname{vec}(Y)\rangle=\langle\operatorname{svec}(X),\operatorname{svec}(Y)\rangle\quad\forall X,Y.

(42)
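A minimal sketch of $\operatorname{svec}$ and of the inner product identity (42) follows. It assumes one particular ordering within each column (off-diagonal entries from top to bottom, each as real part followed by minus imaginary part, then the diagonal entry); the ordering used in the accompanying code may differ in detail, and the function names are hypothetical.

```python
import math

def svec(X, complex_case=True):
    """Non-redundant vectorization of a Hermitian (or real symmetric) matrix.

    Column-major over the upper triangle; off-diagonals are scaled by sqrt(2).
    In the complex case each off-diagonal entry is stored as
    (real part, minus imaginary part).
    """
    d = len(X)
    out = []
    for j in range(d):
        for i in range(j):
            z = complex(X[i][j])
            out.append(math.sqrt(2) * z.real)
            if complex_case:
                out.append(-math.sqrt(2) * z.imag)
        out.append(complex(X[j][j]).real)   # diagonal of a Hermitian matrix is real
    return out

def hs_inner(X, Y):
    """Hilbert-Schmidt inner product <X, Y> = tr(X^dagger Y)."""
    d = len(X)
    return sum(complex(X[i][j]).conjugate() * Y[i][j]
               for i in range(d) for j in range(d))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

X = [[1, 2 - 1j], [2 + 1j, 3]]
Y = [[0.5, 1j], [-1j, 2]]

print(hs_inner(X, Y))            # (4.5+0j)
print(dot(svec(X), svec(Y)))     # 4.5 up to floating point rounding
print(len(svec(X)))              # 4, i.e. d^2 real numbers for d = 2
```

Without the factor of $\sqrt{2}$ each off-diagonal pair would be counted once instead of twice, and the second equality in (42) would fail.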

Let $V$ be the isometry (with a slight abuse of notation, since the operator is different in the real and complex cases; note also that in the complex case $V$ is additionally unitary) such that

\operatorname{vec}(X)=V\operatorname{svec}(X)

(43)

for all Hermitian $X$ . Suppose you are interested in representing a linear function $f$ as a matrix $F$ such that

F\operatorname{vec}(X)=\operatorname{vec}(f(X))\quad\forall X.

(44)

This is easy to do using identity (41) and linearity. If this function is additionally Hermitian-preserving we have that

V^{\dagger}FV\operatorname{svec}(X)=\operatorname{svec}(f(X))

(45)

for all Hermitian $X$ .

This is mainly useful for proving theorems, as a direct computation of $V^{\dagger}FV$ via this formula is inefficient. For the particular case of $f(X)=KXK^{\dagger}$ we implemented an efficient function to compute it, $\operatorname{skron}(K)=V^{\dagger}(\bar{K}\otimes K)V$ .
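To illustrate what such a matrix looks like, here is a minimal sketch for the real case only, where $f(X)=KXK^{T}$. It is not the $\operatorname{skron}$ implementation from the accompanying code: it simply builds the matrix representing $f$ in $\operatorname{svec}$ coordinates one column at a time, by applying $f$ to the symmetric matrices corresponding to the standard basis vectors, and then checks relation (45) on an example. All function names are hypothetical.

```python
import math

def svec(X):
    """Real case: column-major upper triangle, off-diagonals scaled by sqrt(2)."""
    d = len(X)
    return [X[i][j] * (1.0 if i == j else math.sqrt(2))
            for j in range(d) for i in range(j + 1)]

def smat(v, d):
    """Inverse of svec: rebuild the symmetric d x d matrix from its svec."""
    X = [[0.0] * d for _ in range(d)]
    k = 0
    for j in range(d):
        for i in range(j + 1):
            X[i][j] = X[j][i] = v[k] / (1.0 if i == j else math.sqrt(2))
            k += 1
    return X

def conjugate_by(K, X):
    """f(X) = K X K^T for a real n x d matrix K and a d x d matrix X."""
    n, d = len(K), len(X)
    KX = [[sum(K[i][k] * X[k][j] for k in range(d)) for j in range(d)]
          for i in range(n)]
    return [[sum(KX[i][k] * K[j][k] for k in range(d)) for j in range(n)]
            for i in range(n)]

def svec_matrix(K):
    """Matrix M with M svec(X) = svec(K X K^T), built column by column."""
    d = len(K[0])
    m = d * (d + 1) // 2
    cols = []
    for k in range(m):
        e = [0.0] * m
        e[k] = 1.0                      # k-th standard basis vector
        cols.append(svec(conjugate_by(K, smat(e, d))))
    return [list(row) for row in zip(*cols)]

K = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]   # 3 x 2
X = [[2.0, 1.0], [1.0, -1.0]]               # symmetric 2 x 2

M = svec_matrix(K)                           # 6 x 3
lhs = [sum(m * x for m, x in zip(row, svec(X))) for row in M]
rhs = svec(conjugate_by(K, X))
print(max(abs(a - b) for a, b in zip(lhs, rhs)))   # zero up to rounding
```

By linearity the matrix built this way coincides with $V^{\dagger}(\bar{K}\otimes K)V$ in the real case, without ever forming the Kronecker product.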