Non-asymptotic Approximation Error Bounds of Parameterized Quantum Circuits

Zhan Yu1, 2   Qiuhao Chen1   Yuling Jiao1, 3   Yinan Li1, 3   Xiliang Lu1, 3
Xin Wang4   Jerry Zhijian Yang1, 3
1 School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China
2 Centre for Quantum Technologies, National University of Singapore, 117543, Singapore
3 Hubei Key Laboratory of Computational Science, Wuhan 430072, China
4 Thrust of Artificial Intelligence, Information Hub,
Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511453, China
Z.Y. and Q.C. contributed equally to this work.
Abstract

Parameterized quantum circuits (PQCs) have emerged as a promising approach for quantum neural networks. However, understanding their expressive power in accomplishing machine learning tasks remains a crucial question. This paper investigates the expressivity of PQCs for approximating general multivariate function classes. Unlike previous Universal Approximation Theorems for PQCs, which are either nonconstructive or rely on parameterized classical data processing, we explicitly construct data re-uploading PQCs for approximating multivariate polynomials and smooth functions. We establish the first non-asymptotic approximation error bounds for these functions in terms of the number of qubits, quantum circuit depth, and number of trainable parameters. Notably, we demonstrate that for approximating functions that satisfy specific smoothness criteria, the quantum circuit size and number of trainable parameters of our proposed PQCs can be smaller than those of deep ReLU neural networks. We further validate the approximation capability of PQCs through numerical experiments. Our results provide a theoretical foundation for designing practical PQCs and quantum neural networks for machine learning tasks that can be implemented on near-term quantum devices, paving the way for the advancement of quantum machine learning.

1 Introduction

In quantum computing, one key area is to investigate whether quantum computers could accelerate classical machine learning tasks in data analysis and artificial intelligence, giving rise to an interdisciplinary field known as quantum machine learning [1]. As the quantum analogs of classical neural networks, parameterized quantum circuits (PQCs) [2] have gained significant attention as a prominent paradigm to yield quantum advantages. PQCs offer a concrete and practical way to implement quantum machine learning algorithms on noisy intermediate-scale quantum (NISQ) devices [3], rendering them well-suited for a diverse array of tasks [4, 5, 6, 7, 8, 9, 10, 11].

To establish the practical significance of quantum machine learning, an ongoing pursuit is to demonstrate its superiority in solving real-world learning problems compared to classical learning models, including the most commonly used deep neural networks [12]. Typical supervised learning tasks, such as image classification and price prediction, aim to construct a model that learns a mapping from input to output via training data sets. Essentially, the goal is to approximate multivariate functions. This viewpoint leads to the celebrated Universal Approximation Theorem [13, 14], which characterizes what neural networks can theoretically learn. Recently, powerful tools from approximation theory have been utilized to establish a fruitful mathematical framework for understanding the “black magic” of deep learning by establishing non-asymptotic approximation error bounds of deep neural networks in terms of the width, depth, number of weights (neurons) and function complexities, see e.g. Refs. [15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25] and references therein.

Substantial investigations have showcased the power of quantum machine learning for specific learning tasks [26, 27, 28, 29, 30, 31, 32, 33]. A fundamental question is whether the expressivity of quantum machine learning models is as powerful as, or is more powerful than, the expressivity of classical machine learning models. This can be illustrated by proving universal approximation theorems for PQCs [34, 35, 36, 37, 38, 39, 40, 41], indicating that there exist PQCs with suitable parameter configurations to approximate target functions up to a given approximation accuracy. This will justify the power of PQCs to solve supervised learning tasks in a mathematical way. To further investigate whether PQCs are more expressive than the classical models or not, it is natural to examine the PQC approximation performance by establishing approximation error bounds for important function classes. Such quantitative error bounds are less known in the quantum setting, because the hypothesis functions generated by PQCs are more complicated than those generated by classical neural networks.

The difficulties of analyzing the PQC approximation performance can be partially overcome by allowing parameterized classical data processing. Namely, trainable parameters are allowed not only in the quantum gates in PQCs but also in the classical data pre- and post-processing. This allows one to prove approximation error bounds following classical strategies [39, 41, 40]. For instance, Goto et al. [39] proved a PQC approximation error rate for Lipschitz continuous functions in terms of the number of qubits and trainable parameters by incorporating trainable parameters in the measurement post-processing phase; similar results can also be obtained by utilizing tensor-train networks [41] or by using linear transformations to preprocess the classical data.

However, utilizing parameterized classical data processing makes it hard to distinguish whether the expressive power of PQCs comes from the classical or quantum parts. In fact, parameterized classical data processing enables one to directly convert the hypothesis functions generated by the quantum models into hypothesis functions generated by classical ones and adapt expressivity results for classical machine learning models to extract the expressivity of such quantum models. As a consequence, the resulting PQCs have very simple structures and short depth. It remains unknown whether one can prove approximation error bounds for PQCs without parameterized classical data processing. On the other hand, Zhao et al. [42] proved exponential lower bounds on the number of trainable parameters (in terms of the number of variables) needed for approximating bounded Lipschitz continuous functions using PQCs without parameterized classical data processing, illustrating that using PQCs to approximate Lipschitz functions still suffers from the curse of dimensionality (CoD) met by classical deep neural networks [43]. However, this does not rule out the possibility that one can achieve the same approximation rate with PQCs of smaller size compared to classical deep neural networks.

In this paper, we explicitly construct the first PQCs without parameterized classical data processing for approximating multivariate polynomials and smooth functions; an overview of the constructed PQCs is given in Fig. 1. This eliminates the ambiguity regarding whether the expressivity originates from classical or quantum parts. We also establish non-asymptotic PQC approximation error bounds, in the sense that the PQC approximation performance is characterized in terms of the number of qubits (width), the depth of PQCs, the number of trainable parameters/gates (parameter count), and the function complexities. These results enable us to compare the approximation power of PQCs with that of classical neural networks. Notably, we show that for multivariate smooth functions, the quantum circuit size and the number of trainable parameters of our proposed PQCs demonstrate an improvement over the prior result of deep ReLU neural networks [21], one of the most commonly used neural network families in classical deep learning theory. Our proposed PQCs not only possess the universal approximation property but also achieve parameter efficiency comparable to classical neural networks, potentially leading to more efficient and scalable quantum machine learning algorithms for real-world tasks.

Figure 1: Overview of PQCs for approximating continuous functions. (a) Flowchart illustrating the strategy for using PQCs to approximate continuous functions via implementing Bernstein polynomials. The input data $x$ is encoded into the PQC through $S(x)$, with the PQC (blue background) capable of representing parity-constrained polynomials up to degree $3$ (as $x$ is encoded three times). The technique of linear combination of unitaries (LCU) is used to aggregate these polynomials together. The output of PQC derives from measurement with a specific observable. Fine-tuning trainable parameters in $R_Z$ gates yields a polynomial output depicted in the right panel. (b) Flowchart illustrating the strategy of approximation via local Taylor expansions. We first apply a PQC to localize the input domain into $K=5$ regions. For example, for input $x\in[0.8,1]$, PQC outputs $x'=0.8$ as a fixed point. Then $x-x'$ will be fed into a new PQC for implementing the local Taylor expansions at the fixed point $x'$, forming a nesting architecture. Control gates with pink backgrounds implement the Taylor coefficients. Fine-tuning trainable parameters in $R_X$ and $R_Z$ gates yields a piecewise polynomial with degree $3$ that approximates the target function.

2 Preliminaries

Quantum states.

The basic unit of information in quantum computing is the qubit, which can exist in a superposition of the states 0 and 1 simultaneously, unlike classical bits that are restricted to either 0 or 1. A pure quantum state in the $d$-dimensional Hilbert space $\mathbb{C}^{d}$ is represented in Dirac notation as $\ket{\phi}$. The conjugate transpose of $\ket{\phi}$ is denoted by $\bra{\phi}$. The inner product of two quantum states $\ket{\phi}$ and $\ket{\psi}$ is written as $\braket{\phi}{\psi}$. An important property is that $\braket{\phi}{\phi}=1$ for any pure state $\ket{\phi}$. By convention, the computational basis states for single-qubit systems are written as $\ket{0}=[1,0]^{T}$ and $\ket{1}=[0,1]^{T}$, where the superscript $T$ denotes the transpose. For $n$-qubit systems, the computational basis states are expressed as $\ket{j}\in\{\ket{0},\ket{1}\}^{\otimes n}$, where $\otimes$ denotes the tensor product operation.

Quantum gates.

Quantum gates are the building blocks of quantum circuits and operate on quantum states. Unlike classical gates, quantum gates are reversible and are described by unitary matrices. In quantum machine learning, common parameterized quantum gates include the single-qubit Pauli rotation gates $R_X(\theta)=e^{-i\theta X/2}$, $R_Y(\theta)=e^{-i\theta Y/2}$, and $R_Z(\theta)=e^{-i\theta Z/2}$, which rotate a quantum state through angle $\theta$ around the corresponding axis, where the three Pauli operators are defined as:

\[
X=\begin{bmatrix}0&1\\ 1&0\end{bmatrix},\quad Y=\begin{bmatrix}0&-i\\ i&0\end{bmatrix},\quad Z=\begin{bmatrix}1&0\\ 0&-1\end{bmatrix},
\]

where $i$ represents the imaginary unit. Commonly used two-qubit quantum gates include the CNOT gate, which flips the target qubit if and only if the control qubit is in $\ket{1}$.
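As a quick numerical sanity check of these definitions, the following minimal Python sketch builds the rotation gates from the closed form $R_P(\theta)=\cos(\theta/2)I-i\sin(\theta/2)P$, which equals $e^{-i\theta P/2}$ for any Pauli operator $P$ because $P^2=I$, and verifies that they and the CNOT matrix are unitary. The helper names (`rot`, `matmul`, `dagger`, `is_unitary`, `apply`) and the basis ordering with the control qubit first are illustrative assumptions and do not come from the paper.

```python
import math

def rot(axis, theta):
    """R_axis(theta) = cos(theta/2) I - i sin(theta/2) P for P in {X, Y, Z}."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    if axis == "X":
        return [[c, -1j * s], [-1j * s, c]]
    if axis == "Y":
        return [[c, -s], [s, c]]
    if axis == "Z":
        return [[c - 1j * s, 0], [0, c + 1j * s]]
    raise ValueError("axis must be 'X', 'Y', or 'Z'")

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Conjugate transpose."""
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def is_unitary(U, tol=1e-12):
    """Check U^dagger U = I entrywise up to a tolerance."""
    P = matmul(dagger(U), U)
    n = len(U)
    return all(abs(P[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

# CNOT in the basis |00>, |01>, |10>, |11>; the first qubit is the control
# (an assumed ordering chosen for this sketch).
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

def apply(U, state):
    """Apply a matrix to a state vector."""
    return [sum(U[i][k] * state[k] for k in range(len(state)))
            for i in range(len(U))]

print(all(is_unitary(rot(a, 0.7)) for a in "XYZ"), is_unitary(CNOT))
print(apply(CNOT, [0, 0, 1, 0]))  # |10> is mapped to |11>
```

The CNOT check shows the control-dependent flip: the basis state $\ket{10}$ is mapped to $\ket{11}$, while $\ket{01}$ is left unchanged.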

Quantum measurement.

Quantum measurement is a procedure that manipulates a quantum system to extract classical information. The simplest measurement is the computational basis measurement: for a single-qubit system $\ket{\psi}=\alpha\ket{0}+\beta\ket{1}$, the outcome is either $\ket{0}$ with probability $|\alpha|^{2}$ or $\ket{1}$ with probability $|\beta|^{2}$. These measurements project the quantum state onto the measured basis, collapsing the state itself. Observables, represented by Hermitian operators, correspond to measurable quantities in a quantum system like energy or position. Each observable has a set of possible outcomes (eigenvalues) and corresponding states (eigenvectors). When a measurement of an observable is performed, the outcome is one of the eigenvalues, and the state of the system collapses to the corresponding eigenvector. If we measure a state $\ket{\psi}$ using an observable $\mathcal{O}$, the expected value of the outcome is $\bra{\psi}\mathcal{O}\ket{\psi}$. This represents the average result one would expect from repeated measurements on identically prepared systems. A comprehensive introduction to the fundamental notations and concepts of quantum computation can be found in [44].
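To make the measurement rules concrete, consider the state $\ket{\psi}=\cos(\theta/2)\ket{0}+\sin(\theta/2)\ket{1}$. Its outcome probabilities are $\cos^2(\theta/2)$ and $\sin^2(\theta/2)$, and the expected value of the Pauli $Z$ observable is $\cos^2(\theta/2)-\sin^2(\theta/2)=\cos\theta$. The short Python sketch below evaluates these quantities for $\theta=\pi/3$; the function names are illustrative assumptions.

```python
import math

Z = [[1, 0], [0, -1]]  # Pauli Z observable with eigenvalues +1 and -1

def probabilities(state):
    """Computational-basis outcome probabilities |amplitude|^2."""
    return [abs(a) ** 2 for a in state]

def expectation(state, obs):
    """<psi|O|psi> for a state vector and a Hermitian matrix O."""
    o_psi = [sum(obs[i][k] * state[k] for k in range(len(state)))
             for i in range(len(obs))]
    return sum(complex(a).conjugate() * b for a, b in zip(state, o_psi)).real

theta = math.pi / 3
psi = [math.cos(theta / 2), math.sin(theta / 2)]
p0, p1 = probabilities(psi)
print(p0, p1, expectation(psi, Z))  # approximately 0.75, 0.25, 0.5
```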

Data re-uploading PQCs.

The PQCs we shall construct in this paper are of data re-uploading type [11], i.e., consisting of interleaved data encoding circuit blocks and trainable circuit blocks. More precisely, let $\bm{x}$ be the input data vector and $\bm{\theta}=(\bm{\theta}_{0},\ldots,\bm{\theta}_{L})$ be a set of trainable parameter vectors. $S(\bm{x})$ is a quantum circuit that encodes $\bm{x}$ and $V(\bm{\theta}_{j})$ is a trainable quantum circuit with trainable parameter vector $\bm{\theta}_{j}$. An $L$-layer data re-uploading PQC can then be expressed as

\[
U_{\bm{\theta}}(\bm{x})=V(\bm{\theta}_{0})\prod_{j=1}^{L}S(\bm{x})V(\bm{\theta}_{j}). \tag{1}
\]

Applying $U_{\bm{\theta}}(\bm{x})$ to an initial quantum state and measuring the output states provides a way to express functions on $\bm{x}$:

\[
f_{U_{\bm{\theta}}}(\bm{x})\coloneqq\bra{0}U^{\dagger}_{\bm{\theta}}(\bm{x})\,\mathcal{O}\,U_{\bm{\theta}}(\bm{x})\ket{0}, \tag{2}
\]

where $\mathcal{O}$ is some Hermitian observable. The approximation capability of the PQC $U_{\bm{\theta}}(\bm{x})$ can be characterized by the classes of functions that $f_{U_{\bm{\theta}}}(\bm{x})$ can approximate by tuning the trainable parameter vector $\bm{\theta}$. We now turn to an example of single-qubit PQCs approximating univariate functions. For the input $x\in[-1,1]$, we use the Pauli $X$ basis encoding scheme [10] and define the data encoding operator as a Pauli $X$ rotation $S(x)\coloneqq e^{i\arccos(x)X}$. Interleaving the data encoding unitary $S(x)$ with parameterized Pauli $Z$ rotations $R_Z(\theta)$ gives the data re-uploading PQC for one variable, $U_{\bm{\theta}}(x)\coloneqq R_Z(\theta_{0})\prod_{j=1}^{L}S(x)R_Z(\theta_{j})$, where $\bm{\theta}=(\theta_{0},\ldots,\theta_{L})\in\mathbb{R}^{L+1}$ is a set of trainable parameters. Utilizing results from quantum signal processing [45, 46, 47], there exists $\bm{\theta}\in\mathbb{R}^{L+1}$ such that $U_{\bm{\theta}}(x)$ implements the polynomial transformation $p(x)\in\mathbb{R}[x]$ as $p(x)=\bra{+}U_{\bm{\theta}}(x)\ket{+}$ for any $x\in[-1,1]$ if and only if the degree of $p(x)$ is at most $L$, the parity of $p(x)$ is $L\bmod 2$, and $\lvert p(x)\rvert\leq 1$ for all $x\in[-1,1]$. Here a polynomial $p(x)$ has parity $0$ if all coefficients corresponding to odd powers of $x$ are $0$, and parity $1$ if all coefficients corresponding to even powers of $x$ are $0$. Consequently, univariate functions that can be approximated by such a polynomial $p(x)$ can also be approximated by the PQC $U_{\bm{\theta}}(x)$.
Besides real polynomials, there are also single-qubit PQCs with Pauli $Z$ basis encoding that can implement complex trigonometric polynomials [37].
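One instance of the single-qubit construction can be checked by hand. With the angle choice $\bm{\theta}=(\pi/2,0,\ldots,0,-\pi/2)$, which is our own illustrative choice rather than one taken from the paper, the outer $R_Z$ rotations conjugate $S(x)^L$ into $e^{iL\arccos(x)Y}$, so that $\bra{+}U_{\bm{\theta}}(x)\ket{+}=\cos(L\arccos x)=T_L(x)$. This is the Chebyshev polynomial of degree $L$, which has parity $L\bmod 2$ and is bounded by $1$ on $[-1,1]$. The Python sketch below, with assumed helper names, verifies this numerically under the convention $R_Z(\theta)=e^{-i\theta Z/2}$.

```python
import math

def rz(theta):
    """R_Z(theta) = exp(-i theta Z / 2) as a 2x2 matrix."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c - 1j * s, 0], [0, c + 1j * s]]

def encode(x):
    """S(x) = exp(i arccos(x) X) = [[x, i*sqrt(1-x^2)], [i*sqrt(1-x^2), x]]."""
    s = math.sqrt(max(0.0, 1.0 - x * x))
    return [[x, 1j * s], [1j * s, x]]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def reupload(x, thetas):
    """U_theta(x) = R_Z(theta_0) * prod_{j=1..L} S(x) R_Z(theta_j)."""
    U = rz(thetas[0])
    for t in thetas[1:]:
        U = matmul(matmul(U, encode(x)), rz(t))
    return U

def plus_amplitude(x, thetas):
    """<+|U_theta(x)|+>, i.e. half the sum of the four matrix entries."""
    U = reupload(x, thetas)
    return (U[0][0] + U[0][1] + U[1][0] + U[1][1]) / 2

def chebyshev_angles(L):
    """Angles (pi/2, 0, ..., 0, -pi/2); an illustrative choice giving T_L."""
    thetas = [0.0] * (L + 1)
    thetas[0] += math.pi / 2
    thetas[-1] -= math.pi / 2
    return thetas

for L in range(4):
    x = 0.3
    print(L, plus_amplitude(x, chebyshev_angles(L)), math.cos(L * math.acos(x)))
```

For $L=1$ this reproduces $p(x)=x$, and for $L=2$ it gives $p(x)=2x^2-1$. Finding angles for a general admissible polynomial requires a quantum signal processing phase-finding routine, which this sketch does not attempt.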

3 Expressivity of PQCs for multivariate continuous functions

3.1 Explicit construction of PQCs for multivariate polynomials

Although PQCs for approximating univariate functions have been constructed and analyzed, they have not yet been generally extended to the case of multivariate functions. Current proofs of universal approximation for multivariate functions are nonconstructive [34, 38] and require arbitrary circuit width, arbitrary multi-qubit global parameterized unitaries, and arbitrary observables. Goto et al. [39] proposed several constructions for approximating multivariate functions with the assistance of parameterized data pre-processing and post-processing, yielding a quantum-enhanced hybrid scheme rather than a purely quantum setting.

We now move to our explicit construction of PQCs for multivariate polynomials. A multivariate polynomial with $d$ variables and degree $s$ is defined as $p(\bm{x})\coloneqq\sum_{\lVert\bm{\alpha}\rVert_{1}\leq s}c_{\bm{\alpha}}\bm{x}^{\bm{\alpha}}$ where $\bm{x}^{\bm{\alpha}}=x_{1}^{\alpha_{1}}x_{2}^{\alpha_{2}}\cdots x_{d}^{\alpha_{d}}$. To implement the multivariate polynomial $p(\bm{x})$, we first build a PQC to express a monomial $c_{\bm{\alpha}}\bm{x}^{\bm{\alpha}}$.
The construction is a trivial extension of the univariate case: We simply apply the single-qubit PQC with Pauli $X$ basis encoding on each $x_{j}$ to implement $x_{j}^{\alpha_{j}}$ for $1\leq j\leq d$, respectively. The coefficient $c_{\bm{\alpha}}\in\mathbb{R}$ could be implemented by any of these PQCs. Thus we could construct a PQC $U^{\bm{\alpha}}(\bm{x})\coloneqq\bigotimes_{j=1}^{d}U_{\bm{\theta}_{j}}(x_{j})$ such that $\bra{+}^{\otimes d}U^{\bm{\alpha}}(\bm{x})\ket{+}^{\otimes d}=c_{\bm{\alpha}}\bm{x}^{\bm{\alpha}}$.
The depth of the PQC $U^{\bm{\alpha}}(\bm{x})$ is at most $2s+1$, the width is at most $d$, and the number of parameters is at most $s+d$.
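The tensor-product structure can be illustrated on the smallest nontrivial case, the monomial $c\,x_1x_2$ with $d=2$ and $\bm{\alpha}=(1,1)$. A direct calculation for one encoding layer gives $\bra{+}R_Z(\theta_0)S(x)R_Z(\theta_1)\ket{+}=x\cos\frac{\theta_0+\theta_1}{2}+i\sqrt{1-x^2}\cos\frac{\theta_0-\theta_1}{2}$, so choosing $\theta_0=\arccos(c)+\pi/2$ and $\theta_1=\arccos(c)-\pi/2$ yields $c\,x$ for $\lvert c\rvert\leq 1$. This angle formula is our own derivation under the convention $R_Z(\theta)=e^{-i\theta Z/2}$, and the sketch is restricted to first powers because the text does not list angles for higher powers. All function names below are assumptions.

```python
import math

def rz(theta):
    """R_Z(theta) = exp(-i theta Z / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c - 1j * s, 0], [0, c + 1j * s]]

def encode(x):
    """S(x) = exp(i arccos(x) X)."""
    s = math.sqrt(max(0.0, 1.0 - x * x))
    return [[x, 1j * s], [1j * s, x]]

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    """Tensor (Kronecker) product of two matrices."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def scaled_linear_pqc(x, c=1.0):
    """One-layer PQC R_Z(t0) S(x) R_Z(t1) with <+|U|+> = c * x, for |c| <= 1."""
    t0 = math.acos(c) + math.pi / 2
    t1 = math.acos(c) - math.pi / 2
    return matmul(matmul(rz(t0), encode(x)), rz(t1))

def plus_amplitude(U):
    """<+|^n U |+>^n, i.e. the sum of all entries of U divided by its dimension."""
    return sum(sum(row) for row in U) / len(U)

def monomial_amplitude(x1, x2, c):
    """Amplitude of the two-qubit PQC U(x1) (x) U(x2); c is carried by qubit 1."""
    U = kron(scaled_linear_pqc(x1, c), scaled_linear_pqc(x2))
    return plus_amplitude(U)

print(monomial_amplitude(0.5, -0.4, 0.7), 0.7 * 0.5 * (-0.4))
```

Because the circuit is a tensor product and the input state is a product state, the amplitude factorizes into the product of the single-qubit amplitudes, which is why the coefficient can be placed on any one of the qubits.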

Having PQCs that implement monomials, the next step is to aggregate monomials to implement the multivariate polynomial. A natural idea is to sum the monomial PQCs together as $U_{p}(\bm{x})=\sum_{\lVert\bm{\alpha}\rVert_{1}\leq s}U^{\bm{\alpha}}(\bm{x})$. However, the addition operation in quantum computing is non-trivial as the sum of unitary operators is not necessarily unitary. To overcome this issue, we utilize linear combination of unitaries (LCU) [48] to implement the operator $U_{p}(\bm{x})$ on a quantum computer. Realizing the linear combination of PQCs $U^{\bm{\alpha}}(\bm{x})$ requires applying multi-qubit control on each $U^{\bm{\alpha}}(\bm{x})$, which could be further decomposed into linear-depth quantum circuits of CNOT gates and single-qubit rotation gates without using any ancilla qubit [49]. Then we can obtain the polynomial $p(\bm{x})=\bra{+}^{\otimes d}U_{p}(\bm{x})\ket{+}^{\otimes d}$ by applying the Hadamard test on the LCU circuit.
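A two-term toy version of the LCU idea can be sketched as follows. One ancilla qubit is put into superposition by a Hadamard gate, the unitaries $U_0$ and $U_1$ are applied controlled on the ancilla, and a second Hadamard gate is applied. The block of the resulting unitary with the ancilla in $\ket{0}$ is then $(U_0+U_1)/2$. In the Python sketch below, $U_0$ and $U_1$ are single-qubit PQCs whose $\ket{+}$ amplitudes are the Chebyshev polynomials $T_1(x)=x$ and $T_2(x)=2x^2-1$, an illustrative choice of ours, so the combined amplitude $(x+2x^2-1)/2$ no longer has a definite parity. The normalization factor $1/2$, the helper names, and the omission of the Hadamard test are simplifying assumptions of this sketch and not the paper's full construction.

```python
import math

def rz(theta):
    """R_Z(theta) = exp(-i theta Z / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c - 1j * s, 0], [0, c + 1j * s]]

def encode(x):
    """S(x) = exp(i arccos(x) X)."""
    s = math.sqrt(max(0.0, 1.0 - x * x))
    return [[x, 1j * s], [1j * s, x]]

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    """Tensor (Kronecker) product of two matrices."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def chebyshev_pqc(x, L):
    """R_Z(pi/2) S(x)^L R_Z(-pi/2), whose <+|.|+> amplitude is T_L(x)."""
    U = rz(math.pi / 2)
    for _ in range(L):
        U = matmul(U, encode(x))
    return matmul(U, rz(-math.pi / 2))

def lcu_two_terms(U0, U1):
    """(H (x) I) (|0><0| (x) U0 + |1><1| (x) U1) (H (x) I), ancilla qubit first."""
    h = 1 / math.sqrt(2)
    h_on_ancilla = kron([[h, h], [h, -h]], [[1, 0], [0, 1]])
    select = [U0[0] + [0, 0], U0[1] + [0, 0], [0, 0] + U1[0], [0, 0] + U1[1]]
    return matmul(matmul(h_on_ancilla, select), h_on_ancilla)

def lcu_amplitude(x):
    """<0,+| W |0,+>, read off from the ancilla-|0> block (U0 + U1) / 2 of W."""
    W = lcu_two_terms(chebyshev_pqc(x, 1), chebyshev_pqc(x, 2))
    block = [W[0][:2], W[1][:2]]
    return (block[0][0] + block[0][1] + block[1][0] + block[1][1]) / 2

x = 0.3
print(lcu_amplitude(x), (x + 2 * x * x - 1) / 2)
```

The full circuit `W` is unitary even though the block $(U_0+U_1)/2$ is not, which is how LCU sidesteps the fact that sums of unitaries are generally not unitary.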
Summarizing the above, we establish the following theorem about using PQCs to implement multivariate polynomials. A formal description of such PQCs is given in Appendix B.

Theorem 1.

For any multivariate polynomial $p(\bm{x})$ with $d$ variables and degree $s$ such that $\lvert p(\bm{x})\rvert\leq 1$ for $\bm{x}\in[0,1]^{d}$, there exists a PQC $W_{p}(\bm{x})$ such that

\[
f_{W_{p}}(\bm{x})\coloneqq\bra{0}W^{\dagger}_{p}(\bm{x})\,Z^{(0)}\,W_{p}(\bm{x})\ket{0}=p(\bm{x}) \tag{3}
\]

where $Z^{(0)}$ is the Pauli $Z$ observable on the first qubit. The width of the PQC is $O(d+\log s+s\log d)$, the depth is $O(s^{2}d^{s}(\log s+s\log d))$, and the number of parameters is $O(sd^{s}(s+d))$.

Note that the initial state in the Hadamard test is $\ket{0}^{\otimes d}$ since $\ket{+}^{\otimes d}$ could be easily prepared by applying Hadamard gates on $\ket{0}^{\otimes d}$. Measuring the first qubit of $W_{p}(\bm{x})$ for $O(\frac{1}{\varepsilon^{2}})$ times is needed to estimate the value of $p(\bm{x})$ up to an additive error $\varepsilon$. We could further use the amplitude estimation algorithm [50] to reduce the overhead while increasing the circuit depth by $O(\frac{1}{\varepsilon})$.
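The $O(1/\varepsilon^{2})$ sampling cost can be made concrete with Hoeffding's inequality, a standard concentration bound that is not part of the paper's analysis. For outcomes in $\{+1,-1\}$, taking $N\geq 2\ln(2/\delta)/\varepsilon^{2}$ shots ensures that the sample mean is within $\varepsilon$ of the true expectation with probability at least $1-\delta$. The failure probability $\delta$ and the function names in the Python sketch below are assumptions introduced for illustration.

```python
import math
import random

def shots_needed(eps, delta):
    """Hoeffding bound for +/-1 outcomes: N >= 2 ln(2/delta) / eps^2."""
    return math.ceil(2 * math.log(2 / delta) / eps ** 2)

def estimate_expectation(p, shots, rng):
    """Simulate Z measurements with mean p: +1 w.p. (1+p)/2, else -1."""
    total = sum(1 if rng.random() < (1 + p) / 2 else -1 for _ in range(shots))
    return total / shots

rng = random.Random(0)  # fixed seed so the simulation is reproducible
n = shots_needed(eps=0.1, delta=0.05)
print(n, estimate_expectation(0.3, n, rng))
```

Halving $\varepsilon$ roughly quadruples the required number of shots, which is the quadratic overhead that amplitude estimation is meant to reduce.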

3.2 PQC approximation for continuous functions

Polynomials play a central role in approximation theory. The celebrated Weierstrass approximation theorem (see e.g. [51, Sec. 10.2.2]) indicates that polynomials are sufficient to approximate continuous univariate functions. For multivariate functions, their approximation can be implemented using Bernstein polynomials [52, 53]. We shall apply these results to prove PQC approximation error bounds for multivariate Lipschitz continuous functions.

For a $d$-variable continuous function $f\colon[0,1]^{d}\to\mathbb{R}$, the multivariate Bernstein polynomial of $f$ with degree $n\in\mathbb{N}^{+}$ is defined as

$$B_{n}(\bm{x})\coloneqq\sum_{k_{1}=0}^{n}\cdots\sum_{k_{d}=0}^{n}f\bigl(\frac{\bm{k}}{n}\bigr)\prod_{j=1}^{d}\binom{n}{k_{j}}x_{j}^{k_{j}}(1-x_{j})^{n-k_{j}}, \tag{4}$$

where $\bm{k}=(k_{1},\ldots,k_{d})\in\{0,\ldots,n\}^{d}$. It is known that Bernstein polynomials converge uniformly to $f$ on $[0,1]^{d}$ as $n\to\infty$ [52, 53]. The PQC constructed in Theorem 1 can implement the Bernstein polynomial with proper rescaling, which implies that the PQC is a universal approximator for any bounded continuous function.
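The definition in Eq. 4 can be checked numerically on a classical computer. The following sketch evaluates $B_{n}(\bm{x})$ by direct summation over all $(n+1)^{d}$ index tuples; the function name `bernstein` and the two test functions are illustrative assumptions, and the brute-force loop is only practical for small $d$ and $n$.

```python
import itertools
import math
import numpy as np


def bernstein(f, n: int, x) -> float:
    """Evaluate the degree-n multivariate Bernstein polynomial of f at x."""
    x = np.asarray(x, dtype=float)
    d = len(x)
    k = np.arange(n + 1)
    binom = np.array([math.comb(n, int(i)) for i in k], dtype=float)
    # basis[j, k] = C(n, k) * x_j^k * (1 - x_j)^(n - k)
    basis = binom * x[:, None] ** k * (1.0 - x[:, None]) ** (n - k)
    total = 0.0
    for ks in itertools.product(range(n + 1), repeat=d):
        weight = math.prod(basis[j, ks[j]] for j in range(d))
        total += f(np.array(ks) / n) * weight
    return total


def bilinear(z):
    # Bernstein polynomials reproduce products of linear factors exactly.
    return z[0] * z[1]


def smooth(z):
    return math.sin(math.pi * z[0]) * z[1]


point = np.array([0.3, 0.7])
err_n5 = abs(bernstein(smooth, 5, point) - smooth(point))
err_n20 = abs(bernstein(smooth, 20, point) - smooth(point))
```

For the smooth example the error at the chosen point shrinks as $n$ grows, consistent with uniform convergence, although the rate is slow.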

Theorem 2 (The Universal Approximation Theorem of PQC).

For any continuous function $f\colon[0,1]^{d}\to[-1,1]$, given an $\varepsilon>0$, there exist an $n\in\mathbb{N}$ and a PQC $W_{b}(\bm{x})$ with width $O(d\log n)$, depth $O(dn^{d}\log n)$, and $O(dn^{d})$ trainable parameters such that

$$\lvert f(\bm{x})-f_{W_{b}}(\bm{x})\rvert\leq\varepsilon \tag{5}$$

for all $\bm{x}\in[0,1]^{d}$, where $f_{W_{b}}(\bm{x})\coloneqq\bra{0}W^{\dagger}_{b}(\bm{x})Z^{(0)}W_{b}(\bm{x})\ket{0}$.

Theorem 2 serves as the quantum counterpart to the universal approximation theorem of classical neural networks. Moreover, the PQCs that universally approximate continuous functions are explicitly constructed without any impractical assumption, improving on the previous results presented in Refs. [34, 38]. Furthermore, for continuous functions $f$ satisfying the Lipschitz condition $\lvert f(\bm{x})-f(\bm{y})\rvert\leq\ell\lVert\bm{x}-\bm{y}\rVert_{\infty}$ for any $\bm{x},\bm{y}$, the approximation rate of Bernstein polynomials can be quantitatively characterized in terms of the degree $n$, the number of variables $d$, and the Lipschitz constant $\ell$ [53]. Thus a non-asymptotic error bound for PQCs approximating Lipschitz continuous functions can be obtained as follows.

Theorem 3.

Given a Lipschitz continuous function $f\colon[0,1]^{d}\to[-1,1]$ with a Lipschitz constant $\ell$, for any $\varepsilon>0$ and $n\in\mathbb{N}$, there exists a PQC $W_{b}(\bm{x})$ such that $f_{W_{b}}(\bm{x})\coloneqq\bra{0}W^{\dagger}_{b}(\bm{x})Z^{(0)}W_{b}(\bm{x})\ket{0}$ satisfies

$$\lvert f(\bm{x})-f_{W_{b}}(\bm{x})\rvert\leq\varepsilon+2\biggl(\Bigl(1+\frac{\ell^{2}}{n\varepsilon^{2}}\Bigr)^{d}-1\biggr)\leq\varepsilon+d2^{d}\frac{\ell^{2}}{n\varepsilon^{2}} \tag{6}$$

for all $\bm{x}\in[0,1]^{d}$. The width of the PQC is $O(d\log n)$, the depth is $O\bigl(dn^{d}\log n\bigr)$, and the number of parameters is $O(dn^{d})$.

We prove these theorems in Appendix C. Although Theorem 3 gives a quantitative approximation error bound, $n$ must be sufficiently large to obtain good precision, which yields an extremely deep PQC. This inefficiency is essentially due to the intrinsic difficulty of using a single global polynomial to approximate a continuous function uniformly. A possible way to overcome this obstacle is to use local polynomials to achieve a piecewise approximation, which we explore in the next section.
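To see how quickly the required degree grows, one can evaluate both sides of Eq. 6 directly. The sketch below is only an illustration: the function names are hypothetical, and `degree_for_error` simply solves $d2^{d}\ell^{2}/(n\varepsilon^{2})\leq\varepsilon$ for $n$, so that the looser bound is at most $2\varepsilon$. The comparison between the two bounds is checked here only in the regime $\ell^{2}/(n\varepsilon^{2})\leq 1$.

```python
import math


def theorem3_bounds(eps: float, n: int, d: int, ell: float):
    """Return the (tight, loose) right-hand sides of Eq. 6."""
    t = ell**2 / (n * eps**2)
    tight = eps + 2.0 * ((1.0 + t) ** d - 1.0)
    loose = eps + d * 2**d * t
    return tight, loose


def degree_for_error(eps: float, d: int, ell: float) -> int:
    """Smallest n for which the loose bound in Eq. 6 is at most 2 * eps."""
    return math.ceil(d * 2**d * ell**2 / eps**3)


eps, d, ell = 0.1, 2, 1.0
n_required = degree_for_error(eps, d, ell)
tight, loose = theorem3_bounds(eps, n_required, d, ell)
parameter_scale = d * n_required**d  # the O(d n^d) parameter count, up to constants
```

Even for two variables and a modest target accuracy, `n_required` scales as $\varepsilon^{-3}$ and the parameter count as $n^{d}$, which is the inefficiency noted above.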

3.3 PQC approximation for Hölder smooth functions

To achieve a piecewise approximation of multivariate functions, we follow the approach used for classical deep neural network approximation [18, 21, 25], which utilizes multivariate Taylor series to approximate target functions in small local regions.

We focus on Hölder smooth functions. Let $\beta=s+r>0$, where $r\in(0,1]$ and $s\in\mathbb{N}^{+}$. For a finite constant $B_{0}>0$, the $\beta$-Hölder class of functions ${\cal H}^{\beta}([0,1]^{d},B_{0})$ is defined as

$${\cal H}^{\beta}([0,1]^{d},B_{0})=\Bigl\{f\colon[0,1]^{d}\to\mathbb{R},\ \max_{\lVert\bm{\alpha}\rVert_{1}\leq s}\lVert\partial^{\bm{\alpha}}f\rVert_{\infty}\leq B_{0},\ \max_{\lVert\bm{\alpha}\rVert_{1}=s}\sup_{\bm{x}\neq\bm{y}}\frac{\lvert\partial^{\bm{\alpha}}f(\bm{x})-\partial^{\bm{\alpha}}f(\bm{y})\rvert}{\lVert\bm{x}-\bm{y}\rVert_{2}^{r}}\leq B_{0}\Bigr\}, \tag{7}$$

where $\partial^{\bm{\alpha}}=\partial^{\alpha_{1}}\cdots\partial^{\alpha_{d}}$ for $\bm{\alpha}=(\alpha_{1},\ldots,\alpha_{d})\in\mathbb{N}^{d}$. We note that Hölder smooth functions are natural generalizations of various continuous functions: when $\beta\in(0,1)$, $f$ is Hölder continuous with order $\beta$ and Hölder constant $B_{0}$; when $\beta=1$, $f$ is Lipschitz continuous with Lipschitz constant $B_{0}$; when $1<\beta\in\mathbb{N}$, $f\in C^{s}([0,1]^{d})$, the class of $s$-smooth functions whose $s$-th partial derivatives exist and are bounded.
As shown in Petersen and Voigtlaender [18], for any $\beta$-Hölder smooth function $f\in{\cal H}^{\beta}([0,1]^{d},B_{0})$, its local Taylor expansion at some fixed point $\bm{x_{0}}\in[0,1]^{d}$ satisfies

$$\Bigl\lvert f(\bm{x})-\sum_{\lVert\bm{\alpha}\rVert_{1}\leq s}\frac{\partial^{\bm{\alpha}}f(\bm{x_{0}})}{\bm{\alpha}!}(\bm{x}-\bm{x_{0}})^{\bm{\alpha}}\Bigr\rvert\leq d^{s}\lVert\bm{x}-\bm{x_{0}}\rVert^{\beta}_{2} \tag{8}$$

for all $\bm{x}\in[0,1]^{d}$, where $\bm{\alpha}!=\alpha_{1}!\cdots\alpha_{d}!$. Next, we show how to construct PQCs to implement the Taylor expansion of $\beta$-Hölder functions in the following three steps.

Localization.

To utilize the Hölder smoothness, we first need to localize the entire region $[0,1]^{d}$. The purpose of localization is to determine the local point $\bm{x_{0}}$ in Eq. 8 so that the distance between $\bm{x}$ and $\bm{x_{0}}$ is small. An intuitive configuration is illustrated in Fig. 2, where the stars represent the local points. Given $K\in\mathbb{N}$ and $\Delta\in(0,\frac{1}{3K})$, for each $\bm{\eta}=(\eta_{1},\ldots,\eta_{d})\in\{0,1,\ldots,K-1\}^{d}$, we define

$$Q_{\bm{\eta}}\coloneqq\Bigl\{\bm{x}=(x_{1},\ldots,x_{d})\colon x_{i}\in\bigl[\frac{\eta_{i}}{K},\frac{\eta_{i}+1}{K}-\Delta\cdot 1_{\eta_{i}<K-1}\bigr]\Bigr\}. \tag{9}$$

By the definition of $Q_{\bm{\eta}}$, the region $[0,1]^{d}$ is approximately divided into small hypercubes $\bigcup_{\bm{\eta}}Q_{\bm{\eta}}$ and some trifling region $\Lambda(d,K,\Delta)\coloneqq[0,1]^{d}\setminus(\bigcup_{\bm{\eta}}Q_{\bm{\eta}})$, as illustrated in Fig. 2.

Figure 2: An illustration of localization. The left panel (a) demonstrates the localization $\bigcup_{\bm{\eta}}Q_{\bm{\eta}}$ for $K=5$ and $d=1$. The right panel (b) shows the case of localization for $K=5$ and $d=2$. The “volume” of the trifling region $\Lambda(d,K,\Delta)$ is no more than $dK\Delta$.

We construct a PQC that maps all $\bm{x}\in Q_{\bm{\eta}}$ to the fixed point $\bm{x_{\eta}}=\frac{\bm{\eta}}{K}$ in $Q_{\bm{\eta}}$, i.e., one that approximates the piecewise-constant function $D(\bm{x})=\frac{\bm{\eta}}{K}$ if $\bm{x}\in Q_{\bm{\eta}}$. We describe our construction for $d=1$, where $D(x)=\frac{k}{K}$ if $x\in[\frac{k}{K},\frac{k+1}{K}-\Delta\cdot 1_{k<K-1}]$ for $k=0,\ldots,K-1$. The multivariate case follows naturally by applying $D(x)$ to each variable $x_{j}$. The idea is to construct a polynomial that approximates the function $D(x)$ based on the polynomial approximation to the sign function [54], which a single-qubit PQC can then implement.
Generalizing to the multivariate localization, there exists a PQC $W_{D}(\bm{x})$ of depth $O(\frac{1}{\Delta}\log\frac{K}{\varepsilon})$ and width $O(d)$ such that the output $f_{W_{D}}(\bm{x})$ maps $\bm{x}$ to the corresponding fixed point $\bm{x_{\eta}}$ with precision $\varepsilon$. We can obtain an estimate of $\bm{\eta}$ using $\lfloor Kf_{W_{D}}(\bm{x})\rfloor$.
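The partition defined in Eq. 9 can be emulated classically, which may help when checking the construction. The sketch below is a plain NumPy illustration, not the PQC itself: `localize` returns the index $\bm{\eta}$ of the hypercube containing $\bm{x}$ (or `None` inside the trifling region), and `trifling_volume` uses the fact that each axis is covered up to $K-1$ gaps of width $\Delta$. Both function names are assumptions made for this example.

```python
import numpy as np


def localize(x, K: int, delta: float):
    """Return eta with x in Q_eta, or None if x lies in the trifling region."""
    x = np.asarray(x, dtype=float)
    # Candidate cell per coordinate; x_i = 1 is assigned to the last cell.
    eta = np.minimum(np.floor(x * K).astype(int), K - 1)
    # Upper end of each interval: the gap delta is removed except in the last cell.
    upper = (eta + 1) / K - delta * (eta < K - 1)
    if np.all(x <= upper):
        return tuple(int(e) for e in eta)
    return None


def trifling_volume(d: int, K: int, delta: float) -> float:
    """Lebesgue measure of [0,1]^d minus the union of the cubes Q_eta."""
    covered_per_axis = 1.0 - (K - 1) * delta
    return 1.0 - covered_per_axis**d


K, delta = 5, 0.01  # delta must lie in (0, 1 / (3K))
inside = localize([0.05, 0.97], K, delta)
in_gap = localize([0.195, 0.5], K, delta)
volume = trifling_volume(2, K, delta)
```

Here the fixed point is $\bm{x_{\eta}}=\bm{\eta}/K$, and `volume` stays below the $dK\Delta$ bound quoted in the caption of Fig. 2.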

Implementing the Taylor coefficients.

Next, we use a PQC to implement the Taylor coefficients $\xi_{\bm{\eta},\bm{\alpha}}\coloneqq\frac{\partial^{\bm{\alpha}}f(\bm{x_{\eta}})}{\bm{\alpha}!}\in[-1,1]$ for each $\bm{\eta}=(\eta_{1},\ldots,\eta_{d})\in\{0,1,\ldots,K-1\}^{d}$ and $\bm{\alpha}$, which is essentially a point-fitting problem. We construct a PQC $U_{co}^{\bm{\alpha}}=\sum_{\bm{\eta}}\lvert\bm{\eta}\rangle\!\langle\bm{\eta}\rvert\otimes R_{X}(\theta_{\bm{\eta},\bm{\alpha}})$ such that $\bra{\bm{\eta},0}U_{co}^{\bm{\alpha}}\ket{\bm{\eta},0}=\xi_{\bm{\eta},\bm{\alpha}}$, where $\ket{\bm{\eta}}=\ket{\eta_{1}}\otimes\cdots\otimes\ket{\eta_{d}}$ and $\theta_{\bm{\eta},\bm{\alpha}}=2\arccos(\xi_{\bm{\eta},\bm{\alpha}})$. The depth of $U_{co}^{\bm{\alpha}}$ is $O(K^{d})$, the width is $O(d\log K)$, and the number of parameters is $O(K^{d})$. Note that the state $\ket{\bm{\eta}}$ can be prepared using basis encoding on the value $\bm{\eta}=\lfloor Kf_{W_{D}}(\bm{x})\rfloor$ provided by the localization step.
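The choice $\theta=2\arccos(\xi)$ works because $\bra{0}R_{X}(\theta)\ket{0}=\cos(\theta/2)$. The sketch below verifies this with explicit matrices. It assumes the standard convention $R_{X}(\theta)=e^{-i\theta X/2}$ and flattens the multi-index $\bm{\eta}$ into a single integer label; the function names and the sample coefficients are illustrative only.

```python
import numpy as np


def rx(theta: float) -> np.ndarray:
    """Single-qubit rotation R_X(theta) = exp(-i * theta * X / 2)."""
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([[c, -1j * s], [-1j * s, c]])


def coefficient_unitary(xi) -> np.ndarray:
    """Build sum_eta |eta><eta| (x) R_X(2 arccos(xi[eta])) as a block-diagonal matrix."""
    xi = np.asarray(xi, dtype=float)
    theta = 2.0 * np.arccos(xi)
    m = len(xi)
    unitary = np.zeros((2 * m, 2 * m), dtype=complex)
    for eta, angle in enumerate(theta):
        unitary[2 * eta:2 * eta + 2, 2 * eta:2 * eta + 2] = rx(angle)
    return unitary


xi = np.array([0.5, -0.25, 1.0, 0.0])  # hypothetical Taylor coefficients in [-1, 1]
U = coefficient_unitary(xi)
# <eta, 0| U |eta, 0> is the diagonal entry at row and column 2 * eta.
readout = np.real(np.diag(U)[0::2])
```

Each block is controlled on one basis state $\ket{\bm{\eta}}$, which is why the depth and the parameter count both scale with the number $K^{d}$ of local points.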

Implementing multivariate Taylor series.

To implement the multivariate Taylor expansion of a function at some fixed point $\bm{x_{\eta}}$, we first build a PQC that represents a single term in the Taylor series. This is done by combining the PQC that implements the Taylor coefficients with the PQC that implements monomials, i.e., constructing $U^{\bm{\alpha}}_{\bm{\eta}}(\bm{x})\coloneqq U_{co}^{\bm{\alpha}}\otimes U^{\bm{\alpha}}(\bm{x}-\bm{x_{\eta}})$. The depth of $U^{\bm{\alpha}}_{\bm{\eta}}(\bm{x})$ is $O(K^{d}+s)$, the width is $O(d\log K)$, and the number of parameters is at most $K^{d}+s+d$. The next step is to aggregate the single Taylor terms to implement the truncated Taylor expansion of the target function.
We use LCU to construct the PQC $U_{t}(\bm{x},\bm{x_{\eta}})\coloneqq\sum_{\lVert\bm{\alpha}\rVert_{1}\leq s}U^{\bm{\alpha}}_{\bm{\eta}}(\bm{x})$ so that we can implement the Taylor expansion of the function $f$ at point $\bm{x_{\eta}}$ as $\bra{\bm{\eta},0}\!\bra{+}^{\otimes d}U_{t}(\bm{x},\bm{x_{\eta}})\ket{\bm{\eta},0}\!\ket{+}^{\otimes d}$.

We construct a nested PQC as $U_{t}(\bm{x},f_{W_{D}}(\bm{x}))$, such that for any input $\bm{x}$, the corresponding fixed point is determined by the localization PQC. Such a PQC can be used, together with the Hadamard test, to approximate Hölder smooth functions. In particular, we prove the approximation error bound of our constructed PQC based on the error rate of the Taylor expansion in Eq. 8.

Theorem 4.

Given a function $f\in{\cal H}^{\beta}([0,1]^{d},1)$ with $\beta=r+s$, $r\in(0,1]$ and $s\in\mathbb{N}^{+}$, for any $K\in\mathbb{N}$ and $\Delta\in(0,\frac{1}{3K})$, there exists a PQC $W_{t}(\bm{x})$ such that $f_{W_{t}}(\bm{x})\coloneqq\bra{0}W^{\dagger}_{t}(\bm{x})Z^{(0)}W_{t}(\bm{x})\ket{0}$ satisfies

$$\lvert f(\bm{x})-f_{W_{t}}(\bm{x})\rvert\leq d^{s+\beta/2}K^{-\beta} \tag{10}$$

for $\bm{x}\in\bigcup_{\bm{\eta}}Q_{\bm{\eta}}$. The width of the PQC is $O(d\log K+\log s+s\log d)$, the depth is $O\bigl(s^{2}d^{s}K^{d}(\log s+s\log d+d\log K)+\frac{1}{\Delta}\log K\bigr)$, and the number of parameters is $O\bigl(sd^{s}(s+d+K^{d})+\frac{d}{\Delta}\log K\bigr)$.

The proof can be found in Appendix D. Note that the PQC in Theorem 4 consists of two nested parts, and for simplicity its depth is counted as the sum of the depths of the two PQCs. We have established the uniform convergence property of PQCs for approximating Hölder smooth functions on $[0,1]^{d}$ except for the trifling region $\Lambda(d,K,\Delta)$. The Lebesgue measure of this trifling region is no more than $dK\Delta$. We can set $\Delta=K^{-d}$ with no influence on the size of the constructed PQC, and a similar approximation error bound on the entire region $[0,1]^{d}$ under the $L^{2}$ distance can then be obtained.
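The rate in Eq. 10 can be illustrated with a purely classical piecewise Taylor approximant, which is the function the nested PQC is designed to reproduce. The sketch below makes several assumptions of its own: the test function $f(x_{1},x_{2})=\frac{1}{2}\sin(x_{1}+x_{2})$ (which lies in ${\cal H}^{2}([0,1]^{2},1)$ with $s=1$ and $r=1$), exact localization by a floor operation instead of the approximate PQC $W_{D}$, and an evaluation grid that ignores the trifling region.

```python
import numpy as np


def f(x: np.ndarray) -> np.ndarray:
    """Test function 0.5 * sin(x1 + x2); derivatives up to order 1 are bounded by 1."""
    return 0.5 * np.sin(x[..., 0] + x[..., 1])


def grad_f(x: np.ndarray) -> np.ndarray:
    g = 0.5 * np.cos(x[..., 0] + x[..., 1])
    return np.stack([g, g], axis=-1)


def piecewise_taylor(x: np.ndarray, K: int) -> np.ndarray:
    """First-order Taylor expansion of f around the local point x_eta = eta / K."""
    eta = np.minimum(np.floor(x * K), K - 1)
    x0 = eta / K
    return f(x0) + np.sum(grad_f(x0) * (x - x0), axis=-1)


def max_error(K: int, grid: int = 101) -> float:
    axis = np.linspace(0.0, 1.0, grid)
    pts = np.stack(np.meshgrid(axis, axis, indexing="ij"), axis=-1).reshape(-1, 2)
    return float(np.max(np.abs(f(pts) - piecewise_taylor(pts, K))))


d, s, beta = 2, 1, 2.0
errors = {K: max_error(K) for K in (2, 5, 10)}
bounds = {K: d ** (s + beta / 2) * K ** (-beta) for K in (2, 5, 10)}
```

For this example the observed errors sit below $d^{s+\beta/2}K^{-\beta}$ and shrink as $K$ grows, while the circuit size in Theorem 4 grows as $K^{d}$.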

4 Numerical experiments

This section presents numerical experiments to illustrate the expressivity of our proposed PQCs in approximating multivariate functions. We focus on approximating a bivariate polynomial function

\[
f(x,y)=\frac{(x^{2}+y-1.5\pi)^{2}+(x+y^{2}+\pi)^{2}+(x+y-0.5\pi)^{2}}{5\pi^{2}},
\]

over the domain $(x,y)\in[0,1]^{2}$. The approximation process involves two separate steps: (1) learning a piecewise-constant function, $D(x)=\frac{k}{K}$ if $x\in[\frac{k}{K},\frac{k+1}{K})$, using a single-qubit PQC, where $K\in\mathbb{N}^{+}$ determines the number of intervals of the piecewise-constant function; (2) learning the Taylor expansion of $f(x,y)$ using multi-qubit PQCs based on Theorem 4. Both learning processes are implemented on an Intel(R) Xeon(R) Gold 6248 CPU at 2.50 GHz.
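For reference, the two learning targets can be written down directly. The sketch below implements $f(x,y)$ and the piecewise-constant function $D(x)$ in plain Python. The function names and the treatment of the endpoint $x=1$ (assigned to the last interval) are assumptions made here for illustration, since the definition above only covers the half-open intervals $[\frac{k}{K},\frac{k+1}{K})$.

```python
import math


def target_f(x, y):
    """Bivariate polynomial target f(x, y) on [0, 1]^2, as given in the text."""
    pi = math.pi
    numerator = ((x**2 + y - 1.5 * pi) ** 2
                 + (x + y**2 + pi) ** 2
                 + (x + y - 0.5 * pi) ** 2)
    return numerator / (5 * pi**2)


def localization_d(x, K):
    """Piecewise-constant D(x) = k/K for x in [k/K, (k+1)/K).

    Assumption: x = 1 is mapped to the last interval, i.e. D(1) = (K-1)/K.
    """
    k = min(int(math.floor(K * x)), K - 1)
    return k / K


if __name__ == "__main__":
    print(target_f(0.0, 0.0))
    print([localization_d(x, 10) for x in (0.05, 0.37, 0.99)])
```

At the origin the three squared terms reduce to $2.25\pi^{2}$, $\pi^{2}$, and $0.25\pi^{2}$, so $f(0,0)=3.5/5=0.7$, which gives a quick sanity check on the implementation.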

We randomly sample $200$ data points within the domain $[0,1]$ to create training and test datasets for $D(x)$. A single-qubit PQC with adjustable parameters $L=764$ ($L=996$) is used to learn $D(x)$ with $K=2$ ($K=10$). Each parameter of the PQC is randomly initialized within the range $[0,\pi]$. We use the Adam optimizer [55] with a learning rate of $0.01$ to minimize the mean squared error (MSE) loss during training. Training is limited to a maximum of $300$ iterations with a batch size of $100$ data points, with early termination if the MSE falls below $10^{-4}$. The achieved MSE on the test data is $3.57\times 10^{-4}$ ($K=2$) and $1.04\times 10^{-4}$ ($K=10$). The numerical results are visualized in Fig. 3.
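The training protocol described above (Adam with learning rate $0.01$, MSE loss, at most $300$ iterations, batch size $100$, early termination below $10^{-4}$) can be sketched in plain Python as follows. This is not the authors' implementation: the model is left as a generic callable `model(params, x)`, gradients are obtained by central finite differences as a stand-in for whatever differentiation method was actually used, the Adam constants other than the learning rate are the usual defaults, and the early-termination check on the full training set is an assumption.

```python
import math
import random


def adam_step(params, grads, state, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One in-place Adam update; `state` holds the step count and moments."""
    state["t"] += 1
    t = state["t"]
    for i, g in enumerate(grads):
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        m_hat = state["m"][i] / (1 - beta1**t)  # bias-corrected first moment
        v_hat = state["v"][i] / (1 - beta2**t)  # bias-corrected second moment
        params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)


def mse(model, params, data):
    """Mean squared error of model(params, x) against targets y."""
    return sum((model(params, x) - y) ** 2 for x, y in data) / len(data)


def numeric_grad(model, params, batch, h=1e-5):
    """Central finite-difference gradient of the batch MSE (assumed stand-in)."""
    grads = []
    for i in range(len(params)):
        old = params[i]
        params[i] = old + h
        up = mse(model, params, batch)
        params[i] = old - h
        down = mse(model, params, batch)
        params[i] = old
        grads.append((up - down) / (2 * h))
    return grads


def train(model, params, data, max_iters=300, batch_size=100, lr=0.01,
          tol=1e-4, seed=0):
    """Mini-batch Adam training; returns the number of iterations performed."""
    rng = random.Random(seed)
    state = {"t": 0, "m": [0.0] * len(params), "v": [0.0] * len(params)}
    for it in range(max_iters):
        if mse(model, params, data) < tol:  # early termination
            return it
        batch = rng.sample(data, min(batch_size, len(data)))
        adam_step(params, numeric_grad(model, params, batch), state, lr=lr)
    return max_iters
```

Any simulator of the single-qubit PQC that maps a parameter list and an input $x$ to a real output could be passed as `model`. Finite differences cost two loss evaluations per parameter per step, so for circuits with hundreds of parameters an automatic-differentiation or parameter-shift gradient would be the more practical choice.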

Figure 3: Simulation results of localization. We use single-qubit PQCs to approximate the localization function $D(x)$ for $K=2$ and $K=10$, respectively.

Similar to the previous step, we randomly sample $200$ data points within the domain $[0,1]^{2}$ to create training and test datasets for $f(x,y)$. We design a nested PQC structure that combines $12$ two-qubit PQCs of depth $2$, allowing a degree-4 polynomial to be approximated through a combination of lower-degree ones. Additionally, the Taylor coefficients are stored in a separate matrix of size $K^{2}\times 12$. The number of trainable parameters varies from $120$ ($K=2$) to $1272$ ($K=10$), each initialized randomly from $[0,\pi]$. The Adam optimizer with a learning rate of $0.01$ is used to minimize the MSE loss during training. Training is limited to $500$ iterations with a batch size of $100$, with early termination if the MSE falls below $10^{-4}$. The achieved MSE on the test data is $2.22\times 10^{-4}$ ($K=2$) and $9.82\times 10^{-5}$ ($K=10$). Fig. 4 visualizes the results. As $K$ increases, the PQC demonstrates improved approximation performance, in line with the theoretical findings.
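The reported parameter counts can be checked against the size of the coefficient matrix. A $K^{2}\times 12$ matrix holds $12K^{2}$ entries, and both reported totals are consistent with $12K^{2}+72$, since $120-12\cdot 2^{2}=72$ and $1272-12\cdot 10^{2}=72$. The constant $72$ is inferred here from these two totals and is not stated in the text. The sketch below encodes this bookkeeping together with a lookup of the matrix row for the grid cell containing $(x,y)$. The row-major ordering of cells and all names are hypothetical.

```python
import math

NUM_TERMS = 12        # columns of the coefficient matrix (from the text)
CIRCUIT_PARAMS = 72   # inferred: 120 - 12 * 2**2 == 1272 - 12 * 10**2 == 72


def cell_index(x, y, K):
    """Row of the K^2 x 12 coefficient matrix for the cell containing (x, y).

    Assumption: cells are ordered row-major, and points with a coordinate
    equal to 1 are assigned to the last cell along that axis.
    """
    i = min(int(math.floor(K * x)), K - 1)
    j = min(int(math.floor(K * y)), K - 1)
    return i * K + j


def total_parameters(K):
    """Coefficient-matrix entries plus the inferred K-independent remainder."""
    return NUM_TERMS * K * K + CIRCUIT_PARAMS


if __name__ == "__main__":
    for K in (2, 10):
        print(K, total_parameters(K))
```

Under this reading, the growth in trainable parameters with $K$ comes entirely from the stored Taylor coefficients, while the part attributed to the circuits themselves stays fixed.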

Figure 4: Simulation results for learning $f(x,y)$. The left two panels are derived by interpolating and smoothing the output values of the PQC on 100 test data points.

5 Discussion

To the best of our knowledge, our results establish the first explicit PQC constructions for approximating Lipschitz continuous and Hölder smooth functions with quantitative approximation error bounds. These results open up the possibility of comparing the size of PQCs with the size of classical deep neural networks that accomplish the same function approximation tasks, and of examining whether there is any quantum advantage in terms of the model size and the number of trainable parameters. Here, we mainly focus on the comparison with approximation error results for classical machine learning models. In classical deep learning, the deep feed-forward neural network (FNN) equipped with the rectified linear unit (ReLU) activation function is one of the most commonly used models. Quantitative approximation error bounds of ReLU FNNs for approximating continuous functions have recently been established, including nearly optimal approximation error bounds of ReLU FNNs for smooth functions [21]. We briefly compare the approximation errors of PQCs and ReLU FNNs in terms of width, depth, and the number of trainable parameters. Detailed comparisons can be found in Appendix E.

We consider multivariate smooth functions in $C^{s}_{u}([0,1]^{d})$ (the unit ball of $C^{s}([0,1]^{d})$) with smooth index $s\in\mathbb{N}$ as the target functions in our comparison. Note that smooth functions with smooth index $s$ are exactly $(s+1)$-Hölder smooth functions by definition. For simplicity, we first show the case of $s=2$. To achieve the same approximation error $\varepsilon$ (say some constant), we need to set $K_{Q}=\Theta(d^{2}/\sqrt{\varepsilon})$ for the constructed PQCs from Theorem 4 and set $K_{C}=\Theta(2^{d/2}/\sqrt{\varepsilon})$ for the constructed near-optimal ReLU FNNs from Ref. [21]. Substituting these choices of $K$ into the sizes of the PQCs and ReLU FNNs, we have

\[
\frac{\text{Width of PQC}\times\text{Depth of PQC}}{\text{Width of FNN}\times\text{Depth of FNN}}=O\Bigl(\frac{d^{3}K_{Q}^{d}}{2^{d+3}K_{C}^{d/2}}\Bigr)=O\Bigl(\frac{\varepsilon^{-d/4}}{2^{d^{2}-d\log d}}\Bigr). \tag{11}
\]

One can obtain a similar relation for the number of required parameters in PQCs and ReLU FNNs for approximating smooth functions, and extend these results to any $2\leq s<d$, a regime that is relevant in numerous real-world applications (e.g., the input dimension $d$ is $784$ for the MNIST dataset and $150\,528$ for the ImageNet dataset [56], and empirically $s\leq 10$). Therefore, to achieve the same approximation error, the required quantum circuit size and number of parameters of PQCs are exponentially smaller than the required network size and number of parameters of the ReLU FNNs proposed in Ref. [21].
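To get a feel for the scale of the ratio in Eq. (11), it helps to evaluate its final expression in log scale, which avoids numerical overflow for large $d$. The following Python sketch does this for the $s=2$ case. It assumes that all constants hidden in the $\Theta(\cdot)$ and $O(\cdot)$ notation equal 1 and that $\log$ denotes the base-2 logarithm, neither of which is fixed by the text, so the numbers indicate scaling only.

```python
import math


def k_quantum(d, eps):
    """K_Q = d^2 / sqrt(eps), with the Theta-constant set to 1 (assumption)."""
    return d**2 / math.sqrt(eps)


def k_classical(d, eps):
    """K_C = 2^(d/2) / sqrt(eps), with the Theta-constant set to 1 (assumption).

    Note: overflows floating point for very large d; use moderate d only.
    """
    return 2.0 ** (d / 2) / math.sqrt(eps)


def log2_size_ratio(d, eps):
    """log2 of eps^(-d/4) / 2^(d^2 - d*log2(d)), the final bound in Eq. (11)."""
    return -(d / 4) * math.log2(eps) - (d * d - d * math.log2(d))


if __name__ == "__main__":
    eps = 0.01
    for d in (8, 784):
        print(d, k_quantum(d, eps), log2_size_ratio(d, eps))
```

A negative value of `log2_size_ratio` means that the bound on the PQC size is smaller than that of the ReLU FNN by a factor of two raised to that magnitude. For fixed $\varepsilon$, the $-d^{2}$ term dominates as $d$ grows, which is the source of the exponential gap claimed above.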

Aiming to understand and continuously expand the range of problems that can be addressed using quantum machine learning, we have demonstrated the approximation capabilities of PQC models in supervised learning. We characterized the approximation error of PQCs in terms of the model size, delivering a deeper understanding of the expressive power of PQCs that is beyond the universal approximation properties. With these results, we can unlock the full potential of these models and drive advancements in quantum machine learning. Notably, by comparing our results with the near-optimal approximation error bound of classical ReLU neural networks, we demonstrate an improvement over the classical models on approximating high-dimensional functions that satisfy specific smoothness criteria, quantified by an improvement on the model size and the number of parameters.

Unlike many other investigations into the universal approximation properties of PQC models [26, 27, 28, 29, 30, 31, 32, 33], our constructions of PQCs for approximating broad classes of continuous functions do not rely on any impractical assumptions. All the variables take the form of parameters within single-qubit rotation gates, avoiding any classical parameterized pre-processing or post-processing. Ultimately, our research provides valuable insights into the theoretical underpinnings of PQCs in quantum machine learning and paves the way for leveraging their capabilities in machine learning for both classical and quantum applications.

In this work, we introduce a novel nested PQC structure, which significantly improves the approximation capabilities. Future work could focus on exploring more powerful PQC constructions based on our proposed idea and understanding the capabilities and limitations of PQCs in more practical tasks even with real-world data. Developing efficient training strategies for PQCs, such as accelerated methods that achieve faster convergence rates, will also be interesting.

Acknowledgments and Disclosure of Funding

Part of this work was done when Z.Y. was visiting Wuhan University. Z.Y. thanks Patrick Rebentrost for helpful discussions. The authors thank the anonymous reviewers for their helpful comments. This work is supported by the National Key Research and Development Program of China (No. 2020YFA0714200), the National Natural Science Foundation of China (No. 62302346, No. 12125103, No. 12071362, No. 12371424, No. 12371441), and the “Fundamental Research Funds for the Central Universities”.

References

  • Biamonte et al. [2017] Jacob Biamonte, Peter Wittek, Nicola Pancotti, Patrick Rebentrost, Nathan Wiebe, and Seth Lloyd. Quantum machine learning. Nature, 549(7671):195–202, September 2017. ISSN 1476-4687. doi: 10.1038/nature23474.
  • Benedetti et al. [2019] Marcello Benedetti, Erika Lloyd, Stefan Sack, and Mattia Fiorentini. Parameterized quantum circuits as machine learning models. Quantum Science and Technology, 4(4):043001, November 2019. ISSN 2058-9565. doi: 10.1088/2058-9565/ab4eb5.
  • Preskill [2018] John Preskill. Quantum Computing in the NISQ era and beyond. Quantum, 2:79, August 2018. ISSN 2521-327X. doi: 10.22331/q-2018-08-06-79. URL https://doi.org/10.22331/q-2018-08-06-79.
  • Kandala et al. [2017] Abhinav Kandala, Antonio Mezzacapo, Kristan Temme, Maika Takita, Markus Brink, Jerry M. Chow, and Jay M. Gambetta. Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets. Nature, 549(7671):242–246, 2017. ISSN 1476-4687. doi: 10.1038/nature23879.
  • Cerezo et al. [2022] M. Cerezo, Kunal Sharma, Andrew Arrasmith, and Patrick J. Coles. Variational quantum state eigensolver. npj Quantum Information, 8(1):113, 2022. ISSN 2056-6387. doi: 10.1038/s41534-022-00611-6.
  • Cao et al. [2019] Yudong Cao, Jonathan Romero, Jonathan P. Olson, Matthias Degroote, Peter D. Johnson, Mária Kieferová, Ian D. Kivlichan, Tim Menke, Borja Peropadre, Nicolas P. D. Sawaya, Sukin Sim, Libor Veis, and Alán Aspuru-Guzik. Quantum chemistry in the age of quantum computing. Chemical Reviews, 119(19):10856–10915, 2019. ISSN 0009-2665. doi: 10.1021/acs.chemrev.8b00803.
  • Pan et al. [2023] Xiaoxuan Pan, Zhide Lu, Weiting Wang, Ziyue Hua, Yifang Xu, Weikang Li, Weizhou Cai, Xuegang Li, Haiyan Wang, Yi-Pu Song, Chang-Ling Zou, Dong-Ling Deng, and Luyan Sun. Deep quantum neural networks on a superconducting processor. Nature Communications, 14(1):4006, 2023. ISSN 2041-1723. doi: 10.1038/s41467-023-39785-8.
  • Ren et al. [2022] Wenhui Ren, Weikang Li, Shibo Xu, Ke Wang, Wenjie Jiang, Feitong Jin, Xuhao Zhu, Jiachen Chen, Zixuan Song, Pengfei Zhang, Hang Dong, Xu Zhang, Jinfeng Deng, Yu Gao, Chuanyu Zhang, Yaozu Wu, Bing Zhang, Qiujiang Guo, Hekang Li, Zhen Wang, Jacob Biamonte, Chao Song, Dong-Ling Deng, and H. Wang. Experimental quantum adversarial learning with programmable superconducting qubits. Nature Computational Science, 2(11):711–717, 2022. ISSN 2662-8457. doi: 10.1038/s43588-022-00351-9.
  • Huang et al. [2021a] He-Liang Huang, Yuxuan Du, Ming Gong, Youwei Zhao, Yulin Wu, Chaoyue Wang, Shaowei Li, Futian Liang, Jin Lin, Yu Xu, Rui Yang, Tongliang Liu, Min-Hsiu Hsieh, Hui Deng, Hao Rong, Cheng-Zhi Peng, Chao-Yang Lu, Yu-Ao Chen, Dacheng Tao, Xiaobo Zhu, and Jian-Wei Pan. Experimental quantum generative adversarial networks for image generation. Phys. Rev. Appl., 16:024051, Aug 2021a. doi: 10.1103/PhysRevApplied.16.024051. URL https://link.aps.org/doi/10.1103/PhysRevApplied.16.024051.
  • Mitarai et al. [2018] Kosuke Mitarai, Makoto Negoro, Masahiro Kitagawa, and Keisuke Fujii. Quantum Circuit Learning. Physical Review A, 98(3):032309, September 2018. ISSN 2469-9926, 2469-9934. doi: 10.1103/PhysRevA.98.032309.
  • Pérez-Salinas et al. [2020] Adrián Pérez-Salinas, Alba Cervera-Lierta, Elies Gil-Fuster, and José I. Latorre. Data re-uploading for a universal quantum classifier. Quantum, 4:226, February 2020. doi: 10.22331/q-2020-02-06-226.
  • LeCun et al. [2015] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015. ISSN 1476-4687. doi: 10.1038/nature14539.
  • Cybenko [1989] George Cybenko. Approximation by superpositions of a sigmoidal function. Mathematics of control, signals and systems, 2(4):303–314, 1989. URL https://link.springer.com/article/10.1007/BF02551274.
  • Hornik et al. [1989] Kurt Hornik, Maxwell Stinchcombe, and Halbert White. Multilayer feedforward networks are universal approximators. Neural Networks, 2(5):359–366, January 1989. ISSN 08936080. doi: 10.1016/0893-6080(89)90020-8. URL https://linkinghub.elsevier.com/retrieve/pii/0893608089900208.
  • Barron [1993] A.R. Barron. Universal approximation bounds for superpositions of a sigmoidal function. IEEE Transactions on Information Theory, 39(3):930–945, May 1993. ISSN 1557-9654. doi: 10.1109/18.256500.
  • Yarotsky [2017] Dmitry Yarotsky. Error bounds for approximations with deep ReLU networks. Neural Networks, 94:103–114, 2017. ISSN 0893-6080. doi: 10.1016/j.neunet.2017.07.002.
  • Yarotsky [2018] Dmitry Yarotsky. Optimal approximation of continuous functions by very deep ReLU networks. In Proceedings of the 31st Conference On Learning Theory, pages 639–649. PMLR, July 2018. URL https://proceedings.mlr.press/v75/yarotsky18a.html.
  • Petersen and Voigtlaender [2018] Philipp Petersen and Felix Voigtlaender. Optimal approximation of piecewise smooth functions using deep ReLU neural networks. Neural Networks, 108:296–330, December 2018. ISSN 0893-6080. doi: 10.1016/j.neunet.2018.08.019.
  • Yarotsky and Zhevnerchuk [2020] Dmitry Yarotsky and Anton Zhevnerchuk. The phase diagram of approximation rates for deep neural networks. In Advances in Neural Information Processing Systems, volume 33, pages 13005–13015. Curran Associates, Inc., 2020. URL https://proceedings.neurips.cc/paper/2020/hash/979a3f14bae523dc5101c52120c535e9-Abstract.html.
  • Shen [2020] Zuowei Shen. Deep Network Approximation Characterized by Number of Neurons. Communications in Computational Physics, 28(5):1768–1811, June 2020. ISSN 1815-2406, 1991-7120. doi: 10.4208/cicp.OA-2020-0149.
  • Lu et al. [2021] Jianfeng Lu, Zuowei Shen, Haizhao Yang, and Shijun Zhang. Deep Network Approximation for Smooth Functions. SIAM Journal on Mathematical Analysis, 53(5):5465–5506, January 2021. ISSN 0036-1410. doi: 10.1137/20M134695X. URL https://epubs.siam.org/doi/10.1137/20M134695X.
  • Shen et al. [2022] Zuowei Shen, Haizhao Yang, and Shijun Zhang. Optimal approximation rate of ReLU networks in terms of width and depth. Journal de Mathématiques Pures et Appliquées, 157:101–135, January 2022. ISSN 0021-7824. doi: 10.1016/j.matpur.2021.07.009. URL https://www.sciencedirect.com/science/article/pii/S0021782421001124.
  • Weinan et al. [2022] E Weinan, Chao Ma, and Lei Wu. The barron space and the flow-induced function spaces for neural network models. Constructive Approximation, 55(1):369–406, 2022.
  • Jiao et al. [2023a] Yuling Jiao, Yanming Lai, Xiliang Lu, Fengru Wang, Jerry Zhijian Yang, and Yuanyuan Yang. Deep neural networks with ReLU-sine-exponential activations break curse of dimensionality in approximation on hölder class. SIAM Journal on Mathematical Analysis, 55(4):3635–3649, 2023a. doi: 10.1137/21M144431X. URL https://doi.org/10.1137/21M144431X.
  • Jiao et al. [2023b] Yuling Jiao, Guohao Shen, Yuanyuan Lin, and Jian Huang. Deep nonparametric regression on approximate manifolds: Nonasymptotic error bounds with polynomial prefactors. The Annals of Statistics, 51(2):691–716, April 2023b. ISSN 0090-5364, 2168-8966. doi: 10.1214/23-AOS2266. URL https://projecteuclid.org/journals/annals-of-statistics/volume-51/issue-2/Deep-nonparametric-regression-on-approximate-manifolds--Nonasymptotic-error-bounds/10.1214/23-AOS2266.full.
  • Havlíček et al. [2019] Vojtěch Havlíček, Antonio D. Córcoles, Kristan Temme, Aram W. Harrow, Abhinav Kandala, Jerry M. Chow, and Jay M. Gambetta. Supervised learning with quantum-enhanced feature spaces. Nature, 567(7747):209–212, March 2019. ISSN 1476-4687. doi: 10.1038/s41586-019-0980-2. URL https://www.nature.com/articles/s41586-019-0980-2.
  • Du et al. [2020] Yuxuan Du, Min-Hsiu Hsieh, Tongliang Liu, and Dacheng Tao. Expressive power of parametrized quantum circuits. Physical Review Research, 2(3):033125, July 2020. doi: 10.1103/PhysRevResearch.2.033125.
  • Liu et al. [2021] Yunchao Liu, Srinivasan Arunachalam, and Kristan Temme. A rigorous and robust quantum speed-up in supervised machine learning. Nature Physics, 17(9):1013–1017, 2021. ISSN 1745-2481. doi: 10.1038/s41567-021-01287-z.
  • Huang et al. [2021b] Hsin-Yuan Huang, Richard Kueng, and John Preskill. Information-theoretic bounds on quantum advantage in machine learning. Phys. Rev. Lett., 126:190505, May 2021b. doi: 10.1103/PhysRevLett.126.190505. URL https://link.aps.org/doi/10.1103/PhysRevLett.126.190505.
  • Jerbi et al. [2021] Sofiene Jerbi, Casper Gyurik, Simon Marshall, Hans Briegel, and Vedran Dunjko. Parametrized Quantum Policies for Reinforcement Learning. In Advances in Neural Information Processing Systems, volume 34, pages 28362–28375. Curran Associates, Inc., 2021. URL https://proceedings.neurips.cc/paper/2021/hash/eec96a7f788e88184c0e713456026f3f-Abstract.html.
  • Huang et al. [2021c] Hsin-Yuan Huang, Michael Broughton, Masoud Mohseni, Ryan Babbush, Sergio Boixo, Hartmut Neven, and Jarrod R. McClean. Power of data in quantum machine learning. Nature Communications, 12(1):2631, May 2021c. ISSN 2041-1723. doi: 10.1038/s41467-021-22539-9. URL https://www.nature.com/articles/s41467-021-22539-9.
  • Jerbi et al. [2023] Sofiene Jerbi, Lukas J. Fiderer, Hendrik Poulsen Nautrup, Jonas M. Kübler, Hans J. Briegel, and Vedran Dunjko. Quantum machine learning beyond kernel methods. Nature Communications, 14(1):517, January 2023. ISSN 2041-1723. doi: 10.1038/s41467-023-36159-y. URL https://www.nature.com/articles/s41467-023-36159-y.
  • Jäger and Krems [2023] Jonas Jäger and Roman V. Krems. Universal expressiveness of variational quantum classifiers and quantum kernels for support vector machines. Nature Communications, 14(1):576, February 2023. ISSN 2041-1723. doi: 10.1038/s41467-023-36144-5. URL https://www.nature.com/articles/s41467-023-36144-5.
  • Schuld et al. [2021] Maria Schuld, Ryan Sweke, and Johannes Jakob Meyer. Effect of data encoding on the expressive power of variational quantum-machine-learning models. Physical Review A, 103(3):032430, March 2021. doi: 10.1103/PhysRevA.103.032430.
  • Gil Vidal and Theis [2020] Francisco Javier Gil Vidal and Dirk Oliver Theis. Input Redundancy for Parameterized Quantum Circuits. Frontiers in Physics, 8, 2020. ISSN 2296-424X. URL https://www.frontiersin.org/articles/10.3389/fphy.2020.00297.
  • Pérez-Salinas et al. [2021] Adrián Pérez-Salinas, David López-Núñez, Artur García-Sáez, P. Forn-Díaz, and José I. Latorre. One qubit as a universal approximant. Physical Review A, 104(1):012405, July 2021. doi: 10.1103/PhysRevA.104.012405.
  • Yu et al. [2022] Zhan Yu, Hongshun Yao, Mujin Li, and Xin Wang. Power and limitations of single-qubit native quantum neural networks. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems, volume 35, pages 27810–27823. Curran Associates, Inc., 2022. URL https://proceedings.neurips.cc/paper_files/paper/2022/file/b250de41980b58d34d6aadc3f4aedd4c-Paper-Conference.pdf.
  • Manzano et al. [2023] Alberto Manzano, David Dechant, Jordi Tura, and Vedran Dunjko. Parametrized Quantum Circuits and their approximation capacities in the context of quantum machine learning, July 2023.
  • Goto et al. [2021] Takahiro Goto, Quoc Hoan Tran, and Kohei Nakajima. Universal Approximation Property of Quantum Machine Learning Models in Quantum-Enhanced Feature Spaces. Physical Review Letters, 127(9):090506, August 2021. doi: 10.1103/PhysRevLett.127.090506.
  • Gonon and Jacquier [2023] Lukas Gonon and Antoine Jacquier. Universal Approximation Theorem and error bounds for quantum neural networks and quantum reservoirs, July 2023.
  • Qi et al. [2023] Jun Qi, Chao-Han Huck Yang, Pin-Yu Chen, and Min-Hsiu Hsieh. Theoretical error performance analysis for variational quantum circuit based functional regression. npj Quantum Information, 9(1):4, 2023. ISSN 2056-6387. doi: 10.1038/s41534-022-00672-7.
  • Zhao et al. [2023] Haimeng Zhao, Laura Lewis, Ishaan Kannan, Yihui Quek, Hsin-Yuan Huang, and Matthias C. Caro. Learning quantum states and unitaries of bounded gate complexity, 2023.
  • Grohs and Kutyniok [2022] Philipp Grohs and Gitta Kutyniok. Mathematical aspects of deep learning. Cambridge University Press, 2022.
  • Nielsen and Chuang [2010] Michael A Nielsen and Isaac L Chuang. Quantum computation and quantum information. Cambridge university press, 2010.
  • Low et al. [2016] Guang Hao Low, Theodore J. Yoder, and Isaac L. Chuang. Methodology of Resonant Equiangular Composite Quantum Gates. Physical Review X, 6(4):041067, December 2016. doi: 10.1103/PhysRevX.6.041067.
  • Low and Chuang [2017] Guang Hao Low and Isaac L. Chuang. Optimal Hamiltonian Simulation by Quantum Signal Processing. Physical Review Letters, 118(1):010501, January 2017. doi: 10.1103/PhysRevLett.118.010501.
  • Gilyén et al. [2019] András Gilyén, Yuan Su, Guang Hao Low, and Nathan Wiebe. Quantum singular value transformation and beyond: Exponential improvements for quantum matrix arithmetics. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 193–204, June 2019. doi: 10.1145/3313276.3316366.
  • Childs and Wiebe [2012] Andrew M. Childs and Nathan Wiebe. Hamiltonian simulation using linear combinations of unitary operations. Quantum Inf. Comput., 12(11-12):901–924, 2012. doi: 10.26421/QIC12.11-12-1.
  • da Silva and Park [2022] Adenilton J. da Silva and Daniel K. Park. Linear-depth quantum circuits for multiqubit controlled gates. Physical Review A, 106(4):042602, October 2022. doi: 10.1103/PhysRevA.106.042602.
  • Brassard et al. [2002] Gilles Brassard, Peter Hoyer, Michele Mosca, and Alain Tapp. Quantum Amplitude Amplification and Estimation. Contemporary Mathematics, 305:53–74, 2002. doi: 10.1090/conm/305/05215.
  • Davidson and Donsig [2002] Kenneth R. Davidson and Allan P. Donsig. Real analysis with real applications. Prentice Hall, 2002. URL https://cir.nii.ac.jp/crid/1130000794786166144.
  • Heitzinger [2002] Clemens Heitzinger. Simulation and Inverse Modeling of Semiconductor Manufacturing Processes. Thesis, Technische Universität Wien, 2002.
  • Foupouagnigni and Mouafo Wouodjié [2020] Mama Foupouagnigni and Merlin Mouafo Wouodjié. On Multivariate Bernstein Polynomials. Mathematics, 8(9):1397, September 2020. ISSN 2227-7390. doi: 10.3390/math8091397.
  • Low [2017] Guang Hao Low. Quantum Signal Processing by Single-Qubit Dynamics. Thesis, Massachusetts Institute of Technology, 2017. URL https://dspace.mit.edu/handle/1721.1/115025.
  • Kingma and Ba [2015] Diederik P. Kingma and Jimmy Ba. Adam: A Method for Stochastic Optimization. In Yoshua Bengio and Yann LeCun, editors, 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015. URL http://arxiv.org/abs/1412.6980.
  • Deng et al. [2009] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255, June 2009. doi: 10.1109/CVPR.2009.5206848. URL https://ieeexplore.ieee.org/abstract/document/5206848.
  • Wang et al. [2023] Youle Wang, Lei Zhang, Zhan Yu, and Xin Wang. Quantum Phase Processing and its Applications in Estimating Phase and Entropies, July 2023.
  • Vapnik and Chervonenkis [1982] V. N. Vapnik and A. Ya. Chervonenkis. Necessary and Sufficient Conditions for the Uniform Convergence of Means to their Expectations. Theory of Probability & Its Applications, 26(3):532–553, January 1982. ISSN 0040-585X. doi: 10.1137/1126059.
  • Tikhomirov [1993] V. M. Tikhomirov. $\epsilon$-Entropy and $\epsilon$-Capacity of Sets In Functional Spaces. In A. N. Shiryayev, editor, Selected Works of A. N. Kolmogorov: Volume III: Information Theory and the Theory of Algorithms, Mathematics and Its Applications, pages 86–170. Springer Netherlands, Dordrecht, 1993. ISBN 978-94-017-2973-4. doi: 10.1007/978-94-017-2973-4_7.
  • Bartlett and Mendelson [2003] Peter L. Bartlett and Shahar Mendelson. Rademacher and Gaussian Complexities: Risk Bounds and Structural Results. Journal of Machine Learning Research, 3(3):463, April 2003. ISSN 15324435. URL https://search.ebscohost.com/login.aspx?direct=true&db=a9h&AN=10257714&lang=zh-cn&site=ehost-live.
  • Du et al. [2022] Yuxuan Du, Zhuozhuo Tu, Xiao Yuan, and Dacheng Tao. Efficient Measure for the Expressivity of Variational Quantum Algorithms. Physical Review Letters, 128(8):080506, February 2022. doi: 10.1103/PhysRevLett.128.080506.
  • Bu et al. [2022] Kaifeng Bu, Dax Enshan Koh, Lu Li, Qingxian Luo, and Yaobo Zhang. Statistical complexity of quantum circuits. Physical Review A, 105(6):062431, June 2022. doi: 10.1103/PhysRevA.105.062431.
  • Caro and Datta [2020] Matthias C. Caro and Ishaun Datta. Pseudo-dimension of quantum circuits. Quantum Machine Intelligence, 2(2):14, November 2020. ISSN 2524-4914. doi: 10.1007/s42484-020-00027-5.
  • Chen et al. [2022] Chih-Chieh Chen, Masaru Sogabe, Kodai Shiba, Katsuyoshi Sakamoto, and Tomah Sogabe. General Vapnik–Chervonenkis dimension bounds for quantum circuit learning. Journal of Physics: Complexity, 3(4):045007, November 2022. ISSN 2632-072X. doi: 10.1088/2632-072X/ac9f9b.
  • Abbas et al. [2021] Amira Abbas, David Sutter, Christa Zoufal, Aurelien Lucchi, Alessio Figalli, and Stefan Woerner. The power of quantum neural networks. Nature Computational Science, 1(6):403–409, June 2021. ISSN 2662-8457. doi: 10.1038/s43588-021-00084-1.
  • DeVore et al. [1989] Ronald A. DeVore, Ralph Howard, and Charles Micchelli. Optimal nonlinear approximation. manuscripta mathematica, 63(4):469–478, December 1989. ISSN 1432-1785. doi: 10.1007/BF01171759.
  • Hornik [1991] Kurt Hornik. Approximation capabilities of multilayer feedforward networks. Neural networks, 4(2):251–257, 1991. URL https://www.sciencedirect.com/science/article/abs/pii/089360809190009T.
  • E et al. [2022] Weinan E, Chao Ma, and Lei Wu. The Barron Space and the Flow-Induced Function Spaces for Neural Network Models. Constructive Approximation, 55(1):369–406, February 2022. ISSN 1432-0940. doi: 10.1007/s00365-021-09549-y. URL https://doi.org/10.1007/s00365-021-09549-y.
  • Stone [1948] M. H. Stone. The Generalized Weierstrass Approximation Theorem. Mathematics Magazine, 21(4):167–184, 1948. ISSN 0025-570X. doi: 10.2307/3029750.
  • He et al. [2016] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778, June 2016. doi: 10.1109/CVPR.2016.90.
  • Ren et al. [2015] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In C. Cortes, N. Lawrence, D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 28. Curran Associates, Inc., 2015. URL https://proceedings.neurips.cc/paper_files/paper/2015/file/14bfa6bb14875e45bba028a21ed38046-Paper.pdf.
  • Yang et al. [2019] Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Russ R Salakhutdinov, and Quoc V Le. XLNet: Generalized Autoregressive Pretraining for Language Understanding. In Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019. URL https://proceedings.neurips.cc/paper/2019/hash/dc6a7e655d7e5840e66733e9ee67cc69-Abstract.html.
  • Devlin et al. [2019] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota, June 2019. Association for Computational Linguistics. doi: 10.18653/v1/N19-1423. URL https://aclanthology.org/N19-1423.
  • Zhang et al. [2023] Shijun Zhang, Jianfeng Lu, and Hongkai Zhao. Deep Network Approximation: Beyond ReLU to Diverse Activation Functions, September 2023. URL http://arxiv.org/abs/2307.06555.
  • Gühring et al. [2020] Ingo Gühring, Gitta Kutyniok, and Philipp Petersen. Error bounds for approximations with deep ReLU neural networks in $W^{s,p}$ norms. Analysis and Applications, 18(05):803–859, 2020. doi: 10.1142/S0219530519410021. URL https://doi.org/10.1142/S0219530519410021.
  • Schmidt-Hieber [2020] Johannes Schmidt-Hieber. Nonparametric regression using deep neural networks with ReLU activation function. The Annals of Statistics, 48(4):1875–1897, 2020. doi: 10.1214/19-AOS1875. URL https://doi.org/10.1214/19-AOS1875.

Supplementary Material

Appendix A Preliminaries

In this section, we will first present some essential mathematical foundations for deriving the main results of this work. Moreover, to contextualize our work within the existing literature, we comprehensively review relevant studies in Section A.3.

A.1 Notation

We fix the notation used throughout this work. The univariate polynomial ring over a field $\mathbb{F}$ is denoted by $\mathbb{F}[x]$, with the variable $x$ representing the input. The ring of Laurent polynomials $\mathbb{F}[x, x^{-1}]$ is an extension of the polynomial ring obtained by adding inverses of $x$. The set of natural numbers is denoted by $\mathbb{N} := \{1, 2, 3, \dots\}$, while the set of non-negative integers is denoted by $\mathbb{N}_0 := \{0\} \cup \mathbb{N}$. The $1$-norm of a vector $\bm{\alpha} = (\alpha_1, \alpha_2, \dots, \alpha_d)$ is denoted by $\|\bm{\alpha}\|_1 := |\alpha_1| + |\alpha_2| + \cdots + |\alpha_d|$.

A.2 Data re-uploading PQCs

In this section, we review the concept of data re-uploading PQCs and define the PQCs we use in this paper. A data re-uploading PQC is a quantum circuit that consists of interleaved data encoding circuit blocks and trainable circuit blocks [35, 11]. More precisely, let $\bm{x}$ be the input data vector and $\bm{\theta} = (\bm{\theta}_0, \ldots, \bm{\theta}_L)$ be a set of trainable parameters. Let $S(\bm{x})$ be a quantum circuit that encodes $\bm{x}$ and $V(\bm{\theta}_j)$ be a trainable quantum circuit with trainable parameter vector $\bm{\theta}_j$. An $L$-layer data re-uploading PQC can then be expressed as

\[ U_{\bm{\theta}}(\bm{x}) = V(\bm{\theta}_0) \prod_{j=1}^{L} S(\bm{x}) V(\bm{\theta}_j). \tag{A.1} \]

Applying $U_{\bm{\theta}}(\bm{x})$ to a quantum state and measuring the output state provides a way to express functions of $\bm{x}$. The expressivity of the data re-uploading PQC model can be characterized by the classes of functions that it can implement. It is common to build data encoding circuits and trainable circuits using the most prevalent Pauli rotation operators,

\[ R_X(\theta) = \begin{bmatrix} \cos\frac{\theta}{2} & -i\sin\frac{\theta}{2} \\ -i\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{bmatrix}, \quad R_Y(\theta) = \begin{bmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{bmatrix}, \quad R_Z(\theta) = \begin{bmatrix} e^{-i\frac{\theta}{2}} & 0 \\ 0 & e^{i\frac{\theta}{2}} \end{bmatrix}. \tag{A.2} \]
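The rotations in Eq. (A.2) can be written down directly as $2 \times 2$ matrices. The following is a minimal sketch using only the Python standard library; the helper names (`rx`, `ry`, `rz`, `matmul`, `is_unitary`, `close`) are illustrative choices and not part of the paper. It checks that each rotation is unitary and that $R_Z$ angles add under composition.

```python
import cmath
import math

def rx(theta):
    """R_X(theta) from Eq. (A.2) as a 2x2 nested list."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]

def ry(theta):
    """R_Y(theta) from Eq. (A.2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def rz(theta):
    """R_Z(theta) from Eq. (A.2)."""
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def is_unitary(a):
    return close(matmul(dagger(a), a), [[1, 0], [0, 1]])

unitary_ok = all(is_unitary(gate(0.7)) for gate in (rx, ry, rz))
additive_ok = close(matmul(rz(0.3), rz(0.4)), rz(0.7))
```

Nested lists are used instead of a numerical library only to keep the sketch dependency-free.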

Different data encoding schemes lead to different types of data re-uploading PQCs.

In some cases, trainable parameters are also included in both the initial data encoding phase and the final processing of measurement outcomes. These PQCs are considered to have hybrid structures. For instance, in the models proposed by Refs. [35, 36, 40], each input data point is multiplied by a specific trainable parameter and subsequently fed into $R_Z$ gates during the data encoding stage. Similarly, Refs. [39, 40] attach trainable weights to each measurement outcome generated by the constructed PQCs and aggregate these weighted outcomes to produce the final result. Such a structure makes it hard to judge whether the expressive power comes from the classical or the quantum part.

A.2.1 Implementing real polynomials

We first introduce the data re-uploading PQC for implementing real univariate polynomials. We utilize the so-called Pauli $X$ basis encoding [10]: the data encoding unitary is a single-qubit rotation defined as

\[ S(x) \coloneqq e^{i\arccos(x) X} = \begin{pmatrix} x & i\sqrt{1-x^2} \\ i\sqrt{1-x^2} & x \end{pmatrix}, \tag{A.3} \]

where $x \in [-1,1]$ is the input data. Interleaving the data encoding unitary $S(x)$ with parameterized Pauli $Z$ rotations $R_Z(\theta)$ then gives the data re-uploading PQC for one variable,

\[ U_{\bm{\theta}}(x) \coloneqq R_Z(\theta_0) \prod_{j=1}^{L} S(x) R_Z(\theta_j), \tag{A.4} \]

where $\bm{\theta} = (\theta_0, \ldots, \theta_L) \in \mathbb{R}^{L+1}$ is a set of trainable parameters. The PQC in Eq. A.4 can be used to implement polynomial transformations of the input $x$, as shown in the following lemma.

Lemma S1 ([47]).

There exists $\bm{\theta} \in \mathbb{R}^{L+1}$ such that

\[ U_{\bm{\theta}}(x) = \begin{pmatrix} P(x) & iQ(x)\sqrt{1-x^2} \\ iQ^*(x)\sqrt{1-x^2} & P^*(x) \end{pmatrix} \tag{A.5} \]

if and only if polynomials $P, Q \in \mathbb{C}[x]$ satisfy

  1. $\deg(P) \leq L$ and $\deg(Q) \leq L-1$,

  2. $P$ has parity $L \bmod 2$ and $Q$ has parity $(L-1) \bmod 2$ (a polynomial $P \in \mathbb{C}[x]$ has parity $0$ if all coefficients corresponding to odd powers of $x$ are $0$, and parity $1$ if all coefficients corresponding to even powers of $x$ are $0$),

  3. $\forall x \in [-1,1]$, $\lvert P(x) \rvert^2 + (1-x^2) \lvert Q(x) \rvert^2 = 1$.
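The "only if" direction of Lemma S1 can be checked numerically for randomly chosen angles. The sketch below (standard-library Python; function names such as `pqc` and `p_and_q` are illustrative, not from the paper) builds the circuit of Eq. (A.4), reads off $P$ and $Q$ from the first row of the matrix, and tests the normalization and parity conditions at a sample point. It also evaluates the special case $\bm{\theta} = \bm{0}$, where $U_{\bm{\theta}}(x) = S(x)^L$ and the top-left entry is $\cos(L \arccos x)$, the Chebyshev polynomial $T_L(x)$.

```python
import cmath
import math
import random

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rz(theta):
    """R_Z(theta) from Eq. (A.2)."""
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

def s_enc(x):
    """Pauli X basis encoding S(x) from Eq. (A.3)."""
    r = math.sqrt(1 - x * x)
    return [[x, 1j * r], [1j * r, x]]

def pqc(x, thetas):
    """U_theta(x) = R_Z(theta_0) * prod_{j=1..L} S(x) R_Z(theta_j), Eq. (A.4)."""
    u = rz(thetas[0])
    for t in thetas[1:]:
        u = matmul(matmul(u, s_enc(x)), rz(t))
    return u

def p_and_q(x, thetas):
    """Read P(x) and Q(x) off the first row of U, following Eq. (A.5)."""
    u = pqc(x, thetas)
    r = math.sqrt(1 - x * x)
    return u[0][0], u[0][1] / (1j * r)

random.seed(0)
L = 3
thetas = [random.uniform(-math.pi, math.pi) for _ in range(L + 1)]
x = 0.37
p, q = p_and_q(x, thetas)
p_neg, q_neg = p_and_q(-x, thetas)

# Condition 3: |P|^2 + (1 - x^2) |Q|^2 = 1.
norm_ok = abs(abs(p) ** 2 + (1 - x * x) * abs(q) ** 2 - 1) < 1e-12
# Condition 2: P has parity L mod 2 and Q has parity (L - 1) mod 2.
parity_ok = (abs(p_neg - (-1) ** L * p) < 1e-12
             and abs(q_neg - (-1) ** (L - 1) * q) < 1e-12)
# All angles zero: the top-left entry is the Chebyshev polynomial T_L(x).
cheb = pqc(0.5, [0.0] * (L + 1))[0][0]
```

A pointwise check like this does not verify the degree bounds in condition 1; it only illustrates the structure that the lemma describes.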

As shown in the above lemma, one can implement a polynomial transformation $\operatorname{Poly}(x)$ such that $\operatorname{Poly}(x) = \langle 0 | U_{\bm{\theta}}(x) | 0 \rangle = P(x)$. Notice that the achievable polynomial $\operatorname{Poly}(x)$ implemented in this way is limited to $P(x)$ for which there exists a polynomial $Q(x)$ satisfying the conditions of Lemma S1. As the target polynomial is often real in practice, we can overcome this limitation by defining $\operatorname{Poly}(x) = \langle + | U_{\bm{\theta}}(x) | + \rangle = \Re(P(x)) + i \Re(Q(x)) \sqrt{1-x^2}$. Then we can achieve any real polynomial with parity $L \bmod 2$ such that $\deg(\operatorname{Poly}(x)) \leq L$ and $\lvert \operatorname{Poly}(x) \rvert \leq 1$ for all $x \in [-1,1]$.
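The expression for the $\ket{+}$ expectation value follows from the matrix form in Eq. (A.5). As a short worked step, using only $\ket{+} = (\ket{0} + \ket{1})/\sqrt{2}$, the expectation value is half the sum of the four matrix entries:

\[ \begin{aligned} \langle + | U_{\bm{\theta}}(x) | + \rangle &= \tfrac{1}{2} \Bigl( P(x) + P^*(x) + i \bigl( Q(x) + Q^*(x) \bigr) \sqrt{1-x^2} \Bigr) \\ &= \Re(P(x)) + i\, \Re(Q(x)) \sqrt{1-x^2}. \end{aligned} \]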

Corollary S2 ([47]).

There exists $\bm{\theta} \in \mathbb{R}^{L+1}$ such that

\[ p(x) = \langle + | U_{\bm{\theta}}(x) | + \rangle \tag{A.6} \]

if and only if the real polynomial $p(x) \in \mathbb{R}[x]$ satisfies

  1. $\deg(p(x)) \leq L$,

  2. $p(x)$ has parity $L \bmod 2$ (a polynomial $p(x)$ has parity $0$ if all coefficients corresponding to odd powers of $x$ are $0$, and parity $1$ if all coefficients corresponding to even powers of $x$ are $0$),

  3. $\forall x \in [-1,1]$, $\lvert p(x) \rvert \leq 1$.
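As a small worked instance of Corollary S2, consider $L = 1$ and the target $p(x) = c\,x$ with $\lvert c \rvert \leq 1$. Multiplying out $R_Z(\theta_0) S(x) R_Z(\theta_1)$ by hand gives $\langle + | U_{\bm{\theta}}(x) | + \rangle = \cos\bigl(\tfrac{\theta_0 + \theta_1}{2}\bigr) x + i \cos\bigl(\tfrac{\theta_0 - \theta_1}{2}\bigr) \sqrt{1-x^2}$, so the choice $\theta_0 = \arccos(c) + \pi/2$, $\theta_1 = \arccos(c) - \pi/2$ yields exactly $c\,x$. This closed form is derived here for illustration and is not taken from the paper; the sketch below (standard-library Python, illustrative function names) checks it numerically.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rz(theta):
    """R_Z(theta) from Eq. (A.2)."""
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

def s_enc(x):
    """Pauli X basis encoding S(x) from Eq. (A.3)."""
    r = math.sqrt(1 - x * x)
    return [[x, 1j * r], [1j * r, x]]

def pqc(x, thetas):
    """U_theta(x) from Eq. (A.4)."""
    u = rz(thetas[0])
    for t in thetas[1:]:
        u = matmul(matmul(u, s_enc(x)), rz(t))
    return u

def plus_expectation(u):
    """<+|U|+> equals half the sum of all four entries of U."""
    return (u[0][0] + u[0][1] + u[1][0] + u[1][1]) / 2

def linear_angles(c):
    """Hand-derived angles (theta_0, theta_1) giving <+|U|+> = c * x for L = 1."""
    a = math.acos(c)
    return [a + math.pi / 2, a - math.pi / 2]

c = 0.6
thetas = linear_angles(c)
points = (-0.9, -0.2, 0.0, 0.5, 1.0)
values = [plus_expectation(pqc(x, thetas)) for x in points]
max_error = max(abs(v - c * x) for v, x in zip(values, points))
```

For higher degrees the angles generally have to be computed numerically; this example only covers the degree-one case where they can be written in closed form.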

Remark S1.

The results on PQCs with Pauli $X$ basis encoding presented here were established in the framework of quantum signal processing (QSP) [45, 46, 47], which interleaves signal operators and signal processing operators to transform the input signal. A QSP circuit can be identified as a PQC in the context of quantum machine learning.

A.2.2 Implementing trigonometric polynomials

Beyond real polynomials, single-qubit PQCs with Pauli $Z$ basis encoding can implement complex trigonometric polynomials [37]. The data encoding unitary is a single-qubit rotation in the Pauli $Z$ basis,

\[ S(x) \coloneqq R_Z(x) = \begin{pmatrix} e^{ix/2} & 0 \\ 0 & e^{-ix/2} \end{pmatrix}, \tag{A.7} \]

where $x \in \mathbb{R}$ is the data. By interleaving the data encoding unitary $S(x)$ with trainable gates $R_Y(\theta) R_Z(\phi)$, the PQC is defined as

\[ U_{\bm{\theta},\bm{\phi}}(x) \coloneqq R_Z(\omega) R_Y(\theta_0) R_Z(\phi_0) \prod_{j=1}^{L} S(x) R_Y(\theta_j) R_Z(\phi_j), \tag{A.8} \]

where $\bm{\theta} = (\theta_0, \ldots, \theta_L) \in \mathbb{R}^{L+1}$, $\bm{\phi} = (\phi_0, \ldots, \phi_L) \in \mathbb{R}^{L+1}$ and $\omega \in \mathbb{R}$. The following lemma characterizes the correspondence between PQCs with Pauli $Z$ basis encoding and complex trigonometric polynomials.

Lemma S3 ([37]).

There exist $\bm{\theta}, \bm{\phi} \in \mathbb{R}^{L+1}$ and $\omega \in \mathbb{R}$ such that

\[ U_{\bm{\theta},\bm{\phi}}(x) = \begin{pmatrix} P(x) & -Q(x) \\ Q^*(x) & P^*(x) \end{pmatrix} \tag{A.9} \]

if and only if Laurent polynomials $P, Q \in \mathbb{C}[e^{ix/2}, e^{-ix/2}]$ satisfy

  1. $\deg(P) \leq L$ and $\deg(Q) \leq L$,

  2. $P$ and $Q$ have parity $L \bmod 2$,

  3. $\forall x \in \mathbb{R}$, $\lvert P(x) \rvert^2 + \lvert Q(x) \rvert^2 = 1$.
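The degree and parity conditions of Lemma S3 can be observed numerically by sampling $P(x) = \langle 0 | U_{\bm{\theta},\bm{\phi}}(x) | 0 \rangle$ over one period $4\pi$ of $e^{ix/2}$ and computing its Fourier coefficients. The sketch below is standard-library Python with illustrative function names and randomly drawn angles; the sample count `N = 16` is an arbitrary choice large enough to avoid aliasing for $L = 3$. The encoding matrix is taken as printed in Eq. (A.7).

```python
import cmath
import math
import random

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def ry(theta):
    """R_Y(theta) from Eq. (A.2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

def rz(theta):
    """R_Z(theta) from Eq. (A.2)."""
    return [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

def s_enc(x):
    """Data encoding with entries as printed in Eq. (A.7)."""
    return [[cmath.exp(1j * x / 2), 0], [0, cmath.exp(-1j * x / 2)]]

def trig_pqc(x, omega, thetas, phis):
    """U_{theta,phi}(x) from Eq. (A.8)."""
    u = matmul(rz(omega), matmul(ry(thetas[0]), rz(phis[0])))
    for t, p in zip(thetas[1:], phis[1:]):
        u = matmul(u, matmul(s_enc(x), matmul(ry(t), rz(p))))
    return u

random.seed(1)
L = 3
thetas = [random.uniform(-math.pi, math.pi) for _ in range(L + 1)]
phis = [random.uniform(-math.pi, math.pi) for _ in range(L + 1)]
omega = random.uniform(-math.pi, math.pi)

N = 16  # number of samples over one period 4*pi of e^{ix/2}
p_samples = [trig_pqc(4 * math.pi * k / N, omega, thetas, phis)[0][0]
             for k in range(N)]

def coeff(m):
    """Coefficient of e^{i m x / 2} in P, via a discrete Fourier sum."""
    return sum(p * cmath.exp(-2j * math.pi * m * k / N)
               for k, p in enumerate(p_samples)) / N

coeffs = {m: coeff(m) for m in range(-N // 2, N // 2)}
# Conditions 1 and 2: only powers m with |m| <= L and m = L (mod 2) may appear.
forbidden = [m for m in coeffs if abs(m) > L or (m - L) % 2 != 0]
degree_and_parity_ok = all(abs(coeffs[m]) < 1e-12 for m in forbidden)

# Condition 3 at one sample point: |P|^2 + |Q|^2 = 1.
u = trig_pqc(0.8, omega, thetas, phis)
norm_ok = abs(abs(u[0][0]) ** 2 + abs(u[0][1]) ** 2 - 1) < 1e-12
```

The same check applied to `u[0][1]` would test the conditions on $Q$.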

Note that Laurent polynomials in $\mathbb{C}[e^{ix/2}, e^{-ix/2}]$ with parity $0$ are Laurent polynomials in $\mathbb{C}[e^{ix}, e^{-ix}]$ without parity constraints, which implies that the trigonometric QSP can implement complex trigonometric polynomials.
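To spell out this step, consider a Laurent polynomial in $e^{ix/2}$ of degree at most $2L$ with parity $0$, and write $a_k$ for its coefficients (a notation introduced here only for illustration). Only even powers appear, so substituting $k = 2j$ gives

\[ P(x) = \sum_{\substack{k = -2L \\ k \text{ even}}}^{2L} a_k e^{ikx/2} = \sum_{j=-L}^{L} a_{2j} e^{ijx}, \]

which is a trigonometric polynomial of degree at most $L$ in $e^{ix}$ with no parity restriction on $j$.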

Corollary S4 ([37, 57]).

There exist $\bm{\theta}, \bm{\phi} \in \mathbb{R}^{2L+1}$ and $\omega \in \mathbb{R}$ such that

\[ t(x) = \langle 0 | U_{\bm{\theta},\bm{\phi}}(x) | 0 \rangle \tag{A.10} \]

if and only if the complex-valued trigonometric polynomial $t(x) = \sum_{j=-L}^{L} c_j e^{ijx}$ satisfies $\lvert t(x) \rvert \leq 1$ for all $x \in \mathbb{R}$.

A.3 Related work in PQC approximation

In this subsection, we review prior literature related to the approximation capabilities of PQCs, which characterizes how the architectural properties of a PQC affect the resulting functions it can fit, and its ensuing performance. After a systematic comparison, we conclude that our results provide precise error bounds for continuous function approximation and make no assumptions about the constructed PQCs. More importantly, all the variables in our proposal take the form of parameters within rotation gates and remain distinct from the data encoding gates to avoid any classical computational influence, thus preserving the inherent quantum property of our approach.

In theoretical machine learning, statistical complexity measures the inherent richness of a given hypothesis space. There are various statistical complexity measures, including the Vapnik-Chervonenkis (VC) dimension [58], the metric entropy [59], the Gaussian complexity [60], and the Rademacher complexity [60]. To gauge the statistical complexity of PQCs, Du et al. [61] have explored the covering entropy of PQCs in terms of the number of quantum gates and the measurement observable. Bu et al. [62] have investigated the dependence of the Rademacher complexity of PQCs on the resources, width, depth, and the property of input and output registers. PQCs have also been assessed with other statistical complexity measures, including the pseudo-dimension in Caro and Datta [63] and the VC dimension in Chen et al. [64]. Furthermore, PQC expressivity has been evaluated with metrics rooted in information theory. Abbas et al. [65] have evaluated PQC expressivity through the effective dimension, a data-dependent metric based on the Fisher information. In parallel, Du et al. [27] have focused on generative tasks, employing entanglement entropy as a metric for quantifying PQC expressivity. While statistical complexity metrics and information-inspired metrics provide valuable insights into the ‘volume’ of hypothesis spaces, they do not precisely delineate the functions that these models can represent.

To further explore the expressivity of PQCs, an alternative line of research has emerged in recent studies [34, 35, 37, 36, 38]. These works rewrite the PQC output, i.e., the inner product between an input quantum state and a variational observable, in the form of a partial Fourier series. This perspective offers a more refined toolbox for assessing PQC expressivity, notably with respect to the universal approximation property (UAP). However, many investigations employing Fourier expansion rely on impractical assumptions, such as arbitrary parameterized global unitaries and observables, which pose significant challenges to the practical implementation of the constructed quantum circuits. Moreover, existence proofs of universal approximation do not explicitly give approximation error bounds of PQCs.

A very general approach to expressiveness in the context of approximation is the method of nonlinear widths by DeVore et al. [66], which concerns the approximation of a family of functions under the assumption of a continuous dependence of the model on the approximated function. Pérez-Salinas et al. [36] have proved that single-qubit data re-uploading PQCs are universal function approximators, inheriting the famous universal approximation theorem for neural networks [13, 67]. In a quantum-enhanced context, Goto et al. [39] have constructed PQCs to approximate any continuous function guided by the Stone-Weierstrass theorem. Qi et al. [41] have studied the approximation error of PQCs enhanced by tensor-train networks. Their investigation focused on smooth functions, considering factors such as the number of qubits and quantum measurement counts. Furthermore, Gonon and Jacquier [40] have defined a specific hypothesis space consisting of non-oscillating functions, drawing inspiration from Barron [15], and devised PQCs for approximating such functions without encountering the curse of dimensionality (CoD). Notably, the mitigation of CoD arises from their specific hypothesis space definition and is also observed for classical neural networks [68]. It is essential to acknowledge that these works have a hybrid nature, blurring the boundaries between classical and quantum domains in circuit construction. The hybrid structure manifests in the data encoding phase and in the weighted summation of outputs from foundational quantum circuits. Consequently, whether the powerful expressivity comes from the classical part or the quantum part of hybrid models is unclear.

In our present work, we make no assumptions in the construction of the PQCs. In our PQC model, all variables take the form of parameters within rotation gates. Besides, these trainable parameters remain distinct from the data encoding gates to avoid any classical computational influence. These properties ensure that our constructed PQCs retain practicality and remain firmly rooted within the quantum domain.

Appendix B Implementing multivariate polynomials using PQCs

B.1 Implementing multivariate real polynomials

A multivariate polynomial with $d$ variables and degree $s \in \mathbb{N}$ is defined as

\[ p(\bm{x}) \coloneqq \sum_{\lVert \bm{\alpha} \rVert_1 \leq s} c_{\bm{\alpha}} \bm{x}^{\bm{\alpha}}, \tag{B.11} \]

where $\bm{x} = (x_1, \ldots, x_d) \in \mathbb{R}^d$, $\bm{\alpha} = (\alpha_1, \ldots, \alpha_d) \in \mathbb{N}_0^d$, $c_{\bm{\alpha}} \in \mathbb{R}$ and $\bm{x}^{\bm{\alpha}} = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_d^{\alpha_d}$.
To implement the multivariate polynomial $p(\bm{x})$, we first build a PQC to express a monomial $c_{\bm{\alpha}} \bm{x}^{\bm{\alpha}} = c_{\bm{\alpha}} x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_d^{\alpha_d}$, where $\lvert c_{\bm{\alpha}} \bm{x}^{\bm{\alpha}} \rvert \leq 1$ for $\bm{x} \in [0,1]^d$ and $\lVert \bm{\alpha} \rVert_1 \leq s$. We apply the single-qubit PQC with Pauli $X$ basis encoding defined in Eq. A.4 on each $x_j$ for $1 \leq j \leq d$, respectively.
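The index set in Eq. (B.11) can be enumerated explicitly. The sketch below (standard-library Python; the function names and the example coefficients are hypothetical) lists all multi-indices with $\lVert \bm{\alpha} \rVert_1 \leq s$, assuming exponents are allowed to be zero, and evaluates a polynomial given as a dictionary from multi-indices to coefficients. The count of such multi-indices is the standard binomial coefficient $\binom{s+d}{d}$, which is the number of monomial circuits a term-by-term construction would need in the worst case.

```python
from itertools import product
from math import comb, prod

def multi_indices(d, s):
    """All alpha with d non-negative integer entries and 1-norm at most s."""
    return [a for a in product(range(s + 1), repeat=d) if sum(a) <= s]

def evaluate(coeffs, x):
    """p(x) = sum over alpha of c_alpha * x^alpha, with coeffs = {alpha: c_alpha}."""
    return sum(c * prod(xi ** ai for xi, ai in zip(x, a))
               for a, c in coeffs.items())

d, s = 2, 2
alphas = multi_indices(d, s)
count_ok = len(alphas) == comb(s + d, d)

# Hypothetical example: p(x1, x2) = 0.5 + 0.25 * x1 * x2 - 0.1 * x2^2.
coeffs = {(0, 0): 0.5, (1, 1): 0.25, (0, 2): -0.1}
value = evaluate(coeffs, (0.4, 0.5))
```

The brute-force filter over `range(s + 1) ** d` candidates is fine for small $d$ and $s$ but is not meant as an efficient enumeration.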

Lemma S5.

Given a monomial $c_{\bm{\alpha}} \bm{x}^{\bm{\alpha}} = c_{\bm{\alpha}} x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_d^{\alpha_d}$ such that $\lvert c_{\bm{\alpha}} \bm{x}^{\bm{\alpha}} \rvert \leq 1$ for all $\bm{x} \in [0,1]^d$ and $\lVert \bm{\alpha} \rVert_1 \leq s$ for $s \in \mathbb{N}$, there exists a PQC $U^{\bm{\alpha}}(\bm{x})$ such that

\[ \langle + |^{\otimes d}\, U^{\bm{\alpha}}(\bm{x})\, | + \rangle^{\otimes d} = c_{\bm{\alpha}} \bm{x}^{\bm{\alpha}}. \tag{B.12} \]

The width of the PQC is at most $d$, the depth is at most $2s+1$, and the number of parameters is at most $s+d$.

Proof.

By Corollary S2, there exist $d$ single-qubit PQCs $U_{\bm{\theta}_{1}}^{\alpha_{1}}(x_{1}),U_{\bm{\theta}_{2}}^{\alpha_{2}}(x_{2}),\ldots,U_{\bm{\theta}_{d}}^{\alpha_{d}}(x_{d})$ such that

\begin{align*}
\bra{+}U_{\bm{\theta}_{1}}^{\alpha_{1}}(x_{1})\ket{+} &= c_{\bm{\alpha}}x_{1}^{\alpha_{1}},\\
\bra{+}U_{\bm{\theta}_{2}}^{\alpha_{2}}(x_{2})\ket{+} &= x_{2}^{\alpha_{2}},\\
&\;\;\vdots\\
\bra{+}U_{\bm{\theta}_{d}}^{\alpha_{d}}(x_{d})\ket{+} &= x_{d}^{\alpha_{d}},
\end{align*}

where the number of layers of each PQC is $L_{j}=\alpha_{j}$ for $1\leq j\leq d$. We then define a $d$-qubit PQC as

\begin{equation}
U^{\bm{\alpha}}(\bm{x})=\bigotimes_{j=1}^{d}U_{\bm{\theta}_{j}}^{\alpha_{j}}(x_{j}), \tag{B.13}
\end{equation}

which gives

\begin{equation}
\bra{+}^{\otimes d}\!U^{\bm{\alpha}}(\bm{x})\!\ket{+}^{\otimes d}=\prod_{j=1}^{d}\bra{+}U_{\bm{\theta}_{j}}^{\alpha_{j}}(x_{j})\ket{+}=c_{\bm{\alpha}}\bm{x^{\alpha}}. \tag{B.14}
\end{equation}

Since $\lVert\bm{\alpha}\rVert_{1}=\sum_{j=1}^{d}\alpha_{j}\leq s$, we can conclude that the depth of $U^{\bm{\alpha}}(\bm{x})$ is at most $2s+1$ and the number of parameters in $U^{\bm{\alpha}}(\bm{x})$ is at most $s+d$. $\square$
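As a numerical sanity check of the factorization used in Eq. B.14, the following sketch compares the expectation of a tensor product of single-qubit unitaries in $\ket{+}^{\otimes d}$ with the product of the single-qubit expectations. The rotations `rz(theta)`, the angles, and the helper names are illustrative stand-ins chosen here; they are not the PQCs constructed in Corollary S2.

```python
import cmath
import math

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def expval(bra, mat, ket):
    """Compute <bra| mat |ket> for list-based vectors and matrices."""
    return sum(bra[i].conjugate() * mat[i][j] * ket[j]
               for i in range(len(bra)) for j in range(len(ket)))

def rz(theta):
    """Single-qubit rotation R_Z(theta) = diag(e^{-i theta/2}, e^{i theta/2})."""
    return [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]

plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]  # the state |+>

# Three stand-in single-qubit unitaries (assumption: not the PQCs of Corollary S2).
angles = [0.3, 1.1, 2.0]
unitaries = [rz(t) for t in angles]

# Left-hand side: expectation of the tensor product in |+>^{(x) d}.
big_u, big_plus = [[1]], [1]
for u in unitaries:
    big_u = kron(big_u, u)
    big_plus = [a * b for a in big_plus for b in plus]
lhs = expval(big_plus, big_u, big_plus)

# Right-hand side: product of the single-qubit expectations.
rhs = 1
for u in unitaries:
    rhs *= expval(plus, u, plus)

print(abs(lhs - rhs))  # difference should be at floating-point precision
```

For this choice each factor equals $\cos(\theta_{j}/2)$, so both sides reduce to a product of cosines; the same identity holds for any single-qubit unitaries.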

The next step is to combine monomials together to implement the multivariate polynomial. Specifically, we would like to implement the following (unnormalized) operator

\begin{equation}
U_{p}(\bm{x})\coloneqq\sum_{\lVert\bm{\alpha}\rVert_{1}\leq s}U^{\bm{\alpha}}(\bm{x}) \tag{B.15}
\end{equation}

so that we can implement an (unnormalized) polynomial as

\begin{equation}
\bra{+}^{\otimes d}\!U_{p}(\bm{x})\!\ket{+}^{\otimes d}=\sum_{\lVert\bm{\alpha}\rVert_{1}\leq s}\bra{+}^{\otimes d}\!U^{\bm{\alpha}}(\bm{x})\!\ket{+}^{\otimes d}=\sum_{\lVert\bm{\alpha}\rVert_{1}\leq s}c_{\bm{\alpha}}\bm{x^{\alpha}}=p(\bm{x}). \tag{B.16}
\end{equation}

We denote by $T$ the number of terms in the summation and observe that it can be bounded as

\begin{equation}
T=\sum_{\lVert\bm{\alpha}\rVert_{1}\leq s}1=\sum_{j=0}^{s}\sum_{\lVert\bm{\alpha}\rVert_{1}=j}1\leq\sum_{j=0}^{s}d^{s}\leq(s+1)d^{s}. \tag{B.17}
\end{equation}
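The bound on $T$ can be checked by direct enumeration for small $d$ and $s$. The sketch below counts the multi-indices $\bm{\alpha}$ with non-negative integer entries and $\lVert\bm{\alpha}\rVert_{1}\leq s$, compares the count with the standard stars-and-bars value $\binom{s+d}{d}$, and verifies that it does not exceed $(s+1)d^{s}$. The function names are hypothetical and introduced only for illustration.

```python
from itertools import product
from math import comb

def count_multi_indices(d, s):
    """Count alpha in {0, 1, ...}^d with |alpha|_1 <= s by brute-force enumeration."""
    return sum(1 for alpha in product(range(s + 1), repeat=d) if sum(alpha) <= s)

def closed_form(d, s):
    """Stars-and-bars count of the same set: binomial(s + d, d)."""
    return comb(s + d, d)

def paper_bound(d, s):
    """Upper bound (s + 1) * d^s on the number of terms T."""
    return (s + 1) * d ** s

for d, s in [(1, 3), (2, 2), (3, 4), (4, 3)]:
    t = count_multi_indices(d, s)
    # The exact count matches the closed form and respects the bound.
    assert t == closed_form(d, s) <= paper_bound(d, s)
    print(d, s, t, paper_bound(d, s))
```

The exact count grows like $\binom{s+d}{d}$, so $(s+1)d^{s}$ is a convenient but generally loose upper bound.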

For convenience, we rewrite the normalized target operator with $\bm{\alpha}$ being an indexed variable as

\begin{equation}
U_{p}(\bm{x})=\sum_{j=1}^{T}\frac{1}{T}U^{\bm{\alpha}^{(j)}}(\bm{x}). \tag{B.18}
\end{equation}

However, the addition operation in quantum computing is non-trivial as the sum of unitary operators is not necessarily unitary. To sum the monomials together, we utilize the technique of linear combination of unitaries (LCU) [48] to implement the operator $U_{p}(\bm{x})$ in Eq. B.18 on a quantum computer. We first construct a unitary operator $F$ such that

\begin{equation}
F\ket{0}=\frac{1}{\sqrt{T}}\sum_{j=1}^{T}\ket{j}. \tag{B.19}
\end{equation}

The unitary $F$ could be simply implemented by Hadamard gates. Next, we construct a controlled unitary

\begin{equation}
U_{c}(\bm{x})=\sum_{j=1}^{T}\lvert j\rangle\!\langle j\rvert\otimes U^{\bm{\alpha}^{(j)}}(\bm{x}). \tag{B.20}
\end{equation}

Note that each $\lvert j\rangle\!\langle j\rvert\otimes U^{\bm{\alpha}^{(j)}}(\bm{x})$ could be constructed using $(\log T)$-qubit controlled Pauli rotation gates, as $U^{\bm{\alpha}^{(j)}}(\bm{x})$ consists of single-qubit Pauli rotation gates. The $(\log T)$-qubit controlled gates could be further decomposed into quantum circuits of CNOT gates and single-qubit rotation gates in $O(\log T)$ circuit depth without using any ancilla qubit. For the detailed implementation of these multi-controlled gates, we refer to da Silva and Park [49]. Then the unitary $W_{lcu}=(F^{\dagger}\otimes I)U_{c}(F\otimes I)$ satisfies

\begin{equation}
W_{lcu}\ket{0}\ket{+}^{\otimes d}=\ket{0}U_{p}(\bm{x})\ket{+}^{\otimes d}+\ket{\perp}, \tag{B.21}
\end{equation}

where $(\bra{0}\otimes I)\ket{\perp}=0$. Notice that

\begin{equation}
\bra{0}\bra{+}^{\otimes d}W_{lcu}\ket{0}\ket{+}^{\otimes d}=\bra{+}^{\otimes d}U_{p}(\bm{x})\ket{+}^{\otimes d}=p(\bm{x}). \tag{B.22}
\end{equation}

To obtain the polynomial $p(\bm{x})$, we could estimate $\bra{0}\bra{+}^{\otimes d}W_{lcu}\ket{0}\ket{+}^{\otimes d}$ using the Hadamard test.
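The LCU construction can be illustrated on a small instance. The sketch below assumes $T=4$ stand-in single-qubit unitaries (plain $R_{Z}$ rotations with arbitrary angles, not the monomial PQCs of Lemma S5), takes $F=H\otimes H$ on two ancilla qubits, and uses zero-based labels $j=0,\ldots,T-1$ instead of $j=1,\ldots,T$. It checks that the block of $W_{lcu}$ with the ancilla in $\ket{0}$ is the normalized sum of Eq. B.18.

```python
import cmath
import math

def matmul(a, b):
    """Matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def kron(a, b):
    """Kronecker product of nested-list matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def rz(theta):
    """Single-qubit rotation R_Z(theta) = diag(e^{-i theta/2}, e^{i theta/2})."""
    return [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]

h = 1 / math.sqrt(2)
H = [[h, h], [h, -h]]
I2 = [[1, 0], [0, 1]]

# T = 4 stand-in unitaries U_0..U_3 (assumption: arbitrary illustrative angles).
thetas = [0.2, 0.9, 1.7, 2.4]
us = [rz(t) for t in thetas]
T = len(us)

# F = H (x) H maps |00> to the uniform superposition over the T = 4 labels.
F = kron(H, H)

# U_c = sum_j |j><j| (x) U_j is block diagonal with blocks U_j.
dim = 2 * T
Uc = [[0] * dim for _ in range(dim)]
for j, u in enumerate(us):
    for r in range(2):
        for c in range(2):
            Uc[2 * j + r][2 * j + c] = u[r][c]

# W_lcu = (F^dagger (x) I) U_c (F (x) I); F is real symmetric, so F^dagger = F.
FI = kron(F, I2)
W = matmul(matmul(FI, Uc), FI)

# The block of W with the ancilla in |00> is the top-left 2x2 corner.
block = [row[:2] for row in W[:2]]
average = [[sum(u[r][c] for u in us) / T for c in range(2)] for r in range(2)]

# Amplitude <00|<+| W |00>|+>, the quantity targeted by the Hadamard test.
plus = [h, h]
amp = sum(plus[r] * block[r][c] * plus[c] for r in range(2) for c in range(2))
print(amp)
```

In this toy instance the amplitude equals the average of the single-qubit expectations $\cos(\theta_{j}/2)$, which is the $1/T$-weighted sum that the uniform state prepared by $F$ produces.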

Theorem 1.

For any multivariate polynomial $p(\bm{x})$ with $d$ variables and degree $s$ such that $\lvert p(\bm{x})\rvert\leq 1$ for $\bm{x}\in[0,1]^{d}$, there exists a PQC $W_{p}(\bm{x})$ such that

\begin{equation}
f_{W_{p}}(\bm{x})\coloneqq\bra{0}W^{\dagger}_{p}(\bm{x})Z^{(0)}W_{p}(\bm{x})\ket{0}=p(\bm{x}) \tag{B.23}
\end{equation}

where $Z^{(0)}$ is the Pauli $Z$ observable on the first qubit. The width of the PQC is $O(d+\log s+s\log d)$, the depth is $O(s^{2}d^{s}(\log s+s\log d))$, and the number of parameters is $O(sd^{s}(s+d))$.

Proof.

We apply the Hadamard test on $W_{lcu}$, giving the quantum circuit $W_{p}(\bm{x})$ as follows.

\[
\Qcircuit @C=1em @R=0.5em {
\lstick{\ket{0}} & \qw & \gate{H} & \qw & \ctrl{1} & \qw & \gate{H} & \qw \\
\lstick{\ket{0}} & {/}\qw & \qw & \qw & \multigate{1}{W_{lcu}} & \qw & \qw & \qw \\
\lstick{\ket{0}} & {/}\qw & \gate{H^{\otimes d}} & \qw & \ghost{W_{lcu}} & \qw & \qw & \qw
}
\]

Measuring the first qubit of $W_{p}(\bm{x})$, we have

\begin{equation}
f_{W_{p}}(\bm{x})\coloneqq\bra{0}W^{\dagger}_{p}(\bm{x})Z^{(0)}W_{p}(\bm{x})\ket{0}=\bra{0}\bra{+}^{\otimes d}W_{lcu}\ket{0}\ket{+}^{\otimes d}=p(\bm{x}). \tag{B.24}
\end{equation}

The controlled unitary used in LCU,

\begin{equation}
U_{c}(\bm{x})=\sum_{j=1}^{T}\lvert j\rangle\!\langle j\rvert\otimes U^{\bm{\alpha}^{(j)}}(\bm{x}), \tag{B.25}
\end{equation}

could be implemented by at most $O(Ts)$ $(\log T)$-qubit controlled gates. A $(\log T)$-qubit controlled gate could be implemented by a quantum circuit consisting of CNOT gates and single-qubit gates with depth $O(\log T)$ [49]. Thus $U_{c}(\bm{x})$ could be implemented by a quantum circuit with depth $O(sT\log T)$ and width $O(d+\log T)$. The depth and width of $W_{lcu}=(F^{\dagger}\otimes I)U_{c}(F\otimes I)$ are then of the same order as those of $U_{c}(\bm{x})$, since $F$ is simply a tensor product of Hadamard gates. Therefore the entire depth of the circuit $W_{p}$ is $O\bigl(sT\log T+d\bigr)$, and the width of $W_{p}$ is $O(d+\log T)$. As $T\leq(s+1)d^{s}$, we have $\log T=O(\log s+s\log d)$, which gives the claimed width $O(d+\log s+s\log d)$ and depth $O(s^{2}d^{s}(\log s+s\log d))$.
Note that the number of parameters in the PQC equals the number of parameters in $U_{c}(\bm{x})$, which is $O(T(s+d))=O(sd^{s}(s+d))$. $\square$
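The Hadamard test step can be simulated directly on a small state vector. The sketch below is a minimal illustration under our own assumptions: `w` is a stand-in single-qubit unitary rather than $W_{lcu}$, and `hadamard_test_z` is a hypothetical helper name. It tracks the two ancilla branches after the sequence $H$, controlled-$W$, $H$ and returns the ancilla expectation $\langle Z\rangle$, which equals the real part of $\bra{\psi}W\ket{\psi}$.

```python
import cmath
import math

def apply(mat, vec):
    """Apply a nested-list matrix to a list-based state vector."""
    return [sum(mat[i][j] * vec[j] for j in range(len(vec))) for i in range(len(mat))]

def hadamard_test_z(w, psi):
    """Return <Z> on the ancilla after H, controlled-W, H, starting from |0>|psi>."""
    w_psi = apply(w, psi)
    # After the circuit the state is |0>(psi + W psi)/2 + |1>(psi - W psi)/2.
    branch0 = [(a + b) / 2 for a, b in zip(psi, w_psi)]
    branch1 = [(a - b) / 2 for a, b in zip(psi, w_psi)]
    p0 = sum(abs(x) ** 2 for x in branch0)  # probability of ancilla outcome 0
    p1 = sum(abs(x) ** 2 for x in branch1)  # probability of ancilla outcome 1
    return p0 - p1

# Stand-in unitary (assumption): R_Z(theta) with an arbitrary angle.
theta = 0.8
w = [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]
psi = [1 / math.sqrt(2), 1 / math.sqrt(2)]  # the state |+>

z_expectation = hadamard_test_z(w, psi)
direct = sum(psi[i].conjugate() * w[i][j] * psi[j]
             for i in range(2) for j in range(2)).real
print(z_expectation, direct)
```

This form of the test yields only the real part of the amplitude, which suffices whenever the target amplitude is real, as is the case for a real-valued polynomial.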

Note that $O(\frac{1}{\varepsilon^{2}})$ measurements of the first qubit of $W_{p}(\bm{x})$ are needed to estimate the value of $p(\bm{x})$ up to an additive error $\varepsilon$. We could further use the amplitude estimation algorithm [50] to reduce this measurement overhead while increasing the circuit depth by $O(\frac{1}{\varepsilon})$.
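To make the $O(\frac{1}{\varepsilon^{2}})$ scaling concrete, the sketch below uses Hoeffding's inequality for outcomes in $\{-1,+1\}$, which guarantees an additive error below $\varepsilon$ with probability at least $1-\delta$ once the number of shots reaches $2\ln(2/\delta)/\varepsilon^{2}$. The constant, the failure probability $\delta$, and the classical sampler standing in for measurements of $Z^{(0)}$ are assumptions of this illustration and are not taken from the analysis above.

```python
import math
import random

def shots_needed(eps, delta):
    """Hoeffding bound on shots for +/-1 outcomes: N >= 2 ln(2/delta) / eps^2."""
    return math.ceil(2 * math.log(2 / delta) / eps ** 2)

def estimate(p_true, shots, rng):
    """Average of +/-1 outcomes with mean p_true (classical stand-in for measuring Z)."""
    prob_plus = (1 + p_true) / 2
    total = sum(1 if rng.random() < prob_plus else -1 for _ in range(shots))
    return total / shots

rng = random.Random(0)  # fixed seed so the run is reproducible
eps, delta, p_true = 0.1, 1e-6, 0.3
n = shots_needed(eps, delta)
p_hat = estimate(p_true, n, rng)
print(n, p_hat)
```

Halving $\varepsilon$ roughly quadruples the required number of shots, which is the quadratic overhead that amplitude estimation trades against circuit depth.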

B.2 Implementing multivariate trigonometric polynomials

We extend the PQCs with $R_{Z}$ encoding to implement multivariate trigonometric polynomials. A multivariate trigonometric polynomial with $d$ variables and degree $s$ is defined as

\begin{equation}
t(\bm{x})\coloneqq\sum_{\lVert\bm{n}\rVert_{1}\leq s}c_{\bm{n}}e^{i\bm{n}\cdot\bm{x}} \tag{B.26}
\end{equation}

where $c_{\bm{n}}\in\mathbb{C}$, $\bm{x}=(x_{1},\ldots,x_{d})\in\mathbb{R}^{d}$, $\bm{n}=(n_{1},\ldots,n_{d})\in\mathbb{Z}^{d}$, and $e^{i\bm{n}\cdot\bm{x}}=e^{in_{1}x_{1}}e^{in_{2}x_{2}}\cdots e^{in_{d}x_{d}}$.
Consider a trigonometric monomial $c_{\bm{n}}e^{i\bm{n}\cdot\bm{x}}=c_{\bm{n}}e^{in_{1}x_{1}}e^{in_{2}x_{2}}\cdots e^{in_{d}x_{d}}$ such that $\lvert c_{\bm{n}}e^{i\bm{n}\cdot\bm{x}}\rvert\leq 1$ for all $\bm{x}\in\mathbb{R}^{d}$ and $\lVert\bm{n}\rVert_{1}\leq s$. We apply the single-qubit PQC with $R_{Z}$ encoding defined in Eq. A.8 to each $x_{j}$ for $1\leq j\leq d$.
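The definition in Eq. B.26 and the per-variable factorization of each monomial can be sketched classically as follows. The coefficient dictionary, the evaluation point, and the function names are hypothetical choices made for illustration; the example polynomial $t(\bm{x})=\cos(x_{1}+x_{2})+\tfrac{1}{2}e^{-ix_{1}}$ has $d=2$ and degree $s=2$.

```python
import cmath

def trig_poly(coeffs, x):
    """Evaluate t(x) = sum_n c_n exp(i n.x) for a dict {n (tuple of ints): c_n}."""
    return sum(c * cmath.exp(1j * sum(nj * xj for nj, xj in zip(n, x)))
               for n, c in coeffs.items())

def monomial_factors(n, x):
    """Per-variable factors exp(i n_j x_j) of the monomial exp(i n.x)."""
    return [cmath.exp(1j * nj * xj) for nj, xj in zip(n, x)]

# Illustrative example: t(x) = cos(x1 + x2) + 0.5 * exp(-i x1), with |n|_1 <= 2.
coeffs = {(1, 1): 0.5, (-1, -1): 0.5, (-1, 0): 0.5}
x = (0.4, 1.3)
value = trig_poly(coeffs, x)

# Each monomial splits into one factor per variable, one factor per qubit.
factors = monomial_factors((1, 1), x)
print(value, factors[0] * factors[1])
```

The product of the per-variable factors reproduces $e^{i\bm{n}\cdot\bm{x}}$, which is the structure that a tensor product of single-qubit circuits can exploit.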

Lemma S6.

Given a trigonometric monomial $c_{\bm{n}}e^{i\bm{n}\cdot\bm{x}}=c_{\bm{n}}e^{in_{1}x_{1}}e^{in_{2}x_{2}}\cdots e^{in_{d}x_{d}}$ such that $\lvert c_{\bm{n}}e^{i\bm{n}\cdot\bm{x}}\rvert\leq 1$ for all $\bm{x}\in\mathbb{R}^{d}$ and $\lVert\bm{n}\rVert_{1}\leq s$, there exists a PQC $U^{\bm{n}}(\bm{x})$ such that

\begin{equation}
\bra{0}^{\otimes d}\!U^{\bm{n}}(\bm{x})\!\ket{0}^{\otimes d}=c_{\bm{n}}e^{i\bm{n}\cdot\bm{x}}. \tag{B.27}
\end{equation}

The width of the PQC is at most $d$, the depth is at most $6s+3$, and the number of parameters is at most $4s+3d$.

Proof.

By Corollary S4, we can construct $d$ single-qubit PQCs $U_{\bm{\theta}_{1},\bm{\phi}_{1}}^{n_{1}}(x_{1}),U_{\bm{\theta}_{2},\bm{\phi}_{2}}^{n_{2}}(x_{2}),\ldots,U_{\bm{\theta}_{d},\bm{\phi}_{d}}^{n_{d}}(x_{d})$ such that

\begin{align*}
\braket{0}{U_{\bm{\theta}_{1},\bm{\phi}_{1}}^{n_{1}}(x_{1})}{0} &= c_{\bm{n}}e^{in_{1}x_{1}},\\
\braket{0}{U_{\bm{\theta}_{2},\bm{\phi}_{2}}^{n_{2}}(x_{2})}{0} &= e^{in_{2}x_{2}},\\
&\cdots\\
\braket{0}{U_{\bm{\theta}_{d},\bm{\phi}_{d}}^{n_{d}}(x_{d})}{0} &= e^{in_{d}x_{d}},
\end{align*}

where the number of layers of each PQC is $L_{j}=n_{j}$ for $1\leq j\leq d$. We then define a $d$-qubit PQC as

\begin{equation}
U^{\bm{n}}(\bm{x})=\bigotimes_{j=1}^{d}U_{\bm{\theta}_{j},\bm{\phi}_{j}}^{n_{j}}(x_{j}), \tag{B.28}
\end{equation}

which gives

\begin{equation}
\bra{0}^{\otimes d}\!U^{\bm{n}}(\bm{x})\!\ket{0}^{\otimes d}=\prod_{j=1}^{d}\braket{0}{U_{\bm{\theta}_{j},\bm{\phi}_{j}}^{n_{j}}(x_{j})}{0}=c_{\bm{n}}e^{i\bm{n}\cdot\bm{x}}. \tag{B.29}
\end{equation}

Since $\lVert\bm{n}\rVert_{1}=\sum_{j=1}^{d}n_{j}\leq s$, we can conclude that the depth of $U^{\bm{n}}(\bm{x})$ is at most $6s+3$ and the number of parameters in $U^{\bm{n}}(\bm{x})$ is at most $4s+3d$. $\square$
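The factorization in (B.29) can be checked numerically. The following is a minimal sketch, not the construction of Corollary S4: it assumes diagonal stand-in unitaries (the hypothetical helper `phase_unitary`) whose $\bra{0}\cdot\ket{0}$ entry equals $e^{in_{j}x_{j}}$, takes $c_{\bm{n}}=1$, and uses arbitrary illustrative values for $\bm{n}$ and $\bm{x}$.

```python
import numpy as np
from functools import reduce


def phase_unitary(n_j, x_j):
    """Stand-in single-qubit unitary with <0|U|0> = exp(i * n_j * x_j).

    This is an illustrative assumption, not the data re-uploading PQC itself.
    """
    return np.diag([np.exp(1j * n_j * x_j), np.exp(-1j * n_j * x_j)])


n = np.array([2, 0, 1])          # illustrative frequency vector
x = np.array([0.3, -1.1, 0.7])   # illustrative input point

# Tensor product of the d single-qubit unitaries, as in (B.28).
factors = [phase_unitary(nj, xj) for nj, xj in zip(n, x)]
full = reduce(np.kron, factors)

# <0...0| U |0...0> is the top-left matrix entry.
amplitude = full[0, 0]
expected = np.exp(1j * np.dot(n, x))
print(np.isclose(amplitude, expected))
```

The top-left entry of a Kronecker product is the product of the top-left entries of its factors, which is exactly the product form used in (B.29).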

We can then apply the linear combination of unitaries (LCU) technique to the PQCs $U^{\bm{n}}(\bm{x})$ to implement the operator

\begin{equation}
U_{t}(\bm{x})\coloneqq\sum_{\lVert\bm{n}\rVert_{1}\leq s}U^{\bm{n}}(\bm{x}), \tag{B.30}
\end{equation}

so that we can implement the multivariate trigonometric polynomial as

\begin{equation}
\bra{+}^{\otimes d}\!U_{t}(\bm{x})\!\ket{+}^{\otimes d}=\sum_{\lVert\bm{n}\rVert_{1}\leq s}\bra{+}^{\otimes d}\!U^{\bm{n}}(\bm{x})\!\ket{+}^{\otimes d}=\sum_{\lVert\bm{n}\rVert_{1}\leq s}c_{\bm{n}}e^{i\bm{n}\cdot\bm{x}}=t(\bm{x}). \tag{B.31}
\end{equation}

Note that the number of terms in the summation is

\begin{equation}
\sum_{\lVert\bm{n}\rVert_{1}\leq s}1=\sum_{j=0}^{s}\sum_{\lVert\bm{n}\rVert_{1}=j}1\leq\sum_{j=0}^{s}d^{2s}\leq(s+1)d^{2s}. \tag{B.32}
\end{equation}
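The count in (B.32) can be compared with a brute-force enumeration for small $d$ and $s$. The sketch below is illustrative only. Whether the frequencies $\bm{n}$ are signed integers or nonnegative integers is an assumption exposed through the hypothetical `signed` flag, and the helper names are not from the paper. In the signed case with $d=1$ the count is $2s+1$, so the signed check is run for $d\geq 2$ only.

```python
from itertools import product


def count_frequencies(d, s, signed=True):
    """Count integer vectors n of length d with ||n||_1 <= s by enumeration.

    signed=True ranges over Z^d; signed=False ranges over nonnegative integers.
    """
    lo = -s if signed else 0
    return sum(
        1
        for n in product(range(lo, s + 1), repeat=d)
        if sum(abs(k) for k in n) <= s
    )


def term_bound(d, s):
    """Right-hand side of (B.32): (s + 1) * d^(2s)."""
    return (s + 1) * d ** (2 * s)


for d in (2, 3):
    for s in (1, 2, 3):
        print(d, s, count_frequencies(d, s), term_bound(d, s))
```

The enumeration is exponential in $d$ and is meant only as a sanity check on small instances; it shows how loose the bound $(s+1)d^{2s}$ is in practice.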

Then, we have the following proposition.

Proposition S7.

For any multivariate trigonometric polynomial $t(\bm{x})$ with $d$ variables and degree $s$ such that $\lvert t(\bm{x})\rvert\leq 1$ for $\bm{x}\in\mathbb{R}^{d}$, there exists a PQC $W_{tri}(\bm{x})$ such that

\begin{equation}
f_{W_{tri}}(\bm{x})\coloneqq\bra{0}W^{\dagger}_{tri}(\bm{x})Z^{(0)}W_{tri}(\bm{x})\ket{0}=t(\bm{x}), \tag{B.33}
\end{equation}

where $Z^{(0)}$ is the Pauli $Z$ observable on the first qubit. The width of the PQC is $O(d+\log s+s\log d)$, the depth is $O(s^{2}d^{2s}(\log s+s\log d))$, and the number of parameters is $O(sd^{2s}(s+d))$.

The proof is similar to that of Theorem 1. This result demonstrates the universal approximation property of PQCs from the perspective of multivariate Fourier series, yielding results similar to those of Schuld et al. [34]. Notably, the PQC in Proposition S7 has an explicit construction that requires no additional assumptions, and it improves on the implicit PQCs proposed in Schuld et al. [34] in terms of circuit size. For instance, to implement a $d$-variable Fourier series of degree $s$, the PQC with parallel structure in Schuld et al. [34] requires width $O(ds)$ and potentially exponential depth $O(4^{ds})$.

Appendix C Approximating continuous functions via PQCs

We have constructively shown in the previous section that PQCs can implement multivariate polynomials. To study the approximation capabilities of PQCs, a natural strategy is to aggregate multiple polynomials to approximate a continuous function, drawing on well-established principles from classical approximation theory. For univariate functions, this approach is guided by the Stone-Weierstrass Theorem [69]. For the multivariate case, we employ PQCs to implement Bernstein polynomials and then invoke established approximation error bounds for Bernstein polynomials [52, 53].

C.1 Established results of Bernstein polynomials approximation

For a $d$-variable continuous function $f\colon\mathbb{R}^{d}\to\mathbb{R}$, the multivariate Bernstein polynomial with degree $n\in\mathbb{N}$ of $f$ is defined as

\begin{equation}
B_{n}(f;\bm{x})\coloneqq\sum_{k_{1}=0}^{n}\cdots\sum_{k_{d}=0}^{n}f\Bigl(\frac{\bm{k}}{n}\Bigr)\prod_{j=1}^{d}\binom{n}{k_{j}}x_{j}^{k_{j}}(1-x_{j})^{n-k_{j}}, \tag{C.34}
\end{equation}

where $\bm{k}=(k_{1},\ldots,k_{d})\in\{0,\ldots,n\}^{d}$. Then, we have the following lemma on the approximation error bound of the Bernstein polynomial.

Lemma S8 (Bernstein polynomials approximation for Lipschitz functions [53]).

Let $f\colon[0,1]^{d}\to\mathbb{R}$ be a Lipschitz continuous function with Lipschitz constant $\ell$, that is, $|f(\bm{x})-f(\bm{y})|\leq\ell\|\bm{x}-\bm{y}\|_{\infty}$, and let $f$ be bounded by $\Gamma$. The approximation error of the degree-$n$ Bernstein polynomial of $f$ is bounded as

\begin{equation}
\lvert f(\bm{x})-B_{n}(f;\bm{x})\rvert\leq\varepsilon+2\Gamma\sum_{j=1}^{d}\binom{d}{j}\left(\frac{\ell^{2}}{4n\varepsilon^{2}}\right)^{j}\leq\varepsilon+2\Gamma\left(\left(1+\frac{\ell^{2}}{4n\varepsilon^{2}}\right)^{d}-1\right), \tag{C.35}
\end{equation}

where $\varepsilon>0$ is an arbitrarily small quantity.
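To make the statement concrete, the following sketch evaluates $B_{n}(f;\bm{x})$ from (C.34) by direct summation and compares the observed error with the right-hand side of (C.35) at one point. The test function, the values of $n$, $\varepsilon$, and $\bm{x}$, and the helper names `bernstein` and `lemma_bound` are all illustrative assumptions; a single numerical instance does not verify the lemma.

```python
import numpy as np
from itertools import product
from math import comb


def bernstein(f, n, x):
    """Evaluate B_n(f; x) from (C.34) by summing over all k in {0..n}^d."""
    x = np.asarray(x, dtype=float)
    d = x.size
    ks = np.arange(n + 1)
    binom = np.array([comb(n, k) for k in ks], dtype=float)
    # weights[j, k] = C(n, k) * x_j^k * (1 - x_j)^(n - k)
    weights = binom * x[:, None] ** ks * (1.0 - x[:, None]) ** (n - ks)
    total = 0.0
    for k in product(range(n + 1), repeat=d):
        w = np.prod([weights[j, k[j]] for j in range(d)])
        total += f(np.array(k) / n) * w
    return total


def lemma_bound(eps, ell, gamma, n, d):
    """Rightmost expression in (C.35)."""
    return eps + 2 * gamma * ((1 + ell**2 / (4 * n * eps**2)) ** d - 1)


# Illustrative target: Lipschitz constant 1 in the sup norm, bounded by 0.5 on [0, 1]^d.
f = lambda t: np.mean(np.abs(t - 0.5))
x0 = np.array([0.5, 0.5])
n, eps = 20, 0.25

err = abs(f(x0) - bernstein(f, n, x0))
bound = lemma_bound(eps, ell=1.0, gamma=0.5, n=n, d=x0.size)
print(err, bound)
```

Direct summation costs $(n+1)^{d}$ function evaluations, so this sketch is only practical for small $d$. Since $\varepsilon$ is a free parameter in (C.35), one can also scan over $\varepsilon$ to find the tightest value of the bound for a given $n$.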

Proof.

Motivated by the Lipschitz continuity of the target function $f$, we define $\delta=\varepsilon/\ell$. Consequently, for any two points $\bm{x}=(x_{1},\dots,x_{d})$ and $\bm{y}=(y_{1},\dots,y_{d})$ such that $|x_{i}-y_{i}|<\delta$ for all $i\in\{1,\dots,d\}$, it follows that $|f(\bm{x})-f(\bm{y})|\leq\varepsilon$. Since the Bernstein basis polynomials sum to one by the binomial theorem, the target function can be written as

\begin{align}
f(\bm{x}) &= f(x_{1},\dots,x_{d}) \tag{C.38}\\
&= f(x_{1},\cdots,x_{d})\sum_{k_{1}=0}^{n}\cdots\sum_{k_{d}=0}^{n}\prod_{i=1}^{d}\binom{n}{k_{i}}x_{i}^{k_{i}}(1-x_{i})^{n-k_{i}} \notag\\
&= \sum_{k_{1}=0}^{n}\cdots\sum_{k_{d}=0}^{n}f(x_{1},\cdots,x_{d})\prod_{i=1}^{d}\binom{n}{k_{i}}x_{i}^{k_{i}}(1-x_{i})^{n-k_{i}}. \tag{C.41}
\end{align}
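The rewriting above relies on the tensor-product Bernstein basis summing to one at every point of $[0,1]^{d}$. A short numerical sketch of that identity follows; the values of $n$ and $\bm{x}$ and the helper name `bernstein_basis` are illustrative assumptions.

```python
import numpy as np
from math import comb


def bernstein_basis(n, x):
    """Values of the n + 1 univariate Bernstein basis polynomials at x."""
    return np.array([comb(n, k) * x**k * (1 - x) ** (n - k) for k in range(n + 1)])


n = 6
x = np.array([0.1, 0.5, 0.9])  # illustrative point in [0, 1]^3

# Build the joint weights prod_i C(n, k_i) x_i^k_i (1 - x_i)^(n - k_i) for every k.
joint = bernstein_basis(n, x[0])
for xi in x[1:]:
    joint = np.multiply.outer(joint, bernstein_basis(n, xi))

total = joint.sum()
print(joint.shape, total)
```

Each univariate factor sums to $(x_{i}+(1-x_{i}))^{n}=1$, so the sum over all $\bm{k}$ of the product weights is also one.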

Let us consider the set $E=\prod_{i=1}^{d}\{0,1,\dots,n\}$, and for $j=1,2,\dots,d$, we define the sets

\begin{equation}
\Omega_{j}=\Bigl\{k_{j}\in\{0,1,\dots,n\}\colon\Bigl|\frac{k_{j}}{n}-x_{j}\Bigr|<\delta\Bigr\}\text{ and }F=E\setminus(\Omega_{1}\times\cdots\times\Omega_{d}). \tag{C.42}
\end{equation}
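For intuition, the sets in (C.42) can be enumerated for a small grid. The sketch below uses illustrative values of $n$, $\delta$, and $\bm{x}$ and a hypothetical helper `omega`. It builds $E$, the product $\Omega_{1}\times\cdots\times\Omega_{d}$ (called `near` here), and $F$, which together split the index set into a near part and a far part.

```python
from itertools import product


def omega(n, x_j, delta):
    """Indices k in {0..n} whose grid point k/n lies within delta of x_j."""
    return {k for k in range(n + 1) if abs(k / n - x_j) < delta}


n, delta = 10, 0.25
x = (0.3, 0.8)  # illustrative point in [0, 1]^2

E = set(product(range(n + 1), repeat=len(x)))
omegas = [omega(n, xj, delta) for xj in x]
near = set(product(*omegas))   # Omega_1 x ... x Omega_d
F = E - near                   # at least one coordinate is delta-far from x

print(len(E), len(near), len(F))
```

Every index in `near` has all coordinates within $\delta$ of $\bm{x}$, while every index in `F` has at least one coordinate at distance $\delta$ or more.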

Then, $F=\bigcup_{k=1}^{d}F_{k}$, with $F_{k}=\left\{\prod_{i=1}^{d}\Omega_{ik}^{[\alpha_{ik}]}\in F\colon\alpha_{ik}\in\{0,1\},\ \sum_{i=1}^{d}\alpha_{ik}=k\right\}$, where
\[
\Omega_{ik}^{[\alpha_{ik}]}=\begin{cases}\Omega_{i} & \text{if }\alpha_{ik}=0,\\ \Omega_{i}^{c} & \text{if }\alpha_{ik}=1,\end{cases}
\]
and $\Omega_{i}^{c}=\left\{k_{i}\in\{0,\cdots,n\}\colon\left|\frac{k_{i}}{n}-x_{i}\right|\geq\delta\right\}$. For $A_{k}=\prod_{i=1}^{d}\Omega_{ik}^{[\alpha_{ik}]}\in F_{k}$, $k=1,\cdots,d$, let us also define $I_{A_{k}}=\left\{i\in\{1,\cdots,d\}\colon\alpha_{ik}=1\right\}$ (that is, $\operatorname{card}(I_{A_{k}})=k\geq 1$). We have

\begin{align}
&\left|f(x_{1},\cdots,x_{d})-B_{n}(f;x_{1},\cdots,x_{d})\right| \tag{C.43}\\
={}&\Biggl|\sum_{k_{1}=0}^{n}\cdots\sum_{k_{d}=0}^{n}f(x_{1},\cdots,x_{d})\prod_{i=1}^{d}\binom{n}{k_{i}}x_{i}^{k_{i}}(1-x_{i})^{n-k_{i}} \notag\\
&\quad-\sum_{k_{1}=0}^{n}\cdots\sum_{k_{d}=0}^{n}f\left(\frac{k_{1}}{n},\cdots,\frac{k_{d}}{n}\right)\prod_{i=1}^{d}\binom{n}{k_{i}}x_{i}^{k_{i}}(1-x_{i})^{n-k_{i}}\Biggr| \notag\\
={}&\left|\sum_{k_{1}=0}^{n}\cdots\sum_{k_{d}=0}^{n}\left[f(x_{1},\cdots,x_{d})-f\left(\frac{k_{1}}{n},\cdots,\frac{k_{d}}{n}\right)\right]\prod_{i=1}^{d}\binom{n}{k_{i}}x_{i}^{k_{i}}(1-x_{i})^{n-k_{i}}\right| \notag\\
\leq{}&\sum_{k_{1}=0}^{n}\cdots\sum_{k_{d}=0}^{n}\left|f(x_{1},\cdots,x_{d})-f\left(\frac{k_{1}}{n},\cdots,\frac{k_{d}}{n}\right)\right|\prod_{i=1}^{d}\binom{n}{k_{i}}x_{i}^{k_{i}}(1-x_{i})^{n-k_{i}} \notag\\
\leq{}&\sum_{\Omega_{1}}\cdots\sum_{\Omega_{d}}\left|f(x_{1},\cdots,x_{d})-f\left(\frac{k_{1}}{n},\cdots,\frac{k_{d}}{n}\right)\right|\prod_{i=1}^{d}\binom{n}{k_{i}}x_{i}^{k_{i}}(1-x_{i})^{n-k_{i}} \notag\\
&+\sum_{F}\left|f(x_{1},\cdots,x_{d})-f\left(\frac{k_{1}}{n},\cdots,\frac{k_{d}}{n}\right)\right|\prod_{i=1}^{d}\binom{n}{k_{i}}x_{i}^{k_{i}}(1-x_{i})^{n-k_{i}}. \notag
\end{align}

Using the fact that $f$ is continuous and bounded, we get

\begin{equation}
\begin{aligned}
&\left|f\left(x_{1},\cdots,x_{d}\right)-B_{n}\left(f;x_{1},\cdots,x_{d}\right)\right| \\
\leq{}& \varepsilon\sum_{\Omega_{1}}\cdots\sum_{\Omega_{d}}\prod_{i=1}^{d}\binom{n}{k_{i}}x_{i}^{k_{i}}\left(1-x_{i}\right)^{n-k_{i}}+2\Gamma\sum_{F}\prod_{i=1}^{d}\binom{n}{k_{i}}x_{i}^{k_{i}}\left(1-x_{i}\right)^{n-k_{i}} \\
\leq{}& \varepsilon+2\Gamma\sum_{l=1}^{d}\sum_{A_{l}\in F_{l}}\prod_{i=1}^{d}\binom{n}{k_{i}}x_{i}^{k_{i}}\left(1-x_{i}\right)^{n-k_{i}} \\
\leq{}& \varepsilon+2\Gamma\sum_{l=1}^{d}\sum_{A_{l}\in F_{l}}\prod_{i\in I_{A_{l}}}\frac{1}{4n\delta^{2}} \\
={}& \varepsilon+2\Gamma\sum_{j=1}^{d}\binom{d}{j}\frac{1}{(4n\delta^{2})^{j}}\leq\varepsilon+2\Gamma\left(\left(1+\frac{\ell^{2}}{4n\varepsilon^{2}}\right)^{d}-1\right).
\end{aligned}
\tag{C.44}
\end{equation}

This completes the proof. A more detailed expansion of Eq. C.44 can be found in Theorem 2 of Foupouagnigni and Mouafo Wouodjié [53]. $\square$

Remark S2.

Here, it is important to observe that for a continuous target function $f(\bm{x})$ and any $\varepsilon>0$, there exists a $\delta>0$ such that

\begin{equation*}
\left|f\left(x_{1},\cdots,x_{d}\right)-B_{n}\left(f;x_{1},\cdots,x_{d}\right)\right|\leq\varepsilon+2\Gamma\left(\left(1+\frac{1}{4n\delta^{2}}\right)^{d}-1\right).
\end{equation*}

This expression signifies the convergence rate of the Bernstein polynomial for general continuous functions.
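The bound can be checked numerically on a small example. The following is a minimal sketch, assuming the tensor-product form of the Bernstein polynomial that appears in the sums above; the helper names (`bernstein`, `lipschitz_bound`, `sup_error_1d`) and the test function `kink` are hypothetical and introduced only for illustration.

```python
from itertools import product
from math import comb


def bernstein(f, n, x):
    """Evaluate the tensor-product Bernstein polynomial B_n(f; x) on [0,1]^d."""
    d = len(x)
    total = 0.0
    for k in product(range(n + 1), repeat=d):
        # Product of the d univariate Bernstein basis polynomials.
        weight = 1.0
        for k_i, x_i in zip(k, x):
            weight *= comb(n, k_i) * x_i**k_i * (1.0 - x_i) ** (n - k_i)
        total += f(tuple(k_i / n for k_i in k)) * weight
    return total


def lipschitz_bound(eps, n, ell, d, gamma):
    """Right-hand side eps + 2*Gamma*((1 + ell^2 / (4 n eps^2))^d - 1)."""
    return eps + 2.0 * gamma * ((1.0 + ell**2 / (4.0 * n * eps**2)) ** d - 1.0)


def sup_error_1d(f, n, grid=201):
    """Largest |f(x) - B_n(f; x)| over a uniform grid on [0, 1] (d = 1 only)."""
    pts = [i / (grid - 1) for i in range(grid)]
    return max(abs(f((x,)) - bernstein(f, n, (x,))) for x in pts)


def kink(x):
    # Hypothetical test function: Lipschitz constant 1 and |f| <= 1/2 on [0, 1].
    return abs(x[0] - 0.5)


if __name__ == "__main__":
    # Observed grid error next to the bound with eps = 0.3, ell = 1, Gamma = 1/2.
    for n in (10, 40, 160):
        print(n, sup_error_1d(kink, n), lipschitz_bound(0.3, n, 1.0, 1, 0.5))
```

In this sketch $\varepsilon$ is a free parameter of the bound, so in practice one would minimize the right-hand side over $\varepsilon$ for each $n$; the value 0.3 above is an arbitrary choice.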

C.2 Implement Bernstein polynomials via PQCs

In Lemma S8, we defined the Bernstein polynomial and bounded its approximation error for Lipschitz continuous functions. Guided by Theorem 1, we can construct a PQC that implements such a Bernstein polynomial.

Lemma S9.

For any $d$-variable Bernstein polynomial of degree $n\in\mathbb{N}$ defined in Eq. C.34 such that $\lvert B_{n}(f;\bm{x})\rvert\leq 1$ for $\bm{x}\in[0,1]^{d}$, there exists a PQC $W_{b}(\bm{x})$ satisfying

\begin{equation}
f_{W_{b}}(\bm{x})\coloneqq\bra{0}W^{\dagger}_{b}(\bm{x})Z^{(0)}W_{b}(\bm{x})\ket{0}=B_{n}(f;\bm{x}). \tag{C.45}
\end{equation}

The width of the PQC is $O(d\log{n})$, the depth is $O\bigl(dn^{d}\log{n}\bigr)$, and the number of parameters is $O(dn^{d})$.

Proof.

The proof of Lemma S9 proceeds in two steps. First, we construct PQCs that exactly represent $f\bigl(\frac{\bm{k}}{n}\bigr)\prod_{j=1}^{d}\binom{n}{k_{j}}x_{j}^{k_{j}}(1-x_{j})^{n-k_{j}}$ for all $\bm{k}\in\{0,1,\dots,n\}^{d}$. Second, we use LCU to combine these PQCs into the Bernstein polynomial described in Eq. C.34.

The univariate polynomial $x^{k}(1-x)^{n-k}$ can be represented by a PQC. The depth of this PQC is less than $2n+1$, the width is $2$, and the number of parameters is $n+2$. The multivariate polynomial $f\bigl(\frac{\bm{k}}{n}\bigr)\prod_{j=1}^{d}\binom{n}{k_{j}}x_{j}^{k_{j}}(1-x_{j})^{n-k_{j}}$ can be exactly represented by the product of the univariate polynomials $x^{k}(1-x)^{n-k}$. The same routine has been employed in Lemma S5. The depth of this PQC is less than $2n+1$, the width is $2d$, and the number of parameters is $d(n+2)$.

The number of terms in the summation in Eq. C.34 is $(n+1)^{d}$. We can employ the same routine as in Theorem 1 to construct the PQC $W_{b}(\bm{x})$. The depth of $W_{b}$ scales as

\begin{equation*}
O\Bigl(\bigl(d(n+1)^{d+1}\log{(n+1)}+d\bigr)\Bigr),
\end{equation*}

the width is $2d+d\log{(n+1)}$, and the number of parameters is $(n+1)^{d}(n+2)d$. The results presented in Lemma S9 can be obtained after simplification. $\square$
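To get a feel for how these counts grow, the expressions quoted in the proof can be tabulated directly. This is a minimal sketch: the function name `bernstein_pqc_resources` is hypothetical, and taking the logarithm in base 2 and rounding it up to an integer number of qubits is an assumption made here, not something stated in the proof.

```python
from math import ceil, log2


def bernstein_pqc_resources(n, d):
    """Tabulate the counts quoted in the proof of Lemma S9, before big-O simplification."""
    terms = (n + 1) ** d                  # summands in Eq. C.34, one PQC per term
    index_qubits = d * ceil(log2(n + 1))  # assumption: base-2 logarithm, rounded up
    return {
        "lcu_terms": terms,
        "width": 2 * d + index_qubits,        # 2d + d log(n+1)
        "params_per_term": d * (n + 2),       # parameters of one product PQC
        "params_total": terms * d * (n + 2),  # (n+1)^d (n+2) d
    }


if __name__ == "__main__":
    for n, d in [(3, 1), (3, 2), (7, 2), (7, 3)]:
        print(n, d, bernstein_pqc_resources(n, d))
```

The number of LCU terms grows as $(n+1)^{d}$, which is what drives the exponential dependence on $d$ in the depth and parameter counts.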

C.3 PQC approximating continuous functions

We have derived the approximation error between PQCs and Bernstein polynomials, and between Bernstein polynomials and continuous functions. Combining these results via the triangle inequality yields a universal approximation theorem and an error bound for PQCs.

Theorem 2 (The Universal Approximation Theorem of PQC).

For any continuous function $f\colon[0,1]^{d}\to[-1,1]$, given an $\varepsilon>0$, there exist an $n\in\mathbb{N}$ and a PQC $W_{b}(\bm{x})$ with width $O(d\log n)$, depth $O(dn^{d}\log n)$ and the number of trainable parameters $O(dn^{d})$ such that

\begin{equation}
\lvert f(\bm{x})-f_{W_{b}}(\bm{x})\rvert\leq\varepsilon \tag{C.46}
\end{equation}

for all $\bm{x}\in[0,1]^{d}$, where $f_{W_{b}}(\bm{x})\coloneqq\bra{0}W^{\dagger}_{b}(\bm{x})Z^{(0)}W_{b}(\bm{x})\ket{0}$.

Proof.

Remark S2 establishes that the Bernstein polynomial $B_{n}(f;\bm{x})$ converges uniformly to any continuous function $f$ on the cube $[0,1]^{d}$, i.e., $B_{n}(f;\bm{x})\rightarrow f(\bm{x})$ as $n\rightarrow+\infty$. Hence, for the given $\varepsilon$, there is an $n$ such that $\lvert f(\bm{x})-B_{n}(f;\bm{x})\rvert\leq\varepsilon$ for all $\bm{x}\in[0,1]^{d}$. Since $f$ takes values in $[-1,1]$ and the Bernstein basis polynomials are nonnegative and sum to one, $\lvert B_{n}(f;\bm{x})\rvert\leq 1$, so Lemma S9 applies and $B_{n}(f;\bm{x})$ is implemented exactly by $f_{W_{b}}(\bm{x})$. The depth of the PQC $W_{b}(\bm{x})$ is $O\bigl(dn^{d}\log{n}\bigr)$, the width is $O(d\log{n})$, and the number of parameters is $O(dn^{d})$. This completes the proof. $\square$

Theorem 3.

Given a Lipschitz continuous function $f\colon[0,1]^{d}\to[-1,1]$ with a Lipschitz constant $\ell$, for any $\varepsilon>0$ and $n\in\mathbb{N}$, there exists a PQC $W_{b}(\bm{x})$ such that $f_{W_{b}}(\bm{x})\coloneqq\bra{0}W^{\dagger}_{b}(\bm{x})Z^{(0)}W_{b}(\bm{x})\ket{0}$ satisfies

\begin{equation}
\lvert f(\bm{x})-f_{W_{b}}(\bm{x})\rvert\leq\varepsilon+2\biggl(\Bigl(1+\frac{\ell^{2}}{n\varepsilon^{2}}\Bigr)^{d}-1\biggr)\leq\varepsilon+d2^{d}\frac{\ell^{2}}{n\varepsilon^{2}} \tag{C.47}
\end{equation}

for all $\bm{x}\in[0,1]^{d}$. The width of the PQC is $O(d\log n)$, the depth is $O\bigl(dn^{d}\log{n}\bigr)$, and the number of parameters is $O(dn^{d})$.

Proof.

Lemma S8 has established the uniform convergence rate of the Bernstein polynomial towards any Lipschitz continuous function on the cube $[0,1]^{d}$. We know that for any Lipschitz continuous function $f(\bm{x})$ with Lipschitz constant $\ell$, there exists a Bernstein polynomial $B_{n}(f;\bm{x})$ satisfying

\begin{equation*}
\lvert f(\bm{x})-B_{n}(f;\bm{x})\rvert\leq\varepsilon+2\Gamma\sum_{j=1}^{d}\binom{d}{j}\left(\frac{\ell^{2}}{4n\varepsilon^{2}}\right)^{j}\leq\varepsilon+2\Gamma\left(\left(1+\frac{\ell^{2}}{4n\varepsilon^{2}}\right)^{d}-1\right).
\end{equation*}

Building on Lemma S9, we can implement this Bernstein polynomial $B_{n}(f;\bm{x})$ using $f_{W_{b}}(\bm{x})$. The depth of the PQC $W_{b}(\bm{x})$ is $O\bigl(dn^{d}\log{n}\bigr)$, the width is $O(d\log{n})$, and the number of parameters is $O(dn^{d})$. This completes the proof. $\square$
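The two right-hand sides of Eq. C.47 can be compared numerically, and the looser one can be inverted to estimate a sufficient degree $n$. The sketch below uses hypothetical helper names (`tight_bound`, `loose_bound`, `sufficient_degree`). It also relies on an observation that is not spelled out in the text: writing $t=\ell^{2}/(n\varepsilon^{2})$, the second inequality follows from $(1+t)^{d}-1\leq d\,2^{d-1}t$, which holds when $t\leq 1$.

```python
from math import ceil


def tight_bound(eps, n, ell, d):
    """First right-hand side of Eq. C.47: eps + 2((1 + t)^d - 1), t = ell^2/(n eps^2)."""
    t = ell**2 / (n * eps**2)
    return eps + 2.0 * ((1.0 + t) ** d - 1.0)


def loose_bound(eps, n, ell, d):
    """Second right-hand side of Eq. C.47: eps + d 2^d t."""
    t = ell**2 / (n * eps**2)
    return eps + d * 2**d * t


def sufficient_degree(eps, ell, d, target):
    """Smallest n (up to rounding) with loose_bound <= target; needs target > eps."""
    if target <= eps:
        raise ValueError("target must exceed eps")
    return max(1, ceil(d * 2**d * ell**2 / (eps**2 * (target - eps))))


if __name__ == "__main__":
    eps, ell, d = 0.1, 1.0, 2
    n = sufficient_degree(eps, ell, d, target=0.2)
    print(n, tight_bound(eps, n, ell, d), loose_bound(eps, n, ell, d))
```

Solving the loose bound for $n$ gives $n\geq d2^{d}\ell^{2}/\bigl(\varepsilon^{2}(\text{target}-\varepsilon)\bigr)$, so halving the accuracy parameters increases the required degree roughly eightfold in this sketch.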

Appendix D Approximating smooth functions via nested PQCs

Besides using a Bernstein polynomial to approximate a continuous function globally, we can also use local polynomials to achieve a piecewise approximation. To do this, we follow the path of classical deep neural networks [18, 21, 25], using a multivariate Taylor series expansion to approximate a multivariate smooth function $f$ in a small local region. Let $\beta=s+r>0$ with $r\in(0,1]$ and $s=\lfloor\beta\rfloor\in\mathbb{N}$. For a finite constant $B_{0}>0$, the $\beta$-Hölder class of functions $\mathcal{H}^{\beta}([0,1]^{d},B_{0})$ is defined as

\begin{equation}
\mathcal{H}^{\beta}([0,1]^{d},B_{0})=\Bigl\{f\colon[0,1]^{d}\to\mathbb{R},\ \max_{\lVert\bm{\alpha}\rVert_{1}\leq s}\lVert\partial^{\bm{\alpha}}f\rVert_{\infty}\leq B_{0},\ \max_{\lVert\bm{\alpha}\rVert_{1}=s}\sup_{\bm{x}\neq\bm{y}}\frac{\lvert\partial^{\bm{\alpha}}f(\bm{x})-\partial^{\bm{\alpha}}f(\bm{y})\rvert}{\lVert\bm{x}-\bm{y}\rVert_{2}^{r}}\leq B_{0}\Bigr\}, \tag{D.48}
\end{equation}

where $\partial^{\bm{\alpha}}=\partial^{\alpha_{1}}\cdots\partial^{\alpha_{d}}$ for $\bm{\alpha}=(\alpha_{1},\ldots,\alpha_{d})\in\mathbb{N}^{d}$. By definition, for a function $f\in\mathcal{H}^{\beta}([0,1]^{d},B_{0})$, when $\beta\in(0,1)$, $f$ is a Hölder continuous function with order $\beta$ and Hölder constant $B_{0}$; when $\beta=1$, $f$ is a Lipschitz function with Lipschitz constant $B_{0}$; when $\beta>1$, $f$ belongs to the $C^{s}$ class of functions whose $s$-th partial derivatives exist and are bounded.

We utilize the following lemma on the Taylor expansion of $\beta$-Hölder functions as a mathematical tool for constructing and analyzing the PQC approximation.

Lemma S10 ([18]).

Given a function $f\in\mathcal{H}^{\beta}([0,1]^{d},1)$ with $\beta=r+s$, $r\in(0,1]$ and $s\in\mathbb{N}^{+}$, for any $\bm{x},\bm{x_{0}}\in[0,1]^{d}$, we have

\begin{equation}
\Big\lvert f(\bm{x})-\sum_{\lVert\bm{\alpha}\rVert_{1}\leq s}\frac{\partial^{\bm{\alpha}}f(\bm{x_{0}})}{\bm{\alpha}!}(\bm{x}-\bm{x_{0}})^{\bm{\alpha}}\Big\rvert\leq d^{s}\lVert\bm{x}-\bm{x_{0}}\rVert^{\beta}_{2}, \tag{D.49}
\end{equation}

where $\bm{\alpha}!=\alpha_{1}!\cdots\alpha_{d}!$.
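As a sanity check, the inequality in Eq. D.49 can be evaluated on a concrete function. The following is a minimal sketch for $d=2$, $s=1$, $r=1$ (so $\beta=2$); the example function $\sin((x+y)/2)$ and all helper names are hypothetical choices made here. Its first partial derivatives are bounded by $1/2$ and are Lipschitz with constant at most $\sqrt{2}/4$, so it lies in the class with $B_{0}=1$.

```python
from math import cos, sin, sqrt


def f(x, y):
    # Hypothetical example function with all relevant derivative bounds below 1.
    return sin((x + y) / 2.0)


def taylor_order1(x, y, x0, y0):
    """Sum over |alpha|_1 <= 1 of d^alpha f(x0) / alpha! * (x - x0)^alpha."""
    g = cos((x0 + y0) / 2.0) / 2.0  # both first partial derivatives at (x0, y0)
    return f(x0, y0) + g * (x - x0) + g * (y - y0)


def remainder_and_bound(x, y, x0, y0, d=2, s=1, beta=2.0):
    """Return (|f - Taylor polynomial|, d^s * ||x - x0||_2^beta) as in Eq. D.49."""
    remainder = abs(f(x, y) - taylor_order1(x, y, x0, y0))
    dist = sqrt((x - x0) ** 2 + (y - y0) ** 2)
    return remainder, d**s * dist**beta


if __name__ == "__main__":
    for x, y in [(0.5, 0.5), (0.6, 0.4), (1.0, 1.0), (0.0, 1.0)]:
        print((x, y), remainder_and_bound(x, y, 0.5, 0.5))
```

The remainder shrinks quadratically as $\bm{x}$ approaches $\bm{x_{0}}$, which is why the expansion is only useful on small cells.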

Next, we show how to construct PQCs to implement the Taylor expansion of $\beta$-Hölder functions.

D.1 Localization via PQC

As shown in Eq. D.49, the truncated Taylor expansion of a multivariate smooth function is accurate only in a small neighborhood of the expansion point. So we first need to localize the entire region $[0,1]^{d}$. Given $K\in\mathbb{N}$ and $\Delta\in(0,\frac{1}{3K})$, for each $\bm{\eta}=(\eta_{1},\ldots,\eta_{d})\in\{0,1,\ldots,K-1\}^{d}$, we define

\begin{equation}
Q_{\bm{\eta}}\coloneqq\Bigl\{\bm{x}=(x_{1},\ldots,x_{d})\colon x_{i}\in\bigl[\frac{\eta_{i}}{K},\frac{\eta_{i}+1}{K}-\Delta\cdot 1_{\eta_{i}<K-1}\bigr]\Bigr\}. \tag{D.50}
\end{equation}

By the definition of $Q_{\bm{\eta}}$, the region $[0,1]^{d}$ is approximately divided into small hypercubes $\bigcup_{\bm{\eta}}Q_{\bm{\eta}}$ and a trifling region $\Lambda(d,K,\Delta)\coloneqq[0,1]^{d}\setminus(\bigcup_{\bm{\eta}}Q_{\bm{\eta}})$, as illustrated in Fig. 2 in the main text. We then need to construct a PQC that maps any $\bm{x}\in Q_{\bm{\eta}}$ to the fixed point $\bm{x}_{\bm{\eta}}=\frac{\bm{\eta}}{K}\in Q_{\bm{\eta}}$, i.e., one that approximates the piecewise-constant function $D(\bm{x})=\frac{\bm{\eta}}{K}$ if $\bm{x}\in Q_{\bm{\eta}}$, where $\frac{\bm{\eta}}{K}=(\eta_{1}/K,\ldots,\eta_{d}/K)$. We consider the case of $d=1$, where the localization function is

\begin{equation}
D(x)=\frac{k}{K},\qquad\text{if $x\in\Bigl[\frac{k}{K},\frac{k+1}{K}-\Delta\cdot 1_{k<K-1}\Bigr]$ for $k=0,1,\ldots,K-1$}. \tag{D.51}
\end{equation}

The multivariate case follows easily by applying $D(x)$ to each variable $x_{j}$. The idea is to find a polynomial that approximates the sign function

$$\operatorname{sgn}(x-c)=\begin{cases}1,&\text{if $x>c$,}\\ 0,&\text{if $x=c$,}\\ -1,&\text{if $x<c$}\end{cases}, \tag{D.52}$$

as shown in the following lemma.

Lemma S11 (Polynomial approximation to the sign function $\operatorname{sgn}(x-c)$ [54]).

For all $c\in[-1,1]$, $\Delta>0$, and $\varepsilon\in(0,1)$, there exists an odd polynomial $P_{\Delta,\varepsilon}(x)$ of degree $n=O(\frac{1}{\Delta}\log\frac{1}{\varepsilon})$ that satisfies

  1. $\lvert P_{\Delta,\varepsilon}(x-c)\rvert\leq 1$ for all $x\in[-1,1]$,

  2. $\lvert\operatorname{sgn}(x-c)-P_{\Delta,\varepsilon}(x-c)\rvert\leq\varepsilon$ for all $x\in[-1,1]\setminus(c-\frac{\Delta}{2},c+\frac{\Delta}{2})$.
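As a rough numerical illustration of Lemma S11 for $c=0$, the sketch below follows a route that is common in the quantum signal processing literature; it is not claimed to be the construction of [54]. It replaces $\operatorname{sgn}(x)$ by the smooth surrogate $\operatorname{erf}(kx)$, for which $\lvert\operatorname{sgn}(x)-\operatorname{erf}(kx)\rvert=\operatorname{erfc}(k\lvert x\rvert)\leq e^{-(kx)^{2}}$, and then interpolates the surrogate by a Chebyshev polynomial. The function name `sign_polynomial`, the choice of $k$, and the hand-picked degree are assumptions made only for this example.

```python
import math
import numpy as np
from numpy.polynomial import Chebyshev

def sign_polynomial(delta, eps, degree):
    """Odd polynomial surrogate for sgn(x) on [-1, 1] (illustrative only)."""
    # erfc(z) <= exp(-z**2), so this k gives |sgn(x) - erf(k*x)| <= eps/2
    # whenever |x| >= delta/2.
    k = (2.0 / delta) * math.sqrt(math.log(2.0 / eps))
    surrogate = np.vectorize(lambda x: math.erf(k * x))
    # Chebyshev interpolation of the surrogate; the degree is chosen
    # generously by hand, not from the O((1/delta) log(1/eps)) bound.
    return Chebyshev.interpolate(surrogate, degree), k

delta, eps = 0.2, 1e-2
poly, k = sign_polynomial(delta, eps, degree=201)

xs = np.linspace(-1.0, 1.0, 4001)
outside_gap = np.abs(xs) >= delta / 2
max_err = np.max(np.abs(np.sign(xs[outside_gap]) - poly(xs[outside_gap])))
# An odd function has (numerically) vanishing even Chebyshev coefficients.
even_coeff = np.max(np.abs(poly.coef[0::2]))

print(f"k = {k:.2f}, max error outside the gap = {max_err:.2e}")
print(f"largest even Chebyshev coefficient = {even_coeff:.2e}")
```

The even coefficients vanish up to rounding, which reflects the odd parity required in the lemma. The error is only controlled outside the gap $(-\frac{\Delta}{2},\frac{\Delta}{2})$, where the polynomial transitions between $-1$ and $1$.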

Note that we can also approximate the step function $\operatorname{stp}(x-c)\coloneqq\frac{1}{2}\operatorname{sgn}(x-c)+\frac{1}{2}$ by the polynomial $P_{\Delta,\varepsilon}^{\prime}(x-c)=\frac{1}{2}P_{\Delta,\varepsilon}(x-c)+\frac{1}{2}$ of degree $n=O(\frac{1}{\Delta}\log\frac{1}{\varepsilon})$, which satisfies $\lvert P_{\Delta,\varepsilon}^{\prime}(x-c)\rvert\leq 1$ for all $x\in[-1,1]$ and $\lvert\operatorname{stp}(x-c)-P_{\Delta,\varepsilon}^{\prime}(x-c)\rvert\leq\frac{\varepsilon}{2}$ for all $x\in[-1,1]\setminus(c-\frac{\Delta}{2},c+\frac{\Delta}{2})$.
However, the polynomial $P_{\Delta,\varepsilon}^{\prime}(x-c)$ does not have definite parity and thus cannot be implemented directly by the PQC construction of Corollary S2. Since only the domain $[0,1]$ is relevant to $x$, for $c\in(0,1)$ we can define an even polynomial

$$P_{c,\Delta,\varepsilon}^{\text{even}}(x)=\frac{1}{1+\frac{\varepsilon}{2}}\left(P_{\Delta,\varepsilon}^{\prime}(x-c)+P_{\Delta,\varepsilon}^{\prime}(-x-c)\right) \tag{D.53}$$

such that $\lvert P_{c,\Delta,\varepsilon}^{\text{even}}(x)\rvert\leq 1$ for all $x\in[-1,1]$ and $\lvert\operatorname{stp}(x-c)-P_{c,\Delta,\varepsilon}^{\text{even}}(x)\rvert\leq\frac{\varepsilon}{2}$ for all $x\in[0,1]\setminus(c-\frac{\Delta}{2},c+\frac{\Delta}{2})$. The piecewise-constant function $D(x)$ can be written as a combination of step functions,

$$D(x)=\sum_{k=1}^{K-1}\frac{1}{K}\operatorname{stp}\bigl(x-\frac{k}{K}+\frac{\Delta}{2}\bigr). \tag{D.54}$$
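The decomposition in Eq. D.54 can be checked numerically with exact step functions. The following is a minimal sketch; the helper names `stp`, `localization_exact`, and `localization_steps` and the values of $K$ and $\Delta$ are illustrative choices, not part of the original construction. Points in the gaps between the intervals of Eq. D.51 are skipped, since $D(x)$ is not specified there.

```python
import numpy as np

def stp(t):
    """Step function stp(t) = (sgn(t) + 1) / 2, so stp(0) = 1/2."""
    return 0.5 * np.sign(t) + 0.5

def localization_exact(x, K, delta):
    """D(x) = k/K on [k/K, (k+1)/K - delta*1_{k<K-1}] (Eq. D.51); NaN in the gaps."""
    for k in range(K):
        right = (k + 1) / K - (delta if k < K - 1 else 0.0)
        if k / K <= x <= right:
            return k / K
    return float("nan")

def localization_steps(x, K, delta):
    """Right-hand side of Eq. D.54: K-1 shifted step functions, each weighted by 1/K."""
    return sum(stp(x - k / K + delta / 2) for k in range(1, K)) / K

K, delta = 4, 0.05  # delta < 1/(3K), as assumed in Lemma S12
xs = np.linspace(0.0, 1.0, 1001)
checked = 0
for x in xs:
    d = localization_exact(x, K, delta)
    if not np.isnan(d):
        assert abs(d - localization_steps(x, K, delta)) < 1e-12
        checked += 1
print(f"Eq. D.54 matches D(x) on {checked} of {len(xs)} grid points")
```

Each step is centered at $\frac{k}{K}-\frac{\Delta}{2}$, the midpoint of the gap $(\frac{k}{K}-\Delta,\frac{k}{K})$, so a polynomial that is accurate only outside a window of width $\Delta$ around the jump is accurate on all of the localized intervals.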

We can then find even polynomials $P_{c,\Delta,\varepsilon}^{\text{even}}(x)$ that approximate $\operatorname{stp}\bigl(x-\frac{k}{K}+\frac{\Delta}{2}\bigr)$ for each $k$. Combining these polynomials as in Eq. D.54, we have the following lemma.

Lemma S12.

Given $K\in\mathbb{N}$ and $\Delta\in(0,\frac{1}{3K})$, there exists an even polynomial $P_{\Delta,\varepsilon}(x)$ of degree $n=O(\frac{1}{\Delta}\log\frac{K}{\varepsilon})$ that satisfies

  1. $\lvert P_{\Delta,\varepsilon}(x)\rvert\leq 1$ for all $x\in[-1,1]$,

  2. $\lvert D(x)-P_{\Delta,\varepsilon}(x)\rvert\leq\varepsilon$ for all $x\in\bigcup_{k=0}^{K-1}\bigl[\frac{k}{K},\frac{k+1}{K}-\Delta\cdot 1_{k<K-1}\bigr]$.

Note that we can shift the polynomial $P_{\Delta,\varepsilon}(x)$ such that $P_{\Delta,\varepsilon}(x)-D(x)\in(0,\varepsilon)$ without changing the degree. It follows that we can construct a PQC to implement the polynomial $P_{\Delta,\varepsilon}(x)$ by Corollary S2.

Corollary S13.

Given $K\in\mathbb{N}$, $\Delta\in(0,\frac{1}{3K})$ and $\varepsilon\in(0,\frac{1}{K})$, there exists a single-qubit PQC $U_{D}(x)$ of depth $O(\frac{1}{\Delta}\log\frac{K}{\varepsilon})$ that satisfies

$$\langle +\rvert U_{D}(x)\lvert +\rangle-\frac{k}{K}\in(0,\varepsilon)\quad\text{if $x\in\Bigl[\frac{k}{K},\frac{k+1}{K}-\Delta\cdot 1_{k<K-1}\Bigr]$ for $k=0,1,\ldots,K-1$}. \tag{D.55}$$

Note that $\varepsilon$ has to be bounded by $\frac{1}{K}$, which is the length of the localized region. We can further implement such a localization procedure for $\bm{x}=(x_{1},\ldots,x_{d})$ on the region $[0,1]^{d}$ by applying the PQC to each $x_{j}$, as stated in the following lemma.

Lemma S14 (Localization via PQC).

Given $K\in\mathbb{N}$, $\Delta\in(0,\frac{1}{3K})$ and $\varepsilon\in(0,\frac{1}{K})$, there exists a PQC $W_{D}(\bm{x})$ of width $O(d)$ and depth $O(\frac{1}{\Delta}\log\frac{K}{\varepsilon})$ implementing a localization function $f_{W_{D}}(\bm{x}):\mathbb{R}^{d}\to\mathbb{R}^{d}$ such that

$$\bm{0}\leq f_{W_{D}}(\bm{x})-\frac{\bm{\eta}}{K}\leq\bm{\varepsilon}\quad\text{if $\bm{x}\in Q_{\bm{\eta}}$,} \tag{D.56}$$

where $\bm{0}=(0,\ldots,0)$ and $\bm{\varepsilon}=(\varepsilon,\ldots,\varepsilon)$ are $d$-dimensional vectors.

Proof.

We construct a $d$-qubit PQC $W_{D}(\bm{x})\coloneqq\bigotimes_{j=1}^{d}U_{D}(x_{j})$, where the single-qubit PQC $U_{D}(x)$ is constructed in Corollary S13. We then apply the Hadamard test to each $U_{D}(x_{j})$ to obtain $f_{U_{D}}(x_{j})\coloneqq\langle +\rvert U_{D}(x_{j})\lvert +\rangle$. Let $f_{W_{D}}(\bm{x})\coloneqq(f_{U_{D}}(x_{1}),\ldots,f_{U_{D}}(x_{d}))$, which implements the localization function as required. $\square$

D.2 Implementing the Taylor coefficients by PQC

Next, we use a PQC to implement the Taylor coefficients, which is essentially a point-fitting problem. For each $\bm{\eta}=(\eta_{1},\ldots,\eta_{d})\in\{0,1,\ldots,K-1\}^{d}$ and $\bm{\alpha}$, we denote $\xi_{\bm{\eta},\bm{\alpha}}\coloneqq\frac{\partial^{\bm{\alpha}}f(\frac{\bm{\eta}}{K})}{\bm{\alpha}!}\in[-1,1]$. We can then construct the following PQC,

$$U_{co}^{\bm{\alpha}}=\sum_{\bm{\eta}}\lvert\bm{\eta}\rangle\!\langle\bm{\eta}\rvert\otimes R_{X}(\theta_{\bm{\eta},\bm{\alpha}}), \tag{D.57}$$

where $\ket{\bm{\eta}}=\ket{\eta_{1}}\otimes\cdots\otimes\ket{\eta_{d}}$ and $\theta_{\bm{\eta},\bm{\alpha}}=2\arccos(\xi_{\bm{\eta},\bm{\alpha}})$. This gives the following lemma.

Lemma S15.

Given a $\beta$-Hölder smooth function $f:[0,1]^{d}\to[-1,1]$, for any $\bm{\alpha}\in\mathbb{N}^{d}$ and $\bm{\eta}\in\{0,1,\ldots,K-1\}^{d}$, there exists a PQC $U_{co}^{\bm{\alpha}}$ such that

$$\bra{\bm{\eta},0}U_{co}^{\bm{\alpha}}\ket{\bm{\eta},0}=\xi_{\bm{\eta},\bm{\alpha}}. \tag{D.58}$$

The width of the PQC is $O(d\log K)$, and the depth is $O(K^{d})$.

We note that the state $\ket{\bm{\eta}}$ can be prepared using basis encoding according to the results of localization in Lemma S14.
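The angle choice $\theta_{\bm{\eta},\bm{\alpha}}=2\arccos(\xi_{\bm{\eta},\bm{\alpha}})$ can be checked with a small matrix computation. The sketch below assumes the standard convention $R_{X}(\theta)=e^{-i\theta X/2}$, so that $\bra{0}R_{X}(\theta)\ket{0}=\cos(\theta/2)$; the coefficient values and the helper names `rx` and `coefficient_unitary` are hypothetical, and the grid index $\bm{\eta}$ is flattened to a single integer.

```python
import numpy as np

def rx(theta):
    """Single-qubit rotation R_X(theta) = exp(-i * theta * X / 2) (assumed convention)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def coefficient_unitary(xis):
    """Block-diagonal sum_eta |eta><eta| (x) R_X(2*arccos(xi_eta)), as in Eq. D.57."""
    n = len(xis)
    U = np.zeros((2 * n, 2 * n), dtype=complex)
    for eta, xi in enumerate(xis):
        U[2 * eta:2 * eta + 2, 2 * eta:2 * eta + 2] = rx(2 * np.arccos(xi))
    return U

# Hypothetical coefficients xi_eta in [-1, 1], one per (flattened) grid point eta.
xis = np.array([0.7, -0.2, 0.0, 1.0])
U_co = coefficient_unitary(xis)

# <eta, 0| U_co |eta, 0> is the diagonal entry at index 2*eta.
recovered = np.real(np.diag(U_co)[0::2])
is_unitary = np.allclose(U_co.conj().T @ U_co, np.eye(len(U_co)))
print(recovered, is_unitary)
```

In this dense-matrix picture there is one rotation block per grid point, i.e. $K^{d}$ blocks in total, which is in line with the $O(K^{d})$ depth stated in Lemma S15.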

D.3 Implementing multivariate Taylor series by PQC

To implement the multivariate Taylor expansion of a function, we first build a PQC to represent a single term in the Taylor series, which could be done by combining the monomial implementation in Lemma S5 and the Taylor coefficient implementation in Lemma S15. Thus, we have the following corollary.

Corollary S16.

For any $\beta$-Hölder smooth function $f$, given an $\bm{\alpha}\in\mathbb{N}^{d}$ with $\lVert\bm{\alpha}\rVert_{1}\leq s$ for $s\in\mathbb{N}^{+}$ and an $\bm{\eta}\in\{0,1,\ldots,K-1\}^{d}$, there exists a PQC $U^{\bm{\alpha}}_{\bm{\eta}}(\bm{x})$ such that

$$\bra{\bm{\eta},0}\!\bra{+}^{\otimes d}U^{\bm{\alpha}}_{\bm{\eta}}(\bm{x})\ket{\bm{\eta},0}\!\ket{+}^{\otimes d}=\frac{\partial^{\bm{\alpha}}f(\frac{\bm{\eta}}{K})}{\bm{\alpha}!}\bigl(\bm{x}-\frac{\bm{\eta}}{K}\bigr)^{\bm{\alpha}}. \tag{D.59}$$

The width of the PQC is $O(d\log K)$, the depth is $O(K^{d}+s)$, and the number of parameters is at most $K^{d}+s+d$.

Proof.

Let $U^{\bm{\alpha}}_{\bm{\eta}}(\bm{x})\coloneqq U_{co}^{\bm{\alpha}}\otimes U^{\bm{\alpha}}(\bm{x}-\frac{\bm{\eta}}{K})$, where $U_{co}^{\bm{\alpha}}$ is defined in Lemma S15 and $U^{\bm{\alpha}}(\bm{x}-\frac{\bm{\eta}}{K})$ is defined in Lemma S5 with the input changed from $\bm{x}$ to $\bm{x}-\frac{\bm{\eta}}{K}$. The corollary then follows directly from Lemma S5 and Lemma S15. $\square$
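The factorization behind this proof can be written out explicitly. The following is a sketch that assumes Lemma S5 (not reproduced in this section) yields $\bra{+}^{\otimes d}U^{\bm{\alpha}}(\bm{y})\ket{+}^{\otimes d}=\bm{y}^{\bm{\alpha}}$, which is what Eq. D.59 suggests. Because the operator and the state are both tensor products over the two registers, the expectation value factorizes:

$$\bra{\bm{\eta},0}\!\bra{+}^{\otimes d}\Bigl(U_{co}^{\bm{\alpha}}\otimes U^{\bm{\alpha}}\bigl(\bm{x}-\tfrac{\bm{\eta}}{K}\bigr)\Bigr)\ket{\bm{\eta},0}\!\ket{+}^{\otimes d}=\bra{\bm{\eta},0}U_{co}^{\bm{\alpha}}\ket{\bm{\eta},0}\cdot\bra{+}^{\otimes d}U^{\bm{\alpha}}\bigl(\bm{x}-\tfrac{\bm{\eta}}{K}\bigr)\ket{+}^{\otimes d}=\xi_{\bm{\eta},\bm{\alpha}}\bigl(\bm{x}-\tfrac{\bm{\eta}}{K}\bigr)^{\bm{\alpha}},$$

where the first factor is given by Eq. D.58 and the second by the assumed form of Lemma S5.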

The next step is to combine the single Taylor terms to implement the truncated Taylor expansion of the target function. The method is in the same spirit as the one used in Theorem 1, i.e., using LCU to achieve the following (unnormalized) operator,

$$U_{t}(\bm{x})\coloneqq\sum_{\lVert\bm{\alpha}\rVert_{1}\leq s}U^{\bm{\alpha}}_{\bm{\eta}}(\bm{x}). \tag{D.60}$$

Then we can implement the Taylor expansion of the function $f$ at the point $\frac{\bm{\eta}}{K}$ as

$$\bra{\bm{\eta},0}\!\bra{+}^{\otimes d}U_{t}(\bm{x})\ket{\bm{\eta},0}\!\ket{+}^{\otimes d}=\sum_{\lVert\bm{\alpha}\rVert_{1}\leq s}\frac{\partial^{\bm{\alpha}}f(\frac{\bm{\eta}}{K})}{\bm{\alpha}!}\bigl(\bm{x}-\frac{\bm{\eta}}{K}\bigr)^{\bm{\alpha}}. \tag{D.61}$$
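To make the sum over multi-indices concrete, the sketch below enumerates all $\bm{\alpha}\in\mathbb{N}^{d}$ with $\lVert\bm{\alpha}\rVert_{1}\leq s$ (there are $\binom{s+d}{d}$ of them) and evaluates the local truncated Taylor expansion classically. The target $f(\bm{x})=\exp(x_{1}+\cdots+x_{d}-d)$ is a toy choice made here because every partial derivative equals $f$ itself; it is not taken from the paper, and the helper names are hypothetical. The grid point is chosen as $\eta_{i}=\lfloor Kx_{i}\rfloor$, ignoring the $\Delta$-gaps.

```python
from itertools import product
from math import comb, exp, factorial, floor, prod

def multi_indices(d, s):
    """All alpha in N^d with |alpha|_1 <= s (brute-force enumeration)."""
    return [a for a in product(range(s + 1), repeat=d) if sum(a) <= s]

def local_taylor(x, K, s):
    """Truncated Taylor expansion of the toy f(x) = exp(sum(x) - d) at eta/K."""
    d = len(x)
    base = [floor(K * xi) / K for xi in x]  # grid point eta/K with eta_i = floor(K*x_i)
    f_base = exp(sum(base) - d)             # every partial derivative of this f equals f
    total = 0.0
    for alpha in multi_indices(d, s):
        alpha_fact = prod(factorial(a) for a in alpha)
        monomial = prod((xi - bi) ** a for xi, bi, a in zip(x, base, alpha))
        total += f_base / alpha_fact * monomial
    return total

x, s = (0.3, 0.6), 2
f_exact = exp(sum(x) - len(x))
errors = {K: abs(local_taylor(x, K, s) - f_exact) for K in (4, 16)}

print(f"number of terms for d=2, s=2: {len(multi_indices(2, 2))}")
for K, err in errors.items():
    print(f"K = {K:2d}: |Taylor - f| = {err:.2e}")
```

Refining the grid (larger $K$) shrinks the distance to the expansion point and hence the truncation error, at the cost of more grid points $\bm{\eta}$ whose coefficients must be stored.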

Hence we have the following lemma.

Lemma S17.

Given a function $f\in\mathcal{H}^{\beta}([0,1]^{d},1)$ with $\beta=r+s$, $r\in(0,1]$ and $s\in\mathbb{N}^{+}$, for any $\bm{\eta}\in\{0,\ldots,K-1\}^{d}$, there exists a PQC $W_{e}(\bm{x},\frac{\bm{\eta}}{K})$ such that $f_{W_{e}}(\bm{x})\coloneqq\bra{0}W^{\dagger}_{e}(\bm{x})Z^{(0)}W_{e}(\bm{x})\ket{0}$ implements the truncated Taylor expansion at the point $\frac{\bm{\eta}}{K}$,

$$f_{W_{e}}(\bm{x})=\sum_{\lVert\bm{\alpha}\rVert_{1}\leq s}\frac{\partial^{\bm{\alpha}}f(\frac{\bm{\eta}}{K})}{\bm{\alpha}!}\bigl(\bm{x}-\frac{\bm{\eta}}{K}\bigr)^{\bm{\alpha}}. \tag{D.62}$$

The depth of the PQC is $O(s^{2}d^{s}K^{d}(\log s+s\log d+d\log K))$, the width is $O(d\log K+\log s+s\log d)$, and the number of parameters is $O(sd^{s}(s+d+K^{d}))$.

Proof.

The construction of the PQC $W_{e}(\bm{x},\frac{\bm{\eta}}{K})$ is similar to that of $W_{p}(\bm{x})$ in Theorem 1. The only difference is that here we apply LCU to the unitaries $U^{\bm{\alpha}}_{\bm{\eta}}(\bm{x})\coloneqq U_{co}^{\bm{\alpha}}\otimes U^{\bm{\alpha}}(\bm{x}-\frac{\bm{\eta}}{K})$ instead of $U^{\bm{\alpha}}(\bm{x})$. Thus, the controlled unitary is

$$U_{c}\bigl(\bm{x},\frac{\bm{\eta}}{K}\bigr)=\sum_{j=1}^{T}\lvert j\rangle\!\langle j\rvert\otimes U^{\bm{\alpha}^{(j)}}_{\bm{\eta}}(\bm{x}) \tag{D.63}$$

and the unitary $W_{lcu}(\bm{x},\frac{\bm{\eta}}{K})=(F^{\dagger}\otimes I)U_{c}(\bm{x},\frac{\bm{\eta}}{K})(F\otimes I)$ satisfies

$$\bra{0}\!\bra{\bm{\eta},0}\!\bra{+}^{\otimes d}W_{lcu}\bigl(\bm{x},\frac{\bm{\eta}}{K}\bigr)\ket{0}\!\ket{\bm{\eta},0}\!\ket{+}^{\otimes d}=\sum_{\lVert\bm{\alpha}\rVert_{1}\leq s}\frac{\partial^{\bm{\alpha}}f(\frac{\bm{\eta}}{K})}{\bm{\alpha}!}\bigl(\bm{x}-\frac{\bm{\eta}}{K}\bigr)^{\bm{\alpha}}. \tag{D.64}$$

We then apply the Hadamard test to $W_{lcu}(\bm{x}, \frac{\bm{\eta}}{K})$, giving the quantum circuit $W_e(\bm{x}, \frac{\bm{\eta}}{K})$ shown below

\[
\Qcircuit @C=1em @R=0.5em {
\lstick{\ket{0}} & \qw & \gate{H} & \qw & \ctrl{1} & \qw & \gate{H} & \qw \\
\lstick{\ket{0}} & {/}\qw & \qw & \qw & \multigate{3}{W_{lcu}} & \qw & \qw & \qw \\
\lstick{\ket{0}} & {/}\qw & \gate{U(\bm{\eta})} & \qw & \ghost{W_{lcu}} & \qw & \qw & \qw \\
\lstick{\ket{0}} & \qw & \qw & \qw & \ghost{W_{lcu}} & \qw & \qw & \qw \\
\lstick{\ket{0}} & {/}\qw & \gate{H^{\otimes d}} & \qw & \ghost{W_{lcu}} & \qw & \qw & \qw
}
\]

where the unitary $U(\bm{\eta})$ takes $\bm{\eta}$ as input and maps $\ket{0}$ to $\ket{\bm{\eta}}$. Note that the controlled unitary $U_c(\bm{x}, \frac{\bm{\eta}}{K})$ can be implemented by $O(T(s+1))$ $(\log T)$-qubit controlled gates and $O(TK^d)$ $(\log T + d\log K)$-qubit controlled gates. An $n$-qubit controlled gate can be implemented by a quantum circuit consisting of CNOT gates and single-qubit gates with depth $O(n)$ [49]. Thus $U_c(\bm{x}, \frac{\bm{\eta}}{K})$ can be implemented by a quantum circuit with depth $O((s+1)T\log T + TK^d(\log T + d\log K))$ and width $O(d + \log T + d\log K)$. The depth and width of $W_{lcu}(\bm{x}, \frac{\bm{\eta}}{K}) = (F^{\dagger} \otimes I)\, U_c(\bm{x}, \frac{\bm{\eta}}{K})\, (F \otimes I)$ are of the same order as those of $U_c(\bm{x}, \frac{\bm{\eta}}{K})$, since $F$ is simply a tensor product of Hadamard gates. Therefore the total depth of the circuit $W_e$ is $O(sT\log T + TK^d(\log T + d\log K))$ and the width is $O(d + \log T + d\log K)$. As $T \leq (s+1)d^s$, we obtain the depth and width of the PQC stated in Lemma S17. Note that the number of parameters in the PQC equals the number of parameters in $U_c(\bm{x}, \frac{\bm{\eta}}{K})$, which is $O(T(s + d + K^d))$. $\square$
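The bound $T \leq (s+1)d^s$ used in the last step can be checked by direct enumeration. The Python sketch below assumes that $T$ counts the multi-indices $\bm{\alpha} \in \mathbb{N}^d$ with $\lVert\bm{\alpha}\rVert_1 \leq s$, i.e. one LCU term per Taylor coefficient; the function names are illustrative, and the resource figures set every constant hidden by the $O(\cdot)$ notation to 1, so they indicate scaling only.

```python
import math
from itertools import product


def count_multi_indices(d, s):
    """Count alpha in N^d with |alpha|_1 <= s by brute-force enumeration."""
    return sum(1 for alpha in product(range(s + 1), repeat=d) if sum(alpha) <= s)


def resource_estimates(d, s, K):
    """Leading-order depth, width and parameter count, all O(.) constants set to 1.

    Assumption (not from the source): T is the number of multi-indices
    with |alpha|_1 <= s.
    """
    T = count_multi_indices(d, s)
    log_T = math.log2(T) if T > 1 else 1.0
    log_K = math.log2(K) if K > 1 else 1.0
    depth = (s + 1) * T * log_T + T * K**d * (log_T + d * log_K)
    width = d + log_T + d * log_K
    params = T * (s + d + K**d)
    return {"T": T, "depth": depth, "width": width, "params": params}


for d in range(1, 5):
    for s in range(0, 5):
        T = count_multi_indices(d, s)
        # Closed form for the count, and the bound quoted in the proof.
        assert T == math.comb(d + s, s)
        assert T <= (s + 1) * d**s
```

The bound holds because the number of multi-indices with $\lVert\bm{\alpha}\rVert_1 = k$ is at most $d^k$, and there are $s+1$ possible values of $k$.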

Finally, we combine the localization step and the Taylor series implementation to achieve a local Taylor expansion of the target function. The resulting PQC has a nested structure consisting of a PQC for localization and a PQC for the Taylor series; the detailed construction is given in the following theorem.

Theorem 4.

Given a function $f \in \mathcal{H}^{\beta}([0,1]^d, 1)$ with $\beta = r + s$, $r \in (0,1]$ and $s \in \mathbb{N}^{+}$, for any $K \in \mathbb{N}$ and $\Delta \in (0, \frac{1}{3K})$, there exists a PQC $W_t(\bm{x})$ such that $f_{W_t}(\bm{x}) \coloneqq \bra{0} W_t^{\dagger}(\bm{x})\, Z^{(0)}\, W_t(\bm{x}) \ket{0}$ satisfies

\[
\lvert f(\bm{x}) - f_{W_t}(\bm{x}) \rvert \leq d^{s+\beta/2} K^{-\beta} \tag{D.65}
\]

for $\bm{x} \in \bigcup_{\bm{\eta}} Q_{\bm{\eta}}$. The width of the PQC is $O(d\log K + \log s + s\log d)$, the depth is $O\bigl(s^2 d^s K^d(\log s + s\log d + d\log K) + \frac{1}{\Delta}\log K\bigr)$, and the number of parameters is $O\bigl(s d^s(s + d + K^d) + \frac{d}{\Delta}\log K\bigr)$.

Proof.

By Lemma S10, we have the following error bound for $\bm{x} \in Q_{\bm{\eta}}$,

\[
\Bigl\lvert f(\bm{x}) - \sum_{\lVert\bm{\alpha}\rVert_1 \leq s} \frac{\partial^{\bm{\alpha}} f(\frac{\bm{\eta}}{K})}{\bm{\alpha}!} \Bigl(\bm{x} - \frac{\bm{\eta}}{K}\Bigr)^{\bm{\alpha}} \Bigr\rvert \leq d^{s} \Bigl\lVert \bm{x} - \frac{\bm{\eta}}{K} \Bigr\rVert_2^{\beta} \leq d^{s+\beta/2} K^{-\beta}. \tag{D.66}
\]

Motivated by this, we first construct a localization PQC $W_D(\bm{x})$ as in Lemma S14 such that

\[
\bm{0} \leq f_{W_D}(\bm{x}) - \frac{\bm{\eta}}{K} \leq \Bigl(\frac{1}{2K}, \ldots, \frac{1}{2K}\Bigr) \quad \text{if } \bm{x} \in Q_{\bm{\eta}}. \tag{D.67}
\]

The depth of $W_D(\bm{x})$ is $O(\frac{1}{\Delta}\log K)$. We then construct a PQC

\[
W_t(\bm{x}) \coloneqq W_e(\bm{x}, f_{W_D}(\bm{x})), \tag{D.68}
\]

where $W_e$ is the PQC proposed in Lemma S17. Note that the state $\ket{\bm{\eta}}$ in Lemma S17 can be prepared by rounding $f_{W_D}(\bm{x})K$ down, i.e., $\bm{\eta} = \lfloor f_{W_D}(\bm{x})K \rfloor$. In other words, the PQC $W_t(\bm{x})$ has a nested structure consisting of a PQC for localization and a PQC for the Taylor series implementation. We now show that $f_{W_t}(\bm{x}) \coloneqq \bra{0} W_t^{\dagger}(\bm{x})\, Z^{(0)}\, W_t(\bm{x}) \ket{0}$ approximates the $\beta$-Hölder smooth function $f$ on $\bigcup_{\bm{\eta}} Q_{\bm{\eta}}$. By the triangle inequality and Eq. D.66, we have

\begin{align}
\lvert f(\bm{x}) - f_{W_t}(\bm{x}) \rvert
&\leq \Bigl\lvert f_{W_t}(\bm{x}) - \sum_{\lVert\bm{\alpha}\rVert_1 \leq s} \frac{\partial^{\bm{\alpha}} f(f_{W_D}(\bm{x}))}{\bm{\alpha}!} \bigl(\bm{x} - f_{W_D}(\bm{x})\bigr)^{\bm{\alpha}} \Bigr\rvert + d^{s} \lVert \bm{x} - f_{W_D}(\bm{x}) \rVert_2^{\beta} \tag{D.69} \\
&\leq \Bigl\lvert f_{W_t}(\bm{x}) - \sum_{\lVert\bm{\alpha}\rVert_1 \leq s} \frac{\partial^{\bm{\alpha}} f(f_{W_D}(\bm{x}))}{\bm{\alpha}!} \bigl(\bm{x} - f_{W_D}(\bm{x})\bigr)^{\bm{\alpha}} \Bigr\rvert + d^{s+\beta/2} K^{-\beta} \tag{D.70} \\
&\leq d^{s+\beta/2} K^{-\beta}. \tag{D.71}
\end{align}

The second inequality comes from the fact that $\lVert \bm{x} - f_{W_D}(\bm{x}) \rVert_2 \leq \frac{1}{K}$ for $\bm{x} \in Q_{\bm{\eta}}$. This completes the proof. $\square$
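The local Taylor idea behind Theorem 4 can be illustrated classically. The sketch below is not the PQC construction: it replaces the localization circuit $W_D$ by an exact floor operation and evaluates the Taylor polynomial directly, for the illustrative choice $d = 1$, $s = 1$, $\beta = 2$ and $f = \sin$ (assumed here to lie in the unit Hölder ball, since $\lvert f\rvert$, $\lvert f'\rvert \leq 1$ and $f'$ is 1-Lipschitz on $[0,1]$). In that case the bound $d^{s+\beta/2}K^{-\beta}$ reduces to $K^{-2}$.

```python
import math


def local_taylor(f, df, x, K):
    """First-order Taylor expansion of f around the grid point eta/K, eta = floor(x*K)."""
    eta = min(math.floor(x * K), K - 1)  # keep x = 1 inside the last cell
    a = eta / K
    return f(a) + df(a) * (x - a)


def max_error(K, samples=2000):
    """Largest observed error of the local expansion of sin on a uniform grid of [0, 1]."""
    worst = 0.0
    for i in range(samples + 1):
        x = i / samples
        worst = max(worst, abs(math.sin(x) - local_taylor(math.sin, math.cos, x, K)))
    return worst


for K in (2, 4, 8, 16):
    # d = 1, s = 1, beta = 2: the bound d^(s + beta/2) K^(-beta) reduces to K^(-2).
    assert max_error(K) <= K ** -2
```

The Lagrange remainder gives an error of at most $\frac{1}{2}K^{-2}$ in this example, so the assertion holds with room to spare; refining the grid by a factor of two reduces the error by roughly a factor of four.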

Note that the PQC in Theorem 4 is a nesting of two PQCs, while its depth is counted as the sum of the depths of the two PQCs for simplicity. We have established the uniform convergence property of PQCs for approximating Hölder smooth functions on $[0,1]^d$ except for the trifling region $\Lambda(d, K, \Delta)$. Note that the Lebesgue measure of this trifling region is no more than $dK\Delta$. We can set $\Delta = K^{-d}$ without affecting the size of the PQC constructed in Theorem 4. Since $\nu$ is absolutely continuous with respect to the Lebesgue measure, we have the following corollary.

Corollary S18.

Given a function $f \in \mathcal{H}^{\beta}([0,1]^d, 1)$ with $\beta = r + s$, $r \in (0,1]$ and $s \in \mathbb{N}^{+}$, for any $K \in \mathbb{N}$ and $\Delta \in (0, \frac{1}{3K})$, there exists a PQC $W_t(\bm{x})$ such that $f_{W_t}(\bm{x}) \coloneqq \bra{0} W_t^{\dagger}(\bm{x})\, Z^{(0)}\, W_t(\bm{x}) \ket{0}$ satisfies

\begin{align}
\lVert f(\bm{x}) - f_{W_t}(\bm{x}) \rVert^2_{L^2(\nu)}
&= \int_{[0,1]^d} \bigl(f(\bm{x}) - f_{W_t}(\bm{x})\bigr)^2 \nu(\bm{x}) \,\mathrm{d}\bm{x} \notag \\
&= \int_{\bigcup_{\bm{\eta}} Q_{\bm{\eta}} \cup \Lambda(d,K,\Delta)} \bigl(f(\bm{x}) - f_{W_t}(\bm{x})\bigr)^2 \nu(\bm{x}) \,\mathrm{d}\bm{x} \tag{D.72} \\
&\leq \bigl(d^{s+\beta/2} K^{-\beta}\bigr)^2 + 4dK^{1-d}. \tag{D.73}
\end{align}

The width of the PQC is $O(d\log K + \log s + s\log d)$, the depth is $O\bigl(s^2 d^s K^d(\log s + s\log d + d\log K) + \frac{1}{\Delta}\log K\bigr)$, and the number of parameters is $O\bigl(s d^s(s + d + K^d) + \frac{d}{\Delta}\log K\bigr)$.
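The step from Eq. D.72 to Eq. D.73 can be spelled out as follows. This is a sketch under assumptions that are not stated explicitly in the corollary: $\nu$ is taken to be a probability density on $[0,1]^d$ bounded by 1, and $\Delta = K^{-d}$. It also uses $\lvert f \rvert \leq 1$ (unit Hölder ball) and $\lvert f_{W_t} \rvert \leq 1$ (expectation value of a Pauli $Z$ observable), so the integrand is at most 4 on the trifling region.

```latex
\begin{align*}
\lVert f - f_{W_t} \rVert^2_{L^2(\nu)}
&= \int_{\bigcup_{\bm{\eta}} Q_{\bm{\eta}}} (f - f_{W_t})^2\, \nu(\bm{x})\,\mathrm{d}\bm{x}
 + \int_{\Lambda(d,K,\Delta)} (f - f_{W_t})^2\, \nu(\bm{x})\,\mathrm{d}\bm{x} \\
&\leq \bigl(d^{s+\beta/2}K^{-\beta}\bigr)^2 \int_{\bigcup_{\bm{\eta}} Q_{\bm{\eta}}} \nu(\bm{x})\,\mathrm{d}\bm{x}
 + 4 \int_{\Lambda(d,K,\Delta)} \nu(\bm{x})\,\mathrm{d}\bm{x} \\
&\leq \bigl(d^{s+\beta/2}K^{-\beta}\bigr)^2 + 4\, dK\Delta
 = \bigl(d^{s+\beta/2}K^{-\beta}\bigr)^2 + 4\, dK^{1-d}.
\end{align*}
```

The first term uses the uniform bound of Theorem 4 on $\bigcup_{\bm{\eta}} Q_{\bm{\eta}}$, and the second uses the Lebesgue measure bound $dK\Delta$ for the trifling region.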

D.4 Comparison of “global” and “local” approaches in this work

We have presented two distinct methodologies for constructing PQC models with UAP properties aimed at approximating continuous functions. In Theorem 3 and Theorem 4, we establish PQC models guided by the multivariate Bernstein polynomials and the Taylor expansion of multivariate continuous functions, respectively. We refer to these approaches as “global” and “local”, respectively. We now compare the two strategies in the context of approximating Lipschitz continuous functions. For the subsequent analysis, we set $\beta = 1$, and thus $s = 0$ in Theorem 4, in accordance with the Lipschitz continuity of the target function.

The approximation error of the global approach can be bounded by $(2^d d\ell^2)/(n\varepsilon^2) + \varepsilon$. Selecting $n = (2^d d\ell^2)/\varepsilon^3$ ensures an approximation error of $2\varepsilon$. The corresponding number of trainable parameters scales as $O\bigl(2^{d^2} d^{d+1} \ell^{2d} / \varepsilon^{3d}\bigr)$. In contrast, the local approach has an approximation error scaling as $\sqrt{d}K^{-1} + \varepsilon$. Setting $K = \sqrt{d}/\varepsilon$ ensures a $2\varepsilon$ approximation error, with the number of trainable parameters scaling as $O\bigl(d^{d/2}/\varepsilon^{d}\bigr)$. These findings highlight the advantage of the local approach for approximating continuous functions. More importantly, the approximation error achieved by the local method approaches the optimal convergence rate established in Shen et al. [22]. A formal comparison between PQCs and classical deep neural networks is given in the next section.
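To get a feel for the gap between the two scalings, they can be evaluated numerically. The sketch below drops all constants hidden by the $O(\cdot)$ notation; the function names are illustrative, and the global count is written as $d \cdot n^d$ only because this expression reproduces the stated scaling $2^{d^2} d^{d+1} \ell^{2d}/\varepsilon^{3d}$ for $n = (2^d d\ell^2)/\varepsilon^3$.

```python
import math


def global_params(d, ell, eps):
    """Parameter scaling of the global (Bernstein) approach, constants dropped."""
    n = (2**d * d * ell**2) / eps**3  # choice of n that gives error 2*eps
    return d * n**d  # equals 2^(d^2) d^(d+1) ell^(2d) / eps^(3d)


def local_params(d, eps):
    """Parameter scaling of the local (Taylor) approach, constants dropped."""
    K = math.sqrt(d) / eps  # choice of K that gives error 2*eps
    return K**d  # equals d^(d/2) / eps^d


for d in (1, 2, 4):
    for eps in (0.5, 0.1, 0.01):
        g, l = global_params(d, 1.0, eps), local_params(d, eps)
        print(f"d={d}, eps={eps}: global ~ {g:.3e}, local ~ {l:.3e}, ratio ~ {g / l:.3e}")
```

With these formulas the ratio of the two counts is $2^{d^2} d^{d/2+1} \ell^{2d} / \varepsilon^{2d}$, so the gap widens both as $\varepsilon$ decreases and as $d$ grows.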

Appendix E Comparison with related works in classical machine learning

Table S1: Approximation errors of PQCs and ReLU FNNs

\begin{tabular}{llllll}
\hline
Approach & Target & Width & Depth & Number of parameters & Approximation error \\
\hline
PQC & $d$-var. deg.-$s$ monomial & $O(d)$ & $O(s)$ & $O(d+s)$ & $0$ \\
ReLU FNN [21] & $d$-var. deg.-$s$ monomial & $O(N+s)$ & $O(s^2 M)$ & $O((N^2+s^2)s^2 M)$ & $O(sN^{-sM})$ \\
Nested PQC & $C_u^s([0,1]^d)$ & $O(d\log K + s\log d)$ & $O(K^d d^s)$ & $O(K^d d^{s+1})$ & $O(d^{2s}K^{-s})$ \\
ReLU FNN$^{\mathrm{i}}$ [21] & $C_u^s([0,1]^d)$ & $O(s^{d+1}N)$ & $O(s^2 M)$ & $O(s^{2d+4}K^{d/2}N)$ & $O(s^d 8^s K^{-s})$ \\
\hline
\end{tabular}

$^{\mathrm{i}}$ Satisfying $NM = \Theta(K^{d/2})$.

In this section, we compare PQCs with classical deep neural networks in terms of model size, number of trainable parameters, and approximation error. As a benchmark, we consider deep feed-forward neural networks (FNNs) with rectified linear unit (ReLU) activation functions. FNNs are the foundational class of neural networks, characterized by a unidirectional flow of information from the input layer through one or more hidden layers to the output layer, with no cycles or loops among the nodes. The ReLU activation function, defined as $\mathrm{ReLU}(x) \coloneqq \max(x, 0)$, is widely used across diverse domains, including image recognition [70, 71] and natural language processing [72, 73]. Its popularity in feed-forward networks stems from its efficacy in facilitating the convergence of function approximation during network training. Additionally, a recent study [74] has shown that classical neural networks with commonly used activation functions can be approximated by ReLU-activated networks with only a mild increase in network size. Readers are also referred to other works on ReLU networks [16, 75, 76].
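To make the notions of width, depth, and parameter count for ReLU FNNs concrete, the following minimal Python sketch evaluates a small feed-forward network. It is purely illustrative and not taken from the cited works; the helper names and the example network, which computes $\lvert x\rvert = \mathrm{ReLU}(x) + \mathrm{ReLU}(-x)$ with one hidden layer of width 2, are assumptions made for this example.

```python
def relu(x):
    """ReLU(x) = max(x, 0)."""
    return max(x, 0.0)


def fnn_forward(x, layers):
    """Evaluate a feed-forward network on the input vector x.

    `layers` is a list of (weights, biases) pairs; ReLU is applied after every
    layer except the last one, which is affine.
    """
    h = list(x)
    for idx, (W, b) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, h)) + bias for row, bias in zip(W, b)]
        h = z if idx == len(layers) - 1 else [relu(v) for v in z]
    return h


def count_parameters(layers):
    """Number of trainable weights and biases."""
    return sum(len(W) * len(W[0]) + len(b) for W, b in layers)


# A width-2, one-hidden-layer network computing |x| = ReLU(x) + ReLU(-x).
abs_net = [([[1.0], [-1.0]], [0.0, 0.0]), ([[1.0, 1.0]], [0.0])]
```

For instance, `fnn_forward([-3.0], abs_net)` evaluates the network at $x = -3$, and `count_parameters(abs_net)` counts its weights and biases.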

In particular, Shen et al. [22] established the optimal approximation error for approximating any Lipschitz function, and Lu et al. [21] provided a nearly optimal approximation error for approximating any smooth function using ReLU FNNs. For clarity, the comparison of our results with theirs is summarized in Table S1. In most practical instances, the smoothness coefficient $s$ of the target function tends to be modest, since most functions to be approximated are not very smooth. Moreover, in practical scenarios such as image recognition and natural language processing, the dimensionality $d$ of the input data is large. Consequently, in Table S1 we treat terms that depend only on $s$ as constants and assume $d \gg s$.

We further quantify the performance of PQCs and FNNs in terms of model size and number of parameters for approximating $s$-smooth functions in $C^s_u([0,1]^d)$. We find that when the target function satisfies certain smoothness conditions, PQCs offer a notable improvement over FNNs in both model size and number of parameters.

Model size. We compare the model sizes of PQCs and FNNs when they yield the same approximation error $\varepsilon$ (say, some constant). As a straightforward measure of model size, we use the product of width and depth. For approximation error $\varepsilon$, the sizes of the PQC and the FNN scale as $O(K_Q^d d^{s+1})$ and $O(K_C^{d/2} s^{d+3})$, respectively, where $K_Q = \Theta(d^2/\varepsilon^{1/s})$ and $K_C = \Theta(s^{d/s}/\varepsilon^{1/s})$.

Remarkably, when $2 \leq s < d$, the ratio of model sizes between PQCs and FNNs [21] scales as $O(\varepsilon^{-d/(2s)} / s^{d^2 - d\log_s d})$. We conclude that when this smoothness condition is satisfied, PQCs have a significantly smaller model size than FNNs.
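To get a feel for the two leading-order size expressions $O(K_Q^d d^{s+1})$ and $O(K_C^{d/2} s^{d+3})$, one can evaluate them numerically for chosen $(d, s, \varepsilon)$. The sketch below is only illustrative: it assumes that all constants hidden in the $O(\cdot)$ and $\Theta(\cdot)$ notation equal one, works in $\log_{10}$ to avoid overflow, and uses hypothetical function names. It evaluates the two size expressions only and does not reproduce the ratio bound or the entries of Table S1.

```python
import math


def log10_pqc_size(d: int, s: int, eps: float) -> float:
    """log10 of K_Q^d * d^(s+1) with K_Q = d^2 / eps^(1/s) (constants set to 1)."""
    log_kq = 2 * math.log10(d) - math.log10(eps) / s
    return d * log_kq + (s + 1) * math.log10(d)


def log10_fnn_size(d: int, s: int, eps: float) -> float:
    """log10 of K_C^(d/2) * s^(d+3) with K_C = s^(d/s) / eps^(1/s) (constants set to 1)."""
    log_kc = (d / s) * math.log10(s) - math.log10(eps) / s
    return (d / 2) * log_kc + (d + 3) * math.log10(s)


if __name__ == "__main__":
    # Hypothetical setting in the regime d >> s considered in the text.
    d, s, eps = 100, 2, 0.1
    print("log10 PQC size proxy:", log10_pqc_size(d, s, eps))
    print("log10 FNN size proxy:", log10_fnn_size(d, s, eps))
```

Because constants and lower-order terms are dropped, the printed values indicate trends only; which expression is smaller depends on the chosen regime of $d$, $s$, and $\varepsilon$.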

Number of trainable parameters. We next compare the numbers of trainable parameters of PQCs and FNNs when they yield comparable approximation errors. From the perspective of approximation theory, the parameter count is a standard measure of a model's degrees of freedom and expressiveness. For approximation error $\varepsilon$, the numbers of trainable parameters of the PQC and the FNN scale as $O(K_Q^d d^{s+1})$ and $O(K_C^{(1+\lambda_0)d/2} s^{2d+4})$, respectively. Here, the hyperparameter $\lambda_0 \in (0,1)$ characterizes the FNN's width.

Remarkably, when $2 \leq s < d$, the ratio between the numbers of trainable parameters of PQCs and FNNs [21] scales as $O(\varepsilon^{-(1-\lambda_0)d/(2s)} / s^{(1+\lambda_0)d^2 - d\log_s d})$. Consequently, PQCs require significantly fewer trainable parameters than FNNs.
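The dependence of the FNN parameter count on the width hyperparameter $\lambda_0$ can be explored with a short sketch of the expression $O(K_C^{(1+\lambda_0)d/2} s^{2d+4})$, alongside the PQC expression $O(K_Q^d d^{s+1})$. As before, this is an illustration under the assumption that all hidden constants equal one; the function names are hypothetical, and the values are reported in $\log_{10}$.

```python
import math


def log10_pqc_params(d: int, s: int, eps: float) -> float:
    """log10 of K_Q^d * d^(s+1) with K_Q = d^2 / eps^(1/s) (constants set to 1)."""
    log_kq = 2 * math.log10(d) - math.log10(eps) / s
    return d * log_kq + (s + 1) * math.log10(d)


def log10_fnn_params(d: int, s: int, eps: float, lam0: float) -> float:
    """log10 of K_C^((1+lam0)d/2) * s^(2d+4) with K_C = s^(d/s) / eps^(1/s)."""
    if not 0.0 < lam0 < 1.0:
        raise ValueError("lam0 must lie in the open interval (0, 1)")
    log_kc = (d / s) * math.log10(s) - math.log10(eps) / s
    return (1 + lam0) * (d / 2) * log_kc + (2 * d + 4) * math.log10(s)


if __name__ == "__main__":
    # Hypothetical setting; sweep the FNN width hyperparameter lam0.
    d, s, eps = 100, 2, 0.1
    print("log10 PQC parameter proxy:", log10_pqc_params(d, s, eps))
    for lam0 in (0.1, 0.5, 0.9):
        print("lam0 =", lam0, "log10 FNN parameter proxy:",
              log10_fnn_params(d, s, eps, lam0))
```

Whenever $K_C > 1$, the FNN expression increases with $\lambda_0$, which is the only qualitative behavior this sketch is meant to show.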

Approximating monomials. Here, we compare the performance of PQCs and FNNs in approximating monomials of degree $s$. Within this specialized target function space, PQCs have distinct advantages in width, depth, model size, and number of trainable parameters. Notably, PQCs can represent monomials exactly, with no approximation error. These advantages position PQCs as promising candidates for outperforming FNNs on more complex target function spaces.