
UNIFORM-IN-TIME ESTIMATES ON THE SIZE OF CHAOS

FOR INTERACTING BROWNIAN PARTICLES

ARMAND BERNOU AND MITIA DUERINCKX

Abstract. We consider a system of classical Brownian particles interacting via a smooth long-range
potential in the mean-field regime, and we analyze the propagation of chaos in form of sharp, uniform-
in-time estimates on many-particle correlation functions. Our results cover both the kinetic Langevin
setting and the corresponding overdamped Brownian dynamics. The approach is mainly based on
so-called Lions expansions, which we combine with new diagrammatic tools to capture many-particle
cancellations, as well as with fine ergodic estimates on the linearized mean-field equation, and with
discrete stochastic calculus with respect to initial data. In the process, we derive some new ergodic es-
timates for the linearized Vlasov–Fokker–Planck kinetic equation that are of independent interest. Our
analysis also leads to uniform-in-time concentration estimates and to a uniform-in-time quantitative
central limit theorem for the empirical measure associated with the particle dynamics.

Contents
1. Introduction
2. Preliminary
3. Ergodic Sobolev estimates for mean field
4. Representation of Brownian cumulants
5. Refined propagation of chaos
6. Concentration estimates
7. Quantitative central limit theorem
References

1. Introduction
1.1. General overview. We consider the Langevin dynamics for a system of N Brownian particles
with mean-field interactions, moving in a confining potential in Rd , d ≥ 1, as described by the following
system of coupled SDEs: for 1 ≤ i ≤ N,

\[
\left\{\begin{aligned}
dX^{i,N}_t &= V^{i,N}_t\,dt,\\
dV^{i,N}_t &= -\tfrac{\kappa}{N}\sum_{j=1}^N \nabla W(X^{i,N}_t - X^{j,N}_t)\,dt - \tfrac{\beta}{2}V^{i,N}_t\,dt - \nabla A(X^{i,N}_t)\,dt + dB^i_t, \qquad t\ge 0,\\
(X^{i,N}_t,V^{i,N}_t)\big|_{t=0} &= (X^{i,N}_\circ,V^{i,N}_\circ),
\end{aligned}\right.\tag{1.1}
\]

where {Z i,N := (X i,N , V i,N )}1≤i≤N is the set of particle positions and velocities in the phase space
Dd := Rd × Rd , where W : Rd → R is a long-range interaction potential, where A is a uniformly
convex confining potential, where {B i }i are i.i.d. d-dimensional Brownian motions, and where κ, β > 0
are given constants. The interaction potential W is assumed to satisfy the action-reaction condition W(x) = W(−x), and we assume that it is smooth, W ∈ C_b^∞(ℝ^d). Regarding the confining
potential A, we choose it to be quadratic for simplicity,
\[
A(x) := \tfrac12\,a|x|^2, \qquad\text{for some } a>0, \tag{1.2}
\]

although this is not essential for our results; see Remark 1.5 below. Next to this Langevin dynamics, we
also consider its overdamped limit, that is, the following inertialess Brownian dynamics: for 1 ≤ i ≤ N ,

\[
\left\{\begin{aligned}
dY^{i,N}_t &= -\tfrac{\kappa}{N}\sum_{j=1}^N \nabla W(Y^{i,N}_t - Y^{j,N}_t)\,dt - \nabla A(Y^{i,N}_t)\,dt + dB^i_t, \qquad t\ge 0,\\
Y^{i,N}_t\big|_{t=0} &= Y^{i,N}_\circ,
\end{aligned}\right.\tag{1.3}
\]

where {Y i,N }1≤i≤N is now the corresponding set of particle positions in Rd . For presentation purposes
in this introduction, we restrict to the more delicate setting of the Langevin dynamics (1.1), but we
emphasize that all our results hold in both cases.
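For readers who wish to experiment numerically, the following is a minimal sketch of an Euler–Maruyama discretization of the Langevin system (1.1); the interaction kernel, parameters, and step sizes are illustrative assumptions and are not tied to the analysis of this paper.

```python
import numpy as np

# Minimal Euler--Maruyama sketch of the Langevin particle system (1.1).
# Illustrative assumptions: quadratic confinement A(x) = a|x|^2/2 and the
# smooth even interaction potential W(x) = exp(-|x|^2/2); none of the
# numerical choices below are prescribed by the paper.

def grad_W(x):
    # gradient of W(x) = exp(-|x|^2/2); an odd function, as it should be
    return -x * np.exp(-0.5 * np.sum(x**2, axis=-1, keepdims=True))

def simulate(N=200, d=1, T=5.0, dt=1e-3, kappa=0.1, beta=1.0, a=1.0, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(N, d))   # i.i.d. initial positions, playing the role of mu_0
    V = rng.normal(size=(N, d))   # i.i.d. initial velocities
    for _ in range(int(T / dt)):
        diffs = X[:, None, :] - X[None, :, :]              # X_i - X_j, shape (N, N, d)
        force = -(kappa / N) * grad_W(diffs).sum(axis=1)   # mean-field interaction force
        force -= a * X                                     # confinement force -grad A
        X = X + V * dt
        V = V + (force - 0.5 * beta * V) * dt + np.sqrt(dt) * rng.normal(size=(N, d))
    return X, V

if __name__ == "__main__":
    X, V = simulate()
    print("empirical mean position:", X.mean(axis=0))
```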
In the regime of a large number N ≫ 1 of particles, let us turn to a statistical description of
the system and consider the evolution of a random ensemble of particles. In terms of a probability
density F N on the N -particle phase space (Dd )N , the Langevin dynamics (1.1) is equivalent to the
Liouville equation

\[
\partial_t F^N + \sum_{i=1}^N v_i\cdot\nabla_{x_i}F^N \;=\; \tfrac12\sum_{i=1}^N \mathrm{div}_{v_i}\big((\nabla_{v_i}+\beta v_i)F^N\big) + \tfrac{\kappa}{N}\sum_{i,j=1}^N \nabla W(x_i-x_j)\cdot\nabla_{v_i}F^N + \sum_{i=1}^N \nabla_{x_i}A\cdot\nabla_{v_i}F^N. \tag{1.4}
\]

Particles are assumed to be exchangeable, which amounts to the symmetry of F N in its N variables
zi = (xi , vi ) ∈ Dd , 1 ≤ i ≤ N . More precisely, we assume for simplicity that particles are initially
chaotic, meaning that the initial data {Z◦i,N := (X◦i,N , V◦i,N )}1≤i≤N are i.i.d. with some common phase-
space density µ◦ ∈ P(Dd ): in other words, F N is initially tensorized,

\[
F^N_t\big|_{t=0} = \mu_\circ^{\otimes N}. \tag{1.5}
\]

In the large-N limit, we aim at an averaged description of the system and we focus on the evolution
of a finite number of “typical” particles as described by the marginals of F N ,
\[
F^{m,N}_t(z_1,\ldots,z_m) := \int_{(\mathbb D^d)^{N-m}} F^N_t(z_1,\ldots,z_N)\,dz_{m+1}\ldots dz_N, \qquad 1\le m\le N.
\]

In view of Boltzmann’s chaos assumption, correlations between particles are expected to be negligible
to leading order, hence the chaotic behavior of initial data would remain approximately satisfied: this
is the so-called propagation of chaos,

Ftm,N − (Ft1,N )⊗m → 0, as N ↑ ∞, (1.6)

for any fixed m ≥ 1 and t ≥ 0. If this holds, it automatically implies the validity of the mean-field
limit
\[
F^{m,N}_t \;\to\; \mu_t^{\otimes m}, \qquad\text{as } N\uparrow\infty,
\]
where µt is the solution of the Vlasov–Fokker–Planck mean-field equation
\[
\left\{\begin{aligned}
&\partial_t\mu + v\cdot\nabla_x\mu = \tfrac12\,\mathrm{div}_v\big((\nabla_v+\beta v)\mu\big) + (\nabla A + \kappa\nabla W*\mu)\cdot\nabla_v\mu, \qquad t\ge0,\\
&\mu|_{t=0} = \mu_\circ,
\end{aligned}\right.\tag{1.7}
\]
with the short-hand notation ∇W ∗ µ(x) := ∫_{𝔻^d} ∇W(x−y) µ(y,v) dy dv. This topic has been extensively
investigated since the 1990s, starting in particular with [51, 84]; see e.g. [62, 23] for a review.
On the formal level, corrections to the propagation of chaos and to the mean-field limit are naturally
unravelled by means of the BBGKY approach, which goes back to the work of Bogolyubov [7]. This
starts by noting that the Liouville equation (1.4) is equivalent to the following hierarchy of coupled

equations for marginals: for 1 ≤ m ≤ N ,


\[
\begin{aligned}
\partial_t F^{m,N} + \sum_{i=1}^m v_i\cdot\nabla_{x_i}F^{m,N} \;=\;& \tfrac12\sum_{i=1}^m \mathrm{div}_{v_i}\big((\nabla_{v_i}+\beta v_i)F^{m,N}\big) + \sum_{i=1}^m \nabla_{x_i}A\cdot\nabla_{v_i}F^{m,N}\\
&+ \kappa\,\tfrac{N-m}{N}\sum_{i=1}^m\int_{\mathbb D^d}\nabla W(x_i-x_*)\cdot\nabla_{v_i}F^{m+1,N}(\cdot,z_*)\,dx_*\,dv_*\\
&+ \tfrac{\kappa}{N}\sum_{i,j=1}^m \nabla W(x_i-x_j)\cdot\nabla_{v_i}F^{m,N},
\end{aligned}\tag{1.8}
\]

with the convention F^{m,N} ≡ 0 for m > N. In each of those equations, the last right-hand side
term is precisely the one that disrupts the chaotic structure: it creates correlations between initially
independent particles, hence leads to deviations from the mean-field approximation. As this term is
formally of order O(m2 N −1 ), we are led to conjecture the following error estimate for the propagation
of chaos,
Ftm,N − (Ft1,N )⊗m = O(m2 N −1 ). (1.9)
This was first made rigorous in [78] for several related particle systems, and it is referred to as estimating
the size of chaos. For the particle systems of interest in the present work, (1.1) or (1.3), a rigorous
BBGKY analysis can be performed at least to some extent to deduce similar estimates, cf. [66, 12, 61].
In case of non-Brownian interacting particles (β = 0), the problem is more difficult and was solved
in [40] by means of different techniques.
A variant of the above estimates on the size of chaos is given by so-called weak propagation of chaos
estimates: for any sufficiently well-behaved functional Φ defined on the space P(Dd ) of probability
measures on Dd , one expects
\[
\mathbb E\big[\Phi(\mu^N_t)\big] - \Phi(\mu_t) = O(N^{-1}), \tag{1.10}
\]
in terms of the empirical measure
\[
\mu^N_t := \tfrac1N\sum_{i=1}^N \delta_{Z^{i,N}_t} \in \mathcal P(\mathbb D^d), \tag{1.11}
\]

where we recall that the limit µt is the solution of the mean-field equation (1.7) and where the expecta-
tion E is taken with respect to both the initial data and the Brownian forces. Such an estimate is essen-
tially equivalent to (1.9) (up to the precise dependence on m and Φ), and we refer to [63, 72, 74, 5, 25]
for results in that direction. Note that the rate O(N −1 ) in (1.10) is only expected for Φ smooth enough.
For the specific choice Φ = W₂(·, µ_t), for instance, the question amounts to estimating the expectation of the 2-Wasserstein distance between µ^N_t and µ_t: this is referred to as strong propagation of chaos and is known to lead only to a weaker convergence rate E[W₂(µ^N_t, µ_t)] = O(N^{−1/2}), in connection with random fluctuations of the empirical measure; see [3, 71, 10, 22, 48, 66].
In recent years, there has been increasing interest in uniform-in-time versions of the above chaos
estimates (1.9) or (1.10). This happens to be an important question both in theory and for practical
applications: it amounts to describing the long-time behavior of particle systems uniformly in the
limit N ↑ ∞, thus showing in particular the proximity of corresponding equilibria. This is naturally
related to the long-time behavior of the mean-field equation (1.7), which has itself been an intense
topic of research for more than two decades. While the mean-field equilibrium is not unique in general,
cf. [43], the long-time convergence of the mean-field density has been established under several types of
assumptions guaranteeing uniqueness [87, 9, 60, 75, 53, 2]; see also [4, 20, 8] for the Brownian dynamics.
In contrast, uniform-in-time propagation of chaos is a more subtle question and is indeed not ensured
by the uniqueness of the mean-field equilibrium [71, 6]. Uniform-in-time weak chaos estimates with
optimal rate O(N −1 ) were first obtained by Delarue and Tse [35] for the Brownian dynamics, and we
refer to [71, 10, 22, 9, 44, 80, 53, 27] for corresponding uniform-in-time strong chaos estimates with
weaker convergence rates. We also refer to [55, 79, 54, 29] for some uniform-in-time chaos estimates in
case of singular interactions.

In the present work, we aim to go beyond uniform-in-time chaos estimates by further estimating
many-particle correlation functions, which provide finer information on the propagation of chaos in the
system. The two-particle correlation function is defined as
G2,N := F 2,N − (F 1,N )⊗2 ,
which captures the defect to propagation of chaos (1.6) at the level of two-particle statistics. From
the BBGKY hierarchy (1.8), we note that proving the mean-field limit Ft1,N → µt amounts to proving
G2,N
t → 0, which is precisely ensured by standard chaos estimates, cf. (1.9). Yet, two-particle correla-
tions do not allow to reconstruct the full particle density F N : in particular, understanding corrections
to the mean-field limit requires to further estimate higher-order correlation functions {Gk,N }2≤k≤N .
Those are defined as suitable polynomial combinations of marginals of F N in such a way that the full
particle distribution F N be recovered in form of a cluster expansion,
\[
F^N(z_1,\ldots,z_N) = \sum_{\pi\vdash\llbracket N\rrbracket}\,\prod_{A\in\pi} G^{\sharp A,N}(z_A), \tag{1.12}
\]

where π runs through the list of all partitions of the index set JN K := {1, . . . , N }, where A runs through
the list of blocks of the partition π, where ♯A is the cardinality of A, and where for A = {i1 , . . . , il } ⊂ JN K
we write z_A = (z_{i_1}, . . . , z_{i_l}). As is easily checked, correlation functions are fully determined by prescribing (1.12) together with the “maximality” requirement ∫_{𝔻^d} G^{m,N}(z_1, . . . , z_m) dz_l = 0 for 1 ≤ l ≤ m.
More explicitly, we can write
\[
\begin{aligned}
G^{3,N} &= \mathrm{sym}\big(F^{3,N} - 3F^{2,N}\otimes F^{1,N} + 2(F^{1,N})^{\otimes3}\big),\\
G^{4,N} &= \mathrm{sym}\big(F^{4,N} - 4F^{3,N}\otimes F^{1,N} - 3F^{2,N}\otimes F^{2,N} + 12F^{2,N}\otimes(F^{1,N})^{\otimes2} - 6(F^{1,N})^{\otimes4}\big),
\end{aligned}
\]
and so on, where the symbol ‘sym’ stands for the symmetrization of coordinates. More generally, we
can write for all 2 ≤ m ≤ N ,
\[
G^{m,N}(z_1,\ldots,z_m) = \sum_{\pi\vdash\llbracket m\rrbracket} (\sharp\pi-1)!\,(-1)^{\sharp\pi-1}\prod_{A\in\pi} F^{\sharp A,N}(z_A), \tag{1.13}
\]

where we use a similar notation as in (1.12) and where ♯π stands for the number of blocks in a
partition π. While standard propagation of chaos leads to G^{2,N}_t = O(N^{-1}), cf. (1.9), and in fact G^{m,N}_t = O(N^{-1}) for all 2 ≤ m ≤ N, a formal analysis of the BBGKY hierarchy (1.8) further leads us to expect
\[
G^{m,N}_t = O(N^{1-m}), \qquad 2\le m\le N. \tag{1.14}
\]
Such estimates provide a much deeper understanding of the structure of propagation of chaos and
provide key tools to describe deviations from mean-field theory, cf. [39, 40]. We established such
estimates in [40] for non-Brownian particle systems (β = 0), up to some exponential time growth, under
the name of refined propagation of chaos. With a similar time growth, a bound of the form (1.14) was
also obtained in [61] for the Brownian dynamics (1.3) in the case of bounded non-smooth interactions.
In the present work, we obtain for the first time corresponding uniform-in-time estimates, both for
the Langevin and Brownian dynamics. Along the way, we also establish uniform-in-time concentration
estimates and a uniform-in-time central limit theorem for the empirical measure.
From the technical perspective, we mainly take inspiration from a recent work of Delarue and Tse [35]
(see also [19, 25, 24]), where the uniform-in-time weak propagation of chaos (1.10) was established for
the Brownian dynamics. The key idea of the analysis is to consider the mean-field semigroup induced
on functionals µ 7→ Φ(µ) on the space of probability measures, and then appeal to so-called Lions
calculus on this space to expand the expectation E[Φ(µN t )] of functionals along the particle dynamics
(see Lemma 2.1 below). As noted in [35], the resulting so-called Lions expansions can be combined with
ergodic properties of the linearized mean-field equation to deduce uniform-in-time estimates. In the
present contribution, in order to control correlation functions {Gm,N }2≤m≤N , we reduce the problem to
estimating cumulants of functionals of the empirical measure. We apply Lions expansions to cumulants

and we develop suitable diagrammatic tools to efficiently capture cancellations and derive the desired
estimates (1.14). To account for the effect of initial correlations, we further combine Lions expansions
with the so-called Glauber calculus that we developed in [40]. While only the case of the Brownian
dynamics was considered in [35], note that we need to further appeal to hypocoercivity techniques to
establish the relevant ergodic estimates for the linearized mean-field equation in case of the kinetic
Langevin dynamics: for that purpose, we mainly draw inspiration from the work of Mischler and
Mouhot [73], which we are led to revisit in several ways (see Theorem 3.1).

1.2. Main results. We start with uniform-in-time refined propagation of chaos estimates (1.14). We
focus on the Langevin dynamics (1.1), but we emphasize that all our results also hold in the simpler
case of the Brownian dynamics (1.3).¹ The smallness assumption κ ≪ 1 for the interaction strength ensures the uniqueness of the steady state for the mean-field equation (1.7), which is useful to ensure strong ergodic properties.

¹In fact, in the case of the Brownian dynamics (1.3), we can choose p₀ = 0 in our different results, meaning that no moment assumption is needed on the initial law µ_◦.
Theorem 1.1 (Uniform-in-time refined propagation of chaos). There exists κ_0 > 0 (only depending on d, β, W, A) such that the following holds for any κ ∈ [0, κ_0]. Assume that the initial law µ_◦ ∈ P(𝔻^d) satisfies ∫_{𝔻^d}|z|^{p_0} µ_◦(dz) < ∞ for some p_0 > 0, and consider the Langevin dynamics (1.1) and the associated correlation functions {G^{m,N}}_{2≤m≤N} as defined in (1.13). For all 2 ≤ m ≤ N, there exist ℓ_m > 0 (only depending on m) and C_m > 0 (only depending on d, β, W, A, µ_◦, m) such that we have for all t ≥ 0,
\[
\|G^{m,N}_t\|_{W^{-\ell_m,1}(\mathbb D^d)} \;\le\; C_m\,N^{1-m}. \tag{1.15}
\]

In particular, the uniform-in-time smallness of the two-particle correlation function G^{2,N}_t = O(N^{-1}) makes it possible to truncate the BBGKY hierarchy (1.8) and to recover the uniform-in-time validity of propagation of chaos: for all 1 ≤ m ≤ N and t ≥ 0,
\[
\|F^{m,N}_t - \mu_t^{\otimes m}\|_{W^{-\ell_m,1}(\mathbb D^d)} \;\le\; C_m\,N^{-1}. \tag{1.16}
\]
Corrections to this mean-field approximation can be further captured by truncating the BBGKY
hierarchy (1.8) to higher orders as e.g. in [40]. In fact, by a suitable analysis of those corrections, it
is possible to deduce the following improvement of (1.16), which we state here for simplicity in case
m = 2: for all k ≥ 1, there exist λ_k, ℓ_k, C_k > 0 such that we have for all N ≥ 1 and t ≥ 0,
\[
\|F^{2,N}_t - \mu_t^{\otimes2}\|_{W^{-\ell_k,1}(\mathbb D^d)} \;\le\; C_k\big(e^{-\lambda_k t}N^{-1} + N^{-k}\big). \tag{1.17}
\]

To our knowledge, this O(e−λt N −1 + N −∞ ) estimate constitutes a new type of result in mean-field
theory, which can be viewed as combining the mean-field approximation (1.16) quantitatively with
the convergence of the particle system to Gibbs equilibrium. The proof of this refined estimate (1.17)
as an application of Theorem 1.1 requires detailed computations of corrections to mean field and is
postponed to a forthcoming work.
The strategy of the proof of Theorem 1.1 can be taken further to derive uniform-in-time concen-
tration estimates and a quantitative central limit theorem for the empirical measure. We start with
concentration estimates. For the Langevin dynamics (1.1), the following result completes the concen-
tration estimates obtained in [9, Theorem 5]. In the simpler setting of the Brownian dynamics (1.3),
corresponding results were already well-known: a uniform-in-time concentration estimate was first de-
duced in [70] from a logarithmic Sobolev inequality in the case when W is convex, and it was largely
extended more recently in [69].
Theorem 1.2 (Uniform-in-time concentration). There exists κ_0 > 0 (only depending on d, β, W, A) such that the following holds for any κ ∈ [0, κ_0]. Assume that the initial law µ_◦ ∈ P(𝔻^d) is compactly supported, and consider the Langevin dynamics (1.1) and the associated empirical measure µ^N_t, cf. (1.11). For all ε > 0, there exists C_ε > 0 (only depending on d, β, W, A, ε, µ_◦) such that the following holds: for all φ ∈ C_c^∞(𝔻^d) and N, t, r ≥ 0,
\[
\mathbb P\bigg[\int_{\mathbb D^d}\phi\,\mu^N_t - \mathbb E\Big[\int_{\mathbb D^d}\phi\,\mu^N_t\Big] \ge r\bigg] \;\le\; \exp\bigg(-\frac{Nr^2}{C_\varepsilon\,\|\phi\|_{W^{2+\varepsilon,\infty}(\mathbb D^d)}}\bigg),
\]
provided that r ≤ (eC_ε‖φ‖_{W^{2+ε,∞}(𝔻^d)})^{1/2} and that r(Nr²)^ε ≤ e^{1/2}(eC_ε‖φ‖_{W^{2+ε,∞}(𝔻^d)})^ε.
Finally, we state our uniform-in-time quantitative central limit theorem (CLT) for leading fluc-
tuations of the empirical measure. As expected from formal computations, leading fluctuations are
described by the Gaussian linearized Dean–Kawasaki SPDE, cf. (1.19) below. A qualitative CLT has
actually been known to hold since the early days of mean-field theory [85, 83, 47], and it has recently
been extended to some singular interaction potentials as well [89, 28]. In case of smooth interactions,
as considered in the present work, an optimal quantitative estimate for fluctuations already follows
from [40], but we provide here the first uniform-in-time result. To our knowledge, this is new both in
the Langevin and in the Brownian cases.
Theorem 1.3 (Uniform-in-time CLT). There exist κ_0, λ_0 > 0 (only depending on d, β, W, A) such that the following holds for any κ ∈ [0, κ_0]. Assume that the initial law µ_◦ ∈ P(𝔻^d) satisfies ∫_{𝔻^d}|z|^{p_0} µ_◦(dz) < ∞ for some 0 < p_0 ≤ 1, and consider the Langevin dynamics (1.1) and the associated empirical measure µ^N_t, cf. (1.11). For all φ ∈ C_c^∞(𝔻^d), there exists C_φ > 0 (only depending on d, β, W, A, φ, µ_◦) such that for all N, t ≥ 0 we have
\[
d_2\Big(N^{\frac12}\int_{\mathbb D^d}\phi\,(\mu^N_t-\mu_t)\;;\;\int_{\mathbb D^d}\phi\,\nu_t\Big) \;\le\; C_\phi\Big(N^{-\frac12} + e^{-p_0\lambda_0 t}\,N^{-\frac13}\Big),
\]
where:
— d₂ stands for the second-order Zolotarev distance between random variables,
\[
d_2(X;Y) := \sup\Big\{\mathbb E[g(X)] - \mathbb E[g(Y)] \;:\; g\in C^2_b(\mathbb R),\; g'(0)=0,\; \|g''\|_{L^\infty(\mathbb R)}=1\Big\}; \tag{1.18}
\]
— the limit fluctuation ν_t is the centered Gaussian process that is the unique almost sure distributional solution of the Gaussian linearized Dean–Kawasaki SPDE (see Section 7.1 for details),
\[
\left\{\begin{aligned}
\partial_t\nu_t + v\cdot\nabla_x\nu_t \;=\;& \,\mathrm{div}_v\big(\sqrt{\mu_t}\,\xi_t\big) + \tfrac12\,\mathrm{div}_v\big((\nabla_v+\beta v)\nu_t\big)\\
&+ \nabla A\cdot\nabla_v\nu_t + \kappa(\nabla W*\nu_t)\cdot\nabla_v\mu_t + \kappa(\nabla W*\mu_t)\cdot\nabla_v\nu_t,\\
\nu_t|_{t=0} \;=\;&\,\nu_\circ,
\end{aligned}\right.\tag{1.19}
\]

where ξ is a vector-valued space-time white noise on ℝ⁺ × 𝔻^d, and where ν_◦ is the Gaussian field describing the fluctuations of the initial empirical measure, in the sense that N^{1/2}∫_{𝔻^d} φ (µ^N_◦ − µ_◦) converges in law to ∫_{𝔻^d} φ ν_◦ for all φ ∈ C_c^∞(𝔻^d).²

²In other words, this means that ν_◦ is the random tempered distribution on 𝔻^d characterized by having Gaussian law with E[∫_{𝔻^d} φν_◦] = 0 and Var[∫_{𝔻^d} φν_◦] = ∫_{𝔻^d}(φ − ∫_{𝔻^d} φµ_◦)² µ_◦ for all φ ∈ C_c^∞(𝔻^d).

Remark 1.4 (Higher-order fluctuations). In recent years, much work has been devoted to the jus-
tification of the non-Gaussian nonlinear Dean–Kawasaki equation, which is a highly singular SPDE
formally expected to capture higher-order fluctuations; see in particular [64, 30, 36, 31]. In contrast,
the above result only focuses on Gaussian leading fluctuations, but it provides the first uniform-in-
time justification. Extensions to non-Gaussian corrections and the uniform-in-time justification of the
nonlinear Dean–Kawasaki equation is postponed to a future work.
Remark 1.5 (Confining potential). Although we focus for simplicity on particle systems in Rd with
quadratic confinement (1.2), we emphasize that this requirement is not essential.
(a) Non-quadratic confinement: The same results hold if instead of the quadratic confinement (1.2) we choose A(x) = a|x|² + A′(x) for some a > 0 and some smooth potential A′ ∈ C_b^∞(ℝ^d), provided that ‖∇²A′‖_{L^∞(ℝ^d)} is small enough (depending on β, W, a). In that case, we can still appeal to [9] to ensure the validity of Theorem 3.1(i), while the rest of our approach can be adapted directly without major difficulties.
without major difficulties.
(b) Periodic setting: Our approach is easily adapted to particle systems on the torus Td with A ≡ 0.
The above results still hold in the same form in that case, and the only difference in the proof
appears when investigating the ergodic properties of the linearized mean-field operator. We refer
to Remark 3.2 for details.
Remark 1.6 (Expansions of functionals along the flow). Along the way, we also extend the work
of Chassagneux, Szpruch, and Tse [25] to the case of the kinetic Langevin dynamics: more precisely,
in the setting of Theorem 1.1 with κ ∈ [0, κ0 ], for all smooth functionals Φ,3 we obtain a truncated
expansion of the following form, for all k ≥ 0,
\[
\mathbb E\big[\Phi(\mu^N_t)\big] - \Phi(\mu_t) \;=\; \sum_{j=1}^k \frac{C_{j,\Phi}(t,\mu_\circ)}{N^j} + O(N^{-k-1}),
\]

with exact expressions for the coefficients {Cj,Φ (t, µ◦ )}j independent of N . As explained in [25, Sec-
tion 1.1], by means of Romberg extrapolation, such an expansion can be used to accelerate the conver-
gence of numerical schemes to estimate Φ(µt ) through the particle method. Only the case of Brownian
dynamics was previously covered in [25].

³For our purposes in this work, we actually focus on linear functionals Φ, but we emphasize that this is not essential in the proofs, and nonlinear functionals could be considered as well under suitable smoothness assumptions as in [25].
1.3. Strategy and plan of the paper. We start by describing the strategy of the proof of Theo-
rem 1.1. It is well known that the estimation of correlation functions {G^{m,N}}_{2≤m≤N} can be reduced to the estimation of cumulants {κ^n(∫_{𝔻^d} φ µ^N_t)}_{n≥1} of linear functionals of the empirical measure µ^N_t; see
Lemma 2.6. As the probability space is a product space accounting both for initial data and for Brown-
ian forces, cumulants can be split through the law of total cumulance: we are led to consider separately
“initial” and “Brownian” cumulants. To estimate initial cumulants, we appeal to the machinery that we
developed in [40] based on so-called Glauber calculus; see Section 2.3. In order to estimate Brownian
cumulants, we might try to appeal similarly to Malliavin calculus in the form of [76]. Unfortunately,
representations of cumulants through Malliavin calculus do not seem easy to combine with ergodic
properties of the linearized mean-field equation to deduce uniform-in-time estimates. Instead, we draw
inspiration from the recent literature on mean-field games using the master equation formalism and
the so-called Lions calculus on the space of probability measures, cf. [19, 25, 24]. In a nutshell, the
key idea is to consider the mean-field semigroup induced on functionals µ 7→ Φ(µ) on the space of
probability measures, and then to use Lions calculus on that space to expand the Brownian expecta-
tion EB [Φ(µN t )] of functionals along the particle dynamics; see Lemma 2.1. As noted by Delarue and
Tse [35], such expansions can be combined with ergodic properties of the linearized mean-field equa-
tion to obtain uniform-in-time estimates. Yet, this does not immediately lead to the desired cumulant
estimates Gm,N
t = O(N 1−m ): we further need to capture underlying cancellations, which we achieve
by developing new diagrammatic techniques in form of so-called Lions graphs; see Section 4.
As explained, for uniform-in-time estimates, we rely on ergodic properties of the linearized mean-
field equation. While ergodic estimates follow from the standard parabolic theory in the case of the
Brownian dynamics, cf. [35], we have to further appeal to hypocoercivity techniques in the kinetic
Langevin setting. For ergodic estimates on the weighted space L2 (M −1/2 ), where the weight is given
by the steady state M for the mean-field equation, we can simply appeal to hypocoercivity in form of
the theory of Dolbeault, Mouhot, and Schmeiser [38]. Since estimating cumulants costs derivatives, we
rather need ergodic estimates on negative Sobolev spaces, and we easily check that the estimates on
L2 (M −1/2 ) can be upgraded to estimates on H −k (M −1/2 ) for all k ≥ 0. Yet, we would ideally rather
need ergodic estimates on the larger space W −k,1(Dd ). Unfortunately, even the enlargement theory of
Gualdani, Mischler, and Mouhot [52, 73] does not allow to reach such spaces. In Section 3, we revisit


enlargement techniques and show that we can actually reach W^{−k,q}(⟨z⟩^p) with arbitrarily small q > 1
and p > 0 provided pq ′ ≫ 1, which happens to be just enough for our purposes.
Finally, in order to deduce the concentration estimates and the quantitative CLT stated in Theo-
rems 1.2 and 1.3, we combine the same Lions expansions with the Herbst argument and with Stein’s
method, respectively. We believe this combination of techniques to be of independent interest for ap-
plications to other settings. Note that the proof of Theorems 1.2 and 1.3 is actually much simpler than
the proof of cumulant estimates in Theorem 1.1 as it does not require to capture arbitrarily fine can-
cellations. For the quantitative CLT, for instance, the proof essentially boils down to the convergence
of the variance and to the smallness of the third cumulant of the empirical measure, thus requiring no
fine information on higher cumulants.
Plan of the paper. We start in Section 2 with the presentation and development of the main technical
tools that are used to prove our main results, namely Lions and Glauber calculus. In Section 3,
we establish suitable ergodic estimates for the linearized mean-field equation, which are key to our
uniform-in-time results. In Section 4, we develop suitable diagrammatic representations for iterated
Lions expansions of Brownian cumulants of the empirical measure, which allows us to systematically
capture the needed cancellations. Finally, the correlation estimates of Theorem 1.1 are concluded in
Section 5, the concentration estimates of Theorem 1.2 are established in Section 6, and the quantitative
CLT of Theorem 1.3 is proven in Section 7.
1.4. Notation. For notational convenience, we consider a general framework that covers both the
Langevin and the Brownian dynamics (1.1) and (1.3) as special cases. More precisely, we denote by
{Zti,N }1≤i≤N the set of particle trajectories in the space X := Dd or Rd , as given by the following
system of coupled SDEs: for 1 ≤ i ≤ N ,
\[
\left\{\begin{aligned}
dZ^{i,N}_t &= b(Z^{i,N}_t,\mu^N_t)\,dt + \sigma_0\,dB^i_t, \qquad t\ge 0,\\
Z^{i,N}_t\big|_{t=0} &= Z^{i,N}_\circ,
\end{aligned}\right.\tag{1.20}
\]
where µ^N_t stands for the empirical measure
\[
\mu^N_t := \tfrac1N\sum_{i=1}^N \delta_{Z^{i,N}_t} \in \mathcal P(\mathbb X),
\]

where b : X × P(X) → Rd is a smooth functional (in a sense that will be made clear later on), where
{B i }i are i.i.d. Brownian motions in X, and where σ0 is a constant matrix. We assume that initial
data {Z_◦^{i,N}}_{1≤i≤N} are i.i.d. with some density µ_◦ ∈ P(X). The associated mean-field equation takes the form of the following McKean–Vlasov equation,
\[
\left\{\begin{aligned}
&\partial_t\mu + \mathrm{div}\big(b(\cdot,\mu)\,\mu\big) = \tfrac12\,\mathrm{div}(a_0\nabla\mu), \qquad\text{in } \mathbb R^+\times\mathbb X,\\
&\mu|_{t=0} = \mu_\circ, \qquad\text{in } \mathbb X,
\end{aligned}\right.\tag{1.21}
\]
with a_0 := σ_0σ_0^T, and we denote the well-posed solution operator on P(X) by
\[
\mu_t := m(t;\mu_\circ). \tag{1.22}
\]
This general framework allows us to consider both systems of interest (1.1) and (1.3) at once: the
Langevin dynamics (1.1) is given by
 
\[
\mathbb X = \mathbb D^d, \qquad b((x,v),\mu) = \Big(v,\; -\tfrac\beta2 v - (\nabla A + \kappa\nabla W*\mu)(x)\Big), \qquad \sigma_0 = \begin{pmatrix} 0_{\mathbb R^d} & 0_{\mathbb R^d}\\ 0_{\mathbb R^d} & \mathrm{Id}_{\mathbb R^d}\end{pmatrix}, \tag{1.23}
\]
and the Brownian dynamics (1.3) by
\[
\mathbb X = \mathbb R^d, \qquad b(x,\mu) = -(\nabla A + \kappa\nabla W*\mu)(x), \qquad \sigma_0 = \mathrm{Id}_{\mathbb R^d}. \tag{1.24}
\]
Note that the diffusion matrix a0 = σ0 σ0T is degenerate in the Langevin case, which is why specific
hypocoercivity techniques are then needed. Most of our work can actually be performed in the general
framework (1.20) without any structural assumption on X, b, σ0 , except when establishing ergodic
estimates in Section 3. More precisely, our different main results hold for any system of the form (1.20),

under suitable smoothness assumptions for b, provided that the ergodic estimates of Theorem 3.1 are
available. For the latter, we restrict to the setting of the Langevin or Brownian dynamics in the
weak coupling regime κ ≪ 1. Under mere smoothness assumptions on b, if ergodic estimates are not
available, we note that our analysis can at least be repeated to obtain non-uniform estimates with
exponential time growth.
Finally, let us briefly list the main notation used throughout this work:
— We denote by C ≥ 1 any constant that only depends on the space dimension d. We use the notation ≲ (resp. ≳) for ≤ C× (resp. ≥ \tfrac1C×) up to such a multiplicative constant C. We write ≃ when both ≲ and ≳ hold. We add subscripts to C, ≲, ≳, ≃ to indicate dependence on other parameters.
— The underlying probability space (Ω, P) splits as a product (Ω, P) = (Ω◦ , P◦ ) × (ΩB , PB ), where
the first factor accounts for random initial data and where the second factor accounts for Brownian
forces. The space Ω◦ is endowed with the σ-algebra F◦ = σ(Z◦1,N , . . . , Z◦N,N ) generated by initial
data, while ΩB is endowed with the σ-algebra σ((Bt1 , . . . , BtN )t≥0 ). We also denote by FtB :=
σ((B^1_s, . . . , B^N_s)_{0≤s≤t}) the Brownian filtration. We use E[·] and κ^m[·] to denote the expectation and the cumulant of order m with respect to P, and we similarly denote by E_◦[·], κ^m_◦[·] and by E_B[·], κ^m_B[·] the expectation and cumulants with respect to P_◦ and P_B, respectively.
— For any two integers b ≥ a ≥ 0, we use the short-hand notation Ja, bK := {a, a + 1, . . . , b}, and in
addition for any integer a ≥ 1 we set JaK := J1, aK.
— For all z ∈ ℝ^d, we use the notation ⟨z⟩ := (1 + |z|²)^{1/2}.

2. Preliminary
This section is devoted to the presentation and development of the main technical tools used in this
work. We start with an account of the master equation formalism and of Lions calculus for functionals
on the space of probability measures, then we turn to the study of correlation functions by means of
cumulants of the empirical measure, and finally we recall useful tools from Glauber calculus.
2.1. Lions calculus. We recall several notions of derivatives for functionals defined on the space P(X)
of probability measures, and how they can be used to expand functionals along the particle dynamics.
2.1.1. Linear derivative. We start with the notion of linear derivative, as used for instance by Lions
in his course at Collège de France [16]; see also [19, Chapter 5] for a slightly different exposition. A
functional V : P(X) → R is said to be continuously differentiable if there exists a continuous map
\frac{\delta V}{\delta\mu} : P(X) × X → ℝ such that, for all µ_0, µ_1 ∈ P(X),
\[
V(\mu_0) - V(\mu_1) = \int_0^1\!\!\int_{\mathbb X}\frac{\delta V}{\delta\mu}\big(s\mu_0 + (1-s)\mu_1,\,y\big)\,(\mu_0-\mu_1)(dy)\,ds, \tag{2.1}
\]

and we then call \frac{\delta V}{\delta\mu} the linear functional derivative of V. This definition holds up to a constant, which we fix by setting
\[
\int_{\mathbb X}\frac{\delta V}{\delta\mu}(\mu,y)\,\mu(dy) = 0, \qquad\text{for all } \mu\in\mathcal P(\mathbb X).
\]
The name “linear derivative” reflects the fact that it is precisely defined to satisfy, for all µ ∈ P(X) and y ∈ X,
\[
\lim_{h\to0}\frac{V\big((1-h)\mu + h\delta_y\big) - V(\mu)}{h} = \frac{\delta V}{\delta\mu}(\mu,y). \tag{2.2}
\]
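As a standard illustration (not specific to this work), the linear derivative of the two simplest functionals can be computed explicitly: for instance, on X = ℝ^d,
\[
V(\mu) = \int_{\mathbb X}\phi\,d\mu \;\Longrightarrow\; \frac{\delta V}{\delta\mu}(\mu,y) = \phi(y) - \int_{\mathbb X}\phi\,d\mu,
\qquad
V(\mu) = \tfrac12\iint_{\mathbb X\times\mathbb X} W(x-y)\,\mu(dx)\,\mu(dy) \;\Longrightarrow\; \frac{\delta V}{\delta\mu}(\mu,y) = (W*\mu)(y) - \iint_{\mathbb X\times\mathbb X} W\,d\mu\,d\mu,
\]
with W even; in both cases the subtracted constant enforces the above normalization.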
Higher-order linear derivatives are defined by induction: for all integers p ≥ 1, if the functional V is
p-times continuously differentiable, we say that it is (p + 1)-times continuously differentiable if there
exists a continuous map \frac{\delta^{p+1}V}{\delta\mu^{p+1}} : P(X) × X^{p+1} → ℝ such that for all µ, µ′ ∈ P(X) and y ∈ X^p,
\[
\frac{\delta^pV}{\delta\mu^p}(\mu,y) - \frac{\delta^pV}{\delta\mu^p}(\mu',y) = \int_0^1\!\!\int_{\mathbb X}\frac{\delta^{p+1}V}{\delta\mu^{p+1}}\big(s\mu+(1-s)\mu',\,y,\,y'\big)\,(\mu-\mu')(dy')\,ds.
\]

Once again, to ensure the uniqueness of the (p+1)th linear functional derivative \frac{\delta^{p+1}V}{\delta\mu^{p+1}}, we choose the convention
\[
\int_{\mathbb X}\frac{\delta^{p+1}V}{\delta\mu^{p+1}}(\mu,y_1,\ldots,y_{p+1})\,\mu(dy_{p+1}) = 0, \qquad\text{for all } \mu\in\mathcal P(\mathbb X)\text{ and }y_1,\ldots,y_p\in\mathbb X.
\]

2.1.2. L-derivative. We further recall the notion of so-called L-derivatives (or Lions derivatives, or
intrinsic derivatives), as developed in [68]. We refer e.g. to [17, Section 2.2] for the link to the Otto
calculus on Wasserstein space [77, 1]. For a continuously differentiable functional V : P(X) → R, if the
map y ↦ \frac{\delta V}{\delta\mu}(µ, y) is of class C¹ on X, the L-derivative of V is defined as
\[
\partial_\mu V(\mu,y) := \nabla_y\,\frac{\delta V}{\delta\mu}(\mu,y). \tag{2.3}
\]
We also define corresponding higher-order derivatives: for all µ ∈ P(X) and y1 , . . . , yp ∈ X, we define,
provided that it makes sense,
\[
\partial_\mu^p V(\mu,y_1,\ldots,y_p) := \nabla_{y_1}\cdots\nabla_{y_p}\,\frac{\delta^pV}{\delta\mu^p}\big(\mu,y_1,\ldots,y_p\big).
\]
2.1.3. Master equation formalism. In terms of the above calculus on the space P(X) of probability
measures, we now introduce the so-called master equation formalism to describe the evolution of
functionals on P(X) along the mean-field flow. For a smooth functional Φ : P(X) → R, we define
U (t, µ) := Pt Φ(µ) := Φ(m(t, µ)), t ≥ 0, µ ∈ P(X), (2.4)
where we recall the notation m(t, µ) for the mean-field solution operator (1.22). This defines a semi-
group (Pt )t≥0 acting on bounded measurable functionals on P(X). From [14, Theorem 7.2], using the
regularity of b, and assuming corresponding regularity of Φ, we find that U (t, µ) satisfies the following
master equation, which is viewed as an evolution equation for functionals on P(X),
\[
\left\{\begin{aligned}
&\partial_t U(t,\mu) = \int_{\mathbb X}\Big[b(x,\mu)\cdot\nabla_x\frac{\delta U}{\delta\mu}(t,\mu,x) + \tfrac12\,a_0:\nabla^2_x\frac{\delta U}{\delta\mu}(t,\mu,x)\Big]\,\mu(dx),\\
&U(0,\mu) = \Phi(\mu),
\end{aligned}\right.\tag{2.5}
\]
where we recall a0 = σ0 σ0T . For the Langevin dynamics (1.1), this takes on the following guise,
\[
\left\{\begin{aligned}
\partial_t U(t,\mu) = \int_{\mathbb R^d\times\mathbb R^d}\Big[&\big(v\cdot\nabla_x - \tfrac\beta2\nabla_v\big)\frac{\delta U}{\delta\mu}(t,\mu,x,v) + \tfrac12\,\triangle_v\frac{\delta U}{\delta\mu}(t,\mu,x,v)\\
&- \big(\nabla A(x) + \kappa\nabla W*\mu(x)\big)\cdot\nabla_v\frac{\delta U}{\delta\mu}(t,\mu,x,v)\Big]\,\mu(dx\,dv),\\
U(0,\mu) = \Phi(\mu).\hspace{2em}&
\end{aligned}\right.
\]

2.1.4. Expansions along the particle dynamics. We recall the following useful result that allows to
expand functionals along the particle dynamics in terms of the corresponding mean-field flow, cf. [19,
(5.131)] or [25, Lemma 2.8]. Note that the proof in [25] only relies on the master equation (2.5)
and on [24, Proposition 3.1], so that in particular there is no uniform ellipticity requirement for the
diffusivity a0 = σ0 σ0T .
Lemma 2.1 (see [19, 25]). Let Φ : P(X) → R be a smooth functional and let U (t, µ) be defined in (2.4).
Then for all 0 ≤ s ≤ t we have
\[
U(t-s,\mu^N_s) = U(t,\mu^N_0) + \frac{1}{2N}\int_0^s\!\!\int_{\mathbb X}\mathrm{tr}\Big[a_0\,\partial^2_\mu U(t-u,\mu^N_u)(z,z)\Big]\,\mu^N_u(dz)\,du + M^N_{t,s}, \tag{2.6}
\]
where (M^N_{t,s})_{s≥0} is a square-integrable (F^B_s)_{s≥0}-martingale with M^N_{t,0} = 0, which is explicitly given by
\[
M^N_{t,s} := \frac1N\sum_{i=1}^N\int_0^{s\wedge t}\partial_\mu U(t-u,\mu^N_u)(Z^{i,N}_u)\cdot\sigma_0\,dB^i_u.
\]

This expansion will be used throughout this work to compare the empirical measure to the cor-
responding mean-field semigroup. More precisely, we shall abundantly use the following immediate
consequences.
Corollary 2.2. Let Φ : P(X) → R be a smooth functional and let U (t, µ) be defined in (2.4).
(i) For all t ≥ 0, we have
\[
\big|\mathbb E[\Phi(\mu^N_t)] - \mathbb E_\circ[\Phi(m(t,\mu^N_0))]\big| \;\lesssim\; N^{-1}\,\mathbb E\Big[\int_0^t\!\!\int_{\mathbb X}\big|\partial^2_\mu U(t-u,\mu^N_u)(z,z)\big|\,\mu^N_u(dz)\,du\Big].
\]
(ii) For all t ≥ 0, we have
\[
\begin{aligned}
\big\|\Phi(\mu^N_t) - \Phi(m(t,\mu^N_0))\big\|_{L^2(\Omega_B)} \;\lesssim\;& N^{-\frac12}\,\mathbb E_B\Big[\int_0^t\!\!\int_{\mathbb X}\big|\partial_\mu U(t-u,\mu^N_u)(z)\big|^2\,\mu^N_u(dz)\,du\Big]^{\frac12}\\
&+ N^{-1}\,\mathbb E_B\Big[\Big(\int_0^t\!\!\int_{\mathbb X}\big|\partial^2_\mu U(t-u,\mu^N_u)(z,z)\big|\,\mu^N_u(dz)\,du\Big)^2\Big]^{\frac12}.
\end{aligned}
\]

Proof. Taking the expectation E = E_◦E_B in (2.6), using E_B[M^N_{t,s}] = 0, and setting s = t, we are led in particular to the following expansion for the expectation of a functional of the empirical measure,
\[
\mathbb E[\Phi(\mu^N_t)] = \mathbb E_\circ[\Phi(m(t,\mu^N_0))] + \frac{1}{2N}\,\mathbb E\Big[\int_0^t\!\!\int_{\mathbb X}\mathrm{tr}\big[a_0\,\partial^2_\mu U(t-u,\mu^N_u)(z,z)\big]\,\mu^N_u(dz)\,du\Big],
\]
and item (i) immediately follows. Next, taking the L²(Ω) norm in (2.6), noting that Jensen's inequality yields
\[
\mathbb E_B\big[(M^N_{t,s})^2\big] \;\le\; N^{-1}\,\mathbb E_B\Big[\int_0^{s\wedge t}\!\!\int_{\mathbb X}\big|\sigma_0^T(\partial_\mu U)(t-u,\mu^N_u)(z)\big|^2\,\mu^N_u(dz)\,du\Big],
\]
and setting s = t, we similarly obtain item (ii). □

Due to the above result, as emphasized in [19, 25, 35], Lions calculus provides a natural starting
point for propagation of chaos, which was indeed successfully used in particular in [35] to establish
uniform-in-time weak propagation of chaos estimates for the Brownian dynamics. More precisely, in
order to obtain a weak propagation of chaos estimate of the form (1.10),

\[
\mathbb E[\Phi(\mu^N_t)] - \Phi(m(t,\mu_\circ)) = O(\tfrac1N),
\]
we can appeal to item (i) above, and it remains to compare E_◦[Φ(m(t, µ^N_0))] to Φ(m(t, µ_◦)). The missing estimate is provided by the following general result; see [25, Theorem 2.11].
Lemma 2.3 (see [25]). For any smooth functional Φ : P(X) → R, we have
\[
\mathbb E[\Phi(\mu^N_0)] - \Phi(\mu_\circ) = \frac1N\,\mathbb E\bigg[\int_0^1\!\!\int_0^1\!\!\int_{\mathbb X}\Big(\frac{\delta^2\Phi}{\delta\mu^2}(\tilde\mu^N_{s,u,z},z,z) - \frac{\delta^2\Phi}{\delta\mu^2}(\tilde\mu^N_{s,u,z},z,Z^{1,N}_\circ)\Big)\,\mu_\circ(dz)\,du\,s\,ds\bigg],
\]
in terms of
\[
\tilde\mu^N_{s,u,z} := \frac{su}{N}\,\big(\delta_z - \delta_{Z^{1,N}_\circ}\big) + \mu_\circ + s\,(\mu^N_0 - \mu_\circ).
\]

2.2. Cumulants. In order to estimate the many-particle correlation functions {Gk,N }1≤k≤N defined
in (1.13), we shall proceed by estimating cumulants of the empirical measure, which have a more
exploitable probabilistic content. We recall that the mth cumulant of a bounded random variable X
is defined by
  
d m
κm [X] := ( dt ) log E etX ,
t=0
12 A. BERNOU AND M. DUERINCKX

hence in particular,
κ1 [X] = E[X],
κ2 [X] = E[X 2 ] − E[X]2 = Var[X],
κ3 [X] = E[X 3 ] − 3E[X 2 ]E[X] + 2E[X]3 ,
κ4 [X] = E[X 4 ] − 4E[X 3 ]E[X] − 3E[X 2 ]2 + 12E[X 2 ]E[X]2 − 6E[X]4 ,
and so on. Using similar notation as in (1.13), the following general formula holds for all m ≥ 1,
\[
\kappa^m[X] = \sum_{\pi\vdash\llbracket m\rrbracket} (-1)^{\sharp\pi-1}(\sharp\pi-1)!\,\prod_{A\in\pi}\mathbb E\big[X^{\sharp A}\big]. \tag{2.7}
\]
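To make the combinatorial formula (2.7) concrete, here is a small self-contained sketch (an illustration only, not part of the arguments of this paper) that evaluates it by enumerating set partitions and checks it against the explicit expressions for κ² and κ³ listed above, using empirical moments of a sample.

```python
import math
import numpy as np
from itertools import combinations

def partitions(s):
    """Enumerate all set partitions of the list s (each partition is a list of blocks)."""
    if not s:
        yield []
        return
    first, rest = s[0], s[1:]
    for k in range(len(rest) + 1):
        for subset in combinations(rest, k):
            remaining = [x for x in rest if x not in subset]
            for p in partitions(remaining):
                yield [[first] + list(subset)] + p

def cumulant_from_moments(moments, m):
    """Evaluate formula (2.7): kappa^m[X] as a sum over set partitions of {1,...,m}."""
    total = 0.0
    for pi in partitions(list(range(m))):
        total += (-1) ** (len(pi) - 1) * math.factorial(len(pi) - 1) \
                 * np.prod([moments[len(A)] for A in pi])
    return total

rng = np.random.default_rng(0)
X = rng.uniform(size=200_000)                     # a bounded sample, for illustration
moments = {k: np.mean(X**k) for k in range(1, 5)}

print(cumulant_from_moments(moments, 2), moments[2] - moments[1]**2)
print(cumulant_from_moments(moments, 3),
      moments[3] - 3*moments[2]*moments[1] + 2*moments[1]**3)
```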

We can also define the joint cumulant of a family of bounded random variables X1 , . . . , Xm as
\[
\kappa[X_1,\ldots,X_m] := \frac{d^m}{dt_1\ldots dt_m}\,\log\mathbb E\big[e^{\sum_{j=1}^m t_jX_j}\big]\Big|_{t_1=\ldots=t_m=0}.
\]
Since we consider in this work a product probability space (Ω, P) = (Ω◦ , P◦ ) × (ΩB , PB ), where the
first factor accounts for random initial data and where the second factor accounts for Brownian forces,
we shall appeal to the following law of total cumulance in order to split cumulants accordingly.
Lemma 2.4 (see [13]). For all m ≥ 2 and all bounded random variables X, we have
\[
\kappa^m[X] = \sum_{\pi\vdash\llbracket m\rrbracket}\kappa_\circ^{\sharp\pi}\Big[\big(\kappa_B^{\sharp A}[X]\big)_{A\in\pi}\Big],
\]

where we recall that κ◦ and κB stand for cumulants with respect to P◦ and PB , respectively.
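For m = 2 this is nothing but the law of total variance, which may help fix ideas: the two partitions {{1,2}} and {{1},{2}} of ⟦2⟧ yield
\[
\mathrm{Var}[X] = \kappa_\circ^1\big[\kappa_B^2[X]\big] + \kappa_\circ^2\big[\kappa_B^1[X],\kappa_B^1[X]\big] = \mathbb E_\circ\big[\mathrm{Var}_B[X]\big] + \mathrm{Var}_\circ\big[\mathbb E_B[X]\big].
\]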
2.2.1. Moments and cumulants. While cumulants are defined as polynomial expressions involving mo-
ments, cf. (2.7), those relations are easily inverted: similarly as in (1.12), moments can be recovered
from cumulants in form of a cluster expansion,
\[
\mathbb E[X^m] = \sum_{\pi\vdash\llbracket m\rrbracket}\prod_{A\in\pi}\kappa^{\sharp A}[X]. \tag{2.8}
\]

For later purposes, we state the following recurrence relation between moments and cumulants: it
immediately implies the above cluster expansion by induction, and it will be useful in this form in the
sequel. A short proof is included for convenience.
Lemma 2.5. For all m ≥ 2 and all bounded random variables X1 , . . . , Xm , we have
\[
\mathbb E[X_1\cdots X_m] = \sum_{J\subset\llbracket2,m\rrbracket}\kappa[X_1,X_J]\;\mathbb E\Big[\prod_{j\in\llbracket2,m\rrbracket\setminus J}X_j\Big], \tag{2.9}
\]
where we use the standard convention ∏_{j∈∅} X_j = 1 for the empty product. In particular, for all m ≥ 1 and all bounded random variables X, we have
\[
\mathbb E[X^m] = \sum_{j=1}^m\binom{m-1}{j-1}\,\kappa^j[X]\,\mathbb E[X^{m-j}].
\]

Proof. We follow [76, Proposition 2.2], extending it to the present multivariate setting. Let
\[
M_{X_1,\ldots,X_m}(t_1,\ldots,t_m) := \mathbb E\big[e^{\sum_{j=1}^m t_jX_j}\big]
\]
be the multivariate moment generating function of X_1, . . . , X_m. We can write
\[
\begin{aligned}
\mathbb E[X_1\cdots X_m] &= \frac{d^m}{dt_1\ldots dt_m}\,M_{X_1,\ldots,X_m}(t_1,\ldots,t_m)\Big|_{t_1=\ldots=t_m=0}\\
&= \frac{d^{m-1}}{dt_2\ldots dt_m}\Big(\Big(\frac{d}{dt_1}\log M_{X_1,\ldots,X_m}(t_1,\ldots,t_m)\Big)\,M_{X_1,\ldots,X_m}(t_1,\ldots,t_m)\Big)\Big|_{t_1=\ldots=t_m=0},
\end{aligned}
\]

and thus, by the Leibniz rule,


\[
\mathbb E[X_1\cdots X_m] = \sum_{J\subset\llbracket2,m\rrbracket}\Big(\frac{d^{\sharp J+1}}{dt_1\,dt_J}\,\log M_{X_1\ldots X_m}(t_1,\ldots,t_m)\Big)\Big|_{t_1=\ldots=t_m=0}\times\Big(\frac{d^{m-1-\sharp J}}{dt_{\llbracket2,m\rrbracket\setminus J}}\,M_{X_1\ldots X_m}(t_1,\ldots,t_m)\Big)\Big|_{t_1=\ldots=t_m=0},
\]
where dtJ stands for dtj1 . . . dtjs if J = {j1 , . . . , js }. By definition of cumulants and of the moment
generating function, this yields the conclusion. 
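As a quick sanity check of (2.9) (for illustration only), take m = 2: the two choices J = ∅ and J = {2} give
\[
\mathbb E[X_1X_2] = \kappa[X_1]\,\mathbb E[X_2] + \kappa[X_1,X_2] = \mathbb E[X_1]\,\mathbb E[X_2] + \mathrm{Cov}[X_1,X_2],
\]
which is the usual covariance decomposition.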
2.2.2. From cumulants to correlations. We work out the standard link between cumulants of the em-
pirical measure and correlation functions. We state it in form of an inequality that can be directly
iterated to bound successive correlation functions in terms of cumulants of the empirical measure. This
was used for instance in [40, Section 4], but we provide a self-contained statement and a short proof
for convenience.
Lemma 2.6. For all 1 ≤ m ≤ N and t ≥ 0, we have
\[
\Big|\int_{\mathbb X^m}\phi^{\otimes m}\,G^{m,N}_t\Big| \;\le\; \Big|\kappa^m\Big[\int_{\mathbb X}\phi\,d\mu^N_t\Big]\Big| \,+\, C_m\sum_{\substack{\pi\vdash\llbracket m\rrbracket\\ \sharp\pi<m}}\;\sum_{\rho\vdash\pi} N^{\sharp\pi-\sharp\rho-m+1}\,\bigg|\int_{\mathbb X^{\sharp\pi}}\Big(\bigotimes_{B\in\pi}\phi^{\sharp B}\Big)\Big(\bigotimes_{D\in\rho}G^{\sharp D,N}_t(z_D)\Big)\,dz_\pi\bigg|.
\]

Proof. We start from the relation between cumulants and moments, cf. (2.7), applied to a linear
functional of the empirical measure: given φ ∈ Cc∞ (X), we have
\[
\kappa^m\Big[\int_{\mathbb X}\phi\,d\mu^N_t\Big] = \sum_{\pi\vdash\llbracket m\rrbracket}(-1)^{\sharp\pi-1}(\sharp\pi-1)!\,\prod_{A\in\pi}\mathbb E\Big[\Big(\int_{\mathbb X}\phi\,d\mu^N_t\Big)^{\sharp A}\Big].
\]

Now moments of the empirical measure µ^N_t = \tfrac1N\sum_{i=1}^N\delta_{Z^{i,N}_t} can be computed as follows,
\[
\begin{aligned}
\mathbb E\Big[\Big(\int_{\mathbb X}\phi\,d\mu^N_t\Big)^n\Big] &= \frac1{N^n}\sum_{i_1,\ldots,i_n=1}^N\mathbb E\Big[\prod_{\ell=1}^n\phi(Z^{i_\ell,N}_t)\Big]\\
&= \frac1{N^n}\sum_{\pi\vdash\llbracket n\rrbracket} N(N-1)\cdots(N-\sharp\pi+1)\int_{\mathbb X^{\sharp\pi}}\Big(\bigotimes_{A\in\pi}\phi^{\sharp A}\Big)\,F^{\sharp\pi,N}_t,
\end{aligned}
\]

while marginals of F^N can be expressed in terms of correlations via the cluster expansion (1.12),
\[
F^{n,N}_t(z_{\llbracket n\rrbracket}) = \sum_{\pi\vdash\llbracket n\rrbracket}\prod_{A\in\pi}G^{\sharp A,N}_t(z_A).
\]

Combining those different identities, after straightforward simplifications, we obtain the following
expression for cumulants of the empirical measure in terms of correlation functions,
\[
\kappa^m\Big[\int_{\mathbb X}\phi\,d\mu^N_t\Big] = \sum_{\pi\vdash\llbracket m\rrbracket}\sum_{\rho\vdash\pi} N^{\sharp\pi-m}\,K_N(\rho)\int_{\mathbb X^{\sharp\pi}}\Big(\bigotimes_{B\in\pi}\phi^{\sharp B}\Big)\Big(\bigotimes_{D\in\rho}G^{\sharp D,N}_t(z_D)\Big)\,dz_\pi, \tag{2.10}
\]
where the coefficients are given by
\[
K_N(\rho) := \sum_{\sigma\vdash\rho}(-1)^{\sharp\sigma-1}(\sharp\sigma-1)!\,\prod_{C\in\sigma}\big(1-\tfrac1N\big)\cdots\Big(1-\tfrac{(\sum_{D\in C}\sharp D)-1}{N}\Big).
\]

Isolating ∫_{𝕏^m} φ^{⊗m} G^{m,N}_t in the right-hand side of (2.10) (this term is obtained for the choice π = {{1}, . . . , {m}} and ρ = {π}), and noting that |K_N(ρ)| ≤ C_m N^{1−♯ρ}, the conclusion follows. □

2.3. Glauber calculus. We recall some useful tools from the so-called Glauber calculus on (Ω◦ , P◦ ),
as developed in particular by the second-named author in [40] (see also [33, 42]). Given that initial
data (Z◦j,N )1≤j≤N are i.i.d., the probability measure P◦ is a product measure and we denote by E◦,j
the expectation with respect to the jth variable Z◦j,N only. Given a random variable X ∈ L2 (Ω◦ ), we
then define its Glauber derivative at j ∈ JN K as
Dj◦ X := X − E◦,j [X].

The full gradient D ◦ X = (Dj◦ X)j∈JN K is viewed as an element of ℓ2 (JN K; L2 (Ω◦ )). A straightforward
computation shows that Dj◦ is self-adjoint on L2 (Ω◦ ) and satisfies
Dj◦ Dj◦ = Dj◦ , Dj◦ Dk◦ = Dk◦ Dj◦ , for all j, k ∈ JN K.
We then define the Glauber Laplacian
\[
L_\circ := (D^\circ)^*D^\circ = \sum_{j=1}^N(D^\circ_j)^*D^\circ_j = \sum_{j=1}^N D^\circ_j,
\]

which is a nonnegative self-adjoint operator on L2 (Ω◦ ). We recall some fundamental properties of this
operator; see [40, Lemmas 2.5 and 2.6].
Lemma 2.7 (see [40]).
(i) The kernel of L◦ is reduced to constants, ker L◦ = R. Moreover, L◦ has a unit spectral gap
above 0, and its spectrum is the set N.
(ii) The restriction of L◦ to (ker L◦ )⊥ = {X ∈ L2 (Ω◦ ) : E◦ [X] = 0} admits a well-defined inverse L−1
◦ ,
which is a nonnegative self-adjoint contraction on (ker L◦ )⊥ . Moreover, this inverse operator
satisfies for all 1 < p < ∞ and X ∈ Lp (Ω◦ ) with E◦ [X] = 0,
p2
kL−1
◦ XkLp (Ω◦ ) . p−1 kXkL (Ω◦ ) .
p (2.11)

(iii) The following Helffer–Sjöstrand representation holds for covariances: for all X, Y ∈ L²(Ω_◦),
\[
\mathrm{Cov}_\circ[X,Y] = \sum_{j=1}^N\mathbb E_\circ\big[(D^\circ_jX)\,L_\circ^{-1}(D^\circ_jY)\big]. \tag{2.12}
\]

Combining the spectral gap for L◦ and the Helffer–Sjöstrand inequality (2.12), we recover in partic-
ular the following well-known variance inequality due to Efron and Stein [46]: for all X ∈ L2 (Ω◦ ),
\[
\mathrm{Var}_\circ[X] \;\le\; \sum_{j=1}^N\mathbb E_\circ\big[|D^\circ_jX|^2\big]. \tag{2.13}
\]
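For orientation (a standard example, not needed in the sequel), equality holds in (2.13) for linear functionals of the initial empirical measure: if X = \tfrac1N\sum_{j=1}^N\phi(Z_\circ^{j,N}), then D^\circ_jX = \tfrac1N\big(\phi(Z_\circ^{j,N}) - \mathbb E_\circ[\phi(Z_\circ^{1,N})]\big), so that both sides of (2.13) equal \tfrac1N\,\mathrm{Var}_\circ[\phi(Z_\circ^{1,N})].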

2.3.1. Cumulant estimates via Glauber calculus. It was shown in [40] how cumulants can be expressed as
polynomials of Glauber derivatives. We further show now that this can be extended to joint cumulants
of families of random variables. For that purpose, we first introduce some notation and recall a
suitable notion of so-called Stein kernels {Γn }n generalizing the one in [40]. For all n ≥ 1, given
bounded σ((Z◦j,N )j )-measurable random variables X1 , . . . , Xn , we define for all j ∈ JN K,
\[
\delta^n_j(X_1,\ldots,X_n) := \mathbb E^{j\prime}_\circ\Big[\prod_{i=1}^n\big(X_i - (X_i)^{j\prime}\big)\Big],
\]

where for all i, j the random variable (X_i)^{j′} is obtained from X_i by replacing the underlying variable Z_◦^{j,N} by an i.i.d. copy, and where E^{j′}_◦ stands for the expectation with respect to this i.i.d. copy. Note in particular that δ^1_j(X_1) = D^◦_jX_1, while δ^n_j(X_1, . . . , X_n) should be compared to ∏_{i=1}^n(D^◦_jX_i). In these
terms, we now define the Stein kernels

\[
\begin{aligned}
\Gamma_0(X_1) &:= \Gamma^0_0(X_1) := X_1,\\
\Gamma_1(X_1,X_2) &:= \Gamma^1_1(X_1,X_2) := \sum_{j=1}^N(D^\circ_jX_2)\,L_\circ^{-1}(D^\circ_jX_1) = \sum_{j=1}^N\delta^1_j(X_2)\,L_\circ^{-1}(D^\circ_jX_1),
\end{aligned}
\]

and iteratively, for all n ≥ 1, m ≥ 0, and ♯J = m,

N 
X
Γnn+m (X1 , . . . , Xn+1 , XJ ) := δjm+1 (Xn+1 , XJ )L−1 n−1
◦ Dj Γn−1 (X1 , . . . , Xn )
j=1 
1 n+m
− m+2 1n>1 Γn−1 (X1 , . . . , Xn+1 , XJ ) ,

where we let XJ = (Xj1 , . . . , Xjs ) for J = {j1 , . . . , js }, and we then set

Γn (X1 , . . . , Xn+1 ) := Γnn (X1 , . . . , Xn+1 ).

Note that Γn (X1 , . . . , Xn+1 ) is not symmetric in its arguments X1 , . . . , Xn+1 (we could choose to
consider instead its symmetrization, but it does not matter). In these terms, we can now state the
following representation formula for cumulants.

Lemma 2.8. For all n ≥ 0 and all bounded σ((Z◦j,N )j )-measurable random variables X1 , . . . , Xn+1 ,
we have

\[
\kappa_\circ^{n+1}[X_1,\ldots,X_{n+1}] = \mathbb E_\circ\big[\Gamma_n(X_1,\ldots,X_{n+1})\big].
\]

Proof. We omit the subscript ‘◦’ for notational simplicity. By the Helffer–Sjöstrand representation
formula (2.12), we can write

\[
\begin{aligned}
\mathbb E\Big[\prod_{i=1}^mX_i\Big] &= \mathbb E[X_1]\,\mathbb E\Big[\prod_{i=2}^mX_i\Big] + \mathrm{Cov}\Big[X_1,\prod_{i=2}^mX_i\Big]\\
&= \mathbb E[X_1]\,\mathbb E\Big[\prod_{i=2}^mX_i\Big] + \sum_{j=1}^N\mathbb E\Big[D_j\Big(\prod_{i=2}^mX_i\Big)\,L_\circ^{-1}D_jX_1\Big].
\end{aligned}\tag{2.14}
\]

Now note that the following formula is easily obtained by induction for differences of products: for all
(ai )2≤i≤m , (bi )2≤i≤m ⊂ R,

\[
\prod_{i=2}^ma_i - \prod_{i=2}^mb_i = \sum_{\substack{J\subset\llbracket2,m\rrbracket\\ J\ne\emptyset}}(-1)^{\sharp J+1}\Big(\prod_{i\notin J}a_i\Big)\Big(\prod_{i\in J}(a_i-b_i)\Big),
\]

and this obviously implies

\[
D_j\Big(\prod_{i=2}^mX_i\Big) = \mathbb E^{j\prime}_\circ\Big[\prod_{i=2}^mX_i - \prod_{i=2}^m(X_i)^{j\prime}\Big] = \sum_{\substack{J\subset\llbracket2,m\rrbracket\\ J\ne\emptyset}}(-1)^{\sharp J+1}\Big(\prod_{i\in\llbracket2,m\rrbracket\setminus J}X_i\Big)\,\delta_j^{\sharp J}(X_J). \tag{2.15}
\]

Inserting this into (2.14), separating the contributions of singletons in the sum, and recognizing the
definition of Γ0 , Γ1 , we get
m
Y  m
Y  N   
δj♯J (XJ )L−1
X X Y
♯J+1
E Xi = E[Γ0 (X1 )] E Xi + (−1) E Xk ◦ Dj X1
i=1 i=2 J⊂J2,mK j=1 i∈J2,mK\J
J6=∅
m
Y  X  Y  
= E[Γ0 (X1 )] E Xi + E Xi Γ1 (X1 , Xℓ )
i=2 ℓ∈J2,mK i∈J2,mK\{ℓ}
N  
(−1)♯J

♯J+1
X X X Y
+ ♯J+1 E Xi Γ1 (X1 , Xℓ , XJ ) . (2.16)
ℓ∈J2,mK J⊂J2,mK\{ℓ} j=1 i∈J2,mK\({ℓ}∪J)
♯J≥1

Using again the Hellfer–Sjöstrand representation formula (2.12) to handle the second right-hand side
term, we can decompose for all ℓ ∈ J2, mK,
 Y    Y 
E Xi Γ1 (X1 , Xℓ ) = E[Γ1 (X1 , Xℓ )] E Xi
i∈J2,mK\{ℓ} i∈J2,mK\{ℓ}
N
X   Y  
+ E Dj Xi L−1
◦ Dj Γ1 (X1 , Xℓ ) ,
j=1 i∈J2,mK\{ℓ}

and thus, appealing again to (2.15) to reformulate the last term,


 Y    Y 
E Xi Γ1 (X1 , Xℓ ) = E[Γ1 (X1 , Xℓ )] E Xi
i∈J2,mK\{ℓ} i∈J2,mK\{ℓ}
N   i
Xi δj♯J (XJ )L−1
X X Y
♯J+1
+ (−1) E ◦ Dj Γ 1 (X1 , Xℓ ) .
J⊂J2,mK\{ℓ} j=1 i∈J2,mK\({ℓ}∪J)
J6=∅

Inserting this into (2.16) and recognizing the definition of Γ2 , we find


m
Y  m
Y  X  Y 
E Xi = E[Γ0 (X1 )] E Xi + E[Γ1 (X1 , Xℓ )] E Xi
i=1 i=2 ℓ∈J2,mK i∈J2,mK\{ℓ}
X  Y  
+ E Xi Γ2 (X1 , Xℓ , Xℓ′ )
ℓ,ℓ′ ∈J2,mK i∈J2,mK\{ℓ,ℓ′ }
 
(−1)♯J

Xi Γ♯J+2
X X Y
+ ♯J+1 E 2 (X1 , Xℓ , Xℓ′ , XJ ) ,
ℓ,ℓ′ ∈J2,mK J⊂[2,m]\{ℓ,ℓ′ } i∈J2,mK\({ℓ,ℓ′ }∪J)
J6=∅

and the claim follows by iteration and a direct comparison with the formula (2.9). 

The above representation formula for cumulants implies in particular that cumulants can be con-
trolled in terms of higher-order Glauber derivatives. This provides a generalization of [40, Theorem 2.2]
to the multivariate case and can be viewed as a higher-order version of Poincaré’s inequality (2.13)
on L2 (Ω◦ ) with respect to Glauber calculus.

Proposition 2.9. For all n ≥ 0 and bounded σ((Z◦j,N )j )-measurable random variables X1 , . . . , Xn+1 ,
we have
\[
\big|\kappa^{n+1}_\circ[X_1,\ldots,X_{n+1}]\big| \;\lesssim_n\; \sum_{k=0}^{n-1}N^{k+1}\sum_{\substack{a_1,\ldots,a_{n+1}\ge1\\ \sum_j a_j=n+k+1}}\;\prod_{j=1}^{n+1}\big\|(D^\circ)^{a_j}X_j\big\|_{\ell^\infty_{\ne}L^{a_j^{-1}(n+k+1)}(\Omega_\circ)},
\]

where we have set


(D ◦ )m Zkℓ∞ p
6= (L (Ω◦ ))
:= sup Dj◦1 . . . Dj◦m Z Lp (Ω◦ )
.
j1 ...jm
distinct

Proof. First note that Jensen’s inequality yields for all m ≥ 0 and r ≥ 1,
 m
hY i r  1r m
Y 1
r
m j′ j′ j′ j′ r
kδj (X1 , . . . , Xm )kLr (Ω◦ ) = E◦ E◦ (Xi − (Xi ) ) ≤ E◦ E◦ |Xi − (Xi ) | ,
i=1 i=1

and thus, decomposing Xi − (Xi )j′ = Dj◦ Xi − (Dj◦ Xi )j′ and using Hölder’s inequality
m h i 1 m
′ ′ rm
Y Y
Dj◦ Xi (Dj◦ Xi )j kDj◦ Xi kLrm (Ω◦ ) .
rm
kδjm (X1 , . . . , Xm )kLr (Ω◦ ) ≤ E◦ Ej◦ − ≤ 2 m

i=1 i=1

By induction, using this estimate along with (2.11), we find for all m, n, r, for all bounded σ((Z◦j,N )j )-
measurable random variables X1 , . . . , Xn+m+1 ,

kΓm+n
n (X1 , . . . , Xm+n+1 )kLr (Ω)
n−1
X X m+n+1
Y
.m,n,r N k+1
k(D ◦ )aj Xj k r (m+n+k+1)
aj
.
ℓ∞
6= L (Ω◦ )
k=0 a1 ,...,am+n+1 ≥1
P j=1
j aj =m+n+k+1

Combined with the representation formula of Lemma 2.8, this yields the conclusion. 

2.3.2. Asymptotic normality via Glauber calculus. As the approximate normality of a random variable
essentially follows from the smallness of its cumulants of order ≥ 3, it is no surprise that it can be quantified as well by means of Glauber calculus. The following result is typically known in the
literature as a “second-order Poincaré inequality” for approximate normality. It was first established
by Chatterjee [26, Theorem 2.2] based on Stein’s method for the 1-Wasserstein distance, while the
corresponding bound on the Kolmogorov distance is due to [65, Theorem 4.2]. We include a short
proof for convenience to show that the same result also holds for the Zolotarev distance.
Proposition 2.10 (Second-order Poincaré inequality [26, 65]). For any bounded σ((Z_◦^{j,N})_j)-measurable random variable Y, setting σ_Y² := Var_◦[Y], there holds
     
\[
\begin{aligned}
&d_2\Big(\tfrac{Y-\mathbb E_\circ[Y]}{\sigma_Y}\,;\,\mathcal N\Big) + d_W\Big(\tfrac{Y-\mathbb E_\circ[Y]}{\sigma_Y}\,;\,\mathcal N\Big) + d_K\Big(\tfrac{Y-\mathbb E_\circ[Y]}{\sigma_Y}\,;\,\mathcal N\Big)\\
&\qquad\lesssim\; \frac{1}{\sigma_Y^3}\sum_{j=1}^N\mathbb E_\circ\big[|D^\circ_jY|^6\big]^{\frac12} + \frac{1}{\sigma_Y^2}\bigg(\sum_{j=1}^N\Big(\sum_{l=1}^N\mathbb E_\circ\big[|D^\circ_lY|^4\big]^{\frac14}\,\mathbb E_\circ\big[|D^\circ_jD^\circ_lY|^4\big]^{\frac14}\Big)^2\bigg)^{\frac12},
\end{aligned}
\]

where dW (·; N ) and dK (·; N ) stand for the 1-Wasserstein and the Kolmogorov distances to a standard
Gaussian random variable respectively, and where we recall that d2 (·; N ) stands for the corresponding
second-order Zolotarev distance (1.18).
Proof. By homogeneity, it suffices to consider a bounded random variable Y with
E◦ [Y ] = 0, σY2 = Var◦ [Y ] = 1.

Given g ∈ Cb1 (R), we define its Stein transform Sg as the solution of Stein’s equation
Sg′ (x) − xSg (x) = g(x) − EN [g(N )]. (2.17)
As shown in [82], the latter can be computed as
\[
S_g(x) = -\int_0^1\frac{1}{2\sqrt t}\,\mathbb E_{\mathcal N}\big[g'\big(\sqrt t\,x + \sqrt{1-t}\,\mathcal N\big)\big]\,dt.
\]
Using this formula and a Gaussian integration by parts, we easily obtain the following bound,
\[
\|S_g'\|_{W^{1,\infty}(\mathbb R)} \;\lesssim\; \|g''\|_{L^\infty(\mathbb R)}. \tag{2.18}
\]
Evaluating equation (2.17) at Y and taking the expectation, we find
\[
\mathbb E_\circ[g(Y)] - \mathbb E_{\mathcal N}[g(\mathcal N)] = \mathbb E_\circ\big[S_g'(Y) - Y\,S_g(Y)\big].
\]

Now appealing to the Helffer–Sjöstrand representation formula of Lemma 2.7(iii) for the covariance
E◦ [Y Sg (Y )] = Cov◦ [Y, Sg (Y )], this yields
\[
\mathbb E_\circ[g(Y)] - \mathbb E_{\mathcal N}[g(\mathcal N)] = \mathbb E_\circ\Big[S_g'(Y) - \sum_{j=1}^N(D^\circ_jS_g(Y))\,L_\circ^{-1}(D^\circ_jY)\Big]. \tag{2.19}
\]

A Taylor expansion gives for all p ≥ 1,


\[
\big\|D^\circ_jS_g(Y) - S_g'(Y)\,D^\circ_jY\big\|_{L^p(\Omega_\circ)} \;\le\; \|S_g''\|_{L^\infty(\mathbb R)}\,\|D^\circ_jY\|^2_{L^{2p}(\Omega_\circ)}.
\]

Using this to replace D^◦_jS_g(Y) in (2.19), using Hölder's inequality with p = 3/2 to bound the error, using the boundedness of L_◦^{-1} in L³(Ω_◦), cf. Lemma 2.7(ii), and recalling the bound (2.18) on the Stein transform, we are led to
\[
\big|\mathbb E_\circ[g(Y)] - \mathbb E_{\mathcal N}[g(\mathcal N)]\big| \;\lesssim\; \|g''\|_{L^\infty(\mathbb R)}\bigg(\mathbb E_\circ\Big[\Big|1 - \sum_{j=1}^N(D^\circ_jY)L_\circ^{-1}(D^\circ_jY)\Big|\Big] + \sum_{j=1}^N\mathbb E_\circ\big[|D^\circ_jY|^3\big]\bigg).
\]

Now recalling the Helffer–Sjöstrand representation formula of Lemma 2.7(iii) in form of


\[
1 = \mathrm{Var}_\circ[Y] = \mathbb E_\circ\Big[\sum_{j=1}^N(D^\circ_jY)\,L_\circ^{-1}(D^\circ_jY)\Big],
\]

we deduce by the Cauchy–Schwarz inequality,


\[
\big|\mathbb E_\circ[g(Y)] - \mathbb E_{\mathcal N}[g(\mathcal N)]\big| \;\lesssim\; \|g''\|_{L^\infty(\mathbb R)}\bigg(\mathrm{Var}_\circ\Big[\sum_{j=1}^N(D^\circ_jY)L_\circ^{-1}(D^\circ_jY)\Big]^{\frac12} + \sum_{j=1}^N\mathbb E_\circ\big[|D^\circ_jY|^3\big]\bigg).
\]

Taking the supremum over g ∈ C_b²(ℝ), the conclusion follows in the second-order Zolotarev distance d₂. Notice that the proof in the 1-Wasserstein distance can actually be obtained in the same way by noting that, on top of (2.18), the Stein transform also satisfies ‖S_g'‖_{W^{1,∞}(ℝ)} ≲ ‖g'‖_{L^∞(ℝ)}, cf. [82]. The proof in the Kolmogorov distance is more delicate and we refer to [65, Theorem 4.2]. □
2.3.3. Concentration via Glauber calculus. We establish the following concentration estimate for ran-
dom variables in L2 (Ω◦ ). It follows from some degraded version of a log-Sobolev inequality, combined
with the Herbst argument. Note however that we do not have an exact log-Sobolev inequality with
respect to Glauber calculus, cf. [67], which is why we need to require an almost sure a priori bound on
the Glauber derivative.
Proposition 2.11. Let X ∈ L²(Ω_◦) be σ((Z_◦^{j,N})_{1≤j≤N})-measurable with E_◦[X] = 0 and |D^◦_jX| ≤ ½L almost surely for all 1 ≤ j ≤ N, for some constant L > 0. Then for all λ > 0 we have
\[
\mathbb E_\circ\big[e^{\lambda X}\big] \;\le\; \exp\Big(\tfrac N2\,\lambda L\big(e^{\lambda L}-1\big)\Big).
\]
In particular, this entails
\[
\mathbb P_\circ[X > r] \;\le\; \exp\Big(-\tfrac{r}{4L}\,\log\Big(1+\tfrac{r}{NL}\Big)\Big),
\]
where the right-hand side is ≤ exp(−\tfrac{r^2}{8NL^2}) as long as r ≤ NL.
Proof. We appeal to the following degraded version of a log-Sobolev inequality as obtained in [41,
Proposition 2.4]: for all random variables Y ∈ L2 (Ω◦ ), we have
\[
\mathrm{Ent}_\circ[Y^2] \;\le\; 2\sum_{j=1}^N\mathbb E_\circ\Big[\sup_j\,(Y - Y^j)^2\Big],
\]

where we recall that Y j stands for the random variable obtained from Y by replacing the underlying
variable Z◦j,N by an i.i.d. copy, and where supj stands for the essential supremum with respect to this
i.i.d. copy. Applying this inequality to Y = e^{\frac12X}, using the bound
\[
\big|e^{\frac12X} - e^{\frac12X^j}\big| \;\le\; \tfrac12\,|X-X^j|\int_0^1 e^{\frac12(X-t(X-X^j))}\,dt \;\le\; \tfrac12\,e^{\frac12X}\,|X-X^j|\,e^{\frac12|X-X^j|},
\]
we find
\[
\mathrm{Ent}_\circ[e^X] \;\le\; \tfrac12\sum_{j=1}^N\mathbb E_\circ\Big[e^X\sup_j\,(X-X^j)^2\,e^{|X-X^j|}\Big] \;\le\; \tfrac N2\,M_X^2\,e^{M_X}\,\mathbb E_\circ[e^X],
\]

in terms of
\[
M_X := \sup_{1\le j\le N}\,\operatorname*{sup\,ess}\,|X - X^j| \;\le\; 2\sup_{1\le j\le N}\,\operatorname*{sup\,ess}\,|D^\circ_jX| \;\le\; L.
\]
We are now in position to appeal to the Herbst argument in the form of [67, Proposition 2.9 and
Corollary 2.12] and the conclusion follows. 
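For the reader's convenience, the final simplification in the statement can also be checked directly (an elementary observation, not taken from [67]): for 0 ≤ x ≤ 1 one has log(1+x) ≥ x log 2 ≥ x/2, so that for r ≤ NL,
\[
\frac{r}{4L}\,\log\Big(1+\frac{r}{NL}\Big) \;\ge\; \frac{r}{4L}\cdot\frac{r}{2NL} \;=\; \frac{r^2}{8NL^2}.
\]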
2.3.4. Link to linear derivatives. As the following lemma shows, Glauber derivatives can be estimated
in terms of linear derivatives. This is particularly convenient in the sequel to unify notations when
both Glauber and Lions derivatives are involved.
Lemma 2.12. Given a smooth functional Φ : P(X) → R, we have almost surely for all k ≥ 1 and all
distinct indices j1 , . . . , jk ∈ JN K,
\[
\big|D^\circ_{j_1}\cdots D^\circ_{j_k}\Phi(\mu^N_0)\big| \;\le\; N^{-k}\,2^k\sup_{\mu\in\mathcal P(\mathbb X)}\Big\|\frac{\delta^k\Phi}{\delta\mu^k}(\mu,\cdot)\Big\|_{L^\infty(\mathbb X^k)}.
\]

Proof. For all j ∈ JN K, by definition of the Glauber derivative and of the linear derivative, we can
compute
\[
\begin{aligned}
D^\circ_j\Phi(\mu^N_0) &= \int_{\mathbb X}\Big(\Phi(\mu^N_0) - \Phi\big(\mu^N_0 + \tfrac1N(\delta_z - \delta_{Z^{j,N}_0})\big)\Big)\,\mu_\circ(dz)\\
&= N^{-1}\int_0^1\!\!\int_{\mathbb X}\!\int_{\mathbb X}\frac{\delta\Phi}{\delta\mu}\Big(\mu^N_0 + \tfrac{1-s}N(\delta_z - \delta_{Z^{j,N}_0}),\,y\Big)\,(\delta_{Z^{j,N}_0} - \delta_z)(dy)\,\mu_\circ(dz)\,ds.
\end{aligned}\tag{2.20}
\]

By induction, we are led to the following representation formula for iterated Glauber derivatives: for
all k ≥ 1 and all distinct indices j1 , . . . , jk ∈ JN K,
\[
D^\circ_{j_1}\cdots D^\circ_{j_k}\Phi(\mu^N_0) = N^{-k}\int_{([0,1]\times\mathbb X\times\mathbb X)^k}\frac{\delta^k\Phi}{\delta\mu^k}\Big(\mu^N_0 + \sum_{i=1}^k\tfrac{1-s_i}N(\delta_{z_i} - \delta_{Z^{j_i,N}_0}),\,y_1,\ldots,y_k\Big)\prod_{i=1}^k(\delta_{Z^{j_i,N}_0} - \delta_{z_i})(dy_i)\,\mu_\circ(dz_i)\,ds_i. \tag{2.21}
\]

Recalling Z0j,N ∼ µ◦ for all j, the conclusion immediately follows. 



3. Ergodic Sobolev estimates for mean field


In this section, we establish ergodic estimates for the linearized mean-field equation, which will
be the key tool for our uniform-in-time results in the spirit of [35]. Given µ ∈ P(X), the linearized mean-field McKean–Vlasov operator at µ is defined as follows: for all h ∈ C_c^∞(X) with ∫_X h = 0,
\[
L_\mu h := \tfrac12\,\mathrm{div}(a_0\nabla h) - \mathrm{div}\big(b(\cdot,\mu)\,h\big) - \mathrm{div}\Big(\mu\int_{\mathbb X}\frac{\delta b}{\delta\mu}(\cdot,\mu,z)\,h(z)\,dz\Big). \tag{3.1}
\]
In the Langevin setting (1.23), this means for all h on X = Rd × Rd ,
\[
L_\mu h = \tfrac12\,\mathrm{div}_v\big((\nabla_v+\beta v)h\big) + \nabla A\cdot\nabla_vh - v\cdot\nabla_xh + \kappa(\nabla W*\mu)\cdot\nabla_vh + \kappa(\nabla W*h)\cdot\nabla_v\mu, \tag{3.2}
\]
and in the Brownian setting (1.24), this means for all h on X = ℝ^d,
\[
L_\mu h = \tfrac12\,\triangle h + \mathrm{div}(h\nabla A) + \kappa\,\mathrm{div}(h\,\nabla W*\mu) + \kappa\,\mathrm{div}(\mu\,\nabla W*h).
\]
For our purposes in this work, we shall establish ergodic estimates in a weighted Sobolev framework with
arbitrary integrability, negative regularity, and polynomial weight: more precisely, for all 1 ≤ q ≤ 2
and p ≥ 0, we consider the space L^q(⟨z⟩^p) as the weighted Lebesgue space with the norm
\[
\|h\|_{L^q(\langle z\rangle^p)} := \|\langle z\rangle^ph\|_{L^q(\mathbb X)} = \Big(\int_{\mathbb X}|h(z)|^q\langle z\rangle^{pq}\,dz\Big)^{\frac1q},
\]
and, for all k ≥ 0, we consider the space W^{−k,q}(⟨z⟩^p) as the weighted negative Sobolev space associated with the dual norm
\[
\|h\|_{W^{-k,q}(\langle z\rangle^p)} := \sup\Big\{\int_{\mathbb X}h\,h'\,\langle z\rangle^p \;:\; \|h'\|_{W^{k,q'}(\mathbb X)}=1\Big\}, \tag{3.3}
\]
where q' := \tfrac{q}{q-1} is the dual integrability exponent and where W^{k,q'}(\mathbb X) is the standard Sobolev space with norm
\[
\|h\|_{W^{k,q'}(\mathbb X)} := \sup_{0\le j\le k}\|\nabla^jh\|_{L^{q'}(\mathbb X)}.
\]
In these terms, the main result of this section takes on the following guise. While item (i) is well
known (see e.g. [9] and the discussion below), our main contribution here is to prove the Sobolev
ergodic estimates of item (ii). Note that the restriction pq ′ ≫ 1 in the Langevin setting is fairly
natural: indeed, we note for instance that the restriction pq ′ > d precisely ensures the spatial density
ρh (x) := Rd h(x, v) dv to be defined in L1loc (Rd ) for all h ∈ W −k,q (hzip ).
´

Theorem 3.1. There exist constants κ0 , λ0 > 0 (only depending on d, β, a, and kW kW 1,∞ (Rd ) ), such
that the following results hold for any κ ∈ [0, κ0 ].
(i) There is a unique steady state M for the mean-field evolution (1.21), and the solution opera-
tor (1.22) satisfies for all µ◦ ∈ P(X) and t ≥ 0,
        W₂( m(t, μ◦), M ) ≲_{W,β,a} e^{−λ₀ t} W₂(μ◦, M),    (3.4)
where W2 stands for the 2-Wasserstein distance and where the multiplicative constant only depends
on d, β, a, and kW kW 1,∞ (Rd ) .
(ii) Let 1 < q ≤ 2, k ≥ 1, and 0 < p ≤ 1, and assume in the Langevin setting (1.23) that pq′ ≫_{β,a} 1
is large enough (only depending on d, β, a). For all μ◦ ∈ P(X) ∩ W^{1−k,q}(⟨z⟩^p), f◦ ∈ W^{−k,q}(⟨z⟩^p),
and r ∈ L^∞_loc(R^+; W^{−k,q}(⟨z⟩^p)) with ∫_X f◦ = 0 and ∫_X r_t = 0 for t ≥ 0, there is a unique weak
solution f ∈ L^∞_loc(R^+; W^{−k,q}(⟨z⟩^p)) to the Cauchy problem
        ∂_t f_t = L_{m(t,μ◦)} f_t + r_t,        f_t|_{t=0} = f◦,    (3.5)
and it satisfies for all λ ∈ [0, λ₀) and t ≥ 0,
        sup_{0≤s≤t} e^{pλs} ‖f_s‖_{W^{−k,q}(⟨z⟩^p)} ≲_{W,β,λ,k,p,q,a,μ◦} ‖f◦‖_{W^{−k,q}(⟨z⟩^p)} + ∫_0^t e^{pλs} ‖r_s‖_{W^{−k,q}(⟨z⟩^p)} ds,    (3.6)
where the multiplicative constant only depends on d, β, λ, k, p, q, a, ‖W‖_{W^{(2k)∨(k+d+1),∞}(R^d)}, and
on ‖μ◦‖_{W^{1−k,q}(⟨z⟩^p)}.

The convergence to equilibrium stated in item (i) is well known: it was proven for instance by
Bolley, Guillin, and Malrieu [9] in the Langevin setting (see also [70, 58, 11] for earlier results), and
their elementary coupling argument is immediately adapted to the Brownian setting as well. For
corresponding results relying on convexity rather than on smallness of the interaction, we refer to [75,
56] in the Langevin setting, and to [70, 71, 20, 21, 22] in the Brownian setting. Perturbations of the
strictly convex case have also been investigated e.g. in [8, 15, 45].
Regarding the ergodic estimates stated in item (ii), in the Brownian setting, they easily follow from
classical parabolic theory [49, 50]. In the periodic case with A ≡ 0, such estimates can be found in [18,
Lemma 7.4] on the space L∞ (Td ), and in [35] on the space W −k,1 (Td ) with 0 ≤ k < 2. Those results
are easily generalized to the case of a nontrivial confinement in the whole space Rd , and they can be
checked to hold on the space W −k,q (hzip ) for all 1 ≤ q ≤ 2, k ≥ 1, and 0 ≤ p ≤ 1. We emphasize in
particular that they also hold on the unweighted space W −k,1 (Rd ) for all k ≥ 1. We skip the detail as
it is similar to [35]. Note that the control of higher-order correlation functions indeed requires ergodic
estimates in Sobolev spaces with arbitrary negative regularity k ≥ 1.
The main challenge is to obtain the corresponding ergodic estimates in the kinetic Langevin setting,
where parabolic tools are no longer available due to hypocoercivity. This has been a very active area of
research over the last two decades and it is the focus of the rest of this section. In the PDE community,
the convergence to equilibrium for linear kinetic equations was first studied in [57, 59]. General hypoco-
ercivity techniques were developed in [86, 37, 38], where the linear kinetic Fokker–Planck equation
served as a prototypical example and where the exponential convergence to equilibrium was obtained
both on the spaces L2 (M −1/2 ) and H 1 (M −1/2 ). Combining hypocoercivity techniques with so-called
enlargement theory, Gualdani, Mischler and Mouhot [52, 73] later obtained corresponding estimates
on larger spaces. While ergodic estimates in the Brownian setting hold on W −k,1 (X) for all k ≥ 1,
hypocoercivity techniques in the kinetic Langevin setting actually require working on weighted spaces
W −k,q (hzip ) with integrability exponent q > 1 and with p > 0. More precisely, enlargement theory as
developed in [73] leads to estimates on W −k,q (hzip ) for all q > 1, k ∈ {−1, 0, 1}, and for large enough
weight exponents p ≫ 1. Yet, it is critical for our concentration results in Theorem 1.2 to be able
to cover arbitrarily small p > 0 when the integrability exponent q is close enough to 1. This has led
us to revisit and partially improve the work of Mischler and Mouhot [73]: our ergodic estimates are
proven to hold for all q > 1 and p > 0 under the sole restriction that pq ′ be large enough, which is
of independent interest. In addition, the control of higher-order correlation functions requires covering
arbitrary negative regularity k ≥ 1.

Remark 3.2 (Periodic setting). As mentioned in the introduction, cf. Remark 1.5(b), the above
result can essentially be adapted to the corresponding periodic setting on the torus Td with A ≡ 0, but
some special care is then needed in the Langevin setting. Indeed, the nonlinear hypocoercivity result
available in that case is slightly weaker, cf. [87, Theorem 56]: it only yields a convergence rate t−∞
in (3.4), thus leading to a similar decay rate t−∞ instead of exponential in (3.6). Fortunately, the
resulting non-exponential estimates are still enough to repeat the proofs of Theorems 1.1, 1.2, and 1.3,
which can be checked to hold in the very same form.

3.1. Exponential relaxation for modified linearized operators. We focus on the proof of the
ergodic estimates of Theorem 3.1(ii) in the kinetic Langevin setting (1.23), while the same arguments
can be repeated and substantially simplified in the Brownian setting. We start by considering the
following modified version of the linearized operator Lµ defined in (3.2), where we remove the (compact)
convolution term: given a measure µ ∈ P(X), we define for all h ∈ Cc∞ (X),

        R_μ h := ½ div_v((∇_v + βv)h) − v·∇_x h + (∇A + κ∇W ∗ μ)·∇_v h.    (3.7)

Given µ◦ ∈ P(X), s ≥ 0, and hs ∈ Cc∞ (X), recalling that µt := m(t, µ◦ ) stands for the solution of the
mean-field equation (1.7), we consider the following (non-autonomous) equation,

∂t ht = Rµt ht , for t ≥ s,
(3.8)
ht |t=s = hs .
It is easily checked that this linear parabolic equation is well-posed with h ∈ Cloc ([s, ∞); L2 (M −1/2 ))
whenever the initial condition hs belongs to L2 (M −1/2 ). We then consider the associated fundamental
solution operators {Vt,s }t≥s≥0 on L2 (M −1/2 ) defined by
        h_t = V_{t,s} h_s.
Note that ∫_X V_{t,s}h_s = ∫_X h_s for all t ≥ s. We establish the following exponential convergence result to
the steady state.
Proposition 3.3. Let κ0 be as in Theorem 3.1(i) and let κ ∈ [0, κ0 ]. There exists a constant λ0 > 0
(only depending on d, β, a, and kW kW 1,∞ (Rd ) ), such that the following holds: given 1 < q ≤ 2 and
0 < p ≤ 1 with pq ′ ≫β,a 1 large enough (only depending on d, β, a), we have for all λ ∈ [0, λ0 ), k ≥ 0,
hs ∈ Cc∞ (X), and t ≥ s ≥ 0,
        ‖ V_{t,s}h_s − M ∫_X h_s ‖_{W^{−k,q}(⟨z⟩^p)} ≲_{W,β,λ,k,p,q,a} e^{−pλ(t−s)} ‖h_s‖_{W^{−k,q}(⟨z⟩^p)},

where M is the unique steady state given by Theorem 3.1(i), and where the multiplicative constant only
depends on d, β, λ, k, p, q, a, and kW kW k+d+1,∞ (Rd ) .
As stated in Theorem 3.1(i), recall that the mean-field evolution (1.7) has a unique steady state M
for κ ∈ [0, κ0 ], which can actually be characterized as the unique solution of the fixed-point Gibbs
equation
        M(x, v) = c_M exp[ −β( ½|v|² + A(x) + κ(W ∗ M)(x) ) ],    (x, v) ∈ X,
where c_M is the normalizing constant such that ∫_X M(x, v) dx dv = 1. Note that this fixed-point
equation has indeed a unique solution provided that κβ‖W‖_{L^∞(R^d)} < 1.
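For illustration, one may check directly that this Gibbs ansatz is a steady state: writing the mean-field evolution as ∂_t μ_t = R_{μ_t} μ_t with R_μ defined in (3.7), the Gibbs form gives ∇_v M = −βvM and ∇_x M = −β(∇A + κ∇W ∗ M)M, hence
        (∇_v + βv)M = 0,        −v·∇_x M + (∇A + κ∇W ∗ M)·∇_v M = βv·(∇A + κ∇W ∗ M)M − βv·(∇A + κ∇W ∗ M)M = 0,
so that R_M M = 0.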
In order to prove Proposition 3.3, we shall first establish the exponential decay on negative Sobolev spaces with this Gibbs
weight M , that is, the exponential decay on the smaller spaces H −k (M −1/2 ), and next we shall appeal
to the enlargement theory of Gualdani, Mischler and Mouhot [52, 73] to conclude with the desired
result on W −k,q (hzip ). Here, for all k ≥ 0, the space H −k (M −1/2 ) is defined as the weighted negative
Sobolev space associated with the dual norm
        ‖h‖_{H^{−k}(M^{−1/2})} := sup{ ∫_X h h′ M^{−1} : ‖h′‖_{H^k(M^{−1/2})} = 1 },    (3.9)

where H k (M −1/2 ) is the standard weighted Sobolev space with norm


        ‖h‖_{H^k(M^{−1/2})} := sup_{i,j≥0, i+j≤k} ‖∇_x^i ∇_v^j h‖_{L^2(M^{−1/2})},        ‖h‖_{L^2(M^{−1/2})} := ( ∫_X |h|² M^{−1} )^{1/2}.
Note that the treatment of the weight in the definition of those weighted spaces differs slightly from the
one in the definition of W^{−k,q}(⟨z⟩^p), cf. (3.3), but for convenience we stick to this slight inconsistency
in the choice of definitions.
In order to appeal to enlargement theory, we start by introducing a suitable decomposition of the
operator Rµ . Let a cut-off function χ ∈ Cc∞ (X) be fixed with χ(z) = 1 for |z| ≤ 1, and set
        χ_R(z) := χ(z/R).    (3.10)
In those terms, let us split the operator Rµ as follows,
Rµ := A + Bµ , (3.11)
Ah := ΛχR h,
Bμ h := ½ div_v((∇_v + βv)h) − v·∇_x h + (∇A + κ∇W ∗ μ)·∇_v h − Λχ_R h,

for some constants Λ, R > 0 to be properly chosen later on (see Lemmas 3.5 and 3.6 below). Let us
denote by {Wt,s }t≥s≥0 the fundamental solution operators for the (non-autonomous) evolution equation
associated with Bµ : for all s ≥ 0 and hs ∈ Cc∞ (X), we define ht = Wt,s hs as the solution of

∂t ht = Bµt ht , for t ≥ s,
ht |t=s = hs .
Again, it is easily checked that this equation is well-posed with h ∈ Cloc ([s, ∞); L2 (M −1/2 )) whenever
hs ∈ L2 (M −1/2 ). Our proof of Proposition 3.3 is based on the following three preliminary lemmas, the
proofs of which are postponed to Sections 3.3, 3.4, 3.5, and 3.6 below.
Lemma 3.4 (Exponential decay on restricted space). Let κ0 be as in Theorem 3.1(i) and let κ ∈ [0, κ0 ].
There exists a constant λ1 > 0 (only depending on d, β, a, and kW kW 1,∞ (Rd ) ), such that the following
holds: given λ ∈ [0, λ1 ) and k ≥ 0, we have for all hs ∈ Cc∞ (X) and t ≥ s ≥ 0,
        ‖ V_{t,s}h_s − M ∫_X h_s ‖_{H^{−k}(M^{−1/2})} ≲_{W,β,λ,k,a} e^{−λ(t−s)} ‖h_s‖_{H^{−k}(M^{−1/2})},

where the multiplicative constant only depends on d, β, λ, k, a, and kW kW k+2,∞ (Rd ) .


Lemma 3.5 (Exponential decay for modified operator). Let κ0 be as in Theorem 3.1(i) and let κ ∈
[0, κ0 ]. There exists a constant λ2 > 0 (only depending on d, β, a, and kW kW 1,∞ (Rd ) ), such that the
following holds: given 1 < q ≤ 2 and 0 < p ≤ 1 with pq ′ ≫β,a 1 large enough (only depending on
d, β, a), choosing Λ, R large enough (only depending on d, β, p, a, and kW kW 1,∞ (Rd ) ), we have for all
λ ∈ [0, λ2 ), k ≥ 0, hs ∈ Cc∞ (X), and t ≥ s ≥ 0,
kWt,s hs kW −k,q (hzip ) .W,β,λ,k,p,q,a e−λ(t−s) khs kW −k,q (hzip ) , (3.12)
kWt,s hs kH −k (M −1/2 ) .W,β,λ,k,p,q,a e−λ(t−s) khs kH −k (M −1/2 ) , (3.13)
where the multiplicative constants only depend on d, β, λ, k, p, q, a, and kW kW k+1,∞ (Rd ) .
Lemma 3.6 (Regularization estimate). Let κ0 , λ2 be as in Theorem 3.1(i) and Lemma 3.5, respectively,
and let κ ∈ [0, κ0 ]. There is some n ≥ 1 large enough (only depending on d) such that the following
holds: given 1 < q ≤ 2 and 0 < p ≤ 1 with pq ′ ≫β,a 1 large enough (only depending on d, β, a), choosing
Λ, R large enough (only depending on d, β, p, a, and kW kW 1,∞ (Rd ) ), we have for all λ ∈ [0, λ2 ), k ≥ 0,
hs ∈ Cc∞ (X), and t ≥ s ≥ 0,
        ∫_{s≤s₁≤···≤sₙ≤t} ‖ A W_{t,sₙ} A W_{sₙ,s_{n−1}} · · · A W_{s₁,s} h_s ‖_{H^{−k}(M^{−1/2})} ds₁ · · · dsₙ ≲_{W,β,λ,k,p,q,a} e^{−pλ(t−s)} ‖h_s‖_{W^{−k,q}(⟨z⟩^p)},


where the multiplicative constant only depends on d, β, λ, k, p, q, a, and kW kW k+d+1,∞ (Rd ) .
With those lemmas at hand, we are now in position to conclude the proof of Proposition 3.3 based
on the enlargement theory of Gualdani, Mischler, and Mouhot [52, 73].
Proof of Proposition 3.3. Let λ1 , λ2 be defined in Lemmas 3.4 and 3.5, respectively, and let 1 < q ≤ 2,
k ≥ 0, and 0 < p ≤ 1 with pq ′ ≫ 1 large enough in the sense of Lemmas 3.5 and 3.6. We note that
the space H −k (M −1/2 ) is continuously embedded in W −k,q (hzip ): by definition of dual norms, we find
for all h ∈ Cc∞ (X),
khkW −k,q (hzip ) .W,β,k,a khkH −k (M −1/2 ) , (3.14)
where the constant only depends on d, β, k, a, and kW kW k,∞ (Rd ) . In this setting, we can appeal to
enlargement theory to extend the estimates of Lemma 3.4 to W −k,q (hzip ): by Lemmas 3.4, 3.5, and 3.6,
we can apply [73, Theorem 1.1] and the conclusion precisely follows. For completeness, we include a
short proof of enlargement as the present situation does not exactly fit in the semigroup setting of [73].
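For illustration, let us record the elementary case k = 0 of the embedding (3.14) (higher-order cases are similar, using the Leibniz rule): for h, h′ ∈ C_c^∞(X) with ‖h′‖_{L^{q′}(X)} = 1, Cauchy–Schwarz and Hölder's inequality (with 1/2 = 1/q′ + 1/r, recalling q′ ≥ 2) give
        | ∫_X h h′ ⟨z⟩^p | = | ∫_X (h M^{−1/2}) (h′ ⟨z⟩^p M^{1/2}) | ≤ ‖h‖_{L^2(M^{−1/2})} ‖h′‖_{L^{q′}(X)} ‖⟨z⟩^p M^{1/2}‖_{L^r(X)},
and the last factor is finite thanks to the Gaussian decay of M; taking the supremum over h′ yields ‖h‖_{L^q(⟨z⟩^p)} ≲ ‖h‖_{L^2(M^{−1/2})}.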

Starting point is the following form of the Duhamel formula: based on the decomposition (3.11), the
fundamental solution operators {Vt,s }0≤s≤t can be expanded around {Wt,s }0≤s≤t via
        V_{t,s} = W_{t,s} + ∫_s^t V_{t,u} A W_{u,s} du.
By iteration, we get for all n ≥ 1,
        V_{t,s} = W_{t,s} + ∑_{j=1}^{n−1} ∫_s^t W_{t,u} (AW)^j_{u,s} du + ∫_s^t V_{t,u} (AW)^n_{u,s} du,
where we have set for abbreviation, for all j ≥ 1 and 0 ≤ s₀ ≤ s_j,
        (AW)^j_{s_j,s_0} := ∫_{(R^+)^{j−1}} A W_{s_j,s_{j−1}} A W_{s_{j−1},s_{j−2}} · · · A W_{s_1,s_0} 1_{s_0≤s_1≤···≤s_j} ds₁ · · · ds_{j−1}.
Given h_s ∈ C_c^∞(X), taking norms, applying the exponential decay of Lemma 3.5 for {W_{t,s}}_{0≤s≤t} on
the space W^{−k,q}(⟨z⟩^p), noting that A is bounded on W^{−k,q}(⟨z⟩^p) and that |∫_X h| ≲_{k,p,q} ‖h‖_{W^{−k,q}(⟨z⟩^p)}
provided pq′ > 2d, we get for all n ≥ 1 and λ ∈ [0, λ₂),
        ‖ V_{t,s}h_s − M ∫_X h_s ‖_{W^{−k,q}(⟨z⟩^p)} = ‖ V_{t,s}h_s − M ∫_X V_{t,s}h_s ‖_{W^{−k,q}(⟨z⟩^p)}
                ≲_{W,β,λ,k,p,q,n,a} ( 1 + (t − s)^n ) e^{−pλ(t−s)} ‖h_s‖_{W^{−k,q}(⟨z⟩^p)}
                        + ∫_s^t ‖ V_{t,u}(AW)^n_{u,s}h_s − M ∫_X (AW)^n_{u,s}h_s ‖_{W^{−k,q}(⟨z⟩^p)} du.

In order to estimate the last right-hand side term, we recall the embedding (3.14), we use the exponen-
tial relaxation of Lemma 3.4 for {Vt,s }0≤s≤t on the space H −k (M −1/2 ), and we use the regularization
estimate of Lemma 3.6 for n ≥ 1 large enough (only depending on d): for λ ∈ [0, λ1 ∧ λ2 ), this leads
us to
        ‖ V_{t,s}h_s − M ∫_X h_s ‖_{W^{−k,q}(⟨z⟩^p)}
                ≲_{W,β,λ,k,p,q,a} ( 1 + (t − s)^n ) e^{−pλ(t−s)} ‖h_s‖_{W^{−k,q}(⟨z⟩^p)} + ∫_s^t e^{−λ(t−u)} ‖(AW)^n_{u,s}h_s‖_{H^{−k}(M^{−1/2})} du
                ≲_{W,β,λ,k,p,q,a} ( 1 + (t − s)^n ) e^{−pλ(t−s)} ‖h_s‖_{W^{−k,q}(⟨z⟩^p)},
and the conclusion follows with λ₀ = λ₁ ∧ λ₂. □
3.2. Proof of Theorem 3.1(ii). In this section, we establish Theorem 3.1(ii) as a consequence of
Proposition 3.3. As a preliminary, we start by noting that the convergence of the mean-field evolu-
tion (1.21) to equilibrium as stated in Theorem 3.1(i) also holds on the spaces W −k,q (hzip ).
Lemma 3.7. Let κ0 , λ0 be as in Proposition 3.3 and let κ ∈ [0, κ0 ]. Given 1 < q ≤ 2 and 0 < p ≤ 1
with pq ′ ≫β,a 1 large enough (only depending on d, β, a), the mean-field evolution (1.21) satisfies for
all λ ∈ [0, λ0 ), k ≥ 0, µ◦ ∈ P ∩ Cc∞ (X), and t ≥ 0,
km(t, µ◦ ) − M kW −k,q (hzip ) .W,β,λ,k,p,q,a e−pλt kµ◦ kW −k,q (hzip ) ,
hence, in particular,
km(t, µ◦ )kW −k,q (hzip ) .W,β,λ,k,p,q,a kµ◦ kW −k,q (hzip ) , (3.15)
where the multiplicative constants only depend on d, β, λ, k, p, q, a, and kW kW k+d+1,∞ (Rd ) .
Proof. The mean-field equation (1.7) for µt := m(t, µ◦ ) can be written as ∂t m(t, µ◦ ) = Rµt m(t, µ◦ ),
which entails m(t, µ◦ ) = Vt,0 µ◦ , and the conclusion then follows from Proposition 3.3. 
With the above estimate at hand, we can finally conclude the proof of Theorem 3.1(ii) in the
Langevin setting.

Proof of Theorem 3.1(ii). By a standard approximation argument, it suffices to consider f◦ ∈ C_c^∞(X)
and r ∈ C_c^∞(R^+ × X), with ∫_X f◦ = 0 and ∫_X r_t = 0 for all t. In that case, the well-posedness of the
Cauchy problem (3.5) is standard and it remains to establish the stability estimate (3.6). In terms
of the modified linearized operator Rµ defined in (3.7), setting µt := m(t, µ◦ ), equation (3.5) can be
reformulated as
∂t ft = Rµt ft + κ(∇W ∗ ft ) · ∇v µt + rt ,
hence, by Duhamel’s formula,
        f_t = V_{t,0} f◦ + ∫_0^t V_{t,s}( κ(∇W ∗ f_s)·∇_v μ_s + r_s ) ds.
Appealing to the exponential decay of Proposition 3.3 for {V_{t,s}}_{t≥s≥0} with ∫_X f◦ = 0 and ∫_X r_t = 0,
noting that for pq′ > 2d we have
        ‖(∇W ∗ f_s)·∇_v μ_s‖_{W^{−k,q}(⟨z⟩^p)} ≲_k ‖W ∗ f_s‖_{W^{k,∞}(X)} ‖μ_s‖_{W^{1−k,q}(⟨z⟩^p)} ≲_{k,p,q} ‖W‖_{W^{2k,∞}(R^d)} ‖f_s‖_{W^{−k,q}(⟨z⟩^p)} ‖μ_s‖_{W^{1−k,q}(⟨z⟩^p)},

and further appealing to the a priori estimate (3.15) in Lemma 3.7 above, we deduce for all λ ∈ [0, λ0 )
and k ≥ 1,
        e^{pλt} ‖f_t‖_{W^{−k,q}(⟨z⟩^p)} ≲_{W,β,λ,k,p,q,a,μ◦} ‖f◦‖_{W^{−k,q}(⟨z⟩^p)} + ∫_0^t e^{pλs} ( ‖f_s‖_{W^{−k,q}(⟨z⟩^p)} + ‖r_s‖_{W^{−k,q}(⟨z⟩^p)} ) ds.

The conclusion follows from Grönwall’s inequality. 

3.3. Proof of Lemma 3.4: ergodic estimates with Gibbs weight. This section is devoted to the
proof of Lemma 3.4. We start by considering the standard kinetic Fokker–Planck operator
        R_M h := ½ div_v((∇_v + βv)h) − v·∇_x h + (∇A + κ∇W ∗ M)·∇_v h.

The exponential relaxation of the associated semigroup {etRM }t≥0 on L2 (M −1/2 ) was established in the
seminal work of Dolbeault, Mouhot, and Schmeiser [38] based on hypocoercivity techniques. We post-
process this well-known result to further derive estimates on Sobolev spaces with arbitrary negative
regularity. For that purpose, we appeal to a duality argument and argue by induction using parabolic
estimates.

Lemma 3.8. Let κ0 be as in Theorem 3.1(i) and let κ ∈ [0, κ0 ]. There exists λ1 > 0 (only depending
on d, β, a, and kW kW 1,∞ (X) ) such that for all λ ∈ [0, λ1 ), t ≥ 0 and h ∈ Cc∞ (X),
        ‖ e^{tR_M}h − M ∫_X h ‖_{H^{−k}(M^{−1/2})} ≲_{W,β,k,a} e^{−λt} ‖h‖_{H^{−k}(M^{−1/2})},    (3.16)

where the multiplicative factor only depends on d, β, k, a, and kW kW k+1,∞ (Rd ) .


Proof. We set for abbreviation π_M^⊥ h := h − M ∫_X h, and we note that ∫_X e^{tR_M} h = ∫_X h for all t ≥ 0.

By definition of dual norms, cf. (3.9), also recalling the definition of the steady state M , it suffices to
show that there is some λ1 > 0 such that for all k ≥ 0, λ ∈ [0, λ1 ), t ≥ 0, and h ∈ Cc∞ (X) we have
        ‖ π_M^⊥ e^{tR_M^*} h ‖_{H^k(M^{−1/2})} ≲_{W,β,k,a} e^{−λt} ‖h‖_{H^k(M^{−1/2})},
where R_M^* stands for the dual Fokker–Planck operator
        R_M^* h = ½ div_v((∇_v + βv)h) + v·∇_x h − (∇A + κ∇W ∗ M)·∇_v h.    (3.17)

We shall actually prove the following more detailed estimate, further capturing the dissipation: there
is some λ1 > 0 such that for all k ≥ 0, λ ∈ [0, λ1 ), t ≥ 0, and h ∈ Cc∞ (X) we have
ˆ t 1
λt ⊥ tR∗M ∗ 2
e kπM e hkH k (M −1/2 ) + e2λs k(∇v + βv)esRM hk2H k (M −1/2 ) ds
0
.W,β,λ,k,a khkH k (M −1/2 ) . (3.18)
We split the proof into two steps.
Step 1. Case k = 0: there exists λ1 > 0 (only depending on d, β, a, and kW kW 1,∞ (X) ) such that for
all t ≥ 0 and h ∈ Cc∞ (X),
⊥ tRM ∗
kπM e hkL2 (M −1/2 ) .W,β,a e−λ1 t kπM

hkL2 (M −1/2 ) . (3.19)
This was precisely established by Dolbeault, Mouhot, and Schmeiser in [38, Theorem 10].
Step 2. Conclusion: proof of (3.18).
Given h ∈ Cc∞ (X), we set for shortness Jtα,γ := ∇αx ∇γv πM ⊥ etR∗M h for multi-indices α, γ ∈ Nd . By

definition, it satisfies
∂t Jtα,γ = RM∗ J α,γ + r α,γ ,

t t for t ≥ 0,
(3.20)
Jtα,γ |t=0 = ∇αx ∇γv (πM
⊥ h),

where the source term rtα,γ is given by



rtα,γ := [∇αx ∇γv , RM
∗ ⊥ tRM
]πM e h. (3.21)
On the one hand, by Duhamel’s formula in form of
ˆ t
tR∗M ∗
Jtα,γ = e ⊥
∇αx ∇γv (πM h) + e(t−s)RM rsα,γ ds,
0

the exponential decay (3.19) yields


ˆ t
kJtα,γ kL2 (M −1/2 ) .W,β,a e−λ1 t ⊥
k∇αx ∇γv (πM h)kL2 (M −1/2 ) + e−λ1 (t−s) krsα,γ kL2 (M −1/2 ) ds. (3.22)
0

On the other hand, integrating by parts, the energy identity for equation (3.20) takes the form
ˆ ˆ
∂t kJt kL2 (M −1/2 ) = 2 Jt (RM Jt )M + 2 Jtα,γ rtα,γ M −1
α,γ 2 α,γ ∗ α,γ −1
X X
≤ −k(∇v + βv)Jtα,γ k2L2 (M −1/2 ) + 2kJtα,γ kL2 (M −1/2 ) krtα,γ kL2 (M −1/2 ) . (3.23)
Regarding the dissipation term in this last estimate, we make the following observation: integrating
by parts and using ∇v M −1 = βvM −1 , we find for all f ∈ Cc∞ (X),
ˆ ˆ
−1
2
= − M −1 f (∇v + βv) · ∇v f

|∇v f | M
X
ˆX ˆ
= − M −1 f divv (∇v + βv)f + βd |f |2 M −1


ˆ X ˆ X
2 −1 2 −1
= |(∇v + βv)f | M + βd |f | M . (3.24)
X X

Using this to replace half of the dissipation term in (3.23), we get


 
∂t kJtα,γ k2L2 (M −1/2 ) + 12 k∇v Jtα,γ k2L2 (M −1/2 ) + k(∇v + βv)Jtα,γ k2L2 (M −1/2 )
α,γ 2
≤ βd
2 kJt kL2 (M −1/2 ) + 2kJtα,γ kL2 (M −1/2 ) krtα,γ kL2 (M −1/2 ) ,

and thus, by Grönwall’s inequality, for all λ ≥ 0,


  ˆ t  
2λs α,γ 2
sup e kJs kL2 (M −1/2 ) + e2λs k∇v Jsα,γ k2L2 (M −1/2 ) + k(∇v + βv)Jsα,γ k2L2 (M −1/2 ) ds
0≤s≤t 0
ˆ t 2 ˆ t

.β,λ k∇αx ∇γv (πM h)k2L2 (M −1/2 ) + λs
e krsα,γ kL2 (M −1/2 ) ds + e2λs kJsα,γ k2L2 (M −1/2 ) ds.
0 0
Now using (3.22) to bound the last term, we obtain for all 0 ≤ λ < λ′ < λ1 ,
  ˆ t  
2λs α,γ 2
sup e kJs kL2 (M −1/2 ) + e2λs k∇v Jsα,γ k2L2 (M −1/2 ) + k(∇v + βv)Jsα,γ k2L2 (M −1/2 ) ds
0≤s≤t 0
ˆ t 2
⊥ ′
.W,β,λ,λ′ ,a k∇αx ∇γv (πM h)k2L2 (M −1/2 ) + eλ s krsα,γ kL2 (M −1/2 ) ds .
0
By definition of RM∗ , cf. (3.17), the source term r α,γ defined in (3.21) takes the form

X  γ  α+e ,γ−e   
α γ ′ ′ β ′ ′
rtα,γ = v − ∇A − κ∇W ∗ M · ∇v Jtα ,γ ,
X
∇α−α ∇γ−γ

Jt i i
+ ′ ′ x v 2
ei α γ
i:ei ≤γ ′ ′ (α ,γ )<(α,γ)

so the above yields for all 0 ≤ λ < λ′ < λ1 ,


  ˆ t  
2λs α,γ 2
sup e kJs kL2 (M −1/2 ) + e2λs k∇v Jsα,γ k2L2 (M −1/2 ) + k(∇v + βv)Jsα,γ k2L2 (M −1/2 ) ds
0≤s≤t 0
 ′ 

.W,β,λ,λ′ ,α,γ,a k∇αx ∇γv (πM h)k2L2 (M −1/2 ) + max sup e2λ s kJsα+ei ,γ−ei k2L2 (M −1/2 )
i:ei ≤γ 0≤s≤t
ˆ t
′ ′ ′
+ max e2λ s k∇v Jsα ,γ k2L2 (M −1/2 ) ds.
(α′ ,γ ′ )<(α,γ) 0
A direct induction then yields for all α, γ ≥ 0, λ ∈ [0, λ1 ), and t ≥ 0,
ˆ t  
2λt α,γ 2
e kJt kL2 (M −1/2 ) + e2λs k∇v Jsα,γ k2L2 (M −1/2 ) + k(∇v + βv)Jsα,γ k2L2 (M −1/2 ) ds
0

.W,β,λ,α,γ,a k∇αx ∇γv (πM h)k2L2 (M −1/2 ) .
Recalling Jtα,γ = ∇αx ∇γv πM

⊥ etRM h and π ⊥ h = h − M
´
M X h, this proves the claim (3.18). 
With the above exponential decay for the Fokker–Planck semigroup, we can easily conclude the
proof of Lemma 3.4 by means of a perturbation argument.
Proof of Lemma 3.4. Let λ0 , λ1 be as in Theorem 3.1(i) and in Lemma 3.8, respectively. We split the
proof into two steps.
Step 1. Proof that for all k ≥ 0, λ ∈ [0, λ0 ∧ λ1 ), t ≥ s ≥ 0, and gs ∈ Cc∞ (X),

eλ(t−s) kVt,s πM gs kH −k (M −1/2 ) .W,β,λ,k,a kgs kH −k (M −1/2 ) . (3.25)
Duhamel’s formula yields
ˆ t
⊥ ⊥ ⊥
gs = e(t−s)RM πM e(t−u)RM (∇W ∗ (µu − M )) · ∇v (Vu,s πM

Vt,s πM gs + κ gs ) du,
s
and thus, for all ht ∈ Cc∞ (X), ⊥ etRM = etRM π ⊥ ,
integrating by parts and using πM M
ˆ ˆ
⊥ −1 ∗ ⊥
e(t−s)RM πM ht gs M −1

(Vt,s πM gs )ht M =
X X
ˆ t ˆ 
∗ ⊥
(∇v + βv)e(t−u)RM ht · (∇W ∗ (µu − M ))(Vu,s πM gs )M −1 du.

−κ
s X

Applying the exponential decay estimate (3.18) of Lemma 3.8, and taking the supremum over ht
in H k (M −1/2 ), we deduce for all k ≥ 0, λ ∈ [0, λ1 ), and t ≥ 0,

eλ(t−s) kVt,s πM gs kH −k (M −1/2 ) .W,β,λ,k,a kgs kH −k (M −1/2 )
ˆ t 1
⊥ 2
+κ e2λ(u−s) kVu,s πM gs k2H −k (M −1/2 ) k∇W ∗ (µu − M )k2W k,∞ (Rd ) du .
s

Now, by Theorem 3.1(i), we have


k∇W ∗ (µt − M )kW k,∞ (Rd ) .W,k W2 (µt , M ) .W,β,a e−λ0 t W2 (µ◦ , M ), (3.26)
and the claim (3.25) then follows from Grönwall’s inequality.
Step 2. Conclusion.
⊥ by π ⊥ V
It remains to replace Vt,s πM M t,s in the result (3.25) of Step 1. For that purpose, let us decompose
ˆ 
⊥ ⊥ ⊥
πM Vt,s gs = Vt,s πM gs + gs rt,s , rt,s := πM Vt,s M. (3.27)
X
By definition, rt,s satisfies
∂t rt,s = RM Vt,s M + κ(∇W ∗ (µt − M )) · ∇v Vt,s M
= RM rt,s + κ(∇W ∗ (µt − M )) · ∇v rt,s + κ(∇W ∗ (µt − M )) · ∇v M,
with rt,s |t=s = 0. By Duhamel’s formula, this yields
ˆ t

rt,s = κ Vt,u ∇W ∗ (µu − M ) · ∇v M du.
s

Applying the relaxation estimate (3.25) of Step 1, and using Theorem 3.1(i) again in form of (3.26),
we deduce for all k ≥ 0, λ ∈ [0, λ0 ∧ λ1 ), and t ≥ s ≥ 0,
krt,s kH −k (M −1/2 ) .W,β,λ,k,a e−λt .
Together with (3.25) and (3.27), this yields the conclusion (up to renaming λ1 ). 

3.4. Proof of Lemma 3.5 on W −k,q (hzip ). This section is devoted to the proof of (3.12). Instead
of ⟨z⟩^p, we shall consider deformed weights of the form ω^p in terms of
        ω(x, v) := 1 + β( ½|v|² + a|x|² + η x·v ),    (3.28)
where the parameter 0 < η ≪ 1 will be properly chosen later on. We naturally restrict to η ≤ ½√a, which ensures
        ω(x, v) ≃_{β,a} ⟨z⟩².
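For illustration, the equivalence follows from Young's inequality: for η ≤ ½√a we have η|x·v| ≤ (η/(2√a))( a|x|² + |v|² ) ≤ ¼( a|x|² + |v|² ), hence
        1 + β( ¼|v|² + ¾ a|x|² ) ≤ ω(x, v) ≤ 1 + β( ¾|v|² + (5/4) a|x|² ),
and both sides are comparable to ⟨z⟩² up to multiplicative constants depending only on β, a.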
Note that those weights differ from the choice used in [73] and are critical for the improved result
we establish in this work. We define the weighted negative Sobolev spaces W −k,q (ω p ) exactly as the
spaces W −k,q (hzip ) in (3.3), simply replacing the weight hzip by ω p in the definition. Comparing ω
and hzi2 , the definition of dual norms easily ensures for all h ∈ Cc∞ (X),
khkW −k,q (hzi2p ) ≃β,k,p,a khkW −k,q (ωp ) . (3.29)

For a densely-defined operator X on Lq (ω p ) := W 0,q p ∗,p q
´ (ω ), pwe denote by X its∗,padjoint on L (X) with
respect to the weighted duality product (g, h) 7→ X ghω : more precisely, X stands for the closed

operator on Lq (X) defined by the relation
ˆ ˆ
g(Xh)ω p = h(X ∗,p g)ω p .
X X

In particular, for µ ∈ P(X), we consider the weighted adjoint Bµ∗,p of Bµ , which takes the explicit form
 
Bµ∗,p h = 12 △v h + v · ∇x h − 21 βv + ∇A + κ∇W ∗ µ − ω −p ∇v ω p · ∇v h
 
+ 12 ω −p △v ω p + ω −p v · ∇x ω p − ω −p 21 βv + ∇A + κ∇W ∗ µ · ∇v ω p − ΛχR h. (3.30)


By the equivalence√of norms (3.29) and by the definition of dual norms, it suffices to prove that there
is some 0 < η ≤ 21 a and some λ2 > 0 (only depending on d, β, a) such that the following result holds:
given 1 < q ≤ 2 and 0 < p ≤ 1 with pq ′ ≫β,a 1 large enough (only depending on d, β, a), choosing

Λ, R large enough (only depending on d, β, p, a, and kW kW 1,∞ (Rd ) ), if g ∈ C([0, t]; Lq (X)) satisfies the
backward Cauchy problem
∂s gs = −Bµ∗,p

s gs , for 0 ≤ s ≤ t,
(3.31)
gs |s=t = gt ,
for some t ≥ 0 and gt ∈ Cc∞ (X), then we have for all λ ∈ [0, λ2 ), k ≥ 0, and 0 ≤ s ≤ t,
kgs kW k,q′ (X) .W,β,λ,k,p,q,a e−λ(t−s) kgt kW k,q′ (X) . (3.32)
We split the proof into two steps, starting with the case k = 0 before treating all k ≥ 0 by induction.
Step 1. Proof of (3.32) for k = 0.
By definition of Bµ∗,p and by integration by parts, we find
ˆ
′ ′
∂s kgs kq q′ = −q ′ |gs |q −2 gs Bµ∗,p
s s
g
L (X) X
ˆ ˆ 
1 ′ ′ q ′ −2 ′
2
|gs |q q ′ ω −p v · ∇x ω p − q ′ ω −p 1
+ ∇A + κ∇W ∗ µ · ∇v ω p

= 2 q (q − 1) |gs | |∇v gs | − 2 βv
X X

− q ′ ΛχR + 21 q ′ ω −p △v ω p + βd
2 − div v (ω −p
∇ v ω p
) .
Now inserting the form of the weight ω and explicitly computing its derivatives, we get after straight-
forward simplifications,
ˆ
q′ ′
∂s kgs k q′ ≥ 2 q (q − 1) |gs |q −2 |∇v gs |2
1 ′ ′
L (X) X
ˆ 
′ − 21

|gs |q pq ′ β β2 − η(1 + 32a
1
β 2 ) ω −1 |v|2 + 2aηβpq ′ ω −1 |x|2 − βd ′

+ 2 + q Λχ R − C W,β,a ω .
X
We show that we can choose our parameters in such a way that the last bracket be bounded below by
a positive constant, which is the key to the desired exponential decay. More precisely, choosing

1
η := min 14 β(1 + 32a β 2 )−1 , 21 a , 4λ2 := min{ 21 β, 2η},


we get
ˆ
′ ′
∂s kgs kq q′ ≥ 1 ′ ′
2 q (q − 1) |gs |q −2 |∇v gs |2
L (X) X
ˆ 
′ βd 1 
+ |gs |q 4pq ′ βλ2 ω −1 ( 21 |v|2 + a|x|2 ) − 2 + q ′ ΛχR − CW,β,a ω − 2 .
X
1√ 1 2 1
As the choice η ≤ 2 a ensures + a|x|2 ≥ 2β
2 |v| (ω − 1), this actually means
ˆ ˆ

 
′ ′ − 21
∂s kgs kq q′ ≥ 12 q ′ (q ′ − 1) |gs |q −2 |∇v gs |2 + |gs |q 2pq ′ λ2 − βd
2 + q ′
Λχ R − C W,β,a ω .
L (X) X X
Now, recalling the definition of the cut-off function χR , we note that we can choose Λ, R > 0 large
enough (only depending on d, W, β, p, a) such that
        Λχ_R − C_{W,β,a} ω^{−1/2} ≥ −½ pλ₂.
Provided that pq′ ≥ 2βd λ₂^{−1}, we then obtain
        ∂_s ‖g_s‖^{q′}_{L^{q′}(X)} ≥ ½ q′(q′−1) ∫_X |g_s|^{q′−2} |∇_v g_s|² + pq′λ₂ ‖g_s‖^{q′}_{L^{q′}(X)},    (3.33)
hence, by Grönwall’s inequality,
        ‖g_s‖_{L^{q′}(X)} ≤ e^{−pλ₂(t−s)} ‖g_t‖_{L^{q′}(X)},    (3.34)
that is, (3.32) for k = 0.
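For illustration, the Grönwall step here goes in the backward direction: dropping the (nonnegative) dissipation term, (3.33) gives ∂_s( e^{−pq′λ₂ s} ‖g_s‖^{q′}_{L^{q′}(X)} ) ≥ 0, hence e^{−pq′λ₂ s} ‖g_s‖^{q′}_{L^{q′}(X)} ≤ e^{−pq′λ₂ t} ‖g_t‖^{q′}_{L^{q′}(X)} for 0 ≤ s ≤ t, which is (3.34) after taking q′-th roots.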


Step 2. Proof of (3.32) for all k ≥ 0.
For multi-indices α, γ ∈ Nd , we set Jsα,γ := ∇αx ∇γv gs . Differentiating equation (3.31), we get
∂s Jsα,γ = −Bµ∗,p α,γ
− rsα,γ , for 0 ≤ s ≤ t,

s Js
α,γ α γ
Js |s=t = ∇x ∇v gt ,
where the remainder is given by
rsα,γ := [∇αx ∇γv , Bµ∗,p
s
]gs .
Repeating the proof of (3.33), for the choice of η, λ2 , Λ, R in Step 1, we get
ˆ ˆ
′ ′ ′ ′
∂s kJsα,γ kq q′ ≥ 12 q ′ (q ′ −1) |Jsα,γ |q −2 |∇v Jsα,γ |2 +pq ′ λ2 kJsα,γ kq q′ −q ′ |Jsα,γ |q −2 Jsα,γ rsα,γ , (3.35)
L (X) X L (X) X

and it remains to analyze the last contribution. By definition of Bµ∗,p ,


cf. (3.30), we can compute
X γ  
rsα,γ = J α+ei ,γ−ei
ei s
i:ei ≤γ
  
X α γ h ′ ′
α ,γ

α−α′ γ−γ ′ 1 −p p
i
− div v Js ∇ x ∇ v 2 βv + ∇A + κ∇W ∗ µ − ω ∇ v ω
α′ γ′
(α′ ,γ ′ )<(α,γ)
  
X α γ α′ ,γ ′

α−α′ γ−γ ′ 1 −p p −p p −p p
+ ′ ′
Js ∇ x ∇ v 2 ω △v ω + ω v · ∇x ω − divv (ω ∇v ω )
α γ
(α′ ,γ ′ )<(α,γ)

−ω −p 12 βv + ∇A + κ∇W ∗ µ · ∇v ω p + βd

2 − Λχ R . (3.36)

Integrating by parts, this allows us to estimate


ˆ


 ′ ′

|Jsα,γ |q −2 Jsα,γ rsα,γ .W,β,α,γ,a kJsα,γ kq q−1
′ max kJsα+ei ,γ−ei kLq′ (X) + max kJsα ,γ kLq′ (X)
X L (X) i:ei ≤γ (α′ ,γ ′ )<(α,γ)
ˆ
′ ′ ′
+ q′ max |Jsα,γ |q −2 |Jsα ,γ ||∇v Jsα,γ |.
(α′ ,γ ′ )<(α,γ) X

Inserting this estimate into (3.35) and appealing to Young’s inequality to absorb ∇v J α,γ into the
dissipation term, we are led to
′ ′ ′
 
α′ ,γ ′ 2
∂s kJsα,γ kq q′ ≥ pq ′ λ2 kJsα,γ kq q′ − (q ′ )2 CW,β,α,γ,akJsα,γ kq q−2
′ max kJs k ′
Lq (X)
L (X) L (X) L (X) (α′ ,γ ′ )<(α,γ)

 
α′ ,γ ′
− q ′ CW,β,α,γ,akJsα,γ kq q−1
′ max kJs
α+ei ,γ−ei
k ′
Lq (X)
+ max kJs k ′
Lq (X)
.
L (X) i:ei ≤γ (α′ ,γ ′ )<(α,γ)

Further appealing to Young’s inequality, we get for all λ < λ2 ,


′ ′
∂s kJsα,γ kq q′ ≥ pq ′ λkJsα,γ kq q′
L (X) L (X)
′ ′
 ′ ′

− CW,β,λ,α,γ,p,q,a max kJsα+ei ,γ−ei kq q′ + max kJsα ,γ kq q′ ,
i:ei ≤γ L (X) (α′ ,γ ′ )<(α,γ) L (X)

and thus, by Grönwall’s inequality,


′ ′ ′
epq λ(t−s) kJsα,γ kq q′ .W,β,λ,α,γ,p,q,a k∇αx ∇γv gt kq q′
L (X) L (X)
ˆ t
′ ′

 ′ ′

+ epq λ(t−u) max kJuα+ei ,γ−ei kq q′ + max kJuα ,γ kq q′ du.
s i:ei ≤γ L (X) (α′ ,γ ′ )<(α,γ) L (X)

Iterating this inequality and starting from the result (3.34) of Step 1 for Js0,0 = gs , the conclusion
follows. 

3.5. Proof of Lemma 3.5 on H −k (M −1/2 ). This section is devoted to the proof of (3.13). Taking
inspiration from the work of Mischler and Mouhot [73, Section 4.2], we consider deformed weights of
the form M −1 ζ with the factor ζ given by
        ζ(x, v) := 1 + ½ · (x·v) / ( 1 + (η/2)|x|² + (1/(2η))|v|² ),    (3.37)
where the parameter η > 0 will be properly chosen later on. Note that for any η > 0 we have ½ ≤ ζ ≤ 3/2.
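For illustration, this two-sided bound follows from Young's inequality: |x·v| ≤ (η/2)|x|² + (1/(2η))|v|² ≤ 1 + (η/2)|x|² + (1/(2η))|v|², so that the quotient appearing in (3.37) lies in [−1, 1], and hence ½ ≤ ζ ≤ 3/2.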

In these terms, we define the weighted negative Sobolev spaces H −k ( M −1 ζ) exactly as H −k (M −1/2 )
p

in (3.9), simply replacing the weight M −1 by M −1 ζ in the definition. Comparing M −1 ζ to M −1 , the


definition of dual norms easily yields the equivalence, for all h ∈ Cc∞ (X),
khkH −k (M −1/2 ) ≃k,η khkH −k (√M −1 ζ) . (3.38)

For a densely-defined operator X on L2 ( M −1 ζ), by X ∗,ζ its adjoint on L2 ( M −1 ζ) with


p p
´ we denote
respect to the weighted duality product (g, h) 7→ X ghM −1 ζ: more precisely, X ∗,ζ stands for the closed
operator on L2 ( M −1 ζ) defined by the relation
p
ˆ ˆ
−1
g(Xh)M ζ = h(X ∗,ζ h)M −1 ζ.
X X

By the equivalence of norms (3.38) and by definition of dual norms, it suffices to prove that there is some
0 < η ≤ 1 and λ2 > 0 (only depending on d, β, a, and kW kW 1,∞ (Rd ) ) such that the following result holds:
choosing Λ, R large enough (only depending on d, β, a, and kW kW 1,∞ (Rd ) ), if g ∈ C([0, t]; L2 (M −1/2 ))
satisfies the backward Cauchy problem
∂s gs = −Bµ∗,ζ

s gs , for 0 ≤ s ≤ t,
(3.39)
gs |s=t = gt ,
for some t ≥ 0 and gt ∈ Cc∞ (X), then we have for all λ ∈ [0, λ2 ), k ≥ 0, and 0 ≤ s ≤ t,
kgs kH k (M −1/2 ) .W,β,λ,k,a e−λ(t−s) kgt kH k (M −1/2 ) . (3.40)
We split the proof into two steps, starting with the case k = 0 before treating all k ≥ 0 by induction.
Step 1. Proof of (3.40) for k = 0.
By definition of Bµ and by integration by parts, we find
ˆ ˆ
2 √
∂s kgs k 2 = −2 gs (Bµs gs )M ζ = −2 gs (Bµs gs )M −1 ζ
∗,ζ −1
L ( M −1 ζ)
X X
ˆ ˆ 
−1
p
2 2 1 2
= |∇v ( M −1 ζgs )| + |gs | M 4 ζ|βv| + (∇A + κ∇W ∗ µs ) · ∇v ζ − v · ∇x ζ
X X
p 2
+ 2ΛχR ζ − βd 2 ζ + κβζv · (∇W ∗ (µ s − M )) − |∇ v ζ| . (3.41)

We show that we can choose parameters in such a way that the last bracket be bounded below by a
positive constant, which is the key to the desired exponential decay. By definition of ζ, cf. (3.37), and
by Young’s inequality, we find
1 2
4 ζ|βv| + ∇A · ∇v ζ − v · ∇x ζ
|x|2 (x · v)2
 
1 2 1 2 1 −1 1
= 2 |v| 2 ζβ − η 1 + a η 1 − (aη − 2 η) η 1
1 + 2 |x|2 + 2η |v|2 1 + 2 |x|2 + 2η |v|2 (1 + 2 |x|2 + 2η |v|2 )2
1 + aη −2 |x|2
 
1 2 1 2 1
≥ |v| β − + a
2 4
1 + η2 |x|2 + 2η
1
|v|2 2
1 + η2 |x|2 + 2η1
|v|2
1 + 2aη −2
   
1 2 1 2 −1 1
≥ 2 |v| 4β − + aη 1− .
1 + η2 |x|2 + 2η
1
|v|2 1 + η2 |x|2 + 2η
1
|v|2

Inserting this into (3.41), and further using |∇v ζ| . η −1 hzi−1 , we obtain
ˆ p
2 √
∂s kgs k 2 ≥ |∇ v ( M −1 ζgs )|2
L ( M −1 ζ)
X
ˆ  
|gs |2 M −1 aη −1 − CW,β,a + ΛχR + 101
|βv|2 1 − η −3 hzi−2 CW,β,a − CW,β,a η −2 hzi−1 .

+
X

Noting that the dissipation term can be bounded below as



|∇v ( M −1 ζg)|2 ≥ 21 |ζ|2 |∇v ( M −1 g)|2 − 2|∇v ζ|2 |g|2 M −1
p

≥ 1
8 |(∇v + β2 v)g|2 M −1 − Cη −2 hzi−2 |g|2 M −1 ,
the above becomes

∂s kgs k2 2 √ −1 ≥ 18 k(∇v + β2 v)gs k2L2 (M −1/2 )


L ( M ζ)
ˆ  
|gs |2 M −1 aη −1 − CW,β,a + ΛχR + 1 2
1 − η −3 hzi−2 CW,β,a − CW,β,a η −2 hzi−1 .

+ 10 |βv|
X
−1
Now let us choose η := 12 aCW,β,a , and note that, by definition of the cut-off function χR , we may then
choose Λ, R > 0 large enough (only depending on d, W, β, a) such that
1
|βv|2 1 − η −3 hzi−2 CW,β,a − CW,β,a η −2 hzi−1 ≥ 201
|βv|2 − 41 aη −1 .

ΛχR + 10
1 −1
Further setting λ2 := 12 aη , this choice leads us to

∂s kgs k2 2 √ ≥ 1
8 k(∇v + β2 v)gs k2L2 (M −1/2 ) + 1 2
20 kβvgs kL2 (M −1/2 ) + 2λ2 kgs k2 2 √ . (3.42)
L ( M −1 ζ) L ( M −1 ζ)

In particular, by Grönwall’s inequality,


kgs kL2 (M −1/2 ) . e−λ2 (t−s) kgt kL2 (M −1/2 ) , (3.43)
that is, (3.40) for k = 0.
Step 2. Proof of (3.40) for all k.
For multi-indices α, γ ∈ Nd , we set Jsα,γ := ∇αx ∇γv gs . Differentiating equation (3.39), we get
(
∂s Jsα,γ = −Bµ∗,ζ α,γ
s Js − rsα,γ , for 0 ≤ s ≤ t,
Jsα,γ |s=t = ∇αx ∇γv gt ,
where the remainder is given by
rsα,γ := [∇αx ∇γv , Bµ∗,ζ
s
]gs .

Repeating the proof of (3.42), for the choice of η, λ2 , Λ, R in Step 1, we get

∂s kJsα,γ k2 2 √ ≥ 1
8 k(∇v + β2 v)Jsα,γ k2L2 (M −1/2 ) + 1 α,γ 2
20 kβvJs kL2 (M −1/2 ) + 2λ2 kJsα,γ k2 2 √ −1
L ( M −1 ζ) L ( M ζ)
ˆ
− 2 Jsα,γ rsα,γ M −1 ζ. (3.44)
X

and it remains to analyze the last contribution. By definition of Bµ , the weighted adjoint Bµ∗,ζ takes
the explicit form
 
β
Bµ∗,ζ h = 1
2 △v h + v · ∇x h + 2v − ∇A − κ∇W ∗ µ + ζ1 ∇v ζ · ∇v h

βd 1
+ 2 + 2ζ △v ζ − κβv · (∇W ∗ (µ − M ))

1 β
+ ζ1 v · ∇x ζ +

ζ 2v − ∇A − κ∇W ∗ µ · ∇v ζ − ΛχR h,

and we may then compute

X γ 
rsα,γ = [∇αx ∇γv , Bµ∗,ζ ]gs
= J α+ei ,γ−ei
s
ei s
i:ei ≤γ
  
X α γ α−α′ γ−γ ′ β
  ′ ′
+ ′ ′
∇ x ∇ v 2 v − ∇A − κ∇W ∗ µ + 1
ζ ∇ v ζ · ∇v Jsα ,γ
α γ
(α′ ,γ ′ )<(α,γ)
  
X α γ α′ ,γ ′ α−α′ γ−γ ′ βd

1
+ ′ ′
J s ∇ x ∇ v 2 + 2ζ △v ζ − κβv · (∇W ∗ (µ − M ))
′ ′
α γ
(α ,γ )<(α,γ)

1 β
+ ζ1 v · ∇x ζ +

ζ 2v − ∇A − κ∇W ∗ µ · ∇v ζ − ΛχR ,

from which we easily estimate


ˆ  
Jsα,γ rsα,γ M −1 ζ .W,β,α,γ,a kJsα,γ kL2 (√M −1 ζ) + k∇v Jsα,γ kL2 (M −1/2 ) + kβvJsα,γ kL2 (M −1/2 )
X
 ′ ′

× max kJsα+ei ,γ−ei kL2 (M −1/2 ) + max kJsα ,γ kL2 (M −1/2 ) .
i:ei ≤γ (α′ ,γ ′ )<(α,γ)

Inserting this into (3.44), and appealing to Young’s inequality to absorb J α,γ , ∇v J α,γ , and βvJ α,γ into
the dissipation terms, we deduce for all λ < λ2 ,

∂s kJsα,γ k2 2 √ ≥ 2λkJsα,γ k2 2 √ −1
L ( M −1 ζ) L ( M ζ)
 ′ ′

− CW,β,λ,α,γ,a max kJsα+ei ,γ−ei k2L2 (M −1/2 ) + max kJsα ,γ k2L2 (M −1/2 ) ,
i:ei ≤γ (α′ ,γ ′ )<(α,γ)

and thus, by Grönwall’s inequality,

e2λ(t−s) kJsα,γ k2L2 (M −1/2 ) .W,β,λ,α,γ,a k∇αx ∇γv gt k2L2 (M −1/2 )


ˆ t  
′ ′
+ e2λ(t−u) max kJuα+ei ,γ−ei k2L2 (M −1/2 ) + max kJuα ,γ k2L2 (M −1/2 ) du.
s i:ei ≤γ (α′ ,γ ′ )<(α,γ)

Iterating this inequality, and starting from the result (3.43) of Step 1 for Js0,0 = gs , the conclusion
follows. 

3.6. Proof of Lemma 3.6. In this section, we appeal again to duality but we shall use a slightly
different notation than in Section 3.4: for a densely-defined operator X on Lq (hzip )´ we now denote by

X ∗,p its adjoint on Lq (X) with respect to the weighted duality product (g, h) 7→ X ghhzip . In other
words, we use the same notation as in Section 3.4 for X ∗,p , but now with the weight ω replaced by hzi.
∗,p
We consider in particular the weighted adjoints {Wt,s }t≥s≥0 of the fundamental operators {Wt,s }t≥s≥0 ,
∞ ∗,p ′
and we note that for all t ≥ 0 and gt ∈ Cc (X) the flow gs := Wt,s gt is such that g ∈ C([0, t]; Lq (X))
satisfies the backward Cauchy problem
∂s gs = −Bµ∗,p

s gs , for 0 ≤ s ≤ t,
(3.45)
gs |s=t = gt ,
where Bµ∗,p is the weighted adjoint of Bµ . This operator Bµ∗,p takes the explicit form (3.30) with ω now
replaced by hzi. The proof of Lemma 3.6 ultimately relies on the following result.
Lemma 3.9. For all k ≥ 0, 0 ≤ p ≤ 1, 0 ≤ t − s ≤ 1, and gt ∈ Cc∞ (X), we have
∗,p 3
Wt,s gt H k+1 (X)
.W,β,k,a (t − s)− 2 kgt kH k (X) , (3.46)

where the constant only depends on d, β, k, a, and kW kW k+2,∞ (Rd ) .


We postpone the proof of this result for a moment and start by showing that Lemma 3.6 follows as
a straightforward consequence.
Proof of Lemma 3.6. We start by applying the interpolation argument of [73, Lemma 2.4]: thanks to
the exponential decay estimates of Lemma 3.5, it suffices to find some θ ≥ 0 (only depending on d)
such that for all 1 < q ≤ 2, k ≥ 0, 0 < p ≤ 1, 0 ≤ t − s ≤ 1, and hs ∈ Cc∞ (X) we have
kAWt,s hs kH −k (M −1/2 ) .W,β,k,p,q,a (t − s)−θ khs kW −k,q (hzip ) .
In order to prove this, we argue by duality: more precisely, recalling A = ΛχR , it suffices to find some
θ ≥ 0 such that for all 1 < q ≤ 2, k ≥ 0, 0 < p ≤ 1, 0 ≤ t − s ≤ 1, and gt ∈ Cc∞ (X),
∗,p
kWt,s (χR M −1 hzi−p gt )kW k,q′ (X) .W,β,k,p,q,a (t − s)−θ kgt kH k (M −1/2 ) . (3.47)
By the Sobolev inequality with 2 ≤ q ′ < ∞, the left-hand side can be estimated as follows,
∗,p ∗,p
kWt,s (χR M −1 hzi−p gt )kW k,q′ (X) .k,q kWt,s (χR M −1 hzi−p gt )kH k+d (X) ,
3d
and the desired bound (3.47) then follows with θ = 2 by iterating the result of Lemma 3.9. 
The rest of this section is devoted to the proof of Lemma 3.9. For k = 0, this is in fact a standard
consequence of the theory of hypoellipticity as in [59, 87, 73]. For k > 0, we argue by induction, further
using parabolic estimates similarly as in Section 3.4.
∗,p
Proof of Lemma 3.9. Given t ≥ 0 and gt ∈ Cc∞ (X), let gs := Wt,s gt be the solution of the backward
Cauchy problem (3.45), and recall that Bµ∗,p takes the explicit form (3.30) with ω replaced by hzi,
Bµ∗,p h = 1
2 △v h + v · ∇x h + A0µ,p h − A1µ,p · ∇v h,
in terms of
−p −p −p 1
A0µ,p := 1 p p
+ ∇A + κ∇W ∗ µ · ∇v hzip − ΛχR ,

2 hzi △v hzi + hzi v · ∇x hzi − hzi 2 βv
−p
A1µ,p := 1
2 βv + ∇A + κ∇W ∗ µ − hzi ∇v hzi .
p

We split the proof into two steps.


Step 1. Case k = 0: proof that for all 0 ≤ t − s ≤ 1 we have
3
k∇x gs kL2 (X) .W,β,a (t − s)− 2 kgs kL2 (X) . (3.48)

Integrating by parts, we can compute


ˆ ˆ ˆ ˆ  
2 ∗,p 2
∂s |gs | = −2 gs Bµs gs = |∇v gs | − |gs |2 2A0µs ,p + divv (A1µs ,p ) ,
X X X X

and thus, by definition of A0µ,p , A1µ,p ,


ˆ ˆ ˆ
2 2
∂s |gs | ≥ |∇v gs | − CW,β,a |gs |2 .
X X X

Similarly, we can easily estimate


ˆ ˆ ˆ  
2 2
∂s |∇x gs | ≥ |∇xv g| − CW,β,a |∇x gs |2 + |gs |2 + |∇x gs ||∇v gs | ,
ˆX ˆ ˆ X
ˆ  
2 2 2
∂s |∇v gs | ≥ |∇v gs | − 2 ∇v gs · ∇x gs − CW,β,a |∇v gs |2 + |gs |2 ,
X X
ˆ ˆ ˆX X
ˆ  
2 2
−∂s ∇x gs · ∇v gs ≥ 2 1
|∇x gs | − |∇xv gs ||∇v gs | − CW,β,a |∇v gs |2 + |gs |2 .
X X X X

Let us consider the functional


        F_s(g) := a₀ ∫_X |g|² + a₁ (t − s) ∫_X |∇_v g|² + a₂ (t − s)³ ∫_X |∇_x g|² − 2a₃ (t − s)² ∫_X ∇_x g·∇_v g,    (3.49)

where the constants a0 , a1 , a2 , a3 > 0 will be suitably chosen in a moment. In these terms, using
Young’s inequality, the above inequalities lead us to deduce for all 0 ≤ t − s ≤ 1 and 0 < ε, δ ≤ 1,
 ˆ   ˆ
−1 2 2
∂s Fs (gs ) ≥ a0 − CW,β,a(δ a1 + a2 + a3 ) 1
|∇v gs | + 2 a3 − δa1 − CW,β,aa2 (t − s) |∇x gs |2
X X
ˆ ˆ
−1 2 2 3 2
 
+ a1 − ε a3 (t − s) |∇v gs | + a2 − εa3 (t − s) |∇xv gs |
X X
ˆ
|gs |2 .

− CW,β,a a0 + a1 + a2 + a3
X

Choosing for instance a0 = 2CW,β,a (δ−1 a1 + a2 + a3 ), a1 = 4a23 , a2 = 1, a3 = 4CW,β,a , ε = (2a3 )−1 ,


and δ = (32a3 )−1 , we obtain
ˆ ˆ
∂s Fs (gs ) ≥ 21 a0 |∇v gs |2 + 18 a3 (t − s)2 |∇x gs |2
X X
ˆ ˆ ˆ
2 2 3 2
1 1
+ 2 a1 (t − s) |∇v gs | + 2 (t − s) |∇xv gs | − 2a0 CW,β,a |gs |2
X X X
ˆ
≥ −2a0 CW,β,a |gs |2 . (3.50)
X

By definition of F_s, as the choice of a₁, a₂, a₃ satisfies a₁a₂ = 4a₃², we have
        F_s(g) ≥ a₀ ‖g‖²_{L²(X)} + ½ a₁ (t − s) ‖∇_v g‖²_{L²(X)} + ½ (t − s)³ ‖∇_x g‖²_{L²(X)},    (3.51)
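For illustration, (3.51) follows from Young's inequality: setting u := (t − s)^{1/2} ‖∇_v g‖_{L²(X)} and w := (t − s)^{3/2} ‖∇_x g‖_{L²(X)}, we have
        2a₃ (t − s)² | ∫_X ∇_x g · ∇_v g | ≤ 2a₃ u w ≤ 2a₃² u² + ½ w² = ½ a₁ (t − s) ‖∇_v g‖²_{L²(X)} + ½ (t − s)³ ‖∇_x g‖²_{L²(X)},
where we used a₁ = 4a₃² and a₂ = 1.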

so that the above estimate (3.50) entails


∂s Fs (gs ) &W,β,a −Fs (gs ).
By Grönwall’s inequality with Ft (g) = a0 kgk2L2 (X) , this yields for all 0 ≤ t − s ≤ 1,

Fs (gs ) .W,β,a kgt kL2 (X) ,

and the claim (3.48) then follows from (3.51).



Step 2. Conclusion.
Given multi-indices α, γ ∈ Nd , we set for abbreviation Jsα,γ := ∇αx ∇γv gs , which satisfies
∂s Jsα,γ = −Bµ∗,p α,γ
− rsα,γ , for 0 ≤ s ≤ t,

s Js
Jsα,γ |s=t = ∇αx ∇γv gt ,
where the remainder term is given by
rsα,γ := [∇αx ∇γv , Bµ∗,p
s
]gs .
Repeating the proof of (3.50), for Fs defined in (3.49) with the same choice of constants a0 , a1 , a2 , a3
as in Step 1, we get for all 0 ≤ t − s ≤ 1,
ˆ ˆ
α,γ α,γ 2 2
1
∂s Fs (Js ) ≥ 2 a0 1
|∇v Js | + 8 a3 (t − s) |∇x Jsα,γ |2
X X
ˆ ˆ ˆ
2 α,γ 2 3
1 1
+ 2 a1 (t − s) |∇v Js | + 2 (t − s) |∇xv Js | − 2a0 CW,β,a |Jsα,γ |2
α,γ 2
X X X
ˆ ˆ ˆ
α,γ α,γ α,γ α,γ 3
− 2a0 Js rs − 2a1 (t − s) ∇v Js · ∇v rs − a2 (t − s) ∇x Jsα,γ · ∇x rsα,γ
X X X
ˆ ˆ
2 α,γ α,γ 2
+ 2a3 (t − s) ∇x Js · ∇v rs + 2a3 (t − s) ∇v Jsα,γ · ∇x rsα,γ .
X X
Recalling that the remainder term rsα,γ
can be written as in (3.36) with ω replaced by hzi, integrating
by parts, and using Young’s inequality to absorb all factors involving Jsα,γ into the dissipation terms,
we deduce for all 0 ≤ t − s ≤ 1,

∂s Fs (Jsα,γ ) &W,β,α,γ,a −kJsα,γ k2L2 (X)


 
′ ′ ′ ′ ′ ′
− max kJsα ,γ k2L2 (X) + (t − s)k∇v Jsα ,γ k2L2 (X) + (t − s)3 k∇x Jsα ,γ k2L2 (X)
(α′ ,γ ′ )<(α,γ)
 
α+ei ,γ−ei 2 α+ei ,γ−ei 2 3 α+ei ,γ−ei 2
− max kJs kL2 (X) + (t − s)k∇v Js kL2 (X) + (t − s) k∇x Js kL2 (X) .
i:ei ≤γ

By (3.51), this entails


′ ′
∂s Fs (Jsα,γ ) &W,β,α,γ,a −Fs (Jsα,γ ) − max Fs (Jsα ,γ ) − max Fs (Jsα+ei ,γ−ei ),
(α′ ,γ ′ )<(α,γ) i:ei ≤γ

and thus, by Grönwall’s inequality with Ft (g) = a0 kgk2L2 (X) , we deduce for all 0 ≤ t − s ≤ 1,
ˆ t 
α,γ α γ ′ ′
Fs (Js ) .W,β,α,γ,a k∇x ∇v gt kL2 (X) + max Fu (Juα ,γ ) + max Fu (Juα+ei ,γ−ei ) du.
s (α′ ,γ ′ )<(α,γ) i:ei ≤γ

By a direct iteration, this proves for all 0 ≤ t − s ≤ 1,


Fs (Jsα,γ ) .W,β,α,γ,a kgt kH |α|+|γ| (X) ,
and the conclusion (3.46) then follows from (3.51). 

4. Representation of Brownian cumulants


This section is devoted to the representation of Brownian cumulants by means of Lions calculus.
More precisely, our starting point is the Lions expansion of Lemma 2.1: following [25], it leads to an
expansion of quantities of the form E_B[Φ(μ_t^N)] as power series in 1/N. We introduce so-called L-graphs
(or Lions graphs) as a new diagram representation that allows us to efficiently capture cancellations in
moment computations, leading us to a useful representation of Brownian cumulants. (Note this is
unrelated to the Lions forests in [34].) In the sequel, the nth time-integration simplex is denoted by
        △_t^n := { (t₁, . . . , tₙ) ∈ [0, t]^n : 0 < tₙ < . . . < t₁ < t },


and we also define


        △^n := { (t, τ) : t > 0, τ ∈ △_t^n }.


4.1. Lions expansion along the flow. We appeal to Lemma 2.1 similarly as in [25] to expand
the Brownian expectation E_B[Φ(μ_t^N)] as a power series in 1/N. To this aim, we start with the following
iterative definition, which describes the natural quantities that appear in the expansion.
Definition 4.1. Given n ≥ 1 and a smooth functional Φ : ∆n × P(X) → R, we define the sequence
(m)
(UΦ , Φ(m) )m≥1 as follows:
• For m = 1, we set for all t > 0, τ = (τ1 , . . . , τn ) ∈ △nt , 0 < s < τn , and µ ∈ P(X),
        U_Φ^{(1)}((t, τ, s), μ) := Φ( (t, τ), m(τₙ − s, μ) ),
and
        Φ^{(1)}((t, τ, s), μ) := ∫_X tr[ a₀ ∂_μ² U_Φ^{(1)}((t, τ, s), μ)(z, z) ] μ(dz).

• For m ≥ 2, we iteratively define for all t > 0, τ = (τ₁, . . . , τ_{n+m−1}) ∈ △_t^{n+m−1}, 0 < s < τ_{n+m−1}, and μ ∈ P(X),
        U_Φ^{(m)}((t, τ, s), μ) := Φ^{(m−1)}( (t, τ), m(τ_{n+m−1} − s, μ) ),
and
        Φ^{(m)}((t, τ, s), μ) := ∫_X tr[ a₀ ∂_μ² U_Φ^{(m)}((t, τ, s), μ)(z, z) ] μ(dz).

By convention, for n = 0, given a smooth functional Φ : P(X) → R, we identify it with the functional
Φ̃ : △0 × P(X) → R given by Φ̃(t, µ) := Φ(µ), and we then set
        U_Φ^{(1)}((t, s), μ) := Φ( m(t − s, μ) ),    0 < s < t,
from which we can define U_Φ^{(m)}, Φ^{(m)} iteratively as above.
This definition is a minor extension of [25], where only the case n = 0 was considered. By a
straightforward adaptation of [25, Theorems 2.15–2.16], we emphasize that this definition always makes
sense with our smoothness assumptions. Moreover, for all n ≥ 0 and all smooth functionals Φ :
△n × P(X) → R, it is clear from the definition that for all m, p ≥ 1, t > 0, τ ∈ △n+m+p
t , and µ ∈ P(X)
we have
(m+p)  (p) 
UΦ (t, τ ), µ = UΦ(m) (t, τ ), µ , (4.1)
and similarly,
Φ(m+p) (t, τ ), µ = (Φ(m) )(p) (t, τ ), µ .
 

In these terms, we can now state the following expansion result for functionals of the empirical measure
along the particle dynamics. This is similar to the so-called weak error expansion in [25, Theorem 2.9];
we include a short proof for completeness.
Proposition 4.2. Given a smooth functional Φ : P(X) → R, we have for all n ≥ 0 and t > 0,
        E_B[Φ(μ_t^N)] = ∑_{m=0}^{n} (2N)^{−m} ∫_{△_t^m} U_Φ^{(m+1)}((t, τ, 0), μ_0^N) dτ
                        + (2N)^{−(n+1)} ∫_{△_t^{n+1}} E_B[ U_Φ^{(n+2)}((t, τ, τ_{n+1}), μ^N_{τ_{n+1}}) ] dτ.    (4.2)
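For orientation, let us spell out (4.2) in the case n = 1 (a special case of the proposition, recorded for the reader's convenience): recalling the convention U_Φ^{(1)}((t, 0), μ) = Φ(m(t, μ)) from Definition 4.1, it reads
        E_B[Φ(μ_t^N)] = Φ( m(t, μ_0^N) ) + (2N)^{−1} ∫_0^t U_Φ^{(2)}((t, τ₁, 0), μ_0^N) dτ₁ + (2N)^{−2} ∫_{△_t^2} E_B[ U_Φ^{(3)}((t, τ₁, τ₂, τ₂), μ^N_{τ₂}) ] dτ₁ dτ₂,
where the first term is the mean-field prediction and the last term is the remainder.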

Proof. We proceed by induction and split the proof into two steps.
Step 1. Case n = 0.
By Lemma 2.1, we find for all 0 ≤ s ≤ t and µ ∈ P(X),
ˆ sˆ
N N 1 h
(1)
i
tr a0 ∂µ2 UΦ (t, u), µN µN N
  
Φ m(t − s, µs ) = Φ m(t, µ0 ) + u (z, z) u (dz) du + Mt,s ,
2N 0 X
N ) with M N = 0, where we use the notation U (1)
for some square-integrable martingale (Mt,s s t,0 Φ from
Definition 4.1. Hence, taking the expectation with respect to PB and choosing s = t, we get
ˆ t ˆ 
N N
 1  2 (1) N
  N
EB [Φ(µt )] = Φ m(t, µ0 ) + EB tr a0 ∂µ UΦ (t, u), µu (z, z) µu (dz) du.
2N 0 X
With the notation of Definition 4.1, this means
t
1
ˆ h
(1) i
EB [Φ(µN (t, 0), µN EB Φ(1) (t, s), µN

t )] = UΦ 0 + s ds.
2N 0
(2)
Φ(1) (t, s), µN (t, s, s), µN
 
As s = UΦ s , this proves (4.2) with n = 0.
Step 2. General case.
We argue by induction. Suppose that (4.2) has been established for some n ≥ 0. Let t > 0 and τ =
(τ1 , . . . , τn+1 ) ∈ △n+1
t be fixed. Applying Lemma 2.1 with µ 7→ Φ(µ) replaced by µ 7→ Φ(n+1) ((t, τ ), µ),
we find similarly as in Step 1,

EB Φ(n+1) (t, τ ), µN = Φ(n+1) (t, τ ), m(τn+1 , µN


  
τn+1 0 )
ˆ τn+1 ˆ 
1 h
2 (1) N
 i
N
+ EB tr a0 ∂µ UΦ(n+1) (t, τ, u), µu (z, z) µu (dz) du.
2N 0 X
Using this to further decompose the remainder term in (4.2), using the notation of Definition 4.1, and
recalling (4.1), we precisely deduce that (4.2) also holds with n replaced by n + 1. 

4.2. Graphical notation and definition of L-graphs. We introduce a graphical notation associated
with Definition 4.1, defining the notion of L-graphs (or Lions graphs), which will considerably simplify
combinatorial manipulations in the sequel. Let Φ : P(X) → R be a reference smooth functional.
• Base point. Given n ≥ 0 and a smooth functional Ψ : △n × P(X) → R, we set for all t > 0,
τ = (τ1 , . . . , τn ) ∈ △nt , and 0 < s < τn ,
 (1)  
[Ψ] (t, τ, s), µ := UΨ (t, τ, s), µ = Ψ (t, τ ), m(τn − s, µ) .

In case of the reference smooth functional Ψ = Φ , we drop the subscript and simply set
  
(t, s), µ := [Φ] (t, s), µ = Φ m(t − s, µ) .
• Round edge. In view of Proposition 4.2, the key operation that we want to account for in our graphical
representation is
(k) (k+1)
UΨ 7→ UΨ ,
cf. Definition 4.1. This will be represented with the symbol , which we henceforth call “round edge”.
More precisely, given n ≥ 0 and a smooth functional Ψ : △n × P(X) → R, we define for all (t, τ ) ∈ △n ,
0 < s < τn , and µ ∈ P(X),
ˆ h i 
2
 
Ψ (t, τ, s), µ := tr a0 ∂µ Ψ (t, τ ), ν (z, z) ν(dz) , (4.3)
X ν=m(τn −s,µ)

hence for instance


(2) (3)
= UΦ , = UΦ ,

and so on. In particular, we emphasize that

Ψ 6= [Ψ] .

When iterating this operation, we add a subscript (m) to indicate the number m of iterations, that is,

(0) := , (1) := , (m+1) := (m) .

With this notation, the identity (4.1) takes on the following guise, for all m, p ≥ 0,

(m+p) = (m) .
(p)

• Time integration. We introduce a short-hand notation for ordered time integrals: given n, m ≥ 0
and given a smooth functional Ψ : △n+m × P(X) → R, we define for all (t, τ ) ∈ △n and µ ∈ P(X),
ˆ   ˆ  ˆ

Ψ (t, τ ), µ := Ψ (τ, µ) := Ψ (t, τ, σ), µ dσ,
△m △m
t △m
τn

and also, for m ≥ 1,


ˆ   ˆ  ˆ

Ψ (t, τ ), µ := Ψ (τ, µ) := Ψ (t, τ, σ, 0), µ dσ.
△m−1 ×0 △m−1
t ×0 △m−1
τn

In these terms, the result of Proposition 4.2 takes on the following guise: for all n ≥ 0 and t > 0,
n
1 
X ˆ 
EB [Φ(µN
t )] = (m) (µN
0 )
m=0
(2N )m △m
t ×0

1
ˆ h i
+ EB (n+1) (t, τ, τn+1 ), µN
τn+1 dτ. (4.4)
(2N )n+1 △n+1
t

In order to compute cumulants of functionals of the empirical measure along the flow, we shall need
to apply (4.4) with Φ replaced by powers Φk with k ≥ 1, and try to recognize the structure of the
relation between moments and cumulants, cf. Lemma 2.5. This will be performed in Section 4.3 and
motivates the further notation that we introduce below.
• Products. Products of different functionals are simply denoted by juxtaposing the corresponding
graphs. Some care is however needed on how to identify respective time variables. For that purpose,
we include subscripts with angular brackets indicating labels of the time variables for the different
subgraphs: given n, m ≥ 0, given functionals Ψ : △n × P(X) → R and Θ : △m × P(X) → R, and given
i1 < . . . < in and j1 < . . . < jm with {i1 , . . . , in } ∪ {j1 , . . . , jm } = JpK (possibly not disjoint), we set
for all (t, τ ) ∈ △p and µ ∈ P(X),
  
Ψhi1 ,...,in i Θhj1 ,...,jm i (t, τ ), µ := Ψ (t, τi1 , . . . , τin ), µ Θ (t, τj1 , . . . , τjm ), µ . (4.5)
For instance,
 (2)  (3) 
h1,4i (2)h2,3,4i (t, τ1 , τ2 , τ3 , τ4 ), µ = UΦ (t, τ1 , τ4 ), µ UΦ (t, τ2 , τ3 , τ4 ), µ .
We also occasionally use indices to label time variables in subgraphs, for instance

h1i h2,3i ≡ h1,4i (2)h2,3,4i .


h4i h4i

• Straight edges. When applying the round edge (4.3) to a power of a given functional, we are led to
defining another type of operation, which we represent by a straight edge between subgraphs. More
precisely, given n, m ≥ 0, given smooth functionals Ψ : △n+1 × P(X) → R and Θ : △m+1 × P(X) → R,

and given a partition {i1 , . . . , in } ⊎ {j1 , . . . , jm } = Jn + mK with i1 < . . . < in and j1 < . . . < jm , we
set for all (t, τ, s, s′ ) ∈ △n+m+2 ,

Ψhi1 ,...,in ,m+n+1i Θhj1 ,...,jm ,m+n+1i (t, τ, s, s′ , µ



hm+n+2i
ˆ  
  
:= ∂µ Ψ (t, τi1 , . . . , τin , s), ν (z) · a0 ∂µ Θ (t, τj1 , . . . , τjm , s), ν (z) ν(dz) . (4.6)
X ν=m(s−s′ ,µ)

Note that the dotted boxes around Ψ and Θ above have no particular meaning in the graphical notation:
they are just meant to emphasize that the straight edge is between the subgraphs corresponding to Ψ
and Θ; we remove them when no confusion is possible, simply writing for instance
. .
. .
h1i h2i h1i
≡ h1i h2i h1i
, h3ih2i
≡ h2i
.
h1, 2i h1, 2i h3i

• Graphical notation and terminology. Starting from a number of base points, surrounding some
subgraphs by round edges, and connecting some subgraphs via straight edges, we are led to a class of
diagrams that we shall call L-graphs (or Lions graphs). Viewing round edges as loops,

Ψ ≡ Ψ , Ψ ≡ Ψ ,

we can view L-graphs as (undirected) multi-hypergraphs that satisfy a number of properties. First
recall that a multi-hypergraph is a pair (V, E) where:
— V is a set of elements called base points or vertices;
— E is a multiset of elements called edges, which are pairs of non-empty subsets of V . The two subsets
that are connected by an edge are called the ends of the edge.
We then formally define an (unlabeled) L-graph as a multi-hypergraph Ψ = (V, E) such that the
following three properties hold:
— an edge in E is either a loop (so-called round edge) or it connects disjoint vertex subsets (so-called
straight edge): in other words, for all {A, B} ∈ E, we have either A = B or A ∩ B = ∅;
— a vertex subset S ⊂ V can only be the end of at most one straight edge, but it can at the same
time be the end of several round edges; in particular, each straight edge is simple, but round edges
can be multiple;
— if a vertex subset S ⊂ V is the end of some edge (round or straight), then strict subsets of S can
only be connected to other strict subsets of S: in other words, for all {A, B} ∈ E and A′ ( A, the
condition {A′ , A′′ } ∈ E implies A′′ ( A.
A labeled L-graph is an unlabeled L-graph endowed with a time labeling V ⊔ E → N, which associates
a time label to each vertex and each edge. Note that each labeled L-graph is uniquely associated
to a functional that can be obtained by iterating the round edge operation (4.3), the straight edge
operation (4.6), and by taking products (4.5). We introduce some further useful terminology:
— Induced subgraphs. Given an L-graph (V, E) and a subset S ⊂ V , we define the L-graph induced
by S as the pair (S, ES ) where ES is the multiset of all edges {A, B} ∈ E with A, B ⊂ S. We
define the L-graph strictly induced by S as the pair (S, ES′ ) where ES′ is now the multiset of all
edges {A, B} ∈ E with A, B ( S. By definition, we note that induced L-graphs are indeed L-
graphs themselves, and moreover ES′ coincides with ES after removing all occurrences of the round
edge {S}. We call L-subgraph of (V, E) any L-graph (S, F ) with S ⊂ V and ES′ ⊂ F ⊂ ES .
— Stability. Given an L-graph (V, E), an L-subgraph (S, F ) is said to be stable if for all {A, B} ∈ E
with A ( S we also have B ( S. In particular, by definition of an L-graph, a vertex subset that is
the end of an edge is automatically inducing a stable subgraph.

— Connectedness. An L-graph (V, E) is said to be connected if there is no partition V = A ∪ B with


A, B 6= ∅, A ∩ B = ∅, and E = EA ∪ EB . An L-graph can be uniquely decomposed into its
connected components.
— Irreducibility. An L-graph (V, E) is said to be irreducible if for all straight edges {A, B} ∈ E the
induced subgraphs (A, EA ) and (B, EB ) are both connected and if for all round edge {A} ∈ E the
strictly induced subgraph (A, EA′ ) is connected.

• Rules for time ordering and symmetrization. Given an L-graph (V, E), we consider the following four
rules that restrict the possibilities for the time labeling V ⊔ E → N:
(R1) Round edges. In accordance with definition (4.3), the time label of a round edge {A} ∈ E must
always be larger than time labels of the strictly induced subgraph (A, EA ′ ); in other words, it

must be larger than time labels of the subgraph that the round edge ‘surrounds’.
(R2) Straight edges. In accordance with definition (4.6), the time label of a straight edge {A, B} ∈ E
must always be larger than time labels of the two induced subgraphs (A, EA ) and (B, EB ). In
addition, the last time label of the two subgraphs must coincide.
(R3) Products. In any stable subgraph (S, F ), decomposing it into its connected components, the last
time label of each component coincides.
(R4) No other repetition and no gap. Apart from equalities of time labels imposed by the above three
rules (R1)–(R3), all time labels must be different. In addition, the set of time labels, that is, the
image of the time labeling map V ⊔ E → N, must be of the form JnK = {1, . . . , n} for some n ≥ 1
(that is, without gap).
In our notation, an unlabeled L-graph will be understood as the arithmetic average of all the labeled
L-graphs that can be obtained by endowing the graph with a time labeling that satisfies the above
four rules (R1)–(R4). For instance,
 
1
≡ 2 h1,3i h2,3i + h2,3i h1,3i = h1,3i h2,3i ,
 
1
(2) ≡ 3 h1,4i (2)h2,3,4i + h2,4i (2)h1,3,4i + h3,4i (2)h1,2,4i ,
 h3i h3i h2i 
1
≡ 2
h4i + h4i + h4i .
h1, 4i h2, 3i h2, 4i h1, 3i h3, 4i h1, 2i

As we shall see in the next section, the whole point of this graphical notation is that it allows for quick
and easy computations to derive representation formulas for Brownian cumulants. We summarize
the main graphical computation rules in the following lemma. Note in particular that item (ii) below
implies that any L-graph is equal to a linear combination of irreducible L-graphs with the same number
of vertices and edges.

Lemma 4.3 (Graphical computation rules).


(i) For all k ≥ 1, a base point associated with the power Φk is equivalent to the product of k copies
of the basepoint associated with Φ,

[Φ] = , [Φ2 ] = , [Φk ] = ( )k .

(ii) For any L-graph Ψ : △^n × P(X) → R, for all t > 0, τ = (τ_1, …, τ_n) ∈ △^n_t and 0 < s < τ_n, for all µ ∈ P(X),
$$\Psi\big((t,\tau),\,m(\tau_n-s,\mu)\big) \;=\; \Psi\big((t,\tau_1,\dots,\tau_{n-1},s),\,\mu\big). \tag{4.7}$$
(iii) Given two L-graphs Ψ : △n+1 × P(X) → R and Θ : △m+1 × P(X) → R, and given i1 < . . . < in
and j1 < . . . < jm with {i1 , . . . , in } ∪ {j1 , . . . , jm } = JpK for some 0 ≤ p ≤ n + m, we find

Ψhi1 ,...,in ,p+1i Θhj1 ,...,jm,p+1i = Ψ Θhj1 ,...,jm ,p+2i


hp+2i hi1 ,...,in ,p+1,p+2i

+ Ψhi1 ,...,in ,p+2i Θ hj1 ,...,jm ,p+1,p+2i

+ 2 Ψhi1 ,...,in ,p+1i hp+2i


Θhj1 ,...,jm ,p+1i . (4.8)

In particular, in the first right-hand side term, we note that the penultimate time label p + 1 of Ψ
is larger than all the time labels j1 , . . . , jm in Θ (and conversely in the second term), thus adding
nontrivial time ordering not implied by the basic rules (R1)–(R4). Using symmetrized notations,
we find for instance

= 2 +2 , (4.9)
= 2 +2 +4 +2 . (4.10)

This naturally generalizes to products of more than two functionals; we skip the details for con-
ciseness.
(iv) Given functionals Ψ : △^{n+1} × P(X) → R and Θ : △^{m+1} × P(X) → R, the time integral of their symmetrized product can be factorized as
$$\binom{n+m}{n}\int_{\triangle^{n+m}\times 0}(\Psi\,\Theta) \;=\; \Big(\int_{\triangle^{n}\times 0}\Psi\Big)\Big(\int_{\triangle^{m}\times 0}\Theta\Big).$$

Proof. All four items are direct consequences of the definitions. First, the definition of the basepoint
in the graphical notation means
(1) k
(t, s), µ = UΦk (t, s), µ = (Φk ) t, m(t − s, µ) = Φ t, m(t − s, µ) ,
  
[Φk ]

which proves item (i). Second, recalling the semigroup property

m(τn−1 − τn , m(τn − s, µ)) = m(τn−1 − s, µ),

we find from item (i) that [Φk ] satisfies (4.7). From the definitions (4.3) and (4.6), this property is
clearly conserved when surrounding an L-graph satisfying (4.7) by a round edge, and when connecting
two such L-graphs by a straight edge, which yields item (ii). The proof of items (iii) and (iv) is divided
into the following two steps.

Step 1. Proof of (iii).


Given smooth functionals Ψ : △n+1 × P(X) → R and Θ : △m+1 × P(X) → R, and given i1 < . . . < in
and j1 < . . . < jm with {i1 , . . . , in } ∪ {j1 , . . . , jm } = JpK, the definition of the round edge notation
yields for all (t, τ ) ∈ △p+2 and µ ∈ P(X),

Ψhi1 ,...,in ,p+1i Θhj1 ,...,jm ,p+1i



(t, τ ), µ
hp+2i
ˆ h 
tr a0 ∂µ2 Ψ (t, τi1 , . . . , τin , τp+1 ), ν

=
X  i 
× Θ (t, τj1 , . . . , τjm , τp+1 ), ν (z, z) ν(dz) ,
ν=m(τp+1 −τp+2 ,µ)
and thus, using the chain rule for the Lions derivative,
Ψhi1 ,...,in ,p+1i Θhj1 ,...,jm ,p+1i

(t, τ ), µ
hp+2i
ˆ h i
tr a0 (∂µ2 Ψ) (t, τi1 , . . . , τin , τp+1 ), ν (z, z)

=
X 

×Θ (t, τj1 , . . . , τjm , τp+1 ), ν ν(dz)
ν=m(τp+1 −τp+2 ,µ)
ˆ h i
tr a0 (∂µ2 Θ) (t, τj1 , . . . , τjm , τp+1 ), ν (z, z)

+
X 

×Ψ (t, τi1 , . . . , τin , τp+1 ), ν ν(dz)
ν=m(τp+1 −τp+2 ,µ)
ˆ 

+ 2 (∂µ Ψ) (t, τi1 , . . . , τin , τp+1 ), ν (z)
X  i

·a0 (∂µ Θ) (t, τj1 , . . . , τjm , τp+1 ), ν (z) ν(dz) ,
ν=m(τp+1 −τp+2 ,µ)

or equivalently,
Ψhi1 ,...,in ,p+1i Θhj1 ,...,jm,p+1i

(t, τ, τp+2 ), µ
hp+2i

= Θ (t, τj1 , . . . , τjm , τp+1 ), m(τp+1 − τp+2 , µ)
ˆ h i 
tr a0 (∂µ2 Ψ) (t, τi1 , . . . , τin , τp+1 ), ν (z, z) ν(dz)

×
X ν=m(τp+1 −τp+2 ,µ)

+ Ψ (t, τi1 , . . . , τin , τp+1 ), m(τp+1 − τp+2 , µ)
ˆ h i 
2

× tr a0 (∂µ Θ) (t, τj1 , . . . , τjm , τp+1 ), ν (z, z) ν(dz)
X ν=m(τp+1 −τp+2 ,µ)
ˆ 

+ 2 (∂µ Ψ) (t, τi1 , . . . , τin , τp+1 ), ν (z)
X  i 
·a0 (∂µ Θ) (t, τj1 , . . . , τjm , τp+1 ), ν (z) ν(dz) .
ν=m(τp+1 −τp+2 ,µ)

Further using item (ii), the identity (4.8) follows.


Step 2. Proof of (iv).
By definition of the symmetrized product, we have for all (t, τ) ∈ △^{n+m} and 0 < s < τ_{n+m},
$$(\Psi\,\Theta)\big((t,\tau,s),\mu\big) = \binom{n+m}{n}^{-1}\sum_{\sigma\in\mathrm{sym}(n+m)} \mathbb 1_{\sigma(1)<\dots<\sigma(n)}\,\mathbb 1_{\sigma(n+1)<\dots<\sigma(n+m)}\, \Psi\big((t,\tau_{\sigma(1)},\dots,\tau_{\sigma(n)},s),\mu\big)\,\Theta\big((t,\tau_{\sigma(n+1)},\dots,\tau_{\sigma(n+m)},s),\mu\big),$$
and thus, taking the time integral,
$$\int_{\triangle^{n+m}_t}(\Psi\,\Theta)\big((t,\tau,0),\mu\big)\,d\tau = \binom{n+m}{n}^{-1}\sum_{\sigma\in\mathrm{sym}(n+m)} \mathbb 1_{\sigma(1)<\dots<\sigma(n)}\,\mathbb 1_{\sigma(n+1)<\dots<\sigma(n+m)} \int_{\triangle^{n+m}_t}\Psi\big((t,\tau_{\sigma(1)},\dots,\tau_{\sigma(n)},0),\mu\big)\,\Theta\big((t,\tau_{\sigma(n+1)},\dots,\tau_{\sigma(n+m)},0),\mu\big)\,d\tau.$$
Noting that the sum over permutations allows us to reconstruct the full product of integrals, we obtain
$$\int_{\triangle^{n+m}_t}(\Psi\,\Theta)\big((t,\tau,0),\mu\big)\,d\tau = \binom{n+m}{n}^{-1}\Big(\int_{\triangle^{n}_t}\Psi\big((t,\tau,0),\mu\big)\,d\tau\Big)\Big(\int_{\triangle^{m}_t}\Theta\big((t,\tau,0),\mu\big)\,d\tau\Big),$$
which is precisely the statement of item (iv). 
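The factorization in item (iv) is easy to test numerically. The following sketch is only a Monte-Carlo sanity check under toy assumptions: the scalar integrands f and g standing in for Ψ and Θ, as well as all numerical parameters, are illustrative choices and are not taken from the text.

```python
import numpy as np
from math import comb, factorial
from itertools import combinations

rng = np.random.default_rng(0)
t, n, m, M = 1.0, 2, 3, 200_000
f = lambda u: np.exp(-u)          # toy integrand entering Psi
g = lambda u: 1.0 / (1.0 + u)     # toy integrand entering Theta

def simplex_mc(F, dim):
    # Monte-Carlo integral over the ordered simplex {t > tau_1 > ... > tau_dim > 0}:
    # mean of F over sorted uniforms, times the simplex volume t^dim / dim!
    tau = -np.sort(-rng.uniform(0, t, size=(M, dim)), axis=1)
    return F(tau).mean() * t**dim / factorial(dim)

Psi   = lambda tau: np.prod(f(tau), axis=1)
Theta = lambda tau: np.prod(g(tau), axis=1)

def sym_product(tau):
    # symmetrized product: average over the C(n+m, n) ways to split the ordered times
    total = np.zeros(tau.shape[0])
    for S in combinations(range(n + m), n):
        mask = np.zeros(n + m, dtype=bool); mask[list(S)] = True
        total += Psi(tau[:, mask]) * Theta(tau[:, ~mask])
    return total / comb(n + m, n)

lhs = comb(n + m, n) * simplex_mc(sym_product, n + m)
rhs = simplex_mc(Psi, n) * simplex_mc(Theta, m)
print(lhs, rhs)   # the two values agree up to Monte-Carlo error
```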

4.3. Graphical representation of Brownian cumulants. Starting from Proposition 4.2 in form
of (4.4), we can use the above graphical notation to easily compute cumulants of functionals of the
empirical measure along the particle dynamics. We first illustrate this by a direct computation of the
leading contribution to the variance and to the third cumulant.
Lemma 4.4. Given a smooth functional Φ : P(X) → R, we can represent the variance and the third
cumulant along the particle dynamics as
1 EN,2
ˆ
VarB [Φ(µN
t )] = (µN
0 ) + ,
N △t ×0 (2N )2
3 EN,3
ˆ
3 N
κB [Φ(µt )] = (µN
0 ) + ,
N 2 △2t ×0 (2N )3
where the error terms EN,2 , EN,3 are given explicitly by
(A1,2 )2  2
ˆ
N
EN,2 := A2,2 − 2A1,2 EB [Φ(µt )] + − (µN
0 ) ,
(2N )2 △t ×0
2
EN,3 := A3,3 − 3A2,2 EB [Φ(µN N 2 N 2
t )] − A1,3 EB [Φ(µt ) ] + 4A1,3 EB [Φ(µt )] − A2 EB [Φ(µN
t )]
(2N )3 1,3
ˆ  
− 4 + +6 + 12 + 6 (µN0 )
△3t ×0
 
1
ˆ
− +2 + 12 + 6 (µN
0 )
2N △4t ×0

ˆ 1
+ EB [Φ(µN
t )] 4 (µN
0 ) + (µN
0 )
△3t ×0 N
where for shortness we have defined
$$A_{k,m} := \mathbb{E}_B\Big[\int_{\triangle^m_t} [\Phi^k]_{(m)}\big((t,\tau,\tau_m),\mu^N_{\tau_m}\big)\,d\tau\Big].$$

Proof. We split the proof into three steps.


Step 1. Formula for variance.
Using Proposition 4.2 in form of (4.4) to accuracy O(N −2 ), we find
1 A2,2
ˆ
EB [Φ(µN 2 N
(µN

t ) ] = (t, 0), µ 0 + 0 ) + , (4.11)
2N △t ×0 (2N )2
with the notation for A2,2 in the statement. By Lemma 4.3(iii) in form of (4.9), we get
 2
EB [Φ(µN 2
t ) ] = (t, 0), µN
0

2 2 A2,2
 ˆ ˆ
N
 
+ (t, 0), µ0 (µN
0 ) + (µN
0 )+ . (4.12)
2N △t ×0 2N △t ×0 (2N )2
On the other hand, using again Proposition 4.2 in form of (4.4) to accuracy O(N −2 ), we also have
1 A1,2
ˆ
N N
(µN

EB [Φ(µt )] = (t, 0), µ0 + 0 ) + .
2N △t ×0 (2N )2
Taking the square of this identity, and comparing it to (4.12), the formula for the variance follows after
straightforward simplifications.
Step 2. Next-order formula for variance.


Before turning to the third cumulant, we expand the formula for the variance to the next order. Instead
of (4.11), we start from Proposition 4.2 with accuracy O(N −3 ), in form of
1 1 A2,3
ˆ ˆ
N 2 N N
(µN

EB [Φ(µt ) ] = (t, 0), µ0 + (µ0 ) + 2 0 )+ .
2N △t ×0 (2N ) △2t ×0 (2N )3
Then further appealing to Lemma 4.3(iii) in form of (4.9), as well as to Lemma 4.3(iv), we obtain,
instead of (4.12),
2 2  2
 ˆ ˆ
N 2 N N
 N

EB [Φ(µt ) ] = (t, 0), µ0 + (t, 0), µ0 (µ0 ) + (µN
0 )
2N △t ×0 2N △t ×0
2  1  2
ˆ ˆ
N
 N

N
+ (t, 0), µ 0 (µ 0 ) + (µ 0 )
(2N )2 △2t ×0 (2N )2 △t ×0
1 A2,3
ˆ  
+ 2
4 + 2 (µN0 )+ .
(2N ) △2t ×0 (2N )3
On the other hand, using again Proposition 4.2 in form of (4.4) to accuracy O(N −3 ), we also have
1 1 A1,3
ˆ ˆ
N N N
(µN

EB [Φ(µt )] = (t, 0), µ0 + (µ0 ) + 2 0 )+ . (4.13)
2N △t ×0 (2N ) △t ×0 (2N )3
Taking the square of this identity, and comparing it to the previous one for EB [Φ(µN 2
t ) ], we are led to
2 1 RN
ˆ ˆ  
VarB [Φ(µN
t )] = (µ N
0 ) + 2
4 + 2 (µN
0 )+ . (4.14)
2N △t ×0 (2N ) △2t ×0 (2N )3
where the error term RN is given by
(A1,3 )2 1  2
ˆ ˆ
RN := A2,3 − 2A1,3 EB [Φ(µN
t )] + −2 (µN
0 )− (µN
0 ) .
(2N )3 △3t ×0 2N △2t ×0

Step 3. Formula for third cumulant.


Using Proposition 4.2 in form of (4.4), we find
1  1  A3,3
ˆ  ˆ 
N 3 N
(µN (µN

EB [Φ(µt ) ] = (t, 0), µ0 + 0 ) + 0 ) + .
2N △t ×0 (2N )2 △2t ×0 (2N )3
To compute the different right-hand side terms, we appeal to Lemma 4.3 in form of
= 3 +6 ,
2
= 3 +3 + 12 +6 +6 + 12

Inserting these identities into the above, comparing with (4.13) and (4.14), and recalling that the third
cumulant is given by
κ3 [Φ(µN N 3 N N N 3
t )] = EB [Φ(µt ) ] − 3 VarB [Φ(µt )]EB [Φ(µt )] − EB [Φ(µt )] ,
the formula in the statement follows after straightforward simplifications. 
We now show how the above explicit diagrammatic computation can be pursued systematically to
higher orders. First note that starting from Proposition 4.2 in form of (4.4) and appealing to the
computation rules of Lemma 4.3 to expand each L-graph into a sum of irreducible graphs, the kth
moment of a smooth functional along the flow can be expanded as a power series in N −1 , where the
term of order O(N −m ) is given by a sum of all irreducible L-graphs with k vertices and m edges (see
indeed (4.15) below). In the above lemma, for the first cumulants, we manage to capture cancellations
showing that the power series for the variance starts at order O(N −1 ) and that the power series for
the third cumulant starts at O(N −2 ). In the following result, we unravel the underlying combinatorial
structure and show how cancellations can be systematically captured to higher orders: in a nutshell, the
power series for cumulants takes the same form as for moments, except that the sum over irreducible
L-graphs is restricted to connected graphs, cf. (4.17). In particular, given that for m < k − 1 there is
no connected L-graph with k vertices and only m edges, we deduce that the power series for the kth
cumulant must start at order O(N 1−k ).
Proposition 4.5. Given a smooth functional Φ : P(X) → R, for all k ≥ 1, we can expand as follows
the kth Brownian moment along the flow: for all n ≥ 0,
$$\mathbb{E}_B\big[\Phi(\mu^N_t)^k\big] = \sum_{m=0}^{n}\frac{1}{(2N)^m}\sum_{\Psi\in\Gamma(k,m)}\gamma(\Psi)\int_{\triangle^m_t\times 0}\Psi \;+\; \frac{R^k_{N,n}(t)}{(2N)^{n+1}}, \tag{4.15}$$

where Γ(k, m) stands for the set of all (unlabeled) irreducible L-graphs with k vertices and m edges,
where γ is some map Γ(k, m) → N, and where the remainder is given by
$$R^k_{N,n}(t) := \mathbb{E}_B\Big[\int_{\triangle^{n+1}_t} [\Phi^k]_{(n+1)}\big((t,\tau,\tau_{n+1}),\mu^N_{\tau_{n+1}}\big)\,d\tau\Big]. \tag{4.16}$$

Moreover, with this notation, the kth cumulant can be expanded as follows: for all n ≥ 0,
$$\kappa^k_B\big[\Phi(\mu^N_t)\big] = \mathbb 1_{n\ge k-1}\sum_{m=k-1}^{n}\frac{1}{(2N)^m}\sum_{\Psi\in\Gamma^\circ(k,m)}\gamma(\Psi)\int_{\triangle^m_t\times 0}\Psi \;+\; \frac{\tilde R^k_{N,n}(t)}{(2N)^{n+1}}, \tag{4.17}$$
where the sum is now restricted to Γ◦(k, m) ⊂ Γ(k, m), which stands for the subset of all connected (unlabeled) irreducible L-graphs with k vertices and m edges, and where the remainder R̃^k_{N,n}(t) can be expressed as a linear combination of elements of the set
$$\Big\{\, S^{k-j}_{N,n}(t)\prod_{i=1}^{j}\mathbb{E}_B\big[\Phi(\mu^N_t)^i\big]^{\alpha_i} \;:\; 0\le j\le k \text{ and } \alpha_1,\dots,\alpha_j\in\mathbb{N} \text{ with } \sum_{i=1}^{j}i\alpha_i=j \Big\},$$
with bounded coefficients independent of N, Φ, µ^N_0, t, where the factors {S^k_{N,n}(t)}_k are defined by
$$S^k_{N,n}(t) := R^k_{N,n}(t) - \sum_{j=1}^{k}\binom{k-1}{j-1}\sum_{m=0}^{n} R^{k-j}_{N,n-m}(t)\sum_{\Psi\in\Gamma^\circ(j,m)}\gamma(\Psi)\int_{\triangle^m_t\times 0}\Psi(\mu^N_0). \tag{4.18}$$

Proof. We split the proof into three steps.


Step 1. Proof of (4.15).
By Proposition 4.2 in form of (4.4), we recall that we have for all k ≥ 1 and n ≥ 0,
$$\mathbb{E}_B\big[\Phi(\mu^N_t)^k\big] = \sum_{m=0}^{n}\frac{1}{(2N)^m}\int_{\triangle^m_t\times 0}[\Phi^k]_{(m)}(\mu^N_0) \;+\; \frac{R^k_{N,n}(t)}{(2N)^{n+1}}, \tag{4.19}$$
with remainder R^k_{N,n}(t) defined in (4.16). In order to prove (4.15), it remains to use Lemma 4.3(iii) to
expand the L-graphs in the above right-hand side as sums of irreducible L-graphs. By a direct induction
argument, we note that all time labelling satisfying the basic rules (R1)–(R4) appear symmetrically in
the expansion, thus proving that for all k ≥ 1 and m ≥ 0 we can expand
$$[\Phi^k]_{(m)} = \sum_{\Psi\in\Gamma(k,m)}\gamma(\Psi)\,\Psi, \tag{4.20}$$

for some map γ : Γ(k, m) → N, where as in the statement Γ(k, m) stands for the set of all (unlabeled)
irreducible L-graphs with k vertices and m edges. This already proves (4.15).
Step 2. Proof that for all k ≥ 1 and m ≥ 0 the map γ : Γ(k, m) → N in (4.20) satisfies, for all L-graphs Ψ ∈ Γ(k, m),
$$\gamma(\Psi) = \sum_{\substack{\Theta\subset\Psi\\ \text{connected component}}}\binom{k-1}{V(\Theta)-1}\binom{m}{E(\Theta)}\,\gamma(\Theta)\,\gamma(\Psi\setminus\Theta), \tag{4.21}$$
where the sum runs over all connected components Θ of the L-graph Ψ, where V (Θ) and E(Θ) stand
for the number of vertices and the number of edges in Θ, respectively, and where Ψ \ Θ stands for the
L-subgraph obtained by removing the component Θ from Ψ.
To prove this identity, we start by noting that, when appealing to Lemma 4.3(iii) to iteratively
prove (4.20), the map γ can be given an explicit interpretation: for all Ψ ∈ Γ(k, m), the coefficient
γ(Ψ) is the positive integer given by
γ(Ψ) := 2SE(Ψ) N (Ψ),
where SE(Ψ) is the number of straight edges in Ψ and where N (Ψ) is the number of ways to obtain
the graph Ψ by starting from k labeled vertices and by iteratively adding round or straight edges
between stable subgraphs. Conditioning on the connected component that the vertex with the first
label belongs to, the identity (4.21) immediately follows from this interpretation.
Step 3. Proof of (4.17).
Given k ≥ 1 and m ≥ 0, the result (4.21) of Step 2 implies
$$\sum_{\Psi\in\Gamma(k,m)}\gamma(\Psi)\,\Psi = \sum_{j=1}^{k}\binom{k-1}{j-1}\sum_{p=0}^{m}\binom{m}{p}\Big(\sum_{\substack{\Psi\in\Gamma(j,p)\\ \text{connected}}}\gamma(\Psi)\,\Psi\Big)\Big(\sum_{\Psi\in\Gamma(k-j,m-p)}\gamma(\Psi)\,\Psi\Big),$$
and thus, by (4.20),
$$[\Phi^k]_{(m)} = \sum_{j=1}^{k}\binom{k-1}{j-1}\sum_{p=0}^{m}\binom{m}{p}\Big(\sum_{\substack{\Psi\in\Gamma(j,p)\\ \text{connected}}}\gamma(\Psi)\,\Psi\Big)\,[\Phi^{k-j}]_{(m-p)}.$$
Taking the time integral and appealing to Lemma 4.3(iv), this leads us to
$$\int_{\triangle^m_t\times 0}[\Phi^k]_{(m)}(\mu) = \sum_{j=1}^{k}\binom{k-1}{j-1}\sum_{p=0}^{m}\sum_{\substack{\Psi\in\Gamma(j,p)\\ \text{connected}}}\gamma(\Psi)\Big(\int_{\triangle^p_t\times 0}\Psi(\mu)\Big)\Big(\int_{\triangle^{m-p}_t\times 0}[\Phi^{k-j}]_{(m-p)}(\mu)\Big).$$

Now recalling (4.19), and defining
$$L^k_{N,n}(t,\mu^N_0) := \sum_{m=0}^{n}\frac{1}{(2N)^m}\sum_{\substack{\Psi\in\Gamma(k,m)\\ \text{connected}}}\gamma(\Psi)\int_{\triangle^m_t\times 0}\Psi(\mu^N_0),$$
we deduce
$$\mathbb{E}_B\big[\Phi(\mu^N_t)^k\big] = \sum_{j=1}^{k}\binom{k-1}{j-1}\,L^j_{N,n}(t,\mu^N_0)\,\mathbb{E}_B\big[\Phi(\mu^N_t)^{k-j}\big] + \frac{S^k_{N,n}(t)}{(2N)^{n+1}}, \tag{4.22}$$
with remainder S^k_{N,n}(t) defined in (4.18). Finally, we recall the recurrence relation of Lemma 2.5 between moments and cumulants: for all k ≥ 1, we have
$$\mathbb{E}_B\big[\Phi(\mu^N_t)^k\big] = \sum_{j=1}^{k}\binom{k-1}{j-1}\,\kappa^j_B\big[\Phi(\mu^N_t)\big]\,\mathbb{E}_B\big[\Phi(\mu^N_t)^{k-j}\big].$$

Comparing this with the identity (4.22) above, the conclusion follows by a direct induction. 
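For the reader's convenience, the moment–cumulant recurrence just recalled is easy to test numerically. The sketch below is illustrative only (the Poisson law and all parameters are arbitrary choices, not from the text): it inverts the recurrence to recover cumulants from empirical moments, and a Poisson variable is convenient because all its cumulants equal its parameter.

```python
import numpy as np
from math import comb

def cumulants_from_moments(moments):
    # invert E[X^k] = sum_{j=1}^k C(k-1, j-1) * kappa_j * E[X^{k-j}], with moments[0] = 1
    kappa = [0.0] * len(moments)
    for k in range(1, len(moments)):
        s = sum(comb(k - 1, j - 1) * kappa[j] * moments[k - j] for j in range(1, k))
        kappa[k] = moments[k] - s
    return kappa[1:]

lam, K = 1.7, 6
X = np.random.default_rng(1).poisson(lam, size=2_000_000).astype(float)
moments = [np.mean(X ** k) for k in range(K + 1)]
print(cumulants_from_moments(moments))   # approximately [lam, lam, ..., lam]
```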
4.4. Error estimates. We turn to the uniform-in-time estimation of error terms in expansions such
as (4.15) or (4.17). We start with the following lemma describing Lions derivatives of the solution of
the mean-field McKean–Vlasov equation (1.21).

Lemma 4.6. Let m(·, µ) : R_+ × X → R_+ denote the solution operator (1.22) for the mean-field McKean–Vlasov equation (1.21). For all t ≥ 0 and φ ∈ C_c^∞(X), the functional µ ↦ ∫_X φ m(t, µ) is smooth and its linear functional derivatives can be represented as follows: for all k ≥ 1, µ ∈ P(X), and y_1, …, y_k ∈ X,
$$\frac{\delta^k}{\delta\mu^k}\Big(\mu\mapsto\int_X\phi\,m(t,\mu)\Big)(\mu,y_1,\dots,y_k) = \int_X\phi\,m^{(k)}(t,\mu,y_1,\dots,y_k), \tag{4.23}$$

where m(k) (·, µ, y1 , . . . , yk ) : R+ × X → R is a distributional solution of the linear Cauchy problem

∂t m(k) (t, µ, y1 , . . . , yk ) − Lm(t,µ) m(k) (t, µ, y1 , . . . , yk ) = Fk (t, µ, y1 , . . . , yk ),



(4.24)
m(k) (t, µ, y1 , . . . , yk )|t=0 = (−1)k−1 (δyk − µ),

where for all µ ∈ P(X) we recall that Lµ stands for the linearized McKean–Vlasov operator at µ,
cf. (3.1), and where the source term Fk is given by

$$F_k(t,\mu,y_1,\dots,y_k) := \sum_{j=1}^{k-1}\sum_{\substack{I_0\cup I_1=\llbracket k\rrbracket\\ \text{disjoint}}}\mathbb 1_{\sharp I_0=j,\,\sharp I_1=k-j}\;\mathrm{div}\Big(m^{(j)}(t,\mu,y_{I_0})\int_X\frac{\delta b}{\delta\mu}\big(\cdot,m(t,\mu),z\big)\,m^{(k-j)}\big((t,z),\mu,y_{I_1}\big)\,dz\Big), \tag{4.25}$$
where for I = {i1 , . . . , ir } with 1 ≤ i1 < . . . < ir ≤ k we set yI := (yi1 , . . . , yir ). Note that for k = 1
we have F1 ≡ 0. In addition, given κ0 , λ0 as in Theorem 3.1, we have the following uniform-in-time
estimates: given κ ∈ [0, κ0 ], 1 < q ≤ 2, and 0 < p ≤ 1, further assuming in the Langevin setting that
pq ′ ≫β,a 1 is large enough (only depending on d, β, a), we have for all λ ∈ [0, λ0 ), k ≥ 1, α1 , . . . , αk ≥ 0,
y_1, …, y_k ∈ X, t ≥ 0, and ℓ_0 > (1/q′) dim X,
$$\big\|\nabla^{\alpha_1}_{y_1}\cdots\nabla^{\alpha_k}_{y_k}m^{(k)}(t,\mu,y_1,\dots,y_k)\big\|_{W^{-(\ell_0+k-1+\max_j\alpha_j),q}(\langle z\rangle^p)} \;\lesssim\; e^{-p\lambda t}\,\langle y_1\rangle^p\cdots\langle y_k\rangle^p, \tag{4.26}$$

where the multiplicative constant only depends on d, W, β, λ, k, ℓ0 , p, q, a, and maxj αj .

Proof. By successively taking linear derivatives in the McKean–Vlasov equation (1.21), the represen-
tation (4.23)–(4.24) in terms of linearized equations is straightforward with source term given by

k X
X k−1 X X
Fk (t, µ, y1 , . . . , yk ) := 1∀0≤r≤l:♯Ir =ar
l=1 a0 =0 a1 +...+al =k−a0 I0 ∪...∪Il =JkK
1≤a1 ≤...≤al <k disjoint

δl b
 ˆ
(j)
× div m (t, µ, yI0 ) (·, m(t, µ), z1 , . . . , zl )
Xl δµl

(a1 ) (al )
×m ((t, z1 ), µ, yI1 ) . . . m ((t, zl ), µ, yIl ) dz1 . . . dzl ,

where we recall the notation yI = (yi1 , . . . , yir ) for I = {i1 , . . . , ir }. The pairwise structure of interac-
tions actually yields various simplifications: noting that
δl (W ∗ µ)
(·, µ, z1 , . . . , zl ) = (−1)l W ∗ (δzl − µ),
δµl
and noting that ∫_X m^{(k)}(t, µ, y_1, …, y_k) = 0 for all k ≥ 1, the above expression for F_k reduces precisely to (4.25). We emphasize that this simplification is not essential, but it slightly simplifies the computations. It remains to deduce the uniform-in-time estimate (4.26): we argue by induction and split the proof into two steps. Let p, q be fixed as in the statement.
Step 1. Preliminary: we prove the following properties of the spaces W^{−k,q}(⟨z⟩^p) and their duals W^{k,q′}(X):
— for all ℓ ≥ 0 and h ∈ C_c^∞(X),
$$\|\nabla h\|_{W^{\ell,q'}(X)} \le \|h\|_{W^{\ell+1,q'}(X)}; \tag{4.27}$$
$$\|\nabla h\|_{W^{-\ell,q}(\langle z\rangle^p)} \lesssim_{W,\beta,\ell,a} \|h\|_{W^{1-\ell,q}(\langle z\rangle^p)}; \tag{4.28}$$
— for all ℓ > (1/q′) dim X and y ∈ X we have
$$\|\delta_y\|_{W^{-\ell,q}(\langle z\rangle^p)} \lesssim_{\ell,q} \langle y\rangle^p. \tag{4.29}$$
The claim (4.27) is a direct consequence of the definition of W^{ℓ,q′}(X). The claim (4.28) is a direct consequence of (4.27) by definition of dual norms. We turn to the proof of (4.29). By the Sobolev embedding, we have for all y ∈ X and h ∈ C_c^∞(X),
$$\Big|\int_X h\,\delta_y\,\langle z\rangle^p\Big| = \langle y\rangle^p|h(y)| \le \langle y\rangle^p\|h\|_{L^\infty(X)} \lesssim_{\ell,q} \langle y\rangle^p\|h\|_{W^{\ell,q'}(X)},$$
provided that ℓ > (1/q′) dim X. By definition of dual norms, the claim (4.29) follows.
Step 2. Conclusion.
Let ℓ_0 > (1/q′) dim X. Applying ∇^{α_1}_{y_1}⋯∇^{α_k}_{y_k} to both sides of equation (4.24), and then appealing to Theorem 3.1(ii), we obtain for all ℓ ≥ 0, λ ∈ [0, λ_0), and t ≥ 0,
$$e^{p\lambda t}\big\|\nabla^{\alpha_1}_{y_1}\cdots\nabla^{\alpha_k}_{y_k}m^{(k)}(t,\mu,y_1,\dots,y_k)\big\|_{W^{-\ell,q}(\langle z\rangle^p)} \lesssim_{W,\beta,\lambda,\ell,p,q,a} \big\|\nabla^{\alpha_1}_{y_1}\cdots\nabla^{\alpha_k}_{y_k}(\delta_{y_k}-\mu)\big\|_{W^{-\ell,q}(\langle z\rangle^p)} + \int_0^t e^{p\lambda s}\big\|\nabla^{\alpha_1}_{y_1}\cdots\nabla^{\alpha_k}_{y_k}F_k(s,\mu,y_1,\dots,y_k)\big\|_{W^{-\ell,q}(\langle z\rangle^p)}\,ds.$$
Note that the first right-hand side term is equal to 1_{α_1=…=α_{k−1}=0} ‖∇^{α_k}δ_{y_k}‖_{W^{−ℓ,q}(⟨z⟩^p)}. Using (4.28) and (4.29) to bound this term, we then get for all ℓ ≥ ℓ_0 + α_k,
$$e^{p\lambda t}\big\|\nabla^{\alpha_1}_{y_1}\cdots\nabla^{\alpha_k}_{y_k}m^{(k)}(t,\mu,y_1,\dots,y_k)\big\|_{W^{-\ell,q}(\langle z\rangle^p)} \lesssim_{W,\beta,\lambda,\ell,\ell_0,p,q,a} \langle y_k\rangle^p + \int_0^t e^{p\lambda s}\big\|\nabla^{\alpha_1}_{y_1}\cdots\nabla^{\alpha_k}_{y_k}F_k(s,\mu,y_1,\dots,y_k)\big\|_{W^{-\ell,q}(\langle z\rangle^p)}\,ds.$$
To shorten notation, let us introduce the following norms: given k ≥ 1, we define for all ℓ, n ≥ 0 and H : X × X^k → R,
$$\|H\|_{\ell,n} := \sup_{0\le\alpha_1,\dots,\alpha_k\le n}\;\sup_{y_1,\dots,y_k\in X}\;\langle y_1\rangle^{-p}\cdots\langle y_k\rangle^{-p}\,\big\|\nabla^{\alpha_1}_{y_1}\cdots\nabla^{\alpha_k}_{y_k}H(\cdot,y_1,\dots,y_k)\big\|_{W^{-\ell,q}(\langle z\rangle^p)}.$$
In these terms, the above reads as follows, for all ℓ ≥ ℓ_0 + n,
$$e^{p\lambda t}\,\|m^{(k)}(t,\mu)\|_{\ell,n} \lesssim_{W,\beta,\lambda,\ell,\ell_0,p,q,a} 1 + \int_0^t e^{p\lambda s}\,\|F_k(s,\mu)\|_{\ell,n}\,ds. \tag{4.30}$$
We turn to the estimation of the source term F_k as defined in (4.25). Recalling the choice of b, cf. (1.23) and (1.24), and using again (4.27), we easily find for all ℓ, ℓ′, n ≥ 0,
$$\|F_k(t,\mu)\|_{\ell+1,n} \lesssim_{W,\beta,\ell,\ell',k,a} \max_{1\le j\le k-1}\|m^{(j)}(t,\mu)\|_{\ell,n}\,\|m^{(k-j)}(t,\mu)\|_{\ell',n}.$$
Inserting this into (4.30), and recalling that F_1 = 0, we deduce by induction, for all n ≥ 0, ℓ ≥ ℓ_0 + n + k − 1, λ ∈ [0, λ_0), and t ≥ 0,
$$e^{p\lambda t}\,\|m^{(k)}(t,\mu)\|_{\ell,n} \lesssim_{W,\beta,\lambda,\ell,\ell_0,k,p,q,a} 1,$$
which concludes the proof of (4.26). 


In order to compensate for the polynomial growth in (4.26), we shall appeal to the following uniform-
in-time moment estimates both for the particle dynamics and for the mean-field dynamics.
Lemma 4.7 (Uniform moment estimates). For all t ≥ 0, N, k ≥ 1, and µ ∈ P(X), we have
$$\mathbb{E}_B\Big[\int_X|z|^k\,\mu^N_t(dz)\Big] \le (Ck)^k\int_X\langle e^{-t/C}z\rangle^k\,\mu^N_0(dz), \tag{4.31}$$
$$\int_X|z|^k\,m(t,\mu) \le (Ck)^k\int_X\langle e^{-t/C}z\rangle^k\,\mu(dz), \tag{4.32}$$
and for all 0 ≤ θ ≤ 1/C,
$$\mathbb{E}_B\Big[\int_X e^{\theta|z|^2}\,\mu^N_t(dz)\Big] \le C\int_X e^{C\theta e^{-t/C}|z|^2}\,\mu^N_0(dz), \tag{4.33}$$
$$\int_X e^{\theta|z|^2}\,m(t,\mu) \le C\int_X e^{C\theta e^{-t/C}|z|^2}\,\mu(dz), \tag{4.34}$$
for some constant C < ∞ only depending on d, W, β, a.
Proof. We focus on the Langevin setting for shortness. We split the proof into two steps, separately
proving (4.31) and (4.33), while the proof of (4.32) and (4.34) for the mean-field dynamics is identical
and is skipped.
Step 1. Proof of (4.31).
In the spirit of [9], we consider the random process
$$G^N_t := a|X^{1,N}_t|^2 + |V^{1,N}_t|^2 + \eta\,X^{1,N}_t\cdot V^{1,N}_t,$$
for some η > 0 to be suitably chosen. By Itô's formula, the particle dynamics (1.1) yields
1,N 2
dGN
t = −aη|Xt | dt − (β − η)|Vt1,N |2 dt − ηβ 1,N
2 Xt · Vt1,N dt
 N 
2Vt1,N ηXt1,N ∇W (Xt1,N − Xtj,N ) dt + 2Vt1,N + ηXt1,N · dBt1 . (4.35)
X
κ
 
− + · N
j=1

From this equation and Itô’s formula, we then find for all k ≥ 1,
h  i
1,N 2 β β 2  1,N 2
∂t EB [(GN
t )k
] ≤ −kE B (GN k−1 1
t ) 4 aη|X t | + 2 − η(1 + 8a ) |Vt |
η 2 2
 N k−1
+ k ( a + β )k∇W kL∞ (Rd ) EB [(Gt ) ]
+ 21 k(k − 1)EB [(GN
t )
k−2
|2Vt1,N + ηXt1,N |2 ].
Provided that 0 < η ≪_{β,a} 1 is small enough (only depending on β, a), there exist λ, C > 0 (only depending on d, W, β, a) such that we get for all k ≥ 1,
$$\partial_t\mathbb{E}_B\big[(G^N_t)^k\big] \le -\lambda k\,\mathbb{E}_B\big[(G^N_t)^k\big] + Ck^2\,\mathbb{E}_B\big[(G^N_t)^{k-1}\big],$$
and thus, by Grönwall's inequality,
$$\mathbb{E}_B\big[(G^N_t)^k\big] \le e^{-\lambda kt}\,\mathbb{E}_B\big[(G^N_0)^k\big] + Ck^2\int_0^t e^{-\lambda k(t-s)}\,\mathbb{E}_B\big[(G^N_s)^{k-1}\big]\,ds.$$
A direct induction then yields for all k ≥ 1,
$$\mathbb{E}_B\big[(G^N_t)^k\big] \le \sum_{j=0}^{k}\frac{(k!)^2}{(k-j)!}\,e^{-\lambda jt}\,\big(\tfrac{C}{\lambda}\big)^{k-j}\,\mathbb{E}_B\big[(G^N_0)^j\big],$$

and the conclusion follows.
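As a side remark, the combinatorial coefficient (k!)²/(k−j)! appearing in the induction can be checked against the recursion it has to dominate. Under the reading of the argument above, inserting the induction hypothesis into the Grönwall bound produces a coefficient k² c_{k−1,j}/(k−j) at step k, and the following short snippet (a hypothetical verification, not part of the original argument) confirms the comparison with exact integer arithmetic.

```python
from math import factorial

# c(k, j) = (k!)^2 / (k - j)!  (an integer).  Closing the induction step, under the
# assumed reading above, requires c(k, j) >= k^2 * c(k-1, j) / (k - j).
c = lambda k, j: factorial(k) ** 2 // factorial(k - j)
print(all(c(k, j) * (k - j) >= k ** 2 * c(k - 1, j)
          for k in range(2, 15) for j in range(k)))   # True (in fact an equality)
```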


Step 2. Proof of (4.33).


From (4.35) and Itô’s formula, arguing similarly as in Step 1, provided that 0 < η ≪β,a 1 is small
enough (only depending on β, a), there exist λ, C > 0 (only depending on d, W, β, a) such that we have
for all θ > 0,
$$\partial_t\mathbb{E}\big[e^{\theta G^N_t}\big] \le -2\lambda\theta\,\mathbb{E}\big[G^N_t e^{\theta G^N_t}\big] + C\theta\,\mathbb{E}\big[e^{\theta G^N_t}\big] + C\theta^2\,\mathbb{E}\big[G^N_t e^{\theta G^N_t}\big].$$
Hence, for θ ≤ θ̃ := λ/(2C),
$$\partial_t\mathbb{E}\big[e^{\theta G^N_t}\big] \le -\lambda\theta\,\mathbb{E}\big[G^N_t e^{\theta G^N_t}\big] + C\theta\,\mathbb{E}\big[e^{\theta G^N_t}\big].$$
This amounts to the following differential inequality for the Laplace transform F(t, θ) := E[e^{θG^N_t}]: for all t ≥ 0 and 0 ≤ θ ≤ θ̃,
$$\partial_t F + \lambda\theta\,\partial_\theta F \le C\theta F,$$
which can be rewritten as follows, for all t ≥ 0 and 0 ≤ e^{λt}θ ≤ θ̃,
$$\partial_t\big[F(t, e^{\lambda t}\theta)\big] \le Ce^{\lambda t}\theta\,F(t, e^{\lambda t}\theta).$$
By integration, this yields F (t, θ) ≤ eθC/λ F (0, e−λt θ) for all t ≥ 0 and 0 ≤ θ ≤ θ̃, that is, (4.33). 
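To illustrate Lemma 4.7 concretely, one can simulate a small instance of the Langevin system (1.1) and monitor an empirical moment over time. The Euler–Maruyama sketch below is purely illustrative: the tanh kernel stands in for a generic smooth bounded ∇W, and all numerical parameters are arbitrary choices, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N = 1, 500
kappa, beta, a = 0.5, 1.0, 1.0          # hypothetical parameters, with A(x) = a|x|^2/2
dt, T = 1e-2, 10.0
grad_W = lambda x: np.tanh(x)           # stand-in for a smooth bounded interaction force
X = rng.normal(0.0, 2.0, size=(N, d))   # deliberately spread-out initial data
V = rng.normal(0.0, 2.0, size=(N, d))

for _ in range(int(T / dt)):
    # mean-field force -(kappa/N) sum_j grad_W(X_i - X_j), friction -(beta/2) V, confinement -a X
    F = -(kappa / N) * grad_W(X[:, None, :] - X[None, :, :]).sum(axis=1)
    V += (F - 0.5 * beta * V - a * X) * dt + np.sqrt(dt) * rng.standard_normal((N, d))
    X += V * dt

# second empirical moment of the phase-space variable z = (x, v): remains of order one in time
print(np.mean(np.sum(X ** 2 + V ** 2, axis=1)))
```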

With the above estimates at hand, we may now turn to the estimation of the Lions derivative of
smooth functionals along the particle dynamics. For that purpose, we define the following hierarchy of
norms: for any smooth functional Φ : P(X) → R, we define for all µ ∈ P(X), k ≥ 1, n ≥ 0, and p ≥ 0,
$$|||\,\Phi\,|||_{k,n,p,\mu} := \max_{1\le j\le k}\;\max_{0\le\alpha_1,\dots,\alpha_j\le n}\;\sup_{y_1,\dots,y_j\in X}\;\langle y_1\rangle^{-p}\cdots\langle y_j\rangle^{-p}\,\Big|\nabla^{\alpha_1}_{y_1}\cdots\nabla^{\alpha_j}_{y_j}\frac{\delta^j\Phi}{\delta\mu^j}\big(\mu,y_1,\dots,y_j\big)\Big|.$$
The following result can be iterated to estimate arbitrary Lions graphs.
Lemma 4.8. Let κ0 , λ0 be as in Theorem 3.1 and let κ ∈ [0, κ0 ]. Given m ≥ 0 and a smooth functional
Ψ : △m × P(X) 7→ R, we have for all λ ∈ [0, λ0 ), k ≥ 1, n ≥ 0, 0 < p ≤ 1, (t, τ, s) ∈ △m+1 , µ ∈ P(X),
and ℓ0 > 0,
$$|||\,U^{(1)}_\Psi((t,\tau,s),\cdot)\,|||_{k,n,p,\mu} \lesssim_{W,\beta,\lambda,\ell_0,k,p,n,a} e^{-p\lambda(\tau_m-s)}\,|||\,\Psi((t,\tau),\cdot)\,|||_{k,\,\ell_0+n+k-1,\,\frac p2,\,m(\tau_m-s,\mu)},$$

and in addition,
ˆ 
Ψ ((t, τ, s), ·) .W,β,λ,ℓ0,k,p,n,a e−pλ(τm −s) hzip µ(dz) ||| Ψ((t, τ ), ·) ||| k+2,ℓ0+n+k, p ,m(τm −s,µ) .
k,n,p,µ X 3


Given m, m′ ≥ 0, smooth functionals Ψ : △^{m+1} × P(X) → R and Θ : △^{m′+1} × P(X) → R, and given a partition {i_1, …, i_m} ∪ {j_1, …, j_{m′}} = ⟦m + m′⟧ with i_1 < … < i_m and j_1 < … < j_{m′}, we further have for all λ ∈ [0, λ_0), k ≥ 1, n ≥ 0, 0 < p ≤ 1, (t, τ, s, s′) ∈ △^{m+m′+2}, µ ∈ P(X), and ℓ_0 > 0,

Ψhi1 ,...,im ,m+m′ +1i Θhj1 ,...,jm′ ,m+m′ +1i ((t, τ, s, s′ ), ·)


hm+m′ +2i
k,n,p,µ
ˆ 
−pλ(s−s′ )
hzip µ(dz) ||| Ψ (t, τi1 , . . . , τim , s), · |||k+1,ℓ0 +n+k, p ,m(s−s′ ,µ)

.W,β,λ,ℓ0,k,p,n,a e
3
X

× ||| Θ (t, τj1 , . . . , τjm′ , s), · |||k+1,ℓ0+n+k, p ,m(s−s′ ,µ) .
3

Proof. Given m ≥ 0 and a smooth functional Ψ : △^m × P(X) → R, recalling that U^{(1)}_Ψ is defined in Definition 4.1, we can compute
$$\frac{\delta U^{(1)}_\Psi}{\delta\mu}\big((t,\tau,s),\mu,y\big) = \int_X\frac{\delta\Psi}{\delta\mu}\big((t,\tau),m(\tau_m-s,\mu),\cdot\big)\,m^{(1)}(\tau_m-s,\mu,y), \tag{4.36}$$
with m(1) as defined in Lemma 4.6. Recalling the definition of dual norms and using Lemma 4.6, for
any 1 < q ≤ 2 with pq ′ ≫β,a 1 large enough and with ℓ0 q ′ > dim X, we deduce for all λ ∈ [0, λ0 ),
(t, τ, s) ∈ △m+1 , µ ∈ P(X), and y ∈ X,
$$\Big|\frac{\delta U^{(1)}_\Psi}{\delta\mu}\big((t,\tau,s),\mu,y\big)\Big| \le \big\|m^{(1)}(\tau_m-s,\mu,y)\big\|_{W^{-\ell_0,q}(\langle\cdot\rangle^p)}\,\Big\|\langle\cdot\rangle^{-p}\frac{\delta\Psi}{\delta\mu}\big((t,\tau),m(\tau_m-s,\mu),\cdot\big)\Big\|_{W^{\ell_0,q'}(X)} \lesssim_{W,\beta,\lambda,\ell_0,p,a} \langle y\rangle^p\,e^{-p\lambda(\tau_m-s)}\,|||\,\Psi\,|||_{1,\ell_0,\frac p2,m(\tau_m-s,\mu)},$$

where in the last estimate we further used pq ′ > 2 dim X. By induction, on top of (4.36), we find for
all k ≥ 1 and y1 , . . . , yk ∈ X,
$$\frac{\delta^k U^{(1)}_\Psi}{\delta\mu^k}\big((t,\tau,s),\mu,y_1,\dots,y_k\big) = \sum_{l=1}^{k}\sum_{\substack{a_1+\dots+a_l=k\\ 1\le a_1\le\dots\le a_l\le k}}\sum_{\substack{I_1\cup\dots\cup I_l=\llbracket k\rrbracket\\ \text{disjoint}}}\mathbb 1_{\forall 1\le r\le l:\,\sharp I_r=a_r}\int_{X^l}\frac{\delta^l\Psi}{\delta\mu^l}\big((t,\tau),m(\tau_m-s,\mu),z_1,\dots,z_l\big)\,m^{(a_1)}\big((\tau_m-s,z_1),\mu,y_{I_1}\big)\cdots m^{(a_l)}\big((\tau_m-s,z_l),\mu,y_{I_l}\big)\,dz_1\dots dz_l, \tag{4.37}$$


from which we then get the following conclusion, using Lemma 4.6,
$$|||\,U^{(1)}_\Psi((t,\tau,s),\cdot)\,|||_{k,n,p,\mu} \lesssim_{W,\beta,\lambda,\ell_0,k,q,p,n,a} e^{-p\lambda(\tau_m-s)}\,|||\,\Psi((t,\tau),\cdot)\,|||_{k,\,\ell_0+n+k-1,\,\frac p2,\,m(\tau_m-s,\mu)}.$$

We turn to the estimation of the round edge. By definition (4.3), we can write
 (1) 
Ψ (t, τ, s), µ = UΨ′ (t, τ, s), µ ,
in terms of ˆ h i

tr a0 ∂µ2 Ψ (t, τ ), µ (z, z) µ(dz).
 
Ψ (t, τ ), µ :=
X
By a similar induction as the one performed to get (4.37), we find for all k ≥ 1 and y1 , . . . , yk ∈ X,
(1) k j
δ k U Ψ′  XX X X
(t, τ, s), µ, y 1 , . . . , y k = 1∀1≤r≤l:♯Ir =ar
δµk
j=0 l=1 a1 +...+al =j I1 ∪...∪Il ⊂JkK
1≤a1 ≤...≤al ≤j disjoint

δl
ˆ h
2
i (a1 )

× tr a 0 ∂µ Ψ (t, τ ), m(τ m − s, µ), z, z (z1 , . . . , zl ) m (τ m − s, z1 ), µ, y I 1
Xl+1 δµl
. . . m(al ) (τm − s, zl ), µ, yIl m(k−j) (τm − s, z), µ, yJkK\(I1 ∪...∪Il ) dz1 . . . dzl dz.
 

For j < k, the terms can be estimated as before using Lemma 4.6. For j = k, we use the following: for
any bounded function ϕ, by Lemma 4.7, we have
ˆ  ˆ
ϕ(z, z) m (τm − s, z), µ dz .W,β,a sup hzi−p ϕ(z, z) hzip µ(dz),

X z∈X X

and the conclusion then follows. The argument for the straight edge is similar and we skip the detail
for shortness. 

The above result can be iterated to estimate arbitrary Lions graphs. Combining it with the dia-
grammatic representation of moments and cumulants in Proposition 4.5, we obtain the following.
Corollary 4.9 (Truncated Lions expansions). Let κ_0 be as in Theorem 3.1, let κ ∈ [0, κ_0], and further assume that the initial law µ_◦ ∈ P(X) satisfies ∫_X |z|^{p_0} µ_◦(dz) < ∞ for some p_0 > 0. Given a smooth
functional Φ : P(X) → R, for all k ≥ 1, we can expand as follows the kth Brownian moment and the
kth Brownian cumulant along the particle dynamics: for all n ≥ 0,
$$\mathbb{E}_\circ\Big|\mathbb{E}_B\big[\Phi(\mu^N_t)^k\big] - \sum_{m=0}^{n}\frac{1}{(2N)^m}\sum_{\Psi\in\Gamma(k,m)}\gamma(\Psi)\int_{\triangle^m_t\times 0}\Psi\Big| \;\lesssim\; N^{-n-1}, \tag{4.38}$$
$$\mathbb{E}_\circ\Big|\kappa^k_B\big[\Phi(\mu^N_t)\big] - \mathbb 1_{n\ge k-1}\sum_{m=k-1}^{n}\frac{1}{(2N)^m}\sum_{\Psi\in\Gamma^\circ(k,m)}\gamma(\Psi)\int_{\triangle^m_t\times 0}\Psi\Big| \;\lesssim\; N^{-n-1}, \tag{4.39}$$

where we recall that Γ(k, m) stands for the set of all (unlabeled) irreducible L-graphs with k vertices
and m edges and that Γ◦ (k, m) stands for the subset of all connected (unlabeled) irreducible L-graphs
with k vertices and m edges, where γ is some map Γ(k, m) → N, and where multiplicative constants only depend on d, W, β, ℓ_0, k, n, a, on ∫_X |z|^{p_0} µ_◦(dz), and on
$$\sup_{\mu\in\mathcal P(X)}|||\,\Phi\,|||_{2(n+1),\,(n+1)(\ell_0+n+2),\,3^{-n-3}p_0,\,\mu},$$

for any ℓ0 > 0.


Proof. Let κ0 , λ0 be as in Theorem 3.1 and let κ ∈ [0, κ0 ]. By definition (4.3) of the round edge, and by
Lemma 4.7, given m ≥ 0 and a smooth functional Ψ : △m ×P(X) → R, we have for all (t, τ, s) ∈ △m+1 ,
0 < p ≤ 1, and µ ∈ P(X),
ˆ
Ψ ((t, τ, s), µ) .W,β,p,a ||| Ψ((t, τ ), ·) |||2,1, p ,m(τm −s,µ) hzip µ(dz).
2
X

Using this, we get in particular, for all m ≥ 0, (t, τ, s) ∈ △m+2 , and 0 < p ≤ 1,
ˆ
N
[Φk ] ((t, τ, s), µs ) .W,β,p,a [Φk ] ((t, τ ), ·) hzip µN
s (dz).
(m+1) (m) 2,1, p2 ,m(τm −s,µN X
s )

Now repeatedly applying Lemma 4.8 to control the right-hand side, and using Jensen’s inequality, we
get for all λ ∈ [0, λ0 ) and ℓ0 > 0,
−3 −m pλ(τ −τ
[Φk ] ((t, τ, s), µN
s ) .W,β,λ,ℓ0 ,m,p,a e
1 m+1 )
(m+1)
ˆ
−m )p
× [Φk ] ((t, τ1 ), ·) hzi(2−3 µN
s (dz).
2(m+1),mℓ0 +m(m+1)+1,3−m−1 p,m(τ1 −s,µN
s ) X
(1) (1)
Recalling [Φk ] = UΦk = (UΦ )k , using again Lemma 4.8, taking the expectation, and using the
moment bounds of Lemma 4.7 with E◦ [µN 0 ] = µ◦ , we get
 
−m
E [Φk ] ((t, τ, s), µN
s ) .W,β,λ,ℓ0,k,m,p,a e−3 pλ(t−τm+1 )
(m+1) ˆ
× ||| Φ |||k2(m+1),(m+1)(ℓ0 +m+2),3−m−2 p,m(t−s,µN
s )
hzi2p µ◦ (dz).
X
Inserting this into (4.15) and recalling the moment assumption for µ◦ , the conclusion (4.38) follows.
Noting that similar a priori bounds on any irreducible L-graph can be obtained iteratively from Lem-
mas 4.7 and 4.8, the proof of (4.39) follows similarly from (4.17). 

5. Refined propagation of chaos


This section is devoted to the proof of Theorem 1.1. Let κ0 , λ0 be as in Theorem 3.1 and let
κ ∈ [0, κ0 ] be fixed. For t ≥ 0 and φ ∈ Cc∞ (X), consider the random variables
$$X^N_t(\phi) := \int_X\phi\,\mu^N_t.$$
In the spirit of Lemma 2.6, we start by estimating cumulants of XtN (φ). By the law of total cumulance,
cf. Lemma 2.4, they can be decomposed as follows, for all m ≥ 2,
$$\kappa^m\big[X^N_t(\phi)\big] = \sum_{\pi\vdash\llbracket m\rrbracket}\kappa^{\sharp\pi}_\circ\Big[\big\{\kappa^{\sharp A}_B[X^N_t(\phi)]\big\}_{A\in\pi}\Big]. \tag{5.1}$$

We appeal to Corollary 4.9 with Φ(µ) := ∫_X φ µ to expand Brownian cumulants of X^N_t(φ) = Φ(µ^N_t): for k ≤ m, we find
$$\mathbb{E}_\circ\Big|\kappa^k_B\big[X^N_t(\phi)\big] - \mathbb 1_{k<m}\sum_{p=k-1}^{m-2}\frac{1}{(2N)^p}\sum_{\Psi\in\Gamma^\circ(k,p)}\gamma(\Psi)\int_{\triangle^p_t\times 0}\Psi\Big| \;\lesssim_{W,\beta,\phi,m,a,\mu_\circ}\; N^{1-m},$$

where we recall that Γ◦ (k, m) stands for the set of all connected irreducible L-graphs with k vertices and
m edges built´ from the reference base point Φ, and where the multiplicative constant only depends on
d, W, β, m, a, X |z|p0 µ◦ (dz), and on the W r,∞ (X) norm of φ for some r only depending on m. Inserting
this approximation into (5.1), we deduce
$$\Big|\kappa^m\big[X^N_t(\phi)\big] - \sum_{s=2}^{m}\sum_{\{A_1,\dots,A_s\}\vdash\llbracket m\rrbracket}\sum_{p_1=\sharp A_1-1}^{m-2}\cdots\sum_{p_s=\sharp A_s-1}^{m-2}\frac{1}{(2N)^{p_1+\dots+p_s}}\sum_{\Psi_1\in\Gamma^\circ(\sharp A_1,p_1)}\cdots\sum_{\Psi_s\in\Gamma^\circ(\sharp A_s,p_s)}\gamma(\Psi_1)\cdots\gamma(\Psi_s)\;\kappa^s_\circ\Big[\int_{\triangle^{p_1}_t\times 0}\Psi_1,\dots,\int_{\triangle^{p_s}_t\times 0}\Psi_s\Big]\Big| \;\lesssim_{W,\beta,\phi,m,a,\mu_\circ}\; N^{1-m}. \tag{5.2}$$

It remains to estimate the joint Glauber cumulants in this expression. For that purpose, we appeal
to the higher-order Poincaré inequality of Proposition 2.9: recalling that Glauber derivatives can be
bounded by linear derivatives, cf. (2.21), we get
$$\Big|\kappa^s_\circ\Big[\int_{\triangle^{p_1}_t\times 0}\Psi_1,\dots,\int_{\triangle^{p_s}_t\times 0}\Psi_s\Big]\Big| \lesssim_s N^{1-s}\sum_{k=0}^{s-2}\sum_{\substack{a_1,\dots,a_s\ge 1\\ \sum_j a_j=s+k}}\prod_{j=1}^{s}\Big\|\int_{([0,1]\times X\times X)^{a_j}}\frac{\delta^{a_j}}{\delta\mu^{a_j}}\Big(\int_{\triangle^{p_j}_t\times 0}\Psi_j\Big)\big(m^{N,s_1,\dots,s_{a_j}}_0,y_1,\dots,y_{a_j}\big)\prod_{l=1}^{a_j}(\delta_{Z^{l,N}_0}-\delta_{z_l})(dy_l)\,\mu_\circ(dz_l)\,ds_l\Big\|_{L^{\frac{s+k}{a_j}}(\Omega_\circ)},$$

where we have set for abbreviation


$$m^{N,s_1,\dots,s_{a_j}}_0 := \mu^N_0 + \sum_{l=1}^{a_j}\frac{1-s_l}{N}\big(\delta_{z_l}-\delta_{Z^{l,N}_0}\big).$$
Norms of linear derivatives of each Ψ_j ∈ Γ◦(♯A_j, p_j) can be estimated using Lemmas 4.7 and 4.8, together with the moment assumption on µ_◦. Inserting the result into (5.2), we conclude for all m ≥ 1,
$$\big|\kappa^m[X^N_t(\phi)]\big| \lesssim_{W,\beta,\phi,m,a,\mu_\circ} N^{1-m}.$$
We now appeal to Lemma 2.6 to turn this into an estimate on correlation functions: the above cumulant
estimate implies for all 1 ≤ m ≤ N ,
$$\Big|\int_{X^m}\phi^{\otimes m}\,G^{m,N}\Big| \lesssim_{W,\beta,\phi,m,a,\mu_\circ} N^{1-m} + \sum_{\substack{\pi\vdash\llbracket m\rrbracket\\ \sharp\pi<m}}\sum_{\rho\vdash\pi}N^{\sharp\pi-\sharp\rho-m+1}\int_{X^{\sharp\pi}}\Big(\bigotimes_{B\in\pi}\phi^{\sharp B}\Big)\Big(\bigotimes_{D\in\rho}G^{\sharp D,N}(z_D)\Big)\,dz_\pi,$$
and a direct induction argument then yields


$$\Big|\int_{X^m}\phi^{\otimes m}\,G^{m,N}\Big| \lesssim_{W,\beta,\phi,m,a,\mu_\circ} N^{1-m}.$$
As the multiplicative constant only depends on φ via its W r,∞ (X) norm for some r only depending
on m, the conclusion of Theorem 1.1 follows by duality. 
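The conclusion — cumulants of order m of the observable X^N_t(φ) are of size N^{1−m} — can be visualized in the simplest possible setting, namely i.i.d. particles at a fixed time, with no dynamics. The following sketch is illustrative only (the observable and the sampling law are arbitrary choices, not taken from the text): it estimates the rescaled variance and third cumulant of an empirical average and shows that they stabilize, up to Monte-Carlo error, as N grows.

```python
import numpy as np

rng = np.random.default_rng(1)
phi = lambda z: np.sin(z) + 0.3 * z ** 2          # arbitrary smooth observable
kappa3 = lambda x: np.mean((x - x.mean()) ** 3)   # third cumulant = third centered moment

for N in (25, 100, 400):
    Z = rng.normal(size=(50_000, N))              # 50_000 independent N-particle samples
    XN = phi(Z).mean(axis=1)                      # empirical average of phi over the N particles
    # kappa^2 ~ N^{-1} and kappa^3 ~ N^{-2}: the rescaled quantities below stabilize in N
    print(N, N * XN.var(), N ** 2 * kappa3(XN))
```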

6. Concentration estimates
This section is devoted to the proof of Theorem 1.2. Let κ0 , λ0 be as in Theorem 3.1, and let
κ ∈ [0, κ0 ] be fixed. For t ≥ 0 and φ ∈ Cc∞ (X), consider the centered random variables
$$Y^N_t(\phi) := X^N_t(\phi) - \mathbb{E}\big[X^N_t(\phi)\big] = \int_X\phi\,\mu^N_t - \mathbb{E}\Big[\int_X\phi\,\mu^N_t\Big].$$
We shall establish concentration by means of moment estimates. By the Lions expansion of Lemma 2.1,
we can decompose
$$Y^N_t(\phi) = \tilde Y^N_t(\phi) + M^N_t(\phi) + \frac{1}{2N}\big(E^N_t(\phi) - \mathbb{E}[E^N_t(\phi)]\big),$$
in terms of
$$\tilde Y^N_t(\phi) := \int_X\phi\,m(t,\mu^N_0) - \mathbb{E}\Big[\int_X\phi\,m(t,\mu^N_0)\Big],$$
$$M^N_t(\phi) := \frac1N\sum_{i=1}^N\int_0^t\partial_\mu U(t-s,\mu^N_s)(Z^{i,N}_s)\cdot\sigma_0\,dB^i_s,$$
$$E^N_t(\phi) := \int_0^t\int_X\mathrm{tr}\big[a_0\,\partial^2_\mu U(t-s,\mu^N_s)(z,z)\big]\,\mu^N_s(dz)\,ds,$$
where we use the short-hand notation
$$U(t-s,\mu) := \int_X\phi\,m(t-s,\mu).$$
For all k ≥ 1, we may then decompose the kth moment of YtN (φ) as
$$\mathbb{E}\big[|Y^N_t(\phi)|^k\big] \le 3^k\,\mathbb{E}_\circ\big[|\tilde Y^N_t(\phi)|^k\big] + 3^k\,\mathbb{E}\big[|M^N_t(\phi)|^k\big] + 3^kN^{-k}\,\mathbb{E}\big[|E^N_t(\phi)|^k\big]. \tag{6.1}$$
We separately analyze the three right-hand side terms and split the proof into four steps. In the sequel,
constants C are implicitly allowed to depend on W, β, a, on the compact support of µ◦ , as well as on
the further parameters ℓ0 , p, q used below.
Step 1. Proof that for all 1 ≤ k ≤ N, 1 < q ≤ 2, 0 < p ≤ 1 with pq′ ≫_{β,a} 1, and ℓ_0 > (1/q′) dim X,
$$\mathbb{E}_\circ\big[|\tilde Y^N_t(\phi)|^k\big] \le \Big(\frac{Ck}{N}\Big)^{\frac k2}\,\|\langle z\rangle^{-p}\phi\|^k_{W^{\ell_0,q'}(X)}. \tag{6.2}$$
Using (2.21) to estimate the Glauber derivative by means of a linear derivative, and using Lemma 4.6
to control the latter, we get for all λ ∈ [0, λ0 ), 1 < q ≤ 2, 0 < p ≤ 1 with pq ′ ≫β,a 1, and ℓ0 > q1′ dim X,
 ˆ 
|Dj◦ ỸtN (φ)| ≤ CN −1 e−pλt hZ0j,N ip + hzip µ◦ (dz) khzi−p φkW ℓ0 ,q′ (X) .
X
By the compact support assumption for µ◦ , this yields almost surely
|Dj◦ ỸtN (φ)| ≤ CN −1 khzi−p φkW ℓ0 ,q′ (X) .
We may then appeal to Proposition 2.11, to the effect of
n N o
E◦ |ỸtN (φ)|k ≤ inf k!λ−k E◦ eλỸt (φ)
  
λ>0
2λCN −1 khzi−p φk ℓ ,q′
n  o
≤ inf k!λ−k exp Cλkhzi−p φkW ℓ0 ,q′ (X) e W 0 (X) − 1 .
λ>0
Choosing λ = (kN )1/2 (2Ckhzi−p φkW ℓ0 ,q′ (X) )−1 , the claim (6.2) follows.

Step 2. Proof that for all k ≥ 1, 1 < q ≤ 2, 0 < p ≤ 1 with pq′ ≫_{β,a} 1, and ℓ_0 > (1/q′) dim X,
$$\mathbb{E}\big[|M^N_t(\phi)|^k\big] \le \Big(\frac{Ck}{N}\Big)^{\frac k2}\|\langle z\rangle^{-p}\phi\|^k_{W^{1+\ell_0,q'}(X)}\Big(1+\min\Big\{(Ck^{\frac p2})^k;\ \max_{0\le s\le t}\mathbb{E}\big[|Y^N_s(\langle z\rangle^{2p})|^{\frac k2}\big]\Big\}\Big). \tag{6.3}$$

We recall that (MtN (φ))t is a martingale with quadratic variation given by


N ˆ
N 1 X t i,N 2
hMt (φ)i = 2 ∂µ U (t − s, µN
s )(Zs ) ds.
N 0 i=1

Appealing to Lemma 4.6 to estimate the L-derivative, we get for all λ ∈ [0, λ0 ), 1 < q ≤ 2, 0 < p ≤ 1
with pq ′ ≫β,a 1, and ℓ0 > q1′ dim X,
ˆ t ˆ 
N −1 −p 2 −2pλ(t−s)
hMt (φ)i ≤ CN khzi φkW 1+ℓ0 ,q′ (X) e hzi2p µNs (dz) ds.
0 X

By the Burkholder–Davis–Gundy inequality, see e.g. [88, Theorem 1], we have


k k
E[|MtN (φ)|k ] ≤ (Ck) 2 E[hMtN (φ)i 2 ],
and thus
 Ck  k  ˆ k 
2 2
E[|MtN (φ)|k ] ≤ khzi−p φkkW 1+ℓ0 ,q′ (X) max E hzi2p µN
s (dz) .
N 0≤s≤t X
Subtracting from µ^N_s its expectation, recognizing the definition of Y^N_s, and appealing to Lemma 4.7 together with the compact support assumption for µ_◦ to control the expectation E[∫_X ⟨z⟩^{2p} µ^N_s(dz)], we get
 Ck  k  k

2
E[|MtN (φ)|k ] ≤ khzi−p φkkW 1+ℓ0 ,q′ (X) 1 + max E |YsN (hzi2p )| 2 .

N 0≤s≤t
By Jensen’s inequality and the compact support assumption for µ◦ , we note that the Gaussian bounds
of Lemma 4.7 imply
k p
E |YsN (hzi2p )| 2 ≤ (Ck 2 )k ,


and the claim (6.3) follows.


Step 3. Proof that for all k ≥ 1, 1 < q ≤ 2, 0 < p ≤ 1 with pq′ ≫_{β,a} 1, and ℓ_0 > (1/q′) dim X,
$$\mathbb{E}\big[|E^N_t(\phi)|^k\big] \le (Ck^p)^k\,\|\langle z\rangle^{-p}\phi\|^k_{W^{2+\ell_0,q'}(X)}. \tag{6.4}$$

Recalling the definition of EtN (φ), and appealing to Lemma 4.6 to estimate multiple L-derivatives, we
get for all λ ∈ [0, λ0 ), 1 < q ≤ 2, 0 < p ≤ 1 with pq ′ ≫β,a 1, and ℓ0 > q1′ dim X,
ˆ t ˆ 
|EtN (φ)| . khzi−p φkW 2+ℓ0 ,q′ (X) e−pλ(t−s) hzi2p µN
s (dz) ds.
0 X
Now appealing to the moment bounds of Lemma 4.7, together with Jensen’s inequality and with the
compact support assumption for µ◦ , the claim (6.4) follows.
Step 4. Conclusion.
Inserting the results of the first three steps into (6.1), we obtain for all 1 ≤ k ≤ N, 1 < q ≤ 2, 0 < p ≤ 1 with pq′ ≫_{β,a} 1, and ℓ_0 > (1/q′) dim X,
$$\mathbb{E}\big[|Y^N_t(\phi)|^k\big] \le \Big(\frac{Ck}{N}\Big)^{\frac k2}\|\langle z\rangle^{-p}\phi\|^k_{W^{2+\ell_0,q'}(X)}\Big(1+\min\Big\{(Ck^{\frac p2})^k;\ \max_{0\le s\le t}\mathbb{E}\big[|Y^N_s(\langle z\rangle^{2p})|^{\frac k2}\big]\Big\}\Big).$$
Further applying this same estimate with φ = ⟨z⟩^{2p} and with p replaced by 3p to control the last factor, we are led to the following: for all 1 ≤ k ≤ N and p, ℓ_0 > 0,
$$\mathbb{E}\big[|Y^N_t(\phi)|^k\big] \le \Big(\Big(\frac{Ck}{N}\Big)^{\frac k2} + \Big(\frac{Ck^{1+p}}{N}\Big)^{k}\Big)\,\|\phi\|^k_{W^{2+\ell_0,\infty}(X)}.$$
For r ≥ 0, this entails by Markov's inequality, for all 1 ≤ k ≤ N and p, ℓ_0 > 0,
$$\mathbb{P}\big[|Y^N_t(\phi)|\ge r\big] \le r^{-k}\,\mathbb{E}\big[|Y^N_t(\phi)|^k\big] \le \Big(\Big(\frac{Ck}{Nr^2}\Big)^{\frac k2} + \Big(\frac{Ck^{1+p}}{Nr}\Big)^{k}\Big)\,\|\phi\|^k_{W^{2+\ell_0,\infty}(X)}.$$
Choosing
$$k = Nr^2\big(eC\|\phi\|_{W^{2+\ell_0,\infty}(X)}\big)^{-1}, \qquad\text{for } 0\le r\le\big(eC\|\phi\|_{W^{2+\ell_0,\infty}(X)}\big)^{\frac12},$$
the conclusion follows. 
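The mechanism behind Theorem 1.2 — moment bounds turned into exponential concentration through Markov's inequality at an optimized order k — is the standard sub-Gaussian pattern. The toy sketch below is illustrative only (bounded i.i.d. summands instead of the interacting dynamics, with arbitrary parameters): it compares an empirical tail probability with the corresponding sub-Gaussian bound.

```python
import numpy as np

rng = np.random.default_rng(2)
N, R, r = 200, 200_000, 0.15
Y = rng.uniform(-1, 1, size=(R, N)).mean(axis=1)   # R copies of a centered empirical average
empirical_tail = np.mean(np.abs(Y) >= r)
hoeffding = 2 * np.exp(-N * r ** 2 / 2)            # sub-Gaussian bound for means of [-1, 1] variables
print(empirical_tail, hoeffding)                   # the empirical tail sits below the bound
```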

7. Quantitative central limit theorem


This section is devoted to the proof of Theorem 1.3. For t ≥ 0 and φ ∈ Cc∞ (Rd ), consider the
centered random variables
$$S^N_t(\phi) := \sqrt N\,Y^N_t(\phi) = \sqrt N\Big(\int_X\phi\,\mu^N_t - \mathbb{E}\Big[\int_X\phi\,\mu^N_t\Big]\Big).$$
We shall start by using a Lions expansion to split the contributions from initial data and from Brownian
forces in the fluctuations. From there, we separately analyze initial and Brownian fluctuations, using
tools from Glauber and Lions calculus, respectively.
7.1. Gaussian Dean–Kawasaki equation. We consider the Gaussian Dean–Kawasaki SPDE (1.19).
With our general notation, covering the Langevin and Brownian settings at the same time, this reads
as follows,
$$\begin{cases}\partial_t\nu_t = L_{\mu_t}\nu_t + \mathrm{div}\big(\sqrt{\mu_t}\,\sigma_0\,\xi_t\big), & t\ge 0,\\[1mm] \nu_t|_{t=0}=\nu_\circ,\end{cases} \tag{7.1}$$
where:
— Lµ is the linearized mean-field operator defined in (3.1);
— µt := m(t, µ◦ ) is the solution of the mean-field McKean–Vlasov equation (1.21);
— ν_◦ is the Gaussian field describing the fluctuations of the initial empirical measure, in the sense that √N ∫_X φ (µ^N_0 − µ_◦) converges in law to ∫_X φ ν_◦ for all φ ∈ C_c^∞(X); in other words, ν_◦ is the random tempered distribution on X characterized by having Gaussian law with
$$\mathrm{Var}\Big[\int_X\phi\,\nu_\circ\Big] = \int_X\Big(\phi-\int_X\phi\,\mu_\circ\Big)^2\mu_\circ,\qquad \mathbb{E}\Big[\int_X\phi\,\nu_\circ\Big]=0,\qquad\text{for all }\phi\in C^\infty_c(X); \tag{7.2}$$
— ξ = (ξt )t∈R is a Gaussian white noise on R × X and is taken independent of ν◦ .
As ν◦ and ξ are random tempered distributions on X and R×X, respectively, equation (7.1) is naturally
understood in the distributional sense almost surely, and its solution ν must itself be sought as a random
tempered distribution on R+ × X.
In order to solve equation (7.1), we start by introducing some notation. Given 1 < q ≤ 2, k ≥ 1,
and 0 < p ≤ 1 with pq ′ ≫β,a 1 large enough, for all h ∈ W −k,q (hzip ), we can define {Ut,s [h]}s≤t as the
unique weak solution in Cloc ([s, ∞); W −k,q (hzip )) of

$$\begin{cases}\partial_t U_{t,s}[h] = L_{\mu_t}U_{t,s}[h], & t\ge s,\\[1mm] U_{t,s}[h]|_{t=s}=h,\end{cases} \tag{7.3}$$
see Theorem 3.1(ii). We also consider the dual evolution {U^*_{t,s}}_{t≥s} on W^{−k,q}(⟨z⟩^p)^*, where for all t ≥ s we define U^*_{t,s} as the adjoint of U_{t,s},
$$\int_X U^*_{t,s}[g]\,h = \int_X U_{t,s}[h]\,g.$$
Recall that the condition pq′ > d ensures that the dual space W^{−k,q}(⟨z⟩^p)^* contains W^{k,∞}(X). For any g ∈ W^{−k,q}(⟨z⟩^p)^* and t ≥ 0, the dual flow s ↦ U^*_{t,s}[g] naturally belongs to C([0, t]; W^{−k,q}(⟨z⟩^p)^*), where W^{−k,q}(⟨z⟩^p)^* is endowed with the weak topology. Denoting by L^*_µ the adjoint of the linearized mean-field operator L_µ, we find that the dual flow satisfies the backward Cauchy problem
$$\begin{cases}\partial_s U^*_{t,s}[g] = -L^*_{\mu_s}U^*_{t,s}[g], & 0\le s\le t,\\[1mm] U^*_{t,s}[g]|_{s=t}=g.\end{cases} \tag{7.4}$$

In these terms, using the theory of Da Prato and Zabczyk [32], and more specifically its non-
autonomous extension by Seidler [81], we can check that the Gaussian Dean–Kawasaki equation (7.1)
admits a unique weak solution that is a random element in C(R+ ; S ′ (X)), and it can be expressed by
Duhamel’s principle
ˆ t
 √ 
νt := Ut,0 [ν◦ ] + Ut,s div µs σ0 ξs ds.
0
In particular, the solution is characterized by its covariance structure
$$\mathrm{Var}\Big[\int_X\phi\,\nu_t\Big] = \mathrm{Var}\Big[\int_X U^*_{t,0}[\phi]\,\nu_\circ\Big] + \mathrm{Var}\Big[\int_0^t\int_X\sqrt{\mu_s}\,\sigma_0^T\nabla U^*_{t,s}[\phi]\cdot\xi_s\,ds\Big] = \int_X\Big(U^*_{t,0}[\phi]-\int_X U^*_{t,0}[\phi]\,\mu_\circ\Big)^2\mu_\circ + \int_0^t\int_X\big|\sigma_0^T\nabla U^*_{t,s}[\phi]\big|^2\,\mu_s\,ds. \tag{7.5}$$

7.2. Splitting fluctuations. By means of a Lions expansion, we start by showing that fluctuations
can be split neatly into contributions from initial data and from Brownian forces.
Lemma 7.1. Let κ_0 be as in Theorem 3.1, let κ ∈ [0, κ_0], and assume that the initial law µ_◦ satisfies ∫_X |z|^{p_0} µ_◦(dz) < ∞ for some p_0 > 0. We have for all φ ∈ C_c^∞(X) and t ≥ 0,
$$\big\|S^N_t(\phi) - C^N_t(\phi) - D^N_t(\phi)\big\|_{L^2(\Omega)} \lesssim_{W,\beta,\phi,a,\mu_\circ} N^{-\frac12},$$
in terms of
$$C^N_t(\phi) := \sqrt N\Big(\int_X\phi\,m(t,\mu^N_0) - \mathbb{E}_\circ\Big[\int_X\phi\,m(t,\mu^N_0)\Big]\Big),\qquad D^N_t(\phi) := \frac{1}{\sqrt N}\sum_{i=1}^N\int_0^t\partial_\mu U(t-s,\mu^N_s)(Z^{i,N}_s)\cdot\sigma_0\,dB^i_s,$$
where we have set for abbreviation U(t, µ) := ∫_X φ m(t, µ).
Proof. Let κ_0, λ_0 be as in Theorem 3.1 and let κ ∈ [0, κ_0] be fixed. The starting point is the Lions expansion of Lemma 2.1,
$$\int_X\phi\,\mu^N_t = \int_X\phi\,m(t,\mu^N_0) + \frac1N\sum_{i=1}^N\int_0^t\partial_\mu U(t-s,\mu^N_s)(Z^{i,N}_s)\cdot\sigma_0\,dB^i_s + \frac{1}{2N}\int_0^t\int_X\mathrm{tr}\big[a_0\,\partial^2_\mu U(t-s,\mu^N_s)(z,z)\big]\,\mu^N_s(dz)\,ds.$$
Multiplying by √N, subtracting the expectation from both sides of the identity, taking the L²(Ω) norm, and recognizing the definitions of C^N_t(φ) and D^N_t(φ), we are led to
$$\big\|S^N_t(\phi) - C^N_t(\phi) - D^N_t(\phi)\big\|_{L^2(\Omega)} \le \frac{1}{2\sqrt N}\,\Big\|\int_0^t\int_X\mathrm{tr}\big[a_0\,\partial^2_\mu U(t-s,\mu^N_s)(z,z)\big]\,\mu^N_s(dz)\,ds\Big\|_{L^2(\Omega)}. \tag{7.6}$$
By Lemma 4.6, for all λ ∈ [0, λ0 ), 0 < p ≤ 1, and ℓ0 > 0, we get


ˆ tˆ h i
tr a0 ∂µ2 U (t − s, µN
s )(z, z) µN
s (dz) ds
0 X ˆ t ˆ 
.W,β,λ,ℓ0,p,a kφkW 2+ℓ0 ,∞ (X) e−pλ(t−s) hzi2p µN
s (dz) ds.
0 X
Appealing to Lemma 4.7 together with Jensen’s inequality and with the moment assumption for µ◦ ,
we deduce
ˆ tˆ h i
tr a0 ∂µ2 U (t − s, µN
s )(z, z) µN
s (dz) ds .W,β,ℓ0,p0 ,a kφkW 2+ℓ0 ,∞ (X) .
0 X L2 (Ω)

Combined with (7.6), this yields the conclusion. 


7.3. Initial fluctuations. We establish the following quantitative central limit theorem for initial
fluctuations CtN (φ). The proof is split into two separate parts: the asymptotic normality of CtN (φ)
and the convergence of its variance structure. For both parts, we exploit tools from Glauber calculus:
more precisely, the first part follows from Stein’s method in form of the so-called second-order Poincaré
inequality of Proposition 2.10, while for the second part our starting point is the Helffer–Sjöstrand
representation for the variance in Lemma 2.7(iii).

Lemma 7.2. Let κ_0, λ_0 be as in Theorem 3.1, let κ ∈ [0, κ_0], and assume that the initial law µ_◦ satisfies ∫_X |z|^{p_0} µ_◦(dz) < ∞ for some 0 < p_0 ≤ 1. The random variable C^N_t(φ) defined in Lemma 7.1 satisfies for all φ ∈ C_c^∞(X), t ≥ 0, and λ ∈ [0, ½ p_0 λ_0),
$$d_2\big(C^N_t(\phi)\,,\,\sigma^C_t(\phi,\mu_\circ)\,\mathcal N\big) \lesssim_{W,\beta,\lambda,\phi,a,\mu_\circ} N^{-\frac12}e^{-\lambda t}\Big(1+\big(\sigma^C_t(\phi,\mu_\circ)+(N^{-\frac13}e^{-\lambda t})^{\frac12}\big)^{-1}\Big),$$

where the limit variance is defined by


$$\sigma^C_t(\phi,\mu_\circ)^2 := \mathrm{Var}_\circ\big[(U^*_{t,0}[\phi])(Z^{1,N}_\circ)\big] = \int_X\Big(U^*_{t,0}[\phi]-\int_X U^*_{t,0}[\phi]\,\mu_\circ\Big)^2\mu_\circ, \tag{7.7}$$
where we recall that U^*_{t,0} is defined in (7.4), that d_2 is the second-order Zolotarev metric (1.18), and that N stands for a standard normal random variable.
Proof. Let κ0 , λ0 be as in Theorem 3.1 and let κ ∈ [0, κ0 ] be fixed. We split the proof into three steps.
Step 1. Asymptotic normality: proof that for all t ≥ 0 and λ ∈ [0, 21 p0 λ0 ),

CtN (φ) CtN (φ) CtN (φ)


     
d2 1 , N + dW 1 , N + dK 1 , N
Var◦ [CtN (φ)] 2 Var◦ [CtN (φ)] 2 Var◦ [CtN (φ)] 2
1
 1

.W,β,λ,φ,a,µ◦ N − 2 e−λt Var◦ [CtN (φ)]−1 1 + Var◦ [CtN (φ)]− 2 . (7.8)
Set for abbreviation
CtN (φ)
ĈtN (φ) := 1 .
Var◦ [CtN (φ)] 2
By Proposition 2.10, we can estimate
N
3 X 1
d2 ĈtN (φ), N dW ĈtN (φ), N dK ĈtN (φ), N Var◦ [CtN (φ)]− 2 E◦ |Dj◦ CtN (φ)|6 2
   
+ + .
j=1
N X
X N 1
1  1 2 2
+ Var◦ [CtN (φ)]−1 E◦ |Dl◦ CtN (φ)|4 E◦ |Dj◦ Dl◦ CtN (φ)|4
 
4 4
.
j=1 l=1

Recalling the definition of CtN (φ) in Lemma 7.1, using (2.20) and (2.21) to bound Glauber derivatives
by means of linear derivatives, appealing to Lemma 4.6 to estimate the latter, using the moment
assumption for µ◦ , and distinguishing between the cases l = j and l 6= j in the second right-hand side
term, we deduce for all λ ∈ [0, λ0 ) and ℓ0 > 0,

d2 ĈtN (φ), N + dW ĈtN (φ), N + dK ĈtN (φ), N


  

3 1 1
ˆ 1
.W,β,λ,ℓ0,p0,a Var◦ [CtN (φ)]− 2 N − 2 kφk3W ℓ0 ,∞ (X) e− 2 p0 λt
2
hzip0 µ◦ (dz)
X
1
ˆ 3
N −1 − 2 2 − 12 p0 λt 4
+ Var◦ [Ct (φ)] N kφkW 1+ℓ0 ,∞ (X) e hzip0 µ◦ (dz) ,
X
and the claim follows.
Step 2. Convergence of the variance: proof that for all t ≥ 0 and λ ∈ [0, 12 p0 λ0 ),

Var◦ [CtN (φ)] − σtC (φ, µ◦ )2 .W,β,λ,φ,a,µ◦ N −1 e−λt , (7.9)

where the limit variance is defined in (7.7).


By the definition of CtN (φ) in Lemma 7.1, we have
hˆ i
Var◦ [CtN (φ)] = N Var◦ φ m(t, µN
0 ) .
X
Appealing to the Helffer–Sjöstrand representation for the variance in terms of Glauber calculus,
cf. Lemma 2.7(iii), we get
N
X  ˆ   ˆ 
Var◦ [CtN (φ)] = N E◦ Dj◦ φ m(t, µN
0 ) L −1
◦ Dj

φ m(t, µ N
0 ) .
j=1 X X

By exchangeability, this is equivalently written as


 ˆ   ˆ 
N 2 ◦ N −1 ◦ N
Var◦ [Ct (φ)] = N E◦ D1 φ m(t, µ0 ) L◦ D1 φ m(t, µ0 ) .
X X

Denoting by E6= 1 1,N 6=1 1,N


◦ := E◦ [ · |Z◦ ] and Var◦ := Var◦ [ · |Z◦ ] the expectation and variance with respect
j,N 6=1 −1 ◦ 6
= 1
to {Z◦ }j:j6=1 , and noting that E◦ L◦ D1 = L−1 ◦
◦ E◦ D1 , we deduce from the triangle inequality
 h ˆ i  h ˆ i
Var◦ [CtN (φ)] − N 2 E◦ E6= ◦
1
D 1

φ m(t, µ N
0 ) L −1
◦ E 6=1
◦ D1

φ m(t, µ N
0 )
X X
 h ˆ i
2 6=1 ◦ N
≤ N E◦,1 Var◦ D1 φ m(t, µ0 ) .
X

Since we have L◦ X = X for any σ(Z◦1,N )-measurable


random variable X with E◦ [X] = 0, the opera-
−1
tor L◦ can be replaced by Id in the left-hand side. Further appealing to the variance inequality (2.13)
for Glauber calculus, we are led to
 i2   
h ˆ 2
ˆ
Var◦ [CtN (φ)] − N 2 E◦ E6=

1
D1

φ m(t, µ N
0 ) ≤ N 3
E ◦ D ◦ ◦
D
2 1 φ m(t, µ N
0 ) .
X X

Using (2.21) to bound Glauber derivatives by means of linear derivatives, appealing to Lemma 4.6 to
estimate the latter, and recalling the moment assumption for µ◦ , we obtain for all λ ∈ [0, p0 λ0 ),
 h ˆ i2 
6=1 ◦
N 2
Var◦ [Ct (φ)] − N E◦ E◦ D1 N
φ m(t, µ0 ) .W,β,λ,φ,a,µ◦ N −1 e−λt . (7.10)
X

It remains to evaluate the Glauber derivative in the left-hand side. Recalling again the link between
Glauber and linear derivatives, cf. (2.20), and appealing to Lemma 4.6 for the computation of the
linear derivative, we get


ˆ

D1 φ m(t, µN0 )
X
ˆ 1ˆ ˆ ˆ 
−1
φ m(1) t, µN 1−s

= N 0 + N (δz − δZ 1,N ), y (δZ 1,N − δz )(dy) µ◦ (dz) ds. (7.11)
◦ ◦
0 X X X
1−s
Let us further appeal to the definition of linear derivative to replace the measure µN
0 + N (δz − δZ 1,N ) ◦
in the argument of m(1) by
PN
µN N
0,z ′ := µ0 +
1
N (δz
′ − δZ 1,N ) = 1
N δz
′ + 1
N j=2 δZ◦j,N ,

where z ′ is a new variable integrated over with respect to µ◦ . Using Lemma 4.6 to estimate the
additional linear derivative that constitutes the resulting error, we get for all λ ∈ [0, 21 p0 λ0 ),
 ˆ
E◦ D1◦ φ m(t, µN
0 )
X
1
2
ˆ ˆ ˆ ˆ  2
−1 ′
−N φm (1)
(t, µN
0,z ′ , y) (δZ 1,N − µ◦ )(dy) µ◦ (dz ) .W,β,λ,φ,a,µ◦ N −2 e−λt .

X X X X

Inserting this into (7.10) and reorganizing expectations and integrals, we obtain for all λ ∈ [0, 12 p0 λ0 ),
ˆ ˆ ˆ  2
N
Var◦ [Ct (φ)] − E◦ φ m (t, µ0 , y) (δz − µ◦ )(dy) µ◦ (dz) .W,β,φ,a,µ◦ N −1 e−λt .
(1) N
X X X

We are now in position to appeal to Lemma 2.3 to replace µN 0 by µ◦ in the expectation. Using as
before Lemma 4.6 to estimate linear derivatives, this leads us to
ˆ ˆ ˆ  2
Var◦ [CtN (φ)] − φ m(1) (t, µ◦ , y) (δz − µ◦ )(dy) µ◦ (dz) .W,β,λ,φ,a,µ◦ N −1 e−λt . (7.12)
X X X

Using the notation (7.3), the definition of m(1) in Lemma 4.6 amounts to

m(1) (t, µ◦ , y) = Ut,0 [δy − µ◦ ],


and the claim (7.9) follows with the limit variance σtC (φ, µ◦ ) defined in (7.7).
Step 3. Conclusion.
By homogeneity of d2 and by the triangle inequality, we can estimate
 
d2 CtN (φ) , σtC (φ, µ◦ )N
CtN (φ) σtC (φ, µ◦ )
 
N
= Var[Ct (φ)] d2 , N
Var[CtN (φ)]1/2 Var[CtN (φ)]1/2
CtN (φ)
 
N
≤ Var[Ct (φ)] d2 , N + 12 Var[CtN (φ)] − σtC (φ, µ◦ )2 .
Var[CtN (φ)]1/2
By the asymptotic normality (7.8) and by the convergence result (7.9) for the variance, we then get
for all λ ∈ [0, 21 p0 λ0 )
  1
 1

d2 CtN (φ) , σtC (φ, µ◦ )N .W,β,λ,φ,a,µ◦ N − 2 e−λt 1 + Var[CtN (φ)]− 2 . (7.13)

It remains to deal with the last factor involving the inverse of the variance. For that purpose, we
distinguish between two cases:
— Case 1: assume that σtC (φ, µ◦ )2 ≥ L.


In this case, the convergence result (7.9) for the variance yields
1
 − 1 − 1
2
Var[CtN (φ)]− 2 ≤ σtC (φ, µ◦ )2 − Cφ N −1 e−λt ≤ L − Cφ N −1 e−λt 2 ,
so that (7.13) becomes
  1
 − 1 
d2 CtN (φ) , σtC (φ, µ◦ )N .φ N − 2 e−λt 1 + L − Cφ N −1 e−λt 2 .

— Case 2: assume that σtC (φ, µ◦ )2 ≤ L.


In this case, the convergence result (7.9) for the variance yields
 
d2 CtN (φ) , σtC (φ, µ◦ )N ≤ σtC (φ, µ◦ )2 + Var◦ [CtN (φ)]
≤ 2L + Cφ N −1 e−λt .
Optimizing between those two cases, the conclusion follows. 
7.4. Brownian fluctuations. We establish the following quantitative central limit theorem for Brow-
nian fluctuations DtN (φ). The proof is based on the Lions expansion combined with a simple heat-kernel
PDE argument. The additional estimate (7.16) in the statement below ensures that the central limit
theorem for DtN (φ) also holds given CtN (φ), which is key to deduce a joint result.
Lemma 7.3 (Brownian fluctuations). Let κ_0 be as in Theorem 3.1, let κ ∈ [0, κ_0], and assume that the initial law µ_◦ satisfies ∫_X |z|^{p_0} µ_◦(dz) < ∞ for some p_0 > 0. The random variable D^N_t(φ) defined in Lemma 7.1 satisfies for all t ≥ 0,
1
d2 DtN (φ) , σtD (φ, µ◦ ) N .W,β,φ,a,µ◦ N − 2 ,

(7.14)
where the limit variance is given by
ˆ t ˆ 
∗ 2
σtD (φ, µ◦ )2 := σ0T ∇Ut,s [φ]) m(s, µ◦ ) ds, (7.15)
0 X
∗ is defined in (7.4), that d is the second-order Zolotarev metric (1.18), and
where we recall that Ut,s 2
that N stands for a standard normal random variable. In addition, for all h ∈ Cb2 (R2 ) and t ≥ 0,
1
E h CtN (φ), DtN (φ) − E◦ EN h CtN (φ), σtD (φ, µ◦ )N .W,β,φ,a,µ◦ N − 2 k∂22 hkL∞ (R2 ) ,
   
(7.16)
where the standard normal variable N is taken independent both of initial data and of Brownian forces,
and where we denote by EN the expectation with respect to N .
Proof. Let κ0 , λ0 be as in Theorem 3.1 and let κ ∈ [0, κ0 ] be fixed. We focus on the proof of (7.14),
while the additional statement (7.16) can be obtained along the exact same lines — simply replacing
the test function g below by h(CtN (φ), ·) and recalling that CtN (φ) is independent of Brownian forces.
We split the proof into two steps.
Step 1. Proof that for all g ∈ Cb2 (R) and t ≥ 0,
h i 1
E g(DtN (φ)) − E◦ EN g σtD (φ, µN .W,β,φ,a,µ◦ N − 2 kg ′′ kL∞ (R) ,
 
0 ) N (7.17)
where the limit variance is defined in (7.15), and where as in the statement N stands for a standard
normal random variable taken independent both of initial data and of Brownian forces.
To prove this result, let us consider the (FsB )0≤s≤t -martingale (Ds,t
N (φ))
0≤s≤t given by
N ˆ
N 1 X s
Ds,t (φ) := √ (∂µ U )(t − u, µN i,N i
u )(Zu ) · σ0 dBu ,
N i=1 0
which satisfies
N N
D0,t (φ) = 0, Dt,t (φ) = DtN (φ).
Let g ∈ Cb2 (R) be fixed. By definition of Ds,t


N (φ), Itô’s lemma yields for all θ ∈ R and 0 ≤ s ≤ t,

 
d
ˆ
N 1 ′′ N T N 2 N
 
EB g(θ + Ds,t (φ)) = 2 EB g (θ + Ds,t (φ)) σ0 (∂µ U )(t − s, µs ) µs . (7.18)
ds X

Appealing to the Lions expansion in form of Corollary 2.2(ii), we find for all 0 ≤ s ≤ t,
ˆ ˆ
T N 2 N 2
σ0 (∂µ U )(t − s, µs ) µs − σ0T (∂µ U )(t − s, m(s, µN N
0 )) m(s, µ0 )
X X L2 (ΩB )
ˆ sˆ 1
2
− 12 2 N
. N EB ∂µ Ht−s (s − u, µN
u )(z) µu (dz) du
0 X
 ˆ s ˆ 2 
−1
+ N EB ∂µ2 Ht−s (s − u, µN
u )(z, z) µN
u (dz) du ,
0 X

in terms of
2
ˆ
Ht−s (u, µ) := σ0T ∂µ U (t − s, m(u, µ)) m(u, µ).
X
Appealing to Lemma 4.6 to estimate the multiple linear derivatives, and combining it with the moment
bounds of Lemma 4.7, we get after straightforward computations for all 0 ≤ s ≤ t and λ ∈ [0, 18 p0 λ0 ),
ˆ ˆ
T N 2 N 2
σ0 (∂µ U )(t − s, µs ) µs − σ0T (∂µ U )(t − s, m(s, µN N
0 )) m(s, µ0 )
X X L2 (ΩB )
ˆ
− 21 −λ(t−s)
.W,β,λ,φ,a,p0 N e hzip0 µN
0 (dz).
X

Inserting this into (7.18), and using the short-hand notation


ˆ
2
κs,t (φ, µ) := σ0T (∂µ U )(t − s, m(s, µ)) m(s, µ),
X

we deduce for all θ ∈ R, 0 ≤ s ≤ t, and λ ∈ [0, 18 p0 λ0 ),

d N d2
(φ)) − 21 κs,t (µN N
   
EB g(θ + Ds,t 0 ) 2
EB g(θ + Ds,t (φ))
ds dθ
ˆ
− 21 −λ(t−s) ′′
.W,β,λ,φ,a,p0 N e kg kL∞ (R) hzip0 µN
0 (dz).
X

In order to solve this approximate heat equation for the map (s, θ) 7→ EB [g(θ on [0, t] × R, N (φ))]
+ Ds,t
N
we can appeal for instance to the Feynman–Kac formula with D0,t (φ) = 0. Equivalently, this amounts
to integrating the above estimate with the associated heat kernel. It leads us to deduce for all θ ∈ R
and 0 ≤ s ≤ t,
h  q´ i ˆ
N s − 21 ′′ ∞
N .W,β,φ,a,p0 N kg kL (R) hzip0 µN
 
EB g(θ + Ds,t (φ)) − EN g θ + N 0 κu,t (φ, µ0 ) du 0 (dz).
X
N (φ) = D N (φ), taking the expectation with respect
In particular, setting θ = 0 and s = t, recalling Dt,t t
N
to initial data, and recalling E◦ [µ0 ] = µ◦ and the moment assumption for µ◦ , we get
h  q´ i 1
t
E g(DtN (φ)) − E◦ EN g N N ) du .W,β,φ,a,µ◦ N − 2 kg′′ kL∞ (R) .
 
κ
0 u,t (φ, µ 0

Using the notation (7.3) and recalling that the definition of m(1) in Lemma 4.6 amounts to
m(1) (t, µ◦ , y) = Ut,0 [δy − µ◦ ],
we recognize the definition (7.15) of σtD ,


ˆ t ˆ t ˆ 
2
κu,t (φ, µ) du = σ0T (∂µ U )(t − s, m(s, µ)) m(s, µ) ds
0 0 X
ˆ t ˆ 
∗ 2
= σ0T ∇Ut,s [φ]) m(s, µ) ds
0 X
= σtD (φ, µ)2 , (7.19)

and the claim (7.17) follows.

Step 2. Proof that for all g ∈ Cb2 (R) and t ≥ 0,


h i h i 1
E◦ EN g σtD (φ, µN 0 ) N − E N g σ D
t (φ, µ ◦ ) N .W,β,φ,a,µ◦ N − 2 kg′′ kL∞ (R) . (7.20)

Combining this with the result (7.17) of Step 1, and recalling the definition (1.18) of the second-
order Zolotarev metric, this will conclude the proof of (7.14). Set for shortness σtN := σtD (φ, µN
0 )
D
and σt := σt (φ, µ◦ ). We can decompose
ˆ 1 h   i
EN N g′ σt + θ(σtN − σt ) N dθ,
 N    N
EN g(σt N ) − EN g(σt N ) = (σt − σt )
0

and a Gaussian integration by parts then yields


ˆ 1 h   i
σt + θ(σtN − σt ) EN g′′ σt + θ(σtN − σt ) N dθ.
 N    N

EN g(σt N ) − EN g(σt N ) = (σt − σt )
0

Hence,
EN g(σtN N ) − EN g(σt N ) ≤ kg′′ kL∞ (R) |σtN − σt | |σt | + |σtN − σt | .
    

Taking the expectation with respect to initial data, the claim (7.20) would follow provided that we
could show for all t ≥ 0,

|σtD (φ, µ◦ )| .W,β,φ,a,µ◦ 1, (7.21)


1 1
E◦ |σtD (φ, µN D 2 2
.W,β,φ,a,µ◦ N − 2 .

0 ) − σt (φ, µ◦ )| (7.22)

For that purpose, we first recall that by (7.19) we can write


ˆ tˆ 
D 2 2
σt (φ, µ) = σ0T (∂µ U )(t − s, m(s, µ)) m(s, µ) ds, (7.23)
0 X
´
with the short-hand notation U (t, µ) := X φ m(t, µ). Applying Lemma 4.6 to estimate the linear
derivative, and combining it with the moment bounds of Lemma 4.7, the claim (7.21) follows after
straightforward computations. We turn to the proof of (7.22). By the triangle inequality, we can
decompose
1 N 12
E◦ |σtD (φ, µN D 2 2
≤ E◦ [σtD (φ, µN D D

0 ) − σt (φ, µ◦ )| 0 )] − σt (φ, µ◦ ) + Var◦ [σt (φ, µ0 )] ,

and we estimate both terms separately. On the one hand, starting again from (7.23), appealing to
Lemma 2.3, using Lemma 4.6 to estimate the multiple linear derivatives, and combining it with the
moment bounds of Lemma 4.7 and with the moment assumption for µ◦ , we easily get
−1
E◦ [σtD (φ, µN D
0 )] − σt (φ, µ◦ ) .W,β,φ,a,µ◦ N .
On the other hand, using the variance inequality (2.13) for Glauber calculus, and appealing to (2.20)
to bound Glauber derivatives in terms of linear derivatives, we find
N
E◦ |Dj◦ σtD (φ, µN
X
Var◦ σtD (φ, µN 2
   
0 ) ≤ 0 )|
j=1

δσtD 
 ˆ 
2
ˆ 
−1
. N E◦ φ, µN
0 +
1−s
N (δz − δZ 1,N ), y (δZ 1,N − δz )(dy) µ◦ (dz) ds .
0 X X δµ 0 0

Further using Lemma 4.6 to estimate the multiple linear derivatives, and combining it with the moment
bounds of Lemma 4.7 and with the moment assumption for µ◦ , we deduce
−1
Var◦ [σtD (φ, µN
0 )] .W,β,φ,a,µ◦ N ,
and the claim (7.22) follows. 
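The only probabilistic identity invoked in the last estimate is the Gaussian integration by parts E[N h(N)] = E[h′(N)] for a standard normal N (Stein's identity). The following one-liner is illustrative only, with an arbitrary smooth test function, and checks the identity numerically.

```python
import numpy as np

rng = np.random.default_rng(4)
G = rng.standard_normal(2_000_000)
h, hp = np.tanh, lambda x: 1.0 / np.cosh(x) ** 2
print(np.mean(G * h(G)), np.mean(hp(G)))   # Gaussian integration by parts: E[N h(N)] = E[h'(N)]
```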
7.5. Proof of Theorem 1.3. Let κ0 , λ0 be as in Theorem 3.1 and let κ ∈ [0, κ0 ] be fixed. Let
also g ∈ Cb2 (R) be momentarily fixed with g ′ (0) = 0 and kg ′′ kL∞ (R) = 1. By Lemma 7.1, we find
1
E g(StN (φ)) − E g CtN (φ) + DtN (φ) .W,β,φ,a,µ◦ N − 2 .
   

Next, appealing to (7.16) in Lemma 7.3 for the asymptotic normality of DtN (φ) given CtN (φ), we deduce
1
E g(StN (φ)) − E◦ EN g CtN (φ) + σtD (φ, µ◦ )N .W,β,φ,a,µ◦ N − 2 ,
   
(7.24)
where N stands for a standard normal random variable taken independent both of initial data and of
Brownian forces. It remains to combine this with our analysis of fluctuations of CtN (φ). Appealing
to the asymptotic normality of CtN (φ) as stated in Lemma 7.2, and testing that result in Zolotarev
metric with the function EN [g(· + σtD (φ, µ◦ )N )] ∈ Cb2 (R), we deduce for all λ ∈ [0, 21 p0 λ0 ),

E g(StN (φ)) − EN EN ′ g σtC (φ, µ◦ )N ′ + σtD (φ, µ◦ )N


   
  −1 
− 21 −λt C − 13 −λt 21
.W,β,λ,φ,a,µ◦ N 1+e σt (φ, µ◦ ) + (N e ) ,

where N ′ stands for another standard normal random variable taken independent of initial data, of
Brownian forces, and of N . Taking the supremum over g, and noting that σtC (φ, µ◦ )N ′ + σtD (φ, µ◦ )N
has the same distribution as σt (φ, µ◦ )N with total variance
σt (φ, µ◦ )2 := σtC (φ, µ◦ )2 + σtD (φ, µ◦ )2 ,
we conclude
    −1 
N − 21 −λt C − 31 −λt 12
d2 St (φ) , σt (φ, µ◦ )N .W,β,λ,φ,a,µ◦ N 1+e σt (φ, µ◦ ) + (N e )
1 1 1
≤ N − 2 + N − 3 e− 2 λt .
Noting that the total variance σt (φ, µ◦ ) coincides with the variance predicted by the Gaussian Dean–
Kawasaki equation, cf. (7.5), the conclusion follows. 
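For the reader's convenience, we also recall the representation of the second-order Zolotarev metric used in this last step: schematically,
$$d_2(X,Y)\,=\,\sup_g\,\big|\mathbb{E}[g(X)]-\mathbb{E}[g(Y)]\big|,$$
where the supremum runs over test functions $g\in C_b^2(\mathbb R)$ with $g'(0)=0$ and $\|g''\|_{L^\infty(\mathbb R)}\le1$; this is why it suffices to establish the above estimate for each such $g$ separately before taking the supremum. We refer to the precise definition of $d_2$ used earlier for the exact normalization.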


(Armand Bernou) Université de Lyon, Université Claude Bernard Lyon 1, Laboratoire de Sciences
Actuarielle et Financière, 50 Avenue Tony Garnier, F-69007 Lyon, France
Email address: [email protected]

(Mitia Duerinckx) Université Libre de Bruxelles, Département de Mathématique, 1050 Brussels, Belgium
Email address: [email protected]
