Probabilistic Reachability Analysis of Stochastic Control Systems

Saber Jafarpour¹*, Zishun Liu²* and Yongxin Chen²

*The first two authors contribute equally to this work.
¹Saber Jafarpour is with University of Colorado Boulder, Boulder, CO 80309.
²Zishun Liu and Yongxin Chen are with Georgia Institute of Technology, Atlanta, GA 30332. {zliu910}{yongchen}@gatech.edu

Abstract

We address the reachability problem for continuous-time stochastic dynamic systems. Our objective is to present a unified framework that characterizes the reachable set of a dynamic system in the presence of both stochastic disturbances and deterministic inputs. To achieve this, we devise a strategy that effectively decouples the effects of deterministic inputs and stochastic disturbances on the reachable sets of the system. For the deterministic part, many existing methods can capture the deterministic reachability. As for the stochastic disturbances, we introduce a novel technique that probabilistically bounds the difference between a stochastic trajectory and its deterministic counterpart. The key to our approach is introducing a novel energy function termed the Averaged Moment Generating Function that yields a high probability bound for this difference. This bound is tight and exact for linear stochastic dynamics and applicable to a large class of nonlinear stochastic dynamics. By combining our innovative technique with existing methods for deterministic reachability analysis, we can compute estimations of reachable sets that surpass those obtained with current approaches for stochastic reachability analysis. We validate the effectiveness of our framework through various numerical experiments. Beyond its immediate applications in reachability analysis, our methodology is poised to have profound implications in the broader analysis and control of stochastic systems. It opens avenues for enhanced understanding and manipulation of complex stochastic dynamics, presenting opportunities for advancements in related fields.

Index Terms:

Reachability analysis, stochastic dynamic systems, stochastic control

I Introduction

Reachability analysis is an important topic in systems and control theory that focuses on analyzing whether the trajectory of a system will reach a certain set within a time horizon starting from a given set of initial conditions and possibly subject to inputs or disturbances. It is essential in many applications including autonomous vehicles, aerospace systems, robotics, etc. For instance, in safety-critical applications where the system should be kept outside an unsafe region of the state space, reachability analysis is a key machinery to verify and design the control input to avoid the unsafe region.

Reachability analysis of dynamical systems is a fundamentally challenging task. For general dynamical systems, obtaining exact or close approximations of their reachable sets is only possible when the state dimension is low and generally demands substantial computational resources. However, there is a rapidly growing need for fast reachability analysis methods in various control applications. This motivates the need for rigorous methods that can efficiently upper bound the reachable sets of dynamical systems.

For deterministic systems with bounded inputs or disturbances, many methods have been proposed to over-approximate the reachable set. Several representative methods include Hamilton-Jacobi reachability that poses reachability as a game between two players [1, 2], contraction-based reachability that estimates the propagation of reachable sets using the contraction rate of the system [3, 4], and interval-based reachability that over-approximates reachable sets by leveraging techniques from interval analysis and monotone system theory [5, 6, 7, 8]. Other methods such as simulation-based reachability [9, 10] are also popular in a wide range of studies.

In this work, we are interested in reachability analysis for stochastic systems. In many real-world applications, systems are subject to unbounded and stochastic disturbances and are better modeled by stochastic dynamics. Despite the efficiency of the aforementioned deterministic reachability methods in the presence of bounded disturbances, they cannot be applied directly to systems subject to unbounded stochastic disturbances. For systems with stochastic disturbances, considering all possible disturbance scenarios will often result in unbounded reachable sets due to the unboundedness of stochastic noise. Moreover, this approach also ignores the statistical properties of the noise, leading to overly conservative results [11]. To better capture the effects of stochastic disturbances, reachability analysis in stochastic systems focuses on the probabilistic reachable set, which refers to the set that any possible trajectory starting from an initial set can reach with high probability (e.g., 99.9%).

There have been several attempts to approximate probabilistic reachable sets of stochastic systems and they can be divided into two categories. The first category is the optimization-based approaches that use Hamilton-Jacobi equations and dynamic programming [12, 13, 14, 15]. However, these approaches are usually computationally heavy, rendering them impractical for large-scale systems. The second category is simulation-based approaches which provide guarantees for reachability using trajectory samples [16, 17, 18]. One drawback of these methods is that the number of samples needed to obtain reasonable bounds on reachable sets can grow exponentially. Another tangentially related line of research is on the stochastic Lyapunov function [19] or barrier function [20, 21, 22, 23] for measuring the probability of a trajectory staying in a safe set. In these works, the goal is not to find the probabilistic reachable set but to verify whether a given safe set is in the probabilistic reachable set.

In this work, we establish a unified framework for computing the probabilistic reachable sets of nonlinear stochastic systems subject to both deterministic inputs and stochastic disturbances. Our method is both theoretically optimal and effective in practice. Theoretically, under standard assumptions, our method yields tight approximations of probabilistic reachable sets that cannot be improved further without additional assumptions. Implementation-wise, our approximations of probabilistic reachable sets can be computed efficiently and are scalable to high-dimensional systems.

Our framework is built upon a novel separation strategy, which decouples the effects of deterministic inputs and stochastic uncertainty on reachability analysis of the stochastic system (Proposition 1). The effects of stochastic uncertainty on the probabilistic reachable set can be represented using stochastic deviation, which refers to the distance between a stochastic trajectory and its associated deterministic trajectory. By developing a novel energy function termed the Averaged Moment Generating Function (AMGF), we provide a probabilistic bound on the stochastic deviation of general stochastic continuous trajectories (Theorem 1). Our bound has a dependence $\mathcal{O}(\sqrt{\log(1/\delta)})$ on the probability level $1-\delta$ , significantly better than existing techniques which result in a bound of the order $\mathcal{O}(\sqrt{1/\delta})$ . Moreover, our bound coincides with that for linear stochastic systems under the same assumptions and cannot be improved further. The effects of deterministic input on the probabilistic reachable set can be captured using deterministic reachable sets of the associated deterministic system, i.e., the system obtained by removing the stochastic noise.

Consequently, our separation strategy enables a decomposition of the probabilistic reachable set into a deterministic reachable set capturing the deterministic input and a tight robustness buffer around it against the stochastic uncertainty (Theorem 2). As such, analyzing the reachability of the associated deterministic system is all we need to obtain a good probabilistic reachable set. This is a paradigm shift and brings tremendous flexibility to the reachability analysis of stochastic systems as any deterministic reachability framework can be incorporated. In particular, we combine our framework with two computationally efficient deterministic reachability approaches namely contraction-based reachability and interval-based reachability to obtain probabilistic reachable sets for stochastic systems.

Finally, our tight probabilistic bound of stochastic deviation is poised to have profound implications in the broader analysis and control of stochastic systems beyond its immediate applications in reachability analysis. To the best of our knowledge, this bound is the first non-conservative result that can quantitatively describe the behavior of a nonlinear stochastic system under standard assumptions. The bound is of independent interest and can potentially impact many other areas such as estimation, uncertainty quantification, finance, machine learning, statistics, etc. It opens avenues for enhanced understanding and manipulation of complex stochastic dynamics, presenting opportunities for advancements in related fields.

The rest of the paper is organized as follows. In Section II we briefly review reachability analysis for deterministic systems. In Section III we introduce and formulate the probabilistic reachability problem and present our overall strategy. The discussion of an existing method is given in Section IV. Section V contains the main technical contribution of this paper where we introduce a novel energy function termed the Averaged Moment Generating Function to bound the deviation of stochastic trajectories from their deterministic counterpart with high probability. This high-probability bound of stochastic deviation is combined with deterministic reachability analysis in Section VI to approximate the probabilistic reachable set of stochastic systems. This is followed by case studies in Section VII and numerical experiments in Section VIII.

II Preliminaries

In this section, we briefly review reachability analysis for deterministic dynamics and related concepts.

II-A Notations

Vectors and matrices. Given a vector $x\in\mathbb{R}^{n}$ , $\|x\|$ denotes its Euclidean norm ( $\ell_{2}$ norm) and $\|x\|_{P}=\sqrt{x^{\mathsf{T}}Px}$ with some positive definite matrix $P$ . Given a matrix $A\in\mathbb{R}^{m\times n}$ , $\|A\|$ denotes its spectral norm and $\|A\|_{P}$ denotes its weighted spectral norm with respect to some positive definite matrix $P$ . For two matrices $A,B\in\mathbb{R}^{n\times n}$ , $A\preceq B$ if $B-A$ is positive semi-definite. If $A\in\mathbb{R}^{n\times n}$ is a positive definite matrix, we denote its square root by $A^{\frac{1}{2}}$ , i.e., $A^{\frac{1}{2}}$ is the unique matrix such that $A^{\frac{1}{2}}(A^{\frac{1}{2}})^{\mathsf{T}}=(A^{\frac{1}{2}})^{\mathsf{T}}A^{\frac{1}{2}}=A$ . Besides, we use $\langle\cdot,\cdot\rangle$ to denote the standard inner product, $0$ to denote all-zero vectors and matrices, and $I_{n}$ to denote the $n$ -dimensional identity matrix.

Sets and Functions. We use $\mathcal{B}^{n}\left(r,y\right)$ to denote the ball $\{x\in\mathbb{R}^{n}:\|x-y\|\leq r\}$ and $\mathcal{S}^{n-1}$ to denote the unit sphere $\{x\in\mathbb{R}^{n}:\|x\|=1\}$ . For two sets $A,B$ , their Minkowski sum is defined as $A\oplus B=\{x+y:x\in A,~y\in B\}$ . Given a set $\mathcal{X}\subseteq\mathbb{R}^{n}$ and a matrix $T\in\mathbb{R}^{n\times n}$ , we define $T\mathcal{X}=\{Tx\;|\;x\in\mathcal{X}\}$ . Given a continuously differentiable vector-valued function $f:\mathbb{R}^{n}\to\mathbb{R}^{m}$ , we denote the Jacobian of $f$ at $x$ by $D_{x}f(x)$ . For a twice-differentiable scalar-valued function $f:\mathbb{R}^{n}\to\mathbb{R}$ , its gradient at $x$ is $\nabla f(x)$ and the Hessian matrix is denoted as $\nabla^{2}f(x)$ .

Throughout the paper, we use $\mathbb{E}$ to denote expectation and $\mathbb{P}$ to denote probability. For a set $S$ , $X\sim S$ means $X$ is a random variable drawn uniformly from $S$ .

II-B Reachable Set of Deterministic Dynamics

Computing the reachable sets is a fundamental problem in dynamical systems and control theory. Consider the continuous-time deterministic system

\dot{x}_{t}=f(x_{t},u_{t},t),

(1)

where $x_{t}\in\mathbb{R}^{n}$ is the state at time $t$ , $u_{t}\in\mathbb{R}^{p}$ is the input at time $t$ , and $f:\mathbb{R}^{n}\times\mathbb{R}^{p}\times\mathbb{R}_{\geq 0}\to\mathbb{R}^{n}$ is a parameterized vector field. Depending on the applications, $u_{t}$ can be a control action or a disturbance. The reachable set of a deterministic system is the set of all states that the system can reach, starting from an initial configuration, under all possible inputs within a specified time period [24].

Definition II.1 (DRS).

Consider the system (1) with initial set $\mathcal{X}_{0}\subseteq\mathbb{R}^{n}$ and input set $\mathcal{U}\subseteq\mathbb{R}^{p}$ . The deterministic reachable set (DRS) of (1) at time $t$ starting from $\mathcal{X}_{0}$ with inputs in $\mathcal{U}$ is

\displaystyle\mathcal{R}_{t}=\left\{x_{t}\middle|\begin{aligned} &\tau\mapsto x_{\tau}\mbox{ is a trajectory of (1)}\\ &\mbox{ with }x_{0}\in\mathcal{X}_{0}\mbox{ and }u_{\tau}:\mathbb{R}_{\geq 0}\to\mathcal{U}\end{aligned}\right\}

(2)

In general, computing the exact DRS of a dynamic system is computationally intractable [25]. Therefore, most methods in reachability analysis focus on providing over-approximation of DRS [24]. A set $\overline{\mathcal{R}}_{t}\subseteq\mathbb{R}^{n}$ is an over-approximation of the DRS (2) if, for every $t\geq 0$ ,

\displaystyle\mathcal{R}_{t}\subseteq\overline{\mathcal{R}}_{t}.

In Section VII, we revisit two approaches to compute $\overline{\mathcal{R}}_{t}$ : contraction-based reachability and interval-based reachability.

II-C Matrix Measure and Contraction Theory

A key tool in studying reachable sets of system (1) is the matrix measure [26, 27] defined as follows.

Definition II.2 (Matrix Measure).

Given a matrix $A\in\mathbb{R}^{n\times n}$ , its matrix measure with respect to $\|\cdot\|$ , denoted by $\mu(A)$ , is defined as

\displaystyle\mu(A)=\lim_{\epsilon\to 0^{+}}\frac{\|I_{n}+\epsilon A\|-1}{\epsilon}.

Intuitively, $\mu(A)$ can be considered as the one-sided derivative of the norm $\|\cdot\|$ at $I_{n}$ in the direction of $A$ . Although matrix measure can be defined with respect to any norm, in this paper we focus on the spectral norm. In this case, the matrix measure has a closed-form expression $\mu(A)=\tfrac{1}{2}\lambda_{\max}(A+A^{\mathsf{T}})$ .
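As a quick numerical illustration of Definition II.2, the following sketch compares the closed-form expression for the spectral-norm matrix measure with a finite-difference approximation of the one-sided limit. The function names and the example matrix are hypothetical and chosen only for illustration.

```python
import numpy as np

def matrix_measure_2(A):
    """Matrix measure w.r.t. the spectral norm: 0.5 * lambda_max(A + A^T)."""
    A = np.asarray(A, dtype=float)
    return 0.5 * np.linalg.eigvalsh(A + A.T).max()

def matrix_measure_limit(A, eps=1e-7):
    """Finite-difference approximation of the one-sided limit in Definition II.2."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    return (np.linalg.norm(np.eye(n) + eps * A, 2) - 1.0) / eps

# Hypothetical example matrix (not taken from the paper).
A = np.array([[-1.0, 2.0], [0.0, -3.0]])
mu_closed = matrix_measure_2(A)
mu_limit = matrix_measure_limit(A)
print(mu_closed, mu_limit)
```

For this matrix the closed form evaluates to $-2+\sqrt{2}$, which is negative even though $A$ is not symmetric.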

For the system (1), the evolution of the distance of two arbitrary trajectories can be measured using $\mu(D_{x}f(x,u,t))$ . The following lemma provides a variational characterization of $\mu(D_{x}f(x,u,t))$ [28].

Lemma II.1.

Given a deterministic system (1), for every $t\geq 0$ , the following statements are equivalent:

(i)

$\mu(D_{x}f(x,u,t))\leq c_{t}$ for all $(x,u)\in\mathbb{R}^{n}\times\mathcal{U}$ .
(ii)

$(x-y)^{\mathsf{T}}(f(x,u,t)-f(y,u,t))\leq c_{t}\|x-y\|^{2}$ , for all $(x,y,u)\in\mathbb{R}^{n}\times\mathbb{R}^{n}\times\mathcal{U}$ .

A classical result in contraction theory states that if condition (i) holds then the distance between trajectories of the system (1) can be upper bounded exponentially with time [4]. If there exists $\alpha>0$ such that $c_{t}<-\alpha$ for all $t$ , then the distance between two arbitrary trajectories of (1) is decreasing over time, and we say the system is contracting [29, 30, 31]. In practice, to apply the contraction theory for reachability analysis, one needs to estimate or bound $\mu(D_{x}f(x,u,t))$ . Several approaches have been proposed in the literature to determine the upper bound of $\mu(D_{x}f(x,u,t))$ (see e.g., [9],[4, Chapter 3,4],[32, 33]). These methods are applicable not only to contracting systems but also to systems with any $c_{t}\in\mathbb{R}$ . In this paper, we allow $c_{t}\in\mathbb{R}$ rather than restricting it to be negative.
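The exponential bound on the distance between trajectories can be checked numerically in the simplest setting. The sketch below assumes a linear system $\dot{x}=Ax$ with a hypothetical diagonalizable matrix $A$, for which the difference of two trajectories evolves as $e^{At}(x_{0}-y_{0})$ and the bound reads $\|x_{t}-y_{t}\|\leq e^{\mu(A)t}\|x_{0}-y_{0}\|$. The helper `expm_diagonalizable` is an illustrative stand-in for a general matrix exponential.

```python
import numpy as np

def expm_diagonalizable(A, t):
    """e^{At} via eigendecomposition; assumes A is diagonalizable."""
    w, V = np.linalg.eig(A)
    return (V @ np.diag(np.exp(w * t)) @ np.linalg.inv(V)).real

# Hypothetical system matrix and initial conditions (not from the paper).
A = np.array([[-1.0, 2.0], [0.0, -3.0]])
c = 0.5 * np.linalg.eigvalsh(A + A.T).max()  # matrix measure mu(A), constant c_t

x0 = np.array([1.0, -2.0])
y0 = np.array([0.5, 1.0])
d0 = np.linalg.norm(x0 - y0)

times = np.linspace(0.0, 5.0, 51)
# Distance between the two trajectories at each time.
dist = np.array([np.linalg.norm(expm_diagonalizable(A, t) @ (x0 - y0)) for t in times])
# Contraction-type bound e^{c t} * ||x0 - y0||.
bound = np.exp(c * times) * d0
print(np.max(dist - bound))
```

Since $c<0$ here, the bound decays and the system is contracting; for $c>0$ the same inequality holds but the bound grows.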

III Reachability of Stochastic Systems

In many real-world applications, the underlying dynamics are driven not only by deterministic inputs but also by stochastic disturbances. Existing methods and techniques for deterministic reachability analysis designed for deterministic and often bounded inputs/disturbances are not applicable to these scenarios with stochastic disturbances. We aim to bridge this gap by developing a unified framework of reachability analysis for stochastic systems. In this section, we formulate our probabilistic reachability problem and introduce our overall strategy for addressing it.

III-A Problem Statement

Consider the stochastic system

dX_{t}=f(X_{t},u_{t},t)dt+g_{t}(X_{t})dW_{t},

(3)

where the state $X_{t}\in\mathbb{R}^{n}$ is a random vector, $u_{t}:\mathbb{R}_{\geq 0}\to\mathcal{U}\subseteq\mathbb{R}^{p}$ is the input, $g_{t}$ is the diffusion coefficient, and $W_{t}\in\mathbb{R}^{m}$ is an $m$ -dimensional Wiener process (Brownian motion) modeling the stochastic uncertainty. This stochastic system can be viewed as a noisy version of the deterministic system

\dot{x}_{t}=f(x_{t},u_{t},t).

(4)

To ensure (3) has a solution, we assume by default the standard Lipschitz and linear growth conditions [34, Theorem 5.2.1]. For reachability analysis, we impose the following assumption.

Assumption 1.

For the stochastic system (3), there exist integrable curves $t\mapsto c_{t}$ and $t\mapsto\sigma_{t}$ such that,

(i)

$\mu(D_{x}f(x,u,t))\leq c_{t}$ for any $t\geq 0$ , $u\in\mathcal{U}$ , and $x\in\mathbb{R}^{n}$ .
(ii)

$g_{t}(x)g_{t}(x)^{\mathsf{T}}\preceq\sigma_{t}^{2}I_{n}$ for any $t\geq 0$ and $x\in\mathbb{R}^{n}$ .

We are interested in characterizing the reachable set of the stochastic system (3) under Assumption 1. Departing from the deterministic dynamics (4) driven only by the input $u_{t}$ , the stochastic system (3) is driven by both the input $u_{t}$ and stochastic disturbance $dW_{t}/dt$ . Deterministic reachability analysis falls short of capturing this stochastic disturbance. Indeed, most methods in deterministic reachability analysis assume bounded input/disturbance and approximate its DRS through worst-case type analysis [11]. However, the stochastic disturbance $dW_{t}/dt$ is unbounded [35, Chapter 4.1]. This unbounded stochastic disturbance often results in a trivial reachable set in the sense of (2). For example, the classical reachable set of the system $dX_{t}=dW_{t}$ is the entire state space for any $t>0$ . We resort to a probabilistic notion of reachable sets to overcome these limitations of deterministic reachability analysis.

Definition III.1 ( $\delta$ -PRS).

Consider the stochastic system (3) with initial set $\mathcal{X}_{0}\subseteq\mathbb{R}^{n}$ and input set $\mathcal{U}\subseteq\mathbb{R}^{p}$ . Given $\delta\in(0,1]$ and $t\geq 0$ , the set $\mathcal{R}_{\delta,t}\subseteq\mathbb{R}^{n}$ is a $\delta$ -probabilistic reachable set ( $\delta$ -PRS) of (3) at time $t$ , if for any $x_{0}\in\mathcal{X}_{0}$ and piecewise continuous $u_{t}:\mathbb{R}_{\geq 0}\to\mathcal{U}$ , we have

\mathbb{P}\left(X_{t}\in\mathcal{R}_{\delta,t}\right)\geq 1-\delta.

(5)

Briefly, a probabilistic reachable set of a stochastic system (3) is a set that contains the state of every possible trajectory at time $t$ with high probability. An illustration of $\delta$ -PRS is given in Figure 1. For sufficiently small $\delta$ , $\mathcal{R}_{\delta,t}$ contains the DRS of the associated deterministic system (4) due to the stochastic disturbance, that is, $\mathcal{R}_{t}\subseteq\mathcal{R}_{\delta,t}$ . By definition, the $\delta$ -PRS is not unique. If $\mathcal{R}_{\delta,t}$ is a $\delta$ -PRS, then any $\mathcal{R}_{\delta,t}^{\prime}\supseteq\mathcal{R}_{\delta,t}$ is also a $\delta$ -PRS. We say $\mathcal{R}_{\delta,t}$ is a tighter $\delta$ -PRS than $\mathcal{R}_{\delta,t}^{\prime}$ if $\mathcal{R}_{\delta,t}\subseteq\mathcal{R}_{\delta,t}^{\prime}$ .

Figure 1: An illustration of $\delta$ -PRS at time $t$ . Here $\mathcal{R}_{\delta,t}$ is a $\delta$ -PRS of the stochastic system (3), whose trajectories are in color, and $\mathcal{R}_{t}$ is the DRS of the associated deterministic system (4), whose trajectories are in black.
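Definition III.1 can be checked empirically on the scalar example $dX_{t}=dW_{t}$ with $X_{0}=0$, for which $X_{t}\sim\mathcal{N}(0,t)$. The sketch below is an illustration under that assumption: it takes the interval $[-r,r]$ with $r=\sqrt{2t\log(2/\delta)}$, obtained from the standard Gaussian tail bound $\mathbb{P}(|X_{t}|>r)\leq 2e^{-r^{2}/(2t)}$, as a candidate $\delta$-PRS and estimates its coverage by sampling. The sample size, seed, and parameter values are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
delta, t, n_samples = 0.05, 2.0, 20000

# X_t = W_t ~ N(0, t) for the scalar system dX_t = dW_t with X_0 = 0.
X_t = np.sqrt(t) * rng.standard_normal(n_samples)

# Candidate delta-PRS: the interval [-r, r], with r from the Gaussian tail bound
# P(|X_t| > r) <= 2 exp(-r^2 / (2 t)).
r = np.sqrt(2.0 * t * np.log(2.0 / delta))

# Empirical estimate of P(X_t in [-r, r]); should be at least 1 - delta.
coverage = np.mean(np.abs(X_t) <= r)
print(r, coverage)
```

Any larger interval is also a $\delta$-PRS, which illustrates why the definition alone does not single out a unique set.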

In many applications involving reachability analysis, it is desirable to have a tight $\delta$ -PRS. For instance, for safety-critical control, the safety of the system can be guaranteed by ensuring that the $\delta$ -PRS does not overlap with the unsafe regions [36]. A loose $\delta$ -PRS can result in very conservative strategies. Therefore, we are interested in finding the tightest possible $\delta$ -PRS.

Problem 1.

Find a $\delta$ -PRS $\mathcal{R}_{\delta,t}$ of the stochastic system (3) under Assumption 1 that is as tight as possible.

III-B Separation Strategy and Stochastic Deviation

The trajectories of the stochastic system (3) are driven by both deterministic input and stochastic disturbance/input. The effects of these two types of inputs on the trajectories are relatively independent and may be handled separately. Building on this intuition, we propose a strategy termed separation strategy for probabilistic reachability analysis. The effects of the deterministic input can be encoded by the DRS of the associated deterministic system (4). To capture the effects of the stochastic disturbance, we associate each trajectory $X_{t}$ of the system (3) with a trajectory $x_{t}$ of the system (4) with the same initial state $x_{0}=X_{0}$ and the same deterministic input $u_{t}$ . The influence of the stochastic disturbance can then be represented by the deviation $\|X_{t}-x_{t}\|$ . The probabilistic reachable set of (3) can be approximated by combining these two components as formalized below.

Proposition 1 (Separation strategy).

Consider the stochastic system (3) with its associated deterministic system (4). Let $\overline{\mathcal{R}}_{t}$ be any over-approximation of the DRS of (4). If there exists $r_{\delta,t}$ such that, for any given trajectory $x_{t}$ of (4) and any associated trajectory $X_{t}$ of (3) with the same initial condition $x_{0}$ and input $u_{\tau}$ ,

\mathbb{P}\left(\|X_{t}-x_{t}\|\leq r_{\delta,t}\right)\geq 1-\delta,

(6)

then $\overline{\mathcal{R}}_{t}\oplus\mathcal{B}^{n}(r_{\delta,t},0)$ is a $\delta$ -PRS of (3).

Proof.

Let $X_{t}$ be any trajectory of (3) associated with a trajectory $x_{t}$ of (4), then, by the assumption (6) and the definition of the Minkowski sum [37],

\displaystyle X_{t}\in\{x_{t}\}\oplus\mathcal{B}^{n}(r_{\delta,t},0)

with probability at least $1-\delta$ . By the definition of $\overline{\mathcal{R}}_{t}$ , $x_{t}\in\overline{\mathcal{R}}_{t}$ . Therefore, with probability at least $1-\delta$ ,

X_{t}\in\overline{\mathcal{R}}_{t}\oplus\mathcal{B}^{n}(r_{\delta,t},0),

which completes the proof. ∎

We term the difference $\|X_{t}-x_{t}\|$ between associated trajectories stochastic deviation. A key ingredient of Proposition 1 is a probabilistic bound $r_{\delta,t}$ that upper bounds the stochastic deviation with high probability. Proposition 1 states that if a probabilistic bound $r_{\delta,t}$ exists, then the dilation of the reachable set of the deterministic system (4) with a ball of radius $r_{\delta,t}$ is a $\delta$ -PRS of (3). This separation strategy decomposes the probabilistic reachability analysis problem into two parts: approximate the DRS of (4) and estimate the probabilistic bound $r_{\delta,t}$ of the stochastic deviation. Once a bound $r_{\delta,t}$ of the stochastic deviation is provided, one can combine it with any existing deterministic reachability method to approximate the $\delta$ -PRS.
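To make the dilation in Proposition 1 concrete, the sketch below assumes that the over-approximation $\overline{\mathcal{R}}_{t}$ is an axis-aligned box $[\text{lower},\text{upper}]$ (as an interval-based method might return) and tests membership in $\overline{\mathcal{R}}_{t}\oplus\mathcal{B}^{n}(r_{\delta,t},0)$. The function name and the numerical values are hypothetical.

```python
import numpy as np

def in_box_plus_ball(x, lower, upper, r):
    """Check whether x lies in [lower, upper] (+) B(r, 0).

    A point belongs to the Minkowski sum of a box and a Euclidean ball of
    radius r exactly when its Euclidean distance to the box is at most r.
    """
    x, lower, upper = (np.asarray(v, dtype=float) for v in (x, lower, upper))
    nearest = np.clip(x, lower, upper)  # projection of x onto the box
    return bool(np.linalg.norm(x - nearest) <= r)

# Hypothetical box over-approximation and deviation radius.
lower, upper, r = [0.0, 0.0], [1.0, 1.0], 0.5
inside_box = in_box_plus_ball([0.2, 0.9], lower, upper, r)     # inside the box itself
near_corner = in_box_plus_ball([1.3, 1.3], lower, upper, r)    # distance ~0.424 from the corner
beyond_corner = in_box_plus_ball([1.4, 1.4], lower, upper, r)  # distance ~0.566 from the corner
print(inside_box, near_corner, beyond_corner)
```

Note that the dilated set has rounded corners: it is strictly smaller than the box obtained by enlarging every side by $r$.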

The size of the $\delta$ -PRS $\overline{\mathcal{R}}_{t}\oplus\mathcal{B}^{n}(r_{\delta,t},0)$ in Proposition 1 increases with $r_{\delta,t}$ . To ensure $\overline{\mathcal{R}}_{t}\oplus\mathcal{B}^{n}(r_{\delta,t},0)$ is not an overly conservative $\delta$ -PRS of (3), it is crucial to establish a probabilistic bound $r_{\delta,t}$ for the stochastic deviation that is as tight as possible. This is the main challenge addressed in this paper.

Problem 2.

Establish a probabilistic bound $r_{\delta,t}$ , as tight as possible, of the stochastic deviation $\|X_{t}-x_{t}\|$ associated with systems (3)-(4) under Assumption 1.

IV Expectation Bound and Limitations

To warm up, we first revisit an existing approach [38] for Problem 2 and highlight its limitations.

IV-A Expectation Bound on Stochastic Deviation

Inspired by [38] we present a method that probabilistically bounds the stochastic deviation $\|X_{t}-x_{t}\|$ under Assumption 1 by bounding the expectation $\mathbb{E}(\|X_{t}-x_{t}\|^{2})$ .

For a trajectory $X_{t}$ of the stochastic system (3) and the associated trajectory $x_{t}$ of the deterministic system (4), define the Lyapunov function $V_{t}=\|X_{t}-x_{t}\|^{2}$ . Then a direct application of Itô's lemma [35] yields

\begin{split}dV_{t}&=2\left(X_{t}-x_{t}\right)^{\mathsf{T}}(f(X_{t},u_{t},t)-f(x_{t},u_{t},t))dt\\ &\quad+\mathrm{tr}(g_{t}^{\mathsf{T}}g_{t})dt+2(X_{t}-x_{t})^{\mathsf{T}}g_{t}dW_{t}.\end{split}

(7)

Following standard Itô calculus, for every $t,h\geq 0$ ,

\begin{split}\mathbb{E}(V_{t+h})-\mathbb{E}(V_{t})&=\mathbb{E}\left(\int_{t}^{t+h}dV_{s}\right)\\ &\leq\int_{t}^{t+h}\mathbb{E}(dV_{s})\\ &\leq\int_{t}^{t+h}(2c_{s}\mathbb{E}(\|X_{s}-x_{s}\|^{2})+n\sigma^{2}_{s})ds\\ &=\int_{t}^{t+h}\left(2c_{s}\mathbb{E}(V_{s})+n\sigma^{2}_{s}\right)ds,\end{split}

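where the first inequality holds by the triangle inequality and the second inequality holds by Lemma II.1 together with Assumption 1(ii), which gives $\mathrm{tr}(g_{t}^{\mathsf{T}}g_{t})\leq n\sigma_{t}^{2}$ , and the fact that the Itô integral term has zero expectation. Dividing both sides by $h$ and taking the limsup as $h\to 0^{+}$ , for every $t\geq 0$ , we get

\displaystyle D^{+}\mathbb{E}(V_{t})\leq 2c_{t}\mathbb{E}(V_{t})+n\sigma_{t}^{2},\quad V_{0}=0,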

(8)

where $D^{+}$ is the upper Dini derivative with respect to $t$ . By the generalized Grönwall-Bellman lemma [39, Appendix A1, Proposition 4], we obtain the expectation bound

\mathbb{E}(\|X_{t}-x_{t}\|^{2})=\mathbb{E}(V_{t})\leq n\Psi_{t},

(9)

where


\displaystyle\Psi_{t}=e^{2\psi_{t}}\int_{0}^{t}\sigma_{\tau}^{2}e^{-2\psi_{\tau}}d\tau,

(10a)

\displaystyle\psi_{t}=\int_{0}^{t}c_{\tau}d\tau.

(10b)

Applying Markov's inequality to the expectation bound (9), we obtain the probabilistic bound

\mathbb{P}\left(\|X_{t}-x_{t}\|\leq\sqrt{\frac{n}{\delta}\Psi_{t}}\right)=\mathbb{P}\left(V_{t}\leq\frac{n}{\delta}\Psi_{t}\right)\geq 1-\delta

(11)

for any $\delta\in(0,1)$ .
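The quantities in (10) and the radius in (11) are straightforward to evaluate numerically. The following sketch, with hypothetical function names, integrates $c_{t}$ and $\sigma_{t}$ with the trapezoidal rule on a time grid and compares the result with the closed form $\Psi_{t}=\sigma^{2}(e^{2ct}-1)/(2c)$ that (10) gives for constant $c\neq 0$ and $\sigma$. The particular values of $c$, $\sigma$, $n$, and $\delta$ are arbitrary.

```python
import numpy as np

def psi_and_Psi(c, sigma, t_grid):
    """Evaluate psi_t (10b) and Psi_t (10a) on t_grid with the trapezoidal rule."""
    c_vals, s_vals = c(t_grid), sigma(t_grid)
    dt = np.diff(t_grid)
    # psi_t = int_0^t c_tau dtau
    psi = np.concatenate(([0.0], np.cumsum(0.5 * (c_vals[1:] + c_vals[:-1]) * dt)))
    # Psi_t = e^{2 psi_t} * int_0^t sigma_tau^2 e^{-2 psi_tau} dtau
    integrand = s_vals**2 * np.exp(-2.0 * psi)
    inner = np.concatenate(([0.0], np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * dt)))
    return psi, np.exp(2.0 * psi) * inner

def markov_radius(n, delta, Psi_t):
    """Radius sqrt(n * Psi_t / delta) appearing in (11)."""
    return np.sqrt(n * Psi_t / delta)

# Hypothetical example: constant c_t = -1 and sigma_t = 0.5 on [0, 2].
t_grid = np.linspace(0.0, 2.0, 2001)
psi, Psi = psi_and_Psi(lambda t: -1.0 * np.ones_like(t),
                       lambda t: 0.5 * np.ones_like(t), t_grid)
Psi_closed = 0.5**2 * (np.exp(2 * (-1.0) * 2.0) - 1.0) / (2 * (-1.0))
r = markov_radius(n=2, delta=0.01, Psi_t=Psi[-1])
print(Psi[-1], Psi_closed, r)
```

Passing $c_{t}$ and $\sigma_{t}$ as callables keeps the sketch usable for time-varying bounds, at the cost of a discretization error that shrinks with the grid spacing.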

IV-B Limitations of Expectation Bound

The bound (11) based on the expectation bound (9) turns out to be loose. To see this, consider the linear time-invariant (LTI) stochastic system

\displaystyle dX_{t}=(AX_{t}+Bu_{t})dt+\sigma dW_{t}

(12)

and the associated deterministic system

\displaystyle\dot{x}_{t}=Ax_{t}+Bu_{t}.

(13)

In this case, the bound (11) reads

\mathbb{P}\left(\|X_{t}-x_{t}\|\leq r_{\delta,t}^{(1)}\right)\geq 1-\delta,

(14)

where $r_{\delta,t}^{(1)}=\sqrt{\frac{n\sigma^{2}(e^{2ct}-1)}{2c\delta}}$ with $c=\mu(A)$ .

On the other hand, when initialized at $X_{0}=x_{0}$ , $X_{t}$ is a Gaussian random variable [35] with mean $\mathbb{E}(X_{t})=x_{t}$ and covariance matrix

\displaystyle\text{cov}(X_{t})=\int_{0}^{t}\sigma^{2}e^{A(t-\tau)}e^{A^{\mathsf{T}}(t-\tau)}d\tau.

(15)

Invoking the fact that $\|e^{At}\|\leq e^{\mu(A)t}$ for any $t\geq 0$ [40], $\text{cov}(X_{t})$ can be bounded as

\begin{split}\text{cov}(X_{t})&\preceq\int_{0}^{t}\sigma^{2}\|e^{A(t-\tau)}\|\|e^{A^{\mathsf{T}}(t-\tau)}\|d\tau\,I_{n}\\ &\preceq\sigma^{2}\int_{0}^{t}e^{2c(t-\tau)}d\tau\,I_{n}\\ &=\tfrac{\sigma^{2}}{2c}(e^{2ct}-1)\,I_{n}.\end{split}

(16)

By the concentration property of Gaussian distribution [41, Chapter 7], for any $\delta\in(0,1)$ , with probability at least $1-\delta$ ,

\|X_{t}-x_{t}\|\leq\sqrt{\|\text{cov}(X_{t})\|}(4\sqrt{n}+2\sqrt{2\log(1/\delta)}).

(17)

Plugging (16) into (17) yields

\displaystyle\mathbb{P}\left(\|X_{t}-x_{t}\|\leq r^{(2)}_{\delta,t}\right)\geq 1-\delta,

(18)

where $r_{\delta,t}^{(2)}=\sqrt{\tfrac{\sigma^{2}}{2c}(e^{2ct}-1)}(4\sqrt{n}+2\sqrt{2\log(1/\delta)})$ .

The bound (18) is substantially better than (14). While the dependencies of $r_{\delta,t}^{(1)}$ and $r_{\delta,t}^{(2)}$ on $c$ and $n$ are of the same order, the dependency of $r^{(2)}_{\delta,t}$ on $\delta$ is $\mathcal{O}(\sqrt{\log(1/\delta)})$ , much better than the $\mathcal{O}\left(\sqrt{1/\delta}\right)$ dependency of $r^{(1)}_{\delta,t}$ on $\delta$ . For small $\delta$ (e.g., $10^{-10}$ ), which is crucial for safety-critical systems, $\sqrt{\log(1/\delta)}$ is significantly smaller than $\sqrt{1/\delta}$ ( $4.80$ vs. $10^{5}$ ). As a result, the probabilistic reachable set based on the bound (11) can be conservative in practice.
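The gap between the two radii can be tabulated directly from their closed forms. The sketch below evaluates $r^{(1)}_{\delta,t}$ from (14) and $r^{(2)}_{\delta,t}$ from (18) for a few probability levels; the function names and the values of $n$, $\sigma$, $c$, and $t$ are hypothetical, and $c\neq 0$ is assumed so that the closed forms apply.

```python
import numpy as np

def r_markov(n, sigma, c, t, delta):
    """Radius r^(1) in (14), obtained from the expectation bound and Markov's inequality."""
    return np.sqrt(n * sigma**2 * (np.exp(2 * c * t) - 1) / (2 * c * delta))

def r_gaussian(n, sigma, c, t, delta):
    """Radius r^(2) in (18), obtained from Gaussian concentration for the LTI system."""
    scale = np.sqrt(sigma**2 * (np.exp(2 * c * t) - 1) / (2 * c))
    return scale * (4 * np.sqrt(n) + 2 * np.sqrt(2 * np.log(1 / delta)))

# Hypothetical parameters (c must be nonzero for these closed forms).
n, sigma, c, t = 2, 0.1, -1.0, 1.0
for delta in (1e-2, 1e-4, 1e-6, 1e-10):
    print(delta, r_markov(n, sigma, c, t, delta), r_gaussian(n, sigma, c, t, delta))

# The two delta-dependent factors compared in the text, at delta = 1e-10.
factor_log = np.sqrt(np.log(1e10))
factor_inv = np.sqrt(1e10)
print(factor_log, factor_inv)
```

With these parameters the ratio $r^{(1)}_{\delta,t}/r^{(2)}_{\delta,t}$ exceeds $10^{3}$ at $\delta=10^{-10}$. For moderate levels such as $\delta=0.5$ the constants in (17) make $r^{(2)}_{\delta,t}$ the larger of the two, so the advantage is specific to the small-$\delta$ regime.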

Thus, there is a significant gap between the result (11) for nonlinear dynamics and probabilistic bounds for linear dynamics. The limitation of the expectation bound primarily lies in the quadratic Lyapunov function $V_{t}=\|X_{t}-x_{t}\|^{2}$ . The analysis focuses only on the evolution of the second order moment $\mathbb{E}(\|X_{t}-x_{t}\|^{2})$ . It can at best guarantee a probabilistic bound for $\|X_{t}-x_{t}\|$ of order $\mathcal{O}(\sqrt{1/\delta})$ via Markov inequality. This gives rise to the question: is the gap fundamental or an artifact of the analysis?

V Probabilistic Bound on Stochastic Deviation

In this section, we answer the aforementioned question by establishing a probabilistic bound for the stochastic deviation $\|X_{t}-x_{t}\|$ of order $\mathcal{O}(\sqrt{\log(1/\delta)})$ for general nonlinear stochastic systems (3) under Assumption 1. We further show our bound is consistent with that for linear systems under the same assumption and is thus tight.

V-A Sub-Gaussian and MGF

The analysis (15)-(18) relying on the Gaussianity for linear systems cannot be applied to (3) since $X_{t}$ is not necessarily Gaussian for nonlinear systems. Fortunately, the norm concentration property (17) holds not only for Gaussian random vectors (distributions) but also for a wider class of random vectors known as sub-Gaussian vectors (distributions).

Definition V.1.

A random variable $X\in\mathbb{R}^{n}$ is said to be sub-Gaussian with variance proxy $\sigma^{2}$ , denoted as $X\sim subG(\sigma^{2})$ , if $\mathbb{E}_{X}(X)=0$ and

\mathbb{E}_{X}\left(e^{\lambda\langle\ell,X\rangle}\right)\leq e^{\frac{\lambda^{2}\sigma^{2}}{2}},~\forall\lambda\in\mathbb{R},~\forall\ell\in\mathcal{S}^{n-1}.

(19)

Many distributions including Gaussian distribution, zero-mean uniform distribution, and any zero-mean distribution with bounded support are instances of sub-Gaussian distributions. For Gaussian distribution, the variance proxy $\sigma^{2}$ is $\|\text{cov}(X)\|$ .

Sub-Gaussian distributions share the same norm concentration property as Gaussian distributions. For the sake of completeness, we present a version of the concentration property and its proof in Appendix -A.

Lemma V.1.

Let $X\in\mathbb{R}^{n}$ be a sub-Gaussian random vector with variance proxy $\sigma^{2}$ , then for any $\delta\in(0,1)$ and any $\varepsilon\in(0,1)$ ,

\|X\|\leq\sigma\sqrt{\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta)}

(20)

holds with probability at least $1-\delta$ , where

\varepsilon_{1}=\frac{2\log(1+2/\varepsilon)}{(1-\varepsilon)^{2}},~\varepsilon_{2}=\frac{2}{(1-\varepsilon)^{2}}.

(21)

Lemma V.1 states a probabilistic bound of the norm $\|X\|$ of a sub-Gaussian random vector that scales as $\mathcal{O}(\sqrt{n})$ and $\mathcal{O}(\sqrt{\log(1/\delta)})$ , the same as (17). The parameter $\varepsilon$ can be selected according to the values of $n,\delta$ to minimize the bound. When $\varepsilon=0.5$ , $\varepsilon_{1}=8\log 5\approx 12.9\leq 16$ and $\varepsilon_{2}=8$ . Since $\sigma^{2}=\|\text{cov}(X)\|$ for Gaussian, (20) becomes (17) after bounding $\varepsilon_{1}$ by $16$ and applying $\sqrt{a+b}\leq\sqrt{a}+\sqrt{b}$ . The dependence $\mathcal{O}(\sqrt{n})$ and $\mathcal{O}(\sqrt{\log(1/\delta)})$ in Lemma V.1 is tight, but the expressions of $\varepsilon_{1},\varepsilon_{2}$ in (21) are constructed in the proof and are by no means optimal, especially for specific values of $n$ . For example, when the dimension $n=1$ , by Hoeffding’s Inequality [42, Chapter 1.2], a better choice is $\varepsilon_{1}=2\log 2$ and $\varepsilon_{2}=2$ .
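The trade-off in choosing $\varepsilon$ can be explored numerically. The sketch below evaluates the constants in (21) and the bound (20), and picks $\varepsilon$ by a simple grid search; the grid, the function names, and the values of $n$ and $\delta$ are assumptions made for illustration, and natural logarithms are used throughout.

```python
import numpy as np

def subgaussian_constants(eps):
    """Constants (eps_1, eps_2) from (21) for a given eps in (0, 1)."""
    eps1 = 2.0 * np.log(1.0 + 2.0 / eps) / (1.0 - eps) ** 2
    eps2 = 2.0 / (1.0 - eps) ** 2
    return eps1, eps2

def norm_bound(sigma, n, delta, eps):
    """Right-hand side of (20): sigma * sqrt(eps_1 * n + eps_2 * log(1/delta))."""
    eps1, eps2 = subgaussian_constants(eps)
    return sigma * np.sqrt(eps1 * n + eps2 * np.log(1.0 / delta))

def best_eps(n, delta, grid=None):
    """Grid search for the eps that minimizes the bound (20); sigma only rescales it."""
    grid = np.linspace(0.01, 0.99, 981) if grid is None else grid
    return grid[np.argmin(norm_bound(1.0, n, delta, grid))]

eps1_half, eps2_half = subgaussian_constants(0.5)
eps_star = best_eps(n=10, delta=1e-6)
print(eps1_half, eps2_half, eps_star,
      norm_bound(1.0, 10, 1e-6, 0.5), norm_bound(1.0, 10, 1e-6, eps_star))
```

Small $\varepsilon$ inflates the $\log(1+2/\varepsilon)$ term that multiplies $n$, while $\varepsilon$ close to $1$ inflates both constants through $(1-\varepsilon)^{-2}$, so the minimizer depends on the relative size of $n$ and $\log(1/\delta)$.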

To show a random variable $X$ is sub-Gaussian, one needs to verify $\mathbb{E}_{X}(X)=0$ and the inequality (19). Note that the left-hand side of (19) is the Moment Generating Function (MGF) [42, Chapter 1.1]

\mathbb{E}_{X}\left(M_{\lambda,\ell}(X)\right):=\mathbb{E}_{X}\left(e^{\lambda\langle\ell,X\rangle}\right),\quad\ell\in\mathcal{S}^{n-1},

(22)

a common tool for concentration analysis. One advantage of the MGF compared with the second-order moment used in Section IV is that the MGF captures high-order information, and this is a major reason why MGF is useful for analyzing concentration properties.

Thus, a potential approach to bound $\|X_{t}-x_{t}\|$ is to show $X_{t}-x_{t}$ is sub-Gaussian. Unfortunately, this is not true in general. For associated trajectories $X_{t}$ and $x_{t}$ , $\mathbb{E}(X_{t})\neq x_{t}$ for general nonlinear dynamics [35, Chapter 5.5], so the zero-mean requirement fails. Moreover, (19) requires bounding the evolution of the MGF for all $\ell\in\mathcal{S}^{n-1}$ , which can be too strong a requirement.

V-B Averaged Moment Generating Function

Inspired by the concentration properties of sub-Gaussian distributions and the limitations of MGF, we propose a weaker version of MGF termed the Averaged Moment Generating Function (AMGF) for probabilistic reachability analysis.

Definition V.2 (AMGF).

Given $\lambda\in\mathbb{R}$ , the Averaged Moment Generating Function $\Phi_{n,\lambda}:\mathbb{R}^{n}\to\mathbb{R}$ is defined as

\mathbb{E}_{X}(\Phi_{n,\lambda}(X)):=\mathbb{E}_{X}\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\langle\ell,X\rangle}\right).

(23)

The AMGF is an average of the MGF over directions $\ell$ drawn uniformly from the sphere $\mathcal{S}^{n-1}$ . It was recently proposed in [43] to study sampling problems. Thanks to the averaging, bounding the AMGF is easier than bounding the MGF for each $\ell$ . The AMGF can also be viewed as an MGF by replacing the exponential energy function $e^{\lambda\langle\ell,x\rangle}$ by $\Phi_{n,\lambda}(x)=\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\langle\ell,x\rangle}\right)$ . This energy function $\Phi_{n,\lambda}$ has several intriguing properties.

Lemma V.2 (Properties of $\Phi_{n,\lambda}$ ).

The following statements hold for $\Phi_{n,\lambda}$ in (23):

(i)

Rotation invariance: For any $x\in\mathbb{R}^{n}$ and $\eta\in\mathcal{S}^{n-1}$ ,

\displaystyle\Phi_{n,\lambda}(x)=\Phi_{n,\lambda}(\|x\|\eta).

(ii)

Monotonicity: For any $x,y\in\mathbb{R}^{n}$ such that $\|x\|\leq\|y\|$ ,

1\leq\Phi_{n,\lambda}(x)\leq\Phi_{n,\lambda}(y).
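The two properties in Lemma V.2 can be checked numerically in a small case. The sketch below is only an illustration under stated assumptions: it fixes $n=2$, where the uniform average over the circle can be approximated by an equispaced grid of directions, and the function name `amgf_2d` is hypothetical.

```python
import numpy as np

def amgf_2d(x, lam, m=256):
    """Phi_{2,lam}(x): average of exp(lam * <l, x>) over the unit circle.

    The average is approximated with m equispaced directions, which is very
    accurate for this smooth periodic integrand.
    """
    theta = 2.0 * np.pi * np.arange(m) / m                   # grid of angles
    ell = np.stack([np.cos(theta), np.sin(theta)], axis=1)   # shape (m, 2)
    return float(np.mean(np.exp(lam * ell @ np.asarray(x, dtype=float))))

if __name__ == "__main__":
    lam = 2.0
    x = np.array([1.2, -0.5])
    eta = np.array([0.0, 1.0])               # an arbitrary unit vector
    y = np.linalg.norm(x) * eta              # same norm, different direction

    print("rotation invariance:", amgf_2d(x, lam), amgf_2d(y, lam))
    print("monotonicity:", 1.0, amgf_2d(0.5 * x, lam), amgf_2d(x, lam))
```

For $n=2$ the average over the circle is the modified Bessel function $I_{0}(\lambda\|x\|)$, which `np.i0` evaluates, so the quadrature can be cross-checked against that value.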

Lemma V.2 implies that $\Phi_{n,\lambda}(x)$ depends only on the norm $\|x\|$ of $x$ and is monotonically non-decreasing in $\|x\|$ . For a non-expanding deterministic system (4), that is, $\mu(D_{x}f(x,u,t))\leq 0$ , these properties imply that $\Phi_{n,\lambda}(x_{t}-y_{t})$ is non-increasing in time for any two trajectories $x_{t},y_{t}$ driven by the same input. This can be formalized as follows.

Lemma V.3.

Consider the deterministic system (4) such that $\mu(D_{x}f(x,u,t))\leq 0$ for every $(x,u,t)\in\mathbb{R}^{n}\times\mathcal{U}\times\mathbb{R}_{\geq 0}$ . Then, for any $x,y\in\mathbb{R}^{n}$ , $u\in\mathcal{U}$ and $t\geq 0$ :

\displaystyle\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\langle\ell,x-y\rangle}\lambda\ell^{\mathsf{T}}(f(x,u,t)-f(y,u,t))\right)\leq 0.

An intriguing fact about AMGF is that it induces the same concentration property as MGF.

Lemma V.4.

If a random variable $X\in\mathbb{R}^{n}$ satisfies

\mathbb{E}_{X}\left(\Phi_{n,\lambda}(X)\right)\leq e^{\frac{\lambda^{2}\sigma^{2}}{2}},~{}\forall\lambda\in\mathbb{R},

(24)

then for any $\delta\in(0,1)$ and any $\varepsilon\in(0,1)$ , (20) holds with probability at least $1-\delta$ .

At first sight, this is counter-intuitive, since upper-bounding the AMGF is weaker than upper-bounding the MGF for all $\ell$ . To see why Lemma V.4 holds, define an intermediate random variable $\tilde{X}=QX$ where $Q\sim\mathbb{U}^{n}$ is a random unitary matrix, drawn uniformly and independently of $X$ , with $\mathbb{U}^{n}$ denoting the set of all the unitary matrices in $\mathbb{R}^{n\times n}$ . Then, for any fixed $\ell\in\mathcal{S}^{n-1}$ , the AMGF over $X$ is equal to the MGF over $\tilde{X}$ , that is, $\mathbb{E}_{X}\left(\Phi_{n,\lambda}(X)\right)=\mathbb{E}_{\tilde{X}}\left(e^{\lambda\langle\ell,\tilde{X}\rangle}\right)$ . This means $\tilde{X}$ is sub-Gaussian with variance proxy $\sigma^{2}$ . Lemma V.4 then follows by noticing that the transformation $\tilde{X}=QX$ does not affect the norm.

V-C Theoretical Analysis

Equipped with the AMGF, we are ready to establish a tighter probabilistic bound for the stochastic deviation $\|X_{t}-x_{t}\|$ . Thanks to Lemma V.4, it suffices to bound the evolution of the AMGF $\mathbb{E}(\Phi_{n,\lambda}(X_{t}-x_{t}))$ over time. Below we establish a probabilistic bound of the stochastic deviation of order $\mathcal{O}(\sqrt{\log(1/\delta)})$ for the stochastic system (3) satisfying Assumption 1 by developing a tight bound of $\mathbb{E}(\Phi_{n,\lambda}(X_{t}-x_{t}))$ .

Theorem 1.

Consider the stochastic system (3) and the deterministic system (4) under Assumption 1. Let $X_{t}$ be a trajectory of (3) and $x_{t}$ be an associated trajectory of (4) with the same initial condition $x_{0}$ and input $u_{t}:\mathbb{R}_{\geq 0}\to\mathcal{U}$ . Then, for any $t>0$ , $\delta\in(0,1)$ and $\varepsilon\in(0,1)$ ,

\|X_{t}-x_{t}\|\leq\sqrt{\Psi_{t}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))},

(25)

holds with probability at least $1-\delta$ , where $\Psi_{t}$ is as in (10) and $\varepsilon_{1}$ , $\varepsilon_{2}$ are given by (21).

Proof.

We start with a special case where Assumption 1 holds with a global matrix measure bound $c_{t}=0$ and then generalize it to cases where Assumption 1 holds with arbitrary $c_{t}$ .

V-C1 Special Case

Denote $v_{t}=X_{t}-x_{t}$ and $\beta_{t}=f(X_{t},u_{t},t)-f(x_{t},u_{t},t)$ , then

dv_{t}=\beta_{t}dt+g_{t}dW_{t}.

(26)

Based on the Fokker–Planck equation [35], $h_{t}=\mathbb{E}(\Phi_{n,\lambda}(v_{t}))$ satisfies

\frac{dh_{t}}{dt}=\mathbb{E}\left(\langle\nabla\Phi_{n,\lambda}(v_{t}),\beta_{t}\rangle\right)+\tfrac{1}{2}\mathbb{E}\left(\langle\nabla^{2}\Phi_{n,\lambda}(v_{t}),g_{t}g_{t}^{\mathsf{T}}\rangle\right).

(27)

By (23),

\mathbb{E}\left(\langle\nabla\Phi_{n,\lambda}(v_{t}),\beta_{t}\rangle\right)=\mathbb{E}\,\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\langle\ell,v_{t}\rangle}\lambda\ell^{\mathsf{T}}\beta_{t}\right).

(28)

Applying Lemma V.3 with $x=X_{t}$ and $y=x_{t}$ , we obtain

\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\langle\ell,v_{t}\rangle}\lambda\ell^{\mathsf{T}}\beta_{t}\right)\leq 0.

(29)

Then

\mathbb{E}\left(\langle\nabla\Phi_{n,\lambda}(v_{t}),\beta_{t}\rangle\right)\leq 0

(30)

follows by taking the expectation of (29).

The term $\frac{1}{2}\mathbb{E}\left(\langle\nabla^{2}\Phi_{n,\lambda}(v_{t}),g_{t}g_{t}^{\mathsf{T}}\rangle\right)$ can be bounded as

\begin{split}\tfrac{1}{2}&\mathbb{E}\left(\langle\nabla^{2}\Phi_{n,\lambda}(v_{t}),g_{t}g_{t}^{\mathsf{T}}\rangle\right)\\ =&\tfrac{1}{2}\mathbb{E}\,\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(\langle\lambda^{2}e^{\lambda\langle\ell,v_{t}\rangle}\ell\ell^{\mathsf{T}},g_{t}g_{t}^{\mathsf{T}}\rangle\right)\\ \leq&\tfrac{1}{2}\mathbb{E}\,\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(\lambda^{2}e^{\lambda\langle\ell,v_{t}\rangle}\mathrm{tr}(\ell\ell^{\mathsf{T}})\,\|g_{t}g_{t}^{\mathsf{T}}\|\right)\\ \leq&\frac{\lambda^{2}\sigma_{t}^{2}}{2}\mathbb{E}\left(\Phi_{n,\lambda}(v_{t})\right)=\frac{\lambda^{2}\sigma_{t}^{2}}{2}h_{t}\end{split}

(31)

where the first inequality follows from the Cauchy–Schwarz inequality and the last line uses the fact that $\mathrm{tr}(\ell\ell^{\mathsf{T}})=1$ for any $\ell\in\mathcal{S}^{n-1}$ and $\|g_{t}g_{t}^{\mathsf{T}}\|\leq\sigma_{t}^{2}$ as in Assumption 1.

Plugging (30) and (31) into (27) we arrive at

\frac{dh_{t}}{dt}\leq\frac{\lambda^{2}\sigma_{t}^{2}}{2}h_{t},\quad h_{0}=1,

(32)

and using the Grönwall inequality [44], we conclude

\mathbb{E}\left(\Phi_{n,\lambda}(X_{t}-x_{t})\right)=h_{t}\leq e^{\frac{\lambda^{2}\int_{0}^{t}\sigma_{\tau}^{2}d\tau}{2}}.

(33)

By Lemma V.4, (33) implies that, for any $\delta\in(0,1)$ , with probability at least $1-\delta$ ,

\|X_{t}-x_{t}\|\leq\sqrt{(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))\int_{0}^{t}\sigma_{\tau}^{2}d\tau},

(34)

where $\varepsilon_{1},\varepsilon_{2}$ satisfy (21). Since $\Psi_{t}=\int_{0}^{t}\sigma_{\tau}^{2}d\tau$ when $c_{t}=0$ by definition, (34) corresponds to (25). This completes the proof in the special case.

V-C2 General Cases

Next we consider the general cases where Assumption 1 holds with $\mu(D_{x}f(x,u,t))\leq c_{t}$ for arbitrary $c_{t}\in\mathbb{R}$ . The strategy is to convert them into the above special case via scaling. Define scaled trajectories $\tilde{X}_{t}=e^{-\psi_{t}}X_{t}$ and $\tilde{x}_{t}=e^{-\psi_{t}}x_{t}$ where $\psi_{t}=\int_{0}^{t}c_{\tau}d\tau$ , then $\tilde{x}_{t}$ is a trajectory of the deterministic system

\dot{\tilde{x}}_{t}=-c_{t}\tilde{x}_{t}+e^{-\psi_{t}}f(e^{\psi_{t}}\tilde{x}_{t},u_{t},t)=:\tilde{f}(\tilde{x}_{t},u_{t},t).

(35)

Similarly, $\tilde{X}_{t}$ satisfies

d\tilde{X}_{t}=\tilde{f}(\tilde{X}_{t},u_{t},t)dt+e^{-\psi_{t}}g_{t}dW_{t}.

(36)

Note that (36) and (35) have the same drift dynamics $\tilde{f}$ . For any $\tilde{x}_{t},\tilde{y}_{t}\in\mathbb{R}^{n}$ , writing $x_{t}=e^{\psi_{t}}\tilde{x}_{t}$ and $y_{t}=e^{\psi_{t}}\tilde{y}_{t}$ , $\tilde{f}$ satisfies

\begin{split}&(\tilde{x}_{t}-\tilde{y}_{t})^{\mathsf{T}}\left(\tilde{f}(\tilde{x}_{t},u_{t},t)-\tilde{f}(\tilde{y}_{t},u_{t},t)\right)\\ =&(\tilde{x}_{t}-\tilde{y}_{t})^{\mathsf{T}}\left(-c_{t}(\tilde{x}_{t}-\tilde{y}_{t})+e^{-\psi_{t}}\left(f(x_{t},u_{t},t)-f(y_{t},u_{t},t)\right)\right)\\ =&-c_{t}\|\tilde{x}_{t}-\tilde{y}_{t}\|^{2}+e^{-2\psi_{t}}(x_{t}-y_{t})^{\mathsf{T}}\left(f(x_{t},u_{t},t)-f(y_{t},u_{t},t)\right)\\ \leq&-c_{t}\|\tilde{x}_{t}-\tilde{y}_{t}\|^{2}+e^{-2\psi_{t}}c_{t}\|x_{t}-y_{t}\|^{2}\\ =&-c_{t}\|\tilde{x}_{t}-\tilde{y}_{t}\|^{2}+c_{t}e^{-2\psi_{t}}e^{2\psi_{t}}\|\tilde{x}_{t}-\tilde{y}_{t}\|^{2}=0,\end{split}

meaning Assumption 1 holds for the scaled systems (35) and (36) with $\tilde{c}_{t}=0$ and the results for the special case can be applied. The diffusion coefficient of (36) satisfies $\|e^{-2\psi_{t}}g_{t}g_{t}^{\mathsf{T}}\|\leq e^{-2\psi_{t}}\sigma_{t}^{2}=:\tilde{\sigma}_{t}^{2}$ . Applying (34) to the scaled dynamics (36) we have that with probability at least $1-\delta$ ,

\|\tilde{X}_{t}-\tilde{x}_{t}\|\leq\sqrt{(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))\int_{0}^{t}\tilde{\sigma}_{\tau}^{2}d\tau}.

(37)

Recalling $X_{t}=e^{\psi_{t}}\tilde{X}_{t}$ , $x_{t}=e^{\psi_{t}}\tilde{x}_{t}$ , and $\Psi_{t}$ in (10), we conclude that with probability at least $1-\delta$ ,

\begin{split}\|X_{t}-x_{t}\|\leq&\sqrt{(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))e^{2\psi_{t}}\int_{0}^{t}\sigma_{\tau}^{2}e^{-2\psi_{\tau}}d\tau}\\ =&\sqrt{\Psi_{t}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))},\end{split}

which completes the proof. ∎

Remark V.1.

When Assumption 1 holds with time-invariant $c_{t}\equiv c$ and $\sigma_{t}\equiv\sigma$ , $\psi_{t}$ defined in (10) becomes $\psi_{t}=ct$ , and (25) in Theorem 1 reduces to

\|X_{t}-x_{t}\|\leq\sqrt{\frac{\sigma^{2}(e^{2ct}-1)}{2c}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}.

(38)

The probabilistic bound in Theorem 1 depends strongly on the contraction rate of the dynamics. The bounds (25) and (38) resemble the input-to-state bounds used in contraction-based reachability of deterministic systems [28]. Thus, our results can be viewed as the stochastic counterpart of the deterministic incremental input-to-state bounds in contraction theory.
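To make the role of $c_{t}$ and $\sigma_{t}$ concrete, the following sketch evaluates $\Psi_{t}=e^{2\psi_{t}}\int_{0}^{t}\sigma_{\tau}^{2}e^{-2\psi_{\tau}}d\tau$ (the expression appearing in the last step of the proof of Theorem 1) by trapezoidal quadrature and compares it with the closed form in (38). The function names and the constants $c=-0.4$, $\sigma=\sqrt{2}$, $t=1.5$ are illustrative assumptions only.

```python
import numpy as np

def _cumtrapz(y, x):
    """Cumulative trapezoidal integral of samples y on grid x, starting at 0."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate([[0.0], np.cumsum(steps)])

def psi_capital(t, c, sigma, num=4001):
    """Psi_t for callables c(tau) and sigma(tau), via numerical quadrature."""
    tau = np.linspace(0.0, t, num)
    c_vals = np.broadcast_to(c(tau), tau.shape).astype(float)
    s_vals = np.broadcast_to(sigma(tau), tau.shape).astype(float)
    psi = _cumtrapz(c_vals, tau)                  # psi_tau = int_0^tau c
    integrand = s_vals**2 * np.exp(-2.0 * psi)
    return float(np.exp(2.0 * psi[-1]) * _cumtrapz(integrand, tau)[-1])

def deviation_radius(Psi, n, delta, eps):
    """Right-hand side of (25) with eps1, eps2 from (21)."""
    eps1 = 2.0 * np.log(1.0 + 2.0 / eps) / (1.0 - eps) ** 2
    eps2 = 2.0 / (1.0 - eps) ** 2
    return float(np.sqrt(Psi * (eps1 * n + eps2 * np.log(1.0 / delta))))

if __name__ == "__main__":
    c0, s0, t = -0.4, np.sqrt(2.0), 1.5           # illustrative constants
    Psi_num = psi_capital(t, lambda tau: c0, lambda tau: s0)
    Psi_closed = s0**2 * np.expm1(2.0 * c0 * t) / (2.0 * c0)   # from (38)
    print(Psi_num, Psi_closed)
    print(deviation_radius(Psi_num, n=2, delta=1e-3, eps=1 / 16))
```

Passing time-varying callables for `c` and `sigma` covers the general case of Theorem 1, while the constant case recovers (38) up to quadrature error.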

V-D Extension to Weighted Norm

The probabilistic bound of the stochastic deviation $\|X_{t}-x_{t}\|$ in Theorem 1 can be extended to bound the weighted deviation $\|X_{t}-x_{t}\|_{P}$ for any positive-definite matrix $P$ . To this end, define the weighted matrix measure of a matrix $A$ as

\mu_{P}(A)=\lim_{\epsilon\to 0^{+}}\frac{\|I_{n}+\epsilon A\|_{P}-1}{\epsilon},

which can be obtained using the expression $\mu_{P}(A)=\mu(P^{\frac{1}{2}}AP^{-\frac{1}{2}})$ [4]. Consider the systems (3) and (4) satisfying a modified version of Assumption 1 as $\mu_{P}(D_{x}f(x,u,t))\leq c_{t}$ and $P^{\frac{1}{2}}g_{t}g_{t}^{\mathsf{T}}P^{\frac{1}{2}}\preceq\sigma_{t}^{2}I_{n}$ .

This setting with weighted norm can be converted to the unweighted version in Section V-C through a coordinate transformation. More specifically, given associated trajectories $X_{t},x_{t}$ of (3) and (4), define $\hat{X}_{t}=P^{\frac{1}{2}}X_{t}$ , $\hat{x}_{t}=P^{\frac{1}{2}}x_{t}$ , then $\hat{X}_{t}$ and $\hat{x}_{t}$ satisfy

d\hat{X}_{t}=\hat{f}(\hat{X}_{t},u_{t},t)dt+\hat{g}_{t}dW_{t},

(39)

\dot{\hat{x}}_{t}=\hat{f}(\hat{x}_{t},u_{t},t)

(40)

with $\hat{f}(\hat{x},u,t)=P^{\frac{1}{2}}f(P^{-\frac{1}{2}}\hat{x},u,t)$ and $\hat{g}_{t}=P^{\frac{1}{2}}g_{t}$ . By definition, $\mu_{P}(A)=\mu(P^{\frac{1}{2}}AP^{-\frac{1}{2}})$ for any matrix $A$ , thus $\mu(D_{\hat{x}}\hat{f}(\hat{x},u,t))=\mu_{P}(D_{x}f(x,u,t))\leq c_{t}$ . Moreover, $\hat{g}_{t}\hat{g}_{t}^{\mathsf{T}}=P^{\frac{1}{2}}g_{t}g_{t}^{\mathsf{T}}P^{\frac{1}{2}}\preceq\sigma_{t}^{2}I_{n}$ . Therefore, the systems (39) and (40) satisfy Assumption 1 with the standard $\ell_{2}$ -norm. Then, by Theorem 1, with probability at least $1-\delta$ ,

\|X_{t}-x_{t}\|_{P}=\|\hat{X}_{t}-\hat{x}_{t}\|\leq\sqrt{\Psi_{t}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}.

(41)

This extension to weighted norms can sometimes yield a tighter bound. Given a matrix $A$ , $\mu(A)$ can be much larger than the largest real part of the eigenvalues of $A$ . In contrast, with a proper positive-definite matrix $P$ , $\mu(P^{\frac{1}{2}}AP^{-\frac{1}{2}})$ can be made arbitrarily close to the largest real part of the eigenvalues of $A$ [4, Chapter 2.7]. In this circumstance, working with the weighted norm can lead to sharper results.
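The gap between $\mu(A)$ and $\mu_{P}(A)$ can be seen on a small example. The sketch below uses the standard fact that the $\ell_{2}$ matrix measure equals the largest eigenvalue of $(A+A^{\mathsf{T}})/2$ together with the identity $\mu_{P}(A)=\mu(P^{\frac{1}{2}}AP^{-\frac{1}{2}})$ stated above. The matrix $A$ and weight $P$ are hypothetical values chosen for illustration, not taken from the experiments.

```python
import numpy as np

def matrix_measure(A):
    """l2 matrix measure: largest eigenvalue of the symmetric part of A."""
    A = np.asarray(A, dtype=float)
    return float(np.linalg.eigvalsh(0.5 * (A + A.T))[-1])

def weighted_matrix_measure(A, P):
    """mu_P(A) = mu(P^{1/2} A P^{-1/2}) for a symmetric positive-definite P."""
    w, V = np.linalg.eigh(np.asarray(P, dtype=float))
    P_half = (V * np.sqrt(w)) @ V.T          # symmetric square root of P
    P_half_inv = (V / np.sqrt(w)) @ V.T      # its inverse
    return matrix_measure(P_half @ np.asarray(A, dtype=float) @ P_half_inv)

if __name__ == "__main__":
    # Illustrative matrix: both eigenvalues are -1, but the strong
    # off-diagonal coupling makes the unweighted measure positive.
    A = np.array([[-1.0, 10.0],
                  [0.0, -1.0]])
    P = np.diag([1.0, 100.0])                # hypothetical weight
    print("mu(A)   =", matrix_measure(A))
    print("mu_P(A) =", weighted_matrix_measure(A, P))
```

In this example the unweighted measure suggests expansion even though the system is stable, while the weighted measure certifies a negative contraction rate.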

V-E Tightness of Probabilistic Bound

Finally, we show that the probabilistic bound in Theorem 1 is tight under Assumption 1 and it is impossible to achieve better probabilistic bounds than (25) without additional assumptions. In particular, we show that the bound (25) precisely captures the stochastic deviation of linear systems satisfying Assumption 1.

Consider the LTI stochastic system (12) and its associated deterministic system (13). They satisfy Assumption 1 with $c_{t}\equiv c=\mu(A)$ and $\sigma_{t}\equiv\sigma$ . By Theorem 1, with probability at least $1-\delta$ ,

\|X_{t}-x_{t}\|\leq\sqrt{\frac{\sigma^{2}(e^{2ct}-1)}{2c}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}.

(42)

This is the same, up to some constants chosen by convention, as the tight bound (18) calculated using Gaussian concentration properties [45, Chapter 4.4]. For Assumption 1 with time-varying $c_{t}$ and $\sigma_{t}$ , we can construct linear dynamics

\begin{split}dX_{t}&=c_{t}X_{t}dt+\sigma_{t}dW_{t},\\ \dot{x}_{t}&=c_{t}x_{t}.\end{split}

With the same initial condition $X_{0}=x_{0}$ , $X_{t}-x_{t}$ is a zero-mean Gaussian random variable with covariance $\Psi_{t}I_{n}$ where $\Psi_{t}$ is as in (10). The Gaussianity of $X_{t}-x_{t}$ leads to the same probabilistic bound as (25). Therefore, Theorem 1 is tight and cannot be improved.

While in this work the probabilistic bound in Theorem 1 is designed for reachability analysis, we emphasize that this very bound is one of the first non-conservative results that can quantitatively describe the behavior of a stochastic system. The bound is of independent interest beyond reachability analysis and can potentially impact many other areas such as estimation, uncertainty quantification, and finance.

VI Probabilistic Reachable Set

Equipped with the probabilistic bound (25) for the stochastic deviation, we are ready to present our approach to approximating the $\delta$ -PRS of a general nonlinear stochastic system (3). Recalling the separation strategy in Proposition 1, we can combine our tight bound (25) with any existing methods for approximating the DRS of the associated deterministic system (4) to estimate the $\delta$ -PRS of (3).

Theorem 2.

Consider the stochastic system (3) with initial set $\mathcal{X}_{0}\subseteq\mathbb{R}^{n}$ and input set $\mathcal{U}\subseteq\mathbb{R}^{p}$ . Suppose Assumption 1 holds. Let $\overline{\mathcal{R}}_{t}$ be an over-approximation of the DRS of the associated deterministic system (4). Then, for any probability level $\delta\in(0,1)$ , a $\delta$ -PRS of (3) is

\mathcal{R}_{\delta,t}=\overline{\mathcal{R}}_{t}\oplus\mathcal{B}^{n}\left(r_{\delta,t},0\right),

(43)

where $r_{\delta,t}=\sqrt{\Psi_{t}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}$ with $\Psi_{t}$ in (10) and $\varepsilon_{1},\varepsilon_{2}$ in (21).

Proof.

The result follows by replacing $r_{\delta,t}$ in Proposition 1 by (25) in Theorem 1. ∎

Theorem 2 is a paradigm shift and essentially reduces the probabilistic reachability problem to a widely studied deterministic reachability problem. To compute the $\delta$ -PRS (43) for the stochastic system (3), one only needs to over-approximate the DRS for the deterministic system (4). Theoretically speaking, the $\delta$ -PRS in Theorem 2 is tight and cannot be improved further without additional assumptions. From a practical point of view, by combining the tight high probability bounds on stochastic deviation in Theorem 1 with the scalable deterministic reachability frameworks [3, 6, 7], the $\delta$ -PRS in Theorem 2 can be computed efficiently for high-dimensional systems.

Tightness. To be more precise, replacing $\overline{\mathcal{R}}_{t}$ by $\mathcal{R}_{t}$ in (43) gives a tight $\delta$ -PRS. First, the probabilistic bound $r_{\delta,t}$ is tight provided the coefficients $c_{t},\sigma_{t}$ in Assumption 1 are tight. Moreover, since the deterministic input and stochastic disturbance in (3) affect the $\delta$ -PRS in Definition III.1 in an independent manner, the separation strategy (Proposition 1) is also tight, meaning the decomposition in Proposition 1 is necessary. Thus, the tightness of $\mathcal{R}_{\delta,t}$ in Theorem 2 depends only on the tightness of the over-approximation $\overline{\mathcal{R}}_{t}$ of the DRS of the associated deterministic system (4). It becomes tighter as $\overline{\mathcal{R}}_{t}\to\mathcal{R}_{t}$ .

Computational complexity. The computational cost of (43) comes from two sources: computing $\overline{\mathcal{R}}_{t}$ and realizing the Minkowski sum $\oplus$ . The former depends on the choice of algorithms for approximating DRS. Computing the Minkowski sum in a parametrized form is challenging and efficient algorithms are only available for ellipsoids and polyhedra [46, 47, 48]. Fortunately, a parametrized Minkowski sum is not needed for reachability analysis. In practice, we only need an efficient membership oracle to determine whether a point $x$ belongs to the Minkowski sum, which is an easier task. In particular, for (43), this oracle requires comparing $\min_{y\in\overline{\mathcal{R}}_{t}}\|y-x\|$ and $r_{\delta,t}$ , which is a convex optimization problem when $\overline{\mathcal{R}}_{t}$ is convex. In the following section, we exemplify our framework with two popular methods for computing $\overline{\mathcal{R}}_{t}$ . These methods are scalable and result in convex $\overline{\mathcal{R}}_{t}$ , rendering efficient algorithms for probabilistic reachability analysis.
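As a minimal sketch of such a membership oracle, assume $\overline{\mathcal{R}}_{t}$ is an axis-aligned box $[\underline{x}_{t},\overline{x}_{t}]$. The Euclidean projection onto a box is a coordinate-wise clip, so the distance $\min_{y\in\overline{\mathcal{R}}_{t}}\|y-x\|$ is available in closed form. The function name `in_box_plus_ball` and the numbers in the usage lines are hypothetical.

```python
import numpy as np

def in_box_plus_ball(x, lower, upper, radius):
    """Membership test for [lower, upper] (+) B(radius, 0) in the l2 norm.

    x belongs to the Minkowski sum iff its Euclidean distance to the box
    is at most the radius.
    """
    x = np.asarray(x, dtype=float)
    nearest = np.clip(x, lower, upper)        # projection of x onto the box
    return bool(np.linalg.norm(x - nearest) <= radius)

if __name__ == "__main__":
    lower, upper = np.array([0.0, 0.0]), np.array([1.0, 1.0])
    r = 0.5                                   # stands in for r_{delta,t}
    print(in_box_plus_ball([0.5, 0.5], lower, upper, r))   # inside the box
    print(in_box_plus_ball([1.3, 1.3], lower, upper, r))   # near a corner
    print(in_box_plus_ball([1.4, 1.4], lower, upper, r))   # past the rounded corner
```

For a general convex $\overline{\mathcal{R}}_{t}$ the clip would be replaced by a projection computed with a convex solver; the comparison against $r_{\delta,t}$ stays the same.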

Extension to weighted norm. As with the stochastic deviation bound, Theorem 2 extends to the $P$ -weighted $\ell_{2}$ norm. Consider the modified assumption as shown in Section V-D. Following the proof of Proposition 1 while substituting $\mathcal{B}^{n}(r_{\delta,t},0)$ by $\mathcal{B}_{P}^{n}(r_{\delta,t},0)$ , where $\mathcal{B}_{P}^{n}(r_{\delta,t},0)=\{x\in\mathbb{R}^{n}:~{}\|x\|_{P}\leq r_{\delta,t}\}$ is an ellipsoid, we conclude that

\mathcal{R}_{\delta,t}=\overline{\mathcal{R}}_{t}\oplus\mathcal{B}_{P}^{n}(r_{\delta,t},0)

is a $\delta$ -PRS of the system (3).

VII Case study of Probabilistic Reachability

In this section, we present the application of $\delta$ -PRS derived in Section VI in two case studies where contraction-based and interval-based methods are used to approximate $\mathcal{R}_{t}$ .

VII-A Contraction-based Probabilistic Reachability

Contraction theory is a classical framework for analyzing the stability of dynamical systems using the incremental distance between their trajectories [29, 31]. Traditionally, it is employed to infer strong robustness properties of dynamical systems. Recently, contraction theory has emerged as a computationally efficient tool for reachability analysis of deterministic systems. The contraction-based method relies on the matrix measure (Definition II.2) and the following assumption.

Assumption 2.

For the deterministic system (4), there exist constants $c,\rho\in\mathbb{R}$ such that, for every $(t,x,u)\in\mathbb{R}_{\geq 0}\times\mathbb{R}^{n}\times\mathcal{U}$ ,

(i)

$\mu_{\mathbb{X}}(D_{x}f(x,u,t))\leq c$ , and
(ii)

$\|D_{u}f(x,u,t)\|_{\mathbb{X},\mathbb{U}}\leq\rho$ .

Here $\mu_{\mathbb{X}}$ is the matrix measure with respect to the norm $\|\cdot\|_{\mathbb{X}}$ on $\mathbb{R}^{n}$ and $\|\cdot\|_{\mathbb{X},\mathbb{U}}$ denotes the induced norm on $\mathbb{R}^{n\times p}$ . The norm $\|\cdot\|_{\mathbb{X}}$ can be chosen differently from the Euclidean norm in general to ensure the tightest possible reachable set. Suppose that system (4) satisfies Assumption 2 and let $t\mapsto x^{*}_{t}$ be a trajectory of (4) with the input $t\mapsto u^{*}_{t}$ . Given initial configuration $x_{0}\in\mathcal{X}_{0}=\mathcal{B}_{{\mathbb{X}}}(r_{1},x^{*}_{0})$ for $r_{1}>0$ and input $u_{t}\in\mathcal{B}_{{\mathbb{U}}}(r_{2},u^{*}_{t})\subset\mathcal{U}$ for $r_{2}>0$ , the contraction-based method gives the following over-approximation of reachable sets of (4) [28]

\overline{\mathcal{R}}_{t}=\mathcal{B}_{{\mathbb{X}}}(e^{ct}r_{1}+\tfrac{\rho}{c}(e^{ct}-1)r_{2},x^{*}_{t}).

(44)

The contraction-based over-approximation of reachable sets in (44) can be combined with Theorem 2 to estimate a $\delta$ -PRS of the system (3).

Proposition 2 (Contraction-based reachability).

Consider the stochastic system (3) and its associated deterministic system (4) satisfying Assumptions 1 and 2. Let $t\mapsto x^{*}_{t}$ be a trajectory of (4) with the input $t\mapsto u^{*}_{t}$ and $t\mapsto X_{t}$ be a trajectory of the stochastic system (3) starting from $x_{0}\in\mathcal{B}_{{\mathbb{X}}}(r_{1},x^{*}_{0})$ with an input $u_{t}:\mathbb{R}_{\geq 0}\to\mathcal{B}_{{\mathbb{U}}}(r_{2},u^{*}_{t})$ . Then, for every $t\geq 0$ , with probability at least $1-\delta$ ,

\displaystyle X_{t}\in\mathcal{B}_{{\mathbb{X}}}(e^{ct}r_{1}+\tfrac{\rho}{c}(e^{ct}-1)r_{2},x^{*}_{t})\oplus\mathcal{B}^{n}(r_{\delta,t},0),

where $r_{\delta,t}=\sqrt{\Psi_{t}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}$ , $\Psi_{t}$ is as in (10), and $\varepsilon_{1}$ , $\varepsilon_{2}$ are given by (21).

Proof.

The result follows by combining Theorem 2 and the contraction-based over-approximation of the reachable set of system (4) in (44). ∎
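The two radii in Proposition 2 can be evaluated as follows. This is a sketch under simplifying assumptions: $c$ and $\sigma$ are taken constant so that $\Psi_{t}$ has the closed form in (38), the limit $c\to 0$ is handled explicitly, and all function names and numerical values are hypothetical. When $\|\cdot\|_{\mathbb{X}}$ is the Euclidean norm, the Minkowski sum of the two balls is itself a ball whose radius is the sum of the two radii.

```python
import math

def contraction_radius(t, c, rho, r1, r2):
    """Radius of the ball in (44): e^{ct} r1 + (rho/c)(e^{ct} - 1) r2."""
    growth = t if c == 0.0 else math.expm1(c * t) / c    # limit c -> 0 is t
    return math.exp(c * t) * r1 + rho * growth * r2

def stochastic_radius(t, c, sigma, n, delta, eps):
    """r_{delta,t} for constant c and sigma, following (38)."""
    psi_cap = sigma**2 * (t if c == 0.0 else math.expm1(2.0 * c * t) / (2.0 * c))
    eps1 = 2.0 * math.log(1.0 + 2.0 / eps) / (1.0 - eps) ** 2
    eps2 = 2.0 / (1.0 - eps) ** 2
    return math.sqrt(psi_cap * (eps1 * n + eps2 * math.log(1.0 / delta)))

def euclidean_prs_radius(t, c, rho, r1, r2, sigma, n, delta, eps):
    """Total radius around x*_t, valid only when ||.||_X is the l2 norm."""
    return (contraction_radius(t, c, rho, r1, r2)
            + stochastic_radius(t, c, sigma, n, delta, eps))

if __name__ == "__main__":
    # Illustrative constants only.
    for t in (0.0, 0.5, 1.0, 2.0):
        print(t, euclidean_prs_radius(t, c=-0.5, rho=1.0, r1=0.3, r2=0.1,
                                      sigma=0.1, n=2, delta=1e-3, eps=1 / 16))
```

For $c<0$ both radii stay bounded as $t$ grows, which reflects the dependence of the bound on the contraction rate.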

VII-B Interval-based Probabilistic Reachability

Interval analysis is a framework for estimating propagation of uncertainties by computing function bounds [49] and has been successfully used for reachability analysis of deterministic systems. The main idea of interval-based reachability is to embed the dynamical system into a higher dimensional space using a suitable inclusion function. The map $\left[\begin{smallmatrix}\underline{\mathsf{F}}\\ \overline{\mathsf{F}}\end{smallmatrix}\right]:\mathbb{R}^{2n}\times\mathbb{R}^{2p}\times\mathbb{R}_{\geq 0}\to\mathbb{R}^{2n}$ is an inclusion function for $f$ if, for every $(z,w)\in[\underline{x},\overline{x}]\times[\underline{u},\overline{u}]$ and every $t\geq 0$ ,

\displaystyle\underline{\mathsf{F}}(\underline{x},\overline{x},\underline{u},\overline{u},t)\leq f(z,w,t)\leq\overline{\mathsf{F}}(\underline{x},\overline{x},\underline{u},\overline{u},t).

Many automated approaches exist for finding an inclusion function for $f$ . We refer to [8, Section IV.B] for a detailed discussion on these approaches and to [50] for a toolbox for computing inclusion functions.

Given an interval initial configuration $\mathcal{X}_{0}=[\underline{x}_{0},\overline{x}_{0}]$ and an interval input set $\mathcal{U}=[\underline{u},\overline{u}]$ , the embedding system of (4) associated with the inclusion function $\mathsf{F}$ is given by

\displaystyle\begin{bmatrix}\dot{\underline{x}}\\ \dot{\overline{x}}\end{bmatrix}=\begin{bmatrix}\underline{\mathsf{F}}(\underline{x},\overline{x},\underline{u},\overline{u},t)\\ \overline{\mathsf{F}}(\underline{x},\overline{x},\underline{u},\overline{u},t)\end{bmatrix}.

(45)

Let $\left[\begin{smallmatrix}\underline{x}_{t}\\ \overline{x}_{t}\end{smallmatrix}\right]$ be the trajectory of the embedding system (45) starting from $\left[\begin{smallmatrix}\underline{x}_{0}\\ \overline{x}_{0}\end{smallmatrix}\right]$ . Then, an over-approximation of the deterministic reachable set of (4) is [8, Proposition 5]

\displaystyle\overline{\mathcal{R}}_{t}=[\underline{x}_{t},\overline{x}_{t}].

(46)

This interval-based over-approximation of reachable sets can be combined with Theorem 2 to estimate a $\delta$ -PRS of the system (3).

Proposition 3 (Interval-based reachability).

Consider the stochastic system (3) and its associated deterministic system (4) satisfying Assumption 1. Let $t\mapsto X_{t}$ be a trajectory of the stochastic system (3) starting from $x_{0}\in[\underline{x}_{0},\overline{x}_{0}]$ with an input curve $u_{t}:\mathbb{R}_{\geq 0}\to[\underline{u},\overline{u}]$ . Suppose that $\mathsf{F}=\left[\begin{smallmatrix}\underline{\mathsf{F}}\\ \overline{\mathsf{F}}\end{smallmatrix}\right]$ is an inclusion function for $f$ and $\left[\begin{smallmatrix}\underline{x}_{t}\\ \overline{x}_{t}\end{smallmatrix}\right]$ is the trajectory of the embedding system (45) starting from $\left[\begin{smallmatrix}\underline{x}_{0}\\ \overline{x}_{0}\end{smallmatrix}\right]$ . Then, for every $t\geq 0$ , with probability at least $1-\delta$ ,

\displaystyle X_{t}\in[\underline{x}_{t},\overline{x}_{t}]\oplus\mathcal{B}^{n}(r_{\delta,t},0),

where $r_{\delta,t}=\sqrt{\Psi_{t}(\varepsilon_{1}n+\varepsilon_{2}\log(1/\delta))}$ , $\Psi_{t}$ is as in (10), and $\varepsilon_{1}$ , $\varepsilon_{2}$ are given by (21).

Proof.

The result follows by combining Theorem 2 and the interval over-approximation of the reachable set of the deterministic system (4) in (46). ∎

VIII Numerical experiments

In this section, we present several examples to illustrate the efficacy of our framework and the tightness of our results.

VIII-A Linear Example

We first consider a linear example to validate the tightness of our bound (25) on the stochastic deviation. Consider a simple linear dynamics

\begin{split}dX_{t}&=-0.4I_{n}X_{t}dt+\sqrt{2}dW_{t}\\ &=AX_{t}dt+\sigma dW_{t},\end{split}

(47)

initialized at $X_{0}=0$ . The system (47) satisfies Assumption 1 with $c_{t}\equiv c=\mu(A)=-0.4$ and $\sigma_{t}\equiv\sigma=\sqrt{2}$ . By linearity, $X_{t}$ follows a zero-mean Gaussian distribution whose covariance $\text{cov}(X_{t})$ can be computed using (15) in closed-form. The trajectory of the deterministic dynamics associated with (47) starting from $x_{0}=0$ is $x_{t}\equiv 0$ .

To illustrate the bound (25), we simulate 5000 independent trajectories of (47) with $n=2$ over a time horizon $t\in[0,1.5]$ and compute the deviation associated with each trajectory, as depicted in Figure 3. These trajectories are compared with our probabilistic bound $r_{\delta,t}$ with design parameter $\varepsilon=1/16$ , $\delta=10^{-3}$ . Figure 3 shows that all the trajectories satisfy the bound $r_{\delta,t}$ as expected.

By Theorem 1, the square of our bound (25), $r_{\delta,t}^{2}$ , grows linearly with $\log(1/\delta)$ and $n$ , as illustrated in Figure 4. To verify the tightness of these dependencies, we compare them with those obtained through simulation. In particular, for each choice of $\delta$ and $n$ , we simulate $10^{7}$ independent trajectories of (47) and compute the associated value of $\|X_{t}-x_{t}\|$ for each trajectory. We follow a standard approach [51] and estimate the high probability bound $\hat{r}_{\delta,t}$ of the stochastic deviation as the empirical $(1-\delta)$ -quantile of $\|X_{t}-x_{t}\|$ (e.g., the threshold exceeded only by the top 1% of samples if $\delta=10^{-2}$ ). The results, shown in Figure 4, indicate that $\hat{r}_{\delta,t}^{2}$ also grows linearly with $\log(1/\delta)$ and $n$ , consistent with our theoretical bound (25).

Note that there is a gap between the calculated bounds with $\varepsilon=1/16$ and the simulated bounds in Figure 4. This is due to the choice of parameters $\varepsilon_{1}$ and $\varepsilon_{2}$ . These parameters in (21) are constructed in the proof for all $\delta,n$ and are not optimal for each choice of $\delta,n$ , as explained in Section V-A.
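A scaled-down version of this experiment can be sketched as follows. The code is an illustration rather than the setup used for the figures: it assumes an Euler–Maruyama discretization with a fixed step, uses far fewer trajectories than reported above, and the function names are hypothetical.

```python
import numpy as np

def simulate_deviation(n=2, a=-0.4, sigma=np.sqrt(2.0), T=1.5,
                       steps=300, num_traj=5000, seed=0):
    """Euler-Maruyama samples of ||X_t|| for dX = a X dt + sigma dW, X_0 = 0."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    X = np.zeros((num_traj, n))
    norms = np.zeros((steps + 1, num_traj))
    for k in range(steps):
        dW = rng.normal(scale=np.sqrt(dt), size=(num_traj, n))
        X = X + a * X * dt + sigma * dW
        norms[k + 1] = np.linalg.norm(X, axis=1)
    return np.linspace(0.0, T, steps + 1), norms

def radius(t, n, delta, eps, a=-0.4, sigma=np.sqrt(2.0)):
    """r_{delta,t} from (38) with constant c = a."""
    eps1 = 2.0 * np.log(1.0 + 2.0 / eps) / (1.0 - eps) ** 2
    eps2 = 2.0 / (1.0 - eps) ** 2
    psi_cap = sigma**2 * np.expm1(2.0 * a * t) / (2.0 * a)
    return np.sqrt(psi_cap * (eps1 * n + eps2 * np.log(1.0 / delta)))

if __name__ == "__main__":
    times, norms = simulate_deviation()
    r = radius(times, n=2, delta=1e-3, eps=1 / 16)
    # Fraction of trajectories outside the bound at each time step.
    violation = np.mean(norms > r[:, None], axis=1)
    print("largest violation rate over time:", violation.max())
    # Empirical (1 - delta) quantile at the final time, here with delta = 1e-2.
    print("empirical 99% quantile:", np.quantile(norms[-1], 0.99))
    print("theoretical radius at T:", r[-1])
```

Since $x_{t}\equiv 0$ here, the deviation is simply $\|X_{t}\|$. The empirical quantile sits below the theoretical radius, which is the gap discussed above.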

VIII-B Inverted Pendulum

Next, we consider an inverted pendulum with a stabilizing state feedback controller, whose state space model is given by

\begin{split}dX_{t}&=\begin{bmatrix}\dot{\theta}\\ \frac{g}{L}\sin\theta+KX_{t}\\ \end{bmatrix}dt+g_{t}dW_{t}\\ \end{split}

(48)

where $X_{t}=\begin{bmatrix}\theta&\dot{\theta}\end{bmatrix}^{\mathsf{T}}$ is the state vector, $\theta$ is the angle describing the position of the pendulum, $\dot{\theta}$ is the angular velocity of the pendulum, $KX_{t}=K_{1}\theta+K_{2}\dot{\theta}$ is a stabilizing linear state feedback controller, and $g_{t}dW_{t}$ is the stochastic disturbance on the angular acceleration with $W_{t}$ a one-dimensional Wiener process. Set the gravity $g=10$ , the pendulum length $L=1$ , and $g_{t}=\begin{bmatrix}0&0.1\end{bmatrix}^{\mathsf{T}}$ . The linear state feedback controller $KX_{t}$ is designed with feedback gain $K=\begin{bmatrix}K_{1}&K_{2}\end{bmatrix}=\begin{bmatrix}-20&-20\end{bmatrix}$ to stabilize the equilibrium point $x^{*}=\begin{bmatrix}\theta^{*}&\dot{\theta}^{*}\end{bmatrix}^{\mathsf{T}}=0$ of the associated deterministic system

\displaystyle\dot{x}_{t}=\begin{bmatrix}\dot{\theta}\\ \frac{g}{L}\sin(\theta)+K_{1}\theta+K_{2}\dot{\theta}\end{bmatrix}:=f(x_{t}),

(49)

where $x_{t}=\begin{bmatrix}\theta&\dot{\theta}\end{bmatrix}^{\mathsf{T}}$ .

Our goal is to find a tight $\delta$ -PRS of the inverted pendulum (48) starting from the initial configuration $\mathcal{X}_{0}=[-\frac{\pi}{10},\frac{\pi}{10}]\times[-0.2,0.2]$ . We use Theorem 2 with contraction-based and interval-based deterministic reachability methods to obtain $\delta$ -PRS of the inverted pendulum (48). We first consider the modified version of Assumption 1 introduced in Section V-D as $\mu_{P}(D_{x}f(x))\leq c_{t}$ and $P^{\frac{1}{2}}g_{t}g_{t}^{\mathsf{T}}P^{\frac{1}{2}}\preceq\sigma_{t}^{2}I_{n}$ for every $t\geq 0$ and $x\in\mathbb{R}^{n}$ . For every $x=(\theta,\dot{\theta})^{\mathsf{T}}\in\mathbb{R}^{2}$ ,

\displaystyle D_{x}f(x)=\left[\begin{smallmatrix}0&1\\ \frac{g}{L}\cos(\theta)+K_{1}&K_{2}\end{smallmatrix}\right].

We define the matrices $A_{1},A_{2}\in\mathbb{R}^{2\times 2}$ as follows:

\displaystyle A_{1}=\left[\begin{smallmatrix}0&1\\ \frac{g}{L}+K_{1}&K_{2}\end{smallmatrix}\right],\qquad A_{2}=\left[\begin{smallmatrix}0&1\\ -\frac{g}{L}+K_{1}&K_{2}\end{smallmatrix}\right].

Note that $\cos(\theta)\in[-1,1]$ . This implies that, for every $x\in\mathbb{R}^{2}$ , we have $D_{x}f(x)\in\mathrm{conv}\left\{A_{1},A_{2}\right\}$ , where $\mathrm{conv}$ is the convex hull. Thus, using [52, Lemma 4.1], the minimum constant contraction rate $c_{t}=c$ for the system (49) can be computed using the following optimization problem:

\begin{split}\min_{c\in\mathbb{R},\,P\succ 0}\quad&c\\ \mbox{s.t.}\quad&A_{i}^{\mathsf{T}}P+PA_{i}\preceq 2cP,\quad\mbox{for }i\in\{1,2\}.\end{split}

(50)

We solve optimization problem (50) by successively applying semi-definite programming on $P$ and bisection on $c$ . The optimal solution of (50) is given by the constant contraction rate $c_{t}=c=-0.5$ and the weight matrix $P=\left[\begin{smallmatrix}35.68&2.21\\ 2.21&1.27\end{smallmatrix}\right]$ . With this matrix $P$ , we compute $P^{\frac{1}{2}}g_{t}g_{t}^{\mathsf{T}}P^{\frac{1}{2}}=\left[\begin{smallmatrix}0.0010&0.0034\\ 0.0034&0.0118\end{smallmatrix}\right]\preceq 0.0128I_{2}$ , and get $\sigma_{t}=\sigma=0.1130$ .
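The reported values can be sanity-checked by testing feasibility of the constraints in (50) and recomputing $\sigma$. The sketch below does not solve the optimization problem: it only verifies that the rounded $P$ and $c=-0.5$ quoted above satisfy both matrix inequalities, and it uses the largest eigenvalue of $P^{\frac{1}{2}}g_{t}g_{t}^{\mathsf{T}}P^{\frac{1}{2}}$ as $\sigma^{2}$. The helper names are hypothetical.

```python
import numpy as np

# Model constants from the text: g/L = 10, K = [-20, -20], g_t = [0, 0.1]^T.
g_over_L, K1, K2 = 10.0, -20.0, -20.0
A1 = np.array([[0.0, 1.0], [g_over_L + K1, K2]])
A2 = np.array([[0.0, 1.0], [-g_over_L + K1, K2]])
P = np.array([[35.68, 2.21], [2.21, 1.27]])      # rounded values from the text
c = -0.5
g = np.array([[0.0], [0.1]])

def lmi_margin(A, P, c):
    """Largest eigenvalue of A^T P + P A - 2cP; <= 0 iff the constraint holds."""
    M = A.T @ P + P @ A - 2.0 * c * P
    return float(np.linalg.eigvalsh(M)[-1])

def sym_sqrt(P):
    """Symmetric square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(P)
    return (V * np.sqrt(w)) @ V.T

margins = [lmi_margin(A, P, c) for A in (A1, A2)]
P_half = sym_sqrt(P)
G = P_half @ g @ g.T @ P_half                    # P^{1/2} g g^T P^{1/2}
sigma = float(np.sqrt(np.linalg.eigvalsh(G)[-1]))

if __name__ == "__main__":
    print("LMI margins (should be <= 0):", margins)
    print("sigma:", sigma)
```

Because $P$ is rounded to two decimals, the recomputed $\sigma$ agrees with the reported value only to about three decimal places.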

Contraction-based Reachability

We use Proposition 2 to find a $\delta$-PRS of (48). We consider Assumption 2 with $\|\cdot\|_{\mathbb{X}}=\|\cdot\|_{P}$ with positive definite matrix $P$ as defined above. For every $x\in\mathbb{R}^{2}$, we have $\mu_{P}(D_{x}f(x))\leq c=-0.5$. Using Proposition 2 with the initial configuration $\overline{\mathcal{X}}_{0}=\{x\in\mathbb{R}^{2}\;|\;\|x\|_{P}\leq\left\|\left[\begin{smallmatrix}\frac{\pi}{10}\\ 0.2\end{smallmatrix}\right]\right\|_{P}\}\supset\mathcal{X}_{0}$, we obtain a $\delta$-PRS of (48) with $\delta=10^{-3}$ as shown in Figure 5 (left).
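The inclusion $\overline{\mathcal{X}}_{0}\supset\mathcal{X}_{0}$ can be checked numerically: the $P$-norm is convex, so its maximum over the box $\mathcal{X}_{0}$ is attained at a corner. The Python sketch below uses the rounded $P$ reported above and a hypothetical helper `p_norm` to compare the four corners of $\mathcal{X}_{0}$ against the radius of $\overline{\mathcal{X}}_{0}$.

```python
import itertools
import numpy as np

# Rounded weight matrix P as reported in the text.
P = np.array([[35.68, 2.21],
              [2.21, 1.27]])

def p_norm(x, P):
    """Weighted Euclidean norm ||x||_P = sqrt(x^T P x)."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(x @ P @ x))

# Radius of the P-norm ball used as the enlarged initial set.
radius = p_norm([np.pi / 10, 0.2], P)

# The four corners of the box [-pi/10, pi/10] x [-0.2, 0.2].
corners = [np.array([s1 * np.pi / 10, s2 * 0.2])
           for s1, s2 in itertools.product((-1.0, 1.0), repeat=2)]
box_inside_ball = all(p_norm(c, P) <= radius + 1e-12 for c in corners)
```

Since the off-diagonal entry of $P$ is positive, the two corners with equal signs attain the radius and the other two lie strictly inside the ball.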

Interval-based Reachability

We use Theorem 2 with a modified version of interval-based analysis for the associated deterministic system (49) to find a $\delta$ -PRS of (48). We consider the coordinate transformation $y_{t}=Tx_{t}$ with nonsingular matrix $T=\left[\begin{smallmatrix}1&0.2\\ 1&0\end{smallmatrix}\right]$ for the associated deterministic system (49) and apply interval-based reachability to the transformed system. We employ Theorem 2 with the initial configuration $T^{-1}\overline{\mathcal{Y}}_{0}\supset\mathcal{X}_{0}$ where $\overline{\mathcal{Y}}_{0}=[{-\tfrac{\pi}{10}\left[\begin{smallmatrix}1.04\\ 1\end{smallmatrix}\right]},{\tfrac{\pi}{10}\left[\begin{smallmatrix}1.04\\ 1\end{smallmatrix}\right]}]$ . The $\delta$ -PRS of (48) with $\delta=10^{-3}$ obtained using this analysis are shown in Figure 5 (right).

VIII-C Nonlinear Unicycle model

Finally, we consider a vehicle moving on a $2$ -dimensional plane with obstacles shown in light red in Figure 6. The vehicle is modeled by the unicycle dynamics

\displaystyle dX_{t}=\begin{bmatrix}v_{t}\cos(\theta)\\ v_{t}\sin(\theta)\\ w_{t}+u_{t}\end{bmatrix}dt+g_{t}dW_{t}

(51)

where $X_{t}=\begin{bmatrix}p_{x}&p_{y}&\theta\end{bmatrix}^{\mathsf{T}}$ is the state of the vehicle, $(p_{x},p_{y})$ is the position of the center of mass of the vehicle in the plane, $\theta$ is the heading angle of the vehicle, $v_{t}$ is the linear velocity of the center of mass, $w_{t}$ is the angular velocity of the vehicle, $u_{t}$ is the deterministic disturbance on the angular velocity, and $g_{t}dW_{t}$ is the stochastic disturbance on the model with $W_{t}$ a three-dimensional Wiener process. The associated deterministic unicycle model is given by

\displaystyle\dot{x}_{t}=\begin{bmatrix}v_{t}\cos(\theta)\\ v_{t}\sin(\theta)\\ w_{t}\end{bmatrix}+\begin{bmatrix}0\\ 0\\ u_{t}\end{bmatrix}:=f(x_{t},u_{t},t)

(52)

where $x_{t}=\begin{bmatrix}p_{x}&p_{y}&\theta\end{bmatrix}^{\mathsf{T}}$ . We use Model Predictive Control (MPC) to design an open-loop controller to steer the deterministic system (52) from the initial configuration $x_{0}=(5,5,-\frac{2\pi}{3})$ to the origin while avoiding the obstacles in the $p_{x}-p_{y}$ plane. The trajectory of the deterministic system (52) with the MPC controller starting from $x_{0}=(5,5,-\frac{2\pi}{3})$ is denoted by $t\mapsto(p_{x}^{*},p_{y}^{*},\theta^{*})$ . We consider $t\mapsto(p_{x}^{*},p_{y}^{*},\theta^{*})$ as the reference trajectory for the stochastic vehicle (51). Using the approach in [53], we design the following feedback controller for tracking the reference trajectory $t\mapsto(p_{x}^{*},p_{y}^{*},\theta^{*})$ :

\begin{split}v_{t}&=K_{r}r_{t}\cos(\alpha_{t}),\\ w_{t}&=K_{\alpha}\alpha_{t}+K_{r}\sin(\alpha_{t})\cos(\alpha_{t})\tfrac{\alpha_{t}+\beta_{t}}{\alpha_{t}},\end{split}

(53)

where the variables $r_{t},\alpha_{t},\beta_{t}$ are defined as

\begin{split}r_{t}&=\sqrt{(p_{x}-p_{x}^{*})^{2}+(p_{y}-p_{y}^{*})^{2}},\\ \alpha_{t}&=\theta-\mathrm{atan}(p_{y}-p_{y}^{*},p_{x}-p_{x}^{*}),\\ \beta_{t}&=\mathrm{atan}(p_{y}-p_{y}^{*},p_{x}-p_{x}^{*})-\theta^{*},\end{split}

and $K_{r},K_{\alpha}\geq 0$ are feedback gains.
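A minimal Python sketch of the tracking law (53) follows. It is an illustration under stated assumptions rather than the authors' code: the function name `tracking_control` is hypothetical, the two-argument $\mathrm{atan}$ is taken to be the four-quadrant arctangent (`numpy.arctan2`), and the factor $\sin(\alpha_{t})/\alpha_{t}$ is evaluated through `numpy.sinc` so that the expression stays finite at $\alpha_{t}=0$. Angle wrapping is not handled. The gains in the example call are the ones used in the experiment of this section.

```python
import numpy as np

def tracking_control(state, ref, K_r, K_alpha):
    """Return (v_t, w_t) for state (p_x, p_y, theta) and reference (p_x*, p_y*, theta*)."""
    px, py, theta = state
    px_ref, py_ref, theta_ref = ref
    dx, dy = px - px_ref, py - py_ref
    r = np.hypot(dx, dy)                 # distance to the reference position
    bearing = np.arctan2(dy, dx)         # assumed meaning of atan(., .)
    alpha = theta - bearing
    beta = bearing - theta_ref
    v = K_r * r * np.cos(alpha)
    # np.sinc(x) = sin(pi x) / (pi x), so np.sinc(alpha / pi) = sin(alpha) / alpha.
    w = K_alpha * alpha + K_r * np.sinc(alpha / np.pi) * np.cos(alpha) * (alpha + beta)
    return float(v), float(w)

# Position error along the p_x axis with aligned headings.
v, w = tracking_control((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), K_r=-0.8, K_alpha=-1.5)
```

For this input $\alpha_{t}=\beta_{t}=0$ and $r_{t}=1$, so the sketch returns $v_{t}=K_{r}r_{t}=-0.8$ and $w_{t}=0$.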

We consider the stochastic vehicle (51) with $g_{t}=0.1$ and $u_{t}\in[-0.03,0.03]$ with the tracking controller (53) and feedback gains $K_{r}=-0.8$ and $K_{\alpha}=-1.5$. We assume that this stochastic vehicle starts from $x_{0}=(5,5,-\frac{2\pi}{3})$. Our goal is to provide high-probability guarantees that the stochastic vehicle (51) with the tracking controller (53) avoids the obstacles shown in Figure 6 over the time horizon $[0,5]$. We use a modified version of Proposition 2 to construct $\delta$-PRS of the stochastic vehicle (51) with the tracking controller (53). We use the strategy in [9] to estimate a time-varying $c_{t}$ in Assumption 1. We also use a generalization of Assumption 2 with $\|\cdot\|_{\mathbb{X}}$ and $\|\cdot\|_{\mathbb{U}}$ defined as standard Euclidean norms and with time-varying contraction rate $c_{t}$. This time-varying contraction rate is then used for contraction-based reachability analysis of the associated deterministic system (52) in Proposition 2. For $\delta=10^{-3}$, the $\delta$-PRS of the stochastic vehicle (51) with the tracking controller (53) starting from $x_{0}=(5,5,-\frac{2\pi}{3})$ at times $t\in[0,5]$ are shown in Figure 6 as the green envelope. Figure 6 shows that the green envelope does not intersect any of the obstacles in the $p_{x}-p_{y}$ plane. Therefore, with probability at least $99.9\%$, the stochastic vehicle (51) with the tracking controller (53) starting from $x_{0}=(5,5,-\frac{2\pi}{3})$ is safe and does not hit any obstacle for all times $t\in[0,5]$.

IX Conclusion

We propose an efficient and flexible framework for computing the Probabilistic Reachable Set (PRS) of continuous-time nonlinear stochastic systems. Using a suitable separation strategy, we decouple the effect of deterministic inputs and the effect of stochastic uncertainties on the PRS. This separation strategy is flexible as it allows using any deterministic reachability method to capture the effects of deterministic inputs. It essentially reduces the problem of computing PRS to analyzing the distance between stochastic trajectories and their associated deterministic trajectories, termed the stochastic deviation. By developing a novel energy function called the Averaged Moment Generating Function (AMGF), we establish a tight high-probability bound on the stochastic deviation of stochastic systems. To the best of our knowledge, our bound is the tightest high-probability bound on stochastic deviation for general nonlinear systems. By combining this probabilistic bound on stochastic deviation with the contraction-based and interval-based reachability of deterministic systems, we provide tight estimates of PRS for stochastic systems. Our separation strategy and tight probabilistic bounds on stochastic deviation can transform many current methods and results in control theory and applications. They will also open new research directions in various fields, such as safety-critical control, estimation, uncertainty quantification, statistics, and machine learning. Additionally, the AMGF leveraged in our theoretical analysis is a powerful mathematical tool that merits further exploration.

References

[1] S. Bansal, M. Chen, S. Herbert, and C. J. Tomlin, “Hamilton-Jacobi reachability: A brief overview and recent advances,” in IEEE 56th Annual Conference on Decision and Control (CDC), 2017, pp. 2242–2253.
[2] I. Mitchell, “A toolbox of level set methods,” UBC Department of Computer Science Technical Report TR-2007-11, vol. 1, p. 6, 2007.
[3] J. Maidens and M. Arcak, “Reachability analysis of nonlinear systems using matrix measures,” IEEE Transactions on Automatic Control, vol. 60, no. 1, pp. 265–270, 2015.
[4] F. Bullo, Contraction Theory for Dynamical Systems, 1.0 ed. Kindle Direct Publishing, 2022. [Online]. Available: http://motion.me.ucsb.edu/book-ctds
[5] J. K. Scott and P. I. Barton, “Bounds on the reachable sets of nonlinear control systems,” Automatica, vol. 49, no. 1, pp. 93–100, 2013.
[6] P.-J. Meyer, A. Devonport, and M. Arcak, “TIRA: Toolbox for interval reachability analysis,” in Proceedings of the 22nd ACM International Conference on Hybrid Systems: Computation and Control, 2019, pp. 224–229.
[7] S. Coogan, “Mixed monotonicity for reachability and safety in dynamical systems,” in 2020 59th IEEE Conference on Decision and Control (CDC), 2020, pp. 5074–5085.
[8] S. Jafarpour, A. Harapanahalli, and S. Coogan, “Efficient interaction-aware interval analysis of neural network feedback loops,” arXiv preprint, 2023. [Online]. Available: https://arxiv.org/abs/2307.14938
[9] C. Fan, J. Kapinski, X. Jin, and S. Mitra, “Simulation-driven reachability using matrix measures,” ACM Trans. Embed. Comput. Syst., vol. 17, no. 1, Dec. 2017. [Online]. Available: https://doi.org/10.1145/3126685
[10] Z. Huang and S. Mitra, “Computing bounded reach sets from sampled simulation traces,” in Proceedings of the 15th ACM International Conference on Hybrid Systems: Computation and Control, ser. HSCC ’12. Association for Computing Machinery, 2012, pp. 291–294. [Online]. Available: https://doi.org/10.1145/2185632.2185676
[11] R. K. Cosner, P. Culbertson, and A. D. Ames, “Bounding stochastic safety: Leveraging Freedman’s inequality with discrete-time control barrier functions,” IEEE Control Systems Letters, vol. 8, pp. 1937–1942, 2024.
[12] H. M. Soner and N. Touzi, “Dynamic programming for stochastic target problems and geometric flows,” Journal of the European Mathematical Society, vol. 4, no. 3, pp. 201–236, 2002.
[13] A. Abate, M. Prandini, J. Lygeros, and S. Sastry, “Probabilistic reachability and safety for controlled discrete time stochastic hybrid systems,” Automatica, vol. 44, no. 11, pp. 2724–2734, 2008.
[14] S. Summers and J. Lygeros, “Verification of discrete time stochastic hybrid systems: A stochastic reach-avoid decision problem,” Automatica, vol. 46, no. 12, pp. 1951–1961, 2010.
[15] P. Mohajerin Esfahani, D. Chatterjee, and J. Lygeros, “The stochastic reach-avoid problem and set characterization for diffusions,” Automatica, vol. 70, pp. 43–56, 2016.
[16] K. Lesser, M. Oishi, and R. S. Erwin, “Stochastic reachability for control of spacecraft relative motion,” in 52nd IEEE Conference on Decision and Control, 2013, pp. 4705–4712.
[17] H. Sartipizadeh, A. P. Vinod, B. Acikmese, and M. Oishi, “Voronoi partition-based scenario reduction for fast sampling-based stochastic reachability computation of linear systems,” in 2019 American Control Conference (ACC), 2019, pp. 37–44.
[18] N. Hashemi, X. Qin, L. Lindemann, and J. V. Deshmukh, “Data-driven reachability analysis of stochastic dynamical systems with conformal inference,” in 62nd IEEE Conference on Decision and Control (CDC), 2023, pp. 3102–3109.
[19] M. Black, G. Fainekos, B. Hoxha, and D. Panagou, “Risk-aware fixed-time stabilization of stochastic systems under measurement uncertainty,” arXiv preprint, 2024. [Online]. Available: https://arxiv.org/abs/2403.20258
[20] H. El-Samad, M. Fazel, X. Liu, A. Papachristodoulou, and S. Prajna, “Stochastic reachability analysis in complex biological networks,” in 2006 American Control Conference. IEEE, 2006, pp. 6–pp.
[21] S. Prajna, A. Jadbabaie, and G. J. Pappas, “A framework for worst-case and stochastic safety verification using barrier certificates,” IEEE Transactions on Automatic Control, vol. 52, no. 8, pp. 1415–1428, 2007.
[22] C. Santoyo, M. Dutreix, and S. Coogan, “A barrier function approach to finite-time stochastic system verification and control,” Automatica, vol. 125, p. 109439, 2021.
[23] M. Anand, A. Lavaei, and M. Zamani, “From small-gain theory to compositional construction of barrier certificates for large-scale stochastic systems,” IEEE Transactions on Automatic Control, vol. 67, no. 10, pp. 5638–5645, 2022.
[24] X. Chen and S. Sankaranarayanan, “Reachability analysis for cyber-physical systems: Are we there yet?” in NASA Formal Methods: 14th International Symposium, NFM 2022, Pasadena, CA, USA, May 24–27, 2022, Proceedings. Springer, 2022, pp. 109–130.
[25] C. Moore, “Unpredictability and undecidability in dynamical systems,” Physical Review Letters, vol. 64, pp. 2354–2357, May 1990.
[26] C. A. Desoer and H. Haneda, “The measure of a matrix as a tool to analyze computer algorithms for circuit analysis,” IEEE Transactions on Circuit Theory, vol. 19, no. 5, pp. 480–486, 1972.
[27] G. Söderlind, “The logarithmic norm. History and modern theory,” BIT Numerical Mathematics, vol. 46, pp. 631–652, 2006.
[28] A. Davydov, S. Jafarpour, and F. Bullo, “Non-Euclidean contraction theory for robust nonlinear stability,” IEEE Transactions on Automatic Control, vol. 67, no. 12, pp. 6667–6681, 2022.
[29] W. Lohmiller and J.-J. E. Slotine, “On contraction analysis for non-linear systems,” Automatica, vol. 34, no. 6, pp. 683–696, 1998.
[30] F. Forni and R. Sepulchre, “A differential Lyapunov framework for contraction analysis,” IEEE Transactions on Automatic Control, vol. 59, no. 3, pp. 614–628, 2014.
[31] Z. Aminzare and E. D. Sontag, “Contraction methods for nonlinear systems: A brief introduction and some open problems,” in 53rd IEEE Conference on Decision and Control, 2014, pp. 3835–3847.
[32] E. M. Aylward, P. A. Parrilo, and J.-J. E. Slotine, “Stability and robustness analysis of nonlinear systems via contraction metrics and SOS programming,” Automatica, vol. 44, no. 8, pp. 2163–2170, 2008.
[33] Z. Zahreddine, “Matrix measure and application to stability of matrices and interval dynamical systems,” International Journal of Mathematics and Mathematical Sciences, vol. 2003, no. 2, p. 937084, 2003.
[34] B. Øksendal, Stochastic differential equations: an introduction with applications, ser. Universitext. Springer Berlin, Heidelberg, 2013.
[35] S. Särkkä and A. Solin, Applied stochastic differential equations. Cambridge University Press, 2019, vol. 10.
[36] R. K. Cosner, P. Culbertson, A. J. Taylor, and A. D. Ames, “Robust safety under stochastic uncertainty with discrete-time control barrier functions,” arXiv preprint, 2023. [Online]. Available: https://arxiv.org/abs/2302.07469
[37] E. Fogel and D. Halperin, “Exact and efficient construction of Minkowski sums of convex polyhedra with applications,” Computer-Aided Design, vol. 39, no. 11, pp. 929–940, 2007.
[38] Q.-C. Pham, N. Tabareau, and J.-J. Slotine, “A contraction theory approach to stochastic incremental stability,” IEEE Transactions on Automatic Control, vol. 54, no. 4, pp. 816–820, 2009.
[39] T. Lorenz, Mutational analysis: a joint framework for Cauchy problems in and beyond vector spaces, ser. Lecture Notes in Mathematics. Springer-Verlag, Berlin, 2010, vol. 1996.
[40] F. Burns, M. Fiedler, and E. Haynsworth, “Polyhedral cones and positive operators,” Linear Algebra and its Applications, vol. 8, no. 6, pp. 547–559, 1974.
[41] A. Gittens and J. A. Tropp, “Tail bounds for all eigenvalues of a sum of random matrices,” arXiv preprint, 2011. [Online]. Available: https://arxiv.org/abs/1104.4513
[42] P. Rigollet and J.-C. Hütter, “High-dimensional statistics,” arXiv preprint, 2023. [Online]. Available: https://arxiv.org/abs/2310.19244
[43] J. M. Altschuler and K. Talwar, “Concentration of the Langevin algorithm’s stationary distribution,” arXiv preprint, 2022. [Online]. Available: https://arxiv.org/abs/2212.12629
[44] T. H. Gronwall, “Note on the derivatives with respect to a parameter of the solutions of a system of differential equations,” Annals of Mathematics, vol. 20, no. 4, pp. 292–296, 1919.
[45] R. Vershynin, High-Dimensional Probability: An Introduction with Applications in Data Science, ser. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2018.
[46] P. Gritzmann and B. Sturmfels, “Minkowski addition of polytopes: Computational complexity and applications to Gröbner bases,” SIAM Journal on Discrete Mathematics, vol. 6, no. 2, pp. 246–269, 1993.
[47] G. Varadhan and D. Manocha, “Accurate Minkowski sum approximation of polyhedral models,” in 12th Pacific Conference on Computer Graphics and Applications, 2004. PG 2004. Proceedings., 2004, pp. 392–401.
[48] C. Weibel, “Minkowski sums of polytopes: combinatorics and computation,” EPFL, Tech. Rep., 2007.
[49] L. Jaulin, M. Kieffer, O. Didrit, and É. Walter, Applied Interval Analysis. Springer London, 2001.
[50] A. Harapanahalli, S. Jafarpour, and S. Coogan, “A toolbox for fast interval arithmetic in numpy with an application to formal verification of neural network controlled systems,” in 2nd ICML Workshop on Formal Verification of Machine Learning, 2023. [Online]. Available: https://arxiv.org/abs/2306.15340
[51] A. Shapiro, “Monte Carlo sampling methods,” Handbooks in operations research and management science, vol. 10, pp. 353–425, 2003.
[52] C. Fan, J. Kapinski, X. Jin, and S. Mitra, “Simulation-driven reachability using matrix measures,” ACM Transactions on Embedded Computing Systems, vol. 17, no. 1, Dec. 2017.
[53] M. Aicardi, G. Casalino, A. Bicchi, and A. Balestrino, “Closed loop steering of unicycle like vehicles via Lyapunov techniques,” IEEE Robotics & Automation Magazine, vol. 2, no. 1, pp. 27–35, 1995.

-A Proof of Lemma V.1 (Sub-Gaussian Norm Concentration)

For every $\varepsilon\in(0,1)$, we can find a finite set $\mathcal{N}\subseteq\mathcal{B}^{n}\left(1,0\right)$ such that for every $x_{0}\in\mathcal{B}^{n}\left(1,0\right)$ there exists $x\in\mathcal{N}$ with $\|x-x_{0}\|\leq\varepsilon$. Let $|\mathcal{N}|$ denote the number of elements in $\mathcal{N}$. By [42, Exercise 4.4.2], such an $\mathcal{N}$ can be chosen so that $|\mathcal{N}|\leq(1+2/\varepsilon)^{n}$ and, for any vector $x\in\mathbb{R}^{n}$,

\|x\|\leq\frac{1}{1-\varepsilon}\max_{\ell\in\mathcal{N}}\ell^{\mathsf{T}}x.

It follows that, for any $r>0$ and any sub-Gaussian vector $X\in\mathbb{R}^{n}$ with variance proxy $\sigma^{2}$ ,

\begin{split}&\mathbb{P}\left(\|X\|\geq r\right)\leq\mathbb{P}\left(\frac{1}{1-\varepsilon}\max_{\ell\in\mathcal{N}}\ell^{\mathsf{T}}X\geq r\right)\\ \leq&\mathbb{P}\left(\bigcup_{\ell\in\mathcal{N}}\frac{\ell^{\mathsf{T}}X}{1-\varepsilon}\geq r\right).\end{split}

(54)

Since $\|\ell\|\leq 1$ for $\ell\in\mathcal{N}$ , we have

\mathbb{P}\left(\frac{\ell^{\mathsf{T}}X}{1-\varepsilon}\geq r,~\ell\in\mathcal{N}\right)\leq\mathbb{P}\left(\frac{\ell^{\mathsf{T}}X}{\|\ell\|(1-\varepsilon)}\geq r,~\ell\in\mathcal{N}\right).

(55)

By the definition of sub-Gaussian vector, we know $\frac{\ell^{\mathsf{T}}X}{\|\ell\|}$ is sub-Gaussian with variance proxy $\sigma^{2}$ for any $\ell\in\mathcal{N}$ . By Hoeffding’s Inequality,

\mathbb{P}\left(\frac{\ell^{\mathsf{T}}X}{\|\ell\|(1-\varepsilon)}\geq r,~\ell\in\mathcal{N}\right)\leq e^{-\frac{(1-\varepsilon)^{2}r^{2}}{2\sigma^{2}}}.

(56)

Combining (54)-(56) and taking union bound over $\ell\in\mathcal{N}$ , we obtain

\begin{split}&\mathbb{P}\left(\|X\|\geq r\right)\leq\mathbb{P}\left(\bigcup_{\ell\in\mathcal{N}}\frac{\ell^{\mathsf{T}}X}{1-\varepsilon}\geq r\right)\\ \leq&|\mathcal{N}|e^{-\frac{(1-\varepsilon)^{2}r^{2}}{2\sigma^{2}}}\leq\left(1+\frac{2}{\varepsilon}\right)^{n}e^{-\frac{(1-\varepsilon)^{2}r^{2}}{2\sigma^{2}}}.\end{split}

(57)

To ensure a confidence level $\delta$, that is, for the right-hand side of (57) to be at most $\delta$, $r$ should satisfy

r^{2}\geq\frac{2\sigma^{2}}{(1-\varepsilon)^{2}}\left(n\log\left(1+\frac{2}{\varepsilon}\right)+\log\frac{1}{\delta}\right).

(58)

Then (20) follows by taking the square root. This completes the proof.
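The radius implied by (58) is straightforward to evaluate. The Python sketch below (function names are hypothetical) computes the smallest $r$ allowed by (58) and substitutes it back into the right-hand side of (57), which should return $\delta$ up to rounding. The values $\sigma=1$, $n=2$ and $\varepsilon=0.5$ in the example are arbitrary.

```python
import math

def norm_bound_radius(sigma, n, delta, eps):
    """Smallest r satisfying (58) for variance proxy sigma^2, dimension n, level delta."""
    if not (0.0 < eps < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("eps and delta must lie in (0, 1)")
    log_term = n * math.log(1.0 + 2.0 / eps) + math.log(1.0 / delta)
    return sigma / (1.0 - eps) * math.sqrt(2.0 * log_term)

def union_bound_rhs(r, sigma, n, eps):
    """Right-hand side of (57): (1 + 2/eps)^n * exp(-(1-eps)^2 r^2 / (2 sigma^2))."""
    return (1.0 + 2.0 / eps) ** n * math.exp(-((1.0 - eps) * r) ** 2 / (2.0 * sigma ** 2))

r = norm_bound_radius(sigma=1.0, n=2, delta=1e-3, eps=0.5)
tail = union_bound_rhs(r, sigma=1.0, n=2, eps=0.5)
```

Since $\varepsilon\in(0,1)$ is a free parameter of the covering argument, one may also scan over $\varepsilon$ to find the value that gives the smallest radius for a given $n$ and $\delta$.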

-B Proof of Lemma V.2

(i) Since the uniform distribution on $\mathcal{S}^{n-1}$ is invariant under rotations, for any $\lambda\in\mathbb{R}$ and $\eta_{1},\eta_{2}\in\mathcal{S}^{n-1}$, we have

\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\langle\ell,\eta_{1}\rangle}\right)=\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\langle\ell,\eta_{2}\rangle}\right).

(59)

It follows that, for any nonzero $x\in\mathbb{R}^{n}$ and $\eta\in\mathcal{S}^{n-1}$ (for $x=0$ both sides below equal $1$),

\begin{split}\Phi_{n,\lambda}(x)&=\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\langle\ell,x\rangle}\right)\\ &=\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\|x\|\langle\ell,\frac{x}{\|x\|}\rangle}\right)\\ &=\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\|x\|\langle\ell,\eta\rangle}\right)\\ &=\Phi_{n,\lambda}(\|x\|\,\eta).\end{split}

(60)

(ii) By part (i), $\Phi_{n,\lambda}(x)=\Phi_{n,\lambda}(\|x\|\eta)$ for any $\eta\in\mathcal{S}^{n-1}$. Taking the derivative of $\Phi_{n,\lambda}(x)$ with respect to $\|x\|$ when $\|x\|\neq 0$ gives

\begin{split}&\frac{d\Phi_{n,\lambda}(x)}{d\|x\|}=\frac{d}{d\|x\|}\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\|x\|\langle\ell,\eta\rangle}\right)\\ &=\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(\lambda\langle\ell,\eta\rangle e^{\lambda\|x\|\langle\ell,\eta\rangle}\right)\\ &=\frac{1}{\|x\|}\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(\lambda\|x\|\langle\ell,\eta\rangle e^{\lambda\|x\|\langle\ell,\eta\rangle}\right).\end{split}

(61)

Set $y=e^{\lambda\|x\|\langle\ell,\eta\rangle}$. Applying Jensen’s Inequality to the convex function $y\log y$, we arrive at

\begin{split}&\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(\lambda\|x\|\langle\ell,\eta\rangle e^{\lambda\|x\|\langle\ell,\eta\rangle}\right)\\ \geq&\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(\lambda\|x\|\langle\ell,\eta\rangle\right)\,\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\|x\|\langle\ell,\eta\rangle}\right)=0.\end{split}

Thus, $\frac{d\Phi_{n,\lambda}(x)}{d\|x\|}\geq 0$ when $\|x\|\neq 0$ . When $\|x\|=0$ , obviously $\Phi_{n,\lambda}(x)=1$ and $\frac{d\Phi_{n,\lambda}(x)}{d\|x\|}=0$ . This completes the proof.
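Both properties can be observed numerically. The following Monte Carlo sketch in Python (hypothetical function name `amgf_mc`, fixed random seed) estimates $\Phi_{n,\lambda}$ by sampling $\ell$ uniformly on $\mathcal{S}^{n-1}$ via normalized Gaussian vectors. For $n=3$ the estimate is compared with the closed form $\sinh(\lambda\|x\|)/(\lambda\|x\|)$, a standard fact about the uniform distribution on the sphere in $\mathbb{R}^{3}$ that is not stated in the text.

```python
import numpy as np

def amgf_mc(x, lam, num_samples=200_000, seed=0):
    """Monte Carlo estimate of Phi_{n,lam}(x) = E_{l ~ Unif(S^{n-1})} exp(lam * <l, x>)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((num_samples, x.size))
    ell = g / np.linalg.norm(g, axis=1, keepdims=True)   # uniform on the unit sphere
    return float(np.mean(np.exp(lam * (ell @ x))))

lam = 1.0
phi_a = amgf_mc([1.0, 0.0, 0.0], lam)        # norm 1
phi_b = amgf_mc([0.0, 0.6, 0.8], lam)        # norm 1, different direction
phi_c = amgf_mc([2.0, 0.0, 0.0], lam)        # norm 2
exact = np.sinh(lam * 1.0) / (lam * 1.0)     # closed form for n = 3 and norm 1
```

Up to sampling error, `phi_a` and `phi_b` agree with each other and with `exact`, as in part (i), and `phi_c` exceeds both, as in part (ii).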

-C Proof of Lemma V.3

Let $\tau\mapsto x_{\tau}$ and $\tau\mapsto y_{\tau}$ be two trajectories of the system (4). Since $\mu(D_{x}f(x,u,t))\leq 0$ for every $(x,u,t)\in\mathbb{R}^{n}\times\mathcal{U}\times\mathbb{R}_{\geq 0}$, we get, by [28, Theorem 36],

\displaystyle\|x_{\tau}-y_{\tau}\|\leq\|x_{t}-y_{t}\|,\mbox{ for all }\tau\geq t.

Using Lemma V.2 (ii), for every $\tau\geq t$ ,

\displaystyle\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}(e^{\lambda\langle\ell,x_{\tau}-y_{\tau}\rangle})\leq\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}(e^{\lambda\langle\ell,x_{t}-y_{t}\rangle}).

This implies that, for every $\tau>t$,

\displaystyle\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(\tfrac{1}{\tau-t}\left(e^{\lambda\langle\ell,x_{\tau}-y_{\tau}\rangle}-e^{\lambda\langle\ell,x_{t}-y_{t}\rangle}\right)\right)\leq 0.

Taking the limit as $\tau-t\to 0^{+}$, we have

\displaystyle\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\langle\ell,x_{t}-y_{t}\rangle}\lambda\ell^{\mathsf{T}}(f(x_{t},u_{t},t)-f(y_{t},u_{t},t))\right)\leq 0.

The result follows by noting that $x_{t},y_{t}\in\mathbb{R}^{n}$ and $u_{t}\in\mathcal{U}$ have been chosen arbitrarily.

-D Proof of Lemma V.4

Define random vector $\tilde{X}=QX$ , where $Q\sim\mathbb{U}^{n}$ is a random unitary matrix. By Lemma V.2(i), we have that for any $\eta\in\mathcal{S}^{n-1}$ ,

\begin{split}\Phi_{n,\lambda}(X)&=\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\|X\|\langle\ell,\frac{X}{\|X\|}\rangle}\right)=\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\|X\|\langle\ell,\eta\rangle}\right)\\ &=\mathbb{E}_{\ell\sim\mathcal{S}^{n-1}}\left(e^{\lambda\langle\eta,\ell\|X\|\rangle}\right)=\mathbb{E}_{Q\sim\mathbb{U}^{n}}\left(e^{\lambda\langle\eta,QX\rangle}\right),\end{split}

where the last equality uses the fact that $Q\ell\in\mathcal{S}^{n-1}$ for any $\ell\in\mathcal{S}^{n-1}$. By (24), we obtain

\begin{split}&\mathbb{E}_{\tilde{X}}\left(e^{\lambda\langle\eta,\tilde{X}\rangle}\right)=\mathbb{E}_{X}\mathbb{E}_{Q}\left(e^{\lambda\langle\eta,QX\rangle}\right)\\ =&\mathbb{E}_{X}\left(\Phi_{n,\lambda}(X)\right)\leq e^{\frac{\lambda^{2}\sigma^{2}}{2}},\quad\forall\lambda\in\mathbb{R},~\forall\eta\in\mathcal{S}^{n-1}.\end{split}

(62)

Therefore, $\tilde{X}$ is sub-Gaussian with variance proxy $\sigma^{2}$ . By Lemma V.1, $\tilde{X}$ satisfies (20).

Finally, since $\|X\|=\|QX\|=\|\tilde{X}\|$ for any $Q\in\mathbb{U}^{n}$ , we conclude that $X$ also satisfies (20). This completes the proof.
