Open Access (CC BY 4.0). Published by De Gruyter, June 30, 2021.

Data-driven model predictive control: closed-loop guarantees and experimental results

Datenbasierte prädiktive Regelung: theoretische Garantien und experimentelle Ergebnisse
  • Julian Berberich, Johannes Köhler, Matthias A. Müller and Frank Allgöwer

Abstract

We provide a comprehensive review and practical implementation of a recently developed model predictive control (MPC) framework for controlling unknown systems using only measured data and no explicit model knowledge. Our approach relies on an implicit system parametrization from behavioral systems theory based on one measured input-output trajectory. The presented MPC schemes guarantee closed-loop stability for unknown linear time-invariant (LTI) systems, even if the data are affected by noise. Further, we extend this MPC framework to control unknown nonlinear systems by continuously updating the data-driven system representation using new measurements. The simple and intuitive applicability of our approach is demonstrated with a nonlinear four-tank system in simulation and in an experiment.

Zusammenfassung

Dieser Artikel beinhaltet einen umfassenden Überblick sowie eine praktische Implementierung von kürzlich entwickelten Entwurfsverfahren zur modellprädiktiven Regelung (MPC), welche unbekannte Systeme nur mit Hilfe von gemessenen Daten und ohne explizites Modellwissen regeln. Unser Ansatz bedient sich einer impliziten Systemparametrisierung aus der behavioral Systemtheorie basierend auf einer Eingangs-Ausgangs-Trajektorie. Die präsentierten MPC-Algorithmen garantieren Stabilität für unbekannte lineare, zeitinvariante Systeme, selbst im Fall von verrauschten Messungen. Zusätzlich stellen wir eine Erweiterung vor, um unbekannte nichtlineare Systeme zu regeln durch stetige Aktualisierung der datenbasierten Systemparametrisierung. Die einfache und intuitive Anwendbarkeit wird an einem nichtlinearen Vier-Tank System in der Simulation und in einem Experiment demonstriert.

1 Introduction

Model predictive control (MPC) is a successful modern control technique which relies on the repeated solution of an open-loop optimal control problem [21]. Essential advantages of MPC are its applicability to general system classes and the possibility to enforce constraint satisfaction. In order to implement an MPC controller, typically an accurate model of the plant is required. Since modeling is often the most time-consuming step in controller design and due to the increasing availability of data, control approaches using only data and inaccurate or no model knowledge have recently gained increasing attention [15]. Examples for such approaches are recent works on adaptive [2], [3] or learning-based [14] MPC.

Another promising approach for designing MPC schemes using only measured data stems from a result from behavioral systems theory: In [22], it is shown that one input-output trajectory of an unknown linear time-invariant (LTI) system can be used to parametrize all trajectories, assuming that the corresponding input is persistently exciting. By replacing the standard state-space model with this data-dependent parametrization, it is simple to design MPC schemes which use input-output data instead of prior model knowledge [23], [11], [7]. Such MPC schemes have successfully been applied to challenging real-world examples, compare [13], and open-loop robustness properties have been established [12]. However, for a reliable application to complex or safety-critical systems, guarantees for the closed-loop behavior are crucial; such guarantees are challenging to obtain, in particular in the case of noisy data.

In this paper, we provide an overview of recent advances in data-driven MPC based on [22]. We focus on MPC schemes with guaranteed closed-loop stability and robustness properties in case of LTI systems [7], [5], [6], [8], [9]. Additionally, we demonstrate how such MPC schemes can be modified to control unknown nonlinear systems using only measured data. We perform an extensive validation of this approach in simulation and in an experiment involving the classical nonlinear four-tank system from [20].

The remainder of the paper is structured as follows. After providing some preliminaries in Section 2, we present MPC schemes to control LTI systems using noise-free data, LTI systems using noisy data, and nonlinear systems, respectively, in Section 3. We then validate the presented MPC framework with a nonlinear four-tank system in simulation (Section 4) and in an experiment (Section 5). Finally, we conclude the paper in Section 6.

2 Preliminaries

We write I_[a,b] for the set of all integers in the interval [a,b], I_{≥0} for the set of nonnegative integers, and R_{≥0} for the set of nonnegative real numbers. For a vector x, we denote by ‖x‖_p its p-norm. We denote an identity matrix of appropriate dimension by I, we write P = P^⊤ ≻ 0 if a matrix P is positive definite, and we define ‖x‖_P^2 := x^⊤ P x. The interior of a set X is denoted by int(X). We define K as the class of functions α: R_{≥0} → R_{≥0} which are continuous, strictly increasing, and satisfy α(0) = 0. For a sequence {u_k}_{k=0}^{N−1}, we define the Hankel matrix

H_L(u) := \begin{bmatrix} u_0 & u_1 & \cdots & u_{N-L} \\ u_1 & u_2 & \cdots & u_{N-L+1} \\ \vdots & \vdots & & \vdots \\ u_{L-1} & u_L & \cdots & u_{N-1} \end{bmatrix}

and we write u_{[a,b]} := [u_a^⊤ ⋯ u_b^⊤]^⊤ as well as u := u_{[0,N−1]}. For our theoretical results, we consider an LTI system

(1)  x_{k+1} = A x_k + B u_k,  y_k = C x_k + D u_k

with state x_k ∈ R^n, input u_k ∈ R^m, and output y_k ∈ R^p. Throughout this paper, we make the standing assumption that (A,B) is controllable, (A,C) is observable, and an upper bound on the system order n is known. Beyond that, no knowledge of System (1) is available and, in particular, the matrices A, B, C, D are unknown. A measured input-output trajectory {u_k^d, y_k^d}_{k=0}^{N−1} is assumed to be available, where the input u^d is persistently exciting.

Definition 1.

We say that a sequence {u_k}_{k=0}^{N−1} with u_k ∈ R^m is persistently exciting of order L if rank(H_L(u)) = mL.

Note that persistence of excitation of order L imposes a lower bound on the required data length N, i. e., N ≥ (m+1)L − 1. The following result provides a purely data-driven parametrization of all trajectories of (1). While the result is originally formulated and proven in the behavioral framework in [22], we state a reformulation in the state-space framework from [4].
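To make these definitions concrete, the following Python sketch (all function and variable names are our own and not from the paper) constructs the block Hankel matrix H_L(u) for a vector-valued sequence and checks the rank condition of Definition 1.

```python
import numpy as np

def block_hankel(u, L):
    """Build H_L(u) for a sequence u of shape (N, m); the result has shape (m*L, N-L+1)."""
    N, m = u.shape
    cols = N - L + 1
    H = np.zeros((m * L, cols))
    for i in range(L):
        H[i * m:(i + 1) * m, :] = u[i:i + cols, :].T
    return H

def is_persistently_exciting(u, L):
    """Check rank(H_L(u)) == m*L (Definition 1); this requires N >= (m+1)*L - 1."""
    return np.linalg.matrix_rank(block_hankel(u, L)) == u.shape[1] * L

# A random input of length N = 50 with m = 2 is generically persistently exciting of order L = 10.
u_d = np.random.uniform(-1, 1, size=(50, 2))
print(is_persistently_exciting(u_d, L=10))
```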

Theorem 1 ([4, Theorem 3]).

Suppose {u_k^d, y_k^d}_{k=0}^{N−1} is a trajectory of (1), where u^d is persistently exciting of order L + n. Then, {ū_k, ȳ_k}_{k=0}^{L−1} is a trajectory of (1) if and only if there exists α ∈ R^{N−L+1} such that

(2)  \begin{bmatrix} H_L(u^d) \\ H_L(y^d) \end{bmatrix} \alpha = \begin{bmatrix} \bar{u} \\ \bar{y} \end{bmatrix}.

Theorem 1 shows that Hankel matrices containing one persistently exciting input-output trajectory span the space of all system trajectories. This allows us to parametrize any trajectory of an unknown system, using only measured data and no explicit model knowledge. While verifying the condition on u^d in Theorem 1 requires knowledge of the system order n, the result (and all further results in this paper relying on Theorem 1) remains true if n is replaced by a (potentially rough) upper bound.
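As a quick numerical illustration of Theorem 1 (a sketch under our own naming; the system matrices below are a made-up example used only to generate data, and block_hankel is the helper defined above), one can verify that an arbitrary trajectory of the data-generating system satisfies (2), i. e., the stacked Hankel equation has an exact solution α:

```python
import numpy as np

def lti_trajectory(A, B, C, D, u, x0):
    """Simulate system (1) for an input sequence u of shape (T, m), starting from x0."""
    x, ys = x0, []
    for uk in u:
        ys.append(C @ x + D @ uk)
        x = A @ x + B @ uk
    return np.array(ys)

# Example system, used only to generate data for the check (the MPC schemes never use A, B, C, D).
A = np.array([[0.9, 0.2], [0.0, 0.8]]); B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]]);             D = np.zeros((1, 1))
n, L, N = 2, 10, 60

u_d = np.random.uniform(-1, 1, (N, 1))                  # persistently exciting of order L + n
y_d = lti_trajectory(A, B, C, D, u_d, np.zeros(2))
u_bar = np.random.uniform(-1, 1, (L, 1))                # some other trajectory of the same system
y_bar = lti_trajectory(A, B, C, D, u_bar, np.array([0.3, -0.5]))

H = np.vstack([block_hankel(u_d, L), block_hankel(y_d, L)])
rhs = np.concatenate([u_bar.reshape(-1), y_bar.reshape(-1)])
alpha = np.linalg.lstsq(H, rhs, rcond=None)[0]
print(np.linalg.norm(H @ alpha - rhs))                  # ~1e-12: (u_bar, y_bar) satisfies (2)
```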

3 Data-driven model predictive control

In this section, we review data-driven MPC schemes based on Theorem 1 with a special focus on the closed-loop guarantees that can be given for such schemes if applied to LTI systems. We address the cases of noise-free data (Section 3.1) and noisy data (Section 3.2) both for LTI systems. Furthermore, we present a data-driven MPC scheme to control nonlinear systems in Section 3.3.

3.1 Nominal data-driven MPC for LTI systems

Our goal is to track a given input-output setpoint (u^s, y^s) ∈ U × Y which corresponds to an equilibrium of the system (1), i. e., {u_k, y_k}_{k=0}^{n} with (u_k, y_k) = (u^s, y^s), k ∈ I_[0,n], is a valid trajectory of (1) (compare [7, Definition 3]). At the same time, we want to satisfy pointwise-in-time constraints u_t ∈ U, y_t ∈ Y for given constraint sets U ⊆ R^m, Y ⊆ R^p. MPC is a well-established method which can be used to achieve this task. It relies on the repeated solution of an open-loop optimal control problem, optimizing over all possible future system trajectories at each time step and always applying the first input component [21]. Standard MPC approaches exploit model knowledge, i. e., knowledge of the matrices A, B, C, D in (1), in order to solve this optimization problem. In contrast, the MPC scheme we consider relies on Theorem 1 which parametrizes all possible system trajectories, using only one input-output trajectory {u_k^d, y_k^d}_{k=0}^{N−1}.

Future trajectories can only be uniquely predicted if an additional initial condition is imposed, compare [18]. Therefore, since we assume that only input-output data of (1) and no state measurements are available, we use the last n input-output measurements {u_k, y_k}_{k=t−n}^{t−1} to implicitly specify initial conditions at time t and thus, to fix a unique system trajectory. Based on these ingredients, we define the following optimal control problem:

(3a)  \min_{\alpha(t), \bar{u}(t), \bar{y}(t)} \sum_{k=0}^{L-1} \|\bar{u}_k(t) - u^s\|_R^2 + \|\bar{y}_k(t) - y^s\|_Q^2
(3b)  \text{s.t.} \quad \begin{bmatrix} \bar{u}_{[-n,L-1]}(t) \\ \bar{y}_{[-n,L-1]}(t) \end{bmatrix} = \begin{bmatrix} H_{L+n}(u^d) \\ H_{L+n}(y^d) \end{bmatrix} \alpha(t),
(3c)  \begin{bmatrix} \bar{u}_{[-n,-1]}(t) \\ \bar{y}_{[-n,-1]}(t) \end{bmatrix} = \begin{bmatrix} u_{[t-n,t-1]} \\ y_{[t-n,t-1]} \end{bmatrix},
(3d)  \begin{bmatrix} \bar{u}_{[L-n,L-1]}(t) \\ \bar{y}_{[L-n,L-1]}(t) \end{bmatrix} = \begin{bmatrix} u_n^s \\ y_n^s \end{bmatrix},
(3e)  \bar{u}_k(t) \in U, \quad \bar{y}_k(t) \in Y, \quad k \in I_{[0,L-1]}.

Problem (3) takes a common MPC form, minimizing the difference of the predicted input-output variables ū(t), ȳ(t) w. r. t. the setpoint (u^s, y^s) while satisfying the constraints in (3e). The matrices Q, R ≻ 0 are weights for tuning which can be specified by the user. The key difference to standard model-based MPC is that the “prediction model” is formed based on Theorem 1, i. e., by using Hankel matrices in (3b). Moreover, (3c) initializes the predictions using the last n input-output measurements, which implies that the internal states of the predictions and of the system at time t coincide. Due to these initial conditions, the predictions have an overall length of L + n.

Further, the constraint (3d) is a terminal equality constraint on the last n input-output predictions, similar to model-based MPC [21], where such conditions can be imposed on the state to ensure closed-loop stability. In Equation (3d), we write u_n^s, y_n^s for column vectors containing n copies of u^s and y^s, respectively. The constraint (3d) is the main difference between Problem (3) and other works on data-driven MPC, e. g., [11], [23], and it can be used to prove closed-loop stability for the presented MPC scheme. Note that Problem (3) does not require offline or online state measurements and hence, the considered MPC approach is inherently an output-feedback MPC.

For polytopic constraints, Problem (3) is a convex quadratic program (QP) which can be solved efficiently, similar to model-based MPC. Throughout this section, we write u_t, x_t, y_t for closed-loop variables at time t ∈ I_{≥0}, and {ū_k(t), ȳ_k(t)}_{k=−n}^{L−1} for the optimal solution predicted at time t. Problem (3) is applied in a standard receding horizon fashion which is summarized in Algorithm 1.
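For polytopic (here: box) input constraints, Problem (3) can be written down almost verbatim in a modeling language such as cvxpy. The following sketch uses our own naming, reuses block_hankel from Section 2, and omits the output constraints in (3e) for brevity; it is an illustration, not the authors' implementation.

```python
import numpy as np
import cvxpy as cp

def nominal_dd_mpc_step(u_d, y_d, u_init, y_init, u_s, y_s, L, n, Q, R, u_min, u_max):
    """One step of the nominal data-driven MPC: solve Problem (3) and return the first input."""
    m, p = u_d.shape[1], y_d.shape[1]
    Hu = block_hankel(u_d, L + n)                  # data-driven "prediction model", cf. (3b)
    Hy = block_hankel(y_d, L + n)

    alpha = cp.Variable(Hu.shape[1])
    u_bar = cp.Variable((L + n) * m)               # stacked predictions for k = -n, ..., L-1
    y_bar = cp.Variable((L + n) * p)
    blk = lambda z, k, d: z[k * d:(k + 1) * d]     # k-th time block of a stacked vector

    cost = sum(cp.quad_form(blk(u_bar, n + k, m) - u_s, R)
               + cp.quad_form(blk(y_bar, n + k, p) - y_s, Q) for k in range(L))
    cons = [u_bar == Hu @ alpha, y_bar == Hy @ alpha]                 # (3b)
    cons += [blk(u_bar, k, m) == u_init[k] for k in range(n)]         # (3c): last n inputs
    cons += [blk(y_bar, k, p) == y_init[k] for k in range(n)]         # (3c): last n outputs
    cons += [blk(u_bar, L + k, m) == u_s for k in range(n)]           # (3d): terminal equality
    cons += [blk(y_bar, L + k, p) == y_s for k in range(n)]
    cons += [u_bar >= u_min, u_bar <= u_max]                          # input part of (3e)

    cp.Problem(cp.Minimize(cost), cons).solve()
    return u_bar.value[n * m:(n + 1) * m]          # first predicted input, applied to the plant
```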

The following result summarizes the closed-loop properties of Algorithm 1 when applied to (1).

Theorem 2 ([7, Theorem 2]).

Suppose L ≥ n, u^d is persistently exciting of order L + 2n, and the optimal cost of (3) is upper bounded by[1] c_u ‖x_t − x^s‖_2^2 for some c_u > 0 [7, Assumption 1]. If Problem (3) is feasible at t = 0, then

  1. it is feasible at any t ∈ I_{≥0},

  2. the closed loop satisfies the constraints, i. e., u_t ∈ U and y_t ∈ Y for all t ∈ I_{≥0},

  3. the steady-state x^s is exponentially stable for the resulting closed loop.

Theorem 2 shows that the simple MPC scheme based on repeatedly solving (3) stabilizes the unknown LTI system (1), using only one a priori collected input-output trajectory. The proof is similar to stability arguments in model-based MPC [21] with the additional difficulty that the cost of (3) depends on the output and is thus only positive semi-definite in the internal state. The assumption that the cost of (3) is quadratically upper bounded is not restrictive and it holds, e. g., for compact constraints if (u^s, y^s) ∈ int(U × Y) (see [7], [8] for details).

Since Theorem 1 provides an equivalent parametrization of system trajectories, its applicability is not limited to MPC schemes with terminal equality constraints as above. In particular, it can be used to design more sophisticated MPC schemes with general terminal ingredients, i. e., a terminal cost and a terminal region constraint, see [8] for details. Similar to terminal ingredients in model-based MPC, this has the advantage of increasing the region of attraction and improving robustness in closed loop. Alternatively, Theorem 1 is used to design a data-driven tracking MPC scheme in [5], where the setpoint (u^s, y^s) for which the terminal equality constraint (3d) is imposed is optimized online, analogously to model-based tracking MPC [17]. In the data-driven problem setting considered in this paper, such a tracking formulation has the advantage that the given input-output setpoint need not be an equilibrium of the unknown system (1), which is a property that may be difficult to verify in practice. Finally, [9] provides closed-loop stability and robustness guarantees for a data-driven MPC scheme without any terminal ingredients for both noise-free and noisy input-output data.

3.2 Robust data-driven MPC for LTI systems

Theorem 2 only applies if the measured data are noise-free, which is rarely the case in a practical application. In this section, we consider the more challenging case of noisy data. In particular, we assume that both the data used for prediction as well as the initial conditions are affected by bounded output measurement noise, i. e., we have access to {u_k^d, ỹ_k^d}_{k=0}^{N−1} and {u_k, ỹ_k}_{k=t−n}^{t−1}, where ỹ_k^d = y_k^d + ε_k^d and ỹ_k = y_k + ε_k with the noise satisfying the bounds ‖ε_k^d‖ ≤ ε̄ and ‖ε_k‖ ≤ ε̄ for k ∈ I_{≥0}, for some ε̄ > 0. In order to retain desirable closed-loop properties despite noisy measurements, we consider the following modified data-driven MPC scheme:

(4a)  \min_{\alpha(t), \sigma(t), \bar{u}(t), \bar{y}(t)} \sum_{k=0}^{L-1} \|\bar{u}_k(t) - u^s\|_R^2 + \|\bar{y}_k(t) - y^s\|_Q^2 + \lambda_\alpha \bar{\varepsilon} \|\alpha(t)\|_2^2 + \frac{\lambda_\sigma}{\bar{\varepsilon}} \|\sigma(t)\|_2^2
(4b)  \text{s.t.} \quad \begin{bmatrix} \bar{u}(t) \\ \bar{y}(t) + \sigma(t) \end{bmatrix} = \begin{bmatrix} H_{L+n}(u^d) \\ H_{L+n}(\tilde{y}^d) \end{bmatrix} \alpha(t),
(4c)  \begin{bmatrix} \bar{u}_{[-n,-1]}(t) \\ \bar{y}_{[-n,-1]}(t) \end{bmatrix} = \begin{bmatrix} u_{[t-n,t-1]} \\ \tilde{y}_{[t-n,t-1]} \end{bmatrix},
(4d)  \begin{bmatrix} \bar{u}_{[L-n,L-1]}(t) \\ \bar{y}_{[L-n,L-1]}(t) \end{bmatrix} = \begin{bmatrix} u_n^s \\ y_n^s \end{bmatrix}, \quad \bar{u}_k(t) \in U, \quad k \in I_{[0,L-1]}.

In order to account for the noise affecting the available data in (4b), Problem (4) contains an additional slack variable σ(t). Both the slack variable and the vector α(t) are regularized in the cost, where the regularization depends on parameters λ_α, λ_σ > 0 as well as on the noise level ε̄. The regularization of α(t) is needed since there exist infinitely many α satisfying (2) for a given input-output trajectory. The noise in the data ỹ^d acts as a multiplicative uncertainty w. r. t. α(t) in (4b) and thus regularizing the norm of α(t) reduces the influence of the noise on the prediction accuracy. On the other hand, the regularization of σ(t) prevents large values of σ(t) which may also deteriorate the prediction accuracy. Note that Problem (4) recovers the nominal MPC scheme in Problem (3) for ε̄ → 0. In [7], an additional (non-convex) constraint on σ(t) was required, but it was recently shown in [9] that this constraint can be dropped if the regularization of σ(t) depends reciprocally on ε̄, cf. (4a). Hence, if U is a convex polytope, Problem (4) is a strictly convex QP.
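Compared to the nominal QP above, only the Hankel data, the slack variable, and the two regularization terms change. The following sketch mirrors this (again with our own naming and box input constraints; not the authors' code):

```python
import numpy as np
import cvxpy as cp

def robust_dd_mpc_step(u_d, y_d_noisy, u_init, y_init_noisy, u_s, y_s,
                       L, n, Q, R, lam_alpha, lam_sigma, eps_bar, u_min, u_max):
    """Solve Problem (4) and return the full optimal input sequence u_bar_0, ..., u_bar_{L-1}."""
    m, p = u_d.shape[1], y_d_noisy.shape[1]
    Hu = block_hankel(u_d, L + n)
    Hy = block_hankel(y_d_noisy, L + n)            # Hankel matrix of the noisy output data, cf. (4b)

    alpha = cp.Variable(Hu.shape[1])
    sigma = cp.Variable((L + n) * p)               # slack variable accounting for the noise
    u_bar = cp.Variable((L + n) * m)
    y_bar = cp.Variable((L + n) * p)
    blk = lambda z, k, d: z[k * d:(k + 1) * d]

    cost = sum(cp.quad_form(blk(u_bar, n + k, m) - u_s, R)
               + cp.quad_form(blk(y_bar, n + k, p) - y_s, Q) for k in range(L))
    cost += lam_alpha * eps_bar * cp.sum_squares(alpha)             # regularization of alpha, (4a)
    cost += (lam_sigma / eps_bar) * cp.sum_squares(sigma)           # reciprocal regularization of sigma
    cons = [u_bar == Hu @ alpha, y_bar + sigma == Hy @ alpha]       # (4b)
    cons += [blk(u_bar, k, m) == u_init[k] for k in range(n)]       # (4c), with noisy past outputs
    cons += [blk(y_bar, k, p) == y_init_noisy[k] for k in range(n)]
    cons += [blk(u_bar, L + k, m) == u_s for k in range(n)]         # (4d): terminal equality
    cons += [blk(y_bar, L + k, p) == y_s for k in range(n)]
    cons += [u_bar >= u_min, u_bar <= u_max]                        # input constraints only

    cp.Problem(cp.Minimize(cost), cons).solve()
    return u_bar.value.reshape(L + n, m)[n:]
```

In the multi-step scheme discussed below (Algorithm 2), the first n entries of the returned input sequence are applied before the problem is solved again with updated initial conditions.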

For simplicity, we do not consider output constraints in (4), i. e., Y = R^p. It is possible to extend the presented results by including a constraint tightening which guarantees robust output constraint satisfaction despite output measurement noise, see [6]. Finally, we note that MPC schemes similar to Problem (4) have been proposed in [11], [12], but only open-loop robustness properties have been proven. In the following, we state closed-loop properties resulting from the application of Problem (4) in a multi-step fashion, see Algorithm 2. We consider a multi-step MPC scheme due to the joint occurrence of model mismatch, i. e., output measurement noise in the Hankel matrix in (4b), and terminal equality constraints. Due to this combination and the controllability argument used to prove stability in [7], the theoretical guarantees are only valid locally for a one-step MPC scheme [7, Remark 4]. When removing the terminal equality constraints (4d) as in [9] or replacing them by general terminal ingredients [8], then comparable closed-loop guarantees can also be given for a one-step scheme.

Theorem 3 ([7, Theorem 3]).

Suppose L ≥ 2n, u^s = 0 ∈ int(U), and u^d is persistently exciting of order L + 2n. Then, there exist a set X_V ⊆ R^n, parameters λ_α, λ_σ > 0, a sufficiently small noise bound ε̄ > 0, and a function β ∈ K such that X_V is positively invariant and x_t converges exponentially to {x ∈ R^n : ‖x‖_2 ≤ β(ε̄)} in closed loop.

Theorem 3 should be interpreted as follows: If the parameters λ_α, λ_σ are chosen suitably and the noise bound is sufficiently small, then the state x_t converges exponentially to a region around 0, i. e., the closed loop is practically exponentially stable. We consider u^s = 0 (which implies y^s = 0) for simplicity, but the same result holds qualitatively if (u^s, y^s) ≠ (0, 0), compare [7, Remark 5]. The guaranteed region of attraction X_V is the sublevel set of a practical Lyapunov function which can be large (i. e., close to the region of attraction of the nominal MPC scheme in Section 3.1) if λ_α, λ_σ are chosen suitably and ε̄ is sufficiently small. Similarly, the function β(ε̄), i. e., the size of the region to which the closed loop converges, also depends on the parameters λ_α, λ_σ, ε̄ and, in particular, it decreases for smaller noise levels ε̄. As is discussed in more detail in [7], a larger magnitude of the input {u_k^d}_{k=0}^{N−1} generating the data and an increasing length N of the data both improve the closed-loop properties under Algorithm 2, i. e., they increase the region of attraction and decrease the tracking error. While these findings only reveal qualitative relations between different quantities, it is an important open problem to develop quantitative guidelines for the appropriate selection of the parameters in (4); this question is also analyzed for the example in Section 4.

To summarize, the MPC scheme based on repeatedly solving Problem (4) drives the system close to the desired setpoint using a noisy input-output trajectory of finite length. Sequential system identification followed by model-based MPC is an obvious alternative to Algorithm 2. Advantages of our approach are its simplicity, requiring no prior identification step, while at the same time providing closed-loop guarantees based on noisy data of finite length, which is a challenging problem in identification-based MPC due to the lack of tight estimation error bounds.

3.3 Data-driven MPC for nonlinear systems

Arguably, one of the biggest challenges in learning-based and data-driven control is the development of methods to control unknown nonlinear systems with closed-loop guarantees. In the following, we address this issue with an MPC scheme based on Theorem 1, which we then apply to a practical example in the subsequent sections. We do not provide theoretical results for the closed-loop behavior under the presented MPC scheme; this is a topic of our current research. Let us assume that, instead of (1), the considered system takes the form

(5)  x_{t+1} = f(x_t) + g(x_t) u_t,  y_t = h_0(x_t) + h_1(x_t) u_t

with unknown vector fields f, g, h_0, h_1 of appropriate dimensions. In the following, our goal is to track a desired output setpoint[2] y^T, i. e., y_t → y^T for t → ∞, while satisfying input constraints u_t ∈ U, t ∈ I_{≥0}. To this end, we consider an MPC scheme based on Theorem 1, similar to the approaches in the previous sections. In order to account for the nonlinear nature of the dynamics, we update the (noisy) data {u_k^d, ỹ_k^d}_{k=0}^{N−1} used for prediction online based on current measurements. In this way, we exploit the fact that the nonlinear system (5) can be locally approximated as a linear system (assuming the vector fields are sufficiently smooth). Given the past N input-output measurements {u_k, ỹ_k}_{k=t−N}^{t−1} of (5) at time t ≥ N, we consider the following open-loop optimal control problem:

(6a)  \min_{\alpha(t), \sigma(t), \bar{u}(t), \bar{y}(t), u^s(t), y^s(t)} \sum_{k=0}^{L} \|\bar{u}_k(t) - u^s(t)\|_R^2 + \|\bar{y}_k(t) - y^s(t)\|_Q^2 + \|y^s(t) - y^T\|_S^2 + \lambda_\alpha \|\alpha(t)\|_2^2 + \lambda_\sigma \|\sigma(t)\|_2^2
(6b)  \text{s.t.} \quad \begin{bmatrix} \bar{u}(t) \\ \bar{y}(t) + \sigma(t) \end{bmatrix} = \begin{bmatrix} H_{L+n+1}(u_{[t-N,t-1]}) \\ H_{L+n+1}(\tilde{y}_{[t-N,t-1]}) \end{bmatrix} \alpha(t),
(6c)  \begin{bmatrix} \bar{u}_{[-n,-1]}(t) \\ \bar{y}_{[-n,-1]}(t) \end{bmatrix} = \begin{bmatrix} u_{[t-n,t-1]} \\ \tilde{y}_{[t-n,t-1]} \end{bmatrix},
(6d)  \begin{bmatrix} \bar{u}_{[L-n,L]}(t) \\ \bar{y}_{[L-n,L]}(t) \end{bmatrix} = \begin{bmatrix} u_{n+1}^s(t) \\ y_{n+1}^s(t) \end{bmatrix},
(6e)  \sum_{i=0}^{N-L-n-1} \alpha_i(t) = 1, \quad u^s(t) \in U^s,
(6f)  \bar{u}_k(t) \in U, \quad k \in I_{[0,L]}.

The key difference between Problem (6) and the MPC schemes considered in the previous sections is that the data used for prediction in (6b) are updated online, thus providing a local linear approximation of the unknown nonlinear system (5). Note that (6) contains a slack variable σ(t) as well as regularizations of σ(t) and α(t), similar to the robust MPC problem (4) for LTI systems. This is due to the fact that the error caused by the local linear approximation of (5) can also be viewed as output measurement noise similar to Section 3.2.

As an additional difference, Problem (6) includes an artificial setpoint u^s(t), y^s(t) which is optimized online and which enters the terminal equality constraint (6d). The constraint (6d) is specified over n + 1 steps such that (u^s(t), y^s(t)) is an (approximate) equilibrium of the system and thus, the overall prediction horizon is of length L + 1. At the same time, the distance of y^s(t) w. r. t. the actual target setpoint y^T is penalized, where the matrix S ≻ 0 is a design parameter. The input setpoint u^s(t) lies in some constraint set U^s ⊆ int(U). The idea of optimizing u^s(t), y^s(t) online is inspired by model-based [17] and data-driven [5] tracking MPC, where artificial setpoints can be used to increase the region of attraction or retain closed-loop properties despite online setpoint changes. In the present problem setting, such an approach has the advantage that, if S is sufficiently small, then the optimal artificial setpoint (u^s(t), y^s(t)) appearing in the terminal equality constraint (6d) remains close to the optimal predicted input-output trajectory (ū(t), ȳ(t)) and hence, close to the initial state x_t. This means that the MPC first drives the system close to the steady-state manifold, where the linearity-based model (6b) is a good approximation of the nonlinear system dynamics (5) and therefore, the prediction error is small. Then, the artificial setpoint is slowly shifted towards the target setpoint y^T along the steady-state manifold and hence, the MPC also steers the closed-loop trajectory towards y^T.

Finally, (6e) implies that the weighting vector α(t) sums up to 1. The explanation for this modification is that the linearization of (5) at a point which is not a steady-state of (5) generally leads to affine (not linear) system dynamics. Theorem 1 provides a data-driven system parametrization which only applies to linear systems. In order to parametrize trajectories of an affine system based on measured data, the constraint (6e) needs to be added since it implies that the constant offset is carried through from the measured data to the predictions. Problem (6) can now be applied in a standard receding horizon fashion which is summarized in Algorithm 3.
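Problem (6) is again a strictly convex QP and can be sketched along the same lines as before (our own naming; u_win, y_win denote the most recent N input-output measurements, which are updated at every time step):

```python
import numpy as np
import cvxpy as cp

def nonlinear_dd_mpc_step(u_win, y_win, u_init, y_init, y_T, L, n, Q, R, S,
                          lam_alpha, lam_sigma, u_min, u_max, us_min, us_max):
    """One step of the data-driven MPC for nonlinear systems: solve Problem (6)."""
    m, p = u_win.shape[1], y_win.shape[1]
    Hu = block_hankel(u_win, L + n + 1)            # Hankel matrices of the last N measurements, (6b)
    Hy = block_hankel(y_win, L + n + 1)

    alpha = cp.Variable(Hu.shape[1])
    sigma = cp.Variable((L + n + 1) * p)           # slack absorbing the linearization error
    u_bar = cp.Variable((L + n + 1) * m)           # predictions for k = -n, ..., L
    y_bar = cp.Variable((L + n + 1) * p)
    u_s = cp.Variable(m)                           # artificial setpoint, optimized online
    y_s = cp.Variable(p)
    blk = lambda z, k, d: z[k * d:(k + 1) * d]

    cost = sum(cp.quad_form(blk(u_bar, n + k, m) - u_s, R)
               + cp.quad_form(blk(y_bar, n + k, p) - y_s, Q) for k in range(L + 1))
    cost += cp.quad_form(y_s - y_T, S)                              # distance to the target y_T
    cost += lam_alpha * cp.sum_squares(alpha) + lam_sigma * cp.sum_squares(sigma)
    cons = [u_bar == Hu @ alpha, y_bar + sigma == Hy @ alpha]       # (6b)
    cons += [blk(u_bar, k, m) == u_init[k] for k in range(n)]       # (6c)
    cons += [blk(y_bar, k, p) == y_init[k] for k in range(n)]
    cons += [blk(u_bar, L + j, m) == u_s for j in range(n + 1)]     # (6d): last n+1 steps at setpoint
    cons += [blk(y_bar, L + j, p) == y_s for j in range(n + 1)]
    cons += [cp.sum(alpha) == 1]                                    # (6e): affine-offset constraint
    cons += [u_s >= us_min, u_s <= us_max]                          # u^s(t) in U^s
    cons += [u_bar >= u_min, u_bar <= u_max]                        # (6f)

    cp.Problem(cp.Minimize(cost), cons).solve()
    return u_bar.value[n * m:(n + 1) * m]          # first predicted input, applied to the plant
```

In closed loop (Algorithm 3), the returned input is applied, the new input-output measurement is appended to the data window, and the oldest sample is discarded.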

It is worth noting that Algorithm 3 only requires solving the strictly convex QP (6) online, although the underlying control problem involves the nonlinear system (5). In this work, we do not address the issue of enforcing that the data (u,y) collected in closed loop and used for prediction in (6b) are persistently exciting. It is an obvious practical problem that, upon convergence of the closed loop, the input may eventually be constant and, in particular, not persistently exciting of a sufficient order, which is also an important issue in adaptive MPC [2]. For the nonlinear four-tank system investigated in Sections 4 and 5, we apply the presented MPC without additional modifications enforcing closed-loop persistence of excitation, but we plan to analyze this issue in future research.

4 Simulation study

In this section, we apply the MPC scheme for nonlinear systems discussed in Section 3.3 to a simulation model of the four-tank system originally considered in [20]. The continuous-time system dynamics can be described as

(7)  \dot{x}_1 = -\frac{a_1}{A_1}\sqrt{2 g x_1} + \frac{a_3}{A_1}\sqrt{2 g x_3} + \frac{\gamma_1}{A_1} u_1, \quad
\dot{x}_2 = -\frac{a_2}{A_2}\sqrt{2 g x_2} + \frac{a_4}{A_2}\sqrt{2 g x_4} + \frac{\gamma_2}{A_2} u_2, \quad
\dot{x}_3 = -\frac{a_3}{A_3}\sqrt{2 g x_3} + \frac{1 - \gamma_2}{A_3} u_2, \quad
\dot{x}_4 = -\frac{a_4}{A_4}\sqrt{2 g x_4} + \frac{1 - \gamma_1}{A_4} u_1,

where x_i is the water level of tank i in cm, u_i the flow rate of pump i in cm³/s, and the other terms are system parameters, whose values are taken from [20] and summarized in Table 1. The output of the system is given by y = [x_1  x_2]^⊤. For the following simulation study, we assume that this output can be measured exactly without noise since this allows us to better investigate and illustrate the interplay between the nonlinear system dynamics and suitable design parameters of Problem (6) leading to a good closed-loop operation. In Section 5, we show that the proposed MPC scheme is also applicable in a real-world experiment with noisy measurements.

Table 1

Parameter values of the simulation model (7).

A_1 = A_2 = 50.27 cm², A_3 = A_4 = 28.27 cm², a_1 = 0.233 cm², a_2 = 0.242 cm², a_3 = a_4 = 0.127 cm², γ_1 = γ_2 = 0.4, g = 981 cm/s².
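For reference, a minimal simulation sketch of the model (7) with the parameter values of Table 1 and the Euler discretization used below (our own naming; used only to generate simulation data):

```python
import numpy as np

A1 = A2 = 50.27; A3 = A4 = 28.27                   # tank cross-sections [cm^2]
a1, a2 = 0.233, 0.242; a3 = a4 = 0.127             # outlet cross-sections [cm^2]
gam1 = gam2 = 0.4                                  # pump flow splitting factors
g, Ts = 981.0, 1.5                                 # gravity [cm/s^2], sampling time [s]

def tank_rhs(x, u):
    """Continuous-time right-hand side of (7); x: water levels [cm], u: pump flow rates [cm^3/s]."""
    x = np.maximum(x, 0.0)                         # guard the square roots
    return np.array([
        -a1 / A1 * np.sqrt(2 * g * x[0]) + a3 / A1 * np.sqrt(2 * g * x[2]) + gam1 / A1 * u[0],
        -a2 / A2 * np.sqrt(2 * g * x[1]) + a4 / A2 * np.sqrt(2 * g * x[3]) + gam2 / A2 * u[1],
        -a3 / A3 * np.sqrt(2 * g * x[2]) + (1 - gam2) / A3 * u[1],
        -a4 / A4 * np.sqrt(2 * g * x[3]) + (1 - gam1) / A4 * u[0],
    ])

def tank_step(x, u):
    """One Euler step of the discretization; the measured output at time t is y_t = (x_1, x_2)."""
    return x + Ts * tank_rhs(x, u)
```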

We now apply the nonlinear MPC scheme from Section 3.3 (compare Algorithm 3) to the discrete-time nonlinear system obtained from (7) via Euler discretization with sampling time T_s = 1.5 seconds. Our goal is to track the setpoint y^T = [15  15]^⊤ while satisfying the input constraints u_t ∈ U = [0, 60]². To this end, we apply an input sequence sampled uniformly from[3] [20, 30]² over the first N time steps to collect initial data, where the system is initialized at x_0 = 0. Thereafter, for each t ≥ N, we solve Problem (6), apply the first component of the optimal predicted input, and update the data {u_k, y_k}_{k=t−N}^{t−1} used for prediction in (6b) in the next time step based on the current measurements. We use the parameters

(8)  N = 150, L = 35, Q = I, R = 2I, S = 20I, λ_α = 5·10^{-5}, λ_σ = 2·10^{5},

and we choose the equilibrium input constraint set as U^s = [0.6, 59.4]². Further, the value of n used in (6) (i. e., our estimate of the system order) is chosen as 3. This suffices for the application of data-driven MPC since the lag of (the linearization of) the above system is 2 and the implicit prediction model remains valid as long as n is an upper bound on the lag (compare [18] for details). The closed-loop input and output trajectories under the MPC scheme with these parameters can be seen in Fig. 1. After the initial excitation phase t ∈ I_[0,N−1], the MPC successfully steers the output to the desired target setpoint. First, we note that updating the data used for prediction in (6b) is a crucial ingredient of our MPC approach for nonlinear systems. In particular, if we do not update the data online but only use the first N input-output measurements {u_k, y_k}_{k=0}^{N−1} for prediction, then the closed loop does not converge to the desired output y^T and instead exhibits a significant permanent offset due to the model mismatch.

For comparison, Fig. 1 also shows the closed-loop trajectory starting at time t = N resulting from a nonlinear tracking MPC scheme with full model knowledge and state measurements from [16], where the parameters are as above except for S = 200I and R = 0.1I. The two MPC schemes exhibit similar convergence speed although the data-driven MPC uses “less aggressive” parameters due to the slack variable σ(t) which implicitly relaxes the terminal equality constraint (6d).

It has been observed in the literature, e. g., [13], that the choice of the regularization parameter λ_α has an essential impact on the closed-loop performance of data-driven MPC. In the following, we investigate in more detail how the specific choice of λ_α influences the closed-loop performance. To this end, we perform closed-loop simulations for a range of values of λ_α and, for each of these simulations, we compute the corresponding cost as the deviation of the closed-loop output from the target setpoint y^T, i. e., J = ∑_{t=N}^{500} ‖y_t − y^T‖_S². For comparison, we note that the parameters in (8) lead to a closed-loop cost of J = 1.42·10^5, whereas the model-based nonlinear MPC shown in Fig. 1 leads to J = 3.1·10^4. Fig. 2 shows the closed-loop cost depending on the parameter λ_α with all other parameters as in (8). Although the cost strongly depends on λ_α, it can be seen that a wide range of values λ_α ∈ [2·10^{-5}, 0.01] leads to a good performance, i. e., J ≤ 1.5·10^5. If λ_α is chosen too small, then the robustness w. r. t. the nonlinearity deteriorates and the influence of numerical inaccuracies increases, which leads to a cost increase. This is in accordance with Theorem 3 which requires that λ_α is suitably chosen (in particular, it cannot be arbitrarily small). On the other hand, if λ_α is chosen too large then the closed-loop cost increases significantly since the resulting small values of the vector α(t) shift the input and output to which the closed loop converges towards zero, i. e., large values of λ_α increase the asymptotic tracking error. To summarize, since a wide range of values of λ_α leads (approximately) to the minimum achievable cost, tuning the parameter λ_α is easy for the present example.
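A condensed sketch of the closed-loop simulation just described, reusing tank_step and nonlinear_dd_mpc_step from above (our own naming; the simulation is noise-free, as assumed in this section):

```python
import numpy as np

N, L, n = 150, 35, 3
Q, R, S = np.eye(2), 2 * np.eye(2), 20 * np.eye(2)
lam_alpha, lam_sigma = 5e-5, 2e5                   # parameters (8)
y_T = np.array([15.0, 15.0])

x, us, ys = np.zeros(4), [], []
for t in range(501):
    if t < N:                                      # initial excitation phase
        u = np.random.uniform(20, 30, size=2)
    else:                                          # data-driven MPC with online data updates
        u = nonlinear_dd_mpc_step(np.array(us[-N:]), np.array(ys[-N:]),
                                  np.array(us[-n:]), np.array(ys[-n:]), y_T,
                                  L, n, Q, R, S, lam_alpha, lam_sigma,
                                  u_min=0.0, u_max=60.0, us_min=0.6, us_max=59.4)
    ys.append(x[:2].copy())                        # measure y_t = (x_1, x_2) before applying u_t
    us.append(u)
    x = tank_step(x, u)                            # plant step (noise-free simulation)

J = sum((y - y_T) @ S @ (y - y_T) for y in ys[N:]) # closed-loop cost over t = N, ..., 500
print(J)
```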

Figure 1: Closed-loop input-output trajectory resulting from the application of data-driven MPC (DD-MPC, Algorithm 3) and of model-based nonlinear MPC (NMPC, [16]) to the four-tank system in simulation.

Figure 2: Closed-loop cost J depending on the parameter λ_α.

Next, we analyze how different choices of other design parameters influence the closed-loop cost. Table 2 displays ranges for various parameters for which the cost J is less than 1.5·10^5, when keeping all other parameters as in (8). The data length N needs to be sufficiently large such that the input is persistently exciting, but choosing it too large deteriorates the performance since then the data used for prediction in (6b) cover a larger region of the state-space and the implicit linearity-based “model” is a less accurate approximation of the nonlinear dynamics (7). This is in contrast to the results on robust data-driven MPC for linear systems in Section 3.2, where larger data lengths always improve the closed-loop performance (cf. [7]). Similarly, too large values for the prediction horizon L are detrimental since they imply that the predicted trajectories are further away from the initial state, where the prediction accuracy deteriorates. On the other hand, too short horizons L lead to worse robustness due to the terminal equality constraints (6d). The assumed system order n cannot be larger than 4 due to the dependence of the required persistence of excitation on n and since larger values of n effectively shorten the prediction horizon due to the terminal equality constraints (6d), which are specified over n + 1 time steps. If N and L are increased to N = 190 and L = 40, then the closed-loop output still converges to y^T, e. g., for the upper bound 10 on the system order.

Table 2

Parameters leading to a closed-loop cost J ≤ 1.5·10^5.

N: I_[130,159]    L: I_[32,41]    assumed system order: I_[2,4]
s̄: [16, 3·10^2]    λ_α: [2·10^{-5}, 0.01]    λ_σ: [4·10^2, 10^6]

Further, Table 2 displays values of s̄ leading to a good closed-loop performance if the matrix S is chosen as S = s̄ I. The value s̄ cannot be arbitrarily large since it needs to be small enough such that the artificial setpoint (u^s(t), y^s(t)) and therefore the predicted trajectories remain close to the initial state, where the prediction accuracy of the data-dependent model (6b) is acceptable (compare the discussion in Section 3.3). On the other hand, for too small values of s̄, the asymptotic tracking error increases since the artificial steady-state is close to the initial condition and thus, the regularization of α(t) w. r. t. zero dominates the cost of (6). Moreover, the parameter λ_σ can be chosen in a relatively large range. To summarize, the MPC scheme presented in Section 3.3 can successfully control the nonlinear four-tank system from [20] in simulation, and the influence of system and design parameters on the closed-loop performance confirms our theoretical findings.

5 Experimental application

In the following, we apply the MPC scheme presented in Section 3.3 in an experimental setup to the four-tank system by Quanser. This system possesses qualitatively the same dynamics as (7), but the parameter values differ (compare [1] for details). Nevertheless, as we show in the following, the presented nonlinear data-driven MPC scheme can successfully control the system using the same design parameters as in Section 4 due to its ability to adapt to changing operating conditions, in particular by updating the data used for prediction online. We use the same sampling time T_s = 1.5 seconds as in Section 4. Similar to Section 4, we first apply an open-loop input sampled uniformly from [20, 30]² in order to generate data of length N = 150. Thereafter, we compute the input applied to the plant via an MPC scheme based on Problem (6), where the design parameters are chosen exactly as in Section 4, i. e., as in (8). In addition to tracking the setpoint y^T = [15  15]^⊤ in the time interval t ∈ I_[0,600], we include an online setpoint change to y^T = [11  11]^⊤ for the time interval t ∈ I_[601,1200]. We note that the computation time for solving the strictly convex QP (6) is negligible compared to the sampling time of 1.5 seconds. The resulting closed-loop input-output trajectory is displayed in Fig. 3. After the initial exploration phase of length N, the closed-loop output first converges towards the setpoint [15  15]^⊤ and, after time t = 600, the output converges towards the second setpoint [11  11]^⊤, i. e., the MPC approximately solves our control problem. Similar to the simulation results in Section 4, the closed loop has a large steady-state tracking error if at all times only the first N = 150 data points are used for prediction, underpinning the importance of updating the measured data in (6b) online when controlling nonlinear systems.

However, Fig. 3 also illustrates a drawback of the presented approach which always relies on the last N input-output measurements. Upon convergence, the closed-loop input is approximately constant and, although the qualitative persistence of excitation condition in Definition 1 is still fulfilled, some of the singular values of the input Hankel matrix are very small, which deteriorates the prediction accuracy and hence the closed-loop performance (compare also the discussion at the end of Section 3.3). Therefore, the closed-loop output does not exactly converge to the setpoint but oscillates within a small region around y^T. Moreover, when the setpoint change is initiated at time t = 600, the past N = 150 input-output data points contain only little information about the system behavior, which deteriorates the transient closed-loop behavior. It is possible to overcome these issues, e. g., by stopping the data updates after the setpoint is reached or by explicitly enforcing closed-loop persistence of excitation. We plan to investigate the benefit of such measures in future research.
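A simple way to monitor the effect described above is to track the smallest singular value of the input Hankel matrix used in (6b); the following sketch (our own helper, reusing block_hankel; not part of the presented scheme) computes this quantity for the current data window:

```python
import numpy as np

def excitation_level(u_window, L, n):
    """Smallest singular value of H_{L+n+1}(u) for the current data window u (shape (N, m))."""
    return np.linalg.svd(block_hankel(u_window, L + n + 1), compute_uv=False).min()
```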

Figure 3: Closed-loop input-output trajectory resulting from the application of the data-driven MPC scheme presented in Section 3.3 to the four-tank system in an experiment.

Comparing Figures 1 and 3, we observe an important advantage of the presented MPC framework. Clearly, the two four-tank systems [20] and [1] have different parameters, e. g., the steady-state inputs leading to the output y^T differ significantly. In particular, the model (7) does not accurately describe the four-tank system [1], e. g., due to differing pump flow rates, differing tube diameters, manufacturing inaccuracies, aging, and since the model (7) is not even an exact representation of the physical reality for the four-tank system considered in [20]. In order to implement a (nonlinear) model-based MPC as in [20], all of the mentioned quantities need to be carefully modeled which can be a challenging and time-consuming task. On the other hand, estimating an accurate model based on an open-loop experiment is also difficult due to the nonlinear nature of (7) and since only input-output measurements are available, see, e. g., [10]. In contrast, the proposed MPC leads to an acceptable closed-loop performance without any modifications compared to the simulation in Section 4 due to the fact that it naturally adapts to the operating conditions. This makes our MPC framework both very simple to apply, since no modeling or nonlinear identification tasks need to be carried out, and reliable, since the framework allows for rigorous theoretical guarantees (although so far only for linear systems).

6 Conclusion

We presented an MPC framework to control unknown systems using only measured data. We discussed simple MPC schemes for LTI systems which admit strong theoretical guarantees in closed loop both with and without measurement noise. Further, we proposed a modification which can be used to control unknown nonlinear systems by repeatedly updating the data used for prediction and exploiting local linear approximations. Finally, we applied this approach in simulation and in an experiment to a nonlinear four-tank system. Important advantages of the presented framework are its simplicity, the fact that no explicit model knowledge is required, the low computational complexity (solving a QP), the possibility to adapt to online changes in the system dynamics, and the applicability to (unknown) nonlinear systems. In particular, obtaining accurate models of nonlinear systems using noisy input-output data is a very challenging and largely open research problem. On the other hand, the presented framework admits desirable theoretical guarantees for LTI systems, and analogous results for nonlinear systems are the subject of our current research. Another interesting direction for future research is the practical and theoretical comparison to MPC based on (online) system identification, e. g., [2], [19].

Award Identifier / Grant number: EXC 2075 - 390740016

Award Identifier / Grant number: GRK 2198/1 - 277536708

Award Identifier / Grant number: 948679

Funding statement: This work was funded by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC 2075 - 390740016 and the International Research Training Group Soft Tissue Robotics (GRK 2198/1 - 277536708). This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 948679). The authors thank the International Max Planck Research School for Intelligent Systems (IMPRS-IS) for supporting Julian Berberich.

About the authors

Julian Berberich

Julian Berberich received the Master’s degree in Engineering Cybernetics from the University of Stuttgart, Germany, in 2018. Since 2018, he has been a Ph.D. student at the Institute for Systems Theory and Automatic Control under supervision of Prof. Frank Allgöwer and a member of the International Max-Planck Research School (IMPRS). He has received the Outstanding Student Paper Award at the 59th Conference on Decision and Control in 2020. His research interests are in the area of data-driven system analysis and control.

Johannes Köhler

Johannes Köhler received his Master degree in Engineering Cybernetics from the University of Stuttgart, Germany, in 2017. He has since been a doctoral student at the Institute for Systems Theory and Automatic Control under the supervision of Prof. Frank Allgöwer and a member of the Graduate School Soft Tissue Robotics at the University of Stuttgart. His research interests are in the area of model predictive control.

Matthias A. Müller

Matthias A. Müller received a Diploma degree in Engineering Cybernetics from the University of Stuttgart, Germany, and an M.S. in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign, US, both in 2009. In 2014, he obtained a Ph.D. in Mechanical Engineering, also from the University of Stuttgart, Germany, for which he received the 2015 European Ph.D. award on control for complex and heterogeneous systems. Since 2019, he is director of the Institute of Automatic Control and full professor at the Leibniz University Hannover, Germany. He obtained an ERC Starting Grant in 2020 and is recipient of the inaugural Brockett-Willems Outstanding Paper Award for the best paper published in Systems & Control Letters in the period 2014–2018. His research interests include nonlinear control and estimation, model predictive control, and data-/learning-based control, with application in different fields including biomedical engineering.

Frank Allgöwer

Frank Allgöwer is professor of mechanical engineering at the University of Stuttgart, Germany, and Director of the Institute for Systems Theory and Automatic Control (IST) there. Frank is active in serving the community in several roles: Among others he has been President of the International Federation of Automatic Control (IFAC) for the years 2017–2020, Vice-president for Technical Activities of the IEEE Control Systems Society for 2013/14, and Editor of the journal Automatica from 2001 until 2015. From 2012 until 2020 Frank served in addition as Vice-president for the German Research Foundation (DFG), which is Germany’s most important research funding organization. His research interests include predictive control, data-based control, networked control, cooperative control, and nonlinear control with application to a wide range of fields including systems biology.

References

1. Quanser coupled tanks system data sheet. [Online] https://www.quanser.com/products/coupled-tanks/. Accessed: 2021-01-21.

2. V. Adetola and M. Guay. Robust adaptive MPC for constrained uncertain nonlinear systems. Int. J. Adaptive Control and Signal Processing, 25(2):155–167, 2011. doi: 10.1002/acs.1193.

3. A. Aswani, H. Gonzalez, S. S. Sastry and C. Tomlin. Provably safe and robust learning-based model predictive control. Automatica, 49(5):1216–1226, 2013. doi: 10.1016/j.automatica.2013.02.003.

4. J. Berberich and F. Allgöwer. A trajectory-based framework for data-driven system analysis and control. In Proc. European Control Conf., pages 1365–1370, 2020. doi: 10.23919/ECC51009.2020.9143608.

5. J. Berberich, J. Köhler, M. A. Müller and F. Allgöwer. Data-driven tracking MPC for changing setpoints. IFAC-PapersOnLine, 53(2):6923–6930, 2020. doi: 10.1016/j.ifacol.2020.12.389.

6. J. Berberich, J. Köhler, M. A. Müller and F. Allgöwer. Robust constraint satisfaction in data-driven MPC. In Proc. Conf. Decision and Control, pages 1260–1267, 2020. doi: 10.1109/CDC42340.2020.9303965.

7. J. Berberich, J. Köhler, M. A. Müller and F. Allgöwer. Data-driven model predictive control with stability and robustness guarantees. IEEE Transactions on Automatic Control, 66(4):1702–1717, 2021. doi: 10.1109/TAC.2020.3000182.

8. J. Berberich, J. Köhler, M. A. Müller and F. Allgöwer. On the design of terminal ingredients for data-driven MPC. arXiv:2101.05573, 2021.

9. J. Bongard, J. Berberich, J. Köhler and F. Allgöwer. Robust stability analysis of a simple data-driven model predictive control approach. arXiv:2103.00851, 2021.

10. J.-P. Calliess. Conservative decision-making and inference in uncertain dynamical systems. PhD thesis, University of Oxford, 2014.

11. J. Coulson, J. Lygeros and F. Dörfler. Data-enabled predictive control: in the shallows of the DeePC. In Proc. European Control Conf., pages 307–312, 2019. doi: 10.23919/ECC.2019.8795639.

12. J. Coulson, J. Lygeros and F. Dörfler. Distributionally robust chance constrained data-enabled predictive control. arXiv:2006.01702, 2020.

13. E. Elokda, J. Coulson, J. Lygeros and F. Dörfler. Data-enabled predictive control for quadcopters. ETH Zurich, Research Collection, 2019. doi: 10.3929/ethz-b-000415427.

14. L. Hewing, K. P. Wabersich, M. Menner and M. N. Zeilinger. Learning-based model predictive control: Toward safe learning in control. Ann. Rev. Control, Robotics, and Autonomous Systems, 3:269–296, 2020. doi: 10.1146/annurev-control-090419-075625.

15. Z.-S. Hou and Z. Wang. From model-based control to data-driven control: Survey, classification and perspective. Information Sciences, 235:3–35, 2013. doi: 10.1016/j.ins.2012.07.014.

16. J. Köhler, M. A. Müller and F. Allgöwer. A nonlinear tracking model predictive control scheme for dynamic target signals. Automatica, 118:109030, 2020. doi: 10.1016/j.automatica.2020.109030.

17. D. Limón, I. Alvarado, T. Alamo and E. F. Camacho. MPC for tracking piecewise constant references for constrained linear systems. Automatica, 44(9):2382–2387, 2008. doi: 10.1016/j.automatica.2008.01.023.

18. I. Markovsky and P. Rapisarda. Data-driven simulation and control. Int. J. Control, 81(12):1946–1959, 2008. doi: 10.1080/00207170801942170.

19. T. W. Nguyen, S. A. U. Islam, A. L. Bruce, A. Goel, D. S. Bernstein and I. V. Kolmanovsky. Output-feedback RLS-based model predictive control. In Proc. American Control Conf., pages 2395–2400, 2020. doi: 10.23919/ACC45564.2020.9148011.

20. T. Raff, S. Huber, Z. K. Nagy and F. Allgöwer. Nonlinear model predictive control of a four tank system: An experimental stability study. In Proc. Int. Conf. Control Applications, pages 237–242, 2006. doi: 10.1109/CACSD-CCA-ISIC.2006.4776652.

21. J. B. Rawlings, D. Q. Mayne and M. M. Diehl. Model Predictive Control: Theory, Computation, and Design. Nob Hill Publishing, 2nd edition, 2017.

22. J. C. Willems, P. Rapisarda, I. Markovsky and B. De Moor. A note on persistency of excitation. Systems & Control Letters, 54:325–329, 2005. doi: 10.1109/CDC.2004.1428856.

23. H. Yang and S. Li. A data-driven predictive controller design based on reduced Hankel matrix. In Proc. Asian Control Conf., pages 1–7, 2015. doi: 10.1109/ASCC.2015.7244723.

Received: 2021-01-30
Accepted: 2021-05-17
Published Online: 2021-06-30
Published in Print: 2021-07-27

© 2021 Berberich et al., published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.
