V33i09 PDF
Abstract
In this paper we present the R package deSolve to solve initial value problems (IVP)
written as ordinary differential equations (ODE), differential algebraic equations (DAE)
of index 0 or 1 and partial differential equations (PDE), the latter solved using the method
of lines approach. The differential equations can be represented in R code or as compiled
code. In the latter case, R is used as a tool to trigger the integration and post-process the
results, which facilitates model development and application, whilst the compiled code
significantly increases simulation speed. The methods implemented are efficient, robust, and
well documented public-domain Fortran routines. They include four integrators from the
ODEPACK package (LSODE, LSODES, LSODA, LSODAR), DVODE and DASPK2.0.
In addition, a suite of Runge-Kutta integrators and special-purpose solvers to efficiently
integrate 1-, 2- and 3-dimensional partial differential equations are available. The routines
solve both stiff and non-stiff systems, and include many options, e.g., to deal in an
efficient way with the sparsity of the Jacobian matrix, or to find the roots of equations. In
this article, our objectives are threefold: (1) to demonstrate the potential of using R for
dynamic modeling, (2) to highlight typical uses of the different methods implemented and
(3) to compare the performance of models specified in R code and in compiled code for a
number of test cases. These comparisons demonstrate that, if the use of loops is avoided,
R code can efficiently integrate problems comprising several thousands of state variables.
Nevertheless, the same problem may be solved from 2 to more than 50 times faster by
using compiled code compared to an implementation using only R code. Still, amongst
the benefits of R are a more flexible and interactive implementation, better readability
of the code, and access to R’s high-level procedures. deSolve is the successor of package
odesolve which will be deprecated in the future; it is free software and distributed under
the GNU General Public License, as part of the R software project.
1. Introduction
Many phenomena in science and engineering can be mathematically represented as initial
value problems (IVP) of ordinary differential equations (ODE, Ascher and Petzold 1998).
ODEs describe how a certain quantity changes as a function of time or space, or some other
variable (called the independent variable). They can be mathematically represented as:
$$y' = f(y, v, t)$$
where $y$ are the differential variables, $y'$ are the derivatives, $v$ are other variables, and $t$ is
the independent variable. For the remainder, we will assume that the independent variable
is “time”. For this equation to have a solution, an extra condition is required. Here we deal
only with models where some initial condition (at $t = t_0$) is specified:
$$y(t_0) = c$$
These are called initial value problems (IVP). The formalism above provides an explicit
expression for $y'$ as a function of $y$, $v$ and $t$. A more general mathematical form is the implicit
expression:
$$0 = G(y', y, v, t) \qquad (1)$$
If, in addition to the ordinary differential equations, the differential variables obey some
algebraic constraints at each time point:
$$0 = g(y, v, t) \qquad (2)$$
then we obtain a set of differential algebraic equations (DAE). The two previous functions $G$
(eq. 1) and $g$ (eq. 2) can be combined to a new function $F$:
$$0 = F(y', y, v, t)$$
which is the formalism that we will use in this paper. Solving a DAE is more complex than
solving an ODE. For instance, the initial conditions for a DAE must be chosen to be consistent.
That is, the initial values of $t$, $y$ and $y'$ must obey:
$$0 = F(y'(t_0), y(t_0), v, t_0)$$
DAEs are commonly encountered in a number of scientific and engineering disciplines, e.g., in
the modelling of electrical circuits or mechanical systems, in constrained variational problems,
or in equilibrium chemistry (e.g., Brenan, Campbell, and Petzold 1996).
Most ODEs and DAEs are complicated enough to preclude finding an analytical solution,
and therefore they are solved by numerical techniques, which calculate the solution only
at a limited number of values of the independent variable (t).
A common feature of many numerical solvers is their ability to solve “stiff”
ODE or DAE problems. Formally, if the eigenvalue spectrum of the ODE system (i.e., of its
Jacobian, see below) is large, the ODE system is said to be stiff (Hairer and Wanner 1980). As
Journal of Statistical Software 3
a less formal definition, an ODE system is called stiff if the problem changes on a wide variety
of time scales, i.e., it contains both very rapidly and very slowly changing terms. Unless these
stiff problems are solved with specially designed methods, they require an excessive amount
of computing time, as they need to use very small time steps to satisfy stability requirements
(Press, Teukolsky, Vetterling, and Flannery 2007, p. 931).
Very often, stiff systems are most efficiently solved with implicit methods, which require the
creation of a Jacobian matrix ($\partial f / \partial y$) and the solution of a system of equations involving this
Jacobian. As we will see below, there is much to be gained by taking advantage of the sparsity
of the Jacobian matrix. Except for the Runge-Kutta methods, all solvers implemented in deSolve
are variable-order, variable-step methods that use the backward differentiation formulas
and Adams methods, two important families of multistep methods (Ascher and Petzold 1998).
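The stability restriction on stiff problems, and the benefit of an implicit method, can be made concrete with a small experiment in base R. The following is a minimal sketch and not part of deSolve: the matrix A, the step size h and the helper functions explicit_euler and implicit_euler are assumptions chosen for illustration. It integrates the linear system y' = A y, whose Jacobian is A itself, with a step that is adequate for the slow component but far too large for the fast one.

```r
## Linear test problem y' = A %*% y with a fast and a slow component.
## A, h and both helper functions are illustrative assumptions.
A <- matrix(c(-1000,  0,
                  0, -1), nrow = 2, byrow = TRUE)

## For a linear system the Jacobian is A; its eigenvalues set the time scales
lambda <- eigen(A)$values
stiffness_ratio <- max(abs(Re(lambda))) / min(abs(Re(lambda)))

## Explicit (forward) Euler: y[n+1] = y[n] + h * A %*% y[n]
explicit_euler <- function(y, h, nsteps) {
  for (i in seq_len(nsteps)) y <- y + h * as.vector(A %*% y)
  y
}

## Implicit (backward) Euler: solve (I - h * A) y[n+1] = y[n] at each step
implicit_euler <- function(y, h, nsteps) {
  M <- diag(nrow(A)) - h * A
  for (i in seq_len(nsteps)) y <- solve(M, y)
  y
}

y0 <- c(1, 1)
h  <- 0.01   # resolves the slow mode, but h * 1000 = 10 violates explicit stability

y_explicit <- explicit_euler(y0, h, nsteps = 50)   # fast component grows without bound
y_implicit <- implicit_euler(y0, h, nsteps = 50)   # both components decay

print(stiffness_ratio)
print(rbind(explicit = y_explicit, implicit = y_implicit))
```

For the explicit method, stability requires |1 + h λ| ≤ 1 for every eigenvalue λ, which here means h ≤ 0.002, even though the fast component is negligible after a very short time. The implicit method has no such restriction, at the price of solving a linear system involving the Jacobian at each step.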
The remainder of the paper is organized as follows. In Section 2, the different solvers are
briefly discussed and some implementation issues noted. Section 3 gives some example
implementations of ODE, PDE and DAE systems in R (R Development Core Team 2009). In
Section 4, we demonstrate how to implement the models in a compiled language. Numerical
benchmarks of computational performance are conducted in Section 5. Finally, concluding
remarks are given in Section 6.
The package is available from the Comprehensive R Archive Network at
http://CRAN.R-project.org/package=deSolve.
developers, based on GForge (Copeland, Mas, McCullagh, Perdue, Smet, and Spisser 2006).
ode, ode.1D, ode.2D and ode.3D are wrappers around the integration routines described
below. The latter three are especially designed to solve partial differential equations,
where, in addition to the time derivative, the components also change in one, two
or three (spatial) dimensions.
lsoda automatically selects a stiff or nonstiff method. It may switch between the two
methods during the simulation, in case the stiffness of the system changes. This is the
default method used in ode and especially well suited for simple problems.
lsodar is similar to lsoda but includes a method to find the root of a function.
lsode and vode also solve stiff and nonstiff problems. However, the user must decide
whether a stiff or nonstiff method is most suited for a particular problem and select an
appropriate solution method. zvode is a variant of vode that solves equations involving
variables that are complex numbers. lsode is the default solver used in ode.1D.
lsodes exploits the sparsity in the Jacobian matrix by using linear algebra routines
from the Yale sparse matrix package (Eisenstat, Gursky, Schultz, and Sherman 1982).
It can determine the sparsity on its own or take it as input. Especially for large stiff
problems with few interactions between state variables (leading to sparse Jacobians),
dramatic savings in computing time can be achieved when using lsodes. It is the solver
used in ode.2D and ode.3D.
daspk is the only integrator in the package that also solves differential algebraic equations
of index zero and one. It can also solve ODEs.
Finally, the package also includes solvers for several methods of the Runge-Kutta family
(rk), with variable or fixed time steps. This includes the classical 4th order Runge-Kutta
and the Euler method (rk4, euler).
In addition, sets of coefficients (Butcher tableaus) for the most common Runge-Kutta
methods are available in function rkMethod, e.g., Heun’s method, Bogacki-Shampine
2(3), Runge-Kutta-Fehlberg 4(5), Cash-Karp 4(5) or Dormand-Prince 4(5)7, and it is
possible to provide user-specified tableaus of coefficients (for details see Dormand and
Prince 1981; Butcher 1987; Bogacki and Shampine 1989; Cash and Karp 1990; Press
et al. 2007).
2.4. Output
All solvers return an array that contains, in its columns, the time values (1st column) and the
values of all state variables (subsequent columns) followed by the ordinary output variables
(if any). This format is particularly suited for graphical routines of R (e.g., matplot). In
addition, a plot method is included which, for models with not too many state variables,
plots all output in one figure.
All Fortran codes have in common that they monitor essential properties of the integration,
such as the number of Jacobian evaluations, the number of time steps performed, the number
of integration error test failures, the stepsize to be attempted on the next step and so on.
These performance indicators can be retrieved with a method called diagnostics.
$$\frac{dP}{dt} = r_G \cdot P \cdot \left(1 - \frac{P}{K}\right) - r_I \cdot P \cdot C$$
$$\frac{dC}{dt} = k_{AE} \cdot r_I \cdot P \cdot C - r_M \cdot C$$
where $P$ and $C$ are prey and consumer concentrations, $r_G$ is the growth rate of prey, $K$ the
carrying capacity, $r_I$ the ingestion rate of the consumer, $k_{AE}$ its assimilation efficiency and
$r_M$ the consumer’s mortality rate. The implementation in R consists of three parts:
First the model function is defined. Here this function is called LVmod0D. It takes as input the
current simulation time, the values of the state variables, and the model parameters. These
three arguments have to be always present, and in this order; other arguments can be added
after them. The solver will call this function at each time step during the integration process;
at which Time contains the current simulation time; State the current value of the state
variables and Pars the values of the parameters as passed to the solver.
Both State and Pars are vectors with named elements; the with statement allows using
their names within the function.
The model returns a list, where the first element contains the derivatives, concatenated. Note
that Time is not used here, but in many models, it is used, e.g., when there are external
variables that depend on time.
Then the parameters are given a name and a value (pars), the state variables are initialized (yini),
and the time points at which we want output are specified (times). Based on these inputs, the
model is simulated. Here we use the default integration function ode, which is based on the
lsoda method; its returned model output is written in a matrix called out. We print the
first part of this matrix (head(out)). Matrix out has in its first column the time sequence,
and in its next columns the prey and consumer concentrations.
Figure 1: A. Results of the Lotka-Volterra model. B. The Lotka-Volterra model solved till
steady-state. (Axes: Conc versus time; legend: prey, predator.)
R> head(out, n = 3)
time P C
[1,] 0 1.000000 2.000000
[2,] 1 1.626853 1.863283
[3,] 2 2.488467 1.871156
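The three parts described above can be sketched as follows. The listing is a reconstruction for illustration rather than the original code: the parameter names follow those used in the 2-D model function later in this section (rG, K, rI, AE, rM), while the parameter values in pars are assumptions, chosen because they give the analytical steady state P = rM / (AE · rI) = 2 and C = rG · (1 − P/K) / rI = 4.

```r
library(deSolve)

## Part 1: the model function (arguments: time, state variables, parameters)
LVmod0D <- function(Time, State, Pars) {
  with(as.list(c(State, Pars)), {
    IngestC <- rI * P * C            # ingestion of prey by the consumer
    GrowthP <- rG * P * (1 - P / K)  # logistic growth of the prey
    MortC   <- rM * C                # consumer mortality

    dP <- GrowthP - IngestC
    dC <- IngestC * AE - MortC

    list(c(dP, dC))                  # derivatives, in the same order as State
  })
}

## Part 2: parameters (assumed values), initial conditions and output times
pars  <- c(rI = 0.2, rG = 1.0, rM = 0.2, AE = 0.5, K = 10)
yini  <- c(P = 1, C = 2)
times <- seq(0, 200, by = 1)

## Part 3: integrate with the default method of ode (lsoda) and inspect
out <- ode(y = yini, times = times, func = LVmod0D, parms = pars)
head(out, n = 3)
diagnostics(out)   # solver statistics, e.g., number of steps taken
```

Because State and Pars are concatenated before the with statement, the names of state variables and parameters must not clash.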
The results (Figure 1A) clearly show that, after initial fluctuations, the consumer and prey
concentrations reach a steady-state. It takes 0.04 (lsoda, daspk) and 0.02 (lsode, vode,
lsodes) seconds to solve this model.
8 deSolve: Solving Differential Equations in R
time P C
[94,] 93.00000 1.999809 4.000187
[95,] 93.49872 1.999795 4.000148
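$$\frac{\partial P}{\partial t} = \frac{\partial}{\partial x} D_a \frac{\partial P}{\partial x} + r_G \cdot P \cdot \left(1 - \frac{P}{K}\right) - r_I \cdot P \cdot C$$
$$\frac{\partial C}{\partial t} = \frac{\partial}{\partial x} D_a \frac{\partial C}{\partial x} + k_{AE} \cdot r_I \cdot P \cdot C - r_M \cdot C$$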
These partial differential equations (PDE) are solved by discretizing in space first and
integrating the resulting initial value ODE; this is a technique called the “method of lines”
(Schiesser 1991). It is beyond the scope of this paper to derive how the spatial derivative
is numerically approximated and implemented in R; interested readers may refer to Soetaert
and Herman (2009) where this is discussed. Also, package ReacTran (Soetaert and Meysman
2009) implements numerical approximations of spatial derivatives. The function below
implements the model. Essentially, we first estimate the dispersive fluxes on the box interfaces
(FluxP, FluxC) as $Flux = -D_a \cdot \partial C / \partial x$, after which the rate of change is estimated as the
negative of the flux gradient ($\partial C / \partial t = -\partial Flux / \partial x + \ldots$). Estimating a gradient is best done with
R function diff, which avoids the use of explicit loops, and is computationally very efficient.
Note that, by imposing P[1] and P[N] at the upper and lower boundaries, we effectively
impose a zero-gradient (or a zero-flux) boundary condition.
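A model function consistent with this description might look as follows. This is a sketch for illustration and not the original listing: the argument list matches the ode.1D call shown further below (N, dx and Da are passed as extra arguments), and the parameter names and values are assumed to be the same as for the 2-D model function. The state vector holds the N prey concentrations followed by the N consumer concentrations.

```r
LVmod1D <- function(time, state, parms, N, Da, dx) {
  with(as.list(parms), {
    P <- state[1:N]        # prey in boxes 1..N
    C <- state[-(1:N)]     # consumer in boxes 1..N

    ## Dispersive fluxes on the N + 1 box interfaces. Padding with P[1] and
    ## P[N] makes the boundary gradients, and hence the boundary fluxes, zero.
    FluxP <- -Da * diff(c(P[1], P, P[N])) / dx
    FluxC <- -Da * diff(c(C[1], C, C[N])) / dx

    ## Biology: Lotka-Volterra dynamics within each box
    IngestC <- rI * P * C
    GrowthP <- rG * P * (1 - P / K)
    MortC   <- rM * C

    ## Rate of change = negative flux gradient + biology
    dP <- -diff(FluxP) / dx + GrowthP - IngestC
    dC <- -diff(FluxC) / dx + IngestC * AE - MortC

    list(c(dP, dC))
  })
}

## Quick check without a solver (parameter values are assumptions):
## a spatially uniform state at P = 2, C = 4 should not change in time.
pars   <- c(rI = 0.2, rG = 1.0, rM = 0.2, AE = 0.5, K = 10)
N      <- 10
state  <- c(rep(2, N), rep(4, N))
dstate <- LVmod1D(0, state, pars, N = N, Da = 0.05, dx = 0.02)[[1]]
range(dstate)
```

Evaluating the function by hand in this way, before passing it to an integrator, is a cheap test that the returned vector has length 2N and that the boundary treatment behaves as intended.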
The 20 meter model domain (R) is subdivided in 1000 boxes (N). After giving values to the
box sizes (dx) and the dispersion coefficient (Da), and initializing the consumer and prey
concentrations in each box (yini), the model can be solved for the requested time sequence
(times). It is most efficient to do this with integration routine ode.1D, which is especially
designed for solving this type of problem. Notwithstanding the large number of state variables
(2000), it takes less than a second to run this model for 200 days.
R> R <- 20
R> N <- 1000
R> dx <- R/N
R> Da <- 0.05
R> yini <- rep(0, 2*N)
R> yini[500:501] <- yini[1500:1501] <- 10
R> times <- seq(0, 200, by = 1)
R> print(system.time(
+ out <- ode.1D(y = yini, times = times, func = LVmod1D,
+ parms = pars, nspec = 2, N = N, dx = dx, Da = Da)
+ ))
The matrix out has in its first column the time sequence, followed by 1000 columns with prey
concentrations, one for each box, followed by 1000 columns with consumer concentrations.
We plot only the prey concentrations (Figure 2).
Function ode.1D was run using lsode, vode, lsoda and lsodes in turn as the integrator; it
took 0.8 (lsode), 0.85 (vode), 1.2 (lsoda) and 0.65 (lsodes) seconds to finish the run.
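$$\frac{\partial P}{\partial t} = \frac{\partial}{\partial x} D_a \frac{\partial P}{\partial x} + \frac{\partial}{\partial y} D_a \frac{\partial P}{\partial y} + r_G \cdot P \cdot \left(1 - \frac{P}{K}\right) - r_I \cdot P \cdot C$$
$$\frac{\partial C}{\partial t} = \frac{\partial}{\partial x} D_a \frac{\partial C}{\partial x} + \frac{\partial}{\partial y} D_a \frac{\partial C}{\partial y} + k_{AE} \cdot r_I \cdot P \cdot C - r_M \cdot C$$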
The function below implements this 2-D model. Note that, as for the 1-D case, the use of
explicit looping is avoided: to estimate the gradient, we just subtract two matrices shifted
by one row (x-direction) or one column (y-direction). The zero-fluxes at the boundaries are
implemented by binding a row or column of 0-values (zero).
R> LVmod2D <- function (time, state, parms, N, Da, dx, dy) {
+   NN <- N * N   # boxes per species: NN prey values, then NN consumer values
+   P <- matrix(nrow = N, ncol = N, state[1:NN])
+   C <- matrix(nrow = N, ncol = N, state[-(1:NN)])
+
+   with (as.list(parms), {
+     ## Biology: Lotka-Volterra dynamics in each box
+     dP <- rG * P * (1 - P/K) - rI * P * C
+     dC <- rI * P * C * AE - rM * C
+
+     zero <- numeric(N)
+
+     ## Fluxes in x-direction; zero fluxes near boundaries
+     FluxP <- rbind(zero, -Da * (P[-1,] - P[-N,])/dx, zero)
+     FluxC <- rbind(zero, -Da * (C[-1,] - C[-N,])/dx, zero)
+
+     dP <- dP - (FluxP[-1,] - FluxP[-(N+1),])/dx
+     dC <- dC - (FluxC[-1,] - FluxC[-(N+1),])/dx
+
+     ## Fluxes in y-direction; zero fluxes near boundaries
+     FluxP <- cbind(zero, -Da * (P[,-1] - P[,-N])/dy, zero)
+     FluxC <- cbind(zero, -Da * (C[,-1] - C[,-N])/dy, zero)
+
+     dP <- dP - (FluxP[,-1] - FluxP[,-(N+1)])/dy
+     dC <- dC - (FluxC[,-1] - FluxC[,-(N+1)])/dy
+
+     return(list(c(as.vector(dP), as.vector(dC))))
+   })
+ }
The 2-D model domain, extending 20 meters (R) in both x- and y-directions is subdivided
into 50 · 50 boxes (N). This model can be solved efficiently only with integrator ode.2D. Here
we need to specify the dimensionality of the model (dimens) and the length of the work array
(lrw). It takes less than 3 seconds to solve this 5000 state-variable model for 200 days.
R> R <- 20
R> N <- 50
R> dx <- R/N
R> dy <- R/N
R> Da <- 0.05
R> NN <- N * N
R> yini <- rep(0, 2 * N * N)
R> cc <- c((NN/2):(NN/2 + 1) + N/2, (NN/2):(NN/2 + 1) - N/2)
R> yini[cc] <- yini[NN + cc] <- 10
Figure 3: Prey concentration on the 2-D (x, y) domain. Panels: initial, 20 days, 30 days, 40 days.
Results are in Figure 3 for the initial condition, and after 20, 30 and 40 days.
Note: with the initial conditions used (nonzero concentration in the centre), this is not a
particularly clever way of modeling dispersion on a 2-D surface. For this special case it is much
more efficient to represent these dynamics in a 1-dimensional model, using cylindrical
coordinates; this model is included as an example model in the help file of ode.1D. The 2-D
implementation here was added just for illustrative purposes.
$$D \underset{k_2}{\overset{k_1}{\rightleftharpoons}} A + B$$
In addition, D is produced at a constant rate, $k_{prod}$, while B is consumed at a first-order rate,
$r$. Implementing this model as an ODE system:
$$\frac{d[D]}{dt} = k_{prod} - k_1 \cdot [D] + k_2 \cdot [A] \cdot [B]$$
$$\frac{d[A]}{dt} = -k_2 \cdot [A] \cdot [B] + k_1 \cdot [D]$$
$$\frac{d[B]}{dt} = -r \cdot [B] - k_2 \cdot [A] \cdot [B] + k_1 \cdot [D]$$
The ODEs are now reformulated as a DAE. If the reversible reactions (involving $k_1$ and $k_2$)
are much faster compared to the other rates ($k_{prod}$, $r$) then the three quantities D, A and B
can be assumed to be in local equilibrium. Thus, at all times, the following relationship exists
between the concentrations of A, B and D (here concentration of x is denoted with [x]):
$$K \cdot [D] = [A] \cdot [B]$$
where $K = k_1/k_2$ is the so-called equilibrium constant (setting the fast terms
$k_1 \cdot [D]$ and $k_2 \cdot [A] \cdot [B]$ of the ODE system equal). The equilibrium description is completed
by taking linear combinations of rates of change, such that the fast reversible reactions
vanish.
$$\frac{d[D]}{dt} + \frac{d[A]}{dt} = k_{prod}$$
$$\frac{d[B]}{dt} - \frac{d[A]}{dt} = -r \cdot [B]$$
In this DAE, the fast equilibrium reactions (involving k1 and k2 ) have been removed.
DAEs are specified in implicit form (see Section 1):
$$0 = F(y', y, v, t)$$
In R we define a function that takes as input time (t), the values of the state variables (y)
and their derivatives (yprime) and the parameter vector (pars), and that returns the results
as a list; the first element of this list contains the implicit form of the differential equations
(res1, res2) and of the algebraic equation (eq), concatenated. Additionally, other quantities
can be returned as well (here CONC). Note that y, yprime and pars are vectors with named
elements; their names are made available through the with statement.
After defining the time sequence at which output is wanted (times), the parameter values
(pars) and the initial concentrations of the state variables (yini) and their rates of change
(dyini), the model is solved with daspk. Note that in the example, the initial concentrations
of the state variables (yini) are consistent (i.e., they obey $K \cdot [D] = [A] \cdot [B]$), but the initial
rates of change (dyini) are not consistent. Thus, daspk will solve for them.
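The residual function and its inputs might be set up as in the following sketch, which is consistent with the daspk call below but is not the original listing. The parameter names kprod and r, the state names A, B and D, the derivative names dA, dB and dD, and all numerical values are assumptions made for illustration; K is passed as an extra argument, as in the call.

```r
## Residual function: every element of the first list item must be zero
Res_DAE <- function(t, y, yprime, pars, K) {
  with(as.list(c(y, yprime, pars)), {
    res1 <- -dD - dA + kprod     # d[D]/dt + d[A]/dt = kprod
    res2 <- -dB + dA - r * B     # d[B]/dt - d[A]/dt = -r * [B]
    eq   <- K * D - A * B        # equilibrium: K * [D] = [A] * [B]

    list(c(res1, res2, eq), CONC = A + B + D)
  })
}

K     <- 1
times <- seq(0, 100, by = 2)
pars  <- c(r = 1, kprod = 0.1)
yini  <- c(A = 2, B = 3, D = 2 * 3 / K)   # satisfies the equilibrium equation
dyini <- c(dA = 0, dB = 0, dD = 0)        # not consistent; daspk solves for them

## At the initial state only the algebraic residual is zero
Res_DAE(0, yini, dyini, pars, K = K)
```

Calling the residual function directly shows which residuals are nonzero for the supplied initial derivatives, which is a quick way to see whether the initial conditions are consistent.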
R> DAE <- daspk(y = yini, dy = dyini, times = times, res = Res_DAE,
+ parms = pars, atol = 1e-10, rtol = 1e-10, K = 1)
Note how we use the S3 plot method to plot all output variables at once on one figure
(Figure 4). In the deSolve package the same model is also solved as an ODE (example 1 of
ode.1D). The package also contains a DAE implementation of an electrical system, and all
the models from Hofmann et al. (2008) in the doc/examples directory.
Figure 4: Output of the DAE model. Panels: A, B, D and CONC, each plotted against time.
an initializer function, which sets the values of the parameters (initparms() in the
example),
if forcing functions are to be used, an initializer for the data for the forcing function
(here absent),
the model function which calculates the rate of change and output variables (derivs()
in the example), and
Each function has a standard calling sequence (see the package vignette compiledCode in the
package deSolve for the details). The initializer subroutines serve just to link data from the R
side of things with memory accessible to the native code, and will rarely be more complicated
than the example shown here.
The bulk of the computation is carried out in the subroutine that defines the system of
differential equations. In the initializer routine, parameters are passed to the native programs
as one vector (containing 5 values). In Fortran, parameters are stored in a common block,
in which the values are given a name (rI,..) in the model function to make it easier to
understand the code, while it is a vector in the initializer routine. In the C code names can
be assigned to these parameters as well as state variables and their derivatives via #define
statements that make the code more readable.
The variable yout(1) holds the total concentration of consumer and prey at any given time.
The model function must check whether enough memory is allocated for the output vari-
able(s), to prevent a fatal error if the memory allocation is inadequate. The subroutines
rexit() and error() are provided by R to gracefully exit with a message back to the R
command prompt.
To run this model, the code must first be compiled. Given that the appropriate toolset is
installed, this can be done in R itself, using the system command: system("R CMD SHLIB
LVmod0D.f") or system("R CMD SHLIB LVmod0D.c"). The compiler will create a file that
can be linked dynamically into an R session, for example the dynamic link library (DLL)
LVmod0D.dll on Windows, or a shared library LVmod0D.so on other operating systems.
The dynamic library is loaded into the current R process using a call to dyn.load().
After providing initial conditions of the state variables, the parameter vector, and the time
sequence, the model is run by calling the integrator ode. The functions passed to the ODE
solver are character strings ("derivs", "initparms"), giving the names of the compiled
functions in the dynamically loaded shared library ("LVmod0D"). When finished, the DLL can
be unloaded.
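The steps above (compile, load, integrate, unload) can be collected in a small helper. This is a sketch under assumptions: the wrapper name run_compiled_LV is hypothetical, the source file LVmod0D.c (or LVmod0D.f) is assumed to be in the working directory and to define derivs and initparms with one output variable, and the numerical inputs are illustrative. The arguments dllname, initfunc, nout and outnames are those of the deSolve integrators.

```r
## Hypothetical helper: compile a model source file, run it with ode, clean up.
## Assumes the source file sits in the current working directory.
run_compiled_LV <- function(src = "LVmod0D.c", yini, pars, times) {
  if (!file.exists(src)) {
    message("source file not found: ", src)
    return(NULL)
  }
  model <- tools::file_path_sans_ext(basename(src))
  lib   <- paste0(model, .Platform$dynlib.ext)   # LVmod0D.dll or LVmod0D.so

  ## Compile to a shared library (requires the R toolchain to be installed)
  status <- system(paste("R CMD SHLIB", shQuote(src)))
  if (status != 0) stop("compilation failed")

  dyn.load(lib)
  on.exit(dyn.unload(lib))   # unload the DLL even if the solver fails

  ## Functions are passed as character strings naming the compiled routines
  deSolve::ode(y = yini, times = times, func = "derivs", parms = pars,
               dllname = model, initfunc = "initparms",
               nout = 1, outnames = "total")
}

## pars must be in the order expected by the compiled initializer (assumed here)
out <- run_compiled_LV("LVmod0D.c",
                       yini  = c(P = 1, C = 2),
                       pars  = c(rI = 0.2, rG = 1.0, rM = 0.2, AE = 0.5, K = 10),
                       times = seq(0, 200, by = 1))
```

Registering the unload step with on.exit is a design choice that keeps the R session clean when the integration stops with an error; if the source file is absent the helper returns NULL without attempting to compile.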
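#include <R.h>
static double parms[5];   /* parameter values, filled in by the initializer */
/* initializers */
void initparms(void (* odeparms)(int *, double *)) {
  int N = 5;              /* number of parameters passed from R */
  odeparms(&N, parms);
}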
5. Benchmarking
Model implementations written in compiled languages are expected to have one major advantage
in comparison with implementations in pure R: they use less CPU time. In this section
the performance of the two types of model implementation is illustrated by means of a set of
test problems. In order to increase the computational demand in a systematic way, we use
the consumer-prey model in different settings:
Several one-dimensional cases (Section 3.3), with varying number of grid cells (50, 100,
500, 1000, 2500, 5000). The latter is a 10000 state variable model.
All models were run for 200 days with a daily output interval, that is also the maximum time
step. We tested two implementations in R and two Fortran codes:
The R codes presented in Section 3, which include passing the names of state variables
and parameters.
A second implementation in R, where the names of parameters and state variables are
not used (i.e., without the with()-function).
A Fortran implementation, where the model was compiled as a DLL, loaded into R, and
the integration routine was triggered from within R, same as in previous section.
A second Fortran implementation, where the entire run was performed in Fortran.
The Fortran codes will not be given here but they are included in the R package (in the
doc/examples/dynload subdirectory).
Table 3: CPU time (in seconds) needed to perform a run of the Lotka-Volterra model in 0-D,
1-D and 2-D (rows) and for different implementations (columns): (1) R code as in Section 3; (2)
R code without passing parameter and variable names; (3) model specified in a Fortran DLL,
loaded into R and the integration triggered by R and (4) the entire application implemented
in Fortran. All times reported are the mean of 10 consecutive runs.
The measure used to evaluate computational performance is the CPU time spent in these runs
(Table 3). To obtain representative run times, we compare the average over 10 consecutive
runs, and on the same machine. Times reported are seconds of computing time, on a 2.5 GHz
portable pc with an Intel Core 2 Duo T9300 processor and 3 GB of RAM. Both ode.1D and
ode.2D used integration routine lsodes for solving the model.
Except for the 0-dimensional model, there is little gain (10-25%) in computing time by not
passing the parameter and state-variable names. In the simplest (0-D) model, the R version
that does not pass names (R code 2) finishes in only 35% of CPU time compared to the full
R implementation (R code 1).
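The cost of name handling can be examined in isolation by timing repeated calls to two variants of the 0-D derivative function, one using with() and one using positional indexing. This is a rough sketch: the function names, the parameter values and the helper mean_time are assumptions, and it times only the function calls, not a full integration, so the ratios will differ from those in Table 3.

```r
## Variant 1: names of state variables and parameters are used via with()
LV_named <- function(Time, State, Pars) {
  with(as.list(c(State, Pars)), {
    dP <- rG * P * (1 - P / K) - rI * P * C
    dC <- AE * rI * P * C - rM * C
    list(c(dP, dC))
  })
}

## Variant 2: positional indexing; Pars is ordered as (rI, rG, rM, AE, K)
LV_unnamed <- function(Time, State, Pars) {
  P <- State[1]
  C <- State[2]
  dP <- Pars[2] * P * (1 - P / Pars[5]) - Pars[1] * P * C
  dC <- Pars[4] * Pars[1] * P * C - Pars[3] * C
  list(c(dP, dC))
}

pars <- c(rI = 0.2, rG = 1.0, rM = 0.2, AE = 0.5, K = 10)   # assumed values
yini <- c(P = 1, C = 2)

## Mean elapsed time over nrun runs of ncall function evaluations each
mean_time <- function(f, nrun = 10, ncall = 2000) {
  mean(replicate(nrun,
    system.time(for (i in seq_len(ncall)) f(0, yini, pars))[["elapsed"]]))
}

t_named   <- mean_time(LV_named)
t_unnamed <- mean_time(LV_unnamed)
c(named = t_named, unnamed = t_unnamed)
```

The positional variant is faster to evaluate but depends silently on the order of the elements in Pars, which is the readability trade-off discussed in this section.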
The gain is much more pronounced when the model is implemented in Fortran rather than
in R: here the Fortran implementation executes 2 to 20 (1-D, 2-D) to 66 times (0-D) faster
(than R code 1). Finally, the difference between a Fortran model triggered by R, or a model
completely implemented in Fortran is very small; part of the difference is due to the checking
for illegal input values in the R integration routines.
6. Concluding remarks
The software R is rapidly gaining in popularity among scientists. With the launch of package
odesolve (Setzer 2001), it became possible to use R as a tool to solve initial value problems
of ordinary differential equations. The integration routines in this package opened up an
entirely new field of application, although it took a while before this was acknowledged. More
recent packages (rootSolve, bvpSolve) (Soetaert 2009; Soetaert, Cash, and Mazzia 2010) offer
to solve boundary value problems of differential equations.
The paper in R News by Petzoldt (2003) demonstrated the suitability of R for running
dynamic (ecological) simulations. More recently, a specially designed framework for ecological
modelling in R, simecol, emerged (Petzoldt and Rinke 2007); packages for inverse
modelling (FME) and reactive transport modelling (ReacTran) (Soetaert and Petzoldt 2010;
Soetaert and Meysman 2009) were created, while a framework for more general continuous
dynamic modeling, Rdynamic (Setzer in prep.) is under construction. An increasing number
of textbooks deal with the subject (Ellner and Guckenheimer 2006; Bolker 2008; Soetaert and
Herman 2009; Stevens 2009).
In order to efficiently solve a variety of differential equation models, a flexible set of integration
routines is required. It is with this goal in mind that the integration routines in deSolve were
selected. Whereas the original integration routine in odesolve only efficiently solved relatively
simple ODE systems, the suite of routines now also includes a solver for differential algebraic
equations and methods to solve partial differential equations.
In this paper we have shown that thanks to these new functions, R can now more efficiently
run 0-, 1-, 2-, and even 3-dimensional models of small to moderate size. Apart
from models implemented in pure R, it is possible to specify model functions in compiled code
written in any higher-level language that can produce shared libraries (or DLLs). The
integration routines then communicate directly with this code, without passing arguments to
and from R, so R is used just to trigger the integration and post-process the results. As the
entire simulation occurs in compiled code, there is no loss in execution speed compared to a
model that is fully implemented in the higher level language. But even in this case, all the
power of R as a pre- and post-processing environment as well as its graphical and statistical
facilities are immediately available – no need to import the model output from an external
source.
In the examples, linking compiled code to the integrator indeed made the model run faster,
by a factor of 2 (for the 10000 state variable 1-D model) up to more than 50 for the
smallest (2 state variable) model application, compared to an implementation as an R function.
There was only a small difference when running the model entirely in compiled code.
There are several reasons why compiled code is faster. First of all, R is an interpreted language,
and therefore processes the program at runtime. Every line is interpreted multiple
times at each time step. This makes interpreted code significantly slower than compiled code,
which is transformed directly into machine code before running. Note though that R is
a vectorized language and, compared to some other interpreted languages, less performance
is lost if R’s high level functions, based on optimized machine code, are efficiently exploited
(Ligges and Fox 2008). In our 1-D example, we used R function diff to take numerical
differences, whilst in the 2-D model, entire matrices were subtracted. Because of this use of
high-level functions, the simulation speed of these models, entirely specified in R, was quite
impressive, approaching the implementation in Fortran. Performance of R code especially
deteriorates when using loops. For instance, if the 2-D model is implemented by looping over all
rows, then the simulation time increases tenfold; when looping over rows and columns,
computation speed drops by 2 orders of magnitude! There are also trade-offs in using complex
variable types of R, especially if R performs extensive copying or internal data conversion. For
instance, the use of named variables and parameters introduced a computational overhead of
around 70% in our simplest model example. However, the effect was relatively less significant,
in the order of 10-20%, in more demanding models.
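The effect of looping can be reproduced on the dispersion term of the 1-D model alone. The sketch below, with hypothetical function names and assumed values for the profile P, Da and dx, computes the same negative flux gradient once with diff and once with explicit loops; the timings depend on the machine and the R version and are therefore not quoted.

```r
## Vectorized: fluxes on the N + 1 interfaces, zero-gradient boundaries
dispersion_vec <- function(P, Da, dx) {
  N <- length(P)
  Flux <- -Da * diff(c(P[1], P, P[N])) / dx
  -diff(Flux) / dx
}

## Same computation with explicit loops
dispersion_loop <- function(P, Da, dx) {
  N <- length(P)
  Flux <- numeric(N + 1)                 # boundary fluxes stay zero
  for (i in 2:N) Flux[i] <- -Da * (P[i] - P[i - 1]) / dx
  dP <- numeric(N)
  for (i in 1:N) dP[i] <- -(Flux[i + 1] - Flux[i]) / dx
  dP
}

P <- sin(seq(0, pi, length.out = 1000))  # arbitrary smooth profile

## Both versions must agree before their speed is compared
stopifnot(isTRUE(all.equal(dispersion_vec(P, 0.05, 0.02),
                           dispersion_loop(P, 0.05, 0.02))))

t_vec  <- system.time(for (k in 1:200) dispersion_vec(P, 0.05, 0.02))[["elapsed"]]
t_loop <- system.time(for (k in 1:200) dispersion_loop(P, 0.05, 0.02))[["elapsed"]]
c(vectorized = t_vec, loop = t_loop)
```

Checking that the two versions agree before timing them guards against comparing the speed of two different computations.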
The use of code in a dynamically linked library also has its drawbacks. First of all, it is less
flexible. Whereas it is simple to interact with models specified in R code, this is not at all the
case for compiled code: before the model code can be executed, it has to be formally compiled,
and the DLL loaded. Secondly, errors may be particularly hard to trace and may even cause
R to terminate. The lack of easy access to R’s high-level procedures is another drawback of
using compiled code, where much more has to be hand-coded. Note though that, as from
deSolve version 1.5, the interpolation of external signals (also called forcing functions) to the
current timepoints is taken care of by the integration routines; the compiled-code equivalent
of R function approxfun.
Putting these pros and cons together, the optimal approach is probably to use pure R for
the initial model development (rapid prototyping). In case the model executes too slowly, or
when a large number of simulations are performed, implementing the model in C, C++ or
Fortran may be considered.
Finally, the creation and solution of a mathematical model is never a goal in itself. Models
are used, amongst other things to challenge our understanding of a natural system, to make
budgets or to quantify immeasurable processes or rates. When used in this way, the interaction
with data is crucial, as is statistical treatment and graphical representation of the model
outcome and the data. We hope that R’s excellence in these fields, and the fact that it is
entirely free, will give impetus to also using R as a modelling platform.
Acknowledgments
The authors would like to thank our many colleagues and other R enthusiasts who have
tested the package. Two anonymous reviewers gave constructive comments on the paper. Also
thanks to Jan de Leeuw and Achim Zeileis for bringing this to a happy conclusion. The United
States Environmental Protection Agency through its Office of Research and Development
collaborated in the research described here. It has been subjected to Agency review and
approved for publication.
References
Ascher UM, Petzold LR (1998). Computer Methods for Ordinary Differential Equations and
Differential-Algebraic Equations. SIAM, Philadelphia.
Bolker B (2008). Ecological Models and Data in R. Princeton University Press, Princeton.
URL http://www.zoology.ufl.edu/bolker/emdbook/.
Brenan KE, Campbell SL, Petzold LR (1996). Numerical Solution of Initial-Value Problems
in Differential-Algebraic Equations. SIAM Classics in Applied Mathematics.
Brown PN, Byrne GD, Hindmarsh AC (1989). “VODE, A Variable-Coefficient ODE Solver.”
SIAM Journal on Scientific and Statistical Computing, 10, 1038–1051.
Brown PN, Hindmarsh AC, Petzold LR (1994). “Using Krylov Methods in the Solution of
Large-Scale Differential-Algebraic Systems.” SIAM Journal on Scientific and Statistical
Computing, 15(6), 1467–1488. doi:10.1137/0915088.
Cash JR, Karp AH (1990). “A Variable Order Runge-Kutta Method for Initial Value Problems
With Rapidly Varying Right-Hand Sides.” ACM Transactions on Mathematical Software,
16, 201–222.
Crank J (1975). The Mathematics of Diffusion. 2nd edition. Clarendon Press, Oxford.
Dormand JR, Prince PJ (1981). “High Order Embedded Runge-Kutta Formulae.” Journal of
Computational and Applied Mathematics, 7, 67–75.
Eisenstat SC, Gursky MC, Schultz MH, Sherman AH (1982). “Yale Sparse Matrix Package. I.
The Symmetric Codes.” International Journal for Numerical Methods in Engineering, 18,
1145–1151.
Ellner SP, Guckenheimer J (2006). Dynamic Models in Biology. Princeton University Press,
Princeton. URL http://www.cam.cornell.edu/~dmb/DMBsupplements.html.
Hairer E, Wanner G (1980). Solving Ordinary Differential Equations: Stiff Systems Vol. 2.
Springer-Verlag, Heidelberg.
Hindmarsh AC (1983). “ODEPACK, A Systematized Collection of ODE Solvers.” In R Stepleman
(ed.), Scientific Computing, Vol. 1 of IMACS Transactions on Scientific Computation,
pp. 55–64. IMACS / North-Holland, Amsterdam.
Hofmann AF, Meysman FJR, Soetaert K, Middelburg JJ (2008). “A Step-by-Step Procedure
for pH Model Construction in Aquatic Systems.” Biogeosciences, 5(1), 227–251. URL
http://www.biogeosciences.net/5/227/2008/.
Ligges U, Fox J (2008). “R Help Desk: How Can I Avoid This Loop or Make It Faster?”
R News, 8(1), 46–50. URL http://CRAN.R-project.org/doc/Rnews/.
Lotka AJ (1925). Elements of Physical Biology. Williams & Wilkins Co., Baltimore.
Petzold LR (1983). “Automatic Selection of Methods for Solving Stiff and Nonstiff Systems of
Ordinary Differential Equations.” SIAM Journal on Scientific and Statistical Computing,
4, 136–148.
Petzoldt T (2003). “R as a Simulation Platform in Ecological Modelling.” R News, 3(3), 8–16.
URL http://CRAN.R-project.org/doc/Rnews/.
Petzoldt T, Rinke K (2007). “simecol: An Object-Oriented Framework for Ecological Modeling
in R.” Journal of Statistical Software, 22(9), 1–31. URL
http://www.jstatsoft.org/v22/i09/.
Press WH, Teukolsky SA, Vetterling WT, Flannery BP (2007). Numerical Recipes. 3rd
edition. Cambridge University Press.
R Development Core Team (2009). R: A Language and Environment for Statistical Computing.
R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL
http://www.R-project.org/.
Schiesser WE (1991). The Numerical Method of Lines: Integration of Partial Differential
Equations. Academic Press, San Diego.
Setzer RW (2001). The odesolve Package: Solvers for Ordinary Differential Equations. R
package version 0.1-1, URL http://CRAN.R-project.org/package=odeSolve.
Setzer RW (in prep.). RDynamic, an R Package for Dynamic Modelling. R package version
0.1-1, URL http://r-forge.r-project.org/projects/rdynamic/.
Soetaert K (2009). rootSolve: Nonlinear Root Finding, Equilibrium and Steady-state
Analysis of Ordinary Differential Equations. R package version 1.6, URL
http://CRAN.R-project.org/package=rootSolve.
Soetaert K, Cash JR, Mazzia F (2010). bvpSolve: Solvers for Boundary Value Problems of
Ordinary Differential Equations. R package version 1.1, URL
http://CRAN.R-project.org/package=bvpSolve.
Soetaert K, Petzoldt T (2010). “Inverse Modelling, Sensitivity and Monte Carlo Analysis in
R Using Package FME.” Journal of Statistical Software, 33(3), 1–28. URL
http://www.jstatsoft.org/v33/i03/.
Soetaert K, Petzoldt T, Setzer RW (2009). deSolve: General Solvers for Initial Value Prob-
lems of Ordinary Differential Equations (ODE), Partial Differential Equations (PDE), Dif-
ferential Algebraic Equations (DAE), and Delay Differential Equations (DDE). R package
version 1.7, URL http://CRAN.R-project.org/package=deSolve.
Function        Description
ode             integrates systems of ordinary differential equations (ODEs),
                assumes a full, banded or arbitrary sparse Jacobian
ode.1D          integrates systems of ODEs resulting from multicomponent
                1-dimensional reaction-transport problems
ode.2D          integrates systems of ODEs resulting from 2-dimensional
                reaction-transport problems
ode.3D          integrates systems of ODEs resulting from 3-dimensional
                reaction-transport problems
ode.band        integrates systems of ODEs resulting from unicomponent
                1-dimensional reaction-transport problems
daspk           solves systems of differential algebraic equations (DAEs),
                assumes a full or banded Jacobian
dede            solves delay differential equations (DDEs)
lsoda           integrates ODEs, automatically chooses method for stiff or
                non-stiff problems, assumes a full or banded Jacobian
lsodar          same as lsoda, but includes a root-solving procedure
lsode or vode   integrates ODEs, user must specify if stiff or non-stiff,
                assumes a full or banded Jacobian; lsode includes a
                root-solving procedure
zvode           same as vode, but for complex state variables
lsodes          integrates ODEs, using stiff method and assuming an arbitrary
                sparse Jacobian
rk              integrates ODEs, using Runge-Kutta methods (includes
                Runge-Kutta 4 and Euler as special cases)
rk4             integrates ODEs, using the classical Runge-Kutta 4th order
                method (special code with fewer options than rk)
euler           integrates ODEs, using Euler’s method (special code with
                fewer options than rk)
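As a brief illustration of how the functions listed above relate to each other, the sketch below integrates one system with two of the listed solvers, selected through the method argument of ode. The predator-prey model and its parameter values are hypothetical choices made for this example only.

```r
# Hypothetical predator-prey model; parameter values chosen for illustration.
lv <- function(t, y, parms) {
  with(as.list(c(y, parms)), {
    dprey <- r * prey - a * prey * pred
    dpred <- e * a * prey * pred - m * pred
    list(c(dprey, dpred))   # derivatives in the same order as the state vector
  })
}

parms <- c(r = 1, a = 0.5, e = 0.5, m = 0.4)
y0    <- c(prey = 2, pred = 1)
times <- seq(0, 20, by = 0.1)

# Integrate only if deSolve is installed, so the sketch runs either way.
if (requireNamespace("deSolve", quietly = TRUE)) {
  library(deSolve)
  # Default method of ode is lsoda (automatic stiff / non-stiff switching)
  out_lsoda <- ode(y = y0, times = times, func = lv, parms = parms)
  # Classical fixed-step Runge-Kutta 4th order method
  out_rk4   <- ode(y = y0, times = times, func = lv, parms = parms,
                   method = "rk4")
  # Largest absolute difference in the prey trajectory between the two solvers
  print(max(abs(out_lsoda[, "prey"] - out_rk4[, "prey"])))
}
```

Because the model function and the call signature stay the same, switching solvers during model development only requires changing the method argument.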
Affiliation:
Karline Soetaert
Centre for Estuarine and Marine Ecology (CEME)
Netherlands Institute of Ecology (NIOO)
4401 NT Yerseke, The Netherlands
E-mail: [email protected]
URL: http://www.nioo.knaw.nl/users/ksoetaert/
Thomas Petzoldt
Institut für Hydrobiologie
Technische Universität Dresden
01062 Dresden, Germany
E-mail: [email protected]
URL: http://tu-dresden.de/Members/thomas.petzoldt/
R. Woodrow Setzer
National Center for Computational Toxicology
US Environmental Protection Agency
United States of America
URL: http://www.epa.gov/ncct/