1. Introduction
Trees have a well-established role in the analysis of numerical methods for ordinary differential equations. In this paper, the more general concept of a stump is introduced and applied to the analysis of B-series and the composition rule. It is also shown how stumps can be used to analyse the order of nonautonomous scalar problems for which the order conditions for Runge–Kutta methods are slightly different. A new explanation is given for this discrepancy.
In
Section 2, a brief survey is given of the theory of Runge–Kutta methods, showing the structure of the elementary differentials on which B-series are based and the relationship of elementary differentials to trees. This is followed by
Section 3, in which stumps are introduced. These are a generalisation of trees, but, by restricting to “atomic stumps”, they also provide a means of generating all trees. Isomeric classes of trees generated in this way provide a framework for the analysis of order conditions in the scalar case, as shown in
Section 4. The paper concludes with the derivation of a method of “ambiguous order”. That is, the method has order 4 in general, but this increases to 5 for a scalar problem.
The theory of stumps, isomeric trees, and applications to scalar differential equations appear in greater detail in [
1]. The theory of trees and applications to vector-based numerical methods can be found, for example, in [
2]. The order of the method in [
3] was studied in [
4].
2. Trees, Elementary Differentials, and B-Series
Trees are graphs such as
,
,
,
,
,
,
,
. The “root” of a tree is the lowest point in the diagram, and all vertices, except for the root, have a single parent. For a given tree
t, the “order of
t”, written as
, is the number of vertices in
t. If a vertex
v is the parent of
, then
is a child of
v. If there exists a path
then
is a “descendant” of
. The product of the number of descendants for every vertex in a tree
t is defined to be the “factorial of
t” and is written as
.
For the first eight trees, the order and factorial are the following:
2.1. Notation and Recursions
In this paper,
:=
, and we recall two recursions to build other trees in terms of
. There are two convenient constructions for building complicated trees in terms of simpler trees. They are the following:
Given trees
,
, …,
, define
from the diagram
The notation
is used to show repetitions of
, … Assuming the
are distinct, then the “symmetry”
is defined recursively by
Given trees
and
, define
from the diagram
2.2. Polish Notation Tree Construction
Polish notation or prefix (as distinct from infix or postfix) notation is credited to Lukasiewicz. A famous reference to his work is [
5]. We generalise the notation so that
acts as a prefix operator on
m operands and thus
has the same meaning as
. This gives a third and bracketless scheme for writing trees. In
Table 1, the various notations are given side by side. It is noted that the notation based on
does not always give a unique factorisation.
2.3. Elementary Differentials
Given an autonomous initial value problem,
we write
and also write the sequence of Fréchet derivatives of
f, evaluated at
, as
,
,
, … It is noted that, in linear algebra terms, these are linear, bilinear, and multilinear operators. In this paper, we always use Polish notation so that
acting on the
m vectors
,
, …,
is written as
.
Definition 1. The elementary differential associated with the tree t is defined by It is noted that the recursion formula can also be written as
This makes it possible, in the Polish form of tree notation, to perform a simple substitution. That is, every is replaced by f, and every τm is replaced by f(m).
2.4. Application to B-Series
Given a function
, the corresponding B-series is a formal Taylor series:
Two special cases are the following:
, which gives the Taylor series for the solution to Equation (
1) at
. The series is
, where
is the corresponding elementary weight for a specific Runge–Kutta method. This gives the Taylor series for the approximation computed by this Runge–Kutta method:
By comparing Equations (
2) and (
3), we recover the conditions for a Runge–Kutta method to have order
p:
3. Trees, Forests, and Stumps
A sequence of items built from , , , …, can be contracted by the rules of Polish operations to form a sequence of trees, together with a final subsequence that might not be a tree but would become one if further operands are appended on the right. The sequence of trees on the left is usually referred to as a forest and can be converted into a single tree by a suitable operator to the left of this subsequence.
Incomplete “trees” are referred to as stumps. Examples are
The “valency” of a stump is the number of copies of , appended to the right, that would be required to convert it into a tree. It is convenient to refer to a tree as a stump with zero valency.
The word “forestump” is introduced to refer to a sequence of items made up from factors and , When a particular forestump is contracted to form as many trees as possible, then the final form will be the formal product of a forest of trees followed by a single stump (possibly the empty stump).
3.1. Bicolour Diagrams to Represent Stumps
We now introduce a generalisation of the way trees are represented diagrammatically to include stumps. We regard stumps as modified trees with some leaves removed but with the edges from these missing leaves to their parents retained.
In the examples given here, a white disc represents the absence of a vertex. The number of white discs is the valency, with the remark that trees are stumps with zero valency.
Right multiplication by one or more additional stumps implies grafting to open valency positions. It is noted that the third and fourth examples of valency 2 stumps are mirror images. This is significant in determining the precedence of the operands.
3.1.1. Products of Stumps
Given two stumps
s and
, the product
has a nontrivial product if
is not the trivial stump
and
s has valency of at least 1; that is, if
s is not a tree, the product is formed by grafting the root of
to the rightmost open valency in
s.
Two examples of grafting illustrate the significance of stump orientations:
If s is a tree or is the trivial stump, no contraction takes place.
3.1.2. Atomic Stumps
An atomic stump is a graph of the following form:
It is noted that no more than two generations can be present.
If m of the children of the root are represented by black discs and n are represented by white discs, then the atomic stump is denoted by . The reason for the designation “atomic” is that every tree can be written as the product of atoms.
This is illustrated for trees of up to order 4:
3.1.3. Isomeric Trees
In the factorisation of trees into products of atoms, the factors are written in a specific order, with each factor operating on later factors. However, if we interpret the atoms just as symbols that can commute with each other, we obtain a new equivalence relation, written as ∼.
Definition 2. Two trees are isomeric if their atomic factors are the same.
Nothing interesting happens up to order 4, but for order 5, we find that
It is a simple exercise to find all isomeric classes of any particular order, but, as far as the author knows, this has not been done above order 6.
For orders 5 and 6, the isomers are, line by line, the following:
We see in
Section 4 that isomeric classes for
scalar differential equations have a similar role to individual trees in the case of differential systems of arbitrarily high dimension. We let
denote the number of trees with order
n and
denote the accumulated total
. Similarly, we let
denote the number of isomeric classes with order
n and
denote the accumulated total for this quantity. These are shown in
Table 2 up to order 6.
4. Scalar Differential Equations
Early studies of Runge–Kutta methods derived order conditions for the scalar initial value problem
instead of using the autonomous test problem (Equation (
1)).
The full set of conditions up to some specified order becomes the starting point for finding accurate Runge–Kutta methods. The derivations of these conditions to order 5 were the pioneering contributions of Runge, Heun, and then Kutta [
6,
7,
8]. We follow their arguments for the same model problem (Equation (
5)). In this derivation,
and
, with similar notations for higher partial derivatives. First, we find the second derivative of
y by the chain rule:
Similarly, we find the third derivative:
and carry on to find fourth and higher derivatives. By evaluating
at
, we can find the Taylor expansions to use in Equation (
5). A more complicated calculation leads to the detailed series of Equation (
8) in the case of any particular Runge–Kutta method and hence to the determination of its order. We pursue this line of enquiry below.
The greatest achievement in this line of work was given in [
3], where sixth order methods involving eight stages were derived. In all the derivations of new methods, up to the publication of this tour de force, a tacit assumption was made. This was that a method derived to have a specific order for a general scalar problem will have this same order for a coupled system of scalar problems; that is, it will have this order for a problem with
. This bald assumption is untrue, and it becomes necessary to carry out the order analysis in a multidimensional setting.
4.1. Nonautonomous Vector-Valued Problems
This analysis was carried out in a scalar context, in contrast to later work, for which the application was always to vector-valued problems. To cater for problems that are both nonautonomous and, at the same time, vector-valued, we can use the terminology of the present section but with a multidimensional interpretation.
This is done by regarding factors such as and as linear operators and bilinear operators, respectively, that operate on vector-valued terms to the right, using Polish notation. To maintain this interpretation, when a problem is nonscalar, this requires the strict order of factors to be observed. Of course, in the traditional scalar interpretation, all factors commute, and the order of factors could have been altered.
4.2. Systematic Derivation of Taylor Series
The evaluation of
,
, is now carried out in a systematic manner. We let
We also let
denote
evaluated at
.
Using Lemma 1, we find in turn that
To find the order conditions for a Runge–Kutta method, up to order 5, we need to systematically find the Taylor series for the stages and finally for the output. In this analysis, we assume that for all stages. For the stages, it is sufficient to work only to order 4, so that the scaled stage derivatives include terms.
As a step towards finding the Taylor expansions of the stages and the output, we need to find the series for
for a given series
. In the following result, we use an arbitrary weighted series using the terms in Equation (
8).
Proof. Throughout this proof, an expression of the form
is assumed to have been evaluated at
. Evaluate
,
,
, and
:
where
The evaluation of is similar but more complicated and is omitted. ☐
For the stage values of a Runge–Kutta method, we have
and then, to one further order,
A similar expression can be written down for the output from a step:
A comparison with the exact solution,
, evaluated using Equation (
8), gives, under second order conditions,
This analysis can be taken further in a straightforward and systematic way and is summarised, as far as order 5, in Theorem 1. This theorem, for which the detailed proof is omitted, has to be read together with
Table 3.
Theorem 1. In the statement of this result, the quantities p, , σ, and ϕ are given in Table 3. - 1.
The Taylor expansion for the exact solution to the initial value problemto within is plus the sum of terms of the form - 2.
The Taylor expansion for the numerical solution to Equation (9), using a Runge–Kutta method , to within is plus the sum of terms of the form - 3.
The conditions to order 5, for the solution of Equation (5) using , are the equations of the form
4.3. Order Conditions for Vector Problems
The order conditions for the autonomous vector problem, given by Equation (
4) for
, are identical to (O1)–(O11) and (O14)–(O17) together with the two cases of (
4) missing from
Table 3:
Although these do not occur in
Table 3, the sum of (O12) and (O13) is equal to
which does occur as an un-numbered entry in
Table 3. Apart from this discrepancy, the order conditions for the scalar and vector problems exactly agree as far as order 5.
4.4. Derivation of Ambiguous Method
We now construct a method that has order 5 for a scalar problem but only order 4 for a vector-based problem. This means that all the conditions
need to be satisfied for the 17 trees such that
, except for (O12) and (O13), which can be replaced by Equation (
10). In constructing this method, it is convenient to introduce a vector
defined as
This satisfies the property
because
In the method to be constructed, some assumptions are made. These are
From Equations (
13) and (
14) and some of the order conditions, it follows that
, implying that
and hence that
. We choose the convenient values
and
together with
and
. The value of
b is found from (O1), (O2), (O3), (O5), and (O9), and
d is found from Equation (
11) with the requirement that
. The results are
where
is a parameter, assumed to be nonzero. The third row of
A can be found from
because, from several order conditions,
From Equation (
15), it is found that
. The values of
and
can be written in terms of the other elements of rows 4 and 5 of
A, and row 6 can be found in terms of the other rows. There are now four free parameters remaining (
,
,
, and
) and four conditions that are not automatically satisfied. These are (O11), (O16), (O17), and Equation (
10). The solutions are given in the complete tableau.
5. Numerical Test
A suitable single differential equation to test the order of convergence of this method, together with a closely related autonomous system, is
The solution of Equation (
17), in parametric coordinates, is
and this is also the solution to Equation (
18).
Two experiments were carried out:
The scalar problem (Equation (
17)) was solved using the method of Equation (
16) on the interval
.
The two-dimensional problem of Equation (
18), using the same method, was solved on the interval
.
In each case,
for
. The errors for the two methods and the various numbers of steps are shown in
Table 4. Also shown are the errors for
n steps divided by the error for
steps.
As expected, the numerical behaviour for experiment 1 was consistent with order 5. In contrast, for experiment 2, the numerical behaviour was consistent only with order 4.
6. Discussion
There is little scientific interest in the solution of scalar initial value problems, and there is no advantage in constructing numerical methods that are suitable only for this special class of problems. Hence, in the search for useful numerical methods, it is an advantage to use tree-based theory. The results presented here emphasise the danger of using scalar theory to derive methods of order higher than 4 because they could be incorrect.