1. Introduction
Recently, unmanned aerial vehicles (UAVs) have been increasingly used in both the civilian and military spheres, mainly due to their relatively low cost, flexibility, and the elimination of the need for an on-board pilot. The use of UAV swarms is of particular importance, especially with the increased autonomy of their elements. It is expected [1] that autonomous UAV swarms will become a key element of future military operations, as well as of civilian applications including security, reconnaissance, intrusion detection, and support for Search and Rescue (SAR) or Disaster Recovery (DR) operations. DR operations are extremely challenging, and in the immediate aftermath of a disaster, one of the most pressing requirements is situational awareness. UAV swarms provide an indispensable platform for building situational awareness in such cases. The obvious benefits of using UAV swarms are an increase in the efficiency of the operation, an accelerated process of its execution, and an increased probability of success. Their use in wilderness search and rescue (WiSAR), in particular, has been investigated for fast search-area coverage. One of the most important tasks in WiSAR is search – until a missing person has been found, they cannot be rescued or recovered. Many search tasks require a number of UAVs to remain in communication at all times and in contact with the base station via a short-range ad hoc wireless network. For example, a swarm of UAVs must disperse (take proper starting positions) to find the missing person as quickly as possible before their energy reserves run out. However, in order to operate a swarm of UAVs efficiently, it is necessary to address the various autonomous behaviors of its constituent elements, sometimes with conflicting goals, to achieve a high level of adaptation and human-like cognitive behavior. Therefore, it is necessary to conduct research on methods of increasing the autonomy and interoperability of UAVs while limiting global communication and dependence on the human operator.
An individual UAV can perform different tasks, such as terrain reconnaissance, close-up inspection of selected areas, communication relaying, and target pursuit. A UAV can intelligently take on a role depending on the situation. Such a high degree of adaptation and cooperation in complex scenarios requires innovative solutions at the design stage of the UAV swarm system and appropriate methods of its verification and testing.
A UAV swarm is a special case of a robot swarm. There are a few definitions of a robot swarm; most of them differ in the capabilities of its elements and of the swarm itself, but all of them link it to multirobot systems. A multirobot system consists of multiple robots cooperating to accomplish a given task. The main features associated with multirobot systems are scalability, robustness, flexibility, and decentralized control. In [2], multirobot systems that are not swarms are defined as those that have explicitly stated goals and in which robots execute individual and/or group tasks. Additionally, robots in such systems have roles that can change during the course of a mission. In the same work, it was pointed out that in a swarm system, swarm behavior emerges from local interactions between robots. In [3], the authors defined as a swarm system any robotic system that is capable of performing “swarm behavior”. A frequently quoted definition of robotic swarms is the one presented in [4]:
“Swarm robotics is the study of how large number of relatively simple physically embodied agents can be designed such that a desired collective behavior emerges from the local interactions among agents and between the agents and the environment.”
In the same paper, it was recommended that a swarm system should have the following properties.
Robots in a swarm should be autonomous and have the abilities to relocate and interact with other objects in the environment.
A swarm control method should allow for the coexistence of a large number of robots.
A swarm can be either homogeneous or heterogeneous. If a swarm is heterogeneous, it consists of multiple homogeneous subgroups.
Communication and perception capabilities are local. This means that robots do not know the global state of the environment at any moment.
For the purpose of this article, we will adopt the definition presented in [
3] with the features described above.
In recent surveys [
2,
3,
5], there are multiple classifications of tasks that a swarm can be given to perform. Below, we will use one that was defined in [
5].
Swarm tasks can be divided into three categories:
Spatial organization—tasks in this category focus on obtaining some spatial property of the swarm. An example of such a property might be the distance between robots. Typical tasks of this kind are aggregation, dispersion, coverage, and pattern formation.
Collective motion—this group consists of obstacle avoidance and object-gathering tasks. What makes collective motion different from spatial organization is that in the latter we are mainly focused on the rearrangement of individual robots within a swarm, while in collective motion we are generally focused on the swarm as a whole. Typical tasks in this group are exploration, foraging (finding and collecting specific objects on the map), collective navigation (which aims at constructing, maintaining and, if needed, modifying a formation heading in some direction), and collective transport (in which a swarm tries to move an object that is otherwise too heavy for a single robot in the swarm).
Decision-making—in this group, robots make decisions that should lead to a consensus within a swarm. A decision is based on the local perception of the environment and on information received from other robots. In the context of robot swarms, this kind of task appears in situations where there is no access to globally shared information. Typical tasks in this group are consensus (where a swarm tries to settle on a decision that all of its members agree on), task allocation (where robots select from an array of available tasks to perform), and localization.
Due to the large number of tasks that a swarm can be given to perform and their complexity, many areas have served as inspiration for swarm robotics over the years. Based on the survey presented in [2], we can distinguish four main areas that have served as an inspiration for robot swarm design:
Biology—a vast number of solutions and design methods originated directly from the observation of real-world swarms. To name a few, bird flocks and bee and ant swarms have served as such sources of inspiration. A well-known example of a robotic swarm inspired by biology is presented in [6]. All solutions based on evolutionary processes can also be included in this category. A comprehensive introduction to bioinspired multirobot systems can be found in [7,8].
Control theory—this category includes all designs where the physical aspects of robots are modeled as continuous-time, continuous-space dynamical systems and communication between robots is modeled using graph theory. In some works [3], designs based on graph theory are considered a separate group. A concise introduction to solutions extensively using control theory can be found in [9]. It is worth noting that these kinds of design methods give formal guarantees of correct execution as long as the requirements are met. Unfortunately, this group poorly accounts for nondeterministic mission elements, and the requirements imposed on a swarm are often unrealistic, as stated in [10].
Amorphous computing and aggregate programming—the main idea behind amorphous computing [11] is to use a large number of identical computers distributed across a space. It is assumed that these computers have only local communication capabilities and do not know their position. Because of these assumptions, amorphous computing closely resembles swarm systems. An example of a software implementation of this paradigm is the Proto language [12]. In turn, aggregate programming [13] is a paradigm that focuses on the development of large-scale systems from the perspective of their totality rather than their individual elements. One prominent aggregate programming approach is based on the field calculus [14]. An implementation of this paradigm is, for example, Protelis [15]. It is worth noting that it is currently used to model IoT-like systems.
Physics—swarm design methods inspired by physics are mainly focused on two ideas: artificial forces [16] and Brownian motion [10]. As pointed out in [2], a characteristic feature of physics-inspired swarm design methods is that they tend to consider interactions between robots as passive. This means that there may be no message-exchanging communication between agents; instead, robots interact indirectly with each other (most of the time through some kind of force).
There are multiple taxonomies concerning different aspects of swarm robotics. This includes swarm design methods and methods of analysis of both models and swarms themselves. For example, the taxonomy proposed in [17] distinguishes swarms based on their features, such as their size or communication capabilities. Other taxonomies, presented in [18,19], categorize, among others, methods of swarm modeling and analysis as well as different ways to design swarm behavior. These taxonomies are especially important for us, as they allow us to compare our proposition with existing methods of swarm design. In [18], the authors divided methods of swarm modeling into two groups. The first group, called top-down (sometimes referred to as macroscopic), encapsulates all methods that start from defining a desired swarm behavior and then try to construct robots that exhibit this behavior. The second way of designing robotic swarms, defined as bottom-up or microscopic, focuses first on the capabilities and behavior of the members of the designed swarm. Next, it is checked whether the designed swarm is capable of carrying out a given mission. Both design methods have their pros and cons, as discussed in [20]. The key difference between them is the point from which the design process starts.
In the same work, swarms are distinguished based on their capability to improve their results. These can be either
non-adaptive,
learning, or
evolutionary. A swarm is
non-adaptive if the only way to improve its performance is by manual modification by the designer. In turn, a swarm can be described as
learning if the parameters of the algorithm it uses are automatically modified during task execution. Finally, if these parameters are modified in an iterative manner during the design stage with the use of evolution-based techniques, we can describe the swarm behavior as evolving. In [19], a similar classification of swarm design methods has been proposed. According to this taxonomy, design methods can be described either as
behavior-based or
automatic. The first group consists of all methods where a swarm behavior is designed manually by the designer and improved with the trial and error method. The second group is made of all methods where a swarm behavior is constructed without a substantial involvement of the designer.
A constructed swarm model with a behavior policy for the swarm elements can be verified in two ways: using real robots or using simulators. This work is focused on the earlier stages of robotic swarm development, so we will only briefly cover the key achievements in this field.
The most obvious way to verify a robotic swarm model is with real robots. The most commonly cited swarm robot projects are
swarm-bots [
21], its successor
swarmanoid [
22], and the
Kilobot project [
23]. All of them are capable of performing multiple types of swarm behavior, which suggests that they are all equipped with sufficiently powerful hardware. This, in turn, leads us to believe that the lack of widespread use of robotic swarms is due to insufficient behavior modeling techniques.
Based on an up-to-date state-of-the-art survey [5], it can be seen that there are a number of different simulators designed to help designers verify their work. They vary in terms of performance and the versatility of accepted solutions. In our opinion, two of them are worth recommending to those wanting to verify their theoretical results:
ARGoS [
24]—is an open-source simulator whose key features are efficiency, flexibility, and accuracy. According to the information provided by the author, it is used by the academic community around the world.
CoppeliaSim [
25]—(previously known as V-REP) is a very advanced simulator which seems to be used by many commercial and academic institutions globally. It is free for academic use.
One of the proven methods of designing complex systems, which UAV swarm systems certainly are, is engineering based on formal models. Formal models offer a number of possibilities to automate the system design process, including verification of the behavior of the designed system. They allow us to better understand a modeled system and facilitate its analysis. Formal models provide mathematical abstractions of the designed system; they can be validated against requirements, tested using various infrastructures, and also used to directly simulate the behavior of the system. One such formalism, which can be used for UAV swarm modeling, is bigraphs with tracking. Bigraphs were introduced by R. Milner [26] as a formalism to model systems in which placement and intercommunication between elements play an important role. Despite their novelty, there are already a few extensions that broaden their applicability. These are, among others, stochastic bigraphs [27], bigraphs with sharing [28], and bigraphs with tracking [26]. A quick introduction to bigraphs with a real-world use case can be found in [
29].
It is important to emphasize that there are currently very few works on robot swarms using bigraphs. Examples [30,31] in the field of multi-agent systems do not typically show how to generate a behavior policy for swarm elements based on the created models. The only solution we have found that does present a method of generating a behavior policy based on a bigraphical model was presented in [32]. It uses basic bigraphical notation mixed with the actor model [33]. In our opinion, it is not an automatable method of swarm design.
Currently, there are only a few tools supporting design with bigraphs, although it seems there are ongoing works [34] to change that. To the best of our knowledge, there are only two utilities for designing with bigraphs that are beyond the proof-of-concept stage. The most advanced tool for the modeling, verification, and simulation of bigraphical systems is BigraphER [35]. The second one, the Bigraphical Model Checker (BigMC) [36], a tool for verifying the reachability of states, is no longer developed.
In this paper, we will present a method of modeling a UAV swarm together with a way of generating a behavior policy for swarm elements based on the constructed model. Our goal is to present a swarm modeling method with the following features:
It separates the modeling stage from the generation of behavior for swarm elements.
It is flexible in the sense that it can be used for a large number of different swarm tasks.
It is capable of generating behavior policies on multiple levels of abstraction (from a single agent, through groups of agents, to an entire swarm as a whole).
It is highly automatable. This is a desirable property because it indirectly enforces the universal applicability of a method to different design problems. Additionally, automatic methods that are not monolithic tend to be modular, which in turn leads to standardization.
In the next section, we will present a method of modeling UAV swarm systems based on bigraphs with tracking. We will also define a way of constructing a behavior policy which guarantees the successful execution of a given mission, assuming the previously defined requirements are met. Our method is inspired by the work presented in [37]. Although very interesting, that work has two major shortcomings. First, the requirement definition stage is loosely coupled with the modeling stage. We wanted to address this issue and allow the capabilities of robots and the mission requirements to be formally transformed into model elements. The second issue is the assumption of identical behavior for all swarm elements. We do not consider this a necessary requirement for a swarm, although this may differ depending on the accepted definition of a robot swarm.
One of the advantages of our method is that the whole process can be automated from the moment of defining the mission requirements (as bigraphical patterns) and the robots' capabilities. We have demonstrated this with software libraries [38,39,40].
To summarize, according to the taxonomies presented in [18], our method can be categorized as bottom-up and problem-agnostic, and the generated behavior can be considered non-adaptive. In turn, using the taxonomies presented in [19], our method can be considered automatic, and the method of analysis of a constructed model can be viewed as macroscopic (i.e., we analyze the whole swarm and not the individual interactions between its elements).
2. Methods and Materials
In this section, we will define the formal elements and operations necessary to model a UAV swarm mission and to determine a sequence of actions for the swarm elements. For easier understanding, we have provided micro-examples at the end of each subsection.
Our proposition can be described as follows. We start by defining a UAV swarm mission as a Tracking Bigraphical Reactive System (TBRS). We then transform this TBRS into a state space represented as a directed multigraph. Finally, we construct a behavior policy for the swarm elements. As we treat the state space as a directed multigraph with edges corresponding to actions performed by swarm elements, we can define a behavior policy as a walk (a finite-length alternating sequence of vertices and edges) from the vertex representing the initial state of the mission to a vertex representing a final state (there can be a few of those). A final state is a desirable outcome of the mission. We have proposed a method of finding all walks between any pair of vertices consisting of a specified number of edges, including loops.
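To make the notion of a behavior policy as a walk more tangible before the formal definitions, the sketch below naively enumerates all walks of a fixed length in a small directed multigraph. It is only an illustration: the graph, the state names, and the labels r1/r2 follow the two-UAV micro-example used later in this section, and the enumeration is not the matrix-based construction introduced in Section 2.3.

```python
# Illustration only: naive enumeration of walks of a fixed length in a directed
# multigraph; states and labels mimic the two-UAV micro-example from this section.
from typing import Dict, List, Tuple

# Edges are stored per source vertex as (edge_label, target_vertex) pairs,
# so parallel edges with different labels are allowed.
Multigraph = Dict[int, List[Tuple[str, int]]]

def walks_of_length(graph: Multigraph, start: int, goal: int, steps: int) -> List[List[str]]:
    """Return every walk with exactly `steps` edges from `start` to `goal`,
    as the sequence of edge labels (i.e., actions) taken."""
    if steps == 0:
        return [[]] if start == goal else []
    walks = []
    for label, nxt in graph.get(start, []):
        for tail in walks_of_length(graph, nxt, goal, steps - 1):
            walks.append([label] + tail)
    return walks

# Toy state space: 0 = both UAVs in area A, 1 = one UAV in area B, 2 = both in B.
state_space: Multigraph = {
    0: [("r1", 1), ("r2", 2)],   # one UAV moves alone, or both move cooperatively
    1: [("r1", 2)],
}
print(walks_of_length(state_space, 0, 2, 1))  # [['r2']]
print(walks_of_length(state_space, 0, 2, 2))  # [['r1', 'r1']]
```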
2.1. Bigraphs
A bigraph consists of two graphs: a place graph and a link graph. The place graph models spatial relations between the system's elements. The link graph is a hypergraph that models interlinking between the elements.
Formally, a bigraph is defined as $B = (V_B, E_B, ctrl_B, prnt_B, link_B) : \langle m, X \rangle \to \langle n, Y \rangle$, where
$V_B$—a set of vertex identifiers;
$E_B$—a set of hyperedge identifiers;
$ctrl_B : V_B \to K$—a function assigning a control type to vertices. $K$ denotes a set of control types and is called the signature of the bigraph;
$B^P = (V_B, ctrl_B, prnt_B) : m \to n$ and $B^L = (V_B, E_B, ctrl_B, link_B) : X \to Y$ denote a place graph and a link graph, respectively. The function $prnt_B : m \uplus V_B \to V_B \uplus n$ defines hierarchical relations between vertices, roots, and sites. The function $link_B : X \uplus P_B \to E_B \uplus Y$, where $P_B$ is the set of ports, defines linking between vertices and hyperedges in the link graph;
$\langle m, X \rangle$ and $\langle n, Y \rangle$ denote the inner face and the outer face of the bigraph $B$. By $m$ we will denote a set of preceding ordinals of the form $\{0, 1, \ldots, m-1\}$. Sets $X$ and $Y$ represent inner and outer names, respectively.
A graphical example of a bigraph is presented in
Figure 1.
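For readers more comfortable with code than with the formal notation, the following minimal Python sketch mirrors the anatomy of a bigraph described above (vertices, controls, the parent map of the place graph, and the link map of the link graph). The field names and the encoding are our own illustrative choices, not the representation used by BigraphER or by the libraries we provide.

```python
# A minimal data-structure sketch of a bigraph; names and encoding are illustrative.
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple, Union

Place = Union[int, str]              # a vertex id (int) or a root/site name (str)
Point = Union[Tuple[int, int], str]  # a port (vertex id, port index) or an inner name
Link = str                           # a hyperedge identifier or an outer name

@dataclass
class Bigraph:
    vertices: Set[int]               # V - vertex identifiers
    hyperedges: Set[str]             # E - hyperedge identifiers
    ctrl: Dict[int, str]             # ctrl: V -> K, control type of each vertex
    prnt: Dict[Place, Place]         # place graph: sites and vertices -> vertices and roots
    link: Dict[Point, Link]          # link graph: inner names and ports -> hyperedges and outer names
    inner_names: Set[str] = field(default_factory=set)  # X
    outer_names: Set[str] = field(default_factory=set)  # Y

# Two UAVs (control "U") placed inside one area of control "A" under root "r0".
b = Bigraph(
    vertices={0, 1, 2},
    hyperedges=set(),
    ctrl={0: "A", 1: "U", 2: "U"},
    prnt={1: 0, 2: 0, 0: "r0"},
    link={},
)
print(sorted(v for v in b.vertices if b.ctrl[v] == "U"))  # [1, 2]
```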
Reaction rules are used to model dynamics in bigraphical systems. In this paper, we will use simplified tracking reaction rules. We call them simplified because only vertices will be tracked between reactions, as opposed to the original bigraphs with tracking proposed by Milner [26], where both vertices and hyperedges are tracked between reactions. Informally, a reaction rule defines a pattern (redex) in a source bigraph that shall be replaced with another bigraph (reactum). We will omit how patterns are found in bigraphs and how the replacement is done.
Formally, a tracking reaction rule is a quadruple $r = (R, R', \eta, \tau)$, where
$R$—the redex (a bigraph-pattern to be found in the bigraph to which the rule is applied);
$R'$—the reactum (a bigraph replacing the redex);
$\eta$—a map of sites from the reactum to the redex;
$\tau : |R'| \rightharpoonup |R|$—a partial map of the reactum support onto the redex support. It allows us to indicate which elements of an output bigraph are "residues" of a source bigraph.
An example of a reaction rule and its application is presented in Figure 2. The $\tau$ function denotes the residue of a source bigraph in an output bigraph.
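The sketch below shows one way to hold the four components of a tracking reaction rule in a simple Python structure. The redex and reactum are kept as opaque placeholder strings, and the site map and support map are plain dictionaries; the rule itself and its term notation are illustrative assumptions, not the notation of any existing tool.

```python
# A sketch of a tracking reaction rule (R, R', eta, tau); contents are illustrative.
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class TrackingRule:
    redex: str                      # R  - the pattern to be matched (placeholder string)
    reactum: str                    # R' - the replacement bigraph (placeholder string)
    eta: Dict[int, int]             # sites of the reactum -> sites of the redex
    tau: Dict[int, Optional[int]]   # reactum support -> redex support (None = freshly created)

    def residue_of(self, reactum_vertex: int) -> Optional[int]:
        """Return the redex vertex a reactum vertex is a residue of, if any."""
        return self.tau.get(reactum_vertex)

# Hypothetical rule "r1": one UAV (vertex 1) moves from an area of type A to an area of type B.
r1 = TrackingRule(
    redex="A.(U_1 | d0) || B.d1",
    reactum="A.d0 || B.(U_1 | d1)",
    eta={0: 0, 1: 1},   # both sites are preserved in place
    tau={1: 1},         # the UAV in the reactum is the residue of the UAV in the redex
)
print(r1.residue_of(1))  # 1
```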
Having defined bigraphical reaction rules, we can proceed to the definition of a Bigraphical Reactive System (BRS). A BRS is a pair $(\mathcal{A}, \mathcal{R})$, where $\mathcal{A}$ denotes a set of bigraphs with an empty inner face and $\mathcal{R}$ is a set of reaction rules defined over $\mathcal{A}$. If $\mathcal{R}$ consists of rules with tracking, then the pair $(\mathcal{A}, \mathcal{R})$ makes a Tracking Bigraphical Reactive System (TBRS).
Having a BRS, we can generate a Transition System. A Transition System is a quadruple $(Agt, Lab, Apl, Tra)$, where
$Agt$—a set of agents (i.e., bigraphs with an empty inner face);
$Lab$—a set of labels;
$Apl \subseteq Agt \times Lab$—an applicability relation;
$Tra \subseteq Apl \times Agt$—a transition relation.
For the purposes of this work, we will define a Tracking Transition System (TTS) as a tuple $(Agt, Lab, Apl, Par, Res, Tra)$. The first three elements have the same definitions as described above; the rest are defined as follows.
$Par$—a participation function. It indicates which elements of a source bigraph participate in a transition. To avoid ambiguity, the $Par$ function should return an injective mapping between the redex support of the reaction rule corresponding to the transition's label and the source bigraph of the transition. We have omitted this in the definitions for the sake of simplicity, but the implementation provided in [38] includes this in the output. The definition of the $Par$ function provided in this paper allows us to indicate who participates in a transition but does not indicate what role a participant takes.
$Res$—a residue function. It maps vertices of an output bigraph that are residues of a source bigraph to the corresponding vertices of the source bigraph;
$Tra$—a transition relation.
A Tracking Bigraphical Reactive System can be transformed into a Tracking Transition System.
A micro-example of a Tracking Transition System is presented in Table 1. Each row describes a single transition in the system. The initial state of the system is presented in the first row, in the first Agt column. The scenario that this TTS models is as follows. Two UAVs, denoted as nodes with controls of type U, are trying to move from an area of type A to an area of type B. They can do it in two ways. The first method, defined by reaction rule r1, allows each UAV to move separately. The second method, denoted by reaction rule r2, allows both UAVs to move in a cooperative manner. One can think of these reaction rules as different algorithms enabling various capabilities of the UAVs. We do not provide a graphical representation of the reaction rules for this example.
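Since Table 1 is not reproduced here, the snippet below gives a rough, purely illustrative encoding of the same scenario as plain Python records: each transition carries its label together with participation and residue information. Vertex identifiers are kept stable across reactions for readability, which is a simplification of how residues behave in the actual TTS.

```python
# Rough encoding of the two-UAV scenario as transition records; illustrative only.
from typing import Dict, FrozenSet, List, NamedTuple

class Transition(NamedTuple):
    source: str                    # symbolic name of the source state (agent bigraph)
    label: str                     # reaction rule applied (r1 or r2)
    participants: FrozenSet[int]   # Par: vertices of the source bigraph taking part
    residue: Dict[int, int]        # Res: output-bigraph vertex -> source-bigraph vertex
    target: str                    # symbolic name of the output state

tts: List[Transition] = [
    Transition("both_in_A", "r1", frozenset({1}), {1: 1, 2: 2}, "one_in_B"),
    Transition("both_in_A", "r1", frozenset({2}), {1: 1, 2: 2}, "one_in_B"),
    Transition("both_in_A", "r2", frozenset({1, 2}), {1: 1, 2: 2}, "both_in_B"),
    Transition("one_in_B", "r1", frozenset({2}), {1: 1, 2: 2}, "both_in_B"),
]

# Transitions enabled in the initial state:
print([t.label for t in tts if t.source == "both_in_A"])  # ['r1', 'r1', 'r2']
```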
We have prepared a software library for generating Tracking Transition Systems available here [
38].
2.2. State Space
Having a Tracking Transition System, we can transform it into a UAV swarm mission state space. A state space can later be used to generate behavior for those elements of the swarm that we can control or have an influence on. Such elements will be called agents.
We have made the following assumptions regarding the modeled systems.
The number of agents is constant during the whole mission.
A system cannot change its state without an explicit action of an agent (alone or in cooperation with other agents).
No action performed by an agent is subject to uncertainty.
A swarm mission can end for each agent separately, at a different moment. In other words, agents do not have to finish their parts of the mission all at the same time.
In the case of cooperative actions (actions performed by multiple agents), all participants are required to start the cooperation at the same moment.
A state space $S$ for a system consisting of $k$ agents and $n$ states is defined by the following elements:
$V$—a set of vertices in the state space. It corresponds to the bigraphs in the Tracking Transition System;
$E$—a multiset of ordered pairs of vertices, called the set of directed edges;
$L$—a set of labels of changes in the system. It will usually consist of the reaction rule names from the Tracking Transition System the state space originates from. To determine which changes, and in what order, have led to a specific state, we will additionally introduce a set of edge identifiers;
$I$—a set of possible state-at-time (SAT) configurations. For example, the element $((0, 777), (1, 123)) \in I$ denotes a situation where the agent with id 0 is at the moment 777 while the agent with id 1 is at the moment 123. It is important to emphasize that the configuration $((1, 123), (0, 777))$ has the same time interpretation but a different spatial interpretation. We later show an example with a justification of why we need such a set;
$C$—a set of possible mission courses. 0 denotes the neutral element, i.e., $c + 0 = 0 + c = c$ for every $c \in C$; we do not define the operation $+$ for the remaining elements of the $C$ set;
$T$—a set of functions defining the progress of a mission. We later give an example with a rationale of why we need such a set. The zero function returns 0 regardless of its input. Additionally, we will denote by $T_{i,j}$ the set of all mission progress functions from the $i$-th state to the $j$-th state;
a bijection mapping edges to mission progress functions.
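A small sketch of how these ingredients might be held in code is given below; the container types, the (agent id, time) encoding of SAT configurations, and the one-time-unit duration are our illustrative assumptions, not the representation used in [39].

```python
# Illustrative container for a state space: vertices, directed multi-edges,
# labels, and a mapping from edges to mission progress functions.
from typing import Callable, Dict, List, Optional, Tuple

SAT = Tuple[Tuple[int, int], ...]          # ((agent_id, time), ...) ordered as in the bigraph
Progress = Callable[[SAT], Optional[SAT]]  # returns None in place of the 0 result

class StateSpace:
    def __init__(self, vertices: List[str]) -> None:
        self.vertices = vertices                       # V: one vertex per TTS state
        self.edges: List[Tuple[str, str]] = []         # E: directed edges (a multiset)
        self.labels: Dict[int, str] = {}               # L: reaction-rule name per edge id
        self.progress: Dict[int, Progress] = {}        # edge id -> mission progress function

    def add_edge(self, src: str, dst: str, label: str, fn: Progress) -> int:
        self.edges.append((src, dst))
        edge_id = len(self.edges) - 1
        self.labels[edge_id] = label
        self.progress[edge_id] = fn
        return edge_id

space = StateSpace(["0th", "1st", "2nd"])
# One edge: the UAV listed second moves (rule r1) and its clock advances by 1 unit.
e = space.add_edge("0th", "1st", "r1",
                   lambda sat: (sat[0], (sat[1][0], sat[1][1] + 1)))
print(space.progress[e](((1, 0), (2, 0))))  # ((1, 0), (2, 1))
```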
Below, we present an example demonstrating why we need both the I and T sets.
Let us assume that some TBRS consists of two bigraphs, as in Figure 3b. The reaction rule for this TBRS is presented in Figure 3a; agents in this system are denoted by controls of type B. We then transform the TBRS into a TTS. This TTS consists of two states (associated with the two bigraphs) and two transitions (there are two nodes of type B, and as we can change only one of them, there are two ways to do so). Depending on whether the vertex with id 1 or id 2 (numbering according to the left-hand side of Figure 3b) participates in the reaction, the resulting state-at-time configuration will differ. Let us assume that the SAT configuration for the state associated with the first bigraph is known and that the reaction with label r takes a fixed number of time units. Depending on which vertex participates in the reaction, the SAT configuration for the second state advances the clock of either the first or the second agent and, if the order of the agents in the outcome bigraph has changed, swaps the order of the agents' entries. Because of this, the corresponding mission progress functions take two different forms. The edge identifiers denote which of these ways has led to the state; their names are arbitrary.
A micro-example of the state space based on the TTS from Table 1 is presented in Figure 4, with the mission progress functions defined in Table 2. The key idea behind generating mission progress functions is as follows. For each bigraph B (either the source or the outcome of a transition), we treat the subset of its vertices denoting the identifiers of the agents that we want to determine a behavior policy for as an ordered set. We then check whether the order of the agents in the source bigraph has changed in the outcome bigraph. If it has, then we must reflect this change in the tuple being an element of the I set. In our micro-example, such a change of order is particularly visible in the first two transitions listed in Table 2. Both the source and the outcome bigraphs of these transitions are the same, yet in the first transition the order of the agents (UAVs) has changed, while in the second transition it has not. This is due to the residue function of both transitions. In the first transition, the order of the UAV identifiers (here 1 and 2 in the source bigraph and 1 and 3 in the outcome) is switched; because of that, the order of the agents' entries in the input tuple is swapped in the output of the corresponding mission progress function. In the functions listed in Table 2, the incrementation of the y variable indicates the change of time for the agent with identifier equal to the value of b. In the second transition, the order remains the same in the source and in the outcome of the transition. The second case of all mission progress functions, returning 0, is necessary to properly define a walk in the state space. Its usage will be explained in the next subsection.
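To illustrate, the two Python functions below mimic the two kinds of mission progress functions discussed above: one advances the participating agent's clock and swaps the order of the agents' entries, the other advances the clock while keeping the order; any argument that is not a valid two-agent configuration (in particular the neutral element 0) yields 0. The one-time-unit duration and the exact input shape are assumptions made for this example and do not reproduce the functions of Table 2.

```python
# Two illustrative mission progress functions for a two-agent system.
from typing import Tuple, Union

SAT = Tuple[Tuple[int, int], Tuple[int, int]]   # ((a, x), (b, y)): (agent id, time) per position

def valid(sat: object) -> bool:
    return isinstance(sat, tuple) and len(sat) == 2

def f_swap(sat: Union[SAT, int]) -> Union[SAT, int]:
    """Agent b participates: its clock advances and the agents' order is swapped."""
    if not valid(sat):
        return 0
    (a, x), (b, y) = sat
    return ((b, y + 1), (a, x))

def f_keep(sat: Union[SAT, int]) -> Union[SAT, int]:
    """Agent b participates, but the agents keep their order in the outcome."""
    if not valid(sat):
        return 0
    (a, x), (b, y) = sat
    return ((a, x), (b, y + 1))

print(f_swap(((1, 0), (2, 0))))  # ((2, 1), (1, 0))
print(f_keep(((1, 0), (2, 0))))  # ((1, 0), (2, 1))
print(f_keep(0))                 # 0
```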
We omit here the algorithm for transforming a TTS into a state space, but an exemplary software implementation is available at [39].
2.3. Behavior Policy
We define a behavior policy as a schedule of actions for each agent from the beginning of a mission to its end, without breaks.
Having a state space, we can view a behavior policy as a walk indicating what changes, performed by whom, are required in order to reach a desired state.
Before we demonstrate how to construct a proper behavior policy based on a state space, we first need to define the following elements (recall that by a series we understand a finite sum of elements):
a series whose summands are the mission courses leading to the state s;
a function returning the number of elements in a given series. According to the earlier definition, for any series this function returns the value of m (the greatest index of a summand);
a series whose summands are the mission progress functions from the i-th to the j-th state;
a matrix $W_t$ whose elements are series indicating the possible walks leading to each state. The index t denotes the number of steps made in the state space. By a step we understand a transition between vertices (including the situation where the initial and the final vertex are the same);
a matrix $M$ of transitions between states.
Furthermore, we define two operations:
a convolution $\circledast$ of the series defined above;
a multiplication of the matrices defined above. The elements of the product matrix are defined by the formula $c_{i,j} = \sum_{k} a_{i,k} \circledast b_{k,j}$.
With the elements defined above, we can generate all walks consisting of a specified number of steps from the initial state to a final state. To do so, one must define the initial state as a matrix $W_0$ and multiply the subsequent results by $M$ the specified number of times. The result will be a matrix whose element in the i-th column contains information about all possible walks with t steps that end in the i-th state of the state space. If an element in the specified column is equal to 0, then there is no such walk.
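The runnable sketch below imitates this construction on the two-UAV micro-example. Series are represented as Python lists (the empty list and None play the role of 0), a mission course is a pair of a SAT configuration and the actions taken so far, and matrix multiplication combines list concatenation with a convolution that applies progress functions to courses. The state numbering, the durations, and the convention of listing the UAV that moved second in the configuration are our own assumptions and only loosely mirror Figure 4 and Table 2.

```python
# Simplified walk generation by repeated matrix "multiplication"; illustrative only.
from typing import Callable, List, Optional, Tuple

SAT = Tuple[Tuple[int, int], ...]                       # ((agent_id, time), ...) per position
Course = Tuple[SAT, Tuple[str, ...]]                    # (configuration, actions taken so far)
Progress = Tuple[str, Callable[[SAT], Optional[SAT]]]   # (edge label, progress function)

def convolve(courses: List[Course], funcs: List[Progress]) -> List[Course]:
    """Apply every progress function to every course; None plays the role of 0."""
    out: List[Course] = []
    for sat, actions in courses:
        for label, fn in funcs:
            new_sat = fn(sat)
            if new_sat is not None:
                out.append((new_sat, actions + (label,)))
    return out

def multiply(w: List[List[List[Course]]], m: List[List[List[Progress]]]) -> List[List[List[Course]]]:
    """w is a 1 x n row matrix of series of courses; m is the n x n transition matrix."""
    n = len(m)
    return [[sum((convolve(w[0][k], m[k][j]) for k in range(n)), []) for j in range(n)]]

# States: 0 = both UAVs in A, 1 = one UAV in B, 2 = both in B.
f01_swap = lambda sat: ((sat[1][0], sat[1][1]), (sat[0][0], sat[0][1] + 1))      # first-listed UAV moves
f01_keep = lambda sat: ((sat[0][0], sat[0][1]), (sat[1][0], sat[1][1] + 1))      # second-listed UAV moves
f12      = lambda sat: ((sat[0][0], sat[0][1] + 1), (sat[1][0], sat[1][1]))      # the UAV still in A moves
f02      = lambda sat: ((sat[0][0], sat[0][1] + 2), (sat[1][0], sat[1][1] + 2))  # both move cooperatively

M = [
    [[], [("r1", f01_swap), ("r1", f01_keep)], [("r2", f02)]],
    [[], [],                                   [("r1", f12)]],
    [[], [],                                   []],
]
W = [[[(((1, 0), (2, 0)), ())], [], []]]     # W_0: both agents start the mission at time 0
for step in (1, 2):
    W = multiply(W, M)
    print(step, W[0][2])                     # all walks of exactly `step` steps ending in state 2
```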
Going back to our micro-example, using the state space from Figure 4 with the function definitions listed in Table 2, we can determine all sequences of actions that lead to the state denoted as 2nd. Each sequence is equivalent to a behavior policy that, when applied, results in moving both UAVs to the area of type B.
In order to determine such sequences, we create two matrices: the matrix of transitions $M$ and the matrix of the initial state $W_0$. Having both of them, we can multiply the subsequent result matrices by $M$ and check whether the third state (recall that the numbering starts from 0) is reachable. By reachable we understand having a value other than 0 in the specified column of the matrix.
Both matrices are characterized below. The tuple in the first column of the initial-state matrix $W_0$ denotes that we have two agents. They are identified as agent 1 and agent 2, although this numbering is arbitrary and could just as well be 777 and 111. The zeros in both entries indicate that both agents start the mission at the same moment; the remaining columns of $W_0$ are equal to 0, as no other state has been reached yet. The entries of the transition matrix $M$ are the series of mission progress functions from Table 2. Subsequent matrices $W_t$ let us determine how the system may change when a specified number of actions occurs: for example, $W_1$ gives us information on how the system may evolve when one action occurs, $W_2$ when two actions occur, and so on.
We have prepared a software library for generating behavior policies, available here [
40].