Top-Down vs Bottom-Up Methodologies in Multi-Agent System Design

Valentino Crespi¹, Aram Galstyan², Kristina Lerman²
¹ Department of Computer Science, California State University, Los Angeles
² USC Information Sciences Institute
Abstract In the top-down approach, the design process starts with specifying the global system state, assuming that each component has global knowledge of the system, as in a centralized system. In the bottom-up approach, on the other hand, the design starts with specifying requirements and capabilities of the individual components. We examine each approach from the MAS perspective, and identify three elements that we believe should serve as criteria for how and when to apply either of the two methodologies.
1 Introduction

In the top-down methodology, the design starts from the top with the assumption that resources are globally accessible, as in the centralized case. The specification is then defined in terms of the global system state and implies that each individual component should be able to access, possibly with some delay, resources that are local to other agents of the system. Under these conditions, well-known and tested techniques from centralized control can be applied. In the bottom-up methodology, on the other hand, the rules of agent interactions are designed typically in an ad hoc fashion, which can nevertheless simplify the design process for some applications [1]. In systems designed starting from the bottom, the global state of all the components is assumed to be impossible to obtain, and the desired collective behavior is said to emerge from interactions among individual agents and between the agents and the environment. The design starts with a rigorously pre-decided set of rules for the individual behaviors and local interactions and then proceeds with the inference of the global emergent behavior.
The literature on multi-agent systems extends over the most diverse areas in computer science and computer engineering. In this work we apply the two methodologies separately to the same case study. We would like, in both cases, to address questions such as: what are the analytical and design challenges? What are the mental processes that lead a designer to the final solution? We then draw conclusions, grounded in this domain, that emphasize commonalities and differences between the two approaches.
[Figure: the design cycle, linking specifications, modeling, synthesis, analysis, and optimization]
2 Previous Research

Multi-agent and multi-robot systems (MRS) have been studied extensively through experiment and analysis [7,6]. Researchers have begun to formalize the design of different classes of agents. Jones and Mataric [1,8] have proposed a methodology for formally specifying the task domain and automatically synthesizing the controllers. Related work uses formal logic to design verifiable controllers for an MRS; the emphasis in that work is on the formal definition of agent and group behaviors rather than controller synthesis. McNew and Klavins [11] use graph grammars to synthesize controllers: a grammar encodes the rules the robot follows while executing a task, and a task description can be viewed as a set of sentences generated by the grammar. Ott and Lerman [12] used grammar induction to automate the synthesis, as described below for the bottom-up approach.
The engineering of multi-agent systems has been notoriously limited by the lack of a general unified model for performance estimation. The difficulty in this matter concerns both the modeling of the system and the prediction of its performance.
A methodology that combines ideas from Agent Based computing and Classical Control Theory has been proposed [17,18]. This methodology is based on the hybridization of estimation and control, with the ultimate goal of being able to derive reliable performance guarantees for the resulting system.

3.1 Top–Down

In the top-down approach the design process follows three main phases, as outlined in the subsequent paragraphs.
3.1.1 Modeling The purpose of this phase is twofold: a) agents in the system are identified and categorized based on the taxonomy described below; b) their interactions are specified. The taxonomy distinguishes, among others:
– modeling agents that collect data from many information agents and estimate the current state of the world (in the sense of systems theory);
– planning agents that use the current world state estimates and the viable actions to decide new actions to carry out. These agents may need additional information for their planning operation, and so they may task brokering agents to retrieve it.
The synthesis then proceeds as a top-down process. At first, it is assumed that each agent can access remote resources local to other agents instantly and with infinite precision, so an initial centralized solution is derived. Then the constraints of the environment are applied and so the visibility of each agent is gradually reduced. The aim is to determine a minimum level of agent visibility necessary for the system to perform its task. The initial solution rests on the assumption that each autonomous subcomponent of the system has full knowledge of the global state through remote access to all the resources and events that have been acquired or recorded by any other subcomponent of the system. This produces a centralized solution that however runs locally on each subcomponent.
The next stage replaces global resources with local resources compatibly with the constraints of the environment. The difficulty is due to the fact that now agents must acquire information through interactions with other locally reachable components (the notion of locally reachable is context dependent) in order to infer the most likely global state of the system. The result of this stage is an algorithm that runs locally and uses local information: each agent records all the events that occur within its range and exchanges information with reachable components on a peer-to-peer basis.
The analysis of the resulting distributed algorithm may lead to a review (feedback) of the original Modeling of the agent system. The central difficulty is that the global state can no longer be assumed to be collectively known by any other agent in the system. In other words, the problem becomes one of distributed estimation and control, at the intersection of Agent Based computing and Control Theory. This is the main point: being able to apply well-understood tools from estimation and control to the design of decentralized multi-agent systems.
3.2 Bottom–Up

The bottom-up approach tends to produce scalable and adaptable systems often requiring minimal (or no) communication. It has been used to control robotic systems (e.g., [22–24]), embedded systems, sensor networks [25], and information agents [26] among others. Each agent is specified by a set of behaviors and by its interactions with other agents and the environment. The behaviors are typically simple mappings from local observations to actions. In the applications listed above, the agents themselves are rather simple.
3.2.1 Synthesis In the Synthesis phase one has to define the agent controller, i.e., a representation of an agent. In the case of a reactive agent (i.e., one that makes a decision about what action to take based on its current state and input from its sensors), the controller can be represented as a finite state automaton. Consider the foraging task, in which the goal is for robots to collect pucks scattered about an arena and deposit them at a home location. A robot executing this task will have to execute the following behaviors: (i) searching for pucks by wandering around the arena, (ii) puck pickup and (iii) homing or bringing the puck home. The transition from searching to pickup is triggered by detecting a puck, the transition from pickup to homing by the gripper closing around the puck, and transition from homing back to searching by depositing the puck. The controller thus specifies the decisions a robot makes while executing a task: e.g., how observations trigger transitions from one state to another.
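The three-behavior controller just described can be sketched as a small finite state automaton. This is a minimal illustration rather than the implementation used in the cited work; the observation strings (see_puck, gripper_closed, at_home) are hypothetical stand-ins for real sensor events.

```python
# Minimal sketch of the three-behavior foraging controller as a finite state
# automaton; observation strings are hypothetical stand-ins for sensor events.
TRANSITIONS = {
    ("search", "see_puck"): "pickup",        # detecting a puck triggers pickup
    ("pickup", "gripper_closed"): "homing",  # gripper closing triggers homing
    ("homing", "at_home"): "search",         # depositing the puck resumes the search
}

def step(state, observation):
    """Advance the FSA by one observation; unrecognized inputs leave the state unchanged."""
    return TRANSITIONS.get((state, observation), state)

state = "search"
for obs in ["none", "see_puck", "gripper_closed", "at_home"]:
    state = step(state, obs)
print(state)  # "search": back where it started after one full foraging cycle
```

The controller is purely reactive: the next state depends only on the current state and the latest observation, which is exactly the FSA structure the analysis below exploits.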
Given that FSAs are equivalent to regular grammars, Ott and Lerman [12] showed that automatic controller synthesis can be treated as a grammar induction problem.¹ Treating examples of task executions as sentences generated by some grammar, one can learn the grammar, and hence the controller, from those examples. Ott and Lerman showed that the method was able to generate correct grammars for the foraging task.
3.2.2 Modeling and Analysis Once controllers for individual agents have been synthesized, the designer has to model and analyze the resulting collective behavior. In particular, Lerman et al. have developed models based on the Stochastic Master Equation and its first moment, the Rate Equation, to describe the average collective behavior from the details of the agent automaton. The model consists of a set of coupled equations, one for each state of the automaton.
¹ That work treats a more general problem of context-free grammar induction.
It is impractical to model every microscopic detail of the system: sensor noise, interactions with other agents following complex trajectories, etc. One does not need to: for design purposes it is often enough to describe the aggregate, or average, behavior. Such probabilistic models often have very simple, intuitive form, and can be easily written down by examining details of the individual agent controller. Analysis may identify a range of values for internal parameters that result in a valid controller, but not all such values perform equally well. For example, some parameter values may lead to faster convergence to the desired steady state, while others will lead to smaller deviations from it.
When the controllers are Finite State Automata (FSA), analysis can be used not only for estimating the collective performance but also for selecting the transition probabilities for the FSA, so that the desired global behavior will be achieved on average.
4 A Case Study

We now apply both methodologies to the design of the same multi-agent system. Red and green pucks are scattered in an arena; the numbers of each type of puck, R and G, are unknown and can even change in time. We deploy N robots equipped with a red lamp and a green lamp to collect the pucks. Each robot can be foraging for one type of puck at any given time and its lamp signals which type. We assume that robots have a memory buffer of a certain length where they can store their recent observations of pucks and other robots. The goal of the application is to make the proportion of red to green robots the same as the proportion of red to green pucks in the arena. The task is to define individual behaviors that achieve this goal.
4.1 Top–Down

How would one approach this problem using the top-down methodology? We assume that each component, in this case robot, is capable of accessing resources that are local to the other robots. The general idea is to start with this assumption in order to apply well-known and tested methods from classical control theory (i.e., machine learning and optimization). We begin by defining a global potential function to be minimized. This function should contain as constants all the quantities that are known to the robots, i.e., all the observations. The variables instead become the color probabilities c_i of the robots:
    V(c_1, c_2, \ldots, c_n) = \left( \frac{1}{n}\sum_{i=1}^{n} c_i - \frac{\sum_{i=1}^{n} g_i}{\sum_{i=1}^{n} (g_i + r_i)} \right)^{2}    (1)
where n is the total number of robots and g_i (r_i) is the number of green (red) pucks observed by robot i. This function states literally that the average proportion of green robots a priori should be the same as the globally observed proportion of green pucks:
    \frac{1}{n}\sum_{i=1}^{n} c_i = \frac{\sum_{i=1}^{n} g_i}{\sum_{i=1}^{n} (g_i + r_i)}    (2)
Since each c_i represents the probability that robot i should be green, in order to reflect the correct rate of observed green pucks each robot should sample its color according to the estimated probabilities. Values c_i are local to robots and can be updated by gradient descent on V, with convergence guarantees that depend on properties of the gradient (see for example [30]).
If we now limit the communication range of the robotic agents they will no longer have access to global information. As a consequence, each agent will need to approximate its gradient, whose exact computation depends on global quantities, i.e., the vector c. The result is the following distributed algorithm:
2. Each robot i updates its own probability c_i(t) by applying the following rule that depends only on local quantities:

    c_i(t+1) = c_i(t) - \frac{2\gamma_t}{|N(i)|}\left( \frac{\sum_{j \in N(i)} c_j(t)}{|N(i)|} - \frac{\sum_{j \in N(i)} g_j}{\sum_{j \in N(i)} (g_j + r_j)} \right)    (4)

where N(i) = \{ j \mid d(i,j) < \rho \} with \rho being the communication range of the robots and d the Euclidean distance (in the experimental verification we set \gamma_t to 0.1).
3. Each robot i decides its own color by sampling from a Bernoulli distribution with parameter c_i(t).

To complete the analysis it would be necessary to discuss the possible presence of local minima for V (in this special case there are none but in general we may expect a nontrivial issue). Then one should verify that the update rule keeps each c_i(t) within range (0, 1). Finally one should study how the performance degrades as the communication range shrinks. An alternative view of the problem is that the various robots need to estimate, as the time varies, the proportion of green vs red pucks and then sample their own color accordingly.
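As a rough sketch of the update rule in Eq. 4, the loop below applies the local gradient step with γt = 0.1 as in the text. The one-dimensional robot placement and the random observation counts are illustrative assumptions, and the clamping of c_i to [0, 1] is a safeguard related to the range check discussed above rather than part of the rule itself.

```python
import random

# Sketch of the distributed update rule (Eq. 4): each robot nudges its color
# probability c_i toward its neighborhood's observed fraction of green pucks.
# gamma_t = 0.1 and rho = 200 follow the text; the 1-D robot placement and
# the random observation counts are illustrative assumptions.
random.seed(0)

def neighbors(i, positions, rho):
    """N(i): indices of robots within communication range rho (1-D distance here)."""
    return [j for j, x in enumerate(positions) if abs(x - positions[i]) < rho]

def update(c, g, r, positions, rho, gamma_t=0.1):
    c_new = list(c)
    for i in range(len(c)):
        N = neighbors(i, positions, rho)
        avg_c = sum(c[j] for j in N) / len(N)
        frac_g = sum(g[j] for j in N) / sum(g[j] + r[j] for j in N)
        c_new[i] = c[i] - (2 * gamma_t / len(N)) * (avg_c - frac_g)
        c_new[i] = min(1.0, max(0.0, c_new[i]))  # safeguard: keep c_i in [0, 1]
    return c_new

n = 10
positions = [random.uniform(0, 600) for _ in range(n)]
g = [random.randint(1, 8) for _ in range(n)]  # green pucks observed by each robot
r = [random.randint(1, 8) for _ in range(n)]  # red pucks observed by each robot
c = [0.5] * n
for _ in range(200):
    c = update(c, g, r, positions, rho=200)
print([round(ci, 2) for ci in c])  # each c_i tracks its neighborhood's green fraction
```

Note that each robot only ever reads quantities belonging to robots in N(i), which is the defining property of the distributed algorithm.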
This estimation can be carried out using a Least Squares (LS) method. Define p = g/(r + g). Each robot i forms a local estimate p_i = g_i / n_i, where g_i and n_i are respectively the number of green pucks and the total number of pucks it has observed. We model the local estimates as

    p_i = p + w_i    (5)

where w_i are n normal random variables with zero mean and common variance. The noise terms account for what the robot does not know because of lack of experience. We can realistically assume that the pucks are scattered with spatial uniformity and that the robot explores the area without reversing course over the observation window, so that the common variance can be treated as a model parameter.
Limiting the communication range reduces the size of the samples each robot is able to average at any instant of time. Each robot i computes its local estimate p_i = g_i/(g_i + r_i) and then averages the estimates available in its neighborhood:

    \hat{p}_i = \frac{1}{|N(i)|} \sum_{j \in N(i)} p_j    (7)
where the average runs over locally reachable robots.
– Each robot i decides its own color by sampling from a Bernoulli distribution with parameter \hat{p}_i.
Standard results on LS estimation tell us, for example, how the variance of the distributed estimators varies in relation to the communication range and other quantities. This gives us the means to study with precision the time of convergence and the adaptation capabilities of the system (e.g., its reaction to changes in the distribution of pucks).
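A quick numerical illustration of why the communication range matters for this estimator: averaging k noisy local estimates drawn according to Eq. 5 reduces the variance roughly by a factor of k. The values of p, the noise level, and the neighborhood sizes below are arbitrary assumptions.

```python
import random
import statistics

# Noise model of Eq. 5: each local estimate is the true fraction p plus
# zero-mean Gaussian noise. Averaging over a neighborhood of size k (Eq. 7)
# cuts the estimator's variance roughly by a factor of k. The values of p,
# sigma, and the neighborhood sizes are arbitrary illustrative choices.
random.seed(1)
p, sigma, n_trials = 0.8, 0.1, 2000

def estimate(k):
    """Average k noisy local estimates, as a robot with |N(i)| = k would."""
    return sum(p + random.gauss(0, sigma) for _ in range(k)) / k

var_alone = statistics.variance(estimate(1) for _ in range(n_trials))
var_k5 = statistics.variance(estimate(5) for _ in range(n_trials))
print(var_alone > var_k5)  # True: a larger neighborhood means lower variance
```

This is the same variance-reduction effect observed in the simulations when ρ grows.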
4.2 Bottom–Up

While the top-down design cycle started with defining a global potential function, the bottom-up approach starts from the individual robot. Let g_i and r_i be the numbers of green and red pucks respectively in the ith agent's observation window. Our task is to define individual behaviors in terms of g_i and r_i. For the present problem two possible actions are choosing red and choosing green, so each robot can be described by a finite state machine with two states, Red and Green, and transitions between the states with probabilities f_{R→G}(r_i, g_i) and f_{G→R}(r_i, g_i). During its motion each robot observes pucks, other robots, obstacles, etc. However, since we want the model to capture how the fraction of robots of each color evolves, what matters is the number of red and green pucks that the ith robot has encountered during a certain time interval. Generally, these numbers depend on many parameters such as a robot's speed, view angle, local density of pucks, and so on. To keep the model tractable we absorb these details into a single encounter-rate parameter.
Let α be the encounter rate, a parameter that absorbs the details of the robot such as its speed, view angles, etc., and M0 the number of pucks in the arena. Let R(t) and G(t), R(t) + G(t) = M0, be the actual number of red and green pucks at time t. The probability that in the time interval [t − τ, t] the robot has encountered r red and g green pucks is then characterized by the respective distributions, with means λ_R and λ_G. In the case when the puck distribution does not change in time, λ_R = αRτ and λ_G = αGτ.
Having specified the individual robot model, the next step is to derive the resulting global behavior. Recall that the goal is to reach fractions of Red and Green robots that reflect the distribution of red and green pucks. We start by describing the global state. Let N_r(t) and N_g(t) be the average (or expected) numbers of Red and Green robots at time t, with N_r(t) + N_g(t) = N. During a small time interval [t, t + ∆t] the i-th robot will change its color with probabilities f_{R→G}(r_i, g_i)∆t and f_{G→R}(r_i, g_i)∆t. Recall that P(r, g) is the probability that a robot has observed r red and g green pucks in the observation window [t − τ, t]. Since robots are interchangeable, this is also the probability that a randomly chosen robot (of either color) has observed r Red and g Green pucks. Thus, the expected change in the number of Red robots is
    N_r(t + \Delta t) - N_r(t) = -N_r(t) \sum_{r,g=0}^{\infty} f_{R\to G}(r,g)\, P(r,g)\, \Delta t + (N - N_r(t)) \sum_{r,g=0}^{\infty} f_{G\to R}(r,g)\, P(r,g)\, \Delta t    (9)
The first (second) term in Eq. 9 describes the expected number of Red (Green) robots that switch color during the interval. Dividing by N∆t and taking the limit ∆t → 0 yields a rate equation for the fraction of red robots n_r(t) = N_r(t)/N:
    \frac{dn_r(t)}{dt} = -n_r(t) \sum_{r,g=0}^{\infty} f_{R\to G}(r,g)\, P(r,g) + (1 - n_r(t)) \sum_{r,g=0}^{\infty} f_{G\to R}(r,g)\, P(r,g)    (10)
We now choose the transition rates so as to obtain, in the steady state, fractions of robots of either color that reflect the puck distribution:
    f_{G\to R}(r,g) = \varepsilon\, \frac{r}{r+g} \equiv \varepsilon\, \gamma(r,g)    (11)

    f_{R\to G}(r,g) = \varepsilon\, \frac{g}{r+g} \equiv \varepsilon\, (1 - \gamma(r,g))    (12)
Substituting Eqs. 11 and 12 into Eq. 10 yields

    \frac{dn_r}{dt} = \varepsilon\, \gamma\, (1 - n_r) - \varepsilon\, (1 - \gamma)\, n_r    (13)
where γ is given by

    \gamma = \sum_{r,g=0}^{\infty} \frac{r}{r+g}\, P(r,g)    (14)
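Eq. 14 can be checked numerically. The sketch below estimates γ by Monte Carlo under the assumption (not stated explicitly in the text) that the observed counts are Poisson with means λR and λG; with many expected observations γ approaches the true red fraction R/M0.

```python
import math
import random

# Monte Carlo estimate of gamma (Eq. 14): the average of r/(r+g) over the
# observation distribution P(r, g). Poisson-distributed counts with means
# lambda_R and lambda_G are an assumption made here for illustration.
random.seed(2)

def poisson(lam):
    """Sample a Poisson variate via Knuth's product-of-uniforms method."""
    limit, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= limit:
            return k
        k += 1

def gamma_estimate(lam_r, lam_g, samples=20000):
    """Monte Carlo average of r/(r+g) over the observation distribution."""
    total, kept = 0.0, 0
    for _ in range(samples):
        r, g = poisson(lam_r), poisson(lam_g)
        if r + g > 0:  # r/(r+g) is undefined for an empty observation window
            total += r / (r + g)
            kept += 1
    return total / kept

# With many expected observations, gamma approaches the true red fraction 0.2.
print(round(gamma_estimate(lam_r=4.0, lam_g=16.0), 2))  # 0.2
```

The small-λ regime, where empty observation windows become common, is exactly where the analysis below finds that γ drifts toward 1/2.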
The solution of Eq. 13 for the initial condition n_r(0) = n_0 is readily obtained:

    n_r(t) = n_0\, e^{-\varepsilon t} + \varepsilon \int_0^t dt'\, \bar{\gamma}(t - t')\, e^{-\varepsilon t'}    (15)
To proceed further, we need to calculate \bar{\gamma}(t) (i.e., the average of γ over the puck distribution at time t):
    \bar{\gamma}(t) = \frac{1}{\tau} \int_{t-\tau}^{t} dt'\, \mu_r(t') + e^{-\alpha \tau M_0} \left( \frac{1}{2} - \frac{1}{\tau} \int_{t-\tau}^{t} dt'\, \mu_r(t') \right)    (16)
where µ_r(t) = R(t)/M_0 is the fraction of red pucks. Eqs. 15 and 16 fully determine the evolution of the fraction of red robots.
To analyze its properties, let us first consider the case when the puck distribution does not change with time, µ_r(t) = µ_0. Then \bar{\gamma} is a constant, γ = µ_0 + e^{-ατM_0}(1/2 − µ_0), and Eq. 15 gives n_r(t) = γ + (n_0 − γ)e^{-εt}. Hence, the fraction of red robots approaches its steady state value n_r^s = γ exponentially. Note that for large enough ατM_0 the second term in the expression for γ can be neglected, so that the steady state attains the desired value n_r^s ≈ µ_0. For very small ατM_0 (e.g., a sparse puck distribution or a short history window), however, the desired steady state is not reached, and in the limit of very small ατM_0 it attains the value 1/2 regardless of the actual puck distribution.
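A small numerical check of this behavior: integrating Eq. 13 with a constant γ exhibits the exponential approach to the steady state n_r^s = γ. The values of ε and γ are illustrative assumptions.

```python
# Euler integration of the rate equation (Eq. 13) for a static puck
# distribution (constant gamma). eps = 1.0 and gamma = 0.2 are illustrative
# assumptions; all robots start Red, as in the simulations described below.
eps, gamma, dt = 1.0, 0.2, 0.01
n_r = 1.0
for _ in range(int(10 / dt)):  # integrate out to t = 10
    n_r += dt * (eps * gamma * (1 - n_r) - eps * (1 - gamma) * n_r)
print(round(n_r, 3))  # 0.2: n_r converges exponentially to n_r^s = gamma
```

Since Eq. 13 reduces to dn_r/dt = ε(γ − n_r), the relaxation time is 1/ε regardless of the initial condition.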
We ran simulations of both algorithms described above. There are 20 red and 80 green pucks scattered in an arena of size 600. Initially, all the robots are at state Red. Both algorithms converge to the correct value after some transient. Starting from an initial fraction of red robots n_r(t = 0) = 1, in the top-down case one can clearly see the effect of the communication range ρ: a sufficiently large ρ produces very similar results to the bottom-up algorithm. In fact, the two update rules have the same structure.

Fig. 2 Convergence to correct puck distribution for (a) top-down and (b) bottom-up approaches. Two curves for the top-down approach are for ρ = 0 (no communication) and ρ = 200, in a 100 × 100 square grid of side length 600.
Indeed, a discretization of Eq. 13 gives

    n_r(t + 1) = (1 - \delta_t \varepsilon)\, n_r(t) + \delta_t \varepsilon\, \gamma

which has the same form as (4) with γ_t ∼ δ_t ε. This explains well the experimental similarity between the two approaches. We also simulated the top-down approach based on the LS estimation (see Fig. 3). Here we can observe that the value of ρ essentially affects the variance of the estimator. The intuition here is that as ρ increases, the number of samples that concur to the calculation of the mean also increases, thus reducing the variance accordingly.
Fig. 3 Top-down LS approach: the two curves correspond to ρ = 0 and ρ = 200, plotted over time.
5 Conclusion

In the top-down approach each agent is assumed to be able to expose its local resources to the rest of the system. This requirement is very important for the top-down approach, while not so imperative for the bottom-up. Note also that the conditions under which the agents operate affect their ability to generalize their local experience. So, under certain conditions both approaches not only become viable but may even produce the same solution. An example is represented by the puck allocation case study developed above. The following points summarize our comparison:
– The top-down approach starts from the global requirements of the control system and translates those into necessary agent capabilities. Note that the last step assumes implicitly that the global system requirements can be delegated to individual components. For some tasks this might not be straightforward.
– Communication is present in both of the approaches but its impact is completely different if not opposite. In the top-down approach it is a primary design resource, subject to constraints such as bandwidth and latency, while in the bottom-up approach the influence of communication within the system on the emergent behavior is more like a positive side effect of the local interactions. It is then necessary to analyze the parameter space that defines a feasible solution.
– The probabilistic models used in the bottom-up approach are good at describing the average system behavior but as a rule they do not allow for worst-case analysis.
These elements should serve as criteria for choosing the methodology for a given multi-agent system problem and resources for its solution.
ACKNOWLEDGMENTS

This work was carried out at the Department of Computer Science at the California State University, Los Angeles, and at the Information Sciences Institute of the University of Southern California.
References
1. Chris V. Jones and Maja J. Matarić. From local to global behavior in intelligent self-assembly. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA'03), Taipei, Taiwan, pages 721–726, 2003.
and P. Maes, editors, Proceedings of Artificial Life IV, pages 28–39, 1994.
In CCOGraphGrammars, 2005.
12. M. Ott and K. Lerman. Using grammar induction to synthesize robot controllers.
macroscopic models for swarm robotic systems. In E. Sahin and W. Spears, editors, Swarm Robotics Workshop.
and Intelligent Vehicles and Road Systems. Darpa Task Program white paper.
20. Valentino Crespi, George Cybenko, Daniela Rus, and Massimo Santini. Decen-
May 2002.
21. Valentino Crespi and George Cybenko. Decentralized Algorithms for Sensor
22. C. Kube and H. Zhang. The use of perceptual cues in multi-robot box-pushing.
26. Hans Chalupsky et al. Electric elves: Applying agent technology to support human organizations, 2001.
In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS-2003), Las Vegas, NV, pages 1951–1956, Oct 2003.
29. Chris V. Jones and Maja J. Matarić. Adaptive task allocation in large-scale multi-robot systems. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'03), Las Vegas, NV, pages 1969–1974, Oct 2003.
30. D. P. Bertsekas and J. N. Tsitsiklis. Gradient convergence in gradient methods with errors. SIAM Journal on Optimization, 2000.