1 s2.0 S0167739X15003362 Main

Download as pdf or txt
Download as pdf or txt
You are on page 1of 18

Future Generation Computer Systems 65 (2016) 122–139

Contents lists available at ScienceDirect

Future Generation Computer Systems


journal homepage: www.elsevier.com/locate/fgcs

CEPSim: Modelling and simulation of Complex Event Processing


systems in cloud environments
Wilson A. Higashino a,b,∗ , Miriam A.M. Capretz a , Luiz F. Bittencourt b
a
Department of Electrical and Computer Engineering, Western University, London, ON Canada N6A 5B9
b
Instituto de Computação, Universidade Estadual de Campinas, Campinas, SP, Brazil

highlights
• CEPSim, a simulator for cloud-based Complex Event Processing systems, is proposed.
• CEPSim query model is based on Directed Acyclic Graphs.
• CEPSim simulation algorithm is based on a novel abstraction called event sets.
• Custom operator placement and scheduling algorithms can be used in simulations.
• Experimental results showed that CEPSim is effective for Big Data simulations.

article info abstract


Article history: The emergence of Big Data has had profound impacts on how data are stored and processed. As
Received 26 June 2015 technologies created to process continuous streams of data with low latency, Complex Event Processing
Received in revised form (CEP) and Stream Processing (SP) have often been related to the Big Data velocity dimension and used in
25 September 2015
this context. Many modern CEP and SP systems leverage cloud environments to provide the low latency
Accepted 30 October 2015
Available online 14 November 2015
and scalability required by Big Data applications, yet validating these systems at the required scale is a
research problem per se. Cloud computing simulators have been used as a tool to facilitate reproducible
Keywords:
and repeatable experiments in clouds. Nevertheless, existing simulators are mostly based on simple
Complex event processing application and simulation models that are not appropriate for CEP or for SP. This article presents CEPSim,
Cloud computing a simulator for CEP and SP systems in cloud environments. CEPSim proposes a query model based on
Simulation Directed Acyclic Graphs (DAGs) and introduces a simulation algorithm based on a novel abstraction called
Stream processing event sets. CEPSim is highly customizable and can be used to analyse the performance and scalability
Big Data of user-defined queries and to evaluate the effects of various query processing strategies. Experimental
results show that CEPSim can simulate existing systems in large Big Data scenarios with accuracy and
precision.
© 2015 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction knowledge from these Big Data can bring competitive advantage
to organizations using them. Therefore, these organizations have
The emergence of Big Data has been profoundly changing the been actively pursuing alongside the research community new
way enterprises and organizations store and process data. Clearly, ways of leveraging Big Data to improve their businesses.
the sheer amount of data created by mobile devices, the Internet of According to the most commonly accepted definition, Big Data
Things (IoT) [1], and a myriad of other sources cannot be handled is characterized by the 4 Vs [3]: volume, velocity, variety, and
by traditional data processing approaches [2]. Simultaneously, veracity. The velocity dimension refers both to how fast data are
there is also a consensus that obtaining insights and generating generated and how fast they need to be processed. As technologies
created to process continuous streams of data with low latency,
Complex Event Processing (CEP) and Stream Processing (SP) have
∗ Corresponding author at: Department of Electrical and Computer Engineering, often been related to the velocity dimension and applied in the
Western University, London, ON Canada N6A 5B9. Big Data context. From the business perspective, the goal of these
E-mail addresses: [email protected] (W.A. Higashino), [email protected] technologies is to process fast input streams, obtain real-time
(M.A.M. Capretz), [email protected] (L.F. Bittencourt). insights, and enable prompt reaction to them [4].
http://dx.doi.org/10.1016/j.future.2015.10.023
0167-739X/© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.
0/).
W.A. Higashino et al. / Future Generation Computer Systems 65 (2016) 122–139 123

The resurgence of interest in CEP and SP systems has been Because of low-latency processing constraints, CEP and SP are often
accompanied by the use of cloud environments as their runtime used in Big Data scenarios in which velocity is the most prominent
platform. Clouds are leveraged to provide the low latency and dimension.
scalability needed by modern applications [5–7]. Other systems This work uses a terminology based on the Event Processing
explore cloud computing to facilitate offering CEP functionalities in Technical Society (EPTS) glossary [20], which originated from the
the services model [8]. In this context, the development of efficient CEP literature. This terminology has been chosen because its terms
operator placement and scheduling strategies are essential to are broadly defined and encompass most of the SP concepts.
achieve the required quality of service. However, validating these Hereafter the CEP term is used as a superset of SP, as defined in
strategies at the required Big Data scale in a cloud environment is the EPTS glossary.
a hard problem and constitutes a research problem per se.
First, cloud environments are subject to variations that make 2.2. CEP query languages
it difficult to reproduce the environment and conditions of an
experiment [9]. Moreover, setting up and maintaining large cloud In CEP systems, users create queries (or rules) that specify
environments are laborious, error-prone, and may be associated how to process input event streams and derive ‘‘complex events’’.
with a high financial cost. Finally, there are also many challenges These queries have usually been defined by means of proprietary
related to generating and storing the volume of data required by languages such as Aurora Stream Query Algebra (SQuAl) [17] and
Big Data experiments. CQL [21].
Simulators have been used in many different fields to over- Despite standardization efforts [22], a variety of languages
come the difficulty of executing repeatable and reproducible ex- are still in use today. Cugola and Margara [19] classify existing
periments. Early research into distributed systems [10] and grid languages into three groups:
computing [11] used simulators, as well as the more recent field of
• Declarative: the expected results of the computation are de-
cloud computing [12–14]. Generally, cloud computing simulators
clared, often using a language similar to SQL. The Continuous
make it possible to model cloud environments and to simulate dif-
Query Language (CQL) [21] is the most prominent representa-
ferent workloads running on them. Nonetheless, these simulators
tive of this category.
are mostly based on application models and simulation algorithms
• Imperative: the computations to be performed are directly
that cannot represent properly the dynamics of CEP or SP systems.
specified using operators that transform event streams. The
To overcome these limitations, this paper presents CEPSim, a flexi-
Aurora Stream Query Algebra (SQuAl) [17] inspired most
ble simulator of cloud-based CEP and SP systems.
languages in this category.
CEPSim extends CloudSim [12] using a query model based
• Pattern-based: languages are used to define patterns of events
on Directed Acyclic Graphs (DAGs) and introduces a simulation
using logical operators, causality relationships, and time con-
algorithm based on a novel abstraction called event sets. CEPSim
straints. The Rapide [10] language is an example of this category.
can be used to model different types of clouds, including public,
private, hybrid, and multi-cloud environments, and to simulate This research uses Directed Acyclic Graphs (DAG) as a language-
execution of user-defined queries on them. In addition, it can also agnostic representation of CEP queries. More details about the
be customized with various operator placement and scheduling chosen approach are discussed in Section 3.
strategies. These features enable architects and researchers to
analyse the scalability and performance of cloud-based CEP and 2.3. Big Data and CEP in the cloud
SP systems and to compare easily the effects of different query
processing strategies. The recent emergence of cloud computing has been strongly
This article significantly extends the authors’ previous work [15] shaping the Big Data landscape. Many authors have recognized
by improving the discussion about CEPSim’s goals and assump- the symbiotic relationship between these areas [23–25], as cloud
tions, by introducing the event set concept, by presenting detailed computing environments can be used to store and process Big
descriptions of all simulation algorithms and a thorough evaluation Data and also to enable new models for data services. For instance,
of CEPSim. New experiments include comparing CEPSim with a real Chang and Wills [26] used a cloud platform to store big biomedical
SP system in multiple scenarios, assessment of its performance, data, whereas Grolinger et al. [27] proposed a platform for
and a detailed analysis of the available simulation parameters. knowledge generation and access using cloud technologies.
The article is structured as follows: Section 2 presents Current CEP research has been strongly influenced by cloud
background information and related work. Section 3 discusses computing too. For instance, TimeStream [5], StreamCloud [6], and
the main design principles and assumptions of CEPSim, whereas StreamHub [7] are CEP systems that use cloud infrastructures as
Sections 4 and 5 detail fundamental concepts and the simulation their runtime environments.
algorithms. The experimental evaluation is presented in Section 6, Similarly, the discussion around Big Data, and the rise of the
and Section 7 is devoted to the final remarks. MapReduce platform [28], have also had a great impact on CEP.
The prevalence and success of MapReduce has motivated many
2. Related work researchers to work on systems that leverage its advantages while
at the same time try to overcome its limitations when used for
2.1. Complex event processing and stream processing low-latency processing. StreamMapReduce [29] and M3 [30] are
examples of MapReduce-inspired systems intended for stream
The basis of Complex Event Processing (CEP) was established processing. Other frameworks, such as Twitter’s Storm [31] and
by the work of Luckham on Rapide [10], a distributed system Yahoo’s S4 [32], propose a more radical departure from the
simulator. Later on, the concepts were generalized and applied MapReduce programming model, but maintain runtime platforms
to the enterprise context in another study by Luckham [16]. At inspired by MapReduce implementations.
about the same time, the database community developed the first
classical Stream Processing (SP) systems such as Aurora [17] and 2.4. Simulator
STREAM [18]. CEP and SP technologies share related goals, as both
are concerned with processing continuous data flows coming from Simulators are a popular tool that has been used in Grid
distributed sources to obtain timely responses to queries [19]. Computing research [11,33] for many years. More recently, the
124 W.A. Higashino et al. / Future Generation Computer Systems 65 (2016) 122–139

GreenCloud [13] is a cloud simulator developed as an extension


of the NS-2 network simulator [36]. Therefore, it focuses on packet-
level simulation and energy consumption of network equipment,
but not on modelling of complex applications.
Finally, the iCanCloud simulator [14] is similar to CloudSim, but it
can also parallelize simulations and has a GUI to interact with the
simulator. Its application model, however, is based on low-level
primitives and needs to be significantly customized to represent
CEP applications. The choice of CloudSim over iCanCloud in this
research was motivated by CloudSim’s more mature codebase, the
authors’ previous experience, and the larger number of extensions
Fig. 1. CEPSim overview. available.

usage of simulators in the Cloud Computing field has also 3. CEPSim


become widespread, which motivated the development of a
number of simulators such as CloudSim [12], GreenCloud [13], and CEPSim is a simulator for cloud-based CEP systems that can be
iCanCloud [14]. None of these, however, can effectively model CEP used to study the scalability and performance of CEP queries and to
applications. compare easily the effects of different query processing strategies.
CloudSim [12] is a well-known cloud computing simulator that It has been developed with the following design principles as
can represent various types of clouds, including private, public, goals:
hybrid, and multi-cloud environments. In CloudSim, users define • Generality: it can simulate different cloud-based CEP systems
workloads by creating instances of cloudlets, which are submitted independently of query definition languages and platform
and processed by virtual machines (VMs) deployed in the cloud. specificities;
Among the most interesting CloudSim features is the customiz- • Extensibility: it can be extended with different operator
ability of its resource management policies, such as: placement, operator scheduling, and load shedding strategies;
• Multi-Cloud: it can run simulations that span multiple clouds;
• VM allocation (provisioning): determines how to map a user-
• Reuse: it can reuse capabilities that are present in CloudSim and
requested VM to one of the physical hosts available in a
comparable simulators.
datacentre. Cloud providers normally use strategies that try
to maximize the utilization of their servers without violating Because of its maturity and extensibility, CloudSim was chosen
existing service level agreements (SLA). as the base cloud simulator on top of which CEPSim was built. Fig. 1
• VM scheduling: determines how the VMs deployed on a physical shows an overview of CEPSim and how it is related to CloudSim.
host share the available processing elements (PEs). Currently, CloudSim provides the basic simulation framework and two
CloudSim provides two VM scheduling policies: space-shared main groups of functionalities: datacentres and policies. The
and time-shared. In the former, each VM has exclusive access to former group includes abstractions used to represent the physical
the PEs to which it is allocated, whereas in the latter, VMs share cloud environment, whereas the latter consists of customizable
the host PEs by executing on slices of the available processing strategies that control the dynamic aspects of the datacentre.
time. CEPSim significantly extends these functionalities to enable
• Cloudlet scheduling: determines how the cloudlets running in a simulation of CEP queries. In Fig. 1, these extensions are also
VM share the available VM PEs. Similarly to VM scheduling, both organized into two groups: foundation and simulation. The former
space-shared and time-shared strategies are available. group contains the fundamental CEPSim abstractions and is
detailed in Section 4, whereas the latter implements the CEP
The major drawback of CloudSim to simulate CEP is its simple simulation logic and is described in Section 5.
application model, which is more appropriate for simulation To achieve the generality goal, CEPSim assumes that user queries
of batch jobs. Normally, a cloudlet represents an independent can be transformed into the directed acyclic graph (DAG) format
finite computation with a length defined by a fixed number of described in Section 4.1. This choice of DAGs as a language-agnostic
instructions. Moreover, the cloudlet’s internal state other than its representation of CEP queries is corroborated by many studies in
expected finish time is invisible. CEP queries, on the other hand, the literature. For instance, most CEP systems based on imperative
are continuous computations that run indefinitely or for a specific languages also use DAGs to represent user queries. This is the case
period of time. In addition, tracking queries’ internal state during with Aurora [17], StreamCloud [6], Storm [31], S4 [32], FUGU [37],
simulation is essential to analysing any given CEP system. For and many others.
example, by monitoring the query operators’ queue size, one can Systems using declarative languages, on the other hand, create
determine whether they can keep up with the incoming event rate. execution plans from queries that can often be mapped into
The work discussed in this paper circumvents the limited CloudSim DAGs [5,18]. Even for pattern-based query languages, previous
application model with a new model based on DAGs, as discussed studies [38] have shown that is possible to transform them into
in Section 4.1. DAGs.
Because of its limitations, CloudSim has originated many Once transformed, CEPSim assumes that the queries run
extensions in the literature [9,34,35]. Garg and Buyya [9] created continuously, processing input events that are constantly pushed
NetworkCloudSim, which extends CloudSim with a three-tier into the system. The input streams are expected to be unbounded,
network model and an application model that can represent but the user must specify for how long the simulation should run.
communicating processes. Guérout et al. [34], on the other hand, To simulate distributed (networked) queries, CEPSim’s distribu-
focused on implementing the DVFS model on CloudSim. Finally, tion model assumes that parts of the query DAG are allocated to dif-
Grozev and Buyya [35] presented a model for three-tier Web ferent VMs and that these VMs can communicate with each other
applications and incorporated it into CloudSim. These extensions using a network. In addition, CEPSim assumes that multiple queries
are orthogonal to those presented in this paper because they do may be running simultaneously in the same VM and that they can
not focus on CEP. belong to different users.
W.A. Higashino et al. / Future Generation Computer Systems 65 (2016) 122–139 125

Fig. 2. Query example.

Fig. 3. Windowed operator attributes.


Finally, CEPSim does not execute any form of single-query or
multiple-query optimization because it expects that the submitted if s1 processes 100 events, 50 will be sent to f1 and the other 50 to
queries have already been optimized. Nevertheless, to support f2 because the selectivity values of both (s1 , f1 ) and (s1 , f2 ) are 0.5.
these optimizations, CEPSim allows event sources and operators to Note that a selectivity can also be greater than 1 in the case where
be shared among queries. the operator outputs more than one event based on a single input,
Currently, the main limitation of CEPSim is the fact it only e.g., creating two alarms from a single sensor reading.
supports scenarios in which the number of simulated queries Hereafter, this article uses the dot notation to refer to query,
is fixed and these queries are not reconfigured neither fail at vertex, and edge attributes. For instance, v.id means the id attribute
runtime. However, most often this limitation can be circumvented of a vertex v .
by running and comparing two simulations: one of a scenario
before reconfiguration, and another of a scenario after. 4.1.1. Operators
To represent CEP queries, CEPSim uses two main operator types:
4. CEPSim foundation stateless and windowed.
A stateless operator, or simply an operator, can process incoming
This section presents CEPSim foundation concepts on top events in isolation with no dependency on any state computed
of which the simulation algorithm is implemented. First it is from previous events. For example, an Aurora filter is an operator
discussed the CEPSim query model, which is used to define that routes events to alternative outputs based on attribute
the simulated queries. Then the event set and event set queue values [17]. This operator is represented in CEPSim by a stateless
abstractions are described. operator vertex op connected to n neighbours opn , and each edge
(op, opn ) has a selectivity that determines the percentage of all
4.1. Query model events processed by op that are sent to opn .
A windowed operator, on the other hand, is used to simulate
In CEPSim, each user-defined query q is represented by a operators that process windows of events and combine them in
directed acyclic graph (DAG) Gq = (Vq , Eq ), where each vertex some manner. Typical examples are aggregation operators that
v ∈ Vq represents a query element and the edges (u, v) ∈ Eq count events or calculate the average value of attributes. The
represent event streams flowing from an element u to another behaviour of a windowed operator is determined by three main
element v . Fig. 2 shows an example of a query q. attributes: a w indow size, an adv ance duration, and a combination
CEPSim overcomes CloudSim batch application model limita- function.
tions by using this representation. DAGs can represent complex Fig. 3 illustrates the w indow and adv ance concepts. The w indow
data processing queries consisting of multiple interconnected specifies the period of time from which the events are taken,
steps through which the data flow. In addition, as mentioned in and the adv ance duration defines how the window slides when
Section 3, most existing query languages can be transformed to the previous window closes. Finally, the combination function is
DAGs, which emphasizes the generic aspect of this representation. defined as:
Vertices from a query q are further classified into event
p
producers, event consumers, and operators. The set Vq ⊂ Vq of event f : Rm
≥0 → R ≥0 (1)
producers (event sources) contains all vertices vp ∈ Vq that do not where m is the number of operator predecessors. This function
have any incoming edge. These vertices represent the sources of regulates the number of events that are sent to the output given
events processed by the query. Conversely, the set Vqc ⊂ Vq of the number of events accumulated in the input. Commonly, it is
event consumers (event sinks) contains all vertices vc ∈ Vq with defined as a constant function f (⃗
x) = 1, meaning that for each
no outgoing edges. These vertices are used for the sole purpose window, only one event is generated (e.g., for counting events).
of grouping events produced by the query. Finally, the set Vqo ⊂
Vq of operators consists of all vertices that have both incoming
4.1.2. Generator
and outgoing edges. Operators are pieces of computation that
process incoming event streams to produce output streams, and Every event producer p is associated with a generator function
in conjunction they constitute the actual query processing. gp that determines the total number of events produced by p given
Every vertex v ∈ Vq has a unique identifier (id) and an a point in time. Formally, the generator function is defined as any
instructions per event (ipe) attribute, which represents the number monotonically increasing function from the time domain to the set
of CPU instructions needed to process a single event. For event of positive integers:
producers, this attribute estimates the number of instructions gp : R≥0 → N, s.t. x ≤ y then gp (x) ≤ gp (y). (2)
required to take an event from the system input and forward it
to query execution. In other words, it does not include the effort
required to generate the event because event generation does not 4.2. Event sets
usually occur within the CEP system.
Every edge (u, v) ∈ Eq , on the other hand, has an associated An event set is an abstraction that represents a batch of events
selectiv ity attribute which determines how many of the events and is the basic processing unit used by CEPSim. This abstraction
processed by u are actually sent to v . For example, in Fig. 2, the has been created to improve the simulator performance and to
numbers on the edges represent their selectivity values. Therefore, assist in the calculation of simulation metrics. Operators exchange
126 W.A. Higashino et al. / Future Generation Computer Systems 65 (2016) 122–139

event sets instead of individual events, and all system queues and
temporary buffers are composed of event sets.
Formally, an event set e is an instance of an E v entSet class that
contains the following attributes:
• cardinality (cn): number of events in the set. The notation |e| is
also used hereafter as a shortcut for e.cn.
• timestamp (ts): a timestamp associated with the set, which
can be used for various purposes. Most often, it contains the
timestamp at which the set has been created.
• latency (lt): the average of the latencies of the events in the set.
Event latency is defined as the period of time elapsed from event
creation to the moment at which the event is added to the set. Fig. 4. Placement definitions.
• totals (tt): a function that, for each producer vp ∈ Vqp , returns
the number of events that must have been produced by vp 4.3. Event set queues
to originate the events currently in the set. The goal of this
attribute is to track caused by (or is result of ) relationships An event set queue is simply a queue where the elements are
between the events in the set and the produced events. event sets. As with any regular queue, it is possible to enqueue
In addition to these attributes, four operations are also defined and dequeue elements in a first-in, first-out manner. In addition, an
for event sets: sum, extract, select, and update. event set queue has an overload dequeue operation that receives
• Sum: is applied to two event sets e1 and e2 and results in a the number of events to be extracted and returns an event set
new event set er containing all events from both operands. It representing these events.
is defined as: Finally, an event set queue Q also has a cardinality defined as
the sum of the cardinalities of all event sets in the queue:
er = e1 + e2 (3a) 
such that |Q | = |e|. (7)
e∈Q
|er | = |e1 | + |e2 | (3b)
|e1 | · e1 .ts + |e2 | · e2 .ts
er .ts = , (3c) 5. CEPSim simulation
|e1 | + |e2 |
|e1 | · e1 .lt + |e2 | · e2 .lt This section presents the CEP simulation logic implemented
er .lt = , (3d)
|e1 | + |e2 | by CEPSim. First it is discussed the role of operator placement
er .tt : Vqp → R≥0 , s.t. and scheduling strategies in the simulation. Then the simulation
procedures are presented both at operator and at placement level.
er .tt (vp ) = e1 .tt (vp ) + e2 .tt (vp ). (3e) Finally, it is described how CEPSim implements metric calculation.
• Extract: is applied to an event set e and the number of events to
be extracted n. The results are an event set er consisting of the 5.1. Operator placement
extracted events, and an event set em containing the remaining
events from e: Once the queries are modelled, the next step in any simulation
(er , em ) = e − n (4a) is to define a set of placements. Each placement maps a set of
query vertices to the VM where they will execute. Note that the
such that vertices from a single query can be mapped to more than one VM,
|er | = n (4b) which implies distributed query execution. A placement can also
contain vertices from more than one query, indicating that the VM
er .tt : Vqp → R≥0 , s.t.
is shared among queries. Fig. 4 illustrates the placement concept:
er .tt (vp ) = (n/|e|) · e.tt (vp ) (4c) Placement1 maps all vertices from Query1 and some from Query2 to
|em | = |e| − n (4d) Vm1 , whereas Placement2 maps the remaining Query2 vertices to
em .tt : Vqp → R≥0 , s.t. (4e) Vm2 .
Defining placements for a set of queries is an instance of
em .tt (vp ) = e.tt (vp ) − er .tt (vp ) (4f) the operator placement problem, as defined by Lakshmanan
and the latency and timestamp attributes from er and em are the et al. [39]. This mapping is one of the most determining factors
same as in e. of a CEP system performance. Because of this importance, CEPSim
• Select: is applied to an event set e and a selectivity s. It selects a is pluggable and enables the use of different placement strategies.
subset of events from the event set: By default, users must manually specify the mapping of vertices to
VMs when submitting a query to CEPSim.
er = e ∗ s (5a)
such that 5.2. Operator scheduling
|er | = |e| · s. (5b)
Operator scheduling is the procedure that, given a set of running
• Update: is applied to an event set e and a timestamp ts. It simply
queries and their internal state, defines which operator should
brings the event set latency and timestamp up to date:
run next and for how long it should run. A scheduling strategy
er = update(e, ts) (6a) can fundamentally determine the performance of a CEP system
such that by optimizing for different aspects of the system, such as overall
QoS [17] or memory consumption [40]. Because of this significance,
er .ts = ts (6b) CEPSim also allows different scheduling strategies to be plugged in
er .lt = e.lt + (ts − e.ts). (6c) and used during a simulation.
W.A. Higashino et al. / Future Generation Computer Systems 65 (2016) 122–139 127

CEPSim contains two built-in scheduling strategies, and both Algorithm 1 Operator simulation
are based on an auxiliary allocation strategy. In this context, the Require: Operator op, with attributes:
allocation strategy divides the available instructions among the ◃ ipe, instructions per second
placement vertices, whereas the scheduling strategy determines ◃ pred, operator predecessors
how the vertices are traversed and how the allocated instructions ◃ succ , operator successors
are used. ◃ input , map of input event set queues
The two allocation strategy implementations provided by ◃ selectiv ity, map of outgoing edge selectivities
CEPSim are: ◃ output , map of output event set queues
• Uniform allocation: divide the available instructions equally
among all placement vertices; 1: function simulate(op, n, ts)
• Weighted allocation: divide the available instructions propor- ◃ op, operator
tionally to the ipe attribute of each vertex. ◃ n, number of instructions
◃ ts, start timestamp
These two strategies can be combined with the provided 2: totin ← 0
scheduling strategies, which work as follows: 3: for all vp ∈ op.pred do
• Simple scheduling: the vertices are sorted in topological order 4: totin ← totin + |op.input (vp )|
and traversed once according to this order. Each vertex receives 5: end for
the number of instructions determined by the allocation 6: ev t ← min(totin , n/op.ipe)
strategy, independently of the number of instructions required. 7: e ← empty event set
• Dynamic scheduling: the vertices are sorted in topological order 8: for all vp ∈ op.pred do
and traversed in one or more rounds. In each round, each vertex 9: no ← (|op.input (vp )|/totin ) ∗ ev t
receives the minimum between the number of instructions 10: e ← e+ Dequeue(op.input (vp ), no)
determined by the allocation strategy and the number of 11: end for
instructions required to process all input events. The process is 12: e ← update(e, ts)
repeated until there are no more instructions left to be allocated 13: for all vs ∈ op.succ do
or events to be processed. This strategy tries to redirect non- 14: en ← e ∗ op.selectiv ity(vs )
used instructions to overloaded vertices and thereby improve 15: Enqueue(op.output (vs ), en)
query throughput. 16: end for
17: end function
5.3. Operator simulation
Algorithm 2 Windowed operator simulation
In CEPSim, the simulation of an operator execution is accom-
plished by reading event sets from the operator’s input queues, Require: Windowed operator w , with operator attributes plus:
processing them, and writing output event sets to its output ◃ window, window size
queues. The general procedure used to simulate an operator ex- ◃ adv ance, advance period
ecution is detailed in Algorithm 1. ◃ f , combination function
The algorithm operates in three main steps: ◃ acc , accumulation data structure
◃ index, current slot in the accumulation data structure
1. Lines 2–6: Calculates the number of input events that can be ◃ next , next timestamp at which a window closes
processed. This number is the minimum between the total
number of events in all input queues and the maximum number 1: function simulate(w , n, ts)
of events that can be processed given the number of allocated ◃ w, windowed operator
instructions n. This maximum is obtained by dividing n by the ◃ n, number of instructions
operator ipe attribute. ◃ ts, start timestamp
2. Lines 7–11: Dequeues events from the input queues and builds a 2: slots ← w.w indow/w.adv ance
new event set e representing the dequeued events. The number 3: while ts > w.next do
of events dequeued from each input queue is proportional to its 4: GenerateOutput(w, index, ts)
size. This procedure aims to balance the queues by processing 5: w.next ← w.next + w.adv ance
more events from queues with more elements. 6: w.index ← (w.index + 1) mod slots
3. Lines 12–16: Enqueues the recently created event set e into the 7: Reset(w.acc , w.index)
operator output queues. While enqueuing, the selectivity value 8: end while
of the edge connecting the operator to each of its successors vs 9: totin ← 0
is taken into consideration. 10: for all vp ∈ w.pred do
Event producers and consumers are simulated in a similar way. 11: totin ← totin + |w.input (vp )|
Because event producers do not have predecessor vertices, the 12: end for
input events are read from the generator associated with them. 13: ev t ← min(totin , n/w.ipe)
Event consumers, on the other hand, do not have output queues. 14: for all vp ∈ w.pred do
The processed events are accumulated into a single output event 15: no ← (|w.input (vp )|/totin ) ∗ ev t
set that consolidates all events consumed during the simulation. 16: e ← dequeue(w.input (vp ), no)
Simulating windowed operators is different because output 17: accumulate(w.acc , w.index, vp , e)
events are generated only when a window closes. In addition, 18: end for
whenever a window does not close, the input events must be 19: end function
correctly processed and accumulated.
Algorithm 2 describes the simulation procedure of a windowed
operator w . To implement the simulation, every windowed the processed events. Fig. 5 shows an example of a windowed
operator has an auxiliary data structure that is used to accumulate operator and its corresponding data structure.
128 W.A. Higashino et al. / Future Generation Computer Systems 65 (2016) 122–139

(a) Operator example. (b) Auxiliary data structure.

Fig. 5. Windowed operator simulation.

Algorithm 3 Windowed operator - generate output


1: function GenerateOutput(w , index, ts)
◃ w, windowed operator
◃ index, current index in w.acc
◃ ts, start timestamp
2: sum ← empty map
3: sumt ← empty event set
4: for all vp ∈ w.pred do
5: e ← empty event set
Fig. 6. Execution of a simulation tick. 6: for i = 0 to slots do
7: e ← e + w.acc (i)(vp )
The data structure works as a circular array divided into l slots, 8: end for
on which each slot represents a timeframe equivalent to one ad- 9: sumt ← sumt + e
vance period within the time window. For example, the windowed 10: sum(vp ) ← e
operator from Fig. 5(a) has a window size of 30 s and an advance 11: end for
period of 10 s, resulting in an array of size 3. Initially, slots 0, 1, 12: out ← empty event set
and 2 represent the intervals between 0–10, 10–20, and 20–30 s 13: out .cn ← f(sum)
respectively. Each position of this array contains one event set for 14: out .lt ← sumt .lt + (ts − sumt .ts)
each operator predecessor (p1 and p2 ). These event sets accumulate 15: out .ts ← ts
events coming from the predecessors during the slot period. 16: out .tt ← Sum(acc (index)).tt
To use this data structure, the windowed operator maintains 17: for all vs ∈ w.succ do
two auxiliary variables, index and next. The index variable points 18: en ← out ∗ w.selectiv ity(vs )
to the slot where the accumulation should currently take place,
19: Enqueue(w.output (vs ), en)
whereas next stores the next timestamp at which the window
20: end for
closes.
21: end function
These variables are primarily used between lines 2 and 8 of
Algorithm 2. First, when a window closes, an auxiliary procedure
generateOutput is invoked to generate the output event set • timestamp (out .ts) is set to the current timestamp.
(Algorithm 3). Then the next and index variables are adjusted, and • totals (out .tt) is set to the sum of all totals from the event sets
the next slot is reset. Note that this loop can be executed more in the current slot only, as events in previous slots have already
than once if more than one window has been closed since the last been considered in past windows.
simulation.
The following lines (9–18) are similar to the stateless operator 5.4. Placement simulation
simulation presented in Algorithm 1, but instead of writing the
processed event sets into the output queues, they are accumulated After describing how CEPSim simulates operators, this subsec-
at the current time slot. tion focuses on the algorithm used to simulate queries. A pseudo-
The last part of the windowed operator simulation is the code description of this procedure is presented in Algorithm 4.
generateOutput procedure shown in Algorithm 3. The loop between The first thing to notice is that the basic unit of simulation is a
lines 4 and 11 builds an event set for each predecessor and a sum of placement, not a query, which implies that all vertices allocated
these event sets sumt . This step is also shown in Fig. 5(b), in which to a VM are simulated at once. This approach enables operator
sum(p1 ) is calculated as e1 + e3 + e5 , sum(p2 ) as e2 + e4 + e6 , and scheduling strategies to consider simultaneously all vertices in
sumt is the sum of sum(p1 ) and sum(p2 ). a VM and potentially make better decisions regarding their
From lines 12 to 16, the output event set out is built according scheduling optimization criteria.
to the idea that this event set is caused by, or is a result of, all events This procedure simulates execution of a placement for the
accumulated in the window: duration of a simulation tick. As shown in Fig. 6, the CloudSim
• cardinality (out .cn) is set to the result of the combination simulation framework repeatedly invokes this procedure to
function f . This function receives as argument a set of event sets, represent the passing of time. Therefore, the simulation tick length
each one encapsulating all events received from each specific is a parameter that enables users to trade off precision against
predecessor vp during the window timeframe, and returns the computational cost. For example, if the tick is long, the procedure
number of events that must be generated. will be invoked fewer times, but the produced events will be
• latency (out .lt) is set to the average latency of all events in grouped into relatively large event sets and processed as such. On
the window (sumt .lt) plus their average waiting time. The the other hand, a shorter tick translates into smaller event sets and
waiting time is calculated as the difference between the current potentially more precise results.
timestamp (ts) and the average timestamp of all events in the The following parameters are required by the procedure:
window (sumt .ts). a pre-allocated number of instructions n, the simulation time
W.A. Higashino et al. / Future Generation Computer Systems 65 (2016) 122–139 129

Algorithm 4 Placement simulation 1. Locate the placement where the successor vertex resides by
Require: Parameters: consulting the CepSimBroker (implementation details can be
◃ Schedule, Operator scheduling strategy examined in Appendix A).
1: function simulate(p, n, ts, cp) 2. Calculate the delay in transferring the event set to the
◃ p, placement to be simulated destination VM. This calculation depends on the network
◃ n, number of instructions interface implementation in use.
◃ ts, start timestamp (in ms) 3. Schedule a simulation event on the destination VM signalling
◃ cp, allocated CPU capacity (in MIPS) the arrival of the event set. This event is scheduled using the
2: for all vp ∈ p.producers do simulation framework provided by CloudSim.
3: Generate(vp , ts) The second modification is in the main loop between lines
4: end for 6 and 15. Before each iteration, the algorithm checks whether
5: it ←Schedule(p, n) any simulation event (representing the arrival of an event set) is
6: while next ← Next(it) do scheduled during the operator time slice. If one is, the time slice
7: v ← next .v is split in two at the event set arrival time, and the event set is
8: nv ← next .n enqueued into its destination queue between the two slices.
9: Simulate(v, nv , ts) This procedure is illustrated in Fig. 7. In the query from Fig. 7(a),
10: AdjustTime(ts) vertices p3 , f3 , and f4 are placed into one VM, and the remaining
11: for all vs ∈ v.succ do vertices are placed into another. The diagram in Fig. 7(b) shows
12: en ← Dequeue(v.output (vs )) the placements schedule as a function of time. At the end of the
13: Enqueue(vs .input (v), en) first iteration, vertex f4 ‘‘sends’’ an event set to its successor m3 .
14: end for This step is represented by scheduling a simulation event on the
15: end while destination placement after the period of time required to transfer
16: end function the event set from f4 to m3 . In the second iteration, the algorithm
detects the scheduled event before starting the m3 simulation.
at which the procedure has been invoked ts, and the CPU Then the m3 time slice is split into two halves (m′3 and m′′3 ), and
capacity cp (measured in MIPS) available to the placement. the event set is enqueued right after m′3 finishes.
The CloudSim simulation framework determines these arguments
at each invocation: first, a cloudlet scheduler calculates cp by 5.4.2. Bounded queues
distributing the total CPU processing power among all processes Most CEP systems limit the size of operator queues to avoid
concurrently running on the VM. In Fig. 6, the placement p1 has memory overflow and to maintain overall system performance.
only cp1 MIPS available because it shares the same VM with two Because of this characteristic, CEPSim also supports the definition
cloudlets c2 and c3 . Then the number of instructions n is derived by of bounded input operator queues. When using this feature, it is
multiplying the available capacity cp by the simulation tick length. necessary to define the behaviour of the system when new events
In Fig. 6, the value of n is equivalent to the area encompassed by arrive at an already full queue. Currently, CEPSim supports the
each process. application of backpressure to vertex predecessors.
In summary, there are three main steps in Algorithm 4: When using backpressure, operators inform their predecessors
1. All generators associated with the placement event producers about the maximum number of events accepted for the next itera-
are activated to determine the number of events that have been tion at the end of its simulation procedure. Then, the predecessors
generated from the last simulation tick to the current one (lines limit their output on the next tick if needed. Nevertheless, when
2–4); an operator limits its output, it may also accumulate events in its
2. The scheduling strategy associated with the placement is own input queues and consequently apply backpressure to its set
invoked to define the order in which the vertices will be of predecessors. Ultimately, the backpressure arrives at the event
simulated and the number of instructions allocated to each producers, which may choose to discard extraneous events or ac-
vertex (line 5). cumulate them in their own queues.
3. All vertices are traversed and simulated according to the
specified order (lines 6–15). The scheduling strategy returns an 5.5. Metrics
iterator of pairs, each one containing a vertex pointer (next .v )
and the number of instructions allocated to the vertex (next .n). One of the most important parts of any simulator is the set
With these two parameters, the operator simulation procedure of metrics obtained as a result of the simulation. As CEP queries
is invoked (line 9). Then the current timestamp ts is adjusted to performance are usually measured in terms of its latency and
reproduce the passing of time (line 10). Finally, the event sets in throughput, CEPSim provides built-in implementations for these
each of the vertex output queues are moved to the input queues two metrics.
of their respective successors (lines 11–14). The query latency metric is defined for each consumer vc as the
average number of seconds elapsed from the moment an event
5.4.1. Networked queries arrives at the query to the moment it is consumed by vc . In other
To simulate networked (distributed) queries, the CEPSim words, it measures how long a query takes to process an event.
placement simulation from Algorithm 4 received two main Conversely, the query throughput metric of a consumer vc is the
modifications. average number of events processed per second during its lifespan.
First, at the moment that event sets are moved from the Therefore, it quantifies the rate at which events are processed. In
operator output queues to the input queues of its successors (lines CEPSim, both metrics can be easily obtained because every event
11–14), the algorithm checks whether the successor vertex belongs consumer has an output event set that accumulates all events that
to the same placement or not. If it does, the event set is moved to have been consumed during a simulation.
the destination queue as usual. If it does not, then the event set Formally, the value of latency(vc ) is simply the latency of
and the destination vertex id are sent to a network interface, which vc output event set:
executes three main steps: latency(vc ) = vc .output .lt . (8)
130 W.A. Higashino et al. / Future Generation Computer Systems 65 (2016) 122–139

(a) Query example. (b) Simulation algorithm.

Fig. 7. Networked query simulation.

Fig. 8. Event sets created during a simulation tick.

The calculation of throughput (vc ), on the other hand, is based on 6.1. Case study
the totals attribute of the output event set. This attribute contains
the total number of events generated by each producer that have The queries used in the experiments in this section have been
resulted in the events in the set. Thus, the throughput can be extracted from Powersmiths’ WOW system [41], a sustainability
obtained by summing the values for all producers and dividing this management platform that draws on live measurements of
sum by the query simulation time (in seconds). However, if there buildings to support energy management. Powersmiths’ WOW
is more than one path from a producer vp to the consumer vc , the uses Apache Storm [31] to process in near real-time sensor
output event set contains duplicates incorporated into the totals readings coming from buildings managed by the platform.
values for vp and needs to be fixed. Therefore, the query throughput Apache Storm is an open-source distributed stream processing
of a consumer vc is formally given by: system that has been adopted by many enterprises, despite

vc .output .tt (vp )
 limitations regarding QoS maintenance, privacy, and security [42].
throughput (vc ) = /q.time (9) Note, however, that Storm is still a young product and many
p |paths(vp , vc )| researchers are working to overcome its problems. For instance,
vp ∈Vq
Aniello et al. [43] proposed a scheduler that can be used to improve
where |paths(vp , vc )| is the number of paths from producer vp to
the system performance, and Chang et al. [44] introduced CCAF, a
consumer vc and q.time is the total query simulation time.
Fig. 8 exemplifies how the event sets are created and updated security framework that can be used to secure Storm deployments.
during a simulation tick until they are accumulated into the event Fig. 9 shows the Storm queries (topologies) used in the
consumer. The event sets e1 and e2 were generated at timestamp experiments. A spout in the Storm terminology is equivalent to an
ts = 5. At ts = 10 the producer p1 sends e1 to f1 , and at ts = event producer, whereas a bolt is equivalent to an operator. There
13 producer p2 sends e2 to f1 . Note that e1 and e2 attributes are is no concept analogous to an event consumer in Storm.
updated to take into account the time elapsed from the event set There are three main steps in the query q1 from Fig. 9(a): the
generation to the moment they are output. When processed by f1 , OutlierDetectorBolt detects and filters anomalous sensor readings,
both event sets are summed according to Eq. (3a), resulting in a the ReadingAv erageBolt groups readings into windows of 15 s
new event set e12 . At ts = 15, a new event set e3 is created by and calculates the average, and the DBConsumerBolt stores the
updating e12 timestamp and applying (f1 , f2 ) selectivity to it: calculated average in a database. By aggregating the sensor data
into 15 s windows, the query reduces the amount of data that is
e3 = update(e12 , ts) ∗ (f1 , f2 ).selectiv ity. (10)
written to the database.
Then the e3 event set is sent to f2 , where a similar procedure is The query q2 presented in Fig. 9(b), on the other hand, is used
executed and a new event set e4 is created. Finally, e4 is sent to to convert from a data format (JSON) to the native WOW format
the consumer c1 , where the final event set e5 is created and added (XML). This query is used because some existing sensors cannot be
to the output event set. modified to send data according to the WOW interface. The query
is composed of three main steps: the JsonParserBolt parses the JSON
6. Experiments request, the ValidateReadingBolt validates the request values, and
the XmlOutputBolt converts the request to XML format. The last
This section describes the experiments that have been per-
bolt (LatencyMeasurerBolt) is used only to measure the latency and
formed to analyse the CEPSim simulator. First, CEPSim is validated
throughput of the conversion process.
by comparing the latency and throughput metrics obtained by run-
ning queries on a real CEP/SP system and by simulating them on
CEPSim. Then, the simulator performance is assessed by analysing 6.2. Environment
the execution time and memory consumption of various simula-
tion scenarios. Finally, it is also investigated the effects of different Table 1 describes the cluster of virtual machines used in
parameters on the simulator behaviour. the experiments to run Storm topologies. All six VMs were
W.A. Higashino et al. / Future Generation Computer Systems 65 (2016) 122–139 131

(a) Query q1 - Average window. (b) Query q2 - JSON converter.

Fig. 9. Storm topologies.

Table 1 Table 3
Storm VM cluster specification. Simulation parameters.
VM# CPU Mem. Description Parameter Value

1 1 core–Intel Xeon E5-2630 2.6 GHz 512 MB zookeeper VM processor 2 × 2500 MIPS
2 1 core–Intel Xeon E5-2630 2.6 GHz 768 MB nimbus CloudSim VM allocation policy Simple
3–6 1 core–Intel Xeon E5-2630 2.6 GHz 2048 MB workers VM scheduler Time shared
Simulation tick length 100 ms
Placement strategy User defined
Table 2
Allocation strategy Uniform
Software specification. CEPSim
Scheduling strategy Dynamic
Name Version Description Generator Uniform
Queue size 2048
Ubuntu 14.04.2 Physical server operating system
CentOS 6.5 VM operating system
VirtualBox 4.3.24 Virtualization software Experimental results have shown that the latency method provides
OpenJDK 1.7.0_75 Java runtime environment
Apache Storm 0.9.3 Stream processing system
better estimation for lower throughput operators, such as the
MySQL 5.5.41 Database system DBConsumerBolt operator, whereas the maximum throughput
method is better for higher throughput ones. This difference exists
mainly because it is hard to estimate latency accurately when the
deployed on the same physical server (12 cores Intel Xeon E5-
time spent processing each event is very short. For the experiments
2630, 2.6 GHz/96 GB RAM). VMs #1 and #2 run zookeeper (which
in this research, all ipe values were calculated using the maximum
coordinates cluster communication) and nimbus (which assigns
throughput method, except for DBConsumerBolt.
Storm tasks to workers). The workers VMs #3–#6 are the ones
which effectively execute the queries. The VM memory sizes have
been dimensioned to not be a bottleneck in the experiments. A 6.4. Validation
similar physical server hosted the database system and was used
to run all CEPSim simulations described in the experiments. The first step in CEPSim validation was to unit test all
The software used in the experiments is presented in Table 2. components and to execute a set of sanity checks to detect
All Storm topologies have been implemented using Storm’s Java programming bugs and inconsistent behaviour. After this phase,
API and use standard Java libraries for database access and XML a set of experiments was executed aiming to compare the
processing. performance metrics obtained by running queries on a real CEP/SP
system (Apache Storm) and by simulating them on CEPSim. This
validation approach is similar to the ones adopted by other
6.3. Set-up
simulators, such as NetworkCloudSim [9], iCanCloud [14], and
Before any simulation, the Storm queries had to be imple- Grozev and Buyya [35].
mented in the CEPSim model. The mapping of Storm queries to CEP- In all simulations, CEPSim was used to create an environment as
Sim is straightforward because both use DAGs as their underlying close as possible to the Storm VM cluster. Table 3 summarizes the
query model. main parameters used in the simulations. VMs have been modelled
Fig. 10 depicts the CEPSim model of both queries presented as having two processors, even though only one physical processor
in the use case section. Each edge connecting two vertices is was allocated for each. This was done because the processors used
annotated with its corresponding selectivity, and each vertex in the experiments are hyper-threaded, which enables a higher de-
is annotated with the estimated ipe attribute. Storm’s spouts gree of parallelism than regular processors. The queue size was set
and bolts are mapped to event producer and operator vertices to 2048 because by default Storm has buffers with 1024 elements
respectively. In both queries, an event consumer is also added to at both the output and input of each operator, but in CEPSim, accu-
group the events consumed by the query. mulation happens only at the operators’ input queues.
To estimate the operator’s ipe attribute, two methods have been
used: 6.4.1. Single query
• Latency estimation: the operator is fed with random events This first experiment validates CEPSim simulation of a single
at increasing rates and the average processing time (in query running entirely on a single VM.
milliseconds) is calculated for each rate value. The minimum To obtain the Storm metrics, both queries from Fig. 9 were
first instrumented to output the average throughput and latency
average is assumed to be the operator latency opl . Then the ipe
every minute. In addition, the query Spouts (event producers) were
attribute is calculated as:
  modified so that the user could define the number of sensors n that
1000 send data to the query. Each sensor generates 10 sensor readings
op.ipe = (cpum · 10 )/ 6
(11)
opl per second, of which 5% are anomalies.
The graphs from the experiments were obtained by varying
where cpum is the CPU processing power estimated in MIPS.
the number of sensors n, which consequently varied the number
• Maximum throughput estimation: the maximum throughput opt
of events generated per second. For each n, the queries were
is estimated by feeding the operator process with as many
run for 15 min and the average latency (throughput) for each of
events as possible. Then the ipe attribute is estimated as:
the last 10 min were collected. Note that each data point is an
op.ipe = (cpum · 106 )/opt . (12) observation from a sampling distribution of the average query
132 W.A. Higashino et al. / Future Generation Computer Systems 65 (2016) 122–139

(a) Query q1 —CEPSim model.

(b) Query q2 —CEPSim model.

Fig. 10. Storm queries converted to the CEPSim model.

CEPSim results were obtained using a similar procedure. The graphs show the mean value of these averages and their 99% confidence interval (in other words, the confidence interval of the sampling distribution). In most cases, the confidence interval is small and actually not visible.

Fig. 11 shows query q1 latency and throughput as a function of the input rate. Generally speaking, CEPSim achieved very high accuracy for both metrics when compared to Storm. The latency error was less than 1% up to 1000 events/s and was kept below 7.5% up to 20,000 events/s. The throughput calculation was even more accurate, with almost no error up to 20,000 events/s.

The major estimation error occurred at 22,500 events/s, at which point the latency obtained by CEPSim was lower than the real value. Further analysis showed that at this point, the Storm query overloaded, and its behaviour became very unpredictable, as can be seen in the high variance of the data point. Nevertheless, CEPSim still correctly predicted the maximum query throughput around 21,000 events/s, as shown in the throughput drop in Fig. 11(b).

Results for the latency and throughput of query q2 are shown in Fig. 12. The latency axis in Fig. 12(a) has a log scale because the measured values encompass five orders of magnitude. Once again, the throughput calculation exhibited very small error, and the maximum query throughput was closely estimated at approximately 21,000 events/s.

The latency at slow input rates showed some error because it is extremely hard to estimate latency accurately at sub-millisecond precision. At 100 events/s, the simulation values approached those obtained with Storm and remained close up to the overload point at 22,500 events/s. After this point, the simulation latency plateaued, whereas the Storm value spiked. This difference was caused mainly by the way that CEPSim handles full queues by using backpressure and discarding generated events. Storm, on the other hand, delays generation of events, but does not discard them.

6.4.2. Networked query

This experiment aimed to validate CEPSim simulation of distributed queries. To perform this experiment, the query from Fig. 9(a) was distributed into two VMs, such that the DBConsumerBolt was placed into the worker2 server and all remaining vertices into worker1.

A constant delay network interface was used to simulate this query. In this network implementation, every event set sent through the network takes a fixed amount of time to arrive at its destination. This is a reasonable approximation because all VMs run on the same physical server and no real network traffic is being generated. The delay has been estimated as 1 ms in a separate experiment. Furthermore, a simulation tick length of 10 ms was used to improve the simulation precision (see the discussion in Section 6.6.2).
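A fixed-delay network model of this kind is simple to express. The sketch below is a minimal, self-contained approximation written for illustration; the NetworkInterface trait, the EventSet type, and the method names are assumptions and do not reproduce CEPSim's actual interfaces.

```scala
// Minimal sketch of a constant-delay network model (illustrative names and types only).
final case class EventSet(size: Double, timestamp: Double, latency: Double)

trait NetworkInterface {
  // Returns the event set as seen by the destination vertex.
  def send(es: EventSet, sentAt: Double): EventSet
}

// Every transmission takes exactly `delayMs`, regardless of size or destination,
// mirroring the 1 ms estimate used in the networked-query experiment.
class ConstantDelayNetwork(delayMs: Double) extends NetworkInterface {
  override def send(es: EventSet, sentAt: Double): EventSet =
    es.copy(timestamp = sentAt + delayMs, latency = es.latency + delayMs)
}

object ConstantDelayNetworkExample extends App {
  val net = new ConstantDelayNetwork(1.0)
  println(net.send(EventSet(size = 100, timestamp = 50.0, latency = 2.0), sentAt = 50.0))
  // EventSet(100.0,51.0,3.0): arrival 1 ms later, accumulated latency increased by 1 ms
}
```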
Fig. 13 shows that the simulated latency and throughput were very accurate and precise. The latency error was less than 7% up to 27,500 events/s. At 30,000 events/s, the Storm query started to overload and the error increased, but the CEPSim results remained within the confidence interval. The throughput calculation, on the other hand, had no error throughout the experiment.

6.4.3. Multiple queries

This experiment analysed CEPSim's behaviour when simulating multiple queries running concurrently. To do so, first a Storm cluster was created at the Amazon EC2 service [45]. The setup was similar to the one presented in Table 1, but all VMs were configured as instances of the m4.large type (2 vCPUs and 8 GB of RAM).

Then, four placement strategies were compared in a scenario where four copies of query q1 were simultaneously run (a sketch of how such placements can be expressed is given at the end of this subsection):

1. Placement1: one VM, with all four queries placed on it;
2. Placement2: two VMs, with two queries placed on each;
3. Placement3: two VMs, with all four instances of DBConsumerBolt placed on one VM and the remaining bolts on the other;
4. Placement4: four VMs, with one query placed on each.

To avoid possible bottlenecks in the database server, DBConsumerBolt was replaced by a mock implementation which does not access the database, but spins in a busy loop for 4.5 ms (the average time spent to process a single event, as measured by the methodology described in Section 6.3).

Table 4
Placement experiment—Latency measurements (in ms).

              Apache Storm   CEPSim (error)
Placement1    12921.11       13234.15 (+2.42%)
Placement2     9840.91       10117.70 (+2.81%)
Placement3    12575.42       12030.00 (−4.33%)
Placement4     9795.91       10061.83 (+2.71%)

Table 4 presents the average latency of all four queries for both Apache Storm and CEPSim. The CEPSim column also shows the relative estimation error. Each query was set up to process 10,000 events/s. The throughput metric has been omitted from the table because it was correctly measured as 10,000 events/s in all scenarios. The results from this experiment demonstrated that CEPSim can accurately simulate multiple queries running on the same VM and can be used to analyse different placement strategies. For instance, the experiment showed that running two instances of query q1 on the same VM does not greatly affect their performance, as illustrated by the small latency increase from Placement4 to Placement2. It is also clear from Placement1's latency that placing four queries on the same VM can overload it and may not be a good option depending on the users' QoS requirements.
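To illustrate how such alternatives can be described, the sketch below expresses Placement2 and Placement3 as explicit vertex-to-VM maps, in the spirit of the manually specified CustomOpPlacementStrategy mentioned in Appendix A. The Vertex and Placement types, the constructor signatures, and the vertex names other than DBConsumerBolt and ReadingAverageBolt are simplified assumptions, not CEPSim's real API.

```scala
// Illustrative-only model of manual operator placement (types and signatures are assumptions).
final case class Vertex(id: String)
final case class Placement(vmId: Int, vertices: Set[Vertex])

object PlacementExamples extends App {
  // Four copies of query q1: a spout, intermediate bolts and the (mocked) DBConsumerBolt.
  // "SensorSpout" and "OtherBolt" stand in for the remaining q1 vertices.
  def copy(i: Int): Seq[Vertex] =
    Seq("SensorSpout", "OtherBolt", "ReadingAverageBolt", "DBConsumerBolt")
      .map(n => Vertex(s"q$i-$n"))

  // Placement2: two VMs, two complete query copies on each.
  val placement2: Seq[Placement] = Seq(
    Placement(1, (copy(1) ++ copy(2)).toSet),
    Placement(2, (copy(3) ++ copy(4)).toSet))

  // Placement3: the four DBConsumerBolt instances isolated on one VM, everything else on the other.
  val placement3: Seq[Placement] = {
    val all = (1 to 4).flatMap(copy)
    val (db, rest) = all.partition(_.id.endsWith("DBConsumerBolt"))
    Seq(Placement(1, db.toSet), Placement(2, rest.toSet))
  }

  println(placement2.map(_.vertices.size)) // List(8, 8)
  println(placement3.map(_.vertices.size)) // Vector(4, 12)
}
```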

Fig. 11. Metrics estimation results—query q1. (a) Latency. (b) Throughput.

Fig. 12. Metrics estimation results—query q2. (a) Latency. (b) Throughput.

Fig. 13. Metrics estimation results—networked query q1. (a) Latency. (b) Throughput.

6.5. CPU and memory overhead

This section presents two experiments that measure the execution time and memory consumption of CEPSim simulations.

Figs. 14(a) and 14(b) depict the results from the first experiment. This experiment simulated a single VM running n instances of query q2 from Fig. 9(b). The simulation time was set to 5 min and each query processed 100 events/s. For each value of n, the simulation was executed 10 times and the total execution time and memory consumption were recorded. The graphs show the average of these values alongside the 99% confidence interval. CEPSim was able to simulate 100 queries in approximately 7 s, using less than 40 MB of memory. Furthermore, both metrics grew sub-linearly as a function of the number of queries.

The results from the second experiment are shown in Fig. 15. In this experiment, each VM ran a fixed number of queries, and the number of VMs in the datacentre was varied. The graphs show results for two different combinations. In the first, the number of queries per VM was set to 10 and the number of VMs varied from 10 to 1000; in the second, the number of queries per VM was set to 100 and the number of VMs varied from 1 to 100. Both combinations resulted in the same number of total queries, but enabled comparison of the effects of different query placements on CEPSim performance.
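The sketch below shows one straightforward way such measurements can be taken on the JVM. The runSimulation callback is a placeholder for an actual CEPSim run, and the measurement approach (wall-clock time plus a used-heap reading after a garbage collection) is an assumption about how such a harness could work, not a description of the harness actually used in the experiments.

```scala
object OverheadHarness {
  // Runs `body`, returning (elapsed milliseconds, approximate used heap in MB).
  def measure(body: => Unit): (Long, Long) = {
    val rt = Runtime.getRuntime
    System.gc()                                   // reduce noise from earlier allocations
    val start = System.nanoTime()
    body
    val elapsedMs = (System.nanoTime() - start) / 1000000L
    val usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L)
    (elapsedMs, usedMb)
  }

  def main(args: Array[String]): Unit = {
    def runSimulation(queries: Int): Unit = ()    // placeholder for a CEPSim run with n queries
    for (n <- Seq(1, 10, 100); _ <- 1 to 10) {    // 10 repetitions per configuration
      val (ms, mb) = measure(runSimulation(n))
      println(s"n=$n time=${ms}ms heap=${mb}MB")
    }
  }
}
```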

Fig. 14. Execution time and memory consumption—single VM. (a) Execution time. (b) Memory consumption.

Fig. 15. Execution time and memory consumption—multiple VMs. (a) Execution time. (b) Memory consumption.

The results for the two combinations were very similar. The maximum simulation time was approximately 7 min for a total of 10,000 queries, which translates to 1 million events per second. Less than 1 GB of memory was needed to run this simulation. Once again, both execution time and memory consumption scaled sub-linearly. This behaviour is expected as long as the available RAM is larger than the memory required by the simulation.

6.6. Simulation parameters

The two experiments described in this section aim to evaluate the effects of different parameters in the simulations. First, it is analysed how scheduling and allocation strategies affect the simulation metrics. Then, the effects of simulation tick length on CEPSim are assessed.

6.6.1. Operator scheduling

To analyse the effects of operator scheduling strategies, query q1 latency and throughput were estimated using the default and dynamic scheduling strategies combined with the uniform and weighted allocation strategies. Fig. 16 summarizes the results obtained when the query input rate was configured to 100, 500, and 10,000 events/s.

When the default scheduling strategy was used in high input rate scenarios, the throughput was considerably underestimated and the latency overestimated. This occurred mainly because DBConsumerBolt was scheduled at every simulation tick, even though it receives events only when its predecessor ReadingAverageBolt's window closes. This problem was even more pronounced when weighted allocation was used. In this case, the number of instructions that DBConsumerBolt received was proportional to its ipe, which is much higher than the other operators' ipes.

When using the dynamic scheduling strategy, CEPSim better approximated Apache Storm's results in all scenarios. Nevertheless, when used with weighted allocation, dynamic scheduling underestimated the average latency in the 10,000 events/s case. In this combination, the dynamic strategy prioritized DBConsumerBolt whenever there were events on its input queues, resulting in lower latency at the cost of lower maximum throughput.
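The difference between the two allocation strategies can be summarized in a few lines. The sketch below is an illustrative approximation under the assumption that an allocation strategy simply splits the instructions available in a tick among the operators of a placement; the operator names and ipe values are made up, and the code does not reproduce CEPSim's actual allocation interfaces.

```scala
object AllocationSketch {
  // Operators identified by name, each with its ipe (instructions per event).
  type Ipe = Map[String, Double]

  // Uniform allocation: every operator gets the same share of the tick's instructions.
  def uniform(available: Double, ops: Ipe): Map[String, Double] =
    ops.map { case (op, _) => op -> available / ops.size }

  // Weighted allocation: each operator's share is proportional to its ipe,
  // which is why an expensive sink such as DBConsumerBolt can dominate the tick.
  def weighted(available: Double, ops: Ipe): Map[String, Double] = {
    val total = ops.values.sum
    ops.map { case (op, ipe) => op -> available * ipe / total }
  }

  def main(args: Array[String]): Unit = {
    val ops = Map("ProducerA" -> 1000.0, "BoltB" -> 5000.0, "DBConsumerBolt" -> 50000.0) // assumed ipes
    println(uniform(1.0e6, ops))   // roughly 333333 instructions each
    println(weighted(1.0e6, ops))  // DBConsumerBolt receives about 89% of the instructions
  }
}
```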
6.6.2. Simulation tick length

To evaluate the effects of simulation tick length on CEPSim, query q1 was simulated using different simulation tick lengths in both the local and networked cases. The results are summarized in Table 5. The latency column shows the metric value obtained by the simulation. The execution time column displays the average of 10 simulations, each one including 100 instances of the query running for 5 min.

The results show that the simulation tick length enables users to adjust the trade-off between precision and computational cost. A longer tick introduced estimation error for both scenarios, but the execution time was significantly reduced. The error was more pronounced in the networked query case because of the way network communication is implemented in CEPSim: if a message is sent to a placement that has already been scheduled, then the message will be processed on the next simulation tick only.
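This interaction between tick length and network delay can be seen in a toy tick loop. The sketch below is an illustrative approximation, not CEPSim code: it assumes that a message sent to a placement that was already updated in the current tick only becomes visible at the next tick boundary, so a 1 ms network delay is effectively rounded up to the tick length.

```scala
object TickDelaySketch {
  // Simulated time (ms) at which the destination placement first sees a message emitted at
  // `emitTimeMs`, given that placements are updated once per tick and the destination was
  // already updated when the message arrived.
  def deliveryTime(emitTimeMs: Double, networkDelayMs: Double, tickMs: Double): Double = {
    val arrival = emitTimeMs + networkDelayMs
    math.ceil(arrival / tickMs) * tickMs   // earliest tick boundary at or after arrival
  }

  def main(args: Array[String]): Unit = {
    println(deliveryTime(emitTimeMs = 20.0, networkDelayMs = 1.0, tickMs = 10.0))   // 30.0
    println(deliveryTime(emitTimeMs = 20.0, networkDelayMs = 1.0, tickMs = 1000.0)) // 1000.0
  }
}
```

Under this assumption, the extra waiting time per network hop grows with the tick length, which is consistent with the larger error observed for the networked query in Table 5.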

Fig. 16. Parameters experiments—Scheduling and allocation strategies. (a) Latency. (b) Throughput.

Table 5
Parameters experiments—Simulation tick length.

Query       Tick length (ms)   Latency (ms)   Error     Execution time (ms)
Local       10                 10078.43       –         63400.43
Local       100                10140.53       0.62%     9645.23
Local       1000               10415.02       3.34%     2811.64
Networked   10                 9730.00        –         66385.91
Networked   100                9865.01        2.78%     10058.58
Networked   1000               12636.47       29.87%    3106.90

6.7. Discussion

The experimental results described in this section showed that CEPSim can effectively model real CEP/SP queries and simulate them in a cloud environment. Execution time measurements also demonstrated that CEPSim has excellent performance, being able to simulate 100 queries running for 5 min in only 7 s.

One of the main CEPSim use cases is to understand query behaviour at various input event rates. The experiments described in Sections 6.4.1 and 6.4.2 showed that this study can be performed using CEPSim with relatively good accuracy and precision for both distributed and non-distributed queries and for both high and low input rate scenarios.

As another important use case, the experiments described in Section 6.4.3 showed that CEPSim can also be used to simulate multiple queries running on the same VM. The latency estimation error was kept fairly low during the experiment and enabled easy comparison of different operator placement strategies.

The limitations shown by CEPSim in simulating query q1 at the maximum input rate highlighted the difficulty of simulating a system in an overloaded state. Further analysis concluded that, at this point, most of the query latency consisted of I/O waiting time, as the DBConsumerBolt writes to the database every event it receives. In this situation, the operating system continues to schedule other threads and processes, which can continue to process events on their turns. CEPSim uses a simplified model in which operator latency is caused by processing time spent on a CPU only. In addition, the metric calculation errors at high input rates were also caused by differences in the strategy adopted to control the query load: while CEPSim uses backpressure, Storm follows a pull strategy in which events are requested from the producer only when there is space available at the operator queues.

As a final observation, it is claimed that CEPSim can be efficiently used for Big Data simulations. Results from the experiments in Section 6.5 demonstrated that the simulator can scale well and can handle large numbers of queries with a small memory footprint. In addition, CEPSim customizability also enables the user to finely control the simulation by changing parameters such as the simulation tick length and scheduling strategy. Moreover, even though Storm has not been stressed at a larger scale, most experimental results are also applicable to these scenarios. This is true because, in practice, the distribution of Storm (and other CEP systems) queries is limited to a few nodes. In other words, distinct VMs usually run independent pieces of computation that can be simulated in isolation from others.

7. Conclusions

This article has presented CEPSim, a simulator for cloud-based Complex Event Processing systems. CEPSim can model different CEP systems by transforming user queries into a Directed Acyclic Graph representation. The modelled queries can be simulated on different environments, including private, public, hybrid, and multi-clouds. In addition, CEPSim also allows customization of operator placement and scheduling strategies, as well as the queue size and data generation functions used during simulation.

Experimental results have shown that CEPSim can simulate a large number of queries running on a large number of virtual machines within a reasonable time and with a very small memory footprint. Furthermore, the experiments also demonstrated that CEPSim can model a real CEP system (Apache Storm) with good accuracy and precision. Together, these results validated CEPSim as an effective tool for simulation of cloud-based CEP systems in Big Data scenarios.

By using CEPSim, architects and researchers can quickly experiment with different configurations and query processing strategies and analyse the performance and scalability of CEP systems. Hopefully, the availability of a simulator may also encourage research in this field.

In future work, it is planned to add reconfiguration features to CEPSim, including dynamically moving vertices to other VMs and deploying new queries during a simulation. Moreover, experiments that simultaneously use multiple clouds will also be included.

Acknowledgements

This research was supported in part by an NSERC CRD at Western University (CRDPJ 453294-13). Additionally, the authors would like to acknowledge the support provided by Powersmiths.

Appendix A

This appendix details the CEPSim implementation. It starts with an overview of the simulator components, and it is followed by a description of the core classes. The integration of CEPSim with the CloudSim toolkit is also discussed.

A.1. Overview

Based on the design principles and goals presented in Section 3, CEPSim has been designed with three main components, as shown in Fig. A.1:

• CEPSim Core: implements the CEPSim concepts from Fig. 1. It provides APIs that enable the definition of queries and the creation of operator placement and scheduling strategies. In addition, it also implements the simulation logic described in Section 5.
• CloudSim: implements the CloudSim concepts from Fig. 1. It provides the overall simulation framework, which controls the main simulation loop and the scheduling of simulation events. It is also used to define the cloud computing environment where the queries are simulated and to customize resource allocation policies.
• CEPSim Integration: implements the pieces necessary to integrate the CloudSim simulation engine with the CEP-specific logic provided by CEPSim Core. It guarantees a loose coupling between the two and enables future integration with other simulators.

Fig. A.1. CEPSim components.

The following subsections detail the CEPSim Core and Integration components.

A.2. CEPSim core

CEPSim Core classes and interfaces can be grouped into four main packages: event, which contains the event set and event set queue definitions; the query model, which contains the base classes used to describe queries; the query executor, which manages the query simulation; and metrics, which contains the metrics calculation framework.

The class diagram in Fig. A.2 shows the main parts of the event and query model packages. Event sets and event set queues are implemented by classes with the same respective names in the event package. The Query class represents CEP queries and, as determined by its definition, is composed of one or more Vertex objects and one or more Edges.

Two subclasses of Vertex have been identified: OutputVertex and InputVertex. The former represents vertices with outgoing edges, and the latter represents vertices with incoming edges. Note that both OutputVertex and InputVertex have one or more instances of the EventSetQueue class representing their output and input queues respectively.

The EventProducer class describes event producers and therefore is a subclass of OutputVertex only. Similarly, EventConsumer characterizes event consumers and is a subclass of InputVertex. An Operator is both an OutputVertex and an InputVertex because it receives events from some vertices and sends them to others. The Operator class also has a WindowedOperator subclass that is used to represent windowed operators.

Finally, note that every EventProducer is associated with a Generator instance, which implements the generation function defined in Eq. (2). CEPSim currently contains two implementations of this function:

• UniformGenerator: generates a constant number of events per simulation interval;
• UniformIncreaseGenerator: generates a uniformly increasing number of events until it reaches a maximum rate. After this point, this maximum rate is maintained until the end of the simulation.
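As an illustration of how these classes fit together, the sketch below builds a toy three-vertex query (producer, operator, consumer) driven by a constant-rate generator. The class and constructor signatures are deliberately simplified assumptions chosen for readability; they do not reproduce the exact CEPSim API.

```scala
// Simplified stand-ins for the query model classes (signatures are illustrative only).
trait Generator { def generate(intervalSec: Double): Double }
class UniformGenerator(ratePerSec: Double) extends Generator {
  def generate(intervalSec: Double): Double = ratePerSec * intervalSec
}

sealed trait Vertex { def id: String }
case class EventProducer(id: String, generator: Generator) extends Vertex
case class Operator(id: String, ipe: Double) extends Vertex            // ipe: instructions per event
case class EventConsumer(id: String) extends Vertex
case class Edge(from: Vertex, to: Vertex, selectivity: Double)
case class Query(id: String, vertices: Set[Vertex], edges: Set[Edge])

object QueryModelExample extends App {
  val producer = EventProducer("sensors", new UniformGenerator(1000.0)) // 1000 events/s
  val filter   = Operator("filter", ipe = 125000.0)
  val consumer = EventConsumer("sink")

  val q = Query("q-example", Set(producer, filter, consumer),
    Set(Edge(producer, filter, selectivity = 1.0), Edge(filter, consumer, selectivity = 0.05)))

  println(s"${q.id}: ${q.vertices.size} vertices, ${q.edges.size} edges")
}
```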
The main classes and interfaces of the query executor and metrics packages are shown in Fig. A.3. The Placement class is the central entity, representing the mapping of one or more vertices to the VM in which they will be executed. To create these placements, CEPSim users must provide an implementation of the OpPlacementStrategy interface, which defines an operator placement strategy. Currently, CustomOpPlacementStrategy is the only strategy provided by CEPSim, but others can be easily added. In this strategy, users must manually specify the mapping of vertices to VMs.

The PlacementExecutor class encapsulates a Placement and implements the placement simulation algorithm described in Section 5.4. This class uses an instance of the OpScheduleStrategy interface, which defines the operator scheduling strategy to be used during the simulation. Note that implementations for the scheduling and allocation strategies described in Section 5.1 are provided out-of-the-box by CEPSim.

Fig. A.2. Class diagram—event and query model packages.



Fig. A.3. Class diagram—query executor and metrics packages.

In addition, the PlacementExecutor also interacts with one or more instances of the MetricCalculator interface to calculate the simulation metrics. The LatencyThroughputCalculator class shown in the figure is a built-in implementation that computes both metrics described in Section 5.5.

A.3. CEPSim integration

In accordance with the reuse design principle, CEPSim leverages many functionalities provided by CloudSim to enable the simulation of CEP queries. This section describes how CloudSim has been extended and integrated with the CEPSim core. The main parts of this extension are depicted in the class diagram in Fig. A.4.

The main part of this extension is the CepQueryCloudlet class, a Cloudlet specialization that encapsulates the PlacementExecutor class described in the preceding section. During the simulation, a CepQueryCloudlet orchestrates a PlacementExecutor execution by invoking the simulate method at each simulation tick.

The other main classes created for the integration are:

• CepSimBroker: a mediator between cloud users and providers [12]. The CepSimBroker extends the CloudSim broker to handle CepQueryCloudlets. It also keeps a mapping of all vertices to the VMs to which they have been allocated.
• CepSimDatacenter: this datacentre specialization handles CepQueryCloudlets and guarantees that the state of all simulated entities is updated at equally spaced intervals.
• CepQueryCloudletScheduler: a cloudlet scheduler defines how the processing power of a VM is shared among all cloudlets allocated to it [12]. This research extends the time-shared policy to handle infinite or duration-based cloudlets.

Fig. A.4. CEPSim integration with CloudSim.

The sequence diagram in Fig. A.5 summarizes how these classes work in tandem to implement a simulation cycle. First, the CepSimDatacenter receives a Vm_Datacenter_Event signal, which is a CloudSim simulation event used to update the state of all simulated entities in a datacentre. By default, this event is signalled when cloudlets resume or end their execution. In CEPSim, this behaviour has been changed so that the event is signalled at regular intervals with the length of a simulation tick. This guarantees that the CEP queries are periodically updated and renders the simulation more precise.

Fig. A.5. Sequence diagram—simulation cycle.



After receiving this event, CepSimDatacenter invokes the updateVmsProcessing method in all hosts in the datacentre. Note that the current simulation time is passed as a parameter of this method call and therefore all hosts share the same clock. Then each host calls another updateVmsProcessing method in all VMs currently deployed on it. At this point, the host also informs the number of MIPS allocated to each VM, which is obtained based on the VM scheduling policy in use.

Next, the VM delegates the update task to the cloudlet scheduler, which determines the number of instructions available to each cloudlet running on that particular VM based on the time-shared policy. Finally, the method updateCloudletFinishedSoFar is invoked on every CepQueryCloudlet, which delegates the simulation to the encapsulated instance of PlacementExecutor.
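The update chain described above can be summarized in a short sketch. It is only an outline of the call sequence under simplified, assumed types: the real CloudSim methods carry additional arguments (for example, MIPS share lists), so this should be read as an illustration of the tick-driven flow rather than working integration code.

```scala
// Simplified outline of one CEPSim/CloudSim simulation tick (types and signatures are assumptions).
class PlacementExecutorStub { def simulate(instructions: Double, timeMs: Double): Unit = () }

class CepQueryCloudlet(val executor: PlacementExecutorStub) {
  def updateCloudletFinishedSoFar(instructions: Double, timeMs: Double): Unit =
    executor.simulate(instructions, timeMs)      // delegate the CEP-specific simulation
}

class Vm(val mips: Double, val cloudlets: Seq[CepQueryCloudlet]) {
  // Time-shared policy: the instructions available in the tick are split among the cloudlets.
  def updateVmsProcessing(timeMs: Double, tickMs: Double): Unit = {
    val perCloudlet = mips * 1e6 * (tickMs / 1000.0) / cloudlets.size
    cloudlets.foreach(_.updateCloudletFinishedSoFar(perCloudlet, timeMs))
  }
}

class Host(val vms: Seq[Vm]) {
  def updateVmsProcessing(timeMs: Double, tickMs: Double): Unit =
    vms.foreach(_.updateVmsProcessing(timeMs, tickMs))   // all hosts and VMs share the same clock
}

object SimulationCycleSketch extends App {
  val host = new Host(Seq(new Vm(2500.0, Seq(new CepQueryCloudlet(new PlacementExecutorStub)))))
  // The datacentre signals the update at regular intervals of one simulation tick (10 ms here).
  for (t <- 0 to 50 by 10) host.updateVmsProcessing(t.toDouble, 10.0)
}
```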
Appendix B. Supplementary data

Supplementary material related to this article can be found online at http://dx.doi.org/10.1016/j.future.2015.10.023.

References

[1] J. Gubbi, R. Buyya, S. Marusic, M. Palaniswami, Internet of Things (IoT): A vision, architectural elements, and future directions, Future Gener. Comput. Syst. 29 (7) (2013) 1645–1660. http://dx.doi.org/10.1016/j.future.2013.01.010.
[2] K. Grolinger, W.A. Higashino, A. Tiwari, M.A.M. Capretz, Data management in cloud environments: NoSQL and NewSQL data stores, J. Cloud Comput. Adv. Syst. Appl. 2 (2013) 1–22. http://dx.doi.org/10.1186/2192-113X-2-22.
[3] F.J. Ohlhorst, Big Data Analytics: Turning Big Data into Big Money, first ed., Wiley, Hoboken, NJ, USA, 2012.
[4] K. Grolinger, M. Hayes, W.A. Higashino, A. L'Heureux, D.S. Allison, M.A.M. Capretz, Challenges for MapReduce in Big Data, in: Services, 2014 World Congress on, IEEE, 2014, pp. 182–189. http://dx.doi.org/10.1109/SERVICES.2014.41.
[5] Z. Qian, Y. He, C. Su, Z. Wu, H. Zhu, T. Zhang, L. Zhou, Y. Yu, Z. Zhang, TimeStream: Reliable Stream Computation in the Cloud, in: Proceedings of the 8th ACM European Conference on Computer Systems, ACM Press, New York, NY, USA, 2013, pp. 1–14. http://dx.doi.org/10.1145/2465351.2465353.
[6] V. Gulisano, R. Jimenez-Peris, M. Patino-Martinez, C. Soriente, P. Valduriez, StreamCloud: An elastic and scalable data streaming system, IEEE Trans. Parallel Distrib. Syst. 23 (12) (2012) 2351–2365. http://dx.doi.org/10.1109/TPDS.2012.24.
[7] R. Barazzutti, P. Felber, C. Fetzer, E. Onica, J.-F. Pineau, M. Pasin, E. Rivière, S. Weigert, StreamHub: A Massively Parallel Architecture for High-Performance Content-Based Publish/Subscribe, in: Proceedings of the 7th ACM International Conference on Distributed Event-Based Systems, DEBS'13, ACM Press, New York, NY, USA, 2013, pp. 63–74. http://dx.doi.org/10.1145/2488222.2488260.
[8] W.A. Higashino, C. Eichler, M.A.M. Capretz, T. Monteil, M.B.F. De Toledo, P. Stolf, Query Analyzer and Manager for Complex Event Processing as a Service, in: WETICE Conference, 2014 IEEE 23rd International, 2014, pp. 107–109. http://dx.doi.org/10.1109/WETICE.2014.53.
[9] S.K. Garg, R. Buyya, NetworkCloudSim: Modelling Parallel Applications in Cloud Simulations, in: 2011 Fourth IEEE International Conference on Utility and Cloud Computing, IEEE, 2011, pp. 105–113. http://dx.doi.org/10.1109/UCC.2011.24.
[10] D. Luckham, Rapide: A Language and Toolset for Simulation of Distributed Systems by Partial Orderings of Events, Tech. Rep. CSL-TR-96-705, Stanford University, 1996.
[11] R. Buyya, M. Murshed, GridSim: a toolkit for the modeling and simulation of distributed resource management and scheduling for Grid computing, Concurrency Comput. Pract. Exp. 14 (13–15) (2002) 1175–1220. http://dx.doi.org/10.1002/cpe.710.
[12] R.N. Calheiros, R. Ranjan, A. Beloglazov, C.A.F. De Rose, R. Buyya, CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms, Softw. - Pract. Exp. 41 (1) (2011) 23–50. http://dx.doi.org/10.1002/spe.995.
[13] D. Kliazovich, P. Bouvry, S.U. Khan, GreenCloud: A packet-level simulator of energy-aware cloud computing data centers, J. Supercomput. 62 (3) (2012) 1263–1283. http://dx.doi.org/10.1007/s11227-010-0504-1.
[14] A. Núñez, J.L. Vázquez-Poletti, A.C. Caminero, G.G. Castañé, J. Carretero, I.M. Llorente, iCanCloud: A Flexible and Scalable Cloud Infrastructure Simulator, J. Grid Comput. 10 (6) (2012) 185–209. http://dx.doi.org/10.1007/s10723-012-9208-5.
[15] W.A. Higashino, M.A.M. Capretz, L.F. Bittencourt, CEPSim: A Simulator for Cloud-Based Complex Event Processing, in: 2015 IEEE International Congress on Big Data, IEEE, 2015, pp. 182–190. http://dx.doi.org/10.1109/BigDataCongress.2015.34.
[16] D. Luckham, The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems, first ed., Addison-Wesley Professional, 2002.
[17] D.J. Abadi, D. Carney, U. Çetintemel, M. Cherniack, C. Convey, S. Lee, M. Stonebraker, N. Tatbul, S. Zdonik, Aurora: a new model and architecture for data stream management, VLDB J. 12 (2) (2003) 120–139. http://dx.doi.org/10.1007/s00778-003-0095-z.
[18] A. Arasu, B. Babcock, S. Babu, J. Cieslewicz, M. Datar, K. Ito, R. Motwani, U. Srivastava, J. Widom, STREAM: The Stanford Data Stream Management System, Technical Report 2004-20, Stanford InfoLab, 2004.
[19] G. Cugola, A. Margara, Processing flows of information: from data stream to complex event processing, ACM Comput. Surv. 44 (3) (2012) 1–62. http://dx.doi.org/10.1145/2187671.2187677.
[20] D. Luckham, R. Schulte, Event Processing Glossary—Version 2.0, Tech. Rep., Event Processing Technical Society, July 2011.
[21] A. Arasu, S. Babu, J. Widom, The CQL continuous query language: semantic foundations and query execution, VLDB J. 15 (2) (2005) 121–142. http://dx.doi.org/10.1007/s00778-004-0147-z.
[22] N. Jain, S. Mishra, A. Srinivasan, J. Gehrke, J. Widom, H. Balakrishnan, U. Çetintemel, M. Cherniack, R. Tibbetts, S. Zdonik, Towards a streaming SQL standard, Proc. VLDB Endow. 1 (2) (2008) 1379–1390.
[23] I.A.T. Hashem, I. Yaqoob, N. Badrul Anuar, S. Mokhtar, A. Gani, S. Ullah Khan, The rise of Big Data on cloud computing: Review and open research issues, Inf. Syst. 47 (2014) 98–115. http://dx.doi.org/10.1016/j.is.2014.07.006.
[24] K. Kambatla, G. Kollias, V. Kumar, A. Grama, Trends in big data analytics, J. Parallel Distrib. Comput. 74 (7) (2014) 2561–2573. http://dx.doi.org/10.1016/j.jpdc.2014.01.003.
[25] M.D. Assunção, R.N. Calheiros, S. Bianchi, M.A. Netto, R. Buyya, Big Data computing and clouds: Trends and future directions, J. Parallel Distrib. Comput. 79–80 (2015) 3–15. http://dx.doi.org/10.1016/j.jpdc.2014.08.003.
[26] V. Chang, G. Wills, A model to compare cloud and non-cloud storage of Big Data, Future Gener. Comput. Syst. (2015) http://dx.doi.org/10.1016/j.future.2015.10.003.
[27] K. Grolinger, E. Mezghani, M.A. Capretz, E. Exposito, Collaborative knowledge as a service applied to the disaster management domain, Int. J. Cloud Comput. 4 (1) (2015) 5. http://dx.doi.org/10.1504/IJCC.2015.067706.
[28] J. Dean, S. Ghemawat, MapReduce: simplified data processing on large clusters, Commun. ACM 51 (1) (2008) 107–113. http://dx.doi.org/10.1145/1327452.1327492.
[29] A. Brito, A. Martin, T. Knauth, S. Creutz, D. Becker, S. Weigert, C. Fetzer, Scalable and Low-Latency Data Processing with Stream MapReduce, in: 2011 IEEE Third International Conference on Cloud Computing Technology and Science, IEEE, 2011, pp. 48–58. http://dx.doi.org/10.1109/CloudCom.2011.17.
[30] A.M. Aly, A. Sallam, B.M. Gnanasekaran, L.-V. Nguyen-Dinh, W.G. Aref, M. Ouzzani, A. Ghafoor, M3: Stream processing on main-memory mapreduce, in: 2012 IEEE 28th International Conference on Data Engineering, IEEE, 2012, pp. 1253–1256. http://dx.doi.org/10.1109/ICDE.2012.120.
[31] Storm, distributed and fault-tolerant realtime computation. http://storm-project.net/.
[32] L. Neumeyer, B. Robbins, A. Nair, A. Kesari, S4: Distributed Stream Computing Platform, in: 2010 IEEE International Conference on Data Mining Workshops, IEEE, 2010, pp. 170–177. http://dx.doi.org/10.1109/ICDMW.2010.172.
[33] I. Legrand, H. Newman, The MONARC toolset for simulating large network-distributed processing systems, in: 2000 Winter Simulation Conference Proceedings, vol. 2, IEEE, 2000, pp. 1794–1801. http://dx.doi.org/10.1109/WSC.2000.899171.
[34] T. Guérout, T. Monteil, G. Da Costa, R. Neves Calheiros, R. Buyya, M. Alexandru, Energy-aware simulation with DVFS, Simul. Model. Pract. Theory 39 (2013) 76–91. http://dx.doi.org/10.1016/j.simpat.2013.04.007.
[35] N. Grozev, R. Buyya, Performance modelling and simulation of three-tier applications in cloud and multi-cloud environments, Comput. J. 58 (1) (2015) 1–22. http://dx.doi.org/10.1093/comjnl/bxt107.
[36] The Network Simulator - ns-2. http://www.isi.edu/nsnam/ns/.
[37] T. Heinze, Z. Jerzak, G. Hackenbroich, C. Fetzer, Latency-aware elastic scaling for distributed data stream processing systems, in: Proceedings of the 8th ACM International Conference on Distributed Event-Based Systems, DEBS'14, ACM Press, New York, NY, USA, 2014, pp. 13–22. http://dx.doi.org/10.1145/2611286.2611294.
[38] M. Hong, M. Riedewald, C. Koch, J. Gehrke, A. Demers, Rule-based multi-query optimization, in: Proceedings of the 12th International Conference on Extending Database Technology, EDBT'09, ACM Press, New York, NY, USA, 2009, p. 120. http://dx.doi.org/10.1145/1516360.1516376.
[39] G.T. Lakshmanan, Y. Li, R. Strom, Placement strategies for internet-scale data stream systems, IEEE Internet Comput. 12 (6) (2008) 50–60. http://dx.doi.org/10.1109/MIC.2008.129.
[40] B. Babcock, S. Babu, R. Motwani, M. Datar, Chain: Operator Scheduling for Memory Minimization in Data Stream Systems, 2003, pp. 253–264. http://dx.doi.org/10.1145/872757.872789.

[41] Powersmiths, Powersmiths WOW—Build A More Sustainable Future. http://www.powersmithswow.com/.
[42] R. Ranjan, Streaming big data processing in datacenter clouds, IEEE Cloud Comput. 1 (1) (2014) 78–83. http://dx.doi.org/10.1109/MCC.2014.22.
[43] L. Aniello, R. Baldoni, L. Querzoni, Adaptive Online Scheduling in Storm, ACM Press, New York, NY, USA, 2013, pp. 207–218. http://dx.doi.org/10.1145/2488222.2488267.
[44] V. Chang, Y.-H. Kuo, M. Ramachandran, Cloud Computing Adoption Framework—a security framework for business clouds, Future Gener. Comput. Syst. (2015) http://dx.doi.org/10.1016/j.future.2015.09.031.
[45] Amazon, Amazon Elastic Compute Cloud (EC2). http://aws.amazon.com/ec2.

Wilson A. Higashino is a dual Ph.D. degree candidate in the Software Engineering program at Western University, Canada and in the Computer Science program at University of Campinas, Brazil. He obtained his M.Sc. and B.Sc. degrees in Computer Science from University of Campinas. He has also worked in industry for more than eight years, mostly involved with the definition of architecture for solutions with high performance and availability requirements. His current research interests include cloud computing, Big Data and complex event processing.

Miriam Capretz is a Professor in the Department of Electrical and Computer Engineering at Western University, Canada. Before joining Western University, she was with the University of Aizu, Japan. She received her B.Sc. and M.E.Sc. degrees from UNICAMP, Brazil and her Ph.D. from the University of Durham, UK. She has been working in the software engineering area for more than 30 years and has been involved with the organization of workshops and symposia and has been serving on program committees in international conferences. Her current research interests include cloud computing, Big Data, service oriented architecture, privacy and security.

Luiz F. Bittencourt is an Assistant Professor at the University of Campinas (UNICAMP), Brazil. He received his Bachelor's degree in Computer Science from the Federal University of Parana, Brazil, in 2004, and his Masters (2006) and Ph.D. (2010) degrees from UNICAMP, Brazil. Luiz has been awarded the IEEE Communications Society Latin America Young Professional Award 2013. He has organized cloud computing workshops (MGC, CloudAM, WCGA), and participated in several technical program committees (CCGrid, LatinCloud, CLOSER, Cloud and Green Computing). His main interests are in the areas of virtualization and scheduling in grids and clouds.
