(Series in Computer Science (New York, N.Y.) ) Giorgio C. Buttazzo - Soft Real-Time Systems - Predictability vs. Efficiency (2005, Springer)
Giorgio Buttazzo
University of Pavia
Pavia, Italy
Giuseppe Lipari
Scuola Superiore Sant'Anna
Pisa, Italy
Luca Abeni
MBI Group
Pisa, Italy
Springer
ISBN 0-387-23701-1
springeronline.com
CONTENTS
Preface
INTRODUCTION
1.1 Basic terminology
1.2 From hard to soft real-time systems
1.3 Providing support for soft real-time systems
OVERLOAD MANAGEMENT
2.1 Introduction
2.2 Load definitions
2.3 Admission control methods
2.4 Performance degradation methods
2.5 Service adaptation
2.6 Job skipping
2.7 Period adaptation
TEMPORAL PROTECTION
3.1 Problems without temporal protection
3.2 Providing temporal protection
3.3 The GPS model
3.4 Proportional share scheduling
3.5 Resource reservation techniques
3.6 Resource reservations in dynamic priority systems
3.7 Temporal guarantees
3.8 Resource reservations in operating system kernels
MULTI-THREAD APPLICATIONS
4.1 The thread model
4.2 Global approaches
4.3 Partition-based approaches
4.4 Concluding remarks and open problems
SYNCHRONIZATION PROTOCOLS
5.1 Terminology and notation
5.2 Shared resource in real-time systems
5.3 Synchronization protocols for hard real-time systems
5.4 Shared resources in soft real-time systems
5.5 Extending resource reservation with the SRP
5.6 Resource constraints in dynamic systems
5.7 Concluding remarks
RESOURCE RECLAIMING
6.1 Problems with reservations
6.2 The CASH algorithm
6.3 The GRUB algorithm
6.4 Other forms of reclaiming
QOS MANAGEMENT
7.1 The QoS-based resource allocation model
7.2 Static vs. dynamic resource management
7.3 Integrating design & scheduling issues
7.4 Smooth rate adaptation
FEEDBACK SCHEDULING
8.1 Controlling the number of missed deadlines
8.2 Adaptive reservations
8.3 Application level adaptation
8.4 Workload estimators
STOCHASTIC SCHEDULING
9.1 Background and definitions
9.2 Statistical analysis of classical algorithms
9.3 Real-time queueing theory
9.4 Novel algorithms for stochastic scheduling
9.5 Reservations and stochastic guarantee
REFERENCES
INDEX
PREFACE
Providing an appropriate support at the operating system level to such emerging ap-
plications is not trivial. In fact, whereas general purpose operating systems are not
predictable enough for guaranteeing the required performance, the classical hard real-
time design paradigm, based on worst-case assumptions and static resource allocation,
would be too inefficient in this context, causing a waste of the available resources and
increasing the overall system cost. For this reason, new methodologies have been in-
vestigated for achieving more flexibility in handling task sets with dynamic behavior,
as well as higher efficiency in resource exploitation.
This book illustrates the typical characteristics of soft real-time applications and presents
some recent methodologies proposed in the literature to support this kind of applications.
Chapter 1 introduces the basic terminology and concepts used in the book and clearly
illustrates the main characteristics that distinguish soft real-time computing from other
types of computation.
Chapter 3 introduces the concept of temporal protection, a mechanism for isolating the
temporal behavior of a task to prevent reciprocal interference with the other system
activities.
...
Chapter 4 deals with the problem of executing several independent multi-thread applications
on the same machine, presenting some methodologies to partition the processor
into several slower virtual processors, in such a way that each application can be
guaranteed independently of the others.
Acknowledgments
This work is the result of several years of research activity in the field of real-time
systems. The majority of the material presented in this book is taken from research
papers and has been elaborated to be presented in a simplified form and with a uniform
structure. Though this book carries the names of four authors, it has been positively
influenced by a number of people who gave a substantial contribution in this emerging
field. The authors would like to acknowledge Enrico Bini, for his insightful discussions
on schedulability analysis, Paolo Gai for his valuable work on kernel design and algorithm
implementation, and Luigi Palopoli for his contribution on integrating real-time
and control issues. Finally, we would like to thank the Kluwer editorial staff for the
support we received during the preparation of the manuscript.
INTRODUCTION
In this chapter we explain why soft real-time computing has been intensively investigated
in recent years to support a set of application domains for which the
hard real-time approach is not suited. Examples of such application domains include
multimedia systems, monitoring apparatuses, robotic systems, real-time graphics, in-
teractive games, and virtual reality.
To better understand the difference between classical hard real-time applications and
soft real-time applications, we first introduce some basic terminology that will be used
throughout the book, then we present the classical design approach used for hard real-time
systems, and then describe the characteristics of some soft real-time applications.
We then identify the major problems that a hard real-time approach can cause in these
systems, and finally we derive a set of features that a soft real-time system should have
in order to provide efficient support for this kind of application.
1.1 BASIC TERMINOLOGY
In the common sense, a real-time system is a system that reacts to an event within a
limited amount of time. So, for example, in a web page reporting the state of a Formula
1 race, we say that the race state is reported in real-time if the car positions are updated
"as soon as" there is a change. In this particular case, the expression "as soon as" does
not have a precise meaning and typically refers to intervals of a few seconds.
When a computer is used to control a physical device (e.g., a mobile robot), the time
needed by the processor to react to events in the environment may significantly affect
the overall system's performance. In the example of a mobile robot system, a correct
maneuver performed too late could cause serious problems to the system and/or the
environment. For instance, if the robot is running at a certain speed and an obstacle
is detected along the robot path, the action of pressing the brakes or changing the
robot trajectory should be performed within a maximum delay (which depends on the
obstacle distance and on the robot speed), otherwise the robot might be unable to
avoid the obstacle and would crash.
Keeping the previous example in mind, a real-time system can be more precisely
defined as a computing system in which computational activities must be performed
within predefined timing constraints. Hence, the performance of a real-time system
depends not only on the functional correctness of the results of computations, but also
on the time at which such results are produced.
The word real indicates that the system time (that is, the time represented inside the
computing system) should always be synchronized with the external time reference
with which all time intervals in the environment are measured.
Release time $r_i$: is the time at which a task becomes ready for execution; it is
also referred to as arrival time and denoted by $a_i$;
Start time $s_i$: is the time at which a task starts its execution for the first time;
Computation time $C_i$: is the time necessary to the processor for executing the
task without interruption;
Finishing time $f_i$: is the time at which a task finishes its execution;
Response time $R_i$: is the time elapsed from the task release time to its finishing
time ($R_i = f_i - r_i$);
Absolute deadline $d_i$: is the time before which a task should be completed;
Relative deadline $D_i$: is the time, relative to the release time, before which a task
should be completed ($D_i = d_i - r_i$).
Figure 1.1 Typical timing parameters of a real-time task
Such parameters are schematically illustrated in Figure 1.1, where the release time is
represented by an up arrow and the absolute deadline is represented by a down arrow.
Slack time or Laxity: denotes the interval between the finishing time and the
absolute deadline of a task ($slack_i = d_i - f_i$); it represents the maximum time a
task can be delayed to still finish within its deadline;
Lateness $L_i$: $L_i = f_i - d_i$ represents the completion delay of a task with respect
to its deadline; note that if a task completes before its deadline, its lateness is
negative;
Tardiness or Exceeding time $E_i$: $E_i = \max(0, L_i)$ is the time a task stays active
after its deadline.
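The derived quantities listed above follow directly from the release time, finishing time, and absolute deadline. The following minimal Python sketch computes them for a single job; the class name `Job` and its field names are illustrative choices, not notation from the book.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """One job, described by the three measured instants (illustrative names)."""
    release: float    # r: time at which the job becomes ready
    finish: float     # f: time at which the job completes
    deadline: float   # d: absolute deadline

    @property
    def response_time(self) -> float:
        return self.finish - self.release          # R = f - r

    @property
    def relative_deadline(self) -> float:
        return self.deadline - self.release        # D = d - r

    @property
    def slack(self) -> float:
        return self.deadline - self.finish         # slack = d - f

    @property
    def lateness(self) -> float:
        return self.finish - self.deadline         # L = f - d (negative if early)

    @property
    def tardiness(self) -> float:
        return max(0.0, self.lateness)             # E = max(0, L)


early = Job(release=2, finish=7, deadline=10)
late = Job(release=0, finish=12, deadline=10)
```

Here `early` completes 3 time units before its deadline, so its lateness is negative and its tardiness is zero, whereas `late` has both lateness and tardiness equal to 2.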
If the same computational activity needs to be executed several times on different data,
then a task is characterized as a sequence of multiple instances, or jobs. In general,
a task $\tau_i$ is modeled as a (finite or infinite) stream of jobs, $\tau_{i,j}$ ($j = 1, 2, \ldots$), each
characterized by a release time $r_{i,j}$, an execution time $c_{i,j}$, a finishing time $f_{i,j}$, and
an absolute deadline $d_{i,j}$.
A task is said to be firm if only a limited number of jobs are allowed to miss their
deadline. In [KS95], Koren and Shasha defined a firm task model in which only one
job every $S$ is allowed to miss its deadline. When a job misses its deadline, it is
aborted and the next $S - 1$ jobs must be guaranteed to complete within their deadlines.
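As an illustration of the firm task model just described, the following Python sketch checks whether a recorded sequence of job outcomes respects the rule that each missed deadline is followed by at least S - 1 jobs that meet theirs. The function name and the boolean encoding of outcomes are assumptions made for this example only.

```python
def respects_skip_rule(met_deadline, s):
    """Check the one-miss-every-S rule on a list of job outcomes.

    met_deadline[k] is True if job k met its deadline, False if it missed.
    After a miss at index k, jobs k+1 .. k+s-1 must all meet their deadlines,
    so two misses must be at least s positions apart.
    """
    last_miss = None
    for k, ok in enumerate(met_deadline):
        if not ok:
            if last_miss is not None and k - last_miss < s:
                return False
            last_miss = k
    return True


ok_pattern = [True, False, True, True, False, True]   # misses 3 jobs apart
bad_pattern = [True, False, True, False]              # misses 2 jobs apart
```

With S = 3, `ok_pattern` is acceptable while `bad_pattern` violates the rule.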
A task is said to be soft if the value of the produced result gracefully degrades with
its response time. For some applications, there is no deadline associated with soft
computations. In this case, the objective of the system is to reduce their response times
as much as possible. In other cases, a soft deadline can be associated with each job,
meaning that the job should complete before its deadline to achieve its best performance.
However, if a soft deadline is missed, the system keeps working at a degraded level
of performance. To precisely evaluate the performance degradation caused by a soft
deadline miss, a performance value function can be associated with each soft task, as
described in Chapter 2.
Finally, a task is said to be non real-time if the value of the produced result does not
depend on the completion time of its computation.
When job activations are triggered by time and are separated by a fixed interval of time,
the task is said to be periodic. More precisely, a periodic task $\tau_i$ is a time-triggered
task in which the first job $\tau_{i,1}$ is activated at time $\Phi_i$, called the task phase, and each
subsequent job $\tau_{i,j+1}$ is activated at time $r_{i,j+1} = r_{i,j} + T_i$, where $T_i$ is the task period.
If $D_i$ is the relative deadline associated with each job, the absolute deadline of job $\tau_{i,j}$
can be computed as:
$$r_{i,j} = \Phi_i + (j - 1) T_i$$
$$d_{i,j} = r_{i,j} + D_i$$
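The two relations above can be turned into a short generator of job parameters. The sketch below is only an illustration; the function name `periodic_jobs` and the sample values of phase, period, and relative deadline are assumptions, not taken from the book.

```python
def periodic_jobs(phase, period, relative_deadline, count):
    """Return (release, absolute_deadline) pairs for jobs j = 1 .. count.

    release  = phase + (j - 1) * period
    deadline = release + relative_deadline
    """
    jobs = []
    for j in range(1, count + 1):
        release = phase + (j - 1) * period
        jobs.append((release, release + relative_deadline))
    return jobs


# Hypothetical task: phase 2, period 5, relative deadline 4.
first_three = periodic_jobs(phase=2, period=5, relative_deadline=4, count=3)
```

For these sample values the first three jobs are released at times 2, 7, and 12, with absolute deadlines 6, 11, and 16.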
If job activation times are not regular, the task is said to be aperiodic. More precisely,
an aperiodic task $\tau_i$ is a task in which the activation time of job $\tau_{i,k+1}$ is greater than
or equal to that of its previous job $\tau_{i,k}$. That is, $r_{i,k+1} \ge r_{i,k}$.
If there is a minimum separation time between successive jobs of an aperiodic task,
the task is said to be sporadic. More precisely, a sporadic task $\tau_i$ is a task in which
the difference between the activation times of any two adjacent jobs $\tau_{i,k}$ and $\tau_{i,k+1}$ is
greater than or equal to $T_i$. That is, $r_{i,k+1} \ge r_{i,k} + T_i$. The $T_i$ parameter is called the
minimum interarrival time.
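The distinction between periodic and sporadic activation patterns can be checked on a recorded sequence of release times. The following Python sketch is a minimal illustration; the function names are hypothetical.

```python
def is_periodic(releases, period):
    """True if every pair of consecutive releases is exactly `period` apart."""
    return all(b - a == period for a, b in zip(releases, releases[1:]))


def is_sporadic(releases, min_interarrival):
    """True if consecutive releases are at least `min_interarrival` apart."""
    return all(b - a >= min_interarrival for a, b in zip(releases, releases[1:]))


regular = [1, 4, 7, 10]       # exactly 3 time units apart
irregular = [0, 5, 12, 17]    # gaps of 5, 7, 5
burst = [0, 3, 4]             # gaps of 3 and 1
```

Note that a periodic sequence with period T also satisfies the sporadic constraint with minimum interarrival time T, while the converse does not hold.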
In a real-time system, however, the processor load also depends on tasks' timing con-
straints. The same set of tasks with given computation requirements and arrival patterns
will cause a higher load if it has to be executed with more stringent timing constraints.
To measure the load of a real-time system in a given interval of time, Baruah, Howell
and Rosier [BMR90] introduced the concept of processor demand, defined as follows:
Definition 1.1 The processor demand $g(t_1, t_2)$ in an interval of time $[t_1, t_2]$ is the
amount of computation that has been released at or after $t_1$ and must be completed
within $t_2$.
Hence, the processor demand $g_i(t_1, t_2)$ of task $\tau_i$ is equal to the computation time
requested by those jobs whose arrival times and deadlines are within $[t_1, t_2]$. That is:
$$g_i(t_1, t_2) = \sum_{r_{i,j} \ge t_1,\; d_{i,j} \le t_2} c_{i,j}$$
For example, given the set of jobs illustrated in Figure 1.2, the processor demand in
the interval $[t_a, t_b]$ is given by the sum of computation times denoted with dark gray,
that is, those jobs that arrived at or after $t_a$ and have deadlines at or before $t_b$.
The total processor demand $g(t_1, t_2)$ of a task set in an interval of time $[t_1, t_2]$ is equal
to the sum of the individual demands of each task. That is,
$$g(t_1, t_2) = \sum_{i=1}^{n} g_i(t_1, t_2)$$
Figure 1.2 Processor demand for a set of jobs
Then, the processor workload $\rho(t_1, t_2)$ in an interval $[t_1, t_2]$ can be defined as the ratio of the
processor demand in that interval and the length of the interval:
$$\rho(t_1, t_2) = \frac{g(t_1, t_2)}{t_2 - t_1}$$
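The definitions of processor demand and workload can be sketched in a few lines of Python. Each job is represented here as a tuple (release, computation time, absolute deadline); this representation, the function names, and the sample job set are assumptions made for illustration.

```python
def processor_demand(jobs, t1, t2):
    """Sum the computation times of jobs released at or after t1
    whose absolute deadlines are at or before t2.

    jobs: iterable of (release, computation, absolute_deadline) tuples.
    """
    return sum(c for r, c, d in jobs if r >= t1 and d <= t2)


def workload(jobs, t1, t2):
    """Processor demand in [t1, t2] divided by the interval length."""
    return processor_demand(jobs, t1, t2) / (t2 - t1)


# Hypothetical job set: the second job has a deadline outside [0, 10].
sample_jobs = [(0, 2, 5), (1, 3, 12), (4, 1, 8), (6, 2, 10)]
demand = processor_demand(sample_jobs, 0, 10)   # 2 + 1 + 2
load = workload(sample_jobs, 0, 10)
```

In this example only three of the four jobs fall entirely within the interval, giving a demand of 5 time units and a workload of 0.5.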
In the special case of a periodic hard task, the load produced by the task is also called
the task utilization ($U_i$) and can be computed as the ratio between the task worst-case
computation time $C_i$ and its period $T_i$:
$$U_i = \frac{C_i}{T_i}$$
Then, the total processor utilization is defined as the sum of the individual tasks'
utilizations:
$$U_p = \sum_{i=1}^{n} U_i$$
For a set of periodic tasks, the overload condition is reached when the processor
utilization $U_p = \sum_{i=1}^{n} U_i$ exceeds one. Notice, however, that, depending on the adopted
scheduling algorithm, tasks may also miss deadlines when the processor is not
overloaded (as in the case of the Rate Monotonic algorithm, which has a schedulability bound
less than one [LL73]).
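To make the difference between the overload condition and the Rate Monotonic bound concrete, the following Python sketch computes the total utilization of a periodic task set and compares it with the classical Liu and Layland bound n(2^(1/n) - 1) from [LL73]. The task parameters are hypothetical, and the bound is only a sufficient test: a task set whose utilization exceeds it is not necessarily unschedulable.

```python
def total_utilization(tasks):
    """tasks: list of (worst_case_computation_time, period) pairs."""
    return sum(c / t for c, t in tasks)


def is_overloaded(tasks):
    """Overload condition for periodic tasks: total utilization above one."""
    return total_utilization(tasks) > 1.0


def rm_utilization_bound(n):
    """Liu and Layland sufficient bound for Rate Monotonic with n tasks."""
    return n * (2 ** (1 / n) - 1)


# Hypothetical task set: (C, T) pairs.
tasks = [(1, 4), (2, 6), (1, 8)]
u = total_utilization(tasks)                   # 1/4 + 2/6 + 1/8
passes_rm_test = u <= rm_utilization_bound(len(tasks))
```

For this task set the utilization is about 0.708, which is below the three-task bound of about 0.780, so the sufficient Rate Monotonic test succeeds.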
While the overload is a condition related to the processor, the overrun is a condition
related to a single task.
A task is said to overrun when there exists an interval of time in which its computational
demand $g_i$ exceeds its expected bandwidth $U_i$. This condition may occur either because
jobs arrive more frequently than expected (activation overrun), or because computation
times exceed their expected value (execution overrun). Notice that a task overrun does
not necessarily cause an overload.
In this section we describe the typical characteristics of hard and soft real-time
applications, and present some concrete examples to illustrate their differences in terms of
application requirements and execution behavior.
For example, a defense missile could miss its target if launched a few milliseconds
before or after the correct time. Similarly, a control system could become unstable if
the control commands are not delivered at a given rate. For this reason, in such systems,
computational activities are modeled as tasks with hard deadlines, that must be met in
all predicted circumstances. A task finishing after its deadline is considered not only
late, but also wrong, since it could jeopardize the whole system behavior.
In order to guarantee a given performance, hard real-time systems are designed under
worst-case scenarios, derived by making pessimistic assumptions on system behav-
ior and on the environment. Moreover, to avoid unpredictable delays due to resource
contention, all resources are statically allocated to tasks based on their maximum re-
quirements. Such a design approach allows system designers to perform an off-line
analysis to guarantee that the system is able to achieve a minimum desired performance
in all operating conditions that have been predicted in advance.
A crucial phase in performing the off-line guarantee is the evaluation of the worst-case
computation times (WCETs) of all computational activities. This can be done either
experimentally, by measuring the maximum execution time of each task over a large
amount of input data, or analytically, by analyzing the source code, identifying the
longest path, and computing the time needed to execute it on the specific processor
platform. Neither method is precise. In fact, the experimental approach fails
in that only a limited number of input data can be generated during testing, hence the
worst-case execution may not be found. On the other hand, the analytical approach has
to make so many assumptions on the low-level mechanisms present in the computer
architecture that the estimation becomes too pessimistic. In fact, in modern computer
architectures, the execution time of an instruction depends on several factors, such as
the prefetch queue, the DMA, the cache size, and so on. The effects of such mechanisms
on task execution are difficult to predict, because they also depend on the previous
computation and on the actual data. As a consequence, deriving a precise estimation
of the WCET is very difficult (if not impossible). The WCET estimations used in
practice are not precise and are affected by large errors (typically more than 20%).
This means that to have an absolute off-line guarantee, all task execution times have
to be overestimated.
Once all computation times are evaluated, the feasibility of the system can be analyzed
using several guarantee algorithms proposed in the literature for different scheduling
algorithms and task models (see [But97] for a survey of guarantee tests). To simplify
the guarantee test and cope with peak load conditions, the schedulability analysis of a
task set is also performed under pessimistic assumptions. For example, a set of periodic
tasks is typically analyzed under the following assumptions:
All tasks start at the same time. This assumption simplifies the analysis because
it has been shown (both under fixed priorities [LL73] and dynamic priorities
[BMR90]) that synchronous activations generate the highest workload. Hence, if
the system is schedulable when tasks are synchronous, it is also schedulable when
they have different activation phases.
All jobs of a task have the same computation time. This assumption can be
reasonable for tasks having a very simple structure (no branches or loops). In
general, however, tasks have loops and branches inside their code, which depend
on specific data that cannot be predicted in advance. Hence, the computation
time of a job is highly variable. As a consequence, modeling a task with a fixed
computation time equal to the maximum execution time of all its jobs leads to a
very pessimistic estimate, which causes a waste of the processing resources.
Figure 1.3 Decoding times for a sequence of frames taken from Star Wars
In addition, often, such systems operate in more dynamic environments, where tasks
can be created or killed at runtime, or task parameters can change from one job to the
other.
There are many soft real-time applications in which the worst-case duration of some
tasks is rare but much longer than the average case. In multimedia systems, for instance,
the time for decoding a video frame in MPEG players can vary significantly as a function
of the data contained in the frames. Figure 1.3 shows the decoding times of frames in
a specific sequence of the Star Wars movie.
As another example of task with variable computation time, consider a visual tracking
system where, in order to increase responsiveness, the moving target is searched in a
small window centered in a predicted position, rather than in the entire visual field.
If the target is not found in the predicted area, the search has to be performed in a
larger region until, eventually, the entire visual field is scanned in the worst case. If
the system is well designed, the target is found very quickly in the predicted area most
of the time. Thus, the worst-case situation is very rare, but very expensive in terms of
computational resources (computation time increases quadratically as a function of the
number of trials). In this case, an off-line guarantee based on WCETs would drastically
reduce the frequency of the tracking task, causing a severe performance degradation
with respect to a soft guarantee based on the average execution time.
Just to give a concrete example, consider a videocamera producing images with 512x512
pixels, where the target is a round spot, with a 30-pixel diameter, moving inside the
visual field. In this scenario, if $U_i$ is the processor utilization required to track the target
in a small window of 64x64 pixels at a rate $T_i$, a worst-case guarantee would require
the tracking task to run 64 times slower in order to demand the same bandwidth in the
entire visual field (which is 64 times bigger). Clearly, in this application, it is more
convenient to perform a less pessimistic guarantee in order to increase the tracking rate
and accept some sporadic overrun as a natural system behavior.
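The factor of 64 quoted above is simply the ratio between the two image areas. The short Python sketch below reproduces the arithmetic, under the assumption (implicit in the example) that the tracking cost grows linearly with the number of pixels scanned; the 20 ms window period is a hypothetical value used only to show the scaling.

```python
def slowdown_factor(full_side, window_side):
    """Ratio between the full visual field area and the search window area."""
    return (full_side * full_side) / (window_side * window_side)


factor = slowdown_factor(512, 64)         # (512 * 512) / (64 * 64)

# Hypothetical period for tracking in the 64x64 window, in milliseconds.
window_period_ms = 20
# Period needed to scan the whole field with the same processor bandwidth.
worst_case_period_ms = factor * window_period_ms
```

With these numbers a task that could run every 20 ms on the small window would have to run every 1280 ms if it were guaranteed for the full-field worst case.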
In other situations, periodic tasks could be executed at different rates in different op-
erating conditions. For example, in a flight control system, the sampling rate of the
altimeters is a function of the current altitude of the aircraft: the lower the altitude, the
higher the sampling frequency. A similar need arises in robotic applications in which
robots have to work in unknown environments, where trajectories are planned based on
the current sensory information. If a robot is equipped with proximity sensors, in order
to maintain a desired performance, the acquisition rate of the sensors must increase
whenever the robot is approaching an obstacle. Another example of computation with
variable activation rate is engine control, where computation is triggered by the shaft
rotation angle, hence task activation is a function of the motor speed.
In all these examples, task parameters are not fixed, as typically assumed for a hard
task, but vary from one job to the next, depending on the data to be processed.
The problem becomes even more significant when the real-time software runs on top of
modern hardware platforms, which include low-level mechanisms such as pipelining,
prefetching, caching, or DMA. In fact, although these mechanisms improve the average
behavior of tasks, they worsen the worst case, making it much more difficult to provide
precise estimates of the worst-case computation times.
To provide more precise information about the behavior of such dynamic computational
activities, one could describe a parameter through a probability distribution
derived from experimental data. Figure 1.4 illustrates the probability distribution function
of job computation times for the process illustrated in Figure 1.3.
Figure 1.4 Distribution of job computation times for the frame sequence shown in Figure
1.3 (horizontal axis: decoding time in microseconds).
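An empirical distribution of this kind can be built from measured execution times with a simple histogram. The Python sketch below is a minimal illustration; the sample values are invented for the example and are not the decoding times plotted in the figures.

```python
from collections import Counter


def empirical_distribution(samples, bin_width):
    """Map each bin start to the fraction of samples falling in that bin."""
    bins = Counter((s // bin_width) * bin_width for s in samples)
    n = len(samples)
    return {start: count / n for start, count in sorted(bins.items())}


def overrun_probability(samples, budget):
    """Fraction of observed jobs whose computation time exceeds `budget`."""
    return sum(1 for s in samples if s > budget) / len(samples)


# Hypothetical measured computation times (arbitrary time units).
samples = [3, 4, 4, 5, 3, 9, 4, 3, 5, 10]
dist = empirical_distribution(samples, bin_width=2)
p_over = overrun_probability(samples, budget=5)
```

In this invented data set most jobs take between 2 and 6 time units, while a budget of 5 units would be exceeded by 20% of the jobs, which is the kind of information a single worst-case value cannot convey.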
In this section we explain why classical general purpose operating systems are not
suited for supporting real-time applications. We also explain the limitations of the hard
real-time systems and finally we conclude the section with a list of desired features
that should be included in a kernel for providing efficient support for soft real-time
applications.
1.3.1 PROBLEMS WITH NON REAL-TIME SYSTEMS
The fact that a soft real-time application may tolerate a certain degree of performance
degradation does not mean that timing constraints can be completely ignored. For
example, in a multimedia application, a quality of service level needs to be enforced
on the computational tasks to satisfy a desired performance requirement. If too many
deadlines are missed, there is no way to keep the system performance above a certain
threshold.
First of all, they do not provide support for controlling explicit timing constraints.
System timers are available at a relatively low resolution, and the only kernel service
for handling time is given by the delay() primitive, which suspends the calling task for
a given interval of time. The problem with the delay primitive, however, is that, if a task
requires to be suspended for an interval of time equal to $\Delta$, the system only guarantees
that the calling task will be delayed by at least $\Delta$. When using shared resources, the
delay primitive can be very unpredictable, as shown in the example illustrated in Figure
1.5. Here, although in normal conditions (a) task $\tau_1$ has a slack equal to 4 units of time,
a delay of 2 time units causes the task to miss its deadline (b).
In general, if we cannot limit the number of intermediate priority tasks that can run
while the blocked task is waiting for the resource, its blocking time cannot be bounded,
preventing any performance guarantee on its execution. The priority inversion phenomenon
illustrated above can be solved using specific concurrency control protocols when
accessing shared resources, like the Priority Inheritance Protocol, the Priority Ceiling
Protocol [SRL90], or the Stack Resource Policy [Bak91]. Unfortunately, however,
these protocols are not yet available in all general purpose operating systems.
Figure 1.6 The priority inversion phenomenon (legend: normal execution, critical section,
blocked). In case (a) the high-priority task is blocked for at most the duration of the
critical section of the low-priority task. In case (b) it is also delayed by the entire
execution of a task having intermediate priority.
Figure 1.7 Effects of an execution overrun. In normal conditions (a) all tasks execute
within their deadlines, but a sequence of overruns in $\tau_1$ and $\tau_2$ may prevent $\tau_3$ from
executing (b).
When using non real-time systems, several kernel mechanisms can cause negative
effects on real-time computations. For example, typical message passing primitives
(like send and receive) available in kernels for intertask communication adopt a blocking
semantics when receiving a message from an empty queue or sending a message into
a full queue. If the blocking time cannot be bounded, the delay introduced in task
executions can be very high, preventing any performance guarantee. Moreover, a
blocking semantics also prevents communication among periodic tasks having different
frequencies.
First of all, as we already mentioned above, the use of worst-case assumptions would
cause a waste of resources, which would be underutilized for most of the time, just to
cope with some sporadic peak load condition. For applications with heavy computa-
tional load (e.g., graphical activities), such a waste would imply a severe performance
degradation or a significant increase of the system cost. Figure 1.8a shows that, whenever
the load has large variations, keeping the load peaks always below one causes the
average load to be very small (low efficiency). On the other hand, Figure 1.8b shows
that efficiency can only be increased at the cost of accepting some transient overload,
by allowing some peak load to exceed one, thus missing some deadlines.
Figure 1.8 Two different load conditions (load versus time): an underloaded system with
low average resource usage (a), and a system with transient overloads but high average
resource usage (b).
Another problem with the hard real-time approach is that, in many practical cases, a
precise estimation of WCETs is very difficult to achieve. In fact, several low level mech-
anisms present in modern computer architectures (such as interrupts, DMA, pipelining,
caching, and prefetching) introduce a non deterministic behavior in tasks' execution,
whose duration cannot be predicted in advance.
Even if a precise WCET estimation could be derived for each task, a worst-case
feasibility analysis would be very inefficient when task execution times have a high
variance. In this case, a classical off-line hard guarantee would waste the system's
computational resources for preserving the task set feasibility under sporadic peak
load situations, even though the average workload is much lower. Such a waste of
resources (which increases the overall system's cost) can be justified for very critical
applications (e.g., military defense systems or safety critical space missions), in which
a single deadline miss may cause catastrophic consequences. However, it does not
represent a good solution for those applications (the majority) in which several deadline
misses can be tolerated by the system, as long as the average task rates are guaranteed
off line.
Table 1.1 Task set parameters.
On the other hand, uncontrolled overruns are very dangerous if not properly handled,
since they may heavily interfere with the execution of other tasks, which could be
more critical. Consider for example the task set given in Table 1.1, where two tasks,
$\tau_1$ and $\tau_2$, have a constant execution time, whereas $\tau_3$ has an average computation
time ($C_3^{avg} = 3$) much lower than its worst-case value ($C_3^{max} = 10$). Here, if the
schedulability analysis is performed using the average computation time $C_3^{avg}$, the total
processor utilization becomes 0.92, meaning that the system is not overloaded; however,
under the Earliest Deadline First (EDF) algorithm [LL73] the tasks can experience long
delays during overruns, as illustrated in Figure 1.9. Similar examples can easily be
found also under fixed priority assignments (e.g., under the Rate Monotonic algorithm
[LL73]), when overruns occur in the high priority tasks (see for example the case
illustrated in Figure 1.7).
A general technique for limiting the effects of overruns is based on a resource reservation
approach [MST94b, TDS+95, Abe98], according to which each task is assigned
(off line) a fraction of the available resources and is handled by a dedicated server, which
prevents the served task from demanding more than the reserved amount. Although
such a method is essential for achieving predictability in the presence of tasks with
variable execution times, the overall system's performance becomes quite dependent
on a correct resource allocation. For example, if the CPU bandwidth allocated to a
task is much less than its average requested value, the task may slow down too much,
degrading the system's performance. On the other hand, if the allocated bandwidth is
much greater than the actual needs, the system will run with low efficiency, wasting
the available resources.
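The basic idea of a reservation, namely that a served task can never consume more than its reserved share, can be illustrated with a deliberately simplified budget accounting scheme. The Python sketch below is not any of the server algorithms cited above; it assumes a budget that is fully replenished at fixed period boundaries, and the class and method names are hypothetical.

```python
class Reservation:
    """Simplified reservation: `budget` time units in every `period` units.

    This is an illustrative accounting scheme with periodic replenishment,
    not a specific server algorithm from the literature.
    """

    def __init__(self, budget, period):
        self.max_budget = budget
        self.period = period
        self.budget = budget
        self.next_replenish = period

    @property
    def bandwidth(self):
        return self.max_budget / self.period

    def grant(self, now, requested):
        """Return how much of `requested` execution time may run at `now`."""
        # Replenish the budget for every period boundary already passed.
        while now >= self.next_replenish:
            self.budget = self.max_budget
            self.next_replenish += self.period
        granted = min(requested, self.budget)
        self.budget -= granted
        return granted


res = Reservation(budget=2, period=10)    # 20% of the processor
g1 = res.grant(now=0, requested=5)        # overrun: only 2 units are granted
g2 = res.grant(now=3, requested=1)        # budget exhausted until time 10
g3 = res.grant(now=10, requested=5)       # replenished: 2 units again
```

The example shows the trade-off described above: a job that needs 5 units is throttled to the reserved 2 units per period, which protects other tasks but slows the served task down if the reservation is smaller than its actual needs.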
Most of the features outlined above are described in detail in the remaining chapters of
this book. Aperiodic task scheduling is not treated in detail since it has been already
discussed in [But97] in the context of hard real-time systems.
OVERLOAD MANAGEMENT
2.1 INTRODUCTION
A system is said to be in overload when the computational demand of the task set
exceeds the available processing time. In a real-time system, an overload condition
causes one or more tasks to miss their deadline and, if not properly handled, it may
cause abrupt degradations of system performance.
Even when the system is properly designed, an overload can occur for different reasons,
such as a new task activation, a system mode change, the simultaneous arrival of asyn-
chronous events, a fault in a peripheral device, or the execution of system exceptions.
If the operating system is not conceived to handle overloads, the effect of a transient
overload can be catastrophic. There are cases in which the arrival of a new task can
cause all the previous tasks to miss their deadlines. Such an undesirable phenomenon,
called the Domino effect, is depicted in Figure 2.1.
Figure 2.1a shows a feasible schedule of a task set executed under EDF. However, if at
time $t_0$ task $\tau_0$ is executed, all the previous tasks miss their deadlines (see Figure 2.1b).
In general, under EDF, accepting a new task with deadline $d^*$ causes all tasks with
deadline longer than $d^*$ to be delayed. Similarly, under fixed priority scheduling, the
activation of a task $\tau_i$ with priority $P_i$ delays all tasks with lower priority. In order
to avoid domino effects, the operating system and the scheduling algorithm must be
explicitly designed to handle transient overloads in a controlled fashion, so that the
damage due to a deadline miss can be minimized.
In the real-time literature, several scheduling algorithms have been proposed to deal
with overloads. In 1984, Ramamritham and Stankovic [RS84] used EDF to dynamically
guarantee incoming work via on-line planning, and, if a newly arriving task could not
be guaranteed, the task was either dropped or distributed scheduling was attempted.
The dynamic guarantee performed in this approach had the effect of avoiding the
catastrophic effects of overload on EDF.
Figure 2.1 a. Feasible schedule with Earliest Deadline First, in normal load condition. b. Overload with domino effect due to the arrival of task $\tau_0$.
In 1986, Locke [Loc86] developed an algorithm that makes a best effort at scheduling
tasks based on earliest deadline with a rejection policy based on removing tasks with
the minimum value density. He also suggested that removed tasks remain in the system
until their deadline has passed. The algorithm computes the variance of the total slack
time in order to find the probability that the available slack time is less than zero. The
calculated probability is used to detect a system overload. If it is less than the user
prespecified threshold, the algorithm removes the tasks in increasing value density
order.
In Biyabani et al. [BSR88] the previous work of Ramamritham and Stankovic was
extended to tasks with different values, and various policies were studied to decide
which tasks should be dropped when a newly arriving task could not be guaranteed. This
work used values of tasks such as in Locke's work but used an exact characterization of
the first overload point rather than a probabilistic estimate that overload might occur.
Haritsa, Livny, and Carey [HLC91] presented the use of a feedback controlled EDF
algorithm for use in real-time database systems. The purpose of their work was to obtain
good average performance for transactions even in overload. Since they were working
in a database environment, they assumed no knowledge of transaction characteristics,
and they considered tasks with soft deadlines that are not guaranteed.
In Real-Time Mach [TWW87] tasks were ordered by EDF and overload was predicted
using a statistical guess. If overload was predicted, tasks with least value were dropped.
Other general work on overload in real-time systems has also been done. For exam-
ple, Sha [SLR88] showed that the Rate-Monotonic algorithm has poor properties in
overload. Thambidurai and Trivedi [TT89] studied transient overloads in fault-tolerant
real-time systems, building and analyzing a stochastic model for such systems. How-
ever, they provided no details on the scheduling algorithm itself. Schwan and Zhou
[SZ92] did on-line guarantees based on keeping a slot list and searching for free-time
intervals between slots. Once schedulability is determined in this fashion, tasks are
actually dispatched using EDF. If a new task cannot be guaranteed, it is discarded.
More recent approaches will be described in the following sections. Before presenting
specific methods and theoretical results on overload, the concept of overload, and, in
general, the meaning of computational load for real-time systems is defined in the next
section.
2.2 LOAD DEFINITIONS
In Chapter 1, the processor workload in an interval of time $[t_1, t_2]$ has been defined as
the ratio of the processor demand in that interval and the length of the interval:
$$\rho(t_1, t_2) = \frac{g(t_1, t_2)}{t_2 - t_1},$$
where $g(t_1, t_2)$ denotes the processor demand in $[t_1, t_2]$.
Computing the load using the previous definition, however, may not be practical,
because the number of intervals $[t_1, t_2]$ can be very large. When the task set consists only
of aperiodic activities, then a more effective method is to compute the instantaneous
load $\rho(t)$, originally introduced by Buttazzo and Stankovic in [BS95]. According to
this method, the load is computed at time $t$, based on the current set of active aperiodic
tasks, each characterized by a remaining computation time $c_i(t)$ and a deadline $d_i$. In
particular, the load at time $t$ is computed as
$$\rho(t) = \max_i \rho_i(t),$$
where
$$\rho_i(t) = \frac{\sum_{d_k \le d_i} c_k(t)}{d_i - t}.$$
Figure 2.2a shows an example of load calculation for a set of three real-time aperiodic
tasks. At time $t = 6$, when $\tau_1$ arrives, the loading factor $\rho_i(t)$ of each task is shown on
the right of the timeline, so the instantaneous load at time 6 is $\rho(6) = 0.833$. Figure 2.2b
shows the load as a function of time.
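As a minimal sketch of this computation, the following Python fragment evaluates the instantaneous load for a set of active aperiodic tasks, assuming that the loading factor of a task is the work due by its deadline divided by the time remaining until that deadline. The function name and the task values are illustrative and are not read from Figure 2.2.

```python
def instantaneous_load(t, tasks):
    """Return rho(t) = max_i rho_i(t) for the active aperiodic tasks.

    tasks: list of (remaining_time, absolute_deadline) pairs, each with
    absolute_deadline > t.
    """
    load = 0.0
    due = 0.0
    # Scanning by increasing deadline lets one running sum give, for each
    # task, the total work that must finish by its deadline.
    for remaining, deadline in sorted(tasks, key=lambda task: task[1]):
        due += remaining
        load = max(load, due / (deadline - t))
    return load


# Hypothetical task set observed at t = 6: loading factors 3/4, 5/6, 6/9.
example = [(3, 10), (2, 12), (1, 15)]
rho = instantaneous_load(6, example)
```

Sorting by deadline keeps the evaluation linear after the sort, since each loading factor reuses the running sum of the tasks with earlier deadlines.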
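For a set of synchronous periodic tasks with deadlines less than or equal to periods, the
processor demand can be computed from time $t = 0$ in an interval of length $L$ as
$$g(0, L) = \sum_{i=1}^{n} \left\lfloor \frac{L - D_i + T_i}{T_i} \right\rfloor C_i,$$
where $C_i$, $T_i$, and $D_i$ denote the computation time, period, and relative deadline of
task $\tau_i$, so that the load is the maximum of $g(0, L)/L$ over $L > 0$.
It is worth noticing that Baruah et al. [BMR90] showed that the maximum can be
computed for $L$ equal to task deadlines, up to a value $L_{max} = \min(H, L^*)$, where $H$
is the hyperperiod (i.e., the least common multiple of task periods) and
$$L^* = \frac{\sum_{i=1}^{n} (T_i - D_i) U_i}{1 - U},$$
with $U_i = C_i / T_i$ and $U = \sum_{i=1}^{n} U_i$.
Figure 2.2 a. Load calculation for $t = 6$ in a set of three real-time tasks. b. Load as a
function of time.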
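The periodic case can be sketched in the same way. The fragment below assumes the usual demand expression for synchronous tasks with $D_i \le T_i$, that is, $g(0, L) = \sum_i \lfloor (L - D_i + T_i)/T_i \rfloor C_i$, and checks only the absolute deadlines up to the hyperperiod; the tighter bound $L^*$ is left out for brevity. Function names and task parameters are hypothetical.

```python
from math import lcm  # available from Python 3.9


def demand(tasks, L):
    """Processor demand in [0, L] of synchronous periodic tasks (C, T, D)."""
    return sum(((L - D + T) // T) * C for C, T, D in tasks if L >= D)


def periodic_load(tasks):
    """Maximum of demand(L) / L over the absolute deadlines up to H."""
    H = lcm(*(T for _, T, _ in tasks))
    deadlines = {k * T + D
                 for _, T, D in tasks
                 for k in range(H // T)}
    return max(demand(tasks, L) / L for L in deadlines if L <= H)


tasks = [(1, 4, 3), (2, 6, 5)]   # hypothetical integer (C, T, D) triples
load = periodic_load(tasks)
```

Only deadlines need to be examined because the demand is a step function that changes at deadlines, so the ratio $g(0, L)/L$ can only decrease between two consecutive deadlines.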
The methods proposed in the literature for dealing with permanent overload conditions
can be grouped in two main categories:
On the contrary, when tasks can be activated dynamically and an overload occurs,
there are no algorithms that can guarantee a feasible schedule of the task set. Since
one or more tasks will miss their deadlines, it is preferable that late tasks be the less
important ones in order to achieve graceful degradation. Hence, in overload conditions,
distinguishing between time constraints and importance is crucial for the system. In
general, the importance of a task is not related to its deadline or its period; thus, a task
with a long deadline could be much more important than another one with an earlier
deadline. For example, in a chemical process, monitoring the temperature every ten
seconds is certainly more important than updating the clock picture on the user console
every second. This means that, during a transient overload, it is better to skip one or
more clock updates rather than missing the deadline of a temperature reading, since
this could have a major impact on the controlled environment.
In a real-time system, however, the actual value of a task also depends on the time at
which the task is completed; hence, the task importance can be better described by a
utility function. Figure 2.3 illustrates some utility functions that can be associated with
tasks in order to describe their importance. According to this view, a non-real-time
task, which has no time constraints, has a low constant value, since it always contributes
to the system value whenever it completes its execution. On the contrary, a hard task
contributes to a value only if it completes within its deadline, and, since a deadline miss
would jeopardize the behavior of the whole system, the value after its deadline can be
considered minus infinity in many situations. A task with a soft deadline, instead, can
still give a value to the system if executed after its deadline, although this value may
decrease with time. Then, there can be real-time activities, so-called firm, that do not
jeopardize the system but give zero value if completed after their deadline.
Figure 2.3 Utility functions that can be associated with a task to describe its importance.
Once the importance of each task has been defined, the performance of a scheduling
algorithm can be measured by accumulating the values of the task utility functions
computed at their completion time. Specifically, the cumulative value achieved by a
scheduling algorithm $A$ is defined as follows:
$$\Gamma_A = \sum_{i=1}^{n} v(f_i),$$
where $v(f_i)$ is the utility of task $i$ evaluated at its completion time $f_i$.
Notice that if a hard task misses its deadline, the cumulative value achieved by the
algorithm is minus infinity, even though all other tasks completed before their deadlines.
For this reason, all activities with hard timing constraints should be guaranteed a priori
by assigning them dedicated resources (including processors). If all hard tasks are
guaranteed a priori, the objective of a real-time scheduling algorithm should be to
guarantee a feasible schedule in underload conditions and maximize the cumulative
value of soft and firm tasks during transient overloads.
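The four kinds of utility functions and the resulting cumulative value can be sketched as follows. This is only an illustration: the linear decay used for soft tasks and its `grace` parameter are assumptions, since the text only says that the value of a late soft task may decrease with time, and all names are hypothetical.

```python
import math


def utility(kind, value, finish, deadline=None, grace=10.0):
    """Value gained when a task of the given kind completes at `finish`."""
    if kind == "non-real-time" or finish <= deadline:
        return value              # no deadline, or completed in time
    if kind == "hard":
        return -math.inf          # a miss jeopardizes the whole system
    if kind == "firm":
        return 0.0                # a late result is useless but harmless
    # soft: assumed linear decay to zero over `grace` time units
    return value * max(0.0, 1.0 - (finish - deadline) / grace)


def cumulative_value(jobs):
    """Sum of the utilities evaluated at the completion times."""
    return sum(utility(*job) for job in jobs)


# (kind, value, finish, deadline); the last job has no deadline.
jobs = [("firm", 5.0, 12, 10), ("soft", 8.0, 15, 10), ("non-real-time", 1.0, 40)]
total = cumulative_value(jobs)
```

Adding a single late hard job to `jobs` drives the sum to minus infinity, which is the reason why hard tasks are better guaranteed a priori than traded against value.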
Given a set of $n$ jobs $J_i(C_i, D_i, V_i)$, where $C_i$ is the worst-case computation time,
$D_i$ is the relative deadline, and $V_i$ is the importance value gained by the system when
the task completes within its deadline, the maximum cumulative value achievable on
the task set is clearly equal to the sum of all values $V_i$; that is, $\Gamma_{max} = \sum_{i=1}^{n} V_i$.
In overload conditions, this value cannot be achieved, since one or more tasks will
miss their deadlines. Hence, if $\Gamma^*$ is the maximum possible cumulative value that can
be achieved on the task set in overload conditions, the performance of a scheduling
algorithm $A$ can be measured by comparing the cumulative value $\Gamma_A$ obtained by $A$
with the maximum achievable value $\Gamma^*$. In this context, a scheduling algorithm that is
able to achieve a cumulative value equal to $\Gamma^*$ is an optimal algorithm.
From this definition, we can notice that the competitive factor is a real number $\varphi_A \in
[0, 1]$. If an algorithm $A$ has a competitive factor $\varphi_A$, it means that $A$ can achieve a
cumulative value at least $\varphi_A$ times the cumulative value achievable by the optimal
clairvoyant scheduler on any task set.
If the overload has an infinite duration, then no on-line algorithm can guarantee a competitive
factor greater than zero. In real situations, however, overloads are intermittent
and usually have a short duration; hence, it is desirable to use scheduling algorithms
with a high competitive factor. An important theoretical result found in [BKM+92]
is that there exists an upper bound on the competitive factor of any on-line algorithm.
This is stated by the following theorem.
Theorem 2.1 (Baruah et al.) In systems where the loading factor is greater than 2
($\rho > 2$) and tasks' values are proportional to their computation times, no on-line
algorithm can guarantee a competitive factor greater than 0.25.
In general, the bound on the competitive factor as a function of the load has been
computed in [BR91] and it is shown in Figure 2.4.
With respect to the strategy used to predict and handle overloads, most of the scheduling
methods proposed in the literature can be divided into three main classes, illustrated in
Figure 2.5:
Best Effort Scheduling. This class includes those algorithms with no prediction
for overload conditions. At its arrival, a new task is always accepted into the ready
queue, so the system performance can only be controlled through a proper priority
assignment.
Simple Admission Control. This class includes those algorithms in which the
load on the processor is controlled by an acceptance test executed at each task
arrival. Typically, whenever a new task enters the system, a guarantee routine
verifies the schedulability of the task set based on worst-case assumptions. If
the task set is found schedulable, the new task is accepted in the ready queue;
otherwise, it is rejected.
Robust Scheduling. This class includes those algorithms that separate timing
constraints and importance by considering two different policies: one for task
acceptance and one for task rejection. Typically, whenever a new task enters the
system, an acceptance test verifies the schedulability of the new task set based
on worst-case assumptions. If the task set is found schedulable, the new task is
accepted; otherwise, one or more tasks are rejected based on a different policy.
Notice that the simple admission control scheme is able to avoid domino effects by
sacrificing the execution of the newly arrived task. Basically, the acceptance test acts
as a filter that controls the load on the system and always keeps it less than one.
Once a task is accepted, the algorithm guarantees that it will complete by its deadline
(assuming that no task will exceed its estimated worst-case computation time). This
scheme, however, does not take task importance into account and, during transient
overloads, always rejects the newly arrived task, regardless of its value. In certain
conditions (such as when tasks have very different importance levels), this scheduling
strategy may exhibit poor performance in terms of cumulative value, whereas a robust
algorithm can be much more effective.
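The simple admission control scheme can be sketched as an acceptance test run at each arrival. The fragment assumes aperiodic tasks that are all ready at the time of the test, described by their remaining worst-case computation time and absolute deadline, and scheduled by EDF on one processor; the function names are hypothetical.

```python
def edf_guarantee(t, tasks):
    """True if every task (remaining_time, absolute_deadline) completes by
    its deadline when the set is executed by EDF starting at time t."""
    finish = t
    for remaining, deadline in sorted(tasks, key=lambda task: task[1]):
        finish += remaining            # worst-case finishing time
        if finish > deadline:
            return False
    return True


def admit(t, ready, new_task):
    """Simple admission control: on failure the newcomer is rejected."""
    if edf_guarantee(t, ready + [new_task]):
        ready.append(new_task)
        return True
    return False


ready = [(2, 5), (3, 9)]
first = admit(0, ready, (4, 10))   # fits: finishing times 2, 5, 9
second = admit(0, ready, (2, 6))   # would push the last task past 10
```

The test never looks at task values, which is exactly the limitation that robust schemes remove by using a separate rejection policy.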
In the best effort scheme, the cumulative value can be increased using suitable heuristics
for scheduling the tasks. For example, in the Spring kernel [SR87], Stankovic and
Ramamritham proposed to schedule tasks by an appropriate heuristic function that can
balance timing properties and importance values.
Figure 2.5 Scheduling schemes for handling overload situations. a. Best Effort scheduling. b. Admission control. c. Robust Scheduling.
Deadline tolerances also provide a sort of compensation for the pessimistic evaluation
of the worst-case execution time. For example, without tolerance, we could find that a
task set is not feasibly schedulable and hence decide to reject a task. But, in reality, the
system could have been scheduled within the tolerance levels. Another positive effect
of tolerance is that various tasks could actually finish before their worst-case times, so
a resource reclaiming mechanism could then compensate, and the tasks with tolerance
could actually finish on time.
In RED, the primary deadline plus the deadline tolerance provides a sort of secondary
deadline, used to run the acceptance test in overload conditions. Notice that having
a tolerance greater than zero is different than having a longer deadline. In fact, tasks
are scheduled based on their primary deadline but accepted based on their secondary
deadline. In this framework, a schedule is said to be strictly feasible if all tasks complete
before their primary deadline, whereas it is said to be tolerant if there exists some task
that executes after its primary deadline but completes within its secondary deadline.
The guarantee test performed in RED is formulated in terms of residual laxity $L_i$,
defined as the interval between the estimated finishing time ($f_i$) of a task and its primary
(absolute) deadline ($d_i$). All residual laxities can be efficiently computed in $O(n)$ in the
worst case.
To simplify the description of the RED guarantee test, we define the Exceeding Time
$E_i$ as the time that a task executes after its secondary deadline:
$$E_i = \max(0, -(L_i + M_i)),$$
where $M_i$ is the deadline tolerance of the task.
We also define the Maximum Exceeding Time $E_{max}$ as the maximum among all $E_i$'s
in the task set; that is, $E_{max} = \max_i(E_i)$. Clearly, a schedule will be strictly feasible
if and only if $L_i \ge 0$ for all tasks in the set, whereas it will be tolerant if and only if
there exists some $L_i < 0$, but $E_{max} = 0$.
By this approach we can identify which tasks will miss their deadlines and compute the
amount of processing time required above the capacity of the system - the maximum
exceeding time. This global view allows us to plan an action to recover from the overload
condition. Many recovery strategies can be used to solve this problem. The simplest
one is to reject the least-value task that can remove the overload situation. In general,
we assume that, whenever an overload is detected, some rejection policy will search
for a subset $J^*$ of least-value tasks that will be rejected to maximize the cumulative
value of the remaining subset. The RED acceptance test is shown in Figure 2.6.
        if (L_i + M_i < -E) then E = -(L_i + M_i);
    }
    if (E > 0) {
        <select a set J* of least-value tasks to be rejected>;
        <reject all tasks in J*>;
    }
}
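A possible rendering of this test in Python is sketched below. It assumes that all tasks are ready at the time of the test, that the residual laxity is $L_i = d_i - f_i$ with finishing times estimated under EDF, and that $M_i$ denotes the deadline tolerance. The rejection step uses the simplest policy mentioned above (the least-value task whose removal clears the overload) and falls back to dropping the least-value task; that fallback and all names are assumptions.

```python
def exceeding_time(t, tasks):
    """Maximum exceeding time for tasks (c, d, M, value) run by EDF from t."""
    e_max = 0
    finish = t
    for c, d, m, _ in sorted(tasks, key=lambda task: task[1]):
        finish += c                    # estimated finishing time f_i
        laxity = d - finish            # residual laxity L_i
        e_max = max(e_max, -(laxity + m))
    return e_max


def red_accept(t, tasks):
    """Return (kept, rejected) after the acceptance test at time t."""
    kept, rejected = list(tasks), []
    while kept and exceeding_time(t, kept) > 0:
        by_value = sorted(kept, key=lambda task: task[3])
        # Prefer the least-value task whose removal alone clears the overload.
        victim = next(
            (x for x in by_value
             if exceeding_time(t, [y for y in kept if y is not x]) == 0),
            by_value[0])
        kept.remove(victim)
        rejected.append(victim)
    return kept, rejected


tasks = [(4, 5, 0, 10), (3, 8, 1, 2), (4, 10, 0, 7)]   # hypothetical set
kept, rejected = red_accept(0, tasks)
```

In this example the last task would finish one time unit after its deadline, and rejecting the task of value 2 is enough to make the remaining schedule strictly feasible.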
In RED, a resource reclaiming mechanism is used to take advantage of those tasks that
complete before their worst-case finishing time. To reclaim the spare time, rejected
tasks are not removed forever but temporarily parked in a queue, called Reject Queue,
ordered by decreasing values. Whenever a running task completes its execution before
its worst-case finishing time, the algorithm tries to reaccept the highest-value tasks in
the Reject Queue having positive laxity. Tasks with negative laxity are removed from
the system.
When an overload is detected because a task $J_z$ reaches its $LST$, then the value of $J_z$
is compared against the total value $V_{priv}$ of all the privileged tasks (including the value
$v_{curr}$ of the currently running task). If
$$v_z > (1 + \sqrt{k})(v_{curr} + V_{priv})$$
(where $k$ is the ratio of the highest value density and the lowest value density task in the
system), then $J_z$ is executed; otherwise, it is abandoned. If $J_z$ is executed, all the
privileged tasks become waiting tasks. Task $J_z$ can in turn be abandoned in favor of
another task $J_x$ that reaches its $LST$, but only if $v_x > (1 + \sqrt{k}) v_z$.
It is worth observing that having the best competitive factor among all on-line algorithms
does not mean having the best performance in any load condition. In fact, in order
to guarantee the best competitive factor, $D^{over}$ may reject tasks with values higher
than the current task but not higher than the threshold that guarantees optimality. In
other words, to cope with worst-case sequences, $D^{over}$ does not take advantage of
lucky sequences and may reject more value than is necessary. In Section 2.3.4, the
performance of $D^{over}$ is tested for random task sets and compared with that of other
scheduling algorithms.
2.3.4 PERFORMANCE EVALUATION
In this section, the performance of the scheduling algorithms described above is tested
through simulation using a synthetic workload. Each point on the graphs represents the
average of a set of 100 independent simulations, each 300,000 time units long.
The algorithms are executed on task sets consisting of 100
aperiodic tasks, whose parameters are generated as follows. The worst-case execution
time $C_i$ is chosen as a random variable with uniform distribution between 50 and 350
time units. The interarrival time $T_i$ is modeled as a random variable with a Poisson
distribution with average value equal to $T_i = N C_i / \rho$, where $N$ is the total number of
tasks and $\rho$ is the average load. The laxity of a task is computed as a random value
with uniform distribution between 150 and 1850 time units, and the relative deadline is
computed as the sum of its worst-case execution time and its laxity. The task value
is generated as a random variable with uniform distribution ranging from 150 to 1850
time units, as for the laxity.
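A workload generator along these lines could be sketched as follows. The parameter ranges are those given above; drawing the interarrival gaps from an exponential distribution (the interarrival law of a Poisson process) is an interpretation of the text, and the function and field names are hypothetical.

```python
import random


def generate_task_set(n_tasks, load, rng):
    """Draw the parameters of n_tasks aperiodic tasks for a given mean load."""
    tasks = []
    for _ in range(n_tasks):
        wcet = rng.uniform(50, 350)
        laxity = rng.uniform(150, 1850)
        tasks.append({
            "wcet": wcet,
            # Mean interarrival N * C_i / rho: each task contributes rho / N.
            "mean_interarrival": n_tasks * wcet / load,
            "relative_deadline": wcet + laxity,
            "value": rng.uniform(150, 1850),
        })
    return tasks


def next_interarrival(task, rng):
    """Assumed exponential gap with the mean required by the task."""
    return rng.expovariate(1.0 / task["mean_interarrival"])


rng = random.Random(1)
task_set = generate_task_set(100, 3.0, rng)
gap = next_interarrival(task_set[0], rng)
```

Since every task contributes $C_i / T_i = \rho / N$ on average, the mean load of the generated set equals the requested value regardless of the individual execution times.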
The first experiment illustrates the effectiveness of the guarantee and robust scheduling
paradigm with respect to the best-effort scheme, under the EDF priority assignment.
In particular, it shows how the pessimistic assumptions made in the guarantee test
affect the performance of the algorithms and how much a reclaiming mechanism can
compensate for this degradation. In order to test these effects, tasks were generated
with actual execution times less than their worst-case values. The specific parameter
varied in the simulations was the average Unused Computation Time Ratio, defined as
$$\beta = 1 - \frac{\text{Actual Computation Time}}{\text{Worst-case Computation Time}}.$$
In the graphs reported in Figure 2.7, the task set was generated with a nominal load
$\rho_n = 3$, while $\beta$ was varied from 0.125 to 0.875. As a consequence, the actual mean
load changed from a value of 2.625 to a value of 0.375, thus ranging over very different
actual load conditions. The performance was measured by computing the Hit Value
Ratio ($HVR$); that is, the ratio of the cumulative value achieved by an algorithm and
the total value of the task set. Hence, $HVR = 1$ means that all the tasks completed
within their deadlines and no tasks were rejected.
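The two quantities used in this experiment are simple enough to be written down directly. The sketch below assumes that the actual mean load scales linearly with the fraction of worst-case time really used; the function names and the sample values are hypothetical.

```python
def actual_load(nominal_load, beta):
    """Mean load when tasks use a fraction (1 - beta) of their worst case."""
    return nominal_load * (1.0 - beta)


def hit_value_ratio(values, completed_in_time):
    """Cumulative value achieved divided by the total value of the set."""
    achieved = sum(v for v, ok in zip(values, completed_in_time) if ok)
    return achieved / sum(values)


low_beta = actual_load(3.0, 0.125)
high_beta = actual_load(3.0, 0.875)
hvr = hit_value_ratio([10, 30, 60], [True, False, True])
```

With a nominal load of 3, the two extreme values of the ratio place the system well inside overload and well inside underload, respectively, which is why one sweep is enough to cover both regimes.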
In Figure 2.7, the GED curve refers to the guaranteed EDF scheme implemented with a
simple admission control, while the RED curve refers to the robust EDF algorithm. For
small values of $\beta$, that is, when tasks execute for almost their maximum computation
time, the guarantee (GED) and robust (RED) versions are able to obtain a significant
improvement compared to the plain EDF scheme. Increasing the unused computation
time, however, the actual load falls down and the plain EDF performs better and better,
reaching the optimality in underload conditions. Notice that as the system becomes
underloaded ($\beta \simeq 0.7$) GED becomes less effective than EDF. This is due to the fact that
GED performs a worst-case analysis, thus rejecting tasks that still have some chance
to execute within their deadline. This phenomenon does not appear on RED, because
the reclaiming mechanism implemented in the robust scheme is able to recover the
rejected tasks whenever possible.
Figure 2.7 (plot not reproduced). Nominal load = 3; horizontal axis: average unused computation time ratio (beta).
In the second experiment, $D^{over}$ is compared against two robust algorithms: RED
(Robust Earliest Deadline) and RHD (Robust High Density). In RHD, the task with the
highest value density ($v_i / C_i$) is scheduled first, regardless of its deadline. Performance
results are shown in Figure 2.8.
deadlines into account. However, for high load conditions ($\rho > 1.5$), RHD performs
even better than RED and $D^{over}$.
In particular, for random task sets, $D^{over}$ is less effective than RED and RHD for two
reasons: first, it does not have a reclaiming mechanism for recovering rejected tasks in
the case of early completions; second, the threshold value used in the rejection policy
is set to reach the best competitive factor in a worst-case scenario. But this means that
for random sequences $D^{over}$ may reject tasks that could increase the cumulative value,
if executed.
In conclusion, we can say that in overload conditions no on-line algorithm can achieve
optimal performance in terms of cumulative value. Competitive algorithms are designed
to guarantee a minimum performance in any load condition, but they cannot
guarantee the best performance for all possible scenarios. For random task sets, robust
scheduling schemes appear to be more appropriate.
It is worth noticing that the task model used in traditional real-time systems is a special
case of the one adopted for imprecise computation. In fact, a hard task corresponds to
a task with no optional part ($o_i = 0$), whereas a soft task is equivalent to a task with
no mandatory part ($m_i = 0$).
In systems that support imprecise computation, the error $\epsilon_i$ in the result produced by
$J_i$ (or simply the error of $J_i$) is defined as the length of the portion of $O_i$ discarded in
the schedule. If $\sigma_i$ is the total processor time assigned to $O_i$ by the scheduler, the error
of task $J_i$ is equal to
$$\epsilon_i = o_i - \sigma_i,$$
where $w_i$ is the relative importance of $J_i$ in the task set. An error $\epsilon_i > 0$ means that a
portion of subtask $O_i$ has been discarded in the schedule at the expense of the quality
of the result produced by task $J_i$ but for the benefit of other mandatory subtasks that
can complete within their deadlines.
As an illustrative example, let us consider the task set shown in Figure 2.9a. Notice
that this task set cannot be precisely scheduled; however, a feasible schedule with
an average error of $\bar{\epsilon} = 1$ can be found, and it is shown in Figure 2.9b. In fact, all
mandatory subtasks finish within their deadlines, whereas not all optional subtasks are
able to complete. In particular, a time unit of execution is subtracted from $O_1$, two units
from $O_3$, and one unit from $O_5$. Hence, assuming that all tasks have an importance
value equal to one ($w_i = 1$), the average error on the task set is $\bar{\epsilon} = 1$.
When an application task cannot be split into a mandatory and an optional part that can
be aborted at any time, service adaptation can still be performed, at a coarse granularity,
if multiple versions are provided for the task, each having different length and quality
(the longer the task, the higher the quality). In this case, to cope with an overload
condition, a high-quality version of a task may be replaced with a shorter one with
lower quality. If all the tasks follow such a model, the objective of the system would
be to maximize the overall quality under feasibility constraints.
Permitting skips in periodic tasks increases system flexibility, since it allows making
better use of resources and scheduling systems that would otherwise be overloaded.
Consider for example two periodic tasks, $\tau_1$ and $\tau_2$, with periods $p_1 = 10$, $p_2 = 100$,
and execution times $C_1 = 5$ and $C_2 = 55$, where $\tau_1$ can skip an instance every 10
periods, whereas $\tau_2$ is hard (i.e., no instances can be skipped). Clearly, the two tasks
cannot be both scheduled as hard tasks, because the processor utilization factor is
$U_p = 5/10 + 55/100 = 1.05 > 1$. However, if $\tau_1$ is permitted to skip one instance
every 10 periods, the spare time can be used to accommodate the execution of $\tau_2$.
The job skipping model was originally proposed by Koren and Shasha [KS95].
In their work, the authors showed that making optimal use of skips is NP-hard and
presented two algorithms, called Skip-Over Algorithms (one a variant of rate monotonic
scheduling and one of earliest deadline first) that exploit skips to increase the feasible
periodic load and schedule slightly overloaded systems. According to the job skipping
model, the maximum number of skips for each task is controlled by a specific parameter
associated with the task. In particular, each periodic task $\tau_i$ is characterized by a
worst-case computation time $c_i$, a period $p_i$, a relative deadline equal to its period, and a skip
parameter $s_i$, $2 \le s_i \le \infty$, which gives the minimum distance between two consecutive
skips. For example, if $s_i = 5$ the task can skip one instance every five. When $s_i = \infty$
no skips are allowed and $\tau_i$ is equivalent to a hard periodic task. The skip parameter
can be viewed as a Quality of Service (QoS) metric (the higher $s_i$, the better the quality
of service).
Using the terminology introduced by Koren and Shasha in [KS95], every instance of
a periodic task with skips can be red or blue. A red instance must complete before its
deadline; a blue instance can be aborted at any time. When a blue instance is aborted,
we say that it was skipped. The fact that $s \ge 2$ implies that, if a blue instance is
skipped, then the next $s - 1$ instances must be red. On the other hand, if a blue instance
completes successfully, the next task instance is also blue.
Two on-line scheduling algorithms were proposed in [KS95] to handle tasks with skips
under EDF.
1. The first algorithm, called Red Tasks Only (RTO), always rejects the blue instances,
whereas the red ones are scheduled according to EDF.
2. The second algorithm, called Blue When Possible (BWP), is more flexible than
RTO and schedules blue instances whenever there are no ready red jobs to execute.
Red instances are scheduled according to EDF.
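The difference between the two policies can be made concrete with a small dispatching sketch. It assumes that the ready queue holds jobs tagged with a color and an absolute deadline; ordering blue instances by deadline under BWP is an assumption, since the text only states when blue instances may run. All names are hypothetical.

```python
def pick_next(ready, policy):
    """Select the job to run; ready is a list of {"deadline", "color"} dicts."""
    red = [job for job in ready if job["color"] == "red"]
    blue = [job for job in ready if job["color"] == "blue"]
    if red:
        # Both policies serve red instances by earliest deadline first.
        return min(red, key=lambda job: job["deadline"])
    if policy == "BWP" and blue:
        # Blue instances run only when no red job is ready.
        return min(blue, key=lambda job: job["deadline"])
    return None   # RTO never executes a blue instance


ready = [{"deadline": 9, "color": "blue"}, {"deadline": 12, "color": "red"}]
rto_choice = pick_next(ready, "RTO")
bwp_choice = pick_next(ready, "BWP")
```

Note that the blue job is not chosen by BWP here even though its deadline is earlier, because a red job is ready.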
It is easy to find examples that show that BWP improves RTO in the sense that it is
able to schedule task sets that RTO cannot schedule. In the general case, the above
algorithms are not optimal, but they become optimal under a particular task model,
called the deeply-red model.
Definition 2.2 A system is deeply-red if all tasks are synchronously activated and the
first $s_i - 1$ instances of every task $\tau_i$ are red.
In the same paper, Koren and Shasha showed that the worst case for a periodic skippable
task set occurs when tasks are deeply-red, hence all the results are derived under this
assumption. This means that, if a task set is schedulable under the deeply-red model,
it is also schedulable without this assumption.
In the hard periodic model in which all task instances are red (i.e., no skips are
permitted), the schedulability of a periodic task set can be tested using a simple necessary
and sufficient condition based upon cumulative processor utilization. In particular, Liu
and Layland [LL73] showed that a periodic task set is schedulable by EDF if and only
if its cumulative processor utilization is no greater than 1. Analyzing the feasibility
of firm periodic tasks is not equally easy. Koren and Shasha proved that determining
whether a set of skippable periodic tasks is schedulable is NP-hard. They also found
that, given a set $\Gamma = \{\tau_i(p_i, c_i, s_i)\}$ of firm periodic tasks that allow skips, then
$$\sum_{i=1}^{n} \frac{c_i (s_i - 1)}{p_i s_i} \le 1 \qquad (2.2)$$
is a necessary condition for the feasibility of $\Gamma$, since it represents the utilization based
on the computation that must take place.
To better clarify the concepts mentioned above, consider the task set shown in
Figure 2.10 and the corresponding feasible schedule, obtained by EDF. Notice that the
processor utilization factor is greater than 1 ($U_p = 1.25$), but condition (2.2) is
satisfied.

                   Task1   Task2   Task3
Computation          1       2       5
Period               3       4      12
Skip Parameter       4       3       ∞
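The two figures quoted for this task set can be checked with exact arithmetic. The sketch assumes that the necessary condition has the form $\sum_i c_i (s_i - 1)/(p_i s_i) \le 1$ and that the parameters of Figure 2.10 read as computation times 1, 2, 5, periods 3, 4, 12, and skip parameters 4, 3, $\infty$; the function names are hypothetical.

```python
from fractions import Fraction
from math import inf


def utilization(tasks):
    """Plain utilization of tasks given as (c, p, s) triples."""
    return sum(Fraction(c, p) for c, p, _ in tasks)


def red_utilization(tasks):
    """Utilization of the work that cannot be skipped; s = inf means no skips."""
    total = Fraction(0)
    for c, p, s in tasks:
        share = Fraction(c, p)
        total += share if s == inf else share * Fraction(s - 1, s)
    return total


tasks = [(1, 3, 4), (2, 4, 3), (5, 12, inf)]
u_p = utilization(tasks)          # exceeds 1: infeasible without skips
u_red = red_utilization(tasks)    # left-hand side of the necessary condition
```

For this set the necessary condition holds with equality, so no processor time is left once the unavoidable (red) work has been accounted for.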
In the same work, Koren and Shasha proved the following theorem, which provides a
sufficient condition for guaranteeing a set of skippable periodic tasks under EDF.
If skips are permitted in the periodic task set, the spare time saved by rejecting the blue
instances can be reallocated for other purposes, for example, for scheduling slightly
overloaded systems or for advancing the execution of soft aperiodic requests.
Unfortunately, the spare time has a "granular" distribution and cannot be reclaimed at
any time. Nevertheless, it can be shown that skipping blue instances still produces a
bandwidth saving in the periodic schedule. In [CB97], Caccamo and Buttazzo
generalized the results of Theorem 2.2 by identifying the amount of bandwidth saving
achieved by skips. To express this fact with a simple parameter, they defined an
equivalent utilization factor $U_p^*$ for periodic tasks with skips.
Definition 2.3 Given a set $\Gamma = \{\tau_i(p_i, c_i, s_i)\}$ of $n$ periodic tasks that allow skips,
an equivalent processor utilization factor can be defined as:
$$U_p^* = \max_{L > 0} \left\{ \frac{\sum_{i=1}^{n} D(i, [0, L])}{L} \right\},$$
where
$$D(i, [0, L]) = \left( \left\lfloor \frac{L}{p_i} \right\rfloor - \left\lfloor \frac{L}{p_i s_i} \right\rfloor \right) c_i.$$
The following theorem ([CB97]) states the schedulability condition for a set of deeply-
red skippable tasks.
The bandwidth saved by skipping blue instances can easily be exploited by an aperiodic
server (like CBS [AB98a] described in Chapter 3) to advance the execution of aperiodic
tasks. The following theorem ([CB97]) provides a sufficient condition for guaranteeing
a feasible schedule of a hybrid task set consisting of n firm periodic tasks and a number
of soft aperiodic tasks handled by a server with bandwidth $U_s$.
AN EXAMPLE
As an illustrative example, let us consider the task set shown in Table 2.1. The task
set consists of two periodic tasks, $\tau_1$ and $\tau_2$, with periods 3 and 5, computation times
2 and 2, and skip parameters 2 and $\infty$, respectively. The equivalent utilization factor
of the periodic task set is $U_p^* = 4/5$ while $U_{skip} = 0.27$, leaving a bandwidth of
$U_s = 1/5$ for the aperiodic tasks. Three aperiodic jobs $J_1$, $J_2$, and $J_3$ are released at
times $t_1 = 0$, $t_2 = 6$, and $t_3 = 18$; moreover, they have computation times $c_1^{ape} = 1$,
$c_2^{ape} = 2$, and $c_3^{ape} = 1$, respectively.
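The values quoted in this example can be reproduced numerically. The sketch below assumes the demand form usually associated with [CB97] for deeply-red tasks, in which a task with parameters $(c, p, s)$ requires $(\lfloor L/p \rfloor - \lfloor L/(p s) \rfloor) c$ units of time in $[0, L]$, and it searches the maximum over the multiples of the periods up to the least common multiple of the skip patterns. Names are hypothetical.

```python
from fractions import Fraction
from math import inf, lcm  # lcm is available from Python 3.9


def skip_demand(c, p, s, L):
    """Time required in [0, L] by the red instances of a deeply-red task."""
    skipped = 0 if s == inf else L // (p * s)
    return (L // p - skipped) * c


def equivalent_utilization(tasks):
    """Maximum over the check points of the total red demand divided by L."""
    horizon = lcm(*(p if s == inf else p * s for _, p, s in tasks))
    points = {k * p for _, p, _ in tasks for k in range(1, horizon // p + 1)}
    return max(
        Fraction(sum(skip_demand(c, p, s, L) for c, p, s in tasks), L)
        for L in points)


tasks = [(2, 3, 2), (2, 5, inf)]          # (c, p, s) from the example
u_star = equivalent_utilization(tasks)
u_plain = sum(Fraction(c, p) for c, p, _ in tasks)
saved = u_plain - u_star                  # bandwidth saved by skips
server_bandwidth = 1 - u_star
```

Restricting the search to multiples of the periods is sufficient because the demand changes only there, and the skip pattern repeats after the chosen horizon.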
Supposing that the aperiodic activities are scheduled by a CBS server with maximum
budget $Q^s = 1$ and server period $P^s = 5$, Figure 2.11 shows the resulting schedule
by using RTO+CBS. Notice that $J_2$ has a deadline postponement (according to CBS
rules) at time $t = 10$ with new server deadline $d^s_{new} = d^s_{old} + P^s = 11 + 5 = 16$.
According to the sufficient schedulability test provided by Theorem 2.4, the task set is
schedulable when the aperiodic server is assigned a bandwidth $U_s = 1 - U_p^*$.
Figure 2.11 Schedule produced by RTO+CBS for the task set shown in Table 2.1.
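The deadline postponement mentioned for the second job can be traced with a small model of the server state. This is a sketch of the CBS rules as commonly stated (a new deadline is generated on arrival at an idle server when the residual budget is too large for the current deadline, and the deadline is postponed by one period whenever the budget is exhausted); it does not model the schedule itself, so the instants at which the budget is consumed are taken from the text. The class and method names are hypothetical.

```python
class CbsServer:
    """Minimal CBS state: maximum budget, period, budget, and deadline."""

    def __init__(self, max_budget, period):
        self.max_budget = max_budget
        self.period = period
        self.budget = 0.0
        self.deadline = 0.0

    def on_arrival(self, now):
        """A job arrives while the server is idle."""
        bandwidth = self.max_budget / self.period
        if self.budget >= (self.deadline - now) * bandwidth:
            self.deadline = now + self.period
            self.budget = self.max_budget

    def consume(self, amount):
        """Charge executed time; postpone the deadline on exhaustion."""
        self.budget -= amount
        if self.budget <= 0:
            self.deadline += self.period
            self.budget = self.max_budget


server = CbsServer(max_budget=1, period=5)
server.on_arrival(0)              # first job, one unit of work
server.consume(1)
server.on_arrival(6)              # second job: server deadline becomes 11
deadline_at_arrival = server.deadline
server.consume(1)                 # first unit of the second job
deadline_after_postponement = server.deadline
```

The second job needs two units but the budget is one, so the server deadline moves from 11 to 16 before the job can finish, as in the example.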
For example, in multimedia systems, activities such as voice sampling, image acqui-
sition, sound generation, data compression, and video playing, are performed period-
ically, but their execution rates are not as rigid as in control applications. Missing a
deadline while displaying an MPEG video may decrease the quality of service (QoS),
but does not cause critical system faults. Depending on the requested QoS, tasks may
increase or decrease their execution rate to accommodate the requirements of other
concurrent activities.
Even in some control applications, there are situations in which periodic tasks could be
executed at different rates in different operating conditions. For example, in a flight
control system, the sampling rate of the altimeters is a function of the current altitude of
the aircraft: the lower the altitude, the higher the sampling frequency. A similar need
arises in robotic applications in which robots have to work in unknown environments
where trajectories are planned based on the current sensory information. If a robot
is equipped with proximity sensors, in order to maintain a desired performance, the
acquisition rate of the sensors must increase whenever the robot is approaching an
obstacle.
In other situations, the possibility of varying tasks' rates increases the flexibility of the
system in handling overload conditions, providing a more general admission control
mechanism. For example, whenever a new task cannot be guaranteed by the system,
instead of rejecting the task, the system can try to reduce the utilizations of the other
tasks (by increasing their periods in a controlled fashion) to decrease the total load and
accommodate the new request.
Unfortunately, there is no uniform approach for dealing with these situations. For
example, Kuo and Mok [KM91] propose a load scaling technique to gracefully degrade
the workload of a system by adjusting the periods of processes. In this work, tasks
are assumed to be equally important and the objective is to minimize the number of
fundamental frequencies to improve schedulability under static priority assignments.
In [NT94], Nakajima and Tezuka show how a real-time system can be used to support an
adaptive application: whenever a deadline miss is detected, the period of the failed task
is increased. In [SLSS97], Seto et al. change tasks' periods within a specified range
to minimize a performance index defined over the task set. This approach is effective
at a design stage to optimize the performance of a discrete control system, but cannot
be used for on-line load adjustment. In [LRM96], Lee, Rajkumar and Mercer propose
a number of policies to dynamically adjust the tasks' rates in overload conditions. In
[AAS97], Abdelzaher, Atkins, and Shin present a model for QoS negotiation to meet
both predictability and graceful degradation requirements during overloads. In this
model, the QoS is specified as a set of negotiation options, in terms of rewards and
rejection penalties. In [Nak98a, Nak98b], Nakajima shows how a multimedia activity
can adapt its requirements during transient overloads by scaling down its rate or its
computational demand. However, it is not clear how the QoS can be increased when
the system is underloaded. In [BCRZ99], Beccari et al. propose several policies for
handling overload through period adjustment. The authors, however, do not address
the problem of increasing the task rates when the processor is not fully utilized.
The elastic model presented in this section was originally introduced by Buttazzo et
al. [BAL98] and then extended to a more general case [BLCA02]. It provides a novel
theoretical framework for flexible workload management in real-time applications. In
particular, the elastic approach provides the following advantages with respect to the
classical "fixed-rate" approach.
it provides a simple and efficient method for controlling the system's performance
as a function of the current workload.
EXAMPLES
To better understand the idea behind the elastic model, consider a set of three periodic
tasks, with computation times $C_1 = 10$, $C_2 = 10$, and $C_3 = 15$ and periods $T_1 = 20$,
$T_2 = 40$, and $T_3 = 70$. Clearly, the task set is schedulable by EDF because
$$U_p = \frac{10}{20} + \frac{10}{40} + \frac{15}{70} = 0.964 < 1.$$
Now, suppose that a new periodic task $\tau_4$, with computation time $C_4 = 5$ and period
$T_4 = 30$, enters the system at time $t$. The total processor utilization of the new task set is
$$U_p = \frac{10}{20} + \frac{10}{40} + \frac{15}{70} + \frac{5}{30} = 1.131 > 1.$$
In a rigid scheduling framework, $\tau_4$ should be rejected to preserve the timing behavior
of the previously guaranteed tasks. However, $\tau_4$ can be accepted if the periods of the
other tasks can be increased in such a way that the total utilization is less than one. For
example, if $T_1$ can be increased up to 23, the total utilization becomes $U_p = 0.989$,
and hence $\tau_4$ can be accepted.
As another example, if tasks are allowed to change their frequency and task $\tau_3$ reduces
its period to 50, no feasible schedule exists, since the utilization would be greater than 1:
$$U_p = \frac{10}{20} + \frac{10}{40} + \frac{15}{50} = 1.05 > 1.$$
However, notice that a feasible schedule exists ($U_p = 0.977$) for $T_1 = 22$, $T_2 = 45$,
and $T_3 = 50$. Hence, the system can accept the higher request rate of $\tau_3$ by slightly
decreasing the rates of $\tau_1$ and $\tau_2$. Task $\tau_3$ can even run with a period $T_3 = 40$, since
a feasible schedule exists with periods $T_1$ and $T_2$ within their range. In fact, when
$T_1 = 24$, $T_2 = 50$, and $T_3 = 40$, $U_p = 0.992$. Finally, notice that if $\tau_3$ requires to run
at its minimum period ($T_3 = 35$), there is no feasible schedule with periods $T_1$ and $T_2$
within their range, hence the request of $\tau_3$ to execute with a period $T_3 = 35$ must be
rejected.
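The utilization figures quoted in these examples can be reproduced with a few lines of code. The sketch below only evaluates $\sum_i C_i/T_i$ for the period configurations mentioned in the text and compares the result with the EDF bound of 1; the helper name `total_utilization` is illustrative.

```python
def total_utilization(C, T):
    """Total processor utilization: sum of C_i / T_i."""
    return sum(c / t for c, t in zip(C, T))

C = [10, 10, 15]  # computation times of tau_1, tau_2, tau_3

cases = {
    "nominal":          total_utilization(C, [20, 40, 70]),
    "with tau_4":       total_utilization(C + [5], [20, 40, 70, 30]),
    "T3 = 50 only":     total_utilization(C, [20, 40, 50]),
    "T = (22, 45, 50)": total_utilization(C, [22, 45, 50]),
    "T = (24, 50, 40)": total_utilization(C, [24, 50, 40]),
}

for name, u in cases.items():
    verdict = "feasible under EDF" if u <= 1 else "overload"
    print(f"{name:18s} U = {u:.3f}  ({verdict})")
```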
Clearly, for a given value of $T_3$, there can be many different period configurations which
lead to a feasible schedule; thus, one of the possible feasible configurations must be
selected. The elastic approach provides an efficient way for quickly selecting a feasible
period configuration among all the possible solutions.
In the following, $T_i$ will denote the actual period of task $\tau_i$, which is constrained to be
in the range $[T_{i_{\min}}, T_{i_{\max}}]$. Any task can vary its period according to its needs within
the specified range. Any variation, however, is subject to an elastic guarantee and is
accepted only if there exists a feasible schedule in which all the other periods are within
their range.
It is worth noting that the elastic model is more general than the classical Liu and
Layland's task model, so it does not prevent a user from defining hard real-time tasks.
In fact, a task having $T_{i_{\max}} = T_{i_{\min}}$ is equivalent to a hard real-time task with fixed
period, independently of its elastic coefficient. A task with $E_i = 0$ can arbitrarily vary
its period within its specified range, but it cannot be varied by the system during load
reconfigurations.
In this comparison, the length $x_i$ of the spring is equivalent to the task's utilization
factor $U_i = C_i/T_i$, and the rigidity coefficient $k_i$ is equivalent to the inverse of the
task's elasticity ($k_i = 1/E_i$). Hence, a set of $n$ tasks with total utilization factor
$U_p = \sum_{i=1}^{n} U_i$ can be viewed as a sequence of $n$ springs with total length $L = \sum_{i=1}^{n} x_i$.
Under the elastic model, given a set of $n$ periodic tasks with $U_p > U_{\max}$, the objective of
the guarantee is to compress tasks' utilization factors in order to achieve a new desired
utilization $U_d \le U_{\max}$, such that all the periods are within their ranges. In the linear
spring system, this is equivalent to compressing the springs so that the new total length
$L_d$ is less than or equal to a given maximum length $L_{\max}$. More formally, in the spring
system the problem can be stated as follows.
For the sake of clarity, we first solve the problem for a spring system without length
constraints, then we show how the solution can be modified by introducing length
constraints, and finally we show how the solution can be adapted to the case of a task
set.
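If $F$ is the force that keeps a spring in its compressed state, then, for the equilibrium
of the system, it must be:
$$\forall i \quad F = k_i (x_{i_0} - x_i),$$
where $x_{i_0}$ is the nominal length of spring $i$ and $L_0 = \sum_{i=1}^{n} x_{i_0}$ is the total nominal length. From this we derive
$$\forall i \quad x_i = x_{i_0} - \frac{F}{k_i}. \qquad (2.9)$$
By summing equations (2.9) we have $L_d = L_0 - F \sum_{i=1}^{n} \frac{1}{k_i}$, so the force $F$ can be expressed as
$$F = K_p (L_0 - L_d) \qquad (2.10)$$
where
$$K_p = \frac{1}{\sum_{i=1}^{n} \frac{1}{k_i}}. \qquad (2.11)$$
Substituting expression (2.10) into Equations (2.9) we finally achieve:
$$\forall i \quad x_i = x_{i_0} - (L_0 - L_d) \frac{K_p}{k_i}. \qquad (2.12)$$
Equation (2.12) allows us to compute how each spring has to be compressed in order
to have a desired total length $L_d$.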
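As a quick numerical illustration of the unconstrained case, the following sketch applies the closed-form compression rule (each spring shrinks by $(L_0 - L_d) K_p / k_i$). The function name and the sample spring values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def compress_springs(x0, k, L_d):
    """Compress springs with nominal lengths x0 and rigidities k to total length L_d.

    Implements x_i = x0_i - (L0 - L_d) * K_p / k_i, with K_p = 1 / sum(1/k_i).
    No minimum-length constraints are handled here.
    """
    L0 = sum(x0)
    K_p = 1.0 / sum(1.0 / ki for ki in k)
    return [xi - (L0 - L_d) * K_p / ki for xi, ki in zip(x0, k)]

# Hypothetical example: three springs, the stiffest one (k = 4) shrinks the least.
x0 = [10.0, 20.0, 30.0]          # nominal lengths, L0 = 60
k = [1.0, 2.0, 4.0]              # rigidity coefficients
x = compress_springs(x0, k, L_d=53.0)
print(x, sum(x))
```

With these values the force is $F = K_p (L_0 - L_d) = 4$, so the three springs shrink by 4, 2 and 1 respectively, and the new lengths sum to the desired total of 53.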
where
Whenever there exists some spring for which equation (2.13) gives $x_i < x_{i_{\min}}$, the
length of that spring has to be fixed at its minimum value, sets $\Gamma_f$ and $\Gamma_v$ must be
updated, and equations (2.13), (2.14), (2.15) and (2.16) recomputed for the new set $\Gamma_v$.
If there exists a feasible solution, that is, if the desired final length $L_d$ is greater than or
equal to the minimum possible length of the array $L_{\min} = \sum_{i=1}^{n} x_{i_{\min}}$, the iterative
process ends when each value computed by equations (2.13) is greater than or equal to
its corresponding minimum $x_{i_{\min}}$. The complete algorithm for compressing a set $\Gamma$ of
$n$ springs with length constraints up to a desired length $L_d$ is shown in Figure 2.13.
When dealing with a set of elastic tasks, equations (2.13), (2.14), (2.15) and (2.16) can
be rewritten by substituting all length parameters with the corresponding utilization
factors, and the rigidity coefficients $k_i$ and $K_v$ with the corresponding elastic coeffi-
cients $E_i$ and $E_v$. Similarly, at each instant, the set $\Gamma$ of periodic tasks can be divided
into two subsets: a set $\Gamma_f$ of fixed tasks having minimum utilization, and a set $\Gamma_v$ of
variable tasks that can still be compressed. Let $U_{i_0} = C_i/T_{i_0}$ be the nominal utilization
of task $\tau_i$, $U_0 = \sum_{i=1}^{n} U_{i_0}$ be the nominal utilization of the task set, $U_{v_0}$ be the sum of
the nominal utilizations of tasks in $\Gamma_v$, and $U_f$ be the total utilization factor of tasks in
$\Gamma_f$. Then, to achieve a desired utilization $U_d < U_0$ each task has to be compressed up
to the following utilization:
$$\forall \tau_i \in \Gamma_v \quad U_i = U_{i_0} - (U_{v_0} - U_d + U_f) \frac{E_i}{E_v} \qquad (2.17)$$
where
$$E_v = \sum_{\tau_i \in \Gamma_v} E_i.$$
If there exist tasks for which $U_i < U_{i_{\min}}$, then the period of those tasks has to be fixed
at its maximum value $T_{i_{\max}}$ (so that $U_i = U_{i_{\min}}$), sets $\Gamma_f$ and $\Gamma_v$ must be updated
(hence, $U_f$ and $E_v$ recomputed), and equation (2.17) applied again to the tasks in
$\Gamma_v$. If there exists a feasible solution, that is, if the desired utilization $U_d$ is greater
than or equal to the minimum possible utilization $U_{\min} = \sum_{i=1}^{n} \frac{C_i}{T_{i_{\max}}}$, the iterative
process ends when each value computed by equation (2.17) is greater than or equal to
its corresponding minimum $U_{i_{\min}}$. The algorithm¹ for compressing a set $\Gamma$ of $n$ elastic
tasks up to a desired utilization $U_d$ is shown in Figure 2.14.
Figure 2.13 Algorithm Spring_compress($\Gamma$, $L_d$) for compressing a set of springs with length constraints.
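The iterative compression described above can be sketched in runnable form as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `ElasticTask` class, the tolerance `eps`, the early exit when there is no overload, and the sample maximum periods and elastic coefficients are all hypothetical. One deliberate difference from the description is that tasks with $E_i = 0$ contribute their current utilization to $U_f$ rather than their minimum one.

```python
from dataclasses import dataclass

@dataclass
class ElasticTask:
    C: float        # computation time
    T0: float       # nominal period
    Tmax: float     # maximum period
    E: float        # elastic coefficient (0 = cannot be compressed)
    T: float = 0.0  # current period, filled in by task_compress

def task_compress(tasks, U_d, eps=1e-9):
    """Return True and set each task's period T if utilization U_d is reachable."""
    if U_d < sum(t.C / t.Tmax for t in tasks) - eps:
        return False                      # U_d is below the minimum utilization
    for t in tasks:
        t.T = t.T0
    if sum(t.C / t.T0 for t in tasks) <= U_d:
        return True                       # no overload: keep nominal periods
    while True:
        variable = [t for t in tasks if t.E > 0 and t.T < t.Tmax]
        fixed = [t for t in tasks if not (t.E > 0 and t.T < t.Tmax)]
        U_f = sum(t.C / t.T for t in fixed)
        if not variable:
            return U_f <= U_d + eps
        U_v0 = sum(t.C / t.T0 for t in variable)
        E_v = sum(t.E for t in variable)
        ok = True
        for t in variable:
            # Equation (2.17): share the excess according to elastic coefficients.
            U_i = t.C / t.T0 - (U_v0 - U_d + U_f) * t.E / E_v
            if U_i <= 0 or t.C / U_i > t.Tmax:
                t.T = t.Tmax              # pin at the maximum period and retry
                ok = False
            else:
                t.T = t.C / U_i
        if ok:
            return True

# Task set of the EXAMPLES section plus tau_4; Tmax and E values are made up.
tasks = [ElasticTask(10, 20, 25, 1), ElasticTask(10, 40, 50, 1),
         ElasticTask(15, 70, 80, 1), ElasticTask(5, 30, 30, 0)]
feasible = task_compress(tasks, U_d=1.0)
total = sum(t.C / t.T for t in tasks)
print(feasible, [round(t.T, 2) for t in tasks], round(total, 6))
```

In this run the third task reaches its maximum period first, then the second one, and the remaining excess is absorbed entirely by the first task, so three iterations are needed before the total utilization settles at 1.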
DECOMPRESSION
All tasks' utilizations that have been compressed to cope with an overload situation can
return toward their nominal values when the overload is over. Let $\Gamma_c$ be the subset of
compressed tasks (that is, the set of tasks with $T_i > T_{i_0}$), let $\Gamma_a$ be the set of remaining
tasks in $\Gamma$ (that is, the set of tasks with $T_i = T_{i_0}$), and let $U_d$ be the current processor
utilization of $\Gamma$. Whenever a task in $\Gamma_a$ voluntarily increases its period, all tasks in $\Gamma_c$
can expand their utilizations according to their elastic coefficients, so that the processor
utilization is kept at the value of $U_d$.
Now, let $U_c$ be the total utilization of $\Gamma_c$, let $U_a$ be the total utilization of $\Gamma_a$ after
some task has increased its period, and let $U_{c_0}$ be the total utilization of tasks in $\Gamma_c$ at
their nominal periods. It can easily be seen that if $U_{c_0} + U_a \le U_{lub}$ all tasks in $\Gamma_c$
can return to their nominal periods. On the other hand, if $U_{c_0} + U_a > U_{lub}$, then the
release operation of the tasks in $\Gamma_c$ can be viewed as a compression, where $\Gamma_f = \Gamma_a$
and $\Gamma_v = \Gamma_c$. Hence, it can still be performed by using equations (2.17), (2.19) and
(2.20) and the algorithm presented in Figure 2.14.
PERIOD RESCALING
If the elastic coefficients are set equal to task nominal utilizations, elastic compression
has the effect of a simple rescaling, where all the periods are increased by the same
percentage. In order to work correctly, however, period rescaling must be uniformly
applied to all the tasks, without restrictions on the maximum period. This means having
$U_f = 0$ and $U_{v_0} = U_0$. Under this assumption, by setting $E_i = U_{i_0}$, equations (2.17)
become:
$$\forall i \quad U_i = U_{i_0} - (U_0 - U_d) \frac{U_{i_0}}{U_0} = U_{i_0} \frac{U_d}{U_0},$$
from which we have
$$T_i = T_{i_0} \frac{U_0}{U_d}.$$
¹ The actual implementation of the algorithm contains more checks on tasks' variables, which are not
shown here to simplify its description.
Algorithm Task_compress(Γ, U_d) {
    U_0 = Σ_{i=1..n} C_i / T_i0;
    U_min = Σ_{i=1..n} C_i / T_imax;
    if (U_d < U_min) return INFEASIBLE;
    do {
        U_f = U_v0 = E_v = 0;
        for (each τ_i) {
            if ((E_i == 0) or (T_i == T_imax))
                U_f = U_f + U_imin;
            else {
                E_v = E_v + E_i;
                U_v0 = U_v0 + U_i0;
            }
        }
        ok = 1;
        for (each τ_i ∈ Γ_v) {
            if ((E_i > 0) and (T_i < T_imax)) {
                U_i = U_i0 - (U_v0 - U_d + U_f) E_i / E_v;
                T_i = C_i / U_i;
                if (T_i > T_imax) {
                    T_i = T_imax;
                    ok = 0;
                }
            }
        }
    } while (ok == 0);
    return FEASIBLE;
}
This means that in overload situations ($U_0 > 1$) the compression algorithm causes all
task periods to be increased by a common scale factor
$$\frac{U_0}{U_d}.$$
Notice that, after compression is performed, the total processor utilization becomes:
$$U = \sum_{i=1}^{n} \frac{C_i}{T_i} = \frac{U_d}{U_0} \sum_{i=1}^{n} \frac{C_i}{T_{i_0}} = \frac{U_d}{U_0} U_0 = U_d$$
as desired.
If a maximum period needs to be defined for some task, an on-line guarantee test can
easily be performed before compression to check whether all the new periods are less
than or equal to the maximum value. This can be done in $O(n)$ by testing whether
$$\frac{U_0}{U_d} \le \min_{i=1,\ldots,n} \frac{T_{i_{\max}}}{T_{i_0}}.$$
By deciding to apply period rescaling, we lose the freedom of choosing the elastic
coefficients, since they must be set equal to task nominal utilizations. However, this
technique has the advantage of leaving the task periods ordered as in the nominal
configuration, which simplifies the compression algorithm in the presence of resource
constraints.
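Period rescaling is simple enough to be sketched in a few lines. The function below multiplies every nominal period by the common factor $U_0/U_d$ and, when maximum periods are supplied, first performs the $O(n)$ test on the ratios $T_{i_{\max}}/T_{i_0}$. The function name, the `None` return value for a rejected request, and the sample maximum periods are assumptions made for illustration.

```python
def rescale_periods(C, T0, U_d, T_max=None):
    """Scale all nominal periods by U_0 / U_d; return None if a T_max is violated."""
    U_0 = sum(c / t for c, t in zip(C, T0))
    if U_0 <= U_d:
        return list(T0)                       # no overload: keep nominal periods
    scale = U_0 / U_d
    # O(n) guarantee test: every rescaled period must stay within its maximum.
    if T_max is not None and scale > min(tm / t0 for tm, t0 in zip(T_max, T0)):
        return None
    return [t0 * scale for t0 in T0]

C = [10, 10, 15, 5]
T0 = [20, 40, 70, 30]

periods = rescale_periods(C, T0, U_d=1.0)
new_U = sum(c / t for c, t in zip(C, periods))
print([round(t, 2) for t in periods], round(new_U, 6))

# Hypothetical maximum periods: the last task is rigid, so rescaling is rejected.
print(rescale_periods(C, T0, U_d=1.0, T_max=[25, 50, 80, 30]))
```

Since every period is multiplied by the same factor, the tasks keep the same period ordering as in the nominal configuration.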
CONCLUDING REMARKS
The elastic model offers a flexible way to handle overload conditions. In fact, whenever
a new task cannot be guaranteed by the system, instead of rejecting the task, the system
can try to reduce the utilizations of the other tasks (by increasing their periods in a
controlled fashion) to decrease the total load and accommodate the new request. As
soon as a transient overload condition is over (because a task terminates or voluntarily
increases its period) all the compressed tasks may expand up to their original utilization,
eventually recovering their nominal periods.
The major advantage of the elastic method is that the policy for selecting a solution is
implicitly encoded in the elastic coefficients provided by the user (for example, based
on task importance). Each task is varied based on its current elastic status and a feasible
configuration is found, if there exists one.
The elastic model is extremely useful for supporting both multimedia systems and
control applications, in which the execution rates of some computational activities have
to be dynamically tuned as a function of the current system state. Furthermore, the
elastic mechanism can easily be implemented on top of classical real-time kernels, and
can be used under fixed or dynamic priority scheduling algorithms [But93a, LLB+97].
It is worth observing that the elastic approach is not limited to task scheduling. Rather,
it represents a general resource allocation methodology which can be applied whenever
a resource has to be allocated to objects whose constraints allow a certain degree of
flexibility. For example, in a distributed system, dynamic changes in node transmission
rates over the network could be efficiently handled by assigning each channel an elastic
bandwidth, which could be tuned based on the actual network traffic. An application
of the elastic model to the network has been proposed in [PGBA02].
The elastic model has also been extended in [BLCA02] to deal with resource constraints,
thus allowing tasks to interact through shared memory buffers. In order to estimate
maximum blocking times due to mutual exclusion and analyze task schedulability,
critical sections are assumed to be accessed through the Stack Resource Policy [Bak91].
TEMPORAL PROTECTION
In critical real-time applications, where predictability is the main goal of the system,
traditional real-time scheduling theory can be successfully used to verify the feasibility
of the schedule under worst-case scenarios. However, when efficiency becomes rele-
vant and when the worst-case parameters of the tasks are too pessimistic or unknown,
the hard real-time approach presents some problems. In particular, if a task overruns,
the system can experience a temporary or permanent overload and, as a consequence,
some tasks can miss their deadlines.
After an introduction to the problem, we will present two different classes of algo-
rithms for providing temporal protection: algorithms based on the fairness property,
often referred to as proportional share algorithms, and algorithms based on resource
reservation. Finally, we will describe some operating systems that provide resource
reservation mechanisms.
Figure 3.2 An instance of $\tau_2$ executing for "too long" can cause a deadline miss in $\tau_1$.
This problem does not specifically depend on EDF, but is inherent to all scheduling
algorithms that rely on a guarantee based on worst-case execution times (WCETs).
For instance, Figure 3.3 shows another example in which two tasks, $\tau_1 = (2, 3)$ and
$\tau_2 = (1, 5)$, are feasibly scheduled by a fixed priority scheduler (where tasks have
been assigned priorities based on the rate monotonic priority assignment). However,
if the first instance of $\tau_1$ increases its execution from 2 to 3 units of time, then the first
instance of $\tau_2$ will miss its deadline, as shown in Figure 3.4. Again, one task ($\tau_2$) is
suffering from the misbehavior of another task ($\tau_1$).
Figure 3.3 A task set schedulable under RM.
Figure 3.4 An instance of $\tau_1$ executing for "too long" can cause a deadline miss in $\tau_2$.
Notice that, under fixed priority scheduling, a high priority task ($\tau_1$ in the example)
cannot be influenced by a lower priority task ($\tau_2$). However, task priorities do not
always reflect importance and are often assigned based on other considerations, like
schedulability, as for the rate monotonic assignment. If importance values are not
related to task rates, assigning priorities to tasks is not trivial if a high schedulability
bound has to be reached. For some specific task sets, schedulability can be increased
by applying a period transformation technique [SG90], which basically splits a task
with a long period into smaller subtasks with shorter periods. However, playing with
priorities is not the best approach to follow, and the method becomes inefficient for
large task sets with arbitrary periods.
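The overrun scenario of Figures 3.3 and 3.4 can be replayed with a small unit-time simulation of a preemptive fixed priority scheduler. The sketch assumes that each task is given as (computation time, period), that deadlines equal periods, that tasks are listed in decreasing priority order, and that an unfinished job is dropped when its deadline passes; the function name and the `overrun` argument are illustrative.

```python
def simulate_fp(tasks, horizon, overrun=None):
    """Unit-time simulation of preemptive fixed priority scheduling.

    tasks:   list of (C, T), index 0 has the highest priority (RM order here).
    overrun: optional dict {(task_index, job_index): actual execution time}.
    Returns a list of (task_number, time) pairs, one per deadline miss.
    """
    overrun = overrun or {}
    remaining = [0] * len(tasks)
    misses = []
    for t in range(horizon):
        for i, (C, T) in enumerate(tasks):
            if t % T == 0:                        # release time = previous deadline
                if remaining[i] > 0:
                    misses.append((i + 1, t))     # previous job did not finish
                remaining[i] = overrun.get((i, t // T), C)
        for i in range(len(tasks)):               # run the highest priority ready task
            if remaining[i] > 0:
                remaining[i] -= 1
                break
    return misses

tasks = [(2, 3), (1, 5)]                          # tau_1 and tau_2
nominal = simulate_fp(tasks, horizon=16)
with_overrun = simulate_fp(tasks, horizon=16, overrun={(0, 0): 3})
print(nominal, with_overrun)
```

With nominal execution times no deadline is missed over the hyperperiod, whereas letting the first job of $\tau_1$ run for 3 units makes the first job of $\tau_2$ miss its deadline at time 5.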
The examples presented above show that when a real-time system includes tasks with
variable (or unknown) parameters, some kind of temporal protection among tasks is
desirable.
Definition 3.1 The temporal protection property requires that the temporal behavior
of a task is not affected by the temporal behavior of the other tasks running in the
system.
In a real-time system that provides temporal isolation, a task executing "too much"
cannot cause the other tasks to miss their deadlines. For example, in the case illustrated
in Figure 3.2, if temporal protection were enforced by the system, then the task missing
the deadline would be $\tau_2$.
distinguishing the various algorithms and their characteristics, they have been catego-
rized according to the taxonomy illustrated in Figure 3.5.
The class of algorithms providing temporal protection can be divided into two main
classes: the class of fair scheduling algorithms and the class of resource reservation
algorithms.
In both cases, the objective is to allocate the processor so that in every interval of time
each task precisely receives its share of the processor. Notice that such an objective
cannot be realized in practice, since it would require an infinitely divisible resource:
no matter how small the interval is, each task should receive its share of the proces-
sor. But the minimum time granularity of one processor is given by the clock! As
a consequence, any implementable algorithm can only approximate the ideal one. A
theoretical algorithm based on the ideal fluid resource allocation model is the Gener-
alized Processor Sharing (GPS), which will be presented in Section 3.3. The GPS is
mainly used for evaluation purposes, to verify how closely an algorithm approximates
the fluid model.
A parameter that can be used to measure how closely a realistic algorithm approximates
the ideal one is the lag. For each task, the lag is defined as the difference between the
execution time actually assigned to a task by the realistic algorithm and the amount of
time assigned by the ideal fluid algorithm. Hence, the objective of a fair scheduler is
to limit the lag to an interval as close as possible to 0.
Most of the algorithms belonging to this class divide the time line into intervals of
fixed length, called "quantum", with the constraint that only one task per processor can
be executed in one quantum. The idea is to approximate fluid allocation with small
discrete intervals of time.
We can further divide the class of fair scheduling algorithms into p-fair scheduling and
proportional share algorithms. The main difference lies in how the processor share is
assigned to tasks.
where is the number of tasks. If the number of tasks does not change during the
system lifetime (i.e., no new tasks are allowed to dynamically join the system, and no
tasks can leave the system), then the task share is a constant.
However, if tasks are allowed to dynamically join the system, task shares can change.
If this change is not controlled, the temporal isolation property is broken: a new task
joining the system can require a very high weight, reducing considerably the share of
the existing tasks.
Proportional share algorithms were initially presented in the context of network schedul-
ing, where the concept of task is substituted with the concept of packets "flow". A
network link is shared among different flows, each flow is assigned a weight and the
goal is to allocate the bandwidth of the link to the different flows in a fair manner, so
that each flow receives a share proportional to its weight.
The same algorithms have also been applied to the context of processor scheduling.
One difference between network scheduling and processor scheduling is that in network
scheduling the basic scheduling unit is the packet. In fact, the packet must be transmitted
entirely, and cannot be divided into smaller units. Hence, there is no need for specifying
a "scheduling quantum": the length of the packet is itself the scheduling quantum. The
problem becomes slightly more complex if packets have different lengths.
In Section 3.3, we present some of the most popular fair scheduling algorithms in the
context of processor scheduling.
Aperiodic server algorithms were proposed both for fixed priority scheduling and dy-
namic priority scheduling. In fixed priority scheduling, the main algorithms are the
Polling Server, the Deferrable Server (DS) and the Sporadic Server (SS) [SSL89,
LSS87, SLS95]. In dynamic priority scheduling, the most important algorithms are
the Total Bandwidth Server (TBS) [SB94, SB96] and the Constant Bandwidth Server
(CBS) [AB98a, AB04].
An approach similar to the server algorithms was applied for the first time to soft real-
time multimedia applications by Mercer et al. [MST94a], with the explicit purpose of
providing temporal protection. Later, Rajkumar et al. [RJMO98] introduced the term
"resource reservation" to indicate this class of techniques.
In all the previously cited algorithms (with the exception of the TBS), a server is char-
acterized by a budget $Q$ and a period $P$. The processor share assigned to each server
is $Q/P$. In the original formulation of these algorithms, one server was defined for
the entire system, with the purpose of serving all aperiodic tasks in First-Come-First-
Served (FCFS) order. The behavior of the server is similar to that of a periodic hard
real-time task with a worst-case execution time equal to the assigned budget Q and
a period equal to P. Hence, it is possible to apply the existing real-time scheduling
analysis techniques to check the schedulability of the system.
where S is the number of tasks in the system and $U_{lub}$ is the schedulability utilization
bound, which depends on the adopted scheduling algorithm. Then, each task is guar-
anteed to obtain a budget $Q_i$ every server period $P_i$, regardless of the behavior of the
other tasks in the system.
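An admission test for the "one server per task" configuration can be sketched as below, assuming the condition takes the form $\sum_i Q_i/P_i \le U_{lub}$. The bound used for each policy is an assumption of this example: 1 for EDF and the Liu and Layland bound $n(2^{1/n} - 1)$ for rate monotonic. The function name and the sample reservations are hypothetical.

```python
def admit(reservations, policy="EDF"):
    """Check sum(Q_i / P_i) <= U_lub for a list of (Q, P) reservations."""
    n = len(reservations)
    if n == 0:
        return True
    if policy == "EDF":
        U_lub = 1.0
    else:                                   # rate monotonic, sufficient bound only
        U_lub = n * (2 ** (1.0 / n) - 1)
    return sum(q / p for q, p in reservations) <= U_lub

base = [(2, 10), (3, 15), (10, 40)]         # total bandwidth 0.65
extended = base + [(6, 20)]                 # total bandwidth 0.95

print(admit(base, "EDF"), admit(base, "RM"))
print(admit(extended, "EDF"), admit(extended, "RM"))
```

The fourth reservation is accepted under EDF but rejected by the rate monotonic bound, which illustrates how the admissible load depends on the underlying scheduling algorithm.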
It is important to note that in the configuration "one server per task", the assumption of
periodic or sporadic tasks can be removed. For example, consider a non-real-time non-
interactive task (like for example a complex scientific computation or the compilation
of a large program). By assigning a server with a certain budget and a period to this
task, it will receive a steady and regular allocation of the processor, independently of
the presence of other (real-time or non-real-time) tasks in the system.
Resource reservation techniques will be described in detail in Section 3.5, and the
Constant Bandwidth Server (CBS) [AB98a, AB04] will be presented in Section 3.6.1.
Before continuing the presentation of the different approaches to temporal protection,
it is important to highlight the main differences between fair scheduling and resource
reservation techniques.
The main objective of a fair scheduler is to keep the lag between the task execution
and the ideal fluid allocation as close as possible to zero. For this reason, in processor
scheduling, these algorithms need to introduce the concept of "scheduling quantum"
that is the basic unit of allocation. The smaller the quantum, the smaller the lag bound.
However, a small quantum implies a large number of context switches. Moreover,
once the scheduling quantum has been fixed for the entire system, each task is assigned
one single parameter, the weight (or the share in p-fair schedulers). The "granularity"
of the allocation depends on the scheduling quantum while the share of the processor
depends on the task weight. Therefore, if a task requires a very small granularity, we
must reduce the scheduling quantum, causing a large number of context switches and
more overhead.
Conversely, the goal of a resource reservation algorithm is to keep resource allocation
under control so that a task can meet its timing constraints. To this end, each reservation
is associated with two parameters, the budget Q and the period P. The period of the
reservation represents the granularity of the allocation needed by the corresponding
task, while the rate Q / P represents the share of the processor. Therefore, unlike fair
schedulers, it is possible to select the most appropriate granularity for each task. If a task
requires a very small granularity, its reservation period must be reduced accordingly,
while the other reservations can keep a large period. In the general case, it is possible
to show that the number of context switches produced by a reservation scheduler is
considerably less than the number of context switches produced by a proportional share
scheduler.
Executing each task $\tau_i$ at a constant rate is the essence of the Generalized Processor
Sharing (GPS) approach [PG93, PG94]. In this model, each shared resource needed
by tasks (such as the CPU) is considered as a fluid that can be partitioned among the
applications. Each task instantaneously receives a fraction $f_i(t)$ of the resource at time
$t$, where $f_i(t)$ is defined as the task share. Note that the GPS model can be seen as an
extreme form of a Weighted Round Robin policy.
Since each task consists of one or more requests for shared resources, tasks can block
and unblock, and the $\Gamma(t)$ set can vary with time. Hence, the share $f_i(t)$ is a time
varying quantity. The minimum guaranteed share is defined as the rate
The GPS model describes a task system as a fluid flow system, in which each task $\tau_i$
is modeled as an infinitely divisible fluid, and executes at a minimum rate $F_i$ that is
proportional to a user specified weight $w_i$. For example, Figure 3.6 shows the ideal
schedule of 2 GPS tasks, $\tau_1$ and $\tau_2$, with weights $w_1 = 3$ and $w_2 = 1$. Note that
$\tau_2$ is always active, whereas $\tau_1$ is a periodic task with period $T_1 = 8$ and execution
time $C_1 = 3$. At time $t = 0$, both tasks are active, hence they receive two shares
$f_1(0) = 3/(1+3) = 3/4$ and $f_2(0) = 1/(1+3) = 1/4$. This means that the two
tasks execute simultaneously, and $\tau_1$ executes at 3/4 of the CPU speed, whereas $\tau_2$
executes at 1/4 of the CPU speed. As a result, the first instance of $\tau_1$ finishes at time
$C_1/f_1(0) = 3/(3/4) = 4$, when $\tau_2$ remains the only active task in the system and
receives a share $f_2(4) = 1$. At time 8, $\tau_1$ activates again and the schedule repeats as at
time 0. Note that the schedule represented in Figure 3.6 cannot be realized in practice,
because tasks execute simultaneously.
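The shares of this example follow directly from the weights of the tasks that are active at a given time. The sketch below computes them with exact fractions; the function name `gps_shares` and the dictionary-based representation are illustrative choices.

```python
from fractions import Fraction

def gps_shares(weights, active):
    """Share f_i = w_i / (sum of the weights of the active tasks)."""
    total = sum(weights[i] for i in active)
    return {i: Fraction(weights[i], total) for i in active}

weights = {1: 3, 2: 1}                  # w_1 = 3, w_2 = 1
C1 = 3                                  # execution time of tau_1

shares_at_0 = gps_shares(weights, active=[1, 2])
finish_tau1 = C1 / shares_at_0[1]       # time needed at 3/4 of the CPU speed
shares_at_4 = gps_shares(weights, active=[2])

print(shares_at_0, finish_tau1, shares_at_4)
```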
According to the ideal GPS model, task $\tau_i$ is guaranteed to execute for an amount of
time $s_i(t_1, t_2) \ge (t_2 - t_1) F_i$ in each backlogged interval $[t_1, t_2]$. More precisely, the
where $exec_i(t_1, t_2)$ is the amount of time actually executed by $\tau_i$ in the interval $[t_1, t_2]$.
It can be easily seen that Equation 3.1 is equivalent to $exec_i(t_1, t_2) = s_i(t_1, t_2)$.
$$Lag_i = \max_{t_1, t_2} \{ exec_i(t_1, t_2) - s_i(t_1, t_2) \}.$$
e x e c , ( t l .t s ) -
Proportional Share (PS) scheduling was originally developed for handling network
packets. It provides fairness among different streams by emulating the GPS alloca-
tion model in a real system, where multiple tasks do not run simultaneously on the
same CPU, but are executed using a quantum-based allocation. In other words, in a
Proportional Share scheduler, resources are allocated in discrete time quanta having
maximum size Q: a process acquires a resource at the beginning of a time quantum
and releases the resource at the end of the quantum. To do that, each task $\tau_i$ is divided
into requests $q_i^k$ of size $Q$.
An important property of PS schedulers (that directly derives from the GPS definition)
is that they are work conserving algorithms.
As we will see in the next sections, some algorithms providing temporal protection are
not work conserving (for example, hard reservation algorithms).
In the rest of this section, some of the most important PS scheduling algorithms are
analyzed, showing how they emulate the ideal GPS allocation, and evaluating their
performance in terms of allocation error and lag.
Each quantum request $q_i^k$ is assigned a virtual start time $S(q_i^k)$ and a virtual finish time
$F(q_i^k)$ as follows:
where $r_i^k$ is the time at which request $q_i^k$ is generated and $Q_i^k$ is the request dimension
(required execution time). Since $Q_i^k$ is not known a priori (a task may release the
CPU before the end of the time quantum), it is assumed to be equal to the quantum
size $Q$ (note that the quantum size is the same for all the tasks, hence the $i$ index can
be removed). Tasks' requests are scheduled in order of increasing virtual finish time,
and the definitions presented above guarantee that each request completes before its
virtual finishing time.
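For reference, the virtual start and finish times used by WFQ can be written as follows. This form is an assumption inferred from the numbers of the example of Figure 3.7, not a quotation of the original definitions: $v(t)$ denotes the virtual time, $w_i$ the weight of task $\tau_i$, and $F(q_i^0) = 0$ by convention.

```latex
S(q_i^k) = \max\{\, v(r_i^k),\; F(q_i^{k-1}) \,\}, \qquad
F(q_i^k) = S(q_i^k) + \frac{Q_i^k}{w_i}, \qquad
\frac{dv(t)}{dt} = \frac{1}{\sum_{\tau_j \ \mathrm{active}} w_j}
```

With $w_1 = 3$, $w_2 = 1$ and $Q = 1$, these expressions give $F(q_1^1) = 1/3$ and $F(q_2^1) = 1$ for the first quanta, and a virtual time that advances at rate 1/4 while both tasks are active.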
Figure 3.7 WFQ schedule generated by the task set of Figure 3.6.
Figure 3.7 shows an example of WFQ scheduling, with the same task set presented in
Figure 3.6 and considering a quantum size $Q = 1$. The first quantum begins at time
0, hence its virtual start time is 0 for both tasks. Since the virtual finishing time of
the first quantum is $0 + 1/3 = 1/3$ for task $\tau_1$, and $0 + 1/1 = 1$ for task $\tau_2$, such a
quantum is assigned to $\tau_1$. The virtual start time of the second quantum of task $\tau_1$ is
$\max\{1/4, 1/3\} = 1/3$, hence $F(q_1^2) = 1/3 + 1/3 = 2/3$ and $\tau_1$ is scheduled again.
In the same way, $S(q_1^3) = \max\{1/2, 2/3\} = 2/3$, and $F(q_1^3) = 1$. Since the virtual
finishing time of the two tasks is the same, both $\tau_1$ and $\tau_2$ can be scheduled at time
$t = 2$: let us assume that $\tau_2$ is scheduled. As a result, $S(q_2^2) = \max\{1, 1/4\} = 1$ and
$F(q_2^2) = 2$. Since $F(q_1^3) < F(q_2^2)$, $\tau_1$ is scheduled at time $t = 3$ and finishes its first
instance at time $t = 4$. At this point, the virtual time changes its increase rate to reflect
the fact that $\tau_2$ remains the only active task in the system ($w_2 = 1 \Rightarrow dv(t) = dt$).
As a result, when $\tau_1$ activates again at time $t = 8$, the virtual time $v(8) = 5$ is equal
to the virtual finishing time $F(q_2^5) = 5$ of the latest quantum executed by $\tau_2$. Hence,
the virtual start time of the two competing quanta of $\tau_1$ and $\tau_2$ is the same (5), and the
schedule repeats as at time 0.
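The schedule just described can be replayed with a short script. This is a sketch tailored to this specific example rather than a general WFQ implementation: the virtual time is hardcoded as $v(t) = t/4$ up to $t = 4$ and $v(t) = 1 + (t - 4)$ afterwards, the next request of a task is assumed to be generated when its current quantum ends, and ties on the virtual finish time are resolved in favor of $\tau_2$, as assumed in the text. All function and variable names are illustrative.

```python
from fractions import Fraction

def virtual_time(t):
    """Virtual time of the example: rate 1/4 while both tasks are active (t <= 4)."""
    return Fraction(t, 4) if t <= 4 else Fraction(1) + (t - 4)

def wfq_example(horizon=8, Q=1):
    weights = {1: 3, 2: 1}
    remaining = {1: 3, 2: None}      # quanta left in the instance; None = always active
    # tags[i] = (virtual start, virtual finish) of the pending request of task i
    tags = {i: (Fraction(0), Fraction(Q, w)) for i, w in weights.items()}
    schedule, last_finish = [], {}
    for t in range(0, horizon, Q):
        # Smallest virtual finish time first; ties go to the higher task index.
        i = min(tags, key=lambda j: (tags[j][1], -j))
        schedule.append(i)
        last_finish[i] = tags[i][1]
        if remaining[i] is not None:
            remaining[i] -= 1
        if remaining[i] == 0:
            del tags[i]              # instance completed, no pending request
        else:
            S = max(virtual_time(t + Q), tags[i][1])
            tags[i] = (S, S + Fraction(Q, weights[i]))
    return schedule, last_finish

schedule, last_finish = wfq_example()
print(schedule, last_finish[2], virtual_time(8))
```

The resulting sequence assigns the first two quanta to $\tau_1$, the third to $\tau_2$, the fourth to $\tau_1$, and the remaining four to $\tau_2$, whose last quantum has a virtual finish time equal to the virtual time at $t = 8$.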
The WFQ algorithm is one of the first known PS schedulers, and it is the basis for
all the other PS algorithms. In fact, most of the PS schedulers are just modifications
of WFQ that try to solve some of its problems. Some of the most notable problems
presented by WFQ are:
In general, the main difference among the various PS schedulers consists in the way
they define the virtual time, or in some additional rule that can be used to increase the
fairness in some pathological situations.
SFQ guarantees an allocation error bound of 2 H , , so it is nearly-optimal. Moreover,
SFQ calculates $v(t)$ in a simpler way than WFQ (introducing less overhead)
and does not need the virtual finish time of a request to schedule it, so it does not require
any a priori knowledge of the request execution time ($F(q_i^k)$ can be computed at the
end of the execution of $q_i^k$).
A Proportional Share algorithm schedules the tasks in order to reduce the allocation
error experienced by each of them; to provide some form of real-time execution it is
important to guarantee that $lag_i(t)$ is bounded.
SFQ and WFQ provide an optimal upper bound for the lag ($\max_t\{lag_i(t)\} = Q_i$), but
do not provide an optimal bound for the absolute value of the lag. For example, for
SFQ this bound is $\max_t\{|lag_i(t)|\} = Q_i + f_i \sum_j Q_j$, which depends on the number
of active tasks.
EEVDF defines the virtual time as WFQ and schedules the requests by virtual finish
times (in this case called virtual deadlines), but uses the virtual start time (called virtual
eligible time) to decide whether a task is eligible to be scheduled: if the virtual eligible
time is greater than the actual virtual time, the request is not eligible. Virtual eligible
and finish time are defined as follows:
The minimum theoretical bound guaranteed by EEVDF for the absolute value of the
lag is Q; for this reason, EEVDF is said to be optimal. EEVDF can also schedule
dynamic task sets and can use non uniform quantum sizes, so it can be used in a real
operating system. To the best knowledge of the authors, EEVDF is the only algorithm
that provides a fixed lag bound. If the lag is bounded, real-time execution can be
guaranteed by maintaining the share of each real-time task constant:
$$f_i(t) = \frac{C_i + \max_t\{lag_i(t)\}}{D_i}$$
Some authors [RJMO98] tend to distinguish between hard and soft reservations.
A resource reservation technique for fixed priority scheduling was first presented in
[MST94a]. According to this method, a task $\tau_i$ is first assigned a pair $(Q_i, P_i)$ (denoted
as a CPU capacity reserve) and then it is enabled to execute as a real-time task for $Q_i$
units of time every $P_i$. When the task consumes its reserved quantum $Q_i$, it is blocked
until the next period, if the reservation is hard, or it is scheduled in background as a
non real-time task, if the reservation is soft. At the beginning of the next period, the
task is assigned another time quantum $Q_i$ and it is scheduled as a real-time task until
the budget expires.
In this way, a task is reshaped so that it behaves like a periodic real-time task with
known parameters $(Q_i, P_i)$ and can be properly scheduled by a classical real-time
scheduler. A similar technique is used in computer networks by traffic shapers, such
as the leaky bucket or the token bucket [Tan96].
The action to be taken when a reservation is depleted depends on the reservation type
(hard or soft). In a hard reservation, the task is suspended until the budget is recharged,
and another task can be executed. If all tasks are suspended, the system remains idle
until the first recharging event. Thus, in case of hard reservations, the scheduling
algorithm is said to be "non work-conserving".
In a soft reservation, if the budget is depleted and the task has not yet completed, the
task's priority is downgraded to background priority until the budget is recharged. In
this way, the task can take advantage of unused bandwidth in the system. When all
reservations are soft, the algorithm is work-conserving.
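The difference between the two depletion policies can be made concrete with a small budget-accounting sketch. It assumes integer time units, a single task that executes whenever it is eligible, and a recharge at every multiple of the period; the class `Reserve` and its method `tick` are illustrative names, not part of any kernel interface.

```python
from dataclasses import dataclass


@dataclass
class Reserve:
    """CPU capacity reserve (Q, P); `hard` selects the depletion behaviour."""
    Q: int
    P: int
    hard: bool = True
    budget: int = 0

    def tick(self, t, wants_cpu):
        """Account for the interval [t, t+1) and return how the task is treated."""
        if t % self.P == 0:            # start of a reservation period: recharge
            self.budget = self.Q
        if not wants_cpu:
            return "idle"
        if self.budget > 0:
            self.budget -= 1           # real-time execution consumes budget
            return "real-time"
        # Budget depleted: a hard reservation suspends the task, a soft one
        # lets it compete in background for otherwise unused CPU time.
        return "suspended" if self.hard else "background"


hard = Reserve(Q=2, P=5, hard=True)
soft = Reserve(Q=2, P=5, hard=False)
print([hard.tick(t, True) for t in range(6)])
print([soft.tick(t, True) for t in range(6)])
```

With these parameters both tasks run as real-time tasks in the first two time units and again at time 5; in between, the hard reservation leaves the task suspended while the soft one marks it as background work.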
Figure 3.8 shows how the tasks of Figure 3.2 are scheduled using two hard CPU
reservations $RSV_1$ and $RSV_2$ with $Q_1 = 2$, $P_1 = 3$, $Q_2 = 1$, and $P_2 = 5$, under RM.
The same figure also shows the temporal evolution of the budgets $q_1$ and $q_2$. Since
the reservations are based on RM, $RSV_1$ has priority over $RSV_2$, and task $\tau_1$ starts to
execute. After 2 time units, $\tau_1$ completes and $\tau_2$ starts executing. At time 3, $\tau_2$ has
not completed, but its current budget $q_2 = 0$ and the reservation $RSV_2$ is depleted.
Hence, $\tau_2$ is suspended waiting for its budget to be recharged. As we can see, $\tau_1$ does
not suffer from the overrun of $\tau_2$.
Figure 3.8 Example of CPU Reservations implemented over a fixed priority scheduler.
Figure 3.9 The task set is schedulable by CPU Reservations implemented over EDF.
At the same time, a new period for $RSV_1$ is activated, and budget $q_1$ is recharged
to 2. Hence, $\tau_1$ can execute again and complete its instance after one more unit of
time. Notice that task $\tau_1$ has missed its deadline at time 3. Moreover, since the task
has a period of 3, at time 3 another instance should have been activated. Depending
on the actual implementation of the scheduler and of the task, it may happen that the
task activation at time 3 is skipped or buffered. In Figure 3.8 we assume that the task
activation is buffered. Hence, at time 4 the task resumes executing the next buffered
instance.
Note that, even if the first instance of $\tau_1$ is "too long", the schedule is equivalent to the
one generated by RM for two tasks $\tau_1 = (2, 3)$ and $\tau_2 = (1, 5)$. In other words, the
CPU reservation mechanism provides temporal isolation between the two tasks: since
$\tau_1$ is the one executing "too much", it will miss some deadlines, but $\tau_2$ is not affected.
if an instance of one of the two tasks is activated later, the temporal isolation provided
by the reservation mechanism may be broken. For example, Figure 3.10 shows the
schedule produced when the third instance of $\tau_1$ arrives at time 18 instead of time 16:
the system is idle between time 17 and 18, and task $\tau_2$ (which is behaving correctly)
misses a deadline.
If correctly used, dynamic priorities make it possible to fix this kind of problem and to
better exploit the CPU time, as shown in the next section.
Note that a scheduling deadline is something different from the job deadline $d_{i,j}$, which
in this case is only used for performance monitoring.
The abstract entity that is responsible for assigning a correct scheduling deadline to
each job is called an aperiodic server.
The server assigns each job $\tau_{i,j}$ an absolute time-varying deadline $d_{i,j}$, which can be
dynamically changed. This fact can be modeled by splitting each job $\tau_{i,j}$ into chunks
$H_{i,j,k}$, each having a fixed scheduling deadline $d_{i,j,k}$.
- A CBS $S$ is characterized by a budget $q^s$ and by an ordered pair $(Q^s, P^s)$, where
  $Q^s$ is the server maximum budget and $P^s$ is the server period. The ratio $U^s =
  Q^s/P^s$ is denoted as the server bandwidth. At each instant, a fixed deadline $d^s_k$
  is associated with the server. At the beginning $d^s_0 = 0$.
- Each served job $\tau_{i,j}$ is assigned a dynamic deadline $d_{i,j}$ equal to the current server
  deadline $d^s_k$.
- Whenever a served job $\tau_{i,j}$ executes, the budget $q^s$ of the server $S$ serving $\tau_i$ is
  decreased by the same amount.
- When $q^s = 0$, the server budget is recharged at the maximum value $Q^s$ and a new
  server deadline is generated as $d^s_{k+1} = d^s_k + P^s$. Notice that there are no finite
  intervals of time in which the budget is equal to zero.
- A CBS is said to be active at time $t$ if there are pending jobs (remember the budget
  $q^s$ is always greater than 0); that is, if there exists a served job $\tau_{i,j}$ such that
  $r_{i,j} \le t < f_{i,j}$. A CBS is said to be idle at time $t$ if it is not active.
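- When a job $\tau_{i,j}$ arrives and the server is active, the request is enqueued in a queue
  of pending jobs according to a given (arbitrary) non-preemptive discipline (e.g.,
  FIFO, shortest execution time first, or earliest deadline first, if tasks have soft
  deadlines).
- When a job $\tau_{i,j}$ arrives and the server is idle, if $q^s \ge (d^s_k - r_{i,j}) U^s$ the server
  generates a new deadline $d^s_{k+1} = r_{i,j} + P^s$ and $q^s$ is recharged at the maximum
  value $Q^s$; otherwise the job is served with the last server deadline $d^s_k$ using the
  current budget.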
job arrives when the server is active, so the request is enqueued. When the first job
finishes, the second job is served with the current server deadline ($d^s_1 = 16$). At time
$t_2 = 12$, the server budget is exhausted, so a new server deadline $d^s_2 = d^s_1 + P^s = 23$
is generated and $q^s$ is replenished to $Q^s$. The third job arrives at time 17, when the
server is idle and $q^s = 1 < (d^s_2 - r_3) U^s = (23 - 17) \cdot 2/7 = 1.71$, so it is scheduled with
In Figure 3.12, a hard periodic task is scheduled together with a soft task $\tau_2$, having
fixed inter-arrival time ($T_2 = 7$) and variable computation time, with a mean value equal
Figure 3.12 Example of CBS serving a task with variable execution time and constant
inter-arrival time.
As we can see from Figure 3.12, the second job of task $\tau_2$ is first assigned a deadline
$d^s_2 = r_2 + P^s = 14$. At time $t_2 = 12$, however, since $q^s$ is exhausted and the job
is not finished, the job is scheduled with a new deadline $d^s_3 = d^s_2 + P^s = 21$. As a
result of a longer execution, only the soft task is delayed, while the hard task meets
all its deadlines. Moreover, the exceeding portion of the late job is not executed in
background, but is scheduled with a suitable dynamic priority.
Finally, Figure 3.14 shows how the tasks presented in Figure 3.10 are scheduled by a
CBS. Since the CBS assigns a correct deadline to the instance arriving late (the third
instance of $\tau_1$), $\tau_2$ does not miss any deadline, and temporal protection is preserved.
Figure 3.13 Example of CBS serving a task with constant execution time and variable
inter-arrival time.
To prove the theorem, we show that a CBS with parameters $(Q^s, P^s)$ cannot occupy
a bandwidth greater than $U^s = Q^s/P^s$. That is, the processor demand $g_s(t_1, t_2)$ (see
Chapter 1) of the CBS in the interval $[t_1, t_2]$ is less than or equal to $(t_2 - t_1) Q^s/P^s$.
We recall that, under a CBS, a job $\tau_{i,j}$ is assigned an absolute time-varying deadline
$d_{i,j}$, which can be postponed if the task requires more than the reserved bandwidth.
Thus, each job $\tau_{i,j}$ is composed of a number of chunks $H_{i,j,k}$, each characterized by
a release time $a_{i,j,k}$ and a fixed deadline $d_{i,j,k}$. To simplify the notation, we indicate
all the chunks generated by a server with an increasing index $k$. The release time
and the deadline of the $k$th chunk generated by the server will be denoted by $a_k$ and
$d_k$, respectively. Using this notation, the CBS algorithm can be formally described as
illustrated in Figure 3.15.
If $e_k$ denotes the server time demanded in the interval $[a_k, d_k]$ (that is, the execution
time of chunk $H_k$), we can say that
If $q(t)$ is the server budget at time $t$ and $f_k$ is the time at which chunk $H_k$ finishes
executing, we can see that $q(f_k) = q(a_k) - e_k$, while $q(a_{k+1})$ is calculated from $q(f_k)$
in the following manner:
$$q(a_{k+1}) = \begin{cases} q(f_k) & \text{if } d_{k+1} \text{ was generated by Rule 2} \\ Q^s & \text{if } d_{k+1} \text{ was generated by Rule 1 or 3.} \end{cases}$$
Using these observations, the theorem can be proved by showing that:
$$g_s(a_{k_1}, d_{k_2}) + q(f_{k_2}) \le (d_{k_2} - a_{k_1}) U^s.$$
When job τ_j arrives at time r_j
    enqueue the request in the server queue;
    n = n + 1;
    if (n == 1)  /* (the server is idle) */
        if (r_j + (c / Q) * P >= d_k)
            /*---------------Rule 1---------------*/
            k = k + 1;
            a_k = r_j;
            d_k = a_k + P;
            c = Q;
        else
            /*---------------Rule 2---------------*/
            k = k + 1;
            a_k = r_j;
            d_k = d_{k-1};
            /* c remains unchanged */

When job τ_j terminates
    dequeue τ_j from the server queue;
    n = n - 1;
    if (n != 0) serve the next job in the queue with deadline d_k;

When job τ_j executes for a time unit
    c = c - 1;

When (c == 0)
    /*---------------Rule 3---------------*/
    k = k + 1;
    a_k = actual_time();
    d_k = d_{k-1} + P;
    c = Q;
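The three rules of Figure 3.15 can also be exercised as a small executable sketch. The Python class below only tracks the server budget, the server deadline, and the number of pending jobs; it does not model the job queue or the EDF scheduler that runs the server. The class and method names are illustrative, and exact fractions are used so that the test of Rule 1 is not affected by rounding.

```python
from fractions import Fraction


class CBS:
    """Deadline and budget bookkeeping of a single CBS (Rules 1-3)."""

    def __init__(self, Q, P):
        self.Q = Fraction(Q)     # maximum budget
        self.P = Fraction(P)     # server period
        self.c = Fraction(0)     # current budget
        self.d = Fraction(0)     # current server deadline, initially 0
        self.n = 0               # jobs pending in the server queue

    def job_arrival(self, r):
        """A job arrives at time r; return the current server deadline."""
        self.n += 1
        if self.n == 1:                                   # server was idle
            if r + (self.c / self.Q) * self.P >= self.d:
                self.d = r + self.P                       # Rule 1
                self.c = self.Q
            # Rule 2: otherwise keep the current deadline and budget
        return self.d

    def execute(self, amount):
        """The served job runs for `amount` time units."""
        amount = Fraction(amount)
        while amount > 0:
            step = min(amount, self.c)
            self.c -= step
            amount -= step
            if self.c == 0:                               # Rule 3
                self.d += self.P
                self.c = self.Q

    def job_termination(self):
        self.n -= 1


server = CBS(Q=2, P=7)
print(server.job_arrival(0))     # Rule 1: prints 7
server.execute(3)                # budget runs out after 2 units: Rule 3
print(server.d, server.c)        # prints: 14 1
```

Note that the budget is never left at zero: as soon as it is exhausted, Rule 3 postpones the deadline by one period and recharges it, which is what keeps the server eligible at a lower priority instead of suspending it.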
Inductive base. If in $[t_1, t_2]$ there is only one active chunk ($k_1 = k_2 = k$), two cases
have to be considered.
Case b: $d_k = a_k + P^s$
If $d_k = a_k + P^s$, then $g_s(a_k, d_k) + q(f_k) = e_k + q(f_k) = Q^s$. Hence, in both cases,
we have:
$$g_s(a_{k_1}, d_{k_2-1}) + q(f_{k_2-1}) \le (d_{k_2-1} - a_{k_1}) \frac{Q^s}{P^s}$$
Given the possible relations between $d_k$ and $d_{k-1}$, three cases have to be considered:
- $d_k \ge d_{k-1} + P^s$. That is, $d_k$ is generated by Rule 3 or Rule 1 when $r_j \ge d_{k-1}$.
- $d_k = d_{k-1}$. That is, $d_k$ is generated by Rule 2.
- $d_{k-1} < d_k < d_{k-1} + P^s$. That is, $d_k$ is generated by Rule 1 when $r_j < d_{k-1}$.
$$\sum_{k=k_1}^{k_2} e_k + q(f_{k_2}) \le (d_{k_2-1} - a_{k_1}) \frac{Q^s}{P^s} - q(f_{k_2-1}) + Q^s \le (d_{k_2} - a_{k_1}) \frac{Q^s}{P^s}$$
and finally
If $d_{k_2} = d_{k_2-1}$, then $d_{k_2}$ is generated by Rule 2. In this case,
hence
hence
The isolation property allows us to use a bandwidth reservation strategy to allocate a
fraction of the CPU time to each task that cannot be guaranteed a priori. The most
important consequence of this result is that soft tasks can be scheduled together with
hard tasks without affecting the a priori guarantee even in the case in which soft requests
exceed the expected load.
In addition to the isolation property, the CBS has the following characteristics:
- No assumptions are required on the WCET and the minimum inter-arrival time
  of the served tasks: this allows the same program to be used on different systems
  without recalculating the computation times. This property allows decoupling the
  task model from the scheduling parameters.
- If the task's parameters are known in advance, a hard real-time guarantee can be
  performed (see Section 3.7).
- The CBS automatically reclaims any spare time caused by early completions
  or late arrivals. This is due to the fact that whenever the budget is exhausted,
  it is always immediately replenished at its full value and the server deadline is
  postponed. In this way, the server remains eligible and the budget can be exploited
  by the pending requests with the current deadline.
- Knowing the statistical distribution of the computation time of a task served by
  a CBS, it is possible to perform a statistical guarantee, expressed in terms of
  probability for each served job to meet its deadline (see Section 9.5).
In this section we briefly recall some possible parameters assignment policies; note
that, although most of the presented results are applied to the CBS algorithm (because
they were originally developed for the CBS), they can be extended to other reservation
policies.
The first (and simplest) usage of a reservation algorithm is to use it for serving aperiodic
tasks so that they do not interfere with the hard real-time activities. This is the approach
followed in all the works on aperiodic servers [SSL89, LSS87, LRT92, TL92, SLS95,
SB94, SB96, GB95].
Obviously, a single CBS can be used to serve all the soft real-time tasks, but in this
case it might be very difficult to provide soft real-time guarantees. The best way to
provide some kind of performance guarantee to soft real-time tasks is to serve each
task with a dedicated CBS (or CPU reservation). In this way, it is possible to guarantee
that each task is periodically assigned a given amount of time; if the task parameters
are not known a priori this is the only performance guarantee that can be provided, but
if some information is known about the task, more complex guarantee strategies can
be used.
Finally, a dedicated server can also be used to schedule hard real-time tasks, which can
be guaranteed thanks to the hard schedulability property, expressed by the following
lemma:
Proof.
For any job of task $\tau_i$, $r_{i,j+1} - r_{i,j} \ge T_i \ge P^s$ and $c_{i,j} \le C_i \le Q^s$. Hence, by
definition of the CBS, each job $J_{i,j}$ is assigned a scheduling deadline $d^s_{i,j} = r_{i,j} + P^s$
(since $r_{i,j}$ is always greater than or equal to $d^s_{i,j-1}$) and it is scheduled with a budget
$Q^s \ge C_i$. Moreover, since $c_{i,j} \le Q^s$, each job finishes no later than the budget is exhausted,
hence the deadline assigned to a job does not change and is exactly the same as the one
used by EDF. □
All the policies described above can be used off-line for assigning reservation
parameters during the system design phase, when task parameters are known a priori. But,
as explained in Chapter 1, such a priori information is often not available and static
allocation techniques cannot be used. In this case, it is possible to dynamically change
the reservation parameters as explained in Chapter 8.
The predictability of the kernel is increased by using eager evaluation policies (as opposed
to the lazy evaluation policies used by standard Mach) and by substituting the FIFO
queues contained in the kernel with priority queues (where the priorities are derived
from the tasks' temporal constraints). As an example of lazy evaluation policy used in
standard Mach, when a task dynamically allocates some memory, the kernel really
gives it to the task only when the task accesses the allocated memory. Such a "lazy
allocation" enhances the kernel efficiency and enables some optimizations
such as copy-on-write, but increases the unpredictability of the system. Hence,
RT-Mach modifies this behavior by immediately allocating the memory; other similar
optimizations present in the Mach μkernel have been removed in RT-Mach for similar
reasons. The real-time threading library coming with RT-Mach implements the periodic
and sporadic thread models, enabling the user to express the WCET and the period (or
the minimum interarrival time) for each thread. In this way, RT-Mach can perform
the admission control and correctly schedule the threads using a Rate Monotonic (or
Deadline Monotonic) scheduler. Finally, the real-time communication mechanism
uses priority inheritance [SRL90] to bound the waiting times.
CPU reservations were added to RT-Mach by Mercer and others [MST94a] to support
multimedia applications. In particular, the authors realized the lack of temporal
protection presented by the priority-based RT-Mach scheduler (similar to the problem
shown in Section 3.1), and implemented a CPU reservation mechanism based on the
Rate Monotonic algorithm. This was done by enhancing the RT-Mach time accounting
mechanism to exactly measure the execution time used by each thread (and keeping
track of the reservation budget) and by implementing an enforcement mechanism. The
enforcement mechanism downgrades a thread to non real-time when it consumes all
its reserved time (the thread will be promoted again to real-time priority at the
beginning of the next reservation period). The authors argued that, to compensate for some
approximations in accounting and enforcement, a fraction of the CPU time must be left
unreserved, and they estimated this fraction at about 5-10%. Since in realistic
situations the RM utilization bound is about 88% [LSD89], the authors claim that basing
the reservation mechanism on EDF would not give any appreciable advantage with respect
to RM, and thus they adopted the RM scheduler provided by RT-Mach as a basis for
their CPU capacity reserves.
Nowadays, using modern hardware and OS kernels, the overhead for accounting and
enforcement is negligible, hence there is no longer any reason to compensate for it. As a
consequence, basing the reservation mechanism on EDF can be a realistic choice.
For example, Rialto is a research system developed by Microsoft [JIF+96] that allows
mixing CPU reservations and other kinds of timing constraints. Rialto was designed to
combine timesharing and soft real-time in a desktop operating system, and thus uses
CPU reservations to isolate the different applications. The execution time is reserved to
activities and monitored at runtime. Activities can be composed of several threads, and
threads belonging to the same activity share its reserved time in a round-robin fashion.
Another difference between Rialto CPU reservations and traditional ones is that in
Rialto reservations are continuously guaranteed. That is to say, if an activity has a
$(Q, T)$ reservation, then for every time $t$ the activity will run for at least $Q$ units of time
in the interval $(t, t + T)$. This result is impossible to obtain using a priority scheduler,
and in fact Rialto uses a table-driven schedule that is computed when a reservation is
created and is repeated over time.
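The continuous guarantee can be checked mechanically on a cyclic table. The sketch below is not Rialto's algorithm: it assumes a table of unit-length slots (1 when the slot belongs to the activity), integer Q and T expressed in slots, and it inspects only slot-aligned windows. The function name `continuously_guaranteed` is illustrative.

```python
def continuously_guaranteed(table, Q, T):
    """True if every window of T consecutive slots of the cyclic table
    contains at least Q slots assigned to the activity."""
    length = len(table)
    return all(
        sum(table[(t + k) % length] for k in range(T)) >= Q
        for t in range(length)
    )


evenly_spread = [1, 0, 0, 1, 0, 0]   # one slot at the start of every 3 slots
period_based = [1, 0, 0, 0, 0, 1]    # one slot per period [0,3) and [3,6)

print(continuously_guaranteed(evenly_spread, Q=1, T=3))   # True
print(continuously_guaranteed(period_based, Q=1, T=3))    # False
```

The second table grants one slot in each period of length 3, as a periodic reservation would, yet the window made of slots 1 to 3 receives nothing. This is the kind of placement that a continuous guarantee rules out.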
Moreover, Rialto provides time constraints: a time constraint is a tuple $(s, c, b)$,
indicating that a thread requires to execute for a time $c$, starting at time $s$, and terminating
before $b$. Based on the thread's activity reserved time on the static schedule, and on the
available spare time, Rialto can guarantee the time constraint or reject it. If the time
constraint is accepted, the activity's threads are scheduled so that it is respected (the
scheduling algorithm used inside the activity is based on EDF).
The kernel was later extended to support multimedia applications through the CBS,
which was explicitly designed to efficiently schedule periodic and aperiodic soft tasks
with unknown execution times [AB00]. Nowadays, the CBS can be used in HARTIK
to schedule both hard and soft real-time tasks, or to reserve a fixed fraction of the CPU
bandwidth to non real-time tasks to prevent starvation. Moreover, the CBS is used to
schedule all the drivers' tasks so that it is not necessary to adjust the drivers' WCET
estimation on every new machine the first time a driver runs on it.
Another real-time kernel developed at the Retis Lab of Scuola Superiore S. Anna
of Pisa is SHaRK [GAGB01]. SHaRK is an evolution of HARTIK and has been
designed to easily implement new scheduling algorithms in the kernel as scheduling
modules. The CBS is still provided as one of the standard scheduling modules, and
other reservation mechanisms can be easily added, hence SHaRK provides full support
for CPU reservations.
A resource kernel is based on the Resource Set abstraction, which describes all the
resources that can be used by one or more tasks. A resource set may include multiple
reservation types (for example, a CPU reservation, a network reservation, and a disk
reservation), and all the tasks attached to the resource set will be allowed to use those
reservations. Hence, in order to be guaranteed to execute in a proper timely fashion, a
task must create a resource set, create the proper resource reservations expressing its
requirements, connect them to the resource set, and then attach itself to the resource
set.
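The four steps just listed (create a resource set, create the reservations, connect them, attach the task) can be pictured with a toy data model. The sketch below is purely illustrative: the classes `Reservation` and `ResourceSet` and their methods are hypothetical names and do not reproduce the interface of any actual resource kernel.

```python
from dataclasses import dataclass, field


@dataclass
class Reservation:
    resource: str       # hypothetical resource label, e.g. "cpu" or "disk"
    budget: float       # amount of the resource reserved in every period
    period: float


@dataclass
class ResourceSet:
    reservations: dict = field(default_factory=dict)
    tasks: set = field(default_factory=set)

    def connect(self, reservation):
        """Connect a reservation to this resource set."""
        self.reservations[reservation.resource] = reservation

    def attach(self, task_id):
        """Attach a task so that it may use every connected reservation."""
        self.tasks.add(task_id)

    def may_use(self, task_id, resource):
        return task_id in self.tasks and resource in self.reservations


rset = ResourceSet()                                             # 1. create the set
cpu = Reservation(resource="cpu", budget=2.0, period=10.0)       # 2. create reservations
net = Reservation(resource="net", budget=1.0, period=20.0)
rset.connect(cpu)                                                # 3. connect them
rset.connect(net)
rset.attach("player")                                            # 4. attach the task
print(rset.may_use("player", "cpu"), rset.may_use("player", "disk"))
```

Keeping the reservations in the set rather than in the task is what lets several tasks share the same reserved resources simply by attaching to the same set.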
Computers are powerful enough to run several applications at the same time, each
consisting of multiple concurrent activities. In this chapter we consider the problem of
supporting multiple real-time applications in the same computing system, so that each
application can be handled by its own scheduling policy and analyzed independently
of the others.
A process can be multi-threaded, that is, it can consist of several concurrent threads.
Different threads belonging to the same process share address space, file descriptors,
and other resources. Since threads belonging to the same process share the address
space, the communication is often realized by means of shared data structures protected
by mutexes. Creating a new thread is far less expensive than creating a new process.
Context switching among threads of the same process is faster.
The thread model is supported by all general purpose operating systems because it has
a lot of advantages with respect to a pure process model. The designer of a concur-
rent application, in fact, can structure the application as a set of cooperating threads,
simplifying the communication and reducing the overhead of the implementation.
When designing a concurrent application, in which tasks have to cooperate tightly and
efficiently, the thread model is the best suited. As an example, consider a web server
that can serve many clients at the same time. We can structure the program as one main
thread that waits for new connections, and one active thread for each client. Another
example is an MPEG player that plays streams coming from the network: a typical
design structure for this application consists of a thread that waits for new data from the
network and writes them into a buffer; a second thread that periodically reads the video
frames and the audio data from the buffer, decodes and displays them; and a third
thread that waits for user commands. All three threads interact tightly: therefore,
communication and scheduling must be fast and efficient.
Classical hard real-time systems usually consist of periodic or sporadic tasks that tightly
cooperate to fulfill the system goal. For efficiency reasons, they communicate mainly
through shared memory, and appropriate synchronization mechanisms are used to
regulate the access to shared data. Since all tasks in the system are designed to cooperate, a
global schedulability analysis is done on the whole system to guarantee that the
temporal constraints will be respected. There is no need to protect one subset of tasks from the
others. Therefore, we can assimilate a hard real-time system to a single multi-threaded
process where the real-time tasks are modelled by threads.
In general purpose operating systems, many processes are active at the same time, and
they compete for the processor and other hardware resources. Therefore, two important
goals of any general purpose operating system are to regulate the competition among
the processes, allowing a fair share of the system resources to every process, and to
protect each process from the interferences of the others.
- Processes are developed independently. They must be protected from each other
  to prevent reciprocal interference. If one process fails, the other processes must not
  be affected. They compete for the system resources, so the global scheduler has
  to regulate such a competition to ensure fairness.
- Threads are designed and developed together. They cooperate for producing the
  application's results. If one thread fails, the whole application may fail.
the multi-thread model, in this chapter we assume that real-time tasks are implemented
as threads, and a classical real-time application as one single multi-threaded process.
Therefore, a real-time application is a process that can be multi-threaded, that is, it can
consist of many real-time tasks. In the remainder of this chapter, the terms thread and
task will be used as synonyms, as will the terms application and process.
Such properties can be very well supported through any resource reservation
mechanism, such as the ones described in Chapter 3. However, in the case of multi-thread
systems, the resource reservation mechanism must be applied not at the task level, but
at the application level. This poses many problems, as we will see in the next sections.
Therefore, in this model, we distinguish two levels of scheduling. At the higher level, a
global scheduler selects the application to be executed on the processor and, at a lower
level, a local scheduler selects the task to be executed for each application. Such a
two-level scheduling scheme has two main advantages:
- each application can use the scheduler that best fits its needs;
- legacy applications, designed for a particular scheduler, can be re-used by simply
  re-compiling, or at most, with some simple modification.
Figure 4.1 A multi-thread operating system: each application (or process) consists of one
or more threads. In addition, each application can have its own scheduling algorithm.
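The two-level scheme can be sketched in a few lines of Python. The code below is a structural illustration only: it assumes that the global level picks the ready application whose reservation has the earliest server deadline (one possible choice, not prescribed by the text), and that each application supplies its own local policy as a function. All names (`Application`, `dispatch`, `edf`, `fixed_priority`) are illustrative.

```python
def edf(tasks):
    """Local policy: pick the task with the earliest absolute deadline."""
    return min(tasks, key=lambda task: task["deadline"])


def fixed_priority(tasks):
    """Local policy: pick the task with the smallest priority number."""
    return min(tasks, key=lambda task: task["priority"])


class Application:
    """A multi-threaded process with its own local scheduler."""

    def __init__(self, name, server_deadline, local_scheduler, tasks):
        self.name = name
        self.server_deadline = server_deadline   # used by the global level
        self.local_scheduler = local_scheduler
        self.tasks = tasks                       # ready tasks only


def dispatch(applications):
    """Return the (application, task) pair chosen by the two levels."""
    ready = [app for app in applications if app.tasks]
    if not ready:
        return None
    # Global level (assumption): earliest server deadline first.
    app = min(ready, key=lambda a: a.server_deadline)
    # Local level: the application's own policy selects the task.
    task = app.local_scheduler(app.tasks)
    return app.name, task["name"]


apps = [
    Application("A", 12, fixed_priority,
                [{"name": "a1", "priority": 2}, {"name": "a2", "priority": 1}]),
    Application("B", 9, edf,
                [{"name": "b1", "deadline": 30}, {"name": "b2", "deadline": 20}]),
    Application("C", 5, edf, []),    # no ready task: skipped by the global level
]
print(dispatch(apps))                # prints: ('B', 'b2')
```

Because the global level never looks inside an application, a legacy application only needs to bring its own policy function along, which mirrors the second advantage listed above.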
- Aperiodic activities, such as the one triggered by the crank angle sensor. In this
  task, the time between two consecutive activations is variable and depends on
  the rotations per minute of the engine. Therefore, the activity is triggered by an
  interrupt at a variable rate. Probably, the best way to schedule such an activity is
  by an on-line algorithm, under fixed or dynamic priorities.
- Periodic activities. Many activities in a control system are periodic, and must
  be activated by a timer. For example, in automotive applications, most systems
  are connected to other electronic boards in the car through a Time Triggered
  network, which requires precise timing. Therefore, this second set of activities
  is best handled with an off-line table-driven scheduler, executed according to a
  time-triggered paradigm [KDK+89].
If we are forced to use a single scheduling paradigm for the whole system, we must
either reduce the two sets of activities to periodic tasks, scheduling them by a time-
triggered paradigm, or re-program the second set of activities to be handled by an
on-line priority scheduler. Both solutions require some extra programming effort and
are not optimal in terms of resource usage. The best choice would be to program the two
sets of activities as two distinct components, each one handled by its own scheduler,
and then integrate the two components in the same system.
A similar problem arises when dealing with real-time systems with different criticality,
because different scheduling paradigms are used for hard and soft real-time activities.
Again, a way of composing such activities would be to implement them as different
components, each one handled by its own scheduling algorithm.
If hard and soft real-time applications are mixed in the same system we have an ad-
ditional problem: if the system does not provide temporal protection, a non-critical
application could execute longer than expected and starve all other applications in the
system.
Now the question is: what is the minimum fraction of processor that must be reserved
to an application in order to guarantee its temporal requirements? An intuitive solution
would be to assign each application an amount of resource equal to the utilization of
the application tasks. For example, if the application tasks have a total maximum load
of 0.3, we can assign 30% of the processor to the application. Indeed, if we use a
scheduler that provides a perfect abstraction of a virtual dedicated processor whose
speed is a fraction of the shared processor speed, then this approach is correct. One
such mechanism is the GPS (Generalized Processor Sharing) policy, described in Section
3.3.
Unfortunately, this solution is not feasible, as the GPS cannot be implemented in prac-
tice. All schedulers that can be implemented can only provide "imperfect" abstractions
of the virtual processor. Any scheduler that supports temporal protection (see Chapter
3) provides at least one parameter that specifies the "granularity" of the allocation. For
example, in Proportional Share algorithms (like EEVDF, described in Section 3.4.3)
we must specify the system quantum; with the Constant Bandwidth Server (CBS) de-
scribed in Section 3.6.1, we must specify the server period. The smaller the granularity,
the closer the allocation of the resource to that of the ideal GPS algorithm. However, a
small granularity implies a large system overhead. For example, in Proportional Share
algorithms we have exactly one context switch at every quantum boundary. With CBS,
small periods imply a large number of deadline recalculations and queue re-orderings.
Thus, to contain runtime overhead, the "granularity" should not be too small.
On the other hand, a coarse granularity of the allocation could lead to infeasible
schedules. An example of such a problem is illustrated in Figure 4.2. In this example,
the system consists of two applications $A_1$ and $A_2$. Application $A_1$ comprises two
sporadic tasks: $\tau_1$ with computation time $C_1 = 1$ and minimum interarrival time
$T_1 = 15$, and $\tau_2$ with computation time $C_2 = 1$ and minimum interarrival time $T_2 = 5$.
The total utilization of application $A_1$ is $U = 0.2$. Application $A_2$ is non real-time
and consists of a single job, that arrives at time 0 and requires a very large amount
of execution time (for example, it could be a scientific program performing a long
computation).
Figure 4.2 Problems with the time granularity of the resource allocation.
When task $\tau_1$ arrives at time 0, the server $S_1$ is activated and, having the shortest
deadline, it is selected to execute. When task $\tau_2$ arrives at time 1, the server budget is
already depleted. Therefore, the first instant at which $\tau_2$ can be scheduled is t = 7.1,
after the task's deadline.
Although the example is based on the CBS algorithm, the same thing happens with all
resource reservation mechanisms presented so far.
There are two ways to make application $A_1$ schedulable: assigning the server a
larger share of the processor or assigning it a smaller period. For example, if all arrival
times are integer numbers, by assigning $S_1$ a period of $P_1 = 1$, the above system
becomes schedulable. The resulting schedule is shown in Figure 4.3. Note that there
are a large number of context switches.
Another way to make the system schedulable is to increase the budget of application
$A_1$. If we let $P_1 = 5$, we can make the system schedulable by raising the budget to
$Q_1 = 2$. The resulting schedule is shown in Figure 4.4, and there are definitely fewer
context switches. However, $A_1$ was assigned a bandwidth twice as large as before,
"wasting" processor resources that could be assigned to other applications. Deciding
which approach is better clearly depends on the overhead introduced by the algorithms
and on the context switch time.
[Figures 4.3 and 4.4: schedules of the example, with a time axis from 0 to 14.]
Which global scheduling algorithm can be used to allocate the processor to the
different applications?
Given the global scheduling algorithm and the application timing requirements,
how to allocate the processor share to the applications so that their requirements
are satisfied?
Given the global scheduling algorithm and the processor share assigned to the
applications, how can the schedulability be analyzed?
Some solutions to the problems stated above are presented in the next sections.
The approach followed by these two algorithms is similar and differs only in the way the
system scheduler is implemented. In both cases, the system scheduler takes advantage
of the knowledge that tasks have about the budget assigned to each application. In
Deng and Liu's algorithm, the global system scheduler requires the knowledge of the
execution time of each task instance, whereas in the BSS the global scheduler requires
the knowledge of the deadlines of the application tasks.
Other approaches, which do not require any information about the behavior of the
applications, will be discussed in Section 4.3.
Preemptive predictable applications. They are applications in which all the schedul-
ing events are known in advance, as for the case of applications consisting of
periodic real-time tasks.
Given this model, the authors proposed a scheme involving an on-line acceptance test
and a dynamic on-line scheduler. When an application $A_i$ wants to enter the system,
it must declare its quality of service requirements in terms of desired utilization $U_i$.
If there is enough free bandwidth to satisfy the new requirements, the application is
accepted; otherwise it is rejected.
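The acceptance test described above amounts to a bandwidth check. The following is a minimal sketch, assuming that the processor offers a total bandwidth of 1 and that each accepted application is represented only by its declared utilization; the function name `admit` and the rounding tolerance are illustrative and not part of the original scheme.

```python
def admit(requested_u, accepted_us, capacity=1.0, eps=1e-9):
    """Return True if an application asking for utilization `requested_u`
    fits in the bandwidth left free by the already accepted applications.

    `capacity` and `eps` are assumptions of this sketch: the processor is
    taken to offer a total bandwidth of 1, and a small tolerance absorbs
    floating-point rounding.
    """
    free = capacity - sum(accepted_us)
    return requested_u <= free + eps


accepted = [0.5, 0.3]           # utilizations of applications already in the system
print(admit(0.2, accepted))     # the request fits exactly in the remaining bandwidth
print(admit(0.25, accepted))    # the request exceeds the free bandwidth
```

A real system would also have to release the bandwidth when an application leaves; that bookkeeping is omitted here.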
The server dedicated to each application is the Constant Utilization Server (CUS)
[DLS97], which is a variant of the Total Bandwidth Server (TBS) [SB96] proposed
by Spuri and Buttazzo. Since the behavior of the CUS server differs for the three
application categories, its details will be described in the corresponding sections.
Multi-thread Applications
NON-PREEMPTIVE APPLICATIONS
The CUS algorithm updates the server variables according to the following rules:
1. If a task of application $A_i$ arrives at time $t$, requiring a computation time $c_i$:
• If the local scheduler queue is empty (i.e., no other task in the application is
active and the application is currently idle), and the server is eligible, then
$q_i \leftarrow c_i$ and $d_i \leftarrow \max(t, d_i) + c_i / U_i$. Moreover, the server is inserted in
the global EDF queue. Notice that the computation time $c_i$ of the task is
required to compute the server deadline.
• If the local scheduler queue is non-empty or the server is not eligible, the
task is inserted in the ready queue of the local scheduler until it becomes the
first in the queue and the server is eligible.
2. If a server is selected for execution by the global scheduler (because it is the server
with the earliest deadline), it starts executing the corresponding task and decreases
the budget q, accordingly.
3. A server is allowed to execute until its budget is equal to 0 or until the task finishes.
If the task finishes before the budget is depleted, the server eligibility time $e_i$ is
set to $d_i$ and the server becomes non eligible. If the budget is depleted before
the task finishes executing, an exception is raised to the application. What to do
in this case is left unspecified: taking the most appropriate action is the
responsibility of the application.
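The budget and deadline rules above can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, the server is assumed to become eligible again once the current time reaches its eligibility time, and the bandwidth of 0.5 used in the usage lines is inferred from the numbers of the example that follows (a deadline of 2 for a computation time of 1, and a deadline of 8 for a computation time of 4).

```python
from dataclasses import dataclass


@dataclass
class CusServer:
    U: float          # bandwidth assigned to the application
    q: float = 0.0    # current budget
    d: float = 0.0    # current server deadline
    e: float = 0.0    # eligibility time (assumption: eligible when t >= e)

    def is_eligible(self, t: float) -> bool:
        return t >= self.e

    def serve_arrival(self, t: float, c: float) -> None:
        """Rule 1, first case: the local queue is empty and the server is eligible."""
        if not self.is_eligible(t):
            raise ValueError("server not eligible: the task waits in the local queue")
        self.q = c                              # budget equals the declared computation time
        self.d = max(t, self.d) + c / self.U    # deadline of the slower dedicated processor

    def task_finished(self) -> None:
        """Rule 3: the task completed before the budget was depleted."""
        self.e = self.d                         # not eligible again before the old deadline


s1 = CusServer(U=0.5)
s2 = CusServer(U=0.5)
s1.serve_arrival(t=0, c=1)
s2.serve_arrival(t=0, c=4)
print(s1.q, s1.d)   # budget and deadline of the first server
print(s2.q, s2.d)   # budget and deadline of the second server
```

The global EDF queue and the budget accounting during execution (rule 2) are left out to keep the sketch short.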
At time $t = 0$, all tasks are ready to execute. The local scheduler of application $A_1$
chooses $\tau_1$ to be executed. Since $c_1 = 1$, the server deadline is set to $d_1 = 2$ and
the budget to $q_1 = 1$. Similarly, the deadline of server $S_2$ is set to $d_2 = 4/U_2 = 8$
and $q_2 = 4$. Hence, the global EDF scheduler selects server $S_1$ to be executed.
Figure 4.5 Example of schedule of a non preemptive application $A_1$ by the CUS algorithm:
a) schedule of $A_1$ on a dedicated processor; b) schedule generated by CUS on the shared
processor.
Notice that the deadline of server $S_1$ is always less than or equal to the deadline of
all the tasks that are currently being executed. Actually, by comparing Figures 4.5a
and 4.5b, it is possible to see that the server deadline is always equal to the finishing
time of the task currently executed in the dedicated processor. This is an interesting
property of the CUS algorithm for non preemptive applications that is used to prove
the following important theorem [DLS97].
PREDICTABLE APPLICATIONS
The CUS algorithm presented above has limited applicability, since it requires the
local scheduling algorithm to be non-preemptive. In fact, Deng and Liu showed that,
if a preemptive application is schedulable on a dedicated slower processor of speed $U_i$,
it would require a server with bandwidth as large as 1 to be scheduled on the shared
processor!
To overcome this limitation, Deng and Liu consider predictable and non predictable
applications. In predictable applications, the operating system knows, at each time t ,
the next instant at which a scheduling decision must be taken for the application. For
example, predictable applications are those consisting of periodic real-time tasks with
known worst-case execution time. For these applications, Deng and Liu proposed to
modify the CUS algorithm as follows:
When dealing with non predictable applications (like those including one or more spo-
radic tasks) it is not possible to know in advance the next instant at which a scheduling
decision must be taken by the local preemptive scheduler. To solve this problem, Deng
and Liu proposed to use a quantum t. The idea is to compute the server budget and
deadline as follows:
-
q, = min(c,,. i C:). d, =
'72
The smaller the t, the higher the number of budget recalculations. However, the smaller
the i,the smaller the difference between the utilization of the server and the speed of the
slower dedicated processor. The problem is very similar to the one described in Figures
4.2,4.3 and 4.4. Actually, it is easy to see that in this case the CUS algorithm becomes
very similar to the CBS algorithm. Computing the maximum error as a function of the
quantum t is not trivial and depends on the application characteristics. We refer the reader to
the original paper [DL97] for more details on the matter.
EXTENSIONS
Deng and Liu's approach has been extended by Kuo and Li [KL99], who presented a
model in which the global scheduling strategy is based on fixed priorities together with
a deferrable server or a sporadic server [SLS95, LSS87, SSL89]. Each application can
be handled by a dedicated server with capacity $Q_i$ and period $P_i$. To achieve maximum
utilization, the following conditions must be satisfied:
• The period of each server must be a multiple or a divisor of the periods of all other
servers in the system;
• The period of all the tasks must be a multiple of the period of the server.
Kuo and Li also addressed the problem of sharing resources among tasks of differ-
ent applications. Each task is allowed to share resources through mutually exclusive
semaphores under the Priority Ceiling Protocol [SRL90]. This introduces potential
blocking for a task accessing a resource locked by a task in another application. There-
fore, it is necessary to use a global scheduling condition that takes into account such a
blocking time. As a consequence, the isolation properties cannot be guaranteed as in
the case of independent applications.
This algorithm has the advantage of not requiring the knowledge of the worst case
execution time of all application tasks. However, the conditions on the periodicity of
the server are quite strong. As a consequence, the algorithm is not flexible enough to
be used in an open system. Nevertheless, this algorithm was the first one to address
the problem of scheduling an application through a dedicated periodic server, such as the
deferrable server. In Section 4.3 we will see other algorithms that extend
and generalize this approach.
CONCLUDING REMARKS
Deng and Liu were the first to consider the problem of hierarchical scheduling in open
systems. They correctly identified the conditions under which an application that is
schedulable on a dedicated slower processor can be scheduled on the shared processor
together with other applications. However, their approach has some limitations. First
of all, it requires the knowledge of the worst-case execution time of each task. Al-
though this assumption is reasonable for hard real-time applications, it is quite strong
in open systems, for the reasons explained above. The second limitation is that the
schedulability is possible only under certain restrictive conditions: non preemptive
applications, predictable applications, or non predictable applications (within a given
error). The algorithm presented in the next section overcomes such limitations and
provides precise guarantees also to non predictable applications.
Each time a task is ready to be executed in application $A_i$, the server $S_i$ calculates a
budget $B$ and a deadline $d$ for the entire application. The active servers are then inserted
in a global EDF queue, where the global scheduler selects the earliest deadline server
to be executed. It will be allowed to execute for a maximum time equal to the server
budget. In turn, the corresponding server selects the highest priority task in the ready
queue to be executed according to the local scheduling policy.
1 In [LCB00] the algorithm has been called PShED.
The server deadline is assigned by the server to be always equal to the deadline of the
earliest-deadline task in the application. Notice that the task selected to be executed is
chosen according to the local scheduler policy and might not be the earliest deadline
task.
LIST OF RESIDUALS
To calculate the budget, every server uses a private data structure called list of residuals.
For each task of an application $A_i$, this list $\mathcal{L}_i$ contains one or more elements of the
following type:
$$l = (B, d)$$
where $d$ is the task's deadline and $B$ is the budget available in interval $[a, d]$ (where $a$
is the task's arrival time); that is, the maximum time that application $A_i$ is allowed to
demand in $[a, d]$.
Thus, an element $l$ specifies for the interval $[a, d]$ the amount of execution time available
in it. The goal of the server is to update the list such that in every interval of time the
application cannot use more than its bandwidth. From now on, symbol $l_i(k)$ will denote
the element in the k-th position of list $\mathcal{L}_i$.
The server assigns the application a pair (budget, deadline) corresponding to the element
$l = (B, d)$ of the earliest deadline task in the application, regardless of the local
scheduling policy. Only when the local scheduling policy is EDF does this element
correspond to the first task in the ready queue.
Two main operations are defined on this list: adding a new element and updating the
list after some task has executed.
A new element is created and inserted in the residual list when a newly activated task
becomes the earliest deadline task among the ready tasks in the application. Let $d_j$
be its deadline: first, the list is scanned in order to find the right position for the new
element. Let $k$ be such a position, that is:
$$d_{k-1} < d_j \le d_k .$$
Then, the budget of the new element is computed as
$$B_j = \min\{\, D_j U_i,\; B_{k-1} + (d_j - d_{k-1})\, U_i,\; B_k \,\} \qquad (4.1)$$
where $U_i$ is the bandwidth (share) assigned to application $A_i$ and $D_j$ is the task's relative
deadline. At this point, the new element is completely specified as $l = (B_j, d_j)$ and
can now be inserted at position $k$, so that the k-th element becomes now the $(k+1)$-th
element, and so on.
The basic idea behind Equation (4.1) is that the budget for the newly arrived task must
be constrained such that in any interval the application does not exceed its share. A
typical situation is shown in Figure 4.6: when at time $t$ task $\tau_j$ becomes the earliest
deadline task, the algorithm must compute a new budget: it must not exceed the share
in interval $[a_j, d_j]$, which is $D_j U_i$; it must not exceed the share in interval $[a_{k-1}, d_j]$,
which is $B_{k-1} + (d_j - d_{k-1}) U_i$; and it must not exceed the share in interval $[a_k, d_k]$,
which is $B_k$. It can be shown that, if $B_j$ is the minimum among these values, then the
application will not use more than its share in any other interval.
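The insertion of a new element can be sketched as follows, reading the budget as the minimum of the three bounds discussed above. This is an illustration under stated assumptions, not the authors' implementation: the list is a plain Python list of `[B, d]` pairs sorted by deadline, the function name is hypothetical, and the numbers in the usage lines (bandwidth 0.5, deadlines 10 and 12, relative deadlines 10 and 8) are taken from the example given later in this section.

```python
from bisect import bisect_left


def insert_element(residuals, d_new, rel_deadline, U):
    """Insert a new (budget, deadline) element and return its budget.

    residuals    : list of [B, d] pairs, sorted by increasing deadline d
    d_new        : absolute deadline of the task that became the earliest one
    rel_deadline : relative deadline of that task
    U            : bandwidth assigned to the application
    """
    deadlines = [d for _, d in residuals]
    k = bisect_left(deadlines, d_new)          # first position with deadline >= d_new

    bounds = [rel_deadline * U]                # share in the task's own interval
    if k > 0:                                  # bound from the previous element
        b_prev, d_prev = residuals[k - 1]
        bounds.append(b_prev + (d_new - d_prev) * U)
    if k < len(residuals):                     # bound from the next element
        bounds.append(residuals[k][0])

    b_new = min(bounds)
    residuals.insert(k, [b_new, d_new])
    return b_new


residuals = []
print(insert_element(residuals, d_new=10, rel_deadline=10, U=0.5))  # first budget
residuals[0][0] = 0                                                 # budget fully consumed
print(insert_element(residuals, d_new=12, rel_deadline=8, U=0.5))   # constrained budget
print(residuals)
```

A linear list is used only for clarity; the complexity of this data structure is discussed later in the section.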
Every time an application task is suspended or completes, the corresponding list must
be updated. It could happen for any of the following reasons:
the application has been preempted by another application with an earlier deadline.
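Then, the algorithm picks the element in the list corresponding to the actual deadline
of the server, say the k-th element, and updates the budgets in the following way:
$$\forall j \ge k: \quad B_j = B_j - e$$
$$\forall j < k \,\wedge\, B_j > B_k: \quad \text{remove element } l_j$$
where $e$ is the amount of time for which the application has just executed.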
DELETING ELEMENTS
We also need a policy to delete elements from the list whenever they are not necessary
any longer. At time $t$, element $l_i(k)$ can be deleted if the corresponding task's instance
has already finished and
• either $d_k < t$;
• or $B_k > (d_k - t) U_i$.
It can be seen from Equation 4.1 that in both cases element $l_i(k)$ is not taken into
account in the calculation of the budget. In fact, suppose that element $l_i(j)$ is being
inserted just after $l_i(k)$. Then
and $B_k$ cannot be chosen in the minimum. Since $l_i(k)$ cannot contribute to the
calculation of any new element, then it can be deleted safely.
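The update and deletion rules can be sketched in the same style. The sketch assumes one reading of the update rule: after the application has executed for `e` time units while served with the k-th element, `e` is subtracted from the budget of that element and of all later ones, and earlier elements whose budget exceeds the new budget of the k-th element are dropped. Function names and the list layout (`[B, d]` pairs sorted by deadline) are illustrative.

```python
def update_after_execution(residuals, k, e):
    """Charge `e` units of execution to element k and to all later elements,
    then drop earlier elements whose budget exceeds the new budget of element k."""
    for j in range(k, len(residuals)):
        residuals[j][0] -= e
    b_k = residuals[k][0]
    residuals[:] = [
        el for j, el in enumerate(residuals)
        if j >= k or el[0] <= b_k
    ]


def can_delete(element, t, U, task_finished):
    """Deletion test: the task instance has finished and the element can no
    longer constrain the budget of any element inserted from time t on."""
    budget, deadline = element
    return task_finished and (deadline < t or budget > (deadline - t) * U)


residuals = [[5, 10]]
update_after_execution(residuals, k=0, e=5)
print(residuals)                                       # budget of the element is used up

print(can_delete([0, 10], t=11, U=0.5, task_finished=True))   # deadline already passed
print(can_delete([0, 12], t=9, U=0.5, task_finished=True))    # still constrains new elements
```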
EXAMPLE
To clarify the mechanism, consider the example in Figure 4.7, in which two applications
are scheduled by the BSS algorithm: application $A_1$ consists of two tasks, $\tau_1^1$ and $\tau_2^1$,
and it is served by a server with a bandwidth of 0.5 and with a Deadline Monotonic
scheduler. Application $A_2$ consists of one task and it is served by a server with a
bandwidth of 0.5 (since there is only one task, the local scheduling policy does not
matter).
Figure 4.7 An example of schedule produced by the BSS algorithm: the two tasks $\tau_1^1$ and
$\tau_2^1$ in application $A_1$ are scheduled by Rate Monotonic.
Then the server invokes the global scheduler. However, since the server of ap-
plication A2 has an earlier deadline, the application is not executed until time
t = 3;
• At time $t = 3$ the global scheduler signals the server of application $A_1$ that it can
execute;
• At time $t = 4$ an instance of task $\tau_2^1$ arrives with deadline $d_2 = 12$ and an
execution requirement of $c_2 = 5$. According to the DM scheduler, since task $\tau_2^1$
has a smaller relative deadline than task $\tau_1^1$, a local preemption is done. However,
since the earliest deadline in application $A_1$ is still $d_1 = 10$, the server budget and
deadline are not changed.
• At time $t = 8$ the budget is exhausted: application $A_1$ has executed for 5 units of
time. The global scheduler suspends the server. The server first updates the list:
$$\mathcal{L}_1 = \{(0, 10)\};$$
then it postpones the deadline by $T = 10$ units of time: $d_1 = d_1 + 10 = 20$. Now the earliest
deadline in the application is $d_2 = 12$, and the server calculates a new budget
from the element with deadline 10 already in the list:
$$B_2 = (d_2 - d_1)\, U_1 + B_1 = (12 - 10) \cdot 0.5 + 0 = 1,$$
and inserts it into the list:
$$\mathcal{L}_1 = \{(0, 10), (1, 12)\}.$$
Finally, it invokes the global scheduler. Since it is again the earliest deadline
server in the global ready queue, it is scheduled to execute.
• Now the earliest deadline in application $A_1$ is $d_1 = 20$. Then the server calculates
a new budget and inserts it into the list:
finally, it invokes the global scheduler. Since it is not the earliest deadline server,
another server is scheduled to execute.
It is important to notice that the earliest deadline in the application has been postponed,
and this deadline can in general be different from the deadline of the executing task.
Notice also that this framework is very general: basically it is possible to choose any
kind of local scheduler. In particular, we can let tasks share local resources with any
concurrency control mechanism, from simple semaphores to the more sophisticated
Priority Ceiling Protocol or Stack Resource Policy.
FORMAL PROPERTIES
where $C_i$, $D_i$ and $T_i$ are the worst-case execution time, the relative deadline and
the period for task $i$, respectively.
Rate Monotonic: Application A1, which consists of periodic or sporadic tasks with
deadlines equal to periods, is schedulable if:
Stack Resource Policy with EDF: Application A1,which consists of periodic tasks
with deadlines equal to periods, is schedulable if:
Notice that the proposed schedulability tests are similar to the equivalent schedulability
tests for a virtual dedicated processor of speed $U_A$. Unfortunately, the BSS cannot
provide perfect equivalence between the schedule in the dedicated processor and the
schedule in the shared processor. In particular, it has been shown [Lip00] that an
application schedulable by fixed priority on a dedicated processor of speed $U_A$ may
be unschedulable (i.e., some deadline could be missed) in the shared processor when
served by a server with bandwidth $U_A$.
COMPLEXITY
The BSS algorithm is quite complex to implement, because a linear list of budgets
has to be kept for each single server. The complexity of the algorithm depends on the
maximum number of elements that can be present in this list. It has been proved that if
the application consists of hard real-time periodic tasks, the length of the list is at most
equal to the number of tasks in the application. However, even in this case, the time
spent by the algorithm can be quite high. Lipari and Baruah proposed a data structure,
called Incremental AVL tree, to reduce the time needed to update the list. With the new
data structure, the complexity is $O(\log N)$, where $N$ is the number of elements in
the list.
CONCLUDING REMARKS
The BSS algorithm presented above, like Deng and Liu's algorithm, uses information
about the tasks to compute the server budget for the entire application. For
this reason, we can classify these algorithms as intrusive algorithms. They have the
following limitations:
• It may not be possible to have enough information on the application tasks. For
example, an application may contain some non-real-time task for which it may be
impossible to derive a deadline or the worst-case execution time.
• The strong interaction between the local scheduler and the global scheduler (i.e.,
the server mechanism) makes it difficult to implement such algorithms. It would
be better to completely separate the local scheduler from the global scheduler.
For these reasons, researchers recently concentrated their efforts in a different direction,
as explained in the following section.
Notice that the partition generated by an on-line algorithm may not be periodic, as it
depends on the arrival times and execution times of the tasks. For the sake of clarity,
we first introduce some definitions that apply to static periodic partitions and then
generalize them to dynamic non-periodic partitions.
Definition 4.4 The Least Supply Function (LSF) $S^*(t)$ of a resource partition $\Pi$ is the
minimum of $(S(t + d) - S(d))$, where $t, d \ge 0$.
Figure 4.8 Example of a resource partition: a) the partition; b) the Supply Function; c)
the Least Supply Function; d) the critical partition.
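For a static periodic partition, the least supply function can be computed directly from its definition. The sketch below makes several assumptions that are not in the text: time is divided into unit slots, the partition is given as a 0/1 pattern over one period, the supply function $S(t)$ is taken to be the amount of time allocated in $[0, t]$, and $t$ is an integer number of slots. Under these assumptions it is enough to try the offsets $d$ within one period.

```python
def supply(pattern, t):
    """S(t): number of slots allocated to the application in [0, t),
    for a partition that repeats `pattern` (a list of 0/1 values) forever."""
    period = len(pattern)
    full_periods, rest = divmod(t, period)
    return full_periods * sum(pattern) + sum(pattern[:rest])


def least_supply(pattern, t):
    """S*(t): minimum of S(t + d) - S(d) over all slot-aligned offsets d."""
    period = len(pattern)
    return min(supply(pattern, t + d) - supply(pattern, d) for d in range(period))


pattern = [1, 1, 0, 0]          # hypothetical partition: 2 slots out of every 4
print([supply(pattern, t) for t in range(9)])
print([least_supply(pattern, t) for t in range(9)])
```

The second list never exceeds the first, since the least supply function takes the worst starting point of the interval instead of time 0.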
When the global scheduler is an on-line algorithm (as the CBS server described in
Chapter 3, or any other algorithm that provides temporal isolation), the resource par-
tition model can be generalized to take the on-line partitioning of the resource into
account. First of all, the partition need not be periodic. The following definition
generalizes the concept of partition.
in $\{0, 1\}$. If $\Pi(t) = 1$, then the resource is allocated to the corresponding application.
If $\Pi(t) = 0$, the resource is not allocated to the application and cannot be used. A
partition is periodic if there exists a $P > 0$ such that $\Pi(t) = \Pi(t + P)$.
The concepts of supply function and least supply function can easily be extended to
the case of non-periodic partitions. Some additional care is needed for the availability
factor.
Now, we want to characterize all possible partitions that are generated by an on-line
algorithm. The idea is based on the observation that a resource partition is characterized
by two important parameters: the availability factor $\alpha$ and the partition delay $\Delta$. The
latter is defined as follows:
Notice that, in the definition of partition delay, we use the least supply function $S^*(t)$.
This means that $\Delta$ is the maximum interval of time in which an application does not
receive any service. Figure 4.9 shows the relationship between $\alpha$, $\Delta$ and the least supply
function for the partition of the previous example. In particular, function $y = (\alpha(t - \Delta))_0$,
where $(x)_0 = \max(0, x)$, never exceeds the least supply function, and there is at least
one point in which they coincide.
For any partition, it is possible to find its availability factor $\alpha$ and its delay $\Delta$. Vice versa,
for each pair $(\alpha, \Delta)$, there is more than one partition with availability factor equal to
$\alpha$ and delay equal to $\Delta$. Therefore, the pair $(\alpha, \Delta)$ defines a set of partitions. We now
consider the class of all partitions with availability factor equal to $\alpha$ and delay less than
or equal to $\Delta$.
The class of partitions generated by an on-line algorithm like the CBS (or any similar
algorithm that provides resource reservation and temporal protection) is of particular
interest. As we will see in the following, there is a direct relation between the parameters
$(\alpha, \Delta)$ and the parameters $Q$ and $P$ of the server.
Moreover, given $(\alpha, \Delta)$ assigned to an application and its local scheduling algorithm,
it is possible to check whether the application is schedulable (see next section). Also,
given an application and its local scheduling algorithm, it is possible to compute all
possible values of $(\alpha, \Delta)$ that make the application schedulable.
In the "hard reservation" version, when the budget is exhausted, the server is suspended
and the budget will be replenished at the server deadline d. At the replenishment time,
the server budget is recharged to its maximum value $Q$ and the server deadline is set
to $d = d + P$. Notice that, by introducing this rule, the algorithm becomes non-work
conserving. The hard reservation rule, however, is important to bound the maximum
delay of a partition generated by a server.
The following theorem identifies the partition with the maximum delay that can be
generated by a server.
Theorem 4.4 Given a CBS server with the hard reservation rule, and with parameters
$(Q, P)$, if $k = \left\lceil \frac{t - (P - Q)}{P} \right\rceil$, the partition with the maximum delay that it can generate
has the following least supply function $S^*(t)$:
$$S^*(t) = \begin{cases} 0 & \text{if } t \in [0, P - Q] \\ (k - 1)\,Q & \text{if } t \in (kP - Q,\ (k + 1)P - 2Q] \\ t - (k + 1)(P - Q) & \text{otherwise} \end{cases} \qquad (4.2)$$
Proof.
We have to compute the worst-case allocation provided by the server for every interval
of time. Consider an interval starting at time $t$, when a new request for execution arrives
from the application. There are 2 possibilities:
case a. The server is inactive and a new request is activated at time $t$, with $q \ge (d - t)U$.
In this case, a new budget $q = Q$ and a new deadline $d = t + P$ are computed.
The worst-case allocation is depicted in Figure 4.11a.
case b. The server is active at time $t$ (or it is inactive and $q < (d - t)U$) and it has
already consumed $x$ units of budget. In this case, the worst possible situation is
when the server is preempted by the global scheduler until time $t = d - (Q - x)$.
The worst-case allocation is depicted in Figure 4.11b, and is minimum for $x = Q$.
By comparing the two cases, it is clear that case b, with $x = Q$, is the most pessimistic.
The corresponding function is $S^*(t)$, as given by Equation (4.2).
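The least supply function of Theorem 4.4 can be evaluated numerically and compared with a linear lower bound. The sketch below is an illustration under stated assumptions: the piecewise form is the one of Equation (4.2) as reconstructed here, the bound uses the relations $\alpha = Q/P$ and $\Delta = 2(P - Q)$ that appear later in this chapter, and the server parameters $Q = 2$, $P = 5$ are hypothetical. Integer or `Fraction` arguments keep the arithmetic exact.

```python
from fractions import Fraction


def cbs_least_supply(t, Q, P):
    """Least supply function of a hard-reservation server with budget Q and period P."""
    if t <= P - Q:
        return 0
    k = -((P - Q - t) // P)                 # ceil((t - (P - Q)) / P) without floats
    if k * P - Q < t <= (k + 1) * P - 2 * Q:
        return (k - 1) * Q                  # flat segment: the server is not served
    return t - (k + 1) * (P - Q)            # rising segment: the server is executing


def linear_bound(t, Q, P):
    """Lower bound max(0, alpha * (t - delta)) with alpha = Q/P and delta = 2(P - Q)."""
    alpha = Fraction(Q, P)
    delta = 2 * (P - Q)
    return max(0, alpha * (t - delta))


Q, P = 2, 5                                  # hypothetical server parameters
print([cbs_least_supply(t, Q, P) for t in range(14)])
print(all(cbs_least_supply(t, Q, P) >= linear_bound(t, Q, P) for t in range(200)))
```

With these parameters the supply stays at zero up to $t = 6$, that is for $2(P - Q)$ time units, and the linear bound touches the supply at the end of each flat segment.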
Similar methodologies have been proposed by many authors. Saewong et al. [SRLK02]
extended the response time analysis for fixed priority systems to the case of hierarchical
schedulers. Their model assumes a Deferrable Server [SLS95] as a basic mechanism
to partition the processor, and a fixed priority algorithm as a local scheduler.
Shin and Lee [SL03] proposed a more general framework for schedulability analysis
of hierarchical scheduling, based on Feng and Mok's model. They do not assume
any particular global scheduling mechanism, as long as the global scheduler is able to
provide a periodic resource abstraction. Such an abstraction can be provided by any
hard reservation server mechanism. They also proposed a schedulability analysis for
local schedulers based on EDF and on fixed priorities.
Theorem 4.5 Let $A$ be a set of periodic or sporadic tasks $\{\tau_1, \ldots, \tau_n\}$, with $\tau_i =
(C_i, T_i, D_i)$, where $C_i$ is the worst-case computation time, $T_i$ is the task period and
$D_i$ is the task relative deadline. This task set is schedulable by the EDF scheduling
algorithm on a resource partition with least supply function $S^*(t)$ if and only if
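A test of this kind can be sketched as follows, assuming that the condition takes the usual processor-demand form: in every interval of length $t$, the demand of the jobs that must complete within the interval does not exceed the supply. Two further assumptions make the sketch weaker than the theorem: the supply is replaced by the linear lower bound $\max(0, \alpha(t - \Delta))$, so the test is only sufficient, and the caller chooses the horizon up to which deadlines are checked. The task parameters and function names are illustrative.

```python
from fractions import Fraction


def demand(tasks, t):
    """Computation time of all jobs with arrival and deadline inside an interval of length t."""
    return sum(((t - D) // T + 1) * C for C, T, D in tasks if t >= D)


def edf_ok_on_linear_supply(tasks, alpha, delta, horizon):
    """Check demand(t) <= max(0, alpha * (t - delta)) at every deadline up to `horizon`."""
    deadlines = set()
    for _, T, D in tasks:
        t = D
        while t <= horizon:
            deadlines.add(t)
            t += T
    return all(
        demand(tasks, t) <= max(0, alpha * (t - delta))
        for t in sorted(deadlines)
    )


tasks = [(1, 4, 4), (1, 10, 10), (3, 25, 25)]        # illustrative (C, T, D) triples
alpha = Fraction(11, 20)
print(edf_ok_on_linear_supply(tasks, alpha, Fraction(24, 11), horizon=100))
print(edf_ok_on_linear_supply(tasks, alpha, Fraction(3), horizon=100))
```

The second call fails at the first deadline: with a delay of 3, the bound guarantees only 0.55 units of service in an interval of length 4, while the first task needs 1.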
Bini and Lipari [LB03] proposed a methodology for the case of a fixed priority scheduler.
It can be easily extended to other schedulers, like EDF, but we leave this extension
to the reader.
Let us first tackle the problem of finding the minimum processor speed that keeps
the task set schedulable. Slowing down the processor speed by a factor $\alpha \le 1$ is
equivalent to scaling up the computation times by $1/\alpha$:
The problem is to find the minimum speed $\alpha$ that keeps the system schedulable. Bini
and Buttazzo [BB02] found a new way to express the schedulability condition under
a fixed priority scheduling algorithm as a set of linear inequalities in the computation
times $C_i$.
By introducing the speed factor $\alpha$, we can reformulate condition (4.4) taking into
account the substitution given by Equation (4.3). The result is the following:
and finally:
where $\alpha_{\min}$ is the minimum processor speed that guarantees the feasibility of the task
set.
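The minimum speed can be computed numerically for a given task set. The sketch below is not the formulation of [BB02]: it assumes the classical fixed priority test, in which task $\tau_i$ is schedulable at speed $\alpha$ if its worst-case workload $W_i(t)$ does not exceed $\alpha t$ at some scheduling point $t \le D_i$, and it checks all multiples of the higher-priority periods instead of a reduced set of points. The task parameters are illustrative and listed from highest to lowest priority.

```python
from fractions import Fraction
from math import ceil

# Illustrative task set: (C, T, D) triples, highest priority first.
TASKS = [(1, 4, 4), (1, 10, 10), (3, 25, 25)]


def workload(tasks, i, t):
    """Worst-case workload of task i and of all higher-priority tasks in [0, t]."""
    c_i = tasks[i][0]
    return c_i + sum(ceil(Fraction(t, t_j)) * c_j for c_j, t_j, _ in tasks[:i])


def scheduling_points(tasks, i):
    """Deadline of task i plus the multiples of higher-priority periods up to it."""
    d_i = tasks[i][2]
    points = {d_i}
    for _, t_j, _ in tasks[:i]:
        points.update(range(t_j, d_i + 1, t_j))
    return sorted(points)


def alpha_min(tasks):
    """Smallest speed at which every task still meets its deadline."""
    return max(
        min(Fraction(workload(tasks, i, t), t) for t in scheduling_points(tasks, i))
        for i in range(len(tasks))
    )


print(alpha_min(TASKS))   # minimum speed for the illustrative task set
```

For this task set the lowest-priority task is the bottleneck: its workload reaches 10 at $t = 20$ and 12 at $t = 24$, so the speed cannot go below 1/2.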
We now introduce the delay $\Delta$ in the analysis. In fact, when a task set is scheduled
by a server, there can be a delay in the service because the server is not receiving any
execution time from the global scheduler. To extend the previous result to the case when
$\Delta > 0$ we need to look at Equation (4.6) from a different point of view. Figure 4.12
illustrates the worst-case workload for a task $\tau_i$, called $W_i(t)$, and the line $\alpha_{\min} t$. The
line represents the amount of time that a processor with speed $\alpha_{\min}$ provides to the task
set. Task $\tau_i$ is schedulable if
Notice that, as $\Delta$ increases, the tangent point $t^*$ may change. By using Equation (4.7),
and increasing $\Delta$ we can find all possible $\alpha$ that make the task $\tau_i$ schedulable.
In order to find a closed formulation for the relation between $\alpha$ and $\Delta$ expressed by
Equation (4.7), we need the following Lemma proved in [BB02].
By means of this lemma, the well known schedulability condition for the task set:
When the task set is served by a server with function $y(t) = (\alpha(t - \Delta))_0$, the
schedulability condition becomes:
Since the link between $\alpha$ and $\Delta$ is now explicit, we can manipulate the previous
expression to obtain a direct relationship between $\alpha$ and $\Delta$. In fact, the schedulability
condition of the single task $\tau_i$ can be written as:
To take into account the schedulability of all the tasks in the set (and not only $\tau_i$, as
done so far), this condition must be true for every task. Hence, we obtain the following
theorem.
Theorem 4.8 A task set $\mathcal{T} = \{\tau_1, \tau_2, \ldots, \tau_n\}$ is schedulable by a server characterized
by the lower bound function $(\alpha(t - \Delta))_0$ if
Proof.
If $\Delta$ satisfies Equation (4.11), then it satisfies all the equations (4.10) for every task in
the set. Then every task is schedulable on such a local scheduler, and so is the whole
set, which proves the theorem.
1. the required bandwidth should be small, not to waste the total processor capacity;
2. the server period should be large, otherwise the time wasted in context switches
performed by the global scheduler would be too high.
where $T_{overhead}$ is the global scheduler context switch time, $P$ is the server period,
$\alpha$ is the fraction of bandwidth, and $k_1$ and $k_2$ are two designer defined constants.
Moreover, some additional constraints in the $(\alpha, \Delta)$ domain, other than those specified
by Equation (4.11), may be required. For example, if we use a fixed priority global
scheduler, to maximize the resource utilization we could impose the server periods to
be harmonic.
To clarify the methodology, let us consider the following example. Suppose we have
an application $A$ consisting of three tasks with parameters reported in Table 4.1 (for
simplicity, we choose $D_i = T_i$, but the approach is the same when $D_i < T_i$). The
utilization is $U = 1/4 + 1/10 + 3/25 = 47/100$, hence $\alpha$ cannot be smaller than 0.47.
The schedule corresponding to the worst-case scenario (i.e., the critical instant) when
the application is scheduled alone on the processor is shown in Figure 4.13.
Figure 4.14 illustrates the set of $(\alpha, \Delta)$ pairs defined by Equation (4.13) as a gray
area whose upper boundary is drawn by a thick line. This boundary is a piecewise
hyperbola, because it is the minimum of several functions, each of which is a
hyperbola (see Equations (4.10) and (4.11)). Notice that, in this particular case, the
schedulability condition for task $\tau_2$ does not provide any additional constraint.
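The boundary of the feasible region can be explored numerically. The sketch below is an illustration under stated assumptions: the three tasks are taken to have $(C, T) = (1, 4)$, $(1, 10)$ and $(3, 25)$, which is inferred from the utilization sum above since the table is not reproduced here; priorities follow the order of the periods; and task $\tau_i$ is considered schedulable when $W_i(t) \le \alpha (t - \Delta)$ holds at some scheduling point $t \le D_i$, which gives the bound $\Delta \le t - W_i(t)/\alpha$.

```python
from fractions import Fraction
from math import ceil

# Assumed task set: (C, T, D) triples with D = T, highest priority first.
TASKS = [(1, 4, 4), (1, 10, 10), (3, 25, 25)]


def workload(tasks, i, t):
    """Worst-case workload of task i and of all higher-priority tasks in [0, t]."""
    c_i = tasks[i][0]
    return c_i + sum(ceil(Fraction(t, t_j)) * c_j for c_j, t_j, _ in tasks[:i])


def scheduling_points(tasks, i):
    """Deadline of task i plus the multiples of higher-priority periods up to it."""
    d_i = tasks[i][2]
    points = {d_i}
    for _, t_j, _ in tasks[:i]:
        points.update(range(t_j, d_i + 1, t_j))
    return sorted(points)


def delay_bounds(tasks, alpha):
    """For each task, the largest delay that keeps it schedulable at bandwidth alpha."""
    return [
        max(t - workload(tasks, i, t) / alpha for t in scheduling_points(tasks, i))
        for i in range(len(tasks))
    ]


bounds = delay_bounds(TASKS, Fraction(11, 20))
print(bounds)          # one bound per task
print(min(bounds))     # largest delay admissible for the whole task set
```

At a bandwidth of 11/20 the bounds of the first and third task coincide, while the bound of the second task is larger, which agrees with the remark that this task adds no constraint.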
Figure 4.14 also shows a qualitative cost function that increases as $\alpha$ increases, and
decreases as $\Delta$ increases (see Equation 4.12). If we minimize this qualitative function
on the domain expressed by Equation (4.13), the solution is $\alpha = 11/20$ and $\Delta = 24/11$.
We can now find the period $P$ and the budget $Q$ of the server corresponding to the
selected solution:
$$\Delta = 2(P - Q), \qquad \alpha = \frac{Q}{P}$$
then:
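Solving the two relations for $P$ and $Q$ gives $P = \Delta / (2(1 - \alpha))$ and $Q = \alpha P$. A minimal sketch of this step, using exact fractions and the values $\alpha = 11/20$ and $\Delta = 24/11$ selected above (the function name is illustrative):

```python
from fractions import Fraction


def server_parameters(alpha, delta):
    """Invert delta = 2 (P - Q) and alpha = Q / P to obtain the budget and the period."""
    P = delta / (2 * (1 - alpha))
    Q = alpha * P
    return Q, P


Q, P = server_parameters(Fraction(11, 20), Fraction(24, 11))
print(Q, P)                        # budget and period of the server
print(2 * (P - Q), Q / P)          # check: recovers the delay and the bandwidth
```

With these inputs the sketch yields a budget of 4/3 and a period of 80/33, that is roughly 1.33 and 2.42 time units.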
Finally, Figure 4.15 shows the schedule for the sample application, obtained by considering
the worst-case scenario both for the time requested by tasks and for the time
provided by the server. The shaded areas represent intervals where the server does not
receive any allocation by the global scheduler. As expected, all tasks complete within
their deadlines.
of the parameters of the applications' tasks to allocate the resource to the different
applications. Then, we presented a more general approach that consists in using a
resource reservation algorithm (like the CBS) at the global level. This approach seems
the most promising, as the global scheduler does not need to know the internal details
of the applications to be able to provide timing guarantees.
There are still some open issues that need to be addressed. One important question
is whether it is possible to provide an "optimal" non fluid algorithm that can be im-
Figure 4.15 Worst-case schedule of application A on a server with the computed parameters.
Of course, the GPS algorithm is ruled out, since it is a fluid algorithm that cannot be
implemented in practice. We have seen that Deng and Liu's algorithm and the BSS
algorithm are not able to provide such a property: both algorithms require that some
extra capacity be assigned to each application in order to guarantee it.
Another problem that needs to be solved is how to synchronize two applications. For ex-
ample, if two applications need to communicate through mutually exclusive resources,
each application will experience some blocking time that must be taken into account
in the schedulability test. Although a first proposal has been made by Kuo and Li
[KL99] to extend Deng and Liu's algorithm, more research is needed to extend the
other approaches.
SYNCHRONIZATION PROTOCOLS FOR HARD AND SOFT REAL-TIME SYSTEMS
Most of the real-time scheduling algorithms presented in the previous chapters assume
that tasks cannot self-suspend and cannot use blocking primitives. These assumptions
are quite restrictive because, in practice, tasks often use synchronous operating system
calls that introduce blocking. For example, if tasks exchange data via shared memory
buffers, access to shared memory has to be protected by mutually exclusive (mutex)
semaphores to prevent possible inconsistencies in the data structures. However, if a
task blocks on a mutex semaphore, the task model considered in the previous chapters
is not valid anymore.
In this chapter, we first describe the problem caused by mutual exclusion and briefly
describe some background work on real-time protocols for accessing mutually exclu-
sive resources. Then, we present protocols and scheduling algorithms that extend the
resource reservation framework to the case of mutex semaphores. We will discuss two
different approaches. In the first approach, we extend the CBS algorithm to systems
consisting of hard real-time periodic tasks and soft real-time aperiodic tasks that can
share resources. The resulting algorithm is called CBS-R. Then, we consider a more
general model of an open system, where tasks can be dynamically activated in the
system, and we present the Bandwidth Inheritance (BWI) protocol.
5.1 TERMINOLOGY AND NOTATION
We consider a set $\mathcal{R}$ of $r$ resources, a set $\Gamma^{H}$ of $n$ hard periodic (or sporadic) tasks,
and a set $\Gamma^{A}$ of $m$ soft aperiodic tasks that have to be executed on a uniprocessor
system. Tasks are preemptable and all the resources are accessed in exclusive mode
using critical sections guarded by mutually exclusive semaphores. To simplify our
presentation, only single-unit resources are considered; however, the results can easily
be extended to deal with multi-unit resources, as described in [Bak91]. Every resource
is assigned a different semaphore. Therefore, we will often refer to a resource or to its
corresponding semaphore with the same symbol $R_k \in \mathcal{R}$.
A critical section is a fragment of code that starts with a lock operation on a semaphore
$R_k$ and finishes with an unlock operation on the same semaphore. We denote the
lock and unlock operations by $P(R_k)$ and $V(R_k)$, respectively. Critical sections can
be nested, that is, it is possible to access resource $R_h$ while holding the lock on
resource $R_k$. We assume properly nested critical sections, so that a sequence of code
like $P(R_1), \ldots, P(R_2), \ldots, V(R_2), \ldots, V(R_1)$ is permitted, whereas a sequence like
$P(R_1), \ldots, P(R_2), \ldots, V(R_1), \ldots, V(R_2)$ is not permitted. Internal critical sections
are nested inside other critical sections, whereas external critical sections are not. We
denote the worst-case execution time of the longest critical section of task $\tau_i$ on
resource $R_k$ as $\xi_i(R_k)$. Note that $\xi_i(R_k)$ comprises the execution time of all the internal
critical sections. We also assume that if a job performs a lock operation on semaphore
$R_k$, it performs the corresponding unlock operation before its completion. Therefore,
a critical section cannot span two consecutive jobs.
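The proper-nesting assumption can be checked mechanically with a stack: every unlock must release the most recently locked resource, and nothing may remain locked when the job completes. The sketch below is only an illustration; the encoding of operations as `("P", name)` and `("V", name)` pairs and the function name `is_properly_nested` are assumptions made for this example.

```python
def is_properly_nested(ops):
    """Check a job's sequence of ("P", resource) / ("V", resource) operations."""
    held = []  # stack of currently locked resources, innermost last
    for op, resource in ops:
        if op == "P":
            held.append(resource)
        elif op == "V":
            # The unlock must match the innermost critical section.
            if not held or held[-1] != resource:
                return False
            held.pop()
        else:
            raise ValueError(f"unknown operation: {op!r}")
    # Every lock must be released before the job completes.
    return not held


permitted = [("P", "R1"), ("P", "R2"), ("V", "R2"), ("V", "R1")]
forbidden = [("P", "R1"), ("P", "R2"), ("V", "R1"), ("V", "R2")]
print(is_properly_nested(permitted), is_properly_nested(forbidden))  # True False
```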
The low priority task $\tau_L$ acquires the lock on the resource $R$ with the lock operation
$P(R)$. Immediately afterwards, it is preempted by the high priority task $\tau_H$, which tries to
lock the same resource and is blocked. Therefore, $\tau_L$ resumes execution. Before
being able to release the lock, $\tau_L$ is preempted by a medium priority task $\tau_M$. As a
consequence, the high priority task $\tau_H$ must now wait for $\tau_M$ to complete.
At the time of the Mars Pathfinder mission, the problem was already known. The first
accounts of the problem and possible solutions date back to the early '70s. However, the
problem became widely known in the real-time community only after the seminal paper of
Sha, Rajkumar and Lehoczky [SRL90], who presented the Priority Inheritance Protocol (PIP)
and the Priority Ceiling Protocol (PCP) to bound the time a real-time task can be blocked on
a mutex semaphore.
Later, other similar protocols were presented in the literature. Among them,
we would like to mention the Stack Resource Policy (SRP) [Bak90, Bak91], which will be
briefly recalled in Section 5.3.3, and the Dynamic Priority Ceiling [CL90].
Even though the PIP was developed in the context of fixed priority scheduling, it can
also be applied to dynamic priority scheduling (the extension was proposed by Spuri
[SRBS98]). The following basic properties hold:
Using these properties, it is possible to give a sufficient condition for the schedulability
of a set of $n$ hard real-time periodic tasks. If tasks are ordered by non-decreasing
periods ($T_i \le T_j$ if $i < j$), the schedulability condition can be expressed as follows
[But97, SRBS98]:
where $B_i$ is the worst-case blocking time of task $\tau_i$ and $U_{lub}(A)$ is the least upper
bound of the scheduling algorithm $A$ used in the system.
If the PIP is applied to the example of Figure 5.1, the priority inversion phenomenon is
reduced. The resulting schedule is illustrated in Figure 5.2. When task $\tau_H$ is blocked
by $\tau_L$, the latter inherits the priority of $\tau_H$. Thus, when task $\tau_M$ is activated, it cannot
preempt $\tau_L$. When $\tau_L$ releases the lock, it returns to its original priority and is
preempted by $\tau_H$.
When tasks are allowed to use nested critical sections, many complex blocking situations
can occur. In particular, a task can inherit a new priority every time it blocks a task
on a resource (multiple inheritance), and can be blocked by many tasks on different
resources (chained blocking). Moreover, if critical sections can be nested arbitrarily,
a deadlock may occur, as shown in Figure 5.3. In this example, both $\tau_1$ and $\tau_2$ access
two resources $R_1$ and $R_2$ in a nested fashion. In particular, $\tau_1$ accesses them in the
order $P(R_2), \ldots, P(R_1), \ldots, V(R_1), \ldots, V(R_2)$, whereas $\tau_2$ accesses them in the
order $P(R_1), \ldots, P(R_2), \ldots, V(R_2), \ldots, V(R_1)$. If tasks arrive as shown in Figure
5.3, a deadlock occurs when $\tau_2$ performs the $P(R_2)$ operation. Unfortunately, the PIP
cannot prevent this from happening.
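The deadlock in this example stems from the two tasks nesting the same resources in opposite orders. A conventional way to spot such a hazard off line (this check is not part of the PIP and is shown only as an illustration) is to build a lock-order graph, with an edge from each held resource to every resource locked inside it, and look for a cycle. The function names and the operation encoding below are assumptions of this sketch; a cycle signals that a deadlock is possible for some arrival pattern, not that one will necessarily occur.

```python
def nesting_edges(ops):
    """Return pairs (outer, inner): inner is locked while outer is held."""
    held, edges = [], set()
    for op, resource in ops:
        if op == "P":
            edges.update((outer, resource) for outer in held)
            held.append(resource)
        else:  # "V"
            held.remove(resource)
    return edges


def has_lock_order_cycle(task_ops):
    """Detect a cycle in the lock-order graph built from all tasks."""
    graph = {}
    for ops in task_ops:
        for outer, inner in nesting_edges(ops):
            graph.setdefault(outer, set()).add(inner)

    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # back edge: a cycle exists
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))


tau1 = [("P", "R2"), ("P", "R1"), ("V", "R1"), ("V", "R2")]
tau2 = [("P", "R1"), ("P", "R2"), ("V", "R2"), ("V", "R1")]
print(has_lock_order_cycle([tau1, tau2]))  # True
```

If both tasks locked $R_1$ before $R_2$, the graph would contain the single edge $R_1 \to R_2$ and no cycle would be reported.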
At time $t$, the executing task $\tau_i$ can lock resource $R$ if its priority $p_i$ is strictly
higher than all the ceilings of the resources currently locked by other jobs. Let
$C^*$ be the highest such ceiling, and let $R^*$ be the corresponding resource. If $p_i \le C^*$, task
$\tau_i$ is said to be blocked on $R^*$ by the task that is holding $R^*$.
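The locking rule above can be sketched as a small predicate. This is a minimal illustration, not a full PCP implementation: priorities are encoded as integers where a larger value means a higher priority, ceilings are given as a dictionary, and the names `pcp_can_lock`, `locked_by_others` and `ceiling` are assumptions of the example. Priority inheritance by the blocking task is not modeled.

```python
def pcp_can_lock(priority, locked_by_others, ceiling):
    """Apply the PCP locking test for the executing task.

    priority: priority of the executing task (larger means higher).
    locked_by_others: resources currently locked by other jobs.
    ceiling: mapping resource -> priority ceiling.
    Returns (granted, blocking_resource).
    """
    if not locked_by_others:
        return True, None
    # R* is the locked resource with the highest ceiling C*.
    r_star = max(locked_by_others, key=lambda res: ceiling[res])
    if priority > ceiling[r_star]:
        return True, None
    return False, r_star


# Two tasks sharing R1 and R2: both ceilings equal the higher priority (2).
ceilings = {"R1": 2, "R2": 2}
print(pcp_can_lock(2, ["R1"], ceilings))  # (False, 'R1')
print(pcp_can_lock(2, [], ceilings))      # (True, None)
```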
Figure 5.4 An example of schedule generated by the PCP
We now revisit the example of Figure 5.3 by using the PCP. The resulting schedule
is shown in Figure 5.4. Since both resources $R_1$ and $R_2$ are used by $\tau_1$ and $\tau_2$, their
ceiling is equal to the priority of $\tau_1$: $C(R_1) = C(R_2) = p_1$. At time $t_1$, $\tau_2$ locks
$R_1$, therefore $R^* = R_1$. At time $t_2$, $\tau_1$ tries to lock $R_2$, but it is blocked because its
priority is not strictly higher than $C(R_1)$. Hence, $\tau_2$ inherits the priority of $\tau_1$ to avoid
unbounded priority inversion by other medium priority tasks. When $\tau_2$ releases the
lock on $R_1$, $\tau_1$ is awakened and accesses its resources without further blocking.
According to this protocol, each task $\tau_i$ is assigned a (static or dynamic) priority $p_i$ and
a static preemption level $\pi_i$, such that the following essential property holds:
Property 5.1 Task $\tau_i$ is not allowed to preempt task $\tau_j$, unless $\pi_i > \pi_j$.
Under EDF and Deadline Monotonic (DM) scheduling, Property 5.1 is verified if a
task $\tau_i$ is assigned a preemption level inversely proportional to its relative deadline:
$$\pi_i = \frac{1}{D_i}.$$
Figure 5.5 illustrates the same example of Figure 5.3 when resources are accessed using
the SRP. By comparing Figures 5.4 and 5.5, we note that the schedule generated by the
SRP has one less context switch. In fact, the high priority task is blocked upon arrival
and not when it accesses the resource.
Figure 5.5 Schedule generated by the SRP.
At first sight, the reader might think that such an anticipated blocking is too pessimistic.
However, it can be shown that the PCP and the SRP provide the same bound
on the blocking time of a task. In fact, the same task sets that are schedulable with the
PCP are also schedulable with the SRP.
Since a task never blocks once it starts executing, under the SRP there is no need
to implement waiting queues. The blocking time $B_i$ considered in the schedulability
analysis refers to the time for which task $\tau_i$ is kept in the ready queue by the preemption
test. Although a task never blocks on a locked resource, $B_i$ is considered as a "blocking
time" because it is caused by tasks having lower preemption levels.
Assuming relative deadlines equal to periods, the maximum blocking time for a task
$\tau_i$ can be computed as the longest critical section among those belonging to tasks with
preemption level less than $\pi_i$ and with a ceiling greater than or equal to $\pi_i$:
$$B_i = \max_{j,h} \{\, s_{j,h} \mid \pi_j < \pi_i \ \wedge\ C(R_h) \ge \pi_i \,\} \qquad (5.2)$$
where $s_{j,h}$ is the length of the longest critical section of task $\tau_j$ on resource $R_h$.
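The blocking-time rule stated above in words can be sketched as follows. The sketch assumes preemption levels $\pi_i = 1/D_i$ and takes the ceiling of a resource as the highest preemption level among the tasks that use it. The task names and numbers are illustrative (they mirror the parameters of Example 5.1 later in this chapter, with the server period treated as a relative deadline), and all identifiers are assumptions of this example.

```python
from fractions import Fraction

# Illustrative task set: relative deadline D and, per resource, the length of
# the longest critical section. Names and numbers are assumptions.
TASKS = {
    "J1":   {"D": 10, "cs": {"Rb": 3}},
    "tau1": {"D": 12, "cs": {"Ra": 2}},
    "tau2": {"D": 24, "cs": {"Ra": 4, "Rb": 2}},
}


def preemption_level(task):
    """Preemption level inversely proportional to the relative deadline."""
    return Fraction(1, task["D"])


def resource_ceilings(tasks):
    """Ceiling of a resource: highest preemption level among its users."""
    ceilings = {}
    for task in tasks.values():
        for res in task["cs"]:
            level = preemption_level(task)
            ceilings[res] = max(ceilings.get(res, Fraction(0)), level)
    return ceilings


def blocking_time(name, tasks):
    """Longest critical section of a lower-level task on a resource whose
    ceiling is at least the preemption level of the given task."""
    ceilings = resource_ceilings(tasks)
    pi_i = preemption_level(tasks[name])
    candidates = [
        length
        for other, task in tasks.items()
        if other != name and preemption_level(task) < pi_i
        for res, length in task["cs"].items()
        if ceilings[res] >= pi_i
    ]
    return max(candidates, default=0)


BLOCKING = {name: blocking_time(name, TASKS) for name in TASKS}
print(BLOCKING)  # {'J1': 2, 'tau1': 4, 'tau2': 0}
```

With these numbers the ceilings are $1/12$ for `Ra` and $1/10$ for `Rb`; the task with the lowest preemption level is never blocked, which is why its blocking time is zero.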
Given these definitions, the feasibility of a set of periodic or sporadic tasks with resource
constraints under EDF can be verified by the following sufficient condition:
$$\forall i,\ 1 \le i \le n: \quad \sum_{k=1}^{i} \frac{C_k}{T_k} + \frac{B_i}{T_i} \le 1 \qquad (5.3)$$
where we assumed that all the tasks are sorted by decreasing preemption levels, so that
$\pi_i \ge \pi_j$ only if $i < j$.
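A utilization-based test of this kind is straightforward to code. The sketch below assumes the condition has the usual form attributed to Baker, namely that for every $i$ the cumulative utilization of tasks $1, \ldots, i$ plus the blocking factor $B_i / T_i$ must not exceed 1, with tasks sorted by decreasing preemption level. The function name and the sample numbers are illustrative assumptions.

```python
from fractions import Fraction


def edf_srp_sufficient(tasks):
    """Sufficient EDF+SRP test.

    tasks: list of (C, T, B) tuples with integer entries, sorted by
    decreasing preemption level (shortest relative deadline first).
    """
    utilization = Fraction(0)
    for wcet, period, blocking in tasks:
        utilization += Fraction(wcet, period)
        # Cumulative utilization up to task i plus its blocking factor.
        if utilization + Fraction(blocking, period) > 1:
            return False
    return True


# Illustrative task set: (C, T, B) per task.
sample = [(4, 10, 2), (2, 12, 4), (6, 24, 0)]
print(edf_srp_sufficient(sample))  # True
```

For the sample set the three left-hand sides are $3/5$, $9/10$ and $49/60$, so the test passes. Because the condition is only sufficient, a negative answer does not prove that the task set is infeasible.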
The following theorem [LB00] extends the feasibility analysis by providing a tighter
schedulability test.
¹ In the case of multi-unit resources, the ceiling of each resource is dynamic, as it depends on the number
of units actually free.
Theorem 5.1 Let $\Gamma$ be a set of $n$ hard sporadic tasks ordered by decreasing preemption
level (so that $\pi_i \ge \pi_j$ only if $i < j$), such that $U = \sum_{i=1}^{n} C_i / T_i < 1$. Then, $\Gamma$ is
schedulable by EDF+SRP if and only if:
It is worth noting that condition (5.4) is necessary and sufficient only for sporadic tasks,
under the assumption that all tasks experience the maximum blocking time. This means
that the SRP is optimal for sporadic tasks; that is, every feasible sporadic task set can be
scheduled by EDF with the SRP. For periodic tasks the problem of deciding feasibility
in the presence of resource constraints has been shown to be NP-hard [Jef92]. The
complexity of the test is pseudo-polynomial; hence, it can be too costly for providing
on-line guarantee in large task sets.
A method for analyzing the schedulability of hybrid task sets where hard tasks may
share resources with soft tasks handled by dynamic aperiodic servers was first presented
by Ghazalie and Baker [GB95]. Their approach is based on reserving an extra budget
for the aperiodic server for synchronization purposes and using the utilization-based
test [LL73] for verifying the feasibility of the schedule. Lipari and Buttazzo [LB00]
extended the analysis to a Total Bandwidth Server (TBS), using the Processor Demand
Criterion [BRH90]; Buttazzo and Caccamo [BC99] used a similar approach to extend
the analysis to the optimal Total Bandwidth (TB*) server.
In this chapter we describe possible solutions to the problem of bounding the blocking
time of real-time tasks in hybrid systems consisting of hard and soft real-time tasks
that can share resources. We first present an extension of the Constant Bandwidth
Server (working with the SRP) that allows aperiodic tasks to share resources with hard
real-time periodic tasks. Then, we consider a more general model of open system,
i.e., a system that has no a-priori knowledge of the task behavior, and we show how
it is possible to provide real-time guarantees by using an algorithm that combines the
Constant Bandwidth Server with the Priority Inheritance Protocol.
One of the problems in the integration of the CBS with the SRP is that the SRP protocol
was developed under the assumption that preemption levels are fixed, and relative
deadlines do not change. Unfortunately, under the CBS, server relative deadlines can
be postponed, thus the resulting preemption level is dynamic.
Another problem is to prevent an aperiodic task from suspending its execution inside a
critical section because the budget is exhausted. In fact, this would cause the blocked
task to increase its blocking time, waiting until the budget is replenished. To avoid
such an extra delay, we should allow the aperiodic task to continue until it leaves the
critical section. Such an additional execution time is a kind of overrun, whose effects
need to be taken into account in the feasibility test.
Two approaches can be pursued to deal with this problem. A first solution is to reserve
extra budget for each CBS server for synchronization purposes and permit execution
overruns. A second approach does not reserve extra synchronization budget, but pre-
vents an aperiodic task from exhausting its budget inside a critical section. This section
investigates the latter approach to improve the efficiency of the CBS budget manage-
ment.
Alternatively, a job can perform a budget check before entering a critical section. If the
current budget $q_s$ is not sufficient to complete the job's critical section, the budget is
replenished and the server deadline postponed. The remaining part of the job follows
the same procedure until the job completes.
This approach dynamically partitions a job into chunks. Each chunk has an execution time
such that the consumed bandwidth is always less than or equal to the available server
bandwidth $U_s$. By construction, a chunk has the property that it will never suspend
inside a critical section. The following example illustrates two different solutions with
the CBS+SRP, both using static preemption levels.
Example 5.1 The task set consists of an aperiodic job $J_1$, handled by a CBS with
$Q_s = 4$ and $P_s = 10$, and two periodic tasks, $\tau_1$ and $\tau_2$, sharing two resources $R_a$ and
$R_b$. In particular, $J_1$ and $\tau_2$ share resource $R_b$, whereas $\tau_1$ and $\tau_2$ share resource $R_a$.
The task set parameters are reported in Table 5.1.

task       type             Q_s or C   P_s or T   R_a   R_b
$J_1$      soft aperiodic   4          10         -     3
$\tau_1$   hard periodic    2          12         2     -
$\tau_2$   hard periodic    6          24         4     2
The first solution presented on this task set maintains a fixed relative deadline whenever
the budget is replenished and the deadline postponed. The advantage of a fixed relative
deadline is to keep the SRP policy unchanged for handling resource sharing between
soft and hard tasks. According to this solution, when at time $t$ the current budget is not
sufficient for completing a critical section, a new deadline is computed as $d_s^{new} = t + P_s$
and the budget is recharged at the value $q_s = q_s + (d_s^{new} - d_s^{old}) U_s$, where $d_s^{old}$ is the
previous server deadline.
A possible schedule of the task set produced by CBS+SRP is shown in Figure 5.6.
Notice that the ceiling of resource $R_a$ is $C(R_a) = 1/12$, and the ceiling of $R_b$ is
$C(R_b) = 1/10$.
The second solution suspends a job whenever its budget is exhausted, until the current
server deadline. Only at that time will the job become eligible again, and a new chunk
will be ready to execute with the budget recharged at its maximum value ($q_s = Q_s$)
and the deadline postponed by a server period.
The schedule produced using this approach is shown in Figure 5.7. When job $J_1$ arrives
at time $t = 2$, its first chunk $H_{1,1}$ receives a deadline $d_{1,1} = a_{1,1} + P_s = 12$, according
to the CBS algorithm. At time $t = 5$, $J_1$ tries to access resource $R_b$, but its residual
budget is equal to one and is not sufficient to complete the whole critical section. As a
consequence, $J_1$ is temporarily suspended and a new chunk is released at time $t = 12$,
with deadline $d_{1,2} = 22$ and the budget replenished ($q_s = Q_s = 4$). This approach
also has two main drawbacks: it increases the response time of aperiodic tasks and,
whenever the budget is recharged, the residual budget amount (if any) is wasted due to
job suspension.
Figure 5.7 CBS+SRP with static preemption levels and job suspension.
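The job-suspension rule of this second solution can be sketched in a few lines. The code below is only an illustration under the assumptions stated in the text (the new chunk is released at the current server deadline, with a full budget and the deadline postponed by one server period); the `Chunk` type and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    release: int   # time at which the chunk becomes eligible
    deadline: int  # absolute deadline of the chunk
    budget: int    # residual budget available to the chunk


def next_chunk_on_entry(chunk, needed, max_budget, period):
    """Budget check before a critical section of length `needed`.

    Returns (chunk_to_use, wasted_budget). If the residual budget is
    insufficient, the job is suspended until the current deadline and the
    residual budget is discarded.
    """
    if chunk.budget >= needed:
        return chunk, 0
    wasted = chunk.budget
    suspended = Chunk(release=chunk.deadline,
                      deadline=chunk.deadline + period,
                      budget=max_budget)
    return suspended, wasted


# Values from the example: first chunk released at 2 with deadline 12,
# residual budget 1, critical section of length 3, Q_s = 4, P_s = 10.
first = Chunk(release=2, deadline=12, budget=1)
second, wasted = next_chunk_on_entry(first, needed=3, max_budget=4, period=10)
print(second, wasted)
```

The result reproduces the values in the text: the second chunk is released at time 12 with deadline 22 and budget 4, and one unit of residual budget is wasted.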
In this section we show that using dynamic preemption levels for aperiodic tasks allows
achieving a simpler and more elegant solution to the problem of sharing resources
under CBS+SRP. According to the new method, whenever there is a replenishment,
the server budget is always recharged by $Q_s$ and the server deadline postponed by $P_s$.
It follows that the server is always eligible, but each aperiodic task gets a dynamic
relative deadline.
To maintain the main properties of the SRP, preemption levels are kept inversely proportional
to relative deadlines, but are defined at a chunk level. The preemption level
$\pi_{i,j}$ of a job chunk $H_{i,j}$ is defined as $\pi_{i,j} = 1/(d_{i,j} - a_{i,j})$. Notice that $\pi_{i,j}$ is assigned
at run time and cannot be computed off line. As a consequence, a job $J_i$ is characterized
by a dynamic preemption level $\pi_i^{d}$ equal to the preemption level of the current chunk.
To perform an off-line guarantee of the task set, it is necessary to know the maximum
preemption level that can be assigned to each job $J_i$ by the server. Therefore, the
deadline assignment rule is modified to guarantee that each chunk has a minimum
relative deadline $D_i^{min}$ equal to its server period (the precise rules are reported in
Section 5.5.4).
the end of its critical section. When the system ceiling becomes zero, $J_1$ is able to
preempt $\tau_2$. Note that the bandwidth consumed by any chunk is no greater than $U_s$,
since whenever the budget is refilled by $Q_s$, the absolute deadline is postponed by $P_s$.
The main advantage of the proposed approach is that it does not require reserving extra
budget for synchronization purposes and does not waste the residual budget (if any)
left by the previous chunk. However, we need to determine the effects that dynamic
preemption levels have on the properties of the SRP protocol.
We first note that, since each chunk is scheduled by a fixed deadline assigned by the
CBS, each chunk inherits the SRP properties. In particular, each chunk can be blocked
for at most the duration of one critical section by the preemption test and, once started,
it will never be blocked for resource contention. However, since a soft aperiodic job
may consist of many chunks, it can be blocked more than once. The behavior of hard
tasks remains unchanged, permitting resource sharing between hard and soft tasks
without jeopardizing the hard tasks' guarantee. The details of the proposed technique
are described in the next section.
To comply with the SRP rules, a chunk $H_{i,j}$ starts its execution only if its priority is the
highest among the active tasks and its preemption level $\pi_{i,j} = 1/(d_{i,j} - a_{i,j})$ is greater
than the system ceiling. In order for the SRP protocol to be correct, every resource $R_k$
is assigned a static³ ceiling $C(R_k)$ (we assume binary semaphores) equal to the highest
maximum preemption level of the tasks that could be blocked on $R_k$ when the resource
is busy. Hence, $C(R_k)$ can be computed as follows:
$$C(R_k) = \max_{h} \{\, \pi_h^{max} \mid \tau_h \text{ needs } R_k \,\}. \qquad (5.6)$$
³ In the case of multi-unit resources, the ceiling of each resource is dynamic, as it depends on the number
of units actually free.
It is easy to see that the ceiling of a resource computed by equation (5.6) is greater
than or equal to the one computed using the dynamic preemption level of each task.
In fact, as shown by equation (5.5), the maximum preemption level of each aperiodic
task represents an upper bound of its dynamic value.
Finally, in computing the blocking time for a periodic/aperiodic task, we need to take
into account the duration of the critical section of an aperiodic task without considering
its relative deadline. In fact, the actual relative deadline of a chunk belonging to an
aperiodic task is assigned on-line and it is not known in advance.
To simplify our formulation, we assume that each hard periodic task is handled by
a dedicated CBS-R server with $Q_s \ge C_i$ and $T_s = T_i$. With such a parameter
assignment, hard tasks do not really need a server in order to be scheduled; we prefer
to use a server also for hard tasks, because this approach gives us the possibility to
implement an efficient reclaiming mechanism on top of the proposed approach. A
reclaiming algorithm, like CASH [CBS00], is able to exploit spare capacities and can
easily be integrated in this environment.
The blocking times can be computed as a function of the minimum relative deadline
of each aperiodic task, as follows:
where $s_{j,h}$ is the worst-case execution time of the $h$-th critical section of task $\tau_j$, $\rho_{j,h}$ is
the resource accessed by the critical section $s_{j,h}$, and $T_{s_j}$ is the period of the dedicated
server. The $B_i$ parameter computed by equation (5.7) is the blocking time experienced
by a hard or soft task. In fact, $T_{s_i} = D_i^{min}$ for a soft aperiodic task and $T_{s_i} = D_i$ for
a hard periodic task.
The correctness of our approach will be formally proved in Section 5.5.6. We will show
that the modifications introduced in the CBS and SRP algorithms do not change any
property of the SRP and allow keeping a static ceiling for the resources even though
the relative deadline of each chunk is dynamically assigned at run time by the CBS-R
server.
A task must never exhaust its budget when it is inside a critical section.
In the following section we formally define the CBS-R algorithm which integrates the
previous rules.
2. Each served job chunk $H_{i,j}$ is assigned a dynamic deadline $d_{i,j}$ equal to the current
server deadline $d_s$.
3. Whenever a served job executes, the budget $q_s$ is decreased by the same amount.
5. A CBS-R is said to be active at time $t$ if there are pending jobs (remember the
budget $q_s$ is always greater than 0); that is, if there exists a served job $J_i$ such that
$r_i \le t < f_i$. A CBS-R is said to be idle at time $t$ if it is not active.
6. When a job $J_i$ arrives and the server is active, the request is enqueued in a queue
of pending jobs according to a given (arbitrary) discipline (e.g., FIFO).
8. When a job finishes, the next pending job, if any, is served using the current budget
and deadline. If there are no pending jobs, the server becomes idle.
9. At any instant, a job is assigned the last deadline generated by the server.
10. Whenever a served job $J_i$ tries to access a critical section, if $q_s < \xi_i$ (where $\xi_i$ is
the duration of the longest critical section of job $J_i$, such that $\xi_i < Q_s$), a budget
replenishment occurs, that is, $q_s = q_s + Q_s$, and a new server deadline is generated
as $d_s = d_s + P_s$.
It is worth noting that, with respect to the original definition given in [Abe98], we
modified rule (7) and introduced rule (10). Rule (7) has been modified in order to
guarantee that each job chunk has a minimum relative deadline equal to the server
period. In fact, whenever a job $J_i$ arrives and the server is idle, the job gets an absolute
deadline greater than or equal to the arrival time plus the server period. The budget is
recharged in such a way that the consumed bandwidth is always no greater than the
reserved bandwidth $U_s = Q_s / P_s$.
Rule (10) has been added to prevent a task from exhausting its budget when it is using
a shared resource. This is done by performing a budget check before entering a critical
section. If the current budget is not sufficient to complete a critical section, the budget
is replenished and the deadline postponed.
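The budget check of rule (10) can be sketched as a method on a small server object. This is a minimal illustration of that single rule, not a complete CBS-R implementation: job queues, arrival handling and the interaction with the SRP are omitted, and the class and method names are assumptions. The numbers anticipate the example of Section 5.5.5; the critical-section length of 3 is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CbsRServer:
    max_budget: int  # Q_s
    period: int      # P_s
    budget: int      # q_s, current budget
    deadline: int    # d_s, current server deadline

    def consume(self, amount):
        """Rule 3: executing for `amount` time units decreases the budget."""
        self.budget -= amount

    def before_critical_section(self, longest_cs):
        """Rule 10: replenish and postpone if the budget cannot cover the
        longest critical section. Returns True if a new chunk starts."""
        if self.budget < longest_cs:
            self.budget += self.max_budget
            self.deadline += self.period
            return True
        return False


# Q_s = 4, P_s = 8; the first chunk has deadline 11 and has executed for
# 3 units when the job tries to enter a critical section.
server = CbsRServer(max_budget=4, period=8, budget=4, deadline=11)
server.consume(3)
new_chunk = server.before_critical_section(longest_cs=3)
print(new_chunk, server.budget, server.deadline)  # True 5 19
```

Note that the residual budget is kept (the budget becomes 5, not 4), which is the property that distinguishes this rule from the job-suspension variant.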
These two minor changes allow the CBS server to become compliant with the proposed
approach without modifying its global behavior.
5.5.5 AN EXAMPLE
The following example illustrates the usage of the CBS-R server in the presence of
resource constraints. The task set consists of an aperiodic job $J_1$, handled by a CBS-R
with maximum budget $Q_s = 4$ and server period $P_s = 8$, and two periodic tasks $\tau_1$,
$\tau_2$, which share two resources $R_a$ and $R_b$; in particular, $J_1$ and $\tau_1$ share resource $R_b$,
while $\tau_1$ and $\tau_2$ share resource $R_a$. The task set parameters are shown in Table 5.2.
The schedule produced by CBS-R+SRP is shown in Figure 5.9. When job $J_1$ arrives at
time $t = 3$, its first chunk $H_{1,1}$ receives a deadline $d_{1,1} = a_{1,1} + P_s = 11$ according to
the CBS-R algorithm. At that time, $\tau_2$ is already inside a critical section on resource $R_a$;
however, $H_{1,1}$ of job $J_1$ is able to preempt, having a preemption level $\pi_{1,1} = 1/8 > \Pi_s$.
At time $t = 6$, $J_1$ tries to access resource $R_b$, but its residual budget is equal to one and
is not sufficient to complete the whole critical section. As a consequence, the deadline
is postponed and the budget replenished. Hence, the next chunk $H_{1,2}$ of $J_1$ starts at
time $a_{1,2} = 6$ with deadline $d_{1,2} = 19$. The chunk $H_{1,2}$ of $J_1$ cannot start because
its preemption level $\pi_{1,2} = 1/13 < \Pi_s$. It follows that $\tau_2$ executes until the end of
its critical section. When the system ceiling becomes zero, $J_1$ is able to preempt $\tau_2$.
When $J_1$ frees resource $R_b$, $\tau_1$ starts executing. It is worth noting that each chunk can
be blocked for at most the duration of one critical section by the preemption test and,
once it is started, it will never be blocked for resource contention.
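The preemption levels quoted in this example follow directly from the chunk arrival times and deadlines. The short sketch below recomputes them and applies the SRP preemption test. The value used for the system ceiling while $\tau_2$ holds $R_a$ is an assumption ($1/12$, chosen only because it lies between the two chunk levels, as the text requires), since Table 5.2 is not reproduced here; the function names are illustrative.

```python
from fractions import Fraction


def chunk_preemption_level(arrival, deadline):
    """Preemption level of a chunk: 1 / (d - a)."""
    return Fraction(1, deadline - arrival)


def can_start(level, system_ceiling):
    """SRP preemption test: the level must exceed the system ceiling."""
    return level > system_ceiling


pi_11 = chunk_preemption_level(3, 11)   # first chunk: a = 3, d = 11
pi_12 = chunk_preemption_level(6, 19)   # second chunk: a = 6, d = 19

ASSUMED_CEILING = Fraction(1, 12)  # hypothetical ceiling while R_a is locked

print(pi_11, can_start(pi_11, ASSUMED_CEILING))  # 1/8 True
print(pi_12, can_start(pi_12, ASSUMED_CEILING))  # 1/13 False
print(can_start(pi_12, Fraction(0)))             # True once the ceiling is zero
```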
In the next section, the SRP properties are formally proved and the validity of the
guarantee test is analyzed.
Property 5.2 A chunk $H_{i,j}$ is not allowed to preempt a chunk $H_{h,k}$, unless $\pi_{i,j} > \pi_{h,k}$.
Property 5.3 If the preemption level of a chunk $H_{i,j}$ is greater than the current system
ceiling, then there are sufficient resources available to meet the requirement of $H_{i,j}$
and the requirement of every chunk that can preempt $H_{i,j}$.
Property 5.4 If no chunk $H_{i,j}$ is permitted to start until $\pi_{i,j} > \Pi_s$, then no chunk can
be blocked after it starts.
Property 5.5 Under the CBS-R+SRP policy, a chunk $H_{i,j}$ can be blocked for at most
the duration of one critical section.
The proofs of the properties listed above are similar to those in Baker's original paper
[Bak91]. The following lemma shows how hard periodic tasks maintain their behavior
unchanged:
Lemma 5.1 Under the CBS-R+SRP policy, each job of a hard periodic task can be
blocked at most once.
Proof.
The schedule of hard periodic tasks produced by EDF is the same as the one produced
by handling each hard periodic task by a dedicated CBS-R server with a maximum
budget equal to the task WCET and server period equal to the task period; it follows
that a hard task can never be cut into multiple chunks. Hence, using Property 5.5,
it follows that each instance of a hard periodic task can be blocked for at most the
duration of one critical section. □
The following theorem provides a simple sufficient condition to guarantee the feasi-
bility of hard tasks when they share resources with soft tasks under the CBS-R+SRP
algorithm.
Theorem 5.2 Let $\Gamma$ be a task set composed of $n$ hard periodic tasks and $m$ soft
aperiodic tasks, each one (soft and hard) scheduled by a dedicated CBS-R server.
Supposing tasks are ordered by decreasing maximum preemption level (so that
$\pi_i^{max} \ge \pi_j^{max}$ only if $i < j$), then the hard tasks are schedulable by CBS-R+SRP if
where $Q_{s_i}$ is the maximum budget of the dedicated CBS-R server and $T_{s_i}$ is the server
period.
Proof.
Suppose equation (5.8) is satisfied for each $\tau_i$.
We have to analyze two cases:
Case A. Task $\tau_i$ has a relative deadline $D_i = T_{s_i}$. Using Baker's guarantee test (see
Equation (5.3)), it follows that the task set $\Gamma$ is schedulable if
where $D_i$ is the relative deadline of task $\tau_i$ and $B_i^{new}$ is the blocking time $\tau_i$ might
experience when each $\tau_j$ has a relative deadline equal to $D_j$. Notice that a task $\tau_j$ can
block as well as preempt $\tau_i$, varying its relative deadline $D_j$; however, $\tau_j$ cannot block
and preempt $\tau_i$ simultaneously. In fact, if the current instance of $\tau_j$ preempts $\tau_i$, its absolute
deadline must be before the deadline of $\tau_i$; hence, the same instance of $\tau_j$ cannot also block
$\tau_i$, otherwise it should have its deadline after the deadline of $\tau_i$. From the considerations above,
the worst-case scenario happens when $\tau_j$ preempts $\tau_i$, that is, $D_j = T_{s_j}$.
Hence, it follows that:
Case B. Task $\tau_i$ has a relative deadline $D_i > T_{s_i}$. As in Case A, the task set $\Gamma$ is
schedulable if
From the considerations above, it follows that the worst-case scenario also occurs when
$(\forall j,\ D_j = T_{s_j})$; hence,
Notice that tasks are ordered by decreasing maximum preemption level and each task
$\tau_j$ has the relative deadline set as $D_j = T_{s_j}$, except task $\tau_i$, whose relative deadline is
$D_i > T_{s_i}$. Hence, from Equation (5.2) we derive that the new blocking time $B_i^{new}$ of
task $\tau_i$ is a function of the relative deadline $D_i$ as follows:
It is worth noting that the terms $B_i, \ldots, B_{n+m}$ are the blocking times computed
by equation (5.7) and are experienced by hard or soft tasks if the relative deadline of
each task is set equal to the period of its dedicated server. Finally, a $k \ge i$ will exist
such that:
The above inequality holds because $k$ must be greater than or equal to $i$. Hence, it
follows that the task set is schedulable. □
To provide temporal guarantees in dynamic real-time systems, the idea is to separate the
"admission test" phase from the actual scheduling phase. When a task is first activated
in the system and requires a guaranteed execution, it must go through an admission test.
If the test is passed, the task is admitted into the system with a guaranteed execution
profile. However, if the task tries to actually execute more than initially requested, the
task is "slowed down" by the temporal isolation mechanism toprevent extra interference
with the other tasks in the system. The main difference with a traditional real-time
system is that a dynamic real-time system has no a-priori knowledge about the tasks
that will be activated.
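The separation between admission and scheduling can be illustrated with a very small admission controller. The text does not prescribe a specific test at this point, so the sketch below assumes, purely for illustration, a utilization-based test in which the sum of the reserved bandwidths $Q/P$ must not exceed the processor capacity; blocking due to shared resources is deliberately ignored. The class and method names are hypothetical.

```python
from fractions import Fraction


class AdmissionController:
    """Admit reservations while the total reserved bandwidth fits."""

    def __init__(self, capacity=Fraction(1)):
        self.capacity = capacity      # fraction of the processor available
        self.reserved = Fraction(0)   # bandwidth already granted

    def admit(self, budget, period):
        """Try to reserve `budget` time units every `period` time units."""
        bandwidth = Fraction(budget, period)
        if self.reserved + bandwidth > self.capacity:
            return False              # rejected: guarantees would be violated
        self.reserved += bandwidth
        return True


controller = AdmissionController()
print(controller.admit(4, 10))  # True  (reserved 2/5)
print(controller.admit(1, 2))   # True  (reserved 9/10)
print(controller.admit(1, 5))   # False (would reach 11/10)
print(controller.admit(1, 10))  # True  (reserved exactly 1)
```

Once a task is admitted, enforcing its reservation at run time is the job of the temporal isolation mechanism, not of the admission test.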
Such a lack of knowledge becomes very restrictive when tasks share resources with
a synchronization protocol. Using the SRP, the admission test can be performed by
Equations (5.3) or (5.4), which require the computation of task blocking times by
Equation (5.2), which in turn requires knowledge of the resource ceilings. To
compute the resource ceilings, the system must know in advance all the resources used
by the tasks during execution. Such information is not always readily available.
Suppose for example that the code of a task is linked together with a shared library
where shared data are protected by mutexes. The programmer of the task may not be
aware of such a hidden implementation detail.
The problem becomes even more difficult if resource reservation techniques are intro-
duced in general purpose operating systems. In fact, when dealing with hard real-time
systems, the effort of analyzing the task's code and the relations among tasks is nec-
essary to guarantee the correctness of the entire application. In a general purpose
operating system, however, it is not reasonable to ask the developer of a soft real-time
application to specify the list of all the mutexes and the duration of each critical sec-
tion during the application initialization, when the admission test is performed. As a
matter of fact, the developer of a soft real-time task (like an MPEG player) might use a
software component developed by another company, for which the source code is not
available and that might use mutexes in its implementation.
Summarizing, the effectiveness of protocols like the SRP or the PCP is based on the
a priori knowledge on tasks' behavior. However, especially when dealing with soft
real-time systems, the behavior of a soft real-time task cannot always be completely
characterized. As a consequence, global concepts like the system ceiling or the resource
ceilings cannot be used.
In the remainder of this chapter, we will show how the problem of priority inversion
in a dynamic real-time system can be solved without using global a priori knowledge
about the tasks. The basic idea is to use the PIP, instead of PCP and SRP, as the PIP
does not require a priori knowledge about the tasks.
The scheduling algorithm we are looking for must fulfill the following requirements:
- Job arrival times (the $a_{i,j}$'s) are not known a priori, but are only revealed on
  line during system execution. Hence, the scheduling strategy cannot require any
  knowledge of future arrival times (e.g., cannot require tasks to be periodic).
- The exact execution requirements $c_{i,j}$ are also unknown, and can only be
  determined by actually executing $J_{i,j}$ to completion. Hence, the scheduling algorithm
  cannot require an a priori upper bound (a "worst-case execution time") on the
  value of $c_{i,j}$.
- The scheduling algorithm has no a priori knowledge of which resources a task
  will access; it can only be known on line when the task tries to lock a resource.
  Hence, the scheduling algorithm cannot require any a priori upper bound on the
  worst-case execution time $\xi_i$ of a critical section.
- the period $T_i$;
- the type (hard or soft) of every task that (directly or indirectly) interacts with $\tau_i$
  (see Section 5.6.3 for a definition of interaction);
- for each interacting task $\tau_j$, and for each shared resource $R_k$, the worst-case
  execution time $\xi_j(R_k)$ of the longest critical section of $\tau_j$ on $R_k$.
However, this solution is not suitable for a dynamic system. In fact, in order to compute
the maximum blocking time of each server, when a task is created we should "declare"
the worst-case execution time of the critical sections on each accessed resource. This
is in contrast with the goal of a scheduler that must be independent of the actual
requirements of the tasks. In addition, if a soft task holds a critical section for longer
than declared, any server could miss its deadline.
Example 5.2 To highlight this problem, consider the example shown in Figure 5.10.
In this example, there are three servers $S_1 = (2, 6)$, $S_2 = (2, 6)$ and $S_3 = (6, 18)$.
Server $S_1$ is assigned task $\tau_1$, which accesses a resource $R$ for the entire duration of
its jobs (i.e., 2 units of time). Server $S_2$ is assigned task $\tau_2$, which does not use any
resource. Server $S_3$ is assigned task $\tau_3$, which has an execution time of 6 units of time
and accesses resource $R$ for 5 units of time. Now, suppose that $\tau_3$ is a soft task that
claims to use resource $R$ for only 2 units of time. The system computes a maximum
blocking time $B_1 = B_2 = 2$ for servers $S_1$ and $S_2$. According to Equation (5.10), the
system is schedulable, and all servers can be admitted.
Figure 5.10 In the example, blocking times are not correctly accounted for.
In the configuration of arrival times shown in Figure 5.10, server $S_1$ arrives at time
$t = 2$ and tries to access $R$. Since it is locked, server $S_3$ inherits a deadline $d^s_1 = 8$ and
continues executing. If no enforcement is put on the worst-case execution time of the
critical section of task $\tau_3$ on resource $R$, server $S_2$ misses its deadline. The simple fact
that $\tau_3$ executes more than expected inside the critical section invalidates the PIP and
task $\tau_2$, which does not use any resource, misses its deadline.
Another problem that must be considered is the depletion of the server budget while
the task is in a critical section and has inherited the deadline of another server. In the
original CBS formulation, the server deadline is postponed and the server budget is
immediately recharged. When the PIP is applied it is not clear which deadline has to
be postponed.
To solve the problems mentioned above, we combined the PIP and the CBS in a single
algorithm called Bandwidth Inheritance (BWI). The basic idea is that when a task
executing inside a low-priority server blocks a high-priority server, it inherits the pair
$(q, d^s)$ of the blocked server.
5.6.2 THE BANDWIDTH INHERITANCE PROTOCOL
Before starting with the description of the Bandwidth Inheritance protocol, we need to
understand the meaning of temporal isolation when considering interacting tasks. In
the original CBS formulation (see Section 3.6.1), tasks are assumed to be independent
and hence do not interact in any way. When tasks access shared resources, they cannot
be considered completely independent anymore. What does isolation mean in such a
scenario?
Consider again the example shown in Figure 5.10. Server $S_1$ and server $S_3$ share a
resource. It is easy to see that if $S_3$ holds the lock for longer than declared, some task
will probably miss its deadline. Our goal is to prevent tasks $\tau_1$ and $\tau_3$ from interfering
with $\tau_2$. In fact, since $\tau_1$ and $\tau_3$ both access the same resource, it is impossible to
provide isolation among them.
Notice that, in the above example, $\tau_1$ can be blocked by $\tau_2$ and by $\tau_3$, but $\tau_3$ cannot be
blocked by $\tau_1$. Hence, a blocking chain defines an antisymmetric relation $\rightarrow$ between
$\tau_i$ and $\tau_j$: $\tau_i \rightarrow \tau_j$ but not vice versa.
In general, there can be more than one chain between two tasks $\tau_i$ and $\tau_j$, because they
can directly or indirectly share more than one resource. Let us enumerate the chains
starting from task $\tau_i$ in any order. Let $BC_i^h$ be the $h$-th blocking chain on $\tau_i$. Without
loss of generality, in the remainder of the chapter we will sometimes drop the superscript
on the chain. Moreover, let $\Gamma(BC_i)$ be the set of tasks $\tau_2, \ldots, \tau_z$ in the sequence $BC_i$
($\tau_i$ excluded), and let $R(BC_i)$ be the set of resources $R_1, \ldots, R_{z-1}$ in the sequence
$BC_i$.
Given these definitions, we can state the goals of our scheduling strategy more precisely.
Whether task $\tau_i$ meets its deadlines should depend only on the timing requirements
of $\tau_i$ and on the worst-case execution time of the critical sections of the tasks in $\Gamma_i$.
Therefore, in order to guarantee a hard task $\tau_i$, it is only necessary to know the behavior
of the tasks in $\Gamma_i$.
To solve the deadlock problem, we consider another static policy. We assume that
resources are totally ordered, and each task respects the ordering in accessing nested
critical sections. Thus, if $i < j$, then a task can access a resource $R_j$ with a critical
section that is nested inside another critical section on resource $R_i$. When such an order
is defined, the sequence of resources in any blocking chain is naturally ordered. For
a deadlock to be possible, a blocking chain must exist in which there is a circular
relationship like $BC = (\ldots, R_i, \ldots, R_j, \ldots, R_i, \ldots)$. Therefore, if the resources are
ordered a priori, a deadlock cannot occur.
If the total order is not respected when accessing nested critical sections, a deadlock
can still occur. As we will see in the next section, our protocol is able to detect it during
runtime, but the action to be taken depends on the kind of resources. In the remainder
of the chapter, we shall assume that resources are ordered.
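The ordering policy can be checked mechanically once the nesting of critical sections is known. The following is a minimal sketch, assuming each task is described by the sequences of resource indices it locks while nesting (outermost first); the task names and the data layout are hypothetical.

```python
def respects_total_order(nested_locks):
    """True if a nested locking sequence (outermost first) uses strictly
    increasing resource indices, i.e. R_j is locked inside R_i only if i < j."""
    return all(a < b for a, b in zip(nested_locks, nested_locks[1:]))


def all_tasks_ordered(tasks):
    """tasks: task name -> list of nested locking sequences."""
    return all(respects_total_order(seq)
               for sequences in tasks.values()
               for seq in sequences)


# Both tasks lock R_1 first and R_2 inside it: no circular wait is possible.
safe = {"tau_a": [[1, 2]], "tau_b": [[1, 2], [2]]}

# tau_b locks R_2 first and R_1 inside it: together with tau_a this admits
# the circular relationship (..., R_1, ..., R_2, ..., R_1, ...).
unsafe = {"tau_a": [[1, 2]], "tau_b": [[2, 1]]}

safe_ok = all_tasks_ordered(safe)
unsafe_ok = all_tasks_ordered(unsafe)
```

Such a check needs the nesting information in advance, so it fits statically analyzed code; for tasks whose locking behavior is unknown, only run-time detection remains.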
As long as no task is blocked, the BWI protocol follows the same rules as the CBS
algorithm (see Section 3.6.1). In addition, the BWI protocol introduces the following
rules:
Rule 10: if task $\tau_i$ is blocked when accessing a resource $R$ that is locked by task $\tau_j$,
then $\tau_j$ is added to the list of server $S_{e(i,t)}$. If, in turn, $\tau_j$ is currently blocked
on some other resource, then the chain of blocked tasks is followed, and server
$S_{e(i,t)}$ adds all the tasks in the chain to its list, until it finds a ready task
(see footnote 5). In this way, each server can have more than one task to serve,
but only one of these tasks is not blocked.
Rule 11: when task $\tau_j$ releases resource $R$, if there is any task blocked on $R$, then $\tau_j$
was executing inside a server $S_{e(j,t)} \neq S_j$. Server $S_{e(j,t)}$ must now discard $\tau_j$
from its own list and the first blocked task in the list is now unblocked, let it be
$\tau_i$. All the servers that added $\tau_j$ to their list while $\tau_j$ was holding $R$ must discard
$\tau_j$ and add $\tau_i$.
Footnote (on the function $e()$): Note that index $i$ denotes the task's index when it is the
argument of function $e()$ and the server's index when it is the value of $e()$.
Footnote 5: If, by following the chain, the algorithm finds a task that is already in the list,
a deadlock is detected and an exception is raised.
Figure 5.11 The BWI protocol is applied to the example of Figure 5.10.
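The list bookkeeping described by Rules 10 and 11 can be sketched as follows. This is only an illustration under simplifying assumptions: budgets, deadlines and the EDF choice of the running server are left out, blocked tasks are woken in FIFO order, and the class, method and server names are hypothetical.

```python
class BWISketch:
    def __init__(self, default_task):
        # default_task: server name -> its default task (hypothetical naming)
        self.lists = {s: [t] for s, t in default_task.items()}
        self.holder = {}       # resource -> task holding the lock
        self.blocked_on = {}   # blocked task -> resource it waits for
        self.waiters = {}      # resource -> blocked tasks, FIFO (assumption)
        self.inheritors = {}   # resource -> servers that added its holder

    def ready(self, server):
        return [t for t in self.lists[server] if t not in self.blocked_on]

    def lock(self, task, res, server):
        """`task`, executing inside `server`, tries to lock `res`."""
        if res not in self.holder:
            self.holder[res] = task
            return True
        self.blocked_on[task] = res
        self.waiters.setdefault(res, []).append(task)
        while True:  # Rule 10: follow the chain until a ready task is found
            owner = self.holder[res]
            if owner in self.lists[server]:
                raise RuntimeError("deadlock detected")
            self.lists[server].append(owner)
            self.inheritors.setdefault(res, set()).add(server)
            if owner not in self.blocked_on:
                return False
            res = self.blocked_on[owner]

    def unlock(self, task, res):
        """Rule 11: `task` releases `res`; returns the unblocked task, if any."""
        del self.holder[res]
        servers = self.inheritors.pop(res, set())
        queue = self.waiters.get(res, [])
        if not queue:
            return None
        nxt = queue.pop(0)
        del self.blocked_on[nxt]
        self.holder[res] = nxt
        for server in servers:
            self.lists[server].remove(task)
            if nxt not in self.lists[server]:
                self.lists[server].append(nxt)
                self.inheritors.setdefault(res, set()).add(server)
        return nxt


# Scenario in the spirit of Figure 5.10: tau3 holds R when tau1 asks for it.
bwi = BWISketch({"S1": "tau1", "S2": "tau2", "S3": "tau3"})
bwi.lock("tau3", "R", "S3")
got_lock = bwi.lock("tau1", "R", "S1")
during = (list(bwi.lists["S1"]), bwi.ready("S1"))
woken = bwi.unlock("tau3", "R")
after = list(bwi.lists["S1"])
```

While tau1 is blocked, server S1 serves tau3 in its place, and S2 is never touched, which is the isolation property the protocol is after.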
Example 5.3 The behavior of BWI is demonstrated by applying the algorithm to the
example of Figure 5.10. The resulting schedule is depicted in Figure 5.11.
Note that task $\tau_2$ is not influenced by the misbehavior of $\tau_3$ and completes before its
deadline.
Lemma 5.2 Each active server always has exactly one ready task in its list.
Proof.
Initially, no task is blocked and the lemma is true. Suppose that the lemma holds
just before time $t_b$, when task $\tau_i$ is blocked on resource $R$ by task $\tau_j$. After applying
Rule 10, both servers $S_i$ and $S_j$ have task $\tau_j$ in their list, and task $\tau_i$ is blocked. By
definition of $e(j, t_b)$, $S_{e(j,t_b)} = S_i$. Moreover, if $\tau_j$ is also blocked on another resource,
the blocking chain is followed and all the blocked tasks are added to $S_i$ until the first
non-blocked task is reached. The lists of all the other servers remain unchanged, thus
the lemma is true.
Now, suppose that the lemma is true just before time $t_r$. At this time, task $\tau_j$ releases
the lock on resource $R$. If no other task was blocked on $R$, then the lists of all the
servers remain unchanged. Otherwise, suppose that task $\tau_i$ was blocked on $R$ and is
now unblocked: server $S_i$ has $\tau_j$ and $\tau_i$ in its list and, by applying Rule 11, discards
$\tau_j$. The lists of all the other servers remain unchanged, and the lemma holds.
Theorem 5.3 Consider a system consisting of $n$ servers with
Proof.
Lemma 5.2 implies that, at any time, the earliest deadline server has one and only
one ready task in its list. As explained in [Lip00], from the viewpoint of the global
scheduler, the resulting schedule can be regarded as a sequence of real-time jobs whose
deadlines are equal to the deadlines of the servers (also referred to as chunks in [Abe98]
and [AB98a]). As the earliest deadline server never blocks, the computation times and
the deadlines of the chunks generated by the server do not depend on the presence
of shared resources. In [Abe98, Lip00], it was proved that in every interval of time
the bandwidth demanded by the chunks produced by server $S_i$ never exceeds $Q_i / P_i$,
regardless of the behavior of the served tasks. Since Lemma 5.2 states that each active
server always has one non-blocked task in its list, the previous result is also valid for
BWI. Hence, from the optimality of EDF and from $\sum_i Q_i / P_i \le 1$, it follows that none
of these chunks misses its deadline.
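The bandwidth condition used in the proof is easy to check numerically. The sketch below assumes that the condition is the usual EDF bound (total server bandwidth not exceeding 1) and that each server is given as a (budget, period) pair; the function names are hypothetical, and the numbers are those of the three servers of Example 5.2.

```python
from fractions import Fraction


def total_bandwidth(servers):
    """Sum of budget/period over all servers, computed exactly."""
    return sum(Fraction(budget, period) for budget, period in servers)


def admissible(servers):
    """Assumed admission condition: total bandwidth does not exceed 1."""
    return total_bandwidth(servers) <= 1


servers = [(2, 6), (2, 6), (6, 18)]     # S1, S2, S3 of Example 5.2
bandwidth = total_bandwidth(servers)    # 1/3 + 1/3 + 1/3
ok = admissible(servers)
too_many = admissible(servers + [(1, 100)])
```

Exact fractions avoid rejecting a fully loaded but feasible set such as this one because of floating-point rounding.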
Definition 5.3 Given a task $\tau_i$, served by a server $S_i$ with the BWI protocol, the
interference time $I_i$ is defined as the maximum time that all other tasks can execute inside
server $S_i$ for each job of $\tau_i$.
Considerations. When our system consists only of hard tasks, BWI is not the best
protocol to use. In fact, substituting $Q_i$ and $P_i$ into Equation (5.11), we obtain:
which may result in a lower utilization than Equation (5.1) because all the interference
times are summed together. Hence, if we are dealing with a hard real-time system, it
is better to use other scheduling strategies like the PCP [SRL90] or the SRP [Bak90],
for example by using a strategy like the one described in Section 5.5.
The BWI protocol is more suitable for dynamic real-time systems, where hard, soft
and non real-time tasks can coexist and it is impossible to perform an off-line analysis
for the entire system. Of course, this comes at the cost of a lower utilization for hard
tasks.
In many cases, it is desirable to guarantee a hard task $\tau_i$ even if it interacts with soft
tasks. In fact, sometimes it is possible to know indirectly the worst-case execution
time of the critical sections of a soft task. For example, consider a hard task and a soft
task that access the same resource by using common library functions. If the critical
sections are implemented as library functions with bounded execution time, then we
can still determine the amount of time a soft task can steal from the server's budget of
a hard task. Indeed, this is a very common case in a real operating system.
Therefore, we will now consider the problem of computing $I_i$ for a server $S_i$ that is
the default server of a hard task. We start by providing an important definition that
simplifies the discussion.
Definition 5.4 Let $S_i$ be a server that never postpones its deadline (i.e., $S_i$'s budget
is never exhausted while there is a job that has not yet finished). We call $S_i$ an HRT
server. If the server deadline can be postponed (i.e., a time $t$ exists in which $q_i = 0$
and the served job has not yet finished), we call $S_i$ an SRT server.
The distinction between HRT and SRT servers depends only on the kind of tasks
they serve. Both HRT and SRT servers follow the same rules and have the same
characteristics. However, it may be impossible to know the WCET of a soft task, so
the corresponding default SRT server can decrease its priority while executing. The
presence of SRT servers that interact with HRT servers complicates the computation
of the interference.
The following examples show how one or more soft tasks can contribute (directly or
indirectly) to the interference of a hard task.
Example 5.4 Consider a hard task $\tau_i$, served by server $S_i$, and a soft task $\tau_j$, served by
a server $S_j$ with period $P_j < P_i$. We do not know the WCET of task $\tau_j$. Therefore, we
assign the budget of $S_j$ according to some rule of thumb. Server $S_j$ is an SRT server
as it may postpone its deadline. If $\tau_j$ executes for less than its server budget and the server
deadline is not postponed, $S_i$ cannot preempt $S_j$. If instead $\tau_j$ executes for more than
its server budget, the server's deadline is postponed. The corresponding situation is
shown in Figure 5.12a. $S_j$ can be preempted by $S_i$ while inside a critical section, and
block $\tau_i$, contributing to its interference $I_i$.
Example 5.5 Consider three tasks, $\tau_i$, $\tau_j$ and $\tau_k$, served by servers $S_i$, $S_j$ and $S_k$,
respectively, with $P_j < P_i < P_k$. Servers $S_i$ and $S_k$ are HRT servers, while $S_j$ is an
SRT server. All tasks access resource $R$. Task $\tau_i$ accesses resource $R$ twice with two
different critical sections. One possible blocking situation is shown in Figure 5.12b.
The first time, $\tau_i$ can be blocked by task $\tau_k$ on the first critical section. Then, it can
be preempted by task $\tau_j$, which first locks $R$ and then, before releasing the resource,
depletes the server budget and postpones its deadline. Thus, when $\tau_i$ executes, it can
be blocked again on the second critical section on $R$. Note that both $\tau_j$ and $\tau_k$ belong
to $\Gamma_i$.
Figure 5.12 Example of blocking situations with soft tasks: a) a soft task with a short
period blocks a hard task with a long period; b) a hard task is blocked twice on resource $R$.
Example 5.6 As a last example, we show one case in which, even if all tasks in $\Gamma_i$ are
hard tasks, it may happen that $\tau_j$ interferes with $S_i$ with two different critical sections.
Consider three tasks, $\tau_i$, $\tau_j$ and $\tau_k$. Task $\tau_i$ accesses only resource $R_2$ with two critical
sections. Task $\tau_j$ accesses two resources $R_1$ and $R_2$, and $R_2$ is accessed twice with
two critical sections both nested inside the critical section on $R_1$. Task $\tau_k$ accesses
only $R_1$ with one critical section. The only blocking chain starting from task $\tau_i$ is
$BC_i = (\tau_i, R_2, \tau_j)$. Hence $\Gamma_i = \{\tau_j\}$. Note that task $\tau_k$ cannot interfere with task $\tau_i$.
Tasks $\tau_i$, $\tau_j$ and $\tau_k$ are assigned servers $S_i$, $S_j$ and $S_k$, respectively, with $P_k < P_i < P_j$.
Tasks $\tau_i$ and $\tau_j$ are both hard tasks and we know their WCETs and periods. Task $\tau_k$ is
a soft task and we do not know its WCET. Finally, we assume that we know the duration
of all critical sections (for example, because resources are accessed through shared
libraries that we are able to analyze).
We assign budgets and periods of servers $S_i$ and $S_j$ so that they are HRT servers (their
interference is computed using the algorithm described in Figure 5.14, which will
be presented later). The budget of server $S_k$ is assigned according to some rule of
thumb. Since we do not know whether $S_k$ will exhaust its budget while executing, $S_k$
is considered an SRT server.
One possible blocking situation is shown in Figure 5.13. Task $\tau_j$ locks resource $R_1$ and
then resource $R_2$. At time $t_1$ it is preempted by task $\tau_i$, which tries to lock resource $R_2$
and is blocked. As a consequence, task $\tau_j$ inherits server $S_i$ and interferes with it for
the duration of the first critical section on $R_2$. When $\tau_j$ releases $R_2$, it returns inside
its server $S_j$ and $\tau_i$ executes, completing its critical section on $R_2$. Then, server $S_k$ is
activated and $\tau_k$ starts executing and tries to lock resource $R_1$. Since $R_1$ is still locked
by $\tau_j$, $\tau_k$ is blocked and $\tau_j$ inherits server $S_k$. While $\tau_j$ executes inside $S_k$, it locks
resource $R_2$ again. Before releasing $R_2$, server $S_k$ exhausts its budget and postpones
its deadline. Now the earliest deadline server is $S_i$, which continues to execute and tries
to lock $R_2$ at time $t_2$. As a consequence, $\tau_j$ inherits $S_i$ and interferes with it for the
second time.
From the examples shown above, it is clear that there are many possible situations in
which a task can interfere with a server. In the next section, we formally present a set
of lemmas that identify the conditions under which a task can interfere with an HRT
server.
Figure 5.13 Example of blocking situation: task $\tau_j$ can interfere twice with $S_i$ even if $S_i$
and $S_j$ are both HRT servers.
Definition 5.5 Let $\Psi_i$ be the set of all servers that can be "inherited" by task $\tau_i$, $S_i$
included:
$\Psi_i = \{ S_j \mid \exists BC_j,\ \tau_i \in BC_j \} \cup \{ S_i \}$.
Definition 5.6 Let $\Psi_j^{SRT}(i)$ be the set of all SRT servers that can be "inherited" by
task $\tau_j$ and interfere with server $S_i$:
$\Psi_j^{SRT}(i) = \{ S_k \mid S_k \text{ is an SRT server} \wedge \exists BC_k = (\tau_k, \ldots, \tau_j, \ldots, \tau_i) \}$.
If $S_j$ is an SRT server, it is also included in $\Psi_j^{SRT}(i)$.
Consider Example 5.6. There is one chain from $\tau_k$ to $\tau_i$: $BC_k = (\tau_k, R_1, \tau_j, R_2, \tau_i)$.
Therefore, $S_k \in \Psi_j^{SRT}(i)$. Set $\Psi_j^{SRT}(i)$ is important in our analysis because it
identifies the tasks that can inherit an SRT server before interfering with the server $S_i$
under analysis. In Example 5.6, task $\tau_j$ can inherit the SRT server $S_k$, which may later
postpone its deadline.
Proof.
When $\tau_j$ inherits a server $S_k$, this server must have a scheduling deadline shorter than $d^s_j$.
Recall that, by definition, $e(j, t)$ is the index of the server with the shortest scheduling
deadline among all servers inherited by $\tau_j$ at time $t$. Hence $d^s_{e(j,t)} = d^s_k < d^s_j$. If $S_k$
postpones its deadline before the time at which $\tau_j$ releases the resource, $\tau_j$ continues to
execute inside the server with the shortest deadline among the inherited servers. Since
$S_j$ never postpones its deadline, the lemma follows.
Lemma 5.4 Given a task $\tau_i$, only tasks in $\Gamma_i$ can be added to server $S_i$ and contribute
to $I_i$.
Proof.
It directly follows from Rule 10 and from the definition of $\Gamma_i$. $\Box$
Lemma 5.5 Let $S_i$ be an HRT server. Task $\tau_j$ with default server $S_j$ cannot interfere
with server $S_i$
Proof.
By contradiction. For $\tau_j$ to interfere with $S_i$ it must happen that at a certain time $t_1$, $\tau_j$
locks a resource $R$; it is then preempted by server $S_i$ at time $t_2$, which blocks on some
resource; $\tau_j$ inherits $S_i$ as a consequence of this blocking. Therefore, $\tau_j$ must start
executing inside its default server before $S_i$ arrives, and executes in a server $S_{e(j,t_2)}$
with deadline $d^s_{e(j,t_2)} > d^s_i$ when it is preempted. By hypothesis, $P_j < P_i \Rightarrow d^s_j < d^s_i$.
Hence, $\tau_j$ inherits a server $S_{e(j,t_1)}$ with $d^s_{e(j,t_1)} > d^s_i$. However, from the hypothesis it
follows that server $S_j$ never postpones its deadline ($S_j \notin \Psi_j^{SRT}(i)$), and from Lemma
5.3, $d^s_{e(j,t_1)} \le d^s_j < d^s_i$. This is a contradiction, hence the lemma follows.
The next definition precisely identifies the tasks that can interfere with server $S_i$.
Definition 5.7 A proper blocking chain $BC_i$ is a blocking chain that contains only
tasks that can interfere with $S_i$:
In some cases, we have to consider multiple interferences from the same task and on
the same resource. The following lemmas restrict the number of possible interference
situations.
then $\tau_j$ can interfere with server $S_i$ for at most the worst-case execution of one critical
section for each job.
Proof.
Suppose $\tau_j$ interferes with $S_i$ in two different intervals: the first time in interval $[t_1, t_2)$,
the second time in interval $[t_3, t_4)$. Therefore, at time $t_2$, $d^s_{e(j,t_2)} > d^s_i$. If $\tau_j$ does not
lock any resource in $[t_2, t_3)$, then at time $t_3$ server $S_i$ blocks on some resource $R$ that
was locked by $\tau_j$ before $t_1$ and that it has not yet released. Therefore, $\tau_j$ interferes
with $S_i$ for the duration of the critical section on $R$, which includes the duration of the
first critical section ($\xi_j(R) > (t_4 - t_3) + (t_2 - t_1)$) and the lemma follows.
Now suppose $\tau_j$ executes in interval $[t_2, t_3)$ and locks another resource $R_1$. It follows
that it inherits a server $S_k$ that preempts $S_i$ with $d^s_k < d^s_i$. Hence $P_k < P_i$. From the
hypothesis, $S_k$ is an HRT server and $d^s_k$ is not postponed before $\tau_j$ releases resource
$R_1$. Hence, $\tau_j$ cannot inherit $S_i$ while it is inside $S_k$, and we fall back to the previous
case.
Lemma 5.7 Let $S_i$ be an HRT server and $R$ a resource. If the following condition
holds:
$\forall BC_i^h : BC_i^h = (\ldots, R, \ldots),\ \forall S_k \in \Psi_j^{SRT}(i) : P_k \ge P_i$
Proof.
The proof of this lemma is very similar to the proof of Lemma 5.6. By contradiction.
Suppose two critical sections on the same resource $R$ contribute to $I_i$. The first time,
task $\tau_p$ inherits server $S_i$ at time $t_1$ while it is holding the lock on $R$. The second time,
task $\tau_j$ inherits server $S_i$ at time $t_2 > t_1$ while it is holding the lock on $R$. It follows
that:
Lemma 5.8 The worst-case interference for server $S_i$ due to a proper blocking chain
$BC_i = (\tau_1, R_1, \ldots, R_{z-1}, \tau_z)$ is
Proof.
It simply follows from the definition of proper blocking chain. $\Box$
Given a proper blocking chain, we need to distinguish the tasks that can interfere with
$S_i$ for at most the duration of one critical section (i.e., that verify the hypothesis of
Lemma 5.6), from the tasks that can interfere with $S_i$ multiple times.
Lines 6-12 consider the case in which $\tau_i$ is blocked on the $k$-th critical section. For each
proper blocking chain $BC_i^h$ in $CS_i(k)$, the algorithm checks if it is a legal blocking
chain, that is, the resources in $R(BC_i^h)$ and the tasks in $\Gamma(BC_i^h)$ have not yet been
considered in the interference time computation. If so, function interference() is
recursively called with $k' = k + 1$, $T' = T \setminus \Gamma(BC_i^h)$, and $R' = R \setminus R(BC_i^h)$ (lines
8-10). Otherwise, it selects another chain from $CS_i(k)$. The recursion stops when
$k > cs_i$ (line 4).
 1: int interference(int k, set T, set R)
 2: {
 3:   int ret = 0;
 4:   if (k > cs_i) return 0;
 5:   ret = interference(k+1, T, R);
 6:   foreach (BC_i^h ∈ CS_i(k)) {
 7:     if (Γ(BC_i^h) ⊆ T and R(BC_i^h) ⊆ R) {
 8:       T' = T \ Γ(BC_i^h);
 9:       R' = R \ R(BC_i^h);
10:       ret = max(ret, ξ(BC_i^h) + interference(k+1, T', R'));
11:     }
12:   }
13:   return ret;
14: }
Figure 5.14 Algorithm for computing the interference time for server $S_i$.
The algorithm has exponential complexity, since it explores all possible interference
situations for server S,. We conjecture that the problem of finding the interference
time in the general case is NP-hard. However, the proof of this claim is left as future
work.
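The recursion of Figure 5.14 translates almost line by line into Python. The sketch below assumes that the proper blocking chains available at each critical section are already known and that each chain carries a precomputed interference value; the data layout, the chain names and the numbers are hypothetical.

```python
def interference(k, tasks, resources, chains, xi, cs):
    """Worst-case interference from the k-th critical section onwards.

    tasks, resources : frozensets of tasks / resources not yet accounted for
    chains[k]        : list of (name, chain_tasks, chain_resources)
    xi[name]         : interference contributed by that chain (assumed given)
    cs               : number of critical sections of the task under analysis
    """
    if k > cs:
        return 0
    # Case 1: the task is not blocked on the k-th critical section.
    best = interference(k + 1, tasks, resources, chains, xi, cs)
    # Case 2: it is blocked through one of the still legal chains.
    for name, chain_tasks, chain_resources in chains.get(k, []):
        if chain_tasks <= tasks and chain_resources <= resources:
            rest = interference(k + 1, tasks - chain_tasks,
                                resources - chain_resources, chains, xi, cs)
            best = max(best, xi[name] + rest)
    return best


# Hypothetical instance with two critical sections.
chains = {
    1: [("A", frozenset({"tj"}), frozenset({"R1"})),
        ("B", frozenset({"tk"}), frozenset({"R1"}))],
    2: [("C", frozenset({"tk"}), frozenset({"R2"})),
        ("D", frozenset({"tj"}), frozenset({"R2"}))],
}
xi = {"A": 2, "B": 3, "C": 4, "D": 1}
worst = interference(1, frozenset({"tj", "tk"}), frozenset({"R1", "R2"}),
                     chains, xi, cs=2)
```

In this instance, picking the longest chain first (B, then D, total 4) is worse than A followed by C (total 6), which shows why the algorithm has to explore the combinations rather than choose greedily.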
Two different approaches have been analyzed. In the first approach, the CBS algorithm
has been extended to work with the SRP. In the second approach, the CBS algorithm has
been extended to work with the PIP. The first approach is best suited to hard real-time
systems that also include soft real-time aperiodic tasks. The second approach is best
suited to dynamic real-time systems where there is no a priori knowledge about the
task requirements.
RESOURCE RECLAIMING
A general technique for guaranteeing temporal constraints of hard activities in the
presence of tasks with unpredictable execution is based on the resource reservation approach
[MST94b, TDS+95, AB98a] (see Chapter 3). Using this methodology, however, the
overall system's performance becomes quite dependent on a correct resource
allocation. It follows that wrong resource assignments will result in either wasting the
available resources or lowering task responsiveness. Such a problem can be
overcome by introducing suitable resource reclaiming techniques which are able to exploit
early completions to satisfy the extra execution requirements of other tasks.
This chapter introduces some resource reclaiming algorithms that are able to guarantee
isolation among tasks while relaxing the bandwidth constraints enforced by resource
reservations.
Example. Consider the case shown in Figure 6.1, where three tasks are handled by three
servers with budgets $Q_1 = 1$, $Q_2 = 5$, $Q_3 = 3$, and periods $T_1 = 4$, $T_2 = 10$, $T_3 = 12$,
respectively. At time $t = 6$, job $\tau_{2,1}$ completes earlier with respect to the allocated
budget, whereas job $\tau_{3,1}$ requires one extra unit of time. Figure 6.1a illustrates the
classical case in which no reclaiming is used and tasks are served by the plain Constant
Bandwidth Server (CBS) [AB98a] algorithm. Notice that, in spite of the budget saved
by $\tau_{2,1}$, the third server is forced to postpone its current deadline when its budget is
exhausted (it happens at time $t = 9$). As shown in Figure 6.1b, however, we observe
that the spare capacity saved by $\tau_{2,1}$ could be used by $\tau_{3,1}$ to advance its execution and
prevent the server from postponing its deadline. The intuition is that early completions
of tasks generate spare capacities that are wasted by traditional resource reservation
approaches, unless resource reclaiming is adopted to relax the bandwidth constraints,
still providing isolation among tasks.
In the next sections, we present two scheduling techniques, the CApacity SHaring
(CASH) algorithm [CBS00] and the Greedy Reclamation of Unused Bandwidth (GRUB)
algorithm [GB00], which are able to reclaim unused resources (in terms of CPU
capacities) while guaranteeing isolation among tasks. Both techniques handle hybrid
task sets consisting of hard periodic tasks and soft aperiodic tasks. Moreover, both
algorithms rely on the following assumptions:
1. tasks are scheduled by a dynamic priority assignment, namely, the Earliest Dead-
line First (EDF) algorithm;
2. tasks are assumed to be independent, that is, they do not compete for gaining
access to shared and mutually exclusive resources;
Figure 6.1 Overruns handled by a plain CBS (a) versus overruns handled by a CBS with
a resource reclaiming mechanism (b).
handle tasks with different criticality and flexible timing constraints, to enhance
the performance of those real-time applications which allow a certain degree of
flexibility.
The CASH mechanism works in conjunction with the Constant Bandwidth Server
(CBS). Each task is handled by a dedicated CBS and the reclaiming mechanism uses a
global queue, the CASH queue, of spare capacities ordered by deadline. Whenever a
task completes its execution and its server budget is greater than zero, such a residual
capacity is stored in the CASH queue along with its deadline and can be used by any
active task to advance its execution. When using a spare capacity, the task can be
scheduled using the corresponding server deadline associated with the spare capacity.
In this way, each task can use its own capacity along with the residual capacities deriving
from the other servers.
Whenever a new task instance is scheduled for execution, the server tries to use the
residual capacities with deadlines less than or equal to the one assigned to the served
instance; if these capacities are exhausted and the instance is not completed, the server
starts using its own capacity. Every time a task ends its execution and the server becomes
idle, the residual capacity (if any) is inserted with its deadline in the global queue
of available capacities. Spare capacities are ordered by deadline and are consumed
according to an EDF policy. The main benefit of the proposed reclaiming mechanism
is to reduce the number of deadline shifts (typical of the CBS), thus enhancing the
responsiveness of aperiodic tasks. Notice that, due to the isolation mechanism introduced by the
multiple server approach, there are no particular restrictions on the task model that
can be handled by the CASH algorithm. Hence, tasks can be hard, soft, periodic, or
aperiodic.
CASH RULES
The precise behavior of the CASH algorithm is defined by the following rules.
2. Each task instance $\tau_{i,j}$ handled by server $S_i$ is assigned a dynamic deadline equal
to the current server deadline $d_{i,k}$.
4. When a task instance $\tau_{i,j}$ arrives and the server is idle, the server generates a new
deadline $d_{i,k} = \max(r_{i,j}, d_{i,k-1}) + T_i$ and $c_i$ is recharged at the maximum value
$Q_i$.
5. When a task instance $\tau_{i,j}$ arrives and the server is active, the request is enqueued
in a queue of pending jobs according to a given (arbitrary) discipline.
6. Whenever instance $\tau_{i,j}$ is scheduled for execution, the server $S_i$ uses the capacity
$c_q$ in the CASH queue (if there is one) with the earliest deadline $d_q$, such that
$d_q \le d_{i,k}$; otherwise its own capacity $c_i$ is used.
7. Whenever job $\tau_{i,j}$ executes, the used budget $c_q$ or $c_i$ is decreased by the same
amount. When $c_q$ becomes equal to zero, it is extracted from the CASH queue
and the next capacity in the queue with deadline less than or equal to $d_{i,k}$ can be
used.
8. When the server is active and $c_i$ becomes equal to zero, the server budget is
recharged at the maximum value $Q_i$ and a new server deadline is generated as
$d_{i,k} = d_{i,k-1} + T_i$.
9. When a task instance finishes, the next pending instance, if any, is served using
the current budget and deadline. If there are no pending jobs, the server becomes
idle, the residual capacity $c_i > 0$ (if any) is inserted in the CASH queue with
deadline equal to the server deadline, and $c_i$ is set equal to zero.
10. Whenever the processor becomes idle for an interval of time $\Delta$, the capacity $c_q$
(if it exists) with the earliest deadline in the CASH queue is decreased by the same
amount of time until the CASH queue becomes empty.
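The budget accounting in the rules above can be sketched with a deadline-ordered heap. This is an illustration under simplifying assumptions, not a scheduler: there is no notion of time or preemption, each server handles one job at a time, and the class names, method names and numbers (two servers with budgets 3 and 2 and periods 4 and 8) are hypothetical.

```python
import heapq


class CashQueue:
    """Global queue of spare capacities, ordered by deadline."""

    def __init__(self):
        self.heap = []  # entries are [deadline, capacity]

    def push(self, deadline, capacity):
        if capacity > 0:
            heapq.heappush(self.heap, [deadline, capacity])

    def consume(self, limit, amount):
        """Use up to `amount` of spare capacity with deadline <= limit,
        earliest deadline first; return how much was actually used."""
        used = 0
        while used < amount and self.heap and self.heap[0][0] <= limit:
            entry = self.heap[0]
            take = min(entry[1], amount - used)
            entry[1] -= take
            used += take
            if entry[1] == 0:
                heapq.heappop(self.heap)
        return used

    def idle(self, delta):  # Rule 10
        self.consume(float("inf"), delta)


class Server:
    def __init__(self, Q, T):
        self.Q, self.T = Q, T
        self.c, self.d = 0, 0

    def arrive(self, r):  # Rule 4, server idle
        self.d = max(r, self.d) + self.T
        self.c = self.Q

    def execute(self, amount, queue):  # Rules 6, 7 and 8
        while amount > 0:
            amount -= queue.consume(self.d, amount)
            step = min(amount, self.c)
            self.c -= step
            amount -= step
            if amount > 0:  # own budget exhausted, job not finished
                self.c = self.Q
                self.d += self.T

    def finish(self, queue):  # Rule 9, no pending jobs
        queue.push(self.d, self.c)
        self.c = 0


queue = CashQueue()
s1, s2 = Server(Q=3, T=4), Server(Q=2, T=8)
s1.arrive(0)
s2.arrive(0)
s1.execute(2, queue)  # early completion: 2 units out of a budget of 3
s1.finish(queue)
spare = [tuple(e) for e in queue.heap]
s2.execute(3, queue)  # overrun: 3 units against a budget of 2

plain = Server(Q=2, T=8)  # same overrun with an empty CASH queue
plain.arrive(0)
plain.execute(3, CashQueue())
```

With the spare unit donated by the first server, the second server absorbs its overrun and keeps deadline 8; without reclaiming, the same job forces a postponement to 16.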
AN EXAMPLE
To better understand the proposed approach, we will describe a simple example which
shows how the CASH reclaiming algorithm works. Consider a task set consisting of
two periodic tasks, $\tau_1$ and $\tau_2$, with periods $P_1 = 4$ and $P_2 = 8$, maximum execution
times $C_1^{max} = 4$ and $C_2^{max} = 3$, and average execution times $C_1^{avg} = 3$ and $C_2^{avg} = 2$.
Each task is scheduled by a dedicated CBS having a period equal to the task period
and a budget equal to the average execution time. Hence, a task completing before
its average execution time saves some budget, whereas it experiences an overrun if it
completes after. A possible execution of the task set is reported in Figure 6.2, which
also shows the capacity of each server and the residual capacities generated by each
task. At time $t = 2$, task $\tau_1$ has an early completion and a residual capacity equal to
one with deadline equal to 4 becomes available. After that, $\tau_2$ consumes the above
residual capacity before starting to use its own capacity; hence, at time $t = 4$, the
overrun experienced by $\tau_2$ is handled without postponing its deadline. Notice that
each task tries to use the residual capacities before using its own capacity and that
whenever an idle interval occurs (e.g., interval [19, 20]), the residual capacity with
the earliest deadline has to be discharged by the same amount in order to guarantee a
correct behavior.
The example above shows that overruns can be handled efficiently without postponing
any deadline. A classical CBS, instead, would have postponed some deadlines in order
to guarantee task isolation. Clearly, if all the tasks consume their allocated budget, no
reclaiming can be done and this method performs as a plain CBS. However, this case
is very rare in practice, hence the CASH algorithm helps in improving the
average system performance.
[Figure 6.2: schedule of $\tau_1$ and $\tau_2$ with the residual capacities; legend: overrun, normal execution.]
where $Q_i$ is the maximum server budget and $T_i$ is the server period. Before proving
the schedulability condition, the following result shows that all the generated
capacities are exhausted before their respective deadlines.
Theorem 6.1 Given a set $\Gamma$ of capacity-based servers along with the CASH algorithm,
each capacity generated during the scheduling is exhausted before its deadline if and
only if:
$$\sum_{i=1}^{n} \frac{Q_i}{T_i} \le 1. \qquad (6.1)$$
Proof.
If. Assume equation (6.1) holds and suppose that a capacity $c^*$ is not exhausted at time
$t^*$, when the corresponding deadline is reached. Let $t_a \ge 0$ be the last time before
$t^*$ at which no capacity is discharging; that is, the last instant before $t^*$ during which
the CPU is idle and the CASH queue is empty (if there is no such time, set $t_a = 0$).
Let $t_b \ge 0$ be the last time before $t^*$ at which a capacity with deadline after $t^*$ is
discharging (if there is no such time, set $t_b = 0$). If we take $t = \max(t_a, t_b)$, time $t$
has the property that only capacities created after $t$ and with deadline less than or equal
to $t^*$ are used during $[t, t^*]$. Let $Q_T(t_1, t_2)$ be the sum of capacities created after $t_1$
and with deadline less than or equal to $t_2$; since a capacity misses its deadline at time
$t^*$, it must be that:
$$Q_T(t, t^*) > (t^* - t).$$
In the interval $[t, t^*]$, we can write that:
$$Q_T(t, t^*) \le \sum_{i=1}^{n} \left\lfloor \frac{t^* - t}{T_i} \right\rfloor Q_i \le (t^* - t) \sum_{i=1}^{n} \frac{Q_i}{T_i} \le (t^* - t),$$
which is a contradiction.
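Only if. Suppose that $\sum_{i} Q_i / T_i > 1$. Then, we show there exists an interval $[t_1, t_2]$ in
which $Q_T(t_1, t_2) > (t_2 - t_1)$. Assume that all the servers are activated at time 0; then,
for $L = \mathrm{lcm}(T_1, \ldots, T_n)$ we can write that:
$$Q_T(0, L) = \sum_{i=1}^{n} \frac{L}{T_i} Q_i = L \sum_{i=1}^{n} \frac{Q_i}{T_i} > L.$$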
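As a quick numerical illustration of the condition used in this proof, the following Python sketch checks whether the total server bandwidth exceeds one and computes the capacity generated over one hyperperiod when all servers start at time 0. The function names and the `(Q, T)` pair representation are assumptions introduced here for illustration only.

```python
from fractions import Fraction
from math import lcm

def bandwidth_ok(servers):
    """servers: list of (Q, T) integer pairs. True when sum(Q_i / T_i) <= 1."""
    return sum(Fraction(q, t) for q, t in servers) <= 1

def capacity_in_hyperperiod(servers):
    """Return (L, capacity generated in [0, L]) with L = lcm of the server periods."""
    L = lcm(*(t for _, t in servers))
    return L, sum((L // t) * q for q, t in servers)

# Servers of the CASH example: Q1 = 3, T1 = 4 and Q2 = 2, T2 = 8.
print(bandwidth_ok([(3, 4), (2, 8)]), capacity_in_hyperperiod([(3, 4), (2, 8)]))
# An overloaded configuration generates more capacity than the hyperperiod can hold.
print(bandwidth_ok([(3, 4), (3, 8)]), capacity_in_hyperperiod([(3, 4), (3, 8)]))
```

With the servers of the example the generated capacity equals the hyperperiod length (8 units in 8 time units), whereas in the overloaded configuration 9 units would have to be discharged in 8 time units.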
We now formally prove the schedulability condition with the following theorem:
Theorem 6.2 Let $\mathcal{T}_h$ be a set of periodic hard tasks, where each task $\tau_i$ is scheduled
by a dedicated server with $Q_i = C_i^{max}$ and $T_i = P_i$, and let $\mathcal{T}_s$ be a set of soft tasks
scheduled by a group of servers with total utilization $U^{soft}$. Then, $\mathcal{T}_h$ is feasible if and
only if
$$\sum_{\tau_i \in \mathcal{T}_h} \frac{C_i^{max}}{P_i} + U^{soft} \le 1.$$
Proof.
The theorem directly follows from Theorem 6.1; in fact, we can notice that each hard
task instance has available at least its own capacity, equal to the task WCET. Theorem
6.1 states that each capacity is always discharged before its deadline, hence it follows
that each hard task instance has to finish by its deadline.
It is worth noting that Theorem 6.2 also holds under a generic capacity-based server
having a periodic behavior and a bandwidth $U_i$.
Intuitively, the value of $d_i$ at each instant is a measure of the priority that the GRUB
algorithm accords server $S_i$ at that instant; GRUB essentially performs
earliest deadline first (EDF) scheduling based upon these $d_i$ values.
The value of $V_i$ at any time is a measure of how much of server $S_i$'s "reserved"
service has been consumed by that time. The GRUB algorithm attempts to update
the value of $V_i$ in such a manner that, at each instant in time, server $S_i$ has received
the same amount of service that it would have received by time $V_i$ if executing on
a dedicated processor of capacity $U_i$.
At any instant in time during run-time, each server $S_i$ is in one of three states:
inactive, activeContending, or activeNonContending. Intuitively, at time $t_0$ a
server is in the activeContending state if it has some jobs awaiting execution
at that time; in the activeNonContending state if it has completed all jobs that
arrived prior to $t_0$, but in doing so has "used up" its share of the processor until
beyond $t_0$ (i.e., its virtual time is greater than $t_0$); and in the inactive state if it has
no jobs awaiting execution at time $t_0$, and it has not used up its processor share
beyond $t_0$. Notice that a server is said to be active at time $t$ if it is in either the
activeContending or the activeNonContending state, and inactive otherwise.
The GRUB algorithm maintains an additional variable, called the system utilization
$U(t)$, which at each instant in time is equal to the sum of the capacities $U_i$
of all servers $S_i$ that are active at that instant in time. $U(t)$ is initially set equal to
zero.
GRUB is responsible for updating the values of these variables, and makes use of
them in order to determine which job to execute at each instant in time. At
each time, GRUB chooses for execution some server that is in its activeContending
state (if there are no such servers, then the processor is idled). Among the servers that
are in their activeContending state, GRUB chooses for execution the server
with the earliest deadline.
While $S_i$ is executing, its virtual time $V_i$ increases; while $S_i$ is not executing, $V_i$ does
not change. If at any time this virtual time becomes equal to the server deadline
($V_i = d_i$), then the deadline parameter is incremented by $P_i$ ($d_i \leftarrow d_i + P_i$). Notice
that this may cause $S_i$ to no longer be the earliest-deadline active server, in which case
it may surrender control of the processor to an earlier-deadline server.
1. If server $S_i$ is in the inactive state and a job $J_i^j$ arrives (at time instant $a_i^j$), then
the following code is executed
Figure 6.3 GRUB state transition diagram: node labels refer to server states and edge
numbers to transition rules.
5. While a job of server $S_i$ is executing, the server virtual time increases at a rate
$U/U_i$:
$$\frac{dV_i}{dt} = \begin{cases} U/U_i & \text{if } S_i \text{ is executing} \\ 0 & \text{otherwise} \end{cases}$$
If $V_i$ becomes equal to $d_i$, then $d_i$ is incremented by an amount $P_i$ ($d_i \leftarrow d_i + P_i$).
6. There is one additional possible state change: if the processor is ever idle, then
all servers in the system return to their inactive state.
Figure 6.3 shows the state transition diagram according to the above rules.
Notice that the rate at which the virtual time increases determines whether a server $S_i$
reclaims unused capacity or not. In fact, suppose that $U$ is equal to one (none of
the servers is inactive and the system is fully utilized); intuitively, $S_i$ will be allowed
to execute for $U_i P_i$ units within a server period $P_i$, since $V_i$ is incremented
at a rate $1/U_i$ while $S_i$ is executing. In this case, GRUB is equivalent to the CBS
algorithm and performance can be guaranteed as done for the CBS. However, if $U$
becomes less than one, the resource reclaiming capability of GRUB is enabled and the
currently executing server $S_i$ (whose $V_i$ is incremented at a rate $U/U_i$) starts to reclaim unused
bandwidth, executing for $(U_i P_i)/U$ units within a server period $P_i$.
In using excess processor capacity, though, we must be very careful not to end up using
any of the future capacity of currently inactive servers, since we do not know when the
currently inactive servers will become active. For this purpose, the slope of the virtual
time $V_i(t)$ is dynamically updated whenever any server changes its current state.
As an example of the resource reclaiming capability of GRUB, consider two servers
$S_1$ and $S_2$, both having bandwidth $U_1 = U_2 = 0.5$ and server period
$P_1 = P_2 = 6$. Assume that $S_1$ has a pending request (job $\tau_1^1$) at time $t = 0$, and
consider two possible cases: 1) server $S_2$ is active; 2) server $S_2$ is inactive. In the first
case ($S_2$ active), server $S_1$ will assign a deadline $d_1 = a_1^1 + P_1 = 0 + 6$ to job $\tau_1^1$
and $\frac{d}{dt} V_1(t) = U/U_1 = 1/0.5 = 2$; it follows that $\tau_1^1$ can execute for three units of
time before postponing the server deadline by a server period (no reclaiming occurs
and GRUB behaves like the CBS). On the other hand, if $S_2$ is inactive, server $S_1$ will
assign the same deadline $d_1 = 6$ to job $\tau_1^1$ as before, but the server virtual time will
increase at a rate $\frac{d}{dt} V_1(t) = U/U_1 = 0.5/0.5 = 1$; it follows that $\tau_1^1$ can execute for
six units of time before postponing the server deadline by a server period. In the latter
case, the reserved bandwidth of server $S_2$ is completely reclaimed by server $S_1$, which fully
utilizes the processor.
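The two cases can be checked numerically. The following Python sketch assumes, as in the example, that the virtual time starts at the job arrival time and that the deadline lies one server period later; the function name and its parameters are illustrative and not part of GRUB itself.

```python
from fractions import Fraction

def time_before_postponement(u_i, p_i, u_active):
    """Execution time S_i receives before V_i reaches d_i.

    Assumes V_i starts at the arrival time a and d_i = a + P_i, so V_i has to
    advance by P_i while growing at rate U / U_i.
    """
    rate = Fraction(u_active) / Fraction(u_i)   # dV_i/dt while S_i executes
    return Fraction(p_i) / rate

half = Fraction(1, 2)
print(time_before_postponement(half, 6, 1))      # S2 active:   U = 1
print(time_before_postponement(half, 6, half))   # S2 inactive: U = 0.5
```

The two calls return 3 and 6 time units respectively, matching the no-reclaiming and full-reclaiming cases discussed above.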
The behavior of the GRUB algorithm will be clarified by the following example.
EXAMPLE
Let us assume that servers $S_3$ and $S_4$, which together have $(U_3 + U_4) = 0.5$ of the
total processor capacity, are not active at all; this unused processor capacity could
in fact have been allocated to servers $S_1$ and $S_2$. Figure 6.4 shows the server behavior
when the following sequence of job arrivals occurs:
Server $S_2$ now becomes the only activeContending server in the system, and
consequently $\tau_2^1$ is executed. $V_2$ is incremented at a rate $U/U_2 = 0.5/0.3$.
At time 5, $S_1$ enters the inactive state. Now, $U$ becomes equal to $U_2 = 0.3$, and
$V_2(5)$ is equal to $(0.5/0.3) \times 3 = 5$. From now on, $V_2$ is incremented at a rate of
$0.3/0.3 = 1$.
Assuming that $c_2^1$ is equal to 5, $\tau_2^1$ completes execution at instant 7 and enters the
activeNonContending state; at this time, $V_2$ has increased to 7.
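The evolution of the virtual time in this part of the example can be reproduced by integrating the rate $U/U_i$ over the intervals in which the server executes. The sketch below is only an illustration: the helper name `virtual_time` is hypothetical, and the starting value $V_2 = 0$ is an assumption inferred from the value $V_2(5) = 5$ reported above.

```python
from fractions import Fraction

def virtual_time(v_start, u_i, segments):
    """Integrate dV_i/dt = U / U_i over (duration, U) execution segments."""
    v = Fraction(v_start)
    for duration, u_active in segments:
        v += duration * Fraction(u_active) / Fraction(u_i)
    return v

u2 = Fraction(3, 10)
# S2 executes for 3 time units while U = 0.5 (until time 5) ...
v_at_5 = virtual_time(0, u2, [(3, Fraction(1, 2))])
# ... and for 2 more time units while U = U2 = 0.3 (until time 7).
v_at_7 = virtual_time(v_at_5, u2, [(2, u2)])
print(v_at_5, v_at_7)
```

Exact fractions are used instead of floating point so that the values 5 and 7 are obtained without rounding effects.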
If the GRUB algorithm is replaced by four CBS servers with the same bandwidths and
server periods, job $\tau_1^1$ would execute for one unit of time before exhausting its server
budget. As a consequence, at time $t = 1$, server $S_1$ would postpone its deadline by
a server period ($d_1 = d_1 + P_1 = 5 + 5 = 10$), releasing the CPU to $\tau_2^1$. Similarly,
job $\tau_2^1$ would execute for 2.7 units of time before exhausting its server budget; hence,
at time $t = 3.7$, server $S_2$ would postpone its deadline by a server period ($d_2 =
d_2 + P_2 = 9 + 9 = 18$). Finally, both jobs would complete without postponing their
server deadline again. The schedule in this case is depicted in Figure 6.5.
Figure 6.5 Schedule produced by the CBS server without bandwidth reclamation
Comparing the two schedules generated in the example above, we immediately see
one of the advantages of the GRUB algorithm over non-reclaiming servers (like CBS):
since a reclaiming scheduler like GRUB is likely to execute a job for a longer interval
than a non-reclaiming scheduler, we would in general expect to see individual jobs
complete earlier under GRUB than under non-reclaiming servers.
Dedicated processor. Let $A_i^j$ and $F_i^j$ be the instants that job $\tau_i^j$ would begin and
complete execution, respectively, if server $S_i$ were executing on a dedicated processor
of capacity $U_i$. The following expressions for $A_i^j$ and $F_i^j$ can be easily derived:
$A_i^1 = a_i^1$
GRUB virtual processor. In 2000, Lipari and Baruah [GB00] bounded the error
introduced by the GRUB algorithm when emulating a virtual processor of capacity $U_i$.
In fact, they proved that the following inequality holds:
By using the results of Equation 6.3 and Equation 6.4, the following theorem can be
easily proved:
Theorem 6.3 The completion time of a job of server $S_i$ when scheduled by the GRUB
algorithm is less than $P_i$ time units after the completion time of the same job when $S_i$
has its own dedicated processor.
Proof.
Observe that
Thus, $f_i^j$ (the completion time of the $j$-th job of server $S_i$ when scheduled by the GRUB
algorithm) is strictly less than $P_i$ plus $F_i^j$ (the completion time of the same job when
$S_i$ has its own dedicated processor).
It is worth noting that the above theorem helps to decide how to set the server period,
which is a system parameter. In fact, the period $P_i$ is an indication of the "granularity"
of time from server $S_i$'s perspective; as a consequence, the smaller the value of $P_i$,
the more fine-grained the notion of real time for $S_i$, even though an additional cost is
introduced in terms of algorithm overhead (the deadline postponement of each server
is a function of the server period).
To limit these problems while still providing effective scheduling policies, we now
introduce two simple reclaiming techniques characterized by a low computational cost,
making them more suitable for small embedded devices.
This problem, along with the limited temporal horizon problem of the server deadline,
can be effectively contained by exploiting a well-known property of idle times. In
fact, under both static and dynamic priority scheduling, the following property holds:
Lemma 6.1 (Idle interval lemma) Given any schedule $\sigma$, if an idle interval $[t_0, t]$
occurs, the schedulability of jobs released at or after $t$ is not affected by jobs scheduled
before $t$. Hence, as far as the task set schedulability is concerned, instant $t$ can be
chosen as the new system start time, ignoring all the scheduling events before $t$.
A direct consequence of the idle interval lemma is to allow all the CBS servers to
restart their budget and deadline every time an idle interval occurs. It is worth noting
that the idle interval lemma also applies to a "zero length" idle interval ($t_0 = t$).
Such an anomalous idle interval occurs at time $t$ whenever all the jobs released before
$t$ complete by instant $t$ and at least one new job ($J_{new}$) is released exactly at $t$. The
visible effect of this anomaly is that the processor is never idle, but from a scheduling
point of view this event can be considered as a "zero length" idle interval occurring at
time $t$.¹ This type of reclaiming, called Deadline Advancement, is illustrated in Figure
6.7, where the same task set of Figure 6.6 is analyzed.
According to the example above, it can be noticed that a zero length idle interval occurs
at times $t = 1, 2, 3, 4$; as a consequence, server $S_1$ can be restarted four times before
server $S_2$ starts to serve its first request. After that, both servers behave according to
the classical CBS rules. It follows that the last $S_1$ job completes at time $t = 8$, instead
of $t = 14$, by exploiting the deadline advancement technique.
I ~ h reader
e can be easily con\-inced just imagining to clelay JIT,,.. by an amount t arbitrarily small: it
immediately follo\+s that an idle interval [t.t + c] occurs, so that the validity of the idle intenal lemma is
finally claimecl.
Figure 6.7 Example of CBS servers with deadline advancement
According to the consideration above, the CASH algorithm can be significantly sim-
plified, still achieving reasonable performance in terms of resource reclaiming. The
resulting approach, named Budget Adjustment, is an extension of the CBS server by
adding the following rule:
Rule: Whenever the currently executing CBS server $S_a$ becomes idle and has some
residual capacity ($c_a$) left, such an amount is transferred to the subsequent CBS
server $S_b$ (if any) present in the scheduler (EDF) ready queue. If there is no
available server, the residual capacity is not transferred, but it is maintained by
the idle CBS according to the classical CBS rules.
The validity of the above rule directly derives from the CASH properties. In fact, since
the EDF ready queue is ordered by increasing deadlines, the residual capacity transfer
can only occur from a server $S_a$ to a server $S_b$ with absolute deadline $d_b \ge d_a$. By
contradiction, if $d_b < d_a$, server $S_b$ would be inserted at the head of the ready queue
and a preemption would occur. Hence, according to the CASH rules, as the absolute
deadline $d_b$ is greater than or equal to the absolute deadline $d_a$, the capacity transfer
from server $S_a$ to server $S_b$ can be safely performed.
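The Budget Adjustment rule can be sketched as a single operation on the EDF ready queue. The code below is a minimal illustration under stated assumptions: the `Server` record, the `budget_adjustment` function, and the representation of the ready queue as a list sorted by increasing absolute deadline are all hypothetical names chosen here, not part of the original formulation.

```python
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    deadline: int   # absolute server deadline
    budget: int     # current (residual) capacity

def budget_adjustment(idle_server, ready_queue):
    """Apply the rule when idle_server has just run out of pending jobs.

    ready_queue holds the other ready servers sorted by increasing absolute
    deadline (EDF order). Returns the server that received the capacity, or
    None when nothing is transferred.
    """
    if idle_server.budget <= 0 or not ready_queue:
        return None                      # nothing to give, or nobody to receive it
    receiver = ready_queue[0]
    # EDF ordering implies d_b >= d_a; otherwise S_b would have preempted S_a.
    if receiver.deadline < idle_server.deadline:
        raise ValueError("ready queue is not consistent with EDF order")
    receiver.budget += idle_server.budget
    idle_server.budget = 0
    return receiver

s1 = Server("S1", deadline=4, budget=1)
s2 = Server("S2", deadline=8, budget=2)
budget_adjustment(s1, [s2])
print(s1.budget, s2.budget)
```

Compared with CASH, no separate queue of residual capacities is needed: the capacity is handed directly to the next server in EDF order, which is what keeps the run-time cost low.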
Consider a task set consisting of two periodic tasks, $\tau_1$ and $\tau_2$, with periods $P_1 = 4$
and $P_2 = 8$, maximum execution times $C_1^{max} = 4$ and $C_2^{max} = 3$, and average
execution times $C_1^{avg} = 3$ and $C_2^{avg} = 2$. Each task is scheduled by a dedicated CBS
having a period equal to the task period and a budget equal to the average execution
time. Hence, a task completing before its average execution time saves some budget,
whereas it experiences an overrun if it completes after. A possible execution of the
task set is reported in Figure 6.8, which also shows the capacity of each server and the
capacity transfer among servers whenever the "Budget Adjustment" policy allows it.
[Figure 6.8: schedule under the Budget Adjustment policy; legend: overrun, normal execution.]
In the previous chapters, the term "QoS" has been informally defined as something
related to the quality perceived by the final user, making the implicit assumption that
the QoS level is related to the number of respected deadlines.
To provide a more formal QoS definition, the quality achieved by each application must
be quantified in some way, and this is usually done by associating a utility value to each
task or application. Hence, QoS management can be defined as a resource allocation
problem, and mathematical tools for solving it already exist.
The fundamental issue for formulating and solving such an optimization problem is
to use an adequate QoS model that univocally maps subjective aspects (such as the
perceived quality that may depend on the user) to objective values (such as the utility,
that is a real number). In this chapter we will recall the most important QoS models
and their applications.
The first important aspect of this model is that each application is not characterized
by a single constraint, but may need to satisfy multiple requirements. This is an
important difference with respect to the traditional real-time model, in which each
task is characterized by a single deadline or priority. An audio streaming program is
a typical example of an application that can take advantage of the rich QoS description
provided by Q-RAM: the audio data must be decoded and put in the audio card buffer
before the buffer is empty (and this is a timing constraint), but can also be provided at
different sampling rates and encodings (affecting the final quality of the reproduced
audio). Moreover, different compression mechanisms can be used to reduce the needed
network bandwidth, and they may introduce some loss in the quality of the decoded
audio. Finally, there may be some additional latency constraints, or data may need
to be encrypted.
In summary, some important QoS dimensions may be: timeliness, security, data quality,
dependability, and so on. Each application may have a minimum QoS requirement
along each dimension (for example, the audio must not be sampled at less than 8 bits
per sample). In some cases, a maximum QoS requirement along some dimensions
can make sense too (for example, it may be useless to sample audio data at more than
48000 Hz).
Another important characteristic of the Q-RAM model is that it recognizes that each
application needs different resources for its execution (for example, each application
will surely need some CPU time and some memory), and decreasing the need for
one resource may increase the need for a different one. For example, let us consider
the audio streaming application again: the amount of network bandwidth needed for
receiving an audio stream can be decreased by compressing the audio data, but this
will increase the CPU requirements. If a formalized model like Q-RAM is not used,
finding the correct resource trade-off may be difficult, and empirical methodologies
are often used to tune the application.
Note that $R_j^{max}$ depends both on the resource and on the algorithm used to allocate
the resource. For example, considering the CPU, $R_{CPU}^{max}$ is 1 if EDF is used as a
scheduling algorithm, and can be 0.69 if RM is used. To be generic enough, Q-RAM
assumes that each resource is scheduled so that an amount $R_{i,j}$ of $R_j$ is assigned
to task $\tau_i$, but does not make any assumption on the particular scheduling algorithm
(the scheduling algorithm behavior is modeled through $R_j^{max}$). Returning to the CPU
example, any algorithm providing temporal protection (such as a reservation algorithm
or a proportional share algorithm) can be used.
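The 0.69 figure quoted for RM corresponds to the limit of the classical Liu and Layland utilization bound $n(2^{1/n} - 1)$, which tends to $\ln 2 \approx 0.693$ as the number of tasks grows. The short Python sketch below evaluates the bound; the function name is chosen here for illustration, and the bound is only a sufficient condition, so actual RM task sets may be schedulable at higher utilizations.

```python
import math

def rm_utilization_bound(n):
    """Liu and Layland least upper bound on the CPU utilization of n tasks under RM."""
    return n * (2 ** (1 / n) - 1)

for n in (1, 2, 5, 100):
    print(n, round(rm_utilization_bound(n), 4))
print("limit:", round(math.log(2), 4))
```

Under EDF the corresponding value is 1 regardless of the number of tasks, which is why the usable amount of the CPU resource in the model depends on the scheduling algorithm.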
In order to define an optimality criterion, each application is assigned a utility $U_i$,
defined as the value achieved by assigning $(R_{i,1}, \ldots, R_{i,m})$ to $\tau_i$. To be more precise,
the utility is a function $U_i(R)$ of the resource assignment; for this reason, $U_i(R)$ is
referred to as the utility function of $\tau_i$. The total system utility is defined as a weighted
sum of all the applications' utilities:
Note that throughout this chapter the $U_i()$ symbol is used to denote the utility function,
and not the utilization as in the rest of the book. This notation has been adopted for
consistency with the original work.
The previous definitions provide the foundation for a QoS optimization problem that
can be successfully solved under the following assumptions:
3. The utility functions $U_i()$ (and the dimensional utility functions $U_{i,k}()$) are non-decreasing
in all the arguments $R_{i,j}$.
Note that the first assumption is only used to simplify the analysis, but it is not strictly
needed. Also, note that the application importance $w_i$ can be eliminated from the model
by considering the weighted utility function $w_i U_i()$ instead of $U_i()$. At this point, it is
clear that the goal of Q-RAM is to find a matrix $R$ such that:
The generic case (multiple resources and multiple QoS dimensions with generic utility
functions) is not easy to solve, hence the authors propose different solution algorithms
that are valid under some simplifying assumptions. The single resource and single QoS
dimension case is analyzed first, under the additional hypothesis that the utility function
$U_i(R)$ is twice continuously differentiable and concave. Let $R_1$ be the single resource
considered in this case. Because of Assumption 2 (the minimum resource constraints
can be satisfied), it is possible to focus only on the allocation of the "excess resource"
$R'_i = R_{i,1} - R_{i,1}^{min}$ (the $j$ and $k$ indexes have been removed from $R'$ because there
is only one resource and one QoS dimension). By definition, $R'^{max} = R_1^{max} - \sum_{i=1}^{n} R_{i,1}^{min}$,
and $R'^{min}_i = 0$. Standard results from operations research (the Kuhn-Tucker
theorem) ensure that a resource allocation $R'$ is optimal only if $\forall i$, $R'_i = 0$, or
for any $(i, h)$ with $R'_i > 0$ and $R'_h > 0$, the first derivative of $U_i()$ computed in $R'_i$ is equal
to the first derivative of $U_h()$ computed in $R'_h$. Note that this condition is necessary,
but not sufficient. Based on this result, it is possible to find an optimal allocation $R'$
by using an iterative algorithm. At each step, the following quantities are computed:
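To illustrate the Kuhn-Tucker condition stated above (and not the authors' iterative algorithm), the following Python sketch allocates the excess resource among applications whose utilities have an assumed concave shape $U_i(r) = w_i (1 - e^{-k_i r})$. The utility shape, the function name `allocate`, and the bisection on the common marginal utility $\lambda$ are all assumptions introduced here for the example.

```python
import math

def allocate(excess, apps, iters=200):
    """Split `excess` among apps so that all positive shares have equal marginal utility.

    apps: list of (w, k) pairs describing U_i(r) = w * (1 - exp(-k * r)), an
    assumed concave utility. The marginal utility is w * k * exp(-k * r), so for
    a common value lam the share is max(0, ln(w * k / lam) / k).
    """
    def shares(lam):
        return [max(0.0, math.log(w * k / lam) / k) for w, k in apps]

    lo, hi = 1e-12, max(w * k for w, k in apps)   # shares(hi) are all zero
    for _ in range(iters):
        mid = math.sqrt(lo * hi)                  # bisection on a logarithmic scale
        if sum(shares(mid)) > excess:
            lo = mid                              # lam too small: too much demanded
        else:
            hi = mid
    return shares(hi)

print(allocate(2.0, [(1.0, 1.0), (1.0, 1.0)]))    # two identical applications
print(allocate(1.0, [(1.0, 1.0), (0.01, 1.0)]))   # the second one gets no excess
```

In the second call the low-importance application receives no excess resource, which corresponds to the $R'_i = 0$ case of the optimality condition, while every application with a positive share ends up with the same first derivative.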
Two QoS dimensions are independent if a quality variation in one of them does not
change the amount of resource needed to keep the quality level on the other dimension
stable. In this case, dimension utilities are additive. Conversely, two QoS dimensions
are dependent if a quality variation in one of them can cause a quality change on the
other one, assuming the amount of resources is not increased.
If all the QoS dimensions are independent, then $U_i(R) = \sum_k U_{i,k}(R)$, and the optimization
problem can be treated as a single-QoS-dimension problem, by introducing
some fake applications $\tau'_i$ that describe the various QoS dimensions. Hence, the resource
allocation problem is transformed into an equivalent problem, where the new
task set is composed of $n \cdot d$ tasks (remember that $n$ is the number of applications
and $d$ is the number of QoS dimensions). Tasks from $\tau'_1$ to $\tau'_d$ will describe the $d$ QoS
dimensions of the first application (and will be characterized by the utility functions
$U_{1,1}(R), \ldots, U_{1,d}(R)$), tasks from $\tau'_{d+1}$ to $\tau'_{2d}$ will describe the QoS dimensions of the
second application, and so on. This new problem is a single-QoS-dimension problem,
and can be solved using the algorithm presented above.
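The transformation into $n \cdot d$ fake applications amounts to flattening the table of dimensional utilities. The sketch below shows the idea in Python; the function names and the representation of each $U_{i,k}$ as a callable are assumptions made for the example.

```python
def split_into_pseudo_tasks(dimensional_utilities):
    """dimensional_utilities[i][k] is the callable U_{i,k}; returns n*d (i, k, U_{i,k}) triples."""
    return [(i, k, u)
            for i, dims in enumerate(dimensional_utilities)
            for k, u in enumerate(dims)]

def application_utility(pseudo_tasks, i, r):
    """U_i(R) as the sum of its dimensional utilities (independent dimensions only)."""
    return sum(u(r) for app, _, u in pseudo_tasks if app == i)

# Two applications with two QoS dimensions each (arbitrary illustrative utilities).
utilities = [
    [lambda r: 2 * r, lambda r: r],
    [lambda r: r / 2, lambda r: 3 * r],
]
pseudo = split_into_pseudo_tasks(utilities)
print(len(pseudo), application_utility(pseudo, 0, 1.0), application_utility(pseudo, 1, 1.0))
```

Each triple can then be handed to a single-QoS-dimension solver as if it were a separate application.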
If QoS dimensions are dependent, solving the problem is more complex. In this case,
the total system utility is a multi-dimensional function of the dimensional utilities. If
the dimension utility functions $U_{i,k}(R)$ are continuous, imposing $R_i = k$ (remember
that there is only a single resource in the system) defines a surface in the QoS space
that can be projected on the function that maps dimensional utilities to the system
utility. By getting the maximum utility for each $R_i = k$ surface, the problem is
again transformed into a single-QoS-dimension problem, and it is possible to apply the
algorithm presented above.
The case in which the dimensional utility functions are not continuous (i.e., QoS di-
mensions are discrete) cannot be treated in this way, and is even more complex. In
fact, the authors propose only a nearly-optimal algorithm, and further investigate the
problem in a different paper [LRLS98]. In such a paper, the authors prove that solving
the optimization problem in the case of dependent and discrete QoS dimensions is
NP-hard, and propose an approximation based on a polynomial-time algorithm that
provides a solution at a bounded distance from the optimal resource allocation.
If more than one resource is considered (multiple resource problem), the complexity of
the optimization problem increases, because some new degrees of freedom are added.
To make resource allocation more tractable with conventional mathematical tools, the
authors add some additional constraints: first of all, the system can work according to $M$
different schemes, and for each scheme the utility function mapping the requirements
of resource $R_j$ to the utility does not depend on the other resources. Hence, once a
scheme and a utility level $U$ are chosen, the requirements for all the resources can
be univocally determined. Moreover, all the utility functions are chosen to be linear
(with a saturation). In this way, the resource allocation problem can be formulated
as a linearized mixed integer programming problem that can be solved by using some
numerical method.
A taxonomy of the different algorithms that can be used for allocating system resources
when Q-RAM is adopted, together with a comparison of such algorithms based on
accuracy and computational cost, can be found in [CLS99].
As a final remark, note that utility functions are generally the results of a subjective
evaluation of the output quality, and may depend on the user (what a final user evaluates
as a "good quality" can be unsatisfactory for a different user). Hence, assigning utility
functions is not easy. In some cases, however, utility values can be deterministically
associated to a resource configuration. For example, in control applications it is easy to
define a control metric that describes the quality of the control action. It can be based
on the period of the control tasks or on other quantities dependent on the amount of
resources allocated to the control tasks. This will be better explained in Section 7.3.
Static resource management is performed at a system design phase, and can be for-
mulated as an off-line optimization problem. During system design, if requirements
and resource consumptions of each task are known in advance, then it is possible to
formulate the optimization problem and solve it to find the optimal resource assignment
and scheduling parameters. In this case, the complexity of the optimization algorithm
and the time needed to find an optimal solution are not critical, and the accuracy
of the solution is the most important factor.
Static (a-priori) resource assignment has been traditionally used in designing critical
real-time systems, and it is still a good choice for those systems in which an objective
QoS metric exists, and the relation between resource usage and QoS level is clear and
known. Control systems are a good example: in a feedback controller, the quality of the
control action can be clearly expressed by an objective metric, based on the difference
between the response of the closed loop system and the desired response.
Dynamic resource management can be performed at run time to better cope with system
unpredictability, or with the inherent dynamic nature of many real-time applications.
In this case, the resource optimization problem is solved on line by an active entity,
typically a QoS Manager. The QoS manager is a task responsible for dynamically
assigning system resources, and tuning them so that the global utility is maximized.
The QoS manager partitions system resources by using a mechanism and a policy: the
mechanism is used for assigning a specified amount of resources to each task, and
can be based on modifying the scheduling parameters, or on changing the application
behavior. The first approach does not require any modification in the applications,
but implies a strict cooperation between the QoS manager and the scheduler. Hence,
the QoS manager turns out to be tightly dependent on the kernel, and on the adopted
scheduling algorithm. The second approach makes the QoS manager and the
applications independent of the kernel and of the scheduling algorithm, but requires
heavy modifications to the applications to support dynamic QoS adaptation. Every
"QoS-Aware" application must support different service levels, and must be able to switch
between them upon manager requests. Application-level QoS adaptation, as described
for example in Section 8.3, is an example of this approach.
The policy is used by the QoS manager for deciding how to partition system resources
among tasks. Such an assignment can be performed by the QoS manager by solving
an on-line optimization problem (similar to the one proposed by Q-RAM), or by using
some kind of heuristics. In this case, the complexity of the optimization algorithm (and
the amount of time needed to solve it) becomes relevant.
The QoS manager can perform its dynamic resource assignment decisions based on
resource requirements explicitly declared by the applications, or it can use some form
of feedback from the system. In the first case, applications have to explicitly declare
their requirements and resource consumptions (e.g., by using something similar to the
Q-RAM utility functions). Using this approach, the QoS manager decisions can be
performed every time an application enters the system, leaves the system, explicitly
requires to change its service level, or changes its requirements or declared resource
consumptions. An example of this approach is given by the Elastic Task Model,
presented in Section 2.7.1. When using some form of feedback, the QoS manager
periodically monitors system performance and application resource usage to dynamically
construct the utility functions. This approach results in a form of feedback scheduling,
which is treated in Chapter 8.
Finally, it is worth observing that an abstract QoS model, such as Q-RAM, is funda-
mental for implementing any form of QoS management or any kind of QoS manager.
In particular, the Q-RAM model is generic enough to be used in both dynamic and
static resource allocation, and the authors put a lot of effort in developing optimization
algorithms that are efficient enough to be used on line.
where $f_i$ is the frequency of $\tau_i$, $\alpha_i$ is a magnitude coefficient, and $\beta_i$ is the decay rate.
A typical PLI is illustrated in Figure 7.1, where $f_m$ is the lower bound on the sampling
frequency.
The performance loss index of the overall system $\Delta J(f_1, \ldots, f_n)$ is defined as follows:
$$\Delta J(f_1, \ldots, f_n) = \sum_{i=1}^{n} w_i \, \Delta J_i(f_i)$$
where $w_i$ is a design parameter determined from the application. For instance, it can
be the relative importance of the task in the control system with respect to the others.
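As a small numerical illustration, the following Python sketch evaluates the system-level index as a weighted sum of per-task indices, assuming the exponential form $\alpha_i e^{-\beta_i f_i}$ that appears in the optimization objective of this section. The function names and the parameter values are hypothetical and chosen only for the example.

```python
import math

def pli(alpha, beta, f):
    """Per-task performance loss index, assumed of the form alpha * exp(-beta * f)."""
    return alpha * math.exp(-beta * f)

def system_pli(tasks, freqs):
    """Weighted sum of per-task PLIs; tasks is a list of (w, alpha, beta) triples."""
    return sum(w * pli(a, b, f) for (w, a, b), f in zip(tasks, freqs))

tasks = [(1.0, 1.0, 0.1), (2.0, 1.0, 0.2)]   # illustrative (w, alpha, beta) values
print(system_pli(tasks, [5, 10]))
print(system_pli(tasks, [10, 20]))            # higher frequencies give a lower loss
```

Because each term decays monotonically with its frequency, raising any sampling frequency lowers the overall index, which is why the optimization has to trade the frequencies against the available processor bandwidth.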
Given the available bandwidth ($A$), the minimum permitted frequency ($f_i^{min}$), the
worst-case execution time ($WCET_i$) and the weighted PLI ($w_i \Delta J_i(f_i)$) of each task $\tau_i$
as input parameters, Seto, Lehoczky, Sha and Shin provided an optimization algorithm
to compute the frequencies $f_i^{opt}$
which minimize the PLI of the system while guaranteeing
the schedulability constraints (i.e., ensuring each task will meet its deadlines).
Notice that each task frequency $f_i^{opt}$ computed by this technique is always greater than
or equal to the corresponding minimum frequency $f_i^{min}$. After defining the notion of
PLI, the next section describes the control performance optimization algorithm used to
assign the task frequencies when using digital control.
¹ In the original formulation, the performance loss index was simply called performance index or PI. In
the following, it will be called PLI for more clarity.
Figure 7.1 Control system Performance Loss Index as a function of the sampling frequency
(the lower bound $f_m$ is marked on the sampling frequency axis).
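$$\min_{(f_1, \ldots, f_n)} PLI = \sum_{i=1}^{n} w_i \, PLI_i(f_i) = \sum_{i=1}^{n} w_i \alpha_i e^{-\beta_i f_i}$$
subject to:
$$\sum_{i=1}^{n} WCET_i \, f_i \le A, \qquad f_i \ge f_i^{min}, \quad i = 1, \ldots, n$$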
where n is the total number of tasks in the system. Having defined the control optimiza-
tion problem, the following theorem introduces the control performance optimization
algorithm for computing its unique optimal solution.
Theorem 7.1 Given the objective function and the constraints of the "control optimization
problem", there exists a unique optimal solution given by:
In practice, the first step for identifying the optimal value of each $f_i$ is to evaluate the
parameter $p$; that is, the smallest integer $p$ such that equation (7.8) is verified. After
that, the second step is straightforward and just consists in evaluating equation (7.6)
for each task.
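The optimization problem can also be explored numerically. The following Python sketch is not the closed-form procedure of Theorem 7.1; it is a generic alternative that assumes the exponential PLI model and the two constraints stated above, applies the stationarity condition of the Lagrangian, and searches the multiplier by bisection. The function name `optimal_frequencies` and all parameter values in the usage note are hypothetical.

```python
import math


def optimal_frequencies(w, alpha, beta, wcet, f_min, A=1.0, iters=200):
    """Minimize sum_i w[i]*alpha[i]*exp(-beta[i]*f[i]) subject to
    sum_i wcet[i]*f[i] <= A and f[i] >= f_min[i].

    All parameters are assumed to be strictly positive; wcet is in seconds
    and frequencies are in Hz, so wcet[i]*f[i] is a utilization.
    """
    if sum(c * f for c, f in zip(wcet, f_min)) > A:
        raise ValueError("minimum frequencies already exceed the bandwidth A")

    def freqs(lam):
        # Stationarity gives w*alpha*beta*exp(-beta*f) = lam*wcet for tasks
        # above their lower bound; the bound itself is enforced by clamping.
        return [max(fm, math.log(wi * ai * bi / (lam * ci)) / bi)
                for wi, ai, bi, ci, fm in zip(w, alpha, beta, wcet, f_min)]

    def load(lam):
        return sum(c * f for c, f in zip(wcet, freqs(lam)))

    # With this multiplier every task sits at f_min, so load(hi) <= A.
    hi = max(wi * ai * bi / ci for wi, ai, bi, ci in zip(w, alpha, beta, wcet))
    lo = hi
    while load(lo) < A:          # the load grows without bound as lam -> 0
        lo /= 2.0
    for _ in range(iters):       # bisection on a logarithmic scale
        mid = math.sqrt(lo * hi)
        if load(mid) > A:
            lo = mid
        else:
            hi = mid
    return freqs(hi)
```

For instance, `optimal_frequencies([1.0, 2.0], [1.0, 1.0], [0.4, 0.1], [0.010, 0.0075], [10.0, 20.0])` returns two frequencies that are both above their lower bounds and together use the whole bandwidth $A = 1$.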
AN EXAMPLE
The following example will clarify how the control performance optimization algorithm
works. The technique is applied to a bubble control system, which is a simplified model
designed to study diving control in submarines [SLSS97]. The bubble control system
considered here consists of a tank filled with air and immersed in the water. Depth
control of the diver is achieved by adjusting the piston connected to the air bubble. In
this example, a camera monitors the diver as sensor for getting its position.
Suppose that two such systems with different physical dimensions are installed on an
underwater vehicle to control the depth and orientation of the vehicle, and assume
they are controlled by one on-board processor. The task set parameters are shown in
Table 7.1, where, for each bubble control system $i$, $WCET_i$ (ms) is the control task
worst-case execution time in each sampling period, $f_i^{min}$ (Hz) is the lower bound on
sampling frequency, and $w_i$ is the weight assigned to system $i$.
The following data are given for the control design and scheduling problem:
$\Delta J_i = \alpha_i e^{-\beta_i f_i}$, $i = 1, 2$, where the frequencies $f_i$ must be determined.
A simple computation shows that the total CPU utilization of the overall bubble system
is 75% when the minimum task frequencies are assigned. Supposing the total CPU
utilization available for the bubble systems is 100%, the control performance optimization
algorithm allows assigning the optimal task frequencies to fully utilize the CPU.
In particular, to compute the frequencies $f_i^{opt}$, the correct value of the parameter $p$
($p = 0$) must be determined first; then, the optimal frequencies are computed by means
of equation (7.6). In conclusion, it follows that $f_1^{opt} = 12.16$ Hz and $f_2^{opt} = 27.81$ Hz,
achieving a resulting Performance Loss Index $PLI = 0.0772$.
When considering periodic activities, the QoS can often be adapted by changing the
activation rate of the application, and smooth QoS adaptation can be implemented by
enforcing a gradual transition of the period. Typically, a rate change may be caused
either by the task itself, as a response to a variation that occurred in the environment, or
by the system, as a way to cope with an overload condition. For example, whenever
a new task cannot be guaranteed by the system to meet its timing constraints, instead
of rejecting the task, the system can try to reduce the utilizations of the other tasks
(by increasing their periods in a controlled fashion) to decrease the total load and
accommodate the new request.
The problem of rate adaptation during overload conditions has been widely considered
in the real-time literature and has been treated in Section 2.7. In this section, we describe
a method for achieving smooth rate transitions in periodic tasks that are required to
adapt to abrupt environmental or system changes. This method was originally proposed
by Buttazzo and Abeni [BA02b] as an extension of the elastic task model (see Section
2.7.1). According to the elastic model, task utilizations are treated as springs with
given elastic coefficients. To achieve smooth rate transitions, the model is extended by
coupling each spring with a damping device which prevents abrupt period changes.
In the following, $T_i$ will denote the actual period of task $\tau_i$, which is constrained to be
in the range $[T_{i_0}, T_{i_{max}}]$. Any period variation is always subject to an elastic guarantee
and is accepted only if there exists a feasible schedule in which all the other periods are
within their range. In this framework, tasks are scheduled by EDF; hence, if
$\sum_i C_i / T_{i_0} \le 1$, all tasks can be created at the minimum period $T_{i_0}$; otherwise the
elastic algorithm is used to adapt the tasks' periods to $T_i$ such that
$\sum_i C_i / T_i = U_d \le 1$, where $U_d$ is some desired utilization factor.
Figure 7.2 A damped elastic element.
For the sake of completeness, a damped spring is a special case of a system which
behaves as a mechanical impedance, with stiffness $k$, damping $b$, and mass $m$, as
shown in Figure 7.3.
From a system point of view, the input/output behavior of a linear system like this
is described by the ratio of two variables: the effort and the flow. For a mechanical
system, effort is represented by force and torque, and flow is represented by linear and
angular velocity. Motors and batteries are equivalent from a system point of view, both
being effort sources. Similarly, a current generator or a rotating shaft are both flow
sources.
From $G(z)$, the discrete time equation expressing the position $x(t)$ of the damped
spring as a function of the force $F(t)$ becomes:
$$x(t) = (1 - p) F(t - 1) + p \, x(t - 1). \qquad (7.11)$$
It is worth observing that any transient law can be used to perform a transition from one
period to another. The one expressed by equation (7.11) is just the one which describes
the change occurring in a damped spring, which is exponential. In the experiments
described below, a linear period transition will also be evaluated.
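To make the exponential nature of this law concrete, the following Python sketch iterates the recurrence of equation (7.11) under a constant input. The function name `damped_transition` and the numeric values are illustrative assumptions, and the position variable is treated here as the quantity being adapted (for instance a period).

```python
def damped_transition(x0, target, p, steps):
    """Iterate x(t) = (1 - p) * F(t - 1) + p * x(t - 1) with a constant
    input F = target, starting from x(0) = x0.

    With 0 <= p < 1 the distance from the target shrinks by a factor p at
    every step, which is the exponential behavior of a damped spring.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1) for the transition to converge")
    xs = [x0]
    for _ in range(steps):
        xs.append((1.0 - p) * target + p * xs[-1])
    return xs
```

With `x0 = 100`, `target = 50` and `p = 0.5`, the first three steps give 75, 62.5 and 56.25: each step halves the remaining distance from the target.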
The graceful rate adaptation mechanism can be implemented on top of the Elastic Manager
(EM), as a periodic task, the Damping Manager (DM). The purpose of the DM is to perform
the rate transition according to the transition law set by the user. To bound the transition
time, the DM runs with a period $T_{DM}$ and performs a full transition in $N$ steps, requiring
an interval of $N T_{DM}$ time units.
The DM task can be in two states: active or idle; when the system is started, the DM
is idle, and it becomes active when some other task wants to change its period. When
the DM task becomes active at time $t_0$, instead of changing the periods immediately,
it gracefully changes them during a transient of size $T = N T_{DM}$.
After $N$ activations, the periods reach their final values and the period adapter returns
to its idle state, waiting for the next request. More specifically, the graceful adaptation
mechanism works as follows:
• When a task $\tau_i$ wants to change its period from $T_i$ to $T_i^{new}$, it posts a request to
the Damping Manager.
• When the DM is idle and receives a new request, it becomes active and computes
the next period value $T_i(k)$ according to the transition law set by the user. We
note that $T_i(0) = T_i$, and after $N$ steps $T_i(N) = T_i^{new}$.
• At each period $T_{DM}$, the Damping Manager updates the period $T_i$ to the next
value $T_i(k)$ and invokes the Elastic Manager to achieve a feasible configuration.
• After $N$ activations, the periods are adjusted to their final values, and the period
adapter returns to its idle state.
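The steps above can be sketched as a small state machine. The Python fragment below is only an illustration under stated assumptions: the class `DampingManager`, the callback `elastic_manager`, the function `linear_law` and the policy of rejecting a request while a transition is in progress are all hypothetical, since the text does not specify these details.

```python
def linear_law(t_old, t_new, k, n_steps):
    """Linear transition law: T(0) = t_old and T(n_steps) = t_new."""
    return t_old + (t_new - t_old) * k / n_steps


class DampingManager:
    """Hypothetical sketch of the DM handling one request at a time."""

    def __init__(self, n_steps, law, elastic_manager):
        self.n_steps = n_steps                  # number of transition steps
        self.law = law                          # transition law set by the user
        self.elastic_manager = elastic_manager  # callback(task, period)
        self.request = None                     # None means the DM is idle
        self.k = 0

    def post_request(self, task, t_old, t_new):
        # Assumption: a request arriving during a transition is rejected.
        if self.request is not None:
            raise RuntimeError("a transition is already in progress")
        self.request = (task, t_old, t_new)
        self.k = 0

    def activate(self):
        """Body of the periodic DM task, executed once per DM period."""
        if self.request is None:
            return
        task, t_old, t_new = self.request
        self.k += 1
        period = self.law(t_old, t_new, self.k, self.n_steps)
        self.elastic_manager(task, period)      # restore a feasible configuration
        if self.k == self.n_steps:
            self.request = None                 # transition complete, back to idle
```

Passing a different `law` function (for example one based on an exponential recurrence) changes the shape of the transient without touching the manager itself.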
There are some details to be considered in the implementation of the Damping Manager.
When $\tau_4$ is started, the task set is not schedulable with the current periods, thus the
EM tries to accommodate the request of $\tau_4$ by increasing the periods of the other tasks
according to the elastic model. The actual execution rates of the tasks are shown in
Figure 7.5. Notice that, although the first three tasks have the same elastic coefficients
and the same initial utilization, their periods are changed by a different amount, because
$\tau_3$ reaches its maximum period.
To verify the behavior of the Damping Manager, another experiment has been performed
using 4 tasks with the parameters shown in Table 7.3. Considering the utilization
reserved for the EM, the DM and other device handlers in the system, the effective
total utilization $U_{max}$ available for the task set is 0.782. Since
$23/100 + 23/100 + 23/100 + 23/100 > 0.782$, the periods are initially expanded by the
elastic law, and the tasks start with current periods different from their nominal periods:
$T_1 = 107$, $T_2 = 107$, $T_3 = 122$, and $T_4 = 143$. At time $t = 5$, $\tau_1$ issues a request to
change its period to $T_1^{new} = 50$, and the DM starts to gracefully adapt the periods. At time
$t = 15$, $\tau_1$ issues a request to change its period to 250, and all the other periods can
gracefully go to their nominal values. In this experiment, the transient periods $T_1(k)$
for $\tau_1$ were generated using a linear law.
Table 7.3 Task set parameters used for the second experiment. Times are expressed in
milliseconds.
The result of this experiment is illustrated in Figure 7.6, which shows the number of
executed jobs as a function of time. Figure 7.7 shows how task periods evolve during the
transition. It is worth observing that, although $T_1$ is modified using a linear transition
law, the other periods vary according to a non-linear function. This happens because,
when $T_1$ is decreased, the total processor utilization increases, so the Elastic Manager
performs a compression of the other tasks (enlarging their periods) to keep the total load
constant. Given the non-linear relation between total utilization and periods
($U = C_1/T_1 + \ldots + C_n/T_n$), the other periods change in a non-linear fashion. Finally,
Figure 7.8 shows the period evolution when an exponential law is used for $\tau_1$ to modify
its period.
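The non-linear reaction of the other periods can be reproduced with a short sketch of elastic compression. The Python code below follows the usual formulation of the elastic task model (each elastic task gives up utilization in proportion to its elastic coefficient, and tasks that hit their maximum period are frozen), but it is only an illustration: the function `elastic_compress` and the coefficients and maximum periods in the usage note are assumptions, because the values of Table 7.3 are not reproduced here.

```python
def elastic_compress(C, T0, Tmax, E, U_d):
    """Return periods whose total utilization does not exceed U_d.

    C: computation times, T0: nominal periods, Tmax: maximum periods,
    E: elastic coefficients (E[i] == 0 marks a task whose period is fixed),
    U_d: desired total utilization.
    """
    n = len(C)
    U0 = [C[i] / T0[i] for i in range(n)]
    if sum(U0) <= U_d:
        return list(T0)                      # no compression needed
    fixed = [E[i] == 0 for i in range(n)]
    T = list(T0)
    while True:
        U_f = sum(C[i] / T[i] for i in range(n) if fixed[i])
        E_v = sum(E[i] for i in range(n) if not fixed[i])
        U_v0 = sum(U0[i] for i in range(n) if not fixed[i])
        if E_v == 0:
            raise ValueError("no feasible elastic configuration")
        saturated = False
        for i in range(n):
            if fixed[i]:
                continue
            # Each elastic task absorbs a share of the excess load
            # proportional to its elastic coefficient.
            U_i = U0[i] - (U_v0 - U_d + U_f) * E[i] / E_v
            if U_i <= 0 or C[i] / U_i > Tmax[i]:
                T[i] = Tmax[i]               # freeze at the maximum period
                fixed[i] = True
                saturated = True
            else:
                T[i] = C[i] / U_i
        if not saturated:
            return T
```

As a usage example with hypothetical data, take four tasks with `C = [23] * 4`, `Tmax = [1000] * 4`, `E = [0, 1, 1, 1]` and `U_d = 0.782`, and let the period of the first task take the equally spaced values 107, 88 and 69 while the other nominal periods stay at 100. The compressed period of the second task grows by unequal amounts between consecutive steps, even though the first period changes linearly.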
A different experiment has been performed using the task set shown in Table 7.4 to
test the behavior of the DM in the presence of dynamic task activations. In this case,
when a new task $\tau_k$ needs to enter the system with period $T_k$ and there exists a feasible
elastic schedule for it, it cannot be immediately activated with that period. In fact, the
other tasks have to gradually reduce their utilizations (according to the damping law)
to decrease the load and create space for the new task. As a consequence, the new task
is activated with a large period (theoretically infinite, practically equal to MAXINT),
which is gradually reduced to the final $T_k$ value by applying the damping law. We
note that the time required for the transition is always bounded by $N T_{DM}$, where $N$
is the number of steps fixed for the transition and $T_{DM}$ is the period of the Damping
Manager.
Figure 7.6 Number of processed jobs as a function of time when $\tau_1$ changes its period
with a linear law.
Figure 7.7 Periods evolution as a function of time when $\tau_1$ changes its period with a linear
law.
Figure 7.8 Periods evolution as a function of time when $\tau_1$ changes its period with an
exponential law (task periods plotted against time in usec).
Table 7.4 Task set parameters used for the third experiment. Times are expressed in
milliseconds.
In the experiment, task $\tau_2$ was added at time $t = 5$ sec, and the DM started to decrease
its period towards the final value $T_2 = 100$ ms. Figures 7.9 and 7.10 show the results
of this experiment when a linear transition law was used. It is worth noticing that, as
a consequence of the arrival of $\tau_2$, all the task periods begin to gracefully expand to create
space for $\tau_2$, thus $\tau_2$ actually begins to execute only when $T_2(k) < T_2^{max}$. From
Figure 7.10 it is also interesting to observe that, as $T_2$ is decreased linearly, the other
periods increase exponentially, based on their elastic coefficients, for the same reason
explained in the previous experiment.
Figure 7.11 shows the result achieved on the same task set using an exponential transient
law. In this case, the activation delay experienced by $\tau_2$ is smaller with respect to the
linear case. Moreover, the other periods reach their final values with a much smoother
transition.
Figure 7.9 Number of processed jobs as a function of time when task $\tau_2$ is dynamically
activated and its period is changed using a linear transition law.
Figure 7.10 Periods evolution as a function of time when task $\tau_2$ is dynamically activated
and its period is changed using a linear transition law.
From the experiments presented above, it can be seen that, when an active task wants
to change its period (either lower or higher than the current one), a linear transition law
is able to achieve smoother period variations on the other tasks. On the other hand,
when a new task needs to be activated in the system, an exponential law (for reducing
its period to the required final value) is able to vary the other periods more gracefully
and it also reduces the activation delay of the newly arrived task.
Figure 7.11 Periods as a function of time when $T_2$ is changed using an exponential
transition law.
FEEDBACK SCHEDULING
Traditional hard real-time applications are designed to respect all deadlines of every
task in worst-case scenarios. Although such an approach is very effective when the
characteristics of the system are known in advance, it presents some problems for
highly dynamic systems, where the characteristics of the environment can vary during
the system's lifetime, and the total utilization is subject to online (and often unpredictable)
fluctuations. In these cases, the classical hard real-time approach suffers from the
following problems:
One of the first proposed real-time closed-loop schedulers, Feedback Control EDF
(FC-EDF) [SLS99], was originally developed for working with insufficient resources (i.e., in
overload conditions), but it can also be used for handling dynamic systems characterized
by unpredictable workloads. In particular, FC-EDF uses a feedback scheme on the EDF
scheduling algorithm to tolerate uncertainty in tasks' execution times. The observed
variable is the deadline miss ratio $M$, defined as the ratio of the number of missed
deadlines and the total number of deadlines in an observation window. Admission
control is used as an actuator to affect the system workload. In particular, the control
action on the system utilization is performed by rejecting tasks or changing their service
level.
When a new task $\tau_i$ arrives in the system, it has to pass an admission test that uses
information from the feedback to decide whether $\tau_i$ can be accepted or not. Every
accepted task is characterized by two or more service levels, having different execution
times and different qualities of the output (see also the imprecise computation model
[HLW91]). A service level controller uses information from the feedback to set the
tasks' quality levels so that the system load is increased or decreased when needed.
The structure of the resulting scheduler is shown in Figure 8.1.
FC-EDF uses a Proportional Integral Derivative (PID) controller to compute the variation
$\Delta U$ to apply to the system load based on the observed deadline miss ratio $M$:
Figure 8.1 Structure of the FC-EDF closed-loop scheduler.
Such a PID controller is used to increase the utilization when the system is not overloaded
for a certain period of time. In this way, it is possible to increase system
efficiency by accepting a higher number of tasks, or running them at the highest possible
quality level. As an alternative [LSTS99], the reference value for the deadline miss
ratio can be set to $M_0 > 0$, so that the system utilization is automatically increased
when $M(t)$ reaches 0. Using this setting, a simple PID (without any modification)
can be successfully used without underutilizing the system.
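A feedback law of this kind can be sketched in a few lines. The Python class below is a textbook discrete PID and not necessarily the exact controller used by FC-EDF; the class name `MissRatioPID`, the gains and the reference value are assumptions chosen only for illustration. The error is computed as the reference miss ratio minus the measured one, so that a miss ratio above the reference produces a negative load variation.

```python
class MissRatioPID:
    """Generic discrete PID producing a load variation from the miss ratio."""

    def __init__(self, kp, ki, kd, m_ref=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.m_ref = m_ref          # reference deadline miss ratio
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, miss_ratio):
        """Return the variation to apply to the system load."""
        error = self.m_ref - miss_ratio
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def deadline_miss_ratio(missed, total):
    """Missed deadlines over total deadlines in one observation window."""
    return missed / total if total > 0 else 0.0
```

With a strictly positive reference such as `m_ref = 0.02`, a window without misses yields a positive variation (the load may grow), while a window with many misses yields a negative one.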
The resulting FC-EDF scheduler can be modeled as shown in Figure 8.2; if the EDF
scheduler is modeled as a tank (as proposed by Stankovic and others [SLS99, LSTS99]),
it is possible to design the PID controller to properly stabilize the system. This can be
done by using control theory, which already provides tools (such as the Z transform) to
analyze the behavior of closed-loop systems.
However, modeling the EDF scheduler as a simple tank system is an oversimplification
that can lead to system instability [LSA+00]. Such an instability is visible when the
input workload is constant: in this case, FC-EDF is able to control the deadline miss
ratio to 0 in a short time, but after this transient the system continues to experience
periodic deadline misses (using control terminology, this is a limit cycle). This happens
because FC-EDF only monitors the overload (through the deadline miss ratio), and
cannot monitor underload situations. The result is a control saturation (the deadline
miss ratio cannot be less than 0): for example, the controller cannot make any distinction
between a situation with $M = 0$, $U = 0.9$ and a situation with $M = 0$, $U = 0.1$. To
avoid system underutilization, FC-EDF always tries to increase the system load when
$M = 0$, but this causes the limit cycle.
The instability problem can be solved by monitoring both the deadline miss ratio
and the system utilization [LSA+00]: the FC-EDF² scheduler uses two different PID
controllers: the first one computes $\Delta U$ based on the deadline miss ratio, whereas the
second one computes $\Delta U$ based on the utilization. The control signal is then selected
by choosing the minimum between the outputs of the two controllers. In this way,
FC-EDF² is able to achieve a stable deadline miss ratio in all workload conditions.
If a CPU reservation is used to schedule a task $\tau_i$, then the amount of CPU time $Q_i^s$
(or the amount of CPU bandwidth $U_i^s$) reserved to $\tau_i$ can be used as an actuator, and
the number of reservation periods needed to serve each job can be used as an observed
value. For example, the reservation and feedback approaches can be combined to adjust
tasks' periods according to the actual CPU load [Nak98c], or the CPU proportion of
each process can be adapted to control the task's performance. As an alternative, if the
processes served by a reservation-based scheduler are organized in a pipeline, then the
adaptation mechanism can control the length of the queues between the pipeline's stages
[SGG+99].
Adaptive Reservations are an interesting abstraction that allows separating task param-
eters from scheduling parameters [AB99a]. In fact, the traditional task models used in
the real-time literature are useful to directly map each task to proper scheduling param-
eters, but have the disadvantage of exporting some low-level details of the scheduling
algorithm. Since users are not generally interested in such details and often do not
know all task parameters, in many cases the $(C, T)$ model is very different from the
real needs, hence programmers are forced to assign low-level parameters according to
complex mapping functions.
These problems can be addressed by introducing high-level task models which provide
an interface closer to real user's requirements. In particular, such high-level task models
eliminate the need for an a-priori knowledge of the worst-case execution time. Each
task $\tau_i$ can be characterized by a weight $w_i$, representing its importance with respect
to the others, and by some temporal constraints, such as a period or a desired service
latency.
If the system is overloaded, and the CPU bandwidth is not sufficient to fulfill each
task's requirement, a bandwidth compression algorithm has to be introduced to correct
the fraction of CPU bandwidth assigned to each task using the task weights $w_i$ (tasks
with higher weights will receive a bandwidth closer to the requested one).
The advantage of such a model is that it separates task temporal constraints (the period
$T_i$, or the rate $R_i = 1/T_i$) from task importance, expressed by the weight $w_i$. In fact,
one of the major problems of classical real-time scheduling algorithms (such as RM or
EDF) is that the task importance is implicitly proportional to the inverse of
the task period.
Since the underlying priority assignment is based on EDF, if the server is schedulable,
each instance $\tau_{i,j}$ is guaranteed to finish within the last assigned server deadline $d^s_{i,j}$.
Hence, the CBS scheduling error $\epsilon_{i,j}$ represents the difference between the deadline
$d^s_{i,j}$ that $\tau_{i,j}$ is guaranteed to respect and the deadline $d_{i,j} = r_{i,j} + T_i$ that it should
respect. A value $\epsilon_{i,j} = 0$ means that job $\tau_{i,j}$ met its soft deadline, whereas a value
$\epsilon_{i,j} > 0$ means that job $\tau_{i,j}$ completed after its (soft) deadline, because the reserved
bandwidth $U_i^s = Q_i^s / P_i^s$ was not enough to properly serve it. Hence, the objective of
the system is to control the scheduling error to 0: if this value increases, $Q_i^s$ has to be
increased accordingly, otherwise it can be left unchanged.
If $\sum_i U_i^s > U_{lub}$ (where $U_{lub}$ is the utilization least upper bound of the scheduling
algorithm), then the reserved bandwidths must be rescaled using a compression
mechanism, to keep the system schedulable. To better understand the compression
mechanism, some additional definitions are needed:
L;" ifC,L;"<Club
otherwise
being $s_i$ the scaling factor. Since the compression must be done according to the tasks'
weights, $s_i$ must be proportional to $w_i$: $s_i = w_i M$. For the sake of generality, the sum
of the reserved bandwidths is set to $U^{max} \le U_{lub}$, hence imposing $\sum_i U_i^{s\prime} = U^{max}$
we have:
¹ Each feedback controller is not aware of all the other reserved tasks in the system.
[Figure: block diagram of the closed loop, with blocks Reference, Compression Algorithm,
Reserved Bandwidth and Scheduling System, and the CBS Scheduling Error fed back]
Hence,
This simple method can be slightly modified to guarantee a minimum bandwidth $U_i^{min}$
to each task.
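As an illustration of weight-driven compression, the Python sketch below rescales the requested bandwidths only when their sum exceeds the least upper bound. It is one possible reading, not the book's exact formula: it assumes that each compressed bandwidth is the requested one multiplied by a scaling factor proportional to the task weight, with the proportionality constant chosen so that the compressed bandwidths sum to the target value. The function `compress_bandwidths` and its parameter names are hypothetical, and this simple version does not cap a task at its own request nor enforce a minimum bandwidth.

```python
def compress_bandwidths(requested, weights, u_lub=1.0, u_max=None):
    """Rescale requested bandwidths according to task weights.

    requested: bandwidths asked for by the tasks
    weights:   task importance values (higher means more important)
    u_lub:     utilization least upper bound of the scheduler
    u_max:     target sum after compression (defaults to u_lub)
    """
    if u_max is None:
        u_max = u_lub
    if sum(requested) <= u_lub:
        return list(requested)            # no overload: nothing to compress
    # Assumed rule: s_i = w_i * m, with m such that sum_i s_i * U_i = u_max.
    m = u_max / sum(w * u for w, u in zip(weights, requested))
    return [w * m * u for w, u in zip(weights, requested)]
```

For example, with requests `[0.6, 0.6]`, weights `[2, 1]` and `u_max = 0.9`, the more important task keeps 0.6 while the other one is reduced to 0.3.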
The closed loop control used to adjust the reserved bandwidth is shown in Figure 8.3.
When implementing an Adaptive Reservation abstraction, it is important to design the
feedback function so that the resulting adaptive scheduler is able to assign the correct
amount of resources to each task (when possible) in a short time and with an acceptable
accuracy. Since control theory has already been proven to be a valid tool for designing
feedback schedulers (see FC-EDF [SLS99, LSTS99, LSA+OO]), it is interesting to
use it for evaluating the performance of an adaptive reservation. Using control theory
terminology, the closed loop system must be stable, and the response time, overshoot,
and steady-state error must be compliant with some specifications.
Since a proper feedback scheme providing the required characteristics can be designed
only based on an accurate model of the system, a precise model of a reservation sched-
uler has been developed [APLW02]. Such a model is highly non-linear, and contains
some quantization effects (given by the presence of a ceiling operator in the model),
hence it is very difficult to control. However, by applying some approximations it
is possible to linearize the model, and to design a feedback controller that is able to
stabilize the closed-loop system. As for FC-EDF², the resulting controller is based on
a switching dynamic (two different controllers are designed, and the one to be used
is dynamically chosen based on the value of the controlled variable $\epsilon$). The classical
"pole-placement" technique can be used to synthesize the two controllers; in this way it
is possible to comply with requirements on the closed loop dynamics (i.e., the evolution
of the scheduler under the action of a feedback controller).
Figure 8.4 Dynamic system representing a linearized reservation with a feedback
mechanism.
The simplicity of the system (whose dynamic equations are similar to those of a tank)
suggested the use of a Proportional Integral (PI) controller (this is a difference with
respect to FC-EDF², which needs a PID). A PI controller is described by:
where $C_P$ and $C_I$ are the coefficients of the proportional and integral actions,
respectively. By manipulating the previous equation, a PI can also be described as:
wherea = C p and 3 = CI C p .
According to control theory, the closed-loop system is stable if the poles $z_i$ of the
closed-loop system (i.e., the zeros of the denominator of the closed-loop transfer
function) have norm strictly lower than 1: $|z_i| < 1$. Moreover, the decay rate $\rho$ (i.e.,
the "speed" with which the closed-loop system returns to a stable state) is given by
the maximum norm of the poles. Observe that the use of the PI controller enables the
choice of the two closed-loop poles.
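The stability condition can be checked numerically on a toy model. The Python sketch below does not use the linearized reservation model of [APLW02]; it assumes a plain tank-like plant $y(k+1) = y(k) + g\,u(k)$ driven by a PI law $u(k) = C_P e(k) + C_I \sum_{j \le k} e(j)$, for which the closed-loop characteristic polynomial is $z^2 + (g(C_P + C_I) - 2)z + (1 - gC_P)$. The function names and gain values are illustrative assumptions.

```python
import cmath


def closed_loop_poles(cp, ci, g=1.0):
    """Roots of z^2 + (g*(cp + ci) - 2)*z + (1 - g*cp) = 0."""
    b = g * (cp + ci) - 2.0
    c = 1.0 - g * cp
    d = cmath.sqrt(b * b - 4.0 * c)
    return ((-b + d) / 2.0, (-b - d) / 2.0)


def decay_rate(cp, ci, g=1.0):
    """Maximum norm of the closed-loop poles (stable if below 1)."""
    return max(abs(z) for z in closed_loop_poles(cp, ci, g))


def simulate(cp, ci, g=1.0, reference=1.0, steps=100):
    """Run the assumed tank plant under the PI law; return the final output."""
    y, integral = 0.0, 0.0
    for _ in range(steps):
        e = reference - y
        integral += e
        u = cp * e + ci * integral
        y += g * u
    return y
```

With `cp = 0.5` and `ci = 0.25` both poles have norm of about 0.71, and the simulated output settles on the reference; with `cp = 3` and `ci = 2` one pole lies outside the unit circle and the loop is unstable.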
Figure 8.5 Dynamic system representing a linearized CBS with a feedback mechanism.
Finally, note that the linearization performed to use the Z transform introduced a
quantization error due to the approximation of a ceiling. This quantization error can
be taken into account (as shown in [APLW02]), and its effects can be bounded. A
model of the system accounting for the quantization error is shown in Figure 8.5. It
has been proven that the quantization error has no effect on $\epsilon$, but only causes an
overestimation in the reserved bandwidth. If $P^s < T$, the maximum overestimation is
such that
$$\frac{C}{T} \le U^s \le \frac{C}{T - P^s}.$$
The first strategy is the one used by Adaptive Reservations, where the scheduler, or
a QoS manager, adapts the reserved bandwidths based on tasks' specifications. The
second strategy seems to be similar to the one used by FC-EDF, in which a service level
controller can switch the service level of each task. However, it presents a fundamental
difference, because each application explicitly scales down its own QoS (and consequently
its resource requests) to remove the overload condition and make the system
schedulable. Since the centralized scheduler is unaware of such a QoS adaptation, this
second mechanism is referred to as application-level adaptation. Each application has
the responsibility to cope with its own overloads and can scale down its QoS in different
ways, because it is the only entity that knows how to perform such a QoS adaptation,
without any help from the scheduler.
Several approaches for performing such an application-level adaptation have been pro-
posed in the literature and are well known in the multimedia community, ranging from
enlarging task periods to skipping some task instances. For example, DQM [BNBM98]
is a feedback-based QoS manager which does not require any support from the operating
system. DQM is a middleware solution aimed at supporting soft real-time applications
in a conventional OS (Linux). Application execution levels are changed based on re-
source usage, by monitoring the benefit directly experienced by the application, and
based on the system load estimated by the middleware itself. A similar approach is
adopted by FARA [RSY98], where a resource allocator monitors the resource usage
and coordinates the adaptation. This solution addresses the problem of integrating
QoS adaptation with real-time techniques; however, it relies on a-priori knowledge
about the resources required by each application in each operating mode. The Q-RAM
model, presented in Section 7.1, can also be used for performing QoS adaptation at the
application level.
In other solutions [Apa98], each application QoS can be scaled by a global QoS manager
in order to better respond to the user needs. The adaptation is based on specific modes
of operation provided by each application, but it is still performed on a global basis.
Note that application-level adaptation is mainly useful when the system is permanently
overloaded. For example, if the sum of all CPU utilizations requested by the
applications is less than the maximum available bandwidth (1 for CBS/EDF), then
the adaptive reservation mechanism is able to find a feasible bandwidth assignment
$U = (U_1^s, \ldots, U_n^s)$ such that each task will receive enough CPU time. In this case,
application-level adaptation is not needed. On the other hand, if the sum of CPU
utilizations requested by the applications is continuously greater than the available CPU
bandwidth, then the least important tasks (i.e., the tasks having smaller weights $w_i$) can
suffer from local overloads. Indeed, the goal of the global adaptive reservation mechanism
is to isolate task overloads in the least important tasks, independently of their
requirements and periods². In this case, an overloaded task can use application-level
adaptation to scale down its requirements and resolve the overload condition, reaching
a lower QoS level in a controlled fashion; otherwise the QoS degradation would be
unpredictable.
[Figure: block diagram with blocks QoS Mapping, Feedback Function, Compression Algorithm
and Scheduling System; signals Reference, Requested Bandwidth, Reserved Bandwidth and
CBS Scheduling Error]
• the scheduler adaptation, realized by an active entity having a global system
visibility, such as a QoS manager or the scheduler itself;
• the application-level QoS adaptation, performed by each single application.
To prevent such a behavior, the application-level adaptation has to act slower than the
scheduler adaptation, so that QoS is changed only when the overload condition is long
(in most cases, the QoS is not scaled in response to transient overloads).
Once an on-line estimation of task execution times is available, the estimated value $c_i$
can be used as an observed value, and some kind of workload adaptation mechanism
can be used as an actuator. For example, the elastic model presented in Section 2.7.1
can be combined with the on-line execution time estimation to dynamically adjust the
tasks' periods [BA02a].
When the system workload estimated through such an explicit monitoring is found to
be greater than a predefined threshold, the adaptation mechanism can be used to find
a feasible tasks' configuration. This approach can be combined with CPU reservations
[FNT95, Nak98c] to enforce the maximum execution time to each task. With this
technique, the amount of time reserved to each task in each period must still be defined
based on some off-line estimation, whereas the period is dynamically adapted based on
the actual execution time. If the reserved budget is too small, the task will experience
large overruns that will cause the algorithm to increase its period too much. On the
other hand, if the estimation is too big, the periods are not optimized, the reserved
budget is never used completely, and the system is underutilized.
[Figure: block diagram with the desired load $U_d$, the Elastic Compression block producing
the periods $T_i$, the Kernel, and the Estimator blocks closing the loop]
If resource reservation is not used, the elastic approach (see Section 2.7.1) provides
a powerful and flexible methodology for adapting tasks' rates to different working
conditions. However, it strongly relies on the knowledge of the worst-case execution
times (WCETs). When WCETs are not precisely estimated, the elastic compression
algorithm will lead to wrong period assignment. In particular, if WCETs are under-
estimated the compressed tasks may start missing deadlines, whereas, if WCETs are
overestimated, the algorithm will cause a waste of resources, as well as a performance
degradation.
To overcome this problem, on-line estimates of tasks' execution times can be used as
feedback for achieving workload adaptation. Such estimates are derived by a runtime
monitoring mechanism embedded in the kernel. When a task starts its execution, it
is created at its minimum rate, and, at the end of each period, a runtime monitoring
mechanism updates the mean execution time $\hat{c}_i$ and the maximum execution time $\hat{C}_i$.
Figure 8.7 shows the architecture used to perform rate adaptation. The two values $\hat{C}_i$
and $\hat{c}_i$ derived by the monitoring mechanism are used to compute an execution time
estimate $Q_i$, used by the load estimator to compute the actual workload $U_a = \sum_i Q_i / T_i$.
Such a value is then used by the elastic algorithm (periodically invoked with a period
$P$) to adapt tasks' rates. Thus, the objective of the global control loop is to maintain
the estimated actual load $U_a$ as close as possible to a desired value $U_d$.
The advantage of using the elastic compression algorithm is that rate variations can
be controlled individually for each task by means of elastic coefficients, whose values
can be set to be inversely proportional to tasks' importance. Using this approach, the
application is automatically adapted to the actual computational power of the hardware
platform. The effectiveness of the adaptation depends on whether tasks' utilizations are
computed based on worst-case ($C_i$) or average-case ($\hat{c}_i$) estimates. If the $C_i$ estimate
is used to compute tasks' utilizations for the elastic algorithm, tasks are assigned larger
periods and the number of deadline misses quickly reduces to zero. However, this
solution can cause a waste of resources, since tasks seldom experience their worst case
simultaneously.
To prevent the number of deadline misses per time unit from increasing indefinitely, the
execution time estimate used to perform the elastic compression must be greater than
the mean execution time, so a value between $\hat{c}_i$ and $C_i$ is typically acceptable. Hence,
the elastic compression algorithm is invoked using a value
$Q_i = \hat{c}_i + k (C_i - \hat{c}_i)$,
where $k \in [0, 1]$ is a guarantee factor.
It is worth noting that, if k = 1, the elastic algorithm is based on WCET estimates,
so only a few deadlines can be missed, namely when the estimated WCET $C_i$ is smaller
than the real one. In general, if no information about execution times is provided, the
first $C_i$ values will be underestimated, causing some missed deadlines during the
task startup time.
A smaller value of k allows increasing the actual system utilization at the cost of an
increased number of possible deadline misses (remember that a deadline is missed
when many tasks require a long execution at the same time). A value of k = 0
allows maximum efficiency, but is the limit under which the system overload becomes
permanent.
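The role of the guarantee factor can be illustrated with a minimal Python sketch. It assumes that the estimate interpolates linearly between the monitored mean and the monitored maximum (so that k = 1 gives the maximum and k = 0 the mean), and that the mean is a plain cumulative average seeded with an initial guess; the names `ExecMonitor`, `estimate` and `estimated_load` are hypothetical and do not come from any kernel.

```python
from dataclasses import dataclass


@dataclass
class ExecMonitor:
    """Running statistics for one task (illustrative, not a kernel API)."""
    mean: float      # mean execution time, seeded with an initial guess
    maximum: float   # largest execution time observed so far
    samples: int = 0

    def update(self, exec_time: float) -> None:
        # Cumulative mean; the seed counts as one sample (an assumption).
        self.samples += 1
        self.mean += (exec_time - self.mean) / (self.samples + 1)
        self.maximum = max(self.maximum, exec_time)

    def estimate(self, k: float) -> float:
        # Linear interpolation between mean (k = 0) and maximum (k = 1).
        return self.mean + k * (self.maximum - self.mean)


def estimated_load(monitors, periods, k):
    """Sum of estimate/period over all tasks: the load fed to the compression."""
    return sum(m.estimate(k) / t for m, t in zip(monitors, periods))


monitor = ExecMonitor(mean=2.0, maximum=2.0)
monitor.update(4.0)                      # mean becomes 3.0, maximum 4.0
load = estimated_load([monitor], [10.0], k=0.5)
```

With k = 0.5 the estimate lies halfway between the mean and the maximum, which trades some deadline misses for a higher utilization.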
The estimation method described above allows using the feedback mechanism either
when no a-priori information about execution times is provided, or when an approx-
imated estimation of the mean or maximum execution time is known. In practice,
the mean execution time estimation is computed iteratively (that is, $\hat{c}_i$ is periodically
updated based on the last execution times experienced by the task) starting from an
initial value $c_0$. If nothing is known about the task parameters, an arbitrary value can
be assumed for $c_0$; if, on the other hand, an approximate estimation of the execution
time is known in advance, it can be used as $c_0$, reducing the initial transient during
which $\hat{c}_i$ converges to a reasonable estimation and increasing the speed at which the
periods converge to a stable value.
As a final observation, it is worth noting that this approach can be successfully applied to
task sets characterized by variable execution times, allowing periods to vary according
to execution times variations. If, instead, the proposed mechanism is applied to tasks
characterized by fixed execution times, it allows adapting task periods to the unknown
execution times without any deadline miss. In this case, the mean execution time
is equal to the WCET, but if the starting estimation $c_0$ is different from the actual
value, the mean execution time estimation needs some time to converge to the correct
value. In this transient, if the guarantee factor is less than 1, there could be missed
deadlines. Hence, in the case of fixed execution times, a guarantee factor k = 1 is
more appropriate.
If some additional information about the application is provided, the explicit workload
estimation can be combined with an "ad hoc" adaptation mechanism, instead of using
a generic mechanism such as the elastic model. For example, if the real-time tasks
implement a control algorithm, a feedback scheduling architecture for control tasks
can be designed to optimize some control performance metric [Cer03, ACr02, CE00].
Such a feedback scheduler attempts to keep the CPU at a high utilization level while
avoiding overload and distributing the available computing resources among the control
tasks. It is composed of 3 modules: a workload estimator, a resource allocator, and
a proactive action. The resource allocator assigns periods to tasks so that the total
estimated utilization is controlled to a set point $U_d$ (typically less than $U_{lub}$), and a cost
function (see Section 7.3) is minimized. The proactive action is based on the fact that
in the proposed model (see [ACr02]) each controller can work in different modes, and
each mode is characterized by a different profile of execution times. The controller
switches between different modes depending on the controlled system state, and can
signal the feedback scheduler in advance when a mode switch is going to happen.
Such feedforward information can be used by the feedback scheduler to implement
the proactive action, reacting in advance with respect to changes in the execution times.
As an example, the feedback scheduler can be implemented as a periodic task (similarly
to [BA02a]) that periodically reads the monitored mean execution times $\hat{c}_i$ and
computes the estimated utilization $U = \sum_i \hat{c}_i / T_i$, where $T_i$ is the period assigned to
task $\tau_i$ (i.e., the sampling period of the $i$-th controller). Each mode of each control
task $\tau_i$ is characterized by a nominal period $T_i^{nom}$, and the feedback scheduler tries to
assign periods to the control tasks starting from the nominal periods: in other words,
if $\sum_i \hat{c}_i / T_i^{nom} \le U_d$, then all tasks are allowed to execute at their nominal periods. If,
on the other hand, $\sum_i \hat{c}_i / T_i^{nom} > U_d$, then the feedback scheduler uses its resource
allocator to assign new periods $T_i$ to tasks so that $U = U_d$. As said, the optimal periods
$T_i$ can be derived by using control theoretical arguments to minimize a cost function
$J(T_i)$ associated with the $i$-th controller. The PLI presented in Section 7.3 can be used
as a cost function, so that standard techniques can be adopted to optimize the tasks'
periods [SLSS97]. However, some simulations show that a simple linear rescaling of
the tasks' periods is able to achieve good results [Cer03]; this is due to the fact that
under certain assumptions the linear rescaling can be proven to be optimal with respect
to the overall control performance.
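The linear rescaling mentioned above can be sketched in a few lines of Python. The sketch assumes that "linear rescaling" means stretching all nominal periods by one common factor, chosen so that the estimated utilization equals the set point; the function name `rescale_periods` and its parameters are illustrative only.

```python
def rescale_periods(mean_exec, nominal_periods, u_set_point):
    """Return periods whose estimated utilization does not exceed the set point.

    mean_exec[i] is the monitored mean execution time of task i and
    nominal_periods[i] its nominal period (hypothetical inputs).
    """
    u_nominal = sum(c / t for c, t in zip(mean_exec, nominal_periods))
    if u_nominal <= u_set_point:
        # No overload: every task runs at its nominal period.
        return list(nominal_periods)
    # Overload: stretch all periods by the same factor.
    scale = u_nominal / u_set_point
    return [t * scale for t in nominal_periods]


periods = rescale_periods([2.0, 3.0], [10.0, 10.0], u_set_point=0.25)
utilization = sum(c / t for c, t in zip([2.0, 3.0], periods))
```

In this example the nominal utilization is 0.5, so both periods are doubled and the resulting utilization matches the set point of 0.25.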
Finally, the proactive action is used by the feedback scheduler in 3 different ways:
to keep track of the operating mode of each controller, in order to assign more
suitable sampling periods;
to run separate workload estimators for each mode of each task: since each task $\tau_i$
is characterized by a different execution time profile for each operation mode $m$,
maintaining different execution time estimations $\hat{c}_i^m$ can actually help improve
the accuracy of the estimation.
STOCHASTIC SCHEDULING
This section briefly recalls the most important definitions and concepts needed to un-
derstand the chapter, and introduces some of the most important mathematical tools
used to deal with probabilities.
Informally speaking, the final goal of a stochastic guarantee is to compute the probability
of missing a deadline. To perform a formal analysis of such a probability, we need to
define the concept of random variable and specify the probability of some events (such
as "job $\tau_{i,j}$ has execution time $c_{i,j}$") through the probability distribution function and
the cumulative distribution function.
Definition 9.2 If X is a random variable, the Cumulative Distribution Function (CDF)
of X is defined as $C_X(x) = P\{X < x\}$.
Definition 9.3 If X is a random variable defined in D, and $C_X(x)$ is its CDF, the
Probability Distribution Function (PDF) $F_X(x)$ of X is a function $D \to \mathbb{R}$ defined
as $F_X(x) = C_X(x + 1) - C_X(x)$, if X is a discrete random variable, or
$F_X(x) = dC_X(x)/dx$, if X is a continuous random variable.
For discrete random variables (i.e., if $D = \mathbb{N}$), the PDF $F_X(x)$ gives the probability
that the variable assumes a given value $x$: $F_X(x) = P\{X = x\}$. By definition,
if C is a discrete random variable and $C(c)$ is its PDF, the corresponding CDF evaluated
at $x$ can be computed as $\sum_{c=0}^{x-1} C(c)$. If the random variable is continuous, the sum must be
replaced with an integral.
Note that the execution time $c_{i,j}$ and the interarrival time $r_{i,j+1} - r_{i,j}$ of a job $\tau_{i,j}$
can be considered as two discrete random variables, defined in $\mathbb{N}$, and their PDFs and
CDFs are discrete functions $\mathbb{N} \to \mathbb{R}$.
To simplify the notation, the PDF and CDF of the random variable X will not be
denoted with $F_X()$ and $C_X()$, but simply with $X(x)$, specifying the meaning in the
text.
The definition for continuous functions is similar, using an integral instead of a sum.
The convolution is frequently used in the probabilistic analysis of scheduling algorithms
to compute the distribution of the sum of tasks' execution times, or similar quantities.
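As an illustration of how the convolution yields the distribution of a sum, the following Python sketch uses the standard discrete convolution of two PDFs stored as lists, where index `c` holds the probability of the value `c`. The list representation and the function name `convolve` are assumptions made for the example.

```python
def convolve(f, g):
    """Discrete convolution: PDF of the sum of two independent variables.

    f[i] = P{X = i} and g[j] = P{Y = j}; the result h satisfies
    h[s] = sum over i of f[i] * g[s - i] = P{X + Y = s}.
    """
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out


# Two jobs, each taking 0 or 1 time units with equal probability.
total = convolve([0.5, 0.5], [0.5, 0.5])
```

Here `total` is the PDF of the summed execution time of the two jobs, with most of the probability on the middle value.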
If the observed quantities evolve with time, the concept of random variable is not
sufficient to describe them, and we must introduce the concept of a stochastic process.
According to the previous definition, the sequence $c_{i,j}$ of job execution times of task $\tau_i$
is a stochastic process. However, note that the probability $P\{c_{i,j} = c\}$ of having a job
execution time equal to c does not depend on the job index j, hence the execution times
of a task can be described by a simple PDF $U_i(c)$. A stochastic process having such a
property is said to be a time invariant process. Another important class of processes is
represented by Markov processes:
Definition 9.5 A stochastic process $X_i$ is a Markov process if the value $X_i$ only depends
on $X_{i-1}$; that is, it does not depend on $X_{i-2}, X_{i-3}, \ldots, X_1$ or on time $i$.
Since discrete functions are simpler to work with, time instants are often considered as
integers, so that a task $\tau_i$ can be described by a pair of time invariant stochastic processes
$(U_i(c), V_i(t))$, with $U_i(c) = P\{c_{i,j} = c\}$ and $V_i(t) = P\{r_{i,j+1} - r_{i,j} = t\}$. The only
case in which execution times and arrival times are considered to be real numbers is in
the real-time queueing theory (RTQT) (see Section 9.3), because traditional queueing
theory is based on continuous time, and its real-time extension did not change this
assumption.
Most of the algorithms related to probabilistic analysis are based on a simplified task
model, in which the interarrival times are constant (i.e., $V_i(t) = 1$ if $t = T_i$, and
$V_i(t) = 0$ otherwise). This is the so-called semiperiodic task model [TDS+95], in which
a task $\tau_i$ is described by the tuple $(U_i(c), T_i)$. In other words, the semiperiodic task
model simply extends the traditional Liu & Layland periodic task model by replacing
the worst-case execution times with stochastically distributed execution times.
Finally, we need some way for indicating the "average value" of a random variable:
using a more rigorous formalism, this is called the expectation, or the expected value.
Figure 9.1 An example of time demand function for three tasks {$\tau_1$ = (100, 300), $\tau_2$ =
(100, 100), $\tau_3$ = (200, 600)} scheduled by RM.
in fixed priority systems, a job $\tau_{i,k}$ experiences its worst-case response time when it is
released simultaneously with all the jobs with higher priority. Such a release time is
called the critical instant. If job $\tau_{i,k}$ is released at the critical instant, $w_i(t)$ is defined
as the maximum amount of time demanded between $r_{i,k}$ and $t$ by $\tau_{i,k}$ and by all higher
priority jobs finishing before $f_{i,k}$.
As shown in Figure 9.1, the time demand function $w_i(t)$ is a step function, increasing
by $C_h$ every time a higher priority job $\tau_{h,l}$ is released. According to the TDA, if
$\exists t : r_{i,k} \le t \le d_{i,k}$ and $w_i(t) \le t - r_{i,k}$, then $\tau_i$ and all higher priority tasks will not
miss any deadline.
This approach can be extended by considering probabilistic distributions for the execu-
tion times instead of worst-case values. This is done, for example, in the Probabilistic
Time Demand Analysis (PTDA) [TDS+95] or in the Stochastic Time Demand Anal-
ysis (STDA) [GL99]. Both PTDA and STDA are based on the previously introduced
semiperiodic task model. The analysis is performed by considering a demanded time
distribution $W_i(t) = P\{w_i(t) \le t\}$ instead of the time demand function $w_i(t)$.
The STDA analysis [GL99] extends PTDA to the case of $D_i > T_i$, and fixes some
inaccuracies in the analysis (in fact, PTDA is not very accurate when the average
utilization approaches 1).
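The idea of a demanded time distribution can be sketched in Python for a job released at a critical instant (taken as time 0). The sketch is only in the spirit of PTDA and rests on several assumptions: integer time, independent execution times, PDFs stored as lists indexed by execution time, and a demand at time t made of the job itself plus all higher priority jobs released in [0, t). The function names are hypothetical.

```python
import math


def convolve(f, g):
    """PDF of the sum of two independent discrete random variables."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out


def demanded_time_cdf(task_pdf, higher_priority, t):
    """Probability that the time demanded up to t does not exceed t.

    higher_priority is a list of (execution_time_pdf, period) pairs.
    """
    demand = list(task_pdf)
    for pdf, period in higher_priority:
        # Each higher priority job released in [0, t) adds its execution time.
        for _ in range(math.ceil(t / period)):
            demand = convolve(demand, pdf)
    return sum(demand[: t + 1])


# Task under analysis: 2 or 3 time units with equal probability.
# One higher priority task: always 1 time unit, period 4.
prob_by_3 = demanded_time_cdf([0.0, 0.0, 0.5, 0.5], [([0.0, 1.0], 4)], 3)
prob_by_4 = demanded_time_cdf([0.0, 0.0, 0.5, 0.5], [([0.0, 1.0], 4)], 4)
```

Evaluating the function at several values of t up to the deadline, and taking the largest result, gives a lower bound on the probability that the job completes in time.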
The key point in STDA is the concept of level-$p_i$ busy interval: if tasks can miss their
deadlines (as happens when a stochastic guarantee is used), analyzing the first job
after the critical instant is not enough, because the successive jobs can be affected by
its behavior. A level-$p_i$ busy interval is an interval of time that begins when a job of $\tau_i$
or a higher priority job is released and, immediately prior to that instant, no job of those
tasks is ready for execution. The interval ends at the first time instant t at which all
jobs of $\tau_i$ and higher priority tasks released before t have completed. If a level-$p_i$ busy
interval begins at a critical instant (i.e., it begins with the release of a job of $\tau_i$ and of all
tasks $\tau_h : p_h > p_i$), it is called an in-phase level-$p_i$ busy interval.
According to the previous observation, the analysis cannot be stopped at the first job
$\tau_{i,k}$ after a critical instant, but must cover the whole in-phase level-$p_i$ busy interval
that follows the critical instant. No job after the end of the interval will be affected by
the previous history of the system, as already known in real-time theory [Leh90]. Since
the time demand function and the demanded time distribution are computed on more
than one job, an additional index must be added, so they will be denoted as $w_{i,k}(t)$ and
$W_{i,k}(t)$, respectively.
As the demanded time distribution $W_{i,k}(t)$ must be computed on the whole in-phase
level-$p_i$ busy interval, computing the length of such an interval is crucial for STDA.
Such a computation can be easily performed by considering that a job $\tau_{i,k}$ terminates
when $w_{i,k}(t) = t$, hence $W_{i,k}(t)$ is also the probability that job $\tau_{i,k}$ finishes within
a time t. In other words, the demanded time distribution coincides with the response
time distribution.
For the first job of a level-$p_i$ busy interval, $W_{i,1}(t)$ can be easily computed as in PTDA,
since
$W_{i,1}(t) = P\{w_{i,1}(t) \le t\}$.
The response time (demanded time) distribution for the successive jobs in a level-$p_i$
busy interval is computed by conditioning the probability to the previous workload:
Hence, $W_{i,k}$ is computed by convolving the execution time distribution of the task with
the distribution of the backlog obtained by conditioning. This iterative computation
must be repeated until the end of the busy interval. To take into account the effects
of higher priority tasks $\tau_h : p_h > p_i$, the time interval $(r_{i,k}, f_{i,k})$ is divided into
sub-intervals delimited by the releases of higher priority jobs $\tau_{h,l} : r_{h,l} \in (r_{i,k}, f_{i,k})$, and
the response time probability in each interval is conditioned to the workload in the
previous one. Finally, the probability that a job $\tau_{i,k}$ completes within its deadline is given
by $W_{i,k}(D_i)$.
If there is a single task per priority level, the length of a level-$p_i$ busy interval can be
computed by checking whether $\tau_{i,k}$ finishes before the release of $\tau_{i,k+1}$, i.e., whether
$f_{i,k} < r_{i,k+1}$. Therefore, the computation of the response time distribution $W_{i,k}(t)$
can be stopped when $P\{w_{i,k}(r_{i,k+1}) \le r_{i,k+1}\} = 1.0$.
The case in which the maximum system workload is greater than one can be analyzed
in a rigorous way only by applying a different approach that does not use the time
demand analysis [DGK+02]. Such an alternative approach is based on computing the
finishing time distribution based on the concept of P-level backlog. The P-level backlog
observed at time t is defined as the sum of the remaining execution times of all the jobs
having priorities higher than P that are not completed at time t.
To find a mathematical formulation of the problem that can be easily solved, the analysis
must be extended to a hyperperiod: if $B_k$ is the P-level backlog for the lowest priority
at the beginning of the $k$-th hyperperiod, then
Since task arrivals are periodic, the same arrival pattern is repeated in each hyperperiod,
hence $P\{B_k = y \mid B_{k-1} = x\}$ does not depend on k. In other words, the backlog
process is a Markov chain, and Equation 9.2 can be expressed as
all the probability values of the PDF that would be shifted to negative time values.
Therefore, since the pattern of job arrivals in the hyperperiod is known, the procedure
shown above can be used to compute the P-level backlog during all the hyperperiod,
and at the end of the hyperperiod, by computing the entries of $M$. Obviously, the result depends
on the scheduling algorithm (since job priorities depend on the scheduling algorithm).
Solutions have been proposed for computing $M$ both in the fixed priority and dynamic
priority (EDF) case [DGK+02].
If the maximum system load $\sum_i C_i / T_i$ is less than or equal to 1 and an RM scheduler
is used, the computation shown above is equivalent to STDA. If the average system
load $\sum_i E[U_i(c)] / T_i$ is less than or equal to 1, then the process $b_k = M b_{k-1}$ has a
stationary solution; that is, after a transient, $b_k$ converges to a probability distribution
$b : b = M b$. This solution can be found either by using an exact computation based on
some regularity in the matrix $M$, or by truncating $M$ and $b$ to a reasonable dimension
and by using some numerical technique to solve the resulting eigenvector problem.
See Section 9.5 for more details about the approximation process.
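A toy Python sketch can make the stationary backlog concrete. It assumes a single periodic task, so that the hyperperiod equals the period and the backlog at the start of each period is the previous backlog plus the execution time minus the period, with negative values accumulated at zero. The backlog vector is truncated to a fixed size and the fixed point is approached by plain iteration rather than by an eigenvector solver; both choices, and the name `stationary_backlog`, are assumptions of this sketch.

```python
def backlog_step(b, exec_pdf, period):
    """One application of the backlog transition to the distribution b."""
    nxt = [0.0] * len(b)
    for x, px in enumerate(b):
        if px == 0.0:
            continue
        for c, pc in enumerate(exec_pdf):
            # Values shifted below zero pile up at zero; values beyond the
            # truncation limit are clamped to the last state.
            y = min(max(x + c - period, 0), len(b) - 1)
            nxt[y] += px * pc
    return nxt


def stationary_backlog(exec_pdf, period, size=30, iterations=500):
    """Iterate the backlog process from an empty system until it settles."""
    b = [1.0] + [0.0] * (size - 1)
    for _ in range(iterations):
        b = backlog_step(b, exec_pdf, period)
    return b


# Execution time 1 with probability 0.75, 3 with probability 0.25; period 2.
# The average load is 0.75, although the maximum load is 1.5.
b = stationary_backlog([0.0, 0.75, 0.0, 0.25], period=2)
```

For this example the backlog behaves as a random walk reflected at zero, and the iteration settles on a geometric distribution with about two thirds of the probability at zero backlog.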
Starting from the P-level backlog, it is possible to compute the job-level backlog,
defined as the backlog due to jobs with priorities higher than or equal to the priority of
a specified job. Under a fixed priority scheme, the job-level backlog is equal to the P-
level backlog, where P is the priority of the specified job. Under EDF, the computation
is a little bit different, but it can still be performed. The job response time is the sum
of the job-level backlog of job $\tau_{i,j}$, of its execution time $c_{i,j}$, and of the execution times
$c_{k,h}$ of the jobs $\tau_{k,h} \in H$, where H is the set of jobs that may preempt $\tau_{i,j}$. Hence,
the PDF of the job response time can be obtained by convolving the job-level backlog
with the PDFs of the execution times of $\tau_{i,j}$ and of the jobs in H.
The computation of the response time PDF is performed in two stages: in the first
stage, the convolution $U'_i(c)$ between $U_i(c)$ and the PDF of the job-level backlog is
performed, and, in the second stage, the effects of preemptions from jobs $\tau_{k,h} \in H$ are
computed. Since a job $\tau_{k,h}$ cannot preempt $\tau_{i,j}$ before $r_{k,h}$, its effects are computed
by splitting $U'_i(c)$ into a head $U''_i(c)$ and a tail $U'''_i(c)$:
$U''_i(c) = U'_i(c)$ if $c \le r_{k,h}$
$U''_i(c) = 0$ if $c > r_{k,h}$
$U'''_i(c) = 0$ if $c \le r_{k,h}$
$U'''_i(c) = U'_i(c)$ if $c > r_{k,h}$
Then $U_k(c)$ is convolved with $U'''_i(c)$, and $U''_i(c)$ and the convolved tail are recomposed
in a single PDF.
Note that the algorithm described above may seem to be complex and inefficient;
however, to obtain the deadline miss probability, it is not necessary to compute the
whole response time PDF: the values for $c \le D_i$ are sufficient.
Finally, it is worth noting that the worst-case assumptions used by STDA (for example,
the simultaneous arrival of all the tasks) are not used in the backlog-based analysis.
As a consequence, the results of this kind of analysis are less pessimistic, and better
approximate the deadline miss probability, as shown by the authors [DGK+02]. However,
the cost of such increased accuracy is that a complete analysis must be performed over the
whole hyperperiod.
If the server is idle when a new client arrives, then the server immediately starts to
serve the client and will finish in a time distributed according to $U(c)$, otherwise the
client is inserted in a queue. When the server finishes serving a client, the next client
is extracted from the queue; if the queue is empty, the server becomes idle. The mean
number of clients in the queue is indicated by $w$, whereas $T_w$ indicates the mean time a
client spends in the queue. Similarly, $T_s$ indicates the mean time needed by the server
to serve a client, $q = w + 1$ indicates the mean number of clients in the system (server
+ queue), and $T_q$ indicates the mean time spent by a client in the system. The model
described above is illustrated in Figure 9.2.
Figure 9.2 Model of a queue.
The standard queueing theory provides tools for analyzing the statistical properties
of a system modeled as a queue, under the simplifying assumption that the queueing
discipline is FIFO [Kle75]. For example, Little's formula ensures that
Moreover, in the case of Poisson arrivals and exponential service times (the M/M/1
case),
Other interesting results can be found in standard queueing theory books [Kle75].
Note that the previous formulas are only valid if $\rho < 1$ (i.e., if the mean interarrival
time is greater than the mean service time). If $\rho > 1$, the queue will not reach a steady
state, and its size will increase towards $+\infty$, whereas $\rho = 1$ is the metastable situation,
in which nothing can be said about the queue state.
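As a numerical illustration, the following Python sketch evaluates two textbook results for an M/M/1 queue in steady state: the mean number of clients in the system, equal to the load divided by one minus the load, and the mean time in the system obtained from it through Little's formula. The function name `mm1_metrics` and the dictionary keys are hypothetical, and the rates are expressed in clients per time unit.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state load, mean clients in system and mean time in system."""
    rho = arrival_rate / service_rate
    if rho >= 1.0:
        # No steady state exists when the load reaches or exceeds 1.
        raise ValueError("steady state requires a load smaller than 1")
    clients_in_system = rho / (1.0 - rho)
    # Little's formula: mean clients = arrival rate * mean time in system.
    time_in_system = clients_in_system / arrival_rate
    return {
        "load": rho,
        "clients_in_system": clients_in_system,
        "time_in_system": time_in_system,
    }


metrics = mm1_metrics(arrival_rate=1.0, service_rate=2.0)
```

With one arrival and two services per time unit the load is 0.5, and both the mean number of clients and the mean time in the system grow without bound as the load approaches 1.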
Classical queueing theory is very useful for analyzing network systems or computer
systems where clients (packets or tasks) are served in a FIFO (or Round Robin) order,
but it is not suitable for analyzing real-time schedulers that use different (and more
complex) algorithms. For this reason, a real-time queueing theory was developed
by Lehoczky [Leh96] to cope with complex scheduling algorithms and task timing
constraints. Traditional queueing analysis is fairly simple, because a system (queue
+ server) can be described by a single state variable, such as the number of clients in the
queue. On the contrary, real-time queueing theory must distinguish the various clients
(tasks) to schedule them, and must characterize each task with its deadline. Hence, the
system state becomes a vector $(m, l_1, \ldots, l_m)$, where $m$ is the number of tasks in the
queue, and $l_i$ is the lead time of task $\tau_i$, defined as $d_i - t$, where $d_i$ is the deadline of
the current job of task $\tau_i$, and t is the current time.
Similar computations can be repeated for other scheduling algorithms (such as a proportional
share algorithm, or a fixed priority algorithm), making it possible to reconstruct the
evolution of the system state. Although the previous equations can be used to compute
a probability distribution of the system state (e.g., the queue length distribution, the
deadline miss probability, and so on), such a computation is very complex and cannot
be easily extended to other (non-M/M/1) queue models. To simplify the analysis,
Lehoczky proposed to consider the case of a scheduler under heavy traffic conditions.
The heavy traffic analysis of a queue makes it possible to compute a simple and insightful
approximation of the complex exact solution, under some simplifying assumptions. In
particular, when the traffic on a queueing system is high enough (i.e., $\rho$ is near enough to
1), the queue can be described using a simpler model that can be easily analyzed [Dai95].
This is similar to the approach taken by the central limit theorem, which approximates
the sum of a large number of independent random variables with a normal distribution.
The heavy traffic approximation is based on rescaling the time and the queue length
in the model, and applying the heavy traffic condition $\lambda_n = \lambda(1 - \gamma/\sqrt{n})$, $\mu_n = \lambda$
(thus, the load is $\rho = 1 - \gamma/\sqrt{n}$). This approach can be taken to analyze the Markov
process describing the real-time queue presented above. Although the formal analysis
is fairly complex, the final results are very simple: for example, under EDF, the mean
queue length turns out to be $q = \rho/(1 - \rho)$, and the PDF of the lead time turns out to be
$f(x) = \lambda(1 - G(x))/q$.
Note that the heavy traffic assumption may seem to be too restrictive, but it is generally
reasonable, because it covers the case that is interesting for stochastic real-time systems.
In fact, the deadline miss probability becomes significant when the system is near to
the full utilization, and hence when it is under heavy traffic.
Finally, the heavy traffic analysis of a real-time queueing system can also be applied to
fixed priority schedulers (obtaining an analysis of generalized RM or DM systems), or
to hard real-time schedulers, in which clients with a negative lead time are automatically
removed from the queue. This method has also been extended to networks of real-time
queues [Leh97].
The periodic subtask $\tau_i^P$ has period $T_i$, a well known WCET $C_i^P$, and a relative deadline
$D_i^P = \alpha_i D_i$, with $\alpha_i \in (0, 1)$. Note that $C_i^P$ and $\alpha_i$ can be chosen during the design
phase so that the set $\Gamma^P = \{\tau_i^P\}$ of the periodic subtasks is schedulable.
The sporadic subtask $\tau_i^S$ is used to serve the jobs $\tau_{i,j}$ having $c_{i,j} > C_i^P$, which cannot
be completely served by the periodic subtask. Hence, a job $\tau_{i,j}^S$ of the sporadic task
is released only if $c_{i,j} > C_i^P$, has execution time $c_{i,j}^S = c_{i,j} - C_i^P$, and arrives at
time $r_{i,j}^S = f_{i,j}^P$, when the corresponding periodic job finishes. The sporadic subtask
execution times are distributed according to the following PDF:
$U_i^S(c) = U_i(c + C_i^P) / A_i$ for $c > 0$,
where $A_i$ is the probability $P\{c_{i,j}^S > 0\}$ of arrival of a sporadic job. Note that $U_i(c +
C_i^P)$ must be divided by $A_i$ to ensure that $\sum_c U_i^S(c) = 1$.
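The transformation can be sketched in Python as follows. The sketch assumes integer execution times, with the original PDF stored as a list indexed by execution time, and it returns both the probability that a sporadic job is released and the normalized PDF of the sporadic execution times; the function name `split_execution_pdf` is hypothetical.

```python
def split_execution_pdf(exec_pdf, periodic_budget):
    """Derive the sporadic subtask PDF from the original execution time PDF.

    exec_pdf[c] is the probability of an execution time c, and
    periodic_budget is the WCET assigned to the periodic subtask.
    Returns (arrival_probability, sporadic_pdf), where sporadic_pdf[c]
    is the probability that the sporadic job needs c time units.
    """
    excess = exec_pdf[periodic_budget + 1:]
    arrival_probability = sum(excess)
    if arrival_probability == 0.0:
        # The periodic subtask always suffices: no sporadic job is released.
        return 0.0, []
    # Shift by the periodic budget and normalize so the PDF sums to 1.
    sporadic_pdf = [0.0] + [p / arrival_probability for p in excess]
    return arrival_probability, sporadic_pdf


# Execution time 1 with probability 0.5, 2 or 3 with probability 0.25 each;
# the periodic subtask covers the first time unit.
arrival, sporadic = split_execution_pdf([0.0, 0.5, 0.25, 0.25], 1)
```

In the example half of the jobs overrun the periodic budget, and the overrun is 1 or 2 time units with equal probability.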
The sporadic subtasks can be served by using an aperiodic server, such as the Sporadic
Server [SSL89] (but other service mechanisms can be used as well), or by using a Slack
Stealer [LRT92, TL92]. In the original paper, the authors presented a probabilistic
guarantee based on RM + Sporadic Server. Using the Sporadic Server, tasks with
similar periods are "clustered" together; that is, $\Gamma$ is partitioned into clusters $\Gamma_k$ such
that all the sporadic subtasks belonging to a cluster are served by a server with a period
$P_k^S : \forall \tau_i \in \Gamma_k, P_k^S \le D_i$; that is, $P_k^S$ is less than or equal to the minimum relative
deadline in the cluster. Moreover, the server budget is set to $C_k^S = C_k / K$, where
$C_k = \max_{\tau_i \in \Gamma_k} \{C_i^S\}$ and $K = \min_{\tau_i \in \Gamma_k} \{\lfloor D_i / P_k^S \rfloor\}$. In this way, if a sporadic subtask
arrives when the server is idle, then it is served in less than $K P_k^S \le \min_{\tau_i \in \Gamma_k} \{D_i\}$,
and it will respect its deadline.
The actual deadline miss probability can be computed by combining (through a convolution)
the probability that a sporadic job $\tau_{i,j}^S$ requires $h$ server periods to complete
with the probability that the backlog in the server queue when $\tau_{i,j}^S$ arrives requires $x$
server periods to be consumed. If $H_k(h)$ is the PDF of the number of server periods
required to serve a sporadic job, and $X_k(x)$ is the PDF of the server backlog found by
a sporadic job when it arrives, then
$P\{\tau_{i,j}^S \in \Gamma_k \text{ misses a deadline}\} = \sum_{h=1}^{K} H_k(h) \sum_{x=K-h+1}^{\infty} X_k(x)$. (9.3)
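Equation 9.3 can be evaluated numerically with a short Python sketch. It assumes that both distributions are available as finite lists (so the infinite sum is truncated at the end of the backlog list), indexed by the number of server periods; the function name `sporadic_miss_probability` and the sample values are illustrative.

```python
def sporadic_miss_probability(periods_pdf, backlog_pdf, k):
    """Double sum of Equation 9.3 over truncated distributions.

    periods_pdf[h] is the probability that a sporadic job needs h server
    periods, backlog_pdf[x] the probability that the backlog found on
    arrival needs x server periods, and k the number of server periods
    that fit before the deadline.
    """
    total = 0.0
    for h in range(1, k + 1):
        p_h = periods_pdf[h] if h < len(periods_pdf) else 0.0
        # The job misses its deadline when backlog plus h exceeds k.
        tail = sum(backlog_pdf[k - h + 1:])
        total += p_h * tail
    return total


# Jobs need 1 or 2 server periods with equal probability; the backlog is
# 0, 1 or 2 periods with probabilities 0.5, 0.25 and 0.25; k = 2.
miss = sporadic_miss_probability([0.0, 0.5, 0.5], [0.5, 0.25, 0.25], 2)
```

A job needing one period misses only with a backlog of two periods, whereas a job needing two periods misses with any nonzero backlog, which gives 0.375 overall in this example.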
The distribution $H_k(h)$ of the number of server periods needed to serve a job can be
computed based on the PDFs of the execution times of the tasks belonging to the cluster
$\Gamma_k$, whereas the Z transform of $X_k(x)$ is given by
where $E[\delta]$ is the expected value of the number of server periods needed to serve all the
requests arriving in a server period. Clearly, if $E[\delta] > 1$, the server queue will explode,
because the amount of time to be served in a server period is bigger than the amount of
time the server can serve in a period. Hence, if $E[\delta] \ge 1$, a stationary distribution $X_k(x)$
cannot be found; note that $E[\delta]$ is similar to the load $\rho$ of a queue. More details can be
found in the original paper [TDS+95].
An alternative approach is to serve the periodic subtasks with an EDF scheduler, and
to use a Slack Stealer for serving the sporadic subtasks. The advantage of this
second methodology is that task response times are expected to improve. However,
no stochastic analysis of the EDF + Slack Stealer case has been performed, because of
the difficulty of modeling the Slack Stealer behavior.
The two task transformation approaches presented above have been compared through
a set of simulations [TDS+95], showing that EDF + Slack Stealer provides better
response times than RM + Sporadic Server. However, RM + Sporadic Server gives
more control on the deadline miss probability: there are tasks that when scheduled
by RM + Sporadic Server have a larger average response time, but a lower deadline
miss probability (that is to say, the response time probability is concentrated on values
smaller than the relative deadline).
A different approach has been proposed by the Statistical Rate Monotonic Scheduling (SRMS)
algorithm [AB98c], which provides a firm guarantee by accepting or rejecting each job
on its arrival: if a job is accepted, it is guaranteed to respect its deadline, otherwise
it does not even start executing. Moreover, a per task guarantee is performed on task
creation, to guarantee a deadline miss probability for the task.
SRMS is based on a variation of the semiperiodic task model, in which each task $\tau_i$
is described by three parameters $(U_i(c), T_i, \delta_i)$, where $\delta_i = P\{f_{i,j} > d_{i,j}\}$ is the
deadline miss probability requested by the task (in the original paper, it is called task's
requested QoS). On task arrival, the system runs an admission test, checking whether
a deadline miss probability equal to $\delta_i$ can be guaranteed to $\tau_i$. If the test does not
fail, the task is accepted. Once $\tau_i$ is accepted in the system, each arriving job $\tau_{i,j}$ is
guaranteed to be accepted with a probability $\delta_i$. Such a stochastic guarantee is achieved
by SRMS through two different mechanisms: accounting of the time consumed by $\tau_i$,
and aggregation of consecutive jobs $\tau_{i,j}, \tau_{i,j+1}, \ldots$. This method can be used only if
the execution time $c_{i,j}$ of job $\tau_{i,j}$ is known at the job arrival time.
To perform execution time accounting, SRMS associates a budget (the maximum budget
is called allowance in the original paper) to each task: as in a reservation based
algorithm, the budget is periodically replenished every $P_i^S$ units of time ($P_i^S$ is referred
to as superperiod) and is decreased when a job $\tau_{i,j}$ executes. However, since SRMS is
a firm algorithm, the budget can be immediately decreased when a job arrives, and
can be used to accept or reject a job, as will be shown later. In the original paper, the
superperiod of task $\tau_i$ is defined to be equal to the period of the next lower priority
task $\tau_{i+1}$.
Figure 9.3 Since the task superperiod $P_i^S$ is an integer multiple of the task period $T_i$, it is
divided into $m = P_i^S / T_i$ phases.
($c_{i,j} \le T_i - \sum_{h=1}^{i-1} Q_h^S T_i / P_h^S$) guarantees that an admitted job will finish within its
deadline, because the difference between the budget and the maximum amount of time
required by high priority tasks is greater than the job execution time.
The task's admission control used to guarantee that each task will receive the desired
QoS (that is, that a job $\tau_{i,j}$ will be rejected with a probability not greater than $1 - \delta_i$) is
performed in two steps: a first test guarantees that each task $\tau_i$ will receive its allowance
$Q_i^S$ in its superperiod; then, a second test verifies whether the pair $(Q_i^S, P_i^S)$ is enough
for guaranteeing $\delta_i$. As for a traditional reservation algorithm, each task is guaranteed
to receive $Q_i^S$ units of time every $P_i^S$ if $\sum_i Q_i^S / P_i^S \le U_{lub}$. Note that $U_{lub} = 1$,
because the analysis is performed under the assumption that periods are harmonic.
Also, $P_i^S$ is set to be equal to $T_{i+1}$. As a result, the first part of the admission control is
$\sum_i Q_i^S / T_{i+1} \le 1$.
If this condition is verified, each task r, is automatically guaranteed to receive its al-
lowance, hence for harmonic task sets the probability of rejecting a job can be computed
by considering only the first of the two job admission conditions.
Since the task set is harmonic, the superperiod will be an integer multiple of $T_i$,
and $m = T_{i+1}/T_i$ jobs $\tau_{i,j}, \ldots, \tau_{i,j+m-1}$ will be released in a superperiod, dividing
it into $m$ phases, as illustrated in Figure 9.3. A job arriving in the first phase of a
superperiod always finds a budget $q_i = Q_i^s$, and will be accepted if $c_{i,j} \le Q_i^s$. Hence,
if $\pi_{i,k}$ is the probability to accept a job $\tau_{i,j}$ arriving in the $k$-th phase,

$$\pi_{i,1} = P\{c_{i,j} \le Q_i^s\}.$$
The probability $\pi_{i,2}$ to accept a job $\tau_{i,j}$ arriving in the second phase is given by the
sum of two terms, considering the two cases in which the job arriving in the first phase
was accepted or rejected:

$$\pi_{i,2} = P\{c_{i,j-1} \le Q_i^s \wedge c_{i,j} + c_{i,j-1} \le Q_i^s\} + P\{c_{i,j-1} > Q_i^s \wedge c_{i,j} \le Q_i^s\};$$

since $c_{i,j}$ and $c_{i,j-1}$ have the same PDF (because jobs are supposed to be independent),
$P\{c_{i,j} + c_{i,j-1} \le Q_i^s\} = P\{2c_{i,j} \le Q_i^s\}$, and the probability can be easily computed.
The other probabilities $\pi_{i,k}$, $0 < k \le T_{i+1}/T_i$, can be computed in a similar way,
by considering all the possible histories of the task inside the first $k$ phases, and by
expressing $\pi_{i,k}$ as a sum of $2^{k-1}$ terms.
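The per-phase acceptance probabilities can also be computed numerically. The sketch below is not the enumeration of $2^{k-1}$ histories described above: it swaps in an equivalent bookkeeping that tracks the probability distribution of the remaining budget from phase to phase. It assumes integer execution times given as a discrete distribution (a dictionary mapping value to probability), independent jobs, harmonic periods, and budget-only admission; the function name `phase_acceptance` is hypothetical.

```python
def phase_acceptance(pmf, allowance, phases):
    """Return the acceptance probability of the job arriving in each phase."""
    budget_dist = {allowance: 1.0}   # the first phase always finds a full budget
    acceptance = []
    for _ in range(phases):
        accepted = 0.0
        next_dist = {}
        for budget, p_budget in budget_dist.items():
            for c, p_c in pmf.items():
                p = p_budget * p_c
                if c <= budget:
                    # Job accepted: its execution time is charged to the budget.
                    accepted += p
                    next_dist[budget - c] = next_dist.get(budget - c, 0.0) + p
                else:
                    # Job rejected: the budget is unchanged.
                    next_dist[budget] = next_dist.get(budget, 0.0) + p
        acceptance.append(accepted)
        budget_dist = next_dist
    return acceptance


# Execution time is 1 or 2 with equal probability, allowance 2, two phases.
pis = phase_acceptance({1: 0.5, 2: 0.5}, allowance=2, phases=2)
mean_acceptance = sum(pis) / len(pis)
```

Averaging the returned values over the phases gives the overall acceptance probability when jobs are uniformly distributed over the phases. The cost grows with the number of distinct budget values rather than exponentially with the number of phases.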
If the jobs are uniformly distributed in the various phases, the probability $\delta_i$ to accept
a job $\tau_{i,j}$ is

$$\delta_i = \frac{T_i}{T_{i+1}} \sum_{k=1}^{T_{i+1}/T_i} \pi_{i,k}.$$
If the task periods are not harmonic, the complexity of the analysis increases, because
there are situations in which a job released in a superperiod can have its deadline in the
next superperiod. This case can be addressed in three different ways:
1. admitting the job based on the current budget (the budget of the superperiod in
which the job arrives);
2. admitting the job based on the budget in the next superperiod (the superperiod in
which the job deadline is). In this case, the job execution must be delayed until
the next superperiod, or until all the lower priority tasks are inactive;
3. splitting the job in two sub-jobs, guaranteeing the first one in the current superpe-
riod, and the second one in the next superperiod.
These possible solutions have been considered in a separate paper [AB98b], and the
analysis is omitted here for the sake of simplicity. Note that in this case the second job
admission rule has to be considered too.
r z misses the deadline, then all the successive optional parts r,., k+,, . . . of the
current job are skipped.
To guarantee that a task $\tau_i$ will respect its QoS parameter $q_i$ (i.e., that it will complete
at least a fraction $q_i$ of its optional parts), the fraction $q_i$ of completed optional parts
can be computed as $q_i = E[A_i]/o_i$, where $A_i$ is a random variable indicating the
number of optional parts of task $\tau_i$ completed in a period (note that since the scheduler
is non-preemptable, this is equal to the number of optional parts that can be started in
a period). By definition,

$$E[A_i] = \sum_{k=0}^{o_i} k \, P\{A_i = k\}.$$
Figure 9.4 Number of optional parts scheduled for the first task.
The PDFs $P\{A_i = k\}$ of the number of completed optional parts are computed starting
from task $\tau_1$: if $X$ is a random variable given by the sum of the execution times
$c_1, \ldots, c_n$ of the mandatory parts, then $P\{A_1 = k\}$ can be computed by finding the
probability that $X$ leaves enough free time for $k$ optional parts in a period, and that the
time $Q_1$ reserved for optional parts is enough (see Figure 9.4). This computation can
be easily performed because $P\{X = s\}$ can be obtained by convolving the execution time
PDFs $U_i(c)$.
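The convolution step can be illustrated with a short sketch. It assumes integer execution times described by discrete distributions (dictionaries mapping value to probability) and independent tasks. The function names, and the `needed` parameter standing for the free time required by a given number of optional parts, are hypothetical and do not come from the original paper.

```python
def convolve(p, q):
    """Distribution of the sum of two independent discrete random variables."""
    out = {}
    for a, pa in p.items():
        for b, pb in q.items():
            out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out


def sum_distribution(pmfs):
    """Distribution of X, the sum of all mandatory execution times."""
    total = {0: 1.0}
    for pmf in pmfs:
        total = convolve(total, pmf)
    return total


def prob_free_time_at_least(x_pmf, period, needed):
    """Probability that X leaves at least `needed` free time units in a period."""
    return sum(p for s, p in x_pmf.items() if period - s >= needed)


# Two mandatory parts, each taking 1 or 2 time units with equal probability.
x_pmf = sum_distribution([{1: 0.5, 2: 0.5}, {1: 0.5, 2: 0.5}])
p_free = prob_free_time_at_least(x_pmf, period=5, needed=2)
```

How much free time $k$ optional parts actually need depends on their execution times, which this sketch leaves to the caller.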
convolved it with the PDF of $X$, obtaining the PDF of a new variable $X'$, that can be
used to compute $P\{A_2 = k\}$ (that is, the distribution of the number of optional parts
started for task $\tau_2$). In fact, the probability $P\{A_2 = k\}$ is computed by repeating the
process used for computing $P\{A_1 = k\}$, but using $X'$ instead of $X$. This process can
be iterated to obtain the probability for all the other tasks.
Since all the tasks have the same period $T$, the RM assignment does not give any
useful hints in deriving tasks' priorities. The authors propose a priority assignment
called Quality Monotonic Scheduling (QMS), that assigns higher priorities to tasks
with a higher quality parameter $q_i$ (remember that mandatory parts always have higher
priorities than optional parts).
The proposed analysis can be extended to preemptable resources by changing the way
in which X' is computed (see the original paper [HLR+01] for all the details).
When considering arbitrary periods, the admission test for mandatory parts ($\sum_i C_i \le T$)
must be changed: if periods are harmonic, $U^{lub}$ is still 1, and the test is $\sum_i C_i/T_i \le 1$.
After passing this admission test, the optional parts can be guaranteed as shown
above. If periods are not harmonic, an exact schedulability test (based on response time
analysis or on time demand analysis) should be used, although the authors analyzed
the behavior of the algorithm by simulation.
As a final consideration, it is worth noting that all the modifications proposed to the
traditional scheduling algorithms for controlling the deadline miss probability (i.e.,
the Task Transformation method, SRMS, and Quality-Assuring Scheduling) tend to
implement some kind of temporal protection among tasks (see also Chapter 3). This
fact seems to suggest that temporal protection is helpful to simplify the probabilistic
analysis of a real-time system: for this reason, a stochastic guarantee of reservation
based schedulers is presented in the next section.
2. the arrival of job $\tau_{i,j}$ corresponds to a request of $c_{i,j}$ units of time entering the
queue.
This model is similar to the one used by the Task Transformation method to analyze the
sporadic subtasks, with the difference that here each task has its own queue. Having
per-task queues also simplifies the analysis with respect to the real-time queueing
theory: in fact, since all the jobs of a single task are served in a FIFO order, there is no
need to model the scheduler behavior in the queue.
3. when a job arrives, the next request of $c_{i,j+1}$ units will arrive after
$r_{i,j+1} - r_{i,j} = z_{i,j} P_i^s$ units of time.
Since execution and interarrival times are random variables described by the PDFs
$U_i(c)$ and $V_i(t)$, the amount of execution time $s_{i,j}$ that still has to be served immediately
after a job arrival is a random variable too, described by a PDF $\pi_k^{(j)} = P\{s_{i,j} = k\}$.
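Being $Q_i^s$ time units served every period $P_i^s$, job $\tau_{i,j}$ will finish before time

$$r_{i,j} + \left\lceil \frac{s_{i,j}}{Q_i^s} \right\rceil P_i^s,$$

hence the probability $\pi_k^{(j)}$ that the queue length $s_{i,j}$ is $k$, immediately after a job
arrival, is a lower bound of the probability $P\{f_{i,j} - r_{i,j} \le \delta_k\}$ that the job finishes
before the probabilistic deadline $\delta_k = \lceil k/Q_i^s \rceil P_i^s$.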
Being the interarrival times multiple of the server period $P_i^s$, it is possible to define
$V_i'(z) = P\{r_{i,j} - r_{i,j-1} = zP_i^s\}$ as the probability that the interarrival time between
two consecutive jobs is $zP_i^s$. Hence,

$$V_i(t) = \begin{cases} 0 & \text{if } t \bmod P_i^s \ne 0 \\ V_i'(t/P_i^s) & \text{otherwise.} \end{cases}$$
Note that, since $c_{i,j}$ and $r_{i,j+1} - r_{i,j}$ are time invariant, $U_i(c)$ and $V_i'(z)$ do not depend
on $j$.
Being $s_{i,j}$ and $z_{i,j}$ greater than 0, by definition, the sums can be computed for $h$ and
$z$ going from 0 to infinity:
Hence,
with
Considering $m_{h,k}$ as an element of a matrix $M$, $\Pi^{(j)}$ can be computed by solving the
equation

$$\Pi^{(j)} = M \, \Pi^{(j-1)} \qquad (9.7)$$
where
where $E[C_i]$ is the execution time expectation and $E[T_i]$ is the interarrival time
expectation.
If this condition is not satisfied, then the difference $f_{i,j} - r_{i,j}$ between the finishing
time $f_{i,j}$ and the arrival time $r_{i,j}$ of each job $\tau_{i,j}$ of task $\tau_i$ will increase indefinitely,
diverging to infinity as $j$ increases:

$$\lim_{j \to \infty} (f_{i,j} - r_{i,j}) = \infty.$$

This means that, for preserving the schedulability of the other tasks, $\tau_i$ will slow down
in an unpredictable manner.
If a queue is stable, a stationary solution of the Markov chain describing the queue can
be found; that is, there exists a finite solution $\Pi$ such that $\Pi = \lim_{j \to \infty} \Pi^{(j)}$. Since
$\Pi^{(j)} = M \, \Pi^{(j-1)}$, the stationary solution satisfies $\Pi = M \, \Pi$.
This solution can be approximated by truncating the infinite dimension matrix $M$ to an
$n \times n$ matrix $M'$ and solving the eigenvector problem $\Pi' = M' \Pi'$ with some numerical
calculus technique.
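One simple numerical technique for the truncated problem is power iteration, sketched below. The sketch rests on assumptions that are not spelled out in this passage: integer time units, interarrival times expressed as a number $z$ of server periods, and a backlog that evolves as $s_j = \max(0, s_{j-1} - zQ) + c_j$ between consecutive arrivals. The function names are hypothetical.

```python
def build_matrix(exec_pmf, z_pmf, q, n):
    """Truncated n x n matrix with m[k][h] = P{s_j = k | s_{j-1} = h}."""
    m = [[0.0] * n for _ in range(n)]
    for h in range(n):
        for z, pz in z_pmf.items():
            left = max(0, h - z * q)       # backlog left after z server periods
            for c, pc in exec_pmf.items():
                k = left + c               # backlog right after the new arrival
                if k < n:                  # mass beyond the truncation is dropped
                    m[k][h] += pz * pc
    return m


def stationary(m, iterations=2000, tol=1e-12):
    """Approximate the fixed point pi = m * pi by repeated multiplication."""
    n = len(m)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        new = [sum(m[k][h] * pi[h] for h in range(n)) for k in range(n)]
        total = sum(new)
        new = [x / total for x in new]     # renormalize after truncation
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


# Execution time 1 or 2 with equal probability, one server period between
# arrivals, and a budget of 2 units per server period.
m = build_matrix({1: 0.5, 2: 0.5}, {1: 1.0}, q=2, n=8)
pi = stationary(m)
```

The truncation size `n` has to be large enough that the probability mass discarded beyond it is negligible, which is only possible when the queue is stable.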
• be conservative (pessimistic);
• verify Equation (9.5).
$$V_i'(t) = \begin{cases} 0 & \text{if } t \bmod P_i^s \ne 0 \\ \ldots & \text{otherwise.} \end{cases}$$
Figure 9.5 Conservative approximation of a CDF.
Equation (9.8) states that the approximated interarrival times CDF $W_i'(t)$ computed
from $V_i'(t)$ must be greater than or equal to the interarrival times CDF $W_i(t)$ computed
from $V_i(t)$ (recall that the CDF of a stochastic variable expresses the probability that
the variable is less than or equal to a given value).
In practice, the intuitive interpretation of Equation (9.8) is that $V_i'(t)$ is conservative if
the probability that the interarrival time is smaller than $t$ according to $V_i'(t)$ is bigger
than according to $V_i(t)$. This concept is explained in Figure 9.5.
It can easily be verified that, if $V_i'(t)$ is computed according to Equation (9.9), then it
will have both the required properties.
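The conservativeness requirement can be checked numerically. The sketch below does not reproduce Equation (9.9). As an assumption, it builds one possible approximation by moving each interarrival time down to the nearest multiple of the server period, and then verifies that the resulting CDF is never below the original one. Interarrival times are assumed to be integers not smaller than the server period, and all function names are hypothetical.

```python
def round_down_to_period(v, period):
    """Move the mass at each interarrival time t to the multiple of `period` below it."""
    approx = {}
    for t, p in v.items():
        t_low = (t // period) * period
        approx[t_low] = approx.get(t_low, 0.0) + p
    return approx


def cdf(v, t):
    """Probability that the interarrival time is less than or equal to t."""
    return sum(p for x, p in v.items() if x <= t)


def is_conservative(approx, v):
    """True if the approximated CDF dominates the original CDF everywhere."""
    # Both CDFs are step functions, so checking the support points is enough.
    points = sorted(set(v) | set(approx))
    return all(cdf(approx, t) >= cdf(v, t) - 1e-12 for t in points)


v = {5: 0.5, 7: 0.25, 12: 0.25}            # original interarrival distribution
v_approx = round_down_to_period(v, period=4)
ok = is_conservative(v_approx, v)
```

Shorter interarrival times mean more frequent requests, so an approximation of this kind can only make the estimated backlog worse, never better. Whether it also keeps the queue stable has to be checked separately.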
REFERENCES
[AAS97] T.F. Abdelzaher, E.M. Atkins, and K.G. Shin. QoS negotiation in real-time systems and its applications to automated flight control. In Proceedings of the IEEE Real-Time Technology and Applications Symposium, Montreal, Canada, June 1997.
[AB00] Luca Abeni and Giorgio Buttazzo. Support for dynamic QoS in the HARTIK kernel. In Proceedings of the IEEE Real Time Computing Systems and Applications, Cheju Island, South Korea, December 2000.
[AB01a] Luca Abeni and Giorgio Buttazzo. Hierarchical QoS management for time sensitive applications. In Proceedings of the IEEE Real-Time Technology and Applications Symposium (RTAS 2001), Taipei, Taiwan, May 2001.
[AB01b] Luca Abeni and Giorgio Buttazzo. Stochastic analysis of a reservation based system. In Proc. of the 9th International Workshop on Parallel and Distributed Real-Time Systems, San Francisco, CA, April 2001.
[AB04] Luca Abeni and Giorgio Buttazzo. Resource reservations in dynamic real-time systems. Real-Time Systems, 27(2):123-165, 2004.
[BB02] Enrico Bini and Giorgio C. Buttazzo. The space of rate monotonic algorithm. In Proceedings of the 23rd IEEE Real-Time Systems Symposium, December 2002.
[BMR90] S.K. Baruah, A.K. Mok, and L.E. Rosier. Preemptively scheduling hard-real-time sporadic tasks on one processor. In Proceedings of the 11th IEEE Real-Time Systems Symposium, pages 182-190, December 1990.
[BRH90] S.K. Baruah, L.E. Rosier, and R.R. Howell. Algorithms and complexity concerning the preemptive scheduling of periodic real-time tasks on one processor. The Journal of Real-Time Systems, 2, 1990.
[BS93] G.C. Buttazzo and J. Stankovic. RED: A robust earliest deadline scheduling algorithm. In Proceedings of Third International Workshop on Responsive Computing Systems, 1993.
[CB97] M. Caccamo and G.C. Buttazzo. Exploiting skips in periodic tasks for enhancing aperiodic responsiveness. In IEEE Real-Time Systems Symposium, pages 330-339, San Francisco, 1997.
[CBS00] M. Caccamo, G. Buttazzo, and L. Sha. Capacity sharing for overrun control. In Proceedings of the IEEE Real-Time Systems Symposium, Orlando, Florida, December 2000.
[DLS97] Z. Deng, J. W. S. Liu, and J. Sun. A scheme for scheduling hard real-time applications in open system environment. In Ninth Euromicro Workshop on Real-Time Systems, 1997.
[FM02] Xiang Feng and Aloysius K. Mok. A model of hierarchical real-time virtual resources. In Proceedings of the 23rd IEEE Real-Time Systems Symposium, pages 26-35, Austin, TX, USA, December 2002.
[FNT95] Hiroshi Fujita, Tatsuo Nakajima, and Hiroshi Tezuka. A processor reservation system supporting dynamic QoS control. In 2nd International Workshop on Real-Time Computing Systems and Applications, October 1995.
[GAGB01] Paolo Gai, Luca Abeni, Massimiliano Giorgi, and Giorgio Buttazzo. A new kernel approach for modular real-time systems development. In Proceedings of the 13th IEEE Euromicro Conference on Real-Time Systems, Delft, The Netherlands, June 2001.
[GB95] T.M. Ghazalie and T.P. Baker. Aperiodic servers in a deadline scheduling environment. Journal of Real-Time Systems, 9, 1995.
[GB00] G. Lipari and S.K. Baruah. Greedy reclamation of unused bandwidth in constant bandwidth servers. In IEEE Proceedings of the 12th Euromicro Conference on Real-Time Systems, Stockholm, Sweden, June 2000.
[GGV96] Pawan Goyal, Xingang Guo, and Harrick M. Vin. A hierarchical CPU scheduler for multimedia operating systems. In 2nd OSDI Symposium, October 1996.
[SAWJ+96] Ion Stoica, Hussein Abdel-Wahab, Kevin Jeffay, Sanjoy K. Baruah, Johannes E. Gehrke, and C. Greg Plaxton. A proportional share resource allocation algorithm for real-time, time-shared systems. In IEEE Real Time Systems Symposium, 1996.