Providing End-to-End QoS for IP-Based Latency-Sensitive Applications
Dissertation Proposal
Chen-Nee Chuah
Department of Electrical Engineering and Computer Science
University of California at Berkeley
Committee: Professors S. McCanne (Chair), R. H. Katz, J. Walrand, G. Shanthikumar
Abstract
Advances in IP technology, which include the real-time encoding and decoding of voice and video,
router mechanisms, and the Real-time Transport Protocol (RTP), offer great potential for computer-telephony
integration over IP networks. However, due to the unpredictable loss, delay, and delay jitter in packet-switched
networks, the conventional Internet's single best-effort class of service can adversely impact the
end-to-end performance of real-time applications. This dissertation proposal describes a research effort to
determine mechanisms that provide Quality of Service (QoS) in an efficient and scalable manner over the
Internet. We will elaborate on an end-to-end service architecture that can support per-session statistical
quality of service guarantees, by building on the significant amount of previous research on network- and
application-level QoS solutions. In our initial work, we have chosen Voice over IP (VoIP) traffic as a workload for evaluation, and use a subjective test to determine the effects of network centric parameters on
human perceived voice quality.
First, we develop resource allocation techniques for VoIP traffic in virtual private networks (VPNs) to
achieve the required QoS. Specifically, we will show how to use traffic statistics in predicting aggregate
bandwidth usage, which is crucial for capacity planning and admission control decisions. Since these
results may not scale to the publicly shared Internet, we propose to study performance of real-time applications within the Diff-Serv framework. We will describe a) the trade-offs between performance and network
efficiency for VoIP traffic in the presence of other integrated computer-telephony applications, e.g., interactive web applications, and b) the interaction between real-time flows transported using UDP and congestion-controlled TCP connections. We expect to make modifications to the existing RTP protocol so that
real-time applications will adapt their rates to network load, and prevent potential starvation of TCP connections. Finally, as an example, we will investigate the role of adaptive audio applications by re-evaluating the play-out delay adjustment algorithms in the modern context. We have shown the existence of time
correlation between packet delays, and propose to design and analyze a delay adjustment algorithm based
on this conditional delay estimate. We will evaluate this algorithm using trace-based simulations.
[Table: Traffic Behavior and QoS Requirements by application class (row: Client-server Business Applications).]
1. Vat is an audio conferencing tool developed by the Network Research Group of Lawrence Berkeley National Laboratory (http://www-nrg.ee.lbl.gov/vat/).
2. A flow is a sequence of packets from the sender to the receiver that do not necessarily follow the same route.
Diff-Serv, on the other hand, supports the needs of various applications by using a simple classification
scheme. The basic idea behind the Diff-Serv model is to carry the QoS information in band within the packet,
in the Type of Service (TOS) field of the IPv4 header or the Differentiated Service (DS) field of IPv6 [18]. The TOS or
DS field is used to indicate the need for low-delay, high-throughput, or low-loss-rate service. The backbone routers provide per-hop differential treatment to different service classes as defined by Per Hop
Behaviors (PHBs) [19-20]. Two service models have been proposed: assured service [21] and premium
service [22]. Assured service is intended for customers who need a specific amount of bandwidth from their service providers, while premium service provides low-delay and low-jitter service, and is suitable for
Internet audio/video applications and for IP tunnels for Virtual Private Networks (VPNs) [23].
The Diff-Serv approach has several advantages over Int-Serv:
Diff-Serv is simpler than Int-Serv and does not require end-to-end signaling (no RSVP).
Diff-Serv is efficient, since classification and PHBs are based on per-class rather than per-flow information. Since the number of service classes is limited by the size of the TOS field, and the amount of
state information is proportional to the number of classes instead of the number of flows, the Diff-Serv
approach is more scalable than Int-Serv.
Diff-Serv requires minimal change to the current network infrastructure.
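The scalability argument above can be illustrated with a minimal Python sketch. It assumes a 6-bit codepoint carried in the upper six bits of the TOS/DS byte; the specific codepoint values and the class names used here are illustrative assumptions, not values fixed by this proposal. The point is only that the router keeps one state entry per class, no matter how many flows it sees.

```python
from collections import Counter

# Illustrative codepoints (assumptions made for this sketch only).
DSCP_PREMIUM = 46
DSCP_ASSURED = 10
DSCP_BEST_EFFORT = 0

CLASS_NAMES = {DSCP_PREMIUM: "premium", DSCP_ASSURED: "assured"}


def dscp_to_tos_byte(dscp: int) -> int:
    """Place a 6-bit codepoint in the upper six bits of the 8-bit TOS/DS byte."""
    if not 0 <= dscp < 64:
        raise ValueError("codepoint must fit in 6 bits")
    return dscp << 2


def classify(tos_byte: int) -> str:
    """Map a TOS/DS byte to a service class; unknown codepoints are best effort."""
    return CLASS_NAMES.get(tos_byte >> 2, "best-effort")


def per_class_state(packets):
    """Count packets per class: state grows with classes, not with flows."""
    state = Counter()
    for _flow_id, tos_byte in packets:
        state[classify(tos_byte)] += 1
    return state


if __name__ == "__main__":
    # 1000 distinct flows, but only three classes of router state.
    marks = [DSCP_PREMIUM, DSCP_ASSURED, DSCP_BEST_EFFORT]
    packets = [(flow, dscp_to_tos_byte(marks[flow % 3])) for flow in range(1000)]
    print(dict(per_class_state(packets)))
```

An Int-Serv style router would instead hold one entry per `flow_id`, which is the contrast the list above draws.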
1.4 Our Research Effort
In the previous section, we described several approaches that help provide quality of service for
Internet real-time applications, both at the network level and the end-application level. The main contribution of this dissertation study will be the design of an end-to-end service architecture, based on the Diff-Serv
framework and the adaptive-application concept, that can satisfy the QoS requirements of IP-based real-time applications in an efficient manner. We approach this problem by studying QoS provisioning mechanisms and
their interactions across multiple levels: the network, transport, and application layers. For our initial work, we
consider VoIP as a workload for evaluation because its traffic characteristics are relatively well known. We use a subjective test to determine the performance measures in terms of human-perceived voice
quality in the presence of packet delays and losses. This study will be based on a combination of analysis,
trace collection, and simulation. We will address the following problems:
Resource allocation for real-time applications in VPNs. VPN services are potential solutions to
provide performance assurances close to those of private leased lines over the Internet. However, the
existing Service Level Agreements (SLAs) [24] between the network provider and the customers
remain vague. Further development in resource management for VPNs is still needed before the
QoS of delay-sensitive applications can be accommodated. We will describe how to use traffic statistics to estimate the aggregate bandwidth usage. The results help network managers determine the amount of bandwidth required for a shared link, and the maximum number of connections
that can be supported. We evaluate the proposed solution using VoIP as a workload in a simulation
study. We will further validate our results with trace-driven experiments.
Design of an end-to-end service architecture for real-time traffic under the Diff-Serv framework. We
choose to base our study on Diff-Serv because of its scalability and relative ease of deployment.
Previous studies have evaluated the performance of different Diff-Serv service models in a single
hop, using several analytical traffic models (e.g., Poisson, on-off source), but no specific conclusions
can be drawn from these studies to help determine the best combination of router mechanisms to provide QoS for real-time traffic. First, we propose to evaluate the end-to-end performance of VoIP in
the presence of interfering traffic, using the Diff-Serv assured and premium service models, for a multiple-hop topology. In the process, we will develop new workload models for integrated computer-telephony applications, such as VoIP and interactive web traffic. We will describe the trade-offs between
performance and network efficiency for a given traffic load and mixture. Since VoIP traffic is
transported using UDP flows, which do not back off in times of congestion, it can potentially impair
the throughput of TCP connections and cause network inefficiency. We expect to make some modifications to the RTP transport protocol, so that real-time applications will adapt their transmission rates
to network load and become more TCP-friendly.
An example of application-level QoS: improving perceived audio quality by exploiting the time correlation of network delays in adaptive playout delay algorithms. The playout buffer for packet audio
applications tracks the packet delay and jitter to dynamically adjust the playout delay of each talkspurt and
minimize degradation of perceived quality. Previous approaches rely on the marginal distribution of
delay statistics; we propose to exploit additional information about the correlation between
packet delays in designing new adaptive algorithms that are more robust. We define robustness as the ability of the algorithm to perform well regardless of the choice of voice
traces or the nature of network delays.
1.5 Proposal Organization
The goal of this research effort is to design cross-level QoS mechanisms that are crucial to providing
QoS to IP-based real-time traffic. We use VoIP as a workload for evaluation. Figure 1 shows a protocol
stack for the end-to-end IP-based service architecture that we consider. The shaded layers denote the primary focus of this research effort.
The remainder of this research proposal is organized as follows:
Section 2 gives a more detailed discussion of the related work.
Section 3 describes VoIP, its performance requirements, and the equivalent mathematical model that
we use as a workload for evaluation purposes in all our subsequent simulation studies.
Section 4 discusses how we use prior knowledge about the traffic statistics in capacity planning and
resource allocation for VoIP over VPNs. We will discuss our simulation results.
Section 5 highlights current and future work in designing and evaluating an end-to-end service architecture to provide QoS for real-time applications under the Diff-Serv framework.
Section 6 discusses how we use delay correlation information to improve the efficiency and robustness of the adaptive playout algorithms of packet audio applications.
Section 7 presents the specific research agenda and development plan for this project.
Figure 1. Network architecture and scope. The shaded layers denote the primary focus of this research effort. (Protocol stack at each end: Application Layer, API, Transport Layer, Network Layer, MAC/Link Layer, Physical Layer.)
2 Related Work
2.1 Providing QoS via Diff-Serv Framework
Besides the previous studies mentioned in Section 1.3, there have been several recent research efforts
devoted to understanding the delay and/or loss behavior of the Diff-Serv service models, via both analysis and
experimentation [25-33]. In [25], May et al. developed analytical models to derive the expected delay and
packet loss of Poisson arrivals for the two Diff-Serv service models: assured service and premium service.
Naser et al. [26] quantify the expected delay of packets for constant-bit-rate and on-off source traffic using
a two-bit architecture that combines assured and premium services, for a given traffic load and buffer
size. Sahu et al. [27] compare the performance of two router mechanisms, threshold dropping and priority scheduling, coupled with edge marking and edge discarding techniques. They observe that priority
scheduling provides lower expected delays to preferred traffic than threshold dropping. They also observe
that there is little difference in the loss incurred by preferred traffic under the two router mechanisms, except
when the sources are extremely bursty, in which case threshold dropping performs better. Yeom and
Narasimha Reddy discuss techniques for achieving desired throughput guarantees in a Diff-Serv architecture in [28]. They point out that drop precedences by themselves cannot achieve the desired target
rates because of the strong interaction of the transport protocol with packet drops, and propose the following techniques to solve the problem: (i) modify the transport protocol at the sender, (ii) modify the
marking strategies, and (iii) modify the dropping policies at the router. A few other researchers address
routing issues and scalable signaling protocols [29, 30]. Adiseshu et al. [30] proposed a lightweight protocol called SSP (state setup protocol), which is designed to disseminate and manage state information that
associates data flows with QoS classes in the routers.
It appears that there have been no published results on the end-to-end performance of a realistic workload
such as VoIP using Diff-Serv service models in a multiple-hop topology. There is also a lack of studies on the
architectural design of admission control for real-time applications over Diff-Serv.
2.2 Playout Delay Adjustment Algorithms in Internet Audio Applications
Previous studies [34-36] have indicated the presence of spikes in end-to-end Internet delays. A
delay spike is found to have an initially steep rise followed by a linear, monotonic decrease. Bolot's conjecture [34]
attributes this phenomenon to probe compression: the accumulation of audio packets (from the flow
being examined) behind a large number of packets from other sources in a router queue. Ramjee et al.
show in [36] that it is advantageous to react quickly to a delay spike if the spike spans multiple talkspurts, and present a delay spike detection algorithm. They also study the trade-off between average
playout delay and loss due to late packet arrivals incurred by different delay adjustment algorithms.
Moon et al. revisited this problem in [37] and computed upper and lower bounds on the optimum (minimum) average playout delay for a given number of packet losses. They also presented a new delay adjustment algorithm that tracks the network delay of recently received packets.
Although there is a significant amount of previous work on playout delay adjustment techniques, most
of the results are trace-specific and based on estimators derived from the marginal distribution of packet
delays. We need to re-evaluate the performance of these algorithms in the presence of delay correlation, as
we will discuss in Section 6.
3 Modeling VoIP
Before we address the various mechanisms used to provide QoS to IP-based real-time applications, we
first explain why we choose VoIP as a workload for evaluation, and how we determine its performance
requirements and equivalent mathematical model in this section.
VoIP refers to real-time delivery of packet voice across networks using the Internet protocols. The
rapid growth of IP-based packet switched networks and the overall bandwidth efficiency of an integrated
IP network make it an attractive candidate to transport voice connections. In fact, multiplexing data and
voice results in better bandwidth utilization than the traditional circuit-switched voice-or-nothing backbone in the PSTN (Public Switched Telephone Network), which consists of over-engineered voice trunks.
This justifies looking at VoIP as a workload for future Internet packet networks.
Figure 2. An end-to-end service architecture for Voice over IP using Differentiated Services. (Components shown: PSTN, IP Telephony Gateway (IPTG), RTP/UDP/IP packets from the voice source, Diff-Serv Boundary Routers, managed IP backbone (Diff-Serv), voice VPNs, sender, and receiver.)
We consider a very simple service architecture, as shown in Figure 2, which includes two important
functional blocks: IP Telephony Gateways (IPTGs) and Diff-Serv Boundary Routers (DSBRs). IPTGs perform the necessary conversion between the transmission format of the input voice traffic and the RTP/UDP/IP
format that is carried over Diff-Serv networks at the ingress and egress points. Further details about Diff-Serv components and VoIP are described in Appendix A and Appendix B, respectively.
3.1 VoIP Performance Requirements
In this section, we quantify the performance requirements of VoIP, by mapping the human perceived
voice quality to the more tangible network centric parameters: packet loss and packet delay.
3.1.1 Packet Loss Rate
In the fall of 1998, we used vat to run a simple subjective test to map the packet loss rate to perceived
voice quality for the following case: PCM codec with silence suppression, 8 kHz sampling rate, 8 bits per
sample (contributing to 64 kbps when the source is active), and 20 ms of voice samples per packet.
The sound files of three sentences (about 6 seconds each) from the movie A Few Good Men were
downloaded and converted to PCM format with an 8 kHz sampling rate¹. The voice samples were packetized
into RTP packets with a 12-byte RTP header and sent through a simple network emulation that introduced
uniformly distributed packet losses according to different loss rates. The perceived voice quality was
scored on a numeric scale from 0 to 5 with the following definitions: 5 = crystal clear; 4 = comprehensible but
less clear; 3 = choppy speech; 2 = harder to comprehend sentences due to noise; 1 = can comprehend less
than 50% of the sentence; 0 = gibberish noise. The result is plotted in Figure 3. Results show that the tolerable loss rates are within 1-3% and the quality becomes intolerable when more than 3% of the voice
packets are lost.
1. Since these sound files are in WAV format, we used the sndrfmt program from ICSI to resample the voice at 8 kHz.
Note that packet voice using Forward Error Correction (FEC) is more resilient to losses and therefore
we would expect the curve to shift to the right in this case. On the other hand, the quality of voice connection using compressed speech is more sensitive to lost voice samples, and we expect the curve to shift to
the left. The impact of packet loss on voice quality depends on the codec used, the burstiness of losses, and
the frame size per packet, but a study of these factors is outside the scope of this project. We have considered packet losses that are
caused by buffer overflows in routers as well as discarding of delayed packets that miss the playout time at
the receiver.
3.1.2 Maximum Tolerable Delay
ITU-T Recommendation G.114 [38] specifies that the one-way transmission time for connections with
adequately controlled echo should be in the 0-150 ms range to be acceptable for most user applications. As
mentioned earlier, the end-to-end delay for VoIP depends on various components of the packet network. In
this project, we assume PCM transcoding introduces almost negligible delay if implemented in hardware
(0.75 ms). We also assume that propagation delay is relatively constant and can be easily estimated. From
[38], Public Land Mobile Systems contribute around 80-110 ms to one-way propagation time. Satellite
systems introduce 12 ms at 1400 km altitude, and 110 ms at 14,000 km altitude. An optical fibre cable system
contributes around 50-60 ms from coast to coast in the United States. Assuming it takes 100 ms of propagation
delay for voice packets to be transported across the United States, the total queueing delay should be kept
within 50 ms (150 ms minus the propagation delay). Since queueing delay is the only variable part in our model, we
need to budget the per-hop queueing delay. Using traceroute, we found that there were typically around
8-12 hops between a machine on the west coast and one on the east coast. Assuming that the queueing delay is almost
the same for each hop and taking roughly 10 hops, we require the per-hop queueing delay to be at most 5 ms (50 ms / 10) when we design our
resource allocation schemes.
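The delay budget above is simple arithmetic, and a short Python sketch makes the dependence on the hop count explicit. The function name and the choice of 10 hops as the representative value within the observed 8-12 range are assumptions made for illustration.

```python
def per_hop_queueing_budget_ms(one_way_budget_ms: float,
                               propagation_ms: float,
                               hops: int) -> float:
    """Split the queueing allowance evenly across hops.

    one_way_budget_ms: end-to-end one-way limit (150 ms in the text).
    propagation_ms:    assumed fixed propagation delay (100 ms in the text).
    hops:              number of router hops on the path.
    """
    if hops <= 0:
        raise ValueError("hops must be positive")
    queueing_ms = one_way_budget_ms - propagation_ms
    if queueing_ms < 0:
        raise ValueError("propagation delay already exceeds the budget")
    return queueing_ms / hops


if __name__ == "__main__":
    for hops in (8, 10, 12):
        budget = per_hop_queueing_budget_ms(150.0, 100.0, hops)
        print(f"{hops} hops: {budget:.2f} ms per hop")
```

With 12 hops the even split falls slightly below 5 ms, so the 5 ms target corresponds to the middle of the observed hop range rather than its worst case.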
[Figure 3: perceived voice quality score (0 to 5) versus packet loss rate (0 to 0.1); annotation: increased sensitivity to losses with compression.]
..., respectively. The fraction of time that the voice source is on is $\frac{\alpha}{\alpha + \beta}$. When the source is in the
on state, fixed-size packets are generated at a constant interval. No packets are transmitted when the
source is off. The size of the packet and the rate at which the packets are sent depend on the voice
codec and compression scheme. Let $X_i(t)$ be the instantaneous rate of voice connection $i$; then,
$$X_i(t) = \begin{cases} R & \text{if the source is on} \\ 0 & \text{if the source is off} \end{cases} \tag{1}$$
where $R$ is the voice bit rate (i.e., packet size/packet interval). The rate of transition from state 0 kbps to
state $R$ kbps is $\alpha$, while the reverse transition happens at the rate of $\beta$.
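As a sanity check on the on-off source model, the following Python sketch simulates the two-state process and measures the long-run fraction of time spent in the on state. It assumes exponentially distributed holding times (the usual reading of a Markov on-off source); the parameter names `alpha` (off-to-on rate) and `beta` (on-to-off rate) are labels chosen for this sketch.

```python
import random


def simulate_on_fraction(alpha: float, beta: float,
                         cycles: int = 20000, seed: int = 1) -> float:
    """Estimate the fraction of time an on-off Markov source is on.

    alpha: transition rate from the off state (0 kbps) to the on state (R kbps).
    beta:  transition rate from the on state back to the off state.
    """
    rng = random.Random(seed)
    on_time = 0.0
    off_time = 0.0
    for _ in range(cycles):
        off_time += rng.expovariate(alpha)  # sojourn in the off state
        on_time += rng.expovariate(beta)    # sojourn in the on state
    return on_time / (on_time + off_time)


def mean_rate_kbps(alpha: float, beta: float, peak_rate_kbps: float) -> float:
    """Long-run mean rate: peak rate times the on-fraction alpha/(alpha+beta)."""
    return peak_rate_kbps * alpha / (alpha + beta)


if __name__ == "__main__":
    print("simulated on-fraction:", round(simulate_on_fraction(0.4, 0.6), 3))
    print("analytical mean rate (kbps):", mean_rate_kbps(0.4, 0.6, 80.0))
```

The simulated fraction should settle near the analytical value as the number of cycles grows.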
strate how to allocate limited resources to meet QoS constraints for VoIP in VPNs. The following are two
fundamental network design questions that are of interest:
Given that N VoIP flows arrive at a node i, how much bandwidth, C, and buffer space, B, should be
allocated to the VoIP service class to limit packet loss to at most 3% (Section 3.1.1) and delay to at
most 5 ms (Section 3.1.2)?
Given a fixed link capacity, C, what is the maximum number of VoIP flows, Nmax, that we can admit into
the queue before the QoS requirements of VoIP are violated?
In the fall of 1998, we addressed the questions mentioned using both analytical and simulation techniques, and submitted the results to ICNP99 [41] (attached as Appendix D). Although our study is based
on VoIP workload, the results can be extended to other delay sensitive applications. The next two sections
briefly describe our findings.
4.3 Aggregate Bandwidth Predictor
We consider N potential incoming VoIP flows served at $C_v$ kbps. Each VoIP source is modeled as an
on-off Markov process as described in Section 3.2. Let $X_i$ be the instantaneous arrival rate of flow $i$, and
$$Y = \sum_{i=1}^{N} X_i$$
in the equivalent discrete time model (Appendix C). Assume that the $X_i$ are i.i.d. with mean $m$
and variance $\sigma^2$. When $N$ gets large, $Y$ tends to have a normal (Gaussian) distribution with mean $Nm$ and
variance $N\sigma^2$ under the Central Limit Theorem [42], $Y \sim \mathcal{N}(Nm, N\sigma^2)$. Assume that losses occur when
$Y > C_v$, and the maximum tolerable loss is $P(Y > C_v) = \epsilon_{max}$. Using the well-known Q-function, we can
estimate the required bandwidth, $C_v^o$, as follows (the detailed derivation can be found in Appendix C):
$$C_v^o = Nm + Q^{-1}(\epsilon_{max})\,\sigma\sqrt{N} \tag{2}$$
In practice, the voice traffic arrives in IP-packet format, and there are buffers in the system. The ideal
fluid model in Appendix C does not hold, but the result in Eq. (2) can still serve as a first-order approximation of the actual bandwidth required. From Figure 3 in Section 3.1.1, the packet loss rate should not
exceed 3% to preserve satisfactory voice quality. Therefore, we chose $\epsilon_{max} = 0.03$. With $Q^{-1}(0.03) \approx 1.88$, we need
$$C_v = Nm + 1.88\,\sigma\sqrt{N} \tag{2a}$$
to satisfy the QoS requirements of VoIP. As a numerical example, let $\alpha = 0.4$, $\beta = 0.6$, $R = 80$ kbps. From
Eq. (2), the mean $m = 0.4 \times 80 = 32$ kbps, while $\sigma = \sqrt{E[(X - m)^2]} \approx 39.2$ kbps. For every value of $N$, one
can then estimate the capacity needed by substituting the numerical values into Eq. (2a).
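The numerical example above can be reproduced with a short Python sketch of the Gaussian bandwidth predictor. The function names are hypothetical, and `alpha` and `beta` denote the off-to-on and on-to-off transition rates of the source model; the inverse Q-function is obtained from the standard normal quantile function in Python's `statistics` module.

```python
from math import sqrt
from statistics import NormalDist


def q_inverse(p: float) -> float:
    """Inverse of the Gaussian tail function Q(x) = P(Z > x), Z standard normal."""
    return NormalDist().inv_cdf(1.0 - p)


def on_off_stats(alpha: float, beta: float, peak_kbps: float):
    """Mean and standard deviation of the rate of one on-off source."""
    p_on = alpha / (alpha + beta)
    mean = p_on * peak_kbps
    sigma = peak_kbps * sqrt(p_on * (1.0 - p_on))
    return mean, sigma


def required_bandwidth_kbps(n_flows: int, mean: float, sigma: float,
                            max_loss: float) -> float:
    """Bandwidth estimate N*m + Q^{-1}(max_loss) * sigma * sqrt(N)."""
    return n_flows * mean + q_inverse(max_loss) * sigma * sqrt(n_flows)


if __name__ == "__main__":
    m, s = on_off_stats(0.4, 0.6, 80.0)
    print(f"m = {m:.1f} kbps, sigma = {s:.1f} kbps")
    print(f"Q^-1(0.03) = {q_inverse(0.03):.2f}")
    for n in (20, 50, 100):
        cv = required_bandwidth_kbps(n, m, s, 0.03)
        print(f"N = {n}: about {cv:.0f} kbps")
```

Because the second term grows only with the square root of N, the per-flow estimate falls toward the mean rate as more flows are multiplexed.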
The statistical gain from multiplexing multiple flows together results in a lower effective bandwidth
per flow. Therefore, the actual bandwidth that is needed to meet the performance requirement of VoIP can
be less than the predicted value of Cv from Eq. (2a), which can be viewed as an upper bound. For comparison purposes, we denote the required bandwidth computed from the analytical model as Cvo, and compare it
with the actual bandwidth determined from simulation, denoted as Cvns, in the next section.
4.4 Numerical Results & Discussions
We used simulation to determine the minimum bandwidth, $C_v^{ns}$, needed to support a specific number
of voice users N. We used the ns simulator¹ to model a simple one-hop topology where N VoIP flows share the
same link served at $C_v$ kbps. The VoIP sources were simulated according to Section 3.2 with a voice activity
cycle of 40%. We assumed that an 8 kHz, 8 bits/sample PCM codec was used with a 20 ms frame per
packet, so the voice data in each packet was 160 bytes. With a 12-byte RTP header, an 8-byte UDP header, and a 20-byte
IP header, the size of each packet was 200 bytes. With these header overheads, the effective rate of a single
voice connection when it was active was (200*8)/20 = 80 kbps (25% overhead). The buffer size B was chosen
to limit the maximum possible delay (for a packet at the tail of the queue) to at most 5 ms:
$$\frac{B \times 200 \times 8}{C_v^{ns}} \le 5 \text{ ms} \tag{3}$$
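The packet-size and buffer-size arithmetic above can be checked with a small Python sketch. The constant and function names are chosen for this illustration; the buffer size is expressed in packets and rounded down so that the tail-of-queue delay never exceeds the 5 ms bound.

```python
SAMPLE_RATE_HZ = 8000      # PCM sampling rate
BITS_PER_SAMPLE = 8
FRAME_MS = 20              # voice carried per packet
RTP_HEADER_BYTES = 12
UDP_HEADER_BYTES = 8
IP_HEADER_BYTES = 20


def payload_bytes() -> int:
    """Voice payload per packet: samples in one frame times bytes per sample."""
    samples = SAMPLE_RATE_HZ * FRAME_MS // 1000
    return samples * BITS_PER_SAMPLE // 8


def packet_bytes() -> int:
    """Payload plus RTP, UDP, and IP headers."""
    return payload_bytes() + RTP_HEADER_BYTES + UDP_HEADER_BYTES + IP_HEADER_BYTES


def active_rate_kbps() -> float:
    """Rate while the source is on; bits per millisecond equals kbps."""
    return packet_bytes() * 8 / FRAME_MS


def max_buffer_packets(link_kbps: float, max_delay_ms: float = 5.0) -> int:
    """Largest B (in packets) with B * packet_bits / link_kbps <= max_delay_ms."""
    return int(max_delay_ms * link_kbps // (packet_bytes() * 8))


if __name__ == "__main__":
    print(payload_bytes(), packet_bytes(), active_rate_kbps())
    for cv in (1000, 3200, 6400):
        print(f"Cv = {cv} kbps: B <= {max_buffer_packets(cv)} packets")
```

At low link rates the bound allows only a handful of packets, which is one reason loss, rather than delay, tends to dominate the provisioning decision in this setting.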
N was varied from 0 to 200. For each N, $C_v$ was increased until the worst per-flow loss rate decreased
to 3%, and the corresponding value $C_v^{ns}(N)$ was recorded. Figure 4 shows how $C_v^{ns}(N)$ was determined for
N=20 and N=30. Results were plotted in Figure 5, together with $C_v^o$, which was predicted using Eq. (2a). Our
main findings were as follows:
We found that $C_v^{ns}(N)$ increased linearly with N at a slope of approximately 33 kbps for each additional voice connection (per unit of N). The slope was roughly 40% of the peak rate of 80 kbps, which
was very close to the mean rate of 32 kbps.
$C_v^o$ served as an upper bound for the actual bandwidth usage, and tracked $C_v^{ns}$ fairly well for moderate N. This was a pleasant discovery, since it implied that if we know the mean and variance of each
individual flow, we can estimate the aggregate bandwidth required. By allocating $C_v^o$ kbps to the
whole VoIP service class, VPNs can give statistical guarantees (maximum packet loss and delay) to
each individual VoIP flow.
[Figure 4: worst per-flow loss rate versus link bandwidth $C_v^{ns}$ (Mbps), with a curve labeled N=30.]
[Figure 5: required bandwidth $C_v^o$ and $C_v^{ns}$ versus number of users, N.]
The result shown in Figure 5 can also be interpreted in a different way. We defined $N_{max}$ as the maximum number of users that could be supported so that the loss rate was below 3% and the delay was less than
5 ms. Using the same experimental settings, the value of $N_{max}$ for each corresponding value of available
bandwidth, $C_v$, was obtained from Figure 5 for: a) ns simulation, and b) analytical prediction. $N_{max}$ was
plotted against $C_v$ for these two cases in Figure 6 (solid and dashed lines). For comparison, we considered
a third case where the peak rate of 80 kbps was allocated to each flow (assuming all sources became active
simultaneously). $N_{max}$ in this case was the largest integer such that $N_{max} \times 80 \le C_v$ kbps (dotted line in
Figure 6). This line lay much lower than those of the previous two cases, which implied that peak rate allocation
was over-conservative and left resources under-utilized. For example, at 6.4 Mbps link bandwidth, one
could only support 80 users with peak rate allocation. But simulation showed that one could actually support up to 196 users, about 2.4 times as many (solid line in Figure 6). If we allocated bandwidth based on Eq.
(2a), we could support $N_{max}$ = 170, which was still more than double the $N_{max}$ with peak rate allocation.
Further details of this work can be found in [41].
1. ns is a discrete event simulator derived from the REAL simulator [43]. The ns development effort is now part of the ongoing VINT project. ns source code and documentation can be downloaded from http://www-mash.cs.berkeley.edu/ns/.
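The comparison between peak rate allocation and the analytical predictor can be sketched in Python as an admission rule: admit flows one at a time while the bandwidth estimate still fits in the link. The function names are hypothetical, and the mean and standard deviation used in the example (32 kbps and 39.2 kbps) are the per-flow values from the numerical example in Section 4.3.

```python
from math import sqrt
from statistics import NormalDist


def peak_rate_max_flows(link_kbps: float, peak_kbps: float) -> int:
    """Largest N with N * peak_kbps <= link_kbps."""
    return int(link_kbps // peak_kbps)


def gaussian_max_flows(link_kbps: float, mean_kbps: float,
                       sigma_kbps: float, max_loss: float) -> int:
    """Largest N with N*m + Q^{-1}(max_loss) * sigma * sqrt(N) <= link_kbps."""
    q = NormalDist().inv_cdf(1.0 - max_loss)
    n = 0
    # The estimate is increasing in N, so a linear scan finds the largest fit.
    while (n + 1) * mean_kbps + q * sigma_kbps * sqrt(n + 1) <= link_kbps:
        n += 1
    return n


if __name__ == "__main__":
    link = 6400.0  # 6.4 Mbps, as in the example above
    print("peak rate allocation:", peak_rate_max_flows(link, 80.0))
    print("Gaussian predictor:  ", gaussian_max_flows(link, 32.0, 39.2, 0.03))
```

For the 6.4 Mbps example the predictor admits roughly twice as many flows as peak rate allocation, in line with the figures quoted above; the exact count depends on how the Q-function value and the standard deviation are rounded.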
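[Figure 6: maximum number of users, Nmax (0 to 200), versus available bandwidth Cv. Curves: simulation result from ns (solid), analytical result (dashed), and peak rate allocation (dotted).]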
1. NetMeeting is a Windows-based Internet real-time conferencing tool that supports multi-point data conferencing, text chat, and point-to-point audio and video (http://www.microsoft.com/windows/netmeeting).
2. The Iceberg project seeks to construct a scalable service architecture for computer-telephony (both PSTN and wireless cellular infrastructure) integration based on Internet Protocol (IP) technology (http://iceberg.cs.berkeley.edu).
Figure 7. Testbed set up for trace collection, including the H.323 gateway, the H.323 gatekeeper, H.323 terminals running client
applications (e.g., PCs running NetMeeting) on an Ethernet, and a T1 line connecting to circuit-switched telephone lines (PSTN phones).
Figure 8. Timing information of audio packets at the sender, receiver, and output of playout buffer. (Three timelines: sender times $t_k^i$, receiver arrival times $a_k^i$, and receiver playout times $p_k^i$, for packets $i = 1, \ldots, n_k$ of talkspurt $k$; $D_k$ is the playout delay for the k-th talkspurt.)
This proves the existence of substantial time correlation in network delays. The next step is to design a
better delay adjustment algorithm using a better delay estimator based on the knowledge of network correlation, which has the following advantages over the existing approach:
In [36] and [37], the adaptive algorithm is based on estimating two statistics: the delay itself, and a
variational measure of the observed delays. Both estimates are in the form of
$$u_k^i = w\,u_k^{i-1} + (1 - w)\,d_k^i,$$
where $u_k^i$ and $u_k^{i-1}$ are the current and previous estimates and $d_k^i$ is the i-th delay sample. The weight, $w$, is chosen based on off-line sensitivity analysis using simulation, without considering the presence of delay correlation. Clearly, the accuracy of these estimates
can be greatly improved by dynamically choosing the value of $w$ based on the observed delay
correlation.
The previous approach [36] depends on a delay spike detection algorithm to decide whether the playout delay should be adapted, based on whether the delay spike spans multiple talkspurts. If we design
an algorithm based on network correlation, the decreasing correlation will automatically signal the end of a delay spike, and the playout delay can be adjusted accordingly. This can potentially
improve the efficiency of the delay adjustment algorithm.
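The fixed-weight estimator described above can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of [36] or [37] as published: the initialization from the first sample, the function names, and the safety factor of 4 applied to the variation estimate are assumptions made for this sketch.

```python
def playout_estimates(delays, weight):
    """Exponentially weighted estimates of delay and delay variation.

    delays: observed network delays of successive packets.
    weight: fixed weight given to the previous estimate (0 <= weight < 1).
    Returns a list of (delay_estimate, variation_estimate) pairs.
    """
    if not delays:
        return []
    u = float(delays[0])   # assumed initialization: first observed delay
    v = 0.0                # assumed initialization: no variation yet
    estimates = [(u, v)]
    for d in delays[1:]:
        u = weight * u + (1.0 - weight) * d
        v = weight * v + (1.0 - weight) * abs(u - d)
        estimates.append((u, v))
    return estimates


def playout_delay(u, v, safety=4.0):
    """Playout delay set to the delay estimate plus a multiple of the variation."""
    return u + safety * v


if __name__ == "__main__":
    trace = [50.0, 52.0, 51.0, 120.0, 110.0, 95.0, 60.0, 52.0]  # made-up delays (ms)
    for weight in (0.5, 0.9):
        u, v = playout_estimates(trace, weight)[-1]
        print(f"weight {weight}: playout delay {playout_delay(u, v):.1f} ms")
```

A large weight makes the estimate slow to follow a spike and slow to recover from it, which is the sensitivity that a correlation-driven choice of weight is meant to address.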
[Figure: empirical density f(RTTm) and empirical CCDF of the round-trip time RTTm (s), shown both unconditionally and conditioned on the previous sample RTTm-1 relative to 0.1611 s.]
Jitter is the variation in delay, which comes from queueing delays. In [55-56], researchers have defined
and derived different analytical definitions of, and bounds on, delay jitter. From my discussions with Ramesh
Nagarajan¹, we define the delay jitters as follows, using the notation shown in Figure 8:
1-jitter: $(a_k^{i+1} - a_k^{i}) - (t_k^{i+1} - t_k^{i})$
We assume that out-of-order delivery is taken care of. Note that T-jitter is the variable part of the queueing delay,
since the minimum round-trip time is subtracted off. On the other hand, the 1-jitter distribution has a much
smaller variation than T-jitter, since it is just the delay variation between adjacent packets. The next challenge is to evaluate the different jitter definitions to see which is the most appropriate and effective candidate to be used as the estimate for the playout delay adjustment algorithms.
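To make the two notions of jitter concrete, here is a small Python sketch. It rests on assumptions: 1-jitter is read as the change in delay between adjacent packets, and T-jitter is read as each packet's delay minus the minimum delay observed in the trace (a one-way stand-in for the minimum round-trip time mentioned above). The function names are hypothetical.

```python
def one_jitter(send_times, arrival_times):
    """Delay variation between adjacent packets.

    For packets i and i+1: (a[i+1] - a[i]) - (t[i+1] - t[i]).
    """
    if len(send_times) != len(arrival_times):
        raise ValueError("need one arrival time per send time")
    return [
        (arrival_times[i + 1] - arrival_times[i]) - (send_times[i + 1] - send_times[i])
        for i in range(len(send_times) - 1)
    ]


def t_jitter(send_times, arrival_times):
    """Variable part of the delay: each delay minus the minimum observed delay."""
    if len(send_times) != len(arrival_times):
        raise ValueError("need one arrival time per send time")
    delays = [a - t for t, a in zip(send_times, arrival_times)]
    if not delays:
        return []
    floor = min(delays)
    return [d - floor for d in delays]


if __name__ == "__main__":
    t = [0, 20, 40, 60]    # send times (ms), one packet every 20 ms
    a = [50, 75, 92, 140]  # made-up arrival times (ms)
    print("1-jitter:", one_jitter(t, a))
    print("T-jitter:", t_jitter(t, a))
```

Under these readings, 1-jitter is the first difference of the T-jitter sequence, which is consistent with the remark that it varies over a smaller range.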
6.3 Research Plan
Below we describe the specific research efforts that we will pursue.
We will use delay correlation information to get a better estimate of the future packet delay, and
design a new playout delay adjustment algorithm based on this new estimate. Since a delay spike
may or may not be completely contained within a talkspurt, the algorithm should be able to differentiate between the two cases and adjust the delays accordingly. One approach is to introduce an additional threshold, such that if the delay correlation is greater than this threshold, the algorithm will track the
delay and adjust the playout delay accordingly to compensate for the network delay. Otherwise, we
assume that the correlation is too small, and the algorithm will not adapt to the changes.
We will explore different playout delay algorithms using different jitter estimates as defined in Section 6.2.2. We will evaluate the performance of our algorithms based on simulations using the traces
we collected in Section 4.5, and compare the results to those of the existing algorithms.
We will perform sensitivity analysis on how the performance of the proposed algorithms is affected
by the nature of network delays. We consider a) the current Internet and b) the proposed service
architecture built on the Diff-Serv framework (Section 5), since they will potentially have very different
delay characteristics.
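The threshold idea in the first item can be sketched as follows in Python. This is only one possible reading: the correlation measure is taken to be the sample lag-1 autocorrelation over a window of recent delays, and the function names, window contents, and threshold value are assumptions made for illustration.

```python
def lag1_autocorrelation(delays):
    """Sample lag-1 autocorrelation of a window of delays (0.0 if constant)."""
    n = len(delays)
    if n < 2:
        raise ValueError("need at least two delay samples")
    mean = sum(delays) / n
    variance_sum = sum((d - mean) ** 2 for d in delays)
    if variance_sum == 0:
        return 0.0
    covariance_sum = sum(
        (delays[i] - mean) * (delays[i + 1] - mean) for i in range(n - 1)
    )
    return covariance_sum / variance_sum


def should_track(delays, threshold):
    """Track the network delay only when recent delays are strongly correlated."""
    return lag1_autocorrelation(delays) > threshold


if __name__ == "__main__":
    spike = [50, 60, 70, 80, 90, 100, 110, 120, 130, 140]   # steadily rising delays
    noise = [50, 90, 50, 90, 50, 90, 50, 90, 50, 90]        # alternating delays
    for name, window in (("spike", spike), ("noise", noise)):
        rho = lag1_autocorrelation(window)
        print(f"{name}: correlation {rho:.2f}, track = {should_track(window, 0.5)}")
```

In a full algorithm the window would slide with each arriving packet, and the threshold itself would be one of the parameters examined in the sensitivity analysis.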
We are interested in the following performance measures: a) loss vs. average playout delay, b) number
of gaps in a talkspurt, c) average length of gaps in a talkspurt, d) total length of gaps in a session, and e)
mean opinion score.
1. Ramesh
[Timeline, 1999 Q4 to 2001 Q1. Tasks shown: Capacity Planning, Trace Collection, Simulation Study, Diff-Serv Performance, Design, Simulation Study, Refinement.]
Acknowledgment
We would like to thank Dr. B. Lyles from Sprint Labs, and L. Salgarelli and Dr. R. Nagarajan from Lucent
Technologies, for sharing their expertise in this field. Special thanks go to Professor A. Joseph and
other ICEBERG group members at Berkeley for their constructive feedback and enlightening discussions.
References
[1] W. A. Montgomery, Techniques for packet voice synchronization, IEEE Journal on Selected Areas in Communications, vol. SAC-1, no. 6, pp. 1022-1028, December 1983.
[2] D. Clark, S. Shenker, and L. Zhang, Supporting real-time applications in an integrated services packet network: Architecture and mechanism, Proc. ACM SIGCOMM '92, pp. 14-26, 1992.
[3] F. Alvarez-Cuevas, M. Bertran, F. Oller, and J. M. Selga, Voice synchronization in packet switching networks, IEEE Network Magazine, 7(5):20-25, September 1993.
[4] J. DeTreville and D. Sincoskie, A distributed experimental communications system, IEEE Journal on Selected Areas in Communications, vol. SAC-1, no. 6, pp. 1070-1075, 1983.
[5] H. Schulzrinne, Voice communication across the Internet: A Network Voice Terminal, Technical report, Dept. of CS, University of Massachusetts, Amherst, MA, July 1992.
[6] V. Hardman, M. A. Sasse, M. J. Handley, and A. Watson, Reliable Audio for Use over the Internet, Proc. INET '95.
[7] R. Cox and P. Kroon, Low bit-rate speech coders for multimedia communications, IEEE Communications Magazine, pp. 34-40, Vol. 34, No.2. December 1996.
[8] J. Bolot, S. Fosse-Parisis, and D. Towsley, Adaptive FEC-Based Error Control for Interactive Audio in the Internet, IEEE Infocom '99, New York, USA, March 1999.
[9] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, RTP: A Transport Protocol for Real-Time Applications, IETF RFC 1889, January 1996.
[30] H. Adiseshu, G. Parulkar and T. Yavatkar, A State Management Protocol for IntServ, DiffServ and Label Switching, Proc. IEEE ICNP '98, pp. 272-81, October 1998.
[31] N. Seddigh, B. Nandy, P. Pieda, J. Hadi Salim and A. Chapman, Experimental Study of Assured Services in a Diffserv IP QoS Network, Proc. SPIE, pp. 217-30, 1998.
[32] J. Ibanez and K. Nichols, Preliminary Simulation Evaluation of an Assured Service, IETF Internet Draft <draft-ibanez-diffserv-assured-eval-00.txt>, August 1998.
[33] Y. Moret, S. Fdida, A Proportional Queue Control Mechanism to Provide Differentiated Services, ISCIS '98, pp. 17-24, October 1998.
[34] J. Bolot, End-to-end packet delay and loss behavior in the Internet, Proc. ACM Sigcomm '93, pp. 289-298, September 1993.
[35] D. Sanghi, O. Gudmundsson, A. Agrawala, and B. N. Jain, Experimental assessment of end-to-end behavior on Internet, Proc. IEEE Infocom '93, pp. 867-874, April 1993.
[36] R. Ramjee, J. Kurose, D. Towsley, and H. Schulzrinne, Adaptive playout mechanisms for packetized audio applications in wide-area networks, Proc. IEEE Infocom '94, vol. 2, pp. 680-8, April 1994.
[37] S. B. Moon, J. Kurose, and D. Towsley, Packet Audio Playout Delay Adjustment: Performance Bounds and Algorithms, ACM/Springer Multimedia Systems, vol. 6, pp. 17-28, January 1998.
[38] ITU-T Recommendation G.114, General Characteristics of International Telephone Connections and International Telephone Circuits: One-Way Transmission Time, February 1996.
[39] S. Kent and R. Atkinson, Security Architecture for the Internet Protocol, RFC 2401, November 1998.
[40] B. Gleeson, A. Lin, J. Heinanen, and G. Armitage, A Framework for IP Based Virtual Private Networks, Internet draft: draft-gleeson-vpn-framework-00.txt, September 1998.
[41] Chen-Nee Chuah and Randy Katz, Network Provisioning and Resource Management for IP Telephony, submitted to ICNP '99. Full paper can also be found in Report No. UCB//CSD-99-1061, CS Division, University of California, Berkeley, August 1999.
[42] H. Cramer, Mathematical Methods of Statistics, Princeton University Press, Princeton, N.J., 1946.
[43] S. Keshav, REAL: A Network Simulator, Computer Science Department Technical Report 88/472, UC Berkeley, 1988.
[44] ITU-T Study Group 16, ITU-T Recommendation H.323 - Packet Based Multimedia Communications Systems, ITU-T H-Series Recommendations H.323, Audiovisual and Multimedia Systems, February 1998.
[45] A. Demers, S. Keshav, and S. Shenker, Analysis and Simulation of a Fair Queueing Algorithm, Journal of InterNetworking Research and Experiment, vol. 1, pp. 3-26, October 1990.
[46] A. K. Parekh and R. Gallager, A Generalized Processor Sharing Approach to Flow Control in Integrated Services Networks - The Single Node Case, IEEE/ACM Transactions on Networking, vol. 1, no. 3, pp. 344-357, June 1993.
[47] H. Zhang, Service Disciplines for Guaranteed Performance Service in Packet-Switching Networks, Proc. IEEE, vol. 83, no. 10, October 1995.
[48] L. Breslau, S. Shenker, Is Service Priority Useful in Networks? Proc. ACM Sigmetrics, pp. 66-77, June 1998.
[49] L. Breslau, S. Shenker, Best-Effort versus Reservations: A Simple Comparative Analysis, Proc. ACM Sigcomm '98, pp. 3-16, October 1998.
[50] F. Alvarez-Cuevas, M. Bertran, F. Oller and J. M. Selga, Voice Synchronization in Packet Switching Networks, IEEE Network Magazine, 7(5), pp. 20-25, September 1993.
[51] J. Rosenberg, Slides on Internet Loss Measurements, presented to the Voice on the Net Conference, Boston MA, Sep. 25, 1997 (http://www.cs.columbia.edu/~jdrosen/).
[52] I. Wakeman, A. Ghosh, J. Crowcroft, V. Jacobson, S. Floyd, Implementing real time packet forwarding policies using Streams, Proc. USENIX Technical Conference '95, pp. 71-82, 1995.
[53] S. Floyd and V. Jacobson, Random Early Detection Gateways for Congestion Avoidance, IEEE/ACM Transactions on Networking, vol. 1, no. 4, pp. 397-413, August 1993.
[54] D. Clark and J. Wroclawski, An Approach to Service Allocation in the Internet, Internet draft, draft-clark-different-svcalloc-00.txt, July 1997.
[55] Z. Wang and J. Crowcroft, Analysis of burstiness and jitter in real-time communications, ACM Sigcomm '93, pp. 13-19, September 1993.
[56] A. Banerjea, D. Ferrari, B. A. Mah, M. Moran, D. C. Verma, H. Zhang, The Tenet Real-time Protocol Suite: Design, Implementation, and Experiences, IEEE/ACM Transactions on Networking, vol. 4, no. 1, pp. 1-10, February 1996.
Diff-Serv Architecture
Diff-Serv uses edge-based packet marking and per-class queue management to support multiple service level agreements (SLAs) over an IP-based network. An SLA is a contract between a customer and a network provider for a service, and may include packet classification and re-marking rules, traffic profiles,
actions applied to traffic streams that are in or out of profile, reliability, security, routing constraints, pricing and
billing, etc. Here is a quick overview of several major Diff-Serv components.
1. Edge Mechanisms
The following functionality is required from the Diff-Serv boundary routers, which reside at the ingress
and egress points to and from the Diff-Serv IP core network:
Classification: Edge components classify packets and set the DS field in the packet header based on SLAs.
Monitoring: A monitoring interface collects statistics regarding traffic carried at various Diff-Serv service levels, which are important for accounting purposes and for traffic policing.
Traffic policing and marking: Edge routers can compare the statistics collected against the SLAs,
which specify an upper bound on the amount of traffic that senders negotiate to send in the specified
service classes. Packets that are within the profile are in-profile, and the excess packets are out-profile. The edge routers can choose either to discard the out-profile packets or to mark them
accordingly before forwarding them into the network [20].
Traffic Engineering: If necessary, edge routers interact with the Diff-Serv admission control component in order to coordinate resource requests. Admission policy is subject to the constraints of the
local resources and the PHBs that are requested.
Traffic Shaping: Ingress routers may delay certain packets to make a particular flow compliant with
a traffic profile. Egress routers may shape the behavior of aggregate traffic before it is submitted to a subsequent provider's network.
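The policing and marking step can be illustrated with a small Python sketch. It assumes the traffic profile is expressed as a token bucket (a committed rate plus a burst allowance), which is one common way to encode a profile but is not specified in the text above; the class name, parameters, and the "in"/"out" labels are hypothetical.

```python
class TokenBucketMarker:
    """Mark packets as in-profile or out-profile against a token-bucket profile.

    Assumption for illustration: the SLA profile is a committed rate (bits/s)
    and a burst size (bytes). Real edge routers may use other meters.
    """

    def __init__(self, rate_bps: float, burst_bytes: float):
        self.rate = rate_bps / 8.0   # token fill rate in bytes per second
        self.burst = burst_bytes     # bucket depth in bytes
        self.tokens = burst_bytes    # bucket starts full
        self.last = 0.0              # time of the previous packet, in seconds

    def mark(self, now: float, size_bytes: int) -> str:
        # Refill tokens for the time elapsed since the last packet, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now

        # Enough tokens: the packet conforms to the profile and consumes tokens.
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return "in"
        # Otherwise the packet is excess traffic; the caller may drop or re-mark it.
        return "out"
```

A caller that polices would discard packets marked "out", while a caller that only marks would set the corresponding DS field value and forward them.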
2. Packet Classifier
Classification is a necessary function for Diff-Serv routers that treat certain traffic differently from
other traffic. It can be implemented at various degrees of granularity (for each source-destination pair, or
for a specific service class). Each Diff-Serv router must have a packet classifier that selects packets based
on the DS-byte and decides which queues to insert these packets into.
3. Scheduler and Queue Management
Scheduling policy is the part of queue management that decides which packet to transmit next. It can
choose a packet to transmit from a single queue on a first-come-first-served basis, or from multiple queues, e.g.,
Class Based Queueing (CBQ) [52], in a modified round-robin fashion. For example, in CBQ with priority
scheduling, the lower priority queues receive service only when all the higher priority queues are
empty. Another commonly used scheduler is Weighted Fair Queueing (WFQ) ([45]-[47]).
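The priority rule described above can be sketched in a few lines of Python. This is only the strict-priority selection step, not a full CBQ implementation (which also includes link sharing); the class and method names are hypothetical.

```python
from collections import deque


class PriorityScheduler:
    """Strict-priority packet selection across several FIFO queues."""

    def __init__(self, num_classes: int):
        # Index 0 is the highest priority class.
        self.queues = [deque() for _ in range(num_classes)]

    def enqueue(self, priority: int, packet) -> None:
        self.queues[priority].append(packet)

    def dequeue(self):
        # Scan from highest to lowest priority; a lower priority queue is
        # served only when every higher priority queue is empty.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None  # all queues are empty
```

With voice in class 0 and data in class 1, queued voice packets always leave before any data packet, which is why strict priority can starve lower classes under sustained high-priority load.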
The buffer management scheme is responsible for putting packets in a queue as they arrive and deciding
which packets to drop when buffer memory is exhausted. Examples include Drop-tail, Drop-front, Drop
from longest queue, and Random Early Detection (RED) [53]. RIO [54] extends RED to handle two classes
of packets and includes two sets of drop parameters, one for In packets and one for Out packets.
Service discrimination between the two classes can be achieved either by using two different thresholds or
by using different dropping policies. For instance, Assured Service can be implemented by sending all packets
to an Assured Queue managed by RED with In and Out (RIO).
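To make the two-parameter-set idea concrete, here is a minimal Python sketch of the drop decision. It uses the basic linear RED drop curve and omits the averaging filter and the packet-count correction of full RED. The choice to evaluate In packets against the average In-packet queue and Out packets against the average total queue follows the way RIO is commonly described; it is an assumption here, as are all names and parameter values.

```python
from dataclasses import dataclass


@dataclass
class RedParams:
    min_th: float  # below this average queue length, never drop
    max_th: float  # at or above this average queue length, always drop
    max_p: float   # drop probability reached just below max_th


def red_drop_prob(avg_queue: float, p: RedParams) -> float:
    """Basic RED curve: 0 below min_th, linear up to max_p, then 1."""
    if avg_queue < p.min_th:
        return 0.0
    if avg_queue >= p.max_th:
        return 1.0
    return p.max_p * (avg_queue - p.min_th) / (p.max_th - p.min_th)


def rio_drop_prob(is_in: bool, avg_in: float, avg_total: float,
                  in_params: RedParams, out_params: RedParams) -> float:
    """Pick the drop curve by packet marking (assumed RIO behavior)."""
    if is_in:
        # In packets are judged against the In-only queue with lenient parameters.
        return red_drop_prob(avg_in, in_params)
    # Out packets are judged against the total queue with aggressive parameters.
    return red_drop_prob(avg_total, out_params)
```

Choosing lower thresholds and a higher `max_p` for Out packets means they are dropped earlier and more often than In packets as the queue builds up.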
$$Y(t) = \sum_{i=1}^{N_A} X_i(t) \qquad (1)$$
where $N_A$ is the number of actual voice calls in progress. The maximum possible value of $N_A$ is $N$, which is
the maximum number of possible voice sources in the network. For example, $N$ in a VPN can be the total
number of telephone handsets or other end devices that are capable of generating voice traffic.
At any particular time instant, say $t = T$, the $X_i(T)$ are just discrete random variables. We assume
that the random process $X_i(t)$ is ergodic, i.e., time averages equal the ensemble averages, and stationary, i.e.,
$X_i(T)$ has the same statistics at any time instant $T$. Since $Y(t)$ is the sum of i.i.d. stationary processes,
$Y(t)$ is also stationary. At any time instant $t = T$, $Y(T)$ is just the sum of i.i.d. random variables. For
simplicity, we omit the time dependence and use the notations $X_i$, $Y$ instead. Assume the stationary distribution of $X_i$ is:
$$P(X_i = x) = \begin{cases} \dfrac{?}{?\,+\,?} & \text{when } x = R \\[4pt] \dfrac{?}{?\,+\,?} & \text{when } x = 0 \\[4pt] 0 & \text{otherwise.} \end{cases} \qquad (2)$$
We are interested in estimating the minimum capacity $C_v$ that needs to be allocated to VoIP so that the
loss rate per flow is less than a target bound, written $\epsilon_{\max}$ below. Assume the scheduler/server is work conserving and non-preemptive. If the
queues are bufferless, then losses occur when the sum of arrival rates is greater than the rate at which the
queue is served: $Y > C_v$. We make the following approximations:
Consider the worst case where all the potential voice users have calls in progress, i.e., $N_A = N$.
As soon as the aggregate rate $Y$ exceeds the server rate, information (in bits) from some connections
is lost. Let us say this happens with probability $P(Y > C_v) = \epsilon$. The losses can be shared by several
connections, or in the worst case, the losses may happen to only one connection. Assume that
whenever losses happen, the worst loss rate suffered by an individual voice flow is not more than
the total loss rate, $\epsilon$.
To achieve satisfactory voice quality, the loss rate per source has to be bounded: $\epsilon \le \epsilon_{\max}$.
Therefore, the following worst case constraint has to be satisfied:
$$P(Y > C_v) \le \epsilon_{\max} \qquad (3)$$
$$P\left(\sum_{i=1}^{N} X_i > C_v\right) \le \epsilon_{\max} \qquad (4)$$
When $N$ gets large, $Y$ tends to have a normal (Gaussian) distribution under the Central Limit Theorem [42]. In short, the Central Limit Theorem states that the sum of a large number of independent observations from the same distribution tends to have a normal distribution, and this holds for any underlying distribution with finite variance.
Given that the $X_i$ are i.i.d. with mean $m$ and variance $\sigma^2$ (these can easily be determined from
Eq. (2)), and $Y$ is the sum of the $X_i$, the mean and variance of $Y$ are simply the sum of the means and the sum of
the variances of the $X_i$, respectively. Therefore $Y$ is approximately normally distributed with mean $Nm$ and variance $N\sigma^2$,
i.e., $Y \sim N(Nm, N\sigma^2)$. To solve Eq. (3), one can use the well-known Q-function:
$$\begin{aligned} P(Y > C_v) \le \epsilon_{\max} &\;\Rightarrow\; P\left(\frac{Y - Nm}{\sigma\sqrt{N}} > \frac{C_v - Nm}{\sigma\sqrt{N}}\right) \le \epsilon_{\max} \\ &\;\Rightarrow\; P\left(Z \ge \frac{C_v - Nm}{\sigma\sqrt{N}}\right) \le \epsilon_{\max} \quad (\text{since } Z \text{ is continuous}) \\ &\;\Rightarrow\; Q\left(\frac{C_v - Nm}{\sigma\sqrt{N}}\right) \le \epsilon_{\max}, \end{aligned} \qquad (5)$$
where $Z$ is a normalized zero-mean, unit-variance normal random variable, and
$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt \qquad (6)$$
Using the inverse Q-table, one can determine the value of $C_v$ for any given loss bound $\epsilon_{\max}$:
$$C_v \ge Nm + Q^{-1}(\epsilon_{\max})\, \sigma\sqrt{N} \qquad (7)$$
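As a numerical illustration of Eq. (7), the Python sketch below computes the capacity bound for on-off sources. It assumes each source sends at peak rate R with some probability and is silent otherwise, so that the mean is R times that probability and the variance is R squared times p(1 - p). The parameter `p_on` stands for the on-state probability of Eq. (2) and is a hypothetical name; the inverse Q-function is obtained from the standard normal inverse CDF rather than from a table.

```python
from math import sqrt
from statistics import NormalDist


def q_inverse(p: float) -> float:
    """Inverse of Q(x) = P(Z > x) for a standard normal Z; requires 0 < p < 1."""
    return NormalDist().inv_cdf(1.0 - p)


def voip_capacity(n_sources: int, peak_rate: float, p_on: float, eps_max: float) -> float:
    """Lower bound on C_v from Eq. (7) under the Gaussian approximation.

    peak_rate is R (any rate unit; the result uses the same unit).
    p_on is the assumed probability that a source is sending at rate R.
    eps_max is the maximum tolerable loss probability.
    """
    m = peak_rate * p_on                            # mean rate of one source
    sigma = peak_rate * sqrt(p_on * (1.0 - p_on))   # standard deviation of one source
    return n_sources * m + q_inverse(eps_max) * sigma * sqrt(n_sources)
```

For example, `voip_capacity(100, 64.0, 0.4, 0.01)` evaluates the bound for 100 sources with an illustrative 64 kb/s peak rate; the result lies between the aggregate mean rate (N m) and the aggregate peak rate (N R), and the gap to the peak rate is the statistical multiplexing gain. The approximation is only meaningful when N is large enough for the Central Limit Theorem to apply.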