Effect of Database Server Arrangement to the Performance of Load Balancing Systems
1 Introduction
With the popularity of the Internet, the number of Internet users is increasing rapidly.
How to create attractive web pages and avoid server overloads are important issues for
administrators. A popular web site needs powerful hardware to support its services, but even if the performance of a single server is improved, the requests for the web site may increase even more dramatically. Thus, load balancing architectures are probably the most appropriate solution for these kinds of web sites [1,2]. In load balancing architectures,
the administrator can easily add or remove the backend servers when needed. The
flexibility and availability obtained from the load balancing architecture can improve
the overall performance of the web site.
Most popular web sites create dynamic web pages for their clients. The contents of
dynamic web pages are always retrieved from the database. All of the data are stored in
the database server. When a client requests a web page whose data are stored in the database, the web server needs to generate the contents by retrieving the required data from the database, and then responds with the web page to the client. Since the
A. Hua and S.-L. Chang (Eds.): ICA3PP 2009, LNCS 5574, pp. 146–155, 2009.
© Springer-Verlag Berlin Heidelberg 2009
data are stored in the database server, the web pages can be diverse, and the web sites
can provide real time information to clients.
Generally speaking, static web pages require more I/O processing, while dynamic web pages require more CPU resources. In a load-balanced web server architecture with multiple backend web servers, the arrangement of the database server is an important issue for alleviating the overload caused by dynamic web pages. In this paper, we discuss two kinds of arrangement for database servers: centralized and decentralized [3,4]. In the centralized database arrangement architecture, the database server is a stand-alone server which only serves the database function for all the
backend web servers. In this architecture, the web server retrieves the contents from the
centralized database server and generates the dynamic contents, which it then transfers
to the clients. The other architecture is the decentralized database arrangement. In this architecture, the backend servers act as both the web server and the database server. The data are stored in each backend server. When clients request dynamic web pages, the web server retrieves the data from itself because it is also the database server.
The web server generates the dynamic web pages and responds to the clients.
In this paper, we will discuss the performance difference between these two kinds of arrangement under various load conditions. Any load balancing algorithm, such as Round Robin, least connection, or hash, can be used in the load balancing system. Our focus is on the comparison of these two kinds of arrangement, making the load balancing algorithm an unimportant issue. Thus, in our experimental environment, we use the simplest algorithm, Round Robin, as our load balancing algorithm. The backend servers can be homogeneous or heterogeneous. In a homogeneous environment, all the backend servers have the same hardware specifications; that is, they have the same processing power. In a heterogeneous environment, the hardware specifications, and hence the processing powers, are different. In order to avoid the influence of this factor, our environment uses homogeneous backend servers. We use some clients to issue a large amount of requests to the load balancing system to evaluate the performance of both the centralized and decentralized arrangements [5].
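The Round Robin selection used in our environment can be sketched as follows; the class and server names are illustrative only, not part of our implementation:

```python
class RoundRobinDispatcher:
    """Minimal Round Robin selection over a fixed pool of backend servers."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.index = 0  # position of the next server to pick

    def pick(self):
        server = self.servers[self.index]
        self.index = (self.index + 1) % len(self.servers)
        return server

# Hypothetical homogeneous pool of four backend web servers.
pool = RoundRobinDispatcher(["web1", "web2", "web3", "web4"])
picks = [pool.pick() for _ in range(6)]
# Requests cycle through the pool: web1, web2, web3, web4, web1, web2
```

Because every backend has the same processing power in our homogeneous setup, this simple rotation distributes requests evenly without tracking server state.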
This paper is organized as follows: Section 2 introduces related work on load balancing systems. Section 3 addresses the system architecture and experimental environment. Section 4 describes the experimental results. Section 5 concludes the paper.
2 Related Works
There are different kinds of load balancing architectures, including hardware-based,
software-based, and hybrid architectures [6]. Different load balancing architectures can
be used in different environments. We will discuss some load balancing architectures in
this section.
In the DNS-based architecture, a number of backend servers work together as a cluster of servers. An authoritative DNS server is set up as the cluster DNS server. This distributed web server architecture uses request routing mechanisms on the cluster side [7,8].
In this architecture, a centralized dispatcher receives all incoming requests and dis-
tributes them among the web server nodes through the HTTP redirection mechanism
[9]. The dispatcher redirects a request by specifying the redirection status code [10] in
the response, indicating in its header the server address where the client can get the
desired document. Such redirection is largely transparent; at most, users might notice
an increased response time. Unlike most dispatcher-based solutions, HTTP redirection
does not require IP address modification of packets reaching or leaving the web server
system. HTTP redirection can be implemented through one of the two techniques de-
scribed below.
Server-state-based dispatching is used in the Distributed Server Groups architecture
[11]. This approach adds new methods to the HTTP protocol to administer the web system and exchange messages between the dispatcher and the servers. Since the dispatcher must be
aware of the server loading, each server periodically reports the number of processes in
its run queue and the number of received requests per second. The dispatcher then
selects the least-loaded server.
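Server-state-based selection can be sketched as below; the report format and the weighting of the two load metrics are assumptions made for illustration, not taken from [11]:

```python
def least_loaded(reports):
    """Pick the server with the lightest reported load.

    `reports` maps server name -> (run_queue_length, requests_per_second),
    the two metrics each server periodically reports to the dispatcher.
    The combined score below is a hypothetical weighting.
    """
    def load(metrics):
        run_queue, req_rate = metrics
        return run_queue + 0.01 * req_rate  # illustrative weighting only
    return min(reports, key=lambda s: load(reports[s]))

# Example reports: web1 scores 4.2, web2 scores 3.0, web3 scores 2.9.
reports = {"web1": (3, 120), "web2": (1, 200), "web3": (2, 90)}
best = least_loaded(reports)
# best == "web3", the server with the smallest combined score
```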
Location-based dispatching is used in the Cisco Systems’ Distributed Director [12]
appliance. This provides two dispatching modes. The first applies the DNS-based
approach with client and server state information. The other uses the HTTP redirection.
The Distributed Director estimates the proximity between the client and each server node, as well as node availability, using the algorithms that apply to the DNS-based solution. Client requests are redirected to the server evaluated as most suitable for each request at a given time.
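The redirection mechanism described above can be sketched as a minimal dispatcher; the backend addresses and the Round Robin pick are illustrative assumptions, not any vendor's implementation:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from itertools import cycle

# Hypothetical backend addresses; in a real deployment these would be
# the web server nodes behind the dispatcher.
BACKENDS = cycle(["http://10.0.0.1", "http://10.0.0.2"])

def choose_backend(pool):
    """Any dispatching policy can plug in here; Round Robin for brevity."""
    return next(pool)

class RedirectDispatcher(BaseHTTPRequestHandler):
    def do_GET(self):
        # A redirection status code tells the client to reissue the
        # request at the address in the Location header, so the
        # dispatcher never rewrites IP addresses in packets.
        self.send_response(302)
        self.send_header("Location", choose_backend(BACKENDS) + self.path)
        self.end_headers()

# To run the dispatcher:
#   HTTPServer(("", 8080), RedirectDispatcher).serve_forever()
```

The extra round trip introduced by the redirect is the "increased response time" users might notice.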
3 System Architecture and Experimental Environment
In the centralized arrangement architecture, the web servers and the database server are connected to an Ethernet switch. The web servers are dedicated to serving the web function, and the database server provides the database function. To prevent the database server from becoming the bottleneck of the entire system, we install the database on powerful hardware [13,14]. There is a log server which is not part of the load balancing system and is used to collect the experimental data for further analysis. The dispatcher dispatches the client requests to the backend web servers according to the Round Robin algorithm. When the client issues a request to the dispatcher, the dispatcher selects a web server and redirects the request to it. Our focus is on the web server retrieving contents from the database to generate the dynamic web pages and responding to the client. The architecture for the centralized database arrangement is shown in Fig. 1.
In the decentralized arrangement architecture, the backend server acts as both web
server and database server. Every backend server has a copy of the contents stored in its
own database. When the client issues a request to the dispatcher, the dispatcher selects a
backend web server and redirects the request to that server. When the web server wants
to generate the dynamic web pages, it retrieves the contents from its own database. The
retrieval of data does not need network transmission time, so theoretically the performance of the decentralized arrangement should be better than that of the centralized arrangement. This is true, however, only when the number of requests is below a certain threshold. If the amount of client requests exceeds this threshold, the performance of the centralized arrangement is better than that of the decentralized arrangement. The experimental results in Section 4 illustrate this situation. The architecture for the decentralized database arrangement is shown in Fig. 2.
The comparisons between these two kinds of arrangement are shown in Table 1.
Each architecture has its own advantages and disadvantages. In the centralized arrangement, only one database server license is required, while in the decentralized case with N backend servers, we need N database server licenses. In terms of maintenance, because the data in the centralized arrangement are stored in only one database server, the administrator needs to maintain only one copy of the contents; in the decentralized arrangement, however, the administrator must maintain N copies of the contents, one for each backend server. On the other hand, if N backend
servers are required, we need only N servers in the decentralized case and N+1 servers
in the centralized case. Furthermore, the centralized arrangement will incur a single
point of failure while the decentralized arrangement will not. The comparisons and
performance evaluations will be shown in Section 4.
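The counts in Table 1 can be stated as a small calculation (the dispatcher and log server are left out of the counts; the function name is ours, for illustration only):

```python
def resource_counts(n_backends):
    """Servers and database licenses needed by each arrangement
    for N backend web servers."""
    centralized = {"servers": n_backends + 1,   # N web servers + 1 DB server
                   "db_licenses": 1,
                   "content_copies": 1,
                   "single_point_of_failure": True}
    decentralized = {"servers": n_backends,     # each backend is web + DB
                     "db_licenses": n_backends,
                     "content_copies": n_backends,
                     "single_point_of_failure": False}
    return centralized, decentralized

cen, dec = resource_counts(4)
# centralized: 5 servers but 1 license; decentralized: 4 servers but 4 licenses
```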
Fig. 2. The decentralized database arrangement: the client sends a request to the dispatcher, which redirects it to a backend server acting as both web server and database server; that backend returns the response to the client.
As shown in Fig. 3, there are four modules in our experiment setup: dispatcher, web server, database server, and clients. A log server is set up to collect the experimental data. In order to issue the requests at the same time, the clients must first be synchronized using the Network Time Protocol. After all the clients are synchronized, they issue requests for dynamic contents at a pre-determined time. The web server then retrieves contents from the database server, in either the centralized or the decentralized arrangement. The log server records the start and end time of each request. When the experiment finishes, all the records are stored in the log server. We can analyze the experimental results from these records to discover which arrangement achieves better performance.
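The client-side measurement procedure above can be sketched as follows; the transport is a stand-in function, not our actual HTTP client, and the names are illustrative:

```python
import time

def timed_request(send_request, url):
    """Issue one request and return (start, end, elapsed) for the log server.

    `send_request` stands in for the real HTTP client, so the timing
    logic is independent of the transport.
    """
    start = time.perf_counter()
    send_request(url)
    end = time.perf_counter()
    return start, end, end - start

def run_client(send_request, url, start_at, n_requests):
    """Wait until the NTP-synchronized start time, then issue requests
    and collect one timing record per request."""
    while time.time() < start_at:
        time.sleep(0.001)  # poll until the pre-determined start time
    return [timed_request(send_request, url) for _ in range(n_requests)]

# Example with a dummy transport instead of a real load balancing system:
records = run_client(lambda url: time.sleep(0.001), "http://lb.example/", time.time(), 3)
avg_response = sum(r[2] for r in records) / len(records)
```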
Table 2 lists our experimental hardware. Our focus is to find the best database arrangement in the load balancing system. Thus, we use four web servers to share the load. In the centralized arrangement, only one database server is needed. To
avoid the database server becoming the bottleneck of the load balancing system, we use
powerful hardware to serve as the database server. We use twenty clients in our experiments to generate the request load for the load balancing system.
4 Experimental Results
In this section, we will describe and analyze the experimental results. The performance
comparison focuses on the average response time. The clients issue requests and record
the response time. The more requests the clients issue, the longer the average response
time. Furthermore, we want to compare the performance of the centralized and decentralized arrangements as the load grows. When the requests increase to a certain number, such as 880 in the three web server experiment and 920 in the four web server experiment, the performance of the centralized arrangement becomes better than that of the decentralized arrangement.
The reason for this phenomenon is that when the requests are fewer than a threshold, the loading of both the web servers and the database server is light. In the centralized arrangement there are network communications between the web servers and the database server, so the average response time is higher than that of the decentralized arrangement. But when the requests from clients exceed the threshold, the loading of both the web servers and the database server becomes heavy. In the decentralized arrangement, the backend server acts as both the web server and the database server, so its loading increases more rapidly. The performance of the centralized arrangement is better than that of the decentralized arrangement when there is a huge amount of requests, because the dedicated database server can share the load of the web servers.
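This crossover can be illustrated with a toy linear model; the coefficients below are made up purely for illustration and are not our measured values:

```python
def response_time(requests, base, per_request):
    """Toy linear model of average response time under a given load."""
    return base + per_request * requests

# Hypothetical coefficients: the centralized arrangement pays a fixed
# network cost (higher base) but its per-request cost grows more slowly
# because database work is offloaded to a dedicated server.
CEN_BASE, CEN_SLOPE = 5.0, 0.010
DEC_BASE, DEC_SLOPE = 1.0, 0.015

# Crossover where the two lines meet: base_c + s_c*r = base_d + s_d*r
crossover = (CEN_BASE - DEC_BASE) / (DEC_SLOPE - CEN_SLOPE)
# crossover is about 800 requests with these made-up numbers

# Below the crossover the decentralized arrangement wins; above it,
# the centralized arrangement wins.
assert response_time(500, DEC_BASE, DEC_SLOPE) < response_time(500, CEN_BASE, CEN_SLOPE)
assert response_time(1000, CEN_BASE, CEN_SLOPE) < response_time(1000, DEC_BASE, DEC_SLOPE)
```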
5 Conclusion
Web sites with dynamic contents are more attractive than those with static contents. In
order to generate the dynamic web pages, the web server must retrieve the contents
from the database server. The arrangement of the database server acts as an important
factor that influences the performance of the load balancing system. The centralized
arrangement and decentralized arrangement architectures are compared in this paper.
From the experimental results, we can see that when the requests from clients are under a threshold, the performance of the decentralized arrangement is better than that of the centralized arrangement because of the network communication overhead in the centralized case. But when the number of requests is larger than the threshold, the performance of the centralized arrangement becomes better because the database server shares the load of the web servers. Based on these experimental results, the administrator can choose a suitable arrangement for his load balancing system.
References
1. Pao, T.-L., Chen, J.-B.: Capacity Based Load Balancing Scheme for Fair Request Dis-
patching. Asian Journal of Information Technology 5(11), 1284–1290 (2006)
2. Pao, T.-L., Chen, J.-B.: The Scalability of Heterogeneous Dispatcher-Based Web Server
Load Balancing Architecture. Parallel and Distributed Computing, Applications and
Technologies (PDCAT 2006), 229–233 (December 2006)
3. Choi, E., Lim, Y., Min, D.: Performance Comparison of Various Web Cluster Architectures.
In: Baik, D.-K. (ed.) AsiaSim 2004. LNCS, vol. 3398, pp. 617–624. Springer, Heidelberg
(2005)
4. Guo, J., Bhuyan, L.N.: Load Balancing in a Cluster-Based Web Server for Multimedia
Applications. IEEE Trans. Parallel and Distributed Systems 17(11), 1321–1334 (2006)
5. Li, C., Peng, G., Gopalan, K., Chiueh, T.-c.: Performance Guarantees for Cluster-Based Internet Services. In: Proc. 23rd International Conference on Distributed Computing Systems, pp. 378–385 (May 2003)
6. Yu, P.S., Cardellini, V., Colajanni, M.: Dynamic Load Balancing on Web-server Systems. IEEE Internet Computing, 28–39 (May/June 1999)
7. Colajanni, M., Yu, P.S.: Adaptive TTL Schemes for Load Balancing of Distributed Web Servers. ACM SIGMETRICS Performance Evaluation Review 25(2), 36–42 (1997)
8. Yu, P.S., Colajanni, M., Cardellini, V.: Dynamic Load Balancing in Geographically Distributed Heterogeneous Web Servers. In: IEEE International Conference on Distributed Computing Systems (ICDCS 1998), Amsterdam, Netherlands, pp. 26–29 (May 1998)
9. Kopparapu, C.: Load Balancing Servers, Firewalls, and Caches. Wiley Computer, Chich-
ester (2002)
10. W3C World Wide Web Consortium, http://www.w3c.org
11. Garland, M., Grassia, S., Monroe, R., Puri, S.: Implementing Distributed Server Groups for the World Wide Web. Technical Report CMU-CS-95-114 (January 1995)
12. Cisco System, http://www.cisco.com
13. Haney, D., Madsen, K.S.: Load-balancing for MySQL. Kobenhavns Universitet (2003)
14. Hellerstein, J.M., Stonebraker, M., Hamilton, J.: Architecture of a Database System. Foundations and Trends in Databases 1(2), 141–259 (October 2007)