Cluster Computing


CLUSTER COMPUTING

WHITE PAPER

Author: Deepika Baghla.


WHITE PAPER Cluster Computing

Abstract
In computing, clustering is the use of multiple computers, typically PCs or UNIX
workstations, multiple storage devices, and redundant interconnections, to form what appears to
users as a single highly available system. Cluster computing can be used for load balancing as
well as for high availability. One of the main ideas of cluster computing is that, to the outside
world, the cluster appears to be a single system.

A common use of cluster computing is to load balance traffic on high-traffic Web sites. A
Web page request is sent to a "manager" server, which then determines which of several identical
or very similar Web servers to forward the request to for handling. Having a Web farm (as such a
configuration is sometimes called) allows traffic to be handled more quickly.

 Wipro Technologies Page 2 of 7



Table of Contents

Introduction:
1. Definition:
2. Cluster Categorization:
2.1. High-availability (HA) clusters:
2.2. Load-balancing clusters:
2.3. High-performance computing (HPC) clusters
2.4. High Throughput Clusters
2.5. Grid computing
3. Architecture:
4. Applications:
5. Comparison:
6. Advantages:
7. Conclusion and Future Scope:
References:


Introduction:
Today, a wide range of applications demand more computing power. Although single-processor
PCs and workstations now provide very fast processing, many applications still need the faster
execution that multiple processors achieve by working concurrently. Costs are falling as well:
networked clusters of commodity PCs and workstations, built from off-the-shelf processors and
communication platforms such as Fast Ethernet and Gigabit Ethernet, are becoming increasingly
cost-effective and popular. This approach, known as cluster computing, combines concepts and
technologies from the Internet, supercomputing applications, and distributed and parallel
processing.

1. Definition:
A cluster is a collection of connected, independent computers that work together to solve a
problem. The whole cluster can work on a single problem at once, or portions of the cluster
can work on different problems at the same time.
The constituent computer nodes are commercial off-the-shelf (COTS) machines, are capable of
full independent operation as is, and are of a type ordinarily employed individually for
standalone mainstream workloads and applications. The nodes may incorporate a single
microprocessor or multiple microprocessors in a symmetric multiprocessor (SMP) configuration.
The interconnection network employs COTS local area network (LAN) or system area network
(SAN) technology, which may be organized as a hierarchy or as multiple separate network
structures. A cluster network is dedicated to the integration of the cluster compute nodes and
is separate from the cluster’s external environment.

2. Cluster Categorization:

2.1. High-availability (HA) clusters:


High-availability clusters (also known as failover clusters) are implemented primarily to
improve the availability of the services that the cluster provides.
They operate by having redundant nodes, which take over the service when system
components fail. They are useful for mission-critical applications.
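
The failover behaviour described above can be illustrated with a minimal Python sketch. It is
not taken from any particular clustering product: the Node and FailoverCluster classes, the
node names, and the "healthy" flag are all hypothetical, and a real cluster would detect
failures with heartbeats or health checks rather than a flag set by hand.

```python
class Node:
    """A hypothetical cluster node that can serve requests while it is healthy."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def serve(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class FailoverCluster:
    """Sends each request to the first healthy node, in priority order."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def serve(self, request):
        for node in self.nodes:
            try:
                return node.serve(request)
            except ConnectionError:
                continue  # this node failed, so try the next redundant node
        raise RuntimeError("no healthy node available")


primary, standby = Node("primary"), Node("standby")
cluster = FailoverCluster([primary, standby])
print(cluster.serve("req-1"))  # primary handled req-1
primary.healthy = False        # simulate a component failure
print(cluster.serve("req-2"))  # standby handled req-2
```

The caller only ever talks to the cluster object, so the switch from the primary to the
standby node is invisible to it.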

2.2. Load-balancing clusters:


Load-balancing clusters operate by having all workload come through one or more load-
balancing front ends, which then distribute it to a collection of back end servers. Although they
are primarily implemented for improved performance, they commonly include high-availability
features as well. Such a cluster of computers is sometimes referred to as a server farm.
They are useful for Web servers and mail servers.
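
As a rough illustration of a load-balancing front end, the Python sketch below hands incoming
requests to back-end servers in round-robin order. Round-robin is only one possible policy and
is assumed here for simplicity; the RoundRobinBalancer class and the server names web1 to web3
are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Front end that assigns each request to the next back-end server in turn."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one back-end server is required")
        self._backends = cycle(backends)

    def route(self, request):
        # Return the chosen back end together with the request it should handle.
        return next(self._backends), request


balancer = RoundRobinBalancer(["web1", "web2", "web3"])
assignments = [balancer.route(f"GET /page{i}")[0] for i in range(5)]
print(assignments)  # ['web1', 'web2', 'web3', 'web1', 'web2']
```

A production front end would usually also skip back ends that fail a health check, which is
where the high-availability features mentioned above come in.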

2.3. High-performance computing (HPC) clusters


High-performance computing (HPC) clusters are implemented primarily to provide increased
performance by splitting a computational task across many different nodes in the cluster. HPC
clusters are optimized for workloads in which the jobs or processes running on separate
cluster nodes must communicate actively during the computation. These include computations
in which intermediate results from one node's calculations affect future calculations on other
nodes.
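
To make the dependency between nodes concrete, here is a small Python sketch that simulates
such a workload inside a single process. A list of numbers is smoothed repeatedly; when the
list is split into chunks (standing in for cluster nodes), each chunk needs the edge values of
its neighbours before every step. The smoothing rule, the function names, and the sample data
are illustrative assumptions, and the neighbour lookup stands in for a message that would
travel over the cluster interconnect.

```python
def step_serial(values):
    """One smoothing step on a single machine; the two end points stay fixed."""
    new = values[:]
    for i in range(1, len(values) - 1):
        new[i] = (values[i - 1] + values[i] + values[i + 1]) / 3.0
    return new


def step_partitioned(chunks):
    """The same step with the data split into non-empty chunks, one per 'node'."""
    new_chunks = []
    for k, chunk in enumerate(chunks):
        # Edge values received from the neighbouring chunks (None at the global ends).
        left = chunks[k - 1][-1] if k > 0 else None
        right = chunks[k + 1][0] if k < len(chunks) - 1 else None
        new = chunk[:]
        for i in range(len(chunk)):
            lo = chunk[i - 1] if i > 0 else left
            hi = chunk[i + 1] if i < len(chunk) - 1 else right
            if lo is None or hi is None:
                continue  # global end point: keep it fixed
            new[i] = (lo + chunk[i] + hi) / 3.0
        new_chunks.append(new)
    return new_chunks


data = [0.0, 0.0, 9.0, 0.0, 0.0, 0.0]
chunks = [data[:3], data[3:]]
for _ in range(3):
    data = step_serial(data)
    chunks = step_partitioned(chunks)

flat = [x for chunk in chunks for x in chunk]
print(flat == data)  # True
```

Because every step needs fresh neighbour values, the nodes must exchange data at each
iteration, which is why this kind of workload benefits from a fast interconnect.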


2.4. High Throughput Clusters


High-throughput clusters run a large number of independent tasks at the same time, so that
many tasks complete per unit of time. They are useful for workloads made up of many
independent tasks.
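
A minimal Python sketch of this pattern follows. A thread pool on one machine stands in for
the cluster nodes, and the word-counting task and sample documents are made up for
illustration. The point is only that the tasks share no state and never communicate, so they
can be handed out in any order.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(document):
    """An independent task: it needs nothing from any other task."""
    return len(document.split())


documents = [
    "a cluster of nodes",
    "independent tasks",
    "high throughput computing",
]

# Each worker stands in for a cluster node that picks up whole tasks.
with ThreadPoolExecutor(max_workers=3) as pool:
    counts = list(pool.map(count_words, documents))

print(counts)  # [4, 2, 3]
```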

2.5. Grid computing


Grid computing, or grid clusters, is a technology closely related to cluster computing. Grids
support more heterogeneous collections of machines than clusters commonly do.

3. Architecture:

The network interface hardware is responsible for transmitting and receiving packets of data
between nodes.
The communication software provides an efficient and reliable means of data communication
between nodes and, potentially, with systems outside the cluster.
System-level middleware provides the illusion of a unified system (a single system image)
from a collection of independent but interconnected computers.

A typical cluster consists of hardware and software components.

3.1. Hardware components include:


Nodes
Storage
Interconnection network

3.2. Software components:


The software components that comprise the environment of a commodity cluster may be
described in two major categories:
1. Programming tools
   - Message Passing Interface (MPI)
2. Resource management system software
   - Installation and Configuration
   - Scheduling and Allocation
   - System Administration
   - Monitoring and Diagnosis
   - Distributed Secondary Storage
   - Availability

Various operating systems, including Linux, Solaris, and Windows, can be used to manage
node resources.
System-level middleware is responsible for making sure the computers work together as one
entity. It offers a Single System Image (SSI) and high-availability infrastructure for
processes, memory, storage, I/O, and networking.

4. Applications:
Clusters have evolved to support applications ranging from supercomputing and mission-critical
software, through Web serving and e-commerce, to high-performance database applications.
- Numerous scientific and engineering applications
- Business applications
  - E-commerce applications (Amazon.com, eBay.com, ...)
  - Database applications (Oracle on a cluster)
  - Decision support systems
- Internet applications
  - Web serving and searching
  - eMail, eChat, ePhone, eBook, eCommerce, eBank, eSociety, eAnything
  - Computing portals
- Mission-critical applications
  - Banks, nuclear reactor control, and the handling of life-threatening situations, e.g.
    space applications

5. Comparison:
The terms "grid computing" and "cluster computing" have been used almost interchangeably to
describe networked computers that run distributed applications and share resources.
However, cluster and grid computing represent different approaches to solving performance
problems; although their technologies and infrastructure differ, their features and benefits
complement each other.

Cluster Computing:
1. The computers (or "nodes") in a cluster are networked in a tightly coupled fashion: they
are all on the same subnet of the same domain, often connected by very high-bandwidth links.
2. The nodes are homogeneous: they all use the same hardware, run the same software, and are
generally configured identically. Each node in a cluster is a dedicated resource, so generally
only the cluster applications run on a cluster node.
3. One advantage available to clusters is the Message Passing Interface (MPI), a
programming interface that allows the distributed application instances to communicate with
each other and share information.
4. Dedicated hardware, high-speed interconnects, and MPI allow clusters to work efficiently
on “fine-grained” parallel problems, including problems with short tasks, some of which may
depend on the results of previous tasks.
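
The idea of application instances exchanging results through messages can be sketched in
Python as follows. This is not the MPI API: a standard-library queue between two threads
stands in for a send/receive pair between two nodes, and the function names and numbers are
made up for illustration.

```python
import queue
import threading


def producer(outbox, numbers):
    """Computes a partial result and sends it as a message."""
    outbox.put(sum(numbers))


def consumer(inbox, numbers, results):
    """Waits for the other worker's partial result, then finishes the job."""
    partial = inbox.get(timeout=5)  # blocks until the message arrives
    results.append(partial + sum(numbers))


channel = queue.Queue()
results = []
workers = [
    threading.Thread(target=consumer, args=(channel, [4, 5, 6], results)),
    threading.Thread(target=producer, args=(channel, [1, 2, 3])),
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(results)  # [21]
```

The consumer cannot finish until the producer's message arrives. With many short tasks of
this kind, the time spent waiting for messages dominates unless the interconnect is fast,
which is the situation point 4 above describes.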

Grid Computing:
1. In contrast, the nodes on a grid can be loosely coupled; they may exist across domains or
subnets.
2. The nodes can be heterogeneous; they can include diverse hardware and software
configurations.
3. Grids typically do not require high-performance interconnects; rather, they usually are
configured to work with existing network connections.
4. As a result, grids are better suited to relatively “coarse-grained” parallel problems,
including problems composed primarily of independent tasks.

6. Advantages:
1. Performance
No matter what measure of performance one is seeking, it is straightforward to claim
that one can get more of it by adding machines to the cluster.
2. Availability
High availability & resilience to failure
3. Price/performance


Workstation clusters are a cheap and readily available alternative to specialized High
Performance Computing (HPC) platforms.
Organizations are reluctant to buy large supercomputers because of their high cost and
short useful life span.
4. Incremental growth
Using clusters of workstations as a distributed compute resource is very cost-effective
because the system can grow incrementally. No large initial investment is needed, which
motivates the use of clusters.
5. Scalability
Clusters offer great scalability, since in principle there is no limit to the number of
machines that can be added side by side.
6. Rapid response to technology improvements

7. Conclusion and Future Scope:


Applications can perform very well on current generation clusters with the hardware and software
that is now available.
High availability is becoming increasingly important since there is an increasing demand for
minimal downtime of systems. Clustering provides the basic infrastructure, both in hardware and
software, to support high availability, throughput, high performance computing, scalability, and
standardization of application programming interfaces. In the future, we may see hyper
clusters, which are clusters of clusters.

References:
- www.springer.com/
- http://www.rzg.mpg.de/computing/
- http://www.bestpricecomputers.co.uk
- http://www.buyya.com/cluster/
- http://en.wikipedia.org/
