Testing Oracle 10g RAC Scalability on Dell PowerEdge Servers and Dell/EMC Storage
Related Categories:
Characterization
Oracle
Benchmark Factory
Quest Software
Scalable enterprise
Visit www.dell.com/powersolutions for the complete category index.
Reprinted from Dell Power Solutions, February 2006. Copyright 2006 Dell Inc. All rights reserved.
DATABASES: ORACLE
Oracle clusterware
Oracle clusterware provides the cluster membership, messaging, and node-monitoring services on which the RAC database instances depend.

Shared storage
Another important component of a RAC cluster is its shared storage, which holds the database's data files, control files, redo logs, and undo segments and must be accessible from every node: when a process on any node issues a write command to the I/O system, the data must land on storage that all of the other nodes can read. Oracle Database 10g supports three methods for storing these shared files: raw devices, Oracle Cluster File System (OCFS), and Automatic Storage Management (ASM).

Cluster interconnect
The cluster interconnect is the dedicated private network over which the RAC instances exchange Cache Fusion messages and data blocks between their buffer caches.

Public network
Virtual IP
Traditionally, users and applications have connected to the RAC cluster and database using a public network interface, and the network protocol used for this connection has typically been TCP/IP. When a node fails, however, clients connected to its physical address can wait through a lengthy TCP timeout before receiving an error. Oracle Database 10g therefore assigns each node a virtual IP (VIP) address that fails over to a surviving node, so that clients are notified quickly and can reconnect to a working instance.
[Figure 1. Architecture of a four-node Oracle RAC cluster: users reach the nodes (ORADB1 through ORADB4, running instances SSKY1 through SSKY4) through network switches on the public network and per-node virtual IPs (VIPs); the nodes communicate over the cluster interconnect through an interprocess communication (IPC) and communication layer, with listeners and monitors, Oracle clusterware, and the OS on each node; a SAN switch connects all nodes to the shared disk. Source: Oracle 10g RAC Grid, Services & Clustering by Murali Vallath, 2005.]
[Figure 2. Hardware and software used in the test environment — hardware: cluster nodes, public network, Gigabit Ethernet* interconnect switch, QLogic QLA2342 HBAs (2), and shared storage; software: Quest Software Benchmark Factory and Spotlight on RAC]
*This term does not connote an actual operating speed of 1 Gbps. For high-speed transmission, connection to a Gigabit Ethernet server and network infrastructure is required.

[Database ASM disk groups: SYSTEMDG, 50 GB; DATADG, 50 GB; INDEXDG, 50 GB; REDO01DG, 20 GB; REDO02DG, 20 GB]
RAC enhances scalability because, when the user workload increases, users can access the database from any available instance that has spare resources. Database administrators (DBAs) also can add nodes to the cluster as demand grows.
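Registering and starting the new instance is one piece of that workflow; the following is a minimal srvctl sketch, assuming a database named racdb, a hypothetical fifth node oradb5, and an instance racdb5 (none of these names come from the test configuration):

# Register the new instance with Oracle clusterware, then start it
srvctl add instance -d racdb -i racdb5 -n oradb5
srvctl start instance -d racdb -i racdb5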
All disk groups were created using the external redundancy option of ASM.
Tablespaces
Quest_data in the DATADG disk group (40 GB), created using the Oracle Managed Files (OMF) feature
Quest_index in the INDEXDG disk group (10 GB), created using the OMF feature
All other database tablespaces were created in the SYSTEMDG disk group.
Redo log files were created in the REDO01DG and REDO02DG disk groups (a creation sketch for the disk groups and tablespaces follows).
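A disk group with external redundancy and an OMF-based tablespace can be created in a few statements; in this sketch the device paths, ASM and database SIDs, and file sizes are illustrative assumptions rather than the team's actual values:

# Create a disk group with external redundancy on the ASM instance
# (device paths and the +ASM1 SID are placeholders, not the actual test devices)
export ORACLE_SID=+ASM1
sqlplus -S / as sysdba <<'EOF'
CREATE DISKGROUP DATADG EXTERNAL REDUNDANCY
  DISK '/dev/raw/raw5', '/dev/raw/raw6';
EOF

# Point OMF at the disk group and create the tablespace on the database instance
# (two 20 GB files are used because an 8 KB-block smallfile data file tops out near 32 GB)
export ORACLE_SID=racdb1
sqlplus -S / as sysdba <<'EOF'
ALTER SYSTEM SET db_create_file_dest = '+DATADG';
CREATE TABLESPACE quest_data DATAFILE SIZE 20G;
ALTER TABLESPACE quest_data ADD DATAFILE SIZE 20G;
EOF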
Through September 2005, engineers from the Dell Database and Applications team and Quest Software conducted benchmark tests to evaluate how well Oracle 10g RAC scales on Dell PowerEdge servers and Dell/EMC storage.

OCFS 2.0 supports both Oracle and non-Oracle files. OCFS gives every node in the cluster shared access to these files.

Figure 2 lists the hardware and software used in the test environment.
The benchmark tests simulated loads from 100 to 5,000 concurrent users, representative of enterprise environments, and were designed to capture two critical data points: how many concurrent users each RAC node could sustain, and whether the RAC cluster could scale both predictably and reliably as nodes were added.

Spotlight on RAC
Spotlight on RAC is a monitoring and diagnostic tool that extends the proven architecture and functionality of Quest Software's Spotlight on Oracle to monitor all the nodes on the cluster. It does not require server-side agents. With this tool, DBAs can easily monitor their clusters to detect, diagnose, and resolve problems.
Setting up the first node and instance correctly and properly defining the per-node sweet spot makes the rest of the process move rather quickly. In addition, if the DBA duplicates all the nodes and instances using the first node and instance, then the additional nodes and instances (that is, steps 1 and 2 for setting up the second through nth nodes and instances) require little extra effort, and the DBA can test various scenarios in any order that the DBA prefers. For each new configuration, the team would add the node and restart the database, adding the new instance; run the benchmark based on the node count and the sweet spot number of users per node; and plot a second graph showing this run versus all the prior baseline runs.

The three steps for setting up the first node and instance (establish a fundamental baseline, optimize the basic OS, and optimize the non-RAC database) are straightforward. The test team installed Red Hat Enterprise Linux on the hardware, followed by Oracle 10g Release 1 and the Oracle 10.1.0.4 patch set. Then, the team modified the Linux kernel parameters to best support Oracle:
kernel.shmmax = 2147483648
fs.file-max = 65536
fs.aio-max-nr = 1048576
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144
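A typical way to apply these values persistently is to place them in /etc/sysctl.conf and reload them, as in this brief sketch (assuming root access on each node; only the parameters listed above are shown):

# Append the settings to /etc/sysctl.conf on each cluster node (run as root)
cat >> /etc/sysctl.conf <<'EOF'
kernel.shmmax = 2147483648
fs.file-max = 65536
fs.aio-max-nr = 1048576
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144
EOF

# Load the values into the running kernel and spot-check one of them
sysctl -p
sysctl kernel.shmmax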
Next, the test team performed the following steps to help ensure that the asynchronous I/O feature was compiled into the Oracle binaries and used for database I/O (including writes to the data files and redo logs).
cd $ORACLE_HOME/rdbms/lib
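# The make targets below are the standard way to relink Oracle with asynchronous
# I/O on Linux; they are assumed here, since the original listing is cut off at
# the page break rather than shown in full.
make -f ins_rdbms.mk async_on
make -f ins_rdbms.mk ioracle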
At the database level, the team then set the following initialization parameters:

disk_asynch_io = true
filesystemio_options = setall

The setall value enables both asynchronous and direct I/O.
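These two values can be staged ahead of the restart with ALTER SYSTEM; the following is a minimal sketch, assuming an spfile-managed instance and SYSDBA access (the article lists only the final settings, not the commands used to set them):

# Both parameters are static, so SCOPE = SPFILE defers them to the next startup
sqlplus -S / as sysdba <<'EOF'
ALTER SYSTEM SET disk_asynch_io = TRUE SCOPE = SPFILE;
ALTER SYSTEM SET filesystemio_options = 'SETALL' SCOPE = SPFILE;
EOF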
The test team then created the RAC database and initial instance, and then added instances for the other nodes in the maximum test configuration (10 nodes). Finally, the team manually made the following spfile adjustments:

cluster_database = true
cluster_database_instances = 10
db_block_size = 8192
processes = 16000
sga_max_size = 1500m
sga_target = 1500m
pga_aggregate_target = 700m
db_writer_processes = 2
open_cursors = 00
optimizer_index_caching = 80
optimizer_index_cost_adj = 40
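A quick illustrative check (not part of the original procedure) confirms that the restarted instances picked up the adjusted values:

# Display a few of the adjusted parameters on a running instance
sqlplus -S / as sysdba <<'EOF'
SHOW PARAMETER cluster_database
SHOW PARAMETER sga_max_size
SHOW PARAMETER pga_aggregate_target
EOF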
The primary goal of these settings was to consume as much System Global Area (SGA) memory as possible within the 32-bit OS limit (about 1.7 GB) common to such implementations; the test team used this amount to help yield a positive effect on performance.

The next step was to ascertain the reasonable per-node load that the cluster could sustain: a sweet spot value that the team expected would work well yet would still hold up in environments with more than just a few nodes. The test team initially ran the benchmark on the single node without monitoring the test closely and started with 700 users per node. However, the user load per node became skewed in this first test and then exceeded the per-node sweet spot value. For example, when the team tested 7,000 users for 10 nodes, some nodes appeared down to the Oracle database, and thus the load balancer simply directed all incoming sessions to the remaining nodes; some of the nodes tried to handle far more than 700 users, and the series of benchmarks for 700 users per node did not scale reliably or predictably.

Re-examining the single-node test (in which user load increased from 100 to 800) for the onset of performance degradation showed that the performance characteristics changed at about 600 users, beyond which spare capacity fell to nearly zero. The team determined that the sweet spot number was actually 600 users, not 700. They then reduced that number to 500 users per node to stay conservative; a less conservative sweet spot number could have been used if the team had been able to keep repeating the tests until a definitive reduction point was isolated. Connection load balancing, which considers the resource availability on each node so that new client sessions can be placed on the least-loaded instance, helped ensure that the number of users running on any given node stayed at or below this value. With the sweet spot user load identified and guaranteed through load balancing, the test team then ran the benchmark on the remaining cluster configurations.
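Load balancing of this kind is normally driven by the Oracle Net configuration rather than by the benchmark tool itself. A minimal sketch of a balanced connect alias follows, assuming two nodes with VIP host names oradb1-vip and oradb2-vip, the default listener port, and a service named racdb (all illustrative; the article does not show its Net configuration):

# Append a load-balanced alias to the client's tnsnames.ora
# (server-side balancing additionally relies on the remote_listener parameter
#  so that each listener knows the load on every node)
cat >> $ORACLE_HOME/network/admin/tnsnames.ora <<'EOF'
RACDB =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (LOAD_BALANCE = yes)
      (ADDRESS = (PROTOCOL = TCP)(HOST = oradb1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = oradb2-vip)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = racdb))
  )
EOF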
Early in the testing, the team encountered problems with one of the nodes in the cluster; these issues were later resolved. Each benchmark iteration requires about four minutes for a given user load. Therefore, for the single node with five user loads, the overall benchmark test run required 20 minutes.
During the entire testing process, the load was monitored using Spotlight on RAC to identify any problems.
As shown in Figure 7, when the four-node tests were
conducted, Spotlight on RAC identified that CPUs on
nodes racdb1 and racdb3 reached 84 percent and 76 percent, respectively. This high CPU utilization probably was
caused by a temporary overload of users on these servers
and the ASM response time. To address this problem, the
test team increased the SHARED_POOL_SIZE and LARGE_POOL_SIZE
parameters on the ASM instance from their default values
of 32 MB and 12 MB, respectively, to 67 MB each. They
then ran the four-node test again, and none of the nodes
experienced excessive CPU utilization. This was the only
parameter change the team made to the ASM instance.
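Expressed as commands, that adjustment would look roughly like the following sketch, assuming an spfile-managed ASM instance with the SID +ASM1 on the node (the 67 MB figure comes from the text; the commands themselves are illustrative):

# Increase the ASM instance memory pools; the change takes effect when the
# ASM instance is restarted
export ORACLE_SID=+ASM1
sqlplus -S / as sysdba <<'EOF'
ALTER SYSTEM SET shared_pool_size = 67M SCOPE = SPFILE;
ALTER SYSTEM SET large_pool_size  = 67M SCOPE = SPFILE;
EOF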
Figure 8 shows the cluster-level latency graphs from
Spotlight on RAC during the eight-node test. These
graphs indicated that the interconnect latency was well
within expectations and in line with typical network
latency numbers.
Figure 8. Spotlight on RAC GUI showing cluster-level latency graphs during the eight-node test
[Scalability graphs: measured results for configurations of up to 10 nodes at user loads from 100 to 4,600, and projected results for configurations of up to 16 nodes at user loads from 100 to 9,100]
Figure 11. Projected RAC scalability for up to 17 nodes and 10,000 users
Zafar Mahmood is a senior consultant in the Dell Database and Applications Team of Enterprise Solutions Engineering, Dell Product Group. Zafar
has an M.S. and a B.S. in Electrical Engineering, with specialization in
Computer Communications, from the City University of New York.
Figure 10. Spotlight on RAC GUI showing ASM performance for 10 RAC nodes