Testing Oracle 10g RAC Scalability on Dell PowerEdge Servers and Dell/EMC Storage
Related Categories:
Characterization
Oracle
Benchmark Factory
Quest Software
Scalable enterprise
Visit www.dell.com/powersolutions for the complete category index.
Reprinted from Dell Power Solutions, February 2006. Copyright 2006 Dell Inc. All rights reserved.
DATABASES: ORACLE
Oracle clusterware
Oracle clusterware provides the cluster membership, messaging, and node-monitoring services on which the RAC database instances depend.

Shared storage
Another important component of a RAC cluster is its shared storage, which holds the database's data files, control files, redo logs, and undo segments and must be accessible from every node: when a process on any node issues a write command to the I/O system, the data must land on storage that all of the other nodes can read. Oracle Database 10g supports three methods for storing these shared files: raw devices, Oracle Cluster File System (OCFS), and Automatic Storage Management (ASM).

Cluster interconnect
The cluster interconnect is the dedicated private network over which the RAC instances exchange Cache Fusion messages and data blocks between their buffer caches.

Public network
Virtual IP
Traditionally, users and applications have connected to the RAC cluster and database using a public network interface, and the network protocol used for this connection has typically been TCP/IP. When a node fails, however, clients connected to its physical address can wait through a lengthy TCP timeout before receiving an error. Oracle Database 10g therefore assigns each node a virtual IP (VIP) address that fails over to a surviving node, so that clients are notified quickly and can reconnect to a working instance.
[Figure 1. Architecture of a four-node Oracle RAC cluster: users reach the nodes (ORADB1 through ORADB4, running instances SSKY1 through SSKY4) through network switches on the public network and per-node virtual IPs (VIPs); the nodes communicate over the cluster interconnect through an interprocess communication (IPC) and communication layer, with listeners and monitors, Oracle clusterware, and the OS on each node; a SAN switch connects all nodes to the shared disk. Source: Oracle 10g RAC Grid, Services & Clustering by Murali Vallath, 2005.]
[Figure 2. Hardware and software used in the test environment — hardware: cluster nodes, public network, Gigabit Ethernet* interconnect switch, QLogic QLA2342 HBAs (2), and shared storage; software: Quest Software Benchmark Factory and Spotlight on RAC]
*This term does not connote an actual operating speed of 1 Gbps. For high-speed transmission, connection to a Gigabit Ethernet server and network infrastructure is required.

[Database ASM disk groups: SYSTEMDG, 50 GB; DATADG, 50 GB; INDEXDG, 50 GB; REDO01DG, 20 GB; REDO02DG, 20 GB]
RAC enhances scalability because, when the user workload increases, users can access the database from any available instance that has spare resources. Database administrators (DBAs) also can add nodes to the cluster as demand grows.
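Registering and starting the new instance is one piece of that workflow; the following is a minimal srvctl sketch, assuming a database named racdb, a hypothetical fifth node oradb5, and an instance racdb5 (none of these names come from the test configuration):

# Register the new instance with Oracle clusterware, then start it
srvctl add instance -d racdb -i racdb5 -n oradb5
srvctl start instance -d racdb -i racdb5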
All disk groups were created using the external redundancy option of ASM.
Tablespaces
Quest_data in the DATADG disk group (40 GB), created using the Oracle Managed Files (OMF) feature
Quest_index in the INDEXDG disk group (10 GB), created using the OMF feature
All other database tablespaces were created in the SYSTEMDG disk group.
Redo log files were created in the REDO01DG and REDO02DG disk groups (a creation sketch for the disk groups and tablespaces follows).
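A disk group with external redundancy and an OMF-based tablespace can be created in a few statements; in this sketch the device paths, ASM and database SIDs, and file sizes are illustrative assumptions rather than the team's actual values:

# Create a disk group with external redundancy on the ASM instance
# (device paths and the +ASM1 SID are placeholders, not the actual test devices)
export ORACLE_SID=+ASM1
sqlplus -S / as sysdba <<'EOF'
CREATE DISKGROUP DATADG EXTERNAL REDUNDANCY
  DISK '/dev/raw/raw5', '/dev/raw/raw6';
EOF

# Point OMF at the disk group and create the tablespace on the database instance
# (two 20 GB files are used because an 8 KB-block smallfile data file tops out near 32 GB)
export ORACLE_SID=racdb1
sqlplus -S / as sysdba <<'EOF'
ALTER SYSTEM SET db_create_file_dest = '+DATADG';
CREATE TABLESPACE quest_data DATAFILE SIZE 20G;
ALTER TABLESPACE quest_data ADD DATAFILE SIZE 20G;
EOF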
Through September 2005, engineers from the Dell Database and Applications team and Quest Software conducted benchmark tests to evaluate how well Oracle 10g RAC scales on Dell PowerEdge servers and Dell/EMC storage.

OCFS 2.0 supports both Oracle and non-Oracle files. OCFS gives every node in the cluster shared access to these files.

Figure 2 lists the hardware and software used in the test environment.
The benchmark tests simulated loads from 100 to 5,000 concurrent users, representative of enterprise environments, and were designed to capture two critical data points: how many concurrent users each RAC node could sustain, and whether the RAC cluster could scale both predictably and reliably as nodes were added.

Spotlight on RAC
Spotlight on RAC is a monitoring and diagnostic tool that extends the proven architecture and functionality of Quest Software's Spotlight on Oracle to monitor all the nodes on the cluster. It does not require server-side agents. With this tool, DBAs can easily monitor their clusters to detect, diagnose, and resolve problems.
Setting up the first node and instance correctly and properly defining the per-node sweet spot makes the rest of the process move rather quickly. In addition, if the DBA duplicates all the nodes and instances using the first node and instance, then the additional nodes and instances (that is, steps 1 and 2 for setting up the second through nth nodes and instances) require little extra effort, and the DBA can test various scenarios in any order that the DBA prefers. For each new configuration, the team would add the node and restart the database, adding the new instance; run the benchmark based on the node count and the sweet spot number of users per node; and plot a second graph showing this run versus all the prior baseline runs.

The three steps for setting up the first node and instance (establish a fundamental baseline, optimize the basic OS, and optimize the non-RAC database) are straightforward. The test team installed Red Hat Enterprise Linux on the hardware, followed by Oracle 10g Release 1 and the Oracle 10.1.0.4 patch set. Then, the team modified the Linux kernel parameters to best support Oracle:
kernel.shmmax = 2147483648
fs.file-max = 65536
fs.aio-max-nr = 1048576
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144
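A typical way to apply these values persistently is to place them in /etc/sysctl.conf and reload them, as in this brief sketch (assuming root access on each node; only the parameters listed above are shown):

# Append the settings to /etc/sysctl.conf on each cluster node (run as root)
cat >> /etc/sysctl.conf <<'EOF'
kernel.shmmax = 2147483648
fs.file-max = 65536
fs.aio-max-nr = 1048576
net.core.rmem_default = 262144
net.core.rmem_max = 262144
net.core.wmem_default = 262144
net.core.wmem_max = 262144
EOF

# Load the values into the running kernel and spot-check one of them
sysctl -p
sysctl kernel.shmmax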
Next, the test team performed the following steps to help ensure that the asynchronous I/O feature was compiled into the Oracle binaries and used for database I/O (including writes to the data files and redo logs).
cd $ORACLE_HOME/rdbms/lib
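# The make targets below are the standard way to relink Oracle with asynchronous
# I/O on Linux; they are assumed here, since the original listing is cut off at
# the page break rather than shown in full.
make -f ins_rdbms.mk async_on
make -f ins_rdbms.mk ioracle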
At the database level, the team then set the following initialization parameters:

disk_asynch_io = true
filesystemio_options = setall

The setall value enables both asynchronous and direct I/O.
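These two values can be staged ahead of the restart with ALTER SYSTEM; the following is a minimal sketch, assuming an spfile-managed instance and SYSDBA access (the article lists only the final settings, not the commands used to set them):

# Both parameters are static, so SCOPE = SPFILE defers them to the next startup
sqlplus -S / as sysdba <<'EOF'
ALTER SYSTEM SET disk_asynch_io = TRUE SCOPE = SPFILE;
ALTER SYSTEM SET filesystemio_options = 'SETALL' SCOPE = SPFILE;
EOF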
The test team then created the RAC database and initial instance, and then added instances for the other nodes in the maximum test configuration (10 nodes). Finally, the team manually made the following spfile adjustments:

cluster_database = true
cluster_database_instances = 10
db_block_size = 8192
processes = 16000
sga_max_size = 1500m
sga_target = 1500m
pga_aggregate_target = 700m
db_writer_processes = 2
open_cursors = 00
optimizer_index_caching = 80
optimizer_index_cost_adj = 40
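A quick illustrative check (not part of the original procedure) confirms that the restarted instances picked up the adjusted values:

# Display a few of the adjusted parameters on a running instance
sqlplus -S / as sysdba <<'EOF'
SHOW PARAMETER cluster_database
SHOW PARAMETER sga_max_size
SHOW PARAMETER pga_aggregate_target
EOF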
The primary goal of these settings was to consume as much System Global Area (SGA) memory as possible within the 32-bit OS limit (about 1.7 GB) common to such implementations; the test team used this amount to help yield a positive effect on performance.

The next step was to ascertain the reasonable per-node load that the cluster could sustain: a sweet spot value that the team expected would work well yet would still hold up in environments with more than just a few nodes. The test team initially ran the benchmark on the single node without monitoring the test closely and started with 700 users per node. However, the user load per node became skewed in this first test and then exceeded the per-node sweet spot value. For example, when the team tested 7,000 users for 10 nodes, some nodes appeared down to the Oracle database, and thus the load balancer simply directed all incoming sessions to the remaining nodes; some of the nodes tried to handle far more than 700 users, and the series of benchmarks for 700 users per node did not scale reliably or predictably.

Re-examining the single-node test (in which user load increased from 100 to 800) for the onset of performance degradation showed that the performance characteristics changed at about 600 users, beyond which spare capacity fell to nearly zero. The team determined that the sweet spot number was actually 600 users, not 700. They then reduced that number to 500 users per node to stay conservative; a less conservative sweet spot number could have been used if the team had been able to keep repeating the tests until a definitive reduction point was isolated. Connection load balancing, which considers the resource availability on each node so that new client sessions can be placed on the least-loaded instance, helped ensure that the number of users running on any given node stayed at or below this value. With the sweet spot user load identified and guaranteed through load balancing, the test team then ran the benchmark on the remaining cluster configurations.
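Load balancing of this kind is normally driven by the Oracle Net configuration rather than by the benchmark tool itself. A minimal sketch of a balanced connect alias follows, assuming two nodes with VIP host names oradb1-vip and oradb2-vip, the default listener port, and a service named racdb (all illustrative; the article does not show its Net configuration):

# Append a load-balanced alias to the client's tnsnames.ora
# (server-side balancing additionally relies on the remote_listener parameter
#  so that each listener knows the load on every node)
cat >> $ORACLE_HOME/network/admin/tnsnames.ora <<'EOF'
RACDB =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (LOAD_BALANCE = yes)
      (ADDRESS = (PROTOCOL = TCP)(HOST = oradb1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = oradb2-vip)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = racdb))
  )
EOF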
Early in the testing, the team encountered problems with one of the nodes in the cluster; these issues were later resolved. Each benchmark iteration requires about four minutes for a given user load. Therefore, for the single node with five user loads, the overall benchmark test run required 20 minutes.
During the entire testing process, the load was monitored using Spotlight on RAC to identify any problems.
As shown in Figure 7, when the four-node tests were
conducted, Spotlight on RAC identified that CPUs on
nodes racdb1 and racdb3 reached 84 percent and 76 percent, respectively. This high CPU utilization probably was
caused by a temporary overload of users on these servers
and the ASM response time. To address this problem, the
test team increased the SHARED_POOL_SIZE and LARGE_POOL_SIZE
parameters on the ASM instance from their default values
of 32 MB and 12 MB, respectively, to 67 MB each. They
then ran the four-node test again, and none of the nodes
experienced excessive CPU utilization. This was the only
parameter change the team made to the ASM instance.
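Expressed as commands, that adjustment would look roughly like the following sketch, assuming an spfile-managed ASM instance with the SID +ASM1 on the node (the 67 MB figure comes from the text; the commands themselves are illustrative):

# Increase the ASM instance memory pools; the change takes effect when the
# ASM instance is restarted
export ORACLE_SID=+ASM1
sqlplus -S / as sysdba <<'EOF'
ALTER SYSTEM SET shared_pool_size = 67M SCOPE = SPFILE;
ALTER SYSTEM SET large_pool_size  = 67M SCOPE = SPFILE;
EOF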
Figure 8 shows the cluster-level latency graphs from
Spotlight on RAC during the eight-node test. These
graphs indicated that the interconnect latency was well
within expectations and in line with typical network
latency numbers.
Figure 8. Spotlight on RAC GUI showing cluster-level latency graphs during the eight-node test
[Scalability graphs: measured results for configurations of up to 10 nodes at user loads from 100 to 4,600, and projected results for configurations of up to 16 nodes at user loads from 100 to 9,100]
Figure 11. Projected RAC scalability for up to 17 nodes and 10,000 users
Zafar Mahmood is a senior consultant in the Dell Database and Applications Team of Enterprise Solutions Engineering, Dell Product Group. Zafar
has an M.S. and a B.S. in Electrical Engineering, with specialization in
Computer Communications, from the City University of New York.
Figure 10. Spotlight on RAC GUI showing ASM performance for 10 RAC nodes