Data Guard and Fail Safe


Disaster-Tolerant High Availability

Oracle Data Guard with Oracle Fail Safe

An Oracle White Paper


June 2002
Disaster-Tolerant High Availability

1 EXECUTIVE OVERVIEW
2 INTRODUCTION
3 CLUSTER CONFIGURATION
3.1 Microsoft Cluster Service
3.1.1 Install MSCS and Configure the Example Clusters
3.2 Validate the Cluster Network Configuration
4 ORACLE SOFTWARE CONFIGURATION
4.1 Install and Configure Oracle9i Database Enterprise Edition
4.1.1 Create Initial Network Configuration Files
4.1.2 Create Primary Database
4.1.3 Start Instance and Listener Services and Connect to Primary Database
4.1.4 Specify Location for Oracle Data Guard Configuration Files
4.1.5 Specify Standby Archive Log File Destination
4.2 Create Initial Primary/Standby Database Configuration
4.2.1 Configure Oracle Enterprise Manager
4.2.2 Start the Oracle Management Server
4.2.3 Start Oracle Intelligent Agent on All Cluster Nodes
4.2.4 Open the Oracle Enterprise Manager Console
4.2.5 Discover Primary and Standby Cluster Nodes
4.2.6 Set Preferred Credentials on Primary and Standby Nodes
4.2.7 Open Data Guard Manager
4.2.8 Create the Initial Oracle Data Guard Configuration
4.2.9 Validate Initial Primary/Standby Configuration
4.2.10 Optionally, Specify a Time Delay for Applying Archived Redo Log Files
4.2.11 Delete Initial Primary/Standby Configuration
4.2.12 Stop and Disable Default Oracle Intelligent Agents
4.3 Install Oracle Fail Safe
4.3.1 Open the Oracle Universal Installer
4.3.2 Specify File Locations
4.3.3 Select Oracle Fail Safe
4.3.4 Installation Types
4.3.5 Reboot Needed After Installation
4.3.6 Review Summary Information
4.3.7 Enter Domain User Account for Oracle Services for MSCS
4.3.8 Confirm Installation and View Release Notes
4.3.9 Reboot Cluster Node
4.3.10 Install Oracle Fail Safe on Remaining Nodes
4.3.11 Verify the Primary and Standby Clusters
4.4 Configure Database Virtual Servers
4.4.1 Configure Database Parameter Files
4.4.2 Create Virtual Servers for Primary and Standby Databases
4.4.3 Execute Verify Standalone Database Command
4.4.4 Add Each Database to its Associated Virtual Server
4.5 Create Final Highly Available Primary/Standby Configuration
4.5.1 Discover Virtual Servers
4.5.2 Create Highly Available Primary/Standby Configuration
4.5.3 Verify Highly Available Primary/Standby Configuration
5 OTHER CONFIGURATIONS
5.1.1 Benefits
5.1.2 Trade-offs
5.2 Single Active/Active Cluster
5.2.1 Benefits
5.2.2 Trade-offs
5.3 Two Active/Active Clusters
5.3.1 Benefits
5.3.2 Trade-offs
5.4 Multiple Primary Locations and Single Standby Location
5.4.1 Benefits
5.4.2 Trade-offs
6 MAINTENANCE AND ADMINISTRATION EXAMPLES
6.1 Performing an Oracle Fail Safe (MSCS) Failover
6.2 Changing the SYS Database Account Password
6.2.1 Update Primary SYS Database User Account Password
6.2.2 Update the Standby SYS Database User Account Password
6.3 Performing Rolling Upgrades
6.3.1 Upgrading Hardware or Operating System Software
6.3.2 Upgrading Oracle Fail Safe or Oracle Application Software
6.3.3 Upgrading Oracle Database Software
6.4 Performing an Oracle Data Guard Failover or Switchover Operation
6.4.1 Disable Is Alive Polling
6.4.2 Perform the Role Transition Operation
6.4.3 Reenable Is Alive Polling
6.4.4 Verify the Primary and Standby Virtual Server Groups
6.4.5 Switchover Example
6.5 Performing Database Backups
7 SUMMARY AND MORE INFORMATION
7.1 Oracle Product Documentation
7.2 Oracle9i Database High Availability and Disaster Recovery Web Site
7.3 Oracle Fail Safe Web Sites
7.4 Oracle University Online Learning Web Site
7.5 Oracle Support MetaLink Web Site

Disaster-Tolerant High Availability

1 EXECUTIVE OVERVIEW
When deploying an “always on” 7 x 24 x 365 mission-critical business system, it is essential to
ensure both high availability and disaster tolerance. Many lower cost disaster recovery solutions (for
example, the creation, offsite storage, and retrieval of system backups) do not meet the availability
requirements for business-critical operations. This paper describes how deploying Oracle9i Database
(release 9.2 or later) on commodity Windows clusters with a combination of Oracle Data Guard
(release 9.2 or later) disaster tolerance features and Oracle Fail Safe (release 3.3 or later) high
availability features provides easy-to-configure and cost-effective disaster-tolerant high availability.

2 INTRODUCTION
While there can be some overlap, the features and technologies used to provide high availability
are generally distinct from those used to ensure disaster tolerance. High-availability solutions
typically focus on protecting against individual component or system failures, while disaster-
tolerance solutions typically focus on protecting against data corruption and site failures. Each can
help to keep business-critical systems operational, but neither alone is sufficient to ensure the levels
of near continuous operation required for most business-critical systems. For example, while
redundant or clustered hardware can eliminate individual systems as points of failure, it does not
protect against a disaster that incapacitates the site where the systems reside. Similarly, while a
standby database solution, such as Oracle Data Guard, provides excellent disaster tolerance features,
it may take time to switch operations from the primary site to a physically separate standby site (for
example, you may first need to apply additional time delayed redo data to make the standby
database current before it can be reconfigured as the new primary site). This is true not only when
dealing with unexpected disasters and component failures, but also for the more common outages
associated with planned maintenance and upgrades. Fortunately, Oracle supports many
technologies that can easily be combined to provide the required levels of high availability and
disaster tolerance.
This paper describes how to combine Oracle Data Guard with Oracle Fail Safe to provide an
enhanced level of disaster-tolerant high availability for single-instance Oracle9i Database Enterprise
Edition databases deployed on Windows clusters configured with Microsoft Cluster Service. The
result is a complementary and easy-to-configure set of high availability and disaster tolerance
features (shown in Table 1 below) that eliminates many potential sources of downtime.

Component: Microsoft Cluster Service (MSCS)
Features:
• Provides basic failover clustering services that eliminate individual host systems as single
points of failure
• Is included with Windows NT Enterprise Edition, Windows 2000 Advanced Server, and
Windows 2000 Datacenter Server
• Supports clusters with up to two nodes for Windows NT Enterprise Edition and Windows 2000
Advanced Server; supports clusters with up to four nodes for Windows 2000 Datacenter Server
• Continuously monitors cluster resources such as disks, IP addresses, and databases for high
availability
• Supports configuring a group of cluster resources into a “virtual server” where users access
the resources in the group through a fixed (node independent) network address, regardless of
which physical cluster node hosts the group
• Relies on RAID or other redundant storage to protect against media failure; does not protect
against data corruption

Component: Oracle9i Database (Enterprise Edition)
Features:
• When used with transparent application failover, allows client applications to automatically
reconnect to the database if the connection fails, and optionally resume a SELECT statement
that was in progress
• When used with the Fast-Start Fault Recovery feature, allows database administrators to
specify a maximum duration for database recovery time (and thus guarantee recovery times in
service level agreements)
• When used with Oracle9i Flashback Query, allows users to view data at different points in
time (and potentially ‘undo’ committed operations to easily recover from user error)
• When used with Oracle9i LogMiner, allows database administrators to analyze the content of
database files or to quickly and selectively undo or track erroneous user updates
• Allows all maintenance operations formerly associated with downtime (such as schema
changes, reorganizations, or changes to memory and storage parameters) to be performed
online with no disruption of service to users

Component: Oracle Data Guard
Features:
• Excellent protection from data corruption, site disasters, and media failures; good
protection from system and component failures; maintains multiple copies of data
• Maintains up to nine standby databases, each of which is a real-time or time-delayed copy of
the production database, to protect against all threats: corruptions, human errors, and
disasters
• Supports both physical and logical standby database configurations
• Includes Oracle Data Guard broker (with Data Guard Manager wizards) to automate complex
creation and maintenance tasks and provide dramatically enhanced monitoring, alert, and
control mechanisms

Component: Oracle Fail Safe
Features:
• Excellent protection from system and component failures; maintains single copy of data
• Provides a fast, easy, and accurate way to configure and verify Oracle resources on Windows
clusters
• Works with Microsoft Cluster Service to monitor Oracle databases and applications for high
availability and, when necessary, to automatically restart them on a surviving cluster node
• Automatically fails back Oracle applications and databases to a preferred node immediately,
at a specific time, or not at all
• Supports planned failovers to permit rolling cluster upgrades or workload balancing
• Is a core high availability feature included with every Oracle9i Database and Oracle
Applications 11i license for Microsoft Windows NT and Windows 2000

Table 1: High Availability and Disaster Tolerance Features
The components listed in Table 1 can be configured in a variety of ways to meet individual business
requirements for availability and disaster tolerance. In general, hardware or software failures,
upgrades, or other maintenance operations are efficiently handled through automatic MSCS virtual
server failover from one cluster node to another (typically within tens of seconds to a minute or
two), with no need for users to reconnect to a remote site or to convert a standby database into a
primary database. Only when all cluster nodes at the primary site fail (or there is a loss or
corruption of the database files) is it necessary to switch over operations to a standby database.
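
The division of labor described above can be summarized as a small decision rule. The following Python sketch is illustrative only: the SiteState type, its field names, and the returned labels are hypothetical and are not part of any Oracle or Microsoft interface.

```python
from dataclasses import dataclass


@dataclass
class SiteState:
    """Hypothetical snapshot of the primary site (illustrative names only)."""
    surviving_nodes: int         # primary-cluster nodes still able to host the group
    database_files_intact: bool  # False after loss or corruption of database files


def recovery_action(state: SiteState) -> str:
    """Return which mechanism would handle the outage, per the text above."""
    # Loss or corruption of the database files cannot be repaired by moving
    # the virtual server to another node, so the standby database is needed.
    if not state.database_files_intact:
        return "data-guard-role-transition"
    # With every node at the primary site down, there is nowhere to fail over.
    if state.surviving_nodes == 0:
        return "data-guard-role-transition"
    # Otherwise MSCS restarts the virtual server on a surviving local node.
    return "mscs-local-failover"
```

In this sketch a local MSCS failover is the default, and a Data Guard role transition is reserved for the two cases the paper names: no surviving node at the primary site, or damaged database files.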
Figure 1 shows an example deployment (also referred to in this paper as the “example
configuration”) with a primary database and a physical standby database each deployed on a
separate Windows cluster. In this example configuration, both the primary cluster and the remote
standby cluster are configured with Microsoft Cluster Service (MSCS). All database data, log, and
control files are located on MSCS cluster disks. Oracle Data Guard is used to create the
primary/standby configuration, while Oracle Fail Safe is used to configure each database in a highly
available virtual server environment. In the example, the redo data from the primary database is
transported to the physical standby database asynchronously. Note that many production
environments deploy multiple standby databases with a combination of synchronous and
asynchronous redo shipping (Section 5 provides additional examples of disaster-tolerant high
availability configurations). Also, some production environments, based on individual requirements,
may include other Oracle high availability and disaster-tolerant technologies such as Oracle Data
Guard logical standby databases, Oracle Advanced Replication, or Oracle Real Application Clusters.

Figure 1: Disaster-Tolerant High Availability Example Configuration


The remaining sections of the paper describe the steps required to configure the example disaster-
tolerant high availability solution and provide examples of how to:
• Coordinate changing database passwords
• Perform a rolling upgrade using planned failovers from one cluster node to another
• Perform an Oracle Data Guard site switchover between primary and standby locations
• Write scripts to automatically back up the standby database
The configuration process described in the paper is designed to minimize risk of user error through
the use of various wizards and automated configuration tools. (Manual configuration options are
also supported, but are not described in the paper.) It is assumed that you have basic familiarity
with the features and concepts associated with each component. Section 7 provides links to
additional information.

3 CLUSTER CONFIGURATION
A cluster is a group of independent computing systems (nodes) that operates as a single virtual
system. The component redundancy in clusters eliminates individual host systems as points of
failure and provides a highly available hardware platform for deploying mission-critical databases
and applications.

Oracle Data Guard can be used to configure and manage Oracle9i Database primary and standby
databases deployed in a variety of standalone and clustered configurations, including:
• Single-instance standalone databases
• Single-instance databases configured with Oracle Fail Safe release 3.3 and later (basic
shared-nothing cluster configuration with cold failover)
• Multi-instance Oracle Real Application Clusters databases (scalable shared-data cluster
configuration with warm failover)
Specifically, this paper describes how the Oracle Data Guard Manager and Oracle Fail Safe Manager
wizards automate configuration and management of disaster-tolerant high availability solutions on
shared-nothing Windows clusters configured with Microsoft Cluster Service.

3.1 Microsoft Cluster Service


Microsoft Cluster Service (MSCS) provides a basic shared-nothing cluster environment for Windows
systems. Individual cluster nodes never share the same cluster resources and individual cluster
resources such as disks, IP addresses, database instances and the like are always owned by and
accessed exclusively through one cluster node at any given time. If a failure occurs, ownership of
the affected resources is transferred, or failed over, to a surviving cluster node. Each cluster
typically is configured with at least one private (cluster heartbeat) network used for internode
cluster communications and at least one public network used for client access. Because individual
nodes cannot share information in memory, or read and write to the same disks, workloads cannot
scale across multiple nodes (as they can, for example, with Oracle Real Application Clusters).
However, because most MSCS clusters are built from standard commodity components, these
feature limitations are somewhat offset by lower hardware costs. Operating system restrictions
currently limit MSCS clusters to two nodes for Windows NT Enterprise Edition and Windows 2000
Advanced Server, and to up to four nodes for Windows 2000 Datacenter Server.
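
The shared-nothing ownership rule can be pictured with a toy model. The Python sketch below is a simplified illustration, not MSCS behavior in detail: the class name, its methods, and the policy of choosing the first surviving node alphabetically are assumptions made for the example.

```python
class SharedNothingCluster:
    """Toy model: every resource has exactly one owning node at a time."""

    def __init__(self, nodes):
        self.up = set(nodes)   # nodes currently running
        self.owner = {}        # resource name -> owning node

    def bring_online(self, resource, node):
        """Give a running node exclusive ownership of a resource."""
        if node not in self.up:
            raise ValueError(f"{node} is not running")
        self.owner[resource] = node

    def fail_node(self, failed):
        """Mark a node as failed and move its resources to a survivor."""
        self.up.discard(failed)
        if not self.up:
            raise RuntimeError("no surviving node; a site-level remedy is needed")
        # Assumption: pick the first surviving node in sorted order.
        target = sorted(self.up)[0]
        for resource, node in self.owner.items():
            if node == failed:
                self.owner[resource] = target
        return target
```

For example, with nodes FS-151 and FS-152, bringing a disk and a database instance online on FS-151 and then calling fail_node("FS-151") leaves both resources owned by FS-152. At no point do two nodes own the same resource, which is why workloads in this model cannot scale across nodes.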

3.1.1 Install MSCS and Configure the Example Clusters

Microsoft Cluster Service is easily installed on any cluster hardware configuration listed on the
Microsoft hardware compatibility list (http://www.microsoft.com/hcl/default.asp, search for All
Products of type Cluster). Although the initial steps to begin installing and configuring MSCS differ
based on the underlying operating system, the overall process is similar and takes only a few
minutes per cluster node. The MSCS installation and cluster configuration process is described in
detail in the documentation accompanying your Windows operating system software and also in the
Installing MSCS lab module in the online course Introduction to Oracle Fail Safe, which is available
through Oracle University Online Learning (http://www.oracle.com/education/oln/index.html).
Note that you must first install MSCS and create a working cluster before you can install Oracle Fail
Safe. Other Oracle program executable software (database and application software) can be
installed on a private disk on each cluster node before or after MSCS installation (refer to the Oracle
Fail Safe Installation Guide for more information).
The example configuration uses two clusters, one for the primary database and one for the standby
database. Each cluster consists of two identically configured nodes, and each node must have:
• Sufficient private disk space and memory for the operating system and all required Oracle
software
• Sufficient private disk storage (either on the system drive or on additional local drives) to
create Oracle home directories for Oracle9i Database Enterprise Edition and Oracle Fail Safe
In addition, each cluster should be configured with at least two MSCS cluster disk resources (one for
the MSCS cluster quorum resource and one for the database files). For optimal performance,
databases are often deployed using multiple disk resources (for example, separate disk arrays for
the data files associated with different tablespaces and for the log and control files); however, for
the example, a single array will be used on each cluster for all database files. In all cases, the
physical configuration for the primary and standby databases must be identical (although the drive
letters used for the cluster disks on each cluster can differ).
Note that although the primary and standby clusters do not have to be identical, using identical
clusters makes administration and management easier. Figures 2 and 3 provide Microsoft Cluster
Administrator screen views that encapsulate the initial states of the primary cluster (FS-150) and
standby cluster (FS-240) used in the example configuration after MSCS has been installed and
configured. Both clusters are two-node Windows 2000 Advanced Server clusters. Aside from some
slight differences in the number of cluster drives, each cluster is configured similarly. For each
cluster in the example configuration, disk H: is reserved for all the database files, such as the
database parameter, data, log, archive log, and control files.

Figure 2: Initial State of Primary Cluster

Figure 3: Initial State of Standby Cluster

3.2 Validate the Cluster Network Configuration


To ensure proper network name resolution on each cluster node, perform the following steps:
1. Update the hosts file in the \system32\drivers\etc operating system directory on each
cluster node to add entries for each cluster node and each Cluster Name (MSCS cluster
alias). For the example configuration, this includes FS-150, FS-151, FS-152, FS-240, FS-241,
and FS-242. Also add entries for the database virtual servers that later will be
created to host the primary and standby databases (FS-153 and FS-245, respectively, for the
example configuration).
2. From an MS-DOS command window, use the ping command to verify that each of the
preceding network names resolves to the correct public IP address on each of the primary
and standby cluster nodes. If any network names do not resolve correctly, refer to the
Network Configuration Requirements appendix in the Oracle Fail Safe Concepts and
Administration Guide for information on how to troubleshoot and correct the problem.
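
The ping check in step 2 can also be scripted. The following Python sketch compares each network name against the address you expect it to resolve to. It is a minimal illustration: the check_names function is hypothetical, and the expected addresses must come from your own network plan, because the paper does not list the IP addresses used in the example configuration.

```python
import socket


def check_names(expected, resolve=socket.gethostbyname):
    """Return {name: problem} for names that fail to resolve as expected.

    expected maps each network name to the public IP address it should
    resolve to. resolve can be replaced to test without a live network.
    """
    problems = {}
    for name, want in expected.items():
        try:
            got = resolve(name)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            problems[name] = f"unresolved ({exc})"
            continue
        if got != want:
            problems[name] = f"resolved to {got}, expected {want}"
    return problems
```

Run the check on every primary and standby cluster node, passing entries for the cluster aliases, the node names, and the two virtual server names. An empty result means every name resolved to the expected address on that node.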

4 ORACLE SOFTWARE CONFIGURATION


While the specific requirements for individual business solutions differ, the following general setup
guidelines are recommended when installing and configuring Oracle software on MSCS clusters:
• Create an Oracle home on a private disk (for example, the system disk) on each node for
each Oracle product that you plan to install. When possible, to minimize downtime during
future upgrades, use a separate Oracle home for each major component (for example, a
separate Oracle home each for the database, application software, and Oracle Fail Safe). To
allow applications to fail over, ensure that the Oracle homes on each system are named in
the same way (for example, name the Oracle Fail Safe home on each system ofs_home and
the database home on each system dbs_home).
• Install all necessary Oracle product executables into the Oracle home (or homes) on each
node. Generally, it is best to install all Oracle products that you plan to configure for high
availability before you install Oracle Fail Safe (to ensure that Oracle Fail Safe installs the
proper resource DLLs to manage the Oracle software resources on the system).
• Place the Oracle Data Guard configuration files and all database data files, control files, log
files, and archive log files on cluster disks so that they can fail over from one cluster node to
another, when necessary. Allocate storage resources carefully so that independent
workloads can be configured into separate groups.
• If you are planning to use Oracle Fail Safe to configure primary and standby databases on
MSCS clusters (as in the example configuration shown in Figure 1), create the initial
primary/standby configuration using Data Guard Manager before you configure the
databases with Oracle Fail Safe.
The installed locations of the Oracle software components required to create the example
configuration are shown in Figure 4. Refer to your Oracle product documentation to ensure that you
have allocated sufficient disk and memory resources on each cluster node for the products you plan
to install. Note that each node must have sufficient resources to handle not only its own normal
workload but also any additional workloads that potentially could fail over from other nodes.

Figure 4: Location of Software Components for Example Configuration
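
The sizing note above (each node must carry its own workload plus any workload that could fail over to it) can be checked with simple arithmetic. The Python sketch below assumes the worst case for a small cluster, in which a single node ends up hosting every workload. The function name and the memory figures are hypothetical; take real requirements from your Oracle product documentation.

```python
def can_absorb_failover(capacity_mb, workloads_mb):
    """Report, per node, whether it could host every workload in the cluster.

    capacity_mb:  node name -> memory available for Oracle workloads (MB)
    workloads_mb: node name -> memory used by that node's normal workload (MB)
    Assumption: in the worst case all workloads fail over to one node.
    """
    total = sum(workloads_mb.values())
    return {node: capacity >= total for node, capacity in capacity_mb.items()}
```

A node that reports False here would be overcommitted after a failover, even though it runs its own normal workload without difficulty.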

4.1 Install and Configure Oracle9i Database Enterprise Edition


Use the Oracle Universal Installer to install Oracle9i Database Enterprise Edition into an Oracle
home directory on a private disk on each cluster node (four installations altogether for the example
configuration). Note that Oracle Data Guard is installed automatically when you install Oracle9i
Database Enterprise Edition. Use the same Oracle home name on each cluster node (dbs_home in
the example). Because the database data, log, and control files must be located on cluster disks and
not the private disks that contain the Oracle home installation directories, select the Software Only
installation option on the Database Configuration installer window to perform each installation
without creating a database.

4.1.1 Create Initial Network Configuration Files

After the database installations are complete, use the Oracle Net Configuration Assistant (NetCA) to
create a default Oracle listener service and the initial network configuration files on each primary
and standby cluster node. Optionally, you can use the following command line to execute NetCA in
silent mode and create the required listener.ora, sqlnet.ora, and tnsnames.ora configuration files:

"Program Files\Oracle\jre\1.1.8\bin\jre.exe" -Duser.dir=C:\Oracle\network\jlib -classpath ";C:\Program Files\Oracle\jre\1.1.8\lib\rt.jar;C:\Oracle\jlib\ewt3.jar;C:\Oracle\jlib\ewtcompat-3_3_15.jar;C:\Oracle\network\jlib\NetCA.jar;C:\Oracle\network\jlib\netcam.jar;C:\Oracle\jlib\netcfg.jar;C:\Oracle\jlib\help3.jar;C:\Oracle\jlib\oracle_ice5.jar;C:\Oracle\jlib\share.jar;C:\Oracle\jlib\swingall-1_1_1.jar;C:\Program Files\Oracle\jre\1.1.8\lib\i18n.jar;C:\Oracle\jlib\srvm.jar;C:\Oracle\network\tools" oracle.net.ca.NetCA /orahome C:\Oracle /orahnam OUIHome /instype typical /inscomp client,oraclenet,javavm,server,ano /insprtcl tcp,nmp,tcps /cfg local /authadp NO_VALUE /nodeinfo NO_VALUE /responseFile C:\Oracle\network\install\netca_typ.rsp

Replace each occurrence of C:\Oracle with your ORACLE_HOME directory path, verify that the
path for jre.exe is correct, and execute the command from the MS-DOS command prompt on each
cluster node. To ensure that the network configuration files are created consistently and correctly on
each cluster node, delete or rename any previously existing listener.ora, sqlnet.ora, or tnsnames.ora
files in the Oracle home network\admin directory before you execute this command.
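
Renaming the existing files, rather than deleting them, keeps a copy to fall back on. The Python sketch below sets aside any of the three configuration files found in a given directory. It is a minimal illustration: the function name and the .bak suffix are assumptions, and the directory you pass must be the network\admin directory of your own Oracle home.

```python
from pathlib import Path

# The three files that NetCA is expected to create, per the text above.
NET_FILES = ("listener.ora", "sqlnet.ora", "tnsnames.ora")


def set_aside_network_files(admin_dir, suffix=".bak"):
    """Rename existing network configuration files and return the new names."""
    renamed = []
    for name in NET_FILES:
        src = Path(admin_dir) / name
        if src.exists():
            dst = src.with_name(name + suffix)
            src.replace(dst)  # overwrites an older backup of the same name
            renamed.append(dst.name)
    return renamed
```

Run it once on each cluster node before executing the NetCA command so that every node starts from the same empty state.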

4.1.2 Create Primary Database

After completing the initial network configuration, use Oracle Database Configuration Assistant
(DBCA) to create the primary database on the primary cluster using the selected cluster disks (Disk
H: in the example). If the cluster disks selected for the database files are not already owned by the
node where you are running DBCA, use Microsoft Cluster Administrator to move these disks to that
node. Table 2 lists the input values used for the initial DBCA windows in the example:

DBCA Input              Value Used for Example Configuration
Template Name           General Purpose
Global Database Name    testdb1.us.oracle.com
SID                     testdb1
Connection Option       Dedicated Server Mode
Table 2: Initial DBCA Input Values Used for Example Primary Database
At the DBCA Initialization Parameters screen shown in Figure 5, make any necessary changes to
ensure that the primary database is configured with ARCHIVELOG mode enabled and that all
database data, log, and control files are correctly located on cluster disks. The following list of
changes is typical for most databases:
1. Click the Archive tab, then click File Location Variables and create variables to specify the
cluster disk locations to be used for the database files. The example uses a variable
DB_FILES with value H:\oracle.
2. Under the Archive tab:
• Enable Archive log mode
• Enable Automatic archival
• Replace ORACLE_BASE with the appropriate cluster disk location variable (DB_FILES
for the example configuration) for all file locations.
3. Under the File Locations tab:
• Ensure that the Create server parameters file (spfile) box is checked.
• Replace ORACLE_BASE and ORACLE_HOME with the appropriate cluster disk location
variable (DB_FILES in the example) for all file locations except for the initialization
parameter file location, as shown in Figure 5.
4. Click All Initialization Parameters, scroll through the parameter list, and check that the
values assigned to all relevant database parameters use cluster disks. For the example, the
value assigned to the control_files parameter was changed in three places to use DB_FILES
instead of ORACLE_BASE.
5. Click 1H[W to continue.

Figure 5: DBCA Initialization Parameters Screen


In the Database Storage window, review all file locations and replace any ORACLE_HOME or
ORACLE_BASE references with the appropriate cluster disk location as needed. For the example, all
file locations were updated (control files, data files, and redo log files). After these changes are
made, click Next to continue.
Optionally, in the Creation Options window (in addition to creating the database), you can save the
modified database configuration as a template. This step is recommended if you plan to create other
similar disaster-tolerant high availability configurations in the future.
When all changes are complete, click Finish and review the Summary report to verify that all
database files will be created on cluster disks. Figure 6 shows a portion of the Summary report for
the primary database used in the example configuration. Note that all displayed file locations use
the previously created DB_FILES file location variable instead of the default ORACLE_HOME or
ORACLE_BASE file location variables.

Figure 6: DBCA Summary Report for Example Primary Database


At the end of the database creation process, you will be prompted to enter new passwords for the
default SYS and SYSTEM database user accounts.
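After the database has been created, the settings chosen in DBCA can be double-checked from SQL*Plus. The following is a minimal sketch rather than part of the original procedure; it assumes a SYSDBA connection to the new primary database. For the example configuration, the log mode would be expected to read ARCHIVELOG and every file name to begin with H:\oracle.

```sql
-- Confirm that the database runs in ARCHIVELOG mode
SELECT log_mode FROM v$database;

-- List control files, data files, and online redo log members;
-- each path should point to a cluster disk rather than to
-- ORACLE_HOME or ORACLE_BASE
SELECT name FROM v$controlfile;
SELECT name FROM v$datafile;
SELECT member FROM v$logfile;
```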

4.1.3 Start Instance and Listener Services and Connect to Primary Database

From the Services Control Panel, start the Oracle primary database instance and listener services (if
they are not already started). Depending on how your system is configured, you also may need to
create entries for the database in the listener.ora and tnsnames.ora network configuration files
before you can connect to the primary database. Update these files if necessary and then be sure to
stop and restart the Oracle Listener and Oracle Intelligent Agent processes on any system where you
changed these files. When the network environment is configured correctly, use SQL*Plus to
connect to the database and verify that you are able to query the database successfully (for
example, by executing SELECT * FROM ALL_USERS from the SYSTEM account).
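A connection test along these lines might look as follows in SQL*Plus. This is a sketch only: it assumes that a net service name testdb1 has been defined in tnsnames.ora for the primary database, and <password> stands for the SYSTEM password chosen at the end of database creation.

```sql
-- SQL*Plus: connect through the listener
-- (testdb1 is an assumed net service name; <password> is a placeholder)
CONNECT system/<password>@testdb1

-- Simple query to confirm that the database is open and reachable
SELECT username FROM all_users;
```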

4.1.4 Specify Location for Oracle Data Guard Configuration Files

Because the Oracle Data Guard configuration files must be accessible by whichever cluster node
hosts the primary database virtual server, these files must be located on shared-nothing cluster disks
(usually the same disks used for the database data, control, and log files). To specify the location to
be used when these files are created, use SQL*Plus to connect to the primary database through the
SYS database user account (as SYSDBA) and, at the SQL prompt, enter the following commands:
SQL> alter system set dg_broker_config_file1 = '<path>\dr1<instance_name>.dat' scope=both;

SQL> alter system set dg_broker_config_file2 = '<path>\dr2<instance_name>.dat' scope=both;

where <path> is the shared-nothing cluster disk location where you want these files to be created
(H:\oracle\database in the example) and <instance_name> is the SID for the primary
database (testdb1 in the example). If the cluster disk directories specified in <path> do not
already exist, be sure to create them. The scope=both qualifier ensures that this change is written
both to memory and to the database system parameter file (spfile) on disk.
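For the example configuration, with <path> set to H:\oracle\database and <instance_name> set to testdb1, the commands would read as sketched below. The two verification statements at the end are an addition to the procedure, not part of the original steps; they display the value in effect in memory and the value recorded in the spfile.

```sql
ALTER SYSTEM SET dg_broker_config_file1 = 'H:\oracle\database\dr1testdb1.dat' SCOPE=BOTH;
ALTER SYSTEM SET dg_broker_config_file2 = 'H:\oracle\database\dr2testdb1.dat' SCOPE=BOTH;

-- Value currently in effect (SQL*Plus command)
SHOW PARAMETER dg_broker_config_file

-- Value recorded in the server parameter file
SELECT name, value FROM v$spparameter WHERE name LIKE 'dg_broker_config_file%';
```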

4.1.5 Specify Standby Archive Log File Destination

The standby_archive_dest parameter for the primary database is used only if the database is later
reconfigured as a standby database. By default, it is set to %ORACLE_HOME%\RDBMS. However,
because the standby archive log files must be accessible by whichever cluster node hosts the
primary database virtual server, these files must be located on shared-nothing cluster disks (usually
the same disks used for the database data, control, and log files). To specify the required cluster
disk location for the standby archive log files, use SQL*Plus to connect to the primary database
through the SYS database user account (as SYSDBA) and, at the SQL prompt, enter the following:
SQL> alter system set standby_archive_dest = '<path>' scope=both;

where <path> is the shared-nothing cluster disk directory where you want these files to be created
(H:\oracle\oradata\testdb1\standby_archive in the example). If the cluster disk
directories specified in <path> do not already exist, be sure to create them. The scope=both
qualifier ensures that this change is written both to memory and to the database system parameter
file (spfile) on disk.
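With the example directory substituted for <path>, the command would look like the sketch below. The SHOW PARAMETER line is an added check, not part of the original steps, and assumes the same SYSDBA session.

```sql
ALTER SYSTEM SET standby_archive_dest = 'H:\oracle\oradata\testdb1\standby_archive' SCOPE=BOTH;

-- Display the value now in effect (SQL*Plus command)
SHOW PARAMETER standby_archive_dest
```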

4.2 Create Initial Primary/Standby Database Configuration


Data Guard Manager automates the process of configuring primary and standby databases into a
single easily managed disaster-tolerant solution. Because Data Guard Manager is integrated with
Oracle Enterprise Manager and relies on the Oracle Intelligent Agent to perform database discovery,
it is first necessary to configure Oracle Enterprise Manager and ensure that the Oracle Intelligent
Agent is running on each cluster node. The steps to configure Oracle Enterprise Manager and to use
Data Guard Manager to discover and configure the initial primary/standby configuration are
described later in this section.
The example disaster-tolerant high availability solution makes use of databases hosted by node-
independent virtual servers. To ensure that Oracle Enterprise Manager and Data Guard Manager
correctly configure and discover the disaster-tolerant high availability configuration, the Oracle Data
Guard Create Configuration Wizard is invoked twice, as summarized in the following sequence of
events:
1) Use Oracle Enterprise Manager to discover the primary database using the default (node-
specific) Oracle Intelligent Agent running on the primary cluster node that hosts the primary
database.
2) Invoke the Oracle Data Guard Create Configuration Wizard to create the standby database
and to create an initial Oracle Data Guard configuration with the primary database hosted
by one of the primary cluster nodes and the standby database hosted by one of the standby
cluster nodes.
3) Verify that this initial primary/standby configuration is working correctly and resolve any
configuration issues.
4) Remove the initial Oracle Data Guard configuration information from the Data Guard
Manager tree view.
5) Delete the primary and standby cluster nodes and associated resources from the Oracle
Enterprise Manager tree view, then stop and disable the default Oracle Intelligent Agent on
each node to ensure that there will not be any resource discovery conflicts later.
6) Use Oracle Fail Safe Manager to create a virtual server on the primary cluster for the primary
database and a virtual server on the standby cluster for the standby database.
7) Add an Oracle Intelligent Agent to each virtual server so that Oracle Enterprise Manager and
Data Guard Manager can discover the resources hosted by each virtual server. (Note that
Oracle Fail Safe only allows an Intelligent Agent resource to be added to a virtual server
group that already contains a database resource.)
8) Discover each virtual server and the highly available database it hosts using Oracle
Enterprise Manager.
9) Use the Oracle Data Guard Create Configuration Wizard to create the final Oracle Data
Guard configuration (with each database now hosted by a highly available virtual server).
Steps 1 through 5 in this list are covered in the remainder of this section, while steps 6 through 9
are covered in sections and . Refer to the Oracle Data Guard product documentation listed in
section of this paper if your configuration differs from that used in the example.

4.2.1 Configure Oracle Enterprise Manager

If you have already configured Oracle Enterprise Manager and the Oracle Management Server on a
separate management system, you optionally can start Oracle Enterprise Manager on that system
and skip ahead to section . Note that, depending on your environment, your Oracle Enterprise
Manager tree view may differ from that shown in this paper.
If you have not already configured Oracle Enterprise Manager on another system, you optionally
can configure Oracle Enterprise Manager now using the Enterprise Manager Configuration Assistant
(EMCA). For production deployments, you should always put Oracle Management Server and the
Oracle Enterprise Manager repository database on a separate system so that you will not lose access
to the repository when you take one of the primary or standby sites offline. However, for purposes
of illustration, one of the primary cluster nodes (FS-152) will be used in the example configuration.
Oracle Enterprise Manager is installed automatically when you install Oracle9i. From the system
where you plan to configure Oracle Enterprise Manager, select:
Start > Programs > Oracle - <oracle_home>
where <oracle_home> is the name of the previously created Oracle9i Database Enterprise Edition
home (dbs_home for the example configuration). Then, to open the Oracle Enterprise Manager
Configuration Assistant, choose:
Configuration and Migration Tools > Enterprise Manager Configuration Assistant
On the second EMCA window (Configuration Operation), select Configure local Oracle Management
Server. For the third window, choose Create a new repository. Choose Typical for the Create New
Repository Options on the fourth window. Record the username and password information from the
Create Repository Summary window for future use and click Finish to complete the Oracle
Enterprise Manager configuration process.

4.2.2 Start the Oracle Management Server

The steps in section should automatically start Oracle Management Server. However, if
necessary, start Oracle Management Server from the command-line prompt by entering the
command oemctl start oms. Note that the Oracle Management Server must be able to connect
to the Oracle Enterprise Manager repository database in order to start. If you have difficulty starting
the Oracle Management Server, verify that the repository database is configured correctly and that
the corresponding instance and listener services are started.

4.2.3 Start Oracle Intelligent Agent on All Cluster Nodes

During the Oracle9i Database Enterprise Edition installation process, an Oracle Intelligent Agent
process was created for each primary and standby cluster node. Issue the command agentctl
status from the MS-DOS command line on each cluster node to determine the status of the Agent
on each node. For any node where the Agent is not already started, start the Agent from an MS-DOS
command window by issuing the command agentctl start. Note that in general, any time the
configuration on a cluster node changes, you will need to stop and restart the Oracle Intelligent
Agent on that node to allow the new changes to be discovered by the Agent.

4.2.4 Open the Oracle Enterprise Manager Console

To open Oracle Enterprise Manager, choose:


Start > Programs > Oracle - <oracle_home>
where <oracle_home> is the name of the previously created Oracle9i Database Enterprise Edition
home (dbs_home for the example configuration). Then, to open the Enterprise Manager Console,
choose:
Enterprise Manager Console
(Optionally, you can also open the Enterprise Manager Console by executing the command oemapp
console from the command-line prompt.) When the login dialog opens, choose Login to the
Oracle Management Server and connect to an existing Oracle Management Server (for example, the
one created in section ). Do not choose Launch standalone, because Data Guard Manager will
not be available from the Enterprise Manager Console if you select this option. The default Oracle
Management Server administrator login account is sysman with an initial default password of
oem_temp; if prompted, be sure to change the default password and record the new value.

4.2.5 Discover Primary and Standby Cluster Nodes

Run the Enterprise Manager Discovery Wizard, also referred to as the Discovery Wizard, to discover
each node of the primary and standby clusters and to gain access to the databases that you want to
configure and administer with Data Guard Manager. To invoke the Discovery Wizard from the
Enterprise Manager Console menu bar, choose:
Navigator > Discover Nodes
Follow the directions in the Discovery Wizard to discover each of the nodes in the primary and
standby clusters (FS-151, FS-152, FS-241, and FS-242 in the example configuration). When finished,
all discovered nodes and databases are displayed in the Enterprise Manager navigator tree. For the
example configuration, Oracle Enterprise Manager discovers and displays the following, as shown in
Figure 7:
• On the node where the primary database was created (FS-151 in the example
configuration), the wizard discovers the primary database (testdb1.us.oracle.com).
• In addition, if you optionally used EMCA to create a repository database on one of the
cluster nodes, the Oracle Enterprise Manager repository database (OEMREP.us.oracle.com) is
discovered on the node where it was created (FS-152 for the example configuration).
• On all cluster nodes, the wizard finds the Oracle home where you have installed Oracle9i
Database Enterprise Edition.

Figure 7: Oracle Enterprise Manager Tree View After Node Discovery
For other configurations, the Oracle Enterprise Manager tree view may display additional resources.

4.2.6 Set Preferred Credentials on Primary and Standby Nodes

You must set preferred credentials on each of the primary and standby cluster nodes to ensure Data
Guard Manager can run remote processes to create the configuration. To set preferred credentials
from the Enterprise Manager Console menu bar, select:
Configuration > Preferences > Preferred Credentials
For each cluster node, specify an account with administrator privileges on that system. Note that the
selected account also must be granted the "Log on as a batch job" user right on that system. After setting
the preferred credentials, verify that you are able to use Oracle Enterprise Manager to successfully
run a small test job on each cluster node (for example, execute a system dir command).
Although setting preferred credentials for databases is not required, you also might want to set
preferred credentials (for example, the SYS account) for the primary database (and also later for the
standby database when it is created).

4.2.7 Open Data Guard Manager

Once the preceding steps have been completed, you can open Data Guard Manager from the
command-line prompt or from the Enterprise Manager Console:
• From the command-line prompt, enter oemapp dataguard.
• From the Oracle Enterprise Manager Console, use either of the following methods:
o Choose Tools > Database Applications > Data Guard Manager.
o From the Database Applications drawer, move the cursor over the icons and select
the Data Guard Manager icon.

After opening Data Guard Manager, you should see the initial screen view shown in Figure 8:

Figure 8: Data Guard Manager

4.2.8 Create the Initial Oracle Data Guard Configuration

The steps to create the initial Oracle Data Guard configuration are described in this section. To
open the Create Configuration Wizard, right-click Oracle Data Guard Configurations in the navigator
tree and choose Create Configuration Wizard. Figure 9 shows the initial welcome screen for the
Create Configuration Wizard.

Figure 9: Create Configuration Wizard - Welcome
The wizard takes you through the following five steps:
1. Verify the initial Oracle Data Guard configuration requirements.
2. Provide a configuration name.
3. Choose a primary database.
4. Choose how you want to add a standby database:
• Import an existing standby database.
• Create a new physical or logical standby database.
5. Verify the information you supplied to the wizard and make changes, if necessary.
Each of these steps is described in more detail for the example configuration in the remainder of
this section. Refer to the Data Guard Manager online help system and to the Oracle9i Data Guard
Broker Concepts manual for complete information.

4.2.8.1 Verify the Initial Oracle Data Guard Configuration Requirements

Click Details on the Create Configuration Wizard welcome page (see Figure 9) and review the
checklist of setup requirements and information that is displayed. If necessary, make any additional
changes required to set up the Oracle Data Guard environment on the primary and standby clusters.
Click Next to continue.

4.2.8.2 Provide a Configuration Name

Enter a unique Oracle identifier for the name of the new Oracle Data Guard configuration. Figure 10
shows the Configuration Name window in which the example configuration name (testdb_config)
has been entered. Click Next to continue.

Figure 10: Create Configuration Wizard - Configuration Name

4.2.8.3 Choose a Primary Database

Select the primary database from the list of discovered databases. As shown in Figure 11, the
selected primary database for the example is TESTDB1.us.oracle.com. Accept the default primary
site name; this site will be deleted and replaced by a new site after Oracle Fail Safe Manager
configures the primary and standby databases for failover.
Verify that the cluster disks used by the primary database are owned by the node where the
database was created. If necessary, use Microsoft Cluster Administrator to move the disks to this
node. Ensure that the database instance is started and that you can connect to the database. Click
Next to continue.

Figure 11: Create Configuration Wizard – Choose Primary Database

4.2.8.4 Create a New Physical Standby Database

The wizard allows you to create a new physical or logical standby database or to add an existing
standby database. For the example, choose Create a New Physical Standby Database (as shown in
Figure 12) and click Next to continue.

Figure 12: Create Configuration Wizard – Standby Creation Method

At this point, the wizard verifies the primary database configuration and then displays a sequence of
screens to collect the information required to create the new physical standby database. In the first
of these screens, shown in Figure 13, specify the standby cluster node that currently owns the
cluster disks where the standby database files will be located (node FS-241 for the example
configuration) and accept the default site name. If necessary, use Microsoft Cluster Administrator to
move the cluster disks to the selected node of the standby cluster.

Figure 13: Create Configuration Wizard – Standby Oracle Home

In the next screen, specify the directory location where the standby database data files should be
copied. To allow the database to be configured with Oracle Fail Safe, all files must be located on
MSCS shared-nothing cluster disks. For the example, as shown in Figure 14, all database data files,
log files, archive log files, and control files will be copied to the cluster disk directory
H:\oracle\standby\. After entering the directory location, click Next.

Figure 14: Create Configuration Wizard - Data file Copy Location

Click Yes when prompted to create the directory, as shown in Figure 15, and review the standby
database options displayed in the next window, as shown in Figure 16. Click View/Edit
Initialization File and ensure that the two Data Guard Manager configuration file locations
(specified using the dg_broker_config_file1 and dg_broker_config_file2 parameters) use the correct
cluster disk directories (refer to section for information on the corresponding location of these
files for the primary database). For the example configuration, the paths for the Data Guard
Manager configuration files were edited to use the directory H:\oracle\standby\. In most
cases, you can accept the default values for all other parameters and information shown in the
options window. Click Next to continue.

Figure 15: Create Directory Dialog

Figure 16: Create Configuration Wizard - Standby Database Options

4.2.8.5 Verify Wizard Information

Review the Create Configuration Wizard Summary window (shown in Figure 17) and verify that the
information displayed for the primary and standby sites is correct. If you find an error, click Back to
move backward through the wizard screens and make the needed changes. When the information is
correct, click Finish. The wizard displays a report similar to that shown in Figure 18 that records
progress while the configuration is created.

Figure 17: Create Configuration Wizard - Summary

Figure 18: Create Configuration Wizard Progress Report

4.2.9 Validate Initial Primary/Standby Configuration

After closing the Create Configuration Wizard progress report, use Data Guard Manager to connect
to the newly created configuration. You will be prompted to enter the database username and
password required to connect to the configuration, as shown in Figure 19. Once connected to the
configuration, expand the Data Guard Manager tree view and verify that the configuration
properties are similar to those shown in Figure 20.
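In addition to inspecting the tree view, redo transport and apply can be spot-checked from SQL*Plus. The following sketch is not part of the wizard's procedure; it assumes SYSDBA connections to both databases and relies on the standard dynamic performance views V$ARCHIVE_DEST and V$ARCHIVED_LOG.

```sql
-- On the primary: force a log switch so that a new archived log is shipped
ALTER SYSTEM SWITCH LOGFILE;

-- On the primary: check each configured archive destination for errors
SELECT dest_id, status, error
  FROM v$archive_dest
 WHERE status <> 'INACTIVE';

-- On the physical standby: confirm that archived logs arrive and are applied
SELECT sequence#, applied
  FROM v$archived_log
 ORDER BY sequence#;
```

If a time delay is later configured for the standby database, recently received logs may legitimately remain unapplied until the delay has elapsed.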

Figure 19: Data Guard Manager Configuration Connection Dialog

Figure 20: Initial Configuration Tree View and Property Information

4.2.10 Optionally, Specify a Time Delay for Applying Archived Redo Log Files

By default, a physical standby database automatically applies archived redo logs when they arrive
from the primary database. A logical standby database automatically applies SQL statements once
they have been transformed from the archived redo logs. But in some cases, you may want to create
a time lag between the archiving of a redo log at the primary site and the applying of the redo log
at the standby site. A time lag can protect against the application of corrupted or erroneous data
from the primary site to the standby site. For example, if the problem is detected on the primary
database before the logs have been applied to the standby database, administrators have the option
to switch over operations to the unaffected standby database (where the problem has not yet
propagated), effectively rolling back the clock to a point in time before the problem occurred.
To specify a time lag for applying redo logs at the standby site:
• Select the standby database in the Data Guard Manager tree view.
• Click on the Properties tab.
• Locate the DelayMins property and enter the desired redo log application delay in minutes.
• Click Apply.
Changing the DelayMins property for a standby database updates the DELAY attribute of the
corresponding LOG_ARCHIVE_DEST_n initialization parameter for the primary database. For the
example configuration, a value of 30 minutes was entered, as shown in Figure 21.
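One way to confirm that the new delay has reached the primary database is to query the archive destinations from SQL*Plus. This is a sketch under the assumption that you are connected to the primary database as SYSDBA; which destination number n is used for the standby depends on your configuration.

```sql
-- DELAY_MINS reflects the DELAY attribute of each LOG_ARCHIVE_DEST_n setting;
-- for the example configuration, the standby destination would be expected to show 30
SELECT dest_id, destination, delay_mins
  FROM v$archive_dest
 WHERE status = 'VALID';
```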

Figure 21: Optionally, Specify a Time Delay for Applying Redo Logs

4.2.11 Delete Initial Primary/Standby Configuration

At this point, Data Guard Manager has successfully created and configured the initial standby
configuration. However, when the primary and standby databases are configured with Oracle Fail
Safe, client access to these databases will change from using node-specific network addresses to
node-independent virtual addresses and the initial Oracle Data Guard configuration and the
database information stored in the Oracle Enterprise Manager repository will not be valid. Because
of this, it is necessary to remove the initial Oracle Data Guard configuration from the Data Guard
Manager tree view and to delete the initially discovered primary and standby cluster nodes and
database resources from the Enterprise Manager tree view. Once the Oracle Fail Safe configuration
process is completed, these tree views will be updated with the final disaster-tolerant high
availability configuration (as described in sections and ).
To remove the initial configuration information, right-click the name of the initial Oracle Data Guard
configuration (testdb_config in the example) and then click Remove in the pop-up menu. In the
resulting window, as shown in Figure 22, ensure that the Remove Oracle Data Guard Configuration
Permanently and Remove All Destinations in Configuration options are chosen. This leaves each
database in place, but stops transport and application of logs to the standby database.

Figure 22: Remove Oracle Data Guard Configuration Window


After the removal operation completes, exit Data Guard Manager. Then, from the Oracle Enterprise
Manager console, delete the tree view entries for the primary and standby cluster nodes. This
removes the nodes and all discovered targets hosted by the nodes from the Oracle Enterprise
Manager repository database. (To delete a node from the tree view, right-click the node and then
choose Delete.) After the nodes are deleted from the Oracle Enterprise Manager tree view, exit the
Oracle Enterprise Manager console.

4.2.12 Stop and Disable Default Oracle Intelligent Agents

Finally, to ensure that there will be no resource discovery conflicts in later steps, you must stop and
then disable the default Oracle Intelligent Agent service (Oracledbs_homeAgent in the example) on
each cluster node. To do this, open the Windows Services Control Window and right-click the
Oracle Intelligent Agent service to open the properties page for the service, as shown in Figure 23.

Figure 23: Default Oracle Intelligent Agent Property Page


Note that, currently, the default Intelligent Agent discovers all resources hosted by a given system,
regardless of whether they are accessed through a node-specific IP address or through a virtual
server IP address. By disabling the default Intelligent Agent, you will not be able to discover any
node-specific resources. However, you will be able to discover any highly available virtual servers
created using Oracle Fail Safe that contain an Oracle Intelligent Agent resource. Because the highly
available Intelligent Agent configured for each virtual server only monitors resources that are
accessed through the IP addresses associated with that specific virtual server, there are no resource
discovery conflicts between the Intelligent Agents in different virtual server groups (even when
multiple virtual servers are hosted by the same physical cluster node). As shown in Figure 58, each
discovered virtual server appears in the Oracle Enterprise Manager tree view as if it were a separate
physical node.

4.3 Install Oracle Fail Safe


Oracle Fail Safe ships on a separate CD-ROM in the Oracle database media pack for Windows.
Locate the CD-ROM and use it to install Oracle Fail Safe into a new Oracle home directory on a
private disk on each node of the primary and standby clusters. Note that Oracle Fail Safe release 3.3
or later is required to create the disaster-tolerant high availability configurations discussed in this
paper. The screen views in the sections that follow summarize the Oracle Fail Safe installation
process. Refer to the online Oracle Fail Safe Installation Guide and Release Notes included on the
Oracle Fail Safe distribution media for complete installation instructions.

4.3.1 Open the Oracle Universal Installer

Insert the Oracle Fail Safe CD-ROM into the CD-ROM drive of one of the primary or standby cluster
nodes. From the initial Autorun screen, click Install/Deinstall Products to open the Oracle Universal
Installer Welcome screen shown in Figure 24. If the Autorun screen is not displayed on your system
after the Oracle Fail Safe CD-ROM is inserted, you can start the Oracle Universal Installer using the
setup.exe program located in the \install\Win32\ directory on the CD-ROM. Click Next to continue.

Figure 24: Oracle Universal Installer: Welcome

4.3.2 Specify File Locations

In the File Locations window, accept the default source location and specify the name and location
(on a private disk) for the Oracle home directory where Oracle Fail Safe is to be installed. To ensure
that software components can fail over correctly, the Oracle home where Oracle Fail Safe is
installed must have the same name on each cluster node; for the example, the Oracle Fail Safe
installation home is ofs_home, as shown in Figure 25. After entering the required information, click
Next to continue.

Figure 25: Oracle Universal Installer: File Locations

4.3.3 Select Oracle Fail Safe

From the Available Products window, choose Oracle Fail Safe (as shown in Figure 26) and then
click Next to continue.

Figure 26: Oracle Universal Installer: Available Products

4.3.4 Installation Types

In the Installation Types window shown in Figure 27, choose Typical, and then click Next to
continue.

Figure 27: Oracle Universal Installer: Installation Type

4.3.5 Reboot Needed After Installation

A Reboot Needed After Installation window, similar to that shown in Figure 28, warns you to reboot
the system after the installation is complete. Note that this window is not displayed if you have
previously installed Oracle Fail Safe components from this release and the changes to the system
path and Oracle resource DLL have been made and detected previously. Click Next to continue.

Figure 28: Oracle Universal Installer: Reboot Needed After Installation

4.3.6 Review Summary Information

Review the installation summary screen, which should be similar to that shown in Figure 29, and
then click Install to begin installing the selected software components. Note that if there is
insufficient space to perform the installation, the text below Space Requirements is displayed in red.

Figure 29: Oracle Universal Installer: Summary

The Install window, shown in Figure 30, displays the progress of the installation, including the
names of the files that are being installed.

Figure 30: Oracle Universal Installer: Install

4.3.7 Enter Domain User Account for Oracle Services for MSCS

If the installation is successful, the Configuration Tools window and the Oracle Services for MSCS
Account/Password dialog box are displayed, as shown in Figure 31.
In the Oracle Services for MSCS Account/Password dialog box, enter the domain, user name, and
password of an operating system user account that has Administrator privileges. This is the account
that Oracle Services for MSCS will be using. Oracle Services for MSCS runs as a Windows service
(called OracleMSCSServices) under a user account that must be a domain user account (not the
system account) that has Administrator privileges on all cluster nodes. The account must be the
same on all cluster nodes, or you will receive an error message when you attempt to connect to a
cluster using Oracle Fail Safe Manager.
Enter the information in the form Domain\Username, as shown in Figure 31, or if you are using
Windows 2000, you optionally can enter a user principal name in the form
Username@DnsDomainName.

Figure 31: Oracle Services for MSCS Security Information

4.3.8 Confirm Installation and View Release Notes

At the end of the installation, the Oracle Universal Installer displays the window shown in Figure
32. Click Installed Products to confirm that Oracle Fail Safe has been successfully installed. Click
Release Information to view the Oracle Fail Safe Release Notes. Click Exit to exit the installer.

Figure 32: Oracle Universal Installer: End of Installation

4.3.9 Reboot Cluster Node

If an installer screen instructing you to reboot after the installation is complete was displayed during
the installation, reboot the cluster node. A reboot is required only for the initial installation of an
Oracle Fail Safe release or if you have installed Oracle Fail Safe into a new Oracle home (on a node
with multiple Oracle homes).

4.3.10 Install Oracle Fail Safe on Remaining Nodes

Repeat the installation steps described in sections  through  on each additional
primary and standby cluster node.

4.3.11 Verify the Primary and Standby Clusters

After Oracle Fail Safe has been successfully installed and each node of the primary and standby
clusters has been rebooted (as described in section ), open Oracle Fail Safe Manager on one of
the cluster nodes by choosing the following from the Windows taskbar:
Start > Programs > Oracle - <Oracle_Home> > Oracle Fail Safe Manager
where <Oracle_Home> is the name of the Oracle home where you installed Oracle Fail Safe. Oracle
Fail Safe Manager automatically opens the Add Cluster to Tree dialog box, as shown in Figure 33. (If
the Add Cluster to Tree dialog box is not open, choose Add Cluster to Tree from the File menu.) In
the Cluster Alias box, enter the alias for the primary cluster (FS-150 in the example) and click OK.

Figure 33: Oracle Fail Safe Manager: Add Cluster To Tree


Oracle Fail Safe Manager displays an icon for the cluster in the tree view. Right-click the cluster
icon, choose Connect, and enter the Oracle Services for MSCS administrator account and password
information in the resulting dialog box. Optionally, you can save this information by checking the
Save as Local Preferred Credentials box, as shown in Figure 34. Click OK to connect to the cluster.

Figure 34: Oracle Fail Safe Manager: Connect to Cluster

The first time you connect to a cluster after you install Oracle Fail Safe, Oracle Fail Safe Manager
prompts you to run the Verify Cluster operation to validate the MSCS cluster environment, Oracle
Fail Safe installation, and network configuration. The Verify Cluster operation displays its progress
in a Verifying Cluster window, as shown in Figure 35. (Later, you can run the Verify Cluster
operation at any time by choosing Troubleshooting > Verify Cluster from the Oracle Fail Safe
Manager menu bar. This is especially useful if you later change your cluster configuration.)

Figure 35: Oracle Fail Safe Manager: Verify Cluster


If any problems are identified during the Verify Cluster operation, correct them and repeat the
Verify Cluster operation until it is successful. Refer to the Oracle Fail Safe Concepts and
Administration Guide for help in troubleshooting any cluster related problems identified in the
status report.
Repeat the Add Cluster to Tree and Verify Cluster steps for the standby cluster (FS-240 in the
example). After successfully connecting to and verifying each cluster, expand the Oracle Fail Safe
Manager tree view to display the standalone resources on each cluster node. The tree view should
now appear similar to the view shown in Figure 36 and should contain entries for the previously
created primary and standby databases in the appropriate Standalone Resources folders. Note that
Oracle Fail Safe performs its own independent resource discovery that does not make use of either
the Oracle Intelligent Agent or Oracle Enterprise Manager.

Figure 36: Oracle Fail Safe Manager: Expanded Tree View

4.4 Configure Database Virtual Servers


At the current point in the configuration process, the primary and standby databases are each
hosted by a specific cluster node and are each configured for access through a node-specific IP
address. This section describes how to use Oracle Fail Safe Manager to configure each database so
that it is hosted by a highly available virtual server and accessed through a node-independent virtual
address.

4.4.1 Configure Database Parameter Files

Oracle Fail Safe supports either a single initialization parameter file located on the same cluster
disks as the database data, log, and control files, or a separate initialization parameter file on each
cluster node, provided that the path on each node is the same and that you manually propagate
any relevant changes to all copies of the file. Data Guard Manager expects to find an initialization
parameter file or server parameter file in the Oracle home database directory. Because Oracle Data
Guard may change the content of the Oracle9i Database server parameter file (for example, during
a site switchover), the copies can fall out of synchronization if a separate copy of the server
parameter file is maintained on each cluster node. The solution is to place a local initialization
parameter file similar to that shown in Figure 37 in the <Oracle_Home>\database directory on each
cluster node. It must contain a single line indicating the location of the server parameter file,
which, in turn, is located on the cluster disks that contain the database data, control, and log files.

Figure 37: Parameter File for Primary Database
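
As a rough way to check that a local init<SID>.ora file really is a one-line pointer of the kind described above, the following Python sketch parses the file text and returns the server parameter file location. It assumes the pointer is written with the standard SPFILE initialization parameter and that comment lines start with #; the function name is hypothetical, and the sample path is the standby location quoted elsewhere in this section.

```python
from typing import Optional


def spfile_pointer_target(init_ora_text: str) -> Optional[str]:
    """Return the SPFILE path if the text is a pure pointer file, else None."""
    # Ignore blank lines and '#' comment lines.
    active = [line.strip() for line in init_ora_text.splitlines()
              if line.strip() and not line.strip().startswith("#")]
    if len(active) != 1:
        return None  # more than one setting: not a pure pointer file
    name, sep, value = active[0].partition("=")
    if not sep or name.strip().upper() != "SPFILE":
        return None
    return value.strip().strip("'\"")


sample = "SPFILE='H:\\oracle\\standby\\spfiletestdb12.ora'\n"
print(spfile_pointer_target(sample))  # H:\oracle\standby\spfiletestdb12.ora
```

A file that still carries other parameters returns None, which signals that node-local settings could drift from the shared server parameter file.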


For the primary database, the Oracle Database Configuration Assistant should already have placed
the spfile<SID>.ora file on the correct cluster disk and created an init<SID>.ora file with the desired
contents on the node where the database was created (where <SID> represents the SID for the
primary database, testdb1 for the example). Verify that the contents of the init<SID>.ora file are
similar to Figure 37 and place a copy of this file in the same <Oracle_home>\database directory
location on all nodes of the primary cluster.
Oracle Data Guard created the initialization and server parameter files for the standby database in
the <Oracle_home>\database directory on the node where the standby database was created. To
configure these files for the standby cluster, perform the following steps:
1. Move the spfile<SID>.ora file (spfiletestdb12.ora for the example) to a directory on the
cluster disk used for the database files (H:\oracle\standby\spfiletestdb12.ora for the
example).
2. Rename the existing init<SID>.ora file to init<SID>.ora_old (which contains the information
Oracle Data Guard used to create the spfile<SID>.ora file).
3. Create a new init<SID>.ora file (similar to that shown in Figure 37) that specifies the new
location for the standby database spfile<SID>.ora file.
4. Place a copy of this new init<SID>.ora file in the same <Oracle_home>\database directory
location on all nodes of the standby cluster.
After the database parameter files have been configured on the primary and standby clusters, record
the locations of the initialization parameter files for the primary and standby databases; this
information will be used during the database verification operation described in section .
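
The four standby steps above can be sketched as a small Python script. This is a minimal illustration under stated assumptions, not a supported procedure: the function name and arguments are hypothetical, the database is assumed to be shut down while the files are moved, and the other nodes' <Oracle_home>\database directories are assumed to be reachable as ordinary paths (for example, through administrative shares).

```python
import shutil
from pathlib import Path
from typing import Iterable


def relocate_standby_spfile(database_dir: Path, cluster_disk_dir: Path,
                            sid: str, other_node_dirs: Iterable[Path]) -> Path:
    """Move the SPFILE to the cluster disk and leave a pointer init file on each node."""
    spfile = database_dir / f"spfile{sid}.ora"
    init = database_dir / f"init{sid}.ora"

    # Step 1: move the server parameter file to the cluster disk.
    cluster_disk_dir.mkdir(parents=True, exist_ok=True)
    new_spfile = cluster_disk_dir / spfile.name
    shutil.move(str(spfile), str(new_spfile))

    # Step 2: keep the original init file under an _old suffix.
    init.rename(init.with_name(init.name + "_old"))

    # Step 3: create a new one-line init file that points at the moved SPFILE.
    init.write_text(f"SPFILE='{new_spfile}'\n")

    # Step 4: copy the new init file to the same directory on the other nodes.
    for node_dir in other_node_dirs:
        node_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(init, node_dir / init.name)
    return new_spfile
```

For the example configuration, sid would be testdb12 and cluster_disk_dir would be H:\oracle\standby.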

4.4.2 Create Virtual Servers for Primary and Standby Databases

In the Microsoft Cluster Service environment, a virtual server is a group of resources that contains at
least one virtual address (a network name resource and its associated IP address resource). The
Oracle Fail Safe Create Group wizard collects the information needed to create an empty group and
then optionally allows you to add one virtual address to the group (the Add Resource to Group
wizard allows you to add additional virtual addresses after the group is created). As previously
noted (during the cluster network configuration and validation steps described in section ), the
network name that will be used for the primary database virtual server is FS-153 and the network
name that will be used for the standby database virtual server is FS-245.
To create the virtual server for the primary database, click the tree view to select the node that
currently hosts the primary database (FS-151 in the example). Then, from the Oracle Fail Safe
Manager menu, select Groups > Create to open the Create Group Wizard. In the initial window,
enter the group name and, optionally, a description. For the example, as shown in Figure 38, the
group is given the same name as the virtual server network name to make identification easier.

Figure 38: Create Group Wizard - Group Name

In the Failback Policies window, accept the default Prevent Failback option, as shown in Figure 39.

Figure 39: Create Group Wizard - Failback Policies


In general, for active/passive cluster configurations, preventing failback minimizes the number of
failovers associated with unplanned outages. By contrast, for clusters with active/active
configurations (each cluster node actively hosts database or application workloads), it usually is best
to associate workloads (groups) with preferred nodes and to enable failback so that the overall
cluster workload is automatically rebalanced when a previously failed node is restored to service.
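
The effect of the failback policy on the number of group moves can be illustrated with a toy model. The Python sketch below is an assumption-laden simplification (two nodes named A and B, one group, and a hypothetical function name); it only counts how often the group changes host and does not model MSCS itself.

```python
def count_group_moves(events, failback_enabled, preferred="A", other="B"):
    """Count how many times the group changes host for a sequence of node events."""
    host = preferred
    up = {preferred: True, other: True}
    moves = 0
    for kind, node in events:
        if kind == "fail":
            up[node] = False
            if host == node:
                survivor = other if node == preferred else preferred
                if up[survivor]:
                    host = survivor      # failover to the surviving node
                    moves += 1
        elif kind == "restore":
            up[node] = True
            if failback_enabled and node == preferred and host != preferred:
                host = preferred         # failback to the preferred node
                moves += 1
    return moves


# Two outages of node A, each followed by its repair.
outages = [("fail", "A"), ("restore", "A"), ("fail", "A"), ("restore", "A")]
print(count_group_moves(outages, failback_enabled=False))  # 1
print(count_group_moves(outages, failback_enabled=True))   # 4
```

In this model, preventing failback leaves the group on node B after the first outage, so the second outage of node A does not interrupt clients at all.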
After specifying the appropriate failback policy for your configuration, click Finish. Review the
summary screen, and if the information is correct, click OK to create the group. After the group is
created, you are prompted to add a virtual address to the group. Click Yes to open the Add
Resource to Group Wizard Virtual Address window shown in Figure 40.

Figure 40: Add Resource to Group Wizard - Virtual Address

Select the previously created Public network and enter the network name associated with the virtual
address in the Host Name field (FS-153 in the example). When you click the IP Address field, Oracle
Fail Safe should automatically fill in the value associated with the previously entered network name.
Click Finish to continue, and then click OK to accept the information shown in the summary screen
that follows.
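
If the IP Address field is not filled in, one plausible cause is that the network name does not resolve from the node running the wizard. The Python sketch below shows a simple pre-check under that assumption; the helper name is hypothetical, and the resolver argument exists only so that the lookup can be replaced in testing.

```python
import socket
from typing import Callable, Optional


def resolve_virtual_address(network_name: str,
                            resolver: Callable[[str], str] = socket.gethostbyname
                            ) -> Optional[str]:
    """Return the IPv4 address for network_name, or None if it does not resolve."""
    try:
        return resolver(network_name)
    except OSError:  # socket.gaierror and socket.herror are OSError subclasses
        return None
```

Calling resolve_virtual_address("FS-153") on a cluster node would return the address registered for the primary virtual server, or None if no name service entry or hosts file entry exists for it.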
Repeat this process on the standby cluster to create a virtual server for the standby database (for the
example, the standby virtual server is called FS-245 and uses the previously identified network
name FS-245). After each group has been created and populated with its associated virtual address
resources, expand the Oracle Fail Safe Manager tree view to verify that the contents of the newly
created primary and standby virtual server groups are similar to those shown in Figure 41.

Figure 41: Tree View Showing Database Virtual Servers

4.4.3 Execute Verify Standalone Database Command

Before configuring the primary and standby databases, first run the Oracle Fail Safe Manager Verify
Standalone Database command on each database. This command verifies that each database and its
associated network configuration files are correctly configured for use with Oracle Fail Safe.
To execute this command, select the icon for the database in the tree view and choose
Troubleshooting > Verify Standalone Database from the Oracle Fail Safe Manager menu. Enter the
requested information in the dialog box, as shown in Figure 42. Note that although the Service
Name and Instance Name values for the primary and standby databases will differ, the Database
Name value is the same for both databases (testdb1, for the example). If, as in the example, Use
Operating System Authentication is selected, Oracle Fail Safe will automatically make any changes
necessary to enable operating system authentication (if it is not already enabled). Unless you have
specific reasons to the contrary, allow Oracle Fail Safe to make any recommended configuration
changes when prompted. During the verification operation, Oracle Fail Safe displays a status report
similar to that shown in Figure 43. Resolve any reported problems and repeat the verification
process until it is successful for each database. Because a separate copy of the initialization
parameter file has been copied to each cluster node, the FS-10288 warning message that appears in
the status report can be ignored.

Figure 42: Verify Standalone Database Dialog

Figure 43: Verify Standalone Database Status Report

4.4.4 Add Each Database to its Associated Virtual Server

The Oracle Fail Safe Manager Add Resource to Group Wizard automates the process of adding the
resources associated with each database to their respective virtual servers. For the example
configuration, the wizard is used twice for each virtual server: once to configure the database
resource and then once more to add an Oracle Intelligent Agent resource.

4.4.4.1 Primary Database Virtual Server

In the tree view, right-click the primary database and choose Add to Group (as shown in Figure 44)
to open the Add Resource to Group Wizard.

Figure 44: Add Primary Database to Group

In the first window, as shown in Figure 45, choose Oracle Database as the resource type (this will
be selected by default) and select the name of the previously created group for the primary
database virtual server (FS-153 for the example configuration). Click Next to continue.

Figure 45: Add Resource to Group Wizard – Resource Type

In the next window, shown in Figure 46, enter the requested database identity information (the
same information previously entered during the Verify Standalone Database operation) and click
Next to continue.

Figure 46: Add Resource to Group Wizard – Database Identity


Oracle Fail Safe Manager will display an informational message similar to that shown in Figure 47 to
alert you that the specified initialization parameter file is not located on a cluster disk. Click OK to
acknowledge the message. Note that the initialization parameter file for the database has been
configured previously (in section ) to ensure that the database can fail over between nodes.

Figure 47: Oracle Fail Safe Manager – Initialization Parameter File Location

The wizard automatically detects the existence of the database password file on the node where the
database was created. Click Yes to have Oracle Fail Safe create a password file for the database on
each cluster node and enter the requested password information for the SYS account as shown in
Figure 48. Then click Finish to continue.

Figure 48: Add Resource to Group Wizard – Database Password


Review the summary information displayed by the wizard (as shown in Figure 49), and if all
information is correct, click OK to begin adding the database to the group.

Figure 49: Add Resource to Group Wizard – Summary

Before starting the operation, Oracle Fail Safe displays an informational message, shown in Figure
50, to alert the administrator that the database will be taken offline. Click Yes to acknowledge the
message and continue.

Figure 50: Confirm Add Database to Group


During the clusterwide configuration process, Oracle Fail Safe automatically detects all cluster disks
used by the database and adds them to the group. Oracle Fail Safe also creates and adds a resource
for the listener service and updates the Oracle Net files on all cluster nodes with the new virtual
server information. In addition, Oracle Fail Safe creates and registers with the cluster software all
appropriate dependencies between resources in the group.
The ongoing status of the clusterwide operation is displayed in a status report that optionally can be
saved to disk for future reference or printing, as shown in Figure 51. If any errors are encountered
during the configuration process, they are reported in the status report and the configuration
operation is rolled back. As before, you can safely ignore the FS-10288 parameter file location
warning. After reviewing the status report, click Close to continue.

Figure 51: Add Database to Group Status Report

To allow the virtual server to be discovered by Oracle Enterprise Manager and allow Data Guard
Manager to create and manage the database, an Oracle Intelligent Agent must be added to the
group. To perform this task, right-click the database virtual server group in the Oracle Fail Safe
Manager tree view and choose Add Resource to Group, as shown in Figure 52.

Figure 52: Open the Add Resource to Group Wizard

From the initial window, choose Oracle Intelligent Agent as the resource type and the database
virtual server group as the group (the group should be selected by default) as shown in Figure 53.
Click Next to continue, and click OK to acknowledge the message informing you that the Oracle
Intelligent Agent must be installed on each cluster node to add an agent to a group.

Figure 53: Add Resource to Group Wizard – Resource Type

In the next window, as shown in Figure 54, specify a cluster disk that the agent can use to store
information. For the example, choose H: (the disk used for the database files). The agent is installed
as part of Oracle9i Database Enterprise Edition in the same home as the database (for the example,
choose dbs_home as the Oracle home for the agent). After entering the information, click Finish to
display the summary screen as shown in Figure 55. If the information is correct, click OK to
continue.

Figure 54: Add Resource to Group Wizard – Intelligent Agent Information

Figure 55: Add Resource to Group Wizard - Summary

The ongoing status of the clusterwide operation is displayed in a status report that optionally can be
saved to disk for future reference or printing, as shown in Figure 56. If any errors are encountered
during the configuration process, they are reported in the status report and the configuration
operation is rolled back. After reviewing the status report, click Close to continue.

Figure 56: Add Intelligent Agent to Group Status Report

4.4.4.2 Standby Database Virtual Server

Repeat the same process to configure the standby database. After the standby database
configuration process is complete, expand the Oracle Fail Safe Manager tree view to verify that the
contents of the primary and standby virtual server groups (in the example, FS-153 and FS-245,
respectively) are similar to that shown in Figure 57. Note that the group for each database contains
all the cluster resources associated with that database. Virtual server groups used for production
deployments may contain additional resources (for example, additional disks associated with the
database or Oracle Intelligent Agent or additional IP address and network name resources if
multiple virtual addresses are configured for use with the database).

Figure 57: Tree View Showing Primary and Standby Virtual Servers

4.5 Create Final Highly Available Primary/Standby Configuration
The final configuration task is to reestablish the Data Guard configuration for the now highly
available primary and standby databases.

4.5.1 Discover Virtual Servers

To begin the process, start the Oracle Management Server and Oracle Enterprise Manager (as
described in sections and ). Then, from the Oracle Enterprise Manager Console menu bar,
choose Navigator > Discover Nodes to open the Oracle Enterprise Manager Discovery Wizard.
Follow the directions in the wizard to discover the primary and standby virtual servers (FS-153 and
FS-245, for the example configuration). After discovery completes, the Oracle Enterprise Manager
tree view should be similar to that shown in Figure 58.
Note that the way the primary and standby databases and virtual servers are named in the tree view
may differ due to slight differences in the way the various wizards updated the database and
network configuration information on each cluster (the virtual servers shown in Figure 58, for
example, are named fs-153 and fs-245.us.oracle.com). This does not affect the primary/standby
configuration process. Also, because not all Oracle Intelligent Agent releases are fully cluster aware,
you may encounter errors if you attempt to discover both virtual servers and individual cluster
nodes. For the example, because the default (node-specific) Oracle Intelligent Agent was disabled
(refer to section ), it is only possible to discover the primary and standby virtual servers.

Figure 58: Tree View Showing Discovered Virtual Server

4.5.2 Create Highly Available Primary/Standby Configuration

In the same way as described in section , open the Data Guard Manager Create Configuration
Wizard. Click Next to proceed past the initial Welcome screen, enter the name you want to use for
the configuration (testdb1_config, for the example) as shown in Figure 59, and click Next.

Figure 59: Create Configuration Wizard – Configuration Name

In the Primary Database window, choose the primary database (testdb1.us.oracle.com in the
example), specify a site name (fs-153_site in the example), and click Next, as shown in Figure 60.

Figure 60: Create Configuration Wizard – Primary Database

In the Connect to Primary Database window, enter the primary database account information (SYS
account in the example) and click Next, as shown in Figure 61.

Figure 61: Create Configuration Wizard – Connect to Primary Database

In the Standby Creation Method window, choose Add an existing standby database, as shown in
Figure 62, and click Next to continue.

Figure 62: Create Configuration Wizard – Standby Creation Method

In the Add Existing Standby Database window, choose the standby database and site name, as
shown in Figure 63 (testdb12 and FS-245_site, respectively, for the example configuration). Click
Next to continue.

Figure 63: Create Configuration Wizard – Add Existing Standby Database

In the Connect to Standby Database window, as shown in Figure 64, enter the connection
information for the standby database (SYS account for the example) and click Next to continue.

Figure 64: Create Configuration Wizard – Connect to Standby Database


Data Guard Manager should detect that the standby database was previously configured as a
standby site and display the informational message shown in Figure 65. Click Yes to acknowledge
the message and proceed with creating the new Oracle Data Guard configuration.

Figure 65: Data Guard Manager – Informational Message

Review the information in the Summary window (shown in Figure 66) and click Finish to begin
creating the configuration.

Figure 66: Create Configuration Wizard – Summary


Data Guard Manager displays the configuration progress in a window similar to that shown in
Figure 67. Click Close to close the window after the configuration processing is complete.

Figure 67: Create Oracle Data Guard Configuration Progress Report

4.5.3 Verify Highly Available Primary/Standby Configuration

Click the new configuration in the Data Guard Manager tree view to open the configuration
connection information dialog box shown in Figure 68. Enter the requested account information
(SYS for the example) and click OK to connect to the configuration.

Figure 68: Connect to Oracle Data Guard Configuration


After connecting to the configuration, expand the tree view and select the configuration to view the status
of each component, as shown in Figure 69. If all components are functioning correctly, you have
successfully configured the example disaster-tolerant high availability solution.

Figure 69: Final Highly Available Oracle Data Guard Configuration

Optionally, based on your business requirements, you can perform additional steps to customize the
example configuration. For example, if you plan to use maximum protection mode, use the Data
Guard Manager Standby Redo Log Assistant to create the required standby redo log files for each
database. To facilitate future management and administration operations, you also can update the
Oracle Enterprise Manager preferred credentials for the primary and standby databases and virtual
servers. For the primary and standby databases, choose Configuration > Preferences > Preferred
Credentials from the Oracle Enterprise Manager menu, select each database from the list of targets,
and enter the account information for the corresponding SYS database user account. Similarly, for
the primary and standby virtual servers, specify as preferred credentials a domain account with
administrator privileges on each physical cluster node that potentially can host the virtual server (for
example, specify the account used during Oracle Fail Safe installation in section ).

5 OTHER CONFIGURATIONS
As noted in the introduction, Oracle Fail Safe and Oracle Data Guard can be combined in multiple
ways to provide a range of disaster-tolerant high availability solutions. This section compares the
example configuration described in this paper with three alternative disaster-tolerant high
availability configurations. When designing any disaster-tolerant high availability solution, it is
important to understand the trade-offs among the various possible configuration options, particularly
with respect to the risks of data loss or interruption of service. Refer to the Oracle Fail Safe and
Oracle Data Guard documentation listed in section for complete information.

5.1 Two Active/Passive Clusters

This is the configuration described earlier in the paper; it adds additional availability to a typical
Oracle Data Guard primary/standby configuration by replacing each standalone system with an
active/passive cluster and using Oracle Fail Safe to configure the primary and standby databases so
that they can fail over between cluster nodes, as shown in Figure 70.

Figure 70: Two Active/Passive Clusters

5.1.1 Benefits

• The geographic separation between clusters enhances disaster tolerance
• Most planned and unplanned outages are handled efficiently through instance failover from
one cluster node to another
• The supported rolling upgrade scenarios of hardware and some software (described in section
) require only a single MSCS failover between cluster nodes (two MSCS failovers are
required for active/active clusters)
• An Oracle Data Guard site failover or switchover to a standby location is required only if all
primary cluster nodes are incapacitated

5.1.2 Trade-offs

• Distance between nodes makes asynchronous redo shipping the best solution, but this introduces
a risk of data loss and data divergence between the primary and standby databases
• Passive cluster nodes (nodes B and D in Figure 70, for example) typically perform no useful
work during normal operations

5.2 Single Active/Active Cluster


Similarly, as shown in Figure 71, for the price of an additional disk array (all required software is
already licensed on all cluster nodes), you can add a basic level of disaster tolerance to a typical
active/passive Oracle Fail Safe configuration. To do so, create a second database and use Oracle
Fail Safe and Oracle Data Guard to create an active/active cluster configuration, with one node
hosting the primary database virtual server and the other node hosting the standby database virtual
server. Because both databases are located on the same cluster (but on different disk arrays to
ensure data protection), synchronous redo shipping can be used with minimal performance impact
to keep both copies of the data identical with no risk of data loss if the primary database fails.

Figure 71: Single Active/Active Cluster

5.2.1 Benefits

• This configuration provides an inexpensive way to enhance an Oracle Fail Safe deployment to
protect against media failure and data corruption (just add a second disk array)
• There is no risk of data loss or data divergence between the primary and standby databases
(when configured using the Oracle Data Guard maximum availability mode)
• An Oracle Data Guard site failover or switchover to a standby location is required only if all
primary cluster nodes are incapacitated

5.2.2 Trade-offs

• Primary database shuts down when network access to the standby database is interrupted
• Instance failover times from one cluster node to another can be slower than for active/passive
clusters because not all resources on the failover node are available to Microsoft Cluster
Service to process the failover
• Because there is no time delay for the application of redo logs to the standby databases, there
is no protection from human error or other sources of data corruption
• Supported rolling upgrade scenarios of hardware and some software (described in section )
require two MSCS failovers between cluster nodes (only one MSCS failover is required for
active/passive clusters)
• Disaster tolerance is limited to protection from media failure (two copies of the data) and to a
basic level of protection from local area disasters (like floods or fires) largely based on the
degree of geographic separation between the cluster nodes and disk arrays. Because the
maximum separation between components in an MSCS fibre channel cluster is currently on the
order of 7-10 kilometers, this configuration does not protect against wide area disasters (like
hurricanes or earthquakes).
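
A back-of-the-envelope calculation suggests why synchronous redo shipping costs little within the 7-10 kilometer limit mentioned above, while asynchronous shipping is preferred over long distances. The Python sketch below assumes light travels roughly 200 kilometers per millisecond in optical fibre (about two thirds of its speed in a vacuum) and counts propagation delay only, ignoring switches, protocol overhead, and disk writes. The 1,000 kilometer figure is a hypothetical remote site, not a value from this paper.

```python
FIBRE_KM_PER_MS = 200.0  # assumed propagation speed in optical fibre


def round_trip_ms(distance_km: float) -> float:
    """Propagation-only round-trip time, in milliseconds, over distance_km of fibre."""
    return 2 * distance_km / FIBRE_KM_PER_MS


print(round_trip_ms(10))    # 0.1
print(round_trip_ms(1000))  # 10.0
```

Under these assumptions, a synchronous acknowledgement adds about a tenth of a millisecond per commit inside the cluster but on the order of ten milliseconds to a site 1,000 kilometers away.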

5.3 Two Active/Active Clusters


In Figure 72, a combination of synchronous and asynchronous redo shipping is used with multiple
standby databases on local and remote active/active clusters. The standby database located on the
same cluster as the primary database can be kept current with the primary database through
synchronous redo shipping, while each of the standby databases on the remote cluster can be
updated at different time delays through asynchronous redo shipping. In addition, the standby
databases can be used to offload reporting and backup operations from the primary database so that
more resources are available to support end users. This is a more complex solution, but provides
the best protection from data loss, data corruption, and site disasters from among the solutions
presented in this section.

Figure 72: Two Active/Active Clusters

5.3.1 Benefits

• Efficiently offloads reporting and backup operations from the primary database. For example,
Standby Database 1 could be configured as a logical standby database available at all times for
read-only reporting and Standby Databases 2 and 3 could be configured as physical standby
databases also available for periodic database backups and occasional additional reporting

• Makes full use of all cluster nodes and maintains multiple copies of data at different locations
for enhanced disaster protection
• Combination of synchronous and asynchronous redo transport ensures that one copy of the
data (Standby Database 1) is always fully synchronized with the primary database, while the
remaining remote standby databases (Standby Database 2 and Standby Database 3) are
maintained at different time delays to protect against data corruptions
• Oracle Data Guard site failover or switchover to a standby location is required only if all
primary cluster nodes are incapacitated
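
The protection offered by staggered apply delays can be illustrated with a short calculation. The following is a minimal sketch only: the standby names follow Figure 72, but the delay values and the function name are assumptions chosen for illustration and do not come from the example configuration. It reports which standby databases have not yet applied a corrupting change at the moment the corruption is noticed.

```python
from datetime import timedelta


def standbys_still_clean(apply_delays, time_since_corruption):
    """Return the names of standby databases whose apply delay is longer than
    the time elapsed since the corrupting change was made on the primary."""
    return sorted(
        name
        for name, delay in apply_delays.items()
        if delay > time_since_corruption
    )


# Hypothetical delays; Standby Database 1 is kept synchronized (no delay).
delays = {
    "Standby Database 1": timedelta(0),
    "Standby Database 2": timedelta(minutes=30),
    "Standby Database 3": timedelta(hours=4),
}

# Assume the corruption is noticed 45 minutes after it occurred.
clean = standbys_still_clean(delays, timedelta(minutes=45))
print(clean)
```

With these assumed values, only the standby database with the longest delay still holds an uncorrupted copy, which is why maintaining the remote standby databases at different delays widens the window for detecting a problem.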

5.3.2 Trade-offs

• Is more complex to configure (four active databases)
• There is risk of data loss or divergence if both nodes of the primary cluster fail
• Instance failover times from one cluster node to another can be slower than for active/passive
clusters because not all resources on the failover node are available to Microsoft Cluster
Service to process the failover
• Supported rolling upgrade scenarios of hardware and some software (described in section )
require two MSCS failovers between cluster nodes (only one MSCS failover is required for
active/passive clusters)

5.4 Multiple Primary Locations and Single Standby Location


Figure 73 shows a configuration in which Oracle Data Guard, Oracle Fail Safe, and Real Application
Clusters provide complementary features that together help you to implement cost-effective high
availability, disaster protection, and scalability. Each primary location uses a two-node active/active
MSCS cluster configured with Oracle Fail Safe. Each node is configured as the preferred node for a
primary database virtual server. For each primary cluster, if a cluster node fails, the surviving node
will host both of the virtual servers for the primary databases configured on that cluster.
During normal operations, all MSCS cluster nodes actively service clients. To amortize the hardware
and management costs associated with multiple standby databases, a single shared Oracle Real
Application Clusters standby location hosts all the standby databases. Oracle Real Application
Clusters allows the standby cluster to scale easily (by adding more nodes and disks) if additional
standby databases are later added to the standby location. The single standby location also can be
used to consolidate corporate reporting and database backup operations. The number and
distribution across the cluster nodes of the Oracle Real Application Clusters database instances
associated with each standby database can be adjusted based on availability requirements and on
the reporting and backup workloads associated with the database (for example, in Figure 73,
Instances 1a and 1b are configured for Standby Database 1 and Instances 4a, 4b, and 4c are
configured for Standby Database 4, which has a larger reporting requirement).

Figure 73: Multiple Primary Locations and Single Standby Location

5.4.1 Benefits

• All nodes for each primary cluster perform useful work
• A single scalable standby location simplifies disaster recovery planning and also consolidates
reporting and database backup operations
• The return on hardware and software investment is optimized
• Oracle Data Guard site failover or switchover to a standby location is required only if all
primary cluster nodes are incapacitated

5.4.2 Trade-offs

• Adds a level of complexity (configuring an Oracle Real Application Clusters standby
location currently requires manual Oracle Data Guard configuration steps and also
requires use of cluster hardware certified for use with Oracle Real Application Clusters)
• Instance failover times from one primary cluster node to another can be slower than for
active/passive clusters because not all resources on the failover node are available to Microsoft
Cluster Service to process the failover

• Supported rolling upgrade scenarios of hardware and some software (described in section )
require two MSCS failovers between cluster nodes (only one MSCS failover required for
active/passive clusters)

6 MAINTENANCE AND ADMINISTRATION EXAMPLES


Features from both Oracle Fail Safe and Oracle Data Guard can affect the way some maintenance
and administrative operations are performed. For example, because Microsoft Cluster Service
monitors both the primary and standby databases for high availability, these databases are
automatically restarted if they are stopped using a normal database shutdown command. In general,
any time you plan to use Data Guard Manager or any other administrative tool to perform
operations that could affect access to a primary or standby database or for which you want to
disable the possibilities of an automatic restart or failover, you should first use Oracle Fail Safe
Manager to disable Is Alive polling temporarily for the database or to take the database offline
(which also stops Is Alive polling). Similarly, any time you plan to perform operations with Oracle
Fail Safe Manager that could affect access to a primary or standby database, you should first use
Data Guard Manager to temporarily disable Oracle Data Guard monitoring for that database. This
includes not only operations such as cold database backups, but also administrative operations that
need to be performed while users continue to access the database or any operations that could
affect query response times during the periodic Is Alive polling of the database by MSCS.
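
The ordering rule described above (disable monitoring in one tool before working in the other, then reenable it afterward) is easy to get wrong when an operation fails partway through. The following is a minimal sketch of that ordering only. The operations in this paper are performed interactively in Data Guard Manager and Oracle Fail Safe Manager, so the two callables here are hypothetical stand-ins that simply record the order of the manual steps.

```python
from contextlib import contextmanager


@contextmanager
def monitoring_disabled(disable, enable):
    """Run a block of work with monitoring switched off, and always switch
    it back on afterward, even if the work raises an error."""
    disable()
    try:
        yield
    finally:
        enable()


# Hypothetical stand-ins for the manual steps; they only record the order.
log = []
with monitoring_disabled(
    lambda: log.append("disable Oracle Data Guard monitoring"),
    lambda: log.append("reenable Oracle Data Guard monitoring"),
):
    log.append("perform Oracle Fail Safe Manager operation")

print(log)
```

The point of the sketch is the `finally` clause: monitoring is reenabled whether or not the maintenance operation succeeds, which mirrors the recommendation to reenable monitoring as soon as the operation is complete.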
The topics in this section provide step-by-step instructions for five typical maintenance and
administrative operations:
• Performing an Oracle Fail Safe (MSCS) failover
• Changing the database SYS account password
• Performing rolling upgrades
• Performing an Oracle Data Guard site failover or switchover
• Performing database backups
To minimize the risk of unanticipated downtime, Oracle Corporation recommends that you first
rehearse planned administrative or maintenance operations on identically configured test systems
before performing the operations on business-critical production systems.

6.1 Performing an Oracle Fail Safe (MSCS) Failover


An Oracle Fail Safe (MSCS) failover between cluster nodes can be unplanned (for example, as a
result of an unexpected component failure) or planned (for example, while performing a rolling
upgrade of hardware or software or to rebalance workloads across the cluster nodes). Unplanned
failovers are handled automatically by MSCS (based on the user-specified failover policy for each
group). Planned failovers are initiated manually by the administrator from within Oracle Fail Safe
Manager by selecting a group in the tree view and choosing Groups > Move to a Different Node.
Because clients access databases and other resources configured with Oracle Fail Safe through a
fixed (node-independent) virtual server address, failovers usually appear to clients as a brief
interruption in service that is in many ways like an instant node reboot. Unless Oracle9i Database
features such as transparent application failover (TAF) are in use, any uncommitted work is lost
(rolled back) and the client needs to reconnect to the database after the instance restarts on the new
node. With TAF, reconnection and resumption of interrupted SELECT statement execution is
automatic. For more information about using TAF with databases configured with Oracle Fail Safe,
refer to the white paper Cluster-Aware ODBC and OCI Client Applications for Oracle Fail Safe
Solutions (available through the Oracle Technology Network at
http://otn.oracle.com/tech/windows/failsafe/).
From a management perspective, Data Guard Manager is currently not “cluster-aware”, and several
issues related to MSCS failovers require special attention. Depending on the specific circumstances
associated with the failover (planned, unplanned, and so forth), one or more of the following may
apply:
• Any time you plan to perform operations with Oracle Fail Safe Manager that could affect
access to a primary or standby database (such as a planned failover), you should first use Data
Guard Manager to temporarily disable Oracle Data Guard monitoring for the database.
• Any time you complete operations with Oracle Fail Safe Manager and have previously used
Data Guard Manager to disable Oracle Data Guard monitoring for a database, you should use
Data Guard Manager to reenable Oracle Data Guard monitoring for the database.
• Data Guard Manager uniquely identifies each database in a configuration by the combination
of the database name and the name of the physical node hosting the database at the time it
was added to the configuration. Data Guard Manager stores this information in its
configuration files and does not update the information after the configuration is created. This
means that, for databases also configured with Oracle Fail Safe, Data Guard Manager can only
access a database when the same physical node that hosted the virtual server when it was
initially added to the configuration hosts the virtual server for that database. Note that this
does not affect the day-to-day operation of the primary/standby configuration (redo shipping
and redo application are all based on the virtual server), but that it does affect how Data
Guard Manager can be used to view and manage the configuration. Following is a partial list
of known limitations:
o To connect to a primary or standby site using Data Guard Manager after a virtual
server failover has occurred, you currently must choose one of the following two
options:
▪ Use the Oracle Fail Safe Manager Move to a Different Node command to
move the virtual server group for the database to the physical node
associated with that database in the Connect Through field of the Data
Guard Configuration Connect Information screen (for example, the
Connect Through field shown in Figure 68 associates the primary database
testdb1 with physical node FS-151). Reconnect to the configuration using
Data Guard Manager and review the status of the primary and standby
sites and databases. Reenable any sites or databases that Data Guard
Manager disabled as a result of the failover. If the physical node associated
with the database in the Data Guard Manager configuration files is not
available to host the virtual server, then the next option must be used.
▪ Delete the existing configuration (following the steps described in section
) and then re-create the configuration (following the steps described
in section ) so that the Data Guard Manager configuration files are
updated with the current physical host information for the primary and
standby databases.
o Oracle Enterprise Manager cannot discover any Data Guard configurations that are
also configured using Oracle Fail Safe. Because of this, to manage the configuration
with Data Guard Manager, always connect to the same Oracle Enterprise Manager
repository that was used when the Data Guard configuration was initially created.
o Data Guard Manager operations that assume that Oracle Enterprise Manager has
discovered the Data Guard configuration or the physical cluster nodes are not fully
functional. For example,
▪ It is not possible to submit event tests from Data Guard Manager.
▪ It is not possible to view or monitor Data Guard log files from within Data
Guard Manager.
▪ Some steps in the Data Guard Manager Verify configuration operation will
not succeed.
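
The choice between the two reconnection options described above can be summarized as a small decision procedure. The following is a minimal sketch, not part of any Oracle tool: the function name and return strings are hypothetical, and node name FS-152 in the usage note is an assumed second cluster node (only FS-151 appears in the example configuration).

```python
def choose_recovery_option(registered_node, current_node, available_nodes):
    """Decide how to regain Data Guard Manager access after a virtual server
    failover.

    registered_node: physical node recorded when the database was added to
                     the Data Guard Manager configuration.
    current_node:    physical node that hosts the virtual server now.
    available_nodes: physical nodes currently able to host the virtual server.
    """
    if current_node == registered_node:
        # The virtual server is already where Data Guard Manager expects it.
        return "connect"
    if registered_node in available_nodes:
        # First option: move the group back with Oracle Fail Safe Manager.
        return "move group back"
    # Second option: delete and re-create the Data Guard configuration.
    return "re-create configuration"
```

For example, if testdb1 was registered on FS-151 and now runs on the assumed node FS-152, the function returns "move group back" while FS-151 is available and "re-create configuration" when it is not.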

6.2 Changing the SYS Database Account Password


If you did not enable operating system authentication for the primary and standby databases (refer
to section ) when configuring these databases, then Oracle Fail Safe uses the SYS database
account password information you provided when each database was added to its respective virtual
server (refer to section ) to connect to the database during management operations and for Is
Alive polling. You may also have enabled both operating system authentication and the SYS
database account (for remote client access) as in the example configuration. Depending on your
configuration, SYS database account password information may be maintained in multiple places,
including:
• The database password file on each cluster node
• Oracle Enterprise Manager preferred credentials
• Oracle Fail Safe database property information
If you use the SYS database user account and change the SYS account password for any database
configured with Oracle Fail Safe, it is critical to ensure that this change is synchronized across all
places noted in the preceding list to ensure uninterrupted operation. Note specifically that database
password files are stored on private disks and that changes made to the password file on one
cluster node are not automatically applied to the corresponding file on the other cluster nodes.
Note also that the database password synchronization process for primary/standby configurations
described in this section differs from the standard database password synchronization process
because the SYS database user password cannot be modified while a physical standby database is in
managed recovery or read-only mode. For more information on changing the SYS database account
password and on synchronizing password files on multiple cluster nodes, refer to the Oracle Fail
Safe Concepts and Administration Guide.

6.2.1 Update Primary SYS Database User Account Password

Locate and right-click the primary database in the Data Guard Manager configuration tree view, as
shown in Figure 74. Then, from the pop-up menu, choose Disable. This will temporarily disable
Oracle Data Guard monitoring of the database and prevent unnecessary alerts when Oracle Fail Safe
Manager is used to fail over the database virtual server in the steps that follow. Choose File > Exit
to close Data Guard Manager.

Figure 74: Data Guard Manager: Disable Primary Database

Using Oracle Fail Safe Manager, connect to the primary cluster, and as shown in Figure 75, choose
Resources > Update Database Password to open the Update Database Password Wizard.

Figure 75: Open Oracle Fail Safe Manager Update Database Password Wizard
In the first window, select the primary database, as shown in Figure 76. Click Next to continue.

Figure 76: Update Database Password - Choose Databases

In the second window, enter the current password for the SYS database user account and then enter
the new password for the SYS account twice, as shown in Figure 77. Click Finish to continue.

Figure 77: Update Database Password - Enter Password Information


The wizard displays a summary screen similar to that shown in Figure 78. Review the summary
screen and, if the information is correct, click OK to continue.

Figure 78: Update Database Password – Summary


The wizard displays a status window during the update process similar to that shown in Figure 79.
Click OK to close the Finished Updating Passwords window and Close to close the status window.

Figure 79: Update Database Password - Status
At this point, the SYS user password for the primary database has been updated in the Oracle Fail
Safe metadata and in the password file on the cluster node that currently hosts the primary database
virtual server. To avoid unnecessary downtime, Oracle Fail Safe normally defers updating the
password file on the other cluster nodes. However, because it is possible that an Oracle Data Guard
site failover or switchover may occur (which would place the database in a standby recovery or
read-only role) before the database virtual server is hosted by another node, it is necessary to
ensure that all password files for the primary database are updated. To update the password files on
the other cluster nodes, perform the following steps:
1. Right-click the database virtual server in the Oracle Fail Safe Manager tree view and choose
Move to a Different Node, as shown in Figure 80.
2. Click Yes when prompted to confirm the Move Group operation, as shown in Figure 81.
3. After the Move Group operation completes, click OK to close the Clusterwide Operation
Status message.
4. Review the status report to confirm that the primary virtual server group and the resources it
contains are online on the other cluster node, as shown in Figure 82.
5. Click Close to close the status report.
6. If necessary, repeat these steps for any additional nodes in the cluster.
7. After the password file has been updated on all cluster nodes, use Oracle Fail Safe Manager
to move the primary virtual server group back to the initial cluster node so that the physical
node hosting the group matches the node expected by Data Guard Manager (refer to the
section for additional information about this requirement).
8. Optionally, to verify that Oracle Fail Safe updated the database password files, confirm that
the primary database password file (<Oracle_Home>\database\PWDtestdb1.ora in the
example) was modified recently on each node.
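
The optional check in step 8 can be scripted if the password file on each node is reachable from one machine. The following is a minimal sketch under that assumption; the function name, the one-hour threshold, and the share paths in the commented usage example are hypothetical and are not part of the example configuration.

```python
import os
import time


def stale_password_files(paths, max_age_seconds, now=None):
    """Return the paths whose last-modified time is older than
    max_age_seconds, or that cannot be read at all. Such files were
    probably not updated by the password change."""
    now = time.time() if now is None else now
    stale = []
    for path in paths:
        try:
            age = now - os.path.getmtime(path)
        except OSError:
            # A missing or unreadable file counts as not updated.
            stale.append(path)
            continue
        if age > max_age_seconds:
            stale.append(path)
    return stale


# Example (hypothetical administrative share paths for a two-node cluster):
# stale_password_files(
#     [r"\\node-a\c$\oracle\database\PWDtestdb1.ora",
#      r"\\node-b\c$\oracle\database\PWDtestdb1.ora"],
#     max_age_seconds=3600,
# )
```

An empty result suggests that every listed password file was rewritten within the chosen window. A modification time only shows that the file changed, not that it contains the intended password, so this check complements rather than replaces a test connection.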

Figure 80: Oracle Fail Safe Manager - Move to a Different Node

Figure 81: Confirm Move Group

Figure 82: Move Group Status Report
To complete the password configuration process, perform the following steps:
1. From the Oracle Enterprise Manager main menu, choose Configuration > Preferences >
Preferred Credentials and update the preferred credentials for the primary database with the
new SYS database user account password information (as SYSDBA). Click OK to apply this
change.
2. From the Oracle Enterprise Manager main menu, choose Tools > Database Applications >
Data Guard Manager to open Data Guard Manager.
3. Select the configuration from the Data Guard Manager tree view and connect to the
configuration.
4. Expand the primary site in the Data Guard Manager tree view, right-click the primary
database, and choose Enable, as shown in Figure 83. Click Yes when prompted to confirm
that you want to enable the primary database.

Figure 83: Data Guard Manager - Enable Primary Database.

6.2.2 Update the Standby SYS Database User Account Password

For each logical standby database, repeat the steps described in section to update the SYS
database user account password on all nodes in the standby cluster. Note that logical standby
databases are fully functional databases that are usually administered and managed in the same way
as primary databases, with the exception that the tables replicated on the logical standby database
are read-only.
For each physical standby database (such as the physical standby database in the example
configuration), however, the SYS database user account password information cannot be updated
while the standby database is operating in managed recovery or read-only mode. To change the
SYS account password for a physical standby database, you must choose one of the following options:
• Option 1 (updates SYS account password and retains any other entries in the password file)
1. Perform a site switchover (refer to section ) to convert the standby database to the
primary database.
2. Update the SYS database user password, as described in section .
3. Perform a site switchover to return the primary and standby databases to their original
sites (refer to section ). Step 3 is optional for configurations where the physical
locations of the primary and standby databases are not important.

• Option 2 (creates a new password file on each node that contains only the SYS account)
1. From the Data Guard Manager tree view, right-click the standby database and choose
Disable. Exit from Data Guard Manager.
2. Locate the standby database password file on the cluster node that currently hosts the
standby database virtual server and record the path and file name information for this file
(C:\oracle\database\PWDtestdb12.ora in the example configuration). Then rename the file
(for example, to C:\oracle\database\PWDtestdb12_old.ora).
3. Open an MS-DOS command window on the node that currently hosts the standby
database virtual server and enter the following command:

orapwd file=<fname> password=<password>

The <fname> variable is the original location and name of the password file on that node
(C:\oracle\database\PWDtestdb12.ora in the example configuration) and the <password>
variable is the new password for the SYS database user account.
4. From the Oracle Fail Safe Manager tree view, right-click the standby database virtual server
group and choose Move to a Different Node. Click OK to acknowledge the Confirm Move
Group informational message.
5. Repeat Steps 2 and 3 on the new cluster node.
6. Repeat Steps 4 and 5 for each additional node (if the cluster has more than two nodes).
7. After the password file has been updated on all standby nodes, choose Resources >
Update Database Password to open the Update Database Password Wizard. In the first
window, select the standby database and click Next to continue. In the second window,
enter the new SYS password in the Old Password, New Password, and Confirm New
Password fields. Note that when all three password fields contain the same value, as in
this case, Oracle Fail Safe Manager will verify that the password is valid and update the
SYS account password information stored by Oracle Fail Safe, but will not attempt to
update the password file on any of the cluster nodes. Click Finish to continue. Review the
summary screen and, if the information is correct, click OK to continue. Click OK to close
the Finished Updating Passwords window and Close to close the status window.
8. Use Oracle Fail Safe Manager to move the standby virtual server group back to the initial
cluster node so that the physical node hosting the group matches the node expected by
Data Guard Manager (refer to the section for additional information about this
requirement).
9. From the Oracle Enterprise Manager main menu, choose Configuration > Preferences >
Preferred Credentials and update the preferred credentials for the standby database with
the new SYS database user account password information (as SYSDBA). Click OK to apply
this change.
10. From the Oracle Enterprise Manager main menu, choose Tools > Database Applications
> Data Guard Manager to open Data Guard Manager.
11. Select the configuration from the Data Guard Manager tree view and connect to the
configuration.
12. Expand the configuration in the Data Guard Manager tree view, right-click the standby
site, and choose Enable. Click Yes when prompted to confirm that you want to enable the
standby site. Then right-click the standby database and choose Enable. Click Yes when
prompted to confirm that you want to enable the standby database.
13. On each standby cluster node, delete the old password file temporarily renamed in Step 2.
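
Steps 2 and 3 of Option 2 must be repeated on every standby cluster node, so they are a natural candidate for a small script. The following is a minimal sketch, assuming it is run locally on the node that currently hosts the standby database virtual server. The function name is hypothetical, and only the `file=` and `password=` arguments shown in Step 3 are used. The `run` parameter exists so that the command can be inspected or replaced during a rehearsal.

```python
import os
import subprocess


def recreate_password_file(fname, new_password, run=subprocess.run):
    """Rename the existing password file and create a new one with orapwd.

    Mirrors Steps 2 and 3: the old file is kept as <name>_old<ext> so that it
    can be deleted later (Step 13). Returns the orapwd argument list used.
    """
    root, ext = os.path.splitext(fname)
    if os.path.exists(fname):
        # Keep the previous file under the temporary name used in Step 2.
        os.rename(fname, root + "_old" + ext)
    command = ["orapwd", "file=" + fname, "password=" + new_password]
    run(command, check=True)
    return command
```

With the example configuration, the call would be `recreate_password_file(r"C:\oracle\database\PWDtestdb12.ora", new_password)`. Note that the new password is passed on the command line, exactly as in the manual procedure, so the script should be run only in a session that other users cannot observe.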

6.3 Performing Rolling Upgrades


Performing an Oracle Data Guard site switchover can reduce downtime during hardware upgrades
and some software upgrades; however, there is still noticeable downtime and a significant number
of manual steps involved when performing upgrades in this manner. Furthermore, the Oracle9i
Data Guard Concepts and Administration manual specifically warns against using a switchover
operation to perform a rolling upgrade of Oracle database server software. The topics in this section
describe how combining Oracle Data Guard with Oracle Fail Safe overcomes many of these
limitations and allows rolling upgrades to be performed in some cases with only a minute or two of
downtime and without the need for an Oracle Data Guard site switchover.
The following guidelines apply to all rolling upgrade scenarios described in this section:
• If the upgrade will affect databases monitored by Oracle Data Guard, right-click each affected
database in the Data Guard Manager tree view and choose Disable. After the upgrade is
complete, repeat these steps, but instead choose Enable to reenable Oracle Data Guard monitoring.
• To minimize impact on users, wait for a quiet period in cluster operations before proceeding
with the upgrade process.
• When upgrading Oracle product software, do not begin the installation procedure while any
Oracle Fail Safe Manager operations or MSCS Cluster Administrator operations are in progress.
• Upgrade only one node in a cluster at a time.
• If you are upgrading Oracle database software, consider performing a database backup prior
to any major upgrade.
• For active/passive cluster configurations, you can eliminate a failover if you begin the rolling
upgrade on the passive node.
• To ensure minimal downtime and to identify any potential issues with other software that
might be running on the cluster, Oracle Corporation recommends that you test any upgrade
operations on an identically configured test cluster before you upgrade the production cluster.

6.3.1 Upgrading Hardware or Operating System Software

In most cases, hardware or operating system software upgrades can be performed without an
Oracle Data Guard site switchover. Table 3 summarizes the rolling upgrade process for hardware or
operating system upgrades. Note also the following restrictions:

• If a hardware upgrade will interfere with access to the cluster disks in the database virtual
server group, you may need to shut down the database during the upgrade process or
consider performing an Oracle Data Guard site switchover.
• If any software updates are required to ensure that the database release remains compatible
with the hardware or operating system after the upgrade, refer to section for information
about additional steps that may be required.
| Step | Task | Tool | Comments |
|------|------|------|----------|
| 1 | Change the database virtual server group failback attributes to the Prevent Failback mode. | Oracle Fail Safe Manager | Follow the instructions in the Oracle Fail Safe Manager online help. Changing the failback attributes prevents the group from failing back to the node while it is being rebooted or when the cluster service is restarted. |
| 2 | Perform a planned failover by moving all groups on the node being upgraded to another node. | Oracle Fail Safe Manager | Choose Groups > Move to a Different Node. (See the instructions in the Oracle Fail Safe help for more information.) By moving all groups to another node, you can work on the current node. When moving a group that contains a database with this method, Oracle Fail Safe will perform a checkpoint operation prior to moving the group. |
| 3 | Exit Oracle Fail Safe Manager. | Oracle Fail Safe Manager | Choose File > Exit to exit Oracle Fail Safe Manager. |
| 4 | Perform the hardware or operating system upgrade. | Various | Follow the instructions provided by your hardware or operating system vendor. |
| 5 | Run the Verify Group operation on all groups. | Oracle Fail Safe Manager | Select Troubleshooting > Verify Group to check all resources in all groups and confirm that they have been configured correctly. If you upgraded Oracle database software, the Verify Group operation will update the tnsnames.ora file. If prompted, click Yes. Otherwise, the Oracle database might not come online after you add it to a group. |
| 6 | Repeat steps 2 through 5 on the other server node or nodes in the cluster. | Various | No comments. |
| 7 | Run the Verify Cluster operation. | Oracle Fail Safe Manager | This step verifies that there are no discrepancies in the software installation, such as with the release information on each node in the cluster. |
| 8 | Restore the failback policy attributes on the groups. | Oracle Fail Safe Manager | Follow the instructions in the Oracle Fail Safe Manager online help to set the failback policy for all groups in the cluster. |
| 9 | Fail back groups, as necessary, by moving groups back to the other node or nodes. | Oracle Fail Safe Manager | Perform a planned failover to move the groups back to the preferred node. This rebalances the workloads across the cluster nodes. Refer to the instructions in the Oracle Fail Safe Manager online help regarding moving a group to a different node. |

Table 3: Hardware or Operating System Rolling Upgrade Process
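
Because steps 2 through 5 of Table 3 are repeated once per node, it can help to expand the table into a flat checklist before starting. The following is a minimal sketch that does only that. The function name and the node names in the usage example are hypothetical, and the step wording is abbreviated from the table.

```python
def rolling_upgrade_plan(nodes):
    """Expand the steps of Table 3 into one ordered checklist for the
    given cluster nodes (upgraded one node at a time)."""
    plan = ["Set group failback attributes to Prevent Failback"]   # step 1
    for node in nodes:                                             # steps 2-6
        plan += [
            f"Move all groups off {node}",
            "Exit Oracle Fail Safe Manager",
            f"Upgrade hardware or operating system on {node}",
            "Run Verify Group on all groups",
        ]
    plan += [
        "Run Verify Cluster",                                      # step 7
        "Restore failback policy attributes",                      # step 8
        "Fail back groups to preferred nodes as necessary",        # step 9
    ]
    return plan


# Hypothetical node names for a two-node cluster.
plan = rolling_upgrade_plan(["node-a", "node-b"])
for number, action in enumerate(plan, start=1):
    print(number, action)
```

For a two-node cluster the checklist contains twelve entries: one preparatory step, four steps per node, and three closing steps.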

6.3.2 Upgrading Oracle Fail Safe or Oracle Application Software

Generally, the rolling upgrade steps described in the Oracle Fail Safe Installation Guide for
upgrading Oracle Fail Safe or other Oracle software apply, provided that the software being
upgraded is installed in a different home from the Oracle database software and that no scripts or
other changes must be applied to the database as part of the upgrade process.
Note that additional steps beyond those listed in Table 3 are required when performing rolling
upgrades of Oracle software. Refer to the Oracle Fail Safe Installation Guide for complete details. If
any software updates are required to the database to ensure that it remains compatible with the
Oracle Fail Safe or other Oracle application software being upgraded, refer to section for
information about additional steps that may be required.

6.3.3 Upgrading Oracle Database Software

Both Oracle Fail Safe and Oracle Data Guard currently impose restrictions when performing
upgrades of Oracle database software. The downtime required during database upgrades varies,
based on the nature of the upgrade being performed. In many cases, Oracle Fail Safe can help to
reduce the overall downtime experienced by end users during upgrades of Oracle database
software by allowing program executable files to be upgraded on one cluster node while users
continue to work on another cluster node. Refer to the database upgrade information provided in
the Oracle Fail Safe Installation Guide and the Oracle9i Database documentation set for more
information. Also, several upgrade-related Support Notes for Standby databases are available
through Oracle MetaLink and are listed in section of this paper.

6.3.3.1 Applying a Database Software Patch

In general, if there are no changes to the in-memory or on-disk structure of the database and if the
initial three fields of the database version do not change (for example, during the application of a
software patch to the program executable software), then the rolling upgrade process outlined in
section can be used to minimize downtime. Perform each step in parallel on the primary and
standby clusters so that the state of the software on the primary and standby cluster nodes remains
consistent. Note that you will not need to take the database out of the virtual server group during
the rolling patch application process.
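
The eligibility rule stated above can be written as a short check. The following is a minimal sketch, assuming that version strings are dot-separated numeric fields. The function name and the four-field version strings in the first example are hypothetical, while the second example uses the release numbers cited in the next subsection.

```python
def can_use_rolling_patch(old_version, new_version, structure_changes=False):
    """Return True when the initial three version fields are unchanged and
    no in-memory or on-disk structure changes are required."""
    old_fields = [int(field) for field in old_version.split(".")[:3]]
    new_fields = [int(field) for field in new_version.split(".")[:3]]
    return old_fields == new_fields and not structure_changes


# Hypothetical patch release: only the fourth field changes.
print(can_use_rolling_patch("9.2.0.1", "9.2.0.2"))   # True
# Release change from 9.0.1 to 9.0.2: the third field changes.
print(can_use_rolling_patch("9.0.1", "9.0.2"))       # False
```

The `structure_changes` flag must be set by the administrator from the patch documentation, because the version number alone does not reveal whether scripts or structural changes are required.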

6.3.3.2 Upgrading to a New Database Version

If any of the initial three fields of the database version changes (for example, from release 9.0.1 to
release 9.0.2), or if any scripts or changes must be applied to the database structure during the
upgrade process, then additional steps that may include periods of database downtime are required.
The specific upgrade steps also depend on whether any nologging changes to the database are
required. In general, you will need to unconfigure the database virtual servers, upgrade the
databases, and then reconfigure the database virtual servers. If nologging changes to the database
are required, then the additional step of refreshing or re-instantiating the standby databases may
also be required. Table 4 summarizes the major tasks in the database upgrade process.

| Step | Task | Tool | Comments |
|------|------|------|----------|
| 1 | Prior to upgrading, check for the existence of nologging operations and update the standby database if necessary. | Various | Refer to the scenario "Recovering After the NOLOGGING Clause Is Specified" in the Oracle9i Data Guard Concepts and Administration Release 2 (9.2) manual for further details. |
| 2 | Perform a hot backup or cold backup of the primary database. Also back up the initialization parameter files, server parameter files, and Oracle Data Guard configuration files for the primary and standby databases. | Various | This step ensures that you can recover the original configuration if necessary. |
| 3 | Remove the Oracle Data Guard configuration from the Data Guard Manager and Oracle Enterprise Manager tree views. | Data Guard Manager and Oracle Enterprise Manager | Follow the same steps previously described in section . All entries for the primary and standby databases and virtual servers should be removed from the tree views. |
| 4 | Remove the primary database from the primary database virtual server. | Oracle Fail Safe Manager | In the Oracle Fail Safe Manager tree view, locate the primary cluster, select the primary database, and choose Resources > Remove from Group. |
| 5 | Remove the standby database from the standby database virtual server. | Oracle Fail Safe Manager | In the Oracle Fail Safe Manager tree view, locate the standby cluster, select the standby database, and choose Resources > Remove from Group. If there are multiple standby databases, perform this step for each standby database. |
| 6 | Exit Oracle Fail Safe Manager. | Oracle Fail Safe Manager | Choose File > Exit to exit Oracle Fail Safe Manager. |
| 7 | Upgrade the program executable files in the primary and standby database Oracle home directories and upgrade the primary database. | Various | Follow the documented upgrade or migration instructions for your database releases. To identify the specific upgrade steps for your configuration, review the information in: the Oracle Fail Safe Concepts and Administration Guide section on Upgrading a Fail Safe Database with the Oracle Database Upgrade Assistant; Oracle Support Note 165296.1; and the Oracle9i Database Migration manual. In general, you should first upgrade the program executable files on the primary and standby cluster nodes and then apply any required upgrade scripts to the primary database. During the upgrade process, redo shipping between the primary and standby sites may need to be temporarily deferred. Additional steps, such as those described in Oracle Support Note 165296.1, may be needed to resynchronize the primary and standby databases. |

Disaster-Tolerant High Availability Page 92


8 Refresh or reinstantiate each Various Most of the time, you will not need to reinstantiate the standby
standby databases database. If nologging changes are made during execution of the
primary database upgrade script, you can check the v$datafile
view to identify any data files with nologging changes that need
to be “refreshed” in the standby database. If there are no
nologging changes, then any changes made to the primary
database should automatically have been applied to the standby
databases. If you lose or corrupt an archive file or if you do
point in time recovery or resetlogs on the primary database, then
re-instantiation of each standby database is required. If you
encounter any problems while attempting to “refresh” standby
database files or if changes made to the primary database are not
correctly replicated to the standby databases, remove and then
reinstantiate each standby database (follow the steps previously
described in section to reinstantiate each standby database).
9 Reconfigure the disaster tolerant Data Guard Manager and Perform the steps described previously in sections and to
high availability configuration Oracle Fail Safe Manager complete the upgrade.

Table 4: Database Upgrade Process Overview
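
The v$datafile check mentioned in step 8 can be sketched as a comparison between the primary and standby databases. The sketch below is illustrative only. It assumes that you have queried the UNRECOVERABLE_CHANGE# column of v$datafile on both databases (for example, with SELECT file#, unrecoverable_change# FROM v$datafile) and loaded each result into a dictionary keyed by file number. The function name and the sample values are hypothetical and are not part of any Oracle tool.

```python
def files_needing_refresh(primary, standby):
    """Return the file numbers whose nologging changes are missing on the standby.

    `primary` and `standby` map a data file number (FILE#) to the
    UNRECOVERABLE_CHANGE# value read from v$datafile on each database.
    A data file is flagged when the primary value is greater than the
    standby value. A file that is absent from the standby dictionary is
    treated as having the value 0 (an assumption made for this sketch).
    """
    return sorted(
        file_no
        for file_no, change_no in primary.items()
        if change_no > standby.get(file_no, 0)
    )


# Hypothetical values for illustration only.
primary_values = {1: 0, 2: 5000, 3: 7200}
standby_values = {1: 0, 2: 5000, 3: 6100}
print(files_needing_refresh(primary_values, standby_values))
```

An empty result suggests that no data files need to be refreshed for nologging reasons. It does not cover the other conditions listed in step 8, such as a lost archived log file or a resetlogs operation.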

6.4 Performing an Oracle Data Guard Failover or Switchover Operation


If you are using a primary/standby configuration and you intend to perform an Oracle Data Guard
failover or switchover operation, you must first disable Oracle Fail Safe Is Alive polling for the
affected databases. When you perform a failover or switchover for a physical standby configuration,
the Oracle Data Guard software takes the affected primary and standby databases offline. However,
Oracle Fail Safe strives to keep every database it monitors online and may interfere with the
operation by attempting to bring the affected databases back online before the failover or
switchover completes. Similarly, during a failover or switchover for a logical standby configuration,
there can be no other active sessions connected to either database (such as the connection
associated with MSCS Is Alive polling).
Therefore, before performing a failover or switchover, you must first disable Is Alive polling for the
primary database (if it is still accessible) and the standby database that directly participates in the
role transition. When the failover or switchover is complete, reenable Is Alive polling so that Oracle
Fail Safe can resume monitoring these databases for failures. Sections 6.4.1 through 6.4.4 describe
the steps that must be followed during a failover or switchover operation in more detail.

6.4.1 Disable Is Alive Polling

For each database that will be directly involved in the role transition, disable Is Alive polling by
performing the following steps:
1. Select the database in the Oracle Fail Safe Manager tree view.
2. Click the Database tab.
3. In the Database Polling box, select Disabled.
4. Click Apply.
5. Wait one Is Alive polling interval to ensure that this state change takes effect before you
initiate any switchover or failover commands.

6.4.2 Perform the Role Transition Operation

Use Oracle Data Guard Manager to perform the switchover or failover operation.

6.4.3 Reenable Is Alive Polling

For each database that was directly involved in the role transition, reenable Is Alive polling by
performing the following steps after the switchover or failover is complete:
1. Select the database in the Oracle Fail Safe Manager tree view.
2. Click the Database tab.
3. In the Database Polling box, select Enabled.
4. Click Apply.
Note also that Oracle Fail Safe automatically reenables Is Alive polling each time the group
containing the database is moved to another cluster node.
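
If you prefer to script the disable and reenable steps rather than use the Oracle Fail Safe Manager windows, the ordering described in sections 6.4.1 through 6.4.3 can be sketched as follows. This is a minimal sketch, not an Oracle-supplied utility. It assumes the fscmd disableisalive and enableisalive commands shown in section 6.5, and it assumes that the caller supplies a run function that executes one command line. The function name, the (database, cluster) pairing, and the injected run and sleep parameters are all hypothetical.

```python
import time
from contextlib import contextmanager


@contextmanager
def is_alive_polling_disabled(targets, run, wait_seconds, sleep=time.sleep):
    """Disable Is Alive polling around a role transition, then reenable it.

    `targets` is a list of (database, cluster) pairs, because the primary
    and standby databases belong to different clusters. `run` is a
    caller-supplied function that executes one command line; it is injected
    so that the sketch can be exercised without Oracle Fail Safe installed.
    """
    disabled = []
    try:
        for database, cluster in targets:
            run(["fscmd", "disableisalive", database, f"/cluster={cluster}"])
            disabled.append((database, cluster))
        # Wait one Is Alive polling interval so the state change takes effect.
        sleep(wait_seconds)
        yield
    finally:
        # Reenable polling on exit, including when the block raises an error,
        # so that monitoring is not left switched off by accident.
        for database, cluster in disabled:
            run(["fscmd", "enableisalive", database, f"/cluster={cluster}"])
```

You would perform the switchover or failover in Oracle Data Guard Manager inside the with block. Reenabling polling after a failed transition is a design choice of this sketch; if a transition fails partway through, check the state of both databases before you rely on Oracle Fail Safe monitoring again.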

6.4.4 Verify the Primary and Standby Virtual Server Groups

After completing a switchover or failover, use the Oracle Fail Safe Manager Verify Group menu
command (click Troubleshooting > Verify Group) to verify the primary and standby database virtual
server groups. Correct any reported problems and rerun the Verify Group command until it is
successful. As before, you can safely ignore the FS-10288 parameter file location warning.

6.4.5 Switchover Example

Follow the procedure described in section 6.4.1 to disable Is Alive polling on the primary and
standby databases that will be participating in the switchover operation (testdb1 and testdb12 in the
example). Then choose Object > Switchover from the Oracle Data Guard Manager menu to start
the Data Guard Manager Switchover Wizard, as shown in Figure 84.


Figure 84: Starting the Switchover Wizard
Click Next to acknowledge and close the initial Welcome window. Oracle Data Guard Manager then
checks for active user sessions on the primary database and may display a Check Open Sessions
Dialog window similar to that shown in Figure 85. Click Continue to disconnect the users identified
in the window and proceed with the switchover operation.

Figure 85: Check Open Sessions Dialog (Primary Database)


In the next window, select the standby site that will become the new primary site. For the example
configuration, this is fs-245_site, as shown in Figure 86. Click Next to continue.

Figure 86: Switchover Wizard – Select Standby Site


If there are any active user sessions connected to the selected standby database, Oracle Data Guard
Manager will display a Check Open Sessions Dialog window similar to that shown in Figure 87.
Click Continue to disconnect the users identified in the window and proceed with the switchover
operation.


Figure 87: Check Open Sessions Dialog (Standby Database)
After disconnecting the active sessions on the standby database, the Switchover Wizard will display
a summary window similar to that shown in Figure 88. If the information displayed in the window
is correct, click Finish to begin the switchover operation.

Figure 88: Switchover Wizard – Summary


During the switchover operation, a window similar to that shown in Figure 89 records progress.
When the switchover is complete, click Close to close the status report window. Oracle Data Guard
Manager now displays the updated configuration, as shown in Figure 90. For the example
configuration, testdb12 now has the primary role, while testdb1 has the standby role.


Figure 89: Switchover Wizard – Status Report

Figure 90: Data Guard Configuration After Switchover


After verifying that the switchover was successful, reenable Is Alive polling on each database
following the steps described in section 6.4.3, and execute the Verify Group command on each
database virtual server, as described in section 6.4.4.


6.5 Performing Database Backups
Because backup operations may involve taking the database offline and may consume significant
system resources that could affect query response times, it is necessary to stop Oracle Fail Safe from
performing Is Alive polling during backup operations. This can be done interactively using the
Oracle Fail Safe Manager menu commands described in the previous section or through the Oracle
Fail Safe Manager FSCMD command-line interface. The following sample script can be modified for
your environment to automate database backups. In the script, replace [dbname], [groupname], and
[clustername] with the name of your database, virtual server, and cluster, respectively, and replace
[nodename] with the name of the cluster node on which the backup operation will run. For the
example configuration, the respective values for the standby database are WHVWGEXVRUDFOHFRP,
)6, and)6.
REM This sample script shows how to perform an automated backup
REM operation on a database configured with Oracle Fail Safe.
REM
REM 1. Move the group that contains the database to
REM the node on which the backup operation will run.
REM
REM Alternatively, you can map a network drive for each
REM cluster disk to allow the backup software to access
REM the drives through a virtual server address regardless
REM of which cluster node currently owns them.

fscmd movegroup [groupname] /node=[nodename] /cluster=[clustername]

REM 2. Disable Is Alive polling for the database resource.

fscmd disableisalive [dbname] /cluster=[clustername]

REM 3. Begin the backup operation here.

REM [insert appropriate backup commands here]

REM 4. Reenable Is Alive polling for the database resource.

fscmd enableisalive [dbname] /cluster=[clustername]

REM The backup operation is complete.
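
Before you run a modified copy of the sample script, it can help to confirm that every bracketed placeholder has been replaced. The following sketch fills in the fscmd command lines from the script above and raises an error if a value is missing. It is an illustration under stated assumptions: the function names and the sample values (MyGroup, NODE1, MYCLUSTER, mydb) are hypothetical, and wrapping values that contain spaces in double quotation marks follows the general Windows command-line convention rather than anything stated in this paper.

```python
import re

# The three fscmd command lines from the sample backup script.
TEMPLATE_LINES = [
    "fscmd movegroup [groupname] /node=[nodename] /cluster=[clustername]",
    "fscmd disableisalive [dbname] /cluster=[clustername]",
    "fscmd enableisalive [dbname] /cluster=[clustername]",
]

# Matches placeholders such as [dbname] or [clustername].
PLACEHOLDER = re.compile(r"\[([a-z]+)\]")


def quote_if_needed(value):
    """Wrap a value in double quotation marks if it contains a space."""
    return f'"{value}"' if " " in value else value


def render(line, values):
    """Replace each [placeholder] in `line` with its entry in `values`.

    Raises KeyError if the line uses a placeholder that has no value,
    so that an unmodified placeholder never reaches the command line.
    """
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for [{key}]")
        return quote_if_needed(values[key])

    return PLACEHOLDER.sub(substitute, line)


# Hypothetical values for illustration only.
sample_values = {
    "groupname": "MyGroup",
    "nodename": "NODE1",
    "clustername": "MYCLUSTER",
    "dbname": "mydb",
}
for template_line in TEMPLATE_LINES:
    print(render(template_line, sample_values))
```

Failing on a missing value is deliberate: a command that still contains a literal [nodename] would otherwise be passed to fscmd unchanged.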

Refer to the Oracle9i Data Guard Concepts and Administration manual for information on the
specific RMAN commands used to back up databases configured with Oracle Data Guard (for
example, refer to the scenario section "Using a Standby Database to Back Up the Primary
Database"). Note that you will typically back up only the primary database control file and will
offload all other backup operations to a standby database. To ensure that your backup software can
always access the database disks, you may find it helpful on the system where the backup software
will be run to map each cluster disk used by the database as a node-independent network drive
using the virtual server (for example, in the example, create = as ??)6?K).


7 SUMMARY AND MORE INFORMATION
Oracle Fail Safe and Oracle Data Guard can be effectively combined to create efficient, cost-
effective, and easily configured disaster-tolerant high availability solutions. Anyone with business-
critical databases deployed on Windows systems should consider the combination of Oracle Fail
Safe and Oracle Data Guard to meet their high availability and disaster-tolerance requirements.

7.1 Oracle Product Documentation


• Oracle9i Database Administrator’s Guide for Windows
• Oracle9i Data Guard Broker
• Oracle9i Data Guard Concepts and Administration
• Data Guard Manager quick tour and online help system
• Oracle Fail Safe Concepts and Administration Guide
• Oracle Fail Safe Installation Guide
• Oracle Fail Safe quick tour, tutorial, and online help system

7.2 Oracle9i Database High Availability and Disaster Recovery Web Site
• http://otn.oracle.com/deploy/availability

7.3 Oracle Fail Safe Web Sites


• http://www.oracle.com/ip/deploy/database/features/failsafe/
• http://otn.oracle.com/tech/windows/failsafe/

7.4 Oracle University Online Learning Web Site


• http://www.oracle.com/education/oln/index.html
o Introduction to Oracle Fail Safe self-paced eStudy class, Database Administration
Capacity, Availability, and Recovery track
o OCP - Oracle9i New Features for Administrators Exam Prep - Module 9: Data Guard
eClass, Database Administration Capacity, Availability, and Recovery track

7.5 Oracle Support MetaLink Web Site


• http://metalink.oracle.com/
o Support Note 165304.1: Downgrading from 9i with Standby Database in Place
o Support Note 165296.1: Upgrading to 9i with Standby Database in Place

Disaster-Tolerant High Availability:
Oracle Data Guard with Oracle Fail Safe

June 2002
Author: Laurence Clarke
Contributing Authors: Vivian Schupman, Ingrid Stuart

Oracle Corporation
World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065
U.S.A.

Worldwide Inquiries:
Phone: +1.650.506.7000
Fax: +1.650.506.7200
www.oracle.com

Oracle is a registered trademark of Oracle Corporation. Various


product and service names referenced herein may be trademarks
of Oracle Corporation. All other product and service names
mentioned may be trademarks of their respective owners.

Copyright © 2002 Oracle Corporation


All rights reserved.
