Data Guard and Fail Safe
1 EXECUTIVE OVERVIEW
When deploying an “always on” 7 x 24 x 365 mission-critical business system, it is essential to
ensure both high availability and disaster tolerance. Many lower cost disaster recovery solutions (for
example, the creation, offsite storage, and retrieval of system backups) do not meet the availability
requirements for business-critical operations. This paper describes how deploying Oracle9i Database
(release 9.2 or later) on commodity Windows clusters with a combination of Oracle Data Guard
(release 9.2 or later) disaster tolerance features and Oracle Fail Safe (release 3.3 or later) high
availability features provides easy-to-configure and cost-effective disaster-tolerant high availability.
2 INTRODUCTION
While there can be some overlap, the features and technologies used to provide high availability
are generally distinct from those used to ensure disaster tolerance. High-availability solutions
typically focus on protecting against individual component or system failures, while disaster-
tolerance solutions typically focus on protecting against data corruption and site failures. Each can
help to keep business-critical systems operational, but neither alone is sufficient to ensure the levels
of near continuous operation required for most business-critical systems. For example, while
redundant or clustered hardware can eliminate individual systems as points of failure, it does not
protect against a disaster that incapacitates the site where the systems reside. Similarly, while a
standby database solution, such as Oracle Data Guard, provides excellent disaster tolerance features,
it may take time to switch operations from the primary site to a physically separate standby site (for
example, you may first need to apply additional time-delayed redo data to make the standby
database current before it can be reconfigured as the new primary site). This is true not only when
dealing with unexpected disasters and component failures, but also for the more common outages
associated with planned maintenance and upgrades. Fortunately, Oracle supports many
technologies that easily can be combined to provide the required levels of high availability and
disaster tolerance.
This paper describes how to combine Oracle Data Guard with Oracle Fail Safe to provide an
enhanced level of disaster-tolerant high availability for single-instance Oracle9i Database Enterprise
Edition databases deployed on Windows clusters configured with Microsoft Cluster Service. The
result is a complementary and easy-to-configure set of high availability and disaster tolerance
features (shown in Table 1 below) that eliminates many potential sources of downtime.
3 CLUSTER CONFIGURATION
A cluster is a group of independent computing systems (nodes) that operates as a single virtual
system. The component redundancy in clusters eliminates individual host systems as points of
failure and provides a highly available hardware platform for deploying mission-critical databases
and applications.
Microsoft Cluster Service is easily installed on any cluster hardware configuration listed on the
Microsoft hardware compatibility list (http://www.microsoft.com/hcl/default.asp, search for All
Products of type Cluster). Although the initial steps to begin installing and configuring MSCS differ
based on the underlying operating system, the overall process is similar and takes only a few
minutes per cluster node. The MSCS installation and cluster configuration process is described in
detail in the documentation accompanying your Windows operating system software and also in the
Installing MSCS lab module in the online course Introduction to Oracle Fail Safe, which is available
through Oracle University Online Learning (http://www.oracle.com/education/oln/index.html).
Note that you must first install MSCS and create a working cluster before you can install Oracle Fail
Safe. Other Oracle program executable software (database and application software) can be
installed on a private disk on each cluster node before or after MSCS installation (refer to the Oracle
Fail Safe Installation Guide for more information).
The example configuration uses two clusters, one for the primary database and one for the standby
database. Each cluster consists of two identically configured nodes, and each node must have:
After the database installations are complete, use the Oracle Net Configuration Assistant (NetCA) to
create a default Oracle listener service and the initial network configuration files on each primary
and standby cluster node.
Replace each occurrence of C:\oracle with your ORACLE_HOME directory path, verify that the
path for jre.exe is correct, and execute the command from the MS-DOS command prompt on each
cluster node. To ensure that the network configuration files are created consistently and correctly on
each cluster node, delete or rename any previously existing listener.ora, sqlnet.ora, or tnsnames.ora
files in the Oracle home network\admin directory before you execute this command.
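For example, assuming C:\oracle is the Oracle home (as in the example configuration), the cleanup
might look like the following sketch; adjust the path to match your own Oracle home:
C:\> del C:\oracle\network\admin\listener.ora
C:\> del C:\oracle\network\admin\sqlnet.ora
C:\> del C:\oracle\network\admin\tnsnames.ora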
After completing the initial network configuration, use Oracle Database Configuration Assistant
(DBCA) to create the primary database on the primary cluster using the selected cluster disks (Disk
H: in the example). If the cluster disks selected for the database files are not already owned by the
node where you are running DBCA, use Microsoft Cluster Administrator to move these disks to that
node. Table 2 lists the input values used for the initial DBCA windows in the example:
DBCA Input                         Value Used For Example Configuration
Template Name General Purpose
Global Database Name testdb1.us.oracle.com
SID testdb1
Connection Option Dedicated Server Mode
Table 2: Initial DBCA Input Values Used for Example Primary Database
At the DBCA Initialization Parameters screen shown in Figure 5, make any necessary changes to
ensure that the primary database is configured with ARCHIVELOG mode enabled and that all
database data, log, and control files are correctly located on cluster disks. The following list of
changes is typical for most databases:
1. Click the Archive tab, then click File Location Variables and create variables to specify the
cluster disk locations to be used for the database files. The example uses a variable
DB_FILES with value H:\oracle.
2. Under the Archive tab:
• Enable Archive Log Mode
• Enable Automatic archival
From the Services Control Panel, start the Oracle primary database instance and listener services (if
they are not already started). Depending on how your system is configured, you also may need to
create entries for the database in the listener.ora and tnsnames.ora network configuration files
before you can connect to the primary database. Update these files if necessary and then be sure to
stop and restart the Oracle Listener and Oracle Intelligent Agent processes on any system where you
changed these files. When the network environment is configured correctly, use SQL*Plus to
connect to the database and verify that you are able to query the database successfully (for
example, by executing SELECT * FROM ALL_USERS from the SYSTEM account).
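For example, assuming a net service name testdb1 exists for the primary database (as created in the
earlier network configuration steps), a quick check from SQL*Plus might look like the following
sketch; the second query should return ARCHIVELOG if archiving was enabled as described above:
C:\> sqlplus system/<password>@testdb1
SQL> SELECT * FROM ALL_USERS;
SQL> SELECT log_mode FROM v$database;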
Because the Oracle Data Guard configuration files must be accessible by whichever cluster node
hosts the primary database virtual server, these files must be located on shared-nothing cluster disks
(usually the same disks used for the database data, control, and log files). To specify the location to
be used when these files are created, use SQL*Plus to connect to the primary database through the
SYS database user account (as SYSDBA) and, at the SQL prompt, enter the following commands:
SQL> alter system set dg_broker_config_file1 = '<path>\dr1<instance_name>.dat' scope=both;
SQL> alter system set dg_broker_config_file2 = '<path>\dr2<instance_name>.dat' scope=both;
where <path> is the shared-nothing cluster disk location where you want these files to be created
(H:\oracle\database in the example) and <instance_name> is the SID for the primary
database (testdb1 in the example). If the cluster disk directories specified in <path> do not
already exist, be sure to create them. The scope=both qualifier ensures that this change is written
both to memory and to the database system parameter file (spfile) on disk.
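Substituting the example values (H:\oracle\database for <path> and testdb1 for the SID), the
commands and a quick follow-up check would look similar to this sketch:
SQL> alter system set dg_broker_config_file1 = 'H:\oracle\database\dr1testdb1.dat' scope=both;
SQL> alter system set dg_broker_config_file2 = 'H:\oracle\database\dr2testdb1.dat' scope=both;
SQL> show parameter dg_broker_config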
The standby_archive_dest parameter for the primary database is used only if the database is later
reconfigured as a standby database. By default, it is set to %ORACLE_HOME%\RDBMS. However,
because the standby archive log files must be accessible by whichever cluster node hosts the
primary database virtual server, these files must be located on shared-nothing cluster disks (usually
the same disks used for the database data, control, and log files). To specify the required cluster
disk location for the standby archive log files, use SQL*Plus to connect to the primary database
through the SYS database user account (as SYSDBA) and, at the SQL prompt, enter the following:
SQL> alter system set standby_archive_dest = ‘<path>’ scope=both;
where <path> is the shared-nothing cluster disk directory where you want these files to be created
(H:\oracle\oradata\testdb1\standby_archive in the example). If the cluster disk
directories specified in <path> do not already exist, be sure to create them. The scope=both
qualifier ensures that this change is written both to memory and to the database system parameter
file (spfile) on disk.
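With the example path substituted, and assuming disk H: is currently owned by the node where you
are connected, the steps would look similar to this sketch:
C:\> mkdir H:\oracle\oradata\testdb1\standby_archive
SQL> alter system set standby_archive_dest = 'H:\oracle\oradata\testdb1\standby_archive' scope=both;
SQL> show parameter standby_archive_dest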
If you have already configured Oracle Enterprise Manager and the Oracle Management Server on a
separate management system, you optionally can start Oracle Enterprise Manager on that system
and skip ahead to section . Note that, depending on your environment, your Oracle Enterprise
Manager tree view may differ from that shown in this paper.
If you have not already configured Oracle Enterprise Manager on another system, you optionally
can configure Oracle Enterprise Manager now using the Enterprise Manager Configuration Assistant
(EMCA). For production deployments, you should always put Oracle Management Server and the
Oracle Enterprise Manager repository database on a separate system so that you will not lose access
to the repository when you take one of the primary or standby sites offline. However, for purposes
of illustration, one of the primary cluster nodes (FS-152) will be used in the example configuration.
Oracle Enterprise Manager is installed automatically when you install Oracle9i. From the system
where you plan to configure Oracle Enterprise Manager, select:
Start > Programs > Oracle - <oracle_home>
where <oracle_home> is the name of the previously created Oracle9i Database Enterprise Edition
home (dbs_home for the example configuration). Then, to open the Oracle Enterprise Manager
Configuration Assistant, choose:
Configuration and Migration Tools > Enterprise Manager Configuration Assistant
On the second EMCA window (Configuration Operation), select Configure local Oracle Management
Server. For the third window, choose Create a new repository. Choose Typical for the Create New
Repository Options on the fourth window. Record the username and password information from the
Create Repository Summary window for future use and click Finish to complete the Oracle
Enterprise Manager configuration process.
The steps in section should automatically start Oracle Management Server. However, if
necessary, start Oracle Management Server from the command-line prompt by entering the
command oemctl start oms. Note that the Oracle Management Server must be able to connect
to the Oracle Enterprise Manager repository database in order to start. If you have difficulty starting
the Oracle Management Server, verify that the repository database is configured correctly and that
the corresponding instance and listener services are started.
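For example, from an MS-DOS command prompt on the system hosting the Management Server, the
following sketch starts the Management Server and then lists the Oracle services that are currently
running (service names vary with the Oracle home name):
C:\> oemctl start oms
C:\> net start | find "Oracle"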
During the Oracle9i Database Enterprise Edition installation process, an Oracle Intelligent Agent
process was created for each primary and standby cluster node. Issue the command agentctl
status from the MS-DOS command line on each cluster node to determine the status of the Agent
on each node. For any node where the Agent is not already started, start the Agent from an MS-DOS
command window by issuing the command agentctl start. Note that in general, any time the
configuration on a cluster node changes, you will need to stop and restart the Oracle Intelligent
Agent on that node to allow the new changes to be discovered by the Agent.
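For example, the restart sequence after a configuration change might look like this sketch, run from
an MS-DOS command prompt on the affected cluster node:
C:\> agentctl status
C:\> agentctl stop
C:\> agentctl start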
Run the Enterprise Manager Discovery Wizard, also referred to as the Discovery Wizard, to discover
each node of the primary and standby clusters and to gain access to the databases that you want to
configure and administer with Data Guard Manager. To invoke the Discovery Wizard from the
Enterprise Manager Console menu bar, choose:
Navigator > Discover Nodes
Follow the directions in the Discovery Wizard to discover each of the nodes in the primary and
standby clusters (FS-151, FS-152, FS-241, and FS-242 in the example configuration). When finished,
all discovered nodes and databases are displayed in the Enterprise Manager navigator tree. For the
example configuration, Oracle Enterprise Manager discovers and displays the following, as shown in
Figure 7:
• On the node where the primary database was created (FS-151 in the example
configuration), the wizard discovers the primary database (testdb1.us.oracle.com).
• In addition, if you optionally used EMCA to create a repository database on one of the
cluster nodes, the Oracle Enterprise Manager repository database (OEMREP.us.oracle.com) is
discovered on the node where it was created (FS-152 for the example configuration).
• On all cluster nodes, the wizard finds the Oracle home where you have installed Oracle9i
Database Enterprise Edition.
You must set preferred credentials on each of the primary and standby cluster nodes to ensure Data
Guard Manager can run remote processes to create the configuration. To set preferred credentials
from the Enterprise Manager Console menu bar, select:
Configuration > Preferences > Preferred Credentials
For each cluster node, specify an account with administrator privileges on that system. Note that the
selected account also must be granted logon as a batch job user rights for the system. After setting
the preferred credentials, verify that you are able to use Oracle Enterprise Manager to successfully
run a small test job on each cluster node (for example, execute a system dir command).
Although setting preferred credentials for databases is not required, you also might want to set
preferred credentials (for example, the SYS account) for the primary database (and also later for the
standby database when it is created).
Once the preceding steps have been completed, you can open Data Guard Manager from the
command-line prompt or from the Enterprise Manager Console:
• From the command-line prompt, enter oemapp dataguard.
• From the Oracle Enterprise Manager Console, use either of the following methods:
o Choose Tools > Database Applications > Data Guard Manager
o From the Database Applications drawer, move the cursor over the icons and select
the Data Guard Manager icon.
The steps to create the initial Oracle Data Guard configuration are described in this section. To
open the Create Configuration Wizard, right-click Oracle Data Guard Configurations in the navigator
tree and choose Create Configuration Wizard. Figure 9 shows the initial welcome screen for the
Create Configuration Wizard.
Click Details on the Create Configuration Wizard welcome page (see Figure 9) and review the
checklist of setup requirements and information that is displayed. If necessary, make any additional
changes required to set up the Oracle Data Guard environment on the primary and standby clusters.
Click Next to continue.
Enter a unique Oracle identifier for the name of the new Oracle Data Guard configuration. Figure 10
shows the Configuration Name window in which the example configuration name (testdb_config)
has been entered. Click Next to continue.
Select the primary database from the list of discovered databases. As shown in Figure 11, the
selected primary database for the example is TESTDB1.us.oracle.com. Accept the default primary
site name; this site will be deleted and replaced by a new site after Oracle Fail Safe Manager
configures the primary and standby databases for failover.
Verify that the cluster disks used by the primary database are owned by the node where the
database was created. If necessary, use Microsoft Cluster Administrator to move the disks to this
node. Ensure that the database instance is started and that you can connect to the database. Click
1H[W to continue.
The wizard allows you to create a new physical or logical standby database or to add an existing
standby database. For the example, choose Create a New Physical Standby Database (as shown in
Figure 12) and click Next to continue.
Review the Create Configuration Wizard Summary window (shown in Figure 17) and verify that the
information displayed for the primary and standby sites is correct. If you find an error, click Back to
move backward through the wizard screens and make the needed changes. When the information is
correct, click Finish. The wizard displays a report similar to that shown in Figure 18 that records
progress while the configuration is created.
After closing the Create Configuration Wizard progress report, use Data Guard Manager to connect
to the newly created configuration. You will be prompted to enter the database username and
password required to connect to the configuration, as shown in Figure 19. Once connected to the
configuration, expand the Data Guard Manager tree view and verify that the configuration
properties are similar to those shown in Figure 20.
By default, a physical standby database automatically applies archived redo logs when they arrive
from the primary database. A logical standby database automatically applies SQL statements once
they have been transformed from the archived redo logs. But in some cases, you may want to create
a time lag between the archiving of a redo log at the primary site and the applying of the redo log
at the standby site. A time lag can protect against the application of corrupted or erroneous data
from the primary site to the standby site. For example, if the problem is detected on the primary
database before the logs have been applied to the standby database, administrators have the option
to switchover operations to the unaffected standby database (where the problem has not yet
propagated), effectively rolling back the clock to a point in time before the problem occurred.
To specify a time lag for applying redo logs at the standby site:
• Select the standby database in the Data Guard Manager tree view.
• Click on the Properties tab.
• Locate the DelayMins property and enter the desired redo log application delay in minutes.
• Click Apply.
Changing the DelayMins property for a standby database updates the DELAY attribute of the
corresponding LOG_ARCHIVE_DEST_n initialization parameter for the primary database. For the
example configuration, a value of 30 minutes was entered, as shown in Figure 21.
Figure 21: Optionally, Specify a Time Delay for Applying Redo Logs
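If you want to confirm the change from SQL*Plus on the primary database, a quick check is sketched
below; the assumption that the standby destination is LOG_ARCHIVE_DEST_2 may differ in your
configuration:
SQL> show parameter log_archive_dest_2
SQL> SELECT dest_id, delay_mins FROM v$archive_dest WHERE delay_mins > 0;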
At this point, Data Guard Manager has successfully created and configured the initial standby
configuration. However, when the primary and standby databases are configured with Oracle Fail
Safe, client access to these databases will change from using node-specific network addresses to
node-independent virtual addresses and the initial Oracle Data Guard configuration and the
database information stored in the Oracle Enterprise Manager repository will not be valid. Because
of this, it is necessary to remove the initial Oracle Data Guard configuration from the Data Guard
Manager tree view and to delete the initially discovered primary and standby cluster nodes and
database resources from the Enterprise Manager tree view. Once the Oracle Fail Safe configuration
process is completed, these tree views will be updated with the final disaster-tolerant high
availability configuration (as described in sections and ).
To remove the initial configuration information, right-click the name of the initial Oracle Data Guard
configuration (testdb_config in the example) and then click Remove in the pop-up menu. In the
resulting window, as shown in Figure 22, ensure that the Remove Oracle Data Guard Configuration
Permanently and Remove All Destinations in Configuration options are chosen. This leaves each
database in place, but stops transport and application of logs to the standby database.
Finally, to ensure that there will be no resource discovery conflicts in later steps, you must stop and
then disable the default Oracle Intelligent Agent service (Oracledbs_homeAgent in the example) on
each cluster node. To do this, open the Windows Services Control Window and right-click the
Oracle Intelligent Agent service to open the properties page for the service, as shown in Figure 23.
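As a command-line alternative to the Services window, the following sketch stops and disables the
service, assuming the example service name Oracledbs_homeAgent and that the sc utility is available
on your Windows release:
C:\> net stop Oracledbs_homeAgent
C:\> sc config Oracledbs_homeAgent start= disabled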
Insert the Oracle Fail Safe CD-ROM into the CD-ROM drive of one of the primary or standby cluster
nodes. From the initial Autorun screen, click Install/Deinstall Products to open the Oracle Universal
Installer Welcome screen shown in Figure 24. If the Autorun screen is not displayed on your system
after the Oracle Fail Safe CD-ROM is inserted, you can start the Oracle Universal Installer using the
setup.exe program located in the \install\Win32\ directory on the CD-ROM. Click Next to continue.
In the File Locations window, accept the default source location and specify the name and location
(on a private disk) for the Oracle home directory where Oracle Fail Safe is to be installed. To ensure
that software components can fail over correctly, the Oracle home where Oracle Fail Safe is
installed must have the same name on each cluster node; for the example, the Oracle Fail Safe
installation home is ofs_home, as shown in Figure 25. After entering the required information, click
Next to continue.
From the Available Products window, choose Oracle Fail Safe (as shown in Figure 26) and then
click Next to continue.
In the Installation Types window shown in Figure 27, choose Typical, and then click Next to
continue.
A Reboot Needed After Installation window, similar to that shown in Figure 28, warns you to reboot
the system after the installation is complete. Note that this window is not displayed if you have
previously installed Oracle Fail Safe components from this release and the changes to the system
path and Oracle resource DLL have been made and detected previously. Click Next to continue.
Review the installation summary screen, which should be similar to that shown in Figure 29, and
then click Install to begin installing the selected software components. Note that if there is
insufficient space to perform the installation, the text below Space Requirements is displayed in red.
If the installation is successful, the Configuration Tools window and the Oracle Services for MSCS
Account/Password dialog box are displayed, as shown in Figure 31.
In the Oracle Services for MSCS Account/Password dialog box, enter the domain, user name, and
password of an operating system user account that has Administrator privileges. This is the account
that Oracle Services for MSCS will be using. Oracle Services for MSCS runs as a Windows service
(called OracleMSCSServices) under a user account that must be a domain user account (not the
system account) that has Administrator privileges on all cluster nodes. The account must be the
same on all cluster nodes, or you will receive an error message when you attempt to connect to a
cluster using Oracle Fail Safe Manager.
Enter the information in the form Domain\Username, as shown in Figure 31, or if you are using
Windows 2000, you optionally can enter a user principal name in the form
Username@DnsDomainName.
At the end of the installation, the Oracle Universal Installer displays the window shown in Figure
32. Click Installed Products to confirm that Oracle Fail Safe has been successfully installed. Click
Release Information to view the Oracle Fail Safe Release Notes. Click Exit to exit the installer.
If an installer screen instructing you to reboot after the installation is complete was displayed during
the installation, reboot the cluster node. A reboot is required only for the initial installation of an
Oracle Fail Safe release or if you have installed Oracle Fail Safe into a new Oracle home (on a node
with multiple Oracle homes).
Repeat the installation steps described in sections through on each additional
primary and standby cluster node.
After Oracle Fail Safe has been successfully installed and each node of the primary and standby
clusters has been rebooted (as described in section ), open Oracle Fail Safe Manager on one of
the cluster nodes by choosing the following from the Windows taskbar:
Start > Programs > Oracle - <Oracle_Home> > Oracle Fail Safe Manager
Oracle Fail Safe supports either a single initialization parameter file located on the same cluster
disks as the database data, log, and control files, or a separate initialization parameter file on each
cluster node, provided that the path on each node is the same and that you manually ensure that
any relevant changes are propagated to all copies of the initialization parameter file. Data Guard
Manager expects to find an initialization parameter file or server parameter file in the Oracle home
database directory. Because Oracle Data Guard may make changes to the content of the Oracle9i
Database server parameter file (for example, during a site switchover), there is a potential for server
parameter file synchronization issues if a separate copy of the parameter file is maintained on each
cluster node.
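To check whether an instance is currently using a server parameter file, and from which location, a
quick SQL*Plus query is:
SQL> show parameter spfile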
In the Microsoft Cluster Service environment, a virtual server is a group of resources that contains at
least one virtual address (a network name resource and its associated IP address resource). The
Oracle Fail Safe Create Group wizard collects the information needed to create an empty group and
then optionally allows you to add one virtual address to the group (the Add Resource to Group
wizard allows you to add additional virtual addresses after the group is created). As previously
noted (during the cluster network configuration and validation steps described in section ), the
Before configuring the primary and standby databases, first run the Oracle Fail Safe Manager Verify
Standalone Database command on each database. This command verifies that each database and its
associated network configuration files are correctly configured for use with Oracle Fail Safe.
To execute this command, select the icon for the database in the tree view and choose
Troubleshooting > Verify Standalone Database from the Oracle Fail Safe Manager menu. Enter the
requested information in the dialog box, as shown in Figure 42. Note that although the Service
Name and Instance Name values for the primary and standby databases will differ, the Database
Name value is the same for both databases (testdb1, for the example). If, as in the example, Use
Operating System Authentication is selected, Oracle Fail Safe will automatically make any changes
necessary to enable operating system authentication (if it is not already enabled). Unless you have
The Oracle Fail Safe Manager Add Resource to Group Wizard automates the process of adding the
resources associated with each database to their respective virtual servers. For the example
configuration, the wizard is used twice for each virtual server: once to configure the database
resource and then once more to add an Oracle Intelligent Agent resource.
In the tree view, right-click the primary database and choose Add to Group (as shown in Figure 44)
to open the Add Resource to Group Wizard.
Figure 47: Oracle Fail Safe Manager – Initialization Parameter File Location
Repeat the same process to configure the standby database. After the standby database
configuration process is complete, expand the Oracle Fail Safe Manager tree view to verify that the
contents of the primary and standby virtual server groups (in the example, FS-153 and FS-245,
respectively) are similar to that shown in Figure 57. Note that the group for each database contains
all the cluster resources associated with that database. Virtual server groups used for production
deployments may contain additional resources (for example, additional disks associated with the
database or Oracle Intelligent Agent or additional IP address and network name resources if
multiple virtual addresses are configured for use with the database).
Figure 57: Tree View Showing Primary and Standby Virtual Servers
To begin the process, start the Oracle Management Server and Oracle Enterprise Manager (as
described in sections and ). Then, from the Oracle Enterprise Manager Console menu bar,
choose Navigator > Discover Nodes to open the Oracle Enterprise Manager Discovery Wizard.
Follow the directions in the wizard to discover the primary and standby virtual servers (FS-153 and
FS-245, for the example configuration). After discovery completes, the Oracle Enterprise Manager
tree view should be similar to that shown in Figure 58.
Note that the way the primary and standby databases and virtual servers are named in the tree view
may differ due to slight differences in the way the various wizards updated the database and
network configuration information on each cluster (the virtual servers shown in Figure 58, for
example, are named fs-153 and fs-245.us.oracle.com). This does not affect the primary/standby
configuration process. Also, because not all Oracle Intelligent Agent releases are fully cluster aware,
you may encounter errors if you attempt to discover both virtual servers and individual cluster
nodes. For the example, because the default (node-specific) Oracle Intelligent Agent was disabled
(refer to section ), it is only possible to discover the primary and standby virtual servers.
In the same way as described in section , open the Data Guard Manager Create Configuration
Wizard. Click Next to proceed past the initial Welcome screen, enter the name you want to use for
the configuration (testdb1_config, for the example) as shown in Figure 59, and click Next.
Click the new configuration in the Data Guard Manager tree view to open the configuration
connection information dialog box shown in Figure 68. Enter the requested account information
(SYS for the example) and click OK to connect to the configuration.
5 OTHER CONFIGURATIONS
As noted in the introduction, Oracle Fail Safe and Oracle Data Guard can be combined in multiple
ways to provide a range of disaster-tolerant high availability solutions. This section compares the
example configuration described in this paper with three alternative disaster-tolerant high
availability configurations. When designing any disaster-tolerant high availability solution, it is
important to understand the trade-offs among the various possible configuration options, particularly
with respect to the risks of data loss or interruption of service. Refer to the Oracle Fail Safe and
Oracle Data Guard documentation listed in section for complete information.
This is the configuration described earlier in the paper; it adds additional availability to a typical
Oracle Data Guard primary/standby configuration by replacing each standalone system with an
active/passive cluster and using Oracle Fail Safe to configure the primary and standby databases so
that they can fail over between cluster nodes, as shown in Figure 70.
5.1.2 Trade-offs
• Distance between nodes makes asynchronous redo shipping the best solution, but introduces a
risk of data loss and data divergence between the primary and standby databases
• Passive cluster nodes (nodes B and D in Figure 70, for example) typically perform no useful
work during normal operations
5.2.1 Benefits
• This configuration provides an inexpensive way to enhance an Oracle Fail Safe deployment to
protect against media failure and data corruption (just add a second disk array)
• There is no risk of data loss or data divergence between the primary and standby databases
(when configured using the Oracle Data Guard maximum availability mode)
• An Oracle Data Guard site failover or switchover to a standby location is required only if all
primary cluster nodes are incapacitated
5.2.2 Trade-offs
• Primary database shuts down when network access to the standby database is interrupted
• Instance failover times from one cluster node to another can be slower than for active/passive
clusters because not all resources on the failover node are available to Microsoft Cluster
Service to process the failover
• Because there is no time delay for the application of redo logs to the standby databases, there
is no protection from human error or other sources of data corruption
• Supported rolling upgrade scenarios of hardware and some software (described in section )
require two MSCS failovers between cluster nodes (only one MSCS failover is required for
active/passive clusters)
• Disaster tolerance is limited to protection from media failure (two copies of the data) and to a
basic level of protection from local area disasters (like floods or fires) largely based on the
degree of geographic separation between the cluster nodes and disk arrays. Because the
maximum separation between components in an MSCS fibre channel cluster is currently on the
order of 7-10 kilometers, this configuration does not protect against wide area disasters (like
hurricanes or earthquakes).
5.3.1 Benefits
• Efficiently offloads reporting and backup operations from the primary database. For example,
Standby Database 1 could be configured as a logical standby database available at all times for
read-only reporting and Standby Databases 2 and 3 could be configured as physical standby
databases also available for periodic database backups and occasional additional reporting
5.3.2 Trade-offs
5.4.1 Benefits
5.4.2 Trade-offs
Locate and right-click the primary database in the Data Guard Manager configuration tree view, as
shown in Figure 74. Then, from the pop-up menu, choose Disable. This will temporarily disable
Oracle Data Guard monitoring of the database and prevent unnecessary alerts when Oracle Fail Safe
Manager is used to fail over the database virtual server in the steps that follow. Choose File > Exit
to close Data Guard Manager.
Figure 75: Open Oracle Fail Safe Manager Update Database Password Wizard
In the first window, select the primary database, as shown in Figure 76. Click Next to continue.
For each logical standby database, repeat the steps described in section to update the SYS
database user account password on all nodes in the standby cluster. Note that logical standby
databases are fully functional databases that are usually administered and managed in the same way
as primary databases, with the exception that the tables replicated on the logical standby database
are read-only.
For each physical standby database (such as the physical standby database in the example
configuration), however, the SYS database user account password information cannot be updated
while the standby database is operating in managed recovery or read-only modes. To change the
SYS account password for a standby database, you must choose one of the following options:
• Option 1 (updates SYS account password and retains any other entries in the password file)
1. Perform a site switchover (refer to section ) to convert the standby database to the
primary database.
2. Update the SYS database user password, as described in section .
3. Perform a site switchover to return the primary and standby databases to their original
sites (refer to section ). Step 3 is optional for configurations where the physical
locations of the primary and standby databases are not important.
The <fname> variable is the original location and name of the password file on that node
(C:\oracle\database\PWDtestdb12.ora in the example configuration) and the <password>
variable is the new password for the SYS database user account.
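The utility being described here is presumably orapwd; a minimal sketch using the example file name,
run from an MS-DOS command prompt on the standby cluster node being updated, would be:
C:\> orapwd file=C:\oracle\database\PWDtestdb12.ora password=<password>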
4. From the Oracle Fail Safe Manager tree view, right-click the standby database virtual server
group and choose Move to a Different Node. Click OK to acknowledge the Confirm Move
Group informational message.
5. Repeat Steps 2 and 3 on the new cluster node.
6. Repeat Steps 4 and 5 for each additional node (if the cluster has more than two nodes).
7. After the password file has been updated on all standby nodes, choose Resources >
Update Database Password to open the Update Database Password Wizard. In the first
window, select the standby database and click Next to continue. In the second window,
enter the new SYS password in the Old Password, New Password, and Confirm New
Password fields. Note that when all three password fields contain the same value, as in
this case, Oracle Fail Safe Manager will verify that the password is valid and update the
SYS account password information stored by Oracle Fail Safe, but will not attempt to
update the password file on any of the cluster nodes. Click Finish to continue. Review the
summary screen and, if the information is correct, click OK to continue. Click OK to close
the Finished Updating Passwords window and Close to close the status window.
8. Use Oracle Fail Safe Manager to move the standby virtual server group back to the initial
cluster node so that the physical node hosting the group matches the node expected by
Data Guard Manager (refer to the section for additional information about this
requirement).
9. From the Oracle Enterprise Manager main menu, choose Configuration > Preferences >
Preferred Credentials and update the preferred credentials for the standby database with
the new SYS database user account password information (as SYSDBA). Click OK to apply
this change.
In most cases, hardware or operating system software upgrades can be performed without an
Oracle Data Guard site switchover. Table 3 summarizes the rolling upgrade process for hardware or
operating system upgrades. Note also the following restrictions:
1. Change the database virtual server group failback attributes to the Prevent Failback mode.
   Tool: Oracle Fail Safe Manager
   Details: Follow the instructions in the Oracle Fail Safe Manager online help. Changing the
   failback attributes prevents the group from failing back to the node while it is being rebooted
   or when the cluster service is restarted.
2. Perform a planned failover by moving all groups on the node being upgraded to another node.
   Tool: Oracle Fail Safe Manager
   Details: Choose Groups > Move to a Different Node. (See the instructions in the Oracle Fail Safe
   help for more information.) By moving all groups to another node, you can work on the current
   node. When moving a group that contains a database with this method, Oracle Fail Safe will
   perform a checkpoint operation prior to moving the group.
3. Exit Oracle Fail Safe Manager.
   Tool: Oracle Fail Safe Manager
   Details: Choose File > Exit to exit Oracle Fail Safe Manager.
4. Perform the hardware or operating system upgrade.
   Tool: Various
   Details: Follow the instructions provided by your hardware or operating system vendor.
5. Run the Verify Group operation on all groups.
   Tool: Oracle Fail Safe Manager
   Details: Select Troubleshooting > Verify Group to check all resources in all groups and confirm
   that they have been configured correctly. If you upgraded Oracle database software, the Verify
   Group operation will update the tnsnames.ora file. If prompted, click Yes. Otherwise, the Oracle
   database might not come online after you add it to a group.
7. Run the Verify Cluster operation.
   Tool: Oracle Fail Safe Manager
   Details: This step verifies that there are no discrepancies in the software installation, such as
   with the release information on each node in the cluster.
8. Restore the failback policy attributes on the groups.
   Tool: Oracle Fail Safe Manager
   Details: Follow the instructions in the Oracle Fail Safe Manager online help to set the failback
   policy for all groups in the cluster.
9. Fail back groups, as necessary, by moving groups back to the other node or nodes.
   Tool: Oracle Fail Safe Manager
   Details: Perform a planned failover to move the groups back to the preferred node. This
   rebalances the workloads across the cluster nodes. Refer to the instructions in the Oracle Fail
   Safe Manager online help regarding moving a group to a different node.
Generally, the rolling upgrade steps described in the Oracle Fail Safe Installation Guide for
upgrading Oracle Fail Safe or other Oracle software apply, provided that the software being
upgraded is installed in a different home from the Oracle database software and that no scripts or
other changes must be applied to the database as part of the upgrade process.
Note that additional steps beyond those listed in Table 3 are required when performing rolling
upgrades of Oracle software. Refer to the Oracle Fail Safe Installation Guide for complete details. If
any software updates are required to the database to ensure that it remains compatible with the
Oracle Fail Safe or other Oracle application software being upgraded, refer to section for
information about additional steps that may be required.
Both Oracle Fail Safe and Oracle Data Guard currently impose restrictions when performing
upgrades of Oracle database software. The downtime required during database upgrades varies,
based on the nature of the upgrade being performed. In many cases, Oracle Fail Safe can help to
reduce the overall downtime experienced by end users during upgrades of Oracle database
software by allowing program executable files to be upgraded on one cluster node while users
continue to work on another cluster node. Refer to the database upgrade information provided in
the Oracle Fail Safe Installation Guide and the Oracle9i Database documentation set for more
information. Also, several upgrade-related Support Notes for Standby databases are available
through Oracle MetaLink and are listed in section of this paper.
In general, if there are no changes to the in-memory or on-disk structure of the database and if the
initial three fields of the database version do not change (for example, during the application of a
software patch to the program executable software), then the rolling upgrade process outlined in
section can be used to minimize downtime. Perform each step in parallel on the primary and
standby clusters so that the state of the software on the primary and standby cluster nodes remains
consistent. Note that you will not need to take the database out of the virtual server group during
the rolling patch application process.
If any of the initial three fields of the database version changes (for example, from release 9.0.1 to
release 9.0.2), or if any scripts or changes must be applied to the database structure during the
upgrade process, then additional steps that may include periods of database downtime are required.
The specific upgrade steps also depend on whether any nologging changes to the database are
required. In general, you will need to unconfigure the database virtual servers, upgrade the
databases, and then reconfigure the database virtual servers. If nologging changes to the database
are required, then the additional step of refreshing or re-instantiating the standby databases may
also be required. Table 4 summarizes the major tasks in the database upgrade process.
2. Perform a hot backup or cold backup of the primary database. Also back up the initialization
   parameter files, server parameter files, and Oracle Data Guard configuration files for the
   primary and standby databases.
   Tool: Various
   Details: This step ensures that you can recover the original configuration if necessary.
3. Remove the Oracle Data Guard configuration from the Data Guard Manager and Oracle
   Enterprise Manager tree views.
   Tool: Data Guard Manager and Oracle Enterprise Manager
   Details: Follow the same steps previously described in section . All entries for the primary and
   standby databases and virtual servers should be removed from the tree views.
4. Remove the primary database from the primary database virtual server.
   Tool: Oracle Fail Safe Manager
   Details: In the Oracle Fail Safe Manager tree view, locate the primary cluster, select the primary
   database, and choose Resources > Remove from Group.
5. Remove the standby database from the standby database virtual server.
   Tool: Oracle Fail Safe Manager
   Details: In the Oracle Fail Safe Manager tree view, locate the standby cluster, select the standby
   database, and choose Resources > Remove from Group. If there are multiple standby databases,
   perform this step for each standby database.
6. Exit Oracle Fail Safe Manager.
   Tool: Oracle Fail Safe Manager
   Details: Choose File > Exit to exit Oracle Fail Safe Manager.
7. Upgrade the program executable files in the primary and standby database Oracle home
   directories and upgrade the primary database.
   Tool: Various
   Details: Follow the documented upgrade or migration instructions for your database releases. To
   identify the specific upgrade steps for your configuration, review the information in:
   • The Oracle Fail Safe Concepts and Administration Guide section on Upgrading a Fail Safe
     Database with the Oracle Database Upgrade Assistant
   • Oracle Support Note 165296.1
For each database that will be directly involved in the role transition, disable Is Alive polling by
performing the following steps:
1. Select the database in the Oracle Fail Safe Manager tree view.
2. Click the Database tab.
3. In the Database Polling box, select Disabled.
Use Oracle Data Guard Manager to perform the switchover or failover operation.
For each database that was directly involved in the role transition, reenable Is Alive polling by
performing the following steps after the switchover or failover is complete:
1. Select the database in the Oracle Fail Safe Manager tree view.
2. Click the Database tab.
3. In the Database Polling box, select Enabled.
4. Click Apply.
Note also that Oracle Fail Safe automatically reenables Is Alive polling each time the group
containing the database is moved to another cluster node.
After completing a switchover or failover, use the Oracle Fail Safe Manager Verify Group menu
command (click Troubleshooting > Verify Group) to verify the primary and standby database virtual
server groups. Correct any reported problems and rerun the Verify Group command until it is
successful. As before, you can safely ignore the FS-10288 parameter file location warning.
Follow the procedure described in section 6.4.1 to disable Is Alive polling on the primary and
standby databases that will be participating in the switchover operation (testdb1 and testdb12 in the
example). Then choose Object > Switchover from the Oracle Data Guard Manager menu to start
the Data Guard Manager Switchover Wizard, as shown in Figure 84.
Refer to the Oracle9i Data Guard Concepts and Administration manual for information on the
specific RMAN commands used to back up databases configured with Oracle Data Guard (for
example, refer to the section Scenario: Using a Standby Database to Back Up the Primary
Database). Note that you will typically back up only the primary database control file and will
offload all other backup operations to a standby database. To ensure that your backup software can
always access the database disks, you may find it helpful on the system where the backup software
will be run to map each cluster disk used by the database as a node-independent network drive
through the virtual server network name (for example, map drive Z: to the H: cluster disk).
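As a rough sketch only (assuming the physical standby is mounted with managed recovery cancelled,
that RMAN can reach the standby instance through a net service name such as testdb12, and that a
recovery catalog with a hypothetical net service name rcat is in use), offloading the datafile and
archived log backups to the standby while backing up the current control file on the primary might
look like:
C:\> rman target sys/<password>@testdb12 catalog rman/<password>@rcat
RMAN> BACKUP DATABASE;
RMAN> BACKUP ARCHIVELOG ALL;
RMAN> EXIT;
C:\> rman target sys/<password>@testdb1 catalog rman/<password>@rcat
RMAN> BACKUP CURRENT CONTROLFILE;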
7.2 Oracle9i Database High Availability and Disaster Recovery Web Site
• http://otn.oracle.com/deploy/availability
June 2002
Author: Laurence Clarke
Contributing Authors: Vivian Schupman, Ingrid Stuart
Oracle Corporation
World Headquarters
500 Oracle Parkway
Redwood Shores, CA 94065
U.S.A.
Worldwide Inquiries:
Phone: +1.650.506.7000
Fax: +1.650.506.7200
www.oracle.com