HP-UX Filesystem Whitepaper for Oracle
Database environments
Technical white paper
Table of contents
Executive summary
Intended audience
Using HP OnlineJFS
Creating filesystems
Archive and Redo log filesystems
Oracle tablespace filesystems
Oracle binaries filesystems
Oracle 11.2.0.2 and filesystemio_options=setall
Impact of convosync=direct on sequential access
Sequential I/O penalty with readv
Read-ahead
Oracle 10g and 11g
Summary
For more information
Executive summary
This white paper provides detailed guidelines for specifying mount options for HP-UX VxFS filesystems
in an Oracle Database environment in order to optimize performance. It also covers the concurrent
I/O option, which can provide near-raw performance with single-instance databases.
HP OnlineJFS, an add-on HP-UX software product, can significantly optimize the interaction between
HP-UX and Oracle by supporting direct I/O communications that bypass HP-UX buffer cache.
The paper provides information on addressing the sequential I/O performance penalty that may be
associated with certain system calls and describes VxFS read-ahead, one of the benefits of using
buffer cache.
The following table summarizes the general recommendations for filesystem block sizes and mount
options for Oracle filesystems.
Table 1. Summary of recommendations

Filesystem      Access              Block size   Mount options
Redo Logs       Direct              1 KB (1)     delaylog,mincache=direct,convosync=direct
Redo Logs       Concurrent I/O      1 KB         delaylog,cio
Archive Logs    Direct              1 KB (1)     delaylog,mincache=direct,convosync=direct (2)
Archive Logs    Concurrent I/O      1 KB         delaylog,cio
Tablespaces     Cached (4)          8 KB         delaylog,nodatainlog
Tablespaces     Direct (5)          8 KB (7)     delaylog,mincache=direct,convosync=direct
Tablespaces     Concurrent I/O (6)  8 KB (7)     delaylog,cio
Binaries        Cached              Any          delaylog,nodatainlog

(Notes 1-7 appear after the Intended audience section.)
Recommendations provided herein apply to most HP-UX 11.0, 11i v1 (11.11), 11i v2 (11.23), and
11i v3 (11.31) environments.
Note that it is fairly straightforward to make changes after the standard tools have created the first
filesystem but before the database has used the storage. You can use the mkfs
command to create filesystems with different block sizes, as shown in the following syntax:
mkfs -F vxfs -o bsize=<n> /dev/vg01/rlvol1
Intended audience
This white paper is intended for HP-UX administrators who are familiar with configuring filesystems on
HP-UX.
Notes to Table 1:
1. Filesystem block size can be any size if VxFS 5.0 is used on 11.31, but 1 KB will always work.
2. For VxFS 3.5, install VxFS patches PHKL_32355 or later on 11.11, or PHKL_34179 or later on 11.23.
3. For VxFS 3.3, mount the archive log filesystem for buffered I/O using the delaylog,nodatainlog mount options and tune discovered_direct_iosz.
4. Use cached I/O if Oracle benefits from VxFS read-ahead or db_file_multiblock_read_count is 16 or less.
5. Use direct I/O to avoid the overhead of the buffer/file cache, or if the Oracle block size is less than 8 KB.
6. Use concurrent I/O to avoid JFS inode lock contention. Concurrent I/O is available with the OnlineJFS license and VxFS 5.0.1 or higher.
7. Any filesystem block size can be used, but 8 KB allows flexibility to move to cached I/O if needed.
Using HP OnlineJFS
To optimize VxFS performance, you should use OnlineJFS, an add-on HP-UX software product that
can significantly increase the availability of VxFS filesystems. In addition to its dynamic online
management capabilities, OnlineJFS provides a direct I/O mode that can optimize interaction
between HP-UX and Oracle Database.
With direct I/O, requests that exceed a certain size (by default, 256 KB) are performed
directly, bypassing the HP-UX buffer cache (11.23 and earlier) or the Unified File Cache (11.31).
Such requests are typically initiated by operations (such as backup or copy) that only read the data
once; thus, there is no value to caching this data in the buffer/file cache where it might otherwise
flush out more useful information.
However, mixing buffered I/O and direct I/O on the same files can cause significant performance
issues. Also, for direct I/O to be effective, there are specific alignment requirements. Due to these and
other issues, the Oracle filesystems must be created with certain filesystem block size requirements
and mounted with the appropriate options to optimize the filesystem performance.
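As an illustrative sketch (the volume and mount point names below are hypothetical), a tablespace filesystem mounted for direct I/O could have an /etc/fstab entry such as:

```
/dev/vg01/lvol_data /u01/oradata vxfs delaylog,mincache=direct,convosync=direct 0 2
```

The fields follow the usual HP-UX fstab layout: device file, mount directory, filesystem type, mount options, backup frequency, and fsck pass number.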
Changes were made to the licensed features of OnlineJFS with VxFS 5.0.1. Direct I/O is now
available with the base product (no license needed), and concurrent I/O is now available with the
OnlineJFS product. Please review the Performance improvements using Concurrent I/O on HP-UX 11i v3
with OnlineJFS 5.0.1 and the HP-UX 11i Logical Volume Manager white paper for more information
on concurrent I/O performance improvements.
Creating filesystems
This section provides guidelines for creating filesystems for the following:
Redo logs
Archive logs
Oracle tablespaces
Oracle binaries
Table 2 outlines the key mount options discussed in this document.
delaylog
The delaylog option allows the filesystem to delay the writing of non-critical
filesystem structural information to the VxFS intent log. This option improves
filesystem performance by allowing some system calls to return before the
non-critical write data is placed in the intent log.
This option does not impact most Oracle file operations, such as reads and
writes. It can have some impact when creating, deleting, renaming, and
extending files, which is why delaylog is often recommended.
nodatainlog
Since Oracle always uses synchronous writes, the nodatainlog option can be
used to prevent VxFS from writing data to the intent log as well as the file. Note
that the datainlog/nodatainlog options only impact buffered or cached I/O. They
do not impact direct I/O.
convosync=direct
This option converts buffered or cached synchronous I/O requests to direct I/O.
By bypassing the buffer/file cache, the overhead of copying data between the
Oracle System Global Area (SGA) and cache can be eliminated. Using direct
I/O for Oracle will bypass the VxFS read-ahead code, and can impact sequential
access.
In Oracle Database 8.x and later, the convosync=direct mount option also causes
unnecessary physical I/Os during sequential read operations if
db_file_multiblock_read_count is less than 32, or is set to a non-default value on
Oracle 10g or 11g.
For more information, see the section "Impact of convosync=direct on sequential
access."
mincache=direct
This option converts normal asynchronous I/O requests through the buffer/file
cache to direct I/O. Since Oracle uses the O_DSYNC option for synchronous
I/O, using this option would typically not impact Oracle.
However, if the convosync=direct option is used, HP recommends using
mincache=direct to accommodate I/Os from non-Oracle applications (such as
backup utilities) as well as operating system commands like cp and gzip. This
avoids a mix of direct and buffered I/O, which can degrade performance.
cio
This option also converts normal asynchronous I/O requests through the
buffer/file cache to direct I/O. This option also allows for concurrent read and
write operations by converting exclusive lock requests by writes into shared locks,
relying on the Oracle database code to provide the lock synchronization.
Using the cio mount option implies the use of mincache=direct and
convosync=direct.
For VxFS 3.5 on 11.11 or 11.23, be sure to install the latest VxFS 3.5 patches to address
performance issues when accessing sparse files, as Oracle creates the archive logs as sparse files.
The fixes for accessing sparse files with direct I/O were incorporated into the following patches:
PHKL_32355 or later (11.11)
PHKL_34179 or later (11.23)
For VxFS 3.3, the archive log filesystems must use the following mount options and use buffered I/O
for optimal performance:
delaylog,nodatainlog
Modify the /etc/vx/tunefstab file, changing the discovered_direct_iosz parameter for the archive
filesystem to 2097152, which enhances performance when writing archive files. This parameter will
apply to cached tablespace filesystems as well.
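A minimal sketch of such a tunefstab entry follows (the volume name is hypothetical; check the tunefstab(4) manpage on your release for the exact format):

```
/dev/vg01/lvol_arch discovered_direct_iosz=2097152
```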
Oracle block sizes of 8 KB or more are standard practice and are the preferred block sizes for most
environments. However, note that when using an Oracle block size of 4 KB or less,
direct I/O provides added benefits and is significantly more efficient than cached I/O.
Using Concurrent I/O
Concurrent I/O also bypasses the buffer/file cache and provides the added benefit of eliminating any
JFS inode lock contention. Concurrent I/O can be enabled with the following mount options:
delaylog, cio
Beginning with VxFS 5.0.1 on HP-UX 11i v3, the concurrent I/O feature is available with the
OnlineJFS license. Prior to VxFS 5.0.1, concurrent I/O was available with the HP Serviceguard
Storage Management Suite bundle. When available, concurrent I/O is recommended over direct I/O
for Oracle tablespace filesystems for better performance.
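For example, a tablespace filesystem could be mounted for concurrent I/O with a command along these lines (the device and mount point names are hypothetical):

```
mount -F vxfs -o delaylog,cio /dev/vg01/lvol_data /u01/oradata
```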
Oracle 10g and 11g environments
To use direct I/O in an Oracle 10g or 11g environment, allow the value of the Oracle
db_file_multiblock_read_count parameter to remain at default, which results in 1 MB reads that have
no impact on Oracle optimizer logic.
Sequential I/O penalty with readv
When Oracle issues a scattered read through the readv() system call with db_file_multiblock_read_count
set to 8 and an 8 KB Oracle block size, the result would yield eight physical I/Os of 8 KB each,
performed one I/O at a time, rather than a single 64 KB physical I/O.
Note
If the filesystem was mounted with mincache=direct,convosync=direct, VxFS
would perform a separate physical I/O for each vector specified by
readv().
Oracle 8.x uses the readv system call for full table scans or scattered reads.
Oracle 9.x uses the readv system call for full table scans if the db_file_multiblock_read_count is 16
or less. If the db_file_multiblock_read_count is greater than 16, then Oracle will use a single read
system call with a larger transfer size resulting in more efficient direct I/O. However, increasing the
db_file_multiblock_read_count may alter the behavior of the Oracle optimizer.
On Oracle 10g and 11g, the db_file_multiblock_read_count should be allowed to default. This will
provide for 1 MB multi-block reads without impacting the Oracle optimizer value.
However, if the convosync=direct mount option is not used, readv passes requests through the HP-UX
buffer cache, allowing VxFS to coalesce the eight vectors using fewer physical I/Os and issue the
physical I/Os in parallel.
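The arithmetic behind this penalty can be sketched as follows (the values are the example figures from the text, not measurements):

```python
# Illustrative arithmetic for the readv() penalty described above.
# Assumed example values: db_file_multiblock_read_count = 8 and an
# 8 KB Oracle block size.
block_size_kb = 8
multiblock_read_count = 8

# With convosync=direct, VxFS performs one physical I/O per readv() vector,
# one at a time: eight serial 8 KB I/Os.
direct_physical_ios = multiblock_read_count

# With buffered I/O, VxFS can coalesce the vectors, ideally issuing a
# single 64 KB physical transfer.
coalesced_transfer_kb = block_size_kb * multiblock_read_count

print(f"{direct_physical_ios} x {block_size_kb} KB serial I/Os "
      f"vs one {coalesced_transfer_kb} KB I/O")
```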
Read-ahead
An added benefit of using buffer cache is the ability of VxFS to read ahead. After recognizing two
adjacent reads, such as a table scan, VxFS can initiate a read-ahead (256 KB by default), further
enhancing table scan performance.
Note that, in certain circumstances, a read-ahead may be wasteful. VxFS interprets two adjacent
single-block reads as a sequential I/O pattern and, as a result, generates 256 KB of read-ahead;
however, since this pattern is often random, the read-ahead data may not be used.
VxFS can modify the amount of read-ahead based on the number of stripes (columns) and the stripe
size. Since this may result in extremely large read-aheads, you should tune the read-ahead size
back to the recommended default values, which provide balanced performance in most Oracle environments.
Read-ahead size
To check read-ahead size, use the vxtunefs command on a mount point. Perform the following
calculation:
Read-ahead size = read_pref_io * read_nstream * 4
Values recommended for balanced performance in most Oracle environments are:
read_pref_io = 65536
read_nstream = 1
These values are the defaults when using LVM. When using VxVM, the default values
reflect the VxVM striping attributes. If the VxVM volume is striped across a large number of disks with
a large stripe size, the read-ahead size will be too large. Be sure to use vxtunefs to check the values
and tune them as mentioned above. While they can be changed interactively through vxtunefs, you
should enter these values in /etc/vx/tunefstab to make them persistent after a reboot.
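Plugging the recommended values into the formula above confirms the default read-ahead size:

```python
# Read-ahead size formula from the paper:
#   read-ahead size = read_pref_io * read_nstream * 4
read_pref_io = 65536   # bytes; recommended value (LVM default)
read_nstream = 1       # recommended value (LVM default)

read_ahead_bytes = read_pref_io * read_nstream * 4
print(read_ahead_bytes // 1024, "KB")  # 256 KB, the default read-ahead size
```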
Higher Oracle multi-block read count values
With any version of Oracle Database, if db_file_multiblock_read_count is set to a value higher than
16, Oracle reverts to the read system call. Thus, with these higher multi-block read count values, the
use of the convosync=direct mount option does not result in the sequential I/O penalty inherent in
readv; conversely, however, the benefits of VxFS read-ahead cannot be achieved.
Higher multi-block read count values may be appropriate in some large data warehouse
environments.
Summary
This white paper provides detailed guidelines for specifying mount options for HP-UX VxFS filesystems
in an Oracle Database environment in order to optimize performance. HP recommends upgrading to the
latest versions of HP-UX and VxFS to take advantage of performance improvements. However, we
recognize that many users may not be running the most current versions of HP-UX, VxFS, and Oracle
Database; recommendations were therefore provided for configuring VxFS mount options across these
different configurations. OnlineJFS can significantly optimize the interaction between HP-UX and
Oracle by supporting direct I/O communications that bypass the HP-UX buffer cache. With VxFS 5.0.1 or
higher, the concurrent I/O feature is available with the OnlineJFS license and provides the added
benefit of eliminating JFS inode lock contention.
For more information
http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02220689/c02220689.pdf
http://bizsupport1.austin.hp.com/bc/docs/support/SupportManual/c02627959/c02627959.pdf
http://bizsupport1.austin.hp.com/bc/docs/support/SupportManual/c01915880/c01915880.pdf
http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01919408/c01919408.pdf
http://h20195.www2.hp.com/V2/GetDocument.aspx?docname=4AA1-5719ENW&cc=us&lc=en
To learn how you can make informed decisions when choosing an I/O subsystem configuration,
please visit: http://h71028.www7.hp.com/enterprise/w1/en/os/hpux11i-fsvm-learn-more.html
Copyright 2008 - 2011 Hewlett-Packard Development Company, L.P. The information contained herein is
subject to change without notice. The only warranties for HP products and services are set forth in the express
warranty statements accompanying such products and services. Nothing herein should be construed as
constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions
contained herein.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
4AA1-9839ENW, Created May 2008; Updated March 2011, Rev. 3