Oracle10g Log Buffer Tunning Tips

Download as doc, pdf, or txt
Download as doc, pdf, or txt
You are on page 1of 7

Oracle log_buffer sizing tips

Donald K. Burleson

Overview of redo log tuning

Important note for Oracle 10gr2 and beyond: Per Metalink note 351857.1, starting in
release 10.2 and beyond, Oracle will automatically size the log_buffer on your behalf and
log_buffer cannot be changed dynamically. The automatic log_buffer sizing is based on
the granule size (as determined by to _ksmg_granule_size):

select
a.ksppinm name,
b.ksppstvl value,
a.ksppdesc description
from
x$ksppi a,
x$ksppcv b
where
a.indx = b.indx
and
a.ksppinm = '_ksmg_granule_size';

NAME VALUE DESCRIPTION


------------------------------ --------------------
--------------------
_ksmg_granule_size 16777216 granule size
in bytes

Also note that if you are Oracle's Automatic Memory Management AMM (AMM is Not
recommended for some databases), the log_buffer is part of the memory_target algorithm.

Tuning the redo log in Oracle

The steps for tuning redo log performance are straightforward:


1 - Determine the optimal sizing of the log_buffer.

2 - Size online redo logs to control the frequency of log switches and minimize system
waits.

3 - Optimize the redo log disk to prevent bottlenecks. In high-update databases, no


amount of disk tuning may relieve redo log bottlenecks, because Oracle must push all
updates, for all disks, into a single redo location.
Once you have optimized your redo and I/O sub-system, you have few options to relieve
redo-induced contention. This can be overcome by employing super-fast solid-state disk
for your online redo log files, since SSD has far greater bandwidth than platter disk. For
complete details on Oracle redo tuning and redo diagnostic scripts, see my book "Oracle
Tuning: The Definitive Reference".

Optimizing the log_buffer region

The log_buffer is one of the most complex of the Oracle RAM region parameters to
optimize, but it's a low-resource parameter (only using a few meg of RAM), so the goal
in sizing log_buffer is to set a value that results in the least overall amount of log-related
wait events.

The big issue with the log buffer is determining the optimal sizing for the log_buffer in a
busy, high-DML database. Common wait events related to a too-small log_buffer size
include high "redo log space requests" and a too-large log_buffer may result in high "log
file sync" waits.

For more details on log_buffer sizing, see Bug 4930608 and MetaLink Note 604351.1.

Per MetaLink MetaLink note 216205.1 Database Initialization Parameters for Oracle
Applications 11i, recommends a log_buffer size of 10 megabytes for Oracle Applications,
a typical online database:
A value of 10MB for the log buffer is a reasonable value for Oracle Applications and it
represents a balance between concurrent programs and online users.

The value of log_buffer must be a multiple of redo block size, normally 512 bytes.

Obsolete advise - Use a small log_buffer

Even though Oracle has traditionally suggested a log_buffer no greater than one meg, I
have seen numerous shops where increasing log_buffer beyond one meg greatly
improved throughput and relieved undo contention.

The log_buffer should remain small. This is perpetuated with Metalink notes that have
become somewhat obsolete:

“In a busy system, a value 65536 or higher is reasonable [for log_buffer].

“It has been noted previously that values larger than 5M may not make a difference.”

Metalink notes that in 10gr2, we see a bug in log_buffer where a customer cannot reduce
the log_buffer size from 16 meg:

“In 10G R2, Oracle combines fixed SGA area and redo buffer [log buffer] together. If
there is a free space after Oracle puts the combined buffers into a granule, that space is
added to the redo buffer. Thus you see redo buffer has more space as expected. This is an
expected behavior.. .

In 10.2 the log buffer is rounded up to use the rest of the granule. The granule size can
be found from the hidden parameter "_ksmg_granule_size" and in your case is probably
16Mb. The calculation for the granule size is a little convoluted but it depends on the
number of datafiles”

If the log_buffer has been set too high (e.g. greater than 20 meg), causing performance
problems because the writes will be performed synchronously because of the large log
buffer size, evidenced by high log file sync wait. Oracle consultant Steve Adams notes
details on how Oracle processes log file sync waits:

"Before writing a batch of database blocks, DBWn finds the highest high redo block
address that needs to be synced before the batch can be written.

DBWn then takes the redo allocation latch to ensure that the required redo block address
has already been written by LGWR, and if not, it posts LGWR and sleeps on a log file
sync wait."

Detecting an undersized log_buffer


Here is a AWR report showing a database with an undersized log_buffer, in this case
where the DBA did not set the log_buffer parameter in their init.ora file:
Avg
Total Wait wait Waits
Event Waits Timeouts Time (s) (ms) /txn
---------------------------- ------------ ---------- ---------- ------ --------
log file sequential read 4,275 0 229 54 0.0
log buffer space 12 0 3 235 0.0

Top 5 Timed Events


~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time 163,182 88.23
db file sequential read 1,541,854 8,551 4.62
log file sync 1,824,469 8,402 4.54
log file parallel write 1,810,628 2,413 1.30
SQL*Net more data to client 15,421,202 687 .37

It's important to note that log buffer shortages do not always manifest in the top-5 timed
events, especially if their are other SGA pool shortages. Here is an example of an Oracle
10g database with an undersized log buffer, in this example 512k (This is the database as
I found it, and there was a serious data buffer shortage causing excessive disk I/O):
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) DB Time Wait Class
------------------------------ ------------ ----------- --------- -----------
log file parallel write 9,670 291 55.67 System I/O
log file sync 9,293 278 53.12 Commit
CPU time 225 43.12
db file parallel write 4,922 201 38.53 System I/O
control file parallel write 1,282 65 12.42 System I/O

Log buffer related parameter issues

In addition to re-sizing log_buffer, you can also adjust the hidden Oracle10g parameter
_log_io_size (but only at the direction of Oracle technical support) and adjust your
transactions_per_rollback_segment parameters. In 10g, the _log_io_size parameter
govern the offload threshold and it defaults to log_buffer/3.

The transactions_per_rollback_segment parameter specifies the number of concurrent


transactions you expect each rollback segment to have to handle. The Oracle 10g
documentation notes:

"TRANSACTIONS_PER_ROLLBACK_SEGMENT specifies the number of concurrent


transactions you expect each rollback segment to have to handle. The minimum number
of rollback segments acquired at startup is TRANSACTIONS divided by the value for this
parameter.

For example, if TRANSACTIONS is 101 and this parameter is 10, then the minimum
number of rollback segments acquired would be the ratio 101/10, rounded up to 11."
At startup time, Oracle divides transactions by transactions_per_rollback_segment to
have enough rollback space, and Oracle guru Osamu Kobayashi has a great test of
transactions_per_rollback_segment:

"I set transactions_per_rollback_segment to 4. The result of 2-4. in fact suggests that


twenty-one transactions are specified. Twenty-one transactions also match the number of
transactions_per_rollback_segment that can be specified at the maximum.

As far as I analyze, one rollback segment can handle up to twenty-one transactions at a


time, regardless of transactions_per_rollback_segment value.
This transactions_per_rollback_segment parameter therefore is used to determine the
number of public rollback segments to start."

CPU's, log_buffer sizing and multiple log writer processes

The number of CPUs is also indirectly related to the value of log_buffer, and Metalink
discusses multiple LGWR slaves that are used to asynchronously offload the redo
information.

The hidden parameter _lgwr_io_slaves appear to govern the appearance of multiple log
writer slaves, and the Metalink note that clearly states that multiple LGWR processes will
only appear under high activity. The Oracle docs are very clear on this:

“Prior to Oracle8i you could configure multiple log writers using the
LGWR_IO_SLAVES parameter.”

In Oracle10g it becomes a hidden parameter (_lgwr_io_slaves). Metalink note 109582.1


says that log I/O factotum processes started way-back in Oracle8 and that they will only
appear as DML activity increases:

“Starting with Oracle8, I/O slaves are provided. These slaves can perform asynchronous
I/O even if the underlying OS does not support Asynchronous I/O. These slaves can be
deployed by DBWR, LGWR, ARCH and the process doing Backup. . .

In Oracle8i, the DBWR_IO_SLAVES parameter determines the number of IO slaves for


LGWR and ARCH. . .

As there may not be substantial log writing taking place, only one LGWR IO slave has
been started initially. This may change when the activity increases.”

The Oracle8 docs note that the value for the parameter log_simultaneous_copies is
dependent on the number of CPU’s on the server:

“On multiple-CPU computers, multiple redo copy latches allow multiple processes to
copy entries to the redo log buffer concurrently. The default value of
LOG_SIMULTANEOUS_COPIES is the number of CPUs available to your Oracle
instance”

Starting in Oracle8i, it’s a hidden parameter (_log_simultaneous_copies). From Metalink


note 147471.1 “Tuning the Redo log Buffer Cache and Resolving Redo Latch
Contention”, we see that the default is set to cpu_count * 2. Also, it notes that multiple
redo allocation latches become possible by setting the parm _log_parallelism, and that
the log buffer is split in multiple log_parallelism areas that each have a size equal to the
log_buffer. Further, Metalink discusses the relationship of log_buffer to the number of
CPU’s:

“The number of redo allocation latches is determined by init.ora LOG_PARALLELISM.


The redo allocation latch allocates space in the log buffer cache for each transaction
entry.

If transactions are small, or if there is only one CPU on the server, then the redo
allocation latch also copies the transaction data into the log buffer cache.”

We also see that log file parallel writes are related to the number of CPU’s. Metalink
note 34583.1 “WAITEVENT: "log file parallel write" Reference Note”, shows that the
log_buffer size is related to parallel writes (i.e. the number of CPU’s), and discusses how
LGWR must wait until all parallel writes are complete. It notes that solutions to high
“log file parallel write” waits are directly related to I/O speed, recommending that redo
log members be on high-speed disk, and that redo logs be segregated onto

“on disks with little/no IO activity from other sources.

(including low activity from other sources against the same disk controller)”.

This is a strong argument for using super-fast solid-state disk.

Here are some great tips by Steve Adams for sizing your log_buffer:

"If the log buffer is too small, then log buffer space waits will be seen during bursts of
redo generation. LGWR may not begin to write redo until the _log_io_size threshold (by
default, 1/3 of the log buffer or 1M whichever is less) has been exceeded, and the
remainder of the log buffer may be filled before LGWR can complete its writes and free
some space in the log buffer.

Ideally, the log buffer should be large enough to cope with all bursts of redo generation,
without any log buffer space waits.

Commonly, the most severe bursts of redo generation occur immediately after a log
switch, when redo generation has been disabled for some time, and there is a backlog of
demand for log buffer space"

You might also like