QV4311 Student Guide
AIX Performance
Management I: Concepts and
Tools
Student Notebook
ERC 1.1
The information contained in this document has not been submitted to any formal IBM test and is distributed on an “as is” basis without
any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer
responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While
each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will
result elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.
Contents
Course Description ........................................ vii
Agenda ..................................................... ix
Audience
This course is intended for AIX technical support personnel,
performance benchmark personnel, and AIX system administrators.
Prerequisites
Students attending this course are expected to have AIX problem
determination skills. These skills can be obtained by attending the
following courses:
• AHQV011 - AIX Problem Determination I: Boot Issues
• AHQV012 - AIX Problem Determination II: LVM Issues
AHQV332 - POWER6 LPAR Configuration and Operations is
recommended
Objectives
On completion of this course, students should be able to:
- Define performance terminology
- Describe the methodology for tuning a system
- Identify the AIX tools to monitor and analyze an AIX system
- Use AIX tools to determine bottlenecks related to Central
Processing Unit (CPU), Virtual Memory Manager (VMM),
physical and logical I/O, and file systems
- Use AIX tools to demonstrate techniques to tune the
subsystems
Agenda
Unit 1 - Data Collection and Analysis
Exercise 1 - Data Collection and Analysis
Text highlighting
The following text highlighting conventions are used throughout this book:
Bold Identifies file names, file paths, directories, user names,
principals, menu paths and menu selections. Also identifies
graphical objects such as buttons, labels and icons that the
user selects.
Italics Identifies links to web sites, publication titles, is used where the
word or phrase is meant to stand out from the surrounding text,
and identifies parameters whose actual names or values are to
be supplied by the user.
Monospace Identifies attributes, variables, file listings, SMIT menus, code
examples and command output that you would see displayed
on a terminal, and messages from the system.
Monospace bold Identifies commands, subroutines, daemons, and text the user
would type.
References
SC23-5253 AIX Performance Management
SC23-5254 AIX Performance Tools Guide and Reference
AIX Commands Reference, Volumes 1-6
SG24-6478 AIX Practical Performance Tools and Tuning Guide
(Redbook)
© Copyright IBM Corp. 2009 Unit 1. Data Collection and Analysis 1-1
Course materials may not be reproduced in whole or in part
without the prior written permission of IBM.
Unit Objectives
Notes:
Introduction
The objectives in the visual above state what you should be able to do at the end of this
unit.
What Exactly is Performance?
• Performance is the major factor on which the productivity of a
system depends
• Performance is dependent on a combination of:
– Throughput
– Response time
• Acceptable performance is based on expectations:
– Expectations are the basis for quantitative performance goals
[Figure: a typical daily load curve from 7 a.m. to 6 p.m., showing the morning crunch, the lunch dip, the 4 o'clock panic, and the 5 o'clock cliff]
Notes:
Throughput is a measure of the amount of work over a period of time. Examples include
database transactions per minute or kilobytes of a file transferred per second.
Response time is the elapsed time between when a request is submitted to when the
response from that request is returned. Examples include how long a database query
takes or how long it takes to access a web page.
Throughput and response time are related. Sometimes you can have higher throughput
at the cost of response time or better response time at the cost of throughput. So,
acceptable performance is based on reasonable throughput combined with reasonable
response time. Sometimes a decision has to be made as to which is more important:
throughput or response time.
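To make the two metrics concrete, here is a small shell sketch. The transaction counts are invented illustration values, not from any measured workload:

```shell
# Hypothetical workload: 1200 database transactions completed in a
# 60-second measurement window, accumulating 90 seconds of total
# request wait time across all transactions.
txns=1200
window=60
total_wait=90

# Throughput: amount of work per unit time (transactions per second)
tps=$(awk -v t="$txns" -v s="$window" 'BEGIN { printf "%.1f", t / s }')

# Average response time: elapsed time per request (seconds)
avg_rt=$(awk -v w="$total_wait" -v t="$txns" 'BEGIN { printf "%.3f", w / t }')

echo "throughput:        $tps transactions/second"
echo "avg response time: $avg_rt seconds/transaction"
```

Pushing more transactions through the same window tends to increase the total wait time per request, which is exactly the throughput versus response time trade-off described above.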
Notes:
The performance of a computer system depends on four main components: CPU,
Memory, I/O, and Network.
Both hardware and software contribute to the entire system performance. You should
not depend on very fast hardware as the sole contributor of system performance. Very
efficient software on average hardware can cause a system to perform much better
(and probably be less costly) than poor software on very fast hardware.
Performance Metrics and Baseline
• Performance is measured through analysis tools
• Metrics that are measured include:
– CPU utilization
– Memory utilization and paging
– Disk I/O
– Network I/O
• Each metric can be subdivided into finer details
• Create a baseline measurement to compare against in
the future
Notes:
CPU utilization can be split into %user, %system, %idle, and %IOwait. Other CPU
metrics can include the length of the run queues, process/thread dispatches, interrupts,
and lock contention statistics.
Memory metrics include virtual memory paging statistics, file paging statistics, and
cache and TLB miss rates.
Disk metrics include disk throughput (kilobytes read/written), disk transactions
(transactions per second), disk adapter statistics, disk queues (if the device driver and
tools support them), and elapsed time caused by various disk latencies. The type of
disk access, random versus sequential, can also have a big impact on response times.
Network metrics include network adapter throughput, protocol statistics, transmission
statistics, network memory utilization, and much more.
You should create a baseline measurement when your system is running well and
under a normal load. This will give you a guideline to compare against when your
system seems to have performance problems.
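A minimal baseline-capture sketch in shell, assuming the monitoring commands named above are on the PATH (each one is skipped gracefully where unavailable); the directory and file names are this example's own invention:

```shell
#!/bin/sh
# Capture a baseline while the system is running well under a
# normal load, for comparison when problems are suspected later.
BASEDIR=/tmp/baseline.$(date +%Y%m%d)
mkdir -p "$BASEDIR"

# Sample each subsystem: CPU/memory (vmstat), disk I/O (iostat),
# CPU by processor (sar): 2-second interval, 2 samples each.
for cmd in "vmstat 2 2" "iostat 2 2" "sar 2 2"; do
    name=${cmd%% *}                      # command name becomes the file name
    $cmd > "$BASEDIR/$name.base" 2>/dev/null || \
        echo "$name not available" > "$BASEDIR/$name.base"
done
ls "$BASEDIR"
```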
Bottlenecks
Notes:
Because server performance is distributed across each server component and type of resource, it is essential to identify the most important factors, or bottlenecks, that affect the performance of a particular activity. Detecting a bottleneck within a server system depends on a range of factors, such as those shown in the visual.
A bottleneck is a particular performance issue that throttles the throughput of the system. It could be in any of the subsystems: CPU, memory, or I/O (including network I/O). The graphic in the visual above illustrates that there may be several performance bottlenecks on a system, and that some may not be discovered until other, more constraining, bottlenecks are discovered and solved.
Determine the Type of the Problem
• Determine the type of the problem:
– Is it a functional problem or purely a performance problem?
– Is it a trend or a sudden issue?
– Is the problem only at certain times?
• What do you do when someone reports a performance
problem?
– Know the nature of the problem
– Gather data and compare against the baseline
• Use AIX tools
• Use PerfPMR
• Document statistics regularly to spot trends for capacity
planning
• Document statistics during high workloads
Notes:
A functional problem is when the application, hardware or network is not behaving
properly. A performance problem is when the functions are being achieved but the
performance is slow. Sometimes functional problems lead to performance problems. In
these cases, rather than tune the system, it is more important to determine the root
cause of the problem and fix it.
It is quite common for support personnel to receive a problem report that says only that someone has a performance problem on the system, accompanied by some data to analyze. That little information is not enough to accurately determine the nature of a performance problem.
Notes:
There are many trade-offs related to performance tuning that should be considered.
The key is to ensure there is a balance between them.
The trade-offs are:
- Cost versus performance
In some situations, the only way to improve performance is by using more or faster
hardware. But, ask the question “Does the additional cost result in a proportional
increase in performance?”
- Conflicting performance requirements
If there is more than one application running simultaneously, there may be
conflicting performance requirements.
- Speed versus functionality
Resources may be increased to improve a particular area, but serve as an overall
detriment to the system. Also, you may need to make choices when configuring your
system for speed versus maximum scalability.
Performance Analysis Tools
CPU: vmstat, iostat, ps, sar, tprof, gprof, prof, time, timex, topas, trace, trcrpt, curt, splat, truss, cpupstat, lparstat, mpstat, smtctl
Memory: vmstat, lsps, svmon, filemon, topas, trace, trcrpt, truss, lparstat
System I/O subsystem: iostat, vmstat, lsps, lsattr, lsdev, lspv, lslv, lsvg, fileplace, filemon, lvmstat, topas, trace, trcrpt, truss, sar
Notes:
Notes:
It is important to collect a variety of data that show statistics regarding the various
system components. In order to make this easy, a set of tools supplied in a package
called PerfPMR is available on a public ftp site. The following URL can be used to
download your version using a web browser:
ftp://ftp.software.ibm.com/aix/tools/perftools/perfpmr
The goal is to collect a good base of information that can be used by AIX technical
support specialists or development lab programmers to get started in analyzing and
solving the performance problem. This process may need to be repeated after analysis
of the initial set of data is completed.
Capturing Data with PerfPMR
• Create a directory to collect the PerfPMR data
• Run perfpmr.sh 600 to collect the standard data
• perfpmr.sh will collect information by:
– Running trace for 5 seconds
– Gathering 600 seconds of general system performance data
– Collecting hardware and software configuration information and putting it into a file named config.sum
– Attempting to collect additional data by:
• Running iptrace for 10 seconds
• Running tcpdump for 10 seconds
• Running filemon for 60 seconds
• Running tprof for 60 seconds
• Answer the questions in PROBLEM.INFO
UNIX Software Service Enablement © Copyright IBM Corporation 2009
Notes:
Create a data collection directory and cd into this directory. Allow at least
12 MB/processor of unused space in whatever file system is used.
If there is not enough space in the file system, perfpmr.sh will print a message similar
to:
perfpmr.sh: There may not be enough space in this filesystem
perfpmr.sh: Make sure there is at least 44 Mbytes
To run PerfPMR, type in the command perfpmr.sh. One of the scripts perfpmr.sh
calls is monitor.sh. monitor.sh calls several scripts to run performance monitoring
commands. By default, each of these performance monitoring commands called by
monitor.sh will collect data for 10 minutes (600 seconds). This default time can be
changed by specifying the number of seconds to run as the first parameter to
perfpmr.sh. You can also run the PerfPMR scripts individually.
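The capture procedure just described could be scripted as below. This is a guarded sketch: perfpmr.sh is the AIX PerfPMR script, so on a system without it the sketch simply reports that fact; the directory name is arbitrary.

```shell
#!/bin/sh
# Create a data collection directory (allow at least 12 MB per
# processor of free space) and run the standard 600-second capture.
DATADIR=/tmp/perfdata.$(date +%Y%m%d)
mkdir -p "$DATADIR"
cd "$DATADIR" || exit 1

if command -v perfpmr.sh >/dev/null 2>&1; then
    perfpmr.sh 600                 # 600 seconds of general system data
    status="collection complete in $DATADIR"
else
    status="perfpmr.sh not installed on this system"
fi
echo "$status"
```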
Notes:
PerfPMR collects its data into many different files. The types of files created are listed
on the visual.
The .int data is most useful for metrics analysis. The .sum data is most useful for overall or configuration-type data. The .before and .after data are metrics captured before the testcase begins and at the end of the test interval. These are good for determining starting and delta values for what occurred over the life of the test interval.
monitor.sh
• The monitor.sh script invokes commands and scripts to
collect performance data
Notes:
The perfpmr.sh script calls the monitor.sh script. The monitor.sh script invokes
commands and other scripts to gather performance data. The monitor.sh script can be
run by itself.
The monitor.sh script captures before and after data by invoking the following
commands and scripts: lsps -a, lsps -s, vmstat -i, vmstat -v, and svmon.sh. The
svmon command captures and analyzes a snapshot of virtual memory. The svmon
commands that the svmon.sh script invokes are svmon -G, svmon -Pns, and svmon -S.
The monitor.sh script invokes the following scripts to monitor system data for the
amount of time given in the perfpmr.sh or monitor.sh command: nfsstat.sh (unless
the -n flag is used), netstat.sh (unless the -n flag is used), ps.sh, vmstat.sh,
emstat.sh (unless the -e flag is used), mpstat.sh (unless the -m flag is used),
lparstat.sh (unless the -l flag is used), sar.sh, iostat.sh, and pprof.sh.
Notes:
The PerfPMR package is distributed as a compressed tar file. Obtain the latest version
of PerfPMR from the website ftp://ftp.software.ibm.com/aix/tools/perftools/perfpmr.
When you install PerfPMR, a link will be created in /usr/bin to the perfpmr.sh script.
The PerfPMR process is described in a README file provided in the PerfPMR
package.
PerfPMR should be installed when the system is initially set up and tuned. Then, you
can get a baseline measurement from all the performance tools. When you suspect a
performance problem, PerfPMR can be run again and the results compared with the
baseline measurement.
It is also recommended that you run PerfPMR before and after hardware and software changes. If your system is performing fine, and you then upgrade it and begin to have problems, it is difficult to identify the problem without a baseline to compare against.
The topas Command
Topas Monitor for host: woolf222 EVENTS/QUEUES FILE/TTY
Tue Feb 3 19:43:13 2009 Interval: 10 Cswitch 165 Readch 1373
Syscall 949.2K Writech 335
CPU User% Kern% Wait% Idle% Reads 949.3K Rawin 0
ALL 22.4 77.6 0.0 0.0 Writes 0 Ttyout 64
Forks 0 Igets 0
Network KBPS I-Pack O-Pack KB-In KB-Out Execs 0 Namei 5
Total 0.2 1.0 0.4 0.1 0.1 Runqueue 1.2 Dirblk 0
Waitqueue 0.0
Disk Busy% KBPS TPS KB-Read KB-Writ MEMORY
Total 0.0 0.0 0.0 0.0 0.0 PAGING Real,MB 1024
Faults 17 % Comp 68.8
FileSystem KBPS TPS KB-Read KB-Writ Steals 0 % Noncomp 11.1
Total 1.1 1.0 1.1 0.0 PgspIn 0 % Client 11.1
PgspOut 0
Name PID CPU% PgSp Owner PageIn 0 PAGING SPACE
cpuprog 503892 99.7 0.1 root PageOut 0 Size,MB 512
getty 213180 0.1 0.5 root Sios 0 % Used 1.1
topas 262368 0.0 1.3 root % Free 99.9
java 204836 0.0 70.0 pconsole NFS (calls/sec)
gil 57372 0.0 0.9 root SerV2 0 WPAR Activ 0
java 114916 0.0 37.9 root CliV2 0 WPAR Total 0
rpc.lock 81986 0.0 1.2 root SerV3 0 Press: "h"-help
ksh 290824 0.0 0.5 root CliV3 0 "q"-quit
rmcd 266382 0.0 2.5 root
aixmibd 225438 0.0 1.1 root
sendmail 217244 0.0 1.1 root
xmgc 45078 0.0 0.4 root
Notes:
The topas command reports selected statistics about the activity on the local system.
This tool can be used to provide a full screen of a variety of performance statistics.
The topas tool displays a continually changing screen of data rather than a sequence of
interval samples, as displayed by such tools as vmstat and iostat. Therefore, topas is
most useful for online monitoring and the other tools are useful for gathering detailed
performance monitoring statistics for analysis.
If you're running topas in a partition and issue a dynamic LPAR command that changes the system configuration, topas must be stopped and restarted to view accurate data.
The topas command can show many performance statistics at the same time. The
output consists of two fixed parts and a variable section.
Notes:
Like topas, the nmon tool is helpful in presenting important performance tuning
information on one screen and dynamically updating it.
Another tool, the nmon_analyser, takes files produced by nmon and turns them into
spreadsheets containing high quality graphs ready to cut and paste into performance
reports.
The nmon and nmon_analyser tools come with AIX 5.3 TL09, AIX 6.1 TL02, and Virtual I/O Server (VIOS) 2.1, and are installed by default.
Exercise 1: Data Collection and Analysis
Notes:
Review Questions (1 of 3)
1. Use these terms with the following statements:
metrics, baseline, performance goals,
throughput, response time
Notes:
Review Questions (2 of 3)
2. The four components of system performance are:
–
–
–
–
3. After tuning a resource or system parameter and monitoring the
outcome, what is the next step in the tuning process? _________
____________________________________________________
Notes:
Review Questions (3 of 3)
10. True or False: You can dynamically change the topas and nmon displays.
Notes:
Unit Summary
Notes:
References
SC23-5253 AIX Performance Management
SC23-5254 AIX Performance Tools Guide and Reference
AIX Commands Reference, Volumes 1-6
SG24-6478 AIX Practical Performance Tools and Tuning Guide
(Redbook)
Unit Objectives
After completing this unit, you should be able to:
• Describe the performance tuning process
• List the tools available for tuning
• Find help on the performance tunables
• Define the types of performance tunables
• Describe, display, and change performance tunables
• Describe the error log entry when a restricted tunable is
changed permanently
Notes:
What is Performance Tuning?
Notes:
Performance tuning is one aspect of performance management. The definition of performance tuning sounds simple and straightforward, but it's actually a complex process.
Performance tuning involves managing your resources. Resources could be logical
(queues, buffers, etc.) or physical (real memory, disks, CPUs, network adapters, etc.).
Resource management involves the various tasks listed here. We will examine each of
these tasks later.
Tuning must always be based on performance analysis. While there are
recommendations as to where to look for performance problems, what tools to use, and
what parameters to change, what works on one system may not work on another. So
there is no cookbook approach available for performance tuning that will work for all
systems.
Notes:
The wheel graphic in the visual above represents the phases of a more formal tuning
project. Experiences with tuning may range from the informal to the very formal where
reports and reviews are done prior to changes being made. Even for informal tuning
actions, it is essential to plan, gather data, develop a recommendation, implement, and
document.
Performance Tuning Tools
Notes:
The table in the visual shows the tuning commands that can be used for each
subsystem.
Tuning Commands
• Tunable commands include:
– vmo manages Virtual Memory Manager tunables
– ioo manages I/O tunables
– schedo manages CPU scheduler/dispatcher tunables
– no manages network tunables
– nfso manages NFS tunables
– raso manages reliability, availability, serviceability tunables
• Tunables are the parameters the tuning commands
manipulate
• Tunables can be managed from:
– SMIT
– Web-based System Manager
– Command line
• All tunable commands have the same syntax
Notes:
There are six tunable commands (vmo, ioo, schedo, no, nfso, and raso) that are used
to display and change tuning parameters. These actions can be done through SMIT
panels, Web-based System Manager plug-ins, and the tunable commands.
All six tuning commands (vmo, ioo, schedo, no, nfso and raso) use a common syntax
and are available to directly manipulate the tunable parameter values. Available options
include making permanent changes and displaying detailed help on each of the
parameters that the command manages.
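Because the six commands share one syntax, they can be driven uniformly. The guarded loop below (these are AIX-only commands, so each one is skipped where absent) illustrates the point:

```shell
#!/bin/sh
# The six tuning commands accept the same flags, so a single loop
# can query them all. On a non-AIX system each one is reported absent.
checked=0
for cmd in vmo ioo schedo no nfso raso; do
    if command -v "$cmd" >/dev/null 2>&1; then
        "$cmd" -h 2>/dev/null | head -2    # usage statement for this command
    else
        echo "$cmd: not available on this system"
    fi
    checked=$((checked + 1))
done
echo "commands checked: $checked"
```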
Types of Tunables
• There are two types of tunables (AIX 6.1):
– Restricted Tunables
• Should not be changed unless recommended by AIX
development or development support
• Dynamic change will show a warning message
• Permanent change must be confirmed
• Permanent changes will cause an error log entry at boot
time
– Non-Restricted Tunables
• Can have restricted tunables as dependencies
• Migration from AIX 5.3 to AIX 6.1 will keep the old tunable values
Notes:
Beginning with AIX 6.1, many of the tunables are considered restricted. Restricted
tunables should not be modified unless told to do so by AIX development or support
professionals.
The restricted tunables are not displayed by default.
When migrating to AIX 6.1, the old tunable values will be kept. However, any restricted
tunables that are not at their default AIX 6.1 value will cause an error log entry.
Notes:
Syntax of Tunable Commands
• All tuning commands have the same syntax:
command [ -p | -r ] -D
command [ -p | -r ] [-F] -a
command -h [ Tunable ]
Notes:
The descriptions of the flags are:
Flag Description
-p   Makes the change apply to both current and reboot values
-r   Forces the change to go into effect on the next reboot
-o   Displays or sets individual parameters
-d   Resets an individual tunable to its default value
-D   Resets all tunables to their default values
-F   Forces display of the restricted tunable parameters when the -a, -L, or -x options are specified alone on the command line to list all tunables
-a   Displays all parameters
-h   Displays help information for a tunable
-L   Lists attributes of one or all tunables
-x   Lists characteristics of one or all tunables, one per line, in a spreadsheet-type format
Tunables Documentation
• The -h flag displays information for the tuning commands
vmo, ioo, schedo, raso, no, and nfso:
– command -h displays the usage statement for the command
– command -h <tunable> displays the tunable's purpose,
values (default, range, type, unit), and tuning information
• Beginning with AIX 6.1, none of the AIX manuals or man pages
contain documentation on the performance tunables
Notes:
The -h flag of the tuning commands displays help about the tunable parameter, if one is specified. Otherwise, the command usage statement is displayed.
Prior to AIX 6.1, the performance tunables were described in the documentation and
man pages of the related command (schedo, vmo, ioo, raso, no, and nfso). The
documentation could not keep up with the changes being made to the tunable values
(default and range), or the addition of new tunables.
In AIX 6.1, to keep the tunables information up to date, the tunables descriptions can
only be found from the tuning command itself. System documentation is fairly static, so
it was hard to keep up with the many tunables available, the adjustments to default
tunable values, changes to the tunable value ranges, and new tunables. The tunable
information is now dynamically retrieved from the kernel providing more accurate help.
This ensures a single method to know what functions a command currently has.
Displaying Tunable Values
• The no, nfso, vmo, ioo, raso, and schedo tuning commands all
support the following syntax to display tunables:
– To display a single tunable:
command -o tunable (display current value)
command -L tunable (display tunable attributes)
– To display the current values for all the command's non-restricted
tunables:
command -a
– To display the current values for all the command's tunables:
command -F -a
– To display the tunable attributes for all the command's non-restricted
tunables:
command -L
– To display the tunable attributes for all the command's tunables:
command -F -L
Notes:
The -o flag displays the current value of the given tunable.
The -a flag displays the current, reboot (when used with the -r option), or permanent (when used with the -p option) values for all tunable parameters.
The -F flag forces display of the restricted tunable parameters when the -a, -L, or -x options are specified alone on the command line to list all tunables.
The -L flag lists the attributes of one or all tunables (current, default, boot, minimum, maximum, unit, type, and dependencies).
Notes:
The vmo -a command will display the current value of the VMM tunables.
When the -F flag is not specified, restricted tunables are not displayed unless they are specifically named as a parameter (for example, vmo -o maxclient%).
As shown in the visual, when the -F flag is included with the vmo -a command (vmo -a -F), the non-restricted tunables are displayed first, followed by the restricted tunables. Note that the line ##Restricted tunables is displayed before the restricted tunables are listed.
The restricted tunables will not be shown by default (without the -F flag) regardless of
whether they have been modified or not.
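The difference between the two listings can be seen with a guarded sketch (vmo is AIX-only, so the sketch degrades gracefully on other systems):

```shell
#!/bin/sh
# Compare the default listing (non-restricted tunables only) with
# the forced full listing that includes restricted tunables.
if command -v vmo >/dev/null 2>&1; then
    default_count=$(vmo -a | wc -l)      # non-restricted only
    full_count=$(vmo -F -a | wc -l)      # restricted included
    summary="full listing adds $((full_count - default_count)) restricted lines"
else
    summary="vmo not available on this system"
fi
echo "$summary"
```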
Displaying Attributes of Tunables
# vmo -L
Notes:
The -L option of the tunable commands (vmo, ioo, schedo, no and nfso) can be used to
print out the attributes of a single tunable or all the tunables.
The output of the command with the -L option shows the current value, default value,
value to be set at next reboot, minimum possible value, maximum possible value, unit,
type, and dependencies.
Changing Tunable Values
• To change a tunable value permanently, use the -p flag with the -o flag:
# vmo -p -o minperm%=7
Setting minperm% to 7 in nextboot file
Setting minperm% to 7
• To change a tunable at the next reboot, use the -r flag with the -o flag:
# vmo -r -o minperm%=6
Setting minperm% to 6 in nextboot file
Warning: changes will take effect only at next reboot
• To change all the tunable values to their default values, use the -D flag:
# vmo -D
Notes:
The -o option of the tunable commands (vmo, ioo, schedo, no and nfso) is used to
change a tunable value to the new value specified.
Any change (with -o, -d or -D) to a parameter of type Mount will result in a message
being displayed to warn the user that the change is only effective for future mount
operations.
Any change (with -o, -d or -D flags) to a parameter of type Connect will result in inetd
being restarted, and a message displaying a warning to the user that the change is only
effective for future socket connections.
Any attempt to change (with -o, -d or -D) a parameter of type Bosboot or Reboot
without -r, will result in an error message.
Any attempt to change (with -o, -d or -D but without -r) the current value of a
parameter of type Incremental with a new value smaller than the current value, will
result in an error message.
Changing Restricted Tunable Values
Restricted tunables should NOT be changed without
approval from AIX Development or AIX Support!
• Changing a restricted tunable dynamically
– Warning message is written that states a restricted tunable has
been modified
# vmo -o maxperm%=95
Setting maxperm% to 95
Warning: a restricted tunable has been modified
Notes:
CAUTION!
Restricted tunables should not be modified unless told to do so by AIX development
or support professionals.
The resulting TUNE_RESTRICTED error log entry looks like this:
Description
RESTRICTED TUNABLES MODIFIED AT REBOOT
Probable Causes
SYSTEM TUNING
User Causes
TUNABLE PARAMETER OF TYPE RESTRICTED HAS BEEN MODIFIED
Recommended Actions
REVIEW TUNABLE LISTS IN DETAILED DATA
Detail Data
LIST OF TUNABLE COMMANDS CONTROLLING MODIFIED RESTRICTED TUNABLES AT REBOOT, SEE FILE
/etc/tunables/lastboot.log
vmo
Notes:
When the system is rebooted, any restricted tunables in the /etc/tunables/nextboot file
that were modified from their default values (by using a tuning command specifying the
-r or -p flag) will cause an error log entry with a label of TUNE_RESTRICTED.
The /usr/lib/perf/tunerrlog command creates the TUNE_RESTRICTED error log
entry. The tunerrlog command is a new performance command that is included in the
bos.perf.tune package.
The tunerrlog command is called by /usr/sbin/tunrestore -R (which is in
/etc/inittab).
Tunables Files
• The /etc/tunables directory centralizes the tunable files
Notes:
The parameter values tuned by vmo, schedo, ioo, no, and nfso are stored in files in
/etc/tunables.
Tunables files currently support a stanza for each of the six tunable commands (schedo, vmo, ioo, no, nfso, and raso), plus a special info stanza. The six command stanzas contain the tunable parameters managed by the corresponding command (see the command's man pages for the complete parameter lists).
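Because the stanza format is simple attribute = "value" pairs grouped under a command name, it is easy to inspect with standard tools. The sketch below extracts one value from a sample file whose contents are modeled on the nextboot example in this unit:

```shell
#!/bin/sh
# Build a small sample file in the stanza format, then pull one
# vmo value out of it with awk.
SAMPLE=/tmp/nextboot.sample
cat > "$SAMPLE" <<'EOF'
info:
    AIX_level = "6.1.1.1"
vmo:
    maxperm% = "94"
    minperm% = "6"
schedo:
ioo:
EOF

# The value is the second quote-delimited field on the line
minperm=$(awk -F'"' '/minperm% =/ { print $2 }' "$SAMPLE")
echo "minperm% is set to $minperm"
```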
nextboot File
Example of a nextboot file:
info:
AIX_level = "6.1.1.1"
Kernel_type = "MP64"
Last_validation = "2009-01-18 14:24:43 CST (current, reboot)"
vmo:
maxperm% = "94"
minperm% = "6"
schedo:
ioo:
raso:
Notes:
The nextboot file is automatically applied at boot time and only contains the list of
tunables to change. It does not contain all parameters. The bosboot command also
gets the value of Bosboot type tunables from this file. It contains all tunable settings
made permanent.
lastboot File
# cat lastboot
info:
Logfile_checksum = "1323389206"
Description = "Full set of tunable parameters after last boot"
AIX_level = "6.1.2.1"
Kernel_type = "MP64"
Last_validation = "2009-02-03 20:43:47 CST (current, reboot)"
...
schedo:
...
vmo:
...
minfree = "960" # DEFAULT VALUE
minperm = "14301" # STATIC (never restored)
minperm% = "6"
nokilluid = "0" # DEFAULT VALUE
npskill = "256" # DEFAULT VALUE
...
ioo:
...
raso:
...
no:
...
net_malloc_police = "16384" # RESTRICTED not at default value
...
nfso:
...
Notes:
The lastboot file is automatically generated at boot time. It contains the full set of tunable parameters, with their values as of the beginning of this boot. Default values are marked with # DEFAULT VALUE. Restricted parameters have a blank second column or the phrase # RESTRICTED not at default value (depending on the TL of AIX 6).
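The # DEFAULT VALUE marker makes it easy to list only the tunables that differ from their defaults. The sketch below runs against sample data modeled on the lastboot excerpt above:

```shell
#!/bin/sh
# List the tunables in a lastboot-style file that are NOT at their
# default value (lines lacking the "# DEFAULT VALUE" marker).
LASTBOOT=/tmp/lastboot.sample
cat > "$LASTBOOT" <<'EOF'
    minfree = "960" # DEFAULT VALUE
    minperm% = "6"
    nokilluid = "0" # DEFAULT VALUE
EOF

grep -v '# DEFAULT VALUE' "$LASTBOOT"
changed=$(grep -cv '# DEFAULT VALUE' "$LASTBOOT")
echo "tunables not at default: $changed"
```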
lastboot.log File
# cat /etc/tunables/lastboot.log
Restoring no values
===================
Warning: a restricted tunable has been modified
Setting net_malloc_police to 65536
Notes:
The lastboot.log file should be the only file in /etc/tunables that is not in the stanza format described here. It is automatically generated at boot time and contains the log of the creation of the lastboot file; that is, any parameter change made is logged. Any change that could not be made (possible if the nextboot file was created manually and not validated with tuncheck) is also logged. (tuncheck will be covered soon.)
Managing Tunables Files
• Commands to manipulate the tunables files in /etc/tunables
are:
– tuncheck
Used to validate the parameter values in a file
– tunrestore
Changes tunables based on parameters in a file
– tunsave
Saves tunable values to a stanza file
– tundefault
Resets tunable parameters to their default values
Notes:
tuncheck command
The tuncheck command validates a tunables file. All tunables listed in the specified file
are checked for range and dependencies. If a problem is detected, a warning is issued.
tunrestore command
The tunrestore command is used to change all tunable parameters to values stored in
a specified file.
tunsave command
The tunsave command saves the current state of the tunables parameters in a file.
tundefault command
The tundefault command resets all tunable parameters to their default values. It
launches all the tuning commands (ioo, vmo, schedo, no and nfso) with the -D flag.
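Taken together, these commands form a save, validate, restore cycle. A minimal sketch
(the file name mytunables is a hypothetical example; the tun* commands exist only on
AIX, so the sketch checks for them first):

```shell
#!/bin/sh
# Save the current tunables, validate the file, then restore from it.
# AIX-only commands; the guard lets the sketch degrade on other systems.
if command -v tunsave >/dev/null 2>&1; then
    tunsave -f /etc/tunables/mytunables     # save current tunable values
    tuncheck -f /etc/tunables/mytunables    # check ranges and dependencies
    tunrestore -f /etc/tunables/mytunables  # apply the values from the file
else
    echo "tunables commands not available on this system"
fi
```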
Notes:
Review Questions
1. True or False: In AIX 6.1, help for the performance tunables
is available in the man pages.
2. Which tunable command flag will show the restricted
tunables:
a) -h
b) -s
c) -F
d) -R
3. True or False: A confirmation must always be given when
permanently changing a restricted tunable.
4. True or False: An error log entry is created when a
restricted tunable is changed permanently.
5. True or False: When a system is rebooted, the lastboot file
flags any tunables that have been changed since the
system booted.
UNIX Software Service Enablement © Copyright IBM Corporation 2009
Notes:
Unit Summary
• Tune one tunable, or one set of tunables, at a time and check whether performance improved
• The documentation is only found in the help message of the tuning
commands
• The -h flag displays information for the tuning commands
vmo, ioo, schedo, raso, no, nfso
• Restricted tunables should not be changed unless recommended by AIX
development or development support
• Restricted tunables are not shown by the tuning commands unless the -F flag
is used
• Changing restricted tunables:
– Dynamically: will show a warning message
– Permanently: must be confirmed
• Permanent changes to restricted tunables will cause an error log entry at
boot time
Notes:
Introduction
This unit identifies the tools to help determine CPU bottlenecks. It also
demonstrates techniques to tune CPU-related issues on your system.
Unit Objectives
After completing this unit, you should be able to:
• Describe processes and threads
• Describe how process priorities affect CPU scheduling
• Use the nice and renice commands to change process priorities
• Describe the simultaneous multi-threading concept and its effect
on performance monitoring and tuning
• View logical processors
• Use smtctl to enable/disable simultaneous multi-threading and
view simultaneous multi-threading statistics
• Use the output of the following AIX tools to determine symptoms of
a CPU bottleneck:
- vmstat, sar, ps and topas
References
SC23-5253 AIX Performance Management
SC23-5254 AIX Performance Tools Guide and Reference
AIX Commands Reference, Volumes 1-6
SG24-6478 AIX Practical Performance Tools and Tuning Guide
(Redbook)
SG24-7559 IBM AIX Version 6.1 Differences Guide (Redbook)
Unit Objectives
After completing this unit, you should be able to:
• Describe processes and threads
• Describe how process priorities affect CPU scheduling
• Use the nice and renice commands to change process
priorities
• Describe the simultaneous multi-threading concept and its
effect on performance monitoring and tuning
• View logical processors
• Use smtctl to enable/disable simultaneous multi-threading
and view simultaneous multi-threading statistics
• Use the output of the following AIX tools to determine
symptoms of a CPU bottleneck:
– vmstat, sar, ps and topas
Notes:
Introduction
The objectives in the visual above state what you should be able to do at the end of this
unit.
[Flowchart: High CPU usage? If yes, are the processes supposed to be using that much
CPU? If so, tune the applications / operating system. If usage is low, is the CPU
supposed to be idle?]
Notes:
This flowchart illustrates the CPU-specific monitoring and tuning strategy. If the system
is not meeting the CPU performance goal, you need to find the root cause for why the
CPU subsystem is constrained. It may be simply that the system needs more physical
CPUs, but it could also be because of errant applications or processes gone awry. If the
system is behaving normally but is still showing signs of a CPU bottleneck, tuning
strategies may help to get the most out of the CPU resources.
If you see unusually high CPU usage when monitoring, the next question to ask is,
“What processes are accumulating CPU time?” “Are they supposed to be accumulating
so much CPU time?” If they are, then perhaps there are some tuning strategies you can
use to tune the application or the operating system to make sure that important
processes get the CPU they need to meet the performance goal.
Another scenario is that you’re not meeting performance goals and the CPUs are fairly
idle or not working as much as they should. This points to a bottleneck in another area
of the computer system.
Processes and Threads
[Diagram: a single-threaded process runs its program as one thread (Thread 1) on one
CPU; a multi-threaded process runs Thread 1, Thread 2, and Thread 3 concurrently on
CPU 0, CPU 1, and CPU 2.]
Notes:
Process
A process is the entity that the operating system uses to control the use of system
resources. A process is started by a command, shell program or another process.
Thread
Each process is made up of one or more kernel threads. A thread is a single sequential
flow of control. A single-threaded process can only handle one operation at a time,
sequentially. Multiple threads of control allow an application to overlap operations, such
as reading from a terminal and writing to a file. AIX schedules and dispatches CPU
resources at the thread level. In general, when we refer to threads in this course, we will
be referring to the kernel threads within a process.
[Diagram: process states over time: SIDL during creation; A (active), in which threads
move among the R (ready-to-run), RUNNING, S (sleeping), and T (stopped) states; and
Z (zombie) at exit.]
Notes:
Before a process is created, it needs a slot in the process and thread tables; at this
stage it is in the SNONE state. While a process is undergoing creation, waiting for
resources (memory) to be allocated, it is in the SIDL state (I (idle) state).
When a process is in an A state, one or more of its threads are in the R (ready-to-run)
state. Threads of a process in this state have to contend for the CPU with all other
ready-to-run threads. Only one thread can have the use of the CPU at a time; this is the
running thread for that processor. A thread will be in an S state if a thread is waiting on
an event or I/O. Instead of wasting CPU time, it sleeps and relinquishes control of the
CPU. A thread may be stopped via the SIGSTOP signal, and started again via the
SIGCONT signal; while suspended it is in the T state. This has nothing to do with
performance management.
The Z state: When a process dies (exits) it becomes a zombie.
Run Queues
[Diagram: a global run queue plus a local run queue for each CPU; every run queue
holds prioritized threads in 256 priority-ordered queues, numbered 0 through 255.]
Notes:
There is a run queue structure for each CPU as well as a global run queue. The run
queue is divided further into queues that are priority ordered (one queue per priority
number). The per-CPU run queues are called local run queues. When a thread has
been running on a CPU, it will tend to stay on that CPU’s run queue. If that CPU is busy,
then the thread can be dispatched to another idle CPU and will be assigned to that
CPU’s run queue. When a CPU performs idle load balancing (i.e., a CPU is idle, and
tries to steal work from another CPU), it will steal threads that are less favored, since
the highly favored threads will run soon enough on the busy CPU. If the higher favored
threads were moved, they would suffer cache misses and performance would be worse.
Less favored threads are moved, since even though they will suffer cache misses, they
still end up running sooner than they would have if they'd remained on the busy CPU
run queue.
The dispatcher picks the best priority thread in the run queue when a CPU is available.
When a thread is first created, it is assigned to the global run queue. It stays on that
queue until assigned to a local run queue.
Notes:
A priority is a number assigned to a thread used to determine the order of scheduling
when multiple threads are runnable. A process priority is the most favored priority of any
one of its threads. The initial process/thread priority is inherited from the parent
process.
The kernel maintains a priority value (sometimes termed the scheduling priority) for
each thread. The priority value is a positive integer and varies inversely with the
importance of the associated thread. That is, a smaller priority value indicates a more
important thread. When a CPU is looking for a thread to run, it chooses the
dispatchable thread with the smallest priority value.
A thread can be fixed-priority or nonfixed-priority. The priority value of a fixed-priority
thread is constant, while the priority value of a nonfixed-priority thread can change
depending on its CPU usage.
Real-time thread priorities are lower than 40. Real-time applications should run with a
fixed priority and a numerical value less than 40 so that they are more favored than
other applications.
Changing Priority with nice/renice
• Initial priority of a non-fixed thread is 40 + nice
• Nice has a default value of:
– 20 for a foreground process
– 24 for a ksh background process
• Nice value can range from 0 to 39
– The higher the nice value the lower its priority
– Only root can make its priority more favorable
• Nice value can be set when a process is started by using the nice
command
– Example: Add 10 to the nice value:
# nice -n 10 command
• Nice value can be changed for a running process using the renice
command
– Example: Add 10 to the nice value:
# renice -n 10 -p PID
Notes:
A user can use the nice and renice commands to change the nice value for a process.
The nice value is used by the system to calculate the current priority of a running
process. It is added to the base user priority of 40 for non-fixed priority threads and is
irrelevant for fixed priority threads. The default nice value is 20, which results in a
priority of 60 (the nice value added to the user base priority of 40). Some shells (such
as ksh) automatically add 4 to the default nice value if a process is started in the
background (using &). Only the root user can change the priority to a more favored
priority.
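The arithmetic above can be checked directly. A small sketch of the initial-priority
formula (initial priority = base 40 + nice), using the default foreground and ksh
background nice values:

```shell
#!/bin/sh
# Initial priority of a non-fixed thread: base user priority 40 plus nice.
base=40
fg_nice=20    # default nice value for a foreground process
bg_nice=24    # default nice value for a ksh background process (20 + 4)

echo "foreground initial priority: $((base + fg_nice))"   # 60
echo "background initial priority: $((base + bg_nice))"   # 64
```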
The nice command does not return an error message if you attempt to increase a
command's priority without the appropriate authority. Instead, the command's priority is
not changed, and the system starts the command as it normally would.
The renice command alters the nice value of a specific process, all processes with a
specific user ID, or all processes with a specific group ID.
nice/renice Examples
• nice Examples:
Command Action Relative Priority
nice -10 foo Add 10 to current nice value Lower priority (disfavored)
nice -n 10 foo Add 10 to current nice value Lower priority (disfavored)
nice --10 foo Subtract 10 from current nice value Higher priority (favored)
nice -n -10 foo Subtract 10 from current nice value Higher priority (favored)
• renice Examples:
Command Action Relative Priority
renice 10 -p 563 Add 10 to default nice value Lower priority (disfavored)
renice -n 10 -p 563 Add 10 to current nice value Lower priority (disfavored)
renice -10 -p 563 Subtract 10 from default nice value Higher priority (favored)
renice -n -10 -p 563 Subtract 10 from current nice value Higher priority (favored)
Notes:
The -Increment flag to the nice command is equivalent to the -n Increment flag. Both
forms adjust a command's nice value up or down; you can specify a positive or
negative number. Positive increment values reduce priority (disfavor). Negative
increment values increase priority (favor). Only users with root authority can specify a
negative increment.
With the renice command, the way the increment value is used depends on whether
the -n flag is specified. If -n is specified, then the increment value is added to the
current nice value. If the -n flag is not specified, then the increment value is added to
the default value of 20 to get the effective nice value.
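The two behaviors can be expressed numerically. A sketch, assuming a hypothetical
process whose current nice value is 25 and an increment of 10:

```shell
#!/bin/sh
# renice increment handling:
#   with -n:    new nice = current nice value + increment
#   without -n: new nice = default nice value (20) + increment
current=25     # hypothetical current nice value of the target process
default=20
increment=10

echo "with -n:    new nice = $((current + increment))"   # 35
echo "without -n: new nice = $((default + increment))"   # 30
```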
Viewing Process Priorities
# ps -elk
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
303 A 0 0 0 120 16 -- 15004190 384 - 0:01 swapper
200003 A 0 1 0 0 60 20 10001480 708 - 0:00 init
303 A 0 8196 0 0 255 -- 17006190 384 - 20:31 wait
303 A 0 12294 0 0 17 -- 19008190 448 - 0:00 sched
303 A 0 16392 0 0 16 -- 1b00a190 512 f100080009786c08 - 0:00 lrud
303 A 0 49176 0 0 255 -- 1d02c190 384 - 20:11 wait
303 A 0 53274 0 0 255 -- 1f02e190 384 - 20:35 wait
303 A 0 57372 0 0 255 -- 1030190 384 - 20:14 wait
303 A 0 61470 0 0 36 -- 2033190 448 - 0:00 netm
303 A 0 65568 0 0 37 -- 4035190 960 * - 0:01 gil
303 A 0 69666 0 0 16 -- 9038190 512 3f2af70 - 0:00 wlmsched
40201 A 0 81986 0 0 60 20 170a6190 448 - 0:00 lvmbb
240001 A 0 106618 1 0 60 20 1c14d480 552 * - 0:00 syncd
240001 A 0 180346 151706 0 60 20 f1de480 376 - 0:00 syslogd
240001 A 0 192764 204958 0 60 20 1fb2e480 824 f100070000159c78 pts/2 0:00 ksh
200001 A 0 262372 192764 34 87 24 1cb4d480 92 pts/2 52:37 myprog
200001 A 0 286896 192764 35 87 24 1fb4e480 92 pts/2 52:32 myprog2
200001 A 0 290950 356386 0 60 20 15b64480 732 pts/1 0:00 ps
# ps -L 192764 -l
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
240001 A 0 192764 204958 0 60 20 1fb2e480 824 f100070000159c78 pts/2 0:00 ksh
200001 A 0 262372 192764 40 90 24 1cb4d480 92 pts/2 55:02 myprog
200001 A 0 286896 192764 40 90 24 1fb4e480 92 pts/2 54:55 myprog2
Notes:
The priority values shown on the visual are those of the most favored thread for each
process. To view all processes, run the command: ps -el.
To view the most favored thread for each process including kernel processes, run the
command: ps -elk.
Beginning with AIX 5.3, the -L <PIDlist> option generates a list of descendants of
each PID that has been passed to it in the Pidlist variable.
The priority is listed under the PRI column. If the value under NI is --, this indicates that
it is a fixed priority.
You can use the ps command with the -l flag to view a command's nice value. The nice
value appears under the NI heading in the ps command output. If the nice value in ps is
--, the process is running at a fixed priority.
Another column in the ps output is important, the C (CPU usage) column. This
represents the CPU utilization of all the process’s threads, incremented each time the
system clock ticks and a thread is found to be running.
Notes:
Context Switches
• A context switch is when one thread is taken off a CPU and
another thread is dispatched onto the same CPU
• Context switches are normal for multi-processing systems:
– What is abnormal? Check against baseline
– High context switch rate could be indication of lock contention
• Use vmstat, sar, or topas to see context switches
• Example:
# vmstat 1 5
Notes:
A context switch (also known as process switch or thread switch) is when a thread is
dispatched to a CPU and the previous thread on that CPU was a different thread from
the one currently being dispatched. Context switches occur for various reasons. The
most common reason is where a thread has used up its timeslice or has gone to sleep
waiting on a resource (such as waiting on an I/O to complete or waiting on a lock) and
another thread takes its place.
High context switch rates may be an indication of a resource contention issue such as
application or kernel lock contention.
The rate is given in switches per second. It’s not uncommon to see the context switch
rate be approximately the same as the device interrupt rate (the in column in vmstat).
A context switch occurs when:
- A thread has to wait for a resource (voluntarily)
- A “higher priority” thread wakes up (involuntarily)
- The thread has used up its timeslice (10 ms by default)
• System mode:
– System mode is when the CPU is executing code in the kernel
– CPU time spent in kernel mode is reflected as system time in the
output of commands such as vmstat, topas, iostat, and sar
– Context switch time, system calls, device interrupts, NFS I/O,
and anything else in the kernel is counted as system time
Notes:
User time is simply the percentage of time the CPUs are spending executing code in
the applications or shared libraries. System time is the percentage of time the CPUs
execute kernel code. System time can be because the applications are executing
system calls which enter the applications into the kernel, or can be because there are
kernel threads running that only execute in kernel mode, or can be because interrupt
handler code is currently being run. When using monitoring tools, add up the user and
the system CPU utilization percentage to see the total CPU utilization.
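As a worked example using the vmstat sample shown earlier in this unit (us=28,
sy=72), the total utilization is simply the sum of the two columns:

```shell
#!/bin/sh
# Total CPU utilization is user time plus system time (values taken from
# the vmstat example earlier in this unit).
us=28
sy=72
echo "total CPU utilization: $((us + sy))%"   # 100%
```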
The use of a system call by a user mode process allows a kernel function to be called
from user mode. This is considered a mode switch. Mode switching is when a thread
switches from user mode to kernel or system mode. Switching from user to system
mode and back again is normal for applications. System mode does not just represent
operating system housekeeping functions.
Mode switches should be differentiated between the context switches seen in the output
of vmstat (cs column) and sar (cswch/s).
What is Simultaneous Multi-Threading?
• Two hardware threads can run on one physical processor at the
same time
• One processor appears as two logical processors to the operating
system
• On an LPAR with shared processors, the logical processors will be
twice the number of virtual (not physical) processors
• Simultaneous multi-threading is a means of converting thread-level
parallelism (multiple CPUs) to instruction-level parallelism (same
CPU)
[Diagram: one physical CPU provides two hardware threads (Thread0 and Thread1);
the AIX layer sees them as two logical processors, logical CPU0 and logical CPU1.]
Notes:
Simultaneous multi-threading (SMT) is the ability of a single physical processor to
concurrently execute instructions from more than one hardware thread. There are two
hardware threads per physical processor, so additional instructions can run at the same
time. Since instructions from any of the threads can be fetched by the processor in a
given cycle, the processor is no longer limited by the instruction level parallelism of the
individual threads.
Simultaneous multi-threading also allows instructions from one thread to utilize all the
execution units if the other thread encounters a long latency event. For instance, when
one of the threads has a cache miss, the second thread can continue to execute.
Each hardware thread is supported as a separate logical processor by the operating
system. So, a dedicated partition that is created with one physical processor is
configured by the operating system as a logical two-way when simultaneous
multi-threading is enabled. This is independent of the partition type, so a shared
partition with one virtual processor is configured as a logical two-way. Beginning with
AIX 5.3, SMT is enabled by default on hardware that supports it.
Notes:
Simultaneous multi-threading is a good choice when the overall throughput is more
important than the throughput of an individual thread.
Simultaneous multi-threading is not always advantageous. Any workload where the
majority of individual software threads highly utilize any resource in the processor or
memory will benefit very little from simultaneous multi-threading.
Where simultaneous multi-threading is not beneficial, POWER5 and later systems
support single-threaded execution mode. In this mode, the system gives all the physical
resources to the active thread.
If simultaneous multi-threading is not beneficial, it can be disabled.
The process of putting an active thread into a dormant state is known as snoozing. In
dedicated processor partitions, if there are not enough tasks available to run on both
hardware threads of a processor, the operating system’s idle process will be selected to
run on the idle hardware thread.
Viewing Processor and Attribute Information
• List processors with the lsdev command:
– lsdev lists physical or virtual processors:
# lsdev -Cc processor
proc0 Available 00-00 Processor
proc2 Available 00-02 Processor
Notes:
The lsdev command lists processors that the operating system sees, and their AIX
location codes. When a partition is using dedicated processors, lsdev shows physical
processors. When a partition is using shared processors, lsdev shows virtual
processors.
The lsattr command shows the processor attributes:
- The smt_enabled attribute indicates whether simultaneous multi-threading is
enabled or not.
- The smt_threads attribute shows the number of simultaneous multi-threading
threads per physical (for dedicated processor partitions) or virtual processor (on
shared processor partitions).
The numbers shown by the bindprocessor -q command are the logical CPU numbers
for the AIX instance. These don't necessarily correspond with the processors shown in
the lsdev command output.
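As a sketch, the attributes and logical CPU numbers described above might be queried
as follows (AIX-only commands; bindprocessor is used as the guard because it does
not exist on other systems, and proc0 is assumed to be the first processor reported by
lsdev):

```shell
#!/bin/sh
# Query SMT attributes and logical CPU numbers (AIX-only, guarded).
if command -v bindprocessor >/dev/null 2>&1; then
    lsattr -El proc0 -a smt_enabled -a smt_threads   # SMT attributes
    bindprocessor -q     # logical CPU numbers seen by this AIX instance
else
    echo "bindprocessor not available on this system"
fi
```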
Notes:
Beginning with AIX 5.3, simultaneous multi-threading is enabled by default and
supported by AIX. You may dynamically change the simultaneous multi-threading
setting with the smtctl command or with the SMIT menu subsystem.
The smtctl command provides privileged users and applications the ability to control
utilization of processors with simultaneous multi-threading support. With this command,
you can enable or disable simultaneous multi-threading system-wide, either
immediately or the next time the system boots.
The smtctl command does not rebuild the boot image. If you want your change to
persist across reboots, the bosboot command must be used to rebuild the boot image.
Beginning with AIX 5.3, the boot image has been extended to include an indicator that
controls the default simultaneous multi-threading mode.
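As a hedged sketch of the sequence described above (AIX-only; smtctl -m selects the
mode, -w now applies it immediately while -w boot defers it to the next boot, and
bosboot -a rebuilds the boot image so the setting persists):

```shell
#!/bin/sh
# Change the SMT mode and make it persistent (AIX-only, guarded).
if command -v smtctl >/dev/null 2>&1; then
    smtctl                   # display the current SMT settings
    smtctl -m off -w now     # disable SMT immediately
    smtctl -m on -w boot     # enable SMT at the next boot ...
    bosboot -a               # ... and rebuild the boot image to persist it
else
    echo "smtctl not available on this system"
fi
```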
Viewing smtctl Settings
# smtctl
Notes:
Timing Commands
• Time commands show:
– Elapsed time
– CPU time spent in user mode
– CPU time spent in system mode
# /usr/bin/time <command> <command arguments>
real 9.30
user 3.10
sys 1.20
Notes:
Timing commands
Use the timing commands to understand the performance characteristics of a single
program and its synchronous children. The output from /usr/bin/time and timex is
in seconds. The output of the Korn shell's built-in time command is in minutes and
seconds; the C shell's built-in time command uses yet another format.
Monitoring CPU Usage with vmstat
# vmstat 5 3
System configuration: lcpu=4 mem=1024MB
kthr memory page faults cpu
----- ------------- ---------------------- --------------- ------------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
19 2 127005 758755 0 0 0 0 0 0 1692 10464 1070 48 52 0 0
19 2 127096 758662 0 0 0 0 0 0 1397 71452 1059 28 72 0 0
19 2 127100 758656 0 0 0 0 0 0 1361 72624 1001 28 72 0 0
Notes:
Using vmstat with intervals during the execution of a workload will provide information
on paging space activity, real memory use, and CPU utilization. vmstat data can be
retrieved from the PerfPMR monitor.int file.
# sar -u 2
Notes:
The sar command is the System Activity Report tool and is standard for UNIX systems.
The sar command can collect data in real-time and postprocess the data in real-time or
after the fact. sar data can be retrieved from the PerfPMR monitor.int file.
sar -q reports queue statistics. A blank value in any column indicates that the
associated queue is empty.
The -q option can indicate whether you just have many jobs running (runq-sz) or have
a potential paging bottleneck.
A large number of runnable threads does not necessarily indicate a CPU bottleneck. If
the performance goals are being met and the system is running the threads quickly,
then it doesn’t matter if this number seems high.
The sar -u report in the visual displays the system-wide statistics. The -u flag
information is expressed as percentages so the system-wide information is simply the
average of each individual processor's statistics. Also, the I/O wait state is defined
system-wide and not per processor.
Using the sar -P Command
• Reports system activity information from selected
cumulative activity counters
# sar -P ALL 2 1
Notes:
If the sar -P flag is given, the sar command reports activity which relates to the
specified processor or processors. If -P ALL is given, the sar command reports
statistics for each individual processor, followed by system-wide statistics in the row that
starts with the hyphen.
The visual above shows a system running with the same workload, first with SMT
disabled, then with it enabled. Notice that the logical CPU number doubled with SMT
enabled. Also notice the new statistic of physc or physical CPU consumed with SMT
enabled. This shows how much of a CPU was consumed by the logical processor (the
measurement of fraction of time a logical processor was getting physical processor
cycles).
The example in the visual was created on a partition with dedicated processors. When
the partition has shared processors, an additional column is displayed (%entc). The
%entc column reports the percentage of entitled capacity consumed.
Interval: 2
Logical Partition: Wed Feb 4 16:30:25 2009
Dedicated SMT ON Online Memory: 1024.0
Partition CPU Utilization Online Virtual CPUs: 2 Online Logical CPUs: 4
%user %sys %wait %idle %hypv hcalls
19 81 0 0 0.0 127
===============================================================================
LCPU minpf majpf intr csw icsw runq lpa scalls usr sys _wt idl pc
Cpu0 0 0 105 112 62 0 100 722709 19 81 0 0 0.49
Cpu1 0 0 101 82 55 0 99 731461 19 81 0 0 0.51
Cpu2 0 0 254 102 65 0 100 728143 19 81 0 0 0.50
Cpu3 0 0 102 78 48 0 100 728527 19 81 0 0 0.50
Notes:
The topas output has been modified to show statistics by logical processor.
The visual above shows output from a system with two dedicated processors and
simultaneous multi-threading enabled which is why we see four logical processors.
Using the mpstat Command
• The mpstat command displays performance statistics for logical
processors:
– Shows the distribution of work between logical processors
– Percentage for each logical processor is sum of %user and %sys
# mpstat -s 1 1

System configuration: lcpu=2 ent=0.2 mode=Uncapped

     Proc0
     0.39%
 cpu0      cpu1
0.30%     0.09%
Notes:
If SMT is enabled, the mpstat -s command displays logical processors usage as
shown in the visual above.
With dedicated processors, the two logical processor utilization metrics always add up
to a whole processor. In the dedicated processor example, logical processor cpu0 is
49.72% busy and logical processor cpu1 is 50.22% busy; cpu0 and cpu1 are the
hardware threads of proc0. Logical processor cpu2 is 49.83% busy and logical
processor cpu3 is 50.20% busy; cpu2 and cpu3 are the hardware threads of proc1.
If the partition were using shared processors, the percentages would add up to the
actual overall CPU time consumed in the period, not to the whole number of allocated
processors. With shared processors, you are given an overall percentage for the
processor.
# ps aux
USER PID %CPU %MEM SZ RSS TTY STAT STIME TIME COMMAND
root 262372 9.0 0.0 92 96 pts/2 A 14:58:25 51:00 ./myprog
root 286896 8.9 0.0 92 96 pts/2 A 14:58:15 50:55 ./myprog2
root 376848 8.9 0.0 92 96 pts/2 A 14:58:28 50:47 ./tstcase
root 335904 8.8 0.0 92 96 pts/2 A 14:58:12 50:18 ./statpgm
root 372976 8.4 0.0 92 96 pts/2 A 15:18:38 40:48 ./minep
root 294918 8.2 0.0 92 96 pts/2 A 15:18:33 40:14 ./extst
root 53274 2.8 0.0 384 384 - A 14:16:53 20:35 wait
root 8196 2.8 0.0 384 384 - A 14:16:53 20:31 wait
root 57372 2.8 0.0 384 384 - A 14:16:53 20:14 wait
root 49176 2.7 0.0 384 384 - A 14:16:53 20:11 wait
pconsole 250038 0.0 6.0 52936 52940 - A 14:17:30 0:14 /usr/java5/bin/j
root 311456 0.0 5.0 40216 40220 - A 14:17:24 0:07 /usr/java5/bin/j
Notes:
To locate the processes dominating CPU usage, the ps command is a useful tool.
The ps command, run periodically, will display the CPU time under the TIME column and
the ratio of CPU time to real time under the %CPU column. Keep in mind that the CPU
usage shown is the average CPU utilization of the process since it was first created.
Therefore, if a process consumes 100% of the CPU for five seconds and then sleeps for
the next five seconds, the ps report at the end of ten seconds would report 50% CPU
time. This can be misleading because right now the process is not actually using any
CPU time.
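The averaging can be illustrated with the figures from the paragraph above: a process
that consumed 5 seconds of CPU over a 10-second lifetime reports 50%, even though it
is using no CPU at the moment:

```shell
#!/bin/sh
# ps %CPU is cumulative: CPU time divided by elapsed time since creation.
cpu_seconds=5        # CPU consumed during the first five seconds
elapsed_seconds=10   # the process has existed for ten seconds
echo "reported %CPU: $((cpu_seconds * 100 / elapsed_seconds))"   # 50
```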
CPU Monitoring Strategy Summary
[Flowchart: START: monitor CPU usage with vmstat, sar, topas, and time, and compare
with the goals. If CPU usage is high, locate the dominant processes (ps, topas); if they
are supposed to be using that much CPU, tune the applications / operating system. If
the CPUs are idle when they should not be, determine the cause of the idle time by
tracing and check the memory and disk subsystems.]
Notes:
Tools such as vmstat, sar, and topas help determine whether a system is CPU bound.
If it is determined that a system is CPU bound, then you need to find out which
processes or applications are dominating the CPU usage. This could be accomplished
by running the ps command periodically.
Once the culprit is pinpointed, then it must be determined if the behavior is abnormal
(unexpected application behavior) or not. If not abnormal, a variety of methods can be
used to improve the performance of the application. They include specific coding
techniques, special libraries, and compiler options. Profilers can help you determine
where in the application to concentrate your efforts. It should be emphasized that tuning
is an iterative process, not only at the overall system level, but also at the application
level. Fixing abnormal behavior may involve changes to the application or to how the
application is invoked. Examples of abnormal behavior would be “runaway processes”
where an application is in a loop executing on a non-existent terminal.
And, sometimes the CPUs are idle and they shouldn’t be. Tracing can reveal the reason
why.
Notes:
Review Questions (1 of 2)
1. What is the difference between a process and a thread?
___________________________________________________
___________________________________________________
2. The default scheduling policy is called: _________________
3. The default scheduling policy applies to fixed or non-fixed priorities?
_________________
4. Priority numbers range from ____ to ____.
5. True/False The higher the priority number the more favored the
thread will be for scheduling.
6. List at least two tools to monitor CPU usage:
–
–
Notes:
Review Questions (2 of 2)
7. True or False: All applications will run faster with simultaneous multi-
threading enabled.
Notes:
Unit Summary
Notes:
References
SC23-5253 AIX Performance Management
SC23-5254 AIX Performance Tools Guide and Reference
AIX Commands Reference, Volumes 1-6
SG24-6478 AIX Practical Performance Tools and Tuning Guide
(Redbook)
SG24-7559 IBM AIX Version 6.1 Differences Guide (Redbook)
© Copyright IBM Corp. 2009 Unit 4. Virtual Memory Performance Monitoring 4-1
Course materials may not be reproduced in whole or in part
without the prior written permission of IBM.
Student Notebook
Unit Objectives
Notes:
VMM Terminology
[Diagram: program text and file data (persistent segments) are backed by the JFS
filesystem; working segments, such as shared library data, are backed by paging
space.]
• Computational memory:
– Working segments
– Program text
• Non-computational (file memory):
– Persistent segments
– Client segments
Notes:
The virtual memory system is composed of the real memory plus physical disk space
where portions of a file that are not currently in use are stored.
The pages of a persistent segment have permanent storage locations on disk. Files
containing data or executable programs are mapped to persistent segments.
The client segments are used for all file system file caching except for JFS and GPFS.
(GPFS uses its own mechanism.)
Working segments are transitory and exist only during their use by a process. They
have no permanent disk storage location and are therefore stored on disk paging space
if their page frames are stolen.
Computational memory, also known as computational pages, consists of the pages
that belong to working storage segments or program text (executable files) segments.
File memory, also known as file pages or non-computational memory, consists of the
remaining pages. These are usually pages from permanent data files in persistent
storage (persistent or client segments).
Notes:
The Virtual Memory Manager (VMM) coordinates and manages all the activities
associated with the virtual memory system. It is responsible for allocating real memory
page frames and resolving references to pages that are not currently in real memory.
The VMM maintains a list of unallocated page frames that it uses to satisfy page faults,
called the free list. In most environments, the VMM must occasionally add to the free list
by stealing some page frames owned by running processes. The virtual memory pages
whose page frames are to be reassigned are selected by the VMM’s page stealer. The
VMM thresholds determine the number of frames reassigned.
When a process exits, its working storage is freed up immediately and its associated
memory frames are put back on the free list. However, any files the process may have
opened can stay in memory. When a file system is unmounted, any cached file pages
are freed.
Page stealing occurs when the lrud kernel process selects a currently allocated real
memory page frame to be placed on the free list.
Page Replacement
Initial PFT (excerpt):

Physical   Segment   Ref.   Modified?
Address    Type      Bit
aaa1       W         On     Yes
aaa2       W         Off    Yes
aaa3       W         On     No
aaa4       W         Off    No
bbb1       P         On     Yes
bbb2       P         Off    Yes
bbb3       P         On     No
bbb4       P         Off    No
ccc1       C         On     Yes
ccc2       C         Off    Yes
ccc3       C         On     No
ccc4       C         Off    No

Pages added to the free list: aaa2 and aaa4 (working pages; the modified
aaa2 is first written to paging space), bbb2 and bbb4 (persistent pages;
the modified bbb2 is first written back to the JFS file system), and ccc2
and ccc4 (client pages; the modified ccc2 is first written back to the
JFS2/NFS file system).

Resulting PFT (excerpt):

Physical   Segment   Ref.   Modified?
Address    Type      Bit
aaa1       W         Off    Yes
aaa3       W         Off    No
bbb1       P         Off    Yes
bbb3       P         Off    No
ccc1       C         Off    Yes
ccc3       C         Off    No
Notes:
A process requires real memory pages to execute. When a process references a virtual
memory page that is on disk (because it either has been paged out or has yet to be read
in), the referenced page must be paged in. If the memory is already nearly full, this may
cause one or more pages to be paged out to make room, creating I/O traffic and
delaying the progress of the process.
The VMM uses the page stealer to steal page frames that have not been recently
referenced, and thus would be unlikely to be referenced in the near future. A successful
page stealer allows the operating system to keep enough processes active in memory
to keep the CPU busy.
Pinned page frames or pinned memory are pages that cannot be stolen.
The VMM uses a Page Frame Table (PFT) to keep track of what page frames are in
use. The PFT includes flags to signal which pages have been referenced and which
have been modified. If the page stealer encounters a page that has been referenced,
then it does not steal that page at that time, but instead resets the reference flag for that
page.
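The steal-or-keep decision can be sketched against the example PFT from the visual (a simplified model of the reference-bit logic, not AIX's actual lrud implementation; the page names and values are taken from the example table):

```shell
# Initial PFT from the visual: address, segment type, reference bit, modified?
pft='aaa1 W On Yes
aaa2 W Off Yes
aaa3 W On No
aaa4 W Off No
bbb1 P On Yes
bbb2 P Off Yes
bbb3 P On No
bbb4 P Off No
ccc1 C On Yes
ccc2 C Off Yes
ccc3 C On No
ccc4 C Off No'

# One pass of the stealer: frames whose reference bit is Off are stolen
# (modified ones are written out first); referenced frames survive the
# pass but have their reference bit reset.
stolen=$(echo "$pft" | awk '$3 == "Off" { print $1 }' | xargs)
kept=$(echo "$pft"   | awk '$3 == "On"  { print $1 }' | xargs)

echo "stolen: $stolen"   # aaa2 aaa4 bbb2 bbb4 ccc2 ccc4
echo "kept:   $kept"     # aaa1 aaa3 bbb1 bbb3 ccc1 ccc3
```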
VMM Thresholds (1 of 2)
• The following vmo parameters ensure there are pages on the free list:
– minfree - default 960 pages
– maxfree - default 1088 pages
• The percentage of real memory that can be used by file pages (non-computational segments) is controlled by the following vmo parameters:
– minperm%
• AIX 5.2/5.3 - default 20%
• AIX 6.1 - default 3%
– maxperm%
• AIX 5.2/5.3 - default 80%
• AIX 6.1 - default 90%
– maxclient%
• AIX 5.2/5.3 - default 80%
• AIX 6.1 - default 90%
Notes:
VMM Thresholds
Several numerical thresholds define the objectives of the VMM. When one of these
thresholds is breached, the VMM takes appropriate action to bring the state of memory
back within bounds. These thresholds are:
- minfree specifies the minimum acceptable number of real memory page frames on
the free list
- maxfree specifies the maximum size to which the free list will grow by VMM page
stealing
- minperm% specifies the point below which the page stealer will steal file or
computational pages regardless of repaging rates
- maxperm% specifies the point above which the page stealer steals only file pages
- maxclient% specifies maximum percentage of RAM that can be used for caching
client pages
VMM Thresholds (2 of 2)
• Other vmo parameters that affect page replacement are:
– strict_maxclient (default 1)
– strict_maxperm (default 0)
– lru_file_repage
• AIX 5.2/5.3 - default 1
• AIX 6.1 - default 0
Notes:
The strict_maxclient and strict_maxperm tunables are restricted tunables and
should NOT be changed unless directed by IBM AIX Development or Support.
When strict_maxclient is set to 1 (the default), then the maxclient% value will be a
hard limit on how much of RAM can be used as a client file cache.
If strict_maxperm is set to 1 (not the default), then the maxperm% value will be a hard
limit on how much of RAM can be used as a persistent file cache. With the default of
strict_maxperm set to 0, maxperm% is a soft limit.
When lru_file_repage is set to 0 (the default in AIX 6.1), the repage rates are
ignored and the page stealer tries to steal only file pages, as long as numperm is
greater than minperm. If lru_file_repage is set to 1 (not the default in AIX 6.1), then
the repage rates may be considered when deciding what type of page to steal.
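The lru_file_repage=0 behavior can be sketched as a small decision function (a simplified model; the numbers passed in are illustrative percentages, not live system values):

```shell
# With lru_file_repage=0, repage rates are ignored: the stealer targets
# only file pages while numperm is above minperm, otherwise any page type.
steal_target() {
    numperm=$1
    minperm=$2
    if [ "$numperm" -gt "$minperm" ]; then
        echo "file pages only"
    else
        echo "any page type"
    fi
}

steal_target 27 3   # numperm 27% > minperm 3%  -> file pages only
steal_target 2 3    # numperm 2% <= minperm 3%  -> any page type
```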
When to Steal Pages Based on Free Pages
•The following vmo parameters ensure there are pages on the free list:
– minfree (default 960 pages)
– maxfree (default 1088 pages)
(Figure: the free list shrinks as pages are allocated; when the number of free pages
falls below minfree, page stealing starts, and it continues until the free list grows
back to maxfree.)
(Figure: AIX 5.2/5.3 with maxclient at its 80% default versus AIX 6.1 with maxclient at
90%. In both cases, page stealing starts when numclient exceeds maxclient minus
minfree and stops when numclient falls below maxclient minus maxfree.)
Notes:
If strict_maxclient=1 (the default), the page stealer may start before the free list
reaches minfree number of pages. When the number of client pages exceeds the
value of maxclient minus minfree, then page stealing starts.
When the number of client pages drops below the value of maxclient minus maxfree,
then page stealing stops.
The visual shows a comparison of what happens in AIX 5.2/5.3 versus AIX 6.1. Note
that in AIX 6.1, the page stealer will start later than it would in AIX 5.2/5.3 and stops
earlier than it would in AIX 5.2/5.3. This allows more client pages to remain in memory
in AIX 6.1.
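As a worked example of these thresholds (using the 262144-page system shown later in this unit and the AIX 6.1 defaults; the integer arithmetic is an approximation of what the kernel computes):

```shell
mem=262144                       # real memory, in 4 KB pages
maxclient=$(( mem * 90 / 100 ))  # maxclient% = 90 (AIX 6.1 default)
minfree=960                      # vmo defaults
maxfree=1088

start=$(( maxclient - minfree ))  # stealing starts above this numclient
stop=$((  maxclient - maxfree ))  # stealing stops below this numclient

echo "maxclient=$maxclient start=$start stop=$stop"
```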
What Types of Pages are Stolen?
lru_file_repage = 1 (default in AIX 5.2/5.3):
– numperm > maxperm (80%): tries to steal only file pages
– minperm (20%) < numperm < maxperm: repage rates decide which type of page is stolen
– numperm < minperm: steals file or computational pages regardless of repage rates

lru_file_repage = 0 (default in AIX 6.1):
– numperm > minperm: tries to steal only file pages (maxperm defaults to 90%)

Note: File pages here mean BOTH client and persistent pages
# svmon -G
              size       inuse        free         pin     virtual
memory      262144      259018        3126      108991      187230
pg space    131072        1876
Notes:
The vmstat -I command reports virtual memory statistics including file page ins (fi)
and file page outs (fo) per second, paging space page ins (pi) and paging space page
outs (po) for working pages, number of pages scanned (sr), number of pages stolen or
freed (fr).
The svmon -G command gives a snapshot of the overall picture of memory use
including the total amount of real memory (size), number of free memory frames
(free), number of memory frames containing working segment pages (work field in the
in use), number of memory frames containing persistent segment pages (pers field in
the in use), and the number of memory frames containing client segment pages
(clnt). These four fields add up to the total real memory.
In the vmstat output, avm stands for Active Virtual Memory. The avm value in the vmstat
output and the virtual value in the svmon -G output show the active number of 4 KB
virtual memory pages in use at that time.
The fre value in the vmstat output and the free field in the svmon -G output indicate
the average number of 4 KB pages that are currently on the free list.
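As a sanity check on the svmon -G sample above, inuse plus free should equal the total memory size (a parsing sketch against the captured line, not live svmon output):

```shell
# Captured "memory" line from svmon -G: size inuse free pin virtual
line='memory 262144 259018 3126 108991 187230'
set -- $line
size=$2; inuse=$3; free=$4

total=$(( inuse + free ))
echo "inuse + free = $total (size = $size)"   # 262144 on both sides
```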
How Many File Pages are in Memory?
# vmstat -v
262144 memory pages
238362 lruable pages
3013 free pages
1 memory pools
109201 pinned pages
80.0 maxpin percentage
3.0 minperm percentage
90.0 maxperm percentage
27.3 numperm percentage
65249 file pages
0.0 compressed percentage
0 compressed pages
27.3 numclient percentage
90.0 maxclient percentage
65249 client pages
0 remote pageouts scheduled
32 pending disk I/Os blocked with no pbuf
0 paging space I/Os blocked with no psbuf
2484 filesystem I/Os blocked with no fsbuf
26 client filesystem I/Os blocked with no fsbuf
0 external pager filesystem I/Os blocked with no fsbuf
Notes:
In a particular workload, it might be more important to avoid file I/O. In another
workload, keeping computational segment pages in memory might be more important.
To get the file page and other statistics, use the vmstat -v command. If PerfPMR was
run, the output is in vmstat_v.before and vmstat_v.after.
Note: The numperm value can be less than numclient because text pages are classified
as computational.
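The numperm percentage in the sample can be reproduced from the raw counts (an illustrative calculation; it assumes numperm% is computed against lruable pages and truncated to one decimal place):

```shell
filepages=65249   # "file pages" from the vmstat -v sample
lruable=238362    # "lruable pages"

numperm=$(awk -v f="$filepages" -v l="$lruable" \
    'BEGIN { printf "%.1f", int(f / l * 1000) / 10 }')
echo "numperm = ${numperm}%"   # 27.3, matching the report
```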
Notes:
A successful page replacement keeps the memory pages of all currently active
processes in RAM, while the memory pages of inactive processes are paged out.
However, when RAM is over-committed, it becomes difficult to choose good candidates
for page out, because most pages will be referenced in the near future by the currently
running processes. The result is that pages that will soon be needed still get paged out
and then paged in again later. When this happens, continuous paging in and paging out
(thrashing) may occur. The system spends most of its time paging instead of executing
useful instructions, and none of the active processes make any significant progress.
Use the svmon -G command to get the amount of memory being used and compare that
to the amount of real memory. To do this:
- The total amount of real memory is shown in the memory size field
- The amount of memory being used is the total of the virtual pages shown in the
memory virtual field, the persistent pages shown in the in use pers field, and the
client pages shown in the in use clnt field.
Exercise 4:
Virtual Memory Performance Monitoring
Review Questions (1 of 2)
1. The virtual memory system is composed of
________________ and ______________________
2. Virtual memory is divided into the three segment types:
_____________, _____________, and _____________
3. What type of segments are paged out to paging space?
__________________
4. Segments are classified as either __________________
or ___________________
5. The two major functions of the VMM are:
–
–
6. The name of the kernel process that implements the page
replacement algorithm is _______
Review Questions (2 of 2)
7. List the vmo parameter that matches the description:
a. Specifies the minimum number of frames on the free list when the
VMM starts to steal pages to replenish the free list _______
b. Specifies the number of frames on the free list at which page
stealing stops ______________
c. Specifies the point below which the page stealer will steal file or
computational pages regardless of repaging rates ___________
d. Specifies the point above which the page stealing algorithm steals
only file pages ______________
e. Specifies the maximum percentage of RAM that can be used for
caching client pages ______________
f. Specifies whether the maxclient value will be a hard limit on how
much of RAM can be used as a client file cache
_______________
g. Specifies whether the maxperm value will be a hard limit on how
much of RAM can be used as a persistent file cache
_______________
h. Specifies whether or not to consider repage rates when deciding
what type of page to steal ________________
Unit Summary
• The amount of virtual memory that is in use at any given time
can be larger than real memory. The VMM must store the
surplus virtual memory on disk.
• From the performance standpoint, the VMM has two
objectives:
– Minimize the overall CPU time and I/O bandwidth cost for
virtual memory
– Minimize the response time cost of page faults
• To fulfill these objectives, the VMM:
– Maintains a free list of page frames that are available to
satisfy a page fault
– Uses a page replacement algorithm to determine which
virtual memory pages currently in memory will have their
page frames reassigned to the free list
References
SC23-5253 AIX Performance Management
SC23-5254 AIX Performance Tools Guide and Reference
AIX Commands Reference, Volumes 1-6
SG24-6478 AIX Practical Performance Tools and Tuning Guide
(Redbook)
Unit Objectives
After completing this unit, you should be able to:
• Identify factors related to physical and logical volume
performance and file systems
• Use performance tools to identify I/O bottlenecks
• Describe how file fragmentation affects file system I/O
performance
• List guidelines for accurate I/O measurements
• Measure read and write throughput
• Define and create JFS and JFS2 logs
• Reorganize a file system
LVM Terminology
(Figure: the application layer (JFS/JFS2 file systems or raw logical volumes) sits above
the logical layer (volume group and logical volumes), which maps onto the physical
layer (physical disks and disk arrays).)
Notes:
The logical volume layer sits between the application and physical layers. The
application layer consists of the file systems or raw logical volumes. The physical layer
consists of the physical disks, their device drivers, and any disk arrays that may already
be configured. The LVM maps data between the application layer and physical storage.
Even physical volumes are part of the logical layer, since the physical layer contains
only the actual hardware.
Physical disk drives, storage arrays, and virtual disks are known as physical
volumes in LVM. All of the physical volumes in a volume group are divided into physical
partitions. All the physical partitions within a volume group are the same size, although
different volume groups can have different physical partition sizes. A volume group is
made up of one or more physical volumes. Within each volume group, one or more
logical volumes are defined. Logical volumes are groups of information located on
physical volumes. Each logical volume consists of one or more logical partitions.
Logical partitions are the same size as the physical partitions within a volume group.
Each logical partition is mapped to one, two or three physical partitions.
Notes:
When a logical volume is created, you can specify which physical volumes to use.
The intra-disk allocation policy choices are based on the five regions of a disk where
physical partitions can be located.
The inter-disk allocation policy specifies the number of disks on which the physical
partitions of a logical volume are located.
A logical volume can have from 1 to 3 copies. You can also decide how to handle
recovery of a mirrored logical volume with the Mirror Write Consistency setting.
The strictness policy defines the rule of whether each logical partition copy must be
on a separate physical volume.
Relocate LV during reorganization specifies whether to allow the relocation of the
logical volume during reorganization.
Write verify sets an option that causes the disk to verify each write by reading the data back after it is written.
Logical volume serialization serializes overlapping I/Os.
Causes of Poor I/O Performance
• Fragmentation (file, file system, or logical volume)
• MWC writes
• Write Verify enabled
• Excessive disk seeks
• Saturated devices (disks, adapters, buses)
• Locality of data (hot partitions, hot disks)
• Slow disk subsystem
Notes:
Fragmentation can occur at the file, file system, or the logical volume level.
Mirror Write Consistency Check writes can seriously hurt performance. This is more
of an issue with random writes. If it’s a problem, then consider using passive MWC.
Write verify can also be very expensive because after the write, the data has to be
read and verified.
Disk seeks can cause poor performance since this is the slowest part of a physical
disk. Disk seeks can be caused by the application, by fragmentation, or by concurrent
accesses to multiple data sets on the same physical disk drive by different threads.
Saturated devices (disks, adapters, or buses) can cause poor performance due to lack
of throughput.
Locality of data is important because certain areas or partitions of a disk may be more
frequently accessed than others.
Sometimes the disk subsystem may be slow for one reason or another. It could be
software or hardware issues.
Notes:
Rather than migrating entire logical volumes from one disk to another in an attempt to
rebalance the workload, if we can identify the individual hot logical partitions, then we
can focus on migrating just those to another disk. The lvmstat utility can be used to
monitor the utilization of individual logical partitions of a logical volume. By default,
statistics are not kept on a per partition basis. These statistics can be enabled with the
lvmstat -e option. You can enable statistics for:
- All logical volumes in a volume group with lvmstat -e -v vgname
- Per logical volume basis with lvmstat -e -l lvname
The first report generated by lvmstat provides statistics concerning the time since the
system was booted. Each subsequent report covers the time since the previous report.
All statistics are reported each time lvmstat runs. The report consists of a header row
followed by a line of statistics for each logical partition or logical volume depending on
the flags specified.
lvmstat Example
• The following shows how to list the top logical partitions of a
logical volume:
# lvmstat -l lv03 -e
# lvmstat -l lv03 -c 10
Notes:
The visual shows enabling lvmstat to gather statistics for the logical volume, lv03. It
then gathers and reports the statistics for the 10 busiest logical partitions on lv03.
The report has the following fields:
Field Description
Log_part Logical partition number
mirror# Mirror copy number of the logical partition
iocnt Number of read and write requests
Kb_read The total number of kilobytes read
Kb_wrtn The total number of kilobytes written
Kbps The amount of data transferred in kilobytes per second
Migration Example
# lvmstat -v datavg -e
# lvmstat -v datavg
Logical Volume iocnt Kb_read Kb_wrtn Kbps
lv00 2099 26564 25364 0.12
lv01 1682 0 253 0.11
lv02 39 0 156 0.00
# lvmstat -l lv00
Log_part mirror# iocnt Kb_read Kb_wrtn Kbps
2 1 1848 12760 12416 0.06
8 1 684 10624 9480 0.03
7 1 556 1196 733 0.01
3 1 507 2836 210 0.03
Notes:
The first command, lvmstat -v datavg -e, enables LVM statistics gathering on
datavg. It also enables statistics gathering on all logical volumes in datavg.
To get the LVM statistics on datavg, use the command: lvmstat -v datavg
The activity is highest on lv00. To take a closer look at lv00, use the command:
lvmstat -l lv00.
At this point, we may want to consider migrating lv00’s logical partition 2 to another
disk. This can be done with the command: migratelp lv00/2 hdisk2.
Note: We are assuming that LP2 and LP8 of lv00 are on the same disk. They may not
be. In reality, we would need to confirm that using lslv -m lv00 or look at the mapping
of the storage array.
sar -d
# sar -d 1 2
AIX leguin221 1 6 00066BA2D900 02/09/09
System configuration: lcpu=2 drives=8 mode=Capped
15:31:47 device %busy avque r+w/s Kbs/s avwait avserv
15:31:48 hdisk0 0 0.0 0 0 0.0 0.0
hdisk1 100 0.0 282 1128 0.0 4.0
hdisk2 0 0.0 0 0 0.0 0.0
hdisk3 0 0.0 0 0 0.0 0.0
hdisk4 0 0.0 0 0 0.0 0.0
hdisk5 0 0.0 0 0 0.0 0.0
cd0 0 0.0 0 0 0.0 0.0
hdisk6 91 0.0 6045 24180 0.0 0.2
Notes:
The -d flag of sar provides real time disk I/O statistics. The fields listed by sar -d are:
- %busy - Reports the portion of time device was busy servicing a transfer request.
- avque - Reports the average number of requests outstanding from the adapter to
the device during the time interval.
- r+w/s - The number of read/write transfers from or to device.
- Kbs/s -The amount of data transferred to the drive in KB per second.
- avwait - The average time (in milliseconds) that transfer requests waited idly on
the queue for the device. Prior to AIX 5.3, this was not supported. If you see large
numbers in the avwait column, try to distribute the workload on other disks.
- avserv - The average time (in milliseconds) to service each transfer request
(includes seek, rotational latency, and data transfer times) for the device. Prior to
AIX 5.3, this was not supported.
Note: %busy is the same as %tm_act in iostat, and r+w/s is equal to tps in iostat.
Using iostat
# iostat 5 2
Notes:
The iostat command is used for monitoring system input/output device loading by
observing the time the physical disks are active in relation to their average transfer
rates. It does not provide data for file systems or logical volumes. The iostat
command generates reports that can be used to change the system configuration to
better balance the input/output load between physical disks and adapters.
Beginning with AIX 5.3, the collection of disk input/output statistics is disabled by default
to improve performance. To enable the collection of this data, type:
chdev -l sys0 -a iostat=true
To display the current settings, type:
lsattr -E -l sys0 -a iostat
If the collection of disk input/output history is disabled and iostat is called without an
interval, the iostat output displays the message Disk History Since Boot Not
Available instead of disk statistics.
What is iowait?
• iowait is a form of idle time
• The iowait statistic is simply the percentage of time the
CPU is idle AND there is at least one I/O still in progress
(started from that CPU)
• The iowait value seen in the output of commands like
vmstat, iostat, and topas is the iowait percentages
across all CPUs averaged together
• High I/O wait does not mean that there is definitely an I/O
bottleneck
• Zero I/O wait does not mean that there is not an I/O
bottleneck
• A CPU in I/O wait state can still execute threads if there are
any runnable threads
Notes:
To summarize it in one sentence, iowait is the percentage of time the CPU is idle AND
there is at least one I/O in progress. Each CPU can be in one of four states:
- user
- sys
- idle
- iowait
Performance tools such as vmstat, iostat, sar, etc. print out these four states as a
percentage. The sar tool can print out the states on a per CPU basis (-P flag) but most
other tools print out the average values across all the CPUs. Since these are
percentage values, the four state values should add up to 100%.
Monitoring Adapter I/O Throughput
• iostat -a shows adapter throughput
• Disks are listed following the adapter to which they are
attached
# iostat -a
System configuration: lcpu=2 drives=8 paths=8 vdisks=0 tapes=0
tty: tin tout avg-cpu: % user % sys % idle % iowait
0.0 3395.0 3.6 24.5 34.8 37.2
Adapter: Kbps tps Kb_read Kb_wrtn
sissas0 1128.0 282.0 128 1000
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk0 0.0 0.0 0.0 0 0
hdisk1 99.0 1128.0 282.0 128 1000
hdisk2 0.0 0.0 0.0 0 0
hdisk3 0.0 0.0 0.0 0 0
hdisk4 0.0 0.0 0.0 0 0
hdisk5 0.0 0.0 0.0 0 0
cd0 0.0 0.0 0.0 0 0
Adapter: Kbps tps Kb_read Kb_wrtn
fcs0 24300.0 6075.0 2720 21580
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk6 91.0 24300.0 6075.0 2720 21580
Notes:
Adapter throughput
The -a option to iostat will combine the disks statistics to the adapter to which they
are connected. The adapter throughput will simply be the sum of the throughput of each
of its connected devices. With the -a option, the adapter will be listed first, followed by
its devices and then followed by the next adapter, followed by its devices, and so on.
The adapter throughput values can be used to determine if any particular adapter is
approaching its maximum bandwidth or to see if the I/O is balanced across adapters.
Notes:
System throughput
The -s option to iostat shows the system throughput. This is the sum of all the
adapter’s throughputs.
File System I/O Layers
(Figure: file system I/O passes from the logical file system (local or NFS) through the
Virtual Memory Manager (paging) and the Logical Volume Manager (disk space
management) to the physical disks.)
Notes:
There are a number of layers involved in file system storage and retrieval. It’s important
to understand what performance issues are associated with each layer. The
management tools used to monitor file system activity can provide data on each of
these layers.
The effect of a file’s placement on I/O performance diminishes when the file is buffered
in memory. When a file is opened in AIX, it is mapped to a persistent (JFS) or client
(JFS2) data segment in virtual memory. The segment represents a virtual buffer for the
file. The file’s blocks map directly to segment pages. The VMM manages the segment
pages, reading file blocks into segment pages upon demand (as they are accessed).
There are several circumstances that cause the VMM to write a page back to its
corresponding block in the file on disk.
Notes:
There’s a theory that anything that starts out with perfect order will, over time, become
disordered due to outside forces. This concept certainly applies to file systems. The
longer a file system is used, the more likely it will become fragmented. Also, the
dynamic allocation of resources (e.g., extending a logical volume) contributes to the
disorder. File system performance is also affected by physical considerations.
With fragmentation, sequential file access will no longer find contiguous physical disk
blocks. Random access may not find physically contiguous logical records and will have
to access more widely dispersed data. In both cases, seek time for file access grows.
Both JFS and JFS2 attach a VM segment to do I/O, so file data becomes cached in
memory and disk fragmentation does not affect access to the cached data.
Each read or write operation on a file system is done through system calls. System calls
for reads and writes define the size of the operation. The smaller the operation the more
system calls are needed to read or write the entire file. Therefore, more CPU time is
spent making the system calls. The read or write size should be a multiple of the file
system block size to reduce the amount of CPU time spent per system call.
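To see why the operation size matters, count the system calls needed to read a 100 MB file at two different read sizes (simple arithmetic, not a measurement):

```shell
filesize=$(( 100 * 1024 * 1024 ))    # 100 MB file

calls_4k=$(( filesize / 4096 ))      # one read per 4 KB
calls_1m=$(( filesize / 1048576 ))   # one read per 1 MB

echo "4 KB reads: $calls_4k system calls"   # 25600
echo "1 MB reads: $calls_1m system calls"   # 100
```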
How to Measure File System Performance
• General guidelines for accurate measurements
– System has to be idle
– System management tools like WLM should be turned off
– I/O subsystems should not be shared with other systems
– Files must not be cached in memory for read throughput measurement
– Writes must go to the file system disk
Notes:
File system operations require system resources such as CPU, memory, and I/O. The
result of a file system performance measurement will NOT be accurate if one or more of
these resources are in use by other applications. The same applies if one or more of
these resources is managed and/or the statistics are gathered with system
management tools like Workload Manager (WLM). Those tools should be turned off.
I/O subsystems can share disk space among several systems. The available bandwidth
might not be enough to achieve maximum file system performance if the I/O subsystem
is used by other systems during the performance measurement, thus it should not be
shared. When a file is cached in memory, a read throughput measurement does not
give any information about the file system throughput since no physical operation on the
file system takes place. The best way to assure that a file is not cached in memory is to
unmount then mount the file system on which the file is located. A write throughput
measurement does not give any information about file system performance if nothing is
written out to disk. Unless the application opens files in such a way that it doesn’t use
file system buffers (such as direct I/O), then each write to a file is done in memory and is
written out to disk by either a syncd or a write-behind algorithm.
Notes:
The dd command is a good utility to measure the throughput of a file system since it
allows you to specify the exact size for reads or writes as well as the number of
operations.
Example
The first set of sync commands flush all modified file pages in memory to disk. The time
between the first and the second date command is the amount of time the dd
command took to write the file into memory. The time between the first and third date
command is the total amount of time it took to write the file to disk.
In this example, dd completed after 3 seconds (23:03:20 - 23:03:17) and wrote about
33.3 MB per second and the total amount of time it took to write the data to the file
system is 4 seconds (23:03:21 - 23:03:17), about 25 MB per second.
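The measurement described in the example can be sketched as follows (a portable approximation that writes a small 10 MB scratch file from /dev/zero; the file name is hypothetical, and on AIX you would write a larger file to the file system under test, as in the notes):

```shell
testfile=/tmp/ddwrite.$$   # hypothetical scratch file
sync                       # flush anything already dirty

t0=$(date +%s)
dd if=/dev/zero of="$testfile" bs=1048576 count=10 2>/dev/null
t1=$(date +%s)             # dd done: data is (mostly) still in memory
sync                       # force the dirty file pages out to disk
t2=$(date +%s)             # data is now on disk

written=$(( $(wc -c < "$testfile") ))
echo "wrote $written bytes; dd took $(( t1 - t0 ))s, disk total $(( t2 - t0 ))s"
rm -f "$testfile"
```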
How to Measure Read Throughput
• Useful tools for file system performance measurements
are dd and time
real 0m1.16s
user 0m0.00s
sys 0m0.19s
Notes:
Example
The time command shows the amount of time it took to complete the read.
The read throughput in this example is about 86.2 MB per second (100 MB / 1.16
seconds real time).
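The throughput figure quoted above is just the file size divided by the real time reported by time (awk handles the floating-point division):

```shell
# 100 MB read in 1.16 s of real time
thr=$(awk -v mb=100 -v real=1.16 'BEGIN { printf "%.1f", mb / real }')
echo "read throughput: $thr MB/s"   # 86.2
```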
• Basic syntax:
filemon -O report-types -o output-file
• Runs in the background; stops with the trcstop command
• Uses the trace facility
Notes:
If an application is believed to be disk-bound, the filemon utility is useful to find out
where and why.
The filemon command uses the trace facility to obtain a detailed picture of I/O activity
during a time interval on the various layers of file system utilization, including the logical
file system, virtual memory segments, LVM, and physical disk layers. Data can be
collected on all the layers, or some of the layers. The default is to collect data on the
virtual memory segments, LVM, and physical disk layers.
By default, filemon runs in the background while other applications are running and
being monitored. When the trcstop command is issued, filemon stops and generates
its report.
The report begins with a summary of the I/O activity for each of the levels (the Most
Active sections) and ends with detailed I/O activity for each level (Detailed sections).
Each section is ordered from most active to least active.
When running PerfPMR, the filemon data is in the filemon.sum file.
filemon - Most Active Files Report
# filemon -O lv,lf,pv -o fmon.out
# trcstop
# cat fmon.out
Wed Feb 11 23:08:09 2009
System: AIX 6.1 Node: leguin221 Machine: 00066BA2D900
Notes:
The visual on this page shows the logical file output (lf) from the filemon report. The
logical file I/O includes reads, writes, opens, and seeks, which may or may not result in
actual physical I/O depending on whether the files are already buffered in memory.
Statistics are kept by file.
Output is ordered by #MBs read and/or written to a file.
By default, the logical file reports are limited to the 20 most active files. If the verbose
flag (-v) is added, activity for all files is reported. The -u flag can be used to generate
reports on files opened prior to the start of the trace daemon.
Look for the most active files to see usage patterns. If they are dynamic files, they may
need to be backed up and restored. The Most Active Files section shows the
bigfile1 file (read by the dd command) as the most active file, with one open and 101
reads. The number of writes (#wrs) is 1 less than the number of reads (#rds) because
the final read returns end-of-file.
filemon - Detailed File Stats Report
------------------------------------------------------------------------
Detailed File Stats
------------------------------------------------------------------------
FILE: /dev/null
opens: 1
total bytes xfrd: 104857600
writes: 100 (0 errs)
write sizes (bytes): avg 1048576.0 min 1048576 max 1048576 sdev 0.0
write times (msec): avg 0.003 min 0.003 max 0.005 sdev 0.000
Notes:
------------------------------------------------------------------------
Detailed Physical Volume Stats (512 byte blocks)
------------------------------------------------------------------------
Notes:
Fragmentation and Performance
(Diagram: a logical file's sequentially numbered blocks are mapped through i-nodes to
scattered, non-contiguous block locations in the physical file system.)
Notes:
While an operating system’s file is conceptually a sequential and contiguous string of
bytes, the physical reality might be very different. Fragmentation may arise from
multiple extensions to logical volumes, from cycles of allocation, release, and
reallocation within a file system, or simply from appending to a file while other
applications are also writing files in the same area. A file system is fragmented when its available
space consists of large numbers of small chunks of space, making it impossible to write
out a new file in contiguous blocks.
Access to files in a highly fragmented file system may result in a large number of seeks
and longer I/O response times (seek latency dominates I/O response time). For
example, if the file is accessed sequentially, a file placement that consists of many,
widely separated chunks requires more seeks than a placement that consists of one or
a few large contiguous chunks. If the file is accessed randomly, a placement that is
widely dispersed requires longer seeks than a placement in which the file’s blocks are
close together.
Notes:
To see the characteristics of a hot logical volume, the lslv command may be used with
the specified logical volume name. This will tell you what policies are in effect. Then, to
see if the intra-policy is being followed, the lslv -l command may be used. The
IN BAND column will indicate the percentage of physical partitions that are allocated in
the region specified by the intra-policy. The total region distribution is also displayed.
To see the allocation map for logical partitions on a specific disk, use the
lslv -p hdisk# lvname command. Replace the # with the number of the desired
physical disk.
To see the allocation map for a logical volume across all disks it occupies, use the
lslv -m lvname command.
Logical Volume Settings
# lslv lv00
LOGICAL VOLUME: lv00 VOLUME GROUP: testvg
LV IDENTIFIER: 00066ba20000d9000000011f5c6d5c5a.1 PERMISSION: read/write
VG STATE: active/complete LV STATE: closed/syncd
TYPE: jfs WRITE VERIFY: off
MAX LPs: 512 PP SIZE: 128 megabyte(s)
COPIES: 1 SCHED POLICY: parallel
LPs: 20 PPs: 20
STALE PPs: 0 BB POLICY: relocatable
INTER-POLICY: minimum RELOCATABLE: yes
INTRA-POLICY: middle UPPER BOUND: 32
MOUNT POINT: N/A LABEL: None
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes
Serialize IO ?: NO
# lslv -l lv00
lv00:N/A
PV COPIES IN BAND DISTRIBUTION
hdisk1 013:000:000 15% 009:002:002:000:000
hdisk2 007:000:000 42% 003:003:001:000:000
Notes:
Using the output from lslv, you can compare the requested policies against the actual
implementation. The lslv -l output shows several characteristics of the logical
volume. The PerfPMR config.sum file lists the output of lslv for each logical volume.
The COPIES column shows the disks where the physical partitions reside. There are
three columns, one for each of the possible logical volume copies.
The IN BAND column shows the percentage of the partitions that met the intra-policy
criteria.
The DISTRIBUTION column shows the locations of the physical partitions of this logical
volume as numbers separated by a colon (:). Each of these numbers represents an
intra-policy location. The partitions outside the IN BAND percentage reside in other
regions of the disk and may be fragmented.
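As a check on the arithmetic, the IN BAND value is the middle-region count from the DISTRIBUTION column divided by the total in the COPIES column. For the hdisk1 row in the example output (2 of 13 partitions in the middle region):

```shell
# hdisk1: COPIES 013, DISTRIBUTION 009:002:002:000:000 (middle region = 2)
awk 'BEGIN { printf "%d%%\n", 100 * 2 / 13 }'   # prints 15%
```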
lslv -p
# lslv -p hdisk1 lv00
hdisk1:lv00:N/A
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 1-10
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 11-20
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 21-30
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 31-40
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 41-50
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 51-60
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 61-70
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 71-80
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 81-90
FREE FREE FREE FREE FREE FREE FREE FREE FREE 0001 91-100
0002 0003 0004 0005 0006 0015 0016 0017 FREE FREE 101-110
USED USED USED USED USED USED USED USED USED USED 111-120
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 121-130
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 131-140
FREE FREE FREE FREE FREE FREE FREE FREE FREE 0010 141-150
FREE 0011 FREE FREE FREE FREE FREE FREE FREE FREE 151-160
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 161-170
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 171-180
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 181-190
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 191-200
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 201-210
FREE FREE FREE FREE FREE FREE FREE FREE FREE 211-219
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 220-229
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 230-239
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 240-249
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 250-259
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 260-269
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 270-279
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 280-289
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 290-299
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 300-309
FREE FREE FREE FREE FREE FREE FREE FREE FREE FREE 310-319
0013 0014 FREE FREE FREE FREE FREE FREE FREE 320-328
Notes:
In the previous visual, even if all the partitions were in band, that would not guarantee
that they are contiguous. Therefore, look at the lslv -p data next.
Logical volume fragmentation occurs if logical partitions are not contiguous across the
disk. The lslv -p command shows the logical volume allocation map for the physical
volume given.
The state of the partition is listed as one of the following:
- USED indicates that the physical partition at this location is used by a logical volume
other than the one specified with lslv -p.
- FREE indicates that this physical partition is not used by any logical volume.
- STALE indicates that the specified partition is no longer consistent with other
partitions. The system lists the logical partition number with a question mark if the
partition is stale.
- Where it shows a number, this indicates the logical partition number of the logical
volume specified with the lslv -p command.
lslv -m
# lslv -m lv00
lv00:
LP PP1 PV1 PP2 PV2 PP3 PV3
0001 0100 hdisk1
0002 0101 hdisk1
0003 0102 hdisk1
0004 0103 hdisk1
0005 0104 hdisk1
0006 0105 hdisk1
0007 0051 hdisk2
0008 0052 hdisk2
0009 0055 hdisk2
0010 0150 hdisk1
0011 0152 hdisk1
0012 0300 hdisk2
0013 0320 hdisk1
0014 0321 hdisk1
0015 0106 hdisk1
0016 0107 hdisk1
0017 0108 hdisk1
0018 0145 hdisk2
0019 0146 hdisk2
0020 0149 hdisk2
Notes:
The lslv -m option shows the mapping of a logical volume. For each logical partition,
it gives the physical partition and physical volume where the logical partition resides.
Notes:
The fileplace tool displays the placement of a file’s blocks within a logical volume or
physical volumes. fileplace expects the name of the file to examine as an argument.
This tool can be used to detect file fragmentation.
By default, fileplace sends its output to the display, but the output can be redirected
to a file via normal shell redirection.
The example in the visual demonstrates how to use fileplace to determine whether a
file is fragmented.
The report generated by the -pv options displays the file’s placement in terms of
physical volume blocks for the physical volumes. The verbose part of the report is one
of the most important sections since it displays the efficiency and sequentiality of the
file.
Higher space efficiency and sequentiality provide better sequential file access.
Reorganizing the File System
• After identifying a fragmented file system, reduce the
fragmentation by:
1. Backing up the files (by name) in that file system
2. Deleting the contents of the file system (or
recreating it with mkfs)
3. Restoring the contents of file system
Notes:
File system fragmentation can be alleviated by copying the files to backup media,
recreating the file system with mkfs fsname (or deleting the contents of the file
system), and reloading the files into the new file system. This loads the files
sequentially and reduces fragmentation.
Some file systems or logical volumes should not be reorganized because the data is
either transitory (for example, /tmp), does not change much (for example, /usr and /),
or is not in a file system format (the log).
A full backup and restore is only needed when there is a lot of fragmented free space.
If only a few large, high-usage sequential files are fragmented and there is enough
contiguous free space, you can instead copy each file to a different file name, delete
the original file, and then rename the copy back to the original name.
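The single-file technique above can be sketched as follows (the file name is illustrative); the copy is written into contiguous free space, so the renamed result replaces the fragmented original:

```shell
cp /bigfs/bigfile1 /bigfs/bigfile1.new   # copy allocates fresh, contiguous blocks
rm /bigfs/bigfile1                       # remove the fragmented original
mv /bigfs/bigfile1.new /bigfs/bigfile1   # rename the copy to the original name
```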
Notes:
Using small fragment sizes is not recommended, but if a journaled file system has been
created with a fragment size smaller than 4 KB, after a period of time it becomes
necessary to query the amount of scattered, unusable free fragments. When many
small fragments are scattered, it is difficult to find enough contiguous free space.
To recover these small, scattered spaces, use smit or the defragfs command. Some
free space must be available for the defragmentation procedure to be used. The file
system must be mounted for read-write.
JFS and JFS2 Logs
• AIX uses a special logical volume called the log device as a
circular journal for recording modifications to file system
metadata
Notes:
JFS and JFS2 use a technique that duplicates transactions that are made to file system
metadata to the circular file system log. File system metadata includes the superblock,
i-nodes, indirect data pointers, and directories. All I/Os to the log are synchronous.
File system logs enable rapid and clean recovery of file systems if a system goes down.
However, there may be a performance trade-off. If an application is doing synchronous
I/O or is creating and/or removing many files in a short amount of time, then there may
be a lot of I/O going to the log logical volume. Information about I/Os to the log can be
recorded using the filemon command.
If you notice that a file system and its log device are both heavily utilized, it may be
better to put each one on a separate physical disk (assuming that there is more than
one disk in that volume group). This can be done using the migratepv command or via
SMIT.
JFS2 file systems have an option to have an inline log. An inline log allows you to create
the log within the same data logical volume. With an inline log, each JFS2 file system
can have its own log device without having to share this device.
• What to do:
- Create a new JFS or JFS2 log logical volume
(JFS) # mklv -t jfslog -y LVname VGname 1 PVname
(JFS2) # mklv -t jfs2log -y LVname VGname 1 PVname
Notes:
Overview
Placing the log logical volume on a physical volume different from your most active file
system’s logical volume will increase parallel resource usage assuming that the I/O
pattern on that file system causes JFS/JFS2 log transactions. If there is more than one
file system in the same volume group which is causing JFS/JFS2 log transactions, you
may get better performance by creating a separate JFS/JFS2 log for each of these file
systems. The downside of this is that if you have one JFS/JFS2 log for each file system
then you are potentially faced with storage waste, since the smallest each JFS/JFS2 log
can be is one physical partition.
The performance of disk drives differs. So, try to create the logical volume for a hot file
system on a fast drive (possibly one with fast write cache).
The Commands to Use for Monitoring I/O
• Look for the most active files, file systems, and logical volumes:
- Can “hot” file systems be better located on a physical drive,
or be spread across multiple physical drives? (filemon)
- Are “hot” files local or remote? (filemon)
- Is there enough memory to cache the file pages being
used by running processes? (svmon)
Notes:
Overview
When monitoring disk I/O, there are several areas to look at. The visual gives a list of
questions to ask to help determine your course of action.
Review Questions (1 of 2)
List the command/utility to do the following:
1. Monitor a trace of file system and I/O, and report on the file
and I/O access performance during that period: ___________
2. Show the logical volume allocation map for the physical
volume given: _______________
3. Report statistics for logical partitions and volumes:
____________
4. Show how the actual layout of a logical volume meets the
intra-allocation policy: _____________
5. Monitor system input/output device loading by observing the
time the physical disks are active in relation to their average
transfer rates: _____________
Notes:
Review Questions (2 of 2)
6. An I/O bottleneck may be solved by moving logical
partitions:
a) The ____________ command moves individual logical
partitions of a logical volume.
b) The ___________ command moves an entire logical
volume to another physical disk.
Notes:
Unit Summary
Notes: