AIX For System Administrators - DUMP


AIX for System Administrators

http://aix4admins.blogspot.mx/2011/06/aix-generates-system-dump-w...


Dump - Core

AIX generates a system dump when a severe error occurs. A system dump is a snapshot of the system's memory contents. If the AIX kernel crashes, kernel data is written to the primary dump device, and AIX must then be rebooted. During the next boot, the dump is copied into a dump directory (default is /var/adm/ras). The dump file name is vmcore.x, where x is a number (e.g. vmcore.0).

When the operating system is installed, the dump device is configured automatically. By default, the primary device is /dev/hd6, which is a paging logical volume, and the secondary device is /dev/sysdumpnull.

As a rule of thumb, a dump is about 1/4 of the size of real memory. The command "sysdumpdev -e" also provides an estimate of the dump space needed for your machine. (The estimate can differ under high load, because kernel space usage is higher at that time.)

While a system dump is being taken, the dump image is not written to disk in mirrored form. A dump to a mirrored lv results in an inconsistent dump and should therefore be avoided. The reasoning is that if the mirroring code itself caused the crash, trusting the same code to handle the mirrored write would be pointless. Mirroring a dump device is thus a waste of resources and is not recommended. Since the default dump device is the primary paging lv, you should create a separate dump lv if you mirror your paging lv (which is suggested).

If a valid secondary dump device exists and the primary dump device cannot be reached, the secondary dump device accepts the dump information intended for the primary dump device.

IBM recommendation:
"All I can recommend you is to force a dump the next time the problem should occur. This will enable us to check which process was hanging or what caused the system to not respond any more.
You can do this via the HMC using the following steps: Operations -> Restart -> Dump
As a general recommendation you should always force a dump if a system is hanging. There are only very few cases in which we can determine the reason for a hanging system without having a dump available for analysis."
------------------------------------------
Traditional vs Firmware-assisted dump:

Up to POWER5 only traditional dumps were available; the introduction of POWER6 processor-based systems allowed system dumps to be firmware-assisted. When performing a firmware-assisted dump, system memory is frozen and the partition is rebooted, which allows a new instance of the operating system to complete the dump.

Traditional dump: it is generated before the partition is rebooted. (When the system crashes, AIX tries to copy the memory contents to the dump device at that moment.)
Firmware-assisted dump: it takes place while the partition is restarting. (When the system crashes, memory is frozen, the hypervisor (firmware) allocates a new memory area in RAM, and the memory contents are copied there. During reboot they are copied from this new memory area to the dump device.)

Firmware-assisted dump offers improved reliability over the traditional dump, by rebooting the partition and using a new kernel to dump data from the previous kernel crash.

When an administrator attempts to switch from a traditional to a firmware-assisted system dump, system memory is checked against the firmware-assisted system dump memory requirements. If these requirements are not met, the "sysdumpdev -t" command output reports the minimum system memory required for firmware-assisted dump to be configured. Changing from traditional to firmware-assisted dump requires a reboot of the partition for the change to take effect.

Firmware-assisted system dumps can be one of these types:
- Selective memory dump: the dump is triggered by, or makes use of, the AIX instance that must be dumped.
- Full memory dump: the whole partition memory is dumped without any interaction with the AIX instance that is failing.
-------------------------------------------


Use the sysdumpdev command to query or change the primary or secondary dump devices.
- Primary: usually used when you wish to save the dump data
- Secondary: can be used to discard dump data (that is, /dev/sysdumpnull)


04/06/2013 16:43


Flags for sysdumpdev command:
-l    list the current dump destination
-e    estimates the size of the dump (in bytes)
-p    primary
-s    secondary
-P    make change permanent
-C    turns on compression
-c    turns off compression
-L    shows info about last dump
-K    turns on: always allow system dump

sysdumpdev -P -p /dev/dumpdev    change the primary dump device permanently to /dev/dumpdev


root@aix1: /root # sysdumpdev -l
primary              /dev/dumplv
secondary            /dev/sysdumpnull
copy directory       /var/adm/ras
forced copy flag     TRUE
always allow dump    TRUE      <--if it is FALSE, it can be changed in smitty sysdumpdev
dump compression     ON        <--if it is OFF, sysdumpdev -C changes it to ON (-c changes it to OFF)
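When the dump settings need to be checked from a script, the listing can be filtered by its label. The following is only a sketch: dump_setting is a hypothetical helper name, and it assumes the output keeps the layout shown above (label at the start of the line, value as the last field). The sample text is copied from the listing in this post.

```shell
#!/bin/sh
# dump_setting KEY  (hypothetical helper, not an AIX command)
# Reads "sysdumpdev -l" style text on stdin and prints the last field of
# the first line that starts with KEY (e.g. "primary", "copy directory").
dump_setting() {
    awk -v key="$1" 'index($0, key) == 1 { print $NF; exit }'
}

# Sample text taken from the listing above. On a live system the input
# would instead come from:  sysdumpdev -l | dump_setting primary
sample='primary              /dev/dumplv
secondary            /dev/sysdumpnull
copy directory       /var/adm/ras
forced copy flag     TRUE
always allow dump    TRUE
dump compression     ON'

printf '%s\n' "$sample" | dump_setting primary
printf '%s\n' "$sample" | dump_setting "dump compression"
```

With the sample text this prints /dev/dumplv and ON. A script could use it, for example, to warn when the primary device is still the default /dev/hd6.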


Other commands:
sysdumpstart              starts a dump (smitty dump) (it will do a reboot as well)
kdb                       analyzes the dump
/usr/lib/ras/dumpcheck    checks whether the dump device and copy directory are able to receive the system dump
                          If the dump device is a paging space, it verifies that enough free space exists in the copy dir to hold the dump
                          If the dump device is a logical volume, it verifies that it is large enough to hold a dump
                          (man dumpcheck)
-------------------------------------------


Creating a dump device
1. sysdumpdev -e                                   <--shows an estimate of how much space is required for a dump
2. mklv -t sysdump -y lg_dumplv rootvg 3 hdisk0    <--creates a sysdump lv with 3 PPs
3. sysdumpdev -Pp /dev/lg_dumplv                   <--makes it the primary device (the system will now use this lv for dumps)
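The number of PPs passed to mklv follows from the estimate and the PP size of the volume group. A minimal sketch of that arithmetic is below; pps_needed is a hypothetical helper name, and both the 184129126-byte estimate and the 64 MB PP size are made-up example values, not figures from a real system.

```shell
#!/bin/sh
# pps_needed BYTES PP_SIZE_MB  (hypothetical helper, not an AIX command)
# Prints how many physical partitions are needed to hold BYTES,
# rounding up to a whole PP. Pure arithmetic, no AIX commands are run.
pps_needed() {
    bytes=$1
    pp_bytes=$(( $2 * 1024 * 1024 ))
    echo $(( (bytes + pp_bytes - 1) / pp_bytes ))
}

# Example values (assumptions): 184129126 bytes estimated, 64 MB PPs.
pps_needed 184129126 64
```

This prints 3 for the example values. On a real system the byte count would be read from "sysdumpdev -e" and the PP size from "lsvg rootvg". Because the estimate grows under load, leaving some headroom above the computed value is a reasonable precaution.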

------------------------------------------
System dump initiated by a user
!!!reboot will take place automatically!!!

1. sysdumpstart -p              <--initiates a dump to the primary device (reboot will be done automatically)
                                (If a dedicated dump device is used, user-initiated dumps are not copied automatically to the copy directory.)
                                (If paging space is used for the dump, the dump is copied automatically to /var/adm/ras.)
2. sysdumpdev -L                <--shows that a dump took place on the primary device, with time, size ... (errpt will show it as well)
3. savecore -d /var/adm/ras     <--copies the last dump from the system dump device to the directory /var/adm/ras (not needed if paging space is used)
------------------------------------------
How to move dumplv to another disk:
We want to move from hdisk1 to hdisk0:

1. lslv -l dumplv                       <--checking which disk
2. sysdumpdev -l                        <--checking sysdump device (primary was here /dev/dumplv)
3. sysdumpdev -Pp /dev/sysdumpnull      <--changing primary to sysdumpnull (the secondary, which is a null device) (lsvg -l rootvg shows closed state)
4. migratepv -l dumplv hdisk1 hdisk0    <--moving it from hdisk1 to hdisk0
5. sysdumpdev -Pp /dev/dumplv           <--changing back to the primary device
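For other lv or disk names, the five steps above can be written out first and reviewed before anything is run. The sketch below only prints the commands; move_dumplv_plan is a hypothetical helper name, and nothing in it touches the system.

```shell
#!/bin/sh
# move_dumplv_plan LV SOURCE_DISK TARGET_DISK  (hypothetical helper)
# Prints, one per line, the commands of the move procedure with the
# given names filled in. Nothing is executed.
move_dumplv_plan() {
    lv=$1 src=$2 dst=$3
    echo "lslv -l $lv"
    echo "sysdumpdev -l"
    echo "sysdumpdev -Pp /dev/sysdumpnull"
    echo "migratepv -l $lv $src $dst"
    echo "sysdumpdev -Pp /dev/$lv"
}

# Same names as in the procedure above.
move_dumplv_plan dumplv hdisk1 hdisk0
```

Printing instead of executing is a deliberate choice here: step 3 leaves the system without a usable primary dump device until step 5 is done, so the steps are better run by hand with the output of each one checked.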


------------------------------------------
The largest dump device is too small: (LABEL: DMPCHK_TOOSMALL IDENT)

1. Dumpcheck runs from crontab
# crontab -l | grep dump
0 15 * * * /usr/lib/ras/dumpcheck >/dev/null 2>&1

2. Check if there are any errors:
# errpt
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME DESCRIPTION
E87EF1BE   0703150008 P O dumpcheck     The largest dump device is too small.
E87EF1BE   0702150008 P O dumpcheck     The largest dump device is too small.

3. If you find a new error message, find the dumplv:
# sysdumpdev -l
primary              /dev/dumplv
secondary            /dev/sysdumpnull
copy directory       /var/adm/ras
forced copy flag     TRUE
always allow dump    TRUE
dump compression     ON

List dumplv from rootvg:
# lsvg -l rootvg|grep dumplv
dumplv dump 8 8 1 open/syncd N/A

4. Extend with 1 PP
# extendlv dumplv 1
dumplv dump 9 9 1 open/syncd N/A    <--lsvg -l rootvg now shows 9 PPs

Run the problem check at the end:
OK -> done
Not OK -> extend with 1 PP again
------------------------------------------
Changing the autorestart attribute of the system dump: (smitty chgsys as well)

1. lsattr -El sys0 -a autorestart
autorestart true Automatically REBOOT system after a crash True

2. chdev -l sys0 -a autorestart=false
sys0 changed
---------------------------------------------
CORE FILE:


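errpt shows which program created the core file. If it does not:
- use the strings command (for example: strings core | grep _=)
- or the lquerypv command (for example: lquerypv -h core 6b0 64)

man syscorepath
syscorepath -p /tmp    <--sets /tmp as the directory where core files are written
syscorepath -g         <--shows the current core file path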
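The strings approach relies on the shell exporting the path of the command it starts in the "_" environment variable, so a "_=..." string usually sits among the environment strings saved in the core file. The sketch below wraps that search in a function. core_env_program is a hypothetical helper name, and it swaps strings for tr so that it depends only on basic utilities; it is an approximation of "strings core | grep _=", not a replacement for it.

```shell
#!/bin/sh
# core_env_program FILE  (hypothetical helper, not an AIX command)
# Turns every non-printable byte of FILE into a newline, then prints the
# first line that starts with "_=" with that prefix removed.
core_env_program() {
    LC_ALL=C tr -c '[:print:]' '\n' < "$1" | sed -n 's/^_=//p' | head -n 1
}

# Usage on a real core file in the current directory:
#   core_env_program core
```

If the program was not started from a shell that sets "_", nothing is printed, and the lquerypv method or errpt remains the way to identify it.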

