EMC Control Center Troubleshooting
Version 5.0.3
Troubleshooting Guidelines
P/N 300-000-302
REV A02
6/17/02
Procedures
◆ Troubleshooting Principles, page 12
◆ Diagnosis and Remedy, page 15
Reference, page 40
◆ Agent Names and Codes, page 41
◆ Port Settings, page 45
◆ Trace and Log Files, page 45
◆ Error Messages, page 47
◆ Tools, page 47
Introduction
Troubleshooting involves diagnosing and often simultaneously
repairing a system malfunction. A range of behaviors, from
undesirable (for example, poor performance from traffic bottlenecks)
to nonperformance (for example, functional errors, data
inaccessibility, or corruption) may prompt you to act.
EMC ControlCenter™ aids in troubleshooting and fixing the storage
environment it oversees. The scope of this manual, however, is the
troubleshooting of ControlCenter software itself. While other
ControlCenter manuals cover the successful installation and
functioning of ControlCenter, this guide covers unexpected failures
that could disable normal ControlCenter operation, describing
symptoms that may include unusual behavior in the GUI.
About this Document
This document provides two basic types of diagnostic and remedy
information: procedures and references.
Procedures
Commonly encountered problems, along with their remedies, are
described with specific procedures. In certain cases, specific
remedies cannot be predefined, and a strategy or an escalation
path to EMC Customer Support is given.
See Troubleshooting Principles on page 12 and Diagnosis and Remedy
on page 15.
Reference
Tables and examples describing diagnostic objects such as logs,
initialization files, and the like are provided here.
See Reference on page 40.
Related Documentation
To avoid problems, the user should adhere to specifications provided in the
EMC ControlCenter version 5.0.3 Installation and Configuration Guide, EMC
ControlCenter Version 5.0.3 Release Notes, EMC ControlCenter/Open Edition
Support Matrix, and EMC ControlCenter Performance and Scalability Guidelines.
Typographical Conventions
EMC uses the following type style conventions in this guide:
c:\Program Files\EMC\Symapi\db
System Overview
ControlCenter is a suite of applications used to manage the data
storage and related devices in an EMC Enterprise Storage Network
(ESN). A very brief overview of system components is provided here.
For a much more detailed overview on components and their context, consult
EMC ControlCenter Version 5.0 Introduction (P/N 300-000-295).
[Figure: ControlCenter system overview, showing the Repository, Console(s),
and the customer storage environment]
Infrastructure Tier
The infrastructure manages and maintains the driving components of
ControlCenter: the ECC Server, Store, and Repository.
ECC Server
This is the primary driving component of ControlCenter. It performs
several management tasks among components.
Repository
This component is a relational database holding the current and
historical metadata for the ControlCenter Console and infrastructure,
as well as the extended storage environment.
Store
The Store populates the Repository with data collected from the
storage environment by the agents.
Storage Agents
These components handle communication with specific storage
systems.
Includes agents for — Symmetrix®, Celerra™, CLARiiON®,
Compaq StorageWorks, Hitachi Data System (HDS), RVA/SVA,
and IBM ESS.
Host Agents
Each operating system class requires a host agent tailored to its
environment.
Includes agents for — Windows, Solaris, HP-UX, AIX, MVS, and
Novell.
Connectivity Agents
These agents gather information about the interconnections between
storage-system hosts, devices, and networks.
Includes agents for — SNMP, switches, and SDM.
Database Agents
Two agents provide for communication with databases.
Includes agents for — Oracle and DB2.
Backup Agents
These agents assist in backup operations.
Includes agents for — Tivoli Storage Manager for: AIX, HP-UX,
Windows NT, Windows 2000, and Solaris.
Tape Agent
Provides capability for reporting on and managing tape subsystems.
Includes agents for — OS/390 for StorageTek/STK, VTS/SMS,
CA1/TMC, and RMM/DFSMS.
Other Agents
Some agents are not specific to hosts or other devices, but act on
multiple devices and applications, or the system as a whole.
Console and Client Tier
The Console is the consolidated GUI for ControlCenter. It provides a
window on the objects managed by the application, their attributes, and
their relationships to each other. See Figure 2 on page 8.
Other client applications may be part of this tier.
Installation Utility
Parts of the installation and configuration procedures are managed
through the Installation utility. This software is installed from the
ControlCenter CD set early in the installation procedure.
Console
After installation, normal ControlCenter operations are run through
the GUI provided by the Console. (Examine detail in Figure 2 on
page 8, or consult other ControlCenter manuals.)
Recognizing Problems in the Monitored Storage System
The following descriptions refer to features for dealing with problems
detected by ControlCenter in the managed objects it is monitoring.
Malfunctions in ControlCenter itself are the focus of this document,
however; see the discussion starting with the following section,
Troubleshooting Principles on page 12.
GUI Devices
The GUI has several devices that indicate error conditions, when
encountered, in the devices it monitors.
Window Panels
Tree Panel
This lower left-hand panel displays a structured list of storage
environment physical and logical objects. The list may be collapsed or
expanded in a manner similar to that in Windows Explorer.
A line-item object (for example, a physical device) may indicate a
problem by using a warning or error version of its miniature icon. See
Table 1 for descriptions of icon problem indicators.
Target Panel
As in the tree panel, warning and error icons indicate problems with
devices. The icons may be presented in a graphical layout or as part
of a table.
Icons
The Console uses small icons to indicate that particular devices are
part of the interworking storage network. These icons provide:
Device Recognition - A device that may be physically present but is
not recognized will not appear as a listed object in the tree or in
configuration dialog boxes.
Warnings and Errors Indication - When there is a problem with a
known device, the icon will change appearance to indicate that a
problem exists. See Table 1 for descriptions.
Icon Interpretation
Troubleshooting Principles
ControlCenter provides several reporting mechanisms and tools for
recording application events and alerting the user when problems
occur.
◆ Note the symptoms that prompted your attention. Record any information provided (pop-up
dialog boxes, warning icons, and so on).
◆ Use Online Help from the Console (if it is running) to interpret error messages.
◆ Find symptoms in the diagnostic tree, and review the information provided in the various paths.
Monitoring
The user should establish a schedule for regular system monitoring.
Regular use teaches the user what the problem indicators look like, so
that problems are recognized as soon as possible after they occur.
Symptom Identification
To identify the symptom of a ControlCenter problem:
1. Recognize the problem — Determine what specifically and
visibly indicates a problem (for example, a warning icon or alert
appears, an expected icon is missing, or a pop-up error window opens). Note
what is visible without (at this point) manipulating the GUI.
• In the diagnostic tree, see Symptoms (white box at top), as
well as Procedure and Context (grey box below).
2. Interpret the Symptoms — Formulate an explanation (where,
when, and how it is occurring).
• In the diagnostic tree, see Suspected Diagnosis column.
• If there is no obvious interpretation, follow diagnostic tree
tests, in Item order.
Preparation
To prepare before acting:
3. Isolate the problem — Identify where (on which hosts, between
which connections) the problem has occurred or is occurring.
4. Protect the environment — If relevant and practical, physically
or logically isolate the problem from the rest of the system.
5. Gather information — At this point, look for additional
symptoms.
• In the diagnostic tree, follow Diagnosis and Remedy paths in
the order of numbered Suspected Diagnoses.
6. Decide if help is needed.
• In the diagnostic tree, Diagnosis and Remedy instructions may
suggest you attempt a fix, or may ask you to call Customer
Support.
Diagnosis
When the diagnostic tree (see discussion beginning at page 15) does
not address a problem:
7. Explore the symptoms — navigate through (drill down) the GUI
tree and select views for further error indicators; open your logs.
8. Organize evidence gathered from exploration, and hypothesize
diagnoses.
9. Formulate remedies based on your hypotheses.
Remedy
Following confirmation of a Suspected Diagnosis, steps will be
implied, or made explicit, in Diagnosis and Remedy Procedures to:
10. Perform remedies.
11. Test as indicated to confirm success.
Using the Tree
The diagnostic tree is a set of tables, generally one for a particular
major symptom or group of symptoms indicating a malfunction.
Each table has three basic parts:
◆ At the top of a table (in a white box), symptoms are stated.
◆ Below the symptoms (in a shaded box), operator context is
identified.
◆ Below the symptoms and context, a set of numbered Suspected
Diagnoses is given, with corresponding Diagnostic and Remedy
Procedures.
[Figure: layout of a diagnostic tree table]
◆ Symptoms describes what the user saw to prompt his or her reaction.
◆ Procedure (if applicable) describes what the user was doing and/or the
user’s objective. It is given as Book, Chapter, Sections.
◆ Context (if applicable) tells where in the product interface the user
was working. It is given as, for example, Console, Tree panel, Menu Items,
Final Menu Item.
◆ Reference Number identifies each Suspected Diagnosis addressed.
◆ Suspected Diagnosis is one (of possibly several) root causes for why the
symptom occurred.
◆ Diagnostic Procedure is the path the user should follow to confirm or
deny a particular Suspected Diagnosis. It includes any work needed prior to
actual fixing procedures, such as protecting the environment and gathering
visible information.
◆ Remedy Procedure is the additional path (if any) the user should follow
to address a confirmed Suspected Diagnosis.
Infrastructure Installation
Install
Installation Won’t Launch 18
Error Dialog Boxes: “InstallerPane”, “error code 2” 19
Error Dialog Box: “ECC uninstall directory” 20
Error Dialog Box: “Failed to launch”, “Access is denied” 21
Error Dialog Box: “no disk in drive” 22
CD Fails to Eject 22
Error: “Return code 255” 23
Repository Installation Slow 23
Repository Installation Hang at 1% 24
WLA PerformanceView License Key Not Valid 25
Uninstall
Error Dialog Boxes: “InstallerPane”, “error code 2” 26
Processes and Connectivity
ECC Server
Server-Database Connection Fails 27
Server-Database Connection Dropped 28
Store
Store-Database Connection Dropped 29
Console
Console Does Not Appear on Startup 1 30
Console Does Not Appear on Startup 2 31
Hosts
Host Icon Does Not Appear in Console 35
Agents
Agent Installation Cannot Be Performed on Console 32
Agent Startup Quits 33
Host Agent Device Icon Does Not Appear in Console 36
Console Error Dialog Box: “Agent not found”, “Communication failed” 39
Preparation
Networks, Hosts, Environments: EMC ControlCenter version 5.0.3 Installation
and Configuration Guide, Chapter 2, Chapter 3.
Installation
ECC Server, Store, Repository: EMC ControlCenter version 5.0.3 Installation
and Configuration Guide, Chapter 2.
1 CD-ROM drive in use. 1. Shut down any other running media players.
2. Eject CD 1.
3. Reinsert CD 1 and restart installation.
2 Autorun not enabled. You can launch the installation from CD 1 by command line:
<CD_drive>:\ECC.exe
3 Dirty or scratched CD-ROM. 1. Clean CD: Wipe with clean cloth from inner toward outer edge.
2. If CD still appears damaged, contact EMC for replacement.
4 Installation target is nonwritable or nonexistent. 1. Close dialog boxes.
The data entry screen “Choose installation directory” should be available.
a. Open the Registry Editor with: Start, Run, regedit (or regedt32).
b. Drill down (as shown above), and select folder (5.0).
c. Create new key (right-click in right-hand pane, and select New, String
Value).
3. Restart installation.
6 One or more folders within the installation folder are missing or corrupt.
1. Remove the two registry keys shown:
HKEY_LOCAL_MACHINE
  SOFTWARE
    EMC Corporation
      EMC CONTROL CENTER
        5.0
          INSTALL_ROOT
          LAST_CDDRIVE_USED
a. Open the Registry Editor with: Start, Run, regedit (or regedt32).
b. Drill down (as shown above), select, and delete.
7 CD-ROM is expected in drive tray. This error is harmless to software and
installation procedure.
1. Insert next CD and close drive; dialog box should disappear.
2. Continue installation.
Refer to EMC ControlCenter version 5.0.3 Installation and Configuration
Guide, Chapter 2.
8 Windows Explorer CD-ROM folder is open. 1. Close Windows Explorer, or just the CD-ROM folder within it.
9 Demo32.exe process is still running. 1. Open Task Manager, select Processes tab, select Image Name list.
2. Locate and select Demo32.exe and click End Process.
11 Host may not be equipped according to EMC specifications. Check that
system configuration meets current specifications:
• Refer to EMC ControlCenter/Open Edition Support Matrix dated
4/24/02 or later, available on EMC Powerlink.
• If unavailable, refer to EMC ControlCenter Performance and Scalability
Guidelines (P/N 300-000-431 Rev A04 or later), available on EMC
Powerlink.
12 Repository host is down. 1. Leave ControlCenter installation as is—do not yet take recovery
measures.
2. Check that Repository host is running. Reboot if necessary.
3. Wait for ControlCenter installation to recognize the Repository; then
continue.
13 Slow host processor. Installation can take several minutes to advance past 1%; please wait. The
process should not take longer than 35 minutes; else, refer to Table 12 on
page 23.
14 Oracle silent installation failed. Look for Oracle installation error messages in:
\\Program Files\Oracle\Inventory\logs\InstallActions.log
and
\\Program Files\Oracle\Inventory\logs\oraInstall.err
15 Windows version is neither of the supported English or Japanese versions.
Replace host or host operating system with the EMC-supported English or
Japanese language Windows 2000.
16 The host had a previous installation of ControlCenter, but was not fully
cleaned of previous software.
1. Delete any remaining previously installed folders.
2. Reinsert CD 1.
Installation utility continues to function erroneously and will show the following dialog box if an uninstall is attempted:
3. If above error message is not found, try any other Suspected Diagnoses
available for the same symptoms.
Else, if above error message is found, continue with Item 18 and/or 19.
20 Database failure. 1. Check for an error message confirming the database is unavailable, and
respond:
a. Open server.trc
b. Locate the following error message string:
DatabaseMonitorPlugin: Database short term
failure
2. If above error message is not found, try any other Suspected Diagnoses
available for the same symptoms.
Else, if the above error message is found:
• Check Repository status using ControlCenter System Monitor:
a. Start System Monitor. (See ControlCenter System Monitor on
page 47.)
b. Under the General Status (or Server Status) tab, check whether
Repository is active or not.
• Alternatively, check status using Task Manager.
3. Check Services to ensure that at least the following are running (Status
shows Started):
OracleECCREP-HOMEAgent
OracleECCREP-HOMEDataGatherer
OracleECCREP-HOMETNSListener
OracleServicerambdb
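The service check in step 3 can be sketched as a small comparison against the four names listed above. This is only an illustration: the function name and the input format (a mapping of service name to the Status text shown in the Services panel) are assumptions, not part of ControlCenter.

```python
# Service names are copied from the list above.
REQUIRED_SERVICES = (
    "OracleECCREP-HOMEAgent",
    "OracleECCREP-HOMEDataGatherer",
    "OracleECCREP-HOMETNSListener",
    "OracleServicerambdb",
)


def services_not_started(statuses):
    """Return the required services whose status is not "Started".

    `statuses` maps service name -> status text as read from the Services
    panel (hypothetical input format). A name that is missing from the
    mapping counts as not started.
    """
    return [
        name
        for name in REQUIRED_SERVICES
        if statuses.get(name, "").strip().lower() != "started"
    ]
```

Any name returned by `services_not_started` is a candidate for a manual start from the Services panel before the remaining steps are retried.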
2. If above error message is not found, try any other Suspected Diagnoses
available for the same symptoms.
Else, if the above error message is found:
• Check component status using ControlCenter System Monitor:
a. Start System Monitor. (See ControlCenter System Monitor on
page 47.)
b. Under the General Status (or Server Status) tab, check whether
Store and Repository are active or not.
• Alternatively, check status using Task Manager.
22 ECC Server is not running. 1. Check that Console logging is enabled. (See Console Logging in EMC
ControlCenter Version 5.0.3 Release Notes.)
2. Ping ECC Server host.
If host does not respond, try other remedies for same symptoms.
3. Check for an error message:
a. Open console.trc
b. Locate the following error message:
Could not connect to the ECC server; The EMC
ControlCenter server on <hostname> may not be
ready for logins
4. If above error message is not found, try any other Suspected Diagnoses
available for the same symptoms.
Else if message is found, check service status:
• Check component status using ControlCenter System Monitor:
a. Start System Monitor. (See ControlCenter System Monitor on
page 47.)
b. Under the General Status (or Server Status) tab, check
whether server is running (“active”) or not.
• Alternatively, check component status using Task Manager.
5. If server is not running, skip to step 6.
Else if server is running, prepare for a clean restart:
a. From the Windows desktop, select Start, Settings, Control Panel,
Administrative Tools, Services.
b. Locate service EMC ControlCenter Server.
c. Stop service: Right-click on service name, and select Stop.
6. Restore system:
a. Restart server: Right-click service name (see Step 5), and select
Start.
b. Restart Console: Click Start ECC Console icon.
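Steps 3 and 4 of Item 22 depend on whether console.trc contains the login-failure message. A minimal sketch of that search follows. The message text comes from step 3b; everything else is an assumption, including the function name and the loose whitespace matching, which is used because the real line layout of console.trc is not described here.

```python
import re

# Message text from step 3b. <hostname> varies, so the pattern captures it.
# Whitespace (including line breaks) is matched loosely as a precaution.
LOGIN_FAILURE = re.compile(
    r"Could\s+not\s+connect\s+to\s+the\s+ECC\s+server;\s+"
    r"The\s+EMC\s+ControlCenter\s+server\s+on\s+(\S+)\s+"
    r"may\s+not\s+be\s+ready\s+for\s+logins"
)


def find_login_failures(trace_text):
    """Return the host names named in login-failure messages, in file order."""
    return LOGIN_FAILURE.findall(trace_text)
```

An empty result corresponds to the "message is not found" branch of step 4. The same approach could be applied to the server.trc string given in Item 20.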
Procedure
User Guide
Context
(Various)
Procedure
Installation and Configuration Guide
User Guide
Context
(Various)
5. Start (restart) Store: Right-click service name (see step 4), and select
Start.
29 Master Agent is not running. Check that the Master Agent is running.
Windows:
1. Open Task Manager.
2. Under Processes tab, attempt to locate mstragent.exe
UNIX:
1. Run ps -ef
2. Attempt to locate mstragent in output.
3. If process name does not appear, Master Agent is not running. Do the
following:
Windows:
• Restart Master Agent through Windows Services.
UNIX:
• Run <install_root>/exec/start_master.csh
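On UNIX, the check in Item 29 amounts to looking for the process name in the `ps -ef` listing. The sketch below does that on captured text. The function name is hypothetical, and how the Master Agent actually appears in the listing (bare name or full path) is an assumption, so the match is made on the last path component of each token.

```python
def process_listed(ps_output, name):
    """Return True if `name` appears in `ps -ef` output.

    Each whitespace-separated token is reduced to its last path component
    before comparison, so both "mstragent" and "/some/dir/mstragent" match.
    Limitation: an unrelated process that has `name` as an argument would
    also match.
    """
    for line in ps_output.splitlines():
        for token in line.split():
            if token.rsplit("/", 1)[-1] == name:
                return True
    return False
```

The listing can be captured with `subprocess.run(["ps", "-ef"], capture_output=True, text=True).stdout`. The same function applies to the agent process names checked in Items 31 and 34.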
30 Network error occurred. Check to see if a network error is reaching the ECC Server.
Windows, UNIX:
1. Open masteragent.log
2. Find any entries labeled network error
3. Ping IP address of device indicated in error log entry.
31 MSR Agent or MNR Agent is not running. Check whether MNR or MSR Agent is running:
Windows:
1. Open Task Manager.
2. Under Processes tab, attempt to locate mnragent.exe,
msragent.exe
UNIX:
1. Run ps -ef
2. Attempt to locate mnragent, msragent in output.
32 Network error occurred. Check whether a network error is reaching the ECC Server.
1. Open mnr_sst.log and msr_sst.log
2. Find any entries labeled “network error.”
3. Ping the IP address of device indicated in error item to see if it can be
reached.
4. If necessary, restart host or check network.
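The network-error checks in Items 30 and 32 follow one pattern: find log entries labeled "network error", then ping the device they name. The sketch below covers the first half. The log line layout is not documented here, so the code assumes only that the phrase and an IPv4 address appear on the same line; the function name is hypothetical.

```python
import re

# Loose IPv4 pattern: four dot-separated groups of one to three digits.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def network_error_addresses(log_text):
    """Return unique IPv4 addresses from lines mentioning "network error"."""
    found = []
    for line in log_text.splitlines():
        if "network error" not in line.lower():
            continue
        for address in IPV4.findall(line):
            if address not in found:
                found.append(address)
    return found
```

Each returned address is a candidate for the ping in step 3.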
33 Store errors occurred. Check Store log store.trc for any SQL errors.
34 Symmetrix agent is not running (installation problems). Check that the
Storage Agent for Symmetrix is running.
Windows:
1. Open Task Manager.
2. Under Processes tab, attempt to locate egsagent.exe
UNIX:
1. Run ps -ef
2. Attempt to locate egsagent in output.
35 Network error occurred. Check to see if a network error is reaching the ECC Server.
1. Open egs_sst.log
2. Find any entries labeled network error.
3. Ping IP address of error device to see if it can be reached.
4. If necessary, restart host or check network.
Table 26 Symmetrix Icon Does Not Appear in Console; Agent Error Message
38 1. Open <agent_abbrev>_sst.log
2. Locate the entry:
EGSAgent::AmPrimary __|Current Symms list
with no Symmetrix storage arrays listed after it.
3. If list is missing:
a. Wait until connection is established.
b. Break connection.
Table 27 Console Error Dialog Box: “Agent not found”, “Communication failed”
Procedure
User Guide
Context
(Various)
39 Master Agent is down. 1. Check Master Agent status using the Console.
2. If the Master Agent appears down from the Console, check its status
locally, using the Task Manager at the agent host.
3. If the Master Agent locally appears down, restart it from Windows
Services at the agent host.
40 Agent is down. 1. Check Master Agent status, and restart if necessary, according to Item
39.
2. Check agent status using the Console.
3. If agent appears down, skip to step 6.
4. Check agent logs on agent host for failure indicators.
5. If no messages are found in agent logs, try a new Suspected Diagnosis,
or contact EMC.
6. Restart agent from Console.
Reference
The following tables identify names and locations for specific objects
in ControlCenter.
Contents
Agent code names do not always carry over to agent executable names—in
some cases, generic agents are used. See Table 28 and Table 29 for full
comparison in EMC ControlCenter version 5.0.3.
Middle Character of Agent Code    Operating System
A AIX
G (multiple or “generic”)
H HP-UX
M MVS
S Solaris
V Novell
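The table above can be read as a simple lookup. The sketch below encodes it, assuming that agent codes are three characters long (suggested by "middle character" and by names such as egsagent, but not stated outright). The function name is hypothetical, and characters not listed in the table return None.

```python
# Middle character of the agent code -> operating system, from the table above.
MIDDLE_CHAR_TO_OS = {
    "A": "AIX",
    "G": "multiple or generic",
    "H": "HP-UX",
    "M": "MVS",
    "S": "Solaris",
    "V": "Novell",
}


def os_for_agent_code(code):
    """Return the operating system for a three-character agent code, else None."""
    if len(code) != 3:
        return None
    return MIDDLE_CHAR_TO_OS.get(code[1].upper())
```

For example, the code EGS has the middle character G, which the table lists as a generic agent.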
Executable
Code Agent Name (if applicable)
Executable
Code Agent Name (if applicable)
Short Name or
Index-only item Code Full Name
Port Settings
Port settings are requested by the installation wizard, but may be
reset. Default listening port settings are given in Table 31.
Server 5799
Repository 1521
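When a connection problem is suspected, it can help to confirm that something is listening on the default ports. The sketch below is one way to probe a port, assuming the components listen on TCP and that the defaults above have not been changed. The function name is hypothetical.

```python
import socket

# Default listening ports from the table above.
DEFAULT_PORTS = {"Server": 5799, "Repository": 1521}


def port_is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

A call such as `port_is_listening("ecc-server-host", DEFAULT_PORTS["Server"])`, where the host name is a placeholder, reports only that the port accepts connections; it does not confirm that the component behind it is healthy.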
Trace and Log Files
Each component has at least one log file. Depending on its source
code origin (see Error Levels), a log can capture up to five or six
severity levels.
Error Levels
Log files for the Java-based ECC Server, Store, and Console can be set
to capture messages at one of the threshold levels shown in Table 32;
while log files for the C++ based agents can be set as in Table 33.
Logging can be adjusted from default levels by consulting with EMC
Customer Support.
7 ERROR — failure
6 WARN — warning
5 INFO — information
0 Fatal error
1 Warning
2 Information
3 Detailed information
Trace and Log Files and Locations
Tables 34 and 35 show the trace log files available and their locations.
Error Messages
Consult the online error Help for comprehensive error message
information.
Online Error Help
Each Help message contains the message label, description, reason,
and action to take (if any). See Figure 4 on page 47.
Console
Two types of error messages are presented:
◆ Responses to unauthorized requests to delete a managed object
◆ Symmetrix error alert messages
ControlCenter System Monitor
This utility, which is installed on an ECC Server host, provides status
information for ControlCenter components.
ControlCenter Log Viewer
This utility, which runs on Windows platforms, is used to search logs
for patterns to assist log analysis, and to zip log files for delivery to
EMC Customer Support.