Veritas InfoScale 7.4.2 Fundamentals for UNIX/Linux: Administration (Lessons)
THIS PUBLICATION IS PROVIDED “AS IS” AND ALL EXPRESS OR IMPLIED CONDITIONS,
REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE
EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID. VERITAS TECHNOLOGIES LLC
SHALL NOT BE LIABLE FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES IN CONNECTION WITH THE
FURNISHING, PERFORMANCE, OR USE OF THIS PUBLICATION. THE INFORMATION CONTAINED HEREIN
IS SUBJECT TO CHANGE WITHOUT NOTICE.
No part of the contents of this book may be reproduced or transmitted in any form or by any means
without the written permission of the publisher.
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
Veritas World Headquarters, 500 East Middlefield Road, Mountain View, CA 94043 USA, +1 (650) 933 1000, www.veritas.com
For specific country offices and contact numbers, please visit our website at www.veritas.com.
Table of Contents
PART 1: Veritas InfoScale Storage 7.4.2 for UNIX/Linux: Administration
Course Introduction
About this course ................................................................................................................. Intro-2
Education and support resources ........................................................................................ Intro-7
Course introduction
This is the Course Introduction lesson in the Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration course.
This course is designed for UNIX/Linux system administrators, system engineers, technical support personnel, network/SAN administrators, and systems integration/development staff who will install, configure, manage, and integrate InfoScale Storage and InfoScale Availability.
This course does NOT prepare you for the certification exams or for the Advanced courses for either product.
After completing this course, you will be able to perform the tasks listed on this slide.
PART 2: Veritas InfoScale Availability 7.4.2 for UNIX/Linux: Administration
InfoScale Availability Basics
• Lesson 01: High Availability Concepts
InfoScale Availability Additions
• Lesson 09: Handling Resource Faults
• Lesson 10: Intelligent Monitoring Framework
• Lesson 11: Cluster Communications
The lessons in this course are displayed on the slide. This five-day class is a condensed version of the five-day Veritas InfoScale Storage 7.4.2 for UNIX/Linux: Administration course and the five-day Veritas InfoScale Availability 7.4.2 for UNIX/Linux: Administration course. It is a subset of those two courses and covers the essential basics of the two products: InfoScale Storage 7.4.2 and InfoScale Availability 7.4.2.
This topic describes Veritas Education offerings and other Veritas resources available to help you design, configure, operate, monitor, and support Veritas InfoScale 7.4.2.
The Veritas Open eXchange allows customers and users of Veritas products to network, get
help, and learn more about the industry-leading solutions. Veritas Open eXchange is a
customer-focused resource, intended to help you design and implement a utility computing
strategy to provide availability, performance, and automation for your storage, servers, and
applications. Veritas Open eXchange provides the following resources:
• Technical documents, such as articles, white papers, and product specs.
• Interactive services, such as the discussion forum, where members can discuss current
topics, share tips and tricks, and help one another troubleshoot problems.
MyVeritas is a single destination that allows you to access all of your Veritas enterprise services and information. Visit https://www.veritas.com/support/en_US.html to view this page.
Visit the Veritas Education Services page to learn more about Veritas product training and
certification at: https://www.veritas.com/services/education-services.html
This slide displays links related to curriculum paths, Veritas certification, and other training-related information.
• Curriculum Paths: Backup & Recovery, Information Governance, Storage & Availability.
• Get Certified in InfoScale and other Veritas products.
• View FAQs about Education Services.
• Manage your training transcript and print certificates of completion by signing in to the
Veritas Learning Portal.
End of presentation
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
This is the Installing and Licensing InfoScale lesson in the Veritas InfoScale 7.4.2
Fundamentals for UNIX/Linux: Administration course.
The table on this slide lists the topics and objectives for this lesson.
The InfoScale family of products draws on Veritas' long heritage of world-class storage
management and availability solutions.
InfoScale Foundation: the base offering targeting storage management (mapped from the earlier DMP, vDMP, and SF* offerings).
The earlier SFHA offerings mapped to the new InfoScale family are displayed on the slide. For
additional information about the Veritas InfoScale family, refer to:
https://sort.veritas.com/infoscale/intro
Note: While Operations Manager maps to the InfoScale Operations Manager, it is not a stand-
alone product. Instead, it is a graphical management interface for the InfoScale family and is
available free of charge to InfoScale customers.
• 1-256 TB file system
• ALUA and generic ALUA array support
• Data Management APIs
• Device names consistent across cluster nodes
• Device names using Array Volume IDs
• Dirty region logging
• Dynamic LUN expansion
• Dynamic Multi-Pathing with intelligent pathing for EMC VPLEX
• Enclosure-based naming
• Encryption of data at rest (FIPS certified)
• Fault tolerant file system
• File Change Log
• File system migration from EXT4 to VxFS
• File system snapshots
• Import cloned LUN
• Install and configure InfoScale enterprises through VIOM
• Installer: required patches for OS updates
• Installer-SORT integration
• iSCSI device support
• Keyless licensing
• Mount lock
• Named data streams
• Online file system defragmentation
• Online file system grow and shrink
• Online patching of selected Volume Manager packages
• Online relayout
• Online volume grow and shrink
• Operations Manager
• Partitioned directories
• Public cloud support (AWS infrastructure)
• Quality of Service (QoS) for applications
• Site awareness with remote mirrors
• SmartIO support for vMotion with DMP for VMware with SmartPools
• SmartTier
• SSD device exploitation
• Support for Erasure Coded volumes
• Support for native 4K sector size storage devices
• Thin storage reclamation using UNMAP
This slide displays information about the InfoScale Foundation features for Linux. For
additional information, refer to: https://sort.veritas.com/product_features
This slide displays information about the InfoScale Storage features for Linux. For additional
information, refer to: https://sort.veritas.com/product_features
This slide displays information about the InfoScale Availability features for Linux. For
additional information, refer to: https://sort.veritas.com/product_features
• Installer: one-click default install for fresh installs
• Rolling upgrade
• Sybase ASE CE integration
• Oracle RAC integration
This slide displays information about the InfoScale Enterprise features for Linux. For additional
information, refer to: https://sort.veritas.com/product_features
Slide graphic: InfoScale Foundation transitions and co-deployment paths for DMP, vDMP, and SF*.
This slide displays the flow of transitions and how InfoScale products are co-deployed.
• Full upgrade
• Phased upgrade
• Rolling upgrade
• Manual upgrade
• AI (Automated Installer)
• ZFS BEs (Boot Environments)
• Kickstart (RHEL)
• YUM (RHEL)
• Satellite server (RHEL)
• NIM
• NIMADM
• ADI
This slide displays information about the native installer support for InfoScale.
NOTE:
• Using the Ansible automation tool, you can deploy, configure, and administer the InfoScale product and its features.
• For more information, refer to the links provided in the Notes section.
You can deploy the Veritas InfoScale product using automation tools such as Ansible, Chef, and Puppet.
• For additional information about Ansible Guides, refer to:
https://sort.veritas.com/public/infoscale/ansible/docs/Infoscale7.4.1_Ansible_Support_Linux_Guide.pdf
https://sort.veritas.com/utility/ansible
https://www.veritas.com/content/support/en_US/doc/109864724-141543589-0/v135387664-141543589
• For additional information about Chef Guides, refer to:
https://www.veritas.com/support/en_US/doc/infoscale_chef_deploy_unix
https://www.veritas.com/content/support/en_US/doc/infoscale_chef_deploy_unix
You can use the Veritas SORT web page to download the InfoScale product guides and patches related to the latest and previous releases.
For additional information about SORT, refer to:
https://sort.veritas.com/
https://sort.veritas.com/mobile_apps
SORT resources to review before installations or upgrades:
• Product features
• Products and platforms lookups
• Documents
• InfoScale and SFHA product information
• Future platform and feature plans
This slide displays information about the pre-installation tasks you need to perform before
installing InfoScale.
The details about the InfoScale SCL, HCL, and licensing information are listed on the slide.
For additional information, refer to:
• InfoScale 7.4.2 SCL (Linux): https://www.veritas.com/content/support/en_US/doc/infoscale_scl_742_lin
• InfoScale 7.4.2 SCL (Windows): https://sort.veritas.com/DocPortal/pdf/infoscale_scl_742_win
• InfoScale 7.4.2 HW list: https://www.veritas.com/content/support/en_US/doc/infoscale_hcl_73_731_74_unix
• InfoScale 7.4.2 docs list: https://sort.veritas.com/sitemap/document/28
− Storage HBAs
− Use shared storage with SCSI 3 support for data protection
• Configure systems' software identically for HA solutions:
− Operating system version and patch level
− Kernel, networking and configuration files
Note that some hardware variability may be appropriate, for example, to meet different
workload requirements among HA services.
You can perform installation pre-checks in three ways:
• Automatically, using the SORT Data Collectors
• Manually, using the checklist downloaded from SORT
• Using the installer pre-check option
To perform pre-checks using SORT Data Collectors, perform the following steps:
3. Run the data collection tool. The data collection tool analyzes your systems and stores the
results in an XML file.
Information required:
• Product to install
• Product component to install

./installer -precheck sys1 sys2

Veritas InfoScale Storage and Availability Solutions Precheck Program
        sys1 sys2
 1) Veritas InfoScale Foundation
 2) Veritas InfoScale Availability
 3) Veritas InfoScale Storage
 4) Veritas InfoScale Enterprise
Keep the following information handy before you start the installation:
Before installing the InfoScale product, keep the following information ready:
• The system name with the fully-qualified domain name
• The product license key if there are no plans to use keyless licensing
• The cluster name and cluster ID (HA products only)
• The public NIC device names (HA products only)
• The private heartbeat NIC device names (HA products only)
Licensing options (slide graphic): select the product, then either add a valid key (Enterprise keys, provided as .SLF files) or use keyless licensing.
Keyless licensing: after 60 days, warning messages are written to syslog every four hours, and hosts must be managed by Veritas InfoScale Operations Manager to be license-compliant.
In releases up to 7.3.x, text-based license keys are used. From 7.4 onward, .SLF license key certificates are issued in the form of license key files. For more information, refer to the 7.4.2 InfoScale Install and License Guides:
Linux: https://www.veritas.com/content/support/en_US/doc/109508799-141543583-0/v23679202-141543583
Windows: https://www.veritas.com/content/support/en_US/doc/109356204-141924754-0/wxrt-tot_v55825087-141924754
For Enterprise licensing (.SLF key): you need to purchase the .SLF key for an Enterprise installation.
For keyless licensing:
• After 60 days, warning messages are written to syslog every four hours.
• Hosts must be managed by Veritas InfoScale Operations Manager to be license-compliant.
• Use vxkeyless for setting and changing the product level.
In releases up to 7.3.x, text-based license keys are used. From 7.4 onward, .SLF license key certificates are issued in the form of license key files. For more information, refer to the 7.4.2 InfoScale Install and License Guides:
Linux: https://www.veritas.com/content/support/en_US/doc/109508799-141543583-0/v23679202-141543583
Windows: https://www.veritas.com/content/support/en_US/doc/109356204-141924754-0/wxrt-tot_v55825087-141924754
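As an illustration of the vxkeyless utility mentioned above, a minimal sketch of checking and setting the keyless product level follows; the ENTERPRISE keyword is an example only, so list the valid levels for your release first:
/opt/VRTSvlic/bin/vxkeyless displayall      # list the product levels available on this host
/opt/VRTSvlic/bin/vxkeyless display         # show the product level currently set
/opt/VRTSvlic/bin/vxkeyless set ENTERPRISE  # set the keyless product level (example keyword)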
If your licenses are monitored using Telemetry, you gain the following benefits:
• License-usage compliance requirements (by customer) are met.
• You can proactively plan for license renewals.
• You can view a consolidated, platform-wise report of deployment and license information on one console.
Installer
./installer -version sys1
The -version option is used to check the status of installed products on the system. The installer also provides an option to perform a detailed post-installation check.
This slide displays information about the tools required to perform post-install checks.
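As a sketch of the detailed post-installation check mentioned above (assuming the -postcheck option of the common installer applies to your release; sys1 and sys2 are the placeholder host names used earlier):
./installer -postcheck sys1 sys2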
VIOM CLI
For custom solution package and InfoScale deployments using the VIOM Distribution Manager add-on,
refer to:
https://www.veritas.com/content/support/en_US/doc/120571146-141764001-1
https://www.veritas.com/content/support/en_US/doc/120571146-141764001-0/v121755920-141764001
The user interfaces for managing InfoScale products are Veritas InfoScale Operations Manager (VIOM) and the command-line interface (CLI). VIOM is suitable for large numbers of clusters and for the management of all Storage Foundation products. The CLI is suitable for local system or local cluster management.
For custom solution package and InfoScale deployments using the VIOM Distribution Manager
add-on, refer to:
https://www.veritas.com/content/support/en_US/doc/120571146-141764001-1
https://www.veritas.com/content/support/en_US/doc/120571146-141764001-0/v121755920-141764001
• ESXi:
– v.6.5 U3
– v.6.7 U3
• Hyper-V: 2016
• Nutanix: AOS 5.10.5
This slide displays the InfoScale upgrade path from the previous releases to the current
InfoScale 7.4.2 release.
After completing this topic, you will be able to explain the InfoScale support
for cloud environments and solutions.
The Veritas InfoScale product suite helps you make your applications highly available and
resilient to disruptions. The applications may reside in your on-premises data centers or in
public, private, or hybrid cloud environments. With InfoScale, you can combine cost-effective
platform independence with consistent high application performance and availability. Veritas
supports InfoScale configurations in the following cloud environments:
• Amazon Web Services (AWS)
• Microsoft Azure
• Google Cloud Platform (GCP)
• OpenStack
• Nutanix
• Veritas InfoScale Availability agents provide monitoring and failover capabilities for the networking, storage, and
replication resources that are associated with an application in the cloud.
• Using these components you can configure an application for:
– Migration from on-premises to cloud or from cloud to cloud.
– High availability or disaster recovery within a cloud.
– Disaster recovery from on-premises to cloud or across clouds.
Veritas InfoScale Storage components allow you to manage various kinds of storage, including SAN, local flash, SSD, DAS, and cloud S3 targets. Veritas InfoScale Availability agents provide monitoring and failover capabilities for the networking, storage, and replication resources that are associated with an application in the cloud. Using these components, you can configure an application for:
• Migration from on-premises to cloud or from cloud to cloud
• High availability or disaster recovery within a cloud
• Disaster recovery from on-premises to cloud or across clouds
For additional information about the InfoScale 7.4.2 support for cloud environments, refer to:
https://www.veritas.com/support/en_US/doc/130803809-141542355-0/index
Veritas InfoScale Enterprise enables organizations to provision and manage storage and
provides high availability (HA) for business-critical applications. Storage provisioning is
independent of hardware types or locations with predictable quality-of-service by identifying
and optimizing critical workloads. It increases storage agility and enables you to work with
and manage multiple types of storage to achieve better ROI without compromising on
performance and flexibility.
For application HA, InfoScale Enterprise monitors an application to detect any application or node failure and brings the application services up on a target system in case of a failure. You can deploy the InfoScale solution from the AWS and Azure marketplaces. For more information, refer to the following links:
AWS deployment:
• https://aws.amazon.com/marketplace/pp/Veritas-Technologies-LLC-Veritas-InfoScale-Enterpr/B07CYGD14V
• https://s3.amazonaws.com/veritas-infoscale-7.3.1/CFTs/Veritas_Infoscale731_CFT_Deployment_Guide.pdf
Azure deployment:
• https://www.veritas.com/content/support/en_US/doc/infoscale_azure_deploy_armst_lin
• https://azuremarketplace.microsoft.com/en-us/marketplace/eca/3909
− For InfoScale Storage (only): https://azuremarketplace.microsoft.com/en-us/marketplace/eca/3908
− For other InfoScale solutions in Azure: https://azuremarketplace.microsoft.com/en-us/marketplace/eca?page=1&search=veritas%20InfoScale
After completing this topic, you will be able to perform basic installation
and configuration of InfoScale Storage.
Storage Foundation Cluster File System (SFCFS): Includes Cluster File System (CFS) and Cluster Volume Manager (CVM). In addition to all InfoScale Storage functionality, the features unlocked by this component are displayed on the slide.
Storage Foundation (SF) - Includes Veritas File System (VxFS) and Veritas Volume Manager
(VxVM). In addition to all InfoScale Foundation functionality, this component unlocks the
features that are displayed on the slide. Storage Foundation Cluster File System (SFCFS) -
Includes Cluster File System (CFS) and Cluster Volume Manager (CVM). In addition to all
InfoScale Storage functionality, the features unlocked by this component are displayed on the
slide.
Keep the following information handy before you start the installation:
The following information is required to install and configure the InfoScale Storage SF
component:
• The system name with the fully-qualified domain name.
• The product license key if there are no plans to use keyless licensing.
Keep the following information handy before you start the installation:
The following information is required to install and configure the InfoScale Storage SFCFS
component:
• The system name with the fully-qualified domain name.
• The product license key if there are no plans to use keyless licensing.
• The cluster name and cluster ID.
• Network interfaces for cluster interconnect heartbeat links.
• Disks or Coordination Point Server information.
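As a minimal sketch of installing and configuring an InfoScale Storage component from the command line (the menu flow and options are assumptions to verify against your release; sys1 and sys2 are the placeholder host names used earlier):
./installer -precheck sys1 sys2    # verify prerequisites on both nodes
./installer                        # install, selecting Veritas InfoScale Storage and the SF or SFCFS component
./installer -configure sys1 sys2   # configure the component after installation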
After completing this topic, you will be able to perform basic installation
and configuration of InfoScale Availability.
Veritas InfoScale Availability keeps critical business services up and running. It provides
resiliency across physical and virtual environments and delivers high availability with a robust
software-defined approach. It also maximizes IT service continuity.
InfoScale Availability
• Adaptive HA
• FireDrill
• Global Cluster Option (GCO)
• Intelligent Monitoring Framework (IMF)
• Just In Time (JIT) target VM Availability (VMware) for Planned Failover
• Priority-based failover
• Replication agents
• Virtual Business Service
− Storage HBAs
− Use shared storage with SCSI 3 support for data protection
• Configure systems’ software identically for HA solutions:
− Operating system version and patch level
− Kernel, networking and configuration files
After completing this topic, you will be able to perform a basic upgrade of
the SFHA component of InfoScale Enterprise.
The hardware and software recommendations for InfoScale Enterprise are as follows:
• Eliminate single points of failure
• Provide redundancy for:
− Public network interfaces and infrastructures
− HBAs for shared storage (Fibre or SCSI)
• Configure systems’ hardware identically for HA solutions:
− System hardware
− Network interface cards
− Storage HBAs
− Use shared storage with SCSI 3 support for data protection
• Configure systems’ software identically for HA solutions:
− Operating system version and patch level
− Kernel, networking and configuration files
• The system name with the fully-qualified domain name
• The product license key if there are no plans to use keyless licensing
• The cluster name and cluster ID
• Network interfaces for cluster interconnect heartbeat links
• The cluster virtual IP address, NIC, and netmask
• Setting cluster secure mode communication
• VCS user names and passwords (default admin / password)
• SMTP server host name
• SNMP console host name, trap port, and message levels
Ensure that you complete the following tasks before upgrading to InfoScale:
• Review the Veritas InfoScale Release Notes
• Review the Veritas Technical Support website for additional information
• Review product documentation that includes but is not limited to the InfoScale
Configuration and Upgrade Guide, the HCL, the SCL, and other product guides.
• Make sure that all users are logged off and that all major user applications are properly
shut down
• Make sure that you have created a valid backup
• Ensure that you have enough file system space to upgrade
• For any startup scripts in /etc/init.d/, comment out any application commands or processes
that are known to hang if their file systems are not present
• Schedule sufficient outage time and downtime for the upgrade
• Make sure that the file systems are clean before upgrading
Keep the following information handy before you start the installation:
• The system names with the fully-qualified domain name
• The cluster name
• The product license key if there are no plans to use keyless licensing
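As a hedged sketch of a typical full upgrade flow using the common installer (assuming the -precheck and -upgrade options apply to your release; sys1 and sys2 are placeholder host names):
./installer -precheck sys1 sys2   # confirm that the systems meet the upgrade requirements
./installer -upgrade sys1 sys2    # upgrade the InfoScale stack on both nodes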
• Reference materials
– Veritas InfoScale Enterprise Administrator’s Guide
– https://sort.veritas.com
– http://www.veritas.com/support
– InfoScale 7.4.2 Document list:
• https://sort.veritas.com/sitemap/document/25
• https://www.veritas.com/content/support/en_US/doc/79618328-141543577-0/v89676408-141543577
– InfoScale support for Cloud environments: https://www.veritas.com/support/en_US/doc/130803809-141542355-0/index
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
A. RHEL 7.7
B. RHEL 8.1
C. CentOS 7.7
D. RHEL 7.7 & 8.1 and CentOS 7.7 & 8.1
A. RHEL 7.7
B. RHEL 8.1
C. CentOS 7.7
D. RHEL 7.7 & 8.1 and CentOS 7.7 & 8.1
The correct answer is D. InfoScale 7.4.2 supports both RHEL and CentOS for both 7.7 and 8.1 releases.
A. AWS, Azure
B. GCP, OpenStack
C. VMware (ESXi), Nutanix
D. All the above
A. AWS, Azure
B. GCP, OpenStack
C. VMware (ESXi), Nutanix
D. All the above
The correct answer is D. The InfoScale product family supports all of the above cloud platforms.
End of presentation
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
This is the Virtual Objects lesson in the Veritas InfoScale 7.4.2 Fundamentals for UNIX/Linux:
Administration course.
Topic: Operating system storage devices and virtual data storage
Objectives:
• Recall basic storage terminology.
• Explain the structural characteristics of a disk placed under VxVM control.
Topic: Volume Manager (VxVM) storage objects
Objective: Identify the virtual objects that are created by VxVM to manage data storage, including disk groups, VxVM disks, subdisks, plexes, and volumes.
Topic: VxVM volume layouts and RAID levels
Objective: Identify virtual storage layout types used by Volume Manager (VxVM) to remap address space.
The table on this slide lists the topics and objectives for this lesson.
This is the Operating system storage devices and virtual data storage topic.
Each UNIX flavor supported by Storage Foundation has a unique way of detecting and using
storage devices. Some platforms, such as Solaris and Linux, use a partition table and disk
partitions to organize data on the physical disks. Other platforms such as AIX and HP-UX, use
OS-native logical volume management software to detect disks as physical volumes.
Storage Foundation hides the complexity of the device management layer by introducing a
virtual data layer that works the same on all the supported UNIX platforms. The way Volume
Manager uses disks to organize data is explained in detail later in this lesson.
However, the key point to note is that the Volume Manager can only use a device if it is
recognized by the operating system on the Storage Foundation host. Therefore, if a disk
device is not visible in the Volume Manager, you first have to ensure that the operating system
detects it correctly.
SCSI disks:
/dev/sdf (no slice)
/dev/sdf3
Linux IDE disks:
/dev/hda (no slice)
/dev/hda8
Use the following OS-specific commands to list storage devices on individual platforms. Refer
to manual pages for specific command syntax.
Solaris
You locate and access the data on a physical disk by using a device name that specifies the
controller, target ID, and disk number. A typical device name uses the format: c#t#d#.
• c# is the controller number.
• t# is the target ID.
• d# is the logical unit number (LUN) of the drive attached to the target.
If a disk is divided into partitions, you also specify the partition number in the device name:
s# is the partition (slice) number. For example, the device name c0t0d0s1 is connected to
controller number 0 in the system, with a target ID of 0, physical disk number 0, and partition
number 1 on the disk.
HP-UX
Traditionally, you locate and access the data on a physical disk by using a device name that
specifies the controller, target ID, and disk number. A typical traditional device name uses the
format: c#t#d#.
• c# is the controller number.
• t# is the target ID.
• d# is the logical unit number (LUN) of the drive attached to the target.
AIX
Every device in AIX is assigned a location code that describes its connection to the system.
The general format of this identifier is AB-CD-EF-GH, where the letters represent decimal
digits or uppercase letters. The first two characters represent the bus, the second pair identify
the adapter, the third pair represent the connector, and the final pair uniquely represent the
device. For example, a SCSI disk drive might have a location identifier of 04-01-00-6,0. In this
example, 04 means the PCI bus, 01 is the slot number on the PCI bus occupied by the SCSI
adapter, 00 means the only or internal connector, and the 6,0 means SCSI ID 6, LUN 0.
However, this data is used internally by AIX to locate a device. The device name that a system
administrator or software uses to identify a device is less hardware dependent. The system
maintains a special database called the Object Data Manager (ODM) that contains essential
definitions for most objects in the system, including devices. Through the ODM, a device
name is mapped to the location identifier. The device names are referred to by special files
found in the /dev directory. For example, the SCSI disk identified previously might have the
device name hdisk3 (the fourth hard disk identified by the system). The device named hdisk3
is accessed by the file name /dev/hdisk3.
If a device is moved so that it has a different location identifier, the ODM is updated so that it
retains the same device name, and the move is transparent to users. This is facilitated by the
physical volume identifier stored in the first sector of a physical volume. This unique 128-bit
number is used by the system to recognize the physical volume wherever it may be attached
because it is also associated with the device name in the ODM.
Linux
On Linux, device names are displayed in the format:
• sdx[N]
• hdx[N]
In the syntax, sd refers to a SCSI disk, and hd refers to an EIDE disk.
• x is a letter that indicates the order of disks detected by the operating system. For
example, sda refers to the first SCSI disk, sdb refers to the second SCSI disk, and so on.
• N is an optional parameter that represents a partition number in the range 1 through 16.
For example, sda7 references partition 7 on the first SCSI disk.
Primary partitions on a disk are 1, 2, 3, 4; logical partitions have numbers 5 and up. If the
partition number is omitted, the device name indicates the entire disk.
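The OS-specific commands from the slide are not reproduced in these notes; as a hedged illustration, commonly used commands for listing disks include the following (verify the exact syntax in your platform's manual pages):
format                # Solaris: list and inspect disks
ioscan -fnC disk      # HP-UX: list disk devices
lsdev -Cc disk        # AIX: list disk devices
lsblk                 # Linux: list block devices
fdisk -l              # Linux: list disks and their partitions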
Physical disks/LUNs
• Disk array: A system containing multiple physical disks used to create LUNs
• LUN: A logical representation of storage space that is recognized by the
operating system as a uniquely identifiable storage device
• Multipathing: Multiple access routes to LUNs in a disk array to achieve
performance and redundancy
Note: Throughout this course, the term disk is used to mean either a physical
disk or LUN.
Disk arrays
Reads and writes on unmanaged physical disks can be a relatively slow process, because disks
are physical devices that require time to move the heads to the correct position on the disk
before reading or writing. If all the read and write operations are performed to individual
disks, one at a time, the read-write time can become unmanageable.
A disk array is a collection of physical disks. Performing I/O operations on multiple disks in a
disk array can improve I/O speed and throughput.
Hardware arrays present disk storage to the host operating system as LUNs. A LUN can be
made up of a single physical disk, a collection of physical disks, or even a portion of a physical
disk. From the operating system point of view, a LUN corresponds to a single storage device.
Multipathing
Some disk arrays provide multiple ports to access disk devices. These ports, coupled with the
host bus adaptor (HBA) controller and any data bus or I/O processor local to the array,
compose multiple hardware paths to access the disk devices. This is called multipathing.
In a multipathing environment, a single storage device may appear to the operating system as
multiple storage devices. Special multipathing software is usually required to administer
multipathed storage devices. The Veritas Dynamic Multi-Pathing (DMP) product, which is part of the Storage Foundation software, provides seamless management of multiple access paths to storage devices in heterogeneous operating system and storage environments.
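To make this concrete, here is a minimal sketch of inspecting multipathed devices with the DMP administration utility; the enclosure and DMP node names are illustrative (emc0_dd5 is the example device used later in this course):
vxdmpadm listenclosure all                   # list the enclosures (arrays) that DMP has discovered
vxdmpadm getsubpaths dmpnodename=emc0_dd5    # show all paths to a single DMP device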
Twelve array-based LUNs
In an array, the LUNs are a virtual presentation. Therefore, you cannot know where in the
array the actual data will be put. That means you have no control over the physical conditions.
The array in the slide contains slots for 14 physical disks, and the configuration places 12
physical disks in the array. These physical disks are paired together into 6 mirrored RAID
groups. Across these RAID groups, 12 logical units, or LUNs, are created. These LUNs appear to hosts
as SAN-based SCSI disks. The remaining two disks are used as spares in case one of the active
disks fails.
Applications and operating system utilities that interact with volumes work in the same way as with physical disks. All users and
applications access volumes as contiguous address space using special device files in a
manner similar to accessing a disk partition.
Volumes have block and character device nodes in the /dev tree. You can supply the name of
the path to a volume in your commands and programs, in your file system and database
configuration files, and in any other context where you would otherwise use the path to a
physical disk partition.
CDS disk (default) layout:
• Offset (128K): OS-reserved areas, including platform blocks, VxVM ID blocks, and AIX and HP-UX coexistence labels
• Private region (32 MB): metadata
• Public region: user data
• Private region: The private region stores metadata that VxVM uses to manage the disk, such as disk headers, configuration copies, and kernel logs, and introduces only a small amount of storage overhead.
• Public region: The public region consists of the remainder of the space on the disk. The public region represents the available space that Volume Manager can use to assign to volumes and is where an application stores data. Volume Manager never overwrites this area unless specifically instructed to do so.
CDS (default):
• Private and public regions on a single disk slice
• Portable between different operating systems
• Unsuitable for OS boot partitions
Sliced:
• Private and public regions on separate disk slices
• Not portable between different operating systems
• Suitable for OS boot partitions
hpdisk:
• Private and public regions at specific disk offsets
• Not portable between different operating systems
• Suitable for HP-UX boot partitions (not AIX)
In addition to the default CDS disk format, Volume Manager supports other platform-specific
disk formats. These disk formats are used for bringing the boot disk under VxVM control on
operating systems that support that capability.
On platforms that support bringing the boot disk under VxVM control, CDS disks cannot be
used for boot disks. CDS disks have specific disk layout requirements that enable a common
disk layout across different platforms, and these requirements are not compatible with the
particular platform-specific requirements of boot disks. Therefore, when placing a boot disk
under VxVM control, you must use a non-default disk format (sliced on Solaris and Linux,
hpdisk on HP-UX).
For non-boot disks, you can convert CDS disks to other disk layout formats and vice versa by re-initializing the disk with the desired format or by using the vxcdsconvert utility (to convert a disk to the CDS format).
After completing this topic, you will be able to identify the virtual objects
created by VxVM to manage data storage, including disk groups, VxVM
disks, subdisks, plexes, and volumes.
Slide diagram: VxVM disks acctdg01, acctdg02, and acctdg03 are created on OS disks Disk1, Disk2, and Disk3; each VxVM disk is divided into subdisks (acctdg01-01 and acctdg01-02, acctdg02-01 and acctdg02-02, acctdg03-01 and acctdg03-02).
Disk groups
A disk group is a collection of VxVM disks that share a common configuration. Disks are
grouped into disk groups for management purposes, such as to hold the data for a specific
application or set of applications. For example, data for accounting applications can be
organized in a disk group called acctdg. A disk group configuration is a set of records with
detailed information about related Volume Manager objects in a disk group, their attributes,
and their connections.
Volume Manager objects cannot span disk groups. For example, a volume's subdisks, plexes,
and disks must be derived from the same disk group as the volume. You can create additional
disk groups as necessary. Disk groups enable you to group disks into logical collections. Disk
groups and their components can be moved as a unit from one host machine to another.
Volume Manager disks
A Volume Manager (VxVM) disk represents the public region of a physical disk that is under
Volume Manager control. Each VxVM disk corresponds to one physical disk. Each VxVM disk
has a unique virtual disk name called a disk media name. The disk media name is a logical
name used for Volume Manager administrative purposes. Volume Manager uses the disk
media name when assigning space to volumes. A VxVM disk is given a disk media name when
it is added to a disk group.
Plexes
A plex consists of one or more subdisks located on one or more physical disks. The length of a plex is determined by the last block that can be read or written on the last subdisk in the plex.
Default plex name: volume_name-##
Volumes
A volume is a virtual storage device that is used by applications in a manner similar to a
physical disk. Due to its virtual nature, a volume is not restricted by the physical size
constraints that apply to a physical disk. A VxVM volume can be as large as the total of
available, unreserved free physical disk space in the disk group. A volume consists of one or
more plexes.
After completing this topic, you will be able to identify virtual storage
layout types used by VxVM to remap address space.
Volume layouts shown on the slide: Concatenated, Striped (RAID-0), and RAID-5.
Volume layouts
RAID levels correspond to volume layouts. A volume’s layout refers to the organization of
plexes in a volume. Volume layout is the way plexes are configured to remap the volume
address space through which I/O is redirected at run-time. Volume layouts are based on the
concepts of disk spanning, redundancy, and resilience.
Disk spanning
Disk spanning is the combination of disk space from multiple physical disks to form one logical
drive. Disk spanning has two forms:
• Concatenation: Concatenation is the mapping of data in a linear manner across two or
more disks.
• RAID-5: RAID-5 provides redundancy through the use of parity. If part of a volume fails, the data on that portion of the failed volume can be re-created from the remaining data and parity information.
A RAID-5 volume uses striping to spread data and parity evenly across multiple disks in an array. Each stripe contains a parity stripe unit and data stripe units. Parity can be used to reconstruct data if one of the disks fails. In comparison to the performance of striped volumes, write throughput of RAID-5 volumes decreases, because parity information needs to be updated each time data is written. However, in comparison to mirroring, the use of parity reduces the amount of space required.
Erasure coding is a volume layout whose main advantage is storage savings. Mirroring can provide fault tolerance just like erasure coding but has a much larger storage overhead, so erasure coding is treated as a feature rather than a new product area.
Erasure coding is a new feature available as a technology preview in Veritas InfoScale for
configuration and testing in non-production environments. It is supported in DAS, SAN, FSS,
and standalone environments.
As storage systems expand and become more complex, traditional data protection
mechanisms are proving to be inadequate against failures. Erasure coding offers a more
robust solution in redundancy and fault tolerance for critical storage archives. In erasure
coding, data is broken into fragments, expanded and encoded with redundant data pieces and
stored across different locations or storage media. When one or more disks fail, the data on
failed disks is reconstructed using the parity information in the encoded disks and data in the
surviving disks. Erasure coded volumes must be created using disk group version 230 or later.
When you create an erasure coded volume, Veritas InfoScale, by default, runs asynchronous initialization on the volume, ensuring that data and parities are synchronized for all regions.
The operation runs in the background allowing the volume to be available for use to
applications immediately after creation. The volumes display the SYNC state after creation
until all the regions are synchronized. This functionality is supported on both private and
shared/FSS disk groups.
You can manually initialize an erasure coded volume by setting init=zero at the time of
creating the volume. The initialization zeroes out all the regions and the volume is not
available for use until the initialization process completes.
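As a hedged sketch only (the layout keyword and parity attribute shown here are assumptions to check against the InfoScale documentation for your release), creating an erasure coded volume with explicit zero initialization might look like:
vxassist -g datadg make ecvol 10g layout=ecoded nparity=2 init=zero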
• Reference materials
– Storage Foundation Administrator’s Guide
– https://sort.veritas.com
– http://www.veritas.com/support
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
A. Volume
B. Plex
C. Partition
D. Subdisk
A. Volume
B. Plex
C. Partition
D. Subdisk
The correct answer is D. A subdisk is the smallest unit of storage in Volume Manager, which is mapped to a specific
region of a physical disk. A subdisk is defined by an offset and a length in sectors on a VxVM disk.
A. Public region
B. TOC region
C. Virtual region
D. Private region
A. Public region
B. TOC region
C. Virtual region
D. Private region
The correct answer is D. The private region stores information, such as disk headers, configuration copies, and kernel
logs, in addition to other platform-specific management areas that VxVM uses to manage virtual objects.
The correct answer is C. By default, Volume Manager uses a cross-platform data sharing (CDS) disk layout. A CDS disk is consistently recognized by all VxVM-supported UNIX platforms.
A. Volumes
B. Plexes
C. Subdisks
D. Disk groups
A. Volumes
B. Plexes
C. Subdisks
D. Disk groups
The correct answer is D. A disk group is a collection of VxVM disks that share a common configuration.
A. Concatenated
B. Striped
C. RAID-5
D. Mirrored
A. Concatenated
B. Striped
C. RAID-5
D. Mirrored
The correct answer is A. Concatenation is the mapping of data in a linear manner across two or more disks.
Concatenation allows a volume to be created from multiple regions of one or more disks if there is not enough space for
an entire volume on a single region of a disk.
End of presentation
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
This is the Creating a Volume and File System lesson in the Veritas InfoScale 7.4.2
Fundamentals for UNIX/Linux: Administration course.
Topic: Preparing disks and disk groups for volume creation
Objective: Initialize an OS disk as a VxVM disk and create a disk group by using command-line utilities.
Topic: Creating a volume and adding a file system
Objective: Create a concatenated volume, add a file system to an existing volume, and mount the file system.
Topic: Removing volumes, disks, and disk groups
Objective: Remove a volume, evacuate a disk, remove a disk from a disk group, destroy a disk group, and shred a disk.
The table on this slide lists the topics and objectives for this lesson.
This is the Preparing disks and disk groups for volume creation topic.
An enclosure, or disk enclosure, is an intelligent disk array, which permits hot swapping of
disks. With Storage Foundation, disk devices can be named for enclosures rather than for the
controllers through which they are accessed as with standard disk device naming (for
example, c0t0d0 or hdisk2).
Enclosure-based naming allows Storage Foundation to access enclosures as separate physical
entities. By configuring redundant copies of your data on separate enclosures, you can
safeguard against failure of one or more enclosures. This is especially useful in a storage area
network (SAN) that uses Fibre Channel hubs or fabric switches and when managing the
dynamic multipathing (DMP) feature of Storage Foundation. For example, if two paths
(c1t99d0 and c2t99d0) exist to a single disk in an enclosure, VxVM can use a single DMP metanode to access the disk.
You can change the device naming scheme from the command line:
vxddladm set namingscheme={ebn|osn} [persistent=<yes|no>] [lowercase=<yes|no>] \
[use_avid=<yes|no>]
If you set the use_avid option to yes, the LUNs are numbered based on the array volume
ID instead of the traditional indexing method.
You can also change the device naming scheme using the Change the disk naming scheme
option in the vxdiskadm menu.
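For example, a minimal sketch of checking and changing the naming scheme (the enclosure-based values shown are assumptions about the desired configuration):
vxddladm get namingscheme                    # display the current device naming scheme
vxddladm set namingscheme=ebn use_avid=yes   # switch to enclosure-based names that use array volume IDs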
When you initialize a disk, VxVM configuration information is written to the private region. Any partitions (other than slice 2 on the Solaris platform) that may have existed on the disk are removed.
These disks are under Volume Manager control but cannot be used by Volume Manager until
they are added to a disk group.
Note: Encapsulation is another method of placing a disk under VxVM control in which existing
data on the disk is preserved.
Changing the disk layout
To display or change the default values that are used for initializing disks, select the
Change/display the default disk layouts option in vxdiskadm:
Virtual device names, such as /dev/vx/[r]dsk/diskgroup/volume_name, replace physical locations, such as /dev/[r]dsk/device_name.
The free space in a disk group refers to the space on all disks within the disk group that has
not been allocated as subdisks. When you place a disk into a disk group, its space becomes
part of the free space pool of the disk group.
Stage three: Assign disk space to volumes
When you create volumes, space in the public region of a disk is assigned to the volumes.
Some operations, such as removal of a disk from a disk group, are restricted if space on a disk
is in use by a volume.
At least one disk in a disk group must hold a copy of the disk group definition. Therefore, you cannot remove all disks from a disk group without destroying the disk group.
Why are disk groups needed?
Disk groups assist disk management in several ways:
• Disk groups enable the grouping of disks into logical collections for a particular set of users
or applications.
• Disk groups enable data, volumes, and disks to be easily moved from one host machine to
another.
• Disk groups ease the administration of high availability environments. Disk drives can be
shared by two or more hosts, but they can be accessed by only one host at a time. If one
host crashes, the other host can take over its disk groups and therefore its disks.
• A disk group provides the configuration boundary for VxVM objects.
Reserved names:
• bootdg
• defaultdg
Slide example: on System A, bootdg = sysdg and defaultdg = acctdg; System B has a disk group named appdg.
VxVM has reserved three disk group names that are used to provide boot disk group and
default disk group functionality. The names bootdg, defaultdg, and nodg are system-wide
reserved disk group names and cannot be used as names for any of the disk groups that you
set up.
If you choose to place your boot disk under VxVM control, VxVM assigns bootdg as an alias for
the name of the disk group that contains the volumes that are used to boot the system.
The main benefit of creating a default disk group is that SF commands default to that disk
group if you do not specify a disk group on the command line. defaultdg is an alias for the disk
group name that should be assumed if the -g option is not specified to a command. You can
set defaultdg when you install Veritas Volume Manager (pre-SF 5.1) or anytime after
installation.
By default, both bootdg and defaultdg are set to nodg.
Notes
• The definitions of bootdg and defaultdg are written to the volboot file. The definition of
bootdg results in a symbolic link from the named bootdg in
/dev/vx/dsk and /dev/vx/rdsk.
• The rootdg disk group name is no longer a reserved name for VxVM versions after 4.0. If
you are upgrading from a version of Volume Manager earlier than 4.0 where the system
disk is encapsulated in the rootdg disk group, the bootdg is assigned the value of rootdg
automatically.
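As an illustration of these reserved names (a sketch; acctdg is the example default disk group from the slide):
vxdg bootdg               # display the disk group aliased as bootdg
vxdg defaultdg            # display the disk group aliased as defaultdg
vxdctl defaultdg acctdg   # set the default disk group assumed when -g is not specified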
A disk must be placed into a disk group before it can be used by VxVM. A disk group cannot
exist without having at least one associated disk. When you create a new disk group, you
specify a name for the disk group and at least one disk to add to the disk group. The disk
group name must be unique for the host machine.
Adding disks
To add a disk to a disk group, you select an uninitialized disk or a free disk. If the disk is
uninitialized, you must initialize the disk before you can add it to a disk group.
Disk naming
When you add a disk to a disk group, the disk is assigned a disk media name. The disk media name is a logical name used by Volume Manager for administrative purposes, as described earlier.
Initialize disks:
vxdisksetup -i accessname [attributes]
vxdisksetup -i hds9500-alua0_0 (EBN)
vxdisksetup -i hdisk5 (OSN: AIX)
vxdisksetup -i sdf (OSN:Linux)
From the vxdiskadm main menu, select the Add or initialize one or more disks option.
Specify the disk group to which the disk should be added. To add the disk to a new disk group,
you type a name for the new disk group. You use this same menu option to add additional
disks to the disk group.
To verify that the disk group was created, you can use vxdg list.
When you add a disk to a disk group, the disk group configuration is copied onto the disk, and
the disk is stamped with the system host ID.
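For reference, a minimal command-line sketch of creating a disk group and adding a second disk; the group and disk media names follow the appdg examples used in this lesson, and the device names sdf and sdg are hypothetical:
vxdisksetup -i sdf                  # initialize the first disk for VxVM use
vxdg init appdg appdg01=sdf         # create the disk group with its first disk
vxdisksetup -i sdg
vxdg -g appdg adddisk appdg02=sdg   # add a second disk to the disk group
vxdg list                           # verify that the disk group was created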
For example:
vxassist -g appdg make appvol 1g
When you create a volume, you indicate the desired volume characteristics, and VxVM
creates the underlying plexes and subdisks automatically. The VxVM interfaces require
minimal input if you use default settings. For experienced users, the interfaces also enable
you to enter more detailed specifications regarding all aspects of volume creation.
Before you create a volume
Before you create a volume, ensure that you have enough disks to support the layout type.
• A striped volume requires at least two disks.
• A mirrored volume requires at least one disk for each plex. A mirror cannot be on the
same disk that other plexes of the same volume are using.
To create a volume from the command line, you use the vxassist command. In the syntax:
• Use the -g option to specify the disk group in which to create the volume.
• make is the keyword for volume creation.
• volume_name is a name you give to the volume. Specify a meaningful name which is
unique within the disk group.
• length specifies the number of sectors in the volume. You can specify the length by
adding an m, k, g, or t to the length.
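Building on this syntax, hedged examples of creating volumes with non-default layouts follow; the sizes, column and mirror counts, and disk names are illustrative, reusing the appdg naming from this lesson:
vxassist -g appdg make stripevol 2g layout=stripe ncol=3 appdg01 appdg02 appdg03
vxassist -g appdg make mirrorvol 500m layout=mirror nmirror=2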
A file system provides an organized structure to facilitate the storage and retrieval of files. You
can add a file system to a volume when you create a volume or any time after you create the
volume initially.
− When a file system has been mounted on a volume, the
data is accessed through the mount point directory.
− When data is written to files, it is actually written
to the block device file:
/dev/vx/dsk/diskgroup/volume_name.
− When fsck is run on the file system, the raw device
file is checked:
/dev/vx/rdsk/diskgroup/volume_name.
To add a file system to a volume from the command line, you must create the file system,
create a mount point for the file system, and then mount the file system.
Solaris
• To create and mount a VxFS file system:
− mkfs -F vxfs /dev/vx/rdsk/datadg/datavol
− mkdir /data
− mount -F vxfs /dev/vx/dsk/datadg/datavol /data
• To create and mount a UFS file system:
− newfs /dev/vx/rdsk/datadg/datavol
− mkdir /data
− mount /dev/vx/dsk/datadg/datavol /data
HP-UX
• To create and mount a VxFS file system:
• mkfs -F vxfs /dev/vx/rdsk/datadg/datavol
• mkdir /data
• mount -F vxfs /dev/vx/dsk/datadg/datavol /data
• To create and mount an HFS file system:
• newfs -F hfs /dev/vx/rdsk/datadg/datavol
• mkdir /data
• mount -F hfs /dev/vx/dsk/datadg/datavol /data
AIX
• To create and mount a VxFS file system using mkfs:
• mkfs -V vxfs /dev/vx/rdsk/datadg/datavol
• mkdir /data
• mount -V vxfs /dev/vx/dsk/datadg/datavol /data
• To create and mount a VxFS file system using crfs:
crfs -v vxfs -d /dev/vx/rdsk/datadg/datavol -m /data -A yes
Notes:
• An uppercase V is used with mkfs; a lowercase v is used with crfs (to avoid conflict
with another crfs option).
• crfs creates the file system, creates the mount point, and updates the file systems file
(/etc/filesystems). The -A yes option requests mount at boot.
• If the file system already exists in /etc/filesystems, you can mount the file system by specifying only the mount point, for example: mount /data
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Linux
To create and mount a VxFS file system using mkfs:
mkfs -t vxfs /dev/vx/rdsk/datadg/datavol
mkdir /data
mount -t vxfs /dev/vx/dsk/datadg/datavol /data
Information Entry
Device to mount: /dev/vx/dsk/appdg/appvol
Device to fsck: /dev/vx/rdsk/appdg/appvol
Mount point: /app
File system type: vxfs
fsck pass: 2
Mount at boot: yes
Mount options: -
17
Using CLI, if you want the file system to be mounted at every system boot, you must edit the
file system table file by adding an entry for the file system. If you later decide to remove the
volume, you must remove the entry in the file system table file.
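For example, on Solaris the information in the table above corresponds to a single /etc/vfstab line similar to the following (fields: device to mount, device to fsck, mount point, file system type, fsck pass, mount at boot, mount options):
/dev/vx/dsk/appdg/appvol /dev/vx/rdsk/appdg/appvol /app vxfs 2 yes -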
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
AIX
In AIX, you can use the following commands when working with the file system table file,
/etc/filesystems:
• To view entries: lsfs mount_point
• To change details of an entry, use chfs. For example, to turn off mount at boot: chfs -A no mount_point
After completing this topic, you will be able to view disk and disk group
information and identify disk status.
18
19
21
To display detailed information about a disk, you use the vxdisk list command with the
name of the disk. With this command, you can either use the disk access name or the disk
media name together with the disk group name as shown in the following syntax:
vxdisk -g diskgroup list dm_name
vxdisk -g appdg list appdg01
Device: emc0_dd5
devicetag: emc0_dd5
type: auto
hostid: train12
disk: name=appdg01 id=1000753057.1114.train12
group: name=appdg id=1000753077.1117.train12
...
In the example output:
• Device is the VxVM name for the device path.
• devicetag is the name used by VxVM to refer to the physical disk.
• type is how a disk was discovered by VxVM. auto is the default type.
• hostid is the name of the system that currently manages the disk group to which the
disk belongs; if blank, no host is currently controlling this group.
HARDWARE_MIRROR: no
DMP_DEVICE : emc0_dd1
DDL_DEVICE_ATTR: lun
CAB_SERIAL_NO : 313635323300
ATYPE : A/A
ARRAY_VOLUME_ID: DD1
ARRAY_PORT_PWWN: 10.10.5.3:3260
ANAME : EMC
TRANSPORT : iSCSI
vxdg list
NAME STATE ID
appdg enabled,cds 969583613.1025.cassius
oradg enabled,cds 971216408.1133.cassius
24
vxlist disk
vxlist dg
25
The vxlist command is a new display command that provides a consolidated view of the SF
configuration.
To display the vxlist command output, the vxdclid daemon must be running. If this
daemon is not running, execute the following command as the root user:
/opt/VRTSsfmh/adm/dclisetup.sh
For more information on using the vxlist command, refer to the manual pages.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to display volume layout
information by using the command line.
26
Option Description
-h Hierarchical listing
-r Related records (Layered volumes)
-t Tabular listing
-A Active disk groups
-u unit Unit of measure (for size)
27
28
To display the volume, plex, and subdisk record information for a disk group:
vxprint -g diskgroup -htr -u h
In the output, the top few lines indicate the headers that match each type of output line that
follows. Each volume is listed along with its associated plexes and subdisks and other VxVM
objects.
• dg is a disk group.
• st is a storage pool (used in Intelligent Storage Provisioning).
• dm is a disk.
• v is a volume, pl is a plex, and sd is a subdisk.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
vxlist volume
TY VOLUME DISKGROUP SIZE STATUS LAYOUT LINKAGE
vol appvol appdg 1.00g healthy concat -
vol mirvol appdg 500.00m healthy concat -
vol oravol oradg 3.00g healthy striped -
vxinfo -g appdg -p
vol appvol fsgen Started
plex appvol-01 ACTIVE
vol mirvol fsgen Started
plex mirvol-01 ACTIVE
plex mirvol-02 ACTIVE
29
The vxlist command is useful in summarizing the volume information on the system. You
can also use this command to display the disks and the plexes associated with a specific
volume, using the following command options:
vxlist -s disk vol volume_name
vxlist -s disk vol appvol
disks
TY DEVICE DISK NPATH ENCLR_NAME ENCLR_SNO STATUS
disk emc0_dd1 appdg01 2 emc0 ... imported
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to remove a volume, evacuate
a disk, remove a disk from a disk group, destroy a disk group, and shred a
disk.
30
vxedit:
31
Only remove a volume if you are sure that the data in the volume is not needed, or the data is
backed up elsewhere. A volume must be closed before it can be removed. For example, if the
volume contains a file system, the file system must be unmounted. You must edit the OS-
specific file system table file manually in order to remove the entry for the file system and
avoid errors at boot. If the volume is used as a raw device, the application, such as a database,
must close the device.
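For example, assuming a volume named appvol in the appdg disk group with a file system mounted at /app, a typical removal sequence is:
umount /app
vxedit -g appdg -rf rm appvol
(Remember to remove the /app entry from the file system table file as well.)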
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
vxdiskadm:
Move volumes from a disk
CLI:
32
Evacuating a disk moves the contents of the volumes on a disk to another disk. The contents
of a disk can be evacuated only to disks in the same disk group that have sufficient free space.
To evacuate to any disk except for appdg03:
vxevac -g appdg appdg02 !appdg03
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
33
You can verify the removal by using the vxdisk list command to display disk
information. A disk that has been taken out of a disk group no longer has a disk media name
or disk group assignment but still shows a status of online.
Before the disk is taken out of the disk group:
vxdisk -o alldgs list
DEVICE TYPE DISK GROUP STATUS
emc0_dd1 auto:cdsdisk appdg01 appdg online
...
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After the disk is taken out of the disk group using the vxdg -g appdg rmdisk
appdg01 command:
vxdisk -o alldgs list
DEVICE TYPE DISK GROUP STATUS
emc0_dd1 auto:cdsdisk -- online ...
vxdiskunsetup accessname
vxdiskunsetup [-f] -o shred[=1|3|7] accessname
vxdiskunsetup emc_dd4
34
After the disk has been removed from its disk group, you can remove it from Volume Manager
control completely by using the vxdiskunsetup command. This command reverses the
configuration of a disk by removing the public and private regions that were created by the
vxdisksetup command. The vxdiskunsetup command does not operate on disks that
are active members of an imported disk group. This command does not usually operate on
disks that appear to be imported by some other host—for example, a host that shares access
to the disk. You can use the -C option to force deconfiguration of the disk, removing host
locks that may be detected.
Before the disk is uninitialized:
vxdisk -o alldgs list
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Destroy
36
Destroying a disk group permanently removes a disk group from Volume Manager control,
and the disk group ceases to exist. When you destroy a disk group, all of the disks in the disk
group are made available as empty disks. Volumes and configuration information including
the automatic configuration backups of the disk group are removed. Disk group configuration
backups are discussed later in this course. Because you cannot remove the last disk in a disk
group, destroying a disk group is the only method to free the last disk in a disk group for
reuse. A disk group cannot be destroyed if any volumes in that disk group are in use or contain
mounted file systems. The bootdg disk group cannot be destroyed.
Caution: Destroying a disk group can result in data loss. Only destroy a disk group if you are
sure that the volumes and data in the disk group are not needed.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
To destroy a disk group from the command line, use the vxdg destroy command.
Note: You can bring back a destroyed disk group by importing it with its dgid if its disks had
not been re-used for other purposes.
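For example, to destroy the appdg disk group, and, if its disks have not been reused, to bring it back later by importing it with its disk group ID (the dgid shown is illustrative):
vxdg destroy appdg
vxdg -n appdg import 969583613.1025.cassius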
• Key points
– In this lesson, you learned how to create a volume with a file system.
– You also learned about device-naming schemes, how to add a disk to a disk group, and how to view
configuration information for volumes, disk groups, and disks.
– In addition, you learned how to remove a volume, disk, and disk group.
• Reference materials
– Veritas InfoScale Installation Guide
– Veritas Storage Foundation Administrator’s Guide
– Veritas Services and Operations Readiness Tools (SORT):
https://sort.veritas.com
37
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
38
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
39
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Controller number
B. Target ID
C. Disk media name
D. Vendor ID
40
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Controller number
B. Target ID
C. Disk media name
D. Vendor ID
The correct answer is D. By default, the logical name of an enclosure is based on the Vendor ID.
41
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. A disk is initialized.
B. A disk is assigned to a disk group.
C. The space on a disk becomes part of the free space pool default disk group.
D. Disk space is assigned to volumes.
42
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. A disk is initialized.
B. A disk is assigned to a disk group.
C. The space on a disk becomes part of the free space pool default disk group.
D. Disk space is assigned to volumes.
The correct answer is A. SCSI disks are usually preformatted. After a disk is formatted, it must be initialized for use by Volume Manager. In other words, a disk must be detected by the operating system before VxVM can detect and initialize it.
43
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
44
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is C. To configure a disk for use by VxVM and write a disk header to the disk, you use the
vxdisksetup -i accessname [attributes] command.
45
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
46
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
47
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
48
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
49
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
50
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
51
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. vxdisplay
B. vxvol
C. vxconfig
D. vxprint
52
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. vxdisplay
B. vxvol
C. vxconfig
D. vxprint
The correct answer is D. The vxprint command can display information about disk groups, disk media, volumes,
plexes, and subdisks.
53
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
54
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
55
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
This appendix contains platform-specific slides. You may review them at your discretion, or you may optionally end the presentation now.
56
Step CLI
57
On AIX, to create and mount a file system, use the mkfs -V vxfs command, then the
mkdir command, and then the mount -V command.
To create and mount a file system using crfs, use the crfs -v vxfs command.
To return to the Adding a File System After Volume Creation slide, click the back button at
the bottom of the screen.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Step CLI
58
On HP, to create and mount a file system, use the mkfs -F vxfs command, then the
mkdir command, and then the mount command.
To create and mount an HFS file system, the procedure is the same, except that you use the newfs -F hfs command rather than mkfs and the -F hfs flag with the mount command.
To return to the Adding a File System After Volume Creation slide, click the back button at
the bottom of the screen.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Step CLI
59
On Linux, to create and mount a file system use the mkfs -t vxfs command, then
mkdir, and then mount -t vxfs.
To return to the Adding a File System After Volume Creation slide, click the back button at
the bottom of the screen.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Step CLI
60
On Solaris, to create and mount a file system, use the mkfs -F vxfs command, then the
mkdir command, and then the mount -F vxfs command.
To create and mount a UFS file system, the procedure is almost the same, except that you use the newfs command rather than mkfs and omit the -F flag from the mount command.
To return to the Adding a File System After Volume Creation slide, click the back button at
the bottom of the screen.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
End of presentation
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC
or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This is the Working with Volumes with Different Layouts lesson in the Veritas InfoScale 7.4.2
Fundamentals for UNIX/Linux: Administration course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Topic Objective
Creating volumes with various layouts: Create concatenated, striped, and mirrored volumes from the command line.
Allocating storage for volumes: Allocate storage for a volume by specifying storage attributes.
The table on this slide lists the topics and objectives for this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to identify the features,
advantages, and disadvantages of volume layouts supported by VxVM.
(Diagram: a 12-GB volume appvol with a single plex appvol-01, built from subdisks appdg01-01 and appdg02-02 on disks appdg01 and appdg02.)
Volume layouts
Each volume layout has different advantages and disadvantages. For example, a volume can
be extended across multiple disks to increase capacity, mirrored on another disk to provide
data redundancy, or striped across multiple disks to improve I/O performance. The layouts
that you choose depend on the levels of performance and availability required by your
system.
Concatenated layout
A concatenated volume layout maps data in a linear manner onto one or more subdisks in a
plex. Subdisks do not have to be physically contiguous and can belong to more than one VM
disk. Storage is allocated completely from one
subdisk before using the next subdisk in the span. Data is accessed in the remaining subdisks
sequentially until the end of the last subdisk. For example, if you have 12 GB of data then a
concatenated volume can logically
map the volume address space across subdisks on different disks. Addresses 0 GB to 8 GB of the volume address space map to the first 8-gigabyte subdisk, and addresses 8 GB to 12 GB map to the second 4-gigabyte subdisk. An address offset of 10 GB, therefore, maps to an
address offset of 2 GB in the second subdisk.
(Diagram: a striped plex appvol-01 with four columns built from subdisks appdg01-01, appdg02-01, appdg03-01, and appdg04-01; stripe units SU1 through SU16 are interleaved across the columns in stripes.)
A striped volume layout maps data so that the data is interleaved, or allocated in stripes,
among two or more subdisks on two or more physical disks. Data is allocated alternately and
evenly to the subdisks of a striped plex.
The subdisks are grouped into “columns.” Each column contains one or more subdisks and can
be derived from one or more physical disks. To obtain the maximum performance benefits of
striping, you should not use a single disk to
provide space for more than one column.
All columns must be the same size. The size of a column is equal to the size of the volume
divided by the number of columns. The default number of columns in a striped volume is
based on the number of disks in the disk group.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Data is allocated in equal-sized units, called stripe units, that are interleaved between the
columns. Each stripe unit is a set of contiguous blocks on a disk. The stripe unit size can be in
units of sectors, kilobytes, megabytes, or gigabytes. The default stripe unit size is 64K, which
provides adequate performance for most general purpose volumes. Performance of an
individual volume may be improved by matching the stripe unit size to the I/O characteristics
of the application using the volume.
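For example, a command of the following form creates a 2-GB striped volume with four columns and the default 64K stripe unit size specified explicitly (names and sizes are illustrative):
vxassist -g appdg make stripevol 2g layout=stripe ncol=4 stripeunit=64k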
(Diagram: a mirrored volume appvol with two plexes, appvol-01 and appvol-02, built from subdisks on VxVM disks appdg01, appdg02, and appdg03.)
considered complete.
Distribute mirrors across controllers to eliminate the controller as a single point of failure.
(Diagram: a RAID-5 plex appvol-01 with four columns built from subdisks appdg01-01, appdg02-01, appdg03-01, and appdg04-01; parity (P) stripe units are distributed across the columns along with the data stripe units.)
A RAID-5 volume layout has the same attributes as a striped plex, but one column in each
stripe is used for parity. Parity provides redundancy. Parity is a calculated value used to
reconstruct data after a failure. While data is
being written to a RAID-5 volume, parity is calculated by performing an exclusive OR (XOR)
procedure on the data. The resulting parity is then written to the volume. If a portion of a
RAID-5 volume fails, the data that was on that portion of the failed volume can be re-created
from the remaining data and parity information.
RAID-5 volumes keep a copy of the data and calculated parity in a plex that is striped across
multiple disks. Parity is spread equally across columns. Given a five-column RAID-5 where
each column is 1 GB in size, the RAID-5 volume size is 4 GB. An amount of space equivalent to
one column is devoted to parity; the remaining space is used for data. The default stripe unit
size for a RAID-5 volume is 16K. Each column must be the same length but may be made from
multiple subdisks of variable length. Subdisks used in different columns must not be located
on the same physical disk. RAID-5 requires a minimum of three disks for data and parity.
When implemented as recommended, an additional disk is required for the log. RAID-5
cannot be mirrored.
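For example, a command of the following form creates a 4-GB RAID-5 volume with five columns; vxassist adds a RAID-5 log plex by default (names and sizes are illustrative):
vxassist -g appdg make raidvol 4g layout=raid5 ncol=5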
Erasure coding is a volume layout whose main advantage is storage savings. Mirroring can provide comparable fault tolerance, but with a much larger storage overhead. For this reason, erasure coding is treated as a feature rather than as a new product area.
Erasure coding is a new feature available as a technology preview in Veritas InfoScale for
configuration and testing in non-production environments. It is supported in DAS, SAN, FSS,
and standalone environments.
As storage systems expand and become more complex, traditional data protection
mechanisms are proving to be inadequate against failures. Erasure coding offers a more
robust solution in redundancy and fault tolerance for critical storage archives. In erasure
coding, data is broken into fragments, expanded and encoded with redundant data pieces and
stored across different locations or storage media. When one or more disks fail, the data on
failed disks is reconstructed using the parity information in the encoded disks and data in the
surviving disks. Erasure coded volumes must be created using disk group version 230 or later.
When you create an erasure coded volume, Veritas InfoScale, by default, runs asynchronous initialization on the volume, ensuring that data and parities are synchronized for all regions.
The operation runs in the background allowing the volume to be available for use to
applications immediately after creation. The volumes display the SYNC state after creation
until all the regions are synchronized. This functionality is supported on both private and
shared/FSS disk groups.
You can manually initialize an erasure coded volume by setting init=zero at the time of
creating the volume. The initialization zeroes out all the regions and the volume is not
available for use until the initialization process completes.
10
Concatenation: Advantages
• Better utilization of free space: Concatenation removes the restriction on size of storage
devices imposed by physical disk size. It also enables better utilization of free space on
disks by providing for the ordering of available discrete disk space on multiple disks into a
single addressable volume.
• Simplified administration: System administration complexity is reduced because making
snapshots and mirrors uses any size space, and volumes can be increased in size by any
available amount.
Concatenation: Disadvantages
• No protection against disk failure: Concatenation does not protect against disk failure. A
single disk failure results in the failure of the entire volume.
Striping: Advantages
• Improved performance through parallel data transfer: Improved performance is obtained
by increasing the effective bandwidth of the I/O path to the data. This may be achieved by
a single volume I/O operation spanning across a number of disks or by multiple concurrent
volume I/O operations to more than one disk at the same time.
• Load-balancing: Striping is also helpful in balancing the I/O load from multiuser
applications across multiple disks.
After completing this topic, you will be able to create concatenated, striped,
and mirrored volumes from the command line.
13
Concatenated:
Striped:
Mirrored:
14
To specify different volume layouts while creating a volume from the command line using the
vxassist make command, you use the layout attribute. If you do not specify the layout
attribute, by default, vxassist creates a concatenated volume that uses one or more
sections of disk space. The layout=striped attribute designates a striped layout and the
layout=mirror-concat or the layout=mirror-stripe attributes designate a
mirrored volume layout. Note that you can also use the layout=mirror attribute to create
a mirrored volume. However, layout=mirror may result in the creation of layered volumes.
Layered volumes are covered in detail later in this lesson.
Note: To guarantee that a concatenated volume is created, include the layout=nostripe
attribute in the vxassist make command. Without the layout attribute, the default layout is used, and that default may have been changed by the creation of an /etc/default/vxassist file. The following additional attributes are used with the striped volume layout:
• ncol=n designates the number of stripes, or columns, across which the volume is
created. This attribute has many aliases. For example, you can also use nstripe=n or
stripes=n.
You can also provide an upper limit for the maximum size by specifying the
maxsize=length parameter. If the maximum possible size is higher than this upper limit,
the volume is created using the upper limit as the volume length. If the maximum possible
size is smaller than this limit, the volume is created with the maximum possible size.
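For example, illustrative commands for each of the layouts described above (volume names and sizes are arbitrary):
vxassist -g appdg make concatvol 2g layout=nostripe
vxassist -g appdg make stripevol 6g layout=stripe ncol=3
vxassist -g appdg make mirvol 2g layout=mirror-concat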
Concatenated:
vxassist -g appdg make appvol 10g
16
After completing this topic, you will be able to allocate storage for a volume
by specifying storage attributes and ordered allocation.
17
18
VxVM selects the disks on which each volume resides automatically, unless you specify
otherwise. To create a volume on specific disks, you can designate those disks when creating a
volume. By specifying storage attributes when you create a volume, you can:
• Include specific disks, controllers, enclosures, targets, or trays to be used for the volume.
• Exclude specific disks, controllers, enclosures, targets, or trays from being used for the
volume.
• Mirror volumes across specific controllers, enclosures, targets, or trays. (By default, VxVM
does not permit mirroring on the same disk.)
By specifying storage attributes, you can ensure a high availability environment. For example,
you can only permit mirroring of a volume on disks connected to different controllers and
eliminate the controller as a single point of failure. To exclude a disk, controller, enclosure,
target, or tray, you add the exclusion symbol (!) before the storage attribute. For example, to
exclude appdg02 from volume creation, you use the format: !appdg02.
Note: When creating a volume, all storage attributes that you specify for use must belong to
the same disk group. Otherwise, VxVM does not use these storage attributes to create a
volume.
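For example, assuming disks appdg01 and appdg02 in the appdg disk group (in shells where ! triggers history expansion, quote or escape the exclusion attribute):
vxassist -g appdg make appvol 2g appdg01 appdg02
vxassist -g appdg make appvol 2g !appdg02
vxassist -g appdg make appvol 2g layout=mirror-concat mirror=ctlr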
19
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
20
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
21
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Concatenation
B. Striping
C. Mirroring
D. RAID-5
22
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Concatenation
B. Striping
C. Mirroring
D. RAID-5
The correct answer is C. A mirrored volume layout consists of more than one plex, each duplicating the information contained in the volume. In the event of a physical disk failure, when the plex on the failed disk becomes unavailable, the system can continue to operate using the unaffected mirrors.
23
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Striping
B. Concatenation
C. Mirroring
D. RAID-5
24
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Striping
B. Concatenation
C. Mirroring
D. RAID-5
The correct answer is B. A concatenated volume layout maps data in a linear manner onto one or more subdisks in a
plex. Concatenation does not protect against disk failure. A single disk failure results in the failure of the entire volume.
25
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. mirror-concat
B. concat-mirror
C. mirror-stripe
D. stripe-mirror
26
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. mirror-concat
B. concat-mirror
C. mirror-stripe
D. stripe-mirror
The correct answer is D. In the stripe-mirror volume layout, the top-level volume contains a striped plex and the component subvolumes are mirrored.
27
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
28
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
29
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
30
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
31
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
End of presentation
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC
or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This is the Making Configuration Changes lesson in the Veritas InfoScale 7.4.2 Fundamentals
for UNIX/Linux: Administration course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Topic Objective
Renaming VxVM objects: Rename VxVM objects, such as disks and disk groups.
The table on this slide lists the topics and objectives for this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to add a mirror to and remove a
mirror from an existing volume, add a log, and change the volume read
policy.
• Provides redundancy
Plex • Improves concurrent read performance
• Requires plex synchronization (in background)
• Limitations:
Plex Plex ‒ Only volumes with concatenated or striped plexes
‒ By default, mirror created with the same layout as original plex
‒ Each mirror on separate disks
‒ All disks in the same disk group
If a volume was not originally created as a mirrored volume, or if you want to add additional
mirrors, you can add a mirror to an existing volume.
By default, a mirror is created with the same plex layout as the plex already in the volume. For
example, assume that a volume is composed of a single striped plex. If you add a mirror to the
volume, VxVM makes that plex striped, as well. However, you can specify a different layout.
A mirrored volume requires at least two disks. You cannot add a mirror to a disk that is
already being used by the volume. A volume can have multiple mirrors, as long as each mirror
resides on separate disks.
Only disks in the same disk group as the volume can be used to create the new mirror. Unless you specify the disks to be used for the mirror, VxVM automatically locates and uses available disk space in the disk group.
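For example, to add a mirror to the existing appvol volume, either letting VxVM choose the disk or naming one explicitly (the disk name is illustrative):
vxassist -g appdg mirror appvol
vxassist -g appdg mirror appvol appdg03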
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Without Storage Foundation, moving data from one array to another requires downtime.
Using Storage Foundation, you can mirror to a new array, ensure it is stable, and then remove
the plexes from the old array. No downtime is necessary. This is useful in many situations, for
example, if a company purchases a new array.
The high level steps for migrating data using Storage Foundation are listed on the slide. Note
that if you have multiple volumes on the old array, you would need to repeat steps 6 to 9 for
each volume. The following steps illustrate the commands you need to use to perform the
migration using a simple example where the appvol volume in the appdg disk group is moved
from the emc0 enclosure to the emc1 enclosure. To keep the example simple, only one LUN is
used to mirror the simple volume.
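A sketch of the key commands for such a migration, assuming the original plex is appvol-01 and the new array is presented as enclosure emc1:
vxassist -g appdg mirror appvol enclr:emc1
vxprint -g appdg -htr appvol
vxplex -g appdg -o rm dis appvol-01
The vxprint step is used to confirm that the new plex is fully synchronized before the plex on the old array is removed.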
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
(Diagram: dirty region logging on a mirrored volume; before each write, the corresponding region bit is set in the DRL, and during recovery only the regions marked dirty are resynchronized between the plexes.)
Logging in VxVM
By enabling logging, VxVM tracks changed regions of a volume. Log information can then be
used to reduce plex synchronization times and speed the recovery of volumes after a system
failure. Logging is an optional feature, but is highly recommended, especially for large
volumes.
Dirty region logging
Dirty region logging (DRL) is used with mirrored volume layouts. DRL keeps track of the
regions that have changed due to I/O writes to a mirrored volume. Prior to every write, a bit is
set in a log to record the area of the disk that is being changed. In case of system failure, DRL
uses this information to recover only the portions of the volume that need to be recovered.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
If DRL is not used and a system failure occurs, all mirrors of the volumes must be restored to a
consistent state by copying the full contents of the volume between its mirrors. This process
can be lengthy and I/O intensive.
When you enable logging on a mirrored volume, one log plex is created by default. The log
plex uses space from disks already used for that volume, or you can specify which disk to use.
To enhance performance, you should consider placing the log plex on a disk that is not already
in use by the volume.
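For example, to add a dirty region log to an existing mirrored volume, or to enable logging when the volume is created (names and sizes are illustrative):
vxassist -g appdg addlog appvol logtype=drl
vxassist -g appdg make appvol 1g layout=mirror-concat logtype=drl nlog=1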
11
DRL adds a small I/O overhead for most write access patterns.
• DRL should not be used for:
‒ Mirrored boot disks
‒ Volumes that have a data change object (DCO); data change objects are used with the FastResync feature
‒ Data volumes for databases that support the SmartSync feature of Volume Manager
• Redo log volumes and other volumes that are used primarily for sequential writes may benefit from using a sequential DRL instead of a standard DRL (logtype=drlseq).
(Diagram: volume read policies; with the preferred policy, read I/O always goes to the named plex first, while the default selected-plex policy checks whether the volume has a striped plex and reads from it, otherwise falling back to round robin.)
12
One of the benefits of mirrored volumes is that you have more than one copy of the data
from which to satisfy read requests. The read policy for a volume determines the order in
which plexes are accessed during read I/O operations.
• Round robin: VxVM reads each plex in turn in “round-robin” manner for each
nonsequential I/O detected. Sequential access causes only one plex to be accessed in
order to take advantage of drive or controller read-ahead caching policies. If a read is
within 256K of the previous read, then the read is sent to the same plex.
• Preferred plex: VxVM reads first from a plex that has been named as the preferred plex.
Read requests are satisfied from one specific plex, presumably the plex with the highest
performance. If the preferred plex fails, another plex is accessed. For example, if you are
mirroring across disk arrays with significantly different performance specifications, setting
the plex on the faster array as the preferred plex would increase performance.
• Selected plex: This is the default read policy. Under the selected plex policy, Volume
Manager chooses an appropriate read policy based on the plex configuration to achieve
the greatest I/O throughput. If the mirrored volume has exactly one enabled striped plex,
the read policy defaults to that plex; otherwise, it defaults to a round-robin read policy.
• Siteread: VxVM reads preferentially from plexes at the locally defined site. This is the
default policy for volumes in disk groups where site consistency has been enabled.
• Split: Divides the read requests and distributes them across all the available plexes.
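For example, to set the preferred-plex read policy on appvol and later return it to round robin (the plex name is illustrative):
vxvol -g appdg rdpol prefer appvol appvol-02
vxvol -g appdg rdpol round appvol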
14
After completing this topic, you will be able to resize an existing volume and
file system from the command line.
15
16
Resizing a volume
If users require more space on a volume, you can increase the size of the volume. If a volume
contains unused space that you need to use elsewhere, you can shrink the volume.
When the volume size is increased, sufficient disk space must be available in the disk group to
support extending the existing volume layout. A volume with a concatenated layout can be grown by any amount on any disk within the disk group, whereas a volume with a striped layout can be grown only if all columns remain the same length, which requires free space on as many disks as there are columns. When increasing the size of a volume, VxVM assigns the necessary new space
from available disks. By default, VxVM uses space from any disk in the disk group, unless you
define specific disks.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
17
To resize a volume from the command line, you can use either the vxassist command or
the vxresize command. Both commands can expand or reduce a volume to a specific size
or by a specified amount of space, with one significant difference:
• vxresize automatically resizes a volume’s file system.
• vxassist does not resize a volume’s file system.
When using vxassist, you must resize the file system separately by using the fsadm
command.
When you expand a volume, both commands automatically locate available disk space unless
you designate specific disks to use. When you shrink a volume, the unused space becomes part of the free space in the disk group.
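For example, illustrative vxresize invocations that grow or shrink a volume (and its file system) to an absolute size or by a relative amount:
vxresize -g appdg appvol 6g
vxresize -g appdg appvol +1g
vxresize -g appdg appvol -500m
With vxassist (for example, vxassist -g appdg growto appvol 6g), you must resize the file system separately with fsadm.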
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
vxresize -g mydg myvol 4g
18
You can resize a VxFS file system while the file system remains mounted by using the fsadm
command:
fsadm [-b newsize] [-r rawdev] mount_point
Using fsadm to resize a file system does not automatically resize the underlying volume.
When you expand a file system, the underlying device must be large enough to contain the
new larger file system.
20
When you resize a LUN in the hardware, you should resize the VxVM disk corresponding to
that LUN. You can use vxdisk resize to update disk headers and other VxVM structures
to match a new LUN size. This command does not resize the underlying LUN itself.
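For example, after a LUN has been grown in the array, a command of this form updates the corresponding VxVM disk (the length attribute is optional; without it, VxVM probes the device for its new size):
vxdisk -g appdg resize appdg01
vxdisk -g appdg resize appdg01 length=100g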
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to deport a disk group from one
system and import it on another system.
21
(Diagram: two disk groups, appdg and dbdg, containing volumes appvol and dbvol, plus additional disks.)
22
(Diagram: deporting the appdg disk group with its volume and VM disks.)
When you deport a disk group, you can optionally:
• Specify a new host.
• Rename the disk group.
Before deporting:
• Stop applications.
• Unmount file systems.
23
A deported disk group is a disk group over which management control has been surrendered.
The objects within the disk group cannot be accessed, its volumes are unavailable, and the
disk group configuration cannot be changed. (You cannot access volumes in a deported disk
group because the directory containing the device nodes for the volumes are deleted upon
deport.) To resume management of the disk group, it must be imported.
A disk group cannot be deported if any volumes in that disk group are in use. Before you
deport a disk group, you must unmount file systems and stop any application using the
volumes in the disk group.
Deporting and specifying a new host
When you deport a disk group using CLI commands, you have the option to specify a new host
to which the disk group is imported at reboot. If you know the name of the host to which the
disk group will be imported, specify the new host during the operation. If you do not specify
the new host, the disks could accidentally be added to another disk group, resulting in data
loss. You cannot specify a new host using the vxdiskadm utility.
Deporting and renaming
When you deport a disk group using InfoScale Operations Manager or CLI commands, you also
have the option to rename the disk group. Note that the disk group cannot be renamed when
deporting using the vxdiskadm utility.
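For example, to deport appdg as is, or to rename it and assign it to another host during the deport (the new name and host name are illustrative):
vxdg deport appdg
vxdg -n newappdg -h hostB deport appdg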
Import
(Diagram: importing the appdg disk group with its volume and VM disks.)
When you import a disk group, you can:
• Specify a new disk group name.
• Clear host locks.
• Import as temporary.
• Force an import.
24
When a disk group is imported, a host lock (the system host ID) is written to each disk in the disk group. This lock ensures that dual-ported disks (disks that can be accessed simultaneously by two systems) are
not used by both systems at the same time. If a system crashes, the locks stored on the disks
remain, and if you try to import a disk group containing those disks, the import fails.
Importing as temporary
A temporary import does not persist across reboots. A temporary import can be useful, for
example, if you need to perform administrative operations on the temporarily imported disk
group.
Forcing an import
A disk group import fails if the VxVM configuration daemon cannot find all of the disks in the
disk group. If the import fails because a disk has failed, you can force the import. Forcing an
import should always be performed with caution.
CLI:
vxdiskadm:
CLI:
vxdg [-ftC] [-n new_name] import diskgroup
vxvol -g diskgroup startall
25
With SF 5.1 SP1 and later, all volumes in the disk group are started automatically during a disk
group import by default. However, with earlier versions of SF or if the autostartvolumes
parameter is modified to off, you must manually start all volumes after you import a disk
group from the command line.
A disk group must be deported from its previous system before it can be imported to the new
system. During the import operation, the system checks for host import locks. If any locks are
found, you are prompted to clear the locks.
To temporarily import a disk group, you use the -t option. This option does not set the
autoimport flag, which means that the import cannot survive a reboot.
After completing this topic, you will be able to rename VxVM objects, such
as disks and disk groups.
27
• Online operation
‒ Object names must be unique within the DG.
‒ Related objects must also be renamed.
• Possible impact on file system or application
28
rename its plexes. Volumes are not affected when subdisks are named differently from the
disks.
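For example, illustrative vxedit commands that rename a disk media name and a volume within the appdg disk group:
vxedit -g appdg rename appdg01 appdg03
vxedit -g appdg rename appvol datavol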
(Diagram: on host dbserver, a disk group is renamed by deporting it as olddgname and importing it as newdgname.)
29
You cannot import or deport a disk group when the target system already has a disk group of
the same name. To avoid name collision or to provide a more appropriate name for a disk
group, you can rename a disk group.
• To rename a disk group when moving it from one system to another, you specify the new
name during the deport or during the import operations.
• To rename a disk group without moving the disk group, you must still deport and reimport
the disk group on the same system.
Note that renaming a disk group:
• does not change the disk group ID (dgid).
• may require modifying the file system table (For example, /etc/vfstab for Solaris).
• may require modifying applications, such as databases, using the volumes.
Using the CLI, for example, to rename the disk group appdg to oradg, rename either during the deport or during the import:
vxdg -n oradg deport appdg
vxdg import oradg
or:
vxdg deport appdg
vxdg -n oradg import appdg
From the command line, if you need to restart all volumes in the disk group:
vxvol -g new_dg_name startall
vxvol -g oradg startall
To keep the use of this key secure, InfoScale provides the option to re-key the volumes, which changes the KMS key when needed. This option is also known as key rotation.
You can use an external scheduler, based on your policy, to schedule the re-key operation.
30
InfoScale supports the use of a single KMS key for all the volumes in a disk group.
Consequently, you can maintain a common KMS key at the disk group level instead of
maintaining an individual KMS key for each volume. When you start an encrypted volume that
has a common KMS key with the disk group, VxVM needs to fetch only one key to enable
access to the volume. Thus, a common KMS key reduces the network load that is sent to the
KMS in the form of multiple requests based on the number of volumes. A single request to
KMS lets you to start all the volumes in a single operation.
To make the use of this single key more secure, InfoScale provides the option to re-key the volumes, which changes the KMS key when needed. This option is also known as key rotation. You can use an external scheduler, based on your policy, to schedule the re-key operation.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The disk group level encryption key management and key rotation feature does not support VVR configurations or disk group operations such as join, split, and move.
31
To use a single key for all the encrypted volumes in a disk group, set the value of the
same_enckey tunable to yes as follows:
At the time of disk group creation, set:
vxdg -o same_enckey=yes init DiskGroupName diskName1 diskName2 ... diskNameN
• Key points
– In this lesson, you learned how to add a mirror to and remove a mirror from an existing volume.
– You also learned how to change the volume read policy and resize an existing volume.
– In addition, you learned to deport a disk group from one system and import it on another system.
– Finally, you learned how to rename VxVM objects, such as disks and disk groups, and upgrade disk
groups.
• Reference materials
– Veritas InfoScale Release Notes
– Veritas Storage Foundation Administrator’s Guide
– https://sort.veritas.com
32
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
33
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
34
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
35
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
36
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
37
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
38
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
39
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
40
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
41
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
42
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. vxedit rename
B. vxdisk rename
C. vxdg rename
D. vxdisk newname
43
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. vxedit rename
B. vxdisk rename
C. vxdg rename
D. vxdisk newname
44
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
45
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
46
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
End of presentation
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC
or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This is the Administering File Systems lesson in the Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Topic Objective
Benefits of using Veritas File System: Explain Veritas File System features, such as extent-based allocation.
Using Veritas File System commands: Apply the appropriate VxFS commands from the command line.
Logging in VxFS: Perform logging in VxFS by using the intent log and the file change log.
Controlling file system fragmentation: Defragment a VxFS file system.
Using thin provisioning disk arrays: Use SF features that optimize storage with thin provisioning disk arrays, such as the SmartMove feature and thin reclamation.
The table on this slide lists the topics and objectives for this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to explain Veritas File System
features, such as extent-based allocation.
A file system is simply a method for storing and organizing computer files and the data they
contain to make it easy to find and access them.
Veritas File System includes the following features:
• Intent log
− Veritas File System (VxFS) was the first commercial journaling file system. With
journaling, metadata changes are first written to a log (or journal) then to disk. Since
changes do not need to be written in multiple places, throughput is much faster
as the metadata is written asynchronously.
− VxFS provides fast recovery of a file system from system failure because the recovery requires only a replay of the intent log rather than a full structural check of the entire file system.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• Storage checkpoints
Backup and restore applications can leverage Storage Checkpoint, a disk- and I/O-efficient
copying technology for creating periodic frozen images of a file system.
• Multi-volume file system support
The multi-volume support feature allows several volumes to be represented by a single
logical object. This feature is used with the SmartTier feature.
• SmartTier (previously known as Dynamic Storage Tiering)
The SmartTier feature allows you to configure policies that automatically allocate storage
from specific volumes for certain files, or relocate files by running file relocation
commands, which can improve performance for applications that access specific types of
files.
• Improved database performance
Databases can be created on the character devices to achieve the same performance as
databases created on raw disks.
(Diagram: extent-based allocation; a file's extents occupy runs of contiguous blocks, for example blocks n+9 through n+17 and n+40 through n+42.)
Extent-based allocation enables larger I/O operations to be passed to the underlying drivers,
which results in good performance and less metadata overhead.
Each file is associated with an index block, called an inode. In an inode, an extent is
represented as an address-length pair, which identifies the starting block address and the
length of the extent in logical blocks. This enables the file system to directly access any block
of the file.
VxFS automatically selects an extent size by using a default allocation policy that is based on
the size of I/O write requests. The default allocation policy attempts to balance two goals:
• Optimum I/O performance through large allocations.
• Minimal file system fragmentation through allocation from space available in the file system that best fits the data.
The first extent allocated is large enough for the first write to the file. Typically, the first extent is the smallest power of 2 that is larger than the size of the first write, with a minimum extent allocation of 8K. Additional extents are progressively larger, doubling the size of the file with each new extent.
After completing this topic, you will be able to apply the appropriate VxFS
commands from the command line.
10
You can generally use Veritas File System (VxFS) as an alternative to other disk-based, OS-
specific file systems, except for the file systems used to boot the system. File systems used to
boot the system are mounted read-only in the boot process, before the VxFS driver is loaded.
VxFS can be used in place of:
• UNIX File System (UFS) on Solaris, except for root, /usr, /var, and /opt.
• Hierarchical File System (HFS) on HP-UX, except for /stand.
• Journaled File System (JFS) and Enhanced Journaled File System (JFS2) on AIX, except for root and /usr.
• Extended File System Version 2 (EXT2) and Version 3 (EXT3) on Linux, except for root, /boot, /etc, /lib, /var, and /usr.
Location of VxFS commands
Most Veritas file system commands are located in /opt/VRTS/bin, which must be
included in the PATH environment variable.
Note: The Linux platform includes a native fsadm command in the /usr/sbin directory. If
this path is listed before the /opt/VRTS/bin directory in the PATH environment variable,
provide the full pathname of the fsadm command (/opt/VRTS/bin/fsadm) to use the
VxFS-specific version of this command.
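For example, in a POSIX shell you can place the VxFS directory first in the search path:
PATH=/opt/VRTS/bin:$PATH
export PATH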
Default file system type file, by platform:
HP-UX: /etc/default/fs
AIX: /etc/vfs
Linux: /etc/default/fs
13
Option Description
mount [-v]: Displays mounted file systems
14
After completing this topic, you will be able to perform logging in VxFS by
using the intent log and the file change log.
15
(Diagram: intent logging; the intent log is written before file system updates are made to data and metadata on disk, and if the system crashes, the intent log is replayed by VxFS fsck.)
16
Intent log
A file system may be left in an inconsistent state after a system failure. Recovery of structural
consistency requires examination of file system metadata structures. Veritas File System
provides fast file system recovery after a system failure by using a tracking feature called
intent logging, or journaling. Intent logging is the process by which intended changes to file
system metadata are written to a log before changes are made to the file system structure.
Once the intent log has been written, the other updates to the file system can be written in
any order. In the event of a system failure, the VxFS fsck utility replays the intent log to
nullify or complete file system operations that were active when the system failed.
Traditionally, the length of time taken for recovery using fsck was proportional to the size of
the file system. For large disk configurations, running fsck is a time-consuming process that
checks, verifies, and corrects the entire file system.
The VxFS version of the fsck utility performs an intent log replay to recover a file system
without completing a full structural check of the entire file system. The time required for log
replay is proportional to the log size, not the file system size. Therefore, the file system can be
recovered and mounted seconds after a system failure. Intent log recovery is not readily
apparent to users or administrators, and the intent log can be replayed multiple times with no
adverse effects.
Note: Replaying the intent log may not completely recover the damaged file system structure
if the disk suffers a hardware failure. Such situations may require a complete system check
using the VxFS fsck utility.
To check file system consistency by using the intent log for the VxFS file system on the appvol
volume:
fsck [fstype] /dev/vx/rdsk/appdg/appvol
17
You use the VxFS-specific version of the fsck command to check the consistency of and
repair a VxFS file system. The fsck utility replays the intent log by default, instead of
performing a full structural file system check, which is usually sufficient to set the file system
state to CLEAN. You can also use the fsck utility to perform a full structural recovery in the
unlikely event that the log is unusable.
The syntax for the fsck command is:
/opt/VRTS/bin/fsck [fstype] [generic_options] [-y|-Y] [-n|-N] [-o full,nolog] special
For a complete list of generic options, see the fsck(1m) manual page. Some options include:
-o p can only be run with log fsck, not with full fsck.
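For example, a log-replay check followed by a full structural check (use the full check only when the intent log is unusable); the syntax shown uses the Solaris/HP-UX -F option, whereas AIX uses -V and Linux uses -t:
fsck -F vxfs /dev/vx/rdsk/appdg/appvol
fsck -F vxfs -o full -y /dev/vx/rdsk/appdg/appvol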
After completing this topic, you will be able to defragment a VxFS file
system.
18
19
20
A file system is considered badly fragmented if it has one or more of the following characteristics:
• Greater than 5 percent of free space in extents of less than 8 blocks in length
• More than 50 percent of free space in extents of less than 64 blocks in length
• Less than 5 percent of the total file system size available as free extents in lengths of 64 or
more blocks
Fragmentation can also be determined based on the fragmentation index. The fragmentation
report displays fragmentation indices for both the free space and the files in the file system. A
value of 0 for the fragmentation index means that the file system has no fragmentation, and a
value of 100 means that the file system has the highest level of fragmentation. The
fragmentation index is new with SF 6.x and enables you to determine whether you should
perform extent defragmentation or free space defragmentation.
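For example, to report extent and directory fragmentation (including the fragmentation indices) for a file system mounted at /app:
/opt/VRTS/bin/fsadm -E -D /app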
21
df -os /mnt3
/mnt3 (/dev/vx/dsk/testdg/vol3): … blocks … files
Free Extents by Size
1: 2077988 2: 2104073 4: 1371895
8: 2226679 16: 1618029 32: 1000385
64: 53134 128: 1667 256: 480
512: 352 1024: 302 2048: 244
4096: 172 8192: 107 16384: 76
32768: 122 65536: 5 131072: 0
262144: 0 524288: 0 1048576: 0
2097152: 0 4194304: 0 8388608: 0
16777216: 0 33554432: 0 67108864: 0
134217728: 0 268435456: 0 536870912: 0
1073741824: 0 2147483648: 0
22
23
The best way to ensure that fragmentation does not become a problem is to defragment the
file system on a regular basis. The frequency of defragmentation depends on file system
usage, activity patterns, and the importance of file system performance.
In general, follow these guidelines:
• Schedule defragmentation during a time when the file system is relatively idle.
• For frequently used file systems, you should schedule defragmentation daily or weekly.
• For infrequently used file systems, you should schedule defragmentation at least monthly.
• Full file systems tend to fragment and are difficult to defragment. You should consider
expanding the file system.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
To determine the defragmentation schedule that is best for your system, select what you think
is an appropriate interval for running extent reorganization and run the fragmentation reports
both before and after the reorganization. If the degree of fragmentation is approaching the
bad fragmentation figures, then the interval between fsadm runs should be reduced. If the
degree of fragmentation is low, then the interval between fsadm runs can be increased.
You should schedule directory reorganization for file systems when the extent reorganization
is scheduled. The fsadm utility can run on demand and can be scheduled regularly as a cron
job. The defragmentation process can take some time. You receive an alert when the process
is complete.
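As an illustration of such a schedule, the following hedged commands assume a VxFS file system mounted at /data and that the VxFS-specific fsadm is found first in the PATH:
fsadm -E /data        # report extent fragmentation
fsadm -D /data        # report directory fragmentation
fsadm -e -d /data     # reorganize (defragment) extents and directories
A cron entry such as 0 2 * * 6 /opt/VRTS/bin/fsadm -e -d /data would run the reorganization every Saturday at 2:00 a.m.; compare the -E and -D reports before and after each run to tune the interval.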
After completing this topic, you will be able to use SF features that optimize
storage with thin provisioning disk arrays, such as the SmartMove feature
and thin reclamation.
24
Volume on host: 1 TB (traditional array) | 1 TB (thin provisioning array)
Physical storage allocated for LUN in array: 1 TB (traditional array) | 100 GB (thin provisioning array)
When using a thin provisioning capable array, a virtual container (virtual volume) is created for
the 1 TB volume. The array then creates or resizes LUNs as actual data is written to the virtual
container. The administrator is not involved after the initial virtual container is created unless
the amount of actual physical storage is used up.
To truly benefit from thin storage, you need the right stack on all hosts:
• A multi-pathing driver that supports the thin hardware
• A file system optimized not to waste storage on thin volumes
• A stack to reclaim space as you migrate to thin storage
• A stack to continually optimize utilization of thin storage
SF unlocks thin provisioning’s full potential with DMP and VxFS, which is the only cross-
platform, thin storage-friendly file system.
vxdisk list
3pardata0_0034 auto:cdsdisk 3pardata0_0034 thindg online thin
3pardata0_0035 auto:cdsdisk 3pardata0_0035 thindg online thinrclm
vxdisk -e list
3pardata0_0034 auto 3pardata0_0034 thindg online hdisk49 tp std
3pardata0_0035 auto 3pardata0_0035 thindg online hdisk50 tprclm std
26
Used block
Empty block
VxFS file
system
API
Volume
plex 1 plex 2
(new mirror)
28
This tunable is system-wide and persistent, so it only needs to be set once per server. Setting
this tunable parameter to none completely disables the SmartMove feature. You can also use
the vxdefault command to change the value of this tunable parameter. The vxdefault
command is explained in more detail later in this topic.
Note: The Veritas file system must be mounted to get the benefits of the SmartMove feature.
This feature can be used for faster plex creation and faster array migration.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
vxdefault list
KEYWORD CURRENT-VALUE DEFAULT-VALUE
autostartvolumes on on
fssmartmovethreshold 100 100
reclaim_on_delete_start_time 22:40 22:10
reclaim_on_delete_wait_period -1 1
same_key_for_alldgs off off
sharedminorstart 33000 33000
usefssmartmove all all
29
The vxdefault command is used to modify and display the tunable parameters that are
stored in the /etc/vx/vxsf file as shown on the slide.
The sharedminorstart tunable parameter is used with the dynamic disk group reminoring
feature. This feature is used to allocate minor numbers dynamically to disk groups based on
their private or shared status. Shared disk groups are used with Cluster Volume Manager and
are not covered in this course.
The fssmartmovethreshold parameter defines a threshold value: the SmartMove feature is used
only if the file system usage percentage is less than this threshold. By default,
fssmartmovethreshold is set to 100, which means that SmartMove is used with all VxFS file
systems with less than 100% usage.
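For example, to restrict SmartMove to file systems that are less than 70% full (70 is a hypothetical value; use whatever suits your environment), you could run:
vxdefault set fssmartmovethreshold 70
vxdefault list
The second command verifies the change against the stored defaults.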
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
1. Verify SmartMove is turned on.
2. Add thin LUNs to the disk group.
3. Mirror volumes to the thin LUNs.
4. Test performance of the new plexes.
5. Remove the original mirrors and LUNs from the disk group.
6. If necessary, expand (if larger thin LUNs).
Storage Provisioning add-on in VIOM
30
On the slide, the procedure displays the migration from a traditional disk array to a disk array
that supports thin provisioning. This is done assuming that the total space provided by the
thin provisioning array is larger in size than the traditional LUNs used to build the volume and
file system.
Here is an example of an implementation:
1. Turn the SmartMove feature on if necessary.
vxdefault list
vxdefault set usefssmartmove all (if necessary)
2. Add the new, thin LUN, called thinarray0_01 in this example, to the existing disk group.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Note that you can use multiple LUNs although this example is showing only one.
vxdisksetup -i thinarray0_01
vxdg -g appdg adddisk thinarray0_01
3. Add the new, thin LUN as a new plex to the volume.
vxassist -g appdg mirror appvol thinarray0_01
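The notes above stop at step 3. A hedged sketch of the remaining steps, assuming the original plex is named appvol-01 and the original LUN emc0_01 (both hypothetical names), might look like this:
4. Test the performance of the new plexes before removing the original storage.
5. Remove the original plex and LUN from the disk group.
vxplex -g appdg -o rm dis appvol-01
vxdg -g appdg rmdisk emc0_01
6. If the thin LUNs are larger, grow the volume and file system as needed, for example:
vxresize -g appdg appvol +500g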
Multi-threaded
reclamation
By VxVM:
vxdisk reclaim disk|enclosure|diskgroup
32
Thin provisioning (TP) capable arrays allocate actual physical storage only when the
applications using the LUNs write data. However, when portions of this data are deleted,
storage is not normally reclaimed back to the available free pool of the thin provisioning
capable array.
Storage Foundation uses the VxFS knowledge of used and unused blocks at the file system
level to reclaim that unused space. This process must be manually started by the system
administrator.
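As a hedged example, reusing the thindg disk group and disk names shown earlier and assuming a VxFS file system mounted at /app, reclamation can be started at either level:
vxdisk reclaim 3pardata0_0035      # reclaim a single thin reclaim-capable disk
vxdisk reclaim thindg              # reclaim all thin reclaim-capable disks in the disk group
fsadm -R /app                      # reclaim free space at the VxFS file system level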
Thin reclamation can only be performed on volumes with mounted VxFS file systems.
Volumes without a VxFS file system or volumes that are not currently mounted are not
reclaimed. If the volume consists of a mix of thin-provisioning disks and regular disks, the
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Space
reclaimed
33
Thin reclamation as implemented in SF 5.0 MP3 (and as described on the previous page) is a
best effort in the sense that it takes any existing contiguous free space in the file system and
reports it to Volume Manager for reclamation. If that contiguous free space is large enough to
be reclaimed in the array (based on chunk size and chunk alignment on the LUN), the space is
effectively reclaimed. Otherwise, the free space is not reclaimed.
The core benefit of this approach is that it either returns storage to the array free pool, or it
does not; the operation never triggers additional storage usage.
The main drawback is that if the free space is fragmented into small contiguous areas, it may
not get reclaimed.
InfoScale Storage has the capability to perform more aggressive reclamation by moving data
around in the file system to maximize the size of the contiguous free space. This is an
additional option for reclamation that can only be triggered at the file system level using the
fsadm -R -A mount_point command. Note that you can use the -o analyze
option first to determine if you should perform a normal reclaim operation or an aggressive
reclaim operation.
Notes:
• Aggressive reclamation can only be performed on file systems that are known to use thin
reclaim capable storage.
• Aggressive reclamation can increase the thin storage usage temporarily during the data
compaction process.
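Using the commands named above and the hypothetical mount point /app, a cautious sequence would be:
fsadm -R -o analyze /app     # determine whether a normal or an aggressive reclaim is warranted
fsadm -R -A /app             # aggressive reclamation; thin storage usage may rise temporarily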
34
Commands such as vxassist remove volume and vxedit -rf rm volume, as well as the
volume shrink operation, can trigger automatic reclamation if the released storage is on
thin provision reclaimable LUNs.
The reclaim operation is asynchronous, so the delete or shrink operations themselves complete quickly.
The reclamation of the storage released due to volume delete or shrink is performed by the
vxrelocd daemon and can be controlled by the following tunable parameters:
• reclaim_on_delete_wait_period=[-1 – 366]
A value of -1 indicates immediate reclamation and a value of 366 indicates that no
reclamation will be performed by the vxrelocd daemon.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• reclaim_on_delete_start_time=[00:00-23:59]
The vxdg destroy diskgroup command does not reclaim any storage automatically.
The thin provision reclaimable LUNs belonging to the destroyed disk group must be reclaimed
manually using the vxdisk reclaim disk command.
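For example, assuming the vxdefault syntax shown earlier in this lesson, you could reclaim released storage immediately, or defer it to a quiet window, and reclaim a LUN from a destroyed disk group by its disk name:
vxdefault set reclaim_on_delete_wait_period -1      # reclaim immediately after delete or shrink
vxdefault set reclaim_on_delete_start_time 22:40    # or start deferred reclamation at 22:40
vxdisk reclaim thinarray0_01                        # manual reclaim of a LUN from a destroyed disk group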
(Diagram: the appvol volume with an associated DCO volume containing two DCO log plexes.)
• When a disk group includes thin LUNs, mirrored volumes are automatically created with data change objects (if using SF Enterprise license).
• Updates to the original volume are recorded in DCO logs and stored on disk.
• Resynchronization involves applying only changed data, rather than performing an entire atomic resynchronization.
35
The SF Enterprise license enables the FastResync feature of Veritas Volume Manager. The
FastResync feature is used for fast resynchronization of the plexes of a mirrored volume. This
feature is mostly used with instant volume snapshots. However, it is also used for
resynchronization of plexes that become stale with respect to the contents of the volume due
to failures.
Without FastResync, when a plex of a mirrored volume becomes stale, the resynchronization
involves an entire atomic copy from the active plexes to the stale plex. With FastResync,
Volume Manager keeps track of the changed regions of the volume and synchronizes only
those regions. This behavior helps with optimizing thin LUN usage. Therefore, FastResync is
automatically enabled on mirrored volumes if the disk group contains thin LUNs and the
feature is licensed.
When FastResync is enabled on a mirrored volume, a data change object (DCO) is created with
a DCO volume to hold the FastResync maps as well as the DRL recovery maps and other
special maps used with instant snapshot operations on disk.
Note that you cannot remove a mirrored volume using the vxassist remove volume
command if it has an associated DCO log. To remove a mirrored volume with a DCO log, use
the following vxedit command:
vxedit -g diskgroup -rf rm volume_name
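As a hedged check, reusing the appdg and appvol names from earlier examples, you can confirm that a DCO is associated with the volume and inspect its FastResync setting (the exact attribute output may vary by release):
vxprint -g appdg -ht appvol                    # shows the volume, its plexes, and any associated DCO records
vxprint -g appdg -m appvol | grep fastresync   # displays the fastresync attribute value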
• InfoScale 7.4.1: With this change, you can create and mount VxFS only on DLV 11 and later. DLV 6 to 10 can be used for a local mount only.
• InfoScale 7.4.2: With this change, you can create and mount VxFS only on DLV 12 and later. DLV 6 to 11 can be used for a local mount only.
36
The following DLV changes are applicable in Veritas InfoScale 7.4.1 - Linux
• Support added for DLV 15
• Default DLV is DLV 15
• Support deprecated for DLV 10
With this change, you can create and mount VxFS only on DLV 11 and later. DLV 6 to 10 can be
used for a local mount only.
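As a hedged illustration, assuming a VxFS file system mounted at /app, you can display and upgrade the disk layout version online:
vxupgrade /app           # display the current disk layout version
vxupgrade -n 16 /app     # upgrade the layout to DLV 16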
The following DLV changes are applicable in Veritas InfoScale 7.4.2 - Linux
• Added support for DLV 16
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• The SELinux policy for RHEL 7.6 and later now includes support for VxFS file system
as persistent storage of SELinux security extended attributes.
• With this support, users can use SELinux security functionalities and features on
VxFS files and directories on RHEL 7.6 and later.
37
The SELinux policy for RHEL 7.6 and later now includes support for VxFS file system as
persistent storage of SELinux security extended attributes.
With this support, users can use SELinux security functionalities and features on VxFS files and
directories on RHEL 7.6 and later.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• Key points
– In this lesson, you learned how to apply the appropriate VxFS commands from the command line to
administer the file system.
– You also learned how to perform logging in VxFS by using the intent log and the file change log.
– In addition, you learned how to defragment a Veritas file system.
– Finally, you learned how to use thin provisioning disk arrays with Storage Foundation.
• Reference materials
– Veritas Storage Foundation Administrator’s Guide
– https://sort.veritas.com
38
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
39
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
40
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. fsadm
B. fsck
C. fcladm
D. mkfs
41
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. fsadm
B. fsck
C. fcladm
D. mkfs
42
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
43
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
44
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
45
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
46
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
47
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
48
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
49
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
50
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
This appendix contains slides that are platform specific and may be
reviewed at the viewer’s discretion and interest. You may opt to end the
presentation now.
51
File system type switch: -V vxfs (or -v vxfs when used with crfs)
/usr/lib/fs/vxfs
Related directories:
/etc/fs/vxfs
Back
52
Veritas File System can be used in place of Journaled File System (JFS) and Enhanced
Journaled File System (JFS2) on AIX, except for root and /usr.
The location of Veritas File System commands on AIX is displayed on the slide.
The file system switchout for Veritas file system is accomplished using -V vxfs, or -v
vxfs when used with crfs.
The default file system file is /etc/vfs.
To return to the “Using VxFS Commands” slide, click the back button at the bottom of the
screen.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Location of VxFS
commands:
/sbin/fs
(optional for PATH
variable)
Back
53
Veritas File System can be used in place of Hierarchical File System (HFS) on HP-UX, except for
/stand.
The location of Veritas File System commands on HP-UX is displayed on the slide.
The file system switchout for Veritas file system is accomplished using -F vxfs.
The default file system file is /etc/default/fs.
To return to the “Using VxFS Commands” slide, click the back button at the bottom of the
screen.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Back
54
Veritas File System can be used in place of Extended File System Version 2 (EXT2) and Version
3 (EXT3) on Linux, except for the root, /boot, /etc, /lib, /var, and /usr directories.
The location of Veritas File System commands on Linux is displayed on the slide.
The file system switchout for Veritas file system is accomplished using -t vxfs.
The default file system file is /etc/default/fs.
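For example, assuming the appdg disk group and appvol volume names used elsewhere in this course and a mount point of /app, a VxFS file system is created and mounted on Linux as follows:
mkfs -t vxfs /dev/vx/rdsk/appdg/appvol
mount -t vxfs /dev/vx/dsk/appdg/appvol /app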
To return to the “Using VxFS Commands” slide, click the back button at the bottom of the
screen.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
/usr/lib/fs/vxfs/bin
Related directories:
/etc/fs/vxfs
Back
55
Veritas File System can be used in place of UNIX File System (UFS) on Solaris, except for root,
/usr, /var, and /opt. The location of Veritas File System commands on Solaris is
displayed on the slide.
The file system switchout for Veritas file system is accomplished using -F vxfs. The default
file system file is /etc/default/fs.
To return to the “Using VxFS Commands” slide, click the back button at the bottom of the
screen.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
End of presentation
06-56
Not for Distribution.
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC
or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This is the High Availability Concepts lesson in the Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
PART 2: Veritas InfoScale Availability 7.4.2 for UNIX/Linux: InfoScale Availability Additions
Administration
• Lesson 09: Handling Resource Faults
InfoScale Availability Basics • Lesson 10: Intelligent Monitoring Framework
• Lesson 01: High Availability Concepts • Lesson 11: Cluster Communications
Topic Objective
The table on this slide lists the topics and objectives for this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to define high availability.
(Diagram: availability plotted against investment, showing increasing levels of protection: a journaled file system with Storage Foundation, shared storage, and asynchronous replication with Volume Replicator.)
Data centers may implement different levels of availability depending on their requirements
for availability.
• Backup: At a minimum, all data needs to be protected using an effective backup solution,
such as Veritas NetBackup.
• Data availability: Local mirroring provides real-time data availability within the local data
center. Point-in-time copy solutions protect against corruption. Online configuration keeps
data available to applications while storage is reconfigured to meet changing IT and
business needs. DMP provides resilience against path failure.
• Shared disk groups and cluster file systems: These features minimize application failover
time because the disk groups, volumes, and file systems are available on multiple systems
simultaneously.
• Local clustering: The next level is an application clustering solution, such as Veritas Cluster
Server, for application and server availability.
• Remote replication: After implementing local availability, you can further ensure data
availability in the event of a site failure by replicating data to a remote site. Replication can
be application-, host-, or array-based.
• Remote clustering: Implementing remote clustering ensures that the applications and data
can be started at a remote site. Veritas Cluster Server supports remote clustering with
automatic site failover capability.
A Gartner study shows that large companies experienced a loss of between $3,024,000 and
$4,860,000 (USD) per month for nine hours of unplanned downtime.
In addition to the monetary loss, downtime also results in loss of business opportunities and
reputation. Planned downtime is almost as costly as unplanned. Planned downtime can be
significantly reduced by migrating a service to another server while maintenance is
performed. Given the magnitude of the cost of downtime, the case for implementing a high
availability solution is clear.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to recall clustering terminology.
The term cluster refers to multiple independent systems connected into a management
framework.
Types of clusters
A variety of clustering solutions are available for various computing purposes.
• HA clusters: Provide resource monitoring and automatic startup and failover
• Parallel processing clusters: Break large computational programs into smaller tasks
executed in parallel on multiple systems
• Load balancing clusters: Monitor system load and distribute applications automatically
among systems according to specified criteria
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
(Diagram: active/active [A/A] and active/passive [A/P] cluster configurations compared by utilization; link to examples.)
• N + 1—Similar to N-to-1, the applications restart on the spare after a failure. Unlike the N-
to-1 configuration, after the failed server is repaired, it can become the redundant server.
• N-to-N—This configuration is an active/active configuration that supports multiple
application services running on multiple servers. Each application service is capable of
being failed over to different servers in the cluster.
In the example displayed on the slide, utilization is increased by reconfiguring four
active/passive clusters and one active/active cluster into one N-to-1 cluster and one N-to-N
cluster respectively. This enables a saving of four systems.
Click the link to view some examples of cluster configurations.
Replication
10
Cluster configurations that enable data to be duplicated among multiple physical locations
protect against site-wide failures.
Campus clusters
The campus or stretch cluster environment is a single cluster stretched over multiple
locations, connected by an Ethernet subnet for the cluster interconnect and a fiber channel
SAN, with storage mirrored at each location.
Advantages of this configuration are as follows:
• It provides local high availability within each site as well as protection against site failure.
• It is a cost-effective solution; replication is not required.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Replication
11
After completing this topic, you will be able to explain how applications are
managed in a high availability environment.
12
DG
13
are the same components that the administrator must manually move from a failed server to
a working server to keep the service available to clients in a non-clustered environment.
Application service examples include:
• A Web service consisting of a Web server program, IP addresses, associated network
interfaces used to allow access into the Web site, a file system containing Web data files,
and a volume and disk group containing the file system.
• A database service may consist of one or more IP addresses, database management
software, a file system containing data files, a volume and disk group on which the file
system resides, and a NIC for network access.
(Diagram: two servers, each hosting an application service made up of application, database, IP, NIC, file system, volume, and disk group resources.)
14
Cluster management software performs a series of tasks in order for clients to access a
service on another server in the event a failure occurs. The software must:
• Ensure that data stored on the disk is available to the new server, if shared storage is
configured (Storage).
• Move the IP address of the old server to the new server (Network).
• Start up the application on the new server (Application).
The process of stopping the application services on one system and starting it on another
system in response to a fault is referred to as a failover.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
(Diagram: two sites, each with an application service made up of application, database, IP, NIC, file system, volume, and disk group resources, with data replicated between them.)
15
In a global cluster environment, the application services are generally highly available within a
local cluster, so faults are first handled by the HA software, which performs a local failover.
When HA methods such as replication and clustering are implemented across geographical
locations, recovery procedures are started immediately at a remote location when a disaster
takes down a site.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Function Requirement
16
The most important requirements for an application to run in a cluster are crash tolerance and
host independence. This means that the application should be able to recover after a crash to
a known state, in a predictable and reasonable time, on two or more hosts.
Most commercial applications today satisfy this requirement. More specifically, an application
is considered well-behaved and can be controlled by clustering software if it meets the
requirements shown in the slide.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to list the key requirements for
a clustering environment.
17
Multiple LAN
components and links
Multiple storage
components and links
18
All failovers cause some type of client disruption. Depending on your configuration, some
applications take longer to fail over than others. For this reason, good design dictates that the
HA software first try to fail over within the system, using agents that monitor local resources.
Design as much resiliency as possible into the individual servers and components so that you
do not have to rely on any hardware or software to cover a poorly configured system or
application. Likewise, try to use all resources to make individual servers as reliable as possible.
Single point of failure analysis
Determine whether any single points of failure exist in the hardware, software, and
infrastructure components within the cluster environment. Any single point of failure
becomes the weakest link of the cluster. The application is equally inaccessible if a client
network connection fails, or if a server fails. In addition, consider the location of redundant
components. Having redundant hardware equipment in the same location is not as effective
as placing the redundant component in a separate location.
In some cases, the cost of redundant components outweighs the risk that the component will
become the cause of an outage. For example, buying an additional expensive storage array
may not be practical. Decisions about balancing cost versus availability need to be made
according to your availability requirements.
Secondary DNS
NIS slave
19
• Reference materials
– Veritas InfoScale Release Notes
– Veritas Cluster Server Administrator’s Guide
– https://sort.veritas.com
20
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
21
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
22
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
23
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is C. In an N-to-1 cluster, there is one failover target node for all the applications running in the
cluster, and once the failed system is repaired, the corresponding service group should be moved back.
24
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. One
B. Two
C. Three
D. Four
25
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. One
B. Two
C. Three
D. Four
The correct answer is B. For an active/passive local cluster, the minimum number of systems required is two.
26
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
27
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is B. A global cluster contains multiple clusters in different geographical locations.
28
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
29
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is A. In clustering, the goal is to avoid any single point of failure, whether hardware or software.
For hardware, this is achieved through hardware redundancy.
30
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
31
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is D. You must be licensed to use the software components of an application as well as
clustering software on all the nodes.
32
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
This appendix contains slides that are platform specific and may be
reviewed at the viewer’s discretion and interest. You may opt to end the
presentation now.
33
Failed
34
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After failover
After repair
Passive f/o
node
35
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
F/O
node
After failover
After repair
F/O
node
36
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Failed
37
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
(Diagram: N-to-N failover example. Before failover, service groups A through L run across four systems, three groups per system. After one system fails, its groups A, B, and C are redistributed, one to each of the three surviving systems.)
38
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
End of presentation
01-39
Not for Distribution.
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC
or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This is the VCS Building Blocks lesson in the Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
PART 2: Veritas InfoScale Availability 7.4.2 for UNIX/Linux: InfoScale Availability Additions
Administration
• Lesson 09: Handling Resource Faults
InfoScale Availability Basics • Lesson 10: Intelligent Monitoring Framework
• Lesson 01: High Availability Concepts
• Lesson 11: Cluster Communications
Topic Objective
VCS architecture Summarize the implementation of high availability in the VCS architecture.
The table on this slide lists the topics and objectives for this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to define VCS terminology.
Consisting of:
• Up to 128 systems (nodes).
• An interconnect for cluster communication.
• A public network for client connections.
• Shared storage accessible by each system.
Shared storage
Online service
Offline service
Cluster Interconnect
A VCS cluster is a collection of independent systems working together under the VCS
management framework for increased service availability.
VCS clusters have the following components:
• Up to 128 systems—sometimes referred to as nodes or servers—each running its own
operating system.
• A cluster interconnect, which enables cluster communications.
• A public network, connecting each system in the cluster to a LAN for client access.
• Shared storage (optional), accessible by each system in the cluster that needs to run the
application.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
webip webmnt
A service group is a virtual container that enables VCS to manage an application service as a
unit. The service group contains all the hardware and software components required to run
the service. The service group enables VCS to coordinate failover of the application service
resources in the event of failure or at the administrator’s request.
A service group is defined by these attributes:
• The cluster-wide unique name of the group
• The list of the resources in the service group, usually determined by which resources are
needed to run a specific application service
• The dependency relationships between the resources
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
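As a brief illustration using the websg group and webip resource named on the slide, you can list a service group's contents and a resource's dependencies from the command line:
hagrp -resources websg      # list the resources contained in the service group
hares -dep webip            # show the dependencies of the webip resource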
Failover
• Online on only one cluster system at a time.
• Referred to as active/passive.
• Most common type.
Parallel
• Online on multiple cluster systems simultaneously.
• Referred to as active/active.
• Example: Oracle Real Application Cluster (RAC).
(Diagram: the websg service group containing the webip, webmnt, webnic, webvol, and webdg resources.)
Resources:
• Have unique names throughout the cluster.
• Are always contained within service groups.
• Are categorized as:
– Persistent: Never turned off.
– Nonpersistent: Turned on and off.
Recommendations:
• Match resource and service group names to easily identify all resources in a group.
• Use a naming convention that identifies the service and type of resource.
Resources are VCS objects that correspond to hardware or software components, such as the
application, the networking components, and the storage components.
VCS controls resources through these actions:
• Bringing a resource online (starting)
• Taking a resource offline (stopping)
• Monitoring a resource (probing)
Resource categories
• Persistent, never turned off
• None
VCS can only monitor persistent resources—these resources cannot be brought online
or taken offline. The most common example of a persistent resource is a network
interface card (NIC), because it must be present but cannot be stopped.
• On-only
VCS brings the resource online if required but does not stop the resource if the
associated service group is taken offline. ProcessOnOnly is an example of a resource type
used to start, but not stop, a process such as a daemon.
• Nonpersistent, also known as on-off
Most resources fall into this category, meaning that VCS brings them online and takes
them offline as required. Examples are Mount, IP, and Process.
(Diagram: webip is the parent resource, which depends on the webnic child resource.)
Dependencies:
• Determine parent/child relationships.
• Are defined as the parent being dependent on the child.
• Cannot be cyclical. There must be a clearly defined starting point.
(Diagram: the webmnt resource shown online with its attributes.)
Attributes:
• Specify values used by VCS to manage the resource.
• Can be required or optional, as specified by the resource type definition.
10
Types:
• Specify the attributes needed to define a
resource.
• Are used to create a resource.
• Are similar to source code for a resource.
mount [-F fstype] [options] block_device mount_point    (Solaris, HP-UX)
mount [-t fstype] [options] block_device mount_point    (Linux)
mount [-V fstype] [options] block_device mount_point    (AIX)
11
Resources are classified by resource type. For example, disk groups, network interface cards
(NICs), IP addresses, mount points, and databases are distinct types of resources. VCS
provides a set of predefined resource types—some bundled, some add-ons—in addition to
the ability to create new resource types.
Individual resources are instances of a resource type. For example, you may have several IP
addresses under VCS control. Each of these IP addresses is, individually, a single resource of
resource type IP. A resource type can be thought of as a template that defines the
characteristics or attributes needed to define an individual resource (instance) of that type.
You can view the relationship between resources and resource types by comparing the mount
command for a resource on the previous slide with the mount syntax on this slide. The
resource type defines the syntax for the mount command. The resource attributes fill in the
values to form an actual command line.
IP Vol
NIC DG
12
Agents are processes that control resources. Each resource type has a corresponding agent
that manages all resources of that resource type. Each cluster system runs only one agent
process for each active resource type, no matter how many individual resources of that type
are in use.
Agents control resources using a defined set of actions, also called entry points. The four entry
points common to most agents are:
• Online: Resource startup
• Offline: Resource shutdown
• Monitor: Probing the resource to retrieve status
• Clean: Killing the resource or cleaning up as necessary when a resource fails to be taken
offline gracefully
Each entry point has corresponding scripts or binaries that specify how to perform each of the
four actions. For example, the startup entry point of the Mount agent mounts a block device
on a directory, whereas the startup entry point of the IP agent uses the ifconfig (Solaris,
AIX, HP-UX) or ip addr add (Linux) command to set the IP address on a unique IP alias on
the network interface.
The difference between offline and clean is that offline is an orderly termination and clean is a
forced termination. In UNIX, this can be thought of as the difference between exiting an
application and sending the kill -9 command to the process.
VCS provides both predefined agents and the ability to create custom agents.
13
The Veritas Cluster Server Bundled Agents Reference Guide describes the agents that are
provided with VCS and defines the required and optional attributes for each associated
resource type. Veritas also provides additional application and database agents in an Agent
Pack that is updated quarterly. Some examples of these agents are Data Loss Prevention,
Documentum, and IBM DB2 Database.
Select Downloads > High Availability Agents at the Support Web site for a complete list of
agents available for VCS.
Note: The Veritas Cluster Administrator’s Guide provides an appendix with a complete
description of attributes for all cluster objects.
To obtain PDF versions of product documentation for VCS and agents, visit the SORT Web site.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to outline the different VCS
cluster communication mechanisms.
14
The interconnect:
• Determines affiliated nodes by cluster ID.
• Uses a heartbeat mechanism.
• Maintains a single view of the cluster membership.
• Is also referred to as the private network.
15
VCS requires a cluster communication channel between systems in a cluster to serve as the
cluster interconnect. This communication channel is also sometimes referred to as the private
network because it is often implemented using a dedicated Ethernet network.
Veritas recommends that you use a minimum of two dedicated communication channels with
separate infrastructures—for example, multiple NICs and separate network hubs—to
implement a highly available cluster interconnect.
The cluster interconnect has two primary purposes:
• Determine cluster membership: Membership in a cluster is determined by systems
sending and receiving heartbeats (signals) on the cluster interconnect. This enables VCS to
determine which systems are active members of the cluster and which systems are joining
or leaving the cluster. In order to take corrective action on node failure, surviving members
must agree when a node has departed. This membership needs to be accurate and
coordinated among active members—nodes can be rebooted, powered off, faulted, and
added to the cluster at any time.
• Maintain a distributed configuration: Cluster configuration and status information for
every resource and service group in the cluster is distributed dynamically to all cluster
systems.
Cluster communication is handled by the Group Membership Services/Atomic Broadcast
(GAB) mechanism and the Low Latency Transport (LLT) protocol, as described in the next
sections.
LLT:
• Sends heartbeat messages.
• Transports cluster communication traffic.
LLT • Balances traffic load across multiple network links.
• Runs on an Ethernet network.
LLT
16
GAB:
GAB • Manages cluster membership.
• Sends and receives configuration information.
• Uses LLT as the transport mechanism.
GAB LLT
LLT
17
The GAB atomic broadcast mechanism ensures that all active systems receive all messages for every resource and
service group in the cluster.
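A quick way to confirm that the cluster interconnect is healthy is to check LLT and GAB with their standard status commands (run as root on a cluster node):
lltstat -nvv      # LLT link and node status
gabconfig -a      # GAB port memberships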
GAB
LLT
18
The fencing driver implements I/O fencing, which prevents multiple systems from accessing
the same Volume Manager-controlled shared storage devices in the event that the cluster
interconnect is severed. In the example of a two-node cluster displayed in the diagram, if the
cluster interconnect fails, each system stops receiving heartbeats from the other system.
GAB on each system determines that the other system has failed and passes the cluster
membership change to the fencing module. The fencing modules on both systems contend for
control of the disks according to an internal algorithm. The losing system is forced to panic
and reboot.
The winning system is now the only member of the cluster, and it fences off the shared data
disks so that only systems that are still part of the cluster membership (only one system in this
example) can access the shared storage. The winning system takes corrective action as
specified within the cluster configuration, such as bringing service groups online that were
previously running on the losing system.
HAD
hashadow
HAD:
vxfen • Manages agents and service groups.
• Maintains resource configuration and state information.
• Is monitored by the hashadow daemon.
GAB
LLT
19
The VCS engine, also referred to as the high availability daemon (HAD), is the primary VCS
process running on each cluster system. HAD tracks all changes in cluster configuration and
resource status by communicating with GAB. HAD manages all application services (by way of
agents) whether the cluster has one or many systems. Building on the knowledge that the
agents manage individual resources, you can think of HAD as the manager of the agents. HAD
uses the agents to monitor the status of all resources on all nodes.
This modularity between HAD and the agents allows for efficiency of roles:
• HAD does not need to know how to start up Oracle or any other applications that can
come under VCS control.
• Similarly, the agents do not need to make cluster-wide decisions.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
This modularity allows a new application to come under VCS control simply by adding a new
agent—no changes to the VCS engine are required. On each active cluster system, HAD
updates all the other cluster systems with changes to the configuration or status.
In order to ensure that the had daemon is highly available, a companion daemon,
hashadow, monitors had, and if had fails, hashadow attempts to restart had. Likewise,
had restarts hashadow if hashadow stops.
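To verify that the engine and its companion daemon are running on a node, standard commands such as the following can be used:
hastatus -sum                       # summary of systems, service groups, and their states
ps -ef | egrep 'bin/had|hashadow'   # confirm that the had and hashadow processes are present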
The InfoScale 7.4.2 release introduces the Dualstack IPv4 and IPv6 support for InfoScale Availability/Enterprise.
20
The InfoScale 7.4.2 release introduces the Dualstack IPv4 and IPv6 support for InfoScale
Availability/Enterprise.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
21
22
HAD maintains configuration and state information for all cluster resources in memory on
each cluster system. Cluster state refers to tracking the status of all resources and service
groups in the cluster. When any change to the cluster configuration occurs, such as the
addition of a resource to a service group, HAD on the initiating system sends a message to
HAD on each member of the cluster by way of GAB atomic broadcast, to ensure that each
system has an identical view of the cluster. Atomic means that all systems receive updates, or
all systems are rolled back to the previous state, much like a database atomic commit.
The cluster configuration in memory is created from the main.cf file on disk in the case
where HAD is not currently running on any cluster systems, so there is no configuration in
memory. When you start VCS on the first cluster system, HAD builds the configuration in
memory on that system from the main.cf file. Changes to a running configuration (in
memory) are saved to disk in main.cf when certain operations occur. These procedures are
described in more detail later in the course.
Mount webmnt (
    MountPoint = "/webdata"
    BlockDevice = "/dev/vx/dsk/webdatadg/webdatavol"
    FSType = vxfs
    FsckOpt = "-y"
    )
23
Configuring VCS means conveying to VCS the definitions of the cluster, service groups,
resources, and resource dependencies. VCS uses two configuration files in a default
configuration:
• The main.cf file defines the entire cluster, including the cluster name, systems in the
cluster, and definitions of service groups and resources, in addition to service group and
resource dependencies.
• The types.cf file defines the resource types.
Additional files similar to types.cf may be present if agents have been added. For
example, if the Oracle enterprise agent is added, a resource types file, such as
OracleTypes.cf, is also present. The cluster configuration is saved on disk in the
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to explain the InfoScale support
for multi-version clusters.
24
v.7.4.1 (earlier behavior):
• All the nodes need to have the same HAD version.
• Upgrading can be done only via a Rolling or a Full upgrade.
• If nodes are to be added to a cluster, they should have the same HAD version.
v.7.x.y (InfoScale 7.4.2 and later):
• Nodes in a cluster can have different HAD versions.
• Nodes can be upgraded in an individual manner.
• Nodes can be added with newer HAD versions.
• Downgrading of nodes is allowed.
25
In earlier InfoScale versions, you needed to keep the same HAD version across all the nodes in
a cluster. To upgrade all the cluster nodes, you needed to perform either a rolling upgrade or a
full upgrade. This increases downtime and complexity in the functioning of the cluster. If a
new node is to be added to the cluster, you needed to have the same HAD version as the
other nodes in the cluster.
In the InfoScale 7.4.2 version, you can have nodes with different HAD versions. To upgrade
cluster nodes, you can upgrade nodes individually. You can also add nodes with newer HAD
versions. This ensures that there is no downtime in the functioning of the cluster. In addition,
downgrading, which is the replacement of existing nodes with systems running older HAD
versions, is also possible.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The multi-version cluster feature is not backward compatible. Support for the multi-
version cluster feature is applicable only from InfoScale 7.4.2.
All the nodes in a cluster need to have the same Cluster Protocol Number (CPN).
In future releases, new features or agents will be enabled only if the CPN matches with
the minimum CPN required for those features or agents.
26
Let us discuss some requirements and limitations of the multi-version cluster feature. The
multi-version cluster feature is not backward compatible. It is applicable from InfoScale 7.4.2.
All the nodes in a cluster need to have the same Cluster Protocol Number (CPN). In future
releases, new features or agents will be enabled only if the CPN matches with the minimum
CPN required for those features or agents.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Scalability Upgrade
27
The multi-version cluster feature allows for a scalable solution. While adding new nodes, you
can use the latest release, independently of the version of the existing cluster.
In addition, upgrading is another benefit of this feature. In the latest InfoScale release,
nodes can be upgraded individually. Nodes can be upgraded by adding a new
node with a newer version and then removing an old node with the previous version. If this
process is repeated for all nodes in a cluster, the upgrade can be completed with zero
downtime and without any impact on the level of availability.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
28
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
29
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
30
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
31
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is B. A parent resource needs to go offline before its child. A persistent resource cannot be
taken offline, and hence taking the service group offline is affected.
32
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
33
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is B. There is a corresponding agent for each resource type that manages all the resources of
that specific type. The resource type defines how that particular agent manages the corresponding resources.
34
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. True
B. False
35
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. True
B. False
The correct answer is False. Cluster systems communicate over LLT on a redundant private Ethernet network.
36
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. True
B. False
37
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. True
B. False
The correct answer is False. Agents communicate with HAD on the local system; it is HAD that communicates over
GAB with HAD on the other cluster nodes.
38
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
39
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is C. The role of the hashadow daemon is to keep track of the local HAD on each system. If HAD
goes down, hashadow restarts it.
40
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
End of presentation
02-41
Not for Distribution.
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC
or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This is the VCS Operations lesson in the Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
PART 2: Veritas InfoScale Availability 7.4.2 for UNIX/Linux: InfoScale Availability Additions
Administration
• Lesson 09: Handling Resource Faults
InfoScale Availability Basics • Lesson 10: Intelligent Monitoring Framework
• Lesson 01: High Availability Concepts
• Lesson 11: Cluster Communications
Topic Objective
Common VCS tools and operations Perform common cluster administrative operations.
Service group operations Manage applications under control of VCS service groups.
The table on this slide lists the topics and objectives for this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to perform common cluster
administrative operations.
You can use different VCS interfaces to manage the cluster environment, provided that you
have the proper VCS authorization. VCS user accounts are described in detail in the ‘VCS
Configuration Methods’ lesson.
• Veritas InfoScale Operations Manager (VIOM) is a Web-based interface for administering
managed hosts in local and remote clusters. Installation and configuration of the VIOM
environment is described in detail in the Veritas InfoScale Operations Manager Installation
Guide.
• The VCS command-line interface is installed by default and is best suited for configuration
and management of a local cluster.
• The Java GUI runs on Windows, and although deprecated for UNIX and Linux platforms, it
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
can still be used for some limited configuration and management purposes.
The VCS engine log is located in /var/VRTSvcs/log/engine_A.log. You can view this file with
standard UNIX text file utilities such as tail, more, or view. VCS provides the hamsg utility
that enables you to filter and sort the data in log files.
In addition, you can display the engine log in VIOM to see a variety of views of detailed status
information about activity in the cluster, using selected perspectives like Storage, Availability,
and Server level.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Getting help
• Command-line syntax:
– ha_command –help
– man ha_command
• Cluster Server Administrator’s Guide
The following examples show how to display resource attributes and status.
• Display values of attributes to ensure they are set properly.
hares -display webip
#Resource Attribute System Value
. . .
webip AutoStart global 1
webip Critical global 1
• Determine which resources are non-critical.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
websg s2
• Determine whether a service group is set to automatically start.
hagrp -value websg AutoStart
1
• List the state of a service group on each system.
hagrp -state websg
#Group Attribute System Value
websg State s1 |Online|
websg State s2 |Offline|
After completing this topic, you will be able to manage applications under
control of VCS service groups.
webnic webvol
webdg
Online
hagrp -online
10
When a service group is brought online, resources are brought online starting with the child
resources and progressing up the dependency tree to the parent resources. In order to bring a
failover service group online, VCS must verify that all nonpersistent resources in the service
group are offline everywhere in the cluster. If any nonpersistent resource is online on another
system, the service group is not brought online.
A service group is considered online if all of its nonpersistent and autostart resources are
online. An autostart resource is a resource whose AutoStart attribute is set to 1. The state of
persistent resources is not considered when determining the online or offline state of a
service group because persistent resources cannot be taken offline. However, a service group
is faulted if a persistent resource faults.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
To bring a service group online, use either form of the hagrp command:
• hagrp -online group -sys system
or
• hagrp -online group –any
The -any option brings the service group online based on the group’s failover policy.
webdg
hagrp -offline
11
When a service group is taken offline, resources are taken offline starting with the highest
(parent) resources in each branch of the resource dependency tree and progressing down the
resource dependency tree to the lowest (child) resources.
Persistent resources cannot be taken offline. Therefore, the service group is considered offline
when all nonpersistent resources are offline.
Taking a service group offline using the CLI
To take a service group offline, use either form of the hagrp command:
hagrp -offline group -sys system
Provide the service group name and the name of a system where the service group is online.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
s2
VCS: s1
1. Takes resources offline on system s1.
2. Brings resources online on system s2.
Only resources online on system s1 are brought online
on system s2.
hagrp -switch
12
In order to ensure that failover can occur as expected in the event of a fault, test the failover
process by switching the service group between systems within the cluster.
Switching a service group does not have the same effect as taking a service group offline on
one system and bringing it online on another system. When you switch a service
group, VCS replicates the state of each resource on the target system. If a resource has been
manually taken offline on a system before the switch command is run, that resource is not
brought online on the target system.
To switch a service group, type the following command:
hagrp -switch group -to system
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
You need to provide the service group name and the name of the system where the service
group is to be brought online.
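For example, using the websg group and system names from the slide:
hagrp -switch websg -to s2
hagrp -state websg      # verify that websg is now online on s2 and offline on s1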
When frozen, VCS does not take action on the service group even if you
hagrp -freeze
13
When you freeze a service group, VCS continues to monitor the resources, but it does not
allow the service group (or its resources) to be taken offline or brought online. Failover is also
disabled, even if a resource faults. You can also specify that the freeze is in effect even if VCS is
stopped and restarted throughout the cluster.
Warning: Freezing a service group effectively overrides VCS protection against a concurrency
violation—which occurs when the same application is started on more than one system
simultaneously. You can cause possible data corruption if you bring an application online
outside of VCS while the associated service group is frozen.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
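For example, to freeze the websg group so that the freeze survives a restart of VCS, and later resume normal control (the persistent option typically requires the cluster configuration to be open for writing with haconf -makerw):
hagrp -freeze websg -persistent
hagrp -unfreeze websg -persistent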
After completing this topic, you will be able to manage resources within
VCS service groups
15
webnic webvol
webdg
Online
16
In normal day-to-day operations, you perform most management operations at the service
group level. However, you may need to perform maintenance tasks that require one or more
resources to be offline while others are online. Also, if you make errors during resource
configuration, you can cause a resource to fail to be brought online.
Bringing resources online using the CLI
To bring a resource online, type hares -online resource -sys system
You need to provide the resource name and the name of a system that is configured to run
the service group.
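For example, to bring the webdg resource shown in the slide online on system s1:
hares -online webdg -sys s1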
Note: The service group shown in the slide is partially online after the webdg resource is brought online. This is depicted by the textured coloring of the service group circle.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
dbnic dbvol
dbdg
hares -offline
17
Taking resources offline should not be a normal occurrence. Taking resources offline causes
the service group to become partially online, and availability of the application service is
affected.
If a resource needs to be taken offline, for example, for maintenance of underlying hardware,
then consider switching the service group to another system.
If multiple resources need to be taken offline manually, then they must be taken offline in
resource dependency tree order, that is, from top to bottom.
Taking a resource offline and immediately bringing it online may be necessary if, for example,
the resource must reread a configuration file due to a change. Or you may need to take a
database resource offline in order to perform an update that modifies the database files.
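For example, to take the dbdg resource shown in the slide offline on system s1, after its parent resources have been taken offline:
hares -offline dbdg -sys s1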
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to describe the core VCS enhancements in 7.4.2.
18
• HAD startup can be customized by editing two new VCS custom scripts.
• In Solaris and AIX, hastart makes internal service callbacks to svcadm enable
vcs and to /etc/init.d/vcs.rc start.
19
HAD startup can be customized by editing two new VCS custom scripts. This gives better control
over HAD startup and removes the need for customers to maintain personal scripts. New environment
variables are introduced to control the evacuation of service groups and the enabling or disabling
of the CmdServer during a reboot/shutdown or a start/stop of the VCS service.
On Solaris and AIX, hastart makes internal service callbacks to svcadm enable vcs
and to /etc/init.d/vcs.rc start, respectively. This provides a consistent VCS service state
across reboot and manual startup.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
20
pre_hastart custom_had_start
21
Two custom scripts, pre_hastart and custom_had_start, are called as part of the
VCS startup.
• pre_hastart is called by the VCS service startup script before the hastart
command is executed.
• custom_had_start is called by the hastart script to allow custom code to start
the HAD binary (for example, a HAD start-up done from a command shell that has a new
process authentication group).
Sample scripts are part of the VRTSvcs package and are available in the folder specified on the
slide. This folder contains sample code and return value conventions. After customizing the
scripts, copy them to the $VCS_HOME/bin folder for execution. Note that only
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
NOEVACUATE STARTCMDSERVER
22
Service Callbacks:
• Ensure consistent behaviour across reboot and manual start-up.
• Ensure common behaviour across platforms.
• Provides $VCS_HOME/bin/pre_hastart execution as part of the hastart call.
• Reflects the correct status of VCS at service level.
23
For Solaris: hastart executes 'svcadm restart vcs' or 'svcadm enable vcs',
depending on the status of the service.
For AIX : /etc/init.d/vcs.rc start is called.
Service Callbacks ensure consistent behaviour across reboot and manual start-up. It also
ensures common behaviour across platforms. It provides
$VCS_HOME/bin/pre_hastart execution as part of the ‘hastart’ call and reflects
the correct status of VCS at the service level.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
(Flowchart: service callback with custom script support. The VCS service startup checks for a pre_hastart script and executes it before calling hastart; hastart then starts HAD through the custom_had_start script if it exists, otherwise through the traditional HAD startup.)
24
The flowchart on the slide displays Service callback with Custom Script support.
The custom script support has limitations. VCS cannot determine whether the
custom_had_start script succeeded in starting the HAD daemon; a failure is treated as a
user error. If a command hangs in a custom script, the VCS service transitions to the failed
state once it reaches the start timeout, and the hung process continues to run along with the
caller scripts.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• Key points
– In this lesson, you learned to use VCS tools to manage applications under VCS control.
– In addition, you learned about the use of VOM to practice managing resources and service groups.
– Finally, you learned about the two new custom scripts and two new environment variables and other
enhancements in core VCS service support in 7.4.2.
• Reference materials
– SORT downloads
– Veritas Cluster Server Release Notes
– Veritas Cluster Server Administrator’s Guide
25
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
26
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
27
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. True
B. False
28
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. True
B. False
The correct answer is False. Once an application is under VCS control, it should be managed through VCS only, not
manually outside VCS.
29
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
30
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is D. VCS service groups can be managed through any of the following: VCS Java GUI, VOM or CLI.
31
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
32
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is C. A service group state will be reflected as offline only when all its nonpersistent resources are
offline on that particular system.
33
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
34
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is A. The main purpose of freezing a service group is to limit its failover during maintenance activity
on that service group.
35
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Enables you to simulate operations using the VCS CLI on a running cluster
B. Enables you to simulate operations using the VCS Java GUI on a running cluster
C. Requires dedicated main.cf and types.cf configuration files
D. Can be run using any valid main.cf and types.cf files
36
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Enables you to simulate operations using the VCS CLI on a running cluster
B. Enables you to simulate operations using the VCS Java GUI on a running cluster
C. Requires dedicated main.cf and types.cf configuration files
D. Can be run using any valid main.cf and types.cf files
The correct answer is D. The VCS Simulator works with any valid main.cf and types.cf files.
37
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
03-38
Not for Distribution.
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC
or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This is the VCS Configuration Methods lesson in the Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
PART 2: Veritas InfoScale Availability 7.4.2 for UNIX/Linux: InfoScale Availability Additions
Administration
• Lesson 09: Handling Resource Faults
InfoScale Availability Basics • Lesson 10: Intelligent Monitoring Framework
• Lesson 01: High Availability Concepts
• Lesson 11: Cluster Communications
Topic Objective
Start and stop VCS and describe the effects of cluster attributes
Starting and stopping VCS
on cluster shutdown.
Overview of configuration
Compare and contrast VCS configuration methods.
methods
Controlling access to VCS Set user account privileges to control access to VCS.
Summarize VCS user and agent account passwords encryption
VCS Password Encryption
standards.
The table on this slide lists the topics and objectives for this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to start and stop VCS and
describe the effects of cluster attributes on cluster shutdown.
41
(Diagram: VCS startup on a two-system cluster. Each system has a local main.cf file and runs the had and hashadow daemons; hastart is run on s1. The numbered callouts correspond to the steps described below.)
The default VCS startup process is demonstrated using a cluster with two systems connected
by the cluster interconnect. To illustrate the process, assume that neither system has an
active cluster configuration.
1. The hastart command is run on s1 and starts the had and hashadow processes.
2. HAD checks for a valid configuration file (hacf -verify config_dir).
3. HAD checks for an active cluster configuration on the cluster interconnect.
4. There is no active cluster configuration, hence HAD on s1 reads the local main.cf file and
loads the cluster configuration into local memory. The s1 system is now in the VCS local
build state, which means that the VCS is building a cluster configuration in the memory on the s1 system.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
(Diagram: VCS startup continues. hastart is run on s2; the numbered callouts correspond to steps 5 through 10 described below.)
5. The hastart command is then run on s2 and starts had and hashadow on s2. The s2
system is now in the VCS current discover wait state which means that the VCS is in a wait
state while it is discovering the current state of the cluster.
6. HAD on s2 checks for a valid configuration file on the disk.
7. HAD on s2 checks for an active cluster configuration by sending a broadcast message out
on the cluster interconnect, even if the main.cf file on s2 is valid.
8. HAD on s1 receives the request from s2 and responds.
9. HAD on s1 sends a copy of the cluster configuration over the cluster interconnect to s2.
The s1 system is now in the VCS running state, which implies that the VCS determines that
there is a running configuration in memory on system s1. The s2 system is now in the VCS
remote build state, which means that the VCS is building the cluster configuration in the
memory on the s2 system from the cluster configuration that is in a running state on s1.
10. HAD on s2 performs a remote build operation to place the cluster configuration in the
memory.
(Diagram: systems s1 and s2 running had, comparing the two shutdown behaviors.)
When $VCS_STOP_TIMEOUT is 0:
• The stop operation does not time out.
• HAD continues to be in the LEAVING state.
• Administrative intervention might be required.
When $VCS_STOP_TIMEOUT is set to a non-zero value:
• The stop operation times out after the above mentioned value (seconds).
• VCS stops itself forcefully (hastop -local -force).
There are several methods of stopping the VCS engine (had and hashadow daemons) on a
cluster system. The options you specify to hastop determine where VCS is stopped, and how
resources under VCS control are affected. However, before 7.3.1 there were different
behaviors of VCS shutdown across different platforms.
In previous releases, the VCS shutdown mechanism stopped components in this order:
Application > HAD > VXFEN > GAB > LLT
If the application hung while going offline, the init/rc process hung indefinitely, so a system
shutdown or restart was halted forever.
In release 7.3.1, a new variable, $VCS_STOP_TIMEOUT, is introduced.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• User can decide the VCS behavior through the $VCS_STOP_TIMEOUT variable
— Default value of the variable is 0.
— Non-zero value is considered as timeout activated.
• Control over the VCS shutdown behavior.
• Provides common behavior across supported platforms.
$VCS_STOP_TIMEOUT variable
• If $VCS_STOP_TIMEOUT is set to a non-zero value, the VCS service stop will timeout
after the given value.
Possible values of $VCS_STOP_TIMEOUT
• 0 = Do not timeout VCS service stop
• Non-zero value = Timeout VCS service stop after the given value (value is in seconds)
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
$VCS_STOP_TIMEOUT variable location
• For Linux = /etc/sysconfig/vcs
• For Solaris = /etc/default/vcs
• For AIX = /etc/default/vcs
$VCS_STOP_TIMEOUT default value: 0
10
Use the $VCS_STOP_TIMEOUT variable to define the VCS behavior while executing the hastop
command.
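As a sketch, on a Linux system the variable could be set by adding a line such as the following to /etc/sysconfig/vcs (the value shown is arbitrary):
VCS_STOP_TIMEOUT=120
With this setting, a VCS service stop that has not completed after 120 seconds times out and VCS stops itself forcefully.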
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to compare and contrast VCS
configuration methods.
11
Offline configuration: HAD must restart.
• Manual modification of configuration files.
• Modification of a main.cf file using the VCS Simulator.
# vi main.cf
include "types.cf"
cluster webvcs (
)
System s1 (
)
System s2 (
)
group websg (
…
12
VCS provides several tools and methods for configuring service groups and resources,
generally categorized as:
• Online configuration
You can modify the cluster configuration while VCS is running using either the graphical
user interfaces or the command-line interface. These online methods change the cluster
configuration in memory. When finished, you write the in-memory configuration to the
main.cf file on disk to preserve the configuration.
• Offline configuration
In some circumstances, you can simplify cluster implementation and configuration using offline configuration methods.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to explain the online
configuration method.
13
(Diagram: two cluster systems, each running the LLT, GAB, vxfen, and HAD stack, with HAD maintaining the in-memory configuration on each system.)
14
When you use the Cluster Manager to modify the configuration, the GUI communicates with
the HAD on the specified cluster system to which the Cluster Manager is connected.
Note: The Cluster Manager configuration requests are shown conceptually as ha commands
in the diagram, but they are implemented as system calls.
The had daemon communicates the configuration change to had on all other nodes in the
cluster, and each had daemon changes the in-memory configuration.
When the command to save the configuration is received from the Cluster Manager, had
communicates this command to all cluster systems, and each system’s had daemon writes the
in-memory configuration to the main.cf file on its local disk.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The VCS command-line interface is an alternate online configuration tool. When you run ha
commands, had responds similarly.
Note: When two administrators are changing the cluster configurations simultaneously, each
administrator can see the changes as they are being made.
(Diagram: ReadOnly=0. After (1) haconf -makerw and (2) hares -modify ..., the shared cluster configuration in memory is not equal to the main.cf file on disk.)
15
You must open the cluster configuration to add service groups and resources, make
modifications, and perform certain operations.
The state of the configuration is maintained in an internal attribute, ReadOnly. If you try to
stop VCS with the configuration open, a warning is displayed. This helps to ensure that you
remember to save the configuration to disk and avoid losing any changes made while the
configuration was open. You can override this protection, as described later in this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
(Diagram: ReadOnly=0. After haconf -dump, the in-memory configuration matches the main.cf file on disk; the configuration remains open.)
16
When you save the cluster configuration, VCS copies the configuration in memory to the
main.cf file in the /etc/VRTSvcs/conf/config directory on all running cluster
systems. At this point, the configuration is still open. You have only written the in-memory
configuration to disk and have not closed the configuration.
If you save the cluster configuration after each change, you can view the main.cf file to see
how the in-memory modifications are reflected in the main.cf file.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
(Diagram: ReadOnly=1. After haconf -dump -makero, the in-memory configuration matches the main.cf file on disk and the configuration is closed.)
17
When the administrator saves and closes the configuration, VCS changes the state of the
configuration to closed where the ReadOnly attribute is equal to 1. In addition, VCS writes the
configuration in memory to the main.cf file.
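A typical online configuration session therefore follows this pattern, shown here with a hypothetical resource and attribute:
haconf -makerw
hares -modify webip Address "10.10.21.199"
haconf -dump -makero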
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
18
When the cluster configuration is open, you cannot stop VCS without overriding the warning
that the configuration is open.
• If you ignore the warning and stop VCS while the configuration is open, you may lose
configuration changes.
• If you forget to save the configuration and shut down VCS, the configuration in the
main.cf file on disk may not be the same as the configuration that was in memory
before VCS was stopped.
You can configure VCS to automatically back up the in-memory configuration to the disk to
minimize the risk of losing modifications made to a running cluster.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
# ls –l /etc/VRTSvcs/conf/config/main*
-rw------ 2 root other 5992 Oct 10 12:07 main.cf
-rw------ 1 root root 5039 Oct 8 8:01 main.cf.08Oct2015...
-rw------ 2 root other 5051 Oct 9 17:58 main.cf.09Oct2015...
-rw------ 2 root other 5992 Oct 10 12:07 main.cf.10Oct2015...
-rw------ 1 root other 6859 Oct 11 7:43 main.cf.autobackup
-rw------ 2 root other 5051 Oct 9 17:58 main.cf.previous
19
You can set the BackupInterval cluster attribute to automatically save the in-memory
configuration to disk periodically. When set to a value greater than or equal to three minutes,
VCS automatically saves the configuration in memory to the main.cf.autobackup file.
Note: If no changes are made to the cluster configuration during the time period set in the
BackupInterval attribute, no backup copy is created. If necessary, you can copy the
main.cf.autobackup file to main.cf and restart VCS to build the configuration in
memory at the point in time of the last backup. Ensure that you understand the VCS startup
sequence described in the “Starting and Stopping VCS” section before you attempt this type
of recovery.
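For example, to save the in-memory configuration to main.cf.autobackup every five minutes, the cluster attribute can be modified while the configuration is open:
haconf -makerw
haclus -modify BackupInterval 5
haconf -dump -makero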
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to set user account privileges to
control access to VCS.
20
Stronger security for VCS users with a 2048-bit key and SHA256 signature, in secure mode.
21
If you have not configured security in the cluster, VCS has a completely separate list of user
accounts and passwords to control access to VCS.
When using the Cluster Manager to perform administration, you are prompted for a VCS
account name and password. Depending on the privilege level of that VCS user account, VCS
displays the Cluster Manager GUI with an appropriate set of options. If you do not have a valid
VCS account, you cannot run the Cluster Manager.
When using the command-line interface for VCS, you are also prompted to enter a VCS user
account and password after which it is determined whether that user account has proper
privileges to run the command. One exception is the UNIX root user. By default, only the UNIX
root account is able to use VCS ha commands to administer VCS from the command line.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
1. Set the VCS_HOST environment variable to the node name to administer global groups (in remote clusters).
2. Log in to VCS using halogin vcs_user_name.
3. Type the password.
22
The halogin command is provided to save authentication information so that users do not
have to enter credentials every time a VCS command is run.
The command stores authentication information in the user’s home directory. You must set
the VCS_HOST environment variable to the name of the node from which you are running VCS
commands to use halogin.
Note: The effect of halogin only applies for that shell session.
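For example, to store credentials for a hypothetical VCS user named admin when running commands against node s1:
export VCS_HOST=s1
halogin admin
You are then prompted for the password; subsequent ha commands in that shell session do not prompt again.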
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
23
You can ensure that different types of administrators in your environment have a VCS
authority level to affect only those aspects of the cluster configuration that are appropriate to
their level of responsibility.
For example, if you have a DBA account that is authorized to take a database service group
offline or switch it to another system, you can make a VCS Group Operator account for the
service group with the same account name. The DBA can then perform operator tasks for that
service group, but cannot affect the cluster configuration or other service groups. If you set
AllowNativeCliUsers to 1, then the DBA logged on with that account can also use the VCS
command line to manage the corresponding service group.
Setting VCS privileges is described in the next section.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Group privilege:
hauser -addpriv user priv -group group
5. Save and close the configuration.
24
VCS users are not the same as UNIX users except when running VCS in secure mode. If you
have not configured security in the cluster, VCS maintains a set of user accounts separate from
UNIX accounts. In this case, even if the same user exists in both VCS and UNIX, this user
account can be given a range of rights in VCS that does not necessarily correspond to the
user’s UNIX system privileges.
The slide shows how to use the hauser command to create users and set privileges. You can
also add or remove privileges with the -addpriv and -deletepriv options to hauser.
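For example, to create a hypothetical VCS user named dba and grant it Group Operator privileges for the dbsg service group (hauser -add prompts for the new user's password):
haconf -makerw
hauser -add dba
hauser -addpriv dba Operator -group dbsg
haconf -dump -makero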
In non-secure mode, VCS passwords are stored in the main.cf file in encrypted format. If
you use a GUI or CLI to set up a VCS user account, passwords are encrypted automatically. If
you edit the main.cf file, you must encrypt the password using the vcsencrypt command.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to summarize VCS user and
agent account passwords encryption standards.
26
27
In versions prior to 7.3.1, AES 128-bit encryption was used for agent passwords. The vcsencrypt
utility allowed users to encrypt the agent passwords using a security key. The security key
supports AES (Advanced Encryption Standard) encryption, which creates a secure password for
the agent.
The VCS 7.3.1 recommendation was to move to AES 256-bit encryption for enhanced security
and to use a randomized initialization vector in the AES-CBC implementation.
• AES-256 bit encryption provides better security to the users as compared to the AES-128
encryption. The solution proposed is cross-platform.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• Users can get confused by different behaviors of VCS across platforms. VCS, being a
cross-platform product, ensures common behavior across OS platforms.
• AES-256 encryption is more secure than AES-128 encryption. It has a 256-bit key, which
means there are 2^256 possible keys to brute-force, as opposed to 2^128 in AES 128-bit
encryption.
28
A 256-bit encryption key is generated using the openssl APIs and stored in the main.cf.
Then a randomized initialization vector is generated, which is also stored in the
main.cf. The vcsencrypt utility encrypts the password using the secure key and the
initialization vector. Backward compatibility of the agent password is supported.
Example:
• A user already uses AES 128 bit encryption and upgrades to VCS 7.3.1.
• The agent will be able to decrypt the passwords.
• But once the user decides to generate a new 256-bit secure key, they have to encrypt the passwords again using the new key.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
2 This key is generated by accepting a passphrase from the customer and appending it with fixed value
buffer.
3 This value is now hashed, obfuscated, converted to hexadecimal and is dumped in the
configuration file.
5 An ipm connection is opened with engine and the encrypted password is sent to HAD.
6 HAD decrypts this password and the password corresponding to the user in the configuration file.
29
In versions prior to 7.4.2, VCS user passwords used home-grown algorithms that were weak
and susceptible to dictionary attacks. In InfoScale version 7.4.2, the home-grown
algorithm is discarded. Both user and agent passwords are encrypted using the standard AES 256
(Advanced Encryption Standard) algorithm. As there were no prerequisites for the AES 256
encryption implementation, the system and license requirements for InfoScale Availability
remain the same.
1. Generate the 256 bit symmetric key which would be used for encryption and decryption
of passwords.
2. This key is generated by accepting a passphrase from the customer and appending it with
fixed value buffer.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
3. This value is now hashed, obfuscated, converted to hexadecimal, and is dumped in the
configuration file.
4. When a user tries to log in, they are prompted for a password.
5. An ipm connection is opened with the engine and the encrypted password is sent to HAD.
6. HAD decrypts this password and the password corresponding to the user in the
configuration file.
7. If these passwords match, the user can access the Cluster.
• User and Agent passwords can also be encrypted using the “vcsencrypt” utility and
manually added to the configuration:
User Agent
Password vcsencrypt -vcs vcsencrypt -agent
30
• The 256-bit secure key used for encryption must be generated manually by using the
following command:
vcsencrypt -gensecinfo
Please enter a passphrase of minimum 8 characters.
Passphrase:
SecInfo generated successfully.
Trying to update its value in config file.
• User and Agent passwords can also be encrypted using the vcsencrypt utility and manually added to the configuration.
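For example, to generate an encrypted string for an agent password, run the following command and paste the printed value into the configuration; a VCS user password is encrypted the same way with the -vcs option:
vcsencrypt -agent
vcsencrypt -vcs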
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• Key points
– In this lesson, you learned to change the default VCS shutdown behavior using Cluster attributes and
back up the cluster configuration.
– In addition, you learned about online configuration that enables you to keep VCS running while making
configuration changes.
– Finally, you learned VCS user and agent account passwords encryption using AES 256.
• Reference materials
– Veritas Cluster Server Administrator’s Guide
– Veritas Cluster Server Command Line Quick Reference
– https://sort.veritas.com
31
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
32
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
33
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. GAB
B. LLT
C. hashadow
D. had
34
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. GAB
B. LLT
C. hashadow
D. had
The correct answer is D. HAD daemon reads cluster configuration from the main.cf file at startup.
35
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Cluster Operator
B. Group Administrator
C. Group Operator
D. Service Administrator
36
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Cluster Operator
B. Group Administrator
C. Group Operator
D. Service Administrator
The correct answer is B. A Group Administrator account limits full access to the specific service group.
37
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
38
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is C. Assign each user the Cluster Guest VCS user access level.
39
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. hastop -force
B. hastop -migrate
C. hastop -local -evacuate
D. hastop -local -force
40
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. hastop -force
B. hastop -migrate
C. hastop -local -evacuate
D. hastop -local -force
The correct answer is D. The -force option allows the applications to keep running even though HAD is stopped locally
on that node.
41
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
42
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is B. You need to make sure that the configuration is in read-only mode. If you try to perform
hastop without the -force option, it throws an error message asking you to change the configuration to read-only;
with the -force option, no such check is performed.
43
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. True
B. False
44
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. True
B. False
45
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
04-46
Not for Distribution.
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC
or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This is the Preparing Services for VCS lesson in the Veritas InfoScale 7.4.2 Fundamentals
for UNIX/Linux: Administration course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
PART 2: Veritas InfoScale Availability 7.4.2 for UNIX/Linux: InfoScale Availability Additions
Administration
• Lesson 09: Handling Resource Faults
InfoScale Availability Basics • Lesson 10: Intelligent Monitoring Framework
• Lesson 01: High Availability Concepts
• Lesson 11: Cluster Communications
Topic Objective
Preparing applications for VCS Prepare applications for the VCS environment.
Testing the application service Test the application services before placing them under VCS control.
Stopping and migrating a service Stop resources and manually migrate a service.
The table on this slide lists the topics and objectives for this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to prepare applications for the
VCS environment.
Application
Process
Storage
File system
Volume
Network Disk group
s1
IP address 10.10.21.198
NIC
An application service is the service that the end-user perceives when accessing a particular
network address. An application service typically consists of multiple components, some
hardware- and some software-based, all cooperating together to produce a service.
For example, a service can include application software (processes), a file system containing
data files, a physical disk on which the file system resides, one or more IP addresses, and a
NIC for network access.
If this application service needs to be migrated to another system for recovery purposes, all of
the components that compose the service must migrate together to re-create the service on
another system.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Application
Storage
Process
Network
IP Mount
10.10.21.198
NIC Volume
DiskGroup
The first step in preparing services to be managed by VCS is to identify the components
required to support the services. These components should be itemized in your design
worksheet and may include the following, depending on the requirements of your application
services:
Shared storage resources:
• Disks or components of a logical volume manager, such as Volume Manager disk groups
and volumes
• File systems to be mounted
• Directory mount points
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Network-related resources:
• IP addresses
• Network interfaces
Application-related resources:
• Identical installation and configuration procedures completed on each cluster node
• Procedures to manage and monitor the application
• The location of application binary and data files
The following sections describe the aspects of these components that are critical to
understanding how VCS manages resources.
Perform one-time
configuration tasks.
Use the procedure shown in the diagram to prepare and test application services on each
system before placing the service under VCS control. Consider using a design worksheet to
obtain and record information about the service group and each resource. This is the
information you need to configure VCS to control these resources.
Details are provided in the following section.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
In order to configure the operating system resources you have identified as requirements for
an application, you need the detailed configuration information used when initially
configuring and testing services.
You can use a design diagram and worksheet while performing one-time configuration tasks
and testing to:
• Show the relationships between the resources, which determine the order in which you
configure, start, and stop resources.
• Document the values needed to configure VCS resources after testing is complete.
Note: If your systems are not configured identically, you must note those differences in the
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
design worksheet. The “Online Configuration” lesson shows how you can configure a resource
with different attribute values for different systems.
10
Verify that the resources specified in your design worksheet are appropriate and complete for
your platform. Refer to the Veritas Cluster Server Bundled Agents Reference Guide before you
begin configuring resources.
The examples displayed in the slides in this lesson show values for various operating system
platforms, indicated by the icons. In the case of the appsg service group shown in the slide,
the lan2 value of the Device attribute for the NIC resource is specific to HP-UX. Solaris, Linux,
and AIX have other operating system-specific values, as shown in the respective Bundled
Agents Reference Guides.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Make a file system:
mkfs -V | -t | -F vxfs /dev/vx/rdsk/appdatadg/appdatavol
(-V on AIX, -t on Linux, -F on Solaris/HP-UX)
11
The diagram shows the procedure for configuring shared storage on the initial system. In this
example, Volume Manager is used to manage shared storage.
Note: Although examples used throughout this course are based on Veritas Volume Manager,
VCS also supports other volume managers. VxVM is shown for simplicity—objects and
commands are essentially the same on all platforms. The agents for other volume managers
are described in the Veritas Cluster Server Bundled Agents Reference Guide.
Preparing shared storage, such as creating disk groups, volumes, and file systems, is
performed once, from one system. Then you must create mount point directories on each
system.
The options to mkfs differ depending on platform type, as displayed in the following examples.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
AIX
mkfs -V vxfs /dev/vx/rdsk/appdatadg/appdatavol
Linux
mkfs -t vxfs /dev/vx/rdsk/appdatadg/appdatavol
Solaris/HP-UX
mkfs -F vxfs /dev/vx/rdsk/appdatadg/appdatavol
Some applications have VCS-specific installation and configuration
requirements, described in the VCS agent guides for those
applications, such as Oracle.
12
You must ensure that the application is installed and configured identically on each system
that is a startup or failover target and manually test the application after all dependent
resources are configured and running.
Some VCS agents have application-specific installation instructions to ensure the application is
installed and configured properly for a cluster environment. Check the Veritas Services and
Operations Readiness Tools (SORT) Web site for application-specific guides, such as the Veritas
Cluster Server Agent for Oracle Installation and Configuration Guide.
Depending on the application requirements, you may need to
• Create user accounts.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to test the application services
before placing them under VCS control.
13
Stop resources.
Application software
14
Before configuring a service group in VCS to manage an application, test the application
components on each system that can be a startup or failover target for the service group.
Following this best practice recommendation ensures that VCS can successfully manage the
application service after you configure a service group to manage the application.
The testing procedure emulates how VCS manages application services and must include:
• Startup: Online
• Shutdown: Offline
• Verification: Monitor
The actual commands used may differ from those used in this lesson. However, conceptually,
the same type of action is performed by VCS. Example operations are described for each
component throughout this section.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
2. Start the volume, if autorecovery has been disabled (and for pre-6.0 VCS).
vxvol -g appdatadg start appdatavol
15
Verify that shared storage resources are configured properly and accessible. The examples
shown in the slide are based on using Volume Manager.
1. Import the disk group.
2. Start the volume.
3. Mount the file system.
Mount the file system manually for the purposes of testing the application service. Do not
configure the operating system to automatically mount any file system that will be controlled
by VCS.
Examples of mount commands are provided for each platform.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
AIX
mount -V vxfs /dev/vx/dsk/appdatadg/appdatavol /appdata
Linux
mount -t vxfs /dev/vx/dsk/appdatadg/appdatavol /appdata
Solaris/HP-UX
mount -F vxfs /dev/vx/dsk/appdatadg/appdatavol /appdata
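As a sketch, the complete storage test sequence on a Linux system for the appdatadg disk group and appdatavol volume used in this example might look like this (the mount point is created once on each system):
vxdg import appdatadg
vxvol -g appdatadg start appdatavol
mkdir -p /appdata
mount -t vxfs /dev/vx/dsk/appdatadg/appdatavol /appdata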
http://eweb.com
DNS
Admin IP on s1 e1000g0
configured at boot s1
10.10.21.8
16
The example in the slide demonstrates how users access services through a virtual IP address
that is specific to an application. In this scenario, VCS is managing a Web server that is
accessible to network clients over a public network.
1. A network client requests access to http://eweb.com.
2. The DNS server translates the host name to the virtual IP address of the Web server.
3. The virtual IP address is managed and monitored by a VCS IP resource in the Web service
group. The virtual IP address is associated with the next virtual network interface for
e1000g0, which is e1000g0:1 in this example of Solaris network interfaces.
4. The system which has the service group online accepts the incoming request on the virtual IP address.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Note: The administrative IP address is associated with a physical network interface on a
specific system and is configured by the operating system during system startup. These are
also referred to as base or test IP addresses.
http://eweb.com
ifconfig e1000g0 addif 10.10.21.198 up
DNS
Virtual IP configured by IP resource e1000g0:1
s2
17
The diagram in the slide shows what happens if the system running the Web service group
(s1) fails.
1. The IP address is no longer available on the network. Network clients may receive errors
that web pages are not accessible.
2. VCS on the running system (s2) detects the failure and starts the service group.
3. The IP resource is brought online, which configures the same virtual IP address on the
next available virtual network interface alias, e1000g0:1 in this example. This virtual IP
address floats, or migrates, with the service. It is not tied to a system.
4. The network client Web request is now accepted by the s2 system.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Note: The admin IP address on s2 is also configured during system startup. This address is
unique and associated with only this system, unlike the virtual IP address.
CAUTION: The administrative IP address cannot be placed under VCS control. This address
must be configured by the operating system. Ensure that you do not configure an IP resource
with the value of the administrative IP address.
18
Configure the application IP addresses associated with specific application services to ensure
that clients can access the application service using the specified address.
Application IP addresses are configured as virtual IP addresses. On most platforms, the
devices used for virtual IP addresses are defined as interface:number.
Note: These virtual IP addresses are only configured temporarily for testing purposes. You
must not configure the operating system to manage the virtual IP addresses.
The following examples show the platform-specific commands used to configure a virtual IP
address for testing purposes.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• AIX
Create an alias for the virtual interface and bring up the IP on the next available logical
interface.
ifconfig en1 inet 10.10.21.198 netmask 255.0.0.0 alias
• HP-UX
1. Configure IP address using the ifconfig command.
ifconfig lan2:1 inet 10.10.21.198
2. Use ifconfig to manually configure the IP address to test the configuration without
rebooting.
ifconfig lan2:1 up
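The equivalent commands on Linux and Solaris are approximately as follows; interface names are illustrative and the exact syntax may vary by release.
• Linux
ifconfig eth0:1 10.10.21.198 netmask 255.255.255.0 up
• Solaris
ifconfig e1000g0 addif 10.10.21.198 netmask 255.255.255.0 up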
20
When all dependent resources are available, you can start the application software. Ensure
that the application is not configured to start automatically during system boot. VCS must be
able to start and stop the application using the same methods you use to control the
application manually.
Examples of operating system control of applications:
On AIX and HP-UX, rc files may be present if the application is under operating system
control.
On Linux, you can use the chkconfig command to determine if an application is under
operating system control.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
On Solaris 10 platforms, you must disable the Service Management Facility (SMF) using the
svcadm command for some services, such as Apache, to ensure that SMF is not trying to
control the service.
Follow the guidelines for your platform to remove an application from operating system
control in preparation for configuring VCS to control the application.
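For example, to check whether an Apache web server is under operating system control and to disable it, the commands are approximately as follows (service names are illustrative):
Linux: chkconfig --list httpd ; chkconfig httpd off
Solaris 10: svcs -a | grep apache ; svcadm disable <apache_service_FMRI>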
21
You can perform some simple steps, such as those shown in the slide, to verify that each
component needed for the application to function is operating at a basic level.
Note: To test the network resources, access one or more well-known addresses outside of the
cluster, such as local routers, or primary and secondary DNS servers.
This helps you identify any potential configuration problems before you test the service as a
whole, as described in the “Testing the Integrated Components” slide.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
10.10.21.198
s2
s1
22
When all components of the service are running, test the service in situations that simulate
real-world use of the service.
For example, if you have an application with a backend database, you can:
1. Start the database (and listener process).
2. Start the application.
3. Connect to the application from the public network using the client software to verify
name resolution to the virtual IP address.
4. Perform user tasks, as applicable; perform queries, make updates, and run reports.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Another example that illustrates how you can test your service uses Network File System
(NFS). If you are preparing to configure a service group to manage an exported file system,
verify that you can mount the exported file system from a client on the network.
After completing this topic, you will be able to stop resources and manually
migrate a service.
23
24
Stop resources in the order of the dependency tree from the top down after you have finished
testing the service. You must have all resources offline in order to migrate the application
service to another system for testing. The procedure also illustrates how VCS stops resources.
The ifconfig options are platform-specific, as shown in the following examples.
AIX
ifconfig en1 10.10.21.198 delete
HP-UX
ifconfig lan2:1 0.0.0.0
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Linux
ifdown eth0:1
Solaris
ifconfig e1000g0 removeif 10.10.21.198
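As a sketch, the full offline sequence on a Linux system for the example service, from the top of the dependency tree down, might look like this (the application stop command depends on the application):
<stop the application processes>
ifdown eth0:1
umount /appdata
vxvol -g appdatadg stop appdatavol
vxdg deport appdatadg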
Application
Process
Storage
s1
25
After you have verified that the application service works properly on one system, manually
migrate the service between all intended target systems. Performing these operations enables
you to:
• Ensure that your operating system and application resources are properly configured on all
potential target cluster systems.
• Validate or complete your design worksheet to document the information required to
configure VCS to manage the services.
Perform the same type of testing used to validate the resources on the initial system,
including real-world scenarios, such as client access from the network.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to stop resources and manually
migrate a service.
26
(Diagram: resource dependency tree for the appsg service group.)
Resource dependency definition for service group appsg:
Parent resource   Requires   Child resource
appvol            requires   appdg
appmnt            requires   appvol
appip             requires   appnic
appproc           requires   appmnt
appproc           requires   appip
27
Ensure that the steps you perform to bring resources online and take them offline while
testing the service are accurately reflected in a design worksheet. Compare the worksheet
with service group diagrams you have created or that have been provided to you.
The slide shows the resource dependency definition for the application used as an example in
this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Startup system: s1
AutoStartList: s1, s2
Parallel: 0
28
Check the service group attributes in your design worksheet to ensure that the appropriate
startup and failover systems are listed. Other service group attributes may be included in your
design worksheet, according to the requirements of each service.
Service group definitions consist of the attributes of a particular service group. These
attributes are described in more detail later in the course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• Key points
– In this lesson, you learned to prepare each component of a service and document attributes.
– Finally, you learned to test services in preparation for configuring VCS service groups.
• Reference materials
– Veritas Cluster Server Bundled Agents Reference Guide
– Veritas Cluster Server Administrator’s Guide
– https://sort.veritas.com
29
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
30
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
31
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Starting and stopping the service under VCS control before creating the SystemList
B. Making sure all resources in the service group can be probed
C. Manually testing each service on each system that is a startup or failover target
before placing the service under VCS control
D. Testing all services on the same system before placing under VCS control
32
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Starting and stopping the service under VCS control before creating the SystemList
B. Making sure all resources in the service group can be probed
C. Manually testing each service on each system that is a startup or failover target
before placing the service under VCS control
D. Testing all services on the same system before placing under VCS control
The correct answer is C. As a best practice, you should manually test the application startup and stop procedures on all
the cluster nodes, before you put it under VCS.
33
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
34
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is C. Follow the dependency order of the application components while starting or stopping the
related resources.
35
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
36
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is B. The application IP resource is a key component required for access to the application from
the outside world, so it needs to be migrated to the failover system.
37
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
38
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is A. As the application IP is a virtual IP, it needs an existing IP address, that is, the
administrative IP address on the NIC, as the base IP.
39
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
40
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is D. A mount point needs to be manually created on each system.
41
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
42
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is A. The administrative IP address is configured at the OS level; configuring the same address
under VCS would remove it during the failover of that service group to another system.
43
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
This appendix contains slides that are platform specific and may be
reviewed at the viewer’s discretion and interest. You may opt to end the
presentation now.
44
1. Create an alias for the virtual interface and bring up the IP on the next available logical interface.
2. Edit /etc/hosts to assign a virtual hostname (application service name) to the IP address.
10.10.21.198 eweb.com
45
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
2. Use ifconfig to manually configure the IP address to test the configuration without rebooting.
ifconfig lan2:1 up
3. Edit /etc/hosts and assign a virtual host name (application service name) to the virtual IP address.
10.10.21.198 eweb.com
46
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
2. Edit /etc/hosts to assign a virtual host name (application service name) to the IP address.
10.10.21.198 eweb.com
47
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
1. Plumb the virtual interface and bring up the IP on the next available logical interface.
2. Edit /etc/hosts to assign a virtual hostname (application service name) to the IP address.
10.10.21.198 eweb.com
48
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
For public network access to high availability services, you must configure an administrative IP address associated with
the physical network interface.
• Configure the operating system to bring up the administrative IP address during system boot.
• These addresses are also sometimes referred to as base, maintenance, or test IP addresses.
The administrative IP address may already be configured and only needs to be verified.
49
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
50
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
51
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
52
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
53
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
End of presentation
06-54
Not for Distribution.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
06-55
Not for Distribution.
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC
or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This is the Online Configuration lesson in the Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
PART 2: Veritas InfoScale Availability 7.4.2 for UNIX/Linux: InfoScale Availability Additions
Administration
• Lesson 09: Handling Resource Faults
InfoScale Availability Basics • Lesson 10: Intelligent Monitoring Framework
• Lesson 01: High Availability Concepts
• Lesson 11: Cluster Communications
Topic Objective
Testing the service group Test the service group to ensure proper configuration.
The table on this slide lists the topics and objectives for this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The diagram illustrates a high-level procedure that can be used as a standard methodology to
create service groups and resources while the VCS is running. There are multiple ways to use
this configuration procedure, but following a recommended practice simplifies and
streamlines the initial configuration and facilitates troubleshooting if you encounter problems
during configuration.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Conventions Examples
• <unique_id>_<VCS_type>_<instance> (for example, sap_nic_lan1, sap_nic_lan2)
• <unique_id><VCS_type><instance> (for example, paydbmountredo, paydbmountarch)
Using a consistent pattern for selecting names for VCS objects simplifies initial configuration
of high availability. Perhaps more importantly, applying a naming convention helps avoid
administrator errors and significantly reduces troubleshooting efforts when errors or faults
occur.
As displayed on the slide, Veritas recommends the use of the pattern based on the function of
the service group, and match some portion of the name among all resources and the service
group in which the resources are contained.
When deciding upon a naming convention, consider delimiters, such as dash (-) and
underscore (_), with care. Differences in keyboards may prevent use of some characters,
especially in the case where clusters span geographic locations.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
main.cf
The Startup box specifies that the service group starts automatically when VCS starts on
the system, if the service group is not already online elsewhere in the cluster. This is
defined by the AutoStartList attribute of the service group. In the example displayed in
the slide, the s1 system is selected as the system on which appsg is started when VCS
starts up.
• The type of service group
The Service Group Type selection is Failover by default.
If you save the configuration after creating the service group, you can view the main.cf file
to see the effect of HAD modifying the configuration and writing the changes to the local disk.
Note: You can click the Show Command button to see the commands that are run when you
click OK.
After completing this topic, you will be able to create resources using online
configuration tools.
Bring online
Done
10
Add resources to a service group in the order of resource dependencies starting from the
child resource (bottom up). This enables each resource to be tested as it is added to the
service group.
While adding a resource, you need to specify:
• The service group name
• The unique resource name
If you prefix the resource name with the service group name, you can more easily identify
the service group to which it belongs. When you display a list of resources from the
command line using the hares -list command, the resources are sorted alphabetically.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• The resource type
• Attribute values
Use the procedure shown in the diagram to configure a resource.
Notes:
• It is recommended that you set each resource to be non-critical during initial
configuration. This simplifies testing and troubleshooting in the event that you have
specified incorrect configuration information. If a resource faults due to a configuration
error, the service group does not fail over if resources are non-critical.
• Enabling a resource signals the agent to start monitoring the resource.
11
The NIC resource has only one required attribute, Device, for all platforms other than HP-UX,
which also requires NetworkHosts unless PingOptimize is set to 0.
Optional attributes for NIC vary by platform. Refer to the Veritas Cluster Server Bundled
Agents Reference Guide for a complete definition. These optional attributes are common to all
platforms.
• NetworkType: Type of network, Ethernet (ether)
• PingOptimize: Number of monitor cycles to detect if the configured interface is inactive
A value of 1 optimizes broadcast pings and requires two monitor cycles. A value of 0
performs a broadcast ping during each monitor cycle and detects the inactive interface within a single monitor cycle.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Virtual IP addresses:
• Are configured by the agent using the operating system-specific command
  – AIX/HP-UX/Solaris: ifconfig
  – Linux: ip addr
• Must be different from the administrative IP address
main.cf
IP appip (
  Critical = 0
  Device = eth0
  Address = "10.10.21.198"
  NetMask = "255.255.255.0"
)
12
The slide displays the attribute values for an IP resource (on Solaris) in the appsg service
group, reflected in the main.cf snippet.
The IP resource on Solaris has two required attributes: Device and Address, which specify the
network interface and virtual IP address, respectively. The required attributes vary depending
on the platform.
Optional attributes
• NetMask: Netmask associated with the application IP address
− The value may be specified in decimal (base 10) or hexadecimal (base 16). The default
is the netmask corresponding to the IP address class.
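As a hedged sketch, the appip resource shown on the slide could also be added from the command line, using the attribute values from the main.cf snippet above:
hares -add appip IP appsg
hares -modify appip Critical 0
hares -modify appip Device eth0
hares -modify appip Address "10.10.21.198"
hares -modify appip NetMask "255.255.255.0"
hares -modify appip Enabled 1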
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
hares command: Add a resource and modify resource attributes.
haconf -makerw
hares -add appdg DiskGroup appsg
hares -modify appdg Critical 0
hares -modify appdg DiskGroup appdatadg
hares -modify appdg Enabled 1
haconf -dump -makero
DiskGroup resource:
• Imports and deports a disk group
• Monitors the disk group using vxdg
main.cf
DiskGroup appdg (
  Critical = 0
  DiskGroup = appdatadg
)
13
You can use the hares command to add a resource and configure the required attributes.
This example shows how to add a DiskGroup resource.
The DiskGroup resource has only one required attribute, DiskGroup.
Note: As of version 4.1, when a disk group is brought under VCS control, VCS sets the vxdg
autoimport flag to no, to ensure the disk group is not autoimported during the system boot
process.
Example optional attributes:
• StartVolumes: Starts all volumes after importing the disk group
This also starts layered volumes by running vxrecover -s. The default is 1 (enabled).
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Volume resources:
• Are not required; volumes are started automatically during disk group import
• Provide additional monitoring
main.cf
Volume appvol (
  Critical = 0
  DiskGroup = appdatadg
  Volume = appdatavol
)
14
The Volume resource can be used to manage a VxVM volume. Although the Volume resource
is not strictly required, it provides additional monitoring. You can use a DiskGroup resource to
start volumes when the DiskGroup resource is brought online. This has the effect of starting
volumes more quickly, but only the disk group is monitored.
If you have a large number of volumes in a single disk group, the DiskGroup resource can
time out when trying to start or stop all the volumes simultaneously. In this case, you can set
the StartVolumes and StopVolumes attributes of the DiskGroup resource to 0, and create Volume
resources to start the volumes individually.
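For example, a sketch of this approach using the disk group and volume names from this lesson (actual names in your environment will differ):
hares -modify appdg StartVolumes 0
hares -modify appdg StopVolumes 0
hares -add appvol Volume appsg
hares -modify appvol Critical 0
hares -modify appvol DiskGroup appdatadg
hares -modify appvol Volume appdatavol
hares -modify appvol Enabled 1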
Also, if you are using volumes as raw devices with no file systems, and, therefore, no Mount
resources, consider using Volume resources for the additional level of monitoring.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
15
The Mount resource has the required attributes displayed in the main.cf file excerpt in the
slide.
Example optional attributes:
• MountOpt: Specifies options for the mount command
When setting attributes with arguments starting with a dash (-), use the percent (%)
character to escape the arguments. Examples:
hares -modify appmnt FsckOpt %-y
The percent character is an escape character for the VCS CLI which prevents VCS from
interpreting the string as an argument to hares.
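As a hedged example, adding a Mount resource and setting FsckOpt with the percent escape might look like the following (the block device path and mount point are illustrative assumptions):
hares -add appmnt Mount appsg
hares -modify appmnt Critical 0
hares -modify appmnt BlockDevice /dev/vx/dsk/appdatadg/appdatavol
hares -modify appmnt MountPoint /app
hares -modify appmnt FSType vxfs
hares -modify appmnt FsckOpt %-y
hares -modify appmnt Enabled 1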
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• SnapUmount: Determines whether VxFS snapshots are unmounted when the file system is
taken offline (unmounted)
The default is 0, meaning that snapshots are not automatically unmounted when the file
system is unmounted.
Note: If SnapUmount is set to 0 and a VxFS snapshot of the file system is mounted, the
unmount operation fails when the resource is taken offline, and the service group is not
able to fail over.
This is desired behavior in some situations, such as when a backup is being performed
from the snapshot.
VCS view
hares -display appmnt -attribute VxFSMountLock
#Resource Attribute System Value
appmnt VxFSMountLock global 1
16
Storage Foundation enables a file system to be mounted with a key which must be used to
unmount the file system. The Mount resource has a VxFSMountLock attribute to manage the
file system mount key.
This attribute is set to the ‘VCS’ string by default when a Mount resource is added. The Mount
agent uses this key for online and offline operations to ensure the file system cannot be
inadvertently unmounted outside of VCS control.
You can unlock a file system without unmounting by using the fsadm command:
/opt/VRTS/bin/fsadm -o mntunlock="key" mount_point_name
Note: The example operating system commands for unmounting a locked file system are
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
specific to Solaris. Other operating systems may use different methods for unmounting file
systems.
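For example, on Solaris with VxFS, unmounting a locked file system might look like the following (the mount point is assumed; verify the exact options for your platform and VxFS version):
umount -o mntunlock=VCS /app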
17
The Process resource controls the application and is added last because it requires all other
resources to be online in order to start. The Process resource is used to start, stop, and
monitor the status of a process.
• Online: Starts the process specified in the PathName attribute, with options, if specified,
in the Arguments attribute
• Offline: Sends SIGTERM to the process; SIGKILL is sent if process does not exit within one
second
• Monitor: Determines if the process is running by scanning the process table
The optional Arguments attribute specifies any command-line options to use when starting
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
the process.
18
If the executable is a shell script, you must specify the script name followed by arguments.
You must also specify the full path for the shell in the PathName attribute.
The monitor script calls ps and matches the process name. The process name field is limited
to 80 characters in the ps output. If you specify a path name to a process that is longer than
80 characters, the monitor entry point fails.
If the executable is a binary, you specify the full path to the executable in the PathName
attribute, and any options to be passed are specified in the Arguments attribute.
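As a sketch, a Process resource that starts an application through a shell script might be configured as follows in main.cf (the resource name and script path are assumptions, not examples from this course):
Process appproc (
  Critical = 0
  PathName = "/bin/sh"
  Arguments = "/app/bin/appserver start"
)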
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to resolve common errors made
during online configuration.
19
(Flowchart: bring the resource online; if it does not come online, check for incorrect attribute values (the primary focus), flush the group, check the logs, and fix the problems; if the resource has faulted, clear the resource; modify attributes as needed and retry until the resource is online.)
20
Verify that each resource is online on the local system before continuing the service group
configuration procedure. If the online operation hangs with the resource stuck in WAITING TO
GO ONLINE state, you can flush the service group using the hagrp –flush command instead of
waiting until the online operation times out and the resource faults. Flushing the service
group changes the WAITING TO GO ONLINE resource state back to OFFLINE.
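For example (service group and system names assumed from this lesson):
hagrp -flush appsg -sys s1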
Note: The flush operation does not halt the resource online operation if it is still running. If a
running operation succeeds after a flush command was fired, the resource state might still
change. If you are unable to bring a resource online, use the procedure in the diagram to find
and fix the problem. You can view the logs through Cluster Manager or in the
/var/VRTSvcs/logs directory if you need to determine the cause of errors. VCS log
entries are written to engine_A.log and agent entries are written to resource_A.log
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
files.
Note: Some resources must be disabled and re-enabled. Only resources whose agents have
open and close entry points, such as MultiNICB, require you to disable and enable again after
fixing the problem. By contrast, a Mount resource does not need to be disabled if, for
example, you incorrectly specify the MountPoint attribute.
However, it is generally a good practice to disable and enable regardless because it is difficult
to remember when it is required and when it is not. In addition, a resource is immediately
monitored upon enabling, which would indicate potential problems with attribute
specification. More detail on performing tasks necessary for solving resource configuration
problems is provided in the following sections.
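A minimal sketch of disabling and re-enabling a resource after correcting an attribute (resource name and corrected value are assumed):
hares -modify appnic Enabled 0
# correct the attribute value, for example:
hares -modify appnic Device eth1
hares -modify appnic Enabled 1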
21
A fault indicates that the monitor entry point is reporting an unexpected offline state for a
previously online resource. This indicates a problem with the underlying component being
managed by the resource.
Before clearing a fault, you must resolve the problem that caused the fault. Use the VCS logs
to help you determine which resource has faulted and why. It is important to clear faults for
critical resources after fixing underlying problems so that the system where the fault originally
occurred can be a failover target for the service group. In a two-node cluster, a faulted critical
resource would prevent the service group from failing back if another fault occurred. You can
clear a faulted resource on a particular system, or on all systems where the service group can
run.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Note: Persistent resource faults should be probed to force the agent to monitor the resource
immediately. Otherwise, the resource status is not online until the next
OfflineMonitorInterval, which may be up to five minutes (default is 300 seconds).
Clearing and probing resources using the CLI
To clear a faulted resource, type:
hares -clear resource [-sys system]
If the system name is not specified then the resource is cleared on all systems.
To probe a resource, type:
hares -probe resource -sys system
After completing this topic, you will be able to explain the implementation
of high availability in the VCS architecture.
22
23
After you have successfully brought each resource online, link the resources and switch the
service group to each system on which the service group can run.
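For example, after linking the resources, you can switch the group to verify it on another system (names assumed from this lesson):
hagrp -switch appsg -to s2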
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
hares -link appip appnic
hares -dep
hares -unlink appip appnic
main.cf
appip requires appnic
24
When you link a parent resource to a child resource, the dependency becomes a component
of the service group configuration. When you save the cluster configuration, each dependency
is listed at the end of the service group definition in the main.cf file, after the resource
specifications. In addition, VCS creates a dependency tree in the main.cf file at the end of
the service group definition to provide a more visual view of resource dependencies. This is
not part of the cluster configuration, as denoted by the // comment markers.
// resource dependency tree
//
// group appsg
// {
// IP appip
//     {
//     NIC appnic
. . .
Note: You cannot use the // characters as general comment delimiters. VCS strips out all lines
with // upon startup and re-creates these lines based on the requires statements in
main.cf.
Service group must be
fully online on one node.
25
You can run a virtual fire drill for a service group to check that the underlying infrastructure is
properly configured to enable failover to other systems. The service group must be fully online
on one system, and can then be checked on all other systems where it is offline.
You can select which type of infrastructure components to check, or run all checks. In some
cases, you can use the virtual fire drill to correct problems, such as making a mount point
directory if it does not exist. However, not all resources have defined actions for virtual fire
drills, in which case a message is displayed indicating that no checks were performed. You can
also run fire drills using the havfd command, as shown in the slide.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
hares -modify appdg Critical 1
main.cf
DiskGroup appdg (
  DiskGroup = appdatadg
)
26
The Critical attribute is set to 1, or true, by default. When you initially configure a resource,
you set the Critical attribute to 0, or false. This enables you to test the resources as you add
them without the resource faulting and causing the service group to fail over as a result of
configuration errors you make.
Some resources may always be set to non-critical. For example, a resource monitoring an
Oracle reporting database may not be critical to the overall service being provided to users. In
this case, you can set the resource to non-critical to prevent downtime due to failover in the
event that it was the only resource that faulted.
Note: When you set an attribute to a default value, the attribute is removed from main.cf.
For example, after you set Critical to 1 for a resource, the Critical = 0 line is removed from the
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
resource configuration because it is now set to the default value for the resource type.
To see the values of all attributes for a resource, use the hares command. For example:
hares -display appdg
27
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
28
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
29
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. VCS automatically writes the resource definition to the main.cf file on disk.
B. Resources are automatically enabled as they are added.
C. Resources are automatically brought online when its required attributes are set.
D. VCS sets resources to Critical by default.
30
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. VCS automatically writes the resource definition to the main.cf file on disk.
B. Resources are automatically enabled as they are added.
C. Resources are automatically brought online when its required attributes are set.
D. VCS sets resources to Critical by default.
The correct answer is D. By default when a resource is added through CLI, it is set to critical.
31
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
32
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is A. The corresponding resource agent stops monitoring a resource when the resource is disabled.
33
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
34
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is C. The file system locking feature provided by VCS prevents accidental or manual unmounting of
a VCS-controlled file system.
35
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
36
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is C. You need to explicitly specify the SystemList attribute to define the
list of systems where that service group can come online.
37
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
38
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is D. A resource is probed when you try to bring it online, so all
required attributes must be set before the resource can come online.
39
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
End of presentation
06-40
Not for Distribution.
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC
or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This is the Offline Configuration lesson in the Veritas InfoScale Availability 7.4.2 for
UNIX/Linux: Administration course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Topic and objective:
• Testing the service group: Test the service group to ensure proper configuration.
The table on this slide lists the topics and objectives for this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to identify scenarios where
offline configuration is applicable.
main.cf
group hr2011db (
  SystemList = { s1 = 0, s2 = 1 }
  AutoStartList = { s1, s2 }
)
group hr2012db (
  SystemList = { s3 = 0, s4 = 1 }
  AutoStartList = { s3, s4 }
)
(Diagram labels: hr2012db, mkt2012db, db2012clus; systems s1 through s4.)
One example where offline configuration is appropriate is when your high availability
environment is expanding and you are adding clusters with similar configurations.
In the example displayed on the slide, the original cluster consists of two systems, with each
system running a database instance. Another cluster with essentially the same configuration is
being added, but it is managing different databases. You can copy configuration files from the
original cluster, make the necessary changes, and then restart VCS as described later in this
lesson. This method may be more efficient than creating each service group and resource
using a graphical user interface or the VCS command-line interface.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
(Diagram: extwebapache and extwebdg resources; intwebapache and intwebdg resources.)
Another example of using offline configuration is when you want to add a service group with a
similar set of resources as another service group in the same cluster.
In the example displayed on the slide, the portion of the main.cf file that defines the
extwebsg service group is copied and edited as necessary to define a new intwebsg service
group.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
(Diagram: VCS Simulator configuration modelclus with systems s1 and s2, and its main.cf file.)
You can use the VCS Simulator to create and test a cluster configuration on the Windows
operating system and then copy the finalized configuration files into a real cluster
environment. The Simulator enables you to create configurations for all supported UNIX,
Linux, and Windows platforms.
This only applies to the cluster configuration. You must perform all the preparation tasks to
create and test the underlying resources, such as virtual IP addresses, shared storage objects,
and applications. After the cluster configuration is copied to the real cluster and VCS is
restarted, you must test all the objects, as shown later in this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to outline offline configuration
procedures.
The diagram illustrates a process for modifying the cluster configuration when you are
configuring your first service group and do not have services already running in the cluster.
Select one system to be your primary node for configuration. Work from this system for all
steps up to the final point of restarting VCS.
1. Save and close the configuration. Always save and close the configuration before making
any modifications. This ensures that the configuration in the main.cf file on disk is the
most recent in-memory configuration.
2. Change to the configuration directory. The examples used in this procedure assume you
are working in the /etc/VRTSvcs/conf/config directory.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
3. Stop VCS. Stop VCS on all cluster systems. This ensures that there is no possibility of
another administrator changing the cluster configuration while you are modifying the
main.cf file.
4. Edit the configuration files. You can choose any system on which to modify the main.cf file.
However, you must then start VCS first on that system.
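A minimal sketch of steps 1 through 4 from the command line, assuming the default configuration directory and that services can be stopped:
haconf -dump -makero
cd /etc/VRTSvcs/conf/config
hastop -all
vi main.cf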
11
The diagram illustrates a process for modifying the cluster configuration when you want to
minimize the time that VCS is not running to protect existing services. This procedure includes
several built-in protections from common configuration errors and maximizes high availability.
First system
Designate one system as the primary change management node. This makes troubleshooting
easier if you encounter problems with the configuration.
1. Save and close the configuration. Save and close the cluster configuration before you start
making changes. This ensures that the working copy has the latest in-memory
configuration.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
2. Back up the main.cf file. Make a copy of the main.cf file with a different name. This
ensures that you have a backup of the configuration that was in memory when you saved
the configuration to disk.
Note: If any *types.cf files are being modified, also back up these files.
3. Make a staging directory. Make a subdirectory of /etc/VRTSvcs/conf/config in
which you can edit a copy of the main.cf file. This helps ensure that your edits are not
overwritten if another administrator changes the configuration simultaneously.
13
7. Verify the configuration file syntax. Run the hacf command in the staging directory to
verify the syntax of the main.cf and types.cf files after you have modified them.
Note: The dot (.) argument indicates that the current working directory is used as the
path to the configuration files. You can run hacf -verify from any directory by
specifying the path to the configuration directory:
hacf -verify /etc/VRTSvcs/conf/config
8. Stop VCS. Stop VCS on all cluster systems after making configuration changes. To leave
applications running, use the -force option, as shown in the diagram.
9. Copy the new configuration file. Copy the modified main.cf file and all *types.cf
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
files from the staging directory back into the configuration directory.
10. Start VCS. Start VCS first on the system with the modified main.cf file.
11. Verify that VCS is in a local build or running state on the primary system.
12. Start other systems. After VCS is in a running state on the first system, start VCS on all
other systems. You must wait until the first system has built a cluster configuration in
memory and is in a running state to ensure the other systems perform a remote build
from the first system’s configuration in memory.
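A hedged sketch of steps 7 through 12 (the staging directory name is an assumption):
cd /etc/VRTSvcs/conf/config/stage
hacf -verify .
hastop -all -force
cp main.cf *types.cf /etc/VRTSvcs/conf/config/
hastart            # on the system with the modified main.cf
hastatus -sum      # wait for a local build or running state
hastart            # then on each remaining system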
(Diagram: running hastart on s1 starts the had and hashadow processes; had checks the local main.cf file and the cluster interconnect, then builds the cluster configuration in memory, as described in steps 1 through 4 below.)
14
The diagram illustrates how to start VCS to ensure that the cluster configuration in memory is
built from a specific main.cf file.
Starting VCS using a modified main.cf file
Ensure that VCS builds the new configuration in memory on the system where the changes
were made to the main.cf file. All other systems must wait for the build to successfully
complete and the system to transition to the running state before VCS is started elsewhere.
1. Run hastart on s1 to start the had and hashadow processes.
2. HAD checks for a valid main.cf file.
3. HAD checks for an active cluster configuration on the cluster interconnect.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
4. Since there is no active cluster configuration, HAD on s1 reads the local main.cf file and
loads the cluster configuration into local memory on s1.
5. Verify that VCS is in a local build or running state on s1 using hastatus -sum.
(Diagram: once s1 is in a running state, hastart on s2 starts had and hashadow on s2; s2 performs a remote build from s1 and saves the configuration to its local main.cf, as described in steps 6 through 11 below.)
15
6. When VCS is in a running state on s1, run hastart on s2 to start the had and
hashadow processes.
7. HAD on s2 checks for a valid main.cf file.
8. HAD on s2 checks for an active cluster configuration on the cluster interconnect.
9. The s1 system sends a copy of the cluster configuration over the cluster interconnect to
s2.
10. The s2 system performs a remote build to load the new cluster configuration in memory.
11. HAD on s2 backs up the existing main.cf and types.cf files and saves the current
in-memory configuration to disk.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
16
Ensure that you create the resource dependency definitions at the end of the service group
definition. Add the links using the syntax shown in the slide.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
17
A portion of the completed main.cf file with a new service group definition for intwebsg is
displayed in the slide. This service group was created by copying the extwebsg service group
definition and changing the attribute names and values.
Two errors are intentionally shown in the example on the slide.
• The extwebip resource name was not changed in the intwebsg service group. This causes a
syntax error when the main.cf file is checked using hacf -verify because you cannot
have duplicate resource names within the cluster.
• The intwebdg resource has the value of extwebdatadg for the DiskGroup attribute. This
does not cause a syntax error, but is not a correct attribute value for this resource. The
extwebdatadg disk group is being used by the extwebsg service group and cannot be
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to resolve common errors made
during offline configuration.
18
19
If you are running an old cluster configuration because you started VCS on the wrong system
first, you can recover the main.cf file on the system where you originally made the
modifications, using the main.cf.previous backup file created automatically by VCS.
To recover from an old configuration, use the offline configuration procedure to restart VCS
using the recovered main.cf file. Note: You must ensure that VCS is in the local build or
running state on the system with the recovered main.cf file before starting VCS on other
systems.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
(Diagram: on s1, verify the configuration with hacf -verify /etc/VRTSvcs/conf/config (step 3), then run hasys -force s1 (step 4).)
20
2. Visually inspect the main.cf file and modify or replace the file as necessary to ensure it
contains the correct configuration content.
3. Verify the configuration with hacf -verify /etc/VRTSvcs/conf/config.
4. Run hasys -force s1 on s1. This starts the local build process. You must have a valid
main.cf file to force VCS to a running state. If the main.cf file has a syntax error, VCS
enters the ADMIN_WAIT state.
5. HAD checks for a valid main.cf file.
6. HAD on s1 reads the local main.cf file, and if it has no syntax errors, HAD loads the
cluster configuration into local memory on s1.
(Diagram: with s1 in a local build or running state, hastart on s2 starts had and hashadow; s2 checks its local main.cf, detects s1 running, performs a remote build from s1, and saves the configuration locally, as described in steps 7 through 12 below.)
21
7. When HAD is in a running state on s1, this state change is broadcast on the cluster
interconnect by GAB.
8. Next, run hastart on s2 to start HAD.
9. HAD on s2 checks for a valid main.cf file. This system has an old version of the
main.cf.
10. HAD on s2 then checks for another node in a local build or running state.
11. Since s1 is in a local build or running state, HAD on s2 performs a remote build from the
configuration on s1.
12. HAD on s2 copies the cluster configuration into the local main.cf and types.cf files
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
ls -l /etc/VRTSvcs/conf/config/main*
-rw------ 2 . . . 5992 Jan 10 12:07 main.cf
-rw------ 1 . . . 5039 Jan 8 8:01 main.cf.08Jan2017.13.13.36
-rw------ 2 . . . 5051 Jan 9 17:58 main.cf.09Jan2017.13.41.26
-rw------ 2 . . . 5992 Jan 10 12:07 main.cf.10Jan2017.13.14.20
-rw------ 2 . . . 5051 Jan 9 17:58 main.cf.previous
22
Each time you save the cluster configuration, VCS maintains backup copies of the main.cf
and types.cf files.
This occurs as follows:
1. New main.cf.datetime and *types.cf.datetime files are created.
2. The hard links for main.cf, main.cf.previous, types.cf, and
types.cf.previous (as well as any others) are changed to point to the correct
versions.
Although it is always recommended that you copy configuration files before modifying them,
you can revert to an earlier version of these files if they are damaged or lost.
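For example, to revert to one of the earlier saved copies shown above, you might use a sequence such as the following (the file name is taken from the listing; confirm the correct backup before copying):
hastop -all -force
cd /etc/VRTSvcs/conf/config
cp main.cf.09Jan2017.13.41.26 main.cf
hastart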
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to test the service group to
ensure proper configuration.
23
24
After you restart VCS throughout the cluster, use the procedure shown in the slide to verify
that your configuration additions or changes are correct.
Notes:
• This process is slightly different from online configuration, which tests each resource
before creating the next and before creating dependencies.
• Resources should come online after you restart VCS if you have specified the appropriate
attributes to automatically start the service group.
In case of any configuration problems, use the procedures shown in the “Online
Configuration” lesson. If you need to make additional modifications, you can use one of the
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
online tools or modify the configuration files using the offline procedure.
25
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
26
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
27
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
28
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is C. hacf -verify is the command that must be run against the directory
holding the configuration files.
29
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Copy the main.cf and *types.cf into a staging directory before editing.
B. Change the original main.cf while the cluster is up, then bring the cluster down.
C. Make changes to the service groups before changing cluster attributes.
D. Change resources, then service groups, then cluster attributes.
30
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Copy the main.cf and *types.cf into a staging directory before editing.
B. Change the original main.cf while the cluster is up, then bring the cluster down.
C. Make changes to the service groups before changing cluster attributes.
D. Change resources, then service groups, then cluster attributes.
The correct answer is A. As a best practice you should always copy the configuration files to a staging directory
where you will apply changes.
31
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
If you inadvertently start VCS with an old configuration file, you can:
32
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
If you inadvertently start VCS with an old configuration file, you can:
The correct answer is B. You need to stop HAD on all the nodes, copy the main.cf.previous file to
main.cf, and restart VCS, starting on that node.
33
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. To ensure that all systems build from their local main.cf files
B. To ensure that the other systems wait and perform a remote build
C. To ensure all other systems remain in ADMIN_WAIT until you force a local build
D. To force VCS to perform a local build on each system
34
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. To ensure that all systems build from their local main.cf files
B. To ensure that the other systems wait and perform a remote build
C. To ensure all other systems remain in ADMIN_WAIT until you force a local build
D. To force VCS to perform a local build on each system
The correct answer is B. You wait for the local build to complete on the first node of the cluster,
and once the modified configuration is in memory, you start VCS on the rest of the nodes so they
perform a remote build.
35
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
36
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is A. This ensures that the latest in-memory configuration is saved to the main.cf file.
37
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
End of presentation
07-38
Not for Distribution.
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC
or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This is the Configuring Notification lesson in the Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Topic Objective
The table on this slide lists the topics and objectives for this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
(Diagram: had on each node sends event messages to the notifier, which delivers them to SMTP and SNMP recipients.)
When VCS detects certain events, you can configure the notifier to:
• Generate an SNMP (V2) trap to specified SNMP consoles.
• Send an e-mail message to designated recipients.
VCS ensures that no event messages are lost while the VCS engine is running, even if the
notifier daemon stops or is not started. The HAD daemons throughout the cluster
communicate to maintain a replicated message queue. If the service group with the notifier
configured as a resource fails on one of the nodes, the notifier fails over to another node in
the cluster. Since the message queue is guaranteed to be consistent and replicated across
nodes, the notifier can resume message delivery from where it left off after it fails over to the
new node.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Messages are stored in the queue until one of these conditions is met:
• The notifier daemon sends an acknowledgement to HAD that at least one recipient has
received the message.
Or
• The queue is full. The queue is circular—the last (oldest) message is deleted in order to
write the current (newest) message.
(Diagram: the notifier runs in the ClusterService group along with the csgnic resource; had on each node forwards events to the notifier, which delivers them to SMTP and SNMP recipients. Severity levels shown: Information, Warning, Error, and SevereError, for example a concurrency violation. See the VCS Administrator's Guide for a complete list of events.)
Event messages are assigned one of four severity levels by the notifier:
• Information: Normal cluster activity is occurring, such as resources being brought online.
• Warning: Cluster or resource states are changing unexpectedly, such as a resource in an
unknown state.
• Error: Services are interrupted, such as a service group faulting that cannot be failed over.
• SevereError: Potential data corruption is occurring, such as a concurrency violation.
The administrator can configure the notifier to specify which recipients are sent messages
based on the severity level.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A complete list of events and corresponding severity levels is provided in the Veritas Cluster
Server Administrator’s Guide.
The table on this slide displays how the notifier levels shown in e-mail messages compare to
the log file codes for corresponding events.
This example shows a log entry for a VCS ERROR and the corresponding e-mail message.
Notice that the notifier SevereError events correlate with CRITICAL entries in the engine log.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
In addition to the primary VCS engine log file, VCS logs information for had, hashadow, and all
agent programs.
Starting with the 4.0 version of VCS, messages in VCS logs have a unique message identifier.
Each entry includes a text code indicating the severity, from CRITICAL entries to INFO entries
with the status information.
• CRITICAL entries indicate problems requiring immediate attention—contact Support
immediately.
• ERROR entries indicate exceptions that need to be investigated.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to configure notification using
the NotifierMngr resource.
11
12
Although you can start and stop the notifier daemon manually outside of VCS, you should
make the notifier component highly available by placing the daemon under VCS control.
You can configure VCS to manage the notifier manually using the command-line interface or
the Veritas Operations Manager, or set up notifications during initial cluster configuration.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
13
14
The notifier daemon runs on only one system in the cluster, where it processes messages
from the local had daemon. If the notifier daemon fails on that system, the NotifierMngr
agent detects the failure and migrates the service group containing the NotifierMngr resource
to another system.
Because the message queue is replicated throughout the cluster, any system that is a target
for the service group has an identical queue. When the NotifierMngr resource is brought
online, had sends the queued messages to the notifier daemon.
The example in the slide shows the configuration of a notifier resource for e-mail notification.
See the Veritas Cluster Server Bundled Agents Reference Guide for detailed information about
the NotifierMngr agent.
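As a hedged sketch, a NotifierMngr resource configured for e-mail notification might look like the following in main.cf (the server and recipient values are placeholders):
NotifierMngr notifier (
  SmtpServer = "smtp.example.com"
  SmtpRecipients = { "admin@example.com" = Warning }
)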
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Note: Before modifying resource attributes, ensure that you take the resource offline and
disable it. The notifier daemon must be stopped and restarted with new parameters in order
for changes to take effect.
Notification events: ResourceStateUnknown, ResourceMonitorTimeout, ResourceNotGoingOffline,
ResourceRestartingByAgent, ResourceWentOnlineByItself, ResourceFaulted
15
You can set the ResourceOwner attribute to define an owner for a resource. After the
attribute is set to a valid e-mail address and notification is configured, an e-mail message is
sent to the defined recipient when one of the resource-related events occurs, shown in the
table in the slide. VCS also creates an entry in the log file in addition to sending an e-mail
message.
ResourceOwner can be specified as an e-mail ID ([email protected]) or a user account
(gene). If a user account is specified, the e-mail address is constructed as login@smtp_system,
where smtp_system is the system that was specified in the SmtpServer attribute of the
NotifierMngr resource.
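For example (the resource name is assumed from earlier in this course; the user account is taken from the slide text):
hares -modify appmnt ResourceOwner gene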
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
E-mail message
From: Notifier
Subject: VCS Information, Service group is online
Event Time: Wed Feb 23 18:23:09 2015
. . .
Entities Owner: [email protected]
16
You can set the GroupOwner attribute to define an owner for a service group. After the
attribute is set to a valid e-mail address and notification is configured, an e-mail message is
sent to the defined recipient when one of the group-related events occurs, as shown in the
table in the slide.
GroupOwner can be specified as an e-mail ID ([email protected]) or a user account (gene).
If a user account is specified, the e-mail address is constructed as login@smtp_system, where
smtp_system is the system that was specified in the SmtpServer attribute of the NotifierMngr
resource.
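Similarly, for a service group (the group name is assumed from earlier lessons):
hagrp -modify appsg GroupOwner gene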
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
main.cf snippet
cluster vcs_cluster (
  UserNames = { admin = ************ }
  Administrators = { admin }
  ClusterRecipients = { "[email protected]" = Error }
)
17
Configuration and required authentication information:
• No authentication and no privacy (noAuthNoPriv): Username
• Authentication and no privacy (authNoPriv): Username, authentication protocol, authentication password
• Authentication and privacy (authPriv): Username, authentication protocol, authentication password, privacy protocol, privacy password
18
Using the Management Server console in VIOM, you can configure SNMPv3 settings.
Note: To perform this task, your user group must be assigned the Admin role on the
Management Server perspective.
Prerequisites for configuring SNMPv3 settings are as follows:
• Install Python 3.5.
• Install PySNMP.
To install PySNMP off-line, the following packages must be downloaded and installed for
PySNMP to be operational:
PyASN1 - https://pypi.python.org/pypi/pyasn1
PySNMP - https://pypi.python.org/pypi/pysnmp/
PyCryptodomex - https://pypi.python.org/pypi/pycryptodomex/
19
6. For encryption, check the Privacy checkbox and perform the following steps:
a) Select the Privacy Protocol from the drop-down.
b) Enter the Privacy Password.
7. Enter the path for the python executable.
8. Click Save Settings.
9. To delete the settings, click Delete Settings.
(Diagram: a client sends a login request to VIOM; a certificate is requested from an external CA.)
20
21
The root certificate is a public key certificate that identifies a root certificate authority (CA). There can be one or
more intermediate certificates. Server, intermediate, and root certificates are in PEM format.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
22
23
The screenshot on the slide displays the commands for creating a certificate signing request for the CA.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
24
The commands to create client CSR and certificates are displayed on the slide.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
25
26
The table on the slide displays a list of operations that can be performed using the Veritas InfoScale
Operations Manager Web services APIs for version 7.4.2.
APIs were introduced in the InfoScale 6.x versions, and additional APIs were introduced in later
versions. Users can add these APIs to their scripts and perform the operations without
logging in to the VIOM GUI. During maintenance activities, the APIs are useful when the user
is handling hundreds of clusters.
• VIOM APIs are accessible using the HTTPS protocol or a standard HTTPS client such as cURL.
• REST APIs are used to query the VIOM server to discover and manage data, such as setting
objects and listing their properties. APIs are also used to manage user-defined attributes and to
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
perform some operations on objects like service group freeze and service group unfreeze
and so on.
Using the Veritas InfoScale Operations Manager Web services API, you can also perform the
following operations:
• Start and stop Virtual Business Services. This operation can be performed in
the Server and Availability perspective.
• Start and stop VVR replication.
• Run a recovery plan.
• Provision storage using a storage template. To perform this operation, you need to install
the Storage Provisioning and Enclosure Migration Add-on version 6.1.
27
The table on the slide displays a list of operations that can be performed using the Veritas InfoScale
Operations Manager Web services APIs for version 7.4.2.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
28
The table on the slide displays a list of operations that can be performed using the Veritas InfoScale
Operations Manager Web services APIs for version 7.4.2.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
29
30
VCS provides an additional method for notifying users of important events. When VCS detects
certain events, you can configure a trigger to notify an administrator or perform other actions.
You can use event triggers in place of, or in conjunction with, notification.
Triggers are executable programs, batch files, shell or Perl scripts associated with the
predefined event types supported by VCS that are shown in the slide.
Triggers are configured by specifying one or more keys in the TriggersEnabled attribute. Some
keys are specific to service groups or resources.
The RESSTATECHANGE, RESRESTART, and RESFAULT keys apply to both resources and service
groups. When one of these keys is specified in TriggersEnabled at the service group level, the
trigger applies to each resource in the service group.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
more /opt/VRTSvcs/bin/sample_triggers/resfault
. . .
# Usage:
# resfault <system> <resource> <oldstate>
#
# <system>: is the name of the system where resource faulted.
# <resource>: is the name of the resource that faulted.
# <oldstate>: is the previous state of the resource that
# faulted.
#
# Possible values for oldstate are ONLINE and OFFLINE.
. . .
31
/opt/VRTSvcs/bin/triggers/websg/nofailover
main.cf
group websg (
  SystemList = { s1 = 0, s2 = 1 }
  AutoStartList = { s1, s2 }
  TriggersEnabled = { NOFAILOVER }
  TriggerPath = "bin/triggers/websg"
)
32
The example portion of the main.cf file shows the NOFAILOVER trigger enabled for websg
(globally), and the trigger path customized to map to
/opt/VRTSvcs/bin/triggers/websg.
33
This slide displays the basic procedure for creating a trigger using a sample script provided
with VCS.
In this case, the resfault script is copied from the sample_triggers directory and then modified
to use the Linux /bin/mail program to send e-mail to the modified recipients list.
The only changes required to make use of the sample resfault trigger in this example are the
following two lines:
@recipients=("student\@mgt.example.com");
. . .
"/bin/mail -s resfault $recipient < $msgfile";
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After a trigger is modified, you must ensure the file is executable by root, and then copy the
script or program to each system in the cluster that can run the trigger. Finally, modify the
TriggersEnabled attribute to specify the key for each system that can run the trigger.
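A minimal sketch of these steps, using the sample trigger paths from the slides (the remote host name is an assumption):
cp /opt/VRTSvcs/bin/sample_triggers/resfault /opt/VRTSvcs/bin/triggers/resfault
vi /opt/VRTSvcs/bin/triggers/resfault       # edit the recipients and mail command
chmod 755 /opt/VRTSvcs/bin/triggers/resfault
scp /opt/VRTSvcs/bin/triggers/resfault s2:/opt/VRTSvcs/bin/triggers/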
ls /opt/VRTSvcs/bin/triggers/websg/preonline/
T01backup
T02setenv
T03online
34
VCS supports the use of multiple scripts for a single trigger. This enables you to break the logic
of a trigger into components rather than having all trigger logic in one monolithic script.
The number contained in the file name determines the order in which the scripts are run,
similar to legacy UNIX startup and kill scripts in rcN.d directories. To use multiple files for a
single trigger, you must specify a custom path using the TriggerPath attribute.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• Key points
– In this lesson, you learned about VCS notification service and the functions of ClusterService group.
– In addition, you learned to configure notifications using the NotifierMngr resource.
– Finally, you learned to configure triggers to customize the notification facilities to meet your specific
requirements.
• Reference materials
– Veritas Cluster Server Bundled Agents Reference Guide
– Veritas Cluster Server Administrator’s Guide
– https://sort.veritas.com
35
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
36
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
37
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. hasnmp
B. hanotify
C. notifier
D. notifiermngr
38
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. hasnmp
B. hanotify
C. notifier
D. notifiermngr
The correct answer is C. The notifier process configures how messages are received from VCS and how they are
delivered to SNMP consoles and SMTP servers.
39
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. GroupOwner
B. ResourceOwner
C. GroupAdmin
D. GroupOperator
40
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. GroupOwner
B. ResourceOwner
C. GroupAdmin
D. GroupOperator
The correct answer is A. When the GroupOwner attribute is set for a specific service group, all service group
events are notified through e-mail to the specified user.
41
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
42
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
43
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. admin
B. operator
C. sysadmin
D. operator, admin, and sysadmin
NotifierMngr notifier (
  SmtpServer = "smtp.acme.com"
  SmtpRecipients = { "[email protected]" = Information,
                     "[email protected]" = Warning,
                     "[email protected]" = Error }
  PathName = "/opt/VRTSvcs/bin/notifier"
)
44
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. admin
B. operator
C. sysadmin
D. operator, admin, and sysadmin
NotifierMngr notifier (
  SmtpServer = "smtp.acme.com"
  SmtpRecipients = { "[email protected]" = Information,
                     "[email protected]" = Warning,
                     "[email protected]" = Error }
  PathName = "/opt/VRTSvcs/bin/notifier"
)
The correct answer is B. Since the notification type is Information, it will be received by the operator as per the
configuration.
45
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
46
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is A. The trigger is executed when a resource faults in a service group that has the
TriggersEnabled attribute set to RESFAULT.
47
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. True
B. False
48
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. True
B. False
49
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
08-50
Not for Distribution.
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC
or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This is the Handling Resource Faults lesson in the Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Topics and objectives:
• VCS response to resource faults: Explain how VCS responds to resource faults.
• Controlling fault behavior: Control fault behavior using resource type attributes.
• Fault notification and event handling: Configure fault notification and triggers.
The table on this slide lists the topics and objectives for this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to explain how VCS responds to
resource faults.
• A service group must have at least one critical resource to enable automatic failover.
• Other attributes modify this behavior, as described throughout this lesson.
Critical resources define the basis for failover decisions made by the VCS. When the monitor
entry point for a resource returns with an unexpected offline status, the action taken by the
VCS engine depends on whether the resource is critical.
By default, if a critical resource in a failover service group faults or is taken offline as a result
of another resource fault, VCS determines that the service group is faulted. VCS then fails the
service group over to another cluster system, as defined by a set of service group attributes.
The default failover behavior for a service group can be modified using one or more optional
service group attributes. Failover determination and behavior are described throughout this
lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
VCS responds in a specific and predictable manner to faults. When VCS detects a resource
failure, it performs the following actions:
1. Instructs the agent to execute the clean entry point for the failed resource to ensure that
the resource is completely offline. The resource transitions to a FAULTED state.
2. Takes all resources in the path of the fault offline starting from the faulted resource up to
the top of the dependency tree.
3. If an online critical resource is part of the path that was faulted or taken offline, faults the
service group and takes the group offline to prepare for failover. If no online critical
resources are affected, no more action occurs.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
4. Attempts to start the service group on another system in the SystemList attribute
according to the FailOverPolicy defined for that service group and the relationships
between multiple service groups. Note: The state of the group on the new system prior to
failover must be offline (not faulted).
5. If no other systems are available, the service group remains offline. VCS also executes
certain triggers and carries out notification while it performs each task in response to
resource faults. The role of notification and event triggers in resource faults is explained in
detail later in this lesson.
(Flowchart: when ManageFaults is set to ALL, VCS executes the clean entry point and faults the resource and the service group.)
Several service group attributes can be used to change the default behavior of VCS while
responding to resource faults.
ManageFaults
The ManageFaults attribute can be used to prevent VCS from taking any automatic actions
whenever a resource failure is detected. Essentially, ManageFaults determines whether VCS or
an administrator handles faults for a service group.
If ManageFaults is set to the default value of ALL, VCS manages faults by executing the clean
entry point for that resource to ensure that the resource is completely offline, as shown
previously.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
If this attribute is set to NONE, VCS places the resource in an ADMIN_WAIT state and waits for
administrative intervention. This is often used for service groups that manage database
instances. You may need to leave the database in its FAULTED state in order to perform
problem analysis and recovery operations.
Note: This attribute is set at the service group level. This means that any resource fault within
that service group requires administrative intervention if the ManageFaults attribute for the
service group is set to NONE.
(Flowchart: if no online critical resource is in the path of the fault, the group is kept partially online. Otherwise, the entire group is taken offline. If AutoFailOver is 1, a failover target is chosen from SystemList based on FailOverPolicy; if a target is available, the group is brought online elsewhere, otherwise it is kept offline. If AutoFailOver is 0, the group is kept offline.)
This attribute determines whether automatic failover takes place when a resource or system
faults. The default value of 1 indicates that the service group should be failed over to other
available systems if at all possible. However, if the attribute is set to 0, no automatic failover is
attempted for the service group, and the service group is left in an OFFLINE | FAULTED state.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to determine the duration of
service group failover with traditional and IMF monitoring.
10
• Service group failover time is the sum of the duration of each failover task.
• You can affect failover time behavior by setting resource type attributes.
(Diagram: with traditional monitoring, the sum of the individual failover tasks equals the failover duration.)
11
When a resource failure occurs, application services may be disrupted until either the
resource is restarted on the same system or the application services migrate to another
system in the cluster. The time required to address the failure is a combination of the time
required to:
• Detect the failure
When traditional monitoring is configured, a resource failure is only detected when the
monitor entry point of that resource returns an offline status unexpectedly. The resource
type attributes used to tune the frequency of monitoring a resource are MonitorInterval
(default of 60 seconds) and OfflineMonitorInterval (default of 300 seconds).
• Fault the resource
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
OfflineMonitorInterval
• Defines frequency of offline monitoring
• Is set to default value of 300 seconds for most resource types
• Can be reduced for testing (for example, to 60 seconds)
! If you change a resource type attribute, you affect all resources of that type.
13
You can change some resource type attributes to facilitate failover testing. For example, you
can change the monitor interval to see the results of faults more quickly. You can also adjust
these attributes to affect how quickly an application fails over when a fault occurs.
MonitorInterval
This is the duration (in seconds) between two consecutive monitor calls for an online or
transitioning resource. The default is 60 seconds for most resource types.
OfflineMonitorInterval
This is the duration (in seconds) between two consecutive monitor calls for an offline
resource. If set to 0, offline resources are not monitored. The default is 300 seconds for most
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
resource types.
Refer to the Veritas Cluster Server Bundled Agents Reference Guide for the applicable monitor
interval defaults for specific resource types.
Timeout interval values define the maximum time within which the entry points must finish or be terminated.
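For example, to detect Mount resource faults more quickly during failover testing, you could
reduce these intervals at the resource type level; the values are illustrative, and the change
affects all resources of that type:
hatype -modify Mount MonitorInterval 30
hatype -modify Mount OfflineMonitorInterval 60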
14
After completing this topic, you will be able to control fault behavior using
resource type attributes.
15
ConfInterval
• Determines the amount of time that must elapse before restart and tolerance counters are
reset to zero
• Is set to 600 by default
ToleranceLimit
• Enables the monitor entry point to return OFFLINE several times
before the resource is declared FAULTED
• Is set to 0 by default
16
Although the failover capability of VCS helps to minimize the disruption of application services
when resources fail, the process of migrating a service to another system can be time-
consuming. In some cases, you may want to attempt to restart a resource on the same system
before failing it over to another system.
Whether a resource can be restarted depends on the application service:
• The resource must be successfully cleared (taken offline) after failure.
• The resource must not be a child resource with dependent parent resources that must be
restarted.
If you have determined that a resource can be restarted without impacting the integrity of the
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
application, you can potentially avoid service group failover by configuring the RestartLimit,
ConfInterval, and ToleranceLimit resource type attributes.
For example, you can set the ToleranceLimit to a value greater than 0 to allow the monitor
entry point to run several times before a resource is determined to be faulted. This is useful
when the system is very busy and a service, such as a database, is slow to respond.
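As a sketch, the following commands configure Process resources to tolerate one unexpected
offline monitor result and to attempt one restart before faulting, with the counters reset after
180 seconds; the values shown are examples only:
hatype -modify Process ToleranceLimit 1
hatype -modify Process RestartLimit 1
hatype -modify Process ConfInterval 180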
(Slide timeline: successive monitor cycles return On, On, On, Off, On, Off; the restart counter
increments from 0 to 1 after the first offline result, the resource is restarted, and the second
offline result within ConfInterval faults the resource.)
17
This example illustrates how the RestartLimit and ConfInterval attributes can be configured for
modifying the behavior of VCS when a resource is faulted. Setting RestartLimit = 1 and
ConfInterval = 180 has this effect when a resource faults:
1. The resource stops after running for 10 minutes.
2. The next monitor returns offline.
3. The ConfInterval counter is set to 0.
4. The agent checks the value of RestartLimit.
5. The resource is restarted because RestartLimit is set to 1, which allows one restart within
the ConfInterval period.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
types.cf
type NIC (
. . .
static int ToleranceLimit = 2
static str ArgList[] = { Device, …
. . .
)
18
You can modify the resource type attributes to affect how an agent monitors all resources of a
given type. For example, agents usually check their online resources every 60 seconds. You
can modify that period so that the resource type is checked more often. This is good for either
testing situations or time-critical resources.
You can also change the period so that the resource type is checked less often. This reduces
the load on VCS overall, as well as on the individual systems, but increases the time it takes to
detect resource failures.
For example, to change the ToleranceLimit attribute for all NIC resources so that the agent
ignores occasional network problems, type: hatype -modify NIC ToleranceLimit 2
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
hares
main.cf
NIC webnic(
Device = eth0
. . .
MonitorInterval=10
. . .
)
19
Resource type attributes apply to all resources of that type. You can override a resource type
attribute to change its value for a specific resource. Use the options to hares shown on the
slide to override resource type attributes.
Note: The configuration must be in read-write mode in order to modify and override resource
type attributes. The changes are reflected in the main.cf file only after you save the
configuration using the haconf -dump command.
Some predefined static resource type attributes (those resource type attributes that do not
appear in types.cf unless their value is changed, such as MonitorInterval) and all static
attributes that are not predefined (static attributes that are defined in the type definition file)
can be overridden. For a detailed list of predefined static attributes that can be overridden,
refer to the Cluster Server Administrator's Guide.
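For example, to override MonitorInterval for only the webnic resource shown on the slide, you
could run the following with the configuration in read-write mode:
hares -override webnic MonitorInterval
hares -modify webnic MonitorInterval 10
haconf -dump -makero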
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to recover from resource faults.
20
21
When a resource failure is detected, the resource is put into a FAULTED or an ADMIN_WAIT
state depending on the cluster configuration. In either case, administrative intervention is
required to bring the resource status back to normal.
Recovering a resource from a faulted state
A critical resource in FAULTED state cannot be brought online on a system. When a critical
resource is FAULTED on a system, the service group status also changes to FAULTED on that
system, and that system can no longer be considered as an available target during a service
group failover.
You have to clear the FAULTED status of a nonpersistent resource manually. Before clearing
the FAULTED status, ensure that the resource is completely offline and that the fault is fixed
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
outside of VCS.
Note: You can also run hagrp -clear group [-sys system] to clear all FAULTED
resources in a service group. However, you have to ensure that all of the FAULTED resources
are completely offline and the faults are fixed on all the corresponding systems before running
this command.
The FAULTED status of a persistent resource is cleared when the monitor returns an online status for
that resource. Note that offline resources are monitored according to the value of
OfflineMonitorInterval, which is 300 seconds (five minutes) by default. To avoid waiting for
the periodic monitoring, you can initiate the monitoring of the resource manually by probing
the resource.
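For example, after fixing the underlying problem for a faulted resource named webres on
system s1 (the names are illustrative), you could clear the fault, probe the resource, and verify
its state:
hares -clear webres -sys s1
hares -probe webres -sys s1
hares -state webres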
After completing this topic, you will be able to configure fault notification
and triggers.
22
23
As a response to a resource fault, VCS carries out tasks to take resources or service groups
offline and to bring them back online elsewhere in the cluster. While carrying out these tasks,
VCS generates certain messages with a variety of severity levels and the VCS engine passes
these messages to the notifier daemon. Whether these messages are used for SNMP
traps or SMTP notification depends on how the notification component of VCS is configured.
The following events are examples that result in a notification message being generated:
• A resource becomes offline unexpectedly; that is, a resource is faulted.
• VCS cannot determine the state of a resource.
• A failover service group is online on more than one system.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Resource cannot be taken offline: call resnotoff
Resource is placed in ADMIN_WAIT: call resadminwait
Successful resource online/offline: call resstatechange
No failover target exists: call nofailover
24
You can use triggers to customize how VCS responds to events that occur in the cluster. For
example, you could use the ResAdminWait trigger to automate the task of taking diagnostics
of the application as part of the failover and recovery process. If you set ManageFaults to
NONE for a service group, VCS places faulted resources into the ADMIN_WAIT state. If the
resadminwait trigger is configured, VCS runs the script when a resource enters ADMIN_WAIT.
Within the trigger script, you can run a diagnostic tool and log information about the
resource, and then take a desired action, such as clearing the state and faulting the resource:
hagrp -clearadminwait -fault group -sys system
Let's look at the role of triggers in resource faults. As a response to a resource fault, VCS
carries out tasks to take resources or service groups offline and to bring them back online
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
elsewhere in the cluster. While these tasks are being carried out, certain events take place. If
corresponding event triggers are configured, VCS executes the triggers, as shown in the slide.
Triggers are placed in the /opt/VRTSvcs/bin/triggers directory by default. Sample
trigger scripts are provided in /opt/VRTSvcs/bin/sample_triggers. Trigger
configuration is described in the “Configuring Notification” lesson and the VERITAS Cluster
Server Administrator’s Guide.
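As a sketch, enabling the resadminwait trigger typically involves copying the sample script into
the triggers directory and making it executable; the sample file name may vary by release:
cp /opt/VRTSvcs/bin/sample_triggers/resadminwait /opt/VRTSvcs/bin/triggers/resadminwait
chmod 755 /opt/VRTSvcs/bin/triggers/resadminwait
You can then customize the copied script to run diagnostics before clearing or faulting the resource.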
25
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
26
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
27
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
If a critical resource in a service group faults, the default behavior of VCS is:
A. Take offline only the resources in the service group that are dependent on the
faulted resource.
B. Take offline all resources in the service group on the current node and attempt to fail
the service group over to another node.
C. Report the faulted resource, but continue running the service group on the current
node.
D. Attempt to fail the faulted resource over to another node.
28
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
If a critical resource in a service group faults, the default behavior of VCS is:
A. Take offline only the resources in the service group that are dependent on the
faulted resource.
B. Take offline all resources in the service group on the current node and attempt to fail
the service group over to another node.
C. Report the faulted resource, but continue running the service group on the current
node.
D. Attempt to fail the faulted resource over to another node.
The correct answer is B. Failure of a critical resource takes offline all resources that belong to the
corresponding service group, and VCS attempts failover of the service group to another node.
29
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
30
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is C. Only if the parent resource is a critical resource does the service group fail over
to another node.
31
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
32
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is B. A faulted resource can be brought online again on the node where it failed, but
you need to clear the fault first.
33
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. It forces the agent to run the clean entry point to clear internal states.
B. It forces the agent to monitor the resource and report the resource status.
C. It checks the configuration of the resource for syntax errors.
D. It clears transitional states of the resource.
34
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. It forces the agent to run the clean entry point to clear internal states.
B. It forces the agent to monitor the resource and report the resource status.
C. It checks the configuration of the resource for syntax errors.
D. It clears transitional states of the resource.
The correct answer is B. Probing a resource forces the corresponding agent to run a monitor
cycle on that resource and report its status.
35
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. ResFaults
B. AdminWait
C. ResWait
D. ManageFaults
36
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. ResFaults
B. AdminWait
C. ResWait
D. ManageFaults
The correct answer is D. The ManageFaults attribute controls the fault-handling behavior of a service
group. If it is set to NONE, the faulted resource is placed in the ADMIN_WAIT state and no automatic
failover of the service group occurs.
37
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
End of presentation
09-38
Not for Distribution.
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC
or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This is the Intelligent Monitoring Framework lesson in the Veritas InfoScale 7.4.2
Fundamentals for UNIX/Linux: Administration course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
PART 2: Veritas InfoScale Availability 7.4.2 for UNIX/Linux: InfoScale Availability Additions
Administration
• Lesson 09: Handling Resource Faults
InfoScale Availability Basics • Lesson 10: Intelligent Monitoring Framework
• Lesson 01: High Availability Concepts
• Lesson 11: Cluster Communications
Topic Objective
The table on this slide lists the topics and objectives for this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to describe how the Intelligent
Monitoring Framework (IMF) improves fault detection.
The Intelligent Monitoring Framework was created to meet customer demands for supporting
increasing numbers of highly available services. Some environments are supporting large
numbers of resources (hundreds of mount points, for example) running on already loaded
systems.
With traditional monitoring, VCS agents poll each resource every 60 seconds, by default. This
can add a substantial system load in large-scale environments. The periodic nature of
traditional monitoring, coupled with the requirement to run the monitor process for each
resource, results in the state of the resource being unknown between monitor cycles, and
requires additional system resources.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
IMF was introduced in VCS 5.1 SP1, and with every major release more agents have been added to the
IMF-supported list. The agents listed on the right side of the slide are common to all
supported operating systems and support IMF monitoring.
In addition, you can create custom agents that use IMF monitoring by linking the AMF plug-ins
with the script agent and creating an XML file to enable registration with the AMF module. For
more information about using IMF monitoring for custom agents, see the VCS Agent
Developer’s Guide.
Note: There are other VCS agents available through the agent pack on the SORT Web site
(https://sort.veritas.com/agents).
Some of these application agents, such as WebSphereMQ, also support IMF. Refer to the documentation for each agent to confirm IMF support.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to describe how to configure
IMF.
0 — No intelligent monitoring
1 — Intelligent monitoring for offline resources; poll-based for
online resources
2 — Intelligent monitoring for online resources; poll-based
monitoring for offline resources
3 — Intelligent monitoring for both online and offline resources
The Mode key of the IMF attribute determines whether IMF or traditional monitoring is
configured for a resource. Accepted values are:
0—Does not perform intelligent resource monitoring
1—Performs intelligent resource monitoring for offline resources and performs poll-based
monitoring for online resources
2—Performs intelligent resource monitoring for online resources and performs poll-based
monitoring for offline resources
3—Performs intelligent resource monitoring for both online and for offline resources
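For example, to enable intelligent monitoring of both online and offline Process resources, you
could update the Mode key of the IMF attribute at the resource type level:
haconf -makerw
hatype -modify Process IMF -update Mode 3
haconf -dump -makero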
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
MonitorFreq
• Frequency at which the agent invokes the monitor agent function.
• Default setting is 5.
RegisterRetryLimit
• Number of times the agent must retry registration for a component.
• If unable to register, intelligent monitoring is disabled for that component.
• Default setting is 3.
10
MonitorFreq is a key value that specifies the frequency at which the agent invokes the
monitor agent function. The value of this key is an integer.
After the component is registered for IMF-based monitoring, the agent calls the monitor
agent function as follows:
• For online components: every (MonitorFreq x MonitorInterval) seconds.
• For offline components: every (MonitorFreq x OfflineMonitorInterval) seconds.
For agents that support IMF, the default value is 5. You can set this attribute to a non-zero
value in cases where the agent needs to perform poll-based monitoring in addition to the
intelligent monitoring.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
RegisterRetryLimit is a key value that determines the number of times the agent must retry
registration for a component. If the agent cannot register the component within the limit that
is specified, then intelligent monitoring is disabled until the component state changes or the
value of the Mode key changes. The default value is 3.
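As a sketch, the MonitorFreq and RegisterRetryLimit keys can be updated in the same way as
the Mode key; the values shown are illustrative:
hatype -modify Process IMF -update MonitorFreq 5
hatype -modify Process IMF -update RegisterRetryLimit 3
With MonitorInterval at its default of 60 seconds, MonitorFreq = 5 results in a poll-based
monitor of an online resource every 300 seconds.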
Process sendmail (
Pathname = "/usr/bin/sendmail"
Arguments = "-bd -q30m"
PidFile = "/var/run/sendmail.pid"
. . .
IMF = { Mode = 2, MonitorFreq = 5 }
. . .
)
11
The configuration snippet in the slide shows a Process resource with IMF enabled for
monitoring online resources and traditional poll-based monitoring for offline resources. For
Process resources that are offline, the agent runs the monitor entry point periodically as
specified by the OfflineMonitorInterval attribute.
With MonitorFreq set to 5, the agent runs the monitor entry point periodically, calculated by
multiplying the value of MonitorFreq (5) by MonitorInterval (60 seconds), which results in a
poll-based monitor occurring every five minutes for an online resource. Poll-based monitoring
is performed by checking the process table for the process IDs listed in the PidFile.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• VCS operator permission specific to each application's administrator is assigned to the service group that corresponds to
that application.
• When there are multiple such applications, a way to specify different VCS operator permissions is required for each
application.
• Clone the Application agent (binaries and type definition) and configure its resources in a separate service group for
each application.
• The IMF environment seamlessly identifies each cloned Application agent and makes it IMF-aware.
12
InfoScale lets you clone the Application agent so that you can configure a different service
group for each application. You must then assign the appropriate operator permissions for
each service group for it to function as expected.
The Application agent is IMF-aware and uses asynchronous monitoring framework (AMF)
kernel driver for IMF notification. For more information about IMF and intelligent resource
monitoring, refer to the Cluster Server Administrator's Guide.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
In InfoScale 7.4.1, the Mount agent is IMF-aware and can process notifications for VxFS, EXT4, and XFS.
• Fast failover
• CPU conservation
13
Intelligent Monitoring Framework (IMF) uses AMF kernel driver for immediate notifications,
enabling VCS to detect state changes instantly.
In InfoScale 7.4.1, the Mount agent is IMF-aware and can process notifications for VxFS, EXT4, and
XFS. The slide lists the benefits of this IMF support. For the requirements and
supportability, refer to the Cluster Server Administrator's Guide.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to determine how VCS behaves
when IMF monitoring detects faults.
14
IMF monitoring
+ Detect the resource failure:
• AMF driver notifies agent of state change
• Agent runs monitor entry point to determine state
+ Fault the resource.
+ Take the entire service group offline.
+ Select a failover target.
+ Bring the service group online on another system in the cluster.
= Failover duration
15
Failover duration for service groups when an IMF-monitored resource faults is determined in a similar
fashion to the process described in the “Handling Resource Faults” lesson.
The key difference is the time required to detect a resource fault. Depending on when a
resource faults in the traditional poll-based model, the detection of the fault can take up to 60
seconds. For IMF-monitored resources, a fault is detected and the agent probes the resource
immediately to determine the resource state.
When a process dies or hangs, the operating system generates an alert. The agent is
registered to receive such alerts from the operating system, through the AMF kernel module.
The agent then probes the resource to determine the state and notifies HAD if the resource
is faulted. HAD can then take action within seconds of a resource fault, rather than minutes.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
16
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
17
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
18
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
19
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is B. When the IMF Mode key is set to 2, intelligent monitoring is performed for online
resources and poll-based monitoring for offline resources.
20
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Every minute
B. Never
C. Every four minutes
D. Every two minutes
21
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. Every minute
B. Never
C. Every four minutes
D. Every two minutes
The correct answer is C. The poll-based monitor runs after 4 monitor cycles (MonitorFreq = 4) of 60 seconds each (MonitorInterval = 60), that is, every 4 minutes.
22
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
23
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
24
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. haimfconfig -enable
hatype -modify Process IMF -update Mode 3
B. hatype -modify Process IMF -update Mode 3
C. hatype -modify Process IMF -update Mode 1
D. haimfconfig -agent -Process
hatype -modify Process IMF -update Mode 3
25
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. haimfconfig -enable
hatype -modify Process IMF -update Mode 3
B. hatype -modify Process IMF -update Mode 3
C. hatype -modify Process IMF -update Mode 1
D. haimfconfig -agent -Process
hatype -modify Process IMF -update Mode 3
The correct answer is B. IMF mode 3 enables intelligent monitoring for both online and offline resources.
26
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
27
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The correct answer is D. When the offline event is reported, the monitor entry point is initiated.
28
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
End of presentation
10-29
Not for Distribution.
Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration
© 2020 Veritas Technologies LLC. All rights reserved. Veritas and the Veritas Logo are trademarks or registered trademarks of Veritas Technologies LLC
or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.
This is the Cluster Communications lesson in the Veritas InfoScale 7.4.2 Fundamentals for
UNIX/Linux: Administration course.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
PART 2: Veritas InfoScale Availability 7.4.2 for UNIX/Linux: InfoScale Availability Additions
Administration
• Lesson 09: Handling Resource Faults
InfoScale Availability Basics • Lesson 10: Intelligent Monitoring Framework
• Lesson 01: High Availability Concepts
• Lesson 11: Cluster Communications
Topic Objective
VCS communications review: Describe how components communicate in a VCS environment.
Cluster interconnect configuration: Describe the files that specify the cluster interconnect configuration.
Cluster startup: Describe how systems join the cluster membership and services come online.
System and cluster interconnect failure: Describe how VCS responds to common failures.
Changing the interconnect configuration: Change the cluster interconnect configuration.
The table on this slide lists the topics and objectives for this lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to describe how components
communicate in a VCS environment.
VCS maintains the cluster state by tracking the status of all resources and service groups in the
cluster. The state is communicated between HAD processes on each cluster system by way of
the atomic broadcast capability of Group Membership Services/Atomic Broadcast (GAB). HAD
is a replicated state machine, which uses the GAB atomic broadcast mechanism to ensure that
all systems within the cluster are immediately notified of changes in resource status, cluster
membership, and configuration.
Atomic means that all systems receive updates, or all systems are rolled back to the previous
state, much like a database atomic commit. If a failure occurs while transmitting status
changes, GAB’s atomicity ensures that, upon recovery, all systems have the same information
regarding the status of any monitored resource in the cluster.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Up to eight links per node.
High-priority links:
• Heartbeats every half-second.
• Cluster status information carried over links.
• Usually configured for dedicated cluster network links.
Low-priority links:
• Heartbeats every second.
• No cluster status sent.
• Automatically promoted to high priority if there are no high-priority links functioning.
• Can be configured on public network interfaces.
LLT can be configured to designate links as high-priority or low-priority links. High-priority links
are used for cluster communications (GAB) as well as heartbeats. Low-priority links carry only
heartbeats unless there is a failure of all configured high-priority links. At this time, LLT
switches cluster communications to the first available low-priority link. Traffic reverts to high-
priority links as soon as they are available.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
# gabconfig -a
GAB Port Memberships
===============================================
Port a gen f7c001 membership 01 ; ;12
Port b gen f7c004 membership 01 ; ;12
Port h gen f7c002 membership 01 ; ;12
To display the cluster membership status, type gabconfig on each system. For example:
gabconfig -a
The first example in the slide shows:
• Port a, GAB membership, has four nodes: 0, 1, 21, and 22
• Port b, fencing membership, has four nodes: 0, 1, 21, and 22
• Port h, VCS membership, has four nodes: 0, 1, 21, and 22
Note: The port a, port b, and port h generation numbers change each time the membership
changes.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
The gabconfig output uses a positional notation to indicate which systems are members of
the cluster. Only the last digit of the node number is displayed relative to semicolons that
indicate the 10s digit. The second example shows gabconfig output for a cluster with 22
nodes.
After completing this topic, you will be able to describe the files that specify
the cluster interconnect configuration.
/etc/llttab
set-cluster 10
set-node s1
link nxge0 /dev/nxge:0 - ether - -
link nxge4 /dev/nxge:4 - ether - -
10
The VCS installation utility sets up all cluster interconnect configuration files and starts LLT and
GAB. You may never need to modify communication configuration files. Understanding how
these files work together to define the cluster communication mechanism helps you to
understand VCS behavior.
LLT configuration files
The LLT configuration files are located in the /etc directory.
The llttab file
The llttab file is the primary LLT configuration file and is used to:
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
/etc/llthosts
0 s1
1 s2
The system (node) name does not need to be the UNIX host name found using the hostname command.
11
The llthosts file associates a system name with a VCS cluster node ID number. This file
must be present in the /etc directory on every system in the cluster. It must contain a line
with the unique name and node ID for each system in the cluster. The format is:
node_number name
The critical requirements for llthosts entries are:
• Node numbers must be unique. If duplicate node IDs are detected on the Ethernet LLT
cluster interconnect, LLT in VCS 4.0 is stopped on the joining node. In VCS versions before
4.0, the joining node panics.
• The system name must match the name in llttab if a name is configured for the set-
node directive (rather than a number).
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
/etc/llttab
set-node s1
set-cluster 10
link qfe0 /dev/qfe:0 - ether - -
link qfe4 /dev/qfe:4 - ether - -
(Valid cluster IDs: 0 to 64k-1)
/etc/llthosts
0 s1
1 s2
(Valid node IDs: 0 to 63)
12
A unique number must be assigned to each system in a cluster using the set-node
directive. Each system in the cluster must have a unique llttab file, which has a unique
value for set-node, which can be one of the following:
• An integer in the range of 0 through 63 (64 systems per cluster maximum)
• A system name matching an entry in /etc/llthosts
The set-cluster directive
LLT uses the set-cluster directive to assign a unique number to each cluster. A cluster ID
is set during installation and can be validated as a unique ID among all clusters sharing a
network for the cluster interconnect.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
Note: You can use the same cluster interconnect network infrastructure for multiple clusters.
The llttab file must specify the appropriate cluster ID to ensure that there are no
conflicting node IDs.
If you bypass the installer mechanisms for ensuring the cluster ID is unique and LLT detects
multiple systems with the same node ID and cluster ID on a private network, the LLT interface
is disabled on the node that is starting up. This prevents a possible split-brain condition,
where a service group might be brought online on the two systems with the same node ID.
/etc/VRTSvcs/conf/sysname
s1
/etc/llttab (can be identical on every node)
set-node /etc/VRTSvcs/conf/sysname
set-cluster 10
link dev1 /dev/dev:1 - ether - -
link dev2 /dev/dev:2 - ether - -
13
The sysname file is an optional LLT configuration file that is configured automatically during
VCS installation. This file is used to store the short-form of the system (node) name. The
purpose of the sysname file is to enable specification of a VCS node name other than the
UNIX host name. This may be desirable, for example, when the UNIX host names are long and
you want VCS to use shorter names.
Note: If the sysname file contains a different name from the llttab/llthosts/main.cf files,
this “phantom” system is added to the cluster upon cluster startup.
The sysname file can be specified for the set-node directive in the llttab file. In this
case, the llttab file can be identical on every node, which may simplify reconfiguring the
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
cluster interconnect in some situations. Refer to the sysname manual page for a complete
description of the file.
/etc/gabtab
/sbin/gabconfig -c -n 4
• The -n option must always be set to the total number of systems in the cluster.
14
GAB is configured with the /etc/gabtab file. This file contains one line that is used to start
GAB. For example: /sbin/gabconfig -c -n 4
This example starts GAB and specifies that four systems are required to be running GAB to
start within the cluster. The -n option must always be set to the total number of systems in
the cluster. A sample gabtab file is included in /opt/VRTSgab.
Note: Other gabconfig options are discussed later in this lesson. Refer to the gabconfig
manual page for a complete description of the command.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
After completing this topic, you will be able to describe how systems join
the cluster membership and services come online.
15
16
GAB and LLT are started automatically when a system starts up. HAD can only start after GAB
membership has been established among all cluster systems. The mechanism that ensures
that all cluster systems are visible on the cluster interconnect is GAB seeding.
Seeding during startup
Seeding is a mechanism to ensure that systems in a cluster are able to communicate before
VCS can start. Only systems that have been seeded can participate in a cluster. Seeding is also
used to define how many systems must be online and communicating before a cluster is
formed.
By default, a system is not seeded when it boots. This prevents VCS from starting, which
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
prevents applications (service groups) from starting. If the system cannot communicate with
the cluster, it cannot be seeded.
Seeding is a function of GAB and is performed automatically or manually, depending on how
GAB is configured. GAB seeds a system automatically in one of two ways:
• When an unseeded system communicates with a seeded system
• When all systems in the cluster are unseeded and able to communicate with each other
The number of systems that must be seeded before VCS is started on any system is also
determined by the GAB configuration.
AIX
/etc/rc.d/rc2.d/S70llt: Checks for /etc/llttab and runs /sbin/lltconfig -c to start LLT
/etc/rc.d/rc2.d/S92gab: Calls /etc/gabtab
/etc/rc.d/rc2.d/S99vcs: Runs /opt/VRTSvcs/bin/hastart
Solaris
/lib/svc/method/llt: Checks for /etc/llttab and runs /sbin/lltconfig -c to start LLT
/lib/svc/method/gab: Calls /etc/gabtab
/lib/svc/method/vcs: Runs /opt/VRTSvcs/bin/hastart
21
(Slide diagram: HAD on systems s1 and s2 directs agents to monitor resources.)
1. During startup, HAD autodisables service groups.
2. HAD directs agents to probe (monitor) all resources on all systems in the SystemList
to determine their status.
3. If agents successfully probe resources, HAD brings service groups online according to
AutoStart and AutoStartList attributes (or where resources were running before
hastop -all -force).
18
During initial startup, VCS autodisables a service group until all its resources are probed on all
systems in the SystemList. When a service group is autodisabled, VCS sets the AutoDisabled
attribute to 1 (true), which prevents the service group from starting on any system. This
protects against a situation where enough systems are running LLT and GAB to seed the
cluster, but not all systems have HAD running.
In this case, port a membership is complete, but port h is not. VCS cannot detect whether a
service is running on a system where HAD is not running. Rather than allowing a potential
concurrency violation to occur, VCS prevents the service group from starting anywhere until all
resources are probed on all systems.
After all resources are probed on all systems, a service group can come online by bringing
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
offline resources online. If the resources are already online, as in the case where HAD has
been stopped with the hastop -all -force option, the resources are marked as online.
After completing this topic, you will be able to describe how VCS responds
to common failures.
19
(Slide diagram: service groups A, B, and C running on systems s1, s2, and s3.)
20
The example cluster used throughout most of this section contains three systems, s1, s2, and
s3, each of which can run any of the three service groups, A, B, and C.
The abbreviated system and service group names are used to simplify the diagrams.
In this example, there are two Ethernet LLT links for the cluster interconnect. Prior to any
failures, systems s1, s2, and s3 are part of the regular membership of cluster number 1. When
the s3 system fails, it is no longer part of the cluster membership. Service group C fails over
and starts up on either s1 or s2, according to the SystemList and FailOverPolicy values.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
= Failover duration
21
When a system faults, application services that were running on that system are disrupted
until the services are started up on another system in the cluster. The time required to
address a system fault is a combination of the time required to:
• Detect the system failure.
A system is determined to be faulted according to these default timeout periods:
– LLT timeout: If LLT on a running system does not receive a heartbeat from a system for
16 seconds, LLT notifies GAB of a heartbeat failure.
– GAB stable timeout: GAB determines that a membership change is occurring, and after
five seconds, GAB delivers the membership change to HAD.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• Select a failover target. The time required for the VCS policy module to determine the
target system is negligible, less than one second in all cases, in comparison to the other
factors.
• Bring the service group online on another system in the cluster. As described in an earlier
lesson, the time required for the application service to start up is a key factor in
determining the total failover time.
• LLT starts on s1 and s2. GAB starts but cannot seed with s3 down.
• GAB on s2 now seeds because it can detect another seeded system (s1).
22
You can override the seed values in the gabtab file and manually force GAB to seed a system
using the gabconfig command. This is useful when one of the systems in the cluster is out
of service and you want to start VCS on the remaining systems.
To seed the cluster if GAB is already running, use gabconfig with the -x option to override
the -n value set in the gabtab file. For example, type:
gabconfig -x
If GAB is not already started, you can start and force GAB to seed using -c and -x options to
gabconfig:
gabconfig -c -x
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
CAUTION: Only manually seed the cluster when you are sure that no other systems have GAB
seeded. In clusters that do not use I/O fencing, you can potentially create a split brain
condition by using gabconfig improperly.
After you have started GAB on one system, start GAB on other systems using gabconfig
with only the -c option. You do not need to force GAB to start with the -x option on other
systems. When GAB starts on the other systems, it determines that GAB is already seeded and
starts up.
s1 s2 s3
gabconfig -a
Port a gen a6e003 membership 012
Port a gen a6e003 jeopardy ; 2
Port b gen a6e006 membership 012 Jeopardy membership: s3
Port b gen a6e006 jeopardy ; 2
Port h gen a6e004 membership 012
Port h gen a6e004 jeopardy ; 2
23
In the case where a node has only one functional LLT link, the node is a member of the regular
membership and the jeopardy membership. Being in a regular membership and a jeopardy
membership at the same time changes only the failover behavior on system fault. All other
cluster functions remain unaffected. This means that failover due to a resource fault or switchover of
service groups at operator request is unaffected.
The only change is that, if the system in jeopardy subsequently faults, the other systems are
prevented from starting its service groups.
VCS continues to operate as a single cluster when at least one network channel exists
between the systems.
In the example shown in the diagram where one LLT link fails:
• A jeopardy membership is formed that includes just system s3.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
• System s3 is also a member of the regular cluster membership with systems s1 and s2.
• Service groups A, B, and C continue to run and all other cluster functions remain
unaffected.
• Failover due to a resource fault or an operator-initiated switch of a service group is unaffected.
• If system s3 now faults or its last LLT link is lost, service group C is not started on systems
s1 or s2.
(Slide 24 diagram: system s3 faults or loses its last LLT link; service group C is not started on s1 or s2.)
24
(Slide 25 diagram: service groups A, B, and C on systems s1, s2, and s3 with a low-priority link configured.)
25
LLT can be configured to use a low-priority network link as a backup to normal heartbeat
channels. Low-priority links are typically configured on the public network or administrative
network.
In normal operation, the low-priority link carries only heartbeat traffic for cluster membership
and link state maintenance. The frequency of heartbeats is reduced by half to minimize
network overhead. When the low-priority link is the only remaining LLT link, LLT switches all
cluster status traffic over the link. Upon repair of any configured link, LLT switches cluster
status traffic back to the high-priority link.
Notes:
• Nodes must be on the same public network segment in order to configure low-priority links.
After completing this topic, you will be able to change the cluster
interconnect configuration.
26
(Slide diagram: systems s5, s6, and s7.)
27
You may never need to perform any manual configuration of the cluster interconnect because
the VCS installation utility sets up the interconnect based on the information you provide
about the cluster.
However, certain configuration tasks require you to modify VCS communication configuration
files, as shown in the slide.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
28
The procedure shown in the diagram can be used for any type of change to the VCS
communications configuration. The first task includes saving and closing the cluster
configuration before backing up and editing files.
Although some types of modifications do not require you to stop both GAB and LLT, using this
procedure ensures that any type of change you make takes effect.
For example, if you added a system to a running cluster, you can change the value of -n in the
gabtab file without having to restart GAB. However, if you added the -j option to change
the recovery behavior, you must either restart GAB or run the command in the gabtab file
manually for the change to take effect.
Similarly, if you add a host entry to llthosts, you do not need to restart LLT. However, if
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
you change llttab, or you change a host name in llthosts, you must stop and restart
LLT, and, therefore, GAB.
Following this procedure ensures that any type of change takes effect. You can also use the
scripts in the rc*.d directories to stop and start services. Use the UNIX or Linux-specific
method for starting and stopping services for fencing, GAB, and LLT.
• AIX: /etc/init.d/vxfen.rc stop|start
• RHEL 7, SLES 12 Linux: systemctl stop|start vxfen
• Earlier Linux distributions: /etc/init.d/vxfen stop|start
• Solaris: svcadm disable -t vxfen
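As a sketch of the complete procedure, the following sequence uses the RHEL 7 systemd
service names listed above; adapt the stop and start commands to your platform:
haconf -dump -makero (save and close the cluster configuration)
hastop -all -force (stop HAD on all systems, leaving applications running)
On each system:
systemctl stop vxfen
systemctl stop gab
systemctl stop llt
(back up and edit /etc/llttab, /etc/llthosts, and /etc/gabtab as needed)
systemctl start llt
systemctl start gab
systemctl start vxfen
hastart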
set-node s1
set-cluster 10
# Solaris example
link nxge0 /dev/nxge:0 - ether - -
link nxge4 /dev/nxge:4 - ether - -
link-lowpri e1000g0 /dev/e1000g:0 - ether - -
29
You can add links to the LLT configuration as additional layers of redundancy for the cluster
interconnect. You may want an additional interconnect link for:
• VCS for heartbeat redundancy
• Storage Foundation for Oracle RAC for additional bandwidth
To add an Ethernet link to the cluster interconnect:
1. Cable the link on all systems.
2. Use the process on the previous page to modify the llttab file on each system to add
the new link directive.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
To add a low-priority public network link, add a link-lowpri directive using the same
syntax as the link directive, as shown in the llttab file example in the slide.
VCS uses the low-priority link only for heartbeats (at half the normal rate), unless it is the only
remaining link in the cluster interconnect.
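After LLT is restarted with the updated llttab, you can verify that the new link is recognized;
for example:
lltstat -nvv | more
The output lists each configured link for every node along with its state, so you can confirm
that the added high-priority or low-priority link is up before relying on it.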
30
For more information about the topics discussed in this lesson, refer to the resources listed on
the slide and remember to check the Veritas Support Web site frequently.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
31
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
32
The next section is a quiz. In this quiz, you are asked a series of questions related to the
current lesson.
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. llthosts, llttab
B. lltnodes, llttab
C. llttab, lltconfig
D. llttab, sysname
33
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. llthosts, llttab
B. lltnodes, llttab
C. llttab, lltconfig
D. llttab, sysname
The correct answer is A. The /etc/llthosts and /etc/llttab files define the LLT configuration.
34
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. gabstart -a 2
B. gabconfig -all
C. gabconfig -c -n 2
D. gabtab -c -n 2
35
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. gabstart -a 2
B. gabconfig -all
C. gabconfig -c -n 2
D. gabtab -c -n 2
36
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. gabconf
B. gabstart
C. gabtab
D. gabconfig
37
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. gabconf
B. gabstart
C. gabtab
D. gabconfig
38
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. To ensure that the state of cluster systems is known before HAD is started.
B. To force GAB to communicate with HAD on each system in the cluster.
C. To determine which systems have transitioned from running to faulted.
D. To force HAD on each system to start sending heartbeats to each other system.
39
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. To ensure that the state of cluster systems is known before HAD is started.
B. To force GAB to communicate with HAD on each system in the cluster.
C. To determine which systems have transitioned from running to faulted.
D. To force HAD on each system to start sending heartbeats to each other system.
The correct answer is A. HAD starts after the initial seeding is completed; it waits until all the systems are
seeded and have joined GAB. Once all systems join GAB, the state of all systems is known across the cluster.
40
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. To ensure that all service groups are started manually when VCS first starts up.
B. To determine where resources are online before starting any service groups.
C. To ensure that you manually reset the AutoDisabled attribute each time HAD starts
up.
D. Service groups are only autodisabled during HAD startup if they faulted before HAD
was shut down.
41
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
A. To ensure that all service groups are started manually when VCS first starts up.
B. To determine where resources are online before starting any service groups.
C. To ensure that you manually reset the AutoDisabled attribute each time HAD starts
up.
D. Service groups are only autodisabled during HAD startup if they faulted before HAD
was shut down.
The correct answer is B. Service groups remain autodisabled until their resources have been probed and
their states determined on all cluster nodes.
42
Copyright @ 2020 Veritas Technologies LLC. All rights reserved.
End of presentation
11-43
Not for Distribution.