SSE1G4 Course Version1
Front cover
Course Guide
IBM Storwize V7000 Implementation
Workshop
Course code SSE1G ERC 4.0
May 2019 edition
Notices
This information was developed for products and services offered in the US.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative
for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not
intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or
service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate
and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this
document does not grant you any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive, MD-NC119
Armonk, NY 10504-1785
United States of America
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY
KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some jurisdictions do not allow disclaimer
of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein;
these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s)
and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an
endorsement of those websites. The materials at those websites are not part of the materials for this IBM product and use of those
websites is at your own risk.
IBM may use or distribute any of the information you provide in any way it believes appropriate without incurring any obligation to you.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other
publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other
claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those
products.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible,
the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to
actual people or business enterprises is entirely coincidental.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., registered in many
jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM
trademarks is available on the web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml.
© Copyright International Business Machines Corporation 2012, 2019.
This document may not be reproduced in whole or in part without the prior written permission of IBM.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
V11.3
Contents
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xx
Agenda . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiii
Unit 6. IBM Storwize V7000 installation and management access . . . . . . . . . . . . . . . . . . . . . . . . 6-1
Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2
Installation and configuration: System installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-3
Pre-install Technical Delivery Assessment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-4
TDA pre-installation checklist and worksheet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-5
Determine installation requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-7
Storwize V7000 planning tables and charts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-9
Storwize V7000 physical installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-10
System power requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-12
Storwize V7000 node 12F/24F SAS cable requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-13
Storwize V7000 node 92F SAS cable requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-15
Storwize V7000 cable options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-16
Cable management arm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-17
Device power on order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-18
System initialization using the Technician port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-19
Management GUI access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-20
System Setup tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-21
Additional system configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-22
Installation and configuration: Client system setup (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-23
Required IP addresses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-24
Storwize V7000 management interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-25
Cluster communication and management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-26
Installation and configuration: Client system setup (2 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-27
Cluster zoning requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-28
Dual fabric for high availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-30
Storwize V7000 port destination recommendations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-31
Storwize V7000 port assignment recommendations for isolating traffic . . . . . . . . . . . . . . . . . . . . . . 6-32
Zone definitions by port number or WWPN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-33
Name and addressing convention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-34
WWN addressing scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-35
Storwize V7000 and FlashSystem 900 switch zoning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-36
Storwize V7000 and DS3500 switch zoning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-37
Storwize V7000 Fibre Channel connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-38
FC Zoning and multipathing LUN access control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-40
Multipathing and host LUN access control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-41
Host zoning preferred paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-42
Zoning multi HBA hosts for resiliency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-43
SAN zoning documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-44
Installation and configuration: Client system setup (3 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-45
Storwize V7000 GUI Dashboard view . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-46
Default authorized users . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-47
User authentication methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-48
User group roles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-49
Access menu: User groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-50
Managing user authentication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-51
Remote Authentication configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-52
CLI SSH keys encrypted communications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-53
PuTTYgen key generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-54
Save the generated keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-55
User with SSH key authentication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-56
Create CLI session with SSH key authentication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-57
Accessing CLI from Microsoft Windows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-58
Command-line interface commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-59
Monitoring view . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-60
System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-61
Keywords . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-62
Unit 13. IBM Spectrum Virtualize FlashCopy and Consistency groups . . . . . . . . . . . . . . . . . . . 13-1
Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-2
Spectrum Virtualize Copy Services: FlashCopy (1 of 6) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-3
Spectrum Virtualize software architecture: FlashCopy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-4
Data consistency after a crash . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-5
FlashCopy point in time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-6
Different types of FlashCopy PiT images (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-7
Different types of FlashCopy PiT images (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-8
FlashCopy implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-9
FlashCopy attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-10
FlashCopy process (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-11
FlashCopy process (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-12
FlashCopy full copy process complete . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-13
FlashCopy: Background copy rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-14
FlashCopy reads/writes: Full background copy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-16
FlashCopy reads/writes: No background copy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-17
FlashCopy: Sequence of events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-18
FlashCopy mapping states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-20
Copy Services: FlashCopy options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13-22
Trademarks
The reader should recognize that the following terms, which appear in the content of this training
document, are official trademarks of IBM or other companies:
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business
Machines Corp., registered in many jurisdictions worldwide.
The following are trademarks of International Business Machines Corporation, registered in many
jurisdictions worldwide:
AIX 5L™ AIX® DB™
developerWorks® DS8000® Easy Tier®
Express® FlashCopy® GPFS™
HyperSwap® IBM Business Partner® IBM Cloud™
IBM FlashCore® IBM FlashSystem® IBM Security™
IBM Spectrum Accelerate™ IBM Spectrum Archive™ IBM Spectrum Control™
IBM Spectrum Protect™ IBM Spectrum Scale™ IBM Spectrum Storage™
IBM Spectrum Virtualize™ IBM Spectrum™ Insight®
Interconnect® Linear Tape File System™ MicroLatency®
Notes® Passport Advantage® Power Systems™
Power® Real-time Compression™ Redbooks®
Storwize® System Storage DS® System Storage®
Tivoli® Variable Stripe RAID™ XIV®
Adobe is either a registered trademark or a trademark of Adobe Systems Incorporated in the United
States, and/or other countries.
Intel is a trademark or registered trademark of Intel Corporation or its subsidiaries in the United
States and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft and Windows are trademarks of Microsoft Corporation in the United States, other
countries, or both.
Java™ and all Java-based trademarks and logos are trademarks or registered trademarks of
Oracle and/or its affiliates.
UNIX is a registered trademark of The Open Group in the United States and other countries.
VMware and VMware vSphere are registered trademarks or trademarks of VMware, Inc. or its
subsidiaries in the United States and/or other jurisdictions.
SoftLayer® is a trademark or registered trademark of SoftLayer, Inc., an IBM Company.
Other product and service names might be trademarks of IBM or other companies.
Course description
IBM Storwize V7000 Implementation Workshop
Duration: 4 days
Purpose
This course is designed to leverage SAN storage connectivity by integrating a layer of intelligent
virtualization, the IBM Storwize V7000, to make storage application data access independent of
storage management functions and requirements. The focus is on planning and implementation
tasks associated with integrating the Storwize V7000 into the storage area network.
It also explains how to:
• Centralize storage provisioning to host servers from common storage pools using internal
storage and SAN attached external heterogeneous storage.
• Improve storage utilization effectiveness using Thin Provisioning and Real-Time Compression.
• Implement storage tiering and optimize solid state drives (SSDs) or flash systems usage with
Easy Tier.
• Facilitate the coexistence and migration of data from non-virtualization to the virtualized
environment.
• Utilize network-level storage subsystem-independent data replication services to satisfy backup
and disaster recovery requirements.
This course lecture offering is at the Storwize V7000 V8.2 level.
Important
This course consists of several independent modules. The modules, including the lab exercises,
stand on their own and do not depend on any other content.
Audience
This lecture and exercise-based course is for individuals who are assessing and/or planning to
deploy IBM System Storage networked storage virtualization solutions.
Prerequisites
• Introduction to Storage (SS01G)
• Storage Area Networking Fundamentals (SN71G) or equivalent experience
• An understanding of the basic concepts of open systems disk storage systems and I/O
operations.
Objectives
After completing this course, you should be able to:
• Distinguish the concepts of IBM Spectrum Virtualize.
• Recall the history for IBM Storwize V7000.
• Distinguish the core principles of the IBM FlashCore Technology.
• Classify the characteristics and components of the IBM SAN Volume Controller system and the
SAS-attached expansion enclosures.
• Outline the setup steps required to integrate a Storwize V7000 system solution.
• Compare the characteristics of the RAID and DRAID.
• Summarize the virtualization process converting physical storage space into virtual resources.
• Recall the process to create host access storage on a Storwize V7000 system.
• Differentiate the advanced software features designed to simplify data management, reclaim
storage space, and preserve storage investments.
• Differentiate methods in which to migrate data to and from the virtualized system environment.
• Summarize the methods of remote data replication to improve availability and support for
disaster recovery.
• Employ administrative operations to manage, monitor, and troubleshoot the system
environment.
• Summarize the characteristics of IBM Storage Insights’ ability to identify, troubleshoot and
minimize potential system downtime.
Agenda
Note
The following unit and exercise durations are estimates, and might not reflect every class
experience.
Day 1
(00:30) Course Introduction
(00:30) Unit 1: Introduction of IBM Storwize V7000
(01:00) Unit 2: IBM Storwize V7000 hardware architecture
(00:00) Unit 3: IBM FlashCore Technology
(00:30) Unit 4: IBM Storwize V7000 SAS-Attached storage
(00:45) Unit 5: IBM Storwize V7000 RAID protection solutions
(01:15) Unit 6: IBM Storwize V7000 system installation and management access
(00:15) Exercise 0: Lab environment overview
(00:15) Exercise 1: Storwize V7000 system initialization
(00:45) Exercise 2: Storwize V7000 system configuration
(00:30) Exercise 3: Configure user authentication
Day 2
Day 3
Day 4
Overview
This unit provides a high-level overview of the course deliverables and overall course objectives,
which are discussed in detail throughout the course.
Course overview
This is a 4-day lecture and exercise-based course for individuals who are assessing and planning
to deploy IBM Storwize V7000 networked storage virtualization solutions.
Course introduction © Copyright IBM Corporation
Course prerequisites
• Introduction to Storage (SS01G)
• Storage Area Networking Fundamentals (SN71G) or equivalent experience
• A basic understanding of the concepts of open systems disk storage systems and I/O
operations
Course objectives
• Distinguish the concepts of IBM Spectrum Virtualize.
• Recall the history for IBM Storwize V7000.
• Distinguish the core principles of the IBM FlashCore Technology.
• Classify the characteristics and components of the IBM SAN Volume Controller system and the SAS-
attached expansion enclosures.
• Outline the setup steps required to integrate a Storwize V7000 system solution.
• Compare the characteristics of the RAID and DRAID.
• Summarize the virtualization process converting physical storage space into virtual resources.
• Recall the process to create host access storage on a Storwize V7000 system.
• Differentiate the advanced software features designed to simplify data management, reclaim storage
space, and preserve storage investments.
• Differentiate methods in which to migrate data to and from the virtualized system environment.
• Summarize the methods of remote data replication to improve availability and support for disaster
recovery.
• Employ administrative operations to manage, monitor, and troubleshoot the system environment.
• Summarize the characteristics of IBM Storage Insights’ ability to identify, troubleshoot and minimize
potential system downtime.
Agenda: Day 1
• Unit 1: Introduction of IBM Storwize V7000
• Unit 2: IBM Storwize V7000 hardware architecture
• Unit 3: IBM FlashCore Technology
• Unit 4: IBM Storwize V7000 SAS-Attached storage
• Unit 5: IBM Storwize V7000 RAID protection solutions
• Unit 6: IBM Storwize V7000 system installation and management access
ƒ Exercise 0: Lab environment overview
ƒ Exercise 1: Storwize V7000 system initialization
ƒ Exercise 2: Storwize V7000 system configuration
ƒ Exercise 3: Configure user authentication
Agenda: Day 2
• Review
• Unit 7: IBM Spectrum Virtualize storage provisioning
• Unit 8: IBM Spectrum Virtualize volume allocation
• Unit 9: IBM Spectrum Virtualize host integration
• Unit 10: IBM Spectrum Virtualize data reduction technologies
ƒ Exercise 4: Provision internal storage
ƒ Exercise 5: Examine external storage resources
ƒ Exercise 6: Managing external storage resources
ƒ Exercise 7: Host definitions and volume allocations
ƒ Exercise 8: Access storage from Windows and AIX
ƒ Exercise 9: Hybrid pools and Easy Tier
Agenda: Day 3
• Review
• Unit 11: IBM Easy Tier
• Unit 12: IBM Spectrum Virtualize data migration
• Unit 13: IBM Spectrum Virtualize FlashCopy and Consistency group
ƒ Exercise 10: Access Storwize V7000 through iSCSI host
ƒ Exercise 11: Volume dependencies and tier migration
ƒ Exercise 12: Reconfigure internal storage: RAID options
ƒ Exercise 13: Thin provisioning and volume mirroring
ƒ Exercise 14: Migrate existing data: Import Wizard
Agenda: Day 4
• Review
• Unit 14: IBM Spectrum Virtualize remote data replication
• Unit 15: IBM Spectrum Virtualize administration management
• Unit 16: IBM Storage Insights
ƒ Exercise 15: Copy Services: FlashCopy and consistency groups
ƒ Exercise 16: User roles and access
ƒ Exercise 17: Migrate existing data: Migration Wizard
ƒ Exercise 18: Easy Tier and STAT analysis
• Class wrap-up and evaluation
Lab resources
Introductions
• Name
• Company
• Where you live
• Your job role
• Current experience with products and technologies in this course
• Do you meet the course prerequisites?
• Class expectations
Class logistics
• Course environment
• Start and end times
• Lab exercise procedures
• Materials in your student packet
• Topics not on the agenda
• Evaluations
• Breaks and lunch
• Outside business
• For classroom courses:
ƒ Lab room availability
ƒ Food
ƒ Restrooms
ƒ Fire exits
ƒ Local amenities
Overview
This module provides an overview of IBM Spectrum Virtualize, which transforms storage by
streamlining deployments and enabling better data value and management in your physical
infrastructure. It also introduces the evolution of the IBM Storwize V7000 enclosures.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
Managed disk (MDisk): A SCSI logical unit, also sometimes referred to as a LUN, virtual disk, volume, virtual volume, virtual drive, or SCSI disk, built from an internal RAID array or served from an external RAID array.
Internal storage: Physical Flash/Enterprise drives within a control enclosure or expansion enclosure, used to create DRAID/RAID arrays and managed disks.
External storage: Managed disks that are configured from system-level RAID arrays or SCSI logical units (also known as LUNs) presented by storage systems that are attached to the SAN and managed by the system.
Virtualization: The act of creating a virtual (rather than actual) version of something, including virtual computer hardware platforms, operating systems, storage devices, and computer network resources.
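The relationship among these terms can be sketched with a small toy model. This is illustrative only: it is not the Storwize V7000 CLI or API, and all class and attribute names are invented for the example. It shows how internal arrays and external LUNs both surface as MDisks, how a storage pool aggregates MDisk capacity, and how volumes are carved from the pool and presented to hosts.

```python
# Illustrative model only (not the actual Storwize V7000 software):
# internal drives form RAID arrays that become MDisks; external LUNs
# are also surfaced as MDisks; a storage pool aggregates MDisks and
# carves volumes out of the pooled capacity.

class MDisk:
    def __init__(self, name, capacity_gb, source):
        self.name = name
        self.capacity_gb = capacity_gb
        self.source = source  # "internal-array" or "external-lun"

class StoragePool:
    def __init__(self, name):
        self.name = name
        self.mdisks = []
        self.volumes = []  # list of (volume_name, size_gb)

    def add_mdisk(self, mdisk):
        self.mdisks.append(mdisk)

    @property
    def capacity_gb(self):
        # Pool capacity is the sum of its managed disks.
        return sum(m.capacity_gb for m in self.mdisks)

    @property
    def free_gb(self):
        return self.capacity_gb - sum(size for _, size in self.volumes)

    def create_volume(self, name, size_gb):
        # A volume (the SCSI logical unit presented to hosts) is carved
        # from pool capacity, regardless of which MDisk backs it.
        if size_gb > self.free_gb:
            raise ValueError("insufficient pool capacity")
        self.volumes.append((name, size_gb))

pool = StoragePool("Pool0")
pool.add_mdisk(MDisk("mdisk0", 800, "internal-array"))  # DRAID array of internal drives
pool.add_mdisk(MDisk("mdisk1", 1200, "external-lun"))   # LUN from an external system
pool.create_volume("host_vol_01", 500)
print(pool.capacity_gb, pool.free_gb)  # 2000 1500
```

The point of the sketch is the layering: hosts see only volumes, while the virtualization layer decides which MDisks (internal or external) actually hold the data.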
(Slide graphic: callouts for "Accommodating expected future growth" and "Diagnose and address disk performance issues and disk hot spots.")
The role of a storage administrator can become more complex every year with the demands of a
growing heterogeneous storage environment. This inherent growth does not change the task that
must be performed, which involves configuring storage arrays so that their capacity can be portioned
off for different purposes.
With hundreds of storage environments to be provisioned, this can be unpleasant and tedious, and
can also result in storage capacity being sealed off from use.
A provisioning process requires several steps to be performed in a specific order. Storage has to be
configured for the best SAN performance, such as zoning switches and other network appliances
for optimal performance. All logical unit numbers (LUNs) must be assigned to their required devices
or shared across the network. Data must be accessible to the users who need it.
High availability and disaster recovery solutions must be included to maintain the SAN network
environment in the event of partial or catastrophic failure. Once everything has been provisioned
and all the programs and applications are installed, the entire SAN environment must be tested
before sensitive and valuable data is committed to it. These solution features must be updated and
tested periodically to maintain accessibility.
The administrator must also make sure that the SAN environment can accommodate expected
future growth, while diagnosing and addressing disk performance issues and disk hot spots,
making this a never-ending job.
Many different storage environments add complexity to the administrator's job; Spectrum
Virtualize simplifies the environment, making it appear as practically a single storage environment.
The IBM Spectrum Storage family is the industry’s first software family based on proven
technologies and designed specifically to simplify storage management, scale to keep up with data
growth, and optimize data economics. It represents a new, more agile way of storing data, and
helps organizations prepare themselves for new storage demands and workloads. The software
defined storage solutions included in the IBM Spectrum Storage family can help organizations
simplify their storage infrastructures, cut costs, and start gaining more business value from their
data.
IBM Spectrum Storage provides the following benefits:
▪ Simplify and integrate storage management and data protection across traditional and new
applications
▪ Deliver elastic scalability with high performance for analytics, big data, social, and mobile
▪ Unify siloed storage to deliver data without borders with built-in hybrid cloud support
▪ Optimize data economics with intelligent data tiering from flash to tape and cloud
▪ Build on open architectures that support industry standards that include OpenStack and
Hadoop
(Slide: the IBM Spectrum Storage family, deployable on private, public, or hybrid cloud across flash, disk, cloud services, and storage-rich servers.)
▪ IBM Spectrum Protect: Optimized hybrid cloud data protection to reduce backup costs by up to 53%
▪ IBM Spectrum Protect Plus: Complete VM protection and availability that's easy to set up and manage yet scalable for the enterprise
▪ IBM Spectrum Virtualize: Reduces CAPEX and OPEX of multi-vendor block environments through storage virtualization; stores up to 5x more data
▪ IBM Spectrum Virtualize for Public Cloud: Real-time hybrid cloud disaster recovery, replication, and migration
▪ IBM Spectrum Archive: Fast data retention that reduces TCO for active archive data by up to 90%
▪ IBM Spectrum Accelerate: Enterprise block storage for hybrid cloud deployed in minutes instead of months
▪ IBM Spectrum Scale: High-performance, highly scalable hybrid cloud storage for unstructured data
▪ IBM Spectrum NAS: Easy to manage software-defined file storage for the enterprise
▪ IBM Cloud Object Storage: Flexible, scalable, and simple hybrid cloud object storage with geo-dispersed enterprise availability and security
▪ IBM Spectrum CDM: Manage copies to increase business velocity and efficiency
▪ IBM Spectrum Access Blueprint: Enables rapid deployment of IBM Cloud Private in hours with a pre-tested, validated IT stack
The IBM Spectrum Storage family comprises the following software-defined products:
IBM Spectrum Control - analytics-driven data management software that is designed to reduce
costs by up to 73 percent.
IBM Spectrum Control provides efficient infrastructure management for virtualized, cloud, and
software-defined storage to simplify and automate storage provisioning, capacity management,
availability monitoring, and reporting.
The functionality of IBM Spectrum Control is provided by IBM Data and Storage Management
Solutions.
IBM Spectrum Connect - fast, easy integration of storage in multiple cloud environments.
IBM Spectrum Protect - optimized data protection to reduce backup costs by up to 53 percent.
IBM Spectrum Protect enables reliable, efficient data protection and resiliency for software-defined,
virtual, physical, and cloud environments.
The functionality of IBM Spectrum Protect is provided by IBM Backup and Recovery Solutions.
IBM Spectrum Protect Plus - complete VM protection and availability that's easy to set up and
manage, yet scalable for the enterprise.
IBM Spectrum Virtualize - virtualization software capable of supporting mixed environments,
storing up to 5x more data.
IBM Spectrum Virtualize is an industry-leading storage virtualization product that enhances existing
storage to improve resource utilization and productivity to achieve a simpler, more scalable and
cost-efficient IT infrastructure. The functionality of IBM Spectrum Virtualize is provided by IBM SAN
Volume Controller, the FlashSystem V9000, and the entire Storwize family.
IBM Spectrum Virtualize for Public Cloud - Real-time hybrid cloud disaster recovery, replication
and migration.
IBM Spectrum Archive - provides fast data retention that reduces TCO for active archive data by
up to 90%.
IBM Spectrum Archive enables you to automatically move infrequently accessed data from disk to
tape so you can lower costs while retaining ease of use and without the need for proprietary tape
applications.
The functionality of IBM Spectrum Archive is provided by IBM Linear Tape File System.
IBM Spectrum Accelerate - supports enterprise storage for cloud, deployed in minutes instead of
months. IBM Spectrum Accelerate is a software defined storage solution, using Intel processors
and their attached storage to create a disk subsystem that works like an XIV system. This solution
is designed to help speed delivery of data across the organization and add extreme flexibility to
cloud deployments. IBM Spectrum Accelerate delivers hotspot-free performance, easy
management scaling, and proven enterprise functionality such as advanced mirroring and flash
caching to different deployment platforms.
IBM Spectrum Scale - supports high-performance, highly scalable storage for unstructured data
that scales to yottabytes (YB) of data. IBM Spectrum Scale is a proven high-performance data and
file management solution that can manage over one billion petabytes of unstructured data.
Spectrum Scale redefines the economics of data storage using policy-driven automation: as time
passes and organizational needs change, data can be moved back and forth between flash, disk
and tape storage tiers without manual intervention.
The functionality of IBM Spectrum Scale is delivered by IBM General Parallel File System (GPFS) or the Elastic Storage Server.
IBM Spectrum NAS - easy to manage software-defined file storage for the enterprise.
With the IBM Spectrum Storage Suite, you get unlimited access to all members of the IBM Spectrum Storage software family, with licensing on a flat, cost-per-TB basis. This reduces the complexity of managing multiple software licenses and keeps costs predictable as capacity grows. Each Spectrum Storage solution is structured specifically to meet changing storage needs.
Storage Management and Applications
Storage infrastructure optimized for the data underlying a workload - file, block, or object
All IBM SDS offerings can adapt to high performance or high capacity needs by leveraging the appropriate underlying storage media: Flash, NL-SAS, or Tape.
IBM's software-defined storage (SDS) management and optimization software products help with
the challenges of managing your storage environment.
Software-defined storage is an IT advancement automating infrastructure configuration via
software, providing rapid deployment aligned to real-time application requirements. SDS also
allows architects and administrators to take advantage of IBM storage product features such as the
many optimization features of Spectrum Virtualize.
In a virtualized storage infrastructure, a software-defined storage architecture is concealed from
users who require a storage volume resource that provides the capacity and performance attributes
that are suited to the application workload they're running.
All resources can then be assigned based on application workload requirements with best-available
resources aligned to business-requirements-based service level policies.
(Diagram: the SNIA shared storage model. An application layer sits above a file/record layer (database and file system) and a block layer, with optional block virtualization at the host, network, and device levels above the block subsystem and the storage devices (disks, etc.). Alongside the stack run services such as discovery and monitoring, high availability, scale-out clustering, security, auditing, billing, redundancy and backup, capacity, retention, compliance, and data management, exposed through frameworks and instrumentation such as SMI-S and SNMP.)
Storage virtualization is a term used broadly throughout the storage industry, as it can apply to various technologies and fundamental capabilities. Understanding storage virtualization means knowing what is created, where it takes place, and how it is implemented. Storage virtualization, as defined in the Storage Networking
Industry Association’s (SNIA) shared storage model version 2, represents one of the most
heterogeneous environments integrated in IT infrastructures, with a multitude of different systems
at all levels of the stack. This complexity has become a hindrance to achieving business goals such
as 100% uptime.
The first level of the storage virtualization is “what is created”. It specifies the types of virtualization,
such as file, file system, share, or block device virtualization.
The next level describes “where” the virtualization can take place. This requires a multilevel
approach that characterizes virtualization at all three levels of the storage environment: host server,
network, and devices. An effective virtualization strategy distributes the intelligence across all three levels while centralizing the management and control functions: the array retains its data storage functions, the host controls its applications and volume management, and path redirection, path failover, data access, and distribution or load-balancing capabilities can be moved to the switch or network level.
In-Band Appliance (Symmetric Virtualization)
IBM Spectrum Virtualize supports symmetric virtualization, as it sits directly in the data path of all I/O traffic, acting as both the target (accepting I/O requests from the host) and the initiator (issuing I/O requests to the storage).
Storage virtualization is a technique of abstracting physical resources into a unified view of all
deployed storage devices, both internal and external, by adding an abstraction layer to the existing
SAN infrastructure separating the application servers from the underlying physical storage
systems.
In an IBM Spectrum Virtualize clustered environment, virtualization hides the complexity created by having a
variety of disparate storage devices supporting different tiers of storage from different vendors with
different interfaces and multi-pathing drivers. It collectively manages all storage resources enabling
more efficiency, facilitating balanced use of resources, and other efficiency features across all the
storage that isn't possible with individual disk subsystems by themselves.
With IBM Spectrum Virtualize, businesses can avoid the complexity of storage management with dependable software that is capable of transforming their storage, streamlining deployments, and improving data value, security, and simplicity for new and existing storage infrastructure. These themes are emphasized consistently as IBM Spectrum Virtualize standards.
The IBM Spectrum Virtualize layer requires only a minimal hardware configuration: a device's I/O group connected to the SAN. This includes redundant hardware running as dual clustered nodes, providing the ability to manage storage independently of the disk systems that are assigned to it.
Each system is built upon the IBM Spectrum Virtualize software, which supports different tiers of
storage from different vendors with different interfaces and multi-pathing drivers, while maintaining
isolation from the host. The host sees only one device type, one multi-pathing driver, and one
management interface regardless of the number of types of storage controllers being managed by
the IBM Spectrum Storage product.
Considering multiple devices as a single logical pool of storage brings more flexibility than only
allowing users to see storage capacity in a single device. Once you look at storage logically and
provision from the virtualization layer, the constraints of individual devices go away and so do many
potentially wasteful practices.
IBM Spectrum Virtualize spreads virtual disks and their I/Os across all the MDisks in a pool; thus,
balancing use of physical resources in the pool, eliminating hot spots and hot spot management.
This leads to more effective utilization of the storage, better overall application performance, and
better use of administrator time.
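The striping behavior described above can be sketched in a few lines of Python. This is an illustrative model only, not the product's actual extent allocator; the MDisk names are hypothetical:

```python
# Illustrative sketch of round-robin extent striping across the MDisks in a
# pool (hypothetical names; not IBM Spectrum Virtualize internals).
from collections import Counter

def allocate_extents(mdisks, num_extents):
    """Assign each volume extent to an MDisk in round-robin order."""
    return [mdisks[i % len(mdisks)] for i in range(num_extents)]

# A volume of 100 extents spread over a four-MDisk pool:
pool = ["mdisk0", "mdisk1", "mdisk2", "mdisk3"]
layout = allocate_extents(pool, 100)

# Every MDisk ends up with an equal share of the extents, and hence the I/O.
print(Counter(layout))  # each MDisk holds 25 extents
```

Because each MDisk receives an equal share of extents, I/O naturally spreads across all the physical resources in the pool, which is what eliminates hot spots.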
IBM Spectrum Virtualize uses image mode, which facilitates the creation of a one-to-one direct mapping between a volume and a disk subsystem LUN that contains existing data.
Image mode simplifies the transition of existing data from a non-virtualized to a virtualized
environment without requiring physical data movement or conversion.
This diagram shows the basic function of Spectrum Virtualize: pooling storage from external or internal storage presented as MDisks to Spectrum Virtualize, then serving up virtual disks, or VDisks, to hosts, thus allowing all the storage to be virtualized. Other efficiency features can then also be utilized, which will be discussed later.
With each release of the IBM Spectrum Virtualize software comes more innovative software functionality that offers many advanced features for storage virtualization.
The Spectrum Virtualize approach is based on a scale-out clustered architecture that simplifies lifecycle management tasks. Spectrum Virtualize allows for non-disruptive replacement of any part of the storage infrastructure, including the SVC devices themselves. It also simplifies compatibility requirements associated with heterogeneous server and storage environments.
Therefore, all advanced functions are implemented in the virtualization layer, which allows switching storage array vendors without impact. This enables application server storage requirements to be articulated in terms of performance, availability, or cost.
Storage virtualization moves all existing data mapped by volumes in one storage pool of a legacy
storage system to another storage pool in another storage system. The legacy storage system can
then be decommissioned without impact to existing applications.
The entire data migration can happen while the application continues running. The storage systems
can also be from different vendors.
Virtualization takes place quickly, efficiently, and in real time
By using volume mirroring or VDisk mirroring, a volume can have two physical copies. Each volume
copy can belong to a different storage pool, and each copy has the same virtual capacity as the
volume.
You can also use VDisk mirroring to mirror data across disk subsystems, or remotely mirror VDisks
to another IBM Spectrum Virtualize system to enhance data availability in case of disk subsystem
failure or site failure respectively.
These functions can be performed using Remote Mirroring features of Metro Mirroring, Global
Mirroring, or Global Mirroring with Change Volumes where FlashCopy is used to create backup of
the Changed Volume. This usually involves systems that are part of a HyperSwap or Enhanced
Stretched Cluster system.
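The mirroring concept can be illustrated with a simplified sketch; the class and method names here are invented for the example and are not IBM Spectrum Virtualize internals:

```python
# Simplified sketch of volume mirroring (illustrative only): every write
# lands synchronously on two copies in different storage pools; reads are
# served from the primary copy, and the surviving copy takes over when one
# pool fails.

class MirroredVolume:
    def __init__(self):
        self.copies = [{}, {}]   # copy 0 in pool A, copy 1 in pool B
        self.primary = 0

    def write(self, lba, data):
        for copy in self.copies:  # the write goes to both copies
            copy[lba] = data

    def read(self, lba):
        return self.copies[self.primary].get(lba)

    def fail_copy(self, idx):
        # Simulate losing one copy: the other copy becomes primary.
        self.primary = 1 - idx

vol = MirroredVolume()
vol.write(0, b"app data")
vol.fail_copy(0)              # pool A (copy 0) fails
print(vol.read(0))            # data is still served from copy 1
```

The same pattern, applied across two systems rather than two pools, is the intuition behind the remote mirroring features named above.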
All functions are moved from the storage system into the virtualization layer, which means these
functions will stay constant even if the back-end hardware is eventually upgraded/changed. Instead
of purchasing multiple licenses for back-end storage such as encryption, remote mirroring,
FlashCopy, thin provisioning, and so on, one can just purchase a license on the SVC to provide
these functions, thus saving significant cost.
Bottom line: virtualization is a process that takes place quickly, efficiently, and in real time, while avoiding increases in administrative costs.
• Thin provisioning: reduced storage cost for allocated but unused storage
• Compression: applying algorithms to reduce the capacity required to store a block of data
IBM Spectrum Virtualize software offers a comprehensive range of data reduction and efficiency
capabilities including compression, deduplication, thin provisioning, compaction, SCSI unmap, and
space-efficient snapshots that can be coordinated with IBM Spectrum Copy Data Management
software.
Flash storage benefits immensely from data reduction due to its cost, write amplification, and performance, often reducing the storage and its cost by half. Thin provisioning optimizes efficiency by allocating disk storage space flexibly among users, based on the minimum space required by each user at any given time. Thin provisioning extends storage utilization, reduces the consumption of electrical energy because less hardware is required, and enables more frequent recovery points (point-in-time copies) to be taken without a commensurate increase in storage capacity consumption.
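The allocate-on-write idea behind thin provisioning can be sketched as follows; this is a minimal illustrative model, and the class and its names are hypothetical:

```python
# Minimal sketch of thin provisioning (hypothetical names, not product code):
# the volume advertises a large virtual size, but physical capacity is only
# consumed when a block is first written.

class ThinVolume:
    def __init__(self, virtual_gb):
        self.virtual_gb = virtual_gb
        self.allocated = {}            # block number -> data, grown on demand

    def write(self, block, data):
        self.allocated[block] = data   # physical space allocated on first write

    def used_gb(self, block_gb=1):
        return len(self.allocated) * block_gb

vol = ThinVolume(virtual_gb=1000)      # the host sees a 1000 GB volume
for b in range(50):
    vol.write(b, "data")               # but only 50 GB is ever written
print(vol.virtual_gb, vol.used_gb())   # 1000 50
```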
Data compression provides storage savings by applying algorithms that reduce the capacity
required to store a block of data. IBM now guarantees an incredible savings rate of up to 80 percent
when you deploy IBM Real-time Compression, provided analysis of the customer data set supports
it, or up to 50% savings without analysis of the customer data. This process enables up to five times
as much data to be stored in the same physical disk space. The goal of compression is to reduce
data footprint.
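The "80 percent savings equals five times as much data" relationship is simple arithmetic: effective capacity = physical capacity / (1 - savings). A quick check:

```python
# Effective capacity from a data reduction savings ratio:
# effective = physical / (1 - savings)

def effective_capacity(physical_tb, savings):
    return physical_tb / (1 - savings)

# 80% savings stores 5x the data in the same space; 50% savings stores 2x.
print(round(effective_capacity(100, 0.80)))  # 500 (TB effective from 100 TB physical)
print(round(effective_capacity(100, 0.50)))  # 200
```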
• Data deduplication: applying algorithms to reduce the number of times duplicated data blocks are stored
Data deduplication and compression both work in-line, allowing systems to continue operating on uncompressed data. Unlike compression, which reduces the number of bits required to store the data, deduplication reduces the number of times duplicate data blocks are kept. Only one copy of a block is stored, while duplicate copies are eliminated (by virtue of intelligent metadata handling). Deduplication can save a huge amount of storage capacity for applications that have similar data, or may not reduce capacity at all.
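The "only one copy of a block is kept" behavior can be illustrated with a tiny content-addressed store. This is a deliberately simplified sketch; the real metadata handling in data reduction pools is far more involved:

```python
# Tiny content-addressed store illustrating deduplication: identical blocks
# hash to the same fingerprint, so only one physical copy is kept.
import hashlib

class DedupStore:
    def __init__(self):
        self.physical = {}    # fingerprint -> the single stored copy
        self.volume = []      # logical view: one fingerprint per written block

    def write(self, data: bytes):
        fp = hashlib.sha256(data).hexdigest()
        self.physical.setdefault(fp, data)  # store only the first copy
        self.volume.append(fp)

store = DedupStore()
for block in (b"os-image", b"os-image", b"os-image", b"user-data"):
    store.write(block)

# Four logical blocks were written, but only two are physically stored.
print(len(store.volume), len(store.physical))  # 4 2
```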
Data reduction pools increase existing infrastructure capacity utilization by leveraging new
efficiency functions. The pools enable you to automatically de-allocate and reclaim capacity of
thin-provisioned volumes containing deleted data and, for the first time, enable this reclaimed
capacity to be reused by other volumes. With a new log-structured pool implementation, data
reduction pools help deliver more consistent performance from compressed volumes. Data
reduction pools also support compressing all volumes in a system, potentially extending the
benefits of compression to all data in a system. Data is deduplicated on a pool basis in data
reduction pools.
When deploying IBM Spectrum Virtualize products under the Data Reduction Guarantee, IBM gives
organizations two options for how to leverage the program’s advantages. The IBM Flexible Data
Reduction Guarantee option provides capacity savings of up to 5:1 based on analyzing the data to
be stored to determine the data reduction rates that are possible. If that analysis is not feasible, the
IBM Estimate Free Data Reduction Guarantee option provides an effortless 2:1 capacity savings
without collecting data and generating reports. There is no charge for these guarantees and users
can select the option that works best for them.
• Multi-tier pools: moves data to the right tier automatically; improves storage price performance; improves application performance with a small investment in flash storage; uses 1024 MB extents; requires a license
• Single-tier pools: automated balancing across MDisks in a pool; no license required
IBM Spectrum Virtualize enables the implementation of a tiered storage scheme using multiple storage pools through lifecycle management, which facilitates the migration of aged or inactive volumes to a lower-cost storage tier in a different storage pool.
Effectively using storage means putting the right data on the right type of storage, and Easy Tier does this automatically. Frequently accessed data is placed on flash, other data may reside on 15K RPM disks, and infrequently or sequentially accessed data is placed on nearline storage.
Often most of the I/Os reside on a small subset of the data space, so storing that data on a faster
tier offers a great performance return on the investment in a little flash. For example, it is not
uncommon for 90% of the I/Os to occur on 10% of the data. Data movement is seamless to the host
application regardless of the storage tier in which the data resides.
With Spectrum Virtualize's capability to dynamically add and remove disk subsystems, customers
can purchase flash or nearline storage and implement Easy Tier without disruption to host
applications.
In single-tier pools, Easy Tier performs automated storage pool balancing; moving data among
MDisks in the pool to balance I/Os across them when the I/Os aren't evenly balanced. This
balances use of the storage hardware resources in the pool. This facilitates adding or removing
storage from a pool and keeping the I/Os balanced across the storage in the pool. This part of Easy
Tier doesn't require a license.
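The Easy Tier placement idea (put the hottest extents on the fastest tier) can be sketched as follows; this is an illustrative model only, with invented extent names and heat counts:

```python
# Illustrative model of tiering by heat: count I/Os per extent and keep the
# hottest extents on the flash tier, the rest on spinning disk.

def place_extents(io_counts, flash_slots):
    """Return (flash, disk) extent lists; the hottest extents go to flash."""
    ranked = sorted(io_counts, key=io_counts.get, reverse=True)
    return ranked[:flash_slots], ranked[flash_slots:]

# Ten extents; "e0" and "e1" receive the overwhelming majority of the I/Os,
# mirroring the 90%-of-I/Os-on-10%-of-data skew described in the text.
heat = {f"e{i}": 10 for i in range(10)}
heat["e0"], heat["e1"] = 500, 400

flash, disk = place_extents(heat, flash_slots=2)
print(flash)  # ['e0', 'e1']: a small amount of flash absorbs most of the I/O
```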
Since disks of various types are collected in one pool, there is less waste and off-cut. Pooling also provides wide striping across all disks in a pool, which increases performance (I/Os per second).
Built-in functions such as thin provisioning, real-time data compression, and other data reduction technologies are a standard part of the solution; all reduce capacity requirements for internal and attached storage, so less disk space is required.
This topic recalls the history of the IBM SAN Volume Controller 2145 storage engine, and highlights
key milestones in the storage engine hardware architecture.
(Portfolio slide: entry/mid-range, multi-cloud-ready offerings with simplified management and a flexible consumption model; virtualized enterprise-class, flash-optimized modular storage with heterogeneous data services and selectable data reduction; enterprise-class NVMe-accelerated systems; cloud service providers with large grid scale and full-time data reduction; and business-critical high-end systems with the deepest integration with IBM Z and IBM Power Systems, superior performance, highest availability, three-site/four-site replication, and industry-leading reliability. SVC offers enhanced data storage functions, economics, and flexibility with sophisticated virtualization; the FlashCore Module offers superior endurance, FIPS support, and hardware compression; FlashSystem targets application and database acceleration with extreme performance.)
Figure 1-20. IBM Systems Flash offerings portfolio for block storage
IBM offers an innovative range of high-performance storage systems across many different types of Flash storage solutions. With a common set of values, IBM Spectrum Virtualize offers storage solutions in the IBM Storwize V7000 models for entry to mid-range all-flash requirements, with the latest Storwize V7000 Next Gen model featuring an end-to-end Non-Volatile Memory Express (NVMe) offering, the same as the FlashSystem 9100. Each IBM Spectrum Virtualize family member
offers flash-driven application performance, scale-out clustering, simplified management, a flexible
consumption model, storage virtualization, a comprehensive set of enterprise data services and
optional data reduction capability.
The FlashSystem 9100 and the FlashSystem A9000 both use IBM's FlashCore technology, with the FS9100 using the NVMe FlashCore Module. Although SVC-SV1 is not listed, it is still an integral
part of the IBM Spectrum Virtualize solution using the FlashCore technology.
IBM FlashSystem A9000 shares a common set of values with the FlashSystem A9000R. Both use the Spectrum Accelerate code to deliver simplified management, large grid scale, enterprise capability for large VDI consolidation, on-premises clouds requiring QoS and multi-tenancy, ERP, CRM, SAP and Epic workloads, and full-time data reduction. In common with the Spectrum Virtualize offerings, they have a flexible consumption model.
In addition, each flash storage solution:
• Delivers sophisticated capabilities that are easy to deploy and help control costs for growing businesses.
• Provides performance-optimized flash to achieve the highest performance across a wide variety of workloads and environments.
• Maintains a strong focus on data reduction to store massive amounts of data in a minimum amount of space.
• In the majority of family members, offers cloud-centric abilities to support the shift to multi-cloud environments.
IBM FlashSystem 900 is positioned for clients with the very lowest latency workload requirements,
such as HPC, Analytics or even metadata storage for other applications and data protection
offerings, and can be paired with SVC to drive application acceleration across a heterogeneous
environment.
And last but not least, the DS8000 range for business critical, mainframe and multi-site replication
requirements, offering the highest levels of business continuity as well as OLTP workloads.
• Built-in redundancy
• Single 2U platform
• Approx. 39 kg (86 lb)
IBM Storwize V7000 has been redesigned as a virtualized, all-flash, powerful end-to-end Non-Volatile Memory Express (NVMe) hybrid storage system that combines the performance of IBM FlashCore technology, is built on the efficiency of IBM Spectrum Virtualize, and is delivered as a proven IBM software solution with extremely low latencies to support multi-cloud deployments. It also provides the intuitive IBM Storage Insights to help optimize your storage infrastructure using predictive analytics.
IBM Storwize V7000 systems are 19-inch rack-mount, 2U enclosures, loaded with built-in redundancy, and are Storage Class Memory (SCM) capable.
Each enclosure weighs approximately 46.6 kilograms (102.5 pounds) fully loaded. The node canisters weigh approximately 14 kilograms each.
The IBM Storwize V7000 system is introduced with two SFF NVMe control enclosure models: the IBM 2076 Storwize V7000 Model 724 for hybrid configurations, and the IBM 2076 Storwize V7000 Model U7B. The Model U7B is the Storwize V7000 hardware component utilized in the Storage Utility Offering space. It is physically and functionally identical to the V7000 Model 724, with the exception of target configurations and variable capacity billing.
Both Storwize V7000 models are designed to cater to the high demands of today's data-driven applications with NVMe capabilities at the lowest price point.
The Storwize V7000 models offer a three-year warranty and are customer installed and maintained. Each system consists of several Customer Replaceable Units (CRUs), such as cables, SFP transceivers, canisters, power supply units, batteries, drives, and the enclosure chassis, with one FRU (the system board only) whose replacement is supported by IBM Service Support Representatives (SSRs). Without a Same Day Warranty Service Upgrade, 2076 models carry next-business-day, 9 a.m. to 5 p.m., Limited On-site - Mandatory CRU service. With the Same Day Warranty Service Upgrade, clients can choose to have an SSR perform the repair action.
IBM Storwize V7000 solutions offer innovative all-flash arrays designed to deliver fast, flexible
storage, whether on-premises or in cloud or hybrid-cloud environments, with enough speed to
support today’s virtualization and machine-learning applications.
The key benefits of the Storwize V7000 include:
• Extreme performance: Offers up to 750,000 IOPS coupled with extremely low latency through NVMe-optimized technology in a single 2U control enclosure. The added benefits of Easy Tier and high-speed hardware compression all help to deliver greater performance.
• Resiliency: Full hardware redundancy with two nodes per controller, plus remote mirror and HyperSwap for high availability configurations. FlashCore technology provides greater flash endurance, ensuring continuous operations, data protection, and data security.
• Extreme capacity: With NVMe-optimized all-flash arrays, Storwize V7000 can provide up to 2 petabytes (PB) of effective storage in only 2U of rack space, and offers tremendous scaling and performance with up to a massive 32 petabytes of all-flash storage in a single industry-standard 42U rack. Consolidates different storage, with 440 arrays supported.
• Efficiency: Provides 2.7 times greater throughput than the previous generation of Storwize V7000. Offers Data Reduction Pools (DRP) with data deduplication and compression providing 25% more capacity, virtualization and pooling to balance use of resources, and Easy Tier to efficiently use different tiers of storage.
• Package of guarantees.
• Non-disruptive infrastructure modernization based on the capabilities of storage virtualization.
With the support of internal SAS-attached enclosures, IBM Storwize V7000 can scale up its storage capacity, delivering internal drives that provide a flash-optimized tiered capacity solution. Expansion enclosures are designed to be added dynamically with virtually no downtime, helping to respond quickly and seamlessly to growing capacity demands.
IBM SAS-attached expansion enclosures complement external storage, offering virtually limitless scalability while still maintaining high performance and reliability. SAS-attached enclosures can provide a cheaper all-flash array solution, with or without Easy Tier.
When adding IBM SAS expansion enclosures to a clustered system configuration, the expansion enclosures must be of the same machine type as the control enclosure they are attached to, and must run the IBM Spectrum Virtualize software.
Each control enclosure must also have a 12 Gb PCIe SAS adapter installed; this supports SAS connectivity between the expansions and the controllers.
When comparing the Storwize V7000 Gen 3 model to the previous generation Storwize V7000
Gen2+, the Gen3 control enclosure provides more powerful hardware that enables storage
capability for more data, more servers, more workloads featuring:
▪ Latest generation Intel processor.
▪ Up to 30% more performance than the Gen2+.
▪ 4x more cache to improve application responsiveness.
▪ Up to 2x number of 10 Gbps Ethernet ports for server and storage attachments.
▪ Protect investments by clustering with older generation hardware.
▪ And, complies with US Federal requirements (CFIUS and TAA) with its overall system
design.
Keywords
Battery modules
Control canister
Data-at-rest encryption
Fibre Channel (FC)
Hardware-assisted compression acceleration
IBM Spectrum Virtualize
iSCSI Extensions over RDMA (iSER)
iWARP (internet Wide-area RDMA Protocol)
Mirrored Boot Drives
RDMA over Converged Ethernet (RoCE)
Review questions (1 of 3)
1. Which of the following IBM Spectrum Virtualize features applies algorithms to reduce the capacity required to store a block of data?
a. Data Deduplication
b. Thin Provisioning
c. Compression
d. Tier Storage
2. Which of the following IBM Spectrum Virtualize features applies algorithms to reduce the number of times duplicated data blocks are stored?
a. Thin Provisioning
b. Compression
c. Data Deduplication
d. Tier Storage
Review answers (1 of 3)
1. Which of the following IBM Spectrum Virtualize features applies algorithms to reduce the capacity required to store a block of data?
a. Data Deduplication
b. Thin Provisioning
c. Compression
d. Tier Storage
The answer is Compression.
2. Which of the following IBM Spectrum Virtualize features applies algorithms to reduce the number of times duplicated data blocks are stored?
a. Thin Provisioning
b. Compression
c. Data Deduplication
d. Tier Storage
The answer is Data Deduplication.
Review questions (2 of 3)
3. Which of the following IBM Spectrum Virtualize features reduces storage cost for
allocated but unused storage?
a. Data Deduplication
b. Thin Provisioning
c. Compression
d. Tier Storage
4. Which of the following IBM Spectrum Virtualize features offers better performance by putting hot data on a faster (flash) tier?
a. Thin Provisioning
b. Compression
c. Easy Tier
d. Tier Storage
Review answers (2 of 3)
3. Which of the following IBM Spectrum Virtualize features reduces storage cost for
allocated but unused storage?
a. Data Deduplication
b. Thin Provisioning
c. Compression
d. Tier Storage
The answer is Thin Provisioning.
4. Which of the following IBM Spectrum Virtualize features offers better performance by putting hot data on a faster (flash) tier?
a. Thin Provisioning
b. Compression
c. Easy Tier
d. Tier Storage
The answer is Easy Tier.
Review questions (3 of 3)
5. Which of the following SVC storage engines requires the use of the 1U 2145 Uninterruptible Power Supply (UPS)?
a. IBM SVC DH8
b. IBM SVC CG8
c. IBM SVC-SV1
d. IBM SVC-CG4
6. Which of the following SVC storage engines inherited the nickname of “Big Fat Node” with its massive design change?
a. IBM SVC DH8
b. IBM SVC CG8
c. IBM SVC-SV1
d. IBM SVC-CG4
Review answers (3 of 3)
5. Which of the following SVC storage engines requires the use of the 1U 2145 Uninterruptible Power Supply (UPS)?
a. IBM SVC DH8
b. IBM SVC CG8
c. IBM SVC-SV1
d. IBM SVC-CG4
The answer is IBM SVC CG8. Each IBM SVC CG8 requires the support of a 2145 external Uninterruptible Power Supply (UPS) to provide temporary power while the contents of the SVC cache and the cluster information were written to the internal disk drive of each node of the SVC.
6. Which of the following SVC storage engines inherited the nickname of “Big Fat Node” with its massive design change?
a. IBM SVC DH8
b. IBM SVC CG8
c. IBM SVC-SV1
d. IBM SVC-CG4
The answer is IBM SVC-DH8. IBM SVC DH8 unveiled the 2U chassis.
Summary
Overview
This module introduces the hardware components and features that define the IBM Storwize
V7000 NVMe-attached drive control enclosures.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
IBM Storwize V7000 architecture overview © Copyright IBM Corporation
• Hardware characteristics
• Virtualize clustering
• Licensing and software features
This topic highlights components and features of the IBM Storwize V7000 2076-724.
Storwize V7000
• Specifications:
ƒ Dual Active-Active Array Controllers with NVMe to Flash Media
ƒ Dual-ported SFF twenty-four 2.5-inch NVMe Flash bays
í Supports redesigned 2.5-inch Flash Core Modules (FCM) and standard NVMe flash drives
í Supports intermix of IBM FlashCore Modules NVMe-attached Flash drives of different sizes
Front view
• MTM and Serial Number for identification (also labelled under the left bezel)
• Green = Power LED; Blue = Identify LED; Amber = Fault LED
• NVMe Flash bays are numbered 01 - 24
The Storwize V7000 Model 724 is a uniquely designed, low rack density, high-performance
control enclosure that builds on the enhancements of the IBM FlashSystem 9100. It contains
twenty-four 2.5-inch drive slots using NVMe interfaces to support NVMe-attached IBM FlashCore
Modules or self-encrypting NVMe-attached SSD drives as the basis of the storage array, all
compacted in a two-rack-unit chassis. An intermix of IBM FlashCore Module NVMe-attached Flash
drives of different sizes can be used simultaneously in a control enclosure. NVMe drives are
supported for use in control enclosures only (not supported for use in the expansion enclosures).
This visual identifies the components located in the front of the IBM Storwize V7000. In addition to
the 24 FCMs, each enclosure front features:
• The drives are accessible from the front of the control enclosure. NVMe Flash bays are
numbered 01 – 24. In addition, each drive has two ports that connect the drive to each canister.
The system automatically detects the drives that are attached to it. These drives are configured
into arrays and presented as MDisks. Each drive slot contains a Green Activity LED and an
Amber Fault LED (also blinks when a drive is being identified to user by software).
▪ The left bezel shows the Product name, MTM, and Serial Number for
identification. It also contains three system LED indicators: a Green Activity LED, a Blue Identify
LED, and an Amber Fault LED.
IBM Storwize V7000 FlashCore Modules integrate IBM MicroLatency technology, advanced flash
management, and reliability into a 2.5-inch SFF NVMe supporting PCIe 3.0 2x2 connectivity, with
built-in, performance-neutral hardware compression. Each FCM leverages the advantages of IBM
FlashCore-enhanced 3D TLC storage media that provides greater flash density and storage
capacity than multi-level cell (MLC) solutions. Along with the move to 3D TLC flash, the
purpose-engineered FCMs utilize powerful inline, hardware accelerated data-compression
technology using the Field Programmable Gate Arrays (FPGAs) within each FCM that provides
consistent, high-performance data compression across the full range of workloads. This approach
allows the FS 9100 to deliver the level of performance that you expect without compression, with
the added benefit of better utilization of the physical storage.
The FCMs also have updated FPGAs for better error handling and recovery. Each FPGA provides
inline, line-speed data compression for ultra-fast, always-on data reduction, and IBM has
added a read-ahead cache to reduce read latency on highly compressed pages.
IBM also took into consideration that higher-capacity modules generate more power. Therefore,
four-plane programming was added to reduce overall power during write operations, keeping the
same power profile as the Storwize V7000.
IBM Storwize V7000 uses enterprise-class, two-dimensional flash RAID technology, leveraging
both the patented Variable Stripe RAID with the enhanced system-level DRAID 6 to deliver 99.999
percent availability. Variable Stripe RAID maintains system performance and capacity in the event
of partial or full flash chip failures, helping reduce downtime and avoid system repairs. System-wide
DRAID 6 with hot spare and dual parity also helps prevent data loss and improves availability by
striping data across multiple drives. These data protection and system reliability features are
backed by a seven-year flash endurance guarantee to deploy in mission-critical environments.
Self-encrypting drives
• All NVMe drives used in the control enclosures are self-
encrypting
• SEDs comply with FIPS 140-2 as required by some
customers
• Data encryption is completed within the drive – no
impact to performance
• Encrypted drives can be erased using a single command
ƒ Replace the Data Encryption Key (DEK), or reset device to
factory default
• Supports automatic locks of encrypted drives when the
system or drive is powered down
• Drives automatically lock themselves on power loss
ƒ Access key is required at boot time to unlock and allow I/O
operations
The NVMe drives used in the control enclosures are self-encrypting drives (SEDs). The NVMe
drives are designed to support FIPS 140-2 Level 1 encryption with IBM Security Key Lifecycle
Manager (SKLM) centralized key management and full hot-swap capabilities. Spectrum Virtualize
allows you to use the SKLM, which uses key servers, to store Master Access Keys (MAKs). The
MAKs are required to read data that is already encrypted on the system, by providing access to the
Data Encryption Keys (DEKs) that are securely stored on the system.
Encryption of data is done in the electrical circuit of the drive without being impacted by
performance issues from software encryption. During write operations, the host I/O travels through
the software stack unencrypted, and is written out to the individual NVMe drives, where it is
encrypted.
Data encryption keys remain on the drive without being stored in system memory. Every encrypted
NVMe array is created with a unique Security ID (SID) as an access key for locking and unlocking
the self encrypting NVMe drive. When the SEDs are powered up, the system retrieves the key and
unlocks the drives. All SEDs in the same encrypted array share the same unlocking key.
An SED can be crypto-erased using a single command, to replace the DEK, or to revert the whole
device to its factory default settings. In addition, the system supports a security feature called
auto-lock, which protects against thieves plugging your drive into another system and accessing
your data. Drives also automatically lock themselves on power loss, which requires an access
key at boot time to unlock and allow I/O operations. When an SED is locked, no I/O operations are
possible, and it must be unlocked to read or write data.
• NVMe is a protocol designed specifically for flash technologies, offering lower latency, and less
complicated storage drive transport protocol than SAS. NVMe-attached drives support multiple
queues so that each CPU core can communicate directly with the drive. This avoids the latency
and overhead of core-core communication, to give the best performance. NVMe multi-queuing
supports the Remote Direct Memory Access (RDMA) queue pair model for fast system access
to host-attached iWARP or RoCE communications using iSCSI Extensions for RDMA (iSER).
• The local MDisks are classified as tier 0 flash, and drives that use the NVMe architecture
are also considered Tier 0 flash drives. For Easy Tier, Tier 0 flash drives are high-performance
flash drives that process read and write operations and provide faster access to data than
enterprise or nearline drives. For most Tier 0 flash drives, the system monitors their wear
level as they are used and issues warnings when a drive is nearing replacement.
• Endurance of the NVMe drives is covered while under a maintenance agreement: IBM
offers a flash endurance assurance program intended to address concerns associated with
using even the most demanding workloads with IBM flash technology.
• All NVMe drives report temperature and drive health metrics (see SV 7.8.1 updates). When a
flash drive exceeds the temperature warning threshold, the system identifies the flash
drive by this error and reports that its temperature is higher than the warning threshold.
• The FlashSystem has an internal mechanism to control the temperature in the specific
operating range and reacts when this range is left.
• FlashCore Modules will not throttle on temperature.
Flash Media
(Table: NVMe flash media, sourced from Toshiba and Samsung, listing each drive size with its
estimated capacity per drive with Data Reduction Pools and the maximum capacity in just 2U
with Data Reduction Pools.)
Storwize V7000 Flash media can easily provide significant amounts of capacity and capacity
savings, either by employing performance optimized hardware compression, or by using Data
Reduction Pools. The first table identifies engineering estimates of potential capacity for the various
FCM sizes and whether deduplication via DRPs is applied to the data. The second table provides
estimates for the NVMe drives. DRAID-6 is assumed. Actual savings will vary. Several factors can
have significant effect on capacity savings including:
• Compressibility of data
• How much data is duplicated within a DRP
• Configuration choice of using DRPs or not
• Configuration choice of using compressed volumes or not, if not using DRPs
• If FCMs are used or not
• The FCM size as compression savings vary among the different sizes
▪ Estimated compression savings for the 4.8 TB FCMs are 4.5:1 while it's 2.3:1 for the 9.6
and 19.2 TB FCMs
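As a back-of-the-envelope illustration, applying the estimated compression ratios above to the raw FCM sizes gives a rough per-module effective capacity. This is a sketch using only the figures quoted in the text; actual savings vary with the data.

```python
# Rough effective capacity per FCM, applying the estimated compression
# ratios quoted above: 4.5:1 for the 4.8 TB FCM, 2.3:1 for the 9.6 TB
# and 19.2 TB FCMs. Actual savings vary with the data.

fcm_ratios = {4.8: 4.5, 9.6: 2.3, 19.2: 2.3}   # raw TB -> compression ratio

for raw_tb, ratio in fcm_ratios.items():
    effective_tb = raw_tb * ratio
    print(f"{raw_tb} TB FCM -> ~{effective_tb:.1f} TB effective")
```

For example, the 4.8 TB module works out to roughly 21.6 TB effective, while the 19.2 TB module works out to roughly 44.2 TB.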
Flexibility is built into the Storwize V7000 architecture: you can choose IBM FCMs offering 4.8 TB,
9.6 TB, and 19.2 TB 3D TLC capacity points, or you can opt for industry-standard NVMe standard
form factor drives offering 1.92 TB, 3.84 TB, 7.68 TB, and 15.36 TB. You can also choose whether
to use DRPs or compressed volumes not in DRPs.
The tables provide engineering's best guidance. However, if you have a good idea of the
compression and deduplication savings, you can create your own estimate of expected capacity by
determining the RAID configuration, its raw capacity, and applying the compression and
deduplication savings. Plan on keeping about 15% of this space free for garbage collection, as
running out of space will lead to an outage and insufficient free space inhibits effective space
management.
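The estimation steps above can be sketched in a few lines: start from raw capacity, subtract RAID overhead, apply the compression and deduplication savings, then hold back about 15% for garbage collection. The ratios and the 0.875 DRAID-6 efficiency below are illustrative assumptions for the example, not IBM-published figures.

```python
# Rough effective-capacity estimate following the guidance above.
# All ratios here are illustrative assumptions; actual savings vary.

def effective_capacity_tb(raw_tb, raid_efficiency, compression_ratio,
                          dedup_ratio, gc_reserve=0.15):
    """Return an estimated usable effective capacity in TB."""
    usable = raw_tb * raid_efficiency            # after DRAID-6 parity/spares
    effective = usable * compression_ratio * dedup_ratio
    return effective * (1.0 - gc_reserve)        # keep ~15% free for GC

# Example: 24 x 19.2 TB FCMs, assumed DRAID-6 efficiency of 0.875,
# 2:1 compression, no deduplication.
print(round(effective_capacity_tb(24 * 19.2, 0.875, 2.0, 1.0), 1))  # 685.4
```

The 0.875 efficiency reproduces the 403.2 TB usable figure quoted later for a 460.8 TB raw configuration.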
This solution provides the ability to grow capacity up to 2 petabytes (PB) of effective storage in only
2U of rack space, depending on the data set characteristics. In an industry-standard 42U rack, the
FlashSystem 9100 delivers the ability to cluster, scale out, or scale up capacity and performance up
to a massive 32 petabytes of all-flash storage and up to 10 million IOPS. With IBM Storwize V7000
arrays equipped to support NVMe over Fabrics, this provides the ability to extend extremely low
latency across entire storage area networks.
ƒ If the system is configured with FCMs and you expect to store more duplicated data, it's
preferable to not configure compressed volumes
í The FCMs provide inline compression (FPGAs) and offload it from the Storwize V7000 system processors,
ensuring the best latency
ƒ If a volume contains data that is not compressible, then compressed volumes shouldn't be used
It's important that the administrator monitors actual space savings and keeps sufficient free space
for garbage collection and effective space management, as actual space savings may not be as
expected. The system can provide estimated savings for compressing a volume; however, there
are no tools for estimating deduplication savings.
Therefore, you will need to consider the following factors in choosing among the configuration
alternatives:
• If you expect to store more duplicated data, then you should use DRPs to save space and reduce
costs. This allows the system to reduce the amount of data that is stored on the storage
systems by reclaiming previously used storage resources that are no longer needed by host
systems.
• If the system is configured with FCMs and you expect to store more duplicated data, it's
preferable to not configure compressed volumes. This is because the FCMs provide inline
hardware compression through the FPGAs and offload it from the Storwize V7000 system
processors, which also ensures the best latency.
• If a volume contains data that is not compressible, then compressed volumes shouldn't be
used. Not all workloads are good candidates for compression since some data is already
compressed by design. Therefore, you should only implement compression for data with an
expected compression ratio of 45% or higher.
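A simple screening check for the guideline above might look like the following sketch, which treats the 45% figure as a savings threshold. The function name and the sample figures are hypothetical; in practice the system's built-in estimator supplies the expected savings.

```python
# Screening check for whether a volume is a reasonable compression
# candidate, per the ~45% savings guideline above. The function and the
# sample figures are illustrative, not part of the product CLI.

def worth_compressing(original_bytes, estimated_compressed_bytes,
                      threshold=0.45):
    savings = 1.0 - (estimated_compressed_bytes / original_bytes)
    return savings >= threshold

print(worth_compressing(100, 50))   # 50% savings -> True
print(worth_compressing(100, 80))   # 20% savings -> False
```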
Capacity [Terabytes]
(Chart: capacity for each FCM and NVMe drive size, comparing Usable Capacity with DRAID
rebuild area, Max Inline Compression, and Data Reduction Pools total.)
This diagram shows the astounding capacity capabilities of the IBM Storwize V7000 2U system
when using FCMs or NVMe standard Flash drives. For example, when populated with 24 NVMe
FCMs of 19.2 TB each, the system gives you a raw capacity of 460.8 TB. After DRAID-6 and
spare configurations, that leaves you 403.2 TB of usable capacity.
When you factor in the FCM's built-in hardware compression (with no performance penalty) at a
guaranteed ratio of 2:1, this yields roughly 800 TB of usable effective system capacity. Taking this a
step further with deduplication on the volumes you provision, you can achieve at least 1 PB, and
possibly beyond, in 2U for almost all workload types.
If you are using standard NVMe Flash drives, you can compress data using data reduction pools for
a 5:1 ratio using the inbuilt hardware compression assist engines in each of the control canisters.
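The arithmetic above can be checked directly. The raw and usable figures come from the text; the 2:1 step is the hardware-compression guarantee.

```python
# Worked check of the figures above for a fully populated 2U enclosure:
# 24 x 19.2 TB FCMs, with the DRAID-6 usable capacity of 403.2 TB taken
# from the text, then the 2:1 hardware-compression guarantee.

raw_tb = round(24 * 19.2, 1)   # 460.8 TB raw
usable_tb = 403.2              # after DRAID-6 and spare, per the text
effective_tb = usable_tb * 2.0 # 806.4 TB, quoted as ~800 TB effective

print(raw_tb, usable_tb, effective_tb)
```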
(Interior view callouts: DIMM Bank 1 and DIMM Bank 5, midplane connectors, and PCIe riser
card 1 in Slot 1.)
The IBM Storwize V7000 system is a fully IBM SSR installed product. There are no customer
installed parts.
This interior view of the IBM Storwize V7000 highlights the locations of each component. The
system contains:
• Five fan modules that are standard for each control canister.
• Two Intel 8-core processors in the 2U system.
• Each processor is attached directly to twelve DIMM slots for a total of 24 DIMM slots.
• Single flash boot M2 drive (optional to support dual flash boot drives).
• One Trusted Platform Module (TPM) which is a dedicated microcontroller designed to secure
hardware through integrated cryptographic keys.
• Adjacent to the TPM, is a complementary metal-oxide semiconductor (CMOS) power cell that is
used to keep the system time when there is no power to the canister.
• Three PCIe Gen 3 riser cards (supporting one adapter each), installed as standard to
support I/O connectivity.
• Two redundant AC power supplies (not shown).
(Block diagram callouts: dual Intel-SP processors with DDR4 memory, cross-card
communications, PCIe-3 x16 links, and USB.)
This visual illustrates the Storwize V7000 block diagram. Some of the key features include:
• A high performance, scalable PCI Express switching solution that optimizes performance per
watt for the FlashSystem 9100 dual-ported NVMe drive connections, system interconnects and
I/O expansions. It offers lower latency and higher bandwidth to the FCMs.
• The latest series of Intel Scalable Performance (SP) processors, with 16 GB/s of PCIe Gen
3 bandwidth per x16 link using 128b/130b encoding. PCIe Gen 3 supports the implementation of
higher-bandwidth protocols like 10 Gb Ethernet, 16 Gb Fibre Channel, and 25 Gb Ethernet
connections and beyond, without oversubscribing the bandwidth.
• The Direct Media Interface (DMI) links to the on-board Intel Lewisburg PCH, providing a faster
generation of QuickAssist supporting compression assist, and a significant improvement in
bandwidth to support up to four 10 GbE integrated ports with Remote Direct Memory Access
(RDMA) capabilities, the Flash M2 boot drives, a Trusted Platform Module (TPM) which can be
used for encryption, and the battery backup unit.
System-level compression
• FCM in-line hardware compression
ƒ Embedded in the FCMs' FPGAs
ƒ Does not process already-compressed data
• Control enclosure on-board hardware-assist compression
ƒ Embedded feature of the Intel Lewisburg PCH chip
ƒ Model 2076-724: 40 Gb/s hardware-assist compression
Each NVMe FlashCore Module implements its data compression and decompression algorithm
using integrated in-line hardware compression. This data reduction feature requires no processor
intervention as it is embedded in the FCM's Field Programmable Gate Arrays (FPGAs) and is
“always on.”
Compression and decompression are automatically performed on individual logical pages, and are
completely transparent above the NVMe FCMs except for management of space.
FCM in-line hardware compression will not process data that has already been compressed as part
of its normal workload. Data that is already compressed is analyzed and then written without doing
additional compression. However, storage administrators should not create compressed VDisks
when they are backed by the FCMs.
In-line hardware compression is faster compared to the existing supported software-level RTC
function, providing greater performance as well as cost savings. Standard manufactured NVMe
Flash drives do not support self-compression.
The Storwize V7000 control enclosure's Lewisburg PCH chip provides on-board hardware-assist
compression, which allows all system volumes and volumes created within a data reduction
pool (DRP) to be compressed without any standard Compression Accelerator adapter installed.
With the on-board compression, the Storwize Model 724 provides compression assist of 40
gigabits per second. The Model 724 compares favorably to the FlashSystem V9000 AC3 Control
Enclosure, which offers 40 Gb/s using two compression accelerator adapter cards.
Although both techniques help reduce I/O bandwidth consumption and increase the effective
capacity of the underlying storage device, compression in the FCM FPGAs has much less latency
than the compression that is provided in the Intel Lewisburg PCH, which is controlled by the
Spectrum Virtualize software.
The IBM Storwize V7000 system features dual Intel Scalable platform CPUs per controller canister,
offering higher per-core performance with eight cores per CPU at 1.7 GHz, for a total of 32 cores
per 2U system.
The IBM Storwize V7000 system can also be clustered up to 128 cores per 8U system, delivering
the highest performance and scalability for compute-intensive workloads across server, storage,
and network usages.
The Intel-SP processors are connected together using Ultra Path Interconnect (UPI) links, which
replace the older QuickPath Interconnect (QPI) links. Intel-SP offers foundational
enhancements that deliver 50 percent increased memory bandwidth and capacity, with six memory
channels for memory-intensive workloads. It expands the I/O capabilities with x16 lanes of PCIe 3.0
bandwidth and throughput (per adapter card) and to the PCIe switch for demanding I/O-intensive
workloads.
Each Storwize V7000 control canister system board contains 24 DDR4 DIMM slots, with each CPU
socket supporting up to 12 DIMMs (six memory channels with two DIMM slots per channel). Each
control canister supports 16 GB and 32 GB DIMMs, which can be installed in four distinct memory
configurations of four, eight, twelve, or twenty-four DIMMs. (Each canister must have the same
amount of memory and the same configuration.) A Storwize V7000 control canister ships standard
with four 16 GB DIMMs per canister for a total of 128 GB of system cache memory capacity.
Up to 576 GB of memory can be configured per control canister, giving a single cluster the
capability of 1.15 TB of cache using twelve 16 GB DIMMs and twelve 32 GB DIMMs in a 2U
system. This allows the Storwize V7000 2U storage arrays to leverage the performance and
efficiency of more than a terabyte of memory and multiple petabytes of storage, all moving at NVMe
speeds, to tackle even the most demanding real-time analytics or AI application workloads.
Memory capacity is used for a number of purposes, including read cache, the operating
system, deduplication, compression, and 12 GB of write cache. Storwize V7000 systems may also
require significantly more DRAM for deduplication and additional persistent storage.
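The cache-memory figures above can be verified with a quick calculation: the base configuration of four 16 GB DIMMs per canister across two canisters, and the maximum of twelve 16 GB plus twelve 32 GB DIMMs per canister.

```python
# Quick check of the cache-memory figures above for a 2U system with
# two control canisters.

DIMM_16, DIMM_32 = 16, 32

base_per_canister = 4 * DIMM_16                  # 64 GB
base_system = base_per_canister * 2              # 128 GB, as stated

max_per_canister = 12 * DIMM_16 + 12 * DIMM_32   # 576 GB
max_system = max_per_canister * 2                # 1152 GB, ~1.15 TB

print(base_system, max_per_canister, max_system)
```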
This image shows the location of each DIMM bank, the DIMM slots, and the CPUs. On the system
board, the DIMM slots for CPU 2 are rotated 180 degrees relative to CPU 1. The DIMM slots are
labeled according to their memory channel and slot; they are associated with the nearest CPU. To
maintain consistent airflow and cooling, each DIMM slot must contain either a memory module or a
filler blank.
(Table: DIMM slot population rules for CPU 1, covering slots D0/D1, A0/A1, B0/B1, and C0/C1
in DIMM Banks 1 and 2, showing which slots receive 16 GB or 32 GB DIMMs in each of the four
supported memory configurations.)
Each memory channel has two DIMM slots, numbered 0-1. For example, DIMM slots A0 and A1 are
in memory channel A. This chart identifies the DIMM slots on the system board that are managed by
CPU 1, defining the assigned slot numbers and the DIMM size population rules.
(Table: DIMM slot population rules for CPU 2, covering slots A0/A1, D0/D1, E0/E1, and F0/F1
in DIMM Banks 3 and 4, showing which slots receive 16 GB or 32 GB DIMMs in each of the four
supported memory configurations.)
This chart identifies the twelve DIMM slots on the system board that are managed by CPU 2,
defining the assigned slot numbers and the DIMM size population rules.
Mirrored boot drives
The IBM Storwize V7000 Model 724 node canister contains a single 120 GB M2 internal boot drive
with the option to add an additional boot drive.
In the event of a node power failure, the integrated batteries maintain node power long
enough to save hardened data to the single or dual boot drives. This functionality is referred to as a
Fire Hose Dump (FHD), allowing the system to quickly stripe non-volatile system data (such as
write cache) to the boot drives for a full system restore when power is returned. As an additional
feature, dual Flash boot drives mirror partitions for higher resiliency. Order of insertion is
important, as the single boot drive must be in Slot 1 for the system to boot.
In the event of a catastrophic node failure, in which there is a total boot drive failure (dual or single
boot drive), the node fails to boot entirely. However, because there are two active-active nodes in
the I/O group and NPIV is enabled by default, the partner node takes over the virtual ports from the
failed canister and maintains system I/O operations. Note that cache is disabled when the system
is not running redundantly.
Each Storwize V7000 node canister in a system contains a battery module that functions as a UPS
to provide sufficient power to the node canister to allow a systematic shutdown in the event of a
loss of power from the electrical utility.
The battery is maintained in a charged state by the battery subsystem. In this state, the battery can
save critical data and state information in two back-to-back power failures. This means that the
node canister can start immediately after the first power failure without waiting to recharge. After
rebooting, if the battery does not have enough charge for a node to save its internal state, the node
remains in service until the battery is charged sufficiently. The batteries periodically recondition to
maintain an accurate calibration of the full charge capacity of the battery. If the power to a node
fails, the node can write its configuration state and cache state to its internal boot drive(s) using the
power provided by the battery.
If both battery modules in the enclosure are not healthy or sufficiently charged, or if one battery is
removed and the still-installed battery is not healthy or sufficiently charged, the entire system is
placed in a service state. All I/O stops and there is no access to the data. This is a data-access
event, not a data-loss event.
To display information about the battery in the command-line interface, use the
lsenclosurebattery command.
The lifespan of each battery module is approximately five years, with a two-year shelf life. The
lifespan might be consumed in less than five years depending on how often the battery modules
are used during system shutdowns. The system Event Log provides notification messages 120
or 240 days prior to the end-of-life date.
Approximately every 3 months and depending on the number of discharge cycles, each battery is
automatically reconditioned to measure the battery capacity. Batteries in the same enclosure are
not reconditioned within two days of each other. As a battery ages, it loses capacity. When a battery
no longer has capacity (which is below the planned threshold) to protect against two power loss
events, it reports the battery End Of Life event and it should be replaced.
Each battery provides power only for the canister in which it is installed. If a battery fails, the
canister goes offline and reports a node error. The single running canister destages its cache and
runs the I/O group in write-through mode until its partner canister is repaired and online. The
system sends an event log notification that a battery needs reconditioning or recalibrating.
However, reconditioning is rescheduled or canceled if the system loses redundancy. The
reconditioning feature is automatically disabled. The customer must enable it to provided
reconditioning or recalibration of the IBM FlashSystem 9100. The default is OFF and there is no
change during a code upgrade. All IBM Storwize V7000 systems are set to OFF during
manufacture. After the machine is installed, the battery must enabled for reconditioning by setting
the option to ON.
To access information about the battery in the management GUI, select Monitoring > System. On
the System - Overview page, click the directional arrow next to the enclosure that contains the
battery module. On the Enclosure Details page, select Battery Module under Internal Components to
display information about the battery module. To display information about the battery in the
command-line interface, use the lsenclosurebattery command.
This image shows a close-up of the Storwize V7000 control canister fan modules.
To maintain system cooling, each Storwize V7000 controller canister utilizes five integrated
dual-motor fan modules, each with a connecting cable that plugs onto the motherboard. Each fan
module is a speed-controlled, N+1 redundant, counter-rotating fan unit.
The five FRU fan modules are housed within a fan cage (also referred to as fan banks), which is
covered by an air baffle to help moderate the airflow. Fan modules are numbered 1 through 3 from
the left side of the bezel and 4 through 5 toward the right side (from the front view of the chassis).
Fan modules are not hot-swappable components and cannot be removed while the system is in
operation. To remove a fan, the node must be powered off to prevent the hosts from losing access
to data in volumes; this ensures that the partner node in the I/O group takes over all I/O group
operations.
Air flow through the canisters is channeled through the fans by an air dam that runs the width of the
canister and is interspersed with each fan module. Louvres on the rear of the module are free to
open under the pressure of the expelled air, and close under gravity to prevent backflow if a fan
fails.
The Storwize V7000 control enclosure includes dual redundant 2000-watt AC high-efficiency
power supply units (PSUs) located on the rear of the unit. The power supplies are vertically
positioned, with PSU 1 on the left side of the enclosure and PSU 2 on the right. These power
supplies are auto-sensing and can be connected to 200-240 V AC.
Each power supply is seated on a power interposer that forms part of the PSU slot and helps fill the
space between the PSU and the midplane connection. The blue touch point on each PSU and
power interposer indicates that it is a warm-swappable component (meaning one at a time). In the
event of a power supply failure, do not operate the enclosure without a power interposer and PSU
in a PSU slot for longer than 5 minutes; operating for longer than this period might cause the
control enclosure to shut down due to overheating. As long as one power supply is functional, a
failed power supply can be replaced without software intervention by following the directed
maintenance procedure while the unit is still in operation.
There are various power cables available for regional main outlets, and two different PDU
connector cables are available for connection to 10 amp or 16 amp PDU outlets.
Each power supply has three status indication LEDs, two green and one amber reporting its health
status.
The system airflow is from the front to the rear of each enclosure. The airflow passes between drive
carriers as it is drawn through the control enclosure by fans in each node canister and each power
supply. With the combined power and cooling modules, air is exhausted from the rear of each
canister.
To display information about the PSU in the command-line interface, use the lsenclosurepsu
command.
The IBM Storwize V7000 control enclosure rear view offers multiple external connectors for data,
video, and power components.
• Each control enclosure contains two power supply units (PSUs) for normal operation.
• Dual control canisters: control canister 1 is installed in the top slot at 180° rotation and control
canister 2 in the bottom slot. Therefore, the slots and ports on control canister 1 are
numbered right to left, while control canister 2 slots and ports are numbered left to right.
• Each control canister features:
▪ Up to three 16 lane PCIe Gen3 I/O slots per node (via riser cards) for a total of six active
host I/O adapters.
▪ There are also a number of port connectors that are used during normal operation for
service procedures:
- Total of eight 10 Gb Ethernet ports standard with Ports 3 and 4 used for 10 Gb iSCSI
Host I/O connectivity. On each node canister, Port 1 must be used for system
management IP and Service IP services. Port 2 can be used for a secondary
management IP.
- One 1 Gb Ethernet port for service technician also referred to as the Technician Port to
support DHCP/DNS for direct attach service management. You should connect to the
Technician port only when you are directed to do so by a service procedure or by an IBM
service representative. Ethernet cables are not supplied as part of your order. Ensure
the cables used meet the minimum standards for the Ethernet port type of the switch.
- One VGA video port
- Two rear USB 3.0 ports.
- And, a host of rear-panel indicators that consist of LEDs to indicate the status of the
Fibre Channel ports, Ethernet connection and activity, power, and electrical current.
Technician port
• Required for service access and cluster initialization
• Only way to reset the superuser password (can be disabled)
• Do not connect to network (switch) infrastructure
• Supports auto-negotiation and Auto-MDI-X (cross-cable is
obsolete)
• Provides DHCP addresses to connected computer and
responds to every URL entered
• Only superuser can log in to service menu
• Independent of configured service IP
• USB key based actions are still supported
IBM Storwize V7000 architecture overview © Copyright IBM Corporation
The purpose and key benefit of the Technician port is to simplify and ease the initial basic
configuration of the Spectrum Virtualize system by the local administrator or by service personnel.
This port runs a Dynamic Host Configuration Protocol (DHCP) server to facilitate
service/maintenance out-of-the-box. The Technician port (T-port) is a dedicated port that is marked
with a T (Ethernet port 4).
IBM Storwize V7000 scales I/O by adding adapters via six adapter slots using low-profile PCIe Gen3
riser cards (three per canister). The number of host connections varies, depending on the adapter
types and quantities installed. The following Storwize V7000 PCIe adapter quantities are
supported on the 2076-724/U7B control enclosure (other V7000 models vary in the quantity of
adapter slots and supported adapters):
A maximum of four 4-port 16 Gbps Fibre Channel adapters can be installed, using slots 1 and 2 in
both nodes. With a 4-port 16 Gb FC adapter, the Storwize V7000 node provides link speeds of 2, 4,
8, and 16 Gb, offering high throughput. A maximum of four 2-port 25 Gb Ethernet host interface
adapters, which use either the RDMA over Converged Ethernet (RoCE) networking protocol or the
Internet Wide-area RDMA Protocol (iWARP) networking protocol, can be installed using slots 1 and 2
in both nodes.
The IBM Statement of Direction (issued February 2018) states that the Storwize V7000 16 Gbit Fibre
Channel and 25 Gbit Ethernet host interface hardware is NVMe-oF ready.
Customers also have the option to support iSCSI host connection using the rear 10 Gbps Ethernet
ports 1 through 4 on each node.
A maximum of two 2-port 12 Gb SAS Expansion Enclosure Attach Cards can be installed, using only
slot 3 in both nodes. This is a 4-port card with only two ports active. The 12 Gb SAS adapter
supports IBM-attached SAS 2U and 5U expansion enclosures.
The system also includes an integrated hardware-assisted compression acceleration processor to
offload compression and deduplication. A minimum quantity of one 16 Gb FC adapter feature or
one 25 Gb Ethernet adapter feature is required. A filler must be installed in all unused PCIe slots.
PCIe population must be identical in both node canisters within a control enclosure.
IBM Spectrum Virtualize 32 Gb adapter support
32 Gb Fibre Channel adapter PCIe3 x8 2-port LP adapter
Improved bandwidth performance with a maximum of 56 Gbps
Improved in read/write latency over 16Gb adapters
IBM Spectrum Virtualize Storwize V5100, Storwize V7000 Gen3, FlashSystem 9110, and
FlashSystem 9150 support a 32 Gb Fibre Channel PCIe3 x8 2-port LP adapter that uses SR optics.
Each port can provide up to 32 Gb Fibre Channel functions simultaneously, offering a maximum
bandwidth performance of 56 Gbps, and offers improved read/write latency over 16 Gb adapters.
The adapter can be used in either a x8 or x16 PCIe slot in the system.
The 32 Gb adapter delivers enhanced performance with up to 2.6 million IOPS (650 K per port) and
up to 24,000 MB/s of aggregate throughput, while providing unsurpassed reliability and resiliency. It
provides advanced storage networking features, capable of supporting the most demanding
virtualized and private cloud environments, while fully using the capabilities of high-performance
FC, all-flash arrays, and demanding enterprise applications.
Each port provides single-initiator capability over a fiber link or, with NPIV, multiple-initiator
capability. The 32 Gb adapter can connect to 32 Gb, 16 Gb, and 8 Gb FC port speeds over the
fabric.
SAS cables
1.5 m 12 Gb SAS Cable (mSAS HD)
3 m 12 Gb SAS Cable (mSAS HD)
6 m 12 Gb SAS Cable (mSAS HD)
IBM Storwize V7000 supports flash-optimized tiered storage configurations for mixed workloads with
the optional 2076-12F, 2076-24F, and 2076-92F SAS expansion enclosures. A 4-port SAS Expansion
Enclosure Attach Card must be installed in slot 3. This feature is used to attach up to two
standard or high-density SAS expansion enclosures to a Storwize V7000 controller.
The 12 Gb SAS ports use the SAS interconnect and transport protocol, which defines the rules for
information exchange between all enclosures, including external devices that are connected to the
cluster. The 4-port SAS card does not support connection to Storwize V7000 or SAN Volume
Controller expansion enclosures.
The table lists the available 12 Gb SAS cable options used to attach the Storwize V7000
controllers to the SAS expansion enclosures: the IBM 1.5 m, 3 m, and 6 m 12 Gb SAS
cables (mSAS HD to mSAS).
The Storwize V7000 control enclosure uses the 80c product ID that includes the standard
worldwide node name (WWNN) and the worldwide port names (WWPNs). The Fibre Channel (FC)
port numbers and WWPNs depend on the type of adapters that are installed in the control
enclosure.
The image shows the FC port numbers for the Storwize V7000 control enclosure. You can match
each port number to the WWPN that is listed in the table.
The WWPN values are assigned as follows: 5005076810<SP>XXXX
• 5: The IEEE Network Address Authority field format number. This value identifies a registered
port name.
• 005076: The Organizationally Unique Identifier (OUI) for IBM.
• 810: The product unique identifier for a Storwize V7000 control canister.
• <SP>: The adapter slot/port ID number.
• XXXX: A unique number for each Storwize V7000 control enclosure in the system.
Note that the XXXX value is based on the control enclosure WWNN, which can be set via the
Service Assistant GUI or via the satask chvpd command when replacing a control enclosure, so
that the SAN zoning doesn't have to be changed when a control enclosure is replaced. Further,
when FC adapters are replaced, they pick up the WWPNs used by the previous adapters, based on
the node WWNN.
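The WWPN field layout can be sketched as a small parser. The concrete WWPN value used in the example is hypothetical (slot/port 01, enclosure number ABCD), chosen only to exercise the fields described above:

```python
def parse_wwpn(wwpn: str) -> dict:
    """Split a 16-digit Storwize V7000 WWPN (5005076810<SP>XXXX) into its fields."""
    assert len(wwpn) == 16, "a WWPN is 16 hexadecimal digits"
    return {
        "naa": wwpn[0],            # IEEE NAA field format number ("5")
        "oui": wwpn[1:7],          # IBM OUI ("005076")
        "product": wwpn[7:10],     # product unique identifier ("810")
        "slot_port": wwpn[10:12],  # adapter slot/port ID number (<SP>)
        "unique": wwpn[12:16],     # unique per control enclosure, from the WWNN
    }

# Hypothetical example WWPN: slot/port 01, enclosure-unique value ABCD
fields = parse_wwpn("5005076810" + "01" + "ABCD")
print(fields["oui"])  # 005076
```

Because the XXXX field derives from the WWNN, all ports of one control enclosure share the same trailing four digits.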
This table identifies the fabric types that can be used for communicating between host and node,
node and storage system, and node and node. These fabric types can be used at the same time.
Note that iSCSI runs at all speeds, whereas RoCE and iWARP run only at 25 Gb.
The latency for the iSER interfaces is slightly lower than for Fibre Channel; however, new clients
who are setting cloud connectivity requirements are more likely to choose iSER Ethernet-based
implementations.
IBM Storwize V7000 does not support Fibre Channel over Ethernet.
This topic highlights components and features of the IBM Storwize V7000 2076-724.
IBM Storwize V7000 uses the same virtualization as the other IBM Spectrum Storage offerings built
on IBM Spectrum Virtualize software. The Storwize V7000 system functions as a software-defined
storage virtualization layer between hosts and storage arrays, isolating application servers from
having direct associations with the physical storage systems. Internal storage and external storage
are configured as MDisks (Managed Disks) and placed in pools (also known as managed disk
groups) for balancing use of those resources. Virtual Disks or Volumes (or VDisks) are created from
the pools and presented to attached hosts, virtualizing the storage resources. Therefore, as the
central configuration point, all input/output (I/O) must flow through the Storwize V7000 virtualization
engine.
A single Storwize V7000 can scale its resources, pairing with external storage, networking,
virtualization, and management for a single infrastructure system solution. Storwize V7000 uses an
in-band approach to provide block-level aggregation and volume management for all storage
within the SAN, enabling enterprises to centralize storage provisioning with a single point of control.
This inline (or gateway) virtualization approach also allows for non-disruptive replacement of any
part in the storage infrastructure, including the node canisters themselves, for a redundant, modular,
and scalable solution.
Slide highlights (2076): Cognitive; Cluster-Style Control and Scalability; Easy Tier; Investment
Protection; Extensive Interoperability Support – 440+; Added value to legacy systems
IBM Storwize V7000 is a scalable solution that can be configured as a single or multi-clustered
solution to scale up for more capacity and scale out for more performance and capacity to
expand the virtualized system. You can scale storage capacity by using either SAS-attached
SSDs/HDDs or external storage controllers/systems.
IBM Storwize V7000 uses IBM Spectrum Virtualize, which is the system foundation that provides a
rich set of shared enterprise-class data services with extensive interoperability support for over 440
heterogeneous storage arrays from multiple vendors. All storage enclosures attached to the IBM
Storwize V7000 system deliver IBM Spectrum Virtualize enterprise-class, advanced storage
capabilities and advanced data services.
Different ways to cluster an IBM Spectrum Virtualize system
• Up to 4 I/O groups can be clustered into one logical system to scale workload
• Storwize V7000 can cluster with V7000 Gen2+, FlashSystem 9100, and SVC-SV1/DH8 control
enclosures
  - Provides no hardware upgrade path – only a data migration path
• Fibre Channel, Fibre Channel over Ethernet (FCoE), or iSCSI is required
  - Control enclosures require Ethernet connections for Spectrum Virtualize management
  - Same level of Spectrum Virtualize code is required on all cluster I/O groups (except during upgrades)
• Intermixing FlashSystem V9000 with Storwize V7000 is not supported
• A Spectrum Virtualize cluster can virtualize another cluster behind it as external storage
• Cluster across sites:
  - Enhanced Stretched Cluster (SAN fabric)
  - Remote mirroring across sites
  - HyperSwap clustered
There are several ways to cluster an IBM Spectrum Virtualize system such as the IBM Storwize
V7000. One way is to create a cluster with four I/O groups, which is a maximum of eight nodes.
Another is to virtualize one Spectrum Virtualize system cluster behind another. The last way is
remote mirroring.
Figure: Scale-up (adds capacity) – up to 6 12F, up to 20 24F, up to 8 92F (or up to 4 92F), and up
to 8 AE3 enclosures; Scale-out (adds controllers, capacity, and performance) – 1X to 4X control
enclosures; up to 32 PB effective capacity with hardware compression.
Each I/O group has its own bandwidth limits (in both MB/s and IOPS); to scale workload, one
simply adds up to a total of four I/O groups. Workload can be moved from one I/O group to another
to balance the workload across I/O groups. Bandwidth is also affected by the number and types of
adapters in the I/O groups.
The nodes in the I/O groups might have different performance bandwidths, thus offering customers
a range of performance options.
The Storwize V7000 can deliver internal storage resources with the attachment of flash, SAS, and
NL-SAS drives in 2U and 5U expansion enclosures. External storage systems, such as the IBM
FlashSystem 900 all-flash storage enclosure, can provide fast and resilient storage that minimizes
latency by using the IBM FlashCore hardware-accelerated architecture and IBM MicroLatency
modules. Regardless of the configuration, a single optimized array can provide a maximum of up to
32 PB of effective storage in only a 2U system with the attachment of any combination of storage
enclosures.
This image illustrates the increments of IBM Storwize V7000 systems in scale-up and scale-out
configurations of up to four scalable systems. It also shows that additional storage enclosures can
be added to each scalable system configuration. Expansion enclosures can be added dynamically
with virtually no downtime, helping to quickly and seamlessly respond to growing capacity demands.
Performance of IBM Spectrum Virtualize nodes varies across the family, so solutions can be sized
to customer I/O workloads; I/O adapters can be added to scale I/O performance as well.
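Effective capacity figures such as the 32 PB above combine physical usable capacity with a data-reduction ratio. As a rough illustration of that arithmetic (the capacity and ratio below are hypothetical example numbers, not sizing guidance):

```python
def effective_capacity_tb(usable_tb: float, reduction_ratio: float) -> float:
    """Effective capacity = usable physical capacity x data-reduction ratio."""
    return usable_tb * reduction_ratio

# Hypothetical: 1000 TB of usable flash with an assumed 5:1 reduction ratio
print(effective_capacity_tb(1000.0, 5.0))  # 5000.0 TB effective
```

Real data-reduction ratios depend entirely on the workload's compressibility, so sizing should be based on measured or estimated ratios for the actual data.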
I/O port counts per V7000 Gen3 enclosure
• Up to 64 host ports (depending on the host type)
  ▪ Host ports can also be used for inter-node and external storage I/O
• Storwize V7000 control enclosures support host connectivity
  ▪ Direct host attachment to storage enclosures is not supported
  ▪ Supports FC-SAN, FC-P2P, or iSCSI topology for host connectivity
I/O Groups | 16 Gb FC / FC-NVMe | 25 Gb ETH | 16 Gb FC and 25 Gb ETH
1X | 16 | 8,4,8 | 8,4
2X | 24 | 16,8,16 | 16,8
3X | 32 | 24 | 24,12
4X | 64 | 32 | 32,16
The capability to support multiple cards allows the Storwize V7000 control enclosure to dramatically
increase the I/O bandwidth, and gives administrators flexibility to dedicate I/O ports for
host/external storage attachment, communications between Storwize V7000 controllers, or for
inter-site communications. More ports support more I/O bandwidth. The chart lists the maximum I/O
ports that are supported for one to four I/O groups.
All host connections are made to the control enclosures. IBM Storwize V7000 control enclosures
require either native Fibre Channel (FC) storage area network (SAN), FC Point to Point (P2P), or
iSCSI topology for host connectivity.
Port counts for other V7000 models and other Spectrum Virtualize nodes vary depending on the
number and types of supported adapters and integrated ports.
Max Limitations and Best Practice recommendations
• Storwize V7000 Limits and Restrictions, Supported Hardware, and Product Documentation URL
https://www-01.ibm.com/support/docview.wss?uid=ibm10741421
With every Storwize V7000 release, a collection of supported hardware and recommended
software level pages are maintained and published to ensure that your environment is running at an
optimal level. This page contains a series of links to these pages, by version, for all supported
Storwize V7000 releases in a single place for your convenience. For each release, a cumulative list
of Problems Resolved and New Features can be found in the code download document.
If the new enclosure is cabled and zoned correctly to the Storwize V7000 SAN network, the GUI
(Spectrum Virtualize V8.2.1) Monitoring > System-Overview page presents the option Add
Enclosure. Next, complete the instructions in the Add Enclosure wizard to configure the new control
enclosure into your existing system.
No system initialization is required when another Spectrum Virtualize system is added to the
Storwize V7000 system. If the system does not appear, recheck the system zoning and cabling.
IBM Storwize V7000 Models 724 and U7B require IBM Spectrum Virtualize for Storwize V7000
Software V8.2.0.2, or later, for operation. Licenses are required per Controller, Expansion, and
External Data Virtualization.
IBM Spectrum Virtualize for Storwize V7000 software is preloaded by IBM on Storwize V7000
machines. Use of the software is entitled through the acquisition of IBM Spectrum Virtualize
software licenses. Base virtualization is licensed by the physical usable capacity that Spectrum
Virtualize is managing: basically, the number of drawers of storage and the storage type (SCUs, or
storage capacity units) per Storwize V7000 system.
IBM Storwize V7000 Models 724 and U7B support external virtualization. Use of the external
virtualization capability is entitled through the acquisition of IBM Spectrum Virtualize Software for
SAN Volume Controller (SW PID 5641-VC8 in AAS and SW PID 5725-M19 in Passport
Advantage).
Storwize V7000 Models 724 and U7B require IBM Multi-Cloud starter software for Storwize V7000
(SW PIDs 5639-MC4, 5639-MC5, and 5639-MC6).
In a mixed clustered environment, additional licenses for Storwize V7000 are also required. You
must ensure that the Spectrum Virtualize system to be added to the cluster is licensed on an
enclosure basis for base, copy services, and compression. Additionally, if the Spectrum Virtualize
system to be added manages externally virtualized storage, then the licenses for this capacity must
be converted to storage capacity unit (SCU) or terabyte licenses. In cases of external virtualization
with copy services, there is no way to differentiate the virtualized and internal storage at the volume
level. This means that there is no easy way to determine the amount of external storage used in a
copy-services operation, and thus the amount of external virtualization copy-services license
required.
In addition to these licenses, if the Storwize V7000 system supports encryption through an optional
license, then the cluster system to be added must also have an encryption license before it can be
added to the Storwize V7000 clustered environment.
Multicloud solution ready
This chart lists the software features that are supported, included, and optional by licensing for IBM
Storwize V7000.
IBM Spectrum Virtualize combines a variety of IBM technologies, including deduplication,
compression, compaction, thin provisioning, and SCSI Unmap, HyperSwap (high-availability
solution), Easy Tier (automatic and dynamic data tiering), FlashCopy (snapshot), and remote data
replication. It also includes leading third-party technologies, such as Bridgeworks WANrockIT
network optimization. Encryption of internal and external virtualized capacities is also available
using a Feature code. Once enabled, it can be activated per storage pool. These technologies
enable Storwize V7000 to offer a rich set of functional capabilities and deliver extraordinary levels
of storage efficiency.
All of the Storwize V7000 functional capabilities shown are provided through IBM Spectrum
Virtualize Software for Storwize V7000.
In addition, IBM Storwize V7000 is capable of engaging in a multicloud world with the optional cloud
platform offerings that can be purchased to enhance your cloud storage environment.
Spectrum Virtualize licensing (Storwize V7000 only)
License per enclosure:
• Option 1 – FLEXIBLE OPTIONS: the Base license plus the advanced functions (Easy Tier,
FlashCopy, Remote Mirror, and Compression) licensed individually, per Controller, Expansion, and
External enclosure.
• Option 2 – FULL BUNDLE: the Base license plus a Full Bundle license that covers all advanced
functions, per Controller, Expansion, and External enclosure.
Basic software license types: CB = Controller-Based software; XB = Expansion-Based software;
EB = External virtualization.
IBM Storwize V7000 offers two ways of license procurement: fully flexible and bundled (license
packages). The license model is based on the license-per-enclosure concept known from the first
generation of IBM Storwize V7000; however, the second generation offers more flexibility to exactly
match your needs.
The base module is represented by the IBM Spectrum Virtualize family and is mandatory for every
controller, enclosure, or externally managed controller unit. For advanced functions, there is a
choice. The Full Bundle entitles the user to all advanced functions available on the system and
costs less than the sum of those licenses. The Full Bundle is the default pre-selection, as we
expect the majority of customers to select it for the value for money it offers.
We would expect almost all customers to be using Easy Tier and FlashCopy, and with the new
assurance and performance of Real-time Compression, we would again expect this to be sold in all
but the most exceptional situations.
• IBM Spectrum Virtualize Software for Storwize V7000 Controller Software V8.2.1 (5639-CB8)
provides core software functions, and is required in all Storwize V7000 offerings. This software
includes components that are installed on Storwize V7000 expansion enclosures (2076), but
licensing is based solely on the quantity of control enclosures that are included in the system.
• IBM Spectrum Virtualize Software for Storwize V7000 Expansion Software V8.2.1 (5639-XB8).
Each Storwize V7000 expansion enclosure (2076-12F/24F) requires one 5639-XB8 Storwize
V7000 Expansion Software license. The 2076-92F enclosure requires four 5639-XB8 Expansion
Software licenses.
The optional license for external storage covers all externally virtualized storage that is not
part of the Storwize V7000 machine type; such storage that does not have a 5639-RB8 license
requires a Storage Capacity Unit license.
• IBM Spectrum Virtualize Software for Storwize V7000 External Data Virtualization Software
V8.2.1 (5639-EB8).
Additional licensed features can be purchased on-demand either as a full software bundle or each
feature separately.
Differential Licensing
• Differential Licensing is used to calculate the license needed for a configuration
  ▪ License changes from 'per TB' to 'per SCU' (storage capacity unit) for externally virtualized storage
  ▪ SCU is defined in terms of the category of the storage capacity:
    - Category 1: Flash and SSD flash drives
    - Category 2: SAS drives, Fibre Channel drives, and systems that use drives with advanced
      architectures to deliver high-end storage performance
    - Category 3: Nearline SAS (NL-SAS) and Serial ATA (SATA) drives
    - Example: 92F High Density Storage Enclosures count as 4 SCU licenses per enclosure
• For each SCU, the following number of terabytes (TB) by storage classification applies:
  ▪ Flash = 1 SCU equates to 1.00 TB usable of Category 1
  ▪ FC/SAS = 1 SCU equates to 1.18 TB usable of Category 2
  ▪ NL/SATA = 1 SCU equates to 4.00 TB usable of Category 3
Starting with version 7.7 of IBM Spectrum Virtualize, Differential Licensing is used to calculate the
license needed for a configuration. With Differential Licensing, licenses change from per terabyte to
per storage capacity unit (SCU). SCUs are only needed for virtualized storage that does not have
the 5639-CB8 base license. SCU is defined in terms of the category of the storage capacity:
• Category 1: Flash and SSD flash drives
• Category 2: SAS drives, Fibre Channel drives, and systems that use drives with advanced
architectures to deliver high-end storage performance
• Category 3: Nearline SAS (NL-SAS) and Serial ATA (SATA) drives
For example, the 92F High Density Storage Enclosures count as 4 SCU licenses per enclosure.
Any storage use case that is not listed is classified as Category 1.
For each SCU, the following number of terabytes (TB) by storage classification applies:
• 1 SCU equates to 1.00 TB usable of Category 1
• 1 SCU equates to 1.18 TB usable of Category 2
• 1 SCU equates to 4.00 TB usable of Category 3
When you calculate the count of SCUs per category, fractions must be rounded up to the next
higher integer number. For the IBM Spectrum Virtualize Real-time Compression for external
storage software license, enough SCUs are required to cover actual managed disk capacity that is
used by the compressed volumes. FlashCopy and Remote Replication licensing are unchanged
and remain based on the virtual disk capacity.
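The SCU arithmetic above (TB-per-SCU by category, with fractions rounded up to the next whole SCU) can be sketched as follows. The capacities in the example configuration are hypothetical:

```python
import math

# TB of usable capacity covered by one SCU, per storage category (from the text)
TB_PER_SCU = {1: 1.00, 2: 1.18, 3: 4.00}

def scus_required(usable_tb: float, category: int) -> int:
    """Fractions must be rounded up to the next higher integer number of SCUs."""
    return math.ceil(usable_tb / TB_PER_SCU[category])

# Hypothetical configuration: 10 TB flash, 50 TB SAS, 100 TB NL-SAS
total = scus_required(10, 1) + scus_required(50, 2) + scus_required(100, 3)
print(total)  # 10 + 43 + 25 = 78 SCUs
```

Note how the Category 2 tier (50 / 1.18 = 42.37) rounds up to 43 SCUs, per the rounding rule above.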
Keywords
Control canister
Distributed RAID (DRAID)
Dual Flash Boot M2 Drives
External Virtualization
Flash Core Module (FCM)
IBM FlashCore Technology
IBM Multi-Cloud
IBM Spectrum Virtualize
IBM Storage Insights
IBM Storwize V7000 Model 724
IBM Storwize V7000 Model U7B
Internet Wide-area RDMA Protocol (iWARP)
iSCSI Extensions over RDMA (iSER)
Mirrored Flash Boot M2 Drive
NVM Express (NVMe)
On-board Compression Assist
PCIe Switch
RDMA over Converged Ethernet Protocol (RoCE)
Scale-out & Scale-up
Self-encrypting drives
Single Flash Boot M2 Drive
Traditional RAID (TRAID)
Review questions (1 of 4)
Review answers (1 of 4)
1. True or False: IBM Spectrum Virtualize in Storwize V7000 complements server
virtualization with technologies such as PowerVM, Microsoft Hyper-V, VMware
vSphere, Kubernetes and Docker.
The answer is True.
Review questions (2 of 4)
3. During a system power failure, the integrated batteries provide enough power to
allow the system to write data that is in __________ to the boot/dump drives.
A. Battery module(s)
B. NVMe flash drives
C. Volatile memory
4. True or False: IBM Storwize V7000 slots are numbered from left to right for both
nodes in a control enclosure.
Review answers (2 of 4)
3. During a system power failure, the integrated batteries provide enough
power to allow the system to write data that is in __________ to the
boot/dump drives.
A. Battery module(s)
B. NVMe flash drives
C. Volatile memory
The answer is C. IBM Storwize V7000 flash boot drives support a larger write cache capacity and
provide faster processing between the boot time and the 'dump to disk' time.
4. True or False: IBM Storwize V7000 slots are numbered from left to right for
both nodes in a control canister.
The answer is False. Control canister 1 is installed in the top slot at a 180° rotation. Therefore, the
slots and ports on control canister 1 are numbered right to left, and control canister 2 slots
and ports are numbered left to right.
Review questions (3 of 4)
5. How many SAS I/O adapters can be installed into a Storwize V7000 control
enclosure?
A. One
B. Two
C. Three
6. True or False: To provide a fully redundant IBM Storwize V7000 solution, you
must purchase two control enclosures.
Review answers (3 of 4)
5. How many SAS I/O adapters can be installed into a Storwize V7000 control
enclosure?
A. One
B. Two
C. Three
The answer is B (Two). When required, IBM Storwize V7000 supports a maximum of two 2-port 12 Gb
SAS Expansion Enclosure Attach Cards, installed only in slot 3 (one per control canister).
6. True or False: To provide a fully redundant IBM Storwize V7000 solution, you must
purchase two control enclosures.
The answer is False. A single Storwize V7000 control enclosure consists of two canisters or nodes,
which provides full redundancy.
Review questions (4 of 4)
7. True or False: The integrated 10 Gb Ethernet ports can be used for node-node
communications?
8. True or False: External storage can be either Fibre Channel or iSCSI attached?
Review answers (4 of 4)
7. True or False: The integrated 10 Gb Ethernet ports can be used for node-node
communications?
The answer is False, only FC or 25 Gb iSER can be used for node-node communications.
8. True or False: External storage can be either Fibre Channel or iSCSI attached?
The answer is True.
Summary
Overview
This unit introduces the key elements of the IBM FlashCore Technology Hardware Accelerated I/O,
IBM FlashCore Module, and Advanced Flash Management.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
• Summarize the attributes of IBM
FlashCore Technology
IBM FlashCore technology © Copyright IBM Corporation
IBM FlashCore Technology: Hardware Accelerated I/O
The topic highlights IBM FlashCore Technology, beginning with the Hardware Accelerated I/O
all-hardware data path, which delivers the highest performance and lowest latency for all-flash
storage arrays.
Today's smarter data center is built with systems that are increasingly instrumented,
interconnected, and intelligent, thus meeting the challenges of businesses that need to stay
responsive to dynamic environments.
IBM FlashCore technology innovations are fundamental to the FlashSystem 900, which is the
unified building block for the Storwize V7000 (724), FlashSystem 9100, FlashSystem V9000,
FlashSystem A9000, and FlashSystem A9000R storage systems. This engineered technology
maximizes flash's lightning-fast I/O speed while at the same time supplying the most reliable flash
storage available on the market today. This design enables the use of IBM-enhanced 3D
triple-level cell (TLC) NAND technology, which increases density and further reduces storage
costs.
The IBM FlashCore technology employs several IBM patented mechanisms to achieve greater
capacity and throughput at a lower cost than the previous generation of IBM FlashSystem systems,
resulting in improved endurance and reduced write amplification.
We will review the essentials of the FlashCore technology and its unique approach to
high-performance architecture, reliability, and efficiency with Hardware Accelerated I/O, the IBM
Flash Core Module, and Advanced Flash Management, which includes features such as Variable
Stripe RAID technology, IBM-engineered error correction codes, and proprietary garbage collection
algorithms.
FlashCore Hardware Accelerated I/O
FlashSystem Hardware Accelerated I/O is an advanced switch fabric that is built on a
flash-optimized, fully active/active design, with an end-to-end NVMe strategy in mind to bring
extremely low latencies to multi-cloud environments.
This high performance, scalable PCI Express switching solution helps optimize performance per
watt for the Storwize V7000 dual-ported NVMe Flash Core Modules and NVMe standard drive
connections, system interconnects and I/O expansions, allowing I/Os to pass through any port and
come out with the same number of hops.
The dual RAID canisters form a logical cluster with no single point of failure in the design (assuming
that all host connections have at least one path to each canister). Variable Stripe RAID
protection enables the NVMe Flash Core Modules to remove any bad plane that occurs during
use without impacting the available capacity of other devices within the RAID stripe.
In addition, each FCM leverages the advantages of IBM FlashCore-enhanced 3D triple-level cell
(TLC) storage media, providing greater flash density and storage capacity than multi-level cell
(MLC) solutions.
IBM Storwize V7000 maintains high resiliency with the latest generation of Micron flash,
providing the same life expectancy as all of our FlashCore products.
IBM FlashCore Technology: Flash Core Module
Our next sub-topic details the IBM Flash Core Module’s ability to deliver extreme performance,
greater density, unlimited scalability, and mission-critical reliability to the Storwize V7000 system.
IBM Flash Core Module leverages technology leadership
• IBM Engineered SFF 2.5” Flash Core Module with NVMe
protocol
Note: Flash Core Modules are not interchangeable with the IBM MicroLatency Modules.
FlashCore technology leverages the powerful characteristics of Micron TLC technology to
create an industry-leading enterprise storage solution. IBM has engineered and designed its own
FlashSystem MicroLatency Module flash storage to complement the hardware-accelerated
architecture at the controller level. To offer the fastest I/O response time, IBM's
MicroLatency Modules use the best industry-standard NAND data storage chips available today.
Leveraging this patented design, IBM delivers, in a smaller 2.5-inch footprint, the Flash Core Module
supporting PCIe 3.0 2x2 connectivity, offering the higher capacity, extreme performance, and reliability
advantages of a flash drive. The Flash Core Module exclusively utilizes the Non-Volatile Memory
Express (NVMe) drive protocol. It also includes powerful built-in, performance-neutral, inline
hardware-based data compression technology using the Gateway/Controller Field Programmable
Gate Arrays (FPGAs) within each FCM. This provides ultra-fast and always-on data reduction across
the full range of workloads, alongside self-encryption features. IBM has also added a read-ahead
cache to reduce read latency on highly compressed pages, improving overall performance.
In addition, this design consolidates multiple chips (PowerPC (PPC)) and updated FPGAs for
better error handling and recovery.
Flash Stacking
• Provides 3X the density compared to eMLC
ƒ Unique patented designs ensure maximum availability
ƒ Uses die stacking to increase density
ƒ Provides enterprise reliability, higher performance, and higher capacity
ƒ Supports flash wear guarantee
• Seven years of hardware support and optional post-warranty hardware maintenance for six years
Each FCM leverages the advantages of an IBM FlashCore-enhanced 3D Triple-Level Cell (TLC) flash
NAND chip that provides greater flash density and storage capacity than multi-level cell (MLC)
solutions. Implementing IBM-enhanced 3D NAND within each flash module provides three times higher
storage density than previous systems.
IBM-enhanced 3D TLC flash memory provides a major boost in enterprise reliability,
achieving higher performance and density than any of the previous Micron technologies. Although
TLC, with its 1 microsecond read latency, is still not quite as fast as DRAM, it is 10 times denser
than DRAM chips and 50% larger than the previous generation 20nm eMLC chips, which
means a higher capacity for more data storage.
The FCM implements a three-dimensional (3D) TLC flash chip stacking architecture, which takes the
previous flash chip's x-y coordinates and provides smaller cells that are now stacked on top of each
other, one after the other. IBM then added a z coordinate to expand the cells along a third
dimension, creating a larger, denser cell and a denser product.
TLC flash implements 3 bits per cell, which helps improve storage capacity by 50% in the latest
Flash Core Modules for better performance and wear.
With the latest generation of FlashSystems and its flash modules, you have up to seven (7) years of
total hardware support; this includes the applicable warranty period plus up to six years of optional
post-warranty hardware maintenance, which can be purchased with the system or at a later date.
The latest generation of IBM FlashSystem storage supports 3D TLC flash.
Benefits of FlashCore with 3D TLC NAND
• 3D TLC NAND flash
ƒ Program/Erase cycle spec: 5k cycles
ƒ Specification assumes strong error correction at 0.01 raw BER
• IBM FlashCore technology:
ƒ Voltage threshold calibration
ƒ Strong error correction
ƒ Health binning
• With FlashCore technology we can achieve 18.8k P/E cycles
• Raw endurance gain: 18.8/5.0 = 3.8x
• With 2.5-to-1 compression: over 9x
By comparison, today's NAND flash lasts for between 3,000 and 10,000 erase-write cycles.
IBM-enhanced 3D TLC can achieve roughly 5,000 program/erase cycles from a raw NAND
perspective, assuming strong error correction at a 0.01 raw bit error rate (BER). These cycles
are improved upon with voltage threshold calibration, inline hardware compression, strong
error correction, and health binning with garbage collection and wear leveling. Therefore, you
can achieve around 18.8k program/erase cycles when factoring in FlashCore Advanced Flash
Management: a raw endurance gain of 18.8/5.0, or 3.8x. Combined with 2.5-to-1 compression,
this improves TLC flash endurance more than 9x over standard implementations without
sacrificing latency.
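The endurance figures above reduce to simple arithmetic, which the short sketch below checks directly (the cycle counts are the values quoted on the slide; the 2.5:1 compression ratio is the slide's stated assumption):

```python
# Endurance-gain arithmetic using the cycle counts quoted above.
BASE_PE_CYCLES = 5_000        # raw 3D TLC NAND program/erase spec
FLASHCORE_PE_CYCLES = 18_800  # achieved with FlashCore media management
COMPRESSION_RATIO = 2.5       # assumed 2.5-to-1 data compression

raw_gain = FLASHCORE_PE_CYCLES / BASE_PE_CYCLES   # 18.8/5.0 = 3.76, ~3.8x
effective_gain = raw_gain * COMPRESSION_RATIO     # over 9x

print(f"raw endurance gain: {raw_gain:.2f}x")            # 3.76x
print(f"with 2.5:1 compression: {effective_gain:.1f}x")  # 9.4x
```

Compression multiplies endurance because a 2.5:1 ratio means each host write consumes only 40% as many physical program/erase operations.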
FPGAs in the data path
All operations to the flash modules are controlled by a single FPGA.
• Gateway interface FPGA
ƒ Responsible for providing I/O to the flash module
ƒ Sits on the card itself
ƒ Has dual connection to the backplane
• Flash controller functions
ƒ Controls I/O for multiple flash chips
ƒ Maintains write ordering and layout
ƒ Responsible for garbage collection, error handling, and VSR (along with PPC chip)
(Diagram: LBA-to-LPA mapping tables held in DRAM, in front of the NAND storage)
All I/O data transfers operations to the Flash Core Modules are processed and controlled by a
dedicated Gateway/Controller Interface Field Programmable Gate Arrays (FPGA) hardware-only
data path.
The Gateway/Controller Interface FPGA is located on the Flash Core Module and has two
connections to the backplane. It is also responsible for the following functions:
• Provides direct memory access (DMA) path and hardware I/O logic
• Uses lookup tables and a write buffer
• Maintains write ordering and layout
• Provides write setup
• Maintains garbage collection
• Provides error handling
As enterprise storage systems, FlashSystems use controller-level DRAM caching as a staging
area for data being accessed and as a means of holding frequently accessed data to provide faster
I/O performance than reading from or writing to disk or SSD. FlashCore technology
systems contain no disks or SSDs; all reads and writes operate at flash storage speeds.
Line speed data-at-rest encryption
• XTS-AES-256 drive-level encryption
ƒ Hardware encryption/decryption occur at internal data path line speed
• Zero performance impact
• Compression before encryption
ƒ Encryption and compression possible together
• USB-based keys or IBM Security Key Lifecycle Manager (SKLM) to simplify, centralize, and automate key management
ƒ Provides support for multiple redundant SKLM servers
• Apply encryption to all virtualized storage
ƒ Centralize encryption management
Protecting the business's most valuable asset: secure data at rest when drives are removed from the system.
Flash Core Modules utilize innovative line speed data-at-rest encryption that is implemented
below the RAID level in each module. The hardware-accelerated compression provides more
consistent data reduction than ever before across an even wider range of workloads, without
negatively impacting performance.
A dedicated chip inside each FCM provides AES-256 hardware-based data-at-rest encryption,
supporting the industry-standard XTS-AES-256 required by most leading compliance regulations such
as HIPAA and FIPS. Data-at-rest encryption protects against the potential exposure of sensitive
user data and user metadata stored on discarded or stolen flash modules.
Hardware encryption and decryption occur at internal data path line speed and have no impact on
I/O latency when in operation; with no performance degradation, it is much easier to
deploy data security for all-flash storage systems. Encryption of system data and metadata
is not required, so system data and metadata are not encrypted.
FCM hardware encryption supports key management using either USB-based keys or
IBM Security Key Lifecycle Manager (SKLM) to simplify, centralize, and automate key management.
IBM Spectrum Virtualize also features software encryption for all of the storage that doesn't use
hardware encryption. This can simplify your management as well as extend the life of your
existing investments while adding more security.
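The "compression before encryption" ordering on the slide matters: ciphertext is effectively incompressible, so encrypting first would defeat data reduction. The sketch below illustrates the ordering with Python's zlib and a toy hash-counter keystream standing in for the FCM's AES-XTS-256 hardware (the keystream is purely illustrative, not a real cipher):

```python
import hashlib
import zlib

def keystream(key: bytes, n: int) -> bytes:
    # Toy keystream standing in for the AES-XTS-256 engine (illustration
    # only, NOT cryptographically sound).
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def write_path(plaintext: bytes, key: bytes) -> bytes:
    compressed = zlib.compress(plaintext)                 # 1) compress first
    ks = keystream(key, len(compressed))
    return bytes(a ^ b for a, b in zip(compressed, ks))   # 2) then encrypt

def read_path(stored: bytes, key: bytes) -> bytes:
    ks = keystream(key, len(stored))
    compressed = bytes(a ^ b for a, b in zip(stored, ks))
    return zlib.decompress(compressed)

key = b"\x01" * 32
data = b"host block " * 400
stored = write_path(data, key)
assert read_path(stored, key) == data
# Encrypt-then-compress would gain nothing: the ciphertext looks random,
# so it compresses worse than the original plaintext does.
assert len(zlib.compress(stored)) > len(zlib.compress(data))
```

The final assertion is the whole point: once data has been encrypted it no longer has the redundancy a compressor needs, which is why the FCM compresses in the inbound path before the encryption stage.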
Inline hardware compression is the latest enhanced feature added to IBM
FlashCore Technology, now implemented in purpose-built TLC flash modules with inline
high-performance compression. FlashSystem inline hardware compression uses a data
compression/decompression algorithm based on the Modified Dynamic GZIP algorithm.
This technology originated with IBM Z and has been adapted to work in an IBM FlashCore flash
controller (FPGA chip). It is always on, providing faster compression with extremely low latency to
optimize performance and cost effectiveness.
Inline hardware compression can compress up to about 2.6 times the data, depending on the capacity
of the flash modules and the compressibility of the data at hand. For example, the 4.8 TB
flash modules provide the ability to compress 22 TB of data while still achieving the full 1.2
IOPS with this always-on compression feature, with zero performance impact.
IBM FlashCore Technology: Advanced Flash Management
• IBM FlashCore Technology
ƒ Hardware Accelerated I/O
ƒ IBM Flash Core Module
ƒ Advanced Flash Management
Our sub-topic discusses how IBM Storwize V7000 uses the IBM FlashCore Advanced Flash
Management to maintain flash data storage.
Advanced Flash Management is the final piece of IBM FlashCore technology, which strengthens
NAND endurance and performance reliability by using special-purpose hardware and patented
algorithms to extend the life of NAND memory. This includes the following:
• Built-in, performance-neutral hardware compression and encryption
• Next-generation 64-layer 3D TLC
• Outstanding data reliability
• Cognitive algorithms for wear leveling, health binning, heat segregation, and media management
• Intelligent media management that keeps settings ideal to prevent inconsistent performance
• Endurance without latency penalty
• FIPS 140 certification
• Self-protection on power loss
System-level DRAID
Module-level Variable Stripe RAID (VSR)
The system-level DRAID is managed by the centralized IBM Spectrum Virtualize system cluster and
provides protection against data loss and data unavailability resulting from flash module failures.
In addition, IBM FlashCore technology offers data protection using a combination of Variable Stripe
RAID technology (at the flash module level) and IBM-engineered error correction codes.
The module-level Variable Stripe RAID technology is a patented, highly granular DRAID-type data
protection arrangement and your first level of protection. It also complements the system-level
DRAID by allowing data to be rebuilt onto a hot spare flash module across each flash chip in the
system, so that flash modules can be replaced without data disruption.
In addition to the 2D Flash RAID technology, the FlashSystems are protected by IBM Engineered
ECC error correction, in which bit and block errors are managed by each module using
its chips.
Each Flash Core Module communicates with the system-level DRAID to reconstruct the data rather
than dedicating large amounts of spare flash per module in order to perform the RAID rebuild.
DRAID-6 is recommended due to its enhanced availability; it typically gives equivalent
performance to DRAID-5 in most scenarios. DRAID-5 is also supported, but the GUI only creates
DRAID-6 (as it is highly encouraged); DRAID-5 can only be configured using the CLI.
Implementing system-level DRAID enhances the solution's reliability. The high IOPS and low
latency that NVMe offers, along with the distributed spare space in the array, facilitate rebuilding
the array in the event an FCM fails, without impacting the hosts' performance, thus providing
consistent performance at all times.
DRAID initialization behavior
• Drives will low-level format when transitioning
from candidate to array member
When the DRAID starts its initialization process, FCMs perform a low-level format when
transitioning from candidate to array member. After all flash modules have been formatted, the
array-level initialization will have a relatively slow start and eventually speed up as it progresses.
Flash challenges and solutions
• Limited write cycles
ƒ Some logical blocks more frequently updated
ƒ Wear leveling
í Write updates sequentially to the flash
ƒ Overprovisioning
ƒ Variable stripe RAID
ƒ Health binning and heat segregation
ƒ ECC designed for flash
ƒ Voltage level shifting
ƒ Compression
• Write amplification
ƒ Physical write size a multiple of logical write size
ƒ Write updates sequentially and implement garbage collection
í Keep track of LBA to PBA via meta-data
• Erase required before write
ƒ Large blocks must be erased, leading to more write amplification
ƒ Write cache to address write latency
A major challenge for flash storage is the flash cell’s limited number of write cycles before it wears
out, which is exacerbated by the fact that some data is more frequently updated than other data.
So typically, a file system is kept for the flash, tracking logical block addresses (LBAs) to physical
block addresses (PBAs) in meta-data, whereby updates are written out sequentially to the flash.
This also addresses the write amplification that occurs with flash, where we have to erase large
amounts of flash as a group compared to the much smaller write sizes we often see from hosts. By
writing out updates sequentially to the storage rather than updating in place, we are performing a
form of wear leveling. Overprovisioning provides more flash cells than the Spectrum
Virtualize system sees from an FCM or SSD, using the extra redundant space to replace
worn-out flash cells. Variable Stripe RAID is a form of overprovisioning whereby failed flash chips
are bypassed and data is restriped onto that overprovisioned storage. The ECC designed for flash
facilitates tracking cell health and implementing health binning and segregation algorithms,
whereby hotter, more frequently updated data is directed to healthier cells and vice versa.
Compressing the data before writing it to flash also lengthens the life of the flash cells, by writing
less data. And voltage level shifting is used to adjust read voltage as the flash cells age,
lengthening their life.
Without write cache, writes to flash would have higher latency than reads because we must first
erase a cell before we can write to it. But the Spectrum Virtualize write cache hides this latency
from the host.
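The LBA-to-PBA mapping and sequential (out-of-place) updates described above can be sketched as a minimal flash translation layer. This is a toy one-page-per-write model for illustration, not the actual firmware; it shows how appending updates both levels wear and creates the stale pages that garbage collection later reclaims:

```python
class TinyFTL:
    """Minimal log-structured flash translation layer (illustration only).

    Host writes target logical block addresses (LBAs); the FTL always
    appends to the next free physical block address (PBA), updating a
    mapping table and invalidating the old location. Because no physical
    cell is updated in place, this is a simple form of wear leveling.
    """
    def __init__(self, physical_blocks: int):
        self.flash = [None] * physical_blocks   # physical media
        self.l2p = {}                           # LBA -> PBA mapping (meta-data)
        self.next_free = 0                      # sequential write pointer
        self.invalid = set()                    # stale PBAs awaiting GC

    def write(self, lba: int, data: str):
        if lba in self.l2p:
            self.invalid.add(self.l2p[lba])     # old copy is now garbage
        pba = self.next_free
        self.flash[pba] = data
        self.l2p[lba] = pba
        self.next_free += 1

    def read(self, lba: int) -> str:
        return self.flash[self.l2p[lba]]

ftl = TinyFTL(8)
ftl.write(0, "A0")
ftl.write(1, "B0")
ftl.write(0, "A1")          # update goes to a NEW physical block
assert ftl.read(0) == "A1"
assert ftl.l2p[0] == 2      # not rewritten in place
assert ftl.invalid == {0}   # old copy marked stale for garbage collection
```

A real FTL also tracks per-block erase counts and free pools; the essential idea, though, is exactly this indirection between logical and physical addresses.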
Variable Stripe RAID advantage
• Protects data from a chip failure
• Tracks failures at block levels
ƒ Parity treats each drive as a failure domain
• Dynamically re-stripes data at a sub-chip level
• Preserves life, protection, and performance
ƒ Performance across all spare capacity (no dedicated reserve)
ƒ Maximum level of flash module data protection
ƒ Maximum wear life
ƒ Fast writes and fast reads, faster rebuilds
ƒ Scalable, non-volatile, very low power
(Diagram: initial chip data usage versus layout after a failure)
The Variable Stripe RAID (VSR) recovery process is automatic and transparent to the user and administrator.
Variable Stripe RAID (VSR) is a patented IBM technology that provides maximum-level data
protection at the page, block, or chip level. VSR provides an intra-module RAID stripe on each
Flash Core Module (FCM), just like the MicroLatency modules in FlashSystem 900. This
technology eliminates the need to replace a whole flash module when a single chip or plane
fails. This, in turn, extends the life and endurance of flash modules and considerably reduces
maintenance events throughout the life of the system.
No system-level rebuild process is necessary to maintain data protection or usable capacity after a
failure caught by Variable Stripe RAID. Furthermore, the entire VSR recovery process is automatic
and transparent to the user and administrator, and typically takes place in less than a second.
Variable Stripe RAID activities are not normally tracked in system logs, but the root causes of the
failures typically handled by Variable Stripe RAID (plane failures and block failures) are
tracked in system counters and reflected in the overall flash module and system health metrics.
With Variable Stripe RAID, every flash controller creates a striped data layout across its set of chips,
similar to a 9+P+Q+S system-level DRAID array with rotating parity. When the Variable Stripe RAID
algorithm detects a failure affecting one or more flash chips in a stripe, the following
process happens:
• Data that is stored in the affected regions is reconstructed from the remaining data/parity
elements in the stripe.
• All pages in the affected stripe, including the reconstructed data, are moved to reserved space
(overprovisioned area).
• Subsequent requests for data in the affected stripe are directed to the new locations (now part
of the normal storage area in the system).
• The original location of the affected stripe is added to the available overprovisioned area as an
(n-1) + parity stripe. (For example, if the affected stripe was a 9+2 stripe, it becomes an 8+2
stripe.) No spare drive is required; DRAID only allows a single spare area, which is
associated with the effective capacity of each flash module.
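The reconstruction and stripe-narrowing steps above can be sketched with simple XOR parity. Real VSR uses the richer 9+P+Q+S layout described earlier; single-parity XOR is used here only to make the rebuild arithmetic visible:

```python
from functools import reduce

def xor(blocks):
    # Bytewise XOR across a list of equal-length byte strings.
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

# A simplified single-parity stripe across "chips" within one module.
data = [b"\x11", b"\x22", b"\x33"]
parity = xor(data)                      # 0x11 ^ 0x22 ^ 0x33 = 0x00

# The chip holding data[1] fails: rebuild its element from the
# surviving data elements plus parity.
rebuilt = xor([data[0], data[2], parity])
assert rebuilt == b"\x22"

# Restripe without the failed chip: the stripe narrows from 3+P to 2+P,
# analogous to a 9+2 stripe becoming an 8+2 stripe.
new_data = [data[0], rebuilt]
new_parity = xor(new_data)
assert xor([new_data[0], new_parity]) == rebuilt  # still recoverable
```

Because the narrowed stripe is rebuilt into overprovisioned space, no spare drive and no system-level rebuild are needed, which is exactly the property the bullet above describes.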
IBM Engineered ECC detection and correction
• IBM has implemented a stronger ECC code that does not require read-retry
• ECC in IBM FlashSystems maintains latency and results in large endurance improvements with no performance penalties
(Diagram: without IBM ECC, writes trigger repeated reread and rewrite cycles; with IBM ECC, each write completes without rereads)
IBM NAND technology all-flash storage systems use a strong error correcting code (ECC)
algorithm to protect data as it is accessed in flash memory, meeting flash reliability specifications.
IBM Engineered ECC algorithms are integral modules of the flash controllers in
storage systems. IBM enhanced multi-level cell flash memory, which stores 2 bits per cell and has
four states, requires more energy to manage the electrical charge during write/erase operations.
This means that the enhanced TLC flash requires even higher voltage, which degrades the
characteristics of its memory cells, requiring better error correction codes and mechanisms. The
error detection code and error correction code algorithms maintain data reliability by allowing
single-bit or multiple-bit corrections to the data that is stored. If the data is corrupted due to aging or
during the programming process, the error detection and correction algorithms
compensate for the errors to ensure the delivery of accurate data to the host application.
In addition, the latest IBM innovations provide the capability to handle most ECC activity using
hardware rather than software. Many systems rely on ECC hardware detection but may correct bit
errors using software functionality. However, software correction takes longer to fix bit errors, which
occur more frequently with higher density NAND chips. With FlashCore technology's designed
hardware ECC, the latest FlashSystem storage systems can take advantage of this high-density
but more volatile NAND memory without suffering any undue performance degradation. This
means that IBM ECC corrects information without having to reread the information or write
information back to the flash and read it a second time. Therefore you not only benefit from the
lower cost and higher density of the latest-generation NAND technologies but also gain
consistently high I/O performance from flash storage.
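As a minimal illustration of single-pass hardware-style correction, the classic Hamming(7,4) code locates and flips an erroneous bit directly from its syndrome, with no reread of the media. IBM's actual ECC is a proprietary, much stronger code; this sketch only demonstrates the principle of correcting on the first read:

```python
def hamming74_encode(d):
    """Encode 4 data bits into a 7-bit Hamming(7,4) codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]   # bit positions 1..7

def hamming74_correct(code):
    """Correct any single-bit error in one pass, then return the data."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]        # parity over positions 1,3,5,7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]        # parity over positions 2,3,6,7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]        # parity over positions 4,5,6,7
    syndrome = s1 + 2 * s2 + 4 * s3       # names the flipped position
    if syndrome:
        c[syndrome - 1] ^= 1              # fix it directly: no reread
    return [c[2], c[4], c[5], c[6]]

word = [1, 0, 1, 1]
code = hamming74_encode(word)
code[5] ^= 1                              # simulate one bit error in NAND
assert hamming74_correct(code) == word    # corrected without rereading
```

The syndrome computation is pure combinational logic, which is why codes of this family map so naturally onto hardware: detection, location, and correction all complete in the same pass over the data.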
Compression/decompression in the drive
• Combines LZ1 (LZ77) with a form of Pseudo Dynamic Huffman
• Compression and decompression are performed on individual logical pages
ƒ Compression is performed in the inbound data path, before any logical-to-physical mapping occurs
í Less data to transfer from/to the backend, making up for the small added latency (7 µs)
ƒ Decompression is performed as the last step in the outbound data path, immediately prior to returning the requested data
• Data protection (ECC) is implemented on top of compressed data
• Compression and decompression are completely transparent
Inline hardware compression is implemented below the system DRAID layer. It combines the
LZ1 lossless data compression algorithm, which maintains a sliding window, with a form of Pseudo
Dynamic Huffman coding. Compression and decompression are performed on individual logical pages:
• The compression is performed in the inbound data path, before any logical-to-physical mapping
occurs. This reduces the data to transfer to/from the backend, reducing the transfer time; this
reduction in transfer time offsets the time to process the compression/decompression
instructions. Only read latency is affected by the time to compress the data, since we have write
cache in the I/O group, but it is offset by the reduced transfer time.
• Inline hardware compression is capable of compressing data at up to approximately a 128:1
ratio. If the data is not compressible, the engine bypasses compression. During this process, no
data expansion can occur when attempting to compress uncompressible data.
• Inline compression occurs before the DDR write buffer; after that, the data is stored compressed
in the write buffer.
• Next, error correction code (ECC) is applied to the compressed data.
• Decompression is performed as the last step in the outbound data path. This is completed
immediately prior to returning the requested data.
Garbage collection
• Reclaim invalidated space due to out-of-place writes.
• Relocation of valid data leads to write amplification (WA).
ƒ Smarter data placement using heat segregation reduces write amplification
Garbage collection is a process that involves reclaiming space invalidated by out-of-place
writes. Once data has been written to a flash block, that block cannot be written again until it has
been erased, freeing it for reuse.
A block is much larger than a page; therefore, this process requires ongoing statistics to determine
how much of a block becomes invalidated as data is rewritten. Because garbage collection involves
time-consuming erase operations and numerous internal reads and writes, an ongoing garbage
collection process can stall incoming user requests until it completes. As a consequence of this
queuing delay, flash performance can be significantly degraded. IBM FlashSystem has
hardware capabilities that allow it to run garbage collection activities without
impacting I/O. This is achieved using heat segregation to reduce write amplification, together with a
simple algorithm that measures how much of each block is invalid. If a block that is selected for
erasure still has some valid pages, those pages are migrated to other blocks before erasing the
block. This process is commonly done in the background, or as needed if the system is low on
available space.
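A greedy victim-selection sketch shows the core trade-off in the process just described: picking the block with the fewest valid pages minimizes the relocation writes that cause write amplification. The data structures here are toy stand-ins, not the actual firmware state:

```python
def garbage_collect(blocks, valid):
    """Greedy GC sketch: erase the block holding the fewest valid pages,
    relocating those pages first (each relocation is an extra flash write)."""
    victim = min(blocks, key=lambda b: len(valid[b]))
    relocated = sorted(valid[victim])   # valid pages must be copied out
    valid[victim] = set()               # block erased, free for reuse
    return victim, relocated

blocks = ["b0", "b1", "b2"]
valid = {"b0": {1, 2, 3, 4}, "b1": {7}, "b2": {5, 6}}
victim, moved = garbage_collect(blocks, valid)
assert victim == "b1"   # fewest valid pages, so the least relocation work
assert moved == [7]     # only one page rewritten (low write amplification)
```

Heat segregation improves on this further: when pages invalidated at similar rates share a block, the chosen victim tends to be almost entirely stale, so even less valid data needs relocating.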
Wear leveling and health binning
• Equalize block health instead of P/E cycles
ƒ Dynamic wear leveling: Smarter data placement using health binning.
í Counts writes and dynamically remap blocks
í Enhances its endurance to keep performance consistent
í Ensures the same memory blocks are not overwritten
• Static wear leveling: Reduce to strict minimum to ensure retention targets
ƒ Predicts and projects the health status of each flash block
(Diagram: wear-leveling management tracks writes and block health across flash blocks)
IBM FlashCore technology offers continuous health monitoring of each flash block and performs
asymmetrical wear leveling and sub-chip tiering. This asymmetrical advanced wear-leveling
mechanism offers significantly higher endurance in real-life workloads and works in conjunction
with health binning.
Health binning prolongs the service life of erasable flash memory. IBM FlashCore technology uses
wear-leveling algorithms for smarter data placement so that erase cycles are equalized evenly
among all of the flash memory, thereby enhancing its endurance by up to 57%.
The wear-leveling algorithms can be classified into dynamic wear leveling and static wear leveling.
In combination with the over-provisioning algorithm, the goal of the wear-leveling algorithm is to
ensure that the same memory blocks are not overwritten too often. With this mechanism, the flash
controller distributes the erase and write cycles across all the flash memory blocks, taking
advantage of even more NAND storage to better preserve Storwize V7000 flash storage.
With the enhancement of 3D TLC, the flash controller still predicts and projects the health status of
each flash block, even predicting which blocks are healthier than others, and therefore places
the hot data on the healthiest blocks while the cold data resides on the less healthy blocks. With
this process, endurance is determined by the average health of the blocks instead of by the least
healthy block.
With continuous health management, IBM can actively improve storage health by adjusting
thresholds to make blocks healthier, which helps to prolong longevity.
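The placement idea above can be sketched by pairing the hottest data with the healthiest blocks. The heat and health scores here are hypothetical; the real algorithm works on continuously measured per-block statistics:

```python
def place_by_health(pages, blocks):
    """Health-binning sketch: hottest data goes to the healthiest blocks,
    coldest data to the least healthy ones."""
    hot_first = sorted(pages, key=lambda p: p["heat"], reverse=True)
    healthy_first = sorted(blocks, key=lambda b: b["health"], reverse=True)
    return {p["lba"]: b["id"] for p, b in zip(hot_first, healthy_first)}

pages = [{"lba": 0, "heat": 5},    # cold, rarely updated
         {"lba": 1, "heat": 90},   # hot, frequently updated
         {"lba": 2, "heat": 40}]
blocks = [{"id": "weak", "health": 0.2},
          {"id": "ok", "health": 0.6},
          {"id": "strong", "health": 0.9}]

placement = place_by_health(pages, blocks)
assert placement[1] == "strong"   # hottest page -> healthiest block
assert placement[0] == "weak"     # coldest page -> least healthy block
```

Directing the heaviest write traffic to the blocks best able to absorb it is what lets endurance track the average block health rather than the weakest block.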
Health segregation
• Continuously monitor block health and shift threshold voltages accordingly
ƒ Determines those blocks that are frequently used versus less frequently used
ƒ Group blocks based on heat access
• Actively narrow health distribution of all blocks with health binning
ƒ At end-of-life, remaining good blocks have only a few P/E cycles left
• Reduced write amplification
Blocks that reach the error correction capability of the ECC must be retired. These retired blocks
eat up over-provisioning and ultimately limit device endurance, even if there are still many good
blocks available.
With an ongoing collection of statistical data, IBM has improved the garbage collection process
even further by performing health segregation, a heat-level grouping that allows data with
approximately the same access heat to be written together, providing up to a 45% reduction in
write amplification.
One of the critical factors is write amplification, a measure of how often data written by the host
actually gets written to the flash. For every megabyte or gigabyte of host data, garbage collection
will write data some multiple of times, which can lead to high write amplification. The higher the
write amplification, the harder it is for garbage collection to keep up without interfering with host
writes and reads. When a relatively large number of stale pages is detected for the same address,
those pages are isolated into a separate set of flash blocks to reduce write amplification.
In transaction workloads such as Oracle databases, FlashCore technology can incur as little as
one third of the garbage-collection write amplification seen in standard SSDs. This not only
strengthens flash endurance but also provides consistently low latency.
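Write amplification itself is just a ratio, which makes the reduction claimed above easy to quantify. The workload numbers below are illustrative only:

```python
def write_amplification(host_bytes: float, flash_bytes: float) -> float:
    """WA = total bytes written to flash / bytes written by the host."""
    return flash_bytes / host_bytes

# Example: the host writes 1 GB while garbage collection relocates
# another 2 GB of still-valid pages, so the flash absorbs 3 GB in total.
wa = write_amplification(1.0, 3.0)
assert wa == 3.0
# A 45% reduction in write amplification, as claimed for heat
# segregation, would bring that same workload down to 1.65:
assert round(wa * (1 - 0.45), 2) == 1.65
```

Lower write amplification means fewer program/erase cycles consumed per host write, which is why it feeds directly into the endurance gains discussed earlier.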
Proactive voltage level shifting
• Flash Conditioning (proactive voltage-level shifting)
ƒ Dynamic read level shifting over life of flash blocks
ensures that the flash cells will have the longest possible
lifespan
ƒ Predictive techniques adjust internal flash settings in
advance, minimizing probability of uncorrectable errors
FlashCore technology provides proactive voltage-level shifting, which can adjust the voltage
level used to trigger flash cells. As flash cells age, FlashCore evaluates the health of each block,
implementing dynamic read-level shifting over the life of flash blocks to ensure that the flash cells
have the longest possible lifespan. It also uses predictive techniques to adjust internal flash
settings in advance, minimizing the probability of uncorrectable errors, and proactively determines
the best voltage levels to set for a block as it ages. This ensures proper operation, particularly
when weak cells might be involved.
Keywords
• FlashCore Technology
• Hardware accelerated I/O
• IBM MicroLatency Module
• IBM Flash Core Module
• Advanced Flash Management
• Latency
• Flash performance
• High IOPS
• High bandwidth
• Low latency
• Triple-level cell (TLC)
• Inline hardware compression
• System-level DRAID
• Hardware-only data path
• IBM Variable Stripe RAID
• IBM engineered ECC
• Advanced wear leveling
• IBM optimized overprovisioning
• Write buffer
• Hardware offload
• Line speed data-at-rest encryption
• Garbage collection
Review questions (1 of 2)
1. What are the three core principles of IBM FlashCore technology?
A. Gateway/Controller FPGA
B. IBM Hardware Accelerated I/O
C. IBM Flash Core Modules
D. IBM Advanced Management
Review answers (1 of 2)
1. What are the three core principles of IBM FlashCore technology?
The answers are IBM Hardware Accelerated I/O, IBM Flash Core Modules, and IBM
Advanced Management.
Review questions (2 of 2)
4. Which of the following RAID solutions provides an intra-module RAID stripe
within each Flash Core Module?
A. IBM Variable Stripe RAID
B. IBM engineered ECC
C. System-level RAID
D. Distributed RAID 6
5. Which of the following IBM Flash Core Advanced Flash Management features
allows data that has the same approximate access heat to be written together to
reduce write amplification?
A. Health binning
B. Garbage collection
C. IBM ECC
D. Health segregation
Review answers (2 of 2)
4. Which of the following RAID solutions provides an intra-module RAID stripe on
each Flash Core Module?
A. IBM Variable Stripe RAID
B. IBM engineered ECC
C. System-level RAID
D. Distributed RAID 6
The answer is IBM Variable Stripe RAID
5. Which of the following IBM Flash Core Advanced Flash Management features
allows data that has the same approximate access heat to be written together to
reduce write amplification?
A. Health binning
B. Garbage collection
C. IBM ECC
D. Health segregation
The answer is Health segregation.
Summary
• Summarize the attributes of IBM FlashCore Technology
Overview
This module identifies the IBM Storwize V7000 SAS-Attached expansion enclosure options.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
IBM Storwize V7000 SAS-Attached storage © Copyright IBM Corporation
This topic examines the IBM Storwize V7000 SAS expansion enclosures that can be configured to
scale up and scale out the storage system infrastructure.
IBM Storwize V7000 can scale up its storage capacity to deliver internal SAS-attached flash
drives using the IBM 2U SAS expansion enclosures and the IBM 5U SAS High Density Drawer.
• Expansion enclosures must be the same machine type as the control enclosure
• The control enclosure must have a 12 Gb SAS adapter card installed
• The 12 Gb SAS adapter supports SAS connectivity between the control enclosures and expansion enclosures
With the support of internal SAS-attached enclosures, IBM Storwize V7000 can scale up its storage
capacity to deliver internal flash drives providing tier 1 capacity solution. Expansion enclosures are
dynamically added with no downtime, helping to quickly and seamlessly respond to growing
capacity demands.
IBM SAS-attached expansion enclosures complement external storage, offering extensive
scalability while maintaining high performance and reliability. SAS-attached enclosures can
provide a lower-cost all-flash array solution, with or without Easy Tier.
When adding IBM SAS expansion enclosures to the system configuration, each expansion
enclosure must be of the same machine type as the control enclosure to which it is attached, and
must run the IBM Spectrum Virtualize software.
Each control enclosure must contain a 12 Gb PCIe SAS adapter. The SAS adapter provides SAS
connectivity between the expansion enclosures and the controllers.
IBM Spectrum Virtualize products are designed to deliver the benefits of storage virtualization and
advanced storage capabilities for environments from large enterprises to small businesses and
midmarket companies. You can scale up and scale out your storage infrastructure to start small and
pay as you grow for performance or capacity while still managing a single enterprise-class system.
A single Storwize V7000 control enclosure can support multiple attached SAS expansion
enclosures. Options include a 2U LFF Model 12F supporting twelve 3.5-inch SSD/HDD SAS
drives, a 2U SFF Model 24F supporting twenty-four 2.5-inch SSD/HDD drives, and a 5U LFF
Model 92F supporting ninety-two 3.5-inch drives. The different models of these expansion
enclosures can be intermixed within a V7000 system.
These 12 Gbps SAS expansion enclosures connect to a pair of SAS adapters, one in each control
enclosure/node in an I/O group. This configuration ensures availability if a node or SAS adapter fails.
IBM SAS-attached expansion enclosures offer significant scalability while maintaining high
performance and reliability, supporting various optional IBM disk drive expansion enclosure
options to complement external storage.
All IBM SAS-attached expansion enclosures appear in the system management GUI as internal
storage resources, which are directly attached to the system, with the same machine type as the
control enclosure.
With SAS attachment, hot-spare nodes or stretch clusters are not supported.
Expanding on an intuitive set of architectural design concepts, IBM repeats the same redundant
hardware components and features for its 2U and 5U SAS expansion enclosures, making many of
the units' components physically identical. Even though many of these components are
hot-swappable, they are intended to be replaced only when your system is not active (no I/O
operations). Each replaceable unit has its own removal procedure, and should be removed or
replaced only when you are directed to do so.
The SAS expansion canisters on the expansion enclosures are the same for all 2U and 5U
expansion enclosure models. Each component that makes up an expansion enclosure has
light-emitting diodes (LEDs) that indicate the status and the activity of the part.
Each expansion enclosure's SAS status LED indicators have the same meaning as the SAS port
LED indicators on its storage controller node. An expansion enclosure also houses the following
hardware: power supply units (PSUs) and drives.
This table identifies the IBM control enclosures by machine type and model number, and their
supported expansion models. The drive expansion enclosure models for 12 drives, 24 drives, and
92 drives are based on the IBM Systems Storage control enclosure model number and run the
IBM Spectrum Virtualize software.
Storwize V7000 offers three 12 Gb SAS expansion enclosure machine type/model numbers. The
Storwize V7000 LFF Expansion Enclosure Model 12F supports up to twelve 3.5-inch flash drives,
the Storwize V7000 SFF Expansion Enclosure Model 24F supports up to twenty-four 2.5-inch
flash drives, and the Storwize V7000 LFF HD Expansion Enclosure Model 92F supports up to
ninety-two flash drives in a 3.5-inch carrier. SFF and LFF HD expansion enclosures can be
intermixed within a Storwize V7000 system.
Intermixing expansion enclosures of different machine type/model numbers across products is
not supported. For example, Storwize V7000-branded expansion enclosures are not supported for
use with FlashSystem 9100, FlashSystem V9000, or SVC systems, and vice versa.
All control enclosures supporting the Model 92F drawer must be running at the IBM Spectrum
Virtualize V7.8 software level or higher, and all control enclosures supporting the Model AFF and
AF9 expansions must be running at the IBM Spectrum Virtualize V8.2 software level or higher.
Each expansion enclosure can be purchased with a 1-year or 3-year warranty. Optional warranty
service upgrades are available for enhanced levels of warranty service. Expansion enclosure
drives and cables are customer-replaceable units (CRUs), while some expansion components are
field-replaceable units (FRUs) replaced by IBM service personnel.
This topic introduces the characteristics of the optional IBM 2U expansion enclosures, 12-drive
model, and 24-drive model.
Installed drives have two LEDs: Fault and Activity
Twelve drive bays supporting 3.5-inch SAS drives
Canister LEDs: Power, Status, Fault
The IBM 2U SAS expansion model types 12F, 24F, and AFF are all housed in 19-inch rack-mount
enclosures. All drives are front loaded in drive bays that support:
▪ Model type 12F contains 12 slots supporting 3.5-inch large form factor SAS drives.
▪ Models AFF and 24F both contain 24 slots supporting 2.5-inch small form factor SAS drives.
IBM Storwize 5010, IBM Storwize 5030F, IBM Storwize V7000F, IBM Storwize V7000 (724),
and IBM FlashSystem 9100 expansion enclosures are not available in the twelve-drive
model type.
The 2U SAS expansion enclosure has several sets of LEDs that provide information about the
overall status of the enclosure, power, drives, fans, canisters, and SAS connections. Each installed
drive on the expansion enclosure has two light-emitting diode (LED) indicators, Fault and Activity;
drives have no controls or connectors. This enclosure is designed to accommodate many
terabytes of disk capacity and supports a mixture of supported drive models.
Canister 1: SAS port 1 (IN, required); SAS port 2 (not in use)
Canister 2: SAS port 1 (IN, required); SAS port 2 (not in use)
All 2U expansion enclosures contain two side-by-side expansion canisters to provide SAS
connectivity to the controller and expansion enclosures sequentially through two integrated 12 Gb
SAS In and Out ports. Integrated SAS ports provide redundant connection paths to the control
nodes in an I/O group and paths to other expansion enclosures in the chain.
The SAS ports are numbered from left to right as SAS port 1 and SAS port 2. SAS port 1 (the IN
port) connects, through up to four data channels, to the control enclosure or to the previous
expansion enclosure in the chain, with the control enclosure at the end of the chain. SAS port 2
(the OUT port) connects to the next expansion enclosure in the chain.
There are four indicator LEDs (two per SAS port) providing the port Link and Fault connection
status information, and three canister indicator LEDs (power, status, and fault) providing status
information for the expansion canister as a whole.
The enclosure also contains two side-by-side dual-redundant, hot-swappable, auto-sensing, AC
764-watt power supplies. Each power supply also contains a cooling module that exhausts air from
the rear of each canister, so that airflow passes between the drive carriers and through each
expansion canister.
The 2U expansion canisters do not cache volume data or store state information in volatile memory.
Therefore, expansion canisters do not require battery power. If AC power to both power supplies in
an expansion enclosure fails, the enclosure powers off. When AC power is restored to at least one
power supply, the enclosure restarts without operator intervention.
Although these are hot-swappable components, they are intended to be replaced only when your
system is not active (that is, no I/O operations). If your system is powered on and processing I/O
operations, go to the management GUI and follow the fix procedures. Initiating replacement
actions without the assistance of the fix procedures can result in loss of data or loss of access to
data. Before removing a drive, administrators should migrate data off it by using the GUI fix
procedures.
Model and type: 2076-12F/24F; PSU power: 764 W (x2); Input power requirements: AC 100 V to
240 V; Maximum input current: 10 A at 100 V, 6 A at 240 V; Maximum power output: 764 W;
Caloric value: 2607 BTU/hr
To ensure that the 2U expansion enclosures meet your environment's power requirements, this
table lists the power specifications for each power supply. The 2U expansion 764-watt power
supplies are auto-sensing and can be connected to 100 - 240 volt AC power. The power supply
has no physical power button or switch. Power is provided by plugging a 2.8-meter PDU C13-C14
power cable from each power supply unit into a single-phase (100 V - 240 V) electrical outlet. The
2U expansion enclosure has a British Thermal Unit (BTU) per-hour rating of approximately
2607 BTU/hr of heat dissipation.
A power supply should never be removed from an active enclosure until a replacement power
supply unit is ready to be installed. If a power supply unit is not installed, airflow through the
enclosure is reduced and the enclosure can overheat. Install the replacement power supply within
5 minutes of removing the faulty unit. The canister is ready with no critical errors when Power is
illuminated, Status is illuminated, and Fault is off.
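The BTU-per-hour ratings quoted in these tables follow directly from the PSU wattage. As a quick sketch of the arithmetic (using the standard conversion of 1 watt to approximately 3.412 BTU/hr):

```python
# Sketch: convert a PSU's rated wattage to the BTU/hr heat-dissipation
# figure used in the power tables (1 W is approximately 3.412 BTU/hr).
def watts_to_btu_per_hr(watts: float) -> float:
    """Return approximate heat dissipation in BTU per hour."""
    return watts * 3.412

# The 2U enclosure's 764 W supply works out to roughly 2607 BTU/hr,
# and the 5U drawer's 2400 W supply to roughly 8189 BTU/hr, matching
# the ratings quoted in the text.
print(round(watts_to_btu_per_hr(764)))   # 2607
print(round(watts_to_btu_per_hr(2400)))  # 8189
```

This is only the dissipation arithmetic; always size power and cooling from the official specifications.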
This topic introduces the characteristics of the optional IBM 5U High Density expansion Drawer
supporting 92 drives.
1U PSU fascia
The IBM 5U expansion enclosure is a high density expansion drawer in a 19-inch rack-mount
enclosure. The 5U model contains 92 drive slots supporting large form factor 3.5-inch drives. The
front of the chassis contains a 4U fascia that covers the front of the drive bay area, and a 1U fascia
covers the two side-by-side dual-redundant, hot-swappable, auto-sensing, AC 2400-watt power
supply units (PSUs). Each power supply contains a cooling module that cools the lower bay and
exhausts air from the rear of each canister.
Three display LED indicators are located near the mid-center of the chassis. These LEDs indicate
the enclosure's power status, identify status, and fault status as a whole.
The high density enclosure is designed to accommodate many terabytes of disk capacity, more
than four times the capacity of other drive enclosures.
Model and type: 2076-92F; PSU power: 2400 W (x2); Input power requirements: AC 200 - 240 V
(nominal; +/- 10% tolerant); Maximum input current: 12 A (x2, per-inlet redundancy); Maximum
power output: 2400 W; Caloric value: 8189 BTU/hr
This table lists the power specifications for each 5U expansion drawer power supply. The
dual-redundant 2400-watt auto-sensing high efficiency power supplies are front loaded and are
connected with C19-C20 power cables to the power connectors on the rear of the enclosure. In
addition, one or more C19 power distribution units (PDUs) are needed in the rack to connect
power to the 92F power supplies.
With the dual 2400-watt power supplies, optimal operation is achieved between 200 and 240 V AC
(nominal), with a minimum voltage of 200 V. The 5U expansion drawer has a British Thermal Unit
(BTU) per-hour rating of approximately 8189 BTU/hr of heat dissipation.
IBM 5U expansion drawer interior drive view shows the location of the 92 drive slots supporting
3.5-inch large form factor disk drives and two secondary expander modules. The two secondary
expanders are installed in the center of the drive bays.
The drive locations chart identifies the drive slots in the enclosure, which are numbered 1-92 from
left to right in rows A to G. The drive locations are also marked on the enclosure itself. The rows (A
to G) are marked on the left and right edges of the enclosure. The first column of drives (1 through
14) is marked on the front edge of the enclosure. The row and column marks are visible when the
top cover is removed. The drive slots must be populated sequentially, starting from the back-left
corner position (slot 1, grid A1). Sequentially install the drives in the slots from left to right and from
the back row to the front. Always complete a full row before installing drives in the next row.
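The sequential-fill rule above amounts to a simple invariant: the populated slots must always be exactly slots 1 through n, with no gaps. A minimal sketch of a checker for that rule (the helper name is illustrative, not an IBM tool):

```python
# Sketch (hypothetical helper): verify that the populated drive slots of a
# 5U 92F drawer follow the required fill order -- sequential from slot 1
# (grid A1), left to right, back row to front, with no slots skipped.
def slots_populated_sequentially(populated: set, total_slots: int = 92) -> bool:
    """Return True if the populated slots form a gap-free prefix 1..n."""
    if not populated:
        return True  # an empty drawer trivially satisfies the rule
    if min(populated) < 1 or max(populated) > total_slots:
        return False  # slot number outside the drawer
    # A valid fill is exactly the set {1, 2, ..., n} for some n.
    return populated == set(range(1, max(populated) + 1))

print(slots_populated_sequentially({1, 2, 3, 4}))  # True
print(slots_populated_sequentially({1, 2, 5}))     # False: slots 3-4 skipped
```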
The secondary expander modules provide SAS connectivity between the expansion canisters and
the drives. Each drive has 2 SAS ports. SAS port 1 of each drive is connected to secondary
expander module 1 and SAS port 2 of each drive is connected to secondary expander module 2.
Each expansion canister is connected to both secondary expander module 1 and secondary
expander module 2.
If secondary expander module 2 is missing or is faulty, the expansion canisters can communicate
only with SAS port 1 on each drive. Similarly, if secondary expander module 1 is missing or is faulty,
the expansion canisters can communicate only with SAS port 2 on each drive.
Each secondary expander module also has two LEDs on top, an online indicator and a fault
indicator, to monitor its status.
1. Dual side-by-side reversed expansion canisters (expansion canister 1 and expansion canister 2)
2. Used to attach control enclosures and additional expansion enclosures
Hot-swappable components: remove only when there are no I/O operations
The rear view of the 5U expansion drawer contains four hot-swappable fan modules that provide
cooling to the chassis drive bays and secondary expander modules; two power connectors; and
two expansion canisters, installed in reverse orientation, that provide SAS connectivity to the
controller and expansion enclosures.
After the C19-C20 power cables are connected to the power connectors, the enclosure
automatically powers on and begins its power-on self-test (POST). Power is removed from an
individual power supply by physically unplugging its power cable. However, when powering off an
enclosure, initiate the process through the software and then disconnect both of the power cords
from the enclosure.
Always initiate the power-off process through software, such as the management GUI, the service
assistant GUI, or the CLI. After shutting down an enclosure through the software, you can then
disconnect the power cords from both power supply units in the node or enclosure to completely
remove power.
Although these are hot swappable components, I/O to them should be stopped before they are
disconnected.
The 5U expansion canisters within the enclosure do not cache volume data or store state
information in volatile memory. Therefore, expansion canisters do not require battery power. If AC
power to both power supplies in an expansion enclosure fails, the enclosure powers off. When AC
power is restored to at least one power supply, the enclosure restarts without operator intervention.
Expansion canister 1 and expansion canister 2
The 5U expansion drawer contains two expansion canisters that are located in the rear of the
expansion unit. These expansion canisters are physically identical to those of the 24F expansion
enclosures, except that they are installed in reverse orientation. Integrated SAS ports provide
redundant connection paths between the control nodes in an I/O group and other expansion
enclosures in the chain, sequentially through two integrated 12 Gb SAS ports.
The SAS ports are numbered from left to right as SAS port 1 and SAS port 2. SAS port 1 (the IN
port) connects, through up to four data channels, to the control enclosure or to the previous
expansion enclosure in the chain, with the control enclosure at the end of the chain. SAS port 2
(the OUT port) connects to the next expansion enclosure in the chain. Expansion enclosures can
be included in a SAS chain, with up to a total of four expansion enclosures per SAS chain.
There are four indicator LEDs (two per SAS port) providing the port Link and Fault connection
status information, and three canister indicator LEDs (power, status, and fault) providing status
information for the expansion canister as a whole.
The canister is ready with no critical errors when Power is illuminated, Status is illuminated, and
Fault is off. When both ends of a SAS cable are inserted correctly, the green link LEDs next to the
connected SAS ports are lit.
IBM 5U expansion drawer contains four fan modules and two fan interface boards (FIBs). The FIBs
act as the interface between the fans and the system drive board. FIB 1 connects fan modules 1
and 2 to the drive board; FIB 2 connects fan modules 3 and 4.
As air flows from the front to the rear of the chassis, the hot-swappable 80 mm fan modules
provide cooling to the chassis drive bays. If the fault LED on a fan module is yellow, it is possible
that the FIB that controls that module needs to be replaced. You can remove a fan module without
powering off the expansion enclosure. However, to maintain proper operating temperature, do not
remove more than one fan module at a time.
To achieve optimum airflow, all bays and slots must have fillers to prevent the system from
overheating. System airflow is from the front to the rear of each enclosure:
• Airflow passes between drive carriers and through each enclosure.
• Airflow for the upper 4U of the 5U enclosure enters the front, passes between the disk drives,
and exits through the large fans in the rear of the enclosure.
• Airflow for the lower 1U of the 5U enclosure is driven through the power supplies by 40 mm x
56 mm fans. Air continues through the chassis, cooling the ESMs or controllers, and exits the
rear of the enclosure.
• With the combined power and cooling module, air is exhausted from the rear of each canister.
12 Gb SAS interface
This topic identifies the IBM SAS Expansion Enclosures requirements to scale out storage.
Flash drive capacities: 800 GB, 1.6 TB, 1.92 TB, 3.2 TB, 3.84 TB, 7.68 TB, and 15.35 TB
All IBM Storwize V7000 SAS expansion enclosures support 12 Gb SAS industry-standard drives
in 2.5-inch and 3.5-inch form factors, allowing clients to expand total capacity and deliver tiered
data solutions that scale with their storage capacity needs, including all-flash data processing
solutions with industry-leading flash storage technology. The Storwize V7000 SAS expansion
enclosures support up to 32 PB of usable storage capacity.
All drives are dual-port and hot-swappable. Drives of the same form factor and connector type can
be intermixed within an enclosure. SFF and LFF HD expansion enclosures can be intermixed
behind the SFF control enclosure.
On each SAS chain, the system can support up to a SAS chain weight of 10, and each I/O group
can support two SAS chains, with a total weight of 20. Each 2076-92F expansion enclosure adds
a value of 2.5 to the SAS chain weight. Each 2076-12F or 2076-24F expansion enclosure adds a
value of 1 to the SAS chain weight. This results in a maximum of 264 drives using twenty 12F
enclosures in two SAS chains plus the 24 drives in the control enclosure, 504 drives using twenty
24F expansion enclosures, or 760 drives using eight 92F expansion enclosures. In other words, a
control enclosure supports a maximum of 264 drives using large form factor 3.5-inch drives, or up
to 504 drives using small form factor 2.5-inch drives.
The maximum total of internal drives can vary depending on whether the control enclosures
contain internal drives. Large form factor and small form factor enclosures can be intermixed
behind the storage controllers. Because all SAS expansion enclosures are shared between the
control canisters in the system, they must be physically attached to both controllers in the pair.
Regardless of the type of storage (SAS attached or virtualized), there is a limit of 32 PB per
system.
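The chain-weight arithmetic above can be sketched as a small calculation. The weights (1 for a 2U 12F/24F, 2.5 for a 5U 92F), the per-chain limit of 10, and the 24-drive control enclosure come from the text; the function names are illustrative:

```python
# Sketch of the SAS chain-weight rules described in the text.
CHAIN_WEIGHT = {"12F": 1.0, "24F": 1.0, "92F": 2.5}  # weight per enclosure model
DRIVES = {"12F": 12, "24F": 24, "92F": 92}           # max drives per enclosure
MAX_WEIGHT_PER_CHAIN = 10.0

def chain_is_valid(enclosures):
    """True if the enclosures' combined weight fits on one SAS chain."""
    return sum(CHAIN_WEIGHT[m] for m in enclosures) <= MAX_WEIGHT_PER_CHAIN

def max_drives(chain1, chain2, control_drives=24):
    """Total drives for an I/O group: control enclosure plus both chains."""
    return control_drives + sum(DRIVES[m] for m in chain1 + chain2)

# Two full chains of ten 24F enclosures: 20 x 24 + 24 = 504 drives.
full = ["24F"] * 10
print(chain_is_valid(full), max_drives(full, full))  # True 504

# Four 92F per chain (weight 4 x 2.5 = 10): 8 x 92 + 24 = 760 drives.
hd = ["92F"] * 4
print(chain_is_valid(hd), max_drives(hd, hd))        # True 760
```

The same check covers mixed chains, for example two 92F plus five 24F per chain (weight 2 x 2.5 + 5 x 1 = 10).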
Models 12F/24F installed in the same rack | Models 12F/24F installed in separate racks
With the attachment of the IBM 2U expansion enclosures, IBM Storwize V7000 can scale up to a
total SAS chain weight of 10 per SAS chain and a SAS weight of 20 per I/O group, for a maximum
of up to 264 drives using large form factor 3.5-inch drives, or up to 504 drives using small form
factor 2.5-inch drives, per control enclosure. Intermixing of expansion enclosures in a system is
supported.
The rack illustrations show the redundant cabling paths to the SAS storage chains when the
expansion enclosures are installed in the same rack or in a separate rack that is next to the control
enclosures.
With the attachment of the IBM 5U 92F high density expansion drawers, IBM Storwize V7000 can
scale up to a total of eight HD drawers per I/O group (4 per SAS chain) for a total of 760 drives that
use large form factor 3.5-inch drives. Intermixing of expansion enclosures in a system is supported.
This rack illustration shows the redundant cabling paths to the SAS storage chains with different
expansion enclosure models that are installed in the same rack as the control enclosures. This
example shows two 12F expansion enclosures and two 92F high density enclosures in each SAS
chain.
This table shows SAS chains with multiple configurations, from 10 down to 4 expansion
enclosures per SAS chain. The SAS chain has limits based on the number of 2U and 5U SAS
expansion enclosures to be attached. Standard refers to the 2U expansion enclosure models.
Dense refers to the 5U HD expansion enclosure model.
The second table shows the maximum allowed intermix of expansion enclosures per FlashSystem
9100 system cluster. Each of the following expansion enclosure configurations has a total SAS
chain weight of 10, which must be balanced across both chains to support the maximum limits
specified per chain.
In this scalable solution, up to ten AFF expansion enclosures are supported per SAS chain, for a
total of twenty AFF expansion enclosures for the I/O group. The maximum number of standard 2U
SAS expansion enclosures supported in a full scale-out configuration is 80, using twenty AFF
expansion enclosures per system. In a full scale-out configuration, a maximum of thirty-two A9F
expansion drawers are supported.
When mixing SAS enclosures in the SAS chains, up to two A9F enclosures and five AFF
enclosures are supported per SAS chain, for a total of four A9F and ten AFF expansion
enclosures for the I/O group.
With four-way system clustering, the size of the system can be increased to a maximum of 3,040
drives.
12 Gb SAS interface
This topic identifies the scalability requirements to scale out storage capacity that uses a 12 Gb
SAS interface. This topic also discusses the system power-on and power-off requirements.
Each Storwize V7000 control enclosure must be configured with a 4-port 12 Gb SAS adapter (only
2 ports are active) to support the attachment of the IBM Storwize V7000 SAS expansion
enclosures. The SAS ports of both units are connected by using SAS connectors. The 12 Gb SAS
port is a third-generation SAS interface that uses PCI Express 3.0 bandwidth.
The improved bandwidth, backed by I/O processing capabilities that maximize link utilization,
supports increased scaling of traditional hard disk drives as well as improved flash performance.
Each 12 Gb SAS port, as well as the SAS cable, contains four physical (PHY) lanes. Each lane
uses full duplex transmission (as in 6 Gb SAS technology) to transmit and receive data at rates of
up to 4800 MB/s per port (four lanes at 12 Gbps each, or 48 Gbps in total).
Above each port is a green LED that is associated with each PHY (eight LEDs in total). The LEDs
are numbered 1 - 4. The LED indicates activity on the PHY. For example, when traffic starts, it
goes over PHY 1. If that lane is saturated, the next PHY starts working. If all four PHY LEDs are
flashing, the back end is fully saturated. The 12 Gb SAS adapter also provides investment
protection through backward compatibility with 6 Gb SAS.
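The per-port figures above follow from simple lane arithmetic. A sketch, assuming the 8b/10b-style line encoding used by SAS (so roughly 10 transmitted bits per data byte):

```python
# Sketch of the 12 Gb SAS link arithmetic: each port (and cable) carries
# four PHY lanes, each running at 12 Gbps.
LANES_PER_PORT = 4
GBPS_PER_LANE = 12

aggregate_gbps = LANES_PER_PORT * GBPS_PER_LANE
print(aggregate_gbps)                # 48 (Gbps per port)

# Assuming roughly 10 line bits per transferred byte (8b/10b-style
# encoding), 48 Gbps corresponds to about 4800 MB/s of transfer capacity.
print(aggregate_gbps * 1000 // 10)   # 4800 (MB/s)
```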
When connecting to an expansion enclosure, only ports 1 and 3 on the SAS PCIe adapter in each
of the control nodes in the I/O group are used to connect control nodes to expansion enclosures.
SAS port 1 supplies the connection path for one SAS chain and SAS port 3 supplies the connection
path for the other SAS chain. This configuration creates two separate chains of storage.
Enclosure IDs are assigned dynamically by the SAS fabric. SAS port 2 (output) on the expansion
enclosure is not in use.
Each independent SAS chain connects the control enclosures to the expansion enclosures. This
provides a symmetrical, balanced distribution of the expansion enclosures across both SAS
chains for performance and availability.
All internal disk drives of the control enclosure belong to SAS chain 0. Each of the independent
SAS chains (SAS port 1 and SAS port 3) supports a maximum of 10 expansion enclosures per
chain (depending on the model).
SAS port 1 in each control node connects to the “in” ports of the canisters in the first SAS expansion
enclosure in one of the SAS storage chains, providing fault-tolerance.
The “out” port from the first expansion enclosure connects to the “in” port on the second SAS
expansion enclosure in the chain. This cabling pattern continues for all of the SAS expansion
enclosures in the chain. The last expansion enclosure in the chain does not have a loop-back cable
or other special terminating connection.
SAS port 3 in each control node connects to the “in” ports of the canisters in the first SAS
expansion enclosure in the other SAS storage chain, using the same cabling pattern as for the top
SAS storage chain. Enclosure IDs are assigned dynamically by SAS fabric device discovery.
SAS port 2 (output) on the last expansion enclosure does not contain a SAS cable.
Powering off an expansion enclosure breaks the SAS chain and the connection to any drives in
expansion enclosures beyond that enclosure. However, because each end of the SAS chain is
connected to a node in the I/O group, access to the powered-on SAS enclosures continues to
work.
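The effect of a power-off on a chain can be sketched as a walk along the cabling described above: a chain is an ordered list of expansion enclosures headed by the control enclosure, and the chain is broken at the first powered-off enclosure (the enclosure names are illustrative):

```python
# Sketch modeling the IN/OUT cabling of one SAS chain. The control
# enclosure sits at the head; each enclosure's OUT port feeds the next
# enclosure's IN port.
def reachable_enclosures(chain, powered_off):
    """Return the enclosures still reachable from the control enclosure."""
    reachable = []
    for enclosure in chain:      # walk hop by hop, IN -> OUT
        if enclosure in powered_off:
            break                # the chain is broken at this enclosure
        reachable.append(enclosure)
    return reachable

chain = ["exp1", "exp2", "exp3", "exp4"]
print(reachable_enclosures(chain, set()))     # ['exp1', 'exp2', 'exp3', 'exp4']
print(reachable_enclosures(chain, {"exp3"}))  # ['exp1', 'exp2']
```

As the model shows, enclosures before the break stay accessible while everything beyond it is lost until power is restored.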
IBM SAS expansion enclosures are connected to the IBM Spectrum Virtualize control enclosures
by using the IBM 0.6-meter, 1.5-meter, 3.0-meter, and 6.0-meter 12 Gb SAS cables (mini SAS HD
to mini SAS HD), terminated with SFF-8644 connectors.
Keywords
IBM Storwize V7000 GUI Internal drives
12 Gb SAS interface Virtualization
SAS chains Redundant Array of Independent Disks
(RAID)
Mini-SAS HD cable
IBM 2U SAS Expansion Enclosure
Effective capacity
IBM 5U High Density Expansion Drawer
Internal disks
Secondary expander modules
Storage pool
FAN interface board (FIB)
SAS IN and OUT ports
Hot-swappable components
Review questions (1 of 4)
1. True or False: All IBM Storwize V7000 12F/24F/92F expansion enclosures are
displayed in the management GUI as external storage resources.
2. True or False: All 2U SAS expansion enclosures must be installed in the same rack as
the control enclosures.
3. True or False: All drive modules in an enclosure must be the same size and type.
Review answers (1 of 4)
1. True or False: All IBM Storwize V7000 12F/24F/92F expansion enclosures are
displayed in the management GUI as external storage resources.
The answer is False. IBM Storwize V7000 12F/24F/92F expansion enclosures are configured as
internal storage arrays within the management GUI.
2. True or False: All 2U SAS expansion enclosures must be installed in the same rack as
the control enclosures.
The answer is False. SAS expansion enclosures can be installed in separate racks as well as in
the same rack as the Storwize V7000 control enclosures.
3. True or False: All drive modules in a SAS-attached enclosure must be the same size
and type.
The answer is False. IBM SAS expansion enclosure LFF and SFF drive models can be
intermixed behind the same control enclosures.
Review questions (2 of 4)
4. Which of the following SAS expansion enclosures can be configured into a chain of 10
expansion enclosures? (Choose all that apply)
a. IBM SAS Expansion Enclosure, Model 12F
b. IBM SAS Expansion Enclosure, Model 24F
c. IBM SAS High Density Expansion Drawer, Model 92F
d. All of the above
5. True or False: Drives can be installed in any empty slot in the IBM 5U High Density SAS
Expansion Drawer.
Review answers (2 of 4)
4. Which of the following SAS expansion enclosures can be configured into a chain of 10
expansion enclosures? (Choose all that apply)
a. IBM SAS Expansion Enclosure, Model 12F
b. IBM SAS Expansion Enclosure, Model 24F
c. IBM SAS High Density Expansion Drawer, Model 92F
d. All of the above
The answers are A and B. Models 12F and 24F can both scale up to 10 SAS expansion
enclosure per SAS chain for a maximum of 20 SAS expansion enclosures per I/O group.
5. True or False: Drives can be installed in any empty slot in the IBM 5U High Density
SAS Expansion Drawer.
The answer is False. All 5U High Density SAS Expansion Drawer drive slots are populated in
sequential order, starting from the back-left corner position (slot 1, grid A1).
Review questions (3 of 4)
6. How many Storwize V7000 SAS High Density Drawer can be attached to one Storwize
V7000 I/O group?
a. Ten
b. Eight
c. Six
d. Four
7. Which ports on the IBM 12 Gb SAS adapter are used to connect to the expansion
enclosures?
a. Ports 1 and 2
b. Ports 3 and 4
c. Ports 1 and 3
d. Ports 1 through 4
Review answers (3 of 4)
6. How many Storwize V7000 SAS High Density Drawers can be attached to one Storwize
V7000 I/O group?
a. Ten
b. Eight
c. Six
d. Four
The answer is Eight. IBM 5U High Density Expansion Drawers support a maximum of eight
per I/O group (four per SAS chain).
7. Which ports on the IBM 12 Gb SAS adapter are used to connect to the expansion
enclosures?
a. Ports 1 and 2
b. Ports 3 and 4
c. Ports 1 and 3
d. Ports 1 through 4
The answer is ports 1 and 3. Only ports 1 and 3 are used to create SAS chains for the attached
expansion enclosures.
Review questions (4 of 4)
8. True or False: A SAS chain can be connected to nodes in two I/O groups.
Review answers (4 of 4)
8. True or False: A SAS chain can be connected to nodes in two I/O groups.
The answer is False. A SAS chain must be connected to the node pair within a single I/O
group.
Summary
Overview
This module introduces IBM supported traditional RAID arrays and Distributed RAID arrays, and
the benefits of implementing DRAIDs as part of your production solution.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
Copyright IBM Corporation
IBM Storwize V7000 RAID protection solutions
• Managed Resources
• Traditional RAID levels
• Distributed RAID arrays
[Figure: internal storage resources and external storage resources connected through the SAN]
An IBM Spectrum Virtualize system such as IBM SAN Volume Controller, FlashSystem 9100,
FlashSystem V9000, or a member of the IBM Storwize family can manage a combination of internal
and supported external storage systems. Internal storage is RAID-protected storage, that is, Flash,
SAS, or SATA drives that are directly attached to the system through SAS-attached expansion
enclosures. The system automatically detects the drives that are attached to it and displays them
within the GUI as internal or external storage.
An external storage subsystem, or storage controller, is an independent back-end device that
coordinates and controls the operations for its disk drives or logical units. External storage must be
connected to the same SAN fabric to be virtualized by the Spectrum Virtualize system.
The IBM Spectrum Virtualize system manages the capacity of other disk systems with external
storage virtualization (an External Virtualization license is required for each storage device to be
managed). For example, when Spectrum Virtualize system virtualizes a storage enclosure, such as
the FlashSystem 900, its capacity becomes part of the system. The FlashSystem 900 is then
managed as external storage in the same way as internal capacity. Capacity in external storage
systems inherits all the rich functions and ease of use of the Spectrum Virtualize system.
When virtualizing the FlashSystem 900 Model AE2 Flash enclosures behind the IBM FlashSystem
V9000, each enclosure is configured into a single managed disk group and RAID array. The
FlashSystem 900 Model AE2 is then managed as internal storage for redundancy in a module
failure. However, the FlashSystem V9000 with Model AE3 enclosures is managed as external
storage; its capacity is separately managed and configured into LUNs, which become multiple
managed MDisks, to evenly balance the use of physical resources.
• Tier 1 flash
ƒ Lower-cost flash drives, typically with larger capacities
ƒ Lower performance and write endurance characteristics
[Figure: storage tiering, past to future. Distant past: Tier 1 15K HDDs, Tier 3 7.2K HDDs, Tier 4 tape.
Later: Tier 0 Flash/SSDs, Tier 1 15K HDDs, Tier 3 7.2K Near Line (NL) HDDs, Tier 4 NL or tape.
Now and future: Tier 1 Flash, Tier 3 Near Line (NL), Tier 4 NL or tape.]
Tiered storage is the assignment of different categories of data to various types of storage media to
reduce total storage cost. Tiers are determined by the performance and cost of the media, and data
is ranked by how often it is accessed. In the past, the amount or percentage of storage per tier was
based on a 'rule of thumb' or a rough approximation. Now and in the future, IBM has tools available
to determine the appropriate amount of storage that is needed for each storage tier based on
analytics.
An IBM SAN Volume Controller solution can also support the following tiers of drives:
• IBM uses a 4 tiered storage solution that is based on the client’s storage media need. A Tier 4
solution is used by businesses that require both greater data currency and faster recovery. Tier
4 incorporates more disk-based solutions than tape backups, making it easier to make such
point-in-time (PiT) copies with greater frequency.
• Tier 3 is often used for event-driven data, and for rarely used or unclassified files on
slow-spinning hard disk drives (HDDs), recordable compact disks, or tapes.
• Tier 2 can be used for data such as financial, seldom-used, or classified files, with the ability to
store on less-expensive media in the SAN.
• Managed Resources
• Traditional RAID levels
• Distributed RAID arrays
[Figure: RAID 0, RAID 1, and RAID 10 arrays presented as MDisks and grouped into storage pools]
Traditional RAID levels have been around for decades, providing data storage reliability based on
how data is stored on the disk array and the level of protection that is provided. When you plan to
attach expansion enclosures to the IBM Spectrum Virtualize storage system, the first aspect to
consider is which RAID level is the most suitable and appropriate for your environment. The
preferred RAID type varies according to the capacity, performance, and protection level required.
Except for RAID 0, traditional RAID levels provide various degrees of redundancy and
performance, with various restrictions based on the number of members in the array. When part of
the RAID system fails, different RAID levels help to recover lost data in different ways.
The SAS RAID controller supports RAID 0, 1, 5, 6, and 10.
Traditional RAID 0
• Data striped across two or more drives
• Created without parity information, redundancy,
or fault tolerance
[Figure: striped volume]
RAID 0 (also known as a stripe set or striped volume) stripes data evenly across two or more disks,
without parity information, redundancy, or fault tolerance. Since RAID 0 provides no fault tolerance
or redundancy, the failure of one drive can cause the entire array to fail. This failure occurs because
data is striped across all disks, which results in total data loss.
By using multiple disks (at least two) at the same time, RAID 0 offers superior I/O performance for
both read and write operations: more disks provide more IOPS bandwidth, and sequential I/O
throughput increases because all the disks in the array can be written to at the same time.
RAID 0 is ideal for non-critical storage of data that must be read and written at high speed and that
can tolerate lower reliability, such as on an image correction or video editing station. RAID 0 is
primarily used in applications that require high performance and where loss of data is acceptable.
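The round-robin striping described above can be sketched with a short, illustrative calculation (not Storwize internals; the function name and block-level granularity are hypothetical):

```python
def raid0_locate(logical_block, num_drives):
    """Return (drive_index, stripe_number) for a block in a striped array."""
    return logical_block % num_drives, logical_block // num_drives

# Consecutive logical blocks rotate across all drives, which is why
# they can be read or written in parallel.
layout = [raid0_locate(b, 4) for b in range(8)]
```

Because adjacent blocks land on different spindles, a sequential read of eight blocks touches every drive twice rather than one drive eight times.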
Traditional RAID 1
• Data is mirrored to second drive
ƒ Writes data to both drives
• Does not support parity, striping, or spanning drive space
ƒ Array can be only as large as the smallest member disk
• Supports a single drive failure
[Figure: mirroring; blocks duplicated across two drives]
RAID 1 consists of an exact copy (or mirror copy) of a set of data on two or more disks. A RAID 1
configuration offers no parity, striping, or spanning of disk space across multiple disks. Since the
data is mirrored on all disks in the array, the array can be only as large as the smallest member
disk.
The array continues to operate while at least one member drive is operational. Data is stored twice
by writing it to both drives. If a drive fails, data does not have to be rebuilt; the controller uses the
data on the working drive and continues operations.
From a performance standpoint, a RAID 1 array provides twice the read IOPS of a single disk, but
the same write IOPS as a single disk, since a write must be done to each disk for each application
write. Similarly, for sequential I/O you can read up to twice as fast from two disks as from one, but
you cannot write faster to a RAID 1 array than to a single disk. Write latency usually depends on
the use of write cache in the disk controller.
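The read/write asymmetry above can be modeled with a rough calculation (an illustrative sketch; the function and the simple additive model are assumptions, not product behavior):

```python
def raid1_effective_iops(single_drive_iops, read_fraction):
    """Crude RAID 1 model: reads are served by either mirror, writes hit both."""
    reads = 2 * single_drive_iops * read_fraction       # two copies can serve reads
    writes = single_drive_iops * (1 - read_fraction)    # writes limited to one drive's rate
    return reads + writes
```

A pure-read workload sees double the IOPS of a single drive, while a pure-write workload sees no gain at all, matching the text above.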
Traditional RAID 5
• Requires at least 3 drives but can work with up to 16
• Data and parity striped across the drive set
[Figure: striping with parity across the drives]
RAID 5 is the most common secure RAID level. It requires at least three drives but can work with up
to 16. Data blocks, or strips, are striped across the drives, and a parity checksum of the block data
is written to one drive per stripe, so the parity data is spread across all the disks in the array. A
RAID 5 array can withstand a single drive failure without losing data or access to data. Although
RAID 5 can be achieved in software, a hardware controller is recommended. Write cache memory
is usually used on these controllers to improve write latency, and also read latency when the data
exists in the cache.
The RAID 5 write penalty, while usually masked by the controller's write cache, does affect the
IOPS workload to the disks. Each application write requires reading the data that is being
overwritten and the associated parity information on another disk, then updating both the data and
the parity. Therefore, each application write I/O requires four physical disk I/Os. If the application or
disk controller is doing sequential write I/O, the controller can calculate the new parity for the entire
stripe and update the parity strip once.
Drive failures affect throughput. For example, reads from a failed disk require reading all the other
disks in the array to reconstruct the data. Also, since spare drives are located outside the array,
replacing the failed drive generates reads from the remaining drives during the rebuild. If a second
drive fails during the hot spare replacement process, data in the array is lost and must be recovered
from a backup, so any data changes since that backup are lost.
A RAID 5 rebuild can take a while to complete, depending on the disk sizes (larger drives require
more time to rewrite) and the existing application workload (application I/Os compete with rebuild
I/Os).
The capacity downside of RAID 5 is that one drive's worth of space is used by parity information in
each array.
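The four-I/O write penalty described above can be expressed as a back-of-envelope calculation (an illustrative sketch; the function name and workload numbers are made up for the example):

```python
RAID5_WRITE_PENALTY = 4  # read old data, read old parity, write data, write parity

def physical_iops(app_reads, app_writes, penalty=RAID5_WRITE_PENALTY):
    """Physical disk I/Os per second generated by an application workload."""
    return app_reads + app_writes * penalty

# A 70/30 read/write workload of 1000 application IOPS drives
# 700 + 300 * 4 = 1900 physical I/Os per second to the array's disks.
load = physical_iops(700, 300)
```

This is why a RAID 5 array can saturate its member drives well before the application workload looks heavy on paper.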
Traditional RAID 6
• Requires at least 4 drives but can work with up to 16
• Data blocks (or strips) are striped across the drives
• Two drives of parity space striped across the array
[Figure: striping with double parity]
RAID 6 is like RAID 5, but the parity data is written to two drives instead of one. Therefore, a RAID 6
array can survive two disk failures without data loss, making it more reliable than RAID 5.
The RAID 6 write penalty means that each application write requires six physical disk I/Os: the data
being overwritten is read from one disk and the associated parity information from two other disks,
the new parity is calculated, and the data and parity are then updated on those three drives. Write
I/O latency is usually masked by the write cache on the controller, but the IOPS workload to the
disks is limited by the disk IOPS limits. As with RAID 5, full stripe writes, typically from sequential
I/O streams, avoid most of the penalty because the controller calculates the parity for a full stripe
immediately and then writes the full stripe, including the parity data.
The capacity downside of RAID 6 is that two drives' worth of space is consumed by parity
information in each array.
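The parity-capacity trade-off of RAID 5 versus RAID 6 can be checked with a one-line calculation (illustrative only; the function name and drive sizes are hypothetical):

```python
def usable_capacity(drive_count, drive_size_tib, parity_drives):
    """Capacity left after parity: one drive's worth for RAID 5, two for RAID 6."""
    return (drive_count - parity_drives) * drive_size_tib

raid5 = usable_capacity(8, 2, parity_drives=1)   # eight 2 TiB drives -> 14 TiB
raid6 = usable_capacity(8, 2, parity_drives=2)   # same drives       -> 12 TiB
```

For the same eight-drive array, RAID 6 gives up one additional drive of capacity in exchange for surviving a second drive failure.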
Traditional RAID 10
• Combines RAID 1 and RAID 0
ƒ Data is mirrored to second drive
ƒ Stripes data across each set of drives
• Supports a single drive failure
ƒ No loss of data
[Figure: mirroring plus striping across drive pairs]
RAID level 10 combines RAID 1 and RAID 0 by mirroring all data on secondary drives while using
striping across each set of drives. RAID 10 can deliver higher throughput for random read
workloads, since twice as many disks provide twice the IOPS bandwidth. With certain workloads,
such as sequential writes, RAID 5 often shows a performance advantage.
If an issue occurs with one of the disks in a RAID 10 configuration, the rebuild time is fast since all
that is needed is copying all the data from the surviving mirror to a new drive. This process can take
as little as 30 minutes for drives of 1 TB.
RAID 10 can be an expensive way to maintain redundancy, as half of the usable capacity goes to
mirroring.
For redundancy, spare drives (also known as hot spares or global hot spares) are configured and
are used to automatically replace a failed drive. As part of the rebuild process, data is read from the
remaining drives in the array (both data and parity), the data on the failed drive is calculated and
written to the spare. Theoretically, data can be recovered as quickly as it takes to sequentially write
a full drive of data (or less if the array isn’t full). However, as application I/O continues it competes
for disk access and can slow down the sparing process significantly. Further, since the application
might be trying to read data from the drive that failed, those reads require reading the associated
data strips/blocks in the array plus one of the parity disks, using more of the read IOPS bandwidth
of the array. As a consequence, arrays that are built from larger drives take longer to go through
the sparing process. Higher application workloads can also cause the sparing process to take
longer. The longer the sparing process takes, the greater the chance that a second drive fails in a
RAID 5 array, resulting in data loss. RAID 6 can survive two disk failures, so it is more reliable than
RAID 5 because there is significantly less chance of a two-drive failure during the hot spare
replacement process.
In this example of RAID 6, each stripe is made up of data strips (represented by D1, D2, and D3)
and two parity strips (P and Q), with the ability to withstand two simultaneous drive failures.
Because the rebuild time is limited by the throughput of a single drive, the rebuild time increases as
drives grow in capacity.
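The point that a traditional rebuild is bounded by a single drive's throughput can be estimated roughly (an illustrative sketch; the drive capacity and sustained write rate are example figures, not measured values):

```python
def rebuild_hours(drive_capacity_gb, write_rate_mb_s):
    """Hours to write one full spare drive at a sustained rate (1 GB = 1000 MB here)."""
    return drive_capacity_gb * 1000 / write_rate_mb_s / 3600

# An idle 8 TB nearline drive at ~150 MB/s takes roughly 14.8 hours to fill;
# competing application I/O stretches this considerably further.
estimate = rebuild_hours(8000, 150)
```

The same arithmetic explains why distributing the rebuild write work across many drives, as DRAID does later in this unit, shortens the exposure window so dramatically.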
Traditional arrays can be created by using the GUI or the CLI; both use the mkarray command. The
management GUI offers presets that are implemented based on best-practice guidelines. After the
array is created, it can be used immediately and added to a pool where volumes can be created.
Volumes can be written to immediately after creation and mapping.
Redundancy depends on the type of RAID level that is selected at creation time. To reduce the
calculation of parity information and to improve performance, the cache attempts to combine writes
into full stripe writes. The usable capacity for RAID 0 is the drive count times the drive size, which is
100% usage but at the cost of no redundancy. A redundancy of 1 means that one drive can fail
without failing the array. The usable capacity for RAID 1 is the size of one drive, since the other is
just a mirror. The supported drive counts for RAID 5, 6, and 10 are higher, but a default size is used
in the GUI. The GUI default is to use eight drives for a RAID 5 array, which helps balance cost and
reliability.
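The usable-capacity rules for RAID 0, 1, and 10 summarized above can be captured in a small helper (an illustrative sketch, not a product API; the RAID 10 case assumes equal-size members):

```python
def usable_tib(level, drive_sizes_tib):
    """Usable capacity in TiB for a list of member drive sizes."""
    if level == "raid0":
        return sum(drive_sizes_tib)        # all capacity, no redundancy
    if level == "raid1":
        return min(drive_sizes_tib)        # only as large as the smallest member
    if level == "raid10":
        return sum(drive_sizes_tib) / 2    # half the capacity goes to mirrors
    raise ValueError("unsupported level: " + level)
```

Running it on a shelf of 2 TiB drives makes the trade-offs concrete: four drives give 8 TiB under RAID 0 but only 4 TiB under RAID 10.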
For external controllers, where possible, create a single logical disk from the entire capacity of each
RAID array. You can intermix drives of different block sizes within an array and a storage pool.
However, performance degradation can occur if you intermix 512 block-size drives and 4096
block-size drives within an array. For example, with a configuration of 24 disks and one hot spare,
the remaining 23 drives do not divide evenly. Therefore, you might choose two spares and create
two 11-disk RAID 5 arrays, giving two equivalent (in terms of size and performance) MDisks.
Do not mix managed disks (MDisks) that greatly vary in performance in the same storage pool tier.
The overall storage pool performance in a tier is limited by the slowest MDisk. Because some
storage systems can sustain much higher I/O bandwidths than others, do not mix MDisks that are
provided by low-end storage systems with MDisks that are provided by high-end storage systems in
the same tier. Doing so negates the benefit of Spectrum Virtualize system balancing use of physical
disk resources.
Keep in mind that RAID data redundancy is not the same as data backup. You still need to ensure
data safety by backing up your data daily to offline or off-site storage.
Array member goals are used for hot spare selection and can be displayed with the lsarraymembergoals
command
Both the array and its member drives have a property called balanced (suitability in the GUI) which indicates
whether member goals are met
Exact: All member goals have been met
Yes: All member goals except location have been met
No: One or more of the capability goals has not been met
For an array and its members, the system dynamically maintains a spare protection count of non-degrading
spares
Administrators can design the RAID arrays explicitly, or can allow the system to build arrays based
on specified goals.
If you use the GUI to create a traditional array, several parameters are used to determine suitable
drives based on different goals. One goal is to have only drives of the same type in one array (Flash
or SSDs). The drive RPM is also a goal; all members should have the same speed. The same is
true for capacity. Another goal is the location goal, which places the members on a specific chain,
enclosure, or slot ID.
The Spectrum Virtualize system supports hot-spare drives. To select a spare drive, the member
goals are used. They can be listed with the lsarraymembergoals command.
From a planning perspective, you should have about 1 hot spare for every 20-30 physical disks that
are configured in RAID 1, 5, 6 or 10 arrays.
When a RAID member drive fails, the system automatically replaces the failed member with a
hot-spare drive and resynchronizes the array to restore its redundancy. When creating internal
storage by using the wizards, the management GUI automatically creates and marks all spare
drives. The rule is to create one spare for every 23 array members, which results in the following
setup for one enclosure with 24 disks: 23 drives have the Candidate state while one has the state
of Spare.
The selection of a spare drive that replaces a failed disk is done by the system. A drive with a lit
fault LED indicates that the drive has failed and is no longer in use by the system. When the system
detects that such a failed drive is replaced, it reconfigures the replacement drive to be a spare and
the drive that was replaced is automatically removed from the configuration. The new spare drive is
then used to fulfill the array membership goals of the system. The process can take a few minutes.
When the system selects a spare for member replacement, the spare that is the best
possible match to array member goals is chosen based on
An exact match of member goal capacity, performance, and location
A performance match: the spare drive has a capacity that is the same or larger and has the same or
better performance
If a better spare is introduced to the system, the better spare is exchanged automatically to
rebalance the array
The goal is always to replace the failed disk with a spare of the same type and properties. If one is
not available, the system searches for the best match. The spare drives are global and can be used
by any array. There is no limit to the number of spare drives allowed in an array.
Drive-Auto Manage/Replacement
• No longer must follow the DMP (directed maintenance procedure) for step-by-step guidance
• Simply swap the old drive for the new one
ƒ New drive in that slot takes over from the replaced drive
[Figure: old drive swapped for a new drive in the same slot of a RAID array MDisk]
Replacing an old drive is much easier since you no longer must follow the guidance of the DMP
(directed maintenance procedure) to exchange an old drive for a new one. With Drive-Auto
Manage, you can swap the drives, and the new drive in that slot takes over from the replaced drive.
Wait at least 20 seconds before you remove the drive assembly from the enclosure to enable the
drive to spin down and avoid possible damage to the drive. Do not leave a drive slot empty for
extended periods. Do not remove a drive assembly or a blank filler without having a replacement
drive or a blank filler.
• Managed Resources
• Traditional RAID levels
• Distributed RAID arrays
With the IBM Spectrum Virtualize 7.6 release, IBM introduced an advanced RAID array technology
solution that is known as IBM Distributed RAID array technology (typically referred to as DRAID).
DRAID offers an improved RAID solution because it distributes data across all drives in the array,
reducing the load on each individual drive, with no dedicated spare drives. Since no idle disks are
used as spares, all disks in a DRAID array contribute to performance. Both DRAID 5 and DRAID 6
are supported.
With more drives and the demand for high availability, DRAID offers up to 10 times faster RAID
array rebuild times than traditional RAID, as a result of incorporating and striping the hot spare
space within the array, which is especially important when using large drives. DRAID also delivers
increased application I/O bandwidth by incorporating the hot spare drives into the array, unlike
traditional RAID. Think of an 80 drive array as 40 pairs of drives, all processing the rebuild work
simultaneously, versus a single hot spare disk performing the write half of that work.
When using Spectrum Virtualize thin provisioning, compressed volumes, or Data Reduction Pools
(DRP), data is written sequentially by the Spectrum Virtualize system to the MDisks, thereby
negating the traditional RAID 5 and RAID 6 write penalties.
Since traditional RAID is likely to be used for decades to come, the system still maintains support
for it.
Traditional and distributed arrays can be combined in the same pool. However, it is not possible to
convert a traditional array to a distributed array or vice versa though you can dynamically migrate
data from one to the other.
Distributed RAID arrays provide the ability to create arrays from 4 to 128 drives in size.
The system supports the following RAID levels for distributed arrays.
• DRAID 5 arrays are created with a drive count (array width 4-128), a hot spare or rebuild area
count (0-4), and a stripe width (3-16).
• DRAID 6 arrays are created with a drive count (array width 6-128), a hot spare or rebuild area
count (0-4), and a stripe width (5-16).
• Distributed arrays support encryption as of IBM Spectrum Virtualize version 7.7 and above.
DRAID
• MDisk capacity
ƒ DRAID 5 = Drive capacity x (#Drives - #Spares) x {(Stripe width – 1)/Stripe width}
ƒ DRAID 6 = Drive capacity x (#Drives - #Spares) x {(Stripe width – 2)/Stripe width}
When creating a DRAID array, the MDisk representing the array has a logical size as specified by
the formulas. For example, the graphic shows a DRAID 5 creation that uses eighteen 278.9 GiB
SAS drives with one hot spare and an array width of 9. This yields an MDisk with a size calculated
as: 278.9 GiB x (18-1) x (9-1)/9 = 4,214 GiB = 4.12 TiB in capacity.
Think of the space as the array width minus the spares, with the remaining space split between
data and parity space based on the DRAID type and stripe width. The mkdistributedarray command
is used to create DRAID arrays, with an array width or drive count (4-128 for DRAID 5 and 6-128
for DRAID 6), a number of hot spares, also known as rebuild areas (1-4), and a stripe width (3-16
for DRAID 5 and 5-16 for DRAID 6).
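The MDisk capacity formulas can be reproduced in a few lines, matching the worked example above (an illustrative sketch, not a product tool; the function name is made up):

```python
def draid_mdisk_gib(drive_gib, drive_count, spares, stripe_width, parity_strips):
    """MDisk capacity in GiB; parity_strips = 1 for DRAID 5, 2 for DRAID 6."""
    return (drive_gib * (drive_count - spares)
            * (stripe_width - parity_strips) / stripe_width)

# The example from the text: eighteen 278.9 GiB drives, one rebuild area,
# stripe width 9, DRAID 5.
capacity = draid_mdisk_gib(278.9, 18, 1, 9, parity_strips=1)
```

Dividing the result by 1024 gives approximately 4.12 TiB, as stated in the example.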
Example of Distributed RAID 6, 3+P+Q over 10 drives with two distributed spares
• Drive #3 fails
• Lost data on "D3" is rebuilt using D1, D2, and P
• Spare space is allocated depending on the pack number
• In this instance, 5 rows make up a pack
• Rebuild areas are moved based on the pack number
• Number of rows in a pack depends on the number of strips in a stripe; pack size is constant for an array
[Figure: D, P, and Q strips and the rebuild areas distributed across all 10 drives]
Distributed RAID arrays solve the traditional RAID problems because rebuild areas are distributed
across all the drives in the array. In this example, a DRAID 6 array distributes 3+P+Q over 10
drives with two distributed spares, and five rows make up a pack. Distributed arrays remove the
need for separate drives that sit idle until a failure occurs. Instead of allocating one or more drives
as spares, the spare capacity is distributed over specific rebuild areas across all the member
drives. The location of the spare space depends on the pack number. The number of rows in a
pack depends on the number of strips in a stripe, which means that the pack size is constant for an
array. A stripe, which can also be referred to as a redundancy unit, is the smallest amount of data
that can be addressed. For distributed arrays, the strip size can be 128 or 256 KiB. The stripe width
indicates the number of strips of data that can be written at one time when data is regenerated
after a drive fails.
When a drive failure occurs (as shown with Disk 3), to recover data, data is first read from multiple
drives. For example, to recover a data strip requires reading the remaining data strips in the stripe
plus one of the associated parity strips, then the missing data is calculated from that information.
The recovered data is then written to the rebuild areas, which are distributed across all of the drives
in the array. Because the rebuild areas are spread over all drives, data can be copied faster,
allowing redundancy to be restored much more rapidly. Additionally, as the rebuild
progresses, the performance of the pool is more uniform because all of the available drives are
used for every volume extent. The number of rebuild areas is based on the width of the array. The
size of the rebuild area determines how many times the distributed array can recover failed drives
without risking becoming degraded. After the failed drive is rebuilt, the array can tolerate another
two drive failures. If all of the rebuild areas are used to recover data, the array becomes degraded
on the next drive failure.
This graph shows that IBM Distributed RAID arrays provide up to 20% better performance
compared to traditional non-distributed RAID arrays for the same I/O workload, even when both
arrays contain the same number of drives.
This graph shows array rebuild performance comparison for traditional RAID array and distributed
RAID array technologies while application I/O was running on the arrays. This example compares
the rebuild for a member drive failure in non-distributed RAID 6 with 12 drives of 1 TB nearline
technology to the rebuild for a member drive failure in distributed RAID 6 with 128 drives of 1 TB
nearline technology.
The array rebuild test showed 10 times faster rebuild performance for the distributed RAID array.
However, rebuild times can vary depending on the distributed array configuration, disk sizes, and
drive types.
When a redundant array is doing read/write I/O operations, the performance of the array is bound
by the performance of the slowest member drive. If the SAS network is unstable, or if too much
work is driven to the array while drives run internal ERP processes, performance to member drives
can be far worse than usual. In this situation, arrays that offer redundancy can accept a short
interruption to redundancy to avoid writing to, or reading from, the slow component. Writes that are
mapped to an underperforming drive are committed to the other copy or parity and are then
completed with good status (assuming no other failures). When the member drive recovers,
redundancy is restored using a background process that writes the strips that were marked out of
sync while the member was slow.
• This technique is managed by using the setting of the slow_write_priority attribute of the
distributed array, which defaults to latency when the array is created. When set to latency, the
array is allowed to be taken out of synchronization to quickly complete write operations that take
excessive time. When the array uses latency mode or attempts to avoid reading a component
that is in redundancy mode, the system evaluates the drive regularly to assess when it
becomes a reliable part of the system again. If the drive never offers good performance or
causes too many performance failures in the array, the system fails the hardware to prevent
ongoing exposure to the poor-performing drive. The system fails the hardware only if it cannot
detect another explanation for the bad performance from the drive.
• To modify the response time goal, use the charray command to change the slow_write_priority
attribute to redundancy. When set to redundancy, slow write operations are completed in
normal time and the array remains synchronized; the array is not allowed to become out of
sync. However, the array can still avoid read performance loss by servicing reads to the slow
component from redundant paths.
Why DRAID 6?
Better reliability and availability than traditional RAID or DRAID 5
Less chance of losing data due to drive failures during a rebuild
Ability to recover from drive read errors
Better performance than traditional RAID with unused hot spare drives
DRAID offers faster rebuild time, especially when
Customers have spinning disks as opposed to flash
Large disks are used
Under heavy application I/O workloads
Spectrum Virtualize system features of thin volumes, compressed volumes, and
Data Reduction Pools (DRPs) avoid most of the RAID write penalty through full
stripe writes as new data is written sequentially to storage
The basic benefit of DRAID 6 is higher reliability, higher availability, and a lower chance of data
loss as compared to traditional RAID 5 and RAID 6, or even DRAID 5. Compared to traditional RAID 5
or DRAID 5, you avoid data loss if a second drive fails before the first drive failure is handled via the
sparing process. Though a second failure isn't likely, it is a risk, and that risk increases as the size
of the array increases, as larger drives are used, and as production workload increases, because
each of these factors lengthens the sparing process and so raises the probability of a second drive
failure while it runs. Second, while read errors are unlikely to occur, they are possible (and usually
fixed during disk scrubbing processes done by the hardware). These types of errors can be recovered
with RAID 1, 5, 6, or 10, but they can't be recovered during the sparing process except by RAID 6 or
DRAID 6 with a single drive failure. DRAID also offers better IOPS bandwidth since spare drives are
incorporated into the array. Finally, with the support of thin provisioning, compressed volumes, and
volumes in Data Reduction Pools (DRPs), new data is written sequentially to the storage, avoiding
most of the RAID write penalty through full-stripe writes; that penalty is the significant disadvantage
of traditional or distributed RAID 5 and 6, though it is less of a concern with flash. In any event,
write latency is normally minimized by the DRAM-based write cache in the Spectrum Virtualize system.
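As an illustration of why full-stripe writes matter, the following sketch (not from the course materials; the I/O counts are the standard textbook model of RAID 6 write behavior) compares the backend I/O cost of small random writes against sequential full-stripe writes:

```python
# Illustrative sketch: backend I/O cost of host writes on a RAID 6 array,
# comparing small random updates with sequential full-stripe writes.

def raid6_small_write_ios(host_writes: int) -> int:
    """Each small random update costs 6 backend I/Os: read old data,
    read both parities, then write data and both parities."""
    return host_writes * 6

def raid6_full_stripe_ios(host_writes: int, data_drives: int) -> int:
    """Sequential full-stripe writes need no read-modify-write: every
    data_drives host writes cost data_drives + 2 backend writes (P and Q).
    Assumes host_writes is a multiple of data_drives for simplicity."""
    stripes = host_writes // data_drives
    return stripes * (data_drives + 2)

# Example: 1000 strip-sized writes to an array with 10 data drives.
print(raid6_small_write_ios(1000))      # 6000 backend I/Os
print(raid6_full_stripe_ios(1000, 10))  # 1200 backend I/Os
```

Writing new data sequentially, as thin, compressed, and DRP volumes do, keeps the array in the cheaper full-stripe regime.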
DRAID configurations
Arrays per system:
• Up to 10 distributed arrays per I/O group, 32 per system
GUI preference (default stripe width):
• DRAID 5: 9 disks
• DRAID 6: 12 disks
Assigning spares per array:
• Up to 36 drives: 1 spare
• 37 - 72 drives: 2 spares
• 73 - 100 drives: 3 spares
• 101 - 128 drives: 4 spares
Distributed arrays are supported up to 10 per I/O group and 32 per system. When creating an array,
the GUI provides a default array stripe width (not the array width, which for DRAID is the same as
the drive count): 9 disks for DRAID 5 and 12 disks for DRAID 6. However, the preferred array width
can vary according to the number of disks and the number of tiers on the storage. The GUI
recommends between 40 and 80 drives, assuming that you have at least 40 of the drive class you
want to use. Typically, the best benefit for rebuild times is around 48 HDDs in a single DRAID; with
the rebuild load spread over that many drives, the task completes in a couple of hours with the least
impact. As you increase the number of drives, the rebuild time shortens further, although rebuild
behavior differs on an all-flash system compared to storage that merely includes some flash drives.
It is not recommended to have heterogeneous pools with different drive classes within the same tier.
Before IBM Spectrum Virtualize version 7.7.1, each array was assigned to a single CPU core. Now
a single DRAID array can achieve maximum system performance with DRAID multi-threading
across all cores, making full use of the multi-core environment.
The system assigns spare options by default, which varies according to the array size:
• Up to 36 disk drives: One spare rebuild area
• 37 - 72 disk drives: Two spare rebuild areas
• 73 - 100 disk drives: Three spare rebuild areas
• 101 - 128 disk drives: Four spare rebuild areas (the maximum per distributed array)
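The default spare assignments above can be summarized in a short sketch. This is illustrative only, not product code, and the function name is invented:

```python
# Illustrative mapping of distributed-array size to the default number of
# spare rebuild areas, per the ranges listed above. Not product code.

def default_rebuild_areas(drive_count: int) -> int:
    """Return the default number of spare rebuild areas for a distributed
    array with the given drive count (maximum 128 drives per array)."""
    if drive_count < 1 or drive_count > 128:
        raise ValueError("drive count must be between 1 and 128")
    if drive_count <= 36:
        return 1
    if drive_count <= 72:
        return 2
    if drive_count <= 100:
        return 3
    return 4  # 101 - 128 drives: the maximum per distributed array

print(default_rebuild_areas(48))  # 2 spare rebuild areas
```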
Use the lsdriveclass command to display information about all drive classes
Drive types
The system uses the following information to determine the drive class of each drive:
Block size (512 or 4096)
Capacity
I/O group
RPM speed (7.2 K, 10 K, or 15 K; blank for flash/SSD)
Technology (unknown, SAS_HDD, SAS_Nearline, or Flash)
An array can contain only drives from the same drive class or a superior drive class
(Figure: example of a superior drive class, comparing SAS and Nearline SAS drive classes by capacity and RPM speed)
To enhance performance of a distributed array, all of the drives must come from the same, or
superior, drive class. Each drive class can be identified by its drive_class_id. The system uses the
following information to determine the drive class of each drive:
• Block size (512 or 4096)
• Capacity
• I/O group
• RPM speed (7.2 K, 10 K, or 15 K; blank for flash and SSD)
• Technology (unknown, SAS_HDD, SAS_Nearline, or Flash)
When replacing a failed member drive in the distributed array, the system can use another drive
that has the same drive class as the failed drive, or the system can also select a drive from a
superior drive class. For example, two drive classes can contain drives of the same RPM speed,
technology type, and block size but different data capacities. In this case, the superior drive class is
the drive class that contains the higher capacity drives.
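The replacement rule can be sketched as follows. This is a hypothetical model, not actual product code; the `DriveClass` fields mirror the attributes listed above, but the names are invented:

```python
# Hypothetical model of the spare-selection rule described above: a candidate
# drive is acceptable when its class matches the failed drive's class, or is
# superior to it (same RPM speed, technology, and block size, higher capacity).

from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class DriveClass:
    block_size: int        # 512 or 4096
    capacity_gb: int
    rpm: Optional[int]     # None (blank) for flash/SSD
    technology: str        # e.g. "SAS_HDD", "SAS_Nearline", "Flash"

def is_acceptable_replacement(failed: DriveClass, candidate: DriveClass) -> bool:
    same_family = (candidate.block_size == failed.block_size
                   and candidate.rpm == failed.rpm
                   and candidate.technology == failed.technology)
    # Equal capacity means the same class; higher capacity, a superior class.
    return same_family and candidate.capacity_gb >= failed.capacity_gb

hdd_900 = DriveClass(512, 900, 10_000, "SAS_HDD")
hdd_1800 = DriveClass(512, 1800, 10_000, "SAS_HDD")
nearline = DriveClass(512, 4000, 7_200, "SAS_Nearline")

print(is_acceptable_replacement(hdd_900, hdd_1800))  # True: superior class
print(is_acceptable_replacement(hdd_900, nearline))  # False: different class
```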
Keywords
Distributed arrays • Distributed RAID (DRAID) • RAID 1 • RAID 5
Review questions (1 of 2)
1. What is the maximum number of distributed arrays allowed per system?
A. 10
B. 32
C. 48
D. 128
2. True or False: IBM Storwize V7000 supports conversion from traditional to distributed
RAID.
3. True or False: IBM Storwize V7000 supports the intermix of Flash, SAS, and SATA
drives in the same DRAID array.
Review answers (1 of 2)
1. What is the maximum number of distributed arrays allowed per system?
A. 10
B. 32
C. 48
D. 128
The answer is 32. A maximum of 10 arrays is supported per I/O group, 32 arrays per system, and
up to 128 drives per array.
2. True or False: IBM Storwize V7000 supports conversion from traditional to distributed
RAID.
The answer is False. Conversion from traditional to distributed RAID is not supported.
3. True or False: IBM Storwize V7000 supports the intermix of Flash, SAS, and SATA
drives in the same DRAID array.
The answer is False. For performance and redundancy purposes, you cannot intermix Flash, SAS, and
SATA drives in the same DRAID array.
Review questions (2 of 2)
4. Which of the following RAID arrays offers improved performance, faster rebuild times, and
protection against double drive failures?
A. RAID 5
B. DRAID 5
C. RAID 6
D. DRAID 6
E. DRAID 10
Review answers (2 of 2)
4. Which of the following RAID arrays offers improved performance, faster rebuild times, and
protection against double drive failures?
A. RAID 5
B. DRAID 5
C. RAID 6
D. DRAID 6
E. DRAID 10
The answer is DRAID 6. DRAID 6 offers the best rebuild-time performance and redundancy
protection against double drive failures.
Summary
Overview
This module describes the physical and logical requirements to prepare for installing,
initializing, configuring, and managing the IBM Storwize V7000 system.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
Copyright IBM Corporation
IBM Storwize V7000: installation and management access
System installation
Rack installation
Initial setup
System setup
This topic discusses the installation requirements and the initial system setup that is performed by
the client for an IBM Storwize V7000 storage enclosure.
IBM Storwize V7000 planning can be categorized into two types: physical planning and logical
planning. For a smooth and efficient installation, planning and preparation tasks must take place
before the system is scheduled for delivery and installed. A sales representative arranges a
Technical Delivery Assessment (also known as a TDA) meeting to go over site-specific details
and to ensure that the correct information is gathered before the delivery of the system. This
assessment meeting also includes your IBM installation planning representative, IBM service
support representative (also known as an SSR), and IBM Technical Advisor (TA).
An expert pre-install TDA is mandatory for all installations, and its focus is on the following:
• The installation and implementation plan, based on the client expectations, requirements, and
acceptance criteria.
• Before the systems are delivered, the physical requirements for the location where the
equipment will be installed must be verified. Therefore, IBM performs a site readiness review
covering the receiving and handling of the equipment according to guidelines, and security
access. During this time, IBM provides technical recommendations and assesses risks.
• IBM also verifies that the client's electrical supply is suitable to handle the workload, whether
the facility meets the cooling requirements (especially if a rear-door heat exchanger was part of
the purchase), and that there is adequate rack space for the Storwize V7000. This also includes
network requirements and other parameters.
• The IBM TA also ensures client readiness to support the product and/or solution.
Logical Planning
• System Management IP address
• Node and iSCSI IP addresses
• Domain name server (DNS)
• Simple Mail Transfer Protocol (SMTP) gateway
• Email sender address
• Network Time Protocol (NTP) server
• Time zone
• Remote access
• Contact information
In addition, a TDA pre-installation checklist and worksheets document must be completed for each
Storwize V7000 and given to the IBM Installation Planning Representative or IBM SSR. The
required information must be provided in each worksheet to prevent further inquiries and delays
during the installation.
Other configuration tasks, such as defining storage pools, volumes, and hosts, are the
responsibility of the storage administrator.
• System Management IP address: requires an IP address for each node with the appropriate
netmask, gateway, and any planned iSCSI IP addresses.
• All-flash storage enclosure IP address: uses an Ethernet port for management of the flash
enclosure.
• If Domain Name System (DNS) is used in your environment, then the systems must have the IP
address of the primary DNS server, and, if available, the secondary server.
• Simple Mail Transfer Protocol (SMTP) gateway: needed for event notification through email.
This allows the system to initiate an email notification, which is sent out through the
configured SMTP gateway (IP address or DNS name).
• Email sender address: required to show as the sender in the email notification.
• Network Time Protocol (NTP) server: if required, the system can use an NTP server to
synchronize the system time with other systems.
• Establish a time zone, based on the location where the system is installed.
• Establish remote access by using a remote support network connection with outbound
connectivity to the Internet.
• Finally, provide your contact information: the person who can authorize remote support access
and can enable the Storwize V7000 for remote support.
Before proceeding with the rack hardware configuration, ensure that all the required information is
collected and that it is valid:
1. Collect and document the number of hosts (application servers) to attach to the Storwize
V7000, the traffic profile activity (read or write, sequential, or random), and the performance
requirements (I/O per second - IOPS).
2. Collect and document the storage requirements and capacities:
▪ The total storage (2076-24F or backend storage) existing in the environment to be
provisioned on Storwize V7000.
▪ The required storage capacity for volumes: volume mirroring, FlashCopy, compressed
volumes, or remote copy.
▪ Per host: Storage capacity, the host logical unit number (LUN) quantity, and sizes.
▪ The required virtual storage capacity that is used as a fully managed volume and used as a
thin-provisioned volume.
3. Define the local and remote SAN fabrics and clustered systems, if a remote copy or a
secondary site is needed.
4. Define the number of clustered systems and the number of pairs of nodes (between one and
four) for each system. Each pair of nodes (an I/O group) is the container for the volume. The
number of necessary I/O groups depends on the overall performance requirements.
5. Design the SAN according to the requirements for high availability and best performance.
Consider the total number of ports and the bandwidth that is needed. Likewise, design the iSCSI
network according to the requirements for high availability and best performance, considering
the total number of ports and bandwidth that is needed between the hosts and the Storwize V7000.
6. Determine the Storwize V7000 service IP address, and the IP addresses for the Storwize V7000
system and for the host that connects through iSCSI.
7. Determine the IP addresses for IP replication.
8. Define a naming convention for the Storwize V7000 nodes, host, and storage subsystem.
Plan the details of the physical system setup by using the planning tables for cable connections,
configuration data, hardware location, and redundant AC power connection. The location charts
include a suggested layout as well as blank worksheets. The completed tables and charts are also
useful as a record of the installation.
Download the hardware location chart, the cable connection table, the configuration data table, and
the optional redundant ac-power switch connection chart from the following website:
• Go to: https://www.ibm.com/support/knowledgecenter/en/ST3FR7.
• Select the version or edition of IBM Storwize V7000 documentation.
• From the Storwize V7000 documentation, click Planning under Getting Started.
• Click the Planning worksheets.
Install the optional storage expansion enclosures
Connect expansion enclosures to node
8. Connect all power cords to the required components (DO NOT power on any devices).
9. Verify that the installation is correct and all cables are connected properly.
If you elect to have an IBM SSR install the Storwize V7000 control enclosures and storage
enclosures, they perform the same procedures to install and connect the components to each other,
to Fibre Channel switches, to the Ethernet management switch, and to the power distribution units.
This includes any cabling for an initial system cluster installation. However, for a scalable system
cluster installation, IBM lab-based services completes the cabling.
Each V7000 control enclosure, or I/O group, and each storage enclosure requires two IEC C13
power cable connections to connect to their 800 W and 1300 W power supplies. Country-specific
power cables are available for ordering to ensure that proper cabling is provided for the specific
region. A total of four power cords is required to connect the Storwize V7000 system to power.
For upstream high availability, each rack cabinet is equipped with dual power strips or in-cabinet
PDUs. One power cord from each enclosure is plugged into the left side in-cabinet PDU, and the
second power cord is plugged into the right side in-cabinet PDU. The in-cabinet PDUs are fed by
separate, redundant upstream power sources, which enables the cabinet to be split between two
independent power sources for greater upstream high availability. When adding more Storwize
V7000s to the system cluster, the same power cabling scheme should be continued for each
additional enclosure.
(Figure: rack layout of an I/O group with two node canisters and expansion enclosures 1 and 2, cabled up to 10 units up and 10 units down; on a 2076-24F expansion enclosure the blue pull tab must be below the cable, and on a Storwize V7000-724 node canister the blue pull tab must be above the connector)
When installing the Storwize V7000 into a rack environment, it is recommended to place the
2076-24F expansion units adjacent to their parent Storwize V7000 nodes in the rack. This helps
maintain simplicity and transparency of the system layout within the rack. Also, ensure that the SAS
cables are run within the cable management arms in the back of the Storwize V7000 nodes. A
single control enclosure can support up to 20 expansion enclosures in two chains: 10 in the upper
chain (above the control enclosure) and 10 in the lower chain.
The visual illustrates how to connect 2076-12F/24F to the Storwize V7000 using the supplied SAS
cables:
1. Connect the top node canister 1 to the expansion enclosures.
a. Connect SAS port 1 of node canister 1 to SAS port 1 of the left canister in expansion
enclosure 1.
b. Connect SAS port 3 of node canister 1 to SAS port 1 of the left canister in expansion
enclosure 2.
2. Connect the bottom node canister 2 to the expansion enclosures.
a. Connect SAS port 1 of node canister 2 to SAS port 1 of the right canister in expansion
enclosure 1.
b. Connect SAS port 3 of node canister 2 to SAS port 1 of the right canister in expansion
enclosure 2.
When connecting to a 2076-12F/24F expansion enclosure, the blue pull tab must be below the
cable; for a Storwize V7000 node, the blue pull tab must be above the connector. Insert the
connector gently until it clicks into place. If there is resistance, the connector is probably oriented
the wrong way; do not force it. When inserted correctly, the connector can be removed only by
pulling the tab.
This configuration creates two separate chains of one enclosure each. No cable can be connected
between a port on a left canister and a port on a right canister of the expansion enclosures.
Ensure that cables are installed in an orderly way to reduce the risk of cable damage when
replaceable units are removed or inserted. The cables need to be arranged to provide clear access
to Ethernet ports, including the technician port.
(Figure: rack layout for 5U expansion enclosures 1 and 2, cabled 4 and 6 units up and down; the 5U SAS expansion canisters are installed upside down, and on a 2076-24F expansion enclosure the blue pull tab must be below the cable)
The 5U High Density Drawer expansion canisters are physically identical to the 12F and 24F SAS
expansion canisters except that they are installed in reversed orientation. The SAS ports are
numbered from left to right as SAS port 1 and SAS port 2. SAS port 1, the IN port, connects up to
four data channels to the control enclosure or to the previous expansion enclosure in the chain,
with the control enclosure at the end of the chain. SAS port 2, the OUT port, connects to the next
expansion enclosure in the chain. Expansion enclosures can be included in a SAS chain, with up to
a total of four expansion enclosures per SAS chain.
(Table: available SAS cable options listed by feature code and product name; the cables use Mini SAS connectors)
The visual lists the available cable components for the Storwize V7000 2076-12F/24F expansion
enclosures. Both the Storwize V7000 2076 control enclosure and the expansion enclosure are
connected using the IBM 0.6 m, 1.5 m, 3.0 m, or 6.0 m 12 Gb SAS cable (mSAS HD to mSAS).
Check the IBM Interoperability Guide for the latest supported options. Cable requirements are
discussed in the installation unit.
Lost in translation
The cable-management arm is an optional feature and is used to efficiently route cables so that you
have proper access to the rear of the system. Cables are routed through the arm channel and
secured with cable ties or hook-and-loop fasteners. You should allow slack in the cables to avoid
strain in the cables as the cable management arm moves.
(Figure: power-on sequence; first power on any Fibre Channel or Ethernet network switches)
Once all the installation requirements are completed, use the following steps in the order that is
given to power on the system environment. Typically, the Storwize V7000s are implemented in
existing environments where the hosts are already in production.
1. Power on FC and Ethernet switches. This is necessary to access the nodes, and for inter-node
communication for the Storwize V7000 cluster to operate.
2. Power on the SAS expansion enclosures. Note that expansion enclosures do not have a power
button. Wait approximately one minute to allow the expansion enclosures to complete their
Power On Self-Tests (also referred to as POST) and discover the drives. SAS expansion
enclosures are powered on before the control enclosures or nodes to allow the SAS storage to
be discovered and configured on the attached control enclosures when they are powered on.
3. Power on the control nodes. The nodes begin to power on and begin their POST, which might
take as long as five minutes to complete. The node appears idle during this time. After POST is
complete, no yellow, amber, or fault LEDs should be lit and the green LEDs should be lit; you
can then power on (or restart) the host servers.
4. Finally, administrators can configure the Storwize V7000 cluster, storage pools, hosts and
VDisks for the hosts, and start applications.
To power off the devices, reverse the order of the procedures. Before doing so, you need to stop all
host I/O operations to the volumes on the system, and then shut down the hosts.
In order to create a Storwize V7000 clustered system, you must initialize the system by using the
technician service port. The technician port is designed to simplify and ease the initial basic
configuration of the Storwize V7000 storage system. This process requires the administrator to be
physically at the hardware site.
To initialize a node, you simply connect a personal computer (PC) to the technician port (Ethernet
port 4) on the rear of a node canister; only one node is required. This port can be identified by the
letter “T”. The node uses DHCP to configure the IP and DNS settings of the personal computer. If
your laptop is not DHCP enabled, configure the IP addresses manually; the node uses the system
default static IP address 192.168.0.1.
After the Ethernet port of the laptop is connected to the technician port, open a supported web
browser and point to https://install. If the node has Candidate status, you are automatically
redirected to the initialization wizard at 192.168.0.1. Otherwise, the service assistant interface is
displayed.
Administrators can access the interface of a node that uses its Ethernet port 1 service IP address
through either a web browser or open an SSH session.
To access the IBM Storwize V7000 management GUI, you need a supported web browser. To
ensure that you have the latest supported browser and that the appropriate settings are enabled,
visit the IBM Storwize V7000 Knowledge Center.
Your web browser might warn that the connection is untrusted because the system presents a
self-signed certificate; these certificate warnings are not harmful.
If required, acknowledge that you understand the risks and confirm the security exception to
continue.
If you are unable to connect to the management GUI from your web browser and receive a Page
not found or similar error, verify that both nodes are functioning. You cannot connect unless the
system is operational with at least one node online. If you know the service address of a node
canister, either use the service assistant to verify that the state of at least one node canister is
active, or, if the node canister is not active, use the LEDs to see whether any node canister state is
active.
System Setup configuration tasks:
• Verify system name
• Add license functions
• Set system date & time
• Enable encryption (license required)
• Configure Call Home (point of contact, location, and email address)
• Enable Remote Support
• Enable IBM Storage Insights
• Review summary
• Finalize
You need to complete the following System Setup tasks or have them completed by an IBM SSR or
IBM Business Partner:
1. If you’re redirected to the IBM Service Assistant GUI, you need to initialize the system for
management access by using the client-supplied worksheet. The SA provides a default user ID
(superuser) and password (passw0rd with a zero “0” instead of the letter “o”). Administrators
can also use the Service Assistant IP address to access the GUI and perform recovery tasks
and other service-related issues.
2. Next, open a browser window and log in to the Storwize V7000 management GUI using the
management IP address that was established during the initialization.
3. Follow the System Setup Wizard to verify the system name, add license functions as required,
set date and time, enable system encryption (license required), configure Call Home, enable
Remote Support, and enable IBM Storage Insights.
4. Review Summary. If modifications are required, you can use the Back button to do so.
5. Finalize the system setup.
The customer's SAN, LAN, and host administrators are also involved in the setup process. IP
addresses from the LAN team are entered into the SA GUI, and inter-node communications must be
set up either via SAN zones between the node ports or via 25 Gb Ethernet ports, prior to cluster
initialization. Eventually, host and backend storage administrators get involved in migrating
workloads into the environment.
Log in to the management GUI
Add additional nodes
Verify software code level
Other features:
• User authentication
• Secure communications
After successfully completing all mandatory steps of the initial configuration, log in to the
management GUI to verify the system settings. During the initialization, only two nodes on the
fabric are configured into a cluster and presented automatically. If you have purchased more
control enclosures, you must manually add them to the cluster. After all control enclosures are part
of the cluster, you can install the optional expansion enclosures. However, it is strongly
recommended to ensure that the system is running at the appropriate firmware level before
configuring your system storage.
When a new enclosure is cabled correctly to the system, the Add Enclosures action automatically
displays on the System > Overview page. If this action does not appear, review the installation
instructions to ensure the new enclosure is cabled correctly. You can also add a new enclosure by
selecting Add Enclosure from the System Actions menu.
You might be required to complete additional administration settings that were not completed
during the system setup wizard, such as Call Home. In this case, the GUI offers suggested tasks as
reminders, such as creating a volume and configuring a storage pool. You can perform the tasks
directly from the pop-up window or cancel them and run the procedure later at any convenient time.
Optionally, you can configure other features, such as user authentication and secure
communications.
System installation
System setup
Management IP
SAN zoning
Management interfaces
This topic discusses the client requirements for completing the system setup of the Storwize V7000.
Required IP addresses
Cluster management IP address on Ethernet port 1
One management IP address per cluster
Node service IP address on Ethernet port 1
One per node
Optional IP addresses
Alternate cluster management IP address (Ethernet port 2 on a separate subnet)
iSCSI IP addresses
Both IPv4 and IPv6 address formats are supported
(Figure: management ports ETH1 - ETH4 on a 10 GbE cluster; a 10 GbE port is used for system initialization)
The management IP addresses are assigned during the initialization of the system and represent a
set of nodes on the system that contains the management GUI and the command-line interface,
which manage the system. A management IP address is required to manage the Storwize V7000
storage system through either a graphical user interface (GUI), command-line interface (CLI)
accessed by using a Secure Shell connection (SSH), or by using an embedded CIMOM that
supports the Storage Management Initiative Specification (SMI-S). The system IP address is also
used to access remote services like authentication servers, NTP, SNMP, SMTP, and syslog
systems, if configured. Each Storwize V7000 control enclosure contains a default management IP
address that can be changed to allow the device to be managed on a different address than the IP
address assigned to the interface for data traffic.
The system cluster requires the following IP addresses:
• Cluster management IP address: Address used for all normal configuration and service access
to the cluster. There are two management IP ports on each control enclosure. Port 1 is required
to be configured as the port for cluster management. Both Internet Protocol Version 4 (IPv4)
and Internet Protocol Version 6 (IPv6) are supported.
• Service assistant IP address: One address per node. The cluster operates without the nodes’
service IP addresses but it is highly recommended that each node is assigned an IP address
for service-related actions.
• A 10/100/1000-Mb Ethernet connection is required for each cable.
An alternative management IP address on another subnet can be configured on Ethernet port 2.
This provides redundancy to manage the cluster in case of a LAN subnet failure. Note that the
cluster continues to handle SAN-based I/O even if the IP network is down.
iSCSI addresses can also be assigned to any of the Ethernet ports.
(Figure: management interfaces to the control node (2 - 8 nodes) over Ethernet: the embedded GUI via a web browser over HTTPS with a password and best-practices presets, the CLI over SSH with a key or password, and the embedded CIMOM providing an SMI-S CIM interface to any resource manager)
The Storwize V7000 simplifies storage management by providing a single image for multiple
controllers and a consistent user interface for provisioning heterogeneous storage.
The Storwize V7000 provides cluster management interfaces that include:
• An embedded graphical user interface (GUI) that supports a web browser connection for
configuration management and shares a common source code base with other IBM Spectrum
Virtualize products.
• A command-line interface (CLI) accessed by using a Secure Shell connection (SSH) with
PuTTY.
• An embedded CIMOM that supports the SMI-S, which allows any CIM-compliant resource
manager to communicate and manage the system cluster.
To access the cluster for management, there are two user authentication methods available:
• Local authentication: Local users are those managed within the cluster, that is, without using
a remote authentication service. Local users are created with a password to access the
Storwize V7000 GUI, and/or assigned an SSH key pair (public/private) to access the CLI.
• Remote authentication: Remote users are defined and authenticated by a remote
authentication service. The remote authentication service enables integration of system with
LDAP (or MS Active Directory) to support single sign-on. We take a closer look at the remote
authentication method later in this unit.
Configuration Node
• Owns cluster IP address (up to two addresses)
• Provides configuration interface to cluster
Boss Node
• Controls cluster state updates
• Propagates cluster state data to all nodes
An IBM Storwize V7000 cluster can contain up to four I/O groups, that is, an eight-node system.
When the initial node is used to create a cluster, it automatically becomes the configuration node
for the system cluster. The configuration node responds to the cluster IP address and provides the
configuration interface to the cluster. All configuration management and services are performed at
the cluster level. If the configuration node fails, another node is automatically chosen to be the
configuration node, and it takes over the cluster IP address. Thus, configuration access to the
cluster remains unchanged.
The cluster state holds all configuration and internal cluster data for the cluster. This cluster state
information is held in non-volatile memory of each node. If the main power supply fails, then the
battery modules maintain battery power long enough for the cluster state information to be stored
on the internal disk of each control enclosure. The read/write cache information is also held in
non-volatile memory. If power fails to a node, then the cached data is written to the internal disk.
A control enclosure in the cluster serves as the Boss node. The Boss node ensures
synchronization and controls the updating of the cluster state. When a request is made in a node
that results in a change being made to the cluster state data, that node notifies the boss node of the
change. The boss node then forwards the change to all nodes (including the requesting node) and
all the nodes make the state-change at the same point in time. This ensures that all nodes in the
cluster have the same cluster state data. The system cluster time can be obtained from an NTP
(Network Time Protocol) server for time synchronization.
System installation
System setup
Management IP
SAN zoning
Management interfaces
This topic discusses the SAN zoning requirements for a Storwize V7000 clustered system.
[Figure: dual-fabric SAN zoning example — host zones and intra-cluster zones with redundancy
across Fabric 1 and Fabric 2 for the SVC/V7000 nodes, plus storage zones for external storage
(DS8800, FlashSystem 900 AE3, Storwize V5030) and local storage (DS3500)]
The SAN fabric zones allow the IBM Spectrum Virtualize system nodes to see each other,
including other supported control enclosures being added to an I/O group and all storage devices
attached to the fabrics, and allow all hosts to see only the managing controllers. SAN zoning
configuration is implemented at the switch level. Ensure that the system I/O groups are zoned
correctly and are part of the same storage area network (SAN). For high availability, a dual fabric
network that uses two independent fabrics or SANs (up to four fabrics are supported) is
recommended.
The switches can be configured into four distinct types of fabric zones:
• Intra-cluster zoning needs to be configured prior to creating a cluster to allow controller or
node communication. This requires up to two zones per fabric on the switches, each including
a single port per node designated for intra-cluster traffic. A Spectrum Virtualize system can be
connected to up to four fabrics. No more than four ports per node should be allocated to
intra-cluster traffic.
• A host zone consists of the control enclosure and the host. You need to create a host zone for
every server that needs access to storage from the controller. A single host should not have
more than eight paths to an I/O group. IBM Storwize V7000 supports up to three inter-switch
link (ISL) hops in the fabric, which means that connectivity between the server and the control
enclosure can be separated by up to five FC links, four of which can be 10 km long (6.2 miles) if
longwave small form-factor pluggables (SFPs) are used.
• A storage zone is a single zone that consists of all the storage systems that are virtualized by
a Spectrum Virtualize control enclosure. If you plan to virtualize storage behind the Storwize
V7000, such as another V7000, it is zoned as external FC storage to the Spectrum Virtualize
control enclosure.
• A remote copy zone is an optional zone to support Copy Services features for Metro Mirror
and Global Mirror operations if the feature is licensed. This zone contains half of the system
ports of the system clusters in partnership. When using a Spectrum Virtualize system for
remote mirroring, alternatives exist for connecting the clusters through the Ethernet ports on
the system, or for interconnecting the SAN fabrics at both sites.
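To make the four zone types concrete, the following sketch shows how they might be defined on one fabric of a Brocade switch using Fabric OS commands. All alias names and WWPNs here are invented for illustration, and other switch vendors use different syntax; consult your switch documentation for the exact commands.

```
alicreate "V7K_N1P1", "50:05:07:68:01:10:ab:01"
alicreate "V7K_N2P1", "50:05:07:68:01:10:ab:02"
alicreate "HOST1_HBA0", "10:00:00:00:c9:11:22:33"
alicreate "DS3500_C1P1", "20:16:00:a0:b8:11:22:33"

zonecreate "Z_INTRACLUSTER", "V7K_N1P1; V7K_N2P1"
zonecreate "Z_HOST1", "HOST1_HBA0; V7K_N1P1; V7K_N2P1"
zonecreate "Z_STORAGE", "V7K_N1P1; V7K_N2P1; DS3500_C1P1"

cfgcreate "CFG_FAB1", "Z_INTRACLUSTER; Z_HOST1; Z_STORAGE"
cfgenable "CFG_FAB1"
```

The same set of zones, built from that fabric's ports, would be repeated on the second fabric under its own configuration name.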
To ensure proper performance and to maintain application high availability in the unlikely event of
an individual node canister failure, it is recommended that all V7000 control enclosures in a
clustered system be on dual SAN fabrics, with each Storwize V7000 node's adapter ports spread
evenly across both fabrics. All V7000 control enclosures must also be on the same local
area network (LAN) segment, which allows for any node in the clustered system to assume the
clustered system management IP address. For a dual LAN segment, port 1 of every node is
connected to the first LAN segment, and port 2 of every node is connected to the second LAN
segment. Therefore, if a node fails or is removed from the configuration, the remaining node
operates in a degraded mode, but the configuration is still valid for the I/O Group.
The visual lists options that represent optimal configurations based on port assignment to function.
Using the same port assignment but different physical locations will not have any significant
performance impact in most client environments.
This recommendation provides the desired traffic isolation while also simplifying migration from
existing configurations with only 4 ports, or later migration from 8-port or 12-port configurations to
configurations with additional ports. More complicated port mapping configurations that spread
the port traffic across the adapters are supported and can be considered, but these approaches
do not appreciably increase availability of the solution, because the mean time between failures
(MTBF) of the adapter is not significantly less than that of the non-redundant node components.
[Table: recommended port-to-function assignment by slot and port number, per SAN fabric, for
4-port, 8-port, 12-port, and 16-port nodes]
Figure 6-26. Storwize V7000 port assignment recommendations for isolating traffic
[Figure: zoning methods — fabric zoning by port or by WWPN, switch domain IDs, and LUN
masking]
Two different zoning methods can be used: zoning by port, or the more common practice of
zoning by WWPN. When zoning by port, if a cable is moved to another port on a switch or to
another switch on the fabric, the zoning definition must be changed. However, if the HIC on the
host is changed (which results in a new WWPN on that HIC), the port zoning does not need to be
changed.
Zoning by WWPN provides granularity at the adapter port level. If the cable is moved to another
port or to a different switch in the fabric, the zoning definition is not affected. However, if the HIC is
replaced and the WWPN is changed (this does not apply to the Spectrum Virtualize system
WWPNs), then the zoning definition needs to be updated accordingly.
When zoning by switch domain ID, ensure that all switch domain IDs are unique between both
fabrics and that the switch name incorporates the domain ID. Having a unique domain ID makes
troubleshooting problems much easier in situations where an error message contains the Fibre
Channel ID of the port with a problem. For example, have all domain IDs in the first fabric start
with 10 and all domain IDs in the second fabric start with 20.
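The domain ID convention above lends itself to a quick mechanical check. The sketch below (the domain ID lists are invented sample values, not from a real switch inventory) verifies that no switch domain ID appears in both fabrics:

```shell
#!/bin/sh
# Hypothetical switch domain IDs collected from each fabric
FAB1_IDS="10 11 12"
FAB2_IDS="20 21 22"

# Collect any ID that appears in both lists
dup=""
for id in $FAB1_IDS; do
  for jd in $FAB2_IDS; do
    [ "$id" = "$jd" ] && dup="$dup $id"
  done
done

if [ -z "$dup" ]; then
  echo "domain IDs are unique across fabrics"
else
  echo "duplicate domain IDs:$dup"
fi
```

With the sample lists above, the script reports that the IDs are unique; any overlap would be printed for investigation.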
[Figure: host HBA and node HBA each identified on the fabric by a Port name (WWPN) and an
N_Port ID]
IBM storage uses a methodology whereby each world wide port name (WWPN) is a child of the
world wide node name (WWNN). The unique world wide name (WWN) is used to identify the
Fibre Channel storage device in a storage area network (SAN). This means that if you know
the WWPN of a port, you can easily identify the vendor and match it to the WWNN of the
storage device that owns that port.
[Tables: WWN layouts — the standard, extended, and registered name formats, each split into a
format identifier, a company ID (OUI), and vendor-specific information]
Each N_port on a storage device contains a persistent World Wide Port Name (WWPN) of
16 hexadecimal digits, that is, 8 bytes.
The first table is an example of an Emulex HBA IEEE Standard format (10). Section 1 identifies
the WWN as a standard format WWN. Only one of the 4 digits is used; the other three must be
zero filled. Section 2 is called the OUI or “company_id” and identifies the vendor (more on this
later). Section 3 is a unique identifier created by the vendor.
Our next example is a QLogic HBA identifying an IEEE Extended format (20). Section 1
identifies the WWN as an extended format WWN. Section 2 is a vendor-specific code and can
be used to identify specific ports on a node or to extend the serial number (Section 4) of the
WWN. Section 3 identifies the vendor. Section 4 is the unique vendor-supplied serial number for
the device.
The last two tables identify the vendor IEEE Registered Name format of the WWN. This is
referred to as Format 5, which enables vendors to create unique identifiers without having to
maintain a database of serial number codes. IBM owns the 005076 company ID. Section 1: 5
identifies the registered name WWN. Section 2: 00:05:07:6 identifies the vendor. Section 3:
3:00:c7:01:99 is a vendor-specific generated code, usually based on the serial number of the
device, such as a disk subsystem.
All vendors wishing to create WWNs must register for a company ID or OUI (Organizationally
Unique Identifier). These are maintained and published by IEEE.
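Using the registered-format example above (format nibble 5, IBM OUI 005076, vendor code 3:00:c7:01:99), the three sections can be pulled apart with standard shell tools. This is only a parsing sketch; the colon-separated WWN string is assembled from the values quoted in the text:

```shell
#!/bin/sh
# Registered-format (Format 5) WWN built from the example in the text
WWN="50:05:07:63:00:c7:01:99"

hex=$(echo "$WWN" | tr -d ':')   # 16 hex digits = 8 bytes
naa=$(echo "$hex" | cut -c1)     # Section 1: format nibble (5 = registered name)
oui=$(echo "$hex" | cut -c2-7)   # Section 2: company ID (005076 = IBM)
vsn=$(echo "$hex" | cut -c8-16)  # Section 3: vendor-specific code
echo "format=$naa oui=$oui vendor=$vsn"
# → format=5 oui=005076 vendor=300c70199
```

The same split (1 + 6 + 9 nibbles) recovers the vendor of any Format 5 WWN from its OUI field.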
[Figure: Storwize V7000 8-node cluster with 4 FC ports per node (8 per I/O group), ports P1-P4
of each node split across SAN Fabric 1 and SAN Fabric 2; for 8 Gb, connect the P3 and P4 ports
to the same fabrics as the P1 and P2 ports respectively]
This example illustrates a Storwize V7000 scalable building block in which four ports per I/O
group, and two IBM FlashSystem 900 ports per canister, are connected to a redundant fabric by
using 8 Gb FC connections. Both systems' ports are evenly split between the two SAN fabrics.
[Figure: Storwize V7000 scalable building block — four I/O groups (nodes 1-8) with 4 FC ports
per node (8 per I/O group), attached to a DS3500 whose Controller A channels 1 and 3 and
Controller B channels 2 and 4 are split across the fabrics]
This example illustrates a Storwize V7000 scalable building block in which four ports per I/O
group, and two IBM System Storage DS3500 ports, are connected to a redundant fabric by using
8 Gb FC connections. Both systems' ports are evenly split between the two SAN fabrics.
This follows the best practice of alternating FC adapter ports across SAN fabrics, which typically
yields more application bandwidth under FC cable, port, or fabric failure conditions.
SAN zoning connectivity of a Storwize V7000 environment can be verified by using the system
management GUI: select Settings > Network and then select Fibre Channel Connectivity in the
Network filter list. The Fibre Channel Connectivity view displays SAN connectivity data as seen
by the Storwize V7000 cluster, that is, the port-to-port connectivity between the node ports in the
cluster and the attached host ports and storage system ports that are attached through the Fibre
Channel network.
This output allows you to verify that the SAN cabling and zoning are properly set up. For
example, if you select a host, the paths from the V7000 to the host are listed and you can
compare them to the planned design. Note that this output does not tell you on which fabric, in
the typical dual SAN fabric environment, each port resides.
Each row in the example output shows an I/O path as reflected by the local and remote WWPN
pair, where the local WWPN is a Storwize V7000 port, and the remote WWPN is another Storwize
V7000, host or storage port. The first four rows represent two inter-node I/O paths between node 1
and node 2. Note that the paths between nodes in the cluster are listed twice; once from each
node's point of view. Paths to hosts and storage (or remote Storwize V7000s), which are the ones
listed after the first four, are only listed once.
The host ports are initiators to the Storwize V7000 target ports, while Storwize V7000 ports are
initiators to the back end storage. It is common that Storwize V7000 ports act as both initiators and
targets.
The Fibre Channel Connectivity view is useful in validating the connections that arise from the
cabling and zoning. An initiator port must have completed the port login (PLOGI) process before
it can be listed.
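The same connectivity data is available from the CLI via the lsfabric command. The sketch below works on canned sample output rather than a live cluster; the WWPNs and the simplified colon-delimited column layout are invented for illustration. It counts how many logins (paths) each remote WWPN has established:

```shell
#!/bin/sh
# Simplified, invented lsfabric-style output: remote_wwpn:local_wwpn:state
cat > /tmp/lsfabric_sample.txt <<'EOF'
100000109b112233:500507680110ab01:active
100000109b112233:500507680110ab02:active
500507680199cc01:500507680110ab01:active
500507680199cc01:500507680110ab02:active
EOF

# Count paths per remote WWPN; each remote port here has two logins
cut -d: -f1 /tmp/lsfabric_sample.txt | sort | uniq -c
```

A remote WWPN showing fewer logins than planned usually points to a missing zone member or an unplugged cable.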
Multiple ports or connections from a given storage system can be defined to provide greater data
bandwidth and more availability.
[Figure: host attached to FC Switch A and FC Switch B, with LUN masking at the storage
system; LUN sharing requires additional software]
A host system is generally equipped with two HBAs requiring one to be attached to each fabric.
Each storage system also attaches to each fabric with one or more adapter ports. A dual fabric is
also highly recommended when integrating the Storwize V7000 into the SAN infrastructure.
LUN masking is typically implemented in the storage system, and in an analogous manner in the
Storwize V7000, to ensure data access integrity across multiple heterogeneous or homogeneous
host servers. Zoning is often deployed to complement LUN masking to ensure resource access
integrity. Issues that are related to LUN or volume sharing across host servers are not changed by
the Storwize V7000 implementation. Additional shared access software, such as clustering
software, is still required if sharing is desired.
Another aspect of zoning is to limit the number of paths among ports across the SAN and thus
reducing the number of instances the same LUN is reported to a host operating system.
The SDD provides multipath support for certain OS environments that do not have native MPIO
capability. SDD also enhances the functions of the Windows DSM and AIX PCM MPIO frameworks.
For availability, host systems generally have two HBA ports installed, and storage systems
typically have multiple ports as well. The number of instances of the same LUN increases as
more ports are added. In a SAN environment, a host system with multiple Fibre Channel adapter
ports that connect through a switch to multiple storage ports is considered to have multiple paths.
Because of these multiple paths, the same LUN is reported to the host system more than once.
For coexistence and gradual conversion to the Storwize V7000 environment, a storage system
RAID controller might present LUNs to both the Storwize V7000 and other hosts attached to the
SAN. Depending on certain restrictions, a host might access SCSI LUNs presented either directly
from the storage system or indirectly as volumes from the Storwize V7000. Besides adhering to
the support matrix for storage system type and model, HBA brand and firmware levels, device
driver levels and multipath driver coexistence, and OS platform and software levels, the fabric
zoning must be implemented to ensure resource access integrity as well as multipathing support
for high availability.
Although attached storage is supported, it is the Storwize V7000 and not the individual host
systems that interacts with these storage systems, their device drivers, and multipath drivers.
[Figure: I/O Group 0 — logical ports mapped to physical ports 1-4 on HIC 1 and HIC 2 of Node 1
and Node 2, with volume V1 owned by its preferred node]
By default, the Storwize V7000 GUI assigns ownership of even-numbered volumes to one node of
a caching pair and ownership of odd-numbered volumes to the other node. When a volume is
assigned to a V7000 node at creation, this node is known as the preferred node through which the
volume is normally accessed. The preferred node is responsible for I/Os for the volume and
coordinates sending the I/Os to the alternative node.
This illustration shows a 2-node system with dual paths from the host HBA through both fabrics to
the Storwize V7000 I/O group. Each host HBA port is zoned with one port of each V7000 node of
an I/O group in a four-path environment. When the first volume (vdisk1), whose preferred node is
NODE 1, is accessed for I/O, the path selection algorithm load balances across the two preferred
paths to NODE 1. The other two non-preferred paths defined in this zone are to NODE 2, which is
the alternate node for volume vdisk1.
The reason for not assigning one HBA to each node is that, under the preferred node scheme,
one node serves solely as a backup node for any specific volume; the load is never balanced
across nodes for that particular volume. Therefore, it is better to balance load by I/O group
instead, so that volumes are assigned to nodes automatically.
[Figure: dual SAN fabrics attached to an 8-node Storwize V7000 cluster, with each node's ports
1-4 split across the two fabrics]
Multiple fabrics increase the redundancy and resilience of the SAN by duplicating the fabric
infrastructure. With multiple fabrics, the hosts and the resources have simultaneous access to both
fabrics, and have zoning to allow multiple paths over each fabric.
In this example, the host has two HBAs installed, and each port of the HBA is connected to a
separate SAN switch. This allows the host to have multiple paths to its resources. This also means
that the zoning has to be done in each fabric separately. If there is a complete failure in one fabric,
the host can still access the resources through the second fabric.
To track host zoning, you can create a worksheet that documents which host ports should be
assigned to which Storwize V7000 ports to ensure the workload is spread across the Storwize
V7000 HBA ports. This might be particularly helpful when host ports are set up with four paths to
the Storwize V7000 I/O group.
Not all OS platforms recommend or support eight (or even four) paths between the host ports and
the I/O group. Consult the Storwize V7000 Information Center for platform-specific host attachment
details.
System installation
System setup
Management IP
SAN zoning
Management interfaces
Management GUI
This topic discusses the management requirements for accessing the graphical user interface
(GUI) and assigning of user IDs and roles. This topic also describes the steps that are required to
configure an SSH (PuTTYGen) connection and create user authentication for access.
The Dashboard is the default home page that provides a high-level view of information about the
system, its performance, capacity, and system health.
The Dashboard serves as a quick view to assess the overall condition of the system and view
notifications of any critical issues that require immediate action.
• At-a-glance overview of performance, capacity, and system health.
• Enhancements for use with mobile devices, including event-flag-based performance charts.
• Performance graphs overlaid with events.
• Improvements to “strongly encourage” enabling of Call Home and Remote Access.
• A capacity-over-time view that gives clients more insight into how they are using their capacity
over time.
The Access > Users option can be used to perform user administration tasks such as creating
new users, applying password administration, and adding and removing SSH keys. By default, a
superuser ID and default password are generated. The superuser operates in the security
administrator role, which provides the ability to manage all functions of the system, including
managing users, user groups, and user authentication, and can run any system commands from
the command-line interface (CLI). You can assign other users to the security administrator role;
however, only the superuser can run sainfo or satask commands, or access the Service
Assistant interface on a node.
When Remote Support Assistance has been enabled, either during the system setup wizard or as
a stand-alone enablement, the system generates two user IDs. These are support remote access
users (sra_IDs) to enable system access. These IDs are restricted and only used by IBM Support
personnel. When logging in with an sra_ID, the support personnel must respond to the challenge
code presented with a response code that is received from the IBM Support Center.
[Slide summary of user roles]
• Security Administrator: access to all the functions provided by both the management GUI and
CLI, including those related to managing users, user groups, and authentication
• Service: a limited command set related to servicing the cluster, plus all the functions
associated with the monitor role
• Monitor: does not have the authority to change the state of the cluster or cluster resources
• RestrictedAdmin: can perform the same tasks and run most of the same commands as
administrator-role users
There are six default user groups and roles. When adding a new user to a group, the user must be
associated with one of the corresponding roles:
• Security Administrator: User has access to all the functions provided by both the management
GUI and CLI including those related to managing users, user groups, and authentication.
• Administrator: User has access to all the functions provided by both the management GUI and
CLI except those related to managing users, user groups, and authentication.
• Copy Operator: User has the authority to start, modify, change direction, and stop FlashCopy
mappings and Remote Copy relationships at the stand-alone or consistency group level, but
cannot create or delete definitions. The user has access to all the functions associated with
monitor role.
• Service: User has a limited command set related to servicing the cluster. It is designed primarily
for IBM service personnel. The user has access to all the functions associated with monitor
role.
• Monitor: User can access panes and commands, back up configuration data, initiate changes
to its own password and SSH key, and issue the following commands: finderr, dumperrlog,
dumpinternallog, and chcurrentuser. This role cannot perform actions that change the state of
the system or the resources that the system manages.
• RestrictedAdmin: User can perform the same tasks and run most of the same commands as
administrator-role users. However, users with the Restricted Administrator role are not
authorized to run the rmvdisk, rmvdiskhostmap, rmhost, or rmmdiskgrp commands and
cannot remove volumes, host mappings, hosts, or pools. When secure remote assist is enabled
on the system, support personnel can be assigned this role to help resolve errors and fix
problems.
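As a sketch of how these roles are applied from the CLI, the commands below create a group with the Monitor role and a local user in it. The group name, user name, and password are invented for illustration; check the mkusergrp and mkuser entries in the CLI reference for your code level before relying on the exact parameters.

```
mkusergrp -name OpsMonitor -role Monitor
mkuser -name jdoe -usergrp OpsMonitor -password Passw0rd1
```

The user jdoe then inherits the Monitor role from the group, per the role-based model described above.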
Administrators can create role-based user groups where any users that are added to the group
adopt the role that is assigned to that group. Roles apply to both local and remote users on the
system and are based on the user group to which the user belongs. A local user can only belong to
a single group; therefore, the role of a local user is defined by the single group to which that user
belongs.
The User Group navigation pane lists the user groups predefined in the system. To create a user
group, you must define its role. Once the group is created, you can determine the authentication
type and the number of users who are assigned to this group.
When a clustered system is created, the authentication settings default to local, which means that
the system contains a local database of users and their privileges. Users can be created on the
system with the accounts that they are given by the local superuser account. With a valid
password and username, users are allowed to log in to both the GUI and the CLI with the defined
access level privileges. If a password is not configured, the user cannot log in to the GUI.
SSH keys are not required for CLI access; you can choose either SSH keys or a password for CLI
authentication. The CLI can be accessed with a pair of public and private SSH keys. A public key
is uploaded to the Storwize V7000 for each user, while the private key is typically stored on the
user's local system in a connection profile for the program used to access the CLI, such as
PuTTY.
IBM Storwize V7000 supports remote authentication using LDAP. This enables authentication with
a domain user name and password instead of a locally defined user name. If the enterprise has
multiple Storwize V7000 clusters, user names no longer need to be defined on each of these
systems; centralized user management is at the domain controller level instead of on the
individual Storwize V7000 clusters.
Before configuring authentication for a remote user, you first verify that the remote authentication
service is configured for the SAN management application. You also need to configure remote
authentication before you can create a new user.
To configure the remote authentication service, navigate to the Directory Services window and
click Configure Remote Authentication. The supported types of LDAP servers are IBM Tivoli
Directory Server, Microsoft Active Directory (MS AD), and OpenLDAP (running on a Linux
system).
The user that is authenticated remotely by an LDAP server is granted permission on the Storwize
V7000 system according to the role that is assigned to the group of which the user is a member.
That is, the user group must exist with an identical name on the system and on the LDAP server for
the remote authentication to succeed.
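On the CLI, remote authentication is configured with commands along the following lines. The server address, bind credentials, and group name are invented, and parameter names vary between code levels; verify the chldap, mkldapserver, and mkusergrp entries in the CLI reference before use.

```
chldap -type ad -username svc_bind@example.com -password BindPwd1
mkldapserver -ip 192.0.2.25
mkusergrp -name StorageAdmins -role Administrator -remote
```

The last command mirrors the requirement stated above: a group named StorageAdmins must also exist on the LDAP server for its members to authenticate remotely.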
[Figure: SSH key setup — the public key is installed in the cluster as part of the user definition,
enabling secure communications between the workstation and the Storwize V7000]
To use the CLI, the PuTTY program (on any workstation with PuTTY installed) must be set up to
provide the SSH connection to the Storwize V7000 cluster. The command-line interface (CLI)
commands use the Secure Shell (SSH) connection between the SSH client software on the host
system and the SSH server on the system cluster. For Windows environments, the Windows SSH
client program PuTTY can be downloaded.
A configured PuTTY session that uses a generated Secure Shell (SSH) key pair (Private and
Public) is needed to use the CLI. The key pair is associated with a given user. The user and its key
association are defined by using the superuser. The public key is stored in the system cluster as
part of the user definition process. When the client (for example, a workstation) tries to connect and
use the CLI, the private key on the client is used to authenticate with its public key stored in the
system cluster.
The CLI can be accessed by using a password instead of SSH keys. However, when invoking
commands from scripts, using the SSH key interface is recommended because it is more secure.
Since most desktop workstations are Windows-based, we use PuTTY examples.
To generate a key pair on the local host, you need to specify the key type. PuTTYgen defaults to
SSH-2 RSA, which is recommended because it provides a better security level.
SSH2 is separated into modules and consists of three protocols working together:
• SSH Transport Layer Protocol (SSH-TRANS)
• SSH Authentication Protocol (SSH-AUTH)
• SSH Connection Protocol (SSH-CONN)
The SSH-TRANS protocol is the fundamental building block, which provides the initial connection,
packet protocol, server authentication, basic encryption services, and integrity services. PuTTYgen
supports key sizes up to 4096 bits and defaults to 1024; however, a minimum of 2048 is
recommended. Once you have chosen the type of key pair to generate, click Generate. This
procedure generates random characters that are used to create a unique key.
A helpful tip is to move the cursor over the blank area in the Key Generator window until the
progress bar reaches the far right. Movement of the cursor causes the keys to be generated faster.
The progress bar moves faster with more mouse movement.
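On Linux or macOS workstations (and recent Windows builds with OpenSSH), the equivalent of the PuTTYgen steps is a single ssh-keygen invocation. The file name, comment, and empty passphrase below are illustrative choices, not values from the course:

```shell
#!/bin/sh
# Generate a 2048-bit SSH-2 RSA key pair, as recommended above
rm -f ./v7000_key ./v7000_key.pub
ssh-keygen -t rsa -b 2048 -f ./v7000_key -N "" -C "user@workstation"

# The .pub file is the public key to upload to the cluster;
# keep the private key (./v7000_key) on the workstation.
ls -l ./v7000_key ./v7000_key.pub
```

Note that OpenSSH writes its native key format; PuTTY expects a .PPK private key, so an OpenSSH key must be imported into PuTTYgen before PuTTY can use it.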
[Figure: PuTTYgen result — saving the public key (for example, \Keys\PUBLICKEY.PUB) and the
private key in the \Keys folder]
The result of the key generation shows the public key (in the box that is labeled Public key for
pasting into OpenSSH authorized_keys file).
The Key comment enables you to generate multiple keys. Therefore, it is recommended to set this
to username@hostname for easy identification.
The Key passphrase is an additional way to protect the private key and is never transmitted over
the internet. If you set a passphrase, you are asked to enter it before any connection is made
through SSH. If you cannot remember the key passphrase, there is no way to recover it.
Save the generated keys by using the Save private key and Save public key buttons respectively.
You are prompted for the name and location of the file in which to save each key. The default
location is C:\Support Utils\PuTTY. If another location is chosen, make a record of it for later
reference.
The public key can be saved in any format, such as *.PUB or *.txt. The public key is stored in
the cluster as part of user management. However, the private key uses the PuTTY format of
*.PPK, which is required for authentication.
The SSH-AUTH protocol defines three authentication methods: public key, host based, and
password. Each method is used over the SSH-TRANS connection to authenticate the client to
the server.
After generating the SSH key pair, if a user requires CLI access for Storwize V7000
management through SSH, you must upload a valid SSH public key file for the user definition on
the Storwize V7000. The SSH public key option can also be configured later, after user creation;
in this case, a password for the user is required.
To upload the SSH public key for an existing user, right-click the user and select Properties.
From the Create User pane, click the Browse button, which opens Windows Explorer. Navigate
to the \Keys folder to upload the public.PPK file, and click Create. The CLI mkuser command is
generated to define or add a user with SSH key authentication for a no-password CLI login.
This is an optional feature for users and is not compulsory for Storwize V7000 management.
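The equivalent step can also be issued directly on the CLI, uploading the public key file when the user is defined. The user name, group, and key path below are invented; confirm the -keyfile parameter of mkuser (and chuser for existing users) in the CLI reference for your code level.

```
mkuser -name jdoe -usergrp Monitor -keyfile /tmp/jdoe_key.pub
chuser -keyfile /tmp/jdoe_key.pub jdoe
```

The key file must first be copied to the cluster (for example, with a secure copy tool) before the command can reference its path.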
Now that you have stored the public key in the Storwize V7000 cluster, you need to establish a CLI
SSH connection and upload the private key .PPK file. To do so, open the PuTTY client. From the
Category navigation tree, click Session and enter the management IP address or DNS host name
of the cluster and accept the default port 22 that is used for SSH Protocol. Ensure that SSH is
selected as the connection type.
Next, select Connection > SSH > Auth. You need to use the private key that matches the
corresponding public key. In the Private key file for authentication field box, use the Browse
button to navigate to the location of the generated private .PPK file, or copy and paste the file
path into the field.
Once the session parameters are specified, return to the Session pane and provide a name for
the new session environment definition in the Saved Sessions field. Click Save to save the
PuTTY session settings; the session now uses SSH private key authentication for the CLI
connection. PuTTY is a commonly used terminal client.
Provided that SSH authentication has been enabled, the PuTTY client prompts only for the
user ID. This is the same user ID that was associated with the public.PPK file. Once the login
ID is entered, SSH authentication validates the private key presented by the client against the
public key stored for that user ID on the cluster.
In this example, we have issued the lscurrentuser command, which lists the username by which
the current terminal session is logged in.
The PuTTY SSH client software is available in portable form and requires no special setup. For
other operating systems, use the default or installed SSH clients.
You set up saved sessions in PuTTY, via the Save button, to access terminal sessions later; a
saved session stores a session name, an IP address, the local private authentication key (if one is
used), character size, and other parameters. Later, you select a saved session in the pane, use the
Load button to load it, and then the Open button to open the terminal interface.
Once SSH authentication has been established, upon the next login using the PuTTY client,
you need only select the named saved session and click Load > Open to recall the saved
management IP address.
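On platforms without PuTTY, an equivalent session can be built with OpenSSH. The sketch below only constructs the command line (it does not execute it); the host address, user, and key path are placeholder assumptions, and a PuTTY .ppk private key must first be exported to OpenSSH format (PuTTYgen: Conversions > Export OpenSSH key) before OpenSSH can use it.

```python
# Sketch: the OpenSSH equivalent of the saved PuTTY session described
# above: management IP, SSH port 22, and private-key authentication.
# Host, user, and key path below are placeholders, not real values.
def ssh_command(host, user="superuser", port=22, identity="~/.ssh/v7000_key"):
    # Runs a single CLI command (lscurrentuser) over the connection.
    return ["ssh", "-p", str(port), "-i", identity, f"{user}@{host}", "lscurrentuser"]

print(" ".join(ssh_command("192.168.70.121")))
```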
The command-line interface (CLI) enables you to manage the Storwize V7000 by typing
commands. Based on the user’s privilege level, commands can be issued to list information and
to perform actions. Commands follow a logically consistent command-line syntax. The syntax of a
command is basically the rules for running the command. It is important to understand how to read
syntax notation so that you can use a command properly.
Visit the IBM Knowledge Center to search for a list of CLI commands, or navigate to the latest
version of the IBM Spectrum Virtualize software.
Uempty
Monitoring view
The Monitoring menu has four options:
• System (dynamic view)
í Hardware appears in 3-D form
• Events (view and manage reported storage events)
• Performance (view system performance statistics)
• Background Tasks (displays all long-running tasks)
The System window allows you to monitor the entire system capacity as well as view details on
control and expansion enclosures and various hardware components of the system. The hardware
is represented in a 3-D form to view front and rear components. Components can be selected
individually to view the status and properties in detail.
The Events window tracks all informational, warning, and error messages that occur in the system,
and provides access to problems that must be fixed and maintenance procedures that step you
through the process of correcting the problem. You can apply various filters to sort them or export
them to an external comma-separated values (CSV) file. A CSV file can be created from the
information listed.
The Performance view provides real-time statistics in graphical views to monitor CPU utilization,
volume, interface, and MDisk bandwidth of the system and node. Each graph represents 5 minutes
of collected statistics and provides a means of assessing the overall performance of the system.
The Background Tasks view displays all long-running tasks that are currently in progress on the system,
such as volume synchronization, array initialization, and volume formatting. Once tasks are
completed, they are automatically removed from the display.
System-Overview
As of Spectrum Virtualize 8.2.1, the System-Overview provides a graphical view of the IBM
Storwize V7000 2076 nodes in the system cluster. Each node pane lists key details of the node’s
configuration. The System Actions and Node Actions menus provide additional management
options. If SAS-attached storage enclosures were configured, each would appear in its own pane
as an Expansion Enclosure, with its own Expansion Actions menu.
Keywords
• Management interface
• Remote authentication
• Clustered system
• Service Assistant Tool
• I/O group
• Management GUI
• Configuration node
• Managed disks (MDisks)
• Boss node
• SSH client
• Logical unit number (LUN)
• SAN zoning
• LUN masking
• Host zoning
• Storage pool
• Virtualization
• Worldwide node name (WWNN)
• Local authentication
• Worldwide port name (WWPN)
• SSH key authentication
Review questions (1 of 2)
1. Which of the following IP addresses can be configured to the Ethernet port 1 of each
node? (Choose all that apply)
A. Cluster management IP address
B. Alternative cluster management IP address
C. iSCSI IP address
D. DNS management IP address
E. Node IP address
2. True or False: IBM Storwize V7000 control nodes can use dedicated (private) SAN or
Ethernet switches to provide intra-cluster communication.
3. True or False: Zoning is used to control the number of paths between host servers and
the Storwize V7000.
Review answers (1 of 2)
1. Which of the following IP addresses can be configured to the Ethernet port 1 of each
node? (Choose all that apply)
A. Cluster management IP address
B. Alternative cluster management IP address
C. iSCSI IP address
D. DNS management IP address
E. Node IP address
The answer is A, C, and E.
2. True or False: IBM Storwize V7000 control nodes can use dedicated (private) SAN or
Ethernet switches to provide intra-cluster communication.
The answer is True.
3. True or False: Zoning is used to control the number of paths between host servers and
the Storwize V7000.
The answer is True.
Review questions (2 of 2)
4. To initialize the Storwize V7000 node canisters, a laptop or workstation must be
connected to ______ on the rear of a node canister.
5. True or False: The Ethernet cables for connections to the management network are
provided with the Storwize V7000 hardware shipment.
Review answers (2 of 2)
4. To initialize the Storwize V7000 node canisters, a PC or workstation must be connected to
the Technician port (T-Port) on the rear of a node canister.
The answer is: Technician port (T-Port).
5. True or False: The Ethernet cables for connections to the management network are
provided with the Storwize V7000 hardware shipment.
The answer is False. The customer is responsible for providing Ethernet cables for the
management network.
Summary
Overview
This module identifies the characteristics of IBM Spectrum Virtualize storage provisioning.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
IBM Spectrum Virtualize storage provisioning © Copyright IBM Corporation
This module discusses the IBM Spectrum Virtualize storage logical resources.
All managed storage resources provide the foundation for logical building blocks that represent
basic storage units called managed disks (MDisks). MDisks are added to storage pools, from
which virtual disk volumes (VDisks) are created and mapped to hosts. By pooling the storage and
spreading VDisks across MDisks, the I/Os on those volumes are spread evenly across the MDisks,
helping to balance the I/O workload across backend storage MDisks, RAID arrays, physical disks,
processors, and ports.
Managed disks
• IBM Spectrum Virtualize supports two different types of managed disks:
ƒ Internal SAS storage
í IBM SAS Expansion Enclosure Models 12F/24F/92F (supporting 3.5-inch and 2.5-inch drives)
í Configured into RAID arrays as managed disks (MDisks)
í SAS chains cabled to partner nodes in an I/O group as a cluster-wide resource
ƒ External SAN-attached MDisks
í External storage systems or storage controllers are independent of the Spectrum Virtualize system
(FlashSystem 900, Storwize family, and so on)
í External storage volumes (logical units, or LUNs) presented as managed disks (MDisks)
A Spectrum Virtualize system can manage a combination of internal storage and a large variety of
external storage systems. Internal storage is SAS attached to an I/O group. The system
automatically detects SAS storage presenting those drives for configuration into RAID arrays which
then become MDisks that Spectrum Virtualize manages. External storage is discovered using SAN
protocols and LUNs from that storage are presented as MDisks to Spectrum Virtualize.
An external storage subsystem, or storage controller, is an independent back-end device that
coordinates and controls the operations for its disk drives or logical units. External storage must be
configured to the same SAN fabrics, or can now be iSCSI-attached, to be virtualized by the
Spectrum Virtualize system.
The Spectrum Virtualize system manages the capacity of other disk systems (an External
Virtualization license is required for each storage device to be managed). When virtualizing an
external storage system, its capacity becomes part of the Spectrum Virtualize system cluster.
Capacity in external storage systems inherits all the rich functions and ease of use of IBM Spectrum
Virtualize system.
Managed disks
• MDisks are either arrays (RAID) from internal storage or volumes from external storage
systems
• MDisks are used to create storage pools from which VDisks are created for hosts
• A managed disk must be protected by RAID to prevent loss of the entire storage pool
(Diagram: two RAID arrays of member disks, one of 1.8 TB SAS 10K RPM drives and one of 2 TB
NL-SAS 7.2K RPM drives.)
MDisks are either arrays (RAID) from internal storage or volumes from external storage systems.
For external storage, an MDisk is typically an entire RAID array in the storage system, or might be
just part of a RAID array, while for internal storage an MDisk is an entire RAID array.
MDisks are not visible to host systems; however, it is possible to map an MDisk to a single VDisk or
volume (for external storage with existing data), with that VDisk visible to the host.
The administrator allocates these MDisks into various storage pools for different usage or
configuration needs. If zoning has been configured correctly, MDisks are not visible to a host
system on the storage area network, because backend storage should be zoned only to the
Spectrum Virtualize storage system (migrating backend storage LUNs with existing data is covered
later).
Managed disks are grouped by the storage administrator into one or more storage pools, also
referred to as managed disk groups. Typically, a storage pool consists of MDisks with the same
performance and availability characteristics, such as a group of equally sized MDisks from the
same disk subsystem, based on best practices for configuring a specific disk subsystem behind
the storage system; Easy Tier's automated single-tier storage pool balancing will eventually
balance I/Os across the MDisks in a pool.
(Diagram: Pool_IBM Flash, in which four R5 array MDisks are divided into sequentially numbered
extents, Extent 1 through Extent n.)
Once the MDisks are placed into a storage pool, they are divided into a number of extents. The
extents are numbered sequentially starting with 0. The extent size is a property of the pool, defined
by the system administrator, which can range from 16 MB to 8192 MB; the default size is
1 GB (1024 MB).
A warning can be set up so that when a certain percentage of the extents has been used, an entry
is created in the event log and the administrator is notified by an icon in the management GUI.
It is good practice to keep free extents in every pool, among other reasons to facilitate automated
storage pool balancing.
The extent size can be set to 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, or 8192 MB. It is set by
the administrator at the storage pool level when it is defined. Once set, the extent size stays
constant for the life of the pool.
The choice of the extent size affects maximum supported MDisk size and the total amount of
storage that can be managed by the clustered system. A 16 MB extent size supports MDisk
capacity of 2 TB for a maximum capacity of 64 TB per system. Increasing capacity based on the
powers of 2, the 8192 MB extent size allows for a maximum total storage capacity of 32 PB of
managed storage per system. The total capacity values assume that all of the storage pools in the
system use the same extent size.
For most systems, a capacity of 1 to 2 PB is sufficient. A preferred practice is to use 256 MB for
larger clustered systems. The default extent size is 1024 MB, supporting up to 4096 MDisks (2048
per I/O group). To avoid wasting storage capacity, the volume size should be allocated as a
multiple of the extent size.
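The capacity scaling above follows from a fixed extent count: working backward from the quoted figures (16 MB extents give a 2 TB MDisk limit and 64 TB per system), the system addresses about 2^17 extents per MDisk and 2^22 per system, so the limits grow linearly with extent size. A sketch, with those extent counts inferred from the figures in the text rather than taken from a specification:

```python
# Sketch of the extent-size arithmetic described above. Capacity
# limits scale linearly with extent size because the system addresses
# a fixed number of extents: 2^17 per MDisk and 2^22 per system
# (counts inferred from the 16 MB -> 2 TB / 64 TB figures quoted).
MIB = 1024**2
EXTENTS_PER_MDISK = 2**17
EXTENTS_PER_SYSTEM = 2**22

def max_capacities(extent_mb):
    """Return (max MDisk capacity, max system capacity) in bytes."""
    mdisk = extent_mb * MIB * EXTENTS_PER_MDISK
    system = extent_mb * MIB * EXTENTS_PER_SYSTEM
    return mdisk, system

tib = 1024**4
for size in (16, 1024, 8192):
    mdisk, system = max_capacities(size)
    print(f"{size:>5} MB extents: MDisk {mdisk // tib} TiB, system {system // tib} TiB")
```

The 16 MB row reproduces the 2 TB / 64 TB figures, and the 8192 MB row the 32 PB total quoted above.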
(Diagram: a striped volume built from extents 1a, 2a, 3a, 1b, 2b, 3b, 1c, 2c, 3c taken in round-robin
order from three managed disks (MDisks), each holding extents a through g.)
The extents from a given storage pool are used by the storage system to create volumes which are
known as logical or virtual disks (also commonly referred to as VDisks). A volume is
host-accessible storage that was provisioned from one storage pool. Or, if it is a mirrored volume, it
was provisioned from two storage pools. These volumes are presented to hosts as logical units that
the host sees as physical disks.
When an application server needs a disk capacity of a given size, a volume of that capacity can be
created from a storage pool that contains MDisks with free space (unallocated extents). The
storage system management GUI creates the volume by allocating extents from a given storage
pool. The number of extents that are required is based on the extent size attribute of the storage
pool and the capacity that is requested for the volume. By default, extents are taken from all MDisks
contained in the storage pool in round robin fashion until the capacity of the volume is fulfilled.
Striped VDisks help ensure that I/Os are evenly balanced across the MDisks in a pool. Assuming
that I/Os are randomly distributed across the space in a VDisk, I/Os will be evenly distributed
across all the MDisks in the pool when striped VDisks are used, even if some VDisks handle much
more I/O than others.
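The round-robin extent allocation described above can be sketched as follows; the pool layout and extent counts are illustrative examples, not values from a real system.

```python
# Sketch: round-robin extent allocation for a striped volume. Extents
# are taken from each MDisk in turn until the requested capacity is
# met (MDisk names and free-extent lists below are illustrative).
def allocate_striped(mdisk_free, extents_needed):
    """mdisk_free: dict of MDisk name -> list of free extent numbers.
    Returns the (mdisk, extent) pairs allocated, round-robin."""
    allocation = []
    names = list(mdisk_free)
    i = 0
    while len(allocation) < extents_needed:
        name = names[i % len(names)]
        if mdisk_free[name]:
            allocation.append((name, mdisk_free[name].pop(0)))
        i += 1
        if not any(mdisk_free.values()):  # stop when the pool is exhausted
            break
    return allocation

pool = {"mdisk0": [0, 1, 2], "mdisk1": [0, 1, 2], "mdisk2": [0, 1, 2]}
print(allocate_striped(pool, 5))
# [('mdisk0', 0), ('mdisk1', 0), ('mdisk2', 0), ('mdisk0', 1), ('mdisk1', 1)]
```

Because successive extents land on successive MDisks, random I/O across the volume's address space spreads evenly over the pool, as the paragraph above argues.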
• Single-tiered storage pool
ƒ Tier 0 flash
ƒ Tier 1 flash
ƒ Tier 2 Enterprise
ƒ Tier 3 Nearline
ƒ MDisks should have the same hardware characteristics
í Same RAID type, RAID array size, disk type, and RPMs
• Multi-tiered storage pool
ƒ Combinations of mixed disk tiers (HDD and flash/SSD) in one pool
ƒ Mixed hardware characteristics from storage systems
í Different drive types
(Diagram: Pool_Hybrid, containing R5 HDD MDisks and an SSD array MDisk. *Tier 0 MDisks can
also be SCSI LUNs.)
MDisks that are used in a single-tiered storage pool should have the same performance and
availability characteristics to balance the hardware resource use to optimize the price performance
of the storage and avoid hot spot disk management, such as the same RAID type, RAID array size,
disk type, and RPMs.
A multi-tiered storage pool contains a mix of MDisks with more than one type of disk tier attribute. A
multi-tiered storage pool that contains both generic_hdd and generic_ssd or flash MDisks is also
known as a hybrid storage pool. Therefore a multi-tiered storage pool contains MDisks with various
characteristics as opposed to a single-tiered storage pool. However, it is a preferred practice for
each tier to have MDisks within a tier of the same size, performance and availability characteristics,
as this helps balance hardware resource use within a tier.
A hybrid pool contains multiple types of MDisks with one MDisk being flash based. If the MDisks
are created from different tiers of storage, Easy Tier can be used to automatically manage the
migration of highly used data to faster drives.
All external storage system or storage controller MDisks are identified, and the administrator might
need to change the tier based on what the storage is providing.
Ensure that all MDisks that are allocated to the same tier of a parent pool have the same size,
performance characteristics, and RAID type. Doing so takes advantage of the ability of the
Spectrum Virtualize system to spread I/Os evenly across all the back-end resources in the pool,
gets the best performance, and eliminates hot-spot management.
IBM Spectrum Virtualize storage system supports these tiers:
• Tier 0 flash: Storage from an IBM FlashSystem MicroLatency module or external flash backed
MDisk.
• Tier 1 flash: Storage from a SSD or flash with typically less endurance or higher latencies than
Tier 0 flash.
• Tier 2 Enterprise: Storage from typically 15K or 10K RPM disk drives.
• Tier 3 Nearline: Storage from nearline-class MDisks typically operating at 7200 RPM.
Some customers might use 15K disk drives in Tier 2, while using 10K drives in Tier 3.
There are two types of storage pools: the parent pool and the child pool.
• Parent pools receive their capacity from MDisks. There are often cases where you want to
subdivide a storage pool (or managed disk group) while maintaining a larger number of MDisks
in that pool, such as where a host administrator has the permission and authority to create
volumes in a child pool and the business has allocated a specific amount of storage. This
allows I/Os from all volumes in the parent pool, including those in child pools, to be balanced
across all the MDisks. A parent pool is a standard pool that receives its capacity from MDisks,
which are divided into extents of a defined size.
• Child Pools were introduced in V7.4.0 code release. Instead of being created directly from
MDisks, child pools are created from existing capacity that is allocated to a parent pool. Child
pools are created with fully allocated physical capacity. The capacity of the child pool must be
smaller than the free capacity that is available to the parent pool. The allocated capacity of the
child pool is no longer reported as the free space of its parent pool. Child pools are logically
similar to storage pools, but allow you to subdivide the parent pool into multiple child pools,
limiting the space consumption for volumes within each child pool.
The same mkmdiskgrp command that is used to create physical storage pools is also used to
create child pools.
Consider the following general guidelines when you create a parent pool:
• An MDisk can be associated with just one parent pool.
▪ You can add only MDisks that are in unmanaged mode to increase storage capacity. When
MDisks are added to a parent pool, their mode changes from unmanaged to managed.
▪ You can specify a warning capacity for a pool. A warning event is generated when the
amount of space that is used in the pool exceeds the warning capacity. The warning
threshold is especially useful with thin-provisioned volumes that are configured to
automatically use space from the pool.
• Volumes are associated with just one pool, except when mirroring across pools.
▪ Volumes that are allocated from a parent pool are striped by default across all the storage
that is placed into that parent pool. This also enables nondisruptive migration of data from
one storage system to another storage system and helps simplify the decommissioning
process. Volumes can be mirrored across pools, and mirroring is used as part of the process
of migrating a volume from one pool to another.
• You can delete MDisks from a parent pool under the following conditions:
▪ Volumes are not using any of the extents that are on the MDisk.
▪ Enough free extents are available elsewhere in the pool to move any extents that are in use
from this MDisk.
▪ The system ensures that all extents that are used by volumes in the child pool are migrated
to other MDisks in the parent pool to ensure that data is not lost.
If the parent pool is deleted, you cannot recover the mapping that existed between extents that are
in the pool or the extents that the volumes use. If the parent pool has associated child pools, then
you must delete the child pools first and return their extents to the parent pool. Once the child pools
are deleted, then you can delete the parent pool. The MDisks that were in the parent pool are
returned to unmanaged mode and can be added to other parent pools. Because the deletion of a
parent pool can cause a loss of data, you must force the deletion if volumes are associated with it.
If the volume is mirrored and the synchronized copies of the volume are all in the same pool, the
synchronized mirrored volumes are destroyed when the storage pool is deleted. If the volume is
mirrored and there is a synchronized copy in another pool, and the pool is deleted, the copy in the
other pool remains as an unmirrored volume.
The table describes the mkmdiskgrp parameters that are used to create a storage pool or a child
pool:

Parameter        Child pool usage                       Storage pool usage
-name            Optional                               Optional
-mdisk           Cannot be used with child pools        Optional
-tier            Cannot be used with child pools        Optional
-easytier        Cannot be used with child pools        Optional
-size            Mandatory                              Cannot be used with parent pools
-parentmdiskgrp  Mandatory                              Cannot be used with parent pools
-unit            Optional                               Optional
-warning         Optional                               Optional
-encrypt         Optional                               Optional for both parent pools and child pools
-datareduction   Cannot be used with -parentmdiskgrp    Optional; data reduction pools must be parent pools
Child pools are similar to parent pools, with similar properties, and provide most of the functions
that MDisk groups have, such as creating volumes that specifically use the capacity that is allocated
to the child pool.
The maximum number of storage pools remains at 128, and each storage pool can have up to 127
child pools. Child pools can be created using both the GUI and the CLI; in the GUI, they are shown
as child pools with all their differences from parent pools.
Child pools can be encrypted independently of parent pools, but a child pool within an encrypted
parent pool cannot be created using the -encrypt yes option, as the child pool's data is already
encrypted by being in the parent pool.
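The parameter rules above can be encoded as a small validator. This is a sketch that checks only the mandatory and forbidden combinations quoted in the table; the pool names, and the -ext extent-size parameter, are example inputs.

```python
# Sketch: validate mkmdiskgrp parameters against the rules above.
# Child pools require -size and -parentmdiskgrp; -mdisk, -tier,
# -easytier, and -datareduction are parent-pool-only.
PARENT_ONLY = {"mdisk", "tier", "easytier", "datareduction"}

def check_mkmdiskgrp(params):
    """params: dict of parameter name (without '-') -> value."""
    given = set(params)
    is_child = "parentmdiskgrp" in given
    if is_child:
        bad = given & PARENT_ONLY
        if bad:
            raise ValueError(f"not valid for child pools: {sorted(bad)}")
        if "size" not in given:
            raise ValueError("-size is mandatory for child pools")
    elif "size" in given:
        raise ValueError("-size cannot be used with parent pools")
    return "child pool" if is_child else "parent pool"

print(check_mkmdiskgrp({"name": "Pool0", "ext": 1024}))
# parent pool
print(check_mkmdiskgrp({"name": "Child0", "parentmdiskgrp": "Pool0", "size": 200}))
# child pool
```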
IBM Spectrum Virtualize supports encryption of data at rest for both internal and external storage,
by creating encrypted storage pools. Oftentimes, purchasing an encryption license on Spectrum
Virtualize is the less expensive option.
To use encryption on the system, you must purchase an encryption license, activate the function on
the system, enable encryption using the management GUI or CLI, and create copies of the keys.
The encryption license is per system, not per TB or SCU.
After you have enabled the system encryption, you can create an encrypted storage pool by
specifying the option to encrypt the pool. If the backend storage has its own encryption capability,
that can be used instead by specifying that the MDisk supports encryption using the chmdisk
-encrypt yes command. This offloads the encryption work for that MDisk from Spectrum
Virtualize.
Both pools and volumes have an encrypted attribute. All volumes in an encrypted pool will have
their encrypt attribute set to yes. For systems with encryption enabled, you can also migrate existing
volumes from non-encrypted pools to encrypted pools. The encryption attribute is independent of
the volume class created, therefore it is transparent to applications, easing implementation and
operation.
Encryption on internal SAS storage is done by the SAS adapters, while encryption on external
storage resources is done by the Spectrum Virtualize system processors.
Encryption keys are kept on USB flash drives, which are needed to boot the cluster. Alternatively,
keys may be kept in an IBM Security Key Lifecycle Manager (SKLM) repository on a server.
(Diagram: an encrypted pool of encrypted MDisks 1, 2, and 3 presenting an encrypted volume.)
You cannot change an existing parent or child pool without encryption into one with encryption;
you must delete and re-create the pool with encryption. Furthermore, SAS RAID array MDisks
must be re-created to enable encryption. Therefore, it is best to enable encryption when the
Spectrum Virtualize system is initially set up, to avoid the extra space and the large amount of
data migration needed to get one's data into encrypted pools.
However, if a customer wants to start using encryption, they simply create new child or parent
pools and migrate existing volumes to them using the migratevdisk command; alternatively, the
data can be migrated using volume mirroring.
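As an illustration, the migration step above amounts to one CLI invocation per volume. The sketch below only builds the command string; the volume and pool names are hypothetical examples, while migratevdisk with -mdiskgrp and -vdisk is the documented command form.

```python
# Sketch: build the migratevdisk command used to move an existing
# volume into a newly created encrypted pool. Names are examples.
def migrate_cmd(vdisk, target_pool):
    return f"migratevdisk -mdiskgrp {target_pool} -vdisk {vdisk}"

print(migrate_cmd("vol_app01", "EncryptedPool0"))
# migratevdisk -mdiskgrp EncryptedPool0 -vdisk vol_app01
```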
You can create encrypted child pools within unencrypted parent pools, but not within encrypted
parent pools. With an encrypted parent pool, the data in a child pool will be encrypted using the
parent pool's key.
Once the child pool has been defined, you can create a child pool volume by using the same
procedural steps listed within the Create Volumes wizard, as well as map volume directly to a host.
Administrators can use child pools to control capacity allocation for volumes that are used for
specific purposes such as assigning application/server administrator their own child pool of storage
to manage, without allowing them to access or manage other storage.
(Diagram: encrypted MDisks 1, 2, and 3 presenting an encrypted volume.)
All volumes created in an encrypted pool will take on an encrypted attribute. For systems with
encryption enabled, you can also migrate existing volumes from non-encrypted pools to encrypted
pools. The encryption attribute is independent of the volume class created, therefore it is
transparent to applications, easing implementation and operation.
Data reduction pool
Spectrum Virtualize supports data reduction pools, which reduce storage space needed via
thin-provisioned, compressed, and deduplicated volumes, and also through host SCSI unmap
operations.
A data reduction pool can be created by selecting the Data Reduction option in the GUI, or via the
CLI mkmdiskgrp command. It is recommended that DRPs be at least 20 TB in size, with a sweet
spot of 100 to 200 TB. The minimum metadata size is around 1 TB. Also, always plan to keep at
least 15% of the space free for garbage collection, which frees up and consolidates empty space
created by overwrites of data or by SCSI unmap operations when using data-reducing volumes.
As customer applications write data to the pool (for volumes other than fully allocated ones), data
is written to the pool sequentially, getting the benefits of RAID full-stripe writes; the old location is
then marked as freed. Overwrites and SCSI unmaps therefore result in many scattered free block
holes, which the garbage collection process reclaims by reading the data and rewriting it without
the empty holes. The cost of the garbage collection process depends heavily on the amount of
valid data in
the extents. As a result, it has to work harder when free space is limited. A general guideline is to
ensure that the volume capacity within the data reduction pool does not exceed 85% of the total
capacity of the data reduction pool. The table lists the extent size and the minimum data reduction
pool capacity that is required to be able to create a volume within the DRP.
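The 85% guideline above can be expressed as a simple check; the capacity figures below are examples only.

```python
# Sketch of the sizing guideline above: keep at least 15% of a data
# reduction pool free for garbage collection, i.e. volume capacity
# should not exceed 85% of the total pool capacity.
def drp_headroom_ok(volume_capacity_tb, pool_capacity_tb, limit=0.85):
    return volume_capacity_tb <= limit * pool_capacity_tb

print(drp_headroom_ok(80, 100))   # True: 80% used, within the guideline
print(drp_headroom_ok(90, 100))   # False: exceeds the 85% guideline
```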
Note that SCSI unmap is turned off by default, and you need to enable it using the chsystem
-hostunmap on command. Similarly to allow backend storage to reclaim deleted space via SCSI
unmap, be sure it’s enabled via the chsystem -backendunmap on command.
Spectrum Virtualize Pools options
• Pools: List/manage all parent & child pools
• Volumes by Pool: List/manage VDisks in each pool
• Internal Storage: List/manage SAS disks and to create
RAID and DRAID array MDisks
• External Storage: List/manage MDisks from external
controllers/backend storage
• MDisks by Pools: List/manage MDisks including
unassigned MDisks
• System Migration: Wizard to migrate existing data from
external storage to Spectrum Virtualize
The GUI Pools menu is used to configure and manage storage pools, internal and external storage,
MDisks, and to migrate existing storage to the system.
• The Pools menu lists all storage pools created under the IBM Spectrum Virtualize system
management. This includes standard pools, data reduction pools, child pools and multi-tiered
pools.
• Volumes by Pools lists and manages the VDisks in each pool.
• Internal Storage contains a collection of physical disks that are directly attached to the
Spectrum Virtualize system. After the GUI is started, these disks are detected and configured
into RAID or DRAID array MDisks to be used in a storage pool.
• External Storage is used to list and manage MDisks from external storage controllers,
discovering those unmanaged MDisks and virtualizing them.
• MDisks by Pools is used to list and manage all the MDisks sorted by the pool they are in, or if
not in a pool in the list of unmanaged MDisks.
• System migration presents a wizard to migrate or import existing data on external storage so
that it is managed by Spectrum Virtualize.
Internally attached storage drives are automatically detected by the IBM Spectrum Virtualize system.
The system determines the available drive classes, and recommends array configurations. These
arrays are presented as MDisks, which can be added to pools in the same way as MDisks that are
discovered on external storage systems are added.
All drives detected in the Spectrum Virtualize system initially are configured as candidate disks.
Once they are configured into a RAID or DRAID array, they become member disks. Hot spare disks
can also be configured.
(Slide graphic: investment protection through external virtualization of IBM storage, including
Storwize V7000, Storwize V7000F, Storwize V5030F, FlashSystem A9000, FlashSystem 900, and
FlashSystem 840.)
For further scalability, IBM Spectrum Virtualize offers storage virtualization, which enables the IBM
SAN Volume Controller to manage the capacity of other disk systems (more than 440 models) with
external storage virtualization.
Although the Spectrum Virtualize system controller supports an intermix of differing storage within
storage pools, the best approach is to always use the same array model, RAID mode, RAID array
size, and drive speeds. Multi-tiered pools are only recommended for use with EasyTier.
When an external storage system is virtualized, its capacity becomes part of the Spectrum
Virtualize clustered system and is managed in the same way as the capacity on internal
flash/SAS/Nearline-SAS drives within the system. Capacity in external storage systems inherits all
the rich functions and ease of use of the Spectrum Virtualize system. We recommend that you
exercise caution with large disk drives so that you do not have too few spindles to handle the load.
With the implementation of flash drives as the performance tier, and even with Enterprise Tier-0
Flash drives, RAID-5 was the standard. However, with larger 10K drives in particular, DRAID-6 is
recommended as it meets the performance requirements and is more reliable, especially with larger
DRAID arrays.
If your external storage subsystem is not one for which the Spectrum Virtualize system provides a
round robin I/O path algorithm, make the number of MDisks per storage pool a multiple of the
number of storage ports that are available. This approach ensures sufficient bandwidth to the
storage controller and an even balance across storage controller ports.
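The port-multiple rule above can be sketched numerically. This is only an illustrative helper, not part of any IBM CLI, and the function name is hypothetical:

```python
def balanced_mdisk_count(desired_mdisks: int, storage_ports: int) -> int:
    """Round the desired MDisk count up to the nearest multiple of the
    number of zoned storage ports, so each port serves an equal share."""
    if storage_ports <= 0:
        raise ValueError("storage_ports must be positive")
    remainder = desired_mdisks % storage_ports
    if remainder == 0:
        return desired_mdisks
    return desired_mdisks + (storage_ports - remainder)

# A controller with 8 zoned ports and a plan for 10 MDisks:
print(balanced_mdisk_count(10, 8))  # 16 -- next multiple of 8
```

With 16 MDisks over 8 ports, each port serves exactly two MDisks, which is the even balance the text recommends.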
Use of the external virtualization capability is entitled through the acquisition of IBM Spectrum
Virtualize Software for SAN Volume Controller (SW PID 5641-VC8 in AAS and SW PID 5725-M19
in Passport Advantage (R)).
Any IBM Spectrum Storage system that runs the IBM Spectrum Virtualize software (such as the
V7000) serves as the insulating layer. SCSI LUNs become the foundational storage resource that
is owned by the storage system and are referred to as managed disks (MDisks). A one-to-one
relationship exists between the SCSI LUNs and the managed disks.
The storage system takes advantage of the basic RAID controller features (such as traditional
RAIDs 1, 5, 6, or 10) but does not depend on large controller cache or host-independent copy
functions that are associated with the storage systems.
Typically, external storage is configured such that each LUN uses an equivalent set of physical
resources from a performance standpoint, to facilitate balanced use of the entire external storage
system. The MDisks representing those LUNs are typically placed into a single pool, while VDisks
or volumes for hosts are striped across the MDisks, thereby balancing the I/O workload across the
physical resources.
Virtualizing existing storage
Before attaching an external disk subsystem (referred to as backend storage) behind an IBM
Spectrum Virtualize system, such as a V7000 running IBM Spectrum Virtualize, for virtualization,
certain tasks need to be completed in a specific order by the host, SAN, Spectrum Virtualize
system, and external storage administrators. Many of these steps can be performed at the same
time because different administrators are involved in the various tasks.
Before storage with existing data is virtualized by the Spectrum Virtualize system, it's important to
create a list of the LUNs so that the correct LUNs or VDisks can be mapped to the correct hosts as
part of the virtualization/migration procedure.
The host administrators must install the multi-path code the Spectrum Virtualize system requires (if
not already installed or in the host operating system). They must also configure the Spectrum
Virtualize system storage and remove any definitions of storage to be virtualized. These tasks
might include starting and stopping applications as part of the process to virtualize the storage
where the application and data reside.
The SAN administrator must set up zones for the Spectrum Virtualize system, including host
zones, backend storage zones, and Spectrum Virtualize system inter-node zones for Spectrum
Virtualize system cluster communications. This might require removing existing zones between
external storage and hosts that were configured before that storage was virtualized.
The backend storage administrators need to configure LUNs, unmap LUNs from external storage,
and reassign LUNs to the Spectrum Virtualize system. When the storage is virtualized, there is less
work to do, as they no longer must deal with changes for new or existing application configurations.
The Spectrum Virtualize system administrator creates host objects and storage pools, and then
maps VDisks to hosts.
Host and storage devices must log in to the SAN fabric (referred to as a fabric login (FLOGI)), so
the SAN switch can see the device WWPNs. Similarly, the hosts, which include the Spectrum
Virtualize system from the backend storage point of view, must log in to the storage (referred to as
port login (PLOGI)) so the storage system can see the host WWPNs, and the Spectrum Virtualize
system can see the storage system FC adapter WWPNs respectively. This facilitates the creation of
the host objects and the assigning of storage to host.
FLOGIs and PLOGIs occur when host/storage devices power up and scan to configure storage.
PLOGIs require that the SAN is zoned so the hosts can see the Spectrum Virtualize system
storage.
Backend storage system MDisks are accessed based on one of four multipathing methods,
selected upon the Spectrum Virtualize system's discovery of the storage system model. The
objective is to evenly balance the use of backend storage ports for the best performance.
The four multipathing methods or options to access an MDisk of an external storage system are:
• Round robin: I/Os for the MDisk are distributed over multiple ports of the storage system.
• MDisk group balanced: I/Os for the MDisk are sent to one target port of the storage system.
The assignment of ports to MDisks is chosen to spread all the MDisks within the MDisk group
(pool) across all of the active ports as evenly as possible.
• Single port active: All I/Os are sent to a single port of the storage system for all the MDisks of
the system.
• Controller balanced: I/Os are sent to one target port of the storage system for each MDisk.
The assignment of ports to MDisks is chosen to spread all the MDisks (of the given storage
system) across all of the active ports as evenly as possible.
For example, MDisks presented by a DS3500 are accessed by using the MDisk group balancing
method while MDisks presented by a FlashSystem are accessed by using the round robin method.
In most cases, disk subsystems where the Spectrum Virtualize system uses MDisk group balanced
or controller balanced should configure the number of MDisks as a multiple of the storage ports
available to balance use of the backend storage ports.
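The MDisk group balanced and controller balanced behaviors described above can be illustrated with a small sketch. The function is hypothetical; the real port assignment is performed internally by Spectrum Virtualize:

```python
def controller_balanced_assignment(mdisk_ids, active_ports):
    """Spread MDisks across the storage system's active ports as evenly as
    possible; I/O for each MDisk is sent to exactly one target port."""
    return {mdisk: active_ports[i % len(active_ports)]
            for i, mdisk in enumerate(mdisk_ids)}

# Four MDisks spread over two active controller ports:
assignment = controller_balanced_assignment(
    ["mdisk0", "mdisk1", "mdisk2", "mdisk3"], ["P1", "P2"])
print(assignment)
# {'mdisk0': 'P1', 'mdisk1': 'P2', 'mdisk2': 'P1', 'mdisk3': 'P2'}
```

Note how a multiple of the port count (four MDisks, two ports) gives every port the same number of MDisks, which is why the sizing guidance above matters for these methods.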
Best practices: Pools and MDisks
• Place MDisks with the same availability and performance attributes from the same storage system in the same storage pool (EasyTier excepted)
• Assign MDisks in multiples of storage ports to balance utilization of all storage ports zoned with the system cluster
As a general practice ensure the number of MDisks presented from a given storage system is a
multiple of the number of its storage ports that are zoned with the IBM Spectrum Storage product
running IBM Spectrum Virtualize. This approach is particularly useful for storage systems where the
round robin method is not implemented for MDisk access.
(Diagram: two controllers, Node1 (controller ID 1, IBMSV C1, WWNN1) and Node2 (controller ID 2, IBMSV C2, WWNN2), each with four WWPNs; LUNs L0, L2, L4, and L6 are presented through Node1's ports and L1, L3, L5, and L7 through Node2's ports)
Each Spectrum Virtualize node has a unique WWNN, which remains the same even if the node
hardware is replaced as the WWNN is configurable. The Spectrum Virtualize system adapter
WWPNs also remain the same if they are replaced because their WWPNs are determined through
the node's WWNN and the adapter placement in the unit. All backend storage ports are zoned to all
Fibre Channel ports on the Spectrum Virtualize system used for attaching backend storage.
Therefore, replacing nodes or adapters for a Spectrum Virtualize cluster does not require rezoning,
as replacement of host and storage adapters normally do. Depending upon the number of FC ports,
some may be dedicated for remote mirroring, inter-node communications, or host and storage I/O.
Since Spectrum Virtualize uses a specific multi-path algorithm, based on the storage model, that
sends I/O requests for a specific MDisk to a specific storage port, balancing I/Os across those
backend storage ports requires at least one MDisk per backend storage port. The example shows
a situation in which 8 backend storage ports exist (not in the diagram), so 8 MDisks were created.
External Storage
• Displays all MDisks detected by the system (unmanaged, managed, and image)
• MDisks are organized by the external storage system that presents them
External Storage MDisks are organized by the external storage system that presents them. To
display the unique ID that is associated with the external storage system, use the filter function in
the management GUI to display the UID column. If new external storage is added to the system,
select Actions > Discover storage. If no MDisks display, ensure that you have cabled the system
correctly to the external storage system.
Managed disks have associated access modes. These modes govern how the Spectrum
Virtualize system cluster uses the MDisks.
• Unmanaged: The default access mode for LUNs discovered from the SAN fabric by the
Spectrum Virtualize system. These LUNs have not yet been assigned to a storage pool.
• Managed: The standard access mode for a managed disk that has been assigned to a storage
pool. The process of assigning a discovered SCSI LUN to a storage pool automatically changes
the access mode from unmanaged to managed mode. In managed mode space from the
managed disk can be used to create virtual disks.
• Image: A special access mode that is reserved for SCSI LUNs that contain existing data. Image
mode preserves the existing data when control of this data is turned over to the Spectrum
Virtualize system. Image mode is specifically designed to enable existing data to become
Spectrum Virtualize system-managed. SCSI LUNs containing existing data must be added to
the SAN Volume Controller as image mode.
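The three access modes above can be summarized as a small state table. The mode names are real, but the transition table itself is this sketch's assumption about the moves described in the text (assign to a pool, import with data, remove from a pool, migrate an image to striped):

```python
# Illustrative only: MDisk access-mode transitions as described above.
TRANSITIONS = {
    "unmanaged": {"managed", "image"},  # assign to a pool, or import existing data
    "managed": {"unmanaged"},           # remove the MDisk from its pool
    "image": {"managed", "unmanaged"},  # migrate to striped, or remove
}

def can_transition(current: str, target: str) -> bool:
    """Return True if an MDisk in `current` mode may move to `target` mode
    under this sketch's transition table."""
    return target in TRANSITIONS.get(current, set())

print(can_transition("unmanaged", "managed"))  # True: assigning to a pool
print(can_transition("managed", "image"))      # False: image mode needs existing data
```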
MDisks naming convention examples
(Slide: example MDisk names grouped by storage pool — LUNs from an IBM Storwize V7000 are renamed to reflect the source system and SAS drive technology and grouped in a SAS pool; LUNs from an IBM DS-series system are renamed to reflect SATA drive technology and grouped in a SATA pool.)
Best practice: Rename the storage pool to reflect the type of storage device
Since logical unit number (LUN) namespaces are local to the external storage systems, it is not
possible for the managing storage system to determine the original name, so a generic mdisk#
name is automatically assigned. The storage administrator typically changes the MDisk name to
reflect the disk subsystem and storage technology from which it is sourced. If the MDisk contains
data, then the name is changed to reflect the host and/or application for the data it holds.
Renaming MDisks and VDisks facilitates putting the MDisks into appropriate pools, and ensuring
the VDisks are assigned to the correct hosts. You can correlate these MDisks/VDisks with the
corresponding LUNs on the external storage system by using the UDID (Universal Device ID), the
LUN ID, or the LUN size.
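The UID correlation just described can be sketched as follows. The data structures, sample UIDs, and function name are illustrative, not an IBM API:

```python
def correlate_by_uid(mdisks, backend_luns):
    """Match discovered MDisks to back-end LUNs sharing the same UID, so
    each generic mdisk# can be renamed to something meaningful."""
    luns_by_uid = {lun["uid"]: lun["name"] for lun in backend_luns}
    return {m["name"]: luns_by_uid.get(m["uid"]) for m in mdisks}

# Hypothetical names and UIDs for illustration:
pairs = correlate_by_uid(
    [{"name": "mdisk4", "uid": "uid-a1"}, {"name": "mdisk5", "uid": "uid-b2"}],
    [{"name": "DB_LUN_01", "uid": "uid-a1"}])
print(pairs)  # {'mdisk4': 'DB_LUN_01', 'mdisk5': None}
```

An unmatched entry (None) flags an MDisk whose source LUN still needs to be identified, perhaps by LUN ID or size instead.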
The management GUI Pools and MDisks by Pools windows list the MDisks in each pool, how much
space each pool has, and how full the pools are. You can add the Data Reduction column to the
display to distinguish between DRPs and regular pools.
System Overview
Click any of the component options to view its detail.
The Spectrum Virtualize GUI System > Overview provides a visual display of the nodes in the
system cluster. Each node panel lists the hardware components used in the configuration.
Select a node to view component characteristics. You can also click each component to view a
brief pop-up description; full component details are displayed in the far-right panel of the screen.
You can also right-click any component to view other options.
If storage expansion enclosures were configured, each would appear in its own panel as an
Expansion Enclosure, which would include an Expansion Actions menu with additional options.
The Dashboard is the default home page of the IBM Spectrum Virtualize GUI. It contains high-level
information about the system such as the performance, capacity, and system health that provide an
overall understanding of what is happening on the system. The Dashboard also serves as quick
view of the overall condition of the system, and displays notifications of any critical issues that
require immediate action.
Keywords
• IBM Spectrum Virtualize GUI
• Internal Storage
• Dashboard
• External Storage
• Extents
• Virtualization
• Redundant Array of Independent Disks (RAID)
Review questions
1. List at least 3 of the 5 SAS drive roles.
2. What is the default mode of an external storage MDisk once it is detected by IBM
Spectrum Virtualize GUI?
A. Array
B. Unmanaged
C. Managed
D. Image
3. True or False: The back-end storage system LUNs discovered on the same fabric
as the Spectrum Virtualize system are assigned to the system as MDisks.
Review answers
1. List at least 3 of the 5 SAS drive roles.
The answers are unused, failed, candidate, member, and spare.
2. What is the default mode of an external storage MDisk once it is detected by IBM
Spectrum Virtualize GUI?
The answer is B, Unmanaged.
3. True or False: The back-end storage system LUNs discovered on the same
fabric as the Spectrum Virtualize system are assigned to the system as
MDisks.
Summary
Overview
This module identifies striped, sequential, and image volume allocations to the supported host,
including the benefits of I/O load balancing and nondisruptive volume movement between the
caching I/O groups.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
IBM Spectrum Virtualize volume allocation © Copyright IBM Corporation
Volume allocation
Creating virtual volume (VDisk)
Mapping volumes to host
Managing volume
Caching I/O group
Host storage access
Non Disruptive Volume Movement
(NDVM)
Throttling
This topic identifies the concept of volume allocation, the virtualization types, and types of volumes
that can be created and assigned to a host system.
Volume allocation
• Host or host cluster objects are defined via WWPNs or IQNs, over FC or Ethernet respectively
• A volume is mapped to a host or host cluster object
• The system or administrator assigns the controlling I/O group and preferred node
• A volume is also known as a VDisk
(Diagram: FC host WWPNs and iSCSI IQNs mapped to volumes V1 and V3, served by Node 0 and Node 1, with MDisk1, MDisk2, and MDisk3 in a storage pool)
A volume is a logical disk, logical volume or virtual disk (VDisk for short) that provides an area of
usable capacity which can be mapped to an attached host or hosts. The system does not
automatically present volumes to the host system. You must map each volume to a host or host
cluster object, with the host WWPNs or IQNs tied to the host or host cluster object.
When a volume is created, it is assigned to an I/O group, called the caching I/O group. Further, an
I/O group contains two nodes, one of which handles I/Os for volumes assigned to the node, and is
known as the preferred or caching node for a volume. While the system cluster can have multiple
I/O groups, the I/O requests and write cache for a volume are handled exclusively by a node of a
single I/O group under normal working conditions. Host I/Os are normally directed to a port on the
caching node, but if those paths aren’t working, I/Os can be directed to other ports on the cluster,
and the I/O will be forwarded to the caching node. Note that read and write cache for the volume
resides on the caching node. In the event a node fails, the partner node in the I/O group starts
handling I/Os and read cache for the failing node’s volumes until the failed node is repaired (note
that in failed node scenarios, write cache is disabled on the partner node and write I/Os operate in
write through mode).
The caching node or the preferred node, is automatically assigned by the system cluster using a
round-robin algorithm across the nodes. Alternatively one can specify the I/O group and caching
node for a volume. This facilitates scaling of the system cluster I/O workload, and balancing of I/Os
across I/O groups and nodes.
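The round-robin assignment of preferred (caching) nodes can be sketched as follows. This is illustrative only; the cluster performs the assignment internally, and the node names are hypothetical:

```python
from itertools import cycle

def assign_preferred_nodes(volumes, nodes=("node1", "node2")):
    """Alternate the caching (preferred) node across the I/O group's two
    nodes as each new volume is created, balancing the workload."""
    node_cycle = cycle(nodes)
    return {vol: next(node_cycle) for vol in volumes}

print(assign_preferred_nodes(["vol0", "vol1", "vol2"]))
# {'vol0': 'node1', 'vol1': 'node2', 'vol2': 'node1'}
```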
2. Sequential
• Sequential volumes have extents residing on a single MDisk in the pool
(Diagram: volume VA_WV2 allocated from consecutive 1 GB extents on MDisk4, 300 GB, ending in a partial extent)
When defining a volume, one of the three volume virtualization types is specified:
• The first method is striped mode (default). A striped volume is allocated one extent in turn from
each managed disk that has free extents (or a subset of managed disks that are known as a
stripe set) in the storage pool. This process continues until the space required for the volume
has been satisfied.
▪ Striped volumes facilitate balancing the I/Os across the physical resources improving I/O
performance. They also allow one to create volumes with more capacity than any single
physical disk.
• The second method is sequential mode. A sequential volume has its extents allocated one
after the other from a single managed disk, provided enough consecutive free extents are
available on that managed disk. This is typically used for large managed disks with
applications that are designed to balance their I/O across volumes; the application, rather
than volume striping, balances I/Os across the physical resources.
• Image mode volumes are special volumes that have a direct relationship with one managed
disk, and are used to move existing storage LUNs with data into Spectrum Virtualize system
management and virtualization. The backend storage LUN is presented to Spectrum Virtualize
as an MDisk, and a VDisk is created that maps directly to the MDisk. That image mode VDisk is
typically mapped from Spectrum Virtualize back to the host that uses the data. At that point, the
storage is virtualized and under Spectrum Virtualize management. Usually image mode
volumes are then dynamically migrated and spread out across a storage pool as striped
volumes. Note that image or sequential mode volumes cannot exist in a child pool.
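The striped allocation described in the first bullet can be sketched as a simple loop. This is an illustrative helper, not system code:

```python
def stripe_extents(num_extents, mdisks):
    """Allocate one extent in turn from each MDisk in the stripe set until
    the requested number of extents is satisfied."""
    allocation = {m: 0 for m in mdisks}
    for i in range(num_extents):
        allocation[mdisks[i % len(mdisks)]] += 1
    return allocation

# A 10-extent volume striped across three MDisks:
print(stripe_extents(10, ["mdisk1", "mdisk2", "mdisk3"]))
# {'mdisk1': 4, 'mdisk2': 3, 'mdisk3': 3}
```

Because extents are taken one at a time in turn, no MDisk ends up with more than one extent above any other, which is what spreads the I/O load evenly.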
Types of volumes
• Mirrored: A volume with two physical copies. Each volume copy can belong to a different storage pool.
• Custom: A volume created with user-defined customization rather than the standard default settings.
• HyperSwap: Creates volume copies on separate sites for systems that are configured with HyperSwap topology.
• Thin-provisioned: A volume created with virtual and real (physical) capacity. Presents virtual capacity to hosts/users rather than the actual physical capacity.
• Compressed: A volume created with virtual and real capacity. Data is compressed as it is written to disk, saving more space.
• Deduplicated: A volume that resides in a DRP and is individually selected for deduplication.
• Cloud: A volume that is a PIT copy of a local volume, stored in the cloud for availability and DR purposes.
The IBM Spectrum Virtualize management GUI provides presets, which are templates with supplied
default values that incorporate best practices for creating volumes. The presets are designed to
minimize the necessity of specifying many parameters during object creation while providing the
ability to override the predefined attributes and values. Each volume preset relates to one or more
of the three virtualization modes.
All volumes are created from the unallocated extents that are available in the storage pool. The
table above lists the volume types.
• A basic volume is the simplest type of volume: its data is striped across all available
managed disks in a single pool. It services I/O using read/write cache and is classified as fully
allocated; therefore, it reports real capacity and virtual capacity as equal.
• A mirrored volume is a volume with two physical copies, where each volume copy can belong to
a different storage pool.
• A custom volume is created with user-defined customization rather than taking the standard
default settings for each of the options under quick volume creation.
• HyperSwap volumes use Metro Mirror to mirror a volume across sites: the host accesses the
local copy if it is available, or the copy at the remote site if the local storage becomes
unavailable. HyperSwap enhances availability in case of disk subsystem failure.
• A thin-provisioned volume presents virtual storage capacity to hosts or users rather than the
actual physical capacity. When you create a volume, you can designate it as thin-provisioned
to save capacity. A thin-provisioned volume has a virtual capacity and a real capacity that
differ.
• Compressed volumes are like thin-provisioned volumes: they have virtual, real, and used
capacities. When you create volumes, you can specify compression as a method to save
capacity for the volume. With compressed volumes, data is compressed as it is written to
disk, saving more space. To use the compression function on external storage, you must obtain
the IBM Real-time Compression license.
• Deduplicated volumes reside in a DRP, and are individually selected for deduplication.
Deduplicated volumes can also be fully allocated, thin, or compressed. Deduplication occurs on
a pool basis for the selected volumes in the pool.
• A cloud volume is a PIT copy of a local volume, stored in the cloud for availability and DR
purposes. You need a connection and an account with a cloud service provider for this capability.
Note that often volumes can have combinations of these types; e.g., a volume can be compressed
and mirrored, deduplicated and mirrored, thin-provisioned with a copy in the cloud, etc.
Volume allocation
Creating virtual volume (VDisk)
Mapping volumes to host
Managing volume
Caching I/O group
Host storage access
Non Disruptive Volume Movement
(NDVM)
Throttling
(Slide: Fully allocated versus thin-provisioned or compressed volumes — each volume presents a GB capacity from its storage pool; thin-provisioned or compressed volumes use the rsize parameter for real capacity)
To create a basic volume, you select the pool whose extents will provide the volume's capacity;
a volume is sourced from extents contained in only one storage pool. Next, you specify the volume
quantity, capacity, unit of measurement (bytes, kilobytes, megabytes, gigabytes, or terabytes),
and name. Multiple volumes can be created at the same time by using an automatic sequential
numbering suffix. We recommend using an appropriate volume naming convention to help you
easily identify the associated host or cluster of hosts.
The Capacity Savings is set to none by default. However, this feature provides the ability to alter the
provisioning of a basic volume into Thin-provisioned volume or compressed volume.
By default, volumes are accessible via ports on the caching I/O group, and I/Os are normally
directed to the caching node via the host multi-path code. As volumes are created in the I/O group,
by default Spectrum Virtualize alternates the node which will be the caching node to help balance
the workload across nodes.
Once you have specified the parameters, the summary provides a quick view of volume details
before creation. You have the option to just create the volume, or to create it and map it to a host.
Volumes can be mapped to a host simply by right-clicking on the volume and choosing the Map to
Host option.
The Spectrum Virtualize GUI and the CLI use different abbreviations to indicate capacity. The
following table displays the differences in how capacity indicators are displayed in the management
GUI versus the CLI.
The metrics KiB, MiB, GiB, and TiB represent 2^10, 2^20, 2^30, and 2^40 bytes respectively. This
notation was defined in part to avoid confusion between powers of 10 and powers of 2 (for
example, a million bytes versus a MiB of 1,048,576 bytes). As a consequence, KB, MB, GB, and
TB have changed to denote powers of 10: 10^3, 10^6, 10^9, and 10^12 respectively.
Currently Spectrum Virtualize uses both KB and KiB (and similarly for the other metrics) to
represent the same thing, namely powers of 2.
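The gap between the binary and decimal units works out as follows (plain arithmetic, shown here in Python):

```python
GiB = 2 ** 30   # binary gibibyte: 1,073,741,824 bytes
GB = 10 ** 9    # decimal gigabyte: 1,000,000,000 bytes

# A "10 GB" decimal capacity expressed in binary units:
print(round(10 * GB / GiB, 2))  # 9.31
# Difference per unit:
print(GiB - GB)                 # 73741824 bytes, about 7.4%
```

The roughly 7% difference per unit is why a capacity reported in GB on one screen and GiB on another can appear not to match, even though the byte count is identical.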
Example 2
• Space for the volume comes from the assigned storage pool
• Volume capacity = 10 GB, striped across MDisks: 3 extents on MDisk1, 3 extents on MDisk2, and 4 extents on MDisk3 (extent size = 1 GB)
• The lsvdiskextent <volume name> command displays the extents assigned
Whether you chose the option to Create or Create and Map, the GUI will generate a mkvdisk
command for each volume and the volume parameters specified.
The mkvdisk command creates a 10 GB volume in the pool (or managed disk group) with ID 0, and
the volume is given a name of WINVOL1 with an ID of 1. All volumes are assigned a volume ID.
The volume capacity is rounded to a whole number of extents for extent allocation, so it’s
recommended to make volume sizes a multiple of the extent size so space isn’t wasted. The
lsvdiskextent command (with the volume name or ID) can be used to view on which MDisks the
VDisk and its extents reside.
Example 2 graphically shows a VDisk striped across the 3 MDisks in a storage pool.
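The rounding behavior mentioned above can be sketched as follows (illustrative; a 1 GiB extent size is assumed, as in the example):

```python
import math

def extents_needed(capacity_gib: float, extent_gib: int = 1) -> int:
    """Volume capacity is rounded up to a whole number of extents, so sizes
    that are an exact multiple of the extent size waste no space."""
    return math.ceil(capacity_gib / extent_gib)

print(extents_needed(10))    # 10 -- exact multiple, no waste
print(extents_needed(10.5))  # 11 -- half an extent is allocated but unused
```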
(Slide: A mirrored volume — two copies of equal GB capacity residing in separate storage pools)
A mirrored volume is created using the same preset as a basic volume, except it is created with two
identical volume copies of the same virtual capacity, providing a simple RAID 1 function. Typically
each volume copy will reside in a different storage pool from different disk subsystems, maintaining
availability in case a disk subsystem fails.
In the management GUI, an asterisk (*) indicates the primary copy of the mirrored volume. The
primary copy indicates the preferred volume for read requests.
The volume copy can be any type: image, striped, or sequential. The volume copies can be created
as fully allocated volumes or with capacity savings of thin-provisioned or compressed.
Mirrored volumes are discussed in detail in a later topic.
(Slide: Custom volumes — mirrored volume copies that can be compressed or thin-provisioned within a storage pool)
The custom volume option provides options to create volumes with thin-provisioning, compression
and deduplication capabilities. It also expands the base level default options for basic and mirrored
volumes. A custom volume can be customized with respect to mirror synch rate, cache mode and
formatting.
(Slide: Format options for basic and mirrored volumes)
Fully allocated volumes are automatically formatted and initialized through the quick initialization
process after the volume is created. This process fills the volume with zeros right after it is
formatted, making it available for immediate use, with the format completing in the background. The
system formats any new fully allocated volume copy by default. Quick initialization requires a small
amount of I/O to complete and limits the number of volumes that can be initialized at the same time.
Some volume actions such as moving, expanding, shrinking, or adding a volume copy are disabled
when the specified volume is formatting. Those actions can be done when the formatting process
completes.
The quick initialization process can be disabled in circumstances where it is not necessary. For
example, if the volume is the target of a Copy Services function, the Copy Services operation
formats the volume.
The quick initialization process can also be disabled for performance testing so that the
measurements of the raw system capabilities can take place without waiting for the process to
complete. Filling new volumes with zeros is a good practice because otherwise there may be
non-referenced old data on the storage that results in wasted resource use later. For example, if
we create a fully allocated volume but don't format it and later make a volume copy, all that
unused data gets unnecessarily copied; or if we convert it to a thin volume, that old data uses up
space.
Volumes from disk subsystems use a UID to uniquely identify them, and Spectrum Virtualize is no
exception. The UID for a LUN on backend storage can be compared to the UIDs of SV MDisks to
correlate the SV MDisk to the disk name used on the backend storage. Similarly, as a VDisk is
configured on a host, the UID on the host disk can be compared to the VDisk UIDs on SV to
correlate it to the SV name. After we correlate the two names, we typically change the VDisk name
on Spectrum Virtualize to reflect the host to which it belongs and its use.
Volume cache modes
By default, when a volume has been created, the cache is set to readwrite mode.
• readwrite: All read and write I/O operations that are performed by the volume are stored in cache. This is the default cache mode for all volumes.
• readonly: All read I/O operations that are performed by the volume are stored in cache. Required when backend storage is doing remote mirroring or FlashCopy, to maintain write-order data consistency.
• none: All read and write I/O operations that are performed by the volume are not stored in cache.
Cache in IBM Spectrum Virtualize storage system can be set at a single volume granularity. For
each volume, the cache can be readwrite, readonly, or none.
By default, when a volume has been created the cache is set to readwrite, so write latency is
improved and operates at cache speeds. Reads also are faster when the data is in cache. You use
cache-disabled (none) volumes primarily when you are virtualizing an existing storage
infrastructure and you want to retain the existing storage system copy services. You need to use
cache-disabled volumes where copy services are being used in the backend rather than controlled
via Spectrum Virtualize, to maintain write order and data consistency. Note that for DRP volumes,
readwrite is always used.
Keep the use of cache-disabled volumes to a minimum for normal workloads, because turning off
write cache increases write latency; however, if the backend storage has write cache, that offsets
most of the increased latency.
You can also use cache-disabled volumes to control the allocation of cache resources. By disabling
the cache for certain volumes, more cache resources are available to cache I/Os to other volumes
in the same I/O group. This technique of using cache-disabled volumes is effective where an I/O
group serves volumes that benefit from cache and other volumes, where the benefits of caching are
small or nonexistent.
Volume commands (1 of 5)
Additional CLI commands for administering volumes:
mkvolume
mkimagevolume
addvolumecopy
rmvolumecopy
rmvolume
The lsvdisk command now includes volume_id, volume_name, and function fields to easily identify
the individual volumes that make up a HyperSwap volume
IBM Spectrum Virtualize introduced additional CLI commands for administering volumes in stretched clusters or for HyperSwap, but the GUI continues to use the legacy commands for all volume administration.
The new volume commands:
mkvolume
mkimagevolume
addvolumecopy
rmvolumecopy
rmvolume
The lsvdisk command has also been modified to include volume_id, volume_name, and function fields to easily identify the individual volumes that make up a HyperSwap volume. The mkimagevolume command was implemented to provide a separate command for creating a VDisk from an MDisk with existing data, as opposed to creating a new empty VDisk. The mkimagevolume command is equivalent to mkvdisk -vtype image.
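As a sketch of that equivalence (the volume name, MDisk ID, pool ID, and I/O group ID here are illustrative, not taken from the course lab), importing an MDisk with existing data looks like this in the legacy and new forms:

```
# Legacy form: create an image-mode VDisk from MDisk 2 in pool 0
mkvdisk -vtype image -mdisk 2 -mdiskgrp 0 -iogrp 0 -name imported_vol
# Newer equivalent
mkimagevolume -mdisk 2 -pool 0 -name imported_vol
```

Both forms preserve the data already on the MDisk; only the new command makes the "import" intent explicit.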
Volume commands (2 of 5)
The mkvolume command:
Creates an empty volume using storage from existing storage pools
Behavior is determined by the system topology and the number of storage pools specified
Used to create HyperSwap or stretched system topology volumes:
í Stretched volume = two mirrored copies
í HyperSwap volume = two copies in an active-active Metro Mirror relationship
The mkvolume command, as opposed to the mkvdisk command, creates a new empty volume using storage from existing storage pools; the kind of volume created is determined by the system topology and by the number of storage pools specified. This command is used for high availability configurations that include HyperSwap or stretched system topologies. Volumes are always formatted (zeroed).
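A minimal sketch of the syntax (volume names, pool names, and sizes are illustrative): on a standard topology system, specifying one pool yields a basic volume, while on a HyperSwap system, specifying a colon-separated pool per site yields a HyperSwap volume:

```
# Basic volume on a standard topology system
mkvolume -name vol1 -pool 0 -size 100 -unit gb
# HyperSwap volume: one copy in each site's pool
mkvolume -name hsvol1 -pool site1pool:site2pool -size 100 -unit gb
```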
Volume commands (3 of 5)
The mkimagevolume command:
Create a new image mode volume
Can be used to import a volume, preserving existing data
Implemented to provide a separate command for creating a new empty VDisk versus a
VDisk with existing data. Equivalent to mkvdisk -vtype image
Examples:
Fully allocated image:
mkimagevolume -mdisk 2 -pool 0
Thin-provisioned image:
mkimagevolume -mdisk 7 -pool 1 -thin -size 25 -unit gb
The mkimagevolume command creates a new image mode volume. This command can be used to import a volume, preserving existing data. It was implemented as a separate command to provide greater distinction between the action of creating a new empty volume and creating a volume by importing data on an existing MDisk.
Volume commands (4 of 5)
The addvolumecopy command:
Adds new copy to an existing volume (synchronized from the existing copy)
Used for stretched and HyperSwap topologies to create high availability volumes
The command addvolumecopy can be used to create:
í Mirrored volume - Standard topology
Examples:
addvolumecopy -pool 2 volume5
addvolumecopy -pool site2pool1 -thin volume4
addvolumecopy -image mdisk12 -pool 3 volume2
í Stretched volume - Stretched topology (Stretch cluster is not supported for V7000, V9000, and FS9100)
í HyperSwap volume - HyperSwap topology
Example: addvolumecopy -pool site2pool volume5
(This command is used for both topologies)
The addvolumecopy command adds a new copy to an existing volume. It can be used for stretched and HyperSwap topology systems that require highly available volumes split across two sites. It can also be used on a standard topology system to add a mirrored copy to an existing volume. The new copy is always synchronized from the existing copy.
Volume commands (5 of 5)
The rmvolumecopy command:
Removes a copy of a volume but leaves the actual volume intact
Converts a Mirrored, Stretched, or HyperSwap volume into a basic volume
For a HyperSwap volume this includes deleting the active-active relationship and the change volumes
Allows a copy to be identified simply by its site
The rmvolume command:
Removes a volume, including volumes in remote mirroring relationships
The -force parameter from rmvdiskcopy and rmvdisk is replaced by individual override parameters,
-removercrelationships and -removefcmaps, making it clearer to the user exactly what protection they are
bypassing
More specific than the rmvdisk and rmvdiskcopy commands
The rmvolumecopy command removes a copy of a volume, leaving the volume fully intact. It converts a Mirrored, Stretched, or HyperSwap volume into a basic volume. For a HyperSwap volume, this includes deleting the active-active relationship and the change volumes. This command also allows a copy to be identified simply by its site.
The rmvolume command deletes the volume, including volumes in remote mirroring relationships.
The -force parameter of rmvdiskcopy and rmvdisk is replaced by individual override parameters, making it clearer to the user exactly what protection they are bypassing.
The new commands are intended more specifically for high availability volumes.
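A brief hedged sketch of both commands (volume names, site ID, and the choice of override parameter are illustrative):

```
# Remove the copy of hsvol1 that is located at site 2
rmvolumecopy -site 2 hsvol1
# Delete vol9 even though it is in a remote-copy relationship
rmvolume -removercrelationships vol9
```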
Volume status
offline The volume is offline and unavailable if both nodes in the I/O group are missing, or if
none of the nodes in the I/O group that are present can access any synchronized
copy of the volume. The volume can also be offline if the volume is the secondary of
a Metro Mirror or Global Mirror relationship that is not synchronized. A thin-
provisioned volume goes offline if a user attempts to write an amount of data that
exceeds the available disk space.
degraded The status of the volume is degraded if one node in the I/O group is online and the
other node is either missing or cannot access any synchronized copy of the volume.
Volume allocation
Creating virtual volume (VDisk)
Mapping volumes to host
Managing volumes
Caching I/O group
Host storage access
Non Disruptive Volume Movement
(NDVM)
Throttling
If you chose to Create and Map volumes to a host object, the host must be predefined on the
system. During the map volume to host process, the GUI generates a mkvdiskhostmap command
for each volume being mapped to a host object. This command automatically assigns the LUN
number or SCSI ID (as seen by the host) for each volume using the lowest available SCSI ID. This
SCSI ID controls the sequence in which the volumes are presented to the host. For example, if you
present three volumes to the host, and those volumes have SCSI IDs of 0, 1, and 3, the volume that
has an ID of 3 might not be found because no disk is mapped with an ID of 2.
A volume can be mapped to multiple host objects, and in such shared disk clusters it is usually required to have the same LUN ID for a volume on each host, in which case we can specify the LUN ID using the -scsi <LUN ID> parameter.
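For example, a hedged sketch (host object and volume names are illustrative) of mapping the same volume to two cluster nodes with a fixed SCSI ID:

```
mkvdiskhostmap -host clusternode1 -scsi 2 shared_vol
# Mapping a volume that is already mapped to another host may require -force
mkvdiskhostmap -host clusternode2 -scsi 2 -force shared_vol
```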
The Hosts > Mappings window in the management GUI provides an alternative view of all host volumes and their assigned SCSI IDs. Private Mappings is the default view, which lists all volumes accessed by individual hosts. You also have the option to view Shared Mappings for volumes that are mapped to host clusters, or All Host Mappings for a collective view of both.
The lshostvdiskmap command can be used to display a list of volumes that are mapped to host objects. It can be filtered to a specific host by specifying the host name or ID. In this example, we list the volumes mapped to the PODA_WIN1 host (alternatively, we can specify its host ID of 0). The -delim , parameter reduces the width of the resulting output by replacing blank spaces between columns with a delimiter (here, a comma). When the CLI displays a summary list of objects, each entry generally begins with the object ID followed by the object name.
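The command described above would be entered as follows (the host name is the one from the example; output columns vary by code level):

```
lshostvdiskmap -delim , PODA_WIN1
```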
When deleting a volume mapping, you are only removing the connection from the host to the volume; the volume then appears on the system as an unmapped volume. If the volume is mapped to multiple hosts, removing the mapping from one host does not remove the ability of the other hosts to access the volume.
If data on a volume is to be preserved, the host must unmount the disk before the volume is unmapped. This ensures that the connection to the disk is closed correctly by the host and that no data is left in the host cache when the volume is unmapped.
Volume mappings can be removed using several methods in the GUI by selecting one or more volumes (holding the Ctrl key), then right-clicking and selecting Unmap Volumes. You need to confirm how many volumes are to be unmapped by entering that number in the Verify field, and then click Unmap.
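On the CLI, the equivalent of the GUI unmap action is a sketch like the following (host and volume names are illustrative):

```
# Remove only the mapping; the volume itself is left intact
rmvdiskhostmap -host PODA_WIN1 volume4
```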
Volume allocation
Creating virtual volume (VDisk)
Mapping volumes to host
Managing volumes
Caching I/O group
Host storage access
Non Disruptive Volume Movement
(NDVM)
Throttling
The management GUI Notification icon (the middle icon highlighted in the image) provides an overview of currently running tasks triggered by the administrator, and of suggested tasks recommending that users perform specific configuration actions. Depending on the task initiated, the system might suggest that a task needs to be performed and offer direct access to the associated location using the Run Task option. In this example, a host has not been defined. If you do not want to complete the suggested task, click the Not Now link and the suggestion message disappears.
Similarly, you can analyze the details of running tasks, either all of them together in one window or one task at a time. The View option opens a single task (a volume format job, as shown). The View All Tasks option directs you to the Monitoring > Background Tasks menu, which displays all tasks in progress on the system. After a task completes, it is automatically deleted from the display.
Volume properties
• Volume UID is the equivalent of a hardware volume
serial number
• Caching I/O Group
ƒ Identifies the I/O group to which the volume belongs
• Accessible I/O Groups
ƒ Identifies the I/O groups whose ports can be used to
access the volume
ƒ By default only the caching I/O group
• Preferred node
ƒ Identifies the caching node in the I/O group
The volume properties option provides an overview of the selected volume as seen within the GUI.
You can expand the view by selecting View more details.
• All volumes are created with a Volume ID which is assigned by the system cluster at volume
creation. The Volume UID is the equivalent of a hardware volume serial number. This UID is
transmitted to the host OS and on some platforms it can be displayed by host-based
commands.
• The Caching I/O Group specifies the I/O group to which the volume belongs.
• The Accessible I/O Groups field identifies the I/O groups whose ports can be used to access and do I/O to the volume. This capability is used during non disruptive volume movement (NDVM). Normally, I/O is directed to ports on the caching node, and typically hosts have volumes served by only one I/O group, so it makes sense to limit the accessible I/O group to the caching I/O group for that host's volumes. SAN zoning and LUN/port masking features on Spectrum Virtualize are typically also used to limit the number of disk paths.
• The Preferred node identifies the caching node in the I/O group, to which the host sends I/Os when working paths to that node exist.
Managing volume resources
The Actions menu displays a list of options to manage volumes
Includes resources to reduce the complexity of moving data in a way that is transparent to the host
The volume Actions menu displays a list of options to manage volumes, such as modifying volume mappings, unmapping volumes, renaming volumes, or creating new volumes. In addition, it offers resources to reduce the complexity of moving data in a way that is transparent to the host.
The system supports a global setting that can prevent active volumes or host mappings from being
deleted inadvertently if the system detects recent I/O activity to the volume. This feature is called
volume protection.
When a volume is deleted, the system verifies whether it is a part of a host mapping, FlashCopy mapping, or remote-copy relationship. In these cases, the system fails to delete the volume unless the -force parameter is specified. However, using the -force parameter can lead to unintentional deletions of volumes. With volume protection enabled, administrators have to wait for the specified time to pass without any I/O occurring to a volume before it can be deleted, regardless of whether the -force parameter is used.
To prevent an active volume from being deleted unintentionally, administrators can enable volume protection using the CLI command chsystem -vdiskprotectionenabled yes -vdiskprotectiontime 60. The -vdiskprotectionenabled yes parameter enables volume protection, and the -vdiskprotectiontime parameter indicates how long a volume must be inactive before it can be deleted. In this case, volumes can only be deleted if they have been inactive for over 60 minutes.
Administrators who don't want to wait can disable volume protection, delete the volumes, and turn volume protection back on.
The following commands are affected by this setting:
rmvdisk
rmvdiskcopy
rmvdiskhostmap
rmmdiskgrp
rmhostiogrp
rmhost
rmhostport
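The disable-delete-re-enable sequence mentioned above can be sketched as follows (the volume name is illustrative):

```
chsystem -vdiskprotectionenabled no
rmvdisk recently_active_vol
chsystem -vdiskprotectionenabled yes -vdiskprotectiontime 60
```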
IBMcluster:superuser> lsvdiskextent VOL3
id number_extents
0 10
1 9
2 8
3 3
(Figure: the volume's extents distributed across the MDisks in the pool)
The size of a volume can be expanded to present a larger capacity disk to the host operating system. This can be accomplished in just a few clicks in the management GUI, or on the CLI using the expandvdisksize command. Increasing the size of the volume is done without interruption to the user availability of the system. However, before increasing the volume capacity, you must ensure that the host operating system provides support to recognize that a volume has increased in size.
For example:
• AIX 5L V5.2 and higher, by issuing chvg -g vgname
• Windows Server 2008, and Windows Server 2012 for basic and dynamic disks
This example shows a 10 GB volume whose capacity was increased by 5 GB. When a volume is expanded, its virtualization type becomes striped, even if it was previously defined as sequential, unless the -mdisk flag is used to ensure that the added extents come from the same MDisk, leaving it as a sequential VDisk. Image type volumes cannot be expanded.
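The expansion in the example would be driven from the CLI with a command of this shape (volume name from the example; size and unit are illustrative):

```
# Grow VOL3 by 5 GB; the host OS must support online resize
expandvdisksize -size 5 -unit gb VOL3
```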
The method that the system cluster uses to shrink a volume is to remove the required number of extents from the end of the volume. Depending on where the data is on the volume, this action can be data destructive. Therefore, the recommendation is that the volume should not be in use by a host. Shrinking a volume is similar to expanding volume capacity. Ensure that the operating system supports shrinking (natively or by using third-party tools) before you use this function. In addition, it is best practice to always have a consistent backup before you attempt to shrink a volume.
The shrinkvdisksize command that is generated by the GUI decreases the size of the volume by the specified amount. This interface for reducing the size of a volume is not intended for in-use volumes that are mapped to a host. It is used for volumes whose content will be overlaid after the size reduction, such as a FlashCopy target volume where the source volume has an esoteric size that needs to be matched.
Alternatively, to avoid losing data on a VDisk when shrinking it, simply create a new volume of the smaller size (assuming it is large enough to hold all the data), map the volume to the host, and then have the host administrator migrate the data to the new smaller volume; this can often be done dynamically. After migrating the data, you can reclaim the original larger volume.
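The GUI-generated command has this shape (a sketch only; the volume name and size are illustrative, and the command is appropriate only for volumes whose content will be overlaid):

```
# Remove 5 GB of extents from the end of VOL3 - potentially data destructive
shrinkvdisksize -size 5 -unit gb VOL3
```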
Example:
IBMcluster:superuser> lsvdiskextent VOL3
id number_extents
0 7
1 7
2 6
(Figure: the volume's extents redistributed across the remaining MDisks in the pool)
If an MDisk is removed from a storage pool, all of its allocated extents are redistributed to the other MDisks in the pool; this includes migrating the extents of any volumes that were allocated from the MDisk. The rmmdisk command that is generated by the GUI contains the -force parameter to remove the MDisk from its current pool. The -force specification enables the removal of the MDisk by redistributing the allocated extents of this MDisk to other MDisks in the pool. You can use the lsvdiskextent command followed by the volume name to view the extent distribution for the volume.
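A sketch of that sequence (the MDisk and pool names are illustrative; VOL3 is the volume from the example):

```
# Drain the MDisk's extents onto the pool's other MDisks and remove it
rmmdisk -mdisk mdisk3 -force Pool0
# Check how the volume's extents are now distributed
lsvdiskextent VOL3
```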
Volume allocation
Creating virtual volume (VDisk)
Mapping volumes to host
Managing volumes
Caching I/O group
Host storage access
Non Disruptive Volume Movement
(NDVM)
Throttling
This topic discusses the process of volume caching, and the system read and write distribution.
Although each control enclosure of the system cluster has a copy of the cluster state data, MDisks and storage pools are cluster-wide resources available to all I/O groups in the cluster. Volumes, on the other hand, are owned by the caching node in the caching I/O group, which handles I/Os for the volume. If the caching node fails, the other node in the I/O group takes over control of the volume. The I/O group is known as the volume's caching I/O group. When a write operation is performed to a volume, the node that processes the I/O duplicates the data onto the partner node in the I/O group.
It’s important to protect the write cache data for data consistency purposes; thus we must have an
extra copy elsewhere (in the other node’s cache) to make sure it’s not lost in the event of a node
failure.
(Figure: I/O Group 0 serving volumes V1 and V2, with Node 1 as the preferred control node and Node 2 as the alternative; write I/O is mirrored between the two nodes' caches, and each node has boot disks for cache protection)
When a write operation occurs, the host sends the write I/O to the volume through its preferred control node, through which the volume is normally accessed. The distributed cache can be managed by both control nodes of the caching I/O group.
The host initiates a write I/O request (1), and the multipath algorithm selects the caching node. The I/O goes into the caching node's cache (2), and a copy of the data is sent to the cache of the I/O group's other node (3). After the data is protected on the partner node, a write-complete acknowledgment is returned to the requesting host (4).
The data is physically written to the storage disk later (5), when cache management in control Node 1 (the preferred node) destages the cached data and the other control node is notified that the data has been destaged. The system cluster write cache is partitioned to ensure that no heavily accessed or slow performing pool consumes all the cache.
To protect write cache data, in case of power loss or I/O group failure, data in volatile memory is
written to both internal boot drives.
(Figure: I/O Group 0 after a node failure; the surviving node takes over I/O for V1 and V2, and the mirrored write cache data is destaged to disk)
If a node failure occurs within an I/O group, the other node in the I/O group assumes the I/O responsibilities of the failed node (1). Data loss during a node failure is prevented by mirroring the write data cache between the two nodes' caches in the I/O group (2). A node failure causes the surviving control node to accelerate the destaging of all modified data in its cache (3) to the storage disk, to minimize the exposure to failure (4). At this point, all I/O writes are processed in write-through mode.
Volume allocation
Host storage access
Non Disruptive Volume Movement
(NDVM)
Throttling
This topic describes the process that hosts use to configure the volumes presented to them, and to examine their paths.
From the Windows host perspective, the storage volumes are presented as standard SCSI disks. All volumes that are mapped to a Windows host via Fibre Channel or iSCSI are discovered and displayed collectively within the Windows Disk Management interface. Windows presents the volumes as unallocated disks that must be initialized and formatted for use as logical drives.
MPIO support should be installed before configuring multipath disks; otherwise, the system configures a disk for each path to it, which can lead to problems later.
The SDDDSM datapath query device command can be used to correlate IBM Spectrum Virtualize
system volumes based on the serial number that is shown for the disk (which is the storage system
UID for the volume).
Disk paths are displayed and can be used to validate the disk path design, zoning, and cabling, and
compared to the disk paths as seen by the Spectrum Virtualize system. Paths without an (*) are
preferred or optimized paths that go to the caching node for the volume, and are normally used for
I/Os to the disk, though non-preferred paths are used as part of the disk configuration process or in
some failure scenarios. This is reflected in the Select column, which indicates how many times each path was selected for an I/O. Non-preferred paths typically go to the non-caching node, or to ports on other accessible I/O groups if so configured.
(Figure: AIX host io_grp with two VA_VOL volumes, each showing its ID, UID, and SCSI ID)
AIX uses the cfgmgr command to discover newly added devices, including disks, while lsdev -Cc disk lists the configured disks. The example shows two 2145 disks that were configured; previously only hdisk0 was configured, which contains the rootvg, or operating system.
AIX offers built-in MPIO, or one can use the SDDPCM product (which uses the built-in MPIO) for displaying and managing paths.
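The discovery steps described above would be entered as:

```
cfgmgr            # scan for newly mapped devices, including disks
lsdev -Cc disk    # list the configured disk devices
```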
Once hdisks are discovered and configured on the AIX system, they are typically put into a Volume Group (VG) for use with AIX's Logical Volume Manager (LVM). The mkvg command creates the VG, and then Logical Volumes (LVs) are typically created, often for holding file systems. The lspv output correlates hdisks with the VG to which they belong. The lsvg -l <vgname> command lists the logical volumes in the VG, none of which have been created yet in this example.
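A hedged sketch of those steps (the VG and hdisk names are illustrative, not the lab's):

```
mkvg -y datavg hdisk1 hdisk2   # create a volume group from the new disks
lspv                           # correlate hdisks with their volume groups
lsvg -l datavg                 # list logical volumes in the VG (none yet)
```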
To confirm that the new disks are discovered and that the paths have been configured correctly, the SDDPCM pcmpath query device command is used. The output of this command has the same structure as the SDDDSM output. The pcmpath query device command validates the I/O distribution across the paths to the preferred storage enclosure of the volume (or hdisk). SDDPCM identifies eight paths for each hdisk because this host is zoned for eight-path access in the example. Currently, all eight paths show a state of OPEN because the volume group is varied on.
Paths without an (*) are paths to the caching node of the volume, and are called preferred, optimized, or primary paths.
The SERIAL number of the AIX hdisk correlates to the storage system UID value of the volume.
(Figure: paths from the AIX host's fscsi adapters through the fabric to the storage enclosure ports serving the volume's io_grp, with the volume's ID, UID, and SCSI ID)
From the previous two command output sets, an understanding of the path configuration can be obtained and the host zoning can be validated. Under normal circumstances, SDDPCM or AIX MPIO distributes I/O requests across these four paths to the volume's preferred storage enclosure.
IBM SAN Volume Controller, the Storwize Family, FlashSystem 9100, and FlashSystem V9000 support guest operating systems that are running under VMware. With tight integration with VMware vSphere workloads (including vCenter Web Client (VWC), vStorage APIs for Storage Awareness (VASA), and vStorage APIs for Array Integration (VAAI)), clients can benefit from the extreme performance and macro efficiency attributes of IBM FlashSystems.
Each storage system integrates with the VMware vCloud suite of applications, including vCenter Site Recovery Manager (SRM/SRA), to facilitate administration, management, and monitoring.
Enabling VVOL
• Prerequisites for VVOL support:
ƒ IBM Spectrum Virtualize version 7.6.0 or later
ƒ IBM Spectrum Control Base Edition (version 2.2.1 or later)
ƒ VMware vSphere (ESXi hosts and vCenter) version 6.0 (or later)
ƒ Requires Network Time Protocol (NTP) server
Configured on both the storage system and the IBM Spectrum Control
Base server
ƒ Requires network information for both VMware vCenter and IBM
Spectrum Control Base Edition
IP address, subnet mask, gateway, and fully qualified domain name
(FQDN) such as hostname.domain.com
• VVOL must be enabled using Settings > System > VVOL.
ƒ Select On to enable Virtual Volumes
í A utility volume is automatically created to store critical metadata -
managed by Spectrum Control
The system supports VMware vSphere Virtual Volumes, sometimes referred to as Virtual Volumes or VVols, which allow VMware vCenter to automate the management of system objects like volumes and pools. VVOL must be enabled using the GUI Settings > System > VVOL selection.
Before you configure Virtual Volumes, the following prerequisites must be met:
• Ensure that your system is running version 7.6.0 or later.
• Ensure that IBM Spectrum Control Base Edition (version 2.2.1 or later) is installed.
• Ensure that you are running VMware vSphere (ESXi hosts and vCenter) version 6.0 (or later).
• Ensure that Network Time Protocol (NTP) server is configured on the system and the IBM
Spectrum Control Base server. NTP ensures that time settings are consistent between the
system and the IBM Spectrum Control Base server.
• Confirm that you have the network information for both VMware vCenter and IBM Spectrum
Control Base Edition: the IP address, subnet mask, gateway, and fully qualified domain name
(FQDN) such as hostname.domain.com.
Once you have met the prerequisites, select On to enable VVOL. A utility volume is automatically created to store critical metadata that is required for Virtual Volumes. This utility volume is managed by the IBM Spectrum Control Base Edition server.
After VMware vSphere Virtual Volumes are enabled on the system, you need to define ESXi hosts
that are enabled for Virtual Volumes on the system. Enter the name of an ESXi host server that will
access storage from the system and enter connection information. Select VVOL for the host type.
Click Add Host. Repeat this step for each ESXi host server.
If the ESXi host was previously configured, the host type can be changed by selecting the ESXi
host. Click Action and select Properties or right-click on the existing ESXi host. On the Overview
panel, select Edit and change the host type to VVOL.
An ESXi host can also be configured using the command-line interface by entering the command: mkhost -name esx1 -type adminlun -fcwwpn number. The -type adminlun parameter indicates that the host is used for Virtual Volumes management. The same command syntax can be issued for an existing ESXi server by using chhost instead of mkhost.
Creating VVOL
• Create a user with the VASA Provider security role on the IBM
Spectrum Connect server to manage VVOLs
• Assign a storage pool to provide capacity for the utility volume
ƒ Best practice: store a mirrored copy in a second storage pool
ƒ System admin can complete certain management actions on VASA-owned volumes and pools
To manage the VVOLs, the system administrator can assign ownership of Virtual Volumes to IBM
Spectrum Control Base Edition by creating a user with the VASA Provider security role. IBM
Spectrum Control Base Edition provides communication between the VMware vSphere
infrastructure and the clustered system.
Defining the user account for the IBM Spectrum Control Base Edition server automatically
configures a new user with the VASA Provider role. IBM Spectrum Control Base Edition server uses
these storage credentials and role privileges to access the system and to run the automated tasks
that are required for Virtual Volumes. It also provides communication between the VMware vSphere
infrastructure and the storage system. Pools and VVOLs can also be configured using the CLI.
The system administrator selects or creates a pool that will provide capacity for the utility volume. With each new volume created by the VASA provider, VMware vCenter defines a few kilobytes of metadata that are stored on the utility volume. The utility volume can be mirrored to a second storage pool from a different storage system or a different I/O group to ensure that the failure of a storage pool does not result in loss of access to the metadata. Utility volumes are exclusively used by the VASA provider and cannot be deleted or mapped to other host objects.
Although the system administrator can complete certain actions on volumes and pools that are owned by the VASA Provider security role, IBM Spectrum Control Base Edition retains the primary management responsibility for all Virtual Volumes.
VVOL support can only be removed by the system administrator. This requires the removal of any associated pools. Ensure that the VMware vCenter administrator has migrated any virtual machines off VVol datastores hosted by the storage system. If data still remains, the pool cannot be deleted.
To disable VVOL support, go to Settings > System > VVOL and change the setting to Off.
For information about IBM Spectrum Control Base Edition management of VVOLs, refer to the IBM
Spectrum Control Base Edition documentation.
Volume allocation
Host storage access
Non Disruptive Volume Movement
(NDVM)
Throttling
This topic identifies the process by which volumes can be moved to a different caching I/O group.
Moving a volume between I/O groups is considered a migration task, which involves the SAN fabric, storage system cluster, and host administrators. Hosts mapped to the volume must support non disruptive volume movement (NDVM). Modifying the I/O group that services the volume can be done concurrently with I/O operations if the host supports non disruptive volume move. However, the cached data that is held within the system must first be written to the system disk before the allocation of the volume can be changed. Since paths to the new I/O group need to be discovered and managed, multipath driver support is critical for nondisruptive volume moves between I/O groups. Typically, the SAN administrator has to change the zoning first to allow FC communication to the target I/O group, and then to remove the zoning from the source I/O group after the VDisk has been migrated. Rescanning at the host level ensures that the multipathing driver is notified that the allocation of the preferred storage enclosure has changed and that the ports by which the volume is accessed have changed. The general reason to do an NDVM is to reduce the workload of an I/O group and more evenly balance the workload across the cluster.
If there are any host mappings for the volume, the host must have permission to access LUNs from
the target I/O group as specified via the -iogrp flag of mkhost, or added with the addvdiskaccess
command. Keep in mind that the commands and actions on the host vary depending on the type of
host and the connection method used. These steps must be completed on all hosts to which the
selected volumes are currently mapped.
For example, on Windows, a rescan for disks will remove stale paths after moving a volume to a new I/O group. On AIX, cfgmgr will rescan the paths and set the pathing appropriately.
Note that NDVM is not supported with DRP volumes.
Support information for an IBM Spectrum Virtualize storage product is based on code level. One easy way to locate its web page is to perform a web search using key words, for example 'IBM SAN Volume Controller supported hardware list'.
The NDVM section of the supported hardware list identifies the host environments that support non-disruptively moving a volume between I/O groups. The Multipathing column identifies the multipath driver required. After the move, paths to the prior I/O group might not be deleted until a host reboot occurs.
The visual shows some additional notes and a summary on Non-Disruptive Volume Move (NDVM).
You can also use the management GUI to move volumes between I/O groups non-disruptively. In
the management GUI, select Volumes > Volumes. On the Volumes panel, select the volume that
you want to move and select Modify I/O Group or select Actions > Modify I/O Group. The wizard
guides you through all the steps that are necessary for moving a volume to another I/O group,
including any changes to hosts that are required.
A volume is owned by a caching I/O group as active I/O data of the volume is cached in the storage
enclosures of this I/O group. If a volume is not assigned to a host, changing its I/O group is simple,
as none of its data is cached yet.
Make sure you create paths to I/O groups on the host system. After the system has successfully
added the new I/O group to the volume's access set and you have moved selected volumes to
another I/O group, detect the new paths to the volumes on the host.
The GUI generates the following commands to move a volume to a new caching I/O group:
• The movevdisk -iogrp command enables the caching I/O group of the volume to be changed.
The -storage enclosure parameter allows the preferred storage enclosure of the volume to
be explicitly specified. Otherwise, the system load balances between the two storage
enclosures of the specified I/O group.
• The addvdiskaccess -iogrp command adds the specified I/O group to the volume’s access
list. The volume is accessible from the ports of both I/O groups. However, the volume’s data is
only cached in its new caching I/O group.
• The rmvdiskaccess -iogrp command removes the access to the volume from the ports of the
specified I/O group. The volume is now only accessible through the ports of its newly assigned
caching I/O group.
The chvdisk -iogrp option is no longer available beginning with v6.4.0.
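As an illustration only, the command sequence above can be modeled as state changes on the volume's caching I/O group and access set. The class and transitions below are a simplified sketch of the behavior described in the text, not product code:

```python
# Minimal model of a volume's I/O-group state during non-disruptive
# volume movement (NDVM), following the addvdiskaccess -> movevdisk
# -> rmvdiskaccess sequence described above.

class Volume:
    def __init__(self, name, caching_iogrp):
        self.name = name
        self.caching_iogrp = caching_iogrp      # I/O group that caches the volume's data
        self.access_iogrps = {caching_iogrp}    # I/O groups whose ports serve I/O

    def addvdiskaccess(self, iogrp):
        # Volume becomes reachable through the target I/O group's ports.
        self.access_iogrps.add(iogrp)

    def movevdisk(self, iogrp):
        # Cached data is destaged, then caching moves to the new I/O group.
        self.caching_iogrp = iogrp

    def rmvdiskaccess(self, iogrp):
        # Source I/O group's ports no longer present the volume.
        self.access_iogrps.discard(iogrp)

vol = Volume("vdisk0", caching_iogrp="io_grp0")
vol.addvdiskaccess("io_grp1")   # host rescans and discovers new paths here
vol.movevdisk("io_grp1")
vol.rmvdiskaccess("io_grp0")    # host rescans again; old paths are removed
print(vol.caching_iogrp, sorted(vol.access_iogrps))  # io_grp1 ['io_grp1']
```

Note that during the middle of the sequence the volume is accessible through both I/O groups' ports, which is what allows the host to discover the new paths before the old ones are removed.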
Keep in mind that the SAN administrator will typically be involved: first to add zones so the host can access ports on the target I/O group, and then, after migrating the volume to the target I/O group, to remove zones so the host no longer accesses the source I/O group.
The host administrator is also involved. This includes rescans of the disk from the host to pick up the new paths to the target I/O group, and then correctly configuring the preferred paths to the new I/O group after the volume is moved.
Changing the caching node in an I/O group for a volume is a simpler task than NDVM, which changes the I/O group handling a volume. Note that the host administrator should rescan disks after this change as well, because the preferred paths also change and need to be reflected on the host.
The storage system allows you to issue the CLI movevdisk command to move the preferred storage enclosure of a volume either within the same caching I/O group or to another caching I/O group.
Volume allocation
Host storage access
Non Disruptive Volume Movement (NDVM)
Throttling
This module discusses the ability to set I/O throttling for IBM Spectrum Virtualize storage objects.
System throttles
• System supports throttles on hosts, host clusters, volumes, copy offload operations, and storage pools
• Controls the amount of resources used when processing I/Os
• Sets limits on both bandwidth and IOPS for host clusters, hosts, and volumes
• Bandwidth limit and IOPS limit are the maximum amounts that can be processed before the system delays processing
• If more than one throttle applies to an I/O operation, the lowest and most stringent throttle is used
Throttling is a mechanism that is intended to favor the performance of critical business applications that run concurrently with less critical applications. Because the IBM Spectrum Virtualize system shares capacity and cache among all applications, and all hosts that are attached to the same resources, equal allocation of these resources among both critical and less critical applications might negatively affect the performance of the business-critical applications. Therefore, throttling is a mechanism to control the maximum amount of resources, in terms of bandwidth and IOPS, that are used when the system is processing I/Os on a specific host, host cluster, volume, copy offload operation, or storage pool.
When you configure throttles on the system, keep in mind the following guidelines:
The throttle limit is a per-node limit. For example, if a throttle limit is set for a volume at 100 IOPS, each node on the system that has access to the volume allows 100 IOPS for that volume. Any I/O operation that exceeds the throttle limit is queued at the receiving nodes. The multipath policies on the host determine how many nodes receive I/O operations and thus the effective throttle limit. If more than one throttle applies to an I/O operation, the lowest and most stringent throttle is used.
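The "lowest throttle wins" rule can be sketched in a few lines of Python. The limits shown are hypothetical, and this is an illustration of the selection logic only, not the product's implementation:

```python
# Pick the effective IOPS limit for an I/O when several throttles apply:
# the lowest (most stringent) configured limit wins. A value of None
# means no throttle is set at that level. Remember that each limit is
# enforced per node, so the total allowed rate also depends on how many
# nodes the host's multipath policy sends I/O to.

def effective_iops_limit(*limits):
    configured = [l for l in limits if l is not None]
    return min(configured) if configured else None

host_cluster_limit = 1000   # hypothetical throttle values
host_limit = 500
volume_limit = 800

print(effective_iops_limit(host_cluster_limit, host_limit, volume_limit))  # 500
```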
Volumes: can be used to limit I/O for lower priority volume workloads so that I/O operations for production volumes are not affected
In case of a host cluster throttle, all the hosts in the host cluster share the throttle limit. The
response to this issue is to limit the input/output operations per second (IOPS) rate and bandwidth
of certain applications by specifying and then enforcing limits. As a result, throttling enables better
performance for the critical host applications sharing storage resources, concurrently with the
noncritical host applications.
If throttles are configured on any of the selected hosts, the throttles must be removed to be included
in the host cluster. Throttles can be applied only to the host cluster and not individual hosts within
the cluster. If you choose not to remove the throttles from the host, the host is excluded from the
host cluster.
The system supports throttles to delay processing of I/O operations for volumes. If storage systems provide storage to a wide variety of applications, then production volumes with more critical I/O can be competing with volumes that have lower priority operations. For example, volumes that are used for backup or archive operations can have I/O intensive workloads, potentially taking bandwidth from production volumes. A volume throttle can be used to limit I/Os for these types of volumes so that I/O operations for production volumes are not affected.
Throttles can be defined for storage pools to control I/O operations on back-end storage systems.
Storage pool throttles can be used to avoid overwhelming the back-end storage and be used with
virtual volumes. Since virtual volumes use child pools, a throttle limit for the child pool limits I/O
operations for that pool's volumes. Parent and child pool throttles are independent of each other. A
child pool can have higher throttle limits than its parent pool.
You can also create throttles for systems that have copy offload features enabled, such as
offloaded data transfer (ODX) on Microsoft Windows Server 2012 or for XCOPY/WRITESAME
features on VMware hosts. Copy offload frees up host cycles for some host copies (e.g. copying a
boot volume for creating a VM), by having Spectrum Virtualize do it. Copy offload must be enabled
through a chsystem -odx on command, and also requires a certain level of SDDDSM multi-path
code on Windows. For systems with these features enabled, administrators can define throttles to
delay processing for copy offloads to free bandwidth for other more critical operations. When a
throttle for copy offload is defined, the throttle is applied for all copy offload I/O on the Spectrum
Virtualize system. Like other throttles on the system, you can set IOPS throttles, bandwidth
throttles, or both; however, bandwidth throttles are more effective for copy offload operations.
Additional CLI commands were created to support the throttling feature, and existing CLI commands were extended.
• Use the mkthrottle command to create a new throttle object and associate it with an object (such as a volume) or offloaded I/O.
• Use the chthrottle command to change attributes associated with a specified throttle object.
• Use the lsthrottle command to list throttle objects that are configured in the clustered system.
• The chvdisk command, used to modify the properties of a volume, can also be used to set throttle limits.
• The lsvdisk command is commonly used to display a concise list or a detailed view of volumes that are recognized by the clustered system.
Keywords
• Battery modules
• Control canister
• Data-at-rest encryption
• Fibre Channel (FC)
• Fibre Channel Protocol (FCP)
• Hardware-assisted compression acceleration
• IBM Spectrum Virtualize
• iSCSI Extensions over RDMA (iSER)
• iWARP (internet Wide-area RDMA Protocol)
• Mirrored Boot Drives
• Non Disruptive Volume Movement (NDVM)
• Non-Volatile Memory Express (NVMe)
• RACE Compression
• RDMA over Converged Ethernet (RoCE)
• Real-time Compression Acceleration card
• Uninterruptible power supplies (UPS)
Review questions (1 of 2)
1. For a host to access volumes that are provisioned by the system cluster, which of the
following must be true?
A. The host WWPNs or IQN must be configured on the volume’s owning I/O group.
B. Fibre Channel zoning or iSCSI IP port configuration must have been set up to allow appropriate ports to establish connectivity.
C. The volumes must have been created and mapped to the given host object.
D. All of the above
2. True or False: Data consistency is protected in the event of a node failure by mirroring
write cache between the two nodes in an I/O group.
Review answers (1 of 2)
1. For a host to access volumes that are provisioned by the system cluster, which of the
following must be true?
A. The host WWPNs or IQN must be configured on the volume’s owning I/O group.
B. Fibre Channel zoning or iSCSI IP port configuration must have been set up to allow appropriate ports to establish connectivity.
C. The volumes must have been created and mapped to the given host object.
D. All of the above
The answer is all of the above.
2. True or False: Data consistency is protected in the event of a node failure by mirroring
write cache between the two nodes in an I/O group.
The answer is True. When a node fails within an I/O group, the other node in the I/O group
assumes the I/O responsibilities of the failed node.
Review questions (2 of 2)
3. Host objects are defined to Spectrum Virtualize via (choose all that apply):
A. IP address
B. WWPNs
C. Hostname and domain name
D. IQNs
E. Automatically
4. True or False: Extents for a striped volume are allocated in a round robin fashion from
all the MDisks in a pool with free extents.
Review answers (2 of 2)
3. Host objects are defined to Spectrum Virtualize via (choose all that apply):
A. IP address
B. WWPNs
C. Hostname and domain name
D. IQNs
E. Automatically
The answer is WWPNs and IQNs.
4. True or False: Extents for a striped volume are allocated in a round robin fashion from
all the MDisks in a pool with free extents.
The answer is True.
Summary
Overview
This module provides an overview of integrating FC and Ethernet application servers into an IBM
Spectrum Virtualize system environment for volume access.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
IBM Spectrum Virtualize host integration
Host integration
Host integration
Fibre Channel host types
Ethernet host types
Host clusters
N_Port ID Virtualization (NPIV)
This topic discusses the concept of host server integration in a storage system environment.
An IBM Spectrum Virtualize system integrates intelligence into the SAN fabric by placing a layer of
abstraction between the host server’s logical view of storage (front-end) and the storage systems’
physical presentation of storage resources both internal and external.
By providing this virtualization layer, the host servers can be configured to use volumes and be
uncoupled from physical storage systems for data access. This uncoupling allows storage
administrators to make storage infrastructure changes and perform data migration to implement
tiered storage infrastructures transparently without the need to change host server configurations.
Additionally, the virtualization layer provides a central point for management of block storage devices in the SAN through its provisioning of storage to host servers across multiple storage systems. It also provides a platform for advanced functions such as data migration, thin provisioning, and data replication services.
Host types: Win, AIX, Sun, HP, VMware, Linux, Citrix Xen, NetWare, Blade, Tru64, Apple, SGI, and so on
IBM Spectrum Virtualize storage products support IBM and non-IBM storage systems to consolidate storage capacity and multiple application workloads for open system hosts, providing:
• Easier storage management.
• Increased utilization rate of the installed storage capacity.
• Advanced Copy Services functions offered across storage systems from separate vendors.
• Only one multipath driver required for attached hosts.
In environments where the requirement is to maintain high performance and high availability, hosts are attached through a storage area network (SAN) with Fibre Channel Protocol (FCP).
A host object is a logical object that is presented to the storage system for management. Hosts can
be connected to the IBM Spectrum Virtualize systems through Fibre Channel, Fibre Channel over
Ethernet, NVM Express (NVMe) over Fibre Channel (FC-NVMe), or an IP network using iSCSI or
iSER.
• Fibre Channel (FC) and Fibre Channel over Ethernet (FCoE) host connections use 8 Gb or 16 Gb FC connections. IBM Spectrum Virtualize FC-based hosts must be connected to SAN switches or directly connected, and must be zoned appropriately. To verify system compatibility for 16 Gb connections, or for using the IBM Spectrum Virtualize system without FC fabric switches, visit the IBM System Storage Interoperation Center (SSIC) web page:
http://www.ibm.com/systems/support/storage/ssic/interoperability.wss.
• Internet Small Computer System Interface (iSCSI) and iSCSI Extensions over RDMA (iSER) host connections enable the convergence of storage traffic onto standard lower-cost TCP/IP networks. For iSCSI host attachment, connections can be made using the native 1 Gb ports or 10 Gb ports. The 10 Gb connections offer higher performance, supporting up to seven times the per-port throughput of the 1 Gb ports. The 10 Gb FCoE ports also support iSCSI connectivity, as well as FC connectivity for hosts. The 10 Gb port cannot be used for inter-cluster communication, nor can it be used to attach back-end storage.
• IBM Spectrum Virtualize V8.2.1 introduced support for iSER host attachment for the 2145-SV1
using either RoCE or iWARP transport protocol through a 25 Gbps Ethernet adapter installed
on each node.
• In addition, IBM Spectrum Virtualize V8.2 supports the attachment of Non-Volatile Memory Express (NVMe) hosts by using FC-NVMe, with Fibre Channel Protocol (FCP) as its underlying transport.
For a given host, it is recommended that the attachment be either Fibre Channel-based or iSCSI-based, but generally not both at the same time.
NVM Express protocols
• NVMe protocols deliver high bandwidth and low latency storage access
• NVMe over Fabrics (NVMe-oF) carries NVMe across the network, with a host-side and a controller-side transport abstraction
• Two types of fabric transports for NVMe are currently part of the standard:
1. NVMe over Fabrics using Remote Direct Memory Access (RDMA): InfiniBand, RoCE (Ethernet/UDP), iWARP (TCP/IP)
2. NVMe over Fabrics using the Fibre Channel Protocol (FCP): FC-NVMe
The NVMe protocol is an open collection of standards and interfaces that fully exposes the benefits
of non-volatile memory in all types of computing environments, from mobile to data center. It is
designed to deliver high bandwidth and low latency storage access.
Fibre Channel is a fabric transport option for NVMe over Fabrics (NVMe-oF), a specification
developed by NVM Express Inc. The T11 committee of the International Committee for Information
Technology Standards (INCITS) defined a frame format and mapping protocol to apply NVMe-oF to
Fibre Channel.
NVMe-oF defines a common architecture that supports a range of storage networking fabrics for
NVMe block storage protocol over a storage networking fabric. This includes enabling a front-side
interface into storage systems, scaling out to large numbers of NVMe devices and extending the
distance within a datacenter over which NVMe devices and NVMe subsystems can be accessed.
Two types of fabric transports for NVMe are currently supported:
1. NVMe over Fabrics using Remote Direct Memory Access (RDMA) includes InfiniBand, RoCE
and iWARP. The development of NVMe over Fabrics with RDMA is defined by a technical
sub-group of the NVM Express organization.
2. NVMe over Fabrics using the Fibre Channel for host connections is referred to as FC-NVMe.
Fibre Channel Protocol (FCP) is the underlying transport for FC-NVMe, which already puts the
data transfer in control of the target and transfers data direct from host memory, similar to
RDMA.
FC-NVMe is also designed to work with Fibre Channel over Ethernet (FCoE). The goal of NVMe over Fabrics is to provide distance connectivity to NVMe devices with no more than 10 microseconds (μs) of additional latency over a native NVMe device inside a server.
Preparation guidelines
List of general procedures that pertain to all hosts:
Always check compatibility at the SSIC web page
When managing a storage system that is connected to any host, you must follow basic
configuration guidelines. These guidelines pertain to determining the preferred operating system,
driver, firmware, and supported host bus adapters (HBAs) to prevent unanticipated problems due to
untested levels.
Next, consider the number of paths through the fabric that are allocated to the host, the number of host ports to use, and the approach for spreading the hosts across I/O groups. The guidelines also apply to logical unit number (LUN) mapping and the correct size of virtual disks (volumes) to use.
For load balancing and access redundancy on the host side, the use of a host multipathing driver is
required.
A disk multi-path design balances use of the physical resources. Typically, when host and storage solutions are designed, it is assumed that the resources will be used in a balanced method of access. If not, this can lead to performance bottlenecks. Clients also rely on availability, requiring that the applications continue to work at the required performance even under scenarios where various pieces of redundant hardware fail. The most typical worst case would be a SAN fabric failure, which causes the loss of half the available designed bandwidth. Except for hosts using only two SAN ports or two SAN adapters, failure of a host or storage adapter or port usually leaves more than half the designed bandwidth.
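The bandwidth impact of these failure scenarios can be made concrete with a small sketch. The port speeds and counts below are illustrative assumptions, not product figures:

```python
# Designed bandwidth vs. bandwidth remaining after common failures.
# Assumes 4 host ports (2 dual-port adapters split across 2 fabrics),
# each at an illustrative 16 Gbps.

port_gbps = 16
host_ports = 4
designed = host_ports * port_gbps            # 64 Gbps total designed bandwidth

after_fabric_failure = designed // 2         # one fabric lost: half the ports gone
after_single_port_failure = designed - port_gbps  # one port lost: more than half remains

print(designed, after_fabric_failure, after_single_port_failure)  # 64 32 48
```

This is why solutions are commonly sized to meet performance objectives even with one fabric down.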
Host integration
Fibre Channel host types
FC host connection
MPIO implementation
FC-NVMe host connection
Ethernet host types
Host clusters
N_Port ID Virtualization (NPIV)
This topic discusses the Fibre Channel connection as it relates to disk multi-path implementation.
Fibre Channel
Fibre Channel (FC) is a technology for transmitting data between computer devices at data rates of up to 32 Gbps
FC uses a worldwide name (WWN) as a unique identity for each Fibre Channel device
End-points in FC communication (host/storage port) have a specific WWN, called a WWPN
Worldwide port names (WWPNs) associated with the HIC ports are used to define host
objects
Direct connection of control enclosures and external storage systems is not supported
Fibre Channel (FC) is the prevalent technology standard in the storage area network (SAN) data
center environment. This standard has created a multitude of FC-based solutions that have paved
the way for high performance, high availability, and the highly efficient transport and management
of data.
Each device in the SAN is identified by a unique worldwide name (WWN). The WWN also contains
a vendor identifier field and a vendor-specific information field, which is defined and maintained by
the IEEE.
You can attach the system to open-systems hosts using Fibre Channel connections, either directly or through a switched Fibre Channel fabric. Each port on a node is identified by a worldwide port name (WWPN). The WWPNs associated with the HIC ports are used to define host objects. However, direct connection between IBM Spectrum Virtualize storage systems and external storage systems is not supported.
The visual shows a dual-fabric configuration (dual fabrics are highly recommended): the host SCSI initiator connects through Fabric 1 and Fabric 2 to the AC3 storage enclosure, which acts as a SCSI target to the host and as a SCSI initiator toward AE3 storage enclosures 1 and 2.
From the perspective of the SCSI protocol, the IBM Spectrum Virtualize storage systems are no different from any other SCSI device: each appears as a SCSI target to the host SCSI initiator. The storage enclosure behaves as a SCSI device to the host objects it services, and in turn it acts as a SCSI initiator that interfaces with the back-end storage systems. Therefore, a path is defined as a logical connection from a host/initiator port to a storage/target port, or from a Spectrum Virtualize storage/initiator port to a back-end storage/target port. The path can exist only if the two Fibre Channel ports are in the same zone.
For high availability, the recommendation for attaching the system cluster to a SAN is consistent
with the recommendations of designing a standard SAN network. That is, build a dual fabric
configuration in which if any one single component fails then the connectivity between the devices
within the SAN is still maintained although possibly with degraded performance.
• Modern SAN switches have at least two types of zoning: port zoning and worldwide port name (WWPN) zoning, which is the preferred method.
• Zoning with alias names can make zoning easier to configure and understand, with the
possibilities of fewer errors.
• When mounting LUNs to a host, every LUN has its own set of paths (and they are usually the
same for all LUNs on a host).
In a SAN fabric, a host system can be connected to a storage device across the network. A path can exist only if the two FC ports are in the same zone.
For Fibre Channel host connections, the storage system must be connected to either SAN switches
or directly connected to a host port. There are no particular limits on the actual distance between
storage enclosures and host servers. Therefore, a server can be attached to an edge switch in a
core-edge configuration with the IBM Spectrum Virtualize storage system at the core of the fabric.
The storage enclosure detects Fibre Channel host interface card (HIC) ports that are connected to
the SAN. For any given volume, the number of paths through the SAN from the control enclosure to
the host must not exceed eight. However, before you can determine the number of disk paths, you
must know the cabling, the SAN zoning, and what if any LUN masking has been implemented at
the storage. Spectrum Virtualize implements LUN masking using the mkhost command -iogrp
option specifying I/O groups from which it can access LUNs, and also via the mkvdisk
-accessiogrp option specifying which I/O groups' ports can be used to access the volume.
Further, a port mask may be specified for the host further limiting the ports used for I/O from specific
I/O groups.
The paths must also be configured on the host, which typically occurs during boot or a scan for
LUNs by the host.
The number of potential paths is based on the number of storage ports times the number of host
ports, assuming all host ports can see all storage ports. This is typically reduced in half by the use
of dual SAN fabrics, and can be further reduced through SAN zoning and LUN masking. Also, since
Spectrum Virtualize is ALUA storage, typically only half the available paths are used for I/Os, those
paths going to the caching node for the volume.
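The path arithmetic in the preceding paragraph can be made concrete with a small sketch. The port counts below are illustrative assumptions:

```python
# Potential vs. actively used paths for one volume, per the text:
# potential = host ports x storage ports (if every host port saw every
# storage port); dual fabrics halve that, and with ALUA only the paths
# to the caching node are normally used for I/O.

host_ports = 2
storage_ports_per_node = 2
nodes_in_iogrp = 2

all_to_all = host_ports * storage_ports_per_node * nodes_in_iogrp  # 8 potential paths
with_dual_fabrics = all_to_all // 2   # 4: each host port sees only its own fabric
active_alua_paths = with_dual_fabrics // 2  # 2: only paths to the caching node carry I/O

print(all_to_all, with_dual_fabrics, active_alua_paths)  # 8 4 2
```

SAN zoning and LUN masking can reduce the configured path count further, as described above.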
The potential number of host paths to Spectrum Virtualize can easily exceed the multi-path code limits, and simplicity is a virtue in setting up zones and paths. A simple and common approach to zoning is to create single-initiator, single-target zones for a 4-path solution that provides availability for many dual-failure scenarios and still maintains adequate bandwidth under failure scenarios. Typically, with 2 host ports, 2 ports per node in an I/O group, and hosts generally zoned to a single I/O group (which serves up the VDisks for the host), one typically zones 4 paths. In some cases hosts are zoned to more than one I/O group (for example, with HyperSwap or NDVM).
To effectively balance use of Spectrum Virtualize resources, hosts are spread across I/O groups.
As a SAN best practice, IBM recommends a single initiator zone because it prevents malfunctioning initiators from delaying I/O from other initiators in the same zone. For optimum performance and availability, this is a common configuration for SAN attached storage, providing 4 paths.
This configuration uses two dual-port FC adapter cards on the host, and a similar configuration on
the storage. This zoning follows the single initiator and single target strategy yielding 4 paths in a
dual fabric environment.
This design provides redundancy for failure of any component (except the host), and the number of
path failures are shown for each type of failure. If a SAN fabric fails, we lose half the paths and half
the available bandwidth. Based on customer requirements, it's usually advisable to size the
interconnects to meet the customer's performance objectives under maintenance and/or failure
scenarios.
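The 4-path, single-initiator single-target zoning described above can be enumerated with a short sketch. The port and fabric names are hypothetical:

```python
# Enumerate the 4 paths produced by single-initiator, single-target
# zoning across a dual fabric: each zone pairs one host port with one
# storage port on the same fabric.

host_ports = {"H1": "fabric1", "H2": "fabric2"}           # one port per host HBA
storage_ports = {"N1P1": "fabric1", "N1P2": "fabric2",    # node 1 ports
                 "N2P1": "fabric1", "N2P2": "fabric2"}    # node 2 ports

zones = [(h, s) for h, hf in host_ports.items()
                for s, sf in storage_ports.items() if hf == sf]
print(len(zones))   # 4 paths, each its own single-initiator single-target zone
```

Each host port reaches one port on each node of the I/O group, so losing any one fabric, adapter, or node still leaves two working paths.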
This physical hardware configuration is exactly the same as the typical 4-path design, only with 8 paths as a result of the zoning. Compared to the previous zoning, this solution uses the same hardware and cabling but has twice as many paths due to a different SAN zone configuration. However, it offers little in the way of improved availability or performance. Meanwhile, the additional paths add complexity and overhead in the multi-path driver, with double the paths to manage.
In comparing the 4-path to the 8-path design from an availability perspective: with the 4-path design, when we lose a host or storage port, we also lose the ability to use the port at the other end of that path, but not with the 8-path design. Nevertheless, both options survive most dual failure scenarios (failures of both SAN fabrics, both host adapters, or both nodes cause an outage for both the 4 and 8 path designs).
Examining these two options from a performance standpoint, in both cases you have 4 host ports and 4 storage ports to handle the performance. Therefore, if a host port fails in the 4 path solution, you also lose use of a storage port, but not in the 8 path solution. Assuming host port and storage port bandwidths are relatively close to one another (which they often are), then in both the 4 and 8 path cases you are limited to the bandwidth of 3 ports, so there is no performance difference.
There is also a benefit in the 4 path case regarding path failure recovery time. In the 8 path case,
assuming a cable from the host to the switch fails, the system waits for the first I/O using that cable
to time out, then the multi-path driver will resubmit the I/O down the next path. However, if that next
path also uses the same host port, the system has to wait for it to time out again, lengthening path
failure recovery time as compared to the 4 path solution.
Single initiator zones
If a host needs more I/O bandwidth than can be supplied with 4 paths, there are two approaches a
storage administrator can implement:
• One is to simply add more adapter ports; typically on both the host and storage though there
are exceptions when for example, the storage port bandwidths are much higher than host
adapter ports.
• The other approach is to create two logical host objects (as shown in this example). Each
logical host uses a separate set of 4 host adapter ports, with each set of 4 ports connected
across adapters, fabrics and nodes as before. Then the Spectrum Virtualize administrator
assigns half the volumes for the host to one host object, and the other half to the other host
object. This solution evenly balances the I/O workload across two sets of paths, resulting in
lower path counts while using more adapters and ports assuming the I/O workload across the
two sets of volumes is equal.
▪ This design concept can be applied to solutions requiring even numbers of ports, such as
6 ports with 6 paths and one logical host, or 3 logical hosts each using 2 paths. The result
is a better balance of I/O operations across physical paths, versus faster path failure
recognition and recovery. An odd number of ports does not allow the I/O to be balanced
evenly across SAN fabrics.
In designing disk path solutions, there are issues with too many paths, and issues with not enough
paths.
• Too many paths can significantly lengthen the time to recognize and recover from a hardware
failure when that failure causes multiple paths to fail. The host multipath code must distinguish
between slow I/Os and failed I/Os via a path timeout, often 30 seconds. If the hardware failure
affects multiple paths, and a failed I/O is resent down another affected path, we have to wait
twice as long. Preferably, problems in the SAN are communicated via an RSCN message to the
host adapter driver, but that's not always possible. Further, many paths add complexity for the
administrators, and overhead to keep track of all the paths.
• Too few paths can reduce redundancy below what customers want, and limit bandwidth.
Typically, solutions are sized to provide the desired performance under failure or maintenance
scenarios.
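The recovery-time penalty described above is simple to quantify. A small sketch, assuming the typical 30-second path timeout mentioned earlier (the actual timeout is driver- and OS-dependent):

```shell
# Worst-case failure recovery time: if a hardware failure affects several
# paths and the multipath driver retries the I/O down another affected
# path, it waits for the full path timeout once per attempt.
worst_case_recovery_s() {
  timeout_s=$1
  affected_paths_tried=$2
  echo $(( timeout_s * affected_paths_tried ))
}

worst_case_recovery_s 30 1   # only one affected path tried: 30 s
worst_case_recovery_s 30 2   # retry also lands on an affected path: 60 s
```

This is why fewer, well-spread paths can recover faster than many overlapping ones: each extra affected path the driver tries adds a full timeout to the outage.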
[Diagram: five zoning alternatives for host ports and storage ports S1/S2 through a switch: no zoning, single initiator zones (one host port with two storage ports), single target zones (two host ports with one storage port), and two single initiator/single target variants (one host port with one storage port per zone)]
Listed are five zoning alternatives for a very simple, minimal environment, to help you understand
the zoning alternatives you might encounter in the field and why some are better than others.
• Option 1 doesn't implement zoning: either all ports connected to the switch can see all the
others, or the SAN administrator creates a single zone containing every port. This isn't a best
practice, partly because there are multiple initiators in the one zone. Single initiator zoning is a
SAN best practice, as is implementing zoning rather than open zoning. Single initiator zoning
prevents a malfunctioning initiator from interfering with I/Os from other initiators in the zone.
• Option 2 uses single initiator zoning, and is fully supported. There are 2 disk paths in each
zone.
• Option 3 uses single target zoning, therefore it doesn't meet the single initiator zoning best
practice.
• Option 4 uses single initiator, single target zoning and is the preferred simple approach, zoning
each host port to a single corresponding storage port. This option only has 2 disk paths, but the
disk paths map to the zones one to one, keeping it simple. While 2 paths isn't what you typically
want in a production environment, this is a minimal configuration for educational purposes.
• Option 5 uses single initiator, single target zoning as well, but with twice the number of paths.
This option does have a slight advantage from an availability perspective as compared to option
4. For example, if ports H1 and S2 fail, option 5 continues to work where option 4 doesn't since
we still have the H2 to S1 path in option 5. However, typical production environments with 4
paths handle this double failure; the likelihood of it occurring is very low, and we assume that
components are reliable enough that anything that breaks can be fixed before something else
breaks.
Option 5 has disadvantages regarding complexity, failure detection and recovery time (the
time to figure out that a path has failed and then resend failed I/Os down a working path), and
overhead.
Figure 9-19. Balancing hosts and their I/Os across I/O groups
Typically, most hosts are zoned to one I/O group, which is the caching I/O group for the host's
volumes, or VDisks. In cases where there are high I/O workloads, zoning the host to two or more
I/O groups increases the number of paths to a VDisk for each host port and I/O group. Zoning can
also be used to keep the number of paths at a minimum of 4.
The storage administrator can also create multiple host objects for a single host, one for each set of
host ports connected to an I/O group. Each host port is zoned to only one storage port. The
administrator can then balance the VDisk workload across nodes and I/O groups. The host
administrator should also balance I/O across the VDisks.
In this example, zoning limits each host's I/O operations to a single I/O group. This is the common
practice when the I/O workload of the hosts is modest compared to the I/O bandwidth of the I/O
groups. The idea is to assign hosts across the I/O groups so that the I/O workload is evenly
balanced across the I/O groups.
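The even-assignment idea can be sketched as a simple round-robin; the host and I/O group names below are made up for illustration, and the sketch assumes roughly equal workload per host.

```shell
# Round-robin hosts across I/O groups so the aggregate workload is
# spread evenly (illustrative names only).
iogrps="io_grp0 io_grp1"
hosts="hostA hostB hostC hostD"

n=$(echo $iogrps | wc -w)
assignments=""
i=0
for h in $hosts; do
  field=$(( (i % n) + 1 ))
  grp=$(echo $iogrps | cut -d' ' -f"$field")
  assignments="${assignments}${h} -> ${grp}
"
  i=$(( i + 1 ))
done
printf '%s' "$assignments"
```

In practice the assignment would also weigh each host's measured workload, not just the host count.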
A host might use two I/O groups for more I/O bandwidth, or temporarily for Non-Disruptive Volume
Move (NDVM). If the hardware has limited I/O bandwidth, and using two I/O groups would double
the available resources, including the available write cache, then using two I/O groups might be
your best solution. NDVM provides the capability to change the I/O group and node that own a
VDisk dynamically, while the application runs.
In this example, two host objects are created for the host, one for each I/O group, and the host
WWPNs are split among the host objects, as well as the VDisks assigned to the host objects. For
critical workloads like this, usually the application, host, and storage administrators can create a set
of VDisks designed so that the application workloads are balanced across I/O groups. If that is not
an option, then one can use NDVM to move VDisks from one I/O group to another to manually
balance the workload.
Alternatively, you can use the Spectrum Virtualize LUN masking capability to keep the disk paths
at 4. The list of I/O groups whose ports can be used for I/O to a VDisk is known as the I/O group
access set, and by default it contains just the caching I/O group. This approach is the simplest way
to keep 4 disk paths when accessing VDisks from multiple I/O groups: by default, disk paths exist
only to the caching I/O group, limiting the paths even if the host is connected to other I/O groups.
Fibre Channel host objects can be created by using the GUI Hosts > Add Host option. When you
create a new Fibre Channel host object, the system presents a list of candidate WWPNs that have
logged into the system but are not yet configured in host objects. Before you proceed, make sure
you know the host's WWPNs so that you can verify the candidates match the selected host.
Some Fibre Channel HBA device drivers do not leave their ports logged in if no disks are detected
on the fabric, so they are not visible in the list of candidate ports. You must enter the WWPNs for
such hosts manually.
By default, new hosts are created as generic host types and can access volumes from all I/O
groups in the cluster. You can select the option to modify the host OS type: select HP_UX to have
more than eight LUNs supported for Hewlett-Packard UNIX (HP-UX) machines, TPGS for Sun
hosts using MPxIO, or VVOL for hosts using VMware Virtual Volumes. You can also restrict the
host's access to volumes from particular I/O groups. By default, when you create the host it has
access to volumes from all I/O groups, though usually it is zoned to one I/O group. You can also
add the host to an already defined host cluster. A typical configuration has one host object for each
host system that is attached to the system.
The management GUI generates the mkhost command to correlate the selected WWPN values
with the defined host object, and assigns a host ID. When a host object is defined, the host count is
incremented by one for each I/O group specified.
If required, you can right-click any host and select the Properties option to modify host attributes,
such as changing the host name and host type or restricting host access to volumes in a particular
I/O group.
The AIX host ports have to perform a port login to Spectrum Virtualize for their WWPNs to be
displayed in the Host Port (WWPN) panel. This is accomplished by rebooting or by running the
cfgmgr command on AIX, provided the cabling and zoning are set up. You can display the installed
AIX host adapters by using the lsdev -Cc adapter |grep fcs command. The fscsi0 and fscsi1
devices are protocol conversion devices in AIX; they are child devices of fcs0 and fcs1
respectively.
Display the WWPN, along with other attributes including the firmware level, by using the lscfg
-vpl command with the fcs* wildcard or a specific adapter number.
Once the AIX host ports have completed their Fibre Channel port login to the storage system, the
WWPNs should be available for selection. Defining an AIX host object is done in the same manner
as for the Windows host.
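Since lscfg only runs on AIX, the snippet below parses a captured fragment of its output to pull out the WWPN. The adapter location code and values shown are fabricated for illustration; only the "Network Address" label and the dotted-leader layout match the usual lscfg format.

```shell
# Fabricated fragment of `lscfg -vpl fcs0` output, stored in a variable
# so the extraction can be shown on any POSIX shell.
lscfg_out='  fcs0  U78A0.001.XXXXXXX-P1-C1-T1  FC Adapter

        Part Number.................00E0806
        Network Address.............10000000C9741C8A
        ROS Level and ID............02E8277F'

# The WWPN follows the dotted "Network Address" label.
wwpn=$(printf '%s\n' "$lscfg_out" | sed -n 's/.*Network Address[.]*//p')
echo "$wwpn"   # 10000000C9741C8A
```

The extracted value is what you would compare against the candidate WWPN list when defining the host object.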
Before defining a Linux host, enter the ls /sys/class/fc_host command to list the fc_host
entries in your Linux environment. In most production environments this can be an extensive list,
depending on the number of HBA adapters per host. To identify the host WWPNs that are attached
to the storage system, enter the cat /sys/class/fc_host/host*/port_name command. The *
indicates that all hosts configured with Fibre Channel HBA adapters attached to the system will be
listed, or you can append the command with a specific host ID as shown in example 2.
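The sysfs pattern can be tried without FC hardware by mocking the directory tree; the WWPN values below are fabricated, and on a real Linux FC host you would read /sys/class/fc_host directly.

```shell
# Build a mock of /sys/class/fc_host so the glob can be demonstrated
# anywhere.
mock=$(mktemp -d)
mkdir -p "$mock/fc_host/host0" "$mock/fc_host/host1"
echo 0x10000090fa123456 > "$mock/fc_host/host0/port_name"
echo 0x10000090fa123457 > "$mock/fc_host/host1/port_name"

# Equivalent of: cat /sys/class/fc_host/host*/port_name
wwpns=$(cat "$mock"/fc_host/host*/port_name)
printf '%s\n' "$wwpns"

rm -rf "$mock"
```

Each line of output corresponds to one fc_host entry, i.e. one HBA port's WWPN.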
Host integration
Fibre Channel host types
FC host connection
MPIO implementation
FC-NVMe host connection
Ethernet host types
Host clusters
N_Port ID Virtualization (NPIV)
When attaching storage to an open-system host such as Windows, AIX or Linux, the storage
vendor specifies what, if any, supporting software must be installed. This information can be found
on the IBM System Storage Interoperation Center (SSIC) website at
https://www-304.ibm.com/systems/support/storage/ssic/interoperability.wss. SSIC lists options
for the supported multipath I/O drivers. Typically, multipathing software needs to be installed to
support the storage.
IBM provides the Subsystem Device Driver (SDD), a free multipath I/O management feature:
specifically SDDDSM for Windows and SDDPCM for AIX. Alternatively, you can use the native
multipath code included with the operating system by itself: MPIO for AIX and Microsoft MPIO with
the Microsoft DSM for Windows, respectively. This software provides support for dynamically
adding and removing paths, listing paths, multiple load balancing algorithms, path failure detection,
path health checking, recovery of repaired paths, path statistics, and support for both symmetrical
and ALUA storage.
All Windows server operating systems include many enhancements for connecting a computer
running a Windows server-class operating system to storage area network (SAN) devices.
Among the enhancements enabling high availability for connecting Windows-based servers to
SANs is integrated Multipath I/O (MPIO) support. The Microsoft MPIO architecture supports iSCSI,
Fibre Channel, and serial-attached SCSI (SAS) SAN connectivity by establishing multiple
sessions or connections to the storage array.
Multi-path I/O (MPIO) is an optional feature that must be installed. Installing requires a system
reboot. After restarting the computer, the computer finalizes the MPIO installation.
When MPIO is installed, the Microsoft device-specific module (DSM) is also installed, as well as an
MPIO control panel. The control panel can be used to do the following:
• Configure MPIO functionality
• Install additional storage DSMs
• Create MPIO configuration reports
On an FC host system, you can verify the host HBA driver settings using the Server Manager >
Device Manager panel. Device Manager is an extension of the Microsoft Management Console that
provides a central and organized view of all the Microsoft Windows recognized hardware installed
in a computer. Device Manager can be used for changing hardware configuration options,
managing drivers, disabling and enabling hardware, identifying conflicts between hardware
devices, and much more.
Depending on the host adapter installed in the server, you can use an HBA application such as
the QLogic SANSurfer FC HBA Manager, Emulex HBAnyware, or Brocade ESCM. This software
provides a graphical user interface (GUI) that lets you easily install, configure, and deploy the Fibre
Channel HBAs. The GUI also includes diagnostic and troubleshooting capabilities to help optimize
SAN performance.
[Diagram: WWPN format 50:01:73:68:NN:NN:RR:MP, consisting of the initiator IEEE company ID, the storage system serial number (hex), a rack ID (01-ff), a module ID (0-f), and a port ID (0-3); the rack, module, and port fields are 0 in the WWNN]
SAN zoning connectivity of a storage system environment can be verified using the management
GUI by selecting Settings > Network > Fibre Channel Connectivity in the Network filter list. The
Fibre Channel Connectivity view displays the connectivity between storage enclosures, other
storage systems, and hosts that are attached through the Fibre Channel network.
The GUI zoning output conforms to the guideline that, for a given storage system, its ports should
be zoned with all the ports of the storage system cluster on that fabric. The number of ports
dedicated determines the number of ports zoned.
In a dual fabric, system ports and the additional storage enclosure ports as well as those ports for
external storage are split between the two SAN fabrics. Typically, this means that half of the host or
disk subsystem ports, will be connected to one fabric, with the other ports connected to the other
fabric. In addition, it is preferred to attach half the ports on one adapter to one fabric, with the other
ports going to the other fabric, rather than attaching half the adapters to one fabric with the other
adapters connected to the other fabric.
Host integration
Fibre Channel host types
FC host connection
MPIO implementation
FC-NVMe host connection
Ethernet host types
Host clusters
N_Port ID Virtualization (NPIV)
This topic discusses the FC-NVMe host connection in an IBM Spectrum Virtualize system
environment.
FC-NVMe protocol
NVMe is an alternative to SCSI (small computer system interface)
Uses Fibre Channel, existing fabric protocol, shipping, standardized by T11
Transport is FC
No change to switching infrastructure
FC-NVMe, the NVMe over Fabrics initiative relating to Fibre Channel-based transport, is developed
by the INCITS T11 committee, which develops all of the Fibre Channel interface standards.
FC-NVMe defines a mapping protocol for applying the NVM Express interface. This standard
defines how Fibre Channel services and specified Information Units (IUs) are used to perform the
services defined by the NVM Express interface specification.
NVMe uses non-volatile memory as the storage medium and leverages the speed and robustness
of Fibre Channel. NVMe does not require a parallel infrastructure. NVMe provides a highly scalable
solution, with lower I/O latencies, more bandwidth, and lower CPU utilization, through 16 Gb, 32 Gb,
and higher-speed switches and fabrics. The fabric switches have to be able to recognize FC-NVMe
services and devices, plus handle registration and queries of FC features, for example Brocade
FOS 8.2 or higher, or Cisco NX-OS 8.x or higher.
NVMe terminologies
The following NVMe terminologies are used to define NVMe host connection:
• Subsystem: Non-volatile memory storage device.
• Capsule: Unit of information exchange used in NVMe-oF which contains NVMe command, data
and/or responses.
• Discovery Controller: A type of controller which supports minimal functionality for discovery of
NVMe media controllers.
• Namespace ID (NSID): Similar to SCSI’s LUN (Logical Unit Number) identifier. The NSID is a
set of logical block addresses (LBA) on the NVM media. i.e., a volume.
• SQ (Submission Queue): A queue used to submit I/O commands to a controller.
• CQ (Completion Queue): A queue used to indicate command completions for any return data
and completion status by a controller.
• Admin Queue: A queue used to submit administrative commands to a controller.
• I/O Queue: A queue used to submit I/O commands to a controller for data movement.
• Association: An exclusive relationship between a specific controller and a specific host that
includes the Admin Queue and all I/O queues on that controller accessible by the specific host.
• Scatter-Gather Lists (SGL): One or more pointers to memory containing data to be moved, or
stored, where each pointer consists of a memory address and length value.
FC-NVMe host discovery
• Same basic components as SCSI and FCP: hosts (initiators), the SAN (network), and storage
devices (targets)
• A storage device can also be known as an NVMe storage subsystem
• The storage subsystem consists of:
▪ NVMe controllers, which contain the SQ and CQ queues
▪ A discovery controller
▪ Namespace IDs
▪ NVMe storage media
[Diagram: an FC-NVMe host (initiator) connected through an FC network to a storage subsystem (target) containing a name server and storage controller]
NVMe I/O operations
• I/O operations follow a similar pattern to FCP (SCSI)
• Simplified I/O stack
• Parallel requests are easy with enhanced queuing capabilities
• NVMe provides for large numbers of queues (up to 64,000) and supports massive queue depth
(up to 64,000 commands)
[Diagram: read operation flow (ReadCmd, Data) and write operation flow (WriteCmd, XferRdy, Data)]
FC-NVMe allows a host to send commands and data together (first burst), eliminating the first
data "read" by the target and providing better performance at distance.
The host writes I/O commands to the submission queues and signals that commands are ready.
The NVMe controller then picks the commands up from the submission queues, executes them,
and posts entries to the completion queues, followed by an interrupt to the host. The host records
the completion queue entries and clears the completion signal.
The NVMe host software can create queues, up to the maximum allowed by the NVMe controller,
as per the system configuration and expected workload. It supports multiple I/O queues, up to 64K,
with each queue having 64K entries.
NVMe's ability to both distribute and gather I/Os minimizes CPU overhead on data transfers, and it
even provides the capability of changing queue priority based on workload requirements.
An NVMe host connection can be established by adding an NVMe host and then, for the host
connections, selecting Fibre Channel NVMe. The host must be configured with the I/O group's
NVMe Qualified Name (NQN), which you can display with the lsiogrp command. Similarly, the
host object on Spectrum Virtualize is specified using the host ports' NQNs. The NQN of the host is
added to an IBM Spectrum Virtualize host object in the same way that you add FC WWPNs.
You can identify the host objects and the number of ports (NQNs) by using the lshost command.
The management GUI generates the mkhost command to create the NVMe host object using the
-nvmename parameter followed by the NVMe host NQN.
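An NQN follows the shape nqn.&lt;yyyy-mm&gt;.&lt;reverse-domain&gt;[:identifier]. A minimal format sanity check is sketched below; the sample NQN is fabricated and the check validates shape only, not whether the name exists on any system.

```shell
# Rough shape check for an NVMe Qualified Name (NQN).
is_nqn() {
  printf '%s' "$1" | grep -Eq '^nqn\.[0-9]{4}-[0-9]{2}\.[A-Za-z0-9.-]+(:.+)?$'
}

is_nqn "nqn.2014-08.org.nvmexpress:uuid:0e58845c-0000-0000-0000-000000000001" \
  && echo "looks like an NQN"
is_nqn "not-an-nqn" || echo "rejected"
```

A check like this catches copy-paste errors before the NQN is entered into the host object definition.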
Host integration
Fibre Channel host types
Ethernet host types
iSCSI host connection
iSER host connection
iSCSI/iSER IP address failover
N_Port ID Virtualization (NPIV)
Host clusters
This topic discusses the iSCSI host connection in an IBM Spectrum Virtualize system environment.
[Diagram: redundant iSCSI connectivity: host initiator IQNs reach the storage target IQNs over two networks, with iSCSI storage traffic separated by VLAN]
In a manner equivalent to the dual fabric (redundant fabric) Fibre Channel environment, a highly
available environment can be created for iSCSI-based host access using two networks and
separating iSCSI traffic within the networks by using a dedicated virtual local area network path
(also referred to as a VLAN) for storage traffic. Ethernet ports in a storage enclosure are connected
to the two LANs and in conjunction with the two host NICs a multipathing environment is created for
access to volumes.
IBM Spectrum Virtualize systems support iSCSI connections using the rear 1 Gb or 10 Gb Ethernet
ports, or with the optional 4-port 10 Gbps Ethernet (iSCSI/FCoE) converged network adapter (CNA)
installed in each node.
Support for iSCSI IP network-attached hosts requires the configuration of additional IPv4 or IPv6
addresses for each port connection. These IP addresses are independent of the clustered system
configuration IP addresses, and they allow the IP-based hosts to access storage system managed
Fibre Channel SAN-attached disk storage.
The system supports the following I/O configuration options:
• I/O from different initiators in the same host to the same I/O group.
• I/O from different initiators in different hosts to the same volumes.
• I/O from Fibre Channel and iSCSI initiators in different hosts to the same volumes.
• I/O from Fibre Channel and iSCSI initiators in the same hosts to the same volumes is not
supported.
[Diagram: iSCSI encapsulation: Ethernet header | IP | TCP | iSCSI | Data | CRC]
Internet SCSI (iSCSI) is a storage protocol that transports SCSI over TCP/IP allowing IP-based
SANs to be created using the same networking technologies for both storage and data networks.
iSCSI runs at speeds of 1Gbps or at 10Gbps with the emergence of 10 Gigabit Ethernet adapters
with TCP Offload Engines (TOE). This technology allows block-level storage data to be transported
over widely used IP networks, enabling end users to access the storage network from anywhere in
the enterprise. In addition, iSCSI can be used in conjunction with existing FC fabrics as gateway
medium between the FC initiators and targets, or as a migration from a Fibre Channel SAN to an IP
SAN.
The advantage of an iSCSI SAN solution is that it uses the low-cost Ethernet IP environment for
connectivity and greater distance than allowed when using traditional SCSI ribbon cables
containing multiple copper wires. The disadvantage of an iSCSI SAN environment is that data is still
managed at the volume level, performance is limited to the speed of the Ethernet IP network, and
adding storage to an existing IP network may degrade performance for the systems that were using
the network previously. When not implemented as part of a Fibre Channel configuration, it is highly
recommended to build a separate Ethernet LAN exclusively to support iSCSI data traffic.
iSCSI architecture
• Mapping of the SCSI architecture model to IP
• Storage server (target)
• Storage client (initiator)
• Available on most operating systems
[Diagram: host (initiator) connected through a switch to a target disk]
Internet Small Computer System Interface (iSCSI) is an alternative means of attaching hosts to the
IBM Spectrum Virtualize control enclosures. With the release of Spectrum Virtualize V8.1.1, you
can attach iSCSI based back end storage to Spectrum Virtualize, while prior to that, all back end
storage was FC or FCoE attached. The iSCSI function is a software function that is provided by
IBM Spectrum Virtualize and not the hardware.
In the simplest terms, iSCSI allows the transport of SCSI commands and data over a TCP/IP
network that is based on IP routers and Ethernet switches. iSCSI is a block-level protocol that
encapsulates SCSI commands into TCP/IP packets and uses an existing IP network. A pure SCSI
architecture is based on the client/server model.
An iSCSI client, which is known as an (iSCSI) initiator, sends SCSI commands over an IP network
to an iSCSI target. Communication between the initiator and target can occur over one or more
TCP connections. The TCP connections carry control messages, SCSI commands, parameters,
and data within iSCSI Protocol Data Units (iSCSI PDUs).
Example of an IPv4 management and iSCSI shared subnet
[Diagram: redundant network configuration in which an iSCSI initiator host and an admin workstation reach the node iSCSI IP addresses and the management IP addresses (on the config node) through a common gateway, using two Ethernet ports per node]
The visual illustrates the configuration of a redundant network for Storwize V7000 IPv4
management and iSCSI addresses that share the same subnet. Each node's Ethernet port and
iSCSI IP addresses can be configured on the same subnet with the same gateway, or you can put
each Ethernet port and its iSCSI addresses on separate subnets and use different gateways.
This same setup can be configured by using the equivalent configuration with only IPv6 addresses.
Before configuring iSCSI host access, you need to identify whether the target (storage system) you
plan to use supports MPIO, and whether it supports round-robin or load-balancing path selection,
from a performance sizing perspective. If only a failover mode is available, then only one path and
the hardware associated with it will be used. A failover mode provides the network with redundancy,
but it does not provide the performance increase of the other MPIO modes.
Some target manufacturers have their own MPIO DSM (Device Specific Module); therefore, it might
be preferable to use the target-specified DSM. Consult the IBM System Storage Interoperation
Center (SSIC) at
https://www-03.ibm.com/systems/support/storage/ssic/interoperability for supported iSCSI host
platforms and whether multipathing support is available for the host OS. If you are using Windows
2008, MPIO support should be implemented when more than one path or connection is desired
between the host and the storage system.
You will also need to install the iSCSI initiator on the host to configure iSCSI attached storage for
the host. Following SCSI protocols, the iSCSI initiator on the host sends I/O requests to the iSCSI
target. For back end iSCSI storage, Spectrum Virtualize has the initiator and sends commands to
the back end iSCSI targets.
Before configuring iSCSI on Spectrum Virtualize, you must first set up IP addresses on each node in
an I/O group for iSCSI communications. IBM Spectrum Virtualize systems support IP aliasing on their
Ethernet ports, whereby two IP addresses can exist on a single port. Therefore, in addition to
having a management IP address, you can also attach an iSCSI host using Ethernet ports on both
nodes for availability. Typically, a node will have two or four Ethernet ports. These ports are either
for 1Gb support or 10Gb support. For each Ethernet port a maximum of one IPv4 address and one
IPv6 address can be designated for iSCSI I/O.
An iSCSI host connects to the storage system through the node-port IP address. If the node fails,
the address becomes unavailable and the host loses communication with the node. Therefore, you
want to ensure that the host is connected to an IP address on each node in the I/O group. To avoid
performance bottlenecks, the iSCSI initiator and target systems must use Ethernet ports at the
same speed.
The cfgportip command is generated to set the iSCSI IP address for storage enclosure ID 1
port 1 and storage enclosure ID 2 port 2.
[Screen captures: the node iSCSI target IQNs and their IP addresses, 10.208.2.211 and 10.208.2.212]
Each iSCSI initiator and target must have a unique name, which is typically implemented as an
iSCSI qualified name (IQN). In this example, the Windows IQN is shown on the Configuration tab of
the iSCSI Initiator Properties window. The host's iSCSI initiator IQN is used to define a host object.
A storage system control enclosure IQN can be obtained from the Settings > Network > iSCSI
Configuration pane of the management GUI. The verbose format of the lsstorage enclosure
command can also be used to obtain the control enclosure IQN.
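An IQN follows the shape iqn.&lt;yyyy-mm&gt;.&lt;reverse-domain&gt;[:unique-name]. A minimal shape check is sketched below; the iqn.1991-05.com.microsoft prefix is the conventional Windows initiator prefix, while the host and cluster names in the samples are fabricated.

```shell
# Rough shape check for an iSCSI Qualified Name (IQN); format only,
# it does not verify that the name exists anywhere.
is_iqn() {
  printf '%s' "$1" | grep -Eq '^iqn\.[0-9]{4}-[0-9]{2}\.[A-Za-z0-9.-]+(:.+)?$'
}

is_iqn "iqn.1991-05.com.microsoft:winhost1" && echo "initiator IQN ok"
is_iqn "iqn.1986-03.com.ibm:2145.cluster1.node1" && echo "target IQN ok"
```

Checking both names before defining the host object avoids chasing discovery failures caused by a mistyped IQN.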
Once you've configured iSCSI IP addresses on the host and the Spectrum Virtualize nodes, you
need to have the host initiator discover the targets. This is analogous to the Fibre Channel port
login: it makes the storage aware of the host, so that you can then create the host object and map
volumes to it. From the iSCSI Initiator Properties window's Discovery tab, click the Discover Target
Portal button and enter the node's iSCSI IP address or DNS name. Port number 3260 is the
default (the official TCP/IP port for the iSCSI protocol); click OK. Once the portal address has been
entered, the available iSCSI targets are displayed.
The Targets tab lists each node IQN that was previously discovered. The discovered targets'
initial status is inactive. Use the Connect button to connect to the target. The Connect to Target
window provides options to tailor the behavior of the connection. Check both boxes for persistent
connections and to enable multipathing access.
Once this process is complete, the initiator to the discovered target (storage enclosure) is now
connected.
The group of TCP connections that link an initiator with a target form a session (loosely equivalent
to a SCSI Initiator-Terminator nexus). TCP connections can be added and removed from a session.
Across all connections within a session, an initiator sees one and the same target.
If the target supports multiple sessions then the Add Session option under the Target Properties
panel allows you to create an additional session. You can also disconnect individual sessions that
are listed. Use the Devices button to view more information about devices that are associated with
a selected session.
Multiple Connections per Session (MCS) support is defined in the iSCSI RFC to allow multiple
TCP/IP connections from the initiator to the target for the same iSCSI session. This is iSCSI
protocol specific. This allows I/O to be sent over either TCP/IP connection to the target. If one
connection fails then another connection can continue processing I/O without interrupting the
application. Not all iSCSI targets support MCS. iSCSI targets that support MCS include but are not
limited to EMC Celerra, iStor, and Network Appliance.
For iSCSI-attached hosts, the number of logged-in nodes refers to iSCSI sessions that are created
between hosts and nodes, and might be greater than the current number of nodes on the system.
There is no equivalent list of candidate IQNs available when creating iSCSI hosts. All iSCSI host
port IQNs must be entered manually. The Add Host procedure for creating an iSCSI host is
comparable to setting up Fibre Channel hosts. Instead of selecting Fibre Channel ports, it requires
you to enter the iSCSI initiator host IQN that was used to discover and pair with the node IQN.
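Because host IQNs are typed in by hand, a quick format check before pasting them into the Add Host panel can catch typos. The sketch below validates the basic IQN layout with standard shell tools; the example IQN is made up for illustration.

```shell
# Hypothetical host initiator IQN (placeholder, not from this course)
iqn="iqn.1991-05.com.microsoft:winhost01"

# Basic IQN layout: iqn.<yyyy-mm>.<reversed-domain>[:<unique-suffix>]
echo "$iqn" | grep -Eq '^iqn\.[0-9]{4}-[0-9]{2}\.[a-z0-9.-]+(:.+)?$' && echo "format ok"

# iSCSI names are limited to 223 bytes
[ "${#iqn}" -le 223 ] && echo "length ok"
```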
When the host is initially configured, neither an authentication method nor a Challenge
Handshake Authentication Protocol (CHAP) passphrase is set for use. You can choose to enable
CHAP authentication which involves sharing a CHAP secret passphrase between the control
enclosures and the host before the IBM Spectrum Virtualize system allows access to volumes.
The iSCSI attached hosts are eligible by default to access volumes in all four I/O groups. For Fibre
Channel attached hosts, the host WWPNs need to be zoned to the control enclosures in the I/O
groups that contain volumes for a given host. For iSCSI attached hosts the host needs only IP
network connectivity to the storage enclosures in the I/O groups that contain volumes for the given
host. Just as with FC host creation, you can add the host to an already defined host cluster.
The management GUI generates the mkhost command to create the iSCSI host object; the command
contains the -iscsiname parameter followed by the iSCSI host IQN. A maximum of 256 iSCSI hosts are
supported per I/O group.
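The equivalent CLI step can also be issued directly. This is a sketch only; the host name, IQN, and CHAP secret below are placeholders, not values from the course.

```shell
# Create the iSCSI host object (the GUI generates an equivalent command)
mkhost -name winhost01 -iscsiname iqn.1991-05.com.microsoft:winhost01

# Optionally enable CHAP by setting a shared secret for the host
chhost -chapsecret Passw0rdPhrase winhost01
```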
Netstat command
The netstat -nt command displays active TCP connections
The netstat command is part of the Windows TCP/IP support and can be used to display all TCP
connections, attempting to translate the addresses to names. The -n option displays addresses and
port numbers numerically. The -t option displays the current connection offload state; in this
case it is InHost, indicating that the host CPU is handling this part of the TCP/IP stack rather
than it being offloaded onto the NIC.
Host integration
Fibre Channel host types
Ethernet host types
iSCSI host connection
iSER host connection
iSCSI/iSER IP address failover
Host clusters
N_Port ID Virtualization (NPIV)
This topic discusses the iSER host connection in an IBM Spectrum Virtualize system environment.
(Figure: iSCSI versus iSER driver stacks. iSER bypasses CPU data copies and caches, and runs over
either the iWARP or the RoCE driver. iSER uses either an iSCSI qualified name (IQN, up to 223
bytes) or an EUI name.)
IBM Spectrum Virtualize 8.2.1 system supports iSER host attachment with 25 GbE adapters, which
expands the host connectivity options for Spectrum Virtualize systems. iSER is an acronym for
iSCSI Extensions for Remote Direct Memory Access (RDMA). iSER is an extension to the iSCSI
protocol that uses RDMA rather than regular iSCSI data movement. It reduces latency through a
shorter code stack and permits data to be transferred directly into and out of SCSI memory
buffers (which connect computers to storage devices) without intermediate data copies and
with minimal CPU intervention. The iSER protocol uses the 25 Gb Ethernet adapters.
The iSER protocols can be used in a fully Ethernet based infrastructure (no Fibre Channel) for
inter-node communication, HyperSwap, or Stretched Cluster environments. The RoCE and iWARP
connections, which support remote direct memory access (RDMA) technology, are established
through unique IP addresses, offering faster transfers. iSER is an extension to the iSCSI protocol to
use Remote Direct Memory Access (RDMA) technology to reduce intermediate data copies in I/O
interface memory with less CPU usage. iSER uses either iSCSI qualified name (IQN) (223 bytes) or
extended unique identifier (EUI) (64-bit) names.
iWARP is layered on top of TCP/IP and, therefore, does not require a lossless Data Center
Bridging (DCB) fabric. RoCE is based on InfiniBand transport over Ethernet; RoCEv2 enhances
RoCE with a UDP header and Internet routability.
IBM Spectrum Virtualize ensures availability for iSCSI communications to an I/O group in node
failure scenarios, by failing over the iSCSI IP address to the surviving node, rather than depending
on the host multi-path driver to send I/Os to the surviving node. If an iSER target node fails, the
iSER initiator is logged out from the failed node. A new session or login is reestablished with the
partner (working) node that uses the IP address of the failed node. For successful iSER logins from
a host to a partner node after a failover, the port type (RoCE or iWARP) and the number of ports
must be the same on the partner nodes.
All IP addresses (service and configuration) associated with a clustered-system Ethernet port must
be on the same subnet. However, IP addresses associated with a node Ethernet port that is used
for iSCSI traffic can be configured to belong to different subnets.
iSER hosts can be attached to the Spectrum Virtualize system using the node’s Ethernet port IP
addresses, which can be assigned to any 25 Gbps Ethernet ports of the node. iSER leaves the
administrative framework of iSCSI untouched while mapping the data path over RDMA. The native
iSCSI host attach and controller virtualization are used to support the 25 Gb Ethernet ports. You
can have a maximum of four host attach sessions from an initiator to each target node. On
Spectrum Virtualize node ports, a maximum of one IPv4 address and one IPv6 address can be
configured on each of Ethernet ports 1 and 2 for host I/O access to volumes. Spectrum Virtualize
supports a maximum of 256 iSER sessions per node. Therefore, hosts can use either iSCSI or
iSER (but not both at the same time). However, the 25 Gb node ports can support both
concurrently. Compared to standard iSCSI, iSER provides higher bandwidth, lower latency, and
lower CPU utilization.
iSER supports one-way authentication through the Challenge Handshake Authentication Protocol
(CHAP), with the iSER target authenticating the iSCSI initiators, whereas iSCSI supports two-way
CHAP authentication.
The following definitions explain the terms connection and session:
• A connection is a TCP connection. Communication between the initiator and target occurs over
one or more TCP connections. The TCP connections carry control messages, SCSI
commands, parameters, and data within iSCSI Protocol Data Units (iSCSI PDUs).
• The group of TCP connections that link an initiator with a target form a session (loosely
equivalent to a SCSI Initiator-Target, or I_T, nexus). TCP connections can be added to and
removed from a session. Across all connections within a session, an initiator sees one and the
same target.
(Figure: iWARP and RoCE driver and rNIC stacks. RoCE initiator ports cannot establish iSCSI host
attach sessions to iWARP ports, and vice versa.)
The RoCE ports and the iWARP ports are not iSER interoperable, which means that host iWARP
adapters only communicate with node iWARP adapters, and host RoCE adapters only
communicate with node RoCE adapters.
In addition, RoCE ports cannot establish iSCSI host attach sessions to iWARP ports or
vice versa. Therefore, if an iSER session is initiated by the host while an iSCSI session exists from
the same host, the iSER session is rejected. Likewise, if an iSCSI session is initiated by the host
while an iSER session exists from the same host, the iSCSI session is rejected.
Note that non-disruptively replacing a Spectrum Virtualize system adapter requires that another
adapter and path to the host exist for the storage. The host multipath I/O code ensures that
I/Os are sent using working paths on working adapters.
• Host name: maximum of 63 characters
• Click the iSER host IQN field and paste the host IQN value
í Value copied from the iSER initiator application
• Optional fields: CHAP authentication (the storage system acts as the authenticator and sends a
secret (passphrase) message to the host before access is granted)
An iSER host connection can be established by first installing the software for the iSER
software-based initiator on the server, such as the Linux software iSER initiator. You can then use
either the management GUI or the command-line interface to define an iSER host server. Verify
that you configured the node and the system Ethernet ports correctly.
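On a Linux host, for example, the open-iscsi initiator can be switched to the iSER transport before logging in. This is a sketch, assuming the open-iscsi tools are installed; the portal address and target IQN are placeholders.

```shell
# Discover the targets behind the node's 25 GbE port (placeholder address)
iscsiadm -m discovery -t sendtargets -p 192.168.100.10

# Switch the discovered node record to the iSER transport
iscsiadm -m node -T iqn.1986-03.com.ibm:2145.cluster1.node1 \
         -o update -n iface.transport_name -v iser

# Log in over iSER
iscsiadm -m node -T iqn.1986-03.com.ibm:2145.cluster1.node1 -l
```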
To create an iSER host using the GUI, select Hosts > Hosts and select Add Host. Select iSCSI or
iSER Hosts and enter an iSER initiator name in the host IQN field. Just as with an iSCSI host, the
iqn value is the IQN for an IBM Spectrum Virtualize node. Enter additional details about the host and
click Add Host. You can also add more initiator names to one host. Because the IQN contains the
clustered system name and the node name, it is important not to change these names after iSCSI
is deployed.
The management GUI generates the mkhost command to create the iSCSI host object using the
-isername parameter followed by the iSER host IQN. The maximum for iSER hosts per I/O group
is 256 due to the IQN limits.
Windows logs in to the target as soon as you click Connect.
Host integration
Fibre Channel host types
Ethernet host types
iSCSI host connection
iSER host connection
iSCSI/iSER IP address failover
Host clusters
N_Port ID Virtualization (NPIV)
This topic discusses iSCSI/iSER IP address failover in an IBM Spectrum Virtualize system
environment.
(Figure: current state. The control enclosure contains Node1, the config node, with iSCSI IP
address 10.x.9.201, and Node2, the partner node, with 10.x.9.202; the management IP address is
10.x.9.210. Run the lsportip command to confirm that the Node1 iSCSI IP addresses have been
transferred to Node2.)
Both the iSCSI and the iSER IP addresses configured on a Spectrum Virtualize system support
node failover. This visual illustrates an iSCSI IP address failover caused by either a node failure or
by taking the node offline for a code upgrade. In either case, if the Node1 enclosure is no longer
available, its partner Node2 enclosure inherits the iSCSI IP addresses of the departed storage
enclosure. In addition to node-port IP addresses, the iSCSI name and iSCSI alias for the failed
node are also transferred to the partner node. After the failed node recovers, the node-port IP
address and the iSCSI name and alias are returned to the original node.
The partner node port responds to the inherited iSCSI IP address as well as its original iSCSI IP
address. However, if the failed node was the cluster config node (or configuration node), then the
cluster designates another node as the new config node. The cluster management IP addresses
are moved automatically to the new config node.
From the perspective of the iSCSI host, I/O operations proceed as normal. To allow hosts to
maintain access to their data, the Node1-port IP addresses for the failed control enclosure are
transferred to the partner control Node2 enclosure in the I/O group. The partner Node2 enclosure
handles requests for both its own node-port IP addresses and also for node-port IP addresses on
the failed control enclosure. This process is known as control enclosure-port IP failover. Therefore,
the control enclosure failover activity is totally transparent and nondisruptive to the attaching hosts.
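The address takeover can be confirmed from the system CLI. This is a sketch; the node name is a placeholder, and the exact columns reported depend on the code level.

```shell
# List the node Ethernet port IP configuration; after a failover the
# surviving node also answers for the failed node's iSCSI addresses
lsportip -delim :

# Filter for a single node if desired (placeholder node name)
lsportip -filtervalue node_name=node2
```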
• Advantages:
ƒ Storage enclosure failover activity is transparent and nondisruptive
ƒ iSCSI host I/O operations proceed as normal
• Disadvantages:
ƒ Opened CLI sessions are lost when a config storage enclosure switch occurs
ƒ Opened GUI sessions might survive the switch
The storage system node failover activity is totally transparent and nondisruptive to the attaching
hosts.
If there is an opened CLI session during the storage enclosure failover then the session is lost
when a config storage enclosure switch occurs. Depending on the timing, opened GUI sessions
might survive the switch.
(Figure: Node1 returns. Node1, now the partner node, with iSCSI IP address 10.x.9.201, and
Node2, now the config node, with 10.x.9.202; the management IP address is 10.x.9.210. Run the
lsportip command to confirm that the iSCSI IP addresses have been transferred back from Node2
to Node1.)
Once the failed control enclosure (Node1) has been repaired, or its code upgrade has completed, it
is brought back online. The iSCSI IP addresses previously transferred to Node2 automatically fail
back to Node1. The configuration node role remains where it is (retaining the management IP
address) and does not switch back. A configuration node switch occurs only if its hosting node is
no longer available.
When a failed node is re-establishing itself to rejoin the cluster, its attributes do not change (for
example, its object name is the same). However, a new node object ID is assigned. For example,
Node1, whose object ID was 1, is now assigned the next sequentially available object ID, such
as 5.
Host integration
Fibre Channel host types
Ethernet host types
Host clusters
N_Port ID Virtualization (NPIV)
This topic discusses the creation of a host cluster to support shared and private volume
mappings.
A host cluster is a group of logical host objects that can be managed together as a single cluster,
and that share access to volumes mapped directly to the host cluster object. Host clusters simplify
the management of clustered host systems, such as in VMware environments where administrators
have to work with large host objects. All hosts and their WWPNs that are assigned to a host
cluster therefore share the same volume mappings, the same SCSI IDs, and so on. New volumes
can be mapped to a host cluster, which simultaneously maps that volume to all hosts that are
defined in the host cluster.
Shared disk clusters include those in which an application running on the hosts writes concurrently
to the shared volumes, and also clusters that fail over the storage from one VM to another and
restart the application. Often there is a set of shared volumes used by the application running in
the cluster, while other volumes, such as the operating system disk, are not shared. As a best
practice, and to simplify the host cluster and host configuration process, create the host cluster
object first. This allows you to add new host objects to a pre-defined host cluster during creation.
A maximum of 512 host clusters can be supported per system.
Issue the CLI lshostcluster command to display the following host cluster status:
Online: All hosts in the host cluster are online
Host degraded: All hosts in the host cluster are either online or degraded
Host cluster degraded: At least one host is offline and at least one host is either online or
degraded
Offline: All hosts in the host cluster are offline (or the host cluster does not contain any hosts)
Each host cluster is identified by a unique name and ID, the names of the individual host objects
within the cluster, and the status of the cluster. A host cluster can contain up to 128 hosts. However,
a host can be a member of only one host cluster.
You can use the CLI lshostcluster command to display the following host cluster statuses:
• Online: All hosts in the host cluster are online.
• Host degraded: All hosts in the host cluster are either online or degraded.
• Host cluster degraded: At least one host is offline and at least one host is either online or
degraded.
• Offline: All hosts in the host cluster are offline (or the host cluster does not contain any hosts).
You can define a host cluster using the management GUI Hosts > Create Host Cluster panel. By
default, a list of hosts is presented, representing all the hosts zoned to the storage system on the
SAN fabric.
To create a host cluster, complete following steps:
• On the Host Cluster page, select Create Host Cluster and enter the name of the host cluster
that you want to create.
• Select the available host object members to include within the host cluster.
• You can also create an empty host cluster and add hosts later.
(Figure: a host cluster containing Host A and Host B. LUN A and LUN B have shared mappings,
with the same SCSI IDs, to both hosts.)
By default, the volumes presented are mapped to the selected host or hosts. These volumes are
simultaneously shared among all hosts in the host cluster. The example shows two host systems
that are members of the same host cluster. Both volumes (LUN A and LUN B) contain data and
have shared mappings with Host A and Host B. Any new hosts that are added to the host cluster
inherit the same volume mappings.
(Figure: a host cluster containing Host A and Host B. One volume has a shared mapping to both
hosts, while each host also has a private mapping, each with its own SCSI ID.)
Hosts in a host cluster can also have their own private volume mappings that are not shared with
other hosts in the host cluster. With private mapping, individual volumes can be directly mapped to
one or more hosts. This allows a host to maintain the private mapping of some volumes and share
other volumes with hosts in the host cluster. The SAN boot volume for a host would typically be a
private mapping.
To create a private mapping, select volumes to be excluded from becoming shared mappings for the
entire host cluster. New hosts that are added to the host cluster will not inherit these volume
mappings.
The example shows two host systems that are members of the same host cluster. The volume that
contains data has a shared mapping with Host A and Host B. However, Host A and Host B also
have a private mapping to their respective boot volume (LUN A and LUN B).
Click Next. On the Summary page, verify the settings and click Make Host Cluster.
If you selected hosts with volumes that have SCSI ID conflicts, the system does not add these
mappings to the host cluster. A SCSI LUN ID conflict occurs when multiple hosts are mapped to the
same volume but with different SCSI IDs. In this case, a shared mapping is not created because
the system does not allow a volume to be mapped more than once to the same host. The Summary
page lists all volumes that contain conflicts and the system retains these mappings as private
mappings to the original hosts.
Once you have verified the settings, click Make Host Cluster. The system issues a mkhostcluster
-name command with the name of the host cluster. The GUI lists the host cluster and all of the
hosts assigned. You can also create an empty host cluster using the same command. To display the
status of the host cluster using the CLI, enter the lshostcluster command.
Additional host cluster commands
The following commands have been added to manage host clusters:
• mkhostcluster creates a host cluster.
• addhostclustermember adds a member to a host cluster.
• lshostcluster lists the host clusters.
• lshostclustermember lists the members of a host cluster.
• lshostclustervolumemap lists the volumes mapped to a host cluster.
• rmhostclustermember removes a member from a host cluster.
• rmhostcluster removes a host cluster.
• rmvolumehostclustermap unmaps a volume from a host cluster.
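A typical sequence using these commands might look as follows. This is a sketch only; the cluster, host, and volume names are placeholders, not values from the course.

```shell
# Create the host cluster, then add existing host objects to it
mkhostcluster -name esxcluster1
addhostclustermember -host esxhost1:esxhost2 esxcluster1

# Map a volume to every host in the cluster in one step
mkvolumehostclustermap -hostcluster esxcluster1 vdisk0

# Verify the cluster status and its members
lshostcluster
lshostclustermember esxcluster1
```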
Typically, all host and storage ports are logged in to the SAN fabric, usually during power-on.
When a host boots, it scans the SAN target ports for any volume or LUN mappings. Once this
process occurs, the storage performs a port scan to detect and learn the host WWPNs being
presented to the managed storage system.
The WWPNs for each host are automatically presented in the management GUI. The WWPNs are
used to define a host object, to which volumes can be created and mapped.
Unlike with SCSI FC-attached hosts, available candidate ports cannot be checked for iSCSI and
NVMe hosts.
To support more than 256 host objects per IBM Spectrum Virtualize system (or 512 host objects),
the rmhostiogrp command is used to remove I/O group eligibility from an existing host object.
The host object to I/O group associations only define a host object’s entitlement to access volumes
owned by the I/O groups. Physical access to the volumes requires proper SAN zoning for Fibre
Channel hosts and IP connectivity for iSCSI hosts.
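For example, a host object's I/O group eligibility can be trimmed from the CLI. This is a sketch; the host name and I/O group IDs are placeholders.

```shell
# Remove I/O groups 2 and 3 from the host's eligibility list, leaving it
# entitled only to volumes owned by the remaining I/O groups
rmhostiogrp -iogrp 2:3 winhost01

# Confirm the remaining host-to-I/O-group associations
lshostiogrp winhost01
```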
iSCSI and iSER names per I/O group: 512* (iSER on Model SV1 only)
* A lower limit applies if FC-NVMe hosts are attached; refer to the NVMe over Fibre Channel Host
Properties section.
The following guidelines apply when you connect host servers to an IBM Spectrum Virtualize
system:
• Up to 512 hosts per building block are supported, which equals a total of 2,048 hosts for a
fully scaled system. If the same host is connected to multiple systems of a cluster, it counts as a
host in each system. However, up to 6 FC-NVMe hosts are supported per system when no
SCSI (FC/iSCSI/SAS) hosts are attached. This limit is not policed by the IBM Spectrum
Virtualize software. Exceeding this amount can result in an adverse performance impact.
• A total of 2,048 distinct, configured host worldwide port names (WWPNs) or iSCSI Qualified
Names (IQNs) are supported per I/O group, for a system total of 8,192. This limit is the sum of
the FC host ports and the host iSCSI names (an internal WWPN is generated for each iSCSI
name) that are associated with all of the hosts that are associated with a system.
• The maximum number of FC ports and iSCSI names per host object is 32. However, the
maximum number of iSCSI and iSER names supported per host object is 8. A system may be
partnered with up to three remote systems.
• A lower limit applies if FC-NVMe hosts are attached; refer to the NVMe over Fibre Channel Host
Properties section.
FC-NVMe and iSCSI host intermix
• Maximum NVMe hosts per I/O group: 1
• Maximum SCSI hosts per I/O group: 5
• The maximum FC-NVMe hosts per system limit (6) still applies.
These limits are not policed by the Spectrum Virtualize software. Any configurations that exceed
these limits may experience significant adverse performance impact. Visit the IBM Support
website for the supported configurations.
• The maximum number of volumes in a fully scaled system is 8,192 (having a maximum of 2,048
volumes per I/O Group). The maximum storage capacity supported is 32 PB per system. Not all
hosts are capable of accessing and managing this number of volumes. The practical mapping
limit is restricted by the host OS, not by the IBM Spectrum Virtualize system. This does not
apply to hosts of type adminlun (used to support VMware vVols).
• iSCSI and iSER both support SCSI-3 registrations per VDisk: 512 for iSCSI and 1,024 for iSER
(Model SV1 only). SCSI reservations are used to control access to a shared SCSI device. An
initiator sets a reservation on a Logical Unit Number (LUN) to prevent another initiator from
making changes to the LUN; this is similar to the file-locking concept. SCSI reservations are
always set by a host initiator, and ideally the same initiator performs a SCSI release on the
affected LUN.
• An FC-NVMe host can connect to up to four NVMe controllers on each IBM Spectrum Virtualize
system node. The maximum per node is four with an extra four in failover. A single I/O group
can contain up to 256 FC-NVMe I/O controllers. The maximum number of I/O controllers per
node is 128 plus an extra 128 in failover.
• When FC-NVMe and SCSI hosts are attached to the same I/O group, the following restrictions
apply:
▪ Maximum NVMe hosts per I/O group: 1
▪ Maximum SCSI hosts per I/O group: 5
▪ The maximum FC-NVMe hosts per system limit (6) still applies.
These limits are not policed by the Spectrum Virtualize software. Any configurations that exceed
these limits may experience significant adverse performance impact.
Host integration
Fibre Channel host types
Ethernet host types
Host clusters
N_Port ID Virtualization (NPIV)
This topic describes the characteristics of N_Port ID Virtualization (NPIV) in the event of any
failure of the node hardware or software.
OS native multipathing
• OS multipathing software distributes I/Os among Fibre Channel ports
• Each port is addressed by an ID, called an N_Port ID
• Multipathing software ensures continuous access by rerouting I/Os to the partner node in an
I/O group
• Switching from one node to the other can take about 30 seconds
• Application I/Os are paused during the switch
• Delay to I/Os can be seen as extended response times for end users
(Figure: multipathing software directing paths A through H to Node 1 and Node 2.)
Traditionally, if a node failed or was removed for maintenance, the paths that were presented for
volumes from that preferred node would go offline. In this case, the system relies on the native
host OS multipathing software to fail over from using both sets of worldwide port names
(WWPNs) to just those that remain online. This is the main purpose of the multipathing software;
however, it can become problematic, particularly for hosts whose paths are reluctant to come
back online for whatever reason. At this point, the system is relying on the OS multipathing
support as well as fabric zoning to ensure resource access integrity for high availability.
N_Port ID Virtualization
N_Port ID Virtualization (NPIV) creates multiple virtual FC ports per physical FC port
Removes the dependence on multipath software during failover
Node failover within an I/O group or failover to a hot spare node
Allows the partner node to take over the WWPNs of the failed node
Essentially performs a failover at the fabric level - no host failover required
Paths appear online always to the host
NPIV is not supported for Ethernet-based protocols
(Figure: host-attach virtual ports P1 through P8 presented by the active owner and active partner
nodes, with a spare node on standby.)
N_Port ID Virtualization is a Fibre Channel industry standard method for virtualizing a physical
Fibre Channel port. With the release of Spectrum Virtualize 7.7.0, Spectrum Virtualize systems can
be enabled into N_Port ID Virtualization (NPIV) mode. With NPIV enabled, a hot spare node can be
swapped automatically into the cluster if the cluster detects a failing node.
This means that ports do not come up until they are ready to service I/O.
All spare nodes have active system ports (no host I/O ports) and are not part of any I/O group.
Only Fibre Channel ports that support N_Port ID Virtualization (NPIV) can be used for host
connections on spare nodes. N_Port ID Virtualization allows a single F_Port to be associated with multiple
N_Port IDs, therefore, the spare node uses the same NPIV worldwide port names (WWPNs) for its
Fibre Channel ports as the failed node, so host operations are not disrupted.
NPIV is not supported for Ethernet-based protocols such as FCoE, iSCSI and iSER.
Historically, the physical Fibre Channel port on any Spectrum Virtualize product presented a
single WWPN to the Fibre Channel fabric that was capable of being used as a target for host I/O,
an initiator for back-end controller I/O, and for internode and intercluster communications.
The NPIV target port feature can be presented in three modes:
• Disabled means that no NPIV target ports are started. This behavior is unchanged from
previous releases.
• In the transitional mode, the NPIV target ports are enabled which allows volumes to be
presented through both physical and NPIV target ports.
• Enabled means that all NPIV target ports are enabled. In this case volumes are presented
through host NPIV target ports only.
These modes facilitate transitioning to the use of NPIV. When NPIV ports are configured, zoning
and disk pathing can be updated dynamically so that additional paths are configured to the NPIV
ports from the host, which preserves application availability. When NPIV is fully enabled, paths to the
system’s physical WWPNs can be removed dynamically by the host and SAN administrators.
IBM SAN Volume Controller systems that are shipped with V7.7 and later should have NPIV
enabled by default.
If hot spare nodes are used, then NPIV must be implemented.
Migrating a cluster to use the NPIV functionality can be done dynamically, involving the SAN and
host administrators, and by using a transitional mode for each I/O group. When fctargetportmode
is enabled, volumes are presented only through NPIV WWPNs. Intra-cluster communications use
the physical port WWPNs.
When setting up a cluster for the first time, where NPIV is required, set each I/O group's
fctargetportmode attribute to enabled. Zoning must be set up by the SAN switch administrator for
the hosts.
If a cluster exists, you can use the management GUI to enable the NPIV port by navigating to the
Settings > System menu and select IO Groups. From there, you can right-click on each IO group
and select Change Target Port Mode. Once the NPIV port feature is enabled, select Settings >
Network > Fibre Channel Ports to view the virtualized ports.
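The same transition can be driven from the CLI. This is a sketch; the I/O group name is a placeholder, and transitional mode should remain in place until hosts have paths to the NPIV ports.

```shell
# Move the I/O group to transitional mode (volumes presented through
# both physical and NPIV target ports)
chiogrp -fctargetportmode transitional io_grp0

# After host paths to the NPIV WWPNs are confirmed, complete the change
chiogrp -fctargetportmode enabled io_grp0

# View the physical and virtualized (NPIV) target port WWPNs
lstargetportfc
```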
Virtual WWPNs
• WWPNs associated with an IBM Spectrum Virtualize storage system with NPIV
(Figure: each physical port WWPN is paired with an NPIV target port WWPN; when a port fails, its
NPIV target port WWPN fails over to another port.)
When NPIV is enabled in a cluster, the target WWPNs are virtualized, and all the worldwide
names (WWNs) representing the ports remain available during controller outages. Each physical
port is associated with two WWPNs: the WWPN associated with the physical port and the WWPN
associated with the NPIV target (host) port.
• The physical port WWPN is used for inter-node (local and remote) traffic and communication to
back-end storage. The physical port WWPNs are logged in to the fabric anytime the Spectrum
Virtualize software is running, including in service mode.
• The NPIV or virtual WWPN is used for volume I/O with hosts. It is a target only port.
• When a node fails, the virtual/NPIV WWPN is moved to either the other node in the I/O group,
or if hot spare nodes are configured, all the virtual WWPNs move to the hot spare node.
• NPIV WWPNs move to the port in the same physical location on the failover node. Therefore,
the physical adapters and ports must match on the failover node.
The effect is the same for NVMe ports, as they use the same NPIV structure, but with the topology
NVMe instead of regular SCSI.
Keywords
• Host object
• Host cluster
• Fibre Channel (FC)
• Internet Small Computer System Interface (iSCSI)
• iSCSI Extensions for RDMA (iSER)
• Non-Volatile Memory Express (NVMe)
• SCSI LUNs
• I/O load balancing
• Subsystem Device Driver Device Specific Module (SDDDSM)
• Subsystem Device Driver Path Control Module (SDDPCM)
• Fibre Channel Protocol (FCP)
• SCSI initiator
• Host bus adapter (HBA)
• SCSI target
• 2145 multi-path disk device
• Server Manager
• Disk management
• Device Manager
• Worldwide name (WWN)
• Worldwide node name (WWNN)
• Worldwide port name (WWPN)
• Multipath I/O (MPIO)
• MPIO DSM (Device Specific Module)
• N_Port ID virtualization (NPIV)
• Round robin algorithm
• Zoning
Review questions (1 of 3)
1. For a host to access volumes that are provisioned by the storage system, which of the
following must be true?
A. The host WWPNs or IQN must have been defined and mapped with the volume’s owning I/O
group
B. Fibre Channel zoning or iSCSI IP port configuration must have been set up to allow
appropriate ports to establish connectivity
C. The volumes must have been created and mapped to the given host object
D. All of the above
2. True or False: If an IP network connectivity failure occurs between the iSCSI initiator
and the storage cluster iSCSI target port, the cluster will automatically failover the
iSCSI target port address to the other storage enclosure’s IP port.
Review answers (1 of 3)
1. For a host to access volumes that are provisioned by the storage system, which of the
following must be true?
A. The host WWPNs or IQN must have been defined and mapped with the volume’s owning I/O
group
B. Fibre Channel zoning or iSCSI IP port configuration must have been set up to allow
appropriate ports to establish connectivity
C. The volumes must have been created and mapped to the given host object
D. All of the above.
The answer is all of the above.
2. True or False: If an IP network connectivity failure occurs between the iSCSI initiator
and the storage cluster iSCSI target port, the cluster will automatically failover the
iSCSI target port address to the other storage enclosure’s IP port.
The answer is False.
Review questions (2 of 3)
3. True or False: If a storage administrator changes the primary controller for a LUN, the
host will automatically redirect I/Os to the new primary controller.
4. Which of the following is the identifier when configuring for an iSCSI host in the
storage system?
A. IP address
B. WWNN
C. IQN
D. WWPN
Review answers (2 of 3)
3. True or False: If a storage administrator changes the primary controller for a LUN, the
host will automatically redirect I/Os to the new primary controller.
The answer is False. The host must learn of the new preferred paths typically via a rescan for
disks.
4. Which of the following is the identifier when configuring for an iSCSI host in the
storage system?
A. IP address
B. WWNN
C. IQN
D. WWPN
The answer is IQN. iSCSI is an IP-based standard for transferring data that supports host
access by carrying SCSI commands over IP networks using an iSCSI qualified name (IQN).
Review questions (3 of 3)
5. Hosts are usually zoned to a single I/O group except when (choose all that apply)
A. the I/O bandwidth exceeds that of an I/O group
B. the application needs to survive a node failure
C. when migrating hosts from one I/O group to another
D. Hosts are always zoned to all I/O groups
6. Which of the following NPIV modes allows volumes to be presented through both
physical and NPIV target ports?
A. Disable
B. Transitional
C. Enabled
D. Stealth
Review answers (3 of 3)
5. Hosts are usually zoned to a single I/O group except when (choose all that apply)
A. I/O bandwidth exceeds that of an I/O group
B. Application needs to survive a node failure
C. Migrating hosts from one I/O group to another
D. Hosts are always zoned to all I/O groups
The answer is A and C. B is incorrect because, if there is a node failure, it is transparent to a
host zoned to a single I/O group, and D is incorrect because most hosts are zoned to a single
I/O group.
6. Which of the following NPIV modes allows volumes to be presented through both
physical and NPIV target ports?
A. Disable
B. Transitional
C. Enabled
D. Stealth
The answer is Transitional. Transitional mode is an intermediate state in which both the
virtual ports and the physical ports are enabled.
Summary
Overview
This module provides an overview of the IBM Spectrum Virtualize advanced software functions
designed to deliver storage efficiency and optimize storage asset investments. The topics include
data reduction pool technologies, volume capacity savings using thin-provisioned virtualization,
and improved storage capacity utilization with RACE Compression.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
IBM Spectrum Virtualize data reduction technologies
This topic identifies the data reduction technologies used to reduce the amount of capacity required
to store data.
Data reduction is a set of techniques for reducing the amount of physical storage required to store
data. Three types of data reduction technologies can be implemented:
• Thin provisioning, in which capacity is allocated on demand as data is written to storage.
• Compression, in which data is compressed before being written to storage.
• Deduplication, in which duplicates of data are detected and replaced with references to the first
copy; it is supported in Data Reduction Pools (DRPs) on a per-volume basis.
(Figure: dynamic growth)
The thin provisioning function extends storage utilization efficiency to all supported IBM storage
systems by allocating disk storage space flexibly among multiple users, based on the minimum
space required by each user at any given time.
With thin provisioning, storage administrators can also benefit from reduced consumption of
electrical energy because less hardware is required, and can take more frequent recovery points
of data (point-in-time copies) without a commensurate increase in storage capacity consumption.
Volumes can be created either fully allocated or thin-provisioned. A thin-provisioned volume has
two capacities: virtual and real.
A thin-provisioned volume creates an additional layer of virtualization. This layer gives the
appearance of storage as traditionally provisioned, with more physical resources than are
actually available. Note that thin volumes grow in grain-size increments, and when an extent is
filled with grains, the volume grows by the number of extents allocated from the disk pool.
(Figure: a thin-provisioned volume's virtual capacity spans LBA0 to LBAn; its real capacity is governed by the rsize parameter, and the physical storage used is 25 GB)
From the host perspective, the host sees only the full volume or logical disk size that it has access
to; therefore any increase to the rsize is transparent to the host. This also means the actual data
can be moved or replicated to another pool within the system without affecting the operation of
any application. When the data has been copied or moved, the metadata is updated to point to
the new location.
This is the magic of IBM Spectrum Virtualize from the perspective of the virtualized storage
(actually extent pointers): it affords the freedom to change the storage infrastructure without host
impact, and the opportunity to exploit newer technology to optimize storage efficiency for better
returns on storage investments.
Thin-provisioned volumes can be created using any of the GUI presets, with the same procedural
steps as any other volume created within the management GUI Create Volumes wizard. We are
using the Basic quick option, which provides predefined settings. When creating a thin-provisioned
volume, you need to change the Capacity savings option to Thin-provisioned.
A thin-provisioned volume is created with both a virtual capacity and a real capacity, which are
shown in the Summary. Virtual capacity is the volume capacity that is presented to hosts and to
Copy Services such as FlashCopy and Metro/Global Mirror.
Real capacity is what is actually allocated to the volume. It starts with a user-specified -rsize
buffer space, by default 2% of the volume capacity, and that amount of pre-allocated free space
is maintained as data is written to the volume. In the example of a 50 GB volume, 2% or 1 GB is
pre-allocated, and 1 GB of free pre-allocated space is kept on the volume as it grows.
Consequently, the system does not have to wait to allocate space to the volume before writing to
that storage.
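The 50 GB example can be checked with simple arithmetic. This is a sketch of the sizing math only, using the 2% default from the text above:

```python
# Back-of-envelope model of thin-provisioned real capacity: the volume
# keeps an -rsize buffer (default 2%) of free pre-allocated space ahead
# of the data actually written, until the volume is fully allocated.

def real_capacity_gb(written_gb, virtual_gb=50, rsize_pct=2):
    buffer_gb = virtual_gb * rsize_pct / 100
    return min(written_gb + buffer_gb, virtual_gb)

print(real_capacity_gb(0))    # 1.0  -> only the 2% buffer at creation
print(real_capacity_gb(10))   # 11.0 -> written data plus the kept buffer
print(real_capacity_gb(50))   # 50.0 -> fully allocated
```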
For a thin-provisioned volume, the management GUI generates the mkvdisk command. This
command defines the unique specifications with which a thin-provisioned volume performs:
• By default the volume is set to autoexpand, a feature that prevents a thin-provisioned volume
from using up its capacity and going offline.
• The grain size affects the maximum virtual capacity; space is allocated in chunks of 32 KB,
64 KB, 128 KB, or 256 KB (the default).
• The rsize 2% is a default parameter that specifies the percentage of capacity to keep as free
pre-allocated space; it does not apply to volumes in DRPs.
• The warning 80% is a warning threshold to notify an administrator when the volume has
reached 80% of its virtual size, to help the administrator avoid running out of space in the pool
when provisioning more VDisk space than exists in the pool.
(Tables: maximum thin-provisioned volume capacities for an extent size and for a grain size — columns include extent size in MB, maximum volume real capacity in GB, and maximum thin virtual capacity in GB)
A few factors (extent and grain size) limit the virtual capacity of thin-provisioned volumes beyond
the factors that limit the capacity of regular volumes.
The first table shows the maximum thin provisioned volume virtual capacities for an extent size. The
second table shows the maximum thin provisioned volume virtual capacities for a grain size.
(Figure: a directory B-tree of metadata maps the volume's extents, consuming a share of the volume capacity)
The directory of metadata and user data shares the real capacity allotment of the volume. When a
thin volume is initially created, the volume has no real data stored. However, a small amount of the
real capacity is used for metadata, which the system uses to manage space allocation and to keep
track of where volume data resides. The metadata holds information about extents and volume
blocks already allocated. Once a new write causes a grain to be allocated and used, the metadata
is updated and the volume is expanded to keep the buffer of free space. This metadata that is used
for thin provisioning allows the IBM storage system to determine whether new extents have to be
allocated.
Here are a few examples:
• If the volume default grain size is 256 KB, then 256 KB within the allocated real capacity is
marked as used, for the 512 blocks of 512 bytes each, spanning the LBA range in response to
this write I/O request.
• If a subsequent write I/O request is to an LBA within the previously allocated 256 KB, the I/O
proceeds as usual since its requested location is within the prior allocated 256 KB.
• If a subsequent write I/O request is to an LBA outside the range of a previously allocated grain,
then another grain is allocated to the volume.
All three of these write examples consult and might update the metadata directory. Read requests
also need to consult the same directory. Consequently, the volume’s directory is highly likely to be
IBM storage system cache-resident while I/Os are active on the volume.
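The three write cases can be modeled with a minimal grain directory. This is a sketch only — the real metadata is a B-tree, not a set — using the 256 KB default grain size and 512-byte blocks from the text:

```python
# Minimal model of grain allocation on a thin volume. LBAs are 512-byte
# blocks, so a 256 KB grain spans 512 LBAs. The "directory" set stands in
# for the metadata that records which grains are already allocated.

GRAIN_LBAS = 256 * 1024 // 512   # 512 LBAs per 256 KB grain

def write(directory, lba):
    """Return True if this write had to allocate a new grain."""
    grain = lba // GRAIN_LBAS
    newly_allocated = grain not in directory
    directory.add(grain)         # metadata consulted and updated
    return newly_allocated

directory = set()
print(write(directory, 0))       # True:  first write allocates grain 0
print(write(directory, 100))     # False: LBA 100 lies inside grain 0
print(write(directory, 600))     # True:  LBA 600 is outside, grain 1
```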
(Figure: threshold alerts are sent to the administrator as capacity is used; at the 80% threshold a warning is raised, and the volume goes offline when it attempts to allocate beyond its virtual size or the pool runs out of space)
To avoid exhausting the real capacity, you can enable the warning threshold on thin-provisioned
volumes to send alerts to an administrator by using email or an SNMP trap. The administrator can
then (if warranted) increase real capacity and/or virtual capacity. The warning threshold is set via
the GUI or via the CLI using the -warning parameter of mkvdisk. Similarly, one can set a warning
threshold for a disk pool using the GUI or the CLI using the -warning parameter of the mkmdiskgrp
command. Otherwise, the thin volume goes offline if it runs out of space.
When the warning threshold is exceeded, a message is added to the event log (default is 80%).
When the virtual capacity of a thin-provisioned volume is changed, the warning threshold is
automatically scaled to match. The new threshold is stored as a percentage.
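Because the threshold is stored as a percentage, the absolute trigger point rescales automatically when the virtual capacity changes. The sketch below illustrates this with a hypothetical helper, not the product's actual event-log code:

```python
# Sketch of the warning-threshold behavior: the threshold is a percentage
# of virtual capacity, so growing the volume moves the absolute trigger.

def warning_event(used_gb, virtual_gb, warning_pct=80):
    """Return an event-log style message once usage crosses the threshold."""
    if used_gb / virtual_gb * 100 >= warning_pct:
        return f"warning: volume at {100 * used_gb / virtual_gb:.0f}% of virtual size"
    return None

print(warning_event(39, 50))     # None: below the 80% threshold (40 GB)
print(warning_event(41, 50))     # crossed: a message goes to the event log
print(warning_event(41, 100))    # None again after growing the volume
```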
Each thin-provisioned volume has its own emergency capacity, typically 1% or 10% per volume,
depending on the type of volume. The dedicated emergency capacity allows the volume to stay
online for anywhere from minutes to days, depending on the volume change rate, before
everything starts going offline. However, this type of modification is considered advanced usage
and therefore is not available in the GUI.
In this topic, we discuss how RACE Compression works in an IBM storage system environment.
RACE Compression
As industry needs continue to grow, data compression must be fast, reliable, and scalable. The
compression must occur without affecting the production use of the data at any time. In addition,
the data compression solution must also be easy to implement.
Based on these industry requirements, IBM offers IBM RACE Compression, a combination of a
lossless data compression algorithm with RACE (Random Access Compression Engine) technology.
The RACE compression offering is advertised as having no performance impact for good reason.
Write I/Os aren't affected because an acknowledgment is sent to the host once the data is in write
cache on Spectrum Virtualize, and compression occurs later below the write cache in the I/O stack.
While the system adds latency to decompress the data for reads, the fact that we're transferring
less data usually reduces the latency more than decompression adds to it. This is especially true for
spinning disks. Additional processor cycles are required, and often hardware accelerators are used
for it.
RACE compression only applies to compressed volumes in standard storage pools (not DRPs).
DRP pools use a different compression algorithm, though both compression algorithms use any
compression chips either integrated on the system board or added as adapter cards in the nodes.
IBM is transitioning from RACE to DRP compression.
Both RACE and DRP compression trade off the costs of processor cycles and compression
licensing to save storage space.
• Eliminates need to reserve space for uncompressed data awaiting post-processing
• Compression can help freeze storage growth or delay the need for additional purchases
• IBM RACE Compression supports all Spectrum Virtualize / FlashSystem / Storwize V7K storage
• IBM RACE Compression can be used with active primary data
• Compress data for any internal or external storage and avoid OEM compression license costs
• High performance compression supports workloads off-limits to other alternatives
• Greater compression benefits through use on more types of data
• Can significantly enhance value of existing storage assets
• No performance impact
IBM RACE Compression offers innovative, easy-to-use compression that is fully integrated to
support active primary workloads:
• Provides high performance compression of active primary data
▪ Supports workloads off-limits to other alternatives
▪ Expands candidate data types for compression
▪ Derives greater capacity gains due to more eligible data types
• Operates transparently and immediately for ease of management
▪ Eliminates need to schedule post-process compression
▪ Eliminates need to reserve space for uncompressed data pending post-processing
• Enhances and prolongs value of existing storage assets
▪ Increases operational effectiveness and capacity efficiency; optimizing back-end cache and
data transfer efficacy
▪ Delays the need to procure additional storage capacity; deferring additional capacity-based
software licensing
• Supports both internal and externally virtualized storage
▪ Compresses up to 512 volumes per I/O group (v7.3 code).
▪ Exploits the thin-provisioned volume framework
(Figure: the Spectrum Virtualize I/O stack — SCSI target, forwarding, replication, upper cache, thin provisioning / data reduction with compression, lower cache, log structured array, virtualization with Easy Tier, DRAID/RAID, and SCSI/NVMe initiators sending I/Os to storage controllers)
• Compression is embedded into the thin provisioning layer; compressed volumes are also thin volumes
• Seamlessly integrates with the existing system management design, and provides an indication of how much uncompressed data has been written to the volume
• All IBM storage system / Spectrum Virtualize advanced functions are supported on compressed volumes
Traditional compression
(Figure: three compression actions based on physical data location during a file update)
IBM RACE offers an innovative leap by incorporating a time-of-data-access dimension, called
temporal compression, into the compression algorithm. When host writes arrive, multiple
compressed writes are aggregated into a fixed-size chunk called a compressed block. These
writes are likely to originate from the same application and same data type, so more repetitions
can usually be detected by the compression algorithm.
Due to the time-of-access dimension of temporal compression (instead of creating different
compressed chunks each with its unique compression dictionaries) RACE compression causes
related writes to be compressed together using a single dictionary; yielding a higher compression
ratio as well as faster subsequent retrieval access.
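The benefit of compressing related writes under a single dictionary can be illustrated with ordinary zlib standing in for the compressor. This is an analogy only, not the RACE algorithm; the write contents are invented:

```python
# Compressing 64 similar small writes one by one versus aggregated into a
# single block: the aggregated block lets the compressor's dictionary see
# repetitions across writes, so it compresses far better.

import zlib

writes = [b"customer=ACME;order=%04d;status=OPEN;" % i for i in range(64)]

separate = sum(len(zlib.compress(w)) for w in writes)  # a dictionary per write
together = len(zlib.compress(b"".join(writes)))        # one shared dictionary

print(separate, together)
print(separate > together)   # True: aggregation yields the higher ratio
```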
Compression is probably best known to users through the widespread use of compression
utilities, such as Zip and Gzip. At a high level, these utilities take a file as their input and parse the
data by using a sliding window technique. Repetitions of data are detected within the sliding
window history, most often 32 kilobytes (KB). Repetitions outside of the window cannot be
referenced. Therefore, the file cannot be reduced in size unless data is repeated within the window.
This example shows compression using a sliding window, where the first two repetitions of the
string "ABCDEF" fall within the same compression window and can therefore be compressed
using the same dictionary. However, the third repetition of the string falls outside of this window
and therefore cannot be compressed using the same compression dictionary as the first two
repetitions, reducing the overall achieved compression ratio.
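The window-limit effect in this example can be reproduced with a toy matcher. The window here is shrunk from the typical 32 KB to 32 bytes so the effect fits on screen:

```python
# A repetition is only compressible if an earlier copy of the pattern is
# still inside the history window behind the current position.

WINDOW = 32   # toy window; real utilities commonly use 32 KB

def repeat_in_window(data, pos, pattern):
    """Is there a copy of `pattern` within the window ending at `pos`?"""
    start = max(0, pos - WINDOW)
    return data.find(pattern, start, pos) != -1

data = b"ABCDEF" + b"ABCDEF" + b"x" * 40 + b"ABCDEF"

print(repeat_in_window(data, 6, b"ABCDEF"))   # True:  2nd copy sees the 1st
print(repeat_in_window(data, 52, b"ABCDEF"))  # False: 3rd copy is too far back
```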
(Figure: the volume is compressed in the storage device)
As part of its staging, data passes through the compression engine and is then stored in
compressed format in the storage pool. This means that each host write is compressed as it
passes through the compression algorithm; therefore, physical storage is consumed only by
compressed data.
Writes are acknowledged immediately after being received by the write cache, with compression
occurring as part of the staging to internal or external physical storage.
While latency is added to reads from a compressed volume as data goes through the
decompression code stack, this increase in latency is offset by the reduced latency that results
from reading and transferring less data from the storage.
Compression enhancements:
• Offers hardware-assisted compression acceleration
• Compressed data in the bottom cache uses less cache
• RAID full-stripe writes for compressed volumes
• Supports a larger number of compressed volumes
• Supports up to two compression accelerator cards, installed in dedicated slots
• An I/O group with 4 compression accelerator cards can support up to 512 RACE compressed
volumes or 10,000 DRP compressed volumes
Some Spectrum Virtualize systems require the optional Compression Accelerator cards to support
compression. One can add up to two Compression Accelerator cards per node, depending on the
Spectrum Virtualize system model. Whether required or not, adding Compression Accelerator
cards increases the number of volumes that can be compressed. For example, external storage
systems and IBM FlashSystems supporting 18 TB (large capacity) flash modules must have the
compression hardware installed to support compressed volumes. Enabling compression does not
affect non-compressed host-to-disk I/O performance.
It is strongly recommended to place Compression Accelerator cards into their dedicated slots. An
I/O group with 4 Compression Accelerator cards can support up to 512 compressed volumes,
whereas an 8-node cluster with four (4) I/O groups can support as many as 2048 compressed
volumes.
The IBM storage system requires a specific RACE Compression license. This license might be part
of a base license package that entitles the IBM storage system to all of the licensed functions, such
as Virtualization, FlashCopy, Global Mirror, Metro Mirror, and RACE Compression. RACE
Compression is licensed by capacity, per terabyte of virtual data.
IBM authorizes existing IBM storage system customers to evaluate the potential benefits of the
RACE Compression capability, based on their own specific environment and application workloads,
for free using its 45-day evaluation program. However, before you can use the RACE Compression
45-day trial period, the storage system must be running IBM Spectrum Virtualize software version
7.4 or later and have two Compression Accelerator cards installed. The 45-day evaluation period
begins when you enable the RACE Compression function. At the end of the evaluation period, you
must either purchase the required licenses for RACE Compression or disable the function.
You can also apply the compression license using the CLI by entering the total number of terabytes
of virtual capacity that is licensed for compression. For example, run chlicense -compression 200.
When ordering Compression Accelerator cards, one should have the same number of cards in
each node in an I/O group in case of node failure so both nodes can access compressed volumes.
You can easily and transparently convert from one volume type to another (with an exception for
converting from RACE compressed volumes to DRP compressed/deduplicated volumes). Simply
right-click a volume to create a volume copy, specify the volume type, and once the two copies are
synchronized, delete the original copy.
For customers migrating from RACE compression to DRP compression, certain limitations apply:
• RACE compressed volumes can only reside in standard storage pools, and DRP compressed
volumes can only reside in DRPs.
• You cannot have RACE compressed volumes and DRP deduplicated volumes in the same I/O
group at the same time.
• You cannot have RACE compressed volumes and DRP compressed volumes in the same I/O
group at the same time, except on certain Spectrum Virtualize models with compression hardware
either integrated into the system board or added via compression accelerator cards.
In cases where the system doesn't have compression hardware, alternative methods exist, such as
uncompressing the volumes prior to migration.
Compressed volumes are configured in the same manner as the other preset volumes. As with
thin-provisioned volumes, you can use any preset to create a compressed volume; however, the
custom preset provides a modified view of all the specifications with which a compressed volume
performs.
The accessible I/O groups are those from whose ports a host can access the volume, and this is a
form of storage LUN masking. By default, a host can access a volume only through the caching I/O
group.
Compressed volumes have the same characteristics as thin-provisioned volumes, and the defaults
are almost identical. You can accept the defaults or modify the rsize, autoexpand, and warning
threshold settings. The preferred settings are to set rsize to 2% and warning to 80%. The only
difference between the two presets is the grain size attribute: compressed volumes do not have an
externally controlled grain size. The enabled cache mode specifies the read/write caching options
for the volume, and the unit device identifier (UDID) is used by OpenVMS hosts to identify the
volume.
The summary view shows the volume parameters and capacities that are being created. Like a
thin-provisioned volume, the compressed volume is created with both a virtual capacity and a real
capacity. During creation, the compression engine immediately creates the compressed volume's
minimal metadata and header. Therefore, the used size is larger than the before-compression size.
The management GUI generates the mkvdisk command to create a compressed volume. This
command defines the unique attributes of the compressed volume:
• By default the volume is set to autoexpand, which expands the volume as data is written to
ensure there is an -rsize buffer of free space for new data until the volume is fully allocated.
• Cache readwrite specifies the caching options for the volume, which is enabled by default.
The readonly setting disables write caching but still allows read caching for a volume.
• The -compressed parameter specifies that the volume is to be created compressed. Therefore
data is compressed when it is written and stored on the disk.
• A volume is owned by an I/O group and is assigned a preferred node within the I/O group at
volume creation. Unless overridden, the preferred node of a volume is assigned in round-robin
fashion by the system. If the -iogrp parameter is not specified, the least used I/O group is
used for compressed copies (considering the subset of I/O groups that support compression).
• The rsize 2% is a default parameter that specifies the buffer size by which capacity is to be
increased on the volume.
• The warning 80% is a warning threshold to notify an administrator when the volume has
reached the specified threshold percentage.
Compressed volume copy details
A compressed volume is also a striped volume, and its extents are distributed across the MDisks of
the pool. In this example, the compressed volume FA1-COMP resides in child pool FA1-Hybrid,
which resides in parent pool mdiskgrp0; because the volume was created in a hybrid (child) pool,
its extents are sourced from the mdiskgrp0 (parent) pool.
The Volume Allocation capacity bar has been updated with the allocation of the FA1-COMP volume
copies (fully allocated and compressed). As the volume synchronization process continues, the
pool allocation will show an increase as the volume is synchronized, then a decrease when the
original non-compressed volume copy is deleted.
The Compression Savings capacity reflects the total amount of compression savings at the pool
level. It will also show a rise and decline in the compression savings during the volume
synchronization process. Once complete, it will reflect only the capacity used for metadata and the
compressed bytes of the compressed volume. After the volume synchronization is complete, the
fully allocated volume will be deleted.
IBM Easy Tier (v7.1 release) supports compressed volumes. Only random read operations are
monitored for compressed volumes (versus both reads and writes). Extents with high random reads
(64 K or smaller) of compressed volumes are eligible to be migrated to tier 0 storage.
(Figure: compression for a compressed volume is done by the volume's preferred node)
An IBM storage system can increase the effective capacity of your flash storage up to 5 times using
IBM RACE Compression. Compression requires dedicated hardware resources within the nodes,
which are assigned or de-assigned when compression is enabled or disabled. Compression is
enabled whenever the first compressed volume in an I/O group is created and is disabled when the
last compressed volume is removed from the I/O group.
Compression CPU utilization can be monitored from Monitoring > Performance. Use the
drop-down list to select and view CPU utilization data of the preferred node of the volume.
Behind the scenes, compression is managed by the preferred node of the volume. As data is
written, it is compressed on the fly by the preferred node before being written to the storage pool. A
compressed volume appears as a standard volume with its full capacity to the attaching host
system. Host reads and writes are handled as normal I/O. As write activity occurs, compression
statistics are updated for the volume.
Note that the MB/s for VDisks is much higher than the MB/s for MDisks, indicating that compression
results in transferring less data to and from storage, as compared to the data to and from the host.
(Figure: typical compression savings — databases 50-80%, e-mail 30-80%)
Not all workloads are good candidates for compression. The best candidates are data types that
are not compressed by design. These data types involve many workloads and applications such as
databases, character/ASCII based data, email systems, server virtualization infrastructures,
CAD/CAM, software development systems, and vector data.
Integrated Comprestimator
• Provides native data economics with Comprestimator in the GUI
• Analyzes the patterns of the actual customer data and estimates the compressibility of that data
per volume
• Avoids the need to install Comprestimator in a separate location
• To estimate compression savings, right-click the volume and select Volume Copy Actions >
Space Savings > Estimate Compression Savings
• Generates the analyzevdiskbysystem command to determine the compression savings for the
volume
RACE Compression is a key differentiator of the IBM Spectrum Virtualize products, and
Comprestimator is its key sizing tool to estimate how much capacity savings the customer can
expect.
The integration of Comprestimator in IBM Spectrum Virtualize software eases the process of
estimating capacity savings by having this sizing tool integrated in the system. This avoids the need
to install Comprestimator separately, and enables estimates of RACE Compression effectiveness
by analyzing the patterns of the actual customer data and estimating its compressibility per
volume. To estimate compression savings in the management GUI, select the volume and then
select Actions > Space Savings > Estimate Compression Savings. This can also be done by
right-clicking the volume. The system generates an analyzevdiskbysystem command, which
automatically analyzes your configuration to determine the potential storage savings if compression
were enabled. The management GUI incorporates the Comprestimator utility, which uses
mathematical and statistical algorithms to estimate potential compression savings for the volume.
http://www-304.ibm.com/webapp/set2/sas/f/comprestimator/home.html
Use the IBM Comprestimator Utility to evaluate data on existing volumes for potential benefits of
compression. Implement compression for data with an expected compression ratio of 45% or
higher.
Do not attempt to compress data that is already compressed or with low compression ratios. They
consume more processor and I/O resources with small capacity savings.
The Comprestimator is a host-based command-line executable available from the IBM support
website. The utility and its documentation can also be found by performing a web search using the
keywords 'IBM Comprestimator'.
The Comprestimator supports a variety of host platforms. The utility runs on a host that has access
to the devices that will be analyzed, and performs only read operations so it has no effect
whatsoever on the data stored on the device.
Download the latest Comprestimator version 1.5.x.x to start analyzing expected compression
savings for XIV, Storwize V7000, SAN Volume Controller (SVC), and FlashSystem V9000 storage
systems.
Comprestimator users may want to consider taking measures to zero out deleted data to improve
the accuracy of the tool, and to free up that space for data reduction.
The Comprestimator utility is designed to provide fast estimated compression rates for
block-based volumes that contain existing data. It uses random sampling of non-zero data on the
volume and mathematical analysis to estimate the compression ratio of the existing data. By default,
it runs in less than 60 seconds (regardless of the volume size). Optionally, it can be invoked to run
longer and obtain more samples for an even better estimate of the compression ratio.
Because Comprestimator samples existing data, the estimated compression ratio becomes
more accurate and meaningful for volumes that contain as much relevant active application data as
possible. Previously deleted old data on the volume, or empty volumes not initialized with zeros, are
subject to sampling and will affect the estimated compression ratio. The utility employs advanced
mathematical and statistical algorithms to efficiently perform read-only sampling and analysis of
existing data volumes owned by the given host. For each volume analyzed, it reports an estimated
compression capacity savings range, within an accuracy range of 5 percent.
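The sampling approach can be sketched as follows. This is an illustrative approximation only, not the actual Comprestimator algorithm; the `read_chunk` callback and the sample count are hypothetical, and zlib stands in for the real compression engine.

```python
import random
import zlib

def estimate_compression_ratio(read_chunk, num_chunks, samples=64, seed=1):
    """Illustrative sketch: estimate a volume's compression ratio by
    compressing a random sample of its non-zero chunks (read-only).
    read_chunk(i) is a hypothetical callback returning chunk i as bytes."""
    rng = random.Random(seed)
    sampled = compressed = 0
    for i in rng.sample(range(num_chunks), min(samples, num_chunks)):
        data = read_chunk(i)            # read-only access to the volume
        if data.count(0) == len(data):  # skip all-zero (unwritten) chunks
            continue
        sampled += len(data)
        compressed += len(zlib.compress(data))
    # Ratio of compressed size to sampled size; None if nothing was sampled
    return compressed / sampled if sampled else None
```

As in the real tool, sampling more chunks yields a tighter estimate, and volumes containing mostly stale or uninitialized data skew the result.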
The IBM Comprestimator utility now incorporates FlashSystem as a storage type to estimate
how much capacity savings the client can expect with RACE Compression.
With FlashSystem, users can set the number of flash modules in the simulated
system using the --flash-modules N option, and set the size of the flash modules in the simulated
system using the --flash-module-size [SMALL|MEDIUM|LARGE] option.
The guideline for a volume to be considered as a good candidate is a compression savings of 45%
or more.
To execute the Comprestimator utility, log in to the server using an account with administrator
privileges. Open a Command Prompt with administrator rights (Run as Administrator). Run
Comprestimator with the Comprestimator -n X -p -s (append with storage system)
command. For Storwize systems, you will need to use SVC.
This example of Comprestimator output for an SVC volume indicates that the real storage
capacity consumption for this volume would be reduced from 50 GB to 10.6 GB: 68.5% of the
capacity savings are derived from thin provisioning, and compression saves a further 32.4% of
what remains (within an accuracy range of 5.0%).
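The figures in this example compose multiplicatively: thin provisioning reclaims unwritten space first, and compression then applies to what remains. A quick arithmetic check, using the values from the output above:

```python
def remaining_capacity_gb(virtual_gb, thin_savings, compression_savings):
    """Apply thin-provisioning savings first, then compression savings
    to the remaining capacity, as in the Comprestimator report."""
    after_thin = virtual_gb * (1 - thin_savings)
    return after_thin * (1 - compression_savings)

# 50 GB volume, 68.5% thin-provisioning savings, 32.4% compression savings
print(round(remaining_capacity_gb(50, 0.685, 0.324), 1))  # -> 10.6
```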
In addition to the above statements, recall compression is performed by the volume’s preferred
node. The preferred node is assigned in round-robin fashion within the I/O group as each volume is
created. Over time, as volumes are created and deleted, monitor and maintain the distribution of
compressed volumes across both nodes of the I/O group.
In the example scenarios of this unit, compressed volumes and non-compressed volumes share
the same storage pool. For certain configurations and environments, it might be beneficial to
segregate compressed volumes into a separate pool to minimize the impact on non-compressed
volumes. Review your environment with your IBM support representative when activating RACE
Compression. Note that IBM's direction is toward using DRPs rather than RtC, and the newer
hardware listed does not support RtC.
This topic discusses how data reduction pools reduce the required storage capacity.
Data Reduction Storage Pool
IBM Spectrum Virtualize supported
The release of IBM Spectrum Virtualize 8.1.2 introduced an architecture using a log
structured array for storing data, such that updates are all written sequentially to storage,
benefiting from full RAID stripe writes. Metadata exists in internal volumes in the pool to keep track
of the data's location.
Data reduction pools increase existing infrastructure capacity utilization by leveraging new
efficiency functions. The pools enable you to automatically de-allocate and reclaim capacity of
thin-provisioned volumes containing deleted data and, for the first time, enable this reclaimed
capacity to be reused by other volumes. The metadata keeps track of where data is in the pool.
With a new log-structured pool implementation, data reduction pools help deliver more consistent
performance from compressed volumes. Data reduction pools also support compressing all
volumes in a system, potentially extending the benefits of compression to all data in a system, and
support up to 10,000 compressed volumes per system. This offers up to 3x better throughput for
compressed volumes, making it possible to compress more volumes and reduce storage costs.
Data reduction pools are completely transparent to host applications.
The previous RACE software engines required dedicated processors and memory. Data reduction
pools instead share the Spectrum Virtualize processors and memory and use a more efficient
compression algorithm, allowing more volumes to be compressed. This means they achieve
all the benefits of multi-threaded processing, along with custom memory
management and no dedicated hardware just for compression.
When a data reduction pool is enabled, it utilizes 1 GB of memory that is taken from the system
cache. This cache is primarily used for data reduction pool metadata.
There are limitations regarding mixing DRP and RACE as follows:
• RACE compressed volumes can only reside in standard storage pools, and DRP compressed
volumes can only reside in DRPs.
• One cannot serve up RACE compressed volumes and DRP deduplicated volumes from the
same I/O group at the same time.
• One cannot serve up RACE compressed volumes and DRP compressed volumes from the
same I/O group at the same time except for specific models of Spectrum Virtualize systems with
appropriate compression hardware either integrated into the system board or via attached
compression accelerator cards.
This is mainly of concern to customers using RACE compression and desiring to migrate to DRP,
as IBM is transitioning to DRP and customers will benefit by doing so as well. The migration
procedure depends on whether one has the appropriate compression hardware to support mixed
RACE and DRP compression.
Data reduction pools can achieve up to 4.8 GB/s per node with the dedicated compression
hardware, and work slightly better with Easy Tier than RACE Compression.
Data Reduction
Pool view
Directory Volumes
Internally:
4 Directory volumes
1 Customer data volume (per IO group)
1 Journal volume (per IO group)
1 Reverse Lookup volume (per IO group)
In the example, we show a data reduction pool that has been created with four compressed
volumes. When a data reduction pool is created, it contains the following internal volumes and I/O
patterns (for other than fully allocated volumes):
• Four Directory volumes that perform the volume lookup and utilize 1% of the pool capacity.
▪ Short 4 KB random read/write I/Os
• One Customer data volume (per I/O group) that maintains 98% of the pool's capacity.
▪ Large sequential write I/Os, short random read I/Os
▪ Writes are 256 KB into lower cache, merged into full stripe writes
• One Journal volume (per I/O group) that uses 2% of the pool capacity for recovery purposes and
maintains a journal of all updates.
▪ Large sequential write I/Os, typically 256 KB, into lower cache; only read for recovery scenarios
such as T3 recovery.
• One Reverse Lookup volume (per I/O group) that uses 1% of the pool capacity to track what
is located on the physical storage; it is also used for garbage collection.
▪ Short semi-random read/write I/Os
With the release of Spectrum Virtualize V8.3, garbage collection was modified to be a function of
free capacity. The out-of-space behavior has been enhanced to avoid I/O timeouts in favor of
graceful offline events. This allows garbage collection to operate at a lower rate below 85% capacity
utilization. Less aggressive garbage collection means that it collects extents at a slower rate; this
costs less write bandwidth and, in turn, increases performance. It may also result in an increased
amount of reclaimable capacity. Above 85% utilization, garbage collection still operates at the
current maximum level, collecting more aggressively in order to ensure that the pool remains
online and has sufficient capacity.
While these metadata volumes can be small, it's recommended that DRPs be at least 20 TB in
size, with a sweet spot of 100-200 TB. The minimum size for the metadata and deduplication
tables is about 1 TB. Thus, the benefits of DRPs grow as the storage pool gets larger.
Volume mirroring between a traditional storage pool and a data reduction pool
Traditional or legacy storage pools have a fixed allocation unit of an extent. These storage pools,
along with fully allocated volumes, thin-provisioned volumes, and compressed volumes, are still
supported.
IBM Spectrum Virtualize supports nondisruptive movement of data from traditional storage
pools to data reduction pools by using volume mirroring.
Random Access Compression Engine (RACE) compression and DRP compressed volumes can
coexist in the same I/O group, however deduplication is not supported in the same I/O group as
RACE compressed volumes. If the Spectrum Virtualize model doesn’t have compression
processors either built into the system board or via compression accelerator adapter cards, then
RtC and DRP compressed volumes are not supported at the same time in the same I/O group.
RACE Compression development advises against mixing data types within the same volume
whenever possible; for example, don't mix compressible data with incompressible data on the same
volume.
With data reduction pools comes a revised architecture of compression with the implementation of
8 KB blocks (also referred to as 8 KB chunks) versus the 32 KB block size RACE compression
used. This is more efficient for flash storage.
In the example, as the host writes to the disk, data is split up into fine-grained 8 KB chunks. The
smaller 8 KB chunks require less compression bandwidth for small-I/O read workloads. Previously,
with RACE 1.0, you would build at least 64 KB of data, compress that to 32 KB, and then save it.
Next, the 8 KB chunks are compressed individually. Finally, the 8 KB chunks of compressed data
are grouped into 256 KB blocks and written through the lower cache on to the storage. With the log
structured array holding customer data, this results in large-block sequential writes to disk, which
take advantage of RAID full stripe writes, avoiding most of the RAID 5 and RAID 6 write penalty.
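A minimal sketch of this write path follows, with zlib standing in for the real compression engine: split the write into 8 KB chunks, compress each chunk individually, then group the compressed results into 256 KB blocks bound for the lower cache. The function is illustrative only.

```python
import zlib

CHUNK = 8 * 1024    # data is split into 8 KB chunks
GROUP = 256 * 1024  # compressed chunks are grouped into 256 KB blocks

def compress_write(data):
    """Illustrative DRP write path: compress each 8 KB chunk individually,
    then pack the compressed chunks into 256 KB blocks."""
    chunks = [zlib.compress(data[i:i + CHUNK])
              for i in range(0, len(data), CHUNK)]
    blocks, current = [], b""
    for c in chunks:
        if current and len(current) + len(c) > GROUP:
            blocks.append(current)   # 256 KB block is full; start a new one
            current = b""
        current += c
    if current:
        blocks.append(current)
    return blocks
```

The 256 KB blocks then flow through the lower cache as large sequential writes, which is what enables the RAID full stripe writes described above.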
Data reduction pools use a log structured array (LSA) to allocate capacity. Log structured arrays
always append new data to the end of the array. When data is overwritten, the old location and the
capacity it utilized are marked as reclaimable capacity. SCSI UNMAP requests can also free
capacity that is no longer needed, making it reclaimable as well. Metadata is kept in an internal
volume pointing to where the data and reclaimable space are within the pool. While write I/O for
data is large-block sequential and read I/O is random, I/O for the metadata is random 4 KB read
and write. The metadata consumes less than 3% of the space. As a result, DRPs with a bit of flash
storage in a multi-tier pool using Easy Tier can significantly improve performance by placing
meta-data in the flash tier. This change mainly centers on the design of the metadata and knowing
where that data is stored.
Previously, when compression was implemented, the metadata was stored inline with the data, so
writes come in from the host and turn into a sequential stream of compressed data with small
metadata writes mixed through. With data reduction pools, for any implementation of compression
(and thin provisioning), the metadata is written separately. This allows the cache to handle
data and metadata differently. Therefore, the volume you create from the pool to present to a host
application consists of a directory that stores the allocation of blocks within the capacity of the pool.
Freeing storage
• File systems, hypervisors, and applications use unmap to free storage to Spectrum Virtualize.
• In addition to read and write I/O requests, there is an unmap I/O request to free storage.
• IBM Spectrum Virtualize uses unmap to free storage to virtualized storage controllers and flash media.
Garbage collection
Garbage Collection Plans generated frequently based on (not limited to):
Frequency of over-writes
Rate of New I/O
Rate of data being invalidated (for example unmap)
Active/Stale data
Amount of space required to move data.
Amount of space free
Extents with largest number of holes
Data grouped into frequently modified and not frequently modified. Extent freed or
recycled
Garbage collection rates based on pool fullness
Does not apply to DRPs with only fully allocated volumes
Data reduction pools have a process to consolidate space for deleted or overwritten data, called
garbage collection. Garbage collection relocates valid user data. This feature involves the
Reverse lookup volume. User data is consolidated into 256K blocks which get pushed through the
lower cache into RAID full stripe writes. As data is consolidated, extents are eventually freed up.
It's important that DRPs be kept less than 85% full to provide sufficient spare space for garbage
collection to run efficiently. As free space declines, garbage collection has to work harder, doing
more I/O to free up less, potentially interfering with production I/O.
Garbage collection has to work harder (more CPU, more reads, and more writes) when there isn't
much free space, and very hard when there's little free space. For example, consider a pool with
most extents allocated to VDisks, with those extents having 20% reclaimable space. To free up an
extent, we have to read 5 extents and rewrite their data into 4 extents, freeing one extent, for a total
of 9 extents of I/O to free one extent. Alternatively, consider a situation in which extents have 50%
reclaimable space. In that case we only need to read 2 extents and write 1 extent, for a total of 3
extents of I/O to free one extent. Therefore, it's important to ensure that the real capacity of a pool
doesn't exceed 85% of its physical capacity. Remember that over time, as new data is written,
reclaimable space increases until it's returned to free space by garbage collection.
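The I/O amplification in these two cases can be expressed numerically. This is a simple illustrative model, not the product's actual algorithm:

```python
import math

def gc_extents_of_io(reclaimable_fraction):
    """Extents of I/O needed to free one extent, when each candidate
    extent has the given fraction of reclaimable space (holes)."""
    reads = math.ceil(1 / reclaimable_fraction)  # extents read
    writes = reads - 1  # valid data is rewritten into one fewer extent
    return reads + writes

print(gc_extents_of_io(0.2))  # -> 9 (read 5, write 4)
print(gc_extents_of_io(0.5))  # -> 3 (read 2, write 1)
```

The cost grows steeply as the reclaimable fraction shrinks, which is why keeping the pool below 85% full matters.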
Note that DRPs with only fully allocated volumes do not use the LSA or garbage collection.
Also note that garbage collection is a lazy process, in that it may run very slowly or not at all if
there’s plenty of free space in the pool, and little space to free.
Figure: a virtual volume with blocks 1-8; blocks 1, 2, and 3 are overwritten (as 1B, 2B, 3B), leaving
their old physical locations as reclaimable holes, and block 5 is relocated on the physical volume.
In this example, a host performs a sequence of random writes to a VDisk. Host writes go
through the appropriate stack for the associated VDisk (deduplication, compression, and/or
thin provisioning) and are grouped into 256 KB I/Os that are appended to the log structured array
holding customer data in the DRP. No data is overwritten; instead, the old data is marked as
freed and reclaimable, leaving holes of free space on an extent. Additionally, SCSI UNMAP
commands mark old data as freed and reclaimable. Garbage collection reads extents with a
lot of holes and rewrites them into fewer extents, freeing up the old extents.
Here, the data in extents 1, 2, and 3 has been overwritten, creating holes that are marked invalid
and leaving extent 5. We can therefore move extent 5 to another location, which frees up space.
Thin-provisioned MDisks can also be utilized with unmap to alert the back end when extents are
freed or a host writes zeros, providing more efficient use of physical space.
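The append-only behavior described above can be modeled with a toy log structured array. This is a hypothetical minimal sketch for illustration, not the product's implementation:

```python
class ToyLSA:
    """Toy log structured array: writes always append; overwrites and
    unmaps turn the old physical location into a reclaimable 'hole'."""

    def __init__(self):
        self.log = []          # append-only physical log
        self.directory = {}    # virtual block -> position in the log
        self.reclaimable = set()

    def write(self, vblock, data):
        if vblock in self.directory:
            # Overwrite: the old copy becomes a hole, never updated in place
            self.reclaimable.add(self.directory[vblock])
        self.directory[vblock] = len(self.log)  # new data appended at the end
        self.log.append(data)

    def unmap(self, vblock):
        # SCSI UNMAP: the location is freed and becomes reclaimable
        if vblock in self.directory:
            self.reclaimable.add(self.directory.pop(vblock))
```

Garbage collection would then read log positions listed in `reclaimable`'s extents and compact the surviving data, as described above.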
4. If any recognizable patterns are found, the hash of that data is removed.
5. The system then examines the remaining data and seeks out new patterns. If a match is found,
the hash is used to reference the data. This might look like deduplication, but here you
don't need a first copy saved, or a reference, because data reduction can generate the data
immediately.
6. Any new patterns are then removed (deduplicated). The small blue bars represent a minimal
amount of metadata; the hash reference itself is what gets stored.
7. The system then applies compression to the remaining data. Comparing this to the original
64 KB of data, you can see a great amount of data reduction; this is all that is physically
stored from this single user write.
This data reduction is running all the time, and takes place below the cache in order to help reduce
the amount of I/Os.
Note that memory will be consumed to store the deduplication fingerprint database, depending on
the amount of memory in the nodes as follows when deduplication is enabled:
Systems with 32GB per node = 12GB for fingerprint DB
Systems with 64GB per node = 16GB for fingerprint DB
Systems with 128GB+ per node = 32GB for fingerprint DB
In addition, approximately 1 GB is taken from the cache when data reduction is enabled.
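The memory table above can be expressed as a small helper. The function name is illustrative, not part of any product API:

```python
def fingerprint_db_size_gb(node_memory_gb):
    """Deduplication fingerprint database size (GB) per node, based on
    the node memory tiers listed above."""
    if node_memory_gb >= 128:
        return 32
    if node_memory_gb >= 64:
        return 16
    if node_memory_gb >= 32:
        return 12
    raise ValueError("deduplication requires at least 32 GB per node")
```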
Mdiskgrp:
• reclaimable_capacity – space that can be reclaimed via GC
• used_capacity_before_reduction
• used_capacity_after_reduction
• compression_opportunity
• overhead_capacity
System:
• total_reclaimable_capacity
• used_capacity_before_reduction
• used_capacity_after_reduction
• overhead_capacity
• compression_opportunity
The system collects various statistics for data reduction pools, including how much capacity can
eventually be reclaimed via garbage collection (GC). See the Knowledge Center for the attribute
definitions in the lsvdisk, lsmdiskgrp, and lssystem command pages.
Consider capacity savings in pools and volumes where thin provisioning, compression, and
deduplication are used. When deduplication is in use, capacity savings for an individual volume
don't make sense, because data is typically deduplicated across multiple volumes; we can't say
space was saved for a particular volume, since deleting the volume that actually holds the data
means the data must be re-homed, and it can be re-homed to any volume referencing it. So with
DRP we look at capacity savings on a pool basis.
Also note that when deduplication is used, deleting a volume takes time, because the system must
examine whether any of its data is deduplicated and, if so, re-home that data to another volume.
DRPs are designed for large storage pools, with a recommended minimum size of 20 TB and a
sweet spot in the range of 100-200 TB. More data typically yields more deduplication. The extent
size should be 4 GB or 8 GB, which affects the maximum pool size depending on the number of I/O
groups in the system. And while DRPs are supported when the system doesn't have compression
hardware, it's recommended to implement DRPs only with compression hardware; otherwise,
closely monitor your CPU utilization to avoid processor bottlenecks, and remember that in case of
node failure the partner node takes over processing for the affected volumes.
The Data Reduction Estimate Tool is accessible via Fix Central at
https://www-945.ibm.com/support/fixcentral/swg/selectFixes?parent=Mid-range%20disk%20syste
ms&product=ibm/Storage_Disk/IBM+Storwize+V7000+(2076)&release=8.2&platform=All&function
=all, works similarly to the Comprestimator tool, provides deduplication savings across multiple
volumes, and generates output like:
Estimated Dedup Savings: 97.8%
Estimated Compression Savings: 16.3%
Data Reduction Savings: 98.2%
---------------------------------------
Zeroes Detected Savings: 4.11%
Total Data Efficiency Savings: 98.2%
These volumes are excellent candidates for deduplication, but only an additional 0.4% is saved
by compression, so a good choice in this case would be to use thin deduplicated volumes.
Remember that we need to keep at least 15% of the pool empty for garbage collection (for
thin, compressed, and/or deduplicated volumes), so savings should exceed that threshold.
Also see http://www14.software.ibm.com/webapp/set2/sas/f/dretool/home.html
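The combined figure in the tool output can be reproduced from the individual savings: they compose multiplicatively, because compression applies only to the data that remains after deduplication. A quick check using the values above:

```python
def data_reduction_savings(dedup_savings, compression_savings):
    """Combined savings: compression applies only to the data that
    remains after deduplication, so the factors multiply."""
    remaining = (1 - dedup_savings) * (1 - compression_savings)
    return 1 - remaining

# Figures from the tool output above: 97.8% dedup, 16.3% compression
print(round(data_reduction_savings(0.978, 0.163) * 100, 1))  # -> 98.2
```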
If possible, it's recommended that different data types reside on separate volumes; that is, don't
mix non-compressible data with compressible data, or other combinations. This allows more
granularity regarding the data reduction used for the specific data types.
Since meta-data is frequently accessed, placing it on flash backed storage via Easy Tier in a hybrid
pool can significantly improve performance.
Users should select the volume type that meets their business objectives and leverages good
performance from DRP technology. Each of the above volume combinations provides certain
benefits for storage efficiency and has different performance characteristics.
Fully allocated volumes have no savings from compression, deduplication, or thin provisioning, but
there's no LSA or extra metadata overhead. Many applications make assumptions about data
placement to optimize I/O operations. Such volumes do benefit from SCSI unmap to the backend
storage so it can reclaim space, when in DRPs as opposed to standard storage pools, when the
backend uses that function. This includes backend FCMs.
Compressed volumes are usually the second best performer, though if the data isn't compressible
then thin volumes are appropriate. If deduplication savings exceed 30%, then thin deduplicated
volumes are appropriate. Everyone's environment is different and bottlenecks can arise in many
places, so these recommendations should be considered starting points.
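These starting points can be summarized as a simple decision function. The 45% compression threshold comes from the Comprestimator guideline earlier in this unit and the 30% deduplication threshold from the paragraph above; the function name, the check order, and the fallback are illustrative only, not an IBM-prescribed rule:

```python
def recommend_volume_type(compression_savings, dedup_savings):
    """Illustrative starting point for choosing a DRP volume type,
    based on estimated savings fractions (0.0 - 1.0)."""
    if dedup_savings > 0.30:          # 30% deduplication guideline
        return "thin deduplicated"
    if compression_savings >= 0.45:   # 45% compression guideline
        return "compressed"
    return "thin-provisioned"
```

As the text notes, the volume-copy test described next is the reliable way to validate whichever type this kind of rule of thumb suggests.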
It's easy to test by creating a volume copy with the desired volume type. Writes go to cache, while
reads come from the primary volume copy. So wait for the volumes to sync up and observe the
performance. Then make the target volume the primary copy, so reads come from it, and observe
performance again, including from the application point of view. Then you can decide which volume
type to use.
To maximize DRP throughput, use one DRP per I/O group.
One must take great care to ensure that data reducing backends don’t run out of space because
the data reduction ratio changes. With compression and deduplication on the front end, it’s
reasonable to not expect much data reduction on the back end. So it’s safer to assume no data
reduction will occur in the backend, and limit total allocated volume size to the uncompressed
capacity of the backend. And be sure to monitor used physical capacity in the backend to ensure
it doesn’t run out of space.
For fully allocated volumes, it’s recommended they reside in a separate storage pool. This avoids
garbage collection overhead while the backend compresses the fully allocated volumes in the pool.
Supported systems
Supported on IBM Spectrum Virtualize Storwize V5030, Storwize V7000, SAN Volume
Controller SV1, FlashSystem V9000, and FlashSystem 9100
Data Reduction Pool license requirements:
SVC, FS V9000, and FS 9100 do not require an additional license for DRP (included in the base
capacity license)
Storwize V5030 and V7000 need an additional license for compression functionality
Data reduction pools are supported on IBM Spectrum Virtualize Storwize V5030, Storwize V7000,
SAN Volume Controller SV1, FlashSystem V9000, and FlashSystem 9100 running V8.1.3.2 or
higher.
IBM Spectrum Virtualize SAN Volume Controller, FlashSystem V9000, and FlashSystem 9100 do
not require a license to implement data reduction pools; DRP is included in the base capacity
license of the product. Nodes must have at least 32 GB of memory to support deduplication. After
you migrate all RACE volumes to a data reduction pool, you will no longer need your old
compression license. However, if you are using data reduction pools on Storwize V5030 or
Storwize V7000, an additional license is still required for the compression functionality.
Keywords
• Auto Expand
• Comprestimator utility
• Customer data volume
• Data reduction pool
• Deduplication
• Directory volume
• Estimate Compression Savings
• Freeing storage
• Fully allocated volume
• Garbage collection
• Journal volume
• Log Structured Array (LSA)
• RACE Compression
• Reverse Lookup volume
• SCSI unmap
• Thin Provisioning
• 8K blocks
• 8K chunks
Review questions (1 of 2)
1. True or False: Data from an existing storage pool can be moved to a data reduction
pool using volume mirroring.
Review answers (1 of 2)
1. True or False: Data from an existing storage pool can be moved to a data reduction
pool using volume mirroring.
The answer is true.
Review questions (2 of 2)
3. Space within the allocated real capacity of a thin-provisioned volume is assigned in
what size increments driven by write activity?
A. Extent size increments as defined by the storage pool extent size
B. Grain size increments with a default grain size of 256 KB
C. Blocksize increments as defined by the application that owns the volume
4. True or False: IBM Spectrum Virtualize can estimate compression savings from the
management GUI by automatically analyzing potential storage savings.
Review answers (2 of 2)
3. Space within the allocated real capacity of a thin-provisioned volume is assigned in
what size increments driven by write activity?
A. Extent size increments as defined by the storage pool extent size
B. Grain size increments with a default grain size of 256 KB
C. Blocksize increments as defined by the application that owns the volume
The answer is grain size increments with a default grain size of 256 KB
4. True or False: IBM Spectrum Virtualize can estimate compression savings from the
management GUI by automatically analyzing potential storage savings.
The answer is True.
Summary
Overview
This module provides an overview of Easy Tier functionality and its ability to manage storage
efficiency, improve performance, and reduce total cost of ownership for production workloads.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
IBM Spectrum Virtualize Easy Tier
Easy Tier overview
Easy Tier modes of operations
Easy Tier settings
Storage Tier Advisor Tool (STAT)
Volume Mirroring
With the implementation of Easy Tier, IBM Spectrum Virtualize storage systems support flash
storage through built-in dynamic data relocation that allows host-transparent movement of data
among the internal and external storage subsystem resources.
Easy Tier automates the placement of data amongst different storage tiers by moving extents
around within a multi-tiered pool. This includes the ability to automatically and non-disruptively
relocate logical volume extents with high activity to storage media with higher performance
characteristics, while extents with low activity are migrated to storage media with lower
performance characteristics. In addition to eliminating manual intervention, it helps achieve the
best available storage performance for your workload in your environment. In this dynamically
tiered environment, data movement is seamless to the host application regardless of the storage
tier in which the data belongs. However, you can manually change the default behavior.
The usage statistics file can be off-loaded from the clustered system; you can then use the IBM
Storage Tier Advisor Tool (STAT) to create a summary report and analyze the heat data files that
are produced by Easy Tier. STAT is available at no additional cost.
IBM Easy Tier technology is designed to help improve performance at lower cost through more
efficient use of flash. The Easy Tier function automatically identifies highly active data within
volumes and moves only the active data to flash. IBM Easy Tier is supported among all IBM
Spectrum Storage products, from the lowest to the highest performing storage, allowing less
frequently accessed data to be moved to slower external storage, which can be SSD-based storage
or disk-based storage.
Easy Tier eliminates manual intervention when it assigns highly active data on volumes to faster
responding storage. In this dynamically tiered environment, data movement is seamless to the host
application regardless of the storage tier in which the data resides. The benefits can include faster
data access and throughput, better performance, and less power consumption.
Tier 0 flash is typically lower-latency flash with NVMe or high-endurance flash, while Tier 1
typically has slower latencies, such as with SSDs or older flash technology.
While we can have up to 4 tiers, we only need 2 tiers for Easy Tier to move data to the appropriate
tier to improve performance.
Easy Tier operates on a 24 hour cycle. It first collects I/O data for each extent using 5 minute
intervals, creates heat map files, analyzes the data, creates a data migration plan, and then
implements the migration plan. Note that the rate at which extents are moved is limited, so as not to
significantly impact production I/O.
Easy Tier takes advantage of the fact that I/Os are not evenly spread across the storage space an
application uses, and is characterized via an access density distribution. Access density is
measured as IOPS/GB. So, if one splits up the data space into extents, it's not uncommon to find
applications that do 90% of the I/Os in as little as 10% of the data space. Consequently, putting
that active 10% of data on flash gives 90% of the I/Os much better read latencies, improving
application performance. If every extent had the same IOPS/GB, there'd be no benefit from Easy
Tier. One can generate an access density chart from Easy Tier heat map files.
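The access density idea above can be sketched in a few lines of code. This is an illustrative example only (not IBM code); the per-extent IOPS values are hypothetical, and it simply shows how a small share of the data space can serve most of the I/O:

```python
# Illustrative sketch: compute how much of the data space serves 90% of the
# I/Os, given hypothetical per-extent IOPS measurements.
extent_iops = [500, 450] + [5] * 18  # 2 hot extents, 18 cold extents

total_iops = sum(extent_iops)

# Walk the extents from hottest to coldest, accumulating I/O coverage
covered = 0
for i, iops in enumerate(sorted(extent_iops, reverse=True), start=1):
    covered += iops
    if covered / total_iops >= 0.90:
        break

pct_space = 100 * i / len(extent_iops)
print(f"{pct_space:.0f}% of the space serves 90% of the I/Os")
```

With this skewed distribution, the two hottest extents (10% of the space) already cover 90% of the I/Os, which is exactly the situation where promoting a small amount of data to flash pays off.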
Drive technology     Tech Types
sas_ssd              tier0_flash, tier1_flash
sas_hdd              tier_enterprise
sas_nearline_hdd     tier_nearline

Tier (recognized by Easy Tier)   Tech Types
ET_Tier 1                        tier0_flash
ET_Tier 2                        tier1_flash, tier_enterprise
ET_Tier 3                        tier_nearline
Easy Tier allows tiering among four drive tiers through a 3-tier mapping supporting Flash,
Enterprise, and Nearline drive technologies. This increases the performance of the system by
supporting data movement across all tiers.
This table shows the current drive technologies, their naming conventions, and the supported tier
mappings used by Easy Tier.
• Tier 0 flash: Specifies a tier0_flash IBM FlashSystem MicroLatency module or an external
MDisk for the newly discovered or external volume. Typically, performance critical workloads
deployed on IBM FlashSystem 900 enclosures. Drives that use NVMe architecture are also
considered Tier 0 flash drives.
• Tier 1 flash: Specifies a tier1_flash (or flash SSD) drive for the newly discovered or external
volume, whose capacity can be added through IBM expansion enclosures, models 12F, 24F, and
92F.
• Enterprise tier: Specifies a tier_enterprise hard disk drive or an external MDisk for the newly
discovered or external volume. These MDisks can be built from serial-attached SCSI (SAS)
drives, typically with 10K or 15K RPM disks.
• Nearline tier: The nearline tier exists when nearline-class MDisks are used in the pool, such as
those drives built from nearline SAS drives that operate at 7200 RPM and have slower
latencies.
IBM Spectrum Virtualize Family Software version 8 builds on the established history of version 7,
providing software-defined storage capabilities across a variety of platforms. These include SAN
Volume Controller, FlashSystem V9000, Storwize V7000, and the Storwize V5000 family. The base
license that is provided with your system includes the use of its basic functions such as Easy Tier
and Easy Tier Storage Pool Balancing. However, to enable the use of the Easy Tier function on
externally attached storage enclosures such as FlashSystem 900 (standalone), and IBM SFF and
LFF expansion enclosures (12F, 24F, 92F, AFF, and A9F), an IBM Easy Tier license is required.
The number of licenses is based on the number of storage capacity units (SCUs) purchased, which
must be equal to the total number of control enclosures, expansion enclosures, and any enclosures
in any virtualized storage system. For example, if the system is made up of one control enclosure,
one expansion enclosure, and one virtualized storage system that has two enclosures, then four
licenses are required.
Administrators are responsible for purchasing extra licenses and configuring the systems within the
license agreement, which includes configuring the settings of each licensed function on the system.
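The enclosure-counting rule in the licensing example above can be expressed as a trivial calculation. This is a hypothetical helper for illustration only, not an IBM tool:

```python
# Sketch of the enclosure-based counting rule described above: one license per
# enclosure, including enclosures inside any virtualized external storage system.
def easy_tier_licenses(control, expansion, virtualized_enclosures):
    return control + expansion + virtualized_enclosures

# Example from the text: one control enclosure, one expansion enclosure, and
# one virtualized storage system that has two enclosures
print(easy_tier_licenses(control=1, expansion=1, virtualized_enclosures=2))
```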
[Table: the possible pool configurations of the four user tiers (T0 Tier Flash, T1 Tier Flash,
T2 Tier HDD, T3 Tier NL) and the Easy Tier tier (1, 2, or 3) that each user tier maps to in every
configuration. For example, in a pool with all four tiers, T0 maps to Easy Tier tier 1, T1 and T2
share Easy Tier tier 2, and T3 maps to Easy Tier tier 3.]
This table shows the possible combinations for the pool configuration with four MDisk tiers.
Easy Tier is a performance optimization mechanism that seamlessly migrates (or moves) data to
the most appropriate tier within the storage system. This table identifies how the user's four tiers
are supported within a 3-tier mapping. Note that tier1_flash and tier2_hdd "share" an Easy Tier
tier, with flash getting the default landing and the greater portion of I/O in balancing.
Easy Tier overview
Easy Tier modes of operations
Easy Tier settings
Storage Tier Advisor Tool (STAT)
Volume Exchange
[Figure: DB2 warehouse application volumes (size: 1 TB) under Easy Tier smart monitoring; four
extents identified as hot – candidates for the Flash tier.]
Easy Tier evaluation mode collects usage statistics for each storage extent for a storage pool
where the capability of moving data from one tier to the other tier is not possible or is disabled. An
example of such a storage pool is a pool of homogeneous MDisks, where all MDisks are typically
HDDs.
Easy Tier must be enabled on non-hybrid pools to collect data. The storage system monitors the
storage used at the volume extent level. Easy Tier constantly gathers and analyzes monitoring
statistics to derive moving averages for the past 24 hours. Volumes are not monitored when the
easytier attribute of a storage pool is set to off or inactive with a single tier of storage. You can
enable Easy Tier evaluation mode for a storage pool with a single tier of storage by setting the
easytier attribute of the storage pool to on.
If you turn on Easy Tier in a single-tier storage pool, it runs in evaluation mode. This means it
measures the I/O activity for all extents. A statistic summary file is created and can be off-loaded
and analyzed with the IBM Storage Tier Advisor Tool (STAT). This provides an understanding of
the benefits for your workload if you were to add Flash/SSDs to your pool, prior to any
hardware acquisition.
A summary file is created in the /dumps directory on the configuration node
(dpa_heat.node_name.date.time.data), which can be offloaded and viewed by using the IBM
Storage Tier Advisor Tool.
[Figure: a 1 TB volume in a hybrid pool with Flash and HDD tiers, divided into 1024 MB extents.]
Easy Tier automatic data placement also measures the amount of data access.
Automatic data placement is enabled by default for storage pools with more than one tier of
storage. This process also measures the amount of data access for all volumes whether the
volume is a candidate for automatic data placement. Once automatic data placement is enabled,
and if there is sufficient activity to warrant relocation, Easy Tier then acts on the measurements to
automatically place the data into the appropriate tier of a storage pool that contains both MDisk
tiers. Extents will begin to be relocated within a day after enablement. This sub-volume extent
movement is transparent to host servers and applications.
For a single-tier storage pool and for the volumes within that pool, Easy Tier creates a migration
report every 24 hours on the number of extents it would move if the pool were a multi-tiered pool.
Easy Tier statistics measurement is enabled. Using Easy Tier can make it more appropriate to use
smaller storage pool extent sizes.
A statistic summary file or ‘heat’ file generated by Easy Tier can be offloaded for input to the IBM
Storage Tier Advisor Tool (STAT). This tool produces reports on the amount of extents moved to
Flash/SSD-based MDisks and predictions of performance improvements that could be gained if
more Flash/SSD capacity is available.
You can change the rate of extent migrations performed by the Easy Tier and pool balancing
functions by changing the acceleration mode.
Normally, a system runs with the Easy Tier accelerated mode set to off, which is the system
default rate that ensures that migrations do not affect system performance.
To view the currently configured value, enter the lssystem command and check the
easy_tier_acceleration field. You can only enable Easy Tier accelerated mode from the command
line by using the chsystem -easytieracceleration on|off command. The maximum rate of extent
migrations in normal mode (the system default) is 12 GB per 5 minutes for all functions, except cold
demote, which is 1 GB every 10 minutes. If easy_tier_acceleration is set to on, the maximum
migration rate is 48 GB per 5 minutes. However, the migration rate of 48 GB per 5 minutes in
accelerated mode cannot always be guaranteed.
Accelerated mode is not intended for day-to-day Easy Tier traffic because it can temporarily
increase the workload on the system; use Easy Tier accelerated mode during periods of lower
system activity.
You can also temporarily enable the accelerated mode to rapidly use new capacity, such as:
• Adding more capacity to an existing storage pool either by adding to an existing tier or by
adding a new tier to the pool. When you enable the accelerated mode, the system can quickly
spread the existing volumes into the new capacity.
• If you are migrating multiple volumes between pools, and the target pool has more tiers than the
source pool to preserve the tier of the volume's extents. When you enable the accelerated
mode, the system can quickly take advantage of the additional tiers in the target pool.
To avoid the possibility of performance issues that are caused by overloading the managed disks,
use the accelerated mode during periods of reduced system activity.
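The difference between the two migration-rate caps quoted above can be put into perspective with a quick back-of-the-envelope calculation (the 12 GB and 48 GB per 5 minutes figures are the ones from the text):

```python
# Estimate daily migration volume from the per-5-minute migration caps
# described above for normal and accelerated Easy Tier modes.
def daily_migration_gb(gb_per_5min):
    intervals_per_day = 24 * 60 // 5  # 288 five-minute intervals per day
    return gb_per_5min * intervals_per_day

normal = daily_migration_gb(12)       # normal mode: 12 GB per 5 minutes
accelerated = daily_migration_gb(48)  # accelerated mode: 48 GB per 5 minutes

print(f"normal: {normal} GB/day (~{normal / 1024:.1f} TB)")
print(f"accelerated: {accelerated} GB/day (~{accelerated / 1024:.1f} TB)")
```

This is why accelerated mode is useful after adding capacity: it roughly quadruples the theoretical daily migration volume, although the text notes the accelerated rate cannot always be guaranteed.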
• For SAS-attached drives (also considers MDisk IOPS and bandwidth)
• Uses XML files with disk performance information based on technology and RAID configuration
[Figure: extents balanced across MDisk0, MDisk1, and MDisk2 in a pool.]
When growing a storage pool by adding more storage to it, IBM Spectrum Virtualize software can
restripe the system data in pools of storage without having to implement any manual or scripting
steps. This process is called Automated Storage Pool Balancing. Although Automated Storage Pool
Balancing can work in conjunction with Easy Tier, it operates independently and does not require
an Easy Tier license. This helps grow storage environments with greater ease while retaining the
performance benefits that come from striping the data across the disk systems in a storage pool.
Automated Storage Pool Balancing uses XML files that are embedded in the software code. The
XML files use stanzas to record the characteristics of the internal drives: the RAID levels that are
built, the width of the array, the drive types and sizes used in the array, and so on, to determine
MDisk thresholds. External virtualized LUNs are assessed based on their controller. During the
Automated Storage Pool Balancing process, it assesses the extents that are written in the pool,
and based on the drive stanzas and their IOPS capabilities, data is automatically restriped equally
across all MDisks within the pool. You can have a single-tier pool, or mix MDisks of different
drive types and capacities in the same pool. This is a performance rebalance – not an extent rebalance.
Automated Storage Pool Balancing can be disabled on the pool.
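The idea of a performance rebalance (rather than an even extent spread) can be sketched as distributing the pool's I/O load in proportion to each MDisk's capability. The IOPS capability values below are hypothetical; in the real system they come from the embedded XML stanzas:

```python
# Conceptual sketch of performance-based balancing: target each MDisk with a
# share of the pool workload proportional to its IOPS capability.
mdisk_iops_capability = {"MDisk0": 2000, "MDisk1": 2000, "MDisk2": 1000}

total_capability = sum(mdisk_iops_capability.values())
pool_load_iops = 3000  # hypothetical current pool workload

target_load = {
    name: pool_load_iops * cap / total_capability
    for name, cap in mdisk_iops_capability.items()
}
for name, load in target_load.items():
    print(f"{name}: target ~{load:.0f} IOPS")
```

Note how the slower MDisk2 is assigned only half the load of the faster MDisks, which is what "performance rebalance, not extent rebalance" means in practice.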
[Figure: application workloads such as a DB2 warehouse, backups, user directories, and
multimedia.]
I/Os with sizes larger than 64 KB are not considered the best use. This data is not “hot”.
The common question is where should the Flash drives and Easy Tier function be deployed in your
environment? There are several areas to be considered when determining where the Easy Tier
feature and the Flash drives can provide the best value to our clients. If the environment is one
where there is a significant amount of very small granularity striping, such as Oracle or DB2
tablespace striping, the benefit of Easy Tier may be significantly reduced. In these cases there
may be less benefit from smaller amounts of SSDs, and it may not be economical to
implement an Easy Tier solution. Therefore, you should test the application platform before fully
deploying Easy Tier into your IBM Spectrum Virtualize system environment.
This topic discusses the characteristics of the various Easy Tier settings.
Pool setting   Number of tiers   Volume copy setting   Volume copy Easy Tier status
auto           one               on                    balanced (4)
auto           two               off                   measured (3)
auto           two               on                    active (5)
on             one               off                   measured (3)
on             one               on                    balanced (4)
on             two               off                   measured (3)
on             two               on                    active (5)
See notes for number references
This table provides a summary of Easy Tier settings for pools, volumes and volume copy. The
Volume copy Easy Tier setting column indicates independent settings for non-mirrored volumes, or
for each copy of a mirrored volume. The rows highlighted in yellow are the default settings. Also
observe the reference numbers that are annotated in the Volume copy Easy Tier status:
1. If the volume copy is in image or sequential mode or is being migrated then the volume copy
Easy Tier status is measured instead of active.
2. When the volume copy status is inactive, no Easy Tier functions are enabled for that volume
copy.
3. When the volume copy status is measured, the Easy Tier function collects usage statistics for
the volume but automatic data placement is not active.
4. When the volume copy status is balanced, the Easy Tier function enables performance-based
pool balancing for that volume copy.
5. When the volume copy status is active, the Easy Tier function operates in automatic data
placement mode for that volume.
Keep in mind that automated storage pool balancing is not active unless Easy Tier is on/auto for a
storage pool.
[Figure: Flash/SSD array above Enterprise (ENT) disk; active data migrates up, less active data
migrates down.]
• After the 24-hour learning period, data is moved automatically between tiers:
  – Hottest extents moved up
  – Coldest extents moved down
• New volume allocations use extents from Tier 1 (Enterprise disk) by default
  – If there is no free Tier 1 capacity, then Tier 2 is used if available; otherwise capacity
    comes from Tier 0
• Easy Tier can operate as long as one extent is free in the pool
Once multiple tiers are placed in a pool, Easy Tier is automatically enabled. IBM Easy Tier can be
enabled on a volume basis to monitor the I/O activity and latency of the extents over a 24-hour
period. Because this type of volume data migration works at the extent level, it is often referred to
as sub-LUN migration.
The concept of Easy Tier is to transparently move data up and down, unnoticed from the host and
user point of view. When new volumes are created, they are placed by default on the Enterprise or
middle Tier 1. If Tier 1 has reached its capacity, then the next lowest tier, Tier 2, is used. If
all tiers are full, only then will it allocate extents from Tier 0. Easy Tier will then automatically start
migrating those extents (hot or cold) based on the workload. As a result of extent movement the
volume no longer has all its data in one tier but rather in two or three tiers.
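The default allocation order described above (Tier 1 first, then Tier 2, and only then Tier 0) can be sketched as a simple preference lookup. The free-extent counts are hypothetical, and this is an illustration of the rule, not an IBM API:

```python
# Sketch of the default new-volume allocation preference: Enterprise Tier 1
# first, then Tier 2, and Tier 0 only when the other tiers are full.
def pick_tier(free_extents):
    for tier in ("tier1", "tier2", "tier0"):  # preference order from the text
        if free_extents.get(tier, 0) > 0:
            return tier
    return None  # pool is full

print(pick_tier({"tier0": 50, "tier1": 10, "tier2": 40}))  # tier1
print(pick_tier({"tier0": 50, "tier1": 0, "tier2": 40}))   # tier2
print(pick_tier({"tier0": 50, "tier1": 0, "tier2": 0}))    # tier0
```

After allocation, Easy Tier then migrates the hot and cold extents to their appropriate tiers based on the observed workload.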
Auto rebalance
• Redistributes extents within a tier to balance utilization across MDisks for maximum performance
• Either moves or swaps extents
[Figure: extents rebalanced within the Nearline tier.]
When Easy Tier is enabled, it determines the right storage media for a given extent based on the
extent heat and resource utilization. Easy Tier uses the following extent migration types to perform
actions between the three different storage tiers.
• Promote
▪ Moves the relevant hot extents to higher performing tier
• Swap
▪ Exchange cold extent in upper tier with hot extent in lower tier
• Warm Demote
▪ Prevents performance overload of a tier by demoting a warm extent to the lower tier
▪ This action is based on predefined bandwidth or IOPS overload thresholds. Warm demotes
are triggered when bandwidth or IOPS exceeds those predefined thresholds. This allows
Easy Tier to continuously ensure that the higher-performance tier does not suffer from
saturation or overload conditions that might affect the overall performance in the extent
pool.
▪ Warm promote looks at the latest 5-minute statistics and can queue up to 21 extents. If an
extent is over-burdened, and assuming there is free space in the higher tier, that extent is
queued for promotion to the higher tier.
• Demote or Cold Demote
▪ Easy Tier Automatic Mode automatically locates and demotes inactive (or cold) extents that
are on a higher performance tier to its adjacent lower-cost tier.
▪ Once cold data is demoted, Easy Tier automatically frees extents on the higher storage tier.
This helps the system to be more responsive to new hot data.
• Expanded Cold Demote
▪ Demotes appropriate sequential workloads to the lowest tier to better utilize Nearline disk
bandwidth
• Storage Pool Balancing
▪ Redistribute extents within a tier to balance utilization across MDisks for maximum
performance
▪ Moves hot extents from high utilized MDisks to low utilized MDisks
▪ It attempts to migrate the most active volume extents up to Flash/SSD first.
• A previous migration plan and any queued extents that are not yet relocated are abandoned.
Extent migration occurs only between adjacent tiers. In a three-tier storage pool, Easy Tier will not
move extents from Flash/SSD directly to NL-SAS, or vice versa, without moving them first to SAS
drives.
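The migration types above can be summarized as a simplified decision rule. The IOPS thresholds here are hypothetical, and the real Easy Tier logic additionally considers bandwidth, queue limits, and the adjacent-tier rule; this sketch only illustrates the intent of promote, warm demote, and cold demote:

```python
# Simplified decision sketch for the extent migration types described above.
HOT_IOPS, COLD_IOPS = 100, 5  # assumed per-extent activity thresholds

def migration_action(extent_iops, tier, upper_tier_overloaded):
    if tier == "lower" and extent_iops >= HOT_IOPS and not upper_tier_overloaded:
        return "promote"          # hot extent moves to the higher tier
    if tier == "upper" and upper_tier_overloaded and extent_iops < HOT_IOPS:
        return "warm demote"      # relieve an overloaded higher tier
    if tier == "upper" and extent_iops <= COLD_IOPS:
        return "cold demote"      # inactive extent frees higher-tier capacity
    return "stay"

print(migration_action(250, "lower", False))  # promote
print(migration_action(2, "upper", False))    # cold demote
print(migration_action(50, "upper", True))    # warm demote
```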
IBM Easy Tier uses an automated data placement (ADP) plan, which involves scheduling and the
actual movement or migration of the volume's extents up to, or down from, the highest disk tier.
This involves collecting I/O statistics in five-minute intervals on all volumes within the tiered pool.
Based on the performance log after the 24-hour learning period, Easy Tier uses the data migrator
(DM) to create an extent migration plan and dynamically moves extents among the tiers based on
its suggestions and the heat of the extents. Therefore, high-activity or hot extents are moved to a
higher disk tier, such as Flash or SSD, within the same storage pool. It also moves extents whose
activity has dropped off, or cooled, from a higher-tier MDisk back to a lower-tier MDisk.
The extent migration rate is capped so that a maximum of up to 40 MBps is migrated, which
equates to approximately 3.4 TB per day migrated between disk tiers.
• The default Easy Tier setting for a storage pool is Auto, and the default Easy Tier setting for a
volume copy is On (for a single tier pool). This means single tier pools have automated storage
pool balancing in effect, while multi-tier pools have active data placement across tiers in effect
for striped VDisks.
• If the single tier pool Easy Tier setting is changed to On, the pool Easy Tier status would
become active and the volume copy Easy Tier status would remain balanced.
• With the default pool Easy Tier setting of Auto and the default volume Easy Tier setting of On
(for a two-tier or hybrid pool) this causes the pool and the volume Easy Tier status to become
Active. Easy Tier automatic data placement becomes Active automatically.
• The Easy Tier heat file is generated and continually updated as long as Easy Tier is active for a
storage pool.
Before Easy Tier 3, the system could overload an MDisk by moving too much hot data onto a single
MDisk. Easy Tier 3 understands the “tipping point” for an MDisk and stops migrating extents – even
if there is spare capacity available on that MDisk. The Easy Tier overload protection is designed to
avoid overloading any type of drive with too much work. To achieve this, Easy Tier needs to have
an indication of the maximum capability of a managed disk.
This maximum can be provided in one of two ways:
• For an array made of locally attached drives, the system can calculate the performance of the
managed disk because it is pre-programmed with performance characteristics for different
drives.
• For SAN-attached external MDisks, the system cannot calculate the performance capabilities,
so the system has a number of predefined levels that can be configured manually for each
managed disk. This is called the Easy Tier load parameter (low, medium, high, very_high).
If you analyze the statistics and find that the system doesn't appear to be sending enough IOPS to
your SSDs, you can increase the workload by using chmdisk with the -easytierload
parameter.
This topic highlights the features of the IBM Storage Tier Advisor Tool (STAT), which is used to
analyze the heat data of storage pools and volumes.
The IBM Storage Tier Advisor Tool (also known as STAT) is a Microsoft Windows application that
analyzes heat data files produced by Easy Tier and produces a graphical display of the amount of
"hot" data per Volume. In addition, it predicts how adding Flash/SSD capacity to the measured
storage pool could benefit system performance. It can also provide low response time requirements
on Flash/SSDs while targeting HDDs for “cooler” data that is accessed more often sequentially and
at lower I/O rates.
The STAT tool can be downloaded from the IBM support website. You can also do a web search on
‘IBM Easy Tier STAT tool’ for a more direct link. Download the STAT tool and install it on a Windows
workstation. The default directory is C:\Program Files\IBM\STAT.
IBM Storage Tier Advisor Tool can be downloaded at:
http://www-01.ibm.com/support/docview.wss?uid=ssg1S4000935. You will need an IBM ID to
proceed with the download.
STAT creates a set of Hypertext Markup Language (HTML) files, and the user can then open the
index.html file in a browser to view the results.
Import CSV files using the IBM Storage Tier Advisory Tool Charting Utility from IBM Techdocs:
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS5251
The STAT tool also creates three comma-separated values (CSV) files and places them in the
Data_files folder:
• <panel_name>_data_movement.csv
• <panel_name>_skew_curve.csv
• <panel_name>_workload_ctg.csv
These files contain a large amount of information pertaining to your captured data and may be
used as input data for other utilities. The best way to start investigating this data is to use the IBM
Storage Tier Advisory Tool Charting Utility from IBM techdocs:
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/PRS5251.
This tool imports the three CSV files into Excel and automatically draws the three most interesting
charts. It also contains a tab called Reference, which explains all of the terms used
in the graphs, as well as providing a useful reminder about the different types of data migration in
Easy Tier.
Iometer
• Iometer is an I/O workload generation tool that runs on Windows to characterize disk or
network I/O performance
• Installing Iometer:
ƒ Download the Iometer package and uncompress the files
ƒ Place the Iometer.exe and the Dynamo.exe files in the same directory
ƒ Dynamo.exe is the Iometer workload generator and must be installed on all systems being
analyzed
ƒ Dynamo must remain running throughout the entire process
[Figure: Iometer analyzing a disk.]
Iometer is an I/O subsystem measurement and characterization tool for clustered systems to
analyze disk or network I/O workloads.
To install Iometer, download the package and uncompress the files. Iometer has two basic
components, Iometer and Dynamo. Both files must be installed in the same directory. With the start of the
Iometer tool, a Dynamo executable file is generated and placed in the Windows task bar. Dynamo
is basically the Iometer workload generator that performs the disk I/O operations, records the
performance information, and returns the data to the Iometer. Iometer provides the reporting of the
results in a management GUI.
If you are analyzing multiple systems, you need to install a Dynamo file on each system, which
requires network configuration. If Iometer and Dynamo are run on the same system, no network is
required.
Configuring Iometer (1 of 2)
• Topology panel:
ƒ Managers (name of the local system)
ƒ Select Workers (manager’s available disk drives from the Disk Target Tab)
í Worker represents logical drives that are mounted to the host
í Blue icon is physical drives (only shown if the drives are un-partitioned)
í Yellow icon with a slash requires preparation before testing
ƒ Iometer uses a sum of values (Maximum Disk Size + Starting Disk Sector controls) as
an upper bound on the size of iobw.tst.
Iometer creates an
iobw.tst file on each
volume to generate I/Os
The Iometer uses a graphical user interface that allows you to specify parameters to configure an
I/O workload that is not evenly balanced across the data space.
To get started, you need to create an I/O workload by assigning worker processes to specific
selected logical drives. Iometer by default opens eight disk workers. Iometer recognizes two different volume
types:
• Blue icons represent physical drives; they are only shown if they have no partitions on them.
• Yellow icons represent logical (mounted) drives; which are only shown if they are writable. A
yellow icon with a red slash through it means that the drive needs to be prepared before the test
starts. Once you have specified the workers, you can disconnect the remaining workers.
When preparing an unprepared logical drive, Iometer uses a sum of the values (Maximum Disk
Size + Starting Disk Sector controls) as an upper bound on the size of iobw.tst file that will be
generated on each disk being tested.
Configuring Iometer (2 of 2)
To obtain the desired results, you can modify the settings for how the disk is accessed by using
the Access Specification. The default is 2 KB random I/Os with a mix of 67% reads and 33%
writes, which represents a typical database workload. You can leave it alone or change it. The
Results Display tab allows you to set the Update Frequency between 1 second and 60 seconds.
For example, if you set the frequency to 10 seconds, the first test results appear in the
Results Display tab, and they are updated every 10 seconds after that.
Once you have the specifications set, press the Start Tests button (green flag). A standard Save
File dialog appears. Select a file to store the test results (default results.csv). Press the Stop Test
button (stop sign), and the final results are saved in the results.csv file.
Once Easy Tier is active, a heat map is generated or updated approximately every 24 hours. Heat
maps are presented in the management GUI on the configuration node of the cluster:
/dumps/easytier/dpa_heat.serial-1.yymmdd.hhmmss.data.
Any existing heat data file is overwritten whenever a new heat data file is produced; however, it will
include all the information to date for all measured pools and volumes.
Heat map files are located in the /dumps directory on the configuration node, which can be
downloaded using the Settings > Support > Support package. From there you will need to click
Manual Upload Instructions, Download Support Package, and then select Download Existing
Package.
The heat map file needs to be off-loaded by the user, and then the Storage Tier Advisor Tool is
invoked from a Windows command prompt console with the file specified as a parameter. The
user can also specify the output directory. Any existing heat data file is erased after it has
existed for longer than 7 days.
The program can also be invoked from the CLI by using the PuTTY secure copy (PSCP) tool with
the heat file name specified. Ensure the heat file is in the same directory as the STAT program
when invoking it from the CLI. The Storage Tier Advisor Tool creates a resulting index.html file to
view the results through a supported browser. The Firefox 27, Firefox ESR 24, Chrome 33, and IE 10 browsers are
supported. The file is stored in a folder called Data_files in either the current directory or the
directory where STAT is installed. The output index.html file can then be opened with a web
browser.
[Figure: Flash = Tier 0, Enterprise (ENT) = Tier 1, Nearline (NT) = Tier 2.]
It is recommended to keep some free extents within a pool in order for Easy Tier to function. This
will allow Easy Tier to move the extents between tiers as well as move extents within the same tier
to load-balance the MDisks within that tier, without delays or performance impact.
• Easy Tier will work using only one free extent; however, it will not work as efficiently.
• Easy Tier works more efficiently with a number of free extents equal to one extent times the
number of MDisks in the storage pool, plus 16.
• The Easy Tier heat map is updated every 24 hours for moves between tiers. Performance
rebalancing within a single tier (even in a hybrid pool) is evaluated and updated much more
often; the system rebalances on an hourly basis.
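The free-extent guideline above is easy to express as a calculation. This is a hypothetical helper for illustration of the rule in the bullet list, not an IBM tool:

```python
# Sketch of the free-extent guideline quoted above: one free extent per MDisk
# in the storage pool, plus 16.
def recommended_free_extents(num_mdisks):
    return num_mdisks * 1 + 16

# For example, a pool with 8 MDisks:
print(recommended_free_extents(8))
```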
Keywords
• Automatic data placement mode
• Cold promote
• Data relocation
• Drive use attributes
• Easy Tier
• Flash
• Hot promote
• Iometer
• Storage Pool Recommendation report
• Storage Tier Advisor Tool (STAT)
• System Summary report
• Warm promote
Review questions (1 of 2)
1. What are three tier levels supported using Easy Tier Technology?
2. True or False: When the easy_tier_acceleration is set to on, the maximum migration
rate is 12 GB per 5 minutes.
Review answers (1 of 2)
1. What are three tier levels supported using Easy Tier Technology?
The answer is Flash tier, flash-based and SAS Enterprise tier, and Nearline SAS tier.
2. True or False: When the easy_tier_acceleration is set to on, the maximum migration
rate is 12 GB per 5 minutes.
The answer is False. If easy_tier_acceleration is set to on, the maximum migration rate is 48
GB per 5 minutes.
Review questions (2 of 2)
3. Space within the allocated real capacity of a thin-provisioned volume is assigned in
what size increments driven by write activity?
A. Extent size increments as defined by the storage pool extent size
B. Grain size increments with a default grain size of 256 KB
C. Blocksize increments as defined by the application that owns the volume
4. True or False: Easy Tier can collect and analyze workload statistics even if no SSD-
based MDisks are available.
Review answers (2 of 2)
3. Space within the allocated real capacity of a thin-provisioned volume is assigned in
what size increments driven by write activity?
A. Extent size increments as defined by the storage pool extent size
B. Grain size increments with a default grain size of 256 KB
C. Blocksize increments as defined by the application that owns the volume
The answer is B: grain size increments with a default grain size of 256 KB.
4. True or False: Easy Tier can collect and analyze workload statistics even if no SSD-
based MDisks are available.
The answer is true.
Summary
Overview
This module provides an overview of the data migration concept and examines the data migration
options provided by the IBM Spectrum Virtualize Software to move data across IBM storage
systems built on the IBM Spectrum Virtualize managed infrastructure.
This module does not focus on a specific IBM storage product but on data migration as it applies to
all IBM Spectrum Storage products built with the IBM Spectrum Virtualize software.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
Data migration overview
Data migration options
• Pool-to-pool migration
• Import wizard
• Export wizard
• System migration
• Volume mirroring
In this topic, we will review the concept of data migration and look at the several ways in which
data migration can be performed. We will begin with the data migration concept.
Data migration
(Figure: data migration scenarios — migration for data archival and migration for system
updates/maintenance, with an attached host.)
IBM Spectrum Virtualize systems can virtualize capacity from both new and existing storage
systems. The virtualized storage system delivers these functions in a homogeneous way on a
scalable and highly available platform over any attached storage and to any attached server.
There are two aspects to data migration. One is to move data from a non-IBM storage environment
to an IBM virtualized storage environment (and vice versa). The other is to move data within the
IBM storage managed environment.
While host-based data migration software solutions are available, the IBM storage system import
capability can be used to move large quantities of non-IBM storage managed data under IBM
storage control in a relatively small amount of time.
Virtualizing existing disk subsystems and their existing volumes behind the IBM Spectrum
Virtualize storage system (and vice versa) involves an interruption of host or application access to
the data. Moving data within the IBM storage environment is not disruptive to the host and the
application environment.
Data migration is the process of transferring data between storage systems. Data migration is a key
process for any system implementation, upgrade, or consolidation.
For volumes managed by the IBM Spectrum Virtualize storage system such as SVC, FlashSystem
V9000 or Storwize V7000, the mapping of volume extents to MDisk extents can be dynamically
modified without interrupting or affecting a host’s access to these volumes. Most implementations
allow for this to be done in a manner that is transparent to the host while it continues to run
applications and perform I/O.
In addition, migration of existing data to IBM storage management takes place without data
conversion and movement. Once under clustered system management, transparent data migration
allows existing data to gain the benefits and flexibility of data movement without application
disruption.
(Figure: volume extents 1a-3c in Storage PoolA are copied, in 16 MB chunks, to extents in
Storage PoolB; each pool is backed by RAID-5 LUNs presented by a RAID controller.)
Since the volume represents the mapping of data extents rather than the data itself, the mapping
can be dynamically updated as data is moved from one extent location to another.
Regardless of the extent size the data is migrated in units of 16 MB. During migration the reads and
writes are directed to the destination for data already copied and to the source for data not yet
copied.
A write to the 16 MB area of the extent that is being copied (most likely due to IBM storage cache
destaging) is paused until the data is moved. If contention is detected in the back-end storage
system that might impact the overall performance of the IBM storage system, the migration is
paused to allow pending writes to proceed.
Once an entire extent has been copied to the destination pool, the extent pointer is updated and the
source extent is freed.
For data to migrate between storage pools, the extent size of the source and destination storage
pools must be identical.
(Figure: storage system migration — MDisks, presented as SCSI LUNs by RAID-5 arrays on two
RAID controllers, back the volume extents that are migrated from one storage system to the other.)
The volume migration (migratevdisk) function of the IBM storage system enables all the extents
associated with one volume to be moved to MDisks in another storage pool.
One use for this function is to move all existing data that is mapped by volumes in one storage pool
of a legacy storage system to another storage pool of another storage system. The legacy storage
system can then be decommissioned without impact to accessing applications.
Another example of usage is enabling the implementation of a tiered storage scheme using multiple
storage pools. Lifecycle management is facilitated by migrating aged or inactive volumes to a
lower-cost storage tier in a different storage pool.
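On the CLI, the volume migration described above can be started and monitored as in the following sketch; the volume and pool names are hypothetical:

```shell
# Move all extents of volume DBVOL01 to the MDisks of pool Tier2Pool.
# The host continues to perform I/O while extents are relocated.
migratevdisk -mdiskgrp Tier2Pool -vdisk DBVOL01

# Check the progress of all active migrations.
lsmigrate
```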
(Figure: the 800 MB image mode volume BLUDATV, backed by MDisk BLUDATA, is migrated to
striped extents across the 300 GB managed mode MDisks BLUE1-BLUE4.)
Image type volumes have the special property that their last extent might be a partial extent. The
migration function of the IBM virtualized storage system allows one or more extents of the volume
to be moved, changing the volume from the image to the striped virtualization type. Several
methods are available to migrate an image type volume to striped.
If the image type volume is in a storage pool that is set aside for image type volumes only, the
administrator typically migrates the image mode VDisk to a striped VDisk in another pool. You can
choose to do this automatically as part of the migration process. Image volumes often are not
multiples of the pool extent size, in which case the last partial extent of the volume is converted to a
full extent.
Migrate to image
(Figure: the 800 MB volume BLUDATV is migrated back to image mode; its extents are collocated
onto a single MDisk, which can afterward be removed to unmanaged mode.)
An export option (migratetoimage) is available to reverse the migration from the virtualized realm
back to non-virtualized. Data extents associated with a striped type volume are collocated to an
empty or unmanaged destination MDisk. The volume is returned to the image virtualization type
with its destination MDisk placed in image access mode.
The image volume can then be deleted from IBM storage management causing its related MDisk or
SCSI LUN to be removed from the storage pool and set in unmanaged access mode. The SCSI
LUN can then be unassigned from the IBM storage system ports and assigned directly to the
original owning host using the storage system’s management interfaces.
The migrate to image function also allows an image type volume backed with extents of an MDisk
in one storage pool to be backed by another MDisk in the same or different storage pool while
retaining the image virtualization type.
In essence, the volume virtualization type is not relevant to the migrate to image function. The
outcome is one MDisk containing all the data extents for the corresponding volume.
The extent migration (migrateexts) function is used to move data of a volume from extents
associated with one MDisk to another MDisk within the same storage pool without impacting host
application data access.
When the Easy Tier function causes extents of volumes to move from HDD-based MDisks to
SSD-based MDisk of a pool, migrateexts is the interface used for the extent movement.
When an MDisk is to be removed from a storage pool and that MDisk contains allocated extents,
a forced removal of the MDisk causes the data associated with those extents to be implicitly
migrated to free extents among the remaining MDisks within the same storage pool.
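These two extent-level operations map to CLI commands roughly as in this sketch; the MDisk, pool, and volume names are hypothetical:

```shell
# Move 16 extents of volume DBVOL01 from mdisk4 to mdisk7
# within the same storage pool.
migrateexts -source mdisk4 -target mdisk7 -exts 16 -vdisk DBVOL01

# Force-remove mdisk4 from Pool0; its allocated extents are first
# migrated to free extents on the remaining MDisks in the pool.
rmmdisk -mdisk mdisk4 -force Pool0
```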
Data migration options:
• Import wizard — create an image mode volume from an MDisk with existing data, then migrate it
to striped (new volume, existing data)
• Export to image mode — migrate a striped volume's extents to a single MDisk for export
• Migration wizard — import multiple volumes with existing data and map them to a host
A wealth of data migration options is provided by the IBM virtualized storage system. We will
examine each of these options listed as we explore data migration.
Virtualization is a valuable technology for helping you get the most out of your IT investments. This
shows the high level steps to virtualize storage behind an IBM Spectrum Virtualize storage system.
Preparation is critical to data access: to avoid access and performance issues, you must ensure
that your SAN-attached storage systems and switches are correctly configured to work efficiently
with symmetric virtualization.
The application outage is generally short, typically lasting between 5 and 15 minutes depending on
the number of resources in the configuration. It can be minimized further by scripting parts of the
process. Alternatives do exist to avoid application outages entirely when no downtime is
acceptable in a high availability cluster, using host-based data migration from the backend to the
clustered storage. However, host-based data migration requires additional storage space on the
clustered system. The procedure covered here does not require any additional storage.
Eventually the backend storage is reconfigured according to best practices for use behind a
Spectrum Virtualize clustered system, which evenly balances the use of all its resources and
maximizes its performance as a result.
You can move a volume to a different storage pool only if the destination pool has enough free
space to hold volume data. If the pool does not have enough space, the volume will not move.
Before performing a volume migration, you need to ensure that the volumes are in good status.
Migrating a volume to another pool also means its extents (the data that belongs to this volume) are
moved (actually copied) to another pool. The volume itself remains unchanged from the host’s
perspective.
Migrating a volume to another pool can be invoked by clicking Volumes > Volumes by Host and
selecting the desired host in the Host Filter list. Right-click the volume entry and then select
Migrate to Another Pool from the menu list.
A list of storage pools eligible to receive the volume copy extents is displayed. The GUI only
displays target pools with the same extent size as the source pool and only if these pools have
enough free capacity needed for the incoming extents. Once you have selected a target pool, the
management GUI generates the migratevdisk command which causes the extents of the volume
copy to be migrated to the selected target storage pool.
A volume might potentially have two sets of extents, typically residing in different pools. The
granularity of volume migration is at the volume copy level. Therefore the more technically precise
terminology for a volume is actually a volume copy. Data migration occurs at the volume copy level
and migrates all the extents associated with one volume copy of the volume.
IBMcluster:superuser>lsvdiskextent MDATA
id number_extents
0 2
1 2
2 2
(Figure: the extents of volume MDATA, mapped to host AIX_CHIPSV, are spread two per MDisk
across three DSK MDisks in the NL pool.)
As an extent is copied from the source pool to the destination pool the extent pointer for the volume
is updated to that of the destination pool. The extent in the source pool becomes free space eligible
to be reassigned to another volume.
Due to the IBM storage system implementation of volume extent pointers the volume migration is
totally transparent to the host. Nothing has changed from a host perspective. The fact that the
copied volume extents are now sourced by another pool is totally transparent and inconsequential
to the attaching host. I/O operations proceed as normal during the data migration.
Once the volume copy’s last extent has been moved to the target pool then the volume’s pool name
is updated to that of the target pool.
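Completion of the migration can be checked from the CLI; the volume name below is hypothetical:

```shell
# After the last extent moves, the volume reports the target pool.
lsvdisk DBVOL01        # check the mdisk_grp_name field

# Show which MDisks now source the volume's extents.
lsvdiskextent DBVOL01
```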
This topic reviews how to use the Import Wizard to bring a volume that contains existing data under
an IBM virtualized storage system control as an image type volume. Finally, we will examine the list
of procedures to be completed once the volume has been migrated to its new pool.
(Figure: a host with application LUNs (APPLUN) being brought under IBM storage system
management.)
Image mode simplifies the transition of existing data from a non-virtualized to a virtualized
environment without requiring physical data movement or conversion. This method involves the
external storage that is currently presenting data to a host server, and the IBM Spectrum Virtualize
system in which the data will be migrated under its management for host access.
Best practice: Have a separately defined storage pool set aside to house
SCSI LUNs containing existing data.
Image mode volumes are special volumes that create a direct mapping between the extents that are
on the MDisk and the extents that are on the volume. Therefore, the logical block address (LBA) x
on the MDisk is the same as the LBA x on the volume, which ensures that the data on the MDisk is
preserved as it is brought into the clustered system. The capacity of image mode volumes is equal
to the capacity of the MDisk from which it is created. Image mode volumes have a minimum size of
1 block (512 bytes) and always occupy at least one extent.
Some functions are not available for image mode volumes. As a best practice, have a separately
defined storage pool set aside to house SCSI LUNs containing existing data.
Use the image type volume attribute to securely bring that data under the cluster virtualized
management. After the migration completion, the MDisk becomes a managed MDisk.
(Figure: host LUNs APP1DB and APP1LOG, mounted as drives D and E, are unmapped from the
host and presented to the IBM storage system, where they are detected as MDisks 0 and 1.)
Before LUNs can be imported from an external storage subsystem into the virtualized IBM
Spectrum Virtualize environment, the LUN being imported has to be unassigned from the host in
the storage box. The application that had been using that LUN obviously has to take an outage.
The LUN then needs to be reassigned from the external storage to the IBM system cluster for
management. The LUN is detected as an MDisk on the system cluster, initially as an external
storage unmanaged mode MDisk.
(Figure: LUNs APP1DB and APP1LOG appear in the external storage view as MDisks 0 and 1.)
If the external controller is not listed in the Pools > External Storage view, you will need to perform
a SAN device discovery by selecting Action > Discover storage. The GUI will issue the
detectmdisk command to cause the IBM virtualized storage system to perform Fibre Channel
SAN device discovery.
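The same discovery can be run directly from the CLI, as a sketch:

```shell
# Rescan the Fibre Channel SAN for new back-end LUNs.
detectmdisk

# List only the newly discovered, unmanaged MDisks.
lsmdisk -filtervalue mode=unmanaged
```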
There is no interface for the IBM storage to discern if this MDisk contains free space or existing
data. You will need to confirm that the correct external storage volume had been discovered by
examining the details of the MDisk properties.
It is best practice to rename the MDisk to clarify its identity or to match the existing name of the
LUN being imported from the external storage system. When renaming multiple volumes at the
same time, you can correlate the volumes on the system cluster with the LUNs on the backend
storage subsystem, or with the disks on the hosts, by using UDIDs, and sometimes by the LUN
sizes when they are unique.
(Figure: the Import wizard turns MDisks 0 and 1 into image mode volumes APP1DB and
APP1LOG; a checkbox indicates whether the volume contains Copy Services data, and the
migration pool extent size is 1024 MB.)
There are two methods in which you can import existing data:
1. Import to temporary pool as image-mode volume option allows you to virtualize existing
data from the external storage system without migrating the data from the source MDisk (LUN)
and then present them to host as image mode volume. This data will become virtualized by the
system cluster while remaining on the existing back end storage subsystem original LUN.
2. Migrate to existing pool option allows you to create an image mode volume and start migrating
the data to the selected storage pool. This frees up the original backend LUN so that its storage
space can be reclaimed. This facilitates reconfiguring the backend storage to a more optimal
configuration for virtualization and balanced use of its resources.
To start the import to image mode process, right click an unmanaged MDisk that correlates to the
external storage LUN and select Import from the menu. The import wizard guides you through a
quick import process to bring the volume’s existing data under IBM storage management.
The default name given to the volume created by the Import wizard is a concatenation of the
storage system name followed by the MDisk LUN number. As a best practice, rename the volume
to a more descriptive name, typically to identify it as being used by its assigned host.
Next, you will need a migration pool large enough to hold the LUN being migrated. This example
shows that the Import option is used and no existing storage pool is chosen, therefore a temporary
migration pool with an extent size of 1024 MB is created to hold the new image-mode volume.
(Figure: MDisks 0 and 1 become image mode volumes APP1DB and APP1LOG in
MigrationPool_1024 and are then migrated to striped volumes.)
During this process, the MDisk transitions from unmanaged mode to image mode, and the image
type volume is created in the MigrationPool, where it becomes virtualized. Migrating the image
volume to striped is performed later, outside the control of the Import wizard.
The MigrationPool_1024 is normally used as a vehicle to migrate data from existing external LUNs
into storage pools, either located internally or externally, on the IBM storage system. You should not
use image-mode volumes as a long-term solution because the backend storage subsystem should
be reconfigured to facilitate balanced use of its resources, following best practices for the specific
disk subsystem.
The Wizard generates several tasks. It first creates a storage pool called MigrationPool_1024 using
the same extent size (-ext 1024) as the intended target storage pool.
The mkvdisk command is used to concurrently perform two functions. It places the DS3K MDisk
into the MigrationPool_1024 and at the same time creates an image mode volume based on this
MDisk. At this point, there is a one-to-one relationship between the MDisk and the volume. This
volume’s extents are all sourced from this MDisk. The MDisk has an access mode of image and the
volume has a virtualization type of image. You will notice there is no reference to the volume’s
capacity as it is implicitly derived from the capacity of its MDisk.
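The commands the wizard generates look roughly like the following sketch; the MDisk and volume names are hypothetical:

```shell
# Create the temporary migration pool with a 1024 MB extent size.
mkmdiskgrp -name MigrationPool_1024 -ext 1024

# Place the unmanaged MDisk into the pool and, in the same step,
# create an image mode volume on top of it. No capacity is given;
# it is derived from the MDisk itself.
mkvdisk -mdiskgrp MigrationPool_1024 -iogrp 0 -vtype image \
        -mdisk mdisk25 -name APP1DB
```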
(Figure: the imported volumes APP1DB and APP1LOG are mapped from the IBM storage system
to the host; the host's drives D and E come back online, now served through the cluster while the
data remains on the external storage subsystem's LUNs.)
In terms of host access to the existing data, as soon as the mkvdisk command completes, the
volume can be mapped to the host object that was previously using the data that the MDisk now
contains.
The administrator uses the GUI to map the VDisk to the host, generating a mkvdiskhostmap
command. The host administrator then scans for disks, resulting in a disk being configured on the
host. If the host disk shows offline, you will need to bring it back online. Then host applications
can be restarted.
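The mapping step corresponds to a single CLI command; the host and volume names in this sketch are hypothetical:

```shell
# Map the imported volume to the host that previously owned the data.
# -scsi sets the SCSI LUN ID presented to the host (optional).
mkvdiskhostmap -host APP1HOST -scsi 0 APP1DB
```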
Generally, the image mode volume is then migrated to striped mode in another pool, to facilitate
balanced use of resources and to take advantage of other IBM Spectrum Virtualize software
features.
If you need to preserve existing data on the unmanaged MDisks, do not assign them to the pools
because this action deletes the data.
To virtualize the storage on an image mode volume, the volume needs to be transformed into a
striped volume. This process migrates the data on the image mode volume to managed-mode disks
in another storage pool. Issue the migratevdisk command to migrate an entire image mode
volume from one storage pool to another storage pool.
The IBM storage system attempts to distribute extents of the volume across all MDisks of the pool.
All extents of the MDisk have been freed. Remember that the MDisk's access mode became
managed when the migratevdisk process began.
Delete MigrationPool
• Volume and MigrationPool_1024 can now be deleted
ƒ You have the option to keep the MigrationPool for subsequent imports
• MDisk returns to unmanaged mode
• LUN can now be unassigned from the IBM storage system host group to prevent future
detection
Having migrated the volume data from the original LUN to a new storage pool, the MDisk and
the temporary MigrationPool_1024 storage pool are no longer needed.
To finalize the import migration, the image type volume is deleted, and its corresponding MDisk is
automatically removed from the storage pool. The empty MigrationPool_1024 can either be deleted
or kept for subsequent imports. The data migration to the IBM storage system is done.
The MDisk is then presented as unmanaged and is no longer used by the IBM storage system.
The backend storage administrator will typically unmap the LUN from the storage system cluster
and reclaim the space. This prevents the LUN from being detected the next time the system cluster
performs SAN device discovery. Consequently the system cluster removes the MDisk entries from
its inventory.
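The cleanup steps can be sketched on the CLI as follows; the pool name matches the example above:

```shell
# Delete the temporary migration pool once it is empty (optional;
# it can also be kept for later imports).
rmmdiskgrp MigrationPool_1024

# After the back-end administrator unmaps the LUN, rescan so the
# cluster drops the stale MDisk entry from its inventory.
detectmdisk
```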
If the external storage device is scheduled for decommissioning, the SAN zoning needs to be
updated so that the IBM storage system can no longer see its FC ports.
This topic discusses the Export to image mode option to remove a striped type volume from IBM
virtualized storage management. It also highlights the steps to reassign the volume data to the
host directly from a storage system.
(Figure: the extents of a volume copy, spread across MDisk1-MDisk4 in a Flash pool with a
1024 MB extent size, are migrated to a single unmanaged MDisk of the same capacity as the
volume or bigger, in a pool with the same extent size.)
The export process reverts a striped type volume back to image mode, where the VDisk will
reside entirely on a backend storage LUN, transparently to the host. At some point, the application
is stopped, and the backend storage administrator unmaps the LUN from the storage system
cluster and maps it back to the host. Then the host administrator removes the storage system
clustered disk from the host configuration, rescans for disks, and restarts the application.
Zoning changes typically occur during this process. The backend storage administrator starts the
process by configuring a LUN of the appropriate size and mapping it to the storage clustered
system for use as the image mode MDisk.
The migratetoimage function is used to relocate all extents of a volume to one MDisk of the
storage system cluster, and to recreate the image mode pair.
In the example, the IBM storage system cluster export volume function (migratetoimage) enables
all the extents associated with a volume copy to be relocated to just one destination MDisk. The
access mode of the MDisk must be unmanaged for it to be selected as the destination MDisk. The
capacity of this MDisk must be either identical to or larger than the capacity of the volume.
As a result of the export process, the volume copy’s virtualization type changes to image and its
extents are sourced sequentially from the destination MDisk.
The image volume and MDisk pair can reside in any pool as long as the resident pool has the same
extent size as the pool that contained the volume copy initially.
• Issue the following command to create an export pool using the same extent size:
mkmdiskgrp -ext 1024 -name ExportPool_1024
IBMcluster:superuser>lsmdiskgrp 5
id 5
name ExportPool_1024
status online
mdisk_count 0
vdisk_count 0
capacity 0
extent_size 1024
free_capacity 0
virtual_capacity 0.00MB
used_capacity 0.00MB
real_capacity 0.00MB
overallocation 0
As a general practice, image mode pairs should be kept in a designated migration pool instead of
being intermingled in a pool with striped volumes. A preparatory step needed prior to exporting the
volume copy is to have a storage pool with the same extent size as the volume’s storage pool.
In this case, you will need to determine the extent size of the pool in which the volume to be
exported resides. Based on the pool's extent size of 1024 MB, create an ExportPool_1024 with the
same extent size.
The subsequent lsmdiskgrp 5 command displays the details of the storage pool just created
and confirms the extent size of 1024 MB for the empty pool.
(Figure: the striped volume APP3VOL, in a source pool with a 1024 MB extent size, is exported to
an unmanaged MDisk backed by the APP3VOL LUN; the MDisk joins the target ExportPool_1024
as an image mode MDisk.)
An image mode MDisk is associated with exactly one volume. This feature can be used to export a
volume to a non-virtualized disk and to remove the volume from storage virtualization, for example,
to map it directly from the external storage system to the host. If you have two copies of a volume,
you can choose one to export to image mode. To export a volume copy from striped to image mode,
right-click the volume and select Export to Image Mode from the menu list.
From the Export to Image Mode window, you will need to select an unmanaged MDisk that is the
size of the volume copy or larger to receive the volume's extents. In this example, we selected the
APP3VOL MDisk of the same capacity, still in the IBM storage system cluster inventory, as an
eligible destination MDisk to be placed in the ExportPool_1024 (target) storage pool for the new
image mode volume.
Since the target storage pool has to have the same extent size as the source storage pool, this pool
was pre-defined for that purpose. Also, the target storage pool may be an empty pool, so the
selected MDisks will be the target pool’s only member at the end of migration procedure. The target
storage pool can also contain other image mode or striped MDisks. If you have image and
striped MDisks in the same pool, volumes created in this pool use only the striped MDisks, because
MDisks that are in image mode already have an image mode volume created on top of them and
cannot be used as an extent source for other volumes.
The GUI generates the migratetoimage command to identify the destination MDisk to use for the
volume copy and the pool to contain the image mode pair.
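The generated command names the volume, the destination MDisk, and the pool that will hold the image mode pair; a sketch using the names from this example:

```shell
# Collocate all extents of volume APP3VOL onto the unmanaged MDisk
# that is backed by the APP3VOL LUN, and place the resulting image
# mode pair in ExportPool_1024.
migratetoimage -vdisk APP3VOL -mdisk APP3VOL_MDisk \
               -mdiskgrp ExportPool_1024
```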
(Figure: after migratetoimage, volume copy 1 of APP3VOL, formerly striped across a RAID5 pool,
forms an image mode pair with the APP3VOL MDisk in ExportPool_1024.)
The migratetoimage command migrates the data of the volume by consolidating its extents (which
might reside on one or more MDisks) onto the extents of the target MDisk. After migration is
complete, the volume is classified as an image type volume, and the corresponding MDisk is
classified as an image mode MDisk.
The managed disk that is specified as the target must be in an unmanaged state at the time that the
command is run. Running this command results in the inclusion of the MDisk into the user-specified
storage pool.
You cannot specify migratetoimage if the target or source volume is offline. Correct the offline
condition before you migrate the volume.
At the completion of the export process, the ExportPool_1024 pool contains one image access
mode MDisk with no free space. All the extents of this MDisk have been assigned to the exported
volume. During the migration to image mode, I/O operations continue to proceed as normal.
You can now remove the volume and MDisk from IBM Spectrum Virtualize management and
present the former MDisk as a LUN to the Windows host. To do so, first, stop application activity.
Either remove the drive letter in Windows to take the drive offline, or shut down the Windows host.
Since the storage system LUN is to be directly assigned to the host, you will need to update host
SAN zoning to enable access to the storage system.
This topic discusses the procedures to migrate existing data on external storage systems using the
IBM Spectrum Virtualize storage system migration wizard.
IBM Spectrum Virtualize Storage System Migration is a wizard-based tool that is designed to
simplify the migration task. The wizard features easy-to-follow panels that guide you through the
entire migration process.
System Migration uses volume mirroring instead of the migratevdisk command to migrate existing
data into the virtualized environment. Similar to the Import wizard, this step can be optional.
You can use the external storage DS Storage Manager Client interface to verify the mapping of
host LUNs to the IBM storage system host group. This remapping of LUNs to the IBM storage
system host group can be performed either before invoking the Migration wizard or before the next
step in the Migration wizard.
The LUN number assigned to the logical drives can be any LUN number. In this example, by default
the DS3500 storage unit uses the next available LUN numbers for the target host or host group.
The LUN number is assigned as LUN 1 for APP1DB. The logical drive ID of the LUN should match
the worldwide unique LUN names reported by the QLogic HBA management interface.
(Figure: for a given LUN on a storage box on the SAN, the migration wizard creates an image type
volume copy 0 on an image mode MDisk in MigrationPool_8192 — whose 8192 MB extent size
enables the import of large capacity LUNs — and uses volume mirroring to build a striped copy 1
in another pool of any extent size; the source MDisk remains in unmanaged mode until imported.)
Before you begin migrating external storage, confirm that the restrictions and prerequisites are met. The IBM storage system supports migrating data from an external storage system to the system using direct serial-attached SCSI (SAS) connections, Fibre Channel connections, or Fibre Channel over Ethernet connections.
The following excluded environments are not built into the guided Migration Wizard procedure.
• Change VMWare ESX host settings, or do not run VMWare ESX.
If you have VMware ESX server hosts, you must change settings on the VMWare host so
copies of the volumes can be recognized by the system after the migration is completed. To
enable volume copies to be recognized by the system for VMWare ESX hosts, you must
complete one of the following actions on the host:
▪ Enable the EnableResignature setting.
▪ Disable the DisallowSnapshotLUN setting.
▪ Alternatively, before migrating the other LUNs, create a new LUN, assign it to the host, and allow the host administrator to migrate the boot LUN.
To learn more about these settings, consult the documentation for the VMWare ESX host.
The following are required to prepare external storage systems and IBM storage system for data
migration.
• In order for the IBM storage system to virtualize external storage, a per-enclosure external
virtualization license is required. You can temporarily set the license without any charge only
during the migration process. Configuring the external license setting prevents messages from
being sent that indicate that you are in violation of the license agreement. When the migration is
complete, the external virtualization license must be reset to its original limit.
• I/O operations to the LUNs must be stopped and changes made to the mapping of the storage
system LUNs and to the SAN fabric zoning. The LUNs must then be presented to the IBM
storage system and not to the hosts.
• The hosts must have the existing storage system multipath device drivers removed, and then be configured for IBM storage system attachment. This might require further zoning changes to be made for host-to-IBM-storage-system SAN connections.
• The IBM storage system discovers the external LUNs as unmanaged MDisks.
In order to ensure that data is not corrupted during the migration process, all I/O operations on the
host side must be stopped. In addition, SAN zoning needs to be modified to allow the backend
storage to map the LUNs to the system cluster, and for the system cluster to map VDisks to the
hosts.
Before migrating storage, the administrator should record the hosts and their WWPNs for each volume that is being migrated, and the SCSI LUN number of each volume when mapped to this system.
Right-click to rename each MDisk so that it correlates to the LUNs on the external storage system.
The IBM storage system management GUI issues the detectmdisk command to scan the environment and detect the available LUNs that have been mapped to the IBM storage system host group. The lsdiscoverystatus command lists the unmanaged MDisks to be assigned to the IBM storage system. If the MDisks were not renamed during the GUI external system discovery, you can right-click each MDisk to rename it to correspond to the LUNs from the external storage system.
The Migration Wizard supports concurrently importing multiple unmanaged MDisks. The LUNs are presented as unmanaged mode MDisks. The LUN numbers range from 0 to 255 and are surfaced to the IBM storage system as a 64-bit number with the low-order byte containing the external storage assigned LUN number in hexadecimal format. The MDisk properties provide additional confirmation, including the storage system name and UID.
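The numbering scheme above can be sketched in a few lines; this is an illustrative rendering of the 64-bit identifier as described in the text, not IBM code:

```python
def surface_lun(lun: int) -> str:
    """Render an external storage LUN number (0-255) as the 16-hex-digit
    64-bit identifier described in the text, with the assigned LUN number
    in the low-order byte. A sketch of the numbering scheme only."""
    if not 0 <= lun <= 255:
        raise ValueError("external LUN numbers range from 0 to 255")
    return f"{lun:016X}"
```

For example, LUN 1 surfaces as 0000000000000001, with the assigned LUN number visible in the final (low-order) byte.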
For each of the selected MDisks, a mkvdisk command is generated to create a one-to-one volume pair with a virtualization type of image. Image mode means that the volume is an exact image of the LUN that is on the external storage system, with its data completely unchanged. Therefore, the IBM storage system is simply presenting an active image of the external storage LUN.
The mkmdiskgrp command is used to create a MigrationPool whose extent size is 8192 MB. Using
the largest extent size possible for this pool enables MDisk addressability when importing
extremely large capacity LUNs.
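The reason larger extents help can be made concrete. Spectrum Virtualize addresses a fixed number of extents per system (the commonly cited figure of 2^22, about 4 million, is an assumption here and should be checked against your code level), so maximum addressable capacity scales linearly with extent size:

```python
MAX_EXTENTS = 2 ** 22  # commonly cited per-system extent limit (an assumption)

def max_capacity_tib(extent_size_mb: int) -> float:
    """Maximum addressable capacity, in TiB, for a given extent size.
    Larger extents let the same fixed extent count address far more
    capacity, which is why the migration pool uses 8192 MB extents."""
    return MAX_EXTENTS * extent_size_mb / (1024 * 1024)
```

Under these assumptions, 8192 MB extents address 32768 TiB (32 PiB), versus 4096 TiB with 1024 MB extents.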
The unmanaged MDisks are moved into the migration pool with an access mode of image, and a corresponding image type volume is created with all its extents pointing to each MDisk. The name assigned to each volume follows the format of the storage system name concatenated with the storage system assigned LUN number for the MDisk.
As with all IBM storage objects, an object ID is assigned to each newly created volume. As a
preferred practice, map the volume to the host with the same SCSI ID before the migration.
Before you proceed to map image volumes to a host, you need to verify that the potential host systems have been installed with the supported drivers and are properly zoned within the IBM storage system SAN fabric.
If a host object has not yet been defined to the IBM storage system, click the Add Host option. Configuring host objects using the System Migration Wizard is optional, as it can be performed after volumes have been migrated to a specified pool.
The System Migration Map Volumes to Hosts (optional) pane presents the image volumes under the default name, which contains the name of the external storage system along with the corresponding MDisk name. All columns within the wizard can be customized to display additional information, such as the volume object IDs.
From this pane, the selected image volumes can now be mapped to the desired host. This task can be completed using the Map to Host option, or from the Actions menu by selecting Map to Host.
With today's SAN-aware operating systems and applications, a change in the SCSI ID (LUN number) of a LUN presented to the host is not usually an issue; it is only a concern if multiple hosts are accessing the same VDisk and require the same LUN ID across all hosts. Windows behavior is consistent, so it is not an issue for a disk to be removed from the system and then represented with a different ID/LUN number. Windows will typically reassign the same drive letter if it is still available.
Once the image volumes are mapped to the host object, host device discovery can be performed. It
might be appropriate to reboot the server as part of the host device discovery effort.
[Figure: the image volume's copy 0 extents reside in MigrationPool_8192]
Migrating image volumes to a selected pool is optional. If you want to migrate these volumes into the virtualized environment (virtualization type of striped), select a target pool.
Unlike the Import Wizard, the Migration Wizard uses the IBM storage system Volume Mirroring
function (instead of migratevdisk) to implement the migration to the striped virtualization type. The
GUI generates one addvdiskcopy command to create a second volume copy (copy 1) for each
volume.
Since Volume Mirroring is used, the target pool extent size does not need to match the migration pool extent size.
If no target pool is selected for this step, the volumes and their corresponding MDisks are left as image mode pairs. The System Migration Wizard can be invoked at a later point in time to complete the migration to the virtualized environment.
The GUI starts volume synchronization on each volume copy. This part of the System Migration Wizard is complete; however, the end of the storage migration wizard is not the end of the data migration process. Once complete, click the Finish option.
[Figure: the host has continuous access to volume data while the migration occurs in the background; for a given LUN, copy 0 (image type) in MigrationPool_8192 is mirrored to copy 1 (striped type, managed mode) in Other_Pool_1024, which can have any extent size]
Since the IBM storage system environment is virtualized and the volumes were successfully mapped to the host, the host has continuous access to the volume data while the migration occurs in the background. The application can be restarted, and the host will have no awareness of the migration process.
After Volume Mirroring synchronization has reached 100%, you can finalize the migration process. The image copy (copy 0) is deemed no longer needed, since the data has been migrated into the IBM storage system virtualized environment.
From the System Migration pane, select the Finalize option. A subsequent IBM storage system SAN device discovery action will delete each copy 0 image volume copy from the IBM storage system inventory, provided the backend storage administrator has unmapped the LUN from the system cluster.
Delete MigrationPool_8192:
• Image type volumes are deleted, and the corresponding MDisk is automatically removed from the pool
• MigrationPool_8192 can either be deleted or kept for subsequent imports
• Additional steps will need to be performed to unassign the LUNs in the backend storage from the system cluster
When the finalization completes, the image type volumes are deleted and their corresponding MDisks are automatically removed from the storage pool. You can unzone and remove the older storage system from the IBM storage system SAN fabric. The empty MigrationPool_8192 can either be deleted or kept for subsequent imports. The data migration to the IBM storage system is done.
Additional steps will need to be performed to unassign the LUNs in the storage system from the IBM storage system cluster. You would typically change the zoning so that the backend disk subsystem is no longer zoned to communicate with the host; the backend storage can, however, remain zoned to the system cluster.
This topic discusses how volume mirroring can be used to migrate data from one pool to another
pool.
[Figure: one volume presented to the host is backed by two volume copies, Copy 0 and Copy 1]
Volume Mirroring is a function in which the Spectrum Virtualize software stores two copies of a volume and keeps those two copies synchronized. Volume mirroring is a simple RAID 1-type function that allows a volume to remain online even when the storage pool backing it becomes inaccessible.
Volume mirroring is designed to protect the volume from storage infrastructure failures that might impact the availability of critical data or applications, by seamlessly mirroring between storage pools. Accordingly, Volume Mirroring is a local high availability function and is not intended to be used as a disaster recovery function.
[Figure: Spectrum Virtualize I/O stack - SCSI Target, Forwarding, Replication, Upper Cache, FlashCopy, Volume Mirroring. Each volume copy has its own cache, so destaging of the cache can be done independently for each copy]
With the redesign of the IBM Spectrum Virtualize software architecture, mirrored volume performance has been significantly improved. When the host writes to a mirrored volume, the data is placed in the I/O group cache once before an acknowledgment is returned to the host. Compared to host-based mirroring, the host does one write with clustered system mirroring instead of two, and writes only once to write cache rather than twice; thus, it uses the clustered system cache twice as efficiently. The cluster destages the write data to both copies of the volume. Destaging of the cache can be done independently for each copy, so one copy does not affect the performance of the other.
Also, since the IBM storage system's destage algorithm is MDisk-aware, it can tune or adapt the destaging process for each copy independently, depending on MDisk type and utilization.
[Figure: volume copy 0 and volume copy 1 each hold a complete set of the volume's extents]
The ability to create a volume copy affords additional management flexibility. Volume copies use the same virtualization policies and can be created as striped, sequential, or image volumes. Volume mirroring also offers non-disruptive conversion between fully allocated volumes and thin-provisioned volumes.
A volume copy can also be added to an existing volume. In this case, the two copies do not have to
share the same virtualization policy. When a volume copy is added, the Spectrum Virtualize
software automatically synchronizes the new copy so that it contains the same data as the existing
copy.
[Figure: a 5 GB volume mirrored between Pool1 (512 MB extent size) and Pool2 (1024 MB extent size); each copy has its own storage pool, virtualization type, and fully allocated or thin provisioning]
A volume can be migrated from one storage pool to another and acquire a different extent size. The
original volume copy can either be deleted or you can split the volume into two separate volumes –
breaking the synchronization. The process of moving any volume between storage pools is
non-disruptive to host access. This option is a quicker version of the “Volume Mirroring and Split
into New Volume” option. You might use this option if you want to move volumes in a single step or
you do not have a volume mirror copy already.
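Because each copy is carved from its own pool, the same volume simply consumes a different number of extents in each pool. The figure's 5 GB volume can be sketched as follows (illustrative only):

```python
import math

def extents_needed(volume_gib: float, extent_size_mb: int) -> int:
    """Number of extents one volume copy consumes in a pool with the
    given extent size. Each copy is allocated independently, which is
    why mirrored copies can live in pools with different extent sizes."""
    return math.ceil(volume_gib * 1024 / extent_size_mb)
```

A 5 GB volume needs 10 extents in a pool with 512 MB extents but only 5 extents in a pool with 1024 MB extents; the copies remain byte-identical to the host.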
By default, volume copy 0 is assigned as the primary copy of the volume. From an I/O processing point of view, under normal conditions reads and writes always go through the primary copy; writes are also sent to volume copy 1 so that synchronization is maintained between the two volume copies. The location of the primary copy can also be changed by the user, either for load balancing or to account for different performance characteristics of the storage backing each copy.
If the primary copy is unavailable - for example, volume copy 0's pool became unavailable because its storage system was taken offline - the volume remains accessible to the assigned servers. Reads and writes are handled with volume copy 1. The IBM storage system tracks the changed blocks of volume copy 1 and resynchronizes these blocks with volume copy 0 when it becomes available. Reads and writes then revert to volume copy 0. It is also possible to make volume copy 1 the primary copy.
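The routing and change-tracking behavior described above can be modeled in a few lines. This is a toy sketch for intuition, not IBM code; block-level detail and all error handling are omitted:

```python
class MirroredVolume:
    """Toy model of mirrored-volume I/O routing: reads go to the primary
    copy, writes go to both copies, and blocks changed while a copy is
    offline are tracked so only those blocks are resynchronized later."""

    def __init__(self):
        self.copies = [dict(), dict()]   # block -> data, for copy 0 and copy 1
        self.online = [True, True]
        self.primary = 0
        self.dirty = set()               # blocks changed while a copy was offline

    def write(self, block, data):
        for i, copy in enumerate(self.copies):
            if self.online[i]:
                copy[block] = data
            else:
                self.dirty.add(block)    # remember for resynchronization

    def read(self, block):
        # Prefer the primary copy; fall back to the other copy if it is offline
        src = self.primary if self.online[self.primary] else 1 - self.primary
        return self.copies[src].get(block)

    def bring_online(self, i):
        # Incremental resync: copy only the tracked changed blocks
        self.online[i] = True
        other = 1 - i
        for block in self.dirty:
            self.copies[i][block] = self.copies[other][block]
        self.dirty.clear()
```

The key design point mirrors the text: resynchronization is incremental (only the tracked changed blocks are copied), so recovery after a pool outage does not require a full copy.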
Summary:
• Provides a quick view of volume details before creation
• Volumes in the mirrored pair are created equal in capacity
• The default sync rate is 50
• The addvdiskcopy command adds a copy to an existing volume
One of the simplest ways to create a volume copy is to right-click a particular volume and select Add Volume Copy. This task creates the mirrored volume with two copies and synchronizes the data in their extents. This procedure allows you to place the mirrored volume in a single pool, or to specify a primary and a secondary pool to migrate data between two storage pools.
The Summary statement calculates the real and virtual capacity values of the volume. The virtual capacity is the size presented to hosts and to other Copy Services such as FlashCopy and Metro/Global Mirror.
The addvdiskcopy command adds a copy to an existing volume, which changes a non-mirrored
volume into a mirrored volume. Use the -copies parameter to specify the number of copies to add
to the volume; this is currently limited to the default value of 1 copy. Use the -mdiskgrp parameter
to specify the managed disk group that will provide storage for the copy; the lsmdiskgrp CLI
command lists the available managed disk groups and the amount of available storage in each
group.
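The command described above can be sketched as a string builder; the svctask prefix and exact argument order are assumptions for illustration, so verify them against the CLI reference for your code level:

```python
def build_addvdiskcopy(volume_id: int, mdiskgrp: str, copies: int = 1) -> str:
    """Assemble an addvdiskcopy command line as described in the text.
    The 'svctask' prefix and argument order are assumptions; -copies is
    currently limited to the default value of 1."""
    if copies != 1:
        raise ValueError("-copies is currently limited to the default of 1")
    return f"svctask addvdiskcopy -mdiskgrp {mdiskgrp} -copies {copies} {volume_id}"
```

For example, adding a copy of volume 3 in Pool2 would assemble "svctask addvdiskcopy -mdiskgrp Pool2 -copies 1 3".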
You can also create mirrored volumes using the GUI Create Volumes Mirrored and Custom preset options. The Mirrored preset creates mirrored volumes with predefined parameters such as volume format and the default sync rate.
The Custom preset allows you to modify specific parameters, such as changing the sync rate parameter to specify the rate at which the volume copies resynchronize after a loss of synchronization.
The mirrored volume entry displays two copies. By default, the asterisk associated with volume copy 0 identifies the primary copy. This copy is used by the IBM storage system for reads and writes. The addvdiskcopy request added copy 1 for this volume; copy 1 is used for writes only. Spectrum Virtualize volume mirroring automatically copies the data of copy 0 to copy 1 while supporting concurrent application reads and writes.
A volume synchronization task is generated and runs in the background.
When a server writes to a mirrored volume, the system cluster writes the data to both copies. If the
Primary volume copy is available and synchronized, any reads are directed to it. However, if the
primary copy is unavailable, the system cluster will read from Copy 1. There are two settings for the
mirror_write_priority attribute, trading off data consistency and write performance.
• Latency (default value): short time-out prioritizing low host latency. This option indicates a copy
that is slow to respond to a write I/O goes out of sync if the other copy successfully writes the
data.
• Redundancy: long time-out prioritizing redundancy. This option indicates a copy that is slow to
respond to a write I/O may use the full Error Recovery Procedure (ERP) time. The response to
the I/O is delayed until it completes to keep the copy in sync if possible.
Volume Mirroring ceases to use the slow copy for a period of 4 to 6 minutes, and subsequent I/O data is not affected by the slow copy. Synchronization is suspended during this period. After the suspension completes, Volume Mirroring resumes, allowing I/O data and synchronization operations to the slow copy, which typically completes the synchronization shortly afterward.
The volume property details confirm that the two volume copies are identical but assigned to different storage pools. The capacity bar for the volume copies indicates that both are fully allocated volumes, with writes performed on both copies. The synchronization or background copy rate defaults to 50, which corresponds to 2 MBps. You can change the synchronization rate to one of the specified rates to increase the background copy rate: issue a chvdisk -syncrate command from the CLI, or right-click the original volume and select Modify Mirror Sync Rate.
The background synchronization rate can be monitored from the Monitoring > Performance view. The default synchronization rate is typically too low for Flash drive mirrored volumes; instead, set the synchronization rate to 80 or above.
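The mapping from syncrate value to bandwidth can be sketched. The anchor point (rate 50 = 2 MBps) comes from the text; the doubling-per-decade table (1-10 = 128 KBps up through 91-100 = 64 MBps) is the commonly documented mapping and should be verified for your code level:

```python
import math

def syncrate_to_mbps(rate: int) -> float:
    """Approximate background copy bandwidth for a syncrate value.
    Grounded in the text's 'rate 50 = 2 MBps'; the doubling-per-decade
    table is an assumption to verify against the product documentation."""
    if not 1 <= rate <= 100:
        raise ValueError("syncrate must be between 1 and 100")
    decade = math.ceil(rate / 10)        # 1..10
    return 0.125 * 2 ** (decade - 1)     # 128 KBps, doubled per decade
```

Under this mapping, the default rate of 50 gives 2 MBps, while the 80-or-above recommendation for Flash drive mirrored volumes gives 16 MBps or more.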
Volume Mirroring processing is independent of the storage pool extent size. When a volume copy is created, one set of extents exists (copy 0), and a second set of extents is created on the secondary volume copy (copy 1). The two sets of extents, or volume copies, can reside in the same or different storage pools.
Using volume mirroring over volume migration is beneficial because, with volume mirroring, the storage pools do not need to have the same extent size, as is the case with volume migration. Volume mirroring also eliminates the impact to volume availability if one or more MDisks, or the entire storage pool, fails. If one of the mirrored volume copies becomes unavailable, updates to the volume are logged by the IBM storage system, allowing for resynchronization of the volume copies when the mirror is reestablished. The resynchronization between the copies is incremental and is started by the IBM storage system automatically. Therefore, volume mirroring provides higher availability to applications at the local site and reduces or minimizes the requirement to implement host-based mirroring solutions.
The primary copy is used by the IBM storage system for both reads and writes. You can change volume copy 1 to be the primary copy by right-clicking its entry and selecting Make Primary from the menu.
The GUI generates the chvdisk -primary command to designate volume copy 1 as the primary copy for the selected volume ID.
A use case for designating volume copy 1 as the primary copy is the migration of a volume to a new
storage system.
For a test period, it might be desirable to have both the read and write I/Os directed at the new
storage system of the volume while still maintaining a copy in the storage system scheduled for
removal.
You can convert a mirrored volume into a non-mirrored volume by deleting one copy or by splitting
one copy to create a new non-mirrored volume. During the deletion process for one of the volume
copies, the management GUI issues a rmvdiskcopy command followed by the -copy number (in
this example Copy 0). Once the process is complete, only volume copy 1 of the volume remains. If
volume copy 1 was a thin-provisioned volume, it is automatically converted to a fully allocated copy.
The volume can now be managed independently by Easy Tier based on the activity associated with
extents of the individual volume copy.
[Figure: volume copy 0 and volume copy 1 hold identical sets of extents]
Although the two volume copies are identical, they appear to the host as one volume. If one of the mirrored volume copies is temporarily unavailable - for example, because the storage system that provides the storage pool is unavailable - the volume remains accessible to servers. The system remembers which areas of the volume were written and resynchronizes those areas when both copies are available. The secondary copy can service read I/O when the primary is offline, without user intervention. All volume migration activities occur within the IBM storage system and are totally transparent to attached servers and user applications.
To protect against mirrored volumes being taken offline, and to ensure the high availability of the
system, follow the guidelines for setting up quorum disks where multiple quorum candidate disks
are allocated on different storage systems.
The system cluster quorum disks store the volume mirror state information that is needed to ensure data integrity. If a quorum disk is not accessible and volume mirroring is unable to update the state information, a mirrored volume might need to be taken offline to maintain data integrity.
Mirrored volumes can be taken offline if there is no quorum disk available. This behavior occurs
because synchronization status for mirrored volumes is recorded on the quorum disk.
When creating a mirrored volume, you can have a maximum of two copies. Both copies are created with the same virtualization policy. The first storage pool specified will contain the primary copy.
• To have a volume mirrored using different policies, you add a volume copy with a different
policy.
• Each copy can be located in different storage pools.
• It is not possible to create a volume with two copies when specifying a set of MDisks.
You can add a volume copy to an existing volume. Each volume copy can have a different space
allocation policy. You cannot create a mirrored volume from two existing VDisks.
You can remove a volume copy from a mirrored volume; only one copy then remains.
You can split a volume copy from a mirrored volume and create a new volume with the split copy. This function can only be performed when the volume copies are synchronized; otherwise, use the -force parameter.
• Volume copies cannot be recombined after they have been split.
• The split volume copy can be used as a means for creating a point-in-time copy (clone).
You can expand or shrink both of the volume copies at once.
• All volume copies always have the same size.
• All copies must be synchronized before expanding or shrinking them.
When a volume gets deleted, all copies get deleted.
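The rules above lend themselves to simple validation checks. The following is an illustrative sketch of two of them, not an IBM API:

```python
def can_add_copy(current_copies: int) -> bool:
    """A mirrored volume holds at most two copies."""
    return current_copies < 2


def can_split_copy(copies_synchronized: bool, force: bool = False) -> bool:
    """Splitting a copy into a new volume requires the copies to be
    synchronized unless the -force option is used."""
    return copies_synchronized or force
```

For example, adding a copy to an already-mirrored volume is rejected, and splitting an unsynchronized copy is allowed only with force.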
[Figure: new storage MDisks (DS3KNAVY4, DS3KNAVY5, DS3KNAVY6, DS3KNAVY7) are added to the pool while the old storage MDisks are removed]
Moving disk subsystems on or off the floor is a common migration situation, which can be done in
two steps:
1. Add the new storage MDisks to the existing pool.
2. Remove the old storage MDisks from the pool.
Data migration methods:
• Import wizard: creates an image volume from a new volume with existing data, then migrates it to striped
• Export to image mode: migrates a striped volume to an image mode MDisk for export
• Migration wizard: imports multiple new volumes with existing data and maps them to hosts
You should now be aware that the only time that data migration is disruptive to applications is when
a disk subsystem is virtualized behind an IBM clustered storage system or vice versa. In all other
cases, clustered system managed data movement is totally transparent. Applications proceed
blissfully unaware of changes being made in the storage infrastructure.
Keywords:
• Striped mode
• Image mode
• Sequential mode
• System migration
• MDisks
• Volume
Review questions (1 of 2)
1. The three virtualization types for volumes are:
2. True or False: When using volume mirroring to migrate a volume from one pool to
another, the extent size of the two pools must be identical.
3. True or False: Migrating a volume from image virtualization type to striped or from
striped back to image is completely transparent to host application I/Os.
Review answers (1 of 2)
1. The three virtualization types for volumes are:
The answers are striped, sequential, and image.
2. True or False: When using volume mirroring to migrate a volume from one pool to
another, the extent size of the two pools must be identical.
The answer is False.
3. True or False: Migrating a volume from image virtualization type to striped or from
striped back to image is completely transparent to host application I/Os.
The answer is True.
Review questions (2 of 2)
4. Which of the following is not performed by the Import wizard when a volume from an
external storage system is being migrated to the IBM storage system?
A. Create a migration pool with the proper extent size
B. Unzone and unmap the volume from the external storage system
C. Create an image type volume to point to storage on the MDisk being imported
D. Migrate the volume from image to striped type
5. True or False: Once a volume is under IBM storage system management, that data can
no longer be exported to another storage system.
6. True or False: To remove an external storage MDisk from IBM storage system, its
access mode should be unmanaged.
Review answers (2 of 2)
4. Which of the following is not performed by the Import wizard when a volume from an
external storage system is being migrated to the IBM storage system?
A. Create a migration pool with the proper extent size
B. Unzone and unmap the volume from the external storage system
C. Create an image type volume to point to storage on the MDisk being imported
D. Migrate the volume from image to striped type
The answer is B. Unzone and unmap the volume from the external storage system.
5. True or False: Once a volume is under IBM storage system management, that data can no
longer be exported to another storage system.
The answer is False.
6. True or False: To remove an external storage MDisk from IBM storage system, its access
mode should be unmanaged.
The answer is True.
Summary
Overview
This module examines data replication Copy Services using FlashCopy point-in-time copy, in which the target volume contains a copy of the data that was on the source volume when the FlashCopy was established. In addition, it discusses the use of FlashCopy consistency groups, which help create a consistent point-in-time copy across multiple volumes.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives

Copyright IBM Corporation
IBM Spectrum Virtualize FlashCopy and Consistency groups
FlashCopy:
• Functionality and overview
• Create Snapshot
• Create consistency group with multi-select
• Incremental Backup
• Indirection layer/Bitmap space
• IBM Spectrum Protect Snapshot
[Figure: Spectrum Virtualize I/O stack - SCSI Target, Forwarding, Replication - with FlashCopy placed below the Upper Cache]
This illustrates the Spectrum Virtualize software architecture and the placement of the FlashCopy function below the Upper Cache. The I/O stack cache re-architecture improves the processing of FlashCopy operations with:
• Near-instant prepare (versus minutes); the same applies to Global Mirror with Change Volumes
• Full stride writes for FlashCopy volumes regardless of the grain size
• Support for 255 FlashCopy consistency groups, up from 127 previously
With the new cache architecture, you now have a two-layer cache: upper cache and lower cache. Notice that cache now sits above FlashCopy as well as below it. In this design, before you can take a FlashCopy, anything in the upper cache must be transferred to the lower cache before the pointer table can be taken. The pointer table can, however, be taken without having to destage cache to disk beforehand.
It’s important to understand that most applications are written such that they can automatically
recover in the event of a system crash. Without this capability, system crashes would lead to
corrupt or inconsistent data because in-flight writes may or may not have completed before a
system crash. Various algorithms are used for this purpose including journaling and two-phase
commit. Therefore, write order consistency is required, with write acknowledgments.
When a server crashes, in-flight transactions may be lost depending on the algorithm used.
Transactions that were started but didn’t complete (via receiving an acknowledgment that all the
necessary writes completed) are often rolled back when the application is restarted to ensure data
consistency. However, end users won’t have received an acknowledgment that their transaction
completed either. This means that they will know to resubmit it later when the application is
available. Therefore for FlashCopy, writes in the write cache (in the upper cache) are moved to the
lower cache to keep write order consistency, as part of the FlashCopy process, because writes
aren't necessarily destaged from write cache in the same order they were written by the host.
Various types of FlashCopy volumes exist for different purposes:
• Snapshot is a no-copy FlashCopy that creates a point-in-time view of the production data
• Clone is a full-copy FlashCopy that creates an exact replica of the volume, which can be
changed without impacting the original volume. Once all the data is copied to the clone, the
bitmap is deleted. When creating a clone using the fast-path, the mapping is also deleted.
When creating a clone and specifying the mapping via the advanced option, the mapping will
remain.
• An incremental backup is a full point-in-time replica of the production data. After the copy
completes, the backup view can be refreshed from the production data, with minimal copying of
data from the production volume to the backup volume. Two bitmaps are kept. The second
bitmap is used to track changes to the source volume after the incremental backup FlashCopy
is created.
• Multiple target FlashCopy mappings allows up to 256 target volumes to be copied from a
single source volume. Each relationship between a source and target volume is managed by a
unique mapping such that a single volume can be the source volume in up to 256 mappings.
Each of the mappings from a single source can be started and stopped independently. If
multiple mappings from the same source are active (in the copying or stopping states), a
dependency exists between these mappings.
• The Cascaded FlashCopy function allows a FlashCopy target volume to be the source volume
of another FlashCopy mapping.
• The Reverse FlashCopy function copies only the data that is required to bring the target volume
current. If no updates have been made to the target since the last refresh, the direction change
can be used to restore the source to the previous point-in-time state. A reverse FlashCopy
requires stopping the application before restoring the data.
• FlashCopy targets can be thin provisioned reducing the space used for FlashCopies. There are
two variations of this option to consider:
▪ Space-efficient source and target with background copy: Copies only the allocated space.
▪ Space-efficient target with no background copy: Copies only the space that is used for
changes between the source and target and is referred to as snapshots.
This function can be used with multi-target, cascaded, and incremental FlashCopy.
• A consistency group is a container for FlashCopy mappings, Global Mirror relationships, and
Metro Mirror relationships. You can add many mappings or relationships to a consistency group,
however FlashCopy mappings, Global Mirror relationships, and Metro Mirror relationships
cannot appear in the same consistency group.
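As a sketch of the incremental-backup idea above, the toy Python below (illustrative only, not product code; names are ours) keeps the second "difference" bitmap and recopies only the grains flagged as changed since the last refresh:

```python
# Illustrative sketch of an incremental backup: after the first full copy,
# a difference bitmap records which source grains changed, so a refresh
# copies only those grains. (Simplified: one value per grain.)

def refresh(source, backup, changed):
    """Copy only grains flagged in the difference bitmap, then clear it."""
    copied = 0
    for g, dirty in enumerate(changed):
        if dirty:
            backup[g] = source[g]
            changed[g] = False
            copied += 1
    return copied

source = ["a", "b", "c", "d"]
backup = list(source)                 # first backup: full copy
changed = [False] * 4
source[2] = "C"                       # host write to grain 2...
changed[2] = True                     # ...tracked in the second bitmap
assert refresh(source, backup, changed) == 1   # only one grain recopied
assert backup == ["a", "b", "C", "d"]
```

The point of the second bitmap is exactly this: the cost of a refresh is proportional to the change rate, not to the volume size.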
FlashCopy implementation
• System (SVC, FlashSystem V9000, Storwize family): The source and target volumes must be on
the same system.
• Storage pool: The target volume does not need to be in the same storage pool as the source
volume.
• Size: The target volume must be the same size as the source volume. The size of the source and
target volumes cannot be altered (increased or decreased) while a FlashCopy mapping is
defined.
Listed are several guidelines to consider before implementing a FlashCopy in your IBM storage
environment.
• The source and target volumes must be in the same IBM Spectrum Virtualize system and
volumes must be the same “virtual” size.
• The FlashCopy source and target volumes must reside on the same Spectrum Virtualize
cluster. The target volume can reside in a storage pool backed by a different storage system
from the source volume, enabling more flexibility than traditional storage-system-based
point-in-time copy solutions.
• The source and target volumes do not need to be in the same I/O group or storage pool; they
can be within the same storage pool, across storage pools, or across I/O groups.
• The storage pool extent sizes can differ between the source and target.
• The I/O group ownership of volumes affects only the cache and the layers above the cache in
the Spectrum Virtualize system I/O stack. Below the cache layer the volumes are available for
I/O on all nodes within the system cluster.
• FlashCopy operations perform in direct proportion to the performance of the source and target
disks. Copy-on-write activity increases back-end IOPS, but that only matters if the underlying
storage is approaching its IOPS limits. For full-copy FlashCopies this activity eventually ends
when the background copy completes, and it is limited to one volume's worth of data for a
FlashCopy target in any case.
FlashCopy attributes
Up to 5000 FlashCopy mappings per IBM Spectrum Virtualize system
Source volumes can have up to 256 target volumes (Multiple Target FlashCopy)
Target volumes can be the source volumes for other FlashCopy relationships (cascaded FlashCopy)
Consistency groups are supported to enable FlashCopy across multiple volumes for the same point
in time
Up to 255 FlashCopy consistency groups are supported per system
Up to 512 FlashCopy mappings can be placed in one consistency group
Target volume can be updated immediately and independently of the source volume
Maximum supported FlashCopy capacity is 4096 TB per IBM Spectrum Virtualize system
Size of the source and target volumes cannot be altered (increased or decreased) while a FlashCopy
mapping is defined
The FlashCopy function in the IBM Spectrum Virtualize management GUI provides the following
attributes:
• Up to 5000 FlashCopy mappings per IBM Spectrum Virtualize system.
• Up to 256 FlashCopy mappings can exist with the same source volume.
• Up to 4096 TB of FlashCopy capacity is supported per system.
• The maximum number of FlashCopy consistency groups per system is 255, an arbitrary limit
that is policed by the software.
• A maximum of 512 FlashCopy mappings is allowed per consistency group. This limit is based
on the time that is taken to prepare a consistency group with many mappings.
• Once the target volume is mapped to a host and configured there, it can be immediately used
and updated, even if it's a clone and the full copy hasn't completed.
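The configuration limits above can be collected into a small illustrative checker. The limit values come from this material (verify against the current product documentation); the function and dictionary names are hypothetical, not a product API:

```python
# Illustrative only: validate a proposed new mapping against the limits
# listed above. Not an IBM API; values as stated in this course material.

LIMITS = {
    "mappings_per_system": 5000,
    "targets_per_source": 256,
    "consistency_groups_per_system": 255,
    "mappings_per_consistency_group": 512,
}

def can_add_mapping(system_mappings, source_targets, group_mappings=0):
    """True if one more mapping would stay within every listed limit."""
    return (system_mappings < LIMITS["mappings_per_system"]
            and source_targets < LIMITS["targets_per_source"]
            and group_mappings < LIMITS["mappings_per_consistency_group"])

assert can_add_mapping(4999, 255)      # still within every limit
assert not can_add_mapping(5000, 0)    # system-wide mapping limit reached
assert not can_add_mapping(10, 256)    # per-source target limit reached
```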
FlashCopy process (1 of 2)
• Bitmap created at FlashCopy start time
▪ Bitmap keeps track of grain size units (256 KB default)
▪ Bits point to where data is for the target volume
▪ Bits initially set to 0, pointing to source volume
▪ No data is actually copied at start of FlashCopy mapping
− Target volume is immediately ready for read/write use
[Figure: at FlashCopy start, the bitmap is all zeros - every grain of the target still points to the source volume]
This diagram illustrates the general process for how FlashCopy works while the full image copy is
being completed in the background. To create an instant copy of a volume, you must first create a
mapping between the source volume (the disk that is copied) and the target volume (the disk that
receives the copy). The source and target volumes must be of equal size. When a FlashCopy
operation starts, a checkpoint is made of the source volume. No data is actually copied at the time
a start operation occurs. Instead, the checkpoint creates a bitmap that indicates that no part of the
source volume has been copied. Each bit in the bitmap represents one region of the source
volume. Each region is called a grain.
The bits in the bitmap point to where the data resides for the target volume, and initially all bits point
to the source volume. Each bit represents a grain size unit of the volumes which by default is 256
KB.
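The grain and bitmap arithmetic above can be sketched in a few lines (illustrative Python; the helper name is ours, not a product API):

```python
# Sketch: sizing the FlashCopy bitmap for a volume. Each bit tracks one
# grain; the grain size defaults to 256 KB (64 KB for compressed volumes).

KB = 1024

def bitmap_bits(volume_size_bytes, grain_size_bytes=256 * KB):
    """Number of bitmap bits (grains) needed to track a volume."""
    # Round up: a partial final grain still needs its own bit.
    return -(-volume_size_bytes // grain_size_bytes)

# A 1 TiB volume with the default 256 KB grain:
bits = bitmap_bits(1024**4)      # 4,194,304 grains
bitmap_bytes = bits // 8         # 512 KiB of bitmap memory
```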
FlashCopy process (2 of 2)
• When a host issues a write to the source, the original data is copied to the target (aka. a
copy on write or COW) and the bitmap is updated to point to the target
ƒ Applies to both full copy and no copy FlashCopies
ƒ Full copy FlashCopies have a non-zero background copy rate
í Background copy rate can be adjusted
[Figure: after host writes, bits set to 1 mark grains copied to the target (1 = data on target, 0 = data on source); reads and writes consult the bitmap]
When data is being copied from the source volume to the target volume, the default grain size is
256 KB. To facilitate copy granularity for incremental copy the grain size can be set to 64 KB at
initial mapping definition. If a compressed volume is in a FlashCopy mapping then the default grain
size is 64 KB instead of 256 KB. After the FlashCopy mapping is started, hosts continue to write to
their source volume. So, to preserve the target volume's data, before a grain on the source
volume is overwritten, that grain is read and written to the target volume, a process called
copy-on-write (COW).
The priority of the background copy process is controlled by the background copy rate. A rate of
zero indicates that only overwritten data on the source volume is copied over to the target volume
via the COW process. Unchanged data is read from the source. This option is designed primarily
for backup applications where a point-in-time version of the source is only needed temporarily.
[Figure: when the background copy completes, the bitmap is all ones - every grain has been copied to the target volume]
When the background copy is complete, the FlashCopy operation is complete and the bitmap is
removed. There is no longer a relationship between the source and target volume; source and
target are logically independent, so the bitmap can be deleted. For fast-path clone FlashCopies,
the mapping is also deleted, whereas if the mapping was explicitly created by the user, it remains.
The priority of the background copy process is controlled by the background copy rate. A rate of
zero indicates that only data being changed on the source should have the original content copied
to the target (also known as copy-on-write or COW). Unchanged data is read from the source. This
option is designed primarily for backup applications where a point-in-time version of the source is
only needed temporarily.
A background copy rate of 1 to 100 indicates that the entire source volume is to be copied to the
target volume. The rate value specified corresponds to an attempted bandwidth during the copy
operation:
• 01 to 10 - 128 KBps
• 11 to 20 - 256 KBps
• 21 to 30 - 512 KBps
• 31 to 40 - 1 MBps
• 41 to 50 - 2 MBps (the default)
• 51 to 60 - 4 MBps
• 61 to 70 - 8 MBps
• 71 to 80 - 16 MBps
• 81 to 90 - 32 MBps
• 91 to 100 - 64 MBps
• 101 to 110 - 128 MBps
• 111 to 120 - 256 MBps
• 121 to 130 - 512 MBps
• 131 to 140 - 1 GBps
• 141 to 150 - 2 GBps
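Because each step of 10 in the copy rate doubles the attempted bandwidth, starting at 128 KBps for rates 1 to 10, the whole table collapses into one formula. A small illustrative sketch (our own helper, not a product API):

```python
# Illustrative: map a background copy rate (1-150) to the attempted
# bandwidth from the table above. Each band of 10 doubles the bandwidth.

def copy_rate_to_kbps(rate):
    if not 1 <= rate <= 150:
        raise ValueError("copy rate must be 1-150 (0 = no background copy)")
    return 128 * 2 ** ((rate - 1) // 10)

assert copy_rate_to_kbps(50) == 2 * 1024          # 2 MBps, the default
assert copy_rate_to_kbps(100) == 64 * 1024        # 64 MBps
assert copy_rate_to_kbps(150) == 2 * 1024 * 1024  # 2 GBps
```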
The background copy rate can be changed dynamically during the background copy operation.
The background copy is performed by one of the nodes of the I/O group in which the source volume
resides. This responsibility is failed over to the other node in the I/O group in the event of a failure of
the node performing the background copy.
[Figure: I/O handling during background copy - target writes trigger copy on demand, source writes trigger copy-on-write (COW), and remaining grains are moved by the background copy]
The background copy is performed backwards. That is, it starts with the grain containing the highest
logical block addresses (LBAs) and works backwards towards the grain containing LBA 0. This is
done to avoid any unwanted interactions with sequential I/O streams from the using application.
After the FlashCopy operation has started, both source and target volumes can be accessed for
read and write operations:
• Source reads: Business as usual.
• Target reads: Consult its bitmap. If data has been copied then read from target. If not, read from
the source.
• Source writes: Consult its bitmap. If data has not been copied yet then copy source to target
first before allowing the write (copy on write or COW). Update bitmap.
• Target writes: Consult its bitmap. If data has not been copied yet then copy source to target first
before the write (copy on demand). Update bitmap. One exception to copying the source is if
the entire grain is to be written to the target then copying the source is not necessary.
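The four access rules above can be sketched as a toy model. This is illustrative only: one value stands in for each grain, so every write here is effectively a whole-grain write (which is why the target-write path below can skip the copy, per the exception noted above):

```python
# Minimal sketch (not product code) of FlashCopy read/write semantics.
# The bitmap records, per grain, whether the point-in-time data has been
# copied to the target.

class FlashCopyMap:
    def __init__(self, source):
        self.source = source                  # live data, keeps changing
        self.target = [None] * len(source)    # point-in-time view
        self.copied = [False] * len(source)   # the bitmap

    def read_source(self, g):                 # business as usual
        return self.source[g]

    def read_target(self, g):                 # redirect to source if not copied
        return self.target[g] if self.copied[g] else self.source[g]

    def _copy_grain(self, g):
        if not self.copied[g]:
            self.target[g] = self.source[g]
            self.copied[g] = True

    def write_source(self, g, data):          # copy-on-write (COW) first
        self._copy_grain(g)
        self.source[g] = data

    def write_target(self, g, data):          # whole-grain write: no copy
        self.copied[g] = True                 # needed, just mark and write
        self.target[g] = data

fc = FlashCopyMap(source=["a", "b", "c"])
fc.write_source(0, "A")          # grain 0 COWed to target first
assert fc.read_target(0) == "a"  # point-in-time data preserved
assert fc.read_source(0) == "A"
assert fc.read_target(1) == "b"  # not copied yet: redirected to source
```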
[Figure: FlashCopy with no background copy (copy rate 0) - only copy-on-write and copy-on-demand grains are moved; a thin-provisioned target minimizes disk capacity used. Grain size = 256 KB or 64 KB]
For a copyrate=0 FlashCopy invocation the background copy is not performed. The target is often
referred to as a snapshot of the source. After the FlashCopy operation has started, both source and
target volumes can be accessed for read and write operations.
Write activity occurs on the target when:
• Write activity has occurred on the source and the point-in-time data has not been copied to the
target yet. The original source data (based on grain size) must be copied to the target before
the write to the source is permitted. This is known as copy-on-write.
• Write activity has occurred on the target to a subset of the blocks managed by a grain where the
point-in-time data has not been copied to the target yet. The original source data (based on
grain size) has to be copied to the target first.
• Read activity to the target is redirected to the source if the data does not reside on the target.
Since no background copy is performed, using a Thin-Provisioned target minimizes the disk
capacity required.
2. Prepare: Flush write cache for source, discard cache for target, place source volume in
write-through mode.
svctask prestartfcconsistgrp
or
svctask prestartfcmap
State: Preparing, then Prepared.
3. Start: Set metadata, allow I/O, start copy.
svctask startfcconsistgrp
or
svctask startfcmap
State: Copying.
4. Delete
• I/O is briefly paused on the source volumes to ensure ongoing reads and writes below the
cache layer have been completed.
• Internal metadata are set to allow FlashCopy.
• I/O is then resumed on the source volumes.
• The target volumes are made accessible.
• Read and write caching is enabled for both the source and target volumes. Each mapping is
now in the copying state.
Unless a zero copy rate is specified, the background copy operation copies the source to target
until every grain has been copied. At this point, the mapping progresses from the copying state to
the Idle_or_copied state.
Delete: A FlashCopy mapping is persistent by default (not automatically deleted after the source
has been copied to the target). It can be reactivated by preparing and starting again. The delete
event is used to destroy the mapping relationship. A FlashCopy mapping is deleted when the data
is fully copied if the -autodelete option of the mkfcmap command is used. By default this option
isn't used, but it is used when creating a clone via the fastpath in the GUI. Note that one should not
delete the mapping until one either stops using the target volume, or a full copy has completed;
otherwise, it will lead to data corruption.
FlashCopy can be invoked using the CLI or GUI. Scripting using the CLI is also supported.
During the prepare event, writes to the source volume experience additional latency because the
cache is operating in write-through mode while the mapping progresses from preparing to
prepared mode. The target volume is online but not accessible.
The two mechanisms by which a mapping can be stopped are by I/O errors or by command. The
target volume is set offline. Any useful data is lost. To regain access to the target volume start the
mapping again.
If access to the bitmap and metadata has been lost (such as if access to both nodes in an I/O group
has been lost) the FlashCopy mapping is placed in suspended state. In this case, both source and
target volumes are placed offline. When access to metadata becomes available again then the
mapping will return to the copying state and both volumes will become accessible and the
background copy resumed.
The stopping state indicates that the mapping is in the process of transferring data to a dependent
mapping. The behavior of the target volume depends on whether the background copy process had
completed while the mapping was in the copying state. If the copy process had completed then the
target volume remains online while the stopping copy process completes. If the copy process had
not completed then data in the cache is discarded for the target volume. The target volume is taken
offline and the stopping copy process runs. When the data has been copied then a stop complete
asynchronous event is notified. The mapping transitions to the idle_or_copied state if the
background copy has completed, or to the stopped state if it has not. The source volume remains
accessible for I/O.
Stopped: The FlashCopy was stopped either by user command or by an I/O error. When a
FlashCopy mapping is stopped, any useful data in the target volume is lost. Because of this, while
the FlashCopy mapping is in this state, the target volume is in the Offline state. In order to regain
access to the target the mapping must be started again (the previous FlashCopy will be lost) or the
FlashCopy mapping must be deleted. While in the Stopped state any data which was written to the
target volume and was not flushed to disk before the mapping was stopped is pinned in the cache.
It cannot be accessed but does consume resource. This data will be destaged after a subsequent
delete command or discarded during a subsequent prepare command. The source volume is
accessible and read and write caching is enabled for the source.
Suspended: The target has been point-in-time copied from the source, and was in the copying
state. Access to the metadata has been lost, and as a consequence, both source and target
volumes are offline. The background copy process has been halted. When the metadata becomes
available again, the FlashCopy mapping will return to the copying state, access to the source and
target volumes will be restored, and the background copy process resumed. Unflushed data which
was written to the source or target before the FlashCopy was suspended is pinned in the cache,
consuming resources, until the FlashCopy mapping leaves the suspended state.
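The mapping lifecycle described above can be summarized as a small table-driven state machine. This is a simplified illustrative sketch, not the product's internal implementation: the intermediate stopping state and the I/O-error path into stopped are omitted:

```python
# Simplified sketch of the FlashCopy mapping states described above.
# Keys are (current_state, event); values are the next state.

TRANSITIONS = {
    ("idle_or_copied", "prepare"): "preparing",
    ("preparing", "flush_done"): "prepared",
    ("prepared", "start"): "copying",
    ("copying", "copy_done"): "idle_or_copied",
    ("copying", "stop"): "stopped",
    ("copying", "metadata_lost"): "suspended",
    ("suspended", "metadata_restored"): "copying",
    ("stopped", "prepare"): "preparing",   # restarting loses the old copy
}

def step(state, event):
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed in state {state!r}")

# A mapping that survives a metadata outage and finishes its copy:
s = "idle_or_copied"
for e in ("prepare", "flush_done", "start", "metadata_lost",
          "metadata_restored", "copy_done"):
    s = step(s, e)
assert s == "idle_or_copied"
```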
The management GUI supports the FlashCopy functionality with three menu options within the
Copy Services menu option:
• The FlashCopy menu option is designed to be a fast path with extensive use of pre-defined
automatic actions embedded in the FlashCopy presets to create target volumes, mappings, and
consistency groups.
• The Consistency Groups menu option is designed to create, display, and manage related
mappings that need to reside in the same consistency group.
• The FlashCopy Mappings menu option is designed to create, display, and manage the
individual mappings. If mappings reside in a consistency group then this information is also
identified.
FlashCopy mappings are created from all three menu options, but they are created automatically
from the FlashCopy menu.
The ensuing examples are designed to illustrate the FlashCopy functions provided by the Spectrum
Virtualize system as well as the productivity aids added with the GUI.
For fast path FlashCopy processing, select Copy Services > FlashCopy to view a volume list.
Select a volume entry and right-click to select the desired FlashCopy preset.
The FlashSystem management GUI provides three FlashCopy presets to support the three
common use case examples for point-in-time copy deployments.
These presets are templates that implement best practices as defaults to enhance administrative
productivity. For the FlashCopy presets the target volumes can be automatically created and
FlashCopy mappings defined. If multiple volumes are selected then a consistency group to contain
the related mappings is automatically defined as well.
Typical FlashCopy usage examples include:
• Create a target volume such that it is a snapshot of the source (that is, the target contains only
copy-on-write blocks or COW). If deployed with Thin Provisioning technology then the snapshot
might only consume a minimal amount of storage capacity. Use cases for snapshot targets
include:
▪ Backing up source volume to tape media where a full copy of the source on disk is not
needed.
▪ Exploiting Thin Provisioning technology by taking more frequent snapshots of the source
volume and hence facilitate more recovery points for application data.
• Create a target volume that is a full copy, or a clone, of the source where subsequent
resynchronization with the source is expected to be either another full copy or is not needed.
Use cases for clone targets include:
▪ Testing applications with pervasive read/write activities.
▪ Performing what-if modeling or reports generation where using static data is sufficient and
separation of these I/Os from the production environment is paramount.
▪ Obtaining a clone of a corrupted source volume for subsequent troubleshooting or
diagnosis.
• Create a target volume that is to be used as a backup of the source where periodic
resynchronization is expected to be frequent and hence incremental updates of the target would
be more cost effective. Use cases for backup targets include:
▪ Maintaining a consistent standby copy of the source volume on disk to minimize recovery
time.
▪ Implementing business analytics where extensive exploration and investigation of business
data for decision support requires the generated intensive I/O activities to be segregated
from production data while the data store needs to be periodically refreshed.
Both the snapshot and backup use cases address data recovery. The recovery point objective
(RPO) denotes at what point (in terms of time) should the application data be recovered or what
amount of data loss is acceptable. After the application becomes unavailable, the recovery time
objective (RTO) indicates how quickly it is needed to be back online or how much down time is
acceptable.
RPO and RTO business requirements can range from seconds to weeks. A reverse FlashCopy can
restore the data much faster, and almost instantly, and offers a lower RTO as compared to using
and restoring from tape. However, a reverse flash copy requires the Spectrum Virtualize system be
working to recover the data, while offsite tapes do not.
[Figure: FlashCopy asynchronous events - PREPARE_COMPLETED, COPY_COMPLETED, and STOP_COMPLETED are written to the cluster event log and can generate SNMP traps]
FlashCopy events that complete asynchronously are logged and can be used to generate SNMP
traps for notification purposes.
PREPARE_COMPLETED is logged when the FlashCopy mapping or consistency group has
entered the prepared state as a result of a user request to prepare. The user is now able to start (or
stop) the mapping/group.
COPY_COMPLETED is logged when the FlashCopy mapping or consistency group has entered
the idle_or_copied state when it was previously in the copying state. This indicates that the target
volume now contains a complete copy and is no longer dependent on the source volume.
STOP_COMPLETED is logged when the FlashCopy mapping or consistency group has entered
the stopped state as a result of a user request to stop. It is distinct from the error that is logged
when a mapping or group enters the stopped state as a result of an IO error.
FlashCopy
Functionality and overview
Create Snapshot
Create consistency group with multi-select
Incremental Backup
Indirection layer/Bitmap space
IBM Spectrum Protect Snapshot
Automatically creates the target volume, creates the mapping, and starts it.
The snapshot creates a point-in-time backup of production data. The snapshot is not intended to be
an independent copy. Instead, it is used to maintain a view of the production data at the time that
the snapshot is created. Therefore, the snapshot holds only the data from regions of the production
volume that changed since the snapshot was created. Because the snapshot preset uses thin
provisioning, only the capacity that is required for the copy on write activity is used.
To create and start a snapshot, from the Copy Services > FlashCopy window, right-click on the
volume that you want to create a snapshot of or click Actions > Create Snapshot. Upon selection
of the Create Snapshot option, the GUI automatically:
• Creates a target volume using a name based on the source volume name with a suffix of _01
appended for easy identification in the source volume's pool. The real capacity size starts out as
0% of the virtual volume size and will automatically expand as write activity occurs.
• Creates a mapping
• Starts the snapshot
The management GUI:
• Defines a FlashCopy mapping using the mkfcmap command with a background copy rate of 0.
• Starts the mapping using the startfcmap -prep 4 command, where 4 is the object ID of the
mapping, and -prep embeds the FlashCopy prepare process with the start process.
Once the target volume has been created, it is now available to be mapped to host objects for host
I/O. The Snapshot Thin-provisioned volume uses disk space only when updates are made to the
source or target data and not for the entire capacity of a volume copy.
A FlashCopy can be modified as long as the task is running.
IBM_cluster:admin>lsfcmap fcmap0
id 0
name fcmap0
source_vdisk_id 12
source_vdisk_name Test0
target_vdisk_id 14
target_vdisk_name Test0_01
group_id
group_name
status copying
progress 0
copy_rate 0
start_time 190710195930
dependent_mappings 0
autodelete off
clean_progress 100
clean_rate 0
incremental off
difference 100
grain_size 256
...
restore_progress 0
fc_controlled no
IBM_cluster:admin>
(Callouts: progress 0 = 0% COWs; COW = copy-on-write)
All FlashCopy mappings are displayed from the Copy Services > FlashCopy Mappings view.
Observe the default mapping name of fcmap3 assigned to the mapping for the source volume, and
note the current copy progress of 15 percent in the mapping entry. Since this mapping has a copy
rate set to 0, the copy progress represents the copy-on-write (COW) activity.
Use the CLI lsfcmap command with either the object name or ID of the mapping to view detailed
information about a mapping. The mapping grain size can be found in this more verbose output.
The grain size for a FlashCopy mapping bitmap defaults to 256 KB for all but the compressed
volume type; which has a default grain size of 64 KB. The default size value can be overridden if
the CLI is used to define the mapping. However, the best practice recommendation is to use the
default values.
The example shows the status of the source and target volume with no write activity.
IBM_cluster:admin>lsfcmap fcmap0
id 0
name fcmap0
source_vdisk_id 12
source_vdisk_name Test0
target_vdisk_id 14
target_vdisk_name Test0_01
group_id
group_name
status copying
progress 7
copy_rate 0
start_time 190710195930
dependent_mappings 0
autodelete off
clean_progress 100
clean_rate 0
incremental off
difference 100
grain_size 256
...
restore_progress 0
fc_controlled no
IBM_cluster:admin>
(Callouts: progress 7 = 7% COWs; COW = copy-on-write)
The example shows the status of the source and target volume with writes in progress. Since the
background copy rate is set to 0, the progress of 7% shows that 7% of the source volume has been
written to the target volume via the COW process.
When subsequent writes occur on the source volume, the content of the blocks being changed
(written to) is copied to the target volume in order to preserve the point-in-time snapshot target
copy. These blocks are referred to as copy-on-write (COW) blocks; the ‘before’ version of the
content of these blocks is copied as a result of incoming writes to the source. This write activity
causes the real capacity of the thin-provisioned target volume to automatically expand, matching
the quantity of data being written.
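A toy sketch of how real capacity tracks COW activity on a thin-provisioned snapshot target (illustrative only; grain size assumed to be the 256 KB default, and repeated writes to the same grain are copied only once):

```python
# Illustrative: with copy rate 0 and a thin-provisioned target, real
# capacity grows only by the grains that are copied on write.

GRAIN_KB = 256

def real_capacity_kb(cow_grains):
    """Real capacity consumed by a thin snapshot target (metadata ignored)."""
    return len(cow_grains) * GRAIN_KB

cow_grains = set()
for written_grain in (7, 7, 12, 40):   # repeated writes to grain 7
    cow_grains.add(written_grain)      # are only copied once
assert real_capacity_kb(cow_grains) == 3 * GRAIN_KB
```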
It might be worthwhile to emphasize that the FlashCopy operation is based on block copies
controlled by grains of the owning bitmaps. FlashSystem is a block level solution so, by design (and
actually per industry standards), the copy operation has no knowledge of OS logical file structures.
The same information is available by using the CLI lsfcmap command.
FlashCopy
Functionality and overview
Create Snapshot
Create consistency group with multi-select
Incremental Backup
Indirection layer/Bitmap space
IBM Spectrum Protect Snapshot
This topic examines the ability to create a consistency group by selecting multiple mappings to be
managed as a single entity.
Consistency groups
FlashCopy consistency groups are used to group multiple FlashCopy operations that need to be
controlled at the same time
Group can be controlled by starting or stopping with a single operation
Ensures that when stopped for any reason, the I/Os to all group members have all stopped at the same
point in time
í Ensures time consistency across volumes
[Figure: a consistency group containing five mappings, each pairing a source volume with a target volume]
Consistency Groups address the requirement to preserve point-in-time data consistency across
multiple volumes for applications that include related data that spans multiple volumes. For these
volumes, Consistency Groups maintain the integrity of the FlashCopy by ensuring that “dependent
writes” are run in the application’s intended sequence.
When consistency groups are used, starting the consistency group starts all the FlashCopy
mappings at the same point in time. Therefore, administrators who are tasked with starting
FlashCopy creation can do so for all the volumes associated with an application in a single
operation. This provides a crash-consistent copy of the data.
After an individual FlashCopy mapping is added to a Consistency Group, it can be managed as part
of the group only. Operations, such as prepare, start, and stop, are no longer allowed on the
individual mapping.
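The group-management rule above can be sketched with a minimal model (hypothetical classes, not the product's object model): starting the group stamps every mapping with the same point in time, and a grouped mapping can no longer be started individually:

```python
# Illustrative sketch of consistency group semantics, not product code.

class Mapping:
    def __init__(self, name):
        self.name, self.in_group, self.started_at = name, False, None

    def start(self, timestamp):
        if self.in_group:
            raise RuntimeError("manage this mapping through its group")
        self.started_at = timestamp

class ConsistencyGroup:
    def __init__(self):
        self.mappings = []

    def add(self, mapping):
        self.mappings.append(mapping)
        mapping.in_group = True       # individual start/stop now disallowed

    def start(self, timestamp):
        for m in self.mappings:       # one operation, one point in time
            m.started_at = timestamp

grp = ConsistencyGroup()
maps = [Mapping(f"fcmap{i}") for i in range(3)]
for m in maps:
    grp.add(m)
grp.start(timestamp=1000)
assert len({m.started_at for m in maps}) == 1   # same point in time
```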
When multiple volumes are selected from the Copy Services > FlashCopy menu, the GUI presets
operate at the consistency group level (instead of the mapping level). Besides automatically creating
targets and mappings, a consistency group is also defined so that the multiple mappings can be
managed as a single entity. The copy is automatically started at the consistency group level.
Some competitors' products require waiting for the full copy to complete before the target volume
can be accessed, which is a disadvantage compared to Spectrum Virtualize FlashCopy.
Consistency groups can also be created, modified, and deleted with concise, direct CLI commands.
The commands issued by the management GUI for this Clone preset invocation example have
been extracted and highlighted:
• A consistency group is created with the -autodelete parameter, which causes the Spectrum
Virtualize system to automatically delete the consistency group when the background copy
completes.
• Two fully allocated target volumes are created. The names and sizes of the target volumes
derive from the source volumes, following the GUI naming convention for FlashCopy. Two
FlashCopy mappings are defined, each with the default copy rate of 50 (or 2 MBps).
• Once the volumes have been created and the FlashCopy mappings have been established, the
consistency group is automatically started with the startfcconsistgrp command, which
contains an embedded prepare.
The Copy Services > FlashCopy view displays the two defined individual mappings. The Copy
Services > FlashCopy Mappings view shows that the volumes are associated with the same
fccstgrp0 consistency group. The progress bar for each FlashCopy mapping provides a direct view
of the progress of each background copy. This progress data is also provided through the
Background Tasks interface, which is accessible from any GUI view.
svctask chfcmap -cleanrate 50 -copyrate 100 1
(Copy rate increased from the default 2 MBps to 64 MBps)
The background copy has a default copy rate of 50, which can be changed dynamically.
To change the background copy rate, right-click a FlashCopy mapping entry and select Edit
Properties. Drag the Background Copy Rate: slider bar in the Edit FlashCopy Mapping box all the
way to the right to increase the value to 100, then click Save.
The GUI generates the chfcmap command and uses -copyrate to increase the copy rate to 100
for the mapping with the specified ID. The value of 100 causes the background copy rate to
increase from the default 2 MBps to 64 MBps.
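The relationship between the copyrate value and the background copy bandwidth can be sketched as follows (a hedged sketch: the band values follow the copy-rate table commonly published for Spectrum Virtualize, with each band of ten doubling the rate; the helper name is invented):

```python
# Background copy rate value -> approximate bandwidth per mapping.
# Bands of ten double the rate: 1-10 -> 128 KiB/s ... 91-100 -> 64 MiB/s.
# Treat the exact figures as documentation-dependent.
def copyrate_to_kib_per_sec(copyrate: int) -> int:
    if not 0 <= copyrate <= 100:
        raise ValueError("copyrate must be 0-100")
    if copyrate == 0:
        return 0  # 0 disables background copy
    band = (copyrate - 1) // 10  # 0 for 1-10, up to 9 for 91-100
    return 128 * (2 ** band)

print(copyrate_to_kib_per_sec(50))   # default 50 -> 2048 KiB/s (2 MiB/s)
print(copyrate_to_kib_per_sec(100))  # maximum -> 65536 KiB/s (64 MiB/s)
```

This makes the "50 means 2 MBps, 100 means 64 MBps" relationship in the text explicit: five doublings separate the two values.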
• Once the copy operation between the source volume and target volume is complete, the
consistency group and mappings are deleted automatically
From the Copy Services > Consistency Group view, you can see the changes in the fcmp1 target
volume now that the background copy rate has been increased. The consistency group has a status
of Copying as long as one of its mappings is in the copying state. Once the copy operation from the
source volume to the target volume is complete, the -autodelete specification takes effect.
FlashCopy
Functionality and overview
Create Snapshot
Create consistency group with multi-select
Incremental Backup
Indirection layer/Bitmap space
IBM Spectrum Protect Snapshot
This topic reviews the incremental Backup feature, which is used to create point-in-time copies of
data, such as a database, directly on IBM Spectrum Virtualize systems with FlashCopy.
First copy process copies all of the data from the source volume to the target volume
Later …
Copies only the parts of the source or target volumes that changed since the last copy
Reduces the amount of data copied and the time to complete the copy
Start incremental FlashCopy
í Extra bitmap tracks changes to the source volume after the incremental PiT image is created on the target volume
(Figure annotations: full copy of all data the first time; some data changed by apps; only changed data copied)
FlashCopy Backup copies, also known as incremental backups, are designed for situations in which
a full copy of the volume(s) needs to be made on a periodic basis. The incremental feature keeps
track of changes to the source volume that occur after the last Backup FlashCopy was created. The
first time a backup mapping is started, the full volume is copied to the target. Subsequent starts of
the Backup FlashCopy mapping then only have to copy the changes made to the source volume
since the last start of the mapping. Note that this requires an extra bitmap compared to Clone or
Snapshot FlashCopies.
The mappings and bitmap space are affected by the grain size as specified in the mkfcmap
command. The 64 KB grain size provides more copy granularity at the expense of using more bits,
or larger bitmaps, than the 256 KB grain size. To be able to monitor the difference between source
and target, a "difference" value is maintained in the FlashCopy mapping details.
An extra bitmap is created to track changes to the source volume after the incremental backup
mapping is started. It points to the changed data that will need to be copied to the target volume
the next time the mapping is started.
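The change-tracking bitmap described above can be sketched with a toy model (illustrative only; the grain count and function names are invented):

```python
# Simplified model of incremental FlashCopy change tracking.
# One bit per grain: set when the source grain changes after the
# last backup; only those grains are copied on the next start.
GRAINS = 100

change_bitmap = [False] * GRAINS  # cleared after each backup

def host_write(grain: int):
    change_bitmap[grain] = True  # record the changed grain

def start_incremental_backup():
    # Copy only grains whose bit is set, then clear the bitmap.
    to_copy = [g for g in range(GRAINS) if change_bitmap[g]]
    for g in to_copy:
        change_bitmap[g] = False
    return to_copy

# 22 of 100 grains change between backups -> "difference 22%"
for g in range(22):
    host_write(g)
copied = start_incremental_backup()
print(len(copied))  # 22 grains copied instead of all 100
```

The "difference" value shown in the mapping details corresponds to the fraction of set bits in this bitmap.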
(Figure: source volume Test1 and target volume Test1_TGT)
If less automation or more administrator control is desired, a FlashCopy mapping can be manually
defined from the Copy Services > FlashCopy Mappings panel by clicking the Create FlashCopy
Mapping button. This path expects the target volume to have been created already.
From the Create FlashCopy Mapping dialog box, specify the source volume and target volume. The
GUI automatically determines the list of eligible targets. An eligible target volume must be the same
size as the source and must not be serving as a target in another FlashCopy mapping. After a target
has been added, the GUI confirms the source and target pairing. Normally, the target volume is
created before the mapping is created.
From the volume entries, note the UIDs of the source and target volumes; also observe that they
reside in different storage pools representing different storage systems.
The intent is to use incremental FlashCopy; therefore, the Backup preset is selected. You have the
option to change the mapping attributes, such as the background copy rate, before the mapping is
defined. You also have the option to add the mapping to a consistency group. Since this is a
one-volume, one-mapping example, a consistency group is not necessary.
The GUI generates the mkfcmap command, which contains the -incremental parameter. The
incremental copy option can only be specified at mapping definition. In other words, after a mapping
has been created, there is no way to change it to an incremental copy without deleting the mapping
and redefining it.
To start the FlashCopy mapping, right-click the mapping entry and select Start from the menu list.
The management GUI generates the startfcmap command with an embedded prepare to start the
mapping.
• If FlashCopy mapping has a status of Idle, Copied, or Copying, the source and target
volumes can act as independent volumes
From Copy Services > FlashCopy Mappings, you can view the background copying progress. The
FlashCopy mapping has a status of Copying while its background copy is in progress. The target
volume can be used immediately after the start of the FlashCopy, even if the mapping is in the
copying state. However, to ensure I/O to the target volume doesn't interfere with I/O to the source
volume, wait until the mapping is in the Copied state.
If the mapping is incremental and the background copy is complete, the mapping records only the
differences between the source and target volumes. Source and target volumes can be taken
offline if the nodes handling the volumes lose communication with each other.
• Target volume contains the point-in-time content of the source volume even as subsequent
writes occur
(Figure: more writes on source volume Test1)
As data is added to the source volume over a period of time, the host I/O activity continues while
the background copy is in progress. Incremental FlashCopy copies all of the data when you first
start the FlashCopy mapping, and then only the changes when you stop and start the mapping again.
The target volume contains the point-in-time content of the source volume. Even though
subsequent write activity has occurred on the source volume, it isn't reflected on the target volume.
The CLI lsfcmap command is used in this example to view the FlashCopy mapping details. The
copy_rate has been updated to 100. The background copy has completed; hence, the status of
this mapping is idle_or_copied. Recall that the Backup preset was selected, causing this mapping
to be defined with autodelete off and incremental on.
Since this mapping is defined with incremental copy, bitmaps are used to track changes to both the
source and target (recall that reads and writes are supported for both source and target volumes).
The difference value indicates the percentage of grains that have changed between the source and
target volumes.
This difference percentage represents the number of grains that need to be copied from the source
to the target with the next background copy. The value of 22 percent in this example is the result of
data having been added or written to the source volume.
In this example, we are using the CLI startfcmap -prep 0 command to start the mapping. This
command returns after successful submission of a long-running asynchronous job; in this case, the
background incremental copy.
Since it is an incremental copy, only those blocks related to the changed grains (the 22%) are
copied to the target. The immediately submitted lsfcmap command concise output displays a
status of copying and a progress of 77% already.
A short time later, the lsfcmap 0 verbose output shows the completion of the background copy:
progress 100 and difference 0.
After the incremental copy completes, the content of the target volume is updated. At this point, the
content of the two volumes is identical. Subsequent changes to both source and target volumes
are now being tracked by FlashCopy.
(Figure: steps 3 to 5; incremental copy from Test1_Ale to Test1_Stout, then data corruption occurs on the source)
This example illustrates the incremental copy option of FlashCopy. At some point along the way,
data corruption to the source occurs: as a result of subsequent write activity, it is now deemed that
a logical data corruption occurred, perhaps due to a programming bug.
(Figure: debug volume VB1_NEW_DEBUG)
Reverse FlashCopy enables FlashCopy targets to become restore points for the source without
breaking the FlashCopy relationship and without having to wait for the original copy operation to
complete. FlashCopy provides the option to take a point-in-time copy of the corrupted volume data
for debugging purposes. It supports multiple targets (up to 256) and therefore multiple rollback
points.
You also have the ability to create an optional copy of the source volume to be made before the
reverse copy operation starts. This ability to restore back to the original source data can be useful
for diagnostic purposes.
A key advantage of IBM Spectrum Virtualize Reverse FlashCopy is that the original target volume
data is preserved, so if it is being accessed for read (for example, a backup is in progress), that
access can continue.
It's important to understand that you should not start a FlashCopy to a volume that is in use. If the
volume is in use on a host, the host likely has data or structures from that volume reflected in its
RAM, and overwriting data on disk is likely to lead to data corruption and a host crash. The safe
procedure is to delete the disk definition on the host before starting a reverse FlashCopy, then
reconfigure it on the host after the FlashCopy has started.
This image illustrates that the corrupted volume image is to be captured for future problem
determination (step 6A).
Then the reverse copy feature of FlashCopy is used to restore the source volume from the target
volume (step 6B).
(Figure: the corrupted source volume and its clone target)
To obtain a volume copy of the corrupted source volume for later debugging, the fast path Copy
Services > FlashCopy menu is used.
Right-click the source volume entry and select Create Clone. The Clone preset will automatically
generate commands to create the target volume, define the source to target FlashCopy mapping,
and start the background copy.
To restore the source volume to its prior point-in-time copy, a reverse FlashCopy mapping is
defined. This procedure is similar to the one used to create any FlashCopy mapping, though in this
case the source volume is defined as the one with the (hopefully) good data, which was and is
originally a FlashCopy target volume, and the target volume is the original volume whose data will
be written over. In this case, a clone is used to create an exact replica of the source volume on a
target volume. The copy can be changed without impacting the original volume. You also have the
option to add the mapping to a consistency group.
A warning dialog is displayed by the GUI to caution that the target volume is also a source volume
in another mapping. This is normal for a restore, and for this example, it is by design.
(Figure: start the mapping from Test1_TGT to Test1; rename the debug volume to Test1_DEBUG)
Since the reverse mapping had been defined using the Copy Services > FlashCopy Mappings
menu, its status is Idle. The administrator controls when to start the mapping.
The FlashCopy target volume, target_01, which contains the source volume image with corrupted
data, should have a more descriptive name than the default name assigned by the fast path
FlashCopy GUI. You can use the Volumes > Volumes pane to rename the target_01 volume to
"target name_DEBUG".
(Figure: reverse FlashCopy mapping from Test1_TGT to Test1)
To restore the content of the source volume from the target volume, right-click the new
source_TGT volume entry in the reverse FlashCopy mapping and select Start from the pop-up
menu.
Observe the startfcmap command generated by the GUI contains the -restore parameter. The
-restore parameter allows the mapping to be started even if the target volume is being used as a
source in another active FlashCopy mapping.
(Figure: FlashCopy mappings ID 0, ID 1, and ID 2 among Test1, Test1_TGT, and Test1_DEBUG)
As with any FlashCopy mapping, after background copy has started, both the source and target
volumes are available for read/write access.
It's important not to start a FlashCopy to a target volume that is in use, because information about
the volume and its data is potentially reflected in the host RAM, which could lead to a crash or
corruption of the data on the target. Therefore, you should generally remove the target disk
definition from the host, which also ensures the data and disk won't be in use, prior to starting a
mapping to a target volume. Then, after starting the mapping, one can reconfigure the disk on the
host and restart the application.
This view shows the original source volume has been restored to the content level of the target
volume.
Based on the SDD reported disk serial number, the target_DEBUG volume has been assigned to
the host as drive letter E. It contains the corrupted content of the source volume. Reverse
FlashCopy enables FlashCopy targets to become restore points for the source without breaking the
FlashCopy relationship and without having to wait for the original copy operation to complete.
(Figure: option 1, an optional copy of the original relationship between source volume X and target volume Y; option 2, a reverse FlashCopy operation from target volume Z or W back to the source, over the SAN)
The multi-target FlashCopy operation allows several targets for the same source. This can be used
for backup to tape later. Even if the backup is not finished, the user can create an additional target
for the next backup cycle and so on.
Reverse FlashCopy enables FlashCopy targets to become restore points for the source without
breaking the FlashCopy relationship and without having to wait for the original copy operation to
complete. It supports multiple targets (up to 256) and thus multiple rollback points.
A key advantage of the IBM Spectrum Virtualize Multiple Target Reverse FlashCopy function is that
the reverse FlashCopy does not destroy the original target, which allows processes by using the
target, such as a tape backup, to continue uninterrupted.
IBM Spectrum Virtualize also provides the ability to create an optional copy of the source volume to
be made before the reverse copy operation starts. This ability to restore back to the original source
data can be useful for diagnostic purposes.
In this example, an error or virus has corrupted the source of a multi-target FlashCopy operation.
Therefore, the administrator needs to reverse the FlashCopy: the snapshot data on target1 or
target2 can be flashed back to the source. This process is incremental and thus very fast, and the
host can then work with the clean data. If a root cause analysis of the original source is required, an
optional copy can store the corrupted data for later analysis.
Reverse FlashCopy:
• Does not require the original FC copies to have been completed.
• Does not destroy the original target content (for example, does not disrupt tape backups
underway).
• Does allow an optional copy of the corrupted source to be made (for example, for diagnostics)
before starting the reverse copy.
• Does allow any target of the multi-target chain to be used as the restore or reversal point.
If your company must maintain the consistency of data across multiple disk volumes at a backup
location, and the data must be available 24 hours a day, having eight hours of downtime is
unacceptable. Using the FlashCopy service as part of the backup process can help reduce the time
needed for the backup. When the FlashCopy process is started, your application stops for just a
moment, and then immediately resumes.
The thin provisioning of FlashCopies significantly reduces the storage space needed to keep
multiple backups, needing only enough space to hold as much data as has been overwritten since
the PiT image was started for each target volume.
FlashCopy consistency groups ensure data consistency across multiple volumes by putting
dependent volumes in an extended long busy state and then starting the consistency group
mapping for the volumes. This ensures a consistent PiT image for all the volumes in the group.
FlashCopy
Functionality and overview
Create Snapshot
Create consistency group with multi-select
Indirection layer/Bitmap space
IBM Spectrum Protect Snapshot
Remote Copy
In this topic, we review how FlashCopy utilizes bitmaps to track grains in FlashCopy mappings or
mirroring relationships. In addition, we review the functions of Tivoli Storage Manager.
COW doesn't impact write latency
Lower cache used to optimize I/O via read pre-fetch and coalescing of I/Os
FlashCopy prep step just moves data from upper to lower cache
í Much faster because upper cache write data goes directly to the lower cache
(Figure: host reads and writes pass through the upper cache, stage/destage through the FlashCopy indirection layer and FlashCopy bitmap to the lower cache, then I/Os go to the storage controllers; data is copied from the source volume to the target volume)
&RS\ULJKW,%0&RUSRUDWLRQ
,%06SHFWUXP9LUWXDOL]H)ODVK&RS\DQG&RQVLVWHQF\JURXSV
Starting with V7.3 the entire cache subsystem was redesigned and changed. Cache has been
divided into upper and lower cache. Upper cache serves mostly as write cache and hides the write
latency from the hosts and applications. The lower cache is a read/write cache and optimizes I/O to
and from disks.
The copy-on-write process requires that, prior to the host write going to disk, the data being written
over is read and written to the FlashCopy target. This would increase write latency, except that an
acknowledgment is returned to the host once the data is in the I/O group write cache, and this COW
process happens in the background.
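The copy-on-write sequence can be sketched with a minimal single-volume model (illustrative only, not system code; all names are invented):

```python
# Simplified copy-on-write (COW): before a host write lands on a
# source grain that has not yet been copied, the old grain is read
# and written to the target, preserving the point-in-time image.
GRAINS = 4  # grains per volume in this toy model

source = ["s0", "s1", "s2", "s3"]
target = [None] * GRAINS          # point-in-time image, built lazily
copied = [False] * GRAINS         # FlashCopy bitmap: grain copied yet?

def host_write(grain: int, data: str):
    if not copied[grain]:
        target[grain] = source[grain]  # COW: save old data first
        copied[grain] = True
    source[grain] = data               # then apply the host write

def read_target(grain: int):
    # Indirection: uncopied grains are still read from the source.
    return target[grain] if copied[grain] else source[grain]

host_write(1, "new")
print(source[1], read_target(1))  # new s1  (target keeps the PiT copy)
print(read_target(2))             # s2      (redirected to the source)
```

The `read_target` function plays the role of the indirection layer: reads of uncopied grains are redirected to the source, which is why the target is usable immediately after the mapping starts.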
The two-level cache design provides additional performance improvements to the FlashCopy
mechanism. Because the FlashCopy layer is now above the lower cache in the IBM Spectrum
Virtualize software stack, it can benefit from read prefetching and coalescing writes to backend
storage. Also, preparing FlashCopy is much faster because upper cache write data does not have
to go directly to backend storage but to lower cache layer. Additionally, in the multi-target
FlashCopy the target volumes of the same image share cache data. This design is opposite to
previous IBM Spectrum Virtualize code versions where each volume had its own copy of cached
data.
The bitmap governs the I/O redirection (I/O indirection layer) and is maintained in both nodes of
the storage system I/O group to prevent a single point of failure. The FlashCopy volume capacity
per I/O group imposes a maximum limit on the number of FlashCopy mappings that can use bitmap
space from that I/O group. This maximum configuration uses all 4 GiB of bitmap space for the I/O
group and allows no Metro or Global Mirror bitmap space. While a node has a minimum of 64 GB of
memory and a maximum of 256 GB, the maximum memory used in an I/O group for RAID,
mirroring, remote copy, and FlashCopy combined is 552 MB.
Bitmaps are internal storage system data structures used to track which grains in FlashCopy
mappings or mirroring relationships, have been copied from the source volume to the target
volume; or from one copy of a volume to another for Volume Mirroring.
Bitmaps consume bitmap space in each I/O group’s node cache. The maximum amount of memory
used for bitmap space is 552 MB per I/O Group, which is shared among FlashCopy bitmaps,
Remote Copy (Metro/Global Mirroring) bitmaps, Volume Mirroring, and RAID processing bitmaps.
When a storage system cluster is initially created, the default bitmap space assigned is 20 MiB
each for FlashCopy, Remote Copy, and Volume Mirroring; and 40 MB for RAID metadata.
The verbose lsiogrp command output displays, for a given I/O group, the amount of bitmap space
allocated and currently available for each given bitmap space category.
Figure 13-56. Bitmap space and copy capacity (per I/O group)
By default, each I/O group has allotted 20 MB of bitmap space each for FlashCopy, Remote Copy,
and Volume Mirroring.
For FlashCopy, the default 20 MB of bitmap space provides a copy capacity to track 40 TB of target
volume space if the default grain size of 256 KB is used. The 64 KB grain size means four times as
many bits are needed to track the same amount of space; this increased granularity decreases the
total copy capacity to 10 TB or one fourth the amount as the 256 KB grain size. The tradeoff is a
potential decrease in the amount of data that needs to be incrementally copied, which in turn,
reduces copy time and storage system CPU utilization.
Incremental FlashCopy requires tracking changes for both the source and target volumes, thus two
bitmaps are needed for each FlashCopy mapping. Consequently for the default grain size of 256
KB, the total copy capacity is reduced from 40 TB to 20 TB. If the 64 KB grain size is selected, the
total copy capacity is reduced from 10 TB to 5 TB.
For Remote Copy (Metro and Global mirroring), the default 20 MB of bitmap space provides a total
capacity of 40 TB per I/O group; likewise for Volume Mirroring.
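The copy-capacity figures above follow from simple bitmap arithmetic: one bit per grain of target space, and two bitmaps when the mapping is incremental. A small sketch (treating the text's TB and MB as binary TiB and MiB, which matches its round numbers; the function name is invented):

```python
# Copy capacity tracked by a given amount of bitmap space:
# one bit per grain, two bitmaps if the mapping is incremental.
def copy_capacity_tib(bitmap_mib: float, grain_kib: int,
                      incremental: bool = False) -> float:
    bits = bitmap_mib * 1024 * 1024 * 8   # bitmap size in bits
    grains = bits / (2 if incremental else 1)
    bytes_tracked = grains * grain_kib * 1024
    return bytes_tracked / 1024**4        # TiB

print(copy_capacity_tib(20, 256))                    # 40.0
print(copy_capacity_tib(20, 64))                     # 10.0
print(copy_capacity_tib(20, 256, incremental=True))  # 20.0
```

This reproduces the tradeoff in the text: the 64 KB grain needs four times as many bits as the 256 KB grain, and incremental mappings halve the capacity again.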
IBM_cluster:admin>lsiogrp 0
id 0
name io_grp0
node_count 2
vdisk_count 8
host_count 4
flash_copy_total_memory 30.0MB
flash_copy_free_memory 29.9MB
remote_copy_total_memory 10.0MB
remote_copy_free_memory 10.0MB
mirroring_total_memory 25.0MB
mirroring_free_memory 25.0MB
raid_total_memory 40.0MB
raid_free_memory 40.0MB
maintenance no
compression_active no
accessible_vdisk_count 8
compression_supported yes
(Bitmap space has been updated for FlashCopy, Remote Copy (MM/GM), and Volume Mirroring)
The chiogrp command is used to control the amount of bitmap space to be set aside for each I/O
group.
Use the chiogrp command to release the default allotted cache space if the corresponding function
is not licensed. For example, if Metro/Global Mirror is not licensed, change the bitmap space to 0 to
regain the I/O group cache for other uses.
By the same token, if more copy capacity is required, use the chiogrp command to increase the
amount of memory set aside for bitmap space. A maximum of 552 MB, shared among the
FlashCopy, Remote Copy, Volume Mirroring, and RAID functions, can be specified per I/O group.
FlashCopy
Functionality and overview
Create Snapshot
Create consistency group with multi-select
Indirection layer/Bitmap space
IBM Spectrum Protect Snapshot
In this topic, we will review the functions of IBM Spectrum Protect Snapshot (formerly IBM Tivoli
Storage FlashCopy Manager).
IBM Tivoli Storage Manager was rebranded as IBM Spectrum Protect, and IBM Tivoli Storage
FlashCopy Manager is now IBM Spectrum Protect Snapshot. The management of many large
FlashCopy relationships and Consistency Groups is a complex task without a form of automation
for assistance. IBM Spectrum Protect Snapshot provides fast application-aware backups and
restores, leveraging advanced point-in-time image technologies available with the IBM storage
systems. In addition, it provides an optional integration with IBM Spectrum Protect, for long-term
storage of snapshots.
This example shows the integration of IBM Spectrum Protect and IBM Spectrum Protect Snapshot
at a conceptual level. IBM Spectrum Protect Snapshot is supported on SAN Volume Controller,
FlashSystem, the Storwize family, DS8800, and others.
Support for multiple persistent snapshots
Persistent snapshots retained locally
Very fast restore from snapshot
Policy-based management of local persistent snapshots (with optional TSM backup integration)
í Retention policies may be different for local snapshots and copies on TSM server
í Automatic reuse of local snapshot versions as they expire
Restore can be performed from:
í Local snapshot version
í TSM storage hierarchy
IBM Spectrum Protect Snapshot allows you to coordinate and automate host preparation steps
before you issue FlashCopy start commands to ensure that a consistent backup of the application
is made. You can put databases into hot backup mode and flush the file system cache before
starting the FlashCopy.
IBM Spectrum Protect Snapshot also allows for easier management of on-disk backups that use
FlashCopy, and provides a simple interface to perform the “reverse” operation.
This example shows the FlashCopy Manager feature.
Keywords
• FlashCopy
• FlashCopy mapping
• Full background copy
• No background copy
• Consistency groups
• Source
• Target
• Copy rate
• Clone
• Snapshot
• Incremental Backup FlashCopy
• Bitmap space
• Tivoli Storage FlashCopy Manager
Review questions
1. True or False: Both the source and target volumes of a FlashCopy mapping are
available for read/write I/O operations while the background copy is in progress.
3. True or False: Incremental FlashCopy assumes an initial full background copy so that
subsequent background copies only need to copy the changed blocks to
resynchronize the target.
Review answers
1. True or False: Both the source and target volumes of a FlashCopy mapping are
available for read/write I/O operations while the background copy is in progress.
The answer is True.
3. True or False: Incremental FlashCopy assumes an initial full background copy so that
subsequent background copies only need to copy the changed blocks to
resynchronize the target to the source.
The answer is True.
Summary
Overview
The Spectrum Virtualize module examines data mirroring services for mission-critical data using
Remote Mirroring for Metro Mirror (synchronous copy) and Global Mirror (asynchronous copy)
volumes.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
Copyright IBM Corporation
Spectrum Virtualize: remote data mirroring
Remote Copy
Metro Mirror and Global Mirror
Inter-cluster connectivity
Partnership
HyperSwap
Examples of MM/GM configurations
(optional)
(Figure: a natural disaster strikes the primary Site 1; the secondary Site 2 takes over)
Today, when businesses are often required to be operational 24x7x365, and potential disasters due
to weather, power outages, fire, water, or even terrorism pose numerous threats, real-time disaster
recovery and business continuance have become absolutely necessary for many businesses.
Some disasters happen suddenly, stopping all processing at a single point in time; others interrupt
operations in stages that occur over several seconds or even minutes, which is often referred to as
a rolling disaster. Therefore, it is a business-critical requirement to plan for recovery from potential
disasters that cause system failures, whether they are immediate, intermittent, or gradual.
Remote Mirroring
Disaster recovery (DR) for block based volumes
Intercluster remote mirroring operates over distance, with the two copies geographically isolated
between two system clusters
Remote mirroring for a volume is between the I/O groups in the clusters that handle the local and
remote volumes, respectively
Available on nearly all systems capable of running IBM Spectrum Virtualize
(Figure: Site 1 with any Spectrum Virtualize product, such as a V9000, and Site 2 with any Spectrum Virtualize product, such as a V7000; a partnership over FC or IP connects the clusters, and a relationship connects the VDisks)
Remote Copy services provide a single point of control when remote copy is enabled in your
network, regardless of the disk subsystems used, as long as those disk subsystems are supported
by the IBM storage systems.
Synchronous mirroring returns an acknowledgment to the host when the data is written to both the
local and remote Storwize systems. Asynchronous mirroring returns an acknowledgment to the
host when the data is written to the local disk, and processes are used to keep the writes done
remotely in the same order as they are done locally except in error conditions where
synchronization is lost. As a result synchronous remote mirroring has a RPO of 0 while
asynchronous remote mirroring has a RPO >= 0.
The general application of remote copy services is to maintain two real-time synchronized copies of
a volume, known as remote mirroring. The typical requirement for remote mirroring is over distance.
In this case, intercluster copy is used across two storage system clusters using a Fibre Channel
interswitch link (ISL) or alternative SAN distance extension solutions.
While remote mirroring is normally between geographically isolated clusters, it is also possible to
set up this function between two volumes in the same I/O group. However, it is simpler to just
create a volume mirror: failure of one copy is handled automatically and transparently with volume
mirroring, whereas recovering and accessing the auxiliary volume requires manual intervention
with the remote mirroring feature, except in the case of HyperSwap. If the master copy fails, you
can enable an auxiliary copy for I/O operations.
Remote Mirroring definitions
Terminology and descriptions:
Primary site: The location of the main storage system site. The site also contains both the data and the active servers.
Secondary site: A remote site that contains a copy of the data and acts as a standby. In the event of a disaster at the master site, the servers at the secondary site become active and start using the copy of the data.
Master (source) volume: The master volume, source volume, or storage system contains the production copy of the data that the application normally accesses.
Auxiliary (target) volume: The auxiliary volume, target volume, or storage system contains the backup copy of the data and is used for disaster recovery.
Partnership: A partnership is established between Spectrum Virtualize systems to enable remote mirroring.
Relationship: A relationship connects a local volume to a remote volume of the same size that will be mirrored using copy services. The initial copying direction is from master to auxiliary.
A typical remote mirroring configuration involves the following terminology.
• Primary site: the location of the main storage system site. The site also contains both the data
and the active servers.
• Secondary site: a remote site that contains a copy of the data and acts as a standby. In the
event of a disaster at the master site, the servers at the secondary site become active and start
using the copy of the data.
• Master (source) volume: the master volume, source volume, or storage system contains the
production copy of the data that the application normally accesses.
• Auxiliary (target) volume: the auxiliary volume, target volume, or storage system contains the
backup copy of the data and is used for disaster recovery.
• Partnership: a partnership is established between Spectrum Virtualize systems to enable
remote mirroring.
• Relationship: a relationship connects a local volume to a remote volume of the same size that
will be mirrored using copy services. The initial copying direction is from master to auxiliary.
• Metro Mirror: establishes a synchronous mirror of a volume across systems.
• Global Mirror: establishes an asynchronous mirror of a volume across sites.
• Metro Mirroring establishes a synchronous mirror of a volume across systems.
• Global Mirroring establishes an asynchronous mirror of a volume across sites.
Remote Mirroring definitions (continued)
Terminology and descriptions:
Global Mirror w/ Change Volumes (GMCV): Establishes an asynchronous mirror using FlashCopy change volumes for both the master and auxiliary volumes, with a defined RPO.
Interswitch links (ISL): Connect the SANs at both sites for remote mirroring at the SAN layer.
Intercluster link: An IP connection between two Spectrum Virtualize systems that is used for remote mirroring, referred to as IP replication.
Inter-site bandwidth: The bandwidth of the connection between sites, which limits the overall rate at which data can be mirrored.
IP replication: Uses innovative Bridgeworks SANSlide technology to optimize network bandwidth and utilization. This function enables the use of a lower-speed and lower-cost networking infrastructure for data replication.
Bitmaps: Bits track whether grain-sized units of a volume have been mirrored to the other site or not. A bitmap exists for both the master and auxiliary volumes.
Figure: Global Mirror with Change Volumes: FlashCopy mappings link each PRIMARY volume to a
change volume and each SECONDARY volume to a change volume; lower bandwidth requirement
at the expense of higher RPOs.
IBM Remote Copy Services offers several data mirroring methods: a synchronous remote copy
called Metro Mirror (MM), an asynchronous remote copy called Global Mirror (GM), and Global
Mirror with Change Volumes. Each method is discussed in detail.
Figure: The Spectrum Virtualize I/O stack: SCSI target, forwarding, Replication, upper cache,
FlashCopy, volume mirroring, data reduction (thin provisioning, compression), forwarding, lower
cache, log-structured array, virtualization, Easy Tier, forwarding, DRAID/RAID, forwarding, and
SCSI and NVMe initiators.
• Advanced, independent, network-based, SAN-wide replication
• The replication layer resides above the upper cache
• Transaction acknowledgement depends on the replication technology (synchronous versus
asynchronous) used by the host to send the transaction to its local storage
• Optional feature (license required)
Remote Copy
Metro Mirror and Global Mirror
Inter-cluster connectivity
Partnership
HyperSwap
Examples of MM/GM configurations
(optional)
This topic examines the functions of Remote Copy with Metro Mirror and Global Mirror.
Metro Mirror supports copy operations between volumes that are separated by distances up to
300 km. Synchronous mode provides a consistent and continuous copy, which ensures that
updates are committed at both the primary and the secondary sites before the application
considers the updates complete. The host application writes data to the primary site volume but
does not receive the status on the write operation until that write operation is in the storage system
cache at the secondary site. Therefore, the volume at the secondary site is fully up to date and an
exact match of the volume at the primary site if it is needed in a failover.
Metro Mirror provides the simplest way to maintain an identical copy on both the primary and
secondary volumes.
1) The host writes data to the master volume.
2) The local cluster puts the write data into write cache for destage to the master volume and
sends a duplicate write to the remote cluster.
3) The secondary cluster receives the write request, puts the data into write cache for destage to
the auxiliary volume, and returns an acknowledgement to the primary cluster.
4) The primary cluster acknowledges the write to the host.
Synchronous mirroring ensures that data is written to both sites for a volume before an
acknowledgment is returned to the host. This creates an RPO of 0. Having to wait for the remote
write to complete adds latency to writes, and can have a significant impact depending on the
distance and latency between sites.
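The four-step synchronous flow above can be sketched as a minimal simulation. All names here are illustrative, not product code; the point is simply that the host acknowledgment is withheld until both sites hold the write, giving an RPO of 0.

```python
# Sketch of a Metro Mirror (synchronous) write, following the steps above.
# Caches are modeled as plain dicts keyed by logical block address (LBA).

def sync_write(local_cache, remote_cache, lba, data):
    """Return only after BOTH sites hold the write (RPO = 0)."""
    local_cache[lba] = data    # 2) local cluster caches the write for destage
    remote_cache[lba] = data   # 2-3) duplicate write sent to, and cached by,
                               #      the secondary cluster, which acks back
    return "ack"               # 4) only now is the host acknowledged

local, remote = {}, {}
sync_write(local, remote, lba=42, data=b"payload")
assert local == remote         # both sites identical at acknowledgment time
```

Because the function cannot return before the remote write lands, any inter-site latency is added directly to every host write, which is the cost discussed above.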
A Global Mirror relationship allows the host application to receive confirmation of I/O completion
without waiting for updates to have been committed to the secondary site. In asynchronous mode,
Global Mirror enables the distance between two system clusters to be extended while reducing
latency by returning an acknowledgment to the host once the data resides on the primary Spectrum
Virtualize system, without waiting for the data to get to the secondary site.
The Global Mirror function provides the same function as Metro Mirror without requiring the hosts to
wait for the full round-trip delay of the long-distance link; however, some delay can be seen on the
hosts in congested or overloaded environments. This asynchronous copy process reduces the
latency to the host application and facilitates longer distance between the two sites. The secondary
volume is generally less than one second behind the primary volume to minimize the amount of
data that must be recovered in the event of a failure. However, this requires a link provisioned for
peak write bandwidth between the two sites. Consequently, asynchronous mirroring has a
non-zero RPO, and some transactions that completed at the master site will not have completed at
the secondary site and will be lost in the event of a failover to the secondary site.
Make sure that you closely monitor and understand your workload. The distance of Global Mirror
replication is limited primarily by the latency of the WAN Link provided.
Previously, Global Mirroring supported up to 80ms round-trip-time for the GM links to send data to
the remote location. With the release of V7.4, it supports up to 250ms round trip latency and
distances of up to 25,000km. Combined with the performance improvements in the previous
software release, these changes and enhancements have greatly improved the reliability and
performance even over poor links.
In an asynchronous Global Mirror operation, as the host sends write operations to the master
volume, the transaction is processed by cache and an acknowledgment is immediately sent back
to the host issuing the write, before the write operation is mirrored to the cache for the auxiliary
volume. An update for this write operation is sent to the secondary site at a later stage, which
provides the capability to perform Remote Copy over distances exceeding the limitations of
synchronous Remote Mirroring.
An acknowledgment is sent from the secondary system to the primary system for updating of the
bitmap. The local bitmap is updated twice: first when the write data is put in the I/O group cache,
to indicate that the grain being updated hasn't been mirrored to the secondary site, and again after
the acknowledgment from the secondary site, to indicate that the grain is mirrored.
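The two bitmap updates described above can be sketched like this (a simplified model with illustrative names, not the product's internal data structure):

```python
# Sketch of the Global Mirror bitmap protocol described above: the bit for a
# grain is set when the write enters the local cache (grain not yet mirrored)
# and cleared when the secondary site acknowledges the mirrored write.

class GrainBitmap:
    def __init__(self, grains):
        self.dirty = [False] * grains     # False = in sync with remote site

    def on_local_write(self, grain):      # first update: mark out-of-sync
        self.dirty[grain] = True

    def on_remote_ack(self, grain):       # second update: mark mirrored
        self.dirty[grain] = False

bm = GrainBitmap(8)
bm.on_local_write(3)           # host write cached locally, host already acked
assert bm.dirty[3] is True     # grain 3 not yet at the secondary site
bm.on_remote_ack(3)            # secondary acknowledged the mirrored write
assert any(bm.dirty) is False  # all grains synchronized again
```

After a failure, the set bits identify exactly which grains still need to be resynchronized, which is why a bitmap exists for both the master and auxiliary volumes.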
Traditional Global Mirror and Global Mirror with Change Volumes each have their own distinct
strengths. The difference between the two Global Mirror relationship types is determined by the
bandwidth available to support host writes between the two systems.
Traditional Global Mirror operates without cycling: write operations are transmitted to the
secondary volume on a continuous basis, triggered by write activity. The secondary volume is
generally within seconds of the primary volume for all relationships. This achieves a low recovery
point objective (RPO) to minimize the amount of data that must be recovered. However, this
requires a network that supports peak write workloads as well as minimal resource contention at
both sites. Insufficient resources or network congestion might result in error code 1920 and
stopped GM relationships.
Figure: Global Mirror with Change Volumes: host I/O goes to the MASTER volume; space-efficient
FlashCopy change volumes exist at both the MASTER and AUXILIARY sites; requires less link
bandwidth.
A Global Mirror relationship can be implemented with change volumes, leveraging FlashCopy
functionality to mitigate peak bandwidth requirements by addressing average instead of peak
throughput, at the expense of higher recovery point objectives (RPOs).
While the fast path for incremental backup volumes is full copy by default, incremental volumes do
not need to be full copies. Therefore, the only difference between remote mirroring snapshots
and typical FlashCopy snapshots is that remote mirroring tracks the changes to the master
volume each time it cycles, sending only the changes. This function is called Global Mirror with
cycling mode.
Since the transmission of changed data can be smoothed over a longer time period, a lower (cost)
bandwidth option can be deployed. However, if the background copy does not complete within the
cycling period, the next cycle will not start until the prior copy load completes. This would lead to
increased or higher RPOs. It also enables the recovery point objectives to be configurable at the
individual relationship level.
Benefits of Global Mirroring with Change volumes:
• Bursts of host workload are smoothed over time so much lower link bandwidths can be used
• Almost zero impact; only a brief I/O pause when triggering the next change volume (due to
near-instant prepare)
• Less impact to source volumes during prepare, as the prepare is bound by normal destage, not
a forced flush.
Cycle period defaults to 300 seconds (5 minutes); it can be tailored (1 minute to 24 hours).
4) Wait for the specified cycle time to complete, and repeat for each relationship.
When Global Mirror operates in cycling mode, after the initial background copy, changes are
tracked and the data is captured on the master volume, and the changed data is copied to
intermediate change volumes using FlashCopy point-in-time copy technology. This process does
not require the change volume to copy the entire content of the master volume; instead, it only has
to store data for regions of the master volume that change until the next capture step.
The primary change volume is then replicated to the secondary Global Mirror volume at the target
site periodically, which is then captured in another change volume on the target site. This provides
an always consistent image at the target site and protects your data from being inconsistent during
resynchronization.
The mapping between the two sites is updated on the cycling period (60 seconds to 1 day). This
means that the secondary volume can be much further behind the primary volume, and more data
must be recovered or abandoned in the event of a failover. Because the data transfer can be
smoothed over a longer time period, lower bandwidth is needed than for Metro or Global Mirror to
provide an effective solution.
The data stored on the change volume is the original data from the point that FlashCopy captured
the master volume. Capturing the data with FlashCopy provides data consistency. This causes
I/Os to momentarily pause as part of the prepare procedure when creating FlashCopies, which
manifests as a single spike in read and write response times. The spike can be a few tens of
milliseconds for volumes being individually replicated, or up to a second or more if hundreds of
volumes are being replicated as part of a large, 100-volume or more, consistency group. More on
this in a bit.
Simultaneously, the process captures the DR volume's data onto a change volume on the DR site
using FlashCopy. This consistently captures the current state of the DR copy, ensuring we can
revert to a known good copy if connectivity is lost during the next copy step. The data stored on the
change volume on the DR site will be regions changed on the DR copy during the next copy step,
and will consist of the previous data for each region, allowing the reversion of the whole DR copy
back to the state at this capture.
Figure: The master change volume is mirrored to the auxiliary change volume (MASTER site to
AUXILIARY site).
Change volumes are essentially snapshots taken at regular intervals, where we mirror each
snapshot to the remote site. A snapshot of the primary volume is taken, it is mirrored to a volume
at the remote site, and then that volume is applied as a single transaction to the auxiliary volume,
grouping all the writes together. The RPO is configurable and is less than or equal to twice the
snapshot interval: it includes the time since the last snapshot, plus the time to synchronize the
snapshot to the remote site (which should be <= the snapshot interval).
The cycle is repeated and one can specify the cycle interval (300 seconds is the default) yielding a
RPO of 600 seconds or 10 minutes or better.
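The figures above are simple arithmetic on the cycle period, sketched below as a worst-case bound (assuming each sync completes within one cycle, as the text requires):

```python
# Worst-case RPO for Global Mirror with Change Volumes: time since the last
# snapshot (up to one cycle period) plus time to transmit that snapshot
# (assumed to also fit within one cycle), i.e. RPO <= 2 * cycle_period.

def worst_case_rpo(cycle_period_s):
    """Upper bound on data loss, in seconds, for a given cycle period."""
    return 2 * cycle_period_s

assert worst_case_rpo(300) == 600   # default 5-minute cycle: 10-minute RPO
assert worst_case_rpo(60) == 120    # 1-minute cycle: 2-minute RPO
```

If the background copy overruns the cycle period, the next cycle is delayed and the actual RPO grows beyond this bound, as described earlier.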
Change volume requirements:
• Must be in the same I/O group as the master and auxiliary volumes respectively
• Must be the same virtual size as the master and auxiliary volumes
4. The GUI issues the mkvdisk command to create the thin-provisioned change volumes and the
chrcrelationship command to add the newly created change volumes to the relationship
Global Mirror relationships can be easily changed to Global Mirror with Change Volumes by using
the GUI or CLI. Change volumes are space-efficient volumes; the process requires you to create a
master change volume, in the same I/O group as the master volume, and an auxiliary change
volume, in the same I/O group as the auxiliary volume. The GM change volumes must be the same
size as the master and auxiliary volumes. However, they can be any volume type and in any
storage pool.
To add change volumes to an existing remote mirroring relationship, right-click the relationship
and select Create New. The GUI generates the appropriate mkvdisk command to create
thin-provisioned change volumes, based on the size and pool of the master and auxiliary
volumes. The chrcrelationship command is used to add the newly created change volumes to
the relationship.
Once the relationship has been started, grains representing changed blocks from the master
cluster are transmitted to the auxiliary cluster automatically for each cycle.
At the master cluster, the master volume to its change volume FlashCopy mapping is started. It is in
the copying state so that subsequent writes to the master volume cause COW blocks to be copied
to the master change volume. The mapping start time provides an indication of the start time of the
cycling period.
At the auxiliary cluster, the auxiliary volume to its change volume FlashCopy mapping is in the
copying state as well. Before changed blocks are written to the auxiliary volume, its COW blocks
are first copied to the auxiliary change volume.
Recall that the GUI-created change volumes are thin-provisioned. As writes occur, the capacity of
the thin-provisioned target automatically expands.
In this example, the amount of changed blocks to be copied is taking longer than the cycling period
of 180 seconds.
Review the relationship details again - the freeze time or recovery point has been updated with the
start time of the just completed cycle.
The state of a started relationship in cycling mode is always consistent copying, even when the
relationship progress is at 100%.
The change volumes relationship also contains a freeze time in YY/MM/DD/HH/MM format that
records the time of the last consistent image on the auxiliary volume. At copy completion, the
freeze time is updated with the start time of the just completed cycle. The freeze time is reflected in
the relationship entry of both clusters.
A common time reference for both clusters (such as NTP servers) is highly recommended. The
freeze time should match the FlashCopy mapping start time of the master cluster, adjusted for time
zone differences of the cluster partners. The content of the auxiliary volume is consistent with the
content of the master volume as of the freeze time value.
The freeze time is updated as each copy cycle completes. It will vary by the cycling period (180
seconds in this example) as long as each background copy of the changed block completes within
the cycling period. Otherwise the freeze time, or recovery point, lags further behind in time.
Figure: Remote mirroring bitmaps: the Primary/Master volume and its bitmap at the local site, and
the Secondary/Auxiliary volume and its bitmap at the remote site, connected by the network; all
bitmap bits start at 0.
Both the Metro and Global mirroring features have some similarities to PiT images in that two
volumes exist, but we have two bitmaps rather than one. Typically only the primary volume is
accessible via a host system, while the secondary volume is only accessed by the remote copy
processes mirroring the data. Only the primary/local bitmap is used during normal activity, and it
tracks grains on the volume that haven't been mirrored to the remote volume. The inability to
access the auxiliary volume presents some challenges in setting up a secondary site, since a host
typically needs access to a volume to configure it on the system, and typically requires an
application outage to do so. This is usually done as part of setup and during disaster recovery
testing, something that should be done before a disaster, and after data
initialization/synchronization.
The concept of mirroring the data to the remote site is fairly simple, but one needs to understand
the processes that get the volumes initially synchronized, what happens if a site or the inter-site
network fails, and the procedures to fail over to the remote site and to fallback to the primary site.
With remote mirroring, it's necessary that writes at the remote site are done in the same order as at
the local site to ensure data consistency and data integrity. However, this isn't always possible,
such as when initializing the mirrors, or after a site or network failure. In those cases, writes can be
grouped in chronological order, and consistency is preserved as long as each group is written at
the remote site as a single transaction. This also reduces the network bandwidth needed between
sites. For example, if the same disk location is overwritten more than once in the time period of the
group, then only the last write needs to be transmitted over the network. When the data on the
volumes at both sites is the same, the volumes are considered synchronized.
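The bandwidth saving from grouping writes can be shown with a small sketch (hypothetical data; the idea, not the product's algorithm): within one group, repeated writes to the same location collapse to the final value, so less data crosses the link.

```python
# Coalescing writes within one consistency group: repeated writes to the
# same disk location collapse to the last value written, so only one
# transfer per touched location crosses the inter-site link.

def coalesce(writes):
    """writes: ordered (lba, data) pairs captured during one group/cycle."""
    latest = {}
    for lba, data in writes:       # later writes overwrite earlier ones
        latest[lba] = data
    return latest                  # what actually needs to be transmitted

writes = [(10, b"a"), (10, b"b"), (20, b"x"), (10, b"c")]
to_send = coalesce(writes)
assert to_send == {10: b"c", 20: b"x"}   # 4 writes shrink to 2 transfers
```

Applying the whole coalesced group at the remote site as a single transaction is what keeps the remote copy consistent despite the reordering.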
Unless there is some software working with the storage to automate failover, it is a process that
requires manual intervention. Fallback is usually performed as a controlled, manual process,
typically with a short application outage to bring processing back to the primary site.
If asynchronous mirroring is used and updates were done at the primary site that were not
mirrored to the secondary site, then this data will be lost during fallback. Therefore, business
decisions are made regarding whether to discard these transactions or whether to attempt to
recover them. For example, when the primary site recovers power, it may be possible to bring up
the application, allowing the application administrator to examine the last in-flight transactions,
recover them, and manually apply them to the application running at the remote site prior to
initializing the mirroring back to the primary site. The bitmaps from both sites are combined so that
anything that changed at the primary site (and wasn't written to the secondary site) is overwritten
by data from the secondary site prior to fallback, so a fully consistent set of data exists at the
primary site at fallback.
Failure detection
• With inter-site disk mirroring, I/O latency can increase significantly
• Host distinguishes between disk failure and slow I/Os from I/O timeouts
• Remote copy processes distinguish between site/network failure, and slow I/O from I/O
timeouts
▪ Look for I/O timeout configuration settings on the host, storage, and applications
▪ Other methods might be used to detect failure types, including the use of quorum disks/sites
and backup networks
When doing inter-site mirroring, the additional latency to send data between the sites is not
insignificant as compared to the time to write to local disk (which is usually very fast since we
usually write to cache, and often around 1 ms). Therefore, you will have to be aware of I/O timeout
values from an application, system and storage perspective. Often I/O timeouts are adjustable and
have to be adjusted from these various perspectives to distinguish between slow I/Os and I/O
failures.
Some applications have I/O timeouts, hosts have I/O timeouts and remote mirroring for storage
have I/O timeouts. These are something to be aware of, and are best examined as part of failure or
disaster recovery testing for a specific solution implementation.
Remote Copy
Metro Mirror and Global Mirror
Inter-cluster connectivity
Partnership
HyperSwap
Examples of MM/GM configurations
(optional)
This topic examines the connectivity options for configuring the Remote Copy Services Metro
Mirror and Global Mirror.
Figure: Site-A: a Spectrum Virtualize system behind SAN switches, presenting managed disks
(MDisks Blue1, Blue2, Blue3) virtualized from back-end storage such as DS3000, DS4000,
DS5000, DS8000, XIV, Storwize V7000, and flash systems.
The SAN fabrics at the two sites are connected with interswitch links (ISL) or SAN distance
extension solutions. For testing or continuous data protection purposes, intracluster mirroring
operations are also supported.
Implicit with connecting the two clusters with ISLs is that the two fabrics of the two clusters must
merge (excluding non-fabric-merge solutions from SAN vendors); that is, no switch domain ID
conflicts, no conflicting switch operating parameters, and no conflicting zone definitions. Zones
must be set up to allow the I/O groups containing the master and auxiliary volume pairs to
communicate with each other. The ISL is also referred to as the intercluster link. It is used to
control state changes and coordinate updates.
The maximum bandwidth for the background copy processes between the clusters must be
specified. Set this value to less than the bandwidth that can be sustained by the intercluster link. If
the parameter is set to a higher value than the link can sustain, the background copy processes
use only the actual available bandwidth.
Figure: FCIP distance extension: the SAN switches at each site (fronting Storwize family, DS5000,
DS8800, XIV, and flash systems) are connected by an interswitch link (ISL) tunneled across an IP
WAN. Round-trip latency maximum is 250 milliseconds for Global Mirror.
The FCIP protocol extends the distance between SANs by enabling two Fibre Channel switches to
be connected across an IP network. The IP network span is transparent to the FC connection. The
two SANs merge as one fabric as FCIP implements virtual E_Ports or a stretched ISL between the
two ends of the connection. Fibre Channel frames are encapsulated and tunneled through the IP
connection. The UltraNet Edge Router is an example of a product that implements FCIP where the
two edge fabrics merge as one.
SAN extended distance solutions where the SAN fabrics do not merge are also supported. Visit the
IBM Systems Storage support page for more information regarding:
• The Cisco MDS implementation of InterVSAN Routing (IVR).
• The Brocade SAN Multiprotocol Router implementation of logical SANs (LSANs).
Distance extension using extended distance SFPs are also supported.
The term intercluster link is used to generically include the various SAN distance extension options
that enable two IBM Spectrum Virtualize systems to be connected and form a partnership.
Intercluster zoning for ISL partnerships
Figure: Intercluster zoning: zones connect the nodes at site A to the nodes at site B across the
ISLs; hosts and controllers are zoned separately.
In this diagram, the inter-cluster zones, shown in blue, run from every site A node to every site B
node.
The graphic supports the recommended zoning guideline for Remote Copy:
• For each node that is to be zoned to a node in the partner system, zone exactly two Fibre
Channel ports.
• For a dual-redundant fabric, split the two ports from each node between the two fabrics so that
exactly one port from each node is zoned with the partner nodes.
IBM Spectrum Virtualize systems support remote copy over native Internet Protocol (IP)
communication using Ethernet communication links. Native IP replication enables the use of
lower-cost Ethernet connections for remote mirroring as an alternative to Fibre Channel
configurations.
Native IP replication enables replication between any Spectrum Virtualize family members (running
a supported software level) using the built-in networking ports of the cluster nodes. IP replication
includes Bridgeworks SANSlide network optimization technology to bridge storage protocols and
accelerate data transfer over long distances. SANSlide is available at no additional charge.
IP replication requires a 1 Gb or 10 Gb LAN connection. An IBM Spectrum Virtualize system can
have only one port configured in an IP partnership, either port 1 or port 2, but not both. If the
optional 10 Gb Ethernet card is installed in a system, ports 3 and 4 are also available. A system
may be partnered with up to three remote systems; a maximum of one of those can be IP, and the
other two FC. A straightforward setup is recommended:
▪ Two active Ethernet links with two port groups to provide link failover capabilities
▪ At least two I/O groups to provide full IP replication bandwidth if one component is offline
Figure: Site-A to Site-B distance: light through fibre has a latency of 0.005 ms/km, so 100 km
between sites adds a minimum of 1 ms round trip.
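The 1 ms figure is simple arithmetic on light propagation through fibre, sketched below with the per-kilometre value quoted above:

```python
# Round-trip propagation delay through fibre at roughly 0.005 ms per km
# one way, as quoted above. This is a lower bound: routers, switches, and
# other equipment in the path add further latency on top.

def fibre_rtt_ms(distance_km, ms_per_km=0.005):
    """Minimum round-trip time in milliseconds for a given site distance."""
    return 2 * distance_km * ms_per_km   # out and back

assert fibre_rtt_ms(100) == 1.0          # 100 km between sites: 1 ms RTT
assert fibre_rtt_ms(300) == 3.0          # Metro Mirror's 300 km limit: 3 ms
```

For synchronous mirroring this RTT is added to every host write, which is why distance directly shapes application performance.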
In addition to planning for hardware and software resources in a Metro and Global Mirror
configuration, planning also includes requirements for the volumes to be mirrored, bandwidth
sizing, and the performance impact on production workloads.
Sufficient inter-site bandwidth is needed to handle the application write workload, sized to peak
write rates. In addition to needing that bandwidth, the inter-site latency and mirroring method also
affects application performance. In addition to the added latency from sending the light through
Fibre, latency can be added in routers, switches and other equipment. There will be other
communication besides data replication including inter-system communications for the partnership.
Without sufficient bandwidth, communications between the sites will queue adding even further
latency. Often testing and understanding the configuration is required to determine if the customer
can achieve the desired performance with long distance remote mirroring, such as:
• The number of Fibre Channel adapters
• Storage systems
• Switches/routers
• System updates to include updating bitmaps
• System congestion and queueing of data
Inter-site bandwidth
• Synchronous mirroring requires inter-site bandwidth >= peak write rates
▪ Inadequate bandwidth = poor application performance
• Asynchronous mirroring requires inter-site bandwidth >> average write rates
▪ Reduce time-delay of mirrors being too far behind
▪ Limited memory to keep track of un-mirrored writes
▪ Writing frequently to the same grain can result in blocking due to locking of the bitmap
▪ Inadequate bandwidth = loss of synchronization
• Global Mirror with Change Volumes requires bandwidth > average write rates
For Remote Copy and mirroring, you need adequate inter-site bandwidth. When operating with
inadequate bandwidth, as with synchronous mirroring, application performance will be significantly
degraded. This can potentially lead to application crashes as I/Os build up waiting to be completed
to disk. With inadequate bandwidth for asynchronous mirroring, the storage memory can only hold
so much in-flight data, and eventually the mirrors become unsynchronized. With change volumes,
one can usually reduce the frequency at which the snapshots are taken so there is enough time to
synchronize the changes, up to a point. Rather than relying on this workaround, the right approach
is to ensure sufficient bandwidth between sites.
However, if an application writes to the same storage location frequently, and asynchronous
mirroring is used, the mirroring process usually locks the bitmap for the grain that is being updated
and is in-flight to the remote site, which can block another write over that grain to the local disk,
resulting in an impact to application performance. The algorithm used to asynchronously mirror the
data may or may not be able to handle this type of situation, and the application and system
administrators should be aware of it.
Remote Copy
Metro Mirror and Global Mirror
Inter-cluster connectivity
Partnership
HyperSwap
Examples of MM/GM configurations (optional)
This topic examines the functions of creating a Metro Mirror and Global Mirror relationship and
partnership.
Metro Mirror and Global Mirror partnerships define an association between a local cluster and a
remote cluster. Each cluster can maintain up to three partnerships, and each partnership can be
with a single remote cluster. As many as four clusters can be directly associated with each other.
Clusters also become indirectly associated with each other through partnerships. If two clusters
each have a partnership with a third cluster, those two clusters are indirectly associated. A
maximum of four clusters can be directly or indirectly associated.
Multi-cluster mirroring enables the implementation of a consolidated remote site for disaster
recovery. It also can be used in migration scenarios with the objective of consolidating data centers.
A volume can be in only one Metro or Global Mirror relationship, which means a given relationship spans at most two clusters. Up to 8192 relationships (a mix of Metro and Global) are supported per cluster.
[Figure: multiple-system partnership topologies — star (system A can be a central DR site for the three other locations), triangle (A - B, A - C, and B - C), fully connected (A - B, A - C, A - D, B - C, B - D, and C - D), and daisy chained (A, B, C, D partnered one after the other)]
Multiple-system mirroring allows for various partnership topologies. Each IBM storage system can maintain up to three partner system relationships, which allows as many as four systems to be directly associated with each other. This partnership capability enables the implementation of disaster recovery (DR) solutions.
A star topology is one in which all the systems are individually connected to a central point, which provides redundancy capabilities and acts as the central point of communication. With system A being the central point for disaster recovery, all volumes created from the other sites are replicated to system A (B → A, C → A, and D → A). Star topologies are common in large industry networks, as a failure in any one location does not disrupt the entire network.
In a fully connected mesh, every system has a partnership with each of the three other systems. This topology allows volumes to be replicated between any pair of systems, for example A → B, A → C, and B → C.
All of the preceding topologies are valid for an intermix of the IBM Spectrum Virtualize products if the system is set to the replication layer and is running IBM Spectrum Virtualize code 6.3.0 or later.
[Figure: Storwize V7000 system A (layer = replication) partnered with Storwize V7000 system B (layer = storage), each presenting multiple volumes]
A storage system can be configured with one, and only one, other system to form an IP partnership for intercluster mirroring. The partnership is defined on both systems.
You can only create IP partnerships between two systems. This example illustrates a replication partnership of volumes between two sets of storage systems that are connected remotely over an IP connection. The first partnership shows that cluster A and cluster B (two IBM Spectrum Virtualize systems) are in a partnership. These clusters can only operate with a layer value of replication; this layer value cannot be changed. The second partnership shows that cluster A is also in a partnership with system A (Storwize V7000). A system can only be part of one IP partnership.
Systems that are in an IP partnership must have a layer attribute value of either replication or storage. In addition, the layer attribute is also used to enable one storage system (such as a Storwize system) to virtualize and manage another storage system.
Additional rules of usage for the layer attribute in partnerships:
• A Remote Copy partnership can only be formed between two partners with the same layer value. For example, a partnership between an SVC and a Storwize V7000 system requires the Storwize system to have a layer value of replication.
• A Spectrum Virtualize system cluster can virtualize a Storwize system only if the Storwize system has a layer value of storage.
• A Storwize system with a layer value of replication can virtualize another Storwize system with a layer value of storage.
• If the connection is broken between the IBM Spectrum Virtualize systems that are in a partnership, all (intercluster) MM/GM relationships enter a Disconnected state.
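As an illustration, the layer value can be displayed and changed from the CLI. The prompt, system name, and output below are hypothetical, and chsystem -layer can only be issued under certain conditions (for example, when no partnership is already defined); verify the exact syntax against the CLI reference for your code level:

Example:
IBM_Storwize:FS201-V7K:superuser> lssystem
...
layer storage
...
IBM_Storwize:FS201-V7K:superuser> chsystem -layer replication

After this change, the Storwize system can form a Remote Copy partnership with an SVC cluster, which always operates at the replication layer.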
[Figure: SVC cluster A has a direct partnership with Storwize V7000 cluster B, which in turn has a direct partnership with Storwize V7000 cluster C]
In a network of connected systems, the code level of each system must be supported for
interoperability with the code level of every other system in the network. This applies even if there is
no direct partnership in place between systems. For example, in the figure below, even though
system A has no direct partnership with system C, the code levels of A and C must be compatible,
as well as the partnerships between A-B and B-C.
Before a Metro Mirror or Global Mirror relationship or consistency group can be created with a remote system, a partnership between the two clusters must be established. Each partnership can be created over native IP links, connected directly or via Ethernet switches, or over Fibre Channel connections. Remote copy over native IP provides a less expensive alternative to using Fibre Channel configurations. This section describes the configuration settings that are needed for establishing an IP partnership.
Typically, the type of partnership is determined by the connectivity in place: SAN zoning established between the Spectrum Virtualize clusters for Fibre Channel, or an Ethernet IP connection established between the two clusters for native IP.
[GUI example: partnership between SVC_Site1 and V9000_Site2]
Once the SAN zoning between the Spectrum Virtualize clusters is established, or an Ethernet IP connection exists between the two system clusters, the partnership can be defined.
First, a cluster partnership must be defined between the two clusters using Copy Services > Partnerships, then click the Create Partnership button.
For each system you need to:
1. From the local system, select the IP type (IPv4 or IPv6) and enter the IP address for the partner system.
2. Enter the link bandwidth and background copy rate for the partnership. The Bandwidth value at the partnership level defines the maximum background copy rate (in MBps) that Remote Copy allows as the sum of background copy synchronization activity for all relationships from the direction of this cluster to its partner. The background copy bandwidth for a given pair of volumes is set to a maximum of 25 MBps by default. Both of these bandwidth rates can be modified dynamically.
3. Specify the CHAP secret for the partner system if you plan to use Challenge Handshake Authentication Protocol (CHAP) to authenticate connections between the systems in the partnership.
The GUI generates the mkpartnership or mkippartnership command to establish the partnership between the Spectrum Virtualize system clusters. Each cluster has a cluster name and a hexadecimal cluster ID. The GUI-generated commands tend to refer to a cluster by its cluster ID instead of its name.
The partnership is now partially configured; the attempt to form a partnership must also be made from the partner-to-be.
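For example, after the partnership has been defined on only one side, the CLI reports it as partially configured. The names, IDs, and output below are hypothetical and abbreviated:

Example:
IBM_SVC:SVC_Site1:superuser> lspartnership
id               name       location partnership
0000020061C14FFE SVC_Site1  local
000002032E60AF80 FS201-V7K  remote   partially_configured_local

Once the matching command is issued on the partner system, the partnership state changes to fully_configured.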
Repeat the same steps to establish the partnership from the partner system back to the local system. Once completed, a fully configured partnership exists between the two clusters.
Example:
IBM_SVC:Admin> lspartnershipcandidate
id               configured system_name
000002032E60AF80 yes        FS201-V7K
[Figure: apply the commands on both systems — cluster A (S104-V9K) and cluster B (FS201-V7K)]
The CLI can also be used directly to create partnerships between clusters.
Before establishing a partnership using the CLI, you should first verify the systems' availability. This is a prerequisite for creating inter-system Metro or Global Mirror relationships. To do so, enter the lspartnershipcandidate command on each system to list the clustered systems available for setting up a partnership between a local system and a remote system.
Observe that the lspartnershipcandidate command output shows the Spectrum Virtualize
cluster ID, cluster name, as well as its partnership state.
[Figure: partnership between SVC cluster A (S104-V9K) and V9000 cluster B (F201-V7K), both IBM Spectrum Virtualize systems]
You can view all of these parameter values by using the lssystem <system_name> command.
Depending on the type of configuration, you can use either the mkfcpartnership command for traditional Fibre Channel (FC or FCoE) connections or the mkippartnership command for IP-based connections.
As with the GUI, to establish a fully functional MM/GM partnership you must issue this command on both systems.
When the partnership is created, you can specify the bandwidth to be used by the background copy process between the local and remote storage systems. If it is not specified, the bandwidth defaults to 50 MBps. The bandwidth must be set to a value that is less than or equal to the bandwidth that can be sustained by the intercluster link.
These examples show the mkippartnership command being used to establish a partnership between two Spectrum Virtualize clusters.
For ease of identification, the cluster name is always part of the command prompt.
You can use the lssystem command to view the remote system partnership status, such as partially_configured_local_stopped or partially_configured_local. The lssystem command can also be used to list the layer value setting of a system.
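The following is a sketch of the commands issued on both clusters. The cluster names, IP addresses, and bandwidth values are hypothetical; check the exact parameters against the CLI reference for your code level:

Example:
IBM_SVC:S104-V9K:superuser> mkippartnership -type ipv4 -clusterip 10.10.10.20 -linkbandwidthmbits 1000 -backgroundcopyrate 50
IBM_Storwize:FS201-V7K:superuser> mkippartnership -type ipv4 -clusterip 10.10.10.10 -linkbandwidthmbits 1000 -backgroundcopyrate 50

For Fibre Channel connectivity, the equivalent on each side would be mkfcpartnership -linkbandwidthmbits 1000 -backgroundcopyrate 50 <remote_system_name>.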
When two volumes are paired using FlashCopy, they are said to be in a mapping. When two volumes are paired using Metro Mirror or Global Mirror, they are said to be in a relationship.
A volume mirroring relationship allows an application to update one volume while its writes are mirrored to the other volume. The local cluster is referred to as the master cluster and the local volume is called the master volume. The remote cluster is referred to as the auxiliary cluster and the remote volume is called the auxiliary volume.
The master volume initially contains the production data for application access, and the auxiliary volume is a duplicate copy to be used in disaster recovery scenarios. For the duration of the relationship, the master and auxiliary attributes never change, though which volume contains the primary copy of the data and which contains the secondary copy can change. The volumes can be in the same Spectrum Virtualize clustered system or on two separate Spectrum Virtualize systems. For an intracluster copy, they must be in the same I/O group. The master and auxiliary volumes cannot be in an existing relationship, and they cannot be the targets of a FlashCopy mapping.
[GUI example: Create Relationship dialog — writes to master volume Test0_M are mirrored to auxiliary volume Test1_A]
To define a relationship with the GUI from the local cluster, click Copy Services > Remote Copy. Click Actions and select Create Relationship to open the dialog.
Select the type of relationship desired (Metro Mirror, Global Mirror, or Global Mirror with Change
Volumes), and specify whether this relationship is an intracluster or intercluster relationship. For an
intercluster relationship, the remote cluster name needs to be chosen.
This example shows that at the local site, the Test0_M volume is identified as the master volume. Communication between the two clusters causes a list of eligible auxiliary volumes to be sent from the auxiliary site to the master site.
SVC_Site1
Example:
Task started
Creating relationship between Test0 and Test1
Running commands:
svctask mkrcrelationship -aux Test1 -cluster V9000 -master Test0
Synchronizing memory cache
The task is 100% complete.
The task completed with warnings.
[Figure: writes to Test0_M are mirrored to Test1_A]
[Figure: writes at Primary_Site1 (master volume Test0_M) mirrored to Secondary_Site2 (auxiliary volume Test1_A)]
Even though the relationship was defined from the local Site1, the relationship entry exists in both clusters.
Each system GUI displays the relationship with the default name of rcrel followed by a number, an ID value derived from the object IDs of the master and auxiliary volumes.
Based on the choices made when the relationship was defined, the current state of the relationship is Inconsistent_Synchronize. The contents of the two volumes are not consistent and the relationship has not been started yet.
The value of the Primary Volume field also indicates the current copy direction of the relationship. The volume name listed is the 'copy from' volume.
The lsrcrelationship command provides more information than the GUI in a more compact format.
[Figure: master volume Test0_M (online) at Primary_Site1 and auxiliary volume Test1_A (offline) at Secondary_Site2; relationship state Inconsistent_Synchronize]
Currently, the master volume is being used by the host application from the primary site cluster. Since the previous status was listed as inconsistent synchronize, the auxiliary volume contains nothing of value, so it is taken offline by the IBM Spectrum Virtualize system to prevent access by a host system.
Generally, the auxiliary volume is not assigned or configured to a host until the mirroring operation has been stopped. The auxiliary volume is not available for write operations or read operations. Most operating systems require write access to read a volume, and allowing read access can cause application crashes when the primary site is changing data that the secondary site has read into its memory.
Remote mirroring is based on block copies controlled by grains of the owning bitmaps. The copy operation has no knowledge of the OS logical file structures.
Issue the startrcconsistgrp command to start the copy process for a consistency group, to set the direction of copy if it is undefined, and optionally to mark the secondary volumes of the consistency group as clean. This command can be issued from either Spectrum Virtualize cluster using the GUI or CLI.
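For example, with a hypothetical consistency group name (the -primary parameter is needed only when the copy direction is undefined):

Example:
IBM_SVC:SVC_Site1:superuser> startrcconsistgrp -primary master CG_DB

For a stand-alone relationship, the equivalent command is startrcrelationship; for example, startrcrelationship -primary master rcrel0.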
[GUI example: relationship copy progress shown at 5% and 12%]
The content of the master volume is now being copied to the auxiliary volume; the relationship is in the inconsistent_copying state.
The background copy rate for a given pair of volumes is set to 25 MBps by default and is subject to the maximum 'copy from' partnership-level bandwidth value, since multiple background copies might be in progress concurrently. The 25 MBps default rate can be changed with the chsystem -relationshipbandwidthlimit parameter. Be aware that this value is controlled at the cluster level; the changed copy bandwidth value applies to the background copy rate of all relationships.
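As an illustration, the following command raises the per-relationship background copy limit to 50 MBps cluster-wide. The prompt is hypothetical, and because the setting applies to all relationships on the cluster, it should be changed with care:

Example:
IBM_SVC:SVC_Site1:superuser> chsystem -relationshipbandwidthlimit 50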
Once the background copying starts, the lsrcrelationship command can be issued to confirm that the synchronization or background copy is in progress. Progress of the background copy operation can also be monitored with the lsrcrelationshipprogress command, which displays the copy progress as a percentage of completion. When the progress reaches 100% or copy completion, the command output displays a null value for the relationship object ID. Both commands can be issued from either cluster using either the relationship name or object ID. While the background copy is in progress, the status of the auxiliary volume remains offline.
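For example, with a hypothetical relationship name and abbreviated output:

Example:
IBM_SVC:SVC_Site1:superuser> lsrcrelationshipprogress rcrel0
id progress
0  12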
[Figure: relationship in the Idling state — master volume Test0_M and auxiliary volume Test1_A both online and in sync; the data is identical and read/write I/Os are allowed on both volumes. Independent writes to either volume will cause the relationship to be out of sync.]
To gain write access to the auxiliary volume, the stoprcrelationship command must be issued with the -access keyword followed by the object ID or name. This process suspends the mirroring relationship, and the relationship status changes to idling. Use cases for write access to the auxiliary volume include disaster recovery testing, actual disaster recovery, or application relocation. Immediately after the relationship is stopped, it is said to be in_sync.
Since relationship synchronization is no longer in process, the copy direction is now ambiguous. This means that both the master volume and the auxiliary volume in the relationship can perform read and write I/Os.
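For example, with a hypothetical relationship name:

Example:
IBM_SVC:SVC_Site1:superuser> stoprcrelationship -access rcrel0

After this command completes, the relationship reports the idling state and a host can be given read/write access to the auxiliary volume.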
[Figure: copy direction reversed — Test0_M is now the secondary at Site1 and Test1_A the primary at Site2; data has been changed by write I/Os, the relationship is Inconsistent_copying (out of sync), both volumes online, progress 79%]
When write activity has occurred on either or both volumes, the two volumes need to be resynchronized. In this case, the relationship needs to be restarted, and the copy direction must be specified with the -primary parameter. The -force keyword must be coded to acknowledge the out-of-synchronization status. Background copy is invoked to return the relationship to the consistent and synchronized state again.
The remote mirroring software keeps track of changes to the volumes that are not mirrored to the other site, for example, while the network between the sites is down. To ensure the data is consistent and synchronized, any data that has changed on the secondary volume is overwritten with data from the primary volume when mirroring is restarted. In addition, any data that has changed on the primary volume is mirrored to the secondary volume. This avoids having to copy all of the data on the primary volume to the secondary.
In this example, the Test1_A volume is deemed to contain the valid data and the mirroring direction is to be reversed. The startrcrelationship command is coded with -primary aux and -force to return the relationship to the consistent and synchronized state. The -primary aux value indicates that the copy direction is now from the auxiliary volume to the master volume. The auxiliary volume is now functioning in the primary role.
When the relationship state is consistent and synchronized, the copy direction can be changed dynamically with the switchrcrelationship command. In this case, write access to the auxiliary volume is removed and reverted to the -primary master volume. The master volume then functions in the primary role again.
When there are changes in the write access capabilities between the two volumes in the relationship, it is crucial that no outstanding application I/O is in progress when the switch direction command is issued. Typically the host application would be shut down and restarted for every direction change.
The switchrcrelationship command will only succeed when the relationship is in one of the
following states:
• Consistent_synchronized
• Consistent_stopped and in sync
• Idling and in sync
If the relationship is not synchronized, the startrcrelationship command can be used with the
-primary and -force parameters to manage the copy direction.
The administrator must ensure that a consistent copy of the data resides on a volume before
starting a relationship with that volume as the primary volume. Otherwise, one will be left with a
corrupt copy of the data.
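For example, to reverse the copy direction for a hypothetical relationship rcrel0:

Example:
IBM_SVC:SVC_Site1:superuser> startrcrelationship -primary aux -force rcrel0
IBM_SVC:SVC_Site1:superuser> switchrcrelationship -primary aux rcrel0

The first form is used when the relationship is out of sync (-force acknowledges that a resynchronization is required); the second is used when the relationship is consistent and synchronized.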
Consistency Group
[Figure: a consistency group containing two relationships — a 30 GB DATA volume pair and a 1 GB LOG volume pair — providing an atomic copy of multiple volumes]
Similar to FlashCopy, a consistency group enables the grouping of one or more relationships so that they are manipulated in unison.
A consistency group can be applied to a single relationship or a set of relationships. All relationships in the group must have matching master and auxiliary clusters and the same copy direction. Up to 256 Metro Mirror and Global Mirror consistency groups can be created.
The mkrcconsistgrp command creates an empty MM/GM consistency group. The MM/GM consistency group name must be unique across all consistency groups that are known to the systems owning the group. If the consistency group involves two systems, the systems must be in communication throughout the creation process.
The new consistency group does not contain any relationships and is in the Empty state. You can add MM/GM relationships to the group (upon creation or afterward) by using the chrcrelationship command; a relationship remains a stand-alone MM/GM relationship if no consistency group is specified.
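For example, with hypothetical group, cluster, and relationship names:

Example:
IBM_SVC:SVC_Site1:superuser> mkrcconsistgrp -cluster V9000 -name CG_DB
IBM_SVC:SVC_Site1:superuser> chrcrelationship -consistgrp CG_DB rcrel0

The first command creates an empty consistency group spanning the partnership; the second moves an existing stand-alone relationship into it.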
[Figure: intercluster link failure — SVC_Site1 relationship in the idling_disconnected state, partnership not_present; Storwize V7000_Site2 relationship consistent_disconnected with freeze_time 2019/02/09/11/00/29 and the auxiliary volume offline]
If all the intercluster links fail between two clusters, then communication is no longer possible
between the two clusters in the partnership. This section examines the state of relationships when a
pair of Spectrum Virtualize system clusters is disconnected and no longer able to communicate.
To minimize the potential of link failures, it is best practice to have more than one physical link
between sites. These links need to have a different physical routing infrastructure such that the
failure of one link does not affect the other links.
Prior to the connectivity failure between the two clusters, the relationship is in the
consistent_synchronized state when viewed from both clusters.
A total link outage or connectivity failure between the two Spectrum Virtualize systems causes the cluster partnership to change from fully_configured to not_present.
When a total link failure occurs between the two clusters, the copy direction of the relationship does
not change (primary=master) but changed data of the master volume can no longer be sent to the
auxiliary volume.
The relationship state indicates that:
• Site1 is in the idling_disconnected state. Mirroring activity for the volumes is no longer active
because changes can no longer be sent to the auxiliary volume.
• Site2 is in the consistent_disconnected state. At the time of the disconnect, the auxiliary volume was consistent, but it is no longer able to receive updates.
Even though updates are no longer being sent to the auxiliary volume, the changes are tracked by
the mirroring relationship bitmap at the primary site so that the two volumes can be resynchronized
once the connectivity between the clusters is recovered.
The relationship on Site2 captures the date and time of the connectivity failure as freeze_time when its state changes to the consistent_disconnected state. The freeze time is the recovery point of the Test1_A volume content; it is the last known time when data was consistent with the master volume.
[Figure: connectivity restored — the relationship is in the consistent_stopped state on both clusters, each showing freeze_time 2019/02/09/11/00/29; once restarted, the relationship enters inconsistent_copying with the auxiliary volume offline]
After connectivity between the two clusters has been restored, the relationship is in the consistent_stopped state on both clusters (it does not automatically restart).
The freeze time captured at the time of the disconnect is now shown by the relationship on both clusters, since connectivity between the clusters has been restored, enabling the freeze time value to be transmitted.
The stopped state allows a FlashCopy to be performed at Site2 to capture the data on the auxiliary volume as of the freeze time, before restarting the mirroring relationship. It is important to create that PiT copy because the data on the auxiliary volume becomes inconsistent once resynchronization starts, and remains so until synchronization is achieved.
Presuming that I/O activity has occurred, the relationship is out of sync and must be started with -force so that the changed grains on the master volume are copied to the auxiliary volume.
During this background copy, the relationship is in the inconsistent_copying state. The auxiliary volume is set offline during this state, and it will not be brought online until the copy completes and the relationship returns to the consistent_synchronized state.
Out-of-sync indications
Online indicates that the relationship is online and accessible.
Primary_change_offline indicates that the primary change volume in the relationship is offline.
• Reverse the direction of the copy so that the auxiliary volume becomes the primary volume for
the copy process.
• Remote mirroring is also referred to as Remote Copy (rc), hence all the commands reflect the
acronym of “rc”.
• Similar to FlashCopy, SNMP traps can be generated on state change events.
When the two clusters can communicate, the clusters and the relationships spanning them are
described as connected. When they cannot communicate, the clusters and relationships spanning
them are described as disconnected.
Virtualization Feature    Volume roles
FlashCopy                 Source or Target
Mirroring                 Primary or Secondary
As shown in the table above, the FlashCopy target volume can also participate in a Metro/Global Mirror relationship. Constraints on how these functions can be used together are:
• A FlashCopy mapping cannot be manipulated to change the contents of the target volume of
that mapping when the target volume is the primary volume of a Metro Mirror or Global Mirror
relationship that is actively mirroring.
• A FlashCopy mapping must be in the idle_copied state when its target volume is the
secondary volume of a Metro Mirror or Global Mirror relationship.
• The two volumes of a given FlashCopy mapping must be in the same I/O group; when the
target volume is also participating in a Metro/Global Mirror relationship.
For details refer to IBM Knowledge Center https://www.ibm.com/support/knowledgecenter/
for the latest Copy Service commands and supported features.
Property                                                            Maximum    Note
Remote Copy relationships per consistency group                     -          No limit is imposed beyond the Remote Copy relationships per clustered system limit.
Total Metro Mirror and Global Mirror volume capacity per I/O group  1024 TB    This limit is the total capacity for all master and auxiliary volumes in the I/O group.
Total Metro Mirror with Change Volumes relationships per system     256        Change volumes used for active-active relationships do not count towards this limit.
Total FlashCopy volume capacity per I/O group                       4096 TB    4096 TB for a full four-node clustered system with four I/O groups.
Listed here are the most up-to-date (as of this publication) Copy Services configuration limits.
Property                                   Maximum    Note
Inter-cluster IP partnerships per system   1          A system may be partnered with up to three remote systems. A maximum of one of those can be IP; the other two must be Fibre Channel.
I/O groups per system                      2          The nodes from a maximum of two I/O groups per system can be used for IP partnership.
Inter-site links per IP partnership        2          A maximum of two inter-site links can be used between two IP partnership sites.
Ports per node                             1          A maximum of one port per node can be used for IP partnership.
Listed here are the most up-to-date (as of this publication) IP Partnership configuration limits. The configuration limits and restrictions specific to each IBM Spectrum Virtualize software version are available by way of the following website:
http://www-01.ibm.com/support/docview.wss?uid=ssg1S1005419.
Remote Copy
Metro Mirror and Global Mirror
Inter-cluster connectivity
Partnership
HyperSwap
[Figure: HyperSwap — two I/O groups, each holding a primary (p) and a secondary (s) copy of the volumes]
IBM HyperSwap is a high availability solution that offers disaster recovery (DR) protection during resynchronization. It provides most of the disaster recovery (DR) benefits of IBM Spectrum Virtualize, and it uses intra-cluster synchronous remote copy (Metro Mirror) capabilities along with the existing change volume and access I/O group technologies.
The HyperSwap feature is available on systems that can support more than one I/O group,
providing highly available active-active volumes accessible through two sites at up to 300 km apart.
Both nodes of an I/O group must be at the same site. This site must be the same site of the
controllers that provide the managed disks to that I/O group. When managed disks are added to
storage pools, their site attributes must match. This requirement ensures that each copy in a
HyperSwap volume is fully independent and is at a distinct site. HyperSwap volumes are in active-active relationships that automatically run and switch direction according to which copy or copies are online and up to date.
[Figure: HyperSwap volume]
• One HyperSwap volume UID, the same at both sites
• HyperSwap volume R/W accessible from both sites by a host or host cluster
• Supports data reduction pool (DRP) or standard pool
• Read and write I/Os are RC-controlled; FlashCopy maps maintain a change volume at each site
When you create a HyperSwap volume using the GUI, it will have one UID (matching the master
volume UID) and be R/W accessible via a host or host cluster including host clusters using hosts at
both sites. Unlike a failover in a Metro Mirror cluster where one has to make the auxiliary volume
available for use and map it to a host, the HyperSwap architecture makes this transparent to the
host and application using the host disk I/O multi-path code, making the HyperSwap volume point
to the auxiliary volume at failover. Spectrum Virtualize manages whether the master or auxiliary
volume is served up to the host as the HyperSwap volume, and manages the mirroring across the
primary and secondary volumes. This process works to keep them synchronized and consistent, although synchronization will be lost under failure scenarios; a recent PiT image of consistent data is kept at both sites for use in dual failure scenarios via manual intervention.
Change volumes here shouldn't be confused with change volumes used in Global Mirror with
Change Volumes (GMCV). In both cases, change volumes keep a consistent PiT image of the data.
However, GMCVs are used in the global mirroring process to keep the volumes synchronized, while
HyperSwap uses its change volumes to keep the latest PiT consistent image during
resynchronizations such as after a network outage between sites, in case of a second failure before
the first failure is corrected. In both cases the change volumes are created as FlashCopy PiT
images.
If your HyperSwap master/auxiliary volume is in a data reduction pool, then the corresponding
change volume is created with compression enabled. If the master/auxiliary volume is in a standard
pool, the change volume is created as thin-provisioned. If the master/auxiliary volumes are
deduplicated, both must reside in data reduction pools, one at each site. It is recommended that
the change volume reside in the same pool as the corresponding master/auxiliary volume, and
each change volume must be in the same I/O group as its master/auxiliary volume (the master and
auxiliary volumes themselves are in different I/O groups, different sites or failure domains, and
different pools).
The caching node for the primary volume handles all reads and writes for the volume. As a result, if
the host is not in the same site as the primary volume, additional latency is added to I/Os. Spectrum
Virtualize automatically changes the primary site to the site generating 75% or more of the write
I/Os after 20 minutes, which improves performance for those I/Os.
[Figure: HyperSwap failure scenarios: panels show hosts A and B at sites 1 and 2, each site with an I/O group (nodes 1-4), illustrating migrating applications between sites, a host failure, and a storage failure.]
In a HyperSwap configuration, each site is defined as an independent failure domain. If one site
experiences a failure, the other site can continue to operate without disruption. For example, if the
host loses access to I/O group 0, the host multipathing automatically fails over to access data
through I/O group 1. If access to the primary copy of the data is lost, the HyperSwap function
forwards requests to I/O group 1 to service I/O. If I/O group 0 is lost entirely, the host multipathing
again automatically fails over to access data on I/O group 1.
You must also configure a third site to host a quorum device or IP quorum application that provides
an automatic tie-break in case of a link failure between the two main sites. Running the IP quorum
application on a virtual machine (VM) in a cloud data center is an alternative to procuring a third
site. Sites can simply be different power domains, different rooms, different buildings, or different
cities. The site choice affects what kinds of disasters are survivable.
HyperSwap prerequisites
• Cluster spanning two sites
• SANs interconnected with internode zones for intracluster communications, or iSER clustering
• Host connections to storage from both I/O groups
• Site attribute set for nodes, hosts, controllers, MDisks, and storage pools
• System topology = hyperswap, set via chsystem
• System layer = replication, set via chsystem
• Third-site quorum
[Figure: HyperSwap volume: master VDisk and auxiliary VDisk joined by an active-active relationship.]
Before you can create a HyperSwap volume, the specified prerequisites must be satisfied. First, the
cluster and internode communications must be set up, using standard FC communications between
nodes or using iSER clustering.
You must also plan for the host to communicate either with FC ports on both I/O groups for
FC-based volumes, or with Ethernet iSCSI ports on both I/O groups for iSCSI volumes. This
ensures host disk paths to both I/O groups for the HyperSwap volume. While VDisks are by default
only accessible through ports on the caching I/O group, HyperSwap VDisks must be accessible
through ports on both I/O groups.
• You must assign the nodes, hosts, controllers, MDisks, and storage pools to sites, as specified
by the site attribute for each.
• A third-site quorum must exist to perform tie-break decisions and avoid split-brain situations.
• Finally, you must set the system layer and topology attributes to replication and hyperswap,
respectively, via the chsystem command.
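The prerequisite steps above can be sketched with Spectrum Virtualize CLI commands along the following lines. This is only a sketch: the object names, IDs, and site numbers are placeholders, and exact parameters vary by code level, so verify each command against the CLI reference for your release.

```shell
# Name the sites (placeholders; site 3 is typically the quorum site)
chsite -name DC-A 1
chsite -name DC-B 2
# Assign site attributes to nodes, hosts, and external controllers
chnodecanister -site 1 node1
chnodecanister -site 2 node3
chhost -site 1 HostA
chcontroller -site 2 ControllerB
# Switch the system layer, then the topology
chsystem -layer replication
chsystem -topology hyperswap
```

The layer change is listed first because a HyperSwap topology requires the system to be in the replication layer.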
HyperSwap volume configuration is possible only after the IBM Spectrum Virtualize system has
been configured in HyperSwap topology. A HyperSwap topology system can be configured through
site allocation and topology change command-line (CLI) commands. When the system topology is
set to HyperSwap, each node, controller, and host in the system configuration must have a site
attribute set to 1 or 2. After this topology change the GUI will present an option to create a
HyperSwap volume through the management GUI. A HyperSwap volume is created by using the
mkvolume command instead of the standard mkvdisk command.
The GUI uses the same basic volume creation dialog, adding a HyperSwap Details section, and
uses its topology awareness to map storage pools to sites based on the storage pool names
provided. After the volume is created, it is visible in the Volumes > Volumes list.
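As a sketch, a HyperSwap volume can also be created from the CLI with a single mkvolume command naming one pool per site. The volume name, size, and pool names below are hypothetical; check the mkvolume reference for your code level.

```shell
# One command creates the master, auxiliary, and change volumes
# plus the active-active relationship between the two copies
mkvolume -name hs_vol01 -size 100 -unit gb -pool Pool-DC-A:Pool-DC-B
```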
Keywords
• FlashCopy
• Flash mapping
• Full background copy
• No background copy
• Consistency groups
• Target
• Source
• Copy rate
• Clone
• Incremental FlashCopy
• Bitmap space
• Master
• Auxiliary
• Partnership
• Remote Copy
• Relationship
• Metro Mirror
• Synchronous
• Global Mirror
• Asynchronous
• Global Mirror without cycling
• Global Mirror with cycling and change volume
• Freeze time
• Cycle process
Review questions (1 of 2)
1. True or False: Bitmap space for Copy Services is managed in the node cache of the I/O
group.
2. True or False: Upon the restart of a Remote Copy relationship, a 100% background
copy is performed to ensure the master and auxiliary volumes contain the same
content.
3. True or False: The Remote Copy auxiliary volume is write accessible by default when
its relationship has been stopped.
4. True or False: Metro Mirror is a synchronous copy environment which provides for a
recovery point objective of zero.
Review answers (1 of 2)
1. True or False: Bitmap space for Copy Services is managed in the node cache of the I/O
group.
The answer is True.
2. True or False: Upon the restart of a Remote Copy relationship, a 100% background
copy is performed to ensure the master and auxiliary volumes contain the same
content.
The answer is False.
3. True or False: The Remote Copy auxiliary volume is write accessible by default when
its relationship has been stopped.
The answer is False.
4. True or False: Metro Mirror is a synchronous copy environment which provides for a
recovery point objective of zero.
The answer is True.
Review questions (2 of 2)
5. True or False: HyperSwap provides application transparent failover between Metro
Mirror volumes in separate Spectrum Virtualize clusters.
6. True or False: The underlying master, auxiliary and change volumes can reside in a
storage pool comprised of MDisks from controllers at both sites.
Review answers (2 of 2)
5. True or False: HyperSwap provides application transparent failover between Metro
Mirror volumes in separate Spectrum Virtualize clusters.
The answer is False. HyperSwap provides application transparent failover between the master
and auxiliary volumes within a Spectrum Virtualize cluster.
6. True or False: The underlying master, auxiliary and change volumes can reside in a
storage pool comprised of MDisks from controllers at both sites.
The answer is False. HyperSwap storage pools are restricted to each site's failure domain,
comprised of controllers and MDisks residing only at that site, requiring separate independent
storage pools at each site.
Summary
Overview
This module examines administrative management options that assist you in monitoring,
troubleshooting, and servicing an IBM Spectrum Virtualize system environment.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
• Recognize system monitoring features to help
address system issues
• Differentiate between local support assistance
and remote support assistance
• Employ system configuration backup and
extract the backup files
• Summarize the benefits of an SNMP, syslog,
and email server for forwarding alerts and
events
• Recall procedures to upgrade the system
software and the use of hot spare nodes
• Evaluate and filter administrative task
command entries that are captured in the
audit log
• Describe the concept of a quorum
configuration
• Identify the functions of Service Assistant Tool
IBM Spectrum Virtualize administration management
Copyright IBM Corporation
System monitoring
System Health and Alert Status
Event Log: Messages and Alerts
Directed Maintenance Procedure
Event Notifications
Performance Monitoring
Settings
Access
Quorum devices
IBM Service Assistant Tool (SAT)
This module discusses system monitoring, event log detection, and performance monitoring of
an IBM Spectrum Virtualize environment.
Dashboard System Health view
Hardware components displays the health of all components that are specific to the physical
hardware
Logical components displays the health of all logical and virtual components in the management
GUI
Connectivity components displays the health of all components that are related to the system’s
connectivity and relationship to other components or systems
When an issue or warning occurs on the system, the management GUI provides a Health Status
indicator (rightmost area of the control panel). The health status indicator turns red on the system.
Depending on the type of event that occurred, a status alert provides message information or alerts
about internal and external system events, or remote partnerships. The status alert provides a
timestamp and a brief description of the event that occurred. Each alert is a hyperlink that redirects
you to the Monitoring > Events panel for actions.
The System Health section on the Dashboard provides a complete view of the system through tiles
of data that represent the different components that make up the system. A tile contains one type of
component, but it can contain multiple items of the same type.
The following list defines the categories of tiles:
• Hardware components displays the health of all components that are specific to the physical
hardware.
• Logical components displays the health of all logical and virtual components in the
management GUI.
• Connectivity components displays the health of all components that are related to the system’s
connectivity and relationship to other components or systems.
You can expand each tile option to view its components. Any components that have indicated errors
or warnings are displayed first so that components that require attention have higher visibility.
Healthy components are sorted in order of importance in day-to-day use. Each component has a
More Details option that allows you to view component properties and information.
The IBM Spectrum Virtualize GUI not only allows you to monitor capacity and its component
details, it also provides a visible indication that the system is operating healthily or that an
issue has occurred. The system reports all informational messages, warnings, and errors related to
any changes detected by the system to the event log.
Events added to the log are classified as either alerts or messages based on the following criteria:
• An alert is logged when the event requires an action. These errors can include hardware errors
in the system itself as well as errors about other components of the entire cluster. Certain alerts
have an associated error code, which defines the service action that is required. The service
actions are automated through the fix procedures. If configured, a call home to IBM by way of
email is generated to request assistance or replacement parts. Messages are fixed when you
acknowledge reading them and mark them as fixed. If the alert does not have an error code, the
alert represents an unexpected change in the state. This situation must be investigated to
determine whether this unexpected change represents a failure. Investigate the cause of an
alert and resolve it as soon as it is reported.
• A message is logged when a change that is expected is reported, for instance, when an array
build completes.
Each event recorded in the event log includes fields with information that can be used to diagnose
problems. Each event has a time stamp that indicates when the action occurred or the command
was submitted on the system.
When logs are displayed in the command-line interface, the time stamps for the logs in CLI are the
system time. However, when logs are displayed in the management GUI, the time stamps are
translated to the local time where the web browser is running.
Events can be filtered to sort them according to the need or export them to the external
comma-separated values (CSV) file.
The primary debug tool for the system is the event log, which can be accessed from the
management GUI at Monitoring > Events or by issuing the lseventlog CLI command.
Like the other menu options, the Events window allows you to filter on and add many other
parameters related to events.
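For example, assuming an SSH session to the cluster, the event log might be queried from the CLI along these lines. The parameters shown are illustrative and not guaranteed for every code level; check the lseventlog entry in the CLI reference for your release.

```shell
# Show the 10 most recent events, most severe first, excluding fixed events
lseventlog -order severity -fixed no -count 10
# Show only alerts (exclude informational messages)
lseventlog -alert -count 20
```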
Errors with an error code might direct you to replace a hardware component using the directed
maintenance procedure (DMP) step-by-step guidance, while ensuring that sufficient redundancy is
maintained in the system environment.
A Run Fix procedure is a wizard that helps you troubleshoot and correct the cause of an error.
Certain fix procedures will reconfigure the system, based on your responses; ensure that actions
are carried out in the correct sequence; and, prevent or mitigate the loss of data. For this reason,
you must always run the fix procedure to fix an error, even if the fix might seem obvious. The fix
procedure might bring the system out of a Degraded state and into a Healthy state.
In a normal situation during daily administration of the Spectrum Virtualize system, you are
unlikely to see error events. However, as event messages and alerts are displayed, there might be
a continuing flow of informational messages. Therefore, the Events panel typically displays only
recommended actions.
In this scenario, a status alert indicates an unresolved event caused by a room temperature
that is too high, which might cause the system to overheat and eventually shut down if the error
situation is not fixed.
To run the fix procedure for the error with the highest priority, click Recommended Action at the
top of the Event page and click Run Fix Procedure. When you fix higher priority events first, the
system can often automatically mark lower priority events as fixed.
While the Recommended Actions filter is active, the event list shows only alerts for errors that have
not been fixed, sorted in order of priority. The first event in this list is the same as the event
displayed in the Recommended Action panel at the top of the Event page of the management GUI.
If it is necessary to fix errors in a different order, select an error alert in the event log and then click
Action > Run Fix Procedure.
Selecting Run Fix Procedure brings up the first window of the DMP, which shows the first step of
the procedure. In this example, the system reports that drive 2 (flash module 2) in slot 5 is
measuring a temperature that is too high. In addition, the system has verified that all four
fans in both canisters are operational and online.
In the next phase of the DMP procedure, the user is asked to verify the reported events with a few
more related inputs. In this case, it is the room temperature that needs verification.
Suggestions are provided that could be probable indications or solutions to the event. Overheating
might be caused by blocked air vents, incorrectly mounted blank carriers in a flash module slot, or a
room temperature that is too high.
Once the error is fixed, the system returns to a healthy status from the earlier degraded status. The
event log is also updated.
Event notifications
• Configure the system to alert the user and IBM when new events are added to the system
• Can choose to receive alerts about:
ƒ Errors (for example, hardware faults inside the system)
ƒ Warnings (errors detected in the environment)
ƒ Info (for example, asynchronous progress messages)
ƒ Inventory (email only)
• Alerting methods are:
ƒ SNMP traps
ƒ Syslog messages
ƒ Email call home
• Call home to IBM is performed using email
ƒ Will send Errors and Inventory back to an IBM email address to automatically open PMRs
ƒ IBM will call the customer
You can configure the IBM storage system to receive automated notifications when certain events
occur in the system, such as a file system approaching its size limit, a quota being exceeded, or a
CPU becoming overloaded.
Event notifications can be configured using the GUI Settings > Event Notifications panel or through
CLI commands. The IBM system uses Simple Network Management Protocol (SNMP) traps, syslog
messages, and Call Home email to notify you and the IBM Support Center when significant events
are detected. Any combination of these notification methods can be used simultaneously.
Notifications are normally sent immediately after an event is raised. However, there are events that
can occur because of service actions that are being performed. If a recommended service action is
active then these events are notified only if they are still unfixed when the service action completes.
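As an illustrative sketch, the three notification methods can also be configured from the CLI roughly as follows. The server addresses and recipient address are placeholders, and the exact parameter names depend on your code level, so verify against the command reference before use.

```shell
# Email (Call Home): define an SMTP server and an email recipient
mkemailserver -ip 192.0.2.10 -port 25
mkemailuser -address storage-admin@example.com -error on -warning on
# SNMP: send error and warning traps to a manager
mksnmpserver -ip 192.0.2.20 -community public -error on -warning on -info off
# Syslog: forward events to a central syslog server
mksyslogserver -ip 192.0.2.30
```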
For a fast reaction to problems, IBM highly recommends configuring Call Home. Call Home is
normally configured when your system is first installed. Call Home allows the IBM storage system
to electronically transmit operational and error-related data to IBM and other users through a
Simple Mail Transfer Protocol (SMTP) server connection, in the form of an event notification email.
Call Home event notifications contain only metadata and customer contact information, sent to
specified recipients and the IBM Support Center. This data is used by IBM to proactively respond
to failures. Configuring Call Home reduces the response time for IBM Support to address issues.
Email notifications
• Call Home support is initiated for the
following reasons or types of data
ƒ Problem or event notification:
í Data is sent when there is a problem or event
that might require the attention of IBM service
personnel
ƒ Inventory information:
í A notification is sent to provide the necessary
status and hardware information to IBM service
personnel
Call home automatically notifies IBM service personnel when errors occur in the hardware
components of the system or sends data for error analysis and resolution.
The SMTP server must be configured to send emails and to allow email relays from the storage
system cluster IP address. Click Settings > Event Notifications, then select Email and Enable
Notifications to configure the email settings, including contact information and email recipients. A
test function can be invoked to verify communication infrastructure.
SNMP notifications
• Standard protocol for managing networks and exchanging messages
• Identify servers or managers using Settings > Notifications > SNMP
ƒ SNMP server can be configured to receive all or a subset of event types
í Up to six SNMP servers can be configured
ƒ Use the Management Information Base (MIB) to read and interpret these system events
í Available from the system support website
The Simple Network Management Protocol (SNMP) is a standard protocol for managing networks
and exchanging messages. The system can send SNMP messages that notify personnel about an
event. You can use an SNMP manager to view the SNMP messages that the system sends. Up to
six SNMP servers can be configured. Customers often have a trouble ticketing software within an
operations center which receives SNMP messages to manage handling each issue.
You can also use the Management Information Base (MIB) file for SNMP to configure a network
management program to receive SNMP messages that are sent by the system. This file can be
used with SNMP messages from all versions of the software to read and interpret these
FlashSystem events.
To configure the SNMP server, identify the management server IP address, remote server port
number, and community name so that the SNMP messages generated by the system can be viewed
from the identified SNMP server. Each detected event is assigned a notification type of error,
warning, or information. The SNMP server can be configured to receive all or a subset of these
types of events.
Syslog notifications
• Standard protocol for forwarding log messages from a sender to a receiver on an IP network
• Identify servers or managers using Settings > Notifications > Syslog
ƒ Up to a maximum of six syslog servers
The syslog protocol is a standard protocol for forwarding log messages from a sender to a receiver
on an IP network. Click Settings > Notifications, then select Syslog to identify a syslog server.
The IP network can be either IPv4 or IPv6. The system can send syslog messages to a syslog
server, which consolidates syslog messages into a single file. Other software can then be used to
notify a user about the messages, for example via email.
Syslog error event logging is available to enable the integration of FlashSystem events with an
enterprise’s central management repository.
The system can transmit syslog messages in either expanded or concise format. You can use a
syslog manager to view the syslog messages that the system sends. The system uses the User
Datagram Protocol (UDP) to transmit the syslog message. You can specify up to a maximum of six
syslog servers.
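As background on the syslog protocol itself (general protocol behavior, not specific to Spectrum Virtualize), each UDP syslog message carries a priority value computed as facility × 8 + severity. A small sketch:

```shell
# Compute a syslog PRI value: facility local0 (16), severity err (3)
facility=16
severity=3
pri=$(( facility * 8 + severity ))
echo "<${pri}>"   # prints <131>
```

A receiver decodes the facility and severity from the PRI by integer division and remainder, which is how a syslog server sorts forwarded storage events by severity.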
IBM Spectrum Virtualize GUI uses real-time statistics to monitor CPU utilization, volume, interface,
and MDisk bandwidth of your system and nodes. Each graph represents five minutes of collected
statistics and provides a means of assessing the overall performance of your system. These
statistics summarize the overall performance health of the system and can be used to monitor
trends in bandwidth and CPU utilization. For example, you can monitor the interfaces, such as
Fibre Channel or SAS interfaces, to determine if the host data-transfer rate is different from the
expected rate.
There are three performance statistics that can help identify possible congestion points and
incorrectly configured infrastructure within the system and external to it:
• Bandwidth: System bandwidth bounds applications that serve large amounts of data per I/O
operation. The read and write data paths affect each other minimally, but interactions between
the two are nuanced and can greatly affect overall performance.
• IOPS: System IOPS measures the number of external I/O operations being serviced per second.
There is an inverse relationship between the size of an I/O operation and the number of IOPS,
as it takes longer to service larger I/Os than smaller I/Os. Storage performance varies with
I/O size; thus separate metrics exist for MB/s and IOPS. For large I/Os there is a maximum
throughput, reported in MB/s; for small I/Os there is an IOPS limit. While one can convert IOPS
to MB/s for a specific I/O size, the MB/s limit for large I/Os (for a particular setup) is reached
at a much smaller IOPS rate than the IOPS limit for small I/Os.
• Latency: System latency comprises four types of latency: average, maximum, read, and write.
Average latency is the average latency for all I/Os. As the IOPS rate increases, so does latency,
due to queueing for resources.
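The IOPS-to-MB/s relationship above can be made concrete with invented numbers (these figures illustrate the arithmetic only; they are not measured limits of any system):

```shell
# 10,000 IOPS at 8 KiB per I/O is a modest throughput:
iops=10000
io_size_kib=8
mbps=$(( iops * io_size_kib / 1024 ))
echo "${mbps} MiB/s"          # 78 MiB/s
# The same 78 MiB/s moved as 256 KiB I/Os needs far fewer IOPS:
iops_large=$(( mbps * 1024 / 256 ))
echo "${iops_large} IOPS"     # 312 IOPS
```

This is why a system can be IOPS-bound for small I/Os yet bandwidth-bound for large ones.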
• Maximum latency represents the longest it takes for a single I/O operation to complete for the
interval.
• Read and write latency are reported separately, mostly because write latency benefits from
write cache (on the system cluster and/or backend storage if the system cluster is in write
through mode), while read latency typically requires reading the data from storage. Thus we
typically expect better latencies for writes. The average latency will be affected by the read/write
ratio as a result.
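A small numeric sketch of how the read/write ratio shifts the average (the latencies and ratio are invented for illustration): with 70% reads at 0.50 ms and 30% writes at 0.10 ms, the average latency is 0.7 × 0.50 + 0.3 × 0.10 = 0.38 ms.

```shell
# Average latency weighted by the read/write ratio (illustrative numbers)
read_ratio=0.7;  read_ms=0.50
write_ratio=0.3; write_ms=0.10
avg=$(awk -v r="$read_ratio" -v rl="$read_ms" -v w="$write_ratio" -v wl="$write_ms" \
      'BEGIN { printf "%.2f", r*rl + w*wl }')
echo "${avg} ms"   # prints 0.38 ms
```

Shifting the same workload toward writes (which benefit from write cache) would pull the average down even though neither individual latency changed.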
You can also select node-level statistics, which can help you determine the performance impact of
a specific node. As with system statistics, node statistics help you to evaluate whether the node is
operating within normal performance metrics.
System monitoring
Settings
System Data Collection
Service IP Management
Network Interface Connectivity
System License
Software Upgrade
GUI Preferences
Access
Quorum devices
IBM Service Assistant Tool (SAT)
This module covers various configuration settings to collect and download support data, configure
service IP addresses, monitor network interfaces, and establish remote authentication. We also
review procedures to update the system software and to extract support logs.
The IBM Support Assistant (ISA) helps you to resolve questions and problems with IBM software
products by providing access to support-related information and the ability to complete
troubleshooting and maintenance tasks.
The ISA is available at no charge to install on your computer; you then install the relevant product
add-ons. ISA has a built-in user guide, and the ISA download package includes a quick start
installation and configuration guide.
For clients who purchased IBM Support with the Enterprise Class Support (ECS) 3-year warranty,
IBM Remote Support Assistance is available as part of the service agreement. This allows IBM
Support engineers to log in remotely, only with the client's explicit permission, to proactively gather
logs, do problem determination, and service the cluster (such as performing system or upgrade
recoveries). Remote support can also be enabled for clients who have not purchased the ECS
service agreement; in this case IBM will evaluate the requirements to remote in for assistance.
Enabling remote support assistance
Typically, both local and remote support assistance features are configured during system setup (if
part of the service agreement). However, they can be enabled after system setup using the
management GUI or the command-line interface.
During this process, you specify whether you want support personnel to access the system using
local support assistance only, if you have restrictions that require on-site support, or using both
on-site and remote access. You also determine whether IBM Support can start a session at
any time or whether permission must be granted by the client for each access.
The support center responds to manual and automatic service requests from the system using
default support center IP addresses. Remote support through a proxy server is optional, for network
configurations using a firewall or for systems without a direct connection to the network.
Remote Support Assistance user mode
System generates remote support access user IDs (rsa_IDs)
ƒ Requires challenge-response authentication
Monitor user / Privilege user
After Remote Support Assistance is enabled, the system generates two user IDs. These remote
support access user IDs (rsa_IDs) enable remote access and are used only by IBM Support
personnel.
When logging in with an rsa_ID, a shared token is also generated by the system and sent to the
support center. The support personnel must respond to the challenge code presented with a
response code received from the IBM Support Center. Service personnel have three
attempts to enter the correct response code. After three failed attempts, the system generates a
new random challenge and support personnel must obtain a new response code.
IBM Support personnel with a Privilege rsa_ID operate in a Restricted Administrator role, with
support access to perform administrator tasks to help solve problems on the system. However, this
role restricts these users from deleting volumes or pools, unmapping hosts, or creating, deleting, or
changing users.
IBM Support personnel with a Monitor rsa_ID are assigned the Monitor role, to view, collect, and
monitor logs and errors to determine the solution to problems on the system.
Remote Support topology
[Figure: Remote Support topology example: the IBM system with RSC on the client's network connects, optionally through a client-owned proxy server, across the Internet and customer firewall to the IBM infrastructure, where a front server in the DMZ and a back-end server behind the IBM firewall (with an IBM secure proxy server) give an IBM Support engineer access to perform remote support assistance, once the client grants access.]
This example shows a remote support topology where IBM might access a client system. When
necessary, the client can use an optional proxy server to connect to the IBM-owned device that
communicates with the IBM Support Assistant server. Internet connections from the client to IBM
Support are transmitted through secure tunnels to the IBM front server and on to the back-end
server, where IBM Support engineers initiate remote support assistance. This process runs on
all nodes independently, each creating a secure tunnel to the back-end server. The cluster and any
service commands implemented on the system are audited and stamped with a unique identifier
that indicates the session and the tasks performed. IBM Support can deliver support at levels T1
through T4, including more difficult tasks such as concurrent code upgrades.
A problem determination process might require additional information from the system cluster
for analysis by IBM Support personnel. This data collection can be performed from the GUI, or
from the CLI using the svc_snap command. An alternative to the management GUI is to use the
Service Assistant GUI to download support information. This path might be necessary if, due to
an error condition, the management GUI is unavailable. The GUI provides the simpler download
of support information.
From the Settings > Support window, you can select the Manual Upload Instructions to
download a support package. If support assistance is configured on your system, you can either
automatically or manually upload new support packages to the support center to help analyze
and resolve errors on the system.
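A minimal CLI sketch of collecting and then retrieving a support package follows. The snap file name pattern is illustrative (actual names embed the node identifier and a timestamp), and the workstation copy is shown with scp; verify the commands against the CLI reference for your release.

```shell
# On the cluster: collect a standard support package
svc_snap
# List the dump and snap files held on the configuration node
lsdumps
# From a workstation: copy the snap files off the cluster
scp "superuser@cluster_ip:/dumps/snap.*.tgz" .
```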
Standard logs
• Requests log files to be uploaded to a specific problem
• Can take longer to download
IBM Support usually requests that you send the standard logs plus new statesaves. These
logs can take from minutes to hours to download, depending on the situation and the size of the
support package. The destination of the support package file is the system where the web
browser was launched; the system GUI prompts you to save the support package to the local
machine used for accessing the GUI. In addition, the system preserves a copy of the support
data on the configuration node. IBM Support usually requests that log files be uploaded against
a specific problem management record (PMR) number, using one of several upload methods
to IBM.
The system configuration data is stored on all nodes in the cluster and is internally hardened so that
in normal circumstances the system should never lose its configuration settings. However, in
exceptional circumstances this metadata might become corrupted or lost.
The configuration backup can be collected by using the management GUI or the CLI. The CLI
command svcconfig backup backs up the cluster configuration metadata to the /tmp directory
on the configuration node. These files are typically downloaded or copied from the cluster for
safekeeping. It is good practice to first issue the svcconfig clear -all command to delete
existing copies of the backup files and then perform the configuration backup. The svcconfig
backup command usually completes within a minute, although the time taken depends on the
configuration. The configuration backup XML can be kept for recovery of the storage
configuration when required. The collected configuration backup files are listed in the
management GUI Support window, where all the files in the /tmp directory are available for
download.
The application user data is not backed up as part of this process.
The IBM Support Center should be consulted before any configuration data restore activity is
attempted.
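As an illustrative sketch (not a verbatim session), the backup sequence might look like the following; the cluster name, address, and the workstation copy command are placeholders:

```
IBM_cluster:superuser>svcconfig clear -all
IBM_cluster:superuser>svcconfig backup
.....
CMMVC6155I SVCCONFIG processing completed successfully

# From an administrator workstation, copy the files off the cluster:
workstation$ pscp -unsafe superuser@cluster_ip:/tmp/svc.config.backup.* ./config-backups/
```

The resulting svc.config.backup.xml file should then be stored off the cluster for safekeeping.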
In addition to the /tmp directory, a copy of the config.backup.xml file from the svcconfig backup
command is also kept in the /dumps directory.
The cluster also creates its own set of configuration metadata backup files automatically each
day at 1 a.m. local time. These files are created in the /dumps directory and contain 'cron' in their
file names.
Right-clicking a file entry provides another method to download backup files.
The content of the configuration backup files can be viewed using a web browser or a text
processing tool such as WordPad. This output is from the copy of the backup file extracted from the
/dumps directory using the GUI. It contains the same data as the file in the /tmp directory.
The configuration backup is in the form of XML files, which can be opened in Notepad or similar
utilities to verify the timestamp and other related information.
You can also issue the lsdumps command with -prefix /dumps/audit to list the audit files on disk.
These files can be downloaded from the cluster for later analysis, should they be required for
problem determination. The file entries are in readable text format.
Can be used to unconfigure a service IP address by clearing the IPv4 or IPv6 settings, or to set the
service IP address for each node in the cluster. Only the superuser ID is authorized for access.
Can be used to modify iSCSI connections, host attachment, and remote copy.
Displays the connectivity between nodes and other storage systems and hosts that are attached
through the Fibre Channel network.
Used to specify specific ports to prevent communication between nodes in the local system or
between nodes in a remote-copy partnership.
Each IBM Spectrum Virtualize system requires three IP addresses for management:
• A cluster IP address and an IP address to access the Service Assistant on each node.
▪ Each of the three IP addresses must be a unique value.
▪ Service IP addresses can be configured from the management GUI or CLI, Service
Assistant (SA) GUI or CLI.
▪ You may also use the USB-Key InitTool to set up the cluster IP address.
From the management GUI, you can use the Network panel to:
• Establish management IP addresses. Multiple ports and IP addresses provide redundancy for
the system in the event of connection interruptions.
• Define service IP addresses to access the service assistant tool, which you can use to complete
service-related actions on the node. All nodes in the system have different service addresses. A
node that is operating in service state does not operate as a member of the system.
• Verify the system Ethernet ports and modify how ports on the system are being used.
• Configure settings for the system to attach iSCSI-attached hosts.
• Use the Fibre Channel Connectivity panel to display connectivity between nodes and other
storage systems. The system must support Fibre Channel or Fibre Channel over Ethernet
connections to your storage area network (SAN).
• Use the Fibre Channel ports panel in addition to SAN fabric zoning to restrict node-to-node
communication. You can specify specific ports to prevent communication between nodes in the
local system or between nodes in a remote-copy partnership. This port specification is called
Fibre Channel port masking.
IBM_cluster:superuser>lssystemip
cluster_id cluster_name location port_id IP_address subnet_mask gateway
IP_address_6 prefix_6 gateway_6
000002006041E1F6 IBM_cluster local 1 9.42.166.211 255.255.255.0
9.42.166.254
000002006041E1F6 IBM_cluster local 2
IBM_cluster:superuser>
The FlashSystem management occurs across Ethernet connections using the cluster management
IP address owned by the configuration node. Each node has two Ethernet ports and both can be
used for cluster management. Ethernet port 1 must be configured. Ethernet port 2 is optional and
can be used as an alternate cluster management interface on a separate subnet.
The configuration node is the only node that activates the cluster management IP address and the
only node that receives cluster management requests. If the configuration node fails, another node
in the cluster becomes the configuration node automatically and the cluster management IP
addresses are transferred during configuration node failover.
If the Ethernet link to the configuration node fails (or some other component failure related to the
Ethernet network occurs), the event is unknown to the FlashSystem, so no configuration node
failover is triggered. Therefore, configuring Ethernet port 2 as a management interface
allows access to the cluster through an alternate IP address.
Use Settings > Network > Management IP Addresses to configure port 2 as the backup cluster
IP management address. You can use the alternate IP address to access to the management GUI
and CLI.
The chsystemip command (formerly chclusterip) is used to set or change the IP address of either
Ethernet port. Most of the commands containing 'cluster' have been replaced with 'system'
equivalents; for example, the svcinfo lsclusterip command has been replaced with the
lssystemip command to list the cluster IP addresses.
The Remote Authentication option provides a wizard that allows administrators to authenticate to the
system by using credentials stored on an external authentication service. With remote authentication
services, an external authentication server can be used to authenticate users to system data and
resources. User credentials are managed externally through the supported authentication
services, such as LDAP.
When you configure remote authentication, you do not need to configure users on the system or
assign additional passwords. Instead, the LDAP administrator creates a group of system
administrators who are authorized to access the GUI/CLI, while their scope of authority is
configured on the system based on their user role. This simplifies user management and access,
enforces password policies more efficiently, and separates user management from storage
management.
For availability, multiple LDAP servers can be defined. These LDAP servers must all be of the same
type (for example, MS AD). Authentication requests are routed to the LDAP servers marked as
Preferred unless the connection fails or a user name is not found. Requests are distributed across
all the defined preferred LDAP servers in round-robin fashion for load balancing.
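The round-robin distribution described above can be sketched as follows; the class, fields, and server names are illustrative, not the product's implementation:

```python
from itertools import cycle

class LdapDispatcher:
    """Round-robin sketch: route each authentication request across the
    LDAP servers marked as Preferred (simplified illustration)."""

    def __init__(self, servers):
        # servers: list of (name, is_preferred) tuples
        self.preferred = [name for name, pref in servers if pref]
        self._ring = cycle(self.preferred)

    def route(self):
        """Return the order of servers to try for one request: start at
        the next round-robin position, then fall through to the others
        if a connection fails or the user name is not found."""
        start = next(self._ring)
        i = self.preferred.index(start)
        return self.preferred[i:] + self.preferred[:i]

servers = [("ldap1", True), ("ldap2", True), ("ldap3", False)]
d = LdapDispatcher(servers)
print(d.route())   # → ['ldap1', 'ldap2']
print(d.route())   # → ['ldap2', 'ldap1']
```

Note that a non-preferred server (ldap3 above) is never chosen while the preferred servers are reachable.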
System licenses
• Administrator uses the Licensed Functions window in the System Setup wizard to enter
External Virtualization licenses purchased
ƒ Base license provides basic functions (Encryption, Virtualization, FlashCopy, Global and Metro
Mirroring, Real-Time Compression, and Easy Tier)
ƒ External license is required for external storage such as the FlashSystem 900 and IBM SAS
enclosures (12F, 24F, and 92F)
The base license that is provided with your system includes the use of its basic functions. However,
there are also extra licenses that can be purchased to expand the capabilities of your system.
Administrators are responsible for purchasing extra licenses and configuring the systems within the
license agreement, which includes configuring the settings of each licensed function on the system.
The base license entitles an IBM Spectrum Virtualize system to all the licensed functions, such as
Encryption, Virtualization, FlashCopy, Global and Metro Mirroring, Real-Time Compression, and
Easy Tier. However, any connected storage that is not part of the physical system (such as the
FlashSystem 900 and the IBM SAS enclosures 12F, 24F, and 92F) requires the External
Virtualization license.
The External Virtualization license is priced per tebibyte (TiB) of capacity. TiB measures volume
sizes in binary, so 1 GiB equals 1,073,741,824 bytes, which is 1024 to the power of three, while TB
measures volume sizes in decimal, so 1 GB equals 1,000,000,000 bytes, which is 1000 to the
power of three. You use the Licensed Functions window in the System Setup wizard to enter the
External Virtualization licenses purchased for your system.
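The binary-versus-decimal distinction affects how many per-TiB license units a given capacity represents. As a quick arithmetic check (the "100 TB" figure is a made-up example):

```python
# Binary (TiB) versus decimal (TB) units, as used for the per-TiB
# External Virtualization license metric.
GIB, GB = 1024 ** 3, 1000 ** 3
TIB, TB = 1024 ** 4, 1000 ** 4

assert GIB == 1_073_741_824       # 1 GiB
assert GB == 1_000_000_000        # 1 GB

# A nominal "100 TB" (decimal) of external capacity is fewer TiB units:
tib_units = 100 * TB / TIB
print(f"100 TB = {tib_units:.2f} TiB")   # → 100 TB = 90.95 TiB
```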
To prepare for the upgrade, download the latest recommended firmware package from the IBM
Storage Support web site at www.ibm.com/storage/support. This package includes the INSTALL
firmware code, update test utility file, and ReadMe files. The site also provides links to information
that is valuable for planning the upgrade. Read the information carefully and act accordingly.
When downloading firmware from IBM, a valid IBM user ID is required. You will also need to
validate coverage by entering the system model number and serial number.
It is recommended that you perform upgrades at a time of lowest system utilization, such as over
a weekend. Ensure that the system is free of errors, and that the system date and time are set
correctly.
Because nodes will be in cache write-through mode, no node during the upgrade process will
have write cache enabled; all writes go directly to the disk drives. Be aware that there can be
some performance impact, especially during heavy I/O load. We suggest that you plan a
three-hour change window for your upgrades. This time can vary depending on your system
configuration.
Automatic versus Manual
Automatic (preferred default method):
• Upgrades each node in the cluster systematically
• Configuration node is updated last
• As each node is restarted, there might be some degradation in the maximum I/O rate during the upgrade
Manual (provides more flexibility):
• Remove node from system
• Upgrade software on node
• Return node to the cluster
• Configuration node is updated last
There are two update methods for upgrading the cluster software code: automatic or manual.
• Automatic supports concurrent upgrade, which is the default and preferred procedure to upgrade
the system, referred to as Concurrent Code Upgrade (CCU). This method allows all
components of the system to be upgraded concurrently, including the controllers and storage
enclosures. The new code is staged on all the nodes in the system before the configuration
node is upgraded.
As each node is upgraded, there is a reduction in the maximum I/O rate that can be sustained by
the system. After all the nodes in the system are successfully restarted with the new software
level, the new software level is automatically committed.
This method requires little to no intervention, but can take longer to complete, depending on the
number of I/O groups in the cluster.
• Manual upgrade provides more flexibility, as it does not require waiting 30 minutes between
node upgrades. This upgrade requires system administrator support (both storage and host) to
perform the upgrade and to recover the paths immediately after each node upgrade.
Manual updates are performed only from the Service Assistant GUI. During this manual
procedure, the upgrade is prepared, you remove a node from the cluster, the system
upgrades the software on the node, and you return the node to the cluster. Like CCU, you must
still upgrade all the nodes in the clustered system, repeating the steps in this procedure for
each node that is not the configuration node.
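The ordering rule shared by both methods, every non-configuration node first and the configuration node last, can be sketched as follows; the node names are hypothetical, and the real system also waits about 30 minutes between nodes and alternates across I/O groups:

```python
def ccu_order(nodes, config_node):
    """Return a simplified code-upgrade order: all non-configuration
    nodes first, the configuration node last (illustrative sketch)."""
    others = [n for n in nodes if n != config_node]
    return others + [config_node]

# Hypothetical four-node cluster in which node2 is the configuration node
print(ccu_order(["node1", "node2", "node3", "node4"], "node2"))
# → ['node1', 'node3', 'node4', 'node2']
```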
IBM has continued to implement IBM Spectrum Virtualize software upgrade improvements.
• Prior to version 7.4, if any problems or issues were detected, the upgrade had to be aborted.
Now, the system provides the ability to resume a stalled or stopped upgrade after the problem
is resolved.
• A preemptive notification to the hosts was added in 7.8.1 that prevents in-flight I/O failures
when a node goes offline. This takes place approximately 30 seconds before the node goes
down. The system sends a LUN change notification for LUNs assigned to that preferred node,
and hosts switch to the partner node when they rescan. This provides an added benefit even if
NPIV is in use.
• Performance improvements in I/O pause processing when nodes leave or join the cluster, with
incremental improvements and a significant reduction in 7.8.0 and 7.8.1.
To view the current firmware level running, from the management GUI Settings menu, select
System > Upgrade Software. The Update System software window will also display fetched
information on latest software version available for update. The displayed version may not always
be the recommended version, always refer to IBM Fix Central for the latest tested version available.
Before you start a system update, ensure that there are no pending problems on the system that
might interfere with the update. Each software update requires that you run the software update
test utility first and then apply the latest software package.
To initiate the automatic upgrade, click the Test & Update option and browse to the location of the
downloaded software update package. Once you have selected both files, confirm whether to
allow the system to pause throughout the update process. The Pause function allows users to
pause the CCU indefinitely. This pause allows customers to perform problem determination (for
example, for multipathing issues) or simply to postpone the upgrade until a more convenient time.
The Software Update Test Utility can also be run as a standalone option from the management
GUI prior to initiating an update.
Fixes cannot be applied while upgrading
Test Utility can also be performed multiple times using the CLI
Immediately after the system completes uploading the software code package, it generates an
svcupgradetest command that invokes the utility to assess the current system environment.
The update test utility indicates whether your current system has issues that need to be resolved
before you update to the next level. If an issue is discovered, the firmware update stops and
provides a Read more link to help address the detected issue. The Update Test Utility result
output can be downloaded for technical support if required.
We recommend that you run this utility a final time immediately before applying the upgrade.
The Update Test Utility can be run as many times as necessary, using the management GUI or
CLI, to perform a readiness check on the same system in preparation for a software upgrade.
If no issues are found, click Resume to continue with the update.
As the upgrade progresses, the first node in the I/O group is taken offline. Once the code level is
verified by the system, the GUI generates the applysoftware command to apply the system
software code to the updating node. The system Health Status pod also flags the condition as a
node status alert. You can issue the lsnode command to verify which node is the configuration
node. In a four-node system, the configuration node is typically not upgraded until after half of the
nodes in the cluster have been upgraded.
The update process can take some time to complete. Once the node that was being updated has
restarted with the upgraded software, it is placed back online at the updated code level.
After waiting 30 minutes for host paths to the updated node to recover, the next node is
updated.
If you are updating multiple clusters, allow the software upgrade to complete on one cluster
before starting it on the other. Do not upgrade both clusters concurrently.
The administrator can also issue the svqueryclock command to view a time reference for the
duration of the upgrade in progress.
Example: a NODE is offline for update; paths are automatically routed to the partner NODE for I/O activities.
(Diagram: the host's application-access paths to the preferred node for the volume are offline; alternate paths to the partner node are used, and the configuration node remains online.)
When a node is being updated, the partner node in the I/O group starts handling I/O requests for
the updating node's VDisks. Because only one node is working in the I/O group, write cache is
turned off and that node operates in write-through mode. This increases write I/O latency during
the upgrade and can reduce application throughput.
From the host's viewpoint, its multipathing code marks the paths to the updating node as failed
and directs I/O to the surviving node. The multipathing code can be used to monitor path status,
take paths offline, bring paths back online (manually, or automatically after some time) if
possible, or dynamically add or remove paths.
The SDDDSM datapath query device command shows path information.
During the upgrade, the GUI might go offline temporarily as the node is restarted. Refresh your browser to reconnect.
Typically there is a thirty-minute delay or time-out built in between node upgrades. This delay
allows time for the host multipathing software to rediscover paths to the nodes that are upgraded,
so that there is no loss of access when another node in the IO group is upgraded.
You cannot invoke the new functions of the upgraded code until all member nodes are upgraded
and the upgrade is committed. The system takes a conservative approach to ensure that paths are
stabilized before proceeding.
After the nodes in the I/O groups have been updated, a system update is applied to update all
enclosures in the system.
If your system supports hot spare nodes with NPIV-capable hardware, IBM Spectrum Virtualize is
now capable of supporting hot spare node use during upgrade. This offers a tremendous benefit
by avoiding host disruption while upgrading the system code level. The NPIV target port behavior
is available in Spectrum Virtualize V7.7.0 and later. Hot spare nodes are available in Spectrum
Virtualize V8.1.0 and later.
Only host connections on Fibre Channel ports that support N_Port ID Virtualization (NPIV) can be
used for spare nodes. The spare node uses the same NPIV worldwide port names (WWPNs) for
its Fibre Channel ports as the upgrading node, so host operations are not disrupted.
an NPIV port on an HBA that does not support NPIV, an error will occur. If you try to create an NPIV
port on an HBA that supports NPIV but it is attached to a switch which does not support NPIV, the
port will be created with an offline status.
Hot-spare node upgrade is only supported on SVC and the FlashSystem V9000 that use an
external switch for Flash enclosure attachment. IBM Storwize V7000 can benefit from NPIV, but
hot-spare canister replacement is not supported.
Upgrade status
Inactive: The upgrade is no longer running.
Upgrading: The upgrade is progressing normally.
Downgrading: The user aborted an upgrade and the system is currently downgrading.
stalled_non_redundant: The upgrade stalled and the system is not redundant (the removal of a
single node will result in volumes going offline).
Always consult IBM Support to determine the cause of the failure before executing any
recovery procedures.
Typically, a failed upgrade causes the upgrade to go into a stalled state, which can be identified
by using the lsupdate command. The upgrade might also stall if your system supports hot spare
nodes that are connected using a Fibre Channel adapter and you update from software level
7.8.1.1 or earlier to software level 7.8.1.2 or later. To avoid this issue, remove each Fibre Channel
adapter from the node that is being added to the cluster and check for the presence of a secure
jumper. If the secure jumper is present, remove it to allow the firmware upgrade to complete
successfully. After the upgrade completes, return the secure jumper to the Fibre Channel adapter.
Aborted or Resumed
The simplest recovery case is where all original nodes are online. In this case, the upgrade can be
either aborted (in which case the system downgrades to the previous level) or resumed (to continue
the upgrade procedure).
Notify IBM L3 support to determine the root cause of the failure and whether an abort or resume
is recommended. The abort or resume can be executed either through the GUI or by using the
applysoftware command, and causes the automated upgrade procedure to continue.
In some circumstances, a node might fail while being upgraded and never return to the cluster. In
this case, the upgrading node appears to be continuing to upgrade but is stuck for 90 minutes or
more, or it returns to the cluster and then immediately fails, in which case the upgrade stalls.
Another possibility is that the failed node is downgrading and will rejoin the cluster soon. The
system should be allowed at least 30 minutes from the point of stalling to determine whether the
node rejoins the cluster. If the node does rejoin, follow Scenario 1, because all nodes will then be
online.
If you are using spare nodes, the spare node remains in place until further action is taken. If the
node can be recovered, for example by replacing hardware, in a manner that does not wipe the
node, then the node automatically rejoins the cluster. If the failed node cannot be recovered and
must be reinstalled or wiped with the satask leavecluster command, it is returned to the
candidate state. At this point, the spare node can be introduced into the cluster permanently by
using the following command: swapnode -replace <original node id>. The upgrade will stall,
and you can then follow Scenario 1. Another option is to simply remove the failed node from the
cluster by using the following command: rmnode -deactivatespare <original node>.
At this point, the system is non-redundant because it is without an active spare.
Service procedures that involve field-replaceable units (FRUs) do not apply to the IBM Spectrum
Virtualize product, which is software based. For information about possible user actions relating to FRU
replacements, refer to your hardware manufacturer's documentation.
Error codes describe a service procedure that must be followed. Each event ID that requires
service has an associated error code. Error codes can be either notification type E (error) or
notification type W (warning). This table lists the event IDs that have corresponding error codes,
and shows the error code, the notification type, and the condition for each event.
The 1300 error code indicates that a port that was configured for N_Port ID Virtualization (NPIV) is
offline.
Complete the following procedures:
• Check the switch configuration to ensure that NPIV is enabled and that resource limits are
sufficient.
• Run the detectmdisk command and wait 30 seconds after the discovery completes to see
whether the event fixes itself.
• If the event does not fix itself, contact IBM Support.
The usual cause is a switch that does not support NPIV, has NPIV disabled in its
configuration, or has limited NPIV resources.
The 1380 error code indicates that a spare node was added to the cluster that has a hardware
mismatch compared to the nodes in the cluster, and therefore no redundancy is provided to the
cluster. The usual cause is either a mismatch of memory, or a compression card being present
on the active nodes and not on the spare, or vice versa. To resolve the issue, review the
hardware configuration of each node. You can collect system data to provide to IBM Support and
be advised on what changes need to be made to the spare.
The 3220 error code indicates that mismatched fabric WWNNs were detected.
To resolve this issue, complete the following steps:
a. Run the lsportfc command to get the fabric WWNN of each port.
b. List all partnered ports (that is, all ports for which the platform port ID is the same and the
node is in the same I/O group) that have mismatched fabric WWNNs.
c. Verify that the listed ports are on the same fabric.
d. Rewire nodes if needed. For information about wiring requirements, see "Zoning
considerations for N_Port ID Virtualization" in your product documentation. Once all ports
are on the same fabric, the event corrects itself.
e. This error might be displayed by mistake. If you confirm that all remaining ports are on the
same fabric, despite apparent mismatches that remain, mark the event as fixed.
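Step b amounts to grouping partnered ports and flagging any group that reports more than one fabric WWNN. A sketch of that check (the port records and field names are illustrative, not actual lsportfc output):

```python
from collections import defaultdict

def find_mismatched_port_groups(ports):
    """Group ports by (io_group, platform port ID) and return the groups
    whose members report different fabric WWNNs, i.e. partnered ports
    that appear to sit on different fabrics."""
    groups = defaultdict(set)
    for p in ports:
        groups[(p["io_group"], p["port_id"])].add(p["fabric_wwnn"])
    return sorted(key for key, wwnns in groups.items() if len(wwnns) > 1)

ports = [
    {"io_group": 0, "port_id": 1, "fabric_wwnn": "10:00:00:aa"},
    {"io_group": 0, "port_id": 1, "fabric_wwnn": "10:00:00:bb"},  # mismatch
    {"io_group": 0, "port_id": 2, "fabric_wwnn": "10:00:00:aa"},
    {"io_group": 0, "port_id": 2, "fabric_wwnn": "10:00:00:aa"},
]
print(find_mismatched_port_groups(ports))   # → [(0, 1)]
```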
If the spare node is the only node in the cluster covering the failed node, and the failed node
comes back online, Spectrum Virtualize will not automatically fail back to the original node. This
generates error code 3180. If the failback is performed manually with the swapnode -failback
command and the permitoffline option, offline volumes may result. The failback occurs
automatically within ten minutes of redundancy being restored, or when the swapnode -failback
command is executed.
System warnings
• 0x088003 SS_EID_HSN_UNUSED_SPARE_NODE
ƒ A spare node in this cluster is not providing additional redundancy.
ƒ A spare node has been added to a cluster, but there are no other members of the cluster for which the
spare is a viable replacement. e.g. memory mismatch
• 0x088004 SS_EID_HSN_FAILBACK_REQUIRED
ƒ A spare node was in use, but is no longer required due to the original node coming back, but an
automatic failback was prevented by redundancy issues. e.g. Expansion Enclosure
• 0x045104 SS_EID_EN_SPARE_NODE_SINGLE_PORT_ACTIVE
ƒ Drives are single ported due to a spare node
The following codes do not require immediate user attention but serve as alerts:
• The 088003 error code indicates a warning event when a spare node has been added to the
cluster that does not provide protection for any active nodes, which might indicate a memory
mismatch. This alert is not harmful to I/O operation, because a spare is in standby status
and not part of an I/O group. This event can be fixed either by adding a node to the cluster as an
active node or by removing the spare node.
• The 088004 error code indicates that a spare node was in use but is no longer required because
the original node came back, yet an automatic failback was prevented by redundancy issues.
In this case, a manual failback can be issued using the swapnode -failback <spare node
id> command syntax, which forces a failback from the spare node to the original node.
• The 045104 error code indicates that the spare is connected to a drive enclosure in the I/O
group to which it does not have direct access; only the partner (surviving) node in the I/O
group has SAS connectivity to those drives.
With the latest software update, you now have the ability to change memory limits for Copy
Services and RAID functions per I/O group by selecting the System > Resources option.
Copy Services features and RAID require that small amounts of volume cache be converted from
cache memory into bitmap memory to allow the functions to operate. If you do not have enough
bitmap space allocated when you try to use one of the functions, you will not be able to complete
the configuration.
Before you specify the configuration changes, consider the following factors.
• For FlashCopy relationships, only the source volume allocates space in the bitmap table.
• For Metro Mirror or Global Mirror relationships, two bitmaps exist. One is used for the master
clustered system and one is used for the auxiliary system because the direction of the
relationship can be reversed.
• The default 20 MB of bitmap space, using a 256 KB grain size, allows FlashCopy snapshots of
volumes totaling 40 TB.
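The 20 MB / 40 TB figure can be checked with simple arithmetic, on the simplifying assumption that one bitmap bit tracks one 256 KB grain:

```python
MIB, KIB = 1024 ** 2, 1024

bitmap_bytes = 20 * MIB          # default bitmap space per function
grain_bytes = 256 * KIB          # FlashCopy grain size
bits = bitmap_bytes * 8          # assume one bit per grain
covered_tib = bits * grain_bytes / 1024 ** 4

print(covered_tib)   # → 40.0 (TiB of volume capacity covered)
```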
You can use the GUI Preferences page to change settings that are related to how information is
displayed in the GUI. This can include information that you want users to see as they log in to the
GUI.
The GUI Preferences > General tab provides the following abilities:
• Clear all customized settings and restore default settings. This requires refreshing the
management GUI for the changes to take effect.
• Set a default logout time to force logout of inactive users on session timeout. You will need to
refresh the browser or log in again for changes to take effect.
• Enter a default web address for the IBM Knowledge Center for in-depth documentation on the
system's support and capabilities. The Restore Default option restores the default browser
preferences.
• Refresh the GUI cache to update the cached information. Many refresh tasks are invoked by
events when the configuration is changed in the cluster; in those cases, the GUI pages reflect
the changes within a minute. However, this option is useful for certain types of data where
events are not raised to invoke the refresh tasks.
• By default, the GUI does not offer the option to change the extent size from the default setting
during pool creation. This allows for consistent extent sizes, which are important for migrating
VDisks between pools on the system. To allow extent size selection, check the Advanced pool
settings option.
There are some situations where refreshing GUI objects does not work: after reloads, the webpage might still be using old files from the browser's cache, so emptying the cache can fix interface display issues. Your browser has a folder in which downloaded browser objects are stored for future use.
The management GUI has certain compatibility requirements for the browser settings and
configuration to ensure it is supported and works correctly.
IBM supports higher versions of the browsers if the vendors do not remove or disable function that
the product relies upon. For browser levels higher than the versions that are certified with the
product, customer support accepts usage-related and defect-related service requests. If the
support center cannot re-create the issue, support might request the client to re-create the problem
on a certified browser version. Defects are not accepted for cosmetic differences between browsers
or browser versions that do not affect the functional behavior of the product. If a problem is
identified in the product, defects are accepted. If a problem is identified with the browser, IBM might
investigate potential solutions or workarounds that the client can implement until a permanent
solution becomes available.
For the latest system features and support, refer to the IBM Storage website:
https://www.ibm.com/it-infrastructure/storage.
System monitoring
Settings
Access
Audit Log entries
Quorum device
IBM Service Assistant Tool (SAT)
This module reviews audit log entries, which record the actions performed by administrators and the commands issued.
Right-click a column to change display
Logs executed action commands
The audit log is extremely helpful in showing which commands have been entered on a system. An
audit log tracks domain user actions that are issued through the management GUI. You can view
the GUI audit log entries by selecting Access > Audit Log.
The audit log entries can be customized to display the following types of information:
• Time and date when the action or command was issued on the system
• Name of the user who performed the action or command
• IP address of the system where the action or command was issued
• Parameters that were issued with the command
• Results of the command or action (return code of the action command)
• Sequence number
• Object identifier that is associated with the command or action
The GUI provides the ability to filter or search among the audit log entries to reduce the quantity of output. It also provides the ability to create an export.csv file of the objects that can be submitted to IBM Support.
The in-memory portion of the audit log has a capacity of 1 MB and can store about 6000 commands on average (affected by the length of the commands and parameters issued). When the in-memory
log is full, its content is automatically written to a local file on the configuration node in the
/dumps/audit directory.
The catauditlog CLI command when used with the -first parameter provides the requested
most recent number of entries with the CLI. In this example, the command returns a list of five
in-memory audit log entries.
IBM_cluster:superuser>catauditlog -first 5
audit_seq_no timestamp cluster_user ssh_ip_address result res_obj_id
action_cmd
23 150411010003 superuser 127.0.0.1 0
svctask detectmdisk
24 150412010002 superuser 127.0.0.1 0
svctask detectmdisk
25 150413010003 superuser 127.0.0.1 0
svctask detectmdisk
26 150413180002 superuser 10.8.3.16 0
svctask chcurrentuser -keyfile /tmp/superuser.pub-8035843563636449081.pub
27 150414010003 superuser 127.0.0.1 0
svctask detectmdisk
IBM_cluster:superuser>
You can also display audit log entries by using the catauditlog CLI command, as the example shows.
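If you capture this output in a script, it can be split into records. The following sketch assumes the two-line entry layout shown above (header fields, then the action command on its own line); real output may vary by code level.

```python
def parse_audit(lines):
    """Split `catauditlog -first N` output into records, assuming each
    entry spans two lines: a header (seq, timestamp, user, IP, result)
    followed by the action command. Sketch only; real output may vary."""
    entries, it = [], iter(lines)
    next(it)                       # skip the column-header line
    for header in it:
        seq, ts, user, ip, result = header.split()[:5]
        entries.append({"seq": int(seq), "timestamp": ts, "user": user,
                        "ip": ip, "result": result,
                        "cmd": next(it).strip()})
    return entries

sample = [
    "audit_seq_no timestamp cluster_user ssh_ip_address result res_obj_id action_cmd",
    "23 150411010003 superuser 127.0.0.1 0",
    "svctask detectmdisk",
    "24 150412010002 superuser 127.0.0.1 0",
    "svctask detectmdisk",
]
log = parse_audit(sample)
print(log[0]["cmd"])  # -> svctask detectmdisk
```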
CLI commands
https://www.ibm.com/support/knowledgecenter/en
The IBM Knowledge Center provides a list of CLI commands, changes, and information that helps you configure and use the system.
System monitoring
Settings
Access
Quorum devices
IBM Service Assistant Tool (SAT)
This module introduces quorum devices and their benefit in helping a cluster manager make cluster management decisions.
Cluster
Tie-break
Recovery
Wikipedia defines a quorum as “…. the minimum number of votes that a distributed
transaction must obtain in order to be allowed to perform an operation in a distributed
system. A quorum-based technique is implemented to enforce consistent operation in a
distributed system”.
A quorum disk (also called a “witness” in some other implementations) is required with IBM Spectrum Virtualize to resolve situations where cluster components lose communication, so that nodes or sets of nodes do not both run the cluster independently.
In Spectrum Virtualize 7.6.0, IBM first introduced IP quorum, where a quorum application can be
used to tie-break a split-brain scenario, where half of the cluster cannot see the other half and the
quorum application is used to break the tie.
Now, IBM Spectrum Virtualize 8.1 supports the other half of the quorum functionality: the capability of storing data that is used for cluster recovery.
* Might be used if local devices are attached, but not recommended in > 2 node systems.
* IP Quorum is supported on FlashSystem V9000 systems running software version 7.6 and above
When internal storage or back-end storage is initially added to a clustered system as a storage
pool, the Spectrum Virtualize cluster automatically creates three quorum disks by allocating space
from the assigned MDisk or drive candidates. An SVC cluster generates three MDisks, which it
attempts to spread over three different storage controllers in the cluster. A quorum device can be
assigned to internal attached enclosure when there are more than two nodes in a cluster. The
Storwize family, FlashSystem V9000, and FlashSystem 900 use three drives that are parts of
arrays or spares as quorum disks.
In addition, Spectrum Virtualize supports the ability to define Internet Protocol (IP) quorum-based support, using a lower-cost IP-attached host as a quorum device. The maximum number of IP quorum applications that can be deployed is five.
IP Quorum is most commonly used when deploying IBM Spectrum Virtualize Stretched or
HyperSwap cluster. HyperSwap function can be supported on SVC systems running software
version 7.6 and above, and with two or more I/O groups. As of IBM Spectrum Virtualize 8.2.1, IP
quorum adds full T3/T4 data recovery.
What Problems Does the IP Quorum App Solve?
✓ The IP quorum app uses a much shorter lease time than external FC quorum disks, so it results in faster failover times.
✓ No iSCSI-attached controller supports quorum disks. With the release of iSER clustering, a complete iSCSI-based solution can be provided to customers, capable of tie-breaking as well as cluster recovery.
• Before Spectrum Virtualize 7.6.0, the third site had to be connected by using Fibre Channel, and maintaining this third site and storage controller over FC made site recovery implementations of an IBM Spectrum Virtualize system costly.
• Failover/failback time is an issue for many customers. IP quorum offers faster failover times by using a shorter lease time than external FC quorum disks.
• With the release of iSER clustering support, customers running Ethernet for iSCSI or iSER host attachment are now capable of tie-breaking or cluster recovery.
• IP quorum also removes the requirement for internal storage.
IP Quorum is a Java application running on a server that is IP connected to the same management
IP network as the cluster itself. Applications can be deployed on multiple hosts to provide
redundancy. IP quorum applications are used in Ethernet networks to resolve failure scenarios
where half the nodes on the system become unavailable. These applications determine which nodes can continue processing host operations and avoid a split system, where both halves of the system continue to process I/O independently.
The mkquorumapp command is used to create the quorum application for the cluster.
• Only one quorum device is active at a time
ƒ All systems must agree before the clustered systems acknowledge the active quorum
• All quorum drive/MDisk devices contain the same T3 recovery data
ƒ Each time a change is made to the needed data, all quorum devices are updated
ƒ Two copies in each of 2 areas per device, checksummed
Three quorum devices are selected as candidates. At any time, only one of these candidates is
acting as the active quorum disk. IBM Spectrum Virtualize uses a quorum algorithm to distinguish
between the three quorum candidate disks. The other two are reserved to become active if the
current active quorum disk fails. All three quorum disks are used to store configuration metadata,
but only the active quorum disk acts as tie-breaker for split brain scenarios. Quorums are identified
by the quorum index values 0, 1, and 2. In this example, the active quorum is index 0 resident on
MDisk ID 3 and the others are in stand-by mode.
To maintain availability with the system, each quorum disk is on a separate disk storage system or
domain, with the third quorum disk in the third failure domain as the active quorum disk.
IBM Spectrum Virtualize cluster manager implements a dynamic quorum. This means that following
a loss of nodes, if the cluster is able to continue operation, it dynamically moves the quorum disk to
allow more node failures to be tolerated. This process improves the availability of the central cluster
metadata, which enables servicing of the cluster. With the combined quorum devices, the cluster
has the possibility to have up to eight quorum devices.
So how does the cluster know which one to use? At any given time only one quorum device is the
active quorum. Every node in the cluster knows which is the active voting set of nodes, and which
is the active quorum. The active quorum can only be changed when all nodes in the active voting
set agree and can confirm.
Quorum size
• A quorum disk contains a reserved area that is used exclusively for system management
(Diagram: a 3 x 3 grid of pool extents, 1a through 3c, at 1 GB each, with one extent reserved for the quorum area)
A quorum disk is an MDisk or a managed drive that contains a reserved area that is used exclusively for system management. Each quorum disk requires over 256 MB of space to hold the quorum data. The quorum size is affected by the number of objects in the cluster and the extent size of the pools. In this example, the pool extent size is 1 GB, and based on the number of free extents available, the quorum disk can be deduced to be using one extent, or 1 GB (the smallest unit of allocation). This might help explain the missing capacity in the storage pool capacity value. The remaining extents of a quorum MDisk are available to be assigned to volumes (VDisks).
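The one-extent deduction above can be reproduced with simple arithmetic (illustrative; the exact reserved size depends on the number of cluster objects):

```python
import math

EXTENT_MB = 1024        # 1 GB pool extent size, as in the example
QUORUM_AREA_MB = 256    # each quorum disk reserves just over 256 MB

# Whole extents are the smallest unit of allocation, so the quorum
# area always consumes at least one full extent of the pool.
extents_reserved = math.ceil(QUORUM_AREA_MB / EXTENT_MB)
print(extents_reserved)  # -> 1 (extent, that is, 1 GB per quorum MDisk)
```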
In HyperSwap configurations, IP quorum applications can be used at the third site as an alternative to third-site quorum disks. Quorum devices are based on a voting algorithm. Each node in the cluster has a vote. The cluster keeps working while more than half of the voters are online; this is the quorum (or the majority of votes). When there are too many failures and not enough online voters to constitute a quorum, the cluster stops working. In a split brain scenario, the cluster divides perfectly into two equal halves, neither of which has the majority vote. Therefore, a tie-break device must decide which half should continue I/O operations. In a large configuration of nodes, where there is a communication failure between parts of the system (for example, five out of eight nodes can still communicate), the majority always wins.
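The voting rule can be summarized as a one-line predicate (a conceptual illustration of majority voting, not IBM's implementation):

```python
def cluster_has_quorum(online_votes, total_votes):
    """The cluster keeps running only while MORE than half of all
    voters are online; an exact 50/50 split has no majority, and the
    tie-break device must decide which half continues I/O."""
    return online_votes * 2 > total_votes

print(cluster_has_quorum(5, 8))  # -> True: 5 of 8 nodes, majority wins
print(cluster_has_quorum(4, 8))  # -> False: perfect split, tie-break needed
```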
No Fibre Channel connectivity at the third site is required to use an IP quorum application as the
quorum device. The IP quorum application is a Java application that runs on a host at the third site.
The IP network is used for communication between the IP quorum application and nodes in the
system. If you currently have a third-site quorum disk, you must remove the third site before you
use an IP quorum application.
IBM Storwize V7000 does not support Enhanced Stretched Cluster environments.
In the previous Spectrum Virtualize release, if a link broke between two sites, triggering a tie-break, both sites would attempt to request allegiance from the IP quorum application to choose which site should continue the cluster. In this case, the application gives allegiance to whichever site's request is received first, thereby denying allegiance to the other site. The nodes granted allegiance continue the cluster, and the other nodes are removed.
With Spectrum Virtualize V8.3, if a link break between two sites triggers a tie-break, you can now specify a preferred site by using chsystem -quorummode preferred -quorumsite [followed by the site ID]. In this case, the preferred site requests allegiance, and the non-preferred site's allegiance request is delayed by 3 seconds. During this process, the IP quorum application grants allegiance to the preferred site to continue the cluster, and the non-preferred site is removed from the cluster. This feature is only active for HyperSwap and stretched topologies.
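The effect of the 3-second delay can be modeled as a first-request-wins race (the request timings here are hypothetical; only the 3-second offset comes from the text):

```python
NON_PREFERRED_DELAY_S = 3.0  # delay imposed on the non-preferred site

def tie_break_winner(requests):
    """First allegiance request received wins; `requests` is a list of
    (site, arrival_time_in_seconds) pairs."""
    return min(requests, key=lambda r: r[1])[0]

# Both sites detect the link failure at about t=0; site1 is preferred,
# so site2's request is held back by 3 seconds and site1 wins the race.
requests = [("site1", 0.1), ("site2", 0.2 + NON_PREFERRED_DELAY_S)]
print(tie_break_winner(requests))  # -> site1
```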
(Diagram: two-site configuration, Site 1 and Site 2)
In some cases, a Spectrum Virtualize system configuration might not require a third site, if all nodes in the configuration have a direct connection to a quorum device that can be used to perform a tie-break. In this case, if the link between the sites breaks, with Spectrum Virtualize release V8.2 (and below), the node within an I/O group with the lowest ID would continue the cluster operations. Depending on the type of host applications, this might not be the best site to continue the cluster operations.
In this same scenario, with Spectrum Virtualize V8.3, you have the ability to specify a winner site to continue the cluster operations by using the chsystem -quorummode winner -quorumsite (site ID) command. In this case, Site 2 has been selected. The nodes at Site 1 (the non-winner site) are then removed from the cluster.
Additional Spectrum Virtualize IP quorum improvements
• Reduced Lease Time
ƒ Provides the options to change quorum between long lease (48 seconds) or short lease (15
seconds)
ƒ Short lease is a default for the latest generation systems running on Spectrum Virtualize V8.3
í Recommended value for faster node failover
ƒ To change the quorum lease time, use the CLI commands:
chsystem -quorumlease short
chsystem -quorumlease long
• Host Health Check validates all host connections at a failed site during a tie-break scenario
ƒ Allegiance is denied at the failed site, and preference is given to the other site to continue the cluster
(Only option for standard quorum mode)
• IP Quorum is Selected for Site 3, allowing it to receive a cookie crumb and participate in the T3 Recovery
• Deglitch IP Quorum EM Disconnects allows IP Quorum Application site to perform a tie-break
scenario if IP connectivity is lost at one site, followed by an inter-link failure.
• In the previous Spectrum Virtualize release, if the hardware suddenly fails, the node failover
time depends in part on its quorum lease time.
When using the IP Quorum Application, the lease time is currently set to 48 seconds (long
lease). With Spectrum Virtualize V8.3, you have the option to specify long lease or a short
lease. All the latest Spectrum Virtualize generation systems released on V8.3.0 will have short
lease configured by default. This is the recommended value for faster node failover. However,
long lease may be required for some setups, if you are experiencing lease expiry asserts. HyperSwap without a third site is also supported. You can change the quorum lease time by using either of the following commands: chsystem -quorumlease short or chsystem -quorumlease long.
• To prevent rolling disasters during a tie-break scenario when a site failure occurs and all host connections are lost (only if the IP Quorum application is used to tie-break), the failed site loses its allegiance. At this time, a Host Health Check is performed, and the other site is given preference to continue the cluster. Host Health Check only happens in standard quorum mode.
• For T3 Recovery, IP Quorum is now “Selected for Site 3”. In this three-site configuration, one device is selected from each site for cookie crumbs. For example, the quorum devices at site 1 and site 2, and the IP Quorum application at site 3, each have a cookie crumb. Previously, only a physical quorum could be used for T3 Recovery after power outages. In this case, all of the nodes update the quorum disk with critical information about all of the virtual mappings of blocks to volumes, and this is used when bringing up the nodes again. With Spectrum Virtualize V8.3, the third site using the IP Quorum application can now receive a cookie crumb, allowing it to participate in a T3 Recovery.
• Deglitch IP Quorum EM Disconnects allows the IP Quorum application at the third site to perform a tie-break scenario if IP connectivity goes down at one site, immediately followed by an inter-site link failure.
To view the state of an IP quorum application in the management GUI, select Settings > System >
IP Quorum. You can also use the lsquorum command to view the state of the IP quorum
application.
For stable quorum resolutions, an IP network must provide the following requirements:
• Connectivity from the servers that are running an IP quorum application to the service IP
addresses of all nodes. The network must also deal with possible security implications of
exposing the service IP addresses, as this connectivity can also be used to access the service
assistant interface if the IP Network security is configured incorrectly.
For tie-break:
• Port 1260 is used by IP quorum applications (through SSL/TLS) for inbound connections from the application to the service IP address of each node.
• The maximum round-trip delay must not exceed 80 milliseconds (ms), which means 40 ms in each direction.
• A minimum bandwidth of 2 megabytes per second is guaranteed for node-to-quorum traffic.
• Cluster configuration changes (such as adding or removing a node, or changing the SSL certificate or IP addresses) require you to re-create the Java quorum application package.
• Security must be in place for the host that runs the application.
• Review the IBM Interoperability Matrix for supported OSs and Java variants by using the IBM Knowledge Center link.
• Maximum of five applications.
The Cluster recovery requirements are the same as the tie-break with the addition of:
• Increased requirement for network bandwidth to 64 MB per second.
• 250 MB of disk space.
• Only one application per IP address.
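The numeric limits above can be gathered into a small pre-deployment checklist function (thresholds taken from the text; the helper itself is illustrative):

```python
def ip_quorum_ok(rtt_ms, bandwidth_mb_s, disk_mb=0, cluster_recovery=False):
    """Check a candidate IP quorum host against the documented limits:
    tie-break needs a round-trip delay of at most 80 ms and 2 MB/s of
    bandwidth; cluster recovery raises the bandwidth requirement to
    64 MB/s and also needs 250 MB of disk space."""
    if rtt_ms > 80:
        return False
    if bandwidth_mb_s < (64 if cluster_recovery else 2):
        return False
    if cluster_recovery and disk_mb < 250:
        return False
    return True

print(ip_quorum_ok(40, 10))                         # -> True (tie-break only)
print(ip_quorum_ok(40, 10, cluster_recovery=True))  # -> False (needs 64 MB/s)
```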
Installing IP quorum
1. To create the IP quorum Java application, select Settings > System > IP Quorum and click either
Download IPv4 Application or Download IPv6 Application
ƒ Click to clear the box if Cluster Recovery is not required
ƒ Optional to use the CLI mkquorumapp command
2. Transfer the IP quorum application from the system to a directory on the host that is to run the IP
quorum application
3. Use the ping command on the host server to verify that it can establish a connection with the
service IP address of each node in the system
4. On the host, use the command java -jar ip_quorum.jar to initialize the IP quorum
application
5. Verify that the IP quorum application is installed and active, select Settings > System > IP Quorum
ƒ Use the CLI lsquorum command to verify that the IP quorum application is connected and is the active
quorum device
ƒ Use the CLI chquorum command to modify the MDisk that are used for quorum
To download and install the IP quorum application, complete the following steps:
1. In the management GUI, select Settings > System > IP Quorum and click either Download
IPv4 Application or Download IPv6 Application to create the IP quorum Java application.
Clear the checkbox if Cluster Recovery is not required. You can also use the command-line interface (CLI) to enter the mkquorumapp command to generate an IP quorum Java application. The application is stored in the dumps directory of the system with the file name ip_quorum.jar.
2. Transfer the IP quorum application from the system to a directory on the host that is to run the
IP quorum application.
3. Use the ping command on the host server to verify that it can establish a connection with the
service IP address of each node in the system.
4. On the host, use the java -jar ip_quorum.jar command to initialize the IP quorum
application.
Quorum disks
• Verify that the IP quorum application is installed and active, select Settings > System > IP Quorum
ƒ Use the CLI lsquorum command to verify that the IP quorum application is connected and is the active
quorum device
ƒ Use the CLI chquorum command to modify the MDisk that are used for quorum
To verify that the IP quorum application is installed and active, select Settings > System > IP
Quorum. The new IP quorum application is displayed in the table of detected applications. The
system automatically selects MDisks for quorum disks. In a stretched or HyperSwap configuration
with IP quorum, the system automatically selects an MDisk from both sites. These MDisks store
metadata that is used for system recovery. If you want to select specific MDisks to use as quorum disks, select MDisks by Pools, right-click the MDisk, and select Quorum > Modify Quorum Disk.
You can also use the lsquorum command on the system CLI to verify that the IP quorum application is connected and is the active quorum device. If you want to modify the MDisks that are used for quorum by using the CLI, use the chquorum command.
There are strict requirements on the IP network and some disadvantages with using IP quorum
applications. Unlike quorum disks, all IP quorum applications must be reconfigured and redeployed
to hosts when certain aspects of the system configuration change. These aspects include adding or
removing a node from the system or when node service IP addresses are changed. Other
examples include changing the system certificate or experiencing an Ethernet connectivity issue.
An Ethernet connectivity issue prevents an IP quorum application from accessing a node that is still
online. If an IP application is offline, the IP quorum application must be reconfigured because the
system configuration changed. If you change the configuration by adding a node, changing a
service IP address, or changing SSL certificates, you must download and install the IP quorum
application again.
System monitoring
Settings
Access
Audit Log entries
Quorum devices
IBM Service Assistant Tool (SAT)
This module discusses the IBM Service Assistant Tool (SAT), including the requirements and procedures to reset the system password.
The Service Assistant Tool (SAT) is a web-based GUI that is used to service all IBM Spectrum Virtualize systems, including storage enclosures. IBM Support and system administrators can access the interface of a node by using its Ethernet port 1 service IP address, through either a web browser or an SSH session.
The Service Assistant Tool provides a default superuser user ID and a default password of passw0rd, which is “password” with a zero (0) instead of the letter “o”. Only those with a superuser ID can access the Service Assistant interface. For security maintenance, it is highly recommended to change the user ID and password, granting access only to those who require it.
The Service Assistant Tool interface can be used in various service event cases. The primary use is to perform service-related troubleshooting tasks when a node is in the service state or is not yet a member of a cluster, during scheduled maintenance, or when directed by maintenance procedures or an IBM Support engineer to perform certain service actions. However, the Service Assistant also allows convenient access to node configuration information and status.
The Home page of Service Assistant Tool shows various options for examining installed hardware
and revision levels and for identifying canisters or placing these canisters into the service state.
One important fact to consider is that the Service Assistant Tool contains destructive and disruptive functions, and any incorrect usage might cause unplanned downtime or even data loss.
The Node Detail section displays data that is associated with the selected node:
• The Node tab shows general information about the node canister that includes the node state
and whether it is a member of the cluster.
• The Hardware tab shows information about the hardware.
• The Access tab shows the management IP addresses and the service addresses for this node.
• The Location tab identifies the enclosure in which the node canister is located.
• The Ports tab shows information about the I/O ports.
You should use the Service Assistant Tool to complete service actions on node canisters only when directed to do so by the fix procedures.
The storage system management GUI operates only when there is an online system. Use the
service assistant if you are unable to create a system or if both node canisters in a control
enclosure are in service state. The node canister might also be in a service state because it has a
hardware issue, has corrupted data, or has lost its configuration data.
The service assistant does not provide any facilities to help you service expansion enclosures.
Always service the expansion enclosures by using the management GUI.
If used inappropriately, the service actions that are available through the service assistant can
cause loss of access to data or even data loss.
Listed are a number of service-related actions that can be performed using the Service Assistant
Tool interface. A number of tasks might cause the node canister to restart. It is not possible to
maintain the service assistant connection to the node canister when it restarts. If the current node
canister on which the tasks are performed is also the node canister that the browser is connected to
and you lose your connection, reconnect and log on to the service assistant again after running the
tasks.
The Service Assistant GUI can be used to restart system services, if required by selecting Restart
Service. Next, choose the Service that should be restarted: CIMOM, Web Server, Easy Tier,
Service Location Protocol Daemon (SLPD) or Secure Shell Daemon (SSHD).
This is helpful if one of the services does not work as expected, or if instructed by IBM Support. A possible scenario is to reset the SSH daemon if a script or external monitoring software harms or stalls the SSH interface.
The Service Assistant IP address is set by default and should be changed to fit the client's environment. It is highly recommended to set this IP address to make it usable in daily work; it can be on a different (management) subnet, so no client data flow will use this port. To reset the node service IP address from the factory default, select Change Service IP address and specify the new IP address using IPv4 or IPv6.
Figure 15-80. Recovering lost data using the Service Assistant Tool (T3 recovery)
If the system state is lost from all control enclosure node canisters, an administrator might be directed to re-create the storage system by using saved configuration data; this is also known as Tier 3 (T3) recovery. The recovery might not be able to restore all volume data. This procedure assumes that the system reported error code 550 or error code 578. To address the issue, perform a service action to place each node in a service state.
For a complete list of prerequisites and conditions for recovering the system, see the following
information:
• Recover System Procedure in the Troubleshooting, Recovery, and Maintenance Guide
• Recover System Procedure in the Information Center
Use the service CLI to manage a node canister in a control enclosure by using the task commands and information commands. The Service Assistant CLI is only accessible with superuser rights, and can be accessed using the Service IP address through Secure Shell (SSH).
• Commands start with sainfo and satask (similar to standard commands).
• Prefix is mandatory for sa commands.
• Same functions as in the GUI like show node information, service state, service
recommendation.
• For more detailed information, issue a sainfo command with -h, or check the Info Center.
• Be aware that the satask commands are only for the service assistant and can perform drastic changes on the cluster (leavecluster, and others).
The CLI can be used in addition to the management GUI to monitor and maintain the configuration of storage that is associated with your systems. Before you can use the CLI, you must have already installed and configured the Spectrum Virtualize system. The CLI commands use a Secure Shell (SSH) connection between the SSH client software on a client system and the SSH server on the active file module.
You might also find it useful to create command scripts by using the CLI commands to monitor for
certain conditions or to automate configuration changes that you make on a regular basis.
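As a sketch of such scripting, the following wraps a CLI call over SSH; the host name and user are placeholders, and only the SSH transport and the lsquorum command name come from this material:

```python
import subprocess

def build_ssh_cmd(host, user, command):
    # The CLI is reached over SSH to the cluster's management IP.
    return ["ssh", f"{user}@{host}", command]

def run_cli(host, user, command):
    """Run one CLI command on the system and return its output text."""
    result = subprocess.run(build_ssh_cmd(host, user, command),
                            capture_output=True, text=True, check=True)
    return result.stdout

# Example (would contact a real cluster; the host name is a placeholder):
# print(run_cli("cluster-mgmt-ip", "superuser", "lsquorum"))
```

A script like this could be scheduled to check, for example, that the active quorum device has not changed unexpectedly.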
IBM Solution Support Services for storage provides technical advisor (TA) support to assist with the
resolution of high-severity problems. It will help your IT team save precious time by working with
IBM technicians to facilitate faster problem resolution. Moreover, your TA can offer tested
methodologies to prevent outages and use IBM support more effectively. Our remotely delivered
offering can help you improve your uptime, enhance your business efficiency, and avoid future
storage outages.
The benefit of a Technical Advisor is to consult with customers on effective ways of managing total cost of ownership and freeing up customer resources. The TA:
• Engages in Technical Delivery Assessments during installation preparations.
• Develops an individualized Customer Support Plan.
• Educates the customer on the use of the IBM support structure, electronic support, and the technical support website.
• Reviews and reports on hardware inventories and service requests, depending on the geography and IBM Technical Support Services (TSS) contracts.
• Combines integrated support with IBM expertise, helping companies anticipate and respond faster to new challenges and problems.
• Provides proactive planning, advice, and guidance on storage code level upgrades, improving storage device availability.
• Acts as the focal point for support-related activities (for example, monitoring the progress of open service requests such as problem management records), while ensuring follow-up and closure as the customer advocate.
• Leads as the single point of escalation.
To proactively prevent problems, stay informed about IBM's technical support resources for all IBM products and services, and about critical IBM software support updates, with My Notifications.
My Notifications users receive technical notifications that can be tailored to meet their needs. Registration is required at https://www.ibm.com/support/mynotifications, where you need to create a user ID and password.
Keywords
• Audit Log
• Event Log
• Event notifications
• Directory Services
• Email
• MDisk Quorum
• Drive Quorum
• IP Quorum
• Transitional mode
• Enable mode
• Disable mode
• Cluster recovery
• Quorum
• SNMP
• Syslog
• Remote User Authentication
• Support package
• Upgrade test utility
• Service Assistant IP address
• User group
• Remote user
• Tier 3 (T3) recovery
• Tie-break
Review questions (1 of 3)
1. True or False: The cluster audit log contains both information and action
commands issued for the cluster.
2. True or False: Only the superuser ID is authorized to use the Service Assistant
interface.
3. True or False: Host application I/O operations are allowed during FlashSystem
code upgrades.
Review answers (1 of 3)
1. True or False: The cluster audit log contains both information and action
commands issued for the cluster.
The answer is false.
2. True or False: Only the superuser ID is authorized to use the Service Assistant
interface.
The answer is true.
3. True or False: Host application I/O operations are allowed during FlashSystem
code upgrades.
The answer is true.
Review questions (2 of 3)
4. True or False: The Spectrum Virtualize cluster IP address can be accessed from
either Ethernet port 1 or port 2 for cluster management.
5. True or False: Application data is backed up along with cluster metadata data
when the svcconfig backup command is executed.
6. True or False: The Spectrum Virtualize system configuration backup file can be
downloaded using the GUI or copied using PuTTY secure copy (PSCP).
Review answers (2 of 3)
4. True or False: The Spectrum Virtualize cluster IP address can be accessed from
either Ethernet port 1 or port 2 for cluster management.
The answer is true.
5. True or False: Application data is backed up along with cluster metadata data
when the svcconfig backup command is executed.
The answer is false.
6. True or False: The Spectrum Virtualize system configuration backup file can be
downloaded using the GUI or copied using PuTTY secure copy (PSCP).
The answer is true.
Review questions (3 of 3)
7. True or False: In a cluster, the system uses three active quorum disks in a tie-
break situation.
Review answers (3 of 3)
7. True or False: In a cluster, the system uses three active quorum disks in a tie-break
situation.
The answer is false. The system uses three quorum disks to record a backup of system
configuration data to be used in the event of a disaster. However, only one of the three
quorum disks is active at a time and is used in a tie-break situation.
Summary
• Recognize system monitoring features to help
address system issues
• Differentiate between local support assistance
and remote support assistance
• Employ system configuration backup and
extract the backup files
• Summarize the benefits of an SNMP, syslog,
and email server for forwarding alerts and
events
• Recall procedures to upgrade the system
software and the use of host spare nodes
• Evaluate and filter administrative task
commands entries that are captured in the
audit log
• Describe the concept of a quorum
configuration
• Identify the functions of Service Assistant Tool
Overview
This module introduces IBM Storage Insights and its ability to optimize your storage infrastructure by using a cloud-based storage management and support platform with predictive analytics.
References
Implementing IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1
http://www.redbooks.ibm.com/redpieces/pdfs/sg247938.pdf
Objectives
IBM Storage Insights © Copyright IBM Corporation
This topic introduces IBM Storage Insights, an enterprise-proven, cognitive, cloud-based storage offering.
IBM Storage Insights is an IBM Spectrum Control Software as a Service (SaaS) offering with its core running on IBM Cloud (SoftLayer). It combines IBM Analytics leadership and a rich history of storage management expertise with an enterprise-proven, cognitive, cloud-based storage insights platform, enabling storage administrators to take control of their storage environment to address these challenges:
• Holistic storage monitoring, so that you always know what is going on with your storage
• Cognitive storage system analytics, or the ability to proactively identify whether your setup meets design best practices for the applications you have
• Planning tools to help optimize your storage infrastructure
• Proactive storage diagnostics, focused on helping you reduce downtime and get your systems up and running quickly in the event of an issue
The entitled version of Storage Insights is available at no cost to clients of IBM block storage systems with a current hardware warranty or maintenance contract.
Figure 16-4. Millions of telemetry data points collected daily from a single device!
(Slide highlights: analyze patterns; predict and prevent issues; proactive support notifications; data, integration, performance, and support layers.)
With the increasing demands of growing storage environments, businesses are faced with the complexity of trying to determine the best time to invest: should they add storage based on a capacity need, only to find that performance issues become more of a constraint for the business?
IBM Storage Insights offers some key capabilities that help clients meet those demands and much
more by helping to build the connective fabric between IBM, the storage devices, and the user. With
a unified view of all managed IBM systems, IBM Storage Insights provides a common management
platform to monitor all your storage inventory with diverse workloads within a single point of
metadata collection for both current and legacy devices.
Through a data-centric architecture, Storage Insights collects millions of telemetry data points (per device, per day) and calls home with that data, providing up-to-the-second system reporting of capacity and performance by system, application, or department. This storage monitoring solution looks at the overall health of the system and determines whether configurations meet best practices. These analytics-driven insights can help businesses increase storage utilization by reclaiming unused storage, improve capacity planning with increased visibility into data growth rates and available capacity, and, if system resources are being overly taxed, move data to the most cost-effective storage tier.
With its cognitive storage management capabilities, Storage Insights includes proactive support notifications that enable high service quality through error reduction, pinpointing the source of performance issues on your storage device and providing recommendations.
Finally, Storage Insights provides access to the IBM Support website to help facilitate advanced customer service, allowing storage administrators to open support tickets faster, view those tickets (open and closed) collectively with IBM Support, and track trends and events to pinpoint critical actions. With an auto log collection capability, administrators no longer must wait for support to collect the logs before looking into the problem, speeding incident resolution by as much as 50%, so that you can spend less time troubleshooting storage problems and more time planning for your future storage needs.
IBM Spectrum Control provides analytics-driven data management as well as efficient infrastructure
management for virtualized, cloud, and software-defined storage to simplify and automate storage
provisioning, capacity management, availability monitoring, and reporting.
For comparison, IBM Spectrum Control offers many of the same functions that are provided in IBM Storage Insights, and much more. However, Spectrum Control does not provide the same optimized integration with IBM Support, nor does it provide IBM Support with information about the monitored devices and their performance.
IBM Spectrum Control and IBM Storage Insights complement each other. Clients can use Storage Insights to transform their support experience, while still maintaining Spectrum Control for a more holistic view of the infrastructure (fabrics and hosts) and more capabilities, such as the ability to provision storage, automatically move storage workloads to more appropriate storage devices based on policy and service level, and monitor Fibre Channel SANs.
Two versions of IBM Storage Insights are available to help you manage a storage environment: an
entitled version and a subscription-based, full version.
• The entitled no cost version of Storage Insights provides clients a foundation to see a
comprehensive inventory of their IBM storage systems and it facilitates hassle-free log
collection and better interactions with IBM Support.
• Storage Insights Pro, the subscription-based, full version, gives clients deeper insight into storage performance, configuration, capacity planning, and business impact analysis.
• IBM Storage Insights Pro is licensed via a subscription. The solution is purchased based on the managed capacity of the storage systems that are being monitored and made available for storage consumption. For block storage systems, managed capacity is the usable physical capacity that is made available to pools.
This chart provides a quick comparison of the different features of the two IBM Storage Insights offerings. Clients with an IBM Storage Insights Pro subscription automatically have all the functions in IBM Storage Insights at no additional charge, allowing immediate streamlined support and event monitoring capabilities.
Clients who do not have Pro installed can easily download a 30-day trial of the full-function IBM Storage Insights Pro offering from within the Storage Insights dashboard with a single click of the "unlock more capabilities" button, instantly allowing greater visibility into their managed storage devices.
(Slide callout: up to 1 year of history.)
IBM Storage Insights Pro provides a more comprehensive view: the hosts that have storage on a device, performance even at the port level, the capacity and health of storage resources, and a visual view of all your storage systems, including your file and object storage devices in addition to the block systems that IBM Storage Insights is monitoring.
Storage Insights Pro provides reports on many aspects of your controller (100+ Metrics) through
highly interactive and customizable performance charting, such as CPU performance, Port
performance, Pool performance. Its context-specific drill-up/drill-down capability supports efficient
problem analysis. If you find a time interval that is concerning, with a single click all the
performance charts synchronize to the same time interval - helping you with fast problem
determination.
Storage Insights Pro also helps clients reduce storage costs and optimize their data center by
providing features like intelligent capacity planning, storage reclamation recommendations, storage
tiering recommendations, and deeper performance troubleshooting capabilities.
To maintain a view of your storage system health, you must enable both Call Home and inventory
reporting to view its Call Home events in IBM Storage Insights Pro.
(Slide highlights: IBM Support; ISO 27K ISM-certified cloud; configuration best practices; IBM metadata; remote diagnostics; predictive analytics; faster time to resolution; lightweight data collector.)
All devices are managed within a cloud infrastructure that extends your monitoring to a team of teams through a single unified dashboard.
IBM Storage Insights uses a data collector, a lightweight software component that acts as a proxy
agent to communicate to the storage devices by using remote access. From the monitored storage
systems, it sends the collected metadata that pertains to asset, configuration, and performance
statistics without customer interaction.
When you access the dashboard for the first time, you are prompted to download the data collector.
In the IBM Cloud, the metadata is protected by the physical, organizational, access, and security controls of the IBM Cloud (SoftLayer) data centers. To ensure uninterrupted data flow, the data collector has the following characteristics:
• The data collector connects your on-premises storage environment to the cloud-based service, and only communicates in one direction: from your data center to the IBM Storage Insights instance on the IBM Cloud.
• All communications between the storage systems in the local data center and the IBM Storage Insights service in the IBM Cloud data center are initiated solely by the data collector. It provides no LAN-external access and no remote APIs that might be used to interact with it.
• The data collector uses Hypertext Transfer Protocol Secure (HTTPS), which encrypts and compresses the metadata and sends the metadata package through a secure channel to the IBM Cloud data center. When the metadata package is delivered, the metadata is decrypted, analyzed, and stored.
▪ Sensitive information such as user names and passwords (or SSH certificates) is provided by IBM Storage Insights to each storage system upon each data collection iteration. All storage device passwords are AES 256-bit encrypted before they are stored on the IBM Storage Insights instance. This information is transmitted over a secure communication channel that is established by the data collector.
• All devices are managed within an IBM secure ISO 27K Information Security Management certified cloud.
As a web-based cloud solution, upgrades are applied automatically to the Storage Insights instance in IBM Cloud, allowing the client to have instant access to the latest features.
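The compress-then-send flow described above can be sketched as follows. This is an illustrative model only: the payload fields are invented, and the real collector's wire format is product-defined. In the real collector, the packaged payload travels over an HTTPS (TLS) channel that is always initiated outward from the data center.

```python
import gzip
import json

# Illustrative sketch of the collector-side flow: metadata is serialized and
# compressed before being sent one-way over HTTPS to the cloud instance.
# The payload fields below are assumptions for demonstration.

def package_metadata(metadata: dict) -> bytes:
    """Serialize and gzip-compress a metadata sample for transmission."""
    return gzip.compress(json.dumps(metadata).encode("utf-8"))

def unpackage_metadata(payload: bytes) -> dict:
    """Reverse the packaging step, as the cloud side would before analysis."""
    return json.loads(gzip.decompress(payload).decode("utf-8"))

sample = {"system": "V7000-demo", "read_iops": 1200, "response_ms": 1.4}
payload = package_metadata(sample)
# Transport encryption (TLS) would wrap this payload; the round trip below
# stands in for the deliver-decompress-analyze step on the cloud side.
assert unpackage_metadata(payload) == sample
```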
(Slide: access to the metadata is limited to customers, administrators, IBM Support, and the IBM Cloud team. Performance metadata: performance metrics based on read/write data rates, I/O rates, and response times.)
The data collector collects metadata based on the operations that are monitored, such as:
• Asset metadata (name, model, firmware, and machine/model type of the storage system)
• Inventory and configuration metadata for the storage system's resources, such as volumes, pools, disks, and ports
• Capacity metadata, such as capacity, unassigned space, used space, and the compression ratio
• Performance metadata based on metrics such as read and write data rates, I/O rates, and response times
• Diagnostic data, which is also collected as log packages and added to the findings in support tickets
The actual application data that is stored on the client’s storage systems cannot be accessed by the
data collector.
Access to the metadata that is collected is restricted to only:
• The customer who owns the dashboard
• The administrators who are authorized to access the dashboard, such as the customer's
operations team
• The IBM Cloud team that is responsible for the day-to-day operation and maintenance of IBM
Cloud instances
• And, IBM Support for investigating and closing service tickets and events
IBM Storage Insights supports the deployment of multiple data collectors for multiple data centers. Whether you collect data from multiple data centers that are not connected over IP, or maintain high availability by deploying two or more data collectors within the same data center, Storage Insights automatically coordinates which data collector to use for each monitored device.
If a data collector stops working or the communication with IBM Storage Insights is interrupted, support detects the problem and an alert is sent to the email address that was used to subscribe to the IBM Storage Insights service.
However, with a data collector high availability infrastructure, if a data collector stops working or a VM fails, the other data collector on standby automatically collects the data; and should there ever be an issue with connections, the data collector with the fastest response time collects the data.
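The coordination rule described above (use a surviving collector, prefer the fastest one) can be sketched with a simple selection function. The collector records here are invented for illustration; the actual selection logic inside Storage Insights is not published.

```python
# Hypothetical sketch of the HA coordination rule: among the collectors that
# are still reporting in, choose the one with the fastest response time.
# The collector records below are invented for illustration.

def choose_collector(collectors):
    """Pick the healthy collector with the lowest response time, or None."""
    healthy = [c for c in collectors if c["alive"]]
    if not healthy:
        return None
    return min(healthy, key=lambda c: c["response_ms"])

fleet = [
    {"name": "dc1-collector-a", "alive": True, "response_ms": 35},
    {"name": "dc1-collector-b", "alive": False, "response_ms": 5},
    {"name": "dc1-collector-c", "alive": True, "response_ms": 20},
]
print(choose_collector(fleet)["name"])  # dc1-collector-c
```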
This topic describes the deployment and configuration activities to configure the IBM Storage
Insights instance.
Getting started with IBM Storage Insights requires three simple steps. To begin, you first need to enroll at https://ibm.biz/insightsreg by using your IBM ID. If you do not have an IBM ID, you can create one by using the Create your IBM account link from the enrollment web page and completing the short form.
From the enrollment page, complete the customer information form by specifying the owner, who is the storage administrator who manages all users' access and acts as the main contact for IBM Storage Insights. Within 24 hours, an IBM representative contacts you through email with your customer number and a direct URL to your personal and secure IBM Storage Insights environment.
You log in to your Storage Insights GUI dashboard by using the customer number and
credentials that are provided by IBM. After logging in, click the Deploy Data Collectors option to
start the setup process.
The lightweight data collector can run on either a physical server or a virtual machine with at least 1 GB of RAM and 1 GB of disk capacity. Next, you need to download and extract the contents of the data collector file in the location where it will be deployed, choosing one of the supported operating systems (Microsoft Windows, Linux, or IBM AIX).
• On Windows, right-click the installDataCollectorService.bat file and run it as administrator.
• For both Linux and AIX, run installDataCollectorService.sh.
The data collector also requires Ethernet network access. Therefore, you need to ensure that your firewall is configured to let the data collector connect to your instance of Storage Insights at the target Agent-sidemo.ib.ibmserviceengage.com:443. Your firewall must be configured with the data hub target at a static IP address to allow outbound communication on the default HTTPS port 443 over TCP.
After the data collector is deployed, it attempts to establish a connection to IBM Storage Insights.
When the data collector is up and running, its status appears in the Configuration > Data Collector
view in IBM Storage Insights.
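Before deploying, it can be useful to verify from the data collector host that the firewall permits the required outbound TCP connection. The following is a hedged, generic sketch of such a check; substitute the target host quoted in the text (or whichever endpoint your instance uses) for the placeholder argument.

```python
import socket

# Generic sketch: verify that an outbound TCP connection on a given port
# succeeds from this host. The target host is a placeholder assumption;
# use the endpoint your Storage Insights instance actually requires.

def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (placeholder target): can_reach("your-data-hub-target.example.com", 443)
```

Running this check before installing the collector helps distinguish a firewall problem from a collector problem.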
Next, you need to add and configure the storage systems in IBM Storage Insights to enable the tool to start collecting data and provide all of its available insights. IBM Storage Insights provides direct support for most IBM storage, including flash, file, object, software-defined, and block solutions. This table lists the storage systems that are supported by IBM Storage Insights and IBM Storage Insights Pro.
Currently, only IBM block-based storage devices include support capabilities like auto log collection. Some of the capabilities for seeing tickets and managing the support workflows are different for select systems.
Non-IBM device views are limited to storage devices behind an IBM Spectrum Virtualize system; therefore, any deep root-cause analysis of failures on those devices still requires additional tools that are provided by the vendor.
To get started, the wizard displays a list of currently supported storage systems types and models.
Select the storage family that you want to be monitored and analyzed by the data collector.
During this process, you complete following tasks for each storage device:
• Add storage system
• Schedule the probe and performance data collection
• Configure pool tiers and define their tiering thresholds
EMC is a non-IBM storage device that is only supported in Storage Insights Pro.
When adding a storage device, you must provide the IP address of the storage system and the
corresponding login credentials or SSH certificates for authentication.
IBM Storage Insights tests the connection to the storage system. If it connects successfully, it adds
the system to the managed environment and displays the Data Collection Schedule configuration
pane, where you configure the storage system probe schedule, enable the performance monitor,
and configure the granularity of performance samples.
IBM Storage Insights implements historical data retention periods for the data that is collected from
the storage systems as well as for the internal tool logging. These periods vary depending on the
granularity of the data to be retained.
The capacity data retention periods are:
• Daily: 12 weeks
• Weekly: 24 weeks
• Monthly: 24 months
Whereas the performance data retention periods are:
• Sample: 2 weeks
• Hourly: 4 weeks
• Daily: 12 months
1-minute interval data is retained for only 7 days, but it is aggregated into 5-minute data that is retained for the user-defined duration.
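The aggregation step described above (rolling per-minute samples up into 5-minute values for longer retention) can be sketched as a simple windowed average. The sample values are invented, and averaging is an assumption; the product's actual aggregation function is not published.

```python
# Illustrative sketch of the retention rule: 1-minute samples are kept
# short-term, while 5-minute aggregates are derived for longer retention.
# The sample values are invented, and averaging is an assumed aggregate.

def aggregate(samples, window=5):
    """Average consecutive fixed-size windows of per-minute samples."""
    return [
        sum(samples[i:i + window]) / window
        for i in range(0, len(samples) - window + 1, window)
    ]

per_minute_iops = [100, 110, 90, 105, 95, 200, 210, 190, 205, 195]
print(aggregate(per_minute_iops))  # [100.0, 200.0]
```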
IBM Storage Insights also provides the ability to add users, such as other storage administrators, IBM Technical Advisors, and IBM Business Partners, at any time to allow them access to your IBM Storage Insights dashboard.
To do so, click your user name in the upper-right corner of the dashboard, and select Manage
Users. From your MYIBM page, ensure that IBM Storage Insights is selected, and then click Add
New User.
A user such as an IBM Technical Advisor (TA) provides technical advisory support to assist with the resolution of high-severity problems, working with IBM technicians to facilitate faster problem resolution, as well as offering tested methodologies to prevent outages.
Storage administrators can also grant access to IBM Business Partners, allowing them to view all the performance data and reporting, including a high-level overview of the entire storage infrastructure's performance.
To view an IBM system, select Dashboard > Overview > Resources pane, then click Block and scroll down to locate the system name, or use the filter options.
As a cloud-based unified software offering, IBM Storage Insights immediately starts to collect your storage system data, with the ability to deliver the first set of actionable insights in less than 30 minutes of runtime. This information is processed by using proprietary IBM Analytics, and insights appear in simple tile views that provide quick visibility into storage environment health and efficiency-improving recommendations, which can help a storage administrator's organization make more informed and better decisions.
From the Dashboard > Overview pane, you can view information based on:
• Block Capacity and File Capacity: Both track the usage of space in your storage environment,
allowing you to view in graphic charts for the storage systems that manage block and file
storage. You can also view a projection of future capacity trending to help you avoid running out
of space and to plan for the purchase of more capacity.
• Resources: Allows you to view the number of block, file, and object storage systems that were added for monitoring, including the number of servers, applications, and departments that consume storage. If new IBM inventory is added, Storage Insights automatically scans for and picks up new storage systems that are associated with the customer numbers on record.
• Reclaimable Storage: This feature is supported only with IBM Storage Insights Pro. This chart
shows the total amount of space that can be reclaimed in your data center. This is based on
data that was analyzed to identify the volumes that are not being used. A list of the volumes that
are not being used is also generated and shown on the Reclamation page under the Insights
menu.
• Top Block Performance: This chart is based on an analysis of the changes that affect the performance of the storage systems and their internal resources, such as pools and volumes. This information changes over time based on the amount of metadata that is available.
• Tier Analysis: To see the tier planning chart, you must define the criteria for tiering the pools and
set threshold limits for each tier of storage. This feature is supported only with IBM Storage
Insights Pro. This feature allows you to compare the current with the recommended allocation
of space across tiers in your data center. The current column for each tier shows the distribution
of allocated space across each tier. The recommended column shows, based on the threshold
limits that you set, the optimum distribution of space across each tier.
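The capacity-trend projection mentioned in the Block Capacity bullet above can be sketched as a linear extrapolation from recent daily usage figures. This is an illustrative model only; the figures are invented and the product's actual forecasting method is not published.

```python
# Hedged sketch of a capacity-trend projection: fit a simple linear growth
# rate to recent daily used-capacity figures and estimate days until the
# capacity is exhausted. The figures below are invented for illustration.

def days_until_full(daily_used_tib, capacity_tib):
    """Estimate days until capacity is exhausted from a linear growth rate."""
    n = len(daily_used_tib)
    growth_per_day = (daily_used_tib[-1] - daily_used_tib[0]) / (n - 1)
    if growth_per_day <= 0:
        return None  # flat or shrinking usage: no projected exhaustion
    remaining = capacity_tib - daily_used_tib[-1]
    return remaining / growth_per_day

used = [40.0, 40.5, 41.0, 41.5, 42.0]  # TiB used on five consecutive days
print(days_until_full(used, capacity_tib=50.0))  # 16.0
```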
The dashboard provides an integrated view of all the storage systems of different workload
performance through device cards with drill-down capabilities to simplify and accelerate the
performance-related troubleshooting.
A Storage Administrator has a detailed view of device capacity utilization and historical growth, thin
provisioning and compression savings, capacity growth forecast based on empirical data such as
historical growth rates and available capacity, among several other offered insights.
Storage Insights provides a Call Home communication link between IBM storage systems, IBM Support, and IBM Storage Insights that monitors the health and status of your storage. All events are fed in and filtered to focus on the things that matter the most.
The Call Home feature transmits operational and event-related data about your storage to IBM
Storage Insights through a Simple Mail Transfer Protocol (SMTP) server connection. These events
are shown in the Event Feed on a dashboard in real time, so you can be aware of hardware failures
and potentially serious configuration or environmental issues when they occur.
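To make the SMTP-based transmission concrete, the following sketches how a Call Home style event notification could be composed as an email message before being handed to an SMTP server. This is illustrative only: the addresses, subject format, and fields are assumptions, and real Call Home payloads are product-defined.

```python
from email.message import EmailMessage

# Illustrative sketch only: composing a Call Home style event notification
# as an email message, as it would be handed to an SMTP server. The field
# names and addresses below are assumptions, not the product's real format.

def build_event_email(system_name, event_id, description):
    msg = EmailMessage()
    msg["From"] = f"callhome@{system_name}.example.com"
    msg["To"] = "storage-insights-feed@example.com"
    msg["Subject"] = f"[{system_name}] Call Home event {event_id}"
    msg.set_content(description)
    return msg

msg = build_event_email("V7000-demo", "1065", "Drive fault detected")
# A real sender would now deliver it via the configured SMTP server, e.g.:
#   smtplib.SMTP(smtp_host).send_message(msg)
print(msg["Subject"])
```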
Tile view provides quick access to essential information about each storage system.
In an effort to continue to improve its features and functions, Storage Insights allows you to switch from the tile view to a table view, with the option to sort by columns, customize the columns based on preference, and then drill into items that are of greater interest.
1. Click Create Custom Dashboard
2. Select the storage systems
3. Specify a name
4. Click Create
(Only enabled for the customer's storage administrator.)
You can create multiple customized dashboard views to cover the specific needs of your entire enterprise, which allows you to monitor storage systems based on different data center locations, types of storage environments, or production system environments.
This process requires only a few steps: click the dashboard icon and select the storage systems that you want to include in the dashboard. Next, specify a name for your customized dashboard and click Create. The dashboard view is refreshed with the new dashboard, and only the storage systems you selected are displayed.
This feature is only enabled for the customer’s storage administrator.
(Slide: storage monitoring of key performance metrics, capacity, and Call Home events. Call Home must be enabled.)
IBM highly recommends the configuration of Call Home to help you resolve incidents before they
affect critical storage operations. When Call Home is enabled for storage systems in Storage
Insights, your dashboard constantly monitors the health and availability of your storage and
provides a diagnostic feed of events.
Call Home is typically enabled during the initial setup of the system by an IBM service
representative. If Call Home is not enabled, contact your representative for help.
If a controller is indicating problems, the device card that is positioned at the top of the dashboard changes color: red alerts to serious events, yellow indicates a message or warning, and gray indicates that Call Home is not enabled. All events are presented in a time sequence, with the three most recent events listed.
(Slide: to see the performance information, flip the card tab. Storage monitoring of key performance metrics, capacity, and Call Home events.)
With the collection of over 100 metrics from IBM devices, you can see the key performance of the
storage resources for your critical applications and storage systems, which helps you to investigate,
detect, and resolve performance issues much faster.
Each of the device cards can be flipped back and forth by using the page folder at the lower right
end of the card. This feature allows you to see the capacity and performance data, which might be
related to the serious issue reported. The data is shown as basic metrics:
• Read/write I/O rate
• Data rate
• Response time
With its streamlined interaction, Storage Insights allows you to monitor the health, performance, and capacity of a selected block storage system in an overview, as well as providing up-front notification about your storage system health.
System Events
You also have quick access to track the Call Home events from a single storage enclosure for all the alerts that were detected. All storage information that is displayed can be exported into a CSV, PDF, or HTML file.
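As a small sketch of the CSV export option, the following renders a list of event rows as CSV text. The event rows are invented for illustration; the actual export is performed by the Storage Insights GUI itself.

```python
import csv
import io

# Sketch of exporting displayed event rows to CSV, mirroring the export
# option described above. The event rows below are invented for illustration.

def events_to_csv(events, fieldnames):
    """Render a list of event dictionaries as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(events)
    return buf.getvalue()

events = [
    {"time": "2019-05-14 10:12", "severity": "warning", "summary": "Array rebuild"},
    {"time": "2019-05-14 10:20", "severity": "error", "summary": "Drive offline"},
]
print(events_to_csv(events, ["time", "severity", "summary"]))
```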
Upgrade recommendations
Below minimum
recommended level
Link to obtain the download
Call Home generates informational event notifications when significant changes to the state of
the system are detected. Informational events provide information about the status of an operation.
For example, the first informational event indicates that the system is running a software level
that is several levels behind. The second example shows that a product warranty has expired; this
type of notification also provides warnings from 120 days up to 30 days before the
expiration date.
Each informational event notification also provides links to additional information as well as access
to current or recommended downloads.
In this example, an issue was found that caused a self-recovering node warm start. The issue is
related to a defect in a small timing window during the volume's cache transition process. Based
on in-cloud analysis of Call Home data, it was discovered that the same issue has occurred on
several systems running a particular range of software releases. Therefore, a software fix can be
issued for the affected releases.
Upon detection of a hardware failure, a software error code, or a potentially serious configuration
problem, Call Home directs the storage system to transmit operational and error-related data
electronically to IBM Support through a specified Simple Mail Transfer Protocol (SMTP) server
connection, in the form of an event notification email.
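The SMTP-based notification path described above can be sketched with Python's standard email and smtplib modules. The addresses, host name, and message fields below are placeholders, not IBM's actual Call Home endpoints or payload format.

```python
import smtplib
from email.message import EmailMessage

def build_call_home_email(system_id, event_code, details):
    """Build an event-notification email carrying error-related data."""
    msg = EmailMessage()
    msg["From"] = "storwize@example.com"    # placeholder sender address
    msg["To"] = "callhome@example.com"      # placeholder support inbox
    msg["Subject"] = f"Call Home event {event_code} on {system_id}"
    msg.set_content(details)
    return msg

def send_call_home(msg, smtp_host="smtp.example.com", smtp_port=25):
    """Relay the notification through the configured SMTP server."""
    with smtplib.SMTP(smtp_host, smtp_port) as smtp:
        smtp.send_message(msg)
```

On a real system, the SMTP server address is part of the system's support configuration; this sketch only illustrates the message-building and relay steps.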
IBM Support uses a streamlined ticketing process to determine whether the detected event
requires service or further investigation. If warranted, a problem ticket (PMR or PMH) is created
and sent to the appropriate IBM Support team. IBM Support obtains read-only access to diagnostic
information about monitored storage systems, so they can help troubleshoot and resolve problems.
This proactive support minimizes the number of interaction cycles between you and IBM Support.
The Tickets pane allows you to see a collective view of all tickets that were opened or closed for
the storage device.
As the storage administrator, if you encounter a problem, you can get support promptly through the
unified support experience, without any IBM Support intervention, by completing the following tasks:
• Open tickets for IBM Support about storage systems in your storage environment. Log
packages are automatically added to new tickets.
• Update tickets with a new log package and add a note or an attachment to the ticket.
• View the tickets that are open or closed for a resource, regardless of how they were opened.
• View the ticket history for a resource.
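The ticket workflow above (open with an auto-attached log package, then update with notes or new log packages) can be modeled as a small sketch. The class and function names here are hypothetical illustrations, not the Storage Insights API.

```python
from dataclasses import dataclass, field

@dataclass
class Ticket:
    """Hypothetical model of a unified-support-experience ticket."""
    resource: str
    summary: str
    status: str = "open"
    attachments: list = field(default_factory=list)
    notes: list = field(default_factory=list)

def open_ticket(resource, summary):
    """Open a ticket; a log package is added automatically."""
    t = Ticket(resource, summary)
    t.attachments.append(f"log-package-{resource}.zip")  # auto-added log package
    return t

def update_ticket(t, note=None, new_log_package=False):
    """Add a note and/or a fresh log package to an existing ticket."""
    if note:
        t.notes.append(note)
    if new_log_package:
        t.attachments.append(f"log-package-{t.resource}.zip")
    return t
```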
The Properties pane defines the storage attributes of each storage device, including its hardware
type and model number, current firmware level, IP address, and location. This information is
automatically gathered by IBM Support as part of the ticketing process.
IBM Storage Insights allows you to view the list of nodes and enclosures in your inventory and the
type of enclosure. You can also see whether IBM Storage Insights has determined that the node or
enclosure has an active maintenance or warranty contract.
This topic provides an active link to the IBM Storage Insights demo.
ibm.biz/insightstour
ibm.biz/storageinsightsdemo
Explore Storage Insights through guided and interactive demos to see how IBM Storage Insights
can help you monitor the performance, capacity, and health of your storage portfolio.
Additional resources
ibm.biz/insightsfacts
ibm.biz/insightsstart
ibm.biz/insightssecurity
In addition to the demos, listed here are several URLs that can help you get started with
implementing IBM Storage Insights.
Keywords
• IBM Storage Insights
• IBM Storage Insights Pro
• IBM Spectrum Control
• Block Capacity
• File Capacity
• Reclaimable Storage
• File Objects
• Data Collection
• Metadata
• Performance
• Diagnostics
• Logs
• Hypertext Transfer Protocol Secure (HTTPS)
• IBM Cloud
• IBM Support
• Tier Analysis
• Dashboard
• Call Home
• In-Cloud Analysis
• IBM Spectrum Storage Software
Review questions (1 of 3)
1. True or False: IBM Storage Insights collects sensitive and private information from each
storage device.
Review answers (1 of 3)
1. True or False: IBM Storage Insights collects sensitive and private information from each
storage device.
The answer is False. The data collector collects only metadata, which is encrypted and
compressed before it is sent. The actual application data that is stored on the client's storage
systems cannot be accessed by the data collector.
Review questions (2 of 3)
3. What are the advantages of the Storage Insights cloud service?
A. Shared view of the infrastructure helps IBM Support diagnose issues
B. IBM Support can grab their own log packages
C. The Storage Insights service is managed and upgraded by IBM operations
D. Provides a unified view of the infrastructure across many different devices
E. All of the above
4. True or False: IBM Support personnel accessing the system through Remote Support
Assistance require challenge-response authentication before access is permitted.
Review answers (2 of 3)
3. What are the advantages of the Storage Insights cloud service?
A. Shared view of the infrastructure helps IBM Support diagnose issues
B. IBM Support can grab their own log packages
C. The Storage Insights service is managed and upgraded by IBM operations
D. Provides a unified view of the infrastructure across many different devices
E. All of the above
The answer is All of the above.
4. True or False: IBM Support personnel accessing the system through Remote Support
Assistance require challenge-response authentication before access is permitted.
The answer is True.
Review questions (3 of 3)
5. True or False: The lightweight data collector can run only on a physical server with
bidirectional communication.
6. True or False: Support tickets and log packages can be submitted only by IBM Support.
Review answers (3 of 3)
5. True or False: The lightweight data collector can run only on a physical server with
bidirectional communication.
The answer is False. The lightweight data collector can run on either a physical server or virtual
machine with at least 1 GB of RAM and 1 GB of disk capacity, and communicates only outbound.
6. True or False: Support tickets and log packages can be submitted only by IBM Support.
The answer is False. Storage administrators can submit problem tickets and automatically add a
log package to the ticket without any IBM Support intervention.
Summary