SFCF - Book Ebook
ROGER BOUCHARD
This book is dedicated to Nicole, my fiancée, whose support and understanding throughout the years have made this book possible. To Carmen Gnier and Germaine Bouchard, who both left us during the writing of this book.
© 2009 Brocade Communications Systems, Inc. All Rights Reserved.

Brocade, the B-wing symbol, BigIron, DCX, Fabric OS, FastIron, IronPoint, IronShield, IronView, IronWare, JetCore, NetIron, SecureIron, ServerIron, StorageX, and TurboIron are registered trademarks, and DCFM and SAN Health are trademarks of Brocade Communications Systems, Inc., in the United States and/or in other countries. All other brands, products, or service names are or may be trademarks or service marks of, and are used to identify, products or services of their respective owners.

Notice: This document is for informational purposes only and does not set forth any warranty, expressed or implied, concerning any equipment, equipment feature, or service offered or to be offered by Brocade. Brocade reserves the right to make changes to this document at any time, without notice, and assumes no responsibility for its use. This informational document describes features that may not be currently available. Contact a Brocade sales office for information on feature and product availability. Export of technical data contained in this document may require an export license from the United States government.

Brocade Bookshelf Series designed by Josh Judd

Securing Fibre Channel SANs
Written by Roger Bouchard
Edited by Victoria Thomas
Design and Production by Victoria Thomas
Illustrated by Jim Heuser, David Lehmann, and Victoria Thomas
Contributors: Josh Judd (SAN basics), Marcus Thordal (Brocade Encryption Switch), Scott Kipp (key management), Jim Davis (zoning), and Thomas Scheld and Martin Sjoelin (lab experiments for myths)
Reviewers: Greg Farris, Tom Clark, Josh Judd, Marcus Thordal, Scott Kipp, Jim Davis, and Mark Dietrick

Printing History
First Edition, April 2009
Important Notice

Use of this book constitutes consent to the following conditions. This book is supplied AS IS for informational purposes only, without warranty of any kind, expressed or implied, concerning any equipment, equipment feature, or service offered or to be offered by Brocade. Brocade reserves the right to make changes to this book at any time, without notice, and assumes no responsibility for its use. This informational document describes features that may not be currently available. Contact a Brocade sales office for information on feature and product availability. Export of technical data contained in this book may require an export license from the United States government.

Brocade Corporate Headquarters
San Jose, CA USA
T: (408) 333 8000

Brocade European Headquarters
Geneva, Switzerland
T: +41 22 799 56 40

Brocade Asia Pacific Headquarters
Singapore
T: +65 6538 4700
DISCLAIMER
The author is not an attorney and this book in no way represents any legal advice or legal opinion. For legal advice or opinion on data protection measures, consult an attorney.

Acknowledgements

Very special thanks go to Martin Skagen, my friend and Brocade mentor, for his generosity in sharing his extensive technical knowledge with me and his support of my advancement in SAN security.

Special thanks to Victoria Thomas, my wonderful copyeditor, whose experience, guidance, and patience helped this novice writer weave through the perils and intricacies of creating a book and making it look great.

This book would not have been possible without the help of several contributors and reviewers who shared their knowledge and expertise. For this, I would like to thank Greg Farris, Tom Clark, Josh Judd, Marcus Thordal, Scott Kipp, Jitendra Singh, Jim Davis, Thomas Scheld, Martin Sjoelin, and Mark Dietrick.

Finally, thanks to Ron Totah, who provided me with the opportunity to dedicate the time and created the environment essential to complete this project.
About the Author

Roger Bouchard has been in the computer industry since 1978 with a wide range of experience in programming, analysis, consulting, education, and management. He has taught IT security courses since 1994 and has been focused exclusively on the storage industry since 1996. Since Mr. Bouchard joined Brocade in 2000, he has obtained his BCFP, BCSD, and BCSM certifications as well as the CISSP certification in 2005 and an M.Sc. in Information Assurance (MSIA) from Norwich University. His role evolved within the company from a Sales Engineer (SE) Subject Matter Expert (SME) on Security to founding and leading the Security Practice in the Services organization. There he developed processes for SAN Security Assessments and SAN Hardening engagements delivered across North America. He is currently a Global Solutions Architect, and in this role has written several white papers on SAN security and is a frequent speaker at storage/SAN conferences.
Contents
Chapter 1: Introduction
    The SAN Security Dilemma
    Why SAN Security?
    Who Needs to Know About SAN and Storage Security?
    Chapter Summary
Appendix A: Fabric OS Security Features Matrix
Appendix B: Standards Bodies and Other Organizations
    FCIA
    IEEE
    ANSI T11
    SNIA
    IETF
    OASIS
Index
Figures
Figure 1. Fabric and SAN
Figure 2. Examples of optical fiber sniffers
Figure 3. Sniffing FC frames in a SAN
Figure 4. FC trace analyzer screen
Figure 5. Port mirroring screen capture
Figure 6. Sniffing FC frames using a mirrored port
Figure 7. Excel spreadsheet reconstructed from FC sniffing
Figure 8. FC protocol generations
Figure 9. FC frame format
Figure 10. FC protocol layers
Figure 11. FC devices and port link types
Figure 12. FC device WWNs
Figure 13. FC credit-based flow control
Figure 14. FSPF path selection
Figure 15. Frame redirection
Figure 16. Dual-fabric design
Figure 17. Cascade topology (four and six switches shown)
Figure 18. Ring topology
Figure 19. Full mesh topology
Figure 20. Partial mesh topology
Figure 21. Pure core-edge topology
Figure 22. Typical core-edge topology
Figure 23. Resilient core-edge topology
Figure 24. Multi-tiered fabrics
Figure 25. Routed fabrics
Figure 26. Extended fabric using dark fiber
Figure 27. Extended fabric using FCIP
Figure 28. Simplified public key exchange
Figure 29. Block cipher encryption/decryption
Figure 30. Stream cipher encryption/decryption
Figure 31. Hashing algorithm
Figure 32. Digital signature
Figure 33. Public key infrastructure
Figure 34. Trusted key exchange
Figure 35. Opaque key exchange
Figure 36. SAN Security Model
Figure 37. Example of traffic isolation
Figure 38. SAN encryption points for data-at-rest
Figure 39. Poor example of securing SAN-attached DMZ servers
Figure 40. Securing SAN-attached DMZ servers to a production SAN
Figure 41. Securing SAN-attached DMZ servers using a physically separate SAN
Figure 42. Consolidated server network interface using FCoE and CEE
Figure 43. Front view of the Brocade Encryption Switch
Figure 44. Rear view of the Brocade Encryption Switch
Figure 45. Profile view of the Brocade FS8-18
Figure 46. Side view of the Brocade FS8-18
Figure 47. First-time encryption operation
Figure 48. HA and DEK cluster
Figure 49. DEK synchronization
Figure 50. LKM key exchange process
Figure 51. RKM key exchange process
Figure 52. SKM key exchange process
Figure 53. Brocade Encryption Switch internal architecture
Figure 54. Brocade FS8-18 Encryption Blade internal architecture
Figure 55. Encrypted cross-site backup
Figure 56. Encrypting data over dark fiber with data-at-rest encryption
Tables
Table 1. FC protocol generations
Table 2. FC protocol layers
Table 3. FC port types
Table 4. FSPF path costs
Table 5. Fabric topology design factors
Table 6. Full mesh topology ISL and port requirements
Table 7. Classification of human threats
Table 8. Data cleaning algorithms
Table 9. Domain ID behavior
Table 10. Well-known ports and services
Table 11. Brocade RBAC
Table 12. PCI-DSS merchant levels and criteria
Table 13. Common Criteria evaluation levels
Table 14. Brocade encryption solution features matrix
Table 15. IPSec encryption and authentication algorithms for FCIP
Introduction
As today's IT organizations face more and greater security threats and a growing number of industry and government regulations, securing SAN environments has become an increasingly important aspect of overall data security. This is especially the case as storage area networks continue to grow in size and extend across multiple sites.

A key factor in security is that many SANs use protocols other than Fibre Channel (FC), and many different protocols now carry storage traffic. Some are upper-level protocols (such as FICON in the mainframe world) while others run over IP (such as Fibre Channel over IP (FCIP) for tunneling FC between sites and iSCSI for fanning out to low-cost servers). The introduction of new protocols, such as FC over Ethernet (FCoE) based on the new Converged Enhanced Ethernet (CEE) standard, is also raising new security concerns in the SAN.

At a very basic level, security measures need to balance the probability of a threat occurring, the impact of a security breach, the cost of implementing countermeasures, and the value of the assets. The tolerated risk level varies significantly from one organization to another and depends on several factors. It is often dictated by government legislation and industry standards targeted at specific verticals, such as:

• Gramm-Leach-Bliley Act (GLBA) for the financial and insurance industries
• Health Insurance Portability and Accountability Act (HIPAA) guidelines for the healthcare industry
• Payment Card Industry Data Security Standard (PCI-DSS) for companies dealing with large volumes of credit card transactions
• Canada's Personal Information Protection and Electronic Documents Act (PIPEDA)
• The European Union (EU) Data Protection Directive (EU Directive 95/46/EC)
• The Monetary Authority of Singapore
Some legislation is regional, such as the precedent-setting California Senate Bill (SB) 1386 and similar laws in effect in 44 states¹ at the time of writing. This legislation requires organizations to disclose security breaches of unencrypted personal information belonging to their state residents. This means that a security breach might be made public and have serious business consequences, including customer attrition and loss of brand equity. Regardless of the specific legislation, the more valuable the data is to an organization, the lower the tolerated risk level will be when it comes to protecting it. This trend will most likely continue, especially as data security becomes an increasingly global issue. SAN security can no longer be overlooked by security and storage professionals, since the volume and value of mission-critical data in their storage environments increases every day.
Figure 1. Fabric and SAN (diagram labels: SAN; application server; HBA; management console; fabric consisting of FC backbone, director, switch, and embedded modules; storage)
1. National Conference of State Legislatures Web site (January 25, 2009): http://www.ncsl.org/programs/lis/cip/priv/breachlaws.htm
The storage area network (SAN) has been defined in many ways, and the limits of where it begins and ends can vary depending on an individual's or organization's perspective. For the purpose of this book, a fabric, often depicted as a cloud in illustrations, refers to the Fibre Channel infrastructure that makes up a storage network, namely, the FC switches, directors, routers, and backbone devices. The Host Bus Adapter (HBA) on the host and the storage controllers are also included in this definition. The SAN includes the fabric (network infrastructure) and the storage devices on which the data resides, including disk arrays, tape libraries, and both disk and tape media. Figure 1 illustrates a simple fabric and SAN. This book discusses the actual data residing on the SAN (classic data protection concepts) at a high level only; it mainly addresses the issue of data confidentiality. For those interested in information about protecting data in greater depth, consult an excellent book entitled Strategies for Data Protection, First Edition, 2008, by my esteemed colleague, Tom Clark.
ers stealing backup tapes or disk drives containing sensitive company information such as medical, research, financial, and customer information. Many cases have been reported of employees copying information and taking it with them before they leave their employer, and then selling it to criminal elements or using it in their next position with a competitor.

This book is primarily intended to raise awareness among storage, security, and IT management professionals of the need for SAN security. If it succeeds, a better understanding of the security issues raised in this book will help bridge the knowledge and cultural gap between the storage and security groups within an organization, which in turn will help IT managers better understand the risks and potential liability issues associated with their SAN. To accomplish this, basic security concepts are introduced for those overseeing the storage environment, and then basic storage concepts are presented to those involved in securing IT assets and electronic information. IT managers may also find value in the review of some of the regulations and legislation in effect throughout the United States and other countries and how they apply to the SAN environment.

Although this book is focused primarily on Brocade B-Series (classic Brocade) and M-Series (formerly McDATA) technology, the basic SAN security principles introduced here can be applied to any fabric or storage environment regardless of the vendor implementation. While there may be differences in feature availability and implementation among vendors, the general concepts and requirements are comparable. The information in this book is based on current research being performed by many organizations (full list in Appendix B) and real-world experience gained from performing actual security assessments, audits, and hardening engagements with Brocade customers throughout North America.
Since 2002, Brocade has been a leader in Fibre Channel SAN security. Based on years of real-world experience deploying SANs of varying sizes and architectures, Brocade developed a special licensed version of Fabric OS (FOS) called Secure Fabric OS, designed to meet the specific requirements of the most security-sensitive environments. For instance, Brocade introduced the first access control lists (ACLs) in the Fibre Channel industry and provided the first Fibre Channel authentication mechanism using Public Key Infrastructure (PKI). That mechanism has since been replaced with the standards-based DH-CHAP (Diffie-Hellman Challenge Handshake Authentication Protocol), a forthcoming Internet standard for the authentication of devices connecting to a Fibre Channel switch, as specified in the FC-SP/FC-sec standard from the ANSI T11 committee.

Most of the security features originally available in Secure Fabric OS have since been replaced with either equivalent or more powerful and flexible functionality in the base Fabric OS (version 5.3.0 or later), so they no longer require a special license. Appendix A provides a comprehensive list of technical security features that can be implemented in a Brocade-based SAN environment. As new security vulnerabilities are discovered or new requirements emerge, Brocade continually enhances existing features and creates new security features to help ensure that FC fabric infrastructures and the data moving through them remain secure and highly available.

Security represents a delicate balance among factors such as the type of threats and risks, the likelihood that a vulnerability can and will be exploited, the effort and cost associated with implementing countermeasures, the impact on fabric management, and the value of the asset being protected. With more than 100 FC fabric security features available, not every feature should be implemented in every environment. Different organizations have different security requirements and levels of tolerance to risk. A detailed analysis and assessment of the state of security for a given environment should be performed to fully understand the risks and how best to mitigate them. There should be enough detailed information in this book to gain the knowledge necessary to conduct this assessment. Nevertheless, there may be advantages in hiring the services of a third-party organization with expert knowledge in the subject, as is frequently done with conventional TCP/IP-based networks. Brocade offers such a service to help customers evaluate and assess the current state of security of their SAN.
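The balance described above between likelihood, impact, countermeasure cost, and asset value is often expressed with the annualized loss expectancy (ALE) calculation used in general risk analysis. The following is a minimal sketch of that calculation, not a method prescribed by this book; the class name, function name, and all dollar figures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    asset_value: float      # value of the asset at risk (hypothetical currency units)
    exposure_factor: float  # fraction of the asset value lost per incident, 0..1
    annual_rate: float      # expected number of incidents per year

    def single_loss_expectancy(self) -> float:
        # SLE = asset value x exposure factor
        return self.asset_value * self.exposure_factor

    def annualized_loss_expectancy(self) -> float:
        # ALE = SLE x annual rate of occurrence
        return self.single_loss_expectancy() * self.annual_rate


def countermeasure_is_justified(risk, residual_annual_rate, annual_cost):
    """True when the yearly loss avoided exceeds the yearly cost of the control."""
    ale_before = risk.annualized_loss_expectancy()
    ale_after = risk.single_loss_expectancy() * residual_annual_rate
    return (ale_before - ale_after) > annual_cost


# Hypothetical example: an unencrypted backup tape lost once every ten years.
tape_loss = Risk("unencrypted backup tape lost", 2_000_000, 0.25, 0.1)
print(round(tape_loss.annualized_loss_expectancy()))
print(countermeasure_is_justified(tape_loss, 0.01, 30_000))
```

A calculation like this only frames the discussion. The inputs are estimates, and regulatory obligations may require a control even when the arithmetic alone would not.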
IT Security Director/Manager. The IT security director or manager's primary concern is with the IT assets, applications, and personnel that she is responsible for. Her concern with the SAN and storage environment is more detailed, and she is responsible for implementing many of the controls and policies established by the C-level executives.

Security Professional. The security professional can be responsible for creating security policies, implementing security measures, managing the security aspects of the IT environment, monitoring the state of security of the IT environment, and responding to security incidents. He should have a direct involvement in SAN and storage security, just as he would with the corporate LAN and server environment.

Storage Professional. Storage professionals, who include SAN administrators, storage administrators, backup administrators, operators, and managers, are more concerned with following the security policies during the course of their daily activities managing and running the storage environment. The storage professional will often be called upon by the security team to provide advice on how to implement specific security measures in a SAN and storage environment. Questions the storage professional may be required to answer include: "What is the best way to encrypt backup data on tapes?" and "Which secure protocols can be used to manage the SAN switches and storage devices?"

Within a given organization, many individuals will be involved in SAN and storage security at different levels. Each has a vested interest in the due diligence and care required to protect the data residing in the SAN environment.
Chapter Summary
In conclusion, although a significant gap has existed between the storage and security worlds, both sides are learning from each other as organizations face more compliance requirements, regulations, and attacks on their electronic data. Organizations such as SNIA (Storage Networking Industry Association), SSIF (Storage Security Industry Forum), IEEE, and OASIS (Organization for the Advancement of Structured Information Standards) are developing best practices and standards to help address these issues.
Over the past eight years, this writer has had the opportunity to discuss SAN and storage security issues with thousands of security and storage professionals as well as IT managers and decision makers. These people represent businesses and industries spanning the entire spectrum from financial to health to telecommunications and also government, military, and intelligence. Although each organization has its own unique perspective on the subject of SAN security, some issues are common to all groups. Some people immediately understand the need for SAN security and recognize the hole in their IT security strategy. At the other extreme, others simply believe that there is no need to address SAN security at all.

Several misconceptions have developed from the early days of the SAN, which unfortunately have become integrated and accepted into IT folklore and culture and are now perceived as fact. As with all folklore, myths can persist over time and take on a life of their own. Although they can be entertaining to some, it is important to understand the facts so as not to fall into the "security through obscurity" way of thinking. This line of thinking can lead to a false sense of confidence in the security of a SAN environment. There is nothing more humbling to an organization than an actual security breach that becomes public and highly visible, creating a huge impact on customer and market perception. Hopefully, most organizations will take the bull by the horns and address SAN security issues before tragedy strikes.

As a result of the writer's contact with real-world people and environments, some of these myths were identified to raise awareness and set the record straight. Although these myths may be quite entertaining to some readers, others may find them a little frightening.
attack on a SAN device management interface to compromise the SAN from this entry point. For this reason, it is important to apply the same security practices normally used in conventional TCP/IP networks to secure these interfaces. Additionally, some SANs use other protocols based on TCP/IP, such as iSCSI and FCIP. iSCSI is commonly used as a low-cost, lower-performance SAN solution and allows organizations to leverage existing TCP/IP infrastructure. FCIP is usually used to connect two or more data centers over distance to enable replication of data or to perform multi-site backups. Both iSCSI and FCIP use the TCP/IP protocol, and the usual security measures deployed to protect the LAN or WAN should also be implemented with these protocols.
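As a small illustration of treating SAN interfaces like any other TCP/IP service, the sketch below classifies a list of open TCP ports found on a SAN device. The port numbers are the registered well-known ports for each service; the function name, the advice strings, and the sample scan result are hypothetical and are not taken from any Brocade tool.

```python
# TCP port numbers below are IANA registrations; the scan result is hypothetical.
CLEARTEXT_MGMT = {21: "FTP", 23: "Telnet", 80: "HTTP"}
ENCRYPTED_MGMT = {22: "SSH", 443: "HTTPS"}
STORAGE_OVER_IP = {3225: "FCIP", 3260: "iSCSI"}


def review_open_ports(open_tcp_ports):
    """Return (port, service, advice) tuples for a set of open TCP ports."""
    findings = []
    for port in sorted(open_tcp_ports):
        if port in CLEARTEXT_MGMT:
            findings.append((port, CLEARTEXT_MGMT[port],
                             "cleartext management: replace with an encrypted protocol"))
        elif port in ENCRYPTED_MGMT:
            findings.append((port, ENCRYPTED_MGMT[port],
                             "encrypted management: restrict source addresses"))
        elif port in STORAGE_OVER_IP:
            findings.append((port, STORAGE_OVER_IP[port],
                             "storage traffic: protect as any LAN/WAN service"))
        else:
            findings.append((port, "unknown", "investigate"))
    return findings


# Hypothetical scan result for one device.
for finding in review_open_ports({22, 23, 3260, 3225, 8080}):
    print(finding)
```

The point of the sketch is only that storage-related services show up in the same inventory as every other network service and deserve the same review.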
A simple experiment was performed to demonstrate this using a basic SAN and an FC trace analyzer. Although this experiment was conducted using an expensive FC trace analyzer, it could have been performed just as successfully using one of the inexpensive, sensitive photosensors described in Myth Number 3. The diagram in Figure 3 (SAN Test A) illustrates the setup of the test equipment used for this experiment.
Figure 3. Sniffing FC frames in a SAN (diagram labels: SAN Test A; application server; plaintext credit card record)

In this setup, a fictitious credit card record was created and sent in unencrypted cleartext to the FC fabric, where the record was then written to the disk array. The FC trace analyzer captured and recorded the FC frames involved in this transaction. The screen capture from the trace analyzer, shown in Figure 4, displays the different frames captured; the frame containing the credit card record is highlighted between the two red lines. Look at the bottom-right corner of the screen capture to see the ASCII version of the frame contents, circled in red. The credit card record can clearly be read from this screen.
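To see why an unencrypted payload is readable in a trace, consider how an analyzer typically renders captured bytes: a hexadecimal column next to an ASCII column. The sketch below reproduces that kind of view for an arbitrary byte string. It is an illustration only; the function name is hypothetical, the record is fictitious, and the bytes stand in for a frame payload rather than a real FC frame.

```python
def hex_ascii_dump(payload: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex column, and printable-ASCII column."""
    pad = width * 3 - 1  # width of a full hex column, including spaces
    lines = []
    for offset in range(0, len(payload), width):
        chunk = payload[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as dots, as most analyzers do.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:04x}  {hex_part:<{pad}}  {ascii_part}")
    return "\n".join(lines)


# Fictitious record standing in for the cleartext payload of a captured frame.
record = b"NAME=J DOE;CARD=0000-0000-0000-0000;EXP=01/00"
print(hex_ascii_dump(record))
```

Anything written to the fabric in cleartext appears in the ASCII column exactly as the application sent it, which is what the screen capture in the experiment shows.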
To demonstrate how easy it is to capture sniffed data and rebuild an entire file, a second experiment, similar to the one in Myth Number 6, was conducted. This time, another feature was exploited: the ability to mirror a port. A simple laptop was used to store the frames captured by the FC trace analyzer. The storage array port (port 0) was mirrored to port 15 using the port mirroring feature built into the switch, as shown in the screen capture in Figure 5 and the diagram in Figure 6.
Figure 6. Sniffing FC frames using a mirrored port and storing them on a laptop (diagram labels: SAN Test B; application server; plaintext Excel spreadsheet; storage port (0); mirrored port (15); storage; FC trace analyzer; laptop)
Securing Fibre Channel Fabrics 15
An FC trace analyzer was attached to the mirrored port (port 15) and captured the data going to the storage array. Again, as in the previous experiment, an inexpensive photosensor could be used for this demonstration. A spreadsheet containing fictitious payroll information was sent across the SAN from the host to its LUN (logical unit number) on the disk storage array. The data captured by the trace analyzer was dumped to a binary file on a laptop, which was subsequently used to reconstruct a second disk. The file system on the new disk was mounted and the spreadsheet, shown in Figure 7, could be read as though it were the original copy.
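The reconstruction step can be pictured as replaying each captured write at its block address into an empty disk image. The sketch below shows the idea under stated assumptions: the capture has already been decoded into (logical block address, data) pairs, and the block size is 512 bytes. Both the record format and the function name are hypothetical; the book does not describe the tooling used in the experiment.

```python
import io

BLOCK_SIZE = 512  # assumed logical block size in bytes


def rebuild_image(writes, image, block_size=BLOCK_SIZE):
    """Replay captured (lba, data) write records into a seekable binary image."""
    for lba, data in writes:
        image.seek(lba * block_size)  # position at the block's byte offset
        image.write(data)             # later writes to the same block win
    return image


# Hypothetical capture: two writes, recorded out of address order.
captured = [(2, b"C" * BLOCK_SIZE), (0, b"A" * BLOCK_SIZE)]
image = rebuild_image(captured, io.BytesIO())
print(len(image.getvalue()))  # size of the reconstructed image in bytes
```

With a real file in place of the in-memory buffer, the resulting image could then be attached and mounted like any other disk, which is the step that made the spreadsheet readable in the experiment.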
Figure 7. Excel spreadsheet reconstructed from FC sniffing

Clearly, FC frames can be selectively captured from a Fibre Channel network and easily stored on a storage device as simple and compact as a laptop. The information contained in the captured frames can be used to reconstruct entire files. Of course, even partial files or partial information could contain enough sensitive data to result in a significant security breach.
Chapter Summary
Common SAN security myths include the notion that because a storage network is physically isolated, it is secure. Another is that Fibre Channel is impervious to attack, both because it is a complicated protocol with no avenues in and because it cannot be sniffed. There is also a belief that even if data were to be sniffed, it would be incomprehensible and unusable; however, simple tests with an FC trace analyzer, which could equally be performed with an inexpensive optical fiber sniffer, show that to be completely false. And because every SAN environment has its own operational and business requirements, default built-in security features on FC switches are not going to ensure SAN security.

Certainly more security and storage professionals worldwide are asking about SAN and storage security than ever before. The subject comes up in conversations every day, and both storage and security professionals alike are craving more information so that they can come up to speed quickly and take the appropriate measures to secure their SAN.
Although a SAN is a network, it differs significantly from the conventional local area network or TCP/IP-based network. Since security professionals tend to be unfamiliar with SANs, they often overlook or ignore security issues for these networks. This chapter is for IT security professionals with little or no knowledge of storage and the SAN. Storage professionals may elect to skip this chapter and continue with the next chapter, which discusses security basics for storage professionals.

One of the first questions you might ask is this: Why do you need a SAN to begin with? The original model for open data storage was direct-attached storage (DAS), in which each server has its own storage directly attached using the Small Computer Systems Interface (SCSI, pronounced "skuzzy") or other protocol. While the DAS model worked well in the early days of the data center, it became clear that DAS had reached its limits when the importance and scale of IT infrastructure outgrew it in the 1990s.

DAS was inefficient, since some disks had large amounts of empty space while others were completely full, and additional disks had to be purchased. This led to a need to pool disk storage into one central location and share the resources with all hosts to optimize utilization of the storage devices. Broadly, this category of solution is known as white space optimization.

SCSI also had distance limitations and could be used only to connect devices in the same rack or at best to another device in an adjacent rack. This precluded its use for most high-availability applications and all disaster recovery solutions. SCSI performance was also an issue for storage applications, particularly backup applications, which demanded increased bandwidth to meet growing business requirements and shrinking backup windows.
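The utilization argument for pooling storage can be made concrete with a little arithmetic. The sketch below compares the number of disks needed when each server buys its own whole disks (DAS) with the number needed when the same demand is served from a shared pool. The per-server demands and the disk size are hypothetical, and the calculation deliberately ignores RAID overhead and spare capacity.

```python
import math


def disks_needed_das(demands_gb, disk_gb):
    """Each server rounds its own demand up to whole disks."""
    return sum(math.ceil(d / disk_gb) for d in demands_gb)


def disks_needed_pooled(demands_gb, disk_gb):
    """A shared pool rounds the total demand up to whole disks once."""
    return math.ceil(sum(demands_gb) / disk_gb)


def utilization(demands_gb, disks, disk_gb):
    """Fraction of purchased capacity that actually holds data."""
    return sum(demands_gb) / (disks * disk_gb)


demands = [950, 120, 60, 1010]  # hypothetical per-server needs, in GB
disk = 500                      # hypothetical disk size, in GB

das = disks_needed_das(demands, disk)
pooled = disks_needed_pooled(demands, disk)
print(das, round(utilization(demands, das, disk), 2))
print(pooled, round(utilization(demands, pooled, disk), 2))
```

With these illustrative numbers the pooled design needs fewer disks for the same data, because empty space on one server's disks can absorb another server's growth.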
Securing Fibre Channel Fabrics 19
The purpose of the SAN is to provide a means of transporting data through a network between a server and any storage devices it requires. The SCSI protocol is still the foundation for the SAN, in which SCSI commands can be transported via a network protocol instead of via directly attached SCSI cables. Although a SAN can be implemented using different network protocols, most SANs have been implemented using the Fibre Channel protocol. (There are other protocols used such as iSCSI and now FCoE, currently under development.) Note that the spelling of the word Fibre is in fact correct here and was written this way intentionally to distinguish it from the word fiber, which usually refers to fiber-optic cable. This difference appears subtle, but is important. Although the FC protocol is often implemented using fiber-optic cable, it can also be implemented using copper cabling. Since Fibre Channel is the predominant protocol in the SAN market, for the remainder of the book, this will be assumed.
[Figure: FC topologies connecting servers, disks, disk arrays, and tape libraries: point-to-point, arbitrated loop (FC-AL), and switched fabric (FC director)]
The FC switched fabric protocol itself has also evolved through several generations. Its first implementation offered 1 Gigabit per second (Gbps) speeds using a removable transceiver-type device in the switch port, called a gigabit interface converter (GBIC). Subsequent generations, starting with 2 Gbps implementations, used a smaller transceiver called a small form factor pluggable (SFP) device. The successive generations doubled the previous generation's bandwidth to 2 Gbps, 4 Gbps, and now 8 Gbps speeds; 10 Gbps FC is also supported for distance extension and 16 Gbps is actively being developed.
FC Protocol Layers
Similar to TCP/IP, Fibre Channel is a multi-layer protocol with five defined layers as described in Table 2 and illustrated in Figure 10. If you are familiar with the TCP/IP protocol you may notice some similarities, particularly at the lower layers. This is not surprising, since the Gigabit Ethernet standard actually borrowed its lower layers from FC. However, many of the functions found in the TCP/IP model are performed in different layers in the FC model and some layers do not exist at all (routing, for example).
[Figure 10: FC protocol layers (FC-0 through FC-3, FC-PH) and the upper-layer channel and network protocols they carry (IPI, SCSI, HIPPI, SBCCS, 802.2, TCP/IP)]
Types of Switches
There are different types of switches, such as backbones, directors, and routers. Although there are no official definitions for switches, the terms director and backbone are generally accepted in the industry.
FC-AL hub. A hub is used to connect devices using the arbitrated loop (FC-AL) generation of the FC protocol. Although rarely used today, it can be seen occasionally in older environments or sometimes integrated into a low-end storage array (JBOD, or just a bunch of disks) or tape library that uses FC-AL disk or tape devices in the background.
FC switch. An FC switch is a networking device that supports the FC protocol and allows hosts and storage devices to communicate with each other. Generally, an FC switch has a 1U or 2U form factor. Some may have redundant power supplies and/or fan modules while others may not. The number of ports in current devices varies from 8 ports all the way to 80 ports.
FC director. This term was borrowed from mainframe ESCON (Enterprise Systems Connection) technology. The ESCON protocol was implemented using highly robust and redundant networking devices called directors, which allowed storage devices to be connected to the mainframe. When the manufacturers of ESCON directors, McDATA and InRange, adapted ESCON directors to support the FC protocol, they simply used the same term.
FC backbone. A backbone has a similar architecture to a director but adds greater performance and advanced functionality to support the requirements of next-generation, consolidated data centers. Backbones may offer support for advanced features such as encryption, virtualization, Adaptive Networking services, integrated FC routing, and support for future protocols (FCoE and CEE).
FC router. An FC router is a switch with the ability to connect two or more separate fabrics and allow devices in each fabric to communicate with devices in other fabrics according to user-definable rules. Since there is no routing layer in the FC protocol, FC routers must use a special abstraction layer to present virtual switches to physical switches.
FC gateway. An FC gateway allows devices using different protocols to be connected to an FC fabric. For example, servers connected to a TCP/IP network can be connected using an iSCSI gateway at one end and to an FC fabric at the other end.
FC Fabrics
FC fabrics are implemented using FC switches, directors, and routers. Each element is addressed by a domain ID (DID) ranging from 1 through 239; no two switches in a fabric can have the same DID.
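The domain ID rule above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical inventory that maps switch names to domain IDs; the function name and the inventory format are illustrative and not part of any switch software.

```python
from collections import Counter

VALID_DOMAIN_IDS = range(1, 240)  # 1 through 239 inclusive, as stated in the text


def check_domain_ids(switch_dids):
    """Return a list of problems found in a {switch_name: domain_id} mapping."""
    problems = []
    # Rule 1: every domain ID must fall in the 1-239 range.
    for name, did in sorted(switch_dids.items()):
        if did not in VALID_DOMAIN_IDS:
            problems.append(f"{name}: domain ID {did} is outside 1-239")
    # Rule 2: no two switches in the fabric may share a domain ID.
    counts = Counter(switch_dids.values())
    for did, count in sorted(counts.items()):
        if count > 1:
            owners = sorted(k for k, v in switch_dids.items() if v == did)
            problems.append(f"domain ID {did} is used by {', '.join(owners)}")
    return problems
```

An empty result means the planned fabric satisfies both rules; anything else lists the switches that would need to be renumbered before they join the fabric.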
Enterprise-Class Platforms
Over time, a director has generally become accepted as a switch with higher reliability, scalability, performance, and flexibility, described in more detail below. More recently, backbone platforms have joined directors in a category known as enterprise-class platforms.
Reliability. Highly redundant hardware architecture with hot-swappable components; five nines (99.999 percent) availability or better (Brocade directors tend to be closer to six or seven nines, up to 99.99999 percent); non-disruptive firmware upgrades; non-disruptive failover of control processors.
Scalability. Bladed architecture to add blades as needed; higher port count, supporting from 32 ports to 384 ports in current devices.
Performance. High performance backplane architecture; Brocade directors have multi-terabit backplanes.
Flexibility. Support for specialized blades (application, routing); support for other protocols (Ethernet, iSCSI, FCIP, FICON).
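To put the availability figures quoted above in perspective, the sketch below converts an availability percentage into expected downtime per year. It assumes a 365-day year, and the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent):
    """Expected yearly downtime, in minutes, for a given availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


five_nines = downtime_minutes_per_year(99.999)          # roughly 5.3 minutes per year
seven_nines = downtime_minutes_per_year(99.99999) * 60  # roughly 3.2 seconds per year
```

Each additional nine cuts the allowed downtime by a factor of ten, which is why the step from five nines to seven nines is significant.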
FC Devices
There are two basic types of devices that can be connected to a fabric: initiators and targets. Hosts and servers are known as initiators and are the only devices capable of initiating conversations with other devices in the fabric. Data is written to target storage devices, which cannot generally initiate a conversation on their own with other devices; they require an initiator to do this for them. Hosts are connected to the fabric using a special card called a Host Bus Adapter (HBA). Storage devices are connected to the fabric through a storage or device controller. Host and storage devices are connected to FC switch ports, each of which contains an SFP (GBIC in older switches). Fibre Channel ports are classified into two basic categories, node ports and switch ports.
Node ports are the ports on end devices, such as host and storage ports. Switched fabric nodes are called N_Ports. Arbitrated loop nodes are called NL_Ports.
All switch ports begin life as universal ports (U_Ports) and take on specific personalities depending on what they are connected to. When a host or storage device is connected to a switch, the universal port becomes a fabric port, or F_Port. When a switch or router is connected to a switch port, it becomes an extended port, or E_Port. When an FC-AL device is connected to a switch, it becomes a fabric loop port, or FL_Port.
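The way a universal port takes on a personality can be summarized as a simple lookup. This is only an illustrative sketch: the string labels used for the attached device are assumptions made for the example, not values reported by any switch.

```python
def switch_port_type(attached):
    """Map what is plugged into a switch port to the port type it assumes."""
    personalities = {
        None: "U_Port",        # nothing attached yet: still a universal port
        "N_Port": "F_Port",    # host or storage device on a switched fabric
        "NL_Port": "FL_Port",  # arbitrated loop (FC-AL) device
        "switch": "E_Port",    # another switch
        "router": "E_Port",    # the switch end of a link to a router
    }
    if attached not in personalities:
        raise ValueError(f"unknown attachment: {attached!r}")
    return personalities[attached]
```

For example, `switch_port_type("N_Port")` gives `"F_Port"`, while an empty port stays a `"U_Port"`.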
FC switches can be connected to other FC switches via E_Ports on the switch using an inter-switch link (ISL) to merge and form a cohesive fabric of switches. An ISL is simply the connection between two switches or directors. An inter-fabric link (IFL) is used to connect a router to a switch that is in a different fabric. An inter-chassis link (ICL) is used to interconnect two physically separate backbone chassis via a special kind of connector, an ICL cable. In this case the switch port remains an E_Port. Table 3 describes the FC port types and their functions, and Figure 11 illustrates the different FC devices and links in a fabric.
Table 3. FC port types
Type     Category  Name              Description
N_Port   Node      Node port         Port on host or storage device
NL_Port  Node      Node loop port    FC-AL port on host or storage device
F_Port   Switch    Fabric port       Switch port that connects to an N_Port
FL_Port  Switch    Fabric loop port  Switch port that connects to an NL_Port
E_Port   Switch    Extended port     Switch port that connects to other switches, forming an ISL
M_Port   Switch    Mirror port       Switch port that mirrors the data going through another port
U_Port   Switch    Universal port    Switch port with no devices connected to it; will become an E_Port, F_Port, FL_Port, or EX_Port
EX_Port  Switch                      Switch port on a router connecting to the E_Port on a switch, forming an inter-fabric link (IFL)
Figure 11. FC devices and port link types
Targets and initiators are identified by a unique 8-byte address called a World Wide Name (WWN), which is equivalent to an Ethernet MAC (Media Access Control) address. There are two types of WWNs:
A node WWN (nWWN) refers to the actual node, device, or host.
A port WWN (pWWN) refers to an actual port on an HBA.
Some HBA cards have more than one port, in which case each port has a different pWWN, but there is still only one nWWN for the entire host, as shown in Figure 12.
[Figure 12: a port WWN (pWWN) addresses a single port on the HBA, for example 21:00:00:04:cf:e7:74:cf and 21:00:00:04:cf:e7:74:cd]
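Since a WWN is an 8-byte value conventionally written as eight colon-separated hexadecimal pairs, it is easy to validate and normalize in scripts. The sketch below is a minimal example; the helper names are illustrative, and the sample values are the two pWWNs shown for Figure 12.

```python
import re

# Eight two-digit hexadecimal groups separated by colons.
_WWN_PATTERN = re.compile(r"[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){7}")


def parse_wwn(text):
    """Convert a colon-separated WWN string into its 8 raw bytes."""
    if not _WWN_PATTERN.fullmatch(text):
        raise ValueError(f"not a valid WWN: {text!r}")
    return bytes.fromhex(text.replace(":", ""))


def format_wwn(raw):
    """Convert 8 raw bytes back into the usual lowercase colon-separated form."""
    if len(raw) != 8:
        raise ValueError("a WWN is exactly 8 bytes long")
    return ":".join(f"{byte:02x}" for byte in raw)


port_a = parse_wwn("21:00:00:04:cf:e7:74:cf")
port_b = parse_wwn("21:00:00:04:cf:e7:74:cd")
```

Here the two ports differ only in the last byte, which illustrates that each port on an HBA carries its own pWWN.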
RSCN
When new devices are added to a fabric and old ones removed, there must be a means of informing the other devices that a change has occurred. In FC switched fabrics, this is accomplished using a registered state change notification (RSCN). An RSCN is similar in some ways to a loop initialization primitive (LIP) in the FC-AL protocol, but it is much less disruptive, particularly with modern HBAs and drivers. When a new device is added to a fabric, an RSCN is broadcast throughout the fabric. With Brocade switches, the RSCN is limited to the affected zone so as not to disrupt the rest of the fabric (see Zoning on page 29). This is called RSCN scoping and is similar to the broadcast scoping function provided by virtual LANs (VLANs) in the Ethernet space.
Flow Control
One significant advantage the FC protocol has over other network protocols is the way data flow is controlled. In Ethernet, no meaningful flow control is provided at all. Packets are sent, and if the receiving switch or device is too busy to process them they are discarded. Ethernet LANs rely on higher-level protocols such as TCP to handle flow control. TCP uses rate-based flow control by sending a group of packets and waiting for an acknowledgement back from the other end before sending the next group of packets. If an error occurs and even one packet is not received, then the entire group of packets is resent. Fibre Channel, on the other hand, uses credit-based flow control and continuously sends frames without waiting for an acknowledgement from the other end. To achieve this, FC uses buffer credits (BB credits) to indicate whether or not there is sufficient memory available to store each transmitted frame, as shown in Figure 13.
Figure 13. FC credit-based flow control
An FC switch will not transmit data to another switch port until that port has advertised a BB credit. The credit is essentially a promise that the receiving port will be able to deliver the frame to its next destination, either by forwarding it immediately or by storing it and forwarding it later. When a port sends a frame, it cannot use that credit again until the receiving device returns it by sending an R_RDY (receiver ready) primitive. Once a frame is sent to its next destination, the buffer is freed up. At that point, the associated BB credit is released to notify the other switch that memory is once again available on that port to receive another frame.
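The credit mechanism can be modeled in a few lines. The following is a simplified sketch of one direction of a link, assuming one buffer per credit; the class and method names are illustrative and do not correspond to any real switch interface.

```python
class CreditedLink:
    """Toy model of buffer-to-buffer credit flow control on one link direction."""

    def __init__(self, bb_credits):
        self.credits = bb_credits  # credits advertised by the receiving port
        self.buffered = 0          # frames currently held in receiver buffers

    def send_frame(self):
        """Sender side: transmit only if a credit is available."""
        if self.credits == 0:
            return False           # no credit left, so the sender must wait
        self.credits -= 1          # the credit is consumed by this frame
        self.buffered += 1
        return True

    def forward_frame(self):
        """Receiver side: pass a frame on, free its buffer, and return the credit."""
        if self.buffered == 0:
            return False
        self.buffered -= 1
        self.credits += 1          # models the R_RDY sent back to the sender
        return True
```

With two credits, a sender can transmit two frames back to back, must then pause, and can resume as soon as the receiver forwards one frame and returns its credit. No frame is ever dropped for lack of buffer space, which is the contrast with Ethernet drawn above.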
Zoning
Fabrics can become quite large, span great distances, and consist of thousands of nodes. To avoid one flat network allowing every device to be aware of every other device, zoning is used to isolate groups of devices. Zoning is a fabric-based service that groups devices that need to communicate with each other. Once a device is assigned to a zone, it can communicate only with other devices in that zone.
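The basic zoning rule, that two devices can communicate only if they share a zone, can be expressed as a membership test. This is a minimal sketch assuming zones are represented as named sets of device identifiers; the zone and device names in the usage note are hypothetical.

```python
def can_communicate(zones, device_a, device_b):
    """Return True if both devices are members of at least one common zone."""
    return any(
        device_a in members and device_b in members
        for members in zones.values()
    )
```

For example, with `zones = {"payroll": {"host1", "array1"}, "backup": {"host2", "tape1"}}`, `host1` can reach `array1` but not `tape1`. A device may belong to several zones, in which case it can reach the members of each of them.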
Zoning terminology can be confusing, mainly for historical reasons. The terms hard and soft zoning, port and WWN zoning, and hardware- and software-enforced zoning are often used interchangeably. To be clear, there are two basic methods by which zone members are identified and enforced within a fabric:
By the hardware via the ASIC: hardware-enforced zoning
By software as a service: software-enforced zoning
For example, with software-enforced zoning, some hosts may cache the WWNs of the devices in the zone with which they communicate. If a storage device is removed from the zone and placed in a different zone, the host could still access the storage device even though it is no longer in the same zone. If the WWN in the cache is removed, either through a power cycle or a cache timeout, the host would not be able to obtain the WWN from the Simple Name Server (SNS), since the device is now in a different zone. This is comparable to unlisted telephone numbers. Even though a person delists their phone number, someone who has their phone number can still call them. If the caller loses the number, however, they would not be able to get it from directory assistance.
With hardware-enforced zoning in Brocade switches, although the host may cache the WWN, the ASIC will block access to the device if it is not in the same zone as the host. This is equivalent to using the call-blocking feature. Even though someone has a person's unlisted phone number, if the caller's number is blocked at the central office (CO), then the call would not be allowed.
As a best practice, it is also recommended that you define zones using the pWWN instead of the DID/PID. Organizations that implemented WWN zoning definitions were very pleased with their choice when they migrated the SAN from a fabric consisting of several 16-port switches to a single director. The migration is quite simple and involves copying the zoning database to the new director. Those organizations that used DID/PID definitions had to convert all zone definitions manually, since the DIDs and port numbers on 16-port switches did not map to the 256-port director.
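The difference between the two enforcement methods can be illustrated with a toy model. In this sketch, software enforcement is modeled purely as name server filtering and hardware enforcement as a per-frame zone check; the function names and data layout are assumptions made for the example.

```python
def name_server_lookup(zones, host):
    """Software enforcement: list only the devices zoned with the host."""
    visible = set()
    for members in zones.values():
        if host in members:
            visible |= members
    visible.discard(host)
    return visible


def frame_allowed(zones, src, dst, hardware_enforced):
    """Decide whether a frame from src to dst is delivered."""
    if not hardware_enforced:
        return True  # soft zoning hides devices but does not inspect frames
    return any(src in members and dst in members for members in zones.values())


# The array has been moved out of the host's zone, but the host cached its WWN.
zones = {"zone_a": {"host"}, "zone_b": {"array", "other_host"}}
cached_target = "array"
```

In this scenario the name server no longer lists the array for the host (the unlisted number), yet with software enforcement alone a frame sent to the cached address still gets through. With hardware enforcement the same frame is blocked (call blocking at the central office).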
Path Selection
Path selection refers to the algorithm for selecting the path frames will follow, given a possible choice. Note that this is not the same as FC routing, which is discussed in the routed fabrics section. There are several types of path selection protocols in FC: Fabric Shortest Path First (FSPF), Dynamic Path Selection (DPS), and trunking.
FSPF
Fabric Shortest Path First was invented by Brocade and is now an accepted standard. It is a link-state path selection protocol similar to the TCP/IP Open Shortest Path First (OSPF) routing protocol. FSPF does two things:
It keeps track of the state of all the links in a fabric.
It calculates a cost to each path.
In FSPF, paths are calculated by summing the cost of all the links traversed by the path. Each time a switch is traversed, it is called a hop. Only the lowest-cost path between a source and destination is kept in the routing table. The routing table contains information about which switch port to use when forwarding a frame to its final destination. Table 4 lists the default path costs for the different link speeds.
Table 4. FSPF path costs
Link Speed  Path Cost
< 1 Gbps    2000
1 Gbps      1000
2 Gbps      500
4 Gbps      500
8 Gbps      500
10 Gbps     500
The example in Figure 14 illustrates how FSPF works. A server is connected to a four-switch fabric. The link cost between each switch is set to 500. There are four possible paths in the fabric from the server to the disk array: A-C, A-B-D, A-C-D, or A-B-D-C. The paths through switches A-B-D and A-C-D represent two hops, so the path cost is 500 + 500 = 1000. The cost through path A-B-D-C has three hops, for a cost of 1500. The cost of the path through switches A-C is 500, since there is only one hop. FSPF drops paths A-B-D, A-C-D, and A-B-D-C from its routing table, since they have a higher cost than path A-C, and frames will never follow these paths when A-C is available. In the event that switch C fails, however, FSPF will recalculate the paths and route the frames through A-B-D.
[Figure 14: FSPF path selection between a server and a disk array in a four-switch fabric]
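The cost calculation can be sketched with a generic shortest-path search. This is not Brocade's implementation; it is a minimal illustration that sums link costs and keeps the cheapest path. The three-switch topology in the usage lines is hypothetical and chosen so that a direct link and a two-hop alternative both exist.

```python
import heapq


def lowest_cost_path(links, src, dst):
    """links: iterable of (switch_a, switch_b, cost). Returns (cost, path) or None."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    queue = [(0, [src])]  # entries are (total cost so far, path taken)
    visited = set()
    while queue:
        cost, path = heapq.heappop(queue)  # always expand the cheapest path first
        node = path[-1]
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, path + [neighbor]))
    return None  # no path exists between src and dst


# Hypothetical fabric: every ISL has the default cost of 500 from Table 4.
links = [("A", "D", 500), ("A", "B", 500), ("B", "D", 500)]
best = lowest_cost_path(links, "A", "D")
# If the direct A-D link is lost, the path is recalculated over the remaining links.
fallback = lowest_cost_path([l for l in links if l[:2] != ("A", "D")], "A", "D")
```

The direct link wins with a cost of 500 while it is available; once it is removed, the search falls back to the two-hop path through B at a cost of 1000, which mirrors the recalculation described above.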
When paths with equal costs are available, paths are assigned in a round-robin fashion. Say that two paths are available between three hosts on a switch to the disk storage on another switch. The first host will be assigned one path, the second the other path, and the third will round-robin back to the first path. The Brocade implementation of this feature is called Dynamic Load Sharing (DLS). DLS does not use active load feedback, so the paths remain fixed regardless of the load on the link.
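The round-robin assignment described above can be sketched as follows. The function and the host and path labels are illustrative only; the point is that the assignment depends on the order of the hosts, not on the traffic each one generates.

```python
def assign_paths(hosts, equal_cost_paths):
    """Assign each host to one of the equal-cost paths in round-robin order."""
    return {
        host: equal_cost_paths[index % len(equal_cost_paths)]
        for index, host in enumerate(hosts)
    }


assignment = assign_paths(["host1", "host2", "host3"], ["path1", "path2"])
```

With three hosts and two paths, the third host wraps around to the first path. Because no load feedback is used, that path carries two hosts' traffic even if those two happen to be the busiest.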
Trunking
The term trunking refers to consolidating several links into one higher-bandwidth link. In the LAN world, trunking is often used to consolidate several ports on a network interface card (NIC) to provide a larger pipe, or trunk, to the switch. In the FC world, trunking currently applies only to the consolidation of ISLs between two switches. Trunking can be implemented in several ways, and some FC switch vendors actually use the term trunking for exchange-based routing, which can be very confusing. Brocade implements trunking at the ASIC level in the hardware, which we believe is the only way to truly implement trunking. Hardware-based trunking takes load balancing to the highest level by providing the capability to spread frames across multiple links simultaneously and obtain the best load balancing across available links.
Frame Redirection
In Brocade FOS 5.3 and M-Enterprise OS (M-EOS) 9.8, Brocade introduced a new technology with the capability to redirect frames using a different route than originally intended. This technology was necessary to add a virtualization layer for certain types of applications that would not normally be in the direct data path in the fabric. The Brocade Data Migration Manager (DMM) and EMC RecoverPoint, both of which run on the Brocade 7600 Application Platform, essentially behave as appliances in a fabric, and frames need to be redirected through these devices to perform their intended function. The Brocade encryption solution also requires this technology to allow a Brocade Encryption Switch or Brocade FS8-18 Encryption Blade to be introduced anywhere in a fabric and encrypt from any host to any LUN or tape drive.
Frame redirection, also called nameserver redirection, redirects frames to an alternate destination before they reach their final destination. It does this by creating an abstraction layer on top of the physical fabric and its configuration. This abstraction layer has no impact on pre-existing zoning configurations or the physical hosts and storage devices in the SAN. An association between a source initiator and a storage port is created in a redirection zone (redirection zones are not the same thing as conventional fabric zones). The redirection zone presents a virtual target to the physical initiator and a virtual initiator to the physical target. The physical initiator believes it is communicating with the physical target but is in fact talking with a virtual target. Once the host or initiator sends a frame to the virtual target, the redirection zone sends the frame to the alternate device, the encryption device in this case. The encryption device encrypts the payload of the frame and sends it back to the fabric, where it is redirected to the destination target device, as shown in Figure 15.
[Figure 15: frame redirection between a server and a disk array through an FC director]
Fabric Topologies
Switches can be connected together in many ways to form simple fabrics or complex, resilient, multi-tiered fabrics. The topology chosen will depend on the business requirements of each organization. When choosing a topology that is best suited to meet specific business requirements, consider these four factors: performance, scalability, redundancy, and cost. And although you can architect a SAN for two or three of these factors at the same time, it will usually be at the expense of the other factors. For example, you can have a highly redundant, high-performance fabric but it will most likely not be very scalable. Finding the right balance among these factors is more an art than a science. Table 5 shows how these four design factors are interrelated for the different topologies described in this section.
Table 5. Fabric topology design factors
Topology             Performance  Scalability  Redundancy  Cost
Cascade              Poor         Poor         Poor        $
Ring                 Good         Poor         Good        $
Full mesh            Excellent    Poor         Excellent   $$
Partial mesh         Good         Good         Good        $$
Core-edge            Excellent    Excellent    Good        $$$
Resilient-core-edge  Excellent    Excellent    Excellent   $$$$
For further information on this topic, Principles of SAN Design, Second Edition, by Josh Judd, is highly recommended.
Dual Fabrics
As with any network, FC fabrics should be designed without any single points of failure. From an architectural and design point of view, this redundancy is accomplished by using a dual-fabric architecture, as shown in Figure 16. Servers must also have multipathing input/output (MPIO) software running to load balance the traffic between the two paths and to fail over to one path in the event of a path failure. If any hardware component failure occurs in a fabric, starting from the host HBA through to the disk controller, no production downtime will be incurred since there is an alternate path for the traffic.
Figure 16. Dual-fabric design
Dual-fabric design is a best practice and should always be used with disk environments. Tape environments, however, would not benefit from a dual-fabric architecture, since tape drives and backup applications do not have the capability of being dual attached.
Cascade Topology
A cascaded fabric is the simplest architecture; switches are daisy-chained together to form a string of switches. Middle switches are connected to two other switches and the two end switches are connected to only one other switch. Figure 17 illustrates a four-switch (on the left) and a six-switch (on the right) cascade topology. A disadvantage of this topology is that a server attached to switch A in the six-switch topology would have to traverse four other switches to get to its storage if the storage device were attached to switch F. These multiple hops can degrade performance. For example, if all of the storage devices were attached to switch F and every port on switches A to E were connected to a host, the traffic between switches E and F could become highly congested.
[Figure 17: four-switch (left) and six-switch (right) cascade topologies, with storage attached to the last switch]
This is not a scalable design and offers little redundancy. A failure of any switch with this topology would result in isolation of the devices on either side of the failed switch.
Ring Topology
A ring topology is created when every switch in a fabric is connected to two other switches in the same fabric, as shown in Figure 18. A failure of any switch in this topology would still allow all other switches to continue communicating with each other using a path in the opposite direction. This topology is also not very scalable since the number of hops increases as you add new switches, but it does provide some redundancy with the dual paths.
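The effect of each topology on hop count can be sketched as follows, counting a hop as one ISL traversed (as in the FSPF discussion) and numbering switches from 0 in the order they are cabled. The function names are illustrative.

```python
def cascade_hops(i, j):
    """ISLs traversed between switch positions i and j in a cascade (a line)."""
    return abs(i - j)


def ring_hops(i, j, n):
    """ISLs traversed between positions i and j in a ring of n switches."""
    distance = abs(i - j) % n
    return min(distance, n - distance)  # frames can travel in either direction
```

In a six-switch cascade, traffic from the first switch to the last crosses `cascade_hops(0, 5)` = 5 ISLs (passing through the four switches in between). Closing the same six switches into a ring makes those two switches neighbors, and the worst case drops to three ISLs, for switches on opposite sides of the ring.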
Mesh Topology
The full-mesh topology is created when every switch participating in a fabric is connected to every other switch in the fabric, as shown in Figure 19. This provides the highest level of path redundancy and excellent performance, since there is only one hop between any two switches in the fabric. However, this is the least scalable fabric topology, because the number of links required grows quadratically as the number of switches increases.
Figure 19. Full mesh topology
The formula for the number of ISLs required to create a full mesh topology is:
1 + 2 + ... + (N-1) = N(N-1)/2
where N is the number of switches. For example, a five-switch full mesh fabric would require:
1 + 2 + 3 + (5-1) = 1 + 2 + 3 + 4 = 10 ISLs
Table 6. Full mesh topology ISL and port requirements
# of Switches  # of ISLs  # of Ports
2              1          2
3              3          6
4              6          12
5              10         20
6              15         30
7              21         42
8              28         56
9              36         72
10             45         90
11             55         110
12             66         132
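The ISL and port counts in Table 6 follow directly from the number of switch pairs. A minimal sketch, with illustrative function names:

```python
def full_mesh_isls(switches):
    """Number of ISLs needed to connect every switch to every other switch."""
    return switches * (switches - 1) // 2


def full_mesh_isl_ports(switches):
    """Number of switch ports consumed by those ISLs (two per ISL)."""
    return 2 * full_mesh_isls(switches)


table_6 = [(n, full_mesh_isls(n), full_mesh_isl_ports(n)) for n in range(2, 13)]
```

For five switches this gives 10 ISLs and 20 ports, and for twelve switches 66 ISLs and 132 ports, matching the table. Put another way, each switch in an N-switch full mesh gives up N-1 of its own ports to ISLs.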
As can be seen from Table 6, the number of ISLs required increases significantly for each additional switch. It is important to note that each ISL also requires two ports, one on the switch at each end. Eventually, there will be more ports used by ISLs than by actual hosts and storage devices in the fabric, and switches simply won't have enough ports to connect all the switches. To improve scalability, a mesh topology can also be constructed as a partial mesh, as shown in Figure 20. In this case, most, but not all, switches are connected to all other switches in the fabric. This is a more scalable alternative to a full mesh design, but at the expense of path redundancy and possibly performance.
[Figure 20: partial mesh topology]
Core-Edge Topology
The core-edge topology is the most commonly implemented, and it represents the best compromise among redundancy, scalability, performance, and cost. There are several variations of a core-edge architecture. Pure core-edge architectures are designed so that all of the traffic must go through the core switch, hence only other switches can be connected to a core switch. The hosts and storage devices are connected to edge switches, as shown in Figure 21. In reality, most core-edge implementations actually connect some storage or host devices to core switches to maximize the cost efficiency of the fabric and make the best use of available ports, as shown in Figure 22.
[Figures 21 and 22: core-edge topologies built around core switch E, with a disk array and a tape library]
Resilient Fabrics
A resilient core-edge topology simply means that the core switches are in a redundant configuration, as shown in Figure 23, making the fabric design more resilient to a failure of a core switch. The typical resilient fabric has two core switches with multiple edge switches connected to both core switches. In the event of a core switch failure, there is an alternative path from any edge switch to the other core switch.
[Figure 23: resilient core-edge topology with two core switches, E and F]
Multi-Tiered Fabrics
Multi-tiered fabrics are used for very large fabrics. Typically, one tier is used to connect the storage devices and another tier is used for the hosts. There can be several variations and uses for this topology, including one with a resilient core, as shown in Figure 24.
[Figure 24: multi-tiered fabric with a server tier, core switches, and a storage tier]
Routed Fabrics
A routed fabric, also called a metaSAN and shown in Figure 25, allows devices in two or more fabrics to communicate with each other without requiring all switches to merge into one flat fabric. The FC protocol, however, was not designed with a routing layer similar to the IP layer in a TCP/IP network. Routing FC fabrics is accomplished by adding an extra abstraction layer and tricking switches into believing they are connected directly to a specific physical switch. When a router connects to a switch in another fabric, the connection is referred to as an inter-fabric link (IFL) instead of an ISL. The port at the router end of the IFL is called an EX_Port, and the port at the switch end of the IFL remains an E_Port.
[Figure 25: routed fabric (metaSAN) connecting SAN A, SAN B, and SAN C]
Extended Fabrics
In recent years, disaster recovery and business continuity have taken center stage in most IT organizations as a way to protect critical data and prevent potential business outages. Storage networks have played a prominent role in this trend; data replication, remote mirroring, and remote backup are among the most commonly deployed solutions utilizing long-distance SAN connectivity. Today's organizations typically use two data centers to exchange data between SANs over long distances. Cost, distance, and performance are the primary factors in deciding what technology to use in a long-distance deployment.
As shown in Figure 26, dark fiber is the first method. It offers the highest performance for connecting two sites over distance, although this solution comes at a higher price and has distance limitations. The other method uses FCIP, shown in Figure 27, which is a tunneling protocol that can be used to connect two sites over practically any distance using standard WAN connections.
Figure 26. Extended fabric using dark fiber
Implementing dark fiber usually has the greatest initial cost if the organization has to lay the dark fiber or obtain a right of way to do so. Several providers, particularly utility companies, already have a dark fiber infrastructure in place and sell or lease strands of fiber to their customers. Although this option is less expensive than laying your own fiber, it is still quite expensive.
Figure 27. Extended fabric using FCIP
When two separate fabrics at different sites are connected in a standard extended fabric, the link between the two sites becomes a long-distance ISL. Since two switches connected together using an ISL must be part of the same fabric, the fabrics at each site merge to form one fabric. This is important to note given that both sites now share all of the fabric configuration information.
In some cases, it may be preferable to isolate each fabric from the other. A hybrid implementation can be used in this case by using FC routing to maintain isolation between the fabrics at each site. This allows for the sharing of resources between fabrics while maintaining separate configuration and management information.
Chapter Summary
The Fibre Channel protocol is in common use in storage area networks today. FC frames can carry a payload of 0 to 2,112 bytes, with a maximum frame size of 2,148 bytes.
FC devices in the fabric include backbones, directors, switches, and embedded switches. Hosts, called initiators, connect to devices in the fabric via N_Ports to F_Ports. FC devices connect to each other via E_Ports and EX_Ports. ISLs are created by connecting FC switches together, and IFLs connect fabrics.
FC fabric services improve performance and include path selection via FSPF, exchange-based routing, and trunking. Frame redirection is a Brocade proprietary technology that allows data to be redirected for a particular purpose, such as encryption, and then returned.
Although there are a number of different fabric topologies, the simplest are not robust enough for most SANs, and so variations of a core-edge topology are commonly used. For very large fabrics, multi-tiered fabrics are used for scalability and resilience. Routed fabrics form a metaSAN, which allows devices to communicate without merging to form a single large fabric. Enterprises with multiple data center sites take advantage of extension using dark fiber or a long-distance fabric extension solution.
SAN storage resides on disk or tape, and the terms that describe storage include disk-based storage, disk array, LUN(s), and tape-based storage.
To the uninitiated, security may seem like a highly complex concept with specialized jargon, but security really boils down to common sense applied using some basic principles. Certainly, implementing security solutions may not be quite that simple, but understanding the general concepts can go a long way toward understanding the issues. SAN security must be approached from a holistic perspective. There is no point in implementing strict access controls and mechanisms in the SAN if the management interface is relatively unprotected. All components of the SAN, from the infrastructure itself to management tools and physical security, must be considered if you want to create a comprehensive SAN security plan.
This chapter is addressed primarily to the storage professional who may have little or no knowledge of security concepts. Security professionals may also find this chapter useful to better understand how basic security concepts apply specifically to the world of Fibre Channel fabrics.
IT security is an extensive field consisting of multiple domains of knowledge. According to the International Information Systems Security Certification Consortium ((ISC)²), which is responsible for the Certified Information Systems Security Professional (CISSP) certification, there are ten fundamental domains composing a body of knowledge for IT security:
Access Control and Methodology
Applications and Systems Development
Business Continuity Planning
Cryptography
Law, Investigations, and Ethics
Operations Security
Physical Security
Security Architecture and Models
Security Management Practices
Telecommunications, Network Security, and Internet Security
These ten domains apply directly to the SAN and storage environments and must be addressed in a comprehensive SAN security program.
Security Models
SAN security involves more than just guarding against a malicious outsider with sophisticated hacking tools and the intent to destroy or steal data. In fact, most IT security threats originate internally, from employees or other people with access to networks and physical equipment inside the firewall. As a result, best practice IT security strives to achieve several basic security objectives, which vary depending on which model is being followed. At a minimum:
Data must always be available to authorized users whenever it is needed.
To maintain its integrity, data must not be modified in any unauthorized way.
Sensitive data, such as personal information, intellectual property, and data pertaining to national security, must remain strictly confidential.
As you will see, there are several models in current use and they are described in the next few sections.
Confidentiality
Confidentiality as it pertains to electronic data is the protection of information from being disclosed to unauthorized users. There are several reasons why confidentiality must be considered in IT security, ranging from protecting the right to privacy of individuals to sensitive financial information to social security numbers and other pieces of personal information, which can be used to steal someone's identity.
Several laws in place today, particularly in the United States, require that the confidentiality of Personally Identifiable Information (PII) of a state's citizens be protected. Approximately 44 states have enacted such legislation, also referred to as breach disclosure laws. Chapter 9: Compliance and Storage, starting on page 157, discusses compliance and breach disclosure laws in greater detail. Confidentiality of electronic information is usually accomplished using cryptographic methods such as encryption of data-at-rest or data-in-flight (see Chapter 5: Elementary Cryptography starting on page 73). Authentication methods and access controls are other methods used to address the confidentiality issue.
Integrity
Data integrity ensures the accuracy and consistency of electronic information to provide an assurance that the information has not been modified, deleted, destroyed, or tampered with in any way. For example, it is important to ensure data integrity to prevent attackers from modifying data by inserting unwanted code into an application or deleting pieces of information before it is stored on a disk. Integrity verification is generally achieved using methods such as hashing algorithms and checksums. These methods are described extensively in Chapter 5.
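As a minimal sketch of hash-based integrity verification, the following Python example records a SHA-256 digest of a piece of data and later checks whether the data still matches it. The function names and the sample records are hypothetical and chosen only for illustration.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified(data: bytes, expected_digest: str) -> bool:
    """Return True if the data still matches the digest recorded earlier."""
    return sha256_digest(data) == expected_digest


# Hypothetical record: the digest is computed when the data is first stored.
original = b"customer record 1001: balance=250.00"
recorded = sha256_digest(original)

# The same record after someone has altered a single field.
tampered = b"customer record 1001: balance=950.00"

print(is_unmodified(original, recorded))
print(is_unmodified(tampered, recorded))
```

A digest only detects modification. It does not prevent it, and an attacker who can replace both the data and the recorded digest defeats the check, so the digest must be stored or protected separately.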
Availability
Organizations have become highly dependent on their computer systems, and any loss of availability of critical applications can have far-reaching and direct repercussions on the company's livelihood. Maintaining availability of applications, and particularly of the data used by these applications, has become essential. High availability (HA), clustering, and fault-tolerant systems are examples of technology used to maintain application availability. Disk mirroring, RAID (redundant array of inexpensive disks), and remote data replication are used to maintain availability of data stored on disks. Software and specialized appliances such as anti-virus, anti-malware, anti-spam, and intrusion detection systems can help prevent attackers from mounting a denial-of-service (DoS) attack.
CIANA
This model expands the basic CIA model by adding two more security elements: non-repudiation and authentication. It is most often used in Information Assurance, which is primarily used by the military. This model is taught as part of a course to reach the NSTISS (National Security Telecommunications and Information System Security) 4011 Certification in the US.
Non-Repudiation
Non-repudiation is used to prevent someone who has performed an action from denying it and claiming they did not perform the action in question. For example, someone makes a purchase on the Internet and then, once they receive the goods, claims they never made the purchase. Non-repudiation is an essential element in conducting business. It also applies in the other direction, as when an e-commerce Web site provides proof of payment to the customer. Historically, these functions have been performed using physical signatures and receipts, which then become legal and binding contracts for both parties. The same actions are performed electronically using digital signatures and signed certificates and other methods such as the Confirm button on some Web forms.
Authentication
Authentication is the process of verifying that people really are who they claim to be. There are several ways to authenticate an individual, including user accounts and passwords. Authentication methods can be quite sophisticated with biometric technology such as fingerprint scanners, face/voice recognition, and iris/retinal scanners. Each of these methods is known as a factor of authentication and can be used in combination, known as multi-factor authentication, to provide greater certainty of authenticity. Factors of authentication will be discussed in greater detail in the physical security section (see Physical Security on page 116).
Possession or Control
If possession is nine-tenths of the law, it has never been more true than in IT security. Loss of control or possession of data must be prevented at all costs, since it must be assumed that once the owner no longer has control, the data is necessarily compromised. Suppose that a backup tape containing customer and credit card information is lost or stolen, a frequent occurrence in recent times. Even if the tape was simply misplaced and no data has actually been read, the assumption is that the data on the tape is now known and appropriate measures must be taken accordingly. Customers must be advised, and credit cards must be re-issued to prevent unauthorized use of the credit card information.
Authenticity
The origin or source of information can be spoofed or forged. Authenticity refers to validating that information does in fact come from the source claimed. Someone can forge an e-mail header to appear like it was sent from someone else. Fields in a database can have incorrect information inserted into them.
Utility
Information has value only if it can be used. If a database file is corrupted, then it is no longer useful and fails the utility test. Data encryption is a very useful method of protecting confidentiality, but if the key is lost the encrypted data is no longer useful since it will no longer be readable. Utility is not the same as availability, but a breach in utility may result in a loss of availability.
Types of Threats
A threat is anything that can cause harm. An IT security threat is anything that can cause harm to IT assets. Threats against IT assets specifically can be classified into three basic categories:
Disasters
Technology
Human
Of course, technology threats and sometimes disasters are created and executed by humans, so perhaps there are only two categories.
Man-made disasters include:
Terrorism
Fire
Dam failure
War
Chemical emergency
Hazardous material spill or leak
Protecting against disasters that impact IT assets and business requirements can be accomplished in many ways. The key to protecting against disasters is proper planning, implementation of plans, and dry runs.
The first step is to conduct a business impact analysis (BIA) by system to determine the impact of a disaster on each system in the company, and not only computer or IT systems. Once the BIA is completed, a plan must be created, which is usually known as the Business Continuity (BC) plan. Part of the BC plan addresses the recovery of data systems, which is usually referred to as the Disaster Recovery (DR) plan. Once the plan has been created, it must be executed or implemented. The DR plan is generally implemented using a combination of procedures and technology. A DR plan can include the following:
Backups
Replication
Mirrored sites (hot/warm/cold)
Procedures
Computer Security Incident Response Team (CSIRT)
Finally, once the plan has been deployed, it must be tested on a regular basis. Performing a scheduled or planned failover from the primary site to a secondary site is not for the fainthearted, but it is necessary to demonstrate that procedures and systems will function properly in the event of a real disaster.
Technological Threats
The technological threats to IT assets are created by people and used by people to exploit vulnerabilities in IT systems. The software used to harm IT systems is called malware and includes:
Viruses
Worms
Spyware
Rootkits
Trojans/Trojan horses
Zombies
Botnets, or bots
Spam
Besides malware, the black hat community uses other technological means to exploit system vulnerabilities and to learn and perfect the skills necessary to attack systems. There are several Web sites and discussion groups for the underground hacking community, from which attack tools can be downloaded. On these sites, hackers exchange information so that they can discover new vulnerabilities and develop exploits for them.
One significant threat is the widespread availability of open source software that has hidden malware built into the application. Peer-to-peer (P2P) sites used for sharing software, music, and video files are notorious for installing spyware (malware that captures information and relays it back to another computer) on the unsuspecting downloader's computer. Some spyware may contain key-logging software that captures the user's keystrokes for the purpose of obtaining passwords, account numbers, and other sensitive private information. In the case of a SAN, it is possible that a computer used to manage a SAN is infected with spyware. The information collected by the spyware could be used later to compromise the entire SAN and its data.
Non-malicious threats
Internal: Carelessness; Lack of training; Lack of security awareness; Improper zoning; Misconfigured HBAs; Inadequate backups; Inadequate or non-existent operational procedures; Reduced budgets
External: N/A
It is interesting to note that this table does not include non-malicious external threats. It is the writer's opinion that all external threats are malicious regardless of the intent, since the result is always harmful. For example, even if a curious individual breaches a system and only browses around various directories, the security administrator who detects this breach must now investigate. Who is the person who breached the system? What was their intention? Were they simply collecting information in preparation for a more significant attack in the future? Addressing these questions during an investigation takes time and costs the company money, resulting in a loss. Hence, all external threats, no matter how benign they may seem at first, have a negative effect and are considered malicious.
Isolating the systems and assets from the outside world is the primary way to protect against external threats. The defense-in-depth strategy works well to provide multiple layers of protection from outside attacks such that each layer adds an additional barrier to the attacker. (See also The Brocade SAN Security Model on page 91.) There are two access points for an outsider to gain access to an organization's IT assets. Attackers can breach one or both of the following:
Physical security to gain physical access to the assets
The network to gain access to the servers and other assets connected to the network
Protecting assets from physical access requires appropriate physical security measures to restrict access to authorized persons only. Protecting assets from being accessed through the network is much more difficult, since there can be more than one entry point into the network. As with any technology, networks have many vulnerabilities with new ones discovered on a regular basis. Although protecting conventional LAN networks is out of scope for this book, if you are interested there are many excellent resources available on this topic.
Fatigue caused by long or nighttime working hours
Misidentification of hardware
Simple human errors
The key to minimizing the risks of this type of threat is to develop solid, well-documented operational procedures and restrict administrator privileges to only the tasks that are required for an administrator's job functions. Organizations should not grant additional privileges to a trusted, long-term, or favored administrator when those privileges are not required for that administrator's job functions.
Malicious insider threats typically involve employees or contractors who have something to gain from exploiting a weakness in the system. These threats are the most difficult to manage and control, since they involve people who have legitimate access to the affected systems. The key to mitigating risks from this type of threat is to limit the privileges a specific individual has and to distribute workload and responsibilities among multiple administrators. In the event that a security incident occurs, it is also important to have a proper incident response procedure in place, with clear methods to track administrator activities and provide evidence for any potential criminal or civil investigation.
The following list, while not comprehensive, provides important points to consider to protect against insiders:
Proper hiring and screening practices
Limited access to facilities and assets
Personal identifiers, physical and digital
Appropriate controls
Monitoring
Procedures and policies
Incident response
Training and awareness
The first step, and probably the most important, is to perform appropriate background checks on employees before they are given the keys to the kingdom. Background checks can be basic or exceptionally comprehensive depending on the nature of the systems they will be granted access to. For military and intelligence positions involving national security, a top secret clearance or higher may be required.
A top secret clearance involves an investigation of a person's history, relationships, lifestyle, and financial position, and includes a polygraph (lie detector) test. For other employees, a simple verification of references from previous employers or a credit check may be sufficient. A credit check may not seem relevant at first, but if a potential employee has serious financial difficulties, then this could indicate a weakness in that person's situation, which could be exploited by a criminal element.
Once hired, employees should be given access only to assets or facilities they need to perform their job function. Providing an access card so that an employee can enter the building should not necessarily imply that the employee can now access all areas within the building. The same goes for accounts and passwords. A database administrator may be granted root privileges on the database servers for which they are responsible, but they should not have similar powerful privileges on the backup server, Web servers, or any other application they are not directly responsible for managing. This general concept is known as the principle of least privilege and is closely related to separation of duties.
Each individual employee should have a unique identifier assigned to them. A building access card, for example, should be unique and have a photo of the employee on it. When employees log in to a system, they should use their personal account with the appropriate privileges instead of the generic root or admin accounts, which could be used by anyone. The intention is to be able to associate an action with a person in a manner that cannot be repudiated.
Appropriate controls should be put in place to limit access and detect anomalies or inappropriate behavior. These could be in the form of access control lists (ACL) or role-based access control (RBAC) assigned to individual users. Programs can log all access to files and file systems, computer systems, facilities, and so on.
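The combination of role-based access control, personal accounts, and access logging described above can be sketched roughly as follows. This is an illustrative Python model only: the role names, permission strings, and user names are hypothetical and do not correspond to any particular product's RBAC implementation.

```python
from datetime import datetime, timezone

# Hypothetical roles, each limited to the permissions its job function needs.
ROLE_PERMISSIONS = {
    "db_admin": {"database:admin"},
    "backup_operator": {"backup:run", "backup:restore"},
    "zoning_admin": {"fabric:zone"},
}

# Each person has a personal account mapped to roles (no shared root/admin).
USER_ROLES = {
    "alice": {"db_admin"},
    "bob": {"backup_operator"},
}

# Every access decision is recorded so that actions can be tied to a person.
audit_log = []


def is_allowed(user: str, permission: str) -> bool:
    """Check a permission against the user's roles and log the decision."""
    roles = USER_ROLES.get(user, set())
    allowed = any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "permission": permission,
        "allowed": allowed,
    })
    return allowed
```

In this sketch, the database administrator can administer the database but is refused a backup restore, and both the granted and the denied request leave an entry in the log for later review.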
Once the controls are in place, they must also be monitored. There is obviously no sense in capturing valuable access information in log files if no one ever looks at the log files. The recommended frequency of monitoring varies depending on the type of assets being protected. Some events need to be monitored only occasionally, while others need to be monitored in real time to provide an immediate response to a breach. Fire and burglary alarm systems are examples of real-time monitoring systems, as are credit card fraud detection systems.
Many, if not most, security breaches result from operator error. Creating well-documented and detailed operations procedures helps mitigate risks associated with operator error. Security policies also mitigate these
risks by establishing guidelines and rules for employees to follow. Policies also serve to protect the company from liability in the event that an employee acts against company policy. Once policies and procedures are in place, they need to be enforced. If a policy is established but infractions are always without consequence, then it will lose its effectiveness over time. Infractions must be flagged in some way, even if it is only a friendly reminder that a certain behavior has been observed with a link back to the policy for the employee to review. Of course for significant or repeated infractions, sanctions may be more drastic and include employee dismissal or even criminal charges in extreme cases. Finally, one of the most overlooked aspects with insiders is training and awareness. Training provides improved knowledge resulting in greater efficiency and reduction of operational errors. Awareness training also reduces the frequency of the type of error caused by not realizing the impact of certain actions, as described in the examples below. A classic technique used by hackers to gain access to systems is called social engineering. This technique involves manipulation of trust when a person impersonates or assumes authentic-seeming characteristics. A common social engineering technique is to impersonate a help desk person and ask an employee to update their company profile. During this process, the unsuspecting employee will be asked to provide their password so that the help desk person can log in and make the necessary changes. Another commonly used social engineering technique is phishing. A hacker may send an e-mail to an individual requesting them to update their account profile for their investment bank for example. They are asked to follow a link which leads them to a phony, but authentic-looking, Web site. 
As the user logs in to update their profile, their account and password information is captured and subsequently used to perform unauthorized transactions in their account. Raising the awareness of all employees to the possibilities is a way to combat hacking via social engineering.
Attacks
Attackers have many options and strategies at their disposal to attack IT assets. Attacks can be very simple or highly sophisticated, depending on the skill of the attacker and the target. The first step in any attack usually involves collecting information to determine the best strategy for attacking a system successfully.
Types of Attacks
Hackers can be very creative individuals and there are many ways in which they can attack and compromise a system. There is an extensive black hat community whose members share information across the Internet and make it available to any interested person. The list of attacks is quite long; here are a few attacks that can be used in a SAN environment:
Back doors
Sniffing
Denial-of-service (DoS)
Back Door
A back door allows someone to bypass the normal access methods to get into a system. It can have many forms, such as a program with hidden code that allows its creator to enter a system at a later date. Sometimes a host can be bypassed by placing it in single-user mode and bypassing the operating system authentication mechanism. A back door can also be a default account, such as those used by maintenance technicians to gain access to a system when users have forgotten their password to access the system. This is why it is extremely important to change all default account passwords for a new system. A simple Web search reveals default account passwords for most major IT equipment vendors (including Brocade).
Sniffing
Sniffing is the act of capturing traffic on a network. It can be accomplished using highly sophisticated and expensive equipment such as a trace analyzer. Or it can use inexpensive, readily available equipment such as software on a computer that places the network interface card (NIC) in promiscuous mode to capture all traffic that reaches it. As seen in Chapter 2: SAN Security Myths starting on page 9, sensitive optical couplers can be purchased for under $1,000 to sniff traffic on an optical fiber cable without having to splice the cable. The data itself can be stored on any computer, including a laptop, and with packet filtering software, unnecessary traffic or noise can be filtered out and only the interesting traffic is kept.
Denial of Service
A denial-of-service (DoS) attack aims at disabling systems or preventing them from performing their intended function. Powering off an FC switch or storage array is a simple form of a DoS attack. A distributed DoS (DDoS) attack is more sophisticated and requires the collaboration of large numbers of computers, usually infected with a sleeping process called a zombie, which simultaneously send a large number of requests to a Web server, resulting in congestion that may bring the system down. The first such attack of significance was performed by an adolescent who, with the aid of several programs he downloaded from the Internet, managed to bring down several Web sites including CNN, Yahoo!, eBay, Amazon, E*Trade, and Dell.
Man-in-the-Middle
A man-in-the-middle (MITM) attack is an active form of sniffing in which an unauthorized third party is introduced between two legitimate parties communicating with each other. Often, the MITM pretends to be one of the parties during the authentication process and then relays information between the two parties. The result is that the two parties believe they are communicating directly with each other, but in fact they are communicating through a third party. The third party can then store the traffic exchanged between the two parties and use the information for a subsequent attack. For example, a GUI using HTTP to manage a switch can be compromised by an MITM attack. To prevent this, an end-point authentication mechanism such as SSL can be used to secure the channel between the GUI and the switch.
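To illustrate the end-point authentication idea, the following Python sketch builds a client-side context that refuses to talk to a management interface unless the server presents a certificate that validates and matches the expected host name. It uses TLS, the successor to SSL, through Python's standard `ssl` module. The function names and the idea of connecting to a switch on port 443 are assumptions made for illustration; the sketch defines the connection helper but does not call it.

```python
import socket
import ssl


def make_management_tls_context() -> ssl.SSLContext:
    """Build a client context that authenticates the server end point."""
    # create_default_context() enables certificate verification and
    # host name checking, which is what defeats a simple MITM relay.
    context = ssl.create_default_context()
    # Refuse older protocol versions (an illustrative policy choice).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def open_management_channel(host: str, port: int = 443) -> ssl.SSLSocket:
    """Connect to a (hypothetical) management interface over verified TLS."""
    context = make_management_tls_context()
    raw_socket = socket.create_connection((host, port), timeout=10)
    # The handshake fails if the certificate does not validate or does not
    # match the host name, so an impostor in the middle is rejected.
    return context.wrap_socket(raw_socket, server_hostname=host)
```

The protection depends on verification staying enabled. Disabling certificate or host name checks to silence errors removes the end-point authentication and leaves only encryption to an unknown party.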
Spoofing
Spoofing means pretending to be something you are not. Spoofing can be used in SANs by assigning the WWN of a known device in a fabric to another host's HBA and introducing it into the fabric. An interesting fact is that the FC protocol does not have any mechanism to prevent duplicate WWNs in a fabric. This may seem odd at first, but it is similar to the Ethernet protocol, in which duplicate MAC addresses are allowed. In fact, some NICs come with several Ethernet ports and by default, each port shares the same MAC address. This is usually done to reduce the number of entries in the ARP table where the MAC addresses are cached on the server.
As shown above, there are many techniques a hacker can use to breach a system. All SANs have vulnerabilities that can be exploited, and special measures are required to protect against these attacks. The next section looks at how to protect against these attacks and mitigate the risks associated with them.
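Because the protocol itself does not reject duplicate WWNs, one way to notice a spoofed device is to look for the same WWN logged in on more than one port. The following Python sketch shows the idea on a hypothetical list of (WWN, port) pairs; how such a list would be collected from a real fabric is not shown, and the WWNs and port names are invented.

```python
from collections import defaultdict


def find_duplicate_wwns(logins):
    """Return {wwn: sorted list of ports} for WWNs seen on more than one port.

    `logins` is an iterable of (wwn, port) pairs. WWNs are compared
    case-insensitively, since hexadecimal digits may be written either way.
    """
    ports_by_wwn = defaultdict(set)
    for wwn, port in logins:
        ports_by_wwn[wwn.lower()].add(port)
    return {
        wwn: sorted(ports)
        for wwn, ports in ports_by_wwn.items()
        if len(ports) > 1
    }


# Hypothetical fabric login data: the first WWN appears on two switches.
logins = [
    ("10:00:00:05:1e:aa:bb:01", "switch1/port4"),
    ("10:00:00:05:1e:aa:bb:02", "switch1/port5"),
    ("10:00:00:05:1E:AA:BB:01", "switch2/port9"),
]

print(find_duplicate_wwns(logins))
```

A duplicate is not proof of an attack, since it can also result from a configuration mistake, but it is an anomaly worth investigating.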
Authentication
When only one method is used to authenticate a person, it is called one-factor authentication. When more than one method is used to authenticate a person, it is called multi-factor authentication. The four different factors of authentication are:
Something you have, such as a key, an access card, an employee badge, or a user account
Something you know, such as a password, a personal identification number (PIN), or an access code
Something that is a part of your physical person, such as a fingerprint, retina or iris, voice, or facial features (biometrics)
How you do something, such as the way you write your signature or how you type on a keyboard
Using more than one factor of authentication provides stronger authentication. For example, if an employee's access card to the company building is stolen, then the thief would be able to use the card to access the building without any further challenges. On the other hand, if the same employee was also required to enter a 4-digit PIN on a keypad, then that would provide additional protection against someone trying to use a lost or stolen access card. But even in this example of two factors of authentication, one could argue that an employee could be coerced into giving someone their PIN. For more sensitive environments, biometrics could help protect against coercion, since it would be difficult to simulate another person's biometric characteristics like a fingerprint or retinal pattern. Some devices are quite sophisticated and also measure temperature or other parameters to prevent the use of body parts that have been removed from their rightful owners.
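The access card plus PIN example can be sketched as a two-factor check in which both factors must succeed. This Python example is a simplified illustration: the card identifiers, the record layout, and the function names are hypothetical, and a real access control system would add protections such as lockout after repeated failures.

```python
import hashlib
import hmac
import os


def hash_pin(pin: str, salt: bytes) -> bytes:
    """Derive a salted hash so the PIN itself is never stored."""
    return hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, 100_000)


def enroll(card_id: str, pin: str) -> dict:
    """Create the stored record for an employee's card and PIN."""
    salt = os.urandom(16)
    return {"card_id": card_id, "salt": salt, "pin_hash": hash_pin(pin, salt)}


def authenticate(record: dict, presented_card_id: str, entered_pin: str) -> bool:
    """Grant access only if both factors match the enrolled record."""
    # Factor 1: something you have (the access card).
    has_card = hmac.compare_digest(record["card_id"], presented_card_id)
    # Factor 2: something you know (the PIN).
    knows_pin = hmac.compare_digest(
        record["pin_hash"], hash_pin(entered_pin, record["salt"])
    )
    return has_card and knows_pin
```

With this check, a stolen card without the PIN is rejected, as is a correct PIN presented with the wrong card.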
Biometrics
Biometrics is the science and technology of measuring biological information. In IT, biometric technology is used as an authentication mechanism to identify and verify the identity of individuals via: Fingerprints Palm prints Hand geometry Retinal scans Iris scans Facial patterns Voice patterns
The following two biometric characteristics are different from the others, since they do not identify a body part, but they analyze how an individual performs a specific task: Signature dynamics Keyboard dynamics
Signature dynamics measure writing speed and pauses at different points in the signature. Keyboard dynamics measure a person's typing patterns, that is, how fast they can type, delays in typing two separate letters, and so on.
One of the challenges of biometrics is balancing the error rate. There are two types of errors in biometrics: false positives and false negatives. A false positive (type I error) occurs when a biometric system falsely confirms a person's identity. A false negative (type II error) occurs when a biometric system fails to identify a legitimate person. Of the two types of errors, a false positive is more serious. If a biometric system generates too many false negatives, it becomes a nuisance to users of the system, since they are not identified and not authenticated. It may take several attempts to get a valid authentication and users get annoyed with the entire system, not to mention the time wasted and resulting loss of productivity. A false positive, on the other hand, could be a real problem, since an invalid user is identified as a valid user and is authenticated. Biometric systems are tuned to achieve a good balance between type I and type II errors. In some cases, a biometric system may favor false negatives if false positives are not tolerated.
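The trade-off between the two error types can be illustrated with a small Python sketch. It assumes, purely for illustration, that the biometric system produces a match score between 0 and 1 and accepts a person when the score reaches a threshold; the sample scores are invented.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_positive_rate, false_negative_rate) at a threshold.

    A score at or above the threshold is accepted as a match.
    """
    # Impostors who were accepted: false positives.
    false_positives = sum(score >= threshold for score in impostor_scores)
    # Legitimate users who were rejected: false negatives.
    false_negatives = sum(score < threshold for score in genuine_scores)
    return (
        false_positives / len(impostor_scores),
        false_negatives / len(genuine_scores),
    )


# Invented match scores for legitimate users and for impostors.
genuine = [0.91, 0.85, 0.78, 0.66, 0.95]
impostor = [0.12, 0.40, 0.55, 0.70, 0.30]

for threshold in (0.6, 0.8):
    fp_rate, fn_rate = error_rates(genuine, impostor, threshold)
    print(threshold, fp_rate, fn_rate)
```

With these sample scores, raising the threshold from 0.6 to 0.8 removes the false positive but causes two of the five legitimate users to be rejected, which is the tuning trade-off described above.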
From a storage perspective, biometrics are often used to access secure computer rooms and are sometimes used to authenticate to the management workstation.
One challenge with passwords arises when users have to memorize a different password for each system they are required to manage. The ability for a user to use one account and password for all of the systems they are required to access is called single sign-on. Programs are available that allow a user to create or change a password and automatically update all the systems that user has access to. Utilities and protocols such as RADIUS (Remote Authentication Dial-In User Service) and LDAP (Lightweight Directory Access Protocol) can perform this function as well as provide more sophisticated user account management. When a user logs in to a system using one of these methods, the authentication request to the system is redirected to the RADIUS or LDAP server, which performs the authentication and sends a confirmation back to the system if the authentication is successful.
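The redirection of authentication requests to a central server can be modeled conceptually as follows. This Python sketch does not implement the RADIUS or LDAP protocols; the class names, account names, and passwords are hypothetical, and the sketch only shows several systems delegating the credential check to one place.

```python
import hashlib
import hmac
import os


class CentralAuthServer:
    """Stands in for a central authentication service (conceptual only)."""

    def __init__(self):
        self._accounts = {}

    @staticmethod
    def _hash(password: str, salt: bytes) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def set_password(self, username: str, password: str) -> None:
        """Create an account or change its password in one place."""
        salt = os.urandom(16)
        self._accounts[username] = (salt, self._hash(password, salt))

    def verify(self, username: str, password: str) -> bool:
        entry = self._accounts.get(username)
        if entry is None:
            return False
        salt, stored_hash = entry
        return hmac.compare_digest(stored_hash, self._hash(password, salt))


class ManagedSystem:
    """A switch, array, or host that holds no passwords of its own."""

    def __init__(self, name: str, auth_server: CentralAuthServer):
        self.name = name
        self.auth_server = auth_server

    def login(self, username: str, password: str) -> bool:
        # The request is passed to the central server, which returns
        # the confirmation (True) or the refusal (False).
        return self.auth_server.verify(username, password)
```

Because every system consults the same server, a password change made once through `set_password` takes effect on all of them.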
Physical Security
The first line of defense to protect IT assets from external threats and many internal threats is physical security. Physical security not only involves preventing and detecting access to assets but also addresses safety concerns affecting the personnel, facilities, and equipment in the data center. This section introduces general concepts of physical security relevant to IT and storage environments.
Physical security controls come in the form of physical and psychological deterrents. Deterrents can be visible or invisible and real or an illusion. For example, a guard dog can be used as a real physical deterrent, but a Beware of Dog sign with no dog can provide an illusion of protection and acts as a psychological deterrent. Lighting can also be used as both a physical and psychological deterrent. When lighting is used in strategic locations with the proper intensity, it provides a disorienting glare effect, which can be a physical deterrent. Lighting is used most often as a deterrent to make intruders feel as if they are being observed and could be discovered, particularly if it is combined with a visible video surveillance system.
To ensure physical protection of assets, the following groups of countermeasures should be considered:
Policies and procedures
Personnel
Barriers
Equipment
Records
As with any security program, policies and procedures provide the general guidelines and establish the spirit in which physical security is implemented. The policies and procedures also provide liability protection to an organization when employees do not follow them and incidents occur as a result.
Personnel include not only the obvious security guards. All system administrators, operators, and employees need to be involved in contributing to effective physical security. System administrators and operators need to follow published procedures and policies to ensure that the systems for which they are responsible are not left unprotected. Employees should be alert for suspicious-looking individuals or situations, such as when the exterior door to their office building is left propped open.
Barriers or access control systems can be structural, human, or natural. A door to a computer center with an electronic access control system is a structural barrier. A security guard posted at the entrance of a data center is an example of a human barrier. Access to a building via a bridge over a creek represents a natural barrier.
Equipment and technology are heavily used in modern physical security, including electronic access control systems, locks, fire and intrusion detection systems, and communication systems.
Records and logs are also an important part of physical security to detect patterns, flag anomalies, provide evidence, and record events and activity. Records can be in paper format, such as sign-in sheets and incident reports, or they can be in electronic format, such as video tapes and electronic access databases.
Physical access controls are put in place to allow authorized individuals to gain access to specified areas. These include barriers of all types such as fences, gates, and doors. These controls can combine multiple mechanisms to provide additional layers of security.
Physical access controls include:
- Electronic access systems
- Intrusion detection systems
- Surveillance systems
Electronic access systems are frequently used to control individual access to buildings and areas within a building. The typical electronic access system uses a card, which can be either swiped in a card reader or placed near a proximity sensor to be read. Information contained in the card identifies the individual user and, if the user is authorized, access is granted and recorded in a database with a time stamp. Other electronic access control systems may use biometrics to identify an individual, or a combination of methods for multi-factor authentication.
Some electronic access systems use special entry mechanisms to prevent piggybacking. Piggybacking occurs when an individual physically follows an authorized user, knowingly or otherwise, to gain access to a location. Intruders often use social engineering techniques to convince authorized users to let them piggyback. Piggyback-prevention systems include turnstiles, double-door systems in which the second door will not open until the first is closed, and weight-sensitive floors.
Intrusion detection systems are used to detect unauthorized access to designated areas. These systems include motion sensors, infrared sensors, pressure-sensitive switches, and so on.
Surveillance systems monitor activity in designated areas using security personnel, electronic systems such as closed-circuit television (CCTV), and computer equipment.
To develop a comprehensive physical security plan, other factors need to be considered:
- Temperature and humidity control
- Power management
- Uninterruptible power supply (UPS)
- Generators
As explained, physical security is the first line of defense in protecting IT assets and is an important component of a comprehensive IT security program.
Data Sanitization
Data disposal and sanitization deals with maintaining confidentiality of information. Evidently, not all stored data needs to be destroyed or sanitized, and the degree to which it needs to be sanitized depends on the sensitivity and importance of the data as well as the risk of exposure to the company if the data were stolen. Certain industries regulate how certain types of data should be sanitized, while other industries are governed by legislation specifying what and how data should be destroyed. The first step in developing a data destruction and sanitization strategy is to classify the data to identify which types of data require special sanitization and/or destruction requirements. Once the data has been identified, then the level of sanitization to be performed should be determined. There are several ways to sanitize and destroy data. The NIST Special Publication 800-88 provides some useful guidelines on sanitizing media. This publication proposes four basic types of data sanitization methods, described in the following sections.
Disposal
Discarding media with no sanitization concerns is appropriate only for non-confidential or non-sensitive information. Simply deleting files and emptying the recycle bin or reformatting a disk drive would meet this requirement.
Clearing
Acceptable for non-sensitive data, clearing protects confidentiality by overwriting information with an accepted overwriting method to protect against attacks using data scavenging tools. Simple file deletion is not acceptable at this level of sanitization. Overwriting does not work on failed or defective media. Data clearing is also referred to as data shredding, erasure, or wiping.
The clearing method uses one of several techniques to overwrite data on a functional disk drive, and several standard algorithms have been developed to accomplish this. Although this method is sufficient for moderately sensitive data, it is usually not appropriate for highly sensitive data. The read/write mechanisms of disk drives are not precise enough to exactly overlay new data over old data. It is entirely possible to see small bands of residual data underlying the new data using sophisticated forensic equipment such as magnetic force microscopes. Clearly, such forensic equipment is not available to the average hacker, but it certainly could be used by a foreign government, for example, if an enemy's sensitive disk drive should fall into its hands.
There has been controversy around this subject as a result of conflicting research data on the ability to recover overwritten data. Using special microscopes, some researchers were able to demonstrate that overwritten data could be recovered. More recent work has demonstrated that modern drives are more accurate and that it is no longer possible to perform such an attack. Nevertheless, it is entirely possible that even modern drives could encounter calibration issues resulting from routine wear and tear, which could allow residual data to be observed.
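The overwriting idea behind clearing can be sketched in a few lines of Python. This is a minimal illustration, not a certified sanitization tool: the function name `clear_file`, the default pass sequence (zeroes, ones, random), and the chunk size are assumptions made for the example. It overwrites a file through the file system, so it gives no guarantee that every physical sector is overwritten on media that remap blocks (for example, flash devices with wear leveling) or on journaling or copy-on-write file systems.

```python
import os

def clear_file(path, passes=(b"\x00", b"\xff", None), chunk_size=1 << 20):
    """Overwrite a file in place, one full pass per entry in `passes`.

    A bytes entry is repeated as a fill pattern; None means random data.
    The pass sequence is an illustrative assumption, not a named standard.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for pattern in passes:
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                if pattern is None:
                    block = os.urandom(n)
                else:
                    # Repeat the pattern and trim it to exactly n bytes.
                    block = (pattern * (n // len(pattern) + 1))[:n]
                f.write(block)
                remaining -= n
            # Ask the OS to push this pass to the device before the next one.
            f.flush()
            os.fsync(f.fileno())
```

A call such as `clear_file("old_report.dat")` (a hypothetical file name) leaves the file at its original size with its contents replaced by the final pass. Verification by someone not involved in the sanitization remains a separate step.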
Purging
Data purging is used to protect against sophisticated laboratory attacks using specialized equipment such as electron microscopes and sophisticated diagnostic and forensic tools. Degaussing, which exposes magnetic media to a strong magnetic field, is an acceptable method of purging data, although certain types of degaussers are more effective than others depending on their energy rating. Clearly, degaussing will not work on non-magnetic media such as optical media.
Destruction
Physical destruction of the media is the only accepted method to completely prevent any access to the data on a magnetic media; once the media has been destroyed it can no longer be reused. Physical destruction can be accomplished by disintegrating, incinerating, pulverizing, shredding, and melting. These methods are usually reserved for the most sensitive data and are the most common methods used by military and intelligence agencies to destroy media containing confidential data. They are also often used in combination with each other, for example, a disk may be first crushed then incinerated or melted. Data sanitization procedures should also include verification processes to ensure proper confidentiality is maintained. Random samples of sanitized media should be tested by persons not involved in the actual sanitization process.
Standard | Passes | Description
US DoD 5220.22-M | 3 |
US Navy NAVSO P-5239-26 | 3 |
US Air Force System Security Instruction 5020 | 3 |
NATO Data Destruction Standard | 7 | Passes 1 and 2 certain bytes and its complement; passes 3 and 4 random character; passes 5 and 6 character and its complement; pass 7 random character
Canadian RCMP TSSIT OPS-II | 7 | Alternating passes of ones and zeroes and last pass with random characters
NSA/CSS Policy Manual 9-12 | 7 | Alternating passes of ones and zeroes
Bruce Schneier | 7 | Pass 1 zeroes; pass 2 ones; passes 3 through 7 random characters
Peter Gutmann | 35 | 35 passes of pre-defined patterns (considered excessive given today's drive technology)
Chapter Summary
When securing a SAN environment, it is important to consider a holistic approach. A defense-in-depth strategy provides multiple layers of challenges to potential attackers and hardens all aspects of the environment. Technological defenses, although important, do not necessarily address issues related to the human element such as human error. Security policies, training, operation procedures, and raising awareness can go a long way to address these issues and are unfortunately often overlooked.
Elementary Cryptography
This chapter is an introduction to some of the general concepts of cryptography for an audience not very familiar with cryptography. Many examples are simplified in order to present often highly complex concepts in a manner that IT professionals can understand. Cryptography can be used in a SAN environment to solve several problems. Here are some examples of where cryptography is used in a SAN:
- Exchanging data between the management interfaces on the switch and the management server
- Exchanging data across two data centers
- Protecting data-at-rest on a tape or disk media
- Authenticating devices joining a fabric using DH-CHAP
The word cryptography is derived from the Greek words kryptos, which means hidden, and graphia, which means writing, so it is the art of hidden writing. Stated more completely, it is "the art, science, skill, or process of communicating in or deciphering messages written in code." Scholars certainly have speculated about the first use of cryptography, but one fact is indisputable. The need to exchange or store sensitive information in a manner that only the parties involved could understand has been around for a very long time, certainly many centuries. One of the earliest known ciphers was used by Julius Caesar and is appropriately known as the Caesar cipher or the shift cipher. It is based on the concept of shifting the alphabet by a pre-determined number of letters.
For example, if the Latin/Roman alphabet is shifted by five letters, the following cipher results.
Original alphabet: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher code: F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
Using this cipher, the word RETREAT would be encoded as WJYWJFY. The shift cipher is a simple form of substitution cipher, in which each letter is replaced by another letter. A general substitution cipher mixes up the letters in no particular order. For example, if the order of the Latin/Roman alphabet is randomized, the following cipher results.
Original alphabet: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher code: Q F B O R X K U G W I P A N S Z H T D J C Y M E L V
Using this cipher, the word RETREAT would be encoded as TRJTRQJ.
Although these basic ciphers can probably be decoded easily by most weekend puzzle enthusiasts these days, they were nevertheless useful in their time. Mechanical devices have been developed to refine the encoding and decoding of messages. One of the best known encoding devices is the German Enigma Machine used in World War II, which used multiple passes of a simple alphabet substitution cipher. The electronic age introduced computers and electronic devices, which further increased the complexity and speed of the encoding process and subsequently the difficulty of decoding messages without the key.
For as long as cryptography has been around, there has also been an equivalent aspiration to decode messages. The process of deciphering messages without access to the key is called cryptanalysis.
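Both alphabet tables above can be applied mechanically. The following Python sketch shows one way to do so; the function names `shift_alphabet`, `encode`, and `decode` are chosen for this example only.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def shift_alphabet(shift):
    """Return the cipher alphabet produced by shifting A-Z by `shift` letters."""
    shift %= len(ALPHABET)
    return ALPHABET[shift:] + ALPHABET[:shift]

def encode(message, cipher_alphabet):
    """Replace each letter with the letter at the same position in the cipher alphabet."""
    table = str.maketrans(ALPHABET, cipher_alphabet)
    return message.upper().translate(table)

def decode(message, cipher_alphabet):
    """Reverse the mapping: the cipher alphabet acts as the key."""
    table = str.maketrans(cipher_alphabet, ALPHABET)
    return message.upper().translate(table)

print(encode("RETREAT", shift_alphabet(5)))             # WJYWJFY
print(encode("RETREAT", "QFBORXKUGWIPANSZHTDJCYMELV"))  # TRJTRQJ
```

In both cases the cipher alphabet plays the role of the key: anyone who holds it can call `decode` and recover the original message.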
Symmetric Keys
Symmetric cryptography uses the same key, known as a secret key, to encrypt and decrypt messages; the Caesar cipher is an example. Since the same key is used for both encryption and decryption, anyone in possession of the key can decrypt a message encoded using that key. Distributing the keys to the authorized persons poses a particular challenge, and extreme measures sometimes need to be taken for what is termed a secure key exchange. If the key is stolen or intercepted during the transfer process, the code is deemed broken and the encrypted message is no longer secure. Examples of well-known symmetric key algorithms are the Data Encryption Standard (DES), 3DES (pronounced triple DEZ), and the Advanced Encryption Standard (AES).
Asymmetric Keys
Asymmetric cryptography was developed to address the key exchange issue. Exchanging keys in times of war on the battlefield certainly offered its challenges, but the Internet and e-commerce present even greater challenges. How can you conduct millions of transactions per day at wire speeds across the world and make sure you authenticate each transaction? Asymmetric cryptography is also referred to as public key cryptography, since it makes use of keys that are known publicly. A public key exchange system works on the principle of encrypting a message using a combination of a public key and a private key. Each party has their own public and private keys, which are different but mathematically related. A familiar asymmetric key algorithm is RSA (named for the family names of its inventors: Rivest, Shamir, and Adleman), and asymmetric keys are the basis of the Public Key Infrastructure (PKI). There are several ways of implementing public key exchanges. Below is a high-level example of how it works, without going into the details of how it is actually accomplished.
Say that Jim sends Maria a message that only Maria will be able to read. Both Jim and Maria have a private key that only they know about. They also have a public key that is available on a public server containing the public key repository. Jim queries the repository to obtain Maria's public key and uses it with his own private key to encrypt the message. The message is sent to Maria and she then retrieves Jim's public key. Using the combination of Jim's public key and her private key, she can then decrypt the message and read it. Bob is a bad guy and he intercepts the message between Jim and Maria. Since Bob does not know either Maria's or Jim's private key, he is unable to decrypt the message using just Maria's and Jim's public keys. Figure 28 illustrates this example.
[Figure 28: a public key repository lists the public keys of Jim, Maria, Bob, and Victoria. Jim sends a message encrypted with Maria's public key and Jim's private key across the Internet; Maria decrypts it with Jim's public key and Maria's private key. Bob intercepts the message but cannot decrypt it.]
Hybrid Systems
Asymmetric algorithms are computation intensive and do not lend themselves well to processing large volumes of data. Hybrid cryptosystems address this by using an asymmetric algorithm for authentication and key distribution and a symmetric algorithm for the actual data encryption process.
Cryptographic Algorithms
A cryptographic algorithm or cipher is the actual procedure used to manipulate a readable message and render it unreadable. The readable message that is input to a cipher is called plaintext and its output is called ciphertext.
Early thinking around ciphers encouraged security through obscurity. Proprietary algorithms were kept secret for fear of their being discovered and subsequently broken. With certain exceptions, notably military-grade applications, this thinking has been replaced by the use of open algorithms that withstand public scrutiny. Auguste Kerckhoffs proposed six rules for military cryptography in 1883, among them that an encryption algorithm falling into enemy hands should not result in a compromise of the message as long as the key was not discovered.
Proprietary encryption algorithms are generally not considered as secure, since they do not benefit from being scrutinized by either the cryptographic community at large or the general public. These algorithms are usually analyzed by a small group of elite professional cryptographers, who sometimes have tunnel vision and see things from only one perspective, a situation which could result in a gaping flaw being overlooked. An open algorithm, on the other hand, has this advantage: thousands of individuals have attempted to break it. If thousands of people from different professions and viewpoints are unable to break the code, then the algorithm can reasonably be considered secure. If someone eventually breaks the code, it will become public knowledge and the algorithm will have ended its useful life.
Designing a cryptographic algorithm is very complex and should take the factors listed below into consideration, so it can be used efficiently in practical commercial applications:
- Speed of encryption. A highly complex and completely unbreakable algorithm would have no practical commercial use if it also required excessive amounts of processing power to compute, which would drastically impact performance.
- Memory usage. Algorithms that use too much memory to perform their computations and manipulations may require memory components too large to physically fit in certain portable devices, which may restrict their practical application.
- Range of applications. The ability to implement an algorithm in a wide range of devices, from supercomputers and disk arrays to Smart Cards and radio-frequency identification (RFID) devices, can determine its value.
- Cost. If the cost to implement the cryptosystem is too high, then it may not find commercial relevance. Military and intelligence applications sometimes warrant the high cost in exchange for stronger cryptographic capabilities.
- Openness. This supports Kerckhoffs' principle, stated by Auguste Kerckhoffs in the 19th century: a cryptosystem should be secure even if everything about the system, except the key, is public knowledge.
There are three basic categories of ciphers: block ciphers, stream ciphers, and hashing algorithms.
Block Ciphers
Block ciphers are used to encrypt data as an entire block as opposed to one bit at a time. An entire block of data is processed at the same time by the block cipher. A plaintext message is broken down into fixed-length blocks and passed to the block cipher as plaintext. Each plaintext block is encrypted with the key to create a ciphertext block that is the same size as the input plaintext block. The decryption process takes the ciphertext message and breaks it down into fixed-size blocks. Each ciphertext block is decrypted using the key to produce a plaintext block the same size as the input ciphertext block, as shown in Figure 29.
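[Figure 29: the plaintext message is broken into blocks; each plaintext block passes through the block cipher with the key to produce a ciphertext block, and each ciphertext block passes through the block cipher with the same key to recover the plaintext block.]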
Stream Ciphers
Stream ciphers process plaintext one bit at a time, as shown in Figure 30. Generally, stream ciphers are considered less secure, since there is a higher risk of having repeating patterns. For this reason, block ciphers are more commonly used. Block ciphers can, however, be used on streaming data when they are operating in a streaming mode of operation, such as the counter (CTR) mode discussed later in this chapter.
[Figure 30. Stream cipher encryption/decryption: individual bits pass through the stream cipher with the key and are encrypted one at a time.]
Both the block and stream ciphers address the data confidentiality issue by rendering the data unreadable without the key. Hashing algorithms, on the other hand, address the integrity issue by providing a means to verify that data has not been modified.
Hashing Algorithms
Hashing algorithms, shown in Figure 31, are used to convert a message of variable length to a shorter, sometimes fixed, length or numerical value. The resulting value is sometimes called a message digest (MD). These algorithms are also called one-way hashing functions since they only work in one direction, and it is not possible to reconstruct the original message from the message digest. In a real-world example, if a glass is smashed into fine particles, it would not be possible to reconstruct the glass in its original form.
[Figure 31: a plaintext message passes through a hash algorithm (one-way function) to produce a hash value; it is not possible to calculate the original message from the hash value.]
Hashing algorithms are often used for error-checking, but in IT security are generally used to verify the integrity of a message. For example, hackers have been known to modify code, particularly freeware, and add a back door, virus, or some other type of malware into the code. When the original software package is passed through a hashing algorithm, a hash value is generated, which can then be posted in a public location. If someone downloads this software package and puts it through the same hashing algorithm, the resulting hash value should match the one posted. If they do not match, then it can be assumed that the software has been modified and cannot be trusted to be secure. An MD by itself only provides integrity verification, but an MD can be encrypted with a symmetric key to provide authentication of the provenance of the data. This technique is known as a message authentication code (MAC).
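The download check and the MAC described above can be sketched with Python's standard `hashlib` and `hmac` modules. Two assumptions are made here: SHA-256 is used as the hashing algorithm, and the MAC is computed with HMAC, a standard keyed-hash construction, rather than by literally encrypting the digest with a symmetric key. The function names are chosen for this example only.

```python
import hashlib
import hmac

def digest(data: bytes) -> str:
    """Return the SHA-256 message digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()

def verify_download(data: bytes, published_digest: str) -> bool:
    """Integrity check: recompute the digest and compare it to the posted value."""
    return hmac.compare_digest(digest(data), published_digest)

def make_mac(key: bytes, data: bytes) -> str:
    """Keyed digest: only holders of `key` can produce a matching value."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()

def verify_mac(key: bytes, data: bytes, tag: str) -> bool:
    """Integrity and origin check using the shared symmetric key."""
    return hmac.compare_digest(make_mac(key, data), tag)
```

A plain digest detects modification only if the posted value itself can be trusted; the keyed variant additionally ties the value to whoever holds the shared key.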
Digital Signatures
A digital signature, shown in Figure 32, is exactly what it says: it is the equivalent of a person's paper signature but for digital transactions. Digital signatures cannot be repudiated later. A digital signature is created as follows:
1. A message is created.
2. The message is passed through an algorithm to generate a hash value.
3. The hash value is encrypted using a private key from some public/private key authority.
4. The resulting encrypted hash is the digital signature.
The validation process at the other end goes as follows:
1. The message is passed through the same hashing algorithm.
2. The digital signature is decrypted using the public key of the sender.
3. The resulting decrypted hash is compared with the newly calculated hash.
4. If the hash values match then the message is deemed valid.
[Figure 32. Digital signature: the sender sends the signed message (plaintext message plus digital signature) to the receiver; if the two MDs match, then the message is authentic.]
Digital signatures provide non-repudiation and integrity to prevent someone from claiming that they did not perform an action or approve a transaction, and to confirm that the message has not been modified.
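The create and validate steps for a digital signature can be sketched with textbook RSA arithmetic in plain Python. This is a toy under stated assumptions: the key pair is built from two small, publicly known Mersenne primes, SHA-256 is assumed as the hashing algorithm, and no padding scheme is applied. Production signatures use much larger random keys and a vetted cryptographic library.

```python
import hashlib

# Toy key pair (illustrative only): two known Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)

def message_hash(message: bytes) -> int:
    """Hash the message, then reduce the digest so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Sender: 'encrypt' the hash value with the private key."""
    return pow(message_hash(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Receiver: 'decrypt' the signature with the public key and compare hashes."""
    return pow(signature, E, N) == message_hash(message)
```

Only the holder of `D` can produce a signature that verifies under the public pair (`N`, `E`), which is what gives the signature its non-repudiation property.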
Modes of Operation
A cryptographic algorithm can be applied in different ways depending on the type of data and specific requirements of its application. For example, some data is fixed length and must remain exactly the same size after it has been encrypted, as is the case with block data written to disks. In other contexts, such as tape backup applications, the data is streaming to the device very rapidly on a flexible media. Furthermore, encrypting data bit by bit as it is transported serially through a wire requires yet another method. Instead of creating a different cryptographic algorithm for each application and type of data, the same algorithm is used in different ways to accommodate the requirements. These methods are called modes of operation. The following describes common modes of operation in use today:
- Electronic Codebook (ECB). Divides the message into equal-size blocks that are encrypted separately. ECB is not very good for hiding patterns, since identical plaintext blocks encrypt to identical ciphertext blocks.
- Cipher-Block Chaining (CBC). A message is divided into equal-size blocks and each block is encrypted as a whole. The first block is combined with an initialization vector (IV) to randomize the encryption process. Furthermore, all subsequent blocks are chained in such a way that the encryption of each block depends on all previously encrypted blocks.
- Counter (CTR). Converts a block cipher into a stream cipher by encrypting successive blocks in a data stream using a counter to change the value for each block.
- Galois Counter Mode (GCM). A mode of operation similar to Counter mode with the addition of an authentication component called the Galois mode. Authentication is necessary to prevent certain types of attack on a data stream, but it is usually a computing-intensive process, which would not be acceptable for streaming data. The Galois mode was developed to authenticate a message at very high speeds with minimal performance impact on the data throughput.
- XEX-based Tweaked Codebook with Stealing (XTS). This mode of operation was designed for data formats that are not evenly divisible by a given block size, as is the case for disk drives with sectors not evenly divisible by their block size. XTS is used by the Brocade encryption solution to encrypt block data on disk drives.
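The structure of Counter mode can be sketched in Python. Because the standard library does not include a block cipher such as AES, this sketch substitutes HMAC-SHA256 as the keyed function applied to each counter value; that substitution, the 32-byte block size, and the function names are assumptions made for illustration. Counter mode only ever runs the keyed function in the forward direction, so the structure is the same as it would be with a real block cipher.

```python
import hashlib
import hmac

BLOCK_SIZE = 32  # bytes of keystream produced per counter value by the stand-in

def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    """Stand-in for encrypting (nonce || counter) with a block cipher."""
    return hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()

def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR each block of data with the keystream for its counter."""
    out = bytearray()
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        stream = keystream_block(key, nonce, offset // BLOCK_SIZE)
        # zip() stops at the shorter input, so a short final block needs no padding.
        out.extend(b ^ s for b, s in zip(block, stream))
    return bytes(out)
```

The same function encrypts and decrypts, and the output is always the same length as the input. Because the counter changes the keystream for every block, identical plaintext blocks do not produce identical ciphertext blocks, unlike ECB. The nonce must never be reused with the same key.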
DES/3DES
The National Bureau of Standards (NBS) recognized the need for a government-wide standard for the encryption of non-classified, sensitive data and developed a cryptographic algorithm to address this requirement. The first draft of the algorithm was written by IBM and was called LUCIFER. The name was eventually changed to the Data Encryption Standard (DES) and it was adopted as an official standard in 1976. The algorithm is a symmetric-key algorithm with 56-bit keys that determine which bits will be transposed and substituted in the original message. DES was broken by a brute force attack in 1999 by the Electronic Frontier Foundation (EFF), making it imperative to come up with a new cryptographic standard for the Federal Government. Selecting a new cryptographic standard is a complex and lengthy process, since proposed algorithms must stand the test of time and as many people as possible must be given the opportunity to attempt to break them. In the interim, it was crucial to replace DES with a new algorithm with a larger key space, since DES was no longer secure. The simplest solution was to use a modified application of DES until the new standard could be adopted.
3DES became the DES interim replacement. 3DES increased the key space from 2^56 to 2^168 by simply performing three consecutive encryption passes using DES and a different 56-bit key for each pass. Effectively, this created an algorithm that used three 56-bit keys, which is equivalent to a 168-bit key size.
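The difference between a 56-bit and a 168-bit key space is easier to appreciate with a little arithmetic. The Python sketch below computes both key counts and converts an exhaustive search into years; the search rate of one trillion keys per second is a purely hypothetical figure chosen for illustration, as are the names used.

```python
DES_KEY_BITS = 56
TRIPLE_DES_KEY_BITS = 3 * DES_KEY_BITS      # three independent 56-bit keys

des_keys = 2 ** DES_KEY_BITS                # 72,057,594,037,927,936 possible keys
triple_des_keys = 2 ** TRIPLE_DES_KEY_BITS

SECONDS_PER_YEAR = 365 * 24 * 3600

def years_to_search(key_count: int, keys_per_second: float) -> float:
    """Time needed to try every key at the given (hypothetical) rate."""
    return key_count / keys_per_second / SECONDS_PER_YEAR

RATE = 1e12  # hypothetical attacker: one trillion keys per second

print(f"DES key space:  2^{DES_KEY_BITS} = {des_keys:,} keys")
print(f"3DES key space: 2^{TRIPLE_DES_KEY_BITS}, or 2^112 times larger")
print(f"Exhaustive DES search:  {years_to_search(des_keys, RATE):.2e} years")
print(f"Exhaustive 3DES search: {years_to_search(triple_des_keys, RATE):.2e} years")
```

Each additional key bit doubles the work of an exhaustive search, so the gap between the two figures is a factor of 2^112. This counts raw key combinations only; it says nothing about attacks that are cleverer than trying every key.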
AES
The Advanced Encryption Standard (AES) was developed by the National Institute of Standards and Technology (NIST) to replace the DES through a competitive process, in which 15 competitors submitted proposed algorithms. The Rijndael algorithm proposed by Vincent Rijmen and Joan Daemen, two Belgian cryptographers, was selected as the new encryption standard in 2000. The AES is defined in the Federal Information Processing Standards (FIPS) publication 197. The Rijndael algorithm is a symmetric key block cipher which supports keys with 128 bits, 192 bits, and 256 bits (AES-128, AES-192, and AES-256 respectively). It was rapidly adopted by the industry, and most commercial applications for encryption of data-at-rest use AES-256. The AES standard is the first to use an open cipher that is available to anyone, which distinguishes it from its predecessor, DES. Although there had been some controversy around DES, which was co-developed by the National Security Agency (NSA), as to whether the NSA had created a back door into the algorithm, the open nature of the AES standard has all but eliminated this possibility.
Diffie-Hellman
Whitfield Diffie and Martin Hellman were the first to publish the concept of public key cryptography in 1976. In actual fact, the public key/private key concept was first developed independently by James Ellis in 1969 and the algorithm problem was solved by Clifford Cocks in 1973. However, their work was not published before the publication of the work of Diffie and Hellman. Without going into too many details of how this algorithm works, it is based on the difficulty of computing discrete logarithms over very large numbers. Diffie-Hellman (DH) was the first practical implementation of public key cryptography and is ubiquitous in the IT security industry. It is an integral part of several standards and protocols. In the FC industry, the FC-SP (Fibre Channel-Security Protocol) uses DH-CHAP (DH-Challenge Handshake Authentication Protocol) to authenticate devices or switches joining a fabric.
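A Diffie-Hellman exchange can be sketched with modular exponentiation in plain Python. Raising a number to a power modulo a prime is fast, while recovering the exponent from the result (the discrete logarithm) is believed to be hard for suitably large parameters. The prime, the base, and the function names below are toy choices made for this illustration; they are not parameters taken from FC-SP, DH-CHAP, or any other standard.

```python
import secrets

P = 2**127 - 1   # a known Mersenne prime; far too small for real use
G = 3            # illustrative base (an assumption, not a standardized generator)

def generate_private_key() -> int:
    """Pick a random secret exponent in the range [2, P - 2]."""
    return secrets.randbelow(P - 3) + 2

def public_key(private: int) -> int:
    """Value that can be sent in the clear: G^private mod P."""
    return pow(G, private, P)

def shared_secret(their_public: int, my_private: int) -> int:
    """Both sides arrive at G^(a*b) mod P without ever sending a or b."""
    return pow(their_public, my_private, P)

# Each party keeps its private key and exchanges only the public value.
jim_private = generate_private_key()
maria_private = generate_private_key()
jim_secret = shared_secret(public_key(maria_private), jim_private)
maria_secret = shared_secret(public_key(jim_private), maria_private)
print(jim_secret == maria_secret)
```

An eavesdropper sees only the two public values. The agreed secret would normally be fed into a key derivation step and then used as a symmetric key. On its own, this exchange does not authenticate either party.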
RSA
At around the same time Diffie and Hellman were completing their work on public key cryptography, three researchers at MIT were also working independently on the same problem. Ronald Rivest, Adi Shamir, and Leonard Adleman found a practical implementation of the public key cryptography algorithm and published their findings in 1977. They obtained a patent for their discovery and subsequently formed a company in 1982 bearing the first initial of their last names: RSA. Their patent expired in September 2000 and the algorithm is now in the public domain. The RSA algorithm is so widespread that it has become a de facto standard.
Digital Certificates
A digital certificate is sometimes confused with a digital signature but they are very different. A digital certificate is the equivalent of an ID card and is issued to an individual by a trusted certification authority (CA). It is composed of the owner's name, a serial number, an expiration date, a copy of the owner's public key, and the digital signature of the CA. Some digital certificates use the standardized X.509 format defined in RFC 2459. Starting in FOS 4.2, Brocade switches came pre-loaded with a digital certificate. Digital certificates are no longer pre-loaded (since the release of FOS 5.1), but one can still be obtained if you wish. This digital certificate was used to authenticate switches that were joining a secured fabric using the Switch Connection Control (SCC) policy.
PKI
The Public Key Infrastructure (PKI) is a set of programs, hardware, data formats, procedures, and policies required to manage digital certificates. It is a general concept with different implementations offered by multiple vendors. PKI emerged from the necessity to provide a secure means of exchanging information and performing commercial transactions over the Internet. The challenge was to ensure that digital certificates used in commercial transactions were authentic. To accomplish this, it was necessary to build a web of trust and provide the necessary authorities to attest to the validity of a digital certificate. Figure 33 illustrates the PKI scheme and its components. At the heart of the PKI is the certification authority (CA) or trusted third party (TTP) that generates and distributes the digital certificate. The digital certificate includes a digital signature from the CA attesting to its validity.
The next step is to establish the identity of the user of the digital certificate which is accomplished by the registration authority (RA). The RA does not issue certificates but acts as an intermediary between the user and the CA. The role of the CA may be carried out by an actual human or by software running on a CA device. What happens when a digital certificate expires or is revoked because it has been compromised? In that case, a certificate revocation list (CRL) is maintained at the CA and consulted each time a transaction takes place using a digital certificate. Where Brocade is the CA, a PKI is used to distribute Brocade digital certificates.
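[Figure 33: PKI components. The CA issues the certificate; the CRL is checked for revoked certificates; the RA sits between the user and the CA, with communication across the Internet.]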
SSL
Secure Socket Layer (SSL) was developed out of a need to encrypt communications over the Internet and addresses only the confidentiality of data-in-flight. It was originally developed by Netscape in 1994 and SSLv3.0 is the most widely used today. It is a hybrid encryption system using both symmetric and asymmetric cryptography. Public key cryptography is used to authenticate between clients and servers, whereas symmetric cryptography is used to encrypt the application data. The application data can be encrypted using a 40-bit or 128-bit symmetric key version. The authentication is performed using digital certificates obtained from a CA in a PKI framework.
SSL can be used with several protocols, although it is used primarily with HTTP. For example, SSL is used to secure communications between a Brocade management graphical user interface (GUI) and a Brocade switch. The secured version of HTTP in this case is called HTTPS.
IPSec
IPSec (IP security) is a framework that performs encryption at the routing layer (IP - Layer 3) in the TCP/IP stack. IPSec is commonly used to secure communications in a virtual private network (VPN), but it can be used simply to encrypt communications between two devices on a network. IPSec can either encrypt only the payload or data (transport mode) or it can encrypt the payload and the header information (tunnel mode). Since it is a framework, it does not actually specify which encryption or hashing algorithms are used. For example, IPSec can be used to encrypt communications between two Brocade 7500s using FCIP to replicate data between two data centers. Brocade's implementation of IPSec supports the following encryption and hashing algorithms.
Encryption algorithms:
- 3DES
- AES-128 (default)
- AES-256
Hashing algorithms:
- SHA-1 (default)
- MD5
- AES-XCBC
Key Management
The decision to encrypt information residing on disk or tape creates a long-term commitment and a dependence on the encryption keys. After being created, keys need to be backed up and managed. Keys can be lost, stolen, destroyed intentionally, or expire after a pre-determined period of time; all of these are security vulnerabilities. Loss of the encryption keys is comparable to losing the data. Unlike data in flight, the keys for data at rest must be available for as long as the data needs to be read. In the case of patient health records, information may need to be retained for 7 years after the death of a patient, which could mean a retention period of over 100 years. Keys can also be stolen or compromised, in which case the information would have to be reencrypted or rekeyed using a different key to ensure the confidentiality of the information. Media such as disk and tape also have a limited shelf life and they go through evolution cycles to an eventually incompatible format.
Encrypting data-at-rest requires management of the encryption key for the life of the data. Encryption keys are usually managed by a comprehensive key management system, because keys need to be managed for an extended period of time. A key management system is used to manage the lifecycle of keys. Encryption key information needs to be refreshed as the media expires, and the data has to be reencrypted using the same or a different key. Finally, encryption keys need to be backed up in a secure manner to avoid being compromised in the process.
Keys can be backed up to a key vault, which is a part of a comprehensive key management solution used to establish policies and manage the keys throughout their lifecycle. For redundancy, a typical key vault is implemented with two or more units to prevent single points of failure. If the primary key vault becomes unavailable, the secondary or clustered key vault can accept or provide keys to the encryption device. Key management solutions are implemented using two basic methodologies to exchange the keys between the encryption device and the key management solution: trusted and opaque.
Figure 34. Trusted key exchange (diagram labels: key vault, link key, key encryption key, encryption engines, cleartext DEK, wrapped DEK, secure link)
To prevent key exchanges from being sniffed or intercepted during transmission between encryption devices and key vaults, most vendors use secure channels for the key exchange or wrap the key using a symmetric key before sending it over the channel. Many variations for the key exchange process exist. For example, the NetApp Lifetime Key Management (LKM) uses a secure channel (SSL) and wraps (encrypts) the key before sending it across the secure channel.
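The idea of wrapping a data encryption key (DEK) under a second symmetric key before it leaves the encryption device can be sketched as follows. This is a toy construction built only from Python's standard library so that it runs anywhere; it is not the algorithm used by any product named here, and a real deployment would rely on a vetted key-wrap algorithm. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16
TAG_LEN = 32  # length of an HMAC-SHA256 tag


def _keystream(kek: bytes, nonce: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the wrapping key and a nonce."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(kek + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def wrap_key(kek: bytes, dek: bytes) -> bytes:
    """Return nonce + masked DEK + integrity tag (illustrative only)."""
    nonce = secrets.token_bytes(NONCE_LEN)
    stream = _keystream(kek, nonce, len(dek))
    body = bytes(a ^ b for a, b in zip(dek, stream))
    tag = hmac.new(kek, nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def unwrap_key(kek: bytes, wrapped: bytes) -> bytes:
    """Check the integrity tag, then recover the cleartext DEK."""
    nonce = wrapped[:NONCE_LEN]
    body = wrapped[NONCE_LEN:-TAG_LEN]
    tag = wrapped[-TAG_LEN:]
    expected = hmac.new(kek, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrapped key failed integrity check")
    stream = _keystream(kek, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))
```

Only the wrapped form would travel over the link; a party without the wrapping key sees neither the DEK nor a blob it can modify undetected.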
Figure 35. Opaque key exchange (diagram labels: key vault, encryption engine, cleartext data encryption key, wrapped data encryption key, secure link. Encrypted keys are stored in the key vault database before any data can be encrypted. The FIPS 140-2 Level 3 security boundary defines where keys must be wrapped (encrypted) before they leave.)
The EMC RSA Key Manager for the Datacenter (RKM) and HP StorageWorks Secure Key Manager (SKM) are examples of opaque key management solutions.
Chapter Summary
Cryptography has been used for centuries and has evolved considerably with the arrival of computers and high technology. The development of networks and the Internet made it critical to develop new methods to exchange information securely across these new media, and the SAN is no exception. There are several problems in a SAN environment that must be addressed using cryptography, such as securely exchanging data across data centers and between switches and their management servers, or ensuring confidentiality of data-at-rest on disk or tape media. Many of the technologies commonly used in conventional TCP/IP-based networks can also be used in SAN environments, particularly when protecting the management interfaces. Specific solutions have also been created to address requirements unique to SAN environments, such as authenticating devices joining a fabric using DH-CHAP.
In previous chapters, basic security and encryption concepts were introduced in addition to FC SAN basics. This chapter puts it all together so that you can apply these security concepts in the form of best practices in your FC SAN environment. It offers guidelines that can be used by SAN administrators and security professionals to help build a SAN security policy and decide which features should be deployed. Implementation of these features will be explained in greater detail in Chapter 8: Securing FOS-Based Fabrics starting on page 133. When you design a SAN security policy, it is not necessary to implement and enable every single available security feature. Some security features add performance overhead, others may affect administrator productivity, and yet others may have associated implementation costs. A balance must be struck among the features, the value of the assets being protected, and the probability that a vulnerability of the system will actually be exploited.
SAN Security
Figure (diagram labels: training and awareness, policies, operational security, and physical security layers surrounding three cores: HBA, FC HW, and storage)
The model resembles an onion with layers of skin around its core (perhaps a mutant onion with three cores). At the center, the three basic types of SAN hardware are represented by the three smaller circles for the HBA, storage hardware (disk and tape), and fabric infrastructure hardware (switches, directors, and routers). Surrounding the SAN hardware is a series of concentric circles outlining the layers of security to protect that specific type of hardware.
How real is WWN spoofing? This question certainly comes up frequently and the answer is not always simple. It is possible to change the WWN on any HBA; tools are available from most HBA vendors that allow users to do this and some are also available as freeware. For example, an attacker from the outside could theoretically compromise a server in a DMZ (demilitarized zone; originally a military term, in computer security it refers to a network segment that sits between the internal network and untrusted external networks), reconfigure the compromised server's WWN to another production host's WWN, and capture data destined to the other host. How likely is this to occur? With proper security measures in place, this type of attack is unlikely. However, as with all environments, if the value of an asset is very high or very attractive to someone, then the likelihood of any attack on that asset increases, no matter how difficult or sophisticated the attack needs to be. Generally though, only the most sensitive environments, such as military and intelligence organizations and some private organizations in the financial sector, require protection against WWN spoofing attacks.
As a best practice, assign only one host, or initiator, per zone. Single-initiator zoning (SIZ) serves two basic purposes. A SIZ restricts host-to-host communications and limits RSCNs to the zones requiring the information.
As a best practice, LUN masking should always be used to restrict the visibility of a LUN to a specific host and to prevent other hosts from seeing a LUN that is not assigned to them. LUN masking used with switch-based zoning offers the best protection to a LUN within a fabric. Furthermore, it is recommended to create separate zones between the HBA and disk or tape storage. Separating the disk and tape storage prevents RSCNs destined for disk devices from being propagated to tape devices, which are more prone to disruption resulting from an RSCN.
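The way zoning and LUN masking combine can be sketched as two independent checks that must both pass. The sketch below is only a model of the idea; the WWNs, zone names, and LUN numbers are invented for illustration, and real fabrics and arrays keep these tables in their own configuration stores.

```python
# Hypothetical zoning table held by the fabric: zone name -> member WWNs.
ZONES = {
    "host_a_disk": {"10:00:00:00:c9:00:00:0a", "50:06:01:60:00:00:00:01"},
    "host_b_disk": {"10:00:00:00:c9:00:00:0b", "50:06:01:60:00:00:00:01"},
}

# Hypothetical LUN masking table held by the storage array:
# (storage port WWN, LUN number) -> host WWNs allowed to see that LUN.
LUN_MASKS = {
    ("50:06:01:60:00:00:00:01", 0): {"10:00:00:00:c9:00:00:0a"},
    ("50:06:01:60:00:00:00:01", 1): {"10:00:00:00:c9:00:00:0b"},
}


def zoned_together(host_wwn: str, target_wwn: str) -> bool:
    """True if some zone contains both the host and the storage port."""
    return any(
        host_wwn in members and target_wwn in members
        for members in ZONES.values()
    )


def can_access(host_wwn: str, target_wwn: str, lun: int) -> bool:
    """A host reaches a LUN only if zoning AND LUN masking both allow it."""
    allowed_hosts = LUN_MASKS.get((target_wwn, lun), set())
    return zoned_together(host_wwn, target_wwn) and host_wwn in allowed_hosts
```

In this model both hosts are zoned to the same storage port, yet each sees only its own LUN, which is the extra protection LUN masking adds on top of zoning.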
The next line of defense is to prevent ports from becoming E_Ports (the default setting). In the event that an unused port is enabled, a switch would still be unable to join the fabric since the port will not become an E_Port. The next line of defense is to create an ACL specifying by WWN and/or switch port which devices are allowed to join the fabric. In Brocade fabrics, ACLs are used to restrict device access: The Switch Connection Control (SCC) policy is used to specify the WWN of the switches allowed to join the fabric. The Device Connection Control (DCC) policy is used to specify which hosts and storage devices are allowed to join the fabric. The DCC policy can specify members by WWN but it can also lock down a WWN to a physical port on a switch so that only a specific WWN can connect to a specific port in the fabric, and all other devices connecting to that port will not be authorized.
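The admission logic behind the SCC and DCC policies described above can be modeled with a few lookups. This is a hedged sketch of the concept only: the data structures, WWNs, and function names are assumptions made for the example and do not reflect how Fabric OS stores or evaluates its policies.

```python
from typing import Dict, Set, Tuple

# Hypothetical SCC policy: WWNs of switches allowed to join the fabric.
SCC_POLICY: Set[str] = {"10:00:00:05:1e:00:00:01", "10:00:00:05:1e:00:00:02"}

# Hypothetical DCC policy: WWNs of hosts and storage allowed to join.
DCC_MEMBERS: Set[str] = {"10:00:00:00:c9:00:00:0a", "10:00:00:00:c9:00:00:0b"}

# Optional port lockdown: (domain ID, port number) -> the only WWN allowed there.
PORT_LOCKS: Dict[Tuple[int, int], str] = {(1, 4): "10:00:00:00:c9:00:00:0a"}


def switch_may_join(switch_wwn: str) -> bool:
    """SCC check: is this switch WWN listed?"""
    return switch_wwn in SCC_POLICY


def device_may_login(device_wwn: str, domain: int, port: int) -> bool:
    """DCC check: the WWN must be listed, and a locked port accepts one WWN."""
    if device_wwn not in DCC_MEMBERS:
        return False
    locked_to = PORT_LOCKS.get((domain, port))
    return locked_to is None or locked_to == device_wwn
```

With this model, a listed device is still refused on a port that has been locked to a different WWN, while unlocked ports accept any listed device.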
The Brocade DCC and SCC policies are also known as fabric-wide consistency policies, since they can be distributed throughout an entire fabric and used in two different modes: In strict mode, all devices participating in the fabric must be defined in the DCC or SCC policy. The tolerant mode allows some switches to join the fabric without requiring them to be defined in the ACL. This is useful when the fabric contains switches running older versions of firmware (prior to FOS 5.2.0), which cannot use the FOS DCC and SCC policies.
As a best practice, it is advisable to use the strict policy as much as possible. A fabric is only as secure as its weakest link, and if one switch does not participate in an SCC policy then that would be the easiest target for an attacker. The final and most sophisticated line of defense to prevent device access is to use a device authentication mechanism. The ANSI T11 standard FC-SP (FC security protocol) defines the DH-CHAP protocol for this purpose. Devices supporting DH-CHAP can be configured with a shared secret between the device and the switch, and only the device with the corresponding shared secret will be allowed to join the fabric (see Diffie-Hellman on page 84).
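The shared-secret check at the heart of this kind of authentication can be sketched as a CHAP-style challenge and response. The sketch is deliberately simplified: it omits the Diffie-Hellman exchange that DH-CHAP adds, the choice of SHA-256 is an assumption made for the example, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """The authenticator (for example, the switch) issues a random challenge."""
    return secrets.token_bytes(16)


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """Hash over the identifier, the shared secret, and the challenge."""
    return hashlib.sha256(bytes([identifier]) + secret + challenge).digest()


def verify(identifier: int, secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Recompute the expected response and compare in constant time."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)
```

The secret itself never crosses the link; only a hash bound to a fresh challenge does, so a captured response is of no use against a later challenge.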
Clearly, it is not necessary to implement every one of these lines of defense to prevent FC device access to a fabric. The number of layers an organization decides to implement will depend on its business requirements, the sensitivity of the environment, and the amount of risk it considers tolerable.
trum is a role that grants full admin privileges; other roles are somewhere in between. Typically these roles are customized for specific types of functions such as an operator or a security administrator.
Physical isolation and routing. Separation of duties can also be accomplished through isolation of systems, each managed by a different individual. This is used frequently in the data center, where a separate SAN is built for each group or project in an organization. It is often used by outsourcing firms with shared multi-tenant environments. Many of their customers do not want to become part of the collective and prefer being physically isolated from any other customer. A separate SAN is constructed for each customer needing physical isolation and a restricted group of administrators is assigned to manage this environment. While it's true that creating physically isolated SANs provides the ultimate segregation, it's also true that the use of storage resources in the shared environment is not optimized. A compromise between a fully shared SAN and an isolated SAN is a logical SAN (LSAN), which places an FC router between the two SANs. For example, one environment may require its own hosts, applications, disk storage, and FC infrastructure to be managed independently from the shared environment. However, to avoid the additional cost of a tape backup system, an administrator can create an LSAN to enable sharing of the tape backup resources with the isolated SAN. This implementation provides physical isolation but has the advantage of sharing some resources according to strict pre-defined rules. Note that FC routing can also be implemented to create LSANs using the Integrated Routing (IR) feature on Brocade 8 Gbps switches, available since FOS 6.1.1.
Zoning. When FC SANs first emerged more than a decade ago, there was no real access control mechanism to protect storage used by one host from being accessed by another host. This was not a significant issue at the time, since the original SANs were relatively small. Over time, however, as SANs became larger, more complex, and mission-critical to most data centers, this became a risk. To help secure particular devices and data, Brocade invented the concept of zoning, or restricting device communication only to member devices inside a given zone. Today, zoning is an accepted standard and plays an integral role in SAN security.
For Brocade switches, there are two ways to identify zone members, and zone enforcement is performed either in the switch Name Server or inside the switch ASICs. Identification methods include domain/port (DP) identification, which uses the switch domain ID and the switch port number, and port World Wide Name (pWWN) identification, which uses the storage or host port WWN. Brocade recommends pWWN identification because of the management flexibility it provides; also, several advanced Brocade features require pWWN zoning. Brocade switches that operate at 2 Gbps or faster enforce both DP and pWWN zones in hardware. This was not the case in Brocade 1 Gbps switches, and users frequently chose DP identification because it was the only hardware-enforced zoning method at the time. Now, a zone with all DP identification or all pWWN identification uses the more secure hardware enforcement. However, there are some cases when mixing identification methods results in software enforcement. These cases include mixing DP and pWWN identification within a zone, or using a DP identification for one zone and the pWWN attached to that DP in another zone. For this reason, Brocade recommends using the same zoning identification method (preferably pWWN) across the entire SAN to ensure that all zoning is hardware enforced, that advanced features such as Fibre Channel Routing are usable, and that zoning management methods are consistent.
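The two mixing cases described above can be turned into a small checker that flags zones likely to fall back to software enforcement. This is a sketch of the rules as stated in the text, not of any switch's internal logic. The member notation ("domain,port" strings versus colon-separated WWNs), the `attached` lookup table, and the choice to flag both zones in the second case are assumptions made for the example.

```python
from typing import Dict, List


def is_pwwn(member: str) -> bool:
    """Treat colon-separated members as pWWNs and 'domain,port' as DP."""
    return ":" in member


def classify_zones(
    zones: Dict[str, List[str]], attached: Dict[str, str]
) -> Dict[str, str]:
    """Return 'hardware' or 'software' for each zone name.

    zones:    zone name -> list of members (DP or pWWN).
    attached: DP member -> pWWN of the device plugged into that port.
    """
    result = {}
    for name, members in zones.items():
        kinds = {"pwwn" if is_pwwn(m) else "dp" for m in members}
        # Case 1: DP and pWWN members mixed inside one zone.
        result[name] = "hardware" if len(kinds) <= 1 else "software"

    # Case 2: a DP member in one zone whose attached pWWN is a member
    # of a different zone. Both zones are flagged (an assumption).
    for name_a, members_a in zones.items():
        for dp in (m for m in members_a if not is_pwwn(m)):
            wwn = attached.get(dp)
            if wwn is None:
                continue
            for name_b, members_b in zones.items():
                if name_b != name_a and wwn in members_b:
                    result[name_a] = "software"
                    result[name_b] = "software"
    return result
```

Running such a check before activating a zoning configuration is one way to confirm that a single identification method is in use across the SAN.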
As a security best practice, organizations should use single-initiator zones. That means that each zone should have only one host, although it can have multiple storage nodes. Single-HBA zoning improves security, helps contain RSCNs, and makes the SAN much easier to manage and troubleshoot. An extension of this best practice for mixed disk and tape traffic on the same HBA is to utilize two zones for each HBA: one for disk nodes and one for tape nodes. This approach isolates the disk and tape devices, even though they continue to communicate through the same HBA. Another best practice is to activate Default Zoning. By default, if no zones are defined or the current zoning configuration is disabled, all devices can see each other in the SAN, which can create a variety of problems. First, the SAN is more vulnerable from a security perspective. Second, HBA drivers can have difficulty discovering an entire SAN. The Default Zoning feature ensures that devices not already assigned to an active zone will be assigned to the Default Zone and will not be seen by other devices when an administrator disables a zoning configuration.
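The single-initiator rule, together with the split between disk and tape zones, lends itself to generating zones mechanically rather than by hand. The sketch below is one possible way to do that; the naming scheme and the simplification that every host is zoned to every listed storage port are assumptions made for the example.

```python
from typing import Dict, List


def build_single_initiator_zones(
    hosts: Dict[str, str], disk_ports: List[str], tape_ports: List[str]
) -> Dict[str, List[str]]:
    """Create one disk zone and one tape zone per host HBA.

    hosts:      host name -> HBA port WWN (the single initiator of each zone).
    disk_ports: WWNs of disk storage ports.
    tape_ports: WWNs of tape device ports.
    """
    zones: Dict[str, List[str]] = {}
    for name, hba_wwn in hosts.items():
        if disk_ports:
            zones[f"{name}_disk"] = [hba_wwn, *disk_ports]
        if tape_ports:
            zones[f"{name}_tape"] = [hba_wwn, *tape_ports]
    return zones
```

Each generated zone contains exactly one host WWN, so no zone lets two hosts communicate with each other, and disk and tape targets never share a zone.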
RSCN is required for a SAN to properly function, but RSCNs can be potentially disruptive if not managed properly by the SAN switch. Brocade switches forward RSCNs only to zones with devices affected by the addition or removal of a device. Also, Brocade switches forward only one RSCN if identical RSCNs occur within a half-second window, an approach that limits the impact of a device sending hundreds or thousands of RSCNs per second. Furthermore, organizations can entirely suppress RSCNs on specific ports. Some applications, particularly in the video imaging and multimedia industries and tape backups, actually require this capability.
Finally, it is possible for a switch to obtain a new domain ID after a reboot, particularly when a switch is added to a new fabric or after a massive power failure. To prevent this from occurring, it is a best practice to assign a domain ID to a switch using an insistent domain ID (IDID). An IDID will survive reboots or power failures and will never change. Table 9 explains the domain ID behavior in insistent and non-insistent domain IDs.
Table 9. Domain ID behavior
DID assigned?          Non-insistent domain ID    Insistent domain ID
DID not in use         DID is assigned            DID is assigned
DID already assigned   New DID is assigned        Switch won't join fabric
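The half-second coalescing of identical RSCNs described above can be modeled with a small filter. This is an illustrative sketch, not switch firmware: the class name, the use of a string key to decide whether two RSCNs are "identical", and the choice to start the window at the last forwarded notification are all assumptions.

```python
from typing import Dict


class RscnCoalescer:
    """Forward at most one identical RSCN per time window."""

    def __init__(self, window_seconds: float = 0.5) -> None:
        self.window = window_seconds
        # RSCN key -> time at which that RSCN was last forwarded.
        self._last_forwarded: Dict[str, float] = {}

    def should_forward(self, rscn_key: str, now: float) -> bool:
        """Return True if this RSCN should be passed on to the zone."""
        last = self._last_forwarded.get(rscn_key)
        if last is not None and now - last < self.window:
            return False  # duplicate inside the window: drop it
        self._last_forwarded[rscn_key] = now
        return True
```

A device that emits thousands of identical RSCNs per second would, under this model, cause only about two forwarded notifications per second.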
Virtual Fabrics and Administrative Domains. Virtual Fabrics (VF) allows a physical switch to be partitioned into multiple Logical Switches, each with its own unique fabric ID (FID). Logical Switches can be connected to physical or Logical Switches, similar to a physical switch using ISLs, to form Logical Fabrics. This feature is very useful for multi-tenant environments and for environments that can benefit from a logical separation of data and management on a common physical fabric. The Administrative Domain (AD) feature was introduced in Brocade FOS 5.2.0 and provides another method of partitioning a fabric into separately managed domains. An AD is a logical grouping of devices that can be managed separately either by the same or different system administrators. An administrator may have different privileges in one AD than in another AD. For example, a SAN administrator may have a read-only user role in AD 12 and an Admin role in AD 14.
You can use either the AD feature or the VF feature, but not both at the same time. VFs deliver more complete partitioning since this feature applies to the data, control, and management paths of the fabric, and AD applies only to the management path.
Traffic Isolation. Separation of duties can also be applied to some extent to the type of traffic so that one type of traffic does not affect another type. This is implemented using a feature such as Brocade Traffic Isolation (TI). Figure 37 illustrates how TI works. In this example, data is replicated and backed up between Site A and Site B. Since tape backup is very I/O intensive and data replication is exceedingly important, it is a good idea to use TI to handle the data replication traffic on one ISL and the tape backup traffic on another ISL. Without TI, the data replication process would be competing for bandwidth with highly I/O-intensive tape backup, a situation that could result in severely degraded performance.
Figure 37 (diagram labels: Site A, Site B, servers, tape library, data replication traffic)
Password policies can be defined to enforce basic rules on how passwords are created and managed. Passwords should be strong in the sense that they are difficult to guess. Common words and common number sequences do not make good passwords; a random combination of at least eight numbers and alphabetic characters is usually the norm. Passwords should not be reused on a regular basis and this can be enforced using a password history feature. Accounts should also be locked out after several (usually three) unsuccessful login attempts. Finally, passwords should be forced to expire after a period of time, even though this is always a delicate policy. As time goes by, a password has a higher probability of being discovered and compromised; therefore, it is important to change passwords on a regular basis. How often the password should be changed depends on the environment. If the password is changed too often, then it becomes more frustrating for users, who have to remember the password and not confuse it with previous passwords. Many users in that situation simply resort to writing their password down and keeping it somewhere near the computer. Since most system administrators are responsible for more than one system, a unique administrator account must be created on each managed device. The same goes for passwords, which must also be changed on each device the administrator is responsible for. To simplify this process, a centralized authentication method such as RADIUS or LDAP is recommended. These methods allow a SAN administrator to change a password in one location and have the change apply to all systems for which they are responsible.
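Three of the rules above (strength, history, and lockout) can be expressed compactly in code. The following is a minimal sketch under stated assumptions: the history depth of five, the PBKDF2 parameters, and all names are choices made for the example rather than settings taken from any product, and password expiry is left out for brevity.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass, field
from typing import List

MIN_LENGTH = 8          # "at least eight numbers and alphabetic characters"
HISTORY_DEPTH = 5       # assumed number of remembered passwords
MAX_FAILED_LOGINS = 3   # "usually three" unsuccessful attempts


def is_strong(password: str) -> bool:
    """Require minimum length plus at least one letter and one digit."""
    return (
        len(password) >= MIN_LENGTH
        and any(ch.isalpha() for ch in password)
        and any(ch.isdigit() for ch in password)
    )


@dataclass
class Account:
    salt: bytes = field(default_factory=lambda: secrets.token_bytes(16))
    history: List[bytes] = field(default_factory=list)  # newest digest last
    failed_logins: int = 0

    def _digest(self, password: str) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", password.encode(), self.salt, 100_000)

    @property
    def locked(self) -> bool:
        return self.failed_logins >= MAX_FAILED_LOGINS

    def set_password(self, password: str) -> bool:
        """Reject weak passwords and passwords found in the history."""
        digest = self._digest(password)
        if not is_strong(password) or digest in self.history:
            return False
        self.history.append(digest)
        del self.history[:-HISTORY_DEPTH]  # keep only the newest entries
        self.failed_logins = 0
        return True

    def login(self, password: str) -> bool:
        """Count failures and refuse all logins once the account is locked."""
        if self.locked or not self.history:
            return False
        if hmac.compare_digest(self._digest(password), self.history[-1]):
            self.failed_logins = 0
            return True
        self.failed_logins += 1
        return False
```

Once an account is locked in this model, even the correct password is refused until an administrator resets it, for instance by setting a new password.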
SAN Availability
SAN availability is an important consideration when designing a SAN security program to protect against a targeted denial-of-service attack, natural disaster, hardware failure, or human error. The key to maintaining high availability is to eliminate or reduce the number of single points of failure (SPOF). SPOFs can be found throughout the FC fabric, including hardware devices, paths between devices, and data centers.
Switch hardware availability. The hardware itself may have redundant components such as power supplies and fan modules. Some of these components may be hot swappable to allow replacement of field replaceable units (FRUs) in the field without bringing down the switch. Another solution for hardware redundancy is to use enterprise-class directors instead of switches. Directors offer greater hardware redundancy and overall robustness for maximum production uptime. The Brocade 48000 Director or Brocade DCX or DCX-4S Backbone, for example, can offer six nines (99.9999%) availability or better. One of the simplest and best ways to eliminate a hardware SPOF is through the use of redundant dual-fabric architectures. In a dual-fabric design any single hardware component could fail without undue impact on the production environment (see Dual Fabrics on page 35). All hardware is duplicated in this architecture and there are two or more paths between any host and its associated storage. Of course, a dual-fabric architecture applies only to disk-based SANs, since backup applications cannot handle dual-attached tape devices. One common availability error observed in many data centers using dual-fabric architectures is to co-locate both fabric A and fabric B in the same physical rack or cabinet in the computer room. Often this is the result of a procurement issue when the switches are initially purchased along with one single rack. Once a fabric has been racked and installed, it will most likely not be reconfigured, because separating the fabrics means moving switches into a new or repurposed rack or cabinet, creating and reconnecting new cables, and so on. The next technology refresh or major maintenance window could provide a superb opportunity to separate the fabrics.
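To put an availability figure such as six nines into perspective, it helps to convert it into expected downtime per year. The short calculation below does that, and also shows the textbook formula for two redundant components under the simplifying assumption that they fail independently (an assumption that co-locating both fabrics in one rack would undermine). The function names are made up for this example.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Expected yearly downtime for a given availability percentage."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


def redundant_availability(single: float, copies: int = 2) -> float:
    """Availability (0..1) of `copies` independent redundant components."""
    return 1 - (1 - single) ** copies


# Six nines corresponds to roughly 31.5 seconds of downtime per year.
six_nines = downtime_seconds_per_year(99.9999)

# Two independent fabrics at 99.9% each reach about 99.9999% combined.
dual_fabric = redundant_availability(0.999)
```

The second result illustrates why a dual-fabric design is so effective: duplicating a component with modest availability multiplies the failure probabilities together.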
Data path availability. Redundant data paths between the host and storage devices are part of a dual-fabric architecture. Dual-attached hosts using Multipath I/O (MPIO) software can load balance traffic between the two paths or they can fail over to one single path in the event of the failure of one path. Data path redundancy can also be built into a fabric, using resilient fabrics or other architectures that provide path redundancy as discussed in Chapter 3: SAN Basics for Security Professionals starting on page 19. Some SAN designers simply use dual ISLs for redundancy between the switches instead of single ISLs.
Data center availability. The data center itself can be an SPOF in the event of a natural or man-made disaster such as an earthquake, fire, or massive local power failure. This problem is addressed with multiple data centers maintaining replicated copies of data between them. Fabrics in one data center can be mirrored in a second data center to create a hot site, which can be used to fail over all activity from the primary data center to the secondary. Exchanging data between the data centers can be done using dark fiber (depending on the distance and cost) or using the FCIP protocol over a public or private WAN.
more secure location, which can be done on most systems and certainly on Brocade switches. The notion is that the attacker would have to know about and compromise the server to which the logs are copied to destroy the syslogs. Different FC equipment vendors offer various levels and types of logging, but they are often not enabled by default. For example, on Brocade switches, the following logging features are not enabled by default: event auditing, track changes, and the Fabric Watch security class.
To obtain more detailed logging, these logging features should be enabled. An often neglected but important detail with log files is the time stamp. Switches and other FC devices in a SAN run their own internal time clocks. Without any means of synchronization, the clocks on each device will drift apart, making it virtually impossible to correlate an event in the log file of one device with the log file of another device. This problem can be resolved simply by using the Network Time Protocol (NTP) to synchronize the time on each device. Simply identify an NTP server, either external or internal, and each device can synchronize its clock to the NTP server.
Monitoring. When a security breach occurs, it is imperative to detect it as soon as possible to allow for a quick response to prevent or minimize damage caused by the attack. All FC switch vendors provide a GUI to manage their switches. The GUI is usually the primary management tool to monitor the status of the SAN in real time. Unfortunately, a critical event may not be observed immediately unless a SAN administrator is posted in front of the GUI at all times. To automate monitoring, other tools can provide automated alerts in the form of e-mail notifications or pages. Often, SNMP (Simple Network Management Protocol) is used to send traps to a management framework. Since the SNMPv1 protocol has known vulnerabilities, as a best practice use SNMPv3. With Brocade switches, the Fabric Watch feature can be used to monitor specific fabric and switch events and generate an SNMP trap or send an e-mail to alert administrators of the event. Specifically, Fabric Watch has a Security class to monitor specific security events such as unsuccessful login attempts and device access control policy violations.
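Once device clocks are synchronized, correlating events across devices reduces to merging their logs by time stamp. The sketch below shows one way to do that with Python's standard library. The log line format (an ISO 8601 time stamp, a space, then the message) and the device names are assumptions for illustration; actual switch logs use their own formats.

```python
import heapq
from datetime import datetime
from typing import Dict, List, Tuple

Event = Tuple[datetime, str, str]  # (time stamp, device name, message)


def parse_line(device: str, line: str) -> Event:
    """Split '<ISO time stamp> <message>' into a sortable event tuple."""
    stamp, message = line.split(" ", 1)
    return datetime.fromisoformat(stamp), device, message


def merge_logs(logs: Dict[str, List[str]]) -> List[Event]:
    """Merge per-device logs into one timeline.

    Each device's log is assumed to be in chronological order already,
    and all clocks are assumed to be synchronized (for example via NTP).
    """
    streams = [
        [parse_line(device, line) for line in lines]
        for device, lines in logs.items()
    ]
    return list(heapq.merge(*streams))
```

Without synchronized clocks the merged order would be meaningless, which is why the time stamp detail matters so much for incident analysis.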
is obviously easier to manage and control one entry point than it is to control multiple entry points, which is the reason most enterprise computer rooms only have one access door. Fabric management can be performed using any switch in a fabric, which means that multiple management points are available. Using a fabric configuration server (FCS) policy, administrators can designate one switch as the only management point and can also assign alternative switches as backup management switches in the event the primary management switch fails.
Encrypting Data-in-Flight
Encrypting data-in-flight uses a different encryption method than encrypting data-at-rest. Data-at-rest on a disk is block based, and when it is written back to disk the encrypted data must be exactly the same size as the cleartext data before encryption. With data-in-flight, the data is streaming over a cable in a serial fashion and needs to be encrypted on the fly as it moves across the cable. The concepts of stream and block ciphers were discussed in Cryptographic Algorithms on page 78. Data-in-flight encryption can occur at three different points in the SAN: between the host and the fabric, between a switch and another switch, and between the fabric and a storage device.
Host-to-Fabric Encryption
Protecting the confidentiality of the data exchanged between the host server or HBA and the fabric is accomplished by encrypting the data-in-flight and can be implemented in several ways. Software-based encryption applications can be installed on the server, but as with any software-based encryption solution there will be a negative performance impact of 30 to 50%. This may be acceptable for some applications and environments, while others may not tolerate any performance degradation. Hardware implementation is the only implementation that does not impact performance. For example, HBA solutions with built-in encryption capabilities include the Brocade 8 Gbps HBA. Implementing HBA encryption is relatively inexpensive for small environments, but the cost increases rapidly as the number of hosts in the fabric increases.
Switch-to-Switch Encryption
The FC infrastructure is a highly intelligent and reliable transport network that moves frames between servers and storage devices. All of the data in a SAN environment moves through this infrastructure and is usually transmitted in cleartext. As was shown in SAN Security Myth Number 3 on page 11, even data moving through a fiber optic cable can be sniffed without splicing the cable or breaching the protective jacket. Data can also be moved across a public network using the FCIP protocol, which uses a TCP/IP tunnel to move an FC frame. FCIP is particularly vulnerable since it uses the TCP/IP protocol with all of its associated vulnerabilities.
The primary confidentiality issue with switch-to-switch communications is not over the ISLs used to connect switches in a data center, but between switches that connect two data centers over distance. A dark fiber strand that is owned or leased by the organization is used to connect two data centers.
Fabric-to-Storage Encryption
There are no data-in-flight encryption solutions available today to encrypt the data between a fabric and a storage device other than a specialized encryption appliance, which can be installed in the data path. However, a data-at-rest solution essentially serves the same purpose by encrypting the data prior to sending it from the fabric to the storage device, to maintain data confidentiality.
Encrypting Data-at-Rest
Data-at-rest includes tape and disk media, which require different encryption methodologies. Disks are block-based devices and tapes are streaming devices, which usually require different modes of operation to perform the encryption. Encryption of data-at-rest can be performed in several places in the SAN, as shown in Figure 38: the application, an appliance, the fabric or network itself, the host, the disk, and the tape.
Application-Based Encryption
There are several schools of thought as to where encryption should take place. Some applications actually require the data to be encrypted at the application level to prevent unauthorized users from viewing certain types of data. For instance, it is possible to encrypt an entire column containing sensitive information in a database using encryption software. Of course, any software-based implementation results in a significant loss of performance and may not be the ideal place to perform the encryption if performance is critical. Some backup applications offer an encryption module to encrypt the data to the backup media. The encryption process is built into the backup application software, but this method utilizes processing cycles on the backup server, resulting in a negative performance impact that increases the backup window. There are also specialized applications designed to encrypt data-at-rest to disk or tape media. Several vendors such as RSA and PGP offer such solutions. Again, the main issue with software-based solutions is performance degradation and the impact on production performance.
Appliance-Based Encryption
There are several vendors of appliance-based encryption solutions, such as NetApp through the acquisition of Decru and nCipher through the acquisition of Neoscale. These encryption appliances do not become a part of the fabric and must be inserted in the data path to encrypt data. Usually, but not always, the process of inserting the appliance into the data path causes a disruption of the production environment.
Fabric-Based Encryption
Fabric-based encryption is accomplished using switches with encryption and compression capabilities. These switches can be added to an existing fabric using standard ISLs and take up a domain ID as would any other switch. One of the main advantages of fabric-based encryption solutions over appliance-based solutions is the ability to redirect or reroute frames from anywhere in the fabric through the encryption switch. Brocade switches, for example, use a technology known as frame or nameserver redirection, which was introduced in Fabric OS 5.3. It enables a transparent integration of the encryption solution into an existing fabric. Data can be written from servers to storage devices anywhere in the fabric without requiring direct insertion of the switch into the data path. Another significant advantage of fabric-based encryption is the ability to encrypt data in a heterogeneous environment. Some solutions, such as the Brocade Encryption Solution, encrypt data directed to both tape and disk devices and also work with a variety of third-party vendor appliances. This provides organizations with greater flexibility and independence from the storage vendors.
Encrypting Data-at-Rest
Host-Based Encryption
Host-based encryption can be implemented using software installed on the host or an HBA capable of performing encryption. The greatest issue with host-based software encryption is the performance degradation resulting from CPU utilization of the encryption application. Host-based software encryption solutions are usually part of the operating system or the file system. There has been considerable discussion and interest recently around the concept of encrypting data within the HBA hardware. This concept appears to have some interesting cost benefits for smaller environments but has several implementation challenges from a technical perspective.
less hardware chips that can be placed on the card. An enterprise-class encryption solution requires many features and functions that simply would not fit into an HBA. Key management is a critical component of data-at-rest encryption, and the HBA must be able to authenticate with and securely exchange keys with the key vault. Any keys stored locally would have to be protected by a crypto boundary to be compliant with the FIPS 140-2 standard. To reach FIPS 140-2 level 3, tamper detection and response mechanisms must be implemented. One major requirement for encrypting data-at-rest on disk is converting existing cleartext data to ciphertext while it remains in place on the existing LUN. This requires the ability to read a data block from the disk, encrypt it, and then write it back to its original location. This process is extremely processing-intensive; the required processing capability would be difficult to fit onto an HBA and would otherwise consume excessive CPU cycles on the host. Furthermore, the process is extremely sensitive: the data is being modified, and the potential for corruption may be high in the event of any type of failure or if the LUN is accessed by multiple hosts. Control mechanisms must be implemented to ensure that data is never corrupted during the first-time encryption (FTE). One solution might be to implement FTE by copying data to a different LUN instead of converting it in place. However, this requires an equivalent amount of disk space to perform the FTE, and it does not address the HA concerns when multiple hosts access the same LUN. With HBAs, data encryption is also distributed across multiple servers that are probably connected to the SAN using a dual redundant path configuration. With this architecture, clustering the HBAs to share the keys presents another challenge. In a dual-path configuration, the keys in each path must be shared to make sure the encryption is performed correctly at the LUN.
Another issue occurs with tape applications. Each tape backup application writes metadata to the tape media, and each vendor uses a different format to accomplish this. The encryption solution must be able to write an index to the key as part of the metadata on the tape media and should support several commonly used backup applications, which requires more code. In addition, the HBA would require a compression engine, because the tape drive cannot compress encrypted data.
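The point about compression can be illustrated with a minimal Python sketch. It assumes that well-encrypted data is statistically indistinguishable from random bytes, so `os.urandom` is used as a stand-in for real ciphertext; the sample record and the `compression_ratio` helper are hypothetical and chosen only for illustration.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Repetitive cleartext, loosely resembling a backup stream of similar records.
cleartext = b"customer_record;status=active;region=east;" * 2000

# Stand-in for ciphertext (assumption): random bytes of the same length,
# used here instead of a real cipher to keep the sketch dependency-free.
ciphertext_like = os.urandom(len(cleartext))

clear_ratio = compression_ratio(cleartext)
cipher_ratio = compression_ratio(ciphertext_like)
print(f"cleartext compresses to {clear_ratio:.1%} of its original size")
print(f"ciphertext-like data compresses to {cipher_ratio:.1%} of its original size")
```

Because the random-looking data does not shrink, compression has to happen before encryption, which is why the compression engine would need to sit on the HBA rather than in the tape drive.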
Encrypting data at the HBA would also prevent de-duplication applications from working. De-duplication takes advantage of duplicated information and retains only one copy, with pointers to the other copies. Once data has been randomized through the encryption process, duplicated data no longer exists, only randomized data. Having said all this, an encryption solution at the HBA level could appear to offer some cost benefits, particularly for smaller environments, since an HBA is less expensive than a specialized encryption switch. Conversely, as the number of encryption HBAs increases, the cost increases proportionally, whereas the cost of a fabric-based solution does not, since the switch can handle large numbers of hosts. Evidently, performing data-at-rest encryption at the HBA level, or anywhere else for that matter, is not a simple task. Although an HBA encryption solution for data-at-rest sounds promising, it is not a silver bullet, and there are many practical aspects that must be considered. Several technical challenges must be surmounted before this feature can be implemented safely in a production environment.
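The effect of encryption on de-duplication can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a model of any product: `toy_encrypt` is a hypothetical stand-in for a cipher that mixes the block position into the keystream (it is not real cryptography), and the 4 KB block size and key are arbitrary.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical de-duplication block size


def unique_blocks(data: bytes) -> int:
    """Count distinct fixed-size blocks, as a block-level de-duplicator would."""
    digests = set()
    for offset in range(0, len(data), BLOCK_SIZE):
        digests.add(hashlib.sha256(data[offset:offset + BLOCK_SIZE]).digest())
    return len(digests)


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for a position-dependent cipher; NOT real cryptography.

    Each block is XORed with a keystream derived from the key and the
    block offset, so identical cleartext blocks stored at different
    positions produce different ciphertext blocks.
    """
    out = bytearray()
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        stream = hashlib.shake_256(key + offset.to_bytes(8, "big")).digest(len(block))
        out.extend(a ^ b for a, b in zip(block, stream))
    return bytes(out)


# 100 identical blocks: an ideal case for de-duplication.
cleartext = b"A" * BLOCK_SIZE * 100
ciphertext = toy_encrypt(cleartext, key=b"hypothetical-key")

print("distinct cleartext blocks: ", unique_blocks(cleartext))
print("distinct ciphertext blocks:", unique_blocks(ciphertext))
```

In this sketch the cleartext collapses to a single stored block, while every ciphertext block is distinct, so a de-duplication engine placed after the encryption point has nothing left to remove.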
Storage-Based Encryption
Tape-based hardware encryption solutions have the advantage of being implemented in hardware and operating at wire speed with no observable performance degradation during a backup operation. On the other hand, these solutions require new specialized tape drives with built-in encryption capabilities (such as LTO-4). Although this solution addresses the tape encryption problem quite effectively, it does not address disk encryption. Many organizations start a data-at-rest encryption project exclusively to address a tape encryption problem. However, even without an internal policy, it is highly likely that regulations or legislation will eventually force the encryption of both disk and tape media. In that case, a second solution will be required for disk encryption, one that uses different encryption hardware and a different key management solution. Disk-based hardware encryption solutions are available from vendors such as Seagate, with built-in encryption capabilities on the disk drive. Other vendors are working on similar solutions, but they were not generally available at the time of writing. Although disk-based encryption addresses disk encryption effectively, it does not address tape encryption.
Physical Security
Physical security is a vast subject, and this book cannot discuss it in depth, but some best practices that apply to the SAN environment are highlighted here. Most organizations assessed by the writer have adequate physical access controls to the computer room and the SAN equipment, so this aspect of physical security is not addressed in this section. Note that this aspect of security is usually handled by a different group than the storage or security administrators. As mentioned previously, one of the most frequently observed oversights in data centers is physically installing the switches of a dual-fabric configuration in the same rack or cabinet. One customer's data center was located on the floor underneath the cafeteria and, as you might expect, water from the cafeteria leaked into the computer room one day. Fortunately, this leak did not damage SAN equipment, but if it had, the entire SAN would have gone down, including all of the application servers and storage connected to it, representing most of their mission-critical applications! Most of an organization's critical applications reside in the SAN, and loss of the SAN can be disastrous. Simply installing fabric A equipment in one rack and fabric B equipment in another rack would have solved this problem. To prevent this situation, even if the equipment for both fabrics is ordered together with a single rack, do not install both fabrics in that rack; order another rack or redeploy an existing one. If you are already facing this situation, the next maintenance window or technology refresh could be used as an opportunity to separate the fabrics properly. In shared environments particularly, it is a good practice to lock racks or cabinets containing switches and SAN equipment. In some shared environments with isolated SANs, the entire SAN can be enclosed inside a locked wire cage to prevent unauthorized access.
The final aspect of physical security considered during a SAN security assessment is the set of environmental and utility factors. Power feeds and circuits should be redundant, with one switch power supply connected to one circuit and the other power supply connected to a different circuit. The equipment itself should be protected with an uninterruptible power supply (UPS) system whose batteries are tested and replaced on a regular basis. To protect against a loss of availability resulting from a power failure, data centers should also use power generators, exercised on a regular basis to be certain that they will function properly when a power failure
occurs. Contracts should be in place with a service level agreement (SLA) to guarantee a specified response time from identified diesel fuel providers in the event of a power failure. A massive power grid failure similar to the one experienced in the northeastern US and Canada on August 14, 2003, could result in hundreds or thousands of data centers within a large area scrambling for available diesel fuel to refuel their power generators. The equipment in a computer room not only consumes enormous quantities of power but also generates so much heat that a complete failure of the cooling system could result in a shutdown of the entire computer room within a few hours. It is important to ensure there is proper cooling in all areas of the computer room and to eliminate any hot spots in an aisle or specific area.
operating the production SAN. It is not necessary to document every single procedure performed in the SAN, but at least document the critical tasks and those that are used frequently. Switch configuration files should be backed up frequently depending on how often changes are made to the production environment. The same applies to syslog and other log files. They should all be backed up automatically to a secure server with restricted access.
Training should be aimed not only at the storage administrator but also at the security administrator and IT management. Security awareness helps prevent security breaches by sensitizing the staff to security issues and attack methods used by hackers. One of the most frequent attacks used by hackers to obtain passwords and other sensitive information is social engineering. A hacker calls a corporate user, impersonating an official company help desk or support person, and requests the user's password or other sensitive information. Users should NEVER divulge their password to anyone, even a real company help desk person.
The last plan of concern is the computer security incident response (CSIR) plan. An incident response plan outlines in detail what needs to be done in the event of a security breach. It usually involves the creation of a CSIR team (CSIRT), which is mobilized when an incident occurs. The CSIRT is usually composed of employees from various groups across an organization, for whom this is not their primary role. Technical people are required to address the technical aspects of the response, but there is also a need for Human Resources specialists to deal with HR issues (such as dismissing an employee) or public relations issues resulting from the incident in order to prevent further exposure. Management should be represented to make rapid high-level decisions to minimize the impact of an incident and enable a proper response if unexpected costs are involved. It is not necessary to have a CSIR plan or a team specifically for the SAN environment, but members of the SAN management team should certainly be involved in building the CSIR plan and participating in the CSIRT.
An assessment, on the other hand, is not formal, and its scope is not restricted to boundaries established by a policy or industry standard. An assessment is complementary to an audit. Some organizations with internal security policies perform yearly audits, but they also perform a comprehensive assessment to validate, expand, and update the existing security policy. Usually, organizations without a SAN security policy in place, or those that want to integrate the SAN environment into the existing IT security policy, have an assessment performed by a third-party vendor specialized in SAN security. Brocade, for example, offers full SAN security assessments to its customers to help them understand the state of security of their SAN and how to better secure it. For details, see the Professional Services section on the Brocade corporate Web site: http://www.brocade.com/services-support/professional-services/index.page
Chapter Summary
Although there are currently no industry standards defining the requirements to secure a SAN environment, organizations such as the SNIA SSIF are working on raising awareness around SAN security and have developed storage security best practices to help organizations better understand the security issues surrounding a SAN. Brocade has been involved heavily in this area and has developed over 100 different security features to help harden an FC switch infrastructure, created several security-related white papers, and provided professional services engagements to assess, audit, and harden a SAN. Hopefully, this book will also contribute to the cause of SAN security by providing a better understanding of the security issues associated with a SAN and raising awareness of these issues.
A DMZ (demilitarized zone) is a part of the network that sits between the internal private network and the external network or Internet. The DMZ also acts as a buffer between the inside and outside networks where applications such as e-mail, FTP, and Web servers exchange information between both networks. This buffer is critical for preventing potential attackers on the outside network, or Internet, from communicating directly with any of the internal systems. A SAN is a separate network from the LAN, which is used to exchange information between servers and storage devices such as disk arrays and tape devices. SANs are currently implemented in the data center using two protocols: Fibre Channel and iSCSI. This chapter focuses on the FC protocol since it is by far the most widely deployed. From a security perspective, there are clearly concerns with connecting servers located in a DMZ, which are accessible from the Internet and whose storage is connected via a SAN. The greatest fear is that a SAN-attached server in a DMZ will be compromised and somehow used as a stepping stone to gain access to the SAN itself. The question then becomes whether SAN-attached devices in the DMZ can be secured adequately. Certainly, there are risks involved in having a SAN in a DMZ, but with proper design and configuration it can be implemented with a high degree of safety. Note that vulnerable SAN components must be properly secured before attempting this. You now know that security is not only about preventing criminal activities originating from outside the bounds of the data center. In this situation, security measures must also be put into place to prevent unauthorized internal breaches and to prevent the propagation of human error beyond a fixed scope.
A server connected to a SAN can potentially see all of the storage devices and servers connected to the SAN unless proper measures are taken. This chapter explores several techniques for securely configuring a SAN with a DMZ. It is meant to be a standalone chapter, and experienced SAN or security administrators could read only this chapter to find out what they need to address DMZ-related and SAN security concerns. If you are an avid reader and intend to read the entire book, you will notice several redundant sections in this chapter; just skip over them and continue.
passwords. This includes enforcing a minimum of eight characters, requiring a combination of alphabetic, numeric, and special characters, and preventing the use of repeating characters and sequences. The password policy should also set a password expiration time and disable accounts after a number of unsuccessful login attempts. To simplify password management, a single place to administer user names and passwords for all users and devices is useful in large environments. RADIUS (Remote Authentication Dial In User Service) and LDAP (Lightweight Directory Access Protocol) are tools that provide a simple, centralized method to enable and disable user accounts and change passwords for all switches in a SAN from one application.
an authorized host. While these port control methods add management steps to the configuration of a DMZ switch, they significantly increase the security of the switch and impose very clear change control so the DMZ SAN does not have unintended topology changes.
Zoning
Another technique, zoning, is implemented within an FC fabric. Zoning allows devices such as servers, disks, and tape drives to be grouped together and isolated from other devices. Devices can communicate only if they are in the same zone. However, a device can be in multiple zones to maximize configuration flexibility. All SAN switches are capable of hardware-enforced zoning, in which an ASIC allows or disallows device communication as defined by the zoning configuration. Hardware enforcement is always used on Brocade switches if all members in a zone configuration are identified the same way, either all by D,P (port zoning) or all by pWWN (WWN zoning). Mixing identification methods in a zone configuration can cause some of the zone enforcement to fall back to the less secure Name Server enforcement. Brocade recommends using pWWN identification exclusively when zoning to ensure that all zones are hardware enforced and to enable some advanced Brocade features such as Fibre Channel Routing (see Zoning on page 29).
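The basic zoning rule, that two devices can communicate only if they share at least one zone, can be modeled with a short Python sketch. The zone names and pWWNs below are hypothetical, and the sketch models only the membership logic, not how a switch ASIC enforces it.

```python
# Hypothetical device pWWNs used only for illustration.
WEB_HBA = "10:00:00:00:c9:aa:aa:01"
BACKUP_HBA = "10:00:00:00:c9:aa:aa:02"
DISK_PORT = "50:06:01:60:bb:bb:00:01"
TAPE_PORT = "50:06:01:60:bb:bb:00:02"

# Zone configuration: zone name -> set of member pWWNs.
# DISK_PORT belongs to two zones, which the zoning model allows.
ZONES = {
    "zone_web_disk": {WEB_HBA, DISK_PORT},
    "zone_backup": {BACKUP_HBA, DISK_PORT, TAPE_PORT},
}


def can_communicate(wwn_a: str, wwn_b: str, zones: dict) -> bool:
    """Two devices may communicate only if at least one zone contains both."""
    return any(wwn_a in members and wwn_b in members for members in zones.values())


print(can_communicate(WEB_HBA, DISK_PORT, ZONES))   # share zone_web_disk
print(can_communicate(WEB_HBA, TAPE_PORT, ZONES))   # no common zone
print(can_communicate(WEB_HBA, BACKUP_HBA, ZONES))  # no common zone
```

Note that sharing a device (here the disk port) across two zones does not let the other members of those zones reach each other.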
LUN Masking
LUN masking can be implemented at the HBA (Host Bus Adapter) or at the disk controller. It assigns a specific LUN to a specific pWWN in the SAN. No other server can see or access that LUN unless multiple LUN masking mappings are configured, typically on the storage subsystem. LUN masking is less effective when it is configured only on the server, because the masking can be disabled if the server is compromised. A server breach is more likely than a storage subsystem breach.
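As a minimal sketch of the idea, LUN masking on the storage subsystem can be thought of as a table that maps each LUN to the initiator pWWNs allowed to see it. The table contents and function names below are hypothetical and do not represent any vendor's interface.

```python
# Hypothetical masking table held on the storage subsystem:
# LUN number -> set of initiator pWWNs permitted to access it.
WEB_HBA = "10:00:00:00:c9:aa:aa:01"
DB_HBA = "10:00:00:00:c9:aa:aa:02"

LUN_MASKS = {
    0: {WEB_HBA},
    1: {WEB_HBA, DB_HBA},  # multiple mappings configured for a shared LUN
    2: {DB_HBA},
}


def can_access(initiator_pwwn: str, lun: int, masks: dict) -> bool:
    """A LUN is accessible only to initiators listed in its mask."""
    return initiator_pwwn in masks.get(lun, set())


def visible_luns(initiator_pwwn: str, masks: dict) -> list:
    """Return the LUNs the storage subsystem would present to this initiator."""
    return sorted(lun for lun in masks if can_access(initiator_pwwn, lun, masks))


print(visible_luns(WEB_HBA, LUN_MASKS))
print(visible_luns(DB_HBA, LUN_MASKS))
print(visible_luns("10:00:00:00:c9:ff:ff:ff", LUN_MASKS))  # unknown initiator
```

Keeping this table on the storage subsystem rather than on the server means that a compromised server cannot edit it.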
Administrative Domains
The last technique, Administrative Domains (ADs), logically groups switches, switch ports, and device pWWNs (in a physical fabric) that should be managed separately from other components in the fabric. Whereas zoning logically groups devices that can communicate with each other, ADs group devices to create a Logical Fabric that can be managed independently of how the devices are physically connected to the fabric.
For example, a SAN administrator may want to create a Logical Fabric to allow a sensitive project in a shared SAN environment to be managed separately from the rest of the production environment and to further isolate and protect it. Privileges can be assigned to a SAN administrator to manage the special project environment and different privileges can be assigned to the same administrator to manage the shared production environment. The benefit of this approach is that changes in the special project environment will not cause any disruption on the shared production environment and vice versa.
Authentication of Servers
To further enhance security, strong authentication mechanisms should be employed to authenticate servers joining a fabric. The ANSI T11 Technical Committee for FC has a standard that defines the use of an authentication protocol to authenticate end devices to switches. This protocol, DH-CHAP, uses a shared secret to ensure that the pWWN of the HBA joining the fabric has not been spoofed and is in fact genuine. It is possible to change the pWWN on an HBA using tools from HBA manufacturers, so someone could configure the HBA on a server to have the same pWWN as another server on the SAN. Use DH-CHAP and port ACLs to prevent spoofing of a server HBA pWWN.
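The reason a shared secret defeats pWWN spoofing can be shown with a simplified challenge-response sketch in Python. This is not DH-CHAP itself: the Diffie-Hellman exchange and the exact hash construction defined by the standard are omitted, and HMAC-SHA-256 is substituted as an assumption for illustration. The secret table, pWWN, and function names are hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical secret table on the switch: pWWN -> shared secret.
SWITCH_SECRETS = {"10:00:00:00:c9:aa:aa:01": b"example-shared-secret"}


def make_challenge() -> bytes:
    """Switch side: generate a fresh random challenge for each login."""
    return secrets.token_bytes(16)


def compute_response(secret: bytes, challenge: bytes) -> bytes:
    """Host side: prove knowledge of the secret without transmitting it."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()


def authenticate(pwwn: str, challenge: bytes, response: bytes) -> bool:
    """Switch side: accept only a response computed with the stored secret."""
    secret = SWITCH_SECRETS.get(pwwn)
    if secret is None:
        return False
    return hmac.compare_digest(compute_response(secret, challenge), response)


challenge = make_challenge()
genuine = compute_response(b"example-shared-secret", challenge)
spoofed = compute_response(b"attacker-guess", challenge)  # right pWWN, wrong secret

print(authenticate("10:00:00:00:c9:aa:aa:01", challenge, genuine))
print(authenticate("10:00:00:00:c9:aa:aa:01", challenge, spoofed))
```

An attacker who clones the pWWN but does not know the secret cannot produce a valid response, so the login is rejected even though the pWWN itself matches.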
[Figure: labels recovered from the diagram: Internet, Router, No LUN masking, Storage array, Production SAN, Server, Server, Internal Network]
[Figure: labels recovered from the diagram: Internet, Router, DH-CHAP Authentication, Web server, LUN masking, Storage array, Production SAN, Internal Network]
[Figure: labels recovered from the diagram: Internet, Router, DH-CHAP Authentication, Core router with firewall, Web server, E-mail server, Core router with firewall, DMZ SAN, Storage array, Production SAN, Server, Server, Internal Network]
Figure 41. Securing SAN-attached DMZ servers using a physically separate SAN
Chapter Summary
As with any network, SANs have security vulnerabilities, but with proper design and configuration a SAN can be extremely secure. In order to safely connect servers in a DMZ to a SAN, several security precautions must be taken. The techniques described in this chapter, in combination, can provide a high level of protection to the production SAN in the event that a server in the DMZ becomes compromised. Note that a SAN environment can be hardened to a much greater degree than described in this chapter. If you planned to read only this chapter, you might want to reconsider and read the book in its entirety.
When it comes to SAN security, each organization has its own requirements and level of tolerance to risk. Although Brocade FC switches can be secured to a very high degree, organizations usually take a more pragmatic approach by finding the right balance between managing the SAN and minimizing risk. A FOS-based fabric provides more than 100 security features. Clearly, not all features need to be implemented in the same fabric, but you have the flexibility to implement those that are necessary to achieve the exact level of protection that your organization requires. To protect FOS-based SANs, use the defense-in-depth strategy, described earlier in The Brocade SAN Security Model on page 91. This chapter covers most of the security features available in Fabric OS 6.3.0 and earlier. It is not intended to be an implementation manual, but it does cover features and their associated commands at a high level. For further details on how to implement these features, see the appropriate version of the Brocade Fabric OS Administrator's Guide and the Brocade Fabric OS Command Reference. NOTE: The FOS CLI commands in this chapter are written with internal uppercase letters to make them easier to read. However, when commands are executed, they are case insensitive and normally entered as all lowercase, as shown in code examples. Also, bold text is used for commands in text to make them stand out.
To address these new enterprise security requirements, Brocade introduced the first security features specific to the FC SAN environment in Secure Fabric OS (SFOS) in 2002 (released with FOS 2.6.0). SFOS launched the first ACLs, an FC switch authentication mechanism using PKI and digital certificates, and secure protocols to access the management interface. Although PKI can still be used, it has largely been superseded by the industry-standard DH-CHAP authentication method using shared secrets. Most of the security features previously available in SFOS have since been replaced with equivalent or more powerful and flexible functionality in the base Fabric OS (version 5.3.0 and later). Appendix A provides a comprehensive list of technical security features that can be implemented in a Brocade-based SAN environment. Brocade is continually enhancing existing features and creating new security features to help ensure that FC fabric infrastructures and the data moving through them remain secure and highly available as new security vulnerabilities are discovered or new requirements emerge. It is important to note that even though the ACLs in SFOS and the new base FOS equivalents share the same names, they are not compatible. A secure fabric running SFOS must be converted to the equivalent FOS-based security features according to the procedures detailed in the Fabric OS Administrator's Guide. With SFOS, all switches in a secured fabric were required to be in secure mode in order to join and participate in the fabric. With the standard FOS-based secure mode, fabrics can be in either strict or tolerant mode. In strict mode, all switches must participate in the fabric, as was the case with SFOS. In tolerant mode, not all switches need to participate, which is particularly useful when a fabric contains older switches that cannot be upgraded to a firmware release later than FOS 5.2.0.
The following is a list of features in Secure Fabric OS:
- FCS (Fabric Configuration Server) policy
- SCC (Switch Connection Control) policy
- DCC (Device Connection Control) policy
- MAC (Management Access Control) policy
- PKI Switch-Switch Authentication
To protect the Ethernet interface, organizations should employ reliable IP network security best practices to isolate management interfaces and ensure that they are accessible only to the appropriate staff. Typically, the Ethernet interface is connected to a dedicated LAN or a VLAN used exclusively for management purposes and is not connected to the production LAN, which provides proper isolation between the two LANs.
Most organizations display a standard welcome message or banner at system login. Although this type of login banner might not be a major deterrent, it can help minimize liability and provide legal support in the event of a security breach. It should be standard practice for any IT security strategy. SAN and security administrators have several tools at their disposal to tighten the security around these management interfaces.
Finally, several FOS commands that copy information to and from the switches use unsecure services such as FTP, which exchange data in cleartext. In some cases, the FTP service can be replaced with the SCP service, which is based on SSH. The following commands can be secured using SCP instead of FTP:
- configUpload and configDownload (since FOS 4.4.0)
- firmwareDownload (since FOS 5.3.0)
- supportSave (since FOS 5.3.0)
In order to use SCP instead of FTP for the configuration upload and download operations, the configure command must be used. To use SCP with the firmwareDownload and supportSave commands, a parameter must be entered at the command line to indicate that the SCP protocol will be used. SNMP is a commonly used protocol to monitor and manage Brocade switches. The earlier versions of this protocol, SNMPv1/2, had some security flaws that could be exploited. It is safer to use the latest version, SNMPv3, since it supports encrypted community strings along with many other capabilities. If you are using SNMPv1/2, change the default community strings used in FOS, since they are well known and can be used in an attack. Also, the security level can be changed and set to:
- No security
- Authentication only
- Authentication and privacy
Once a user logs in to a switch with telnet or SSH to use the CLI, a login banner or message can be displayed. By default, the login banner is not set and should be customized. The login banner can be enabled and created using the banner command. When you create a login banner, use these guidelines:
- Include language indicating that the user is logging in to a private network and that unauthorized users will be prosecuted.
- Include language indicating that any user accessing this interface is consenting to be monitored. This is to address privacy issues.
- Do not provide more information than necessary (organization name, type of OS, and so on).
There is legal precedent of a successful defense by hackers claiming they were not informed that they were not authorized to log in to a network, which is why this must be stated clearly in the login banner. In other cases, authorized users hacked into a network and were caught using network monitoring or procedures. They used the defense that their right to privacy had been violated since they were not informed that they could be monitored, so it is important to explicitly require consent in the login banner. As mentioned earlier, the first phase in an attack is to collect information, so do not provide unnecessary information in the login banner itself. For the most part, laws in the US have adapted to the technology, and legal provisions are already in place to address these issues. Nevertheless, it is still good practice to include explicit language to provide additional protection and to demonstrate due diligence. System and SAN administrators sometimes log in to a switch and forget to log out when they are no longer using it. Enable a telnet and Web Tools session timeout feature using the timeout command (set to 15 minutes by default). The Web Tools session timeout, available since FOS 6.2, can be set from the Web Tools interface.
Filtering IP Traffic
The concept of a firewall has existed for quite some time in the conventional LAN world but is a relatively recent feature in FC-based SANs. The IP filter (IPF) feature, introduced in FOS 5.3.0, behaves as a firewall and replaces the MAC policy found in Secure Fabric OS. Using an IPF, a TCP/IP port can be either allowed or denied, and a SAN administrator can define a specific IP address or range of IP addresses that are allowed to access a specific TCP/IP port.
There are two IP filter policy types: one for IPv4 and one for IPv6. Table 10 identifies a few well-known ports used with Brocade switches that can be controlled using IPF.
Table 10. Well-known ports and services
Service Name      Well-Known Port Number
FTP               20, 21
SSH               22
SCP (uses SSH)    22
telnet            23
HTTP              80
SNMP              161, 162
HTTPS             443
The IPF is often used to disallow the use of an unsecure service, such as telnet, when its equivalent secure version, such as SSH, must be utilized.
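The allow or deny decision made by an IP filter can be sketched in Python as an ordered rule list. This is a generic illustration, not the FOS implementation: the first-match ordering, the implicit deny when nothing matches, the `Rule` structure, and the management subnet are all assumptions of the sketch.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    source: str  # IPv4 network in CIDR form, e.g. "10.10.50.0/24"
    port: int    # destination TCP port
    action: str  # "permit" or "deny"


def evaluate(rules: list, src_ip: str, port: int) -> str:
    """Return the action of the first matching rule; deny if none match."""
    addr = ipaddress.ip_address(src_ip)
    for rule in rules:
        if rule.port == port and addr in ipaddress.ip_network(rule.source):
            return rule.action
    return "deny"


# Hypothetical policy: block telnet everywhere, and allow SSH and HTTPS
# only from an assumed management subnet.
POLICY = [
    Rule("0.0.0.0/0", 23, "deny"),
    Rule("10.10.50.0/24", 22, "permit"),
    Rule("10.10.50.0/24", 443, "permit"),
]

print(evaluate(POLICY, "10.10.50.7", 22))   # SSH from the management subnet
print(evaluate(POLICY, "10.10.50.7", 23))   # telnet is denied for everyone
print(evaluate(POLICY, "192.168.1.20", 22)) # SSH from outside the subnet
```

The same structure covers the telnet case described above: the unsecure service is denied outright, while its secure equivalent is permitted only from the addresses that need it.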
For the reasons described above, these accounts should never be used except as a last recourse when a SAN administrator's password is lost, for example, and there is no other way to gain access to the
switch. Each of these user accounts is also assigned a default password by Brocade. Some OEM partners change the default password to a different password. As a best practice, change these default passwords at first login. The password database is local to each switch. However, as of FOS 5.2.0, it is possible to manually distribute the local password database to other switches in a fabric using the distribute command (distribute -p PWD -d switch_list). If for some reason you want to exclude one or more switches from this distribution, use the fddCfg command, entered from the switch to be excluded (fddCfg --localreject PWD).
Password Policies
A password policy can be created to ensure that users create strong passwords and follow the organization's password policy. There are four Brocade password policies that can be configured:
- Password strength
- Password history
- Password expiration
- Account lockout
Implement password policies on Brocade switches using the passwdCfg --set command. Password strength refers to how difficult it is for someone else to guess or break a user's password. A hacker often tries to guess a password using real words such as a person's name, spouse's name, pet's name, and so on. Real words should never be used as part of a password since they are too easy to guess. The use of numbers, special characters, and mixed case all contribute to making a password stronger. The following lists the different Brocade password strength features:
- Lowercase. Minimum number of lowercase characters to be used (default = 0)
- Uppercase. Minimum number of uppercase characters to be used (default = 0)
- Digits. Minimum number of numeric characters to be used (default = 0)
- Punctuation. Minimum number of punctuation characters to be used (default = 0)
- MinLength. Minimum number of characters required (8 to 40 characters)
- Repeat. Maximum length of repeated character sequences that is disallowed (1 to 40 characters; default = 1)
- Sequence. Maximum length of ASCII character sequences that increase by one (1 to 40 characters; default = 1). For example, ABCDE and 45678 are both 5-character sequences increasing by one.
Example:
switch:admin> passwdcfg --set -uppercase 3 -lowercase 4 -digits 2 -minlength 9
This example sets a password strength policy that requires at least 3 uppercase letters, 4 lowercase letters, 2 digits, and an overall minimum length of 9 characters. Password history prevents users from reusing a pre-defined number of their previous passwords:
- History. Number of previous password values (including the current value) that are disallowed when creating a new password (1 to 24; default = 1).
Example:
switch:admin> passwdcfg --set -history 10
This example sets a history policy that prevents the use of any of a user's previous 10 passwords.
Password expiration, or aging, is used to control how long a password can exist. The Brocade password expiration parameters are:
MinPasswordAge. The minimum number of days that must elapse before a user can change a password (0 to 999 days; default = 0). Setting this parameter to a non-zero value discourages users from rapidly changing a password in order to circumvent the password history setting and select a recently used password.
MaxPasswordAge. The maximum number of days that can elapse before a password must be changed (0 to 999 days; default = 0)
Warning. The number of days prior to password expiration that a warning about password expiration is displayed (0 to 999 days; default = 0)
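How the three aging parameters interact can be sketched in a few lines of Python. The function name and return format are hypothetical, and two points are assumptions rather than documented behavior: a MaxPasswordAge of 0 is treated as "never expires", and a password is considered expired once its age reaches MaxPasswordAge.

```python
from datetime import date, timedelta


def password_status(last_changed, today, min_age=0, max_age=0, warning=0):
    """Evaluate the aging parameters for a password set on last_changed."""
    age = (today - last_changed).days
    expires = max_age > 0                      # assumption: 0 disables expiration
    expired = expires and age >= max_age       # assumption: expires when age reaches max_age
    days_left = max_age - age if expires else None
    return {
        "can_change": age >= min_age,          # MinPasswordAge has elapsed
        "expired": expired,
        "warn": expires and not expired and days_left <= warning,
    }


if __name__ == "__main__":
    set_on = date(2009, 4, 1)
    # Values used in the passwdcfg expiration example: 7 / 180 / 14 days
    for days in (3, 170, 180):
        print(days, password_status(set_on, set_on + timedelta(days=days), 7, 180, 14))
```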
Example:
switch:admin> passwdcfg --set -maxpasswordage 180 -warning 14 -minpasswordage 7
This example sets a password expiration policy that specifies that users cannot change a password for 7 days after they set it and must change their password after 180 days (a warning is displayed starting 14 days before the password expires).
Password lockout is used to disable an account after a series of unsuccessful login attempts, which prevents unauthorized users from entering consecutive password guesses until they guess the right one. The Brocade password lockout parameters are:
LockoutThreshold. The number of times a user can attempt to log in using an incorrect password before the account is locked out (0 to 999; default = 0). Setting the lockout threshold to 0 (zero) disables the lockout policy.
LockoutDuration. The time in minutes after which a previously locked account is automatically unlocked (0 to 99999 minutes; default = 30). Setting the lockout duration to 0 (zero) requires administrative action to unlock the account.
Example:
switch:admin> passwdcfg --set -lockoutthreshold 5 -lockoutduration 0
This example configures a password lockout policy that gives a user 5 tries to enter the correct password and specifies that once an account is locked, it can only be unlocked by an administrator.
The lockout policy can itself be exploited for a denial-of-service (DoS) attack: an attacker repeatedly guesses a user's password until the switch locks out the account. Once the account is locked, the authorized user is no longer able to access the account. The admin account is particularly vulnerable to this type of attack and thus has a special policy. The admin lockout policy can be disabled to prevent a DoS attack on that account; however, the account is then vulnerable to a brute-force guessing attack. The admin account lockout policy is enabled or disabled using the passwdCfg command (passwdCfg [--enableadminlockout] [--disableadminlockout]).
When a switch authenticates a user, by default it consults the local password database. However, the Brocade user authentication model allows for two other methods to authenticate users: RADIUS and LDAP.
SAN administrators can manage both passwords and user names on each switch locally or through a centralized access control administration method such as the RADIUS authentication protocol or LDAP. These protocols allow a SAN administrator to change a password or disable a user's account from one central location, and that change is applied immediately across all switches to which the user has access. The authentication method to be used is defined using the aaaConfig command (aaaConfig --authspec ["radius" | "ldap" | "radius;local" | "ldap;local" --backup]). For redundancy, more than one authentication server can be added using the aaaConfig --add command.
5.2.0
6.2.0
FabricAdmin
5.2.0
Description Routine switch maintenance commands All switch security and user management functions Most switch (local) commands, excluding security, user management, and zoning commands Non-administrative use such as monitoring system activity Zone management commands only
SwitchAdmin
5.0.0
User
All
Monitoring only
ZoneAdmin
5.2.0
Zone administration
FC-Specific Security
Brocade has developed several FC-specific security features that would not normally be available in a conventional LAN. For example, devices connecting to a Fibre Channel SAN can be authenticated using a strong protocol with the DCC policy.
"SCC_POLICY", "member;...;member"), where the member is the switch domain ID and an asterisk (*) is used to define all switches in a fabric.
Example:
switch:admin> secpolicycreate "SCC_POLICY", "2;4"
In a FOS 5.3.0 environment or later, use the DCC policy to define which devices are allowed to join a fabric. The DCC policy can identify member devices using their WWN or the physical port in the fabric to which they are connected. To further enhance security, a WWN can be locked down to a specific port as a WWN spoofing countermeasure: a device configured to mimic an existing device cannot join the fabric unless the device being spoofed is first disconnected and physically replaced with the unauthorized device. The DCC policy is defined using the secPolicyCreate command (secPolicyCreate "DCC_POLICY_policyname", "member;...;member"), where the member is either a WWN or the switch domain ID (port ID). When both the WWN and the switch domain ID/port ID definitions are used together, this is called locking down a port, and only the WWN associated with that port is allowed to join the fabric. Example:
switch:admin> secpolicycreate "DCC_POLICY_server", "11:22:33:44:55:66:77:aa;1(3)"
This example creates a policy called DCC_POLICY_server and locks down the device with WWN 11:22:33:44:55:66:77:aa to port 3 of the switch with domain ID 1.
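The effect of locking a WWN down to a port can be illustrated with a small Python sketch that parses the member string from the example and decides whether a device may join. The function names are hypothetical, only the simple domain(port) member form shown above is handled, and the decision rule (a listed WWN is accepted only on a listed port, a listed port accepts only a listed WWN, and anything outside the policy is left alone) is a simplifying assumption, not a description of the actual Fabric OS enforcement logic.

```python
import re

WWN_RE = re.compile(r"^(?:[0-9a-fA-F]{2}:){7}[0-9a-fA-F]{2}$")
PORT_RE = re.compile(r"^(\d+)\((\d+)\)$")   # domain(port), single port only


def parse_dcc_members(members):
    """Split a 'member;...;member' string into a WWN set and a (domain, port) set."""
    wwns, ports = set(), set()
    for item in members.split(";"):
        item = item.strip()
        if not item:
            continue
        if WWN_RE.match(item):
            wwns.add(item.lower())
            continue
        match = PORT_RE.match(item)
        if match is None:
            raise ValueError(f"unrecognized DCC member: {item!r}")
        ports.add((int(match.group(1)), int(match.group(2))))
    return wwns, ports


def is_allowed(members, wwn, domain, port):
    """Assumed rule: policy WWNs and policy ports must appear together."""
    wwns, ports = parse_dcc_members(members)
    on_policy_port = (domain, port) in ports
    known_wwn = wwn.lower() in wwns
    if on_policy_port or known_wwn:
        return on_policy_port and known_wwn
    return True   # assumption: devices and ports outside the policy are not restricted


if __name__ == "__main__":
    policy = "11:22:33:44:55:66:77:aa;1(3)"
    print(is_allowed(policy, "11:22:33:44:55:66:77:aa", 1, 3))
    print(is_allowed(policy, "11:22:33:44:55:66:77:aa", 1, 4))
```

Under this rule, a device that spoofs the WWN on any other port is rejected, as is a different WWN plugged into the locked-down port.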
Brocade supported SLAP (Switch Link Authentication Protocol), based on digital certificates, in SFOS. Today, FCAP (Fibre Channel Authentication Protocol), based on digital certificates, and DH-CHAP, based on the exchange of shared secrets, are the principal authentication protocols used in FC. DH-CHAP is more frequently used since it is part of the FC-SP standard. Brocade introduced the AUTH policy in FOS 5.3.0 to allow SAN administrators to enforce device authentication. The AUTH policy can be set to one of the following:
OFF: No authentication required (default)
ON: Strict enforcement of authentication on devices joining F_Ports
PASSIVE: Authentication is optional and only authenticates devices configured for and capable of authentication
The ON mode of the AUTH policy was introduced recently in FOS 5.2.0. Prior to this, device authentication could not be configured to require authentication,
Logically
FC Routing
Fibre Channel Routing (FCR) is a means of isolating two fabrics from each other, while allowing specific devices in separate fabrics to communicate with each other according to a set of pre-defined rules. FCR can be implemented in one of two ways in a FOS-based fabric:
Brocade 7500/7500E Extension Switch or FR4-18i Extension Blade
Integrated Routing (IR) feature, available in FOS 6.2.0 and later
The Brocade 7500/7500E and FR4-18i are specialized routing hardware platforms; IR is a licensed feature available on standard Condor 2-based products ("Condor 2" identifies the ASIC type), which include the Brocade DCX/DCX-4S Backbone and Brocade 5100/5300 Switch. With the IR feature, a specific port in a supported switch can be configured to perform FC-FC routing.
Zoning
Zoning provides a logical means to group devices together and to isolate them from other devices. Zoning has been discussed at length in Chapter 3: SAN Basics for Security Professionals starting on page 19, as well as Chapter 6: FC Security Best Practices starting on page 91. This section discusses in greater detail how zones are implemented and managed in a FOS environment. As a best practice, implement zones on FOS-based fabrics using the pWWN instead of the domain ID/port ID: both are hardware enforced, and the pWWN provides more flexibility from a management perspective.
A set of zones makes up a zone configuration, and it is possible to have more than one zone configuration in a fabric. For example, there could be one zone configuration for the day shift, during which most production takes place, and another for the night shift, during which maintenance and backups are usually performed. When a configuration is changed, the effective configuration is disabled, and the new configuration is enabled and becomes the effective configuration. During this transition period, particularly with large fabrics, the name server must indicate to all the servers that there is a change in which devices they are allowed to communicate with. While the effective configuration is temporarily disabled, it is possible for all servers in the fabric to see all devices, since no zone configuration is effectively defined.
To prevent this from happening, default zones were created to ensure that devices in a fabric cannot all see each other during a configuration change. The default zone can be set to NOACCESS mode, which prevents devices from seeing each other, using the defzone --noaccess command.
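The role of the default zone during a configuration change can be sketched as a simple visibility check. This is an illustrative model only: the function name and data layout are hypothetical, and "allaccess" is used here as the label for the opposite of NOACCESS mode, which the text above does not name.

```python
def can_see(dev_a, dev_b, effective_zones, default_zone_mode):
    """Decide whether two devices can see each other.

    effective_zones: dict of zone name -> set of members, or None while
    no configuration is effective (for example, during a configuration change).
    default_zone_mode: "noaccess" or "allaccess" (label assumed for illustration).
    """
    if effective_zones is None:
        # No effective configuration: the default zone decides.
        return default_zone_mode == "allaccess"
    # Otherwise two devices see each other only if they share at least one zone.
    return any(dev_a in members and dev_b in members
               for members in effective_zones.values())


if __name__ == "__main__":
    zones = {"backup_zone": {"server1", "tape1"}, "prod_zone": {"server1", "array1"}}
    print(can_see("server1", "array1", zones, "noaccess"))
    print(can_see("tape1", "array1", zones, "noaccess"))
    print(can_see("tape1", "array1", None, "noaccess"))
```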
between the two types of traffic and simply shared the load between the two available ISLs. Traffic Isolation (TI) zones were created to address the issue. Traffic isolation can force traffic from one source to be sent over one path and traffic from another source over another path. In the previous example, the backup traffic could be sent over one path and the data replication traffic over another path. TI zones can be created using the zone command (zone --create -t ti zone_name -p "ports"). Example:
zone --create -t ti red_zone -p "1,1; 2,4; 1,8; 2,6"
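When TI zones are generated from an inventory rather than typed by hand, the member list can be assembled programmatically. The Python sketch below builds and parses the command string shown in the example; the helper names are hypothetical, and it assumes that each member is a "domain,port" pair, which the example suggests but the text does not state explicitly.

```python
def ti_zone_command(zone_name, members):
    """Build the zone --create command for a TI zone from (domain, port) pairs."""
    ports = "; ".join(f"{domain},{port}" for domain, port in members)
    return f'zone --create -t ti {zone_name} -p "{ports}"'


def parse_ti_members(ports):
    """Turn a '1,1; 2,4' style member string back into (domain, port) pairs."""
    pairs = []
    for item in ports.split(";"):
        if not item.strip():
            continue
        domain, port = item.split(",")
        pairs.append((int(domain), int(port)))
    return pairs


if __name__ == "__main__":
    print(ti_zone_command("red_zone", [(1, 1), (2, 4), (1, 8), (2, 6)]))
```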
Audit log
Certain classes of events that occur in a SAN may be of great value to security personnel. These events include login failures, zone configuration changes, firmware downloads, and other configuration changes, all of which may have a serious effect on the operation and security of the switch. These events can be recorded and filtered using the Brocade audit log feature, which was introduced in FOS 5.2.0. Auditable events using this feature are generated by the switch and then sent to an external host through syslogd (the daemon that sends messages to the syslog).
Fabric-Based Encryption
Encryption ensures confidentiality of data whether it is at rest or in flight. Encryption of data-at-rest in a FOS environment can be performed at the fabric level using the Brocade Encryption Solution. This solution is discussed in great detail in Chapter 11: Brocade Data Encryption Products starting on page 175. Encrypting data-in-flight can be used to secure communications between two data centers connected through an FCIP tunnel, for example. This solution could be implemented in a FOS environment using the Brocade 7500 or FR4-18i, also discussed at length in Chapter 11.
FIPS Mode
As discussed in Chapter 9: Compliance and Storage starting on page 157, FIPS 140-2 is a standard that was established to simplify the procurement of security products by providing a simple method to ensure that products meet certain security requirement levels. Brocade switches by default are not compliant with the FIPS standard, but they can be placed into FIPS mode to immediately enhance the security level of the switch. FIPS mode has been available since FOS 6.0.0. It is important to make this distinction: placing a switch in FIPS mode is not the same as making the switch FIPS-compliant. Placing a switch in FIPS mode enhances the security level of a switch according to the compliance requirements specified by FIPS 140-2 Level 2. Enabling FIPS mode is a disruptive action since it requires a reboot of the switch to take effect. FIPS mode is enabled and configured using the fipsCfg command.
CAUTION: FIPS mode is disruptive and may have unexpected implications if you are not familiar with this mode of operation. For example, if you lose the admin password on a switch running in FIPS mode, there will be no way to regain management control of that switch. FIPS mode should be used only if there is a mandatory operational requirement to do so. Again, operating a switch in FIPS mode does not imply that the switch is FIPS 140-2 compliant.
When a Brocade switch is in FIPS mode, the following occur:
Root account disabled
Telnet disabled; only SSH can be used
HTTP disabled; only HTTPS can be used
RPC disabled; only secure RPC can be used
Only TLS-AES128 cipher suite used with secure RPC
SNMP read-only operations exclusively; SNMP write operations disabled
DH-CHAP/FCAP hashing performed only using SHA-1
Mandatory firmware signature validation
SCP used exclusively (no FTP) for configUpload, configDownload, supportSave, and firmwareDownload commands
IPSec usage of AES-XCBC, MD5, and DH group 1 blocked
RADIUS uses only PEAP or MSCHAPv2; CHAP and PAP disallowed
Only the following encryption algorithms functional: HMAC-SHA1, 3DES-CBC, AES128-CBC, AES192-CBC, and AES256-CBC
Starting in FOS 6.2.0, the following steps are required to prepare a switch to run in FIPS mode:
1. (Optional) Configure RADIUS or LDAP server.
2. (Optional) Configure authentication protocols.
3. (LDAP only) Install SSL certificate on a Microsoft Active Directory (AD) server and CA certificate on the switch for using LDAP authentication.
4. Block telnet, HTTP, and RPC (using IP filters).
5. Disable boot PROM access.
6. Configure the switch for signed firmware.
7. Disable root access.
8. Enable FIPS mode (using fipsCfg command).
Refer to the Fabric OS Administrator's Guide for the version of firmware you are using before performing the procedure to make sure that you have the most complete and current information. Once FIPS mode is enabled, several other steps are required to reset and zeroize certain switch parameters.
RSCN Suppression
It was explained earlier that RSCNs are confined to the devices within a FOS-based zone. It is also possible to explicitly suppress RSCNs at the port level. Some specialized applications are very sensitive and can be affected by an RSCN. If the environment is static and never changes once it is installed, RSCNs can be disabled to prevent the interruptions they cause. RSCN suppression can be configured using the portCfg rscnsupr command.
Signed Firmware
Firmware can be tampered with and a modified version of the firmware installed on a switch. This type of attack, although unlikely on a Brocade switch, is usually performed by modifying the code to add a back door or malicious code known only to the author of the modified code. To ensure that the code being installed on a switch is in fact the authorized version and has not been modified by a third party, a hash value of the firmware is calculated. This hash value is then digitally signed with a private key at the source using the RSA algorithm and 1024-bit keys. The public key of the source is included in the firmware package to allow the switch to authenticate the firmware. This feature, called signed firmware, was introduced in FOS 6.1.0. When installing new firmware on a switch that has been configured for firmware signature validation, the public key is retrieved from the local public key file included with the firmware package and the firmware is validated.
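The hash-then-sign principle described above can be demonstrated with a toy Python sketch. It is deliberately not secure and is not how Fabric OS implements the feature: the key is built from two small, publicly known primes instead of a 1024-bit key, no signature padding is used, and SHA-256 is an assumption because the text does not name the hash algorithm. All names are hypothetical.

```python
import hashlib

# Toy RSA key built from two Mersenne primes (far smaller than a real 1024-bit key).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (requires Python 3.8+)


def digest_as_int(firmware):
    """Hash the firmware image and reduce the digest so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(firmware).digest(), "big") % N


def sign(firmware):
    """Vendor side: sign the hash value with the private key."""
    return pow(digest_as_int(firmware), D, N)


def verify(firmware, signature):
    """Switch side: recompute the hash and compare it with the signed value."""
    return pow(signature, E, N) == digest_as_int(firmware)


if __name__ == "__main__":
    image = b"example firmware image"
    signature = sign(image)
    print(verify(image, signature))                 # genuine image
    print(verify(image + b" backdoor", signature))  # modified image
```

Any change to the image changes its hash, so the signature produced for the original image no longer verifies against the modified one.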
A switch must be configured to enforce firmware signature validation and this is done using the configure command. Example:
switch:admin> configure
Not all options will be available on an enabled switch.
To disable the switch, use the "switchDisable" command.
Configure...
System services (yes, y, no, n): [no]
cfgload attributes (yes, y, no, n): [no] yes
Enforce secure config Upload/Download (yes, y, no, n): [no]
Enforce firmware signature validation (yes, y, no, n): [no] yes
Insistent Domain ID
It is possible for a switch to obtain a new domain ID after a reboot, particularly when a switch is added to a new fabric or after a massive power failure. To prevent this from occurring, it is a best practice to assign a domain ID to a switch using an insistent domain ID (IDID). Once it is set, an IDID survives reboots and power failures and does not change. The insistent domain ID is set using the configure command:
Select y after the Fabric Parameters prompt.
Select y again after the Insistent Domain ID Mode prompt.
Chapter Summary
With over 100 security features and more added in every Fabric OS release, many tools are at the disposal of SAN and security professionals to increase the security level of their SAN environment. Most of these features are relatively simple to implement and do not add any overhead to the daily management tasks of the SAN administrator. Some features, such as RADIUS and LDAP, actually simplify management, for example by allowing a SAN administrator to change the password for a user in one convenient location as opposed to on every switch in the SAN. Deciding which FOS security features to implement depends on each individual organization's requirements and on factors such as:
Specific vulnerabilities
Probability of each vulnerability being exploited
Value of the asset being protected
Cost of implementing the countermeasures
Impact on day-to-day management activities
Once these factors are weighed carefully, the SAN security policy can be created and implemented using appropriate countermeasures.
Certainly, most organizations will demonstrate due diligence and implement security measures on their own to protect their sensitive and critical data from loss or theft. Nonetheless, one of the primary driving factors for organizations to implement specific security measures is compliance, particularly mandatory and regulatory compliance. Compliance is the state of being in accordance with established guidelines, specifications, or legislation. These guidelines, specifications, and legislation can be industry specific, an accepted standard, or government legislation. Guidelines and specifications are not necessarily mandatory; some provide guidelines on which organizations can base their security policies to better protect their IT environments. Legislative specifications, however, are mandatory for certain organizations. Non-compliance is not an option and, if prosecuted, organizations face severe penalties, including fines and, in some cases, jail time for executives. Guidelines, specifications, and legislation are not generally aimed at one specific area of technology, such as SANs and storage, but usually apply to all technologies and systems. A holistic approach is the best strategy to meet most regulatory compliance requirements.
The Data Security Standard (DSS) v1.1, established in September 2006, defines requirements to help prevent credit card fraud and hacking into credit card management systems. Merchants are required to meet minimum security standards. The following describes the general requirement categories, but there are many specific requirements within each category.
Build and maintain a secure network:
Install and maintain a firewall configuration to protect cardholder data.
Do not use vendor-supplied defaults for system passwords and other security parameters.
Protect stored cardholder data.
Encrypt transmission of cardholder data across open, public networks.
Use and regularly update anti-virus software on all systems commonly affected by malware.
Develop and maintain secure systems and applications.
Restrict access to cardholder data by business need-to-know.
Assign a unique ID to each person with computer access.
Restrict physical access to cardholder data.
Track and monitor all access to network resources and cardholder data.
Regularly test security systems and processes.
Maintain an information security policy:
Maintain a policy that addresses information security.
Sensitive cardholder data under the PCI-DSS is defined as:
Primary Account Number (PAN)
Cardholder name
Service code
Expiration date
PCI-DSS uses a multi-tiered approach to managing merchant risks that depends on several factors. Merchants fall into a specified merchant level based on the criteria identified in Table 12. Table 12. PCI-DSS merchant levels and criteria
Merchant Level    Criteria
Level 1           All merchants processing over 6 million transactions per year
                  Merchants whose data has been previously compromised
                  Any merchant deemed to meet Level 1 compliance
Level 2           All merchants processing from 1 to 6 million transactions per year
                  All merchants required by another payment network to report compliance as a Level 2 merchant
Level 3           All merchants processing from 20,000 to 1 million transactions per year
                  All merchants required by another payment network to report compliance as a Level 3 merchant
Level 4           All other merchants
Level 1 merchants, due to the significant number of transactions they process, are required to have an annual onsite audit. All other merchants must complete an annual self-assessment questionnaire, and all merchants, including Level 1, must undergo a quarterly network security scan performed by an approved scanning vendor (ASV).
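The merchant level criteria and the resulting validation obligations can be expressed as a short Python sketch. The function and parameter names are hypothetical. Because the published ranges share their endpoints, the sketch assumes that exactly 6 million transactions falls in Level 2 and exactly 1 million falls in Level 2; the required_level argument stands in for a level imposed by a payment network.

```python
def merchant_level(transactions_per_year, previously_compromised=False,
                   required_level=None):
    """Return the PCI-DSS merchant level (1 to 4) under the stated assumptions."""
    if previously_compromised or transactions_per_year > 6_000_000:
        level = 1
    elif transactions_per_year >= 1_000_000:   # assumption: boundary goes to Level 2
        level = 2
    elif transactions_per_year >= 20_000:
        level = 3
    else:
        level = 4
    if required_level is not None:
        # A level imposed by a payment network can only tighten the result.
        level = min(level, required_level)
    return level


def validation_requirements(level):
    """Annual and quarterly obligations described in the text for a given level."""
    annual = "onsite audit" if level == 1 else "self-assessment questionnaire"
    return [f"annual {annual}", "quarterly network security scan by an ASV"]


if __name__ == "__main__":
    for volume in (7_000_000, 2_500_000, 50_000, 500):
        level = merchant_level(volume)
        print(volume, level, validation_requirements(level))
```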
Breach disclosure laws require organizations to disclose specific types of security breaches, particularly those involving personally identifiable information (PII) of individuals of a given state. There is no current federal legislation to address breach disclosure but there is one under development.
The precedent-setting law was the California Senate Bill (SB) 1386, which came into effect on July 1, 2003, as a result of a security breach of California's state Web site in 2002. California SB 1386 states that:
1. Any agency that owns or licenses computerized data that includes personal information;
2. shall disclose any breach of the security of the system following discovery or notification of the breach in the security of the data;
3. to any resident of California whose unencrypted personal information was, or is reasonably believed to have been, acquired by an unauthorized person.
California SB 1386 was not perfect, so it was necessary to expand its scope to impose a general security standard on businesses that maintain certain types of personal information about California residents. California Assembly Bill (AB) 1950 came into effect in January 2005 and also requires businesses, and their subcontractors, to maintain reasonable security procedures and practices. At the time of writing, 44 US states had implemented similar breach disclosure laws. The National Conference of State Legislatures Web site contains a list of all US states with breach disclosure laws and references to them. Similar breach disclosure laws have been enacted in other countries, including:
Canada. Personal Information Protection and Electronic Documents Act (PIPEDA)
UK. Data Protection Act (DPA)
EU. 95/46/EC - EU Directive and Basel II
Japan. Personal Information Protection Law (PIPL)
Australia. Commonwealth Privacy Act (CPA)
A major criticism of HIPAA has been that, in spite of providing well-defined penalties, it has not really been heavily enforced, although there have been recent cases of health care institutions being audited by the US Department of Health and Human Services.
The key term here is personally liable, which certainly gets the attention of the officers and directors of a financial institution. This law is enforced and has real bite, with several cases tried and others currently on trial under this act.
cryptographic material. At the time of writing, there were only three countries on this list: Iran, Iraq, and North Korea. For the most part, laws around exporting cryptographic material outside of these countries have been relaxed, but there still are some restrictions. It is best to verify with the BIS before exporting any cryptographic material. Other countries also have restrictions on exporting or importing cryptographic materials. For example, France has a current import restriction on 128-bit keys, which are subject to special permission.
Each organization has different security requirements and requires different degrees of security; hence FIPS 140-2 defines four security levels (see below). The lowest security level begins at 1 and each subsequent level builds upon the previous ones. The actual certification of the cryptographic module is performed by an independent lab, which validates the product to ensure it meets the criteria required for the Security Level being sought by the vendor. Once the tests are completed, the results are submitted to NIST, and upon approval the product is officially posted on the NIST Web site at http://csrc.nist.gov/groups/STM/cmvp/inprocess.html.
Security Level 1
Security Level 1 provides the lowest level of security and it basically defines production-grade equipment with no physical security. Pretty much any product using a cryptographic module would qualify for this level of certification. An example of a Security Level 1 certified product is an ordinary laptop with a software-based encryption module.
Security Level 2
Security Level 2 enhances Security Level 1 with the tamper evidence requirement. Tamper evidence is implemented using special coatings or seals or pick-resistant locks for removable covers and doors. If a protective cover or door is tampered with to allow physical access to critical security parameters or plaintext keys stored in the cryptographic module, the coatings or seals will be broken and permanently modified. Additionally, role-based authentication must be used to authenticate an operator with a specific role that allows them to perform certain tasks such as deleting keys.
Security Level 3
Security Level 3 builds upon Security Level 2 with the addition of tamper-resistant mechanisms to prevent someone from gaining access to the critical security parameters (CSP) stored in the cryptographic module. This may include tamper detection and response systems, which could, for example, zeroize the keys stored in the local cache when the cover or door is opened. Security Level 3 must also include identity-based authentication mechanisms to authenticate a specific individual and verify that they are authorized to perform the specified task. Security Level 3 also requires that plaintext CSPs be exchanged using different ports than those used for other purposes (such as management interfaces). This enforces the principle of separation of duties to allow different individuals to have authority over the different types of functions and prevents one individual from having total control over the entire process.
Security Level 4
Security Level 4 provides the highest level of security and builds upon Security Level 3. The physical security mechanisms at this level must provide a complete envelope of protection around the cryptographic module. All unauthorized attempts to physically access the cryptographic module must be detected and responded to by zeroizing all plaintext CSPs. The cryptographic module must also be protected against extreme environmental conditions that exceed the normal operating ranges for voltage and temperature. Only the most demanding environments require products certified to Security Level 4, such as combat zones and highly secure facilities that use equipment containing highly sensitive information. Under these exacting conditions, the equipment must still be able to zeroize the CSPs. For this reason, some people refer to Security Level 4 as a science experiment, since the testing process is extremely rigorous, lengthy, and expensive and few products are certified to this level.
Vendors seeking CC, or ISO/IEC 15408:2005, accreditation must have their product undergo independent testing by an approved laboratory to obtain the desired EAL accreditation level. A security product under CC evaluation is referred to as a target of evaluation (TOE), which can include hardware, operating systems, computer networks, and applications. To evaluate a TOE, the security requirements the product or system is designed to address and its security functions must be defined. This requirements and functions definition is referred to as the security target (ST). Since there are many different security requirements addressing specific security problems, categories are created to simplify classification of individual products. Each category is represented by an implementation-independent structure known as a protection profile (PP). When evaluators evaluate a TOE, they compare the ST for that product or system against pre-defined PPs and make a statement of compliance or non-compliance to the PP.
FIPS Process
Once a vendor applies to qualify under FIPS 140-2, there are a series of stages to go through. The vendor and product under evaluation are published on the NIST/NIAP Web site at: http://www.niap-ccevs.org/cc-scheme/vpl/
There are five basic stages to get to final acceptance and qualification:
1. Implementation Under Test (IUT)
2. Review Pending
3. In Review
4. Coordination
5. Finalization
In some cases, a vendor chooses to evaluate a product to a specific EAL but may not have all of the functionality to achieve the next highest level. In this case, a vendor can augment the EAL achieved with some additional assurance components from the next highest EAL level.
Chapter Summary
The storage and SAN components of an IT environment are often subject to compliance requirements. Compliance guidelines and legislation described in this chapter that apply to the storage and SAN environments include PCI-DSS, breach disclosure laws, HIPAA, GLBA, FIPS, Common Criteria, and FISMA. Often third parties are required to ensure the credibility of compliance reports. Cryptographic material, formerly categorized as munitions, is subject to export regulations in the US.
10
SAN security is still a relatively new field and has not yet achieved mainstream status. Efforts have been made by various organizations, however, including Brocade, to assemble and disseminate more information on this important subject and develop a more structured approach to security in the SAN and storage space. Other organizations and consortiums are developing new standards, particularly in the key management space, to simplify and enable interoperability among different vendor solutions. New technologies are emerging in the storage industry that may have a significant impact on how storage will be managed in the future. As these new technologies mature, new vulnerabilities and risks will come along with them.
iSCSI
The iSCSI protocol was designed as an alternative to Fibre Channel, but in reality it is a complementary technology. The attraction of iSCSI was the concept of leveraging an existing LAN infrastructure to also carry block-based storage data and thus reduce the cost of a SAN. However, one of the challenges faced by iSCSI is TCP/IP, which can be a very lossy protocol with associated performance degradation. To compensate for this, TCP/IP offload engines, or TOE cards, were created to offload the CPU processing requirements for the TCP/IP stack on the server. This subsequently results in an increase in costs, which offset a good part of the benefit of using iSCSI. For this reason, iSCSI has been primarily used in environments requiring less performance. In enterprise environments, FC is usually deployed for the high-performance enterprise applications and iSCSI for the less critical, low to mid-performance servers.
Security concerns with iSCSI are similar to those with TCP/IP in general since it is based on that protocol suite. A few storage-specific security features are available with iSCSI, for example to authenticate devices when they join a network. Other storage-specific security features, such as ACLs, can also be used with iSCSI. Additionally, device authentication can be accomplished using the Kerberos, SRP (Secure Remote Password), CHAP (Challenge-Handshake Authentication Protocol), and SPKM-1/2 (Simple Public Key Mechanism) protocols (which are less secure than DH-CHAP with FC). IPSec is also used with iSCSI, particularly with extended fabrics over public WANs, to maintain data confidentiality by encrypting the data stream.
FCoE/CEE
(The following section is excerpted from a white paper entitled "Why Fibre Channel over Ethernet?" authored by Tom Clark, Brocade Global Solutions Architect.)

Recently, a new Fibre Channel standards initiative was created for running the Fibre Channel protocol over Ethernet (FCoE). Given the substantial investment in engineering resources, technical volunteers, and product development required to create a new standard and technology, and given the success Fibre Channel already has demonstrated for data center SANs, why is FCoE needed? Some industry observers have speculated that FCoE is an attempt by Fibre Channel vendors to compete with iSCSI, which, after all, also transports block storage data over Ethernet. When FCoE is compared to iSCSI, however, we see that the two protocols solve very different problems. iSCSI uses TCP/IP to move block storage data over potentially lossy and congested Local and Wide Area Networks (LANs and WANs) and is used primarily for low- and moderate-performance applications. The FCoE initiative, in contrast, intends to utilize new Ethernet extensions that replicate the reliability and efficiency that Fibre Channel has already demonstrated for data center applications. These new Ethernet enhancements are predicated on 10 Gbit/sec performance and are sometimes referred to as Converged Enhanced Ethernet (CEE).

FCoE is not a replacement for conventional Fibre Channel but is an extension of Fibre Channel over a different link layer transport. Enabling an enhanced Ethernet to carry both Fibre Channel storage data as well as other data types, for example, file data, Remote Direct Memory Access (RDMA), LAN traffic, and VoIP, will allow customers to simplify server connectivity and still retain the performance and reliability required for storage transactions. Instead of provisioning a server with dual-redundant Ethernet and Fibre Channel ports (a total of 4 ports), servers can be configured with 2 CEE-enabled 10 Gbit/sec Ethernet ports. For blade server installations, in particular, this reduction in the number of interfaces greatly simplifies deployment and ongoing management of the cable plant. The main value proposition of FCoE is therefore the ability to streamline server connectivity using CEE-enabled Ethernet while retaining the channel characteristics of conventional Fibre Channel SANs, as shown in Figure 42.
Figure 42. Consolidated server network interface using FCoE and CEE (labels: FC HBAs for FC traffic; consolidated I/O ports on CNAs)

Given the more rigorous requirements of storage transactions, FCoE is predicated on a new, hardened Ethernet transport that is both low loss and deterministic. Without the enhancements of CEE, standard Ethernet is too unreliable to support high-performance block storage transactions. Unlike conventional Ethernet, CEE provides much more robust congestion management and high-availability features characteristic of data center Fibre Channel.

The FCoE initiative is being developed in the ANSI T11 Technical Committee, which deals with FC-specific issues, and is included in a new Fibre Channel Backbone Generation 5 (FC-BB-5) specification. Because FCoE takes advantage of further enhancements to Ethernet, close collaboration is required between ANSI T11 and the Institute of Electrical and Electronics Engineers (IEEE), which governs Ethernet and the new CEE standards.

As with any network protocol, it is expected that FCoE and CEE will also have security vulnerabilities. As the protocol becomes ratified and end-users begin deploying these solutions, security issues are likely to arise over time. It is still too early to tell what vulnerabilities will be found in this protocol, but it will obviously share some security characteristics inherent in the FC and Ethernet protocols. This is good news from a security perspective, since most of the security issues in the LAN world are attributed to the protocol layers above the physical and data link layers, where CEE resides. Although attacks exist at these layers, there are fewer exploits available.
11
Brocade has been offering data encryption functionality since the introduction of the Brocade 7500 Extension Switch, which supported IPSec for encryption of data transported over an FCIP tunnel. In 2008, Brocade introduced a new hardware platform for encryption of data-at-rest for both disk and tape media, which offers unprecedented encryption bandwidth from a single device. This encryption solution is based on a switch platform rather than a single-purpose appliance. It functions in the same way as a conventional Layer 2 FC switch but with the additional hardware required to support line-speed encryption and compression functionality.

NOTE: In this chapter there are references to FOS 6.3, which is scheduled to be released in late summer 2009.
The Brocade encryption device features the following:
• Up to 96 Gbps processing bandwidth for disk encryption
• Up to 48 Gbps processing bandwidth for tape encryption and compression
• Encryption using the industry standard AES-256 algorithm
• Compression using a variant of gzip
• 8 Gbps FC port speeds
• Disk encryption latency of 15-20 µsec
• Tape encryption and compression latency of 30-40 µsec
• Brocade internally-developed encryption ASIC technology
• FC switching connectivity based on the Brocade Condor 2 ASIC
• Dual Ethernet ports for HA synchronization and heartbeats
• Smart Card reader used as a System Card (ignition key optional)

The ignition key is an optional feature, available with FOS 6.3 and later, which can be enabled to enhance the level of security on the switch. The ignition key is a Smart Card inserted into a Smart Card reader to enable the cryptographic capabilities of the switch. Without it, the Brocade Encryption Switch behaves as a regular 8 Gbps Layer 2 FC switch. If the ignition key feature is used, it is imperative to store the Smart Card in a safe location after the cryptographic functions of the switch have been enabled. The Smart Card must be reinserted in the reader (see Figures 43 and 45) each time the switch is rebooted or power cycled to enable the cryptographic capabilities of the switch.
• 32 x 8 Gbps FC ports
• Three redundant fan modules
• Two redundant power supplies
• USB port
• RJ-45 GbE management port
• Two redundant RJ-45 GbE ports for intercluster communication
• FIPS 140-2 Level 3 compliant cryptographic boundary cover
• Smart Card reader used as a System Card (ignition key optional)
Figure 43 and Figure 44 illustrate the Brocade Encryption Switch and its components.
Figure 44. Rear view of the Brocade Encryption Switch (labels: Smart Card reader; USB port; status LED; power LED; Ethernet management port; 2 RJ-45 GbE redundant cluster ports; RJ-45 serial port; 8 Gbps FC ports)

The Brocade Encryption Switch is also available in an entry-level version for disk encryption. Some companies may not require the full 96 Gbps of bandwidth for disk encryption or may not have the budget for this type of solution. The entry-level product offers up to 48 Gbps of encryption processing bandwidth at a lower price point. It is identical to the full-featured version with the exception of encryption processing bandwidth; all 32 FC ports are still enabled and can be used to connect hosts and storage devices. Later, if the 48 Gbps encryption bandwidth is exceeded, a simple license upgrade to the full 96 Gbps bandwidth version can be installed online without interruption to the production environment.
Figure 45 and Figure 46 illustrate the FS8-18 blade and its components.
Figure 45. Profile view of the Brocade FS8-18 (labels: 2 RJ-45 GbE redundant cluster ports; 8 Gbps FC ports)

The FIPS 140-2 Level 3 compliance posed several challenges for the FS8-18. The typical Brocade enterprise-class platform blade has all of its ASICs exposed on the card. To prevent tampering with the components of the blade involved in the cryptographic (crypto) process, it was necessary to build a physical crypto security boundary protecting all the memory, true random number generator, encryption, and Condor 2 ASICs. This boundary was created using a cover over these components, which in turn posed a new challenge: cooling. The cover cannot have vents, which could allow intruders to access the internal components with specialized tools, so copper heat sinks were placed on the cover to dissipate the heat, as shown in Figure 46.

As with the Brocade Encryption Switch, the FS8-18 Encryption Blade is also available in an entry-level version for disk encryption. The entry-level version of the blade, though, applies to the entire DCX family chassis. The Brocade DCX family chassis can support from one to four FS8-18 blades per chassis. With the entry-level version, the entire chassis is limited to 48 Gbps of encryption processing bandwidth per blade for disk. The entry-level version affects only the encryption processing bandwidth; all 16 FC ports are still enabled and can be used to connect hosts and storage devices. Later, if the 48 Gbps encryption bandwidth is exceeded, either new FS8-18 blades can be added or all the encryption blades in the chassis can be upgraded with a simple chassis-level license upgrade to full 96 Gbps bandwidth.

Figure 46. Side view of the Brocade FS8-18

One advantage the FS8-18 blade has over the encryption switch is that all of the encryption traffic passes through the backplane. It is not necessary to connect the hosts or storage devices involved in the encryption process into one of the 16 FC ports on the blade. In fact, encryption will take place even though there are no devices directly connected into the blade. This is accomplished using the frame redirection technology described in "Frame Redirection" on page 33.
Feature support by FOS release (column heading: FOS 6.2). Features listed:
• Multi-path rekeying to a LUN through an EE (up to 8 paths)
• System Card to enable crypto capability
• Quorum authorization of sensitive operations
• Access Gateway for third-party support (switch only)
Compression is an important component of a data-at-rest encryption solution for tape. Once data is encrypted, it is no longer compressible. Compression works on the principle of searching for patterns and optimizing them. Encryption takes data and removes all patterns by randomizing the data. Once the data is randomized and all patterns are removed, the compression algorithm has no patterns to optimize. If encrypted data is sent directly to a tape drive, the native compression capabilities of that tape drive will no longer be effective. Hence, it is important to compress the data first, then encrypt it, and then send it to the tape drive to prevent an unnecessary increase in the number of tape media used for backups.

The compression algorithm used in the Brocade encryption solution is based on a variant of the standard gzip algorithm. The compression ratio obtained varies, as with any other compression algorithm, depending on the type of data and how compressible it is. Data with a lot of white space compresses quite well, while some data may not compress at all.
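The effect described above can be reproduced with a short Python sketch. It uses the standard zlib module (DEFLATE, the same family as gzip) as a stand-in for the Brocade compression variant, and random bytes from os.urandom as a stand-in for ciphertext; both substitutions are assumptions made for illustration only.

```python
import os
import zlib


def compressed_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower means better compression)."""
    return len(zlib.compress(data)) / len(data)


# Patterned data with plenty of white space, similar to many backup streams.
patterned = b"backup record with lots of white space      \n" * 2000

# Random bytes stand in for ciphertext: encryption removes all patterns.
random_like = os.urandom(len(patterned))

ratio_plain = compressed_ratio(patterned)      # a small fraction of the original size
ratio_cipher = compressed_ratio(random_like)   # roughly 1.0: nothing left to compress
```

Compressing first and encrypting second keeps the benefit of the first ratio; encrypting first leaves the tape drive with data that behaves like the second case.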
CryptoTarget Containers
A CTC (Crypto Target Container) is created for each storage target port hosted on a Brocade encryption device and is used to set up the encryption to a media. A CTC can be composed of only one storage port target but it can have multiple initiators or hosts associated with it. A CTC can also have several LUNs behind the storage port in the
CTC. Furthermore, once a storage port has been assigned to a CTC, it cannot exist or be defined in another CTC. Essentially, this forces all traffic that goes through a specific storage port to be encrypted and to go through the same encryption device. NOTE: The storage port can still be made accessible (with appropriate zoning) for other hosts in case encryption is not required for their LUNs. In this case, these LUNs are not added to the CTC.
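The CTC rules above (one storage target port per container, any number of initiators and LUNs, and no target port in more than one container) can be modeled in a few lines of Python. The class names and WWN values here are hypothetical and do not correspond to any FOS data structure or command.

```python
from dataclasses import dataclass, field


@dataclass
class CryptoTargetContainer:
    """Illustrative model of a CTC: exactly one target port, many initiators and LUNs."""
    target_port: str
    initiators: set = field(default_factory=set)
    luns: set = field(default_factory=set)


class CtcRegistry:
    """Tracks containers so that a target port can belong to only one CTC."""

    def __init__(self):
        self._by_target = {}

    def create(self, target_port: str) -> CryptoTargetContainer:
        if target_port in self._by_target:
            raise ValueError(f"target port {target_port} already belongs to a CTC")
        ctc = CryptoTargetContainer(target_port)
        self._by_target[target_port] = ctc
        return ctc


# Hypothetical WWNs for illustration.
registry = CtcRegistry()
ctc = registry.create("50:06:01:60:00:00:00:01")
ctc.initiators.update({"10:00:00:00:c9:00:00:01", "10:00:00:00:c9:00:00:02"})
ctc.luns.update({0, 1, 2})   # only LUNs that require encryption are added
```

LUNs that do not require encryption are simply left out of the container, which mirrors the note above about making the storage port accessible to other hosts through zoning.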
The next encryption process to consider is the rekey operation, in which a LUN is re-encrypted using a different key. There are two basic scenarios that may force a rekey operation: a compromised key or a security policy requirement. If a key is deleted or stolen, it is compromised and the data encrypted using this key can no longer be considered secure. The security or risk management department may also implement a policy requiring that all keys be refreshed on a specified schedule, such as every 36 months. This is often done out of concern that keys may have been compromised without the organization's knowledge, and it errs on the side of caution by forcing a rekey of all encrypted data after a defined period of time. Rekeying can be performed automatically by setting an expiration date on a key through the key management system.

In-place rekeying is not possible for tape, since a tape drive is a streaming device and the media itself is flexible. Rekeying data on a tape involves copying it to a new tape and encrypting it with a different key as the data is copied. In the case of disk media, the process is much simpler, since the LUN with the compromised key can be rekeyed in-place and online if necessary. During the rekey operation, the LUN actually has two keys assigned to it. Once the rekey process is completed, the original key is simply discarded. As with a first-time encryption, the rekey operation can be performed online or offline.
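The idea that a LUN holds two keys during an in-place rekey can be sketched as follows. This is a simplified model under stated assumptions: a progress marker separates blocks already re-encrypted with the new key from blocks still under the old key, and a SHA-256-based XOR keystream stands in for AES-256 so that the example needs only the standard library. The toy cipher is not secure, and the model is not a description of the actual Brocade implementation.

```python
import hashlib

BLOCK_SIZE = 32  # bytes per logical block in this toy model


def toy_cipher(key: bytes, lba: int, data: bytes) -> bytes:
    """XOR data with a SHA-256 keystream. Toy stand-in for AES-256; the same call decrypts."""
    stream = hashlib.sha256(key + lba.to_bytes(8, "big")).digest()
    return bytes(a ^ b for a, b in zip(data, stream))


class Lun:
    def __init__(self, plaintext_blocks, key: bytes):
        self.current_key = key
        self.new_key = None      # set only while a rekey is in progress
        self.rekeyed_upto = 0    # blocks below this index already use new_key
        self.blocks = [toy_cipher(key, i, b) for i, b in enumerate(plaintext_blocks)]

    def _key_for(self, lba: int) -> bytes:
        if self.new_key is not None and lba < self.rekeyed_upto:
            return self.new_key
        return self.current_key

    def read(self, lba: int) -> bytes:
        """Online read: choose the key according to rekey progress."""
        return toy_cipher(self._key_for(lba), lba, self.blocks[lba])

    def start_rekey(self, new_key: bytes) -> None:
        self.new_key = new_key
        self.rekeyed_upto = 0

    def rekey_step(self) -> bool:
        """Re-encrypt one block in place; return True once the whole LUN is done."""
        if self.new_key is None:
            raise RuntimeError("no rekey in progress")
        lba = self.rekeyed_upto
        plain = toy_cipher(self.current_key, lba, self.blocks[lba])
        self.blocks[lba] = toy_cipher(self.new_key, lba, plain)
        self.rekeyed_upto += 1
        if self.rekeyed_upto == len(self.blocks):
            # Rekey complete: the original key is discarded.
            self.current_key, self.new_key = self.new_key, None
            self.rekeyed_upto = 0
            return True
        return False


data = [bytes([i]) * BLOCK_SIZE for i in range(4)]
lun = Lun(data, b"old-key")
lun.start_rekey(b"new-key")
lun.rekey_step()
lun.rekey_step()                                  # halfway: both keys are in use
mid_reads = [lun.read(i) for i in range(4)]
while not lun.rekey_step():
    pass
final_reads = [lun.read(i) for i in range(4)]
```

Reads return the same plaintext before, during, and after the rekey, which is what makes an online rekey possible in this model.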
Clustering is commonly used to ensure protection against hardware failure. There are two types of clusters for Brocade encryption solutions, which can be used independently or simultaneously. The High Availability (HA) cluster provides hardware redundancy for the encryption devices. The DEK cluster allows two or more encryption devices to share the same keys.
HA Cluster
The HA cluster is an active-passive clustering configuration in which one encryption device is a warm standby for the other encryption device it is paired with. Only two encryption devices can form an HA cluster and they must exist within the same fabric. Heartbeats are exchanged between the encryption devices using redundant Gigabit Ethernet ports through an out-of-band dedicated network to let the other know it is still alive. This same dedicated network is used to synchronize key state information between the units to allow one to take over for the other at any given time. Since the HA cluster uses an active-passive configuration per CTC, it is more efficient to balance the load across both encryption devices instead of having the entire load on one unit with the other doing nothing until the active unit fails. It is possible for each encryption device to be active simultaneously and carry its own encryption load. In this case each unit is active with its load and passive while waiting for the other unit to fail over. In the event that one encryption device fails, it is important to consider the available bandwidth on the other cluster member. For example, say that Encryption Device A in the cluster is currently pushing 52 Gbps of traffic and Encryption Device B is pushing 61 Gbps. If Encryption Device B fails, Encryption Device A will take over the CTCs. Since Encryption Device A is already pushing 52 Gbps and now has an additional 61 Gbps, for an aggregate of 113 Gbps of traffic, this exceeds the 96 Gbps capability of the encryption device.
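The failover arithmetic in the example above is easy to check before deployment. The sketch below assumes the 96 Gbps disk encryption capacity stated in this chapter; the function name is hypothetical.

```python
CAPACITY_GBPS = 96  # disk encryption bandwidth of one full-featured encryption device


def failover_headroom(load_a: float, load_b: float, capacity: float = CAPACITY_GBPS) -> float:
    """Return spare capacity (Gbps) on the surviving device after a failover.

    A negative result means the surviving device would be oversubscribed.
    """
    return capacity - (load_a + load_b)


# Values from the example in the text: 52 Gbps on Device A, 61 Gbps on Device B.
headroom = failover_headroom(52, 61)   # 96 - 113 = -17 Gbps
oversubscribed = headroom < 0
```

A simple planning rule that follows from this is to keep the combined load of both HA cluster members at or below the capacity of a single device.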
DEK Cluster
By definition, the members of a DEK cluster share the same data encryption keys (DEKs) with all other encryption devices within a cluster management group. An encryption group contains several encryption devices that share the same DEKs. For each encryption group, one encryption device must be designated as the group leader. The group leader is responsible for functions such as distributing the configuration to the other members of the group, authenticating with the key vaults, and configuring CTCs.
[Figure: encryption group with a primary key management appliance or key vault, a DEK group, Fabric 1 and Fabric 2, Encryption Devices 1 through 4, HA Cluster 1, HA Cluster 2, and the cluster LAN]
Key Management
Once data is encrypted onto storage media, the keys become highly critical, and appropriate measures must be taken to protect and manage them throughout their lifecycle. Keys need to be backed up because they can be lost, stolen, destroyed intentionally, or expire after a predetermined period of time. Loss of the encryption keys is equivalent to losing the data. Unlike keys for data-in-flight, the keys for data-at-rest must be available for relatively long periods of time, depending on the type of information being encrypted. With patient health records, for example, the information may be kept for the lifetime of a patient, which can be over 100 years. Keys can also be stolen or compromised, in which case the information would have to be re-encrypted using a different key to ensure the confidentiality of the information. Media such as disk and tape also have a limited shelf life and go through evolution cycles to an eventually incompatible format (remember 8-track tapes and floppy disks?). The information needs to be refreshed as the media expires and must be re-encrypted using the same or a different key.

For redundancy, a typical key vault will be implemented with two or more units to prevent single points of failure. If the primary key vault becomes unavailable, the secondary or other key vault can accept or provide keys to the encryption device. The following key management solutions are currently supported:
• NetApp Lifetime Key Management (LKM)
• EMC RSA Key Manager (RKM)
• HP Secure Key Manager (SKM)
Future releases of Fabric OS will include other key management solutions from other vendors. Brocade is also involved in the development of key management standards. Two standards in particular are being developed: the IEEE 1619.6 key management standard and the OASIS KMIP (Key Management Interoperability Protocol). Once these standards are ratified, they will be supported with the Brocade encryption solution.

Brocade encryption devices generate the actual data encryption key and store it locally in their cache. The DEK is used to encrypt data using the AES-256 encryption algorithm. Before any data encryption begins, the key must be backed up to a key vault, or key manager, and then placed in the local cache before it can be used. Subsequently, the DEK is exchanged with the other members in the encryption group.

When a new LUN, tape media, or LUN with existing cleartext data is encrypted, the Brocade encryption device generates a DEK. This key is then backed up to the primary key vault and to the secondary key vaults if they exist. Once the primary key vault has successfully stored the DEK, it confirms this to the encryption device. The DEK is then synchronized with all of the other members in its encryption group, as shown in Figure 49. Only once all of this has occurred will the new key be used for the encryption process.
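The ordering constraint described above (generate, back up, confirm, synchronize, and only then use) can be sketched in Python. All class and function names are hypothetical, and for brevity the sketch hands the raw DEK to the vault object; an actual deployment protects the key in transit and at rest.

```python
import secrets


class KeyVault:
    """Illustrative key vault that confirms each successful backup."""

    def __init__(self):
        self._store = {}

    def backup(self, key_id: str, dek: bytes) -> bool:
        self._store[key_id] = dek
        return True  # confirmation returned to the encryption device

    def has(self, key_id: str) -> bool:
        return key_id in self._store


class EncryptionDevice:
    def __init__(self, name: str):
        self.name = name
        self.cache = {}  # local DEK cache: key id -> DEK


def provision_dek(device, group_members, primary_vault, secondary_vaults=()):
    """Generate a DEK and make it usable only after backup and synchronization."""
    dek = secrets.token_bytes(32)      # 256-bit key for AES-256
    key_id = secrets.token_hex(8)      # hypothetical key identifier

    # Back up to the primary vault and wait for its confirmation.
    if not primary_vault.backup(key_id, dek):
        raise RuntimeError("primary key vault did not confirm the backup")
    for vault in secondary_vaults:
        vault.backup(key_id, dek)

    # Only now place the key in the local cache and synchronize the group.
    device.cache[key_id] = dek
    for member in group_members:
        member.cache[key_id] = dek
    return key_id


dev1, dev2 = EncryptionDevice("dev1"), EncryptionDevice("dev2")
primary, secondary = KeyVault(), KeyVault()
new_key_id = provision_dek(dev1, [dev2], primary, [secondary])
```

If the primary vault does not confirm the backup, the function raises before any cache is populated, so a key that exists nowhere else is never used to encrypt data.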
Figure 49. DEK synchronization (labels: LAN; primary key vault; step 2, DEK backed up to primary key vault; step 5, DEK synchronized with encryption group members)

LKM. Decru was a privately owned company that offered storage security solutions; it was acquired by NetApp in June 2005. The DataFort appliance is the NetApp hardware-based encryption appliance for data-at-rest, with one appliance for disk and another for tape. The key management component is the Lifetime Key Management (LKM) system, which was originally offered as a software-only solution and is now available as a FIPS 140-2 Level 3-compliant hardened appliance (NetApp KM500 appliance). The LKM solution uses a trusted key exchange method to transport the keys between the encryption device and the LKM key vault. Keys are encrypted using a key encryption key called a domain key. The key exchange is performed using a link key based on a shared secret using DH key exchange. The key is encrypted in transit, decrypted at the other end, and then re-encrypted before it is stored in the key vault.
Figure 50. LKM key exchange process (labels: encryption engine; cleartext DEK; wrapped DEK; SSL; LKM; domain key)

RKM. When RSA was formally acquired by EMC in September 2006, it was propelled into the storage space. The RSA Key Manager (RKM) is available through RSA in a software version or a hardened appliance version. RKM uses an opaque key exchange method (explained in "Opaque Key Exchange" on page 89) to exchange the keys between the Brocade encryption device and the key vault. To ensure confidentiality during the key exchange, the DEK is encrypted using a Brocade key encryption key, or master key, that is not known by RKM. The master key is stored locally on the Brocade encryption device and is backed up on a recovery card using shares. When generating the master key, a pre-established passphrase (several passwords) is required. The encrypted, or wrapped, DEK is sent from the encryption device to the key vault using a secure SSL link. Once the RKM receives the key, it is stored as is in its encrypted form. Figure 51 illustrates the RKM key exchange process.
Figure 51. RKM key exchange process (labels: wrapped data encryption key; encryption engine; cleartext data encryption key; SSL-protected link. Encrypted keys are stored in the RKM database before any data can be encrypted. The FIPS 140-2 Level 3 security boundary defines where keys must be wrapped (encrypted) before they leave.)

SKM. The HP StorageWorks Secure Key Manager provides secure centralized encryption key management services for HP LTO-4 enterprise tape libraries and Brocade encryption devices. It is a hardened appliance validated to the FIPS 140-2 standard. SKM also uses an opaque key exchange method in which the key is encrypted before being sent to the key vault. The key encryption key is stored locally on the Brocade encryption device. The encrypted, or wrapped, DEK is sent from the encryption device to the key vault. Once the SKM receives the key, the key is stored as is in its encrypted form. Figure 52 illustrates the SKM key exchange process.
Figure 52. SKM key exchange process (labels: Brocade encryption device; key encryption key; cleartext DEK; HP SKM appliance)
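The opaque key exchange used by RKM and SKM can be illustrated with a small Python sketch: the encryption device wraps the DEK with a master key that never leaves the device, and the vault stores the resulting blob without being able to read it. The wrap function here is a toy XOR construction built on SHA-256 so that the example runs with the standard library alone; it is an assumption made for illustration and is not the key wrap algorithm used by any of the products named above.

```python
import hashlib
import secrets


def toy_wrap(master_key: bytes, dek: bytes) -> bytes:
    """Toy key wrap: XOR the 32-byte DEK with SHA-256(master_key || nonce). Not secure."""
    nonce = secrets.token_bytes(16)
    pad = hashlib.sha256(master_key + nonce).digest()
    return nonce + bytes(a ^ b for a, b in zip(dek, pad))


def toy_unwrap(master_key: bytes, blob: bytes) -> bytes:
    """Reverse toy_wrap using the same master key."""
    nonce, body = blob[:16], blob[16:]
    pad = hashlib.sha256(master_key + nonce).digest()
    return bytes(a ^ b for a, b in zip(body, pad))


class OpaqueVault:
    """Stores wrapped keys as is; it never sees the master key or the cleartext DEK."""

    def __init__(self):
        self._blobs = {}

    def store(self, key_id: str, blob: bytes) -> None:
        self._blobs[key_id] = blob

    def fetch(self, key_id: str) -> bytes:
        return self._blobs[key_id]


master_key = secrets.token_bytes(32)   # stays on the encryption device
dek = secrets.token_bytes(32)          # cleartext DEK inside the crypto boundary

vault = OpaqueVault()
vault.store("key-0001", toy_wrap(master_key, dek))   # hypothetical key id

recovered = toy_unwrap(master_key, vault.fetch("key-0001"))
```

The design consequence is that losing the master key makes every wrapped DEK in the vault unusable, which is why the text stresses backing up the master key on recovery cards.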
DataFort Compatibility Mode. As mentioned previously, the NetApp DataFort encryption appliance has historically been the market leader in the storage encryption space. NetApp and Brocade established a strategic relationship and the next-generation DataFort appliance is actually the Brocade encryption solution. One of the challenges to making this happen was what to do with existing DataFort customers who have thousands of tapes previously encrypted using the DataFort product. The solution was to create a DataFort compatibility mode in the Brocade encryption solution to read media previously encrypted with the DataFort appliance. The DataFort compatibility mode can read either disk or tape media and can also write to new tapes or existing LUNs encrypted with the DataFort format. The DataFort compatibility mode does several things. The Brocade encryption device uses the ECB mode of operation for the AES-256 encryption algorithm, which is used by the DataFort product. The metadata format used by the DataFort product replaces the native format used by the Brocade encryption device. The compression algorithm is the same on both platforms so there is nothing special which must be done for compression. The DataFort compatibility mode enables an easy migration from the DataFort product to the new Brocade encryption solution, which will also integrate with the NetApp LKM key management solution already deployed with the DataFort encryption appliance. However, customers using earlier versions of the LKM, which was software-based, need to upgrade to the NetApp KM500 4.0 or later appliance.
Encrypting with Backup Applications. Although only the payload in the frame is encrypted, special considerations must be taken to adapt to each backup software vendor. There are two basic elements in a backup solution that an encryption solution must understand. The first is how the backup application writes its metadata to the tape media. This determines where to place the key information on the media for later data recovery. Obviously, the actual unencrypted key is not stored on the tape media itself, which would be equivalent to sliding a spare house key under the front doormat. In fact, only an index referring to the key is written to the tape media as part of the tape header written by the backup application.

The second is how each backup application handles tape pools. Keys can be assigned either on a per-tape media basis or on a per-pool basis. As a best practice, assign one key per physical tape media to reduce the rekey overhead if a key gets compromised. Nevertheless, for some special situations, it may be useful to use one key per pool. For example, if a set of tapes is to be sent to a third party, perhaps for auditing purposes, a single key could be used for the entire tape set to simplify the reading of the tapes at the other end.

The following backup software solutions will be supported in FOS 6.3 and later:
• Symantec (formerly Veritas) NetBackup
• IBM Tivoli Storage Manager (TSM)
• EMC (formerly Legato) NetWorker
• CommVault Galaxy Data Protection
• HP Data Protector
• BakBone NetVault
• CA ARCserve
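The trade-off between per-tape and per-pool keys discussed above comes down to how many tapes must be rewritten when one key is compromised. The following Python sketch uses hypothetical tape labels and key identifiers to show the difference.

```python
def tapes_to_rekey(assignments: dict, compromised_key_id: str) -> list:
    """Return the tape labels whose data was encrypted with the compromised key."""
    return sorted(tape for tape, key_id in assignments.items()
                  if key_id == compromised_key_id)


# Hypothetical five-tape set under each policy (tape label -> key identifier).
per_tape = {f"TAPE{i:03d}": f"key-{i}" for i in range(1, 6)}
per_pool = {f"TAPE{i:03d}": "key-pool-A" for i in range(1, 6)}

affected_per_tape = tapes_to_rekey(per_tape, "key-3")        # one tape to copy and re-encrypt
affected_per_pool = tapes_to_rekey(per_pool, "key-pool-A")   # the entire pool
```

With one key per tape, a compromise touches a single cartridge; with one key per pool, every tape in the pool has to be copied to new media under a new key.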
Figure 54. Brocade FS8-18 Encryption Blade internal architecture (labels: front panel; backplane (BP); two Condor 2 ASICs; crypto boundary; battery)

The components described in the following three sections are enclosed within a physical crypto boundary. The security boundary is designed to comply with the FIPS 140-2 standard at Level 3 to isolate all hardware components involved in the processing of cleartext keys. The encryption switch cover is the physical crypto boundary for the Brocade Encryption Switch, and the encryption blade has a special cover that covers the necessary hardware on the card.
Battery
A Lithium-ion battery is used when there is no power to the encryption device. This battery has a life span of approximately seven years after power has been removed from the encryption device. It is used primarily to sustain the FIPS 140-2 Level 3 tamper response mechanism, which zeroizes the keys stored in the local cache. The remaining components are found outside the security boundary.
Condor 2 ASIC
The Condor 2 ASIC features forty 8 Gbps ports and is the heart of the FC Layer 2 switching. Each encryption device has two Condor 2 ASICs.
Management Interfaces
The Brocade encryption solution can be managed and configured either with the FOS CLI or with Brocade DCFM Enterprise, as well as with DCFM Pro/Pro+ as of FOS 6.3.0. As a best practice, it is highly recommended that you use DCFM. The CLI requires many more steps to perform operations that can be done with one click in DCFM, and having to type many commands increases the risk of typing errors. Also, the DCFM interface provides wizards that guide you through the configuration process to further reduce the risk of errors introduced by improper sequencing of commands.

The management interfaces should never be accessed using unsecure protocols such as telnet for the CLI or HTTP for DCFM. Use secure protocols, such as SSH instead of telnet and HTTPS instead of HTTP, and block or disable their equivalent unsecure services.

For additional protection, the System Card, or ignition key, feature should be implemented so that a Smart Card is required to enable the encryption capability of the switch. This will prevent someone who steals both the switch and the disk media from being able to decrypt the data on the storage media. Of course, it is also important to store the System Card in a secure location away from the encryption switch and storage media.
Availability
As with any IT solution, there are many ways to ensure availability. Choosing the best method to maintain availability depends on the value of the information (and impact of a loss of availability), the risk and probability of disruption, and the cost of implementing high availability.
Clustering
Clustering is a commonly used method to ensure protection against hardware failure. There are two types of clusters for Brocade encryption solutions, which can be used independently or simultaneously. The High Availability (HA) cluster provides hardware redundancy for the encryption devices. The Data Encryption Key (DEK) cluster allows two or more encryption devices to share the same keys.

For tape encryption using a single fabric, a single encryption device could be sufficient. However, some organizations consider the backup application mission-critical or high priority because of a service level agreement that must be respected. If this is true, a business case can be made to justify the use of a second encryption device to form an HA cluster.

For disk encryption using a dual-fabric configuration, the minimum requirement is one encryption device per fabric. In the event of the failure of one encryption device, the MPIO software on the host automatically fails over the traffic to the remaining path. This may result in degraded performance in some heavily used systems, which may or may not be acceptable. If it is not acceptable, then add a second encryption device in each fabric to form two HA clusters.

For redundancy, it is good practice to implement more than one path from the disk storage device to the fabric. If more than one path exists in the same fabric from a host to a LUN, then it is important to use FOS 6.3 or later when performing a first-time encryption or a rekey operation. Multipath rekeying operations through a single encryption engine are not supported prior to FOS 6.3.
Performance
As explained earlier, the latency of the Brocade encryption devices is practically negligible compared to the time it takes to complete an I/O operation. However, a complex fabric may have multiple ISLs and offer many paths between the various devices within the fabric. As discussed earlier, the frame redirection feature can automatically redirect frames to the encryption device regardless of where it is located in the fabric. However, certain locations for the encryption devices offer the best performance. The basic concept of locality applies to the encryption solution as well. Locality simply states that a host and its storage devices should be located as closely as possible given a specific architecture. For example, the highest locality occurs when a host and its storage device are connected to the same switch in a fabric or the same blade in a director or backbone. Essentially, SAN placement of the encryption devices should be done as close as possible between the host and its storage devices. To avoid forcing traffic to pass through ISLs, a backbone can be used to consolidate multiple switches. The Brocade FS8-18 Encryption Blade in a Brocade DCX or DCX-4S does not require ISLs to perform the encryption and all the traffic to be encrypted passes through the backplane.
Key Management
Key Expiration. Part of managing the keys is determining how long a key should exist. Many organizations never expire a key, while others require expiration every six months. There is no general rule as to the frequency of key expiration; it depends entirely on the business requirements and tolerance to the risk of a 256-bit key going stale. Since an online rekey operation can affect application performance and an offline rekey requires downtime, most organizations would rather not perform a rekey too often. Generally, it is considered safe to expire 256-bit keys every two to four years.

Key Per Media v. Pool. For tape encryption, a single DEK can be assigned to one tape media or to an entire pool of tapes. Best practice is to have one DEK per tape media. In the event the DEK is compromised, it is much simpler to create a new backup for one tape as opposed to an entire pool of tapes.
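A key expiration policy such as the one discussed under Key Expiration can be evaluated with a simple date check. The Python sketch below is a minimal illustration; the function name and the three-year policy are hypothetical, and an actual deployment would set the expiration through the key management system.

```python
from datetime import date


def key_expired(created: date, today: date, max_age_years: int) -> bool:
    """Return True if a key created on `created` has reached its maximum age."""
    try:
        expiry = created.replace(year=created.year + max_age_years)
    except ValueError:
        # Key created on February 29 and the expiry year is not a leap year.
        expiry = created.replace(year=created.year + max_age_years, day=28)
    return today >= expiry


# Hypothetical policy: expire 256-bit keys after three years.
due = key_expired(date(2009, 4, 1), date(2012, 4, 1), max_age_years=3)
not_due = key_expired(date(2009, 4, 1), date(2011, 12, 31), max_age_years=3)
```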
used for cross-site backups where data stored at one site is backed up to a tape library located at another site. Figure 55 demonstrates how the data-in-flight for a cross-site backup can be encrypted using a data-at-rest encryption solution.
Figure 55. Encrypted cross-site backup (diagram labels: Site A, Ciphertext, Encrypted frame payload, Site B, Servers, Tape library)

Similarly, this same strategy could be used for data replication between two sites. Figure 56 illustrates how a data-at-rest encryption solution can be used to encrypt data on the dark fiber during data replication. In this case, the data stored on the primary data center is encrypted using the encryption device. The disk-to-disk replication application (such as EMC SRDF or IBM PPRC) will simply copy the data which is already in ciphertext format to the alternate site where it will be stored as is in ciphertext.
Figure 56. Encrypting data over dark fiber with data-at-rest encryption (diagram labels: Site A, Ciphertext, Site B, Servers, Tape library)
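The replication case can be sketched in a few lines of Python. This is a toy model under loud assumptions: the SHA-256 counter-mode keystream below is only a stand-in for the real cipher used by an encryption device and is not secure, and the all-zero DEK, the nonce, and the function names are hypothetical. The point it illustrates is simply that the replication step copies bytes that are already ciphertext, so nothing readable crosses the link or lands on the alternate site.

```python
import hashlib

def keystream(key, nonce, length):
    """Toy SHA-256 counter-mode keystream. A stand-in for a real cipher, NOT secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_crypt(key, nonce, data):
    """Encrypt or decrypt by XORing the data with the keystream (same operation both ways)."""
    ks = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

def replicate(block):
    """Disk-to-disk replication modeled as a plain byte-for-byte copy."""
    return bytes(block)

dek = bytes(32)                       # placeholder 256-bit DEK (all zeros, demo only)
nonce = b"lun0-block42"               # hypothetical per-block value
plaintext = b"payroll record 0001"

primary_disk = xor_crypt(dek, nonce, plaintext)   # encrypted before it is stored
on_the_wire = replicate(primary_disk)             # what travels over the dark fiber
alternate_disk = on_the_wire                      # stored as is at the alternate site

if __name__ == "__main__":
    print(on_the_wire.hex())
    print(xor_crypt(dek, nonce, alternate_disk))  # readable only with the DEK
```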
Chapter Summary
Brocade provides encryption solutions for both data-at-rest and data-in-flight. The Brocade Encryption Switch and the Brocade FS8-18 Encryption Blade for the Brocade DCX backbone family can be used for both disk and tape media to encrypt data-at-rest. The Brocade Encryption Switch is a regular 8 Gbps Layer 2 FC platform and, when used in encryption mode, provides robust encryption in combination with third-party key management. The addition of a Smart Card reader for an ignition key provides additional security.

Brocade offers data encryption for data-in-flight in the Brocade 7500 Extension Switch and Brocade FR4-18i Extension Blade, both of which support IPSec for encryption of data transported over an FCIP tunnel. The Brocade data-at-rest encryption solution, described in detail in this chapter, can also be used to encrypt data-in-flight: the encryption device in the primary data center encrypts the frame payload before sending it over the dark fiber connection.
A
Security Level I I I I A I B B A
3.1.0
A A A A A
Security Feature Secure File Copy (SCP) for configuration upload Secure File Copy (SCP) for firmware download Secure File Copy (SCP) for supportSave SecTelnet Telnet disable Telnet timeout Web Tools timeout Secure passwords (centralized control via RADIUS/CHAP) RSA RADIUS server RADIUS password expiration RADIUS source IP address information LDAP LDAP in FIPS mode Multiple User Accounts (MUA, up to 15) Multiple User Accounts (MUA, up to 255) Role Based Access Controls (RBAC) Admin, User, Switch Admin roles RBAC Operator, Zone Manager, Fabric Admin, Basic Admin roles added RBAC Security Admin role added RBAC permission violation (message ID: SEC-3047) Admin lockout policy Boot PROM password reset Password hardening policies Upfront login in Web Tools
FOS 2.x
FOS 3.x
Security Level I I I I I B B A A A A A I I I I I A I A B B
2.6
3.1
4.1 4.4
2.6
3.1
4.1 6.2
3.2
3.2
4.4 5.2 5.0.1 5.2 5.3 6.0 5.3 6.0 5.1 5.0.1 Default in 5.2
Security Feature Login banner Monitor attempted security breaches (via Audit Logging) Monitor attempted security breaches (via Fabric Watch Security Class) FC Security Policies - Device Connection Control/Switch Connection Control (DCC/SCC) policies Management access controls IP Filters (IPF) Trusted Switch (FCS) central security management FCS policy (without SFOS) AUTH policy Management Access Controls (SNMP, Telnet, FTP, Serial Port, Front Panel) Zoning Hardware-enforced zoning by WWN and Domain/Port ID Default zoning Insistent domain IDs RSCN suppression/aggregation Configurable RSCN suppression by port Event auditing Change tracking Firmware change alerts in Fabric Manager E_Port disable (portCfgEPort) Persistent port disable (E/F/FL/Ex/M_Ports)
Security Level B A A A
5.2
A A A A
2.6
3.1
A B B
5.1 4.2 3.1 4.1 5.0.1 5.2 2.4 3.0 4.0 4.4 2.6 2.6.1 3.2 3.2 4.2 4.2
I I B O I I A I I
Security Feature Administrative Domains (AD) Logical Switch/Logical Fabric/Base Fabric/Default Fabric (replaces AD) IPSec (7500 only) IPSec to secure management interfaces IPv6 IPv6 auto-configuration IPv6 for IPSec Security DB size increased to 1 MB (from 256 K) FIPS mode (140-2 level 2) USB port disable/enable Fabric-based encryption for data-at-rest Hash authentication of firmware (signed firmware) Integrated Routing (IR) Traffic Isolation zones (TI)
FOS 2.x
FOS 3.x
FOS 4.x+ 5.2 6.2 5.3 6.2 5.3 6.2 6.2 6.0 6.0 6.0 6.1.1 _enc 6.1.0 6.1.1 6.0
Security Level A A O O O O O
A B O A O O
FCIA
The Fibre Channel Industry Association (FCIA) is a mutual-benefit, nonprofit international organization of manufacturers, system integrators, developers, vendors, industry professionals, and end users. The FCIA is committed to delivering a broad base of Fibre Channel infrastructure technology to support a wide array of applications in the mass storage and IT-based arenas. FCIA working groups and committees focus on specific aspects of the technology, targeting both vertical and horizontal markets and including data storage, video, networking, and SAN management. The FCIA is also responsible for managing events such as interoperability testing (for example, the plug-tests held at the University of New Hampshire) and Fibre Channel technology demonstrations at industry events such as SNW (Storage Networking World). For more information, visit the FCIA Web site: www.fibrechannel.org
IEEE
The Institute of Electrical and Electronics Engineers (IEEE) has developed a wide variety of standards related to security. The IEEE 1619 Security in Storage Working Group (SISWG) develops standards for encrypting storage media for data-at-rest. SISWG has developed standards for disk-drive-based encryption (IEEE 1619), tape-based encryption (IEEE 1619.1), and key management (IEEE 1619.3). SISWG operates as a project under the IEEE Computer Society Information Assurance Standards Committee.
1619.3: Standard for Key Management Infrastructure for Cryptographic Protection of Stored Data. This is an API standard between a key management server and client (incomplete at the time of writing). For more information on SISWG, visit the SISWG Web site: https://siswg.net/
ANSI T11
The American National Standards Institute (ANSI) is the voice of the US standards and conformity assessment system and was formally recognized as such in 1970. T11 is the ANSI technical committee defining the Fibre Channel protocols and physical layer. The Fibre Channel Security Protocol (FC-SP) defines methods of authorizing, authenticating, and encrypting Fibre Channel interfaces for a fabric. To claim compliance with FC-SP, devices need to support authentication via the Diffie-Hellman Challenge Handshake Authentication Protocol (DH-CHAP), which provides mutual authentication between end devices and switches. Fibre Channel Framing and Signaling 2 (FC-FS-2) defines the structure of the Fibre Channel frame that conveys the Encapsulating Security Payload (ESP) header as defined in Request for Comments (RFC) 4303. For more information, visit the ANSI T11 Web site: http://www.t11.org/index.html
SNIA
The Storage Networking Industry Association (SNIA) is a not-for-profit organization that was incorporated in 1997. Although it is not directly involved in the development of standards, it acts as a catalyst for the development of storage solution specifications and technologies, global standards, and storage education. It is composed of individuals representing member companies that work together to further advance the storage industry. For more information, visit the SNIA Web site: http://www.snia.org/home
SNIA also has various technical workgroups and forums addressing specific areas of storage. The Storage Security Industry Forum (SSIF) focuses specifically on storage security issues. This forum has created several valuable documents with the help of various industry contributors. For more information, visit the SSIF Web site: http://www.snia.org/forums/ssif

The technical working groups support the SNIA mission by delivering information and standards that accelerate the adoption of storage networking. Specifically, the SNIA Security Technical Workgroup (TWG) helps drive some of the standards addressing storage security issues. Its focus is not only on Fibre Channel security but on any security inherent in underlying transports or technologies. For more information, visit the SNIA TWG Web site: http://www.snia.org/tech_activities/workgroups/
IETF
The Internet Engineering Task Force (IETF) has the large job of securing the Internet. The Security Area of the IETF defines security protocols for a variety of techniques to authorize, authenticate, encrypt and manage various aspects of data exchanges. From Public Key Infrastructure (X.509) to Mail Security (S/MIME), the IETF addresses many aspects of security. For more information on security in the IETF, visit: http://trac.tools.ietf.org/area/sec/trac/wiki
OASIS
The Organization for the Advancement of Structured Information Standards (OASIS) is a consortium that drives the development, convergence and adoption of open standards. OASIS has developed a number of security related standards for identity management, key management and web service security. The Key Management Interoperability Protocol (KMIP) defines an interface between encryption devices that consume keys and the key management system that manages the keys. For more information, visit the OASIS Web site: http://www.oasis-open.org/home/index.php
Brocade Bookshelf www.brocade.com/bookshelf