
Data Efficiency

Nutanix Tech Note

Version 1.1 • July 2019 • TN-2032



Copyright
Copyright 2019 Nutanix, Inc.
Nutanix, Inc.
1740 Technology Drive, Suite 150
San Jose, CA 95110
All rights reserved. This product is protected by U.S. and international copyright and intellectual
property laws.
Nutanix is a trademark of Nutanix, Inc. in the United States and/or other jurisdictions. All other
marks and names mentioned herein may be trademarks of their respective companies.


Contents

1. Executive Summary
2. Introduction
   2.1. Audience
   2.2. Purpose
3. Nutanix Enterprise Cloud Overview
   3.1. Nutanix Acropolis Architecture
4. Nutanix Data Efficiency
   4.1. Data Avoidance
   4.2. Data Reduction
5. Conclusion
Appendix
   About Nutanix
List of Figures
List of Tables


1. Executive Summary
Nutanix designed the Enterprise Cloud from the ground up to provide best-in-class reliability,
consumer-grade simplicity, and peerless efficiency. In this technical note, we discuss how
the Acropolis Distributed Storage Fabric (DSF) achieves data avoidance and data reduction using
techniques such as thin provisioning, intelligent cloning, compression, deduplication, and erasure
coding. These techniques accelerate application performance and optimize storage capacity.
They are intelligent and adaptive, requiring little or no fine-tuning in most cases, which reduces
operating expenses and frees your IT staff to focus on growth and innovation. Unlike traditional
storage architectures, the Nutanix web-scale design ensures that data efficiency techniques
scale as the cluster grows—node by node, with no bottlenecks, single points of failure, or
expensive proprietary hardware and software.


2. Introduction

2.1. Audience
This tech note is part of the Nutanix Solutions Library. We wrote it for IT architects and
administrators responsible for designing, managing, and supporting Nutanix infrastructures.
Readers should already be familiar with the Nutanix Enterprise Cloud.

2.2. Purpose
This document discusses how the Nutanix Enterprise Cloud provides data efficiency, including an
introduction to the following features:
• Data transformations and utilization.
• Thin provisioning.
• Intelligent cloning.
• Compression.
• Deduplication.
• Erasure coding.

Table 1: Document Version History

Version Number | Published     | Notes
1.0            | November 2017 | Original publication.
1.1            | July 2019     | Updated Nutanix overview.


3. Nutanix Enterprise Cloud Overview


Nutanix delivers a web-scale, hyperconverged infrastructure solution purpose-built for
virtualization and cloud environments. This solution brings the scale, resilience, and economic
benefits of web-scale architecture to the enterprise through the Nutanix Enterprise Cloud
Platform, which combines three product families—Nutanix Acropolis, Nutanix Prism, and Nutanix
Calm.
Attributes of this Enterprise Cloud OS include:
• Optimized for storage and compute resources.
• Machine learning to plan for and adapt to changing conditions automatically.
• Self-healing to tolerate and adjust to component failures.
• API-based automation and rich analytics.
• Simplified one-click upgrade.
• Native file services for user and application data.
• Native backup and disaster recovery solutions.
• Powerful and feature-rich virtualization.
• Flexible software-defined networking for visualization, automation, and security.
• Cloud automation and life cycle management.
Nutanix Acropolis provides data services and can be broken down into three foundational
components: the Distributed Storage Fabric (DSF), the App Mobility Fabric (AMF), and AHV.
Prism furnishes one-click infrastructure management for virtual environments running on
Acropolis. Acropolis is hypervisor agnostic, supporting two third-party hypervisors—ESXi and
Hyper-V—in addition to the native Nutanix hypervisor, AHV.

Figure 1: Nutanix Enterprise Cloud


3.1. Nutanix Acropolis Architecture


Acropolis does not rely on traditional SAN or NAS storage or expensive storage network
interconnects. It combines highly dense storage and server compute (CPU and RAM) into a
single platform building block. Each building block delivers a unified, scale-out, shared-nothing
architecture with no single points of failure.
The Nutanix solution requires no SAN constructs, such as LUNs, RAID groups, or expensive
storage switches. All storage management is VM-centric, and I/O is optimized at the VM virtual
disk level. The software solution runs on nodes from a variety of manufacturers that are either
all-flash for optimal performance or a hybrid of SSD and HDD that balances performance with
additional capacity. The DSF automatically tiers data across the
cluster to different classes of storage devices using intelligent data placement algorithms. For
best performance, algorithms make sure the most frequently used data is available in memory or
in flash on the node local to the VM.
To learn more about the Nutanix Enterprise Cloud, please visit the Nutanix Bible and
Nutanix.com.


4. Nutanix Data Efficiency


Nutanix incorporates a wide range of storage optimization technologies that work in concert
to make efficient use of available capacity for any workload. These technologies are software
driven, massively scalable, and adaptive to workload characteristics, eliminating the need for
manual configuration and fine-tuning. The native compression, encoding, and deduplication
features substantially increase storage efficiency and improve performance because of larger
effective cache sizes in the performance tier.
True to Nutanix methodology, the capabilities that provide data efficiency are entirely software-
based. The Enterprise Cloud does not rely on special networking hardware, such as custom
ASICs or FPGAs, but rather performs all data efficiency computation using an industry-standard
server architecture. This approach facilitates regular product version upgrades, so Nutanix can
deliver improved capabilities and future enhancements as a simple software update.
In the following sections, we cover how Nutanix uses native data avoidance (thin provisioning
and intelligent cloning) and data reduction (compression, deduplication, and erasure coding)
techniques to handle data efficiently.

Note: Administrators can enable deduplication, compression, and erasure coding on
the same container. However, unless the data is a good candidate for deduplication
(as defined in the Elastic Deduplication Engine section), we recommend using only
compression.

In the following figure, we provide an example of the data reduction achieved using compression,
deduplication, and erasure coding, as well as the overall efficiency resulting from additional data
avoidance techniques. Capacity optimization metrics are available via Prism on the Storage
Overview page and for individual containers on their detail pages.


Figure 2: Sample Prism Capacity Optimization

4.1. Data Avoidance


Data avoidance technologies typically contribute the most to data efficiency because they prevent
the creation of unnecessary data, which also minimizes the need for more resource-demanding
data reduction technologies. With fewer back-end operations, more resources are available
for front-end (user-driven) operations and applications. As Nutanix enables its built-in data
avoidance technologies automatically, there is no need for manual configuration or fine-tuning.

Thin Provisioning
Thin provisioning is a simple and broadly adopted technology for increasing data capacity
utilization by overcommitting resources. The DSF enables this feature in all containers by default.
In deployments using the VMware ESXi hypervisor, containers are presented to hosts as natively
thin-provisioned NFS datastores. Although it is a widely accepted method for increasing capacity
utilization, thin provisioning traditionally has been associated with reduced storage performance.
However, on Nutanix, thin provisioning outperforms thick provisioning and is recommended for all
workloads.
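
Thin provisioning is conceptually similar to a sparse file: the logical size is promised up
front, while physical blocks are allocated only when data is written. The following minimal
sketch (Python on a Unix filesystem that supports sparse files; the path is illustrative and
this is not the DSF implementation) demonstrates the idea.

import os

path = "/tmp/thin-demo.img"                    # illustrative path

# Reserve a 10 GiB logical size without allocating physical blocks.
with open(path, "wb") as f:
    f.truncate(10 * 2**30)

st = os.stat(path)
print("logical size :", st.st_size)            # 10737418240 bytes
print("physical size:", st.st_blocks * 512)    # ~0 until data is written

# Physical capacity is consumed only when real data lands on disk.
with open(path, "r+b") as f:
    f.seek(5 * 2**30)
    f.write(b"x" * 4096)                       # allocates roughly one block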
Some applications, such as Oracle RAC and vSphere Fault Tolerance, require thick provisioning.
The DSF supports thick provisioning (eager zero or lazy zero thick) VMDKs via the VMware API
for Array Integration (VAAI) NAS reserve space primitive. Eager-zeroed thick VMDKs guarantee
space reservations in the DSF without actually writing the zeros to disk: the DSF acknowledges
each zero-write I/O and performs a simple metadata update instead.


Calculations for the Overall Efficiency metric account for savings from thin provisioning.
Administrators can view current thick provisioned capacity via Prism on the Storage Container
Details page.

Intelligent Cloning
The DSF provides native support for space-efficient, offloaded VM clones, which you can choose
to provision automatically via VAAI, View Composer for Array Integration (VCAI), and Microsoft
Offloaded Data Transfer (ODX), or interactively via nCLI, REST, or Prism. Clones take advantage
of the redirect-on-write algorithm, which is the most effective and efficient implicit virtual disk
sharing technique.
On the Nutanix platform, VMs store data as virtual disk files (vDisks). Each vDisk is composed
of logically contiguous chunks of data called extents. These extents are stored in physically
contiguous groups as files on storage devices. When you clone a VM, the system marks the base
vDisk read-only and creates another vDisk as read and write. At this point, both vDisks have the
same block map, which is a metadata mapping of the vDisk to its corresponding extents.
The following figure shows what happens when you clone a VM.

Figure 3: Example Clone Block Map

The system uses the same method for multiple clones of a VM or vDisk. Clone operations are
metadata only, so no I/O takes place. You can also create clones of clones the same way;
essentially, the previously cloned VM acts as the base vDisk. On cloning, the system locks the
base vDisk’s block map and creates two clones, with one block map for the previously cloned VM
and another block map for the new clone. There is no maximum number of clones.
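The following simplified model (hypothetical Python, not Nutanix source code) captures the
metadata-only behavior described above: cloning marks the base vDisk read-only and copies only
its block map, and subsequent writes land in the clone's own map.

class VDisk:
    """Simplified vDisk: a block map from logical offset to extent ID."""

    def __init__(self, block_map=None):
        self.block_map = dict(block_map or {})
        self.read_only = False

    def clone(self):
        # Metadata-only operation: lock the base vDisk and give the
        # clone its own copy of the block map. No data I/O takes place.
        self.read_only = True
        return VDisk(self.block_map)

    def write(self, offset, extent_id):
        if self.read_only:
            raise IOError("base vDisk is immutable after cloning")
        # New writes update only this vDisk's block map; extents still
        # referenced by other block maps remain shared and untouched.
        self.block_map[offset] = extent_id

base = VDisk({0: "extent-A", 1: "extent-B"})
clone1 = base.clone()
clone2 = base.clone()
clone1.write(1, "extent-C")        # redirect-on-write: only clone1 changes
assert clone2.block_map[1] == "extent-B"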
In the following figure, both clones inherit the prior block map, and the system creates new
individual block maps.


Figure 4: Multiclone Block Maps

Any new metadata write or update occurs in the new individual block maps.

Figure 5: Clone Block Maps: New Write

Any subsequent clones lock the original block map and create a new one for read and write
access.
Calculations for the Overall Efficiency metric account for intelligent cloning savings. You can view
this metric via Prism on the Storage Container Details page.

4.2. Data Reduction


The Nutanix Capacity Optimization Engine (COE) transforms data to increase data efficiency
on disk. The technologies that compose the engine increase the effective storage capacity of
a cluster by eliminating repetitive data as it is written to the system, or by transforming it using
mathematical functions post-process as a series of MapReduce jobs. Easily configured at the
container level with per-VM and even per-vDisk granularity, compression, deduplication, and
erasure coding give customers the flexibility to fine-tune data reduction techniques.

Compression
The DSF provides both inline and post-process compression to suit the customer’s data types
and application usage patterns. The inline method compresses sequential streams of data or
large I/O sizes (more than 64 KB) as they’re written to the capacity store (extent store), while
post-process compression initially writes the data in an uncompressed state and then uses the
Curator framework to compress it cluster-wide. A Nutanix system compresses all incoming write
I/O operations over 4 KB inline in the persistent write buffer (oplog). This approach enables you
to use oplog capacity more efficiently and helps drive sustained performance. From AOS 5.1
onward, post-process compression is enabled by default for newly created containers.
The DSF uses LZ4 and LZ4HC algorithms for data compression in AOS 5.0 and beyond. On
initial ingest, regular data is compressed using LZ4, which provides a very good balance between
compression and performance. As data cools, LZ4HC further compresses it to improve the
compression ratio.
We can characterize cold data into two main categories:
• Regular data: No read or write access for three days.
• Immutable data (snapshots): No read or write access for one day.
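
As a rough illustration of the two-stage algorithm choice (not the DSF code path), the
open-source python-lz4 bindings expose both modes; LZ4HC corresponds to a higher compression
level in the same frame format.

import lz4.frame

data = b"example payload " * 4096           # any compressible payload

# Initial ingest: fast LZ4 favors throughput over compression ratio.
fast = lz4.frame.compress(data)

# Once data turns cold, recompress with LZ4HC for a better ratio.
cold = lz4.frame.compress(
    data, compression_level=lz4.frame.COMPRESSIONLEVEL_MINHC)

print(len(data), len(fast), len(cold))      # raw >= LZ4 >= LZ4HC (typically)
assert lz4.frame.decompress(cold) == data   # lossless either way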
The following figure shows how inline compression interacts with the DSF write I/O path.


Figure 6: Inline Compression I/O Path

Tip: Almost always use inline compression (compression delay=0) because this
setting only compresses larger or sequential writes and does not impact random write
performance. In fact, inline compression typically increases effective performance by
increasing the usable size of the SSD tier. In addition, when the system replicates
larger or sequential data for protection, it can send compressed data, further
increasing performance because less data crosses the wire. Inline compression also
pairs perfectly with erasure coding.

When you enable post-process compression, all new write I/O follows the normal DSF I/O
path without compression. After the data meets the compression delay setting (which you can
configure in Prism), it is eligible for compression in the extent store. Post-process compression
tasks use the Curator MapReduce framework to distribute work across all nodes. Automatic
resource adjustments between front-end or user-driven and back-end operations ensure that
post-process compression tasks do not limit the performance of user applications.


The following figure shows how post-process compression interacts with the DSF write I/O path.

Figure 7: Post-Process Compression I/O Path

For read I/O operations on compressed data, the system first decompresses the data in memory,
then serves the I/O. Heavily accessed data is decompressed in the extent store and then moves
to the cache.
The following figure shows how decompression interacts with the DSF I/O path during read
operations.


Figure 8: Decompression I/O Path

To view current compression rates in Prism, hover over the Compression setting on the Storage
Container Details page.

Workloads Recommended for Compression


• Almost all.

Workloads Not Ideal for Compression


• Encrypted datasets.
• Already compressed datasets (for example, images, audio, or video).


Erasure Coding
The Nutanix platform relies on a replication factor for data protection and availability. This method
provides the highest degree of availability because it does not require data recomputation on
failure or reading from more than one storage location. However, because this type of data
protection requires full copies, it uses additional storage resources.
To provide a balance between availability and storage capacity consumption, the DSF offers the
ability to encode data using erasure codes (EC-X). Similar to the concept of RAID (levels 4, 5, 6,
and so on), EC-X encodes a strip of data blocks on different nodes and calculates parity. In the
event of a host or disk failure, the system can use the parity to decode any missing data blocks.
In the DSF, the data block is an extent group, and each data block must be on a different node
and belong to a different vDisk.
You can configure the number of data and parity blocks in a strip based on how many failures
you need to tolerate, or your preferred level of fault tolerance. The configuration is the <number
of data blocks> / <number of parity blocks>. For example, “replication factor 2–like” availability (n
+ 1) could consist of three or four data blocks and one parity block in a strip (in other words, 3:1
or 4:1). “Replication factor 3–like” availability (n + 2) could consist of three or four data blocks and
two parity blocks in a strip (in other words, 3:2 or 4:2).
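
For single-parity (n + 1) strips, the underlying idea is the same as RAID 5: parity is the XOR
of the data blocks, and a missing block is rebuilt by XORing the survivors. A minimal sketch
follows (EC-X operates on extent groups distributed across nodes; plain Python bytes stand in
here).

def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    out = bytearray(blocks[0])
    for block in blocks[1:]:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# A 4:1 strip: four data blocks (on different nodes) plus one parity block.
data_blocks = [bytes([n]) * 4096 for n in range(4)]
parity = xor_blocks(data_blocks)

# Simulate losing block 2 to a node failure, then rebuild it from the
# three surviving data blocks plus the parity block.
survivors = data_blocks[:2] + data_blocks[3:] + [parity]
assert xor_blocks(survivors) == data_blocks[2]

Double-parity (n + 2) strips require a second, independent parity (for example, Reed-Solomon
coding), because XOR alone can recover only a single missing block per strip.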
You can calculate maximum usable capacity as <number of data blocks> / <number of total
blocks>. For example, a 4:1 strip has 80 percent usable capacity, and a 4:2 strip has 66 percent
usable capacity. As the total strip size increases, the resulting usable capacity benefits diminish.
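
A short helper makes the arithmetic explicit and reproduces the ratios in the table below
(hypothetical code, shown for the calculation only).

def usable_fraction(data_blocks, parity_blocks):
    return data_blocks / (data_blocks + parity_blocks)

for n, k in [(2, 1), (3, 1), (4, 1), (3, 2), (4, 2)]:
    print(f"{n}:{k} strip -> {usable_fraction(n, k):.0%} of raw capacity")
# 2:1 -> 67%, 3:1 -> 75%, 4:1 -> 80%, 3:2 -> 60%, 4:2 -> 67%
# (the document rounds the 4:2 figure down to 66 percent)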
The following table characterizes usable capacity using different replication factors and encoded
strip sizes.

Table 2: Encoded Strip Sizes and Example Usable Capacity

Cluster Size (Nodes) | RAW    | Replication Factor 2 | EC-X 2:1 | EC-X 3:1 | EC-X 4:1 | Replication Factor 3 | EC-X 3:2 | EC-X 4:2
4                    | 80 TB  | 40 TB                | 53 TB    | N/A      | N/A      | 27 TB                | N/A      | N/A
5                    | 100 TB | 50 TB                | 67 TB    | 75 TB    | N/A      | 33 TB                | N/A      | N/A
6                    | 120 TB | 60 TB                | 80 TB    | 90 TB    | 96 TB    | 40 TB                | 72 TB    | N/A
7                    | 140 TB | 70 TB                | 93 TB    | 105 TB   | 112 TB   | 47 TB                | 84 TB    | 93 TB

(EC-X 2:1, 3:1, and 4:1 provide "replication factor 2–like" availability; EC-X 3:2 and 4:2 provide "replication factor 3–like" availability.)


Tip: Nutanix recommends that you always maintain a cluster size that is at least
one node greater than the combined strip size (data + parity) to allow space for
rebuilding the strips if a node fails. This sizing eliminates any computation overhead
on reads once the strips have been rebuilt (a process that the Nutanix Curator
service automates). For example, a cluster with a 4:1 strip should have at least six
nodes. The table above follows this best practice.

The Nutanix platform invokes EC-X post-process on write-cold data, using the Curator
MapReduce framework for task distribution. Because MapReduce is a post-process framework, it
does not affect the traditional write I/O path, and automatic resource adjustments between front-
end or user-driven and back-end operations ensure that EC-X tasks do not limit the performance
of user applications.

Note: Overhead on storage controllers and the network increases proportionally
as strip sizes increase. The risk that a drive or node failure can impact the strip also
increases. For this reason, we do not recommend strip sizes larger than 4:1 or 4:2.

Administrators can change the replication factor for containers with EC-X enabled. This option
provides greater flexibility to help customers to achieve their desired level of data protection
during the application life cycle. Containers can transition efficiently between replication factor
2 and replication factor 3, enabling the system to create or remove a resulting parity block
automatically without additional read-modify-write operations.
As of AOS 5.1, the Nutanix platform automatically increases or decreases the size of existing
erasure-coded strips during node additions and removals. This automation ensures that
protection overhead remains limited after node removals and that capacity utilization continues
to be optimized as cluster size increases. For example, 53 TB usable capacity across four nodes
using 2:1 strips (80 TB raw × 2/3) increases to 64 TB (80 TB raw × 4/5) when the system
dynamically increases the strip size to 4:1 after meeting the six-node minimum requirement.

Tip: You can override the default strip size (4:1 for “replication factor 2–like” or 4:2
for “replication factor 3–like”) using the nCLI, where N is the number of data blocks
and K is the number of parity blocks.
ctr [create/edit] … erasure-code=<N>/<K>
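
For example, a hypothetical invocation setting a 2:1 strip on an existing container (the
container name is illustrative; consult the nCLI reference for the exact syntax of your AOS
version):

ncli ctr edit name=my-container erasure-code=2/1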

The following figure illustrates the data layout of a normal environment using a replication factor.


Figure 9: Typical DSF Replication Factor Data Layout

In this scenario, we have a mix of both replication factor 2 and replication factor 3 data whose
primary copies are local and whose replicas are distributed to other nodes throughout the cluster.
When Curator runs a full scan, it finds eligible write-cold extent groups (data not written or
overwritten for seven days) that are available for encoding. After Curator finds the eligible
candidates, Chronos distributes and throttles the encoding tasks.
The following figure shows an example 4:1 and 3:2 strip.


Figure 10: DSF Encoded Strip: Before Encoding

Once the system has created the strips and calculated parity to encode data, it removes the
replica extent groups.
The following figure shows the environment and storage savings after EC-X is complete.


Figure 11: DSF Encoded Strip: After Encoding

To view current EC-X savings via Prism, hover over the Erasure Coding setting on the Storage
Container Details page.

Tip: EC-X pairs perfectly with inline compression; you can safely enable them
together for maximum efficiency.

Workloads Recommended for Erasure Coding


• Write once, read many (WORM) workloads.
• Backups.
• Archives.
• File servers.
• Log servers.
• Email (depending on usage).

Workloads Not Ideal for Erasure Coding


• Anything write- or overwrite-intensive.
• VDI.
Because of data-avoidance technologies such as intelligent cloning, VDI workloads are typically
very write-intensive but not capacity-intensive.


Elastic Deduplication Engine


The Elastic Deduplication Engine is a software-based feature of the DSF that deduplicates
data in both the extent store and unified cache tiers. The system fingerprints streams of data
during ingest using a SHA-1 hash at a 16 KB granularity. Unlike traditional approaches that use
background scans requiring data to be read again, this fingerprint occurs only on data ingest and
is then stored persistently as part of the written block’s metadata. The stored fingerprints allow
the Elastic Deduplication Engine to detect and remove duplicate copies easily, without scanning
or reading the data again.
To increase metadata efficiency, Nutanix tracks dedupability by monitoring fingerprint reference
counts. The system discards fingerprints with low reference counts, minimizing metadata
overhead. To minimize fragmentation, extent store deduplication prefers full extents.
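
The following sketch models ingest-time fingerprinting and reference counting (SHA-1 at a
16 KB granularity, as described above; the bookkeeping is illustrative rather than the DSF
metadata format).

import hashlib
import io
from collections import Counter

CHUNK = 16 * 1024                      # 16 KB fingerprint granularity

store = {}                             # fingerprint -> stored chunk
refcounts = Counter()                  # fingerprint -> reference count

def ingest(stream):
    """Fingerprint on write; store each unique chunk exactly once."""
    block_map = []
    while chunk := stream.read(CHUNK):
        fp = hashlib.sha1(chunk).digest()
        if fp not in store:
            store[fp] = chunk          # first copy: actually written
        refcounts[fp] += 1             # duplicate: metadata update only
        block_map.append(fp)
    return block_map                   # persisted as vDisk metadata

vm1 = ingest(io.BytesIO(b"A" * CHUNK + b"B" * CHUNK))
vm2 = ingest(io.BytesIO(b"A" * CHUNK))   # duplicate chunk: no new write
assert len(store) == 2                   # two unique chunks stored once
# A background pass can drop fingerprints with low reference counts to
# keep metadata overhead small.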
The following figure shows how the Elastic Deduplication Engine scales and handles local VM I/O
requests.

Figure 12: Elastic Deduplication Engine: Scale

Fingerprinting occurs during ingest of data with an I/O size of 64 KB or greater (either initial I/O or
when draining from the oplog). The engine uses Intel acceleration for the SHA-1 computation to
reduce CPU overhead. In scenarios where fingerprinting does not occur at ingest (for example,
with smaller I/O sizes), it is completed as a background process. As the Curator MapReduce
framework identifies duplicate data (multiple copies of the same fingerprints), it removes the
duplicates. Automatic resource adjustments between front-end or user-driven and back-end
operations ensure that deduplication tasks do not limit the performance of user applications.
Cache deduplication increases storage efficiency and can also improve performance by enabling
larger effective cache sizes. With deduplication enabled, initial read requests enter the DSF
unified cache at a 4 KB granularity. Any subsequent requests for data with the same fingerprint
pull directly from the cache.
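
In the same spirit, keying the cache by fingerprint means that identical blocks read through
different vDisks or offsets occupy a single cache entry. A simplified, self-contained model:

import hashlib

# Toy extent store: two offsets hold identical 4 KB blocks.
extent_store = {0: b"A" * 4096, 4096: b"A" * 4096, 8192: b"B" * 4096}
fingerprints = {off: hashlib.sha1(blk).digest()
                for off, blk in extent_store.items()}

cache = {}                                # fingerprint -> 4 KB block

def read(offset):
    fp = fingerprints[offset]             # metadata lookup, no data read
    if fp not in cache:
        cache[fp] = extent_store[offset]  # miss: populate the unified cache
    return cache[fp]                      # identical content shares one entry

read(0)
read(4096)                                # different offset, same content
assert len(cache) == 1                    # only one cached copy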


The following figure illustrates how the Elastic Deduplication Engine interacts with the DSF I/O
path.

Figure 13: Elastic Deduplication Engine I/O Path

To view the current deduplication savings via Prism, hover over the Capacity and Cache
Deduplication setting on the Storage Container Details page.

Workloads Recommended for Deduplication


• Base images (cache)—you can manually fingerprint them using vdisk_manipulator.
• P2V and V2V when using Hyper-V (ODX uses a full data copy).
• Cross-container clones (not usually recommended because single containers are preferred).


Workloads Not Ideal for Deduplication


• Anything outside the recommendations above. In most cases, compression yields the highest
capacity savings and should be used instead.


5. Conclusion
The Nutanix Enterprise Cloud incorporates an array of powerful data-efficiency techniques that
are agile and scalable. Thin provisioning and intelligent cloning, which are native data-avoidance
technologies, provide significant space savings and very fast snapshots and replicas. Data-
reduction features such as deduplication and compression generate dramatic performance
improvements while also increasing the cluster’s effective storage capacity—all with extremely
low computational overhead. The Nutanix erasure coding (EC-X) algorithm delivers predictable
storage efficiency across all data types and workloads, with no impact on performance. The
Nutanix data-efficiency features are intelligent, adaptive, and above all simple to use, extending
the Nutanix commitment to ushering in the era of invisible infrastructure.
For more information about data efficiency with Nutanix or to review other Nutanix technical
documents, please visit the Nutanix website.


Appendix

About Nutanix
Nutanix makes infrastructure invisible, elevating IT to focus on the applications and services that
power their business. The Nutanix Enterprise Cloud OS leverages web-scale engineering and
consumer-grade design to natively converge compute, virtualization, and storage into a resilient,
software-defined solution with rich machine intelligence. The result is predictable performance,
cloud-like infrastructure consumption, robust security, and seamless application mobility for a
broad range of enterprise applications. Learn more at www.nutanix.com or follow us on Twitter
@nutanix.


List of Figures

Figure 1: Nutanix Enterprise Cloud
Figure 2: Sample Prism Capacity Optimization
Figure 3: Example Clone Block Map
Figure 4: Multiclone Block Maps
Figure 5: Clone Block Maps: New Write
Figure 6: Inline Compression I/O Path
Figure 7: Post-Process Compression I/O Path
Figure 8: Decompression I/O Path
Figure 9: Typical DSF Replication Factor Data Layout
Figure 10: DSF Encoded Strip: Before Encoding
Figure 11: DSF Encoded Strip: After Encoding
Figure 12: Elastic Deduplication Engine: Scale
Figure 13: Elastic Deduplication Engine I/O Path

List of Tables

Table 1: Document Version History
Table 2: Encoded Strip Sizes and Example Usable Capacity
