Migrating to vSAN
1.1 Introduction
Migration strategies and options for vSAN are numerous depending on your environment and
implementation of vSphere. This article will discuss the native options for migrating virtual machine
workloads to vSAN. The methodologies presented are valid for vSAN in general, including vSAN ReadyNode clusters as well as hyper-converged infrastructure (HCI) appliances such as Dell EMC VxRail™ Appliances.
While third-party solutions, such as backup, recovery, and replication tools, are valid options, they are out of scope for this document due to the extra cost and resources involved to deploy, configure, and implement them. Recommendations presented are based on current VMware best practices.
We will cover migration within an existing data center with both shared and non-shared storage, migration from physical servers directly to vSAN, and migration between physically disparate data centers.
Modes of vMotion
Migration of a virtual machine can be either compute only, storage only or both simultaneously. Also,
you can use vMotion to migrate virtual machines across: vCenter Server instances; virtual and physical
data centers; and subnets. vMotion operations are transparent to the virtual machine being migrated.
If errors occur during migration, the virtual machine reverts to its original state and location.
Compute vMotion
Compute mode vMotion operations usually occur within the same logical vSphere cluster. The two hosts involved in a vMotion can, however, reside in separate logical or physical clusters.
Storage vMotion
Storage vMotion is the migration of the files that belong to a running virtual machine from one discrete datastore to another.
Combined vMotion
When you choose to change both the host and the datastore, the virtual machine state moves to a new
host and the virtual disks move to another datastore.
Shared-nothing vMotion
Also known as vMotion without shared storage, this mode allows you to migrate virtual machines to a different compute resource and datastore simultaneously. Unlike Storage vMotion, which requires a single host to have access to both the source and destination datastores, shared-nothing vMotion lets you migrate virtual machines across storage accessibility boundaries. This is useful for performing cross-cluster migrations when the hosts in the target cluster might not have access to the source cluster's storage.
Cross-vCenter vMotion
Also known as vMotion between vCenter Server instances or long-distance vMotion, this allows for the migration of VMs across vCenter boundaries, both within and outside an SSO domain, as well as over links with up to 150 ms RTT (round-trip time).
Migration between two vCenter Servers within the same SSO domain is accomplished within the vSphere web interface, which leverages Enhanced Linked Mode (ELM), while migration between two vCenter Servers that are members of different SSO domains must be initiated via the APIs/SDK (see the PowerCLI example in the References section).
Migration of VMs between vCenter instances moves VMs to new virtual networks; the
migration process issues checks to verify that the source and destination networks are
similar. vCenter performs network compatibility checks to prevent the following
misconfigurations:
• MAC address incompatibility on the destination host
• vMotion from a distributed switch to a standard switch
• vMotion between distributed switches of different versions
• vMotion to an isolated network
• vMotion to a distributed switch that is not functioning properly
1.3 Preparation
To allow for a successful migration of VM workloads onto vSAN, a review of your current virtual infrastructure is advised. Extension of the existing vMotion network into the new vSAN environment is required, allowing for migration of the VM workload from its current location to the new vSAN infrastructure.
There are many possible valid configurations for compute and storage, but for migration to vSAN there are specific requirements, listed below; a short PowerCLI sketch for verifying several of them follows these lists.
Source Environment
Licensing
• Essentials Plus or higher for vMotion feature
• Enterprise Plus or higher for Cross-vCenter vMotion
• Enterprise Plus for long-distance vMotion
NTP
• Uniform time synchronization is required for the vCenter and ESXi hosts
vCenter Topology
• One vCenter, one SSO domain
• Two vCenters, one SSO domain
Networking
• L2 (Layer two) adjacency between source and destination VM networks
• VSS or VDS configuration at, or greater than, version 6.0.0
ESXi
• ESXi v6.0 or above for Cross-vCenter migration
Clusters
• If EVC (Enhanced vMotion Compatibility) is enabled, the source cluster must be at
a lower or equal EVC level to the target cluster
Virtual Machine
• Application dependencies
• RDMs – either converted to VMFS or migrated to in-guest iSCSI
• VMware Tools will require an update if the VM is migrated to a newer ESXi version
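The following is a minimal PowerCLI sketch for verifying a few of the items above (NTP configuration, VDS version, and cluster EVC mode) in the source environment. It assumes an existing Connect-VIServer session to the source vCenter; adapt the filters and output to your own environment.

# NTP: every host should reference the same time source and have the ntpd service running
Get-VMHost | Select-Object Name,
    @{N='NtpServers';E={($_ | Get-VMHostNtpServer) -join ','}},
    @{N='NtpdRunning';E={($_ | Get-VMHostService | Where-Object {$_.Key -eq 'ntpd'}).Running}}

# VDS: distributed switches must be at version 6.0.0 or above for cross-vCenter migration
Get-VDSwitch | Select-Object Name, Version

# EVC: the source cluster must be at a lower or equal EVC level than the target cluster
Get-Cluster | Select-Object Name, EVCMode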
Destination Environment
The destination vSphere environment requires network access for the virtual machine matching the
source environment, for example, VLAN access and IP addresses must be considered. Additionally,
advanced configurations such as DRS affinity rules, and Storage Policies will need to be re-created on
the target environment if they are still required.
While this operation can be done "live," organizations may choose to migrate with VMs powered off. Migration of a powered-off or suspended virtual machine is known as a cold migration and can be utilized to move virtual machines from one data center to another. A cold migration can be performed manually or via a scheduled task.
By default, data migrated in a cold state via vMotion, cloning, and snapshots is
transferred through the management network. This traffic is called provisioning traffic
and is not encrypted.
On a host, you can dedicate a separate VMkernel interface to provisioning traffic, for
example, to isolate this traffic on another VLAN. A provisioning VMkernel interface is
useful if you plan to transfer high volumes of virtual machine data that the management
network cannot accommodate or have a dedicated network for migration data between
clusters or datacenters.
For information about enabling provisioning traffic on a separate VMkernel adapter, see the vSphere networking documentation.
Migration Scenarios
The previous sections highlighted Compute, Network and Virtual Machine configuration
recommendations and requirements; we will now focus on the vCenter and SSO
configuration. The main migration topologies supported are listed below.
• Topology A: Single vCenter, Single SSO domain
• Topology B: Two vCenters, Single SSO domain
• Topology C: Two vCenters, Two SSO domains
We recommend that the source vCenter be v6.0 or higher. If using a VDS, it must be version 6.0 or above for cross-vCenter migration. The initiation of the vMotion operations can be via the vSphere Web Client or API (PowerCLI).
In addition to the supported topologies, the source and destination vCenter Server versions must be compatible with each other; consult the VMware Product Interoperability Matrix to confirm supported combinations.
vCenter places limits on the number of simultaneous VM migration and provisioning operations that
can occur on each host, network, and datastore. Each operation, such as a migration with vMotion or
cloning a VM, is assigned a resource cost. Each host, datastore, or network resource, has a maximum
cost that it can support at any one time. Any new migration or provisioning operation that causes a
resource to exceed its maximum cost is queued until the other in-flight operations reach completion.
Each of the network, datastore, and host limits must be satisfied for the operation to proceed. vMotion
without shared storage, the act of migrating a VM to a different host and datastore simultaneously, is
a combination of vMotion and Storage vMotion. This migration inherits the network, host, and
datastore costs associated with both of those operations.
Network Limits
Network limits apply only to migrations with vMotion. Network limits depend on the version of ESXi
and the network type.
• vMotion on ESXi 5.0 and later with a 1 GbE vMotion network: 4 concurrent vMotions per host
• vMotion on ESXi 5.0 and later with a 10 GbE vMotion network: 8 concurrent vMotions per host
Considerations must be made for uplink speed of the NIC assigned to the vMotion service. For
example, if you are using vMotion from a 1GbE source vMotion network to a vSAN Target destination
with 10GbE, you will be throttled to the lower speed of the two.
Datastore Limits
Datastore limits apply to migrations with vMotion and with Storage vMotion. Migration with vMotion and Storage vMotion each have an individual resource cost against a VM's datastore. The maximum number of concurrent operations per datastore is listed below.
• vMotion: 128 concurrent operations per datastore
• Storage vMotion: 8 concurrent operations per datastore
Host Limits
Host limits apply to migrations with vMotion, Storage vMotion, and other provisioning operations such
as cloning, deployment, and cold migration. All hosts have a maximum number of operations they can
support. Listed below are the numbers of operations that are supported per host; note that combinations of operations are allowed and are queued and executed automatically by vCenter when resources become available to the host.
• vMotion: 8 concurrent operations per host
• Storage vMotion: 2 concurrent operations per host
• vMotion without shared storage: 2 concurrent operations per host
• Other provisioning operations (cloning, deployment, cold migration): 8 concurrent operations per host
1.6 References
PowerCLI
An example migration script for moving VMs between vCenters and SSO domains, using PowerCLI, is shown below. The script moves myVM from myVC1 to myVC2, onto target port group myPortGroup and datastore vsanDatastore.
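A minimal sketch of such a script is shown below. It assumes PowerCLI 6.5 or later, prompts for credentials to both vCenter Servers, targets the first host in the destination vCenter, and maps a single network adapter to a distributed port group; adjust these assumptions to suit your environment.

# Connect to the source and destination vCenter Servers
$sourceVC = Connect-VIServer -Server myVC1 -Credential (Get-Credential)
$destVC   = Connect-VIServer -Server myVC2 -Credential (Get-Credential)

# Gather the source VM, its network adapter, and the destination objects
$vm            = Get-VM -Name myVM -Server $sourceVC
$netAdapter    = Get-NetworkAdapter -VM $vm -Server $sourceVC
$destHost      = Get-VMHost -Server $destVC | Select-Object -First 1
$destPortGroup = Get-VDPortgroup -Name myPortGroup -Server $destVC   # use Get-VirtualPortGroup for a standard switch
$destDatastore = Get-Datastore -Name vsanDatastore -Server $destVC

# Perform the Cross-vCenter vMotion, re-mapping the network adapter to the target port group
Move-VM -VM $vm -Destination $destHost -NetworkAdapter $netAdapter `
    -PortGroup $destPortGroup -Datastore $destDatastore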
More information and detail on the Move-VM command can be found here: https://blogs.vmware.com/PowerCLI/2017/01/spotlight-move-vm-cmdlet.html
Vuong Pham
Vuong Pham is a Senior Solutions Architect who has been in IT for 19 years, covering many aspects of the field: pre-sales, design, implementation, and operations of small, medium, and enterprise environments across multiple industries. He is an SME in virtualization, data protection, and storage solutions for multiple vendors. His current focus is HCIA VxRail solutions. He holds VCP 3, 4, and 5, VCAP Design, VCAP Administration, and EMCIE certifications. You can follow Vuong on Twitter as: @Digital_kungfu
Myles Gray
Myles Gray is a Senior Technical Marketing Architect for VMware in the Storage and Availability business unit, primarily focused on storage solutions. He has a background as a customer and partner in infrastructure engineering, design, operations, and pre-sales roles. He is a VCIX6-NV and VCAP6-DCV. You can find him on Twitter as: @mylesagray
2.1 Introduction
Traditionally, there have been two particular reasons why people use RDMs in a vSphere environment: to allow the addition of disks larger than 2TB to VMs, and to provide shared disks, such as quorum and shared-data drives, for solutions like SQL FCI and Windows CSVs.
The first of these is trivial to address - the limitation for 2TB VMDKs was removed with ESXi 5.5 and
VMFS-5. The limit is now the same as with RDMs at 62TB, and as such RDMs should no longer be
considered for this use-case.
The second is the main reason RDMs may still be in use today: Shared-disk quorum and data between
VMs.
In this section, we will address the migration of non-shared disk RDMs to native vSAN objects, as well
as the transition of shared-disks from the legacy RDM based approach to in-guest iSCSI initiators.
Virtual Mode
Non-shared RDMs are trivial to migrate to vSAN, as they can be live Storage vMotioned to VMDKs. To start with, your RDMs must be in virtual compatibility mode to leverage a Storage vMotion conversion to VMDK. After converting any physical mode RDMs you have to virtual mode, you may then initiate a Storage vMotion to vSAN directly. In the below example, I Storage vMotion a VM with a virtual mode RDM, live, to a vSAN datastore and its RDM is converted to a native vSAN object:
Change the policy to your chosen SPBM policy and choose the target vSAN datastore:
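The same migration can also be scripted with PowerCLI. The sketch below uses a hypothetical VM name and assumes an existing connection to vCenter; selecting a disk format (thin here) during the move is what converts the virtual mode RDM into a regular VMDK, and the vSAN datastore default storage policy applies unless another SPBM policy is assigned afterwards.

# Live Storage vMotion of the VM to the vSAN datastore, converting its virtual mode RDM to a thin VMDK
$vm     = Get-VM -Name "rdm-vm-01"            # hypothetical VM name
$vsanDs = Get-Datastore -Name "vsanDatastore"
Move-VM -VM $vm -Datastore $vsanDs -DiskStorageFormat Thin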
After the migration has completed, you will notice that the disk type is no longer RDM; rather, it is listed as a VMDK and is editable, as it is now a first-class citizen of the datastore:
Physical Mode
Physical mode RDMs cannot have their LUN contents migrated live and would require a cold migration. Given that most physical mode RDMs are created for large data sets, to minimize downtime from a cold migration we recommend converting the RDMs to virtual mode first, then carrying out the necessary Storage vMotion to convert the disk to a VMDK, which can be done while the VM is operational.
Introduction
Shared disk RDMs in either virtual or physical compatibility mode have typically been enabled to provide support for guest OS clustering quorum mechanisms. Since Windows Server 2008, a dedicated shared quorum disk has not been necessary. Instead, you can use an FSW (File Share Witness); the FSW can be a standard Windows server residing on a vSAN datastore.
File Share Witness fault-detection provides the same level of redundancy and failure detection as
traditional shared-disk quorum techniques, without the additional operational and installation
complexity that those solutions command.
Migration
Below you can see I have a SQL FCI cluster with two nodes, currently utilizing a shared-disk for cluster
quorum:
We are going to convert this cluster to File Share Witness quorum. I have a file server in the environment (file01) and have created a standard Windows file share on it called sql-c-quorum. N.B.: This can be done live and is not service affecting.
Firstly, right-click on the cluster and go to More Actions -> Configure Cluster Quorum Settings...
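If you prefer scripting to the wizard, the same change can be made with the FailoverClusters PowerShell module from one of the cluster nodes; a minimal sketch using the share from this example is shown below.

# Point the cluster quorum at the file share witness created on file01
Set-ClusterQuorum -FileShareWitness "\\file01\sql-c-quorum"

# Confirm the cluster is now using a File Share Witness
Get-ClusterQuorum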
You will see a dialogue telling you that the cluster quorum settings were configured successfully:
We can then verify we are operating in FSW mode on the main dialogue of the Failover Cluster
Manager:
The VM no longer requires the RDMs used for cluster quorum or voting and they can be removed - this
VM can now be migrated to vSAN by a simple storage vMotion and no downtime is required for the
entire operation.
Introduction
Shared RDMs have traditionally been an operational blocker to any migration or
maintenance due to the complexity they create in an environment as well as the version
dependencies they introduce and specific VM configurations they command.
Organizations may wish to simplify their operations by having their VMs all operating
under a single compute cluster with homogenous configurations at a vSphere level.
Detailed below is the process for migrating VMs with existing shared RDMs, in either physical or virtual mode, to in-guest iSCSI initiators instead; this allows clustered VMs to be migrated into a vSAN environment to reduce operational complexity while leaving data in place on the existing SAN.
Example Setup
The use case covered is a WSFC (Windows Server Failover Cluster) for a SQL FCI. In the below figure, there are three disks shared between the VMs: SQL Data, Logs, and Backups. Volume presentation to the VMs utilizes physical mode RDMs.
Note: in the below example, the RDMs are in physical mode and are on Virtual Device Node "SCSI controller 1". This information is essential to record for later, as it will be necessary to remove this SCSI controller after removing the RDMs from the VM configuration.
As a point of reference, RDMs are provided in this environment via an EMC Unity array
with iSCSI connectivity on four uplink ports (Ethernet Port 0-3) with IPs of 10.0.5.7-10
respectively.
Preparation
To migrate existing RDMs, whether in physical or virtual mode, the simplest option is to move the LUNs to an in-guest iSCSI initiator. Given that RDMs are simply raw LUNs mapped directly through to a VM, storage presentation to the VM remains the same: VMs will have the same control over the LUNs as they would with an RDM, and application operations will be unaffected by the migration.
In preparation, there are a few steps that must be completed on each VM in the cluster
to allow for iSCSI connectivity to the SAN presented LUNs. Firstly, we will need to add a
NIC connected to the iSCSI network to the VM.
Next, the Windows iSCSI Initiator needs to be initialized. When prompted to have the
iSCSI service start automatically on boot, select Yes.
In the following window, add one of the SAN's iSCSI targets into the Quick Connect section of the
dialogue box. There is no need to add every target here; after MPIO is configured the array should
communicate all target paths that can be used for LUN connectivity to the iSCSI Initiator, providing
load balancing and failover capabilities.
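These steps can also be scripted inside the guest with the built-in Windows iSCSI cmdlets (Windows Server 2012 or later). The sketch below uses one of the array target IPs from this example and is a starting point only; MPIO configuration remains vendor specific.

# Start the Microsoft iSCSI Initiator service and set it to start automatically
Set-Service -Name MSiSCSI -StartupType Automatic
Start-Service -Name MSiSCSI

# Add one of the SAN's iSCSI target portals (10.0.5.7 in this example)
New-IscsiTargetPortal -TargetPortalAddress 10.0.5.7

# Connect to the discovered target(s) with multipath enabled and make the connection persistent
Get-IscsiTarget | Connect-IscsiTarget -IsMultipathEnabled $true -IsPersistent $true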
At this point, you can apply MPIO policies specific to your array and OS version. Refer to
your vendor's documentation for configuring MPIO in a Windows environment. Next,
add the VM’s iSCSI initiator into the SAN’s zoning policy for the RDM LUNs. This again
will vary from vendor to vendor. You can see below that the host object has been
created on the SAN and has been given access to the three LUNs that are used for
shared data between VMs.
VM RDM Reconfiguration
At this point, migration from RDM to in-guest termination can begin. It would be prudent to start with
the secondary node in the cluster, and given that WSFC is not transparent during role transferral,
carrying out this work during a maintenance window is advised. Firstly, place the node undergoing
reconfiguration into the "Paused" mode from the Failover Cluster Manager console, choosing to "Drain
Roles" during maintenance.
Shut down the secondary VM, and remove the RDMs and the shared SCSI controller from it. It is important to note that when you are deleting the disks from this node, you should not click "delete from datastore"; remember, these are still in use by the primary node in the WSFC. Navigate to the VM in the vSphere Web Client and choose "Edit Settings"; from here, remove the disks and click "OK."
It is necessary to enter "Edit Settings" once more, now that the bus-sharing SCSI controller we recorded at the start is unused, and remove it. N.B.: using controller SCSI0:* is not supported for shared/clustered RDMs, so shared RDMs should always be on a separate, dedicated SCSI controller - you can verify this by checking the sharing mode on the controller.
Figure 10 - Removing the bus-sharing SCSI controller previously used by the RDMs
Power up the secondary VM and log in. Currently, the shared disks are not presented to
the VM. Open up the iSCSI Initiator dialogue; your targets should all have connected at
this point.
Navigate to the "Volumes and Devices" section, and click "Auto Configure", this will mount the disks
and display their MPIO identifiers in the Volume List.
Opening up the Windows disk management dialogue, you should now be able to see the disks connected, but in the "Reserved" state. The reserved and offline state is expected, as this node is not the active node in the cluster; once a role transfer is complete, you will see these disks listed with their volume identifiers (D:\, E:\, F:\). By right-clicking on one of the disks and selecting "Properties", you will be able to see each disk's LUN ID as well as specifics on MPIO, multi-pathing policies, and partition type.
Figure 13 - Disk management dialogue showing the disks re-presented via iSCSI
Reintroduce the VM into the WSFC: open the Failover Cluster Manager, right-click the secondary node that has been undergoing maintenance, and choose "Resume", selecting "Do not fail back roles".
Ensure the WSFC console shows the cluster as healthy and both nodes as "Up", then transfer any roles from the primary to the secondary node. The disks will automount via iSCSI at this time, as the volume signature has remained the same. To transfer the roles over to the secondary node, navigate to "Roles," right-click, choose "Move" and "Select Node...", then choose the reconfigured node.
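The resume and role move can likewise be scripted with the FailoverClusters module; the node and role names below are hypothetical.

# Bring the reconfigured node back into the cluster without failing roles back
Resume-ClusterNode -Name "sql-c-02" -Failback NoFailback

# Move the SQL FCI role over to the reconfigured node
Move-ClusterGroup -Name "SQL Server (MSSQLSERVER)" -Node "sql-c-02"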
Ensure your services are operating as expected. As mentioned earlier, in disk manager on the secondary node the volumes will now be listed with their volume identifiers.
Figure 16 - Disk manager showing the volumes as active and identified correctly
As before, enter the node to be migrated into "Paused" mode and choose "Drain Roles,"
then shut down the VM.
In the vSphere console, locate the VM (in this case, sql-c-01) and open "Edit Settings". Remove the RDMs as before, but this time choose "delete from datastore"; this is safe to do as no other nodes are actively using these RDM pointer files anymore. Note: choosing "delete from datastore" does not delete data from the underlying LUN, which remains unaffected; this operation only removes the RDM pointer files from the VMFS datastore upon which they are situated.
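The PowerCLI equivalent for this node, this time deleting the now-unused pointer files, could look like the following; as noted, the data on the underlying SAN LUNs is unaffected.

# Remove the RDMs from the primary node and delete the pointer files from the VMFS datastore;
# the data on the underlying SAN LUNs is not touched
$vm = Get-VM -Name "sql-c-01"
Get-HardDisk -VM $vm -DiskType RawPhysical | Remove-HardDisk -DeletePermanently -Confirm:$false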
As previously, navigate back into "Edit Settings" and delete the bus-sharing SCSI
controller from the VM's configuration.
Power the VM on and open the iSCSI Initiator dialogue. Verify that the targets are all listed as "Connected," then navigate to the Volumes and Devices dialogue and click "Auto Configure". The volumes will now show up in the volume list, detailed by their MPIO identifiers.
Figure 20 - Volume list detailing the MPIO identifiers for the iSCSI mounted volumes
Verify the disks show up in the Windows disk management snap-in and exhibit a "Reserved" and an offline state; again, this is normal for the passive node in the cluster, as only the active node mounts the volumes.
Open the Failover Cluster Manager dialogue again and navigate to the "Nodes" section,
then resume the node's participation in the cluster, choosing "Do not fail back roles".
Ensure the cluster is reformed healthily and both nodes indicate a status of "Up".
Figure 22 - WSFC is shown as healthy, and both nodes are in the "Up" state
At this point the WSFC disk migration is complete; both VMs have had their RDMs removed and now rely on in-guest iSCSI initiators for connectivity to shared disks. You can optionally transfer the WSFC roles back to the primary node, as a matter of preference.
Migration to vSAN
With the RDMs and bus-sharing SCSI controllers gone, we can now migrate the VM to vSAN. Note:
This only migrates the VM's objects that are accessible to vSphere (VMX, swap, namespace, OS and
non-shared VMDKs), the data for the shared disks still resides on the SAN. Please refer to the
documentation on migrating a VM residing on VMFS/NFS to vSAN .
Rollback
Should you wish to migrate a VM back from the new mode of operation to the previous one, this is achievable by Storage vMotioning the VM from the vSAN datastore to a VMFS volume (required for RDM and bus-sharing compatibility) and then reversing the steps described above: re-adding a dedicated bus-sharing SCSI controller and re-presenting the LUNs as RDMs in place of the in-guest iSCSI connections.
Myles Gray is a Senior Technical Marketing Architect for VMware in the Storage and Availability business unit, primarily focused on storage solutions. He has a background as a customer and partner in infrastructure engineering, design, operations, and pre-sales roles. He is a VCIX6-NV and VCAP6-DCV. You can find him on Twitter as: @mylesagray
When using VMware Converter, choose the vSAN Datastore as the target location for the converted
machine - it will migrate to the datastore with the default vSAN storage policy.
Ensure you change all disk types to thin during the migration - in the Options section, select Advanced, then adjust all disk types to thin.
Constraints
Be aware that if your physical machines participate in a WSFC or other shared-disk clustering, you should reference our guide on migrating these machines to in-guest iSCSI termination before attempting a migration to vSAN, to ensure supportability throughout and after the migration process. The process on physical machines is similar to that on VMs with RDMs.
4.1 Introduction
Customers may wish to migrate hundreds or thousands of VMs in a predictable and repeatable fashion. There are a number of ways to orchestrate the migration of large numbers of VMs; in this section we will cover the use of vSphere Replication and Site Recovery Manager (SRM) to migrate large numbers of VMs to a destination vSAN datastore.
Included in this coverage will be migrations to a vSAN datastore within the same data center (in a different cluster), in another vCenter, and in a separate SSO domain.
For example, at the primary site you may have a large cluster utilizing a storage policy with FTT (Failures to Tolerate) set to two, allowing for extra redundancy in the event of hardware failures. If a smaller cluster is utilized at the secondary site to save costs, VMs can be replicated with a storage policy specifying FTT set to one in order to save space on the smaller copies. The lower redundancy on the target site can save on ongoing capital and operational expenses while still providing an effective replication target for DR.
For more information on using vSphere Replication with vSAN, check out our Tech Note here. A click-through demo is also available to demonstrate this capability.
vSphere Replication is limited to the recovery/migration of a single VM at once - SRM, conversely, can
support concurrent migration of up to 2,000 VMs. SRM also provides the ability to orchestrate
changes to VMs upon migration, for example, IP addresses if the migrations are across L3 (Layer 3
network) boundaries. In addition to these benefits, the migration plan can also be tested multiple
times, with no ill-effect on production workloads, providing predictability and peace of mind to the
process.
Demo
For an example of migrating large numbers of VMs concurrently with SRM on vSAN, there is a demo
you can find here showing the recovery of 1000 VMs in 26 minutes with vSphere Replication and SRM,
on top of vSAN.
It is wise to note that when replicating VMs with SRM and vSphere Replication, the policy selected when initially creating the replicas will be applied to the target VM container from then on; any subsequent changes to the SPBM policy will only be applied to the target VM once it has been recovered via a failover. Again, the testing process will allow you to account for this and model any rebuild traffic generated on the target side post-failover.
Migration testing
SRM offers the unique ability to test migration or failover scenarios prior to actually enacting any change. This is especially useful in the case of large-scale migrations
where multiple applications and dependencies are affected. The ability to test the logic
and operation of a migration to a new environment prior to actually doing the migration
is invaluable. With SRM, users can test their application group failovers by remotely
connecting to a test bubble environment and ensuring applications are operating as
expected prior to an actual production migration taking place.