An Introduction To Virtualization
by Sean Campbell and Michael Jeronimo
Virtualization Defined
In this article, virtualization refers to the process of decoupling the hardware
from the operating system on a physical machine. It turns what used to be
considered purely hardware into software. Put simply, you can think of
virtualization as essentially a computer within a computer, implemented in
software. This is true all the way down to the emulation of certain types of
devices, such as sound cards, CPUs, memory, and physical storage. An
instance of an operating system running in a virtualized environment is known
as a virtual machine. Virtualization technologies allow multiple virtual
machines, with heterogeneous operating systems, to run side by side and in
isolation on the same physical machine. By emulating a complete hardware
system, from processor to network card, each virtual machine can share a
common set of hardware, unaware that this hardware may also be in use by
another virtual machine at the same time. The operating system running in
the virtual machine sees a consistent, normalized set of hardware regardless of
the actual physical hardware components. Technologies such as Intel®
Virtualization Technology (Intel® VT), which is reviewed later in this
article, significantly improve virtualization from the perspective of the
vendors that produce these solutions.
With a working definition of virtualization on the table, here's a quick
mention of some of the other types of virtualization technology available
today. For example, computer memory virtualization is software that allows a
program to address a much larger amount of memory than is actually
available. To accomplish this, the system generally swaps units of address
space back and forth as needed between a storage device and physical
memory. In computer storage management, virtualization is the pooling of
physical storage from multiple network storage devices into what appears to
be a single storage device that is managed from a central console. In an
environment using network virtualization, the virtual machine implements
virtual network adapters on a system with a host network adapter. But again,
in the context of this article, virtualization refers to the process of utilizing
virtual machines.
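The demand-paging mechanism behind memory virtualization, swapping fixed-size units of address space between a small set of physical frames and a backing store as programs touch them, can be sketched in a few lines. This is a toy model for illustration only; `ToyPager` and its names are invented here and are not part of any real memory manager:

```python
from collections import OrderedDict

class ToyPager:
    """Toy demand pager: a small set of physical frames backs a much
    larger virtual address space; the least-recently-used page is
    evicted to a simulated disk when the frames run out."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.frames = OrderedDict()   # page -> contents, in LRU order
        self.disk = {}                # evicted pages live here
        self.faults = 0

    def access(self, page):
        if page in self.frames:
            self.frames.move_to_end(page)   # mark as most recently used
            return self.frames[page]
        # Page fault: bring the page in from "disk" (or zero-fill it).
        self.faults += 1
        if len(self.frames) >= self.num_frames:
            victim, contents = self.frames.popitem(last=False)
            self.disk[victim] = contents    # swap the LRU page out
        self.frames[page] = self.disk.pop(page, 0)
        return self.frames[page]

pager = ToyPager(num_frames=2)
for p in [0, 1, 0, 2, 1]:    # 3 distinct pages, only 2 frames
    pager.access(p)
print(pager.faults)           # 4: pages 0, 1, 2 fault once; 1 faults again
```

Real virtual-memory systems work on the same principle, with the hardware and operating system cooperating to make the faults and swaps invisible to the running program.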
Terminology
Individual vendors often choose terminology that suits their marketing needs
to describe their products. As with the nuances of the virtualization
technologies themselves, it's easy to get confused by the different terms used
to describe features or components. Hopefully, as virtualization technology
continues to evolve and more players enter the marketplace, a common set of
terminology will emerge. But for now, here is a list of terms and
corresponding definitions.
Host Machine
A host machine is the physical machine running the virtualization software. It
contains the physical resources, such as memory, hard disk space, and CPU,
and other resources, such as network access, that the virtual machines utilize.
Virtual Machine
The virtual machine is the virtualized representation of a physical machine
that is run and maintained by the virtualization software. Each virtual
machine, implemented as a single file or a small collection of files in a single
folder on the host system, behaves as if it is running on an individual,
physical, non-virtualized PC.
Virtualization Software
Virtualization software is a generic term denoting software that allows a user
to run virtual machines on a host machine.
Virtual Disk
The term refers to the virtual machine's physical representation on the disk of
the host machine. A virtual disk comprises either a single file or a collection
of related files. It appears to the virtual machine as a physical hard disk. One
of the benefits of the virtual machine architecture is its portability: you can
move virtual disk files from one physical machine to another with limited
impact on the files. Subsequent chapters illustrate various ways in which this
can be a significant benefit across a wide variety of areas.
Shared Folders
Most virtual machine implementations support the use of shared folders. After
the installation of virtual machine additions, shared folders enable the virtual
machine to access data on the host. Through a series of under-the-cover drive
mappings, the virtual machine can open files and folders on the physical host
machine. You can then transfer these files from the physical machine to a
virtual machine using a standard mechanism such as a mapped drive.
Shared folders can provide access to installation files for programs, data
files, or other files that you need to copy and load into the virtual machine.
With shared folders you don't have to copy data files into each virtual
machine. Instead, all of your virtual machines access the same files through a
shared folder that targets a single endpoint on the physical host machine.
Hypervisor
In contrast to the virtual machine monitor, a hypervisor runs directly on the
physical hardware, without any intervening help from a host operating system,
to provide access to hardware resources. The hypervisor is directly
responsible for hosting and managing the virtual machines running on the
host machine. However, the implementation of the hypervisor and its overall
benefits vary widely across vendors.
Figure 2 Hypervisor Architecture Overview
Paravirtualization
Paravirtualization involves modifying the operating system before it can be
allowed to run in the virtualized environment as a virtual machine. Its use
therefore requires an operating system whose source code is publicly
available, such as an open source operating system.
History of Virtualization
Before we place a foot firmly into the realm of virtualization technologies that
exist today, it's worthwhile to take a step back into history to explore the
origin of virtualization within the mainframe environment. This is important
because virtualization in its current incarnation is not a completely new
technology and has roots in some past efforts.
From the 1950s to the 1990s
The concept of virtual memory dates to the late 1950s, when a group at the
University of Manchester introduced automatic page replacement in the Atlas
system, a transistorized mainframe computer. The principle of paging as a
method to store and transmit data up and down the memory hierarchy already
existed, but the Atlas was the first to automate the process, thereby providing
the first working prototype of virtual memory.
The term virtual machine dates to the 1960s. One of the earliest virtual
machine systems comes from IBM. Around 1967, IBM introduced the
System/360 Model 67, its first major system with virtual memory. Integral to
the Model 67 was the concept of a self-virtualizing processor instruction set,
perfected in later models. The Model 67 used a very early operating system
called CP-67, which evolved into the virtual machine (VM) operating
systems. VM allowed users to run several operating systems on a single
processor machine. Essentially, VM and the mainframe hardware cooperated
so that multiple instances of any operating system, each with protected access
to the full instruction set, could coexist concurrently.
In the mid 1960s, IBM also pioneered the M44/44X project, exploring the
emerging concept of time sharing. At the core of the system architecture was
a set of virtual machines, one for each user. The main machine was an IBM
7044 (M44 for short) and each virtual machine was an experimental image of
the 7044 (44X for short). This work eventually led to the widely used
VM/timesharing systems, including IBM's well-known VM/370.
The concept of hardware virtualization also emerged during this time,
allowing the virtual machine monitor to run virtual machines in an isolated
and protected environment. Because the virtual machine monitor is
transparent to the software running in the virtual machine, the software thinks
that it has exclusive control of the hardware. The concept was perfected over
time so that eventually virtual machine monitors could function with only
small performance and resource overhead.
By the mid 1970s, virtualization was well accepted by users of various
operating systems. The use of virtualization during these decades solved
important problems. For example, the emergence of virtual storage in
large-scale operating systems gave programs the illusion that they could
address far more main storage (memory) than the machine actually contained.
Virtual storage expanded system capacity and made programming less
complex and much more productive.
Also, unlike virtual resources, real system resources were extremely
expensive. Virtual machines presented an efficient way to gain the maximum
benefit from what was then a sizable investment in a company's data center.
Although hardware-level virtual machines were popular in both the
research and commercial marketplace during the 1960s and 1970s, they
essentially disappeared during the 1980s and 1990s. The need for
virtualization, in general, declined when low-cost minicomputers and personal
computers came on the market.
Although not the focus of this article, another type of virtual machine,
exemplified by Sun Microsystems' Java Virtual Machine (JVM) and
Microsoft's Common Language Runtime (CLR), deserves a place on the
historical timeline and is worth mentioning here. The key thing to understand
is that these machines do not present a virtual hardware platform. But because
of the potential confusion between this type of virtual machine and the virtual
machines covered in this article, a brief overview is in order to clear up the
differences. These virtual machines emerged during the 1990s and extended
the use of virtual machines into other areas, such as software development.
Referred to as simulated or abstracted machines, they are implemented in
software on top of a real hardware platform and operating system. Their
beauty lies in their portability. In the case of the JVM, compiled Java
programs can run on any compatible Java virtual machine regardless of the
type of machine underneath the implementation.
Figure 1.3 outlines the relationship between a JVM or the CLR and the
host operating system.
Paravirtualization
Briefly discussed earlier, this solution requires changes to the source code of
the guest operating system, especially the kernel, so that it can be run on the
specific VMM. Paravirtualization can be used only with operating systems
that can be modified, such as Linux.
CPU
The CPU is one of the more significant bottlenecks in the system when
running multiple virtual machines. All of the operating systems running on
the host in virtual machines compete for access to the CPU. An effective
solution to this problem is to use a multi-processor or, better, a multi-core
machine where you can dedicate one or more cores to a virtual machine. The
technology to assign a given core to a virtual machine image is not yet fully
provided by current virtualization vendors but is expected to be available in
the near future. In the absence of a multi-core processor, the next best step is
to find the fastest processor available to meet your needs.
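Until vendors expose core assignment directly, one workaround on Linux is to pin the process that hosts a virtual machine to specific cores through the operating system's CPU-affinity interface. The sketch below uses Python's standard `os.sched_setaffinity` (Linux-only); `pin_to_cores` is an illustrative helper name, not a vendor API:

```python
import os

def pin_to_cores(pid, cores):
    """Restrict the given process to a fixed set of CPU cores.

    On Linux, the scheduler will then run that process (for example,
    the process hosting a virtual machine) only on those cores,
    effectively dedicating them to the VM's work.
    """
    os.sched_setaffinity(pid, cores)    # set the allowed core set
    return os.sched_getaffinity(pid)    # report the effective set

if __name__ == "__main__":
    # pid 0 means "the calling process"; pin it to core 0 only.
    print(pin_to_cores(0, {0}))
```

The same idea applies to a real VM process: look up its process ID and pin it, leaving the remaining cores to the host operating system and other guests.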
Memory
Memory also can be a significant bottleneck, but its effect can be mitigated, in
part, by selecting the best vendor for your virtualization solution, because
various vendors handle memory utilization differently.
Regardless of the vendor you choose, you must have a significant amount of
memory, roughly equivalent to the amount you would have assigned to each
machine if it were running as a physical machine. For example, to run
Windows XP Professional in a virtual machine, you might allocate 256
megabytes (MB) of memory. This is on top of the 256 MB recommended for
the host computer, assuming Windows XP is the host.
In many cases this means that a base machine configuration comes to
approximately 1 to 2 gigabytes (GB) of memory, or perhaps many more
gigabytes for a server-based virtualization solution.
You can easily change the memory configuration for a virtualized guest
operating system. Typically this change is made from within the virtualization
software itself and requires only a shutdown and restart cycle of the virtual
machine to take effect. Contrast this process with having to manually install
memory in each physical machine and you can see one of the benefits of
virtualization technology.
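The sizing rule above, the host operating system's recommended memory plus a physical-machine-equivalent allocation for each guest, reduces to simple arithmetic. The figures below reuse the article's Windows XP example; `host_memory_needed_mb` is just an illustrative helper, not a vendor formula:

```python
def host_memory_needed_mb(host_os_mb, guest_allocations_mb):
    """Total physical memory the host should have: the host OS's
    recommended amount plus each guest's allocation, as if each
    guest were a separate physical machine."""
    return host_os_mb + sum(guest_allocations_mb)

# Windows XP host (256 MB) running two XP guests at 256 MB each:
total = host_memory_needed_mb(256, [256, 256])
print(total)           # 768 MB
print(total / 1024)    # 0.75 -- already most of a 1 GB machine
```

Adding more guests, or more generous per-guest allocations, quickly pushes the total into the 1 to 2 GB range mentioned above.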
Physical Disk
When it comes to virtualization, overall disk space utilization for each virtual
machine isn't as great a concern as the intelligent utilization of each physical
drive. An additional important point to consider is the rotational speed of the
drive in use. Because you may run multiple virtual machines on a single
drive, its rotational speed can have a dramatic effect on performance. For the
best performance across most of the virtualization products today, consider
implementing multiple disk drives and using the fastest drive possible, in
terms of rotational speed, for each.
One way to boost the performance of a virtualized solution, beyond just
having a faster drive, is to ensure that the host machine and its operating
system have a dedicated physical hard drive, and that all virtual machines, or
potentially each individual virtual machine, have a separate physical hard
disk allocated to them.
Network
Network utilization can also present bottleneck issues, similar to those with
memory. Even though the virtual machine doesn't add any significant amount
of network latency into the equation, the host machine must have the capacity
to service the network needs of all of the virtual machines running on it.
However, as with memory, you still need to supply the amount of network
bandwidth and network resources that you would have if the machines were
running on separate physical hardware.
You might need to upgrade your network card if you are running multiple
virtual machines in an IT environment and all machines are experiencing
heavy concurrent network traffic. But in most desktop virtualization scenarios
you will find that the network is not the problem. Most likely the culprit is the
CPU, disk, or memory.
Conclusion
Virtualization technology, while not new, is growing at a significant rate in its
use on servers and desktop machines and long ago lost its connection to
mainframe systems alone. While challenges do exist, such as the unification
of terminology, the development of even more robust software solutions, and
the implementation of greater device virtualization support, virtualization is
poised to make a significant impact on the landscape of computing over the
next few years.
For more information about virtualization and Intel VT, please refer to the book
Applied Virtualization Technology by Sean Campbell and Michael Jeronimo.