Friday, September 14, 2012

VMware vSphere

VMware vSphere 5.0 Architecture Essentials

vSphere 5.0 is a cloud operating system that virtualizes the entire IT infrastructure, including servers, storage, and networks.

vSphere 5.0 logically comprises three layers: virtualization, management, and interface.

The virtualization (infrastructure) layer of vSphere 5.0 includes two types of services: infrastructure services and application services.

- Infrastructure Services such as compute, storage, and network services abstract, aggregate, and allocate hardware or infrastructure resources. Examples include but are not limited to VMFS and Distributed Switch.

- Application Services are the set of services provided to ensure availability, security, and scalability for applications. Examples include but are not limited to VMware vSphere High Availability (HA) and VMware Fault Tolerance (FT).

- The Management layer of vSphere 5.0 comprises vCenter Server, which acts as a central point for configuring, provisioning, and managing virtualized IT environments.

- The Interface layer of vSphere 5.0 comprises clients that allow a user to access the vSphere datacenter, for example, the vSphere Client and the vSphere Web Client.

A typical vSphere 5.0 datacenter consists of basic physical building blocks such as x86 computing servers, storage networks and arrays, IP networks, a management server, and desktop clients.

vCenter Server: vCenter Server provides a single point of control to the datacenter. It provides essential datacenter services, such as access control, performance monitoring, and configuration. It unifies the resources from the individual computing servers to be shared among virtual machines in the entire datacenter.

Management Clients: vSphere 5.0 provides several interfaces such as vSphere Client and vSphere Web Client for datacenter management and virtual machine access.

A host represents the aggregate computing and memory resources of a physical x86 server running ESXi 5.0.
A cluster represents the aggregate computing and memory resources of a group of physical x86 servers that share the same network and storage arrays; it acts, and can be managed, as a single entity.

Resource pools are partitions of computing and memory resources from a single host or a cluster. Resource pools can be hierarchical and nested. You can partition any resource pool into smaller resource pools to divide and assign resources to different groups or for different purposes.
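
To make the nesting concrete, here is a minimal Python sketch of how a cluster's aggregate capacity might be carved into smaller pools. It is purely an illustration of the concept: the class, method names, and capacity figures are invented and are not the vSphere API.

    class ResourcePool:
        # A node in a resource-pool hierarchy (hypothetical model).
        def __init__(self, name, cpu_mhz, mem_mb):
            self.name = name
            self.cpu_mhz = cpu_mhz      # unreserved CPU capacity in this pool
            self.mem_mb = mem_mb        # unreserved memory capacity in this pool
            self.children = []

        def partition(self, name, cpu_mhz, mem_mb):
            # Carve a smaller child pool out of this pool's capacity.
            if cpu_mhz > self.cpu_mhz or mem_mb > self.mem_mb:
                raise ValueError("child pool cannot exceed parent capacity")
            child = ResourcePool(name, cpu_mhz, mem_mb)
            self.cpu_mhz -= cpu_mhz
            self.mem_mb -= mem_mb
            self.children.append(child)
            return child

    # Two hosts' worth of resources aggregate into the cluster's root pool ...
    root = ResourcePool("cluster-root", cpu_mhz=48000, mem_mb=196608)
    # ... which is partitioned, and partitioned again, for different groups.
    prod = root.partition("production", cpu_mhz=32000, mem_mb=131072)
    test = prod.partition("prod-test", cpu_mhz=8000, mem_mb=32768)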

vMotion enables the live migration of running virtual machines from one physical server to another without service interruption. With vMotion, resources can be dynamically reallocated across physical hosts.

Storage vMotion enables live migration of a virtual machine's storage to a new datastore with no downtime.
A datastore is the storage location for the files that belong to a virtual machine. This location must be specified when a virtual machine is created. The datastore formats used in vSphere are the Virtual Machine File System (VMFS) and the Network File System (NFS).

Migrating single virtual machines and their disks from one datastore to another is possible because a virtual machine is composed of a set of files. Even the virtual machine's disks are encapsulated in files. Migrating a virtual machine's disks is accomplished by moving all the files associated with the virtual machine from one datastore to another. Extending the vMotion technology to storage helps the vSphere administrator to leverage storage tiering, perform tuning and balancing, and control capacity with no application downtime.
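
Because of this file-based encapsulation, the file-level view of such a migration fits in a few lines of Python. The sketch below captures only the offline intuition, with invented paths and a flat-directory assumption; Storage vMotion itself performs the copy live, as described next.

    import shutil
    from pathlib import Path

    def migrate_vm_files(vm_dir: Path, dst_datastore: Path) -> Path:
        # Copy every file that belongs to the VM (.vmx, .vmdk, .nvram, .log, ...)
        # from its directory on one datastore to a directory on another.
        # Assumes a flat directory, which is typical for a VM's home folder.
        target = dst_datastore / vm_dir.name
        target.mkdir(parents=True, exist_ok=True)
        for f in vm_dir.iterdir():
            shutil.copy2(f, target / f.name)
        return target

    # Hypothetical invocation:
    # migrate_vm_files(Path("/vmfs/volumes/datastore1/web01"),
    #                  Path("/vmfs/volumes/datastore2"))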

vSphere 5.0 uses a mirrored-mode approach for Storage vMotion. In this new architecture, Storage vMotion performs a single-pass copy of the disk blocks from the source to the destination, replacing the iterative pre-copy phase, based on Changed Block Tracking (CBT), that earlier versions of vSphere used. With I/O mirroring, any block that changes on the source after it has been copied is also written to the destination, so the destination never falls behind. A block-level bitmap identifies the hot and cold blocks of the disk, that is, whether the data in a given block has already been mirrored to the destination disk.
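
The mirror-mode idea reduces to one sequential copy pass plus write mirroring, with a bitmap recording which blocks have already reached the destination. The single-threaded Python model below (lists standing in for disks) only shows why a single pass suffices; the real VMkernel implementation of course handles concurrent I/O:

    def copy_pass(src, dst, copied):
        # One sequential pass over the disk; no iterative re-copy is needed
        # because writes to already-copied blocks are mirrored as they happen.
        for i in range(len(src)):
            dst[i] = src[i]
            copied[i] = True

    def guest_write(block, data, src, dst, copied):
        # A guest write during migration always hits the source; it is
        # mirrored to the destination only if that block was already copied.
        src[block] = data
        if copied[block]:
            dst[block] = data

    src = list(b"AAAAAAAA")
    dst = [None] * len(src)
    copied = [False] * len(src)        # the block-level bitmap
    copy_pass(src, dst, copied)
    guest_write(3, ord("B"), src, dst, copied)
    assert dst == src                  # destination never fell behind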

vSphere HA provides high availability for virtual machines and the applications running within them by pooling the ESXi hosts they reside on into a cluster. Hosts in the cluster are continuously monitored. In the event of a failure, vSphere HA attempts to restart the virtual machines from the failed host on alternate hosts.

When a host is added to a vSphere HA cluster, an agent known as the Fault Domain Manager (FDM) starts on it. These agents communicate amongst themselves and exchange state and status information. An agent can perform the role of a master or a slave; an election algorithm determines which agent becomes the master, and all other hosts then act as slaves. As the master, the agent serves as the interface to vCenter Server, monitors the slave hosts and the virtual machines running on them, and ensures that information is distributed amongst the other nodes within the cluster as needed. The slave hosts exchange heartbeats with the master to track its health and availability, monitor the virtual machines running on them, and update the master about their status. If the master host becomes unavailable, the slaves hold a new election to select a master.
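
The election favors the host with access to the greatest number of datastores, with ties broken by a host identifier. The toy model below uses that criterion over invented host records; the real FDM protocol is a distributed algorithm, not a single function call:

    def elect_master(hosts):
        # Prefer the host that sees the most datastores; break ties on ID.
        return max(hosts, key=lambda h: (len(h["datastores"]), h["id"]))

    hosts = [
        {"id": "host-101", "datastores": {"ds1", "ds2"}},
        {"id": "host-102", "datastores": {"ds1", "ds2", "ds3"}},
        {"id": "host-103", "datastores": {"ds1", "ds3"}},
    ]
    master = elect_master(hosts)
    slaves = [h for h in hosts if h is not master]
    print(master["id"])    # host-102 wins; the other two become slaves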

If a host fails, the virtual machines hosted on it are restarted on other hosts. vSphere HA also detects other issues, such as an isolated host or a network partition, and takes action if required. While dealing with failures, vSphere HA can now take advantage of heartbeat datastores, a new feature that lets the nodes in a cluster communicate through shared storage when the management network has failed.
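
The benefit of heartbeat datastores is easiest to see as a decision table: silence on the management network alone is ambiguous, and the datastore heartbeat disambiguates it. A minimal sketch, with boolean inputs standing in for the real monitoring:

    def classify_silent_host(network_heartbeat, datastore_heartbeat):
        # How the master might interpret a slave it can no longer hear.
        if network_heartbeat:
            return "healthy"
        if datastore_heartbeat:
            # Alive but unreachable over the network: do not restart its VMs.
            return "isolated or partitioned"
        return "failed"    # restart its virtual machines on other hosts

    print(classify_silent_host(False, True))     # isolated or partitioned
    print(classify_silent_host(False, False))    # failed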

Using VMware's vLockstep technology, Fault Tolerance (FT) on the ESXi host platform provides continuous availability by protecting a virtual machine (the primary virtual machine) with a shadow copy (the secondary virtual machine) that runs in virtual lockstep on a separate host. Inputs and events performed on the primary virtual machine are recorded and replayed on the secondary virtual machine, ensuring that the two remain in an identical state. For example, mouse clicks and keystrokes are recorded on the primary virtual machine and replayed on the secondary virtual machine.

The secondary virtual machine can take over execution at any point without service interruption or loss of data because it is in virtual lockstep with the primary virtual machine.
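
A bare-bones sketch of the record/replay idea, assuming an ordered event log (the classes are invented; real FT records nondeterministic events at the hypervisor level rather than strings like these):

    import collections

    Event = collections.namedtuple("Event", "seq kind payload")

    class PrimaryVM:
        def __init__(self):
            self.state, self.log, self.seq = [], [], 0

        def apply(self, kind, payload):
            # Execute an input and record it for the secondary.
            self.seq += 1
            self.state.append((kind, payload))
            self.log.append(Event(self.seq, kind, payload))

    class SecondaryVM:
        def __init__(self):
            self.state = []

        def replay(self, log):
            # Replay recorded events in order to stay in virtual lockstep.
            for ev in sorted(log, key=lambda e: e.seq):
                self.state.append((ev.kind, ev.payload))

    p, s = PrimaryVM(), SecondaryVM()
    p.apply("keystroke", "ls")
    p.apply("mouse_click", (120, 48))
    s.replay(p.log)
    assert s.state == p.state    # identical state, so failover loses nothing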


VMware Distributed Resource Scheduler (DRS) helps you manage a cluster of physical hosts as a single compute resource by balancing CPU and memory workloads across the hosts. You can assign a virtual machine to a cluster, and DRS finds an appropriate host on which to run it. DRS places virtual machines so that the load across the cluster is balanced and cluster-wide resource allocation policies (for example, reservations, priorities, and limits) are enforced. When a virtual machine is powered on, DRS performs an initial placement of the virtual machine on a host. As cluster conditions change (for example, load and available resources), DRS uses vMotion to migrate virtual machines to other hosts as necessary.
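
Initial placement can be pictured as a greedy search for a host with enough spare capacity. The sketch below is a deliberate oversimplification with invented host records; real DRS weighs entitlements, reservations, and a cluster-wide imbalance metric rather than raw free capacity:

    def place_vm(vm, hosts):
        # Greedy initial placement: among hosts that can satisfy the VM's
        # demand, pick the one with the most free capacity.
        candidates = [h for h in hosts
                      if h["free_cpu"] >= vm["cpu"] and h["free_mem"] >= vm["mem"]]
        if not candidates:
            raise RuntimeError("no host can satisfy the VM's demand")
        best = max(candidates, key=lambda h: (h["free_cpu"], h["free_mem"]))
        best["free_cpu"] -= vm["cpu"]
        best["free_mem"] -= vm["mem"]
        return best["name"]

    hosts = [{"name": "esx01", "free_cpu": 4000, "free_mem": 8192},
             {"name": "esx02", "free_cpu": 9000, "free_mem": 16384}]
    print(place_vm({"cpu": 2000, "mem": 4096}, hosts))    # esx02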

Storage DRS, a new feature in vSphere 5.0, provides the same benefits for storage that DRS provides for compute: resource aggregation, automated load balancing, and bottleneck avoidance. You can group and manage a cluster of similar datastores as a single load-balanced storage resource called a datastore cluster. Storage DRS collects resource usage information for this datastore cluster and makes recommendations about initial VMDK file placement and migration to avoid I/O and space utilization bottlenecks on the datastores in the cluster.

Storage DRS also supports affinity and anti-affinity rules for virtual machines and VMDK files. VMDK affinity rules keep a virtual machine's VMDK files together on the same LUN; this is the default. VMDK anti-affinity rules keep a virtual machine's VMDK files on different LUNs, and virtual machine anti-affinity rules keep virtual machines on different LUNs. These are hard rules and cannot be violated.
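
Treating the rules as hard constraints makes the placement logic easy to sketch. The example below assumes invented datastore records and uses free space as the only metric, whereas real Storage DRS also weighs I/O load:

    def violates(ds, vmdk, anti_affinity):
        # True if placing vmdk on ds would co-locate disks that a rule
        # requires to stay on different datastores.
        for group in anti_affinity:
            if vmdk in group and ds["vmdks"] & (group - {vmdk}):
                return True
        return False

    def place_vmdk(vmdk, size_gb, datastores, anti_affinity):
        # Most free space wins, but only among datastores that respect
        # the hard anti-affinity rules.
        ok = [ds for ds in datastores
              if ds["free_gb"] >= size_gb and not violates(ds, vmdk, anti_affinity)]
        best = max(ok, key=lambda ds: ds["free_gb"])
        best["vmdks"].add(vmdk)
        best["free_gb"] -= size_gb
        return best["name"]

    datastores = [{"name": "ds1", "free_gb": 500, "vmdks": {"db-data.vmdk"}},
                  {"name": "ds2", "free_gb": 300, "vmdks": set()}]
    rules = [{"db-data.vmdk", "db-log.vmdk"}]    # keep these on different LUNs
    print(place_vmdk("db-log.vmdk", 40, datastores, rules))    # ds2, despite ds1's space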


When Distributed Power Management (DPM) is enabled, the system compares cluster-level and host-level capacity to the demands of the virtual machines running in the cluster. If the resource demands of the running virtual machines can be met by a subset of hosts in the cluster, DPM migrates the virtual machines to this subset and powers down the hosts that are not needed. When resource demands increase, DPM powers these hosts back on and migrates virtual machines to them. This dynamic right-sizing of the cluster reduces power consumption and, with it, operating expenses.
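
The consolidation decision can be sketched as one pass over the cluster. The 20% headroom figure and the CPU-only accounting below are assumptions made for illustration, not DPM's actual thresholds:

    def dpm_pass(hosts, demand_cpu):
        # Keep the fewest hosts whose combined capacity covers demand plus
        # headroom; place the rest in standby after evacuating their VMs.
        keep, carried = [], 0
        for h in sorted(hosts, key=lambda h: h["capacity_cpu"], reverse=True):
            if carried >= demand_cpu * 1.2:    # assumed ~20% headroom
                h["power"] = "standby"
            else:
                keep.append(h)
                carried += h["capacity_cpu"]
        return keep

    hosts = [{"name": "esx01", "capacity_cpu": 24000, "power": "on"},
             {"name": "esx02", "capacity_cpu": 24000, "power": "on"},
             {"name": "esx03", "capacity_cpu": 24000, "power": "on"}]
    active = dpm_pass(hosts, demand_cpu=20000)
    print([h["name"] for h in active])    # one host suffices; two go to standby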

ESXi Architecture: vSphere 5.0 offers two versions of ESXi:

- The ESXi Installable edition can be installed in a number of ways: interactively, with an installation script, or through Auto Deploy.
- The ESXi Embedded version comes pre-installed as firmware on hardware that you purchase from a vendor.

At the base of the ESXi architecture is the underlying operating system, called the VMkernel, which is a POSIX-like OS. The VMkernel provides the means for running all processes on the system, including management applications and agents as well as virtual machines. It has control of all hardware devices on the server and manages resources for the applications.

Virtual Machine Monitor (VMM): The key component of each ESXi host is a process called the VMM. One VMM runs in the VMkernel for each powered-on virtual machine. When a virtual machine starts running, control transfers to the VMM, which in turn begins executing instructions from the virtual machine. The VMkernel sets the system state so that the VMM runs directly on the hardware. However, the OS in the virtual machine has no knowledge of this transfer and thinks that it is running on the hardware.

Virtual Hardware: All the configuration details of a virtual machine are recorded in a small configuration (.vmx) file stored on the ESXi host. All the files that make up a virtual machine are typically stored in a single directory on either an NFS or a VMFS file system.

Device Drivers: The standard VMware virtual device drivers are the same in every virtual machine, regardless of the physical hardware underneath (any Windows virtual machine, for example, sees the same set of virtual devices). These standard virtual device drivers allow portability without having to reconfigure the OS of each virtual machine: if you copy a virtual machine's files to another ESXi host, it will run without hardware reconfiguration, even if the physical hardware is totally different.

VMware Tools: VMware Tools is a suite of utilities that enhances the performance of the virtual machine's guest OS and improves management of the virtual machine. Installing VMware Tools in the guest OS is vital: although the guest OS can run without VMware Tools, you lose important functionality and convenience. The VMware Tools service performs various duties within the guest OS and starts automatically when the guest OS boots.
Memory Optimization: ESXi uses several techniques to reclaim virtual machine memory:

- Transparent page sharing (TPS): reclaims host memory by removing redundant pages with identical content.
- Ballooning: reclaims host memory by artificially increasing the memory pressure inside the guest.
- Hypervisor swapping: reclaims host memory by having ESXi directly swap out the virtual machine's memory.
- Memory compression: reclaims host memory by compressing the pages that would otherwise need to be swapped out.
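
These techniques engage progressively as free host memory shrinks: TPS runs continuously, while ballooning, compression, and finally swapping kick in under increasing pressure. The sketch below illustrates that escalation ladder; the thresholds are invented for illustration and are not the VMkernel's actual memory-state boundaries:

    def reclamation_actions(free_pct):
        # Choose reclamation techniques by how scarce host memory is
        # (illustrative thresholds only).
        actions = ["transparent page sharing"]       # always on, opportunistic
        if free_pct < 6:
            actions.append("ballooning")             # ask the guest for pages back
        if free_pct < 4:
            actions.append("memory compression")     # cheaper than a disk swap-out
        if free_pct < 2:
            actions.append("hypervisor swapping")    # last resort: swap to disk
        return actions

    print(reclamation_actions(5))    # TPS + ballooning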


VMkernel Swap: Each virtual machine has a VMkernel swap file. If multiple virtual machines need their full allocation of memory, the ESXi host swaps their memory regions to disk on a fair-share basis governed by the memory resource settings you have assigned to each virtual machine. The VMkernel uses this feature only as a last resort because it noticeably degrades performance.

In vSphere 5.0, the VMkernel allows the ESXi swap to extend to local or network solid-state drive (SSD) devices, which enables memory overcommitment while minimizing the performance impact.

