Solaris Virtualization – An overview: Lunch and Learn
Why Virtualization? (stale news): Virtualization is an industry keyword now, offered as the answer to needs such as –
Datacenter Consolidation/Server Physical Footprint reduction
Server Power/Cooling footprint reduction
Workload balancing
Service Level Objective Automation (SLO automation)
What are our options?: The following general categories of virtualization are available:
Hardware level partitioning (technologies such as Sun Domains and HP’s nPAR)
Software level partitioning (technologies such as Solaris Containers, Logical Domains, XenSource, VMware and HP’s vPAR technology)
Hypervisor-based partitioning
Non-hypervisor based partitioning
Virtualization in Sun/Solaris
Why not Hardware Partitioning?: Hardware partitioning is an older technology.
There was a time when consolidating physical frames into a single big-frame server (e.g. the E10K, F15K, Sun Fire 25K) was considered cost-effective.
Technology has evolved, limiting the scope and reach of hardware partitioning (boxes have become cheaper and faster)
Inherent limitation of hardware partitioning – resources cannot be relocated dynamically (heavy loads cannot easily be moved across servers/frames, memory cannot be partitioned on the fly, etc.)
Exorbitant costs associated with features such as Workload Management/Global Workload Management in the mid to big-frame server market (e.g. HP-UX WLM costs around $4K per CPU core; on a 32-core rx8640 that amounts to a sticker price of about $128K per frame, plus more to migrate workloads across frames)
Software Partitioning – How?: Software partitioning can be achieved using hypervisor-based technology as well as without hypervisors.
Examples of soft partitioning/virtualization using hypervisors – VMware, XenSource, LDOMs, vPARs
Two concepts therein –
Type-1 Hypervisor (direct bare-metal access) vs Type-2 Hypervisor (resource access via intermediary software interface)
Examples of Non-Hypervisor virtualization – Solaris Containers (Zones + SRM/FSS)
Hypervisors are also called VMMs (Virtual Machine Monitors)
Non Hypervisor-based Virtualization: In the Solaris world this is known as a Container, which is a combination of Solaris 10 Zones and SRM (Solaris Resource Manager, a fair-share scheduling mechanism)
A zone is an isolated virtualized OS environment (its own “jail”) that shares the global zone’s kernel.
Sparse-root zones share binaries with Host OS (Global Zone)
Full-root zones have their own replica of the Global Zone’s binaries
SRM implements resource utilization rules using a combination of resource capping, resource pooling and Solaris projects (see the sketch after this list).
Different zones can be allocated different resource pools (poold) and the usage enforced via Resource-capping (rcapd).
Available in both SPARC as well as x86/x64 platforms
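As a rough sketch of how these SRM pieces fit together, the commands below cap physical memory for a hypothetical zone named webzone via rcapd and bind it to an FSS-scheduled processor set; the zone, pool and pset names and the limits are illustrative, and exact syntax can vary by Solaris 10 update:
    rcapadm -E                                # enable the resource capping daemon (rcapd)
    rcapadm -z webzone -m 2g                  # cap webzone's physical memory at 2 GB
    pooladm -e                                # enable the resource pools facility (poold)
    pooladm -s                                # save the active configuration to /etc/pooladm.conf
    poolcfg -c 'create pset web-pset (uint pset.min = 2; uint pset.max = 4)'
    poolcfg -c 'create pool web-pool (string pool.scheduler = "FSS")'
    poolcfg -c 'associate pool web-pool (pset web-pset)'
    pooladm -c                                # commit and activate the configuration
    zonecfg -z webzone 'set pool=web-pool'    # bind the zone to the pool (takes effect on next zone boot)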
Non Hypervisor-based Virtualization: BrandZ allows zones to run a different OS by abstracting instructions in software (e.g. signal redirection, since Linux and Solaris implement signals differently, and trampoline code); an example follows the list below.
Currently available –
Solaris 8 brandZ (via the Solaris 8 Migration Assistant project, codename Etude) on SPARC
Linux brandZ (RHEL3 and CentOS) on the x86/x64 line
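A minimal sketch of creating a Linux (lx) branded zone on x86, assuming a CentOS 3.x filesystem image tarball is already on hand; the zone name, zonepath and image path are placeholders:
    zonecfg -z centos-zone
    zonecfg:centos-zone> create -t SUNWlx
    zonecfg:centos-zone> set zonepath=/zones/centos-zone
    zonecfg:centos-zone> exit
    zoneadm -z centos-zone install -d /var/tmp/centos_fs_image.tar.bz2
    zoneadm -z centos-zone boot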
Hypervisor-based Virtualization: Type-1 hypervisor-based technologies available from Sun are the LDOM and xVM technologies.
Type-1 hypervisors allow near bare-metal execution of the OS (provided the OS is hypervisor-aware). A thin layer of software controls the hardware resources and provides an interface to the guest operating systems.
Solaris 10 + LDOMs in SPARC world (T1/T2-based processors)
Solaris 10 + Xen dom0 (xVM) in x86/x64 world
Hypervisor-based Virtualization: Type-2 hypervisor
A Type-2 hypervisor reaches the hardware through a software interface, i.e. it runs on top of a host OS (à la VMware Workstation, VMware GSX Server, etc.)
More handy for desktop virtualization (out of scope of this discussion)
Sun xVM Suite: Provides xVM Server (a Xen-based, Type-1, paravirtualized solution) on x86/x64, with optional hardware-assisted full virtualization known as xVM w/HVM (HVM does not perform very well yet, per research)
Provides LDOMs for UltraSPARC T1 and T2 processors
xVM x86/x64: The Sun xVM Hypervisor is the VMM
Dom 0 is the Control VM (provides domain management and console services along with guest apps)
Dom 0 can directly access I/O
First VM started by the VMM
Dom U are the “real” Guest OSes
Dom U accesses I/O via Dom 0 drivers
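Assuming the xVM (Xen) tools are installed in Dom 0, a guest described in a config file can be started and observed with the standard xm commands; the guest and file names here are placeholders:
    xm create /etc/xen/guest1.cfg      # Dom 0 asks the VMM to build and start the Dom U
    xm list                            # shows Domain-0 plus all running Dom U guests
    xm console guest1                  # attach to the guest's console via Dom 0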
xVM x86/x64: Dom U accesses resources through Dom 0 using hypercalls.
The following form the “plumbing” for Dom U operations
Hypercalls – synchronous calls from the GOS to the VMM
Event Channel – Asynchronous notifications from VMM to VMs
Grant Table – shared-memory communication between VMM and VMs and among VMs
XenStore – a hierarchical repository of control and status information
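The XenStore can be browsed from Dom 0 with the xenstore utilities shipped with the Xen tools (install paths may differ on Solaris xVM), which makes this control/status plumbing visible:
    xenstore-ls /local/domain/0         # control and status tree for Dom 0
    xenstore-read /local/domain/1/name  # e.g. the name of the domain with ID 1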
xVM x86/x64: Memory Management
Physical Memory sharing and partitioning
xVM introduces distinction between machine memory and physical memory
The VMM uses hotplug and ballooning techniques to optimize memory usage.
Hotplug lets a VM dynamically adjust the memory in its inventory
Ballooning is used by the VMM to control the distribution of physical memory between VMs
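Ballooning is driven from Dom 0; for example, the memory target of a running guest can be changed on the fly (domain name and sizes are illustrative):
    xm mem-max guest1 2048     # upper bound (MB) the guest may balloon up to
    xm mem-set guest1 1024     # ask the balloon driver to settle the guest at 1024 MB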
xVM x86/x64: I/O Virtualization
Uses a split device driver architecture
Front-end driver runs in the Dom U
Back-end driver runs in Dom 0
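The split-driver model shows up in the guest's config file: each front-end device in Dom U is declared against a back end served by Dom 0. A minimal paravirtualized guest config sketch (names, paths and the kernel/bootloader details, which depend on the guest OS, are illustrative):
    name   = "guest1"
    memory = 1024
    disk   = ['phy:/dev/zvol/dsk/tank/guest1,0,w']   # block front end in Dom U, backed by Dom 0
    vif    = ['']                                    # network front end in Dom U, backed by Dom 0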
xVM w/HVM: The following requirements have to be met for HVM support:
A processor that allows an OS with reduced privilege to execute sensitive instructions
A memory management scheme for a VM to update its page tables without accessing MMU hardware
An I/O emulation scheme that enables a VM to use its native driver to access devices through an I/O VM
An emulated BIOS to bootstrap the OS
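For comparison, once those HVM requirements are met an unmodified guest is described with an HVM-style config that pulls in the emulated BIOS (hvmloader) and the Dom 0 device-emulation model; the paths below are common Xen defaults and may differ on a given xVM build:
    builder      = 'hvm'
    kernel       = '/usr/lib/xen/boot/hvmloader'   # emulated BIOS that bootstraps the unmodified OS
    device_model = '/usr/lib/xen/bin/qemu-dm'      # emulates disks, NICs, etc. from Dom 0
    memory       = 1024
    disk         = ['phy:/dev/zvol/dsk/tank/hvm1,hda,w']
    boot         = 'c'                             # boot from the emulated hard disk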
xVM w/HVM: The processor for HVM has two operating modes:
privileged mode and reduced privilege mode
Processor behavior in the privileged mode is very much the same as the processor running without the virtualization extension.
Processor behavior in the reduced privilege mode is restricted and modified to facilitate virtualization.
xVM w/HVM: After HVM is enabled, the processor operates in privileged mode
A transition from privileged mode to reduced-privilege mode is called a VM Entry
The exit back to privileged mode is called a VM Exit
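Whether the hardware and hypervisor actually expose these HVM modes can be checked from Dom 0; an hvm entry in xen_caps indicates VT-x/AMD-V support is available to guests (output shown is illustrative):
    xm info | grep xen_caps    # e.g. xen_caps : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_64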
LDOMs: LDOMs leverage the CMT technology of Niagara processors and partition the processor into “strands”.
Each strand gets its own hardware resources
Each VM (domain) is associated with one or more dedicated strands
The Hypervisor is the hardware abstraction layer
LDOMs: The OS boots from OBP
After boot, the LDM (LDOM Manager) is enabled and initializes the first domain (the control domain), a.k.a. the keys to the kingdom
LDOM doesn’t share strands with other domains
Solaris Guest domain can directly access hardware
LDC (Logical Domain Channel) services provide the plumbing between domains
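Once the control domain is up, the ldm command drives everything; a rough sketch of carving out a guest domain with its own strands and memory (names and sizes invented, syntax per early LDoms releases):
    ldm add-domain ldg1       # create an (empty) guest domain
    ldm add-vcpu 8 ldg1       # dedicate 8 strands to it
    ldm add-memory 4G ldg1    # give it 4 GB of memory
    ldm bind-domain ldg1      # bind the resources to the domain
    ldm start-domain ldg1     # start it; its console is reached via vntsd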
LDOMs: Memory Virtualization
Physical Memory sharing and partitioning
Address space is split into:
Virtual Address (VA): User space programs access this
Real Address (RA): underlying memory associated with guest domain/os
Physical Address (PA): appears in the system bus to access physical memory
LDOMs: I/O Virtualization
LDOMs provide the ability to partition PCI buses so that more than one domain can access devices directly.
Domain that has direct access to I/O is called an I/O domain or service domain
Domain that doesn’t have direct access uses the VIO (virtual I/O) framework and goes through the I/O Domain for access
Device drivers vds (disk server) and vsw (net switch) are the server drivers in the I/O domain
vdc (disk client) and vnet (network) drivers are the client drivers in non-I/O domains (vnex provides bus services to vdc and vnet)
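A sketch of wiring up virtual I/O from the control/I/O domain using the vds/vsw server drivers and the vdisk/vnet clients described above (device and volume names are illustrative):
    ldm add-vds primary-vds0 primary                          # virtual disk server (vds) in the I/O domain
    ldm add-vsw net-dev=e1000g0 primary-vsw0 primary          # virtual switch (vsw) bound to a physical NIC
    ldm add-vdsdev /dev/zvol/dsk/pool/ldg1 vol1@primary-vds0  # export a backend volume through vds
    ldm add-vdisk vdisk0 vol1@primary-vds0 ldg1               # vdc client disk in the guest domain
    ldm add-vnet vnet0 primary-vsw0 ldg1                      # vnet client NIC in the guest domain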
Network and I/O Virtualization: Two projects underway in the Solaris community (OpenSolaris, to be precise) are –
NPIV (N-Port ID Virtualization) to virtualize HBAs at the OS level
Crossbow – NIC Virtualization
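As a taste of where Crossbow is heading, OpenSolaris builds with the project integrated let a physical NIC be carved into virtual NICs that can then be handed to zones or guest domains (link and VNIC names are illustrative):
    dladm create-vnic -l e1000g0 vnic0    # create a virtual NIC on top of a physical link
    dladm show-vnic                       # list configured VNICs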
Zone creation – demo: Create a simple Solaris 10 zone and deploy a few apps, etc.
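For reference, the demo boils down to something like the following (zone name, interface and address are placeholders; a sparse-root zone is created by default):
    zonecfg -z demozone
    zonecfg:demozone> create
    zonecfg:demozone> set zonepath=/zones/demozone
    zonecfg:demozone> add net
    zonecfg:demozone:net> set physical=e1000g0
    zonecfg:demozone:net> set address=192.168.1.50
    zonecfg:demozone:net> end
    zonecfg:demozone> exit
    zoneadm -z demozone install    # install packages into the zonepath
    zoneadm -z demozone boot
    zlogin -C demozone             # attach to the console for first-boot sysid questions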
VMM Comparison
Comparison Matrix (Virtualization on SPARC)
Comparison Matrix (Virtualization on x86/x64)
References: http://prefetch.net/presentations/SolarisVirtualization_Presentation.pdf
http://developers.sun.com/events/techdays/presentations/2007/TD_BOS_SolarisVirtualization_Dickson.pdf
http://johnjmclaughlin.blogspot.com/2007/12/xvm-white-paper-blueprint-and.html
http://www.cassatt.com/
http://en.wikipedia.org/wiki/Hypervisor
http://blogs.sun.com/kucharsk/entry/signal_corps