Virtual Machine Hosting

Virtual Machine (VM) hosting provides VMs so that your organization's IT staff can run dedicated, customized Linux or Windows systems. This allows your IT staff to focus on your computing needs without the burden of purchasing and maintaining hardware.

Your VM will live in UFIT's secure private cloud which leverages:

  • Multiple enterprise-class datacenters
  • Secure enterprise-class network - public or UF private IP space available
  • SAN/NAS-backed failover to prevent downtime due to hardware failure or maintenance
  • VMware vSphere 6.5 environment

UFIT Provides

Everything up to the hypervisor (virtualization layer). This includes all physical resources such as computing hardware, networking, datacenter resources, and VMware software. UFIT also provides management access to your VM, including console access, CD media configuration, power on/off control, and VMware Tools installation.

Customer Provides

You provide the IT staff to install, configure, and maintain all software on your VM, including the operating system and applications, and to maintain proper licensing for any software installed. Your staff is also responsible for all monitoring and backups of your VM: while UFIT monitors the health of the underlying hypervisor systems, we do not monitor the guest OS or applications on your hosted VM. Your staff is further responsible for working with UFIT Network Services to maintain network ACLs for your VM's IP address(es), and for responding to and working with UFIT's Office of Information Security and Compliance.
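Because UFIT monitors only the hypervisor layer, guest OS and application monitoring is up to you. A minimal reachability check might look like the following sketch; the hostname and port in the usage comment are placeholders, not real UFIT endpoints:

```python
import socket

def check_service(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (hypothetical host): poll your VM's SSH port and alert on failure.
# if not check_service("myvm.example.ufl.edu", 22):
#     page_on_call_staff()
```

In practice you would run such a check from a machine outside the VM itself, on a schedule, and feed the result into whatever alerting system your unit already uses.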

Data Center Architecture

The University of Florida has two physical data centers: the Space Sciences Research Building (SSRB) on campus and the University of Florida Data Center (UFDC) off campus. Our infrastructure design stretches compute, storage, and network across these two physical datacenters to form two Availability Zones (AZ1, AZ2). Each AZ consists of dedicated compute, storage, and network infrastructure, providing redundancy across the physical data centers. Compute consists of stretched clusters that allow machine state to be migrated between datacenters at any time. Storage consists of two SAN arrays, one in each datacenter, synchronously replicating data to provide failover between datacenters. Network consists of two pairs of routers in each datacenter, providing a network fabric across both datacenters.
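The key property of the synchronous replication described above is that a write is only acknowledged once both arrays hold it, so either array alone has a complete, current copy. A toy illustration of that idea (not the actual SAN protocol):

```python
class MirroredStore:
    """Toy model of synchronous replication between two SAN arrays,
    one per datacenter (names follow the SSRB/UFDC sites above)."""

    def __init__(self):
        self.array_ssrb = {}  # copy at the on-campus datacenter
        self.array_ufdc = {}  # copy at the off-campus datacenter

    def write(self, key, value):
        # Both copies are updated before the write is acknowledged,
        # so losing either array loses no acknowledged data.
        self.array_ssrb[key] = value
        self.array_ufdc[key] = value
        return "ack"

    def failover_read(self, key):
        # If one array fails, reads are served from the survivor.
        return self.array_ufdc.get(key)
```

Note the flip side of this model, which matters for the "Corrupt Data" scenario later in this document: a corrupt write is mirrored just as faithfully as a good one, so mirroring is not a substitute for backups.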

Together, these infrastructure resources provide the capability to automatically shift VM workloads between datacenters as needed, and to automatically restart VMs (a brief outage) or fail them over (no outage, only a short "stun") in the event of an infrastructure failure.

VM Availability Deployment Recommendations

For applications with a single-box architecture, place your VM in either AZ and do NOT pin the VM to a datacenter.

For applications with a highly available architecture, place at least one VM in each AZ and, for additional redundancy, "pin" each VM to a different data center within its AZ (e.g., VM1 = AZ1 SSRB, VM2 = AZ2 UFDC).
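The two recommendations above can be expressed as a small placement rule. The sketch below is purely illustrative: the AZ-to-site mapping is an assumption based on the example in the text, and `place_vms` is a hypothetical helper, not a UFIT tool:

```python
# Assumed AZ-to-datacenter mapping, taken from the example above.
AZ_DATACENTER = {"AZ1": "SSRB", "AZ2": "UFDC"}

def place_vms(vm_names, highly_available):
    """Spread VMs round-robin across AZs; pin to a datacenter only
    for HA deployments. Single-box VMs stay unpinned so they can
    migrate freely between sites."""
    placements = []
    azs = list(AZ_DATACENTER)
    for i, name in enumerate(vm_names):
        az = azs[i % len(azs)]
        pin = AZ_DATACENTER[az] if highly_available else None
        placements.append({"vm": name, "az": az, "pin": pin})
    return placements
```

For example, `place_vms(["vm1", "vm2"], highly_available=True)` yields VM1 in AZ1 pinned to SSRB and VM2 in AZ2 pinned to UFDC, matching the recommendation above.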

Availability Zone Design Diagram

Infrastructure Failure Scenarios

Datacenter infrastructure can fail in a number of ways. The sections below describe the most common failure modes we have seen and the recovery model for each.

Single AZ Network(s) Outage

All resources using the network(s) will become unavailable via network interfaces. The VMs will continue to run and no automated infrastructure recovery actions will be taken. Resources in other AZs will remain unaffected.

Single AZ Single Storage Array Outage

All resources using the storage array will become 'stunned' or freeze for approximately 30 seconds while the storage paths fail over to the storage array in the other datacenter. Most VMs will remain powered on and experience no 'outage'. Resources in other AZs will remain unaffected.

Single AZ Single Compute Cluster Outage

All VMs using the compute cluster will power off. The VMs will be automatically started on unaffected infrastructure hardware within 5 minutes. Resources in other AZs will remain unaffected.

Single Compute Enclosure Outage

All VMs running on the compute enclosure in both AZs will power off. The VMs will be automatically started on unaffected infrastructure hardware within 5 minutes.

Single AZ Corrupt Data

Because data is synchronously mirrored between the arrays, any corruption is replicated to both copies. Affected VMs will have to be rebuilt or restored from backup. Resources in other AZs will remain unaffected.

Single Datacenter Outage

All VMs running in the affected datacenter will be powered off. The VMs will be automatically started on unaffected infrastructure hardware within 5 minutes. Resources in other datacenters will remain unaffected.
