10 Myths of Virtualization
Lone Star College System has become a virtualization success story. One secret to its success: not allowing common virtualization fears to hold back progress.
By Dian Schaffhauser
Illustration by Nigel Buchanan
Half of servers in higher ed are virtualized, according to the Educause 2011 Core Data Service Report. But that number's not high enough for Link Alander, interim vice chancellor and CIO at the Lone Star College System (TX). He aspires to see 100 percent of the system's infrastructure requirements delivered as IT services from its own virtualized data centers or other cloud-based operators.
The Houston-area community college system, with 100,000 students attending classes at 16 locations across 1,400 square miles, is continually expanding. Twenty-one new buildings were added in 2011, and the student population exploded from 63,000 to 90,000 in three years. Back in 2008, the system suffered from unreliable services with constant outages. Most of the hardware components in the data center were at end-of-life, and the student ERP system couldn't keep up during registration--a highly visible bruise to IT's reputation.
Since then, Lone Star's IT has undergone a methodical transformation in its organization, infrastructure, and business operations, all in pursuit of hyper-virtualization. (The community college system won a 2012 Campus Technology Innovators award for its effort.) As a result, it can now boast "five nines" service levels, a standard set in 2009 and achieved regularly by late 2010. This level of performance allows for only about five minutes of unplanned downtime for any given service each year. "It's not just the data center that has to be up," Alander notes. "If the service can't be delivered due to whatever problem, that's part of your downtime."
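To put "five nines" in concrete terms, the downtime budget can be worked out with simple arithmetic. The sketch below is illustrative only: it assumes a 365-day year, and the function name is hypothetical rather than anything Lone Star uses.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


# Compare three, four, and five nines side by side.
for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {allowed_downtime_minutes(target):.1f} minutes per year")
```

Under that assumption, 99.999 percent leaves a little over five minutes a year, while 99.9 percent leaves more than eight hours.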
Getting to that point of IT maturity and high availability hasn't been without challenges. Typically, though, it's misplaced concerns about virtualization that get in the way. Here, Alander and Cory Bradfield, Lone Star's infrastructure architect, address 10 common myths that can stop campuses from getting 100 percent out of their virtualization efforts.
Myth No. 1: Virtual Server Creep Is a Given
Because virtualization makes it much easier to provision servers on demand, it's possible for the growth of virtual machines at an institution to spiral out of control--resulting in the dreaded virtual server creep. Indeed, when the push to broaden virtualization started at Lone Star, security logs revealed quite a few servers set up for test purposes that nobody had touched for months.
With proper management, however, virtual server creep is hardly a foregone conclusion. Alander and Bradfield recommend putting in place a strong lifecycle-management policy to maintain standards in areas such as change control, use of "golden master copies" for configuration of virtual machines, and other processes.
Now, before the Lone Star team provisions a new environment, standard practice is to ask users what it will be used for and how long it'll be needed. When the time is up, IT checks back to see if the virtual machine is still needed. If not, it's decommissioned and its dedicated resources are returned to the pool from which they came. The decommissioned system is archived, just in case it needs to be brought back, and is officially purged three to six months later, depending on the service it was tied to.
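The lifecycle policy described above can be sketched as a small decision function. This is a hypothetical illustration, not Lone Star's tooling: the `VmRecord` fields, the `next_action` function, the action names, and the 90-day retention period (the low end of the three-to-six-month range) are all assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

ARCHIVE_RETENTION = timedelta(days=90)  # assumption: low end of "three to six months"


@dataclass
class VmRecord:
    name: str
    purpose: str                        # what the requester said it is for
    review_on: date                     # when IT checks back with the requester
    still_needed: bool = True           # answer given at review time
    archived_on: Optional[date] = None  # set once decommissioned and archived


def next_action(vm: VmRecord, today: date) -> str:
    """Return the lifecycle step that applies to this VM today."""
    if vm.archived_on is not None:
        # Already decommissioned: purge only after the retention period.
        if today - vm.archived_on >= ARCHIVE_RETENTION:
            return "purge"
        return "keep-archived"
    if today < vm.review_on:
        return "run"
    # Review date reached: keep it if still needed, otherwise decommission.
    return "extend" if vm.still_needed else "decommission-and-archive"


vm = VmRecord("test-01", "load testing", review_on=date(2012, 6, 1), still_needed=False)
print(next_action(vm, date(2012, 6, 2)))  # decommission-and-archive
```

Recording the purpose and review date at provisioning time is what makes the later check possible; without them, idle test servers are found only by accident.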
Myth No. 2: The Virtual Environment Is More Complex
A virtual environment is no more complex to manage than any other kind of infrastructure, Bradfield and Alander insist. In fact, they believe it's considerably easier. It just requires the proper training up front, the right organizational structure, and standardization on the back end.
"Invest in your people," Alander exhorts. "I wish I could do more training. We do as much as we possibly can. If you try to roll out a virtualized environment without somebody who's fully trained in virtualization, then you're going to have issues."
In addition to training staff, Lone Star also focused on restructuring IT roles. "We came up with defined roles of who would be doing what in the management of this environment," Bradfield says. Then, instead of sending everybody off to an outside class where only part of the material might have been relevant, the college system did customized training for each of the new roles. "We engaged staff in four and five days of on-site materials and our environment. That accelerated our learning quite a bit."
In the old structure, each Lone Star campus was autonomous. So a systems administrator from the Tomball campus, for example, would be involved in everything from speccing and installing servers to putting in place the appropriate applications. "Now his job consists of project management of the server requests [for his campus] and making sure he gathers the requirements up front," notes Bradfield. "We deploy the server farm, he goes in and installs the applications he needs, and he's off and running."
As a result, virtualization has given admins a simpler work environment. "Everything they touch is deployed from a standard set of templates," Bradfield says. "Whether it's a dental arts server or an antivirus server, they know that the configuration is the same throughout. They don't have to worry about storage and the underlying infrastructure."
Myth No. 3: Backups Take Too Long
Traditional streaming backup strategies don't hold up well in a virtual scenario. With fewer physical servers--all dedicated to virtualized production work--backups often end up as an afterthought, handled by whatever physical resources are left. At the same time, stratospheric growth in the amount of data generated through virtualized IT services can mean longer backup times. Backing up a 450-plus-terabyte datastore, such as Lone Star maintains, could easily exceed the available window. Yet that hasn't troubled Lone Star in its shift to total virtualization.
The key, says Alander, is a unified storage setup that is dynamic and flexible. Lone Star uses dual data centers 37 miles apart, one hot (running critical services), one warm (noncritical services), for quick failover purposes. Storage arrays consist of a combination of EMC products, including VNX, Symmetrix VMAX, Avamar, Centera, Rainfinity, and RecoverPoint. The system's weekly full backup window is only eight hours, compared to 48 hours pre-virtualization.
Lone Star is constantly monitoring costs and making adjustments to its backup infrastructure. The college currently uses a pair of EMC products, NetWorker for backup and recovery and Avamar for deduplication, which "do a phenomenal job of taking care of remote locations," Alander says. But he's found that the combination has "a higher cost per terabyte," compared to another EMC duo: NetWorker and Data Domain (a deduplication storage system EMC acquired in 2009). So that's the route the college will take next in a refresh of its backup and recovery solutions.
Myth No. 4: The Data Center Requires More Power
A data center running a set of blade servers may, indeed, require more power and generate more heat than the previous layout with physical servers. The comparison isn't really equitable, however, because the virtualized version has much greater capacity and performance capability. In fact, Lone Star is using only about a quarter of the power that would be consumed by a comparable number of physical servers performing the same work, according to Bradfield. Heat generation shows a similar drop.
Still, IT departments working with high-performance virtualized servers may have to "rethink how they're going to reposition that type of equipment because the heat is condensed [in one place]," Alander admits. But that's easily done, he says--Lone Star uses traditional forced-air cooling.
There is one other caveat, adds Bradfield. "If you're transitioning from a stand-alone storage environment to shared storage, the [storage area networks] can be pretty power intensive. In our case we already had [shared storage] in place, so we didn't see that increase either."
Myth No. 5: Infrastructure Costs Rise
For many institutions, virtualization is tied to a larger goal of higher availability of server resources--and that's where the cost comes in. At Lone Star, high availability was a major priority for the organization, one set by the chancellor and the board and agreed to by the Office of Technology Services (OTS). That's why the institution has two data centers; should one go down, services will automatically fail over to the other.
Maintaining dual setups is a pricey decision, Alander acknowledges. "Yes, we have capacity that's not being used, but that's similar to someone having a disaster-recovery site ready to go." And once an institution makes that investment, the benefits are far-reaching. "You're delivering that level of availability to all services you deploy, not just to the select few that require it," notes Bradfield, adding that the resources in the second data center don't sit idle: They're used to house many of Lone Star's testing and development environments.
With or without the dual data center setup, the efficiency of a virtualized infrastructure is a cost-saver in terms of reduced downtime. While the loss of a physical server could take systems offline for hours or days, the loss of a physical component of a virtual host merely means a transition to another set of CPU and memory, explains Alander. Even if Lone Star didn't have that second site, he insists, if the college lost a big portion of its hardware, OTS could reprovision all of those virtual machines to another location automatically or manually "faster than you could get the parts in to fix the hardware."
Myth No. 6: Legacy Software Can't Cope
To the uninitiated, logic dictates that legacy software and a modern, virtualized infrastructure do not mix.
Actually, says Alander, the problems with running software in a virtual environment have rarely been technical. "The only obstacle I've ever seen is vendors who have said they will not support their software in a virtual environment," he explains. "But we've proved to quite a few of those vendors that we can run it in a virtualized environment and provide the service."
Bradfield says the only issues he's experienced in trying to run even the most ancient of programs are logistical: not being able to find the original media or not having the means to reinstall it. "We've never run into a piece of software that would not run in a virtualized environment," he proclaims. "Most run better."
Myth No. 7: Troubleshooting Is a Guessing Game
Many virtualization naysayers may remember the early days when troubleshooting tools weren't very sophisticated. Indeed, when Lone Star started down its virtualization path, the tools didn't exist to get end-to-end visibility, Bradfield says. What was available back then "gave us a pretty decent high-level view and pointed us in the right direction when problems arose. But [the tools] still couldn't see into the network stack beyond the hypervisor or storage array."
The college now uses a combination of products across the network, storage, and applications, including VMware's vCenter Operations Management Suite, SolarWinds' Network Performance Monitor, EMC Ionix, and HP Insight Control, all of which provide a detailed view into the virtual infrastructure. Of course, divvying up problem-solving across multiple tools continues to make troubleshooting "a struggle," Bradfield acknowledges.
Also, the metrics used to monitor potential problems change in a virtual setup, so system admins need a new troubleshooting mindset--and that takes retraining. For example, 100 percent CPU usage in a virtualization scenario could mean the CPU is truly tapped out; but it could also mean there's CPU contention going on, there's a limit put in place for use of that CPU, or the CPU is actually sitting there waiting for a response from a disk I/O command. The system admin may conclude that another core is needed, or additional I/O capacity on the back end, or a tweak of the resources.
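The reasoning in the CPU example can be sketched as a simple triage function. This is a minimal illustration under stated assumptions: the `CpuSample` fields, the `explain_high_cpu` function, and every threshold are hypothetical, chosen only to show the shape of the logic. They are not taken from Lone Star's monitoring tools or from any vendor's API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CpuSample:
    usage_pct: float                   # CPU usage reported for the VM
    ready_pct: float                   # time the VM waited for a physical CPU
    io_wait_pct: float                 # time spent waiting on disk I/O
    demand_mhz: float                  # CPU the VM is asking for
    limit_mhz: Optional[float] = None  # administrative cap, if one is set


def explain_high_cpu(s: CpuSample) -> List[str]:
    """List plausible causes for a VM showing close to 100 percent CPU usage."""
    if s.usage_pct < 95:  # assumed cutoff for "saturated"
        return ["usage is not saturated"]
    causes = []
    if s.ready_pct > 10:  # assumed threshold for contention
        causes.append("CPU contention on the host")
    if s.limit_mhz is not None and s.demand_mhz > s.limit_mhz:
        causes.append("CPU limit is capping the VM")
    if s.io_wait_pct > 20:  # assumed threshold for I/O stalls
        causes.append("waiting on disk I/O")
    if not causes:
        causes.append("CPU is genuinely tapped out")
    return causes


sample = CpuSample(usage_pct=100, ready_pct=2, io_wait_pct=45, demand_mhz=1800)
print(explain_high_cpu(sample))  # ['waiting on disk I/O']
```

Each cause points to a different remedy: another core for true saturation, more back-end I/O capacity for disk waits, or a change to resource settings for limits and contention.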
Mostly, Bradfield observes, solving problems "comes back to training and experience, understanding the types of issues that can occur, and knowing where to look."
Myth No. 8: Capacity Planning Is a Nightmare
In a dynamic scenario where conceivably "anything goes," how does the IT team plan for usage? With virtualization, capacity planning is best described in simpler, old-fashioned terms, Alander notes. "The reality is, you're just back in the old days of capacity management, like on the mainframe. You're not talking about individual servers as much. You're talking compute resources."
Dedicated SAN management tools have made the job of capacity planning easier, Bradfield adds. But he still finds it a challenge to manage thin provisioning--fooling various parts of the system into thinking they have more resources than actually exist. "Decide up front where thin provisioning will be done," he recommends, "because you can do it at the hypervisor, VM, and storage array layers. Determine where you're going to do it, and stick to that layer. Don't do it at all layers, because then capacity management is just a nightmare."
Lone Star's standard right now is to do thin provisioning on the storage array. "It does a better job," Bradfield says. "The reporting tools are there to help us understand the capacity management and there's less performance overhead there than at the hypervisor level."
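One way to see why thin provisioning at several layers complicates capacity management is to multiply the overcommit ratio applied at each layer. The numbers below are made up for illustration and are not Lone Star's figures; the function name is hypothetical.

```python
from math import prod
from typing import Dict


def effective_overcommit(ratios_by_layer: Dict[str, float]) -> float:
    """Multiply per-layer overcommit ratios to get promised-to-physical capacity."""
    return prod(ratios_by_layer.values())


# Assumed ratios: 2.0 means a layer promises twice the capacity it is given.
single_layer = {"storage array": 2.0}
stacked = {"storage array": 2.0, "hypervisor": 1.5}

physical_tb = 100  # assumed physical capacity in terabytes
print(physical_tb * effective_overcommit(single_layer))  # 200.0
print(physical_tb * effective_overcommit(stacked))       # 300.0
```

With a single thin layer, one report shows how much capacity has been promised against what physically exists. Once layers are stacked, the promises compound and no single tool sees the full picture.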
Myth No. 9: Users Come to Expect Miracles
Having users expect amazing performance from IT services isn't, in itself, a bad thing. But in a high-availability virtualized environment, if users are shielded from the challenges of maintaining an always-up system, they can take it for granted--and suddenly IT may find its budget slashed.
That scenario can be prevented by communication, says Alander. "I often have to share failures with management. They don't see the problems, but the problems still exist. Problems happen just as frequently as they used to; it's just that they're transparent to [management]."
Recently, during registration, a network issue caused a failover of the ERP system and "some other crashes that came along the way." Behind the scenes, Bradfield pulled logs to see why the disconnect had happened; networking people were engaged; ERP people became involved. "It took a lot of time from the staff," Alander recalls, but "nobody saw anything."
To make sure that work doesn't go unnoticed, Alander uses a "showback" model to keep the college aware of the costs involved in delivering 99.999 percent uptime. At the start of a project, divisions are charged for the capacity they're consuming so they're aware of the costs; afterward, OTS assumes responsibility for those resources. That way, campus leadership gains an awareness of what it's getting and will be protective of the OTS budget.
Myth No. 10: Virtualized Services Can't Deliver All They Promise
The terms "virtualization" and "cloud" are bandied about so often these days that skeptics may underestimate their potential. But the benefits of dynamic scaling and application elasticity--delivering IT services to the school as they're needed--are no myth, insists Alander.
The example he points to: that ERP system that used to break down at peak student registration time before the start of every semester, grinding work to a halt. Now, the virtualized environment allows OTS to shrink development and testing capacity and spin up more application servers and web servers in order to handle the onslaught.
"The processes are delivering exactly what we expected originally," Alander declares. "The application performance during peak registration periods is exactly the same as application performance during our dead time. The difference is that I may be running 12 more app and web servers than I was running before. When I'm done, those things disappear; they go back into hiding until they're needed again."
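The registration-time shift Alander describes can be sketched as moving capacity between two groups inside a fixed pool. This is a simplified illustration, not OTS's actual process: the pool sizes are made up, the function name is hypothetical, and the sketch assumes one unit of development and testing capacity can host one application or web server.

```python
from typing import Dict


def rebalance_for_peak(pool: Dict[str, int], extra_servers: int) -> Dict[str, int]:
    """Move capacity from dev/test to app and web servers, keeping the pool size fixed."""
    if extra_servers > pool["dev_test"]:
        raise ValueError("not enough dev/test capacity to reclaim")
    peak = dict(pool)  # copy so the normal allocation is left untouched
    peak["dev_test"] -= extra_servers
    peak["app_web"] += extra_servers
    return peak


normal = {"dev_test": 20, "app_web": 8}  # made-up numbers
peak = rebalance_for_peak(normal, 12)    # 12 echoes the "12 more app and web servers" quote
print(peak)                              # {'dev_test': 8, 'app_web': 20}
```

The total stays the same before and after, which is the point: the extra servers come from capacity the college already owns rather than from new hardware.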