LMU Reins In Network Ops with Change Management
- By Dian Schaffhauser
- 04/23/09
Dan Cooke remembers the day well. About four years ago, a core group of IT staff members at Loyola Marymount University in Los Angeles decided to clean up the DNS tables on the campus network. As Cooke, director of infrastructure technology and services, recalled, "Even though all the great technical minds were in the room and agreed to [the decision], it didn't get vetted by anybody else outside the group." What they had overlooked, he said, was that finals week is probably not a good time to make a network change.
The modification to the domain name system created problems with mail flow, "which during finals week at a university is a huge problem." There was no easy way to roll back the change, and the mess that ensued, he said, became "representative of what we didn't want to be."
This wasn't the first time a change had impaired network operations. Whether the change was a switch or router reconfiguration or a patch applied to a server, IT was frequently caught off guard when a problem surfaced afterward. Often, one team member didn't know that another had made the change in the first place. "You'd spend time troubleshooting what happened as opposed to knowing what had changed," said Cooke. "That was the killer for me."
That's when his department realized it needed a change management policy, a process laying out how each step of a network change--planning, implementation, testing, etc.--should transpire. After studying both the Information Technology Infrastructure Library (ITIL) and Control Objectives for Information and related Technology (COBIT) frameworks and guidelines for IT management, the team wrote its own change management policy. It also sought tools to help with the work.
Putting Change Management to Work
Tod Isaacson, manager of network services, and his crew of three oversee the network, recommend changes, architect the network, perform upgrades, and implement security measures. They're also immersed in designing the network infrastructure for a new state-of-the-art library anticipated to open in fall 2009. Currently, the campus has about 200 Cisco switches to support data communications for 7,000 students, 400 full-time faculty, and 800 staff members.
For the job of helping to manage those switches, the university evaluated Netcordia's NetMRI 2.0, an appliance that automates the jobs of network device configuration and change management. As Isaacson recalled, that was the only product the team tried. They liked it so much, he said, "We went with it."
A vendor rep came in to train the group on how to use the device and to help set it up. Once that was done, the appliance spent a couple of days compiling information about the network. The initial configuration covered IP addresses for the entire network. That was a mistake, since it meant everything on the network--including servers, client machines, and VoIP phones--was being monitored, when all the infrastructure team wanted from NetMRI was data about the switches, routers, and firewalls.
After a reconfiguration specifying subnets just for those devices, administrators could be alerted when a change had been made on that hardware and could quickly find out who had made the change. "We don't want to act like we're policing everybody," said Isaacson. "It helps us in case somebody forgets that they made the change."
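The scoping step described above, limiting monitoring to the subnets that hold infrastructure devices, can be illustrated with a short Python sketch. This is not NetMRI's configuration format; the subnets, device names, and the `in_scope` helper are all hypothetical and exist only to show the idea of filtering discovered addresses against a list of managed subnets.

```python
import ipaddress

# Hypothetical management subnets for switches, routers, and firewalls.
# The article does not give LMU's real addressing plan.
MANAGED_SUBNETS = [
    ipaddress.ip_network("10.0.10.0/24"),
    ipaddress.ip_network("10.0.20.0/24"),
]


def in_scope(address):
    """Return True if the address falls inside a managed subnet."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in MANAGED_SUBNETS)


# Hypothetical results of a discovery sweep across the whole campus.
discovered = {
    "core-switch-1": "10.0.10.2",
    "dorm-switch-7": "10.0.20.17",
    "mail-server": "10.0.50.5",
    "voip-phone-212": "10.0.80.44",
}

# Keep only the devices the infrastructure team actually wants to track.
monitored = sorted(name for name, addr in discovered.items() if in_scope(addr))
print(monitored)  # ['core-switch-1', 'dorm-switch-7']
```

Filtering by subnet rather than by individual address means a newly installed switch is picked up automatically, as long as it is given an address in one of the managed ranges.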
The appliance also helps the university maintain consistency across its hardware by letting administrators define policy templates that similar devices must follow. "We had a lot of problems in the dorms with people plugging routers in backward," explained Cooke. "That would cause a loop in the network and take the rest of us down. [Using NetMRI], we could add commands to the switch that would just turn off the port instead of affecting the whole residence hall. Once we put that command in, we could apply that across the whole university. Every switch put in has to follow these guidelines, which really helps us. We know we're safe."
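A policy template of this kind amounts to a list of configuration lines that every port of a given type must carry. The Python sketch below checks a switch configuration against such a list. It is an illustration only, not how NetMRI works internally: the article does not name the command LMU used, so the Cisco IOS line `spanning-tree bpduguard enable` stands in as an assumed example of a command that shuts down a port when a loop-causing device is attached, and the function names and sample configuration are hypothetical.

```python
# Assumed template: lines every access port must contain. The specific
# command is an illustrative stand-in, not taken from the article.
REQUIRED_ACCESS_PORT_LINES = ["spanning-tree bpduguard enable"]


def parse_interfaces(config):
    """Map each interface name to the indented lines configured under it."""
    interfaces = {}
    current = None
    for raw in config.splitlines():
        line = raw.rstrip()
        if line.startswith("interface "):
            current = line.split(" ", 1)[1]
            interfaces[current] = []
        elif current is not None and line.startswith(" "):
            interfaces[current].append(line.strip())
        else:
            # Any unindented line (such as "!") ends the interface block.
            current = None
    return interfaces


def noncompliant_ports(config, required=REQUIRED_ACCESS_PORT_LINES):
    """Return {interface: [missing lines]} for ports that break the template."""
    report = {}
    for name, lines in parse_interfaces(config).items():
        missing = [req for req in required if req not in lines]
        if missing:
            report[name] = missing
    return report


# Hypothetical excerpt of a residence hall switch configuration.
SAMPLE_CONFIG = """\
interface FastEthernet0/1
 switchport mode access
 spanning-tree bpduguard enable
!
interface FastEthernet0/2
 switchport mode access
!
"""

print(noncompliant_ports(SAMPLE_CONFIG))
```

In this sample, only the second port is reported, because it lacks the required line. Running the same check against every switch is what turns a single fix into a campus-wide standard.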
A Daily Dose of Network Monitoring
Each day, said Isaacson, he comes into the office and checks the day's issues using a NetMRI reporting feature that provides a score for network compliance, based on which devices changed most during the selected period. "We're usually in the high eights or nine, which is pretty good. The lower the score," he said, "the more problems your network has."
If a problem were to surface, such as a fan failure in a switch chassis, said Isaacson, "It would tell us which switch has a bad fan, so we could replace it. We'd have a hard time knowing about it if we didn't use NetMRI. This way, we can be proactive rather than reactive."
The appliance will also report on a change that the administrators weren't aware of because it was outside their normal scope. For instance, if somebody calls for tech support because a printer connected to the network has stopped working, the client services staff might troubleshoot the problem and discover that the printer no longer has an IP address. They'll escalate the problem to Isaacson's team, which can check NetMRI to find out whether somebody has changed the printer's virtual LAN (VLAN). That wayward change can be rolled back or modified to get the printer working again.
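Tracking down a wayward change like this comes down to comparing a device's current configuration with an earlier snapshot. The following Python sketch shows that comparison using the standard library's `difflib` module. It is a simplified illustration rather than NetMRI's own mechanism, and the interface name, VLAN numbers, and `config_changes` helper are hypothetical.

```python
import difflib


def config_changes(before, after):
    """Return only the removed ("- ") and added ("+ ") configuration lines."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical snapshots of the switch port that serves the printer.
PREVIOUS = """\
interface FastEthernet0/12
 description Library printer
 switchport access vlan 30
"""

CURRENT = """\
interface FastEthernet0/12
 description Library printer
 switchport access vlan 40
"""

changes = config_changes(PREVIOUS, CURRENT)
for line in changes:
    print(line)
```

Here the diff shows one line removed and one added, pointing straight at the VLAN change. Rolling back would mean restoring the removed line, which is far quicker than troubleshooting the printer from scratch.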
The university recently tried out version 3.0 of NetMRI, to which it will upgrade as part of its service contract. The new release has a starting price of $10,000 for 50 network devices. Its enhanced reporting includes a timeline dashboard that shows changes over time and visually correlates network health and policy adherence; topology status views that show exactly where in the network a change occurred, overlaying health and policy compliance status so they can be correlated by topology location and dependence; and a network explorer that lets the user do ad hoc analysis of devices on the network. Isaacson said he's especially impressed by the addition of new templates, which will expedite setup. "Everything is easier to navigate," he said. "You don't need as much hand-holding."
Cooke's team recently went through a new budget exercise, in which it examined contracts, devices, and tools used in the department, categorizing each as necessary, nice to have, or disposable. "MRI made the cut," he recalled. "Without it we'd have a hard time doing things the proper way and providing the standards that we want to live up to."