Backup & Recovery
Protecting Data at the University of St. Thomas
By Dian Schaffhauser
Data backup had increasingly become a major challenge at the University of St. Thomas. With a growing body of data in its Banner enterprise resource planning system, its Blackboard installation, departmental needs, and personal storage directories set at 500 MB for every one of the 11,000 students and 2,000 faculty and staff members, data stores were gobbling up terabytes of space.
Up until 2007, the campus' installation of Symantec/Veritas NetBackup would perform a nightly incremental backup routine to the university's main campus in St. Paul, MN. Data would be stored to a Quantum Scalar i2000 enterprise tape library (formerly an ADIC product, acquired by Quantum) with four LTO-2 drives. That backup process would start at about 5 p.m., when offices shut down, and run until 8 a.m., when they opened up again. A full 20 TB backup took place on weekends, consuming the entire period.
"We were using [a] raw [storage area network] disk for disk-to-disk staging," said Laura Thomas, central systems administrator in the operations and technical support division of the IT organization. "It had become increasingly slow as it got more fragmented, and it wouldn't clean itself very well."
If something went wrong, she recalled, the windows were so tight, there would be no chance to rerun the backup job, or, worse, it would generate cascading failures for the rest of the night. "Any time you have failures, you worry that you won't have the data when somebody wants it," she said.
Thomas got into the habit of logging into the system from home every night to make sure it had sufficient tape and space cleared on the SAN so the backups would run. "It was a constant juggling act," she said. "We needed to do something to shorten our backup windows and make everything more reliable and more robust, so we began looking for solutions with more throughput and capacity."
Upgrading Backup Systems
The university, which has two sites--the main campus in St. Paul and a smaller campus in Minneapolis--tends to lease most of its hardware on a three-year cycle. It replaces about a third of its hardware every year. This means there's a certain amount of money dedicated to that category of expenditure in the budget, which doesn't vary. "Say you spend a quarter of a million dollars on servers. Three years from now you have a quarter of a million to spend again, unless you can make a case for more money," said Thomas.
This meant she had to make a case to get more money than her fair share of that lease in order to make what she considered to be significant but essential improvements to the backup infrastructure. In 2006--well before the dire budget crunch of 2009--she was still able to sell "the specter of data loss and unhappy end users."
She was allocated about $250,000 to upgrade the system on both campuses. Assuming she'd upgrade the tape system to a "bigger and better" one, Thomas talked to storage vendors, particularly those with whom the university already had a relationship; read a lot about storage options online; followed up with people at vendor installations to find out what they were happy and unhappy about; and hit up peers for recommendations.
Recalled Thomas, "If somebody said, 'My repair guy is really fast--I know him really well,' I didn't take that as a good sign. I don't want to know my repair guy."
At the end of the purchase process, which took about six months, the university chose to buy its new storage equipment from the same company that had supplied its previous generation. It signed a new deal with Quantum in April 2007. But this time, since Thomas liked both the disk and tape solutions Quantum offered, she went with one of each for the St. Paul data center. The main campus moved to a new Scalar i2000, this one with six drives for tape backup, and a DXi5500 disk backup and replication appliance, which Thomas referred to as a "de-duplicating wonder toy." (The company no longer sells this model; the current high-end version is the DXi7500.)
Thomas said the DXi was remarkably easy to add to the data center. The backup software recognized it as a tape backup, and the Windows servers recognized it as having drives. "We put it in, loaded the drivers, and we were backing up to it in under an hour."
During a testing phase, the IT people ran backups on it for a week that were also being duplicated elsewhere. They performed spot restores. Then they ran tests to see what would happen if they filled it up. It generated tape write errors, even though there was no tape. "It took me a minute to troubleshoot what happened, and I actually had to call support. Quantum support was awesome," Thomas recalled.
A De-duping Wonder Toy
The DXi series works like a virtual tape library, even though it's disk-based. As Thomas described it, "The DXi itself is a little 2U unit that sits in the rack. It's got a bunch of disks in it. It broadcasts over the fiber SAN the unfailing belief that it's a great big tape library with lots of tapes in it." As she pointed out, "It believes it's a tape library. My backup software believes it's a tape library and treats it like one. But it's not backing up to tape: It's backing up to disk. And while it's backing up, it's taking out the duplicates."
To perform the de-duping, the firmware reads the stream of data. When it finds a block of data it already has--such as "St. Thomas"--it makes a pointer to the previously recorded block in that stream. Thomas estimated that the campus is getting a five- or six-to-one compression with the new system. "If it would have taken 12 GB on disk, it's taking 2 or 3 GB now," she said. "I used to have a terabyte and a half of SAN space. I'm backing up three times as many servers and keeping the data there longer."
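The pointer-to-a-previous-block idea can be illustrated with a minimal Python sketch. This is not Quantum's firmware, which the article does not describe in detail; the fixed block size, the use of SHA-256 to recognize repeated blocks, and the function names dedupe, restore, and ratio are all assumptions made purely for illustration.

```python
import hashlib

BLOCK_SIZE = 8  # hypothetical block size in bytes, kept tiny for illustration


def dedupe(stream: bytes, block_size: int = BLOCK_SIZE):
    """Split a backup stream into fixed-size blocks and store each unique block once.

    Returns (store, pointers): store maps a block digest to the block's bytes,
    and pointers lists one digest per block in the original stream order.
    """
    store = {}
    pointers = []
    for start in range(0, len(stream), block_size):
        block = stream[start:start + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            # First time this block is seen: actually write it.
            store[digest] = block
        # Seen or not, the stream only records a pointer to the stored block.
        pointers.append(digest)
    return store, pointers


def restore(store, pointers) -> bytes:
    """Rebuild the original stream by following the pointers in order."""
    return b"".join(store[digest] for digest in pointers)


def ratio(stream: bytes, store) -> float:
    """Original size divided by the bytes actually stored (ignores pointer overhead)."""
    stored = sum(len(block) for block in store.values())
    return len(stream) / stored if stored else 0.0
```

Calling dedupe on a stream made of one 8-byte block repeated many times stores that block once and keeps only pointers for the repeats, and restore rebuilds the original bytes exactly. The ratio helper gives a rough figure comparable to the five- or six-to-one estimate Thomas cited, although a real appliance also has to account for the space the pointers themselves consume.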
Although the tape library is still maintained, and some data--grades, transcripts, financial and business information--still goes offsite in a traditional backup manner, for some servers the DXi has become the entire backup environment. Those servers typically generate student working data or stream data from the Web sites. Backing them up entirely to disk makes for a faster recovery process.
Plus, Thomas's nights are her own again. "I haven't had to get up in the middle of the night and check on things," she mused. "I don't have problems where somebody asks for extra backup, and I ask, 'How am I going to fit that in?' Those problems have gone away."