Storage | Case Study
Q&A: Consolidating Data in Yale's Digital Media Center
With a few dozen high-end Mac digital media workstations serving about 1,000 students a year, Yale's Digital Media Center for the Arts has seen its share of data overload, not to mention storage catastrophes for students along the way. As part of an effort to overcome at least some of those challenges, the center consolidated storage in a centralized system, reducing the number of hard drives to one while still delivering all the capacity those digital media students need.
- By Bridget McCrea
A May 2011 report from McKinsey & Co. revealed that the education sector stored 269 petabytes of data in 2009 alone. A sizable chunk of that bulk was generated by Yale University's Digital Media Center for the Arts, where Lee Faulkner, media director, used to cringe when students approached his desk, begging him to bring their crashed flash drives back to life. "In most cases," said Faulkner, "there wasn't much I could do about it."
As an interdisciplinary organization that facilitates both high-end graphic design and video production activities for graduate and undergraduate students, the Digital Media Center for the Arts has been grappling with the challenges of big data management for years. Faulkner spoke with Campus Technology to discuss the issues, the solutions, and the advice he'd give other educational centers dealing with similar challenges:
Bridget McCrea: What were your data management issues?
Lee Faulkner: We facilitate about 1,000 students annually across disciplines like music, filmmaking and photography. Some are taking classes here, and others are just using our facilities, but they all have different needs regarding their final products. When video began to pick up steam about six years ago, we started dealing with more data than ever. Initially, our only option was to utilize internal hard drives for storage, but with so many users getting into video it was apparent that the approach wasn't going to work anymore.
McCrea: What solutions did you try?
Faulkner: At the time, reasonable FireWire flash drive technology was available, so many of our students just went out and bought their own storage devices and were happy with that. They used the drives not just for backups, but also for moving their data between shared machines and workstations. Unfortunately, these drives had a tendency to crash. When we started noticing the crashes becoming more and more frequent, we weren't sure if the drive quality was decreasing, or if the devices were just being moved around and tossed into backpacks. Either way, we wound up with a lot of catastrophes. By the end of the semester, I'd have a pile of drives on my desk and a group of students begging me to retrieve their data.
McCrea: How did this affect the students?
Faulkner: Most of the time, the damage had been done and there wasn't anything I could do. They had to redo their work. We embarked on an educational campaign that encouraged students to back up their drives, but 1 TB of information is a massive thing to try to back up, so it really became cumbersome. Most of our constituency isn't necessarily incredibly well versed in the minutiae of backups and the programs that are available, and would rely on making duplicate copies rather than using backup programs.
McCrea: What other storage options did you look at?
Faulkner: Early on we'd dabbled in shared storage. We were using it for about four video editing stations, and it became apparent to me that the benefits we reaped from that could be spread through the center, and across the student body, if we could get the resources to support the initiative. It would mean moving from a localized solution to one that spread across our entire facility, which includes 40 workstations and big, long runs of cable. I knew that would make a SAN network cost-prohibitive, but during my research I found a central storage solution from EMC/Isilon.
McCrea: What made a central storage solution a viable option?
Faulkner: This one was based around gigabit Ethernet, and we were running a Macintosh computer platform that incorporated gigabit interfacing directly into our machines. Upon further investigation, I realized that this solution didn't require us to run any client-based applications in order to enable the connection. We could use standard connections, which means the learning curve for our constituents would be minimal. We connected the servers and listed the addresses. Students log on, initiate the connection via the Yale network IDs [without the need for a password], and back up their data.
McCrea: How is it working?
Faulkner: Since we installed the system two years ago, we've been able to reduce our total number of hard drives from 36 to one. Students have the same amount of space that they would get with an external hard drive, and that allocation is more than sufficient for schoolwork. In some cases, we do host more advanced projects for students. However, the quota control on the system lets me give everybody the right amount of space.
McCrea: How does the system compare to the old FireWire flash drives?
Faulkner: We don't lose data unless I do something incredibly stupid.
McCrea: Did you run into any challenges when moving to this new system?
Faulkner: When you become responsible for the data that everyone is relying on for grades, there's definitely more to think about. I was a systems administrator when I started in this position, but I wasn't a network professional and I still wouldn't count myself as one. There were some anticipated teething problems along with my lack of knowledge, both of which were handled by getting on the phone with paid tech support--sometimes for four or five hours at a time.
McCrea: What advice would you give another university that's grappling with the same problems?
Faulkner: If you're fighting with capacity and data loss, definitely consider a centralized storage place where everyone can put their data. When you make that switch, it really changes everything. I couldn't imagine going back to doing it the way we were.