Why Universities Need to Align Data Storage with Data Value

Universities are voracious data generators, with one well-known institution of around 40,000 students currently producing in excess of 15TB per day from research activities alone. This kind of volume places storage requirements firmly in the petabyte range, comparable to those of large enterprises, with infrastructure needs set to grow further as data-intensive AI tools are more widely adopted.
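To see why a rate like this lands in the petabyte range, a rough back-of-the-envelope calculation helps. The sketch below takes the 15TB-per-day figure quoted above and assumes, purely for illustration, a constant daily rate and decimal units (1 PB = 1,000 TB); real growth will be uneven and does not account for copies, snapshots or backups.

```python
# Back-of-the-envelope estimate only. The 15 TB/day figure comes from the article;
# the constant daily rate and decimal units (1 PB = 1,000 TB) are assumptions.
TB_PER_PB = 1_000


def yearly_growth_pb(tb_per_day: float, days: int = 365) -> float:
    """Return the volume generated over `days` at a constant daily rate, in PB."""
    return tb_per_day * days / TB_PER_PB


if __name__ == "__main__":
    print(f"Roughly {yearly_growth_pb(15):.1f} PB of new research data per year")
```

Under those assumptions, research activity alone adds several petabytes a year before any retained historical data is counted.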

In many environments, unchecked data growth is now outpacing the ability of IT teams to manage it effectively. The knock-on effects are potentially serious, touching everything from technology performance and research timeliness to budgets, which generally remain under significant pressure.

Central to the challenge is that institutions tend to address data growth in a one-dimensional way: when storage fills up, they add more. Compounding the problem, a significant proportion of university data estates consists of inactive or low-access information that remains on primary storage simply because it has never been assessed or classified. Universities are also understandably risk-averse, to the point that data is retained indefinitely because institutions lack the confidence to archive or delete it.

While this approach provides a certain level of reassurance, in practical terms, it also means high- and low-value data are treated in the same way. This not only increases overall costs but also limits the effectiveness of technology investments in the long term.

Viewing the data growth problem and solution primarily through a storage-capacity lens also misses a critical point: Any lack of visibility into what data exists, where it resides and how it is used creates a fundamental disconnect between expenditure and the value that data actually delivers.

A Shift in Approach

The first step is taking back control of data so it can be managed and budgeted for in line with its value. The second is managing access requirements. Both require a shift in approach: institutions need to move away from a reactive habit of expanding storage and towards a more deliberate data management model based on understanding and control.

The starting point is visibility, because without a unified view of the data estate, it is difficult, if not impossible, to distinguish between data that supports active research, for example, and that which is no longer accessed but continues to consume high-performance, costly storage resources.

This approach depends on the ability to analyze large volumes of unstructured data at university scale, which typically means billions of files across multiple systems and locations. This is a data management software challenge, and modern systems can analyze file estates of that size to provide the visibility needed for informed decision-making.
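As a small-scale illustration of what this kind of visibility means in practice, the sketch below walks a directory tree and totals file counts and bytes by time since last access. It is not how any particular data management product works: the bucket names and thresholds are hypothetical, a single-threaded walk would not cope with billions of files, and last-access timestamps can be unreliable on file systems mounted with options such as `noatime` or `relatime`.

```python
import os
import time
from collections import defaultdict
from typing import Dict, Optional

# Hypothetical age buckets (upper bound in days since last access, label).
BUCKETS = [(90, "active"), (365, "cooling"), (float("inf"), "cold")]


def bucket_for(age_days: float) -> str:
    """Return the label of the first bucket whose upper bound exceeds the age."""
    for limit, label in BUCKETS:
        if age_days < limit:
            return label
    return BUCKETS[-1][1]


def summarize_by_access_age(
    root: str, now: Optional[float] = None
) -> Dict[str, Dict[str, int]]:
    """Total file count and bytes under `root`, grouped by last-access age."""
    now = time.time() if now is None else now
    summary: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"files": 0, "bytes": 0}
    )
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            age_days = (now - st.st_atime) / 86_400
            label = bucket_for(age_days)
            summary[label]["files"] += 1
            summary[label]["bytes"] += st.st_size
    return dict(summary)
```

Calling `summarize_by_access_age("/some/share")` on a hypothetical path returns a dictionary keyed by bucket label, which is enough to show how much capacity is held by data nobody has touched in a year.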

At this scale, data management simply cannot rely on manual processes and instead depends on automated intelligence to bridge the gap between requirements and resources. This provides the foundation for making consistent, data-driven decisions about how different datasets should be handled, ensuring that storage infrastructure is properly aligned with the actual value and access requirements of each dataset and the associated compliance processes.

Regardless of where data resides, institutions also need to ensure that access permissions are consistently defined and maintained across environments. Without this level of control in place, sensitive or regulated data can remain exposed even if it has been moved to a more appropriate storage tier, potentially undermining both governance and compliance.
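One simple way to picture the permissions check described here is to compare who can reach a dataset in each environment. The sketch below assumes access lists have already been exported into plain mappings of dataset name to a set of principals; the function name, data shapes and example values are hypothetical, and real access control models (ACL inheritance, groups, cloud IAM roles) are considerably more involved.

```python
from typing import Dict, Set


def permission_drift(
    primary: Dict[str, Set[str]], archive: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Return, per dataset, principals with access to the archive copy
    that do not have access to the primary copy."""
    drift: Dict[str, Set[str]] = {}
    for dataset, archive_principals in archive.items():
        extra = archive_principals - primary.get(dataset, set())
        if extra:
            drift[dataset] = extra
    return drift


if __name__ == "__main__":
    # Illustrative values only.
    primary_acl = {"clinical_trial_2019": {"pi_group", "data_steward"}}
    archive_acl = {"clinical_trial_2019": {"pi_group", "data_steward", "all_staff"}}
    print(permission_drift(primary_acl, archive_acl))
```

Any non-empty result flags a dataset whose archived copy is more widely exposed than the original, which is exactly the situation that undermines governance after a tiering move.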

Armed with definitive insight, institutions can then begin making informed decisions about which datasets should remain on high-performance infrastructure and which can be moved to more cost-effective archival environments or deleted altogether. This offers a solid foundation for adopting policy-driven lifecycle management, in which data is actively governed throughout its lifespan and, when certain stages are reached, can be moved to a more appropriate setting or deleted permanently.
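Policy-driven lifecycle management can be thought of as a set of explicit rules applied uniformly to every dataset. The following is a minimal sketch under stated assumptions: the one-year and seven-year thresholds, the `under_retention_hold` flag and the action names are all hypothetical placeholders, and an actual policy would be set by the institution's own research, legal and compliance requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    days_since_access: int
    under_retention_hold: bool = False  # e.g. a funder or regulatory requirement


# Hypothetical thresholds for illustration only.
ARCHIVE_AFTER_DAYS = 365
DELETE_REVIEW_AFTER_DAYS = 365 * 7


def lifecycle_action(ds: Dataset) -> str:
    """Map a dataset to a lifecycle action based on access age and holds."""
    if ds.days_since_access < ARCHIVE_AFTER_DAYS:
        return "keep_on_primary"
    # Data under a retention hold is never proposed for deletion.
    if ds.under_retention_hold or ds.days_since_access < DELETE_REVIEW_AFTER_DAYS:
        return "archive"
    return "review_for_deletion"


if __name__ == "__main__":
    for ds in (
        Dataset("genomics_run_42", 30),
        Dataset("survey_2021", 400),
        Dataset("legacy_sim_output", 3000),
        Dataset("clinical_records", 3000, under_retention_hold=True),
    ):
        print(ds.name, "->", lifecycle_action(ds))
```

The sketch deliberately returns "review_for_deletion" rather than deleting anything, reflecting the point that risk-averse institutions need confidence in a decision before acting on it.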

The shorter-term impact is typically a reduction in pressure on primary storage systems and a more controlled approach to capacity planning. More importantly, it allows budgets to align with actual data needs, so investment is directed towards supporting core institutional priorities rather than towards storage that continues to absorb funds that could be better used elsewhere.

And let's be clear, this isn't just about reducing storage costs, important as that is. It's also about improving how institutions operate at scale and preparing them for a future in which data volumes will grow even further. Breaking the cycle of periodic storage expansion and replacing it with a more predictable, sustainable model is fundamental to sound long-term IT investment. Those institutions that get the balance right can enjoy a win-win of improved cost control and more effective support for research and innovation.

About the Author

Steve Leeper is VP of product marketing at Datadobi.
