How Institutions Can Prepare Their IT Environments for a Data Science Program

Data scientist is a hot job, with a median six-figure salary and projected 36% growth in positions over the next several years. Recent advances in artificial intelligence (AI) and other forms of data analytics only expand the potential opportunities for budding data scientists.

These trends have captured the attention of colleges and universities. By one count, more than 1,000 institutions now offer undergraduate or graduate degrees in data science.

There are two aspects of data science institutions should focus on, each of which can benefit the other. First is the actual data science education program that schools offer students. Colleges and universities that don't yet offer such a degree — or that offer only older forms of data-focused study, such as statistics — should consider launching a data science program. A robust program can help them attract and educate high-caliber students who can excel in tomorrow's jobs.

The second aspect is the data science projects that institutions conduct themselves as part of their business operations. Increasingly, schools need data science to sift through troves of student, market, financial, and other data to gain insights that can contribute to more effective education and stronger competitive advantage.

Both aspects of data science require some foundational resources and capabilities. Beyond fielding faculty and staff with data science expertise, institutions should invest in two specific areas.

First is to develop an IT department with an adaptive culture. Data science is a relatively new discipline that's advancing rapidly. The types of data analytics that the IT team must support are evolving at a dizzying pace. The tools that researchers and students must be educated on could be different next year than they are today — as evidenced by the sudden interest in generative AI tools like ChatGPT. Those realities mean the IT department must be agile enough to support this rate of change.

Second is a stable IT environment where security is prioritized. Schools of all types have become prime targets of cybersecurity attacks, making strong data security an imperative. Data science by its nature involves large quantities of data, further raising the security stakes. Two strategies can help:

Place compute and data close together. In the past, organizations operated datacenters on site, where computer servers and data storage were housed behind locked doors. More recently, workloads and data have migrated to public clouds. Today, institutions are rethinking where they maintain their most sensitive information, with some returning certain data on premises.

This centralized approach presents challenges, however. More data is being generated at the edge of the network. For data science researchers, this could be in labs and satellite facilities. For data science practitioners, it could be on internet of things (IoT) devices. In either case, transmitting vast data streams to a central location can be costly, and it risks exposing large quantities of sensitive information.

Rather than transfer data to centralized compute resources, place the compute close to the data. This is achievable today using a containerized IT architecture. A container is a lightweight, standalone package that combines an application with its necessary files and settings.

Containers give institutions the ability to run data analytics applications on small devices. Analysis of the data can take place at the edge, and only the output needs to be transmitted, which reduces the amount of data that must be transferred. NASA, for instance, is using containers to conduct scientific analysis on the International Space Station.
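To make the idea of analyzing at the edge and transmitting only the output concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular institution's or vendor's setup: the function name, the sample readings, and the JSON payload shape are all hypothetical, and a real deployment would package similar logic inside a container and add authentication and transport.

```python
import json
from statistics import mean


def summarize_readings(readings):
    """Reduce a batch of raw numeric readings to a small summary.

    Hypothetical helper: the field names are illustrative, not a standard.
    """
    if not readings:
        return {"count": 0}
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
    }


if __name__ == "__main__":
    # Assumed sample data standing in for readings collected on a lab device.
    raw = [20.5, 21.0, 19.5, 22.0]

    # Only this compact summary would leave the device; the raw readings stay local.
    payload = json.dumps(summarize_readings(raw))
    print(payload)
```

The design choice this illustrates is that the size of what is sent depends on the summary, not on how many raw readings were collected, so both transfer cost and the amount of sensitive data in transit shrink.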

Secure the technology supply chain. Following the infamous Sunburst supply chain hack of 2020 — an attack that spread to thousands of organizations through popular IT monitoring software — many institutions now worry about supply chain security. And for good reason.

Colleges and universities use all sorts of technology — some of which is "shadow IT" not authorized by the IT department. It's easy to understand how this happens. Faculty members receive grants to conduct research, they require specialized tools, and they don't want to wait for approval from IT. Instead, they build out their own technology portfolio — without necessarily knowing whether it's safe or how best to secure it.

One solution is for the IT department to become as responsive as possible to faculty and staff needs. Another is to invest in an IT architecture and application stack built around open source software. Open source code is developed in a decentralized and collaborative way, relying on community production and peer review.

Commercial IT solutions based on open source software can be more secure than proprietary products, because they benefit from transparency and diverse input. A vibrant open source community can foster best practices in cybersecurity. It can also quickly identify and remediate security issues. And if the solution provider is an established member of the open source community, it can contribute to a more secure supply chain, tracking the code's provenance and confirming it has been thoroughly tested.
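One small, concrete piece of tracking provenance is checking that a downloaded artifact matches the digest its publisher lists. The sketch below is an assumption about how an IT team might do this with Python's standard library; it is not taken from the article or from any specific vendor's tooling, and the function names are hypothetical. It covers integrity only. Signature verification and testing are separate steps.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_published_digest(data: bytes, expected_hex: str) -> bool:
    """Compare the computed digest with a publisher-supplied hex digest.

    Hypothetical helper: expected_hex is assumed to come from a trusted
    source, such as the project's signed release notes.
    """
    expected = expected_hex.strip().lower()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(sha256_hex(data), expected)


if __name__ == "__main__":
    # Assumed stand-in for the bytes of a downloaded package.
    artifact = b"example artifact contents"
    published = sha256_hex(artifact)  # in practice, copied from the publisher

    print(matches_published_digest(artifact, published))          # intact copy
    print(matches_published_digest(artifact + b"x", published))   # altered copy
```

A check like this is only as trustworthy as the channel the expected digest came from, which is one reason an established, accountable provider matters.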

Boston University deployed a commercial solution built on an open source machine learning (ML) platform for its computer science program. The solution enables researchers to rapidly train and manage ML models either on premises or in the public cloud. It also provides an environment for an open source textbook, interactive lectures, and demonstrations. Students use a web browser to access a personalized virtual space for completing assignments and exploring ML models.

Investing in such IT strategies and capabilities can empower institutions to offer a robust data science program and establish their own data science practice. What's more, these two aspects of data science — program and practice — can enhance each other.

An effective data science program can help schools attract faculty and students alike. Likewise, the data science research — and freshly minted data scientists — that emerge from such programs can advance their data science practice, helping them gain new insights to educate students more effectively and better compete in the marketplace.
