Data Science: Re-Imagining Our Institutions at the Systems Level

A Q&A with George Siemens

We know that higher education institutions have been exploring data science for decades. Many began by leveraging institutional data to serve administrative computing needs and efficiencies, later taking on an additional learning science focus, at least to some, often limited, degree.

What can institutions do now to use data science better and perhaps reinvent themselves in the process? Are they taking advantage of all the access they have to so many disciplines and researchers, to help move data science ahead in the real world? Here, George Siemens, who is a professor of practice at the University of Texas at Arlington and co-leads the Centre for Change and Complexity in Learning at the University of South Australia, talks with CT about data science in higher education.

"How are systems impacted through the use of data to navigate and guide innovation?" —George Siemens

Mary Grush: In general, what types of applications do we see today in data science, in higher education?

George Siemens: If an institution is using data science and analytics to improve business processes and institutional practices, that is certainly one type of data science application. Though this type of data science doesn't affect the learning process, it does help the institution become more efficient and understand what it knows and how it knows it.

For me, however, when I'm referring to data science in education, I'm looking at a triad that includes the institution, the faculty member, and the student. This is in the domain that's typically known as learning analytics, used to understand and improve the learning experience of students.

There are other related tools and fields that should be mentioned in this discussion as well. Educational data mining, AI in education, and the learning sciences are all noted for their interplay in the academic domains that underpin data science in education.

Grush: As we look at how our institutions are using data science, are we now moving ahead, from discrete functional solutions for an institution, to wider goals that seek to benefit higher education more broadly?

Siemens: There are things within the education landscape that need to be better understood. The role of data science in addressing those needs is an important one, to help us understand learning, the psychological basis of learning, the processes that learners take, the pathways that students take through the various courses the university offers as well as how they are supported when they are at risk academically, and so forth. So there is a very real research question that exists around using data to understand learning.

And of course, from an institutional point of view, we see a range of opportunities around using data and data science techniques to help universities support their students, for example to identify who's at risk of dropping out, or to discover what kind of support structures are needed for low-income students, including support needs the institution was not previously aware of.

So, data science is advancing both the quality of research about learning and the student experience at the institution. These are among the wider goals benefiting higher education more broadly.

Grush: Could you give an example of one of these broader goals that we may approach with data science?

Siemens: As educators and practitioners, there is a big issue we need to address, and that is the student experience of coming from an under-represented population. There are ways in which a university sometimes marginalizes under-represented students, unintentionally of course. Universities attempt to understand and address these problems with data science tools.

Grush: How are vendors currently involved in data science in higher education environments?

Siemens: The data science field is growing rapidly within higher education (as well as in other sectors). Companies such as SAS, Microsoft, Google, and AWS all have a presence on most American campuses. They also have a fairly substantial imprint in providing education around data science skills. A lot of the reskilling that has to happen at a fairly broad level in nearly all sectors of society is being done through them and through additional education vendors like Coursera.

Grush: What is convergence in the context of data science? I know that data science is one of the areas NSF has chosen for its convergence accelerator grants.

Siemens: There's something unique about using data as a mechanism for understanding and improving learning processes and institutional practices. In order to understand what that looks like, and how this has impacted real-world practices, there are really effective methods NSF uses to bring together researchers and practitioners from related or disparate fields, to allow them to make an impact in pragmatic and dynamic ways. For the data science category [of NSF's convergence accelerators], they bring in people from a lot of fields with differing expertise and different areas of interest, all with one common thread: data science.

Data is changing our institutional practices — just as it has completely changed marketing, or journalism, for example. Data provides a different way for organizations to do things, and because of that it has a broad and substantial impact.

Grush: Is the idea of convergence different from the connectivism that you've been studying for years?

Siemens: There is a bit of overlap between the two concepts. The original idea I had with connectivism was, I wanted to understand how people connect when they are involved with learning activities. But with convergence, it's a more systemic effect. How does an entire field produce knowledge and advance its work and its impact? That's probably the main distinction.

I always emphasize, the idea of knowledge development and growth is a function of a connected mindset, so from that stance, convergence is one of the attributes of what I would classify as connected knowledge or connectivism in general. My original orientation with connectivism was to argue that we are constantly generating connections as individual learners, and over time as we broaden and reach out in other sectors, that's where the depth of our understanding starts to advance.

So, you could say that a convergence initiative with NSF is the concept of connectivism done at the systems level.

Grush: Could you point to a couple of the most interesting trends or highlights in work related to data science, maybe from your own research, or work you're involved in at UTA? What's notable, currently?

Siemens: The first thing I'd mention is that UTA is the first institution globally to offer a fully online Master of Science in Learning Analytics. The basis for that program [M.S. in Learning Analytics] was simply to build the talent and capacity pool of people in K-12, higher education, and corporate settings who are using data to make sense of the complex landscapes that we all interact with and engage in on a regular basis.

And my own interests are more and more related to the next stage, which is what happens with AI in education settings. This is similar to what's happening with really all sectors of society: If it's digital, you have data. If you have data, you have analytics. And if you have analytics and data you are not too far away from AI. But there are questions now around the impacts of automation and AI. Are there ways AI can be more impactful and can we be more intentional in how we ensure that all learners are addressed, and that their needs are met effectively?

That last point is my current focus with a fairly substantial lens that's becoming an even bigger lens, which is the system impact: How are systems impacted through the use of data to guide and navigate innovation?

We know that schools, universities — all organizations that have a learning focus — are rapidly evolving, and we know higher education now faces competition from a range of corporate providers, Coursera being just one example. So as we look at this landscape, we have to be acutely aware of how AI changes our existing universities at a systemic level. It's a really active question, and the use of data science methods and techniques to track, monitor, and evaluate trends and innovation is quite a consequential contribution.

Grush: What would be the thing you'd most like to see, with regard to the advancement of data science in the education sector?

Siemens: It's really twofold, covering both ends of a spectrum:

First, I think there's an enormous need for leadership development. We need to support and develop the data capabilities of senior leadership at most universities. Many first came to their leadership posts pre-data science explosion, if you will… but soon after, data flooded across their desks, with regular reports providing various types of feedback from their institutional data teams. The data capability question here is, how will our institutions' senior leadership make decisions in an impactful, intentional way? How will they lead in a way that treats data and data methods as a lens with which to evaluate organizational practices and processes?

And second, there's the other end of the spectrum, which is the student. I think there is a critical data literacy need for students. Are they aware of the ethical implications of what is captured about them any time they engage in any type of learning experience on campus? I'd like to see much more advancement in terms of questioning what's being done with the data that's being captured and the potential impacts. We can be better and more impactful in that process as a whole, and re-imagine practices that promote understanding our students.
