Campus Technology Insider Podcast July 2021

Listen: The Science of Studying Student Learning at Scale

Rhea Kelly: Hello and welcome to the Campus Technology Insider podcast! I'm Rhea Kelly, executive editor for Campus Technology, and your host.

Imagine you're a professor teaching a course, and you want to test what teaching practices work best for your students. For example, is it better to give immediate feedback on assignments, or delay the feedback for a few days and give students an extra opportunity to process the concepts in their work? Even after testing each method and determining a result, it's impossible to know if your findings are unique to your own particular course or if they could apply to all students in general. To solve that problem, a team from Indiana University set out to expand the scope of pedagogical research by creating ManyClasses, a model for studying how students learn not just in a single classroom, but in a variety of different classes across multiple universities. For this episode of the podcast, I spoke with researchers Emily Fyfe and Ben Motz about how ManyClasses works, the challenges of using a learning management system to conduct research, what they learned from the first ManyClasses experiment, and more. Here's our chat.  

Hi, Emily and Ben, welcome to the podcast. So I think I'd like to start by having you each introduce yourself and just talk a little bit about the work you do. Emily, would you want to start?

Emily Fyfe: Sure. So my name is Emily Fyfe. I'm currently an assistant professor at Indiana University in the Psychological and Brain Sciences department. I conduct research on the science of learning. I have an emphasis on STEM learning, so how children think about problems in mathematics and how the errors they make sort of give insights into their cognition. But I also think critically about this across development. So not just with children, but with adolescents and adults. And the goal is to think about how people learn and how we can use that information from science to inform educational practice.

Kelly: Great. And Ben?

Ben Motz: Yeah, I'm kind of similar to Emily. So my name is Ben Motz, and I'm a research scientist in Emily's department, in the Department of Psychological and Brain Sciences. And I also direct the eLearning Research and Practice Lab, which is a research unit that's kind of within our IT division, where I build bridges between faculty research and the student data infrastructure and student data warehouses that a big university like Indiana University maintains.

Kelly: And I know you've recently published a paper about a new model for studying how particular teaching practices can improve student learning, called ManyClasses. And there's so much to dive into, but maybe first, you could give me kind of a brief overview of what ManyClasses is.

Motz: So as you might guess from the name ManyClasses, it's a study that takes place across many different classes. And that's relevant because when researchers go out to study how people learn, or even teachers go out to seek how different instructional tactics or strategies affect student performance in their class, they're usually doing it in very isolated situations. So usually just in one class. That creates problems: For example, it's unclear whether the results are due to the intervention, whether they're due to whatever the experimental manipulation might be, or if it's just because of the random idiosyncrasies of that class. So ManyClasses is really an attempt to try and expand the scope of research so that what we're doing in asking a question of how people learn, is expanding beyond the boundaries of any single classroom, really aiming at developing inferences that could generalize beyond that narrow scope, but also that might be able to identify where a practice might have benefits. If it's not so hetero, I'm sorry, if it's not so homogenous across the student population, maybe it's the case that some things work, not everywhere, but only in specific settings. So ManyClasses is also pretty well equipped to be able to answer those types of questions as well.

Fyfe: Yeah, I'll just add that, I mean, I think when we are thinking of ManyClasses as a research team, we're thinking of it as a new gold standard for how to conduct scientific research in classrooms. So there's always this balance between sort of the rigor of a randomized experiment that you want as a scientist, where you can randomly assign students to different conditions and say, "Wow, that condition resulted in better learning than the other." But at the same time, we want to maintain some authenticity to the educational practices and courses that students engage in and that teachers use. And so ManyClasses is a model for sort of combining the rigor of these randomized experiments within these authentic settings. And the goal is to sort of, as Ben said, to do this across many classes, so that we're not just running one experiment, but we're replicating it across all of these different authentic educational settings. And so really, at the heart of it, ManyClasses is a new model for conducting research in educational settings.

Kelly: Yeah, you know, when I first read about ManyClasses, and you know, just thinking about studying how students learn at this kind of scale, one of the first things that came to mind for me was about 10 years ago, when Harvard and MIT were doing studies with the edX platform, you know, you've got massive open online courses and gathering data from huge numbers of students, and sort of gleaning insights from that. So obviously that's quite different from what you're doing. But I just wondered what your thoughts were on that, like, what are the differences? And do you see any similarities?

Motz: So that's an interesting parallel. And I do think that there are some similarities. I work with people who do educational research in MOOCs and I'm fond of their research, especially in that it seems to be at a particularly large scale, and has a diverse audience of students, making the sample relatively robust. But I also think that what ManyClasses is doing is — I'll emphasize something that Emily said — it's trying to make sure that the materials that we're manipulating are authentic to what teachers would generally do in a way that's diverse. So there are many different kinds of teachers out there. And even though a MOOC might have, you know, thousands of students enrolled, it really is an implementation of one instructional design. So the inferences that you gain from an experiment conducted in a MOOC might still be questionable whether it would apply to a different MOOC, or whether it would apply to a regular classroom setting. So what we've tried to do with ManyClasses is have the experimental manipulation be something that itself is unique to that particular classroom setting. So yeah, we're not saying, okay, teachers, everybody give the same quiz, and then we'll manipulate how students get feedback on that quiz. We're saying, teachers, take your own quizzes, take the materials that are authentic to what you'd normally do, and then let's manipulate how you deliver the feedback in those quizzes. So in a sense, I think that one of the things that ManyClasses brings to the table that might be lacking in some forms of MOOC research — I don't want to cast doubt on all of them — but one of the things that's unique about ManyClasses is the diversity of the different types of experimental manipulations, and the ways that it's authentic to what routine instruction might be using information technology.

Kelly: Yeah, so you know, it sounds like Many Instructional Designs, which is kind of more of a mouthful than ManyClasses, right?

Motz: Yep. No, it's true.

Kelly: Emily, did you want to add anything?

Fyfe: I'll just quickly add that I parallel Ben's sentiment that research on MOOCs can be super, super valuable. And I agree that one of the similarities is that we're trying to go for scale here, right? We don't want to be so limited to these narrow, small contexts, but think about the large breadth of what's out there. But another difference is that some of the research on MOOCs is just looking at the large amount of rich data that they're getting. And that's great. But ManyClasses is trying to go beyond just getting more and more data. We're trying to make it an actual randomized experiment, right? So instead of just looking at correlations and saying, this experience in this class is related to this outcome or to this learner characteristic, we're going to try to actually manipulate something about their experience. And some research on MOOCs does that, but I think the goal of ManyClasses is to highlight that methodology.

Kelly: Yeah, it sounds, it's much more of a scientific experiment, it sounds like.

Fyfe: Yes, very much so.

Kelly: So I'd love to walk through the process of how you conducted that first ManyClasses experiment. And so how did you decide what teaching practice to evaluate? And how did you recruit participants? And how did you sort of get it going?

Motz: It's funny that you should mention those three different things, because each of them had their own unique process associated with it. And this was such a long timeline between like, imagining what the experiment could be like, and then actually implementing it. So I'm gonna, I'm gonna give you the history pre-Emily. And then Emily jumps in and really kind of elevates the team and brings us to action. So for a long time, for more than a decade, both Emily and I separately have been conducting experiments in educational settings. So what we do is we implement a manipulation to some assignment or something and measure how that affects student learning. Yeah, in, in many different ways. And one of the things that I discovered in my research was that when we would conduct a study and find some result, there was always this paradox, where sometimes we would observe results that confirmed what we thought we knew from theory. And in that situation, everything was gravy, it was easy to publish, and people believed what we would say. But in some settings, it seemed to be the case that we would observe something that was different from what theory predicted. And then we're back to that issue that I mentioned earlier, about maybe the results are just idiosyncratic to the sample. So in the process of doing these studies, at small scales, it slowly just became clear that if we were going to conclusively answer any scientific question about learning and instruction, we would need to kind of zoom out and conduct things at a grander scale across multiple classrooms, so that we could escape from that challenge of maybe the results are just idiosyncratic. So this is something that we started talking about and framing and discussing with IT administrators, and the backbone of our capacity to do the study was really enabled by the Unizin Consortium. So forgive me, I'm going to take a moment to go on this tangent of what Unizin is.
So Unizin is a consortium of like-minded institutions, including Indiana University and others, some of the largest university systems in the country, that have all kind of banded together and share the same educational technology ecosystem. And when you share the same educational technology ecosystem, you share the same data structures that come out of that educational technology. So Unizin has provided a data platform that combines data that comes out of our LMS, that comes out of our SIS, and that comes out of a bunch of different what you might imagine to be peripheral learning tools. It combines this all into one unified format, which made it possible, at least in theory, for us to collect data of an experiment that's manipulated in our classrooms, I should say, actually, in the learning technology in our classrooms. It makes it possible to collect data from multiple institutions in a way that has a standard format. So we started exploring this possibility with IT administrators and with campus administrators. And through a very lengthy series of meetings and brainstorming sessions, we ultimately arrived at some framework for how to proceed. The real idea of where this particular experiment came from was thanks to Emily joining the faculty at the Department of Psychological and Brain Sciences. And we're so lucky that she was interested to play this wild game with us where we try and do research at scale. Maybe I should let Emily take over in discussing the specific research question.

Fyfe: Sure, yeah. So yeah, so I joined the team, and they have this brilliant idea for ManyClasses. And, and it sounded wonderful and exciting and novel, and just the right thing that we needed to do to move and progress the field of educational research forward. But I couldn't wrap my head around the method in total, without thinking about what the research question was. And so we decided to do the first ManyClasses study on the effects of feedback, and specifically feedback timing. So you know, you imagine you do an assignment in a course, and when do you get the feedback on that assignment? Do you get it immediately when you hit submit? Or do you get it a couple of days later? And there were lots of reasons that as a team, we sort of, you know, landed on the timing of feedback as our central idea. Part of it was because it aligns with some of the research I've done in the past on feedback and problem solving. But more importantly, there were these practical and theoretical reasons for studying feedback. So practically, feedback is a feature of every course that we have ever seen. So teachers in some way, give feedback to their students on their assignments, or on their exams or on something in their class. So it's prevalent, teachers use it. And there are also educational recommendations about how to give feedback, namely, try to give it as soon after the learning assignment as possible, give it to them immediately. So from a practical standpoint, it seemed like there would be good buy-in from teachers, because they're already doing something like this in their classes. But theoretically, there was also this interesting, you know, sort of building up in the literature suggesting that there might be some benefits to immediate feedback. But there might also be some benefits to delayed feedback. You know, for example, with delayed feedback, you get to study the content on the assignment.
And then if you get the feedback a week later, you get to study it again, or process it again. And so we've got this sort of spaced study and processing of information. So theoretically, the contrast between immediate feedback and delayed feedback was sort of perfect for studying at this scale, to see under what situations do we find benefits of immediate versus delayed feedback. And from a practical standpoint, it just seemed like the right way to go since it's already being used in a lot of the classes that we've encountered.

Kelly: That makes sense. So I'm assuming, or, I think I remember reading, that you ran this experiment basically using the Canvas learning management system. And that really sounds like no small feat, just in terms of setting up your experimental conditions. Because I imagine you've got individual faculty members who are managing their own courses in the LMS. And, you know, somehow you have to make them consistent. So I'm just curious, how did that work?

Motz: It was not easy. So the process of implementing the experiment in Canvas was in many ways manual and kind of, kind of absurd. Emily and I, with a great deal of organizational expertise and help from a graduate research assistant named Janelle Sherman, managed to, just barely managed to get through it. There are a lot of challenges that I'm sure that you're anticipating when you ask that question. For example, how do you differentiate a student's experience of feedback within a particular class? So it would obviously be easy if we had two different classes, like one class does it this way, the other class does it that way. But we did individual-level randomization. So some students got some things at one time, and the same students got something else at a different time in the class. And those transitions were things that had to be manually encoded in Canvas course sites. We used the Canvas Sections tool to differentiate how assignments worked within each class site. But that on its own was kind of manual. So you had to manually move students into different sections that were created as ad hoc sections within a Canvas site. We wrote a gizmo that made that happen automatically at Indiana University, but for other campuses that were participants in the, in the ManyClasses study, that happened to be just totally manual, again, just clicking and dragging the students one by one. There's also another element to conducting the research study in Canvas that's worth kind of highlighting. And that's that getting data out of Canvas is not as easy as just kind of like knocking on the front door of a university and saying, Hey, could I have all your student data? So part of the process of implementing the study was also getting permission from participating institutions for the delivery of data from their, from their Canvas sites, and every institution had a different model for how that would work.
So at some institutions, the fact that we had basically an IRB compliance protocol, so we had a research protocol written up and approved by an IRB office, was sufficient. But at another institution — all these will remain nameless — but at another institution, we actually had to become certified through a security audit as an educational technology vendor. That turned out to be the easiest way to get access to data. And yeah, it's, it's, it's perhaps going to be something that the educational technology community works on over the next 10 years. And that's how we standardize how the massive quantities of educational data, how we standardize the delivery of those data for research purposes, if that's something that we actually want to prioritize.
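
The individual-level randomization Motz describes, where each student receives both feedback timings at different points in the course, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the team's actual tool: the section names `immediate_first` and `delayed_first`, the two-phase schedule, and both function names are hypothetical, and the sketch does not call any Canvas API.

```python
import random


def assign_sections(student_ids, seed=2021):
    """Randomly split a roster into two balanced, hypothetical sections."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        sid: ("immediate_first" if i < half else "delayed_first")
        for i, sid in enumerate(shuffled)
    }


def feedback_timing(section, phase):
    """Return the feedback timing a section receives in phase 1 or phase 2."""
    if phase not in (1, 2):
        raise ValueError("phase must be 1 or 2")
    if section == "immediate_first":
        order = ("immediate", "delayed")
    else:
        order = ("delayed", "immediate")
    return order[phase - 1]


# Hypothetical roster; real identifiers would come from the course site.
roster = [f"student_{n}" for n in range(1, 7)]
sections = assign_sections(roster)
for sid in roster:
    timings = [feedback_timing(sections[sid], phase) for phase in (1, 2)]
    print(sid, sections[sid], timings)
```

Because every student experiences both conditions, each class serves as its own small replication, which is what allows results to be compared across many classes.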

Kelly: So it wasn't enough just for the participants to be Unizin members, and, you know, using the same data platform, that there's, there's still more layers of permissions to go through.

Motz: Yeah, and another, here's a fun one as well. So we decided, because we actually thought it was the right thing to do, and also because it appeased certain administrative stakeholders, to collect informed consent from students. The process of collecting informed consent in a Canvas course site, it turns out, isn't super straightforward. So, you know, obviously, there's ways of creating questions in a Canvas course site — you could make a quiz. But one of the principles of informed consent is that when the student responds to informed consent, they should feel no coercion. And we thought that there was at least the possibility that because a teacher might see their responses in a Canvas quiz, that they would feel compelled to consent to participate when they might not otherwise want to. So to protect the voluntary nature of this particular study, we had to create an encryption of students' responses in a Canvas question. So we had this complicated, like set of numeric codes and some numeric codes indicated consent. Other codes indicated that they declined consent. And yeah, it was a fun adventure, that was again another example of how we kind of had to invent the wheel for this first ManyClasses study.
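
The numeric-code idea can be illustrated with a short sketch. The interview does not say how the codes were generated or distributed, so everything here is an assumption: the specific numbers, the two lookup sets, and the function name `decode_consent` are all hypothetical. The point is only that an instructor who sees a bare number in a quiz response cannot tell what it means, while the research team holds the key.

```python
# Hypothetical lookup tables held only by the research team.
# The numbers are arbitrary and carry no visible pattern.
CONSENT_CODES = frozenset({3174, 5820, 9461})
DECLINE_CODES = frozenset({2693, 7058, 8315})


def decode_consent(code):
    """Map a numeric quiz response to True (consent), False (decline), or None (unknown)."""
    if code in CONSENT_CODES:
        return True
    if code in DECLINE_CODES:
        return False
    return None  # unrecognized code: treat as no consent recorded


responses = {"student_1": 5820, "student_2": 7058, "student_3": 1111}
decoded = {sid: decode_consent(code) for sid, code in responses.items()}
print(decoded)
```

A conservative design choice would be to treat any unrecognized code the same as a declined response when building the analysis sample.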

Kelly: It really sounds like you were basically hacking the LMS to, really to force it to do more than it's designed to do. Is that what it felt like?

Motz: I don't know if I'd call myself a hacker, but definitely you're picking up on something. The LMS is not a research tool. As much as the LMS does, you know, regardless of LMS, as much as the LMS does provide features for data export, I don't think that those features have been designed by the actual research audiences that might use those to better understand student learning and instruction. Nor are LMS features currently set up for experimentation. Certainly, that's, that's kind of, that's kind of a blind spot in the LMS feature set. So yeah, I do think that we had to, we had to bend things in ways that initial designers of LMSes wouldn't have thought.

Kelly: Interesting. Emily, did you want to weigh in on challenges, either technical or other kinds of challenges?

Fyfe: I actually just wanted to go back one second, and say that, so one of your original questions here or prompts here was about getting, you know, the institutions on board and how did this work when you're working in, you know, 38 different LMS Canvas course sites. And Ben did a great job highlighting the technical challenges and the manual labor that we had to do to sort of make it work for this study. But I also just wanted to emphasize the communication with the actual teachers, and how pivotal that was to our success. So in addition to, you know, having our wonderful graduate student, Janelle Sherman, working in these, in these LMS sites — to you know, sort of, you know, group students into sections and say, you're going to get immediate feedback here, and you're going to get delayed feedback here, and oh, I'm going to release the feedback here — we did, we had one on one meetings with each of these participating instructors to say, you know, how is this going to work in your class. And they would walk us through, you know, the structure of their class and say, well, here's where I give, when I give quizzes and, and here's usually the selections in my LMS that I pick for giving the feedback and, and working with them directly to say, Okay, great, this quiz that you give in your class, we can manipulate the feedback in this way to make it work. So I just also wanted to highlight this communication in this true sort of researcher-teacher collaboration that really made it possible, which resulted in its own challenges, but is sort of what made this possible from the get-go.

Kelly: Yeah, I can totally imagine, you know, communication among 38 different courses and instructors, just a lot of, a lot of challenges there.

Fyfe: And you might be surprised to know, like, I, as somebody who is both a researcher and an instructor of college students, so I have, you know, when I run my courses, I design my Canvas course site the way I think is really intuitive for students to use. But we got so much insight into how variable teachers set up their Canvas course sites for their students. And so working through that, having us navigate their Canvas course sites, understanding how students understood the course as it's presented in the Canvas course site, was remarkably enlightening.

Kelly: Wow. So let's talk about results. So what did you learn from the experiment itself? And I'm also interested in what you learned just by putting the ManyClasses model through its paces.

Motz: I think actually, that thing you mentioned is our number one insight. So just by doing this research study, I think that one of the biggest takeaways is just that such a study is possible. So while there are a great deal of things that researchers might view as barriers (whether it is, you know, getting campuses to agree to share their data, or implementing informed consent in an authentic classroom setting), yeah, this kind of stands as proof positive, that these are surmountable barriers and ones that shouldn't prevent people from doing multi-institutional and multi-class research in a way that's sensitive to the nuances of each individual class. So that's, that's a big takeaway. Yeah, and for what it's worth, we hope that this is not the last ManyClasses research study. So our own insights into how this might work better is also a really valuable takeaway. Another thing that I think is really important to mention about the results is that, remember the experiment was to try and test whether the timing of feedback, whether feedback is deployed to students immediately when they hit submit on their online quizzes, or whether it's delayed by a few days. So that's the experimental contrast. And what we found at the global scale, so across all these 38 classes, was the absence of any difference in student performance, depending on whether they got immediate feedback or delayed feedback. That's kind of an interesting finding, because it resolves a conflict that's out there in the literature, and that conflict is built on individual narrow sample studies. So sometimes a research study would find that immediate feedback would work best, and other times a research study might find that delayed feedback might work best. But our ManyClasses research study resolves these conflicts, and really says actually, there's no global effect at all. The global difference is zero.
So I think that there's a nice, even though the results don't suggest, you know, hey, you should do this, they do resolve some theoretical conflict about how we would expect abstract principles of learning to affect classroom implementations at scale. So those are, that's the big takeaway. I'll also add that, as I mentioned, when we were talking about the design of the ManyClasses research study, one of our goals was to try and find out whether there are pockets of the design space where there might be differences in how people perform and how people learn, depending on what the experimental manipulation is. And we did find some suggestive evidence that in certain kinds of classes there might actually be advantages to delayed feedback; it just didn't meet our sort of objective threshold for statistical inference. So there are some places where it seems to be the case that the pattern of feedback, I'm sorry, the timing of feedback might actually affect student learning. And these hints kind of are signposts for future research that might continue to explore the timing of feedback.

Kelly: Emily, did you want to add anything on what you felt about the results?

Fyfe: Yeah, no, I mean, Ben covered that super, super well. I'll just reiterate that our most exciting result is that we did it. We pulled it off, we did ManyClasses 1, and it was a lot of hard work. But it was also a lot of fun. And we learned a lot, both about the results, and about conducting experimental research, about doing multi-institutional research, about being, using open science and transparent research practices. I feel like as a team, we really both learned a lot and grew a lot. And now we feel sort of competent and ready to do another ManyClasses, because we now know, it's, it's totally possible. So this, this type of new research, where we're doing randomized experiments across many different kinds of classes at the same time, is, is feasible and, and helpful for the field. So that's, that was, yeah, like Ben said, that sort of result number one that we're most proud of. But yeah, speaking to the actual results about the timing of feedback, the fact that we don't see a global effect is sort of surprising to many people that we talk to, because sort of the assumption is that, of course, immediate feedback is going to be better. And it's surprising that we didn't find that here. And in fact, like Ben said, we have hints that there are certain situations where delayed feedback might be better. And so we've, we've sort of talked about this as, you know, relieving teachers of the stress to provide feedback, you know, immediately after an assignment. So yeah, so those, those two things really hit home what we found out from this project.

Kelly: So of course, I have to ask, you know, what plans do you have for a ManyClasses 2? Are you tossing around, kind of, experimental questions? You know, where are you with that?

Motz: Okay, so our finding from ManyClasses 1 was that this is feasible. But as I mentioned before, one of the qualitative aspects of that feasibility was that it was really hard, just with the number of hours invested in pulling off this one research study, basically makes it unsustainable. So, it was enough to show as a proof of concept. But yeah, it's not the sort of thing that we could imagine ourselves investing effort in routinely. So we're taking our learnings, and I'm working on designing a platform that will automate some of the features of the research in LMSes that made it so challenging. So like I mentioned, one of the big challenges was informed consent. So we could imagine a tool that will just do that automatically. One of the big challenges was getting access to identifiable data so that we could then de-identify it. Well, we could fix that if there was a tool that would just export de-identified data, perhaps already with non-consenting participants excluded. So I'm working to develop this tool as an LTI app that adds into the Canvas learning management system. It's called Terracotta. You can learn more about it at And we're just actually right now putting the finishing touches on the alpha prototype. So hopefully, we'll be able to unveil that in some form this fall. It's being built by Unicon. So Unicon is kind of the dream team of educational technology development these days. And yeah, it'll make it so that future research studies, whether you know, it's the ManyClasses team conducting research at massive scales, or even a teacher who's just interested to manipulate some aspect of an assignment and see what the benefit is to their students. It'll make these experimental research studies much more feasible. Yeah, and perhaps hopefully improve the evidence base for what we're doing in our classrooms.
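
The kind of export Motz describes, de-identified and with non-consenting participants already excluded, might look something like the sketch below. This is not Terracotta's implementation, which the interview does not detail. The record fields (`student_id`, `name`, `score`), the function names, and the choice of a keyed SHA-256 hash for pseudonyms are all assumptions made for illustration.

```python
import hashlib
import hmac


def pseudonym(student_id, secret_key):
    """Derive a stable pseudonym from a student ID using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret_key, student_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def export_deidentified(records, consented_ids, secret_key):
    """Drop non-consenting students and strip direct identifiers from each record."""
    exported = []
    for record in records:
        sid = record["student_id"]
        if sid not in consented_ids:
            continue  # non-consenting participants never leave the system
        row = {k: v for k, v in record.items() if k not in ("student_id", "name")}
        row["participant"] = pseudonym(sid, secret_key)
        exported.append(row)
    return exported


# Hypothetical data for illustration only.
key = b"hypothetical-secret-held-by-the-institution"
records = [
    {"student_id": "s1", "name": "A", "score": 0.9},
    {"student_id": "s2", "name": "B", "score": 0.7},
]
rows = export_deidentified(records, consented_ids={"s1"}, secret_key=key)
print(rows)
```

Using a keyed hash rather than a plain hash means the pseudonyms cannot be reversed by someone who merely guesses student IDs, as long as the key stays with the institution.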

Kelly: And next steps for you, Emily?

Fyfe: Yeah. So like we, I think we both mentioned this already, that we, this is, we have completed the ManyClasses 1 experiment and we're hoping there's many, many more to come. And so as a team, we've gotten together and we're currently in the process of writing and submitting a grant to get funding for ManyClasses 2. And our goal with ManyClasses 2 is to take on a new challenge of looking at a different educational practice. So ManyClasses 1 focuses explicitly on the timing of feedback, and ManyClasses 2 is going to be focusing on this educational practice called retrieval practice, which is the idea, this widespread idea that actively retrieving information or you know, bringing it to mind in memory is actually good, you know, as a good study practice, sometimes referred to as the testing effect. And so this is a practice, an educational practice that, again, has both practical and theoretical implications, making it a good case study for ManyClasses 2. And so it's a widespread and authentic educational practice, it's, it's highly recommended to instructors, but theoretically, we need to better understand how is it working, what's the mechanism by which it's working, and in what conditions, like what settings and in what implementations. Does it work across diverse educational settings? So we're in the, in the process of, you know, writing grants to get funding to do this study. And we hope that there are many more ManyClasses studies to come after that as well.

Kelly: Any final advice just from what you've learned from this experience?

Fyfe: One set of advice that comes to mind, mostly because I'm now looking right at Ben's face, is that, to work with amazing people, right? So this is the biggest project I've ever been a part of. The paper has a gazillion authors and, and co-authors and people listed in the acknowledgments. And, and it was wonderful to sort of, and of course, difficult, to sort of wrangle that many people. But our core team, it was just really fascinating to work together and have these different areas of expertise. You know, I knew more about the feedback literature, but another member knew way more about the data analysis, and Ben did a really great job with all the technology stuff and recruitment across institutions. And this kind of collaboration, I say, you know, my advice to people is to go for it, because it really was a lot of fun and totally, totally wouldn't have been possible with just one of us as the lead.

Motz: That's a good answer, Emily. I like you, too.

Kelly: I love it. Okay. Well, thank you so much for coming on. That was great.

Motz: Thank you so much, Rhea.

Fyfe: Yeah, thank you. This was fun.

Kelly: Thank you for joining us. I'm Rhea Kelly, and this was the Campus Technology Insider podcast. You can find us on Apple Podcasts, Google Podcasts, Amazon Music, Spotify and Stitcher, or visit us online at Let us know what you think of this episode and what you'd like to hear in the future. Until next time.
