Data Analytics
The Rocky Road of Using Data to Drive Student Success
The California State University system has hit its share of potholes as it tests predictive analytics to forecast student performance in high-failure-rate courses. Here are its lessons learned.
- By Dian Schaffhauser
- 07/26/18
The premise is fairly simple: If colleges or universities could just identify students most at risk of failing a required course based on a predictive model, faculty and advisers could reach out and lend helping hands. But the road to student success in higher education is often pitted with potholes.
And California State University is finding its fair share as it pursues a pilot project that grew out of the university system's "Graduation Initiative 2025." This is an ambitious plan to increase graduation rates for all CSU students while eliminating equity gaps for under-represented minorities and Pell-eligible students. For example, the four-year graduation rate for freshmen is pegged to increase from 23 percent in 2017 to 40 percent by 2025; the six-year rate is expected to rise from 59 percent to 70 percent.
Much of the focus for the 2025 program is to eliminate bottlenecks — dumping placement exams, removing non-credit-bearing remedial courses, helping students succeed through their math and English requirements and focusing on waitlisted courses — classes with high demand and low success rates.
The Office of the Chancellor runs an Academic Technology Services division to facilitate (and subsidize) tech projects that support many CSU campuses. The same group handles systemwide contracts with education technology companies, including learning management system vendors.
That's where this story really begins — with the LMS. Kathy Fernandes, senior director for Learning Design and Technologies, and Jean-Pierre Bayard, director for Systemwide Learning Technologies and Program Services, both part of Academic Technology Services, consider the LMS a goldmine for retention work.
"We both love to say that the data in a student information system is dead data, meaning it's already passed and there's nothing you can do to change it," said Fernandes. "However, if faculty are using the LMS, you can peek into what we call the 'live data' and see where the students are. If you're going to try to improve student retention, that's where you need to catch them in their learning process — not after the course is over."
With the support of Assistant Vice Chancellor Gerry Hanley, they set out to understand the potential for using that live data to advance the system's student success goals. Along the way, they hoped, they could help the campuses learn how to redesign key courses and initiate a culture of data analytics in academic offerings.
The Starting Line
Although the 23 CSU campuses use different LMSes, Blackboard Learn and Moodle are the most common. Blackboard offers predictive analytics products for each: Learn has the platform-agnostic Blackboard Predict, and Moodle has X-Ray Learning Analytics.
One of the differences between the two products is that Predict pulls past data from the student information system to flesh out the predictive model, while X-Ray relies solely on the data in the LMS. Also, noted Fernandes, X-Ray uses machine learning to interpret whether a student is just going through the motions or is truly engaged in the class. "It literally will show you a student who may be replying in the discussion board but really isn't engaged in the discussion," she explained. The program identifies those individuals in the discussion who are truly acting as the leaders.
Academic Technology Services offered to subsidize the testing of the two products, and a handful of schools came forward in December 2016. "It's not like people aren't paying attention to the wave of learning analytics," said Fernandes. By joining the project, the campuses knew they'd get more help than if they were going to do it on their own. "They would have had to get a lot more oomph and budget behind it to make significant progress."
Four campuses were early buy-ins. Chico State and San Diego State would test Predict. San Francisco State and Sonoma State would try X-Ray. Each campus would set its own goals and determine how best to choose the courses and faculty that would participate.
3 Kinds of Potholes
That's when the project began hitting potholes.
Pothole No. 1: Sonoma State, which self-hosted Moodle, ran into a problem with the version of PHP running in its Moodle instance, which meant the latest version of X-Ray wouldn't work. When new leadership arrived, the school decided it was time to hold an LMS review.
Solution: Make sure the car is ready for the road trip. The university put the X-Ray pilot on hold. "You don't change your LMS and then do learning analytics at the same time," explained Bayard.
Pothole No. 2: The X-Ray pilot at San Francisco State was intended to be school-wide rather than limited to a set number of classes. But the campus, which also runs self-hosted Moodle, discovered that the risk model wasn't valid across all LMS courses. Faculty use the LMS in their own ways, making the data inconsistent.
Solution: Take a detour to avoid the pothole. The university decided that X-Ray served better as a "more robust and detailed dashboard" for viewing student activities in a course than what Moodle generically has available. Applying learning analytics across the whole institution, rather than to specific courses, will require a longer-term view and longer-term engagement.
Pothole No. 3: Chico State, running a self-hosted version of Blackboard Learn when it began this project, found that instructors considered the predictions superficial, leading to a lack of trust. As Fernandes recalled, one person who taught organic chemistry "basically said, 'Yeah, within the first two weeks, 80 percent of the students [in my class] are predicted to fail. I could have told you that before you got the software.'"
Solution: Fill the pothole. As a result of that kind of feedback, the vendor received a recommendation: Prioritize the students predicted to fail and give feedback about why the program thinks those students will fail. There are 25 to 60 variables used in the risk calculations, Fernandes pointed out. Is the student expected to fail because he's a freshman with a low GPA or because he missed handing in an assignment? "You're not helping me by saying they're potentially at risk. I need to know what specifically is causing the vulnerability of that risk."
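To make that recommendation concrete, here is a minimal sketch of a risk score that reports the variables driving it alongside the prediction. It assumes a simple logistic model; the variable names, weights and bias are hypothetical and chosen only for illustration. This is not how Blackboard Predict or X-Ray computes risk, which the article does not describe.

```python
import math

# Hypothetical weights for illustration only; not taken from any vendor model.
WEIGHTS = {
    "missed_assignments": 0.9,
    "low_prior_gpa": 0.7,
    "is_freshman": 0.3,
    "days_since_last_login": 0.1,
}
BIAS = -2.0  # hypothetical baseline


def risk_with_reasons(student, top_n=2):
    """Return (estimated probability of failing, top contributing variables)."""
    contributions = {
        name: WEIGHTS[name] * student.get(name, 0.0) for name in WEIGHTS
    }
    logit = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    # Rank variables by how strongly each pushes the score toward "at risk".
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    reasons = [name for name, value in ranked[:top_n] if value > 0]
    return probability, reasons


# Made-up student record.
student = {
    "missed_assignments": 3,
    "low_prior_gpa": 0,
    "is_freshman": 1,
    "days_since_last_login": 4,
}
probability, reasons = risk_with_reasons(student)
print(f"risk={probability:.2f}, reasons={reasons}")
```

With a breakdown like this, an instructor would see not only that a student is flagged but also whether the flag comes from missed work or from background factors, which is the distinction Fernandes describes.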
Driving Lessons
Both Fernandes and Bayard would acknowledge that the learning is still going on. But just a semester or two into the deployments, they've already picked up plenty of insight.
Use of the LMS gradebook is essential. Without it, there isn't much to work from in building the picture of success, said Bayard. "A course that is making frequent use of the gradebook for assessment — particularly in the first six weeks — gives you more accurate data." At the same time, due to a bug in the software, faculty needed to avoid gradebook "weighted columns." The fix: Have faculty work with an instructional designer to assist in setting up low-stakes, high-frequency assessments that use the gradebook.
Clicker apps are a quick way to assess. Since frequent assessments recorded to the gradebook make for better predictions, Bayard has seen pickup of student polling apps among instructors. "They're doing it individually in courses to try to see where students are," he noted. "And those clickers do generate some pretty good data sets."
Data capture will determine your predictions. Student success depends on myriad variables, and if your instructors don't design the course in the LMS to gather data (assignments, tests, online homework, attendance, etc.), the predictive results won't be valid. For example, class attendance might be a predictor for a given course, but if the instructor doesn't bother keeping attendance, there's no way to know how that will influence the outcomes.
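As a rough illustration of that point, the sketch below checks which of the signals named above a course actually records. The data layout (`kind`, `recorded_entries`) is a hypothetical stand-in, not an actual LMS export format.

```python
# Signals named in the text above; the field names below are hypothetical.
EXPECTED_SIGNALS = ("assignments", "tests", "online_homework", "attendance")


def missing_signals(course_items):
    """List expected signals for which the course has recorded no entries."""
    recorded = {
        item["kind"] for item in course_items if item.get("recorded_entries", 0) > 0
    }
    return [signal for signal in EXPECTED_SIGNALS if signal not in recorded]


# Made-up course: attendance has a column, but nothing is ever entered in it.
course_items = [
    {"kind": "assignments", "recorded_entries": 12},
    {"kind": "tests", "recorded_entries": 2},
    {"kind": "attendance", "recorded_entries": 0},
]
gaps = missing_signals(course_items)
print("Signals the model cannot use:", gaps)
```

Any signal that shows up in the gap list cannot inform a prediction for that course, no matter how predictive it might be in principle.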
Don't delay. "Ideally, you would want to give feedback to students within the first four weeks of the semester to make sure they're getting aboard on course concepts and coursework," noted Fernandes. "If you're not going to be able to give a prediction that they might fail until the sixth or seventh week (before mid-terms), it's too late." And if the course is too complex to know by week four, maybe it's not the right place to start with predictive analytics. Likewise, if you can't start at the very beginning of the semester, hold off, she advised. "It's difficult to catch up with this predictive model idea when you miss the first two weeks."
Be choosy. An overriding consideration for Bayard is to have "very good diagnostic tools for selecting courses and instructors." For an LMS course to yield data good enough to build a valid predictive model of student success, faculty have to be willing to engage in the course design, either themselves or with the help of an instructional designer. For instance, while faculty are accustomed to setting up their gradebooks the way they want, learning analytics software is still in its infancy and requires the gradebook to be set up in a particular way to produce data of value to the predictive modeling.
Boss buy-in is essential. Without support for and interest in learning analytics from upper administration, "forget it," suggested Fernandes. "This is a long-term thing. You have to be committed to it. It takes multiple units on campus to talk with each other. If you think, 'Oh, we'll stand it up for a year and see if we get return on our investment,' we'll tell you don't do it."
Leapfrog the learning. San Diego State, another Blackboard Learn user, began its use of Predict a semester later than Chico State, enabling it to strengthen its approach. For example, by the time the university implemented Predict, it had already created "really effective, rich media" to send signals to students that the school was paying attention to student engagement in the courses. "They did some really fun things," said Fernandes. One example was a video of an empty chair in the classroom with a message that said, "We noticed that you were absent today." That would be accompanied by data showing what their chances of passing the course would be if they attended class vs. if they didn't attend. You have to create clear road signs for students and faculty so no one gets lost.
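The kind of comparison behind that message can be sketched in a few lines: split past students by attendance and compare how often each group passed. The records and the 80 percent attendance threshold below are made up for illustration; the article does not say how San Diego State derived its figures.

```python
def pass_rate_by_attendance(records, threshold=0.8):
    """Compare pass rates for students at or above vs. below an attendance threshold.

    `records` is an iterable of (attendance_rate, passed) pairs.
    """
    groups = {"attended": [], "missed": []}
    for attendance_rate, passed in records:
        key = "attended" if attendance_rate >= threshold else "missed"
        groups[key].append(passed)
    # A group with no students has no meaningful rate, so report None.
    return {
        key: (sum(values) / len(values) if values else None)
        for key, values in groups.items()
    }


# Made-up historical records: (share of classes attended, passed the course?)
past_records = [
    (0.95, True), (0.90, True), (0.85, False),
    (0.60, False), (0.50, True), (0.40, False),
]
rates = pass_rate_by_attendance(past_records)
print(rates)
```

A comparison like this shows association, not cause, which is one reason to present it to students as an encouragement rather than a verdict.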
Predicting success or failure is just the start. While it may be useful for the institutions themselves to understand the factors that lead to success or failure, not everybody is convinced that the students need to know what side of the line they fall on. Chico, for example, declined to show students their predicted failure, while San Diego State chose to inform students. In doing so, however, the university also let students know to "take it with a grain of salt, to not give up," Fernandes said. "They did it much more gingerly."
Plus, the decision-making can't end there, she added. "You shouldn't get into learning analytics just to find out who's potentially going to fail and not have the rest of the institution on board with what are you going to do about it once you know that." And developing approaches to help students based on their risk factors is yet another large institutional project.