Data Mining for Academic Success

Purdue’s academic analytics correlate data from the course management and student information systems to create predictive models that can support student retention strategies.

IN A PROJECT begun in 2005, researchers at Purdue University (IN) are developing models to predict academic success: academic analytics that will eventually be used to create interventions for at-risk students. Their first step was to identify data that could be mined from the course management system (CMS) and from the student information system (SIS), and demonstrate which factors are most significant.

Researchers studied an initial sample of about 1,500 students during the Fall ’05 semester, and quickly expanded their work to reflect the entire range of WebCT-supported classes at Purdue in Spring ’06. Analyses now include data on some 130,000 seats in the CMS (individual students may be counted more than once if they take more than one course), representing more than 30,000 students.

Exploring the Factors

Project lead John Campbell, Purdue’s associate VP for Teaching and Learning Technologies, explains how the study looks at the factors influencing academic success: “Academic success is really based on two different components: aptitude and effort. You can be the smartest person in the world, but if you don’t put in any effort, you’re not going to be successful. And people with less aptitude, who put a lot of effort into it, can be very successful.” So the researchers are rigorously examining indicators of aptitude and effort, by mining historical data such as SAT scores and GPA from the SIS (reflecting aptitude), and data on student use of the CMS from the Oracle back-end database connected to their WebCT system (reflecting effort).

The example in the graph above is a representative sample of 600 students across a range of classes and departments at Purdue. The chart shows the number of WebCT logins (by quartile relative to the given class, where the fourth quartile is the highest), the SAT scores (by quartile relative to student SAT records for the given class, where the fourth quartile is the highest), and the earned grade for the course (where A=4.0). This analysis demonstrates that the number of WebCT logins tends to affect the final grade, most dramatically for students with a history of lower SAT scores and fewer WebCT logins.
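
For readers who want to try a similar breakdown on their own data, here is a minimal Python sketch of the kind of summary described above: each student is ranked into login and SAT quartiles relative to his or her own class, and grades are then averaged within each quartile pairing. It is an illustration only, not the Purdue team's method. The field names (`course`, `logins`, `sat`, `grade`), the mid-rank rule used to assign quartiles, and the sample rows are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def assign_quartiles(values):
    """Return a quartile (1-4, 4 = highest) for each value in the list.

    Assumption: quartiles come from a mid-rank percentile, so tied
    values always land in the same quartile.
    """
    n = len(values)
    quartiles = []
    for v in values:
        below = sum(1 for x in values if x < v)
        equal = sum(1 for x in values if x == v)
        pct = (below + 0.5 * equal) / n  # strictly between 0 and 1
        quartiles.append(min(4, int(pct * 4) + 1))
    return quartiles


def mean_grade_by_quartile(rows):
    """Average grade for each (login quartile, SAT quartile) pairing.

    rows: iterable of dicts with the hypothetical keys
    "course", "logins", "sat", and "grade" (A = 4.0).
    Quartiles are computed separately within each course.
    """
    by_course = defaultdict(list)
    for row in rows:
        by_course[row["course"]].append(row)

    cells = defaultdict(list)
    for course_rows in by_course.values():
        login_q = assign_quartiles([r["logins"] for r in course_rows])
        sat_q = assign_quartiles([r["sat"] for r in course_rows])
        for r, lq, sq in zip(course_rows, login_q, sat_q):
            cells[(lq, sq)].append(r["grade"])
    return {cell: mean(grades) for cell, grades in cells.items()}


# Made-up rows for illustration; these are not Purdue data.
SAMPLE = [
    {"course": "BIOL 110", "logins": 12, "sat": 980, "grade": 2.0},
    {"course": "BIOL 110", "logins": 45, "sat": 1010, "grade": 3.0},
    {"course": "BIOL 110", "logins": 30, "sat": 1250, "grade": 3.3},
    {"course": "BIOL 110", "logins": 80, "sat": 1190, "grade": 3.7},
    {"course": "CHEM 115", "logins": 150, "sat": 1100, "grade": 2.7},
    {"course": "CHEM 115", "logins": 310, "sat": 1300, "grade": 4.0},
    {"course": "CHEM 115", "logins": 90, "sat": 1020, "grade": 1.7},
    {"course": "CHEM 115", "logins": 220, "sat": 1180, "grade": 3.3},
]

if __name__ == "__main__":
    summary = mean_grade_by_quartile(SAMPLE)
    for (login_q, sat_q), avg in sorted(summary.items()):
        print(f"logins Q{login_q}, SAT Q{sat_q}: mean grade {avg:.2f}")
```

Ranking within each course matters here because raw login counts are not comparable across classes: 80 logins may be heavy use in one course and light use in another.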

Predicting Is in the Future

Ultimately, the end goals are to develop intelligent agents that will automatically take actions (such as alerting the instructor that a student is likely in trouble, or notifying the student about help sessions that are available), and to provide trend data to administrators with an interest in retention. Campbell explains: “We have a lot of retention initiatives; the biggest challenge is getting the right people to the right initiative.” He points out that early intervention can be critical to success—and interventions may be more timely when triggered by academic analytics.

Editor’s Note: John Campbell and a team from Purdue will present their work on academic analytics at Campus Technology 2006 in Boston. For more information, go to www.campus-technology.com/conf.
