Data Mining for Academic Success

Purdue’s academic analytics correlate data from the course management and student information systems to create predictive models that can support student retention strategies.

In a project begun in 2005, researchers at Purdue University (IN) are developing models to predict academic success: academic analytics that will eventually be used to create interventions for at-risk students. Their first step was to identify data that could be mined from the course management system (CMS) and from the student information system (SIS), and demonstrate which factors are most significant.

Researchers studied an initial sample of about 1,500 students during the Fall ’05 semester, and quickly expanded their work to reflect the entire range of WebCT-supported classes at Purdue in Spring ’06. Analyses now include data on some 130,000 seats in the CMS (individual students may be counted more than once if they take more than one course), representing more than 30,000 students.

Exploring the Factors

Project lead John Campbell, Purdue’s associate VP for Teaching and Learning Technologies, explains how the study looks at the factors influencing academic success: “Academic success is really based on two different components: aptitude and effort. You can be the smartest person in the world, but if you don’t put in any effort, you’re not going to be successful. And people with less aptitude, who put a lot of effort into it, can be very successful.” So the researchers are rigorously examining indicators of aptitude and effort, by mining historical data such as SAT scores and GPA from the SIS (reflecting aptitude), and data on student use of the CMS from the Oracle back-end database connected to their WebCT system (reflecting effort).

The example in the graph above is a representative sample of 600 students across a range of classes and departments at Purdue. The chart shows the number of WebCT logins (where the fourth quartile is high and relative to the given class), the SAT scores (where the fourth quartile is high and relative to student SAT records for the given class), and the earned grade for the course (where A=4.0). This analysis demonstrates that the number of WebCT logins tends to impact the final grade, and more dramatically in the case of students with a history of lower SAT scores and fewer WebCT logins.
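To make the chart's construction concrete, here is a minimal Python sketch of that kind of tabulation: each student's logins and SAT score are ranked into quartiles within his or her own class, and earned grades are averaged for each pairing of quartiles. The field names, the helper functions, and the sample records are all invented for illustration. Nothing here is Purdue's actual code or data, and the article does not say how the researchers computed their quartiles.

```python
from collections import defaultdict
from statistics import mean


def quartile_within(values):
    """Map each value to a quartile 1..4 relative to its own group (4 = highest).

    Assumed rule: the quartile depends on how many values in the group are
    strictly lower, so tied values always land in the same quartile.
    """
    n = len(values)
    return [min(4, 1 + (4 * sum(other < v for other in values)) // n) for v in values]


def grade_by_quartiles(records):
    """Average earned grade for each (SAT quartile, login quartile) cell.

    records: iterable of dicts with hypothetical keys
    'course', 'logins', 'sat', and 'grade' (where A = 4.0).
    """
    by_course = defaultdict(list)
    for r in records:
        by_course[r["course"]].append(r)

    cells = defaultdict(list)
    for rows in by_course.values():
        # Quartiles are computed per class, as the chart description states.
        login_q = quartile_within([r["logins"] for r in rows])
        sat_q = quartile_within([r["sat"] for r in rows])
        for r, lq, sq in zip(rows, login_q, sat_q):
            cells[(sq, lq)].append(r["grade"])
    return {cell: mean(grades) for cell, grades in sorted(cells.items())}


# Invented illustrative records, not Purdue data.
sample = [
    {"course": "X101", "logins": 5, "sat": 900, "grade": 2.0},
    {"course": "X101", "logins": 20, "sat": 1000, "grade": 3.0},
    {"course": "X101", "logins": 40, "sat": 1100, "grade": 3.0},
    {"course": "X101", "logins": 80, "sat": 1200, "grade": 4.0},
]
table = grade_by_quartiles(sample)
```

With real CMS and SIS extracts in place of the invented sample, the resulting table would hold one average grade per quartile pairing, which is the kind of grid a chart like the one described could be drawn from.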

Predicting Is in the Future

Ultimately, the end goals are to develop intelligent agents that will automatically take actions (such as alerting the instructor that a student is likely in trouble, or notifying the student about help sessions that are available), and to provide trend data to administrators with an interest in retention. Campbell explains: “We have a lot of retention initiatives; the biggest challenge is getting the right people to the right initiative.” He points out that early intervention can be critical to success—and interventions may be more timely when triggered by academic analytics.
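As a rough illustration of what an analytics-triggered intervention might look like, the Python sketch below maps a student's within-class quartiles to the two kinds of action mentioned above. The rule, the threshold, the class name, and the action labels are hypothetical. The article does not describe the triggers Purdue's agents will actually use.

```python
from dataclasses import dataclass


@dataclass
class StudentSignal:
    student_id: str
    login_quartile: int  # 1 = fewest CMS logins in the class, 4 = most
    sat_quartile: int    # 1 = lowest SAT scores in the class, 4 = highest


def planned_actions(signal, max_at_risk_quartile=1):
    """Return illustrative action labels for one student.

    Hypothetical rule: low CMS activity prompts a note to the student;
    low activity combined with low SAT history also alerts the instructor.
    """
    actions = []
    if signal.login_quartile <= max_at_risk_quartile:
        actions.append("notify_student_of_help_sessions")
        if signal.sat_quartile <= max_at_risk_quartile:
            actions.append("alert_instructor")
    return actions
```

Under this invented rule, a student in the lowest quartile for both logins and SAT scores would receive both actions, while a student with frequent logins would receive none.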

Editor’s Note: John Campbell and a team from Purdue will present their work on academic analytics at Campus Technology 2006 in Boston. For more information, go to www.campus-technology.com/conf.
