Cornell Research Finds Route to Faster, Better Page Ranking -- Campus Technology

Cornell Research Finds Route to Faster, Better Page Ranking

By Dian Schaffhauser
02/08/16

A technique developed by Cornell University researchers offers the promise of speeding up search rankings to real-time speeds. In a visual representation of a database, circles, referred to as "nodes," represent data items; lines, called "edges," represent the links between the nodes. The new mechanism, described in a paper presented at last summer's ACM SIGKDD Conference on Knowledge Discovery and Data Mining in Sydney, focuses on the most important edges.

When a user enters a term in a search engine, recommendations are often personalized based on what that user has recently been entering in search. For example, hunting in Google for a particular product one night could affect how results are displayed the next day for related search terms. "They have computed the rankings offline, based on your choice," said Wenlei Xie, primary author of the paper and a PhD candidate in the Department of Computer Science.

The research project is part of a larger initiative at Cornell to gain a greater understanding of data "that goes well beyond classical database queries," the university wrote in a $2.4 million National Science Foundation grant request. The overall goal is to create "the foundations" for a new kind of database, called the "causal database," where queries go beyond mining for "statistically significant patterns in data" to take causality and "explanations" into account in generating results.

Xie's work lays out the standard route for how a computer performs a search ranking: It "walks" around the database, frequently following a route set by nodes and edges that have been "weighted" based on how many times a particular site has been visited or a type of product has been viewed. Node-weighted algorithms tend to be a bit random. It's akin, the researcher explained, to ranking something in Twitter based on the topics of tweets rather than the relationship between two people.

The approach proposed by Xie's paper recommends "edge weighting." These are already used, but they tend to be slow. The paper offers a way to speed up edge-weighted ranking by reducing the granularity of the walking route and focusing on nods that have linked edges — by way of similar interests or strong connections. The overall goal is to develop "general, fast methods for edge weight personalization."

The researchers tested their approach with a database of scholarly publications and a blog search system. The results: The new method "outperformed existing methods by nearly five orders of magnitude."

As the paper reported, "This huge performance gain over previous work allows us — for the very first time — to solve learning-to-rank problems for edge weight personalization at interactive speeds, a goal that had not previously been achievable for this class of problems."

The "reduced model," as it's called, also speeds up "learn to rank" systems, where the computer learns user preferences by making note of which items in a list the user clicks on.

Further performance gains might be possible by downloading the reduced model onto the user's computer to perform the calculations there and to update the reduced model continuously as new data comes in.

The work, which also involved Cornell researchers David Bindel, Alan Demers and Johannes Gehrke, was supported by the National Science Foundation and the National Research Council of Norway.

The project has a GitHub site where it provides access to the script, data and results from its experimentation. The paper is available online there also.

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.

E-Mail this page

Printable Format

Featured

Digital Education Council Defines 5 Dimensions of AI Literacy

A recent report from the Digital Education Council, a global community devoted to "revolutionizing the world of education and work through technology and collaboration," provides an AI literacy framework to help higher education institutions equip their constituents with foundational AI competencies.
OpenAI Report Identifies Malicious Use of AI in Cloud-Based Cyber Threats

A report from OpenAI identifies the misuse of artificial intelligence in cybercrime, social engineering, and influence operations, particularly those targeting or operating through cloud infrastructure. In "Disrupting Malicious Uses of AI: June 2025," the company outlines how threat actors are weaponizing large language models for malicious ends — and how OpenAI is pushing back.
From the Kuali Days 2025 Conference: A CEO's View of Planning for AI

How can a company serving higher education navigate the changes AI brings to ed tech? What will customers expect? CT talks with Kuali CEO Joel Dehlin, who shared his company's AI strategies with attendees at Kuali Days 2025 in Anaheim.
Meta Launches Stand-Alone AI App

Meta Platforms has introduced a stand-alone artificial intelligence app built on its proprietary Llama 4 model, intensifying the competitive race in generative AI alongside OpenAI, Google, Anthropic, and xAI.