Cornell Research Finds Route to Faster, Better Page Ranking

A technique developed by Cornell University researchers promises to speed up search ranking enough to run in real time. In a visual representation of a database, circles, referred to as "nodes," represent data items; lines, called "edges," represent the links between the nodes. The new mechanism, described in a paper presented at last summer's ACM SIGKDD Conference on Knowledge Discovery and Data Mining in Sydney, focuses on the most important edges.

When a user enters a term in a search engine, recommendations are often personalized based on what that user has recently searched for. For example, hunting in Google for a particular product one night could affect how results are displayed the next day for related search terms. "They have computed the rankings offline, based on your choice," said Wenlei Xie, primary author of the paper and a PhD candidate in the Department of Computer Science.

The research project is part of a larger initiative at Cornell to gain a greater understanding of data "that goes well beyond classical database queries," the university wrote in a $2.4 million National Science Foundation grant request. The overall goal is to create "the foundations" for a new kind of database, called the "causal database," where queries go beyond mining for "statistically significant patterns in data" to take causality and "explanations" into account in generating results.

Xie's work lays out the standard route for how a computer performs a search ranking: It "walks" around the database, frequently following a route set by nodes and edges that have been "weighted" based on how many times a particular site has been visited or a type of product has been viewed. Node-weighted algorithms tend to be a bit random. It's akin, the researcher explained, to ranking something on Twitter based on the topics of tweets rather than the relationship between two people.

Xie's paper recommends "edge weighting" instead. Edge-weighted methods are already in use, but they tend to be slow. The paper offers a way to speed up edge-weighted ranking by reducing the granularity of the walking route and focusing on nodes that have linked edges, whether by way of similar interests or strong connections. The overall goal is to develop "general, fast methods for edge weight personalization."
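To make the idea of edge-weighted ranking concrete, here is a minimal Python sketch of a random-walk ranking in which each edge type carries a user-specific weight. It is not the paper's reduced model; it is ordinary power iteration, shown only as an illustration of the kind of calculation being sped up. The function name, the edge types ("cites," "coauthor"), the tiny graph and the damping value of 0.85 are all assumptions made for this example.

```python
def edge_weighted_pagerank(edges, type_weights, alpha=0.85, iters=100):
    """Rank nodes by a random walk whose steps follow weighted edges.

    edges: list of (source, target, edge_type) tuples (hypothetical data).
    type_weights: dict mapping edge_type -> non-negative weight; this is
        the per-user personalization knob in this sketch.
    """
    nodes = sorted({n for s, t, _ in edges for n in (s, t)})
    # Total outgoing weight per node, used to normalize step probabilities.
    out_weight = {n: 0.0 for n in nodes}
    for s, _, kind in edges:
        out_weight[s] += type_weights[kind]

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        # With probability 1 - alpha the walker jumps to a random node.
        nxt = {n: (1.0 - alpha) / len(nodes) for n in nodes}
        for s, t, kind in edges:
            if out_weight[s] > 0:
                nxt[t] += alpha * rank[s] * type_weights[kind] / out_weight[s]
        # Nodes with no usable outgoing edge spread their rank uniformly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0)
        for n in nodes:
            nxt[n] += alpha * dangling / len(nodes)
        rank = nxt
    return rank


# Hypothetical three-node graph with two edge types.
edges = [
    ("a", "b", "cites"), ("a", "c", "coauthor"),
    ("b", "c", "cites"), ("b", "a", "coauthor"),
    ("c", "a", "cites"),
]
citation_fan = edge_weighted_pagerank(edges, {"cites": 9.0, "coauthor": 1.0})
coauthor_fan = edge_weighted_pagerank(edges, {"cites": 1.0, "coauthor": 9.0})
```

In this sketch, every change to the edge weights means rerunning the whole iteration over the full graph, which gives a rough sense of why personalizing by edge weight is costly on large databases.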

The researchers tested their approach with a database of scholarly publications and a blog search system. The results: The new method "outperformed existing methods by nearly five orders of magnitude."

As the paper reported, "This huge performance gain over previous work allows us — for the very first time — to solve learning-to-rank problems for edge weight personalization at interactive speeds, a goal that had not previously been achievable for this class of problems."

The "reduced model," as it's called, also speeds up "learn to rank" systems, where the computer learns user preferences by making note of which items in a list the user clicks on.

Further performance gains might be possible by downloading the reduced model onto the user's computer to perform the calculations there and to update the reduced model continuously as new data comes in.

The work, which also involved Cornell researchers David Bindel, Alan Demers and Johannes Gehrke, was supported by the National Science Foundation and the National Research Council of Norway.

The project's GitHub site provides access to the script, data and results from its experiments. The paper is also available there.

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.
