USC Tool Taps Big Data to Bring Down Child Trafficking

The average age of entry into prostitution in the United States is 14. The kind of profit a pimp can expect to make on a child prostitute each year is $150,000. Child trafficking operations spend about $45 million a year advertising their services on literally thousands of sites and millions of pages. A three-year research project at the University of Southern California wants to shutter those operations, and it's using big data searching to do it.

Two computer science professors, Pedro Szekely and Craig Knoblock, working out of the Information Sciences Institute, have developed a new search tool intended to use the Internet to turn advertising against human traffickers.

The work is being funded by Memex, a Defense Advanced Research Projects Agency (DARPA) program aimed at developing the next generation of search technologies and transforming the way those technologies discover, organize and present results.

The researchers are treating their search project as a big data problem. The tool they've created combs through escort ads; downloads all relevant pages, including the ones those ads link to; discovers connections; folds the data into a repository; and provides query and analysis functionality so that law enforcement users can search it.

DIG (for "Domain-specific Insight Graphs") lets users who are looking for a missing child believed to be trapped in the escort industry search by phone number, location, alias and photo, and it recommends a way to reach that child. Currently, the database contains content from 50 million Web pages and 2 billion records; it's growing at the rate of about 5,000 Web pages per hour.
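To make the idea of discovering connections and then searching by phone number concrete, here is a minimal sketch in Python. It is not DIG's code, and the article does not describe how DIG links records. The field names (`id`, `phone`, `alias`), the function names and the sample data are all hypothetical, and union-find is a stand-in technique chosen for illustration: records that share a phone number or an alias are treated as connected, and a phone search returns every record in the matching group.

```python
from collections import defaultdict


def link_records(records):
    """Group record ids that share a phone number or alias (union-find).

    Assumption: each record is a dict with hypothetical keys
    'id', 'phone' and 'alias'. This is an illustration, not DIG's schema.
    """
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent pointers to the group root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # (field, value) -> id of the first record carrying it
    for r in records:
        for field in ("phone", "alias"):
            value = r.get(field)
            if not value:
                continue
            key = (field, value)
            if key in first_seen:
                union(first_seen[key], r["id"])
            else:
                first_seen[key] = r["id"]

    groups = defaultdict(set)
    for r in records:
        groups[find(r["id"])].add(r["id"])
    return list(groups.values())


def search_by_phone(records, phone):
    """Return the sorted ids of all records connected to the given phone."""
    hits = {r["id"] for r in records if r.get("phone") == phone}
    found = set()
    for group in link_records(records):
        if group & hits:
            found |= group
    return sorted(found)


# Fictional sample data (555 numbers are reserved for fiction).
SAMPLE = [
    {"id": "a", "phone": "555-0100", "alias": "X"},
    {"id": "b", "phone": "555-0101", "alias": "X"},
    {"id": "c", "phone": "555-0100", "alias": "Y"},
    {"id": "d", "phone": "555-0199", "alias": "Z"},
]
```

In this sketch, calling `search_by_phone(SAMPLE, "555-0101")` matches record `b` directly, then pulls in `a` (same alias) and `c` (same phone as `a`), while `d` stays separate. A system working with billions of records would need a very different storage and indexing design; the sketch only shows why a growing database can surface links that no single ad reveals.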

"As the database continues to grow, DIG will be able to uncover new connections and patterns in the data, making it even more useful," said Knoblock, the director of information integration at the institute, in a prepared statement.

All of the code for DIG is open source and made freely available to law enforcement agencies. The project leaders expect to upgrade the code quarterly over the course of the project, which began six months ago.

Eventually, Szekely and Knoblock said, they hope to enhance DIG so that it can flag potential victims and identify trafficking rings through their ads.

This isn't USC's first effort to address human trafficking. The university sponsors the Technology and Human Trafficking Initiative from the Center on Communication Leadership & Policy. That project is studying how communications technology is exploited in modern slavery.

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.
