CMU Researcher Uses eCommerce Tool To Digitize Books

A researcher at Carnegie Mellon University has found a way to turn the process by which people register at commercial websites into a method for digitizing books, the Associated Press reported.

The method redirects the time and effort people spend deciphering the short word puzzles used to confirm a registration: instead of solving throwaway puzzles, users key in printed material that needs digitizing.

The word puzzles are known as CAPTCHAs, short for "completely automated public Turing test to tell computers and humans apart."
Because computers can't reliably decipher the distorted letters and numbers, solving one shows that a real person is using the website.

CMU researchers estimate that about 60 million CAPTCHA puzzles are solved every day, each taking about 10 seconds. They have now come up with a way for people to type in snippets of books when registering at a site, which helps speed up the process of putting texts online.

"Humanity is wasting 150,000 hours every day on these," said Luis von Ahn, an assistant professor of computer science at Carnegie Mellon, who helped develop the original system.
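The arithmetic behind that figure can be checked with a short sketch. It uses only the two estimates quoted above (60 million puzzles a day, about 10 seconds each); the function and constant names are illustrative assumptions, not part of the researchers' work. Taken literally, the estimates give a slightly higher total than the quoted 150,000 hours, and the quoted figure corresponds to about 9 seconds per puzzle, which is consistent with "about 10 seconds."

```python
# Estimates quoted in the article; the names below are illustrative only.
CAPTCHAS_PER_DAY = 60_000_000
SECONDS_PER_CAPTCHA = 10
SECONDS_PER_HOUR = 3600


def hours_spent_per_day(captchas_per_day: int, seconds_each: float) -> float:
    """Total human time spent on CAPTCHAs per day, in hours."""
    return captchas_per_day * seconds_each / SECONDS_PER_HOUR


def seconds_each_for(total_hours: float, captchas_per_day: int) -> float:
    """Per-puzzle time implied by a given daily total of hours."""
    return total_hours * SECONDS_PER_HOUR / captchas_per_day


total_hours = hours_spent_per_day(CAPTCHAS_PER_DAY, SECONDS_PER_CAPTCHA)
implied_seconds = seconds_each_for(150_000, CAPTCHAS_PER_DAY)

print(f"{total_hours:,.0f} hours per day")            # 166,667 hours per day
print(f"{implied_seconds:.1f} seconds per puzzle")    # 9.0 seconds per puzzle
```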

Von Ahn is working with the Internet Archive, which runs several book-scanning projects, to put CAPTCHAs to work on digitization. The Archive scans 12,000 books a month and sends von Ahn image files of text that computers cannot recognize. The files are split up into single words that can be used as CAPTCHAs at sites all over the Internet.

About the Author

Paul McCloskey is contributing editor of Syllabus.