MarketingVOX: The Voice of Online Marketing

Spam Tool CAPTCHA Helps Digitize Books


That security verification may assist a nobler cause

CAPTCHAs, the jumbled word tests users must pass to register at Web sites or make purchases, have just been enlisted by researchers at Carnegie Mellon on a mission to help digitize books, reports The Globe and Mail.


CAPTCHA stands for "Completely Automated Public Turing test to tell Computers and Humans Apart." Approximately 60 million such word jumbles are solved worldwide every day, and each takes about 10 seconds on average to decipher and type in.
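To put those two figures together, here is a quick back-of-the-envelope calculation in Python. It uses only the numbers quoted above; the function name `daily_effort_hours` is made up for illustration.

```python
# Figures quoted in the article (approximate).
SOLVES_PER_DAY = 60_000_000   # CAPTCHAs solved worldwide each day
SECONDS_PER_SOLVE = 10        # average time to decipher and type one


def daily_effort_hours(solves_per_day, seconds_per_solve):
    """Total human time spent on CAPTCHAs per day, in hours."""
    return solves_per_day * seconds_per_solve / 3600


hours = daily_effort_hours(SOLVES_PER_DAY, SECONDS_PER_SOLVE)
print(f"{hours:,.0f} hours per day")  # prints "166,667 hours per day"
```

That is roughly 600 million seconds of typing every day, which is the pool of effort the project hopes to redirect.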

Because computers cannot read the twisted letters and numbers that CAPTCHAs display, automated programs cannot sneak into Web sites to post comment spam or otherwise make mischief.

Researchers at Carnegie Mellon have come up with a way for people to type in snippets of books, thereby putting that time to good use even as they confirm they are not machines. If the approach is well executed, getting searchable texts online will become a less tedious process.

Other active book-digitizing projects rely on scanning followed by optical character recognition (OCR), which makes the text searchable. Unfortunately, this technology does not work reliably on text that is old, faded or distorted. Precious few other means of producing fully searchable digital books exist beyond typing the text in by hand.

The Internet Archive runs a number of book-scanning projects that together scan 12,000 books a month. It sends Luis von Ahn, an assistant professor of computer science at Carnegie Mellon, hundreds of thousands of files containing images that computers cannot recognize. These files can be split into single words, which can then be used as CAPTCHAs.

If enough users decipher a given CAPTCHA image the same way, the computer will begin to treat that reading as the correct answer. The digitized works are thereby corrected "so that they are really in good shape, then you can go and use these books in other type devices more easily," such as handheld computers or programs that read to the blind, said Brewster Kahle, co-founder of the Internet Archive.
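The agreement idea can be illustrated with a minimal Python sketch. This is not reCAPTCHA's actual code: the article gives no thresholds, so the function name `consensus_answer`, the `min_votes` and `min_share` parameters, and the choice to ignore case and surrounding spaces are all assumptions made for illustration.

```python
from collections import Counter


def consensus_answer(transcriptions, min_votes=3, min_share=0.75):
    """Return the agreed-upon reading of one word image, or None.

    min_votes and min_share are hypothetical thresholds; the article
    only says that "enough" users must decipher the image the same way.
    """
    # Assumption: ignore case and surrounding whitespace when comparing.
    normalized = [t.strip().lower() for t in transcriptions if t.strip()]
    if not normalized:
        return None
    word, votes = Counter(normalized).most_common(1)[0]
    # Accept only if the top reading has enough votes and a clear majority.
    if votes >= min_votes and votes / len(normalized) >= min_share:
        return word
    return None


print(consensus_answer(["morning", "Morning", "morning ", "mourning"]))  # morning
print(consensus_answer(["upon", "union"]))                               # None
```

In the first call three of four users agree, so the reading is accepted; in the second there are too few matching answers, so the word stays unresolved until more people have tried it.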

The project, dubbed reCAPTCHA, has not yet been put to use.
