ReadWriteWeb

ReCAPTCHA Introduces Enhanced Audio CAPTCHAs to Transcribe Old Radio Shows

Written by Frederic Lardinois / December 8, 2008 11:00 AM / 3 Comments

As we have reported before, the reCAPTCHA service, which is based at Carnegie Mellon University, is not only an easy way to keep spammers away from your web sites, but also an interesting experiment in harnessing human intelligence to transcribe old texts. To make sites that use the system accessible to people with visual impairments, the reCAPTCHA team has now launched an enhanced audio version of the service. It will be used to transcribe old radio shows that speech recognition technology cannot yet handle.

Security

As the team points out in a recent blog post, traditional audio CAPTCHAs based on distorted digits or letters are relatively vulnerable to automated attacks and can be broken by using machine learning algorithms. Indeed, Wintercore Labs, an IT security firm, showed how easy it would be to break Google's audio CAPTCHA solution earlier this year.

Transcribing Old Radio Shows

By using old audio clips, however, reCAPTCHA avoids these security problems (you can hear an example of these clips by clicking on the speaker button).

One problem with this type of CAPTCHA, however, is that many of these clips are quite hard to solve, especially because a lot of them come from radio plays and feature different voices within a single clip, as well as the occasional audio effect. Most of the clips are about ten words long.

The reCAPTCHA team acknowledges this problem by allowing a certain number of misspellings and other mistakes, but even with some practice, we rarely did better than solving every third CAPTCHA correctly (then again, many visually impaired users may be better at picking up these audio cues). If you did better, let us know in the comments.
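For readers curious how a transcription check might tolerate misspellings, here is a minimal sketch in Python. It is not reCAPTCHA's actual method, which the team has not described here. The function names, the sample sentence, and the 0.8 threshold are all assumptions made for illustration. It simply compares a typed answer to a known reference transcript with the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase the text, replace punctuation with spaces, collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower())
    return " ".join(cleaned.split())


def is_close_enough(answer: str, reference: str, threshold: float = 0.8) -> bool:
    """Return True when the answer is similar enough to the reference transcript.

    The threshold is a hypothetical value chosen for this example only.
    """
    ratio = SequenceMatcher(None, normalize(answer), normalize(reference)).ratio()
    return ratio >= threshold


if __name__ == "__main__":
    reference = "the quick brown fox jumps over the lazy dog"
    # A single misspelled word should still be accepted.
    print(is_close_enough("The quick brown fox jumps over the lasy dog.", reference))
    # An unrelated answer should be rejected.
    print(is_close_enough("completely different words", reference))
```

A character-level comparison like this forgives small typos; a real system would also need a trusted reference transcript for each clip, which this sketch simply assumes exists.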

Comments


  1. what if we dont really understand the song?

    www.iamlittle.net

    Posted by: Hüseyin Erkmen | December 8, 2008 11:50 AM



  2. Problem with audio samples is you have to speak english. Good english I would say. Anybody can read letters.

    I prefer audio samples with some specific sounds (dog, cat, cow, car, etc.).

    Posted by: Jan Menšík | December 8, 2008 2:54 PM



  3. Has anyone seen http://buytaert.net/manual-spam-service?

    Posted by: Bart De Munck | December 8, 2008 9:03 PM


