CAPTCHA creator speaks on web innovations

“Ok, so I want to start by asking you all a question: How many of you have had to fill out some sort of web-form where you’ve been asked to read a distorted sequence of characters or distorted words? How many of you found it really, really annoying? Ok … I invented that.”
In a talk titled “Human Computation” given in Wege Auditorium on Thursday, Luis von Ahn, an award-winning associate professor of computer science at Carnegie Mellon, described his most recent research projects and their recurring theme: reusing basic human abilities and interests in clever ways to solve large-scale problems via the Internet.
More than 10 years ago, von Ahn and a team at Carnegie Mellon developed a web tool known as CAPTCHA: the sometimes pesky but practical distorted characters that users must enter when creating personal accounts to verify that they are human. The tool was designed to improve Internet security by preventing spammers from writing computer programs that create millions of accounts at a time.
Concerned about the time wasted entering CAPTCHAs (on average, more than 200 million are entered daily), von Ahn’s team evolved its initial project into reCAPTCHA in 2008. Under this system, CAPTCHAs are generated from words that optical character recognition (OCR) software, which scans printed text in the process of digitizing books, has found unreadable.
Now, when a user enters a CAPTCHA on Facebook, Craigslist, Ticketmaster or any of around 350,000 other websites, their “brain is doing something that computers cannot yet do,” and that person is, word by word, contributing to the digitization of almost 2.5 million books a year, as well as the entire New York Times archive. Von Ahn’s figures show that more than 750 million unique individuals (roughly 10 percent of the human population) have made such effortless contributions. Not only does this help build a useful library for humanity, it is also gradually improving OCR technology so that computers can become a more efficient resource.
With his forthcoming project, Duolingo, von Ahn is beginning to tackle an even greater problem: translating the web into every major language for free. He compares this monumental task to humanity’s other great accomplishments, asking excitedly, “If we can put a man on the moon with 100,000 [men], what can we do with 100 million?” Two major obstacles to such an endeavor are the worldwide shortage of bilingual speakers and the lack of motivation to do something without monetary reward.
Duolingo addresses both obstacles by appealing to something many people are already interested in: learning another language. On the website, users are given basic vocabulary and grammar lessons so that, when prompted with passages (drawn from Wikipedia and the New York Times, the site’s first sponsors), they can translate them appropriately into German or Spanish, the site’s initial target languages. As a user’s skill improves, the lessons and passages become more complex, just as they would with language-learning software such as Rosetta Stone. As with reCAPTCHA, once a passage has been verified numerous times, it can be accurately translated and archived on the web in its new language. Similarly, an audio function on the website serves both to teach users to speak the new language and to dub foreign-language movies, making full use of every human contribution to the site.
According to von Ahn, perhaps the best thing about Duolingo is its early results, which indicate that it is as effective as software such as Rosetta Stone. It also creates a fairer business model for education, since those unable to pay to be taught a language can now access such resources for free. This revolutionary website begins its private beta testing in two to three weeks, and anyone can take part by visiting duolingo.com.