Did you know that every time you sign up for an account online, you may be helping to digitize books? The distorted word that you must type in order to sign up for an online service is called a CAPTCHA. It was developed to make sure that only people, not automated programs, were signing up for email accounts or buying tickets online. Computer programs generally cannot pass a CAPTCHA test, so when one is passed, it is safe to assume that a real person is signing up and not a computer program. This is all very interesting and useful, but the real story is how CAPTCHA is now being used to digitize the large backlog of newspapers and other printed materials. This program is called reCAPTCHA and was developed as a way to put the CAPTCHA test to "good use." When a person signs up for an online service, they are given a CAPTCHA challenge to read as well as another word from a book or newspaper that is being digitized. The reCAPTCHA word could not be read by optical character recognition software, so it needs a human eye to read it. When the word is entered along with the CAPTCHA challenge, the person's reading of the reCAPTCHA word is sent back to the service that is digitizing the material, and the previously unrecognizable word can now be recognized by the computer. According to Wikipedia, the reCAPTCHA program provides the equivalent of 12,000 man-hours per day of free labor to digitization projects.
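For the curious, the two-word check described above can be sketched in a few lines of Python. This is only a rough illustration of the idea, not the actual reCAPTCHA code: the function names, the `scan-17` word ID, and the rule that three matching answers are needed before a reading is trusted are all my own assumptions.

```python
from collections import Counter, defaultdict

# Assumed for illustration: how many matching human answers we want
# before trusting a reading. The real service's rule may differ.
AGREEMENT_THRESHOLD = 3

# Maps a hypothetical scanned-word ID to a tally of human readings.
votes = defaultdict(Counter)


def normalize(text):
    """Ignore case and stray spaces when comparing typed words."""
    return text.strip().lower()


def submit(control_answer, typed_control, unknown_word_id, typed_unknown):
    """Return True if the person passes the test.

    The reading of the unknown word is only recorded when the
    known (control) word was typed correctly.
    """
    if normalize(typed_control) != normalize(control_answer):
        return False  # failed the known word, so discard the other guess
    votes[unknown_word_id][normalize(typed_unknown)] += 1
    return True


def accepted_reading(unknown_word_id):
    """Return the agreed reading, or None if too few people agree yet."""
    tally = votes.get(unknown_word_id)
    if not tally:
        return None
    guess, count = tally.most_common(1)[0]
    return guess if count >= AGREEMENT_THRESHOLD else None
```

With this sketch, a word that the scanning software could not read only counts as "digitized" once enough people who passed the known word have typed the same thing for it.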
I learned about this great idea on Nova Science Now. The problem of computers not being able to read and digitize printed materials was one of the topics covered in my metadata class. It's so interesting that someone was able to come up with a way to use a simple computer test to help in the quest to digitize as much printed material as possible. Think how much has already been digitized by using this program, and all without anyone having to do any real "work."