“Stop spam. Read books”

I used to get frustrated with those popup boxes that make you decipher undecipherable text for no apparent reason. And then I learned that there IS an apparent reason.

I mean, initially it was just about making sure that the typist was in fact a person, and not just a spam bot. Hence the name CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart*. Bots, it seems, are great at bringing the internet to its knees, but aren’t so talented when it comes to reading distorted text.

These days, however, most websites use reCAPTCHA, and its intentions are infinitely more honourable: stopping spam AND ‘reading’ books. Now, I’m all for the simple eradication of spam, but helping to archive the annals of English literary history seems like more of a lasting service to humanity…

See, all over the world, Very Delicate, Old, and Ephemeral Books are being digitised and thus preserved for all eternity (thereby avoiding the risk of another Alexandria?). But this involves scanning said books, and then converting the images into workable text. And if you’ve ever used “Optical Character Recognition” to edit scanned text in Adobe Acrobat, you’ll know how successful THAT can be. Lowercase ‘r’ next to ‘n’ ALWAYS comes out as ‘m’.

When you fill out a reCAPTCHA prompt, you’re shown two words. One of them is from one of those old texts, garbled by OCR (because OCR is done by a computer, which can’t tell the difference between ‘rn’ and ‘m’, and presumably you know better).

The other is a known quantity: a chosen word, mangled in the same way as the ‘unknown’ word. If you’re capable of deciphering this word, then reCAPTCHA assumes you’ve correctly transcribed the word it actually needs.
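
For the curious, here’s roughly how that two-word trick might look in code. This is only a sketch of the idea as described above, not reCAPTCHA’s actual implementation: the class name `WordPairChallenge`, the `votes_needed` threshold, and the idea of waiting for several people to agree before trusting an answer are all my own assumptions for illustration.

```python
from collections import Counter


def normalise(word: str) -> str:
    """Ignore case and stray whitespace when comparing answers."""
    return word.strip().lower()


class WordPairChallenge:
    """Hypothetical model of a two-word prompt: one known word, one unknown."""

    def __init__(self, control_word: str, votes_needed: int = 3):
        self.control_word = normalise(control_word)
        # Assumption: an answer is only trusted once enough humans agree on it.
        self.votes_needed = votes_needed
        self.votes = Counter()

    def submit(self, control_answer: str, unknown_answer: str) -> bool:
        """Return True if the typist passed the known word.

        Only answers from typists who pass are counted towards the
        unknown word.
        """
        if normalise(control_answer) != self.control_word:
            return False
        self.votes[normalise(unknown_answer)] += 1
        return True

    def transcription(self):
        """Return the agreed reading of the unknown word, or None if undecided."""
        if not self.votes:
            return None
        word, count = self.votes.most_common(1)[0]
        return word if count >= self.votes_needed else None


challenge = WordPairChallenge("morning", votes_needed=2)
challenge.submit("morning", "modern")   # passes, one vote for "modern"
challenge.submit("mornimg", "modem")    # fails the known word, vote ignored
challenge.submit("Morning", "modern")   # passes, second vote for "modern"
print(challenge.transcription())        # prints "modern"
```

The point of the known word is that a wrong answer there throws out the whole submission, so a bot guessing ‘modem’ never gets a say in what the old book actually reads.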

So next time you’re asked to verify a posted link on fakebook, stop before you grumble, and remember that:

  1. You’re doing a noble and relatively effortless deed, to help a noble and otherwise unconquerable cause, and
  2. This is perhaps the only time that as a human, you are more useful than a computer, and you should do what you can to reinforce that assumption.

Hell, I’m tempted to put every blog post behind a reCAPTCHA-protected link, just to move the whole process along**.


*I generally don’t approve of meaningless neologisms, but I’ve a soft spot for shamelessly twee acronyms.

** Don’t worry: I won’t.
