Neat hack: reCAPTCHA

I’m not a huge fan of CAPTCHAs – those annoying prove-you’re-not-a-spam-bot thingummies where you have to type some disguised letters.  Even if you can deploy them without the outrageous accessibility issues, they’re at best a necessary evil.  (At worst, they don’t even fox sophisticated spam tools.)  It’s fundamentally wrong for a computer to be setting a human make-work.

But reCAPTCHA is a fantastic hack around this.  Instead of generating an image by distorting some known text, they use scanned text that OCR software has failed to read.  This is very neat in two ways.  Firstly, the human effort involved contributes to digitisation projects and so becomes real, useful work instead of wasted drudgery.  And secondly, the scanned words are a great source of images that are likely to be easily read by a human (who has no visual impairment or challenges) but are known to be hard for a computer to read.  Excellent!
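One obvious question is how the system can check your answer to a word that no computer has managed to read. Here is a minimal sketch of one way it could work, assuming each challenge pairs a control word whose answer is already known with one word that OCR could not read, and that a transcription is only trusted once several people agree on it. The function names and the vote threshold are hypothetical illustrations, not reCAPTCHA's actual API.

```python
from collections import Counter, defaultdict

# Hypothetical threshold: how many matching human answers we want
# before trusting a transcription. Not a real reCAPTCHA setting.
VOTES_NEEDED = 3

# Maps an unknown word's id to a tally of the transcriptions humans typed.
votes = defaultdict(Counter)


def normalise(text):
    """Ignore case and surrounding whitespace when comparing answers."""
    return text.strip().lower()


def check_response(control_answer, control_typed, unknown_id, unknown_typed):
    """Return True if the user passes the test.

    The user is judged only on the control word. If they get it right,
    their reading of the unknown word is recorded as one vote.
    """
    if normalise(control_typed) != normalise(control_answer):
        return False  # failed the checkable word, so discard the other guess
    votes[unknown_id][normalise(unknown_typed)] += 1
    return True


def accepted_transcription(unknown_id):
    """Return the agreed text once enough humans agree, otherwise None."""
    tally = votes[unknown_id]
    if not tally:
        return None
    text, count = tally.most_common(1)[0]
    return text if count >= VOTES_NEEDED else None
```

The design point is that the user never knows which of the two words is the control, so the easiest way to pass is to read both honestly, and the digitisation project gets its transcription as a side effect.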

(Of course, it’s no help against the clever counter-attack strategy of bots getting people to solve CAPTCHAs in exchange for access to porn, although I’m not sure that’ll work in practice at scale.)

Spam bots aren’t usually a problem in formal educational settings, since anonymous access is rarely allowed.  But they are for informal educational projects like OpenLearn.  And as I like to keep saying, the boundaries between the two are fuzzy and getting fuzzier.


Author: dougclow

Academic in the Institute of Educational Technology, the Open University, UK. Interested in technology-enhanced learning and learning analytics.