New data: old jokes

One of the joys of data science / Big Data / the quantitative turn* in my area is the chance to recycle ancient maths jokes. So, for instance, take this old chestnut:

Theorem: All positive integers are interesting.

Proof: Assume the contrary. If that is the case, there must be a positive integer x such that x is the smallest positive integer that is not interesting in any way. But that makes x a pretty interesting number – contradiction! Therefore all positive integers are interesting.

We can apply that to data dredging – the process of ploughing through a large dataset without much of an idea of what you’re looking for, in search of ‘interesting’ findings – along with some mischief about probability, and we get:

Theorem: All data dredging exercises yield interesting findings.

Proof: Assume the contrary. If that is the case, a data dredging exercise x could find no significant correlations despite testing many hundreds, if not thousands, of potential correlations. But we’d expect around 5% of correlations to be significant at the p < 0.05 level by pure chance, even if there were nothing real to find. To look so hard and find so few would be extremely unusual. So that would be a pretty interesting finding from exercise x – contradiction! Therefore all data dredging exercises yield interesting findings.
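To put rough numbers on the mischief, here is a minimal sketch in Python. It assumes the tests are independent and that every null hypothesis is true, neither of which is likely to hold in a real dredging exercise, and the function names are my own invention for illustration.

```python
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of 'significant' results if every null hypothesis is true."""
    return n_tests * alpha


def prob_no_significant(n_tests: int, alpha: float = 0.05) -> float:
    """Chance that none of n_tests independent tests comes out significant.

    Assumes independence and that all nulls are true (an idealisation).
    """
    return (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    for n in (100, 1000):
        print(n, expected_false_positives(n), prob_no_significant(n))
```

Under those assumptions, the chance of a 100-test dredge turning up nothing at all is well under 1%, which is rather the point of the joke.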

May your data analysis in 2014 be at least that interesting.

* If nobody else has used ‘the quantitative turn’ as a phrase to characterise what’s happening with analytics at the moment, particularly in education, I am totally claiming bagsies. Although actually I’m more keen on an empirical turn than a quantitative one per se.

This work by Doug Clow is copyright but licenced under a Creative Commons BY Licence.
No further permission needed to reuse or remix (with attribution), but it’s nice to be notified if you do use it.

The numbers are people

Educational research has two cultures, but unlike the two in C.P. Snow’s famous talk, they do overlap and they do talk to each other. One is fiercely qualitative, concerned directly and immediately with the lived human experience of learners and teachers, in all its ethnographic complexity, subtlety, and sophistication. The other is determinedly quantitative, concerned with what can be counted and known in more regular, repeatable, transferable ways.

My sympathies lie with both, but my inclination is definitely towards the empirical, quantitative, and generalisable. But that mustn’t be at the cost of losing sight of the human perspective. With the rise of learning analytics, and more and more quantification of learning – of which I’m a small part – it’s easy to be sucked in to watching the numbers and forgetting the people.

Two things have come to haunt me recently. Both are about predictive analytics: taking all the data we have about learners, including previous learners’ success or failure, and using it to predict the success (or otherwise) of learners who haven’t finished yet – or even signed up. The OU is doing a lot of work in this area at the moment. The ethical issues are complex and difficult. I think both the quantitative and the qualitative perspectives are needed to guide our policy development here.

InFocus: Learner analytics and Big data

Liveblog notes from “InFocus: Learner analytics and Big data”, a one-day conference organised by the Centre for Distance Education, University of London, held in Senate House.

Steven Warburton: Welcome and Introduction

Head of Department for Technology Enhanced Learning, University of Surrey 

Analytics appears three times in the Gartner hype cycle curve.  At the peak, content analytics, very closely related. Big data at the top too. Prescriptive analytics – what is the best course of action. Beyond descriptive, then predictive, to prescriptive. (ouch) Help people make the best decisions.

Predictive analytics story: from Target – How Target figured out a teen girl was pregnant before her father did. Guessed whether pregnant from purchasing patterns, sent targeted flyers to daughter, father angry.

OU Innovating Pedagogy report – learning analytics there. NMC horizon report – learning analytics again there on 2-3 year timetable.

What are the benefits? For learner, tutor, institution.

Erik Duval concern – programming out the human. We should be in control of those algorithms, thinking about human value.

Password change for the better

Nine months ago, I posted my rather involved process for mandatory work password change day.

Today, it was password change day again, and the whole thing was done in 20 minutes, including 10 minutes spent on a completely different task. That’s the third time in a row that the simpler procedure (turn everything off, change the password, bring things back up one at a time and enter new passwords as required) has worked without a hitch.

So I’m going to mark this as resolved, and give a public thankyou to whichever nameless person in our IT department made the change that fixed whatever issue made the complex workaround malarkey necessary. Heaven knows “nameless person in our IT department who made a change I don’t fully understand” gets a lot of stick, so I’d like to give them a thumbs-up for once.

While I’m thanking IT infrastructure people, I’d like to thank all those responsible for Eduroam, from the high-level policy people to the back-room technical people who make it work and troubleshoot it. It’s gone from ‘this is a cool idea, hope it works’ to ‘this just works and I can take it for granted a lot of the time’, which is great and makes my work life so much easier.

While I’m talking about passwords, I have finally and belatedly made the switch to using a password manager. It’s such a relief. I had a pretty good system before, which worked with how my memory works, but it was fraying around the edges, and it didn’t cope well when passwords had to be changed. (That meant I had to memorise an exception to the system, which was a lot of extra work, and tended to disrupt the whole thing.)


It was a bit of work to set up the password manager, but mainly because it turned out I had over 150 username/password combinations to enter. I’m impressed at how well my old system worked – I hadn’t realised quite how many unique passwords it let me remember – but then again, if it wasn’t so good, I’d have gone for a password manager much sooner, and would’ve been better off as a result.

It’s fantastic. It’s like having a new superpower.

Faced with a demand to create yet another unique, strong password for some new online service or other, I can click a couple of buttons, paste in &QtjhQWFIkgr/(! and be confident I’ll be able to remember it later. When a site has idiosyncratic requirements (e.g. must have non-alpha characters, but only a small subset of them that doesn’t include the ones my system requires, or must not exceed 12 characters, must not have more than half lower-case letters, etc), I can do that and I don’t have to memorise another exception to the system. When a site I don’t trust demands I set up password recovery questions, I used to worry about divulging my mother’s maiden name, and struggled to think of what answer I could give to questions like “Favourite sports team” that I’d be able to remember later. Now I can simply say that my mother was born Miss 4^mSKZFI9@PNoa8 and that I’m a lifelong supporter of those paragons of sporting prowess, G3loF!aQynSR?Z%.  When I get yet another “this site has been hacked and all passwords stolen, please change your password”, I can go “Ok” and not worry about it.
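For the curious, here is a rough sketch of what generating one of those passwords might look like, using Python's standard `secrets` module. This is not how PasswordSafe or any other password manager actually does it; the function name, the default symbol set, and the "at least one of each class" rule are all my own assumptions for illustration.

```python
import secrets
import string


def generate_password(length: int = 16, symbols: str = "!@#$%^&*()?/") -> str:
    """Return a random password drawn from letters, digits and `symbols`."""
    # Character classes that must each appear at least once (an assumed rule).
    required = [string.ascii_lowercase, string.ascii_uppercase, string.digits]
    if symbols:
        required.append(symbols)
    if length < len(required):
        raise ValueError("length is too short to fit every character class")

    alphabet = "".join(required)
    while True:
        # secrets.choice draws from the operating system's secure random source.
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        # Retry until every required class appears at least once.
        if all(any(ch in group for ch in candidate) for group in required):
            return candidate


if __name__ == "__main__":
    print(generate_password())
    # A fussy site: at most 12 characters, only "!" and "?" allowed as symbols.
    print(generate_password(12, symbols="!?"))
```

The idiosyncratic-requirements case then becomes a matter of changing two arguments rather than memorising another exception.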

For those who care about the details, I use PasswordSafe on PCs, Password Gorilla on Macs, and pwSafe on my iPhone, all synced via Dropbox. The runner up was KeePass, but 1Password and LastPass looked Ok too, although my paranoia doesn’t like security software where you can’t see the source.

I’m pretty sure that which password manager you choose matters a lot less than whether you use one, and I wholeheartedly encourage everyone to use one.
