Liveblog notes from Wednesday afternoon at LAK16 [full list of liveblogging]: Session 2B LA Challenges, accessibility and ethics.
Leah Macfadyen chairing.
2B1 Privacy and Analytics – it’s a DELICATE Issue. A Checklist for Trusted Learning Analytics.
LACE project background on the slides, hooray!
Hendrik starts. Work on ethics and privacy, because this keeps arising. We think we have a clue now, we have something. Get the checklist online: bit.ly/DELICATE to download.
LACE project, successful EU-funded project, become an associate member, LA community for Europe. DELICATE done under this umbrella.
This doesn’t come from just some initial ideas, but from six workshops on EP4LA, very heterogeneous groups. Worked closely with SURF in the NL, first workshop in 2013, up to a pre-workshop here at LAK on Monday.
Wolfgang says this may contain some disturbing images [laughter].
What are the obstacles for implementing LA? Part of early workshops, with managers. HEIs are struggling with LA. Managers get different information streams – promise vs concerns. This leads to confusion. Some stark examples of negative imagery. One of these is the InBloom closure in 2014, where, due to public and parental pressure, they closed down a well-intended venture. Similarly in the NL, a private organisation, Snappet, gave out tablets but collected data and fed it back to the schools – this caused an outrage, authorities stepped in, and the Dutch Data Protection Agency regulated it. It’s now limited in scope.
Ignoring the fears and public perception can lead to a lack of acceptance, protests, and even failure.
Hendrik again. Coming up: fines of up to 4% of global turnover if you breach the new data protection rules. A guest editorial on Ethics and Privacy in Learning Analytics. Slade and Prinsloo have produced much. JISC work, Open University UK policies. In NL, a SURF report in light of HEI law on data collection.
Wolfgang. Slide of someone prepped for ECT! Considerable confusion between ethics and privacy. Looked at this in more depth. Research ethics’ origins lie post-WW2, in the Nuremberg Code. The medical field was the driver: human subjects. Helsinki Declaration, Belmont Report. Late 90s, data about humans became an issue. In the 2000s the biomedical sciences led the debate, the human genome field. Also in the 2000s, data processing about humans increasingly plays a role in shaping research ethics. An indicator – the Responsible Research and Innovation policy for EU projects.
Bad examples, infamous examples. The electric shock machine and the Milgram experiment. [He wasn’t joking about images, it turns out – shot here of the Milgram experiment, and the Stanford Prison Experiment.] More recently, the Facebook study where they manipulated users’ news feeds without their consent and then published the results. It’s a fluid concept – debates on whether that was a good thing, whether the results are still usable.
Hendrik. Privacy. Starts with definition as the right to be left alone; informational self-determination; informational, decisional, local privacy. It’s not anonymity or data security. They’re not the same. Privacy has changed over time. Gynaecological examination picture from C19th [!!]. Contextual integrity.
Who are the bad guys? Is it me and my data? Government? Commercial sector? Hackers and bad guys? [I’m not sure this is a helpful conceptualisation] Schools and education should do things differently. Important role for the LA community.
Legal frameworks: EU Data Protection Directive 95/46/EC, and new General Data Protection Regulation GDPR coming soon will strengthen this. Also EU Data Retention Directive.
Modernisation of EU Universities, EU report Oct 2014 – Rec 14 – legal frameworks should allow collection of learning data, only for educational purposes; Rec 15 – online platforms should inform users about privacy/data protection, and users should always have the choice to anonymise their data. If they opt out, we still need to provide their studies.
In the US, 200 companies signed a Student Privacy Pledge, voluntary.
And the EU framework about the Right To Be Forgotten, in the new GDPR.
Wolfgang. Fears around data collection and processing. Many fears transferred from other sectors. Power, exploitation, ownership, anonymity, privacy, transparency.
Power-relationship important, especially biometric. The relationship is asymmetrical. State actors – people don’t have a say. Exploitation: free labour and crowdsourcing not for public benefit. Data ownership: what is a user able to do? Google offers Takeout, to download all your data. [I was the only one who’d used this apart from Wolfgang.] You can’t import this data into another tool. [But for me you can at least work on the data yourself, which makes a difference to techies at least.] openPDS/SA is an interesting project. Anonymity and data security. If you don’t fit 100%, you’re still fitted into the boxes.
Transparency and trust issues – smaller is easier than large companies. Transparency also used as an instrument of control.
Hendrik. [Nicer picture, butterfly.] This is hard stuff. We need to communicate what is in our heads. We came up with the DELICATE checklist. Help support informed communication. It’s not replacing deep thoughts; a checklist for quick data conversations. Inspired by medical checklists for patient information.
Determination – Why?
Explain – communicate, what you’re doing, who has access. Very important.
Legitimate – why are you allowed? Example of taking social media streams – we’re not allowed to do that! Just because it’s public doesn’t mean you’re allowed to take it and process it.
Involve – all stakeholders and subjects. Open, responsible – reply to them.
Consent – make a contract. OU UK good example, Edinburgh too. Explain how they can get data out.
Technical aspects… Wolfgang now.
Anonymise – or pseudonymise; aggregate data; consider who can see individuals.
Technical – keep it secure, not hacked or changed, best efforts at least if not 100% security guarantee.
External – external providers, must make sure they adhere to the same principles and regulations.
Just one more thing … [laughter]
Call for papers for Special Issue on LA.
Q: Doing work with consumer advocacy. Two rights as a consumer, not heard you mention, right to safety, no harm being done. Also right to redress if there is any damage. Where do those standard consumer rights fit? From Australia.
Wolfgang: Included it in mediation procedures if something goes wrong, it’s in the paper rather than the brief description. Have an Ombudsperson to address potential conflicts, discrimination claims and whatever.
Q: The right to be forgotten, is that in there?
Hendrik: No, we’ll update that. One year ago it didn’t look like it would be there. To add to the existing legislation, we will update that.
Q: The recommendation 14, from the EC. These two sentences contradict each other.
Hendrik: Yes, that’s very European.
Q: Any practical implications of people on campus, any resources [inaudible]
Wolfgang: Take policy from Open University as a model, use DELICATE checklist to modify it for your needs.
Hendrik: Wanted to make sense for the floor, data scientists and others. Hands up who wants to use DELICATE? [quite a lot] That’s great, mission accomplished!
2B2 Using predictive indicators of student success at scale – implementation successes, issues and lessons from the Open University
Head of Analytics. For 3y have led institutional strategy to increase analytics to drive student success. Slides on Slideshare.
Average age of new students is 29. 174k students. No prior quals. Over 20k with a declared disability. 900k assessments last year, 400k phone calls, 176k emails.
Predictive indicators – focus on lessons learned, less on the technical details. Those have been reported at this conference quite widely. Two approaches, and have taken a tactical view to let them continue and bloom, piloting them before thinking strategically to bring them together.
First, by the planning office, module probabilities, integrated in to our student support intervention tool. Predicted probability of a student completing and passing the module. A module takes an academic year, typically. Provide these in ranges, 10% bands. Can use those to target interventions. We have a structured set of interventions. Use these probabilities at key points, especially the start. Target potentially most vulnerable one.
Second, OU Analyse. Predicts the submission of next assignment weekly. Deployed through OU Analyse Dashboard. Our tutors are finding it incredibly useful between those assignment points. These come in weekly. We operate at distance, our tutors have very little 1-2-1 contact, very little group contact. This is a useful trigger to think about who to intervene with.
Lessons from implementing at scale. My job to help that, coordinate the groups developing those with faculty staff, student support staff, in terms of getting them used.
First challenge: Create the right story for the end users.
It starts with measures and stats about the predictions. Frequently-generated predictions, and how they can intervene at the right time. Accuracy, volume, and timeliness all important. Wait, am I being recorded? At a senior level, we have a range of experience and comfort with statistics and measures. That’s common across all levels.
Weekly tracking, the model gets more accurate through the presentation, as we get more trace and assessment data. Put that graph in front of maybe 2/3 of decision makers, they don’t understand it.
So bring it down to the group. [Excellent use of a graph showing recall and precision/type 1 and type 2 errors of individuals to show percentages. This is the way to do it!] By week 5, two-thirds of predictions are true, but we’re only finding half of the non-submitters. By week 14, the predictions identify two-thirds of non-submitters. Later we get to 85% precision. If they’re flagged up, pick up the phone.
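To make that precision/recall framing concrete, here’s a minimal sketch with invented data (not the OU’s actual predictions), where “positive” means a student flagged as a likely non-submitter:

```python
def precision_recall(predicted, actual):
    """predicted/actual: lists of bools; True = non-submitter."""
    tp = sum(p and a for p, a in zip(predicted, actual))      # correctly flagged
    fp = sum(p and not a for p, a in zip(predicted, actual))  # flagged but did submit
    fn = sum(a and not p for p, a in zip(predicted, actual))  # missed non-submitters
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Illustrative week-5-style numbers: 2 of 3 flags are correct,
# but one real non-submitter is missed.
predicted = [True, True, True, False, False, False]
actual    = [True, True, False, True, False, False]
print(precision_recall(predicted, actual))
```

Precision answers “if they’re flagged, should I pick up the phone?”; recall answers “how many at-risk students are we missing?” – the two numbers the tutors actually care about.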
Volume, through user engagement, what they’re interested in are changes week to week. If predicted to submit, but changes to non-submit, that’s the trigger. How many are we getting? A lot of noise? Or manageable? What percentage of predictions have changed? The changes are quite low. That’s roughly once a fortnight you’ll get a change. That’s manageable, actionable.
Timeliness. This is the big red line – the measure here is quite crude. We’ve measured the gap between when the changes in predictions are made, and the last engagement date we have for students who didn’t complete. Are we getting into a window of opportunity? 40% of withdrawers are thinking of it for 1 month before they go, another 20% for two months. The intervention strategy is crucial. Of the first changes – the first time a student is predicted not to submit – look at when that appears relative to their last engagement: we get a wide range. The mean is 14 days ahead, but a broad spectrum (sd 65 days). Some Deans might not cope with that analysis. So a viz – 70% of these alerts are ahead, 32% after, 31% in the window 30d before to 14d after.
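A sketch of that timeliness measure, with invented gap data (not the OU’s figures): each gap is the number of days between the first “won’t submit” alert and the student’s last engagement, positive meaning the alert arrived ahead of it.

```python
from statistics import mean, stdev

# Invented gaps in days (positive = alert ahead of last engagement).
gaps = [40, 21, 14, 7, 3, -2, -10, 60, -40, 90]

ahead = sum(g > 0 for g in gaps) / len(gaps)
# "Window of opportunity": alert from 30 days before the last
# engagement to 14 days after it, i.e. gap between -14 and +30 here.
in_window = sum(-14 <= g <= 30 for g in gaps) / len(gaps)

print(f"mean gap {mean(gaps):.1f}d, sd {stdev(gaps):.1f}d, "
      f"{ahead:.0%} ahead, {in_window:.0%} in window")
```

The same shift from a mean-and-sd summary to “what fraction of alerts land in the actionable window” is what made the visualisation work for decision-makers.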
So big lesson learned is creating that story that applies to the end user.
Second lesson: Start small – find and nurture your champions.
A process over last 3y from 2 modules, interacting heavily with them to refine algorithms, to piloting on a dozen modules, dashboards, tutor feedback. Then 12 modules, 250 tutors, 8,000 students. Next year possibly 20 modules, up to 2000 tutors and 40,000 students.
We found champions early on – Chris Kubiak from Health and Social Care.
Lesson three: Don’t underestimate the guidance required.
4: Create super-users and case studies.
We recruited some early users to brief, train, get feedback from other users. They’re being briefed not by administrators from the centre like me, but other people like them who’ve used the system and found it useful in their practice.
We have evaluations. It’s useful, but we don’t have the quant evidence that it’s definitely impacting retention, but yet to get to end of this academic year with the 250 tutors. But what’s really powerful are case studies and vignettes. Talking to decision-makers, this voice of the tutor is more powerful than big evaluation studies and big quantitative impact.
[I’m sure he’s right, certainly not just at the OU, but I find this terribly depressing sometimes.]
5: Foreground the “should we” discussion.
We did that early. Ethical principles, aligned to our mission and ideals, got that out to students.
Final lesson: Repeat, repeat, repeat.
I’ve been doing this for three years. In a meeting last week, I was having the exact same discussion I had two years ago. Repeating is necessary. Managers change, and the issues keep coming up too.
Refs: Calvert 2014, Open Learning.
Q: Danger of self-fulfilling prophecy?
Yes. That is a major concern of a lot of our staff. Our ethics policy process. General feedback from students is they like the idea – existing students can see this helps the university target help. Students who do withdraw want to be noticed; students who are struggling want to be noticed. Many quotes from students who withdrew, who said nobody contacted me, there was no follow-up. There is a danger of self-fulfilling prophecy. So we are careful about which elements we share with tutors. We can predict the score, but we don’t share that with the tutors. We do share a traffic light around being on the borderline for passing. The power of OU Analyse is it’s not just drop-out, it’s achievement. We haven’t got to discussions about raising achievement to raise retention; the primary discussion is around retention and completion.
Q: Puzzled – if you’re only looking at student performance, it only works in a stable, repeating teaching environment. If you have any changes, that influences the prediction factors. Does it look at the changeability of the environment – say a different textbook, or bringing some daily news into the class, something that’s unpredictable?
I’m not the best person to ask. Zdenek is sat here. But yes, we take into account changes year to year. Talk to Zdenek about how the algorithms take that into account. As predictions are acted on more, how does that affect the accuracy of our predictions – we’d hope for more false positives, because the predictions are acted on and, because we intervene, the students do submit. Big challenges about capturing interventions so we can see if they help – a big challenge in our context.
Q: Do you share the algorithm, have you had gaming? With the public, students?
No, we share information about what we’re doing, but not the algorithm or the prediction. We think the prediction, the intervention should be mediated by the tutor who knows the student. Even if we’re just aiming to do standard interventions, e.g. haven’t submitted an assignment, there are central interventions, but our tutors have said, I know they haven’t and I know the reason, there was a bereavement, don’t send them that email now because it won’t help. The prediction and conversation should be mediated through the associate lecturer.
Q: This sounds very familiar. Did you think about having financial incentives? Or other incentives? For the tutors to use it? Super users, but also the users.
We have a somewhat complex environment. Our ALs are all part time, work under a specific contract. We have difficulties talking about paying them extra or not. To an extent they are paid a fixed fee, element of de-incentivising them to intervene, the fewer students they less work. The vast majority want to get all their students to the end. Using incentives in some way would be a difficult discussion to have. Even monitoring on a tutor group level is a difficult conversation to have. Interesting in a HE context. In a school context, every teacher monitored on that basis. So incentives would be really difficult as a conversation.
2B3 What Can Analytics Contribute to Accessibility in e-Learning Systems and to Disabled Students’ Learning?
Annika talking. Kevin has done a nice job of setting the context for the OU.
Many countries legislate about anticipating and meeting needs of disabled students. We have a large number of students who are disabled. We need to support them better. There’s web standards, but they are not the whole story and don’t address all the problems. So we need to understand why they’re struggling and how to help.
Context is the OU. Larger than average number of disabled students. Greater challenges responding at a distance. Students can declare a disability, but we don’t know what type, and no two disabled students are the same anyway.
Previous work on predicting learning outcome from demographics and VLE data, but disability not yet given a lot of attention.
RQ: Can LA help ID modules with accessibility deficits? And how does this compare to post-course surveys?
Initial analysis: average completion for non-disabled students 75.3%, disabled 69.5%. The OU has improved accessibility, and only 3 modules post-2012 had a markedly lower completion rate for students with disabilities.
The picture is more complicated. There’s a relatively low percentage on any module – it could be only two disabled students on a module, so if one of them drops out, that’s a big issue. To avoid this, remove modules with <25 disabled students, leaving N=668 modules.
Refined the approach using odds ratios – outcome is succeeding on the module, comparing disabled with non-disabled students. Bigger odds ratio, bigger disparity. What threshold should we use? Picked >3.0 odds ratio as threshold, have a few there.
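A minimal sketch of that odds-ratio flagging, with hypothetical pass/fail counts (not the paper’s data), combining the <25-disabled-students filter with the >3.0 threshold:

```python
def success_odds_ratio(pass_nd, fail_nd, pass_d, fail_d):
    """Odds of succeeding for non-disabled vs disabled students.
    Values above 1 mean disabled students fare worse on this module."""
    return (pass_nd / fail_nd) / (pass_d / fail_d)

def flag_module(pass_nd, fail_nd, pass_d, fail_d,
                min_disabled=25, threshold=3.0):
    """Skip modules with fewer than 25 disabled students;
    flag those whose odds ratio exceeds 3.0."""
    if pass_d + fail_d < min_disabled:
        return False
    return success_odds_ratio(pass_nd, fail_nd, pass_d, fail_d) > threshold

# Hypothetical module: 300/400 non-disabled pass, 10/30 disabled pass.
print(flag_module(300, 100, 10, 20))
```

Here the odds of success are 3:1 for non-disabled students but 1:2 for disabled students, an odds ratio of 6, so the module would be flagged for accessibility investigation.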
Validate against post-course surveys. Picked 6 modules above the thresholds.
Ranking table – very little agreement. You can’t rely on one source of data.
Future work – investigating ‘critical learning paths’, identifying accessibility issues.
Analytics are an alternative. Key is to combine. MOOCs might be important too.
Q: Can you clarify attracting more students to MOOCs who need less accommodations?
It’s that there’s a high number of students who declare a disability on MOOCs. So there’s the possibility of identifying on MOOCs why students are struggling.
Leah: Initial causality in the study. Assumption is that if we can design course just right, to make it accessible, students with disabilities will pass. Surely it’s more complex?
You’re making it a more equitable environment so they’re in the same position as other students, who will also struggle for various reasons, but you’re trying to find where they have an unfair experience in terms of, say, struggling with a mode of presentation.
Leah: Any investigation asking whether students felt accessibility was an issue?
Yes, the student survey covers this.
Q: How do you define disability? In my experience, results depend on how you define it – taking into account physical disability, others, or whether you use any proof or just self-reporting. The results are quite different in our case.
I would have to refer back. At the moment it’s just a disability flag; the categorisation is quite broad. Rebecca knows.
Rebecca Ferguson: There are standard categories from the Government, which we can use. We don’t require students to declare a disability. Sometimes during a course students refer to a disability they haven’t flagged up. Some they might think aren’t a problem, or aren’t relevant, or they don’t want to disclose. With smaller courses, some sorts of disability aren’t significant. But on our bigger courses, social sciences 101 would attract about 5,000 students, so if 30% have declared disability, some groups will be quite large. We use same categories when we send surveys to our MOOC learners. So 10k learners, 13% register as disabled. Some won’t impact at all, e.g. mobility. But some of them, if we break them down, when you have 10,000 even small percentages are significant. Maybe 4 students are completely blind. 4 times in a year for 3 years, that’s quite a lot, you need to think about how to deal with that. Graph descriptions, assessment assuming they can see pictures, that they saw the people in the videos, or assessment that copes with their problems and any other students’ problems.
This work by Doug Clow is copyright but licenced under a Creative Commons BY Licence.
No further permission needed to reuse or remix (with attribution), but it’s nice to be notified if you do use it.