LAK14 Thu am (7): Keynote, MOOCs, at-risk students

Liveblog from the second full day LAK14 – Thursday morning session.

Stephanie welcomes everyone to the second day. Introduces Mike Sharkey, founder of Blue Canary, a sponsor. Former director of academic analytics at the University of Phoenix. At LAK12 he and his partner talked about founding a company, and now has Blue Canary.

Mike was talking to George at another LAK: the focus was on the research and theory side, but what about the application and implementation side? Was very focused on the implementation side, saw a gap in people that had the ability and resources to implement the exciting ideas. Started a software-focused company a year ago, going to institutions to help them help their students out. Happy to be a sponsor here, first major sponsorship from their company.

Here to introduce Nancy Law.

Downtown Indianapolis

Keynote: Nancy Law: Is Learning Analytics a Disruptive Innovation?

Thanks organisers for the invitation. Unsure whether she wants the keynote to be disruptive or not. So what’s the answer? The problem is, I don’t have an answer. If it is to be disruptive, not clear whether it’s good or not. Only have questions, not answers.

What’s disruptive? From Clayton Christensen’s book (Innovator’s Dilemma) – two kinds of innovation: sustaining and disruptive. Sustaining are iterative development. Disruptive are those that may not look like an improvement. They tend to serve other purposes. They may lead to some drastic overthrows of existing things. Good example: digital cameras. Asks who used a digital camera 10y ago; they weren’t so good. [I think she’s 10y off here – I have a 10 year-old digital camera in my bag right now, because it’s better than my smartphone camera. *20y* ago they were rubbish.] Who uses film-based cameras? None. Those of you who take photos, think of your last 200. How many were taken using a dedicated camera? Not many. Most use phone cameras. Who invented the first digital camera? Kodak, but they’re nowhere now. Should have had first mover advantage. Why are they not there now? They are real camera people, thought the digital camera was too poor.

Two senses of disruptive: disruptive of beliefs, routines, practices, relationships; innovations at the early stages are often clumsy and not as refined as established approaches. Disruptive innovations are potentially transformative.

Whole concept of taking images – why use phone? Because can share it right away. Now not just for ourselves, but to share. Change fundamentally how we do things.

Is LA a disruptive innovation? What’s it competing with? Who are the target clientele – learners, teachers, leaders, learning scientists? How have they been involved? What are the conditions for LA to be transformative?

Key challenge to disruptive innovations – these are foreign species. Several possibilities. Endangered species, or invasive species. What’s the likelihood for a foreign species to be invasive? Quite low. But they can be disastrous.

Skeptics’ views – technology cannot transform formal school education. Oversold & Underused, Larry Cuban. Rethinking Education in the Age of Technology, Allan Collins and Richard Halverson. How far have learning technologies taken root in US schools? Also looking ahead. Despite the number of gadgets we have around ourselves, and how schools are using them, they haven’t taken much root. Many children have smartphones, tablets, laptops. But usually, the teachers say no, you cannot use them in my class.

Innovations – “Adding wings to caterpillars does not create butterflies” – Stephanie Marshall (1996), you need transformation.

Systematic model of change: Initiate, innovate, implement, institutionalise, scale up by adoption of refined model. No longer works! Example of the Nuffield physics approach: a whole package, slowly implemented. Worked well then. Now things are changing too fast, especially in TEL. Can’t do any prototyping.

Simple analogy – ecological model. Educational systems are complex. Gardening analogy. She’s an urban gardener. The crop plants we depend on do not survive well with climate change and increased pollution. Find a new variety of the plant, is much more hardy in hotter temperatures, harsher conditions. Unfortunately, the flowers bloom very early, when most plants have not sprouted theirs. Blooming early, butterflies that used to pollinate that plant are not around that early. So even if we plant a whole plot with the new seeds, won’t have many fruits. Worse than the older variety. So can’t just look for what is the best solution, it’s only good (in context). Have to change the environment, change the other parts.

How can ICT-supported transformation become an education epidemic (David Hargreaves, 2003, Education Epidemic book)? Why so difficult to cause transformation? Since the 90s, so many education reform movements. Are they successful? Hong Kong started major curriculum reform in 2000. If you ask people what they think about the reform, you will get mixed feedback. Why haven’t they spread like an epidemic? Of course, we might not like an epidemic. But nobody can forget SARS in 2003. In fact SARS is a good case in point – where is it now? We have it under close surveillance. Because it was so invasive, we tried to keep it down. Normal flu is not as threatening, so is more successful.

Back to digital cameras. Disruptive innovation. Why has it been successful? As a gadget, it was no competition to the traditional cameras. Why successful? (Several audience answers.) My sense is, the first people who were real users were not people like me who just want to take a good photo, they’re people like journalists (?!) they don’t need high resolution images (?!) they just want it quick. Resolution doesn’t matter too much on print or TV. Now journalists have to work as a single person, being reporter, writing up, transmitting it back to the HQ. It’s actually serving a different purpose. Not just by itself. It was successful because there was a need for low resolution pictures that can be transmitted very quickly. If the invention had not come at a time when the Internet was around, it wouldn’t have prospered. Access to the Internet becoming popular, social media – without that, it wouldn’t prosper.

Who wants to bring cameras with you all the time? But we have the phone. There is no need for a camera. Now when there’s a sign, no photography, camera image with a cross. I think how can you prevent people taking photos? I may be just talking on the phone. The success of the digital camera is in the fact that it basically disappears.

Appropriation by different technologies/appliances. Integrated/morphed into other devices. In empowering, connecting and democratizing. I hate the complicated cameras, large, you have to know the stop number. But now even young children can take good photos. It’s digital so you can easily send, it’s connecting people, and democratising. We can do our own reporting.

The personal computer. Was it a disruptive innovation? We don’t think of our cellphone as a computer, but it’s more powerful than computers we had a few years ago. It has been successful because it’s empowering, connecting, democratising.

Is LA sustaining or disruptive? Will it achieve transformative, sustainable impacts at scale? Look at TEL for ideas.

International comparative project: IEA (Intl Assoc for the Evaluation of Educational Achievement), Second Information Technology in Education Study: Module 2 (SITES: M2), 2000–2001. Studying innovative pedagogical practices using technology. Trying to think about how we assess students’ ability to use technology, but we don’t have the assessment technology to be confident about what we’re trying to measure. Asked about innovations: have they been sustained, spread? In all the cases from Asia, sustainability and scalability were much lower. European cases were much more sustainable and scalable. Cross-country comparison HK vs Finland. Connectedness was important.

Another study, Pedagogy and ICT Use. Overall condition, not much change. Even when using the technology, often for traditional means. Book, Educational Innovations Beyond Technology, Nancy Law et al. Looked for principles for sustainable innovation. Five principles:

  • Diversity
  • Connectivity
  • Interdependence
  • Self-organisation – mechanisms
  • Emergence

Much more grassroots innovation.

Another example, ICT-enabled innovation for learning in Europe and Asia. Problem: what are the conditions for ICT-enabled innovation for learning to be sustained and scaled? Three European cases, four Asian. eTwinning case – teacher network in Europe, encourage higher awareness about other cultures and languages. Not pedagogically motivated, but a secure platform to connect every school, classroom in Europe.

The more successful cases have multiple pathways to innovate and scale, but a clear vision. Ecological diversity fosters scalability. Some dimensions are interdependent and mutually constraining, and have to be accommodated. Evolutionary, and scale is dynamic. Leadership for strategic alignment. Multilevel, system-wide, strategic partnership.

Will LA achieve transformative impact at scale, sustainably?

Design-based research working with teachers. Case study, teacher in secondary school. CSCL. Using analytics, simple measures.

Diana Laurillard project on learning design. Shared learning design by 5 teachers. What constitutes evidence of learning outcomes?

Learning analytics, the way we’re doing it, is connected to our own learning design. They’re not portable at all. [Eh? I think some are.] Using the camera example, make them more appropriable, integrated, empowering.

LA should be more closely aligned, to become a part of the ecology, to make LT an invasive species, make it really spread.

[I’m not sure this is the sort of effect we want learning analytics to have on educational systems.]

MOOCs, cMOOCs and xMOOCs. Take a complex system model of learning.

Change at level of brain, individual, institution, group/community, education, society level. Change is learning. Pace slower (ms to hours to centuries), ways to study different (neuroscience to organisational learning to anthropology [?]).

Questions

Charles/Giles?: Push to teach teachers to code, programming – does that provide some connectivity?

Do you mean that will help them to understand analytics more.

C/G: So they’ll be able to teach their students to code. Could there be a virtuous cycle? The British Government wants to train 16,000 teachers; they’ve brought basic programming into their curriculum as a requirement. code.org has 35k teachers to push into learning to code.

I’m not sure it’ll be empowering.

Hendrik: Elaborate why we need quality indicators for analytics, to make it more comparable, for sustaining innovation? Quality indicators, on the different levels. What kind of indicators do you have in mind, from your studies?

The question I have is not so much which specific LA indicators are the most important. But the teachers – K12 teachers in Hong Kong – most of them are responsible. But they tend to think about activity design. It’s not learning design, which is different. When you ask them, what do you want the students to learn? They don’t have a very specific cognitive or metacognitive outcome. When you ask, what do you want them to achieve? That’s something new already. So then if you say, I want them to communicate, collaborate. What counts as indicators for learning outcomes for those? That needs to be developed. The challenge for us in LA as a community, for it to make impact – the panel on sustainability raised this – there’s a lack of language, of conceptual understanding about what counts as assessment, outcomes, indicators. We need conversations about that. Are there ways where LA can have a few, more simple analytics that people can talk around? SNA is more a tool for researchers – is it meaningful for teachers?

5B. MOOCs (Grand 3)

Visualizing patterns of student engagement and performance in MOOCs. Carleton Coffrin, Linda Corrin, Paula de Barba, Gregor Kennedy (Full Paper)

Linda talking, from Centre for Study of HE at Melbourne. Carleton is the computer scientist who did the visualisations. Paula is a PhD student studying ed psych and did all the stats. Gregor is the Director.

MOOC research – to develop a greater understanding of students’ online learning experiences, processes, and outcomes. Context: lots of big reports from initial adopters of MOOCs – Edinburgh, Stanford. Looking at what we can do with LA and MOOCs, and what the scale of the data can tell us.

Very exploratory research. Having a look to see – didn’t have defined RQs. Looking for patterns and trends. But two main purposes: 1. More refined LA techniques for MOOC data. 2. Visualisations that’re meaningful to end users.

2012, U Melbourne signed up with Coursera. In 2013, deliver 7 MOOCs. This looks at the first two – for convenience because data available. But also significant.

First one – Discrete Optimisation. CS. Solving problems in more efficient ways. Other – Principles of Macroeconomics, introductory style.

Why these two? Differ in structure. Prerequisite knowledge – can’t mandate – but Macro had few, while Disc Opt required a strong comp sci background to participate fully. 8w/9w long. Curriculum design different too. Macro is very linear. Disc Opt released all materials and assessment at the beginning. Up to the student to decide order and pace. Only recorded marks at the end. Macro – 8 quizzes, 3 summative, peer review essay. Disc Opt – prelim assignment, 5 primary, 1 extra-credit. Disc Opt also had unlimited attempts at the assessment – some, being CS, wrote code to change and resubmit their assessment. One student had 8,000 tries! That’s surely worth marks in itself. Also a leaderboard.

Completion rates – 4.2%, 3.5%, from 50k ish started down to 1,000 ish completed. Macro is xMOOC, Disc Opt almost a cMOOC [I don’t think so!]

Data analysis approach, used by Kennedy and Judd (2004) – audit trail data, start high, use those observations to inform iterative analysis, refining. Look at clustering into subpopulations.

RQs: How can we use analytics to classify student types? [And another one]

Participation fell off over time, performance is a steeply-falling-off graph – most 0, small numbers got lots. More students view videos than complete assessments. Constant decline of participation. Of students attempting assignment, most get very few marks. Despite difference in curriculum structures, they looked similar for both courses – for both participation by time, and for distribution of marks.

Enrollment/completion stuff – about 4%. Lots of people didn’t turn up in the first place, or didn’t participate, or didn’t do the assessment. So picked active participants – not just watching videos but submitting assignments. Ups completion rate to 18% for Macro, 13% for Disc Opt.

Student performance diagram, bar chart, size of bar makes a big difference. Changed to a cumulative distribution instead. Saw a marked decline after two weeks. Hypothesis: marks in the first 2 weeks are a good predictor of final grade. Linear regression – Disc Opt R² = 52.7%, p < 0.001; Macro R² = 20.6%. But Macro only has its first summative assessment in week 3. If you include the 3rd week for Macro, R² goes up to 51.5%.
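[A minimal sketch of what a regression like that looks like in practice – the file and column names here are my inventions, not theirs:]

```python
# Sketch: predict final grade from marks in the first two weeks.
# Hypothetical file and column names; their actual data is a Coursera export.
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.read_csv("disc_opt_grades.csv")      # hypothetical export, one row per student
X = df[["week1_marks", "week2_marks"]]       # early assessment marks
y = df["final_grade"]

model = LinearRegression().fit(X, y)
r_squared = model.score(X, y)                # they report R^2 = 52.7% for Disc Opt
print(f"R^2 = {r_squared:.3f}")
```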

Students most likely to succeed were those who performed well in the first two assessments – let’s look at > 60th percentile. Called that the qualified group. The assumption is that if you did that well, you probably already knew some of this stuff. Add in qualified group completions, now 42.1% and 27.4%. They’re the group getting through better than the rest, vs auditing, or active but no prerequisite.

Subgroups – auditors, active, qualified. Look at participation by subgroups. Relative proportion wise, qualified group is consistent within and between the MOOCs; auditors came and went. The active students also declined significantly.

So? Student activity and success in the first couple of weeks are significantly associated with outcomes. Prior knowledge. Value of informal diagnostic tests at the start. Could be used to support adaptive work.

Do they take advantage of the open model in Disc Opt course?

State Transition Diagrams. Entry point, circle area is number of students in a state, line thickness is number of transitions.

Student video views diagrams – really cool. [Note to self: see if you can draw these easily in R, they’re awesome.] Qualified students do a lot more moving around than the non-qualified ones. Diagrams have thicker lines, and more lines, when you compare qualified vs non-qualified ones. And the Disc Opt qualified group are the thickest, most dynamic.
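[Per that note to self, a rough sketch of how you’d tally the underlying counts for a state transition diagram (in Python rather than R) – node size from visits, edge width from transition counts; the event format is my assumption:]

```python
# Sketch: tally state transitions from an ordered clickstream, the raw
# ingredients for a state transition diagram (circle area = visits to a
# state, line thickness = number of transitions between states).
from collections import Counter
from itertools import pairwise

# (student_id, item_viewed) events, already sorted by timestamp per student
events = [
    ("s1", "video1"), ("s1", "video2"), ("s1", "video1"),
    ("s2", "video1"), ("s2", "video3"),
]

visits = Counter(item for _, item in events)
by_student = {}
for student, item in events:
    by_student.setdefault(student, []).append(item)

transitions = Counter()
for trail in by_student.values():
    transitions.update(pairwise(trail))     # consecutive (from, to) pairs

print(visits)        # circle areas
print(transitions)   # line thicknesses
```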

Student assessment – for Macro, non-qualified students skip the formative ones, but the qualified groups don’t. In Disc Opt, lots more going back among the qualified groups.

Repeated viewing could indicate it needs more clarification, support. Or that the sequencing is wrong.

Future plans: Test subgroup classifications across a range of MOOCs. More detailed analysis of transitions students make from active to auditors. Link state transition diagrams with performance data – patterns that indicate success. And look at motivation.

Questions

Stephanie: Completion – received certificate? People who failed – took every test but not over the threshold to earn the certificate? Maybe more in the macro course? Less failure for those active in assessment.

Can’t answer the second, don’t know the difference. In terms of overall completion, only count students who passed, who were eligible for cert.

Q: So what question. At NorthWestern Univ, evaluating MOOCs. Are we trying to improve MOOCs? Tease out learning for other contexts?

Trying to improve MOOCs. When Disc Opt ran again, used this data to make tweaks, talk to students about the structure. Are trying to improve. But identifying likely successes early gives instructors a chance to make changes while the course is still live. Target students with interventions.

Q: MOOCs important part of the future?

Yes, at Melbourne. But only small part of our online learning, so hope to inform other initiatives.

Bodong: Enjoyed the visualisation. Troubled by classification, especially qualified learners. Maybe the dropouts have even more advanced knowledge?

At a basic level, only data from MOOCs. Stanford research looked at bringing in data from surveys, will look at in the future. Need more research in validity, relation to motivation. This is basic at the moment. Need to really test the quality.

Small to Big Before Massive: Scaling up Participatory Learning Analytics. Daniel Hickey, Tara Kelly, Xinyi Shen (Short Paper)

Daniel from Indiana, Center for Research on Learning and Technology.

Was a full paper but accepted as short. Grant from Google to support massive participation. How scaled up to a BOOC – Big Open Online Course. Cut out how to automate the features to teach again in the summer. Tara doc student, Xinyi intern.

2013 year of the anti-xMOOC. Disaster in Georgia. Said, that’s not going to happen here. Aim to have them mention BOOC. Now, DOCC, Distributed Open Collaborative Course, cMOOCs, mentors, crowdsourcing. Project based courses are very intensive. People who have all the expertise aren’t going to do it. Not going to work when there’s a large body of disciplinary knowledge, a textbook we want people to know.

This is fostering large engagement when there’s a large body. We don’t want them to reconstruct validity, we want them to enact.

Using participatory assessment design research. Five general principles – Let contexts give meaning to knowledge tools, reward disciplinary engagement, and others.

Thirteen features. 3,000 lines of code on top of 5,000 in Google CourseBuilder.

Situated theorist. Psychometrician. Really about contextualising disciplinary knowledge. When you register, have to say your discipline and role. 460 registrants, put them into 17 networking groups. Also required to define a personal learning context, a curricular aim. Telegraphs that learning here is different, contextualised. Scared off some who weren’t serious about taking the course. Also part of the first assignment. Each week, assignment, first thing is restate their aim.

Graph of number of words changed in context and aim across units – but jump at one point when they come across concepts that matter to them.

Emergent groups, made it possible to – people added to their name to project their identity into this space.

Main feature – public course artifact, ‘Wikifolio’. Everything is public. All discussion not in forums but in wiki. Refined version of Sakai wiki. It gets a lot of writing – about 1500 words per week, dense, contextualised disciplinary stuff. People enrolled for credit are writing most.

Another feature – rank relative relevance, of the concepts you want them to learn. Evaluating instruction, more relevant to use classroom assessments, or evidence from accountability tests. Most states are going to use high-stakes tests to evaluate teachers in the Fall. Good way of engaging. One weak student example, still got it.

Feature 6 – access personalised content. External resources, rank them for relevance. Now you know a bit, why not find some external resources and share them with classmates.

All these features refined in an intimate context, and then automated to run with a big group.

Promote disciplinary engagement – Randi Engle’s idea. So feature 7 – peer commenting and instruction. Ask to post a question each week, and comment. Got about 3 comments each, with reasonable number of words each. 1/3 students had no interaction (2 universities in Saudi Arabia). 2/3 engaging. About 25% didn’t comment each week. Hand-coding of comments and how related. But all interaction was quite disciplinary.

I hate peer assessment. It never works. Peer endorsement – click here if it’s complete, that works. Peer promotion – can promote one and only one as exemplary. Nobody responded to one student’s question, but two people promoted him. Can see who promoted someone, and why.

160 completed the first assignment, 460 to 160 to 60. Proportion of people who promoted each week, was quite high. See exactly the opposite of MOOC pattern of abandoning the forums.

Feature 10 – public individualised feedback. Early posters tend to be the best, give lots of comments on e.g. Wednesday, ahead of Sunday deadline.

Feature 11 – aggregate how people did. Much wisdom and knowledge there – people in each group that chose options as most relevant.

Assessment important – Appropriate accountability (12), 3h complete, randomly sampled.

13 – Digital badges. These were awesome, have a blog post about it. Badge for each of the 3 sections, so long as it was endorsed as complete, and you took the exam. And one for being a good ‘assessment expert’. Badge contains a claim, and evidence. You can click through to see the work they did. Sharing badges.

Conclusions – don’t go straight to massive, scale up incrementally. Design-based research. Learner defines context of use. Contextual and consequential knowledge. Embrace situative approaches to assessment.

Success, activity and drop-outs in MOOCs. An explorative study on the UNED COMA courses. Jose Luis Santos, Joris Klerkx, Erik Duval, David Gago, Luis Rodriguez (Short Paper)

José Luis talking.

We call success completion. Success in MOOCs is very relative – there are people who don’t follow a course through because they don’t want certification. So take it as a relative concept.

Collaboration between KU Leuven and UNED COMA. Exploratory study. We are mainly a group that works on learning dashboards. Studying what are the human-computer interactions in this context. MOOCs are cool, so think about building a dashboard, and dig in to the data to see what we can build. So what we found in our case studies.

Platform is UNED’s MOOC platform, Cursos Online Masivos y Abiertos. OpenMooc. unedcoma.es. Two courses – German and English. Courses with high registration. Similar – language learning. Some differences. German for beginners. English was for advanced students – professional English. So need background. German 6w, English 12w. Both videos (YouTube) and questionnaires. German had peer review, but English assessed only with questionnaires. German 34k start, 3% complete (1000), English 23k, 6% complete (1500)

Don’t expect an easy path to reach your goal. Messy data. Wanted to see if they were collecting enough information to perform studies. There are only 3 RQs:

  1. How does activity evolve?
  2. Are all learning activities relevant? Inc forum.
  3. Does use of target language in forum influence outcomes?

RQ1  How does it evolve?

Looked at number of accesses to questionnaires, and activities. German has a monotonic decline in questionnaires, English more jumpy. (Possibly because of structure of course?) Already dropped students who didn’t start. By week 4, 75% gone. Also saw that the first activity of every unit gets more accesses than the last activity of the previous unit.

Activity – participation over time. Also, if focus on the people who reach the end of the MOOC, what’s the completion? 40% pass. Like some 1st year bachelor courses.

RQ2 Are all activities relevant?

120/105 activities. Binned in 10s, for each bin, % who passed course. [Trouble with slides – reinforces my belief that you should never have animations in your slides] If you miss just 10 activities, drop 25% in completion (I think.)
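[A quick sketch of the binning they describe – the file and column names are guesses on my part:]

```python
# Sketch: bin students by number of activities completed (bins of 10)
# and compute the pass rate per bin; assumes "passed" is coded 0/1.
import pandas as pd

df = pd.read_csv("uned_students.csv")                  # hypothetical export
df["bin"] = (df["activities_completed"] // 10) * 10    # 0-9 -> 0, 10-19 -> 10, ...
pass_rate = df.groupby("bin")["passed"].mean() * 100
print(pass_rate)                                       # % passing per bin
```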

Compare with activity in the fora. Posting activity – peak of pass rate at 30 posts, but drops if you posted lots more than that (on German course).

What were the most active/useful threads about?

RQ3 Use of target language

UClassify – uclassify.com – Probability of every student using the target language TL. Then compare pass or not using TL. Doesn’t look like a big difference on German – but very few using German, it was a very beginner’s course. In English, 10% with higher probability used English >90%. Inflects around 60% probability – more than that, more successes; below, less success. Quite low English use. Only 13% passed the course. But no real clear finding here.
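[A sketch of the thresholding they seem to be doing – the data is made up, and I’m not calling the uClassify API here, just assuming you already have a target-language probability per student:]

```python
# Sketch: compare pass rates above and below a 60% probability of using
# the target language. Probabilities would come from a classifier such
# as uClassify; here they're invented for illustration.
import pandas as pd

df = pd.DataFrame({
    "p_target_language": [0.95, 0.40, 0.72, 0.10, 0.65],
    "passed":            [1,    0,    1,    0,    0],
})
high = df[df["p_target_language"] >= 0.6]["passed"].mean()
low  = df[df["p_target_language"] <  0.6]["passed"].mean()
print(f"pass rate: >=60% TL use {high:.0%}, <60% TL use {low:.0%}")
```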

It was an exploratory study. It was a success, largely, but room to improve.

Recommendations: pay attention to the first activities. Increase the usefulness of the fora. Keep an eye on the use of the target language.

Job pitch at the end! He’s finishing his PhD, interested in new opportunities. Got a round of applause from the audience.

Questions

Daniel: Discussion forums and MOOCs. Evidence says they suck. No, it’s not working! Anyone found them helpful?

John Whitmer: Yes, we found good evidence. Forum was kinda icing on the cake, but looked at levels of engagement. The people who do tests are in the forums. Don’t know causal link. But more active on forum is correlated with tests.

Jose: They were publishing the solutions of the questions in English. In German, used forum for sharing resources for learning German. They are useful because it’s necessary, cannot push back. Teaching in another online university, sometimes my students complain I don’t know how to engage them in the conversation. Can’t blame the system.

Q: Do you think forums in a MOOC should be redesigned? I hear, the charts of participation drop off. A student posts, another post post post, they are (overwhelmed). What if the first 20 people who get there block that forum off, they come back to that chunk. Then attrition factor. If you could create smallness in the strategy.

Yes, I agree. We need to redesign them, make easier to avoid information overload. Lots of threads, messages, have to spend a lot of time to read.

Ulrich: Missed one detail. Forum participation related to success. What was exactly success? Present yourself somewhere, or not present yourself for an exam? I read in your paper, the f2f exam and online exam. Not clear what not passing the exam means in the interpretation of your contingencies there, between e.g. forum participation and the ultimate success.

Completion means only passing the online exam.

U: Explicitly, or based on your activities?

Awarding with a badge, everything is automatic.

U: You don’t have to apply for it?

No. Afterward, can apply for the official certification, you have to go there f2f, additional step if you want to pay.

U: Your criterion was online?

Yes.

Daniel: I ran out of time. Anyone working on EdX, interested in digital badges, I want to talk to you. Question for audience, does anyone have a forum where participation increases over time? We see over and over it gets so messy, brilliant study from MIT tech review, they just explode and become intractable, almost inevitably.

John someone: We’ve used a participation portfolio instead of forum. Rubric of a quality post, ask students to do Q&A, they submit a self-graded portfolio of their best work. They do hunting and finding of quality, not the instructor. I’ll tweet out the URL. The participation has shot through the roof. Regular courses in the LMS, online course.

John Whitmer: No activity in a MOOC increases over time, they all decrease. The percentage, enrolment, the interesting question is relative to prior activity for an individual student. Need to relativise to that. Look at what leads to higher levels of engagement. Entropy! In MOOCs without that incentive structure.

Daniel: I haven’t seen it. My data is the only case I know of where activity level goes up for people participating in an assignment.

John W: It could be a rounding error on other enrolments, it’s small compared to MOOC. I’ve not seen it.

Chris Brooks: Beginning, heavily used for socialisation – hey I’m from Germany, etc. We have to think about what we mean by engagement – blathering, or making deep critical comments. That’s important. Inside MOOCs, forum is one piece. Some Michigan MOOCs, students self organise on Facebook.

Daniel: Stephanie in her video said, this is where it gets ugly.

Q2: Twitter conversation, we know a lot about supporting good online discussions, take Chris’ point, what are we expecting. Many different models. Heartened to see your work, Dan, where they choose affiliations and you use that to group. A few thousand people in a meaningful conversation, that doesn’t make sense. It’s about giving purpose for discussion, finding the right thing to help them discuss. Yeah, we can say we haven’t seen it.

Daniel: I do have a forum in my class, but for asking practical questions.

Windows at The Westin Indianapolis Hotel

6B. Learning Analytics for “at risk” students (Grand 3)

Engagement vs Performance: Using Electronic Portfolios to Predict First Semester Engineering Student Retention. Everaldo Aguiar, Nitesh Chawla, Jay Brockman, George Alex Ambrose, Victoria Goodrich (Full Paper, Best paper award candidate)

Everaldo talking.  Univ Notre Dame Coll of Eng.

Motivation – lots of study on student retention, attrition. Nearly half of students who drop out do so in the first year. Early identification of at-risk students is crucial but not simple. What data is needed? Is low academic performance equal to at risk, or not?

Eng class – 450-500 incoming students, 18-22, 75% male. First year of studies program initially, intended major at start, declared at end of 1st year. Standard curriculum, including Intro to Engineering.

STEM Interest and Proficiency Framework – x/y chart, interest y axis, performance x axis. Interest displayed in terms of their ?study choices. Four possible quadrants. 1. High interest, low proficiency, need academic support. 2. High interest, high proficiency, optimal. 3. Low interest, low proficiency, need most attention. 4. High proficiency, low interest. I fell into that place, and was given support to develop excitement about being an engineer. Need to develop that engagement. Aim to move them all to quadrant 2.

Overlay stayers vs dropouts – high proficiency tend to stay, interest has little effect. Low/low has most of the dropouts; still some stayers in high interest, low proficiency. Can do predictions on this – but should we focus on the ones where we can’t predict? Interested in low proficiency but high interest; some don’t drop out. So the interest axis is important. Also high proficiency, low interest is relevant.

Can we build predictive models that take into account both of these aspects? [I guess yes. Adding features is not hard.]

They have electronic portfolios. Creative space, record, collect artifacts to show what they have done. Assignments developed around the tool. Highly customisable. Assignments designed to entice students to reflect. Used this as the data source for interest.

Previous, related work. Several models predicting student retention, and some comprehensive studies understanding dropout.

Data features used – three disjoint sets of features. Academic data (e.g. SAT, grades on courses), demographic data (incl dormitory!), engagement data (eportfolio logins, submissions, hits received).

Analyse each feature individually, to see which best describes the outcome. Several different approaches. (ePortfolio hits was important). Semester 1 GPA was not correlated with outcome for most approaches except one. (?!) Grades for intro course were more important. SAT scores not very important – SAT verbal scores were negatively correlated (!! – laughter in the room). ePortfolio hits was very important to all of the models, #1 for all but one, #2 on that one. (Pleasingly, ethnicity is a blank too.)

Created subsets of the data features. All academic, top academic, all engagement, top academic+engagement. Tested lots of classifiers.

Dataset very imbalanced, so measuring accuracy alone is very deceiving. 88.5% were retained anyway, so if you just predict all retained, get 88.5% accuracy. So looked instead at Acc- prediction accuracy for dropout students. And Acc+ is for retained students.

10-fold Cross Validation structure. 10 stratified samples, ran the same experiment 10 times, average to get the final result. Academic data features very low accuracy – no better than 25%. But engagement better. Very good on Acc- is the all-engagement feature set – 77%/83%. And academic plus engagement the very best – NB was 87.5% (Naive Bayes). Small tradeoff – incorrectly labelled some who didn’t drop out. That’s probably less important than missing people who did drop out.
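[A sketch of that evaluation setup as I understand it – stratified 10-fold CV, a Naive Bayes classifier, and per-class accuracy – with a file and feature names I’ve made up:]

```python
# Sketch: stratified 10-fold cross-validation with Naive Bayes, reporting
# per-class accuracy (recall) for dropouts (Acc-) and retained students
# (Acc+), since overall accuracy is misleading when ~88.5% are retained.
import pandas as pd
from sklearn.model_selection import StratifiedKFold
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import recall_score

df = pd.read_csv("first_year_engineering.csv")    # hypothetical file
X = df[["intro_course_grade", "eportfolio_hits", "eportfolio_logins"]]
y = df["dropped_out"]                             # 1 = dropout, 0 = retained

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
acc_minus, acc_plus = [], []
for train_idx, test_idx in skf.split(X, y):
    clf = GaussianNB().fit(X.iloc[train_idx], y.iloc[train_idx])
    pred = clf.predict(X.iloc[test_idx])
    acc_minus.append(recall_score(y.iloc[test_idx], pred, pos_label=1))
    acc_plus.append(recall_score(y.iloc[test_idx], pred, pos_label=0))

print(f"Acc- (dropouts): {sum(acc_minus) / len(acc_minus):.1%}")
print(f"Acc+ (retained): {sum(acc_plus) / len(acc_plus):.1%}")
```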

Evaluating these results – confusion matrix true/false positive/negative.

ROC curves – best curves are the engagement features and top academic+engagement, by a long way.

Results: had 48 students eventually dropped out. Picked 47 to train model, got the other one to test. (And round and round, presumably.) On only performance, academic data, only spotted 11. When used engagement features, saw 42 students.

Lessons learned: Portfolios were a good source of data, engagement data was good. In future, look at how early we can ID students who are disengaging. And what can we learn from analysing the content of ePortfolios.

Questions

Q: Four quadrants, high interest but not prepared. Interesting to identify them. Curious what thoughts you have about a mechanism to prepare them. I spent a lot of time trying to nurture interest, metacognitive stuff. But the content stuff is like bang for your buck. Yeah, huge potential in that corner. What ideas are worth pursuing?

Great question. Sometimes take for granted, easy to find students struggling academically, help them move up. But if disengaged, it’s much more difficult to assess that than to assess them academically. Working on peer mentors. Often times that feels more approachable than professors. Additionally, inviting students to attend engineering events – career fairs, graduate events.

Alex (coauthor): One step further, STEM ambassador communities. Using best examples in ePortfolios, to help inspire and reach out. At midterms, when ID good engagement, send advisors to the disengaging groups.

Q2: In ROC curve, engagement only worked as well as engagement+academic, which doesn’t work?

Engagement+academic was a tiny bit better. But without academic, still predict retention very accurately. Most of the students performed homogeneously well. So looking at just their performance, hard to discern dropout vs retain. Engagement features much larger role. Academic performance played a minor role.

Q2: Ok, it’s real. The four quadrants, that was illustrative data, not real?

Yeah. We don’t have that many dropouts.

Q3: What you ended with, text-based analysis – interested in anyone else doing portfolio analysis.

Very new work, more people doing analytics on portfolios. Students doing text analysis, looking for trends among successful students in what they put in there. Find how successful student uses the portfolio. Perhaps train other students to follow those trends.

Q4: Tried with non-engineering students?

No. IRB barriers. Curious to see how this works with other populations.

Perceptions and Use of an Early Warning System During a Higher Education Transition Program. Stephen Aguilar, Steven Lonn, Stephanie Teasley (Short paper)

Stephen Aguilar starts. Outline: Brief history of Student Explorer, design-based research. Methods and setting. Results. Implications.

Steve Lonn takes over. Design-based research, work with academic advisers, iteratively design it based on their needs. This is the third paper about Student Explorer at LAK, can see the trajectory. In 2011, on v.0. At-risk program in the college of Engineering. It was in Excel, manually acquired data. Performance data. A few rules. Using LMS logins as the rule breakers. All manual. Gave data that wasn’t previously available, but labour intensive and too infrequent to make changes.

2012, v.1. Work on redesign, to automate the processes so it’s not grad students doing it. Used Business Objects, automatic, weekly report. Good to automate, but cumbersome and unstable platform.

2013, v.2. New partner, another at-risk program, including athletes, a different college. Another redesign. More custom, built on .NET. But now daily processing.

The rules – red/amber/green – engage, explore, encourage – action for academic adviser. Formative data, not a predictive model, more actual performance. They’re already at risk. First rule is how are they doing in performance – if >85% they’re green. If in the middle, find distance from the course average – if close, back to green; if more middling, bring in percentile rank of course site logins (the 25th percentile is their boundary).
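[A very rough sketch of a traffic-light rule of this kind – the exact thresholds and the branches they didn’t spell out are my guesses, not the actual Student Explorer rules:]

```python
# Sketch: traffic-light alert status from performance plus login percentile.
# Thresholds other than >85% and the 25th percentile are assumptions.
def alert_status(pct_score, course_avg, login_percentile):
    """Return 'green' (encourage), 'amber' (explore) or 'red' (engage)."""
    if pct_score > 85:
        return "green"
    if pct_score < 60:                        # assumed "doing badly" cut-off
        return "red"
    # middling performance: compare with the course average
    if abs(pct_score - course_avg) <= 5:      # assumed "close" margin
        return "green"
    # still middling: fall back on engagement with the course site
    return "amber" if login_percentile >= 25 else "red"

print(alert_status(72, 80, 10))   # -> "red"
```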

When you select a student, get main summary, across all classes that’ve had one graded assignment (pulled from LMS gradebook). Current week, show alert status – red/orange/green. Detailed view for each course, can see class performance over time, and how student performs over time. And then qualitative data – if instructor is giving feedback, can see that too.

Stephen Aguilar takes over again.

Low achieving for engineering, but not very low achieving. But Summer Bridge is for non-trad students, help for transition to college. Community-building living environment. Hang out most of the day. Math, English and soc sci, mainly study skills. Individualised academic counselling. 200 students. 9 academic advisers, N=9. All with higher degrees. This is their main job.

Research interest was around intended vs actual use.

Data sources – user logs, appointment calendar data, pre/post test.

1h training session to introduce the tool (Student Explorer). Also to collect feedback – important point about option to hide student status, so can share them.

‘Please rank courses by how often you think you’ll raise issues related to them’ – stayed constant, math most important.

Usage patterns – consistent. First week, no data to report. Similarly for last week – still meeting. Also blank at lunch for meetings. Student Explorer activity happens in between. Activity spike around the math midterm (was ranked high), because in their experience that was most important predictor for future. Use pattern – overwhelmingly during the meeting, a bit before, almost none after. It was designed for meeting prep, but the advisers used it during the meeting with the student. Unintended!

Design intentions don’t always equal use. Some courses are more important – so maybe equal screen real-estate is not right design choice.

Affordances of design (Don Norman). Confusion matrix – perceptible/visible and affords the action, that’s the quadrant you want. Talk about data literacy, which puts the onus on the user. But design literacy puts the onus on those of us doing the work.

Design literacy, to understand the affordances, ability to author representations of data. All LA is an act of communication.

Example – paper with grade, simple. Quant test a bit more. Signals/student explore are a bit more complex, need more understanding. As they get more complex, have to think about the design. In theory, have data about motivation, self-regulation, data literacy, engagement.

Future work – student perceptions, how they make sense of it, and how they react to comparative data. Also nimble and clear representation – give new use case, how can we make it better for use during meetings with students.

Question

Q: Your users identified that they were going to use that tool during meetings, they asked for that textbox. It was clear to advisers on seeing the tool that one of the most useful, best uses was concurrent in their interactions with students.

That was accidental. Clear to one adviser in a room of nine, but it spread. Clear … once they’d thought about it.

Q2: Excited about visualisations to learn to use visualisations. Integral part of STEM enquiry. Push hard on using these representations. If students see representations of data about them, enormous opp to increase their literacy in using visualisations and understanding.

In the future, 3-5y, access to their data, but also to tools to make visualisations. We don’t have all the good ideas, they might come up with something useful to them.

Q3: If instructors use the gradebook like they should – you said that. Feedback to students, could give feedback to instructors about how it’s being used. I’d like to see that as well. Advisers now in a key broker role.

Thanks.

Q4: Surprised by how little student access there was outside of advising. (Accessing their own.) Student access?

No, was only adviser access. Students didn’t have access. Only saw their data within a meeting with their advisor. Have been careful, we don’t know how students will interpret the data. Some have trouble with very basic literacy. Especially in early warning systems for at-risk students.

John Whitmer: Any data about effectiveness of tool on changing student performance.

Thinking about that for the future. 7w program, student performance hinges on the midterm. Round 2, make changes, show student their own habits, use that as a lever.

[tech problems for next speaker]

JW: Meetings, or not?

Not done much on prediction. It’s just the data we have now. Mostly on purpose, it’s not the direction we’re going in.

Andrew Krumm: Different theories drawing on. E.g. reference group, reinforcing stereotype?

Interested in, motivation framework, self-determination theory. If student already on a mastery track, that’s adaptive, what does that mean if we give them comparative. But could be maladaptive.

AK: Translate from psych to design? Efficient or hard?

Fun problem space. Definitely a struggle to take that and turn it in to a design.

Q5: Last person talking, interesting question about making these at-risk visuals available to students.

This is so new, not a lot of showing stuff to students and then get the outcome. That’s a rich space for research.

Modest analytics: using the index method to identify students at risk of failure. Tim Rogers, Cassandra Colvin, Belinda Chiera (Short Paper)

Tim speaking. Reference to Dillenbourg’s work on modest computing.

Couple of angles to risk – attrition, and performance. UNISA, busy developing grand algorithms to predict attrition and performance. Point to help them not drop out, and do well. Have a guy called Gollum, his Precious is data, very eccentric. It’s important work. But very complex and takes a long time. Taken a year, comes from biology in using mashups of decision trees to pull together data not done before. Thinks can do same with student data.

But! One context is day to day job in aiding academics improving their teaching. They want to predict how students will go in their courses. One project looking at interviewing students who did better or worse than expected – compared to what? Need something to analyse that and give you the data. Need to pull out students we expect not to do so well. Shouldn’t lose sight of trying to assist students. Another angle: in the course, it has a particular context. Large algorithms are fine if applied broadly, more skeptical when applied to student performance at aggregate levels. Dragan looked at Moodle data, just using regression, picked out variables that did indicate students at risk and predict it. When split it down to course level, different variables at work. Question is, how refined do we have to be? What level to aggregate or disaggregate?

Scott Armstrong, forecasting. Unique perspective, serious mathematical chops, very much about the results – can you show the technique works in concrete circumstances. Fast and frugal heuristics. Anyone aware of that? (No.) Use rule of thumb vs regression. Often show a simple, straightforward approach can do just as well as a more sophisticated one, given constraints. Example, first past the post, where one main variable predicts enough. Don’t need more to make an accurate prediction. Also students who’ve been there for a time, their GPA is quite predictive.

Don’t generally work with students who’ve been here for a while. Mostly first year courses. Many don’t come with any performance data we have – alternative pathways. So need methods to tap into that population. Index method. Childishly simple! The forecasting community have resurrected it. Tabulate variables that are relevant to the question. E.g. show a table of students (each row), columns – have they logged in on time, 1/0; parents did not complete high school, 0/1; failure in previous course, 1/0. Series of variables, simply tabulate and add up.
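[It really is childishly simple – a sketch, with indicator names following their example and a flagging threshold I’ve made up:]

```python
# Sketch of the index method as described: score each student by summing
# binary risk indicators, no coefficients. The recoding and the risk
# threshold are my assumptions.
import pandas as pd

students = pd.DataFrame({
    "logged_in_on_time":       [1, 0, 1],
    "parents_finished_school": [0, 1, 1],
    "failed_previous_course":  [1, 0, 0],
})

# Recode so that 1 always means "risk factor present", then add up.
risk = pd.DataFrame({
    "late_login":       1 - students["logged_in_on_time"],
    "first_generation": 1 - students["parents_finished_school"],
    "previous_failure": students["failed_previous_course"],
})
students["risk_index"] = risk.sum(axis=1)
students["flagged"] = students["risk_index"] >= 2   # assumed threshold
print(students[["risk_index", "flagged"]])
```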

Interesting history. Index more accurate than various regressions. Resurrected every decade. Used by credible forecasters with strong stats backgrounds. It’s either competitive with more sophisticated approaches – or does better.

Index vs regression – pulled data out of system. Couldn’t get engagement out of the LMS. Compared to regression, simple elimination, drop lowest t value. From 2011, 4 courses, 2nd half of 1st year. Two models: index method, regression on total. And regression model. Tested it on novel data for 2012 students.

index R was -0.59; regression -0.70. So not as good.

Incredibly simple technique. No coefficients, so you lose a lot of stuff. Can pile on as many predictors as the literature will allow, including expert opinion; can drop, add variables. Sample size not an issue. Can be owned by educators, it’s simple.

Implications for orchestrating adoption. My major point. It’s not so much about the index method or other fast and frugal heuristics. An academic is given a black box, but they don’t know what’s going into it. The kind of conversations you can have if you’re working with data are important. An academic who is handed a list of at-risk students from an institutional risk algorithm can’t interact with it if they don’t think it’s representative. Risks alienating people.

Shane Dawson and I had a conversation with academics recently. An academic said, 2nd year students are failing because they’re lazy. That’s a strong assumption, but didn’t take it further. Need to have more data-based conversations with the academics. Have to pull them into the data. However sophisticated it gets, have to leave room for them to participate in the questions they want asked. Ok, your students are lazy, why do you believe that? Data to illuminate that?

This work by Doug Clow is copyright but licenced under a Creative Commons BY Licence.
No further permission needed to reuse or remix (with attribution), but it’s nice to be notified if you do use it.
