LAK13: Wednesday morning (1)

Liveblog notes from LAK13 Learning Analytics Conference in Leuven, Belgium.

Welcome and Opening

Erik Duval, General chair for LAK13 and local host, welcomes everyone.

The Vice-Rector of Science, Engineering and Technology at KU Leuven, Karen Maex, welcomes people and tells people a bit about KU Leuven. Been around since 1425, have a wide and deep historical and cultural heritage. Interdisciplinarity of LAK is good. Changes in communications patterns in society. Open, social models of learning. The first core business of a university is educating students to take part in a society that has changed so much.

Dan Suthers, Program chair for LAK13, welcomes everyone. He and Katrien thank authors and reviewers. Thanks the other chairs too. Theme of conference – Learning Analytics as a ‘middle space’. Very diverse ideas; now we’re moving to identify a core – though not too tightly. So keen on learning, and analytics. Conception of learning can be broad: no single correct conception. Individuals as agents of learning, or something small groups do, or the process of becoming a member of a community (legitimate peripheral participation) where the community is the entity that’s changing. Different epistemologies of learning – moving information around, or intersubjective meaning-making, or the process of becoming part of a community. We want to stay open to these questions, but we need to be explicit and clear about our own conception of what learning is. Interesting diversity in contributions – can be new technologies, or exploring their utility for informing our understanding of learning or the educational enterprise. The ‘middle space’, and ‘productive multivocality’. We do have many voices – different theoretical and methodological traditions; researchers, teachers, administrators, policy-makers, funders. A challenge for discourse. Risk of two extremes – it breaks up into different discourses, or an ill-considered mish-mash of traditions. We need to find appropriate boundary objects – things that are meaningful in multiple traditions, where the shared referent enables productive discourse. Asking everyone to seek this in the conference.

Nice chart of submission types over time – from 30ish in 2011, 60ish in 2012, >100 in 2013. Slides from Xavier. Acceptance rate going down: 63%, 39%, 28% this year. New and returning authors – most are new but some returning.

Katrien Verbert talks a bit about Conference Navigator. Has links to the full papers. Can add talks to your schedule, see who else is attending. They’ll use the data to develop visualisations. Introduces first keynote speaker: Marsha Lovett, from Carnegie Mellon University.

Keynote: Marsha Lovett – Cognitively Informed Analytics to Improve Teaching and Learning

Gothic town hall of Leuven, Belgium
(cc) Eddy Van 3000 on Flickr. The conference reception is actually in this building.

These are exciting times. My research has focused on how students learn, and I have applied this to improve teaching and learning in real-world contexts, and to extract value from datasets. Many measures relate to performance and behaviour. Wants to argue that we shouldn’t lose sight of trying to understand how students are learning. It’s a latent variable; we have to infer it. Fundamental question: how do we tell how well students are learning?

In the past, used traditional techniques – quiz scores, homework, compared to previous cohorts, participation, discussions with them. But they are largely based on an intuitive judgement of how well they’re learning.

New way – why we’re here – we’re at the beginning of this field. Data, visualisation, analytics. How can these analytics give us better or fuller insight into students’ learning?

Typical result: students spend 100+ hours across the term, and yet show learning gains of only 3%. (In intro stats course at CMU but representative.) You might think it’s not like that in your situation – maybe it was a dry professor, or traditional methods. It was an award-winning lecture, forward-looking instructional techniques, interactive, 3x a week, collaborative lab session. Yet still learning gains only 3%. We can improve that!

Replicated the study. Baseline: >100h for 3% learning gain, for the traditional learning course. Effectively using technology – another study, adaptive, data-driven course: <50h for 18% learning gain. Same material, same measures. Will talk more about this in the discussion session later. One factor in this is learning analytics. Is that enough? In part. It depends how you carry them out. Powerful thing: predictive measures that can lead to action – which students are going to run into trouble, what kind of performance you can expect on the exam. Important and useful, but on top of predictive approaches it’s important to think about the model underlying those predictions. More like ‘prediction + understanding -> targeted action’.

Sometimes prediction is enough, but there’s value in understanding as well. For cognitive science and learning science, use large datasets to improve understanding of the underlying theories. Can be helpful not just to tell students/teachers what’s going wrong, but to give insights into the learning process – to help teachers go through professional development and understand student learning, and for students too. Metaphor: a prediction might tell you that tomorrow you’ll have a fever. Important to know. But even better if that prediction came along with a diagnosis of the reason the fever was coming, and a few sets of treatments that would address it. At CMU with professors, often find surface features can have multiple distinct diagnoses.

Key ingredients for learning analytics:

  1. Informed by cognitive theory
  2. Built on solid course design
  3. Meeting users’ needs

1. Informed by cognitive theory

Understand what they’ve learned. Science of learning – a lot of theoretical developments. Fundamental mechanisms underlie how people learn and solve problems [or models thereof]. Detailed cognitive models successfully capture learning and performance across a wide variety of task domains. Can use these in technologies today.

Power law of learning. Errors decrease over time – as students practice, performance improves with marginally decreasing returns. The robustness of this phenomenon makes it a powerful diagnostic … but real student data often isn’t quite that tidy.

Six problems in introductory stats – the error rate isn’t smooth: U-shaped, bumpy. One reason teachers miss this: through their expertise, they aggregate information into chunks that students might not yet have learned. The six problems that look all of a piece may be very different skills for students – first interpreting a histogram, then a table, then a boxplot – which look quite different. Really three distinct learning tasks. Split out that way, the power law of learning is still actually in place.
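A minimal sketch of the point (parameter values and the two-problems-per-skill sequencing are invented for illustration): per skill, errors decay smoothly as a power function of practice, but a problem sequence that mixes distinct skills looks bumpy when treated as one skill.

```python
# Power law of practice: error rate on a single skill decays as a power
# function of the number of practice opportunities. a and b are illustrative.

def error_rate(opportunity, a=0.6, b=0.5):
    """Expected error rate on the n-th practice opportunity (n >= 1)."""
    return a * opportunity ** (-b)

# Per skill, errors decline smoothly with practice...
one_skill = [error_rate(n) for n in (1, 2, 3)]
assert one_skill[0] > one_skill[1] > one_skill[2]

# ...but a sequence interleaving three distinct skills (histogram, table,
# boxplot), each practised twice, is not monotone when analysed as one skill:
# each new skill resets the opportunity count, so the error rate jumps back up.
interleaved = [error_rate(n) for skill in range(3) for n in (1, 2)]
assert interleaved[2] > interleaved[1]  # bump where the skill changes
```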

There are tools for doing this – like the DataShop tutorial yesterday at LAK13.

A more precise statement: as students practice a given skill, their performance at that skill improves; other skills are not affected. If you’re not paying attention to the skills students need to learn, you’re missing something. With focus on skills and learning, teachers can focus on the particular ones where students have difficulty. Students can monitor their own strengths and weaknesses and focus their practice where they need it most. Students may have weak metacognitive skills, so this info can be very useful to them.

2. Built on solid course design

From what we know from learning science and related fields, there are many aspects to making teaching effective.

Align with the skills they need to learn, give opportunities for repeated practice, provide targeted and timely feedback. Engaging, at the right level of challenge, tied to students’ goals and values.

You can focus on questions about how well the course was designed, what skills didn’t receive enough attention. Useful to course designers and administrators as well as students and teachers.

3. Meeting users’ needs

Consider how the analytics will be used, students’ constraints, and other learning affordances.

Two users – instructors and students.

Instructors typically only have access to averages or distributions of student scores – the implications are coarse and come too late, after the unit is completed. They need quick, up-to-date, actionable information from the analytics to drive action, to be more effective and efficient. They’re busy, so need a quick snapshot. They want to be able to find out more – access to details; they benefit from alerts to noteworthy patterns; some, especially new instructors, benefit from knowing where students have difficulty, plus pointers to resources to help them support and adapt their teaching. A recommendation system for instructors as well as students.

Students typically only pay attention to the grade. When they get their first grade, they have an emotional response and little incentive to remediate. By the time they get the high-stakes exam feedback, the class has moved on to the next unit – so they don’t know what to do to resolve their misunderstandings, and have no time to do it anyway. So students benefit from up-to-date actionable information. Might include: a quick snapshot of how they’re doing; access to details; alerts to patterns (e.g. cramming vs pacing their study); pointers to opportunities for adapting their learning. Have a mapping from difficulty areas to resources that can help in targeting those – e.g. spend 30 more minutes practising this, consider these one or two things.

Both need meaningful, actionable inferences from the data. Visualisations should be: quick to apprehend (for novices, but also to enable expertise to develop); flexible enough to enable deep drill-down; customisable to individual needs.

Learning Dashboard

Project bringing these three together: the Learning Dashboard. Main audiences – instructors, students, designers, administrators.

It’s informed by cognitive theory – statistical models embed skill-based power law learning. When students do much learning in online environments, we can collect clickstream data automatically from their interactions.

It’s built on solid course design – focus on alignment, practice opportunities, and feedback. Implemented in the Open Learning Initiative (OLI) – key approach used is to ensure the learning outcomes are mapped well to the instructional activities. That alignment is what will support students to get enough practice and feedback on what they need to learn.

It’s meeting users’ needs – what they want to know, estimates of students’ learning state, per skill; aggregated for at-a-glance view but with focal drill-down. Visualisations refined through user testing.

Offers deep insights into student learning. Most LA systems track, record, summarise, predict. But this also reveals what students did/didn’t learn; quantifies how well … and some other stuff.

As they do online activities, the data go to the Learning Dashboard. Real-time analyses are performed; interactive visualisations are produced.

A student works on an activity, drawing on knowledge and skills, some strong, some weak. We want to know their learning state. The interactions go to the Learning Dashboard, which makes inferences about their learning state, skill by skill, and these evolve as the student works in the environment. The estimate is shown back to the student, showing strengths and weaknesses. This helps students know where to go for more review or remediation on areas of difficulty. Also a view for instructors – see the whole class, drill down to individual students.

State-of-the-art model – a Bayesian hierarchical model to capture multiple components of variation. The main latent variable is students’ learning states – becomes more accurate as data accrue. Borrows strength across students, classes and populations. Sophisticated algorithms enable efficient computation.
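This is not Lovett’s actual model – as a stand-in for the general idea, here is a minimal Beta-Bernoulli sketch of a per-skill learning-state estimate that sharpens as data accrue, with the prior playing the “borrow strength” role (it could be pooled from other students, classes or cohorts rather than being uninformative):

```python
def update_skill_estimate(correct_attempts, prior=(2.0, 2.0)):
    """Return (mean, variance) of the estimated P(correct) for one skill.

    correct_attempts: sequence of 0/1 outcomes on that skill.
    prior: Beta(alpha, beta) pseudo-counts, e.g. pooled from past cohorts
           (this is the "borrowing strength" idea, hugely simplified).
    """
    alpha, beta = prior
    alpha += sum(correct_attempts)
    beta += len(correct_attempts) - sum(correct_attempts)
    mean = alpha / (alpha + beta)
    # Beta variance shrinks as attempts accrue: the estimate becomes more
    # accurate with more data, as described in the talk.
    var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))
    return mean, var

few = update_skill_estimate([1, 0])
many = update_skill_estimate([1, 0] * 10)
assert many[1] < few[1]  # more data, lower uncertainty, same 50% mean
```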

RESTful API. Can work with any instrumented learning environment. Fairly simple. Scalable to large numbers. A community of users can share inputs and outputs. For each activity, what is the skill or set of skills exercised by that activity? As different users work, they can borrow from each other’s work in understanding that.
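The talk didn’t show the API itself; as a hypothetical illustration only (field names and endpoint invented, not the actual CMU/OLI interface), an instrumented environment might report one interaction as a small JSON event like this:

```python
import json

# Hypothetical event payload reporting one student interaction to a
# dashboard-style REST API. Fields and endpoint are illustrative.
event = {
    "student_id": "s042",
    "activity_id": "stats-unit3-q5",
    "skills": ["interpret-histogram"],   # from the skill map
    "outcome": "correct",
    "timestamp": "2013-04-10T09:30:00Z",
}
body = json.dumps(event)
# e.g. POST body to https://dashboard.example.org/api/v1/events
assert json.loads(body)["skills"] == ["interpret-histogram"]
```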

Domain generality – not only technical interoperability but semantic interoperability. Data from a wide variety of topics. Skill map – a tool to connect instructional elements in the learning environment (anything that has a student response) to skills in the domain. That mapping works both ways – can analyse and aggregate skill by skill, and the skill map can be used for recommendations, e.g. remediation practice. Tools are available to facilitate that mapping.
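A sketch of that two-way mapping (activity and skill names invented): aggregate student responses skill by skill in one direction, and look up activities that exercise a weak skill for remediation in the other.

```python
from collections import defaultdict

# Toy skill map: activity -> skills it exercises (names are illustrative).
skill_map = {
    "q1": ["interpret-histogram"],
    "q2": ["interpret-histogram"],
    "q3": ["read-table"],
    "q4": ["read-boxplot"],
}

def aggregate_by_skill(responses):
    """responses: {activity_id: 1/0 correct}. Returns {skill: [correct, total]}."""
    totals = defaultdict(lambda: [0, 0])
    for activity, correct in responses.items():
        for skill in skill_map.get(activity, []):
            totals[skill][0] += correct
            totals[skill][1] += 1
    return dict(totals)

def recommend(skill):
    """Reverse direction: activities that give practice on a given skill."""
    return [a for a, skills in skill_map.items() if skill in skills]

stats = aggregate_by_skill({"q1": 1, "q2": 0, "q3": 1})
assert stats["interpret-histogram"] == [1, 2]
assert recommend("read-boxplot") == ["q4"]
```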

Video demo

Starts with a blank canvas, and you add views that you want. E.g. an English language arts class, with about 20 students. Assignment to finish online materials. Tomorrow you have class.

The tool is structured by Available Views, listed by the question they can help answer – e.g. On what skills are my students struggling? A block map of learning state on each skill, with a heatmap approach showing how each student is doing. Can look at individual skills and drill down – e.g. finding stasis in a text, identifying a claim, etc. Can mouseover to see individual student names, click in and see their state on that skill. There are grey cells too – red = low, green = high, but if a student hasn’t interacted much, there’s an estimate of learning state with very high uncertainty – communicated to users as the grey colour.
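The colour rule described can be sketched in a few lines (thresholds invented; the real dashboard’s rule wasn’t given):

```python
# Block-map cell colour: red = low estimated learning state, green = high,
# grey when the uncertainty on the estimate is too high to show a judgement.
def cell_colour(estimate, uncertainty, grey_above=0.2, red_below=0.5):
    if uncertainty > grey_above:
        return "grey"   # estimate exists but is too uncertain to display
    return "red" if estimate < red_below else "green"

assert cell_colour(0.9, 0.05) == "green"
assert cell_colour(0.3, 0.05) == "red"
assert cell_colour(0.3, 0.50) == "grey"  # too little interaction data
```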

Another view – What misconceptions have my students shown? What skills have my students practiced least? (Perhaps done poorly because of little practice.) Many different metrics; here they use the opportunities to interact with the material on a skill-by-skill basis. Again can drill down to show individual scores. Another is an effort-based measure – How hard are my students working?

The little displays can be moved around, added, removed, and the state saved for next time you visit the dashboard.

Student dashboard: visualisations for students too. Simplified – shows Top 5 skills I’ve mastered, Top 5 skills I need to practice, and Uh oh! (where there are issues). Can drill down again.

Take-home points

Focus on student learning – not just behaviour – is important. Incorporate fundamental cognitive mechanisms. User-centred design and course design important.

The Learning Dashboard contributes to significant improvement in learning.

Thanks to colleagues.


Christopher Brooks: The Achilles heel is how much of the domain you have to map early on – all the knowledge components. LA reduces this mapping by visualising what data you can find. Is this scalable across full institutions, or just specific disciplines or courses where ITSes are used?

Tension between using the big data, mining what we can, vs building a conceptual framework. Trying to find a sweet spot, a happy medium. 1: there are more automated or semi-automated ways of doing that mapping – John Stamper’s work generates a good skill map by doing the reverse process of looking for those power-law curves. The Learning Dashboard (LD) takes archival data; during the semester, use the ongoing student interaction to identify where the skill map is not so good. A bootstrap process – use data to get a good start and refine as you go. 2: as we move forward, there’ll be some cases – e.g. intro stats, maybe some gateway courses – where the domain focus is a worthwhile investment for many students, and discipline-based researchers and learning science folks can come together. If we have skill maps/sets in a format where they can interoperate, that’s helpful. Turn the assessment pressure into a more positive forcing function. They can feed into having a standard language. Trying to stay in the middle.

Ravi: Being provocative – is there a fundamental assessment-regime assumption? It’s more or less the Anglo-Saxon model of assessment. It’s dominant, but not in my institution – we don’t have that kind of assessment in the Nordic countries. Is it portable across regimes?

In all our work there are theoretical perspectives that frame it. I’m looking at this looking for learning gains, using methodologies where pre/post measures focus on the learning outcomes of a given course – that’s my focus and framework. Curious to see how some methods can apply. The idea that’s central – modelling the process – is really the key. The implementation and framing is in one assessment regime; the more translatable theme is to think about the key underlying process and how to build that into the modelling and data-mining – bring in research to give more structure. That underlying theme has more generalisability.

Someone from SFU: Do you think the models of the clickstream data are likely to become powerful enough to not need the assessment data?

That’s already coming to pass. In Massachusetts state testing, there’s a project called ASSISTments. The idea is that if you’re doing ongoing collection of the clickstream data in the context of real problems, and analyse it the way you would a high-stakes assessment, and the predictive validity is strong enough – why waste a day when they’re not learning? It’s a definite possibility. Would be nice to move that way.



This work by Doug Clow is copyright but licenced under a Creative Commons BY Licence.
No further permission needed to reuse or remix (with attribution), but it’s nice to be notified if you do use it.


