Multimodal Learning Analytics
Held in the CERAS Fab Lab at Stanford.
Paulo Blikstein and Marcelo Worsley
Laser cutting: Alicja Zenczykowska
Also Julie (someone).
Want you to have first-hand experience of multimodal learning, to reflect on the analytics we can use to analyse it. First part will be a learning activity, learn new tools, build some things, then reflect on what you’ve done. Handout sheets with programme.
Four groups, each do a different thing, then a more academic part.
Four groups – laser cutting – acrylic trains, robotics with instructor, robotics without instructor (x2). Try to monitor what you feel you might be learning, the difficulties you have understanding things, if you’re getting better. Presentation at the end on example projects on biosensing, eye-tracking, video/audio, logfile data.
Robot groups – build a mini Rube Goldberg machine – inputs – touch, temperature, light, IR sensors, and outputs of LEDs, motors. Each group designs one input and one output, but it has to activate input of the next group – e.g. LED that lights up and that works a sensor on the next one. Equipment – robotics stuff, and boxes of cheap craft materials.
Big fight in education. Math wars, constructivists vs instructivists. Main divide is more student-centred vs more instructionist approaches. Content delivery vs student exploration. Phonics vs whatever. Open-ended/creative vs more scripted. The balance is not very fair – easier to measure instructionist things than open-ended, student-centred. Measuring math with MCQs is easier than month-long ethnography on how kids build things. Multimodal data lets you measure the open-ended things too.
The divide of the C21st is kids doing stuff in front of computers, watching videos, MCQs, limited efficacy. Affluent schools, they’re doing offline, projects, creative things like in this lab. Etextiles, robotics. Don’t want just those kids doing that stuff. One way to recalibrate that is to have better assessments of this stuff. Learning is social, mediated, multimodal, influenced by emotion, physiology and cog devt. A lot of personalised learning, there’s no collaboration. Collaboration hard to capture with a clickstream. Social learning is multimodal by nature, so hard to capture.
Robotic kit is GoGo Board, like Arduino. Bit easier to use, don’t need wires or breadboards. http://www.gogoboard.org
Huge lot of fun with the three robotics teams. My team built a system with a popping balloon: balloon (tied on a thick string) supported by fan with Bernoulli effect. Press button, fan turns off, balloon floats down in to bucket, then motor turns on with popsicle stick with pin that came round to stick the balloon. That was the trigger for the second team, who detected the pop (which was very loud) via a sound sensor. That triggered an LED lighting up, which was detected by a light sensor. Then a pause for 2s, and then that activated a motor with a popsicle stick that knocked a soft ball off a mount, it ran down a ramp made of chopsticks to cover over the light sensor of the third team. Their device detected the ball over the light sensor and lit up an LED in the hand of a model astronaut standing astride the light sensor, and also activated a further fan that blew bubbles – the final output.
Then a discussion about what may or should have been learned, how to recognise, what you could capture, and how to analyze it. All quite tricky and nuanced – is problem-solving/building stuff/diagnostics/troubleshooting a general skill, or is it all about contextual nitty-gritty stuff about the syntax of the particular programming language.
Back together. Paulo explaining his lab’s work and data collection strategies.
Interested in programming, building physical artifacts, building virtual artifacts (simulation, something like that), explaining mechanisms, collaborating.
- text collection and text mining,
- snapshot (e.g. programming, not just clickstream but screenshots),
- gesture tracking (using Kinect sensor, Wiimote – used to be very expensive, now cheap),
- object tracking (add visual tags or contrasting colour to make objects visible easily in machine ways from cameras – blobby tags which also give you orientation),
- eye tracking (IR cameras at bottom of computer screen),
- location sensing (cameras, try to find faces; cellphones – in-house GPS, many wireless routers and capture wifi signal on phone, can get location and record; RFID logger, put close to a machine, kids have RFID tag that they tag in with when they go to the machine, and can track kids paths, which machine used most),
- e.s.m. (experience sampling methods – ask are you motivated, happy, etc – on paper, in app)
- biosensing (lots of galvanic skin response – low cost, can move while you’re wearing it)
- eeg (brain waves via helmet with electrodes)
- fMRI (!).
- … and more.
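A minimal sketch of the RFID logger idea from the list above: given tag-in events, count which machine is used most and rebuild each kid's path. The log format, student IDs and machine names are made up for illustration, not the lab's actual data.

```python
from collections import Counter, defaultdict

# Hypothetical tag-in log: (timestamp_seconds, student_id, machine)
log = [
    (10, "s1", "laser_cutter"),
    (25, "s2", "3d_printer"),
    (40, "s1", "3d_printer"),
    (55, "s3", "laser_cutter"),
    (70, "s2", "laser_cutter"),
]

def machine_usage(events):
    """Count tag-ins per machine."""
    return Counter(machine for _, _, machine in events)

def student_paths(events):
    """Return each student's sequence of machines in time order."""
    paths = defaultdict(list)
    for _, student, machine in sorted(events):
        paths[student].append(machine)
    return dict(paths)

usage = machine_usage(log)
paths = student_paths(log)
print(usage.most_common(1))  # [('laser_cutter', 3)]
print(paths["s1"])           # ['laser_cutter', '3d_printer']
```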
In 2.5 h can’t show it all, but this is an overview. Judy is biosensing specialist.
Compared text/video then enquiry vs enquiry then text/video. Did brain science – bits of plastic brain with sensor, connect in different ways. Another group with just videos.
Schneider, Blikstein & Pea (2012) – when do exploration first, then watch video, mid test and post test are much higher. If first read, then explore, you think you know what’s going on so you don’t explore so much. Misconceptions not challenged. Explore first you get ideas about how things work or not. So adding these hands-on explorations is not just because fun, but are measurable improvements even on most traditional ways of measuring.
Video example of making things. Give people simple building task, use Kinect to see what happening with their body. Study how much 2-handed vs 1-handed action. Quite surface level, amount of 2-handed action, people with more of that tended to be more experienced, know more about what they’re doing. Very easily capturing the gestures students make. There does seem to be important information in student gestures. This was quite surface, but lots of opportunities in exploring that as an add-on to other data.
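A rough sketch of how a surface measure like "amount of 2-handed action" might be computed from tracked hand positions. The input format (one (x, y, z) position per hand per frame, in metres) and the movement threshold are assumptions for illustration, not the study's actual pipeline.

```python
import math

def moving(prev, curr, threshold):
    """True if a hand moved more than `threshold` metres between frames."""
    return math.dist(prev, curr) > threshold

def two_handed_fraction(left, right, threshold=0.02):
    """Fraction of active frame-steps in which both hands move.

    `left` and `right` are equal-length lists of (x, y, z) positions,
    one per frame. Frame-steps where neither hand moves are ignored.
    """
    both = active = 0
    for i in range(1, len(left)):
        l_moved = moving(left[i - 1], left[i], threshold)
        r_moved = moving(right[i - 1], right[i], threshold)
        if l_moved or r_moved:
            active += 1
            if l_moved and r_moved:
                both += 1
    return both / active if active else 0.0

# Hypothetical five-frame recording.
left = [(0, 0, 0), (0.1, 0, 0), (0.2, 0, 0), (0.2, 0, 0), (0.2, 0, 0)]
right = [(1, 0, 0), (1, 0, 0), (1.1, 0, 0), (1.2, 0, 0), (1.2, 0, 0)]
fraction = two_handed_fraction(left, right)
```

In this made-up recording three frame-steps have some movement and one of them has both hands moving, so the fraction is one third.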
Also on videos – coding schemes used for e.g. video data about constructive practices – building, prototyping, organising, breaking, adjusting etc. Code up activity.
Adaptive expertise – capture process of drawing a mechanism or a machine. When you ask them to draw, you get final drawing but not the process. They invented low-cost ways of capturing drawings. Spot ‘external tool usage’ – abstract representation – more of that indicates greater expertise. Rather than just drawing.
Exploratory behaviour – eye tracking. Gears game – do people recognise the core parts of a problem and what is peripheral. Study showing two games, engineering games. The other was wheels game. Have to connect them and make something turn. Can see where they looked first and for how long. More experienced, engineers, looked at gears first, lower time for first fixation, lower number of clicks, more unique fixation points, more duration for fixations. Eye tracking reveals attention, switching.
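The measures mentioned here (time to first fixation, number and duration of fixations) can be sketched as below, assuming fixations have already been detected and labelled with the screen region they fall on. The tuple format and region names are hypothetical.

```python
def aoi_metrics(fixations, aoi):
    """Summarise fixations on one area of interest (AOI).

    `fixations` is a time-ordered list of (start_ms, duration_ms, aoi_label).
    Returns None if the AOI was never fixated.
    """
    on_aoi = [(start, dur) for start, dur, label in fixations if label == aoi]
    if not on_aoi:
        return None
    durations = [dur for _, dur in on_aoi]
    return {
        "time_to_first_fixation_ms": on_aoi[0][0],
        "fixation_count": len(on_aoi),
        "mean_duration_ms": sum(durations) / len(durations),
    }

# Hypothetical fixation record for one participant in the gears game.
fixations = [
    (0, 200, "background"),
    (250, 300, "gears"),
    (600, 150, "toolbar"),
    (800, 500, "gears"),
]
metrics = aoi_metrics(fixations, "gears")
```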
Instruction vs no instruction – 1h session building virtual device – physics engine. Step by step instruction vs none. Task was build tower or bridge. Used eye tracking, gsr, heart rate. Instructions then open ended vs open ended then instructions. Some students less flexible, keep mindset. Some with low reactivity better at switching. Wouldn’t have seen that but for this data capture.
Programming – capture snapshots of programs. Temporal coding activity. Mean of all students doesn’t tell us much. But extract individual patterns, can see peaks at different times, and align with help interventions. Did some clustering of the snapshots, related it to help-seeking. Got 70% accuracy predicting how much help-seeking just based on the code snapshots.
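One way to turn snapshots into a per-student activity trace, as a sketch only: measure how much the code changed between consecutive snapshots. The use of Python's `difflib` and the snapshot format are my assumptions; the clustering and help-seeking prediction described in the talk are not reproduced here.

```python
import difflib

def change_series(snapshots):
    """Return (timestamp, change) pairs for consecutive snapshots.

    `snapshots` is a time-ordered list of (timestamp, source_text).
    Change is 1 - similarity ratio, so 0.0 means identical code.
    """
    series = []
    for (_, before), (t, after) in zip(snapshots, snapshots[1:]):
        ratio = difflib.SequenceMatcher(None, before, after).ratio()
        series.append((t, 1.0 - ratio))
    return series

# Hypothetical snapshots of one student's program, 30 s apart.
snapshots = [
    (0, "forward 10"),
    (30, "forward 10"),
    (60, "repeat 4 [forward 10 right 90]"),
]
series = change_series(snapshots)
```

Peaks in a series like this could then be lined up against the times of help interventions for each student individually, rather than averaged over the class.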
Toolkits that know what you’re doing. Physics toolkit with visual tags, can capture data. BrainExplorer. LightUp – electronics kit, with little tags that enable cameras to track what was going on.
Motivating work around these toolkits that are embedded with opportunities to do data capture. Get rich data, in conjunction using traditional modes – audio, video – using them in conjunction to get better context setup, find differentiators. Need multimodal; one modality is likely to miss the real pluralism of the ways students are learning. You all had different experiences, we want to capture this better with multimodal data.
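Using modalities in conjunction means lining the streams up in time first. A minimal sketch, assuming each stream is a time-ordered list of (timestamp, value) samples on a shared clock: for each event in one stream, take the nearest sample from the other. The function name, the GSR values and the gap tolerance are illustrative only.

```python
import bisect

def align_nearest(reference_times, stream, max_gap):
    """For each reference time, pick the stream sample nearest in time.

    `stream` is a time-ordered list of (timestamp, value). Returns a list of
    values, with None where no sample lies within `max_gap` seconds.
    """
    times = [t for t, _ in stream]
    aligned = []
    for ref in reference_times:
        i = bisect.bisect_left(times, ref)
        # Candidates: the sample just before and just after `ref`.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        best = min(candidates, key=lambda j: abs(times[j] - ref), default=None)
        if best is not None and abs(times[best] - ref) <= max_gap:
            aligned.append(stream[best][1])
        else:
            aligned.append(None)
    return aligned

# Hypothetical GSR samples (seconds, microsiemens) and fixation start times.
gsr = [(0.0, 0.41), (1.0, 0.43), (2.0, 0.58), (3.0, 0.55)]
fixation_starts = [0.2, 1.9, 7.0]
aligned = align_nearest(fixation_starts, gsr, max_gap=0.5)
print(aligned)  # [0.41, 0.58, None]
```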
Feedback from groups:
Kathleen: What’s your plan for multimodal fusion? Choose low or high level, and how to choose?
Definitely important problem. One example, eye tracking and gsr, makes sense, were interested in reactivity and attention, so synergistic. One problem with multimodal data is you can be overwhelmed, so much data, takes 6 months to analyse each hour. Do pilots, see if it is tractable, does it give any signal.
Kathleen: Signal detection, etc, in your papers?
Yes, normally have 2 or 3 modes together.
Dan: What about relational attributes? How you’d use your sensors to look at pairwise, relational data.
State-based representations – we don’t know the state of the student, but we have some sort of sensor, a probe, which enables us to infer what it may be; some machine learning techniques let you play with some combinations.
Dan: Not just one person at a time. Person touches an object, but some other person also touching, you know that’s an interaction.
Network analysis and representations to look at two people – e.g. both look at same thing, treat that as a node in the network.
Dan: I do that with online interaction in Tapped In.
Just finished running a study, students doing physical building task, in pairs, we’re looking for that, coordination of their action.
Dan: There, the unit of analysis is a pair. But here there were all these dynamic groups, new connections forming.
That has been some of what we try to study with student location. Challenge of being in this space. Also question about fusion. In large part comes down to what we already know about what we can meaningfully extract from those modalities. Other part is to do with research question we’re looking at. Programming – looked at a couple of levels. Can look at the specific semantics of the code. Graph in terms of the changes – that’s a much higher level. Depending on the RQ, that drives the fusion, and the level of feature extraction we’re using.
Feedback from group 1 – made little trains with laser cutter out of acrylic sheets. One participant felt ‘pedagogically impoverished’ – highly individualised, following recipe, support; wasn’t clear what learning intent was, just a task. Task to create a train, what’s the learning value. Most of challenge was around getting to draw the shape. Could’ve done trace analysis about how efficiently we carried out the task, but totally meaningless. From educators, watch levels of engagement, gestural recognition, pick up it was uninspiring. (Others’ experiences differ.) Just because you can measure stuff, doesn’t mean you should. If using location sensors, see me finish as fast as I could to get to where the fun was happening. Another participant was less negative on it. The school inspector’s critique. It was about a procedural learning task. Little that could be done apart from capturing points to streamline the instructions. But not so negative. If I’d just bought a laser cutter, I’d do something similar myself as a learning task. Maybe more motivating for younger people. Felt wasn’t worth doing serious data collection on it, wouldn’t give much insight. It was pretty superficial. Someone else – I took a laser cutting workshop, pretty much the same thing. It is pretty much procedural. No shortage of that stuff going on in classrooms. The equipment is expensive, if you don’t follow the procedure you can’t operate it on your own. Smarter learning task if embed it in e.g. a motivating task. Still need the guidance through the laser cutter.
Group 2 – What may or should have been learned. Ran in different directions, had a lot of fun, somehow all came together. Did run in to places where things didn’t work, even though everything we did right, if hadn’t had guidance to say you’re right but equipment has failed, could’ve got frustrated. Did we learn anything? Short version: certainly some technical, nontrivial stuff, how to work the programming language, maybe about what it’s possible to do with tools like this. Was there more interesting learning around problem solving, or learning to collaborate in groups – some didn’t believe such skills exist, others think they do. Even if they did, was this unique opportunity to learn. Was this different than when bunch of boys on soccer field argue about the rules. My take – there was something really interesting being learned about team work, building, troubleshooting, problem solving, how ideas move around – hard to capture but all the multimodal stuff would’ve helped get there. But hard to test that that thing had been learned outside a really complex and rich situation like that. And those skills are the C21st stuff that are the really valuable stuff.
Group 3 – (missed a chunk here). Sense of ownership, I worked on this part. Social learning going on. Talked about physics, things doing things. Needed enough power to push the ball – that’s all the physics you need to know. No equations. Questioned about the depth of it. What would you analyse? Discourse analysis, what the kids (temporarily so) were talking about throughout the activity, how we came to consensus. When code was uploaded, how that matched to the growth of the design. Who touched what when, how did that relate to the process of design, learning together. Proximity of people and objects. Bring sound sensor to fan, the ball went to the other group at one point. How they moved throughout the space was interesting. Boundary objects, boundary mediators. Dan moved between groups 3 and 4, multiple discussions, talking to all of us.
Group 4 – Laundry list of things that could’ve been learned. Disagreements in their group too! Could’ve been stuff, problem-solving C21st stuff, identity development, modularity and design – but thought I already knew them. For me, only learning in reflection afterwards. Working through a step-by-step worked example would have helped me understand the sequence of the new tool. Some students want structure first then stick with it. Structured challenge – not completely free form, has a more or less defined goal, can see checkpoints. From completely free form to intermediate steps.
Concluding words from Paulo:
Hard in just 2h. Outside of traditional assessments. Many times it looks like this. The data we’re collecting, the LA stuff, is just a fraction of what kids do in schools. Speaks to the complexity, making big policy decisions. In this room, had a lot of disagreement about what counts as learning, is there even such a skill as collaboration. Important to know, take a step back. Are they things that are measurable, or do they even exist? Important to be reflective about this kind of thing. All the investment in makerspaces, fablabs, robotics in schools. It’s not very simple. Might learn collaboration, but might already know that, it’s hard to measure. Maybe you learn when you reflect on your experience.
Feedback: Delight, joy, engagement. Doesn’t matter what we learned, if we were inspired to do something for 40y, launch a career. There’s that dimension as well as the mechanical sense of what’s learned.
Look at relationship between that and what’s learned.
Tacit learning. Lots of learning is tacit at first. You’re building, it doesn’t have to be something you can explain, building up a wealth of experience and tacit understanding. Don’t want to devalue the tacit that you can’t articulate when you’re learning.
It was fun, and fun is memorable.
Important – we only spent 45 minutes. But maybe over a semester.
Valuable about having curiosity and imagination being a carrot to learn more of a programming language, rather than say here’s this language, learn it. Powerful device. You could use this approach to introduce college students to programming 101.
Turns to personalisation. Thousands of books that fascinate them about different things – can bring in calculation, theory.
Initially afraid of hardware, ‘we’re programmers, we don’t do hardware’, without teamwork, unstructured task, might imagine group 4 would get nothing done.
Multimodal learning analytics – seems like, I’ve been doing multimodal for a long time – video, audio, tone of voice, drawings. Some is just a way of collecting that stuff better, or more clearly. Also digitising it more easily, don’t have to do it yourself. Some you really couldn’t get another way, e.g. gsr.
Encourage that agenda, putting constructivist learning on better footing by having better instrumentation. E.g. fast ways to track. Live feedback for teacher, or for students to reflect, summative assessment, researcher.
The amount of labour that goes in to it is huge, which limits how fast you can do it. One problem of multimodal, we have resources to make tools that make the process more feasible.
Same analysis as before but faster, vs stuff that’s entirely new to capture.
Big gain is making it faster, more efficient. And synchronising new types of data with other stuff.
Being able to analyse data at a scale you couldn’t handle before. That’s where machine learning and learning analytics come in.
This is where it can get better, can find what’s meaningful.
Not sure if you can start with the data. You have so much data. Can’t just get a video, press a button, and get a cluster analysis. Need to have some kind of a theory about what’s going on. Way too many degrees of freedom. You can spend a month just playing with the number of clusters, and it might lead you nowhere.
Starting from the data is OK if it’s administrative byproduct data, but not in this context.
Workshop at conference in Australia in December on multimodal learning analytics. Bring your kids to the lab!
This work by Doug Clow is copyright but licensed under a Creative Commons BY Licence.
No further permission needed to reuse or remix (with attribution), but it’s nice to be notified if you do use it.