LASI13 Wednesday (1): Solving problems and building data scientists

First liveblogging post from Wednesday 3 July 2013 at LASI13 #lasi13, the Learning Analytics Summer Institute, at Stanford University, CA.

What problems are we trying to solve?

Using Learning Analytics to Illuminate Student Learning Pathways in an Online Fraction Game

Taylor Martin & Nicole Forsgren Velasquez

Taylor starts. “3 out of 2 people have trouble with fractions.” A crisis of STEM illiteracy – everyone says ‘I can’t do math’, but not ‘I don’t know how to read’. Limitations with current measures. Great opportunity in the great amount of data.

Several different models of fractions teaching. Game round the splitting model. games.cs.washington.edu/Refraction (better link: http://centerforgamescience.org/portfolio/refraction/) – splitters apply to laser beams, apply iteratively to get the fraction you need (e.g. half of a half of a half is an eighth).
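The iterative splitting idea can be sketched in a few lines. This is only an illustration of the arithmetic, not the game's code; the function name `apply_splitters` and the list-of-splitters representation are assumptions made for the example.

```python
from fractions import Fraction


def apply_splitters(splitters, beam=Fraction(1)):
    """Pass a beam through a chain of splitters.

    Each entry n is an n-way splitter, so one output carries 1/n
    of whatever came in. (Hypothetical representation.)
    """
    for n in splitters:
        beam = beam / n
    return beam


# Half of a half of a half is an eighth.
eighth = apply_splitters([2, 2, 2])

# A ninth needs a third first, then a third of that.
ninth = apply_splitters([3, 3])

print(eighth, ninth)
```

Using `Fraction` rather than floats keeps the results exact, which matters when the target is something like 1/9.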

Map these as mathematical states, then show visualisation of trajectories. Focus on the 1/9 task.

Nicole takes over. Pre/post tests show improvement – learning gains. There are very different strategies. Some explore space differently. Look at what underlies those differences from the details we’ve captured – tie to personalised learning. Use cluster analysis to identify these.

Identified interesting variables – number of unique board states (how much space they explore), total number of board states, average time per state, number of moves until initial 1/3 board state (because important to solve – can’t get to 1/9 unless you start with 1/3) – also success on game level. Results were 5 clusters (fussing strategies) – Duncan’s Multiple Range Test used as a post-hoc test to classify and interpret the clusters.
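A rough sketch of how those per-player variables might be computed from a play log. The log format (a list of board-state label and seconds-in-state pairs, one per move) and the function name `strategy_features` are assumptions for illustration; the talk did not describe the actual data format or the clustering algorithm used.

```python
def strategy_features(events, target="1/3"):
    """Summarise one player's log into the variables listed in the talk.

    events: list of (board_state, seconds_in_state), in play order,
            one entry per move (hypothetical format).
    target: label of the key intermediate board state.
    """
    states = [state for state, _ in events]
    total = len(states)
    unique = len(set(states))
    avg_time = sum(seconds for _, seconds in events) / total if total else 0.0
    # Moves taken until the target state first appears; None if never reached.
    moves_to_target = states.index(target) + 1 if target in states else None
    return {
        "unique_states": unique,
        "total_states": total,
        "avg_time_per_state": avg_time,
        "moves_to_target": moves_to_target,
    }


# Made-up log for one player.
log = [("1", 4.0), ("1/2", 2.0), ("1", 3.0), ("1/3", 5.0), ("1/9", 6.0)]
features = strategy_features(log)
print(features)
```

One such feature vector per player is the kind of input a cluster analysis would then group into strategies.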

Cluster 1 – Minimal. Few board states, spend long time on each state, very slow, thoughtful. Takes them a long time to get to 1/3, might never get there. They don’t know where they’re going.
Cluster 2 – Haphazard. Medium states, total number very high, trying lots, low time per state. But hard time to get to 1/3, success low. Move around a lot, more exploration of the space but don’t make that 1/3 move.
Cluster 3 – Explorer. High N states, total number medium, high time per state, high to 1/3, medium success. If they don’t run out of time, they tend to do pretty well.
Cluster 4 – Strategic explorer. Very high unique states, high total number, average time very low, medium time until 1/3 space – really exploring and paying attention. Success very high. Soon as they get to 1/3 they go for solution.
Cluster 5 – Careful. Contrast to minimal. Low number of states, medium time, get to 1/3 very quickly, very high probability of winning. Take medium time, considered about moves, don’t dwell, but go straight to conclusion.

Learning gains and transfer – no association with strategy. But strategy is related to learning. If prior knowledge is medium or better, explorer strategy learned most. All high-fussing strategies (strategic explorer, explorers, haphazard) were good. If prior knowledge low, minimal strategy was better than haphazard, high fussing bad.

Tracing learning and self-regulated learning

Phil Winne

Gaps to span – learners lack tactics and strategies. Also lack data about the learning tactics, data about which tactics have effects. Getting inside the minds of the learners while they’re working – the way they make things. Focus on learning tactics, strategies, and data about which tactics have which effects.

There’s a black box, and he’s opening it. Traces – use them to get inside the black box. Winne & Hadwin model of self-regulated learning SRL. Learners are agents, and environments afford choices – even ones we don’t know about.

Fine-grained ways of looking at traces – studying text – capturing SMART operations (searching, monitoring, assembling, rehearsing, translating). Applies in any text domain.

Software called nStudy. Has outline format in panel – bookmarks, quotes, tags, notes, terms & termnet, connectors/see also, documents, concept maps. Complete suite for learners to engage and share information.

For example, you can quote a selection – drag cursor, make a quote, appears on left-hand side. Can accumulate many. You get a trace of this – it’s a selection, revealing standards in metacognitive monitoring. Not about what information is, but how it relates to what they know, or what the task is. Any action is motivated. Examples of notes – has tags, can make your own or provide them. Invite comments. It’s very configurable. Everything they do is recorded, timestamped, except mouse moves.

When tagging a note, capture some data. Model of learner’s state of engagement with that information. The information they enter in the textbox, all of that information, gives us data to make inferences about what’s happening in their mind. How can we use that to model what they’re doing? Look at some of the information about how they assemble tactics to make strategies.

Co-occurrence and transition matrices. Can map out as a picture of operations and times. Are learners patterned, or do they act rather randomly? Density – actual links compared to maximum number of links possible. Are partitions structurally similar? Graph theory used to explore – patterns of transitions across 2 graphs. (LogMill). Do superficially different events play a similar role relative to similar partitions? What do learners know about when they should do what? Structural equivalence, calculation – essentially Pythagorean multidimensional distance. Determine whether events trigger learners to engage in similar tactics.
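To make the transition matrix, density and structural equivalence ideas concrete, here is a minimal sketch. It is not nStudy's or LogMill's implementation: the function names, the made-up trace, and the choices to ignore self-transitions in the density and to compare whole rows and columns for structural equivalence are all assumptions.

```python
from math import sqrt


def transition_counts(sequence, labels):
    """Count how often each event type is immediately followed by each other."""
    index = {label: i for i, label in enumerate(labels)}
    n = len(labels)
    counts = [[0] * n for _ in range(n)]
    for a, b in zip(sequence, sequence[1:]):
        counts[index[a]][index[b]] += 1
    return counts


def density(counts):
    """Observed directed links divided by the maximum possible, n * (n - 1).

    Self-transitions are ignored here (an assumption).
    """
    n = len(counts)
    if n < 2:
        return 0.0
    links = sum(
        1 for i in range(n) for j in range(n) if i != j and counts[i][j] > 0
    )
    return links / (n * (n - 1))


def structural_distance(counts, i, j):
    """Euclidean ("Pythagorean") distance between two events' tie profiles.

    Compares outgoing (row) and incoming (column) transition counts;
    0 means the two events relate to all the others identically.
    """
    n = len(counts)
    out_diff = sum((counts[i][k] - counts[j][k]) ** 2 for k in range(n))
    in_diff = sum((counts[k][i] - counts[k][j]) ** 2 for k in range(n))
    return sqrt(out_diff + in_diff)


# The SMART operations from the talk; the trace itself is invented.
labels = ["search", "monitor", "assemble", "rehearse", "translate"]
trace = ["search", "monitor", "assemble", "search", "monitor", "translate"]

counts = transition_counts(trace, labels)
trace_density = density(counts)
# How similar are the roles of "assemble" and "translate" in this trace?
role_gap = structural_distance(counts, 2, 4)
print(trace_density, role_gap)
```

A small distance between two superficially different events suggests they play a similar role in the learner's pattern of activity, which is the question the structural equivalence calculation is meant to answer.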

Affect and engagement during learning

Sidney D’Mello

Video of a student solving difficult problems. See confusion, thinking, frustration (half smile), anxiety – tracking emotions in many situations. Range of rich affective experiences – is important. Study many learning contexts, tracking a range of emotions, use a range of sensors, use that to engineer affect-aware learning technologies.

Example of a physics ITS, track confusion, boredom, frustration, model with facial features, interaction patterns – used on the fly to develop system that changes appropriately, closes the loop.

Three fundamental questions: Is positive or negative affect better? Which are relevant? Can you boost learning by confusing people?

Fiedler & Beier Assimilation/Accommodation model. Affective states influence these. Positive affect facilitates top-down assimilative. Negative affect stimulates accommodation/bottom up. So depends on the learning task. There’s a cycle. Bottom-up, easy processing, in time that triggers positive affect, get more risky, more top-down critical creative problems, failure triggers negative. So art of learning is to regulate this process, master it.

Important not to get carried away – contempt is negative, and not good for learning. Specific, discrete emotions matter. (Nice map of them. Note to self: check it out more thoroughly.)

Meta-analysis on emotions during learning with tech – D’Mello in press. Highlight – engagement/flow – frequent across all studies. Boredom and confusion frequent. Frustration, anxious, curiosity, happiness – in 1/3. Contempt, anger, disgust, sadness, delight, fear, surprise – very infrequent.

Cognitive disequilibrium/goal appraisal model – transition structure. State of confusion is pretty relevant to learning, detecting impasses, opportunity to learn – can go either way. Why is confusion beneficial, and can we constructively promote confusion?

Example – case studies of bad science. Flaws subtle, hard to find. Human learner talks to agents discussing the flaws.

Five or six experiments, big overview – contradictions, false feedback successful in inducing confusion. We give them far transfer case studies. I have trouble detecting the flaws in them! Ask them to pick out only flaws. No main effects of contradictions or confusion on learning. But confusion moderates impact of contradictions on learning. Observed in MCQs and far transfer tasks.
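One common way to express "no main effects, but moderation" is a regression with an interaction term. The talk did not say how the analysis was actually done, so the model below is an assumption for illustration only, with hypothetical variable names.

```latex
\[
\text{learning}_i = \beta_0
  + \beta_1\,\text{contradiction}_i
  + \beta_2\,\text{confusion}_i
  + \beta_3\,(\text{contradiction}_i \times \text{confusion}_i)
  + \varepsilon_i
\]
```

In this framing, the absence of main effects corresponds to \(\beta_1\) and \(\beta_2\) being indistinguishable from zero, while moderation shows up as a non-zero \(\beta_3\): contradictions help learning only for learners who were actually confused by them.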

Is +ve or -ve better for learning? It really depends. Both have different functions. Can be mutually beneficial assuming regulation and alignment. Must look at valence as well as arousal.

Affective states relevant to learning – engagement, boredom, confusion – non-basic emotions. The basics: anger, sadness, disgust, fear, surprise, happiness – 98% of work has been done on those basic ones. In learning, needs non-basic emotions. Great opportunity.

Confusion beneficial to learning when students are meaningfully confused, they attempt to resolve it, with scaffolds. More work needed. How do you confuse an entire class? LA means I can specifically confuse you and regulate your confusion.

Questions

Jason, WGU: Question for Phil. The nStudy, is it a product? Does it work with eReader?

Phil: It’s a Firefox extension.

Jason: Download for student to take?

Phil: Each local machine needs that extension.

Chris, Saskatchewan: For Sidney. The basic emotions, and research focused on there, we need to consider other emotions and do research there. Can you elaborate more – what emotional states are important?

Sidney: Basic/non-basic comes from about 100y ago. They’re relevant in the whole ecosystem. I had rage and despair during my dissertation, but in short-term context, no reason. Reinhardt ? has a taxonomy of 4 types – achievement, social emotions, topic emotions. Place to figure out what to focus on.

Chris: For Nicole and Taylor. Found distinct strategies for students doing fractions. How generalisable are those? E.g. iterative, unstructured approach, do they use that in other problems?

Taylor: We have just found these. Environments we’re going to look in are visual programming environments like Scratch. Also in different fraction games. Other way is predictive analyses we’re moving to.

Nicole: Look at e.g. demographics, prior knowledge, how indicative that is of each strategy. Have seen similar patterns in other environments. Some iterate in small increments, some try something once or twice then delete whole thing and start over. Interesting to continue over. Can tailor the hints towards current strategy.

Taylor: Fun to compare work in more structured tutoring environments.

Judy, Sydney: Have to admit to some confusion. My conception of the title – what is LA going to empower us to do? What is generalisable from what all three of you are doing?

Phil: I take analytics in a broad sense. We’re using graph-theoretic ideas. The techniques help us ID patterns, figure out what triggers them, can push nudging messages. The clustering, we might ID a neighbourhood of people almost like me but doing better, I could get a suggestion to help me, I see if that works. Feedback about process, document self-regulating in me.

Taylor: Big problem I’m trying to solve, for 20y in math ed, we don’t have good answers to what’s going to help kids learn and teachers teach. Is cluster analysis going to fix that? No. But it’s the overall goal.

Panel: Building the Educational Data Scientist

Chair: John Behrens

Speakers: Ryan Baker, John Stamper, Piotr, Marcelo Worsley

Ryan Baker – Creating a grad program in learning analytics

At Teachers College, Columbia University. Largest grad school of ed in US. Says Wikipedia. Has best-rated undergrad programs by National Council on Teacher Quality. Despite not having any undergrad programs. (!)

Moving towards creation of MS in Learning Analytics. All grad programs have to be approved by the state of New York, so takes a while. Currently Masters in Cog Studies in Ed, Focus in LA. Admitting students now, rolling admissions.

Courses specifically on learning analytics: what it is, how you can use it. Nitty gritty on methods. Broader societal picture. Feature engineering studio. Ed data managements. And lots of electives.

All syllabi and materials will be open access – core methods on EDM already are (on Ryan’s web page). Some will be real-world data, some on breadth, some bring-your-own data to analyse. Breadth options – many including psychometrics, data use by teachers and school leaders – Alex Bowers.

The goal is to create a cohort of LA professionals ready to take leadership positions.

First program specifically devoted to LA. MOOC on EDM methods, big data and education, in the Autumn.

Send us your students!

John Stamper

HCI Institute at CMU, LearnLab Pittsburgh Science of Learning Center. Many years of PhD ed. Corporate partner program. Consistently getting feedback on need for something more than the rigorous PhDs, wanted masters level, practitioners in educational data science. So new program – masters in learning science and engineering. Application deadline passed, first class starts in August. 1y, 24 month program condensed in to 3 semesters over 12 months. Psychometrics, EDM, interaction design, cog and social psych, design, implementation & evaluation of ed interventions. Really getting technology out there. Differentiated from trad data science programs more focused on data methods, here focused on bringing in educational elements. Pretty soon will be taking in next class, deadline likely around Jan 2014.

http://www.lse.cs.cmu.edu

Piotr

EdX.

Sebastian Thrun, background in AI, models in data. Coursera founders Daphne and Andrew with background in AI/machine learning too.

Chronicle article – what employers do and don’t want – don’t want knowledge, want complex problem solving, communication, critical thinking, ability to get stuff done.

Couple of core skills – mathematical maturity, informatics, cognitive science and communications. Applies across broad range of disciplines. Start in these areas, and then apply to e.g. education. No data scientists, in e.g. Udacity, EdX – but big focus on data.

Do we really need data scientists?

Marcelo Worsley

In PhD program, been here for 4y. Talking about opportunities in what can be done to train the data scientist. First class was Andrew Ng’s machine learning. Gets people to put up hands for majors, picked up a few non-engineering (econ), but not education. But more now take that class. Now have critical mass of students who have taken it. We can start to have these discussions, own group that studies machine learning and human learning within the School of Ed. Encouraging.

We’ve seen a lot of change. In School of Ed, when I said what I’d do, people thought I was crazy. You can’t monitor people … but more accepted now. A lot of growth and expansion. To continue that, need work like Ryan and John’s in building out courses. Projects, have to solve challenging problems. Looking at courses to choose – if no final project, I won’t take it. If no project, I can’t apply them. So when designing courses for LA, EDM, data science – do take an approach that gives students chance to bring their own data, explore the space, in a way that’s meaningful to them.

Each of us in the room has different definition of LA. It’s not one thing. Have to be willing to set up, recognise different ways to tackle the space. As a community, or even as a professor, be willing to reconcile it. Some will be more breadth-oriented, versus saying LA is this one thing. Also offer classes in a lot of depth. Many presentations this week describe things in depth. As PhD and masters students, cool to know different techniques at a surface level, but need to be able to use them.

Relationship between us and teachers, and between us and students. Incorporate opportunities to directly interface. That’ll give us a better perspective in the work we’re doing.

Lastly, anyone in two different schools/departments? (Quite a few.) Anything like a joint position, make sure you know who’s making the tenure decisions. End up trying to satisfy two sets of requirements. LA students feel like that – reach of rich, deep computer science work – AI, HCI etc – while also satisfying the deep learning theory people too. These programs should integrate those things, set up as a unified program. Need to know there is a single home that can evaluate them. Excited to see these programs coming along.

John: Personal note. First job, educational statistics training, cognitive psych. Got job saying no stats. Second job saying only stats, nothing else. In the business world – diagram from new O’Reilly publication on Analyzing the Analyzers – data businessperson, data creative, data developer, data researcher.

Questions

Mark Mimms: I think you have to catch these skills before grad school – what about undergraduate programs? Boulder has tracks in CS. What else?

John Stamper: We have undergrad courses in ed tech, we have ed game design class for undergrads and postgrads. Our CS and Stats programs are rigorous in analysis.

Mark: Mine them for student researchers?

John: We have an internship program. LearnLab and CMU. They often turn in to masters or PhD students. Agree there is a need for that. Don’t know how you get undergrads.

April: CMU stats dept has semester and year-long project courses, great opp to bring in undergrad and work on LearnLab data. Biggest entry point.

Stephanie, UMich: New undergrad program! Juniors and seniors to get undergrad degree in information, with component in data science. Finding students coming when they’re alienated from CS/Info and want more human approach to dealing with data.

John: Piotr, your experience. Arguing for a high specialisation in math and physics. Pearson, we hire math or CS folks. What’s your sense about the nature of people that might be highly mathematical or computational but need rounding out in cognition, communication, but also humanistic studies about society? In some roles, highly politically important.

Piotr: One thing missing was pedagogy, how people learn and think. As do analysis, try to find patterns, has to be motivated by how does this match up with cognitive theory. Also a HCI component, interacting with customers. But customers don’t know what they want to know. Give tools to motivate educators to make better course. Strong humanistic component. Within one person, can be.

Caroline: The iSchools in general, undergrad programs in informatics, get disillusioned with CS, many across the US. Programs in communications. There are undergrad programs already. The iSchools graduate people who call themselves data curators. Not analysis, but better than scientific teams, define structures and maintain data for others to analyse. Need to report data for NIH and other agencies, have to disseminate your data too. It’ll come in educational research if it’s not there already.

John: Ontology, database design?

Caroline: Yes, also defining metadata, technical storage aspects, legal requirements. Somewhat records management. More data name to it in the information schools.

Sharon: Consensus is broad area. One thing not mentioned in grad programs is it’s critical to have empirical research methods, causal inference, research methods in general being taught. Those techniques and strategies need to be used with data mining, machine learning. Bootstrapping each other, creating new science between them. Another comment, multidisciplinary and broad. In HCI programs, psychology. Not all LA people have to be the same flavour. Psychology, stats, maths, CS, education may track them. Should be as broad as that.

Ryan: About integrating more trad research methods, completely agree, it’s a requirement in our program. About tracks, that’s a thing that’s emergent. People have choice beyond core methods, can create own sets of things along them. I’m a little bit too early to say what the best tracks will be. Will see diversity.

John S: We have that too. Our masters modelled after HCI masters, project-based. Corporate-sponsored project, teams working full-time on a project. Hope to implement that loop, taking data, analytics, experiment, back in to the design, the loop Ken showed.

Marcelo: Like way Stanford is set up. Our program, has tech interest, also rooted in psychological studies – developmental and psychological studies, or in teaching, or in SHIPs – history/social sciences. Also if you don’t have a masters, are required to get one outside the School of Ed. I had to do a masters, did it in CS. Joint expectation. At other schools, said interested in masters in CS outside School of Ed, they said you can’t do that. Having that option helps.

Piotr: Within MIT, split between labs and depts. Labs have people from many depts. Interdisciplinary. How are your programs integrating disciplines?

John S: based in HCI, already interdisciplinary. CMU has centers, institutes and depts, very interdisciplinary. Within the LearnLab, is a center across CMU and Pittsburgh.

Ryan: Organisationally, Teachers College has courses in the same dept. But program has coursework in a domain area, can be in same dept, or multiple other ones. Program is in a dept, but students can draw from a range.

Janet, Colorado: Heard great discussion of technique of our area, but not the humanities – ethics, legalities, how we relate to the communities. How might we include that. Any time we start a new program, think about evaluation metrics – thoughts?

Ryan: LA and society course is focused on those exact issues. Metrics – personally I think it’d be neat to look in medium term at what graduates do. If go in industry, they’re changing the world. On to doctoral programs, what are their publication outcomes. Don’t see enough looking at long-term outcomes. We see where they go for jobs, but not long term. Good to see.

John S: I agree. Our major outcome is seeing people get jobs as ed data scientists, designers. Initially a tremendous demand for these positions. Interested to see if that demand is going to be maintained. Intuitively I think it will. Shift in what educational corporate people are hiring. Less content people, instructional designers. We’ll see.

John B: At Pearson we have K12 to HE, some parts only do testing, some only do kindergarten, whole clinical area, professional area. ITS, LMS. Many holes in education that might use data scientists with different profiles. Broad range of needs, different profiles.

John S: We don’t want to be everything, we’re focusing on what we’re good at – that’s why it’s learning sciences and engineering.

Piotr: Learning engineer – someone who can engineer courses based on data. Different from data scientist. Do A/B tests.

John B: That’s one profile. Some work on geospatial data, how we relate, patterns in crime/employment and so on. Depends on your conceptualisation.

Sidney: For Piotr. You mentioned you don’t hire educational data scientists.

Piotr: No, just we haven’t.

Sidney: Because you haven’t met any, or you hired them and they sucked?

Piotr: Neither. Like early Google, find very talented people, very good skills across domains, happens to be most came from other backgrounds – e.g. physics ed researchers. They had that skillset but came to it a different way.

Marcelo: Advice I’d give is to not be intimidated by the computer science community, don’t let the level of maths steer you away. This has been one of the main deterrents in education. Methods classes presented as ‘don’t worry, you don’t need the math’, people take on persona of not needing to get in to the math. My big takeaway, head in to, dive in to it. Get your knees scraped up a bit. Keep the fight going, lot of opportunities when educational researchers start to tackle tough maths problems. Algorithms that work for CS may not make sense in learning theories, won’t change until tackle them. Same from CS side, at times, even at Stanford, see education as less interesting – but keep going, there are a lot of changes we can make to education if we’re willing to engage this to the fullest.

–

This work by Doug Clow is copyright but licensed under a Creative Commons BY Licence.
No further permission needed to reuse or remix (with attribution), but it’s nice to be notified if you do use it.

Author: dougclow

Data scientist, tutor, project leader, researcher, analyst, teacher, developer, educational technologist, online learning expert, and manager. I particularly enjoy rapidly appraising new-to-me contexts, and mediating between highly technical specialisms and others, from ordinary users to senior management. After 20 years at the OU as an academic, I am now a self-employed consultant, building on my skills and experience in working with people, technology, data science, and artificial intelligence, in a wide range of contexts and industries.