Liveblog notes from Wednesday afternoon at LAK16 [full list of blogging]: Session 1C MOOC discussion analysis.
Alyssa chairing. Says first paper might be really awesome, no bias. [laughter]
3C1 MOOCEOLOGY Bringing Order to Chaos in MOOC Discussion Forums with Content-Related Thread Identification
Yi talking. MOOCEOLOGY project, led by Alyssa.
Too many messages, piled together, regardless of the original purpose, makes the forum very confusing. Some measures have been taken, setting up separate forums, peer recommendation systems (views, votes), but research suggests this doesn’t solve the problem. So our team is interested in EDM/LA to automatically identify posts, particularly posts related to learning – content-related discussions. Socialising, technical, logistical posts, non-content-related discussions. This binary classification would help.
Outline: RQs, context, methods, results, pedagogical implications, current work.
Do content-related ones have linguistic features that can be used to build a predictive model? How well will that generalise to different contexts? Views and votes are so widely used to sort; are they helpful for our purposes? And are they robust across time?
[I like these slides.]
Data challenges: the data was post-hoc, so it was what’s available and shareable, no input to the design, and limited information about the learning context.
Stanford OLI gave us 5 courses on 3 subjects. N threads 299 to 837. One course training set, second pres of the same as cross-offering. More advanced course, cross-course. Then intro-psych as near cross-domain example, then physiology, far cross-domain. Some similarities in design, some differences.
RQ1 – do content-related threads have distinct linguistic features?
Use thread-starting post as unit of analysis. Manual coding content/non-content, good kappas, about 0.7. Extracted bag-of-words feature set, unigrams and bigrams only, parts of speech unhelpful. 2236 features extracted, ranked by kappa, took top 30 for content, non-content. 60 different distinct features. Took those top 60 from all five courses, examined in what contexts and purposes they were used. IDed 10 different categories.
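[Editor's sketch of the feature-ranking step, not the authors' code. It assumes each n-gram is scored by Cohen's kappa between its presence in a post and the manual label; the talk did not say exactly how the per-feature kappa was computed. Function names and the toy posts are made up for illustration.]

```python
def ngrams(text, n_max=2):
    """Return the set of unigrams and bigrams in a lower-cased post."""
    tokens = text.lower().split()
    grams = set()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.add(" ".join(tokens[i:i + n]))
    return grams


def cohen_kappa(a, b):
    """Cohen's kappa between two equal-length binary sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1:
        return 0.0  # no variation at all, kappa is undefined; report 0
    return (observed - expected) / (1 - expected)


def rank_features(posts, labels, top_k=30):
    """Return (top content features, top non-content features).

    labels: 1 = content-related, 0 = non-content (assumed coding).
    """
    post_grams = [ngrams(p) for p in posts]
    vocab = sorted(set().union(*post_grams))
    inverse = [1 - y for y in labels]

    def top(target):
        scored = []
        for g in vocab:
            presence = [int(g in grams) for grams in post_grams]
            scored.append((cohen_kappa(presence, target), g))
        # Highest kappa first, ties broken alphabetically.
        return sorted(scored, key=lambda t: (-t[0], t[1]))[:top_k]

    return top(labels), top(inverse)


# Toy thread-starting posts (invented), coded content / non-content.
posts = [
    "what is the mean value",
    "thanks for the great course",
    "how is the probability computed",
    "hello everyone from spain",
]
labels = [1, 0, 1, 0]
content_top, non_content_top = rank_features(posts, labels, top_k=3)
```

With real data the vocabulary would be much larger (the talk mentions 2236 features), and tokenisation would need more care than `split()`.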
Tracked features across content-related and non posts.
First group of features are about the course subject – e.g. in stats course saying “probability”, “value” or “mean”. When they use words from the field, they are talking about the content. There are not many of these in the top features; even if we take them out we have a lot of distinguishing features. Another group – ‘understand’, ‘examples’. Or question words. Connectors, connecting ideas – ‘and’, ‘or’, ‘then’, ‘but’ (!).
Discussion about ‘exam’, ‘videos’; pronouns; appreciation and greetings words. These are non-content-related. Others in the middle less clear.
So yes, content-related threads do have distinct linguistic features.
RQ2: Can these be used to build a model?
Yes. 10-fold cross validation, accuracy, kappa, recall, precision. Not bad.
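[Editor's sketch of the evaluation set-up: splitting into 10 folds and computing the four reported metrics from predictions. The classifier itself is left out, since the talk did not name it here. There is no shuffling or stratification in this sketch; whether the authors used either is not stated.]

```python
def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k folds over n items."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and Cohen's kappa for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "kappa": kappa}


# Invented labels and predictions, just to exercise the function.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 1, 0],
                         [1, 1, 0, 0, 0, 1, 1, 0])
```

Kappa is worth reporting alongside accuracy because it discounts the agreement a classifier would get by chance when one class dominates.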
RQ3: Does it generalise? Not too bad, although kappa falls off for the transfer (near domain, then more to far domain). Could be change of subject matter, but could also be course design and pedagogy.
RQ4: Are views and votes good indicators for this prediction?
Supplemental model – views and votes as features don’t improve the model. Stand-alone model with views and votes, kappa <0.13. So no!
RQ5: Does it work across course time segments?
Looks Ok, but saw a change in StatLearn.
Pedagogical implications:
- Post-hoc filtering tool, to assist forum navigation.
- Live-tagging tool.
- Learning identification – maybe indicate learning goal-orientations. Instructor could intervene.
Currently working on data partitioning pre-step. Topic modeling, find information of interest to learners. And SNA, comparing social to academic networks, influence on learning achievements.
Q: How promising do you think it is to combine general network features, like connectivity and centrality, with topic analysis?
It’ll be promising. Would like to see what patterns we can find out of the content/non-content activities. Previous studies found academic connections help with retention. We are interested in finding if that is true in MOOCs.
Q: More a comment. Without being that systematic, the approach is very convincing to identify these elements of speech. We have done something quite similar. In a project on video-learning, we looked at Khan Academy comments. First characterise them by semantic richness in terms of domain concepts. We used Automat (?), text to network tool, now [another tool]. Among non-domain concepts, interesting category we called signal concepts, to do with explain, help, what’s the difference between. Striking to see your learning-process-related concepts are a superset of our signal concepts. Ours was just exploratory but very similar. We have used these to characterise students’ problems of understanding.
Diagnostic analysis is interesting. It matters when our findings are applied to help teachers and learners.
Q: Working in different courses and splitting is really important for generalisability. Are there variables about the students that might be possible splitters, maybe within courses. Maybe older versus younger, that might characterise helpful differences?
We didn’t have demographic data. We thought about how these findings could be combined with previous research on subgroups of MOOC learners. That too would be promising.
Q: Joke, filter off anyone under 20 years old [laughter. Hmm.]
Q: Top Feature Distribution, what is each block?
Each one is one linguistic feature. Higher means more features.
Q: Red and yellow [are StatMed 13 and StatMed 14] It’s strange to me you have so much difference. It should be the same. Have you experimented how good can you generalise with those categories with a new course? This suggests you have to calibrate.
Well, we haven’t done that yet because we don’t have the access to course design. The variance in the number of data for first to second offering, could be difference associated with the learning group, or how the forum was managed and facilitated. Both instructor and TA might have different techniques and understanding of the learning population.
Alyssa: This is showing the categories in which the top features fell. But these are only the top features. The whole model, has the whole feature set, kappa didn’t change at all between one course to the next. Some different word might become cool in the lingo, but the general patterns.
Q: Are the categories based on supervised or unsupervised techniques? [Supervised] Might be fun to do cluster analysis, to see if those categories come out statistically.
Alyssa: Because they appear together?
Q: Yes, interested if those cluster.
Alyssa: I wonder if they would. Similar kinds of words, but maybe less likely to co-occur.
Q: Yeah, but that could be interesting.
Alyssa: Could be interesting as a follow-up. Maybe one is better.
3C2 Investigating Social and Semantic User Roles in MOOC Discussion Forums
In MOOCs, forum often only way to get contact with each other. Complex structures arise. Analysis is required to understand the underlying mechanisms of information exchange. Semantic roles, social roles. Goal is to investigate to what extent the social and semantic structure is interleaved.
Dataset from Coursera MOOC on Corporate Finance. Network analysis, content analysis. Combine to get socio-semantic blockmodelling.
Thread network – highly ‘cliquish’ networks. Thread initiator to all replying users, very sparse, some forums don’t nest replies.
Posts, automatic classification. Random forest classifiers. Info-seeking, info-giving, other. Post linking – only info-seeking and info-giving posts are considered. Then derive social network, edges are gives-help. 647 remaining users, started with 1500.
Block modelling, from SNA. Reduce complex structure to an interpretable macro structure. Relations mapped between groups. So e.g. hierarchical structures, or core-periphery structures. Relations between clusters, so colouring nodes by type of relations between colours.
Content analysis – discussion topic modelling.
Semantic analysis. Open Calais Social Tagging infers topics by comparing texts to Wikipedia articles. If an article is similar enough, it assigns the title of that article as a topic for that thread.
Threads can overlap in their topics and users but not necessarily both. Users can have similar interests even if do not participate in the same thread.
Use overlap of topics to derive similarity of users. Similarity of user profiles is derived from thread similarities.
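[Editor's sketch of deriving user similarity from thread topic overlap. Two assumptions that were not stated in the talk: thread similarity is the Jaccard overlap of topic sets, and user similarity is the mean thread similarity over all pairs of threads the two users took part in. The published code linked further down is the place to check what was actually done. Names and data here are invented.]

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 if both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def user_similarity(user_threads, thread_topics, u, v):
    """Mean topic similarity over all (thread of u, thread of v) pairs."""
    pairs = [(s, t) for s in user_threads[u] for t in user_threads[v]]
    if not pairs:
        return 0.0
    total = sum(jaccard(thread_topics[s], thread_topics[t]) for s, t in pairs)
    return total / len(pairs)


# Topics per thread, e.g. titles of matched encyclopedia articles.
thread_topics = {
    "t1": {"Bond", "Interest rate"},
    "t2": {"Interest rate", "Net present value"},
    "t3": {"Dividend"},
}
# Threads each user took part in.
user_threads = {"ann": ["t1"], "bob": ["t2"], "cy": ["t3"]}

ann_bob = user_similarity(user_threads, thread_topics, "ann", "bob")
ann_cy = user_similarity(user_threads, thread_topics, "ann", "cy")
```

Here ann and bob never post in the same thread but still come out as similar, because their threads share a topic.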
Socio-semantic structure. To what extent is the social and semantic structure interdependent? [Essentially, do the semantic and social models line up?]
Calculated correlation – Spearman 0.36***. Scatter plot looks pretty widely spread; a bit of a trend, but not that much. Clustering, modeling errors, social and semantic structure are partially interleaved. But better than random.
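[Editor's sketch of a Spearman rank correlation, for readers who want to see what the statistic does. It is the Pearson correlation of the ranks, with tied values given their average rank. What exactly was paired up in the talk (presumably a social and a semantic score per pair of users) is my assumption; the numbers below are invented.]

```python
def average_ranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError on constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented scores for five user pairs.
social = [0.1, 0.4, 0.4, 0.7, 0.9]
semantic = [0.2, 0.1, 0.5, 0.6, 0.8]
rho = spearman(social, semantic)
```

Because it works on ranks, the statistic only asks whether the two orderings agree, not whether the relationship is linear.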
There are three main roles connected by regular relation patterns, typical core-periphery structure, most users in the periphery. Core users, more posts, well connected, engaged in the main topics; particularly important. Occasional help seekers, very different. Small number of posts, usually one or two on very specific topics, more as an information source. Occasional help givers not very active, but provide help. Could be drop-outs, or ‘elder statesman’ behaviour. (!) Low activity but helpful information acknowledged by the community.
It’s hard to ID meaningful structures. Semantic structure and network structure are not strongly interleaved. Connection patterns more governed by general forum activity.
Open issues – core users main focus, but peripheral users are important. Identifying changing role patterns is complex. Support mechanisms are needed.
THE DATA AND CODE ARE AVAILABLE! On Github: https://github.com/hecking/socio_semantic_blockmodelling
Q: What MOOC platform?
Q: Separate forum?
Yes, sub forum structure.
Q: Purely info seekers and info givers.
So we ignore social posts, like join my FB group. Just looking for help.
Q: Maybe difficult to find social patterns because they change over time. Did you try a duration in three periods, say?
Alyssa: Condensing to 3 weeks, 13 weeks, may lose patterns and artificially create them. Glad to hear about this work.
Q: Were you able to find out about if the active users were also largely the completing population?
We do not have the data to investigate this. The population in the forum is small compared to the population in the MOOC. There are studies showing that MOOC completion does not necessarily go with forum activity, but forum activity does go along with MOOC completion.
Q: Naive question. If forum activity, teacher/TA stuff, can you see that, is it registered?
We know the type of users, technical staff, instructor, students. We filtered all instructor posts and posts by admin staff.
Q: Were those core users, is there an identifiable, easy metric. The ones who regularly seek or give information, what is regular?
It’s from the graph model. Pre-specified patterns, one is periphery pattern. One big cohesive group, and surrounding. Activity more concretely, counted number of giving and receiving posts, and how diverse are those users. In the core, more active, more diverse communication pattern. More outreach.
Q: We know this notion of blockmodelling is not very easy. SNA concepts adopted widely in this community, centrality, cohesive subgroups, but blocks, partitioning of the community, regular patterns. They can be cohesive – the core is – the peripheral groups do not. This is a topic that goes across. General point, SNA techniques, there’s a lot to explore beyond the first-order things. Blockmodelling is one of these. A caveat.
Alyssa: About blockmodelling. If looking at just roles and equivalent, doesn’t surprise me it doesn’t link to content. People would take similar roles.
Q: Asymmetry goes that way. Semantic diversity brings in this distinction that is not in the role.
Yeah, you’re right.
3C3 Untangling MOOC Learner Networks
Sasha talking. Happy to be last in the session, some similarity in themes. Nice to know we’re looking at similar themes, but differently.
Understanding student relationships. From ed research, student relationships via in-class interactions lead to all sorts of improvements. What’s the impact of scale?
Many concepts – social presence, strong ties, CoP, so on – these constructs are from non-MOOC contexts. Some models very much off – e.g. who turns up every day very different in a class than in a MOOC.
Assumption of continuous interaction, assume they’re present. How does that translate to MOOC terms? How many people do you even meet in a MOOC forum? Much good stuff about engagement patterns, they’re not explicit about sub-groups, and don’t integrate engagement patterns in to analysis of content.
Group of people defined by regularity of participation, timeliness of replying, mutuality of interaction. Three-week rolling window.
Simple questions, quite qualitative. How does network of regular forum participants compare to overall cohort? What do they co-produce? How to define them?
TU Delft MOOC on Solar Energy, 57k students, 2730 certs. 14k forum users, 17k forum posts. 6k contributors. 582 posting in 3 or more weeks. Timely and reciprocal interactions for 3 or more weeks: 255. There were 12 people who made a post in every single week.
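[Editor's sketch of the simplest of these filters: users who posted in 3 or more weeks. I am assuming "3 or more weeks" means distinct course weeks, not necessarily consecutive. The timeliness and reciprocity criteria are not covered here. Data is invented.]

```python
from collections import defaultdict


def regular_posters(posts, min_weeks=3):
    """Return users who posted in at least min_weeks distinct weeks.

    posts: iterable of (user, week_number) pairs, one per post.
    """
    weeks = defaultdict(set)
    for user, week in posts:
        weeks[user].add(week)
    return {user for user, seen in weeks.items() if len(seen) >= min_weeks}


# "b" posts three times, but all in the same week, so does not qualify.
posts = [
    ("a", 1), ("a", 2), ("a", 3),
    ("b", 1), ("b", 1), ("b", 1),
    ("c", 2), ("c", 5),
]
regulars = regular_posters(posts)
```

Counting distinct weeks rather than posts is what separates regular participants from people who post a burst once and disappear.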
Compare to overall MOOC? Network grows over time. Degree accumulates over time. Peak activity is 3/4 in, then falls. Frequency of encounters is not very high except for hyper-poster outliers.
Degree distribution at different times. Edge weight distribution.
Looked at positions in the network. Based on co-occurrence and conversation. Clusters on betweenness and clustering coefficient. Two clusters, high participation in lots of threads. Other two, most of the regular participants…
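[Editor's sketch of a thread co-occurrence network and the local clustering coefficient. It assumes an undirected, unweighted tie between any two users who posted in the same thread, which is my reading of "co-occurrence". Betweenness is left out to keep this short. Data is invented.]

```python
from itertools import combinations


def cooccurrence_graph(threads):
    """Build an adjacency dict linking users who posted in the same thread."""
    adj = {}
    for users in threads:
        members = sorted(set(users))
        for u in members:
            adj.setdefault(u, set())
        for u, v in combinations(members, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def clustering_coefficient(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    neighbours = sorted(adj[node])
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))


# "a" posts in three threads; everyone else posts in one.
threads = [["a", "b", "c"], ["a", "d"], ["a", "e"]]
graph = cooccurrence_graph(threads)
cc_a = clustering_coefficient(graph, "a")
cc_b = clustering_coefficient(graph, "b")
```

In this toy case the user who turns up in many threads gets a low clustering coefficient simply because their neighbours come from different threads, while a single-thread poster gets a high one.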
What is the content they co-produced? Classify as task/non task, content/social. Admin/technical issues; metacognition – and mixed.
Cognitive task grows over time. Social non-task high at the start, fades away.
The network of regular participants grows over time, both in participation and intensity of interaction. Social negotiation threads are central (usually at beginning, but not always), while content threads are dominant and grow over time. People who participate regularly and post a lot meet each other more so more potential to interact.
Q: How social competences go online. The majority who do not regularly participate. Is there a private community, maybe on Facebook? Could you find out?
There is work, people have looked at other places. I’m not interested in their leisure time. You want to intervene where you can. Study groups in the library, the teacher has no influence. If you come to my workshop, I can influence you. My interest is on these. As a teacher I don’t need to know what happens outside to facilitate in class.
Q: I agree it’s out of influence. But if I want to explain that online courses not so bad, but only 10% interact, I would be happy to learn what the other ones do, say there’s more going on. That’s why I’m interested in what goes on outside.
What I’ve done can’t inform that.
Q: More a suggestion. This MOOC was in the old session days, every week there is something to do. New model, on demand, always on. It would be interesting to use methods to show difference between the timed, weekly one, versus always on. No strong evidence to prove it’s not good, that would be interesting if you could.
Another course didn’t have assessment you needed to get a grade. First 4 or 5 weeks were free of assessment. Differences in these cases, these networks are more centralised. If instructor controls the pace, hyper-posters emerge early. Where assessment is loose, network is more distributed.
Alyssa: We also need to start looking at what these discussions are doing for learning. Not just showing, I agree self-paced model won’t have the same connections. We need to see how that’s valued in the pedagogical model.
Q: How to interpret your cluster 2, the high betweeners low clustering coefficient. Interesting finding. I would interpret, this is a big MOOC, anonymous. Why would you find this betweenness, spreading over semantic or topic diversity? Would that be an interpretation?
It’s much simpler. It’s co-occurrence from thread replies. At the start, early posts, 30 participants. Clustering goes away from that. High betweenness, the more you participate in …
Q: You would hope between different threads.
It’s the volume of participation, there’s nothing more there.
Shane: They do jump around. Have a profile that might come through.
Q: Might be interesting to see these topical aspects. Maybe for final discussion.
Alyssa: There’s really interesting patterns across all three of these. It matters what people are talking about. First paper, looking at content. Dive deeper, relate social networks and topics. Or look over time, in terms of who runs in to whom, how do you design for conversations we think are important. Most of us look at the data after the fact. Often not how we’d have designed it. How do we move into ways of being more pro-active, analysing courses we expect to do certain things, rather than analysing courses that may not have had a pedagogical intent.
This work by Doug Clow is copyright but licenced under a Creative Commons BY Licence.
No further permission needed to reuse or remix (with attribution), but it’s nice to be notified if you do use it.