More liveblogging from LAK13 conference – Thursday morning.
Communication and Collaboration
Analysis of Writing Processes Using Revision Maps and Probabilistic Topic Models
Vilaythong Southavilay, Kalina Yacef, Peter Reimann, Rafael A. Calvo
Topic is collaborative writing (CW). Essential skill. Majority of documents are authored by more than one person. Single-authored essay is a rare instance in professional life. CW combines the cognitive and communication requirements of writing with the social requirements of collaboration. It’s a complex process, has attracted cognitive psychology, applied linguistics and other research interest.
Coordination issues – many different forms – group single-author writing, sequential writing (amendment), horizontal-division writing.
22 masters students in the course EDPC5021 – topic learning sciences and technologies. Every two weeks (a cycle) students divided into groups of 4-5 members, write about a topic jointly, 3000 words. Six cycles over the whole semester.
Used Google docs. Have the text, with comments on the right hand side. Gave template to students – a Part A where they formulate their individual understanding, and Part B where they were to work more collaboratively on relating the readings to the big themes of the course or the field. Google docs stores revision history – version number, author ID, timestamp. Some challenges in using Google Docs API.
Want to provide feedback on the finished product, the essay, but also give feedback and guidance on how to work together.
The easy part is the final product. There’s a set of essays, can be graded, appraised and give feedback using a rubric. There’s also the revisions from the revision history, which tends to be large and extensive. As a teacher, can’t go through all those revisions to draw conclusions – need help from computer scientists.
That’s the task – information on the process as well as the products to give students feedback.
Big table – group ID, number of revisions and other indicators. Mean 66 revisions per doc.
Visualisations – revision maps, topic evolution charts, topic-based collaboration networks.
Data flow – google docs, api, get version history to create revision map. For the other two, send via text comparison utility, topic model (topic evolution chart), author-topic model (topic-based collaboration network). Details in the paper.
Three kinds of mining activity: artifact analysis, process analysis, social network analysis.
Only one shared with students is the revision map. Works on the level of sentences and words. Students need guidance on how to interpret. Columns are revisions, rows are paragraphs. Columns are attributed to the author. Each cell is colour coded – green=words added, red=words deleted. [more red/green colour coding!] Yellow is balanced. White is no change in that revision. Right-hand column showing total change per paragraph; top row showing total change per edit.
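[A minimal sketch of how one cell of a revision map like this could be coloured, not the authors' implementation. It assumes a word-level diff of one paragraph between two consecutive revisions, and it assumes "balanced" means exactly as many words added as deleted; the talk did not give the actual threshold. The function name `cell_colour` is hypothetical.]

```python
import difflib

def cell_colour(old_paragraph: str, new_paragraph: str) -> str:
    """Classify one revision-map cell from word-level changes to a paragraph."""
    old_words = old_paragraph.split()
    new_words = new_paragraph.split()
    added = deleted = 0
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            deleted += i2 - i1   # words removed from the old version
        if tag in ("insert", "replace"):
            added += j2 - j1     # words introduced in the new version
    if added == 0 and deleted == 0:
        return "white"   # no change in this revision
    if added > deleted:
        return "green"   # more words added than deleted
    if deleted > added:
        return "red"     # more words deleted than added
    return "yellow"      # assumption: equal counts treated as balanced
```

Running this for every paragraph (row) against every revision (column) would give the grid; the row and column totals could be summed from the same `added` and `deleted` counts.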
This helps students answer – which sections had most and least action, when major edits happened, whether students worked sequentially or in parallel, and how many authors worked on each paragraph/section.
Critical point – deleting something someone else has written. Can be a sign of trouble, or of a very agreeable writing team.
Topic evolution chart. Depicting how topics evolved over time. Topic is a cluster of words that frequently co-occur in a revision. Used Latent Dirichlet Allocation, DiffLDA for writing processes – for extracting topic evolution within a document. Bayesian and Markov models, not trivial technically. It’s not in an archive of different documents over time, but revisions within one document, requires a few changes in the processing.
Get a chart showing relative weight of the topics over time. Nice graphics. Can see when topics appear, change importance.
Topic-based Collaboration Networks – who contributes to the topics. DiffATM. Node is a student, connection/link showing that they wrote about same topics. Can quickly see sparse networks – which nonetheless got high grades.
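[A rough sketch of the network-building step only, assuming per-student topic weights are already available from some author-topic model. It does not implement DiffATM. The function name, the input shape and the `min_weight` cut-off are all hypothetical choices for illustration.]

```python
from itertools import combinations

def collaboration_edges(author_topics, min_weight=0.2):
    """Link two students if they share at least one topic at or above min_weight.

    author_topics maps a student ID to a dict of {topic: weight}.
    Returns {(student_a, student_b): set_of_shared_topics}.
    """
    # Keep only the topics each student wrote about substantially.
    strong = {
        author: {topic for topic, weight in weights.items() if weight >= min_weight}
        for author, weights in author_topics.items()
    }
    edges = {}
    for a, b in combinations(sorted(strong), 2):
        shared = strong[a] & strong[b]
        if shared:
            edges[(a, b)] = shared
    return edges
```

A sparse network in this sketch is simply one where few pairs of students end up with an edge.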
[The revision map is rich, but takes some expertise to interpret.]
Have used these on small corpus, but the techniques are stable on larger corpuses. Not just in the background – using R, but want to automatise them to serve as continuous feedback or at least on weekly level.
Q LDA is currently the method for analysing textual artefacts. We’ve done that, found one problem. These are topics, they’re keyword clouds. Sometimes the granularity is not fine enough to see the changes. Changes below the level of these clouds that LDA gives you.
I cannot say the details. The sensitivity, if there’s not a lot of revisions, it’s an issue. Had 1000 iterations to identify the prior probabilities, ran it 200 times in Markov model for practical reasons. Sure there’s things we’re missing. Used for indicators.
Q If you want novelty indicator. Also looking at dynamics. This may not be so good.
Technical questions, address in the proceedings for first author who can answer those.
Learning Analytics for Online Discussions: A Pedagogical Model for Intervention with Embedded and Extracted Analytics
Alyssa Friend Wise, Yuting Zhao, Simone Nicole Hausknecht
Not universal LAs, but particular for a single context – learning through collaborative discussions online.
Learning context – online discussion. Data type – process data based on clickstream. Timeline – in-process learning events, short cycles of feedback. Interpretation/action – instructors and learners making local pedagogical decisions. Very human-centric model.
Three challenges. Capturing meaningful traces, presenting it meaningfully, supporting interpretation. Model of online speaking and listening, embedded and extracted analytics, pedagogical model of intervention.
Need a learning model, what we think is going on. We’ve always had lots of data. Model focuses on how students contribute and attend to the messages of others. Lot of other work looking at the messages, but less on ‘online listening’ to others’ messages.
Social constructivist perspective; two basic processes – speaking (Externalising your ideas), listening (taking in the externalisations of others). A metaphor!
Students have high control over timeline of engagement online. Can skim comments quickly, decide what to read and which order. Could be reflective. But students have trouble with this, managing their time, especially when discussions are prolific. So want to help learners to actively monitor and regulate how they attend to others. If nobody’s listening, it’s just a bunch of people shouting in a room.
Speaking is mechanism for sharing ideas with others – recurring, responsive, rationaled, temporally distributed, moderate proportioned. Speaking is visible, but not all qualities are salient. Huge work done on post quality – textual analysis, complex to assess.
Listening – it’s the invisible part of learning through online discussions. Value in listening that’s broad, integrated, reflective. Early research suggested they’re really bad at it, but recent stuff says they have different strategies – coverage (read everything), interactive, self-focused (only their own stuff and replies), targeted (look at e.g. people they know have good ideas). Different students do different things; give them pointers so that they can better self-regulate.
Metrics – many! Range, number of sessions, percent of posts read, others. Percent of posts read – Listening Breadth – taking this as a key. Notion of a read is important.
Data processing – MySQL query merging log and post tables, Excel VBA macros to clean it up. Reading and scanning categorisation – view actions based on maximum reading speed of 6.5 wps (=about 400wpm), read if spent the time, scan if less.
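[A minimal sketch of the read/scan rule as described, not the authors' MySQL/VBA pipeline. The 6.5 words per second threshold is from the talk; the function names, the tuple layout of a view, and treating a view of exactly the minimum time as a read are assumptions. Which posts go in the denominator of percent of posts read was not stated, so `total_posts` is left to the caller.]

```python
MAX_READING_SPEED_WPS = 6.5  # words per second, threshold quoted in the talk

def classify_view(word_count: int, seconds_open: float) -> str:
    """Return 'read' if the post was open long enough to read at max speed, else 'scan'."""
    min_seconds = word_count / MAX_READING_SPEED_WPS
    return "read" if seconds_open >= min_seconds else "scan"

def percent_posts_read(views, total_posts: int) -> float:
    """Percent of distinct posts with at least one 'read' view.

    views is an iterable of (post_id, word_count, seconds_open) tuples.
    """
    read_ids = {
        post_id
        for post_id, word_count, seconds_open in views
        if classify_view(word_count, seconds_open) == "read"
    }
    return 100.0 * len(read_ids) / total_posts if total_posts else 0.0
```

Counting distinct posts means re-reading the same post does not inflate the percentage, and clicking quickly through everything yields scans rather than reads.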
Simple table format as visualisation! Had lots of amazing, brilliant things … that the students couldn’t understand. Not very innovative, but very effective. Your data compared to class average.
Extracted vs embedded analytics. This table is an extracted analytic – pulled out and presented back. It’s separate from the learning activity. Want to feed back within the activity: embedded analytics. In the discussion interface – alter it to reflect that info. Used Visual Discussion Forum – can see your viewed/unviewed posts, own posts in light blue.
The spider-diagram/starburst type vis is actually the interface. Yellow starting post, with discussions spinning off from those. Can see what you’ve read, and what you haven’t, and in what parts spatially you’re contributing. Students are asked to take collective responsibility for the discussion. The red/blue colour for viewing, will be different from % posts read metric – if you just click through, you don’t get high % read, get different responses from students.
Supporting Interpretation. How do we make those visualisations integrated, part of the pedagogy, very actionable by students and teachers. Danger of rigidity of interpretation, lack of transparency about process, optimising to only that which you can measure, possibly impeding metacognitive self-development.
Six principles as a pedagogical framework for learning analytics intervention.
- Integration with the learning activity. Part of what’s going on, the goals.
- Diversity of metrics based on learning model, multiple pathways
- Agency in interpreting meaning
- Reflection happens in explicit space and time
- Dialogue between students in interpretation
- Parity between instructor and students.
Trial context: Blended doctoral seminar with 9 students. 10 week online discussion, reflective journal with embedded analytics; extracted analytics added for weeks 5-10. Guidelines for participation, facilitation, analytics, given out up front. Interviews after course end.
Integration: Connect the purpose of the activity with expectation and how analytics provide indicators. e.g. ‘Try to read as many posts as possible to consider everyone’s ideas …’ and guidelines on interpreting the metrics. High student buy in to guidelines/metrics. Hard to distinguish these, because they thought of them together. When they interpreted metrics (in their reflective journals) it was in terms of the guidelines.
Diversity: Students found different metrics valuable, multiple pathways. Talked about working on different aspects. Highlighted lack of listening by some of the vociferous speakers, honoured the efforts of others. In the numbers and the reflective journals. Self-critique. Trust of the numbers important, calculation choice became important – range of participation, was exclusive not inclusive – if you’d gone in two days in a row it took a day away. Made them question the numbers.
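[A tiny sketch of the calculation choice at issue, as I understood it: counting the range of participation in days either inclusively or exclusively of the endpoints. The function name and signature are hypothetical; the actual formula used in the study was not shown.]

```python
from datetime import date

def participation_range(first_day: date, last_day: date, inclusive: bool = True) -> int:
    """Number of days spanned between first and last visit to the discussion."""
    span = (last_day - first_day).days
    # Exclusive counting reports two consecutive days as 1; inclusive reports 2.
    return span + 1 if inclusive else span
```

With exclusive counting, a student who went in on two days in a row sees a range of one day, which is the kind of result that made students question the numbers.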
Agency: Presented as starting points for discussion; class average gives reference points; set personal goals. Students found goal-setting valuable, multiple strategies. Validation, but surprises too, more interesting. Emotional reactions, e.g. clicked on everything but very quick. Hard data contrasting with their perceptions. No major ‘big brother’ issues. Involuntary propensity to target the average.
Reflection: Dual danger of omnipresent analytics – any time means never, distract from engagement. Rhythm for reflection, in-class time for it. It happened. In a single space, they talked about going back and forth. High self-awareness of meeting goals (or not). Did sometimes need support in making change.
Dialogue: Online wiki reflections were private but teacher could view. T made comments in first week as initial support. Started a conversation that continued, was really important. Audience for reflection. Students liked negotiation and contextualisation – personal explanations of choices, strategies, struggles. Instructor really helpful about how to change.
Parity: Turned out not so important. Instructor’s reflection was useful for initial model, but didn’t attend to it later. Students were Ok with having the instructor overseeing.
Future plans – tool development.
Q How did the analytics affect their behaviour towards the assessment?
Haven’t done that analysis yet. Next step, see did their interactions change, related to the assessment?
Q Was it stated that the software would use that?
We explicitly stated the purpose was learning how to participate in dialogue, analytics to help them improve not for grading. Grading on more standard and overall. Reflection chance to talk about it.
Caroline Haythornthwaite: How do you know the intervention, your implementation has anything to do with what’s going on? The messiness of the class makes it hard to extract the tool effect.
Yes, that’s hard. In first 5 weeks did reflective journal, had interface, but not extractive chart until second half. Hoping we can see there is or isn’t some difference in behaviour.
Emily: This is a lot of dialogue in the classroom. Online not the opportunity for that, how would it transfer?
Want to do this, it’s issue for scalability. There was no f2f dialogue, it was all in the wiki. All asynchronous, turned out to be important. Worked with similar assignment in a different class. Can have larger reflections, more like an assignment. Or peer support. Or reflective dialogue doesn’t happen so often. The audience was important. Maybe could be less frequent. That’s a challenge we’re looking at next.
Q Potential problem – counting number of posts and number of words, give idea that more is better. Especially in a grad class, teaching them to be concise, quickly and clearly making points, that’s a goal.
Two things. One, there’s better measures of dialogue we could integrate. Two, we were explicit in the guidelines that more is not always better. The pressure was to be more concise and short. Dialogue was not doing more, but being more directed.
Q You were explicit about it as a pedagogy – don’t make the wrong assumption from these numbers.
Guidelines said here’s the metric, here’s the interpretation, here’s how to use it. For length, need certain amount, but think shorter and invitation to respond.
Understanding Promotions in a case study of student blogging
Bjorn Gunnarsson, Richard Alterman
Different context, same problems. This is in a blogging environment. The blog posts are homework. Everyone can read it. Post it before the due date, all can read and learn from it.
Added feature – can like someone’s homework assignment (=promote it). Different types – badges – e.g. nicely written. (=promotions) Want to determine if high-quality content is the promoted content.
Class on Internet and Society, four books, two posts on each book. Students have own blog, read others (aggregated in reverse chronological order, with the promotions shown), template for blog posts, and feedback. Assignments go in a cycle – do editorial on first book, editorial on second book – can adapt to how it was set up and improve. They use the data in their blog post for their term project. They also need to do it in a set of reflective posts that bring together the ideas from all the books.
Case study – promotions. They require just one click. Different from comments where you put thoughts into words, heavy cognitive load. Incentive to create these badges, they show up in the view of the blog posts. ‘Like’ as a general overarching one. Do people promote? Yes, they did. Do they act on the promotions from others? Yes – more promotions a post gets, the more reads (monotonic increase of reads with number of promotions). Are the promoted posts the quality ones?
Case study – technology. At Brandeis University. Transition of the technology. Used the format, with viewpoint with comments. Added the social features. Transition from collaborative blogging to social blogging.
Grading – simple grading scheme – 0 not completed, 3 exceeds expectations. TAs graded each post. Questionnaire – six questions on this 0-3 scheme. Less prone to grading errors or miscommunication. Specialist grading view, not influenced by likes/comments. Also peer reviews – part of questionnaire answered by student themselves. We had a lot of students, a lot of work to grade.
From 10 assignments (last ‘was an outlier’), 92 students, 15 filtered out for non-participation (some dropped the class).
They actively promote posts as they read. They promote high quality. Students are reliable – some better than others: some reliably promote poor materials (!), they get better over the semester. Promotions useful as highlighting mechanism, preliminary assessments for grades – grader just verifies community verdict.
Promotion feature used by 90% at least once. Average of 6.77 posts promoted. There’s a decline in promotions over the semester – fewer students do promotions as time goes on.
Higher quality posts got more promotions; more likely to get promoted. All types of promotions were useful. [Slightly confusing data presentation of these results] Average number of promotions was higher for higher grades – R^2 0.6, p <0.002. General ‘likes’ were the highest correlated with average grade.
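[For readers unsure what the R^2 figure measures: a minimal sketch of R^2 for a simple linear relationship between two variables, e.g. grade and average number of promotions. This is the generic textbook calculation, not the authors' analysis, and which variables and aggregation they used was not clear from the slides. The function name is hypothetical.]

```python
def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R^2 of a simple linear fit of ys on xs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when either variable is constant")
    return (sxy * sxy) / (sxx * syy)
```

An R^2 of 0.6 would mean roughly 60% of the variance in one variable is accounted for by a straight-line relationship with the other.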
Feedback to students. Grade of promoted posts vs student grade; one student promoted only posts that got the highest grades, but only got an average grade themself. Doesn’t look like there’s much of a correlation here.
These can be useful, for highlighting quality posts.
Martyn Cooper: How do you handle the timing issue? The timing of the blog post and the promotion by the student.
Once assignment out, they can go and edit it again, and any time you can ‘like’ what’s already in the blogosphere. Spikes of liking are always just after the assignments were due.
Chair: In this social network, sometimes the initial likes have a disproportionate effect, sometimes they’re just random preferential attachment. Did you talk about random variations at the beginning skewing the results.
There was a blank post that got lots of badges. But we didn’t filter that out, and still the data is Ok. We didn’t have to take that into consideration.
Q: Take any precautions to avoid the bandwagon effect? Advertising and asking students to like them. In social networks, that’s very common.
No, nothing like that. We did not see that happen. It was completely organic. We might have wanted more people to do promotions. Some only used it once during the semester. Did not see that effect.
Naomi Jeffery: Am I right that you judged quality of the post purely by the grade? Or did you have other measures?
Measured by the graders in the grade received. Graders and instructor met and did a few together to get consensus. Same scheme as editorial review. On second term of grading, were in communication and agreement. With course grading scheme it was difficult to get from a 2 to a 3, it was clear what category it fell into.
Naomi Jeffery: Could it be that the correlation is because students have a good understanding of what’s going to get them a good grade?
Sure, of course. That’s a part of our quality standard. Measure how clearly they state their issue, etc. There’s also an overall grade.
This work by Doug Clow is copyright but licenced under a Creative Commons BY Licence. No further permission needed to reuse or remix (with attribution), but it’s nice to be notified if you do use it.