Liveblog notes from Wednesday morning at LAK16 [full list of blogging]: Session 1D Analytics Visualisations and Dashboards. Stephanie Teasley chairing.
1D1 Semantic Visual Analytics for Today’s Programming Courses
(short paper) by Sharon Hsiao, Sesha Kumar Pandhalkudi Govindarajan and Yiling Lin; Arizona State University, National Sun Yat-sen University
Sharon I-Han Hsiao presenting.
Work her team is currently doing. Outline: problem statement, related work, research platform, preliminary findings.
Arizona State U is one of the largest in the US. 19k undergrads, 3,900 in CS. Also “#1 for innovation”
The problem: the majority of programming classes are blended – f2f lectures in class, supported by online tools, ITS, SAQs, SMS. It's the primary, preferred way of delivery: more flexibility for teachers, easy to proctor exams and prevent academic dishonesty.
Lower-division programming courses: it's hard to give feedback on paper exams. You can highlight where students have misconceptions, the main things missing, but not necessarily the main concept. The more traditional blended classrooms [have this problem], so they can't use advanced learning analytics unless they're online.
Related work – orchestration and LA (many examples), visual LA and student modelling. Critiques of smart tech before we can use it: cost, but also teachers having to learn to use the tools and manipulate dashboards during class. We want to provide practical solutions for our current majority blended classrooms, and provide visual learning analytics.
Goal: to bridge the gap between lecture-based classrooms and ideal tech-enabled ones via Visual Analytics. Specifically, solution for teachers to continue paper-based exams, and be able to use advanced LA. The bigger picture is to be able to provide personalised feedback.
Took a semantic approach. Want feedback that addresses the concept you don't have. The architecture is very big; the challenge is to capture and deliver it. Developed an analytics API to capture whatever we can from paper exams, and to integrate outside tools.
Shows the tool, EduAnalysis. Looks like a course management tool. When you create an exam, choose the format of your exam and we'll extract your questions. First thing you see is an overview of your exam: the distribution of higher-level concepts and which questions they correspond to. Can see if there's too much emphasis on one concept or group of concepts. Then you can see the questions extracted from the exam, the content. It's normally combined with natural language. It's actually an authoring interface: if it's a mid-term and you want to emphasise a concept, you can adjust the weight to put the focus on it.
Teacher's dashboard view, after students have finished the exam and it's graded. How do we input the grades? Two methods. Initially, created a TA interface to look at each question and use red-pen grading. That's really tedious. (!) Then designed a new approach.
Student’s view. They see their progress, their knowledge, compared to the rest of the class, the average or the best. Can see which question you got wrong.
How do we do the automatic index? We use topic modelling, combining natural language and programming language to extract topics. Not just the code, but some natural language.
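[A rough sketch of the kind of topic-modelling step described here, assuming LDA over question text that mixes natural-language and code tokens; the actual model, preprocessing and ontology mapping in EduAnalysis aren't specified in the talk:]

```python
# Hedged sketch: LDA topic modelling over exam questions that mix natural
# language with code fragments. Assumes scikit-learn; not the authors' code.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

questions = [
    "Write a for loop that prints the even numbers from 1 to 100.",
    "What does System.out.println(a.length) print for int[] a = {1, 2, 3}?",
    "Explain the difference between a while loop and a do-while loop.",
]

# Token pattern keeps code-ish tokens (a.length, do-while) alongside ordinary words.
vectorizer = CountVectorizer(token_pattern=r"[A-Za-z_][A-Za-z0-9_.\-]+",
                             stop_words="english")
X = vectorizer.fit_transform(questions)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)          # question-by-topic distribution

terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top_terms = [terms[i] for i in weights.argsort()[-5:][::-1]]
    print(f"topic {k}: {top_terms}")       # candidate concepts, to map onto the Java ontology
```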
The goal of this dashboard is to bridge physical and cyber programming. Collect students' performance at the semantic level; question/concept mapping for feedback to students.
The delivery, the capture is hard, but we have a solution. We use a mobile device to capture their written answer, combined with a QR code on the exam script. We can know what concepts are related, TA configures the concepts (?). Can systematically calculate partial credit.
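[A hedged sketch of the "systematically calculate partial credit" idea: the talk says credit comes from the concepts a question is indexed to, but the exact formula isn't given, so a weighted proportion is assumed here:]

```python
# Hedged sketch of systematic partial credit: the exact formula used by the
# tool isn't given in the talk, so a weighted proportion of demonstrated
# concepts is assumed.
def partial_credit(concept_weights, demonstrated, max_points):
    """concept_weights: {concept: weight} configured by the TA/teacher.
    demonstrated: set of concepts the grader marked as shown in the answer."""
    total = sum(concept_weights.values())
    earned = sum(w for c, w in concept_weights.items() if c in demonstrated)
    return max_points * earned / total if total else 0.0

# e.g. a code-writing question indexed to three concepts with unequal weights
weights = {"loops": 2.0, "arrays": 1.0, "conditionals": 1.0}
print(partial_credit(weights, {"loops", "conditionals"}, max_points=10))  # 7.5
```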
Human graders are inconsistent. Especially on the code-writing questions.
Evaluation: we need to ensure that auto-indexing actually works and indexes the correct concepts. Hired two experts, used their exams – 4 of them with 76 questions – and asked them to index the concepts from a provided Java ontology. If we split the questions by complexity, auto-indexing was significantly higher than the experts. Are they the key concepts? Split by procedural vs declarative, e.g. MCQs vs code-writing questions, whether you need to apply the knowledge: auto-indexing significantly higher too. Subjective feedback from the experts: they thought it was handy, but worried about the precision (!).
Not reported yet: whether we cover as many concepts, and how many of the key ones. Just finished collecting data from MTurk with a better expert-indexing setup. Want to explore how students use the feedback.
Questions
Roberto: Important problem that we don’t tackle, bridging f2f in a blended scenario. Wondering, why is this happening that we focus on online, not the f2f part? Nice to see your approach. What are your thoughts for the future? What possible solutions besides yours?
Sharon: On Monday we had the pre-conference event, Cross-LAK; there's a lot of interest in trying to address this. Lots of activity outside, online, in class. The rationale is simply trying to provide help with what we know is working. We're not able to provide personalised feedback. Ken's closing keynote – it's data-driven interactive engineering. We can model, know better.
Q: Looks like mapping the concepts to questions requires front-loaded activity for the faculty to do it. Any thoughts about the cost, and do you anticipate faculty resistance to that level of work?
Sharon: It's going to be work, and teachers tend not to do that. If you use a smart classroom, they have to do it anyway. That's why we provide the automatic indexing, to ensure accuracy. This could be learned through iterations of training: an algorithm learns from teachers configuring those weights, so the algorithm will be smarter next time.
1D2 The Role of Achievement Goal Orientations When Studying Effect of Learning Analytics Visualizations
(full paper)
Sanam talking. LA tools have been focused on institutions, supporting decision-making to guide students' learning process and retention. There's an opportunity to provide analytics from the student perspective, to improve their learning as self-regulated learners. One way is via visualisations. Starting from awareness, to the opportunity to reflect, a step further to sense-making, and finally an actual impact in terms of change of behaviour or learning. Present trace data to students, e.g. time spent, artifacts, social interactions. See the influence of LA viz on students. Different methodologies: self-assessment, lab settings, course settings.
Individuals differ (says ed psych): a particular intervention or tool that's useful for a particular student in a particular context may not be useful for others. The hypothesis is that this is true for LA viz too. Epistemic beliefs, goal orientation. So the aim is to investigate the effect of different information on the learning behaviour of students with individual differences.
Theoretical construct: Achievement Goal Orientation (AGO, Elliot et al 2004). Two dimensions – definition: Task, Self, Other; valence: approach or avoidance. 3×2 self-report instrument.
Study design: three LA viz; study the effect on posting behaviour in online discussions, controlling for differences in achievement goal orientation. In the LMS at SFU. Students randomly assigned access to only one viz. Posting behaviour: quantity and quality. Individual differences of students measured using the AGO instrument, linked to trace data from the LMS. Instructional scaffolds to guide participation. Quality operationalised through Coh-Metrix analysis. Hierarchical Linear Mixed Models.
RQ1 – Is there assoc btw viz and quantity of posts, controlled for AGO? RQ2, similarly for quality.
Each viz was different, aligning with different goals. The contribution is not about designing LA viz that are theoretically grounded, but a step towards studies that understand individual differences.
First viz, class average, compare your posting performance with average number posted by the class. Typical in many existing systems. Can have negative effects on self-efficacy. We used because common.
Second viz, top contributors compared to yours. Includes pictures and names next to contribution level.
Third viz: focus on the content rather than quantity – how many key concepts students covered in the messages they posted. Measured quality using LSA/NLP to assess the coherence of the text. Colour-coded, with darker shades more coherent on certain keywords. The instructor selected the keywords.
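[A rough sketch of scoring posts against instructor-selected keywords; the talk describes an LSA-based coherence measure, so the TF-IDF cosine similarity below is a simpler stand-in for the idea, not their method:]

```python
# Rough sketch of scoring a post's coverage of instructor-selected keywords.
# The viz described uses LSA; as a stand-in this uses TF-IDF cosine similarity,
# simpler than LSA but illustrating the idea of shading posts darker when they
# are closer to the target concepts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

instructor_keywords = "scaffolding self-regulation feedback metacognition"  # hypothetical
posts = [
    "I think timely feedback supports self-regulation and metacognition.",
    "Agree with the previous post, see you at the study session!",
]

vec = TfidfVectorizer()
matrix = vec.fit_transform([instructor_keywords] + posts)
scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()
for post, score in zip(posts, scores):
    print(f"{score:.2f}  {post}")   # higher score -> darker shade in the viz
```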
Experiment: online group discussion activity based on guidelines in collaborative learning lit. 4-11 in a group, 7-14 days, open-ended questions. Role assignment used, clearly defined marking criteria. Six discussions, 4 undergrad courses, Spring Summer 2015. Not all students contributed. N = 58, 55, 56. Using viz not compulsory. Viz users visited more than once, N=38, 22, 32.
Quality assessed using Coh-Metrix Analysis.
Analysis: Hierarchical Linear Mixed Models, evaluating fixed effects vs random effects. For RQ1: fixed effects – viz type (independent variable) × AGO scale scores (covariate); random effect – student within a course; dependent variable – count of posts. Null model R² 0.70, fixed model 0.91. Investigating deeper, the interaction effects between AGO scales and viz: the only statistically significant finding was for Other-Approach – the class average viz reduced the count of posts, while top contributors and quality increased it.
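[A minimal sketch of what an RQ1-style model might look like in statsmodels; the data and column names below are hypothetical stand-ins, and the paper's exact specification may differ:]

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in data (hypothetical columns, not the paper's dataset):
# 40 students across 4 courses, 3 discussions each.
rng = np.random.default_rng(0)
students = pd.DataFrame({
    "student": [f"s{i}" for i in range(40)],
    "course": rng.choice(["c1", "c2", "c3", "c4"], 40),
    "viz_type": rng.choice(["class_avg", "top_contrib", "quality"], 40),
    "other_approach": rng.normal(4, 1, 40),        # one AGO scale score per student
})
df = students.loc[students.index.repeat(3)].reset_index(drop=True)   # 3 discussions each
df["post_count"] = rng.poisson(5, len(df))

# Fixed effects: viz type x AGO scale; random effect: student within course;
# outcome: count of posts (as in the RQ1 description above).
model = smf.mixedlm(
    "post_count ~ C(viz_type) * other_approach",
    data=df,
    groups="course",
    vc_formula={"student": "0 + C(student)"},
)
result = model.fit()
print(result.summary())
```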
RQ2: same model, but the dependent variables are the Coh-Metrix components. Five models; for all of them, the fixed model was a significantly better fit than the null. Found effects for Self-Avoidance, Task-Avoidance, Task-Approach. Self-Avoidance: positive association for the Quality viz, but negative for the other vizs. Task-Avoidance: the completely opposite association – negative for Quality, but positive for class average and top contributors. For avoidance orientations, providing a tool would help, but it's tricky. Task-Approach: again the results are different.
Students actually used the vizs, and it affected their behaviour. [Really? Causation?]
Highlights importance of conducting empirical studies in authentic courses. Fine-grained data with self-report instruments.
You should all totally consider individual differences in LA. More research required.
Future work on study approaches and other theoretical constructs. Look at role of instructional scaffolding. Pedagogical intervention [make ’em use ’em!]. Link to other traits.
Questions
Q: Is there a change in what they viewed during the discussion? More later on, or earlier, or no difference?
Looked into that slightly. Could see a spike at the beginning, then less, then it picks up again. Closer to the due date they do more; they're just interested at the start.
Q: About the AGO. It’s self-reported. Did you consider ways to estimate it, from the data?
There are papers showing how you can estimate AGO based on traces of data on how they have done. Don’t know the state of that, but I know that’s a direction.
Q: Transparency of LA solutions to students. Picked up on using the class or cohort average being potentially demotivating. Is it a metric we should hold back from giving? Is it dangerous?
Our thought was that it would have a negative impact in many cases, but we don't see that in our preliminary results. In some cases here or there, there's potential for a +ve effect from it. The answer is: I don't think we necessarily have to hold back, but we have to be careful.
Q: Particular learning behaviours, you have to be aware of that?
Generally, students with performance orientations, they tend to either be very motivated to use it and disengaged from the others, or in between.
Q: Are you thinking longer term that a system might automatically choose which viz, or that students choose their own?
Looking at a complex construct like this, if we gave this choice to the students, it would be hard for them to assess. We envision having the system automatically choose that for you. Could be adaptive, changing over time. There are still good ways to give this opportunity to students, but not based on ed psych theories.
Stephanie: For students who have performance orientation, a mastery orientation- was that considered?
The initial 2×2 model of achievement goals is mastery/performance, but mastery is mapped into task and self. Would expect the quality viz would work for that type, but also top contributors.
1D3 The NTU Student Dashboard: Implementing a whole institution learning analytics platform to improve student engagement
(practitioner presentation)
Fewer statistics because it’s a practitioner presentation. NTU is a large UK univ, 28k students. Also partnership, ERASMUS projects with TUDelft, Leiden, KU Leuven.
Brief questions: Why did we do this? What is it? How have we managed the development? Has it led to transformational change to experience of students and staff yet?
Why interested in the dashboard?
UK project, What Works Retention & Success: what are the factors that help students stay? Attainment, Belonging, Progression. Difficulty landing that into institutional structures. Internal audit: quite good at retention – not for all groups (poor, male) but on the whole good – but we don't share the data with staff. A tutor has to ask an administrator for access; not simple. IS dept talked to the sector: what can LA do? Companies pitched; went with DTP Solution Path. Three things to look at: Progression – helping students to see they may be less engaged. Belonging – enabling staff to have more info about their students; with large cohorts, build student-tutor relationships. Attainment – show good students they're doing well, motivational benefits.
At its core, it monitors engagement with the course, mostly electronically. It's not a risk score: not saying you are at risk, but saying how you are engaging with things we can measure. Cognitive, affective, time on task – the latter is all we can measure. Card swipes, library use, VLE login and submission, electronic resource use, attendance. Academics are very focused on attendance.
High, Good, Partial, Low, Not fully enrolled (never completed enrolment, or withdrawn).
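[The weights and cutoffs behind these bands aren't given in the talk; a sketch of the shape of such a banding rule, with purely illustrative numbers:]

```python
# Sketch of turning a weekly engagement score (card swipes, library use,
# VLE logins, attendance, etc.) into the dashboard's bands. The actual
# weights and cutoffs are not given in the talk; numbers here are
# illustrative only.
BANDS = [(75, "High"), (50, "Good"), (25, "Partial"), (0, "Low")]

def engagement_band(score, enrolled=True):
    if not enrolled:
        return "Not fully enrolled"
    for cutoff, label in BANDS:
        if score >= cutoff:
            return label
    return "Low"

print(engagement_band(82))            # High
print(engagement_band(30))            # Partial
print(engagement_band(90, False))     # Not fully enrolled
```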
Students and staff see exactly the same. Management screens mean staff can see more students, but the data is the same. Tutors can make notes. Tutors get alert if no engagement for a fortnight in term time.
What is it and how does it work?
It’s a dashboard. Ring doughnut. At individual level, can look at engagement compared to course average. Cumulative graph gives a good overall picture. But if you fall behind, will take a long time to get back. So also a spikier week-by-week one. Student profile – can see your most recent engagement.
Developmental journey?
First involvement in a large IT project, has been a great joy and a pleasure for me. [laughter] Small group, IS dept, learning and teaching, student input, student support, study support. Few months to a pilot, 4 courses, thought had enough to be useful. We went from a small, agile team to a proper governance structure.
Ops group, ethics group, university systems group, informal student group. Then dashboard governance group. Academic standards & quality committee, then up. Can we link to data e.g. if student is expected not to come in (e.g. health issues) without disclosing that? Have missed out multiple committees that have an input.
Is it accurate, has it changed the world yet?
Take engagement. Map of 13/14 year, every student. 56.7% satisfactory engagement. Included weekends and holidays. PT students more likely to have low engagement.
Is there a relationship between engagement on the dashboard and progression? From the bar charts, it looks very much like it. Of the low engagement group, only about 1/4 progressed straightforwardly; better for satisfactory, around 84%; for good and high it's well over 90%. So there is an association between what they do and how they progress.
What happens over time – if your engagement is low on average, less likely to progress, not so much a problem initially, but gets worse. Longer your engagement is low the greater the risk.
Final year students, engagement vs degree classification. Low, 41% got a good degree; high it’s more like 81%. Prior quals more relevant for final degree classification.
Students with some behaviours do well, others don't. How do we change that? Surveyed students, 1st years: what do you do when you use the dashboard? They report changes to behaviour, and those measures are going up – though it's a biased response. Asked if tutors are using it in 1-2-1 mtgs – only 8% said yes! Will use this in staff development.
Students use it more. Asked staff about usefulness and frequency of use; clear link – if they use it a lot, they find it useful. [I think this may go the other way causally: if you think it's useful, you'll use it a lot.]
Lessons learned from Dashboard Implementation
There are parallels between LA and timetables: both are enabling tools. We get excited about LA, it's super-cool, it's new, we're focused on it. There probably isn't a parallel conference looking at timetabling software. LA is an enabler; I don't know how much it's a change agent in its own right.
The nub of the issue: we're looking at big picture stuff, but we're interested in change. Mousetrap (the game) as a metaphor: LA is turning the crank; the academic taking action is them diving into the bathtub. We need all sorts of things to enable that. Just knowing a student has a problem isn't that helpful; it's about what we do next.
Next work?
IT infrastructure, quality. New data sources – but not much. But the big stuff is integrating this in to institutional working practices.
So looking at what we do to lead to change, get staff to engage. Small pilot study, looking at aspects of use. Maybe not just alert to tutor, maybe student too. Use of notes pages, appointment booking. Staff development. Survey tools & tests to maybe modify it.
Finally, stuff we haven’t considered yet.
Questions
Q: Holy Grail of LA is to measure positive impact. Hinted at this, it’s difficult to do. How far are you from measuring +ve impact of your methods? Ultimately, retention. Can you come up with stat saying you’ve improved it?
We'll certainly have, by the end of the academic year, run those pilots through. One school is looking at how it identifies students, building a team spotting low engagement, not just one tutor. In the pilot year, two courses improved, two went down! The course size increased, so that could account for it. But no clear-cut answer. It's about the use of this; that'll take more time. Not there yet, but hope for answers.
Q: You have a feature sending a note to tutor saying please intervene – do you track whether they do?
Not yet. We're running at great speed. Consulted staff and students; students are less concerned – they're surprised this data exists. Staff are more concerned, think more deeply about the ethics. I've been accused of being Stalinist: that this isn't an LA tool but a staff management tool. This year, we will build in a count of alerts to the tutor. Build up to that data. Need to be quite careful: we're trying to win people over positively, not with a big stick. Staff don't have the space and time to do proper interventions. There's work to be done meshing with IT.
Stephanie: Track did they talk to the tutor, what did they do?
Q: We're doing a similar one (at Aston University). One model, a staggered approach, a super tutor; some success. On your VLE data: ours is not used consistently across programmes. How consistently is yours used?
There's a drive to have some consistency of use; it's pretty consistent now, but not perfect. Even in Art & Design, the same patterns are there even though usage is low. Time on task still works. We may never get to perfection, but it's good enough to be getting on with.
Q: You mentioned you surveyed students about how often they look – couldn't the dashboard tell you that? You also said you're not telling them about risk. Why?
Yes, we do have that data; we asked them in the survey to give context, and it's quicker and easier to get them thinking. Do we say "risk"? We've pushed much more the data about the association between high engagement and getting a good outcome. There's a subtlety about the average. Careful about a "low engagement = fail" message; they're skimming through. Focus on the positive message.
Q: Technical question. Is a time-spent variable included in the engagement calculations?
No, it’s just are they using it, did they log in?
Q: Not how long interacting, duration?
No. We talked around using it. The way students use the VLE, they have many tabs open, including the VLE; they dip in and do 20 things as well. So: are they using it at least reasonably frequently?
–
This work by Doug Clow is copyright but licenced under a Creative Commons BY Licence.
No further permission needed to reuse or remix (with attribution), but it’s nice to be notified if you do use it.