LAK14 Monday (1): Workshops

Liveblog from the first day of workshops at LAK14. I’ll be collecting the liveblogging as usual, here.

Computational approaches to connecting levels of analysis in networked learning communities

Ulrich Hoppe (Univ Duisburg-Essen) and Dan Suthers (Univ Hawaii)

Ulrich introduces himself, Dan and the day. Proceedings will be available on very soon. Best papers from the workshop will be invited to a special issue of IJ Learning Analytics.

Idea of spelling the analysis out through multiple levels – of agency, of aggregation of the learning group. The levels issue has been discussed in many other workshops. Data-intensive methods of analysis. Using it to inform decision-making in learning.

Themes emerging from today’s papers, identified by Ulrich.

  1. Learning analytics and CSCL:- theory-based interpretation of interaction data; from action logs to contingencies and networks.
  2. Learning analytics tools & methods for large online courses (incl MOOCs): measures of participation, contributions; indicators of transactivity, collaboration; transitions between large group and small group activities.
  3. Reification, sharing and re-use of analysis workflows: reproducible data analysis; workbenches for data analysis.
  4. Embedment/integration of analysis methods: data capturing and analysis from ubiquitous learning environments; embedment of analysis components in learning platforms.

… but the levels idea seems more in the background.

Dan Suthers and Nathan Dwyer: Multilevel Analysis of Uptake, Sessions, and Key Actors in a Socio-Technical Network

Dan talking. Uptake is the relationship between contributions; finding ‘sessions’; and key actors. The motivation is learning being a rich complex phenomenon, takes place in multiple times and places. Even if you have a narrow interest, it’s helpful to know the context of say e.g. a small group interacting. More how things are situated inside each other than levels. Interaction distributed across media, settings, times, places. Sometimes multiple log files that capture aspects of interaction, need to assemble in to a whole.

‘Traces’ approach. Abstract transcript, integrate. (Suthers, HICSS 2011; Suthers & Rosen LAK11).

Start with a logfile, abstract to some level of description – e.g. rectangle representing an act – e.g. writing, reading messages. Model as sequence of events. Find contingencies between them (used to say dependencies, but that’s too deterministic). Proximity in time, similarities, etc. That’s just evidence for uptake – taking an idea forward in some way. Collapse contingencies in to uptake. Network analysis techniques often require only one link between nodes, so that’s how they get there. Another model is changing ontology – so writing a message becomes a link, and the participants are nodes – forms the associogram. Can abstract that further by folding it down to actors and get traditional SNA (social network analysis).
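
The chain Dan describes (events, then contingencies, then uptake, then a folded actor network) can be sketched roughly as follows. This is an illustration only, not the Traces code: the event tuples, the `contingencies` list and the evidence labels are all invented for the example.

```python
from collections import defaultdict

# Hypothetical abstract transcript: (event_id, actor, text)
events = [
    (1, "ann", "Has anyone tried mentoring circles?"),
    (2, "bob", "Yes, mentoring circles worked for us"),
    (3, "cat", "What about assigned mentors?"),
    (4, "ann", "Assigned mentors felt forced"),
]

# Hypothetical contingencies: (later_event, earlier_event, kind of evidence).
# Several pieces of evidence may connect the same pair of events.
contingencies = [
    (2, 1, "temporal proximity"),
    (2, 1, "lexical overlap"),
    (3, 2, "temporal proximity"),
    (4, 3, "temporal proximity"),
    (4, 3, "lexical overlap"),
]


def collapse_to_uptake(contingencies):
    """Collapse all contingencies between two events into one weighted uptake link."""
    uptake = defaultdict(int)
    for later, earlier, _kind in contingencies:
        uptake[(later, earlier)] += 1
    return dict(uptake)


def fold_to_actors(uptake, events):
    """Fold event-level uptake into an actor-to-actor network (who takes up whom)."""
    actor_of = {event_id: actor for event_id, actor, _text in events}
    ties = defaultdict(int)
    for (later, earlier), weight in uptake.items():
        source, target = actor_of[later], actor_of[earlier]
        if source != target:  # self-uptake is ignored in this sketch
            ties[(source, target)] += weight
    return dict(ties)


uptake = collapse_to_uptake(contingencies)
actor_network = fold_to_actors(uptake, events)
print(uptake)
print(actor_network)
```

Weighting the uptake link by the number of contingencies is one possible choice; the talk only says that the contingencies are collapsed so that there is a single link between nodes.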

Two examples: Tapped In (shut down last year!). 1997-2013, network of educational professionals. 2y sample, 20k active accounts.

Community detection – Latour idea about social assembled from networks of associations between actants; Licoppe & Smoreda idea that choice of media reflects/reaffirms nature of social relationships – example of who you call immediately and who you email later when you have a baby. Use visualisation to see patterns. Then ‘community detection’ – more cohesive subclusters of the graph. Can partition it in to partitions. Amazing thing, network structure-driven split of partitions is reflected in real-world ‘communities’.

Here, automating discovery of distributed sessions and relationship between them. Start with logfiles, abstract transcript. Then spot contingencies, mostly only apply within a chat room – very dense clusters there – but some cross. Very complicated – millions of nodes and edges, so need computational techniques. Then modularity partitioning algorithm, now call out to iGraph in R to find clusters, leaves behind some of the links between sessions – can’t have more than one link. Gives uptake within and between sessions (interaction model). Then fold that to get just the actors, can do standard SNA.

All controlled by an XML schema about how this is combined – all the scripts are controlled like this, but would be more usable if was more graphical. A lot of code written in Java, and Hibernate for persistence, call out to iGraph, used Python and nltk for analysis, lots of stuff linked together.

Example session picked out by hand – online chat between teachers. Took three days of data from around it, not tell it anything, see how it’s situated in the larger context. Uptake graph – hugely dense network map. Coloured by room, can see one session across two rooms, or two sessions in one room. (Cool!) Collapse the session in to nodes, weight by in-degree. Using Gephi to explore. Found that hand-picked session as important by automatic methods. Did a pretty good job of detecting real sessions. Also found connections – people following people around from one chat to another. (?Not sure if have found ideas moving across, or just wants to.)

Takeaways: participant interaction is distributed; aggregate phenomena are produced by and the setting of specific interactional events.


Q: What are the features you used for clustering the sessions?

Dan: Not a feature-based approach, it’s a graph-analysis approach. Modularity partitioning.

Q: Which part of the graph, what are the attributes to use it ..

Ulrich: It’s not attributes of the nodes, it’s just the nodes and the links. You may have many networks.

Dan: Did it on event contingency network and other, both able to find clusters

Q: nltk for lexical analysis?

Dan: Overlapped two chats

Ulrich: That’s the construction of the networks. Once you have that, you use the network structure and nothing else.

Q2: Creating sequencing of interactions that become nodes in the next level of analysis?

Dan: Yes. Individual chat events, relations; they are collapsed in to a single link, then a graphical modularity partition – max amount of linkage within a partition compared to random. Interested in large system, how do you trace out people and ideas moving through the system.
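
The "linkage within a partition compared to random" idea is what the modularity score measures. Below is a minimal sketch of that score for an undirected, unweighted graph. It only evaluates a given partition and does not search for the best one, which is the part Dan hands to iGraph. The function name and the toy graph are assumptions for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q for an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to a community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends inside the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    # Observed share of internal edges minus the share expected at random.
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph: two triangles joined by a single bridge edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
by_triangle = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
all_together = {node: 0 for node in by_triangle}

print(modularity(edges, by_triangle))
print(modularity(edges, all_together))
```

Splitting the toy graph by triangle scores higher than putting everything in one partition, which is the sense in which a partition has more internal linkage than chance would give.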

Q3: Where do the ideas sit in these clusters, nodes?

Dan: Each node in initial map represents the contribution – incl the text. Can explore/mine it there. Can ask if there’s an idea, that goes through the threads, a recurring idea, could say the session is about the idea – except some care. First find linkages of interaction, next presentation is more about what they say, the concepts. I haven’t gotten to the content yet.

Agathe: Disconnections?

Dan: based on lexical stems – e.g. teacher teaching teachers – if there is overlap, put a link, weighted by how many.
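
A rough sketch of a lexical-overlap contingency of the kind Dan mentions: stem the words in two contributions and weight the link by the number of shared stems. The suffix-stripping stemmer and the stopword list here are crude stand-ins invented for the example; they are not what nltk or Dan's pipeline does.

```python
import re

SUFFIXES = ("ing", "ers", "er", "es", "s")  # crude and illustrative only
STOPWORDS = frozenset({"the", "a", "to", "of", "and", "is", "for"})


def crude_stem(word):
    """Strip one common English suffix. A stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stems(text):
    """Set of stems for the alphabetic words in a piece of text."""
    return {crude_stem(w) for w in re.findall(r"[a-z]+", text.lower())}


def overlap_weight(text_a, text_b):
    """Link weight = number of shared stems, ignoring stopwords. 0 means no link."""
    return len((stems(text_a) & stems(text_b)) - STOPWORDS)


print(stems("teacher teaching teachers"))
print(overlap_weight("The teacher is mentoring students",
                     "Teaching students needs mentors"))
```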

Ulrich: Technical bit, perhaps convey in a tutorial or so. Another type of network analysis technique, called block modelling. You didn’t use that?

Dan: Based on matrices, my stuff is too big for that.

Ulrich: Block is not necessarily a cohesive subgroup. Example of secretaries, all have a relationship to a boss, but are not a cohesive subgroup. That’s the point, big difference. Wanted to go for that, but method doesn’t do that?

Dan: Haven’t got my head round that.

Ulrich: May be cohesive structure, but not necessarily. Role analysis.

Q: Are you using different data?

Ulrich: No, the same dataset.

Dan: Finding overlapping clusters, the algorithm forces people to be in one place, which isn’t the case. Edge communities approach we’re experimenting with. Different community detection approaches may have different benefits.

Agathe Merceron: Connecting Analysis of Speech Acts and Performance Analysis

From Beuth Univ of Applied Sciences, Berlin.

This is an initial study, starting new.

Interested in forums, collaborative/networked learning, and relationship to performance: participation to the forum is useful/necessary but not marked. Will talk about initial case study, then context and data

Draws on Paredes and Chung (LAK2012), where participation in forum isn’t marked, but is important. N=36, online project management. Assignment, quiz, exam. General forum plus group work forum. Ego social network, also hand-coded ‘content richness’. Correlations found between different measures and performance and content richness. Assignment score, final exam, contribution index, content richness, quiz – negative correlation between amount of participation and assignment score -0.55. Quiz / final correlation 0.88. Correlation with content richness and assignment or quiz was about 0.3.

Ulrich: If use density in a growing network, will favour small groups, since density has a quadratic denominator in terms of number of nodes. The best density is in the smallest network. So it’s not a positive factor.

But that’s not the -0.55, that was with contribution index.
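
Ulrich's point about density can be checked with simple arithmetic. For an undirected network, density is the number of edges divided by n(n-1)/2, so if people keep roughly the same number of ties as the group grows, density falls. The "about 4 ties each" figure below is an assumption made purely for illustration.

```python
def density(n_nodes, n_edges):
    """Density of an undirected simple graph: edges present / edges possible."""
    possible = n_nodes * (n_nodes - 1) / 2
    return n_edges / possible


# Assumption: every participant keeps about 4 ties, so the edge count grows
# linearly (n * 4 / 2) while the denominator grows quadratically.
for n in (5, 10, 20, 40):
    print(n, round(density(n, n * 4 // 2), 3))
```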

Another paper – Lopez, Romero, Ventura & Luna (EDM 2012): predict pass/fail in the final exam from forum data, with accuracy of 0.894. N=114. Wasn’t online course. Used E/M clustering and N messages sent, N replies, N words written, centrality degree, degree of prestige, and forum participation mark manually given by course teacher (!).

Another – Khan, Clear & Sajadi (LAK2012) – N=163 and N=143. Looked at duration and others, couldn’t find it.

Grunwald et al, delfi 2013. MOOC, 10,000 enrolled, 1,000 obtained cert – of whom half did not post message. The more students posted messages, the better the mark in the certificate.

Kizilcec, Piech & Schneider (LAK2013) – MOOC analysis, K-means clustering to Completing, Auditing, Disengaging, Sampling. Completing students post significantly more messages than others.

My case study different. In a course, not MOOC, online course in Java Programming, compulsory part of the course. Blended learning. 4 cohorts from 2010 to 2013. 2 f2f periods, web conferences weekly. LMS, and help-forum, not mandatory but encouraged.

Quantitative results: posting is skewed – a few people post loads, the bulk post much less.

Q: Same course, same materials, same instructor? – but interesting differences.

Yes. One year, was one student who was struggling but had many many questions in the forum.

Numbers small in each cell – N=26-35 depending on year. Number who post in forum *and* pass final exam are <10.

Conditional probabilities of posting and attending final exam. (Missed what was said here.) Final exam marks of students who post or not – [don’t look very different to me, given the small numbers – also very complex box plots that are hard to read at a distance]. Maximum and minimum both come from students who posted. Looking at the average number of posts from top and bottom quartiles, and the average – jumps around all over the place over years. In 2010, 2011, bottom quartile of posts were least likely, but reversed in 2012 and 2013. What do they do?

It’s speech acts (Austin 1962). Categorised them, using categories from Kim, Li & Kim (NAACL HLT BEA 2012): question, issue, answer, positive acknowledgement, negative acknowledgement, reference (ques, iss, ans, ack-a, ack-n, ref). Reference is a hint or suggestion related to the subject and not answering any previous message (new category she added), because it was a help forum.

Top 25 and bottom 25 – bottom didn’t participate much in 2010. But did in 2011. [Numbers in each category vv small – like 6, 2, etc.] But in 2012 there was 20+. In 2013 too. [This is what distorted the means in the wacky graphs before.]

Top25 have more answers than bottom25. Balance of contributions different too – help seekers vs help givers – but some years are well balanced.

Q: Bit like Stack Overflow, could give useful indicators.

Dan: Threaded? [yes] They could be starting a thread or replying. Would be interesting to look at the speech act in the structural context. May get more nuance in categorising users if categorise their speech acts in context of what they were responding to. Flipping it, using uptake: for each person, see how many times people ask a question in response, raise an issue – that’d be in-degree. Degree divided up by speech act category. Maybe that would give more nuanced measure of their roles.
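
Dan's suggestion (in-degree divided up by speech act category) could look something like this sketch. The posts, the reply structure and the function name are invented; the category labels are the ones from Agathe's coding scheme.

```python
from collections import Counter, defaultdict

# Hypothetical hand-coded posts: post_id -> (author, speech_act)
posts = {
    1: ("ann", "ques"),
    2: ("bob", "ans"),
    3: ("cat", "iss"),
    4: ("ann", "ack-a"),
    5: ("dan", "ques"),
}

# Hypothetical thread structure: (replying_post, post_replied_to)
replies = [(2, 1), (3, 2), (4, 2), (5, 2)]


def uptake_profile(posts, replies):
    """For each author, count the speech acts other people direct at their posts."""
    profile = defaultdict(Counter)
    for reply_id, parent_id in replies:
        reply_author, reply_act = posts[reply_id]
        parent_author, _parent_act = posts[parent_id]
        if reply_author != parent_author:  # replies to oneself are skipped
            profile[parent_author][reply_act] += 1
    return profile


profile = uptake_profile(posts, replies)
for author, acts in profile.items():
    print(author, dict(acts), "in-degree:", sum(acts.values()))
```

The per-author total is the plain in-degree; the breakdown says what kind of response each person's posts tend to attract.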

Agathe: Yes. What I was missing in SNA is, when people are in relation together, what for? What I miss here is exactly this.

Cindy: Questions could be simple or complex, did you break that out? Would be interesting – high achievers asking more complex questions?

Yes, should look in to this. Also I saw other words where they take different speech acts. (?) Saw Teo & Johni (CSCL2013), found strong students benefit from being help-givers, more than newcomers benefit from participation. Also in MOOCs, philosophy is to encourage participation in the forum, most of the time is to help your fellow students. It’s nice to help, social recognition, but also good to say help, you’ll do better.

Cindy: in the collaborative lit, see this effect

Ulrich: Learning by teaching argument.


Q: I do the same analysis, call it uptake, same technical aggregations. From that experience and perspective, your approach makes sense to me (Dan). Substantial scale you’re working with. Agathe’s system is small, and by hand?

A: Yes, but because doing it initially. Will do ML on it later.

Q: Scale, data issues?

Dan: I was going to ask – are people making progress on categorising speech acts automatically?

Agathe: Kim, Li, Kim paper have 75% agreement on automatic detection. Some work behind.

Dan: That’s a hard problem. The category of an utterance can’t be determined in isolation. Would have to see how they do that.

Agathe: Start with a hand-annotated corpus, do the ML on that. Best results is when you have a big corpus, well annotated. Some work where you annotate them by clustering, but the accuracy is lower.

Dan: I could use that, if I could mark the speech act for each utterance, then when have the uptake graph, say this is in response to that, use that in the analysis.

Agathe: [missed]

Dan: Also ask, a given utterance, the input domain – how many nodes downstream ultimately point to that node. So one picked up by a single person is small, vs one that everyone picks up. But have to be careful – could just be ‘hello’ and everything follows on; vs really interesting question. What did it lead to in the session.
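
Counting how many later contributions ultimately point back to a given one is a reachability count over the uptake graph. A minimal sketch, assuming uptake is stored as (later, earlier) pairs; the function name and example edges are made up.

```python
from collections import defaultdict, deque


def downstream_count(uptake_edges, node):
    """Count contributions that directly or indirectly take up `node`.

    uptake_edges: iterable of (later, earlier) pairs, meaning `later` takes up `earlier`.
    """
    taken_up_by = defaultdict(list)
    for later, earlier in uptake_edges:
        taken_up_by[earlier].append(later)

    seen = set()
    queue = deque([node])
    while queue:  # breadth-first walk along "is taken up by" links
        current = queue.popleft()
        for follower in taken_up_by[current]:
            if follower not in seen:
                seen.add(follower)
                queue.append(follower)
    seen.discard(node)  # guard in case malformed data contains a cycle
    return len(seen)


example_edges = [(2, 1), (3, 1), (4, 2), (5, 4), (7, 6)]
print(downstream_count(example_edges, 1))
print(downstream_count(example_edges, 6))
```

As Dan warns, a large count says nothing about content: an opening ‘hello’ would score as highly as a question that really drove the session.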

Q2: Look at the content.

Ulrich: Claims you may also determine relevance of threads in an uptake graph, just by structural methods. Main path analysis, presented last year at LAK. You construct a flow through the network, and say threads that have maximum flow are the most relevant. May start with a trivial thing, but can see a thread that captures many of the references, others that are more peripheral. Comes from literature citation graphs, first study done with DNA literature, validated against the known relevance of that data. Linguistic techniques, lots of things you can do; but some things work in the absence of content analysis.
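
One common way to get such a "flow" is the search path count: weight each link by the number of source-to-sink paths that run through it, then follow the heaviest links. The sketch below does that on a small acyclic uptake graph. Whether the LAK paper used exactly this variant is not stated here, and the tie-break by smallest node id is an arbitrary choice made for the example.

```python
from collections import defaultdict
from functools import lru_cache


def main_path(edges):
    """Return (path, weights) using search path counts on a DAG.

    edges: list of (earlier, later) pairs; the graph must be acyclic.
    """
    successors = defaultdict(list)
    predecessors = defaultdict(list)
    nodes = set()
    for u, v in edges:
        successors[u].append(v)
        predecessors[v].append(u)
        nodes.update((u, v))

    @lru_cache(maxsize=None)
    def paths_from_sources(node):
        preds = predecessors.get(node, [])
        return 1 if not preds else sum(paths_from_sources(p) for p in preds)

    @lru_cache(maxsize=None)
    def paths_to_sinks(node):
        succs = successors.get(node, [])
        return 1 if not succs else sum(paths_to_sinks(s) for s in succs)

    # Weight of an edge = number of source-to-sink paths passing through it.
    weights = {(u, v): paths_from_sources(u) * paths_to_sinks(v) for u, v in edges}

    def best_next(node):
        # Heaviest outgoing edge; ties go to the smaller node id (arbitrary).
        return min(successors[node], key=lambda v: (-weights[(node, v)], v))

    sources = sorted(n for n in nodes if not predecessors.get(n))
    start = min(sources, key=lambda s: (-weights[(s, best_next(s))], s))
    path = [start]
    while successors.get(path[-1]):
        path.append(best_next(path[-1]))
    return path, weights


thread = [(1, 2), (2, 3), (3, 4), (4, 5), (2, 6), (3, 7), (7, 5), (5, 8)]
path, weights = main_path(thread)
print(path)
print(weights)
```

Ties are unavoidable with this greedy traversal wherever a contribution's followers each lead to a single ending, so a real analysis would need a considered tie-breaking rule or a global search over whole paths.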

Agathe: Sequences are interesting. Long threaded discussion, e.g. student who keeps not getting it because they have difficulties. So doesn’t mean […].

Dan: There’s different analysis objectives here. You’re mainly looking at predicting student success?

Agathe: Maybe to predict, but first to understand.

Dan: Level of analysis is the student. But maybe to look for interesting sessions. After the step where I identify sessions automatically, which ones have interesting structures – how do you find good conversations? Maybe speech acts would be good to add.

Agathe: How can we judge interesting, I ask myself.

Dan: Other objectives, how do ideas move through the socio-technical system. I’m doing too many things at once.

Agathe: Interesting because you learn something. Make correlation with the mark, but the mark is just the final exam, not all of what the students have learned. It’s sharp to make this connection.

Q4: [can’t hear – maybe about quantity of responses and learning]

Agathe: Would be interesting to connect the two. You looked at some dialogues, what speech acts would be interesting – question/answer, or what?

Dan: Would you use different speech acts in different environments? Yes, interesting. Tapped In was profl educators community. One should not oversimplify, it had courses in it. You’re studying a course setting, students, teacher, things to be learned, so might ask questions about learning a concept. But others in Tapped In were relatively peers; but also in-service teaching, student teachers giving mutual support. So not so much querying for information. More like, tell me about your experiences of mentoring in school. Facilitator more trying to kick off conversations – e.g. should mentors be assigned or discovered, people share experiences. May be a lot of the ‘ref’ category from your mapping. Training data for ML interesting here. Pepperdine University did (something). One small cluster I found was a sheep herder in Australia running an ESOL group. Don’t know how much you can organise a bridge. The graphical stuff is the same; but the rules for constructing the graph may be different. May get genre-specific.

Q5: Many different approaches to how you go about thinking, say, whether forum related to performance. Talked about binary approach they used in final exam, using network process. Why not just a logistic regression and figure out your variables? Why this approach vs a different type of approach? How did you decide your method of analysis?

Dan: Using speech acts, or something else?

Q5: Using that kind of, looking at automated approach using network approach, how does yours complement that approach vs traditional machine learning?

Agathe: ML would be to learn or discover the speech acts. You don’t have to analyse the forum by hand.

Ulrich: Speech acts we see them here. They can be operationalised, automatically detected, with whatever precision. That’s do-able, within range of current NL technologies.


Ulrich: Yes, and other methods, with training data, more numeric. Many. Natural language guy says yes, we do that. Take this up in the chat workbench. But would like to learn patterns of speech acts, associated to a person, could that be a predictor of failure or success? If you have these, how can you predict from the profile someone has, certain other parameters like success/failure – could maybe do that with ML.

Dan: Why speech acts rather than content analysis. Speech act is a micro-level relationship thing. How people fit in a social setting, it’s the micro level of that. I am now questioning, I am now doing something else; people move. It’s an abstraction, why?

Q?: metadata about discussion!

Q5: Look at switching between modes, who gets to take up?

Agathe: That’d be interesting to see.

Coffee break

Hiroaki Ogata, Songran Lio, Kousuke Mouri: Ubiquitous Learning Analytics Using Learning Logs in the context of language learning in a museum

Hiroaki talks.

Definition of learning – gain knowledge of a subject or skill by experience, by studying it, or by being taught. Experience is important source.

Issues – lots of learning experiences. How can we record them, in the real world, and reuse them for future learning? Lifelogging is a key technology.

Lifelog from Vannevar Bush definition. SCROLL – System for Capturing and Reusing Learning Log. Example: target user is 2nd language learner in Japan. They often have notes, log down from conversations with friends. Difficult to find appropriate ones immediately, difficult to share.

So use smartphone to record them. System online here:

LORE model for learning by logging – Log, Organize, Recall, Evaluate.

How to log? Active – capturing it when you have a problem, been taught. Passive – record everything, e.g. photo taken automatically every minute.

System can show you similar things – and similar things you’ve logged before. Can reuse learning logs to generate personalised quizzes, from your logs. Example of fire hydrant – can work from learning log data.

Theories – encoding specificity, spaced repetition, many others (too quick to note)

Live demo. Take photo. Can share using FB. Have vocab in English and Japanese. [And classic crash on live demo! With entertaining stream of low-level data.] Shows on a map, can scroll a timeline above, so you can see where and when you learned before. Also has Android app. Took a photo, then typed ‘workshop’, then uses Google Translate to generate Japanese.

Data collection: April 2010 to now, 1,411 users, 21,757 learning logs.

Dan: [English-speaking] students learning Japanese, or Japanese students learning English?

Both. Also students in Egypt, China, Mexico.

Using analysis to understand when, where, how what and who was/is learning? The teachers, don’t know how that works beyond the classroom. This can show them how they learn.

Use action logs, K-means clustering, generates learner’s behaviour model. Use that with personal info C4.5 classification to generate learner model. Then take that with the learning contents, the learner’s personal information, to generate contents recommendations. Using WEKA3 from U Waikato. Biiiiiig decision tree! Generated automatically.

Ulrich: Contextual factors too?

Location, time, yes.

Another application domain of SCROLL system: in a museum. To support science communicators (SCs) to capture tacit knowledge and share them. In Nat Mus Emerging Science And Innovation (Miraikan) in Tokyo. 1m visitors/y, 3k/d. 50 science communicators, with variety of backgrounds. Many interactive exhibitions. No handbook for skill of creating mutual communication. Employment term is 5y.

So using SCROLL. The SCs take notes, system captures. Data collection of filming, reflection meetings, and data analysis and software development. Worked with an interaction analyst. Have 689 logs, 36 SCs.

Ulrich: The handwritten notes?

They used those before, now they are using tablets to enter the data.

Analyse interactions of individual SCs with visitors, broken down by e.g. age (so can spot who talks to older people, who talks to schoolkids, etc). Can also do map, see if particular words work in certain places.

Example of interaction analysis: standing in middle of group; glance to next showcase; time management – she asks ‘How long do you stay here?’

Q: Do you track where they are, moving through museum? (Yes)

Ulrich: We saw your association of terms to locations. Trajectory information, also have time spent in a location. (Yes, but we didn’t analyse it.) You could take that as an indicator of interest, could construct a narrative from the sequences. Now manual analysis, you want to automate it? (Yes.)

Tilman Göhnert, Sabrina Ziebarth, Per Verheyen and H. Ulrich Hoppe: Integration of a flexible analytics workbench with a learning platform for medical specialty training

Embedment of a workbench, in workplace training.

Tilman talks. Motivation and goals: had learning platform from a project for training doctors, and analysis platform to apply to it. Wanted flexible integration with different types of learning environments. Analysis platform – SNA, stats, sequence analysis. Wide range of input data – logs, networks/graphs, tables.

Related work: CIShell and Network Workbench. Generic analysis framework, with plugins for network analysis (NWB). LeMo – learning analytics tool, connects to different learning platforms for data input via plugins.

Analytics workbench – generic, extensible analysis framework. Many data transformation modules, lots of input/output modules – e.g. activity streams for logs, Pajek, UCINET and GML for networks/graphs, CSV for tables.

Frontend looks a lot like Yahoo! Pipes: upload widget (can connect to learning platform), then widget to process, split in to two flows, descriptive stats plus transform into a network, do more into a person-person network, find important people. Also attach difference

Dan: Can the streams come together?

Yes, but it’s a bit more complicated than splitting. You don’t know what data is coming in. For joins, can merge two networks, or two logfiles.

Ulrich: Or difference. E.g. two graphs over same set of nodes, can see what’s different or what’s the same

Dan: Finding clusters separately from interactions within clusters, needed clusters before it.

For sync, the system/agent will wait until it has the data, so not a problem.

Ulrich: Merge semantics more difficult than the split semantics.
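
Ulrich's difference example (two graphs over the same set of nodes) reduces to set operations on the edge lists. A minimal sketch with invented data; it says nothing about how the workbench's own merge and difference filters are implemented.

```python
def compare_networks(edges_a, edges_b):
    """Compare two undirected networks defined over the same node set."""
    a = {frozenset(edge) for edge in edges_a}  # frozenset ignores edge direction
    b = {frozenset(edge) for edge in edges_b}
    return {
        "only_in_a": a - b,
        "only_in_b": b - a,
        "in_both": a & b,
    }


# Hypothetical interaction networks from two periods.
period_1 = [("ann", "bob"), ("bob", "cat")]
period_2 = [("bob", "ann"), ("cat", "dan")]

result = compare_networks(period_1, period_2)
for name, edge_set in result.items():
    print(name, sorted(sorted(edge) for edge in edge_set))
```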

Can see results already created. More to come in demo this afternoon.

Architecture behind it: web-based front end workbench, connects via TupleSpace server to processing agent interacting with a data repository. Can be on the same server. Basis for extension to the KOLEGEA project.

KOLEGEA – collaborative learning platform for medical doctors specialising in family medicine – knowledge exchange, collaborative learning, mentoring, exam prep. Speciality training for family medicine has issues that make it difficult – real cases as basis for learning, self-organised, job changes, geographically dispersed. Privacy of patients important.

Restricted to doctors only, within which there are open area forums. Closed areas, small groups, but also self-organised ones. In these closed, small groups, work on user-generated cases. Shows description, with media e.g. photos of case. Links to guidelines to help evaluating, tags, reason for consultation, etc.

Through the architecture, switch the repository to this, add web services to get data in to Workbench. Also working on storing the data back to the platform, so can generate workflow that happens in the workbench but are available through the learning platform, so the users don’t have to use the workbench but can see the result.

Q: Size of the data? Does it work well with web services?

Yes, for this project it is enough. Log data, is running for 1.5y. Is possible to take whole input. But is built so can have local storage of log data and just update it.

Q: Is it gigs, tera?

It’s maybe a few mega.

Ulrich: 200 users. Less active users. If we had all family doctors in training in Germany, it would be maybe 10x that. So target group is not big.

Also our analysis is not on whole dataset at one time.

Q: Can get them in batches? (Yes, is possible.) Find any IP issues with extracting data from the institution? Privacy?

Ulrich: Of course. Privacy extremely sensitive here. Several policies for upload of cases. Even though is in a controlled group of just medical doctors. You still have to delete identifying information.

Q: In analysis itself you don’t need to know?

Ulrich: No. And even in the data itself we don’t want to know. The system knows about the IDs of the participants. Usually, you have the IDs on the platform, but if extract for analysis, we anonymise. Do that for our Moodle platform – e.g. for analysis in an exam. Never give another person access.  Even a document describing the policy, so they know what we apply.
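
How the IDs are anonymised on extraction was not described. One common approach, shown here only as an assumption, is to replace each platform ID with a keyed hash so the same person keeps the same pseudonym across records without the real ID leaving the platform. The field names and the key are hypothetical, and strictly this is pseudonymisation rather than full anonymisation.

```python
import hashlib
import hmac


def pseudonymise(user_id, secret_key):
    """Replace a platform user ID with a stable keyed hash."""
    digest = hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "u_" + digest[:12]


def anonymise_log(records, secret_key, id_fields=("actor", "target_user")):
    """Return copies of the log records with the ID fields pseudonymised."""
    cleaned = []
    for record in records:
        copy = dict(record)  # leave the original record untouched
        for field in id_fields:
            if field in copy:
                copy[field] = pseudonymise(copy[field], secret_key)
        cleaned.append(copy)
    return cleaned


key = b"kept-on-the-platform-side"  # hypothetical secret, never exported
log = [
    {"actor": "dr.mueller", "verb": "post", "case": "case:42"},
    {"actor": "dr.mueller", "verb": "comment", "target_user": "dr.schmidt"},
]
for row in anonymise_log(log, key):
    print(row)
```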

Want to show you some examples. Effects of stimuli – graph of distinct users and mentors, can see bursts at start, activation mail, f2f meeting, and another activation mail.

Case of the month stats.

Q: Format of the log, defined somewhere and you can work on any log?

The format we support for general exchange is Activity Streams, an open, JSON-based format.
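
For readers who have not met Activity Streams: each log entry is a JSON object saying who (actor) did what (verb) to which thing (object), optionally in some target, and when (published). The record below is invented and uses made-up ids; it is not KOLEGEA data, and the "case" object type is a custom value assumed for the example.

```python
import json

# Invented example record in Activity Streams 1.0 style.
raw = """
{
  "published": "2014-03-24T09:15:00Z",
  "actor": {"objectType": "person", "id": "user:17"},
  "verb": "post",
  "object": {"objectType": "comment", "id": "comment:204"},
  "target": {"objectType": "case", "id": "case:42"}
}
"""


def to_log_row(activity):
    """Flatten one activity into (timestamp, actor, verb, object, target)."""
    return (
        activity["published"],
        activity["actor"]["id"],
        activity["verb"],
        activity["object"]["id"],
        activity.get("target", {}).get("id"),  # target is optional
    )


row = to_log_row(json.loads(raw))
print(row)
```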

Q: What kinds of analysis?

Social network analysis. Show users, cases, forum threads. Can instantly see group of users connecting one part of the network, and another bit where it’s cases.  Fold the network, do more classical analysis techniques – centrality of a user network – can ID people contributing a lot. Can see mentors very important, but some users who are very active – possible candidates for future mentoring. Currently discussing this.
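
Folding the user-case network into a user-user network and then ranking by centrality can be sketched as below. The contribution data is invented, and degree centrality is used as the simplest stand-in; the talk does not say which centrality measure the workbench applies.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical two-mode data: which user contributed to which case.
contributions = [
    ("mentor1", "caseA"), ("mentor1", "caseB"), ("mentor1", "caseC"),
    ("user1", "caseA"), ("user1", "caseB"),
    ("user2", "caseB"),
    ("user3", "caseC"),
]


def fold_to_users(contributions):
    """Link two users once for every case they both contributed to."""
    users_by_case = defaultdict(set)
    for user, case in contributions:
        users_by_case[case].add(user)
    weights = defaultdict(int)
    for users in users_by_case.values():
        for a, b in combinations(sorted(users), 2):
            weights[(a, b)] += 1
    return dict(weights)


def degree_centrality(weights):
    """Number of distinct neighbours, normalised by the n - 1 possible neighbours."""
    neighbours = defaultdict(set)
    for a, b in weights:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    return {user: len(nbrs) / (n - 1) for user, nbrs in neighbours.items()}


user_network = fold_to_users(contributions)
centrality = degree_centrality(user_network)
for user, score in sorted(centrality.items(), key=lambda item: -item[1]):
    print(user, round(score, 2))
```

Users who share no case with anyone drop out of the folded network in this sketch, so they get no score at all rather than a score of zero.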

Outlook – want to integrate and apply more analysis techniques – sequential patterns in log data. Automated upload into the learning environment, and scheduled execution of workflows on the learning environment – e.g. for building case of the month.

For more, visit the website – demo version available, website soon to be updated.

Agathe: Format of data?

Can provide the data in to a format the workbench understands. Or extend workbench with a filter that understands your format, transforms it inside the workbench.

Ulrich: Plugging in R scripts is quite easy. Not fully automated, but is a routine thing to do. Can wrap R script as a filter in the script, can deploy it in the interface. Have to work with the formats supported, but we have format converters.

Agathe: To have users be mentors later, is it enough that someone is having a lot of contributions?

Of course not. But a means to identify candidates – nothing more, nothing less. Activity online is a precondition for being a good mentor, but not a sufficient criterion. Could be really active because they don’t understand much.

Agathe: Or if active a lot, but very good?

Ulrich: We have no quality metric. Could construct one, from the cases. Because we can evaluate the cases, we can see if actor contributes to many good cases, can infer quality of contributor.

The candidates are given to medical doctors supporting the platform, they would choose their own.

Agathe: This is some information.

Ulrich takes over.

Decided not to give a short paper on the workbench, but are going to use it as a demo. Reconstructed Dan’s analysis. Using posting act theory. Benchmarks for that, find out position of contingency detection mechanisms. PageRank. Main Path Analysis too. Same interface, but with different filters/processing. Chat analysis – with visualisation, shows uptakes using main path analysis. Bolded contributions as the main one. Compares with Dan’s data/human analysis – auto/man-A 0.97, auto/man-B 0.99.

The KOLEGEA case. Not just a workbench, can do expert analysis, interesting to embed in a practical environment, a community platform. Have done this in two cases. Here it’s a bit indirect, we are not the ones to build the platform. We cannot just go there and do the dashboard with the workbench behind as we could elsewhere. There is a step of indirection involved. Have done it for another environment for the food industry, was under our control but handed on to a software company. Done in a European project. Map the workbench workflows in to the environment. Interesting engineering problem.

Q2: Say more about the dashboards?

Based on the visualisations from the workbench. Can see, statistical things, networks, can arrange in a dashboard in the target environment.

Tilman: Specific to the platforms where they are delivered.

Ulrich: The widgets you put in correspond to the visualisations. We have client-side and server-side visualisations, both.

Tilman: Usually JavaScript front end, can add computational power from the server for complex calculations.

Q3: Plans for basic LTI capabilities for those widgets? Make it available in other platforms?

Tilman: In the EU project, we have Open Social based widgets, so is a way of integrating them. More in the scope of that project.

Q2: Said integration with R for filtering, but for visualisation?

Ulrich: No

Tilman: R will give you static image, can display in a slideshow, is of course possible, using that in special cases, but have taken R-based workflow and changed it to play with it more. Didn’t want to replicate them, but use existing workflow, offer results of R through the web. Usually want more of the web-based visualisations. Even server-supported ones, have interaction on the client – don’t get that with a plain picture.

Ulrich: Final remarks. Medical community paper, the message that it’s not about the workbench, but how to embed in to another target, didn’t come through at review time, but it’s an important challenge for us. We don’t just do the analytics, but have means to transport them in to community platforms. Standard answer, create a dashboard.

Q5: Good to have standardisation on format?

Ulrich: That is of course of interest.

Q9: The data you’re running through the data factory, comes out the other end, with clear problem, serves an analytics or research result.

Ulrich: Utility problem is a very important one.

Q9: Have an important example. Can help, projects are similar when the data is similar, proving out that it works for other datasets than your own.

Ulrich: I hope it will be more chaotic in the afternoon. And roundtable, interactive demos. Turn in to a lively discussion. At 1pm.


This work by Doug Clow is copyright but licenced under a Creative Commons BY Licence.
No further permission needed to reuse or remix (with attribution), but it’s nice to be notified if you do use it.

Author: dougclow