Digital scholarship: Advanced technologies for research (3)

Third (and final, you’ll be pleased to learn) set of liveblog notes from Digital scholarship: Advanced technologies for research – a JISC-sponsored roadshow held on 10 March 2010 in the Ambient Technology Lab, Jennie Lee Building, The Open University.

Martin Weller launched the OU’s DISCO site – a hub for digital scholarship in the OU. Alas, it is currently restricted to OU users. Then Graham Pryor gave an introduction to the Digital Curation Centre.

Graham Pryor, Associate Director, The Digital Curation Centre

Graham Pryor from the Digital Curation Centre – contributes to a group blog on digital curation.

Three component parts – consortium: UKOLN, HATII (Glasgow) and Edinburgh.

Definition: digital curation is a process of maintaining, preserving and adding value to digital research data throughout its lifecycle. More than just leaving it in an archive; making it possible to do more with the data in the future than is possible now.

Why do it? Return on research expenditure. Reduce risk of digital obsolescence. Sharing more widely. Enhancing long-term value. And value to researchers themselves – it’s an intellectual asset for you as a researcher.

There are national data centres (UKDA, NERC, ADS, others); emergent HE data repositories; BBC and other media. And activity beyond the UK.

The DCC has a lifecycle model; it’s moderately complex. It starts at the conceptualisation stage and goes all the way through to long-term access and reuse.

Backed by JISC. Original work with librarians etc; then second phase (2007-10) of more direct involvement with research community and longitudinal studies. Now new third phase as an observatory, outreach, innovation and support, funded 2010-2013.

They have ample resources for all your digital curation needs: A reference manual. A digital curation lifecycle model. A data asset framework (or data audit framework?) – an assessment tool for how to manage data assets. DRAMBORA – a risk management tool for data. Briefing papers. And a data management planning tool – DMP online – to be launched at next month’s JISC conference. Will help you complete the necessary information for a data management plan for research funding council applications – tool customised to each research council.

Now want to work through a network of institutional champions, and more partnerships and collaborations. Exciting Research Data Management Forums are held regularly, and there are plenty of training courses too.


Colin: Where does the main driver come from? Academics, the funders? And how does that translate to where this should happen?

Graham: Mainly from the funder. They want data to be shared, which requires organisation. That’s not the only place – also within the disciplines, e.g. the genome project. Or because of scale of the research – e.g. physics and astronomy. The ones in between are missing out. Neurophysiology project had a sense of needing to organise their data, wanted to share it (so long as they were in charge). Economic demands, but those are not generally concerns of the academics.

Colin: Data overlay journals – establishing datasets. Is that increasing, or discipline-specific?

Graham: Discipline-specific, sometimes require data with the article – though often only the tip of the iceberg for that specific publication. We’re interested in the rest, left on someone’s laptop or CD. Needs to be planned from the start, so as not to lose what’s been invested. Journals have seized on certain areas where there’s value having the data to support the articles – but it’s not all the data. But also question of what you can do with the data, misunderstanding and misuse. So often superficial.

Simon: Has a vision of a community of scholars gathering around a dataset, so you can find out what people were doing with it and what they were claiming on the basis of it. Does that align with DCC?

Graham: We don’t run a repository, we provide tools and techniques for doing it. You can find some of this from good repositories already – they show who’s been using it. Citations.

Simon: Evidence that releasing your data increases citations?

Graham: Yes. Cambridge talk about it. Can’t quote off the top of his head. Mandate in Edinburgh to deposit in institutional repository (articles). But there isn’t a data repository outside the national centres.

Linda Wilks: Ethical issues – e.g. recording of an interview, permission, confidentiality.

Graham: Some of that data is accepted in the UKDA, where you can identify individuals. Have to organise it so as not to be able to identify the individual, with the data organised so the source is invisible but you can still get data. Sensitive data could just be used then dumped. There’s a paper about this in our manual somewhere.

Colin: With research outputs it’s valuable to still have the metadata even if you can’t have the full text; can then request it if you need it. Is that true of data too? E.g. if person not willing to share the data.

Graham: They may be willing to share it in the future – e.g. embargoes. Neurophysiology said the science is rudimentary, want to come back in 5 years’ time to reanalyse it when we know more, so keep it because it’s of value. Many arguments against sharing – career profile and so on. The sharing isn’t the main driver for keeping and curating data, it’s because it may be valuable at another time and place.

Discussion: How does e-Research fit in to broader debates and practices in digital scholarship?

Colin: The recognition issue is key. Is it bottom-up, or top-down? Recognition through the producers, or from e.g. HEFCE officially recognising it?

Martin: It’s a bit of both. If we agree this is scholarly output, but can’t be recognised, creates a driver for change. At an institutional level we need to remove those barriers to engagement. Need a bit of both. Helps to have good examples. Michael Wesch from Kansas State – very widely viewed outputs. Martin wrote for Computers in Education – they accepted it after a year, then it didn’t come out in print for another two years. It was three years out of date, you end up looking like an idiot. But a blog post would’ve been up straight away – so that’s a driver.

Nick: Important not to set these up as mutually exclusive – Twitter and blogs can feed in to conferences and traditional publications, and vice versa.

John: Absurdity is no guarantee of change. Journal publishing is already patently absurd, but that doesn’t mean it won’t persist. Almost every area of publication is facing these kinds of issues in relation to user-generated content. I suspect you just have to take a long-term view. It’s a huge transformation, you can expect it to be uneven as it changes. It’s rarely clear what are going to be the key pressure points for changes. It could be a device, services, collapse of financial models. There are vast disciplinary differences. Astrophysics, or physics generally, have already moved way beyond all this stuff. For them, putting stuff immediately in to repositories is just the way they work – or just look at each other’s Facebook profiles. Some communities of scholars have already moved, some don’t even know they need to.

Someone: Do they need to move the same way? Three years for an article in history is less of an issue. Conference proceedings can take a year to come out. Is it always useful to talk about the whole spectrum in one go?

Martin: Boyer’s four practices – some come more to the fore in some disciplines than others.

Colin: Also link to quality. Availability versus trust without the quality stamp, if not peer reviewed and accepted by a recognised journal. Would people be confident?

Jan Parker: Journals – one of the answers as a bridge. As an editor, we’ve stopped doing book reviews, unless everyone’s going to be talking about it for ten years. More e.g. conference reports on a quick turnaround. Need a section to say these are the blogs making waves in our areas. There are names that everyone follows, but not everyone who reads the journal will know them.

Martin: Reputation is not the only thing; peer review is not without its flaws. If you have a good enough reputation on a blog, you have a personal reputation that carries as much weight as a journal article. Has started putting up articles on his blog, gets much better feedback than from the peer review process. Anyone can publish anything, but brand matters a lot. See the same with open educational resources (OER). They might not take a YouTube video from Martin, but would from e.g. MIT. Status, reputation, brand.

Someone: It’s also about discoverability. One of the problems for younger people who haven’t built the brand – you can blog and it feels like you’re talking to nothing, can get quite disheartening. Reviewing the scene for blogs in a discipline is really exciting.

Graham: The MMR paper, it gets more pointed in the biomedical field. The reliable data got completely submerged in the wash of opinion. It was obviously hard for journalists and parents to understand what was reliable and reputable information. There’s a strong need to retain that peer review support, for accrediting things as reputable. Not just generating reputation – many commentators have rep by having controversial views.

Martin: It had been peer reviewed so it had that stamp of authority – legitimacy it didn’t deserve.

Graham: Also funding issues.

Simon: What makes something scholarly or scientific is the quality of the discourse. Our infrastructures have to support quality discourse. Articles are often very, very dry – but the author’s presentation can be very rich and engaging. Until now it was a travesty because you only saw the text. Important for e.g. policymakers. Journals are hugely democratising – how does an up-and-coming researcher get some wise eyes on their blog? Economy of attention. Leaders in the field will follow up a smart new researcher, pick them up in e.g. conferences. Blind peer review can give benefits.

Linda: Reputation is hard to pin down. It’s contextual. People in the Arts Faculty won’t have heard of Martin. (laughter)

Martin: Also distributed identity.

Linda: RAE and REF are at least an objective count.

Martin: It’s reputation within your community.

John: Years ago, Peter Medawar wrote a wonderful paper on why the scientific paper is a fraud – good reading still. Discourse we’re having is about how we can take the good bits of the print-based publication culture and transplant them in to the online culture. Can see why people want to do that – it’s like journalists wanting to reproduce a newspaper online. Could go the other way. The blogosphere is an interesting place – actually, has very good quality control. The problem with the MMR was the cluelessness of our established media, and their irresponsibility – as well as the mistake the Lancet made. In the blogosphere you’d have more interesting, rational discourse. Very important to have scholarly aggregators for disciplines – an intelligent, informed guide to what was interesting in young researchers’ blogs. And an editorial board. Blogosphere works this way anyway. We could do that.

Martin: Stephen Downes does this for edubloggers. Inflation of A-level results is less democratic: harder to stand out as a bright kid from an underprivileged background, extra-curricular stuff is harder for them to do. Similar argument for blogs, makes it harder to break in to. In other ways, you don’t care. If you link and make intelligent comment, you’ll get attention. You don’t care whether they’re an undergrad or a professor with 40 years’ tenure. The space is very democratised. Your real-world reputation doesn’t matter as much as your online one.

Simon: It’s a generational thing – the greatest minds at the moment do print-based, we may see a shift in practice over time.

Martin: What are the big shifts that are significant? What are the obstacles?

David: Did a project on perceived barriers to engaging with this technology. Frequently rather mundane technical issues – e.g. have to have digital identity certificates, requires degree of interaction upfront, don’t want to invest in that effort. Access interfaces to these resources and ways of collaborating and aggregating them are not that mature, so require a degree of commitment of time upfront to get over that hurdle. Also – even with Web 2.0 you need to learn the interface. And ephemerality. The big issues include some of the ones we’re discussing here. The nature of publishing of scientific work, and capturing reproducibility and making it available, is changing. Virtualisation and clouds as one idea – digital searches and simulations, especially in biomed, become part of what everyone does. You could capture an image of the machine as it was when you did the experiment, can store it, publish it, and so on. Good in e.g. bioinformatics and patenting pharmaceuticals and so on – can publish as part of patent that you had the model in your machine at that time.

Esther: Early-career researchers and younger people getting recognised. Blogging, sharing results online – of PhD students, about 50% leave academia. A lot of research is being done that isn’t being published, lots that generates negative results and doesn’t make it in to journals. Scope for reporting this online, not reported using existing journal lifecycle. Might not have credibility yet but getting information out and people can decide themselves.

Martin: On Radio 4, lots of genetics research is blocked because some experts are blocking it. Was shouting at radio “just put it online!”. But quite rigid idea about what constitutes an academic article.

Esther: Much research is lost. Important to know not just what has worked but what hasn’t.

David: Exposure, genetic and biomed research – sometimes difficult or even dangerous to publish online. E.g. if you’re working with animal models.

Nick: Being more transparent in all areas isn’t necessarily what we want – may impede public understanding. Minor debates versus substantive disagreement.

Jan: Scholarship, Boyer – the initial push was for equality of esteem between the four, so to see that research was not the only one. Is digital scholarship a fifth? Or an application or process within the four?

Nick: We’re at early stages, but digital applies to those four rather than being a separate one. Digital scholarship could be a grand unifying theory for scholarship. We have potentially the best platform for the use of any technology for what scholars do – so every funding stream ever.

Colin: Everyone is going to want to comment on your blog, Martin – it’ll turn in to a journal!

David: Incumbent on academics to get better at explaining why science/scholarship is better than unstructured debate.

Judy: Lots that JISC is doing in this area. Planning major event on ?15th June in British Library, with Research Councils, looking at consequences from UEA/Climategate on data management and sharing. JISC has a Managing Research Data Programme on, briefing paper out, is very relevant.

Matt: JISC is about to announce more OER funding, and one strand crosses over with digital scholarship in reputation. Will be big programme. Also around open access.

