Friday 26 June 2015

Research data sharing


Sharing research data is becoming increasingly popular, and while it is not yet synonymous with traditional scholarly publishing, it is moving in that direction. We, as an aspiring research institution, need to start thinking about depositing and sharing “publications and data”, rather than treating research data as a special entity, if in fact we treat it as anything at all.

Sharing data has a number of benefits for both the institution and the researcher. Demonstrating good practice and research integrity raises the profile of the university and the individual researcher. Sharing data makes it citable, which in turn can lead to increased citation metrics for both the data itself and the publications associated with it. This is a good thing. Increased exposure from the data records can help foster new collaborations in research areas not previously thought of. Funding opportunities may also improve due to a healthier research ecosystem and greater integration between systems and researcher profiles.

When reading about the positives of data sharing, it is hard to understand why there is such resistance to sharing within academic circles. Do researchers fear they will not be recognised or credited for their data? If the data has a good framework around it, making it easy to obtain, understand and cite, then this risk should be reduced. Or do they fear “getting scooped”? Embargoing the data may be the solution to this.

Institutions and policy makers often perceive that it is the “big” data that needs the most help when it comes to managing and sharing. This is usually not the case. Big data often has a more robust framework surrounding its collection and management, frequently because funding organisations require it. The problem is with small data: the multitude of small spreadsheets that researchers maintain, often without adequate management, code keys, storage, backup… If data is managed correctly during the collection and analysis stages, it is much easier to share once the work has been completed. Data that is managed correctly (that is, has a good framework around it) is more likely to be used and therefore cited. Unfortunately, citations are the name of the game when it comes to staying current in research.

For every risk or concern that researchers or institutions raise about sharing, there will always be a solution. Data should be shareable. Publicly funded data should definitely always be shareable. The risks to institutions of not sharing data (non-compliance with policy and funding agreements, reputational damage, poor practice, low awareness) mean that institutions should lead by example and facilitate the sharing infrastructure.

Sometimes there are legitimate concerns about sharing data: is it identifiable, confidential or private? What is the best way to manage this sort of data? Is it shareable? In these cases, the metadata can be made available, with mediated access to the data itself. When it comes to data of a sensitive nature, there will always need to be someone who can respond to requests.

Research data is an institutional asset and should be treated as such. Unrecognised effort is a prime precursor to disengagement among researchers, staff and the community. Whoever you are (researcher, lab technician, administrator, executive or institution), you should treat research data with the respect an asset deserves.

“Products of research are not just publications” – NSF senior policy specialist Beth Strausser.

Graphic: http://d7.library.gatech.edu/research-data/home