Data sharing in the Netherlands

JiscRDM 1,982 views 30 slides Jun 27, 2017

About This Presentation

Enabling data sharing in the Netherlands: contributions by DANS
Ingrid Dillo, Deputy Director, DANS
Research Data Network


Slide Content

Enabling data sharing in the Netherlands: Contributions by DANS. Ingrid Dillo, Deputy Director, DANS. Research Data Network Workshop, Parallel Sessions II: Research Data Solutions from SURF and DANS. York, 27 June 2017.

Outline. RDN workshop: “... focus on innovative tools, and approaches that offer practical solutions to current and future RDM challenges”. NL RDM landscape issues: “funding, putting policy into practice”. DANS: front office-back office model; certification of digital repositories; FAIR data assessment; business models for digital repositories.

DANS organisation. Institute of the Dutch Academy and Research Funding Organisation (KNAW & NWO) since 2005. First predecessor dates back to 1964 (Steinmetz Foundation); Historical Data Archive 1989. Mission: promote and provide permanent access to digital research resources.

DANS core services. DataverseNL: supports data storage during research and until 10 years after. NARCIS: portal aggregating research information and institutional repositories. EASY: certified long-term archive. https://dans.knaw.nl

DANS international connections

Policy makers want open data

Researchers remain hesitant

Motivations for data sharing

Data sharing incentives. Influence of sharing norms within direct research circle. Professional rewards for data sharing. External drivers: publisher requirements (DAPs), funder policies/mandates. http://repository.jisc.ac.uk/5662/1/KE_report-incentives-for-sharing-researchdata.pdf

Other data sharing challenges. Enabling the researcher to comply with open data requirements: awareness raising, training and support for data management (DMPs, FAIR data); infrastructure for preservation of and long-term access to the data.

Sustainable support model. Front office-back office model: division of labour, economies of scale. Back office: curation and preservation expertise; training of local data experts; long-term preservation infrastructure.

“Perhaps the biggest challenge in sharing data is trust: how do you create a system robust enough for scientists to trust that, if they share, their data won’t be lost, garbled, stolen or misused?”

Pillars of trust. Actions and attributes of the trustee (integrity, transparency, competence, predictability, guarantees, positive intentions). External acknowledgements: reputation (researchers), third-party endorsements (funders, publishers).

DANS and Data Seal of Approval. 2005: DANS to promote and provide permanent access to digital research resources; formulate quality guidelines for digital repositories, including DANS. 2009: international DSA Board. Almost 70 seals acquired around the globe, but with a focus on Europe. https://www.datasealofapproval.org/en/

http://www.ncdd.nl/wp-content/uploads/2016/10/201611_DE_Houdbaar_Report_DSA-survey_2016.pdf

Partnership with WDS under the umbrella of RDA. Goals: realizing efficiencies; simplifying assessment options; stimulating more certifications. Outcomes: common catalogue of requirements for core repository assessment; common procedures for self-assessment and review process; one new certification body, the CoreTrustSeal Board.

New CoreTrustSeal Requirements. Requirements: Context (1), Organizational infrastructure (6), Digital object management (8), Technology (2). https://goo.gl/kZb1Ga Endorsed recommendation by Research Data Alliance. EC recognition as ICT technical specification.

Requirements dealing with “data quality” or “fitness for use” or “FAIRness”:
R2. The repository maintains all applicable licenses covering data access and use and monitors compliance.
R3. The repository has a continuity plan to ensure ongoing access to and preservation of its holdings.
R4. The repository ensures, to the extent possible, that data are created, curated, accessed, and used in compliance with disciplinary and ethical norms.
R7. The repository guarantees the integrity and authenticity of the data.
R8. The repository accepts data and metadata based on defined criteria to ensure relevance and understandability for data users.
R10. The repository assumes responsibility for long-term preservation and manages this function in a planned and documented way.
R11. The repository has appropriate expertise to address technical data and metadata quality and ensures that sufficient information is available for end users to make quality-related evaluations.
R13. The repository enables users to discover the data and refer to them in a persistent way through proper citation.
R14. The repository enables reuse of the data over time, ensuring that appropriate metadata are available to support the understanding and use of the data.

All data sets in a Trustworthy Repository are FAIR, but some are more FAIR than others

Experiences with Data Reviews at DANS, started in 2011. M. Grootveld, J. van Egmond and B. Sørensen. https://goo.gl/Tf4HFN

FAIR badge scheme. Proxy for data “quality” or “fitness for (re-)use”. Prevent interactions among dimensions to ease scoring. Consider Reusability as the resultant of the other three: the average FAIRness as an indicator of data quality, (F + A + I) / 3 = R. Manual and automatic scoring. Badge mockup: F A I R, 2 User Reviews, 1 Archivist Assessment, 24 Downloads.
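The averaging rule on this slide can be sketched in a few lines of Python. This is only an illustration, not the DANS implementation: the function name `reusability_score` is hypothetical, and the 1 to 5 integer scale is an assumption (the slides list five levels per dimension but do not number them).

```python
def reusability_score(findable: int, accessible: int, interoperable: int) -> float:
    """Return R = (F + A + I) / 3, the average of the three dimension scores."""
    scores = (findable, accessible, interoperable)
    for score in scores:
        # Assumption: each dimension is scored on an integer scale from 1 to 5.
        if not 1 <= score <= 5:
            raise ValueError(f"score {score} is outside the assumed 1-5 range")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Example: F = 4, A = 5, I = 3 averages to R = 4.0
    print(reusability_score(4, 5, 3))
```

Because R is derived purely from the other three scores, the dimensions can be assessed independently, which is the point of "prevent interactions among dimensions".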

Findable (defined by metadata (PID included) and documentation):
No PID nor metadata/documentation
PID without or with insufficient metadata
Sufficient/limited metadata without PID
PID with sufficient metadata
Extensive metadata and rich additional documentation available

Accessible (defined by presence of user license):
Neither metadata nor data are accessible
Metadata are accessible but data are not accessible (no clear terms of reuse in license)
User restrictions apply (e.g. privacy, commercial interests, embargo period)
Public access (after registration)
Open access, unrestricted

Interoperable (defined by data format):
Proprietary (privately owned), non-open format data
Proprietary format, accepted by Certified Trustworthy Data Repository
Non-proprietary, open format = ‘preferred format’
As well as in the preferred format, data is standardised using a standard vocabulary format (for the research field to which the data pertain)
Data additionally linked to other data to provide context
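One way to make this rubric usable in a scoring tool is a simple lookup table. The sketch below is hypothetical: the names `FAIR_LEVELS` and `describe_level` are invented for illustration, the descriptions are shortened from the slide, and numbering the levels 1 to 5 in the order listed is an assumption.

```python
# Level descriptions shortened from the slide; index 0 is the lowest level.
# Assumption: levels are numbered 1-5 in the order they appear on the slide.
FAIR_LEVELS = {
    "F": [
        "No PID nor metadata/documentation",
        "PID without or with insufficient metadata",
        "Sufficient/limited metadata without PID",
        "PID with sufficient metadata",
        "Extensive metadata and rich additional documentation",
    ],
    "A": [
        "Neither metadata nor data are accessible",
        "Metadata accessible, data not accessible",
        "User restrictions apply",
        "Public access (after registration)",
        "Open access, unrestricted",
    ],
    "I": [
        "Proprietary, non-open format",
        "Proprietary format accepted by a certified repository",
        "Non-proprietary, open (preferred) format",
        "Preferred format plus standard vocabulary",
        "Additionally linked to other data",
    ],
}


def describe_level(dimension: str, level: int) -> str:
    """Look up the rubric text for a dimension ('F', 'A' or 'I') and a level."""
    levels = FAIR_LEVELS[dimension.upper()]
    if not 1 <= level <= len(levels):
        raise ValueError(f"level must be between 1 and {len(levels)}")
    return levels[level - 1]


if __name__ == "__main__":
    print(describe_level("A", 5))
```

Keeping the rubric as data rather than code means an archivist's manual score and an automatically derived score can be explained to users with the same text.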

Creating a FAIR data assessment tool. Using an online questionnaire system. Prototype: https://www.surveymonkey.com/r/fairdat

Website FAIRDAT. To contain FAIR data assessments from any repository or website, linking to the location of the data set via (persistent) identifier. The repository can show the resultant badge, linking back to the FAIRDAT website. Neutral, independent. Analogous to DSA website. Mockups! Badge mockup: F A I R, 2 User Reviews, 1 Archivist Assessment, 24 Downloads.

Sustainable business models for data repositories. Increasing need for data repositories and data stewardship. Increasing volume presents a challenge. Requirements for stewardship present a greater challenge. Sustaining digital data infrastructure is a major issue for science policy. Current funding models will prove inelastic and not meet the growing requirements: a concern on the part of repositories and funders.

Sustainable business models for data repositories. RDA Cost Recovery Interest Group, also supported by WDS and CODATA. Report: Income Streams for Data Repositories (Feb 2016; https://zenodo.org/record/46693#.WTUR-TOB2T8), based on 25 in-depth interviews, identifying topics and trends, alternative revenue streams.

Sustainable business models for data repositories. Continuation of the work under the umbrella of OECD/GSF. Around 50 interviews in total. Thorough economic analysis. Cost optimization. Stakeholder workshops. Presentation of report and stakeholder recommendations at RDA Plenary Montreal. Expected OECD publication end of 2017. https://www.innovationpolicyplatform.org/open-data-science-oecd-project

Elements of a Business Model for Data Repositories
User Base: data depositors, data users, research institutions, research funders, others
Products: research data, research facilities, value-adding services, contract services, research services
Revenue Sources: structural funding, host institutional funding, deposit-side charges, access charges, services charges
Financing: investment funding, development funding, operational revenue
Identifying the user base; developing the product mix; making the value proposition(s); understanding cost drivers & matching revenue streams

Thank you for listening i [email protected] www.dans.knaw.nl