Openscience
Empirical Software Engineering journal (EMSE) open science and reproducible research initiative
EMSE Open Science Initiative
Openness in science is key to fostering progress via transparency, reproducibility, and replicability. Open data and open source in particular are two fundamental pillars of open science, as both form the core of excellence in evidence-based research. The Empirical Software Engineering journal (EMSE) has therefore decided to explicitly foster open science and reproducible research by encouraging and supporting authors to share their (anonymised and curated) empirical data and source code in the form of replication packages. The overall goals are:
- Increasing the transparency, reproducibility, and replicability of research endeavours. This supports the immediate credibility of authors' work, and it also provides a common basis for joint community efforts grounded on shared data.
- Building up an overall body of knowledge in the community leading to widely accepted and well-formed software engineering theories in the long run.
This document describes the principles, the process, and the infrastructure that support this initiative.
This information is also available in the editorial "The open science initiative of the Empirical Software Engineering journal" (published May 2019, doi:10.1007/s10664-018-9632-7).
Open Science Principles at EMSE
As with any initiative in a research community, the success of the Open Science Initiative depends on the willingness and ability of authors to disclose their data. Therefore, we strive to implement the Open Science Initiative at EMSE as a community effort, with services that encourage and support authors of EMSE articles in opening up their research. The steering and motivating principle is that only openness in empirical research increases the transparency of research such that the authors' empirical analyses can be reproduced, fully understood, and ideally replicated by others not involved in the research. To this end, we aim to promote a data-sharing culture in which authors publicly archive the data and related material required to understand and reproduce the claims and analyses presented in their manuscripts. Our hope is to move our community as a whole forward to the point where open science becomes the norm.
All submissions to EMSE will undergo the same established review process regardless of whether authors decide to disclose their data or not. Yet, as the leading journal in empirical research methodologies and their application to software engineering, we strongly encourage all authors to support this initiative by making data available upon submission (either privately or publicly) and especially upon acceptance (publicly). Authors who cannot disclose non-public data (e.g., industrial data sets that fall under non-disclosure agreements) are asked to provide a short, explicit statement to that effect in their manuscript.
To make research data sets and research software accessible and citable, we encourage authors to:
- archive data on preserved archives such as zenodo.org and figshare.com so that replication packages remain available in the very long term (on Zenodo, there is a dedicated community for empirical software engineering).
- use an appropriate license, e.g., the CC-BY 4.0 license for data and the MIT License for code. Look at choosealicense.com for more information about suitable open source licenses.
The replication packages disclosed by the authors will then undergo an additional, short review by the open science board, as described next. When archiving data as part of a replication package, we ask authors to attend to the FAIR principles, i.e., data should be:
- Findable,
- Accessible,
- Interoperable, and
- Reusable.
Authors should therefore use archival repositories and avoid putting data and software on their own (institutional or private) websites or on systems like Dropbox, version control systems (SVN, Git), or services like Academia.edu and ResearchGate. Personal websites are prone to changes and errors, and more than 30% of them will not work in a four-year period. Moreover, nobody should have the ability to delete data once it is public. Finally, the package disclosed via an archival repository should link to the paper (DOI) upon final production of the manuscript.
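As a rough illustration of the hosting guidance above, the following sketch classifies a replication package URL by its host. It is only a minimal example, not a tool used by the open science board: the function name `classify_hosting` and the example URLs are hypothetical, and the host lists contain only the services named in this document.

```python
from urllib.parse import urlparse

# Preserved archives named in this document.
ARCHIVAL_HOSTS = {"zenodo.org", "figshare.com"}
# Services this document asks authors to avoid for archiving.
DISCOURAGED_HOSTS = {"dropbox.com", "academia.edu", "researchgate.net"}


def _matches(host: str, domains: set[str]) -> bool:
    """Return True if host equals one of the domains or is a subdomain of it."""
    return any(host == d or host.endswith("." + d) for d in domains)


def classify_hosting(url: str) -> str:
    """Classify a URL as 'archival', 'discouraged', or 'unknown' (hypothetical helper)."""
    host = (urlparse(url).hostname or "").lower()
    if _matches(host, ARCHIVAL_HOSTS):
        return "archival"
    if _matches(host, DISCOURAGED_HOSTS):
        return "discouraged"
    # Anything else (personal sites, institutional servers, DOI resolvers)
    # needs a human judgement; this sketch does not try to guess.
    return "unknown"


if __name__ == "__main__":
    # Made-up example URLs, for illustration only.
    for url in (
        "https://zenodo.org/records/0000000",
        "https://www.dropbox.com/s/example/data.zip",
        "https://example.edu/~author/data.zip",
    ):
        print(url, "->", classify_hosting(url))
```

A result of "unknown" is not a verdict: a DOI link that resolves to an archive, for example, would need to be followed before it can be judged.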
Open Science Board
- Daniel Méndez (Chair), Blekinge Institute of Technology, Sweden, and fortiss GmbH, Germany
- René Just, University of Washington, USA
- Daniel Graziotin, University of Stuttgart, Germany
- Neil Ernst, University of Victoria, Canada
- Chakkrit Tantithamthavorn, Monash University, Australia
If you are interested in joining the board and contributing to open science at EMSE, contact Daniel and Martin by email.
Open Science Process
- Once a manuscript receives a "Minor revision" or "Acceptance" decision, the decision email contains the following text:
- "EMSE encourages open science and reproducible research. We are happy to invite you to submit your open data, open material, or open source code (in the following referred to as "replication package") for an additional, short, review by the open science board. Provided you agree to participate, the board will then review the replication package and check its eligibility to publicly recognise your open science effort with an open science badge. The board will provide you with constructive feedback on content and documentation of the package. To submit your replication package, please send an email to mendezfe@acm.org. Should you have any questions, please do not hesitate to contact the open science chairs Daniel Mendez. Note that your decision to participate in the open science initiative will not affect the remaining reviewing and editorial process in any way."
- The authors are given two weeks to submit their replication package after the final acceptance.
- When the authors submit a replication package, the Open Science Chairs ask one member of the Open Science board to review the package.
- The review is made according to transparent review criteria.
- The open science reviewer is given two weeks to accept the package or to consolidate a list of questions for the authors.
- The open science review is blinded: the open science reviewer does not sign the review.
- If necessary, the open science reviewer asks for changes by sending an email to the authors.
- The authors are given another two weeks to make the changes.
- The open science reviewer makes the final decision.
- If the replication package is rated as insufficient, the manuscript is still accepted and the authors are given a list of constructive comments on how to improve their open science practices.
- If the replication package is considered to be of good or excellent quality, the authors can add the following statement to their final version: "Open Science Replication Package validated by the Open Science Board".
Throughout the whole communication process, the Open Science co-Chairs serve as mediators between the authors and the Open Science Board members in what is, for now, a single-blind process.
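The two-week windows in the process above can be turned into concrete dates with a little arithmetic. The sketch below is illustrative only: the names `ReviewTimeline` and `review_timeline` are hypothetical, and it assumes that each step starts on the previous step's deadline, so the dates are the latest possible ones rather than a prediction.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Each step in the process is given two weeks.
TWO_WEEKS = timedelta(weeks=2)


@dataclass(frozen=True)
class ReviewTimeline:
    """Latest dates for each step of the open science review (hypothetical type)."""
    package_due: date   # authors submit the replication package
    review_due: date    # reviewer accepts or sends a list of questions
    changes_due: date   # authors make requested changes, if any


def review_timeline(final_acceptance: date) -> ReviewTimeline:
    """Compute worst-case deadlines, assuming every step uses its full two weeks."""
    package_due = final_acceptance + TWO_WEEKS
    review_due = package_due + TWO_WEEKS
    changes_due = review_due + TWO_WEEKS
    return ReviewTimeline(package_due, review_due, changes_due)


if __name__ == "__main__":
    # Made-up acceptance date, for illustration only.
    timeline = review_timeline(date(2024, 1, 8))
    print("Package due:", timeline.package_due.isoformat())
    print("Review due: ", timeline.review_due.isoformat())
    print("Changes due:", timeline.changes_due.isoformat())
```

Under these assumptions, the whole open science review takes at most six weeks from final acceptance; it finishes earlier whenever a step is completed before its deadline or no changes are requested.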
The Frequently Asked Questions section provides additional information.
EMSE papers with the Open Science Badge
When papers are awarded the "Open Science" badge, the following text is added as a separate title note. The badge note appears after the “Communicated by” line:
“This paper has been awarded the Empirical Software Engineering (EMSE) open science badge.”
Springer maintains the ‘Open Science’ topical collection for papers that have been awarded the EMSE open science badge: https://link.springer.com/journal/10664/topicalCollection/AC_deca3131734b61f9ddc593f577c02eb1/page/1
FAQ
How should the replication packages be disclosed?
We encourage authors to archive their data as part of replication packages on preserved archives such as zenodo.org or figshare.com so that the data will receive a DOI and become citable. Further, we recommend that authors use the CC0 dedication (or the CC-BY 4.0 license) when publishing the data (automatic when using, for instance, zenodo.org or figshare.com).
Those archives allow published replication packages to be updated at any time. We strongly recommend that authors update the package information after the review process, once the manuscript is in production and receives a DOI, with a reference to the published manuscript so that the package is citable alongside the published article.
In this EMSE open science process, what’s the difference between reproducibility and replicability?
There is no consensus across disciplines about the difference between reproducibility and replicability. Often, replicability is seen as the ability to repeat the same study under the very same conditions, yielding the same results.
Reproducibility is seen as the ability to independently reproduce the study, yielding the same or similar results with a given precision. In the EMSE open science process, we make no specific distinction for now. The goal is to encourage open data and code so that researchers can reproduce the results (partially or completely) and/or perform further research using this data and code.
What happens if the data violates one or more of the FAIR principles?
[FAIR](https://www.force11.org/group/fairgroup/fairprincipl
