What is research software?
It is software, in source code or compiled form, that supports scholarly research. Software may be downloaded, compiled, executed and instantiated.
Why cite research software?
Software is pervasive in research. A UK Research Software Survey of 1,000 randomly chosen researchers showed more than 90% of researchers acknowledge software as being important for their own research and about 70% say their research would not be possible without it. In a separate 2018 study looking at 40 papers published in Nature from January to March 2016, 32 of them explicitly mentioned software. These studies provide evidence that software plays an important role in research, and should be treated in the same way as other research inputs and outputs such as research data and paper publications.
Proper citation of software has the following benefits:
- ensures scientific transparency and reasonable accountability of a researcher
- aids scientific reproducibility through direct, unambiguous references to the precise software used in a particular study
- provides fair credit for software developers or researchers who spend time developing software
- assists in tracking the use and reuse of software through reference in scientific literature and within other software
- helps developers verify how their software is being used.
How to cite research software
In general, software should be cited in a similar fashion to data and research papers. The core required elements of a citation are:
- Author(s) – the people or organisations responsible for the intellectual work to develop the software.
- Publication Year – the year when the software was published to a repository or any other publication venue.
- Title – the formal title of the software/service.
- Version – the precise version of the software used. Careful version tracking is critical for accurate citation.
- Publisher – the repository where the software is held, archived, distributed, released or produced, ideally an institutional or disciplinary repository that provides curation of software over the long term. Examples include Climate Data Gateway at NCAR, NASA Earth Exchange, Zenodo and GitHub.
- Locator/Identifier – a persistent identifier (PID) for the software, such as a DOI, Handle or ARK, that resolves to a landing page. Using a DOI is considered best practice for software citation. A DOI is a unique, persistent identifier that can be used to track software citation metrics and to link related research outputs such as journal articles and research data.
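The core elements above can be treated as a simple checklist. The following is a minimal Python sketch, not part of any citation standard, that reports which required elements are missing from a citation record. The key names and the `missing_elements` helper are illustrative assumptions. The sample record is built from the SciPy citation example shown later in this guide, which gives no version or publisher.

```python
# Core citation elements listed above; the key names are illustrative assumptions.
REQUIRED_ELEMENTS = ("authors", "year", "title", "version", "publisher", "identifier")


def missing_elements(record: dict) -> list:
    """Return the required elements that are absent or empty in `record`."""
    return [name for name in REQUIRED_ELEMENTS if not record.get(name)]


# Record built from the SciPy example in this guide (a URL stands in for a PID).
record = {
    "authors": ["Jones E", "Oliphant E", "Peterson P"],
    "year": 2001,
    "title": "SciPy: Open Source Scientific Tools for Python",
    "identifier": "http://www.scipy.org/",
}

print(missing_elements(record))
```

A check like this could be run before a reference list is finalised, so that gaps such as a missing version are noticed while the information is still easy to recover.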
Various international organisations have been working to develop guidelines for software citation. Examples of these include:
- Force11 Software Citation Principles
- DataCite Metadata Schema 4.1 (with additions to describe software and examples for software citation)
The DataCite DOI Citation Formatter is a simple online tool that uses a DOI to quickly format a citation in hundreds of different styles.
If a DOI/PID doesn’t exist, the URL can be used but must be used in conjunction with the access date:
- Access Date (optional) – ongoing development of software may not always be reflected in release dates and versions. It is important to indicate when the software was accessed, especially when the software is referenced not through its DOI but through a URL indicating the software’s location.
Software citation format
Software citation should follow this general citation format:
Creator (PublicationYear): Title. Version No. Publisher. (resourceTypeGeneral). Identifier.
In the case of software citation, use resourceTypeGeneral=Software.
- Xu, C., & Christoffersen, B. (2017). The Functionally-Assembled Terrestrial Ecosystem Simulator Version 1. Los Alamos National Laboratory (LANL), Los Alamos, NM (United States). (Software). https://doi.org/10.11578/dc.20171025.1962
Where the software is a library that was developed and run on a software platform, for example, a kinetic analysis software library with Matlab (TM) wrappers, it can be cited as follows:
- Dowson, Nicholas; Baker, Charles; Raffelt, David; Smith, Jye; Thomas, Paul; Rose, Stephen; Salvado, Olivier (2014): InsightToolkit Kinetic Analysis (itkka) Software Library. v1. CSIRO. (Software). https://doi.org/10.4225/08/540E9A7D11EB0
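The general format above can be assembled mechanically from the individual elements. The following Python sketch is one possible way to do so. The function name `format_software_citation` and its parameters are illustrative assumptions, and joining multiple creators with a semicolon follows the itkka example rather than any stated rule.

```python
def format_software_citation(creators, year, title, version, publisher, identifier):
    """Assemble 'Creator (PublicationYear): Title. Version. Publisher. (Software). Identifier'."""
    # Multiple creators are joined with "; " as in the itkka example (an assumption).
    creator_text = "; ".join(creators)
    return (
        f"{creator_text} ({year}): {title}. {version}. "
        f"{publisher}. (Software). {identifier}"
    )


citation = format_software_citation(
    creators=[
        "Dowson, Nicholas", "Baker, Charles", "Raffelt, David", "Smith, Jye",
        "Thomas, Paul", "Rose, Stephen", "Salvado, Olivier",
    ],
    year=2014,
    title="InsightToolkit Kinetic Analysis (itkka) Software Library",
    version="v1",
    publisher="CSIRO",
    identifier="https://doi.org/10.4225/08/540E9A7D11EB0",
)
print(citation)
```

With these inputs the sketch reproduces the itkka citation shown above.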
Where the software does not have a DOI, but is accessible from a URL, a suggested citation format is as follows:
Creator (PublicationYear): Title. Version. Publisher. (resourceTypeGeneral). URL. Access Date.
Again, (resourceTypeGeneral) should be replaced with (Software).
- Jones E, Oliphant E, Peterson P, et al. (2001). SciPy: Open Source Scientific Tools for Python. (Software). http://www.scipy.org/ [Online; accessed on 2018-07-26].
When the locator/identifier is a URL which doesn’t point to the exact version that has been utilised in the research, it is important to include an access date as this may help to identify the version.
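The URL-based variant can be sketched in the same way, with the access date made a required argument so it cannot be forgotten. The function name `format_url_citation` and its parameters are illustrative assumptions. The bracketed "[Online; accessed on ...]" wording mirrors the SciPy example above and is not a prescribed format.

```python
from datetime import date


def format_url_citation(creators, year, title, url, accessed, version=None, publisher=None):
    """Build a URL-based software citation that always carries an access date.

    `creators` is an already formatted author string and `accessed` is a
    datetime.date. Version and publisher are skipped when unknown.
    """
    parts = [f"{creators} ({year}): {title}."]
    if version:
        parts.append(f"{version}.")
    if publisher:
        parts.append(f"{publisher}.")
    parts.append("(Software).")
    # ISO 8601 date (YYYY-MM-DD), as in the SciPy example.
    parts.append(f"{url} [Online; accessed on {accessed.isoformat()}].")
    return " ".join(parts)


citation = format_url_citation(
    creators="Jones E, Oliphant E, Peterson P, et al.",
    year=2001,
    title="SciPy: Open Source Scientific Tools for Python",
    url="http://www.scipy.org/",
    accessed=date(2018, 7, 26),
)
print(citation)
```

Passing the access date as a `date` object rather than free text keeps the output in a single unambiguous form.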
Some repositories provide a recommended format for citing software from that repository.
- The NCAR Command Language (Version 6.4.0). (Software). (2017). Boulder, Colorado: UCAR/NCAR/CISL/TDD. http://dx.doi.org/10.5065/D6WD3XH5