The FAIR principles state that data and metadata should be made accessible through standardised communications protocols. This guide explains what standardised communications protocols are in practice.
All information and communications technology relies on standardised communications protocols to operate effectively. A communications protocol is a set of formal rules describing how to transmit or exchange data, especially across a network. A standardised communications protocol is one that has been codified as a standard. Examples of standardised communications protocols include WiFi, the Internet Protocol, and the Hypertext Transfer Protocol (HTTP).
Modern communications protocols rarely operate in isolation; they depend on other protocols in a layered model known as a stack. Each layer in a stack relies on the layers below it and provides services to the layers above. For example, the Internet works using the TCP/IP stack, which is divided into four layers.
You will generally only need to understand, and make decisions about, the top (Application) layer, because your IT department or eResearch partner has already set up the lower layers through their infrastructure. The four layers, from bottom to top, are as follows.
The Link layer governs the direct connections between two devices, such as a computer and a network switch or a phone and a mobile network tower.
The Internet layer routes traffic from a source network to a destination network.
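The Transport layer manages the end-to-end delivery of data between any two devices, regardless of which networks they are on. For example, it carries data between your computer and a server in a remote location, while the routing itself is left to the Internet layer.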
The Application layer is where two computer applications communicate with each other. For example, a web browser uses HTTP to request and retrieve data from a web server.
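To make the layering concrete, the following minimal sketch sends an HTTP request (Application layer) over a TCP connection (Transport layer), leaving the Internet and Link layers to the operating system and network infrastructure. The function names `build_get_request` and `fetch_status_line` are hypothetical and chosen for illustration only; in practice you would normally use an HTTP library rather than raw sockets.

```python
import socket


def build_get_request(host: str, path: str = "/") -> bytes:
    """Application layer: format an HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close the connection when done
        "",  # an empty line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_status_line(host: str, port: int = 80, path: str = "/") -> str:
    """Send the request over TCP and return the first line of the response."""
    # Transport layer: open a TCP connection to the server. The lower layers
    # (Internet and Link) are handled by the operating system and the network.
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_get_request(host, path))
        response = b""
        # Read until the first line of the response is complete.
        while b"\r\n" not in response:
            chunk = conn.recv(4096)
            if not chunk:
                break
            response += chunk
    return response.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
```

Calling `fetch_status_line` with the host name of a web server you are allowed to query returns that server's HTTP status line. The application code only deals with the HTTP text and the TCP connection; it never needs to know how the packets are routed.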
So what are examples of standardised communications protocols that you can use to make your data available?
- You can make data available as a download from a website or through a repository via HTTP
- You can make the data available from a file server via FTP
- You can make the data available through a well-documented API
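As a rough illustration of the first option, a program can retrieve a published file with Python's standard library. This is a minimal sketch: the function name `download` is hypothetical, and any URL you pass to it would be your own repository's download link. `urllib.request.urlopen` also accepts `ftp://` URLs, so the same function could be pointed at a file server.

```python
import shutil
import urllib.request
from pathlib import Path


def download(url: str, destination: str) -> int:
    """Stream the resource at `url` to `destination`; return the bytes written."""
    target = Path(destination)
    # Copy in chunks so that large files are not held in memory all at once.
    with urllib.request.urlopen(url, timeout=30) as response, target.open("wb") as out:
        shutil.copyfileobj(response, out)
    return target.stat().st_size
```

Because the transfer relies only on a standard protocol, the person or machine downloading the data does not need any software specific to your repository.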
The link to the dataset or the API should be included in a machine-readable form on the landing page and in the metadata. This allows not only humans but also machines to find the link to the actual dataset. Consult a metadata expert such as a data librarian, repository manager, or ARDC expert on how to include a link. For more in-depth information about metadata and storing metadata, see our metadata guide.
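One possible way to express such a link in machine-readable form is a schema.org `Dataset` description serialised as JSON-LD and embedded in the landing page. This is an assumption for illustration only: your repository may use a different metadata schema, and the function name `dataset_jsonld` and all argument values are hypothetical.

```python
import json


def dataset_jsonld(name: str, landing_page: str, download_url: str, media_type: str) -> str:
    """Return a JSON-LD description that links a dataset to its download."""
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "url": landing_page,  # the human-readable landing page
        "distribution": [
            {
                "@type": "DataDownload",
                "encodingFormat": media_type,
                "contentUrl": download_url,  # the link a machine can follow
            }
        ],
    }
    return json.dumps(record, indent=2)
```

A harvester that understands schema.org can read the `contentUrl` value and fetch the data without a person having to locate the download button.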
What is an API and when should it be used?
Although HTTP was initially developed to transmit web pages, it has since been adopted for the transfer of other types of information too. One way to exchange information over HTTP is through web APIs (Application Programming Interfaces). APIs allow computer applications to share and access machine-readable data. These applications can run on computers located anywhere, relying on the other network protocols in the stack (see figure above) to handle data transport.
An API can also allow applications to share and access subsets of data. This is particularly useful for datasets where it would be impractical or unsuitable to transfer them in their entirety, such as large or sensitive datasets.
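The sketch below shows, under stated assumptions, how an API might return only the requested slice of a dataset. The query parameters `offset` and `limit`, the cap `MAX_LIMIT`, and the function name `select_subset` are all hypothetical; a real API defines its own parameters in its documentation.

```python
import json
from urllib.parse import parse_qs

MAX_LIMIT = 100  # assumed upper bound on records returned per request


def select_subset(records: list[dict], query_string: str) -> str:
    """Return a JSON page of `records` chosen by `offset` and `limit`."""
    params = parse_qs(query_string)
    # Fall back to defaults when a parameter is missing; clamp to safe ranges.
    offset = max(int(params.get("offset", ["0"])[0]), 0)
    limit = min(max(int(params.get("limit", ["10"])[0]), 0), MAX_LIMIT)
    page = records[offset:offset + limit]
    return json.dumps({"total": len(records), "offset": offset, "records": page})
```

A client asking for `offset=20&limit=10` receives at most ten records starting from the twenty-first, so the full dataset never has to leave the server.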
There are already a number of well-documented APIs used for the exchange of data and metadata. For example, OGC WMS is used for geo-registered map images and OAI-PMH is used for exchanging repository metadata. If there is not already a standardised API for your kind of data or metadata, a software developer can create one and document it using a specification such as OpenAPI.
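Requests to these standard APIs are ordinary HTTP URLs with agreed query parameters. The following sketch builds an OAI-PMH `ListRecords` request and a WMS `GetCapabilities` request. The base URLs used with it would be your own service endpoints, and the function names are hypothetical; the parameter names and values come from the respective standards.

```python
from urllib.parse import urlencode


def oai_pmh_list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH request that lists records in the given metadata format."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def wms_get_capabilities_url(base_url: str) -> str:
    """Build a WMS request asking the service to describe its map layers."""
    query = urlencode({"service": "WMS", "request": "GetCapabilities"})
    return f"{base_url}?{query}"
```

Because the parameters are standardised, the same client code works against any repository or map server that implements the standard.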
Authorisation and authentication
Not all data can be made openly available. If data cannot be made openly available, there should be a well-documented authorisation procedure for obtaining access. The landing page and the metadata describing the data should make it clear how to request access. Once access is authorised, there should be an authentication mechanism by which a human or machine can then access the data securely. A traditional authentication method is a username and password.
Many application layer protocols have some method of authentication built into them. For example, HTTP provides a mechanism for a username and password challenge. Alternatively, many APIs authenticate requests with an API key. A key is generally a long, randomly generated string of numbers and letters. A key should be treated with the same level of security and confidentiality as a password.
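Both mechanisms can be sketched with Python's standard library. The example below builds the `Authorization` header used by HTTP Basic authentication, generates a random key, and compares keys in constant time. The function names are hypothetical, and how a key is sent to the server (for instance in a request header) is defined by each API, so check its documentation.

```python
import base64
import hmac
import secrets


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of the Authorization header for HTTP Basic auth."""
    # The credentials are only Base64-encoded, not encrypted, so this header
    # should be sent over an encrypted connection (HTTPS).
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def new_api_key(num_bytes: int = 32) -> str:
    """Generate a random key as a hexadecimal string (two characters per byte)."""
    return secrets.token_hex(num_bytes)


def keys_match(presented: str, stored: str) -> bool:
    """Compare two keys in constant time to avoid leaking timing information."""
    return hmac.compare_digest(presented.encode("utf-8"), stored.encode("utf-8"))
```

For example, `basic_auth_header("Aladdin", "open sesame")` returns `Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==`. Since anyone who intercepts that value can decode it, it needs the same protection as the password itself.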
Open, free, and universally implementable
The standardised communications protocol used to facilitate access to the data should be open, free and universally implementable.
- Open means available to access by anyone without any barriers
- Free means available without cost and not belonging to any single person or organisation
- Universally implementable means able to be used by anyone in the world
In practice, this means using protocols that other people can also use in their own systems without any barriers or special costs. Most of the protocols underpinning the Internet, such as TCP/IP and HTTP, meet these principles.