5. Publication
This step can be implemented as follows:
The person promoting data sharing in the organisation and the data administrator
- prepare comprehensive metadata for the data.
- determine where the data can be published.
- publish the data in a data portal, such as opendata.fi.
- together with the organisation’s communications unit, agree on how they will communicate about the publication of the data.
The organisation's IT professionals
- prepare the data for publication.
Metadata description
This step discusses why and how the organisation should describe the dataset to be opened and where the description data can be published. This description is also referred to as metadata.
Why is describing the dataset important?
Describing datasets helps data users, both people and computers, to understand the data and, consequently, facilitates its re-use. Information that describes the data is called metadata. Metadata provides information, for example, about the dataset's origin, structure or access rights. Without this information it would be impossible to use the data. When data is published on the Internet, its original context can easily be lost, which makes it particularly important to provide the user with metadata.
Let’s say that a dataset contains the number 37. Without metadata, the number 37 could refer to indoor temperature, shoe size, seating position or something else. You need metadata to interpret the number correctly. In this case, the metadata could state that the number 37 is an indoor temperature given in degrees Celsius.
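The example above can be sketched in code. This is a minimal illustration only: the class name `Measurement` and its fields are assumptions made for this sketch, not part of any recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """A value stored together with the metadata needed to interpret it."""

    value: float
    quantity: str  # what was measured (hypothetical field name)
    unit: str      # unit of measurement (hypothetical field name)

    def describe(self) -> str:
        # Combine the value and its metadata into a readable statement.
        return f"{self.quantity}: {self.value:g} {self.unit}"


# A bare number says nothing about what it means.
bare_value = 37

# The same number with metadata can be interpreted correctly.
described = Measurement(value=37, quantity="indoor temperature", unit="degrees Celsius")
print(described.describe())
```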
Datasets usually have two types of metadata.
- Internal metadata of the dataset describes the data fields of the dataset and their connections.
  - For example, the data model can specify the date format used in a table or what the table column ‘name’ refers to.
  - To define the data model, the Data Vocabularies tool on the interoperability platform can be used.
- External metadata of the dataset describes the entire set, including its administrator and quality and how the data was produced.
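As a rough sketch of internal metadata, the data model below records the date format used in a table and what the column ‘name’ refers to, and then uses that information to check a row. The column names, the `DATA_MODEL` structure and the `validate_row` helper are hypothetical and only illustrate the idea. They do not represent the output of the Data Vocabularies tool.

```python
from datetime import datetime

# Hypothetical internal metadata (data model) for a small table.
DATA_MODEL = {
    "name": {"type": str, "description": "Name of the building where the value was measured"},
    "measured_on": {"type": str, "format": "%Y-%m-%d", "description": "Measurement date"},
    "temperature_c": {"type": float, "description": "Indoor temperature in degrees Celsius"},
}


def validate_row(row: dict) -> list[str]:
    """Return a list of problems found when comparing a row with the data model."""
    problems = []
    for field, spec in DATA_MODEL.items():
        if field not in row:
            problems.append(f"missing field: {field}")
            continue
        value = row[field]
        if not isinstance(value, spec["type"]):
            problems.append(f"{field}: expected {spec['type'].__name__}")
            continue
        if "format" in spec:
            # The data model states the date format, so it can be checked.
            try:
                datetime.strptime(value, spec["format"])
            except ValueError:
                problems.append(f"{field}: does not match format {spec['format']}")
    return problems


good_row = {"name": "City Library", "measured_on": "2024-01-31", "temperature_c": 21.5}
bad_row = {"name": "City Library", "measured_on": "31.1.2024", "temperature_c": 21.5}
print(validate_row(good_row))
print(validate_row(bad_row))
```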
No Information Management Board recommendation exists on drawing up metadata. As one example of current practice, the metadata published on the opendata.fi service uses DCAT-AP 2.0, the EU countries’ common data model.
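To give an idea of what external metadata in a DCAT-style model can look like, the sketch below builds a small dataset description as JSON-LD. The property names come from the DCAT and Dublin Core vocabularies, but the dataset, organisation and URLs are invented for illustration, and the record has not been validated against DCAT-AP 2.0 or against the fields that opendata.fi actually requires.

```python
import json

# Illustrative dataset description; all values and example.org URLs are invented.
dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
        "foaf": "http://xmlns.com/foaf/0.1/",
        "vcard": "http://www.w3.org/2006/vcard/ns#",
    },
    "@id": "https://example.org/dataset/indoor-temperatures",
    "@type": "dcat:Dataset",
    "dct:title": {"@value": "Indoor temperatures of public buildings", "@language": "en"},
    "dct:description": {
        "@value": "Daily indoor temperature readings, in degrees Celsius.",
        "@language": "en",
    },
    "dct:publisher": {"@type": "foaf:Agent", "foaf:name": "Example Agency"},
    "dcat:contactPoint": {
        "@type": "vcard:Kind",
        "vcard:hasEmail": {"@id": "mailto:opendata@example.org"},
    },
    "dcat:keyword": ["temperature", "buildings"],
    # Update cycle expressed with the EU frequency vocabulary.
    "dct:accrualPeriodicity": {
        "@id": "http://publications.europa.eu/resource/authority/frequency/ANNUAL"
    },
    "dcat:distribution": {
        "@type": "dcat:Distribution",
        "dcat:accessURL": {"@id": "https://example.org/data/indoor-temperatures.csv"},
        "dct:format": {
            "@id": "http://publications.europa.eu/resource/authority/file-type/CSV"
        },
        "dct:license": {"@id": "https://creativecommons.org/licenses/by/4.0/"},
    },
}

print(json.dumps(dataset, indent=2, ensure_ascii=False))
```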
Under the INSPIRE Directive, EU Member States have an obligation to prepare metadata for the spatial data sets and services covered by the Directive and publish them on a search service, which in Finland is Paikkatietohakemisto.
How should open data be described?
The metadata should be available in both human and machine readable formats.
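One way to serve both audiences is to keep a single metadata record and generate both representations from it. The sketch below assumes a simple dictionary with invented field names and values. A real portal would normally produce these views for you.

```python
import json

# Hypothetical metadata record; field names and values are illustrative.
metadata = {
    "title": "Indoor temperatures of public buildings",
    "producer": "Example Agency",
    "licence": "CC BY 4.0",
    "updated": "2024-01-31",
}


def to_machine_readable(record: dict) -> str:
    """Serialise the record as JSON for programs to read."""
    return json.dumps(record, ensure_ascii=False, sort_keys=True)


def to_human_readable(record: dict) -> str:
    """Render the same record as labelled lines for people to read."""
    return "\n".join(f"{key.capitalize()}: {value}" for key, value in record.items())


print(to_machine_readable(metadata))
print(to_human_readable(metadata))
```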
It is advisable to provide the metadata in Finnish, but also in Swedish and English if possible. In particular, metadata in English is useful for potential international data users. The datasets described in the opendata.fi service can be automatically found in the data.europa.eu service, which compiles European data and thus gives international visibility to open data from different countries. Organisations that have already opened their data have, in connection with publishing the dataset, published descriptive information on the data, including the title, context, producer, content and structure of the published dataset.
Organisations that have already opened their data have published the metadata in connection with the dataset, either on a data portal (such as opendata.fi), on the organisation's website or both.
What kind of metadata should be provided?
The metadata of a dataset should include at least the following:
- basic information, including the name and licence
  - When you publish a dataset in a data portal, there usually are mandatory fields for the necessary basic information.
- a description of the data content and possible shortcomings
- the data production process
- a description of data quality using the indicators of the public administration’s common quality criteria
- any calculation formulas associated with data production or similar, if possible
- contact details of the data administrator
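The list above can be turned into a simple check that is run before publication. The sketch below is only an illustration: the field names in `REQUIRED_FIELDS` are assumptions and do not correspond to the mandatory fields of any particular data portal. Calculation formulas are left out of the check because the list treats them as optional.

```python
# Hypothetical field names mapped to the items of the minimum metadata list.
REQUIRED_FIELDS = {
    "name": "basic information: name",
    "licence": "basic information: licence",
    "description": "description of the data content and possible shortcomings",
    "production_process": "the data production process",
    "quality": "description of data quality",
    "contact": "contact details of the data administrator",
}


def missing_metadata(record: dict) -> list[str]:
    """Return the labels of required items that are absent or empty."""
    missing = []
    for field, label in REQUIRED_FIELDS.items():
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(label)
    return missing


draft = {
    "name": "Indoor temperatures of public buildings",
    "licence": "CC BY 4.0",
    "description": "Daily readings; some buildings lack data for 2020.",
    "production_process": "",
    "contact": "opendata@example.org",
}
print(missing_metadata(draft))
```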
It is also advisable to specify in the metadata how often the dataset will be updated. The date on which the data will be updated and the update cycle are important information for data users. Data distributed through an API (such as weather data) may even be updated in real time, whereas data shared in file format (for example statistical data) may be updated less frequently, for example once a year.
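When the update date and update cycle are recorded in the metadata, a data user can work out when the next update is due. The sketch below assumes three cycle names and treats a year as 365 days, which is an approximation. The names and helper functions are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical update cycles; "annual" is approximated as 365 days.
UPDATE_CYCLES = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
    "annual": timedelta(days=365),
}


def next_update_due(last_updated: date, cycle: str) -> date:
    """Return the date by which the next update is expected."""
    return last_updated + UPDATE_CYCLES[cycle]


def is_overdue(last_updated: date, cycle: str, today: date) -> bool:
    """Tell whether the expected update date has already passed."""
    return today > next_update_due(last_updated, cycle)


print(next_update_due(date(2023, 1, 31), "annual"))
print(is_overdue(date(2023, 1, 31), "annual", today=date(2024, 3, 1)))
```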
Publication and communication
This step describes how and where the dataset to be opened should be published and how its publication should be communicated.
No official recommendations exist for publishing and communicating about the datasets to be opened.
Organisations that have already opened their data have usually published the datasets to be opened and their metadata in a public data portal, ensuring that the datasets can be found as easily and quickly as possible.
Where should the opened data be published?
Several data sharing portals are available in Finland.
- Anyone can publish their open data on opendata.fi.
- As a rule, the cities in the Helsinki Metropolitan Area publish their data on the Helsinki Region Infoshare (HRI) service.
It is advisable to make the datasets opened by central government actors visible on opendata.fi. This way, the message reaches a large number of people interested in using the data.
The opendata.fi service serves as a national open data contact point, and the metadata for the datasets available on it is harvested by and published on the data.europa.eu service administered by the European Commission. Opendata.fi also contains a large volume of metadata for datasets published in other Finnish data portals. Among other things, data are harvested from the Helsinki Region Infoshare service and Paikkatietohakemisto.
In addition to publishing the metadata for its dataset in a data portal, the organisation may also publish the data on its website.
Compliance with FAIR principles in data publication
The FAIR principles were originally developed for research data. They can also be applied to open data publication, even though not all of the principles can necessarily be followed as such.
FAIR stands for
- Findable
- Accessible
- Interoperable, and
- Re-usable.
Following these principles helps ensure that data can be easily re-used and that published data and its metadata are of high quality.
Check that the open data you publish complies with the FAIR principles:
- Publish the data in a public data portal, such as opendata.fi.
- Make sure that the data is assigned a unique identifier.
- Publish the data in an open file format.
- Describe the data comprehensively.
- License the data under an open licence, such as Creative Commons BY 4.0.
Read more about the FAIR principles and publishing open data (in Finnish).
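The checklist above could be approximated in code as follows. This is a rough sketch under several assumptions: the record fields, the sets of open formats and licences, and the length threshold used as a stand-in for a comprehensive description are all invented for illustration and are not an official FAIR assessment.

```python
# Assumed examples of open formats and licences; not an exhaustive or official list.
OPEN_FORMATS = {"csv", "json", "xml", "geojson"}
OPEN_LICENCES = {"CC BY 4.0", "CC0 1.0"}
MIN_DESCRIPTION_LENGTH = 50  # crude stand-in for "described comprehensively"


def fair_checklist(record: dict) -> dict[str, bool]:
    """Evaluate each checklist item for a hypothetical metadata record."""
    return {
        "published in a public data portal": bool(record.get("portal_url")),
        "has a unique identifier": bool(record.get("identifier")),
        "uses an open file format": str(record.get("format", "")).lower() in OPEN_FORMATS,
        "described comprehensively": len(str(record.get("description", ""))) >= MIN_DESCRIPTION_LENGTH,
        "has an open licence": record.get("licence") in OPEN_LICENCES,
    }


record = {
    "portal_url": "https://example.org/portal/indoor-temperatures",
    "identifier": "https://example.org/dataset/indoor-temperatures",
    "format": "CSV",
    "description": "Short text",
    "licence": "CC BY 4.0",
}

for item, passed in fair_checklist(record).items():
    print(f"{'OK ' if passed else 'FIX'} {item}")
```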
How should I communicate about the publication of data?
It is advisable for an organisation to market the datasets it has opened using different methods of communication and encourage the use of the data.
For example, you can spread information about the opened data
- on the organisation's website,
- in a newsletter and
- on social media.
In addition, data can be presented to the organisation's networks, and stakeholders can be encouraged to use it by organising different events where the data is utilised.
The organisation should also encourage data users to tell it about applications based on the open data it has published. Such applications can be showcased, for example on social media, to inspire other application developers and data users.