4. Planning and implementation
This part can be implemented as follows:
The person promoting data sharing in the organisation and the data administrator
- assess the possibilities of sharing data, for example in terms of user rights.
- assess the benefits, risks and costs of data sharing.
- assess the need for data anonymisation and aggregation.
- evaluate the quality of the data to be shared.
- select the licence under which the data will be shared.
- make a decision on opening data.
A data protection specialist is consulted about
- identifying information security and data protection risks.
- the need for anonymisation and aggregation.
The organisation's IT experts and the person responsible for the data to be shared
- consider the technical implementation of data sharing, for example whether the data should be shared as files or through an interface.
Definition of the data to be opened
This part describes aspects that the organisation should assess and specify when it starts planning the opening of data in practice.
No official recommendations exist for defining the datasets to be opened.
At the start, organisations that have already opened their data have usually identified the person who administrates and is responsible for the data that the organisation is planning to open, the information system in its background, and potential use cases. At this stage the organisation can, for example, make use of its information management model which describes the organisation’s information resources, or rely on the assistance of the person responsible for opening data, if the organisation has allocated resources to this task.
Organisations that have opened their data have considered the aspects related to dataset ownership, copyrights, data disclosures, data protection and information security. If these factors have not prevented data sharing, the way in which the dataset could in practice be constructed and distributed technically has been examined with the data controller. At the same time, the completeness of the dataset has been determined, as well as the level of accuracy at which the data can be opened without compromising its potential usefulness or usability.
In connection with identifying the dataset, the following have also been assessed:
- the potential benefits, risks and costs of opening the data,
- the quality, metadata and publication channel of the data to be opened,
- the life cycle of the dataset, and
- planning of sufficient communications network capacity and user support.
These aspects are described in more detail in the next steps.
As a basic premise, organisations that have opened their data have striven to determine if national or international standards exist for opening the data in question (for example, data modelling and formats), or if some other party has already opened similar data, in which case the organisation can use its information model for opening its own data.
The six largest cities in Finland have compiled a list of certain international standards (Google Sheets) which have also been used for data sharing in Finland. The DCAT-AP data model profile is used in the metadata of the datasets, for example on opendata.fi. When using international standards, organisations should note the differences between the laws of different countries, especially regarding data protection.
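As an illustration of what metadata following the DCAT vocabulary can look like, the sketch below builds a simplified dataset description as JSON-LD. The property names (dcat:Dataset, dcat:distribution, dcat:downloadURL, dct:license and so on) come from the DCAT and Dublin Core vocabularies, but the function name, the example values and the URLs are hypothetical, and the result has not been validated against the full DCAT-AP profile or against the requirements of opendata.fi.

```python
import json


def build_dataset_metadata(title, description, licence_url, download_url, media_type):
    """Return a simplified DCAT-style dataset description as a JSON-LD dictionary."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:downloadURL": {"@id": download_url},
                # Simplification: a plain string instead of a media type URI.
                "dcat:mediaType": media_type,
                "dct:license": {"@id": licence_url},
            }
        ],
    }


# Hypothetical example values, for illustration only.
metadata = build_dataset_metadata(
    title="Example dataset",
    description="Illustrative description, including known quality shortcomings.",
    licence_url="https://creativecommons.org/licenses/by/4.0/",
    download_url="https://example.org/data/example.csv",
    media_type="text/csv",
)
print(json.dumps(metadata, indent=2))
```

In practice the publication channel usually generates this kind of metadata from a form, so a sketch like this is mainly useful for understanding which fields the organisation needs to have ready.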
When opening the dataset, it is also advisable to consider whether the data production process (calculation rules, algorithms, etc.) could be opened as well.
For more information, visit the web service data.europa.eu.
Benefits, risks and costs
This step describes how the organisation can assess the potential benefits, risks and costs of opening the planned dataset. To download a tool developed for assessing benefits, risks and costs, go to the section Method for assessing the potential benefits, risks and costs of opening data.
The Information Management Board has issued a set of recommendations for applying certain information security provisions (in Finnish) (Ministry of Finance publications 2021:65), according to which information risk management is a continuous activity, and the information management entity should describe the objectives, responsibilities and key methods related to it. Management is responsible for organising information risk management and allocating resources to it. In addition, the information management entity maintains datasets comprising the risk assessment results and risk management plans and regularly assesses if this data is partly or fully secret or classified.
The Information Management Board has also issued a recommendation on the criteria for assessing information security in public administration (Julkri), which contains instructions for applying the criteria (Ministry of Finance publications 2022:43). The assessment criteria support the needs to develop and evaluate information security in all branches of public administration. They can be used to assess compliance with the Public Information Management Act, the Decree on Security Classification of Documents in Central Government and, in part, the information security requirements laid down in the General Data Protection Regulation.
Ensuring data protection
This part describes the assessment of needs to aggregate and anonymise the dataset to be opened as well as organisation-specific practices for ensuring the data protection of the dataset and for carrying out any aggregation, anonymisation or pseudonymisation required.
There are no official recommendations for assessing data aggregation and anonymisation needs.
Usually, organisations that have already opened their data have carefully assessed if the dataset to be shared is public and if it may contain personal data or other data critical to the functioning of society. If the dataset to be opened contains data concerning persons or related to the security of society in some way, the opening of the datasets and any aggregation, anonymisation or pseudonymisation of the data should be discussed with the organisation's data protection officer. The data protection officer or other data protection experts or networks of the organisation should also otherwise be consulted when assessing the aggregation and anonymisation needs of the dataset.
According to the principle of openness (Act on the Openness of Government Activities 621/1999), official documents shall be in the public domain, unless specifically provided otherwise in the Act on the Openness of Government Activities or another Act. It is important to note, however, that a public document may contain personal data and that there must always be a legal basis for disclosing personal data, even if the document is public. The authority must assess if the personal data contained in the document can be disclosed. Consequently, public information does not necessarily mean that the information can be published, as a public document may contain personal data that cannot be published even though the document is not secret. The precondition for secrecy is fulfilling the criteria for secrecy laid down in the Act on the Openness of Government Activities, and secrecy provisions are also included in special legislation.
Anonymisation is a way of removing personal data from a dataset. It should be noted, however, that as long as a person can be directly or indirectly identified based on the data or the data can be reverted to an identifiable form, it is still personal data, and the General Data Protection Regulation applies to it.
Anonymisation means processing the personal data in a way that eliminates the possibility of individual persons being identified on its basis. For example, the data can be aggregated and opened at a more general level, or converted into a statistical format, which means that the data concerning an individual person is no longer in an identifiable format. Identification must be prevented irrevocably, so that neither the controller nor any third party can convert the data back into an identifiable format using the data in their possession.
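The aggregation described above can be sketched as follows: individual records are reduced to counts per group, and groups with very few records are suppressed. The function name, the record layout and the threshold of 5 are assumptions made for this illustration; the threshold is not a legal limit, and aggregation alone does not guarantee that the result is anonymous, so the assessment should still be made with data protection experts.

```python
from collections import Counter


def aggregate_counts(records, group_key, min_count=5):
    """Count records per group and suppress groups smaller than min_count.

    Suppressed groups get the value None instead of a count, so that
    very small groups cannot be used to single out individuals.
    """
    counts = Counter(record[group_key] for record in records)
    return {
        group: (count if count >= min_count else None)
        for group, count in counts.items()
    }


# Hypothetical person-level records: seven in one postal code, two in another.
records = (
    [{"person_id": i, "postal_code": "00100"} for i in range(7)]
    + [{"person_id": 100 + i, "postal_code": "00200"} for i in range(2)]
)

result = aggregate_counts(records, "postal_code")
print(result)  # {'00100': 7, '00200': None}
```

Only the aggregated result would be considered for publication; the person-level records stay inside the organisation.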
Pseudonymisation means processing the personal data in a way that eliminates the possibility of associating it with a certain person without additional information. Such additional information must be kept carefully separate from personal data.
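One possible way to pseudonymise identifiers is keyed hashing, sketched below with HMAC-SHA256 from the Python standard library. This technique is chosen here only as an illustration and is not prescribed by the guidance above. The secret key plays the role of the additional information that must be kept separate from the data; the function name and the example identifier are hypothetical. Pseudonymised data is still personal data, so this step does not by itself make a dataset suitable for opening.

```python
import hashlib
import hmac
import secrets


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex encoded).

    The same identifier and key always give the same pseudonym, so records
    can still be linked, but the pseudonym cannot be recomputed without the key.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# The key must be generated once and stored separately from the dataset.
key = secrets.token_bytes(32)

pseudonym = pseudonymise("example-person-001", key)
print(pseudonym)
```

Whoever holds the key can recompute the pseudonym for a known identifier, which is why access to the key needs to be restricted and documented.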
The Office of the Data Protection Ombudsman is the national supervisory authority overseeing compliance with data protection legislation. The Data Protection Ombudsman and the Deputy Data Protection Ombudsmen perform their duties independently and impartially. The Office of the Data Protection Ombudsman has an Expert Board (whose term of office runs from 1 October 2020 till 30 September 2023). The Expert Board’s task is, on the Data Protection Ombudsman’s request, to issue statements on significant issues related to applying the legislation on personal data processing. For more information, visit the website of the Office of the Data Protection Ombudsman.
Under the General Data Protection Regulation, certain controllers and processors of personal data must appoint a data protection officer. This obligation applies to all authorities and public administration bodies. The data protection officer provides guidance on data protection to the controller and employees processing personal data. They monitor compliance with the GDPR as well as the awareness-raising and training related to data protection in their organisation. The data protection officer provides advice related to impact assessments and serves as the contact point for the supervisory authority.
Read more:
- Data.europa.eu materials on the challenges of data administration and data protection
- National Cyber Security Centre’s report: Identifiers and data protection – anonymisation and its limits (in Finnish, pdf).
Selecting the form of data sharing and file format
This part describes the forms in which data can be shared and what should be taken into account when selecting one.
No official recommendations exist for determining the form in which datasets should be shared and the file formats to be used.
The selected data sharing method should be compliant with the legislation on access rights, data disclosure and providing data in machine-readable format as well as the obligations imposed by these statutes, such as sections 22 and 24 of the Act on Information Management in Public Administration. In addition, any needs to modify the datasets should be accounted for, including pseudonymisation or anonymisation.
Organisations that have already opened their data have shared data as files, through APIs or using a download service. The technical implementation of data sharing largely depends on the types of sharing solutions developed for the information system. Data in file format can be exported from the system as a batch report and/or through an API. APIs rarely exist in, or can be developed for, older information systems, which is why batch files may be the only option for sharing data. Whenever possible, the dataset should be shared in several different formats, for example offering a file in addition to an API.
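As a minimal sketch of offering the same dataset in several formats, the example below serialises one set of rows both as CSV and as JSON using only the Python standard library. The function names, field names and example rows are hypothetical; a real export would read the rows from the organisation's information system.

```python
import csv
import io
import json


def to_csv(rows, fieldnames):
    """Serialise a list of dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialise the same rows as JSON text, keeping non-ASCII characters readable."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


# Hypothetical example rows.
rows = [
    {"station": "A", "count": 3},
    {"station": "B", "count": 5},
]

print(to_csv(rows, ["station", "count"]))
print(to_json(rows))
```

Generating every format from the same source rows helps keep the file download and any API response consistent with each other.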
When publishing open data, it is advisable to use open data formats (file formats) as far as possible. More information on selecting file formats for open data is available on the data.europa.eu service.
Which sharing method is suitable for each type of data?
Examples of how organisations are sharing data
Defining data quality
This step describes how the quality of the dataset to be opened can be evaluated, specified and described.
No official recommendations exist for defining the quality of datasets to be opened.
Organisations that have already opened their data have striven to describe their evaluation of the current quality of the data, including possible shortcomings, in the description, or metadata, of the dataset. In the opendata.fi service, for example, you can type the data quality evaluation in the Description field of the dataset’s metadata, or add the description as a separate resource in PDF format.
It is important to note that even if the quality of the dataset to be opened is not as good as the party administering the data or their stakeholders might wish, this does not necessarily mean it cannot be shared. The dataset can still be shared, as long as the metadata draws attention to the shortcomings in its quality.
The public administration's shared data quality criteria and indicators developed to support improvement in the quality of public administration data can be used to evaluate and describe the data quality.
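One simple quality indicator that can be reported in the dataset description is completeness, meaning the share of rows in which a field has a value. The sketch below computes it and formats a short note for the metadata. Completeness is used here only as an example; it is an assumption of this illustration and does not represent the public administration's shared quality criteria, and the function names and example rows are hypothetical.

```python
def completeness_by_field(rows, fieldnames):
    """Return, for each field, the share of rows where the field has a value."""
    total = len(rows)
    result = {}
    for name in fieldnames:
        filled = sum(1 for row in rows if row.get(name) not in (None, ""))
        result[name] = filled / total if total else 0.0
    return result


def quality_note(completeness):
    """Format the completeness figures as plain text for a description field."""
    return "\n".join(
        f"{name}: {share:.0%} of rows have a value"
        for name, share in completeness.items()
    )


# Hypothetical example rows with one missing value.
rows = [
    {"address": "Example street 1", "built_year": 1990},
    {"address": "Example street 2", "built_year": None},
]

completeness = completeness_by_field(rows, ["address", "built_year"])
print(quality_note(completeness))
```

A note like this makes known shortcomings visible to data users without delaying the opening of the dataset.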
Defining access rights
This step describes how the dataset to be opened should be licensed, in other words what terms of use should be set for the data.
No official recommendations exist for defining the access rights for datasets to be opened.
Data access rights are defined by selecting a suitable licence that informs data users of the terms on which the published data may be used. The licence must be cited in the metadata of the dataset to be published.
How do you select suitable access rights for the opened data?
In order to qualify as open data, the shared data must have an open licence which allows free sharing, modification and use of the dataset for all purposes, including commercial ones. A fully open licence (for example CC0) means that the organisation waives all copyrights that restrict the dataset’s use, within the limits of legislation.
Datasets published as open data are usually licensed under a Creative Commons CC BY 4.0 or CC0 licence. While no national recommendations currently exist in Finland for the licensing of public administration’s open datasets, the earlier JHS-189 recommendation on open data access rights recommended the use of the CC BY 4.0 licence.
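Before publication, it can be useful to check automatically that the metadata cites one of the licences the organisation has chosen to use. The sketch below does this for CC0 1.0 and CC BY 4.0. The metadata layout (a dictionary with a "licence" key), the function name and the use of SPDX-style identifiers as keys are assumptions made for this illustration, not a format required by any publication channel.

```python
# Licences accepted in this sketch, keyed by SPDX-style identifier.
OPEN_LICENCES = {
    "CC0-1.0": "https://creativecommons.org/publicdomain/zero/1.0/",
    "CC-BY-4.0": "https://creativecommons.org/licenses/by/4.0/",
}


def licence_issues(metadata):
    """Return a list of problems found in the licence information of the metadata."""
    licence = metadata.get("licence")
    if not licence:
        return ["No licence is cited in the metadata."]
    if licence not in OPEN_LICENCES:
        return [f"Licence '{licence}' is not on the list of accepted open licences."]
    return []


# Hypothetical metadata records.
print(licence_issues({"title": "Example dataset", "licence": "CC-BY-4.0"}))
print(licence_issues({"title": "Example dataset"}))
```

An empty list means the check found nothing to fix; the list of accepted licences would be maintained by the organisation itself.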
Most common open data licences
Example of a disclaimer
In addition to a licence, liability towards data users may sometimes need to be limited with a disclaimer.
(Organisation name) shall not be liable for any loss, litigation, claim, prosecution, cost or damage of whatsoever nature, caused either directly or indirectly by association with the open data published by (organisation name) or the use of the open data published by (organisation name).
Why should Creative Commons licences be used?
When using Creative Commons licences, the organisation knows in advance how to proceed in situations such as disputes concerning data access rights. It is important that the organisation does not start creating licences itself, as the case law related to self-made licences cannot be predicted.
The use of known licences is also beneficial for data users:
- The Creative Commons licences are internationally recognised, which makes cross-border use possible.
- Aggregating and re-using datasets is easier when they are subject to consistent and familiar terms and conditions.
Deciding to open data
This step describes how the organisation can proceed in decision-making on opening data.
There are no official recommendations for making decisions on opening data.
Under Finnish legislation, the decision to open a dataset is made by the authority to which the task of administering the data has been assigned in the legislation. For example, the Finnish Institute for Health and Welfare (THL) makes the decisions on providing its datasets as open data. There is no centralised body in Finland that would make decisions on the openness of data in the entire administration.
Data is opened so that someone can use it. Organisations often know at least some of their customers who could use the data to be opened and see value in it. It is worth assessing the opportunities created by opening the data together with these potential data users, for example in a workshop. As input for this assessment, a description of the data the organisation could open is needed. Organisations often have a great deal of data, and only a small part of it can be opened at once.
At the same time, a decision on managing any residual risks should be made. Residual risk refers to the remaining risk or part of a risk that the organisation cannot or chooses not to counteract with measures. Read more about residual risks in risk management instructions (in Finnish).
If the organisation intends to open several datasets, it may need to prioritise the order in which they are opened and its development measures.