This list describes key terms and their meanings in the context of this document. The definitions given here are not necessarily valid in other contexts.
Aggregation refers to the re-grouping of data on the basis of one or more factors to a less detailed level. The data can be aggregated and published at a more general level, or converted into a statistical format, which means that the data concerning an individual is no longer in an identifiable format.
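As an illustration of aggregation, the following minimal Python sketch replaces individual-level rows with counts per group. The records, the field choice (age and municipality) and the ten-year band width are hypothetical and chosen only for the example.

```python
from collections import defaultdict

# Hypothetical individual-level records: (age, municipality).
records = [
    (23, "Helsinki"),
    (27, "Helsinki"),
    (34, "Tampere"),
    (38, "Tampere"),
    (61, "Helsinki"),
]


def age_band(age, width=10):
    """Generalise an exact age to a band such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def aggregate(rows):
    """Count records per (municipality, age band) instead of listing individuals."""
    counts = defaultdict(int)
    for age, municipality in rows:
        counts[(municipality, age_band(age))] += 1
    return dict(counts)


summary = aggregate(records)
for key in sorted(summary):
    print(key, summary[key])
```

Groups with very small counts can still point to an individual, so in practice the grouping level may need to be coarser than in this sketch.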
Anonymisation means processing data to eliminate the possibility of identifying individuals directly or indirectly. For example, identifiers can be deleted or generalised to a level where individuals can no longer be identified. Identifiers include names, addresses, phone numbers and personal identity codes.
Source: Information on anonymisation on the website of the Office of the Data Protection Ombudsman.
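The two techniques mentioned in the anonymisation entry, deleting identifiers and generalising them, can be sketched in Python as follows. The field names and the generalisation levels (birth decade, three-digit postal area) are assumptions made for the example, and a transformation like this does not by itself guarantee that a dataset is anonymous.

```python
# Hypothetical field names for direct identifiers that are deleted outright.
DIRECT_IDENTIFIERS = {"name", "address", "phone", "personal_identity_code"}


def anonymise(record):
    """Return a copy of the record without direct identifiers and with
    indirect identifiers generalised to a coarser level."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "birth_year" in cleaned:
        # Generalise an exact birth year to a decade.
        decade = (cleaned.pop("birth_year") // 10) * 10
        cleaned["birth_decade"] = f"{decade}s"
    if "postal_code" in cleaned:
        # Keep only the first three digits of the postal code.
        cleaned["postal_area"] = cleaned.pop("postal_code")[:3] + "**"
    return cleaned


# Made-up example record.
person = {
    "name": "Example Person",
    "address": "Example Street 1",
    "phone": "000 000 0000",
    "personal_identity_code": "HYPOTHETICAL-ID",
    "birth_year": 1984,
    "postal_code": "00100",
    "visits": 3,
}

result = anonymise(person)
print(result)
```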
Open data is machine-readable data in digital format that is freely available to everyone for any purpose as long as its original source is acknowledged. Examples of open data are census data, map data or real-time data concerning bus locations.
Open source refers to open production and development methods of computer programs that allow users to access the program source code and modify it according to their needs. Open source principles also include the freedom to use the program for any purpose and to copy and distribute both the original and the modified version.
Source: About open source in Wikipedia.
An open API is an application programming interface whose features are fully public and can be used without restrictive conditions. Open APIs can be used free of charge, and the user need not ask the interface provider for permission or inform them in advance of the purpose for which they intend to use the interface. The description and documentation of an open API must be freely accessible to everyone. Testing of the interface must also be possible.
Source: avoinrajapinta.fi (in Finnish).
The Comprehensive Knowledge Archive Network (CKAN) is an open source information management system published by the Open Knowledge Foundation in 2007 which has since been developed further. It was designed particularly for publishing and finding open data. See data catalogue.
Source: CKAN software site.
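CKAN sites expose an Action API, and its `package_search` action is a common way to find datasets programmatically. The sketch below only builds the request URL and reads a response that has already been decoded from JSON; the catalogue address and the sample response are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode


def package_search_url(base_url, query, rows=5):
    """Build a URL for the CKAN Action API package_search action."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response):
    """Extract dataset titles from a decoded package_search response."""
    if not response.get("success"):
        return []
    return [dataset["title"] for dataset in response["result"]["results"]]


# Hypothetical catalogue address; replace with a real CKAN site.
url = package_search_url("https://catalogue.example.org/", "bus stops")

# Hand-written sample in the shape CKAN returns, not real catalogue output.
sample_response = {
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "bus-stops", "title": "Bus stops"},
            {"name": "bus-routes", "title": "Bus routes"},
        ],
    },
}

print(url)
print(dataset_titles(sample_response))
```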
Creative Commons is a non-profit organisation that promotes the sharing and use of creativity and information. CC licences are a standardised and internationally recognised way of granting rights to the use, further processing and sharing of data.
Source: Creative Commons website.
Data refers to machine-readable potential information consisting of digitally stored characters and symbols which can, for example, add up to documents, databases and audio recordings. Data can be understood as a raw material that produces meaningful information when refined.
Source: P2PU open cultural data course (in Finnish).
A data catalogue is a structured metadata register in which the metadata for data held by more than one public organisation is combined.
Data catalogues can be:
- national (including opendata.fi and data.gov.uk),
- regional (Washington D.C. or Helsinki Region Infoshare),
- maintained by cities (San Francisco and Tampere),
- maintained privately (Sunlight Foundation - National datacatalog).
Source: Julkinen data - johdatus tietovarantojen avaamiseen (in Finnish).
Harvesting refers to the automatic collection of data from different websites to a single location, such as opendata.fi. For example, opendata.fi harvests Paikkatietohakemisto, which means that Paikkatietohakemisto data can also be found automatically on opendata.fi. Harvesting makes it easier to find data, as you can search for it centrally in one location rather than visiting many different sites.
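The idea of harvesting can be sketched as merging metadata records from several catalogues into one searchable index. This is a simplified illustration and not how opendata.fi is implemented: the source names, the record schema (`id`, `title`) and the in-memory data are all hypothetical, whereas a real harvester would fetch the records over the network.

```python
def harvest(sources):
    """Merge metadata records from several catalogues into one index.

    `sources` maps a source name to a list of metadata records. Each record
    is a dict with at least an "id" and a "title" (a hypothetical schema).
    """
    central_index = {}
    for source_name, records in sources.items():
        for record in records:
            # Prefix the id with the source so that ids cannot collide.
            key = f"{source_name}:{record['id']}"
            central_index[key] = {**record, "harvested_from": source_name}
    return central_index


def search(index, term):
    """Return the keys of all harvested records whose title contains the term."""
    term = term.lower()
    return sorted(k for k, r in index.items() if term in r["title"].lower())


# Made-up catalogues and records.
sources = {
    "spatial-catalogue": [{"id": "roads", "title": "Road network"}],
    "city-catalogue": [
        {"id": "roads", "title": "City road maintenance"},
        {"id": "parks", "title": "Parks"},
    ],
}

index = harvest(sources)
print(search(index, "road"))
```

One search against the central index returns matches from both catalogues, which is the benefit the entry describes.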
Information refers to, for example, a text or a bit string composed to describe a certain state of affairs. The more accurately the information describes the state of affairs, the greater its information content. Information is produced by refining data.
Source: Tieteen termipankki information (in Finnish).
Machine readability means that data content is structured in a way that enables a computer to process it.
Source: Tieteen termipankki machine readability (in Finnish).
Terms and conditions applicable to the use of intellectual property rights, the right to data protection or another object protected by rights.
Metadata is information that describes the context, content or structure of a dataset and guides and documents its processing and management.
Source: Finto metadata.
Application programming interface
Application Programming Interfaces (APIs) are documented technical interfaces through which software, applications, or systems can exchange data or functionalities.
Source: Application Programming Interfaces in Governments: Why, what and how.
MyData is the principle followed in the management and processing of personal data. It means that people must be able to manage, use and disclose personal data concerning them (including health data, energy data or purchase data). MyData is not open data.
Source: My data - johdatus ihmiskeskeiseen henkilötiedon hyödyntämiseen (My data – an introduction to human-centric personal data use).
Pseudonymisation means processing personal data so that it can no longer be connected to a certain person without additional information.
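One common way to pseudonymise an identifier is keyed hashing, sketched below with HMAC-SHA256 from the Python standard library. This is only one possible technique and is not prescribed by the definition above. In this sketch the secret key plays the role of the "additional information": whoever holds it can recompute the pseudonym for a known identifier and so reconnect records to a person. The key and identifiers shown are made-up values.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key; a real key would be generated randomly and stored
# separately from the pseudonymised data.
key = b"example-secret-key"

first = pseudonymise("customer-0001", key)
second = pseudonymise("customer-0002", key)

# The same identifier and key always give the same pseudonym, so records
# belonging to one person can still be linked to each other.
print(first == pseudonymise("customer-0001", key))
print(first != second)
```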
Raw data is machine-readable data that is usually not subject to intellectual property rights.
An area of knowledge management that aims at and enables knowledge-based decision-making.
Source: Finto knowledge management.
Information management is an area of knowledge management in which the preconditions for using information are maintained and developed by means of data management, directing of information flows and monitoring of information quality.
Source: Finto information management.
Data management means organising information processes so that the availability, discoverability and usability of data for different purposes are ensured throughout the life cycle of the data.
Source: Finto data management.
A file format specifies the structure of a file in which data is stored on a computer. Examples: CSV, XML.
Source: About file formats in Wikipedia (in Finnish).
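To illustrate how the same data can be stored in the two example formats, the Python sketch below reads a small made-up table once from CSV text and once from XML text using the standard library. The stop names, line numbers and XML element names are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# The same two made-up rows expressed in two file formats.
csv_text = "stop,line\nKamppi,20\nPasila,23\n"
xml_text = (
    "<stops>"
    "<stop name='Kamppi' line='20'/>"
    "<stop name='Pasila' line='23'/>"
    "</stops>"
)

# CSV: the first row gives the column names, later rows give the values.
rows_from_csv = [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]

# XML: each <stop> element carries its values as attributes.
rows_from_xml = [
    {"stop": element.get("name"), "line": element.get("line")}
    for element in ET.fromstring(xml_text).iter("stop")
]

print(rows_from_csv)
print(rows_from_csv == rows_from_xml)
```

Both formats are machine-readable: a program can recover the same rows from either one without human interpretation.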
Knowledge is truthful and justifiable information to which the recipient has given a meaning. Knowledge is produced by refining information.
Source: Termipankki knowledge (in Finnish).
A dataset is an identifiable collection of data.
Source: Termipankki dataset (in Finnish).
Management that promotes the organisation's ability to create value through knowledge and competence.
Source: Finto knowledge management.
An information system is an overall arrangement comprising data processing equipment, software and other data processing.
Source: Act on Information Management in Public Administration.
Data balance sheet
A report used to support knowledge management that describes the state of the organisation's data processing and data management.
Source: Finto data balance sheet.
An information resource is a dataset or a collection of datasets formed for a specific purpose consisting of logically or physically interconnected data.
Source: Finto information resource.
WFS (Web Feature Service) is a standardised software-independent technology and interface through which spatial data sets, usually up-to-date ones, can be shared with users in vector format.
Source: Instructions for using WFS interfaces (in Finnish) (pdf).
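A WFS request is typically an HTTP request with standardised query parameters. The sketch below builds a WFS 2.0.0 `GetFeature` URL in Python without sending it. The endpoint address and the feature type name are hypothetical; the feature types a real service offers are listed in its `GetCapabilities` response.

```python
from urllib.parse import urlencode


def wfs_get_feature_url(endpoint, type_name, count=10):
    """Build a WFS 2.0.0 GetFeature request URL (key-value pair encoding)."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,  # feature type to fetch
        "count": count,          # maximum number of features to return
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and feature type name.
wfs_url = wfs_get_feature_url("https://maps.example.org/wfs", "example:bus_stops")
print(wfs_url)
```

Older WFS versions use slightly different parameter names (for example `typeName` and `maxFeatures` in version 1.1.0), so the version a service supports should be checked before reusing this sketch.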
WMS (Web Map Service) is a standardised software-independent interface through which spatial data sets can be shared with users as images (in raster format; vector-format SVG files are also possible). Maps used through a WMS service are usually up to date. They can, for example, be used as background maps in spatial data programs.
Source: Description of Helsinki WMS view service (in Finnish) (pdf).
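Fetching a map image from a WMS service comes down to a `GetMap` request. The Python sketch below builds a WMS 1.3.0 `GetMap` URL without sending it. The endpoint address, layer name and bounding box are hypothetical values chosen for the example; a real service lists its layers and supported coordinate reference systems in its `GetCapabilities` response.

```python
from urllib.parse import urlencode


def wms_get_map_url(endpoint, layer, bbox, width=512, height=512):
    """Build a WMS 1.3.0 GetMap request URL for a PNG image."""
    params = {
        "service": "WMS",
        "version": "1.3.0",
        "request": "GetMap",
        "layers": layer,
        "styles": "",            # empty value requests the default style
        "crs": "EPSG:3857",      # coordinate reference system of the bbox
        "bbox": ",".join(str(value) for value in bbox),
        "width": width,          # image size in pixels
        "height": height,
        "format": "image/png",
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint, layer name and bounding box (minx, miny, maxx, maxy).
wms_url = wms_get_map_url(
    "https://maps.example.org/wms",
    "example_background_map",
    (2770000, 8430000, 2790000, 8450000),
)
print(wms_url)
```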
Interoperability is the ability of actors, processes and information systems involved in the activities to operate and communicate with each other in a manner or to the extent that they can routinely use and understand each other's data.
Source: Finto interoperability.