Thesis metadata, Bachelor of Engineering, ICT (taught in Finnish)

This dataset was gathered from the 478 bachelor's theses written for the Finnish-taught ICT (Information and Communication Technology) degree program and published between 2017 and 2022. Most of the theses can be found in Theseus, the joint database of the Finnish Universities of Applied Sciences. Students could choose whether to write their thesis in Finnish or English; 135 theses were written in English.

The data was extracted from the theses' PDF files by a Python script into two Excel files, one for the theses written in Finnish and one for those written in English, which were then cleaned and converted into CSV and JSON format. During extraction, about 100 theses whose metadata could not be read in full were discarded, most of them for not following the given template, others for providing Finnish metadata for an English thesis or for not mentioning the author's name. Rows marked as “restricted” (8 in Fi, 2 in En), “does not follow the template” (2 in En) or “not in theseus” (17 in Fi, 1 in En) were removed by hand, as were a few rows with obvious logical errors (e.g. more keyword appearances than words in the thesis). Dots at the end of keywords and spaces in the middle of words were removed, and minor typos were corrected.
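The row filtering and keyword clean-up described above were done partly by hand; a minimal sketch of how the same rules could be automated is shown here. The column names (`mark`, `keywords`, `keyword_appearances`, `word_count`) are hypothetical and are not taken from the published files.

```python
# Hypothetical column names; the actual headers in the Excel/CSV files may differ.
EXCLUDED_MARKS = {"restricted", "does not follow the template", "not in theseus"}


def clean_keyword(keyword: str) -> str:
    """Strip surrounding whitespace and any trailing dots from a keyword."""
    return keyword.strip().rstrip(".").strip()


def is_plausible(row: dict) -> bool:
    """Reject rows with more keyword appearances than words in the thesis."""
    return row["keyword_appearances"] <= row["word_count"]


def clean_rows(rows: list[dict]) -> list[dict]:
    """Drop marked or implausible rows and normalise the keywords of the rest."""
    cleaned = []
    for row in rows:
        # Rows carrying one of the exclusion marks are skipped entirely.
        if row.get("mark", "").strip().lower() in EXCLUDED_MARKS:
            continue
        if not is_plausible(row):
            continue
        cleaned.append({**row, "keywords": [clean_keyword(k) for k in row["keywords"]]})
    return cleaned
```

Spaces in the middle of words and minor typos are not handled here, since fixing them reliably would likely still need manual review.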

The word count covers only the body of the thesis, excluding the abstract and appendices. Ten pages are provided by the thesis template. The supervisor IDs match those in the dataset for the English-taught ICT degree program.

The dataset contains the following fields:

Total References – Total number of references

Printed References – Number of printed references

Internet References – Number of references from the internet

Weak References – Number of weak references (Wikipedia, Reddit, blogs, YouTube)

Pages – Number of pages

Total Word Count – Number of words

Study Credits – Number of study credits at the time of graduation

Study Entitlement Days – Length of study entitlement measured in days

Grade – Thesis grade (1–5, where 1 is the lowest passing grade and 5 the highest)
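As an illustration of how the fields listed above might be consumed, the following sketch reads the CSV version with Python's standard `csv` module and averages one field per grade. It assumes that the CSV headers match the field names exactly, that the file is comma-delimited, and that every listed field holds a whole number; none of this is confirmed by the description. The sample rows are made up.

```python
import csv
import io

# Field names as listed above; the exact CSV headers and delimiter are assumptions.
INT_FIELDS = [
    "Total References",
    "Printed References",
    "Internet References",
    "Weak References",
    "Pages",
    "Total Word Count",
    "Study Credits",
    "Study Entitlement Days",
    "Grade",
]


def load_theses(csv_file) -> list[dict]:
    """Read thesis rows from an open CSV file and convert the listed fields to int."""
    rows = []
    for raw in csv.DictReader(csv_file):
        row = dict(raw)
        for field in INT_FIELDS:
            row[field] = int(raw[field])
        # Grades are documented as 1-5, so anything else signals a bad row.
        if not 1 <= row["Grade"] <= 5:
            raise ValueError(f"Grade out of range: {row['Grade']}")
        rows.append(row)
    return rows


def mean_by_grade(rows: list[dict], field: str) -> dict[int, float]:
    """Average one numeric field per thesis grade."""
    grouped: dict[int, list[int]] = {}
    for row in rows:
        grouped.setdefault(row["Grade"], []).append(row[field])
    return {grade: sum(values) / len(values) for grade, values in sorted(grouped.items())}


# Made-up sample rows for illustration only; not taken from the dataset.
SAMPLE = (
    ",".join(INT_FIELDS) + "\n"
    "20,5,15,2,45,9000,240,1400,3\n"
    "30,10,20,0,55,11000,245,1500,3\n"
    "10,2,8,1,35,7000,240,1300,5\n"
)

rows = load_theses(io.StringIO(SAMPLE))
print(mean_by_grade(rows, "Pages"))
```

For the real file, the `io.StringIO(SAMPLE)` argument would be replaced by a file object opened with `newline=""`, as the `csv` module documentation recommends.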


Additional Info

Collection – Open Data
Maintainer – Turku University of Applied Sciences
Valid from – 27.02.2024
Last modified – 22.03.2024
Created on – 27.02.2024