Finna training corpora

This dataset contains TF-IDF matrices intended for machine learning. The matrices were generated from document corpora built from metadata extracted from the Finna service in 2019 via its open API. Corpora are available in Finnish, Swedish and English.
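For readers unfamiliar with the format, the following is a minimal sketch of how a TF-IDF matrix can be derived from a small corpus. It is not the pipeline used to build this dataset: the whitespace tokenizer, the relative term frequency, the unsmoothed logarithmic IDF, the function name `tfidf_matrix` and the toy records are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Return (vocabulary, rows); rows[i][j] is the TF-IDF weight of term j in document i.

    Assumed weighting (not taken from the dataset description):
    tf = term count / document length, idf = ln(N / document frequency).
    """
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokenized = [doc.lower().split() for doc in docs]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)

    # Document frequency: the number of documents in which each term occurs.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {term: math.log(n_docs / df[term]) for term in vocabulary}

    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        rows.append(
            [(counts[term] / total) * idf[term] if total else 0.0 for term in vocabulary]
        )
    return vocabulary, rows


# Hypothetical toy metadata records, one string per document.
docs = [
    "history of finland",
    "history of sweden",
    "finland and sweden",
]
vocabulary, matrix = tfidf_matrix(docs)
```

Each row of `matrix` corresponds to one document and each column to one term in `vocabulary`. A term that occurs in every document would get a weight of zero under this weighting, which is why production pipelines often use a smoothed IDF instead.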

Resources (6)

Additional Info

Field Value
Dataset visibility
Outdated No
More about the license

The training matrices were produced by CSC - IT Center for Science Ltd (CSC - Tieteen tietotekniikan keskus Oy). The original data was collected by the National Library of Finland (Kansalliskirjasto).

Geographical coverage
Update frequency
Valid from
Valid until
Links to additional information
Collection type Open data
International benchmarks
State Active
Dataset maintainer Analytiikkaryhmä (Analytics group)
Maintainer email
Maintainer website