Mika Hämäläinen edited this page Dec 30, 2021 · 7 revisions

What models are there?

Model Catalogue

UralicNLP currently uses three kinds of models: HFST morphological generators, HFST morphological analysers and constraint grammar (CG) disambiguators. The HFST models are available for all supported languages, while CGs are available for only a few.

The models originate mostly from the Giellatekno repository and Apertium. Their copyrights belong to the respective authors; however, everything provided by Giellatekno and Apertium is open source.

Downloading models

from uralicNLP import uralicApi
uralicApi.download("fin")

The above snippet downloads all the models for Finnish. Run it with sudo privileges for a system-wide installation.

Where are models located?

from uralicNLP import uralicApi
print(uralicApi.__model_base_folders())

This gives you the list of possible locations for the models. If you want to create your own models, create a subdirectory in any of these locations, named with the three-letter language code of your language. Name your models generator, analyser and cg, without file extensions.
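As a rough sketch of the layout described above, the following copies custom model files into a language subdirectory using only the Python standard library. The helper name `install_custom_models`, its parameters and any paths passed to it are hypothetical and not part of UralicNLP; the base folder is assumed to be one of the locations printed by the snippet above.

```python
import shutil
from pathlib import Path

# File names expected inside a language folder (no extensions),
# as described in the paragraph above.
MODEL_NAMES = ("generator", "analyser", "cg")


def install_custom_models(base_folder, lang, sources):
    """Copy model files into <base_folder>/<lang>/ under the expected names.

    Hypothetical helper, not part of UralicNLP.
    `sources` maps a model name ("generator", "analyser" or "cg")
    to the path of a model file you have built yourself.
    """
    unknown = set(sources) - set(MODEL_NAMES)
    if unknown:
        raise ValueError(f"Unknown model names: {sorted(unknown)}")

    target_dir = Path(base_folder) / lang
    target_dir.mkdir(parents=True, exist_ok=True)

    installed = []
    for name, source in sources.items():
        destination = target_dir / name  # deliberately no file extension
        shutil.copyfile(source, destination)
        installed.append(destination)
    return installed
```

A call such as `install_custom_models(folder, "xyz", {"analyser": "my_analyser.hfstol"})` would then place the file at `folder/xyz/analyser`, where `folder` and the language code `xyz` are placeholders.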

Uninstalling models

If you want to free up some space, or are unsure which models will be loaded when UralicNLP is used, you can also uninstall models easily:

from uralicNLP import uralicApi
uralicApi.uninstall("fin")
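If it is unclear which copies of a language's models exist on disk, one option is to scan the candidate folders before uninstalling. The sketch below uses only the standard library; `find_models` is a hypothetical helper, not a UralicNLP function, and it assumes the folder list comes from the call shown under "Where are models located?". It checks only the three file names mentioned on this page, and it makes no claim about which copy UralicNLP loads first.

```python
from pathlib import Path

# Model file names mentioned on this page (no extensions).
MODEL_NAMES = ("generator", "analyser", "cg")


def find_models(base_folders, lang):
    """Return {folder: [model names present]} for one language.

    Hypothetical helper, not part of UralicNLP. Folders that have no
    subdirectory for `lang`, or none of the known model files, are omitted.
    """
    found = {}
    for folder in base_folders:
        lang_dir = Path(folder) / lang
        present = [name for name in MODEL_NAMES if (lang_dir / name).is_file()]
        if present:
            found[str(folder)] = present
    return found
```

If the result lists more than one folder for the same language, several copies are installed, and removing the unwanted ones avoids ambiguity.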

Using your own transducers

It is possible to use your own transducer file with UralicNLP by passing a filename parameter:

from uralicNLP import uralicApi
uralicApi.generate("kissa+N+Pl+Nom", "fin", filename="/path_to_your/transducer.hfstol")
uralicApi.analyze("kissat", "fin", filename="/path_to_your/transducer.hfstol")
uralicApi.lemmatize("kissat", "fin", filename="/path_to_your/transducer.hfstol")