suffix-lemmatizer

Suffix-lemmatizer is a lemmatizer for the Estonian language that handles both in-vocabulary and out-of-vocabulary (OOV) words. For OOV words, it generates candidate lemmas by applying suffix transformations to the word form and ranks the candidates with a statistical model.
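
The candidate-and-rank idea can be illustrated with a small standalone sketch. This is not the library's implementation: the function names (`learn_rules`, `rank_candidates`), the toy training pairs, and the relative-frequency score are all assumptions chosen to keep the example short, and the actual statistical model may be quite different.

```python
from collections import Counter

# Toy (word form, lemma) pairs, assumed for illustration only.
TOY_PAIRS = [
    ('majast', 'maja'),
    ('puust', 'puu'),
    ('linnast', 'linn'),
    ('laevast', 'laev'),
    ('koolist', 'kool'),
]


def common_prefix_len(a, b):
    """Return the length of the longest shared prefix of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def learn_rules(pairs):
    """Count minimal suffix rewrites (form_suffix, lemma_suffix) in the pairs."""
    rules = Counter()
    for form, lemma in pairs:
        k = common_prefix_len(form, lemma)
        rules[(form[k:], lemma[k:])] += 1
    return rules


def rank_candidates(word, rules, forms):
    """Apply every matching rule to `word` and rank the resulting lemmas."""
    scores = {}
    for (form_suffix, lemma_suffix), count in rules.items():
        if not form_suffix or not word.endswith(form_suffix):
            continue
        # Hypothetical score: how often the rule applied among the
        # training forms that end with the same suffix.
        support = sum(1 for f in forms if f.endswith(form_suffix))
        candidate = word[:len(word) - len(form_suffix)] + lemma_suffix
        score = float(count) / support
        scores[candidate] = max(scores.get(candidate, 0.0), score)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


rules = learn_rules(TOY_PAIRS)
ranked = rank_candidates('metsast', rules, [form for form, _ in TOY_PAIRS])
print(ranked[0][0])  # best-scoring candidate lemma for the unseen form
```

With this toy data, the unseen form `metsast` yields two candidates, `mets` and `metsa`, and the more specific `ast` rule scores higher than the shorter `st` rule because it applied to a larger share of the training forms with that ending.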

Suffix-lemmatizer works with Python 2.7.

Installation

git clone https://github.com/estnltk/suffix-lemmatizer.git
cd suffix-lemmatizer
python setup.py install

Usage

  from suffix_lemmatizer import SuffixLemmatizer

  # Create the lemmatizer, then call it on a word form to get its lemma
  sl = SuffixLemmatizer()
  lemma = sl('metsast')
  print(lemma)
  # lemma == 'mets'
