Morfessor demonstration
Updated Mar 13, 2015 - Python
Central repository with pretrained models for transfer learning, BPE subword-tokenization, mono/multilingual embeddings, and everything in between.
The concept of DAWGs is based on: Blumer, A. et al. (1985). The smallest automaton recognizing the subwords of a text. Theoretical Computer Science, 40, 31–55.
Repository for the experiments in my paper: "A Systematic Analysis of Vocabulary and BPE Settings for Optimal Fine-tuning of NMT: A Case Study of In-domain Translation"
Morfessor EM+Prune
Cognate-aware morphological segmentation
ICEBERT: Interlingual-Clusters Enhanced BERT. A BERT-like model trained on clusters of monolingual subwords.
Parsing and subword segmentation code for the VML-HD Dataset
Morfessor EM+Prune
Morfessor FlatCat
Morfessor is a tool for unsupervised and semi-supervised morphological segmentation