Tokenizer 1.31.0
New features
- Add utilities to build and use vocabularies:
  - pyonmttok.Vocab
  - pyonmttok.build_vocab_from_tokens
  - pyonmttok.build_vocab_from_lines
- Define the method Tokenizer.__call__ to simplify tokenizer usage when the additional token features are unused:
  tokens = tokenizer(text)
Fixes and improvements
- Update pybind11 to 2.9.1