Releases: PyThaiNLP/pythainlp
PyThaiNLP v4.1.0-beta4
Docs: https://pythainlp.github.io/dev-docs/
Report bug: https://github.com/PyThaiNLP/pythainlp/issues
Install: pip install --pre pythainlp
See 4.1 Milestone.
What's Changed
- Fix incorrect passing of flags to re.split by @hauntsaninja in #832
- Add pythainlp.ancient by @wannaphong in #833
- Add syllable_tokenize by @wannaphong in #834
- Add wanchanberta_thai_grammarly by @wannaphong in #836
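The `re.split` fix in #832 addresses a common Python pitfall: the third positional parameter of `re.split` is `maxsplit`, not `flags`. The sketch below illustrates the general bug class using only the standard library; the sample string and pattern are made up for illustration and are not taken from the PyThaiNLP code.

```python
import re
import warnings

text = "aXbXcXd"

# Buggy form: the third positional parameter of re.split is maxsplit, so
# re.IGNORECASE (integer value 2) is read as "split at most twice" and the
# match stays case-sensitive.
with warnings.catch_warnings():
    # Newer Python versions warn about a positional maxsplit; silence that here.
    warnings.simplefilter("ignore", DeprecationWarning)
    buggy = re.split("x", text, re.IGNORECASE)

# Fixed form: pass the flag by keyword so it reaches the flags parameter.
fixed = re.split("x", text, flags=re.IGNORECASE)

print(buggy)  # ['aXbXcXd']
print(fixed)  # ['a', 'b', 'c', 'd']
```

Passing `flags=` by keyword avoids the problem regardless of how many optional arguments precede it.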
New Contributors
- @hauntsaninja made their first contribution in #832
Full Changelog: v4.1.0-beta3...v4.1.0-beta4
PyThaiNLP v4.1.0-beta3
PyThaiNLP v4.1.0-beta2
What's Changed
Full Changelog: v4.1.0-beta1...v4.1.0-beta2
PyThaiNLP v4.1.0-beta1
Schedule
- First Beta release: 24 July 2023
Docs: https://pythainlp.github.io/dev-docs/
Report bug: https://github.com/PyThaiNLP/pythainlp/issues
Install: pip install --pre pythainlp
See 4.1 Milestone.
What is new?
Deprecation and other API changes
- 5e97e7c Change the default NER to thainer-v2
New API
- Add pythainlp.coref to support Thai coreference resolution #802
- Add wtpsplit to sentence segmentation & paragraph segmentation #804 and add paragraph_threshold into paragraph_tokenize function #806
- Add word approximation to pythainlp.soundex.sound by @wannaphong in #809
- Add pythainlp.wsd for Thai Word Sense Disambiguation by @wannaphong in #818
- Add pythainlp.chat and WangChanGLM to pythainlp.generate by @wannaphong in #819
- Add a param-free classification model (pythainlp.cls) by @c4n in #821
- Add pythainlp.el by @wannaphong in #822
- Add pythainlp.util.abbreviation_to_full_text by @wannaphong in #826
Tokenizer
- Add wtpsplit engine to sentence_tokenize #804
- New paragraph_tokenize function to split Thai text into paragraphs. #804
- Add paragraph_threshold to the paragraph_tokenize function by @pavaris-pm in #806
Translate
- Add small100 to pythainlp.translate by @wannaphong in #815
Corpus
- Add orst list by @wannaphong in #810
- Add thai_synonym by @wannaphong in #825
Util
- Add pythainlp.util.encoding by @wannaphong in #813
- Add pythainlp.util.spell_words by @wannaphong in #817
- Add pythainlp.util.abbreviation_to_full_text by @wannaphong in #826
New Contributors
- @pavaris-pm made their first contribution in #806
- @falukelo made their first contribution in #824
Full Changelog: v4.0.0...v4.1.0-beta1
PyThaiNLP v4.0.2 Released!
PyThaiNLP v4.0.2 is a bug fix release of PyThaiNLP v4.0.
Upgrade: pip install -U pythainlp
Documentation: https://pythainlp.github.io/docs/4.0
Report bug: https://github.com/PyThaiNLP/pythainlp/issues
What's Changed
- fixed bug by @kangkengkhadev in #798
- Fix เอือน อวน by @kangkengkhadev in #799
Full Changelog: v4.0.1...v4.0.2
Contributors
Thanks to all the contributors.
If you want to contribute to PyThaiNLP, you can read Contributing to PyThaiNLP.
PyThaiNLP v4.0.1 Released!
PyThaiNLP v4.0.1 is a bug fix release of PyThaiNLP v4.0.
Upgrade: pip install -U pythainlp
Documentation: https://pythainlp.github.io/docs/4.0
Report bug: https://github.com/PyThaiNLP/pythainlp/issues
What's Changed
- Fix mishandling Karun in Kavee Matra Checker by @HRNPH in #793
- adding tonemark removal to fix mattra checking by @HRNPH in #795
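Tone marks sit between a syllable's consonants and vowels, so a checker that compares syllable endings generally needs them stripped first. The following is a minimal standard-library sketch of tone mark removal; `strip_tone_marks` is a hypothetical helper written for illustration, not the code from #795. PyThaiNLP itself ships `pythainlp.util.remove_tonemark` for this purpose.

```python
# The four Thai tone marks: mai ek, mai tho, mai tri, mai chattawa
# (U+0E48 through U+0E4B).
THAI_TONE_MARKS = "\u0e48\u0e49\u0e4a\u0e4b"

# Translation table that deletes every tone mark character.
_STRIP_TONES = str.maketrans("", "", THAI_TONE_MARKS)


def strip_tone_marks(text: str) -> str:
    """Return text with Thai tone marks removed (hypothetical helper)."""
    return text.translate(_STRIP_TONES)


# "น้ำ" is NO NU + MAI THO + SARA AM; only the tone mark is dropped.
print(strip_tone_marks("น้ำ"))
```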
Full Changelog: v4.0.0...v4.0.1
Contributors
Thanks to all the contributors.
If you want to contribute to PyThaiNLP, you can read Contributing to PyThaiNLP.
PyThaiNLP 4.0 Released!
PyThaiNLP published its first version, 0.0.4, to PyPI six years ago, so PyThaiNLP 4.0 has a special codename: PyThaiNLP 4.0 (Real).
See 4.0 Milestone.
Documentation: https://pythainlp.github.io/docs/4.0
Report bug: https://github.com/PyThaiNLP/pythainlp/issues
If you want to contribute to PyThaiNLP, you can read Contributing to PyThaiNLP.
What is new?
Deprecation and other API changes
- Delete all LST20 model #728
- 947c7be Change pythainlp.tools.misspell to pythainlp.tools.misspell.misspell
Improve
Tokenizer
Tag
Util
Transliterate
- Add Thai2Rom ONNX model #743
Khavee
Parse
- Add ud_goeswith #757
Corpus
- Add new science word #763
Full Changelog
- Improve: Reduce import time by @wannaphong in #719
- Create CITATION.cff by @wannaphong in #721
- Fix/broken numeric data format (#652) by @noppayut in #723
- Add blackboard pos_tag to cls by @wannaphong in #734
- Update perceptron.py by @wannaphong in #736
- Feature/integrate transliteration dictionary (#681) by @noppayut in #735
- Delete all LST20 model by @wannaphong in #728
- Add blackboard cls by @wannaphong in #732
- Add blackboard pos_tag by @wannaphong in #733
- Add style.css: extend docs page width by @LXZE in #742
- Add rule to TCC and Change TCC rule for newmm by @wannaphong in #741
- Setup action to check for code formatting by @new5558 in #746
- Add more test for TCC by @wannaphong in #747
- Add Thai2Rom ONNX model by @new5558 in #743
- Add pythainlp.util.count_thai_chars by @wannaphong in #748
- Feature: keyword extraction with keybert and frequency ranking by @noppayut in #751
- Add ud_goeswith by @wannaphong in #757
- Bump tensorflow from 2.7.2 to 2.9.3 by @dependabot in #758
- Add new science word by @wannaphong in #763
- Add thai_strptime and convert_years by @wannaphong in #767
- Fix typo in thai_full_month_lists for February by @PhakphumV in #770
- Add pythainlp.util.phoneme by @wannaphong in #772
- Add remove tone ipa by @wannaphong in #776
- add khavee to pythainlp by @kangkengkhadev in #777
- Add khavee docs tests by @wannaphong in #778
- add aek/too checker function to khavee by @HRNPH in #779
- Add Thai NER 2.0 by @wannaphong in #781
- Add Copyright to the header files by @wannaphong in #782
- Fixed some issues in Khavee. It's a problem with use อ by @kangkengkhadev in #785
- PyThaiNLP 4.0 beta 1 by @wannaphong in #786
- fix some bugs and add check_karu_lahu function by @kangkengkhadev in #787
- PyThaiNLP 4.0 Released! by @wannaphong in #789
Full Changelog: v3.1.0...v4.0.0
Contributors
Thanks to all the contributors.
If you want to contribute to PyThaiNLP, you can read Contributing to PyThaiNLP.
New Contributors
- @LXZE made their first contribution in #742
- @new5558 made their first contribution in #746
- @PhakphumV made their first contribution in #770
- @kangkengkhadev made their first contribution in #777
- @HRNPH made their first contribution in #779
PyThaiNLP v4.0.0-beta1
This post gives the change log for PyThaiNLP 4.0. PyThaiNLP published its first version, 0.0.4, to PyPI six years ago, so PyThaiNLP 4.0 has a special codename: PyThaiNLP 4.0 (Real).
This release is the first beta release of PyThaiNLP 4.0.
Schedule
- Beta release: 1 April 2023
- Production release: 14 April 2023
See 4.0 Milestone.
What is new?
Deprecation and other API changes
- Delete all LST20 model #728
- 947c7be Change pythainlp.tools.misspell to pythainlp.tools.misspell.misspell
Improve
Tokenizer
Tag
Util
Transliterate
- Add Thai2Rom ONNX model #743
Khavee
Parse
- Add ud_goeswith #757
Corpus
- Add new science word #763
What's Changed
- Improve: Reduce import time by @wannaphong in #719
- Create CITATION.cff by @wannaphong in #721
- Fix/broken numeric data format (#652) by @noppayut in #723
- Add blackboard pos_tag to cls by @wannaphong in #734
- Update perceptron.py by @wannaphong in #736
- Feature/integrate transliteration dictionary (#681) by @noppayut in #735
- Delete all LST20 model by @wannaphong in #728
- Add blackboard cls by @wannaphong in #732
- Add blackboard pos_tag by @wannaphong in #733
- Add style.css: extend docs page width by @LXZE in #742
- Add rule to TCC and Change TCC rule for newmm by @wannaphong in #741
- Setup action to check for code formatting by @new5558 in #746
- Add more test for TCC by @wannaphong in #747
- Add Thai2Rom ONNX model by @new5558 in #743
- Add pythainlp.util.count_thai_chars by @wannaphong in #748
- Feature: keyword extraction with keybert and frequency ranking by @noppayut in #751
- Add ud_goeswith by @wannaphong in #757
- Bump tensorflow from 2.7.2 to 2.9.3 by @dependabot in #758
- Add new science word by @wannaphong in #763
- Add thai_strptime and convert_years by @wannaphong in #767
- Fix typo in thai_full_month_lists for February by @PhakphumV in #770
- Add pythainlp.util.phoneme by @wannaphong in #772
- Add remove tone ipa by @wannaphong in #776
- add khavee to pythainlp by @kangkengkhadev in #777
- Add khavee docs tests by @wannaphong in #778
- add aek/too checker function to khavee by @HRNPH in #779
- Add Thai NER 2.0 by @wannaphong in #781
- Add Copyright to the header files by @wannaphong in #782
- Fixed some issues in Khavee. It's a problem with use อ by @kangkengkhadev in #785
- PyThaiNLP 4.0 beta 1 by @wannaphong in #786
New Contributors
- @LXZE made their first contribution in #742
- @new5558 made their first contribution in #746
- @PhakphumV made their first contribution in #770
- @kangkengkhadev made their first contribution in #777
- @HRNPH made their first contribution in #779
Full Changelog: v3.1.0...v4.0.0-beta1
PyThaiNLP v3.1.1 Released!
PyThaiNLP v3.1.1 is an update release of PyThaiNLP v3.1.0.
What's Changed
- pythainlp.tools.misspell changed to pythainlp.tools.misspell.misspell.
- Add Reduce import time #719 to PyThaiNLP 3.1.1 #753
- Doc: Lst20 deprecation warning for 3.1.1 (#749) #752 (Thank you @noppayut)
Full Changelog: v3.1.0...v3.1.1
You can install or upgrade with pip install pythainlp==3.1.1.
Documentation: https://pythainlp.github.io/docs/3.1
Report bug: https://github.com/PyThaiNLP/pythainlp/issues
See 3.1 Milestone.
Contributors
Thanks to all the contributors.
PyThaiNLP v3.1.0 Released!
This is the release of PyThaiNLP v3.1.0.
You can install it with pip install pythainlp==3.1.0.
Documentation: https://pythainlp.github.io/docs/3.1
Report bug: https://github.com/PyThaiNLP/pythainlp/issues
See 3.1 Milestone.
What is new?
Deprecation and other API changes
#687 Remove deprecated functions
- pythainlp.word_vector: doesnt_match, get_model, most_similar_cosmul, sentence_vectorizer, similarity. Use the WordVector class instead.
- pythainlp.util.delete_tone. Use pythainlp.util.remove_tonemark instead.
- pythainlp.util.time_time. Use pythainlp.util.time_to_thaiword instead.
- pythainlp.tokenize.syllable_tokenize. Use pythainlp.tokenize.subword_tokenize instead.
Dependency Parsing
- PyThaiNLP now supports dependency parsing 🎉 Add pythainlp.parse.dependency_parsing #706
Named Entity Tagging
- #665 Add Thai-NNER (pythainlp.tag.NNER)
- #658 Add LST20NER ONNX model. This is the LST20NER model, fine-tuned from WangchanBERTa, converted to ONNX.
Transliteration
- #659 Add ISO 11940 transliteration
- #660 Add Thai W2P v0.2
- #686 Add wunsen
- #694 Wunsen Mandarin and Japanese update
PyThaiNLP Corpus downloader
- #656 Add support zip/tar.gz to download corpus
Text normalization
- #673 Add a normalising rule for Lakkhangyao ๅ
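Lakkhangyao (ๅ, U+0E45) looks almost identical to Sara Aa (า, U+0E32) but is normally used only after ฤ and ฦ, so it is easy to mistype. The sketch below shows one plausible normalising rule under that assumption: replace any Lakkhangyao that does not follow ฤ or ฦ with Sara Aa. The function name and the exact rule are illustrative assumptions, not necessarily what #673 implements.

```python
import re

LAKKHANGYAO = "\u0e45"  # ๅ
SARA_AA = "\u0e32"      # า
RU = "\u0e24"           # ฤ
LU = "\u0e26"           # ฦ

# Match a Lakkhangyao that is NOT immediately preceded by ฤ or ฦ.
_STRAY_LAKKHANGYAO = re.compile(f"(?<![{RU}{LU}]){LAKKHANGYAO}")


def fix_lakkhangyao(text: str) -> str:
    """Replace stray Lakkhangyao with Sara Aa (assumed rule, for illustration)."""
    return _STRAY_LAKKHANGYAO.sub(SARA_AA, text)


# A Lakkhangyao after an ordinary consonant is treated as a mistyped Sara Aa,
# while the legitimate sequence ฤ + ๅ is left unchanged.
print(fix_lakkhangyao("\u0e01\u0e45"))
print(fix_lakkhangyao("\u0e24\u0e45"))
```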
Translate
- #674 add gpu option
Text summarize
- #679 Add mt5 cpe kmutt thai sentence sum
Util
- #682 Add live-dead syllable classification
- #684 Add live dead syllable classify
- #690 Add tone detector
Soundex
- #699 Add Thai-English Cross-Language Transliterated Word Retrieval using Soundex Technique
Other
- #689 map NG tag to PART
- #691 Remove TinyDB as a dependency
- #692 Fix notifications that newer versions of corpora are available
- Add warning about LST20 license
Contributors
New Contributors
- @chameleonTK made their first contribution in #673
- @vikimark made their first contribution in #674
- @BLKSerene made their first contribution in #691
- @cakimpei made their first contribution in #694
Full Changelog: v3.0.10...v3.1.0
All Contributors
Thanks to all the contributors.
We build Thai NLP.
PyThaiNLP