Fix MetaSound + Adjust tokenizer selector + More documentation + clean code #135
bact commented on Oct 23, 2018
- fix MetaSound to follow the algorithm in the Snae & Brückner (2009) paper
- update doc: pythainlp-dev-thai.md
- add more details on word tokenizers
- add alias "longest" for the "longest-matching" tokenizer
- remove the "dict" tokenizer from the documentation, as it has no implementation in the code
- remove the "mm" tokenizer from the documentation, as it is in maintenance mode and not recommended; the code is kept so people can still run it
- remove unused imports
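To illustrate the "longest" alias, here is a minimal sketch of how an engine selector could map an alias onto a canonical engine name. The names `longest_match`, `_ALIASES`, `_ENGINES`, and `pick_engine` are hypothetical and are not taken from the PyThaiNLP code; the tokenizer is a toy greedy longest-matching routine over a tiny vocabulary, included only so the example is self-contained.

```python
def longest_match(text, vocab):
    """Toy greedy longest-matching segmentation (illustrative only)."""
    max_len = max((len(w) for w in vocab), default=1)
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in vocab or j == i + 1:
                tokens.append(text[i:j])
                i = j
                break
    return tokens


# Hypothetical tables: the alias points at the canonical engine name.
_ALIASES = {"longest": "longest-matching"}
_ENGINES = {"longest-matching": longest_match}


def pick_engine(name):
    """Resolve an engine name or alias to a tokenizer function."""
    canonical = _ALIASES.get(name, name)
    try:
        return _ENGINES[canonical]
    except KeyError:
        raise ValueError(f"unknown engine: {name}") from None


tokenize = pick_engine("longest")
print(tokenize("แมวกินปลา", {"แมว", "กิน", "ปลา"}))  # ['แมว', 'กิน', 'ปลา']
```

Keeping aliases in a separate table means both names resolve to the same function, so documentation can mention either without duplicating the implementation.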
- more details on word tokenizers
- add alias "longest" for longest-matching tokenizer
- remove "dict" tokenizer from document, as there is no implementation in the code
- remove "mm" tokenizer from document, as it is not recommended to use (maintenance mode), but keep the code, so people can call it
- update doc: pythainlp-dev-thai.md
- remove unused import sys
Merge from PyThaiNLP/pythainlp
- update doc
Add more MetaSound test
- minor docs additions
Delete mkdocs.yml
- follow the algorithm as explained in the Snae & Brückner (2009) paper: https://pdfs.semanticscholar.org/3983/963e87ddc6dfdbb291099aa3927a0e3e4ea6.pdf
- pad with zeros to a length of 4 characters (the default specified in the paper)
- retain the first letter
- rename MetaSound to metasound (lowercase function naming convention)
- remove the use of regex
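The behaviour described in this commit (keep the first letter, encode the rest as digits, zero-pad to length 4, no regex) can be sketched roughly as follows. This is not the code from the PR: the function name `sketch_metasound` is hypothetical, and the consonant groups in `_GROUPS` are an illustrative assumption rather than the table from the paper, which should be consulted for the actual mapping.

```python
# Illustrative consonant groups (an assumption, NOT the paper's table).
_GROUPS = {
    "1": "กขคฆ",
    "2": "จฉชซดตทส",
    "3": "บปผพฟภ",
    "4": "ง",
    "5": "นณรลฬ",
    "6": "ม",
    "7": "ย",
    "8": "ว",
}
_CODE = {ch: code for code, chars in _GROUPS.items() for ch in chars}


def sketch_metasound(text, length=4):
    """Encode Thai text as first letter + digit codes, zero-padded."""
    # Keep only characters in the Thai consonant range U+0E01..U+0E2E,
    # using plain comparisons instead of a regex (a simplification).
    consonants = [ch for ch in text if "\u0e01" <= ch <= "\u0e2e"]
    if not consonants:
        return ""
    head = consonants[0]  # the first letter is retained as-is
    # Encode the following consonants; unmapped ones become "0".
    tail = [_CODE.get(ch, "0") for ch in consonants[1:length]]
    return (head + "".join(tail)).ljust(length, "0")


print(sketch_metasound("บูรณะ"))  # บ550
print(sketch_metasound("คน"))     # ค500
```

Padding with `str.ljust` keeps every non-empty code at a fixed width, which makes codes directly comparable as strings.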
Please rename the file MetaSound.py to metasound.py.
Renamed. It seems that when the new name is the same and differs only in letter case, git does not see that it changed.