Fix MetaSound + Adjust tokenizer selector + More documentation + clean code #135

Merged: 16 commits, Oct 24, 2018

@bact bact commented Oct 23, 2018

  • fix MetaSound, follow algorithm in the Snae & Brückner (2009) paper
  • update doc: pythainlp-dev-thai.md
  • more details on word tokenizers
  • add alias "longest" for "longest-matching" tokenizer
  • remove the "dict" tokenizer from the documentation, as it has no implementation in the code
  • remove the "mm" tokenizer from the documentation, as it is in maintenance mode and not recommended, but keep the code so people can still run it
  • remove unused imports
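
To illustrate the "longest" alias described above, here is a minimal sketch of an engine selector that resolves an alias before looking up a tokenizer. It is not the PyThaiNLP implementation: the names `select_tokenizer`, `ENGINES`, `ENGINE_ALIASES`, and the toy `longest_matching` function with its explicit word set are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Set

Tokenizer = Callable[[str, Set[str]], List[str]]


def longest_matching(text: str, words: Set[str]) -> List[str]:
    """Greedy longest-match segmentation against a word set (toy version)."""
    tokens: List[str] = []
    max_len = max((len(w) for w in words), default=1)
    i = 0
    while i < len(text):
        match = text[i]  # fallback: emit a single character
        # Try the longest candidate first, then shrink.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in words:
                match = text[i:j]
                break
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical registry: canonical engine names map to functions,
# aliases map to canonical names.
ENGINES: Dict[str, Tokenizer] = {"longest-matching": longest_matching}
ENGINE_ALIASES: Dict[str, str] = {"longest": "longest-matching"}


def select_tokenizer(engine: str) -> Tokenizer:
    """Resolve an alias, then return the tokenizer registered under that name."""
    name = ENGINE_ALIASES.get(engine, engine)
    try:
        return ENGINES[name]
    except KeyError:
        raise ValueError(f"unknown tokenizer engine: {engine!r}") from None


tokenize = select_tokenizer("longest")
print(tokenize("แมวกินปลา", {"แมว", "กิน", "ปลา"}))
```

Keeping aliases in a separate table means that an undocumented or unimplemented name fails loudly instead of silently falling through to a default engine.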

- more details on word tokenizers
- add alias "longest" for longest-matching tokenizer
- remove "dict" tokenizer from document, as there is no implementation in the code
- remove "mm" tokenizer from document, as it is not recommended to use (maintenance mode), but keep the code, so people can call it
- update doc: pythainlp-dev-thai.md
- remove unused import sys
Merge from PyThaiNLP/pythainlp
Add more MetaSound test

coveralls commented Oct 23, 2018

Coverage decreased (-0.5%) to 54.385% when pulling fb229b2 on bact:dev into 55fcff7 on PyThaiNLP:dev.

- follow algorithm as explained in the paper Snae & Brückner (2009) https://pdfs.semanticscholar.org/3983/963e87ddc6dfdbb291099aa3927a0e3e4ea6.pdf
- pad with zeros to a length of 4 characters (the default specified in the paper)
- retain the first letter
- rename MetaSound to metasound (lowercase function naming convention)
- remove the use of regex
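
The points in this commit message (keep the first letter, encode the rest, pad with zeros to 4 characters, no regex) can be sketched roughly as follows. This is only the overall shape of a Soundex-style encoder, not the algorithm from the paper or the code in this pull request: the function name `encode_sketch` and the `TOY_TABLE` mapping are invented for illustration, and the real consonant classes are not reproduced here.

```python
from typing import Dict

# Hypothetical mapping for illustration only; NOT the table from the paper.
TOY_TABLE: Dict[str, str] = {"ก": "1", "น": "2", "ม": "3"}


def encode_sketch(text: str, code_table: Dict[str, str], length: int = 4) -> str:
    """Keep the first character, encode the rest, and zero-pad to `length`."""
    if not text:
        return ""
    result = [text[0]]  # the first letter is retained as-is
    for ch in text[1:]:
        if len(result) >= length:
            break  # truncate once the target length is reached
        code = code_table.get(ch)  # plain dict lookup, no regex
        if code is not None:  # characters without a code are skipped
            result.append(code)
    return "".join(result).ljust(length, "0")  # pad with zeros


print(encode_sketch("มานะ", TOY_TABLE))
```

With the toy table, characters that have no code (here the vowels) are simply skipped, so short inputs end up padded with trailing zeros.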
@bact bact changed the title More documetation Fix MetaSound + Adjust tokenizer selector + More documetation Oct 23, 2018

@wannaphong wannaphong left a comment

Please rename the file MetaSound.py to metasound.py.

@wannaphong wannaphong added this to the 1.8 milestone Oct 23, 2018
@bact bact changed the title Fix MetaSound + Adjust tokenizer selector + More documetation Fix MetaSound + Adjust tokenizer selector + More documentation + clean code Oct 24, 2018

bact commented Oct 24, 2018

Renamed. It seems that when the new name is identical except for letter case, git does not see that the file changed.
So I renamed metasound.py -> metasound_.py and committed, then renamed metasound_.py -> metasound.py and committed again, in two steps.
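
The two-step rename described in this comment could be scripted along these lines. This is a sketch under assumptions: the helper name `case_only_rename`, the injectable `run` parameter, and the commit messages are invented for the example, and it assumes the starting file name was MetaSound.py, as in the review request.

```python
import subprocess
from pathlib import PurePosixPath
from typing import Callable, List


def case_only_rename(old: str, new: str, run: Callable = subprocess.run) -> List[List[str]]:
    """Rename old -> new through a temporary name, committing after each step."""
    target = PurePosixPath(new)
    # Temporary name, e.g. metasound.py -> metasound_.py
    tmp = str(target.with_name(target.stem + "_" + target.suffix))
    commands = [
        ["git", "mv", old, tmp],
        ["git", "commit", "-m", f"Rename {old} to {tmp}"],
        ["git", "mv", tmp, new],
        ["git", "commit", "-m", f"Rename {tmp} to {new}"],
    ]
    for command in commands:
        run(command, check=True)  # stop at the first failing command
    return commands


# Dry run: print the commands instead of executing them.
case_only_rename(
    "MetaSound.py",
    "metasound.py",
    run=lambda command, check: print(" ".join(command)),
)
```

Calling the function without the `run` argument would execute the commands in the current repository, so the dry run is a safer first step.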

@wannaphong wannaphong merged commit 020805c into PyThaiNLP:dev Oct 24, 2018
@wannaphong wannaphong mentioned this pull request Nov 3, 2018

3 participants