This repository has been archived by the owner on Oct 22, 2023. It is now read-only.

veer66/wordcutpy


wordcutpy

wordcutpy is a simple Thai word breaker written for Python 3.

Installation

pip install wordcutpy

Example

Conventional version

# -*- coding: utf-8 -*-
from wordcut import Wordcut

if __name__ == '__main__':
    # Load a one-word-per-line dictionary and drop duplicate entries.
    with open('bigthai.txt', encoding="UTF-8") as dict_file:
        word_list = list(set(w.rstrip() for w in dict_file))
    # Build the tokenizer from the word list and segment a mixed Thai/English string.
    wordcut = Wordcut(word_list)
    print(wordcut.tokenize("กากา cat หมา"))
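The dictionary-loading step above can be factored into a small helper. The sketch below is only an illustration, assuming a UTF-8 file with one word per line; `load_word_list` is a hypothetical name and not part of wordcutpy. Unlike the snippet above, it also skips blank lines and strips leading whitespace.

```python
from pathlib import Path


def load_word_list(path):
    """Read a one-word-per-line dictionary file and return sorted unique words.

    Hypothetical helper for illustration; not part of wordcutpy.
    """
    text = Path(path).read_text(encoding="utf-8")
    # Strip surrounding whitespace, drop blank lines, de-duplicate with a set.
    words = {line.strip() for line in text.splitlines() if line.strip()}
    return sorted(words)
```

The returned list can be passed to the constructor in the same way as `word_list` above, for example `Wordcut(load_word_list('bigthai.txt'))`.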

Simplified version

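# -*- coding: utf-8 -*-
from wordcut import Wordcut

# Wordcut.bigthai() sets up the tokenizer with the bigthai word list,
# so no manual dictionary loading is needed.
wordcut = Wordcut.bigthai()
print(wordcut.tokenize("กากา cat หมา"))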
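To give a feel for what breaking text with a word list involves, here is a toy greedy longest-match segmenter. It is a sketch under stated assumptions, not wordcutpy's implementation: this README does not describe the library's algorithm, and `longest_match_tokenize` is a hypothetical name used only for illustration.

```python
def longest_match_tokenize(text, words):
    """Greedy longest-match segmentation (toy illustration, not wordcutpy)."""
    vocab = set(words)
    max_len = max((len(w) for w in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a dictionary word matches.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No dictionary word starts here: emit one character as a fallback.
            tokens.append(text[i])
            i += 1
    return tokens


print(longest_match_tokenize("กากา", ["กา"]))  # ['กา', 'กา']
```

Greedy matching is the simplest strategy and can pick a poor split when words overlap; a real word breaker may use a different search strategy and handle spaces and non-Thai text separately.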

Test

Run the tests from the repository root:

python -m unittest discover -s tests
