gcws is a Chinese Word Segmentation (CWS) package for Go: a manager for multiple CWS adapters.
The repo is inspired by database/sql.
go get github.com/WindomZ/gcws/...
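- sego - Chinese word segmentation for Go, implemented with a double-array trie [GitHub]
- jieba - Go version of the "Jieba" (结巴) Chinese word segmenter [GitHub]
- cwsharp - Chinese word segmentation library for Go; supports multiple segmentation modes, custom dictionaries, and extensions [GitHub]
- segment - Chinese word segmentation package for Go, inspired by Pangu Segment (盘古分词) [GitHub]
- gse - Efficient text segmentation for Go; supports English, Chinese, Japanese, and other languages [GitHub]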
Import it
import (
"github.com/WindomZ/gcws"
)
Init it (example with jieba)
import (
_ "github.com/WindomZ/gcws/jieba"
)
...
cws, err := gcws.NewCWS("jieba")
Use it
cws.Tokenize("For man is man and master of his fate.") // returns []string{...}
- ModeDefault - default mode
- ModeSearch - search optimization; supported by sego, jieba, segment, gse
- ModeFast - fast mode; supported by cwsharp
- ModeEnglish - optimization for English; supported by sego, jieba
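Since not every adapter supports every mode, it can help to check the combination before picking one. The sketch below encodes the support list above as a lookup table. The Mode string type, the support map, and the Supports helper are assumptions made for illustration; how gcws itself represents a mode or applies it to a tokenizer is not shown in this README. ModeDefault is left out because the list does not name its adapters.

```go
package main

import "fmt"

// Mode is a hypothetical stand-in for however gcws represents modes.
type Mode string

const (
	ModeSearch  Mode = "search"
	ModeFast    Mode = "fast"
	ModeEnglish Mode = "english"
)

// support mirrors the README list: each mode maps to the adapters
// documented as supporting it.
var support = map[Mode][]string{
	ModeSearch:  {"sego", "jieba", "segment", "gse"},
	ModeFast:    {"cwsharp"},
	ModeEnglish: {"sego", "jieba"},
}

// Supports reports whether the README lists adapter as supporting mode.
func Supports(mode Mode, adapter string) bool {
	for _, a := range support[mode] {
		if a == adapter {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(Supports(ModeSearch, "gse"))  // prints true
	fmt.Println(Supports(ModeFast, "jieba")) // prints false
}
```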