
Chinese Word Segment 2018-11

clean() removes useless characters; the data is then shuffled and split into train 70% / dev 20% / test 10%.
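A minimal sketch of the cleaning and splitting step. The character whitelist in `clean()`, the helper name `split()`, and the `seed` parameter are assumptions made for illustration; the repository's actual rules may differ.

```python
import random
import re


def clean(text):
    # Assumed rule: keep only CJK characters, ASCII letters/digits and spaces.
    text = re.sub(r"[^\u4e00-\u9fa5a-zA-Z0-9 ]", "", text)
    # Collapse runs of spaces so each word boundary is a single space.
    return re.sub(r" +", " ", text).strip()


def split(sents, seed=0):
    # Hypothetical helper: shuffle a copy, then cut at 70% and 90%.
    sents = list(sents)
    random.Random(seed).shuffle(sents)
    n = len(sents)
    bound1, bound2 = n * 7 // 10, n * 9 // 10
    return sents[:bound1], sents[bound1:bound2], sents[bound2:]
```

Integer arithmetic is used for the split boundaries so that the 70/20/10 cut does not depend on floating-point rounding.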

convert() strips the spaces to get sent and records their positions to get label; pad() pads both to the same length.
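The conversion can be sketched as follows. It assumes that a label of 1 marks a character followed by a space, and that `pad()` truncates and pads on the right with 0; the padding side and the `seq_len` and `pad_val` parameters are assumptions, not confirmed by the source.

```python
def convert(text):
    # Split a space-segmented sentence into raw characters and boundary labels.
    sent, label = [], []
    for char in text.strip():
        if char == " ":
            if label:
                label[-1] = 1  # the previous character ends a word
        else:
            sent.append(char)
            label.append(0)
    return "".join(sent), label


def pad(seq, seq_len, pad_val=0):
    # Assumed behavior: truncate to seq_len, then pad on the right.
    seq = list(seq)[:seq_len]
    return seq + [pad_val] * (seq_len - len(seq))
```

For example, `convert("今天 天气 不错")` gives the sentence `今天天气不错` and the label `[0, 1, 0, 1, 0, 0]`.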

Sequence labeling models are built with the rnn and s2s architectures, and mask_loss and mask_acc are computed for them.
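A framework-free sketch of the masked metrics for a single sequence, assuming a binary label per character and a 0/1 mask that is 0 at padded positions. The signatures, the `eps` clipping value, and the `thre` threshold are illustrative assumptions; the repository may compute these inside a deep learning framework.

```python
import math


def mask_loss(probs, labels, mask, eps=1e-7):
    # Binary cross-entropy averaged over unmasked (real) positions only.
    total, count = 0.0, 0
    for p, y, m in zip(probs, labels, mask):
        if m:
            p = min(max(p, eps), 1 - eps)  # avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1 - p)
            count += 1
    return total / max(count, 1)


def mask_acc(probs, labels, mask, thre=0.5):
    # Fraction of unmasked positions whose thresholded prediction matches the label.
    hits = [int((p > thre) == bool(y)) for p, y, m in zip(probs, labels, mask) if m]
    return sum(hits) / max(len(hits), 1)
```

Masking keeps padded positions from inflating accuracy or diluting the loss, since padding would otherwise be trivially predicted as 0.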

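predict() compares the original sentence length with the padded length to get mask_pred, then inserts a space after each character whose value is 1.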
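The post-processing step might look like the sketch below. It assumes right padding, per-position probabilities in `preds`, and a 0.5 threshold; `insert_space` and its parameters are hypothetical names, and the model call itself is left out.

```python
def insert_space(sent, preds, seq_len, thre=0.5):
    # Keep only predictions for real characters (assumes right padding).
    real_len = min(len(sent), seq_len)
    mask_pred = [int(p > thre) for p in preds[:real_len]]
    chars = []
    for char, flag in zip(sent, mask_pred):
        chars.append(char)
        if flag:
            chars.append(" ")  # a 1 marks a word boundary after this character
    # Characters beyond seq_len were never predicted; append them unchanged.
    return ("".join(chars) + sent[real_len:]).strip()
```

Under these assumptions, calling it on `今天天气不错` with boundary predictions after the second and fourth characters returns `今天 天气 不错`, and predictions at padded positions are ignored.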

data
feat
model
stat
.gitignore
README.md
build.py
eval.py
explore.py
nn_arch.py
preprocess.py
represent.py
segment.py
util.py