VietnameseTextToxicClassify

I trained this model with PyTorch to detect toxic comments for Projectube.

I use VnCoreNLP to preprocess the raw Vietnamese sentences and PhoBERT to train the text classification model. Both technologies are described at https://github.com/VinAIResearch/PhoBERT.
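
As a rough sketch of how these two pieces usually fit together (it follows the usage pattern shown in the PhoBERT repository and is not necessarily what training.py does): VnCoreNLP segments the raw text into words, and the segmented sentence is then passed to the PhoBERT tokenizer. The helper names, the `vinai/phobert-base` checkpoint, and the default jar path are assumptions for illustration.

```python
import os
from typing import List


def join_segmented(sentences: List[List[str]]) -> str:
    """Flatten segmenter output (one token list per sentence) into one string."""
    return " ".join(token for sentence in sentences for token in sentence)


def segment(text: str, jar_path: str = "vncorenlp/VnCoreNLP-1.1.1.jar") -> str:
    """Word-segment raw Vietnamese text (needs Java and the vncorenlp package)."""
    from vncorenlp import VnCoreNLP  # imported lazily: optional dependency

    # Assumed jar location; an absolute path avoids working-directory surprises.
    with VnCoreNLP(os.path.abspath(jar_path), annotators="wseg",
                   max_heap_size="-Xmx500m") as segmenter:
        return join_segmented(segmenter.tokenize(text))


def encode(segmented_text: str, model_name: str = "vinai/phobert-base"):
    """Convert segmented text to PhoBERT inputs (needs transformers and torch)."""
    from transformers import AutoTokenizer  # imported lazily: optional dependency

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    return tokenizer(segmented_text, return_tensors="pt")
```

Only `join_segmented` runs without the external dependencies; `segment` and `encode` need the setup described in the steps below.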

Use my code

1. Git clone my repository:

git clone https://github.com/hoangcaobao/Vietnamese_Text_Toxic_Classify.git

2. Change into the repository folder and install VnCoreNLP:

cd Vietnamese_Text_Toxic_Classify
pip3 install vncorenlp
mkdir -p vncorenlp/models/wordsegmenter
wget https://raw.githubusercontent.com/vncorenlp/VnCoreNLP/master/VnCoreNLP-1.1.1.jar
wget https://raw.githubusercontent.com/vncorenlp/VnCoreNLP/master/models/wordsegmenter/vi-vocab
wget https://raw.githubusercontent.com/vncorenlp/VnCoreNLP/master/models/wordsegmenter/wordsegmenter.rdr
mv VnCoreNLP-1.1.1.jar vncorenlp/
mv vi-vocab vncorenlp/models/wordsegmenter/
mv wordsegmenter.rdr vncorenlp/models/wordsegmenter/
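
A failed download can go unnoticed until the segmenter refuses to start, so it may help to check that the three files ended up where the commands above put them. This is a minimal sketch; the function name `missing_files` is hypothetical and not part of this repository.

```python
from pathlib import Path
from typing import List

# Relative paths produced by the mkdir and mv commands above.
EXPECTED_FILES = [
    "vncorenlp/VnCoreNLP-1.1.1.jar",
    "vncorenlp/models/wordsegmenter/vi-vocab",
    "vncorenlp/models/wordsegmenter/wordsegmenter.rdr",
]


def missing_files(root: str = ".") -> List[str]:
    """Return the expected VnCoreNLP files that are absent or empty under root."""
    base = Path(root)
    missing = []
    for rel in EXPECTED_FILES:
        path = base / rel
        # An empty file usually means the download did not complete.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    problems = missing_files()
    if problems:
        print("Missing or empty files:")
        for rel in problems:
            print("  " + rel)
    else:
        print("VnCoreNLP files are in place.")
```

Run it from the repository folder; an empty result means all three files are present.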

3. Add more data to the two JSON files (normal_dataset.json and sacarism_dataset.json)

4. Run the training script:

python3 training.py
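
For step 3, new examples can be appended with a small script instead of editing the files by hand. The layout of the two JSON files is not documented here, so this sketch assumes each file holds a flat JSON array of comment strings; check the real files and adapt it before use. The function name `append_comments` is hypothetical.

```python
import json
from pathlib import Path
from typing import List


def append_comments(path: str, new_comments: List[str]) -> int:
    """Append unseen comments to a JSON array file; return how many were added.

    ASSUMPTION: the file holds a flat JSON array of strings. Adapt this if
    the real dataset files use a different layout.
    """
    file_path = Path(path)
    data = json.loads(file_path.read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError(f"{path} does not contain a JSON array")

    seen = set(data)
    added = 0
    for comment in new_comments:
        text = comment.strip()
        # Skip blank entries and comments already in the dataset.
        if text and text not in seen:
            data.append(text)
            seen.add(text)
            added += 1

    # ensure_ascii=False keeps Vietnamese diacritics readable in the file.
    file_path.write_text(
        json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return added
```

Under that assumption, a call such as `append_comments("normal_dataset.json", [...])` adds only new, non-empty comments, so it is safe to run more than once.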

HOANG CAO BAO
