Skip to content

Commit

Permalink
Update readme
Browse files Browse the repository at this point in the history
  • Loading branch information
datquocnguyen committed May 29, 2018
1 parent 908ab52 commit 8fa8caf
Showing 1 changed file with 6 additions and 3 deletions.
9 changes: 6 additions & 3 deletions Readme.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,12 +8,16 @@ VnCoreNLP is a Java NLP annotation pipeline for Vietnamese, providing rich lingu

**The general architecture and experimental results of VnCoreNLP can be found in the following related papers:**

1. Thanh Vu, Dat Quoc Nguyen, Dai Quoc Nguyen, Mark Dras and Mark Johnson. **2018**. VnCoreNLP: A Vietnamese Natural Language Processing Toolkit. In *Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations*, [NAACL 2018](http://naacl2018.org), to appear. [[.pdf]](https://arxiv.org/abs/1801.01331)
2. Dat Quoc Nguyen, Dai Quoc Nguyen, Thanh Vu, Mark Dras and Mark Johnson. **2018**. [A Fast and Accurate Vietnamese Word Segmenter](http://www.lrec-conf.org/proceedings/lrec2018/summaries/55.html). In *Proceedings of the 11th International Conference on Language Resources and Evaluation*, [LREC 2018](http://lrec2018.lrec-conf.org/en/), pages 2582-2587. [[.bib]](https://people.eng.unimelb.edu.au/dqnguyen/resources/LREC2018.bib)
1. Thanh Vu, Dat Quoc Nguyen, Dai Quoc Nguyen, Mark Dras and Mark Johnson. **2018**. [VnCoreNLP: A Vietnamese Natural Language Processing Toolkit](http://aclweb.org/anthology/N18-5012). In *Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations*, [NAACL 2018](http://naacl2018.org), pages 56-60. [[.bib]](http://aclweb.org/anthology/N18-5012.bib)
2. Dat Quoc Nguyen, Dai Quoc Nguyen, Thanh Vu, Mark Dras and Mark Johnson. **2018**. [A Fast and Accurate Vietnamese Word Segmenter](http://www.lrec-conf.org/proceedings/lrec2018/summaries/55.html). In *Proceedings of the 11th International Conference on Language Resources and Evaluation*, [LREC 2018](http://lrec2018.lrec-conf.org/en/), pages 2582-2587. [[.bib]](https://dblp.uni-trier.de/rec/bibtex/conf/lrec/NguyenNVDJ18)
3. Dat Quoc Nguyen, Thanh Vu, Dai Quoc Nguyen, Mark Dras and Mark Johnson. **2017**. [From Word Segmentation to POS Tagging for Vietnamese](http://aclweb.org/anthology/U17-1013). In *Proceedings of the 15th Annual Workshop of the Australasian Language Technology Association*, [ALTA 2017](http://alta2017.alta.asn.au), pages 108-113. [[.bib]](http://aclweb.org/anthology/U17-1013.bib)

Please **CITE** paper [1] whenever VnCoreNLP is used to produce published results or incorporated into other software. If you are dealing in depth with either word segmentation or POS tagging, you are encouraged to also cite paper [2] or [3], respectively.

**NOTE** that if you are looking for light-weight versions, VnCoreNLP's word segmentation and POS tagging components have also been released as independent packages [RDRsegmenter](https://github.com/datquocnguyen/RDRsegmenter) [2] and [VnMarMoT](https://github.com/datquocnguyen/VnMarMoT) [3], resepectively.

VnCoreNLP is **free** for non-commercial use and distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International ([CC BY-NC-SA](https://creativecommons.org/licenses/by-nc-sa/4.0/)) License.

## Using VnCoreNLP from the command line

Assume that Java 1.8+ is already set to run in the command line or terminal (for example: adding Java to the environment variable `path` in Windows OS); and file `VnCoreNLP-1.0.1.jar` (27MB) and folder `models` (113MB) are placed in the same working folder. You can run VnCoreNLP to annotate an input raw text corpus (e.g. a collection of news content) by using following commands:
Expand All @@ -27,7 +31,6 @@ Assume that Java 1.8+ is already set to run in the command line or terminal (for
// To perform word segmentation
$ java -Xmx2g -jar VnCoreNLP-1.0.1.jar -fin input.txt -fout output.txt -annotators wseg

**NOTE** that if you are looking for light-weight versions, VnCoreNLP's word segmentation and POS tagging components have also been released as independent packages [RDRsegmenter](https://github.com/datquocnguyen/RDRsegmenter) [2] and [VnMarMoT](https://github.com/datquocnguyen/VnMarMoT) [3], resepectively.

## Using VnCoreNLP from the API

Expand Down

0 comments on commit 8fa8caf

Please sign in to comment.