Underthesea - Vietnamese NLP Toolkit
Updated Jun 22, 2024 - Python
Solves basic Russian NLP tasks, API for lower level Natasha projects
Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing
Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.
Bitextor generates translation memories from multilingual websites
Rule-based token and sentence segmentation for the Russian language
CKIP CoreNLP Toolkits
A toolkit for discourse segmentation (EDU segmentation).
🦜 Containerized HTTP API for industrial-strength NLP via spaCy and sense2vec
A sentence segmentation library with wide language support optimized for speed and utility.
NLP tools: word segmentation, sentence segmentation, new-word discovery
A flexible sentence segmentation library using a CRF model and regex rules
Deep neural approach to Boundary and Disfluency Detection - Based on my Master's work
Pre-trained models for tokenization, sentence segmentation and so on
Sentence Segmentation for spaCy
HTML2SENT modifies HTML to improve sentence tokenizer quality
Vietnamese Sentence Boundary Detection
Punctuation Restoration for the Khmer language
A tool to perform sentence segmentation on Japanese text