Removes the most frequent words (stop words) from text content. Based on a curated list of language statistics.
Updated Jul 20, 2023 - Go
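As a rough illustration of what stop word removal involves, here is a minimal Go sketch. It is not the code of the repository listed above: the `stopWords` list and the `RemoveStopWords` function are hypothetical names made up for this example, and the list is deliberately tiny, whereas a real library would ship much larger, per-language lists.

```go
package main

import (
	"fmt"
	"strings"
)

// stopWords is a tiny illustrative English list (an assumption for this
// sketch, not a list taken from any particular library).
var stopWords = map[string]struct{}{
	"a": {}, "an": {}, "and": {}, "is": {}, "of": {}, "the": {}, "to": {},
}

// RemoveStopWords lowercases the text, splits it on whitespace, and drops
// every word found in stopWords. Punctuation handling is left out.
func RemoveStopWords(text string) string {
	var kept []string
	for _, w := range strings.Fields(strings.ToLower(text)) {
		if _, isStop := stopWords[w]; !isStop {
			kept = append(kept, w)
		}
	}
	return strings.Join(kept, " ")
}

func main() {
	fmt.Println(RemoveStopWords("The quick brown fox is a friend of the dog"))
}
```

A map with empty struct values is used as a set so that each lookup is a single hash probe regardless of how long the list grows.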
Natural Language Processing (NLP) tokenization library designed for English. Fast, lean, customizable. Tokenizes text, replaces abbreviations, replaces contractions, lowercases words, and can optionally remove stop words as well.
📒 An Aho-Corasick-based string-searching utility for Go. It supports tokenization, case-insensitive matching, and text replacement, so you can use it to find keywords in an article, filter sensitive words, etc.
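For readers unfamiliar with Aho-Corasick, the following is a compact, byte-oriented Go sketch of the algorithm: a trie of patterns, failure links built breadth-first, and a single pass over the text. The `Matcher`, `NewMatcher`, `FindAll`, and `Match` names are hypothetical and do not reflect the listed utility's API. Case-insensitive matching and replacement are not shown; lowercasing both patterns and text before matching would be one way to get the former.

```go
package main

import "fmt"

type node struct {
	next map[byte]int // child node index per byte
	fail int          // failure link (longest proper suffix that is a trie prefix)
	out  []string     // patterns ending at this node, including via failure links
}

// Matcher is a hypothetical Aho-Corasick automaton over bytes.
type Matcher struct{ nodes []node }

// Match records a pattern and the byte offset where it starts in the text.
type Match struct {
	Pattern string
	Start   int
}

func NewMatcher(patterns []string) *Matcher {
	m := &Matcher{nodes: []node{{next: map[byte]int{}}}}
	// Phase 1: insert every non-empty pattern into the trie.
	for _, p := range patterns {
		if p == "" {
			continue
		}
		cur := 0
		for i := 0; i < len(p); i++ {
			nxt, ok := m.nodes[cur].next[p[i]]
			if !ok {
				m.nodes = append(m.nodes, node{next: map[byte]int{}})
				nxt = len(m.nodes) - 1
				m.nodes[cur].next[p[i]] = nxt
			}
			cur = nxt
		}
		m.nodes[cur].out = append(m.nodes[cur].out, p)
	}
	// Phase 2: breadth-first pass to set failure links and merge outputs.
	var queue []int
	for _, child := range m.nodes[0].next {
		queue = append(queue, child) // depth-1 nodes fail to the root
	}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for c, child := range m.nodes[cur].next {
			f := m.step(m.nodes[cur].fail, c)
			m.nodes[child].fail = f
			m.nodes[child].out = append(m.nodes[child].out, m.nodes[f].out...)
			queue = append(queue, child)
		}
	}
	return m
}

// step follows failure links from state until a transition on c exists.
func (m *Matcher) step(state int, c byte) int {
	for state != 0 {
		if _, ok := m.nodes[state].next[c]; ok {
			break
		}
		state = m.nodes[state].fail
	}
	if nxt, ok := m.nodes[state].next[c]; ok {
		return nxt
	}
	return state
}

// FindAll reports every occurrence of every pattern, overlaps included.
func (m *Matcher) FindAll(text string) []Match {
	var matches []Match
	state := 0
	for i := 0; i < len(text); i++ {
		state = m.step(state, text[i])
		for _, p := range m.nodes[state].out {
			matches = append(matches, Match{Pattern: p, Start: i - len(p) + 1})
		}
	}
	return matches
}

func main() {
	m := NewMatcher([]string{"he", "she", "his", "hers"})
	for _, hit := range m.FindAll("ushers") {
		fmt.Printf("%q at byte %d\n", hit.Pattern, hit.Start)
	}
}
```

The text is scanned once no matter how many patterns are loaded, which is why this approach suits keyword spotting and sensitive-word filtering with large dictionaries.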
Golang Word Frequency Counter
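A word frequency counter can be sketched with a map and a sort. The code below is a minimal illustration under assumed names (`WordFrequencies`, `TopN`, `WordCount`), not the listed repository's implementation; it treats any run of letters or digits as a word and ignores case.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// WordCount pairs a word with the number of times it occurred.
type WordCount struct {
	Word  string
	Count int
}

// WordFrequencies counts case-insensitive occurrences of each word.
func WordFrequencies(text string) map[string]int {
	freq := make(map[string]int)
	words := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	for _, w := range words {
		freq[w]++
	}
	return freq
}

// TopN returns up to n entries, most frequent first; ties are broken
// alphabetically so the result is deterministic.
func TopN(freq map[string]int, n int) []WordCount {
	counts := make([]WordCount, 0, len(freq))
	for w, c := range freq {
		counts = append(counts, WordCount{Word: w, Count: c})
	}
	sort.Slice(counts, func(i, j int) bool {
		if counts[i].Count != counts[j].Count {
			return counts[i].Count > counts[j].Count
		}
		return counts[i].Word < counts[j].Word
	})
	if n < len(counts) {
		counts = counts[:n]
	}
	return counts
}

func main() {
	freq := WordFrequencies("Go is fun. Go is fast; go, gopher, go!")
	for _, wc := range TopN(freq, 3) {
		fmt.Printf("%s: %d\n", wc.Word, wc.Count)
	}
}
```

The explicit tie-break matters because Go randomizes map iteration order, so without it equal counts could appear in a different order on each run.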
Plagiarism detection using stopwords n-grams
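The general idea behind stop word n-grams is to discard content words, keep only the sequence of stop words, and compare documents by the n-grams of that sequence, since the function-word skeleton tends to survive when content words are swapped for synonyms. The Go sketch below illustrates this under loud assumptions: the `stopWords` list is a tiny made-up sample, the `StopwordNGrams` and `Jaccard` names are hypothetical, and Jaccard overlap is a similarity measure chosen here for simplicity, not necessarily what the listed project uses.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stopWords is a tiny illustrative list (an assumption for this sketch).
var stopWords = map[string]struct{}{
	"a": {}, "in": {}, "of": {}, "on": {}, "the": {}, "to": {},
}

// StopwordNGrams keeps only the stop words of the text, in order, and
// returns the set of n-grams over that stop word sequence.
func StopwordNGrams(text string, n int) map[string]struct{} {
	var seq []string
	words := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r)
	})
	for _, w := range words {
		if _, ok := stopWords[w]; ok {
			seq = append(seq, w)
		}
	}
	grams := make(map[string]struct{})
	for i := 0; n > 0 && i+n <= len(seq); i++ {
		grams[strings.Join(seq[i:i+n], " ")] = struct{}{}
	}
	return grams
}

// Jaccard returns |a ∩ b| / |a ∪ b|, or 0 when both sets are empty.
func Jaccard(a, b map[string]struct{}) float64 {
	shared := 0
	for g := range a {
		if _, ok := b[g]; ok {
			shared++
		}
	}
	union := len(a) + len(b) - shared
	if union == 0 {
		return 0
	}
	return float64(shared) / float64(union)
}

func main() {
	original := StopwordNGrams("The cat sat on the mat in a house", 3)
	suspect := StopwordNGrams("The dog lay on the rug in a shed", 3)
	fmt.Printf("similarity: %.2f\n", Jaccard(original, suspect))
}
```

In the usage above, the two sentences share no content words, yet their stop word sequences are identical, so every 3-gram matches. A real detector would work on longer passages, where such agreement is far less likely to occur by chance.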