Removes the most frequent words (stop words) from text content. Based on a curated list of language statistics.
Updated Jul 20, 2023 - Go
📒 An Aho-Corasick-based string-searching utility for Go. It supports tokenization, case-insensitive matching, and text replacement, so you can use it to find keywords in an article, filter sensitive words, and so on.
Plagiarism detection using stopwords n-grams
Golang Word Frequency Counter
Natural Language Processing (NLP) tokenization library designed for English. Fast, lean, customizable. Tokenizes text, replaces abbreviations, replaces contractions, lowercases words, and can optionally remove stop words as well.
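The techniques these projects describe (tokenizing, lowercasing, removing stop words, counting word frequencies) can be sketched in a few lines of standard-library Go. This is a minimal illustration, not the API of any repository listed here: the names `stopWords`, `tokenize`, `removeStopWords`, and `wordFrequency` are hypothetical, and the stop word set is a tiny hand-picked sample rather than a curated list built from language statistics.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// stopWords is a tiny, hypothetical English stop word set for illustration
// only; real libraries ship much larger curated lists.
var stopWords = map[string]struct{}{
	"a": {}, "an": {}, "and": {}, "in": {}, "is": {}, "of": {}, "the": {}, "to": {},
}

// tokenize lowercases text and splits it on every rune that is neither a
// letter nor a number, so punctuation and whitespace act as separators.
func tokenize(text string) []string {
	return strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsNumber(r)
	})
}

// removeStopWords returns the tokens that are not in stop, preserving order.
func removeStopWords(tokens []string, stop map[string]struct{}) []string {
	kept := make([]string, 0, len(tokens))
	for _, t := range tokens {
		if _, isStop := stop[t]; !isStop {
			kept = append(kept, t)
		}
	}
	return kept
}

// wordFrequency counts how many times each token occurs.
func wordFrequency(tokens []string) map[string]int {
	freq := make(map[string]int, len(tokens))
	for _, t := range tokens {
		freq[t]++
	}
	return freq
}

func main() {
	text := "The cat sat in the hat, and the cat is happy."
	tokens := removeStopWords(tokenize(text), stopWords)
	freq := wordFrequency(tokens)

	// Sort the words so the output order is deterministic.
	words := make([]string, 0, len(freq))
	for w := range freq {
		words = append(words, w)
	}
	sort.Strings(words)
	for _, w := range words {
		fmt.Printf("%s: %d\n", w, freq[w])
	}
}
```

Splitting on non-alphanumeric runes keeps the sketch short, but it also breaks contractions such as "don't" into two tokens; a dedicated tokenizer of the kind described above would handle those cases explicitly.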