faisaltareque/BanglaLanguageToolkit

Bangla Language Processing Toolkit

This is a very basic Bangla language processing toolkit for my personal use.

Most of the code snippets are taken from the following open source projects. Please refer to these projects for more details:

Installation

Run the following command to install:

pip install git+https://github.com/faisaltareque/BanglaLanguageToolkit.git

Usage

Code

from BanglaLanguageToolkit import BanglaTextCleaner
cleaner = BanglaTextCleaner(remove_emoji=True, remove_email=True, remove_url=True, remove_punct=True)

text = "সে কিভাবে রিগামের সাথে সম্পর্কিত? How is he related to Regum?, www.google.com, demo@gmail.com."
text = cleaner.clean(text)
print(text)

Output

সে কিভাবে রিগামের সাথে সম্পর্কিত <PUNC> How is he related to Regum <PUNC> <PUNC> <URL> <PUNC> <EMAIL> <PUNC>
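The output above shows URLs, emails, and punctuation being replaced by placeholder tokens. As a rough, standalone illustration of that idea (not the toolkit's actual implementation), the sketch below uses only Python's `re` module. The function name `replace_with_tokens` and all three patterns are assumptions made for this example; the toolkit's own rules may differ.

```python
import re

# Hypothetical patterns for illustration only; BanglaTextCleaner may use different rules.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
URL_RE = re.compile(r"(?:https?://|www\.)[\w\-./]*\w")
PUNCT_RE = re.compile(r"[?!,.;:।]")  # includes the Bangla danda


def replace_with_tokens(text: str) -> str:
    """Replace emails, URLs, and punctuation with placeholder tokens."""
    # Emails and URLs go first so their dots are not treated as punctuation.
    text = EMAIL_RE.sub(" <EMAIL> ", text)
    text = URL_RE.sub(" <URL> ", text)
    text = PUNCT_RE.sub(" <PUNC> ", text)
    # Collapse the extra spaces introduced by the substitutions.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    sample = "How is he related to Regum?, www.google.com, demo@gmail.com."
    print(replace_with_tokens(sample))
```

The order of substitutions matters in this sketch: running the punctuation pattern first would split `www.google.com` and `demo@gmail.com` at their dots before they could be recognized.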

Code

from BanglaLanguageToolkit import BanglaTextCleaner
cleaner = BanglaTextCleaner(remove_emoji=True, remove_email=True, remove_url=True, remove_punct=True)

text = "সে কিভাবে রিগামের সাথে সম্পর্কিত? How is he related to Regum?, www.google.com, demo@gmail.com."
text = cleaner.clean(text)
text = cleaner.replace_foreign_words(text, keep_special_tokens=True, replace_multiple_foreign_words=False)
print(text)

Output

সে কিভাবে রিগামের সাথে সম্পর্কিত <PUNC> <FOREIGN> <FOREIGN> <FOREIGN> <FOREIGN> <FOREIGN> <FOREIGN> <PUNC> <PUNC> <URL> <PUNC> <EMAIL> <PUNC>
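To make the foreign-word replacement above more concrete, here is a minimal standalone sketch under stated assumptions: a whitespace-separated word counts as "foreign" if it contains no character from the Unicode Bengali block (U+0980 to U+09FF) and is not an angle-bracket token such as `<PUNC>`. The function `replace_foreign` and its `collapse_runs` parameter are hypothetical names for this example and are not claimed to mirror how `replace_foreign_words` or its flags work internally.

```python
import re

# Assumption: special tokens look like <PUNC>, <URL>, <EMAIL>.
SPECIAL_TOKEN_RE = re.compile(r"^<[A-Z]+>$")
# Any character in the Unicode Bengali block.
BANGLA_RE = re.compile(r"[\u0980-\u09FF]")


def replace_foreign(text: str, collapse_runs: bool = False) -> str:
    """Replace words without Bangla characters by <FOREIGN>, keeping special tokens."""
    out = []
    for word in text.split():
        if SPECIAL_TOKEN_RE.match(word) or BANGLA_RE.search(word):
            out.append(word)  # keep special tokens and Bangla words unchanged
        elif collapse_runs and out and out[-1] == "<FOREIGN>":
            continue  # hypothetical option: merge consecutive foreign words
        else:
            out.append("<FOREIGN>")
    return " ".join(out)


if __name__ == "__main__":
    cleaned = "সে কিভাবে <PUNC> How is he related to Regum <PUNC> <URL>"
    print(replace_foreign(cleaned))
    print(replace_foreign(cleaned, collapse_runs=True))
```

In this sketch a word is kept whenever it contains at least one Bangla character, so mixed-script words are left untouched; a stricter rule would need a different check.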
