Topic modeling for the Persian language using the NMF (Non-Negative Matrix Factorization) algorithm.

saied71/Persian_Topic_mdeling

Persian_Topic_mdeling

Topic modeling for the Persian language using the NMF (Non-Negative Matrix Factorization) algorithm.
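
As a rough illustration of the idea (not the code in this repository), NMF approximates a non-negative document-term matrix V as the product of a document-topic matrix W and a topic-term matrix H. The sketch below is a dependency-free version using the classic multiplicative update rules; a real pipeline would more likely rely on a library implementation. The toy documents, the transliterated tokens, and all function names are assumptions made for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factor v (docs x terms) into w (docs x topics) and h (topics x terms)
    with multiplicative updates for the squared Frobenius error."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + eps for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_docs)]
    return w, h


def frobenius_error(v, w, h):
    """Squared Frobenius norm of v - w h."""
    wh = matmul(w, h)
    return sum((a - b) ** 2 for rv, rw in zip(v, wh) for a, b in zip(rv, rw))


def top_terms(h, vocab, n):
    """For each topic (row of h), list the n highest-weighted terms."""
    order = lambda row: sorted(range(len(vocab)), key=lambda j: row[j], reverse=True)
    return [[vocab[j] for j in order(row)[:n]] for row in h]


# Toy corpus: transliterated stand-ins for already tokenized Persian news text.
docs = [
    ["futbal", "tim", "bazi", "tim"],
    ["tim", "bazi", "futbal"],
    ["bank", "eghtesad", "bazar"],
    ["eghtesad", "bazar", "bank", "bazar"],
]
vocab = sorted({token for doc in docs for token in doc})
# Raw term counts; a TF-IDF weighting is a common alternative.
v = [[doc.count(term) for term in vocab] for doc in docs]

w, h = nmf(v, n_topics=2)
topics = top_terms(h, vocab, n=3)
print(topics)
```

Each row of `w` gives the topic weights of one document, and each row of `h` gives the term weights of one topic, so the highest-weighted terms in a row of `h` serve as a readable label for that topic.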

The input data is Persian news from the Hamshahri (همشهری) newspaper, collected between the years 75 and 87 of the Solar Hijri calendar (i.e. 1375 to 1387).

Input data link: https://bigdata-ir.com/%d8%af%db%8c%d8%aa%d8%a7%d8%b3%d8%aa-%d8%a7%d8%ae%d8%a8%d8%a7%d8%b1-%d8%ad%d8%af%d9%88%d8%af-%da%86%d9%87%d8%a7%d8%b1-%d9%87%d8%b2%d8%a7%d8%b1-%d8%ae%d8%a8%d8%b1-%d9%81%d8%a7%d8%b1%d8%b3%db%8c-%d8%a8/

The sample output is a pandas DataFrame stored in a pickle file, which can be loaded to inspect the model's output.
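
A minimal sketch of inspecting such a file, assuming pandas is installed. The file name `sample_output.pkl` and the column names are hypothetical, since the README does not state them; the demo writes its own small DataFrame so that it runs without the real file.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_topic_output(pickle_path):
    """Load a pickled pandas DataFrame from disk and return it."""
    return pd.read_pickle(pickle_path)


# Round-trip demo with a stand-in DataFrame (hypothetical columns).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "sample_output.pkl"  # hypothetical file name
    pd.DataFrame({"text": ["doc one", "doc two"], "topic": [0, 1]}).to_pickle(demo_path)
    df = load_topic_output(demo_path)

print(df.head())
```

Pickle files can execute arbitrary code when loaded, so only open pickles from a source you trust.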

Note: copy "persian" to "nltk_data/corpora/stopwords" so that NLTK's stopwords corpus can find the Persian stop-word list.
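
The copy step can also be scripted. The sketch below assumes the NLTK data directory is `~/nltk_data` (one of the locations NLTK searches by default) and that `persian` is a plain text file of stop words; the helper name `install_stopwords` is made up for this example. The demo runs against temporary directories so that it does not touch a real NLTK installation.

```python
import shutil
import tempfile
from pathlib import Path


def install_stopwords(source_file, nltk_data_dir=None):
    """Copy a stop-word file into <nltk_data>/corpora/stopwords.

    Falls back to ~/nltk_data when no directory is given (an assumption;
    adjust it if your NLTK data lives elsewhere). Returns the new path.
    """
    root = Path(nltk_data_dir) if nltk_data_dir else Path.home() / "nltk_data"
    target_dir = root / "corpora" / "stopwords"
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(source_file, target_dir / Path(source_file).name))


# Demo in a throwaway directory with a stand-in "persian" file.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "persian"
    source.write_text("از\nبه\n", encoding="utf-8")  # two sample stop words
    installed = install_stopwords(source, nltk_data_dir=Path(tmp) / "nltk_data")
    copied_words = installed.read_text(encoding="utf-8").split()
```

With the file in place under the real data directory, `nltk.corpus.stopwords.words("persian")` should be able to read the list.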
