Graduate school course in Natural Language Processing - weekly assignments.

cbroker1/text-as-data

Week 1 - Introduction To Text As Data

  • What is a 'Document Frequency Matrix' and what is it used for?
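
As a rough illustration of the idea, the sketch below builds a small matrix with one row per document and one column per vocabulary term, where each cell holds a term count. It is a minimal example using only the standard library, not the code from the assignment; the function name `build_dfm` and the sample sentences are assumptions made for illustration.

```python
from collections import Counter


def build_dfm(docs):
    """Return (vocab, matrix); matrix[i][j] counts term vocab[j] in docs[i]."""
    # Whitespace tokenization on lowercased text (a deliberate simplification).
    tokenized = [doc.lower().split() for doc in docs]
    # Sorted vocabulary gives a stable column order.
    vocab = sorted({tok for toks in tokenized for tok in toks})
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        matrix.append([counts[term] for term in vocab])
    return vocab, matrix


# Hypothetical sample documents.
docs = ["the cat sat", "the dog sat down", "the cat saw the dog"]
vocab, matrix = build_dfm(docs)

print(vocab)
for row in matrix:
    print(row)
```

A matrix like this is typically the input to later steps such as weighting, classification, or topic modeling.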

Week 2 - Text Pre-Processing

  • Exploring best practices when pre-processing text data: tf-idf, tokenization, stop-words, stemming, etc.
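
The steps named above can be sketched in a few lines. This is a simplified, standard-library-only illustration and not the assignment's code: the stop-word list is a tiny hand-picked set, `crude_stem` is a naive suffix stripper standing in for a real stemmer such as Porter's, and the tf-idf variant shown (relative term frequency times the natural log of N over document frequency) is one of several common definitions.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "and", "of", "is", "are", "on"}


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(token):
    """Strip one common suffix; a naive stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, drop stop words, then stem what remains."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def tf_idf(corpus_tokens):
    """Return one {term: weight} dict per document."""
    n_docs = len(corpus_tokens)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for toks in corpus_tokens for term in set(toks))
    weights = []
    for toks in corpus_tokens:
        counts = Counter(toks)
        total = len(toks)
        weights.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical sample corpus.
corpus = ["The cats are sleeping on the mat", "A dog barked", "Dogs and cats"]
processed = [preprocess(doc) for doc in corpus]
weights = tf_idf(processed)
print(processed)
print(weights[0])
```

Terms that appear in fewer documents receive a higher weight, and a term present in every document gets a weight of zero under this definition.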

Week 3 - Getting Data and APIs

  • Exploring the use of four APIs: NYT, Steam, Facebook and Twitter. Used the NYT API to perform a sentiment analysis on six months' worth of headlines via PyTorch.

Week 4 - Methods and Models

  • Exploring regex and various models.
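
For a sense of how regular expressions are commonly used on text data, the sketch below pulls hashtags, mentions, and URLs out of a short string. The patterns are deliberately simple and the sample text is made up; this is an assumption about typical usage, not the content of the assignment.

```python
import re

# Simple illustrative patterns; real-world text usually needs stricter ones.
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")
URL = re.compile(r"https?://\S+")


def extract_features(text):
    """Return the hashtags, mentions, and URLs found in text."""
    return {
        "hashtags": HASHTAG.findall(text),
        "mentions": MENTION.findall(text),
        "urls": URL.findall(text),
    }


# Hypothetical sample input.
sample = "Thanks @nytimes for the #NLP piece: https://example.com/a"
features = extract_features(sample)
print(features)
```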

Week 5 - Dictionary-Based Methods

  • Sentiment Analysis, Hyper-Parameter Tuning, Training Dataset Selection
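
A dictionary-based sentiment score can be sketched as counting matches against positive and negative word lists. The lexicons below are tiny, hand-picked placeholders rather than an established dictionary, and the scoring rule (positive minus negative, divided by total matches) is one common choice among several.

```python
import re

# Placeholder lexicons for illustration only.
POSITIVE = {"good", "great", "excellent", "win", "gain"}
NEGATIVE = {"bad", "poor", "terrible", "loss", "crisis"}


def dictionary_sentiment(text):
    """Score text in [-1, 1]; 0.0 when no dictionary word is matched."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    matched = pos + neg
    return 0.0 if matched == 0 else (pos - neg) / matched


# Hypothetical sample headlines.
print(dictionary_sentiment("Great win despite a bad start"))
print(dictionary_sentiment("Nothing to report"))
```

The result depends heavily on the choice of dictionary, so coverage of the domain vocabulary matters more than the scoring formula.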

Week 6 - Topic Models

  • Exploring topic models

Week 7 - Supervised Learning Models

Week 8 - Unsupervised Learning/Clustering

Week 9 - Supervised Classification Models

Week 10 - Unsupervised Learning Methods

Week 11 - Word Embeddings

Week 12 - Course Project Submission
