Natural Language Processing 2

My second iteration of the assignments for the "Natural Language Processing" Coursera course.

Week1

Predict Stack Overflow tags with a linear model (multi-tag). GridSearchCV was used to tune the TF-IDF model.
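A minimal sketch of this kind of setup, assuming scikit-learn is available. The toy titles and tags, the choice of LogisticRegression as the linear model, and the parameter grid are illustrative assumptions, not the notebook's actual data or settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Toy stand-in for Stack Overflow titles and their tag sets (made up for illustration).
titles = [
    "sort a list in python",
    "javascript array map function",
    "python pandas dataframe merge",
    "javascript promise async await",
    "python list comprehension filter",
    "javascript dom event listener",
    "pandas groupby in python",
    "javascript array filter callback",
]
tags = [
    ["python"], ["javascript"], ["python", "pandas"], ["javascript"],
    ["python"], ["javascript"], ["python", "pandas"], ["javascript"],
]

# Turn the tag lists into a binary indicator matrix, one column per tag.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tags)

# TF-IDF features feeding one binary linear classifier per tag.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", OneVsRestClassifier(LogisticRegression(max_iter=1000))),
])

# Assumed grid: n-gram range of the vectorizer and regularisation strength.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__estimator__C": [0.1, 1.0, 10.0],
}
grid = GridSearchCV(pipeline, param_grid, scoring="f1_micro", cv=2)
grid.fit(titles, Y)

predicted = grid.predict(["merge two pandas dataframes in python"])
predicted_tags = mlb.inverse_transform(predicted)
```

After fitting, `grid.best_params_` holds the selected combination, and `predicted_tags` contains one tuple of tag names per input title.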

Accuracy: 0.3631 (low because multi-tag)
F1-score micro: 0.6752548016404758
F1-score macro: 0.5062362622222868
F1-score weighted: 0.6540253209151559
Precision micro: 0.4810782509911088
Precision macro: 0.3396351761816072
Precision weighted: 0.5102553014775187
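The gap between accuracy and micro F1 can be illustrated with a small pure-Python sketch. It assumes the reported accuracy is exact-match (subset) accuracy, where a sample counts as correct only if the whole predicted tag set equals the true one. The tag sets below are made up for illustration.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted tag set equals the true tag set."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def micro_f1(y_true, y_pred):
    """F1 computed from tag-level counts pooled over all samples."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))  # correctly predicted tags
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))  # predicted but wrong
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))  # missed tags
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical true and predicted tag sets for four questions.
y_true = [{"python", "pandas"}, {"java"}, {"c++", "pointers"}, {"javascript", "html"}]
y_pred = [{"python"}, {"java"}, {"c++", "pointers", "memory"}, {"javascript"}]

print(subset_accuracy(y_true, y_pred))  # 0.25
print(round(micro_f1(y_true, y_pred), 3))  # 0.769
```

Three of the four predictions are only partly right, so they count as errors for exact-match accuracy while still contributing true positives to micro F1.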

Week2

Named entity recognition with LSTMs.

TODO

Week3

Find duplicate questions by comparing their word2vec embeddings. TODO
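One common way to do this, sketched here as an assumption rather than the notebook's actual method, is to average the word vectors of each question and rank candidates by cosine similarity. The 3-dimensional embeddings and the helper names below are hypothetical; real word2vec vectors typically have hundreds of dimensions.

```python
import math

# Hypothetical toy embeddings, made up for illustration.
EMBEDDINGS = {
    "python": [0.9, 0.1, 0.0],
    "list": [0.7, 0.3, 0.1],
    "sort": [0.6, 0.4, 0.0],
    "order": [0.6, 0.5, 0.1],
    "java": [0.1, 0.9, 0.2],
    "exception": [0.0, 0.8, 0.5],
}


def question_to_vec(question, embeddings, dim=3):
    """Average the vectors of the known words; zero vector if none are known."""
    vectors = [embeddings[w] for w in question.lower().split() if w in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def rank_candidates(query, candidates, embeddings):
    """Sort candidate questions from most to least similar to the query."""
    query_vec = question_to_vec(query, embeddings)
    return sorted(
        candidates,
        key=lambda c: cosine(query_vec, question_to_vec(c, embeddings)),
        reverse=True,
    )


ranked = rank_candidates(
    "sort python list", ["java exception", "order list python"], EMBEDDINGS
)
print(ranked[0])  # order list python
```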

Week4

Seq2Seq encoder-decoder model that learns to perform addition and subtraction. The model could be reused for other tasks. TODO
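For a task like this, the training data can be generated synthetically as pairs of character sequences. The sketch below shows one possible way to build and encode such pairs. The function names, the vocabulary layout, and the `#` padding and `$` end symbols are assumptions for illustration, not taken from the notebook.

```python
import random


def generate_equations(operators, max_operand, count, seed=0):
    """Return (equation, result) string pairs such as ("12+7", "19")."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(count):
        a = rng.randint(0, max_operand)
        b = rng.randint(0, max_operand)
        op = rng.choice(operators)
        result = a + b if op == "+" else a - b
        pairs.append((f"{a}{op}{b}", str(result)))
    return pairs


# Assumed vocabulary: padding symbol, end symbol, digits, and the two operators.
VOCAB = ["#", "$"] + list("0123456789+-")
CHAR_TO_ID = {char: index for index, char in enumerate(VOCAB)}


def encode(text, max_len):
    """Map characters to ids, append the end symbol, and pad to max_len."""
    ids = [CHAR_TO_ID[c] for c in text[: max_len - 1]] + [CHAR_TO_ID["$"]]
    return ids + [CHAR_TO_ID["#"]] * (max_len - len(ids))


pairs = generate_equations(["+", "-"], max_operand=99, count=5)
encoded_inputs = [encode(equation, max_len=8) for equation, _ in pairs]
print(encode("1+2", max_len=6))  # [3, 12, 4, 1, 0, 0]
```

The encoder would consume the encoded equation and the decoder would be trained to emit the encoded result, one character at a time.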

Week5

TODO

Natural Language Processing course resources

https://www.coursera.org/learn/language-processing

Running on Google Colab

Google has released its own flavour of Jupyter called Colab, which has free GPUs!

Here's how you can use it:

Open https://colab.research.google.com, click Sign in in the upper right corner, use your Google credentials to sign in.
Click the GITHUB tab, paste https://github.com/hse-aml/natural-language-processing and press Enter
Choose the notebook you want to open, e.g. week1/week1-MultilabelClassification.ipynb
Click File -> Save a copy in Drive... to save your progress in Google Drive
If you need a GPU, click Runtime -> Change runtime type and select GPU in the Hardware accelerator box
Execute the following code in the first cell that downloads dependencies (change for your week number):

! wget https://raw.githubusercontent.com/hse-aml/natural-language-processing/master/setup_google_colab.py -O setup_google_colab.py
import setup_google_colab
# please, uncomment the week you're working on
# setup_google_colab.setup_week1()  
# setup_google_colab.setup_week2()
# setup_google_colab.setup_week3()
# setup_google_colab.setup_week4()
# setup_google_colab.setup_project()
# setup_google_colab.setup_honor()

If you run many notebooks on Colab, they can keep consuming memory. You can kill them with ! pkill -9 python3 and check with ! nvidia-smi that GPU memory has been freed.

Known issues:

No support for ipywidgets, so we cannot use fancy tqdm progress bars. For now, we use a simplified version of a progress bar suitable for Colab.
Blinking animation with IPython.display.clear_output(). It is usable, but we are still looking for a workaround.
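A widget-free progress indicator can be as simple as rewriting a single text line. The sketch below is a hypothetical stand-in, not the helper shipped in the course repository; the name `simple_progress` and its parameters are made up for illustration.

```python
import sys


def simple_progress(iterable, total=None, every=1, out=sys.stdout):
    """Yield items from iterable while printing an "i/total" counter on one line."""
    total = total if total is not None else len(iterable)
    for i, item in enumerate(iterable, start=1):
        yield item
        if i % every == 0 or i == total:
            # "\r" returns to the start of the line so the counter overwrites itself.
            out.write(f"\r{i}/{total}")
            out.flush()
    out.write("\n")


for batch in simple_progress(range(100), every=10):
    pass  # training step would go here
```

Because it only writes plain text, it does not depend on ipywidgets.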

Running elsewhere

AWS (see AWS-tutorial.md)

Docker (see Docker-tutorial.md)

cyrilou242/natural-language-processing-2
