Baal in Production Notebook | Classification | NLP | Hugging Face #245

nitish1295 · 2022-12-06T10:57:42Z

Summary:

This is a demo/tutorial on using active learning with Hugging Face models in a production setting. More context is available in the discussion at #242.

Features:

NA

Checklist:

Your code is documented (To validate this, add your module to tests/documentation_test.py).
Your code is tested with unit tests.
You moved your Issue to the PR state.

Since this is a notebook and I am not adding any new modules, there are no test cases. Some type hinting is still pending, which I will complete.

Opening a PR for your feedback, to check whether you want me to add or remove anything.

Additional Info

Challenges with current GPU

It seems the PyTorch version that Baal uses does not support my current GPU, although I have tested this on Colab and it works fine.

NVIDIA GeForce RTX 3050 Laptop GPU with CUDA capability sm_86 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA GeForce RTX 3050 Laptop GPU GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

  warnings.warn(incompatible_device_warn.format(device_name, capability, " ".join(arch_list), device_name))

More info about this is on the PyTorch forum, in case someone runs into a similar issue.

import torch

torch.__version__

1.12.1+cu102

torch.cuda.get_arch_list()

['sm_37', 'sm_50', 'sm_60', 'sm_70']
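The mismatch above can be detected before training starts. The following is a minimal sketch, assuming that a GPU is usable when the installed build lists at least one sm_ architecture with the same major version as the device. That rule is a simplification and not necessarily the exact check PyTorch performs. The helper names (supported_majors, is_capability_supported, check_current_gpu) are hypothetical; only torch.cuda.is_available, torch.cuda.get_device_capability, and torch.cuda.get_arch_list are real PyTorch calls.

```python
import re
from typing import Optional, Sequence, Set, Tuple


def supported_majors(arch_list: Sequence[str]) -> Set[int]:
    """Collect the major compute-capability versions from tags like 'sm_70'."""
    majors = set()
    for arch in arch_list:
        match = re.match(r"^sm_(\d+)", arch)
        if match:
            # 'sm_86' -> 86 -> major version 8
            majors.add(int(match.group(1)) // 10)
    return majors


def is_capability_supported(
    capability: Tuple[int, int], arch_list: Sequence[str]
) -> bool:
    """Assumption: a matching major version is enough for compatibility."""
    major, _minor = capability
    return major in supported_majors(arch_list)


def check_current_gpu(device_index: int = 0) -> Optional[bool]:
    """Return None when torch or CUDA is unavailable, else the support flag."""
    try:
        import torch  # imported lazily so the helpers above work without it
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    capability = torch.cuda.get_device_capability(device_index)
    return is_capability_supported(capability, torch.cuda.get_arch_list())
```

With the values reported in this PR, is_capability_supported((8, 6), ['sm_37', 'sm_50', 'sm_60', 'sm_70']) evaluates to False, which matches the warning shown for the RTX 3050.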


Challenges with Black Formatting

You might want to update your black version to black==22.3.0.

The make format command produces an error identical to the one described on Stack Overflow.

I have encountered this before, and an upgrade does fix it.

Dref360 · 2022-12-11T17:02:11Z

Awesome!

I'll update torch/black in a PR separately.

My only comment would be around the CSVs. Could we load the dataset directly from Hugging Face? load_dataset("tweet_eval", "emotion")?

Also if you can, could you add the new notebook to the documentation in mkdocs.yml? Maybe make a new subsection to hold all tutorials for production.

Very minor comments! Thank you very much.

nitish1295 · 2022-12-12T05:14:59Z

My only comment would be around the CSVs. Could we load the dataset directly from Hugging Face? load_dataset("tweet_eval", "emotion")?

I had expected this, but I did it deliberately to mimic a setting where we do not load data directly via Hugging Face. I guess people can adapt this on their own based on their requirements, so I will update it.
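Both options discussed here can sit behind one loader. The following is a rough sketch, not the notebook's actual code: it reads a local CSV when a path is given (the production-like case) and otherwise falls back to load_dataset("tweet_eval", "emotion") from the datasets library. The function names and the CSV column names text and label are assumptions made for illustration.

```python
import csv
from typing import List, Optional, Tuple

Example = Tuple[str, int]  # (tweet text, integer class label)


def load_from_csv(
    path: str, text_col: str = "text", label_col: str = "label"
) -> List[Example]:
    """Read examples from a local CSV; the column names are assumptions."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [
            (row[text_col], int(row[label_col]))
            for row in csv.DictReader(handle)
        ]


def load_from_hub(split: str = "train") -> List[Example]:
    """Download the tweet_eval emotion split from the Hugging Face Hub."""
    # Imported lazily: requires the `datasets` package and network access.
    from datasets import load_dataset

    dataset = load_dataset("tweet_eval", "emotion", split=split)
    return [(row["text"], int(row["label"])) for row in dataset]


def load_examples(
    csv_path: Optional[str] = None, split: str = "train"
) -> List[Example]:
    """Prefer the local CSV when a path is given, else use the Hub."""
    if csv_path is not None:
        return load_from_csv(csv_path)
    return load_from_hub(split)
```

Keeping the CSV branch preserves the "data does not come from the Hub" scenario, while the default path avoids shipping CSV files in the repository.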

Yes, I will do the docs update as well.

Dref360

LGTM

nitish1295 added 2 commits December 6, 2022 15:24

Baal Active Learning with Hugging Face in prod setting

7559261

black format

35b90eb

Dref360 self-requested a review December 10, 2022 15:11

Dref360 added 2 commits January 27, 2023 21:43

Add new tutorial to documentation

c21d5a6

Merge branch 'master' into hf_nlp_al_nb

542f537

Dref360 approved these changes Jan 28, 2023


Dref360 enabled auto-merge (squash) January 28, 2023 02:45

Dref360 merged commit 8413ea4 into baal-org:master Jan 28, 2023
