
Baal in Production | Classification | NLP | Hugging Face #242

Answered by Dref360
nitish1295 asked this question in Q&A

Hello!

The code sample you've shown looks great!

For your questions:

  1. Usually you have a labelling budget, say 5000 items, so you stop after 5000 / query_size retraining rounds. I have also seen setups that stop once the model stops improving; unfortunately, we don't have an implementation for that. Do you think it would be valuable? If so, we should open an issue.
  2. Yes, the "oracle" index is based on the full dataset.
  3. Yup, you don't have to do anything; ActiveLearningDataset manages the split between labelled and unlabelled data.
  4. Yes, saving the patched model would save it with Dropout always activated. You can call baal_model.unpatch() to get the original model back and save that instead.
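To make points 1 and 3 concrete, here is a minimal, library-free sketch of a budget-driven active learning loop. The SimpleActiveDataset class and the random query are illustrative stand-ins, not Baal's actual API; in Baal you would use ActiveLearningDataset and rank the pool with an uncertainty heuristic instead of sampling at random.

```python
import random

class SimpleActiveDataset:
    """Toy stand-in for Baal's ActiveLearningDataset: tracks which
    indices of the full dataset are labelled vs. still in the pool."""

    def __init__(self, data):
        self.data = data
        self.labelled = set()

    @property
    def pool(self):
        # Unlabelled indices expressed in the *full* dataset's indexing,
        # matching point 2 above (the oracle index is global).
        return [i for i in range(len(self.data)) if i not in self.labelled]

    def label(self, indices):
        self.labelled.update(indices)

# Point 1: with a labelling budget of 5000 items and a query size of
# 500, we stop after 5000 // 500 = 10 retraining rounds.
budget, query_size = 5000, 500
dataset = SimpleActiveDataset(list(range(20_000)))

rng = random.Random(0)
for _ in range(budget // query_size):
    # Illustrative: pick pool indices at random; a real loop would
    # rank them by model uncertainty (e.g. BALD) instead.
    query = rng.sample(dataset.pool, query_size)
    dataset.label(query)
    # ... retrain the model on the labelled portion here ...

print(len(dataset.labelled))  # 5000: the budget is exhausted
```

The key point this illustrates is that you never move data between two datasets yourself: labelling is just bookkeeping over indices of the one full dataset, which is exactly what ActiveLearningDataset does for you.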

I hope I answered your questions!

Answer selected by nitish1295