-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add last_active_step iteration to iterate over the last N active steps. #174
Conversation
else: | ||
min_labelled_step = max(0, self.current_al_step - self.last_active_steps) | ||
indices = np.arange(len(self.labelled_map)) | ||
bool_mask = self.labelled_map > min_labelled_step |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the purpose of this function to check what are the indices to be labelled until the end of AL process or which indices should be labelled for this specific step? it seems to me that it provides the first output but the documentation kinda says that we do the second on ? also in any case, what is the use case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool_mask = self.labelled_map > min_labelled_step | |
bool_mask = self.labelled_map > min_labelled_step |
I am confused by this specifically. min_labelled_step
can be any integer. that means that we only get the data points that are labelled after a certain active_step ? and when we call this function in len
don't we want all the available labels?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method specifies which indices should be selected for training at this step.
when we call len() we want the size of the Dataset I think? So in this case it would be all items selected for training. If we want the number of labelled items, we should call ActiveLearningDataset.n_labelled in my opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok im clear now great 👍
@@ -228,19 +178,19 @@ def label(self, index: Union[list, int], value: Optional[Any] = None) -> None: | |||
|
|||
def reset_labeled(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is more of a style comment, since we used labelling
and labelled
every where with 2 l
s, maybe rename to reser_labelled
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the new refactoring a lot more. it seems very clean. Just few questions. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok amazing, it makes sense. Thanks alot :)
could you just change the reset_labeled
to reset_labelled
for consitency?
I will change it to approved right after.
Co-authored-by: fr.branchaud-charron <fr.branchaud-charron@servicenow.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GOOD TO GO!
Summary:
Major refactoring of
ActiveLearningDataset
to allow for more flexible iteration. In addition we addlast_active_step
when users want to get the items labelled in the last N active steps.I need to retest our notebooks, will do later this week.
Features:
Fixes #173
Checklist:
tests/documentation_test.py
).