Is there a way to only access the new labelled data in dataset.label ? #231

Answered by Dref360
Karthik-Pandaram asked this question in Q&A
Hello!

In this case, the indices in to_label are relative to the unlabelled pool, not to the overall dataset.

To get the mapping from the pool to the "original" indices, you can use dataset._pool_to_oracle_index(to_label[: query_size]) before dataset.label.

Additional info:

  • ActiveLearningDataset keeps track of what is labelled and what is not.
  • ActiveNumpyDataset.dataset gives you a new array with labelled items only.
  • ActiveNumpyDataset.pool gives you an array with unlabelled items only.
  • ActiveNumpyDataset.label(indices) takes indices relative to the unlabelled pool and moves those items to the labelled set.
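
To make the difference between pool-relative and dataset-relative indices concrete, here is a minimal pure-Python sketch. ToyActiveDataset is a hypothetical stand-in written for this answer, not Baal's implementation; it only mimics the bookkeeping described above (a labelled/unlabelled flag per item, plus dataset, pool, label and a pool-to-original index mapping).

```python
class ToyActiveDataset:
    """Hypothetical illustration only; not Baal's actual class."""

    def __init__(self, items, labelled=()):
        self.items = list(items)
        # One flag per item: True if labelled, False if still in the pool.
        self.labelled = [False] * len(self.items)
        for i in labelled:
            self.labelled[i] = True

    @property
    def dataset(self):
        # Labelled items only.
        return [x for x, done in zip(self.items, self.labelled) if done]

    @property
    def pool(self):
        # Unlabelled items only.
        return [x for x, done in zip(self.items, self.labelled) if not done]

    def pool_to_oracle_index(self, pool_indices):
        # Positions of the unlabelled items in the original data.
        unlabelled = [i for i, done in enumerate(self.labelled) if not done]
        return [unlabelled[i] for i in pool_indices]

    def label(self, pool_indices):
        # Indices are relative to the pool, so map them first.
        for i in self.pool_to_oracle_index(pool_indices):
            self.labelled[i] = True


items = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
ds = ToyActiveDataset(items, labelled=[0, 3])

to_label = [1, 4]  # relative to the pool, not to `items`

# Compute the mapping BEFORE labelling: once the items leave the pool,
# the same pool indices point at different items.
new_indices = ds.pool_to_oracle_index(to_label)
ds.label(to_label)

newly_labelled = [items[i] for i in new_indices]
print(new_indices)     # [2, 6]
print(newly_labelled)  # [20, 60]
```

The order matters here: the pool shrinks after label, so calling the mapping afterwards with the same indices would return different items.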

I hope this helps!

Answer selected by Karthik-Pandaram