[Tutorial] Token classification tutorial for USPTO claims text with HF AutoTrain #5375
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB
for more information, see https://pre-commit.ci
Hi, thanks for this PR. It looks very advanced. I believe the text is still split up by individual letters. Would you be able to fix that?
…ace between characters in a word
Thank you @davidberenstein1957 for the review comments. I have modified the images and reran the notebook to display the updated images.
Hi @bikash119, could you also add an overview of how to run inference with the model and log the predictions back into Argilla?
…nding records and pushing the records back to Argilla
Thank you for the suggestion, @davidberenstein1957. I have updated the notebook to generate predictions and push them back to the Argilla dataset. Please share your feedback.
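For readers following the thread, the inference-and-log-back step discussed above could look roughly like the sketch below. It assumes the output format of the Hugging Face `transformers` token-classification pipeline with `aggregation_strategy="simple"` (dicts with `entity_group`, `score`, `start`, `end`) and converts it into `start`/`end`/`label` span dictionaries of the kind Argilla span suggestions take. The helper name `to_span_suggestions`, the `min_score` threshold, the labels, and the sample entities are all hypothetical and are not taken from the notebook. The Argilla calls appear only as comments because they need a running server, and their names should be checked against the installed Argilla version.

```python
from typing import Any


def to_span_suggestions(
    entities: list[dict[str, Any]], min_score: float = 0.5
) -> list[dict[str, Any]]:
    """Convert aggregated token-classification output into span dicts.

    Entities scoring below `min_score` (a hypothetical threshold) are dropped.
    """
    spans = []
    for entity in entities:
        if entity["score"] < min_score:
            continue
        spans.append(
            {
                "start": int(entity["start"]),
                "end": int(entity["end"]),
                "label": entity["entity_group"],
            }
        )
    # Sort by position so spans read in document order.
    return sorted(spans, key=lambda span: span["start"])


# Hypothetical pipeline output for one claim sentence (labels are made up).
claim_text = "A method comprising a processor coupled to a memory."
sample_entities = [
    {"entity_group": "COMPONENT", "score": 0.91, "word": "memory", "start": 45, "end": 51},
    {"entity_group": "COMPONENT", "score": 0.97, "word": "processor", "start": 22, "end": 31},
    {"entity_group": "PROCESS", "score": 0.32, "word": "method", "start": 2, "end": 8},
]

spans = to_span_suggestions(sample_entities)
print(spans)

# With a running Argilla server, the spans could be attached as suggestions.
# Argilla 2.x-style calls, shown as an assumption and not executed here;
# the field name "text" and question name "span_label" are hypothetical:
#   import argilla as rg
#   client = rg.Argilla(api_url="...", api_key="...")
#   dataset = client.datasets(name="...")
#   record = rg.Record(
#       fields={"text": claim_text},
#       suggestions=[rg.Suggestion(question_name="span_label", value=spans)],
#   )
#   dataset.records.log([record])
```

Keeping the conversion in a plain function makes it easy to check the character offsets before anything is sent to the annotation tool.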
Hi @bikash119, took some time to review again.
Overall it is looking very nice! When we are done, we can post the blog on https://huggingface.co/blog and socials, and add a reference to it from our docs.
Hi @davidberenstein1957,
Thank you @davidberenstein1957 for the encouragement and guidance. I have learnt a lot in the process.
Hi @davidberenstein1957, as we discussed during our meeting.
Hope this aligns with our discussion points.
Hi @bikash119, the text looks nice. I would not use the DEBUG statements everywhere; just print the outputs in the cells where you feel that is needed. Also, you don't need a 'print' statement to output a variable at the end of a cell. You can simply remove it:
print(my_variable) # will be printed
my_other_variable # will not be printed
my_variable # will be printed
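The notebook behavior described above (only the value of a cell's last expression is displayed) can be illustrated with a simplified emulation. This is a sketch of the idea, not IPython's actual implementation; the `run_cell` helper is hypothetical.

```python
import ast


def run_cell(source: str, namespace: dict) -> object:
    """Run a cell and return the value a notebook would display for it.

    Mimics the default notebook rule: if the final statement is a bare
    expression, its value is shown; earlier bare expressions are evaluated
    but their values are discarded.
    """
    tree = ast.parse(source)
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        last = tree.body.pop()  # split off the trailing expression
        exec(compile(tree, "<cell>", "exec"), namespace)
        return eval(compile(ast.Expression(last.value), "<cell>", "eval"), namespace)
    exec(compile(tree, "<cell>", "exec"), namespace)
    return None  # cell ended with a statement, nothing to display


namespace: dict = {}
cell = "my_other_variable = 1\nmy_variable = 2\nmy_other_variable\nmy_variable"
shown = run_cell(cell, namespace)
print(shown)  # value of the last expression, my_variable
```

A cell that ends in an assignment, such as `x = 3`, returns `None` here, which matches a notebook showing no output for it.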
We don't need to update the Dockerfile anymore (re: "Update the Dockerfile:"). In general, a redirect to https://docs.argilla.io/dev/getting_started/how-to-configure-argilla-on-huggingface/ might also be nice.
Hi @davidberenstein1957,
Hi, I think the blog looks great. Would you be able to request to join this organization: https://huggingface.co/blog-explorers? We can then copy the blog over to https://huggingface.co/blog and publish it there :)
Thanks @davidberenstein1957. Request submitted. I will wait for acceptance and get back to you.
This reverts commit 0745732.
… into argilla_with_autotrain
Modified the markdown to get rid of colab style.
For some weird reason the colab styles are getting added to the notebook. Will check this later.
… into argilla_with_autotrain
… into argilla_with_autotrain
colab style removal
… into argilla_with_autotrain
@bikash119, closing this because it was published here: https://huggingface.co/blog/bikashpatra/legal-data-token-classification-fine-tuning
Description
A tutorial on how to use Argilla for annotation and then use the annotated dataset to train a model with Hugging Face AutoTrain.
Closes #<issue_number>
Type of change
How Has This Been Tested