Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(proposal update) Add HuggingFace Space pushing capability to the HFModelPusher #178

Merged

Conversation

deep-diver
Copy link
Contributor

@deep-diver deep-diver commented Sep 1, 2022

integrated HuggingFace Space pushing capability into the HFModelPusher component proposal(#174)

cc: @sayakpaul

@github-actions
Copy link
Contributor

github-actions bot commented Sep 1, 2022

Thanks for the PR! 🚀

Instructions: Approve using /lgtm and mark for automatic merge by using /merge.

proposals/20220823-huggingface_model_pusher.md Outdated Show resolved Hide resolved
proposals/20220823-huggingface_model_pusher.md Outdated Show resolved Hide resolved

HuggingFace Space Hub comes with free resources to host prototype applications that use machine learning models. Currently supported application frameworks are Gradio and Streamlit. It is often a good idea to host current version of the model to the Huggingface Space, so it could be interated with real world before the production deployment.

By keeping these information in mind, `HFModelPusher` let you push a trained or blessed model to the HuggingFace Model Hub within a new branch within TFX pipeline. Then, if specified, it pushes an application to the HuggingFace Space Hub by injecting the current model information into the prepared template sources.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hf_api has utilities for creating a PR too. Wouldn't be nice to create a PR with the newly created branch? We can, of course, parameterize that too.

@deep-diver

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have experiences?

I couldn't find any other ways but `upload_folder()' to create PR after commit/push changes.

Even if upload_folder() let us commit changes to a particular branch, it doesn't have a capability to create a new branch, so we need to create a branch beforehand. However, we have to create a new branch and push any changes to the branch to let the remote repository detect the newly created branch. So, maybe some kind of dummy files should be pushed beforehand which I want to avoid.

The 'create_pull_request()` seems like creating a PR, but it doesn't have any parameter to specify which branch the PR for.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gante any pointers?

Copy link

@gante gante Sep 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The revision argument in upload_folder() allows us to specify a branch name.

I'm not sure if it creates the revision when it doesn't exist, but there are workarounds for that (e.g.).

HF Hub has the best experience when the model is in main. If you expect users to use this component to create many variations of the same model, make sure you either push for one model per repo, or to move the best model to main after the exploration phase :)

Copy link
Contributor Author

@deep-diver deep-diver Sep 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Your suggestion is to create a branch with git_checkout() then call upload_folder(), right? but git_checkout() creates a branch only in the locally cloned repository. It means I have to push the branch, then upload the same files again with upload_folder(). I hoped there is APIs like create_pull_request() with revision parameter, or upload_folder() that creates a new branch.

- `app_path` : path where the application templates are in the container that runs the TFX pipeline. This is expressed either apps.gradio.img_classifier or apps/gradio.img_classifier
- `repo_name` : the repository name to push the application to. The default value is same as the TFX pipeline name
- `space_sdk` : either `gradio` or `streamlit`. this will decide which application framework to be used for the Space repository. The default value is `gradio`
- `placeholders` : dictionary which placeholders to replace with model specific information. The keys represents describtions, and the values represents the actual placeholders to replace in the files under the `app_path`. There are currently two predefined keys, and if `placeholders` is set to `None`, the default values will be used.

## Project Dependencies
- [tfx](https://pypi.org/project/tfx/)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To interact with the model hub it's better to use the hf_api.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think hf_api is part of huggingface-hub package?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is. I guess we should list that dependency here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huggingface-hub is already listed in the dependency section

deep-diver and others added 2 commits September 1, 2022 15:01
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Copy link
Collaborator

@rcrowe-google rcrowe-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

As a proposal this looks good. Details of the implementation can, and likely will, change going forward.

@deep-diver
Copy link
Contributor Author

how this can be merged? maybe I need /lgtm from others?

@sayakpaul
Copy link
Contributor

/lgtm

@deep-diver
Copy link
Contributor Author

/merge

@rcrowe-google rcrowe-google merged commit 8624857 into tensorflow:main Sep 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants