[FEAT] Add integration with huggingface_hub.utils.telemetry #5218

davidberenstein1957
Member

@davidberenstein1957 davidberenstein1957 commented Jul 12, 2024

Description

This PR changes the server telemetry to gather metrics for API endpoint calls. This is the first iteration; more usage metrics can be added later.

The metrics gathered include the user ID and some system info, such as the server ID (a UUID generated once when the Argilla server starts).
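
A minimal sketch of the "generated once" server ID, assuming it only needs to live for the lifetime of the process. The names `get_server_id` and `build_system_info`, and the extra system fields, are hypothetical and not taken from the Argilla code base.

```python
import platform
import uuid
from functools import lru_cache
from typing import Any, Dict


@lru_cache(maxsize=1)
def get_server_id() -> uuid.UUID:
    """Generate the server ID on first use and reuse it for the process lifetime."""
    return uuid.uuid4()


def build_system_info() -> Dict[str, Any]:
    """Collect coarse system info to attach to every telemetry event (fields are assumptions)."""
    return {
        "server_id": str(get_server_id()),
        "system": platform.system(),
        "python_version": platform.python_version(),
    }
```

Caching the function result means every event sent by the same server process carries the same ID, while a restart produces a new one.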

Also, it deprecates the old telemetry KEY. For context, quoting the description of the huggingface_hub telemetry helper: "huggingface_hub includes a helper to send telemetry data. This information helps us debug issues and prioritize new features. Users can disable telemetry collection at any time by setting the HF_HUB_DISABLE_TELEMETRY=1 environment variable. Telemetry is also disabled in offline mode (i.e. when setting HF_HUB_OFFLINE=1)."
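
The opt-out rules in that quote can be illustrated with a small check. This is only a sketch of the described behaviour, not huggingface_hub's actual code: `telemetry_enabled` and the set of accepted truthy values are assumptions, and only the two environment variable names come from the quote.

```python
import os
from typing import Mapping, Optional

# Assumption: values treated as "true". The quoted text only documents "=1".
_TRUTHY = {"1", "true", "yes", "on"}


def _is_truthy(value: Optional[str]) -> bool:
    return value is not None and value.strip().lower() in _TRUTHY


def telemetry_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return False when telemetry is opted out or offline mode is set."""
    env = os.environ if env is None else env
    if _is_truthy(env.get("HF_HUB_DISABLE_TELEMETRY")):
        return False
    if _is_truthy(env.get("HF_HUB_OFFLINE")):
        return False
    return True
```

Passing the environment as a mapping keeps the check easy to test without touching the real process environment.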

OUTDATED

Adds telemetry for:

General Idea:
I’ve structured the data to come in through URLs/topics like dataset/settings/vectorsettings/create or dataset/records/suggestions/read, along with some generalized metadata per URL/topic, such as the count or the type of suggestion or setting.
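
A rough sketch of how such slash-separated topics and their metadata could be assembled. The helper names `build_topic` and `build_event` and the allowed action set are hypothetical; the PR itself exposes methods such as `track_crud_workspace`, whose internals are not shown here.

```python
from typing import Any, Dict

# Assumption: the basic CRUD verbs used as the last topic segment.
CRUD_ACTIONS = {"create", "read", "update", "delete"}


def build_topic(*resource_path: str, action: str) -> str:
    """Join resource segments and a CRUD action into a slash-separated topic."""
    if action not in CRUD_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    if not resource_path:
        raise ValueError("at least one resource segment is required")
    return "/".join([*resource_path, action])


def build_event(*resource_path: str, action: str, **metadata: Any) -> Dict[str, Any]:
    """Pair a topic with generalized metadata, dropping unset (None) values."""
    return {
        "topic": build_topic(*resource_path, action=action),
        "metadata": {k: v for k, v in metadata.items() if v is not None},
    }
```

For example, `build_topic("dataset", "settings", "vectorsettings", action="create")` yields the first topic named above.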

To discuss:

  • What to do with list methods. I currently track list-like operations by sending a read event for each individual item, along with one read event that carries a count. I did this because it might be interesting to get the total number of users, workspaces, etc. Should we move this over to list as a separate CRUD action? Do we also want to capture each individual update?
  • Similar logic applies to bulk operations. Should bulk_crud be tracked as separate CRUD actions?
  • I don't track user/dataset/workspace-specific list operations, like list_users_workspace or list_datasets_user.
  • I don't track metadata and vector updates at the record level; however, we DO keep track of operations on suggestions and responses.
  • @frascuchon Was there a reason to include the header along with user/login operations? Otherwise, I will rewrite this a bit and include user/login as user/read.
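
To make the first question concrete, here is a small sketch comparing the two options for list endpoints. Both function names and the event shape are hypothetical; they only illustrate how many events each option produces.

```python
from typing import Any, Dict, List, Sequence


def track_list_as_reads(resource: str, item_ids: Sequence[str]) -> List[Dict[str, Any]]:
    """Current approach: one read per item plus one read carrying the count."""
    events: List[Dict[str, Any]] = [
        {"topic": f"{resource}/read", "id": item_id} for item_id in item_ids
    ]
    events.append({"topic": f"{resource}/read", "count": len(item_ids)})
    return events


def track_list_as_single_event(resource: str, item_ids: Sequence[str]) -> List[Dict[str, Any]]:
    """Alternative: a dedicated list action that carries only the count."""
    return [{"topic": f"{resource}/list", "count": len(item_ids)}]
```

With N items, the first option emits N + 1 events per list call and the second emits exactly one, which matters if each event turns into a telemetry request.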

Follow up

Closes #5204

Type of change

  • Improvement (change adding some improvement to an existing functionality)

How Has This Been Tested

NA

Checklist

  • I added relevant documentation
  • I followed the style guidelines of this project
  • I did a self-review of my code
  • I made corresponding changes to the documentation
  • I confirm My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

@davidberenstein1957
Member Author

davidberenstein1957 commented Jul 12, 2024

@dvsrepo I added some initial work. It is still in progress, but I added you so you can keep track.

@dvsrepo
Member

dvsrepo commented Jul 15, 2024

Looks good, just left two small comments!

Comment on lines 108 to 110
for workspace in workspaces:
    await telemetry_client.track_crud_workspace(action="read", workspace=workspace)
await telemetry_client.track_crud_workspace(action="read", workspace=None, count=len(workspaces))
Member

I don't understand these lines. For other list endpoints we just track the resource count, but here we track also the workspaces individually. What's the motivation?

Member Author

@davidberenstein1957 davidberenstein1957 Aug 27, 2024

@frascuchon It is to keep track of list-like operations without differentiating the CRUD naming too much. If you think it is interesting for development, I will create a separate "list" action. Otherwise, I will leave it. I think I unintentionally forgot the individual calls for the other list-like endpoints.

Member Author

I reviewed the code and see I did keep track of the list-like thing in other places too.

argilla-server/pyproject.toml (resolved)
Comment on lines 103 to 106
for field in dataset.fields:
    await telemetry_client.track_crud_dataset_setting(
        action="read", dataset=dataset, setting_name="fields", setting=field
    )
Member

Do we know how these telemetry requests are done? Are they synchronous? UDP?

We should check that we are not spending a lot of time executing these requests so we are not adding additional time to the API endpoint requests.
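
One generic way to keep telemetry off the request path is to hand events to a background worker. The sketch below uses only the standard library and is not a description of how huggingface_hub or this PR sends requests; `BackgroundTelemetry` and its `send` callback are hypothetical names.

```python
import queue
import threading
from typing import Any, Callable, Dict


class BackgroundTelemetry:
    """Hand events to a daemon worker thread so request handlers never wait on I/O."""

    def __init__(self, send: Callable[[Dict[str, Any]], None], max_pending: int = 1000) -> None:
        self._send = send
        self._queue: queue.Queue = queue.Queue(maxsize=max_pending)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def track(self, event: Dict[str, Any]) -> bool:
        """Enqueue without blocking; return False if the event was dropped."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            return False

    def flush(self) -> None:
        """Block until every queued event has been handed to the sender."""
        self._queue.join()

    def _run(self) -> None:
        while True:
            event = self._queue.get()
            try:
                self._send(event)
            except Exception:
                pass  # telemetry failures must never affect the API
            finally:
                self._queue.task_done()
```

With this shape, the cost added to an endpoint is a single `put_nowait` call, and a slow or failing telemetry backend results in dropped events rather than slower responses.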

Member Author

Comment on lines 194 to 197
action = "create"
if await Suggestion.get_by(db, record_id=record_id, question_id=suggestion_create.question_id):
    response.status_code = status.HTTP_200_OK
    action = "update"
Member

Maybe we can have an upsert action so you don't need to add logic trying to know if it's a create or an update?

Member Author

We can also rename it to "upsert", but I wanted to avoid capturing all edge cases ("publish", "search", "read", "list", "upsert", "update") because I thought it might be a bit much for metrics/telemetry. @frascuchon @jfcalvo If you feel it would help development, I can make a finer distinction.
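
A small sketch of what the suggested upsert action could look like next to the existing create/update split. The `TelemetryAction` enum and `suggestion_action` helper are hypothetical and only illustrate the trade-off discussed in this thread.

```python
from enum import Enum


class TelemetryAction(str, Enum):
    CREATE = "create"
    READ = "read"
    UPDATE = "update"
    DELETE = "delete"
    UPSERT = "upsert"  # hypothetical addition discussed in this thread


def suggestion_action(already_exists: bool, use_upsert: bool = False) -> TelemetryAction:
    """Pick the action to report for a create-or-update suggestion endpoint."""
    if use_upsert:
        # No need to know whether the suggestion existed beforehand.
        return TelemetryAction.UPSERT
    return TelemetryAction.UPDATE if already_exists else TelemetryAction.CREATE
```

Note that in the snippet under review the same lookup also decides the response status code, so an upsert action would simplify only the telemetry side.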

refactor: add "me" to user operations
refactor: add "list" to list-like operations
context = self._system_info.copy()
user_agent.update(self._system_info)
if count is not None:
    user_agent["count"] = count
Member

What's the meaning of count and what is it used for?

Member Author

Count is used for list operations

Member

Why not just send it as part of the data content for those actions measuring list?
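
A sketch of that suggestion, assuming the payload is a plain dictionary built from the system info plus per-event data. `build_user_agent` and `list_event_data` are hypothetical helpers, not functions from the PR.

```python
from typing import Any, Dict, Optional


def build_user_agent(
    system_info: Dict[str, Any],
    data: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge static system info with per-event data without mutating either input."""
    user_agent = dict(system_info)
    user_agent.update(data or {})
    return user_agent


def list_event_data(count: int) -> Dict[str, Any]:
    """Data for list-style actions: the count is ordinary event data, not a special argument."""
    return {"count": count}
```

This removes the dedicated `count` parameter from the generic tracking path; only the list-style callers need to know about it.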

frascuchon and others added 3 commits September 2, 2024 11:46
# Description

This PR adds a middleware component to track the API endpoint's usage.
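
For readers unfamiliar with the pattern, a usage-tracking middleware can be sketched as a plain ASGI wrapper with no framework dependency. This is an illustration under assumptions, not the middleware added in this commit: the class name, the `track` callback, and the recorded fields are all hypothetical.

```python
from typing import Any, Awaitable, Callable, Dict

Scope = Dict[str, Any]
ASGIApp = Callable[[Scope, Callable, Callable], Awaitable[None]]


class UsageTrackingMiddleware:
    """Pure-ASGI middleware that records method, path and status for each HTTP request."""

    def __init__(self, app: ASGIApp, track: Callable[[Dict[str, Any]], None]) -> None:
        self.app = app
        self.track = track

    async def __call__(self, scope: Scope, receive: Callable, send: Callable) -> None:
        if scope["type"] != "http":
            # Pass lifespan and websocket traffic through untouched.
            await self.app(scope, receive, send)
            return

        status: Dict[str, Any] = {"code": None}

        async def send_wrapper(message: Dict[str, Any]) -> None:
            # The status code is only visible on the response-start message.
            if message["type"] == "http.response.start":
                status["code"] = message["status"]
            await send(message)

        try:
            await self.app(scope, receive, send_wrapper)
        finally:
            self.track(
                {"method": scope["method"], "path": scope["path"], "status": status["code"]}
            )
```

Tracking in one place like this avoids adding telemetry calls to every endpoint handler, at the cost of seeing only what is visible at the HTTP layer.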

**Type of change**

- New feature (non-breaking change which adds functionality)
- Refactor (change restructuring the codebase without changing
functionality)
- Improvement (change adding some improvement to an existing
functionality)

**How Has This Been Tested**

**Checklist**

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm My changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature
works
- I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)

---------

Co-authored-by: José Francisco Calvo <jose@argilla.io>
argilla/mkdocs.yml (outdated, resolved)
frascuchon and others added 7 commits September 2, 2024 14:13
…5445)

# Description

This PR restores the server_id for telemetry purposes and also adds the
user.id and user.role when tracking API requests.

**Type of change**

- Improvement (change adding some improvement to an existing
functionality)
- Documentation update

**How Has This Been Tested**

**Checklist**

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm My changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature
works
- I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)
# Description

This PR adds the track startup method defined in
#5441 and includes
persistent_storage_enabled info as part of the system info.

**Type of change**

- Improvement (change adding some improvement to an existing
functionality)

**How Has This Been Tested**

**Checklist**

- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm My changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature
works
- I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)
@frascuchon frascuchon changed the title from "Add huggingface_hub.utils.telemetry" to "[FEAT] Add integration with huggingface_hub.utils.telemetry" on Sep 3, 2024
@frascuchon frascuchon merged commit ebd1b0f into develop Sep 3, 2024
12 checks passed
@frascuchon frascuchon deleted the feat/5204-feature-add-huggingface_hubutilssend_telemetry-to-the-argilla-server branch September 3, 2024 09:13
Successfully merging this pull request may close these issues.

[FEATURE] add huggingface_hub.utils.send_telemetry to the argilla-server
4 participants