s3store: Add tools for debugging bottlenecks #924

Draft · wants to merge 1 commit into main

Conversation

@Acconut (Member) commented Mar 26, 2023

When using tusd with the s3store, high CPU usage occurs frequently. At some point, the upload speed becomes limited by the available CPU resources: compared to a regular filestore, the s3store delivers reduced upload speed. This PR adds tooling to help identify and resolve these performance bottlenecks.

How to use this tool:

  1. Install a Docker environment.
  2. Run 1_run_tests.sh to execute the benchmarks and record data.
  3. Run 2_plot_resource_usage.py to plot the upload performance.
  4. Modify tusd or the setup and repeat steps 2 and 3 to iterate.

We also previously used CPU profiles to find compute-intensive methods, but only with limited success. We found that the computation of checksums and the use of SSL add CPU usage, and implemented flags to disable them (see

flag.BoolVar(&Flags.S3DisableContentHashes, "s3-disable-content-hashes", false, "Disable the calculation of MD5 and SHA256 hashes for the content that gets uploaded to S3 for minimized CPU usage (experimental and may be removed in the future)")
flag.BoolVar(&Flags.S3DisableSSL, "s3-disable-ssl", false, "Disable SSL and only use HTTP for communication with S3 (experimental and may be removed in the future)")
). However, the CPU reduction was not as significant as we hoped, especially when weighed against the downside of removing TLS.

We also attempted switching to the AWS SDK v2 and the Minio SDK (see

flag.BoolVar(&Flags.S3UseMinioSDK, "s3-use-minio-sdk", false, "Use the Minio SDK interally (experimental)")
), but that also did not result in any significant improvement. Therefore, we should assume that the problem lies not with the AWS SDK v1, but with our general approach to S3 uploads. Maybe the intensive reads and writes to the temporary disk are limiting us.

@Acconut Acconut changed the title docker: Add tools for preproducing load issues s3store: Add tools for debugging bottlenecks Mar 26, 2023
Base automatically changed from v2 to main September 6, 2023 15:56