Caching doesn't seem to save that much time #881

Closed
2 of 5 tasks
hamirmahal opened this issue May 31, 2024 · 6 comments
Assignees
Labels
bug Something isn't working

Comments

@hamirmahal

Description:
Caching seems to only skip the downloading step. It would be nice if it skipped Collecting and Using altogether, since that could maybe save a lot more time.

Action version:
v5

Platform:

  • Ubuntu
  • macOS
  • Windows

Runner type:

  • Hosted
  • Self-hosted

Tools version:

Run actions/setup-python@v5
  with:
    cache: pip
    python-version: 3.11
    check-latest: false
    token: ***
    update-environment: true
    allow-prereleases: false
Installed versions
  Successfully set up CPython (3.11.9)

Repro steps:
https://github.com/hamirmahal/cache-pip-install/actions/runs/8963287619/job/24613359851#step:4:1

Expected behavior:
Caching saves a lot more time.

Actual behavior:
Caching only saves about 6s or so on an install step that otherwise takes a minute.

@hamirmahal hamirmahal added bug Something isn't working needs triage labels May 31, 2024
@hamirmahal
Author

Initial Run

Run / python-program (push) Successful in 1m

Cached Run

Run / python-program (push) Successful in 54s

@hamirmahal hamirmahal changed the title Caching doesn't seem to save that much timea Caching doesn't seem to save that much time May 31, 2024
@HarithaVattikuti
Contributor

Hello @hamirmahal
Thank you for creating this issue. We will investigate it and get back to you as soon as we have some feedback.

@hamirmahal
Author

You're welcome @HarithaVattikuti.

@kurtmckee
Contributor

@hamirmahal I'm not a developer but want to respond to your ticket.

Diagnosis

The pip install -r requirements.txt step in your workflow benefits from caching. However, the setup-python action doesn't control pip's behavior, and cannot reduce the number of "Collecting" and "Using" lines that you're seeing.

That said, "Collecting" and "Using" aren't actually consuming much time. The installation itself consumes the vast majority of it. You can verify this by reviewing the raw logs, which contain a timestamp for every output line:

[Screenshot: raw workflow log with a timestamp on every output line]

Looking at those raw logs, pip spends ~6 seconds (from 2024-05-06T02:41:05.2881468Z to 2024-05-06T02:41:11.4118495Z) printing "Collecting" and "Using" lines. It then spends ~26 seconds (from 2024-05-06T02:41:11.4118495Z to 2024-05-06T02:41:37.9340734Z) actually installing your dependencies.

Best practice

My recommendation is to disable setup-python's pip caching entirely and focus exclusively on caching an entire virtual environment.

    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        id: setup-python
        with:
          python-version: "3.11"

      # Write the exact Python version to a file for cache-busting.
      - run: |
          echo "${{ steps.setup-python.outputs.python-version }}" > ".installed-python"

      # THIS! This is where you'll save all your time!
      # Never cache pip dependencies! Cache virtual environments!
      - uses: "actions/cache@v4"
        id: "restore-cache"
        with:
          key: "venv-${{ hashFiles('.installed-python', 'requirements.txt') }}"
          path: |
            .venv/

      # If Python 3.11.x upgrades to 3.11.y, or if requirements.txt gets updated,
      # the cache lookup above will miss, and the venv needs to be recreated.
      - name: "Create a virtual environment"
        if: "steps.restore-cache.outputs.cache-hit != 'true'"
        run: |
          python -m venv .venv
          .venv/bin/python -m pip install --upgrade pip setuptools wheel
          .venv/bin/python -m pip install -r requirements.txt

      - run: .venv/bin/python src/main.py

Caching an entire virtual environment is going to save you a ton of time. The only thing you need to watch out for is cache-busting. The example above busts the cache based on the exact Python version (like "3.11.7") and based on your requirements.txt file. If you're running this workflow on multiple platforms you'll need to include that in the cache key, too.
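
One way to fold the platform into the cache key is sketched below. Treat it as an illustration under assumptions rather than a tested drop-in: the job name and OS matrix are made up for the example, and the key uses the `python-version` output directly instead of writing it to a file first. Note that `venv` places the interpreter under `.venv/Scripts/` on Windows and under `.venv/bin/` elsewhere, so the install step has to pick the right path.

```yaml
jobs:
  python-program:                      # hypothetical job name
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]   # assumed platforms
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        id: setup-python
        with:
          python-version: "3.11"

      - uses: actions/cache@v4
        id: restore-cache
        with:
          # runner.os and runner.arch keep venvs from different platforms apart.
          # The python-version output is the exact installed version (like "3.11.9").
          key: venv-${{ runner.os }}-${{ runner.arch }}-${{ steps.setup-python.outputs.python-version }}-${{ hashFiles('requirements.txt') }}
          path: .venv/

      - name: Create a virtual environment
        if: steps.restore-cache.outputs.cache-hit != 'true'
        shell: bash
        run: |
          python -m venv .venv
          # Pick the interpreter path that matches the runner's platform.
          if [ "$RUNNER_OS" = "Windows" ]; then
            VENV_PY=.venv/Scripts/python
          else
            VENV_PY=.venv/bin/python
          fi
          "$VENV_PY" -m pip install -r requirements.txt
```

Putting the OS and architecture at the front of the key means a venv built on one platform is never restored on another, where its compiled packages and interpreter paths would not match.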

@HarithaVattikuti I think that this ticket can be closed; the report is referring to pip behavior, not setup-python behavior.

@priya-kinthali priya-kinthali self-assigned this Jun 26, 2024
@priya-kinthali
Contributor

Hello @hamirmahal 👋,
Regarding pip caching: it works by storing downloaded packages in a directory so that they don't need to be re-downloaded the next time they're needed. This saves time by skipping the download step, but pip still has to resolve and install every package.
Thanks to @kurtmckee for the insightful explanation. As rightly pointed out, the "Collecting" and "Using" stages in pip's process are essential and cannot be bypassed, even when caching is enabled. In the "Collecting" stage, pip identifies the packages to install (including dependencies), and a "Using" line indicates that pip is taking a package from its cache instead of downloading it again.
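
For reference, the download cache described above can be inspected from within a workflow. The step below is a minimal sketch and not part of the original workflow: the step name is made up, and it assumes a pip version that provides the `pip cache` subcommand (20.1 or later).

```yaml
      - name: Inspect pip's download cache   # hypothetical step name
        run: |
          # Print the directory where pip stores its download cache.
          python -m pip cache dir
          # Summarise what the cache currently holds.
          python -m pip cache info
```

Running this after the install step on a cached run should show a populated cache, even though the install itself still takes most of the time.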
I hope this clarifies things. I am proceeding to close this issue. Thank you for your patience and cooperation :)
Please feel free to reach out to us in case of any further concerns!


4 participants