
Development Methodology

Jason R. Coombs edited this page Aug 19, 2024 · 6 revisions

Several projects in the CPython stdlib follow this methodology for development.

These projects are maintained primarily in their external repositories (sometimes referred to as "backports"). Development is preferred in the external repositories due to a host of advantages:

  • The test suite runs under pytest, a richer test runner, affording more sophisticated behaviors even when running unittest-based tests.
  • Maintenance follows best practices as defined by the skeleton used by hundreds of other projects, deriving value from shared solutions to common concerns such as updating support for Python versions.
  • Extra checks are performed to ensure consistent formatting, style, and safety (ruff, mypy, ...).
  • Doctests are checked.
  • Performance benchmarks can be measured and tested.
  • Code is tested against all supported Python versions.
  • Changes can be released quickly and get rapid feedback, shifting the lifecycle left.
  • The external projects are less strict than CPython, so they accept changes more readily.

As a result, the external projects are "upstream" of Python, and code is synced "downstream" into Python's stdlib.

These projects still accept contributions downstream in Python. Issues can be tracked in either the CPython or the host project's repo. Maintainers will file additional issues when necessary for contributing changes.

Regardless of the original source of the contribution, these two codebases should be kept in close sync and utilize techniques to minimize the diffs between the two targets. Here are some of the ways these projects achieve that minimum variance:

  • Code in the stdlib should be partitioned into folders pertaining to the shared functionality. That means that the abstract base classes for importlib resources should live in importlib.resources.abc and not importlib.abc. That is also why zipfile.Path is implemented as zipfile._path.Path.
  • The external project should implement "compatibility" shims in separate modules or (preferably) packages. For example, importlib_resources exposes future and compat packages for the external-specific behaviors. These packages are excluded from the port to CPython.
  • Each project keeps a cpython branch that tracks the code in the same layout as it appears in the stdlib, such that one can cp -r the contents in either direction and there should be no diff when the projects are in sync.
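
To make the second point concrete, here is a minimal sketch of what a compatibility shim might look like. The module path compat/py38.py and the removeprefix helper are hypothetical, chosen only for illustration; they are not taken from any of these projects.

```python
"""Sketch of a shim as it might appear in a hypothetical compat/py38.py
module of an external project. Illustrative only, not real project code."""
import sys

if sys.version_info >= (3, 9):

    def removeprefix(text: str, prefix: str) -> str:
        # Newer Pythons provide the method natively, so simply delegate.
        return text.removeprefix(prefix)

else:

    def removeprefix(text: str, prefix: str) -> str:
        # Fallback for older Pythons that lack str.removeprefix.
        return text[len(prefix):] if text.startswith(prefix) else text
```

Under this assumption, the external project's code would call removeprefix(name, "importlib_") through the shim, while the stdlib copy would call name.removeprefix("importlib_") directly. Keeping the shim in its own module means the difference between the two codebases is confined to one import and one call site.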

Syncing

Ensure projects are in sync

  1. Check out the cpython branch of the external project (at $PROJECT).
  2. Check out cpython to the main branch (at $CPYTHON).
  3. cp -r $PROJECT/* $CPYTHON.
  4. Ensure git -C $CPYTHON diff shows no diff.
  5. If there is a diff, track down the source of the changes and determine how to reconcile.
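
As a read-only alternative to copying files and running git diff, the check can be approximated by comparing the two trees directly. This is a sketch of a different technique, not the procedure above: the function name out_of_sync is hypothetical, and it assumes that every file on the project's cpython branch should exist byte-for-byte at the same relative path in the CPython checkout.

```python
import filecmp
from pathlib import Path


def out_of_sync(project: Path, cpython: Path) -> list:
    """Return relative paths of project files missing from or different in cpython."""
    stale = []
    for source in sorted(project.rglob("*")):
        rel = source.relative_to(project)
        # Skip directories and the project's own git metadata.
        if not source.is_file() or ".git" in rel.parts:
            continue
        target = cpython / rel
        # shallow=False compares file contents rather than only os.stat() data.
        if not target.is_file() or not filecmp.cmp(source, target, shallow=False):
            stale.append(rel.as_posix())
    return stale
```

An empty list suggests the projects are in sync. Files that exist only in CPython are ignored, since the CPython tree is a superset of the project's files.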

Sync external to stdlib

  1. Ensure projects are in sync.
  2. In the cpython branch of the project, merge changes from main with git merge main.
  3. Resolve conflicts. Some conflicts will be changes to deleted files that aren't needed for the stdlib; just confirm the deletion. Other conflicts may be in code, so carefully compare the changes on each branch.
  4. Ensure compatibility modules are not included. If new ones were added, delete them with git rm -f path/to/compat.
  5. Replace references to compatibility modules with their stdlib equivalent.
  6. Search for references to the external package by name (e.g. importlib_resources) and replace them.
  7. Commit the merge.
  8. Test the changes.
    1. Remove any incidental files (git clean -fdx).
    2. cp -r * $CPYTHON.
    3. Change directory to the CPython checkout.
    4. Build and test:
      1. Run the tests with ./python.exe -m test.test_importlib -v (or similar; the built binary is ./python on Linux).
    5. Address test failures in the external project, either in cpython or main (or another branch) as appropriate. Amend, commit, or re-merge the changes to the cpython branch and repeat "Test the changes."
  9. Push the cpython branch of the external project.
  10. Commit the changes and submit them for review following the usual CPython development process.
    1. If encountering issues with docs builds, consider running make check suspicious html in the Doc/ directory of CPython.
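
The search-and-replace in step 6 can be sketched in a few lines of Python. The helper names replace_references and rewrite_tree are hypothetical, and the sketch assumes a plain whole-word substitution (for example importlib_resources to importlib.resources) is what's wanted. Review the resulting diff by hand, since some references may need a different treatment.

```python
import re
from pathlib import Path


def replace_references(text: str, external: str, stdlib: str) -> str:
    """Replace whole-word occurrences of the external name with the stdlib name."""
    pattern = re.compile(rf"\b{re.escape(external)}\b")
    # A function replacement avoids backslash processing of the new name.
    return pattern.sub(lambda match: stdlib, text)


def rewrite_tree(root: Path, external: str, stdlib: str) -> list:
    """Rewrite *.py files under root in place; return the paths that changed."""
    changed = []
    for path in sorted(root.rglob("*.py")):
        original = path.read_text(encoding="utf-8")
        updated = replace_references(original, external, stdlib)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

For example, rewrite_tree(Path("."), "importlib_resources", "importlib.resources") run from the root of the cpython branch would rewrite the Python sources. Only .py files are touched here, so documentation and other file types still need a manual search.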

Port stdlib to external

The best way to sync from stdlib to external is to find the relevant squashed commit on CPython's main branch and cherry-pick it into the external project. Find the relevant commit (often named in the merge message of the corresponding pull request) and note it as $COMMIT.

  1. Check out the project to the main branch.
  2. Fetch the CPython repo with git fetch https://github.com/python/cpython. Optionally pass --depth 1000 to avoid fetching the full history. This fetches CPython's history into the external project's repo.
  3. Cherry-pick the change with git cherry-pick $COMMIT. git will most likely be able to associate the changes with the relevant files (even though they're in different locations).
    1. Resolve creation/deletion conflicts (usually by electing to delete irrelevant files).
    2. Remove any files not relevant to this project (e.g. git rm -rf Misc for news fragments).
  4. (optional) Commit the merge as a checkpoint.
  5. Test the changes, make amendments or tweaks to make the code compatible, possibly as new commits.
  6. Push the changes.
  7. (optional) git prune to remove the CPython history.
  8. Merge the changes into the cpython branch.
  9. (optional) Check that projects are in sync.
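
The first three steps can be scripted. The following is a minimal sketch, assuming git is on the PATH and the working directory is a checkout of the external project. The names port_commands and run_port are hypothetical. The cherry-pick is expected to stop on conflicts, at which point the remaining steps are done by hand.

```python
import subprocess

CPYTHON_URL = "https://github.com/python/cpython"


def port_commands(commit, depth=1000):
    """Build the git commands for steps 1-3; depth=None fetches full history."""
    fetch = ["git", "fetch"]
    if depth is not None:
        fetch += ["--depth", str(depth)]
    fetch.append(CPYTHON_URL)
    return [
        ["git", "checkout", "main"],
        fetch,
        ["git", "cherry-pick", commit],
    ]


def run_port(commit, repo=".", depth=1000):
    """Run each command in repo, raising CalledProcessError on the first failure."""
    for command in port_commands(commit, depth):
        subprocess.run(command, cwd=repo, check=True)
```

Separating the command list from its execution makes it easy to print the plan for review before running anything. If the commit is older than the fetched depth, the cherry-pick fails and the fetch needs a larger depth.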

Documentation

In general, it's preferred for guidance documentation to be in CPython (only, not synced). The external projects should refer users to the CPython documentation. Some documentation (like API documentation generated from code and docstrings) is not viable in CPython, so it will be hosted in the external project. Similarly, the changelog for the external project should be hosted by that project. Other documentation, like migration guides, might be maintained outside CPython.

Because CPython cannot reflect API documentation from code, it's possible API documentation may not be provided in CPython at all (due to the duplicative, toilsome work that entails), though contributors are welcome to provide it.
