Support for free-threaded Python #695

Merged 17 commits on Sep 20, 2024

Commits on Sep 20, 2024

  1. free-threading: CMake build system

    This commit adds the ``FREE_THREADED`` parameter to the CMake
    ``nanobind_add_module()`` command. It does not do anything for now
    besides defining ``NB_FREE_THREADED`` in C++ compilation units.
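A minimal sketch of how a compilation unit might react to this definition. It assumes the module was declared with something like `nanobind_add_module(my_ext FREE_THREADED my_ext.cpp)`, where `my_ext` is a placeholder name; the function `threading_mode()` is hypothetical and not part of nanobind.

```cpp
#include <string>

// NB_FREE_THREADED is the macro named in the commit message. It is only
// defined when the build requested free-threading. Everything else in this
// snippet is illustrative.
inline std::string threading_mode() {
#if defined(NB_FREE_THREADED)
    return "free-threaded";
#else
    return "gil";
#endif
}
```

Because the macro is a plain preprocessor definition, code that does not test for it compiles identically in both configurations.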
    wjakob committed Sep 20, 2024
    Commit 3cea767
  2. Commit e47d2c9
  3. free-threading: ABI separation

    Nanobind's internal data structures switch to a different layout when
    free-threading is requested. Such extensions must receive a different
    ABI tag since they are incompatible with non-free-threaded ones.
    wjakob committed Sep 20, 2024
    Commit 84ce61e
  4. free-threading: Abstractions around locking API

    This commit implements RAII scope guards that are thin wrappers around
    existing Python API to lock Python mutexes and enter critical sections
    with respect to one or two Python objects.
    
    In ordinary (non-free-threaded) Python builds, they don't do anything
    and can be optimized away.
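The general shape of such a guard can be sketched as follows. This is not nanobind's implementation: `DEMO_FREE_THREADED`, `demo_object`, and `demo_scoped_lock` are hypothetical names, and a `std::mutex` stands in for the Python locking API.

```cpp
#include <mutex>

// Hypothetical switch standing in for a free-threaded build flag.
#if defined(DEMO_FREE_THREADED)
constexpr bool demo_free_threaded = true;
#else
constexpr bool demo_free_threaded = false;
#endif

// Stand-in for a lockable object; not a nanobind or CPython type.
struct demo_object {
    std::mutex mutex;
    int value = 0;
};

// RAII guard: locks in the constructor, unlocks in the destructor.
// When locking is disabled, both bodies are dead code that the compiler
// can remove.
class demo_scoped_lock {
public:
    explicit demo_scoped_lock(demo_object &o) : obj(o) {
        if (demo_free_threaded)
            obj.mutex.lock();
    }
    ~demo_scoped_lock() {
        if (demo_free_threaded)
            obj.mutex.unlock();
    }
    demo_scoped_lock(const demo_scoped_lock &) = delete;
    demo_scoped_lock &operator=(const demo_scoped_lock &) = delete;

private:
    demo_object &obj;
};

inline int increment(demo_object &o) {
    demo_scoped_lock guard(o); // released automatically at scope exit
    return ++o.value;
}
```

Tying the unlock to the destructor means early returns and exceptions cannot leave the object locked.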
    wjakob committed Sep 20, 2024
    Commit 7d9acee
  5. free-threading: GC adaptations

    The garbage collector in free-threaded Python does not like it when the
    reference count of an object temporarily increases while traversing the
    object graph in ``tp_traverse``, and doing so introduces leaks.
    Unfortunately, example implementations of ``tp_traverse`` in both
    documentation and test suite fall into this trap and must be adapted.
    wjakob committed Sep 20, 2024
    Commit 510e3e6
  6. free-threading: Locking for internal data structures

    This commit enables free-threaded extension builds on Python 3.13+,
    which involves the following changes:
    
    - nanobind must notify Python that an extension supports free-threading.
    
    - All internal data structures must be protected from concurrent
      modification. The approach taken varies with respect to the specific
      data structure, and a long comment in ``nb_internals.h`` explains the
      design decisions behind all of the changes. In general, the implementation
      avoids centralized locks as much as possible to improve scalability.
    
    - Adopting safe versions of certain operations where needed, e.g.
      ``PyList_GetItemRef()``.
    
    - Switching non-object allocation from ``PyObject_Malloc()`` to
      ``PyMem_Malloc()``.
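One common way to avoid a centralized lock is to split a table into independently locked shards. The sketch below illustrates that general idea only; it is an assumption for illustration and does not reproduce the design described in ``nb_internals.h``. The class `sharded_registry` and its members are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_map>

// Illustrative only: a map split into independently locked shards, so that
// threads touching different keys rarely contend on the same mutex.
class sharded_registry {
public:
    void put(const std::string &key, int value) {
        shard &s = shard_for(key);
        std::lock_guard<std::mutex> guard(s.mutex);
        s.entries[key] = value;
    }

    // Returns true and writes to 'out' if the key is present.
    bool get(const std::string &key, int &out) {
        shard &s = shard_for(key);
        std::lock_guard<std::mutex> guard(s.mutex);
        auto it = s.entries.find(key);
        if (it == s.entries.end())
            return false;
        out = it->second;
        return true;
    }

private:
    struct shard {
        std::mutex mutex;
        std::unordered_map<std::string, int> entries;
    };

    static constexpr std::size_t shard_count = 16;

    // The key's hash selects the shard, so a key always maps to one lock.
    shard &shard_for(const std::string &key) {
        return shards[std::hash<std::string>{}(key) % shard_count];
    }

    std::array<shard, shard_count> shards;
};
```

Each operation holds exactly one shard lock, so no lock ordering is needed and unrelated keys proceed in parallel.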
    wjakob committed Sep 20, 2024
    Commit e02db2d
  7. free-threading: Immortalize type and function objects

    Global objects that undergo a high rate of reference count changes can
    become a bottleneck in free-threaded Python extensions, since the
    associated atomic operations require coordination between processor
    cores. Function and type objects are a particular concern.
    
    This commit immortalizes such objects, which exempts them from
    reference counting. The downside of this is that they will leak.
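The trade-off can be illustrated with a toy reference count that uses a sentinel value to mark an object as immortal. This is a simplified model under stated assumptions, not CPython's actual reference counting scheme; `demo_refcount` and `demo_immortal` are hypothetical names.

```cpp
#include <atomic>
#include <cstdint>

// Sentinel marking an object as immortal (illustrative choice of value).
constexpr std::uint32_t demo_immortal = UINT32_MAX;

struct demo_refcount {
    std::atomic<std::uint32_t> count{1};

    void make_immortal() {
        count.store(demo_immortal, std::memory_order_relaxed);
    }

    bool is_immortal() const {
        return count.load(std::memory_order_relaxed) == demo_immortal;
    }

    // Immortal objects skip the atomic read-modify-write, which is the
    // operation that forces cores to coordinate.
    void incref() {
        if (!is_immortal())
            count.fetch_add(1, std::memory_order_relaxed);
    }

    // Returns true when the last reference was dropped. This never happens
    // for an immortal object, which is why such objects leak.
    bool decref() {
        if (is_immortal())
            return false;
        return count.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }
};
```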
    wjakob committed Sep 20, 2024
    Commit 9cf1e8e
  8. free-threading: Argument-level locking

    Adapting C++ to handle parallelism due to free-threaded Python can be
    tricky, especially when the original code is given as-is. This commit
    adds a tentative API to retrofit locking onto existing code by locking the
    arguments of function calls.
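The underlying idea, leaving the original function untouched and locking its arguments in a wrapper, can be sketched in plain C++. The snippet uses `std::mutex` as a stand-in and does not show the nanobind API; `demo_account`, `transfer`, and `transfer_locked` are hypothetical names.

```cpp
#include <mutex>

// Stand-in for a bound object that carries its own lock.
struct demo_account {
    std::mutex mutex;
    long balance = 0;
};

// Existing code, written without any consideration for concurrency.
inline void transfer(demo_account &from, demo_account &to, long amount) {
    from.balance -= amount;
    to.balance += amount;
}

// Retrofitted entry point: lock the arguments, then call the original.
inline void transfer_locked(demo_account &from, demo_account &to, long amount) {
    if (&from == &to) {
        // Same object passed twice: lock it only once.
        std::lock_guard<std::mutex> guard(from.mutex);
        transfer(from, to, amount);
        return;
    }
    // std::lock acquires both mutexes without risking deadlock, regardless
    // of the order in which callers pass the two objects.
    std::lock(from.mutex, to.mutex);
    std::lock_guard<std::mutex> g1(from.mutex, std::adopt_lock);
    std::lock_guard<std::mutex> g2(to.mutex, std::adopt_lock);
    transfer(from, to, amount);
}
```

Locking both arguments together matters: taking the two locks one after another in argument order could deadlock when two threads call the function with the arguments swapped.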
    wjakob committed Sep 20, 2024
    Commit e70cf49
  9. free-threading: Documentation

    This commit documents free-threading in general and in the context of
    nanobind extensions.
    wjakob committed Sep 20, 2024
    Commit 50771a7
  10. free-threading: Test suite

    This commit adds several parallel tests to check that locking works as
    expected.
    wjakob committed Sep 20, 2024
    Commit dd43fe2
  11. Commit 302ab7c
  12. incorporate review feedback

    wjakob committed Sep 20, 2024
    Commit 13682cb
  13. Make nb::dict iterator non-copyable

    This commit changes the ``nb::dict`` iterator so that nanobind can
    implement the recommendation from
    
    https://docs.python.org/3.14/howto/free-threading-extensions.html#pydict-next
    
    The primary goal of ``nb::internal::dict_iterator`` was to be able to write
    
    ```cpp
    nb::dict my_dict = /* ... */;
    for (auto [k, v] : my_dict) {
        // ....
    }
    ```
    
    This is in fact the only associated feature that is explicitly mentioned in
    the documentation, and this continues to work.
    
    However, some undocumented features are lost:
    
    - The dictionary iterator is no longer copyable. This is because it
      must acquire an exclusive lock to the underlying dictionary.
    
    - The post-increment operator ``dict_it++`` (which relied on copying) is
      gone. Pre-increment continues to work, and that is enough for the
      loop structure mentioned above.
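A range-based ``for`` loop needs only ``begin()``/``end()``, ``operator!=``, ``operator*``, and prefix ``operator++`` from the iterator, so a move-only iterator that owns a lock is sufficient. The following sketch shows this with a stand-in container; `demo_dict` is hypothetical, uses `std::mutex` instead of a Python critical section, and is not nanobind's ``nb::dict``.

```cpp
#include <cstddef>
#include <mutex>
#include <type_traits>
#include <utility>
#include <vector>

// Illustrative stand-in for a dictionary with a per-object lock.
struct demo_dict {
    std::mutex mutex;
    std::vector<std::pair<int, int>> items;

    class iterator {
    public:
        // The begin iterator takes the lock; the end sentinel does not.
        iterator(demo_dict &d, std::size_t p, bool lock) : dict(d), pos(p) {
            if (lock)
                guard = std::unique_lock<std::mutex>(dict.mutex);
        }
        iterator(iterator &&) = default;     // movable
        iterator(const iterator &) = delete; // not copyable: owns the lock

        std::pair<int, int> operator*() const { return dict.items[pos]; }
        iterator &operator++() { ++pos; return *this; } // advances in place
        bool operator!=(const iterator &o) const { return pos != o.pos; }

    private:
        demo_dict &dict;
        std::size_t pos;
        std::unique_lock<std::mutex> guard; // released when the loop ends
    };

    iterator begin() { return iterator(*this, 0, true); }
    iterator end() { return iterator(*this, items.size(), false); }
};

static_assert(!std::is_copy_constructible<demo_dict::iterator>::value,
              "the iterator must not be copyable");

// The loop form from the commit message works with the move-only iterator.
inline int sum_values(demo_dict &d) {
    int total = 0;
    for (auto [k, v] : d)
        total += k * 0 + v; // k is unused apart from silencing warnings
    return total;
}
```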
    wjakob committed Sep 20, 2024
    Commit 0026b83
  14. Commit 29d8834
  15. Lock function arguments at compile time (#720)

    This commit refactors the argument locking so that it occurs at
    compile-time without imposing runtime overheads. The change applies to
    free-threaded extensions.
    
    Behavior differences compared to the prior approach:
    
    - it is no longer possible to do ``nb::arg().lock(false)`` or
      ``.lock(runtime_determined_value)``
    
    - we no longer prohibit locking self in ``__init__``; changing this
      would also require restoring ``cast_flags::lock``, and it's not clear
      that the benefit outweighs the complexity.
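Why a compile-time decision rules out a runtime-determined flag can be seen in a small sketch where the locking choice is a template parameter. This is an illustration under stated assumptions, not the mechanism nanobind uses; `demo_counter` and `bump` are hypothetical names, and `lock_count` exists only to make the effect observable.

```cpp
#include <mutex>

struct demo_counter {
    std::mutex mutex;
    int value = 0;
    int lock_count = 0; // bookkeeping for the demonstration only
};

// Whether to lock is a template parameter, so the choice is fixed when the
// code is compiled and no flag has to be inspected at call time.
template <bool Lock>
int bump(demo_counter &c) {
    if constexpr (Lock) {
        std::lock_guard<std::mutex> guard(c.mutex);
        ++c.lock_count;
        return ++c.value;
    } else {
        // The locking branch is discarded entirely in this instantiation.
        return ++c.value;
    }
}
```

A call such as `bump<flag>(c)` with a `flag` that is only known at run time does not compile, which mirrors the first behavior difference listed above.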
    oremanj authored and wjakob committed Sep 20, 2024
    Commit ec77413
  16. Commit 53e1e81
  17. free-threading: final tweaks

    wjakob committed Sep 20, 2024
    Commit a1a055a