
GH-39565: [C++] Do not concatenate chunked values of fixed-width types to run "array_take" #41700

Open
wants to merge 20 commits into main

Conversation

felipecrv
Contributor

@felipecrv felipecrv commented May 17, 2024

Rationale for this change

Concatenating a chunked array into a single array before running the array_take kernels is very inefficient and can lead to out-of-memory crashes. See also #25822.

What changes are included in this PR?

  • Implementation of kernels for "array_take" that can receive a ChunkedArray as values and produce an output without concatenating these chunks
  • Improvements in the dispatching logic of TakeMetaFunction("take") to make "array_take" able to have a chunked_exec kernel for all types (some specialized and some based on concatenation)

Are these changes tested?

By existing tests. Some tests were added in previous PRs that introduced some of the infrastructure to support this.

@felipecrv felipecrv changed the title GH-39565: [C++] GH-39565: [C++] Do not concatenate ChunkedArray values to run "array_take" May 17, 2024
@felipecrv felipecrv changed the title GH-39565: [C++] Do not concatenate ChunkedArray values to run "array_take" GH-39565: [C++] Do not concatenate chunked values of fixed-width types to run "array_take" May 17, 2024
@mapleFU
Member

mapleFU commented May 17, 2024

May I ask an unrelated question: when should we call assert and when DCHECK? I think they would likely be the same.

@felipecrv
Contributor Author

May I ask an unrelated question: when should we call assert and when DCHECK? I think they would likely be the same.

We call assert in headers because we don't want to pay the cost of including logging.h everywhere. Think of assert as a lighter-weight debug check. But if you see an assert in a .cc file, tell me to change it to DCHECK*.

@felipecrv felipecrv force-pushed the take_chunked_fixed branch 2 times, most recently from fbd97a3 to f4b4e12 Compare June 10, 2024 15:35
felipecrv added a commit that referenced this pull request Jun 13, 2024
… make them private (#42127)

### Rationale for this change

Move TakeXXX free functions into `TakeMetaFunction` and make them private

### What changes are included in this PR?

Code move and some small refactorings in preparation for #41700.

### Are these changes tested?

By existing tests.
* GitHub Issue: #42126

Authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
@felipecrv felipecrv marked this pull request as ready for review June 15, 2024 15:12
@@ -60,6 +60,7 @@ void RegisterSelectionFunction(const std::string& name, FunctionDoc doc,
{std::move(kernel_data.value_type), std::move(kernel_data.selection_type)},
OutputType(FirstType));
base_kernel.exec = kernel_data.exec;
base_kernel.exec_chunked = kernel_data.chunked_exec;
Contributor Author


The member variable is called exec_chunked but the type is called ChunkedExec (so confusing). In this PR I ended up sticking to chunked_exec. Once everything is reviewed and merged, I could try to unify things in whichever direction people prefer.

@felipecrv felipecrv force-pushed the take_chunked_fixed branch 4 times, most recently from 28da5e6 to 2ff6789 Compare June 20, 2024 13:54
@pitrou
Member

pitrou commented Aug 21, 2024

I ran the new benchmarks (those with a small selection factor) and the results are more varied, see below:

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Non-regressions: (13)
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                                                    benchmark           baseline          contender  change %                                                                                                                                                                                                                                                  counters
             TakeChunkedFlatInt64FewMonotonicIndices/524288/0   3.487G items/sec   6.131G items/sec    75.818             {'family_index': 5, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedFlatInt64FewMonotonicIndices/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 4676, 'null_percent': 0.0, 'selection_factor': 0.05}
          TakeChunkedChunkedInt64FewMonotonicIndices/524288/0   3.397G items/sec   5.278G items/sec    55.353          {'family_index': 1, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedInt64FewMonotonicIndices/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 4603, 'null_percent': 0.0, 'selection_factor': 0.05}
             TakeChunkedFlatInt64FewMonotonicIndices/524288/1   3.019G items/sec   4.508G items/sec    49.291           {'family_index': 5, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedFlatInt64FewMonotonicIndices/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 4000, 'null_percent': 100.0, 'selection_factor': 0.05}
          TakeChunkedChunkedInt64FewMonotonicIndices/524288/1   2.802G items/sec   4.135G items/sec    47.589        {'family_index': 1, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedInt64FewMonotonicIndices/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 3740, 'null_percent': 100.0, 'selection_factor': 0.05}
          TakeChunkedFlatInt64FewMonotonicIndices/524288/1000   2.316G items/sec   2.910G items/sec    25.628          {'family_index': 5, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedFlatInt64FewMonotonicIndices/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 3104, 'null_percent': 0.1, 'selection_factor': 0.05}
       TakeChunkedChunkedInt64FewMonotonicIndices/524288/1000   2.254G items/sec   2.684G items/sec    19.075       {'family_index': 1, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedInt64FewMonotonicIndices/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 3082, 'null_percent': 0.1, 'selection_factor': 0.05}
            TakeChunkedFlatInt64FewMonotonicIndices/524288/10   2.257G items/sec   2.635G items/sec    16.733           {'family_index': 5, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedFlatInt64FewMonotonicIndices/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 3052, 'null_percent': 10.0, 'selection_factor': 0.05}
         TakeChunkedChunkedInt64FewMonotonicIndices/524288/10   2.179G items/sec   2.466G items/sec    13.175        {'family_index': 1, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedInt64FewMonotonicIndices/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2878, 'null_percent': 10.0, 'selection_factor': 0.05}
   TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/1   5.541G items/sec   5.568G items/sec     0.490 {'family_index': 2, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 7344, 'null_percent': 100.0, 'selection_factor': 0.05}
         TakeChunkedChunkedStringFewMonotonicIndices/524288/1   2.753G items/sec   2.736G items/sec    -0.625       {'family_index': 3, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedStringFewMonotonicIndices/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 3674, 'null_percent': 100.0, 'selection_factor': 0.05}
             TakeChunkedFlatInt64FewMonotonicIndices/524288/2   1.880G items/sec   1.852G items/sec    -1.442            {'family_index': 5, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedFlatInt64FewMonotonicIndices/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2520, 'null_percent': 50.0, 'selection_factor': 0.05}
          TakeChunkedChunkedInt64FewMonotonicIndices/524288/2   1.818G items/sec   1.766G items/sec    -2.872         {'family_index': 1, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedInt64FewMonotonicIndices/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2403, 'null_percent': 50.0, 'selection_factor': 0.05}
TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/1000 269.164M items/sec 256.848M items/sec    -4.576 {'family_index': 2, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 358, 'null_percent': 0.1, 'selection_factor': 0.05}

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Regressions: (17)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                                                   benchmark           baseline          contender  change %                                                                                                                                                                                                                                                  counters
        TakeChunkedChunkedStringFewMonotonicIndices/524288/2 854.187M items/sec 810.956M items/sec    -5.061        {'family_index': 3, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedStringFewMonotonicIndices/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1124, 'null_percent': 50.0, 'selection_factor': 0.05}
        TakeChunkedChunkedStringFewMonotonicIndices/524288/0 420.243M items/sec 393.347M items/sec    -6.400          {'family_index': 3, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedStringFewMonotonicIndices/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 561, 'null_percent': 0.0, 'selection_factor': 0.05}
  TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/2 751.346M items/sec 700.885M items/sec    -6.716   {'family_index': 2, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 982, 'null_percent': 50.0, 'selection_factor': 0.05}
     TakeChunkedChunkedStringFewMonotonicIndices/524288/1000 383.643M items/sec 353.206M items/sec    -7.934       {'family_index': 3, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedStringFewMonotonicIndices/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 511, 'null_percent': 0.1, 'selection_factor': 0.05}
  TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/0 308.601M items/sec 283.361M items/sec    -8.179    {'family_index': 2, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 411, 'null_percent': 0.0, 'selection_factor': 0.05}
 TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/10 327.708M items/sec 300.772M items/sec    -8.220  {'family_index': 2, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedStringFewRandomIndicesWithNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 440, 'null_percent': 10.0, 'selection_factor': 0.05}
       TakeChunkedChunkedStringFewMonotonicIndices/524288/10 452.054M items/sec 393.971M items/sec   -12.849        {'family_index': 3, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedStringFewMonotonicIndices/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 602, 'null_percent': 10.0, 'selection_factor': 0.05}
   TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/2   1.276G items/sec 509.634M items/sec   -60.060   {'family_index': 0, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1691, 'null_percent': 50.0, 'selection_factor': 0.05}
      TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/2   1.323G items/sec 517.354M items/sec   -60.894      {'family_index': 4, 'per_family_instance_index': 2, 'run_name': 'TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/2', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 1750, 'null_percent': 50.0, 'selection_factor': 0.05}
  TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/10   1.615G items/sec 549.705M items/sec   -65.960  {'family_index': 0, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2171, 'null_percent': 10.0, 'selection_factor': 0.05}
     TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/10   1.709G items/sec 553.717M items/sec   -67.595     {'family_index': 4, 'per_family_instance_index': 1, 'run_name': 'TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/10', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2274, 'null_percent': 10.0, 'selection_factor': 0.05}
TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/1000   2.026G items/sec 587.265M items/sec   -71.021 {'family_index': 0, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2670, 'null_percent': 0.1, 'selection_factor': 0.05}
   TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/1000   2.078G items/sec 588.468M items/sec   -71.680    {'family_index': 4, 'per_family_instance_index': 0, 'run_name': 'TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/1000', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 2771, 'null_percent': 0.1, 'selection_factor': 0.05}
   TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/0   3.429G items/sec 717.438M items/sec   -79.076    {'family_index': 0, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 4639, 'null_percent': 0.0, 'selection_factor': 0.05}
      TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/0   3.582G items/sec 721.296M items/sec   -79.861       {'family_index': 4, 'per_family_instance_index': 4, 'run_name': 'TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/0', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 4799, 'null_percent': 0.0, 'selection_factor': 0.05}
   TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/1   3.873G items/sec 759.915M items/sec   -80.381  {'family_index': 0, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedChunkedInt64FewRandomIndicesWithNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 5316, 'null_percent': 100.0, 'selection_factor': 0.05}
      TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/1   4.173G items/sec 761.551M items/sec   -81.751     {'family_index': 4, 'per_family_instance_index': 3, 'run_name': 'TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/1', 'repetitions': 1, 'repetition_index': 0, 'threads': 1, 'iterations': 5385, 'null_percent': 100.0, 'selection_factor': 0.05}

Putting aside the String perf changes which are minor and probably irrelevant, we can see that on monotonic indices, this PR actually increases performance, while still hurting it on random indices.

@felipecrv
Contributor Author

...while still hurting it on random indices.

The binary searches for chunk index are a costly part of the process that is avoided when value chunks are concatenated.

I have ideas to make the search interleaved so that multiple binary searches can run in parallel (i.e. likely sharing the same cache-line accesses to make progress).

@felipecrv
Contributor Author

felipecrv commented Oct 7, 2024

@pitrou wouldn't it make sense to keep the responsibility for concatenation to a layer above the kernels? Like a query optimizer? They are in a better position to make memory/time trade-offs than the context-less kernel.

The worst regression (-81%) has the kernel still at 4G items/sec.

TakeChunkedFlatInt64FewRandomIndicesWithNulls/524288/1   4.173G items/sec 761.551M items/sec   -81.751

I find it very inelegant to put these heuristics at the compute kernel level.

Imagine a pipeline trying to save on memory allocations by keeping the array chunked as much as possible, and then a simple filter operation suddenly requires allocating enough memory to hold everything contiguously.

Another case would be a pipeline where the caller is consolidating a big contiguous array for more operations than just array_take. They should be the ones concatenating.

@pitrou
Member

pitrou commented Oct 7, 2024

@pitrou wouldn't it make sense to keep the responsibility for concatenation to a layer above the kernels? Like a query optimizer? They are in a better position to make memory/time trade-offs than the context-less kernel.

Ideally perhaps. In practice this assumes that 1) there is a query optimizer 2) it has enough information about implementation details to make an informed decision.

In practice, take is probably often called directly in the context of PyArrow's sort methods, as in this snippet from arrow/python/pyarrow/table.pxi (lines 1139 to 1143 in e62fbaa):

    indices = _pc().sort_indices(
        self,
        options=_pc().SortOptions(sort_keys=[("", order)], **kwargs)
    )
    return self.take(indices)

The worst regression (-81%) has the kernel still at 4G items/sec.

I might be misreading, but this is the worst regression on the new benchmarks, right (those with a small selection factor)? On the benchmarks with a 1.0 selection factor (such as when sorting), the worst absolute results are around 25 Mitems/sec AFAICT. Or are those results obsolete?

Imagine a pipeline trying to save on memory allocations by keeping the array chunked as much as possible and then a simple filter operation requires allocating enough memory to keep it all in memory.

Well, currently "take" would always concatenate array chunks, so at least there is no regression in that regard.

Still, I understand the concern. We might want to expose an additional option to entirely disable concatenation when possible. But that might be overkill as well.

We will ensure "array_take" returns a ChunkedArray if at least one input is chunked, just like "take" does, even when the output fits in a single chunk.

…::exec_chunked

Before this commit, only the "take" meta function could handle ChunkedArray parameters. This is not a time-saver yet because in TakeCC kernels, every call to TakeCA will create a new ValuesSpan instance, but this will change in the next commits.