[c++] Fix fast path for `SparseNDArray` `nnz` #3229

johnkerl · 2024-10-23T21:37:03Z

Issue and/or context: [sc-58277]

Changes:

Notes for Reviewer:

codecov · 2024-10-23T21:55:27Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.97%. Comparing base (29b2648) to head (6eaba1e).
Report is 1 commit behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3229      +/-   ##
==========================================
+ Coverage   83.86%   83.97%   +0.10%     
==========================================
  Files          51       51              
  Lines        5505     5505              
==========================================
+ Hits         4617     4623       +6     
+ Misses        888      882       -6

| Flag | Coverage Δ |
| --- | --- |
| python | `83.97% <ø> (+0.10%)` ⬆️ |


| Components | Coverage Δ |
| --- | --- |
| python_api | `83.97% <ø> (+0.10%)` ⬆️ |
| libtiledbsoma | `∅ <ø> (∅)` |

bkmartinjr

I understand why this change "fixes" the problem, but I'm uncomfortable with the design (in both the original PR and in this fix). It relies on a heuristic (dimension names), and disallows lots of cases where the fast path would work fine (e.g., other non-variable-sized dimensions).

Possible solutions to consider:

- Directly test the dimension to determine whether it is variable-sized, and fall back to the slow path only then (i.e., don't test the name or hard-wire to only int64).
- Or, given the most common case, simply wrap the nonempty-domain fetch in a try/catch, and call the slow path only when the fetch fails.

Either of these seems far less likely to bite us in the future, as neither is heuristic-based.
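The two options above can be sketched roughly as follows. This is a minimal illustration, not the libtiledbsoma implementation: `Dim`, `fetch_nonempty_domain`, `NnzPath`, and both `choose_*` functions are hypothetical names invented here, and the real code would query the TileDB dimension and non-empty domain rather than a plain struct.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for an array dimension (not a TileDB or SOMA type).
struct Dim {
    std::string name;
    bool var_sized;  // true for variable-sized (e.g. string) dimensions
    // Fixed-size non-empty domain, unset when it cannot be obtained.
    std::optional<std::pair<int64_t, int64_t>> nonempty_domain;
};

// Hypothetical fetch: throws when no fixed-size non-empty domain is available.
std::pair<int64_t, int64_t> fetch_nonempty_domain(const Dim& dim) {
    if (dim.var_sized || !dim.nonempty_domain) {
        throw std::runtime_error(
            "no fixed-size non-empty domain for " + dim.name);
    }
    return *dim.nonempty_domain;
}

enum class NnzPath { Fast, Slow };

// Option 1: decide from a property of the dimension itself,
// without looking at its name or requiring a specific integer type.
NnzPath choose_by_dimension_kind(const Dim& dim) {
    return dim.var_sized ? NnzPath::Slow : NnzPath::Fast;
}

// Option 2: attempt the fetch and fall back to the slow path only on failure.
NnzPath choose_by_try_catch(const Dim& dim) {
    try {
        const auto domain = fetch_nonempty_domain(dim);
        assert(domain.first <= domain.second);
        (void)domain;  // silence unused-variable warnings when NDEBUG is set
        return NnzPath::Fast;
    } catch (const std::exception&) {
        return NnzPath::Slow;
    }
}
```

In this sketch, a fixed-size dimension with an arbitrary name such as `custom_index` takes the fast path under either option, while a variable-sized dimension falls back to the slow path. Neither decision depends on the dimension's name.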

johnkerl · 2024-10-24T02:17:39Z

@bkmartinjr #3231 if you're willing to take this as a follow-on

> and disallows lots of cases where the fast path would work fine

You're correct

- The code before #2990 ([c++] Fix bug in nnz of variant-indexed dataframes) was broken and wrong in ways that I documented, and tested for, on #2990.
- The change I applied on #2990 was too strict: it disallowed the fast path in certain cases, including, importantly, the one fixed by this PR.
- But we can go further and allow more cases to take the fast path.

I do want to do that as separate PRs (this one, and the to-be-created one tracked on #3231) precisely because this PR "gets it right" for the most important cases, which are (a) default-indexed dataframes and (b) all ND arrays, always (these must have int64 dims). The follow-on PR will then enable us to take the fast path for non-default-indexed dataframes, and that will be nice.

johnkerl requested a review from bkmartinjr October 23, 2024 21:37

[c++] Fix fast path for SparseNDArray nnz

6eaba1e

johnkerl force-pushed the kerl/snda-fast-nnz branch from 94d358a to 6eaba1e Compare October 23, 2024 21:39

bkmartinjr requested changes Oct 24, 2024


johnkerl mentioned this pull request Oct 24, 2024

[c++] Extend fast path for nnz #3231

Open

johnkerl requested a review from bkmartinjr October 24, 2024 02:17

bkmartinjr approved these changes Oct 24, 2024


johnkerl merged commit 99ba3a3 into main Oct 24, 2024
15 checks passed

johnkerl deleted the kerl/snda-fast-nnz branch October 24, 2024 03:01
