Add functions exposing conversion to Arrow ArrayData #1827

Merged
benjaminwinger merged 1 commit into master from arrow-data on Jul 27, 2023

benjaminwinger
Collaborator

Fixes #1804

I've added functions to the C and C++ APIs that convert query result data to the structures defined in the Arrow C Data Interface.
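To illustrate the ownership contract those structures carry, here is a minimal sketch in C. The `ArrowArray` layout is taken from the Arrow C Data Interface specification; `demo_make_int64_array` and `demo_sum_and_release` are hypothetical helpers that stand in for a producer (such as the new conversion function) and a consumer. They are not part of the kuzu API, and the sketch assumes a single non-nullable int64 column.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Struct layout as defined by the Arrow C Data Interface specification. */
struct ArrowArray {
    int64_t length;
    int64_t null_count;
    int64_t offset;
    int64_t n_buffers;
    int64_t n_children;
    const void** buffers;
    struct ArrowArray** children;
    struct ArrowArray* dictionary;
    void (*release)(struct ArrowArray*);
    void* private_data;
};

/* Hypothetical producer-side release callback: frees what the producer
 * allocated, then marks the struct as released by clearing `release`. */
static void demo_release(struct ArrowArray* array) {
    free((void*)array->buffers[1]); /* data buffer */
    free((void*)array->buffers);    /* buffer pointer table */
    array->release = NULL;
}

/* Hypothetical producer: copies `n` int64 values into a new ArrowArray.
 * On allocation failure it returns a struct whose `release` is NULL. */
static struct ArrowArray demo_make_int64_array(const int64_t* values, int64_t n) {
    struct ArrowArray out;
    memset(&out, 0, sizeof(out));
    int64_t* data = (int64_t*)malloc((size_t)(n > 0 ? n : 1) * sizeof(int64_t));
    const void** buffers = (const void**)malloc(2 * sizeof(const void*));
    if (data == NULL || buffers == NULL) {
        free(data);
        free((void*)buffers);
        return out;
    }
    if (n > 0) {
        memcpy(data, values, (size_t)n * sizeof(int64_t));
    }
    buffers[0] = NULL; /* validity bitmap may be NULL when null_count == 0 */
    buffers[1] = data; /* values buffer */
    out.length = n;
    out.n_buffers = 2;
    out.buffers = buffers;
    out.release = demo_release;
    return out;
}

/* Consumer: reads the values, then calls the release callback exactly once. */
static int64_t demo_sum_and_release(struct ArrowArray* array) {
    assert(array->release != NULL); /* must not already be released */
    const int64_t* data = (const int64_t*)array->buffers[1];
    int64_t sum = 0;
    for (int64_t i = 0; i < array->length; i++) {
        sum += data[array->offset + i];
    }
    array->release(array);
    return sum;
}
```

A caller of a real producer would follow the same pattern as `demo_sum_and_release`: read through `buffers`, then invoke `release` once and treat the struct as invalid afterwards.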

The Rust API takes this a step further and, like the Python API, converts the result to the arrow::array::ArrayData type. ArrayData is a little difficult to use by itself, but converting it to more ergonomic types requires a lot of copying and would probably perform poorly, so I thought it best to return ArrayData and let users decide whether to use it directly or convert it.
The Arrow conversion is gated behind an arrow feature so that the large arrow crate is not pulled in by default.

The Rust tests will probably fail with the arrow feature enabled. When I tested locally, the latest version of one of the arrow-rs dependencies required Rust 1.70, which is very recent; I had to manually force an earlier version of that dependency to get it to build on Rust 1.69.


codecov bot commented Jul 18, 2023

Codecov Report

Patch coverage: 10.00% and project coverage change: -0.02% ⚠️

Comparison is base (c1896a3) 91.30% compared to head (4c51235) 91.28%.

❗ Current head 4c51235 differs from pull request most recent head 2e5682f. Consider uploading reports for the commit 2e5682f to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1827      +/-   ##
==========================================
- Coverage   91.30%   91.28%   -0.02%     
==========================================
  Files         799      794       -5     
  Lines       28996    28828     -168     
==========================================
- Hits        26475    26316     -159     
+ Misses       2521     2512       -9     
| Files Changed | Coverage Δ |
|---|---|
| src/c_api/query_result.cpp | 85.45% <0.00%> (-6.71%) ⬇️ |
| src/include/main/query_result.h | 100.00% <ø> (ø) |
| src/main/query_result.cpp | 65.41% <16.66%> (-2.56%) ⬇️ |

... and 114 files with indirect coverage changes


@benjaminwinger benjaminwinger force-pushed the arrow-data branch 8 times, most recently from a2d3e8f to b004523 Compare July 20, 2023 21:16
// Could use directly, except that we can't (yet) mark ArrowSchema as being safe to store in a
// cxx::UniquePtr
return *result.getArrowSchema();
}
Contributor

Add an empty line between these two functions.

*
* It is the caller's responsibility to call the release function to release the underlying data
*/
struct ArrowArray kuzu_query_result_to_arrow_array(kuzu_query_result* query_result);
Contributor

Hmm, should this be kuzu_query_result_get_next_arrow_chunk?

@benjaminwinger benjaminwinger force-pushed the arrow-data branch 2 times, most recently from 4c51235 to 5d9650b Compare July 21, 2023 14:38
@benjaminwinger benjaminwinger force-pushed the arrow-data branch 2 times, most recently from 76220d9 to 6f8e8ea Compare July 25, 2023 15:10
@benjaminwinger benjaminwinger merged commit ed09f19 into master Jul 27, 2023
8 checks passed
@benjaminwinger benjaminwinger deleted the arrow-data branch July 27, 2023 14:26
Development

Successfully merging this pull request may close these issues.

Support serialization of query results to in-memory arrow data types