Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add int16 int32 support to get_as_arrow. #1483

Merged
merged 1 commit into from
Apr 24, 2023
Merged

Conversation

wenhoujx
Copy link
Contributor

fixes #1482

add int16 and int32 support according to: https://arrow.apache.org/docs/format/CDataInterface.html#data-type-description-format-strings

please let me know where is the best place to add a test for this. (I haven't touched C++ for years ... )

I have read and agree to the terms under CLA.md.

Copy link
Contributor

@ray6080 ray6080 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @wenhoujx , thanks for the PR!
For your question on test, we test arrow conversion in our python module, see test_arrow.py::_test_primitive_data_types under tools/python_api/test.
I would add two int32 and int16 fields to the existing query, basically modify the query to the following (to_int32(a.age), to_int16(a.age)):

MATCH (a:person) RETURN a.age, to_int32(a.age), to_int16(a.age), a.isStudent, a.eyeSight, a.birthdate, a.registerTime, a.lastJobDuration, a.fName ORDER BY a.ID;

One comment on the changes:
Besides changes in arrow_converter.cpp, can you also take a look at src/common/arrow/arrow_row_batch.cpp? There are several switch cases are currently missing for INT16 and INT32 to make them work.
See ArrowRowBatch::createVector, ArrowRowBatch::copyNonNullValue, ArrowRowBatch::copyNullValue, ArrowRowBatch::convertVectorToArray.
If you make changes to these, things should work fine.

It's quite annoying now that many functions need to be touched for a new data type, so I probably need to do a refactoring here after this. Again, thanks for you interest, and hope to see your updates soon!

@codecov
Copy link

codecov bot commented Apr 22, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +92.08 🎉

Comparison is base (3e04497) 0.00% compared to head (7408e23) 92.08%.

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #1483       +/-   ##
===========================================
+ Coverage        0   92.08%   +92.08%     
===========================================
  Files           0      671      +671     
  Lines           0    23864    +23864     
===========================================
+ Hits            0    21974    +21974     
- Misses          0     1890     +1890     
Impacted Files Coverage Δ
src/common/arrow/arrow_converter.cpp 63.55% <100.00%> (ø)
src/common/arrow/arrow_row_batch.cpp 58.16% <100.00%> (ø)

... and 669 files with indirect coverage changes

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@wenhoujx
Copy link
Contributor Author

@ray6080 thanks for the pointers, i updated the code and tests. Let me know if i need to further update. THanks !

@ray6080
Copy link
Contributor

ray6080 commented Apr 24, 2023

@ray6080 thanks for the pointers, i updated the code and tests. Let me know if i need to further update. THanks !

Thanks! Looks good to me. Before merging the PR, can you manually squash your commits into one? We follow the principle of one commit for one pr, so it looks better on master commit history.
We always do git rebase to squash commits in your branch into a single one first, then sync local master with the latest remote (if necessary), and git rebase master, finally force push the branch.

@wenhoujx
Copy link
Contributor Author

wenhoujx commented Apr 24, 2023

great, i like squashing to one commit too.

done merging into one commit.

have you considered turn on github merge & squash option? https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/configuring-pull-request-merges/configuring-commit-squashing-for-pull-requests

@ray6080
Copy link
Contributor

ray6080 commented Apr 24, 2023

have you considered turn on github merge & squash option? https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/configuring-pull-request-merges/configuring-commit-squashing-for-pull-requests

Good point, I think this will squash all commits of a pr into a single one, while we currently keep two for each. Though it doesn't matter, it's just that we didn't follow this way from day 1 😅

@ray6080 ray6080 changed the title add int16 int32 support to get_as_arrow. Add int16 int32 support to get_as_arrow. Apr 24, 2023
@ray6080 ray6080 merged commit 09106ce into kuzudb:master Apr 24, 2023
@wenhoujx
Copy link
Contributor Author

y, that doesn't matter, thanks for reviewing my PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ge_as_arrow raise exception when use int32 or int16
2 participants