Export query result to arrow: NodeID, Node, Rel, LIST data types #1209

Merged: 1 commit, Feb 2, 2023

Conversation

ray6080 commented Jan 27, 2023

No description provided.

ray6080 marked this pull request as ready for review January 29, 2023 17:07
ray6080 requested a review from mewim January 29, 2023 17:08

mewim left a comment

Overall LGTM.

Two general comments:

  1. In the Python class for the query result (def get_as_arrow(self, chunk_size):), we should add import pyarrow before calling the C++ function, so that if pyarrow is not installed, the user gets a ModuleNotFoundError early instead of having to deal with confusing exceptions from the C++ code.
  2. The function seems to throw an error for data types it cannot process. If we add a new type whose Arrow conversion has not been implemented yet, should we consider a best-effort conversion with a warning instead of throwing an error?
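
A minimal sketch of the first point, not the actual change in this PR. The QueryResult class, its native_result argument, and the ARROW_MODULE attribute are hypothetical stand-ins for the real Python wrapper and its C++ bindings; only the method name get_as_arrow and the pyarrow dependency come from the comment above.

```python
import importlib


def require_module(name):
    """Import `name`; ModuleNotFoundError propagates if it is not installed."""
    return importlib.import_module(name)


class QueryResult:
    """Hypothetical stand-in for the Python query result wrapper."""

    # Name of the optional dependency (kept as an attribute so it can be swapped in tests).
    ARROW_MODULE = "pyarrow"

    def __init__(self, native_result):
        # `native_result` stands in for the object backed by the C++ bindings.
        self._native_result = native_result

    def get_as_arrow(self, chunk_size):
        # Check the dependency first, so a missing package surfaces as a plain
        # ModuleNotFoundError here rather than as an error from the C++ side.
        require_module(self.ARROW_MODULE)
        return self._native_result.get_as_arrow(chunk_size)
```

With this ordering the native call is never reached when the import fails, so the user sees only the standard Python import error.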

Review threads:
src/common/arrow/arrow_row_batch.cpp (resolved)
src/common/arrow/arrow_row_batch.cpp (resolved)
src/common/arrow/arrow_row_batch.cpp (resolved)
src/common/arrow/arrow_row_batch.cpp (outdated, resolved)
src/main/query_result.cpp (outdated, resolved)

ray6080 commented Feb 2, 2023

Thanks! Added import checks for get_as_df and get_as_arrow. As for the exception, we've covered all the data types we have right now. For new data types, I think it's better to throw an exception than to try to convert an unknown type ourselves.
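
To make the trade-off in the second point concrete, here is a rough Python sketch of the two strategies. It is illustrative only: the real conversion lives in the C++ code, and the type names, the ARROW_FORMATS table, and both function names are hypothetical. The single-letter values are Arrow C data interface format strings (b for boolean, l for int64, g for float64, u for UTF-8 string).

```python
import warnings

# Hypothetical type names mapped to Arrow C data interface format strings.
ARROW_FORMATS = {"BOOL": "b", "INT64": "l", "DOUBLE": "g", "STRING": "u"}


def arrow_format_strict(type_name):
    """Strict strategy: refuse to export a type with no known conversion."""
    try:
        return ARROW_FORMATS[type_name]
    except KeyError:
        raise NotImplementedError(
            f"No Arrow conversion for data type {type_name}"
        ) from None


def arrow_format_best_effort(type_name):
    """Best-effort strategy: warn and fall back to a string column."""
    fmt = ARROW_FORMATS.get(type_name)
    if fmt is None:
        warnings.warn(
            f"No Arrow conversion for data type {type_name}; exporting as string"
        )
        return "u"
    return fmt
```

The strict variant fails loudly as soon as a new type is added without a conversion, while the best-effort variant keeps exports working at the cost of silently changing the column type.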

ray6080 merged commit 51fb6de into master Feb 2, 2023
ray6080 deleted the arrow branch February 2, 2023 00:59