Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implment get_as_torch_geometric #1200

Merged
merged 1 commit into from
Jan 29, 2023
Merged

Implment get_as_torch_geometric #1200

merged 1 commit into from
Jan 29, 2023

Conversation

mewim
Copy link
Collaborator

@mewim mewim commented Jan 24, 2023

This PR implements a function get_as_torch_geometric() which converts the query result into a torch_geometric.data object.

Principles:

  • Only full NODE and RELs are converted. Selected properties, such as a.isWorker is ignored.
  • Only converts INT64 and BOOL types. Other types are ignored. Multi-dimensional lists are supported as long as the dimension is consistent and there is no sub-list of different length.
  • Edge attributes and labels are ignored for now.
  • Best effort: it will try to convert every attribute that can be converted and ignore those that cannot be converted to torch.Tensor.
  • No throw: it only shows warnings for incompatible types but never throws an error.

@mewim mewim marked this pull request as ready for review January 25, 2023 03:43
@mewim mewim force-pushed the get-as-pyg branch 4 times, most recently from 252e441 to d086ece Compare January 25, 2023 03:55
@mewim mewim requested a review from ray6080 January 25, 2023 03:55
@mewim mewim force-pushed the get-as-pyg branch 6 times, most recently from 800e65a to a4b6c62 Compare January 25, 2023 21:45
tools/python_api/src_py/query_result.py Outdated Show resolved Hide resolved
tools/python_api/src_py/query_result.py Outdated Show resolved Hide resolved
LIST = "LIST"
NODE = "NODE"
REL = "REL"
NODE_ID = "NODE_ID"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update NODE_ID once Ziyi's PR is in.


PRIMARY_KEY_SYMBOL = "(PRIMARY KEY)"
LIST_SYMBOL = "[]"
result_str = self.query_result.connection._connection.get_node_property_names(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is beyond this PR, but let me document it here:

The parsing here makes me think that our cpp interface string Connection::getNodePropertyNames(const string& tableName) is incorrect. Instead of returning a string, it's better to return vector<string>, each inside which is a property name. The caller (shell) shall be responsible to organize them in a printable way.

continue

# Ignore primary key
if node_property_names[prop_name]["is_primary_key"]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not put primary_key prop_name also into the ignored properties?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignored property is not read at all, but for primary key, it still needs to be read as we need to maintain the pos_to_primary_key_dict. I have changed the comment to Read primary key but do not add it to the node properties.

from enum import Enum


class Type(Enum):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way that we don't need to maintain a separate mapping in Python?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in #1189, pybind11 has a memory leakage bug when binding enum class. We can change this Enum to binding after the bug is fixed.

tools/python_api/src_py/query_result.py Outdated Show resolved Hide resolved
@mewim mewim merged commit 3981257 into master Jan 29, 2023
@mewim mewim deleted the get-as-pyg branch January 29, 2023 17:25
@ray6080 ray6080 added the feature New features or missing components of existing features label Jan 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New features or missing components of existing features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants