Fix distinct hash table resizing #3348

Merged 2 commits on Apr 23, 2024

Conversation

acquamarin
Collaborator

@acquamarin acquamarin commented Apr 23, 2024

Closes #3339
This PR solves a distinct hash table issue:
When we append values to the distinct hash table, we forgot to resize the hash table even when it is full, which causes an infinite loop.
This PR fixes the issue by resizing the distinct hash table, when needed, on each insertion.
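
To illustrate the failure mode, here is a minimal sketch, assuming an open-addressing table with linear probing. It is not Kuzu's actual implementation: `DistinctSet`, its members, and the 0.5 load-factor threshold are hypothetical and chosen only for illustration. The point is the capacity check at the top of `insert`. Without it, probing a completely full table for a key that is not present never reaches an empty slot and loops forever.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical linear-probing set of distinct 64-bit keys (illustration only).
class DistinctSet {
public:
    explicit DistinctSet(std::size_t capacity = 4)
        : slots_(capacity < 2 ? 2 : capacity) {}

    // Returns true if the key was newly inserted, false if already present.
    bool insert(std::uint64_t key) {
        // Grow before probing so that an empty slot always exists.
        // Assumed policy: keep the table at most half full.
        if ((size_ + 1) * 2 > slots_.size()) {
            resize(slots_.size() * 2);
        }
        return insertNoResize(key);
    }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return slots_.size(); }

private:
    // Linear probing; terminates only because insert() guarantees free slots.
    bool insertNoResize(std::uint64_t key) {
        std::size_t idx = std::hash<std::uint64_t>{}(key) % slots_.size();
        while (slots_[idx].has_value()) {
            if (*slots_[idx] == key) {
                return false;  // duplicate: distinct semantics, nothing to add
            }
            idx = (idx + 1) % slots_.size();
        }
        slots_[idx] = key;
        ++size_;
        return true;
    }

    // Allocate a larger slot array and re-insert every existing key.
    void resize(std::size_t newCapacity) {
        std::vector<std::optional<std::uint64_t>> old = std::move(slots_);
        slots_.assign(newCapacity, std::nullopt);
        size_ = 0;
        for (const auto& slot : old) {
            if (slot.has_value()) {
                insertNoResize(*slot);
            }
        }
    }

    std::vector<std::optional<std::uint64_t>> slots_;
    std::size_t size_ = 0;
};
```

In this sketch, inserting more keys than the initial capacity simply triggers repeated doubling, and `size()` afterwards equals the number of distinct keys seen.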

@ted-wq-x
Contributor

This change does indeed solve the infinite loop issue, but the current hash slots have very poor performance.

@acquamarin
Collaborator Author

This change does indeed solve the infinite loop issue, but the current hash slots have very poor performance.

Since the query has the distinct keyword in its aggregate keys (count(distinct a)), we can only run the query in single-threaded mode. That's why you were seeing slow speed.

@ted-wq-x
Contributor

I understand that distinct only runs in single-threaded mode, but the same query in another graph DB needs only 2.8s (also single-threaded), whereas in kuzu the execution didn't finish within 5 min.

@ray6080
Contributor

ray6080 commented Apr 23, 2024

I understand that distinct only runs in single-threaded mode, but the same query in another graph DB needs only 2.8s (also single-threaded), whereas in kuzu the execution didn't finish within 5 min.

Looks like we might have a performance bug related to this, then. Will look into it. Thanks!

@acquamarin
Collaborator Author

acquamarin commented Apr 23, 2024

#3339

I opened a new issue about the performance of distinct hash aggregation: #3349

@acquamarin acquamarin merged commit 15a648f into master Apr 23, 2024
17 checks passed
@acquamarin acquamarin deleted the fix-hash-agg-infinite-loop branch April 23, 2024 03:28
Successfully merging this pull request may close these issues.

Query like match (a:Person) return count(distinct a) has an infinite loop problem
4 participants