Slow double dimension query when useDefaultValueForNull set to false #8171

Closed
WilliamWhispell opened this issue Jul 26, 2019 · 7 comments

@WilliamWhispell

For example, given a datasource called example with this dimensionsSpec:

"dimensionsSpec": {
  "dimensions": [
    "SYMBOL",
    {
      "type": "float",
      "name": "FIELD1"
    },
    {
      "type": "double",
      "name": "FIELD2"
    }
  ]
}

When querying this datasource, where the FIELD1 and FIELD2 columns contain the same data but FIELD2 is stored as a double, I find that:

SELECT SYMBOL, FIELD1 FROM example WHERE __time = TIMESTAMP '2019-07-26 00:00:00'

  • Takes about 200 ms

SELECT SYMBOL, FIELD2 FROM example WHERE __time = TIMESTAMP '2019-07-26 00:00:00'

  • Takes about 10 seconds

After setting druid.generic.useDefaultValueForNull=true, restarting Druid, and loading the data again into a new datasource, example2, I find that the double query is no longer slow and completes in about the same time as any other dimension query.

@WilliamWhispell
Author

For this test, my datasource contained about 70,000 rows, all in the same segment. Only data for 2019-07-26 was loaded. About 30,000 entries are null.

@WilliamWhispell
Author

It looks like all of the time is spent inside the isNull check of HistoricalDoubleColumnSelectorWithNulls, specifically in BitIterator.skipAllBefore, which is called from ImmutableConciseSet.contains. I'm not sure why the other, non-double fields don't show this performance issue.

@WilliamWhispell
Author

For now I've switched to using a roaring index, and all field types seem to perform well. So I believe this issue affects only some field types, such as double, when using the concise index type.
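
For anyone wanting to try the same workaround: in Druid the bitmap index type is normally chosen through the indexSpec inside the ingestion task's tuningConfig. The fragment below is a minimal sketch, not the exact spec used in this report. The "index_parallel" task type is an assumption (the thread does not say which ingestion method was used); substitute the type of your own task.

```json
{
  "tuningConfig": {
    "type": "index_parallel",
    "indexSpec": {
      "bitmap": { "type": "roaring" }
    }
  }
}
```

The indexSpec only applies to segments written by that task, so data already indexed with concise bitmaps would need to be reingested for the change to take effect.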

@nishantmonu51 nishantmonu51 self-assigned this Aug 1, 2019
@nishantmonu51
Member

@WilliamWhispell thanks for reporting this, will look into it.

@gianm
Contributor

gianm commented Aug 1, 2019

@nishantmonu51 Check out #5569, I think it is related.

@tracy4u

tracy4u commented Sep 24, 2019

Any progress on this issue? I also ran into this and do not have a workaround.

@gianm
Contributor

gianm commented Apr 24, 2023

Fixed by #8822.

@gianm gianm closed this as completed Apr 24, 2023