Slow double dimension query when useDefaultValueForNull set to false #8171
Comments
For this test, my datasource contained about 70,000 rows, all in the same segment. Only data for 2019-07-26 was loaded. About 30,000 entries are null.
It looks like all of the time is spent inside the isNull check of HistoricalDoubleColumnSelectorWithNulls, specifically in BitIterator.skipAllBefore, which is called from ImmutableConciseSet.contains. I'm not sure why the other non-double fields don't show this performance issue.
For now I've switched to using a roaring index, and all field types seem to perform well. So I believe this issue affects only some field types, like double, when using the concise index type.
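For anyone who wants to try the same workaround, here is a minimal sketch of selecting the roaring bitmap type through the `indexSpec` of an ingestion spec's `tuningConfig`. The `"type": "index"` value is an assumption (native batch ingestion); the thread does not say which ingestion method was used, so adjust it to match your own spec.

```json
{
  "tuningConfig": {
    "type": "index",
    "indexSpec": {
      "bitmap": {
        "type": "roaring"
      }
    }
  }
}
```

The bitmap type is written into segments at indexing time, so segments that were already built with concise would most likely need to be re-ingested for the change to take effect.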
@WilliamWhispell thanks for reporting this, will look into it.
@nishantmonu51 Check out #5569, I think it is related.
Any progress on this issue? I also ran into this and do not have a workaround.
Fixed by #8822.
For example, for a datasource called example, if I have:
"dimensionsSpec": {
  "dimensions": [
    "SYMBOL",
    {
      "type": "float",
      "name": "FIELD1"
    },
    {
      "type": "double",
      "name": "FIELD2"
    }
  ]
}
Then, querying this datasource, where the FIELD1 and FIELD2 columns contain the same data but FIELD2 is stored as a double, I find that:
SELECT SYMBOL, FIELD1 FROM example WHERE __time = TIMESTAMP '2019-07-26 00:00:00'
SELECT SYMBOL, FIELD2 FROM example WHERE __time = TIMESTAMP '2019-07-26 00:00:00'
After changing druid.generic.useDefaultValueForNull=true, restarting Druid, and loading the data again into a new datasource, example2, I find that the double query is no longer slow and completes in about the same time as any other dimension query.