-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider using RoaringBitmapWriter for bitmap construction #6764
Conversation
@richardstartin can you please fix
to have a clean build. Thanks |
@b-slim this is more of a working heads up (i.e. I think you can probably build your roaring bitmaps faster) rather than something I would expect you to merge. The fact that run compression can be applied incrementally so you don't need to do it when you serialise the bitmaps ripples out fairly quickly to "BitmapSerdeFactory" JSON formats, and I can't offer a view on how they should be kept backward compatible. If you update, please feel free to use this commit as a reference. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@richardstartin Thanks for taking interest enough to relay this information! Outside of benchmarks, I believe all of the bitmaps we construct are done in order, so if I understand you correctly there would be little benefit to making this change, and I suspect the main difference in the original benchmarks you collected was the performance improvements from the version bump, which I saw as well.
That said, if there is no harm either, I don't see any reason to not make the switch, and could be useful in the event we find ourselves in need of creating out of order bitmaps. But maybe we should repeat the benchmarks against the latest master to ensure there is no penalty for this change?
I am unsure what the best thing to do with regards to the compressRunOnSerialization
parameter, I can't imagine the resulting segment size is very viable without it, but yeah it's very hard and annoying to remove things like that.
@clintropolis the change optimises ordered insertion, because it avoids binary search on the high 16 bits of the bitmap. The change is useless for unordered insertions. |
Ah, I misread the PR description, 👍 |
@clintropolis You were right about where most of the performance came from. I cleaned up the commits a bit and ran against master. Here's the benchmark at
Here's a slight improvement on this branch at
PS this has been squashed and force pushed so take another look. |
In fact, here's a benchmark to isolate the construction time from the iteration time: @Benchmark
public Object construct(ConstructAndIterState state)
{
int dataSize = state.dataSize;
int[] data = state.data;
MutableBitmap mutableBitmap = factory.makeEmptyMutableBitmap();
for (int i = 0; i < dataSize; i++) {
mutableBitmap.add(data[i]);
}
return factory.makeImmutableBitmap(mutableBitmap);
} and at
And at
So there's a good boost to be had from avoiding the binary searches, and it sounds like druid does write its bitmaps in an ordered fashion. |
I noticed you are investigating using RoaringBitmap 0.7.30. If you do so, it is worth considering a different mechanism to build your bitmaps which favours ordered insertions by buffering them into a container and appending to the bitmap as late as possible (just before the bitmap is queried, or when the next multiple of 2^16 is crossed). There is no need to manually run optimise bitmaps built this way, because they are run optimised at the container level whenever it is appended to the bitmap.
I am not a druid user and am unlikely to become one soon, so this PR is intended as an FYI about the feature only.
I ran the
BitmapIterationBenchmark.constructAndIter
on JDK8 on Ubuntu 16.0.4 at7a09cde4de1953eee75c5033e863cfde8f94d6c1
and got:At
b313193c81ed868a9afe04c658306705f63daaef
I got:These quick results are quite noisy so require more careful consideration on your part. I am also unaware of how likely out of order insertions are in a typical druid workload, where there would be no penalty for using this abstraction, but there will be no benefit.