-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
appenderator plumber #2913
appenderator plumber #2913
Conversation
This is a very useful PR and required to pave the way for some larger scale changes so we should try to get this in. |
Some context: This is meant to be an adapter from an Appenderator (#2220) into the Plumber interface. Right now Appenderators are only used by the KafkaIndexTask. @javasoze has been working on a custom Appenderator as an extension and wanted to use it with realtime nodes too, which this would help with. The motivation for the custom Appenderator is that @javasoze is wanting to create a new segment type based on Lucene indexes rather than the traditional Druid columnar format with smooshes and all that. The hope is that with some changes in Druid we could support more than one kind of segment format, where "base" queries (like topN, timeseries, groupBy) work on all formats, but maybe additional extension queries only work on specific formats, or maybe certain features (like certain filters) would be implemented in more optimal ways on certain formats. The first step towards experimenting with that is making it possible to customize segment creation via implementing a custom Appenderator, which this PR helps with by making that interface usable in realtime nodes. If that works out then we'd also want to make it usable in batch ingestion and in realtime tasks. |
@xvrl, definitely, although we have not yet figured out the details of a proposal so there is nothing concrete to share yet. Hoping to have something there soon-ish. The goal though is to make it possible to store a Druid dataSource using some engine other than Druid's traditional storage engine (like Lucene). Probably in addition to this PR, a sketch of a fuller proposal is,
I think these changes could be done in a pretty non-invasive way and then open up some cool possibilities for storage engine extensions. Hopefully this PR can be reviewed before a fuller proposal is ready. |
is there any benefit of getting this merged before we flesh out the proposal? |
merged, I'm not sure (@javasoze?), but looked at – that always helps :) |
Thanks @gianm for providing the context. @xvrl I guess I had hoped the AppenderatorPlumber would be a general purpose extension point where people can do interesting things with Druid. I used @gianm's branch and was able to plug-in full-text support through Lucene via this extension point and thought it would be interesting to explore further. I am not sure if this PR should be treated as a simple extension point or part of a larger roadmap. Personally I am a fan of small incremental changes that make sense. So I would like to get an opinion from the community to see if this extension point makes sense in general without making it a dependency of a larger discussion/roadmap. Hope this makes sense. Thanks! |
Haven't heard anymore comments on this. @xvrl Any objections in merging this? |
@javasoze we are trying to release 0.9.1 right now so all other PRs that are non-urgent will require a few more days before we can get to it |
|
||
final long startDelay = runExecStopwatch.elapsed(TimeUnit.MILLISECONDS); | ||
metrics.incrementPersistBackPressureMillis(startDelay); | ||
if (startDelay > WARN_DELAY) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a minor comment, but in the event that there are a lot of persists occurring, we'll get spammed with these messages. But I think when so many persists are occurring, we'll have other problems
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@javasoze yeah I mean the log.warn line below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 |
@fjy Do you mean the log.warn()? |
@himanshug can you help review this? |
@javasoze how did you use this PR to create lucene segment? i imaging you would have written an Appenderator impl based on lucene, but, i see that RealtimeIndexTask would still be using RealtimePlumberSchool even after this PR . |
@himanshug Yes, I implemented a Lucene based Appenderator. For this PR, we don't expect everything to be migrated to AppenderatorPlumber, only an extension point. As for longer term plans for RealtimePlumberSchool, it is out of the scope of this PR. |
👍 note to users: custom Appenderators can only be utilized via Realtime nodes currently. |
This is primarily from Gian's work here:
https://github.com/gianm/druid/tree/appenderator-skunkworks
I moved the AppenderatorPlumber stuff to io.druid.segment.realtime.appenderator package from io.druid.segment.realtime.skunkworks package.
Added a test for AppenderatorPlumber.