-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes and tests related to the Indexer process. #10631
Conversation
9e8698a
to
a549f5b
Compare
Three bugs fixed: 1) Indexers would not announce themselves as segment servers if they did not have storage locations defined. This used to work, but was broken in apache#9971. Fixed this by adding an "isSegmentServer" method to ServerType and updating SegmentLoadDropHandler to always announce if this method returns true. 2) Certain batch task types were written in a way that assumed "isReady" would be called before "run", which is not guaranteed. In particular, they relied on it in order to initialize "taskLockHelper". Fixed this by updating AbstractBatchIndexTask to ensure "isReady" is called before "run" for these tasks. 3) UnifiedIndexerAppenderatorsManager did not properly handle complex datasources. Introduced DataSourceAnalysis in order to fix this. Test changes: 1) Add a new "docker-compose.cli-indexer.yml" config that spins up an Indexer instead of a MiddleManager. 2) Introduce a "USE_INDEXER" environment variable that determines if docker-compose will start up an Indexer or a MiddleManager. 3) Duplicate all the jdk8 tests and run them in both MiddleManager and Indexer mode. 4) Various adjustments to encourage fail-fast errors in the Docker build scripts. 5) Various adjustments to speed up integration tests and reduce memory usage. 6) Add another Mac-specific approach to determining a machine's own IP. This was useful on my development machine. 7) Update segment-count check in ITCompactionTaskTest to eliminate a race condition (it was looking for 6 segments, which only exist together briefly, until the older 4 are marked unused). Javadoc updates: 1) AbstractBatchIndexTask: Added javadocs to determineLockGranularityXXX that make it clear when taskLockHelper will be initialized as a side effect. (Related to the second bug above.) 2) Task: Clarified that "isReady" is not guaranteed to be called before "run". It was already implied, but now it's explicit. 3) ZkCoordinator: Clarified deprecation message. 4) DataSegmentServerAnnouncer: Clarified deprecation message.
a549f5b
to
7ae9e3c
Compare
* This method will not necessarily be executed before {@link #run(TaskToolbox)}, since this task readiness check | ||
* may be done on a different machine from the one that actually runs the task. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, how can this happen? I thought Task.isReady()
is always called before Task.run()
because a task can be scheduled only when Task.isReady()
returns true. Task.run()
will be called after the task is scheduled in some indexer or middleManager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It won't necessarily be called on that same Task object (it might be called on a different instance that represents the same actual task).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. I think it happens only in indexers because peon calls isReady()
before it runs its task.
druid_coordinator_period_indexingPeriod=PT180000S | ||
druid_coordinator_period=PT1S |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this too aggressive? 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha ha 🙂
I think it's OK — there are going to be very few segments in the integration tests, so a coordinator run should be able to finish very quickly. This change was meant to speed up the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 after CI
* This method will not necessarily be executed before {@link #run(TaskToolbox)}, since this task readiness check | ||
* may be done on a different machine from the one that actually runs the task. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see. I think it happens only in indexers because peon calls isReady()
before it runs its task.
One test left. Trying to figure out what the problem is with "Kafka index integration test with various formats". Strangely, the Indexer version works fine, but the MM version is busted. |
I ran it locally and it passed. It might be flaky; I'll try running it again in CI. |
* Fixes and tests related to the Indexer process. Three bugs fixed: 1) Indexers would not announce themselves as segment servers if they did not have storage locations defined. This used to work, but was broken in apache#9971. Fixed this by adding an "isSegmentServer" method to ServerType and updating SegmentLoadDropHandler to always announce if this method returns true. 2) Certain batch task types were written in a way that assumed "isReady" would be called before "run", which is not guaranteed. In particular, they relied on it in order to initialize "taskLockHelper". Fixed this by updating AbstractBatchIndexTask to ensure "isReady" is called before "run" for these tasks. 3) UnifiedIndexerAppenderatorsManager did not properly handle complex datasources. Introduced DataSourceAnalysis in order to fix this. Test changes: 1) Add a new "docker-compose.cli-indexer.yml" config that spins up an Indexer instead of a MiddleManager. 2) Introduce a "USE_INDEXER" environment variable that determines if docker-compose will start up an Indexer or a MiddleManager. 3) Duplicate all the jdk8 tests and run them in both MiddleManager and Indexer mode. 4) Various adjustments to encourage fail-fast errors in the Docker build scripts. 5) Various adjustments to speed up integration tests and reduce memory usage. 6) Add another Mac-specific approach to determining a machine's own IP. This was useful on my development machine. 7) Update segment-count check in ITCompactionTaskTest to eliminate a race condition (it was looking for 6 segments, which only exist together briefly, until the older 4 are marked unused). Javadoc updates: 1) AbstractBatchIndexTask: Added javadocs to determineLockGranularityXXX that make it clear when taskLockHelper will be initialized as a side effect. (Related to the second bug above.) 2) Task: Clarified that "isReady" is not guaranteed to be called before "run". It was already implied, but now it's explicit. 3) ZkCoordinator: Clarified deprecation message. 4) DataSegmentServerAnnouncer: Clarified deprecation message. * Fix stop_cluster script. * Fix sanity check in script. * Fix hashbang lines. * Test and doc adjustments. * Additional tests, and adjustments for tests. * Split ITs back out. * Revert change to druid_coordinator_period_indexingPeriod. * Set Indexer capacity to match MM. * Bump up Historical memory. * Bump down coordinator, overlord memory. * Bump up Broker memory.
Three bugs fixed:
Indexers would not announce themselves as segment servers if they
did not have storage locations defined. This used to work, but was
broken in Load broadcast datasources on broker and tasks #9971. Fixed this by adding an "isSegmentServer" method
to ServerType and updating SegmentLoadDropHandler to always announce
if this method returns true.
Certain batch task types were written in a way that assumed "isReady"
would be called before "run", which is not guaranteed. In particular,
they relied on it in order to initialize "taskLockHelper". Fixed this
by updating AbstractBatchIndexTask to ensure "isReady" is called
before "run" for these tasks.
UnifiedIndexerAppenderatorsManager did not properly handle join
datasources. Introduced DataSourceAnalysis in order to fix this.
Test changes:
Add a new "docker-compose.cli-indexer.yml" config that spins up an
Indexer instead of a MiddleManager.
Introduce a "USE_INDEXER" environment variable that determines if
docker-compose will start up an Indexer or a MiddleManager.
Duplicate all the jdk8 tests and run them in both MiddleManager and
Indexer mode.
Various adjustments to encourage fail-fast errors in the Docker
build scripts.
Various adjustments to speed up integration tests and reduce memory
usage.
Add another Mac-specific approach to determining a machine's own IP.
This was useful on my development machine.
Update segment-count check in ITCompactionTaskTest to eliminate a
race condition (it was looking for 6 segments, which only exist
together briefly, until the older 4 are marked unused).
Javadoc updates:
AbstractBatchIndexTask: Added javadocs to determineLockGranularityXXX
that make it clear when taskLockHelper will be initialized as a side
effect. (Related to the second bug above.)
Task: Clarified that "isReady" is not guaranteed to be called before
"run". It was already implied, but now it's explicit.
ZkCoordinator: Clarified deprecation message.
DataSegmentServerAnnouncer: Clarified deprecation message.