DruidMaster stops thinking that there's a server, but doesn't remove and stop the LoadQueuePeon #76
Comments
This code got reworked with the recent ZooKeeper changes, so I'm going to close this bug in the hope that it is resolved. If it happens again, we should open a new one.
Someone today reported a cluster where a node went on the fritz and, even after it was destroyed and its announcement znode removed, the LoadQueuePeon continued to persist and do things (namely, when we tried to remove entries from the loadQueue path, it kept creating new ones).
Looking at the code in DruidMaster, it's hard to imagine how this could happen. The symptom seen was that
log.info("Removing listener for server[%s] which is no longer there.", name);
did not run.
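For readers who don't have the code open, here is a minimal sketch of the cleanup behavior the report expects, not the actual DruidMaster code. It assumes the master keeps a map from server name to peon and periodically compares it against the set of servers that are still announced. `PeonCleanupSketch`, `FakePeon`, and `removeStalePeons` are hypothetical names made up for illustration; only the log message is taken from the issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class PeonCleanupSketch {
    /** Hypothetical stand-in for a LoadQueuePeon; it only records whether it was stopped. */
    static class FakePeon {
        volatile boolean stopped = false;

        void stop() {
            stopped = true;
        }
    }

    /**
     * Removes and stops every peon whose server is not in liveServers.
     * Returns the names of the servers whose peons were removed.
     */
    static List<String> removeStalePeons(Map<String, FakePeon> peons, Set<String> liveServers) {
        List<String> removed = new ArrayList<>();
        // Iterate over a copy of the keys so removal is safe for any Map implementation.
        for (String name : new HashSet<>(peons.keySet())) {
            if (!liveServers.contains(name)) {
                // Same message as the log line quoted in the issue.
                System.out.printf("Removing listener for server[%s] which is no longer there.%n", name);
                FakePeon peon = peons.remove(name);
                if (peon != null) {
                    // Stopping the peon is what should keep it from recreating loadQueue entries.
                    peon.stop();
                    removed.add(name);
                }
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Map<String, FakePeon> peons = new ConcurrentHashMap<>();
        peons.put("serverA", new FakePeon());
        peons.put("serverB", new FakePeon());

        // Assume only serverA is still announced.
        Set<String> liveServers = new HashSet<>(Arrays.asList("serverA"));
        List<String> removed = removeStalePeons(peons, liveServers);
        System.out.println("Removed peons: " + removed);
    }
}
```

In the reported cluster, the equivalent of the branch that logs and stops the peon apparently never ran for the dead server, which would be consistent with the peon continuing to recreate entries under the loadQueue path.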