Overlord throws Exception while gaining leadership and bails out #2886

Closed
nishantmonu51 opened this issue Apr 27, 2016 · 2 comments

nishantmonu51 commented Apr 27, 2016

The Overlord threw the following exception while taking leadership and then bailed out:

java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
        at com.google.common.base.Throwables.propagate(Throwables.java:160)
        at io.druid.indexing.overlord.TaskMaster$1.takeLeadership(TaskMaster.java:150)
        at org.apache.curator.framework.recipes.leader.LeaderSelector$WrappedListener.takeLeadership(LeaderSelector.java:536)
        at org.apache.curator.framework.recipes.leader.LeaderSelector.doWork(LeaderSelector.java:399)
        at org.apache.curator.framework.recipes.leader.LeaderSelector.doWorkLoop(LeaderSelector.java:443)
        at org.apache.curator.framework.recipes.leader.LeaderSelector.access$100(LeaderSelector.java:64)
        at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:245)
        at org.apache.curator.framework.recipes.leader.LeaderSelector$2.call(LeaderSelector.java:239)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at com.metamx.common.lifecycle.Lifecycle$AnnotationBasedHandler.start(Lifecycle.java:350)
        at com.metamx.common.lifecycle.Lifecycle.start(Lifecycle.java:259)
        at io.druid.indexing.overlord.TaskMaster$1.takeLeadership(TaskMaster.java:134)
        ... 12 more
Caused by: java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@21c56256 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@66a2d79d[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 2353]
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
        at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326)
        at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
        at com.metamx.common.concurrent.ScheduledExecutors.scheduleAtFixedRate(ScheduledExecutors.java:159)
        at com.metamx.common.concurrent.ScheduledExecutors.scheduleAtFixedRate(ScheduledExecutors.java:135)
        at com.metamx.common.concurrent.ScheduledExecutors.scheduleAtFixedRate(ScheduledExecutors.java:121)
        at io.druid.indexing.overlord.autoscaling.SimpleResourceManagementStrategy.startManagement(SimpleResourceManagementStrategy.java:276)
        at io.druid.indexing.overlord.autoscaling.SimpleResourceManagementStrategy.startManagement(SimpleResourceManagementStrategy.java:51)
        at io.druid.indexing.overlord.RemoteTaskRunner.start(RemoteTaskRunner.java:293)
        ... 19 more
nishantmonu51 added this to the 0.9.1 milestone Apr 27, 2016

nishantmonu51 commented

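The issue is in RemoteTaskRunnerFactory, which shares one executor between multiple RemoteTaskRunner (RTR) instances. The trace shows that executor already in the Terminated state when the new leader's RTR starts, which is consistent with an earlier RTR instance having shut it down.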
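
For anyone trying to picture the failure mode, here is a minimal standalone sketch. It does not use Druid code: `Runner` and `secondRunnerStarts` are hypothetical names made up for illustration, and the only assumption is that each runner shuts down the executor it was given when it stops. It compares handing every runner the same executor with giving each runner a fresh one.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class SharedExecutorDemo {

    /** Hypothetical stand-in for a runner that schedules periodic work on start. */
    static final class Runner {
        private final ScheduledExecutorService exec;

        Runner(ScheduledExecutorService exec) {
            this.exec = exec;
        }

        void start() {
            // Throws RejectedExecutionException if exec has already been shut down.
            exec.scheduleAtFixedRate(() -> { }, 0, 1, TimeUnit.SECONDS);
        }

        void stop() {
            // Assumption for this sketch: a runner shuts down its executor on stop.
            exec.shutdownNow();
        }
    }

    /**
     * Starts and stops one runner, then tries to start a second one.
     * Returns true if the second runner starts without being rejected.
     */
    static boolean secondRunnerStarts(Supplier<ScheduledExecutorService> executors) {
        Runner first = new Runner(executors.get());
        first.start();
        first.stop();

        Runner second = new Runner(executors.get());
        try {
            second.start();
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        } finally {
            second.stop();
        }
    }

    public static void main(String[] args) {
        // Same executor handed to every runner: the second start is rejected.
        ScheduledExecutorService shared = Executors.newScheduledThreadPool(1);
        System.out.println("shared executor:       " + secondRunnerStarts(() -> shared));

        // Fresh executor per runner: the second start succeeds.
        System.out.println("per-instance executor: "
                + secondRunnerStarts(() -> Executors.newScheduledThreadPool(1)));
    }
}
```

With the shared executor the second start is rejected and the method returns false. With one executor per runner it returns true. Whether the actual fix creates an executor per RTR or changes which component owns the executor's lifecycle is up to the referenced commit; the sketch only illustrates the first option.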

nishantmonu51 added a commit to metamx/druid that referenced this issue Apr 27, 2016
drcrallen pushed a commit that referenced this issue Apr 27, 2016

du00cs commented Dec 23, 2016

@nishantmonu51 I hit this on 0.9.1.1; a similar exception is thrown:

2016-12-23 04:15:10,090 ERROR i.d.i.o.RemoteTaskRunner [Curator-PathChildrenCache-6] Failed to handle new worker status: {class=io.druid.indexing.overlord.RemoteTaskRunner, exceptionType=class java.util.concurrent.RejectedExecutionException, exceptionMessage=Task java.util.concurrent.FutureTask@61ce9c92 rejected from java.util.concurrent.ThreadPoolExecutor@4b88b5ac[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 6], worker=c3-hadoop-druid01.bj:8091, znode=/druid/prod/indexer/status/c3-hadoop-druid01.bj:8091/index_realtime_profile_mifg_druid_event_2016-12-22T19:00:00.000Z_1_0}
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@61ce9c92 rejected from java.util.concurrent.ThreadPoolExecutor@4b88b5ac[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 6]
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047) ~[?:1.8.0_31]
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) ~[?:1.8.0_31]
        at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) ~[?:1.8.0_31]
        at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134) ~[?:1.8.0_31]
        at io.druid.indexing.overlord.RemoteTaskRunner.runPendingTasks(RemoteTaskRunner.java:597) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
        at io.druid.indexing.overlord.RemoteTaskRunner.access$2100(RemoteTaskRunner.java:120) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
        at io.druid.indexing.overlord.RemoteTaskRunner$6.childEvent(RemoteTaskRunner.java:942) [druid-indexing-service-0.9.1.1.jar:0.9.1.1]
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:522) [curator-recipes-2.10.0.jar:?]
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:516) [curator-recipes-2.10.0.jar:?]
        at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:93) [curator-framework-2.10.0.jar:?]
        at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)[guava-16.0.1.jar:?]
        at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:85) [curator-framework-2.10.0.jar:?]
        at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:514) [curator-recipes-2.10.0.jar:?]
        at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35) [curator-recipes-2.10.0.jar:?]
        at org.apache.curator.framework.recipes.cache.PathChildrenCache$9.run(PathChildrenCache.java:772) [curator-recipes-2.10.0.jar:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_31]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_31]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_31]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_31]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_31]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_31]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_31]

Any idea?
