-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix k8s task runner failure reporting #15311
Conversation
indexing-service/src/main/java/org/apache/druid/indexing/common/actions/UpdateStatusAction.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all looks great to me, just had that little comment.
@@ -186,7 +187,7 @@ public final TaskStatus run(TaskToolbox taskToolbox) throws Exception | |||
public abstract TaskStatus runTask(TaskToolbox taskToolbox) throws Exception; | |||
|
|||
@Override | |||
public void cleanUp(TaskToolbox toolbox, TaskStatus taskStatus) throws Exception | |||
public void cleanUp(TaskToolbox toolbox, @Nullable TaskStatus taskStatus) throws Exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although no harm to annotate nullable here, I think taskStatus won't be possible null as it will be set to a value in all the code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - feel free to merge. I think the small change in toString will help in debugging so we don't see something like status=statusFull={...}
The rest of the comments are around making it easier to remove the status field from the UpdateStatusAction in the future - up to you if you think it's worth addressing.
You can also merge this patch, and address anything else in a future patch so you don't need to wait on CI again.
indexing-service/src/main/java/org/apache/druid/indexing/common/actions/UpdateStatusAction.java
Show resolved
Hide resolved
) | ||
{ | ||
this.status = status; | ||
this.statusFull = statusFull; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If status
is deprecated, and callers are expected to use statusFull
can we add validation in the constructor to make sure only 1 field is set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd rather not add the extra validation since its principally a json constructor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine.
My thinking was that in the perform method either status or statusFull is used. So if a caller calls this method with both set, it's likely a mistake, and they won't know about it unless they closely inspect the results of the perform method.
indexing-service/src/main/java/org/apache/druid/indexing/common/actions/UpdateStatusAction.java
Outdated
Show resolved
Hide resolved
indexing-service/src/main/java/org/apache/druid/indexing/common/task/AbstractTask.java
Show resolved
Hide resolved
indexing-service/src/main/java/org/apache/druid/indexing/common/actions/UpdateStatusAction.java
Show resolved
Hide resolved
@Test | ||
public void testEquals() | ||
{ | ||
UpdateStatusAction one = new UpdateStatusAction("", TaskStatus.failure(ID, "error")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EqualsVerifier.forClass(UpdateStatusAction.class).verify()
will verify all combinations of the underlying fields in the class and point out subtle issues in the equals / hash code logic - see other tests like testEquals
or testEqualsAndHashCode
for examples.
* Fix k8s task runner failure reporting * Fix reference * add jsonignore * PR changes
Fixes a couple bugs with some of the AbstractTask.run implementation that was added to support mmless ingestion..
Description
If a task in mmless ingestion fails during setup(), then cleanup is called with taskStatus=TaskStatus.running(getId()) because of a error in the logic. This can cause issues because then the overlord will think the task is running when in reality it has exited.
Separately, as a part of dc0b163, mmless ingestion will normally report the status to the overlord in the UpdateStatusAction api call, with the deep storage status.json being just a backup option. When this status is reported, the error message is dropped because UpdateStateAction only takes in a string instead of a TaskStatus object. To fix this, I added a TaskStatus field to the UpdateStatusAction (i left the string value in for backwards compatibility).
Release note
Fix some error reporting with mmless ingestion
Key changed/added classes in this PR
AbstractTask
UpdateStatusAction
This PR has: