-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Task genrator debug api #9058
Task genrator debug api #9058
Conversation
Overall, I am a plus 1 on more minion debuggability. My view is that we should add APIs to minions similar to that of servers, and pull debugging info from there and display it. |
We already have minion APIs (look for /debug endpoint in Tasks tab). This is for scheduler errors, and the scope of scheduler remains limited to the controller, hence thought it makes most sense as an API here |
} catch (Exception e) { | ||
for (TableConfig tableConfig : enabledTableConfigs) { | ||
try { | ||
StringWriter errors = new StringWriter(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems we will do the same thing for each table config, and then put the same stacktrace for every table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
@@ -37,7 +37,13 @@ public FakePropertyStore() { | |||
|
|||
@Override | |||
public ZNRecord get(String path, Stat stat, int options) { | |||
return _contents.get(path); | |||
ZNRecord znRecord = _contents.get(path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this change needed?
/** | ||
* Returns the current information version, it should be consistent with the corresponding {@link ZNRecord} version. | ||
*/ | ||
public int getVersion() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prolly don't need this method either
_tableNameWithType = tableNameWithType; | ||
_taskType = taskType; | ||
// sort and keep the most recent several error messages | ||
_mostRecentErrorRunMessage = new TreeMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems this constructor is only ever called from within this same class, with empty list and map. Do we need all this sorting logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If at all we move to persistent storage, we might need to serialize / deserialize instances of this class using an allArgsConstructor
6420c75
to
fd0e6bd
Compare
Codecov Report
@@ Coverage Diff @@
## master #9058 +/- ##
============================================
- Coverage 70.06% 68.74% -1.32%
+ Complexity 4743 4656 -87
============================================
Files 1832 1838 +6
Lines 96971 97224 +253
Branches 14620 14640 +20
============================================
- Hits 67939 66840 -1099
- Misses 24305 25680 +1375
+ Partials 4727 4704 -23
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. |
|
||
|
||
/** | ||
* Licensed to the Apache Software Foundation (ASF) under one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
format: move this header to top
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
I don't see a minion debug REST endpoint in the code. Am I missing something? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@saurabhd336 thanks a lot for your contribution. It has been hard to debug minions, so I totally understand your pain point. However, I have a few general questions:
- What is the use case where you think this will help? (we have some use cases where we saw problems, and instead of introducing one new API per use case, I am looking to make sure we do something that works for more than one. @sajjad-moradi can review from our standpoint).
- I know @npawar referred to a debug API on the minion, but I could not locate one. Please correct me if I am wrong. I think we should issue an API to the minions as well (liek we do with servers) and get the latest status form them and display it in the controller.
@ApiParam(value = "Table name with type", required = true) @PathParam("tableNameWithType") | ||
String tableNameWithType) { | ||
BaseTaskGeneratorInfo | ||
taskGeneratorMostRecentRunInfo = _taskManagerStatusCache.fetchTaskGeneratorInfo(tableNameWithType, taskType); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the table is not mastered on this controller?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@saurabhd336 thanks a lot for your contribution. It has been hard to debug minions, so I totally understand your pain point. However, I have a few general questions:
- What is the use case where you think this will help? (we have some use cases where we saw problems, and instead of introducing one new API per use case, I am looking to make sure we do something that works for more than one. @sajjad-moradi can review from our standpoint).
- I know @npawar referred to a debug API on the minion, but I could not locate one. Please correct me if I am wrong. I think we should issue an API to the minions as well (liek we do with servers) and get the latest status form them and display it in the controller.
Hey Subbu, this is the one for Minion tasks: /tasks/task/{taskName}/debug
. In the list of subtasks returned, there's an "info" field which contains the exception (if any) encountered by this task
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah good catch about the lead controller! this won't work as is, we'll have to think of how to handle that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for 2. I feel it's not very convenient when one auto scales minion workers frequently based on pending tasks. The debug API would be either not reachable when no workers are around or get no past task states to check when new workers are added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we have helix APIs for that, right? Has that avenue been fully explored? We are migrating to Helix 1.x) very soon (we are getting to production with that), so please check in 1.x pages what is available for task management and debugging, thanks. We will also do the same.
cc: @sajjad-moradi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two parts:
- generating task configurations which happens in pinot code
- submitting the task via TaskDriver which is in Helix code
Subbu has a valid point that helix should provide some info on failure on the 2nd part - task submission. The first part though is before reaching helix and it's good if we have an API to capture issues in task generation in pinot code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also scheduling tasks can be done by either:
- periodic task generator job on lead controller, or
- task REST endpoint which can be on any controller
So this means the approach in this PR doesn't work as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. The status can indeed be stored across different controllers. I was trying the following two approaches
Upon hitting this API,
- The controller makes a HTTP call to every other controller to get their respective in-memory debug statuses, waits for the responses and finally responds back with a list of all the responses.
- Use the helix messaging framework to broadcast a message to each controller, and via the
AsyncCallback
, collate all the replies and respond back (https://github.com/saurabhd336/pinot/pull/9/files)
I'll test and evaluate both approaches and update this PR accordingly.
fd0e6bd
to
e9dc106
Compare
92e0153
to
aa80094
Compare
@mcvsubbu @sajjad-moradi @npawar @klsince I've tried addressing the issue with the debug data being present across different controllers by letting the main controller query every other controller for the status data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* @param ts A timestamp | ||
*/ | ||
public void addSuccessRunTs(long ts) { | ||
_mostRecentSuccessRunTS.add(ts); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: any need to keep _mostRecentSuccessRunTS sorted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplifies the eviction is all. Also the APi response is easier to work with (the most recent timestamp is the one of the top)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this indicate a successful run (completion) of the task, or a successful generation of the task? If the task fails in minion, does your API address it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only indicates the successful generation of a the task. This API is exclusively for task generation
pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs); | ||
for (TableConfig tableConfig : enabledTableConfigs) { | ||
try { | ||
_taskManagerStatusCache.saveTaskGeneratorInfo(tableConfig.getTableName(), taskGenerator.getTaskType(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add a saveTaskGeneratorInfoQuietly() to save the try-catch block for brevity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the in-memory storage, I don't see any exceptions that can be thrown. Have removed the try catch block
public class InMemoryTaskManagerStatusCache extends TaskManagerStatusCache<TaskGeneratorMostRecentRunInfo> { | ||
|
||
private static class TaskGeneratorCacheKey { | ||
String _tableNameWithType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(minor) make them final
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
private final String _taskType; | ||
private final String _tableNameWithType; | ||
// the timestamp to error message map of the most recent several error runs | ||
@Nonnull |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(minor) We don't usually use Nonnull
. Everything without annotation are treated non-null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
* several error run | ||
* message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(minor)
* several error run | |
* message. | |
* several error run messages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
@VisibleForTesting | ||
static final String MOST_RECENT_SUCCESS_RUN_TS = "mostRecentSuccessRunTs"; | ||
@VisibleForTesting | ||
static final String MOST_RECENT_ERROR_RUN_MESSAGE = "mostRecentErrorRunMessage"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems not used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was being used when we had ZK based storage. Removed
_tableNameWithType = tableNameWithType; | ||
_taskType = taskType; | ||
// sort and keep the most recent several error messages | ||
_mostRecentErrorRunMessage = new TreeMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we decided to not use ZNode to store the information, suggest simplifying it to only take tableNameWithType
and taskType
. We may change it when we decide to persist the info in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
private final String _tableNameWithType; | ||
// the timestamp to error message map of the most recent several error runs | ||
@Nonnull | ||
private final TreeMap<Long, String> _mostRecentErrorRunMessage; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private final TreeMap<Long, String> _mostRecentErrorRunMessage; | |
private final TreeMap<Long, String> _mostRecentErrorRunMessages; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider storing timestamp as java.sql.Timestamp
so that the response can show human readable time
try { | ||
result.add(JsonUtils.stringToJsonNode(resp)); | ||
} catch (IOException e) { | ||
LOGGER.error("Failed to parse "); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incomplete
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
List<PinotTaskConfig> pinotTaskConfigs; | ||
try { | ||
pinotTaskConfigs = taskGenerator.generateTasks(enabledTableConfigs); | ||
for (TableConfig tableConfig : enabledTableConfigs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a TODO
here to separate the generated task/exception for each table. Currently even though some table doesn't have anything scheduled (e.g. enough tasks are scheduled from previous tables), we will still put success TS for it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't get this. This method is only called for enabledTableConfigs
i.e. only for tables for which the particular task is enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had a discussion offline. Added the todo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
public TreeMap<String, String> getMostRecentErrorRunMessages() { | ||
TreeMap<String, String> result = new TreeMap<>(); | ||
_mostRecentErrorRunMessages.forEach((timestamp, error) -> result.put( | ||
OffsetDateTime.ofInstant(Instant.ofEpochMilli(timestamp), ZoneId.of("UTC")).toString(), error)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(minor)
OffsetDateTime.ofInstant(Instant.ofEpochMilli(timestamp), ZoneId.of("UTC")).toString(), error)); | |
OffsetDateTime.ofInstant(Instant.ofEpochMilli(timestamp), ZoneOffset.UTC).toString(), error)); |
We can also consider putting ZoneId.systemDefault()
which might be more readable, but it might also be confusing when the client is in a different timezone from the server. IMO both ways have pros and cons, so your call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel keeping everything UTC might be more user friendly especially when pinot clusters are deployed in different geographies. Keeping it UTC for now.
import java.util.function.Consumer; | ||
|
||
|
||
public abstract class TaskManagerStatusCache<T extends BaseTaskGeneratorInfo> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest changing it to an interface
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
522743f
to
9fa0078
Compare
} | ||
|
||
// Call all controllers | ||
List<InstanceConfig> controllers = _pinotHelixResourceManager.getAllControllerInstanceConfigs(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need to send request to all controllers. Just to the lead controller is enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since POST '/tasks/schedule' API lets one schedule tasks for a given table on any controller (non leader included), we do need to query all controllers here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps I was not clear. You can field the request in any controller (like you do). If that is not the lead for the table, then locate the leader controller (see the class PinotLeadControllerRestletResource
for how to do that) and then forward the request to just the lead controller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok, so the POST actually has any controller schedule the job for a table ? Hmm... maybe we should fix hat too. But I see what you are saying here. My bad.
I had asked some questions about the use case you are tackling. Can you please answer those? How frequently are you generating tasks (in your use case)? Is a metric bump not enough for your use case? Can log processing help solving this use case? |
@npawar @Jackie-Jiang to add more |
I agree there is a need to debug minions. However this API does not provide any new insights other than what Helix already provides. So, let us work off of some concrete examples -- what is the use case in which you think this will help?
Every installation has its own way of processing logs. Cloud installations typically pipe the logs into a log processing service to get insights into it. On-prem installations have either their own way of piping the logs, or ways to get onto the host and look for the logs.
The reason I am reluctant to add more APIs is that the more we add, the more we need to keep backward compatibility on. My question to @saurabhd336 is : Have you looked at what Helix has to offer? If so, what are the short-comings? |
for 1: typically for users of RealtimeToOfflineTask, tasks get generated hourly. For SegmentGenerationAndPushTask, it can be way more frequently, depending on the the number of times files are generated in the source dir. MergeRollupTasks, are less frequent, but still several a day. There might be others, but this is what we see most commonly setup by users in oss. In a typical setup, all of these would be configured. It becomes quite confusing for users to have to find the exact exception in the logs, especially because some logs are in controller (scheduler related) and some in minion (task execution related). This API will help us make the feedback loop quicker, especially when we add this into the new Minion tab on the Pinot Admin UI Regarding info already in Helix, these scheduler related exceptions are not present in the Helix generated metadata. |
/** | ||
* Base abstract class for task generator info. | ||
*/ | ||
public abstract class BaseTaskGeneratorInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to add implements Serializable
* @param ts A timestamp | ||
*/ | ||
public void addSuccessRunTs(long ts) { | ||
_mostRecentSuccessRunTS.add(ts); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this indicate a successful run (completion) of the task, or a successful generation of the task? If the task fails in minion, does your API address it?
I still think log processing should be the answer to this (and other similar PRs that may come up in the future). We should not be adding a new API for every error condition we may encounter in the system (and log something). @siddharthteotia , @npawar , @Jackie-Jiang , @snleee, @kishoreg, @mayankshriv what do you think? If the PMCs don't have any objection to this then I can live with this, but I am willing to bet that more such PRs will come up because logs are difficult to read. |
Save task generator info into an inmemory map, and provide a controller API to fetch the details for a given table
design doc: https://docs.google.com/document/d/1bCn13k57WLSaIAQGfATP75VS3p6V_ievlBi_HoOtmZ8/edit?usp=sharing
Release Notes
Added an API
GET /tasks/generator/{tableNameWithType}/{taskType}/debug
which returns the success time / error message for the 5 most recent task generator runs.