-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implemented metrics for LoadBasedPartitionAssignmentStrategy #840
Implemented metrics for LoadBasedPartitionAssignmentStrategy #840
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My recommendation is to not pass the metrics from the assigner class. Instead emit the metric from the assigner class itself and expose additional methods like getMetricsInfos()
that you can use. This way all the metric logic will remain in one place and the code will look cleaner.
...rver/src/main/java/com/linkedin/datastream/server/assignment/LoadBasedPartitionAssigner.java
Outdated
Show resolved
Hide resolved
...rver/src/main/java/com/linkedin/datastream/server/assignment/LoadBasedPartitionAssigner.java
Outdated
Show resolved
Hide resolved
|
||
@Override | ||
protected void unregisterMetrics(String datastream) { | ||
_assigner.unregisterMetricsForDatastream(datastream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method still has to be here, because there's a path for cleaning metrics in StickyPartitionAssignmentStrategy
which calls this method.
...rver/src/main/java/com/linkedin/datastream/server/assignment/LoadBasedPartitionAssigner.java
Outdated
Show resolved
Hide resolved
@@ -144,6 +169,14 @@ | |||
newAssignments.put(instance, newTasks); | |||
}); | |||
|
|||
// update metrics | |||
PartitionAssignmentStats stats = new PartitionAssignmentStats(minPartitionsAcrossTasks.get(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need a separate PartitionAssignmentStats
class? You can probably get rid of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes the code cleaner. If I remove it, I'll have to maintain two separate maps for min/max partitions across tasks per datastream. This is more extensible as we can add more stats and add more per-datastream metrics in the future (e.g. the maxThroughputAcrossTasks we discussed earlier).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The major problem I can see with this future extension is, if there is another method created and that method calculates some stats, then either you will have to define all the getter and setters and make the fields non-final. Suppose we want to create any alert based metric, then all the other fields also need to be set. To overcome this, you may have to optional to figure out which fields are set and it will be an overkill.
You can have separate methods for each metric for create/Update the metric and just call those method, just like it is done in other classes in this repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you mean by an alert-based metric, but from the description it seems like it doesn't fit with what I see as partition assignment stats. Semantically they might be incompatible, and we don't have to put it in the class. I still see value in keeping PartitionAssignmentStats
around, for the reasons I mentioned above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean if you have to emit a new metric which is in the exception path (and you don't calculate the other stats), then reusing this stat will be an issue and will require refactoring, V/s addition of new metric not impacting any other code change.
...rver/src/main/java/com/linkedin/datastream/server/assignment/LoadBasedPartitionAssigner.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, in the case when a datastream is deleted, how would this work? Since we are never deleting the record for a datastream from the _partitionAssignmentStatsMap.
...rver/src/main/java/com/linkedin/datastream/server/assignment/LoadBasedPartitionAssigner.java
Outdated
Show resolved
Hide resolved
...rver/src/main/java/com/linkedin/datastream/server/assignment/LoadBasedPartitionAssigner.java
Show resolved
Hide resolved
...rver/src/main/java/com/linkedin/datastream/server/assignment/LoadBasedPartitionAssigner.java
Show resolved
Hide resolved
talked with jhora offline and he said, "on datastream delete/stop, the metric gets unregistered i.e. the values for deleted/stopped metrics don’t get accessed and emitted" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving with the note that there might be refactoring required for stats in the future based on the requirement.
Implemented the following metrics for
LoadBasedPartitionAssignmentStrategy
:Important: DO NOT REPORT SECURITY ISSUES DIRECTLY ON GITHUB.
For reporting security issues and contributing security fixes,
please, email security@linkedin.com instead, as described in
the contribution guidelines.
Please, take a minute to review the contribution guidelines at:
https://github.com/linkedin/Brooklin/blob/master/CONTRIBUTING.md