Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented metrics for LoadBasedPartitionAssignmentStrategy #840

Merged
merged 5 commits into from
Jul 12, 2021

Conversation

jzakaryan
Copy link
Collaborator

Implemented the following metrics for LoadBasedPartitionAssignmentStrategy:

  • Minimum partitions across tasks (per datastream)
  • Maximum partitions across tasks (per datastream)
  • Throughput information fetch rate

Important: DO NOT REPORT SECURITY ISSUES DIRECTLY ON GITHUB.
For reporting security issues and contributing security fixes,
please, email security@linkedin.com instead, as described in
the contribution guidelines.

Please, take a minute to review the contribution guidelines at:
https://github.com/linkedin/Brooklin/blob/master/CONTRIBUTING.md

Copy link
Collaborator

@vmaheshw vmaheshw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My recommendation is to not pass the metrics from the assigner class. Instead emit the metric from the assigner class itself and expose additional methods like getMetricsInfos() that you can use. This way all the metric logic will remain in one place and the code will look cleaner.

@jzakaryan jzakaryan requested a review from vmaheshw June 29, 2021 20:45

@Override
protected void unregisterMetrics(String datastream) {
_assigner.unregisterMetricsForDatastream(datastream);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method still has to be here, because there's a path for cleaning metrics in StickyPartitionAssignmentStrategy which calls this method.

@@ -144,6 +169,14 @@
newAssignments.put(instance, newTasks);
});

// update metrics
PartitionAssignmentStats stats = new PartitionAssignmentStats(minPartitionsAcrossTasks.get(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need a separate PartitionAssignmentStats class? You can probably get rid of it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes the code cleaner. If I remove it, I'll have to maintain two separate maps for min/max partitions across tasks per datastream. This is more extensible as we can add more stats and add more per-datastream metrics in the future (e.g. the maxThroughputAcrossTasks we discussed earlier).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The major problem I can see with this future extension is, if there is another method created and that method calculates some stats, then either you will have to define all the getter and setters and make the fields non-final. Suppose we want to create any alert based metric, then all the other fields also need to be set. To overcome this, you may have to optional to figure out which fields are set and it will be an overkill.

You can have separate methods for each metric for create/Update the metric and just call those method, just like it is done in other classes in this repo.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean by an alert-based metric, but from the description it seems like it doesn't fit with what I see as partition assignment stats. Semantically they might be incompatible, and we don't have to put it in the class. I still see value in keeping PartitionAssignmentStats around, for the reasons I mentioned above.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean if you have to emit a new metric which is in the exception path (and you don't calculate the other stats), then reusing this stat will be an issue and will require refactoring, V/s addition of new metric not impacting any other code change.

@jzakaryan jzakaryan requested a review from vmaheshw July 1, 2021 01:16
Copy link
Collaborator

@shrinandthakkar shrinandthakkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, in the case when a datastream is deleted, how would this work? Since we are never deleting the record for a datastream from the _partitionAssignmentStatsMap.

@shrinandthakkar
Copy link
Collaborator

Also, in the case when a datastream is deleted, how would this work? Since we are never deleting the record for a datastream from the _partitionAssignmentStatsMap.

talked with jhora offline and he said, "on datastream delete/stop, the metric gets unregistered i.e. the values for deleted/stopped metrics don’t get accessed and emitted"

Copy link
Collaborator

@vmaheshw vmaheshw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving with the note that there might be refactoring required for stats in the future based on the requirement.

@jzakaryan jzakaryan merged commit ddde7ae into linkedin:master Jul 12, 2021
vmaheshw pushed a commit to vmaheshw/brooklin that referenced this pull request Mar 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants