Log a small subset of segments to refresh for debugging Coordinator refresh logic #16998

Merged: 3 commits, Sep 5, 2024
@@ -761,6 +761,9 @@ public Set<SegmentId> refreshSegmentsForDataSource(final String dataSource, fina
    final Map<String, SegmentId> segmentIdMap = Maps.uniqueIndex(segments, SegmentId::toString);

    final Set<SegmentId> retVal = new HashSet<>();

    logSegmentsToRefresh(dataSource, segments);

    final Sequence<SegmentAnalysis> sequence = runSegmentMetadataQuery(
        Iterables.limit(segments, MAX_SEGMENTS_PER_QUERY)
    );
@@ -805,6 +808,14 @@ public Set<SegmentId> refreshSegmentsForDataSource(final String dataSource, fina
    return retVal;
  }

  /**
   * Log the segment details for a datasource to be refreshed for debugging purpose.
   */
  void logSegmentsToRefresh(String dataSource, Set<SegmentId> ids)

@cryptoe (Contributor), Sep 4, 2024:

Could you add Javadoc to this method? Also, there should be an abstract method called logRefreshSegments that returns a boolean.
Then each impl can return true or false,
so CSMC can return true whereas the broker one can return false.
And the log.info call stays in the abstract class itself.
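
A minimal sketch of the shape this comment seems to describe, written here for illustration only. The class names, the `logged` list (standing in for a real logger), the `String` segment IDs, and the sample size of 5 are all assumptions; only the method name logRefreshSegments comes from the comment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for the shared abstract cache class.
abstract class TemplateCacheSketch
{
  private static final int SAMPLE_SIZE = 5;

  // Collects messages instead of calling a real logger, to keep the sketch self-contained.
  final List<String> logged = new ArrayList<>();

  // The hook from the review comment: each implementation only answers yes or no.
  abstract boolean logRefreshSegments();

  // The logging logic lives in the base class and is final, so implementations cannot alter it.
  final void logSegmentsToRefresh(String dataSource, Set<String> segmentIds)
  {
    if (!logRefreshSegments()) {
      return;
    }
    String sample = segmentIds.stream().limit(SAMPLE_SIZE).collect(Collectors.joining(", "));
    logged.add(String.format("Sample of segments [%s] to be refreshed for datasource [%s]", sample, dataSource));
  }
}

// Hypothetical Coordinator-side implementation: opts in to logging.
class CoordinatorTemplateSketch extends TemplateCacheSketch
{
  @Override
  boolean logRefreshSegments()
  {
    return true;
  }
}

// Hypothetical Broker-side implementation: opts out of logging.
class BrokerTemplateSketch extends TemplateCacheSketch
{
  @Override
  boolean logRefreshSegments()
  {
    return false;
  }
}

class TemplateHookDemo
{
  public static void main(String[] args)
  {
    Set<String> ids = new LinkedHashSet<>(Arrays.asList("s1", "s2", "s3", "s4", "s5", "s6", "s7"));
    TemplateCacheSketch coordinator = new CoordinatorTemplateSketch();
    TemplateCacheSketch broker = new BrokerTemplateSketch();
    coordinator.logSegmentsToRefresh("wiki", ids);
    broker.logSegmentsToRefresh("wiki", ids);
    System.out.println(coordinator.logged);
    System.out.println(broker.logged);
  }
}
```

In this shape an implementation can only switch logging on or off; the message format and sample size are fixed in one place.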

@findingrish (Contributor Author), Sep 4, 2024:

What is the benefit of doing that? Also, is there a reason behind making it abstract?

@cryptoe (Contributor), Sep 4, 2024:

So that the main logging logic does not leak out to the impls.
That way a rogue impl can never mess up the logs of the coordinator/broker. At least that is what I was thinking.

Contributor Author:

Okay, but is there a reason behind making it abstract and implementing it in both Broker and Coordinator?
The base method could return false and the overridden method in Coordinator could return true?

Contributor Author:

I feel that the implementation logic for logging the segments should reside in the child classes. This gives both Broker and Coordinator the flexibility to log a different number of segments, blacklist/whitelist datasources, etc.
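
A rough sketch of the alternative argued for here, where the base class exposes a no-op hook and each child class owns its logging logic. Everything below is illustrative: the class names, the `logged` list standing in for a logger, the `String` segment IDs, and the `sampleSize` and `skippedDataSources` parameters are assumptions, not part of the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for the shared base class: the hook does nothing by default.
class HookCacheSketch
{
  // Collects messages instead of calling a real logger, to keep the sketch self-contained.
  final List<String> logged = new ArrayList<>();

  void logSegmentsToRefresh(String dataSource, Set<String> segmentIds)
  {
    // no-op
  }
}

// Hypothetical Coordinator-side class: decides for itself how much to log and for which datasources.
class CoordinatorHookSketch extends HookCacheSketch
{
  private final int sampleSize;
  private final Set<String> skippedDataSources;

  CoordinatorHookSketch(int sampleSize, Set<String> skippedDataSources)
  {
    this.sampleSize = sampleSize;
    this.skippedDataSources = skippedDataSources;
  }

  @Override
  void logSegmentsToRefresh(String dataSource, Set<String> segmentIds)
  {
    if (skippedDataSources.contains(dataSource)) {
      return;
    }
    String sample = segmentIds.stream().limit(sampleSize).collect(Collectors.joining(", "));
    logged.add(String.format("Up to %d segments [%s] to be refreshed for datasource [%s]", sampleSize, sample, dataSource));
  }
}

class HookDemo
{
  public static void main(String[] args)
  {
    Set<String> ids = new LinkedHashSet<>(Arrays.asList("s1", "s2", "s3", "s4", "s5", "s6", "s7"));
    CoordinatorHookSketch coordinator = new CoordinatorHookSketch(2, new HashSet<>(Arrays.asList("noisy")));
    coordinator.logSegmentsToRefresh("wiki", ids);
    coordinator.logSegmentsToRefresh("noisy", ids);
    System.out.println(coordinator.logged);

    // A class that does not override the hook (the Broker side in this sketch) logs nothing.
    HookCacheSketch broker = new HookCacheSketch();
    broker.logSegmentsToRefresh("wiki", ids);
    System.out.println(broker.logged);
  }
}
```

The trade-off against a base-class-owned log call is the one raised in the thread: child classes gain flexibility, but nothing stops an implementation from logging more than intended.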

@cryptoe (Contributor), Sep 4, 2024:

> The base method could return false and the overridden method in Coordinator could return true?

Sure.

Contributor Author:

> The base method could return false and the overridden method in Coordinator could return true?
>
> Sure.

I haven't made this change yet. Let me know what you think about this:

> I feel that the implementation logic for logging the segments should reside in the child classes. This gives both Broker and Coordinator the flexibility to log a different number of segments, blacklist/whitelist datasources, etc.

  {
    // no-op
  }

  /**
   * Action to be executed on the result of Segment metadata query.
   * Returns if the segment metadata was updated.
@@ -23,6 +23,7 @@
import com.google.common.base.Supplier;
import com.google.common.collect.FluentIterable;
import com.google.common.collect.ImmutableMap;
import com.google.common.collect.Iterables;
import com.google.common.collect.Sets;
import com.google.inject.Inject;
import org.apache.druid.client.CoordinatorServerView;
@@ -529,6 +530,12 @@ public void refresh(final Set<SegmentId> segmentsToRefresh, final Set<String> da
}
}

  @Override
  void logSegmentsToRefresh(String dataSource, Set<SegmentId> ids)
  {
    log.info("Logging a sample of 5 segments [%s] to be refreshed for datasource [%s]", Iterables.limit(ids, 5), dataSource);
  }

  private void filterRealtimeSegments(Set<SegmentId> segmentIds)
  {
    synchronized (lock) {
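
The override in the last hunk logs at most five segment IDs by passing `Iterables.limit(ids, 5)` to the log call. The sketch below illustrates the same "bounded sample" idea with `java.util.stream` as a dependency-free stand-in for Guava; the class and method names are hypothetical, and `String` is used in place of `SegmentId`.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class SegmentSampleSketch
{
  // Returns the first `limit` elements in the collection's iteration order,
  // or every element if the collection holds fewer than `limit`.
  static <T> List<T> sample(Collection<T> items, int limit)
  {
    return items.stream().limit(limit).collect(Collectors.toList());
  }

  public static void main(String[] args)
  {
    // LinkedHashSet keeps insertion order, so the sample here is deterministic.
    Set<String> many = new LinkedHashSet<>(Arrays.asList("s1", "s2", "s3", "s4", "s5", "s6", "s7"));
    Set<String> few = new LinkedHashSet<>(Arrays.asList("s1", "s2", "s3"));
    System.out.println(sample(many, 5));
    System.out.println(sample(few, 5));
  }
}
```

Because the sample follows iteration order, which elements appear depends on the set implementation: with a plain `HashSet` the five logged IDs are an arbitrary subset, which is usually fine for a debugging aid.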