-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding ability to fetch metadata for specific list of segments #11949
Conversation
cc @snleee - Please review |
68874d7
to
03c2001
Compare
Codecov Report
@@ Coverage Diff @@
## master #11949 +/- ##
============================================
+ Coverage 61.30% 61.45% +0.14%
+ Complexity 1140 207 -933
============================================
Files 2378 2385 +7
Lines 128865 129157 +292
Branches 19927 19997 +70
============================================
+ Hits 78998 79370 +372
+ Misses 44147 44027 -120
- Partials 5720 5760 +40
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 77 files with indirect coverage changes 📣 Codecov offers a browser extension for seamless coverage viewing on GitHub. Try it in Chrome or Firefox today! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
pinot-controller/src/main/java/org/apache/pinot/controller/util/TableMetadataReader.java
Outdated
Show resolved
Hide resolved
pinot-controller/src/main/java/org/apache/pinot/controller/util/TableMetadataReader.java
Show resolved
Hide resolved
d03d917
to
a6c8a4d
Compare
a6c8a4d
to
4ee5788
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late review. I don't see the rest API change in this PR. Do we plan to add it in a separate PR?
*/ | ||
public JsonNode getSegmentsMetadata(String tableNameWithType, List<String> columns, int timeoutMs) | ||
public JsonNode getSegmentsMetadata(String tableNameWithType, List<String> columns, Set<String> segmentsToInclude, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add @Nullable
to segmentsToInclude
} | ||
|
||
private JsonNode getSegmentsMetadataInternal(String tableNameWithType, List<String> columns, | ||
Set<String> segmentsToInclude, int timeoutMs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add @Nullable
to segmentsToInclude
Problem: Getting metadata for all segments leads to high heap utilization and even controller OOMs (especially when we have large number of segments and small controller heap). Typically we dont need metadata for all segments at once and this can be done in batches.
Solution:
Adding an API to fetch metadata for a list of segments. Also unified all APIs to use the same internal function. Did not modify existing APIs for backwards compatiblity.
Verified that existing functionality/apis work as expected
curl -X 'GET'
'http://localhost:9000/segments/airlineStats/metadata?type=OFFLINE'
-H 'accept: application/json'