Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Segment Lineage List API #9005 #9006

Merged
merged 1 commit into from
Jul 13, 2022
Merged

Add Segment Lineage List API #9005 #9006

merged 1 commit into from
Jul 13, 2022

Conversation

yuanbenson
Copy link
Contributor

@yuanbenson yuanbenson commented Jun 30, 2022

Description

This PR adds support for listing segment lineage entries. The following endpoint on the controller can be used to retrieve such entries in chronological order of timestamp for a particular table.

  • segments/{tableName}/lineage

Testing Done

Added new Junit test in PinotSegmentRestletResourceTest.java

Manual verification

  1. With valid table type combination.

image

  1. Invalid type parameter.

image

3. Invalid table

image

Labels

feature

@yuanbenson yuanbenson marked this pull request as ready for review June 30, 2022 23:35
@jtao15
Copy link
Contributor

jtao15 commented Jun 30, 2022

Looks like it only returns the lineage entry Ids, can we return the actual entries as well? You can probably refer the get schema api "/schema/{schemaName}" to convert the object to json string.

@Authenticate(AccessType.READ)
@Produces(MediaType.APPLICATION_JSON)
@ApiOperation(value = "List segment lineage", notes = "List segment lineage")
public Response listSegmentLineage(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we add some test for this endpoint. you can checkout how other endpoints are tested under PinotSegmentUploadDownloadRestletResourceTest

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raised some concerns in the #Testing done section of the PR description. Namely, PinotSegmentUploadDownloadRestletResourceTest does not seem to cover any segment operations integration tests (as it requires table creation and segment upload) and such operations are tested in PinotHelixResourceManagerTest though.

Copy link
Contributor

@walterddr walterddr Jul 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh. actually my mistake. I think the first thing I suggest is to move the API from PinotSegmentUploadDownloadRestletResource to PinotSegmentRestletResource where all the other segment metadata APIs are located.
then it should be easy to add a test in PinotSegmentRestletResourceTest

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CC ^ @snleee what do you think? this IMO should be in SegmentRestletResource rather than the UploadDownload one

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that @yuanbenson probably put the new API to PinotSegmentUploadDownloadRestletResource because the segment lineage-related code is being written in the segment replacement-related APIs.

Another way to put this is e.g. GET /tables/{tableName}/segments/lineage under PinotSegmentRestletResource as @walterddr suggested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@walterddr Moved the newly added API to PinotSegmentRestletResource.

@codecov-commenter
Copy link

codecov-commenter commented Jul 5, 2022

Codecov Report

Merging #9006 (a018c41) into master (378bdec) will decrease coverage by 0.08%.
The diff coverage is 73.35%.

❗ Current head a018c41 differs from pull request most recent head 3be8b5e. Consider uploading reports for the commit 3be8b5e to get more accurate results

@@             Coverage Diff              @@
##             master    #9006      +/-   ##
============================================
- Coverage     68.55%   68.47%   -0.09%     
- Complexity     4839     4961     +122     
============================================
  Files          1808     1831      +23     
  Lines         94274    96293    +2019     
  Branches      14056    14392     +336     
============================================
+ Hits          64633    65935    +1302     
- Misses        25122    25774     +652     
- Partials       4519     4584      +65     
Flag Coverage Δ
integration1 ?
integration2 24.64% <29.52%> (?)
unittests1 66.85% <68.46%> (+0.42%) ⬆️
unittests2 15.40% <28.01%> (+0.52%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
.../java/org/apache/pinot/client/utils/Constants.java 0.00% <ø> (ø)
...ava/org/apache/pinot/client/utils/DriverUtils.java 15.73% <ø> (+11.33%) ⬆️
...inot/common/function/scalar/DateTimeFunctions.java 95.00% <ø> (-0.24%) ⬇️
...rg/apache/pinot/common/lineage/SegmentLineage.java 72.22% <0.00%> (-23.43%) ⬇️
...g/apache/pinot/common/minion/BaseTaskMetadata.java 60.00% <ø> (ø)
...e/pinot/common/minion/MergeRollupTaskMetadata.java 0.00% <ø> (-94.74%) ⬇️
.../minion/RealtimeToOfflineSegmentsTaskMetadata.java 100.00% <ø> (ø)
...pache/pinot/common/utils/request/RequestUtils.java 82.89% <0.00%> (+1.31%) ⬆️
...roller/api/resources/PinotTaskRestletResource.java 0.00% <0.00%> (-4.06%) ⬇️
...troller/helix/core/minion/ClusterInfoAccessor.java 65.51% <ø> (-13.80%) ⬇️
... and 521 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 378bdec...3be8b5e. Read the comment docs.

@yuanbenson
Copy link
Contributor Author

Looks like it only returns the lineage entry Ids, can we return the actual entries as well? You can probably refer the get schema api "/schema/{schemaName}" to convert the object to json string.

Thanks, addressed.

Copy link
Contributor

@snleee snleee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you for working on this!

@snleee
Copy link
Contributor

snleee commented Jul 11, 2022

@yuanbenson Linter test fails. Can you try to set up https://docs.pinot.apache.org/developers/developers-and-contributors/code-setup#intellij and apply the style to the classes that you added?

@Authenticate(AccessType.READ)
@Produces(MediaType.APPLICATION_JSON)
@ApiOperation(value = "List segment lineage", notes = "List segment lineage")
public Response listSegmentLineage(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that @yuanbenson probably put the new API to PinotSegmentUploadDownloadRestletResource because the segment lineage-related code is being written in the segment replacement-related APIs.

Another way to put this is e.g. GET /tables/{tableName}/segments/lineage under PinotSegmentRestletResource as @walterddr suggested.

@jtao15
Copy link
Contributor

jtao15 commented Jul 13, 2022

LGTM, thanks for working on this!

LinkedHashMap<String, LineageEntry> sortedLineageEntries = new LinkedHashMap<>();
_lineageEntries.entrySet().stream()
.sorted(Map.Entry.comparingByValue(Comparator.comparingLong(LineageEntry::getTimestamp)))
.forEachOrdered(x -> sortedLineageEntries.put(x.getKey(), x.getValue()));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@snleee fyi ensures resulting entries of listing segment lineage REST API are in chronological order here.

Copy link
Contributor

@walterddr walterddr Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO sorting is not necessary. at least we shouldn't do this in the toJsonObject path. what do you think we add the timestamp processing later?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed offline. We agreed to file the issue and address it from the separate PR.

@snleee
Copy link
Contributor

snleee commented Jul 13, 2022

@yuanbenson Can you change the description segments/{tableName}/listSegmentLineage?

@yuanbenson
Copy link
Contributor Author

(minor) Can we change segments/{tableName}/listSegmentLineage to segments/{tableName}/lineage to make this simpler?

Yes, indeed that has been done! I just updated the description to reflect that.

@snleee
Copy link
Contributor

snleee commented Jul 13, 2022

@yuanbenson Please create an issue regarding optimizing the JSON conversion for the SegmentLineage to remove the necessity of sorting.

Copy link
Contributor

@walterddr walterddr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me. thank you for the contribution!

@snleee snleee merged commit 7eec296 into apache:master Jul 13, 2022
@Jackie-Jiang Jackie-Jiang added feature release-notes Referenced by PRs that need attention when compiling the next release notes labels Jul 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature release-notes Referenced by PRs that need attention when compiling the next release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants