Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add table indexes API #11576

Merged
merged 7 commits into from
Sep 18, 2023
Merged

Add table indexes API #11576

merged 7 commits into from
Sep 18, 2023

Conversation

saurabhd336
Copy link
Contributor

@saurabhd336 saurabhd336 commented Sep 12, 2023

This API is primarily for the controller UI -> tables ->

-> reload status -> STATUS ui element. Right now it uses the /segments/{tableName}/metadata?type={type}&columns=* API to display the column level index details. This has a few problems

  1. This is an expensive API call especially for large tables. This leads to bad ux when using the reload status UI feature
  2. The UI logic to build tableLevel column -> index mapping using this segmentName level metadata is simply wrong (uss the first segment's metadata to build the view).
  3. Since Index-spi changes are merged, this API does not return metadata of dynamically added index types, but only some hardcoded ones.

This new API's reponse looks like this

The totalSegments is the total count of ONLINE segments (replication included). Since mutable version of all indexes don't exist, CONSUMING segments will not be used when building the response.

{
    "totalSegments": 31,
    "columnToIndexesCount":
    {
        "FlightNum":
        {
            "dictionary": 31,
            "bloom": 0,
            "null": 0,
            "forward": 31,
            "fst": 0,
            "json": 0,
            "range": 0,
            "h3": 0,
            "text": 0,
            "inverted": 0,
            "some-dynamically-injected-index-type": 31,
        },
        "Origin":
        {
            "dictionary": 31,
            "bloom": 0,
            "null": 0,
            "forward": 31,
            "fst": 0,
            "json": 0,
            "range": 0,
            "h3": 0,
            "text": 0,
            "inverted": 0,
            "some-dynamically-injected-index-type": 0,
        }
        ...
}

I'll followup with @jayeshchoudhary to move the UI API call to this new one, and how best to display the UI element once this PR is merged.

@saurabhd336
Copy link
Contributor Author

Screenshot 2023-09-12 at 4 41 17 PM

API calls involved when viewing this modal. For a large enough table, this UI element takes ~40s to load

@codecov-commenter
Copy link

codecov-commenter commented Sep 12, 2023

Codecov Report

Merging #11576 (75f1b45) into master (b1b070f) will increase coverage by 0.02%.
The diff coverage is 30.43%.

@@             Coverage Diff              @@
##             master   #11576      +/-   ##
============================================
+ Coverage     14.51%   14.54%   +0.02%     
  Complexity      201      201              
============================================
  Files          2323     2324       +1     
  Lines        124465   124534      +69     
  Branches      18989    18994       +5     
============================================
+ Hits          18071    18109      +38     
- Misses       104877   104906      +29     
- Partials       1517     1519       +2     
Flag Coverage Δ
integration ?
integration2 ?
java-11 14.52% <30.43%> (+<0.01%) ⬆️
java-17 14.51% <30.43%> (+<0.01%) ⬆️
java-20 14.53% <30.43%> (+0.02%) ⬆️
temurin 14.54% <30.43%> (+0.02%) ⬆️
unittests 14.54% <30.43%> (+0.02%) ⬆️
unittests2 14.54% <30.43%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
...on/response/server/TableIndexMetadataResponse.java 0.00% <0.00%> (ø)
...oller/api/resources/PinotTableRestletResource.java 43.42% <0.00%> (-3.96%) ⬇️
...che/pinot/server/api/resources/TablesResource.java 60.98% <91.30%> (+2.47%) ⬆️

... and 5 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@Produces(MediaType.APPLICATION_JSON)
@ApiOperation(value = "Get the aggregate index details of all segments for a table", notes = "Get the aggregate "
+ "index details of all segments for a table")
public String getTableIndexes(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a new API, shouldn't we have tests to validate its behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. added tests

response = ErrorInfo.class), @ApiResponse(code = 404, message = "Table or segment not found", response =
ErrorInfo.class)
})
public String getTableIndexes(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, shouldn't we have some unit tests and integration tests to validate this new API?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. added tests

Copy link
Contributor

@ege-st ege-st left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had one question about tests.

public void getTableIndexes()
throws Exception {
String tableIndexesPath =
"/tables/" + TableNameBuilder.forType(TableType.OFFLINE).tableNameWithType(TABLE_NAME) + "/indexes";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also need to test this for RT tables?

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

long totalSegmentCount = 0;
Map<String, Map<String, Long>> columnToIndexesCount = new HashMap<>();
for (SegmentDataManager segmentDataManager : allSegments) {
if (segmentDataManager instanceof RealtimeSegmentDataManager) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the documentation, let's mention that consuming segment is not counted to avoid confusion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

TableDataManager tableDataManager = ServerResourceUtils.checkGetTableDataManager(_serverInstance, tableName);
List<SegmentDataManager> allSegments = tableDataManager.acquireAllSegments();
try {
long totalSegmentCount = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) long seems not necessary, same for the index count

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

Comment on lines 1024 to 1025
columnToIndexCountMap.putIfAbsent(col, new HashMap<>());

indexToCount.forEach((indexName, count) -> {
columnToIndexCountMap.get(col)
.put(indexName, columnToIndexCountMap.get(col).getOrDefault(indexName, 0L) + count);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reduce map lookup

Suggested change
columnToIndexCountMap.putIfAbsent(col, new HashMap<>());
indexToCount.forEach((indexName, count) -> {
columnToIndexCountMap.get(col)
.put(indexName, columnToIndexCountMap.get(col).getOrDefault(indexName, 0L) + count);
});
Map<String, Integer> indexCountMap = columnToIndexCountMap.computeIfAbsent(col, new HashMap<>());
indexToCount.forEach((indexName, count) -> {
indexCountMap.compute(col, (k, v) -> v != null ? v + count : count);
});

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

@saurabhd336 saurabhd336 merged commit 49196ec into apache:master Sep 18, 2023
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants