Add replication factor column to sys table #14403

Merged (18 commits) on Jun 18, 2023

@adarshsanjeev adarshsanjeev commented Jun 9, 2023

Currently, Druid allows a segment to be left unloaded on every Historical. The sys.segments table on the Broker does not track this state, which makes it hard to tell from that table why a segment is unavailable and whether that is expected. It also makes it hard to decide which segments to wait for after an ingestion finishes, since some of them may never be loaded because of the load rules.

This PR adds a new column, replication_factor, to the sys.segments virtual table and populates it with information from the Coordinator. If this column is 0, the segment is not assigned to any Historical and will never be loaded.

This column can be accessed from the sys table with an ordinary SQL query:
SELECT "segment_id", "replication_factor" FROM sys."segments"

[Screenshot, 2023-06-14 at 9:12:56 AM]

Response:

[
    [
        "segment_id",
        "replication_factor"
    ],
    [
        "STRING",
        "LONG"
    ],
    [
        "VARCHAR",
        "BIGINT"
    ],
    [
        "kttm_simple_-146136543-09-08T08:23:32.096Z_146140482-04-24T15:36:27.903Z_2023-06-14T03:38:24.502Z",
        -1
    ],
    [
        "kttm_simple_-146136543-09-08T08:23:32.096Z_146140482-04-24T15:36:27.903Z_2023-06-14T03:40:13.463Z",
        2
    ]
]
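
To make the response layout concrete, here is a minimal sketch of turning it into one record per segment. It assumes the first three rows are the column names, the Druid types, and the SQL types, as in the response above; the helper name rows_to_records and its header_rows parameter are hypothetical and not part of Druid.

```python
from typing import Any


def rows_to_records(response: list[list[Any]], header_rows: int = 3) -> list[dict[str, Any]]:
    """Convert an array-style result with leading header rows into dicts.

    Assumption: row 0 holds the column names and the next rows (up to
    header_rows) hold type information, as in the response shown above.
    """
    if not response:
        return []
    names = response[0]
    return [dict(zip(names, row)) for row in response[header_rows:]]


# The response from the PR description, copied as-is.
response = [
    ["segment_id", "replication_factor"],
    ["STRING", "LONG"],
    ["VARCHAR", "BIGINT"],
    [
        "kttm_simple_-146136543-09-08T08:23:32.096Z_146140482-04-24T15:36:27.903Z_2023-06-14T03:38:24.502Z",
        -1,
    ],
    [
        "kttm_simple_-146136543-09-08T08:23:32.096Z_146140482-04-24T15:36:27.903Z_2023-06-14T03:40:13.463Z",
        2,
    ],
]

records = rows_to_records(response)
for record in records:
    print(record["segment_id"], record["replication_factor"])
```

If the query is issued without the type rows, header_rows would need to be adjusted to match.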

Release Notes

  • Adds a new column, replication_factor, to the sys.segments table. It returns the total number of replicants of the segment across all tiers, and is set to -1 when that information is not yet available.
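
The three cases described in this PR (-1, 0, and a positive count) could be handled by a client roughly as follows. This is only a sketch of one way to decide which segments to wait for after ingestion; the function names, the label strings, and the sample segment IDs are hypothetical and do not come from Druid.

```python
def classify_replication_factor(replication_factor: int) -> str:
    """Map a replication_factor value to a hypothetical status label."""
    if replication_factor == -1:
        # The information is not available yet, so check again later.
        return "unknown"
    if replication_factor == 0:
        # Not assigned to any Historical; the segment will never be loaded.
        return "never_loaded"
    if replication_factor > 0:
        # Assigned to at least one Historical across all tiers.
        return "expected_to_load"
    raise ValueError(f"unexpected replication_factor: {replication_factor}")


def segments_to_wait_for(rows: list[tuple[str, int]]) -> list[str]:
    """Return IDs of segments that are expected to load at some point."""
    return [
        segment_id
        for segment_id, factor in rows
        if classify_replication_factor(factor) == "expected_to_load"
    ]


# Made-up sample rows of (segment_id, replication_factor).
sample_rows = [("seg_a", 2), ("seg_b", 0), ("seg_c", -1)]
print(segments_to_wait_for(sample_rows))
```

Segments classified as unknown are left out here; a real client would probably query again before deciding whether to wait for them.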

@abhishekrb19 abhishekrb19 left a comment

Thanks for the change! I have a few questions about API backwards compatibility and naming consistency.

@abhishekrb19 abhishekrb19 left a comment

Overall approach looks good to me. One question related to setReplicationFactor and a few minor nits, thanks!

@adarshsanjeev adarshsanjeev changed the title from "Add target replica column to sys table" to "Add replication factor column to sys table" Jun 12, 2023
@kfaraz kfaraz left a comment

Left some final feedback comments.

@adarshsanjeev , could you please also share a web-console screenshot of the query and the results table from your local/cluster testing?

@vogievetsky vogievetsky added the "Needs web console change" label (Backend API changes that would benefit from frontend support in the web console) Jun 14, 2023
@cryptoe cryptoe left a comment

Left minor comments. Overall LGTM. Will approve once those are addressed.

@kfaraz kfaraz left a comment

Looks good to me.

@adarshsanjeev , for the coverage issues, you can try adding a simple test for MetadataResource similar to DataSourceResourceTest or tests for other resource classes.

kfaraz commented Jun 16, 2023

Left a few minor comments, but none of them are blockers for this PR.

@kfaraz kfaraz merged commit 128133f into apache:master Jun 18, 2023
abhishekagarwal87 pushed a commit that referenced this pull request Jul 17, 2023
This PR catches the console up to all the backend changes for Druid 27

Specifically:

Add page information to SqlStatementResource API #14512
Allow empty tiered replicants map for load rules #14432
Adding Interactive API's for MSQ engine #14416
Add replication factor column to sys table #14403
Account for data format and compression in MSQ auto taskAssignment #14307
Errors take 3 #14004
AmatyaAvadhanula pushed a commit to AmatyaAvadhanula/druid that referenced this pull request Jul 17, 2023
abhishekagarwal87 pushed a commit that referenced this pull request Jul 17, 2023

Co-authored-by: Vadim Ogievetsky <vadim@ogievetsky.com>
@abhishekagarwal87 abhishekagarwal87 added this to the 27.0 milestone Jul 19, 2023
sergioferragut pushed a commit to sergioferragut/druid that referenced this pull request Jul 21, 2023
@vogievetsky vogievetsky removed the "Needs web console change" label (Backend API changes that would benefit from frontend support in the web console) Aug 8, 2023