(perf) server: batch SQL Metadata deleteSegments #14639

jasonk000 · 2023-07-21T17:52:18Z

Related to #14634.

Description

Introduce batching to mitigate some scaling challenges while managing lots of segments.

This PR introduces changes to IndexerSQLMetadataStorageCoordinator to use the JDBI PreparedBatch instead of issuing single update statements inside a transaction.

Context: in our environment, bulk cleanup of old segments (on the order of thousands) stalls the overlord, because the overlord issues the delete statements one at a time. These deletes run while holding the TaskLockbox lock, which is acquired from the TaskQueue, so the whole overlord locks up until the deletes complete. By pushing them into a single batched transaction we see a meaningful improvement: about 1000 rows/sec removed before, about 1800 rows/sec removed after.
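For readers who want the shape of the change without opening the diff, here is a minimal sketch of the batching idea in plain Java. It is not the code from this PR: `BatchDeleteSketch`, `BatchSink`, `partition`, `deleteInBatches`, and the batch size of 2 are all hypothetical names and values chosen for illustration. `BatchSink` stands in for whatever executes one batched statement (for example a JDBI PreparedBatch), and it is assumed to return per-row update counts in the same way JDBC's `Statement.executeBatch()` returns an `int[]`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BatchDeleteSketch {

    /** Hypothetical stand-in for one batched database call; returns per-row update counts. */
    interface BatchSink {
        int[] execute(List<String> segmentIds);
    }

    /** Splits ids into consecutive sublists of at most batchSize elements. */
    static List<List<String>> partition(List<String> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    /** Deletes ids batch by batch and returns the total number of rows reported deleted. */
    static int deleteInBatches(List<String> ids, int batchSize, BatchSink sink) {
        if (ids.isEmpty()) {
            return 0; // nothing to delete: skip the database entirely
        }
        // Build all batches up front, before any database work starts.
        List<List<String>> batches = partition(ids, batchSize);
        int deleted = 0;
        for (List<String> batch : batches) {
            // One call per batch instead of one call per segment id.
            for (int count : sink.execute(batch)) {
                deleted += count;
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("seg-1", "seg-2", "seg-3", "seg-4", "seg-5");
        int[] calls = {0};
        // Fake sink: counts calls and pretends every row existed and was deleted.
        int deleted = deleteInBatches(ids, 2, batch -> {
            calls[0]++;
            int[] counts = new int[batch.size()];
            Arrays.fill(counts, 1);
            return counts;
        });
        System.out.println("deleted=" + deleted + " calls=" + calls[0]);
    }
}
```

With five ids and a batch size of 2, the fake sink is called three times rather than five. The point of the sketch is that the number of database calls drops from one per segment to one per batch, which should shorten the time spent holding the lock.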

Key changed/added classes in this PR

IndexerSQLMetadataStorageCoordinator: use PreparedBatch instead of single statements.

This PR has:

- been self-reviewed.
- added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
- been tested in a test Druid cluster.

abhishekrb19

The overall approach looks good to me. I left a few comments around batch size and transaction code blocks. Thanks!

server/src/main/java/org/apache/druid/metadata/IndexerSQLMetadataStorageCoordinator.java

server/src/test/java/org/apache/druid/metadata/IndexerSQLMetadataStorageCoordinatorTest.java

kfaraz

Thanks for taking this up, @jasonk000 ! I have left some suggestions.

kfaraz

Final nitpicks.

server/src/main/java/org/apache/druid/metadata/IndexerSQLMetadataStorageCoordinator.java

jasonk000 · 2023-07-23T04:37:06Z

> Final nitpicks.

Much appreciated @kfaraz, all good suggestions. Fixed in d28f768.

kfaraz

Thanks for the quick fix, @jasonk000 !

maytasm · 2023-07-24T04:36:26Z

A bit late to the PR. Change looks good to me. Thank you for the PR!

maytasm

+1

jasonk000 requested review from maytasm and zhangyue19921010 July 21, 2023 17:52

jasonk000 added the Performance and Area - Operations labels Jul 21, 2023

(perf) server: batch SQL Metadata deleteSegments and updateSegmentMetadata

d57c891

jasonk000 force-pushed the batch-sql-deleteSegments branch from 3758d7a to d57c891 Compare July 21, 2023 18:21

jasonk000 mentioned this pull request Jul 21, 2023

Speed up kill tasks by deleting segments in batch #14131

Merged

10 tasks

jasonk000 requested review from kfaraz and removed request for zhangyue19921010 July 21, 2023 18:25

jasonk000 marked this pull request as ready for review July 21, 2023 21:03

abhishekrb19 reviewed Jul 22, 2023

kfaraz reviewed Jul 22, 2023

server/src/main/java/org/apache/druid/metadata/IndexerSQLMetadataStorageCoordinator.java

kfaraz reviewed Jul 22, 2023

server/src/main/java/org/apache/druid/metadata/IndexerSQLMetadataStorageCoordinator.java

kfaraz reviewed Jul 22, 2023

server/src/main/java/org/apache/druid/metadata/IndexerSQLMetadataStorageCoordinator.java

kfaraz requested changes Jul 22, 2023

jasonk000 added 2 commits July 22, 2023 12:00

(perf) server: batch SQL Metadata deleteSegments and updateSegmentMetadata

f040a03

pr feedback:
- extract batch update and delete data generation outside of the SQL transaction
- avoid a query altogether if there are no segments to add
- improvement to logging

(perf) server: batch SQL Metadata deleteSegments

1f462ff

removed the updateSegmentMetadata batching since it is dead code

kfaraz reviewed Jul 23, 2023

(perf) server: batch SQL Metadata deleteSegments

d28f768

pr feedback:
- improve logging accuracy
- restore missing newline

jasonk000 changed the title ~~(perf) server: batch SQL Metadata deleteSegments and updateSegmentMetadata~~ (perf) server: batch SQL Metadata deleteSegments Jul 23, 2023

kfaraz approved these changes Jul 23, 2023

kfaraz merged commit 54f29fe into apache:master Jul 23, 2023
75 checks passed

jasonk000 mentioned this pull request Jul 23, 2023

split KillUnusedSegmentsTask to processing in smaller chunks #14642

Merged

3 tasks

maytasm reviewed Jul 24, 2023

LakshSingla added this to the 28.0 milestone Oct 12, 2023

LakshSingla mentioned this pull request Nov 4, 2023

[DRAFT] 28.0.0 release notes #15326

Closed
