Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TierBasedSegmentDirectoryLoader to keep segments in multi-datadir #9306

Merged
merged 7 commits into from
Oct 14, 2022

Conversation

klsince
Copy link
Contributor

@klsince klsince commented Aug 31, 2022

This PR is part of the work to support multi-datadir for Pinot server as tracked by issue.

TierBasedSegmentDirectoryLoader is added in this PR. It implements SegmentDirectoryLoader interface. It checks the target storage tier for the segment and move it from current tier to the target tier. The tier config is extended to include dataDir, so that one can define multi-tier for a table to use multiple datadirs to store segments.

TierBasedSegmentDirectoryLoader is not used unless set pinot.server.instance.segment.directory.loader to tierBased.

Following PRs will add Controller periodic task to compare segment current/target tiers and trigger segment reloading to move them to target tiers (datadirs) on Pinot servers.

@codecov-commenter
Copy link

codecov-commenter commented Aug 31, 2022

Codecov Report

Merging #9306 (9028f31) into master (2d520d4) will increase coverage by 42.73%.
The diff coverage is 85.41%.

@@              Coverage Diff              @@
##             master    #9306       +/-   ##
=============================================
+ Coverage     26.03%   68.77%   +42.73%     
- Complexity       44     5285     +5241     
=============================================
  Files          1922     1936       +14     
  Lines        102937   103474      +537     
  Branches      15657    15714       +57     
=============================================
+ Hits          26796    71160    +44364     
+ Misses        73435    27228    -46207     
- Partials       2706     5086     +2380     
Flag Coverage Δ
integration1 25.90% <24.47%> (-0.13%) ⬇️
unittests1 67.46% <84.88%> (?)
unittests2 15.58% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...not/segment/spi/loader/SegmentDirectoryLoader.java 0.00% <0.00%> (ø)
...va/org/apache/pinot/spi/utils/CommonConstants.java 24.00% <0.00%> (+24.00%) ⬆️
...nt/local/loader/DefaultSegmentDirectoryLoader.java 61.53% <16.66%> (+61.53%) ⬆️
...not/common/metadata/segment/SegmentZKMetadata.java 82.69% <33.33%> (+1.64%) ⬆️
...r/helix/SegmentOnlineOfflineStateModelFactory.java 56.43% <75.00%> (-2.06%) ⬇️
.../local/loader/TierBasedSegmentDirectoryLoader.java 82.92% <82.92%> (ø)
.../pinot/core/data/manager/BaseTableDataManager.java 75.97% <92.68%> (+15.19%) ⬆️
...server/starter/helix/HelixInstanceDataManager.java 71.30% <93.75%> (+1.73%) ⬆️
...che/pinot/common/utils/config/TierConfigUtils.java 70.45% <100.00%> (+66.75%) ⬆️
...indexsegment/immutable/ImmutableSegmentLoader.java 85.71% <100.00%> (+85.71%) ⬆️
... and 1349 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@Jackie-Jiang Jackie-Jiang added feature Configuration Config changes (addition/deletion/change in behavior) labels Sep 1, 2022
@klsince klsince force-pushed the multi_datadirs_segment_loader branch from cafb4e3 to 698813c Compare September 6, 2022 22:23
@klsince klsince changed the title MultiDirSegmentDirectoryLoader to keep segments in multi-datadir TierBasedSegmentDirectoryLoader to keep segments in multi-datadir Sep 6, 2022
@klsince klsince force-pushed the multi_datadirs_segment_loader branch from 698813c to e74b9a6 Compare September 6, 2022 23:48
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still confused on how the current tier and target tier is stored.
IIUC, the target tier is stored in ZK metadata, which can be modified by the controller periodic task (not implemented yet). Where is the current tier stored? When server restarts, it needs to be able to find the current tier (not from ZK metadata because that is the target tier), then find the local segment.

@klsince
Copy link
Contributor Author

klsince commented Sep 7, 2022

Where is the current tier stored?

It's set when immutable segment is loaded. TieredStorageLoader sets it after loading the segment.

ImmutableSegment.getTier() {
    return _segmentDirectory.getTier();
  }

TierBasedSegmentDirectoryLoader.load() {
...
   segmentDirectory.setTier(segmentTier);
...
}

When addOrReplaceSegment() is called, as there is no segment loaded yet, the segmentZKMetadata.tier is used to load the segment for the first time on server. When reloadSegment() is called, as the segment is loaded by server (and we're reloading it), the current tier tracked by the ImmutableSegment object is compared with segmentZKMetadata.tier to decide if there is need to move indexDir to a new dir before reloading the segment.

@Jackie-Jiang
Copy link
Contributor

When addOrReplaceSegment() is called, as there is no segment loaded yet, the segmentZKMetadata.tier is used to load the segment for the first time on server. When reloadSegment() is called, as the segment is loaded by server (and we're reloading it), the current tier tracked by the ImmutableSegment object is compared with segmentZKMetadata.tier to decide if there is need to move indexDir to a new dir before reloading the segment.

What will happen if the tier is updated when server is restarting? Server won't be able to find the old local segment, and the old segment will left as orphan and never be cleaned up.
I think we need to find a way to persist the current tier info to the disk so that when server restarts, it can find the local segment tier. Or we may search the previous tiers if we cannot find the local segment, and migrate the tier immediately if the local segment is in the data dir for another tier.

@klsince
Copy link
Contributor Author

klsince commented Sep 7, 2022

What will happen if the tier is updated when server is restarting? Server won't be able to find the old local segment, and the old segment will left as orphan and never be cleaned up.

Good question. Although will add a cleanup mechanism for orphan segments as part of this work, it's still good to find and reuse the existing segment on disk under such race condition, to avoid orphan segment and downloading raw segment.

But as to the ideas persist segment tier over server restarts or check all previous tiers, they are subject to another race condition: the dataDir for tier is updated in TierConfig in TableConfig when server restarts. So instead, we may simply persist the segmentDir path over restarts, and check it out when calling addOrReplaceSegment(). This check is supposed to be done in TierBasedSegmentDirectoryLoader as those race conditions are specific to it. Will think a bit more on this.

@Jackie-Jiang
Copy link
Contributor

But as to the ideas persist segment tier over server restarts or check all previous tiers, they are subject to another race condition: the dataDir for tier is updated in TierConfig in TableConfig when server restarts. So instead, we may simply persist the segmentDir path over restarts, and check it out when calling addOrReplaceSegment(). This check is supposed to be done in TierBasedSegmentDirectoryLoader as those race conditions are specific to it. Will think a bit more on this.

I thought about the problem of changing data dir, which is a destructive operation anyway (think of the case of server changing its root data dir). That leads me start thinking whether we should store the tier data dir in the table config. IMO that should be a server config because each server might have disk mounted under different path (not likely happen but just an example why it should be part of the server config). Basically the mapping from tier name to data dir should be within the server config, and the table config should only specify the tier name to use.

Persisting local tier requires more management overhead such as where to store this meta, when the update it, clean it up when segment is removed etc. That's why checking all previous tiers might be easier.

Segment deletion is another problem we need to think of. When segment is deleted, the ZK metadata is already gone. How do we find the local segment to remove?

@klsince klsince force-pushed the multi_datadirs_segment_loader branch from e512c4a to 737caca Compare September 21, 2022 23:59
@klsince
Copy link
Contributor Author

klsince commented Sep 22, 2022

Using a tier track file kept in the default segment dir to help find local segment and reuse them as much as possible. pls help take a look @Jackie-Jiang . I'm thinking to put the track file into tableDataDir/tmp/ directory to be cleaner, lemme know your thought.

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(MAJOR) the segment delete logic won't work with tier data dir. Please take a look at SegmentOnlineOfflineStateModelFactory.onBecomeDroppedFromOffline()

As for the tracking file management, I'm thinking to keep a file of name <segment>.tier under the table data dir. IMO it is easier to track than adding it into the segment data dir (think of the case when the segment dir is moved). When deleting the segment from the server, we should also delete the tier tracking file

@klsince
Copy link
Contributor Author

klsince commented Sep 23, 2022

(MAJOR) the segment delete logic won't work with tier data dir....
good point, will check it out.

As for the tracking file management, I'm thinking to keep a file of name <segment>.tier under the table data dir.
I like this

@klsince klsince force-pushed the multi_datadirs_segment_loader branch from 737caca to cfdd876 Compare October 12, 2022 21:06
@klsince klsince force-pushed the multi_datadirs_segment_loader branch from cfdd876 to c986805 Compare October 12, 2022 23:41
@klsince klsince force-pushed the multi_datadirs_segment_loader branch from 416ec66 to 8d91f3f Compare October 13, 2022 18:01
@Jackie-Jiang Jackie-Jiang added the release-notes Referenced by PRs that need attention when compiling the next release notes label Oct 14, 2022
@Jackie-Jiang Jackie-Jiang merged commit a8d95f2 into apache:master Oct 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Configuration Config changes (addition/deletion/change in behavior) feature release-notes Referenced by PRs that need attention when compiling the next release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants