[SEDONA-632] Use direct committer when writing raster files using df.write.format("raster")
#1528
Did you read the Contributor Guide?
Is this PR related to a JIRA ticket?
[SEDONA-XXX] my subject
What changes were proposed in this PR?
Writing a large number of raster files to a distributed file system or object store is very slow, because the output committer has to move every file from its temporary location to its target location. Users see that all tasks have completed, but the driver remains stuck in the commit phase.
This patch adds an option, useDirectCommitter, to the raster format. By default, useDirectCommitter is true, and the raster format uses a direct committer that writes raster files directly to their target locations. Users can manually set it to false if they want the original behavior.
How was this patch tested?
Passing existing tests
Did this PR include necessary documentation updates?
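For reference, the useDirectCommitter option described under "What changes were proposed in this PR?" could be set from PySpark roughly as follows. This is a minimal sketch, not code from the patch: write_rasters is a hypothetical helper, and df is assumed to be a DataFrame already prepared for the raster writer. Only the format name and the option name come from this PR.

```python
def write_rasters(df, output_path, use_direct_committer=True):
    """Write df using the raster format, toggling useDirectCommitter.

    Hypothetical helper for illustration; df is assumed to be a Spark
    DataFrame whose columns already suit the raster writer.
    """
    (
        df.write.format("raster")
        # Writer options are passed as strings; "true" is the default per this PR.
        .option("useDirectCommitter", "true" if use_direct_committer else "false")
        .save(output_path)
    )
```

Calling write_rasters(df, output_path, use_direct_committer=False) would request the original behavior, in which files are first written to temporary locations and moved during the commit phase.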