[Enhancement] Support cache delta lake delta log metadata #49069
fe/fe-core/src/main/java/com/starrocks/connector/delta/DeltaLakeMetastore.java
fe/fe-core/src/main/java/com/starrocks/connector/delta/DeltaLakeParquetHandler.java
    public CloseableIterator<ColumnarBatch> readParquetFiles(
            CloseableIterator<FileStatus> fileIter,
            StructType physicalSchema,
            Optional<Predicate> predicate) {
Why don't we use the predicate here?
We cannot cache the entire Parquet data if the predicate is applied while reading.
Is this the same as Iceberg, which caches all manifest file content?
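The trade-off discussed in this thread can be sketched as follows. This is a minimal illustration, not the code from this PR: the class `UnfilteredReadCache`, its `fullFileReader` function, and the use of `java.util.function.Predicate` (rather than the Delta Kernel `Predicate` type) are all assumptions made for the example. It shows why the cached entry has to hold the unfiltered rows: an entry keyed only by file path would be wrong for any later read that uses a different predicate, so filtering happens after the cache lookup.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical sketch: cache whole-file contents, filter only after the lookup. */
class UnfilteredReadCache<R> {
    // Keyed by file path only, so each entry must contain every row of the file.
    private final Map<String, List<R>> rowsByPath = new ConcurrentHashMap<>();
    // Stand-in for a real Parquet reader; reads the file without any predicate.
    private final Function<String, List<R>> fullFileReader;

    UnfilteredReadCache(Function<String, List<R>> fullFileReader) {
        this.fullFileReader = fullFileReader;
    }

    List<R> read(String path, Predicate<R> predicate) {
        // Load the complete file once; later calls reuse the cached rows.
        List<R> allRows = rowsByPath.computeIfAbsent(
                path, p -> List.copyOf(fullFileReader.apply(p)));
        // Apply the caller's predicate to the cached rows, leaving the entry intact.
        return allRows.stream().filter(predicate).collect(Collectors.toList());
    }
}
```

With this shape, two reads of the same file with different predicates both return correct results from a single load.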
fe/fe-core/src/main/java/com/starrocks/connector/delta/DeltaLakeCatalogProperties.java
fe/fe-core/src/main/java/com/starrocks/connector/delta/DeltaLakeJsonHandler.java
Quality Gate failed. See analysis details on SonarCloud.
[FE Incremental Coverage Report] ✅ pass : 161 / 173 (93.06%)
[BE Incremental Coverage Report] ✅ pass : 0 / 0 (0%)
@mergify backport branch-3.3
✅ Backports have been created
(cherry picked from commit edbab22)
Why I'm doing:
Querying a Delta Lake table requires reading the delta log, which can take a long time when there are many JSON/Parquet files. These metadata files do not change once written, so they are well suited to caching to avoid repeated reads.
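As a rough illustration of the idea, the sketch below caches the parsed contents of a log file by its path and never invalidates an entry, relying on the assumption stated above that these files do not change. The class `DeltaLogFileCache`, the `loader` function, and the `List<String>` row type are hypothetical names chosen for the example and are not taken from this PR. A production cache would also bound its size or expire entries, which is omitted here.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical sketch: caches parsed contents of immutable delta log files by path. */
class DeltaLogFileCache {
    private final Map<String, List<String>> contentsByPath = new ConcurrentHashMap<>();
    // Stand-in for the real JSON/Parquet reader; assumed to be slow.
    private final Function<String, List<String>> loader;
    // Counts how often the loader actually ran, to make cache hits observable.
    private final AtomicInteger loads = new AtomicInteger();

    DeltaLogFileCache(Function<String, List<String>> loader) {
        this.loader = loader;
    }

    List<String> get(String path) {
        // computeIfAbsent runs the loader at most once per path; since the file
        // is assumed never to change, the entry is never refreshed.
        return contentsByPath.computeIfAbsent(path, p -> {
            loads.incrementAndGet();
            return List.copyOf(loader.apply(p));
        });
    }

    int loadCount() {
        return loads.get();
    }
}
```

Repeated queries that touch the same log file then pay the read cost only on the first access.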
What I'm doing:
Fixes #issue
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: