Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Kernel] Add non-monotonic inCommitTimestamp with related table properties and table features #3276

Merged
merged 7 commits into from
Jun 27, 2024

Conversation

EstherBear
Copy link
Contributor

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

Add non-monotonic inCommitTimestamp with related table properties and table features to prepare for adding monotonic inCommitTimestamp in laters PRs.

How was this patch tested?

Add unit tests to verify the inCommitTimestamp and related table properties and table features when enabling the inCommitTimestamp enablement property.

Does this PR introduce any user-facing changes?

Yes, user can enable non-monotonic inCommitTimestamp by enabling its property.

@EstherBear EstherBear force-pushed the add-non-monotonic-ict branch 4 times, most recently from 77b6870 to bd001c9 Compare June 17, 2024 21:54
public Map<String, String> filterOutUnchangedProperties(Map<String, String> newProperties) {
Map<String, String> oldProperties = getConfiguration();
return newProperties.entrySet().stream()
.filter(entry -> !oldProperties.containsKey(entry.getKey()) ||
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TableConfig validation is case insensitive so shouldn't this also be case insensitive?

Copy link
Contributor Author

@EstherBear EstherBear Jun 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The input of this function is the validated properties returned by the validation function whose key is same as the one in TableConfig. And the property name in Metadata is also stored in the case from TableConfig. So I think it should be fine?

// (see [[ConflictChecker.checkNoMetadataUpdates]]), so we know that
// all winning transactions' ICT enablement status must match the read snapshot.
//
// WARNING: The Metadata() of InitialSnapshot can enable ICT by default. To ensure that
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually true in Kernel?

Copy link
Contributor Author

@EstherBear EstherBear Jun 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Kernel, the Metadata() of InitialSnapshot will not enable ICT by default. But the ICT can be enabled during the first commit using the withTableProperties API. I will update the comments.

val ver1Snapshot = table.getLatestSnapshot(engine).asInstanceOf[SnapshotImpl]
assert(ver1Snapshot.getProtocol.getWriterFeatures.contains("inCommitTimestamp-preview"))
assert(
getInCommitTimestamp(engine, table, 1).get >= beforeCommitAttemptStartTime)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to eventually have a version of transaction builder that takes in a Clock so that this can be tested more thoroughly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I will add it in the next PR. Or do you think I should add the Clock in this PR?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just saw that you added it in this PR: #3282. No need to shift it to this one.

@@ -137,8 +176,125 @@ trait DeltaTableWriteSuiteBase extends AnyFunSuite with TestUtils {
Optional.empty()
}

def generateData(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this been copied as is?

Copy link
Contributor Author

@EstherBear EstherBear Jun 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I just moved it from DeltaTableWriteSuite to DeltaTableWriteSuiteBase so that both DeltaTableWriteSuite and InCommitTimestampSuite can use it.

Comment on lines 170 to 171
setTablePropAndVerify(
engine, tablePath, isNewTable = false, IN_COMMIT_TIMESTAMPS_ENABLED, "false", false)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this working? Shouldn't we throw error if the ICT is disabled?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dhruvarya-db how does Delta-Spark handles when the ICT is disabled?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disabling ICT by setting the enablement property to false is possible even in Delta-Spark. Do you see any problems with this? Disabling ICT this way does not remove it from the protocol though.

Copy link
Collaborator

@vkorukanti vkorukanti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

One pending comment is: how are you verifying the tables with ICT enabled are readable (i.e we are not writing an unreadable table)?

@@ -53,7 +55,7 @@ public static Protocol fromColumnVector(ColumnVector vector, int rowId) {
private final int minReaderVersion;
private final int minWriterVersion;
private final List<String> readerFeatures;
private final List<String> writerFeatures;
private List<String> writerFeatures;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert?


val ver0Snapshot = table.getLatestSnapshot(engine).asInstanceOf[SnapshotImpl]
assert(getInCommitTimestamp(engine, table, 0).get >= beforeCommitAttemptStartTime)
assert(ver0Snapshot.getProtocol.getWriterFeatures.contains("inCommitTimestamp-preview"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertHasWriterFeature(ver0Snapshot, "inCommitTimestamp-preview"

also another utility method assertHasNoWriterFeature

@EstherBear
Copy link
Contributor Author

next #3282

Copy link
Collaborator

@vkorukanti vkorukanti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR!

@vkorukanti vkorukanti merged commit c395a63 into delta-io:master Jun 27, 2024
10 checks passed
@@ -120,11 +125,22 @@ public TransactionCommitResult commit(Engine engine, CloseableIterable<Row> data
"Transaction is already attempted to commit. Create a new transaction.");

long commitAsVersion = readSnapshot.getVersion(engine) + 1;
CommitInfo attemptCommitInfo = generateCommitAction(engine);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conflict resolution happens here and I think I've added a TODO comment.

@@ -147,17 +157,43 @@ public TransactionCommitResult commit(Engine engine, CloseableIterable<Row> data
throw new ConcurrentWriteException();
}

private void updateMetadata(Metadata metadata) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update Metadata

@@ -123,6 +134,93 @@ public static void validateWriteSupportedTable(
}
}

/**
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update Protocol

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants