-
Notifications
You must be signed in to change notification settings - Fork 532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tests to check for DDSs sending duplicate attach ops on handle encode #20995
Conversation
packages/test/test-end-to-end-tests/src/test/stashedOps.spec.ts
Outdated
Show resolved
Hide resolved
const dataObjectB = (await dataStoreB.entryPoint.get()) as ITestFluidObject; | ||
idB = dataObjectB.context.id; | ||
|
||
cellRoot.set(dataObjectB.handle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we need a test per dds, as grouping them could miss problems, as if any dds binds the handle it will work right, so we miss broken dds.
i would imagine this test using a for loop to run the same test for each dds.
for(const storeHandle in [...]){
it(`store handle in dds: $(ddsName)`,()=>{
//setup the detached stuff
storeHandle(handle)
// verify
}
}
maybe the loop would run over an array of funtion that take the defaultDataStore, and store in the dds like:
(defaultDataStore, handle)=>{
const cellRoot = await defaultDataStore.getSharedObject(cellId);
cellRoot.set(handle);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, my impression was that the interesting part was not storing dataObjectB.handle
, but rather the handle from B's root
to the other DDSes in B. So the test case would be to store a handle from a particular DDS to some other DDS - all in a DataStore that is currently detached.
const defaultDataStore = (await container.getEntryPoint()) as ITestFluidObject; | ||
const runtime = defaultDataStore.context.containerRuntime; | ||
|
||
const dataStoreB = await runtime.createDataStore(["default"]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest using a simpler container schema / default data store, to be more explicit in the test code itself about what's going on. IIRC, the key part of the setup is that dataStoreB has internal handled between its DDSes (in a Map). Better to have that be explicit in the test code itself.
Probably will get here anyway by following Tony's suggestions below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will probably be resolved in a follow up, just so we can have some coverage immediately.
[cellId, SharedCell.getFactory()], | ||
[counterId, SharedCounter.getFactory()], | ||
[directoryId, SharedDirectory.getFactory()], | ||
// [treeId, SharedTree.getFactory()], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the tree implementations are not commented out, I get an error that SharedTree is invalid or not implemented. It might have to do with the fact that SharedTree is not in the test APIs, but I'm not sure yet.
packages/test/test-end-to-end-tests/src/test/handleValidation.spec.ts
Outdated
Show resolved
Hide resolved
packages/test/test-end-to-end-tests/src/test/handleValidation.spec.ts
Outdated
Show resolved
Hide resolved
packages/test/test-end-to-end-tests/src/test/handleValidation.spec.ts
Outdated
Show resolved
Hide resolved
packages/test/test-end-to-end-tests/src/test/handleValidation.spec.ts
Outdated
Show resolved
Hide resolved
packages/test/test-end-to-end-tests/src/test/handleValidation.spec.ts
Outdated
Show resolved
Hide resolved
packages/test/test-end-to-end-tests/src/test/handleValidation.spec.ts
Outdated
Show resolved
Hide resolved
await waitForContainerConnection(container2); | ||
const default2 = (await container2.getEntryPoint()) as ITestFluidObject; | ||
|
||
const handleB = (await handle.readHandle(default2)) as IFluidHandle<ITestFluidObject>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should avoid casting. in the stress test we use isFluidHandle, we should use that here like assert(isFuildHandle(handle2))
rather than casting
Co-authored-by: Tony Murphy <anthony.murphy@microsoft.com>
packages/test/test-end-to-end-tests/src/test/handleValidation.spec.ts
Outdated
Show resolved
Hide resolved
packages/test/test-end-to-end-tests/src/test/handleValidation.spec.ts
Outdated
Show resolved
Hide resolved
packages/test/test-end-to-end-tests/src/test/handleValidation.spec.ts
Outdated
Show resolved
Hide resolved
packages/test/test-end-to-end-tests/src/test/handleValidation.spec.ts
Outdated
Show resolved
Hide resolved
packages/test/test-end-to-end-tests/src/test/handleValidation.spec.ts
Outdated
Show resolved
Hide resolved
remove unnecessary code
…ed (#20998) the primary change is to move getting the attachment summary before transitioning to attached for datastores. This also matches how the container works, specifically, capture the summary, and then transition to attaching. Since this code is all synchronous the risk of something bad happening is small, or about the same as before. Regardless of the order, having the summary retrieval decoupled from the state transition leaves open the possibility for bugs. With this change it is possible the dds state is modified after the summary is captured, so it won't be in the summary or in ops, only in memory. Conversely, in the existing code if a change happened after the attach state changed it would get an op, but it would also be in the summary, so it gets duplicated. Both of these are data corruption. The better and much bigger fix would be to get the summary as part of the transition, so as each ddses attachment summary is retrieved it then transition to attaching. this would be a substantial refactoring, and due to the code being synchronous the opportunity for these types of issues is small, and would likely represent a framework issue. Testing done as part of #20995 --------- Co-authored-by: Tony Murphy <anthonm@microsoft.com>
} | ||
|
||
const container2 = await provider.loadTestContainer(testContainerConfig); | ||
if (container2.deltaManager.lastSequenceNumber < seq) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI this code already exists here: https://github.com/microsoft/FluidFramework/tree/main/packages/loader/container-loader/src/catchUpMonitor.ts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, almost... it doesn't support passing a custom target in. Oh well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But maybe if you do provider.ensuresynchronized before loading container2, it would work
…ed (microsoft#20998) the primary change is to move getting the attachment summary before transitioning to attached for datastores. This also matches how the container works, specifically, capture the summary, and then transition to attaching. Since this code is all synchronous the risk of something bad happening is small, or about the same as before. Regardless of the order, having the summary retrieval decoupled from the state transition leaves open the possibility for bugs. With this change it is possible the dds state is modified after the summary is captured, so it won't be in the summary or in ops, only in memory. Conversely, in the existing code if a change happened after the attach state changed it would get an op, but it would also be in the summary, so it gets duplicated. Both of these are data corruption. The better and much bigger fix would be to get the summary as part of the transition, so as each ddses attachment summary is retrieved it then transition to attaching. this would be a substantial refactoring, and due to the code being synchronous the opportunity for these types of issues is small, and would likely represent a framework issue. Testing done as part of microsoft#20995 --------- Co-authored-by: Tony Murphy <anthonm@microsoft.com>
…code (microsoft#20995) Adds end to end tests to validate that each DDS does not send duplicate attach ops when handles are encoded. This bug was added in microsoft#19242, and deals with a second attach op being sent when there is a handle from one detached DDS to another and the DDS is then attached. [AB#7936](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/7936) --------- Co-authored-by: Mark Fields <markfields@users.noreply.github.com> Co-authored-by: Tony Murphy <anthony.murphy@microsoft.com>
const dataStoreB = await runtime.createDataStore(["default"]); | ||
const dataObjectB = (await dataStoreB.entryPoint.get()) as ITestFluidObject; | ||
|
||
await handle.storeHandle(defaultDataStore, dataObjectB.handle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jzaffiro - This is a different test case than what I remember seeing last week.
This:
- DefaultDataStore is attached (of course). Create DataStoreB (it's detached).
- DefaultDataStore's DDS of each type stores a handle to DataStoreB (thus attaching it)
What I remember:
- DefaultDataStore is attached (of course). Create DataStoreB (it's detached).
- DataStoreB's DDS of each type stores a handle to another DDS within DataStoreB (all the while it's detached)
- Then store a handle to DataStoreB from DefaultDataStore's root -- now DataStoreB (and all its children) attach.
Did you confirm that these tests fail without the bindHandles call and without Tony's fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests do fail without calling bindHandles and without Tony's fix. I also might have gotten confused on the test scenario, but looking at the commit we protoyped it seems like the test case was:
- DefaultDataStore is attached (of course). Create DataStoreB (it's detached).
- Store a handle to DataObjectB in the root of DefaultDataStore (effectively storing it in a map)
I planned on doing a follow up anyway, so I can add the second test case you point out in that PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's great to know you verified the test coverage.
As for the other test case - you're right that this is what the test was doing. But when we were debugging, it was the test fluid object initialization for DataStoreB that was storing handles between DDSes, and that was the part that was causing multiple attach ops. Super subtle.
My concern is that all these test cases are actually just testing Map under the hood, because it's in DataStoreB's root Map where the handles are stored that actually matter.
cc @anthony-murphy as FYI
## Description Adds back `ConsensusQueue` to the handle validation tests. The test had a few issues preventing it from working with ConsensusQueue. The direct cause for the hang that the original bug reported is that container1 was disposed before attempting to read the handle, and ConsensusQueue requires ops to round-trip before its data can be read. While working on this, I also refactored the affected test and suite a fair bit for clarity, and tweaked what the test case was doing. I split it into a couple different cases based on what the handle in the DDS was referencing, as [previous discussion](#20995 (comment)) on the topic mentioned the problematic case was something not being explicitly tested. [AB#8058](https://dev.azure.com/fluidframework/235294da-091d-4c29-84fc-cdfc3d90890b/_workitems/edit/8058) --------- Co-authored-by: Abram Sanderson <absander@microsoft.com>
Adds end to end tests to validate that each DDS does not send duplicate attach ops when handles are encoded. This bug was added in #19242, and deals with a second attach op being sent when there is a handle from one detached DDS to another and the DDS is then attached.
AB#7936