Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make provider reconnectable #4591

Merged
merged 4 commits into from
Sep 9, 2024
Merged

make provider reconnectable #4591

merged 4 commits into from
Sep 9, 2024

Conversation

jaym
Copy link
Contributor

@jaym jaym commented Aug 22, 2024

Makes providers reconnectable. Keeps track of all connect requests to a provider. If it crashes, and is restarted, all the connect requests are replayed. Disconnected connections are also recreated because they can be a parent connect, and its non-existence can cause a child to fail to connect

Copy link
Contributor

github-actions bot commented Aug 22, 2024

Test Results

3 098 tests  +1   3 097 ✅ +1   1m 18s ⏱️ -4s
  370 suites ±0       1 💤 ±0 
   28 files   ±0       0 ❌ ±0 

Results for commit 29fa5a8. ± Comparison against base commit 483ba82.

♻️ This comment has been updated with latest results.

@jaym jaym force-pushed the jdm/reconnect-provider branch 2 times, most recently from 3fffdb3 to 8a55228 Compare August 23, 2024 15:35
jaym added a commit to mondoohq/cnspec that referenced this pull request Aug 23, 2024
@jaym jaym marked this pull request as ready for review August 23, 2024 16:03
@imilchev
Copy link
Member

I think this is going to be leaking connections for long running instance (serve mode). If we always reconnect all connections, they won't be closed properly. The scanner will only close connections that were just scanned.

I think this may be solved easily. If you remove the connectRequest when Disconnect is called. We call Disconnect when we "gracefully" shut down the connection. If that happens, it should be safe to drop the connectRequest too for this connection. Parent - child connections are tracked already so it's safe to assume we won't call Disconnect for a parent before the child is scanned

@jaym jaym force-pushed the jdm/reconnect-provider branch 2 times, most recently from aa0e94b to 6be847a Compare August 26, 2024 17:25
@jaym
Copy link
Contributor Author

jaym commented Aug 26, 2024

I've updated things so that we keep track of which connections rely on which other connections. This information is used to reconnect. If we disconnect a connection and something else relies on it, we keep around the reconnection information until the thing that relies on it is disconnected

Copy link
Member

@imilchev imilchev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One general question that I have is what happens in the following case:

  1. I connect to the provider with a root asset
  2. I connect to the provider with child asset of that root
  3. I scan the root
  4. Root connection is closed
  5. Provider crashes
  6. We trigger a restart + reconnect

After step 6 is done what connection will be open for the restarted provider?

}

type connectionGraph struct {
nodes map[uint32]connectionGraphNode
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add some more comments what we store in these properties? It's not immediately clear for me when I see it, so it may help us in case we need to change this in the future

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would also help to describe in text how this is intended to work and put it in a comment:

  • when do we add connections to the graph?
  • when do we remove connections from the graph?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will add comments here

@jaym
Copy link
Contributor Author

jaym commented Aug 27, 2024

One general question that I have is what happens in the following case:

1. I connect to the provider with a root asset

2. I connect to the provider with child asset of that root

3. I scan the root

4. Root connection is closed

5. Provider crashes

6. We trigger a restart + reconnect

After step 6 is done what connection will be open for the restarted provider?

So assuming there is another asset that needs to be scanned, and it has a connection that depends on the root, first the root connection is recreated, and then the next assets connection. mondoohq/cnspec#1412 has the change to try to make sure the providers are up before starting an asset scan.

The root connection is then disconnected automatically once all the connections that depend on it are disconnected

@jaym jaym requested a review from chris-rock September 6, 2024 18:34
@jaym jaym merged commit 5150cf4 into main Sep 9, 2024
15 checks passed
@jaym jaym deleted the jdm/reconnect-provider branch September 9, 2024 10:18
@github-actions github-actions bot locked and limited conversation to collaborators Sep 9, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants