
Track the execution state of every integration connection on the backend #1220

Merged: 2 commits merged into main from eng-2782-track-the-status-of-a-resource on Apr 20, 2023

Conversation

@kenxu95 kenxu95 commented Apr 17, 2023

@likawind as lead reviewer. @hsubbaraj-spiral as secondary to maybe help answer Qs. The meat of the changes is in connect_integration.go.

Describe your changes and why you are making these changes

Conda and Lambda integrations already manage the integration lifecycle from registration and setup through completion. This PR makes the same pattern available to all integrations. This means that when we register an integration:

  1. An "exec_state" key in the pending state is always written to the initial integration table entry.
  2. This integration table entry will be updated to the appropriate terminal state. For all non-conda and non-lambda integrations, this happens synchronously.

The list_integrations endpoint now returns the execution state instead of the validated flag. The validated column is dropped from the integrations table, as it is no longer useful. To keep this PR well scoped, we keep the data model of everything else (e.g. the createdAt column) unchanged.
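For illustration only, here is a minimal Go sketch of that two-step write for the synchronous case, assuming hypothetical IntegrationRepo and validateConnection helpers rather than the project's actual repository interfaces:

```go
package handler

import (
	"context"
	"errors"
	"fmt"
)

// ExecStatus mirrors the states described above; names are illustrative.
type ExecStatus string

const (
	Pending   ExecStatus = "pending"
	Succeeded ExecStatus = "succeeded"
	Failed    ExecStatus = "failed"
)

// IntegrationRepo is a hypothetical stand-in for the real integration repository.
type IntegrationRepo interface {
	CreateWithState(ctx context.Context, name string, s ExecStatus) (id string, err error)
	UpdateState(ctx context.Context, id string, s ExecStatus) error
}

// registerIntegration always records a pending exec_state first, then moves the
// same entry to a terminal state. For everything except Conda and Lambda, the
// terminal write happens synchronously, before the handler returns.
func registerIntegration(ctx context.Context, repo IntegrationRepo, name string) error {
	id, err := repo.CreateWithState(ctx, name, Pending) // step 1: pending entry
	if err != nil {
		return err
	}

	if err := validateConnection(ctx, name); err != nil {
		_ = repo.UpdateState(ctx, id, Failed) // step 2a: synchronous failure
		return fmt.Errorf("connecting %s: %w", name, err)
	}
	return repo.UpdateState(ctx, id, Succeeded) // step 2b: synchronous success
}

// validateConnection is a placeholder for the service-specific connection check.
func validateConnection(_ context.Context, name string) error {
	if name == "" {
		return errors.New("missing integration name")
	}
	return nil
}
```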

This is what the database now looks like (after a couple of successful connections and a Lambda integration that failed to set up):

[screenshot: integrations table]

This is what the describe() and list_integrations() SDK methods now look like:

[screenshots: describe() and list_integrations() output]

As an immediate follow-up, I make the SDK error if we attempt to preview or publish against any integration that has not been successfully registered. Note that this change is purely on the SDK and backend, and has no effect on our contract with the UI (failed integration cards will still show up normally for now).

Related issue number (if any)

ENG-2782

Loom demo (if any)

Checklist before requesting a review

  • I have created a descriptive PR title. The PR title should complete the sentence "This PR...".
  • I have performed a self-review of my code.
  • I have included a small demo of the changes. For the UI, this would be a screenshot or a Loom video.
  • If this is a new feature, I have added unit tests and integration tests.
  • I have run the integration tests locally and they are passing.
  • I have run the linter script locally (See python3 scripts/run_linters.py -h for usage).
  • All features on the UI continue to work correctly.
  • Added one of the following CI labels:
    • run_integration_test: Runs integration tests
    • skip_integration_test: Skips integration tests (Should be used when changes are ONLY documentation/UI)

// TODO(ENG-2523): move base conda env creation outside of ConnectIntegration.
if args.Service == shared.Conda {
	condaDB, err := database.NewDatabase(DB.Config())
@kenxu95 kenxu95 (Contributor, Author) commented Apr 17, 2023

@likawind @hsubbaraj-spiral why do we define a new database here (and in lambda)? I copied this from existing code. I would like to document the reason here.

Contributor

You can't pass the DB object to a separate goroutine; otherwise it will error, because the underlying connection is lost once the parent scope that created the DB object goes away. Here, you'd need to call NewDatabase as part of the go func() to ensure you initialize the DB and use it within the same scope.
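A minimal sketch of that scope rule, using database/sql with a SQLite driver as a stand-in for the project's database.NewDatabase(DB.Config()): the handle used by the async work is opened inside the go func() so its lifetime belongs to the goroutine rather than to the request handler.

```go
package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/mattn/go-sqlite3" // hypothetical driver choice for the sketch
)

// setupAsync stands in for the long-running Conda/Lambda setup work.
func setupAsync(db *sql.DB) error {
	return db.Ping()
}

func connectIntegration(dsn string) {
	// Per the comment above, the handler-scoped DB object is torn down once the
	// request returns, so the async path opens its own handle inside go func()
	// and uses it entirely within that scope.
	go func() {
		asyncDB, err := sql.Open("sqlite3", dsn) // analogous to database.NewDatabase(DB.Config())
		if err != nil {
			log.Printf("failed to open DB for async setup: %v", err)
			return
		}
		defer asyncDB.Close()

		if err := setupAsync(asyncDB); err != nil {
			log.Printf("async setup failed: %v", err)
		}
	}()
}

func main() {
	connectIntegration("file:server.db?cache=shared")
	time.Sleep(time.Second) // toy example only: give the goroutine time to finish
}
```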

Contributor

I think both testing and editing an integration should update the exec_state. We can track the source of the change so we can display "Tested on" vs "Edited on" vs "Created on" timestamps. What do y'all think? cc: @vsreekanti This ofc means that testing an integration and finding out that it is failing may cause SDK operations that are using that integration to fail. But tbh that seems like proper behavior, since execution is going to fail anyway (it'll just fail earlier now).

Maybe "Last Verified" would make sense here. Or even have "Last Used" and add an entry to the table each time the integration is invoked. That would give a sense of stability even without using the test-integration button.

kenxu95 (Contributor, Author)

I like the "Last Used" idea, yeah. By making an entry to the table each time, you mean inserting an entry into the integrations table?

@likawind likawind (Contributor) left a comment

Overall looks good! Have you verified with conda or lambda to ensure we have no regression in async cases?

Blocking mainly for the part initializing the DB in goroutines. I have a few other comments, which can be addressed as follow-ups:

  • Why don't we add a new column to track exec state if it applies to all integrations? This reduces the need to deserialize the values.
  • I don't have the impression that you can drop a column in SQLite, but if tests are passing it's probably fine.
  • I assume this information will show up in the UI in the new project. If it does, what do we do if the user clicks 'test connect integration'? Should the exec state be updated if the test-connect failed?

src/golang/cmd/server/handler/connect_integration.go (outdated; resolved)
return
condaErr := setupCondaAsync(integrationRepo, integrationObject.ID, publicConfig, condaDB)
if condaErr != nil {
	log.Errorf("Conda setup failed: %v", condaErr)
Contributor

Where do we handle updating the failure exec state here?

kenxu95 (Contributor, Author)

Inside setupCondaAsync()

// ... update that entry to reflect the failure.
defer func() {
	if err != nil {
		execution_state.UpdateOnFailure(
Contributor

Just to confirm my understanding - this is only useful for sync connections?

kenxu95 (Contributor, Author)

Yep, I'll make a note of that.

@kenxu95 kenxu95 (Contributor, Author) commented Apr 17, 2023

Actually, no, that's not true. It's only useful for the synchronous connection part for all integrations. Even Lambda and Conda have additional setup that is done after this. Since the goroutines are the last thing started in this method, we can be assured that there are no update races between sync and async. I'll document that.
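A short sketch of that ordering under hypothetical helper names (not the actual handler code): the deferred failure update only guards the synchronous section, and the async goroutine is the last thing started, so the two never write exec_state concurrently.

```go
package handler

import "context"

// Hypothetical stubs standing in for the repo / execution_state helpers.
func writePendingEntry(ctx context.Context, svc string) error    { return nil }
func markSucceeded(ctx context.Context, svc string) error        { return nil }
func updateOnFailure(ctx context.Context, svc string, err error) {}
func setupAsync(svc string)                                      {}

func connectIntegration(ctx context.Context, svc string) (err error) {
	// Guards only the synchronous section below: if err is non-nil when the
	// method returns, the entry is flipped to a failed terminal state.
	defer func() {
		if err != nil {
			updateOnFailure(ctx, svc, err) // stand-in for execution_state.UpdateOnFailure(...)
		}
	}()

	if err = writePendingEntry(ctx, svc); err != nil {
		return err
	}

	if svc == "conda" || svc == "lambda" {
		// Started only after every synchronous write, so from here on the
		// goroutine is the sole writer of this entry's exec_state and cannot
		// race with the synchronous path above.
		go setupAsync(svc)
		return nil
	}

	return markSucceeded(ctx, svc)
}
```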

@kenxu95 kenxu95 (Contributor, Author) commented Apr 17, 2023

Why don't we add a new column to track exec state if it applies to all integrations? This reduces the need to deserialize the values.

I initially made a separate table for connections called integration_connections. Then I realized that we only needed the exec_state of the latest connection, so I did what you're suggesting and made a new exec_state column. Then I saw all the code for Conda and Lambda and realized it would take quite a bit of effort to get everything backfilled correctly and working the same way for all our integrations, so I took the easier route of reusing the execution state helpers and the config column for now. I can make a task for migrating this to its own exec_state column as a follow-up, but I don't see it as completely necessary right now.
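A small sketch of what reusing the config column looks like, assuming an illustrative ExecState struct and an "exec_state" key, and showing the extra deserialization cost likawind is pointing at:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// ExecState is an illustrative shape for what gets stored under the key.
type ExecState struct {
	Status    string    `json:"status"` // "pending", "succeeded", or "failed"
	UpdatedAt time.Time `json:"updated_at"`
}

// setExecState serializes the state under an "exec_state" key in the same map
// that already backs the integration's config column.
func setExecState(config map[string]string, state ExecState) error {
	raw, err := json.Marshal(state)
	if err != nil {
		return err
	}
	config["exec_state"] = string(raw)
	return nil
}

// getExecState shows the deserialization cost being discussed: every reader
// has to unmarshal the blob instead of selecting a typed column.
func getExecState(config map[string]string) (ExecState, error) {
	var state ExecState
	raw, ok := config["exec_state"]
	if !ok {
		return state, fmt.Errorf("no exec_state recorded")
	}
	err := json.Unmarshal([]byte(raw), &state)
	return state, err
}

func main() {
	config := map[string]string{"host": "localhost"}
	_ = setExecState(config, ExecState{Status: "pending", UpdatedAt: time.Now()})

	state, err := getExecState(config)
	if err != nil {
		panic(err)
	}
	fmt.Println(state.Status) // "pending"
}
```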

I don't have the impression that you can drop a column in SQLite, but if tests are passing it's probably fine.

I think you can; we have other migrations that do this.
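For reference, a hedged sketch of such a drop-column migration in Go; the driver choice, table name, and database path are assumptions, not the project's actual migration code. SQLite supports ALTER TABLE ... DROP COLUMN natively only from version 3.35.0; older versions need the create-new-table / copy / rename approach.

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/mattn/go-sqlite3" // hypothetical driver choice for the sketch
)

// dropValidatedColumn drops the now-unused validated column (table name assumed).
func dropValidatedColumn(db *sql.DB) error {
	_, err := db.Exec(`ALTER TABLE integration DROP COLUMN validated;`)
	return err
}

func main() {
	db, err := sql.Open("sqlite3", "server.db") // hypothetical database path
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := dropValidatedColumn(db); err != nil {
		log.Fatalf("migration failed: %v", err)
	}
}
```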

I assume this information will show up in the UI in the new project. If it does, what do we do if the user clicks 'test connect integration'? Should the exec state be updated if the test-connect failed?

I think both testing and editing an integration should update the exec_state. We can track the source of the change so we can display "Tested on" vs "Edited on" vs "Created on" timestamps. What do y'all think? cc: @vsreekanti This ofc means that testing an integration and finding out that it is failing may cause SDK operations that are using that integration to fail. But tbh that seems like proper behavior, since execution is going to fail anyway (it'll just fail earlier now).

@kenxu95 kenxu95 requested a review from likawind April 17, 2023 23:28
@kenxu95 kenxu95 (Contributor, Author) commented Apr 17, 2023

@likawind addressed the database thing. Good to know! I haven't fully tested it E2E but will do that soon.

@likawind likawind (Contributor) left a comment

This PR LGTM!

@likawind likawind (Contributor)

If there's too much work, it makes sense to keep exec_state in the dict.

Generally agree with your proposals on test-connection. Following that, should we update integrations to the failure state if we detect that such a connection failed during operator execution?

@kenxu95 kenxu95 force-pushed the eng-2782-track-the-status-of-a-resource branch from bbd06a7 to b4d3694 on April 18, 2023 15:41
some fixes

shoot. Maybe it's better to keep it two tables

moved it to an integration_connection table, because of editing

the migration works. Now to worry about the read and write paths

added the drop column one

ok now I'm confused. Unwinding back to a single table approach.

there are so many things wrong with this but at least I see a path through now.

made some progress. Write path internals in a good state

unwound a bunch of stuff

prune all the stuff that is unnecessary now

Moved the transaction back inside

reverted a bunch of stuff back, still need to drop validated column

added validated column drop. Ready for testing now.

removed sdk changes

last things, lint and block usage of broken integrations.

lint

done

small formatting fixes

fixed gh actions

made exec_state optional on SDK

cr

fixed err

removed elem

update backend
@kenxu95 kenxu95 force-pushed the eng-2782-track-the-status-of-a-resource branch from f516f10 to fb1047e on April 19, 2023 15:33
@kenxu95 kenxu95 added the run_integration_test Triggers integration tests label Apr 19, 2023
@kenxu95 kenxu95 merged commit 1bcf1fc into main Apr 20, 2023
@kenxu95 kenxu95 deleted the eng-2782-track-the-status-of-a-resource branch April 20, 2023 17:30