
model: Connecting to DBs that don't contain tables that are in the model should not be an error. #235

Open
dceara opened this issue Oct 5, 2021 · 10 comments · May be fixed by #244 or #257


dceara commented Oct 5, 2021

errors = append(errors, fmt.Errorf("database model contains a model for table %s that does not exist in schema", tableName))

This should probably be logged as a warning (at most), and the table should optionally be removed from the model. Otherwise it breaks all upgrade scenarios in which clients are upgraded before the database. Even worse, the client might have an external way of checking whether a table actually exists on the remote server (e.g., by running an ovn-nbctl list <table> command) and adapting its execution based on that information, completely avoiding the nonexistent table.

An example of such a scenario can be found here; the client fails to validate the schema every time it connects to the remote server and crash-loops, even though it's perfectly able to function when the table is missing on the server side:

F1004 16:31:22.456301      34 ovnkube.go:131] error when trying to initialize libovsdb NB client: failed to connect to tcp:172.18.0.3:6641: database OVN_Northbound validation error (3): Mapper Error. Object type nbdb.LogicalRouter contains field LoadBalancerGroup ([]string) ovs tag load_balancer_group: Column does not exist in schema. database model contains a model for table Load_Balancer_Group that does not exist in schema. Mapper Error. Object type nbdb.LogicalSwitch contains field LoadBalancerGroup ([]string) ovs tag load_balancer_group: Column does not exist in schema

CC: @amorenoz


dcbw commented Oct 5, 2021

@dave-tucker


amorenoz commented Oct 5, 2021

We added that verification because the error handling later on was more complicated. But now that the library is more mature we can relax this.

For missing tables we can remove the table from the model and warn. All API calls with that model will return an ErrWrongType. I'd say we should return a specific error type so clients can handle this case more easily.

Should we also handle other types of discrepancies more robustly, e.g., an extra/missing/different column?
The mapper.Info is built using the column object, and having fewer fields (columns) on models has been supported from day 0, so in theory (I haven't looked at the code) we should be able to ignore problematic columns....

dave-tucker commented:

Even worse, the client might have an external way of checking whether a table actually exists on the remote server (e.g., by running an ovn-nbctl list <table> command) and adapting its execution based on that information, completely avoiding the nonexistent table.

That's precisely the sort of scenario we wrote this library to avoid - why use ovn-nbctl and libovsdb together 🤔

We added that verification because the error handling later on was more complicated. But now that the library is more mature we can relax this.

I agree with Adrian here, but we need to do this thoughtfully.

The intended way of supporting two different versions of a schema was to let the library user handle it. Specifically, one of:

  • Create your own DatabaseModel with just the tables/columns you care about and ignore others
  • Update your database first, then update your clients with the new schema
  • Use modelgen to generate bindings for all versions you plan to support, update your clients (with error handling so you can speak old and new schema), then update your server

If we're relaxing verification, we'll need to be clear on what the compatibility guarantees are and for how long they are valid: backwards only (i.e., the DB is older than the schema), forwards only (i.e., the DB is newer than the schema), or both. Is that something that applies just for an upgrade period, or could it be in place longer term?

For missing tables we can remove the table from the model and warn. All API calls with that model will return an ErrWrongType.

Missing tables should be a simple case. I'd say we should return a specific error type so clients can handle this case more easily.
Instead of removing the table from the DatabaseModel, we could mark it as unsupported and return a specific error, e.g., ErrMissingTable = "the server database version does not contain the table %s", if a client attempts to use it.

Should we also handle other types of discrepancies more robustly, e.g., an extra/missing/different column?
The mapper.Info is built using the column object, and having fewer fields (columns) on models has been supported from day 0, so in theory (I haven't looked at the code) we should be able to ignore problematic columns....

This is going to be a much harder case to solve for.
This was previously not possible because of the cache, but since we added logic to purge it on reconnect it's at least plausible.
Extra fields are fine, but missing ones are going to be weird.

I think a recent example was that OtherConfig was added to a table, and we actually needed to set options there for the control plane to work properly (could have been BFD if my memory serves). If you're dealing with a DB that doesn't have the OtherConfig column then you actually want to error out early.

Likewise, a change of column type would be another issue that you could plausibly handle at runtime, i.e., from a scalar type to a set of scalar types (or vice versa), but one of the nice things about updating your bindings using modelgen was that Go's type system would catch these errors.

Given the pitfalls outlined above (and significant level of effort that would be involved to do this properly) I do wonder whether using modelgen to support both schemas is the least-worst option.


amorenoz commented Oct 6, 2021

Should we also handle other types of discrepancies more robustly, e.g., an extra/missing/different column?
The mapper.Info is built using the column object, and having fewer fields (columns) on models has been supported from day 0, so in theory (I haven't looked at the code) we should be able to ignore problematic columns....

This is going to be a much harder case to solve for. This was previously not possible because of the cache, but since we added logic to purge it on reconnect it's at least plausible. Extra fields are fine, but missing ones are going to be weird.

I think a recent example was that OtherConfig was added to a table, and we actually needed to set options there for the control plane to work properly (could have been BFD if my memory serves). If you're dealing with a DB that doesn't have the OtherConfig column then you actually want to error out early.

Likewise, a change of column type would be another issue that you could plausibly handle at runtime, i.e., from a scalar type to a set of scalar types (or vice versa), but one of the nice things about updating your bindings using modelgen was that Go's type system would catch these errors.

I wouldn't try to handle changes at runtime. Just marking nonexistent/changed fields and tables as "not supported" and ignoring them or erroring out on transactions that use them would be much easier. However, the challenge I see is how to unambiguously let the client know what to expect from the library if only a subset of the fields/tables is supported. I think this is the same "weirdness" you express above.

Given the pitfalls outlined above (and significant level of effort that would be involved to do this properly) I do wonder whether using modelgen to support both schemas is the least-worst option.

Do you mean having the client decide which model to use in runtime or having the library support multiple (versioned) models?


dave-tucker commented Oct 6, 2021

However, the challenge I see is how to unambiguously let the client know what to expect from the library if only a subset of the fields/tables is supported. I think this is the same "weirdness" you express above.

Yes, that's my point. The only way to prevent this would be to indirect struct access through getter/setter functions that could return an error if the field wasn't supported. This is much less convenient than using the struct directly.

EDIT: in some ways the model declares the minimum required feature set for a connection... and I quite like how that is enforced at present through validation.

Do you mean having the client decide which model to use in runtime or having the library support multiple (versioned) models?

Having the client pick at runtime


dceara commented Oct 6, 2021

Even worse, the client might have an external way of checking whether a table actually exists on the remote server (e.g., by running an ovn-nbctl list <table> command) and adapting its execution based on that information, completely avoiding the nonexistent table.

That's precisely the sort of scenario we wrote this library to avoid - why use ovn-nbctl and libovsdb together 🤔

Sorry, my example wasn't good; you're right, there's no need to use ovn-nbctl. We can use libovsdb to check if a table is in the schema supported by the server, e.g., as suggested by @amorenoz:

if Client.Schema().Table("Load_Balancer_Group") == nil {
	// Handle the case where the table is not available on the server
}


dceara commented Oct 6, 2021

Should we also handle other types of discrepancies more robustly, e.g., an extra/missing/different column?
The mapper.Info is built using the column object, and having fewer fields (columns) on models has been supported from day 0, so in theory (I haven't looked at the code) we should be able to ignore problematic columns....

This is going to be a much harder case to solve for. This was previously not possible because of the cache, but since we added logic to purge it on reconnect it's at least plausible. Extra fields are fine, but missing ones are going to be weird.
I think a recent example was that OtherConfig was added to a table, and we actually needed to set options there for the control plane to work properly (could have been BFD if my memory serves). If you're dealing with a DB that doesn't have the OtherConfig column then you actually want to error out early.
Likewise, a change of column type would be another issue that you could plausibly handle at runtime, i.e., from a scalar type to a set of scalar types (or vice versa), but one of the nice things about updating your bindings using modelgen was that Go's type system would catch these errors.

I wouldn't try to handle changes at runtime. Just marking nonexistent/changed fields and tables as "not supported" and ignoring them or erroring out on transactions that use them would be much easier. However, the challenge I see is how to unambiguously let the client know what to expect from the library if only a subset of the fields/tables is supported. I think this is the same "weirdness" you express above.

Why not put the burden on the client (as is currently done with the C and Python ovsdb IDLs) and make it the client's responsibility to determine what tables/columns to use?

All we'd need is a way to determine if a table/column exists in the remote schema. The client can then set up its own mechanism to determine what tables/columns to read to/write from.

EDIT: Of course schema validation shouldn't throw an error for a missing table/column. :)


amorenoz commented Oct 7, 2021

Should we also handle other types of discrepancies more robustly, e.g., an extra/missing/different column?
The mapper.Info is built using the column object, and having fewer fields (columns) on models has been supported from day 0, so in theory (I haven't looked at the code) we should be able to ignore problematic columns....

This is going to be a much harder case to solve for. This was previously not possible because of the cache, but since we added logic to purge it on reconnect it's at least plausible. Extra fields are fine, but missing ones are going to be weird.
I think a recent example was that OtherConfig was added to a table, and we actually needed to set options there for the control plane to work properly (could have been BFD if my memory serves). If you're dealing with a DB that doesn't have the OtherConfig column then you actually want to error out early.
Likewise, a change of column type would be another issue that you could plausibly handle at runtime, i.e., from a scalar type to a set of scalar types (or vice versa), but one of the nice things about updating your bindings using modelgen was that Go's type system would catch these errors.

I wouldn't try to handle changes at runtime. Just marking nonexistent/changed fields and tables as "not supported" and ignoring them or erroring out on transactions that use them would be much easier. However, the challenge I see is how to unambiguously let the client know what to expect from the library if only a subset of the fields/tables is supported. I think this is the same "weirdness" you express above.

Why not put the burden on the client (as is currently done with the C and Python ovsdb IDLs) and make it the client's responsibility to determine what tables/columns to use?

I think the original rationale behind strict validation was to ensure no [un]marshalling errors occur that are difficult to report (e.g., they occur on the update-handling thread) and difficult to deal with (it's hard to determine whether the error is on the client, the library, or the server side).

All we'd need is a way to determine if a table/column exists in the remote schema. The client can then set up its own mechanism to determine what tables/columns to read to/write from.
EDIT: Of course schema validation shouldn't throw an error for a missing table/column. :)

The missing table issue is relatively easy to fix, we could:

  1. Perform OvsdbClient creation and DBModel loading in two steps so the client can inspect the schema and modify the DBModel accordingly (switch to a different model if you have multiple or remove conflicting tables).
  2. Have the OvsdbClient remove the conflicting tables for you. Not incompatible with 1)
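Option 2 could be sketched roughly as plain set logic (names are illustrative; this is not the libovsdb API, just the table-pruning step it describes):

```go
package main

// trimModel sketches option 2 above: given the table names in the client's
// model and those in the server's schema, return the tables to keep plus the
// ones that were dropped, so the client can log a warning for each.
func trimModel(modelTables, schemaTables map[string]struct{}) (kept map[string]struct{}, removed []string) {
	kept = map[string]struct{}{}
	for name := range modelTables {
		if _, ok := schemaTables[name]; ok {
			kept[name] = struct{}{}
		} else {
			// Table exists in the model but not on the server: drop it
			// from the model instead of failing validation.
			removed = append(removed, name)
		}
	}
	return kept, removed
}
```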

IMHO, the weirdness comes when dealing with column type changes or missing columns. When a client sees an empty field when reading a Model from the cache, how would it know whether the field is actually empty in the DB or was missing from the schema in the first place?


dceara commented Oct 7, 2021

The missing table issue is relatively easy to fix, we could:

1. Perform `OvsdbClient` creation and `DBModel` loading in two steps so the client can inspect the schema and modify the DBModel accordingly (switch to a different model if you have multiple or remove conflicting tables).

2. Have the `OvsdbClient` remove the conflicting tables for you. Not incompatible with 1)

IMHO, the weirdness comes when dealing with column type changes or missing columns. When a client sees an empty field when reading a Model from the cache, how would it know whether the field is actually empty in the DB or was missing from the schema in the first place?

That's where "putting the burden on the client" comes into play, IMO: the client will have to figure out which columns are safe to read by using its own mechanism. An oversimplified example: it could have a dedicated table in the database that lists supported features. If feature F, which requires values from column X, is not listed in the supported-features table, then the client must not use feature F.

Back to the ovn-kubernetes feature that triggered this issue, this becomes even simpler: if the Load_Balancer_Group table is not available in the schema, then fall back to "legacy" configuration of Load_Balancer and direct referencing from Logical_Switches/Logical_Routers (i.e., never read/write from/to Load_Balancer_Group).
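That fallback could be sketched as a simple feature gate keyed on schema contents. All names here are hypothetical; hasTable stands in for a real schema lookup such as the Client.Schema().Table(...) check quoted earlier in the thread:

```go
package main

// config sketches the client-side fallback described above: decide once at
// connect time whether the Load_Balancer_Group table exists, and gate the
// feature on that.
type config struct {
	useLBGroups bool
}

// newConfig takes a schema-lookup callback (a stand-in for a real check
// against the server's schema) and enables the feature only when the table
// is present; otherwise the client falls back to "legacy" direct
// Load_Balancer references from Logical_Switch/Logical_Router rows.
func newConfig(hasTable func(name string) bool) config {
	return config{useLBGroups: hasTable("Load_Balancer_Group")}
}
```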


amorenoz commented Oct 7, 2021

The missing table issue is relatively easy to fix, we could:

1. Perform `OvsdbClient` creation and `DBModel` loading in two steps so the client can inspect the schema and modify the DBModel accordingly (switch to a different model if you have multiple or remove conflicting tables).

2. Have the `OvsdbClient` remove the conflicting tables for you. Not incompatible with 1)

IMHO, the weirdness comes when dealing with column type changes or missing columns. When a client sees an empty field when reading a Model from the cache, how would it know whether the field is actually empty in the DB or was missing from the schema in the first place?

That's where "putting the burden on the client" comes into play, IMO: the client will have to figure out which columns are safe to read by using its own mechanism. An oversimplified example: it could have a dedicated table in the database that lists supported features. If feature F, which requires values from column X, is not listed in the supported-features table, then the client must not use feature F.

The problem is that the client cannot really control which columns are read.
If a field exists in the model, libovsdb will try to decode it when it receives updates on that table and when any transaction uses that Model (regardless of whether the particular field is "empty" or not). We also have cache indexes, etc.

Using your example, today the only way for libovsdb not to read column "FeatureF" is to not have it in the DatabaseModel.

We could avoid the update decoding by using the recently added Conditional Monitoring, but I haven't looked at that feature so I don't know if it would work. Maybe @dave-tucker knows.

So, it seems to me that in order to really support this we would have to add Table and Column selectors in the DatabaseModel and carry them to the mapper layer.
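One possible shape for such selectors, as a hedged sketch (tableSelector and its semantics are hypothetical, not an existing libovsdb type):

```go
package main

// tableSelector is a hypothetical per-table column allow-list that a
// DatabaseModel could carry down to the mapper layer: a nil column set
// means "map every column"; otherwise only listed columns are mapped.
type tableSelector struct {
	table   string
	columns map[string]bool
}

// wants reports whether the mapper should decode the given column.
func (s tableSelector) wants(column string) bool {
	return s.columns == nil || s.columns[column]
}
```

The mapper could consult wants() before decoding each column of an update, skipping columns the server schema lacks.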
