
model: Connecting to DBs that don't contain tables that are in the model should not be an error. #235

Open
dceara opened this issue Oct 5, 2021 · 10 comments · May be fixed by #244 or #257


dceara commented Oct 5, 2021

errors = append(errors, fmt.Errorf("database model contains a model for table %s that does not exist in schema", tableName))

This should probably be logged as a warning (at most), and the table should optionally be removed from the model. Otherwise it breaks all upgrade scenarios in which clients are upgraded before the database. Even worse, the client might have an external way of checking whether a table actually exists on the remote server (e.g., by running an ovn-nbctl list <table> command) and adapting its execution based on that information, completely avoiding the nonexistent table.

An example of such a scenario can be found here; the client fails to validate the schema every time it connects to the remote server and crash-loops, even though it's perfectly able to function when the table is missing on the server side:

F1004 16:31:22.456301      34 ovnkube.go:131] error when trying to initialize libovsdb NB client: failed to connect to tcp:172.18.0.3:6641: database OVN_Northbound validation error (3): Mapper Error. Object type nbdb.LogicalRouter contains field LoadBalancerGroup ([]string) ovs tag load_balancer_group: Column does not exist in schema. database model contains a model for table Load_Balancer_Group that does not exist in schema. Mapper Error. Object type nbdb.LogicalSwitch contains field LoadBalancerGroup ([]string) ovs tag load_balancer_group: Column does not exist in schema

CC: @amorenoz


dcbw commented Oct 5, 2021

@dave-tucker


amorenoz commented Oct 5, 2021

We added that verification because the error handling later on was more complicated. But now that the library is more mature we can relax this.

For missing tables we can remove the table from the model and warn. All API calls with that model will return an ErrWrongType. I'd say we should return a specific error type so clients can handle this case more easily.

Should we also handle other types of discrepancies more robustly, e.g., an extra/missing/different column?
The mapper.Info is built using the column object, and having fewer fields (columns) on models has been supported from day 0, so in theory (I haven't looked at the code) we should be able to ignore problematic columns....

dave-tucker commented:

Even worse, the client might have an external way of checking whether a table actually exists on the remote server (e.g., by running an ovn-nbctl list <table> command) and adapting its execution based on that information, completely avoiding the nonexistent table.

That's precisely the sort of scenario we wrote this library to avoid - why use ovn-nbctl and libovsdb together 🤔

We added that verification because the error handling later on was more complicated. But now that the library is more mature we can relax this.

I agree with Adrian here, but we need to do this thoughtfully.

The intended way of supporting two different versions of a schema was to let the library user handle it. Specifically, one of:

  • Create your own DatabaseModel with just the tables/columns you care about and ignore others
  • Update your database first, then update your clients with the new schema
  • Use modelgen to generate bindings for all versions you plan to support, update your clients (with error handling so you can speak old and new schema), then update your server

If we're relaxing verification, we'll need to be clear on what the compatibility guarantees are and for how long they are valid: backwards only (i.e., the DB is older than the schema), forwards only (i.e., the DB is newer than the schema), or both. Is that something that applies just for an upgrade period, or could it be in place longer term?

For missing tables we can remove the table from the model and warn. All API calls with that model will return an ErrWrongType.

Missing tables should be a simple case. I'd say we should return a specific error type so clients can handle this case more easily.
Instead of removing the table from the DatabaseModel, we could mark it as unsupported and return a specific error, e.g., ErrMissingTable = "the server database version does not contain the table %s", if a client attempts to use it.

Should we also handle other types of discrepancies more robustly, e.g., an extra/missing/different column?
The mapper.Info is built using the column object, and having fewer fields (columns) on models has been supported from day 0, so in theory (I haven't looked at the code) we should be able to ignore problematic columns....

This is going to be a much harder case to solve for.
This was previously not possible because of the cache, but since we added logic to purge it on reconnect it's at least plausible.
Extra fields are fine, but missing ones are going to be weird.

I think a recent example was that OtherConfig was added to a table, and we actually needed to set options there for the control plane to work properly (could have been BFD if my memory serves). If you're dealing with a DB that doesn't have the OtherConfig column then you actually want to error out early.

Likewise, a change of column type would be another issue that you could plausibly handle at runtime, i.e., from a scalar type to a set of scalar types (or vice versa), but one of the nice things about updating your bindings using modelgen was that Go's type system would catch these errors.

Given the pitfalls outlined above (and significant level of effort that would be involved to do this properly) I do wonder whether using modelgen to support both schemas is the least-worst option.


amorenoz commented Oct 6, 2021

Should we also handle other types of discrepancies more robustly, e.g., an extra/missing/different column?
The mapper.Info is built using the column object, and having fewer fields (columns) on models has been supported from day 0, so in theory (I haven't looked at the code) we should be able to ignore problematic columns....

This is going to be a much harder case to solve for. This was previously not possible because of the cache, but since we added logic to purge it on reconnect it's at least plausible. Extra fields are fine, but missing ones are going to be weird.

I think a recent example was that OtherConfig was added to a table, and we actually needed to set options there for the control plane to work properly (could have been BFD if my memory serves). If you're dealing with a DB that doesn't have the OtherConfig column then you actually want to error out early.

Likewise, a change of column type would be another issue that you could plausibly handle at runtime, i.e., from a scalar type to a set of scalar types (or vice versa), but one of the nice things about updating your bindings using modelgen was that Go's type system would catch these errors.

I wouldn't try to handle changes at runtime. Just marking nonexistent/changed fields and tables as "not supported" and ignoring them or erroring out on transactions that use them would be much easier. However, the challenge I see is how to unambiguously let the client know what to expect from the library if only a subset of the fields/tables is supported. I think this is the same "weirdness" you express above.

Given the pitfalls outlined above (and significant level of effort that would be involved to do this properly) I do wonder whether using modelgen to support both schemas is the least-worst option.

Do you mean having the client decide which model to use in runtime or having the library support multiple (versioned) models?


dave-tucker commented Oct 6, 2021

However, the challenge I see is how to unambiguously let the client know what to expect from the library if only a subset of the fields/tables is supported. I think this is the same "weirdness" you express above.

Yes, that's my point. The only way to prevent this would be to indirect struct access through getter/setter functions that could return an error if the field wasn't supported. This is much less convenient than using the struct directly.

EDIT: in some ways the model declares the minimum required feature set for a connection... and I quite like how that is enforced at present through validation.

Do you mean having the client decide which model to use in runtime or having the library support multiple (versioned) models?

Having the client pick at runtime


dceara commented Oct 6, 2021

Even worse, the client might have an external way of checking whether a table actually exists on the remote server (e.g., by running an ovn-nbctl list <table> command) and adapting its execution based on that information, completely avoiding the nonexistent table.

That's precisely the sort of scenario we wrote this library to avoid - why use ovn-nbctl and libovsdb together 🤔

Sorry, my example wasn't good; you're right, there's no need to use ovn-nbctl. We can use libovsdb to check if a table is in the schema supported by the server, e.g., as suggested by @amorenoz:

if Client.Schema().Table("Load_Balancer_Group") == nil {
	// Handle the case where the table is not available on the server
}


dceara commented Oct 6, 2021

Should we also handle other types of discrepancies more robustly, e.g., an extra/missing/different column?
The mapper.Info is built using the column object, and having fewer fields (columns) on models has been supported from day 0, so in theory (I haven't looked at the code) we should be able to ignore problematic columns....

This is going to be a much harder case to solve for. This was previously not possible because of the cache, but since we added logic to purge it on reconnect it's at least plausible. Extra fields are fine, but missing ones are going to be weird.
I think a recent example was that OtherConfig was added to a table, and we actually needed to set options there for the control plane to work properly (could have been BFD if my memory serves). If you're dealing with a DB that doesn't have the OtherConfig column then you actually want to error out early.
Likewise, a change of column type would be another issue that you could plausibly handle at runtime, i.e., from a scalar type to a set of scalar types (or vice versa), but one of the nice things about updating your bindings using modelgen was that Go's type system would catch these errors.

I wouldn't try to handle changes at runtime. Just marking nonexistent/changed fields and tables as "not supported" and ignoring them or erroring out on transactions that use them would be much easier. However, the challenge I see is how to unambiguously let the client know what to expect from the library if only a subset of the fields/tables is supported. I think this is the same "weirdness" you express above.

Why not put the burden on the client (as is currently done with the C and Python ovsdb IDLs) and make it the client's responsibility to determine what tables/columns to use?

All we'd need is a way to determine if a table/column exists in the remote schema. The client can then set up its own mechanism to determine what tables/columns to read to/write from.

EDIT: Of course schema validation shouldn't throw an error for a missing table/column. :)


amorenoz commented Oct 7, 2021

Should we also handle other types of discrepancies more robustly, e.g., an extra/missing/different column?
The mapper.Info is built using the column object, and having fewer fields (columns) on models has been supported from day 0, so in theory (I haven't looked at the code) we should be able to ignore problematic columns....

This is going to be a much harder case to solve for. This was previously not possible because of the cache, but since we added logic to purge it on reconnect it's at least plausible. Extra fields are fine, but missing ones are going to be weird.
I think a recent example was that OtherConfig was added to a table, and we actually needed to set options there for the control plane to work properly (could have been BFD if my memory serves). If you're dealing with a DB that doesn't have the OtherConfig column then you actually want to error out early.
Likewise, a change of column type would be another issue that you could plausibly handle at runtime, i.e., from a scalar type to a set of scalar types (or vice versa), but one of the nice things about updating your bindings using modelgen was that Go's type system would catch these errors.

I wouldn't try to handle changes at runtime. Just marking nonexistent/changed fields and tables as "not supported" and ignoring them or erroring out on transactions that use them would be much easier. However, the challenge I see is how to unambiguously let the client know what to expect from the library if only a subset of the fields/tables is supported. I think this is the same "weirdness" you express above.

Why not put the burden on the client (as is currently done with the C and Python ovsdb IDLs) and make it the client's responsibility to determine what tables/columns to use?

I think the original rationale behind strict validation was to ensure no [un]marshalling errors occur that are difficult to report (e.g., they occur on the update-handling thread) and difficult to deal with (it's hard to determine whether the error is on the client, the library, or the server side).

All we'd need is a way to determine if a table/column exists in the remote schema. The client can then set up its own mechanism to determine what tables/columns to read to/write from.
EDIT: Of course schema validation shouldn't throw an error for a missing table/column. :)

The missing table issue is relatively easy to fix, we could:

  1. Perform OvsdbClient creation and DBModel loading in two steps so the client can inspect the schema and modify the DBModel accordingly (switch to a different model if you have multiple or remove conflicting tables).
  2. Have the OvsdbClient remove the conflicting tables for you. Not incompatible with 1)
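Option 2 could be sketched roughly as plain set logic (names are illustrative; this is not the libovsdb API, just the table-pruning step it describes):

```go
package main

// trimModel sketches option 2 above: given the table names in the client's
// model and those in the server's schema, return the tables to keep plus the
// ones that were dropped, so the client can log a warning for each.
func trimModel(modelTables, schemaTables map[string]struct{}) (kept map[string]struct{}, removed []string) {
	kept = map[string]struct{}{}
	for name := range modelTables {
		if _, ok := schemaTables[name]; ok {
			kept[name] = struct{}{}
		} else {
			// Table exists in the model but not on the server: drop it
			// from the model instead of failing validation.
			removed = append(removed, name)
		}
	}
	return kept, removed
}
```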

IMHO, the weirdness comes when dealing with column type changes or missing columns. When a client sees an empty field when reading a Model from the cache, how would it know whether the field is actually empty in the DB or was missing from the schema in the first place?


dceara commented Oct 7, 2021

The missing table issue is relatively easy to fix, we could:

1. Perform `OvsdbClient` creation and `DBModel` loading in two steps so the client can inspect the schema and modify the DBModel accordingly (switch to a different model if you have multiple or remove conflicting tables).

2. Have the `OvsdbClient` remove the conflicting tables for you. Not incompatible with 1)

IMHO, the weirdness comes when dealing with column type changes or missing columns. When a client sees an empty field when reading a Model from the cache, how would it know whether the field is actually empty in the DB or was missing from the schema in the first place?

That's where "putting the burden on the client" comes into play, IMO: the client will have to figure out which columns are safe to read by using its own mechanism. An oversimplified example: it could have a dedicated table in the database that lists supported features. If feature F, which requires values from column X, is not listed in the supported-features table, then the client must not use feature F.

Back to the ovn-kubernetes feature that triggered this issue, this becomes even simpler: if the Load_Balancer_Group table is not available in the schema, then fall back to "legacy" configuration of Load_Balancer and direct referencing from Logical_Switches/Logical_Routers (i.e., never read/write from/to Load_Balancer_Group).
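That fallback could be sketched as a simple feature gate keyed on schema contents. All names here are hypothetical; hasTable stands in for a real schema lookup such as the Client.Schema().Table(...) check quoted earlier in the thread:

```go
package main

// config sketches the client-side fallback described above: decide once at
// connect time whether the Load_Balancer_Group table exists, and gate the
// feature on that.
type config struct {
	useLBGroups bool
}

// newConfig takes a schema-lookup callback (a stand-in for a real check
// against the server's schema) and enables the feature only when the table
// is present; otherwise the client falls back to "legacy" direct
// Load_Balancer references from Logical_Switch/Logical_Router rows.
func newConfig(hasTable func(name string) bool) config {
	return config{useLBGroups: hasTable("Load_Balancer_Group")}
}
```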


amorenoz commented Oct 7, 2021

The missing table issue is relatively easy to fix, we could:

1. Perform `OvsdbClient` creation and `DBModel` loading in two steps so the client can inspect the schema and modify the DBModel accordingly (switch to a different model if you have multiple or remove conflicting tables).

2. Have the `OvsdbClient` remove the conflicting tables for you. Not incompatible with 1)

IMHO, the weirdness comes when dealing with column type changes or missing columns. When a client sees an empty field when reading a Model from the cache, how would it know whether the field is actually empty in the DB or was missing from the schema in the first place?

That's where "putting the burden on the client" comes into play, IMO: the client will have to figure out which columns are safe to read by using its own mechanism. An oversimplified example: it could have a dedicated table in the database that lists supported features. If feature F, which requires values from column X, is not listed in the supported-features table, then the client must not use feature F.

The problem is that the client cannot really control which columns are read.
If a field exists in the model, libovsdb will try to decode it when it receives updates on that table and when any transaction uses that Model (regardless of whether the particular field is "empty" or not). We also have cache indexes, etc.

Using your example, today the only way for libovsdb not to read column "FeatureF" is to not have it in the DatabaseModel.

We could avoid the update decoding by using the recently added Conditional Monitoring, but I haven't looked at that feature so I don't know if it would work. Maybe @dave-tucker knows.

So, it seems to me that in order to really support this we would have to add Table and Column selectors in the DatabaseModel and carry them to the mapper layer.
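One possible shape for such selectors, as a hedged sketch (tableSelector and its semantics are hypothetical, not an existing libovsdb type):

```go
package main

// tableSelector is a hypothetical per-table column allow-list that a
// DatabaseModel could carry down to the mapper layer: a nil column set
// means "map every column"; otherwise only listed columns are mapped.
type tableSelector struct {
	table   string
	columns map[string]bool
}

// wants reports whether the mapper should decode the given column.
func (s tableSelector) wants(column string) bool {
	return s.columns == nil || s.columns[column]
}
```

The mapper could consult wants() before decoding each column of an update, skipping columns the server schema lacks.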
