RFC: Enhance compatibility with different schema versions #244

amorenoz · 2021-10-13T06:29:22Z

Currently, several limitations force libovsdb to be very strict about the db schema version it uses. These limitations mainly come from the fact that:

* mapper.Info structs are created all over the codebase based solely on the schema (ignoring the provided model).
* Although many internal structs keep a reference to both a model.DatabaseModel and the ovsdb.Schema, we lack a struct that centralizes both and can make decisions based on their compatibility. This is an additional challenge because the schema and the model are obtained at two different points in time.

This PR performs two major code refactorings, on top of which a schema compatibility mode is implemented.

Refactor 1: Rename current `DatabaseModel` to `DatabaseModelRequest` and introduce an internal `DatabaseModel` type

All over the codebase we keep references to both the ovsdb.Schema and the model.DatabaseModel. In fact, server.go went one step further and defined:

libovsdb/server/server.go

Lines 31 to 34 in f2b3ce2

    
    type DatabaseModel struct {
    	Model  *model.DBModel
    	Schema *ovsdb.DatabaseSchema
    }

The first code refactoring proposed is to rename the client-provided DatabaseModel to DatabaseModelRequest. The naming could very well be discussed; my rationale is that the client would be "requesting" that model. Such a request might not be 100% fulfilled depending on the schema. After this rename, a DatabaseModel is introduced that combines both objects, which are central to the library, and that supports 2-step initialization to allow the schema to be provided afterwards.
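To make the 2-step initialization concrete, here is a minimal sketch. It is not the code in this PR: `ClientRequest`, `Schema`, `newModel`, `attachSchema` and `detachSchema` are hypothetical stand-ins, and the schema is reduced to table and column names so the example stays self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// ClientRequest is a hypothetical stand-in for the client-provided model:
// table name -> columns the client wants to use.
type ClientRequest struct {
	Tables map[string][]string
}

// Schema is a hypothetical stand-in for the server schema:
// table name -> set of column names.
type Schema struct {
	Tables map[string]map[string]bool
}

// DatabaseModel holds the request from creation time and the schema once it
// has been attached (step two).
type DatabaseModel struct {
	request ClientRequest
	schema  *Schema
}

// newModel is step one: only the client request is known.
func newModel(req ClientRequest) *DatabaseModel {
	return &DatabaseModel{request: req}
}

// Valid reports whether a schema has been attached and validated.
func (m *DatabaseModel) Valid() bool { return m.schema != nil }

// attachSchema is step two: validate strictly, then store the schema.
func (m *DatabaseModel) attachSchema(s *Schema) error {
	if s == nil {
		return errors.New("nil schema")
	}
	for table, cols := range m.request.Tables {
		have, ok := s.Tables[table]
		if !ok {
			return fmt.Errorf("table %q not in schema", table)
		}
		for _, col := range cols {
			if !have[col] {
				return fmt.Errorf("column %q not in table %q", col, table)
			}
		}
	}
	m.schema = s
	return nil
}

// detachSchema drops the schema, e.g. when the connection is lost.
func (m *DatabaseModel) detachSchema() { m.schema = nil }

func main() {
	m := newModel(ClientRequest{Tables: map[string][]string{"Bridge": {"name"}}})
	fmt.Println("valid before connect:", m.Valid())

	err := m.attachSchema(&Schema{Tables: map[string]map[string]bool{"Bridge": {"name": true}}})
	fmt.Println("attach error:", err, "valid:", m.Valid())

	m.detachSchema()
	fmt.Println("valid after disconnect:", m.Valid())
}
```

In this sketch the validation stays strict, so a mismatch still makes the second step fail, just as Connect() does today.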

Refactor 2: `mapper.Mapper` to use `mapper.Info`

All over the codebase we create mapper.Info structs and then call mapper.Mapper functions which create mapper.Info structs again. Unify this into a more compact API by having mapper.Mapper functions accept a mapper.Info reference.
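As an illustration of the intended API shape, the sketch below builds an Info once and passes it to a mapper-style function. The `Info`, `NewInfo` and `NewRow` definitions here are simplified assumptions written for this example, not the real mapper package; only the `ovsdb` struct tag convention is taken from libovsdb.

```go
package main

import (
	"fmt"
	"reflect"
)

// Bridge is an example model struct; the `ovsdb` tag names the column.
type Bridge struct {
	UUID string `ovsdb:"_uuid"`
	Name string `ovsdb:"name"`
}

// Info is a simplified, hypothetical stand-in for mapper.Info: it ties a
// table name and an object to the columns that can be accessed on it.
type Info struct {
	Table  string
	obj    reflect.Value  // the struct the Info was built from
	fields map[string]int // column name -> struct field index
}

// NewInfo inspects the struct tags once and keeps only the listed columns.
func NewInfo(table string, columns []string, ptr interface{}) (*Info, error) {
	v := reflect.ValueOf(ptr)
	if v.Kind() != reflect.Ptr || v.Elem().Kind() != reflect.Struct {
		return nil, fmt.Errorf("expected pointer to struct, got %T", ptr)
	}
	wanted := make(map[string]bool, len(columns))
	for _, c := range columns {
		wanted[c] = true
	}
	elem := v.Elem()
	fields := make(map[string]int)
	for i := 0; i < elem.NumField(); i++ {
		col := elem.Type().Field(i).Tag.Get("ovsdb")
		if col != "" && wanted[col] {
			fields[col] = i
		}
	}
	return &Info{Table: table, obj: elem, fields: fields}, nil
}

// NewRow accepts a ready-made Info instead of rebuilding one internally.
func NewRow(info *Info) map[string]interface{} {
	row := make(map[string]interface{}, len(info.fields))
	for col, idx := range info.fields {
		row[col] = info.obj.Field(idx).Interface()
	}
	return row
}

func main() {
	br := &Bridge{UUID: "example-uuid", Name: "br-int"}
	info, err := NewInfo("Bridge", []string{"name"}, br)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(NewRow(info))
}
```

The same Info could then be reused across several mapper calls for the same object, which is where the saving comes from.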

Functional enhancements

With these two refactorings in place, this PR optimizes the mapper.Info creation process by having the newly introduced DatabaseModel do it. Since this struct now has both the schema and the model, it has more control over what fields will actually be readable/writable. Besides, we can cache the Info's metadata and speed up creation (I don't expect a huge performance improvement in unit tests since they use small models).

Finally, this PR introduces a Compatibility mode by adding a flag to the DatabaseModelRequest. That way the client can request the Model to be more flexible with respect to the schema that is supported. In this mode, columns with conflicting types and columns missing from the schema are omitted, and so are missing tables.
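The difference between strict and compatibility validation can be sketched as follows. This is an illustration under simplifying assumptions, not the PR's implementation: `usableColumns`, `ModelTable`, `TableSchema` and `ColumnType` are hypothetical names, and column types are plain strings.

```go
package main

import (
	"fmt"
	"sort"
)

// ColumnType is a simplified stand-in for an OVSDB column type.
type ColumnType string

// TableSchema maps the server's column names to their types.
type TableSchema map[string]ColumnType

// ModelTable maps the columns the client model expects to their types.
type ModelTable map[string]ColumnType

// usableColumns returns, per table, the columns the client can actually use.
// In strict mode any mismatch is an error; in compat mode mismatching
// tables and columns are skipped.
func usableColumns(model map[string]ModelTable, schema map[string]TableSchema, compat bool) (map[string][]string, error) {
	usable := make(map[string][]string)
	for table, cols := range model {
		ts, ok := schema[table]
		if !ok {
			if compat {
				continue // missing table: skip it
			}
			return nil, fmt.Errorf("table %q missing from schema", table)
		}
		kept := []string{}
		for col, want := range cols {
			got, ok := ts[col]
			if !ok || got != want {
				if compat {
					continue // missing or conflicting column: skip it
				}
				return nil, fmt.Errorf("column %q of table %q missing or of a different type", col, table)
			}
			kept = append(kept, col)
		}
		sort.Strings(kept) // deterministic order for callers
		usable[table] = kept
	}
	return usable, nil
}

func main() {
	model := map[string]ModelTable{
		"Bridge": {"name": "string", "datapath_version": "string"},
		"Extra":  {"id": "integer"},
	}
	schema := map[string]TableSchema{"Bridge": {"name": "string"}}

	_, err := usableColumns(model, schema, false)
	fmt.Println("strict:", err)

	cols, err := usableColumns(model, schema, true)
	fmt.Println("compat:", cols, err)
}
```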

Fixes: #235

MonitorCookies are marshalled as a json object, not an array. Fix it so that the benchmark test works. Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

Clearly DBModel does not hold the full database model. Instead, only the combination of model.DBModel, mapper.Mapper and ovsdb.Schema is a useful database model. The fact that server.go had to define a struct called DatabaseModel with model.DBModel and ovsdb.Schema and dynamically create Mapper objects from them is proof of this. In order to prepare for a DBModel refactoring, rename it to DatabaseModelRequest, as it's what the client requests the DatabaseModel to look like. This patch does not contain functional changes. Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

Replace the one that server.go had to define. For now, it's just a drop-in replacement of the previous type. Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

It is common to first create a DatabaseModel based only on the DatabaseModelRequest, and then add / remove the schema to it, e.g. when the client (re)connects. Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

For the cache, it's simply replacing three fields with one. For the client, use the 2-step DatabaseModel initialization. Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

Now that client, cache and server use the DatabaseModel as the central point of model creation and introspection, we can hide the DatabaseModelRequest and move its public functions to the DatabaseModel. Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

All around the codebase we're creating mapper.Info structures and then calling Mapper functions that create Info structures again. Given that mapper.Info already defines all the metadata that Mapper needs to do the native-to-ovs translations, it makes sense to use Info structures as input to all functions. That simplifies the code inside the mapper module. Also, I'd expect some performance improvement since we were creating multiple Info structs unnecessarily in the hot path. It's true that, for now, it makes it slightly more cumbersome to call mapper functions, since the Info struct has to be created first and it now requires an additional argument (the table name). However, this can be improved later on by having the database model build the Info structs for us. Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

The core mapper API uses mapper.Info structs, which can be created just by inspecting a TableSchema. However, now that the DatabaseModel centralizes access to the mapper API and contains both the Model types and the Schema, we can pre-create the mapper.Info.Metadata structs and cache them so that we create Info structs more efficiently. Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

Allow the user to set a Compatibility flag on the DatabaseModelRequest. When that flag is set, the verification phase will not fail if a column is missing from the schema or has a different type. Instead it will just skip the column. The same goes for missing tables: they will just be skipped. Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

It enables compatibility mode to cope with schema changes. This has to be revisited once/if the relevant PR gets merged: ovn-org/libovsdb#244. Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

dave-tucker · 2021-10-13T15:00:28Z

Thanks @amorenoz

> Refactor 1: Rename current DatabaseModel to DatabaseModelRequest and introduce an internal DatabaseModel type
> The first code refactoring proposed is to rename the client-provided DatabaseModel to DatabaseModelRequest. The naming could very well be discussed; my rationale is that the client would be "requesting" that model. Such a request might not be 100% fulfilled depending on the schema. After this rename, a DatabaseModel is introduced that combines both objects, which are central to the library, and that supports 2-step initialization to allow the schema to be provided afterwards.

So firstly, combining the Model/Schema to a single struct makes a lot of sense. I'm struggling with the 2-step initialization though.

Currently we ensure that what we're connecting to maps 1:1 with what the server offers. This is important because the datatypes that comprise a DatabaseModel are used directly. Put differently, I can always ovs.Create(&Bridge{}) if I've created a model for Bridge. While Connect() could fail at runtime, once I'm connected there are no surprises.

If we move to a 2-step approach, where what's passed to NewOVSDBClient is now a DatabaseModelRequest, the API now acts differently. ovs.Connect() will succeed, but ovs.Create(&Bridge{}):

* Can succeed
* Can fail at runtime if the server doesn't support that table
* Can partially succeed, omitting columns that I've mapped because either the server doesn't support the column OR the server does support the column but the types are different

My issue with this is that it weakens the guarantees we had before 😢 and also leads to some rather confusing runtime behaviours. I was also very fond of the fact that the DatabaseModel abstraction eliminated the need to know about table names, columns etc... You just work with structs. In order to smooth the rough edges of dealing with different runtime behaviours we will have to expose HasColumn, HasTable, IsSameTypeAsSchema APIs for users to conditionally build version support logic into their applications, which will end up very messy in the long term. If we head down this road I feel we'd be better off not using Model at all and reverting to the old behaviour of using a map[string]interface{}.

> Finally, this PR introduces a Compatibility mode by adding a flag to the DatabaseModelRequest. That way the client can request the Model to be more flexible with respect to the schema that is supported. In this mode, columns with conflicting types and columns missing from the schema are omitted, and so are missing tables.

The concern I have here is that if you enable the bool you have to be prepared to deal with the cases explained above.
I worry that flipping a bool is "too easy" and can lead to unexpected runtime behaviour as mentioned earlier.

Ultimately my position is:

1. Upgrade ovsdb-server before you upgrade your clients
2. Use modelgen and application versioning to ensure that your app (using libovsdb) is compatible with your target server version

If we really really need to have a version of libovsdb that can speak different versions of the same schema, then I think the correct approach is to make it easy to register/maintain a model per version, but I've not had the time to design how this might work.

amorenoz · 2021-10-15T07:40:47Z

> Thanks @amorenoz
>
> > Refactor 1: Rename current DatabaseModel to DatabaseModelRequest and introduce an internal DatabaseModel type
> > The first code refactoring proposed is to rename the client-provided DatabaseModel to DatabaseModelRequest. The naming could very well be discussed; my rationale is that the client would be "requesting" that model. Such a request might not be 100% fulfilled depending on the schema. After this rename, a DatabaseModel is introduced that combines both objects, which are central to the library, and that supports 2-step initialization to allow the schema to be provided afterwards.
>
> So firstly, combining the Model/Schema to a single struct makes a lot of sense. I'm struggling with the 2-step initialization though.

Well, 2-step initialization is already happening now, since the schema validation happens at Connect().

> Currently we ensure that what we're connecting to maps 1:1 with what the server offers. This is important because the datatypes that comprise a DatabaseModel are used directly. Put differently, I can always ovs.Create(&Bridge{}) if I've created a model for Bridge. While Connect() could fail at runtime, once I'm connected there are no surprises.

The 2-step initialization of the DatabaseModel is perfectly compatible with the current validation criteria. It's just a way to add the schema to the struct after it has been created. We can make Connect fail as it does today (in fact, without the last patch, that is what this PR does). It's even stricter, since it centralizes the creation of Info structures, avoiding crashes on cache updates (e.g. running cmd/stress against examples/ovsdb-server crashes with panic: FieldByColumn: column dpdk_initialized not found in orm info).

> If we move to a 2-step approach, where what's passed to NewOVSDBClient is now a DatabaseModelRequest, the API now acts differently. ovs.Connect() will succeed, but ovs.Create(&Bridge{}):
>
> * Can succeed
> * Can fail at runtime if the server doesn't support that table
> * Can partially succeed, omitting columns that I've mapped because either the server doesn't support the column OR the server does support the column but the types are different
>
> My issue with this is that it weakens the guarantees we had before 😢 and also leads to some rather confusing runtime behaviours. I was also very fond of the fact that the DatabaseModel abstraction eliminated the need to know about table names, columns etc... You just work with structs. In order to smooth the rough edges of dealing with different runtime behaviours we will have to expose HasColumn, HasTable, IsSameTypeAsSchema APIs for users to conditionally build version support logic into their applications, which will end up very messy in the long term. If we head down this road I feel we'd be better off not using Model at all and reverting to the old behaviour of using a map[string]interface{}.

I agree, it weakens a guarantee that was very useful. However, if the application cannot guarantee that it will be updated before the server, it will have to code around the possibility of different schema versions anyhow (even if it's by using another &Bridge type), but I guess that's still more robust than having to query each column to know whether it's really empty or not supported...

> > Finally, this PR introduces a Compatibility mode by adding a flag to the DatabaseModelRequest. That way the client can request the Model to be more flexible with respect to the schema that is supported. In this mode, columns with conflicting types and columns missing from the schema are omitted, and so are missing tables.
>
> The concern I have here is that if you enable the bool you have to be prepared to deal with the cases explained above. I worry that flipping a bool is "too easy" and can lead to unexpected runtime behaviour as mentioned earlier.

I agree. The flag was just to enable the OVN feature testing.

> Ultimately my position is:
>
> 1. Upgrade ovsdb-server before you upgrade your clients
> 2. Use `modelgen` and application versioning to ensure that your app (using libovsdb) is compatible with your target server version
>
> If we really really need to have a version of libovsdb that can speak different versions of the same schema, then I think the correct approach is to make it easy to register/maintain a model per version, but I've not had the time to design how this might work.

Just throwing out another idea: maybe a minimum and a maximum DatabaseModel? The client tries the maximum; if it fails to validate, it tries the minimum; if that also fails, Connect() fails. The client then returns which model was selected.
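A rough sketch of that fallback, purely to illustrate the idea: nothing like this exists in the PR, and `candidate`, `validate` and `selectModel` are hypothetical names with the schema reduced to table and column names.

```go
package main

import (
	"errors"
	"fmt"
)

// schema is a stand-in for the server schema: table -> set of columns.
type schema map[string]map[string]bool

// candidate is a hypothetical labelled client model: table -> required columns.
type candidate struct {
	Label  string
	Tables map[string][]string
}

// validate reports the first table or column that the schema lacks.
func validate(c candidate, s schema) error {
	for table, cols := range c.Tables {
		have, ok := s[table]
		if !ok {
			return fmt.Errorf("%s model: table %q not in schema", c.Label, table)
		}
		for _, col := range cols {
			if !have[col] {
				return fmt.Errorf("%s model: column %q not in table %q", c.Label, col, table)
			}
		}
	}
	return nil
}

// selectModel tries the candidates in order (e.g. maximum first, then
// minimum) and returns the first one that validates against the schema.
func selectModel(s schema, candidates ...candidate) (candidate, error) {
	if len(candidates) == 0 {
		return candidate{}, errors.New("no candidate models given")
	}
	var lastErr error
	for _, c := range candidates {
		if err := validate(c, s); err != nil {
			lastErr = err
			continue
		}
		return c, nil
	}
	return candidate{}, fmt.Errorf("no candidate model validates: %w", lastErr)
}

func main() {
	server := schema{"Bridge": {"name": true}}
	maxModel := candidate{Label: "maximum", Tables: map[string][]string{"Bridge": {"name", "datapath_version"}}}
	minModel := candidate{Label: "minimum", Tables: map[string][]string{"Bridge": {"name"}}}

	chosen, err := selectModel(server, maxModel, minModel)
	if err != nil {
		fmt.Println("connect would fail:", err)
		return
	}
	fmt.Println("selected model:", chosen.Label)
}
```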

Anyhow, as I said above, the refactoring has nothing to do with the validation strictness, and I think it makes the code cleaner (and fixes a crash :)). So if you think it's worth it, let me know and I'll create another PR without the last commit.

dave-tucker · 2021-10-15T15:07:59Z

> So if you think it's worth it, let me know and I'll create another PR without the last commit.

@amorenoz I absolutely think it's worth it. Fewer bugs and crashes are awesome 😆

I'm still not a fan of DatabaseModelRequest but don't have a better suggestion, so let's run with it for now and we can figure it out later.

amorenoz · 2021-10-15T15:20:08Z

> > So if you think it's worth it, let me know and I'll create another PR without the last commit.
>
> @amorenoz I absolutely think it's worth it. Fewer bugs and crashes are awesome 😆
>
> I'm still not a fan of DatabaseModelRequest but don't have a better suggestion, so let's run with it for now and we can figure it out later.

Me neither, I hate it actually. How about ClientDatabaseModel (and we internally build the DatabaseModel that contains the ClientDatabaseModel, the server schema and the mapper)?

dave-tucker · 2021-10-15T15:31:43Z

@amorenoz yep I prefer ClientDatabaseModel

amorenoz added 10 commits October 13, 2021 07:32

client_test: fix monitor cookie json

fa84414

MonitorCookies are marshalled as a json object, not an array. Fix it so that the benchmark test works. Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

example: fix ovsdb-server schema file

8ef7753

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

model: Introduce DatabaseModel

d1894d4

Replace the one that server.go had to define. For now, it's just a drop-in replacement of the previous type. Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

database: Add 2-step initialization

d5556fa

It is common to first create a DatabaseModel based only on the DatabaseModelRequest, and then add / remove the schema to it, e.g. when the client (re)connects. Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

client: Use DatabaseModel in client an cache

81f034b

For the cache, it's simply replacing three fields with one. For the client, use the 2-step DatabaseModel initialization. Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

amorenoz force-pushed the upgrades2 branch from b3434b5 to 94fd061 Compare October 13, 2021 06:33
