Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow cluster_networking_config to have defaults origin #19106

Merged
merged 10 commits into from
Dec 13, 2022

Conversation

vitorenesduarte
Copy link
Contributor

Fixes #19099.

This commit ensures that cluster_networking_config has the teleport.dev/origin: defaults label when no related field is set in the configuration file.
Before this commit, cluster_networking_config would always have the teleport.dev/origin: config-file, even if no related field was set.

This commit ensures that `cluster_networking_config` has the
`teleport.dev/origin: defaults` label when no related field is set in
the configuration file.
Before this commit, `cluster_networking_config` would always have the
`teleport.dev/origin: config-file`, even if no related field was set.
@@ -635,6 +635,7 @@ func applyAuthConfig(fc *FileConfig, cfg *service.Config) error {
}

if fc.Auth.MessageOfTheDay != "" {
cfg.Auth.Preference.SetOrigin(types.OriginConfigFile)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this was missed in de9bdf0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the origin is already OriginDynamic? MessageOfTheDay seems less consequential than the rest of the auth preferences, I'm not sure it makes sense to overwrite the origin for this, but I don't know much about what these origin tags are used for

@@ -635,6 +635,7 @@ func applyAuthConfig(fc *FileConfig, cfg *service.Config) error {
}

if fc.Auth.MessageOfTheDay != "" {
cfg.Auth.Preference.SetOrigin(types.OriginConfigFile)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the origin is already OriginDynamic? MessageOfTheDay seems less consequential than the rest of the auth preferences, I'm not sure it makes sense to overwrite the origin for this, but I don't know much about what these origin tags are used for

lib/config/configuration.go Outdated Show resolved Hide resolved
Co-authored-by: Nic Klaassen <nic@goteleport.com>
@vitorenesduarte vitorenesduarte self-assigned this Dec 8, 2022
@vitorenesduarte
Copy link
Contributor Author

What if the origin is already OriginDynamic? MessageOfTheDay seems less consequential than the rest of the auth preferences, I'm not sure it makes sense to overwrite the origin for this, but I don't know much about what these origin tags are used for

@nklaassen thanks for raising this. This PR as is can indeed change the outcome of shouldInitReplaceResourceWithOrigin:

teleport/lib/auth/init.go

Lines 527 to 552 in 4f89756

func shouldInitReplaceResourceWithOrigin(stored, candidate types.ResourceWithOrigin) (bool, error) {
if candidate == nil || (candidate.Origin() != types.OriginDefaults && candidate.Origin() != types.OriginConfigFile) {
return false, trace.BadParameter("candidate origin must be either defaults or config-file (this is a bug)")
}
// If there is no resource stored in the backend or it was not dynamically
// configured, the candidate resource should be stored in the backend.
if stored == nil || stored.Origin() != types.OriginDynamic {
return true, nil
}
// If the candidate resource is explicitly configured in the config file,
// store this config-file resource in the backend no matter what.
if candidate.Origin() == types.OriginConfigFile {
// Log a warning when about to overwrite a dynamically configured resource.
if stored.Origin() == types.OriginDynamic {
log.Warnf("Stored %v resource that was configured dynamically is about to be discarded in favor of explicit file configuration.", stored.GetKind())
}
return true, nil
}
// The resource in the backend was configured dynamically, and there is no
// more authoritative file configuration to replace it. Keep the stored
// dynamic resource.
return false, nil
}

This can result in a breaking change (in a seemingly unlikely situation) as shown below.


Given the code above, we have the following 8 possibilities:

# current candidate does replace?
1 nil defaults yes
2 nil config-file yes
3 defaults defaults yes
4 defaults config-file yes
5 config-file defaults yes
6 config-file config-file yes
7 dynamic defaults no
8 dynamic config-file yes

cluster_auth_preference

If a given configuration only has the message_of_the_day cluster_auth_preference field, then:

  • before this PR, it would be a candidate with defaults (which seems wrong, and that's why I added this line)
  • with this PR, it will be a candidate with config-file

Now, if there's already a cluster_auth_preference that is dynamic (as you asked), then:

  • before this PR we had situation #7, and so the dynamic resource is kept
  • with this PR we have situation #8, and so the dynamic resource is discarded with a warning

This means that this PR would be a breaking change.
Given this, should we remove this line?

[testing done to verify the above]
  • start teleport: teleport-master start -c cap_config.yaml
# cap_config.yaml
version: v3
teleport:
  nodename: my-node-cap
  data_dir: my-node-cap-data
  log:
    output: stderr
    severity: INFO
  ca_pin: ""
  diag_addr: ""
auth_service:
  enabled: "yes"
  listen_addr: 0.0.0.0:3025
  message_of_the_day: "welcome"
  • check cluster_auth_preference: tctl -c cap_config.yaml get cap
# origin is default even though message_of_the_day is set
kind: cluster_auth_preference
metadata:
  id: 1670585925908005000
  labels:
    teleport.dev/origin: defaults
  name: cluster-auth-preference
spec:
  allow_local_auth: true
  allow_passwordless: false
  disconnect_expired_cert: false
  locking_mode: best_effort
  message_of_the_day: welcome
  second_factor: otp
  type: local
version: v2
  • make cluster_auth_preference dynamic, setting disconnect_expired_cert: true: tctl -c cap_config.yaml create -f cap.yaml
# cap.yaml
kind: cluster_auth_preference
metadata:
  name: cluster-auth-preference
spec:
  message_of_the_day: welcome
  disconnect_expired_cert: true
version: v2
  • check cluster_auth_preference: tctl -c cap_config.yaml get cap
# origin is now dynamic
kind: cluster_auth_preference
metadata:
  id: 1670586368075235000
  labels:
    teleport.dev/origin: dynamic
  name: cluster-auth-preference
spec:
  allow_local_auth: true
  allow_passwordless: false
  disconnect_expired_cert: true
  locking_mode: best_effort
  message_of_the_day: welcome
  second_factor: otp
  type: local
version: v2
  • restart teleport: teleport-master start -c cap_config.yaml

  • check cluster_auth_preference: tctl -c cap_config.yaml get cap

# the update on `disconnect_expired_cert` is still here
kind: cluster_auth_preference
metadata:
  id: 1670586368075235000
  labels:
    teleport.dev/origin: dynamic
  name: cluster-auth-preference
spec:
  allow_local_auth: true
  allow_passwordless: false
  disconnect_expired_cert: true
  locking_mode: best_effort
  message_of_the_day: welcome
  second_factor: otp
  type: local
version: v2
  • upgrade teleport to this PR & restart: teleport-pr start -c cap_config.yaml

  • check cluster_auth_preference: tctl -c cap_config.yaml get cap

# the update on `disconnect_expired_cert` was lost as `disconnect_expired_cert` is now `false` again
kind: cluster_auth_preference
metadata:
  id: 1670586581724934000
  labels:
    teleport.dev/origin: config-file
  name: cluster-auth-preference
spec:
  allow_local_auth: true
  allow_passwordless: false
  disconnect_expired_cert: false
  locking_mode: best_effort
  message_of_the_day: welcome
  second_factor: otp
  type: local
version: v2

cluster_networking_config

For this case, we don't seem to have a breaking change, and we can have a dynamic cluster_networking_config that persists across restarts.

If a given configuration has no cluster_networking_config fields, then:

  • before this PR, it would be a candidate with config-file (which seems wrong)
  • with this PR, it will be a candidate with defaults

Now, if there's already a cluster_networking_config that is dynamic, then:

  • before this PR we had situation #8, and so the dynamic resource is discarded with a warning (which means that currently it's not possible to have a dynamic resource that doesn't get discarded when e.g. the auth server restarts)
  • with this PR we have situation #7, and so the dynamic resource is kept
[testing done to verify the above]
  • start teleport: teleport-master start -c cnc_config.yaml
# cnc_config.yaml
version: v3
teleport:
  nodename: my-node-cnc
  data_dir: my-node-cnc-data
  log:
    output: stderr
    severity: INFO
  ca_pin: ""
  diag_addr: ""
auth_service:
  enabled: "yes"
  listen_addr: 0.0.0.0:3025
  • check cluster_networking_config: tctl -c cnc_config.yaml get networking
# origin is config-file even though no field is set
kind: cluster_networking_config
metadata:
  id: 1670587121167239000
  labels:
    teleport.dev/origin: config-file
  name: cluster-networking-config
spec:
  client_idle_timeout: 0s
  idle_timeout_message: ""
  keep_alive_count_max: 3
  keep_alive_interval: 5m0s
  session_control_timeout: 0s
  tunnel_strategy:
    type: agent_mesh
  web_idle_timeout: 0s
version: v2
  • make cluster_networking_config dynamic (with --force --confirm), setting keep_alive_interval: 10m: tctl -c cnc_config.yaml create -f cnc.yaml --force --confirm
# cnc.yaml
kind: cluster_networking_config
metadata:
  name: cluster-networking-config
spec:
  keep_alive_interval: 10m
version: v2
  • check cluster_networking_config: tctl -c cnc_config.yaml get networking
# origin is now dynamic
kind: cluster_networking_config
metadata:
  id: 1670588089715194000
  labels:
    teleport.dev/origin: dynamic
  name: cluster-networking-config
spec:
  client_idle_timeout: 0s
  idle_timeout_message: ""
  keep_alive_count_max: 3
  keep_alive_interval: 10m0s
  session_control_timeout: 0s
  tunnel_strategy:
    type: agent_mesh
  web_idle_timeout: 0s
version: v2
  • (if here we upgrade teleport & restart, then the update on keep_alive_interval is not lost)

  • restart teleport: teleport-master start -c cnc_config.yaml

  • check cluster_networking_config: tctl -c cnc_config.yaml get networking

# the update on `keep_alive_interval` was lost as `keep_alive_interval` is now `5m` again
kind: cluster_networking_config
metadata:
  id: 1670588220989450000
  labels:
    teleport.dev/origin: config-file
  name: cluster-networking-config
spec:
  client_idle_timeout: 0s
  idle_timeout_message: ""
  keep_alive_count_max: 3
  keep_alive_interval: 5m0s
  session_control_timeout: 0s
  tunnel_strategy:
    type: agent_mesh
  web_idle_timeout: 0s
version: v2
  • upgrade teleport to this PR & restart: teleport-pr start -c cnc_config.yaml

  • check cluster_networking_config: tctl -c cnc_config.yaml get networking

# origin is defaults
kind: cluster_networking_config
metadata:
  id: 1670588391282321000
  labels:
    teleport.dev/origin: defaults
  name: cluster-networking-config
spec:
  client_idle_timeout: 0s
  idle_timeout_message: ""
  keep_alive_count_max: 3
  keep_alive_interval: 5m0s
  session_control_timeout: 0s
  tunnel_strategy:
    type: agent_mesh
  web_idle_timeout: 0s
version: v2
  • make cluster_networking_config dynamic (no need for --force --confirm now), setting keep_alive_interval: 10m: tctl -c cnc_config.yaml create -f cnc.yaml

  • check cluster_networking_config: tctl -c cnc_config.yaml get networking

# origin is now dynamic
kind: cluster_networking_config
metadata:
  id: 1670588510946801000
  labels:
    teleport.dev/origin: dynamic
  name: cluster-networking-config
spec:
  client_idle_timeout: 0s
  idle_timeout_message: ""
  keep_alive_count_max: 3
  keep_alive_interval: 10m0s
  session_control_timeout: 0s
  tunnel_strategy:
    type: agent_mesh
  web_idle_timeout: 0s
version: v2
  • restart teleport: teleport-pr start -c cnc_config.yaml

  • check cluster_networking_config: tctl -c cnc_config.yaml get networking

# the update on `keep_alive_interval` is still here
# ...

@@ -778,6 +778,9 @@ func TestApplyConfig(t *testing.T) {
require.NoError(t, err)
require.Equal(t, types.AgentMesh, tunnelStrategyType)
require.Equal(t, types.DefaultAgentMeshTunnelStrategy(), cfg.Auth.NetworkingConfig.GetAgentMeshTunnelStrategy())
require.Equal(t, cfg.Auth.Preference.Origin(), types.OriginConfigFile)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These assertions are backwards. The expected value should come before the actual value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Done in 3a7053d.

// configuration fields have values different from an empty Auth.
func (a *Auth) hasCustomNetworkingConfig() bool {
empty := Auth{}
return a.ClientIdleTimeout != empty.ClientIdleTimeout ||
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just compare these to the zero value without creating the empty auth struct first?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was suggested during review and done in 857fd1e.

@vitorenesduarte
Copy link
Contributor Author

@nklaassen thanks for raising this. This PR as is can indeed change the outcome of shouldInitReplaceResourceWithOrigin:

teleport/lib/auth/init.go

Lines 527 to 552 in 4f89756

func shouldInitReplaceResourceWithOrigin(stored, candidate types.ResourceWithOrigin) (bool, error) {
if candidate == nil || (candidate.Origin() != types.OriginDefaults && candidate.Origin() != types.OriginConfigFile) {
return false, trace.BadParameter("candidate origin must be either defaults or config-file (this is a bug)")
}
// If there is no resource stored in the backend or it was not dynamically
// configured, the candidate resource should be stored in the backend.
if stored == nil || stored.Origin() != types.OriginDynamic {
return true, nil
}
// If the candidate resource is explicitly configured in the config file,
// store this config-file resource in the backend no matter what.
if candidate.Origin() == types.OriginConfigFile {
// Log a warning when about to overwrite a dynamically configured resource.
if stored.Origin() == types.OriginDynamic {
log.Warnf("Stored %v resource that was configured dynamically is about to be discarded in favor of explicit file configuration.", stored.GetKind())
}
return true, nil
}
// The resource in the backend was configured dynamically, and there is no
// more authoritative file configuration to replace it. Keep the stored
// dynamic resource.
return false, nil
}

This can result in a breaking change (in a seemingly unlikely situation) as shown below.

If a given configuration only has the message_of_the_day cluster_auth_preference field, then:

  • before this PR, it would be a candidate with defaults (which seems wrong, and that's why I added this line)
  • with this PR, it will be a candidate with config-file

Now, if there's already a cluster_auth_preference that is dynamic (as you asked), then:

  • before this PR we had situation #7, and so the dynamic resource is kept
  • with this PR we have situation #8, and so the dynamic resource is discarded with a warning

This means that this PR would be a breaking change.

@nklaassen @zmb3 Given the above, I decided to revert this small change in dfbad2d so that this PR is just about #19099.

dfbad2d also adds a test case describing the current behaviour and I opened #19272 in case we want to fix this afterwards.

@vitorenesduarte vitorenesduarte enabled auto-merge (squash) December 13, 2022 09:37
@vitorenesduarte vitorenesduarte merged commit 0050eb7 into master Dec 13, 2022
@github-actions
Copy link

@vitorenesduarte See the table below for backport results.

Branch Result
branch/v11 Create PR

@vitorenesduarte vitorenesduarte deleted the vitor/networking-config-defaults branch December 13, 2022 13:21
vitorenesduarte pushed a commit that referenced this pull request Dec 19, 2022
This commit ensures that `cluster_networking_config` has the
`teleport.dev/origin: defaults` label when no related field is set in
the configuration file.
Before this commit, `cluster_networking_config` would always have the
`teleport.dev/origin: config-file`, even if no related field was set.
vitorenesduarte pushed a commit that referenced this pull request Dec 20, 2022
…19474)

This commit ensures that `cluster_networking_config` has the
`teleport.dev/origin: defaults` label when no related field is set in
the configuration file.
Before this commit, `cluster_networking_config` would always have the
`teleport.dev/origin: config-file`, even if no related field was set.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants