Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ops user changes should be rolled back when attempt at VCH reconfigure fails #7814

Open
2 tasks
jzt opened this issue Apr 23, 2018 · 8 comments
Open
2 tasks
Assignees

Comments

@jzt
Copy link
Contributor

jzt commented Apr 23, 2018

User Statement:

As a user, when I attempt to change ops user permissions during a vic-machine configure, I would expect those changes to be rolled back in the event of a failure.

Details:

Currently, an attempt to use vic-machine configure to grant permissions to an ops user does not work (see issue #7725, and fix #7777). Even with this fix, there exists a scenario in which changes to the ops user during a failing vic-machine configure persist even after a rollback is executed. Logic to undo changes to the ops user permissions should be added to the rollback step.

Acceptance Criteria:

  • rollback includes logic to undo any changes to the ops user that were present during a failed attempt at vic-machine configure
  • integration/unit tests to exercise this rollback logic
@jzt jzt added kind/defect Behavior that is inconsistent with what's intended component/install labels Apr 23, 2018
@jzt jzt assigned zjs Apr 23, 2018
@zjs
Copy link
Member

zjs commented Apr 23, 2018

I'm going to mark this as a P1 because it's a regression in behavior since 1.3. I don't think we can reasonably fix this for 1.4 due to the complexity, but I think it's significant enough to call out as something that we should be fixing, but can't.

To restate the impact of this, the effects are twofold:

  1. Following the rollback of an unsuccessful change (configure or upgrade) where additional privileges were granted to the operations user, those additional privileges are left as "cruft". This leads to a (slight) violation of the principle of least privilege as we leave the VCH with an operations user which can do more than it needs to.
  2. Following the rollback of an unsuccessful change where privileges were revoked from the operations user, the VCH will not start. This would lead to a significant violation of our contract around rollback, in that we do not leave the VCH in a good state. However, currently there may not be any changes which lead to privileges being revoked. (There certainly will in the future though. See, e.g., Update vic-machine configure to support VM-Host Affinity #7561.)

@mdubya66
Copy link
Contributor

Is there a workaround? Adding kind/note.

@mdubya66 mdubya66 added the impact/doc/note Requires creation of or changes to an official release note label Apr 23, 2018
@zjs
Copy link
Member

zjs commented Apr 23, 2018

A high-level workaround would be for the administrator to manually adjust the permissions of the operations user back to what they "should be". That's an annoying sort of workaround though; presumably the whole reason they're using the --ops-grant-perms functionality is that they don't want to be manually fussing with the permissions.

@zjs
Copy link
Member

zjs commented Apr 23, 2018

Marking this as blocked by #7818. I think the easiest way to address this would be to use invoke opsuser.GrantOpsUserPerms with the old configuration. This seems simpler and less error-prone than attempting to track the previous state and then restore it.

@zjs
Copy link
Member

zjs commented Apr 23, 2018

Gracefully plumbing through the old configuration values seems to be the hardest part of the above idea; the rest of the change looks roughly like:

	if conf.ShouldGrantPerms() {
		err = opsuser.GrantOpsUserPerms(d.op, d.session, conf)
		if err != nil {
			return errors.Errorf("Failed to grant permissions to ops-user, failure: %s", err)
		}
+
+		defer func() {
+			if err != nil {
+				d.op.Info("Attempting to undo changes to operations user due to previous error: %s", err)
+
+				opsuser.GrantOpsUserPerms(d.op, d.session, oldConf)
+			}
+		}()
	}

@stuclem
Copy link
Contributor

stuclem commented May 15, 2018

Release note:

  • Attempts to change operations user permissions with vic-machine configure do not roll back in the event of a failure. #7814
    Changes to the operations user that are made during a failed vic-machine configure operation persist even after a rollback.

@jzt is this OK?

@jzt
Copy link
Contributor Author

jzt commented May 15, 2018

@stuclem Looks good. It may be worth noting that there is a workaround to cleanup: Browse to Administration -> Roles, and delete every role that begins with vch-. WARNING: This workaround should only be performed when there are no other virtual container hosts running in VC under the ops user, as this will remove the roles for any (all) virtual container hosts that are running in VC.

@stuclem
Copy link
Contributor

stuclem commented May 15, 2018

Thanks @jzt. Added:

Workaround: In the vSphere Client, go to Administration -> Roles and delete every role that begins with vch. WARNING: Only perform this workaround if there are no other VCHs running in vCenter Server that are configured with an operations user, as this removes the roles for all VCHs that are running in that vCenter Server instance.

@stuclem stuclem removed the impact/doc/note Requires creation of or changes to an official release note label May 15, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants