Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[1.1] Fix failed exec after systemctl daemon-reload (regression in 1.1.3) #3554

Merged
merged 1 commit into from
Aug 18, 2022

Conversation

kolyshkin
Copy link
Contributor

@kolyshkin kolyshkin commented Aug 11, 2022

A regression reported for runc v1.1.3 says that after systemctl
daemon-reload runc exec fails:

exec failed: unable to start container process: open /dev/pts/0: operation not permitted: unknown

Apparently, with commit 7219387 we are no longer adding
"DeviceAllow=char-pts rwm" rule (because os.Stat("char-pts") returns
ENOENT).

The bug can only be seen after "systemctl daemon-reload" because runc
also applies the same rules manually (by writing to devices.allow for
cgroup v1), and apparently reloading systemd leads to re-applying the
rules that systemd has (thus removing the char-pts access).

The fix is to do os.Stat only for "/dev" paths.

Also, emit a warning that the path was skipped. Since the original idea
was to emit less warnings, demote the level to debug.

Note this also fixes the issue of not adding "m" permission for block-*
and char-* devices.

A test case is added, which reliably fails before the fix
on both cgroup v1 and v2.

Fixes: #3551
Fixes: 7219387

@kolyshkin kolyshkin added this to the 1.1.4 milestone Aug 11, 2022
@kolyshkin kolyshkin changed the title Fix failed exec after systemctl daemon-reload [1.1] Fix failed exec after systemctl daemon-reload Aug 16, 2022
@kolyshkin
Copy link
Contributor Author

CI failure is to be fixed by #3558

Still no regression test :(

@thaJeztah
Copy link
Member

Tried a build from this PR, and that fixes the issue reported in moby/moby#43960 (reply in thread)

@AkihiroSuda
Copy link
Member

Thanks, but I don't see the main PR, isn't the main branch affected?

@kolyshkin kolyshkin marked this pull request as ready for review August 16, 2022 23:37
@kolyshkin

This comment was marked as outdated.

@kolyshkin kolyshkin changed the title [1.1] Fix failed exec after systemctl daemon-reload [1.1] Fix failed exec after systemctl daemon-reload (regression in 1.1.3) Aug 16, 2022
thaJeztah
thaJeztah previously approved these changes Aug 17, 2022
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I'm ok with looking for a test as follow-up

@thaJeztah
Copy link
Member

Thanks, but I don't see the main PR, isn't the main branch affected?

good point; PR for main is in #3559, but indeed still in draft; do we want the one in main first?

@kolyshkin

This comment was marked as outdated.

@kolyshkin

This comment was marked as outdated.

@kolyshkin kolyshkin marked this pull request as ready for review August 18, 2022 22:03
AkihiroSuda
AkihiroSuda previously approved these changes Aug 18, 2022
@AkihiroSuda
Copy link
Member

Please run git cherry-pick with -x for tracking the main commit

@kolyshkin
Copy link
Contributor Author

kolyshkin commented Aug 18, 2022

Please run git cherry-pick with -x for tracking the main commit

That is what I usually do, although not in this case. The reasons are:

  • I initially created the fix in release-1.1 branch and then forward-ported it to main (so this one is not a cherry-pick)
  • Automatic cherry-pick (either way) is not possible because the code we patch was moved to a different file.

I can certainly add a reference to commit 58b1374 to the commit message.

A regression reported for runc v1.1.3 says that "runc exec -t" fails
after doing "systemctl daemon-reload":

> exec failed: unable to start container process: open /dev/pts/0: operation not permitted: unknown

Apparently, with commit 7219387 we are no longer adding
"DeviceAllow=char-pts rwm" rule (because os.Stat("char-pts") returns
ENOENT).

The bug can only be seen after "systemctl daemon-reload" because runc
also applies the same rules manually (by writing to devices.allow for
cgroup v1), and apparently reloading systemd leads to re-applying the
rules that systemd has (thus removing the char-pts access).

The fix is to do os.Stat only for "/dev" paths.

Also, emit a warning that the path was skipped. Since the original idea
was to emit less warnings, demote the level to debug.

Note this also fixes the issue of not adding "m" permission for block-*
and char-* devices.

A test case is added, which reliably fails before the fix
on both cgroup v1 and v2.

This is a backport of commit 58b1374
to release-1.1 branch.

Fixes: opencontainers#3551
Fixes: 7219387
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin
Copy link
Contributor Author

I can certainly add a reference to commit 58b1374 to the commit message.

Done; PTAL @AkihiroSuda

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AkihiroSuda AkihiroSuda merged commit 46a5a84 into opencontainers:release-1.1 Aug 18, 2022
@thaJeztah
Copy link
Member

@kolyshkin @AkihiroSuda @cyphar should we do a 1.1.4 release to fix the regression, or do we have more things to fix? current diff looks like; v1.1.3...release-1.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants