Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config-linux: Require no cgroup tweaks when linux.resources is unset #576

Closed

Conversation

wking
Copy link
Contributor

@wking wking commented Sep 19, 2016

Or empty. Using:

"resources": {}

should mean the same thing as:

"resources": null

or as not specifying resources at all, so we can drop the “Do not specify” requirement.

It's good to be clear about what leaving the properties unset/empty means though. I'd prefer a config-wide rule about explosing platform defaults for empty/unset properties, but if that is too much to bite off I expect we can at least do that for cases where a new container is joining an existing cgroup. With the wording I'm proposing here, runtimes would still be free to clobber settings on new cgroups in an unspecified manner as they see fit.

Spun off from the Windows discussion. #540 is also in-flight in the no-tweaking space, although it is orthogonal to this change.

@mrunalp
Copy link
Contributor

mrunalp commented Sep 20, 2016

LGTM

Approved with PullApprove

For example, to run a new process in an existing container without updating limits, `resources` need not be specified.
You can configure a container's cgroups via the OPTIONAL `resources` property.
When `resources` is empty or unset, the runtime MUST NOT alter properties of existing cgroups.
When a subset of `resources` is empty or unset, the runtime MUST not alter the properties of existing cgroups covered by that subset.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/MUST not/MUST NOT

Or empty.  Using:

  "resources": {}

should mean the same thing as:

  "resources": null

or as not specifying 'resources' at all, so we can drop the "Do not
specify" requirement.

It's good to be clear about what leaving the properties unset/empty
means though.  I'd prefer a config-wide rule like [1], but if that is
too much to bite off I expect we can at least do that for cases where
a new container is joining an existing cgroup.

[1]: https://groups.google.com/a/opencontainers.org/forum/#!topic/dev/DWdystx5X3A
     Subject: Exposing platform defaults
     Date: Thu, 14 Jan 2016 15:36:26 -0800
     Message-ID: <20160114233625.GN6362@odin.tremily.us>

Signed-off-by: W. Trevor King <wking@tremily.us>
@wking wking force-pushed the tk/empty-linux-resources-no-tweak branch from fc992aa to cca6b65 Compare September 21, 2016 06:47
@wking
Copy link
Contributor Author

wking commented Sep 21, 2016

I fixed a “MUST not” → “MUST NOT” typo 1 and rebased onto master
with fc992aacca6b65.

@hqhq
Copy link
Contributor

hqhq commented Sep 21, 2016

LGTM

Approved with PullApprove

@crosbymichael
Copy link
Member

{} and null are two different things. How are we supposed to tell that it is empty unless we walk all the fields? This is why we have nullable fields in the spec so why are we needing this?

@wking
Copy link
Contributor Author

wking commented Sep 22, 2016

On Wed, Sep 21, 2016 at 03:20:28PM -0700, Michael Crosby wrote:

{} and null are two different things.

At the JSON level, I agree. If you're talking about the semantics of
a runtime-spec config though, I don't see a distinction. What would
you change about the container if you changed the config value from
null to {}?

How are we supposed to tell that it is empty unless we walk all the
fields? This is why we have nullable fields in the spec so why are
we needing this?

I agree that null gives you a more convenient check for “the
configuration contains nothing about linux.resources”. The empty
object also contains nothing about linux.resources, it's just harder
to check. This PR just makes it illegal for runtimes to bump
properties in existing cgroups unless the associated config property
is set.

If a runtime implementation previously only updated existing cgroup
properties when the associated config property
(e.g. linux.resources.memory.limit) was set, you shouldn't have to
change anything with this PR.

@crosbymichael
Copy link
Member

REJECT

@wking
Copy link
Contributor Author

wking commented Sep 22, 2016

On Thu, Sep 22, 2016 at 09:12:36AM -0700, Michael Crosby wrote:

REJECT

Because… ?

@cyphar
Copy link
Member

cyphar commented Oct 18, 2016

@wking My issue with this is that it means that (for example), we cannot set device restrictions on a container if the user hasn't set any device restrictions. My main issue is this sentence:

When a subset of resources is empty or unset, the runtime MUST NOT alter the properties of existing cgroups covered by that subset.

Maybe if it's {} I would understand this logic (because you cannot accidentailly set "resources": { "devices": {}}), but what you're doing is making it impossible for runtimes to set additional security options on behalf of their users. And thus "resources": {} must not have any restrictions because of the subset change I just mentioned. And even with "devices": {} it might not be a good idea to force runtimes to not be able to set any additional cgroup restrictions. Overall this change has quite a few far-reaching consequences that I don't think you've considered.

So I also REJECT this.

@wking
Copy link
Contributor Author

wking commented Oct 18, 2016

On Tue, Oct 18, 2016 at 01:00:05AM -0700, Aleksa Sarai wrote:

… but what you're doing is making it impossible for runtimes to set
additional security options on behalf of their users…

I'm fine with that. For security options that are covered by the
config, the runtime should just do what it's been told to do. Helpful
hinting about increasing security for naive users belongs in
‘oci-runtime-tool generate’ and similar when setting up the config, so
the user has a chance to inspect and alter the policy recommendations.
In many cases (e.g. exec'ing a new process in an existing sandbox),
the fact that the cgroup already exists means they've already been
setup the way the user wants them.

With your approach (not forbidding the runtime to alter controllers
that are empty or unset), how do users that disagree with the
runtime's built-in policies turn them off?

@cyphar
Copy link
Member

cyphar commented Oct 18, 2016

how do users that disagree with the runtime's built-in policies turn them off?

They set configurations for the controller that explicitly is the value they want. Anything less is subject to interpretation anyway, because you could have other services running on the machine that add restrictions that are counter to what a user might expect. What is the point of allowing for "system defaults" with null if you're going to turn around and say "the runtime cannot touch this -- so the system default isn't the runtime default". That's just not correct -- the runtime defines the system defaults.

There is a valid concern about list-based settings like devices, and I'm not sure what to do with that case (should the runtime be allowed to append to it). But I think that allowing appending to lists is fine for devices -- so long as it doesn't conflict with what the user explicitly set.

The whole reason I argued for null in the first place is so that runtimes can define system defaults without conflicting with options that users explicitly set. Can we please just go back to that and stop trying to make it harder for runtimes to actually make containers secure.

Helpful hinting about increasing security for naive users belongs in ‘oci-runtime-tool generate’ and similar when setting up the config

"Helpful hinting" while restricting runtimes is not the solution to making containers secure.

@wking
Copy link
Contributor Author

wking commented Oct 18, 2016

On Tue, Oct 18, 2016 at 03:38:45AM -0700, Aleksa Sarai wrote:

how do users that disagree with the runtime's built-in policies
turn them off?

They set configurations for the controller that explicitly is the
value they want. Anything less is subject to interpretation anyway,
because you could have other services running on the machine that
add restrictions that are counter to what a user might expect. What
is the point of allowing for "system defaults" with null if you're
going to turn around and say "the runtime cannot touch this -- so
the system default isn't the runtime default". That's just not
correct -- the runtime defines the system defaults.

I think “the runtime defines the system defaults” is not what we want,
especially for pre-existing cgroups.

Say I launch container-1 with whatever settings. Now I launch
container-2 with the same cgroupsPath (because I want a monitor
process with the same resource constraints). With your approach, I
have to translate the container-1 constraints into OCI linux.resources
language for container-2, even if container-1 was not launched via an
OCI runtime (or it was, but post-create work adjusted the
controllers).

Or say you have an external cgroup-manager whose config syntax you
like better than the OCI linux.resources. You use your manager to
create a handful of cgroups and use cgroupsPath to launch your
container into them. With your approach you have to translate that
configuration into OCI linux.resources language as well.

I think the implementation is easier, and the result is easier to use,
if we just require the runtime to leave these things alone unless you
explicitly tell it to do something.

The whole reason I argued for null in the first place is so
that runtimes can define system defaults without conflicting with
options that users explicitly set. Can we please just go back to
that and stop trying to make it harder for runtimes to actually make
containers secure.

With the current phrasing in this PR, the runtime is still free to do
whatever it wants (as long as it doesn't conflict with an
explicitly-set parameter) for controllers it creates. It's just not
allowed to change an existing controller unless the configuration
tells it to. So folks who set cgroupsPath to something that doesn't
exist can still have your runtime-defined default controller settings.

Helpful hinting about increasing security for naive users belongs
in ‘oci-runtime-tool generate’ and similar when setting up the
config

"Helpful hinting" while restricting runtimes is not the solution to
making containers secure.

We're all consenting adults here. If you don't trust yourself to
write an appropriately safe config, get oci-runtime-tool to write it
for you. If have already setup a controller to your liking, it should
be easy to launch a container into that controller without the runtime
mucking about.

@cyphar
Copy link
Member

cyphar commented Oct 18, 2016

@wking

I think “the runtime defines the system defaults” is not what we want, especially for pre-existing cgroups.

Sorry, I misread the last part of the sentence of this PR. I read it as "at all" not as "pre-existing cgroups", which is why I reacted against it like that. I'm going to think more about this...

However I just want to mention that the runtime is an authority to decide what a system default is, simply because the runtime is part of the system. If you want, I can open a PR to change the word "system" to "runtime" if you don't agree with my reading of the phrase "system default" (which is why I used that phrasing in the initial discussions for cgroup settings).

We're all consenting adults here.

But some of us are more aware and experienced with the security risks than others. 😉

@crosbymichael
Copy link
Member

If you don't agree with the runtime defaults then a runtime flag or not using the runtime is a better option for your instead of trying to do some da vinci code logic on what is null, {}, and valid inside the spec and how you can combine those into some street fighter combo move to make the runtime do what you want. Its just making things confusing and unusable both for a consumer and runtime author.

Explicit is better than implicit.

@wking
Copy link
Contributor Author

wking commented Nov 6, 2016

On Wed, Oct 19, 2016 at 10:06AM -0700, Michael Crosby wrote:

If you don't agree with the runtime defaults…

I can accept unspecified (or hopefully implementation-defined) runtime
defaults when the runtime is creating a new cgroup.

What I really don't want is the runtime mucking about with an
existing cgroup when I join it 1. The decisions about how that
cgroup should be configured have already been made, I just want to put
the new container process inside it.

… instead of trying to do some da vinci code logic on what is null,
{}, and valid inside the spec and how you can combine those into
some street fighter combo move to make the runtime do what you
want. Its just making things confusing and unusable both for a
consumer and runtime author.

As I said earlier 2, if a runtime implementation previously only
updated existing cgroup properties when the associated config property
(e.g. linux.resources.memory.limit) was set, you shouldn't have to
change anything with this PR. That doesn't sound particularly magical
to me, and it appears to be what runC already does [3,4].

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants