-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add validation for sysctl #303
Add validation for sysctl #303
Conversation
@crosbymichael @LK4D4 ping |
a2b2ef3
to
9bdc01c
Compare
/proc/sys isn't completely namespaced and only some properties are allowed per linux namespace. Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
9bdc01c
to
f7d1401
Compare
is it even worth having this field when only a few are valid? |
@crosbymichael We have a lot of customers asking for this. net itself gives a lot of options. |
oh, right, i see the prefixes now |
Also, more options will be made configurable over time from what I gathered talking to kernel engineers. |
@jeremyeder PTAL Are all of these network sysctls actually Namespace aware? |
Would we want to make this available to docker as a public interface so that it could catch these early in the parse commands? |
I think jeder should be @jeremyeder. |
@Mrunal No, we can't just pass in the whole "net" sysctl space. Even though some might actually work, there's some strange side-effects; for example you can adjust net.ipv4.tcp_mem in a container but the stack actually respects what's set on the host's kernel. |
vm.nr_hugepages Does not blow up, however not sure it worked correctly.
|
@jeremyeder Thanks for your input. I will get input from the right folks to create a better list for net and things that may be missing here. |
I'm not @jeremyeder, so if he is not responding, you may have an address -John Eder On Tue, Sep 29, 2015 at 4:43 AM, Mrunal Patel notifications@github.com
J. Eder +1 619.847.5579 | +81 80.8903.2817 | j@eder.ws |
Should we validate at all? If you asked for something we just try to do it? |
I don't think we should validate. We should leave his up to the user to validate whether it works or not, as we explain in the man pages and docs. Perhaps it would be helpful to have a page where we list sysctls that we know work with a particular kernel version. Going forward this check is going to be very kernel specific, so verification is often going to be wrong. |
@crosbymichael @rhatdan Sounds good. If we find a better way we could add it in the future. |
Re-opening to see if there is interest in merging this so sysctl could be enabled safely in docker |
@crosbymichael @LK4D4 WDYT? There are quite a few use cases that are blocked by not enabling this feature in docker. |
From a community sense, it makes sense to curate a list of safe procfs tunables as part of runc. Until we have a complete list, we could possibly allow a backdoor. |
Is there a way were we can move this out of the specs and into hooks/ create/ start lifecycle? |
It would be possible but a very roundabout way. Any concerns with doing it like in this PR? We could go by the man page and support what is documented. |
@crosbymichael Why do you think it would be better in one of those? |
@mrunalp do you just want to do the validation at the higher level so that we don't have to recompile or update libcontainer/runc when something changes? |
Not sure you want to do this, if we want to prevent users from shooting themselves in the foot, when doing runc/oci type apps. If we want to protect them, we should protect them in all tools. One thing we are doing in runc is also looking at whether they disabled the host or ipc namespace, which is pretty difficult to do in the docker cli. |
LGTM |
If we do run into an issue where we have to frequently modify this list then we could add configuration for it. I think we can merge this and then see how it goes. |
So after this, container will never be as privileged as host, isn't this gonna break some users who use container to hack with host, like rancherOS? ping @ibuildthecloud The validation looks reasonable to me, just think maybe we should make it optional. |
We could make it optional in privileged mode as the file system is made read only in non privileged anyways. Sent from my iPhone
|
Of course in --privileged mode you can just execute sysctl directly. |
I think we don't need to modify this PR. One can go and set /proc/sys when in privileged mode and there is no harm in the validation when using the CLI so one distinguish between what is safe in container vs being set on the host. |
I agree. We are only allowing what we know of as "Namespaced sysctls" |
@mrunalp Do you mean we can make the sysctl validation optional now without any furture modification? Can you explain how to do that? |
I mean that this patch is sufficient. No special handling of privileged is necessary as /proc/sys is writeable under privileged and allows the user to set sysctls that are not container specific. Sent from my iPhone
|
@mrunalp Thanks for clarification, I'm worry about people might want use sysctl with CLI in LGTM. |
@hqhq I think the current assumption is that --privileged does not change other behaviour in the system. We were discussing similar behaviour with UserNamespace and it was decided that --privileged does not effect userNamespace but a --userns=host is being added to turn off user namespace. I think that precedent would follow here. In that you can do priv ops in the container, but we will continue to block you doing something accidentally. |
Add validation for sysctl
narrative cleanup in support of Base Config Compatibility opencontainers#303
/proc/sys isn't completely namespaced and only some properties are allowed
per linux namespace.
The list has been compiled from namespaces(7)
Signed-off-by: Mrunal Patel mrunalp@gmail.com