
[RFC] Always create sandbox init process and remove the field sandbox_pidns #54

Closed · wants to merge 1 commit

Conversation

laijs (Contributor) commented Dec 12, 2017

In hyperstart, the pod_init process has its own net/pid/uts/ipc namespaces.

In kata, these namespaces are also essential.

A net ns for the sandbox should always be created, since the containers' network should be isolated from the VM's initial network to avoid interference.

A pid ns for the sandbox should always be created, since the containers always join this ns. We could use a new pid ns for each container, but in most cases they share the sandbox init process's pid ns.

A uts ns for the sandbox should always be created, so the hostname can be set in it without disturbing the VM's initial UTS namespace.

An ipc ns for the sandbox should always be created, since the containers always join this ns. We could use a new ipc ns for each container, but in most cases they share the sandbox init process's ipc ns.

So the sandbox init process should always be created in its own net/pid/uts/ipc namespaces, and later containers should join them (sketched below). To differ from hyperstart, we can later add a config option telling a container not to join them in some cases.

Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
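A minimal sketch of the idea, assuming a Go agent; the helper names, the placeholder `sleep` init, and the use of golang.org/x/sys/unix are illustrative, not the agent's actual API:

```go
package sandbox

import (
	"fmt"
	"os"
	"os/exec"
	"runtime"
	"syscall"

	"golang.org/x/sys/unix"
)

// startSandboxInit spawns a long-lived sandbox init process inside
// fresh net, pid, uts and ipc namespaces, so later containers can
// join those namespaces instead of the VM's initial ones.
func startSandboxInit() (*exec.Cmd, error) {
	// Placeholder: the real agent would exec its own init/reaper binary.
	cmd := exec.Command("sleep", "infinity")
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWNET | syscall.CLONE_NEWPID |
			syscall.CLONE_NEWUTS | syscall.CLONE_NEWIPC,
	}
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}

// joinNamespace moves the calling thread into the sandbox init
// process's namespace of the given type ("net", "uts", "ipc", "pid").
// Note: setns on a pid ns only affects children of the caller, and the
// goroutine must stay pinned to its OS thread for setns to be safe.
func joinNamespace(initPid int, nsType string, nsFlag int) error {
	runtime.LockOSThread()
	f, err := os.Open(fmt.Sprintf("/proc/%d/ns/%s", initPid, nsType))
	if err != nil {
		return err
	}
	defer f.Close()
	return unix.Setns(int(f.Fd()), nsFlag)
}
```

A container setup path could then call, e.g., `joinNamespace(initPid, "ipc", unix.CLONE_NEWIPC)` before starting its workload.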

sameo commented Dec 12, 2017

> So the sandbox init process should always be created in its own net/pid/uts/ipc namespaces, and later containers should join them. To differ from hyperstart, we can later add a config option telling a container not to join them in some cases.

I have 2 concerns with this:

1. The current k8s plan is to support three types of sandbox pid namespaces: host, shared, or private. But here we introduce a fourth one, where some containers would share the sandbox namespace while others would not. There are no use cases for this.

2. There is currently no way to specify whether a container wants to use the shared pid ns or not, so we'd always be using the shared pid ns.

laijs (Contributor, Author) commented Dec 12, 2017

@sameo I agree with you; this represents the shared mode.
We can add another option for private mode (or does sandbox_pidns = false represent this mode?).

Even if the pid namespace is private, the sandbox init process should still be created, regardless of whether the containers join it or not.

sameo commented Dec 12, 2017

> We can add another option for private mode (or does sandbox_pidns = false represent this mode?).

I think it would be cleaner to change the pod's pid_ns boolean to a string that cannot be empty and must be either private or shared.

> Even if the pid namespace is private, the sandbox init process should still be created, regardless of whether the containers join it or not.

Yes, I agree with that. We should document it as part of the proto file.
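
A minimal Go sketch of the validation that a mandatory string implies; the mode names match the discussion, but the constants and function are hypothetical, not taken from the agent's proto:

```go
package sandbox

import (
	"errors"
	"fmt"
)

// Hypothetical pid ns modes for CreateSandbox, mirroring the proposal
// to replace the sandbox_pidns boolean with a non-empty string.
const (
	pidNsModeShared  = "shared"  // containers join the sandbox init pid ns
	pidNsModePrivate = "private" // each container gets its own pid ns
)

// validatePidNsMode rejects an unset or unknown value, forcing the
// caller to choose explicitly, which a boolean default cannot do.
// Either way, the sandbox init process is still created.
func validatePidNsMode(mode string) error {
	switch mode {
	case pidNsModeShared, pidNsModePrivate:
		return nil
	case "":
		return errors.New(`pidns mode must be "shared" or "private"`)
	default:
		return fmt.Errorf("unknown pidns mode %q", mode)
	}
}
```

The empty-string case is exactly the point above: unlike a boolean, a string field forces the caller to select a mode explicitly.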

sboeuf commented Dec 12, 2017

@laijs @sameo This whole shared pid ns question comes from the k8s concept of a pod whose containers may or may not share the same namespaces. First, if k8s creates a pod with this flag, it expects all of its containers to share those namespaces; there is therefore no need to get this information from the container, and it's more appropriate to get it from the pod, IMO.
Isn't the discussion about shared vs private exactly the same as having a boolean, or am I missing something?
Also, it is much more convenient for us to handle this when creating the sandbox: we can reexec the agent and run that reexec'd process inside the expected namespaces (see the sketch below). Likewise, we know we can kill this process when destroying the sandbox.
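
A rough sketch of that reexec-and-kill lifecycle, assuming the agent can re-run its own binary via /proc/self/exe; the `sandbox-init` subcommand and both function names are hypothetical:

```go
package sandbox

import (
	"os/exec"
	"syscall"
)

// reexecSandboxInit re-runs the agent's own binary as the sandbox init
// process inside fresh namespaces. Keeping the *exec.Cmd around lets
// sandbox teardown simply kill this one process.
func reexecSandboxInit() (*exec.Cmd, error) {
	// "sandbox-init" would be handled early in main() as a do-nothing loop.
	cmd := exec.Command("/proc/self/exe", "sandbox-init")
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWNET | syscall.CLONE_NEWPID |
			syscall.CLONE_NEWUTS | syscall.CLONE_NEWIPC,
	}
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}

// destroySandbox kills the reexec'd init; because it is pid 1 of the
// sandbox pid ns, its death takes any remaining container processes
// in that namespace down with it.
func destroySandbox(initCmd *exec.Cmd) error {
	if err := initCmd.Process.Kill(); err != nil {
		return err
	}
	// Reap the child; a "signal: killed" error is expected here.
	initCmd.Wait()
	return nil
}
```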

sameo commented Dec 12, 2017

@sboeuf

> Isn't the discussion about shared vs private exactly the same as having a boolean, or am I missing something?

Almost, yes. With a boolean, the default value is false, so by default we wouldn't share the pid ns. A string has no meaningful default value, and you can even force the caller to select one.

sboeuf commented Dec 12, 2017

@sameo OK, then I consider this a detail in this conversation, and I am fine with either way. But regarding the pidns flag, I would really prefer to keep it as part of CreateSandbox().

sameo commented Dec 12, 2017

@sboeuf Yes, I fully agree. Being shared by default and having containers opt out of it is something that can't be expressed by the CRI, AFAIU.

laijs (Contributor, Author) commented Dec 13, 2017

OK, we agree to keep sandbox_pidns as a knob telling the agent whether all the containers should join the sandbox init process's pid ns or not.

I will close this PR and track the remaining requirement in #59.

laijs closed this Dec 13, 2017
sboeuf commented Dec 13, 2017

Thank you @laijs
