Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config-linux: Specify relationships for new namespaces #795

Closed

Conversation

wking
Copy link
Contributor

@wking wking commented May 9, 2017

Spun off from #767, since these are contentious. I still think we want to say something about these relationships.

We already have some of “runtime namespace” conditions (e.g. when a type is not listed in linux.namespaces[]), so runtimes should already have implementation-specific wording around what the runtime namespaces are (we don't explicitly make them implementation-defined, although we probably should). Anyhow, that's not a new concept added by this commit.

Seeded namespaces

For example, if you ask for a new uts namespace but do not set the optional hostname, having the seed defined means that the hostname in the container UTS namespace is well-defined (it will be whatever the hostname was in the runtime UTS namespace).

This is less of an issue for the mount namespace, because with root.path REQUIRED, there's no way to avoid clobbering whatever mounts you got from your seed (which makes not asking for a new mount namespace exciting ;).

Hierarchical namespaces

I think “I want this container to run in a new user/pid namespace that is a child of the runtime user/pid namespace” should be something that has a portable config expression. Otherwise it becomes very unclear what to put in the hostID field for (u|g)idMappings, because you don't know what namespace will be used to interpret the hostIDs.

Namespace ownership

This is another case where I think specified clarity is essential. A new network namespace will not be very useful if you don't know who owns it.

@wking wking force-pushed the rfc2119-namespace-relationships branch from 3ee4b06 to 26d592c Compare May 9, 2017 22:15
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request May 9, 2017
Previously we had no MUST-level runtime requirements for namespace
entries in valid configs.  This commit attempts to pin those down.

I think we want more wording about new namespace creation (what
namespace is the seed/parent?  Which user namespace owns a runtime
namespace?  For more background on hierarchical namespaces, see [1].
For more background on the owning user namespace idea, see [2,3,4]),
but that wording proved contentious [5,6], so I punted it to [7].

The "'path' not associated with a namespace of type 'type'" condition
ensures that runtimes don't blindly call setns(2) on the path without
setting nstype nonzero.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a7306ed8d94af729ecef8b6e37506a1c6fc14788
     nsfs: add ioctl to get a parent namespace, 2016-09-06
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6786741dbf99e44fb0c0ed85a37582b8a26f1c3b
     nsfs: add ioctl to get owning user namespace for ns file
     descriptor, 2016-09-06
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e5ff5ce6e20ee22511398bb31fb912466cf82a36
     nsfs: Add an ioctl() to return the namespace type, 2017-01-25
[4]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d95fa3c76a66b6d76b1e109ea505c55e66360f3c
     nsfs: Add an ioctl() to return owner UID of a userns, 2017-01-25
[5]: opencontainers#767 (comment)
[6]: opencontainers#767 (comment)
[7]: opencontainers#795

Signed-off-by: W. Trevor King <wking@tremily.us>
@wking wking force-pushed the rfc2119-namespace-relationships branch from 26d592c to 79ee80a Compare May 9, 2017 22:19
@wking wking force-pushed the rfc2119-namespace-relationships branch from 79ee80a to c9b6118 Compare May 10, 2017 22:27
@wking
Copy link
Contributor Author

wking commented May 10, 2017

This got kicked around some in today's meeting (although my notes are thin). My impression was that there is concern that we are documenting the kernel (which I'm trying to avoid) and that these requirements are somewhat tautological with the source of the unshare(2) or clone(2) being the “runtime namespace” by definition (which I agree with). It's been a while since I looked at the glossary entry, and now that I do I think:

  • The current glossary entry makes all namespaces sound hierarchical, when they aren't.
  • The current glossary entry has no RFC 2119 language for compliance testing.

I think we want to land something like the wording in this PR (as of c9b6118) with RFC 2119 teeth, and then update the glossary entry to be something like:

On Linux, a the namespace from which the container namespaces [link to config-linux.md#namespaces] are created.

Which embraces the overlap by trimming the incorrect glossary entry and referencing the RFC 2119 teeth here.

I don't think we should leave these relationships alone with an argument like “namespaces can be confusing, so let's not say obvious things like this”. If namespaces are confusing, I think that's a good reason to say obvious things, because “obvious” is in the eye of the beholder. And it's especially good if we can phrase them specifically enough for compliance testing, since that removes the eye-of-the-beholder ambiguity.

@tianon
Copy link
Member

tianon commented May 11, 2017

I feel like this is already implied (given that it's how the kernel itself works), and that there isn't a lot of value added by explicitly mentioning it or attempting to validate runtime behavior around it.

I'd vote to either close or defer discussion to post-1.0.

For more background on hierarchical namespaces, see [1].  For more
background on the owning user namespace idea, see [2,3,4]).

These were contentious [5,6], so they weren't part of the previous
commit.  I still think we want to say something about these
relationships.

We already have some of "runtime namespace" conditions (e.g. when a
type is not listed in linux.namespaces[]), so runtimes should already
have implementation-specific wording around what the runtime
namespaces are (we don't explicitly make them implementation-defined,
although we probably should). Anyhow, that's not a new concept added
by this commit.

# Seeded namespaces

For example, if you ask for a new uts namespace but do not set the
optional hostname, having the seed defined means that the hostname in
the container UTS namespace is well-defined (it will be whatever the
hostname was in the runtime UTS namespace).

This is less of an issue for the mount namespace, because with
root.path REQUIRED, there's no way to avoid clobbering whatever mounts
you got from your seed (which makes not asking for a new mount
namespace exciting ;).

# Hierarchical namespaces

I think "I want this container to run in a new user/pid namespace that
is a child of the runtime user/pid namespace" should be something that
has a portable config expression. Otherwise it becomes very unclear
what to put in the hostID field for (u|g)idMappings, because you don't
know what namespace will be used to interpret the hostIDs.

# Namespace ownership

This is another case where I think specified clarity is essential. A
new network namespace will not be very useful if you don't know who
owns it.

# Glossary changes

In review, Mrunal and others pointed out that the new config-linux
entries overlapped with the existing glossary entry [7].  I've
addressed that in this commit by trimming the glossary entries down to
fix them (namespaces are not all hierarchical and processes aren't
always in leaves).  I've also dropped the handwavy and incorrect
"children" sentence in favor of a link back to config-linux.md and the
more specific RFC 2119 language from this commit.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a7306ed8d94af729ecef8b6e37506a1c6fc14788
     nsfs: add ioctl to get a parent namespace, 2016-09-06
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6786741dbf99e44fb0c0ed85a37582b8a26f1c3b
     nsfs: add ioctl to get owning user namespace for ns file
     descriptor, 2016-09-06
[3]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e5ff5ce6e20ee22511398bb31fb912466cf82a36
     nsfs: Add an ioctl() to return the namespace type, 2017-01-25
[4]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d95fa3c76a66b6d76b1e109ea505c55e66360f3c
     nsfs: Add an ioctl() to return owner UID of a userns, 2017-01-25
[5]: opencontainers#767 (comment)
[6]: opencontainers#767 (comment)
[7]: http://ircbot.wl.linuxfoundation.org/meetings/opencontainers/2017/opencontainers.2017-05-10-21.03.log.html#l-115
     But I was talking so my log entries are sparse :/.

Signed-off-by: W. Trevor King <wking@tremily.us>
@wking wking force-pushed the rfc2119-namespace-relationships branch from c9b6118 to cf8306e Compare May 11, 2017 07:23
@wking
Copy link
Contributor Author

wking commented May 11, 2017 via email

On Linux, a leaf in the [namespace][namespaces.7] hierarchy from which the [runtime](#runtime) process is executed.
New container namespaces will be created as children of the runtime namespaces.

On Linux, the namespaces from which new [container namespaces](config-linux.md#namespaces) are created.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not link to the glossary container namespace above? Looks more suitable to me.

@hqhq
Copy link
Contributor

hqhq commented May 24, 2017

I feel like this is already implied (given that it's how the kernel itself works), and that there isn't a lot of value added by explicitly mentioning it or attempting to validate runtime behavior around it.
I'd vote to either close or defer discussion to post-1.0.

Agreed.

I'd recommend erring on the side of adding three
lines to config-linux.md and trimming a line from the glossary

I'm +1 on this.

@@ -40,6 +40,9 @@ The following parameters can be specified to setup namespaces:
The runtime MUST [generate an error](runtime.md#errors) if `path` is not associated with a namespace of type `type`.

If `path` is not specified, the runtime MUST create a new [container namespace](glossary.md#container-namespace) of type `type`.
For hierarchical namespaces (e.g. `pid`, `user`), the new container namespace MUST be a child of the [runtime namespace](glossary.md#runtime-namespace) of that type.
For seeded namespaces (e.g. `mount`, `uts`), the new container namespace MUST be seeded by the runtime namespace of that type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NACK on this language.

@vbatts
Copy link
Member

vbatts commented May 24, 2017

A couple of the clarifying sentences are okay, but the seeding and hierarchical namespace language is too prescriptive and not okay to add. closing. add prs for the simpler sentences if still desired

@vbatts vbatts closed this May 24, 2017
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request May 24, 2017
…tion

Namespaces are not all hierarchical and processes aren't always in
leaves.  Also punt to config-linux.md for details about namespace
creation, although currently that section doesn't talk much about how
the runtime namespaces relate to new container namespaces [1].

Also mention resource access, because runtime namespaces play a role
even if no new container namespaces are created.  I've used resources
that currently explicitly mention runtime namespaces as examples,
although I think more resources (e.g. root.path and mounts[].source
[2,3]) deserve wording about that as well and would be better examples
if they'd already landed such wording.

[1]: opencontainers#795 (comment)
[2]: opencontainers#735 (comment)
[3]: opencontainers@604205e#diff-c9c91c29b41257aea3a3403cc606ad99R65

Signed-off-by: W. Trevor King <wking@tremily.us>
@wking
Copy link
Contributor Author

wking commented May 24, 2017 via email

@cyphar
Copy link
Member

cyphar commented May 24, 2017

Do you want to allow runtimes to use the D', D''',
D''', etc., positions? If so, can you explain why? If not, why allow
them?

If the runtime can implement all of the relevant lifecycle stuff with those semantics, they can go for it. I don't understand why it is a necessity that ps aux in the caller's namespace must container the container PIDs. That's not required anywhere and would actually break any daemon-based container runtimes (at least it would make their jobs harder for no obviously beneficial reason).

Not to mention there might be some very odd usecases where you might actually want the functionality of joining namespaces you couldn't have created.

Without having something as awful as cgroupsPath for namespaces, mandating any rules on this that aren't extensible on this topic will break implementations and make people very unhappy. Please don't make a PR for nsPath, it will be a waste of time to review (such a path doesn't have any kernel parallel and not every hierarchy is a VFS).

@wking
Copy link
Contributor Author

wking commented May 24, 2017 via email

wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request May 31, 2017
…tion

Namespaces are not all hierarchical and processes aren't always in
leaves.  Also punt to config-linux.md for details about namespace
creation, although currently that section doesn't talk much about how
the runtime namespaces relate to new container namespaces [1].

Also mention resource access, because runtime namespaces play a role
even if no new container namespaces are created.  I've used resources
that currently explicitly mention runtime namespaces as examples,
although I think more resources (e.g. root.path and mounts[].source
[2,3]) deserve wording about that as well and would be better examples
if they'd already landed such wording.

[1]: opencontainers#795 (comment)
[2]: opencontainers#735 (comment)
[3]: opencontainers@604205e#diff-c9c91c29b41257aea3a3403cc606ad99R65

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/opencontainer-runtime-spec that referenced this pull request Jun 1, 2017
…tion

Namespaces are not all hierarchical and processes aren't always in
leaves.  Also punt to config-linux.md for details about namespace
creation, although currently that section doesn't talk much about how
the runtime namespaces relate to new container namespaces [1].

Also mention resource access, because runtime namespaces play a role
even if no new container namespaces are created.  I've used resources
that currently explicitly mention runtime namespaces as examples,
although I think more resources (e.g. root.path and mounts[].source
[2,3]) deserve wording about that as well and would be better examples
if they'd already landed such wording.

Examples of resources retrieved from this namespace include
linux.namespaces[].path and the resctrl psuedo-filesystem used for
`linux.intelRdt`, but Mrunal and Michael didn't want me to include the
examples in the glossary entry (probably because they could go stale).

[1]: opencontainers#795 (comment)
[2]: opencontainers#735 (comment)
[3]: opencontainers@604205e#diff-c9c91c29b41257aea3a3403cc606ad99R65

Signed-off-by: W. Trevor King <wking@tremily.us>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants