-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not use regexp #3460
Do not use regexp #3460
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,7 +5,6 @@ import ( | |
"errors" | ||
"fmt" | ||
"os" | ||
"regexp" | ||
"runtime/debug" | ||
"strconv" | ||
|
||
|
@@ -27,8 +26,6 @@ const ( | |
execFifoFilename = "exec.fifo" | ||
) | ||
|
||
var idRegex = regexp.MustCompile(`^[\w+-\.]+$`) | ||
|
||
// Create creates a new container with the given id inside a given state | ||
// directory (root), and returns a Container object. | ||
// | ||
|
@@ -260,8 +257,47 @@ func loadState(root string) (*State, error) { | |
return state, nil | ||
} | ||
|
||
// validateID checks if the supplied container ID is valid, returning | ||
// the ErrInvalidID in case it is not. | ||
// | ||
// The format of valid ID was never formally defined, instead the code | ||
// was modified to allow or disallow specific characters. | ||
// | ||
// Currently, a valid ID is a non-empty string consisting only of | ||
// the following characters: | ||
// - uppercase (A-Z) and lowercase (a-z) Latin letters; | ||
// - digits (0-9); | ||
// - underscore (_); | ||
// - plus sign (+); | ||
// - minus sign (-); | ||
// - period (.). | ||
// | ||
// In addition, IDs that can't be used to represent a file name | ||
// (such as . or ..) are rejected. | ||
|
||
func validateID(id string) error { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Perhaps would be good to (despite it not being exported) add a GoDoc to describe accepted formats, and outline some of it. Looking at this PR made me go down a bit of a trip in history; let me post below, but I should probably post elsewhere I noticed the underscore being added, which was not explicitly included in the So, I went on a short trip down history;
So, from the above;
Some what unfortunately, the runtime-spec does NOT define a format (or restrictions) in any way, which means that currently runc is more restrictive than the OCI runtime spec (overall this is fine). I found multiple attempts to have this included;
To summarize my thoughts;
proposed PATCH for formalizing the format from 2015From 2ec34c62b019431c0edebead50b56ebafe70a67c Mon Sep 17 00:00:00 2001
From: "W. Trevor King" <wking@tremily.us>
Date: Mon, 10 Aug 2015 12:07:20 -0700
Subject: [PATCH] runtime: Document container ID charset and uniqueness domain
Allow the runtime to use it's own scheme, but let the caller use UUIDs
if they want. Jonathan asked for clarification as part of #87, but
didn't suggest a particular approach [1]. When we discussed it in the
2015-08-26 meeting [2], the consensus was to just allow everything.
With container IDs like 'a/b/c' leading to state entries like
'/var/oci/containers/a/b/c/state.json'. But that could get ugly with
container IDs that contain '../' etc. And perhaps there are some
filesystems out there that cannot represent non-ASCII characters
(actually, I'm not even sure what charset our JSON is in ;). I'd
rather pick this minimal charset which can handle UUIDs, and make life
easy for runtime implementers and safe for bundle consumers at a
slight cost of flexibility for bundle-authors.
There was some confusion on the list about what "ASCII letters" meant
[3], so I've explicitly listed the allowed character ranges. Here's a
Python 3 script that shows the associated Unicode logic:
import unicodedata
# http://www.unicode.org/reports/tr44/tr44-4.html#GC_Values_Table
category = {
'Ll': 'lowercase letter',
'Lu': 'uppercase letter',
'Nd': 'decimal number',
'Pd': 'dash punctuation',
}
for i in range(1<<7):
char = chr(i)
abbr = unicodedata.category(char)
if abbr[0] in ['L', 'N'] or abbr == 'Pd':
cat = category[abbr]
print('{:02x} {} {}'.format(i, char, cat))
[1]: https://github.com/opencontainers/specs/pull/87#discussion_r35828046
[2]: https://github.com/opencontainers/specs/wiki/Meeting-Minutes-2015-08-26
[3]: https://groups.google.com/a/opencontainers.org/d/msg/dev/P9gZBYhiqDE/-ptpOcQ5FwAJ
Message-Id: <7ec9cff6-c1a6-4beb-82de-16eb412bf2f8@opencontainers.org>
Reported-by: Jonathan Boulle <jonathanboulle@gmail.com>
Signed-off-by: W. Trevor King <wking@tremily.us>
---
runtime.md | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/runtime.md b/runtime.md
index be2045b58..c9da6af9a 100644
--- a/runtime.md
+++ b/runtime.md
@@ -11,6 +11,9 @@ By providing a default location that container state is stored external applicat
* **version** (string) Version of the OCI specification used when creating the container.
* **id** (string) ID is the container's ID.
+ Only ASCII letters, numbers, and hyphens are valid (a–z, A–Z, 0–9, and ‘-’).
+ This value must be unique for a given host, but need not be universally unique.
+ Runtimes must allow the caller to set this ID, so that callers may choose, for example, to use [UUIDs][uuid] for universal uniqueness.
* **pid** (int) Pid is the ID of the main process within the container.
* **root** (string) Root is the path to the container's bundle directory.
@@ -94,3 +97,5 @@ If a hook returns a non-zero exit code, then an error is logged and the remainin
`path` is required for a hook.
`args` and `env` are optional.
+
+[uuid]: https://tools.ietf.org/html/rfc4122 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Most interestingly, as pointed out by @mrunalp below, this commit also implicitly adds comma ( Now we need to decide whether to keep supporting |
||
if !idRegex.MatchString(id) || string(os.PathSeparator)+id != utils.CleanPath(string(os.PathSeparator)+id) { | ||
if len(id) < 1 { | ||
return ErrInvalidID | ||
} | ||
|
||
// Allowed characters: 0-9 A-Z a-z _ + - . | ||
for i := 0; i < len(id); i++ { | ||
c := id[i] | ||
switch { | ||
case c >= 'a' && c <= 'z': | ||
case c >= 'A' && c <= 'Z': | ||
case c >= '0' && c <= '9': | ||
case c == '_': | ||
case c == '+': | ||
case c == '-': | ||
case c == '.': | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The current regex also includes the comma :/ There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It looks like we missed It should have been There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh wow. I guess that the best way is to document that we're no longer allowing The alternative is to add |
||
default: | ||
return ErrInvalidID | ||
} | ||
|
||
} | ||
|
||
if string(os.PathSeparator)+id != utils.CleanPath(string(os.PathSeparator)+id) { | ||
return ErrInvalidID | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OH! Actually, this is not correct; this should be
strings.TrimPrefix
.strings.TrimLeft
takes a "cut set" of characters, so currently"vvvvvv""v"v"v""v245.4-1.fc32"
would be accepted;https://go.dev/play/p/-Z4D9U4-oYM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is correct since it allows us to Trim either
v
or"v
(i.e. with or without the quote).Sure, it will also cut
v"
orvvvv"""vvvv
but we do not expect such input here (the input expectations are documented).Within the realm of expected input,
work the same as
but is somewhat more compact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@thaJeztah PTAL ^^