diff --git a/docs/rfcs/env-vars.md b/docs/rfcs/env-vars.md new file mode 100644 index 00000000000..1bb8e1b4a82 --- /dev/null +++ b/docs/rfcs/env-vars.md @@ -0,0 +1,216 @@ +# A more sensible Collector environment variable resolution + +## Overview + +The OpenTelemetry Collector supports three different syntaxes for +environment variable resolution which differ in their syntax, semantics +and allowed variable names. Before we stabilize confmap, we need to +address several issues related to environment variables. This document +describes: the current (as of v0.96.0) behavior of the Collector, the +goals that an environment variable resolution should aim for, existing +deviations from these goals and the desired behavior after making some +changes. + +CLI environment variable resolution has a single syntax (`--config env:ENV`) +and it is considered out of scope for this document, focusing +instead on expansion within the Collector configuration. + +How to get from the current to desired behavior is also considered out +of scope and will be discussed on individual PRs. It will likely involve +one or multiple feature gates, warnings and transition periods. + +## Goals of an expansion system + +The following are considered goals of the expansion system: + +1. ***Expansion should happen only when the user expects it***. We + should aim to expand when the user expects it and keep the original + value when we don't (e.g. because the syntax is used for something + different). +2. ***Expansion should have predictable behavior***. +3. ***Multiple expansion methods, if present, should have similar behavior.*** + Switching from `${env:ENV}` to `${ENV}` or vice versa + should not lead to (too many?) surprises. +4. ***When the syntax overlaps, expansion should be aligned with*** + [***the expansion defined by the Configuration Working Group***](https://github.com/open-telemetry/opentelemetry-specification/blob/032213cedde54a2171dfbd234a371501a3537919/specification/configuration/file-configuration.md#environment-variable-substitution). + +## Current behavior + +The Collector supports three different syntaxes for environment variable +resolution: + +1. The *naked syntax*, `$ENV`. +2. The *braces syntax*, `${ENV}`. +3. The *env provider syntax*, `${env:ENV}`. + +These differ in the character set allowed for environment variable names +as well as the type of parsing they return. Escaping is supported in all +syntaxes by using two dollar signs. + +### Type casting rules + +A provider or converter takes a string and returns some sort of value +after potentially doing some parsing. This gets stored in a +confmap.Conf. When unmarshalling, we use mapstructure with +WeaklyTypedInput enabled, which does a lot of implicit casting. The +details of this type casting are complex and are outlined on issue +[#9532](https://github.com/open-telemetry/opentelemetry-collector/issues/9532). + +When using this notation in inline mode (e.g. +`http://endpoint/${env:PATH}`) we also do manual implicit type +casting with a similar approach to mapstructure. These are outlined +[here](https://github.com/open-telemetry/opentelemetry-collector/blob/fc4c13d3c2822bec39fa9d9658836d1a020c6844/confmap/expand.go#L124-L139). + +### Naked syntax + +The naked syntax is supported via the expand converter. It is +implemented using the [`os.Expand`](https://pkg.go.dev/os#Expand) stdlib +function. This syntax supports identifiers made up of: + +1. ASCII alphanumerics and the `_` character +2. Certain special characters if they appear alone typically used in + Bash: `*`, `#`, `$`, `@`, `!`, `?` and `-`. + +You can see supported identifiers in this example: +[`go.dev/play/p/YfxLtYbsL6j`](https://go.dev/play/p/YfxLtYbsL6j). + +The environment variable value is taken as-is and the type is always +string. + +### Braces syntax + +The braces syntax is supported via the expand converter. It is also +implemented using the os.Expand stdlib function. This syntax supports +any identifiers that don't contain `}`. Again, refer to the os.Expand +example to see how it works in practice: +[`go.dev/play/p/YfxLtYbsL6j`](https://go.dev/play/p/YfxLtYbsL6j). + +The environment variable value is taken as-is and the type is always +string. + +### `env` provider + +The `env` provider syntax is supported via (you guessed it) the `env` +provider. Its implementation is custom made. This syntax supports any +identifier that does not have a $ in it, to support recursive +resolution (e.g. `${env:${http://example.com}}` would get the +environment variable whose name is stored in the URL +`http://example.com`). + +The environment variable value is parsed by the yaml.v3 parser to an +any-typed variable. The yaml.v3 parser mostly follows the YAML v1.2 +specification with [*some exceptions*](https://github.com/go-yaml/yaml#compatibility). +You can see +how it works for some edge cases in this example: +[`go.dev/play/p/RtPmH8aZA1X`](https://go.dev/play/p/RtPmH8aZA1X). + +### Issues of current behavior + +#### Unintuitive behavior on unset environment variables + +When an environment variable is empty, all syntaxes return an empty +string with no warning given; this is frequently unexpected but can also +be used intentionally. This is especially unintuitive when the user did +not expect expansion to happen. Two examples where this is unexpected +are the following: + +1. **Opaque values such as passwords that contain `$`** (issue + [#8215](https://github.com/open-telemetry/opentelemetry-collector/issues/8215)). + If the $ is followed by an alphanumeric character or one of the + special characters, it's going to lead to false positives. +2. **Prometheus relabel config** (issue + [`contrib#9984`](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/9984)). + Prometheus uses `${1}` in some of its configuration values. We + resolve this to the value of the environment variable with name + '`1`'. +3. **Other uses of $** (issue + [`contrib#11846`](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/11846)). + If a product requires the use of `$` in some field, we would most + likely interpret it as an environment variable. This is not + intuitive for users. + +#### Unexpected type casting + +When using the env syntax we parse its value as YAML. Even if you are +familiar with YAML, because of the implicit type casting rules and the +way we store intermediate values, we can get unintuitive results. + +The most clear example of this is issue +[*#8565*](https://github.com/open-telemetry/opentelemetry-collector/issues/8565): +When setting a variable to value `0123` and using it in a string-typed +field, it will end up as the string `"83"` (where as the user would +expect the string to be `0123`). + +#### We are less restrictive than the Configuration WG + +The Configuration WG defines an [*environment variable expansion feature +for SDK +configurations*](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/configuration/file-configuration.md#environment-variable-substitution). +This accepts only non empty alphanumeric + underscore identifiers +starting with alphabetic or underscore. If the Configuration WG were to +expand this in the future (e.g. to include other features present in +Bash-like syntax), we would not be able to expand our braces syntax to +support new features without breaking users. + +## Desired behavior + +*This section is written as if the changes were already implemented.* + +The Collector supports **two** different syntaxes for environment +variable resolution: + +1. The *braces syntax*, `${ENV}`. +2. The *env provider syntax*, `${env:ENV}`. + +These both have** the same character set and behavior**. They both use +the env provider under the hood. + +The naked syntax supported in BASH is not supported in the Collector. +Escaping is supported by using two dollar signs. Escaping is also +honored for unsupported identifiers like `${1}`. + +### Type casting rules + +The environment variable value is parsed by the yaml.v3 parser to an +any-typed variable and the original representation as a string is stored +for numeric types. The `yaml.v3` parser mostly follows the YAML v1.2 +specification with [*some +exceptions*](https://github.com/go-yaml/yaml#compatibility). You can see +how it works for some edge cases in this example: +[*https://go.dev/play/p/RtPmH8aZA1X*](https://go.dev/play/p/RtPmH8aZA1X). + +When unmarshalling, we use mapstructure with WeaklyTypedInput +**disabled**. We check via a hook an AsString method from confmap.Conf +and use its return value when it is valid and we are mapping to a string +field. This method has default casting rules for unambiguous scalar +types but may return the original representation depending on the +construction of confmap.Conf + +For using this notation in inline mode (e.g.`http://endpoint/${env:PATH}`), we +use the AsString method from confmap.Conf. + +### Character set + +An environment variable identifier must be a nonempty ASCII alphanumeric +or underscore starting with an alphabetic or underscore character. Its +maximum length is 200 characters. Both syntaxes support recursive +resolution. + +When an invalid identifier is found, a warning log is emitted, and the +string is kept as-is. + +### Comparison table with current behavior + +This is a comparison between the current and desired behavior for +loading a field with the braces syntax, `env` syntax. + +| Raw value | Field type | Current behavior, `${ENV}`, single field | Current behavior, `${env:ENV}` , single field | Desired behavior, entire field | Desired behavior, inline string field | +|-------------- |------------ |------------------------------------------ |------------------------------------------------ |-------------------------------- |--------------------------------------- | +| `123` | integer | 123 | 123 | 123 | n/a | +| `0123` | integer | 83 | 83 | 83 | n/a | +| `0123` | string | 0123 | 83 | 0123 (83?) | 0123 | +| `0xdeadbeef` | string | 0xdeadbeef | 3735928559 | 0xdeadbeef (3735928559?) | 0xdeadbeef | +| `"0123"` | string | "0123" | 0123 | 0123 | 0123 | +| `!!str 0123` | string | !!str 0123 | 0123 | 0123 | 0123 (!!str 0123) | +| `t` | boolean | true | true | Error: mapping string to bool | n/a | +| `23` | boolean | true | true | Error: mapping integer to bool | n/a |