Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[enh] Make max_accumulations configurable and add remove method for meters. #2747

Closed
wants to merge 2 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
163 changes: 148 additions & 15 deletions specification/metrics/api.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ linkTitle: API
+ [Instrument naming rule](#instrument-naming-rule)
+ [Instrument unit](#instrument-unit)
+ [Instrument description](#instrument-description)
+ [Instrument max_accumulations](#instrument-max_accumulations)
+ [Synchronous and Asynchronous instruments](#synchronous-and-asynchronous-instruments)
+ [Synchronous Instrument API](#synchronous-instrument-api)
+ [Asynchronous Instrument API](#asynchronous-instrument-api)
Expand Down Expand Up @@ -300,6 +301,13 @@ instrument. It MUST be treated as an opaque string from the API and SDK.
* It MUST support at least 1023 characters. [OpenTelemetry
API](../overview.md#api) authors MAY decide if they want to support more.

#### Instrument max_accumulations

The `max_accumulations` is an optional free-form integer provided by the author of the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is "accumulation"? This seems like an implementation detail from Java?

Copy link
Author

@tjiuming tjiuming Aug 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like Counter.Child in Prometheus. Currently, OTEL SDK limits the maximum number of Child to 2000, it should be configurable or Unlimited

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dont see anything like "Counter.Child" in OTel specs. Most likely this is something very java specific.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, in Otel specs, the number of child in unlimited? accumulation's limits is just a java spec?

how about remove method? should it be added to Otel spec?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cijothomas sorry, I've read the open-telemetry/opentelemetry-java#4647 's comments carefully, max_accumulations is truly a java spec, please ignore about it.

besides, I wonder that if remove method should be added to Otel spec?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accumulation is a term internal to java's implementation. The max accumulations refers to the number of distinct attribute sets / timeseries that are allowed in memory for a particular instrument at any time. If exceeded, we log a warning and drop the measurement.

The topic of metrics SDK memory limits was punted on for the initial stable release of the metrics spec, and in java (and I believe dotnet) we opted for defensive approach with a fixed limit until the spec clears things up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying! Yes .NET also uses its own mechanism, called SetMaxMetricPointsPerMetricStream.
https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/docs/metrics/customizing-the-sdk#changing-maximum-metricpoints-per-metricstream

Would be good to have it in spec. (allowing sufficient flexibility so that we wont break existing API in .NET)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jack-berg @cijothomas If I understand you correctly, it's good to add max_accumulations and remove to spec right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont know if there is a need to have remove.
I definitely like spec to have the equivalent of max_accumulations. (need a diff. name, as accumulation is java specific.) .NET named it SetMaxMetricPointsperMetricStream

instrument, default value is 2000.

* It MUST greater than 0.

Instruments are categorized on whether they are synchronous or
asynchronous:

Expand Down Expand Up @@ -333,6 +341,8 @@ The API to construct synchronous instruments MUST accept the following parameter
rule](#instrument-unit).
* An optional `description`, following the [instrument description
rule](#instrument-description).
* An optional `max_accumulations`, following the [Instrument max_accumulations
rule](#instrument-max_accumulations).

#### Asynchronous Instrument API

Expand All @@ -351,6 +361,8 @@ The API to construct asynchronous instruments MUST accept the following paramete
rule](#instrument-description).
* Zero or more `callback` functions, responsible for reporting
[Measurement](#measurement) values of the created instrument.
* An optional `max_accumulations`, following the [Instrument max_accumulations
rule](#instrument-max_accumulations).

The API MUST support creation of asynchronous instruments by passing
zero or more `callback` functions to be permanently registered to the
Expand Down Expand Up @@ -446,21 +458,21 @@ might consider:
```python
# Python

exception_counter = meter.create_counter(name="exceptions", description="number of exceptions caught", value_type=int)
exception_counter = meter.create_counter(name="exceptions", description="number of exceptions caught", value_type=int, max_accumulations=1000)
```

```csharp
// C#

var counterExceptions = meter.CreateCounter<UInt64>("exceptions", description="number of exceptions caught");
var counterExceptions = meter.CreateCounter<UInt64>("exceptions", description="number of exceptions caught", max_accumulations=1000);

readonly struct PowerConsumption
{
[HighCardinality]
string customer;
};

var counterPowerUsed = meter.CreateCounter<double, PowerConsumption>("power_consumption", unit="kWh");
var counterPowerUsed = meter.CreateCounter<double, PowerConsumption>("power_consumption", unit="kWh", max_accumulations=1000);
```

#### Counter operations
Expand Down Expand Up @@ -504,6 +516,44 @@ counterPowerUsed.Add(13.5, new PowerConsumption { customer = "Tom" });
counterPowerUsed.Add(200, new PowerConsumption { customer = "Jerry" }, ("is_green_energy", true));
```

##### Remove

Remove the accumulation with the specified attributes.

This API SHOULD NOT return a value (it MAY return a dummy value if required by
certain programming languages or systems, for example `null`, `undefined`).

Required parameters:

* [attributes](../common/README.md#attribute).

The [OpenTelemetry API](../overview.md#api) authors MAY decide to allow flexible
[attributes](../common/README.md#attribute) to be passed in as arguments. If
the attribute names and types are provided during the [counter
creation](#counter-creation), the [OpenTelemetry API](../overview.md#api)
authors MAY allow attribute values to be passed in using a more efficient way
(e.g. strong typed struct allocated on the callstack, tuple). The API MUST allow
callers to provide flexible attributes at invocation time rather than having to
register all the possible attribute names during the instrument creation. Here
are some examples that [OpenTelemetry API](../overview.md#api) authors might
consider:

```python
# Python

exception_counter.remove({"exception_type": "IOError", "handled_by_user": True})
exception_counter.remove(exception_type="IOError", handled_by_user=True)
```

```csharp
// C#

counterExceptions.Remove(("exception_type", "FileLoadException"), ("handled_by_user", true));

counterPowerUsed.Remove(new PowerConsumption { customer = "Tom" });
counterPowerUsed.Remove(new PowerConsumption { customer = "Jerry" }, ("is_green_energy", true));
```

### Asynchronous Counter

Asynchronous Counter is an [asynchronous Instrument](#asynchronous-instrument-api)
Expand Down Expand Up @@ -591,7 +641,7 @@ def pf_callback(result):
result.Observe(37741921, ("pid", 4), ("bitness", 64))
result.Observe(10465, ("pid", 880), ("bitness", 32))

meter.create_observable_counter(name="PF", description="process page faults", pf_callback)
meter.create_observable_counter(name="PF", description="process page faults", max_accumulations=1000, pf_callback)
```

```csharp
Expand All @@ -606,7 +656,7 @@ interface IAtomicClock

IAtomicClock clock = AtomicClock.Connect();

meter.CreateObservableCounter<UInt64>("caesium_oscillates", () => clock.GetCaesiumOscillates());
meter.CreateObservableCounter<UInt64>("caesium_oscillates", 1000, () => clock.GetCaesiumOscillates());
```

#### Asynchronous Counter operations
Expand All @@ -627,14 +677,17 @@ class Device:

def __init__(self, meter, x):
self.x = x
counter = meter.create_observable_counter(name="usage", description="count of items used")
self.counter = meter.create_observable_counter(name="usage", description="count of items used", max_accumulations=1000)
self.cb = counter.register_callback(self.counter_callback)

def counter_callback(self, result):
result.Observe(self.read_counter(), {'x', self.x})

def read_counter(self):
return 100 # ...

def remove(self):
self.counter.remove({'x', self.x})

def stop(self):
self.cb.unregister()
Expand Down Expand Up @@ -671,7 +724,8 @@ http_server_duration = meter.create_histogram(
name="http.server.duration",
description="measures the duration of the inbound HTTP request",
unit="milliseconds",
value_type=float)
value_type=float,
max_accumulations=1000)
```

```csharp
Expand All @@ -680,7 +734,8 @@ http_server_duration = meter.create_histogram(
var httpServerDuration = meter.CreateHistogram<double>(
"http.server.duration",
description: "measures the duration of the inbound HTTP request",
unit: "milliseconds"
unit: "milliseconds",
maxAccumulations=1000
);
```

Expand Down Expand Up @@ -719,6 +774,38 @@ httpServerDuration.Record(50, ("http.method", "POST"), ("http.scheme", "https"))
httpServerDuration.Record(100, new HttpRequestAttributes { method = "GET", scheme = "http" });
```

##### Remove

Remove the accumulation with the specified attributes.

This API SHOULD NOT return a value (it MAY return a dummy value if required by
certain programming languages or systems, for example `null`, `undefined`).

Parameters:

* [attributes](../common/README.md#attribute).

[OpenTelemetry API](../overview.md#api) authors MAY decide to allow flexible
[attributes](../common/README.md#attribute) to be passed in as individual
arguments. [OpenTelemetry API](../overview.md#api) authors MAY allow attribute
values to be passed in using a more efficient way (e.g. strong typed struct
allocated on the callstack, tuple). Here are some examples that [OpenTelemetry
API](../overview.md#api) authors might consider:

```python
# Python

http_server_duration.Remove({"http.method": "POST", "http.scheme": "https"})
http_server_duration.Remove(http_method="GET", http_scheme="http")
```

```csharp
// C#

httpServerDuration.Remove(("http.method", "POST"), ("http.scheme", "https"));
httpServerDuration.Remove(new HttpRequestAttributes { method = "GET", scheme = "http" });
```

### Asynchronous Gauge

Asynchronous Gauge is an [asynchronous Instrument](#asynchronous-instrument-api)
Expand Down Expand Up @@ -772,6 +859,7 @@ def cpu_frequency_callback():
meter.create_observable_gauge(
name="cpu.frequency",
description="the real-time CPU clock speed",
max_accumulations=1000,
callback=cpu_frequency_callback,
unit="GHz",
value_type=float)
Expand All @@ -790,6 +878,7 @@ def cpu_frequency_callback(result):
meter.create_observable_gauge(
name="cpu.frequency",
description="the real-time CPU clock speed",
max_accumulations=1000,
callback=cpu_frequency_callback,
unit="GHz",
value_type=float)
Expand All @@ -800,7 +889,7 @@ meter.create_observable_gauge(

// A simple scenario where only one value is reported

meter.CreateObservableGauge<double>("temperature", () => sensor.GetTemperature());
meter.CreateObservableGauge<double>("temperature", 1000, () => sensor.GetTemperature());
```

#### Asynchronous Gauge operations
Expand All @@ -821,14 +910,17 @@ class Device:

def __init__(self, meter, x):
self.x = x
gauge = meter.create_observable_gauge(name="pressure", description="force/area")
sefl.gauge = meter.create_observable_gauge(name="pressure", description="force/area", max_accumulations=1000)
self.cb = gauge.register_callback(self.gauge_callback)

def gauge_callback(self, result):
result.Observe(self.read_gauge(), {'x', self.x})

def read_gauge(self):
return 100 # ...

def remove(self):
self.gauge.remove({'x', self.x})

def stop(self):
self.cb.unregister()
Expand Down Expand Up @@ -872,7 +964,8 @@ properties as they are added and removed.

items_counter = meter.create_up_down_counter(
name="store.inventory",
description="the number of the items available")
description="the number of the items available",
max_accumulations=1000)

def restock_item(color, material):
inventory.add_item(color=color, material=material)
Expand All @@ -884,6 +977,9 @@ def sell_item(color, material):
if succeeded:
items_counter.add(-1, {"color": color, "material": material})
return succeeded

def remove(color, material):
item_counter.remove({"color": color, "material": material})
```

#### UpDownCounter creation
Expand All @@ -906,7 +1002,8 @@ might consider:
customers_in_store = meter.create_up_down_counter(
name="grocery.customers",
description="measures the current customers in the grocery store",
value_type=int)
value_type=int,
max_accumulations=1000)
```

```csharp
Expand All @@ -915,6 +1012,7 @@ customers_in_store = meter.create_up_down_counter(
var customersInStore = meter.CreateUpDownCounter<int>(
"grocery.customers",
description: "measures the current customers in the grocery store",
max_accumulations=1000
);
```

Expand Down Expand Up @@ -951,6 +1049,36 @@ customersInStore.Add(1, ("account.type", "commercial"));
customersInStore.Add(-1, new Account { Type = "residential" });
```

##### Remove

Remove the accumulation with the specified attributes.

This API SHOULD NOT return a value (it MAY return a dummy value if required by
certain programming languages or systems, for example `null`, `undefined`).

Parameters:

* [attributes](../common/README.md#attribute).

[OpenTelemetry API](../overview.md#api) authors MAY decide to allow flexible
[attributes](../common/README.md#attribute) to be passed in as individual
arguments. [OpenTelemetry API](../overview.md#api) authors MAY allow attribute
values to be passed in using a more efficient way (e.g. strong typed struct
allocated on the callstack, tuple). Here are some examples that [OpenTelemetry
API](../overview.md#api) authors might consider:

```python
# Python
customers_in_store.remove({"account.type": "commercial"})
customers_in_store.remove(account_type="residential")
```

```csharp
// C#
customersInStore.Remove(("account.type", "commercial"));
customersInStore.Remove(new Account { Type = "residential" });
```

### Asynchronous UpDownCounter

Asynchronous UpDownCounter is an [asynchronous
Expand Down Expand Up @@ -1010,6 +1138,7 @@ def ws_callback():
meter.create_observable_updowncounter(
name="process.workingset",
description="process working set",
max_accumulations=1000,
callback=ws_callback,
unit="kB",
value_type=int)
Expand All @@ -1027,6 +1156,7 @@ def ws_callback(result):
meter.create_observable_updowncounter(
name="process.workingset",
description="process working set",
max_accumulations=1000,
callback=ws_callback,
unit="kB",
value_type=int)
Expand All @@ -1037,7 +1167,7 @@ meter.create_observable_updowncounter(

// A simple scenario where only one value is reported

meter.CreateObservableUpDownCounter<UInt64>("memory.physical.free", () => WMI.Query("FreePhysicalMemory"));
meter.CreateObservableUpDownCounter<UInt64>("memory.physical.free", 1000, () => WMI.Query("FreePhysicalMemory"));
```

#### Asynchronous UpDownCounter operations
Expand All @@ -1058,15 +1188,18 @@ class Device:

def __init__(self, meter, x):
self.x = x
updowncounter = meter.create_observable_updowncounter(name="queue_size", description="items in process")
self.updowncounter = meter.create_observable_updowncounter(name="queue_size", description="items in process")
self.cb = updowncounter.register_callback(self.updowncounter_callback)

def updowncounter_callback(self, result):
result.Observe(self.read_updowncounter(), {'x', self.x})

def read_updowncounter(self):
return 100 # ...


def remove(self):
self.updowncounter.remove({'x', self.x})

def stop(self):
self.cb.unregister()
```
Expand Down