Optional removal of metrics from Prometheus PushGateway on shutdown #14935

7 changes: 4 additions & 3 deletions docs/development/extensions-contrib/prometheus.md
@@ -29,8 +29,7 @@ To use this Apache Druid extension, [include](../../configuration/extensions.md#

This extension exposes [Druid metrics](https://druid.apache.org/docs/latest/operations/metrics.html) for collection by a Prometheus server (https://prometheus.io/).

The emitter is enabled by setting `druid.emitter=prometheus` in the [configs](https://druid.apache.org/docs/latest/configuration/index.html#enabling-metrics) or by including `prometheus` in the composing emitter list.

## Configuration

@@ -47,6 +46,8 @@ All the configuration parameters for the Prometheus emitter are under `druid.emi
| `druid.emitter.prometheus.pushGatewayAddress` | Pushgateway address. Required if using `pushgateway` strategy. | no | none |
| `druid.emitter.prometheus.flushPeriod` | Emit metrics to Pushgateway every `flushPeriod` seconds. Required if `pushgateway` strategy is used. | no | 15 |
| `druid.emitter.prometheus.extraLabels` | JSON key-value pairs for additional labels on all metrics. Keys (label names) must match the regex `[a-zA-Z_:][a-zA-Z0-9_:]*`. Example: `{"cluster_name": "druid_cluster1", "env": "staging"}`. | no | none |
| `druid.emitter.prometheus.pushGatewayDeleteOnShutdown` | Flag to delete metrics from Pushgateway on shutdown. Works only if `pushgateway` strategy is used. | no | false |
| `druid.emitter.prometheus.waitForShutdownDelay` | Time to wait, in milliseconds, before metrics are deleted from Pushgateway on shutdown (e.g. 60000). Works only if `pushgateway` strategy is used. Be aware that a task can terminate before the deletion is performed when the [Peon's `druid.indexer.task.gracefulShutdownTimeout` is used](https://druid.apache.org/docs/latest/configuration/#additional-peon-configuration). | no | none |

### Ports for colocated Druid processes

@@ -110,5 +111,5 @@ the service name. For example:
"druid/coordinator-segment/count" : { "dimensions" : ["dataSource"], "type" : "gauge" },
"druid/historical-segment/count" : { "dimensions" : ["dataSource", "tier", "priority"], "type" : "gauge" }
```

For most use cases, the default mapping is sufficient.
@@ -69,7 +69,13 @@ public PrometheusEmitter(PrometheusEmitterConfig config)
{
this.config = config;
this.strategy = config.getStrategy();
metrics = new Metrics(config.getNamespace(), config.getDimensionMapPath(), config.isAddHostAsLabel(), config.isAddServiceAsLabel(), config.getExtraLabels());
metrics = new Metrics(
config.getNamespace(),
config.getDimensionMapPath(),
config.isAddHostAsLabel(),
config.isAddServiceAsLabel(),
config.getExtraLabels()
);
}


@@ -164,7 +170,8 @@ private void emitMetric(ServiceMetricEvent metricEvent)
} else if (metric.getCollector() instanceof Gauge) {
((Gauge) metric.getCollector()).labels(labelValues).set(value.doubleValue());
} else if (metric.getCollector() instanceof Histogram) {
((Histogram) metric.getCollector()).labels(labelValues).observe(value.doubleValue() / metric.getConversionFactor());
((Histogram) metric.getCollector()).labels(labelValues)
.observe(value.doubleValue() / metric.getConversionFactor());
} else {
log.error("Unrecognized metric type [%s]", metric.getCollector().getClass());
}
@@ -202,11 +209,35 @@ public void close()
{
if (strategy.equals(PrometheusEmitterConfig.Strategy.exporter)) {
if (server != null) {
server.stop();
server.close();
}
} else {
exec.shutdownNow();
flush();

try {
if (config.getWaitForShutdownDelay() > 0) {
Thread.sleep(config.getWaitForShutdownDelay());
}
}
catch (InterruptedException e) {
log.error(e, "Interrupted while waiting for shutdown delay. Deleting metrics now.");
}
finally {
deletePushGatewayMetrics();
}
}
}

private void deletePushGatewayMetrics()
{
if (pushGateway != null && config.isPushGatewayDeleteOnShutdown()) {
try {
pushGateway.delete(config.getNamespace(), ImmutableMap.of(config.getNamespace(), identifier));
}
catch (IOException e) {
log.error(e, "Unable to delete prometheus metrics from pushGateway");
}
}
}
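
The shutdown path added to `close()` for the `pushgateway` strategy can be read as three steps: a final flush, an optional wait, and a delete that runs even if the wait is interrupted. The following is a minimal, self-contained sketch of that ordering, not the PR's code. `Gateway`, `RecordingGateway`, and `closePushgateway` are hypothetical names introduced for illustration, and the sketch assumes the delay is handed to `Thread.sleep` unchanged, so it is treated as milliseconds. Restoring the interrupt flag is an addition of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

final class ShutdownSketch
{
  /** Hypothetical stand-in for a Pushgateway client; not the Prometheus client API. */
  interface Gateway
  {
    void push();

    void delete();
  }

  /** Records calls in order so the shutdown sequence can be inspected. */
  static final class RecordingGateway implements Gateway
  {
    final List<String> calls = new ArrayList<>();

    @Override
    public void push()
    {
      calls.add("push");
    }

    @Override
    public void delete()
    {
      calls.add("delete");
    }
  }

  static void closePushgateway(Gateway gateway, boolean deleteOnShutdown, long waitMillis)
  {
    gateway.push(); // final flush before shutting down
    try {
      if (waitMillis > 0) {
        Thread.sleep(waitMillis); // assumption: the delay is in milliseconds
      }
    }
    catch (InterruptedException e) {
      // Sketch-only choice: keep the interrupt visible to the caller.
      Thread.currentThread().interrupt();
    }
    finally {
      // Runs whether the wait completed or was interrupted.
      if (deleteOnShutdown) {
        gateway.delete();
      }
    }
  }

  public static void main(String[] args)
  {
    RecordingGateway gateway = new RecordingGateway();
    closePushgateway(gateway, true, 10);
    System.out.println(gateway.calls); // [push, delete]
  }
}
```

Placing the delete in `finally` means an interrupted wait shortens the delay but does not skip the cleanup.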

@@ -70,6 +70,12 @@ public class PrometheusEmitterConfig
@JsonProperty
private final Map<String, String> extraLabels;

@JsonProperty
private final boolean pushGatewayDeleteOnShutdown;

@JsonProperty
private final int waitForShutdownDelay;

@JsonCreator
public PrometheusEmitterConfig(
@JsonProperty("strategy") @Nullable Strategy strategy,
@@ -80,7 +86,9 @@ public PrometheusEmitterConfig(
@JsonProperty("addHostAsLabel") boolean addHostAsLabel,
@JsonProperty("addServiceAsLabel") boolean addServiceAsLabel,
@JsonProperty("flushPeriod") Integer flushPeriod,
@JsonProperty("extraLabels") @Nullable Map<String, String> extraLabels
@JsonProperty("extraLabels") @Nullable Map<String, String> extraLabels,
@JsonProperty("pushGatewayDeleteOnShutdown") @Nullable Boolean pushGatewayDeleteOnShutdown,
@JsonProperty("waitForShutdownDelay") @Nullable Integer waitForShutdownDelay
)
{
this.strategy = strategy != null ? strategy : Strategy.exporter;
@@ -103,6 +111,8 @@ public PrometheusEmitterConfig(
this.addHostAsLabel = addHostAsLabel;
this.addServiceAsLabel = addServiceAsLabel;
this.extraLabels = extraLabels != null ? extraLabels : Collections.emptyMap();
this.pushGatewayDeleteOnShutdown = pushGatewayDeleteOnShutdown != null && pushGatewayDeleteOnShutdown;
this.waitForShutdownDelay = waitForShutdownDelay != null ? waitForShutdownDelay : 0;
// Validate label names early to prevent Prometheus exceptions later.
for (String key : this.extraLabels.keySet()) {
if (!PATTERN.matcher(key).matches()) {
@@ -165,6 +175,16 @@ public Map<String, String> getExtraLabels()
return extraLabels;
}

public boolean isPushGatewayDeleteOnShutdown()
{
return pushGatewayDeleteOnShutdown;
}

public int getWaitForShutdownDelay()
{
return waitForShutdownDelay;
}
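
Both new settings are optional, and the constructor maps a missing value to a safe default: no deletion and no wait. The sketch below isolates that null handling in a standalone class so it can be tried without Jackson or Druid on the classpath. `ShutdownSettingsSketch` and its accessor names are hypothetical; only the two defaulting expressions mirror the diff.

```java
final class ShutdownSettingsSketch
{
  private final boolean deleteOnShutdown;
  private final int waitForShutdownDelay;

  ShutdownSettingsSketch(Boolean deleteOnShutdown, Integer waitForShutdownDelay)
  {
    // A missing flag means "do not delete".
    this.deleteOnShutdown = deleteOnShutdown != null && deleteOnShutdown;
    // A missing delay means "do not wait".
    this.waitForShutdownDelay = waitForShutdownDelay != null ? waitForShutdownDelay : 0;
  }

  boolean isDeleteOnShutdown()
  {
    return deleteOnShutdown;
  }

  int getWaitForShutdownDelay()
  {
    return waitForShutdownDelay;
  }

  public static void main(String[] args)
  {
    ShutdownSettingsSketch defaults = new ShutdownSettingsSketch(null, null);
    System.out.println(defaults.isDeleteOnShutdown()); // false
    System.out.println(defaults.getWaitForShutdownDelay()); // 0
  }
}
```

Accepting boxed `Boolean` and `Integer` parameters is what lets an absent property be told apart from an explicit `false` or `0` before the default is applied.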

public enum Strategy
{
exporter, pushgateway
@@ -39,7 +39,7 @@ public void testEmitterConfigWithBadExtraLabels()

// Expect an exception thrown by our own PrometheusEmitterConfig due to invalid label key
Exception exception = Assert.assertThrows(DruidException.class, () -> {
new PrometheusEmitterConfig(PrometheusEmitterConfig.Strategy.exporter, null, null, 0, null, false, true, 60, extraLabels);
new PrometheusEmitterConfig(PrometheusEmitterConfig.Strategy.exporter, null, null, 0, null, false, true, 60, extraLabels, false, null);
});

String expectedMessage = "Invalid metric label name [label Name]. Label names must conform to the pattern [[a-zA-Z_:][a-zA-Z0-9_:]*]";
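
The expected message quotes the pattern that label names must match. As a rough illustration of why `label Name` is rejected, the sketch below applies the same regular expression with `java.util.regex`. `LabelNameSketch` and `isValidLabelName` are hypothetical names, and this is not the validation code from `PrometheusEmitterConfig`.

```java
import java.util.regex.Pattern;

final class LabelNameSketch
{
  // Regex copied from the documentation and the expected error message.
  private static final Pattern LABEL_NAME = Pattern.compile("[a-zA-Z_:][a-zA-Z0-9_:]*");

  static boolean isValidLabelName(String name)
  {
    // matches() requires the whole string to fit the pattern.
    return LABEL_NAME.matcher(name).matches();
  }

  public static void main(String[] args)
  {
    System.out.println(isValidLabelName("cluster_name")); // true
    System.out.println(isValidLabelName("label Name")); // false, contains a space
  }
}
```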