
[receiver/httpcheck] emit single httpcheck.status datapoint instead of five #22994

Closed
andrzej-stencel opened this issue Jun 1, 2023 · 14 comments
Labels: closed as inactive · enhancement (New feature or request) · receiver/httpcheck (HTTP Check receiver) · Stale

Comments

andrzej-stencel (Member) commented Jun 1, 2023

Component(s)

receiver/httpcheck

Version

v0.78.0

Is your feature request related to a problem? Please describe.

The HTTP Check receiver currently emits six time series per endpoint: one for httpcheck.duration and five for httpcheck.status (one per HTTP status class). For example, the following configuration:

exporters:
  logging:
    verbosity: detailed

receivers:
  httpcheck:
    endpoint: https://opentelemetry.io

service:
  pipelines:
    metrics:
      exporters:
      - logging
      receivers:
      - httpcheck

gives the following output:

$ otelcol-contrib-0.78.0 --config config.yaml
2023-06-01T12:48:14.930+0200    info    service/telemetry.go:104        Setting up own telemetry...
2023-06-01T12:48:14.930+0200    info    service/telemetry.go:127        Serving Prometheus metrics      {"address": ":8888", "level": "Basic"}
2023-06-01T12:48:14.930+0200    info    exporter@v0.78.2/exporter.go:275        Development component. May change in the future.        {"kind": "exporter", "data_type": "metrics", "name": "logging"}
2023-06-01T12:48:14.930+0200    info    receiver@v0.78.2/receiver.go:296        Development component. May change in the future.        {"kind": "receiver", "name": "httpcheck", "data_type": "metrics"}
2023-06-01T12:48:14.954+0200    info    service/service.go:131  Starting otelcol-contrib...     {"Version": "0.78.0", "NumCPU": 16}
2023-06-01T12:48:14.954+0200    info    extensions/extensions.go:30     Starting extensions...
2023-06-01T12:48:14.956+0200    info    service/service.go:148  Everything is ready. Begin running and processing data.
2023-06-01T12:48:18.905+0200    info    MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "resource metrics": 1, "metrics": 2, "data points": 6}
2023-06-01T12:48:18.905+0200    info    ResourceMetrics #0
Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope otelcol/httpcheckreceiver 0.78.0
Metric #0
Descriptor:
     -> Name: httpcheck.duration
     -> Description: Measures the duration of the HTTP check.
     -> Unit: ms
     -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
     -> http.url: Str(https://opentelemetry.io)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 941
Metric #1
Descriptor:
     -> Name: httpcheck.status
     -> Description: 1 if the check resulted in status_code matching the status_class, otherwise 0.
     -> Unit: 1
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
     -> http.url: Str(https://opentelemetry.io)
     -> http.status_code: Int(200)
     -> http.method: Str(GET)
     -> http.status_class: Str(1xx)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 0
NumberDataPoints #1
Data point attributes:
     -> http.url: Str(https://opentelemetry.io)
     -> http.status_code: Int(200)
     -> http.method: Str(GET)
     -> http.status_class: Str(2xx)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 1
NumberDataPoints #2
Data point attributes:
     -> http.url: Str(https://opentelemetry.io)
     -> http.status_code: Int(200)
     -> http.method: Str(GET)
     -> http.status_class: Str(3xx)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 0
NumberDataPoints #3
Data point attributes:
     -> http.url: Str(https://opentelemetry.io)
     -> http.status_code: Int(200)
     -> http.method: Str(GET)
     -> http.status_class: Str(4xx)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 0
NumberDataPoints #4
Data point attributes:
     -> http.url: Str(https://opentelemetry.io)
     -> http.status_code: Int(200)
     -> http.method: Str(GET)
     -> http.status_class: Str(5xx)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 0
        {"kind": "exporter", "data_type": "metrics", "name": "logging"}

The data points with value 0 carry little information. Ideally, I would expect only one httpcheck.status data point to be emitted per endpoint.

Describe the solution you'd like

I propose making it possible, via a configuration option, to emit only non-zero data points:

receivers:
  httpcheck:
    endpoint: https://opentelemetry.io
    emit_zero_values: false # we might need a better name for this configuration property

so that the output would be something like:

$ otelcol-contrib-0.78.0 --config config.yaml
2023-06-01T12:48:14.930+0200    info    service/telemetry.go:104        Setting up own telemetry...
2023-06-01T12:48:14.930+0200    info    service/telemetry.go:127        Serving Prometheus metrics      {"address": ":8888", "level": "Basic"}
2023-06-01T12:48:14.930+0200    info    exporter@v0.78.2/exporter.go:275        Development component. May change in the future.        {"kind": "exporter", "data_type": "metrics", "name": "logging"}
2023-06-01T12:48:14.930+0200    info    receiver@v0.78.2/receiver.go:296        Development component. May change in the future.        {"kind": "receiver", "name": "httpcheck", "data_type": "metrics"}
2023-06-01T12:48:14.954+0200    info    service/service.go:131  Starting otelcol-contrib...     {"Version": "0.78.0", "NumCPU": 16}
2023-06-01T12:48:14.954+0200    info    extensions/extensions.go:30     Starting extensions...
2023-06-01T12:48:14.956+0200    info    service/service.go:148  Everything is ready. Begin running and processing data.
2023-06-01T12:48:18.905+0200    info    MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "logging", "resource metrics": 1, "metrics": 2, "data points": 2}
2023-06-01T12:48:18.905+0200    info    ResourceMetrics #0
Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope otelcol/httpcheckreceiver 0.78.0
Metric #0
Descriptor:
     -> Name: httpcheck.duration
     -> Description: Measures the duration of the HTTP check.
     -> Unit: ms
     -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
     -> http.url: Str(https://opentelemetry.io)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 941
Metric #1
Descriptor:
     -> Name: httpcheck.status
     -> Description: 1 if the check resulted in status_code matching the status_class, otherwise 0.
     -> Unit: 1
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
     -> http.url: Str(https://opentelemetry.io)
     -> http.status_code: Int(200)
     -> http.method: Str(GET)
     -> http.status_class: Str(2xx)
StartTimestamp: 2023-06-01 10:48:14.930686644 +0000 UTC
Timestamp: 2023-06-01 10:48:17.963384163 +0000 UTC
Value: 1
        {"kind": "exporter", "data_type": "metrics", "name": "logging"}

I also think this might be a good default for this receiver.

Describe alternatives you've considered

I suppose an alternative would be to add a processor to the pipeline that will filter out the zero data points. But honestly I wasn't able to find a way to filter out metric data points based on their value using either the Filter processor or Metrics Transform processor. Is this possible? 🤔

Additional context

Telemetry is costly. We don't want to collect metrics that carry little value.

@andrzej-stencel andrzej-stencel added enhancement New feature or request needs triage New item requiring triage labels Jun 1, 2023
@github-actions github-actions bot added the receiver/httpcheck HTTP Check receiver label Jun 1, 2023
github-actions bot commented Jun 1, 2023

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot commented Aug 1, 2023

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

  • receiver/httpcheck: @codeboten

See Adding Labels via Comments if you do not have permissions to add labels yourself.

andrzej-stencel (Member, Author) commented:

@codeboten can you please take a look? I believe this issue is important, as I couldn't find a way to exclude the zero time series using a processor:

> I suppose an alternative would be to add a processor to the pipeline that will filter out the zero data points. But honestly I wasn't able to find a way to filter out metric data points based on their value using either the Filter processor or Metrics Transform processor. Is this possible? 🤔

codeboten (Contributor) commented:

@astencel-sumo will take a look this week

github-actions bot commented Dec 5, 2023

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Dec 5, 2023
andrzej-stencel (Member, Author) commented:

> @astencel-sumo will take a look this week

@codeboten is this going to be the week? 😉

codeboten (Contributor) commented:

@astencel-sumo yes... sorry for the delay!

@codeboten codeboten removed the Stale label Dec 8, 2023
codeboten (Contributor) commented:

As discussed in the Dec-13 SIG call, the plan to move this forward is to allow a configuration option that filters out the http.status_class attribute, resulting in a single httpcheck.status data point per target.

A stretch goal is to make this attribute filtering generic enough to be used in all scrapers 😬
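
A rough sketch of what such a receiver option might look like (the suppress_status_class_attribute name and its placement are purely hypothetical; no such setting exists yet):

receivers:
  httpcheck:
    targets:
    - endpoint: https://opentelemetry.io
    # Hypothetical option: drop the http.status_class attribute so that a
    # single httpcheck.status data point is emitted per target.
    suppress_status_class_attribute: true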

andrzej-stencel (Member, Author) commented:

> @codeboten can you please take a look? I believe this issue is important, as I couldn't find a way to exclude the zero time series using a processor:
>
> > I suppose an alternative would be to add a processor to the pipeline that will filter out the zero data points. But honestly I wasn't able to find a way to filter out metric data points based on their value using either the Filter processor or Metrics Transform processor. Is this possible? 🤔

I found a way to exclude zero values with the Filter processor:

processors:
  filter:
    metrics:
      datapoint:
        - 'metric.name == "httpcheck.status" and value_int == 0'

However, this is not really what I want. I do want to get the zero values when the endpoint is down; I just want a single zero data point, not five. Let me rephrase the issue title to account for this.

@andrzej-stencel andrzej-stencel changed the title [receiver/httpcheck] emit only non-zero data points [receiver/httpcheck] emit single httpcheck.status datapoint instead of five Dec 14, 2023
andrzej-stencel (Member, Author) commented:

I think this workaround makes a reasonable amount of sense:

processors:
  filter/drop-non-2xx-datapoints:
    metrics:
      datapoint:
        - 'metric.name == "httpcheck.status" and attributes["http.status_class"] != "2xx"'

andrzej-stencel (Member, Author) commented Dec 14, 2023

Here's a full example:

exporters:
  debug:
    verbosity: detailed
  prometheus:
    endpoint: localhost:1234
processors:
  filter/drop-non-2xx-datapoints:
    metrics:
      datapoint:
        - 'metric.name == "httpcheck.status" and attributes["http.status_class"] != "2xx"'
  transform/drop-status-class-attribute:
    metric_statements:
    - context: datapoint
      statements:
      - keep_keys(attributes, ["http.url", "http.status_code", "http.method"]) where metric.name == "httpcheck.status"
receivers:
  httpcheck:
    collection_interval: 3s
    targets:
    - endpoint: https://opentelemetry.io
    - endpoint: https://non.existent.address
service:
  pipelines:
    metrics:
      exporters:
      - debug
      - prometheus
      processors:
      - filter/drop-non-2xx-datapoints
      - transform/drop-status-class-attribute
      receivers:
      - httpcheck

Here's the output from the collector:

$ otelcol-contrib-0.89.0-darwin_arm64 --config config.yaml
2023-12-14T10:06:38.819+0100    info    service@v0.89.0/telemetry.go:85 Setting up own telemetry...
2023-12-14T10:06:38.819+0100    info    service@v0.89.0/telemetry.go:202        Serving Prometheus metrics   {"address": ":8888", "level": "Basic"}
2023-12-14T10:06:38.819+0100    info    exporter@v0.89.0/exporter.go:275        Development component. May change in the future.      {"kind": "exporter", "data_type": "metrics", "name": "debug"}
2023-12-14T10:06:38.819+0100    info    receiver@v0.89.0/receiver.go:296        Development component. May change in the future.      {"kind": "receiver", "name": "httpcheck", "data_type": "metrics"}
2023-12-14T10:06:38.819+0100    info    service@v0.89.0/service.go:143  Starting otelcol-contrib...     {"Version": "0.89.0", "NumCPU": 10}
2023-12-14T10:06:38.819+0100    info    extensions/extensions.go:34     Starting extensions...
2023-12-14T10:06:38.820+0100    info    service@v0.89.0/service.go:169  Everything is ready. Begin running and processing data.
2023-12-14T10:06:40.380+0100    info    MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "debug", "resource metrics": 1, "metrics": 3, "data points": 5}
2023-12-14T10:06:40.381+0100    info    ResourceMetrics #0
Resource SchemaURL: 
ScopeMetrics #0
ScopeMetrics SchemaURL: 
InstrumentationScope otelcol/httpcheckreceiver 0.89.0
Metric #0
Descriptor:
     -> Name: httpcheck.duration
     -> Description: Measures the duration of the HTTP check.
     -> Unit: ms
     -> DataType: Gauge
NumberDataPoints #0
Data point attributes:
     -> http.url: Str(https://non.existent.address)
StartTimestamp: 2023-12-14 09:06:38.819473 +0000 UTC
Timestamp: 2023-12-14 09:06:39.822583 +0000 UTC
Value: 5
NumberDataPoints #1
Data point attributes:
     -> http.url: Str(https://opentelemetry.io)
StartTimestamp: 2023-12-14 09:06:38.819473 +0000 UTC
Timestamp: 2023-12-14 09:06:39.822618 +0000 UTC
Value: 557
Metric #1
Descriptor:
     -> Name: httpcheck.error
     -> Description: Records errors occurring during HTTP check.
     -> Unit: {error}
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
     -> http.url: Str(https://non.existent.address)
     -> error.message: Str(Get "https://non.existent.address": dial tcp: lookup non.existent.address: no such host)
StartTimestamp: 2023-12-14 09:06:38.819473 +0000 UTC
Timestamp: 2023-12-14 09:06:39.822583 +0000 UTC
Value: 1
Metric #2
Descriptor:
     -> Name: httpcheck.status
     -> Description: 1 if the check resulted in status_code matching the status_class, otherwise 0.
     -> Unit: 1
     -> DataType: Sum
     -> IsMonotonic: false
     -> AggregationTemporality: Cumulative
NumberDataPoints #0
Data point attributes:
     -> http.url: Str(https://non.existent.address)
     -> http.status_code: Int(0)
     -> http.method: Str(GET)
StartTimestamp: 2023-12-14 09:06:38.819473 +0000 UTC
Timestamp: 2023-12-14 09:06:39.822583 +0000 UTC
Value: 0
NumberDataPoints #1
Data point attributes:
     -> http.url: Str(https://opentelemetry.io)
     -> http.status_code: Int(200)
     -> http.method: Str(GET)
StartTimestamp: 2023-12-14 09:06:38.819473 +0000 UTC
Timestamp: 2023-12-14 09:06:39.822618 +0000 UTC
Value: 1
        {"kind": "exporter", "data_type": "metrics", "name": "debug"}

Here's the output from the Prometheus exporter:

$ curl localhost:1234/metrics
# HELP httpcheck_duration_milliseconds Measures the duration of the HTTP check.
# TYPE httpcheck_duration_milliseconds gauge
httpcheck_duration_milliseconds{http_url="https://non.existent.address"} 4
httpcheck_duration_milliseconds{http_url="https://opentelemetry.io"} 176
# HELP httpcheck_error Records errors occurring during HTTP check.
# TYPE httpcheck_error gauge
httpcheck_error{error_message="Get \"https://non.existent.address\": dial tcp: lookup non.existent.address: no such host",http_url="https://non.existent.address"} 1
# HELP httpcheck_status 1 if the check resulted in status_code matching the status_class, otherwise 0.
# TYPE httpcheck_status gauge
httpcheck_status{http_method="GET",http_status_code="0",http_url="https://non.existent.address"} 0
httpcheck_status{http_method="GET",http_status_code="200",http_url="https://opentelemetry.io"} 1
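
As a side note, once httpcheck_status carries a single series per target, alerting on it becomes straightforward. A minimal sketch of a Prometheus alerting rule built on the filtered output above (the group, alert name, and duration are illustrative, not part of the receiver):

groups:
- name: httpcheck
  rules:
  - alert: EndpointDown
    # With the filter/transform workaround above, httpcheck_status has one
    # series per target: 1 when the check succeeds, 0 otherwise.
    expr: httpcheck_status == 0
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: 'HTTP check failed for {{ $labels.http_url }}'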

github-actions bot commented:

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot commented:

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Apr 15, 2024
github-actions bot commented:

This issue has been closed as inactive because it has been stale for 120 days with no activity.

@github-actions github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Jun 14, 2024