
OpenTelemetry exporter for Prometheus on Azure Kubernetes can't connect to Prometheus service #31914

Closed
abkhan5 opened this issue Mar 22, 2024 · 5 comments
Labels: bug (Something isn't working), exporter/prometheus

Comments


abkhan5 commented Mar 22, 2024

The OpenTelemetry Collector exporter on Azure Kubernetes, when configured to connect to a Prometheus server in the same cluster, throws the following error:

2024-03-22T11:50:22.849Z error exporterhelper/queue_sender.go:97 Exporting failed. Dropping data. {"kind": "exporter", "data_type": "metrics", "name": "otlphttp/prometheus", "error": "not retryable error: Permanent error: error exporting items, request to http://prometheus-server.monitoring.svc.cluster.local:80/v1/metrics responded with HTTP Status Code 404", "dropped_items": 107}

Install Prometheus on Azure Kubernetes in the monitoring namespace using the Helm chart:

helm install prometheus prometheus-community/prometheus --debug -n monitoring
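To confirm the service names and ports the chart created (assuming the default release name prometheus in the monitoring namespace, as above), the services can be listed with:

kubectl get svc -n monitoring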

After installing Prometheus, this is what the services look like:

(screenshot of the Prometheus services in the monitoring namespace)

Install the OpenTelemetry Collector using Helm with a values.yaml file as shown below:

helm install opentelemetry-collector open-telemetry/opentelemetry-collector --debug -f values.yaml;

The values.yaml file looks like this

mode: deployment


presets:
  # enables the k8sattributesprocessor and adds it to the traces, metrics, and logs pipelines
  kubernetesAttributes:
    enabled: true
  # enables the kubeletstatsreceiver and adds it to the metrics pipelines
  kubeletMetrics:
    enabled: true
  # Enables the filelogreceiver and adds it to the logs pipelines
  logsCollection:
    enabled: true

config:
  processors:
    resourcedetection:
      detectors: [env, system]
    cumulativetodelta:
    batch:
      send_batch_max_size: 1000
      timeout: 30s
      send_batch_size : 800

    memory_limiter:
      check_interval: 1s
      limit_percentage: 70
      spike_limit_percentage: 30

  receivers:
    prometheus:
      config:  
        scrape_configs:
        - job_name: 'otel-collector'
          scrape_interval: 10s
          static_configs:
          - targets: ['0.0.0.0:8888']
        - job_name: 'node-exporter'
          scrape_interval: 10s
          static_configs:
          - targets: ['0.0.0.0:9100']

    hostmetrics:
      collection_interval: 30s
      scrapers:
        cpu:
        disk:
        memory:
        load:
          cpu_average: true
    kubeletstats:
        collection_interval: 10s
        auth_type: 'serviceAccount'
        endpoint: '${env:K8S_NODE_NAME}:10250'
        insecure_skip_verify: true
        metric_groups:
            - node
            - pod
            - container

  exporters:
    otlphttp/prometheus:      
      endpoint: "http://prometheus-server.monitoring.svc.cluster.local:80"
      tls:
        insecure: true
      
    prometheusremotewrite:
      endpoint: http://prometheus-server.monitoring.cluster.local:9090/api/v1/push
      tls:
        insecure: true

    prometheus:
      endpoint: "prometheus-server.monitoring.svc.cluster.local:80"
      const_labels:
        label1: dev2
      send_timestamps: true
      metric_expiration: 180m
      enable_open_metrics: true
      add_metric_suffixes: false      
      resource_to_telemetry_conversion:
        enabled: true

  service:
    pipelines:
      metrics:
        processors: [cumulativetodelta, batch, resourcedetection,memory_limiter]
        receivers:
          - otlp
          - hostmetrics
          - kubeletstats
        exporters:
          - otlphttp/prometheus

I expected apps deployed on AKS to show up in Prometheus. I also expected the logs in the OpenTelemetry pod to show it successfully connecting, and for metrics to show up in Prometheus.

What I see instead are 404 errors when trying to connect to Prometheus. The errors look like this:

error exporterhelper/queue_sender.go:97 Exporting failed. Dropping data. {"kind": "exporter", "data_type": "metrics", "name": "otlphttp/prometheus", "error": "not retryable error: Permanent error: error exporting items, request to http://prometheus-server.monitoring.svc.cluster.local:80/v1/metrics responded with HTTP Status Code 404", "dropped_items": 107}

While trying the prometheusremotewrite exporter I get the following error:

exporterhelper/queue_sender.go:97 Exporting failed. Dropping data. {"kind": "exporter", "data_type": "metrics", "name": "prometheusremotewrite", "error": "Permanent error: invalid temporality and type combination for metric "system.disk.io"; invalid temporality and type combination for metric "system.disk.io_time"; invalid temporality and type combination for metric "system.disk.merged"; invalid temporality and type combination for metric "system.disk.operation_time"; invalid temporality and type combination for metric "system.disk.operations"; invalid temporality and type combination for metric "system.disk.weighted_io_time"; invalid temporality and type combination for metric "system.cpu.time"; Permanent error: Permanent error: context deadline exceeded", "errorCauses": [{"error": "Permanent error: invalid temporality and type combination for metric "system.disk.io"; invalid temporality and type combination for metric "system.disk.io_time"; invalid temporality and type combination for metric "system.disk.merged"; invalid temporality and type combination for metric "system.disk.operation_time"; invalid temporality and type combination for metric "system.disk.operations"; invalid temporality and type combination for metric "system.disk.weighted_io_time"; invalid temporality and type combination for metric "system.cpu.time""}, {"error": "Permanent error: Permanent error: context deadline exceeded"}], "dropped_items": 107}

@abkhan5 abkhan5 added the bug Something isn't working label Mar 22, 2024
@TylerHelmuth TylerHelmuth transferred this issue from open-telemetry/opentelemetry-collector Mar 22, 2024
@TylerHelmuth TylerHelmuth added the receiver/prometheus Prometheus receiver label Mar 22, 2024

Pinging code owners for receiver/prometheus: @Aneurysm9 @dashpole. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@TylerHelmuth TylerHelmuth added exporter/prometheus and removed receiver/prometheus Prometheus receiver labels Mar 22, 2024

Pinging code owners for exporter/prometheus: @Aneurysm9. See Adding Labels via Comments if you do not have permissions to add labels yourself.

dashpole (Contributor) commented:

Looks like this is actually using the otlphttp exporter?


abkhan5 commented Mar 23, 2024

Looks like this is actually using the otlphttp exporter?

I've tried all three of the exporters mentioned. Each gives a different error message.
The prometheus exporter gives the following:

no existing monitoring routine is r
2024/03/23 01:39:11 collector server run finished with error: cannot start pipelines: listen tcp 10.0.117.182:80: bind: cannot assign requested address;

The otlphttp/prometheus exporter gives this:
error exporterhelper/queue_sender.go:97 Exporting failed. Dropping data. {"kind": "exporter", "data_type": "metrics", "name": "otlphttp/prometheus", "error": "not retryable error: Permanent error: error exporting items, request to http://prometheus-server.monitoring.svc.cluster.local:80/v1/metrics responded with HTTP Status Code 404", "dropped_items": 107}

and the prometheusremotewrite exporter gives:

exporterhelper/queue_sender.go:97 Exporting failed. Dropping data. {"kind": "exporter", "data_type": "metrics", "name": "prometheusremotewrite", "error": "Permanent error: invalid temporality and type combination for metric "system.disk.io"; invalid temporality and type combination for metric "system.disk.io_time"; invalid temporality and type combination for metric "system.disk.merged"; invalid temporality and type combination for metric "system.disk.operation_time"; invalid temporality and type combination for metric "system.disk.operations"; invalid temporality and type combination for metric "system.disk.weighted_io_time"; invalid temporality and type combination for metric "system.cpu.time"; Permanent error: Permanent error: context deadline exceeded", "errorCauses": [{"error": "Permanent error: invalid temporality and type combination for metric "system.disk.io"; invalid temporality and type combination for metric "system.disk.io_time"; invalid temporality and type combination for metric "system.disk.merged"; invalid temporality and type combination for metric "system.disk.operation_time"; invalid temporality and type combination for metric "system.disk.operations"; invalid temporality and type combination for metric "system.disk.weighted_io_time"; invalid temporality and type combination for metric "system.cpu.time""}, {"error": "Permanent error: Permanent error: context deadline exceeded"}], "dropped_items": 107}

dashpole (Contributor) commented:

To send OTLP to prometheus, you need to enable OTLP ingestion on the prometheus server: https://prometheus.io/docs/prometheus/latest/feature_flags/#otlp-receiver
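A minimal sketch of what that can look like with the setup above, assuming Prometheus v2.47 or later with the otlp-write-receiver feature flag enabled on the server:

exporters:
  otlphttp/prometheus:
    # The otlphttp exporter appends /v1/metrics to this base endpoint, so the
    # request goes to /api/v1/otlp/v1/metrics, the path served by Prometheus's
    # OTLP receiver. Without --enable-feature=otlp-write-receiver on the
    # Prometheus server, that path does not exist and returns 404 as above.
    endpoint: "http://prometheus-server.monitoring.svc.cluster.local:80/api/v1/otlp"
    tls:
      insecure: true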

To send any metrics to Prometheus today, you need to make sure their aggregation temporality is Cumulative, not Delta. The errors in #31914 (comment) indicate you are trying to send Delta metrics.
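In the values.yaml above, the cumulativetodelta processor is what converts the hostmetrics data to Delta temporality. A sketch of the metrics pipeline without it (not tested against this exact chart), so that prometheusremotewrite receives Cumulative metrics:

service:
  pipelines:
    metrics:
      receivers: [otlp, hostmetrics, kubeletstats]
      # cumulativetodelta is dropped so the default Cumulative temporality is
      # preserved, which is what prometheusremotewrite requires
      processors: [memory_limiter, resourcedetection, batch]
      exporters: [prometheusremotewrite]

Note that the remote-write path on a plain Prometheus server is /api/v1/write and requires starting Prometheus with --web.enable-remote-write-receiver; /api/v1/push is the path used by Cortex/Mimir-style backends.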

The prometheus exporter exposes a local endpoint on the collector (e.g. localhost:8080) which a prometheus server can scrape. It doesn't really make sense to listen on a remote IP.
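For completeness, a minimal sketch of that pattern; the 8889 port and the collector service name below are assumptions (they depend on the Helm release name and on the ports exposed on the collector's Service):

exporters:
  prometheus:
    # listen on the collector pod itself, not on the Prometheus service address
    endpoint: "0.0.0.0:8889"
    send_timestamps: true
    metric_expiration: 180m

Prometheus would then scrape the collector with a job along these lines:

  - job_name: 'otel-collector-exporter'
    static_configs:
      - targets: ['opentelemetry-collector.default.svc.cluster.local:8889']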
