
OpenTelemetry exporter for Prometheus on Azure Kubernetes can't connect to Prometheus service #31914

Closed
abkhan5 opened this issue Mar 22, 2024 · 5 comments
Labels: bug (Something isn't working), exporter/prometheus

Comments


abkhan5 commented Mar 22, 2024

The OpenTelemetry Collector exporter on Azure Kubernetes, when configured to connect to a Prometheus server in the same cluster, throws the following error:

2024-03-22T11:50:22.849Z error exporterhelper/queue_sender.go:97 Exporting failed. Dropping data. {"kind": "exporter", "data_type": "metrics", "name": "otlphttp/prometheus", "error": "not retryable error: Permanent error: error exporting items, request to http://prometheus-server.monitoring.svc.cluster.local:80/v1/metrics responded with HTTP Status Code 404", "dropped_items": 107}

Install Prometheus on Azure Kubernetes in the monitoring namespace using the Helm chart:

helm install prometheus prometheus-community/prometheus --debug -n monitoring
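To confirm the service names and ports the chart created (assuming the default release name prometheus in the monitoring namespace, as above), the services can be listed with:

kubectl get svc -n monitoring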

After installing Prometheus, this is what the services look like:

(screenshot of the Prometheus services in the monitoring namespace)

Install the OpenTelemetry Collector using Helm with a values.yaml file as shown below:

helm install opentelemetry-collector open-telemetry/opentelemetry-collector --debug -f values.yaml;

The values.yaml file looks like this

mode: deployment


presets:
  # enables the k8sattributesprocessor and adds it to the traces, metrics, and logs pipelines
  kubernetesAttributes:
    enabled: true
  # enables the kubeletstatsreceiver and adds it to the metrics pipelines
  kubeletMetrics:
    enabled: true
  # Enables the filelogreceiver and adds it to the logs pipelines
  logsCollection:
    enabled: true

config:
  processors:
    resourcedetection:
      detectors: [env, system]
    cumulativetodelta:
    batch:
      send_batch_max_size: 1000
      timeout: 30s
      send_batch_size : 800

    memory_limiter:
      check_interval: 1s
      limit_percentage: 70
      spike_limit_percentage: 30

  receivers:
    prometheus:
      config:  
        scrape_configs:
        - job_name: 'otel-collector'
          scrape_interval: 10s
          static_configs:
          - targets: ['0.0.0.0:8888']
        - job_name: 'node-exporter'
          scrape_interval: 10s
          static_configs:
          - targets: ['0.0.0.0:9100']

    hostmetrics:
      collection_interval: 30s
      scrapers:
        cpu:
        disk:
        memory:
        load:
          cpu_average: true
    kubeletstats:
        collection_interval: 10s
        auth_type: 'serviceAccount'
        endpoint: '${env:K8S_NODE_NAME}:10250'
        insecure_skip_verify: true
        metric_groups:
            - node
            - pod
            - container

  exporters:
    otlphttp/prometheus:      
      endpoint: "http://prometheus-server.monitoring.svc.cluster.local:80"
      tls:
        insecure: true
      
    prometheusremotewrite:
      endpoint: http://prometheus-server.monitoring.cluster.local:9090/api/v1/push
      tls:
        insecure: true

    prometheus:
      endpoint: "prometheus-server.monitoring.svc.cluster.local:80"
      const_labels:
        label1: dev2
      send_timestamps: true
      metric_expiration: 180m
      enable_open_metrics: true
      add_metric_suffixes: false      
      resource_to_telemetry_conversion:
        enabled: true

  service:
    pipelines:
      metrics:
        processors: [cumulativetodelta, batch, resourcedetection,memory_limiter]
        receivers:
          - otlp
          - hostmetrics
          - kubeletstats
        exporters:
          - otlphttp/prometheus

I expected apps deployed on AKS to show up in Prometheus. I also expected the logs in the OpenTelemetry pod to show it successfully connecting, and for metrics to show up in Prometheus.

What I see instead are 404 errors when trying to connect to Prometheus. The errors look like this:

error exporterhelper/queue_sender.go:97 Exporting failed. Dropping data. {"kind": "exporter", "data_type": "metrics", "name": "otlphttp/prometheus", "error": "not retryable error: Permanent error: error exporting items, request to http://prometheus-server.monitoring.svc.cluster.local:80/v1/metrics responded with HTTP Status Code 404", "dropped_items": 107}

While trying the prometheusremotewrite exporter I get the following error:

exporterhelper/queue_sender.go:97 Exporting failed. Dropping data. {"kind": "exporter", "data_type": "metrics", "name": "prometheusremotewrite", "error": "Permanent error: invalid temporality and type combination for metric "system.disk.io"; invalid temporality and type combination for metric "system.disk.io_time"; invalid temporality and type combination for metric "system.disk.merged"; invalid temporality and type combination for metric "system.disk.operation_time"; invalid temporality and type combination for metric "system.disk.operations"; invalid temporality and type combination for metric "system.disk.weighted_io_time"; invalid temporality and type combination for metric "system.cpu.time"; Permanent error: Permanent error: context deadline exceeded", "errorCauses": [{"error": "Permanent error: invalid temporality and type combination for metric "system.disk.io"; invalid temporality and type combination for metric "system.disk.io_time"; invalid temporality and type combination for metric "system.disk.merged"; invalid temporality and type combination for metric "system.disk.operation_time"; invalid temporality and type combination for metric "system.disk.operations"; invalid temporality and type combination for metric "system.disk.weighted_io_time"; invalid temporality and type combination for metric "system.cpu.time""}, {"error": "Permanent error: Permanent error: context deadline exceeded"}], "dropped_items": 107}

@abkhan5 abkhan5 added the bug Something isn't working label Mar 22, 2024
@TylerHelmuth TylerHelmuth transferred this issue from open-telemetry/opentelemetry-collector Mar 22, 2024
@TylerHelmuth TylerHelmuth added the receiver/prometheus Prometheus receiver label Mar 22, 2024

Pinging code owners for receiver/prometheus: @Aneurysm9 @dashpole. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@TylerHelmuth TylerHelmuth added exporter/prometheus and removed receiver/prometheus Prometheus receiver labels Mar 22, 2024

Pinging code owners for exporter/prometheus: @Aneurysm9. See Adding Labels via Comments if you do not have permissions to add labels yourself.

dashpole (Contributor) commented:

Looks like this is actually using the otlphttp exporter?


abkhan5 commented Mar 23, 2024

Looks like this is actually using the otlphttp exporter?

I've tried all three of the exporters mentioned. Each gives a different error message.
The prometheus exporter gives the following:

no existing monitoring routine is r
2024/03/23 01:39:11 collector server run finished with error: cannot start pipelines: listen tcp 10.0.117.182:80: bind: cannot assign requested address;

The otlphttp/prometheus exporter gives this:
error exporterhelper/queue_sender.go:97 Exporting failed. Dropping data. {"kind": "exporter", "data_type": "metrics", "name": "otlphttp/prometheus", "error": "not retryable error: Permanent error: error exporting items, request to http://prometheus-server.monitoring.svc.cluster.local:80/v1/metrics responded with HTTP Status Code 404", "dropped_items": 107}

and the prometheusremotewrite exporter gives:

exporterhelper/queue_sender.go:97 Exporting failed. Dropping data. {"kind": "exporter", "data_type": "metrics", "name": "prometheusremotewrite", "error": "Permanent error: invalid temporality and type combination for metric "system.disk.io"; invalid temporality and type combination for metric "system.disk.io_time"; invalid temporality and type combination for metric "system.disk.merged"; invalid temporality and type combination for metric "system.disk.operation_time"; invalid temporality and type combination for metric "system.disk.operations"; invalid temporality and type combination for metric "system.disk.weighted_io_time"; invalid temporality and type combination for metric "system.cpu.time"; Permanent error: Permanent error: context deadline exceeded", "errorCauses": [{"error": "Permanent error: invalid temporality and type combination for metric "system.disk.io"; invalid temporality and type combination for metric "system.disk.io_time"; invalid temporality and type combination for metric "system.disk.merged"; invalid temporality and type combination for metric "system.disk.operation_time"; invalid temporality and type combination for metric "system.disk.operations"; invalid temporality and type combination for metric "system.disk.weighted_io_time"; invalid temporality and type combination for metric "system.cpu.time""}, {"error": "Permanent error: Permanent error: context deadline exceeded"}], "dropped_items": 107}

dashpole (Contributor) commented:

To send OTLP to prometheus, you need to enable OTLP ingestion on the prometheus server: https://prometheus.io/docs/prometheus/latest/feature_flags/#otlp-receiver
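A minimal sketch of what that can look like with the setup above, assuming Prometheus v2.47 or later with the otlp-write-receiver feature flag enabled on the server:

exporters:
  otlphttp/prometheus:
    # The otlphttp exporter appends /v1/metrics to this base endpoint, so the
    # request goes to /api/v1/otlp/v1/metrics, the path served by Prometheus's
    # OTLP receiver. Without --enable-feature=otlp-write-receiver on the
    # Prometheus server, that path does not exist and returns 404 as above.
    endpoint: "http://prometheus-server.monitoring.svc.cluster.local:80/api/v1/otlp"
    tls:
      insecure: true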

To send any metrics to Prometheus today, you need to make sure their aggregation temporality is Cumulative, not Delta. The errors in #31914 (comment) indicate you are trying to send Delta metrics.
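In the values.yaml above, the cumulativetodelta processor is what converts the hostmetrics data to Delta temporality. A sketch of the metrics pipeline without it (not tested against this exact chart), so that prometheusremotewrite receives Cumulative metrics:

service:
  pipelines:
    metrics:
      receivers: [otlp, hostmetrics, kubeletstats]
      # cumulativetodelta is dropped so the default Cumulative temporality is
      # preserved, which is what prometheusremotewrite requires
      processors: [memory_limiter, resourcedetection, batch]
      exporters: [prometheusremotewrite]

Note that the remote-write path on a plain Prometheus server is /api/v1/write and requires starting Prometheus with --web.enable-remote-write-receiver; /api/v1/push is the path used by Cortex/Mimir-style backends.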

The prometheus exporter exposes a local endpoint on the collector (e.g. localhost:8080) which a prometheus server can scrape. It doesn't really make sense to listen on a remote IP.
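For completeness, a minimal sketch of that pattern; the 8889 port and the collector service name below are assumptions (they depend on the Helm release name and on the ports exposed on the collector's Service):

exporters:
  prometheus:
    # listen on the collector pod itself, not on the Prometheus service address
    endpoint: "0.0.0.0:8889"
    send_timestamps: true
    metric_expiration: 180m

Prometheus would then scrape the collector with a job along these lines:

  - job_name: 'otel-collector-exporter'
    static_configs:
      - targets: ['opentelemetry-collector.default.svc.cluster.local:8889']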
