Kafka-Minion as alternative to Burrow for consumer lag monitoring #259
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
apiVersion: v1 | ||
kind: Service | ||
metadata: | ||
name: metrics-minion | ||
namespace: kafka | ||
labels: &labels | ||
app: kafka-minion | ||
type: openmetrics | ||
spec: | ||
selector: *labels | ||
ports: | ||
- name: http | ||
port: 8080 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
apiVersion: apps/v1 | ||
kind: Deployment | ||
metadata: | ||
name: metrics-minion | ||
namespace: kafka | ||
labels: &labels | ||
app: kafka-minion | ||
type: openmetrics | ||
spec: | ||
replicas: 1 | ||
selector: | ||
matchLabels: *labels | ||
template: | ||
metadata: | ||
labels: *labels | ||
annotations: | ||
prometheus.io/scrape: "true" | ||
prometheus.io/port: "8080" | ||
prometheus.io/path: /metrics | ||
spec: | ||
containers: | ||
- name: kafka-minion | ||
image: quay.io/google-cloud-tools/kafka-minion:v0.1.2@sha256:756faaa4b7ce39b9f7d76c0cf9570ab0cf6a9c921e407acd0f12ca933abe202e | ||
env: | ||
- name: TELEMETRY_HOST | ||
value: 0.0.0.0 | ||
- name: TELEMETRY_PORT | ||
value: "8080" | ||
- name: EXPORTER_IGNORE_SYSTEM_TOPICS | ||
value: "true" | ||
- name: EXPORTER_METRICS_PREFIX | ||
value: kafka_minion | ||
- name: LOG_LEVEL | ||
value: info | ||
- name: KAFKA_BROKERS | ||
value: kafka-0.broker:9092, kafka-1.broker:9092, kafka-2.broker:9092 | ||
- name: KAFKA_CONSUMER_OFFSETS_TOPIC_NAME | ||
value: __consumer_offsets | ||
ports: | ||
- name: http | ||
containerPort: 8080 | ||
readinessProbe: | ||
failureThreshold: 1 | ||
httpGet: | ||
port: http | ||
path: /metrics | ||
livenessProbe: | ||
failureThreshold: 3 | ||
httpGet: | ||
port: http | ||
path: /metrics | ||
We have a separate endpoint that checks whether it is still connected to at least one Kafka broker:
5fc33f4 switches to this endpoint but keeps everything else at its defaults. |
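The dedicated endpoint mentioned in the comment above is not named in this thread. As a minimal sketch, assuming it is served at `/healthcheck` on the same `http` port (an assumption to verify against the Kafka Minion version in use), the liveness probe of the `kafka-minion` container might look like this:

```yaml
# Sketch only: the path below is an assumption, not a value quoted in this review.
livenessProbe:
  failureThreshold: 3    # same value as in the Deployment above
  httpGet:
    port: http           # named container port from the Deployment above
    path: /healthcheck   # assumed broker-connectivity endpoint; confirm before use
```

Pointing the liveness probe at a connectivity check rather than `/metrics` means the container is restarted when it loses its brokers, not only when the HTTP server stops answering.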
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
resources: | ||
- kafka-minion-service.yaml | ||
- kafka-minion.yaml |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,27 @@ | ||
# Export metrics to Prometheus | ||
|
||
Kafka uses JMX to expose metrics, as is already [enabled](https://github.com/Yolean/kubernetes-kafka/pull/96) for broker pods. There are many ways to use JMX. For example, [Kafka Manager](../yahoo-kafka-manager/) uses it to display current broker traffic. | ||
JMX is already [enabled](https://github.com/Yolean/kubernetes-kafka/pull/96) for broker pods (TODO extract to kustomization). There are many ways to use JMX. For example, [Kafka Manager](../yahoo-kafka-manager/) uses it to display current broker traffic. | ||
|
||
At Yolean we use Prometheus. This folder adds a sidecar to the broker pods that exports selected JMX metrics over HTTP in Prometheus format. To add a container to an existing pod we must use the `patch` command: | ||
This folder adds a sidecar to the broker pods that exports selected JMX metrics over HTTP in Prometheus format. To add a container to an existing pod we must use the `patch` command: | ||
|
||
Using kubectl 1.14+ | ||
|
||
``` | ||
kubectl --namespace kafka apply -k prometheus/ | ||
``` | ||
|
||
Using pre-1.14 kubectl: | ||
|
||
``` | ||
kubectl --namespace kafka apply -f prometheus/10-metrics-config.yml | ||
kubectl --namespace kafka patch statefulset kafka --patch "$(cat prometheus/50-kafka-jmx-exporter-patch.yml )" | ||
``` | ||
|
||
## Consumer lag monitoring | ||
|
||
See [Burrow](../linkedin-burrow) | ||
or [Kafka Minion](../consumers-prometheus/) | ||
Maybe add some comments on which one to prefer depending on the use case or environment?
In fact they can supplement each other, and it may be a valid choice to operate both of them.
There's a lot of research to be done for anyone who wants to set up a Kafka stack, and I see this repository as a collection of examples rather than a place to discuss the choices. |
||
|
||
## Prometheus Operator | ||
|
||
Use the [prometheus-operator](../variants/prometheus-operator/) kustomization. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
bases: | ||
- ../kafka | ||
# or any variant with kafka included, such as | ||
#- ../variants/scale-1 | ||
resources: | ||
- 10-metrics-config.yml | ||
patchesStrategicMerge: | ||
- 50-kafka-jmx-exporter-patch.yml |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
# Allows the "k8s" prometheus from Prometheus Operator contrib to do service discovery in the kafka namespace | ||
--- | ||
apiVersion: rbac.authorization.k8s.io/v1 | ||
kind: Role | ||
metadata: | ||
name: prometheus-k8s | ||
namespace: kafka | ||
rules: | ||
- apiGroups: | ||
- "" | ||
resources: | ||
- services | ||
- endpoints | ||
- pods | ||
verbs: | ||
- get | ||
- list | ||
- watch | ||
--- | ||
apiVersion: rbac.authorization.k8s.io/v1 | ||
kind: RoleBinding | ||
metadata: | ||
name: prometheus-k8s | ||
namespace: kafka | ||
roleRef: | ||
apiGroup: rbac.authorization.k8s.io | ||
kind: Role | ||
name: prometheus-k8s | ||
subjects: | ||
- kind: ServiceAccount | ||
name: prometheus-k8s | ||
namespace: monitoring |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
--- | ||
apiVersion: v1 | ||
kind: Service | ||
metadata: | ||
name: broker-monitoring | ||
namespace: kafka | ||
labels: | ||
app: kafka | ||
spec: | ||
publishNotReadyAddresses: true | ||
ports: | ||
- name: fromjmx | ||
port: 5556 | ||
selector: | ||
app: kafka | ||
--- | ||
apiVersion: monitoring.coreos.com/v1 | ||
kind: ServiceMonitor | ||
metadata: | ||
name: kafka | ||
namespace: monitoring | ||
labels: | ||
k8s-app: kafka | ||
spec: | ||
namespaceSelector: | ||
matchNames: | ||
- kafka | ||
selector: | ||
matchLabels: | ||
app: kafka | ||
endpoints: | ||
# https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#endpoint | ||
- bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token | ||
interval: 120s | ||
scrapeTimeout: 119s | ||
port: fromjmx | ||
scheme: http | ||
path: /metrics |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
apiVersion: monitoring.coreos.com/v1 | ||
kind: ServiceMonitor | ||
metadata: | ||
name: kafka-metrics-minion | ||
namespace: monitoring | ||
labels: | ||
k8s-app: kafka-metrics-minion | ||
spec: | ||
namespaceSelector: | ||
matchNames: | ||
- kafka | ||
selector: | ||
matchLabels: | ||
app: kafka-minion | ||
type: openmetrics | ||
endpoints: | ||
- bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token | ||
interval: 30s | ||
scrapeTimeout: 30s | ||
port: http | ||
scheme: http | ||
path: /metrics |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
bases: | ||
- ../../prometheus | ||
- ../../consumers-prometheus | ||
resources: | ||
- k8s-kafka-rbac.yaml | ||
# with base ../../prometheus | ||
- k8s-kafka-servicemonitor.yaml | ||
# with base ../../consumers-prometheus | ||
- k8s-minion-servicemonitor.yaml |
Kafka Minion 1.1.2 introduces a dedicated readiness check that returns 200 once Kafka Minion has initially consumed the `__consumer_offsets` topic, which is the point at which it starts exposing metrics. This feature is required to run Kafka Minion in high availability (multiple replicas), which is recommended if you intend to set up alerting on these metrics. Since this can take some time, it requires some loose timeouts:
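The loose timeouts mentioned in the readiness comment are not spelled out in this thread. A minimal sketch of what such a readiness probe could look like follows. The `/readycheck` path and every timing value are assumptions chosen for illustration, not values from this pull request or from the Kafka Minion documentation:

```yaml
# Sketch only: path and timings are assumptions; tune them to how long
# the initial read of __consumer_offsets takes in your cluster.
readinessProbe:
  httpGet:
    port: http             # named container port from the Deployment
    path: /readycheck      # assumed name of the dedicated readiness endpoint
  initialDelaySeconds: 10  # wait this long before the first probe
  periodSeconds: 10        # probe every 10 seconds
  timeoutSeconds: 5        # each probe may take up to 5 seconds to answer
  failureThreshold: 3      # consecutive failures before a ready pod is marked not ready
```

A pod that fails its readiness probe is not restarted; it is only left out of the Service endpoints until the probe succeeds, so a slow initial read of the offsets topic delays scraping rather than triggering a restart loop.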