Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add nginx metrics to prometheus #36

Merged
merged 3 commits into from
Nov 30, 2016

Conversation

aledbf
Copy link
Member

@aledbf aledbf commented Nov 29, 2016

ngx

@aledbf aledbf self-assigned this Nov 29, 2016
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 29, 2016
}

func (n *NGINXController) setupMonitor(args []string) {
pc, err := newProcessCollector(true, exeMatcher{"nginx", args})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why run this in the nginx controller vs in the generic controller and report Status through an interface method? prometheus library seems pretty general purpose, and we could just ask the backend to return a map of like:

map [string]string{
  "num_procs": 10, 
  "read_bytes": 123,
...
}

Which the generic_controller converts into some export format (could be prometheus, could be write directly to some database etc)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because the generic_controller already exposes prometheus metrics for the go process. Using this approach each backend can decide what to export and the comments for each variable (and how to extract the information)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(could be prometheus, could be write directly to some database etc)

can we take this as an improvement and iterate?

Copy link
Contributor

@bprashanth bprashanth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be great if we could avoid the duplication for each metric, otherwise this mostly lgtm after nits are fixed

glog.Warningf("unexpected error obtaining nginx status info: %v", err)
return
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's a lot of repetition here, though I haven't spent time thinking about a better solution

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be isolated to this backend.
In case of caddy or haproxy there's already an exporter:
https://github.com/miekg/caddy-prometheus
https://github.com/prometheus/haproxy_exporter

n.start(cmd, done)
select {
case err := <-done:
if exitError, ok := err.(*exec.ExitError); ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please explain why this is needed (i.e why is it important to wait for master process in this way vs just running start and assuming the liveness check will fail if the master never comes up?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add the comment in the code. Basically if the nginx master process dies the workers continue to process requests (passing the checks) but in case of updates in ingress no updates will be reflected in the nginx configuration

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please explain why this is needed

done

}

func getNginxStatus() (*nginxStatus, error) {
resp, err := http.DefaultClient.Get("http://localhost:18080/internal_nginx_status")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please make this port a const, and the url path

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@aledbf aledbf force-pushed the prometheus-nginx branch 3 times, most recently from e19c4ff to a43d2b1 Compare November 29, 2016 19:12
@coveralls
Copy link

Coverage Status

Coverage decreased (-1.0%) to 39.239% when pulling a43d2b1e66db64f9a4a21820b4ae144626155c17 on aledbf:prometheus-nginx into 666cbf5 on kubernetes:master.

@aledbf aledbf changed the title WIP: Add nginx metrics to prometheus Add nginx metrics to prometheus Nov 29, 2016
@k8s-oncall
Copy link

This change is Reviewable

@coveralls
Copy link

Coverage Status

Coverage decreased (-1.02%) to 39.178% when pulling f7011d2 on aledbf:prometheus-nginx into 666cbf5 on kubernetes:master.

@aledbf
Copy link
Member Author

aledbf commented Nov 30, 2016

@bprashanth ping

Copy link
Contributor

@bprashanth bprashanth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, the simplest way to handle zombies is probably the shell but that can come as a follow up if testing turns out ok

@@ -23,4 +23,10 @@ RUN DEBIAN_FRONTEND=noninteractive apt-get update && apt-get install -y \

COPY . /

# https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zombie-reaping-problem
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we just rely on the shell to manage zombies?
eg http://blog.dscpl.com.au/2015/12/issues-with-running-as-pid-1-in-docker.html

@bprashanth bprashanth added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 30, 2016
@bprashanth bprashanth merged commit ac6930b into kubernetes:master Nov 30, 2016
@aledbf aledbf deleted the prometheus-nginx branch December 21, 2016 00:15
haoqing0110 referenced this pull request in stolostron/management-ingress Mar 5, 2021
* use go mod

* update travis to use go 1.13

* fix copyright check

* fix UT error
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants