Add Prometheus instrumentation #181

Merged
merged 16 commits into from
Jan 17, 2020
1 change: 1 addition & 0 deletions asset.go
@@ -8,6 +8,7 @@ type Asset struct {
ID int `json:"id,string"`
ContentType string `json:"content-type"`
Date time.Time `json:"date"`
Size int64 `json:"size"`
Sha256 string `json:"sha256"`
URLs []string `json:"urls"`
Exists bool `json:"exists"`
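As a rough illustration of how the new `Size` field serializes, the sketch below copies only the fields visible in this hunk (the rest of the real struct is elided here) and round-trips a value through `encoding/json`. The sample values and the `decodeAsset` helper are made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Asset mirrors only the fields visible in the hunk above;
// the remaining fields of the real struct are omitted here.
type Asset struct {
	ID          int       `json:"id,string"`
	ContentType string    `json:"content-type"`
	Date        time.Time `json:"date"`
	Size        int64     `json:"size"`
	Sha256      string    `json:"sha256"`
	URLs        []string  `json:"urls"`
	Exists      bool      `json:"exists"`
}

// decodeAsset is a hypothetical helper that parses asset metadata from JSON.
func decodeAsset(data string) (Asset, error) {
	var a Asset
	err := json.Unmarshal([]byte(data), &a)
	return a, err
}

func main() {
	a, err := decodeAsset(`{"id":"1","content-type":"application/tar","date":"2017-12-02T15:04:05Z","size":1002567}`)
	if err != nil {
		panic(err)
	}
	// "id" travels as a JSON string, while "size" is a plain JSON number.
	fmt.Println(a.ID, a.Size)

	out, err := json.Marshal(a)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Metadata stored before this change has no `size` key; with this decoding such records would simply yield `Size == 0`.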
1 change: 1 addition & 0 deletions docs/assets.md
@@ -21,6 +21,7 @@ Meta data of assets are stored in etcd as follows:
"content-type": "application/tar",
"date": "2017-12-02T15:04:05Z",
"sha256": "2e0390eb024a52963db7b95e84a9c2b12c004054a7bad9a97ec0c7c89d4681d2",
"size": 1002567,
"urls": [
"http://10.1.2.3:10080/api/v1/assets/hoge.tar.gz",
"http://10.98.76.54:10080/api/v1/assets/hoge.tar.gz"
2 changes: 2 additions & 0 deletions docs/image_management.md
@@ -31,6 +31,7 @@ The structure of the index is a JSON like this:
{
"id": "1688.5.3",
"date": "2017-12-02T15:04:05Z",
"size": 10000000,
"urls": [
"http://10.1.2.3:10080/api/v1/images/coreos/1688.5.3",
"http://10.98.76.54:10080/api/v1/images/coreos/1688.5.3"
@@ -40,6 +41,7 @@ The structure of the index is a JSON like this:
{
"id": "1745.4.0",
"date": "2018-05-29T01:23:45Z",
"size": 10000000,
"urls": [
"http://10.1.2.3:10080/api/v1/images/coreos/1745.4.0"
],
17 changes: 17 additions & 0 deletions docs/metrics.md
@@ -0,0 +1,17 @@
Metrics
=======

Sabakan exposes the following metrics in the Prometheus format. The listen address and update interval can be configured with CLI flags (see [Usage](sabakan.md#Usage)). All of these metrics are prefixed with `sabakan_`.

| Name               | Description                                                             | Type    | Labels                                                |
| ------------------ | ----------------------------------------------------------------------- | ------- | ----------------------------------------------------- |
| machine_status     | The machine status (see [Machine States](lifecycle.md#Machine-States)). | Gauge   | status, address, serial, rack, role, machine_type (*) |
| api_request_count  | The number of API requests.                                             | Counter | code, path, verb                                      |
| assets_bytes_total | The total size of assets in bytes.                                      | Gauge   |                                                       |
| assets_items_total | The total number of assets.                                             | Gauge   |                                                       |
| images_bytes_total | The total size of images in bytes.                                      | Gauge   |                                                       |
| images_items_total | The total number of images.                                             | Gauge   |                                                       |

Note that sabakan also exposes the metrics provided by the Prometheus client library, which are located under the `go` and `process` namespaces.

(*) `machine_type` is taken from [the user-defined `labels`](machine.md#machinespec-struct) entry whose key is `machine-type`.
25 changes: 15 additions & 10 deletions docs/sabakan.md
@@ -31,18 +31,23 @@ Usage of sabakan:
Log format [plain,logfmt,json]
-loglevel string
Log level [critical,error,warning,info,debug]
-metrics string
<Listen IP>:<Port number> (default "0.0.0.0:10081")
-metrics-interval string
```

| Option | Default value | Description |
| ------------------- | ------------------------ | ------------------------------------------------ |
| `advertise-url` | "" | Public URL to access this server. Required. |
| `allow-ips` | `127.0.0.1,::1` | Comma-separated IPs allowed to change resources. |
| `config-file` | "" | If given, configurations are read from the file. |
| `data-dir` | `/var/lib/sabakan` | Directory to store files. |
| `dhcp-bind` | `0.0.0.0:10067` | IP address and port number of DHCP server. |
| `enable-playground` | false | Enable GraphQL playground service. |
| `http` | `0.0.0.0:10080` | IP address and port number of HTTP server. |
| `ipxe-efi-path` | `/usr/lib/ipxe/ipxe.efi` | Path to ipxe.efi . |
| Option | Default value | Description |
| ------------------- | ------------------------ | -------------------------------------------------- |
| `advertise-url` | "" | Public URL to access this server. Required. |
| `allow-ips` | `127.0.0.1,::1` | Comma-separated IPs allowed to change resources. |
| `config-file` | "" | If given, configurations are read from the file. |
| `data-dir` | `/var/lib/sabakan` | Directory to store files. |
| `dhcp-bind` | `0.0.0.0:10067` | IP address and port number of DHCP server. |
| `enable-playground` | false | Enable GraphQL playground service. |
| `http` | `0.0.0.0:10080` | IP address and port number of HTTP server. |
| `ipxe-efi-path` | `/usr/lib/ipxe/ipxe.efi` | Path to ipxe.efi . |
| `metrics` | `0.0.0.0:10081` | IP address and port number of metrics HTTP server. |
| `metrics-interval` | `30s` | Update interval of metrics. |

Config file
-----------
26 changes: 13 additions & 13 deletions docs/schema.md
@@ -17,20 +17,20 @@ Before version 1.2, sabakan did not have this key.
`<prefix>/machines/<serial>`
----------------------------

Name | Description
---- | -----------
serial | Serial number of a machine
| Name | Description |
| ------ | -------------------------- |
| serial | Serial number of a machine |

This type of key holds the information of a machine.
The value is formatted in JSON as defined in [Machine](machine.md).

`<prefix>/crypts/<serial>/<path>`
---------------------------------

Name | Description
---- | -----------
serial | Serial number of a machine
path | Name of an encrypted disk, in the format shown in `/dev/disk/by-path`
| Name | Description |
| ------ | --------------------------------------------------------------------- |
| serial | Serial number of a machine |
| path | Name of an encrypted disk, in the format shown in `/dev/disk/by-path` |

These keys hold the encryption key of a disk.
The value is a raw binary key.
@@ -90,9 +90,9 @@ The value is [DHCPConfig](dhcp.md#dhcpconfig) formatted in JSON.
`<prefix>/lease-usages/<ip>`
----------------------------

Name | Description
---- | -----------
ip | The first IP address of the lease range.
| Name | Description |
| ---- | ---------------------------------------- |
| ip | The first IP address of the lease range. |

These keys hold lease address usages for a range of IP addresses.
The value is a mapping between hardware address and (`index`, `expire`)
@@ -102,9 +102,9 @@ and `expire` is the Go's `time.Time` when the lease expires.
`<prefix>/node-indices/<rack>`
------------------------------

Name | Description
---- | -----------
rack | Rack number
| Name | Description |
| ---- | ----------- |
| rack | Rack number |

This type of key holds assignment of node indices per rack.
The value is a list of assigned indices formatted in JSON.
6 changes: 4 additions & 2 deletions go.mod
@@ -21,8 +21,9 @@ require (
github.com/mitchellh/mapstructure v1.1.2 // indirect
github.com/onsi/ginkgo v1.8.0
github.com/onsi/gomega v1.5.0
github.com/prometheus/client_golang v0.9.1 // indirect
github.com/prometheus/common v0.0.0-20181126121408-4724e9255275 // indirect
github.com/prometheus/client_golang v0.9.1
github.com/prometheus/client_model v0.0.0-20180712105110-5c3871d89910
github.com/prometheus/common v0.0.0-20181126121408-4724e9255275
github.com/prometheus/procfs v0.0.0-20181126161756-619930b0b471 // indirect
github.com/sirupsen/logrus v1.2.0 // indirect
github.com/spf13/cast v1.3.0 // indirect
@@ -32,6 +33,7 @@ require (
go.universe.tf/netboot v0.0.0-20181010164912-24067fad46fd
go4.org v0.0.0-20181109185143-00e24f1b2599 // indirect
golang.org/x/crypto v0.0.0-20190418165655-df01cb2cc480
golang.org/x/sync v0.0.0-20190423024810-112230192c58
golang.org/x/sys v0.0.0-20190419153524-e8e3143a4f4a // indirect
golang.org/x/time v0.0.0-20181108054448-85acf8d2951c // indirect
google.golang.org/genproto v0.0.0-20181127195345-31ac5d88444a // indirect
1 change: 1 addition & 0 deletions image.go
@@ -35,6 +35,7 @@ func IsValidImageOS(os string) bool {
type Image struct {
ID string `json:"id"`
Date time.Time `json:"date"`
Size int64 `json:"size"`
URLs []string `json:"urls"`
Exists bool `json:"exists"`
}
11 changes: 11 additions & 0 deletions machines.go
@@ -86,6 +86,17 @@ var (
StateRetiring: {StateRetired},
StateRetired: {StateUninitialized},
}

// StateList is the list of possible machine states
StateList = []MachineState{
StateUninitialized,
StateHealthy,
StateUnhealthy,
StateUnreachable,
StateUpdating,
StateRetiring,
StateRetired,
}
)

// IsValidRole returns true if role is valid as machine role
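One plausible use of `StateList` is to emit a `machine_status` sample for every known state, with 1 for the machine's current state and 0 for the others, so that each `status` label value is always present. Whether this PR does exactly that is not visible in this hunk; the sketch below is an assumption, and the string values of the state constants are likewise assumed.

```go
package main

import "fmt"

// MachineState and the constants below stand in for sabakan's own
// definitions; the string values are assumptions made for this sketch.
type MachineState string

const (
	StateUninitialized MachineState = "uninitialized"
	StateHealthy       MachineState = "healthy"
	StateUnhealthy     MachineState = "unhealthy"
	StateUnreachable   MachineState = "unreachable"
	StateUpdating      MachineState = "updating"
	StateRetiring      MachineState = "retiring"
	StateRetired       MachineState = "retired"
)

// StateList is the list of possible machine states, as in the hunk above.
var StateList = []MachineState{
	StateUninitialized,
	StateHealthy,
	StateUnhealthy,
	StateUnreachable,
	StateUpdating,
	StateRetiring,
	StateRetired,
}

// statusSamples returns one sample per known state: 1 for the machine's
// current state and 0 for every other state.
func statusSamples(current MachineState) map[MachineState]float64 {
	samples := make(map[MachineState]float64, len(StateList))
	for _, st := range StateList {
		if st == current {
			samples[st] = 1
		} else {
			samples[st] = 0
		}
	}
	return samples
}

func main() {
	samples := statusSamples(StateHealthy)
	for _, st := range StateList {
		fmt.Printf("machine_status{status=%q} %v\n", st, samples[st])
	}
}
```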
24 changes: 24 additions & 0 deletions metrics/counter.go
@@ -0,0 +1,24 @@
package metrics

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// APICounter represents API counter.
type APICounter struct {
	Counter *prometheus.CounterVec
}

// NewCounter returns a new APICounter.
func NewCounter() *APICounter {
	return &APICounter{
		Counter: APIRequestTotal,
	}
}

// Inc increments the APIRequestTotal counter.
func (c *APICounter) Inc(statusCode int, path, verb string) {
	c.Counter.WithLabelValues(fmt.Sprint(statusCode), path, verb).Inc()
}
59 changes: 59 additions & 0 deletions metrics/metrics.go
@@ -0,0 +1,59 @@
package metrics

import "github.com/prometheus/client_golang/prometheus"

// MachineStatus is the gauge vector for machine states.
var MachineStatus = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Namespace: namespace,
		Name:      "machine_status",
		Help:      "The machine status set by HTTP API.",
	},
	[]string{"status", "address", "serial", "rack", "role", "machine_type"},
)

// APIRequestTotal is the counter vector for the total count of API calls.
var APIRequestTotal = prometheus.NewCounterVec(
	prometheus.CounterOpts{
		Namespace: namespace,
		Name:      "api_request_count",
		Help:      "The total count of API calls.",
	},
	[]string{"code", "path", "verb"},
)

// AssetsBytesTotal is the gauge for the total bytes of assets.
var AssetsBytesTotal = prometheus.NewGauge(
	prometheus.GaugeOpts{
		Namespace: namespace,
		Name:      "assets_bytes_total",
		Help:      "The total bytes of assets.",
	},
)

// AssetsItemsTotal is the gauge for the total number of assets.
var AssetsItemsTotal = prometheus.NewGauge(
	prometheus.GaugeOpts{
		Namespace: namespace,
		Name:      "assets_items_total",
		Help:      "The total items of assets.",
	},
)

// ImagesBytesTotal is the gauge for the total bytes of images.
var ImagesBytesTotal = prometheus.NewGauge(
	prometheus.GaugeOpts{
		Namespace: namespace,
		Name:      "images_bytes_total",
		Help:      "The total bytes of Images.",
	},
)

// ImagesItemsTotal is the gauge for the total number of images.
var ImagesItemsTotal = prometheus.NewGauge(
	prometheus.GaugeOpts{
		Namespace: namespace,
		Name:      "images_items_total",
		Help:      "The total items of Images.",
	},
)
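The byte and item gauges above presumably take their values from the `Size` fields this PR adds to `Asset` and `Image`. The sketch below shows that aggregation with the standard library only; the `sized` type and `totals` helper are hypothetical, and the code that actually updates the gauges is not part of the visible diff.

```go
package main

import "fmt"

// sized is a hypothetical minimal view of an asset or image: only its size in bytes.
type sized struct {
	Size int64
}

// totals returns the two values that would be handed to a pair of
// *_bytes_total / *_items_total gauges, for example through Gauge.Set.
func totals(items []sized) (totalBytes float64, count float64) {
	for _, it := range items {
		totalBytes += float64(it.Size)
	}
	return totalBytes, float64(len(items))
}

func main() {
	// Sizes borrowed from the documentation examples in this PR.
	b, n := totals([]sized{{Size: 1002567}, {Size: 10000000}})
	fmt.Println(b, n)
}
```

Because these are gauges rather than counters, recomputing both values from scratch on every update interval is enough; deleted items then lower the totals naturally.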