eBPF based instrumentation
edeNFed committed Nov 16, 2022
1 parent 900a882 commit e42d188
Showing 72 changed files with 18,583 additions and 3 deletions.
14 changes: 11 additions & 3 deletions CONTRIBUTING.md
@@ -13,6 +13,14 @@ Comments and questions about the project can be posted in our [slack channel](ht

## Development

### Compiling the project

Linux users can build this repository by running:
`make build`

Windows/Mac users will need to compile this project inside a docker container by running:
`make docker-build IMG=otel-go-agent:v0.1`

### Issues

Questions, bug reports, and feature requests can all be submitted as [issues](https://github.com/open-telemetry/opentelemetry-go-instrumentation/issues/new) to this repository.
@@ -27,9 +35,9 @@ If you are still working to finalize your PR, but would like to publish somethin
Next, your PR needs to be reviewed and approved by the [project approvers](https://github.com/orgs/open-telemetry/teams/go-instrumentation-approvers).
It will be ready to merge when:

* It has received two approvals from project approvers (at different companies).
* All feedback has been addressed.
* All open comments should be resolved.
- It has received two approvals from project approvers (at different companies).
- All feedback has been addressed.
- All open comments should be resolved.

A [project maintainer](https://github.com/orgs/open-telemetry/teams/go-instrumentaiton-maintainers) can merge the PR once these conditions are satisfied.
It is up to project maintainers to ensure enough time has been allowed for review of PRs.
11 changes: 11 additions & 0 deletions Dockerfile
@@ -0,0 +1,11 @@
FROM fedora:35 as builder
RUN dnf install clang llvm make libbpf-devel -y
RUN curl -LO https://go.dev/dl/go1.18.linux-amd64.tar.gz && tar -C /usr/local -xzf go*.linux-amd64.tar.gz
ENV PATH="/usr/local/go/bin:${PATH}"
WORKDIR /app
COPY . .
RUN make build

FROM gcr.io/distroless/base-debian11
COPY --from=builder /app/otel-go-instrumentation /
CMD ["/otel-go-instrumentation"]
21 changes: 21 additions & 0 deletions Makefile
@@ -0,0 +1,21 @@
# Obtain an absolute path to the directory of the Makefile.
# Assume the Makefile is in the root of the repository.
REPODIR := $(shell dirname $(realpath $(firstword $(MAKEFILE_LIST))))

# Build the list of include directories to compile the bpf program
BPF_INCLUDE += -I${REPODIR}/include/libbpf
BPF_INCLUDE+= -I${REPODIR}/include

.PHONY: generate
generate: export CFLAGS := $(BPF_INCLUDE)
generate:
	go mod tidy
	go generate ./...

.PHONY: build
build: generate
	CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o otel-go-instrumentation cli/main.go

.PHONY: docker-build
docker-build:
	docker build -t $(IMG) .
90 changes: 90 additions & 0 deletions cli/main.go
@@ -0,0 +1,90 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package main

import (
	"fmt"
	"os"
	"os/signal"
	"syscall"

	"github.com/open-telemetry/opentelemetry-go-instrumentation/pkg/errors"
	"github.com/open-telemetry/opentelemetry-go-instrumentation/pkg/instrumentors"
	"github.com/open-telemetry/opentelemetry-go-instrumentation/pkg/log"
	"github.com/open-telemetry/opentelemetry-go-instrumentation/pkg/opentelemetry"
	"github.com/open-telemetry/opentelemetry-go-instrumentation/pkg/process"
)

func main() {
	err := log.Init()
	if err != nil {
		fmt.Printf("could not init logger: %s\n", err)
		os.Exit(1)
	}

	log.Logger.V(0).Info("starting Go OpenTelemetry Agent ...")
	target := process.ParseTargetArgs()
	if err = target.Validate(); err != nil {
		log.Logger.Error(err, "invalid target args")
		return
	}

	processAnalyzer := process.NewAnalyzer()
	otelController, err := opentelemetry.NewController()
	if err != nil {
		log.Logger.Error(err, "unable to create OpenTelemetry controller")
		return
	}

	instManager, err := instrumentors.NewManager(otelController)
	if err != nil {
		log.Logger.Error(err, "error creating instrumentors manager")
		return
	}

	// Close the analyzer and the instrumentors when an interrupt or
	// SIGTERM is received.
	stopper := make(chan os.Signal, 1)
	signal.Notify(stopper, os.Interrupt, syscall.SIGTERM)
	go func() {
		<-stopper
		log.Logger.V(0).Info("Got SIGTERM, cleaning up..")
		processAnalyzer.Close()
		instManager.Close()
	}()

	pid, err := processAnalyzer.DiscoverProcessID(target)
	if err != nil {
		if err != errors.ErrInterrupted {
			log.Logger.Error(err, "error while discovering process id")
		}
		return
	}

	targetDetails, err := processAnalyzer.Analyze(pid, instManager.GetRelevantFuncs())
	if err != nil {
		log.Logger.Error(err, "error while analyzing target process")
		return
	}
	log.Logger.V(0).Info("target process analysis completed", "pid", targetDetails.PID,
		"go_version", targetDetails.GoVersion, "dependencies", targetDetails.Libraries,
		"total_functions_found", len(targetDetails.Functions))

	instManager.FilterUnusedInstrumentors(targetDetails)

	log.Logger.V(0).Info("invoking instrumentors")
	err = instManager.Run(targetDetails)
	if err != nil && err != errors.ErrInterrupted {
		log.Logger.Error(err, "error while running instrumentors")
	}
}
141 changes: 141 additions & 0 deletions docs/design/context-propagation.md
@@ -0,0 +1,141 @@
# Design Proposal: Context Propagation

## Motivation

Context propagation is a mechanism that allows tracing to be propagated across process boundaries. Usually, propagation is done by passing traceId and spanId of the current span to the next process. Those identifiers are passed as headers over the requests and responses.

The examples in this proposal describe context propagation over HTTP/gRPC formatted as [W3C Trace Context](https://www.w3.org/TR/trace-context/). However, the implementation should support other transport methods and header formats.

## Overview

The context propagation implementation should support the following:

1. **Reading headers**: If the current transaction is a part of an existing distributed trace, the request / response should contain headers according to the chosen format.
2. **Storing the current span context**: data about the current span should be stored in an eBPF map to be used by the next process. The suggested data structure is a map from goroutine id to an object similar to [trace.SpanContext](https://pkg.go.dev/go.opentelemetry.io/otel/trace#SpanContext). The current span map should always hold the current running span. Entries can be written by one of the following:

- Header propagator (the use case described in this document) - for remote spans created outside of the current process.
- Automatic instrumentation - for spans created by the automatic instrumentation agent.
- Manual instrumentation - for spans created manually by the user.

3. **Writing headers**: the implementation should get the current span from the eBPF map and propagate it to the next process by adding new headers to the request / response.
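
As a rough illustration of the data structure suggested in item 2, the sketch below models the current span map in plain Go. It is only a user-space model: the real map would live in eBPF, and the names `SpanContext`, `CurrentSpans`, `Set` and `Get` are assumptions made for this example, not part of the proposal.

```go
package main

import "fmt"

// SpanContext is a simplified, hypothetical stand-in for trace.SpanContext.
type SpanContext struct {
	TraceID [16]byte
	SpanID  [8]byte
}

// CurrentSpans models the proposed eBPF map: goroutine id -> current span.
type CurrentSpans map[uint64]SpanContext

// Set records sc as the span currently running on goroutine goid,
// replacing whatever entry was there before.
func (m CurrentSpans) Set(goid uint64, sc SpanContext) { m[goid] = sc }

// Get returns the current span for goroutine goid, if one is stored.
func (m CurrentSpans) Get(goid uint64) (SpanContext, bool) {
	sc, ok := m[goid]
	return sc, ok
}

func main() {
	spans := CurrentSpans{}

	// Entry written by the header propagator for a remote span.
	remote := SpanContext{TraceID: [16]byte{0x01}, SpanID: [8]byte{0x0a}}
	spans.Set(7, remote)

	// Entry overwritten by the automatic instrumentation with a local span
	// that belongs to the same trace.
	local := SpanContext{TraceID: remote.TraceID, SpanID: [8]byte{0x0b}}
	spans.Set(7, local)

	cur, _ := spans.Get(7)
	fmt.Printf("trace=%x span=%x\n", cur.TraceID, cur.SpanID)
}
```

All three writers listed in item 2 would go through the same `Set` path, so the map always holds the most recently started span for a goroutine.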

Note that the automatic instrumentation currently correlates spans to the same trace only if they are executed by the same goroutine. In the future we plan to implement more robust tracking of the goroutine tree to support traces that span multiple goroutines. As part of this planned change, the current implementation of context propagation will also have to change (a different key in the current span map).

## Example walkthrough

In order to better understand how all the different pieces fit together, we will walk through an example. The example is based on the following scenario:

```mermaid
graph LR
A[Application A] -->|HTTP| B(Target Application)
B --->|HTTP| A
B -->|gRPC| C(Application B)
C --->|gRPC|B
```

The target application is a simple HTTP server. For every request it receives, it sends a gRPC request to another application. The gRPC response is then sent back to the client. We assume that applications A and B are already instrumented.

#### Step 1: Read request headers

The current HTTP server instrumentation is attached to the following function:

```go
func (mux *ServeMux) ServeHTTP(w ResponseWriter, r *Request)
```

The instrumentation already reads fields from the `http.Request` object. Getting the headers should be a matter of tracking the `Header` field in [offsets-tracker](https://github.com/keyval-dev/offsets-tracker) and reading the values according to the W3C Trace Context specification.
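
To make the header format concrete, here is a minimal Go sketch of decoding a version `00` `traceparent` value as defined by W3C Trace Context. It is written in user-space Go for readability; the actual read would happen inside the eBPF program. The `SpanContext` type and the `parseTraceparent` function are hypothetical names introduced for this example.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// SpanContext is a simplified, hypothetical stand-in for trace.SpanContext.
type SpanContext struct {
	TraceID [16]byte
	SpanID  [8]byte
	Flags   byte
}

var errInvalidTraceparent = errors.New("invalid traceparent header")

// parseTraceparent decodes a version-00 traceparent value of the form
// "00-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>".
// Other versions are rejected by this sketch.
func parseTraceparent(h string) (SpanContext, error) {
	parts := strings.Split(h, "-")
	if len(parts) != 4 || parts[0] != "00" || h != strings.ToLower(h) {
		return SpanContext{}, errInvalidTraceparent
	}
	if len(parts[1]) != 32 || len(parts[2]) != 16 || len(parts[3]) != 2 {
		return SpanContext{}, errInvalidTraceparent
	}

	var sc SpanContext
	var flags [1]byte
	if _, err := hex.Decode(sc.TraceID[:], []byte(parts[1])); err != nil {
		return SpanContext{}, errInvalidTraceparent
	}
	if _, err := hex.Decode(sc.SpanID[:], []byte(parts[2])); err != nil {
		return SpanContext{}, errInvalidTraceparent
	}
	if _, err := hex.Decode(flags[:], []byte(parts[3])); err != nil {
		return SpanContext{}, errInvalidTraceparent
	}
	sc.Flags = flags[0]

	// All-zero trace and parent ids are invalid per the specification.
	if sc.TraceID == ([16]byte{}) || sc.SpanID == ([8]byte{}) {
		return SpanContext{}, errInvalidTraceparent
	}
	return sc, nil
}

func main() {
	sc, err := parseTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("trace=%x span=%x sampled=%t\n", sc.TraceID, sc.SpanID, sc.Flags&1 == 1)
}
```

The fixed field widths are what make this format a reasonable fit for eBPF, where reads are bounded and loops must be short.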

#### Step 2: Store the headers as current span in the SpanContext map

Update the SpanContext map with the current goroutine id as the key and a SpanContext object (traceId and spanId) as the value.

#### Step 3: Create an HTTP span

The current implementation of the HTTP server instrumentor already creates a span for the request. The span's traceId and parent spanId should be set according to the values in the eBPF map.

#### Step 4: Update SpanContext map

Update the map with the newly created HTTP span as the current span.
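
Steps 2 to 4 can be sketched together in plain Go. This is a user-space model under assumptions: the map stands in for the eBPF map, and `SpanContext`, `Span` and `startServerSpan` are hypothetical names that do not exist in the agent.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// SpanContext is a simplified, hypothetical stand-in for trace.SpanContext.
type SpanContext struct {
	TraceID [16]byte
	SpanID  [8]byte
}

// Span is a minimal model of the HTTP server span created in step 3.
type Span struct {
	Name         string
	Context      SpanContext
	ParentSpanID [8]byte
}

// startServerSpan creates a span for goroutine goid. If the map already holds
// a span context (step 2), the new span joins that trace as a child (step 3).
// The new span then becomes the current span for the goroutine (step 4).
func startServerSpan(current map[uint64]SpanContext, goid uint64, name string) (Span, error) {
	span := Span{Name: name}
	if _, err := rand.Read(span.Context.SpanID[:]); err != nil {
		return Span{}, err
	}
	if parent, ok := current[goid]; ok {
		span.Context.TraceID = parent.TraceID
		span.ParentSpanID = parent.SpanID
	} else if _, err := rand.Read(span.Context.TraceID[:]); err != nil {
		// No incoming context: a fresh trace id was needed but could not be generated.
		return Span{}, err
	}
	current[goid] = span.Context
	return span, nil
}

func main() {
	current := map[uint64]SpanContext{
		// Step 2: remote context stored from the request headers.
		42: {TraceID: [16]byte{0xaa}, SpanID: [8]byte{0xbb}},
	}
	span, err := startServerSpan(current, 42, "HTTP GET")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("trace=%x parent=%x span=%x\n", span.Context.TraceID, span.ParentSpanID, span.Context.SpanID)
}
```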

#### Step 5: Add headers to gRPC request

In general, the context propagation implementation should attach a uprobe to a function that writes the headers to the outgoing request.
For example, in gRPC we may choose to attach a uprobe to the following function:

```go
func (l *loopyWriter) writeHeader(streamID uint32, endStream bool, hf []hpack.HeaderField, onWrite func()) error
```

The context propagation implementation will use the kernel helper function `bpf_probe_write_user()` to modify the `hf` array according to the current span (taken from the eBPF map). There may be a better choice of target function; further investigation is needed.
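
The value that has to end up in the header list can be sketched in Go as follows. This only shows the formatting of a version `00` `traceparent` value and where it would go in the list; the in-kernel write itself is not modelled. `SpanContext`, `HeaderField` (a local stand-in for the name/value pair of `hpack.HeaderField`), `formatTraceparent` and `withTraceparent` are assumptions made for this example.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// SpanContext is a simplified, hypothetical stand-in for trace.SpanContext.
type SpanContext struct {
	TraceID [16]byte
	SpanID  [8]byte
	Flags   byte
}

// HeaderField is a local stand-in for the name/value pair of hpack.HeaderField.
type HeaderField struct {
	Name, Value string
}

// formatTraceparent renders sc as a version-00 W3C traceparent value.
func formatTraceparent(sc SpanContext) string {
	return fmt.Sprintf("00-%s-%s-%02x",
		hex.EncodeToString(sc.TraceID[:]),
		hex.EncodeToString(sc.SpanID[:]),
		sc.Flags)
}

// withTraceparent returns hf carrying a traceparent entry for sc,
// replacing an existing entry if one is present.
func withTraceparent(hf []HeaderField, sc SpanContext) []HeaderField {
	value := formatTraceparent(sc)
	for i := range hf {
		if hf[i].Name == "traceparent" {
			hf[i].Value = value
			return hf
		}
	}
	return append(hf, HeaderField{Name: "traceparent", Value: value})
}

func main() {
	sc := SpanContext{
		TraceID: [16]byte{0x4b, 0xf9, 0x2f, 0x35, 0x77, 0xb3, 0x4d, 0xa6, 0xa3, 0xce, 0x92, 0x9d, 0x0e, 0x0e, 0x47, 0x36},
		SpanID:  [8]byte{0x00, 0xf0, 0x67, 0xaa, 0x0b, 0xa9, 0x02, 0xb7},
		Flags:   0x01,
	}
	hf := withTraceparent([]HeaderField{{Name: ":method", Value: "POST"}}, sc)
	for _, f := range hf {
		fmt.Printf("%s: %s\n", f.Name, f.Value)
	}
}
```

A version `00` value is always 55 bytes long, which may simplify a fixed-size write from the eBPF side.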

#### Step 6: Read headers from gRPC response

The implementation will attach a uprobe to a function that reads the headers from the incoming response and look for headers with keys defined by the W3C Trace Context specification. In the gRPC example, we may choose to attach a uprobe to the following function:

```go
func decodeMetadataHeader(k, v string) (string, error)
```

This function is located in the `http_util.go` file. Again, there may be a better target function; further investigation is required.

#### Step 7: Update current span map

Similar to step 4, update the current span map with the headers from the previous step.

#### Step 8: Write headers to HTTP response

Similar to step 5, the implementation will attach a uprobe to the function that writes the headers to the response.
A possible candidate may be:

```go
func (h Header) writeSubset(w io.Writer, exclude map[string]bool, trace *httptrace.ClientTrace) error
```

## Proof of Concept

The [following application](https://github.com/edeNFed/ebpf-write-poc) is a test program that demonstrates changing Go function arguments via eBPF instrumentation:

```go
package main

import (
	"fmt"
	"time"
)

// worker stores str in a header map and prints the map.
func worker(str string) {
	headers := make(map[string]string)
	headers["X-Request-Id"] = str

	fmt.Printf("The Headers are: %s\n", headers)
}

func main() {
	for i := 0; i < 10; i++ {
		worker(fmt.Sprintf("request number: %d", i))
		time.Sleep(2 * time.Second)
	}
}
```

By running a [modified version](https://github.com/edeNFed/opentelemetry-go-instrumentation/tree/ebpf-write-poc) of the automatic instrumentation, we can see that the headers are set by the instrumentation successfully:

```
❯ go run main.go
The Headers are: map[X-Request-Id:request number: 0]
The Headers are: map[X-Request-Id:request number: 1]
The Headers are: map[X-Request-Id:request number: 2]
The Headers are: map[X-Request-Id:request number: 3]
The Headers are: map[X-Request-Id:request number: 4]
The Headers are: map[X-Request-Id:ebpf header value] <--- After instrumentation launched
The Headers are: map[X-Request-Id:ebpf header value]
The Headers are: map[X-Request-Id:ebpf header value]
```

## Safety Considerations

Modifying function arguments has to be done with care. The automatic instrumentation has all the data required in order to fully understand the stack state and safely modify the arguments:

- Structures and data field offsets (tracked over time by version)
- The Go version the target was compiled with
- CPU architecture

The process of implementing context propagation will include thorough tests to ensure that the implementation is safe.
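
One way to picture the first item is a lookup table keyed by structure, field and version, where a missing entry means the agent should not write at all. The sketch below is an assumption about how such a guard might look, not the agent's actual data model; `OffsetTable`, `Lookup` and `ErrUnknownOffset` are hypothetical names, and the offset value is a placeholder, not a real offset.

```go
package main

import (
	"errors"
	"fmt"
)

// fieldKey identifies a struct field in a specific version of its package.
type fieldKey struct {
	Struct, Field, Version string
}

// OffsetTable maps a struct field in a given version to its byte offset.
type OffsetTable map[fieldKey]uint64

// ErrUnknownOffset signals that no offset is tracked for the requested field.
var ErrUnknownOffset = errors.New("no tracked offset for this field and version")

// Lookup returns the tracked offset, or ErrUnknownOffset so that callers can
// skip any write instead of guessing.
func (t OffsetTable) Lookup(structName, field, version string) (uint64, error) {
	off, ok := t[fieldKey{structName, field, version}]
	if !ok {
		return 0, ErrUnknownOffset
	}
	return off, nil
}

func main() {
	table := OffsetTable{
		// Placeholder value for illustration only.
		{Struct: "net/http.Request", Field: "Header", Version: "1.18"}: 56,
	}
	for _, v := range []string{"1.18", "1.99"} {
		off, err := table.Lookup("net/http.Request", "Header", v)
		if err != nil {
			fmt.Printf("version %s: skipping write: %v\n", v, err)
			continue
		}
		fmt.Printf("version %s: offset %d\n", v, off)
	}
}
```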

## Future Work

- Support more propagation formats like B3 and Jaeger.
- Configure propagation via environment variables.
30 changes: 30 additions & 0 deletions docs/design/manual-instrumentation.md
@@ -0,0 +1,30 @@
# Design Proposal: Integration with Manual Instrumentation

## Motivation

Users may want to enrich the traces produced by the automatic instrumentation with additional spans created manually via the [OpenTelemetry Go SDK](https://github.com/open-telemetry/opentelemetry-go).

## Overview

Integration with manual instrumentation happens in two steps:

1. **Modify spans created manually** - attach a uprobe to the function that creates the span and override the trace id and parent span id with those of the currently active span (according to the eBPF map described in the context propagation document).
2. **Update active span map** - after the span is created, update the eBPF map with the new span as the current span. This step is needed in order to create traces that combine spans created manually, automatically (via other instrumentors) and remotely (via context propagation).

This implementation depends on changes described in the [Context Propagation design document](context-propagation.md) and can't be implemented before context propagation is implemented.

## Instrumenting OpenTelemetry Go SDK

The following function (located in the `tracer.go` file) may be a good candidate for instrumenting the creation of manual spans:

```go
func (tr *tracer) newRecordingSpan(psc, sc trace.SpanContext, name string, sr SamplingResult, config *trace.SpanConfig) *recordingSpan {
```

By overriding `psc.spanID` and `sc.traceID` to match the current span according to the eBPF map, the function will create a span that is a child of the current active span.
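
The effect of that override can be modelled in plain Go. This is only a sketch with local types: `SpanContext` is a simplified stand-in for `trace.SpanContext`, and `adoptCurrentSpan` is a hypothetical helper name; the real change would be applied to the function arguments from the eBPF side.

```go
package main

import "fmt"

// SpanContext is a simplified, hypothetical stand-in for trace.SpanContext.
type SpanContext struct {
	TraceID [16]byte
	SpanID  [8]byte
}

// adoptCurrentSpan rewrites the parent span context (psc) and the new span
// context (sc) so that the new span becomes a child of current, the span
// taken from the eBPF map.
func adoptCurrentSpan(psc, sc *SpanContext, current SpanContext) {
	psc.SpanID = current.SpanID // the parent is the currently active span
	sc.TraceID = current.TraceID // the new span joins the active trace
}

func main() {
	current := SpanContext{TraceID: [16]byte{0xaa}, SpanID: [8]byte{0xbb}}

	// Values the SDK would have chosen on its own for a manual span.
	psc := SpanContext{}
	sc := SpanContext{TraceID: [16]byte{0x11}, SpanID: [8]byte{0x22}}

	adoptCurrentSpan(&psc, &sc, current)
	fmt.Printf("trace=%x parent=%x span=%x\n", sc.TraceID, psc.SpanID, sc.SpanID)
}
```

The span's own `SpanID` is left untouched, so the manually created span keeps its identity and only its position in the trace changes.
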
## Future Work

### Use single exporter

Applications instrumented both manually and automatically will export the produced spans via two different exporters: one created manually by the user and another created in the instrumentation agent. This does not damage the combined traces, but it is not ideal. In the future, we may want to implement an exporter that communicates with the instrumentation agent (via a mechanism like a Unix domain socket) and exports the combined traces over a single connection.