

The complete guide to OpenTelemetry in Golang. (17 Feb 2023)

This guide should take you from zero to production.

OpenTelemetry (OTel) is a collection of tools that can be used to instrument, collect, and export telemetry (metrics, logs, & traces). That telemetry data can help you analyze your software's behavior.
By the end of this blogpost, you should be able to instrument a Go application with traces, metrics & logs, and ship all of that telemetry through an OpenTelemetry collector.

I have previously written about how you can implement logging without losing money; this blogpost kind of completes the triad of observability.

Our application
The application consists of two services: serviceA and serviceB. Customers send requests to serviceA, which in turn calls serviceB.
When called, serviceB adds two numbers and then returns the result as part of the SVC-RESPONSE HTTP header. serviceA echoes that header back to the customer/client.

// file: main.go

package main

import (
    "context"
    "fmt"
    "net/http"
)

func main() {
    ctx := context.Background()
    go serviceA(ctx, 8081)
    serviceB(ctx, 8082)
}

func serviceA(ctx context.Context, port int) {
    mux := http.NewServeMux()
    mux.HandleFunc("/serviceA", serviceA_HttpHandler)
    serverPort := fmt.Sprintf(":%d", port)
    server := &http.Server{Addr: serverPort, Handler: mux}

    fmt.Println("serviceA listening on", server.Addr)
    if err := server.ListenAndServe(); err != nil {
        panic(err)
    }
}

func serviceA_HttpHandler(w http.ResponseWriter, r *http.Request) {
    cli := &http.Client{}
    req, err := http.NewRequestWithContext(r.Context(), http.MethodGet, "http://localhost:8082/serviceB", nil)
    if err != nil {
        panic(err)
    }
    resp, err := cli.Do(req)
    if err != nil {
        panic(err)
    }
    // Close the response body so that we do not leak connections.
    defer resp.Body.Close()

    w.Header().Add("SVC-RESPONSE", resp.Header.Get("SVC-RESPONSE"))
}

func serviceB(ctx context.Context, port int) {
    mux := http.NewServeMux()
    mux.HandleFunc("/serviceB", serviceB_HttpHandler)
    serverPort := fmt.Sprintf(":%d", port)
    server := &http.Server{Addr: serverPort, Handler: mux}

    fmt.Println("serviceB listening on", server.Addr)
    if err := server.ListenAndServe(); err != nil {
        panic(err)
    }
}

func serviceB_HttpHandler(w http.ResponseWriter, r *http.Request) {
    answer := add(r.Context(), 42, 1813)
    w.Header().Add("SVC-RESPONSE", fmt.Sprint(answer))
    fmt.Fprintf(w, "hello from serviceB: Answer is: %d", answer)
}

func add(ctx context.Context, x, y int64) int64 { return x + y }                
            
If we call serviceA, we get back;

curl -I http://127.0.0.1:8081/serviceA

Svc-Response: 1855
            
Perfect, everything is working.

Tracing
Let's add tracing support.

// file: tracing.go

package main

import (
    "context"
    "crypto/tls"
    "crypto/x509"
    "os"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
    "go.opentelemetry.io/otel/propagation"
    "go.opentelemetry.io/otel/sdk/resource"
    "go.opentelemetry.io/otel/sdk/trace"
    semconv "go.opentelemetry.io/otel/semconv/v1.12.0"
    "google.golang.org/grpc/credentials"
)

func setupTracing(ctx context.Context, serviceName string) (*trace.TracerProvider, error) {
    c, err := getTls()
    if err != nil {
        return nil, err
    }

    exporter, err := otlptracegrpc.New(
        ctx,
        otlptracegrpc.WithEndpoint("localhost:4317"),
        otlptracegrpc.WithTLSCredentials(
            // mutual tls.
            credentials.NewTLS(c),
        ),
    )
    if err != nil {
        return nil, err
    }

    // labels/tags/resources that are common to all traces.
    resource := resource.NewWithAttributes(
        semconv.SchemaURL,
        semconv.ServiceNameKey.String(serviceName),
        attribute.String("some-attribute", "some-value"),
    )

    provider := trace.NewTracerProvider(
        trace.WithBatcher(exporter),
        trace.WithResource(resource),
        // sample 60% of traces that have no sampled parent; otherwise follow the parent span's sampling decision.
        trace.WithSampler(trace.ParentBased(trace.TraceIDRatioBased(0.6))),
    )

    otel.SetTracerProvider(provider)

    otel.SetTextMapPropagator(
        propagation.NewCompositeTextMapPropagator(
            propagation.TraceContext{}, // W3C Trace Context format; https://www.w3.org/TR/trace-context/
        ),
    )

    return provider, nil
}

// getTls returns a configuration that enables the use of mutual TLS.
func getTls() (*tls.Config, error) {
    clientAuth, err := tls.LoadX509KeyPair("./confs/client.crt", "./confs/client.key")
    if err != nil {
        return nil, err
    }

    caCert, err := os.ReadFile("./confs/rootCA.crt")
    if err != nil {
        return nil, err
    }
    caCertPool := x509.NewCertPool()
    caCertPool.AppendCertsFromPEM(caCert)

    c := &tls.Config{
        RootCAs:      caCertPool,
        Certificates: []tls.Certificate{clientAuth},
    }

    return c, nil
}                
            
There's a lot going on here, so let's break it down.
We create an exporter using otlptracegrpc.New(). An exporter sends trace data out in the OTLP wire format. The one we are using here is backed by gRPC; you can also use exporters backed by other transport mechanisms, like HTTP.
That exporter will be sending trace data to a 'thing' listening on localhost:4317, and for security purposes it is going to authenticate to that 'thing' using mutual TLS. That 'thing' is called a collector, and we'll talk about it soon.
Next, we create a provider with trace.NewTracerProvider(). A provider is the factory for tracers, which in turn create the spans that describe what is happening for a given operation. For purposes of scalability, cost, and the like, we have configured the provider to batch trace data and also to sample at a rate of 60%. You can vary these parameters based on your own use cases and specific requirements.
Finally, we set a propagator with otel.SetTextMapPropagator(). Propagation is the mechanism by which a trace can be disseminated/communicated from one service to another across transport boundaries. In our example, we have two services, serviceA & serviceB; these can be running on two different computers or even in two different geographical regions. It is useful to be able to correlate traces across the two services since they 'talk' to each other. In our case, we are using propagation.TraceContext{}, which is based on the W3C Trace Context standard. There are other propagators that you can use if trace-context is not your preference.
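To make propagation a bit more concrete, here is a minimal sketch (not part of our app; the function names are just for illustration) of what the registered propagator does for us: it injects the W3C traceparent header into an outgoing request and extracts it on the receiving side. The otelhttp instrumentation we will add later in the Instrumentation section does exactly this under the hood, so you rarely have to write it by hand.

// file: propagation_sketch.go (illustrative only)

package main

import (
    "context"
    "net/http"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/propagation"
)

// injectTraceContext writes the current trace context into the headers of an
// outgoing request, as the W3C traceparent header.
func injectTraceContext(ctx context.Context, req *http.Request) {
    otel.GetTextMapPropagator().Inject(ctx, propagation.HeaderCarrier(req.Header))
}

// extractTraceContext reads the trace context from the headers of an incoming
// request and returns a context that continues the caller's trace.
func extractTraceContext(r *http.Request) context.Context {
    return otel.GetTextMapPropagator().Extract(r.Context(), propagation.HeaderCarrier(r.Header))
}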

We need to generate the TLS certificates that will be used for authentication. You can generate the certificates in various ways; here's the script that I used for this blogpost.

Finally, let's set up the main func to start tracing.

// file: main.go

const serviceName = "AdderSvc"

func main() {
    ctx := context.Background()
    {
        tp, err := setupTracing(ctx, serviceName)
        if err != nil {
            panic(err)
        }
        defer tp.Shutdown(ctx)
    }

    go serviceA(ctx, 8081)
    serviceB(ctx, 8082)
}       
            
If we run our app, it still works;

go run -race ./... &
curl -I http://127.0.0.1:8081/serviceA

Svc-Response: 1855
            
The app is ready to emit traces, but we have not yet set up the collector that will receive them at localhost:4317.
We will do that eventually, but first let's look at;

Metrics
The code to add metric support is very similar to the tracing code.

// file: metrics.go

package main

import (
    "context"
    "time"

    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc"
    "go.opentelemetry.io/otel/metric/global"
    sdkmetric "go.opentelemetry.io/otel/sdk/metric"
    "go.opentelemetry.io/otel/sdk/resource"
    semconv "go.opentelemetry.io/otel/semconv/v1.12.0"
    "google.golang.org/grpc/credentials"
)

func setupMetrics(ctx context.Context, serviceName string) (*sdkmetric.MeterProvider, error) {
    c, err := getTls()
    if err != nil {
        return nil, err
    }

    exporter, err := otlpmetricgrpc.New(
        ctx,
        otlpmetricgrpc.WithEndpoint("localhost:4317"),
        otlpmetricgrpc.WithTLSCredentials(
            // mutual tls.
            credentials.NewTLS(c),
        ),
    )
    if err != nil {
        return nil, err
    }

    // labels/tags/resources that are common to all metrics.
    resource := resource.NewWithAttributes(
        semconv.SchemaURL,
        semconv.ServiceNameKey.String(serviceName),
        attribute.String("some-attribute", "some-value"),
    )

    mp := sdkmetric.NewMeterProvider(
        sdkmetric.WithResource(resource),
        sdkmetric.WithReader(
            // collects and exports metric data every 30 seconds.
            sdkmetric.NewPeriodicReader(exporter, sdkmetric.WithInterval(30*time.Second)),
        ),
    )

    global.SetMeterProvider(mp)

    return mp, nil
}                
            
That code is almost identical, line for line, to the previous tracing code, so I'll not spend any time explaining it. Let's set up the main func to set up metrics.

// file: main.go

func main() {
    {
        // ... previous code here.
        mp, err := setupMetrics(ctx, serviceName)
        if err != nil {
            panic(err)
        }
        defer mp.Shutdown(ctx)
    }
    // ... previous code here.
}
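
Nothing in our app is recording metrics yet; we will do that in the Instrumentation section. As a quick preview, here is a minimal sketch (the recordAddCall helper is hypothetical) of how a counter can be recorded through the global meter provider that setupMetrics registered, using the same calls we will add later;

// file: metrics_usage_sketch.go (illustrative only)

package main

import (
    "context"

    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/metric"
    "go.opentelemetry.io/otel/metric/global"
    "go.opentelemetry.io/otel/metric/instrument"
)

// recordAddCall bumps a counter through the global meter provider.
func recordAddCall(ctx context.Context) {
    counter, _ := global.MeterProvider().
        Meter(
            "instrumentation/package/name",
            metric.WithInstrumentationVersion("0.0.1"),
        ).
        Int64Counter(
            "add_counter",
            instrument.WithDescription("how many times add function has been called."),
        )
    counter.Add(
        ctx,
        1,
        // labels/tags
        attribute.String("component", "addition"),
    )
}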
            


Logs
Logs are the so-called third pillar of telemetry. In our case we are going to use logrus, but this technique can be applied to many other logging libraries. Indeed, here is the same technique applied to zerolog and to golang.org/x/exp/slog.
What we are going to do is create a logrus hook that will; (a) add TraceIds & spanIds to logs and (b) add logs to the active span as span events.

// file: log.go

package main

import (
    "context"

    "github.com/sirupsen/logrus"
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/codes"
    "go.opentelemetry.io/otel/trace"
)

// usage:
//
//	ctx, span := tracer.Start(ctx, "myFuncName")
//	l := NewLogrus(ctx)
//	l.Info("hello world")
func NewLogrus(ctx context.Context) *logrus.Entry {
    l := logrus.New()
    l.SetLevel(logrus.TraceLevel)
    l.AddHook(logrusTraceHook{})
    return l.WithContext(ctx)
}

// logrusTraceHook is a hook that;
// (a) adds TraceIds & spanIds to logs of all LogLevels
// (b) adds logs to the active span as events.
type logrusTraceHook struct{}

func (t logrusTraceHook) Levels() []logrus.Level { return logrus.AllLevels }

func (t logrusTraceHook) Fire(entry *logrus.Entry) error {
    ctx := entry.Context
    if ctx == nil {
        return nil
    }
    span := trace.SpanFromContext(ctx)
    if !span.IsRecording() {
        return nil
    }

    { // (a) adds TraceIds & spanIds to logs.
        sCtx := span.SpanContext()
        if sCtx.HasTraceID() {
            entry.Data["traceId"] = sCtx.TraceID().String()
        }
        if sCtx.HasSpanID() {
            entry.Data["spanId"] = sCtx.SpanID().String()
        }
    }

    { // (b) adds logs to the active span as events.

        // code from: https://github.com/uptrace/opentelemetry-go-extra/tree/main/otellogrus
        // whose license(BSD 2-Clause) can be found at: https://github.com/uptrace/opentelemetry-go-extra/blob/v0.1.18/LICENSE
        attrs := make([]attribute.KeyValue, 0)
        logSeverityKey := attribute.Key("log.severity")
        logMessageKey := attribute.Key("log.message")
        attrs = append(attrs, logSeverityKey.String(entry.Level.String()))
        attrs = append(attrs, logMessageKey.String(entry.Message))

        span.AddEvent("log", trace.WithAttributes(attrs...))
        if entry.Level <= logrus.ErrorLevel {
            span.SetStatus(codes.Error, entry.Message)
        }
    }

    return nil
}                
            
If you are not familiar with hooks in logrus, they are pieces of code (interfaces) that are called for each log event. You can create a hook to do almost anything you want. In our case, this hook looks for any traceIds & spanIds from tracing and adds them to log events. Additionally, it takes any log events and adds them to the active span as span events. This enables us to correlate logs with their corresponding traces and vice versa. And as I said, this technique can be applied to other libraries like zerolog and golang.org/x/exp/slog; see here and here.
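To see how the hook gets used in practice, here is a small sketch (doWork is a hypothetical function, and it assumes the same imports as the rest of our code). Because a span has been started from the context, the log line picks up that span's traceId & spanId and is also attached to the span as an event;

// Illustrative usage; doWork is a hypothetical function.
func doWork(ctx context.Context) {
    ctx, span := otel.Tracer("myTracer").Start(ctx, "doWork")
    defer span.End()

    l := NewLogrus(ctx)
    // This log line gets traceId & spanId fields added to it, and is also
    // recorded on the span as a "log" event.
    l.Info("doing some work")
}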

Collector
Both the tracing & metrics code we have written so far send telemetry data to a 'thing' that we have defined to listen on localhost:4317.
That thing is an OpenTelemetry collector. A collector is a vendor-agnostic way to receive, process and export telemetry data. It removes the need to run, operate, and maintain multiple agents/collectors. This is one of the great benefits of OpenTelemetry: you do not have to install multiple client libraries or agents (jaeger, prometheus, datadog, honeycomb, grafana, whoever, whatever).
You just use the OpenTelemetry library for your chosen language (in our case Go) and one OpenTelemetry collector/agent. That collector can then 'fan-out' to jaeger/prometheus/datadog/honeycomb/grafana/stdOut/whoever/whatever.
The Collector is a process/agent/server that you run on your own computer that does 3 main things:
  1. Receives telemetry(logs,traces,metrics) data from the client libraries.
  2. Optionally, pre-processes that data.
  3. Exports that data to final destinations.
image showing components of a collector

Let's start by creating a collector configuration file, as described in the documentation;

# file: confs/otel-collector-config.yaml

# (1) Receivers
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317 # It is important that we do not use localhost
        tls:
          cert_file: /etc/tls/server.crt
          key_file: /etc/tls/server.key
          ca_file: /etc/tls/rootCA.crt
          client_ca_file: /etc/tls/rootCA.crt

# (2) Processors
processors:
  memory_limiter:
    limit_percentage: 50
    check_interval: 1s
    spike_limit_percentage: 30
  batch:
    send_batch_size: 8192

# (3) Exporters
exporters:
  logging: # aka, stdOut/stdErr
  jaeger:
    endpoint: jaeger:14250
    tls:
      insecure: true
  prometheus:
    endpoint: otel_collector:9464


# (4) Service
service:
  # A pipeline consists of a set of receivers, processors and exporters.
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [logging, jaeger]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [logging, prometheus]
    logs:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [logging]
            
We have configured the OpenTelemetry collector's receiver to listen on port 4317. The collector authenticates clients using mutual TLS, with the same certificates that we configured in the tracing & metrics Go code. The script used to generate the TLS certificates can be found here. All our telemetry data will be sent to this receiver, which our app reaches at localhost:4317.
Next, we have configured two processors: memory_limiter & batch. The memory limiter is used to prevent out-of-memory situations on the collector, whereas the batch processor places telemetry data into batches, thus reducing transmission load.
After that, we configure exporters. Here we have configured a logging (aka stdOut) exporter. Using this, the collector will send the collected telemetry data to stdOut/stdErr. This can be useful when you are developing locally or for debugging purposes. Remember to turn it off in production/live settings.
We have also configured a jaeger exporter; the collector will send tracing data to jaeger listening on jaeger:14250 (we are not using localhost here since jaeger will be running in a separate docker container; see the next section).
We also have a prometheus exporter; the collector will set up an endpoint on port 9464 that exposes metrics in prometheus format. That endpoint can then be scraped by prometheus.
Finally, the service section is where we orchestrate everything together. For example, we are saying that traces will be received by the receiver called otlp, processed by both the memory_limiter & batch processors, and finally exported to both stdOut & jaeger.

Collector, prometheus & jaeger
We have talked about the collector, prometheus, jaeger & stdOut/logging. We need to run these services so that our Go code can talk to them. In most cases you may not have to run these services yourself; in production/live circumstances, someone else may well be running them on your behalf. There are a lot of companies out there that offer paid versions of these, some even with very generous free tiers. It might make a lot of sense to use those services if you can.
Nevertheless, it would not hurt to know how to set them up yourself. We'll use docker/docker-compose to run these;

# file: docker-compose.yaml

version: '3.3'
services:

  # OpenTelemetry Collector
  otel_collector:
    image: otel/opentelemetry-collector-contrib:0.70.0
    command: --config=/etc/otel-collector-config.yaml
    volumes:
      - ./confs/otel-collector-config.yaml:/etc/otel-collector-config.yaml
      - ./confs/server.crt:/etc/tls/server.crt
      - ./confs/server.key:/etc/tls/server.key
      - ./confs/rootCA.crt:/etc/tls/rootCA.crt
    ports:
      - "4317:4317" # OTLP over gRPC receiver
      - "9464:9464" # Prometheus exporter
    depends_on:
      - jaeger
      - prometheus
    networks:
      - my_net

  # Jaeger
  jaeger:
    image: jaegertracing/all-in-one:1.41.0
    ports:
      - "14250:14250" # Collector gRPC
      - "16686:16686" # Web HTTP
    networks:
      - my_net

  # Prometheus
  prometheus:
    image: prom/prometheus:v2.42.0
    command:
      - --config.file=/etc/prometheus/prometheus-config.yaml
    volumes:
      - ./confs/prometheus-config.yaml:/etc/prometheus/prometheus-config.yaml
    ports:
      - "9090:9090"
    networks:
      - my_net

networks:
  my_net:
            
We start the OpenTelemetry collector and have it listening on port 4317 inside the container, and we expose the same port on the host machine. This is the same port we set as the receiver in the confs/otel-collector-config.yaml file. We also mount the certificates needed for mutual TLS from the host into the container. Port 9464 is where the collector exposes prometheus-format metrics; we will have prometheus scrape that port.
We start jaeger, which is a distributed tracing system, listening on port 14250 inside the container, and expose it on the host at the same port. Remember, in the confs/otel-collector-config.yaml file we set up the OpenTelemetry collector to export tracing telemetry (traces) to jaeger on port 14250. We also expose jaeger's port 16686; this is the port that we can open in the browser to query for traces, and we will be using it later on in the blogpost.
We also start prometheus, which is a metrics system. Prometheus primarily works by scraping metrics from instrumented services; that is to say, your service does not send metrics to prometheus, instead prometheus pulls metrics from your service. The port we expose here, 9090, is the one we can open in the browser to query for metrics. Since prometheus scrapes metrics, we need to tell it where to scrape from. We do that using a configuration file, telling it to scrape otel_collector:9464;

# file: confs/prometheus-config.yaml

global:
  evaluation_interval: 30s
  scrape_interval: 5s
scrape_configs:
  - job_name: 'collector'
    static_configs:
      - targets: ['otel_collector:9464']
            
We can now start all the docker services and also the application.

docker-compose up --build --detach
go run ./...
            
This is all well and good, but our application code is not yet instrumented to emit any telemetry (logs, traces & metrics). That is what we are going to do next.

Instrumentation
There are a couple of changes we need to make;
  1. Use an HTTP middleware that will add traces to our HTTP handlers.
  2. Use an http.RoundTripper that will add traces to outgoing HTTP requests.
  3. Create trace spans in relevant places in our code.
  4. Emit metrics in relevant places in our code.
  5. Emit logs in relevant places in our code.
The diff to implement (1) and (2) is (don't worry if you do not like diffs, the full code will be linked to at the end);

diff --git a/main.go b/main.go
index c0fb2d5..c6a9ee2 100644
--- a/main.go
+++ b/main.go
@@ -4,6 +4,14 @@ import (
        "context"
        "fmt"
        "net/http"
+
+	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
+	"go.opentelemetry.io/otel"
+	"go.opentelemetry.io/otel/attribute"
+	"go.opentelemetry.io/otel/metric"
+	"go.opentelemetry.io/otel/metric/global"
+	"go.opentelemetry.io/otel/metric/instrument"
+	"go.opentelemetry.io/otel/trace"
    )
    
    const serviceName = "AdderSvc"
@@ -32,8 +40,9 @@ func main() {
    func serviceA(ctx context.Context, port int) {
        mux := http.NewServeMux()
        mux.HandleFunc("/serviceA", serviceA_HttpHandler)
+	handler := otelhttp.NewHandler(mux, "server.http")
        serverPort := fmt.Sprintf(":%d", port)
-	server := &http.Server{Addr: serverPort, Handler: mux}
+	server := &http.Server{Addr: serverPort, Handler: handler}
    
        fmt.Println("serviceA listening on", server.Addr)
        if err := server.ListenAndServe(); err != nil {
@@ -42,7 +51,9 @@ func serviceA(ctx context.Context, port int) {
    }
    
    func serviceA_HttpHandler(w http.ResponseWriter, r *http.Request) {
-	cli := &http.Client{}
+	cli := &http.Client{
+		Transport: otelhttp.NewTransport(http.DefaultTransport),
+	}
        req, err := http.NewRequestWithContext(r.Context(), http.MethodGet, "http://localhost:8082/serviceB", nil)
        if err != nil {
            panic(err)
@@ -58,8 +69,9 @@ func serviceA_HttpHandler(w http.ResponseWriter, r *http.Request) {
    func serviceB(ctx context.Context, port int) {
        mux := http.NewServeMux()
        mux.HandleFunc("/serviceB", serviceB_HttpHandler)
+	handler := otelhttp.NewHandler(mux, "server.http")
        serverPort := fmt.Sprintf(":%d", port)
-	server := &http.Server{Addr: serverPort, Handler: mux}
+	server := &http.Server{Addr: serverPort, Handler: handler}
    
        fmt.Println("serviceB listening on", server.Addr)
        if err := server.ListenAndServe(); err != nil {
@@ -73,4 +85,38 @@ func serviceB_HttpHandler(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintf(w, "hello from serviceB: Answer is: %d", answer)
    }                
            
The diff to implement (3), (4) and (5) is;

diff --git a/main.go b/main.go
index 349bc6e..868c380 100644
--- a/main.go
+++ b/main.go
@@ -51,10 +51,13 @@ func serviceA(ctx context.Context, port int) {
    }
    
    func serviceA_HttpHandler(w http.ResponseWriter, r *http.Request) {
+	ctx, span := otel.Tracer("myTracer").Start(r.Context(), "serviceA_HttpHandler")
+	defer span.End()
+
        cli := &http.Client{
            Transport: otelhttp.NewTransport(http.DefaultTransport),
        }
-	req, err := http.NewRequestWithContext(r.Context(), http.MethodGet, "http://localhost:8082/serviceB", nil)
+	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://localhost:8082/serviceB", nil)
        if err != nil {
            panic(err)
        }
@@ -80,9 +83,44 @@ func serviceB(ctx context.Context, port int) {
    }
    
    func serviceB_HttpHandler(w http.ResponseWriter, r *http.Request) {
-	answer := add(r.Context(), 42, 1813)
+	ctx, span := otel.Tracer("myTracer").Start(r.Context(), "serviceB_HttpHandler")
+	defer span.End()
+
+	answer := add(ctx, 42, 1813)
        w.Header().Add("SVC-RESPONSE", fmt.Sprint(answer))
        fmt.Fprintf(w, "hello from serviceB: Answer is: %d", answer)
    }
    
-func add(ctx context.Context, x, y int64) int64 { return x + y }
+func add(ctx context.Context, x, y int64) int64 {
+	ctx, span := otel.Tracer("myTracer").Start(
+		ctx,
+		"add",
+		// add labels/tags/resources(if any) that are specific to this scope.
+		trace.WithAttributes(attribute.String("component", "addition")),
+		trace.WithAttributes(attribute.String("someKey", "someValue")),
+	)
+	defer span.End()
+
+	counter, _ := global.MeterProvider().
+		Meter(
+			"instrumentation/package/name",
+			metric.WithInstrumentationVersion("0.0.1"),
+		).
+		Int64Counter(
+			"add_counter",
+			instrument.WithDescription("how many times add function has been called."),
+		)
+	counter.Add(
+		ctx,
+		1,
+		// labels/tags
+		attribute.String("component", "addition"),
+	)
+
+	log := NewLogrus(ctx)
+	log.Info("add_called")
+
+	return x + y
+}                
            
With that, we are in business. Let's run all the things and send some requests;

docker-compose up --build --detach
go run ./... &
curl -vkL http://127.0.0.1:8081/serviceA
            
If you navigate to the jaeger port that is used to query traces, http://localhost:16686/search, you should be able to see traces;

image showing sample trace telemetry
As you can see, the traces are recorded and, more importantly, the logs emitted by the logger are also added as trace span events.
If you look at the logs emitted by the logger in stdOut;

INFO[0004] add_called traceId=68d7a4ea8cdaadb7309eebf0fd41037a spanId=c3cf045f0f8ee0f2 
            
We can also see that spanId & traceId from traces have been added to the logs.
It is now very easy to correlate logs and their corresponding traces.

Additionally, if you navigate to the prometheus port that is used to query metrics, http://localhost:9090/graph, you should be able to see metrics;

image showing sample metric telemetry
This is like oncall/debugging nirvana.

There's only one thing left. As things stand, it is very easy to correlate logs and traces because they share traceIds and spanIds. However, it is still not that easy to correlate metrics with either logs or traces. Let's fix that, by making sure that we use the same set of attributes when starting a span, when recording metrics, and when emitting logs. Something along the lines of;

ctx, span := otel.Tracer("myTracer").Start(ctx, "add",
    trace.WithAttributes(attribute.String("component", "addition")),
    trace.WithAttributes(attribute.Int("age", 89)),
)

counter.Add(ctx, 1,
    attribute.String("component", "addition"),
    attribute.Int("age", 89),
)

log := NewLogrus(ctx).WithFields(logrus.Fields{
    "component": "addition",
    "age":       89,
})
            

Conclusion
We have been able to instrument the application so that it emits all the necessary telemetry data.
Do note that the OpenTelemetry ecosystem keeps changing a lot, so if you would like to reproduce the results in this post, you ought to use the same library versions as I did. Those exact versions can be found here.

All the code in this blogpost, including the full source code, can be found at: this link

You can comment on this article by clicking here.