Structure
I’m setting this up on a Kubernetes cluster using the OpenTelemetry Operator, but since the Collector CRD is basically just a config file anyway, this can be replicated with regular deployments too. The graph below shows how traces are processed by the OTel Collector: the Span Metrics Connector and the Service Graph Connector sit in the trace pipeline and generate metrics from the spans passing through.
flowchart TD
  subgraph Monitoring Namespace
    prom(Prometheus)
    tempo(Tempo)
  end
  subgraph Application Namespace
    app(Applications) --> otelc
    subgraph otelc [OpenTelemetry Collector]
      tr(Trace Receiver)
      tr --> te(Trace Exporter)
      tr --> smc(Span Metrics Connector)
      tr --> sgc(Service Graph Connector)
      smc --> me(Metric Exporter)
      sgc --> me
    end
  end
  te --> tempo
  me <-->|ServiceMonitor| prom
Configuration
OTel Collector
The comments in the YAML below should outline everything relevant. I’ve tried my best to keep it as minimal as possible.
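As a rough sketch of what such a Collector resource can look like (not necessarily the exact manifest used here): the resource name `otel` is inferred from the `otel-collector-headless` service mentioned in this post, while the `apps` namespace, the Tempo endpoint and the contrib image are assumptions for illustration. The `namespace` and `histogram.unit` settings are chosen so that the metric names come out as `traces_spanmetrics_*` and `*_duration_seconds`.

```yaml
# Sketch only. Assumes an operator version that serves the v1beta1 API
# (structured config); with v1alpha1, `config` is a multi-line string instead.
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: otel            # yields the services otel-collector / otel-collector-headless
  namespace: apps       # hypothetical application namespace
spec:
  mode: deployment
  # The spanmetrics and servicegraph connectors ship in the contrib distribution.
  image: otel/opentelemetry-collector-contrib
  ports:
    # Expose the Prometheus exporter under the port name the ServiceMonitor scrapes.
    - name: http-metrics
      port: 8889
      protocol: TCP
  config:
    receivers:
      otlp:
        protocols:
          grpc: {}
          http: {}
    connectors:
      spanmetrics:
        namespace: traces.spanmetrics   # prefix: traces_spanmetrics_*
        histogram:
          unit: s                       # gives *_duration_seconds instead of milliseconds
      servicegraph: {}
    exporters:
      otlp/tempo:
        endpoint: tempo.monitoring.svc.cluster.local:4317  # hypothetical Tempo address
        tls:
          insecure: true
      prometheus:
        endpoint: 0.0.0.0:8889
    service:
      pipelines:
        traces:
          receivers: [otlp]
          # Connectors act as exporters of the traces pipeline ...
          exporters: [otlp/tempo, spanmetrics, servicegraph]
        metrics:
          # ... and as receivers of the metrics pipeline.
          receivers: [spanmetrics, servicegraph]
          exporters: [prometheus]
```

The key design point is that each connector appears twice: once as an exporter of the traces pipeline and once as a receiver of the metrics pipeline, which is what bridges the two.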
A quick kubectl port-forward svc/otel-collector-headless 8889 and a call to http://localhost:8889/metrics should already display traces_spanmetrics_calls_total, traces_spanmetrics_duration_seconds, traces_service_graph_request_total, as well as a few others. (This assumes the collector has already ingested a trace; otherwise /metrics doesn’t serve anything.)
Metrics
Until the naming of the spanmetrics is settled upstream, the metric names generated by the OTel Collector aren’t compatible with the Grafana views out of the box. For the time being, this can be corrected with metricRelabelings in our ServiceMonitor. The ServiceMonitor below scrapes the headless service every 30 seconds on the http-metrics port and renames metrics starting with traces_spanmetrics_duration_seconds_* to traces_spanmetrics_latency_*. Fortunately, the Service Graph Connector is based on Tempo’s service graph processor, so those metrics are compatible without any changes.
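A minimal sketch of a ServiceMonitor along those lines, assuming the Prometheus Operator CRDs are installed and that Prometheus is configured to pick up ServiceMonitors from this namespace. The resource name, the `apps` namespace and the label selector are assumptions; check the actual labels on the headless service with `kubectl get svc --show-labels` and adjust the selector accordingly.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: otel-collector          # hypothetical name
  namespace: apps               # hypothetical; same namespace as the collector service
spec:
  selector:
    matchLabels:
      # Assumed label set by the OpenTelemetry Operator on the headless service.
      operator.opentelemetry.io/collector-headless-service: Exists
  endpoints:
    - port: http-metrics        # port name on the service (8889 in this setup)
      interval: 30s
      metricRelabelings:
        # Rename traces_spanmetrics_duration_seconds_{bucket,sum,count}
        # to traces_spanmetrics_latency_{bucket,sum,count}.
        - sourceLabels: [__name__]
          regex: traces_spanmetrics_duration_seconds_(.*)
          targetLabel: __name__
          replacement: traces_spanmetrics_latency_${1}
          action: replace
```

Because metricRelabelings run after the scrape and before ingestion, the original metric names never reach Prometheus; only the renamed series are stored.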
Results
Send off some traces, and after a while you should start seeing metrics and a graph show up in Grafana under Explore > Tempo > Service Graph.
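If the Service Graph tab stays empty even though the metrics exist in Prometheus, one thing worth checking is whether the Tempo data source is linked to the Prometheus data source. Here is a sketch of a Grafana data source provisioning file that sets this link; the URLs and UIDs are hypothetical and need to match your own setup.

```yaml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    uid: prometheus                                   # hypothetical UID
    url: http://prometheus.monitoring.svc.cluster.local:9090   # hypothetical address
  - name: Tempo
    type: tempo
    uid: tempo                                        # hypothetical UID
    url: http://tempo.monitoring.svc.cluster.local:3200        # hypothetical address
    jsonData:
      serviceMap:
        # Tells Grafana which Prometheus data source holds the
        # traces_service_graph_* and span metrics.
        datasourceUid: prometheus
```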
I also created an SPM dashboard to get an overview of individual services and their operations (which are now called span.name). Get the dashboard here: https://grafana.com/grafana/dashboards/21202
Thank you for reading! I will most likely do a similar post using Grafana Alloy to continue my journey of shilling Grafana Labs. 🔭 Happy Tracing! :)