Quickstart Kubernetes: OpenTelemetry-first infrastructure metrics

This tutorial demonstrates how to use the Kubernetes Operator for OpenTelemetry Collector to send infrastructure metrics, and optionally application traces, to Cloud Observability using a Helm chart preconfigured with Collector best practices. It walks you through using OpenTelemetry Collector receivers to send OTLP metrics. This method differs from the Prometheus quickstart, which is recommended when you want an exact duplication of your existing Prometheus installation. Cloud Observability recommends the Kubernetes Operator when deploying the OpenTelemetry Collector in Kubernetes environments.

A prerequisite of this quickstart is a running Kubernetes cluster, either a standard Kubernetes distribution or a managed distribution like Azure AKS, Google GKE, or AWS EKS. If you’d just like to test locally, we recommend using minikube.
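
If you go the minikube route, a start command sized to the resource requirements listed below might look like this (the flag values are suggestions; adjust them to your machine):

# Start a local cluster with the minimum resources this quickstart assumes.
minikube start --cpus 2 --memory 4096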

For more on the Kubernetes Operator for OpenTelemetry Collector, see the official OpenTelemetry docs.

Prerequisites

  • A Kubernetes cluster (either local using a tool like minikube or a cluster running in the cloud) with at least 2 CPUs and 4 GB of memory.
  • Helm v3 or later.
  • A Cloud Observability account.
  • A Cloud Observability access token for the Cloud Observability project you would like to use.

Verify your setup

  1. Run the following command to verify you are connected to a Kubernetes cluster.

     kubectl cluster-info
    

    If you see errors or cannot connect, follow the instructions from minikube or your cloud provider on authenticating with your cluster.

  2. Next, verify Helm is installed.

     helm version
    

    Verify you are on Helm v3.

We recommend using Helm to manage dependencies and upgrades. However, if you cannot deploy Helm charts, you can use the helm template command to automatically generate Kubernetes manifests from an existing chart.
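
For example, once the repositories in the next section have been added, you could render the otel-cloud-stack chart used later in this guide into plain Kubernetes manifests and apply them yourself (the output file name is arbitrary):

# Render the chart to Kubernetes manifests without installing a Helm release,
# then apply the rendered output with kubectl.
helm template otel-cloud-stack lightstep/otel-cloud-stack -n default > otel-cloud-stack.yaml
kubectl apply -f otel-cloud-stack.yaml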

Add Helm repositories and install charts

  1. Run the following commands to add the required Helm repositories and pull the latest charts:

     helm repo add jetstack https://charts.jetstack.io
     helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
     helm repo add prometheus https://prometheus-community.github.io/helm-charts
     helm repo add lightstep https://lightstep.github.io/otel-collector-charts
     helm repo update
    
  2. Next, install the cert-manager chart on your cluster. cert-manager manages the certificates the Operator needs to subscribe to in-cluster Kubernetes events.

     helm install \
         cert-manager jetstack/cert-manager \
         --namespace cert-manager \
         --create-namespace \
         --version v1.8.0 \
         --set installCRDs=true
    
  3. Install the OpenTelemetry Operator chart. The Operator automates the creation and management of collectors, autoscaling, code instrumentation, scraping of metrics endpoints, and more. We recommend version 0.35.1 of the Operator chart for use with the Cloud Observability Helm chart.

     helm install \
         opentelemetry-operator open-telemetry/opentelemetry-operator \
         -n default --version 0.35.1
    
  4. Run the following command to verify that both charts deployed successfully with a status of deployed:

     helm list -A
    

Send Kubernetes metrics to Cloud Observability

The OpenTelemetry Collector has several receivers and processors that let you collect and enrich Kubernetes data, all as OTLP. If you send OTLP data to a collector that uses these processors, you can enrich your application’s telemetry with infrastructure metadata. Cloud Observability provides a Helm chart that automatically configures collectors to send these metrics to Cloud Observability.
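
As a rough sketch of the idea (this is not the chart’s generated configuration, and the file is written out purely for illustration), a collector that enriches incoming OTLP data with Kubernetes metadata pairs an OTLP receiver with the k8sattributes processor:

# Illustrative pipeline only: OTLP in, Kubernetes metadata attached by the
# k8sattributes processor, OTLP out. The otel-cloud-stack chart installed
# below generates and manages its own collector configuration for you.
cat > k8s-enrichment-sketch.yaml <<'EOF'
receivers:
  otlp:
    protocols:
      grpc: {}
processors:
  k8sattributes: {}
  batch: {}
exporters:
  otlp:
    endpoint: ingest.lightstep.com:443
    headers:
      # Token from the secret created in step 1 below; environment variable
      # expansion syntax varies by collector version.
      "lightstep-access-token": "${LS_TOKEN}"
service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [k8sattributes, batch]
      exporters: [otlp]
EOF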

  1. Create a secret that holds your Cloud Observability access token.

     export LS_TOKEN='<your-token>'
     kubectl create secret generic otel-collector-secret -n default --from-literal="LS_TOKEN=$LS_TOKEN"
    
  2. Install the otel-cloud-stack chart. This chart automatically creates collectors that push Kubernetes metrics to your Cloud Observability project: a singleton collector for Kubernetes cluster metrics and a daemonset collector for node and kubelet metrics (as well as metrics from any endpoints with the prometheus.io/scrape: "true" annotation). A sketch for inspecting the generated collector resources follows after these steps.

     helm install otel-cloud-stack lightstep/otel-cloud-stack -n default
    
  3. Verify the pods from the charts have been deployed with no errors:

     kubectl get pods
    

    You should see pods for a stats-collector and a daemonset for node metrics.
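
If you’d like to look at the collector definitions themselves, you can list the OpenTelemetryCollector resources the Operator manages (the resource names depend on your release name; substitute a name from the first command’s output in the second):

# List the OpenTelemetryCollector custom resources created by the chart.
kubectl get opentelemetrycollectors -n default

# Inspect one of them, including its generated collector configuration.
kubectl get opentelemetrycollector <collector-name> -n default -o yaml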

See metrics in Cloud Observability

  1. In Cloud Observability, you can view your metrics in either a notebook or dashboard.

    When using notebooks, you can click any Kubernetes metric in the all telemetry dropdown. Check the scrape_series_added metric first, which lets you know how many Kubernetes metrics are being ingested.

    Several pre-built dashboards display Kubernetes metrics. For example, to see Pod metrics, in the Dashboard view, click Create a pre-built dashboard and choose “Generic Kube App”.

    (Screenshot: example Kubernetes Pod dashboard in Cloud Observability.)

Send traces to Cloud Observability

Send data from your applications to Cloud Observability

You can also use the Operator to deploy a collector configured to send trace data to Cloud Observability. The chart configures a collector for tracing using best practices.

  1. Run the following command to deploy a new Collector configured for trace data into the cluster. Alternatively, set tracesCollector.enabled to true if you are overriding the chart with a values file.

     helm upgrade otel-cloud-stack lightstep/otel-cloud-stack \
       -n default --set tracesCollector.enabled=true
    
  2. Next, verify that the Collector configured for tracing has been deployed:

     kubectl get services
    

    You should see a new service with the name otel-cloud-stack-traces-collector with ports 4317/TCP and 8888/TCP open.

  3. Configure your OpenTelemetry-instrumented applications running in the cluster to export traces to the OTLP/gRPC endpoint otel-cloud-stack-traces-collector:4317, as sketched after these steps. For more information on instrumenting applications, see the Quickstart: Instrumentation documentation, or follow the instructions below to deploy the demo application.
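
As a minimal sketch (the Deployment name my-app is hypothetical, and the service name assumes the otel-cloud-stack release used above), you could point a workload at the collector with the standard OTLP exporter environment variable:

# Point an existing workload at the in-cluster traces collector using the
# standard OpenTelemetry exporter environment variable.
kubectl set env deployment/my-app -n default \
  OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-cloud-stack-traces-collector:4317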

For languages like Java, .NET, Node, and Python, the Operator supports auto-instrumenting code running in the cluster. This lets you deploy SDKs automatically without any code changes. More details are available in the OpenTelemetry Community Docs.
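
As a hedged sketch of what that looks like (resource and annotation names follow the upstream Operator documentation; the workload and resource names here are illustrative, and the endpoint assumes the otel-cloud-stack release name), you create an Instrumentation resource pointing at the traces collector and then annotate the workloads you want instrumented:

# Illustrative only: an Instrumentation resource telling injected SDKs where
# to send traces.
kubectl apply -n default -f - <<'EOF'
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: demo-instrumentation
spec:
  exporter:
    endpoint: http://otel-cloud-stack-traces-collector:4317
EOF

# Opt a (hypothetical) Java workload in by annotating its pod template.
kubectl patch deployment my-app -n default --type merge -p \
  '{"spec":{"template":{"metadata":{"annotations":{"instrumentation.opentelemetry.io/inject-java":"true"}}}}}'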

Troubleshooting

The default OTLP exporter in a Collector enables gzip compression and TLS. Depending on your network configuration, you may need to enable or disable certain other gRPC features. This page contains a complete list of configuration parameters for the Collector gRPC client.

If you are unable to establish a gRPC connection to the Cloud Observability platform, you can use the grpcurl tool to ensure connectivity from your network to our public satellites. Run the following command, replacing <YOUR_ACCESS_TOKEN> with your project’s access token:

grpcurl -H 'lightstep-access-token:<YOUR_ACCESS_TOKEN>' ingest.lightstep.com:443 list

You should see the following output, or something similar:

grpc.reflection.v1alpha.ServerReflection
jaeger.api_v2.CollectorService
lightstep.collector.CollectorService
lightstep.egress.CollectorService
opentelemetry.proto.collector.trace.v1.TraceService

If you do not see this output, or the request hangs, then something is blocking gRPC traffic from transiting your network to ours. Please ensure that any proxies are passing through the lightstep-access-token header.

For additional troubleshooting recommendations, see Troubleshooting Missing Data in Cloud Observability.

See also

Use the OpenTelemetry Collector

Quickstart Kubernetes: Collector and Operator

Updated Jun 10, 2022