Install the OpenTelemetry Collector on Kubernetes

This topic walks you through installing an OpenTelemetry Collector with a sample configuration for both metrics and traces, which serves as a base for other guides.

While there are many ways to deploy a Collector on Kubernetes, we recommend using the OpenTelemetry Operator. The Operator installs custom resources into your cluster, allowing you to create OpenTelemetry Collectors and have the operator handle their coordination.
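
For context, the Operator's central custom resource is OpenTelemetryCollector: once the Operator is installed (see the steps below), applying a resource like the following creates a Collector. This is only an illustrative sketch, assuming the v1alpha1 API and the US ingest endpoint; the kube-otel-stack chart used later in this guide creates and manages these resources for you, so you do not need to apply it yourself.

kubectl apply -n sn-cloud-obs -f - <<'EOF'
# Example only: the chart in this guide creates production-ready collectors.
# Field names and the API version can differ between Operator releases.
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: example-collector
spec:
  mode: deployment              # single-replica Deployment ("standalone" mode)
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch: {}
    exporters:
      otlp:
        endpoint: ingest.lightstep.com:443
        headers:
          "lightstep-access-token": "<your-token>"
    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlp]
EOF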

You can create a pre-built dashboard to monitor the Operator from the Dashboard list view, or use the Cloud Observability Terraform Provider to create one.

If you’re interested in learning more about the trade-offs between different deployment modes, read more here.

To install the Collector, complete the following steps in order:

  1. Install the OpenTelemetry Operator and Cert Manager
  2. Configure the OpenTelemetry Collector

These instructions install the Collector in a Kubernetes cluster as a single-replica Kubernetes Deployment (also called “standalone” mode) using the Operator. For any questions, please contact your Customer Success representative.

Prerequisites

You must be able to run ValidatingWebhookConfigurations and MutatingWebhookConfigurations within your Kubernetes cluster; these are used to verify the Collector configuration.
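
One quick way to confirm that your cluster exposes these admission webhook resources:

# Expect to see both mutatingwebhookconfigurations and validatingwebhookconfigurations
kubectl api-resources --api-group=admissionregistration.k8s.io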

Install the OpenTelemetry Operator and Cert Manager

  1. Run the following commands to add the required Helm repositories and pull the latest charts:

    These Helm charts configure your environment (especially the Collector) to work best with Cloud Observability.

    
     helm repo add jetstack https://charts.jetstack.io
     helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
     helm repo add prometheus https://prometheus-community.github.io/helm-charts
     helm repo add lightstep https://lightstep.github.io/otel-collector-charts
     helm repo update
    
  2. Next, install the cert-manager chart on your cluster. cert-manager manages the certificates the Operator needs to subscribe to in-cluster Kubernetes events.

    
     helm install \
         cert-manager jetstack/cert-manager \
         --namespace cert-manager \
         --create-namespace \
         --version v1.8.0 \
         --set installCRDs=true
    
  3. Create a namespace for ServiceNow Cloud Observability:

    
     kubectl create ns sn-cloud-obs
    
  4. Install the OpenTelemetry Operator chart. The Operator automates the creation and management of collectors, autoscaling, code instrumentation, scraping metrics endpoints, and more.
    
     helm install \
         opentelemetry-operator open-telemetry/opentelemetry-operator \
         -n sn-cloud-obs \
         --set "manager.collectorImage.repository=otel/opentelemetry-collector-k8s" \
         --version 0.56.0
    
  5. Run the following command and verify that both charts show a status of deployed (a few optional follow-up checks appear after this list):
    
     helm list -A
    
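If you want a few extra checks beyond helm list, the following commands confirm that cert-manager and the Operator pods are running and that the Operator's custom resource definitions are installed (a sketch; exact pod names will vary):

# cert-manager and Operator pods should all be Running
kubectl get pods -n cert-manager
kubectl get pods -n sn-cloud-obs

# The Operator's CRDs should now exist
kubectl get crd opentelemetrycollectors.opentelemetry.io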

Configure the OpenTelemetry Collector

Kubernetes has built-in support for hundreds of useful metrics that help teams understand the health of their containers, pods, nodes, workloads, and internal system components. Cloud Observability provides a Helm chart to automatically configure collectors to send these metrics to Cloud Observability.
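
If you want to review what the chart will configure before installing it, you can print its default values (optional):

# Show the kube-otel-stack chart's default configuration values
helm show values lightstep/kube-otel-stack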

  1. Create a secret that holds your Cloud Observability Access Token.

    
     export LS_TOKEN='<your-token>'
     kubectl create secret generic otel-collector-secret -n sn-cloud-obs --from-literal="LS_TOKEN=$LS_TOKEN"
    
  2. Create another secret that holds your Cloud Observability API key.

    
     export LS_OPAMP_API_KEY='<your-api-key>'
     kubectl create secret generic otel-opamp-bridge-secret -n sn-cloud-obs --from-literal="LS_OPAMP_API_KEY=$LS_OPAMP_API_KEY"
    
  3. Install the kube-otel-stack chart. This chart automatically creates collectors that pull Kubernetes metrics and send them to your Cloud Observability project. We recommend you also specify the name of your cluster when installing the chart, which you can do by setting the clusterName variable:

    
     helm install kube-otel-stack lightstep/kube-otel-stack \
         -n sn-cloud-obs \
         --set metricsCollector.clusterName=your-cluster-name
     # For the EU data center, add these two options to the command above:
     # --set otlpDestinationOverride="ingest.eu.lightstep.com:443"
     # --set opAMPBridge.endpoint="wss://opamp.eu.lightstep.com/v1/opamp"
    
  4. Verify that the pods from the charts deployed with no errors:

    
     kubectl get pods -n sn-cloud-obs
    

    You should see pods for a node exporter, the operator, kube-state-metrics, and multiple collectors.
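
    You can also confirm that the secrets and the Operator-managed collectors were created, and tail a collector's logs to watch for export errors. A sketch, using the secret names from the steps above; replace <collector-pod> with a pod name from kubectl get pods:

     # Both secrets created earlier should exist
     kubectl get secret otel-collector-secret otel-opamp-bridge-secret -n sn-cloud-obs

     # Collectors created by the chart through the Operator
     kubectl get opentelemetrycollectors -n sn-cloud-obs

     # Tail a collector's logs and look for exporter errors
     kubectl logs -n sn-cloud-obs <collector-pod> --tail=50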

Collector troubleshooting

The first thing to do when troubleshooting Collector issues is to make sure data from your network can reach Cloud Observability. Your firewall or cloud configuration may be preventing a connection.

The default OTLP Exporter from a Collector enables gzip compression and TLS. Depending on your network configuration, you may need to enable or disable certain other gRPC features. This page contains a complete list of configuration parameters for the Collector gRPC client.
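
To see exactly which exporter settings your collectors are running with, you can dump the generated collector resources and inspect the otlp exporter block (the field layout varies by Operator API version):

# Print the collector specs; look under "exporters:" for compression, tls,
# and other gRPC-related settings
kubectl get opentelemetrycollectors -n sn-cloud-obs -o yaml | grep -n -A 15 "exporters:"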

If you are unable to establish a connection to the Cloud Observability platform, you can use curl to verify HTTP/2 connectivity to our collectors. Run the following command, replacing <YOUR_ACCESS_TOKEN> with your project’s access token:

curl -D- -XPOST --http2-prior-knowledge -H "lightstep-access-token: <YOUR_ACCESS_TOKEN>" https://ingest.lightstep.com/access-test # US data center
# curl -D- -XPOST --http2-prior-knowledge -H "lightstep-access-token: <YOUR_ACCESS_TOKEN>" https://ingest.eu.lightstep.com/access-test # EU data center

You should see the following output, or something similar:

HTTP/2 200
content-length: 2
content-type: text/plain
date: Thu, 09 May 2024 15:39:14 GMT
server: envoy

OK

If you do not see this output, or the request hangs, then something is blocking HTTP/2 traffic from transiting your network to ours.

If you see HTTP/2 401, your request reached Cloud Observability, but your token was not accepted. Some things to check:

  • Verify that your access token is valid.
  • Ensure any proxies pass through the lightstep-access-token header.
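
If your traffic goes through a proxy, you can repeat the access test through it to confirm the header survives the hop. A sketch; replace the proxy address with your own:

# Same access test as above, routed through an explicit proxy
curl -D- -XPOST --http2-prior-knowledge \
    --proxy http://your-proxy.example.com:3128 \
    -H "lightstep-access-token: <YOUR_ACCESS_TOKEN>" \
    https://ingest.lightstep.com/access-test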

Alternatively, to exercise the full gRPC request/response cycle, you can try emitting a single span to your project using the otel-cli tool. Refer to this example image and commands for running the CLI tool in Kubernetes and Docker on GitHub. Only send test spans to a non-production project.
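
For example, a sketch of sending one test span with otel-cli using the standard OTLP environment variables (flag names can differ between otel-cli versions; check otel-cli span --help):

# Point otel-cli at Cloud Observability ingest (US data center shown)
export OTEL_EXPORTER_OTLP_ENDPOINT="https://ingest.lightstep.com:443"
export OTEL_EXPORTER_OTLP_HEADERS="lightstep-access-token=<YOUR_ACCESS_TOKEN>"

# Emit a single test span to a non-production project
otel-cli span --service connectivity-test --name "test-span"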

For additional troubleshooting recommendations, see Troubleshooting Missing Data in Cloud Observability.

Next steps

Now that you’ve successfully installed an OpenTelemetry Operator and Collector to your cluster, there are many ways to tune your setup to your needs.

See also

Performance test and tune the Collector
