Run the Collector as a DaemonSet

Running the OpenTelemetry Collector as a DaemonSet

This topic explains how to deploy the OpenTelemetry Collector as a Kubernetes DaemonSet to ingest Prometheus metrics. A DaemonSet is a type of workload that ensures every Kubernetes node runs one instance of a pod, in this case the Collector.

You can also deploy the Collector using a Kubernetes Deployment or a StatefulSet.
Read Plan an OpenTelemetry Collector deployment to determine which method to use.

To scale the Collector with a DaemonSet, configure each Collector instance to scrape application metrics only from its own node. Optionally, you can run a single additional Collector in Deployment mode to scrape static targets and infrastructure metrics.

Prerequisites

This topic covers the steps to deploy two OpenTelemetry Collectors in a Kubernetes cluster: one in DaemonSet mode and, optionally, a second one in Deployment mode. Complete the steps in this order:

  1. Install the OpenTelemetry Collector DaemonSet
  2. Configure the OpenTelemetry Collector deployment to scrape infrastructure metrics and static targets, if needed

Install the OpenTelemetry Collector DaemonSet

  1. From the Cloud Observability otel-collector-charts repository, copy the charts/collector-k8s folder to your working directory. The commands in the following steps expect it at ./charts/collector-k8s.

  2. Set the shell variable LS_TOKEN to your Cloud Observability access token.

    export LS_TOKEN="<ACCESS_TOKEN>"
    
  3. Install the OpenTelemetry Collector using the charts/collector-k8s/values-daemonset.yaml values file.
    
    kubectl create namespace opentelemetry
    kubectl create secret generic otel-collector-secret -n opentelemetry --from-literal=LS_TOKEN="$LS_TOKEN"
    helm upgrade lightstep ./charts/collector-k8s -f ./charts/collector-k8s/values-daemonset.yaml -n opentelemetry --install
    
  4. Verify that the DaemonSet Collector is up and running. You should see one pod in “ready” state for each node in your cluster.
    
    kubectl get daemonset -n opentelemetry
    

    This Collector scrapes every pod that carries the prometheus.io/scrape: true annotation. The annotation is set on each pod individually. To scrape a port other than the default, set the prometheus.io/port annotation on the pod.

  5. In Cloud Observability, use a Notebook to verify that the metric otelcol_process_uptime is reporting to your Cloud Observability project. Group this metric by k8s.pod.name to see all pods that were created. Expect one pod for each node in your Kubernetes cluster.
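
To script the readiness check from step 4, you can compare the desired and ready pod counts that Kubernetes reports for each DaemonSet. The following is a minimal sketch, not part of the chart: report_unready is a hypothetical helper name, and the commented kubectl command assumes the opentelemetry namespace used in the steps above.

```shell
# Print every DaemonSet whose ready pod count differs from its desired count.
# Reads lines of the form "<name> <desired> <ready>" on stdin.
# Exits with status 1 if any DaemonSet is not fully ready, 0 otherwise.
report_unready() {
  awk '$2 != $3 { printf "%s: %s of %s pods ready\n", $1, $3, $2; bad = 1 }
       END { exit (bad ? 1 : 0) }'
}

# Usage against a live cluster (requires kubectl access):
# kubectl get daemonset -n opentelemetry \
#   -o jsonpath='{range .items[*]}{.metadata.name} {.status.desiredNumberScheduled} {.status.numberReady}{"\n"}{end}' \
#   | report_unready && echo "all DaemonSet pods ready"
```

The helper only formats the counts it is given, so the same function works with output saved from an earlier kubectl call.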

Additionally, verify that the Collector is scraping your applications by querying the metric scrape_samples_scraped grouped by service.name. You should see the number of samples scraped from each application. At this point, you can start querying your application metrics.
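
If an application does not appear in scrape_samples_scraped, check that its pods carry the scrape annotations described in step 4. As a sketch, the helper below builds a JSON merge patch that adds both annotations to a workload's pod template. The function name, my-app, my-namespace, and port 8080 are placeholders for illustration, not values from the chart.

```shell
# Emit a JSON merge patch that adds the Prometheus scrape annotations
# to a workload's pod template.
# $1: the metrics port exposed by the application container.
scrape_annotation_patch() {
  printf '{"spec":{"template":{"metadata":{"annotations":{"prometheus.io/scrape":"true","prometheus.io/port":"%s"}}}}}\n' "$1"
}

# Usage (my-app and my-namespace are hypothetical names):
# kubectl patch deployment my-app -n my-namespace --type merge \
#   -p "$(scrape_annotation_patch 8080)"
```

Patching the pod template rather than a running pod means the annotations persist when pods are recreated. Annotation values must be strings, which is why true and the port are quoted.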

If you don’t see these metrics, you might not have set your token correctly. Check the logs of your Collector pod for access token not found errors:

    kubectl logs -n opentelemetry <collector pod name>

If you see these errors, make sure that the correct token is saved in your otel-collector-secret and that the token has permission to write metrics.
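
One way to confirm that the secret holds the token you expect is to decode the stored value and compare it with LS_TOKEN. This is a sketch under assumptions: token_matches is a hypothetical helper, the secret key is LS_TOKEN as created in step 3, and your base64 tool accepts -d for decoding (some older macOS versions use -D instead).

```shell
# Compare a base64-encoded secret value against the expected token.
# $1: base64 value as stored in the Kubernetes secret.
# $2: expected plain-text token.
# Returns 0 on match, 1 on mismatch, 2 if the value cannot be decoded.
token_matches() {
  decoded=$(printf '%s' "$1" | base64 -d 2>/dev/null) || return 2
  [ "$decoded" = "$2" ]
}

# Usage against a live cluster (requires kubectl access):
# stored=$(kubectl get secret otel-collector-secret -n opentelemetry \
#   -o jsonpath='{.data.LS_TOKEN}')
# token_matches "$stored" "$LS_TOKEN" && echo "secret matches LS_TOKEN" \
#   || echo "secret differs from LS_TOKEN"
```

A match only shows that the secret contains the token you intended. It does not confirm that the token has permission to write metrics.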

Next, you can configure the Deployment Collector to scrape your infrastructure metrics.

(Optional) Configure the Deployment Collector to scrape your infrastructure metrics

The DaemonSet Collector is configured to scrape application metrics from its own node. To scrape static targets and infrastructure metrics, run a second OpenTelemetry Collector as a single-replica Deployment.

  1. Add your additional scrape targets to scrape_configs.yaml. This file should contain static targets that the Kubernetes service discovery in the DaemonSet Collector does not discover.

  2. Enable the secondary Collector Deployment: in the values-daemonset.yaml file, set enabled to true in the element of the collectors array that is named deployment.
    Then upgrade the Collector’s chart to apply the changes.
    
    helm upgrade lightstep ./charts/collector-k8s -f ./charts/collector-k8s/values-daemonset.yaml -n opentelemetry --install
    
  3. Using Notebooks, verify that your applications are being scraped by the Collector with the metric scrape_samples_scraped grouped by service.name.
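
For step 1, a static target is usually expressed as a Prometheus scrape job with a static_configs entry. The sketch below prints one such job in standard Prometheus scrape configuration syntax. The exact layout that the chart expects inside scrape_configs.yaml is not shown in this topic, so treat the nesting as an assumption and compare it with the file shipped in the chart. The helper name, job name, and target address are hypothetical.

```shell
# Print a Prometheus-style scrape job for a single static target.
# $1: job name; $2: host:port of the target to scrape.
static_scrape_job() {
  cat <<EOF
- job_name: $1
  scrape_interval: 30s
  static_configs:
    - targets: ["$2"]
EOF
}

# Example: print a job for a hypothetical node exporter endpoint.
static_scrape_job my-static-service my-static-service.example.internal:9100
```

Review the printed fragment before merging it into scrape_configs.yaml by hand, and then run the helm upgrade command from step 2.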

Collector Troubleshooting

When troubleshooting Collector issues, first make sure that data from your network can reach Cloud Observability. Your firewall or cloud configuration may be blocking the connection.

By default, the Collector’s OTLP exporter enables gzip compression and TLS. Depending on your network configuration, you may need to enable or disable other gRPC features. The Collector gRPC client configuration reference contains the complete list of parameters.

If you cannot establish a connection to the Cloud Observability platform, use curl to verify HTTP/2 connectivity to our collectors. Run the following command, replacing <YOUR_ACCESS_TOKEN> with your project’s access token:

curl -D- -XPOST --http2-prior-knowledge -H "lightstep-access-token: <YOUR_ACCESS_TOKEN>" https://ingest.lightstep.com/access-test # US data center
# curl -D- -XPOST --http2-prior-knowledge -H "lightstep-access-token: <YOUR_ACCESS_TOKEN>" https://ingest.eu.lightstep.com/access-test # EU data center

You should see the following output, or something similar:

HTTP/2 200
content-length: 2
content-type: text/plain
date: Thu, 09 May 2024 15:39:14 GMT
server: envoy

OK

If you do not see this output, or the request hangs, something is blocking HTTP/2 traffic between your network and ours.

If you see HTTP/2 401, your request succeeded, but your token was not accepted. Some things to check:

  • Verify that your access token is valid.
  • Ensure that proxies pass the lightstep-access-token header through.
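
The outcomes described above can be folded into a small check that reads only the HTTP status code. This is a sketch: explain_status is a hypothetical helper, and it relies on curl's %{http_code} write-out variable, which reports 000 when no HTTP response was received.

```shell
# Map the HTTP status from the access-test endpoint to a likely cause.
# $1: status code as reported by curl ("000" means no HTTP response).
explain_status() {
  case "$1" in
    200) echo "ok: connection and token accepted" ;;
    401) echo "token rejected: check the token and proxy header forwarding" ;;
    000) echo "no response: HTTP/2 traffic may be blocked" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Usage (US data center; replace the token placeholder first):
# code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 -XPOST \
#   --http2-prior-knowledge -H "lightstep-access-token: <YOUR_ACCESS_TOKEN>" \
#   https://ingest.lightstep.com/access-test)
# explain_status "$code"
```

The --max-time option keeps a blocked connection from hanging indefinitely, so a firewall drop surfaces as status 000 after ten seconds.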

Alternatively, to exercise the full gRPC request/response cycle, you can try emitting a single span to your project using the otel-cli tool. Refer to this example image and commands for running the CLI tool in Kubernetes and Docker on GitHub. Only send test spans to a non-production project.

For additional troubleshooting recommendations, see Troubleshooting Missing Data in Cloud Observability.

See also

Ingest Prometheus metrics with an OpenTelemetry Collector on Kubernetes

Create and manage dashboards

Updated Jul 26, 2022