Recommended Collector configuration

The following snippet shows the components and configuration that Cloud Observability recommends for the OpenTelemetry Collector from the Contrib repo. But what do these components actually do? Read on and we’ll break it down piece by piece.

receivers:
    otlp:
        protocols:
            grpc:
            http:

processors:
    batch:
    memory_limiter:
        limit_percentage: 80
        spike_limit_percentage: 25
        check_interval: 1s


exporters:
    otlp/ls:
        endpoint: ingest.lightstep.com:443 # US data center
        #endpoint: ingest.eu.lightstep.com:443 # EU data center
        headers:
            lightstep-access-token: "${LIGHTSTEP_ACCESS_TOKEN}"

extensions:
    health_check:
    memory_ballast:
        size_in_percentage: 40

service:
    extensions: [health_check, memory_ballast]
    telemetry:
        metrics:
            address: :8888
        logs:
            level: debug
    pipelines:
        traces:
            receivers: [otlp]
            processors: [memory_limiter, batch]
            exporters: [otlp/ls]
        metrics:
            receivers: [otlp]
            processors: [memory_limiter, batch]
            exporters: [otlp/ls]
        logs:
            exporters: [otlp/ls]
            processors: [memory_limiter, batch]
            receivers: [otlp] # update with your receiver name

You can also visualize and validate the recommended pipeline configuration in your browser using the OTelBin tool.

Receivers

A receiver is how data gets into the OpenTelemetry Collector. Generally, a receiver accepts data in a specified format, translates it into the internal format and passes it to processors and exporters defined in the applicable pipelines.

The recommended Cloud Observability configuration enables the OpenTelemetry Protocol (OTLP) receiver for both gRPC and HTTP. OTLP is the default protocol supported by all language implementations. If everything that sends data to the Collector uses a single protocol, you may only need to enable that one.

receivers:
    otlp:
        protocols:
            grpc:
            http:
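
As a minimal sketch of the single-protocol case, the receiver below enables only gRPC. The explicit `endpoint` is an assumption for illustration: 4317 is the conventional OTLP/gRPC port, and the bind address should match your environment.

```yaml
receivers:
    otlp:
        protocols:
            # Only gRPC is enabled; HTTP senders would be rejected.
            grpc:
                # Assumed bind address and conventional OTLP/gRPC port.
                endpoint: 0.0.0.0:4317
```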

Additional receivers may be helpful to configure depending on the environment.
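
For example, a host metrics receiver could be added alongside OTLP. This is only a sketch: it assumes the Contrib distribution (which ships `hostmetrics`), and the collection interval and the choice of scrapers are illustrative rather than recommended values.

```yaml
receivers:
    otlp:
        protocols:
            grpc:
            http:
    # Assumption: running the Contrib distribution, which includes hostmetrics.
    hostmetrics:
        collection_interval: 30s   # illustrative value
        scrapers:
            cpu:
            memory:
```

A receiver added this way only takes effect once it is also listed in the `receivers` of a pipeline, here the metrics pipeline.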

Processors

Processors are used at various stages of a pipeline. Generally, a processor pre-processes data before it is exported (e.g. modify attributes or sample) or helps ensure that data makes it through a pipeline successfully (e.g. batch/retry). Cloud Observability uses the memory limiter and batch processors by default.

processors:
    batch:
    memory_limiter:
        limit_percentage: 80
        spike_limit_percentage: 25
        check_interval: 1s

Memory limiter

The memory limiter processor lets you configure limits on the memory the Collector is expected to consume. Once the limit is reached, the processor starts dropping incoming data and returns an error to the receivers sending it data. This allows each receiver to signal the originator of the data to back off.

limit_percentage sets the maximum percentage of total available memory that the process heap may allocate. spike_limit_percentage is the maximum spike expected between measurements of memory usage, and check_interval is how often those measurements are taken.

Other configuration options are documented in the Collector repo.
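
As a rough sketch of how these settings interact, assume a hypothetical host with 2000 MiB of memory. With the recommended percentages, the limit works out to 1600 MiB and the expected spike to 500 MiB, on the understanding that both percentages are taken of total memory. The memory limiter also accepts absolute values through `limit_mib` and `spike_limit_mib`, so a roughly equivalent configuration for that host might look like this:

```yaml
processors:
    memory_limiter:
        check_interval: 1s
        # Assumption: hypothetical host with 2000 MiB of total memory.
        limit_mib: 1600        # 80% of 2000 MiB
        spike_limit_mib: 500   # 25% of 2000 MiB
```

According to the processor's documentation, data starts being refused at a soft limit equal to the limit minus the spike limit, which would be 1100 MiB in this sketch.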

Batch

The batch processor receives telemetry data and batches it before sending it to the exporter. Batching improves compression of the data and reduces the number of outgoing calls the Collector makes. The processor can be configured to send a batch after a set duration or once the batch reaches a set size. See the repo documentation for additional information on configuring the batch processor.
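
A minimal sketch of tuning both triggers is shown below. The option names come from the batch processor, but the values are illustrative assumptions, not Cloud Observability recommendations; the recommended configuration leaves the processor at its defaults.

```yaml
processors:
    batch:
        # Send whatever has accumulated after this long, even if the batch is small.
        timeout: 5s
        # Send as soon as this many spans, data points, or log records are queued.
        send_batch_size: 1000
        # Upper bound on the size of any single batch; larger batches are split.
        send_batch_max_size: 2000
```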

Exporters

An exporter is how data gets sent to different systems/back-ends. Generally, an exporter translates the internal format into another defined format.

Cloud Observability supports OTLP natively for all signals. Therefore, the only required exporter is the OTLP exporter, which requires a Cloud Observability access token. The example reads the access token from an environment variable to avoid hard-coding it in the configuration file.

exporters:
    otlp/ls:
        endpoint: ingest.lightstep.com:443 # US data center
        #endpoint: ingest.eu.lightstep.com:443 # EU data center
        headers:
            lightstep-access-token: ${env:LIGHTSTEP_ACCESS_TOKEN}

Extensions

Extensions provide capabilities on top of the primary functionality of the Collector. Generally, extensions implement components that can be added to the Collector but that do not require direct access to telemetry data and, unlike receivers, processors, and exporters, are not part of the pipelines. Examples include the Health Check extension, which responds to health check requests, and the PProf extension, which allows fetching the Collector’s performance profile.

The following extensions are enabled to improve the observability and stability of the Collector.

extensions:
    health_check:
    memory_ballast:
        size_in_percentage: 40

Health check

The health check extension exposes an HTTP endpoint that can be used to run liveness checks against the Collector.
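
The recommended configuration leaves the extension at its defaults. If you need to control where the check is served, a sketch along these lines may help; the bind address and the `/health` path are assumptions chosen for illustration.

```yaml
extensions:
    health_check:
        # Assumed bind address; 13133 is the extension's usual port.
        endpoint: 0.0.0.0:13133
        # Illustrative path that a liveness probe would request.
        path: /health
```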

Ballast

The ballast extension provides a mechanism to configure a memory ballast for the Collector process. Memory ballast pre-allocates a large chunk of the heap, allowing the Collector to use as much heap space as possible without that memory being re-allocated automatically when not in use. Configuring a ballast improves the performance of the Collector by providing stability to the heap.
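
The ballast can also be sized in absolute terms rather than as a percentage. The sketch below uses the extension's `size_mib` option; the value is an arbitrary example and should be chosen relative to the memory actually available to the Collector.

```yaml
extensions:
    memory_ballast:
        # Illustrative fixed ballast size instead of size_in_percentage.
        size_mib: 512
```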

Service

The service section is used to configure what components are enabled in the Collector based on the configuration found in the receivers, processors, exporters, and extensions sections.

We have defined the trace, metric, and log pipelines, the Cloud Observability default extensions, and Collector telemetry. If a component is configured but not referenced in the service section, it is not enabled.

service:
    extensions: [health_check, memory_ballast]
    telemetry:
        metrics:
            address: :8888
        logs:
            level: debug
    pipelines:
        traces:
            receivers: [otlp]
            processors: [memory_limiter, batch]
            exporters: [otlp/ls]
        metrics:
            receivers: [otlp]
            processors: [memory_limiter, batch]
            exporters: [otlp/ls]
        logs:
            exporters: [otlp/ls]
            processors: [memory_limiter, batch]
            receivers: [otlp] # update with your receiver name
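
To illustrate the rule that configured components must also be referenced in the service section, the fragment below is a sketch in which the PProf extension is configured but never started. Adding `pprof` here is an assumption for the sake of the example; it is not part of the recommended configuration.

```yaml
extensions:
    health_check:
    # Configured here...
    pprof:

service:
    # ...but not listed here, so the Collector does not enable it.
    extensions: [health_check]
```

Adding `pprof` to the `service.extensions` list would enable it.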

Telemetry

The telemetry section configures telemetry for the Collector itself. We recommend enabling traces, metrics, and logs for the Collector. At this time, the Collector exposes its own metrics in Prometheus format, so a Prometheus scraper must be configured to collect them. Port 8888 is the default port for this metrics endpoint.

telemetry:
    metrics:
        address: :8888
    logs:
        level: debug
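
One possible way to collect these metrics is to have the Collector scrape itself with the Prometheus receiver and send the result through the metrics pipeline. This is a sketch under several assumptions: the Contrib distribution is in use, the job name and scrape interval are illustrative, and the Collector's metrics endpoint is reachable at `localhost:8888`.

```yaml
receivers:
    prometheus:
        config:
            scrape_configs:
                # Illustrative job name and interval.
                - job_name: otel-collector
                  scrape_interval: 10s
                  static_configs:
                      # The Collector's own metrics endpoint from service.telemetry.
                      - targets: ["localhost:8888"]

service:
    pipelines:
        metrics:
            receivers: [otlp, prometheus]
            processors: [memory_limiter, batch]
            exporters: [otlp/ls]
```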

You can monitor your Collectors running in Kubernetes using the pre-built dashboard.

Updated Nov 22, 2023