Two components of the Cloud Observability architecture determine its performance: the tracers that collect span data and the Microsatellites. Tracers collect span data from your instrumentation, hold it in a buffer, and then send that data to the Microsatellites. Microsatellites collect 100% of that data and then send it to the Cloud Observability platform, where the UI retrieves it to build traces, service diagrams, Streams, and other reports.
Cloud Observability performance depends on the flow of data from the tracers to the Microsatellites and on the amount of memory the Microsatellites have to store that data. If a tracer collects more data than its buffer can hold, it may drop spans. If the Microsatellites don’t have enough memory to store the data sent by the tracers, they may drop spans.
The tracer client libraries are engineered for minimal impact on the processes they are tracing while still collecting and reporting the tracing data intended for collection. The use of the network is managed by buffering the data to be reported: spans and the associated attributes, events, and payloads. Buffering shifts some burden onto memory to hold this buffered data until the client flushes the content of the buffer and reports to the Microsatellite. You set the buffer size when you instantiate the tracer in your code. If the size is too small, you may start to see the client tracer dropping spans.
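The relationship between buffer size and dropped spans can be illustrated with a minimal sketch. This is not the tracer library's implementation: the class `BoundedSpanBuffer`, the function `simulate`, and the parameter `max_spans` are hypothetical names, and the drop-when-full policy is an assumption made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class BoundedSpanBuffer:
    """Hypothetical span buffer: holds at most max_spans between flushes."""

    max_spans: int
    dropped: int = 0
    _spans: deque = field(default_factory=deque)

    def add(self, span: dict) -> bool:
        """Buffer a span, or count it as dropped if the buffer is full."""
        if len(self._spans) >= self.max_spans:
            self.dropped += 1
            return False
        self._spans.append(span)
        return True

    def flush(self) -> list:
        """Return everything buffered (the report) and empty the buffer."""
        batch = list(self._spans)
        self._spans.clear()
        return batch


def simulate(max_spans: int, spans_per_interval: int, intervals: int) -> int:
    """Return how many spans are dropped over several flush intervals."""
    buf = BoundedSpanBuffer(max_spans=max_spans)
    for i in range(intervals):
        for j in range(spans_per_interval):
            buf.add({"interval": i, "seq": j})
        buf.flush()  # the buffer is emptied once per reporting interval
    return buf.dropped


if __name__ == "__main__":
    # A buffer smaller than the per-interval span volume drops the excess.
    print("dropped with small buffer:", simulate(100, 150, 3))
    # A buffer sized above the per-interval volume drops nothing.
    print("dropped with larger buffer:", simulate(200, 150, 3))
```

In this model, drops happen only when more spans arrive between flushes than the buffer can hold, which is why raising the buffer size (at the cost of more memory) removes them.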
Microsatellites are responsible for collecting the spans generated by the tracers and then sending that data to the Cloud Observability platform for trace assembly. When the memory allocation is too low or there are too few Microsatellites, you may start to see Microsatellites dropping spans. This can be fixed by either reducing the amount of span traffic or by increasing the available memory of Microsatellites in the pool (either by increasing the available memory per instance, or the overall number of instances).
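The trade-off between span traffic, memory per instance, and number of instances can be sketched as back-of-the-envelope arithmetic. The model below is an assumption, not a documented sizing formula: it supposes that the pool must hold every span for a fixed number of seconds, and the function name, parameters, and numbers are all hypothetical.

```python
import math


def instances_needed(
    spans_per_sec: float,
    avg_span_bytes: float,
    hold_seconds: float,
    mem_per_instance_bytes: float,
) -> int:
    """Estimate pool size under the assumed model.

    Required memory = span rate * average span size * time spans are held.
    The pool needs enough instances to cover that total.
    """
    required_bytes = spans_per_sec * avg_span_bytes * hold_seconds
    return max(1, math.ceil(required_bytes / mem_per_instance_bytes))


if __name__ == "__main__":
    four_gib = 4 * 1024**3  # hypothetical memory per instance

    # Hypothetical load: 50,000 spans/s, 500 bytes each, held for 300 s.
    print(instances_needed(50_000, 500, 300, four_gib))

    # Halving the span traffic lowers the requirement.
    print(instances_needed(25_000, 500, 300, four_gib))

    # Doubling memory per instance has the same effect on the estimate.
    print(instances_needed(50_000, 500, 300, 2 * four_gib))
```

The three calls correspond to the remedies described above: the first is the baseline, the second reduces span traffic, and the third increases the memory available per instance.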
If you’re using the Cloud Observability Community or Teams plan, then Cloud Observability manages your Microsatellites. If you see dropped spans from the Microsatellite, please contact customer service.
Cloud Observability provides a number of different ways to monitor tracer and Microsatellite performance, especially regarding dropped spans:
Reporting Status Dashboard:
See, per project and by service, the language, library, number of instances of that service currently reporting, number of spans dropped by the client tracer and by the Microsatellite, and the pool those Microsatellites belong to.
Satellite Pool Report:
See a high-level overview of all Microsatellite pools and individual Microsatellites, their current performance and configuration, and the projects reporting into them.
Provides a health check for a Microsatellite, along with configuration information.
StatsD Reporting Metrics:
Provides detailed StatsD metrics that you can import into a monitoring tool.
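As an illustration of working with imported StatsD metrics, the sketch below parses lines in the standard StatsD text format (`name:value|type`) and computes a dropped-span ratio from two counters. The metric names `example.spans.received` and `example.spans.dropped` are hypothetical placeholders, not the names Cloud Observability emits, and the sketch ignores optional sample-rate fields.

```python
def parse_statsd_line(line: str) -> tuple:
    """Split a StatsD line 'name:value|type[|@rate]' into (name, value, type)."""
    name, rest = line.strip().split(":", 1)
    parts = rest.split("|")
    return name, float(parts[0]), parts[1]


def drop_ratio(lines: list, received_name: str, dropped_name: str) -> float:
    """Sum counter ('c') values by name and return dropped / received."""
    totals = {}
    for line in lines:
        name, value, metric_type = parse_statsd_line(line)
        if metric_type == "c":  # only counters are summed; gauges etc. are skipped
            totals[name] = totals.get(name, 0.0) + value
    received = totals.get(received_name, 0.0)
    dropped = totals.get(dropped_name, 0.0)
    return dropped / received if received else 0.0


if __name__ == "__main__":
    # Hypothetical sample of StatsD lines; the names are placeholders.
    sample = [
        "example.spans.received:900|c",
        "example.spans.dropped:100|c",
        "example.spans.received:100|c",
        "example.memory.used:512|g",
    ]
    ratio = drop_ratio(sample, "example.spans.received", "example.spans.dropped")
    print(f"dropped-span ratio: {ratio:.2%}")
```

A monitoring tool would typically do this aggregation for you; the sketch only shows the kind of ratio that is useful to alert on when watching for dropped spans.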
Updated Apr 6, 2021