Telemetry is missing or incomplete

Data can appear incomplete or not appear in Lightstep for several reasons. Here are some troubleshooting steps for the most common causes.

Check your access token

Lightstep uses access tokens to authorize customers to send telemetry. If the access token is expired, disabled, or typed incorrectly, data will not be accepted by Lightstep.

Confirm your access token is enabled in Project settings, the value in your configuration matches the value in the UI, and it hasn’t expired.
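A mistyped token often comes down to stray whitespace or quote characters picked up while copying. The sketch below compares a configured value against the value shown in the UI and lists likely problems. The function name and the checks are illustrative assumptions, not part of any Lightstep tooling, and the sketch cannot tell whether a token is disabled or expired; confirm that in Project settings.

```python
def token_problems(configured: str, expected: str) -> list[str]:
    """Return likely reasons a configured token differs from the UI value.

    Hypothetical helper for illustration only; it performs a local string
    comparison and does not contact Lightstep.
    """
    problems = []
    if not configured:
        return ["token is empty or unset"]

    stripped = configured.strip()
    if stripped != configured:
        problems.append("token has leading or trailing whitespace")

    unquoted = stripped.strip("'\"")
    if unquoted != stripped:
        problems.append("token is wrapped in quote characters")

    if unquoted != expected:
        problems.append("token does not match the value copied from the UI")
    return problems
```

An empty list means the configured value matches the expected one exactly. How the configured value is loaded (environment variable, config file, secret store) depends on your setup.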

Verify the time window in the UI

Data displayed in the Lightstep UI are scoped to a specific start and end time. Double-check that the time window set on charts and queries in the UI is correct using the date and time picker.

If the window reaches back further than your retention period, the older data may no longer be available.

Check the Reporting Status page and the metric details page

The Reporting Status page provides a summary of services actively sending traces to Lightstep in the Service Directory.

For metrics, you can visit the metric details page to see actively reporting metrics.

Verify configuration and health of data pipeline components

Customers typically send data to Lightstep via multiple intermediate components like the OpenTelemetry Collector, Lightstep Microsatellites, or AWS CloudWatch Metric Streams. The health and configuration of these components are a frequent cause of missing or dropped telemetry.

Understanding the end-to-end pipeline of how data flows into Lightstep is critical for troubleshooting. For each “hop” of telemetry data through a collector, satellite or other component in your data pipeline, it’s important to verify the following:

  • How is the telemetry being ingested into this component?
  • How is the telemetry being modified (for example, by sampling or redacting) by this component?
  • How is the telemetry being exported from this component?
  • What format is the telemetry in?
  • How is the next hop configured?
  • Are there any network policies that prevent data from getting in or out?
  • Are there error messages in the logs of this component?
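For the network-policy question in the list above, a quick first check is whether one hop can open a TCP connection to the next. The sketch below uses only the Python standard library. The helper name is an assumption, and the host and port you pass in depend on your own pipeline. Port 4317 is the default OTLP gRPC port on the OpenTelemetry Collector, but your deployment may use a different one.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Illustrative helper: a successful connect only shows the port is
    reachable. It does not verify TLS, authentication, or that the
    component accepts the telemetry format being sent.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS resolution failures.
        return False
```

Run it from the same host or container as the sending component, since network policies are usually applied per source. A False result points toward firewalls, security groups, or DNS; a True result means the remaining questions in the list still need checking.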

OpenTelemetry Collectors and Lightstep Microsatellites are also impacted by hardware limitations including memory, network bandwidth, and CPU. For more information on the health, monitoring, and troubleshooting of specific components, see Verify and test microsatellite setup and the OpenTelemetry Collector Troubleshooting Guide on GitHub.

Verify your data retention and sampling policies

Sampling intentionally drops data for performance or cost reasons. If sampling is set to a low value in Lightstep, the OpenTelemetry Collector, or Lightstep Microsatellites, a portion of your data is intentionally not sent to (or processed by) Lightstep.
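When more than one hop samples, the reductions compound. As a rough worked example, the sketch below multiplies per-hop sampling rates to estimate the fraction of telemetry that reaches the end of the pipeline. It assumes each hop applies independent probabilistic sampling, which may not match how your components are configured. The function name and the example rates are illustrative.

```python
from math import prod


def effective_sample_rate(rates):
    """Estimate the fraction of telemetry that survives every hop.

    Assumes independent probabilistic sampling at each hop, so the
    overall rate is the product of the per-hop rates (each in [0, 1]).
    """
    rates = list(rates)
    for rate in rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"sampling rate must be between 0 and 1, got {rate}")
    return prod(rates)


# Hypothetical pipeline: one hop keeps 50% and the next hop keeps 10%.
overall = effective_sample_rate([0.5, 0.1])
print(f"Roughly {overall:.0%} of the original telemetry is expected to arrive")
```

Under these assumptions, two moderate-looking settings leave only about one in twenty of the original spans, which can easily look like missing data.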

Lightstep data retention policies allow customers to configure how long data is retained in Lightstep. If you’re trying to retrieve data from many days or weeks ago, confirm your data retention policies.
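To make the retention check concrete, the sketch below tests whether the start of a query window is older than a given retention period. The retention length is a placeholder you would replace with the value from your own policy, and the function is a hypothetical helper rather than anything provided by Lightstep.

```python
from datetime import datetime, timedelta, timezone


def outside_retention(query_start, retention_days, now=None):
    """Return True if query_start is older than the retention period.

    query_start and now should be timezone-aware datetimes.
    retention_days is an assumed value taken from your own policy.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return query_start < now - timedelta(days=retention_days)


# Example with a fixed "now" so the result does not depend on the clock.
now = datetime(2024, 6, 30, tzinfo=timezone.utc)
print(outside_retention(datetime(2024, 6, 1, tzinfo=timezone.utc), 14, now=now))
```

If the function returns True for the window you are querying, missing data at the start of that window is expected rather than a sign of a pipeline problem.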

More information on data retention policies is available here.

Check Lightstep’s status page

Lightstep updates the status page if our systems are experiencing an outage or trouble processing data.

The status page is available at https://status.lightstep.com/

Contact Lightstep

Include as much information as possible when contacting Lightstep for missing data issues, particularly if you are running OpenTelemetry Collectors or Microsatellites.

End-to-end architecture information and collector/microsatellite logs are especially helpful.

Services are missing or stale in the Service Directory

Services listed in the Service Directory are generated from traces. Below are some steps to troubleshoot services that don’t appear.

Confirm traces with service metadata are being ingested

The Reporting Status page provides an active view of services sending data to Lightstep. If the expected service name does not appear, follow the steps in the Telemetry is missing or incomplete section.

Wait 7 days for non-reporting services to be removed from the Service Directory

You may see old services in the Service Directory. If a service is decommissioned, it may take up to 7 days for it to be removed from the Service Directory.