Lightstep collects and analyzes data across infrastructure, application, runtime, cloud and other third-party services. By mapping metrics to unsampled trace data from your distributed system, it can correlate root causes across traces, metrics, and logs anywhere in the system, and provide immediate insights for developers and SREs.

Metrics

Lightstep ingests metrics from a number of sources as time series and saves them to our next-gen time series database (TSDB). Our TSDB was designed and built by the same people who created Google’s planet-scale Monarch system, and offers high throughput for reliable monitoring. Once your metrics are stored, Lightstep’s Change Intelligence can map that data to your trace data, correlating deviations in metric charts to performance issues from traces throughout your deep system.

Instrumentation

You instrument your services to create and collect the telemetry data that describes distributed traces and metrics in your system. This instrumentation lives in your microservices, functions, and web and mobile clients: anywhere your system accesses functionality.

If your services use Java, Node.js, Python, or Go, you can quickly instrument using our OpenTelemetry Launchers. OpenTelemetry provides APIs, libraries and instrumentation resources to capture telemetry data from your applications. Any supported frameworks, protocols, libraries, and data stores are automatically instrumented with just one line of code. You can then add more targeted instrumentation in areas of your system where additional data would prove helpful.
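To make "targeted instrumentation" concrete, here is a minimal sketch of the kind of data a single span records. It uses only the Python standard library and is not the OpenTelemetry or Lightstep API; the `span` helper, the `FINISHED_SPANS` list, and the `checkout` function are hypothetical names invented for illustration.

```python
import time
import uuid
from contextlib import contextmanager

# Hypothetical in-memory store; a real SDK would export spans instead.
FINISHED_SPANS = []
_active = []  # stack of spans that are currently open


@contextmanager
def span(operation, **attributes):
    """Record one timed operation as a span (illustrative only)."""
    parent = _active[-1] if _active else None
    record = {
        # A child span shares its parent's trace ID; a root span starts a new trace.
        "trace_id": parent["trace_id"] if parent else uuid.uuid4().hex,
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": parent["span_id"] if parent else None,
        "operation": operation,
        "attributes": dict(attributes),
        "error": False,
    }
    _active.append(record)
    start = time.perf_counter()
    try:
        yield record
    except Exception:
        record["error"] = True  # mark the span, then let the error propagate
        raise
    finally:
        record["duration_s"] = time.perf_counter() - start
        _active.pop()
        FINISHED_SPANS.append(record)


def checkout(cart_size):
    # Hypothetical business operation wrapped in a parent and a child span.
    with span("checkout", cart_size=cart_size):
        with span("charge-card", provider="example"):
            time.sleep(0.01)


checkout(3)
```

After `checkout(3)` runs, `FINISHED_SPANS` holds two records: the inner `charge-card` span (which finishes first) and the outer `checkout` span, linked by `parent_id` and sharing one `trace_id`. Real instrumentation libraries capture the same shape of data and send it onward rather than keeping it in a list.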

Lightstep also supports instrumentation for many other languages using OpenTracing.

Lightstep can also ingest data from Jaeger Agents or Zipkin, so if you’ve already instrumented your app to work with one of those, it will work with Lightstep too!

Microsatellites

Microsatellites are Lightstep components that communicate with your instrumentation to collect 100% of the telemetry data. They forward the data to the Lightstep SaaS, where it’s analyzed. Microsatellites analyze the performance of each segment against historical performance, error rates, and throughput.

Lightstep offers three types of Microsatellite: a locally run satellite that developers use during individual coding and testing to speed up instrumentation; public remote Microsatellites for development environments or lower-throughput production environments; and on-premises Microsatellites that you configure and maintain to meet the requirements of your specific production environment.

Lightstep Observability Platform

Microsatellites send any data that serves as examples of application errors, high latency, or other interesting events to the Observability Platform. The platform analyzes the data, builds complete traces and dynamic service diagrams, deduces correlations among the data, and monitors for changes in performance after deploys, from both your metric data and your trace data. And because it can correlate the two, you can use Change Intelligence to go from a metric deviation in a chart to a root cause in a trace in just three clicks.

The platform durably stores the data for as long as your Data Retention policy allows. Historical comparisons allow you to quickly see when things are not normal. Post-mortems can contain real data to show exactly what happened and when.

Lightstep Web UI

Here’s where observability is fully realized. From a deviation in a metric chart, Change Intelligence shows you the operations in your services that experienced a change in performance at the same time as the metric deviation. It also shows you the attributes from those operations that appeared more often during the change, so you can click through to a full trace.
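The idea of surfacing attributes that appear more often during a deviation can be sketched as a simple frequency comparison. This is an assumption-laden illustration, not Lightstep's actual Change Intelligence algorithm: the `attribute_lift` function, the span dictionary layout, and the scoring rule (share of spans inside the window minus share outside it) are all hypothetical.

```python
from collections import Counter


def attribute_lift(spans, window_start, window_end):
    """Rank (key, value) attribute pairs by how much more often they
    appear on spans inside the deviation window than outside it."""
    inside, outside = Counter(), Counter()
    n_in = n_out = 0
    for s in spans:
        in_window = window_start <= s["timestamp"] < window_end
        if in_window:
            n_in += 1
        else:
            n_out += 1
        target = inside if in_window else outside
        for pair in s["attributes"].items():
            target[pair] += 1
    if n_in == 0 or n_out == 0:
        return []  # nothing to compare against
    scores = []
    for pair in inside:
        share_in = inside[pair] / n_in
        share_out = outside[pair] / n_out  # Counter returns 0 for unseen pairs
        scores.append((pair, share_in - share_out))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical spans: two before the deviation, three during it.
spans = [
    {"timestamp": 10, "attributes": {"region": "us-east", "version": "1.4"}},
    {"timestamp": 20, "attributes": {"region": "us-west", "version": "1.4"}},
    {"timestamp": 110, "attributes": {"region": "us-east", "version": "1.5"}},
    {"timestamp": 120, "attributes": {"region": "us-west", "version": "1.5"}},
    {"timestamp": 130, "attributes": {"region": "us-east", "version": "1.5"}},
]

ranked = attribute_lift(spans, window_start=100, window_end=200)
```

In this made-up data, `version=1.5` appears on every span in the window and on none before it, so it ranks first, which is the kind of signal that would point toward a recent deploy. A production system would also need to account for sample size and statistical significance, which this sketch ignores.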

You can view complete traces, from web and mobile clients down to low-level services and back, with the critical path (areas where latency or error rate is affecting performance) detected for you.
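One common way to think about a latency critical path is as the chain of spans that determines when the whole request finishes. The sketch below assumes that definition and a simple span layout (`id`, `parent`, `name`, `start`, `end`); it is not Lightstep's detection logic, does not consider error rate, and the `critical_path` function and sample trace are hypothetical.

```python
from collections import defaultdict


def critical_path(span, children_of):
    """Return span names on the critical path, found by walking backward
    from the span's end and following the child that finished last."""
    path = [span["name"]]
    cursor = span["end"]
    kids = sorted(children_of.get(span["id"], []),
                  key=lambda c: c["end"], reverse=True)
    for child in kids:
        # A child that ended after the cursor overlapped a span already on
        # the path, so it did not determine the overall latency.
        if child["end"] <= cursor:
            path.extend(critical_path(child, children_of))
            cursor = child["start"]
    return path


# Hypothetical trace: "ads" runs in parallel with the slower "cart" call.
trace = [
    {"id": "a", "parent": None, "name": "frontend", "start": 0, "end": 100},
    {"id": "b", "parent": "a", "name": "auth", "start": 5, "end": 20},
    {"id": "c", "parent": "a", "name": "cart", "start": 25, "end": 90},
    {"id": "d", "parent": "a", "name": "ads", "start": 25, "end": 40},
    {"id": "e", "parent": "c", "name": "db-query", "start": 30, "end": 85},
]

children_of = defaultdict(list)
for s in trace:
    if s["parent"] is not None:
        children_of[s["parent"]].append(s)

root = next(s for s in trace if s["parent"] is None)
path = critical_path(root, children_of)
```

Here `path` is `["frontend", "cart", "db-query", "auth"]`: the `ads` span is excluded because it finished while `cart` was still running, so speeding it up would not shorten the request.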