Lightstep Observability collects and analyzes telemetry data across infrastructure, application, runtime, cloud, and other third-party services. By mapping metrics to trace data from your distributed system, it can correlate root causes across traces, metrics, and logs anywhere in the system, and provide immediate insights for developers and SREs.
Metrics
Lightstep Observability ingests metrics from a number of sources as time series and saves them to our next-gen time series database (TSDB). Our TSDB was designed and built by the same people who created Google’s planet-scale Monarch system, and offers high throughput for reliable monitoring. Once your metrics are stored, Change Intelligence can map that data to your trace data, correlating deviations in metric charts to performance issues from traces throughout your deep system.
Instrumentation
You instrument your services to create and collect the telemetry data that describes distributed traces and metrics in your system. This instrumentation lives in your microservices, functions, and web and mobile clients: anywhere your system accesses functionality.
If your services use Java, Node.js, Python, or Go, you can quickly instrument using our OpenTelemetry Launchers. OpenTelemetry provides APIs, libraries, and instrumentation resources to capture telemetry data from your applications. Any supported frameworks, protocols, libraries, and data stores are automatically instrumented with just one line of code. You can then add more targeted instrumentation in areas of your system where additional data would prove helpful.
Lightstep Observability can also ingest data from Jaeger Agents or Zipkin, so if you’ve already instrumented your app to work with one of those, it will work with Lightstep too!
Microsatellites
Microsatellites are Lightstep components that communicate with your instrumentation to collect 100% of the telemetry data. They forward the data to the Lightstep SaaS, where the performance of each segment is analyzed against historical performance, error rates, and throughput.
Lightstep offers three types of Microsatellites: a locally run satellite that developers use during individual coding and testing to speed up instrumentation, public remote Microsatellites for lower-throughput production or development environments, and on-premises Microsatellites that you configure and maintain to meet the requirements of your specific production environment.
Lightstep Observability platform
Microsatellites send telemetry data from your instrumentation to the Observability Platform. The platform analyzes data that exemplifies application errors, high latency, or other interesting events. It then builds complete traces and dynamic service diagrams, deduces correlations among the data, and monitors for changes in performance. Because it can correlate span and metric data, you can use Change Intelligence to go from a deviation in a chart to a root cause in a trace in just three clicks.
The platform durably stores the data, and for some data, you can configure the retention period. Historical comparisons allow you to quickly see when things are not normal.
Lightstep Observability web UI
Here’s where observability is fully realized. Unified dashboards allow you to view both metric and span data in one place. From a deviation in a metric chart, Change Intelligence shows you operations in your services that experienced a change in performance at the same time as the metric deviation. It also shows you the attributes from those operations that appear more often during the change, allowing you to click through to a full trace.
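To make the idea of attributes that appear more often during a change concrete, here is a minimal sketch in Python. It is an illustration only and does not describe how Change Intelligence is implemented. The span layout (a dict with a start time and an attributes dict), the function name overrepresented_attributes, and the min_lift threshold are all assumptions made for this example.

```python
from collections import Counter


def overrepresented_attributes(spans, window_start, window_end, min_lift=2.0):
    """Rank (key, value) attribute pairs by how much more often they appear
    on spans inside the deviation window than on spans outside it.

    Hypothetical helper for illustration; not a Lightstep API.
    """
    inside = [s for s in spans if window_start <= s["start"] < window_end]
    outside = [s for s in spans if not (window_start <= s["start"] < window_end)]
    if not inside or not outside:
        return []

    def share(group):
        # Fraction of spans in the group that carry each (key, value) pair.
        counts = Counter(pair for s in group for pair in s["attributes"].items())
        return {pair: n / len(group) for pair, n in counts.items()}

    in_share, out_share = share(inside), share(outside)
    floor = 1.0 / (len(outside) + 1)  # avoids dividing by zero for unseen pairs
    results = []
    for pair, p_in in in_share.items():
        lift = p_in / max(out_share.get(pair, 0.0), floor)
        if lift >= min_lift:
            results.append((pair, round(lift, 2)))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Made-up spans: version 1.5 only shows up once the deviation starts at t=100.
spans = [
    {"start": 0, "attributes": {"region": "us-east", "version": "1.4"}},
    {"start": 10, "attributes": {"region": "us-west", "version": "1.4"}},
    {"start": 20, "attributes": {"region": "us-east", "version": "1.4"}},
    {"start": 30, "attributes": {"region": "us-west", "version": "1.4"}},
    {"start": 100, "attributes": {"region": "us-east", "version": "1.5"}},
    {"start": 110, "attributes": {"region": "us-west", "version": "1.5"}},
    {"start": 120, "attributes": {"region": "us-east", "version": "1.5"}},
]
hits = overrepresented_attributes(spans, window_start=100, window_end=200)
print(hits)  # [(('version', '1.5'), 5.0)]
```

In this toy data the region values are spread evenly before and during the deviation, so only the new version stands out. That is the kind of signal you would then follow into a full trace.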
You can view complete traces, from web and mobile clients down to low-level services and back, with the critical path (areas where latency or error rate is affecting performance) detected for you.
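As a rough illustration of what detecting a critical path can involve, the sketch below uses one common simplification: starting at the root span, it repeatedly follows the child span that finishes last, since that child is the one the parent had to wait for. This is an assumption made for the example, not a description of Lightstep's detection logic, and the span fields (id, parent, name, start, end) and the function name critical_path are hypothetical.

```python
from collections import defaultdict


def critical_path(spans):
    """Return the span names along a simplified critical path.

    At each level the child with the latest end time is followed.
    Hypothetical helper for illustration; not a Lightstep API.
    """
    children = defaultdict(list)
    root = None
    for s in spans:
        if s["parent"] is None:
            root = s
        else:
            children[s["parent"]].append(s)

    path = []
    current = root
    while current is not None:
        path.append(current["name"])
        kids = children.get(current["id"], [])
        # The child that ends last is the one holding up its parent.
        current = max(kids, key=lambda k: k["end"]) if kids else None
    return path


# Made-up trace; times are milliseconds from the start of the request.
trace = [
    {"id": "a", "parent": None, "name": "GET /checkout", "start": 0, "end": 120},
    {"id": "b", "parent": "a", "name": "auth", "start": 5, "end": 25},
    {"id": "c", "parent": "a", "name": "payment", "start": 30, "end": 115},
    {"id": "d", "parent": "c", "name": "db.query", "start": 40, "end": 110},
    {"id": "e", "parent": "c", "name": "cache.get", "start": 32, "end": 38},
]
path = critical_path(trace)
print(path)  # ['GET /checkout', 'payment', 'db.query']
```

Speeding up auth or cache.get in this trace would not shorten the request, because neither is on the path. A fuller treatment would also account for a parent span's own processing time and for children that overlap in time.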