Lightstep Observability has a number of tools that help you in all your observability flows, whether it’s continual monitoring, triaging an incident, root cause analysis, viewing overall service health, or managing your team’s observability practices.

Monitoring

Monitoring your resources and transactions is a key part of observability. At a glance, you need to know whether the transactions moving through your system are performant and whether the resources (services, virtual memory) those transactions consume are healthy. Unified dashboards let you view both transactional performance (from trace data) and resource health (usually from metric data) in one place. After a deployment (even a partial one), you can use Lightstep Observability to confirm that things are staying on track.

Unified dashboards

Using the unified dashboard experience, you can monitor both metric and span data charts in one place.

As a starting point, you can use our pre-built dashboards for AWS CloudWatch Metric Streams metrics or for a metric integration that uses the OTel Collector. Once the dashboard is built, you can edit it to add charts, change chart queries, rearrange the charts, and more.

You create charts for a dashboard using a query builder that works for both metric and span data. Use filters and groupings to see just the data you want.

Instead of the builder, you can use the Unified Query Language (UQL) in the editor to build more fine-grained queries.

For span data, exemplars are mapped in the chart, providing direct access to traces. A table below the chart provides a quick view into the data.

You can click into a chart and immediately start your investigation using Change Intelligence.

Using Terraform? You can use the Lightstep Terraform provider to create and manage your dashboards and charts. You can also use it to export existing dashboards into the Terraform format.

Set up alerts

You create alerts by setting thresholds on a query against your metric or span data. You can set both a warning threshold and a critical threshold.
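To make the relationship between the two thresholds concrete, here is a minimal Python sketch that classifies a query result against a warning and a critical threshold. It is an illustration only, not Lightstep's implementation or API: the names `AlertState` and `evaluate_threshold`, the "above" comparison, and the sample latency values are all assumptions.

```python
from enum import Enum
from typing import Optional


class AlertState(Enum):
    OK = "ok"
    WARNING = "warning"
    CRITICAL = "critical"


def evaluate_threshold(
    value: float,
    warning: Optional[float],
    critical: Optional[float],
) -> AlertState:
    """Classify one query result against optional thresholds.

    Assumes an "above" comparison, where larger values are worse.
    The critical threshold is checked first so it takes precedence.
    """
    if critical is not None and value > critical:
        return AlertState.CRITICAL
    if warning is not None and value > warning:
        return AlertState.WARNING
    return AlertState.OK


# Hypothetical p99 latency readings in milliseconds.
readings = [120.0, 480.0, 950.0]
states = [evaluate_threshold(v, warning=400.0, critical=900.0) for v in readings]
print([s.value for s in states])  # ['ok', 'warning', 'critical']
```

Checking the critical threshold before the warning threshold means a value that crosses both is reported once, at the higher severity.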

Notification destinations determine where the alert should be sent. Lightstep supports PagerDuty, Slack, and BigPanda out-of-the-box. You can use webhooks to integrate with other third-party destinations.

Investigate root causes

Lightstep Observability offers several ways to help you find the root cause of performance and error issues. It can correlate spikes in metric performance or errors with changes in span data that occurred at the same time to determine what caused the change in performance. Using span data, Lightstep Observability analyzes traces to determine service dependencies that may be causing latency or errors in services further up or down the stack.

Triage incidents using notebooks

When you begin an investigation, you often need to run a number of queries to reach a hypothesis about the origin of an issue. Notebooks allow you to query both your metric and span data in one place to reach that hypothesis and then share those findings with other team members.

Once you mitigate the issue, you can transfer your learnings from notebooks to begin deeper root cause analysis.

Find the cause of change

Lightstep’s Change Intelligence correlates metric and span data to help find the cause of metric deviations. It determines the service that emitted a metric, searches for performance changes on Key Operations from that service at the same time as the deviation, and then uses trace data to determine what caused the change.

You access Change Intelligence from any chart on a dashboard, notebook, or alert. A side panel displays attributes on spans that experienced a change in performance at the same time as a deviation on the chart. You can copy queries for these attributes and paste them into a notebook, where you can continue your investigation.
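The general idea of surfacing span attributes whose performance changed during a deviation can be sketched in a few lines of Python. This is a simplified illustration and not Lightstep's actual algorithm: the function `rank_attribute_changes`, the span dictionary shape, and the sample data are hypothetical. It compares mean latency per attribute value between a baseline window and a deviation window.

```python
from collections import defaultdict
from statistics import mean


def rank_attribute_changes(baseline, deviation):
    """Rank (attribute, value) pairs by increase in mean latency.

    Each span is a dict such as:
    {"latency_ms": 12.0, "attributes": {"region": "us-east"}}
    """
    def latencies_by_attribute(spans):
        grouped = defaultdict(list)
        for span in spans:
            for key, value in span["attributes"].items():
                grouped[(key, value)].append(span["latency_ms"])
        return grouped

    before = latencies_by_attribute(baseline)
    after = latencies_by_attribute(deviation)

    changes = []
    for pair, values in after.items():
        if pair in before:  # only compare pairs seen in both windows
            changes.append((pair, mean(values) - mean(before[pair])))
    # Largest latency increase first.
    return sorted(changes, key=lambda item: item[1], reverse=True)


# Hypothetical spans from before and during a latency deviation.
baseline = [
    {"latency_ms": 100.0, "attributes": {"region": "us-east", "version": "1.4"}},
    {"latency_ms": 110.0, "attributes": {"region": "eu-west", "version": "1.4"}},
]
deviation = [
    {"latency_ms": 105.0, "attributes": {"region": "us-east", "version": "1.4"}},
    {"latency_ms": 410.0, "attributes": {"region": "eu-west", "version": "1.4"}},
]

ranked = rank_attribute_changes(baseline, deviation)
print(ranked[0])  # (('region', 'eu-west'), 300.0)
```

In this toy data set, the attribute `region: eu-west` ranks first because its spans slowed the most, which is the kind of signal you would then query further in a notebook.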

Learn more: Investigate a deviation

View a full trace

From any of the tools you use during an investigation, you can click through to a full-stack trace of a request. A side panel provides details of each span in the trace, allowing you to view its attributes, logs, and other details. The Trace view highlights the critical path (the time when an operation in a trace is actually doing something) so you can immediately see bottlenecks in the request. The Trace view is where you can test your hypotheses.
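One way to reason about "time when an operation is actually doing something" is to compute each span's self time: its duration minus the time covered by its child spans. The Python sketch below does that for a small hypothetical trace. It is a simplification for illustration, not how the Trace view computes the critical path, and the `Span` type, field names, and timings are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]
    start_ms: float
    end_ms: float


def self_time_by_span(spans: List[Span]) -> Dict[str, float]:
    """Return each span's duration minus the time covered by its children."""
    children: Dict[str, List[Span]] = {}
    for span in spans:
        if span.parent_id is not None:
            children.setdefault(span.parent_id, []).append(span)

    result: Dict[str, float] = {}
    for span in spans:
        covered = 0.0
        cursor = span.start_ms  # end of the interval already counted
        ordered = sorted(children.get(span.span_id, []), key=lambda c: c.start_ms)
        for child in ordered:
            # Clip the child to the parent and skip time already counted,
            # so overlapping children are not subtracted twice.
            start = max(child.start_ms, cursor)
            end = min(child.end_ms, span.end_ms)
            if end > start:
                covered += end - start
                cursor = end
        result[span.span_id] = (span.end_ms - span.start_ms) - covered
    return result


# Hypothetical trace: "auth" and "db" overlap under "root".
trace = [
    Span("root", None, 0.0, 100.0),
    Span("auth", "root", 10.0, 40.0),
    Span("db", "root", 30.0, 90.0),
    Span("query", "db", 35.0, 85.0),
]

self_times = self_time_by_span(trace)
print(self_times)  # {'root': 20.0, 'auth': 30.0, 'db': 10.0, 'query': 50.0}
print(max(self_times, key=self_times.get))  # query
```

Here the `query` span has the largest self time, so it is the first place to look for a bottleneck, even though `root` and `db` have longer total durations.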

View service health

The Service Directory view shows how services reporting to Lightstep Observability are performing. At a glance, you can view changes to performance on a service’s operations.

You can also see how well a service is instrumented for tracing, and where you can make improvements.
