View all content tagged with Root Cause Analysis
Learn how to find the cause of a latency increase using Lightstep's Service Health view.
Learn how to improve your incident response capabilities using high-cardinality tags in your instrumentation, creating alerts, and configuring Workflow Links.
Learn how to find the cause of an error rate increase using Lightstep's Service Health view.
Use Lightstep to monitor service performance after a deployment to catch regressions and other issues quickly.
Use Lightstep to quickly create a hypothesis about the root cause of a performance incident.
Lightstep can ingest infrastructure metrics from your instrumentation to report on things like CPU and memory usage to help you resolve incidents faster.
In Lightstep, you can add deployment markers to the Service Health for Deployments view so that you can easily see when service versions change and whether the change affects performance.
Lightstep lets you add flexible Workflow Links on the Trace View page that link to other resources, allowing access to all the info you need when you need it.
A standard method of identifying the root cause of a performance regression is to manually comb through traces and search for common system attributes associated with that regression or with errors. With Correlations, Lightstep helps you find attributes correlated with latency and errors automatically.
You can use Lightstep not only to monitor your services after a deploy, but also to compare performance over specific time periods and then dig into details to find the differences that caused the issue.
When you notice an increase in error rate on Lightstep's Service Health view, you can use the analytical tools to find the source of errors.
When you integrate Lightstep with Slack, you can copy a link to a specific Explorer query, Trace View page, or Stream, post it into any Slack channel in your workspace, and all the pertinent info from that page displays in the Slack channel.
Lightstep's Explorer view allows you to query all span data currently in the Satellites to see what's going on. You can create Snapshots that are durably persisted, allowing you to view performance at a certain point in time and share that Snapshot with others. You can see real-time span data, filter and group that data, and drill down on common attributes that may be causing latency.
You can share the URL from the different views in Lightstep (Trace, Explorer, etc.), and anyone who follows it is taken to that view in Lightstep. If you integrate with Slack, you can post a URL for an Explorer query, a Stream, or a Trace view in Slack, and members can see a preview of that data directly in the channel.
Use variables to render Workflow Link names, URLs, and rules.
Lightstep offers a way to quickly see how all your services and their operations are performing in one place: the Service Directory view.
You can use Lightstep's Service diagram to get an aggregate view of trace data as a request travels through your system. The Service diagram provides a visual, interactive, and hierarchical representation of a system’s behavior for a given point in time.
Use the Trace view to see a full trace of a request from beginning to end. The Trace view shows a flame graph of the full trace (with each service in a different color). Below the graph, the spans are shown in a hierarchy, so you can see the parent-child relationships of all the spans in the trace. Errors are shown in red.
Lightstep offers a way to see how your deployments (even partial deployments) affect your service performance.