View all content tagged with Root Cause Analysis
Use an OpenTelemetry-based plugin to automatically instrument AWS service API calls from your Node.js service, and see their performance in Lightstep.
Learn how to find the cause of a latency increase using Lightstep's Service Health view.
Learn how to improve your incident response capabilities using high-cardinality attributes in your instrumentation, creating alerts, and configuring Workflow Links.
Use an OpenTelemetry-based plugin to integrate feature flag data from your Node.js service into Lightstep.
Learn how to integrate Lightstep with Rollbar. Trace data from errors found in Rollbar is automatically captured in Lightstep, where you can start your investigation.
Learn how to find the cause of an error rate increase using Lightstep's Service Health view.
Use Lightstep to monitor service performance after a deployment to catch regressions quickly.
Use Lightstep's Change Intelligence to find the root cause when alerted to a deviation in your metrics.
Learn how to integrate Lightstep with Codefresh to monitor and, if needed, roll back your deployments.
Use Lightstep and Gremlin to create chaos experiments automatically using observability data found in Lightstep.
Lightstep can ingest infrastructure metrics from your instrumentation to report on things like CPU and memory usage, helping you resolve incidents faster.
In Lightstep, you can add deployment markers to the Service Health for Deployments view so that you can easily see when service versions change, and if the change affects performance in any way.
Lightstep lets you add flexible Workflow Links on the Trace View page that link to other resources, allowing access to all the info you need when you need it.
A standard method of identifying the root cause of a performance regression is to manually comb through traces and search for common system attributes associated with that regression or with errors. With Correlations, Lightstep helps you find attributes correlated with latency and errors automatically.
You can use Lightstep not only to monitor your services after a deploy, but also to compare performance over specific time periods and then dig into details to find the differences that caused the issue.
Use Lightstep's Change Intelligence when you notice a deviation in your metric charts to quickly find the root cause.
When you notice an increase in error rate on Lightstep's Service Health view, you can use the analytical tools to find the source of errors.
When you integrate Lightstep with Slack, you can copy a link to a specific Explorer query, Trace View page, or Stream, post it into any Slack channel in your workspace, and all the pertinent info from that page displays in the Slack channel.
Lightstep's Explorer view allows you to query all span data currently in the Microsatellites' retention window to see what's going on. You can create Snapshots that are durably persisted, allowing you to view performance at a certain point in time and share that Snapshot with others. You can also see real-time span data, filter and group that data, and drill down on common attributes that may be causing latency.
You register an attribute on your metric data that holds the service's name and allows Change Intelligence to correlate metric and trace data.
You can share the URL from the different views in Lightstep (Change Intelligence, Trace, Explorer, etc.), and users will be taken into Lightstep to that view. If you integrate with Slack, you can post a URL for an Explorer query, a Stream, or a Trace view in Slack and members can see a preview of that data directly in the channel.
Lightstep's Change Intelligence correlates changes found in traces with deviations in your metric data. But there may be times when you need help getting that correlation to work as expected. Here are some issues you may run into and how to solve them.
You can create notebooks for ad-hoc queries, post-mortems, runbooks, collaboration, or anytime you want to keep a record of an investigation.
Use variables to render Workflow Link names, URLs, and rules.
Lightstep offers a way to quickly see how all your services and their operations are performing in one place: the Service Directory view.
You can use Lightstep's Service diagram to get an aggregate view of trace data as a request travels through your system. The Service diagram provides a visual, interactive, and hierarchical representation of a system’s behavior for a given point in time.
Use the Trace view to see a full trace of a request from beginning to end. The Trace view shows you a flame graph of the full trace (each service in a different color) and, below that, each span in a hierarchy, allowing you to see the parent-child relationships of all the spans in the trace. Errors are shown in red.
Lightstep offers a way to see how your deployments (even partial deployments) affect your service performance.