Explorer makes exploring your real-time system behavior intuitive. It gives you a direct view into the data seen by your [𝑥]PM Satellites.
At the top of Explorer, there's a query builder that categorizes query terms by Service, Operation, and Tags. There is no practical limit to the number of query terms that you can include. When adding a new query term, you can scan the list of suggestions and begin typing to quickly find what you're looking for. In the case of a rarely seen search term, you're able to add terms outside the suggestion list.
Your previous queries are saved locally and can be viewed by clicking on the query history dropdown.
You can also filter a query by clicking on a Service or Operation in the Spans Table.
The Latency Histogram shows the global latency distribution for the full set of spans (timed operations) currently in memory across every [𝑥]PM Satellite in your environment. When reading the chart, remember that both axes are logarithmic: the x-axis shows latency and the y-axis shows frequency.
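As a rough illustration of what logarithmic axes mean for a latency chart, the sketch below groups span latencies into buckets whose width grows by a constant factor. The function names and the `buckets_per_decade` parameter are hypothetical, chosen for this example; this is not a description of how the Satellites build the histogram.

```python
import math
from collections import Counter


def log_bucket(latency_ms, buckets_per_decade=4):
    """Return the index of the logarithmic bucket that contains latency_ms."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive to place it on a log axis")
    return math.floor(math.log10(latency_ms) * buckets_per_decade)


def log_histogram(latencies_ms, buckets_per_decade=4):
    """Count how many latencies fall into each logarithmic bucket."""
    return Counter(log_bucket(v, buckets_per_decade) for v in latencies_ms)


def bucket_bounds(index, buckets_per_decade=4):
    """Return the (low, high) latency bounds in milliseconds for a bucket index."""
    low = 10 ** (index / buckets_per_decade)
    high = 10 ** ((index + 1) / buckets_per_decade)
    return low, high


# Made-up sample data: one bucket per power of ten.
latencies_ms = [2, 3, 5, 20, 50, 300, 4000]
histogram = log_histogram(latencies_ms, buckets_per_decade=1)
for index, count in sorted(histogram.items()):
    low, high = bucket_bounds(index, buckets_per_decade=1)
    print(f"{low:g}-{high:g} ms: {count} span(s)")
```

With buckets like these, a span at 4 seconds and a span at 2 milliseconds both remain visible on the same chart, which is the practical benefit of a logarithmic latency axis.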
You can use the query to narrow the scope of the histogram and the example spans below to a specific Service, Operation, and/or Tags. You can also click and drag along the latency axis of the histogram chart to further narrow the set of example spans to those in a specific latency range. This lets you drill down into subsections of the distribution to analyze particular performance problems. Together, these features let you examine the real-time latency characteristics of an entire application, across monoliths and microservices alike, or filter by any dimension, however focused or broad.
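Conceptually, a query plus a selected latency range acts as a filter over spans. The sketch below models that idea on in-memory data. The `Span` class, its fields, and the `matches` function are hypothetical and exist only for this illustration; they are not a LightStep API.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    # Hypothetical span record used only for this example.
    service: str
    operation: str
    duration_ms: float
    tags: dict = field(default_factory=dict)


def matches(span, service=None, operation=None, tags=None, latency_range=None):
    """Return True if the span satisfies every criterion that was supplied."""
    if service is not None and span.service != service:
        return False
    if operation is not None and span.operation != operation:
        return False
    # Every requested tag must be present with the requested value.
    for key, value in (tags or {}).items():
        if span.tags.get(key) != value:
            return False
    # latency_range is an inclusive (low, high) pair in milliseconds.
    if latency_range is not None:
        low, high = latency_range
        if not (low <= span.duration_ms <= high):
            return False
    return True


spans = [
    Span("api", "GET /cart", 35.0, {"region": "us-east"}),
    Span("api", "GET /cart", 820.0, {"region": "us-east"}),
    Span("api", "POST /order", 640.0, {"region": "eu-west"}),
    Span("billing", "charge", 900.0, {"region": "us-east"}),
]

# Narrow to one Service and tag, then to a latency range.
slow_us_api = [
    s for s in spans
    if matches(s, service="api", tags={"region": "us-east"}, latency_range=(500, 1000))
]
```

Each criterion is optional, so omitting all of them matches every span, and adding more terms can only narrow the result.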
Within the Latency Histogram are checkboxes (1 hour, 1 day, 1 week) that, when selected, overlay the average latency shape calculated over the given window of time. This lets you compare the shape of the Latency Histogram you are seeing right now with the historical average latency histograms that LightStep keeps continually up to date.
When LightStep detects that the historical histograms for a particular query are sufficiently different from the live histogram, it automatically shows the difference by programmatically selecting the time windows as soon as the query is made.
This will help you discover latency changes for your system in a more seamless and automatic way.
Your Satellites must be upgraded to the July 2018 release or later to enable this feature.
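How LightStep decides that two histograms are "sufficiently different" is not described here. Purely as an illustration, one common way to compare two distributions is total variation distance: normalize both histograms so their counts sum to 1, then take half the sum of the absolute differences per bucket. The function names and the `threshold` value below are assumptions made for this sketch.

```python
def normalize(histogram):
    """Convert bucket counts into fractions that sum to 1."""
    total = sum(histogram.values())
    if total == 0:
        return {}
    return {bucket: count / total for bucket, count in histogram.items()}


def total_variation(live, historical):
    """Return a distance in [0, 1]: 0 for identical shapes, 1 for disjoint ones."""
    p = normalize(live)
    q = normalize(historical)
    buckets = set(p) | set(q)
    return 0.5 * sum(abs(p.get(b, 0.0) - q.get(b, 0.0)) for b in buckets)


def shapes_differ(live, historical, threshold=0.2):
    """Hypothetical rule: flag the overlay when the distance exceeds threshold."""
    return total_variation(live, historical) > threshold


# Bucket index -> span count. The live histogram has shifted toward bucket 2.
live = {0: 10, 1: 30, 2: 60}
one_day_average = {0: 20, 1: 60, 2: 20}
print(shapes_differ(live, one_day_average))
```

Because both histograms are normalized first, the comparison reflects a change in shape rather than a change in traffic volume.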
Below the Latency Histogram is a table of example spans. This table is populated by extracting a subset of representative spans matching your query from your [𝑥]PM Satellites.
There are four columns in the table: Service, Operation, Sub-Trace Summary, and Sub-Trace Duration. The first two columns, Service and Operation, are used for identifying the source of the span. The last two columns, Sub-Trace Summary and Sub-Trace Duration, provide key insights into the behavior of the span.
A sub-trace is the portion of an overall end-to-end trace that consists of the displayed span itself and its descendant spans. The Sub-Trace Summary column shows the total number of spans in each sub-trace. Hovering over this column shows a Latency Breakdown, which lists the three most time-consuming operations within that particular sub-trace, with all other operations bundled together as Remainder. When summed, these latencies equal the total sub-trace latency. In this representation, the time consumed by an operation excludes the time spent waiting for any of its children. The Sub-Trace Duration column shows a horizontal bar whose length is scaled relative to the other example spans shown, making it much easier to spot outliers and anomalies.
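The arithmetic behind such a breakdown can be sketched as follows. The span dictionaries, their field names, and the `latency_breakdown` function are hypothetical. The sketch also assumes that each span's children run one after another, without overlapping, inside the parent's time range; under that assumption, an operation's own time is its duration minus the durations of its direct children, and the pieces sum to the sub-trace duration.

```python
from collections import defaultdict


def latency_breakdown(spans, root_id, top_n=3):
    """Return (span_count, breakdown) for the sub-trace rooted at root_id.

    breakdown is a list of (operation, self_time_ms) pairs: the top_n
    operations by self time, followed by a ("Remainder", ...) entry.
    """
    by_id = {span["id"]: span for span in spans}
    children = defaultdict(list)
    for span in spans:
        children[span["parent_id"]].append(span)

    # Walk the root span and all of its descendants.
    subtrace = []
    stack = [by_id[root_id]]
    while stack:
        span = stack.pop()
        subtrace.append(span)
        stack.extend(children[span["id"]])

    # Self time excludes time spent waiting on direct children
    # (assumes children do not overlap one another).
    self_time = defaultdict(float)
    for span in subtrace:
        duration = span["end_ms"] - span["start_ms"]
        waiting = sum(c["end_ms"] - c["start_ms"] for c in children[span["id"]])
        self_time[span["operation"]] += duration - waiting

    ranked = sorted(self_time.items(), key=lambda item: item[1], reverse=True)
    remainder = sum(time for _, time in ranked[top_n:])
    return len(subtrace), ranked[:top_n] + [("Remainder", remainder)]


# Made-up sub-trace: "checkout" calls three operations, one of which calls "cache".
spans = [
    {"id": 1, "parent_id": None, "operation": "checkout", "start_ms": 0, "end_ms": 100},
    {"id": 2, "parent_id": 1, "operation": "auth", "start_ms": 5, "end_ms": 15},
    {"id": 3, "parent_id": 1, "operation": "db-query", "start_ms": 20, "end_ms": 70},
    {"id": 4, "parent_id": 1, "operation": "render", "start_ms": 75, "end_ms": 90},
    {"id": 5, "parent_id": 3, "operation": "cache", "start_ms": 25, "end_ms": 30},
]
span_count, breakdown = latency_breakdown(spans, root_id=1)
```

In this sample, `checkout` lasts 100 ms but is credited with only 25 ms of its own time, because 75 ms of its duration is spent waiting on `auth`, `db-query`, and `render`.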
Errors in a span are indicated by coloring the Duration values in red.
This table is designed to surface spans that are abnormally slow, spans that contain large sub-traces, and spans that may be behaving unusually in other ways.
Within the Spans Table, you can also add a column for a specific tag key and see the corresponding tag values. This is especially useful for testing a hypothesis that a particular tag (e.g. customer_id) is related to latency, without having to manually inspect dozens of traces.
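The same kind of hypothesis can be checked offline if you have span data in hand. The sketch below groups spans by the value of one tag and compares the median duration per group. The span dictionaries and the `latency_by_tag` function are hypothetical, written only to illustrate the idea.

```python
from collections import defaultdict
from statistics import median


def latency_by_tag(spans, tag_key):
    """Return the median duration in milliseconds for each value of tag_key.

    Spans that lack the tag are grouped under None.
    """
    groups = defaultdict(list)
    for span in spans:
        value = span["tags"].get(tag_key)
        groups[value].append(span["duration_ms"])
    return {value: median(durations) for value, durations in groups.items()}


# Made-up spans: one customer is consistently slower than the other.
spans = [
    {"duration_ms": 900, "tags": {"customer_id": "acme"}},
    {"duration_ms": 1100, "tags": {"customer_id": "acme"}},
    {"duration_ms": 1000, "tags": {"customer_id": "acme"}},
    {"duration_ms": 40, "tags": {"customer_id": "globex"}},
    {"duration_ms": 60, "tags": {"customer_id": "globex"}},
]
medians = latency_by_tag(spans, "customer_id")
```

A large gap between groups suggests the tag is worth investigating further; it does not by itself show that the tag causes the latency.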