Lightstep’s Unified Query Builder allows you to query both your metric and span data using one tool. You use the same builder across notebooks, dashboards, and alerts to create charts that visualize your data. The builder allows you to quickly and easily ask questions of your data and see the results in one place.

Unified Query Builder

Query metric data

  1. Search for a metric to plot:
    Click into the search field and begin typing the metric name.

    You can select Metric from the All telemetry dropdown to narrow your search to only metric data.

    Search for metric

    Change Intelligence works best when it can focus on a single service and its dependencies. To narrow your search to metrics emitted from a specific service, click Filter to service and select the service name from the dropdown.

    Filter to a service

    When you select a metric, Lightstep expands the query builder and begins to chart the metric.

  2. Filter the data:
    Unless you’ve already filtered by a service, all data for the metric is displayed. You can filter the data using metric attributes found on the data.

    Enter an attribute key, select an operator or enter a regular expression, and enter one or more values. Multiple values are joined by OR.

    You can add more than one filter to the query where it makes sense (the query builder prunes the available list as you add attributes).

    Multiple filters use AND to join filters.

    Filter data with attributes

    By default, Lightstep Observability displays attributes that it’s seen in the last three days. But you can type in an attribute not in the dropdown and Lightstep will find it.

  3. Align the data by choosing an aggregation method. Alignment is the process by which time series data points are aggregated temporally to produce regular, periodic outputs. By aligning your data before plotting it, you can tell a clearer story and more easily see anomalies. You can also set an explicit rolling input window that determines the data points that will be used when computing the aggregation.

If you use the latest aggregation, no input window is needed.

Rolling input window

More about alignment and time windows

Whenever you make a query, Lightstep determines the number of output points to display that will make the chart detailed without making it feel crowded or hard to read. This optimal distance between points is called the output period. Lightstep adjusts the output period depending on the amount of time being displayed on the chart. If you set the time picker to one week, the output period is 2 hours (meaning that the query produces a series of points that are all two hours apart). If you change the time picker to one hour, the output period is 30 seconds.

For example, say you are querying the requests metric and the data is coming from two sources. Using the delta aggregation, Lightstep combines the data from those sources to create individual time series points and then plots those on the chart, based on the output period. If you queried over the last 60 minutes, the output period (the space between each point on the graph) is 30 seconds.

Output period

The rolling input window is the duration of time that the data is pulled from before aggregating it. By default for charts on dashboards and notebooks, the input window is the same as the output period. That is, if the output period is 30 seconds, the data is aggregated over the last 30 seconds to produce the data point.
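As a rough illustration of alignment, the sketch below sums raw increments from two sources into 30-second output periods, which corresponds to a delta aggregation whose input window equals the output period. The function name `align_delta`, the sample data, and the bucketing convention (each bucket is labeled by its end time) are assumptions made for this example and do not describe Lightstep's implementation.

```python
from collections import defaultdict

def align_delta(points, output_period_s):
    """Sum raw increments into fixed buckets of output_period_s seconds.

    points: iterable of (timestamp_seconds, increment) pairs from any source.
    Returns a sorted list of (bucket_end_timestamp, total) pairs.
    """
    buckets = defaultdict(int)
    for ts, increment in points:
        # Round the timestamp up to the end of the bucket that contains it.
        bucket_end = -(-ts // output_period_s) * output_period_s
        buckets[bucket_end] += increment
    return sorted(buckets.items())

# Hypothetical increments of a "requests" metric reported by two sources.
source_a = [(5, 2), (20, 1), (40, 4)]
source_b = [(10, 3), (35, 1), (55, 2)]

aligned = align_delta(source_a + source_b, output_period_s=30)
print(aligned)  # [(30, 6), (60, 7)]
```

Each output point is the combined total from both sources over the 30 seconds that precede it.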

If your chart appears very noisy, or if you’re building a chart for an alert, it can be helpful to smooth the output data by using data from a wider rolling input window. For example, if you are querying on the requests metric and choose to aggregate using rate with a 10 minute input window, Lightstep computes the number of requests per second in 10 minute rolling time windows.

Rate aggregation with 10m input window

When querying for alerts, you must explicitly set a rolling input window. Increasing the input window increases the time it takes before an underlying change in the data triggers an alert threshold. When it’s important to be notified immediately, use smaller input windows.
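To make the smoothing trade-off concrete, here is a minimal sketch of a rate aggregation computed over a rolling input window. The function `rolling_rate`, the half-open window convention, and the sample data are assumptions for illustration only.

```python
def rolling_rate(points, output_times, input_window_s):
    """Per-second rate at each output time over the preceding input window.

    points: list of (timestamp_seconds, increment) pairs.
    output_times: timestamps at which output points are produced.
    """
    rates = []
    for t in output_times:
        # Sum every increment in the half-open window (t - window, t].
        total = sum(inc for ts, inc in points if t - input_window_s < ts <= t)
        rates.append((t, total / input_window_s))
    return rates

# Hypothetical data: 60 requests every 30 seconds, with one spike of 600 at t=630.
points = [(t, 600 if t == 630 else 60) for t in range(30, 1231, 30)]
outputs = [600, 630, 660]

narrow = rolling_rate(points, outputs, input_window_s=30)
wide = rolling_rate(points, outputs, input_window_s=600)
print(narrow)  # [(600, 2.0), (630, 20.0), (660, 2.0)]
print(wide)    # [(600, 2.0), (630, 2.9), (660, 2.9)]
```

With the 30-second window the spike shows up as a single sharp point. With the 10-minute window the same spike is flattened to 2.9 requests per second but stays visible in every output point whose window still contains it, which is why wider windows make alerts slower to react.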

Choose one of the following temporal aggregation methods:

  • Delta: Computes the total number of increments in the input window as whole numbers. Deltas are most useful for infrequent events and are best visualized as stacked bar charts.

  • Rate: Computes the number of operations per second in the input window. Rates are most useful for ongoing operations and are best visualized as line charts.

Gauge metrics also allow the following aggregations:

  • Latest: Computes the latest value in the input window. When using latest, the input window is the same as the output period.

  • Mean: Computes the average value of the time series in the input window.

  • Max: Computes the maximum value of the time series in the input window.

  • Min: Computes the minimum value of the time series in the input window.

The query builder automatically configures the aggregation for distribution type metrics. If the distribution is a gauge, the aggregation is set to latest. If it is a delta or cumulative, the aggregation is set to rate by default, but you can change it to delta.
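The gauge aggregations listed above can be sketched as a single function applied to the samples that fall inside one input window. The function name `aggregate_gauge`, the window convention, and the sample values are assumptions for this illustration.

```python
from statistics import mean

def aggregate_gauge(points, window_end, input_window_s, method):
    """Aggregate gauge samples in the window (window_end - input_window_s, window_end]."""
    values = [
        value
        for ts, value in sorted(points)
        if window_end - input_window_s < ts <= window_end
    ]
    if not values:
        return None  # no samples fell inside the window
    if method == "latest":
        return values[-1]  # samples are sorted by timestamp, so the last one is newest
    if method == "mean":
        return mean(values)
    if method == "max":
        return max(values)
    if method == "min":
        return min(values)
    raise ValueError(f"unknown aggregation: {method}")

# Hypothetical CPU gauge samples as (timestamp_seconds, percent).
cpu = [(10, 40), (20, 70), (30, 55)]
for method in ("latest", "mean", "max", "min"):
    print(method, aggregate_gauge(cpu, window_end=30, input_window_s=30, method=method))
```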

  1. Set the rolling input window (Unless using latest aggregation. Alerts require an input window):
    By default for charts in notebooks and dashboards, the input window is the same as the output period (determined by the time period for the chart). By setting an explicit input window, you can smooth out your data to avoid noise.

    Aggregate and set input window

  2. Compute percentiles (distribution type metrics only):
    When the metric data is a distribution type (a set of values for each point in time), Lightstep Observability can compute percentiles for you. Enter the value in the Percentile field (no % needed).

    For existing Lightstep customers interested in tracking distribution metrics, please opt-in here. For new customers to Lightstep, this feature is already enabled in your account.

    When using distributions in an alert, you must select only one percentile to alert on.

  3. Group the data:
    By default, Lightstep Observability spatially aggregates the data from the metric into one line.

    Group all by default

    Instead, you can show lines for each available attribute value (group by). Select an attribute to display lines for each of the attribute’s values.

    In this example, by choosing to group by the host attribute, you can see the metrics for the individual hosts.

    Grouped by method

    Grouping isn’t available on big number charts.

  4. Choose how you want the data spatially aggregated into the chart.

  • count of non-null values: The number of values found that are not null. For example, given the values of [10, 15, null, 50] the count is 3.
  • count of non-zero values: The number of values found that are not zero (null is counted). For example, given the values of [10, 15, null, 0, 50] the count is 4.
  • maximum value: The highest point in the data.
    For example, given the values of [10, 15, 50] the max is 50.
  • mean of all values: The average (sum of the data divided by the count) of the data.
    For example, given the values of [10, 15, 50] the mean is 25.
  • minimum value: The lowest point in the data.
    For example, given the values of [10, 15, 50] the min is 10.
  • sum of all values: The total of all points in the data.
    For example, given the values of [10, 15, 50] the sum is 75.

    Distribution type metrics are automatically summed and then aggregated into percentiles.
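The worked examples in the list above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: `None` stands in for null, and the helper names are hypothetical.

```python
values = [10, 15, None, 50]        # None stands in for null
with_zero = [10, 15, None, 0, 50]
numeric = [10, 15, 50]

def count_non_null(xs):
    """Count entries that are not null."""
    return sum(1 for x in xs if x is not None)

def count_non_zero(xs):
    """Count entries that are not zero; per the text above, null is counted."""
    return sum(1 for x in xs if x is None or x != 0)

print(count_non_null(values))        # 3
print(count_non_zero(with_zero))     # 4
print(max(numeric))                  # 50
print(sum(numeric) / len(numeric))   # 25.0
print(min(numeric))                  # 10
print(sum(numeric))                  # 75
```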

  1. Click Save to save your chart.

When you hover over a point on the chart, you can see its value, along with the group-by value.

Hover over a point

Clicking on a point allows you to start Change Intelligence to determine what caused the change in performance.

Change Intelligence

Below the chart, a table displays the data for each line in the chart.

Table for chart
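The filtering and grouping rules from the steps above (values within one filter joined by OR, separate filters joined by AND, and one line per group-by value) can be sketched as follows. The data shape, the function names `matches` and `group_and_sum`, and the choice of sum as the spatial aggregation are assumptions for illustration.

```python
from collections import defaultdict

def matches(attrs, filters):
    """filters maps an attribute key to the set of accepted values.

    Values within one filter are ORed; separate filters are ANDed.
    """
    return all(attrs.get(key) in accepted for key, accepted in filters.items())

def group_and_sum(series, filters, group_by=None):
    """series: list of (attributes_dict, value). Returns {group_value: sum}."""
    totals = defaultdict(float)
    for attrs, value in series:
        if matches(attrs, filters):
            # Without a group-by, everything collapses into a single line.
            key = attrs.get(group_by) if group_by else "all"
            totals[key] += value
    return dict(totals)

series = [
    ({"host": "h1", "region": "us", "method": "GET"}, 10),
    ({"host": "h2", "region": "us", "method": "POST"}, 15),
    ({"host": "h1", "region": "eu", "method": "GET"}, 50),
    ({"host": "h3", "region": "ap", "method": "GET"}, 7),
]

one_filter = {"region": {"us", "eu"}}
two_filters = {"region": {"us", "eu"}, "method": {"GET"}}

print(group_and_sum(series, one_filter))                   # {'all': 75.0}
print(group_and_sum(series, one_filter, group_by="host"))  # {'h1': 60.0, 'h2': 15.0}
print(group_and_sum(series, two_filters))                  # {'all': 60.0}
```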

Query span data

  1. In the first field, select operation or service, or start typing another attribute key name.

    You can select Spans with from the All telemetry dropdown to narrow your search to only span data.

    Search for operation

    By default, the query builder displays attribute keys that it’s seen in the last three days. But you can type in an attribute not in the dropdown and Lightstep will find it.

    Select an operator or enter a regular expression, and enter one or more values. Multiple values are joined by OR.

    Alerts do not support regular expressions.

  2. Add an optional filter.

    You can further refine your query by adding a service, operation, or attribute key and value(s) to a filter where it makes sense (Lightstep prunes the available list as you add filters).

    Filter the query with attributes

    Multiple filters use AND to join filters.

  3. Choose an SLI for the chart (latency percentiles, error rate, operation rate, or count of spans).

    Compute

  4. Click Specify a time aggregation to optionally set an explicit rolling input window that aggregates the data points to be computed when displaying the chart.

    Alerts require a specific input window to determine how far back to look for alert violations.

    Rolling input window

    More about aggregation and time windows

    Whenever you make a query, Lightstep determines the number of output points to display that will make the chart detailed without making it feel crowded or hard to read. This optimal distance between points is called the output period. Lightstep adjusts the output period depending on the amount of time being displayed on the chart. If you set the time picker to one week, the output period is 2 hours (meaning that the query produces a series of points that are all two hours apart). If you change the time picker to one hour, the output period is 30 seconds.

    The rolling input window is the duration of time that the data is pulled from. By default, the input window is the same as the output period. That is, if the output period is 30 seconds, the data is aggregated over the last 30 seconds to produce the data point.

    Alerts require that you specifically set an input window.

    If your chart appears very noisy, it can be helpful to smooth the output data by using data from a wider input window.

    Rolling input window

  5. Optionally group the results (not supported for alerts).

    The builder aggregates the data from the span’s performance into one line. You can show lines for each available attribute value (group by). Select an attribute to display lines for each of the attribute’s values. In this example, by choosing to group by the customer attribute, you can see the percentiles for the individual customers.

    Group by attribute

    You must be using Microsatellites to add a group-by.

  6. For latency charts, the 50th, 95th, 99th, and 99.9th percentiles are added by default. You can delete any you don’t want and add others by typing the value in the field (no % needed). For alerts, you can have only one value to alert on.

    Toggle percentile views

  7. Click Save to save your chart.

The result of the query displays in a chart below the query builder.

By default, the chart shows lines for each series (group-by), and dots for sampled spans. Triangles represent spans that have errors.

Charted query

Use the Show span samples toggle to turn these off.

Turn sample spans off

With span samples turned off, when you hover over a line, you can see its value and percentile.

Span data

With span samples displayed, when you hover over a point, you can see the value at the point along with the group-by value.

View span data

Clicking the point takes you to its full trace, where you can view the full request path that the span participated in. In this case, the error is coming from the GET operation on the store-server service.

Full trace

Below the chart, a table displays the data for each line in the chart.

Table displays data

With sample spans displayed, the Value column shows the latest value for that series.

You can collapse both the query and the table.
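To show what the SLIs offered in the span query builder measure, the sketch below derives latency percentiles, error rate, operation rate, and span count from a small list of spans. The span dictionary shape, the function names, and the nearest-rank percentile method are assumptions for illustration; Lightstep's actual computation may differ.

```python
import math

def percentile(values, p):
    """Nearest-rank percentile; p is given without a % sign, for example 95 or 99.9."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

def span_slis(spans, window_s, p=95):
    """spans: list of dicts with 'latency_ms' and 'error' keys (hypothetical shape)."""
    count = len(spans)
    errors = sum(1 for s in spans if s["error"])
    return {
        "count": count,
        "operation_rate": count / window_s,  # spans per second
        "error_rate": errors / count if count else 0.0,
        "latency": percentile([s["latency_ms"] for s in spans], p) if count else None,
    }

# Ten hypothetical spans seen in a 60-second window; slow ones are marked as errors.
latencies = (20, 35, 50, 80, 120, 200, 310, 450, 600, 900)
spans = [{"latency_ms": ms, "error": ms > 400} for ms in latencies]

slis = span_slis(spans, window_s=60, p=50)
print(slis["count"])       # 10
print(slis["error_rate"])  # 0.3
print(slis["latency"])     # 120
```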

Add a final time aggregation

Once you’ve filtered and grouped your data, or added a formula, you may find it necessary to include a final time aggregation. The final aggregation takes all the values into account over the specified time period, and further smooths the data.

Final aggregation

Choose the aggregation operation (min, max, or mean) and then set the rolling input window. The final input window must be larger than the input windows set on individual queries.

Final temporal aggregation

More about final aggregation and noisy alerts

Another time you may want to use the final aggregation is for data that may cause flappy alerts. For example, say you set an alert to be sent when the rate of requests is over 2,300, and you set the initial input window to two minutes (because you want to smooth out super short spikes). There may be cases where, during a two-minute period, the rate crosses the threshold but then drops below it immediately after, multiple times, leading to your alert notifications flapping. If you set the final aggregation window to 10 minutes, the alert will still trigger within 2 minutes and will remain open for at least 10 minutes. The alert remains open until there has been a 10-minute period where the metric has consistently been under the threshold.
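The effect on flapping can be sketched with a rolling maximum as the final aggregation. The sample rates, the one-minute spacing, and the helper names are assumptions for illustration; the 2,300 threshold and the 10-minute window come from the example above.

```python
def rolling_max(series, window):
    """Max over the trailing `window` points (fewer at the start of the series)."""
    return [max(series[max(0, i - window + 1): i + 1]) for i in range(len(series))]

def transitions(states):
    """Number of times the alert state flips between open and closed."""
    return sum(1 for a, b in zip(states, states[1:]) if a != b)

THRESHOLD = 2300
# Hypothetical request rate, one point per minute, after the query's own input window.
rate = [2100, 2350, 2200, 2400, 2250, 2100, 2000, 2050,
        2000, 1990, 1980, 2000, 2010, 1990, 2000, 1995]

raw_state = [v > THRESHOLD for v in rate]
held_state = [v > THRESHOLD for v in rolling_max(rate, window=10)]

print(transitions(raw_state))   # 4
print(transitions(held_state))  # 2
```

Without the final aggregation the alert opens and closes twice. With a 10-point rolling maximum it opens once, at the same moment as before, and closes only after ten consecutive points are under the threshold.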

Add multiple queries to a chart

You can add more than one query to a chart. For example, you might want to show the request rate for iOS and Android on one chart.

Two queries on one chart

For alerts, if you add more than one query, you must join them with a formula.

To add a query, click Plot another metric or Plot another span and build your query as you did the first one.

Add a query

When you have multiple queries, you can edit the chart so only certain time series display. For example, in this chart, only the time series for metrics from the iOS service is displayed.

Toggle time series

Once you save the chart, this display toggle is persisted to the chart in the dashboard.

What happens when you delete a query?

You can delete a query by clicking the X for that row. When you do, the remaining queries retain their letters (for example, if you delete b, the remaining queries are still a and c). If you then add another query, it uses the letter that was deleted. If you continue to add queries, the order continues down the alphabet from the “highest” letter.

b was deleted. Now a new query uses b.

In the above example, three queries were originally plotted: a, b, and c. The user deleted b, so the next query plotted used b. When adding another query, the order continued to d.
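One way to picture this naming behavior is a function that always hands out the first letter not currently in use. This is a sketch that is consistent with the example above, not a description of the product's internals; the function name and the behavior when several queries have been deleted are assumptions.

```python
import string

def next_query_name(existing):
    """Return the first lowercase letter not used by a query on the chart."""
    for letter in string.ascii_lowercase:
        if letter not in existing:
            return letter
    raise ValueError("no free query names")

queries = ["a", "b", "c"]
queries.remove("b")                        # the user deletes query b
queries.append(next_query_name(queries))   # the new query reuses b
queries.append(next_query_name(queries))   # the next one continues to d
print(queries)  # ['a', 'c', 'b', 'd']
```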

If you want to use a big number metric chart with more than one metric, you need to combine them using a formula (big number charts can only display a single value).

Add a formula to the query

For span queries, you must be using Microsatellites to add a formula.

You can perform arithmetic on a single time series or on multiple time series using Add a formula. For example, you might enter a/(a+b) if you want to chart a as a fraction of the sum of a and b.

Lightstep Observability supports +, -, /, and *.

You must use * for multiplication. Implicit multiplication (for example, ab) is not allowed.

When using a metric that is a distribution type in a formula, you must select only one percentile.
If you’re performing the arithmetic on multiple queries, they must all be grouped by the same attribute.
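As an illustration of how a formula such as a/(a+b) combines two grouped queries, the sketch below applies it point by point within each group. The data shape and the function name are hypothetical, and for simplicity this sketch requires both queries to produce exactly the same group values and to be aligned on the same timestamps.

```python
def apply_formula(a, b):
    """Compute a / (a + b) point by point for each shared group.

    a, b: {group_value: [v0, v1, ...]} with equal-length, time-aligned series.
    """
    if a.keys() != b.keys():
        raise ValueError("both queries must produce the same groups")
    result = {}
    for group in a:
        result[group] = [
            x / (x + y) if (x + y) != 0 else None  # no value when the denominator is zero
            for x, y in zip(a[group], b[group])
        ]
    return result

# Hypothetical queries a (errors) and b (successes), both grouped by platform.
errors = {"ios": [5, 10, 0], "android": [2, 0, 0]}
successes = {"ios": [95, 90, 100], "android": [98, 100, 0]}

ratio = apply_formula(errors, successes)
print(ratio)  # {'ios': [0.05, 0.1, 0.0], 'android': [0.02, 0.0, None]}
```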

You can edit the chart so only the formula is shown. For example, in this chart, only the time series for the result of the formula is displayed.

The toggle display doesn’t affect when the alert is triggered. Alerts are triggered only on the result of the formula.

Toggle time series

Once you save the chart, this display toggle is persisted.

Considerations for alert queries

When creating queries for alerts, keep the following in mind:

  • You must set an input window (unless using latest for aggregation).
  • When querying distribution metrics or latency on spans, you can select only one percentile to query on.
  • If your query includes multiple sub-queries, you must use a formula to join them and create one output.
  • Group-by isn’t supported in alerts.
  • To include regular expressions in metric queries, you must be running this release or later of the Microsatellites. Regular expressions in span queries are currently not supported.
  • Consider adding a final time aggregation to prevent “noisy” alerts.

Troubleshoot query results

If your chart doesn’t look as expected, it may be because of one of the following:

  • The No data found message displays when Lightstep Observability can’t find a metric or span attribute key (service, operation, or attribute) by that name. Ensure you are using the right name in the query.

  • The No data found message also displays if you’re using the wrong time series operator for the metric kind.

    The latest operator can only be used with gauge metrics.

  • If no data displays and there’s no No data found message, then Lightstep Observability found the metric or span attribute key, but had no data to display.

  • When adding a formula over multiple queries, they must all be grouped by the same attribute.

Data retention

Data for metric queries is retained for 13 months.

For notebooks, span data is retained for the length of your retention window. For a chart on a dashboard, you can choose to retain the data for longer by creating it as a Stream. Data for alerts is always saved as a Stream.

Learn more about data retention.