In addition to metric data, Lightstep’s Unified Query Language (UQL) can be used to retrieve and process your span data. This guide will help you understand how you can use UQL to operate on span data.
Why use UQL to query span data?
Lightstep enables querying of spans time series data via the query builder and UQL. UQL enables a more flexible and extensible query experience, allowing you to write more complex filter expressions on attribute keys, combine spans time series in various ways, query numerical attributes on spans and calculate custom latency percentiles.
Querying span data via UQL
All spans queries must start with a fetch, or data generation, statement. Just as you start a metrics query with the keyword metrics and then specify a metric name, you start a spans query with the keyword spans and then specify a fetch type:
|Fetch type|Description|
|count|Produces a delta float counting the number of spans|
|latency|Produces a delta distribution for the latency of spans|
|custom-numeric-attribute|Produces a delta distribution of the values of custom-numeric-attribute|
The fetch type determines what sort of spans time series data is returned for your query.
The most basic spans UQL query you can write is spans count | delta | group_by [], sum. This returns the count of spans across all services reporting to Lightstep.
The ability to query a distribution based on a custom attribute of spans is a powerful feature of UQL. This means that as long as an attribute is float or integer valued, a distribution can be created based on the values of the attribute.
Like UQL metrics queries, spans queries require an alignment stage. If you are issuing a spans count query, you can use the delta, rate, or reduce aligners. You can provide an input window and output period to any aligner.
The count of spans sent by the warehouse service
spans count | delta | filter service = "warehouse" | group_by [], sum
The rate (ops/s) of the database-update operation in the warehouse service, over a rolling 10m window
spans count | rate 10m | filter service = "warehouse" && operation = "database-update" | group_by [], sum
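To make the idea of a rate over a rolling window concrete, here is a minimal Python sketch of the arithmetic. It is not how Lightstep implements the rate aligner; the function name rolling_rate, the window boundaries, and the sample data are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def rolling_rate(points: List[Tuple[int, float]], window_s: int) -> List[Tuple[int, float]]:
    """Turn delta counts into a per-second rate over a trailing window.

    points: (unix_seconds, span_count_delta) pairs.
    Assumption: the window for output time t covers (t - window_s, t].
    """
    out = []
    for t, _ in points:
        # Sum every delta whose timestamp falls inside the trailing window.
        total = sum(c for ts, c in points if t - window_s < ts <= t)
        out.append((t, total / window_s))
    return out


# Hypothetical per-minute span counts for one service.
counts = [(60, 120.0), (120, 180.0), (180, 60.0)]
rates = rolling_rate(counts, window_s=120)
```

With a 120-second window, the point at t=120 covers both the first and second counts (300 spans), giving 2.5 ops/s.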
If you are issuing a spans latency query, you must use a delta aligner. This is because these queries produce a distribution of span latencies.
The p99 latency of the database-update operation in the warehouse service
spans latency | delta | filter service = "warehouse" && operation = "database-update" | group_by [], sum | point percentile(value, 99)
You can filter your spans queries by any attribute on a span, using operators such as = and !~, and combine conditions with the boolean operators && and ||.
The rate (ops/s) of requests to the android and iOS services made by customer sweetpines
spans count | rate | filter (service = "android" || service = "iOS") && customer = "sweetpines" | group_by [], sum
The p50 latency for writes to database services
spans latency | delta | filter (service = "transaction-db" && operation = "INSERT") || (service = "inventory-db" && operation = "UPDATE") | group_by [], sum | point percentile(value, 50)
The above query is possible in UQL because of the flexibility in constructing boolean expressions.
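The boolean structure of that filter can be read as an ordinary predicate over a span's attributes. The Python sketch below mirrors the expression for illustration only; the function name is_db_write and the sample spans are hypothetical, and representing a span as a dictionary of attributes is an assumption, not Lightstep's data model.

```python
from typing import Dict, List


def is_db_write(span: Dict[str, str]) -> bool:
    """Mirror of the filter: (A && B) || (C && D)."""
    return (
        (span.get("service") == "transaction-db" and span.get("operation") == "INSERT")
        or (span.get("service") == "inventory-db" and span.get("operation") == "UPDATE")
    )


# Hypothetical spans, reduced to the two attributes the filter inspects.
spans: List[Dict[str, str]] = [
    {"service": "transaction-db", "operation": "INSERT"},
    {"service": "transaction-db", "operation": "UPDATE"},
    {"service": "inventory-db", "operation": "UPDATE"},
    {"service": "warehouse", "operation": "INSERT"},
]
matched = [s for s in spans if is_db_write(s)]
```

Only the first and third spans match, because each side of the || requires both its service and its operation to agree.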
Aggregation / Group By
Just as in metrics UQL queries, group_by combines rows with the same timestamps and the same values for the listed attribute keys using the provided reducer.
Unlike in metrics UQL queries, an aggregation (group_by) stage is required for spans queries. This is because of the cardinality constraints around returning a time series for every attribute combination for spans; the data is far easier to read and interpret once it has been aggregated.
To aggregate across all attributes and have your query produce a single time series, provide an empty field list ([]), as in group_by [], sum.
The following query produces a single time series:
The total count of spans, summed up across every attribute key
spans count | delta | group_by [], sum
This query produces a time series for every service reporting to Lightstep:
The count of spans, summed up by service
spans count | delta | group_by [service], sum
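The grouping behavior described in this section (rows with the same timestamp and the same values for the listed keys are combined with a reducer) can be sketched in Python as follows. This is an illustration of the semantics with a sum reducer, not Lightstep's implementation; the group_by function, the Row shape, and the sample rows are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (timestamp, attributes, value); a hypothetical row shape for illustration.
Row = Tuple[int, Dict[str, str], float]


def group_by(rows: List[Row], keys: List[str]) -> Dict[Tuple[int, Tuple[str, ...]], float]:
    """Sum values of rows that share a timestamp and the values of `keys`."""
    totals: Dict[Tuple[int, Tuple[str, ...]], float] = defaultdict(float)
    for ts, attrs, value in rows:
        # Rows are combined only when every listed key has the same value.
        group = tuple(attrs.get(k, "") for k in keys)
        totals[(ts, group)] += value
    return dict(totals)


rows: List[Row] = [
    (60, {"service": "android", "operation": "login"}, 3.0),
    (60, {"service": "android", "operation": "checkout"}, 2.0),
    (60, {"service": "iOS", "operation": "login"}, 4.0),
]
by_service = group_by(rows, ["service"])  # one series per service
overall = group_by(rows, [])              # empty key list: a single series
```

With an empty key list every row falls into the same group, which is why an empty field list yields exactly one time series.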
Point and Join stages
You can use a join expression to combine two or more spans time series. Join expressions have the same syntax for spans queries as they do for metrics.
Error percentage for service android
( spans count | delta | filter error == true && service == "android" | group_by [], sum; spans count | delta | filter service == "android" | group_by [], sum ) | join left / right * 100
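The arithmetic behind the error-percentage join can be sketched in Python: the left series (error spans) is divided by the right series (all spans) point by point and scaled by 100. How UQL treats timestamps missing from one side, or a zero denominator, is not stated here, so the choices below (keep only shared timestamps, skip zero totals) are assumptions, as are the function name and sample data.

```python
from typing import Dict


def join_error_percentage(left: Dict[int, float], right: Dict[int, float]) -> Dict[int, float]:
    """Compute left / right * 100 for each timestamp present in both series."""
    result: Dict[int, float] = {}
    for ts in sorted(left.keys() & right.keys()):
        # Assumption: skip points where the total count is zero.
        if right[ts] != 0:
            result[ts] = left[ts] / right[ts] * 100
    return result


# Hypothetical series keyed by timestamp (seconds).
errors = {60: 25.0, 120: 0.0}
totals = {60: 50.0, 120: 40.0, 180: 10.0}
pct = join_error_percentage(errors, totals)
```

The timestamp 180 is dropped because it has no matching point in the error series.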
The 99th percentile latency for the iOS service
spans latency | delta | filter service = "iOS" | group_by [], sum | point percentile(value, 99)
Approximate total bytes received by the warehouse-db service
spans client-request-size-bytes | delta | point dist_sum(value) | filter service = "warehouse-db" | group_by [], sum