Often, incident response begins with an alert sent to the on-call team. You create alerts in Lightstep Observability that trigger when a set threshold on a Stream is crossed. Thresholds can be set on error percentage, latency, or operations per second.

More about Streams

Streams are span queries whose results are retained beyond the Microsatellite retention window (three days, by default). They allow you to proactively monitor the parts of your system that are crucial to business health. You create Streams from a query of your services, operations, and attributes. Lightstep Observability continuously receives data matching the query from your Microsatellites and stores statistics and example traces, ensuring you always have data from 0 to p99.9, including outliers. The Stream view displays statistical time series data and example traces, stored for as long as your Data Retention policy allows.

Creating a Stream for a query allows you to view the returned data in a unified dashboard or notebook for the length of your data retention policy.

Learn more about Streams.
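As a rough illustration of the statistics a Stream retains, here's a minimal sketch (not Lightstep's actual implementation) of computing latency percentiles over a batch of span durations; keeping the full range from 0 to p99.9 means outliers survive in the stored statistics:

```python
# Illustrative only: the kind of latency statistics a Stream might retain,
# computed from a hypothetical batch of span durations in milliseconds.
def percentile(samples, p):
    """Nearest-rank percentile for p in (0, 100]."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("no samples")
    # Nearest-rank index: p=100 maps to the maximum sample.
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

durations_ms = [12, 15, 14, 13, 980, 16, 11, 18, 14, 15]  # one outlier: 980

stats = {
    "p50": percentile(durations_ms, 50),
    "p99.9": percentile(durations_ms, 99.9),
    "max": max(durations_ms),  # outliers are kept, not discarded
}
print(stats)  # {'p50': 14, 'p99.9': 980, 'max': 980}
```

Note how the p99.9 and max values capture the single slow request that a median alone would hide.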


When an alert is triggered, a message is sent to the configured destination, like a Slack channel or PagerDuty. The message includes a link to the Stream that triggered the alert and links to example traces. (Screenshot: alert in a Slack channel)

You create an alert by defining a notification destination and a threshold that determines when the alert will trigger.

For this step, let’s create an alert that triggers whenever the error percentage on the android service goes above 5%. Let’s tell Lightstep Observability to send that alert to the on-call team’s Slack channel every 5 minutes until it’s resolved.
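Before wiring this up in the UI, the arithmetic behind the threshold is simple. A minimal sketch, using hypothetical span counts for the android service over one evaluation interval:

```python
# Hypothetical span counts for the android service; this just shows the
# arithmetic behind an error-percentage threshold, not Lightstep internals.
ERROR_THRESHOLD_PCT = 5.0

total_spans = 1200
error_spans = 78

error_pct = 100.0 * error_spans / total_spans  # 6.5%
should_alert = error_pct > ERROR_THRESHOLD_PCT
print(f"{error_pct:.1f}% errors -> alert: {should_alert}")  # 6.5% errors -> alert: True
```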

  1. We’ll start by creating a Slack notification destination. In Lightstep Observability, the Destinations tab of the Alerts view lists all current notification destinations that alerts can be sent to. (Screenshot: Destinations page) Click New Message Destination to create a destination for the on-call Slack channel. (Screenshot: create a new Slack destination)

  2. In the dialog, use the dropdown to search for the channel you want to post the alerts to. In this case, we’ll search for #on-call. (Screenshot: Slack dialog) When you click Save, the new destination appears in the list. (Screenshot: new destination added to the list)

  3. Now that we have a destination to send the alert to, we can create the alert on a Stream. We already have a Stream that monitors the android service, so we’ll use that. (Screenshot: Stream list)

  4. When we open the Stream view, we can see that there have already been errors. Good thing we’re creating an alert! (Screenshot: Stream view)

  5. Click the Create Alert button. (Screenshot: Create Alert button highlighted)

  6. In this dialog, we define the threshold that triggers the alert. Choose Error Percentage for the Signal, set the Threshold to above 5%, and set the Evaluation Window to 5 minutes, meaning the alert won’t be sent until the violation has lasted for 5 minutes. (Screenshot: Create Alert dialog)

  7. Now we’ll add a notification destination and configure where to send the alerts and how often. Click Add Notification Destinations, select Slack for the Integration and the #on-call channel as the Destination, and set the Interval to 5m, meaning the alert will be re-sent every 5 minutes until the condition resolves. (Screenshot: add a destination) Once you click Create, the alert appears on the Stream. (Screenshot: created alert) A dotted grey line on the Stream shows the threshold, along with the name of the alert that triggers when the threshold is crossed.
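The two timing settings configured in these steps can be sketched as a small state check. This is an illustration of the configured semantics, not Lightstep's code: the Evaluation Window means a violation must persist for 5 minutes before the first notification, and the Interval means the alert re-sends every 5 minutes until it resolves.

```python
from datetime import datetime, timedelta

# Illustrative settings matching the tutorial's configuration.
EVALUATION_WINDOW = timedelta(minutes=5)  # violation must persist this long
RENOTIFY_INTERVAL = timedelta(minutes=5)  # re-send until resolved

def should_notify(violation_started, last_notified, now):
    """Return True if a notification should be sent at `now`."""
    if violation_started is None:                    # threshold not crossed
        return False
    if now - violation_started < EVALUATION_WINDOW:  # not sustained yet
        return False
    if last_notified is None:                        # first notification
        return True
    return now - last_notified >= RENOTIFY_INTERVAL  # periodic re-send

t0 = datetime(2023, 1, 1, 12, 0)
assert not should_notify(t0, None, t0 + timedelta(minutes=3))  # too early
assert should_notify(t0, None, t0 + timedelta(minutes=5))      # first send
assert not should_notify(t0, t0 + timedelta(minutes=5),
                         t0 + timedelta(minutes=7))            # within interval
assert should_notify(t0, t0 + timedelta(minutes=5),
                     t0 + timedelta(minutes=10))               # re-sends
```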

That’s it! Now we wait for the threshold to be crossed and the alert to be sent.
Sure enough - it happened again! (Screenshot: Slack alert) The on-call team can click one of the example traces to see what’s going on. (Screenshot: Trace view) Looks like it might be a 429 error code coming from the get-store-data operation on the store-server service.

In the next step, we’ll see how we can make it easy for the team to begin remediation by adding links from Lightstep Observability to other tools the team uses.

What did we learn?

  • You create alerts on Streams. Lightstep Microsatellites continuously send 100% of your telemetry data, and the SaaS watches for instances where defined thresholds are crossed.
  • You create alerts by defining a destination for the alert, a threshold that should trigger the alert, and rules for when the alert should be sent.