We’ve notified the team about the issue. Now we need to passively monitor the iOS service to make sure it doesn’t get worse. Once the issue is resolved, we also need to be able to act quickly if a regression occurs. To do this, we’ll create a Stream and a dashboard. A Stream tells the Satellites to continuously collect and persist meaningful data for a specific query from the moment the Stream is created.
Because the Lightstep demo is read-only, you can’t create a Stream. But there are some pre-built Streams for you to explore.
From the navigation bar, click Streams and then select the inventory > update-inventory Stream.
Streams contain typical monitoring information, like latency, operation rate, and error percentage.
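To make these three signals concrete, here is a minimal sketch of how they could be derived from one window of spans. The `Span` type, its fields, and the `summarize` helper are hypothetical names used for illustration, not part of Lightstep’s API, and a simple lower median stands in for latency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical span record; the field names are assumptions."""
    duration_ms: float
    is_error: bool


def summarize(spans, window_seconds=60):
    """Summarize one window of spans into latency, operation rate, and error percentage."""
    if not spans:
        return {"p50_ms": None, "ops_per_second": 0.0, "error_percentage": 0.0}
    durations = sorted(s.duration_ms for s in spans)
    p50 = durations[(len(durations) - 1) // 2]  # lower median of the window
    errors = sum(1 for s in spans if s.is_error)
    return {
        "p50_ms": p50,
        "ops_per_second": len(spans) / window_seconds,
        "error_percentage": 100.0 * errors / len(spans),
    }
```

For example, four spans in a 60-second window with one error would give an error percentage of 25.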
Every minute, Lightstep looks at the distribution of data from the Satellites and takes example traces from different buckets of that distribution, so you always have data from p0 to p99.9, including outliers.
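The bucketing idea can be sketched as follows. This is an illustration only: the bucket boundaries, the `Trace` type, and the choice to keep the slowest trace in each bucket are assumptions, not Lightstep’s actual sampling algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    """Hypothetical trace record; the field names are assumptions."""
    trace_id: str
    latency_ms: float


# Assumed bucket boundaries in per-mille (tenths of a percent) of the
# latency distribution: p0, p50, p90, p99, p99.9, p100.
BUCKET_EDGES_PERMILLE = [0, 500, 900, 990, 999, 1000]


def pick_examples(traces, edges=BUCKET_EDGES_PERMILLE):
    """Return one example trace from each non-empty latency bucket."""
    ordered = sorted(traces, key=lambda t: t.latency_ms)
    n = len(ordered)
    examples = []
    for lo, hi in zip(edges, edges[1:]):
        # Integer arithmetic avoids floating-point rounding at the edges.
        bucket = ordered[n * lo // 1000 : n * hi // 1000]
        if bucket:
            examples.append(bucket[-1])  # keep the slowest trace in the bucket
    return examples
```

Because the last bucket covers everything above p99.9, the slowest outlier in the window is always among the examples.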
When you hover over the timeseries graph, a scatter plot displays these persisted traces as dots that you can click to view. Not only can you see how the system is behaving, but you can also pull a trace that represents a specific latency at a specific point in time.
Because the data is persisted, you can change the time period to display or create a custom range over a period of days, weeks, or months (going back to when you first created the Stream).
Now that you’re continuously capturing this data, you can set up alerts to notify people when certain conditions on the Stream exist.
Alert conditions can be based on a Stream’s latency, operation rate, or error percentage. You can’t create alerts in the Lightstep demo, but we’ll walk you through the process.
To create an alert for a Stream, click Create Conditions.
When you create a condition, you set the signal to monitor (latency, error percentage, or operation rate), along with a threshold and an evaluation window. Alerting Rules determine who receives the alert and how it is delivered (PagerDuty, Slack, or other tools). Alerts include links back to the Stream.
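A condition can be thought of as a signal, a threshold, and an evaluation window checked against recent data. The sketch below assumes one sample per minute and averages the signal over the window. The `Condition` type and `is_violated` function are hypothetical, and Lightstep’s real evaluation logic may differ.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Condition:
    """Hypothetical alert condition; the field names are assumptions."""
    signal: str          # "latency", "error_percentage", or "operation_rate"
    threshold: float     # violated when the windowed average exceeds this
    window_minutes: int  # evaluation window, in minutes


def is_violated(condition, samples):
    """Check a condition against per-minute samples (oldest first).

    Each sample is a dict mapping a signal name to its value for that minute.
    """
    if condition.window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    window = samples[-condition.window_minutes:]
    if len(window) < condition.window_minutes:
        return False  # not enough data to fill the evaluation window yet
    return mean(s[condition.signal] for s in window) > condition.threshold
```

With per-minute error percentages of 1, 1, 6, 7, and 8, a 5% threshold is exceeded over a 3-minute window (average 7) but not over a 5-minute window (average 4.6), which shows how a longer window smooths out short spikes.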
Now that we know the right people will be alerted if the situation gets worse, we can go back to creating more monitoring capabilities.
Create a Dashboard
In a distributed system, there are many dependencies, so even though we have a Stream for the inventory service, we probably need to create Streams for other related services and queries. An easy way to view these Streams side by side is to put them on a dashboard.
You can’t create a dashboard in the Lightstep demo, but we’ve created a few for you.
Click Dashboards from the navigation bar and choose Dashboard 1.
The dashboard displays a number of Streams.
Hover over any of the Streams to see the scatter plot and reported metrics.
Lightstep also offers a Grafana plug-in that lets you integrate your Lightstep Streams into a Grafana dashboard. Clicking a Lightstep dashboard in Grafana takes you back into Lightstep, where you can view the Stream and continue your research.
What Did We Learn?
- Streams allow you to always capture meaningful data for specific queries.
- You can create alerts for Streams and use Slack, PagerDuty, or other apps to notify people when thresholds have been crossed.
- You can build dashboards from Streams to create a holistic view of performance metrics.