Lightstep offers a way to see how individual versions (even partial deployments) affect your service performance. When you access the Service Directory, the Deployments tab shows you the latency, error rate, and operation rate of the Key Operations on a service. The operations are initially sorted by the largest change during the selected time period.

When you implement an attribute to display versions of your service, a deployment marker displays at the time the deployment occurred. In the example below, a deploy of the inventory service occurred at around 12:40 pm. Hover over the marker in the larger charts to view details. These markers allow you to quickly correlate deployment with a possible regression.

When you have multiple versions in a time window, you can view the performance of each deployed version. For example, in this image, multiple versions have been deployed. Hover over the chart to see the percentage of traffic in each version.

This feature requires a Satellite upgrade to the March 2020 release.

If a version attribute hasn’t been instrumented for the service, or you haven’t configured Lightstep to recognize the version attribute, an Instrument version attributes button displays.

Clicking this button navigates to the Instrumentation Quality view, where you can get more information about instrumenting for deploys.

View Version Performance

Before you can view deployments in the Service Health view, you need to create deployment markers as attributes in your instrumentation. Each time the value of that attribute changes, Lightstep displays a new marker in the charts and begins tracking the independent performance of that version.
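To make the marker behavior concrete, here is a minimal sketch of the idea, not Lightstep's implementation. It assumes spans are plain dictionaries with a hypothetical `start` timestamp and an `attributes` map, and that the version attribute is named `service.version` (an assumption; use whatever key you configured). A marker is placed at the first span seen for each new version value.

```python
# Illustrative only: the span shape and attribute key below are assumptions,
# not Lightstep's data model.
VERSION_KEY = "service.version"


def deployment_markers(spans):
    """Return (timestamp, version) for the first span seen with each version."""
    seen = set()
    markers = []
    for span in sorted(spans, key=lambda s: s["start"]):
        version = span["attributes"].get(VERSION_KEY)
        # Spans without a version attribute cannot produce a marker.
        if version is not None and version not in seen:
            seen.add(version)
            markers.append((span["start"], version))
    return markers


spans = [
    {"start": 100.0, "attributes": {VERSION_KEY: "1.14.115"}},
    {"start": 160.0, "attributes": {VERSION_KEY: "1.14.115"}},
    {"start": 220.0, "attributes": {VERSION_KEY: "1.14.5017"}},
    {"start": 230.0, "attributes": {VERSION_KEY: "1.14.115"}},
]
print(deployment_markers(spans))
# [(100.0, '1.14.115'), (220.0, '1.14.5017')]
```

Note that the older version still serves traffic after the new one appears (as in a canary rollout), so the sketch keys markers on the first sighting of a version rather than on every alternation between values.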

When Lightstep detects only a single version in the time window selected, that version is displayed.

When more than one version has been deployed in that time window, you can choose a version to view separately. The marker for the selected version is solid. Colored lines represent the performance of that version, across the latency, error, and operation charts. All other versions display as grayed-out lines. You can hover over the chart to see the distribution of traffic between the versions and the corresponding performance.

In the image above, you can see that the current deploy has about 30% of total traffic and is experiencing increased latency at p99 and p95 (the green lines), whereas the other versions, before the selected deployment, were not (the gray lines).
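The traffic distribution shown on hover can be thought of as each version's share of the spans in the window. The following is a rough sketch under that assumption; the function name and input shape are hypothetical, and Lightstep may compute the figure differently.

```python
from collections import Counter


def traffic_share(versions):
    """Map each version to its percentage of the spans in the window."""
    counts = Counter(versions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {version: 100.0 * n / total for version, n in counts.items()}


# One entry per span observed in the window, holding that span's version.
window = ["1.14.115"] * 7 + ["1.14.5017"] * 3
print(traffic_share(window))
# {'1.14.115': 70.0, '1.14.5017': 30.0}
```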

Compare Deployment Performance

When you notice that performance has regressed after a deployment, it can be helpful to compare the current deploy to another deploy before you start digging in to latency or error analysis. The comparison shows whether the change is an anomaly or whether this type of regression is common after a deploy (often the case with canary deployments).

Lightstep can help by showing you a version-over-version comparison of performance. When you choose to compare two versions, Lightstep compares “time-shifted” views of performance. That is, it shows how the previous deploy performed for the same amount of time after its own deploy, compared to how the current version is performing. It does not show how the previous version performed during the same clock time as the current deploy.

For example, in this screenshot, Lightstep is comparing the current version (1.14.5017) at 3:49 pm with a previous version (1.14.115) at 5:09 am, showing performance for about 10 minutes after both deploys.

Lightstep uses a time-shift comparison, as there is no guarantee that the two versions exist simultaneously.
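The time-shift idea can be sketched as re-keying each version's time series by seconds elapsed since its own deploy, then lining up the points that share an offset. This is an illustration under stated assumptions, not Lightstep's algorithm: the function names, the `(timestamp, value)` input shape, and the sample numbers are all hypothetical.

```python
def time_shift(series, deploy_time, window):
    """Re-key (timestamp, value) points by seconds since deploy_time,
    keeping only points within `window` seconds after the deploy."""
    return [
        (t - deploy_time, value)
        for t, value in series
        if 0 <= t - deploy_time <= window
    ]


def compare(current, current_deploy, baseline, baseline_deploy, window):
    """Return (offset, current_value, baseline_value) for shared offsets."""
    cur = dict(time_shift(current, current_deploy, window))
    base = dict(time_shift(baseline, baseline_deploy, window))
    # Only offsets present in both series can be compared point by point.
    return [(off, cur[off], base[off]) for off in sorted(cur.keys() & base.keys())]


# Latency samples (ms) every 60 seconds; timestamps are seconds.
current = [(1000, 120), (1060, 180), (1120, 175)]
baseline = [(200, 110), (260, 115), (320, 112), (1000, 111)]
print(compare(current, 1000, baseline, 200, window=120))
# [(0, 120, 110), (60, 180, 115), (120, 175, 112)]
```

The baseline sample taken at timestamp 1000 is dropped: it falls at the same clock time as the current deploy, but 800 seconds after the baseline's own deploy, which is outside the comparison window.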

To compare performance of two versions:

  1. Select the version you’re concerned about from the View version dropdown.

  2. From the Compared to dropdown, select another version (one that you know was performing as expected).

    Lightstep overlays the time-shifted time series of the comparison deployment you selected (gray lines). You can quickly see if there is an issue or if the current deployment is behaving similarly to the comparison deployment. In this image, you can see that the comparison version has the same shape as the current version, so you can conclude that the current version is behaving as expected.

    In the image below, you can see that the comparison version did not have the same issue: there are no gray lines showing a regression.

Now that you have a clear idea of which version may have caused issues and when, you can continue on to investigating a latency regression or an increase in errors and get to the root cause.