Cloud Observability lets you see how individual versions (even partial deployments) affect your service performance. When you access the Service Directory, the Deployments tab shows you the latency, error rate, and operation rate of your Key Operations on a service. The operations are initially sorted by the largest change over the selected time period.
Cloud Observability considers Key Operations to be your ingress operations on a service.
When you instrument your service with a version attribute, a deployment marker displays at the time the deployment occurred, both in the Service Health view and in Cloud Observability’s correlation feature.
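For example, if your service is instrumented with OpenTelemetry, one common way to attach a version is through the standard resource-attribute environment variables. The attribute name and values below are illustrative; use whatever attribute your Cloud Observability project is configured to recognize:

```shell
# Attach a version to all telemetry this service emits.
# "service.version" follows OpenTelemetry semantic conventions;
# the service name and version number here are examples only.
export OTEL_SERVICE_NAME="checkout-service"
export OTEL_RESOURCE_ATTRIBUTES="service.version=1.14.5017"
```

Updating the version value on each release is what lets Cloud Observability distinguish one deploy from the next.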
We will be introducing new workflows to replace the Deployments tab and RCA view. As a result, they will soon no longer be supported. Instead, use notebooks for your investigation where you can run ad-hoc queries, view data over a longer time period, and run Cloud Observability’s correlation feature.
These markers allow you to quickly correlate a deployment with a possible regression.
When you have multiple versions in a time window, you can view the performance of each deployed version. For example, in this image of the Service Health view, multiple versions have been deployed. Hover over the chart to see the percentage of traffic in each version.
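The traffic split shown on hover can be thought of as a simple share-of-spans calculation per version. The following sketch (the `version` key is a stand-in for whatever attribute your instrumentation emits) illustrates the idea:

```python
from collections import Counter

def traffic_share_by_version(spans):
    """Return the percentage of traffic attributable to each version.

    `spans` is a list of dicts carrying a "version" attribute -- a
    simplified stand-in for instrumented span data.
    """
    counts = Counter(span["version"] for span in spans)
    total = sum(counts.values())
    return {v: round(100 * n / total, 1) for v, n in counts.items()}

# Example: three versions receiving traffic in the same time window.
spans = ([{"version": "1.14.115"}] * 5
         + [{"version": "1.14.5017"}] * 3
         + [{"version": "1.14.5020"}] * 2)
print(traffic_share_by_version(spans))
# {'1.14.115': 50.0, '1.14.5017': 30.0, '1.14.5020': 20.0}
```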
If a version attribute hasn’t been instrumented for the service, or you haven’t configured Cloud Observability to recognize the version attribute, an Instrument version attributes button displays.
Clicking this button navigates to the Instrumentation Quality view, where you can get more information about instrumenting for deploys.
Before you can view deployments in the Service Health view, you need to create deployment markers as attributes in your instrumentation. Each time the value of that attribute changes, Cloud Observability displays a new marker in the charts and begins tracking the independent performance of that version.
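Conceptually, a new marker is created whenever the attribute's value changes in the span stream. This hypothetical sketch models that behavior with (timestamp, version) pairs standing in for instrumented spans:

```python
def deployment_markers(events):
    """Emit a marker each time the version attribute's value changes.

    `events` is a time-ordered list of (timestamp, version) pairs.
    Returns the (timestamp, version) pairs where a new version
    first appears -- one marker per deployment.
    """
    markers = []
    last_version = None
    for ts, version in events:
        if version != last_version:
            markers.append((ts, version))
            last_version = version
    return markers

events = [(0, "1.14.115"), (5, "1.14.115"),
          (12, "1.14.5017"), (20, "1.14.5017")]
print(deployment_markers(events))
# [(0, '1.14.115'), (12, '1.14.5017')]
```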
When Cloud Observability detects only a single version in the time window selected, that version is displayed.
When more than one version has been deployed in that time window, you can choose a version to view separately. The marker for the selected version is solid. Colored lines represent the performance of that version, across the latency, error, and operation charts. All other versions display as grayed-out lines. You can hover over the chart to see the distribution of traffic between the versions and the corresponding performance.
In the image above, you can see that the current deploy has about 30% of total traffic and is experiencing elevated latency at the p99 and p95 percentiles (the green lines), while the versions running before the selected deployment were not (the gray lines).
When you notice that performance has regressed after a deployment, before you start digging into latency or error analysis, it can be helpful to compare the current deploy to another one. This comparison shows whether the change is an anomaly or whether this type of regression is common after a deploy (often the case with canary deployments).
Cloud Observability can help by showing you a version-over-version comparison of performance. When you choose to compare two versions, Cloud Observability compares “time-shifted” views of performance. That is, it shows how the previous deploy performed for the same amount of time after its deploy, compared to how the current version is performing (it doesn’t compare how it’s performing during the same time period as the current deploy).
For example, in this screenshot, Cloud Observability is comparing the current version (1.14.5017) at 3:49 pm with a previous version (1.14.115) at 5:09 am, showing performance for about 10 minutes after both deploys.
Cloud Observability uses a time-shift comparison, as there is no guarantee that the two versions exist simultaneously.
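The time-shift comparison amounts to re-indexing each deploy's timeseries to "time elapsed since that deploy," so the two series can be overlaid even though they never ran at the same time. A minimal sketch, using minute-granularity timestamps and hypothetical p99 latency values:

```python
def time_shifted(series, deploy_ts, window):
    """Re-index a timeseries to offsets from its deploy time.

    `series` maps absolute timestamp (minutes) -> p99 latency (ms).
    Returns (minutes-since-deploy, value) pairs within `window`
    minutes of the deploy, so two deploys overlay directly.
    """
    return [(t - deploy_ts, v)
            for t, v in sorted(series.items())
            if deploy_ts <= t < deploy_ts + window]

# Current deploy at 3:49 pm (minute 949), previous at 5:09 am (minute 309).
current = {949: 120, 950: 130, 951: 180}
previous = {309: 118, 310: 121, 311: 125}
print(time_shifted(current, 949, 10))   # [(0, 120), (1, 130), (2, 180)]
print(time_shifted(previous, 309, 10))  # [(0, 118), (1, 121), (2, 125)]
```

Because both series start at offset 0, the comparison shows each version's behavior over the same amount of time after its own deploy.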
To compare performance of two versions:
1. Select the version you’re concerned about from the View version dropdown.
2. From the Compared to dropdown, select another version (one that you know was performing as expected).
Cloud Observability overlays the time-shifted timeseries of the comparison deployment (gray lines). You can quickly see whether there is an issue or whether this deployment is behaving similarly to the comparison deployment. In this image, you can see that the current version has the same shape as the comparison version and can conclude that the current version is behaving as expected.
In the image below, you can see that the latest version has a performance issue and the previous one does not.
Updated Jun 11, 2020