View service hierarchy and performance

We will be introducing new workflows to replace Explorer and the Service diagram. As a result, they will soon no longer be supported. Instead, use notebooks for your investigation where you can add a dependency map, run ad-hoc queries, and run Cloud Observability’s correlation feature.

You can use Cloud Observability’s Service diagram to get an aggregate view of trace data as a request travels through your system. The Service diagram provides a visual, interactive, and hierarchical representation of a system’s behavior for a given point in time, based on the query shown in Explorer.

It also provides a clear visualization of inter-service relationships and insight into the performance of distributed software. You can see services both upstream and downstream from the queried service and pinpoint services that contribute to the latency of the request. The Service diagram also allows you to easily visualize a complex system architecture, identify services with errors, and quickly formulate or eliminate hypotheses.

You can also configure Cloud Observability to recognize and display inferred services in the diagram. Inferred services are external services, libraries, or dependencies that haven’t been instrumented, like a database or a third-party API. Cloud Observability recognizes these leaf spans (the request can’t continue to another service) and reports on their error counts, span counts, and average latencies.
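To make the reported numbers concrete, here is a minimal sketch of how span counts, error counts, and average latency could be computed for leaf spans that carry an identifying attribute. The span dictionaries, their field names, and the function `summarize_inferred` are assumptions made for illustration. They are not Cloud Observability's data model or API.

```python
from collections import defaultdict

def summarize_inferred(spans, attr_key, attr_values):
    """Aggregate span count, error count, and average latency per attribute value.

    Only leaf spans (spans that no other span names as its parent) are counted.
    The span shape used here is hypothetical.
    """
    parent_ids = {s["parent_id"] for s in spans if s["parent_id"] is not None}
    totals = defaultdict(lambda: {"span_count": 0, "error_count": 0, "latency_sum": 0.0})
    for s in spans:
        value = s["attributes"].get(attr_key)
        is_leaf = s["span_id"] not in parent_ids
        if value in attr_values and is_leaf:
            t = totals[value]
            t["span_count"] += 1
            t["error_count"] += 1 if s["error"] else 0
            t["latency_sum"] += s["duration_ms"]
    return {
        value: {
            "span_count": t["span_count"],
            "error_count": t["error_count"],
            "avg_latency_ms": t["latency_sum"] / t["span_count"],
        }
        for value, t in totals.items()
    }

# Hypothetical trace: one instrumented root span and two uninstrumented SQL calls.
spans = [
    {"span_id": "a", "parent_id": None, "attributes": {}, "error": False, "duration_ms": 50.0},
    {"span_id": "b", "parent_id": "a", "attributes": {"db.type": "sql"}, "error": False, "duration_ms": 10.0},
    {"span_id": "c", "parent_id": "a", "attributes": {"db.type": "sql"}, "error": True, "duration_ms": 30.0},
]
summary = summarize_inferred(spans, "db.type", {"sql"})
```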

The diagram edges represent relationships between spans, so their accuracy depends on the quality of your instrumentation. Missing spans may cause edges to appear where there is no parent-child relationship.
The following can cause missing spans:
* The Microsatellite dropped the span (for example, your Microsatellite pool is not auto-scaling to keep up with traffic).
* The tracer dropped the span (for example, your application crashed or never called span.flush()).
* There is an issue with your instrumentation (context was dropped, or the service with a parent span is not instrumented).
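If you have span data available outside the product, one rough way to spot a missing span is to look for spans whose parent ID never shows up in the trace. The sketch below assumes a hypothetical list-of-dictionaries span format with `span_id` and `parent_id` fields. It does not reflect any Cloud Observability export format.

```python
def find_orphans(spans):
    """Return spans whose parent_id refers to a span that is absent from the trace."""
    known_ids = {s["span_id"] for s in spans}
    return [
        s for s in spans
        if s["parent_id"] is not None and s["parent_id"] not in known_ids
    ]

# Hypothetical trace in which span "2" was dropped before it was reported.
trace = [
    {"span_id": "1", "parent_id": None, "service": "api-proxy"},
    {"span_id": "3", "parent_id": "2", "service": "database"},
]
orphans = find_orphans(trace)
```

An orphan only shows that a parent is missing. It does not say whether the Microsatellite, the tracer, or the instrumentation is responsible.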

When you see missing spans, check the Reporting Status page to find the culprit.

The queried service is shown with an animated border. Inferred services have a light blue inner halo. Any services that contribute to latency are shown with a yellow halo: the larger the halo, the larger the latency contribution. Services with errors are shown with a red halo.

Latency halos only display when your query contains a service.

The left panel shows operations on the currently selected service with exemplar spans. You can click a span to see it in context in the Trace view.

For inferred services, the panel shows details for the service.

View the Service diagram

You access the Service diagram by clicking the Service Diagram tab in Explorer. The service you queried for is centered in the diagram and shows an animated blue circle.

The blue animated border indicates that this is the queried service. Descendant services display a yellow halo to indicate their contribution to the latency experienced by the queried service.

The light blue halo indicates that this is an inferred service. The name that is displayed is based on how you configured it.

The yellow halo indicates that the service contributes to the latency experienced by the queried service. The size of the halo indicates the size of the latency contribution. The yellow halo only appears when a service is explicitly added as a query term (if your query doesn't include a service, latency halos aren't displayed).

The blue dot in the center of a node indicates the currently selected service. See the side panel for this service’s reporting status and a sample list of operations. You select a service by clicking the center of the node. Selecting a service changes the information in the side panel to reflect that service.

The red halo indicates that the service is experiencing errors.

A node appears greyed out when traces from that service don’t match a filter in the Latency Histogram or Trace Analysis table.

Hover over a service to view its status.

Click the center of a service to select it and see its information in the side panel.

The Service diagram helps you quickly find the ancestor and descendant services of a given service and see whether they contribute to latency. In the graphic above, a user has queried for api-server, and the diagram clearly distinguishes between the api-proxy service that sends traffic to api-server and the four services that receive traffic from api-server. The database service shows significant latency and may need further investigation.
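The ancestor and descendant relationship can be sketched as a walk over caller-to-callee edges. This is an illustration only. It assumes a hypothetical edge list, and the service names auth, cache, and billing are invented to stand in for the unnamed services in the example.

```python
from collections import deque

def reachable(start, adjacency):
    """Breadth-first walk returning every node reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def ancestors_and_descendants(service, edges):
    """Split services into those upstream and those downstream of `service`."""
    downstream, upstream = {}, {}
    for caller, callee in edges:
        downstream.setdefault(caller, set()).add(callee)
        upstream.setdefault(callee, set()).add(caller)
    return reachable(service, upstream), reachable(service, downstream)

edges = [
    ("api-proxy", "api-server"),
    ("api-server", "database"),
    ("api-server", "auth"),     # hypothetical service name
    ("api-server", "cache"),    # hypothetical service name
    ("api-server", "billing"),  # hypothetical service name
]
ancestors, descendants = ancestors_and_descendants("api-server", edges)
```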

Change the Service diagram’s display

You can move the diagram, zoom in and out, and center on a service using the controls at the top. Click Focus on service to see just the immediate upstream and downstream services.

Move the diagram by clicking and dragging.

Selecting and centering on a different service populates the panel with information about that service (but does not run a query).

To turn the display of inferred services off and on, use the switcher.

Filter results

Similar to how you can filter the spans from the Trace Analysis table, you can filter spans from the side panel of the Service diagram.

When you click a filter icon on a service or operation, the Trace Analysis table repopulates to show spans from the results that match the filter. Results are taken from all spans that participate in the same traces as the original query.
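The scoping rule can be pictured as two steps: collect the traces that the original query matched, then apply the filter to every span in those traces. The sketch below assumes a hypothetical span format and a query that matches on service name only. The function `filter_within_query_traces` is illustrative and not part of any Cloud Observability API.

```python
def filter_within_query_traces(spans, query_service, filter_service=None, filter_operation=None):
    """Filter spans, considering only traces that contain the queried service."""
    # Step 1: traces that the original query participates in.
    trace_ids = {s["trace_id"] for s in spans if s["service"] == query_service}
    # Step 2: apply the service or operation filter within those traces.
    results = []
    for s in spans:
        if s["trace_id"] not in trace_ids:
            continue
        if filter_service is not None and s["service"] != filter_service:
            continue
        if filter_operation is not None and s["operation"] != filter_operation:
            continue
        results.append(s)
    return results

spans = [
    {"trace_id": "t1", "service": "api-server", "operation": "GET /items"},
    {"trace_id": "t1", "service": "database", "operation": "select"},
    {"trace_id": "t2", "service": "database", "operation": "select"},  # trace without api-server
]
matches = filter_within_query_traces(spans, "api-server", filter_service="database")
```

Here the database span in trace t2 is excluded because api-server never took part in that trace.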

You can’t filter on inferred services.

Add inferred services

Before inferred services display in the diagram, you need to add attributes to your instrumentation that tell Cloud Observability the request is going to an inferred service. You then need to add the attribute to Cloud Observability settings and also set a display name.

For example, you might create (or already have) an attribute that returns the database type for a span. You can use this attribute and its value to identify spans coming from specific types of databases. Cloud Observability will collect information for any span that uses that attribute name/value pair. You might also use that attribute to display the database type, so that any database service is labeled with the database type value (for example, sql or cassandra).

You might also need to use different attributes for recognizing the inferred service and for the display name of that service. For example, you might use span.type="sql" to tell Cloud Observability to collect data for spans from any SQL database, and then use the value of the attribute sequel.db.vendor as the label in the display (for example, "MySQL" or "DB2").

An inferred service must be a leaf node in the resulting trace. That is, it cannot call out to another service.

To add inferred services:

  1. If you don’t already have one, add an attribute to your instrumentation that can identify an inferred service.
    For example, if a request is calling out to a SQL database, you might create an attribute with the name db.type that takes a value of sql when calling out to that database.

  2. In Cloud Observability, click Settings > Attribute mapping > Inferred services.

  3. In the Identify inferred services field, start typing to select the attribute and value for the inferred service you want to add.

  4. In the Label inferred service field, select the attribute whose value should be used as the display name.

For example, you might select db.type as the label, and db.type:"sql", db.type:"memcached", and db.type:"cassandra" as the values for which to collect and display inferred service information.
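The relationship between the two settings can be sketched as follows: the identifying pairs decide whether a span counts as an inferred service, and the label attribute supplies the display name. The constants and the function `inferred_service_label` are hypothetical. They model the behavior described above and are not Cloud Observability's implementation. The sketch also assumes attribute values are strings.

```python
# Identifying attribute/value pairs, mirroring the example above.
IDENTIFY = {("db.type", "sql"), ("db.type", "memcached"), ("db.type", "cassandra")}
# Attribute whose value is used as the display name.
LABEL_ATTRIBUTE = "db.type"

def inferred_service_label(span_attributes):
    """Return the display label if the span matches an identifying pair, else None."""
    matched = any((key, value) in IDENTIFY for key, value in span_attributes.items())
    if not matched:
        return None
    return span_attributes.get(LABEL_ATTRIBUTE)

# A span that calls out to a SQL database, and one that does not match.
label_sql = inferred_service_label({"db.type": "sql", "db.statement": "SELECT 1"})
label_none = inferred_service_label({"http.method": "GET"})
```

If the label attribute differs from the identifying attribute, a span can match and still have no label value, in which case this sketch returns None.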

See also

Service health

Query real-time span data

Monitor Microsatellites, tracers, and service reporting

Updated Apr 26, 2023