When your instrumentation includes tags (key/value pairs that carry metadata about your spans), Lightstep can use them to help you quickly pinpoint the root causes of issues. For example, you can see when spans with certain tag values have errors or high latency while spans with other values don’t.

In OpenTelemetry, tags are known as attributes.

More about Tags

A span can have zero or more key/value tags. Tags let you attach metadata to the span. For example, you might create tags that hold a customer ID, information about the environment that the request is operating in, or an app’s release. Tags do not reflect time-based events (logs handle events). The OpenTracing spec defines several standard tags, which tracers such as the Java-based tracer make available.
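To make the idea of key/value tags concrete, here is a minimal sketch in Python. The Span class below is a hypothetical stand-in written for illustration only; it is not a Lightstep, OpenTracing, or OpenTelemetry class, and the tag keys customer.id and environment, along with their values, are assumed examples.

```python
from dataclasses import dataclass, field
from typing import Dict, Union

TagValue = Union[str, int, float, bool]


@dataclass
class Span:
    """Hypothetical stand-in for a tracer's span (illustration only)."""
    operation: str
    # A span starts with zero tags and can accumulate any number of them.
    tags: Dict[str, TagValue] = field(default_factory=dict)

    def set_tag(self, key: str, value: TagValue) -> "Span":
        # Tags are plain metadata: setting a key again overwrites the value.
        self.tags[key] = value
        return self


span = Span("checkout")
span.set_tag("customer.id", "cust-1234")    # assumed example key and value
span.set_tag("environment", "production")   # assumed example key and value
span.set_tag("service.version", "10.8.585")
```

A real tracer offers an equivalent call on its own span object; the point here is only that tags are a flat map of keys to simple values attached to one span.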

You can also implement your own tags and logs. The following are tags and logs that work very well with Lightstep in addition to the OpenTracing semantic conventions. Any tag that you add to your span data will enable more segmentation, making it easier to find, filter, and group your span data in Lightstep. Lightstep doesn’t have cardinality limitations, so the more tags you use, the greater your insights will be.



Use the OpenTracing Semantic Tags

Start with the OpenTracing semantic span tags. These are standardized tags that provide much of the functionality you’ll need.

Tags help identify issues that are affecting business flows and assets that are important to you. For example, if your business’s number one concern is customers, then create a tag to carry the customer’s name so you can quickly see when issues hit your highest priority customers.

Here’s an example of how you can see when a specific customer is likely experiencing high latency. Because the customer tag was implemented, Lightstep was able to show that spans carrying the customer tag with the value meowsy are experiencing high latency.
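The kind of comparison described above can be approximated with a short Python sketch. It is not how Lightstep computes its results; it only illustrates why a customer tag makes the comparison possible. The span records, the latency_ms field, and the other-co customer value are made-up sample data.

```python
from collections import defaultdict
from statistics import median


def median_latency_by_tag(spans, key):
    """Group spans by the value of one tag and return the median latency per value."""
    groups = defaultdict(list)
    for span in spans:
        value = span["tags"].get(key)
        if value is not None:  # spans without the tag cannot be segmented
            groups[value].append(span["latency_ms"])
    return {value: median(latencies) for value, latencies in groups.items()}


# Made-up sample data: each span is a dict with a latency and a tag map.
spans = [
    {"latency_ms": 950, "tags": {"customer": "meowsy"}},
    {"latency_ms": 1020, "tags": {"customer": "meowsy"}},
    {"latency_ms": 45, "tags": {"customer": "other-co"}},
    {"latency_ms": 60, "tags": {"customer": "other-co"}},
    {"latency_ms": 50, "tags": {}},  # untagged span, ignored by the grouping
]

result = median_latency_by_tag(spans, "customer")
print(result)  # {'meowsy': 985.0, 'other-co': 52.5}
```

Note that the untagged span drops out of the result entirely, which is why consistent tagging matters: a span without the customer tag cannot be attributed to any customer.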

Similarly, tags that describe the device type or OS can help determine how widespread an issue is. Is it only a problem for customers on a particular combination of OS and device? Is that a large part of your customer base? Being able to make that determination quickly can help prioritize mitigation.

Catch System Changes

Knowing immediately when a deploy affects your system, or when an A/B segment of your environment is affected, means you can react quickly by starting a rollback or moving customers to a different segment.

When you use the service.version tag, Lightstep automatically creates markers showing when deployments occurred. Here’s an example of increased latency, likely caused by the deployment immediately before it.

In this graphic, you can see that the tag service.version with a value of 10.8.585 has many errors during the regression time range.

Figure: Tags correlated with errors

When you filter on the service.version:10.8.585 tag to see only spans with that tag value, and then group by the service tag in the Trace Analysis table, you can see which services were deployed with that version.

In this example, only spans on the store-server service have the version value of 10.8.585, which tells you that a deploy of that service is causing the errors. And you can look at the log messages to see what those errors are.

Figure: Logs, tags, and filters pinpoint the cause

Because of the service.version and service tags, you were able to quickly pinpoint the source of errors!
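The filter-then-group step from this example can be sketched in Python as follows. This is an illustration over made-up span records, not the Trace Analysis implementation; the checkout service and the versions 10.8.584 and 3.2.0 are assumed sample values.

```python
from collections import Counter


def filter_by_tag(spans, key, value):
    """Keep only spans whose tag `key` equals `value`."""
    return [s for s in spans if s["tags"].get(key) == value]


def count_by_tag(spans, key):
    """Count spans per value of tag `key`, skipping spans without that tag."""
    return Counter(s["tags"][key] for s in spans if key in s["tags"])


# Made-up sample data.
spans = [
    {"error": True, "tags": {"service": "store-server", "service.version": "10.8.585"}},
    {"error": True, "tags": {"service": "store-server", "service.version": "10.8.585"}},
    {"error": False, "tags": {"service": "store-server", "service.version": "10.8.584"}},
    {"error": False, "tags": {"service": "checkout", "service.version": "3.2.0"}},
]

# Step 1: filter on service.version:10.8.585. Step 2: group by the service tag.
suspect = filter_by_tag(spans, "service.version", "10.8.585")
services = count_by_tag(suspect, "service")
error_count = sum(1 for s in suspect if s["error"])

print(dict(services))  # {'store-server': 2}
print(error_count)     # 2
```

Because only one service appears after the filter, the version tag and the service tag together point at a single deploy.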

Label Infrastructure

Knowing where an issue is occurring and, just as important, ruling out where it’s not can save a tremendous amount of time. In this example, you can see that the tag db.type=cassandra is on almost every span at the high end of the latency range.

When you group the results by the db.type tag, you can see that the Cassandra database is experiencing much higher latency than others.
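One way to reason about a tag being "on almost every span at the high end" is to compare how often a tag value appears among the slowest spans with how often it appears overall. The Python sketch below does that over made-up data. It is an assumption about how such a correlation could be computed, not a description of Lightstep's algorithm, and the mysql and redis values and the 25% cutoff are illustrative choices.

```python
def tag_share_in_slowest(spans, key, value, fraction=0.25):
    """Return (share among the slowest spans, share among all spans) for a tag value."""
    if not spans:
        raise ValueError("need at least one span")
    ranked = sorted(spans, key=lambda s: s["latency_ms"], reverse=True)
    # Always keep at least one span in the slow tail.
    tail = ranked[:max(1, int(len(ranked) * fraction))]

    def share(group):
        matches = sum(1 for s in group if s["tags"].get(key) == value)
        return matches / len(group)

    return share(tail), share(spans)


# Made-up sample data.
spans = [
    {"latency_ms": 900, "tags": {"db.type": "cassandra"}},
    {"latency_ms": 850, "tags": {"db.type": "cassandra"}},
    {"latency_ms": 40, "tags": {"db.type": "cassandra"}},
    {"latency_ms": 30, "tags": {"db.type": "mysql"}},
    {"latency_ms": 35, "tags": {"db.type": "mysql"}},
    {"latency_ms": 50, "tags": {"db.type": "mysql"}},
    {"latency_ms": 5, "tags": {"db.type": "redis"}},
    {"latency_ms": 8, "tags": {"db.type": "redis"}},
]

tail_share, overall_share = tag_share_in_slowest(spans, "db.type", "cassandra")
print(tail_share, overall_share)  # 1.0 0.375
```

A tag value that is far more common in the slow tail than in the full set is a good candidate to group by next; a value with similar shares in both can usually be ruled out.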

What Did We Learn?

  • Tags are valuable when looking for root causes. Lightstep can correlate tags with errors and latency, and you can group and filter by tags to narrow down the data and find issues fast.

  • Using the service.version tag allows Lightstep to show you when deploys occur, letting you associate a deploy with a regression at a glance.

  • You should create tags related to valued business functionality so you can easily monitor flows and assets important to you.