In addition to the tags defined by the OpenTracing semantic conventions, several other tags work well with Lightstep. Every tag that you add to your span data enables more segmentation, making it easier to find, filter, and group that data in Lightstep. Lightstep doesn’t have cardinality limitations, so the more tags you use, the more insight you get.

Tags that let you segment user pathways are particularly useful. Tags for “parameters” (params.name, params.count), which correspond to the operation on the span and determine which path the operation takes based on user input, are also very helpful for grouping, filtering, and segmenting. Without them, you may optimize for one use case without noticing an outlier use case that is triggered only a quarter of the time. Correlations can also spot outliers from these tag values.
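As a minimal sketch of the parameter tags described above, the following copies selected call parameters onto a span as params.<key> tags. RecordingSpan is a hypothetical stand-in so the example runs on its own; it mimics only the set_tag(key, value) method that OpenTracing spans expose. The tag_params helper and its allow-list are likewise assumptions for illustration, not part of any Lightstep or OpenTracing API.

```python
class RecordingSpan:
    """Hypothetical stand-in for a tracer span (not a real tracer class).

    It mimics only set_tag(key, value), which OpenTracing spans provide.
    """

    def __init__(self, operation_name):
        self.operation_name = operation_name
        self.tags = {}

    def set_tag(self, key, value):
        self.tags[key] = value
        return self


def tag_params(span, params, allowed=("name", "count", "type")):
    """Copy allow-listed call parameters onto the span as params.<key> tags.

    The allow-list keeps unexpected or sensitive parameters off the span.
    """
    for key in allowed:
        if key in params:
            span.set_tag("params." + key, params[key])
    return span


span = RecordingSpan("list_projects")
tag_params(span, {"name": "recent", "count": 50, "secret": "do-not-tag"})
print(span.tags)  # {'params.name': 'recent', 'params.count': 50}
```

With a real tracer, the same helper could be handed the active span instead of the stand-in, since it relies only on set_tag.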

Useful Tags

The following tags (beyond the OpenTracing tags) are recommended because they provide greater visibility into your span data.

Best Practice - Use the OpenTracing Tags
Follow the OpenTracing semantic tag guidelines whenever possible.

  • params.count, params.name, params.type, corresponding to the parameters an operation was called with.
  • payload.size, response.size, request.bytes, request.size_bucket, response.size_bucket, or other size tags when sending and receiving data.
  • host.dc, zone.name, zone.id, region, or any sort of regional, zone, or geographical tag.
  • request.id, uuid, or other anonymous identifiers of transactions or of users (or even of user segments or user types).
  • version, library.version, api.version, or any sort of version tagging on your code in production.
  • hardware version, platform, or any identifier of the user’s device or operating system, such as iOS 10 or iOS 8.
  • http.status_code_group such as 4xx, 5xx, 2xx.
  • client.error versus internal.error, booleans that distinguish errors caused by the caller (for example, a 400 or 404) from errors caused by the service (for example, a 500).
  • exception.class, exception.message, unified_error_code, to quickly gauge how many exceptions are occurring and which specific error types they are.
  • <entity>.id for the entity that is being fetched from the database or worked with, for example, project.id if retrieving a project for a user.
  • grpc.method, grpc.status_code, and codes corresponding to gRPC calls.
  • retry_attempt, max_retry_attempts in areas of the codebase where there is retry logic.
  • feature_flag.<feature_flag_name>: true/false, canary: true/false, or other tags corresponding to when a feature flag or canary/AB test is active or not.
  • pubsub.topic, pubsub.message_id, and other tags corresponding to pubsub mechanisms.
  • sdk.version, api.version, library.version, service.version, and any other tags related to versions of the codebase or library/tool being used. The service.version tag is very useful when tracking service deployment performance in Lightstep.
  • kubernetes_cluster, pod.id, node.id, and any other tags related to Kubernetes (or other container management solutions) to help easily show when a problem is isolated to a particular cluster, pod or node.
  • stack_trace_hash, the hash of a stack trace when an error has occurred, to easily search for a particular stack trace in Lightstep.
  • flow, a human-readable name of common flows represented by traces, like checkout or search.
  • svc.<service_name>.<thing>, service name-spacing for service-specific tags, for example:
    • svc.users.table
    • svc.users.database
    • svc.users.index_name
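Several of the tags listed above are derived from values you already have rather than read directly. The sketch below shows one way to compute http.status_code_group, client.error and internal.error, response.size_bucket, and stack_trace_hash. All function names, the bucket boundaries, and the 16-character hash length are assumptions chosen for illustration; they are not prescribed by Lightstep or OpenTracing.

```python
import hashlib

# Hypothetical bucket boundaries: (exclusive upper limit in bytes, label).
SIZE_BUCKETS = ((1_024, "<1KiB"), (65_536, "<64KiB"), (1_048_576, "<1MiB"))


def status_code_group(status_code):
    """Map an HTTP status code such as 404 to its group, such as '4xx'."""
    return f"{status_code // 100}xx"


def error_tags(status_code):
    """Separate caller-caused errors (4xx) from service-caused errors (5xx)."""
    return {
        "client.error": 400 <= status_code < 500,
        "internal.error": 500 <= status_code < 600,
    }


def size_bucket(num_bytes):
    """Return the label of the first bucket whose limit exceeds num_bytes."""
    for limit, label in SIZE_BUCKETS:
        if num_bytes < limit:
            return label
    return ">=1MiB"


def stack_trace_hash(stack_trace):
    """Shorten a stack trace to a stable, searchable 16-character hash."""
    return hashlib.sha256(stack_trace.encode("utf-8")).hexdigest()[:16]


def derived_tags(status_code, response_bytes):
    """Assemble the derived tags for one response into a single dict."""
    tags = {
        "http.status_code": status_code,
        "http.status_code_group": status_code_group(status_code),
        "response.size": response_bytes,
        "response.size_bucket": size_bucket(response_bytes),
    }
    tags.update(error_tags(status_code))
    return tags


print(derived_tags(404, 2_048))
```

Each key/value pair in the returned dict can then be set on the span as a tag. A stack trace hash is only useful for searching if identical failures produce identical text, so volatile details such as memory addresses may need to be stripped before hashing.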

Useful Logs

  • Sanitized payload of a request and a response (clear any personally identifiable information).
  • Events that are occurring within the span, for example, sanitized payload for request, forwarding to <xyz>.
  • Stack trace or exception messages and error messages.
  • State changes, such as when things are returning, processing, or waiting, for example, context deadline exceeded. An operation may run for a few seconds, and a log can add context about what it’s doing or what it’s waiting for.
  • Any additional context. If a user hits a certain flow and that isn’t obvious from the operations, a simple log message can be helpful, for example, “user entered flow x”.
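To illustrate the sanitized payload logs described above, here is a minimal sketch that redacts sensitive fields before a payload is attached to a span log. The SENSITIVE_KEYS set, the "[REDACTED]" placeholder, and both function names are assumptions for illustration; a real service would need its own list of fields that count as personally identifiable information.

```python
# Hypothetical list of field names treated as personally identifiable.
SENSITIVE_KEYS = frozenset({"email", "password", "ssn", "phone", "address"})


def sanitize(payload):
    """Return a copy of payload with sensitive values redacted.

    Recurses into nested dicts and lists; other values pass through unchanged.
    """
    if isinstance(payload, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS else sanitize(value)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [sanitize(item) for item in payload]
    return payload


def request_log_event(payload):
    """Build the key/value fields for a span log of a sanitized request."""
    return {"event": "sanitized payload for request", "payload": sanitize(payload)}


fields = request_log_event(
    {"user": {"email": "a@example.com", "plan": "pro"}, "items": [{"password": "x"}]}
)
print(fields)
```

OpenTracing spans accept a dict of fields like this through their key/value logging method (log_kv in the Python API), so the result can be passed to the span directly.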