LightStep

LightStep [𝑥]PM Documentation


OpenTracing Instrumentation

OpenTracing

LightStep relies on the standardized OpenTracing APIs for instrumentation. In order to get LightStep traces as shown in the Quickstart guide, you will need to instrument your codebase for tracing and connect it to a LightStep tracer.

OpenTracing makes it easy for developers to add (or switch) tracing implementations with a simple configuration change. OpenTracing also offers a lingua franca for open source instrumentation and platform-specific tracing helper libraries. In this scenario, it binds the production application code to LightStep’s production tracing system.

Take some time to get familiar with OpenTracing on the OpenTracing site.

The OpenTracing GitHub repositories include support for many popular languages.

For "Hello World"-type tutorials in Go, Java, Python, and Node.js, see this tutorials repo.

Using LightStep with OpenTracing

Once you have instrumented your code using the libraries above, you will initialize the LightStep tracer (which is not part of OpenTracing) to start seeing traces. See the Quickstart guide for basic instructions on binding a LightStep Tracer to the OpenTracing libraries.

Note: LightStep libraries can be used directly in the cases where an OpenTracing API library is not yet available. The LightStep APIs are a proper superset of the OpenTracing standard.

As a reference, the LightStep GitHub repositories are provided below.

Three Guiding Principles

We offer three guiding principles to get started with your tracing efforts:

1) Prioritize
2) Start at the edges
3) Connect the dots

Prioritize

Stack-rank your most critical operations and build up traces one at a time.

Tracing every detail of a large distributed application would be daunting, but if you take the right approach you will see value from your first few lines of code. LightStep users have the most success when they start with end-to-end traces of their highest-business-value operations.

The following dimensions will help you think about the relative priority of instrumentation targets:

  • Impact on end-users of your application: the closer instrumentation is to end-users and/or your business value, the more meaningful the resulting performance and reliability data will be

  • Widely-used routing and communication packages: homegrown RPC subsystems and routing layers reveal a great deal about application semantics and also play a role in propagation across process boundaries

  • Known areas of unpredictable latency or reliability: adding instrumentation will help to explain and model the variability

  • Known bottlenecks: database calls, inter-region network activity, and so on

Start at the edges

Work from the outside in to get the fastest time-to-value.

Deciding where to begin can be the hardest part. You know your top-priority operation, but where should you start writing code?

Work from the outside-in to get the most critical data first and inform your instrumentation decisions as you discover the critical path of your own system. Whether you are looking at the application as a whole or instrumenting a single service, start by collecting data on the requests you serve and the outbound calls you make to services you depend on.

Application-level

Start as close to the end-user as possible and work your way down the stack (e.g. mobile/web app, or the routing layer of your distributed system).

Service-level

Instrument the request path for the most important operations your service handles, then instrument the client side of RPCs to other services those operations depend on.

Connect the dots

Join spans together to see the full picture of your high-priority operation.

Traces are all about following transactions through an application, not simply monitoring individual processes or operations. To deliver an end-to-end trace of your highest-priority operations, you will need to join spans together by propagating the span context along with the transaction. You will need to connect the dots both within processes and between processes.

Within a process

Context objects

Many languages and frameworks provide some form of context object that can be used to propagate an identifier for a request/transaction/operation. For example, Go has context.Context and Django has template contexts. Where possible, use the approach that is idiomatic for the platform to propagate context throughout a process.

The following Go example uses context.Context to propagate transaction context from parent to child spans:

// Start a new span as a child of the span in the current context
// Returns the span and a new context.Context containing the current span
span, ctx := opentracing.StartSpanFromContext(ctx, "my_operation_name")

Trace Assembly Tags

Sometimes it's convenient to use OpenTracing's Inject and Extract feature to trace across process boundaries; but for whatever reason, sometimes it's not. For the "sometimes it's not" case, LightStep offers an additional correlation mechanism called Trace Assembly Tags.

If you already have access to a transaction id, request id, correlation id, context id, or even a user id, you can use trace assembly (Span) tags to assemble distributed traces without formally Injecting or Extracting across process boundaries.

"guid:*" tags

If you already propagate a de facto transaction id (perhaps referred to as a "request id", a "correlation id", a "trace id", etc) around your distributed system, LightStep can use it to assemble distributed traces. In order to take advantage of this feature, add an OpenTracing tag to your Spans with a key string prefixed by "guid:" and the given transaction/correlation/etc id as the value. For instance, given a requestId member variable, you might write

// Set a guid tag on an opentracing.Span object using the SetTag method and by
// giving the key a name that starts with "guid:".
//
// LightStep will automatically include any other Span (throughout the
// distributed system) that uses this same Tag key and value as part of the
// same distributed trace.
span.SetTag("guid:request_id", requestId)

These "guid:" tags allow you to reuse existing propagation mechanisms to assemble traces in LightStep, and in so doing can greatly reduce the initial integration time for codebases and systems that have not yet instrumented with OpenTracing.

"join:*" tags

If neither conventional OpenTracing Inject/Extract propagation nor "guid:" trace assembly tags are feasible, you may still have an end-user id or other stable pan-transaction identifier on hand at a given code location. In such situations, you may be able to use a "join:" trace assembly tag instead.

A "join:" tag is similar to a "guid:" tag in most ways. The key difference is that Spans tagged with identical "join:" tags will only be assembled as part of the same trace if they transitively overlap in time, too.

A diagram will make this clearer:

[-- Span A: "join:user_id"=42 --]

                                    [-- Span B: "join:user_id"=42 ----]

                [-- Span C: "join:user_id"=42 --]

------------------------------------------------------> time

All three Spans above are assembled into the same trace, as they share a
"join:" tag and (transitively) overlap in time.
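
The rule in the diagram can be sketched as a small interval-merging routine. This is only an illustration of the "same tag, transitively overlapping in time" idea under stated assumptions: the `span` record, the integer tick times, and the `assembleByJoin` function are hypothetical, every span is assumed to use the same "join:" key, spans that merely touch at an endpoint are treated as overlapping, and nothing here describes how LightStep actually assembles traces internally.

```go
package main

import (
	"fmt"
	"sort"
)

// span is a hypothetical record: a name, the value of its "join:" tag, and
// a start/end time expressed in arbitrary integer ticks.
type span struct {
	name       string
	joinValue  string
	start, end int
}

// assembleByJoin groups spans that share a join value and whose time ranges
// overlap transitively. It returns one slice of span names per group.
func assembleByJoin(spans []span) [][]string {
	sorted := append([]span(nil), spans...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].joinValue != sorted[j].joinValue {
			return sorted[i].joinValue < sorted[j].joinValue
		}
		return sorted[i].start < sorted[j].start
	})

	var groups [][]string
	curValue, curEnd := "", 0
	for i, s := range sorted {
		// Start a new group when the tag value changes, or when this span
		// begins after everything in the current group has ended.
		if i == 0 || s.joinValue != curValue || s.start > curEnd {
			groups = append(groups, nil)
			curValue, curEnd = s.joinValue, s.end
		}
		last := len(groups) - 1
		groups[last] = append(groups[last], s.name)
		if s.end > curEnd {
			curEnd = s.end
		}
	}
	return groups
}

func main() {
	groups := assembleByJoin([]span{
		{"A", "42", 0, 30},
		{"B", "42", 40, 70},
		{"C", "42", 15, 50},
		{"D", "42", 90, 100}, // same tag value, but no overlap with A, B, or C
	})
	fmt.Println(groups)
}
```

Spans A and B never overlap directly, but both overlap C, so all three land in one group. Span D carries the same tag value yet starts after the others have ended, so it forms a group of its own.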

Setting the Join Tags in the instrumentation code is straightforward:

// Set a join tag on an opentracing.Span object using the SetTag method
// and giving the key a name that starts with "join:".
//
// This sets the join tag "join:user_id" to a value of 42, and any other Span
// (in the distributed system) that overlaps in time and shares this Tag key
// and value will be included in the same distributed trace.
span.SetTag("join:user_id", 42)

Transitivity

Spans may have multiple "guid:" or "join:" tags, and traces are assembled by taking the union of these assembly hints as well as any conventional OpenTracing Span-to-Span references.
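
One way to picture "taking the union of assembly hints" is a union-find pass over the tags. The sketch below is an assumption-laden illustration, not LightStep's algorithm: the `span` and `hint` types and the `assemble` function are hypothetical, only exact tag key/value matches are considered, and the time-overlap requirement of "join:" tags and OpenTracing references are left out for brevity.

```go
package main

import "fmt"

// span is a hypothetical record carrying any number of assembly tags.
type span struct {
	name string
	tags map[string]string
}

// hint is one tag key/value pair, used as a map key.
type hint struct{ key, value string }

// find returns the representative of i's set, compressing paths as it goes.
func find(parent []int, i int) int {
	for parent[i] != i {
		parent[i] = parent[parent[i]]
		i = parent[i]
	}
	return i
}

// assemble returns, for each span, an id for the trace it lands in. Two
// spans get the same id when a chain of shared hints links them.
func assemble(spans []span) []int {
	parent := make([]int, len(spans))
	for i := range parent {
		parent[i] = i
	}
	firstSeen := map[hint]int{} // hint -> index of the first span carrying it
	for i, s := range spans {
		for k, v := range s.tags {
			h := hint{k, v}
			if j, ok := firstSeen[h]; ok {
				parent[find(parent, i)] = find(parent, j) // union the two sets
			} else {
				firstSeen[h] = i
			}
		}
	}
	ids := make([]int, len(spans))
	for i := range spans {
		ids[i] = find(parent, i)
	}
	return ids
}

func main() {
	spans := []span{
		{"frontend", map[string]string{"guid:request_id": "r-1"}},
		{"gateway", map[string]string{"guid:request_id": "r-1", "guid:order_id": "o-9"}},
		{"billing", map[string]string{"guid:order_id": "o-9"}},
		{"unrelated", map[string]string{"guid:request_id": "r-2"}},
	}
	for i, id := range assemble(spans) {
		fmt.Printf("%s is in trace %d\n", spans[i].name, id)
	}
}
```

Here "frontend" and "billing" share no tag at all, but "gateway" carries both ids and therefore bridges them into a single trace, while "unrelated" stays separate.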

Downsides

Though "guid:" and "join:" tags are powerful and useful in production, conventional OpenTracing Inject/Extract propagation actually encodes more information about Span relationships and will thus give LightStep more information to use when visualizing or analyzing traces. Trace assembly tags are far better than nothing, but conventional OpenTracing propagation is still a best practice when it's a convenient option.


Between processes

OpenTracing Propagation with Inject and Extract

http://opentracing.io/propagation/

The OpenTracing API provides methods to Inject and Extract a SpanContext, which respectively serialize and deserialize the SpanContext to be sent over the wire, e.g. as HTTP headers. Use these methods to propagate your SpanContext when crossing process boundaries.

Using guid or join Tags

Both guid and join Tags allow you to group Spans into traces using existing correlation or context ids you already have on hand in your source code. They function in the same way within a single process as they do across process boundaries. See Trace Assembly Tags above for more detail.


Remember these three principles and you are on your way to better visibility with LightStep.

When you're ready, move on to the Quickstart guide!