Observability gives you the information you need about the health and efficiency of your system. But systems are large and complex, which raises two questions: what should you measure, and where do you start?
Cloud Observability supports OpenTelemetry as the way to get telemetry data (traces, logs, and metrics) from your app as requests travel through its many services and other infrastructure. If OpenTelemetry doesn’t currently support the language you need, you can use OpenTracing for now and then move to OpenTelemetry when it’s ready. If you use OpenTracing in some services and OpenTelemetry in other services, traces will be correctly connected as long as you use B3 context propagation in all services.
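To make the B3 requirement concrete, here is a minimal sketch of what B3 propagation carries between services, using only the Python standard library. The header names (`X-B3-TraceId`, `X-B3-SpanId`, `X-B3-Sampled`) come from the B3 multi-header format; the helper functions and dictionary layout are hypothetical and exist only for illustration. In a real service you would rely on the B3 propagator that ships with your tracing library rather than hand-rolling this.

```python
import secrets


def new_trace_context(sampled=True):
    """Start a new trace: 128-bit trace ID and 64-bit span ID, hex-encoded."""
    return {
        "trace_id": secrets.token_hex(16),
        "span_id": secrets.token_hex(8),
        "sampled": sampled,
    }


def inject_b3(ctx, headers):
    """Copy the current span's identity into outgoing request headers."""
    headers["X-B3-TraceId"] = ctx["trace_id"]
    headers["X-B3-SpanId"] = ctx["span_id"]
    headers["X-B3-Sampled"] = "1" if ctx["sampled"] else "0"
    return headers


def extract_b3(headers):
    """Read the caller's identity from incoming headers (names are case-insensitive)."""
    lowered = {key.lower(): value for key, value in headers.items()}
    trace_id = lowered.get("x-b3-traceid")
    span_id = lowered.get("x-b3-spanid")
    if not trace_id or not span_id:
        return None  # no upstream context: the service would start a new trace
    return {
        "trace_id": trace_id,
        "parent_span_id": span_id,
        "sampled": lowered.get("x-b3-sampled") == "1",
    }


def start_child(parent):
    """Create a span in the receiving service that joins the caller's trace."""
    return {
        "trace_id": parent["trace_id"],
        "span_id": secrets.token_hex(8),
        "parent_span_id": parent["parent_span_id"],
        "sampled": parent["sampled"],
    }
```

Because both sides agree on the same header names, a span created in the receiving service shares the caller's trace ID and points back to the caller's span, regardless of which tracing library each side uses.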
Maybe you want to ensure your most valuable business operations have full coverage so you can monitor them and find issues quickly. Or you want to ensure your most frequently called API is always performant. Or maybe you know you have a latency issue with a particular request and you need to dig in and find the cause. These scenarios all call for instrumentation that traverses the full stack, giving you a view into a request as it travels through your system.
Here are some common use cases to prioritize:
When translating these priorities to code changes, it’s helpful to consider the following:
MongoDB
These centralized communication hubs reveal a great deal about how the application behaves, for example in the Cloud Observability Service Diagram.
Start at the framework with installers that add the tracing logic for you. You can get fairly wide coverage without touching existing code. Auto-installers are available for many languages.
If you’ve already instrumented using OpenTelemetry Collectors, it’s easy to get that instrumentation into Cloud Observability.
If OpenTelemetry doesn’t currently support the language you need, you can use OpenTracing and then move to OpenTelemetry when it’s ready. In the meantime, check out our Quickstarts for your language.
If you use Istio and Envoy, auto-instrument your service mesh.
With your framework instrumented, you can immediately see traces in Cloud Observability. To get a finer-grained view into details important to your business, add manual instrumentation to supplement the baseline auto-instrumentation.
Once you have some instrumentation in place, be sure to check out its IQ score. Cloud Observability can analyze your instrumentation and recommend ways to improve it. Watch your score go up as you continue to add tracing capabilities to your system.
Much of the IQ score is based on the presence of specific attributes that Cloud Observability needs for efficient issue mitigation. If there is metadata that you’d like all services to report to Cloud Observability (like a customer ID or Kubernetes region), you can register the corresponding attributes and Cloud Observability will check for those when determining the IQ score.
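Before you register attributes, it can help to check locally which services already report them. The sketch below is not how Cloud Observability computes the IQ score; it is a hypothetical pre-check over span data you have exported yourself. The attribute names (`customer.id`, `k8s.region`) and the span dictionary shape are assumptions made for illustration.

```python
# Hypothetical attributes you want every service to report.
REQUIRED_ATTRIBUTES = frozenset({"customer.id", "k8s.region"})


def missing_attributes(spans, required=REQUIRED_ATTRIBUTES):
    """Map each service to the required attributes that any of its spans lack."""
    gaps = {}
    for span in spans:
        missing = set(required) - set(span["attributes"])
        if missing:
            gaps.setdefault(span["service"], set()).update(missing)
    return gaps


# Example input: one fully tagged span and one that lacks the region.
sample_spans = [
    {"service": "checkout", "attributes": {"customer.id": "c-1", "k8s.region": "us-east"}},
    {"service": "inventory", "attributes": {"customer.id": "c-2"}},
]
```

Calling `missing_attributes(sample_spans)` reports that only the `inventory` service is missing `k8s.region`, which tells you where to add the attribute.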
Once you’ve measured the coverage you get from instrumenting the framework, you’ll likely find specific places in your system that you need to better understand. At this point you’ll want to turn to the OpenTelemetry SDKs and APIs.
Continue adding spans to those areas and repeat the IQ test until you are satisfied with the coverage. Be sure to add attributes and events to get the full breadth of observability.
Updated Nov 1, 2019