Cloud Observability Glossary

Active Service Bundle

A bundle of products that consists of an Active Service, Span Data, Active Time Series, and other items, with limits detailed on a Cloud Observability order form.

Active Time Series (ATS)

A set of timestamped measurements that share a metric name and a unique set of tag keys and values. Other vendors refer to this concept as “custom metric” or “metric time series.” In Cloud Observability, the cost of a time series is prorated hourly. See also distribution.
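
As a rough illustration of this definition, the sketch below counts time series by treating each distinct combination of metric name and tag key/value pairs as one series. The function names and sample data are hypothetical and are not part of any Cloud Observability API.

```python
from typing import Iterable, Mapping, Tuple

SeriesKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def series_key(metric_name: str, tags: Mapping[str, str]) -> SeriesKey:
    """Identify a series by its metric name plus its sorted tag key/value pairs."""
    return metric_name, tuple(sorted(tags.items()))


def count_time_series(points: Iterable[Tuple[str, Mapping[str, str]]]) -> int:
    """Count the distinct (metric name, tag set) combinations in the input."""
    return len({series_key(name, tags) for name, tags in points})


# Hypothetical measurements: (metric name, tags)
points = [
    ("http.requests", {"host": "a", "region": "us"}),
    ("http.requests", {"region": "us", "host": "a"}),  # same series, tag order differs
    ("http.requests", {"host": "b", "region": "us"}),  # new tag value, new series
    ("cpu.usage", {"host": "a"}),                      # new metric name, new series
]

print(count_time_series(points))  # 3
```

Each new tag value creates another series, which is why high-cardinality tags can increase the number of time series quickly.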

aggregation

The process of converting irregularly spaced metrics stored in the database into evenly spaced data points on a chart. Cloud Observability calculates the spacing (output period) to ensure each chart contains approximately 120 data points. Aggregation methods include Latest, Delta, and Rate.
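
The spacing calculation can be approximated as follows. This is a minimal sketch that assumes the output period is simply the chart's time range divided by the target of about 120 points; the product's actual calculation may round or adjust the result differently, and the function name is hypothetical.

```python
def output_period_seconds(range_seconds: float, target_points: int = 120) -> float:
    """Approximate spacing between charted points for a given time range.

    Assumption: the period is the time range divided evenly by the target
    number of points (about 120 according to the definition above).
    """
    if range_seconds <= 0 or target_points <= 0:
        raise ValueError("range_seconds and target_points must be positive")
    return range_seconds / target_points


print(output_period_seconds(3600))   # one hour -> 30.0 seconds per point
print(output_period_seconds(86400))  # one day  -> 720.0 seconds per point
```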

alert

A notification that a value being monitored has gone outside of an assigned threshold for an assigned duration.

Alerts list

A sortable table in dashboards displaying alerts and their statuses.

Anomaly alert template

A template used to create alerts based on anomalies in the data. Anomalies are detected when current values deviate from baseline averages using standard deviations. This template is suitable for identifying instances where current data behaves differently from past data.
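
The idea of comparing a current value to a baseline average using standard deviations can be sketched as below. This is an illustrative check only, not the template's actual algorithm; the function name, the threshold of two standard deviations, and the sample values are assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence


def is_anomalous(baseline: Sequence[float], current: float, num_std: float = 2.0) -> bool:
    """Return True when `current` is more than `num_std` standard deviations
    away from the baseline average (threshold is an assumed default)."""
    avg = mean(baseline)
    std = pstdev(baseline)  # population standard deviation of the baseline
    if std == 0:
        # A perfectly flat baseline: any different value counts as a deviation.
        return current != avg
    return abs(current - avg) > num_std * std


baseline = [100, 102, 98, 101, 99]  # hypothetical past values, mean 100
print(is_anomalous(baseline, 110))  # True: far outside the usual spread
print(is_anomalous(baseline, 101))  # False: within the usual spread
```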

API Key

A key required for authentication to send data to Cloud Observability using its API.

area chart

Chart displaying a shaded area for the value.

attribute (tag/label)

Key/value pairs associated with telemetry data, allowing the creation of metadata. Attributes can include information like customer ID, environment details, or component version. The OpenTelemetry spec defines standard attributes known as semantic conventions for consistency in reporting.

auto-saved span data

Automatically saved span data in notebooks, retained for the length of the data retention policy. It allows users to revisit and explore the data from when the chart was created.

bar chart

Chart displaying bars for the values at given time points.

baseline

Segment of a chart where performance was stable. Baselines are used to compare performance to the time when a deviation occurred.

big number chart

A chart displaying an aggregated value over time as a number.

body field

The field where Cloud Observability tokenizes log data for fast filtering.

cardinality

The number of elements in a set or other grouping, as a property of that grouping.

cardinality explosion

Situation where the number of unique values becomes too high for the system to handle.

Change alert template

A template used to create alerts based on changes in data over time. It allows setting thresholds by comparing current data to historical data from specific time intervals such as minutes, hours, days, or weeks.

Cloud Observability platform

The central processing hub where Microsatellites and/or the OpenTelemetry Collector send telemetry data from instrumentation. The platform analyzes data, builds complete traces and dynamic service diagrams, deduces correlations, monitors for changes in performance, and durably stores data with configurable retention.

Cloud Observability Quickstart

A paid engagement led by our observability experts who partner closely with your team to ensure a fast, effective, and robust onboarding experience. This may include optional sessions on observability and instrumentation best practices alongside Cloud Observability product training.

Cloud Observability web UI

The user interface of Cloud Observability used for monitoring distributed systems, detecting changes, and mitigating issues. It provides alerts and dashboards for logs, metrics, and traces, utilizing a Unified Query Builder and Unified Query Language for consistent querying across different telemetry types.

cold storage

Storing logs in a less expensive, non-queryable format.

cold to hot storage

Process of moving data from less frequently accessed storage (cold) to more accessible storage (hot).

Collector

An OpenTelemetry component that provides a vendor-agnostic agent for receiving, processing, and exporting telemetry data.

Collector configuration file

A configuration file for the OpenTelemetry Collector, comprising five main sections: Receivers, Processors, Exporters, Extensions, and Service. It defines how the Collector handles telemetry data.

Collector Health dashboard

A pre-built dashboard designed to monitor the health of OpenTelemetry Collectors running in Kubernetes. Best suited for users who have followed the Quickstart guide to install and run Collectors in Kubernetes.

Composite alert template

A template used to create alerts by combining multiple alerts into a single alert. Composite alerts allow alerting on various conditions and can combine log, metric, and trace alerts for comprehensive system health monitoring.

Configuration Items (CIs)

Data sources reporting to Cloud Observability mapped in the ServiceNow CMDB.

context

Information essential for building a trace tree in distributed tracing. It includes the parent span ID and trace ID, which propagate from span to span, establishing relationships in the trace tree. OpenTelemetry uses headers to transmit context between spans.
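
To make the header-based propagation concrete, the sketch below parses a W3C Trace Context `traceparent` header, the format OpenTelemetry's default propagator uses, into the trace ID and parent span ID described above. The parser is simplified (it ignores version-specific rules and the `tracestate` header), and the class and function names are hypothetical.

```python
import re
from typing import NamedTuple

# traceparent layout: version-traceid-parentid-flags, all lowercase hex.
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class ParentContext(NamedTuple):
    trace_id: str
    parent_span_id: str
    sampled: bool


def parse_traceparent(header: str) -> ParentContext:
    """Extract the trace ID, parent span ID, and sampled flag from a header value."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        raise ValueError(f"malformed traceparent header: {header!r}")
    _version, trace_id, parent_id, flags = match.groups()
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        raise ValueError("all-zero trace or span IDs are invalid")
    # The lowest bit of the flags byte marks the trace as sampled.
    return ParentContext(trace_id, parent_id, bool(int(flags, 16) & 0x01))


# Example value taken from the W3C Trace Context specification.
ctx = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
print(ctx.trace_id)        # 4bf92f3577b34da6a3ce929d0e0e4736
print(ctx.parent_span_id)  # 00f067aa0ba902b7
print(ctx.sampled)         # True
```

A service receiving this header would create its own span with a new span ID, reuse the trace ID, and record the parent span ID, which is how the parent-child relationships in the trace tree are established.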

correlation feature

Determines the service that emitted a metric or span, searches for performance changes on operations from that service at the same time as the deviation, and then uses trace data to find underlying services, operations, and tags that have a high probability of contributing to the deviation.

critical path

The time each operation in a request was actually active during the request. In the Trace view, the critical path is shown as a black line that travels down and back up the stack to help identify bottlenecks in the overall transaction.

cumulative

Metric kind that adds a value to the last value, counting the total number of things at a specific point in time. Cumulative metrics use the same ‘start’ timestamp for each value. An example is the total number of accumulated web page hits.

Custom alert template

A flexible alert template that allows users to create and configure alerts based on custom criteria and queries. It offers the most flexibility among alert templates.

data retention

Period of time for which different types of data are stored and queryable in Cloud Observability.

Data Retention policy

The policy for an organization that determines its configurable data retention.

delta

A metric kind that shows how values change from one reporting period (point on the graph) to the next. The number of HTTP requests is an example of a delta metric: you want to see whether requests are going up or down, and by how much.
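
The relationship between the cumulative and delta metric kinds can be illustrated by converting a series of running totals into per-period changes. This is a minimal sketch with a hypothetical function name; treating a drop in the total as a counter reset is an assumption, not documented product behavior.

```python
from typing import List, Sequence


def cumulative_to_delta(totals: Sequence[float]) -> List[float]:
    """Convert running totals into the change observed in each reporting period."""
    deltas = []
    for previous, current in zip(totals, totals[1:]):
        if current >= previous:
            deltas.append(current - previous)
        else:
            # Assumption: a lower total means the counter restarted from zero,
            # so the whole current value is counted as the change.
            deltas.append(current)
    return deltas


# Hypothetical cumulative page-hit totals reported at four points in time.
print(cumulative_to_delta([100, 130, 130, 190]))  # [30, 0, 60]
print(cumulative_to_delta([100, 130, 20]))        # [30, 20] after a reset
```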

dependency map

A graphical representation of dependencies among reporting services (and optionally operations).

deployment markers

Indicators in charts showing deployment events.

Deployments tab

A tab in the Service Directory used to monitor how deployments specifically affect the performance of services. It provides deployment markers on charts for correlation with possible regressions.

Developer Mode

A feature that uses a local developer satellite so that an application developer can quickly see results of local instrumentation without needing to deploy to production.

developer satellite

A locally run satellite used by Developer Mode.

distribution

A metric type that returns a set of values for a point in time and performs aggregation on those values before charting the points. Cloud Observability supports percentile aggregation and can display the 50th, 95th, 99th, and 99.9th percentiles.

error rate

The proportion of requests that result in errors.

event marker

Visual indicators displayed on dashboards that denote significant events that may have influenced the performance or behavior of a system. These events can include activities such as code commits, merges, deployments, or feature flag changes.

exemplars

Example spans associated with high latency, error, or operation rate in trace data charts. Users can view the associated traces to investigate and understand the cause of deviations.

Explorer

The view in Cloud Observability for doing live queries whose results are shown in a histogram.

exporters

Components in the Collector responsible for sending data from the Collector to configured destinations. They can be push- or pull-based.

extensions

Optional components in the Collector that do not primarily involve processing telemetry data. Examples include Collector health monitoring and service discovery.

final time aggregation

A final aggregation operation to smooth out and summarize data over a specified time period.

gateway

A mode in which the Collector operates as a centralized data collection gateway, serving as a standalone service.

gauge

A metric kind that represents an observed value at a specific point in time or over a specified range of time. Temperature readings are an example of a gauge metric. CPU usage is another example; you want to know exactly how much of an available resource is being used at a given point in time.

global filters

Filters that quickly modify the displayed data on a dashboard based on attributes. Unlike template variables, global filters do not persist, affecting only the current view.

Global Service dashboard

A single dashboard template that utilizes a variable for the service, allowing users to view information for all services from a unified dashboard. Users can change the service variable to focus on specific services.

group-by

Grouping metrics by an attribute, creating separate data points for each value of the attribute. Provides insight into the distribution of the metric across different attribute values.
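
A small sketch of grouping follows. It assumes each data point carries an attribute dictionary and a numeric value, and it uses a sum to combine the points in each group; the function name, the choice of sum, and the sample data are illustrative assumptions only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Tuple


def group_by(points: Iterable[Tuple[Mapping[str, str], float]], attribute: str) -> Dict[str, float]:
    """Sum point values separately for each value of the chosen attribute."""
    groups: Dict[str, float] = defaultdict(float)
    for attributes, value in points:
        # Points that lack the attribute are collected under a placeholder key.
        groups[attributes.get(attribute, "<missing>")] += value
    return dict(groups)


# Hypothetical request counts tagged with region and host.
points = [
    ({"region": "us", "host": "a"}, 10.0),
    ({"region": "us", "host": "b"}, 5.0),
    ({"region": "eu", "host": "c"}, 7.0),
]

print(group_by(points, "region"))  # {'us': 15.0, 'eu': 7.0}
```

Grouping by `host` instead would produce three groups, one per host, which shows how the choice of attribute determines the number of resulting series.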

heatmap

Chart displaying the distribution of values over time using color saturation.

hot storage

Storing logs in a format that allows for querying.

Identity Provider (IdP)

Identity Providers (IdPs) store and authenticate user identities. Cloud Observability integrates with several IdPs: Azure AD, Google, Okta, and OneLogin.

individual service dashboard

A dashboard available for each service reporting to Cloud Observability, offering similar information as the Global Service Dashboard. However, it is scoped to a single service, providing additional details about individual operations.

inferred service

An external service, library, or dependency that hasn’t been instrumented, like a database or a third-party API. Cloud Observability recognizes these as leaf spans (the request can’t continue to another service) and reports on their error counts, span counts, and average latencies.

instrumentation

The process of creating and collecting telemetry data to describe distributed traces, logs, and metrics in a system. It involves placing code in microservices, functions, web, mobile clients, or other parts of a system to capture relevant data. OpenTelemetry facilitates quick and automatic instrumentation.

Instrumentation Quality

A score in Cloud Observability that evaluates the quality of tracing instrumentation on services. It assesses whether instrumentation crosses services, includes interior spans, contains attributes for correlated latency, uses attributes for deployments, and includes hostname attributes for different environments.

Just-in-Time (JIT) provisioning

With Just-in-Time (JIT) provisioning, the first time a user logs into Cloud Observability with their IdP credentials, Cloud Observability creates their account with a default role.

Key Operations

Ingress (server) operations on a service whose performance is crucial to the health of the system. These operations are strategically chosen, and their performance is displayed in the Service Health view on the Deployments tab.

labels

Identifiers you can add to dashboards and alerts for further management and issue resolution. Labels also facilitate correlation with specific CIs in the ServiceNow CMDB.

latency

Time interval or delay when one component is waiting for another component. Specifically, the duration of time for a data packet to travel from one component to another (one-way) or the time it takes for the packet to make a round-trip, minus the time spent at the destination (round-trip).

latest

An aggregation operator for gauge-type metrics that displays the last reported value for a time period.

line chart

Chart that displays a line connecting charted values.

linear scale

Scale where the value difference between points on the Y axis is the same.

live tail

A feature for real-time viewing and troubleshooting of logs.

log

Structured or unstructured lines of text that are emitted by an application in response to some event in the code. They are stored by timestamp in a time series database, allowing for search, filtering, live tailing, and connections with traces and other telemetry types.

log rehydration

A process in Cloud Observability that moves older logs from cold to hot storage, where they can be viewed and explored. Keeping logs in cold storage until they are needed helps lower storage costs.

log retention

The duration for which logs are stored.

logarithmic (log) scale

Scale that allows an exponential representation of points on the Y axis (base 10 by default).

logs list

List that can be added to a dashboard that displays individual logs matching certain conditions.

logs tab

The feature that displays a log’s fields in JSON and tabular format.

max

Aggregation method computing the highest point in the data. For example, the max value for [10, 15, 50] is 50.

mean

Aggregation method computing the average value of the time series in the input window. For example, given the values of [10, 15, 50] the mean is 25.

metric kinds

Categories that define how time series data is visualized. Cloud Observability supports gauges, deltas, and cumulative metrics. Metrics must be labeled with their kind before ingestion.

metric retention

The duration for which metric time series data is retained.

metric types

Representation of the value type reported, including Integer, Float, and Distribution. Distribution types return a set of values for a point in time and support percentile aggregation.

metrics

Quantitative measurements or data points that provide insights into the performance and behavior of a system. Cloud Observability ingests metrics as time series and stores them in a time series database (TSDB). Metrics can be correlated with trace data, linking deviations in metric charts to performance issues.

Microsatellite

A component that collects the telemetry data generated by instrumented clients and servers, and then sends that data to the Cloud Observability SaaS platform for analysis. Microsatellites can be deployed in your environment as horizontally scalable instances.

min

An aggregation method computing the minimum value of the time series in the input window. For example, the min value for [10, 15, 50] is 10.

Monthly Active Service

A service that has reported telemetry to Cloud Observability within the previous rolling 30-day period.

monthly charges

Costs associated with Cloud Observability.

notebook

A feature in Cloud Observability that allows users to query and visualize logs, metrics, and traces in one place. Notebooks facilitate ad-hoc queries, collaboration, and record-keeping during investigations, postmortems, or runbooks.

notebook snapshot

Read-only snapshots of a notebook created at a specific point in time. Snapshots are saved for the length of the data retention policy and can be shared with others.

notification destination

A third-party tool or destination where notifications are sent when an alert is triggered. Users can configure notification rules to specify how notifications are sent.

notification frequency

An optional setting that determines how often notifications are sent until the issue is resolved. Users can configure the rate at which notifications are sent.

notification rules

Configurations that define how notifications are sent when the alert criteria are met. Users can choose between receiving a single notification for multiple occurrences or separate notifications for each occurrence during the input window.

observability

The concept of measuring the internal state of a system only by its outputs. For distributed systems, such as microservices, serverless, service meshes, etc., these outputs are telemetry data: logs, metrics, and traces.

OpenTelemetry

An open source observability framework for cloud-native software. OpenTelemetry is a collection of tools, APIs, and SDKs. OpenTelemetry can be used to instrument, generate, collect, and export telemetry data (metrics, logs, and traces) for analysis in order to understand software’s performance and behavior.

OpenTelemetry “Constellation” Consulting

A paid engagement led by our OpenTelemetry and OpenTracing experts, who partner closely with your team to lead training and educational sessions on instrumentation (including instrumentation of common libraries and frameworks) and on creating a modern telemetry pipeline with the OpenTelemetry Collector. They also hold office hours for two (2) weeks to help accelerate your OpenTelemetry efforts.

OpenTelemetry “Galaxy” Consulting

A paid engagement led by our OpenTelemetry and OpenTracing experts which includes all of the sessions and work from the “Constellation” package as well as hands-on-keyboard paired instrumentation of your services, frameworks, and other abstractions, any telemetry transformation in the OpenTelemetry Collector, and metrics data ingestion.

OpenTelemetry Collector

A vendor-agnostic agent designed to receive, process, and export telemetry data. It eliminates the need to manage multiple agents, leading to improved scalability. The OpenTelemetry Collector supports various open-source observability data formats (e.g., Jaeger, Prometheus, Fluent Bit) and can send data to one or more open-source or commercial back-ends.

OpenTelemetry Operator

An automated tool recommended for Kubernetes environments to manage a fleet of OpenTelemetry Collectors. It automates cluster metrics gathering, handles auto-scaling, and more.

operation

The work represented by a span.

ordered list

A visualization that lists query results in descending or ascending order of value using a bar graph.

organization

The entity that Cloud Observability is installed for. Organizations contain projects.

Outlier alert template

A template used to create alerts based on outliers in metric data. Outliers are identified when data behaves differently from other data in its group. This template specifically works with metrics.

p99

The 99th percentile of a (histogram) distribution. This represents the upper bound of latencies experienced by 99% of traces. In other words, 99% of the traces are experiencing the p99 latency or less.
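
For a concrete sense of the definition, the sketch below computes a percentile from raw latency values using the nearest-rank method. That method is one common convention chosen here for illustration; it is an assumption and may differ from how Cloud Observability derives percentiles from distributions. The function name and sample data are hypothetical.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of
    all values at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    # Round before taking the ceiling to avoid floating-point noise such as
    # 999.0000000000001 pushing the rank up by one.
    rank = math.ceil(round(pct * len(ordered) / 100, 9))
    return ordered[rank - 1]


latencies_ms = list(range(1, 101))   # hypothetical latencies: 1 ms .. 100 ms
print(percentile(latencies_ms, 99))  # 99: 99% of requests took 99 ms or less
print(percentile(latencies_ms, 50))  # 50
```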

pre-built dashboards

Ready-made dashboards provided by Cloud Observability, enabling users to start monitoring their data without the need for custom queries. These dashboards can be customized as per the user’s requirements.

Premium Success & Support

A professional services offering for the duration of a customer’s term with SLAs or deliverables detailed on a Cloud Observability order form or statement of work.

processors

Optional components in the Collector that can transform data after it is received by the receivers and before it is sent to the exporters.

Professional Services Hours

A limited engagement professional services offering capped at the number of hours specified on a Cloud Observability order form or statement of work.

project

Encapsulates all Cloud Observability data for a particular environment such as dev or production, spanning team boundaries, languages, clients, servers, and physical locations. Projects roll up into an organization.

Public Microsatellite pool

A Cloud Observability-managed shared pool of Microsatellites.

rate

An aggregation operator telling how many events occurred over a time period. It indicates the speed of occurrences, providing information on the rate of change.

receivers

Components in the Collector responsible for receiving data. They can be push- or pull-based.

rehydration

The process of moving logs from cold to hot storage to enable querying of older logs.

role-based access control (RBAC)

Lets you manage what users can do in Cloud Observability through roles and permissions.

rolling input window

Specified time period used for aggregating data points in charts.

root span

Span that starts a trace.

scatter plot

Chart that displays spans and traces from spans_sample queries.

service

A single component of a software application (often a microservice) that provides specific functionality, such as an authentication or checkout service. You can have an unlimited number of deployed service instances.

service (Collector configuration)

The section in the Collector configuration file that enables the components configured in other sections. If a component isn’t in the Service section, it isn’t used.
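
The rule that a component is unused unless the Service section references it can be checked mechanically. The sketch below models a Collector configuration as a Python dictionary (as it might look after loading the YAML file) and lists components that are defined but never enabled. The helper name and the sample configuration are hypothetical; the layout with `service.pipelines` and `service.extensions` follows the usual OpenTelemetry Collector file structure.

```python
from typing import Dict, List


def unused_components(config: dict) -> Dict[str, List[str]]:
    """Return components that are configured but not enabled in the service section."""
    service = config.get("service") or {}
    enabled = {
        "receivers": set(),
        "processors": set(),
        "exporters": set(),
        "extensions": set(service.get("extensions") or []),
    }
    # Each pipeline names the receivers, processors, and exporters it uses.
    for pipeline in (service.get("pipelines") or {}).values():
        for section in ("receivers", "processors", "exporters"):
            enabled[section].update(pipeline.get(section) or [])

    unused = {}
    for section, names in enabled.items():
        leftover = set(config.get(section) or {}) - names
        if leftover:
            unused[section] = sorted(leftover)
    return unused


# Illustrative configuration: the prometheus receiver is defined but no pipeline uses it.
config = {
    "receivers": {"otlp": {}, "prometheus": {}},
    "processors": {"batch": {}},
    "exporters": {"otlp": {}},
    "extensions": {"health_check": {}},
    "service": {
        "extensions": ["health_check"],
        "pipelines": {
            "traces": {"receivers": ["otlp"], "processors": ["batch"], "exporters": ["otlp"]},
        },
    },
}

print(unused_components(config))  # {'receivers': ['prometheus']}
```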

Service diagram

Shows a map of the service hierarchy, as well as latency and errors. It provides a visual, interactive, and hierarchical representation of a system’s behavior for a given point in time, based on the query shown in Explorer.

Service Directory

A view in Cloud Observability that provides a consolidated overview of all services and their operations. Users can search, filter, mark favorite services, and view performance data, making it a centralized location to monitor service health.

Service Graph Connector (SGC)

A tool that allows you to map data sources currently reporting to Cloud Observability as configuration items (CIs) in the ServiceNow CMDB. Visibility to these sources in the CMDB allows you to monitor your cloud-based infrastructure in the same way as your on-premises systems.

Service health dashboards

Dashboards specifically designed to provide an overview of the performance of a service, including its operations and associated metrics. Two types are available: Global Service Dashboard and Individual Service Dashboard.

Service health panel

A visualization for dashboards that displays performance SLIs emitted from services without using a query.

service level agreement (SLA)

Contract between a service provider (either internal or external) and the end user that defines the level of availability (usually a customer-facing SLO) expected from the service provider. SLAs are output-based in that their purpose is specifically to define what the customer will receive.

service level indicator (SLI)

The tool(s) that continuously measure your app’s performance and determine when it is breaking an SLO.

service level objective (SLO)

The contract of performance you make internally that, when broken, alerts you to the problem so that you have time to address it before an SLA is broken. SLIs measure performance against SLOs.

Single sign-on (SSO)

With SSO, your existing Identity Provider (IdP), such as Okta, authenticates users. Those users can then log in to Cloud Observability with their IdP credentials (for example, their Okta credentials). Cloud Observability supports SSO with OAuth2 and SSO with Security Assertion Markup Language (SAML).

snapshot (Explorer)

Persisted view of a query’s results made in Explorer. Every query result has an associated snapshot that can be revisited and shared at any time.

span

Represents a named and timed unit of work in the system that has a start time and a duration. Spans that are from the same request are built into a trace and can include nested spans. Spans often include attribute and event objects that describe and contextualize the work being performed.

span context

Represents span state that must propagate to child spans and across process boundaries (for example, a trace_id, span_id, sample_id tuple).

span data

The total amount of data comprising all the spans sent to Cloud Observability. An average span is about 500 bytes of data, most of which consists of the key:value attributes that are added to the span.

span events

Time-stamped information associated with spans, offering insights into specific moments during a span’s duration. Span events can include event names and optional structured data payloads, adding context to individual traces.

span ID

Unique identifier for a span in a trace.

span samples

The dots or error triangles representing spans in charts.

sparkline charts

Charts that show recent performance for Service Level Indicators (SLIs) like latency, error rate, and operation rate. Sparkline charts are used in the Service Health view to display top changes in performance.

Standard Support

Cloud Observability-provided consulting for software integration and service setup. Tickets and consulting time resulting from violations of Cloud Observability’s Service Level Agreement do not count against monthly limits.

Stream

Retained span query that continuously collects latency, error rate, and operation rate data. Streams persist data for a longer period than the default retention window.

sum of all values

The total of all points in the data. For example, the sum for [10, 15, 50] is 75.
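
The worked numbers used in the max, mean, min, and sum entries of this glossary can be reproduced with a few lines of Python. This is only an arithmetic check of the examples, using the standard library rather than any Cloud Observability API.

```python
from statistics import mean

window = [10, 15, 50]  # the sample values used in the glossary examples

aggregations = {
    "max": max(window),    # highest point in the data
    "mean": mean(window),  # average value in the input window
    "min": min(window),    # lowest point in the data
    "sum": sum(window),    # total of all points
}

for name, value in aggregations.items():
    print(f"{name}: {value}")
# max: 50
# mean: 25
# min: 10
# sum: 75
```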

Symlog scale

Scale allowing the chart to display negative values on the Y axis.

telemetry

All the data collected and analyzed to help determine the health of your system. Typical telemetry data includes traces, logs, and metrics.

template variables

Variables used in dashboards that allow users to dynamically filter panels based on attribute values returned by data. These variables provide a flexible way to view data without creating separate panels for each value.

time picker

A tool that allows users to change the time range for data displayed on a dashboard. Users can set relative time ranges or enter custom time ranges.

time series

A sequence of measurements over time, representing data points at specific intervals.

trace

The path of an individual transaction or request as it flows through a distributed system. Multiple spans are pieced together to create a trace.

Trace view

A flame graph of a trace (each service a different color), and below that, each span in the hierarchy, allowing you to see the parent-child relationship of all the spans in the trace.

tracer

OpenTelemetry component responsible for creating spans, attributes, events, and context. Tracers send this data to the Collector (or to Microsatellites), where it is processed and exported, enabling trace visualization and metadata analysis.

Traces list

Visualization showing sample spans from traces returned by queries.

Unified Query Builder

A tool for creating queries for logs, metrics, and spans. It provides a graphical interface for building queries and supports the Unified Query Language (UQL).

Unified Query Language (UQL)

A query language used in Cloud Observability for building queries to retrieve logs, metrics, and spans. It can be used in the Unified Query Builder or written manually in the query editor.

Workflow Link

Customized links in Cloud Observability’s Trace View page linking to external resources.

Updated Mar 18, 2024