Once you’ve integrated with AWS CloudWatch, you have access to all metrics for OpenSearch, a managed service that makes it easy to deploy, operate, and scale OpenSearch clusters in the AWS Cloud.
See all AWS integrations.
To verify metrics are reporting, search for the metrics in the Metric details section of the Settings page.
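
When searching, it can help to know how a CloudWatch metric name relates to the name listed on this page. Several table descriptions refer to CloudWatch names such as `KNNCacheCapacityReached`, `SQLRequestCount`, and `IndexingRate`, while the rows themselves use `aws.es.knn_cache_capacity_reached`, `aws.es.sql_request_count`, and `aws.es.indexing_rate`. The sketch below assumes that the ingested name is the snake_case form of the CloudWatch name with an `aws.es.` prefix. That rule is inferred from the table, not documented, and the helper `to_ingested_name` is hypothetical, so confirm each result against the table before relying on it.

```python
import re

# Assumption: this conversion rule is inferred from the names in the table on
# this page. It is not a documented guarantee, and some rows may not follow it.
# An underscore goes between a lowercase letter or digit and an uppercase
# letter, and between an acronym and the capitalized word that follows it.
_WORD_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def to_ingested_name(cloudwatch_name: str, prefix: str = "aws.es.") -> str:
    """Guess the ingested metric name for a CloudWatch metric name (hypothetical helper)."""
    return prefix + _WORD_BOUNDARY.sub("_", cloudwatch_name).lower()


if __name__ == "__main__":
    for name in ("KNNCacheCapacityReached", "SQLRequestCount", "IndexingRate"):
        print(name, "->", to_ingested_name(name))
```

The acronym handling matters for names such as `KNNCacheCapacityReached`, where a plain "underscore before every capital" rule would split `KNN` into single letters.
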
The following table shows the OpenSearch metrics ingested by Cloud Observability.
Metric Name | Unit | Description |
---|---|---|
aws.es.cluster_status_green | integer | 1 indicates that all index shards are allocated to nodes in the cluster. Relevant statistics: Maximum |
aws.es.cluster_status_yellow | integer | 1 indicates that the primary shards for all indexes are allocated to nodes in the cluster, but replica shards for at least one index are not. Relevant statistics: Maximum |
aws.es.cluster_status_red | integer | 1 indicates that the primary and replica shards for at least one index are not allocated to nodes in the cluster. For more information, see Red cluster status. Relevant statistics: Maximum |
aws.es.shards_active | count | The number of all active primary and replica shards. Relevant statistics: Maximum, Sum |
aws.es.shards_unassigned | count | The number of shards that are not allocated to nodes in the cluster. Relevant statistics: Maximum, Sum |
aws.es.shards_delayed_unassigned | count | The number of all shards whose node allocation has been delayed by the timeout settings. Relevant statistics: Maximum, Sum |
aws.es.shards_active_primary | count | The total number of active primary shards. Relevant statistics: Maximum, Sum |
aws.es.shards_initializing | count | The number of shards that are initializing. Relevant statistics: Sum |
aws.es.shards_relocating | count | The number of relocating shards. Relevant statistics: Sum |
aws.es.nodes | count | The total number of nodes in the OpenSearch Service cluster, including dedicated master nodes and UltraWarm nodes. Relevant statistics: Maximum |
aws.es.searchable_documents | count | The total number of searchable documents in all data nodes inside the cluster. Relevant statistics: Minimum, Maximum, Average |
aws.es.deleted_documents | count | The total number of documents marked for deletion in all data nodes in the cluster. Relevant statistics: Minimum, Maximum, Average |
aws.es.cpu_utilization | percentage | The percentage of CPU usage for data nodes in the cluster. Maximum shows the node with the highest CPU usage. Average represents all nodes in the cluster. This metric is also available for individual nodes. Relevant statistics: Maximum, Average |
aws.es.free_storage_space | mebibyte | The free space on data nodes in the cluster. Sum shows the total free space for the cluster. Minimum and Maximum show the nodes with the least and most free space, respectively. This metric is also available for individual nodes. Relevant statistics: Minimum, Maximum, Average, Sum |
aws.es.cluster_used_space | mebibyte | The total used space for the cluster. You must leave the period at one minute to get an accurate value. Relevant statistics: Minimum, Maximum |
aws.es.cluster_index_writes_blocked | integer | 0 if the cluster is accepting requests, 1 if it is blocking requests. Relevant statistics: Maximum |
aws.es.jvm_memory_pressure | percentage | The maximum percentage of the Java heap used across all data nodes in the cluster. Relevant statistics: Maximum |
aws.es.old_gen_jvm_memory_pressure | percentage | The maximum percentage of the Java heap used for the "old generation" across all data nodes in the cluster. This metric is also available at the node level. Relevant statistics: Maximum |
aws.es.automated_snapshot_failure | count | The number of failed automated snapshots for the cluster. 1 shows that no automated snapshot was taken for the domain in the previous 36 hours. Relevant statistics: Minimum, Maximum |
aws.es.cpu_credit_balance | count | The remaining CPU credits available for T2 data nodes inside the cluster. Relevant statistics: Minimum |
aws.es.open_search_dashboards_healthy_nodes (previously kibana_healthy_nodes) | count | A health check for OpenSearch Dashboards. A value of 1 for the minimum, maximum, and average means that Dashboards is working as expected. Relevant statistics: Minimum, Maximum, Average |
aws.es.kibana_reporting_failed_request_sys_err_count | count | The number of requests to generate OpenSearch Dashboards reports that failed due to server problems or feature limitations. Relevant statistics: Sum |
aws.es.kibana_reporting_failed_request_user_err_count | count | The number of requests to generate OpenSearch Dashboards reports that failed due to client issues. Relevant statistics: Sum |
aws.es.kibana_reporting_request_count | count | The number of all requests to generate OpenSearch Dashboards reports. Relevant statistics: Sum |
aws.es.kibana_reporting_success_count | count | The number of successful requests to generate OpenSearch Dashboards reports. Relevant statistics: Sum |
aws.es.kms_key_error | count | 1 shows that the AWS KMS key used to encrypt data at rest has been disabled and needs to be re-enabled. Relevant statistics: Minimum, Maximum |
aws.es.kms_key_inaccessible | integer | 1 indicates that the AWS KMS key used to encrypt data at rest has been deleted or its grants were revoked. This metric is available only for domains that encrypt data at rest. Relevant statistics: Minimum, Maximum |
aws.es.invalid_host_header_requests | count | The number of HTTP requests with an invalid host header. Relevant statistics: Sum |
aws.es.open_search_requests (previously elasticsearch_requests) | count | The number of requests made to the OpenSearch cluster. Relevant statistics: Sum |
aws.es.2xx,_3xx,_4xx,_5xx | count | The number of responses with the given HTTP response code (2xx, 3xx, 4xx, 5xx). Relevant statistics: Sum |
aws.es.throughput_throttle | integer | 1 indicates that some requests were throttled within the selected timeframe; 0 indicates normal behavior. Relevant statistics: Minimum, Maximum |
aws.es.master_cpu_utilization | percentage | The maximum percentage of CPU resources used by the dedicated master nodes. Relevant statistics: Maximum |
aws.es.master_jvm_memory_pressure | percentage | The maximum percentage of the Java heap used by all dedicated master nodes in the cluster. Relevant statistics: Maximum |
aws.es.master_old_gen_jvm_memory_pressure | percentage | The maximum percentage of the Java heap used by the "old generation" per master node. Relevant statistics: Maximum |
aws.es.master_cpu_credit_balance | count | The remaining CPU credits available for T2 dedicated master nodes in the cluster. Relevant statistics: Minimum |
aws.es.master_reachable_from_node | integer | A health check for MasterNotDiscovered exceptions. 1 represents normal behavior; 0 shows that /_cluster/health/ is failing. Relevant statistics: Minimum |
aws.es.master_sys_memory_utilization | percentage | The percentage of the master node's memory in use. Relevant statistics: Maximum |
aws.es.read_latency | second | The latency, in seconds, for read operations on EBS volumes. This metric is also available for individual nodes. Relevant statistics: Minimum, Maximum, Average |
aws.es.write_latency | second | The latency, in seconds, for write operations on EBS volumes. This metric is also available for individual nodes. Relevant statistics: Minimum, Maximum, Average |
aws.es.read_throughput | byte | The throughput, in bytes per second, for read operations on EBS volumes. This metric is also available for individual nodes. Relevant statistics: Minimum, Maximum, Average |
aws.es.write_throughput | byte | The throughput, in bytes per second, for write operations on EBS volumes. This metric is also available for individual nodes. Relevant statistics: Minimum, Maximum, Average |
aws.es.disk_queue_depth | count | The number of pending input and output (I/O) requests for an EBS volume. Relevant statistics: Minimum, Maximum, Average |
aws.es.read_iops | count | The number of input and output (I/O) operations per second for read operations on EBS volumes. This metric is also available for individual nodes. Relevant statistics: Minimum, Maximum, Average |
aws.es.write_iops | count | The number of input and output (I/O) operations per second for write operations on EBS volumes. This metric is also available for individual nodes. Relevant statistics: Minimum, Maximum, Average |
aws.es.burst_balance | percentage | The percentage of input and output (I/O) credits remaining in the burst bucket for an EBS volume. Relevant statistics: Minimum, Maximum, Average |
aws.es.indexing_latency | millisecond | The average time that it takes a shard to complete an indexing operation. Relevant node statistics: Average. Relevant cluster statistics: Average, Maximum |
aws.es.indexing_rate | count | The number of indexing operations per minute. Relevant node statistics: Average. Relevant cluster statistics: Average, Maximum, Sum |
aws.es.search_latency | millisecond | The average time that it takes a shard to complete a search operation. Relevant node statistics: Average. Relevant cluster statistics: Average, Maximum |
aws.es.search_rate | count | The total number of search requests per minute for all shards on a data node. Relevant node statistics: Average. Relevant cluster statistics: Average, Maximum, Sum |
aws.es.segment_count | count | The number of segments on a data node. Relevant node statistics: Maximum, Average. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.sys_memory_utilization | percentage | The percentage of the instance's memory that is in use. Relevant node statistics: Minimum, Maximum, Average. Relevant cluster statistics: Minimum, Maximum, Average |
aws.es.jvmgc_young_collection_count | count | The number of times that "young generation" garbage collection has run. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.jvmgc_young_collection_time | millisecond | The amount of time that the cluster has spent performing "young generation" garbage collection. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.jvmgc_old_collection_count | count | The number of times that "old generation" garbage collection has run. In a cluster with sufficient resources, this number should remain small and grow infrequently. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.jvmgc_old_collection_time | millisecond | The amount of time that the cluster has spent performing "old generation" garbage collection. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.open_search_dashboards_concurrent_connections (previously kibana_concurrent_connections) | count | The number of active concurrent connections to OpenSearch Dashboards. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.open_search_dashboards_healthy_node (previously kibana_healthy_node) | integer | A health check for the individual OpenSearch Dashboards node. 1 indicates normal behavior; 0 indicates that Dashboards is inaccessible. Relevant node statistics: Minimum. Relevant cluster statistics: Minimum, Maximum, Average |
aws.es.open_search_dashboards_heap_total (previously kibana_heap_total) | mebibyte | The amount of heap memory allocated to OpenSearch Dashboards. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.open_search_dashboards_heap_used (previously kibana_heap_used) | mebibyte | The absolute amount of heap memory used by OpenSearch Dashboards. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.open_search_dashboards_heap_utilization (previously kibana_heap_utilization) | percentage | The maximum percentage of available heap memory used by OpenSearch Dashboards. Relevant node statistics: Maximum. Relevant cluster statistics: Minimum, Maximum, Average |
aws.es.open_search_dashboards_os_1_minute_load (previously kibana_os_1_minute_load) | count | The one-minute CPU load average for OpenSearch Dashboards. Ideally, the load should stay below 1.00. Relevant node statistics: Average. Relevant cluster statistics: Average, Maximum |
aws.es.open_search_dashboards_request_total (previously kibana_request_total) | count | The total number of HTTP calls to OpenSearch Dashboards. Relevant node statistics: Sum. Relevant cluster statistics: Sum |
aws.es.open_search_dashboards_response_times_max_in_millis (previously kibana_response_times_max_in_millis) | millisecond | The maximum OpenSearch Dashboards response time. Relevant node statistics: Maximum. Relevant cluster statistics: Maximum, Average |
aws.es.threadpool_force_merge_queue | count | The number of queued tasks in the force merge thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.threadpool_force_merge_rejected | count | The number of rejected tasks in the force merge thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Sum |
aws.es.threadpool_force_merge_threads | count | The number of threads in the force merge thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Average, Sum |
aws.es.threadpool_index_queue | count | The number of queued tasks in the index thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.threadpool_index_rejected | count | The number of rejected tasks in the index thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Sum |
aws.es.threadpool_index_threads | count | The number of threads in the index thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Average, Sum |
aws.es.threadpool_search_queue | count | The number of queued tasks in the search thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.threadpool_search_rejected | count | The number of rejected tasks in the search thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Sum |
aws.es.threadpool_search_threads | count | The number of threads in the search thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Average, Sum |
aws.es.threadpoolsql_worker_queue | count | The number of queued tasks in the SQL search thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.threadpoolsql_worker_rejected | count | The number of rejected tasks in the SQL search thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Sum |
aws.es.threadpoolsql_worker_threads | count | The number of threads in the SQL search thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Average, Sum |
aws.es.threadpool_bulk_queue | count | The number of queued tasks in the bulk thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.threadpool_bulk_rejected | count | The number of rejected tasks in the bulk thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Sum |
aws.es.threadpool_bulk_threads | count | The number of threads in the bulk thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Average, Sum |
aws.es.threadpool_write_threads | count | The number of threads in the write thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Average, Sum |
aws.es.threadpool_write_queue | count | The number of queued tasks in the write thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Average, Sum |
aws.es.threadpool_write_rejected | count | The number of rejected tasks in the write thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Average, Sum |
aws.es.coordinating_write_rejected | count | The total number of rejections that happened on the coordinating node due to indexing pressure since the last OpenSearch Service process startup. Relevant node statistics: Maximum. Relevant cluster statistics: Average, Sum |
aws.es.primary_write_rejected | count | The total number of rejections that happened on the primary shards due to indexing pressure since the last OpenSearch Service process startup. Relevant node statistics: Maximum. Relevant cluster statistics: Average, Sum |
aws.es.replica_write_rejected | count | The total number of rejections that happened on the replica shards due to indexing pressure since the last OpenSearch Service process startup. Relevant node statistics: Maximum. Relevant cluster statistics: Average, Sum |
aws.es.warm_cpu_utilization | percentage | The percentage of CPU usage for UltraWarm nodes in the cluster. Maximum shows the node with the highest CPU usage. Average represents all UltraWarm nodes in the cluster. This metric is also available for individual UltraWarm nodes. Relevant statistics: Maximum, Average |
aws.es.warm_free_storage_space | mebibyte | The amount of free warm storage space. Because UltraWarm uses Amazon S3 rather than attached disks, Sum is the only relevant statistic. You must leave the period at one minute to get an accurate value. Relevant statistics: Sum |
aws.es.warm_searchable_documents | count | The total number of searchable documents across all warm indexes in the cluster. You must leave the period at one minute to get an accurate value. Relevant statistics: Sum |
aws.es.warm_search_latency | millisecond | The average time that it takes a shard on an UltraWarm node to complete a search operation. Relevant node statistics: Average. Relevant cluster statistics: Average, Maximum |
aws.es.warm_search_rate | count | The total number of search requests per minute for all shards on an UltraWarm node. A single call to the _search API might return results from many different shards. If five of these shards are on one node, the node would report 5 for this metric, even though the client only made one request. Relevant node statistics: Average. Relevant cluster statistics: Average, Maximum, Sum |
aws.es.warm_storage_space_utilization | mebibyte | The total amount of warm storage space that the cluster is using. Relevant statistics: Maximum |
aws.es.hot_storage_space_utilization | mebibyte | The total amount of hot storage space that the cluster is using. Relevant statistics: Maximum |
aws.es.warm_sys_memory_utilization | percentage | The percentage of the warm node's memory that is in use. Relevant statistics: Maximum |
aws.es.hot_to_warm_migration_queue_size | count | The number of indexes currently waiting to migrate from hot to warm storage. Relevant statistics: Maximum |
aws.es.warm_to_hot_migration_queue_size | count | The number of indexes currently waiting to migrate from warm to hot storage. Relevant statistics: Maximum |
aws.es.hot_to_warm_migration_failure_count | count | The total number of failed hot to warm migrations. Relevant statistics: Sum |
aws.es.hot_to_warm_migration_force_merge_latency | second | The average latency of the force merge stage of the migration process. Relevant statistics: Average |
aws.es.hot_to_warm_migration_snapshot_latency | second | The average latency of the snapshot stage of the migration process. Relevant statistics: Average |
aws.es.hot_to_warm_migration_processing_latency | second | The average latency of successful hot to warm migrations, not including time spent in the queue. Relevant statistics: Average |
aws.es.hot_to_warm_migration_success_count | count | The total number of successful hot to warm migrations. Relevant statistics: Sum |
aws.es.hot_to_warm_migration_success_latency | second | The average latency of successful hot to warm migrations, including time spent in the queue. Relevant statistics: Average |
aws.es.warm_threadpool_search_threads | count | The number of threads in the UltraWarm search thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Average, Sum |
aws.es.warm_threadpool_search_rejected | count | The number of rejected tasks in the UltraWarm search thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Sum |
aws.es.warm_threadpool_search_queue | count | The number of queued tasks in the UltraWarm search thread pool. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.warm_jvm_memory_pressure | percentage | The maximum percentage of the Java heap used for the UltraWarm nodes. Relevant statistics: Maximum |
aws.es.warm_old_gen_jvm_memory_pressure | percentage | The maximum percentage of the Java heap used for the "old generation" per UltraWarm node. Relevant statistics: Maximum |
aws.es.warm_jvmgc_young_collection_count | count | The number of times that "young generation" garbage collection has run on UltraWarm nodes. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.warm_jvmgc_young_collection_time | millisecond | The amount of time that the cluster has spent performing "young generation" garbage collection on UltraWarm nodes. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.warm_jvmgc_old_collection_count | count | The number of times that "old generation" garbage collection has run on UltraWarm nodes. Relevant node statistics: Maximum. Relevant cluster statistics: Sum, Maximum, Average |
aws.es.cold_storage_space_utilization | mebibyte | The total amount of cold storage space that the cluster is using. Relevant statistics: Maximum |
aws.es.cold_to_warm_migration_failure_count | count | The total number of failed cold to warm migrations. Relevant statistics: Sum |
aws.es.cold_to_warm_migration_latency | second | The amount of time for successful cold to warm migrations to complete. Relevant statistics: Average |
aws.es.cold_to_warm_migration_queue_size | count | The number of indexes currently waiting to migrate from cold to warm storage. Relevant statistics: Maximum |
aws.es.cold_to_warm_migration_success_count | count | The total number of successful cold to warm migrations. Relevant statistics: Sum |
aws.es.warm_to_cold_migration_failure_count | count | The total number of failed warm to cold migrations. Relevant statistics: Sum |
aws.es.warm_to_cold_migration_latency | second | The amount of time for successful warm to cold migrations to complete. Relevant statistics: Average |
aws.es.warm_to_cold_migration_queue_size | count | The number of indexes currently waiting to migrate from warm to cold storage. Relevant statistics: Maximum |
aws.es.warm_to_cold_migration_success_count | count | The total number of successful warm to cold migrations. Relevant statistics: Sum |
aws.es.alerting_degraded | integer | 1 indicates that either the alerting index is red or one or more nodes is not on schedule, 0 shows normal behavior. Relevant statistics: Maximum |
aws.es.alerting_index_exists | integer | 1 shows the .opensearch-alerting-config index exists, 0 indicates it does not. Until you use the alerting feature for the first time, this value remains 0. Relevant statistics: Maximum |
aws.es.alerting_index_status_green | integer | The health of the index. 1 means green, 0 shows that the index either doesn't exist or isn't green. Relevant statistics: Maximum |
aws.es.alerting_index_status_red | integer | The health of the index. 1 means red, 0 indicates that the index either doesn't exist or isn't red. Relevant statistics: Maximum |
aws.es.alerting_index_status_yellow | integer | The health of the index. 1 means yellow, 0 indicates that the index either doesn't exist or isn't yellow. Relevant statistics: Maximum |
aws.es.alerting_nodes_not_on_schedule | integer | 1 means some jobs are not running on schedule, 0 means that all alerting jobs are running on schedule or no alerting jobs exist. Relevant statistics: Maximum |
aws.es.alerting_nodes_on_schedule | integer | 1 means that all alerting jobs are running on schedule or that no alerting jobs exist, 0 means some jobs are not running on schedule. Relevant statistics: Maximum |
aws.es.alerting_scheduled_job_enabled | integer | 1 means that the opensearch.scheduled_jobs.enabled cluster setting is true, 0 means it is false and scheduled jobs are disabled. Relevant statistics: Maximum |
aws.es.ad_plugin_unhealthy | integer | 1 means that the anomaly detection plugin is not functioning properly, either because of a high number of failures or because one of the indexes that it uses is red, 0 indicates the plugin is working as expected. Relevant statistics: Maximum |
aws.es.ad_execute_request_count | count | The number of requests to detect anomalies. Relevant statistics: Sum |
aws.es.ad_execute_failure_count | count | The number of failed requests to detect anomalies. Relevant statistics: Sum |
aws.es.adhc_execute_failure_count | count | The number of failed requests to detect anomalies for high cardinality detectors. Relevant statistics: Sum |
aws.es.adhc_execute_request_count | count | The number of requests to detect anomalies for high cardinality detectors. Relevant statistics: Sum |
aws.es.ad_anomaly_results_index_status_index_exists | integer | 1 means that the index that the .opensearch-anomaly-results alias points to exists. Until you use anomaly detection for the first time, this value remains 0. Relevant statistics: Maximum |
aws.es.ad_anomaly_results_index_status_red | integer | 1 means that the index that the .opensearch-anomaly-results alias points to is red, 0 means it is not. Until you use anomaly detection for the first time, this value remains 0. Relevant statistics: Maximum |
aws.es.ad_anomaly_detectors_index_status_index_exists | integer | 1 means that the .opensearch-anomaly-detectors index exists, 0 means it does not. Until you use anomaly detection for the first time, this value remains 0. Relevant statistics: Maximum |
aws.es.ad_anomaly_detectors_index_status_red | integer | 1 means that the .opensearch-anomaly-detectors index is red, 0 means it is not. Until you use anomaly detection for the first time, this value remains 0. Relevant statistics: Maximum |
aws.es.ad_models_checkpoint_index_status_index_exists | integer | 1 means that the .opensearch-anomaly-checkpoints index exists, 0 means it does not. Until you use anomaly detection for the first time, this value remains 0. Relevant statistics: Maximum |
aws.es.ad_models_checkpoint_index_status_red | integer | 1 means that the .opensearch-anomaly-checkpoints index is red, 0 means it is not. Until you use anomaly detection for the first time, this value remains 0. Relevant statistics: Maximum |
aws.es.asynchronous_search_submission_rate | count | The number of asynchronous searches submitted in the last minute. |
aws.es.asynchronous_search_initialized_rate | count | The number of asynchronous searches initialized in the last minute. |
aws.es.asynchronous_search_running_current | count | The number of asynchronous searches currently running. |
aws.es.asynchronous_search_completion_rate | count | The number of asynchronous searches successfully completed in the last minute. |
aws.es.asynchronous_search_failure_rate | count | The number of asynchronous searches that completed and failed in the last minute. |
aws.es.asynchronous_search_persist_rate | count | The number of asynchronous searches that persisted in the last minute. |
aws.es.asynchronous_search_persist_failed_rate | count | The number of asynchronous searches that failed to persist in the last minute. |
aws.es.asynchronous_search_rejected | count | The total number of asynchronous searches rejected since the node up time. |
aws.es.asynchronous_search_cancelled | count | The total number of asynchronous searches cancelled since the node up time. |
aws.es.asynchronous_search_max_running_time | second | The duration of the longest-running asynchronous search on a node in the last minute. |
aws.es.asynchronous_search_store_health | count | The health of the store in the persisted index (RED/non-RED) in the last minute. |
aws.es.asynchronous_search_store_size | count | The size of the system index across all shards in the last minute. |
aws.es.asynchronous_search_stored_response_count | count | The number of stored responses in the system index in the last minute. |
aws.es.sql_failed_request_count_by_cus_err | count | The number of requests to the _sql API that failed due to a client issue. Relevant statistics: Sum |
aws.es.sql_failed_request_count_by_sys_err | count | The number of requests to the _sql API that failed due to a server problem or feature limitation. Relevant statistics: Sum |
aws.es.sql_request_count | count | The number of requests to the _sql API. Relevant statistics: Sum |
aws.es.sql_default_cursor_request_count | count | Similar to SQLRequestCount but only counts pagination requests. Relevant statistics: Sum |
aws.es.sql_unhealthy | integer | 1 means that, in response to certain requests, the SQL plugin is returning 5xx response codes or passing invalid query DSL to OpenSearch, 0 means no recent failures. Relevant statistics: Maximum |
aws.es.knn_cache_capacity_reached | count | Per-node metric for whether cache capacity has been reached. This metric is only relevant to approximate k-NN search. Relevant statistics: Maximum |
aws.es.knn_circuit_breaker_triggered | count | Per-cluster metric for whether the circuit breaker is triggered. If any nodes return a value of 1 for KNNCacheCapacityReached, this value will also return 1. This metric is only relevant to approximate k-NN search. Relevant statistics: Maximum |
aws.es.knn_eviction_count | count | Per-node metric for the number of graphs that have been evicted from the cache due to memory constraints or idle time. Explicit evictions that occur because of index deletion are not counted. This metric is only relevant to approximate k-NN search. Relevant statistics: Sum |
aws.es.knn_graph_index_errors | count | Per-node metric for the number of requests to add the knn_vector field of a document to a graph that produced an error. Relevant statistics: Sum |
aws.es.knn_graph_index_requests | count | Per-node metric for the number of requests to add the knn_vector field of a document to a graph. Relevant statistics: Sum |
aws.es.knn_graph_memory_usage | kilobyte | Per-node metric for the current cache size (the total size of all graphs in memory). This metric is only relevant to approximate k-NN search. Relevant statistics: Average |
aws.es.knn_graph_query_errors | count | Per-node metric for the number of graph queries that produced an error. Relevant statistics: Sum |
aws.es.knn_graph_query_requests | count | Per-node metric for the number of graph queries. Relevant statistics: Sum |
aws.es.knn_hit_count | count | Per-node metric for the number of cache hits. A cache hit occurs when a user queries a graph that is already loaded into memory. This metric is only relevant to approximate k-NN search. Relevant statistics: Sum |
aws.es.knn_load_exception_count | count | Per-node metric for the number of times an exception occurred while trying to load a graph into the cache. This metric is only relevant to approximate k-NN search. Relevant statistics: Sum |
aws.es.knn_load_success_count | count | Per-node metric for the number of times the plugin successfully loaded a graph into the cache. This metric is only relevant to approximate k-NN search. Relevant statistics: Sum |
aws.es.knn_miss_count | count | Per-node metric for the number of cache misses. A cache miss occurs when a user queries a graph that is not yet loaded into memory. This metric is only relevant to approximate k-NN search. Relevant statistics: Sum |
aws.es.knn_query_requests | count | Per-node metric for the number of query requests the k-NN plugin received. Relevant statistics: Sum |
aws.es.knn_script_compilation_errors | count | Per-node metric for the number of errors during script compilation. This statistic is only relevant to k-NN score script search. Relevant statistics: Sum |
aws.es.knn_script_compilations | count | Per-node metric for the number of times the k-NN script has been compiled. This value should usually be 1 or 0, but if the cache containing the compiled scripts is filled, the k-NN script might be recompiled. This statistic is only relevant to k-NN score script search. Relevant statistics: Sum |
aws.es.knn_script_query_errors | count | Per-node metric for the number of errors during script queries. This statistic is only relevant to k-NN score script search. Relevant statistics: Sum |
aws.es.knn_script_query_requests | count | Per-node metric for the total number of script queries. This statistic is only relevant to k-NN score script search. Relevant statistics: Sum |
aws.es.knn_total_load_time | nanosecond | The time that k-NN has taken to load graphs into the cache. This metric is only relevant to approximate k-NN search. Relevant statistics: Sum |
aws.es.cross_cluster_outbound_connections | count | Number of connected nodes. If your response includes one or more skipped domains, use this metric to trace any unhealthy connections. If this number drops to 0, then the connection is unhealthy. |
aws.es.cross_cluster_outbound_requests | count | Number of search requests sent to the destination domain. |
aws.es.cross_cluster_inbound_requests | count | Number of incoming connection requests received from the source domain. |
aws.es.replication_rate | count | The average rate of replication operations per second. This metric is similar to the IndexingRate metric. |
aws.es.leader_check_point | count | For a specific connection, the sum of leader checkpoint values across all replicating indexes. |
aws.es.follower_check_point | count | For a specific connection, the sum of follower checkpoint values across all replicating indexes. |
aws.es.replication_num_syncing_indices | count | The number of indexes that have a replication status of SYNCING. |
aws.es.replication_num_bootstrapping_indices | count | The number of indexes that have a replication status of BOOTSTRAPPING. |
aws.es.replication_num_paused_indices | count | The number of indexes that have a replication status of PAUSED. |
aws.es.replication_num_failed_indices | count | The number of indexes that have a replication status of FAILED. |
aws.es.auto_follow_num_success_start_replication | count | The number of follower indexes that have been successfully created by a replication rule for a specific connection. |
aws.es.auto_follow_num_failed_start_replication | count | The number of follower indexes that failed to be created by a replication rule when there was a matching pattern. |
aws.es.auto_follow_leader_call_failure | integer | Whether there have been any failed queries from the follower index to the leader index to pull new data. 1 means that there have been 1 or more failed calls in the last minute. |
aws.es.ltr_request_total_count | count | Total count of ranking requests. |
aws.es.ltr_request_error_count | count | Total count of unsuccessful requests. |
aws.es.ltr_status_red | count | Tracks if one of the indexes needed to run the plugin is red. |
aws.es.ltr_memory_usage | count | Total memory used by the plugin. |
aws.es.ltr_feature_memory_usage_in_bytes | byte | The amount of memory used by Learning to Rank feature fields. |
aws.es.ltr_featureset_memory_usage_in_bytes | byte | The amount of memory used by all Learning to Rank feature sets. |
aws.es.ltr_model_memory_usage_in_bytes | byte | The amount of memory used by all Learning to Rank models. |
aws.es.ppl_failed_request_count_by_cus_err | count | The number of requests to the _ppl API that failed due to a client issue. |
aws.es.ppl_failed_request_count_by_sys_err | count | The number of requests to the _ppl API that failed due to a server problem or feature limitation. |
aws.es.ppl_request_count | count | The number of requests to the _ppl API. |
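
As one way to read the three cluster status rows at the top of the table together, the sketch below folds the latest Maximum value of each `aws.es.cluster_status_*` metric into a single label. The function name `cluster_status` and the plain-dictionary input are hypothetical. Fetching the values depends on your query tooling, which this page does not cover.

```python
from typing import Mapping


def cluster_status(latest_max: Mapping[str, float]) -> str:
    """Return 'red', 'yellow', 'green', or 'unknown' from the three status flags.

    `latest_max` is assumed to map metric names from the table to their most
    recent Maximum value (1 when the state applies, otherwise 0 or missing).
    """
    # Check the most severe state first so that it wins if several flags are set.
    for color in ("red", "yellow", "green"):
        if latest_max.get(f"aws.es.cluster_status_{color}", 0) >= 1:
            return color
    # No flag is set, for example because metrics are not reporting yet.
    return "unknown"


if __name__ == "__main__":
    sample = {
        "aws.es.cluster_status_green": 0,
        "aws.es.cluster_status_yellow": 1,
        "aws.es.cluster_status_red": 0,
    }
    print(cluster_status(sample))
```

Returning `unknown` when no flag is set keeps a missing data point from being mistaken for a healthy cluster.
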
Updated Jan 13, 2023