Once you’ve integrated with AWS CloudWatch, you have access to all metrics for Elastic Map Reduce, which provides petabyte-scale data processing, analytics, and machine learning using frameworks like Apache Spark, Apache Hive, and Presto.
You can create a pre-built dashboard for this integration either when you add the integration to Cloud Observability or later from the Dashboard list view.
To verify that metrics are reporting, search for them in the Metric details section of the Settings page.
The following table shows the Elastic Map Reduce metrics ingested by Cloud Observability.
Metric Name | Unit | Description |
---|---|---|
aws.emr.is_idle | boolean | Indicates a cluster is accruing charges, but not performing work. |
aws.emr.container_allocated | count | The number of resource containers allocated by the ResourceManager. |
aws.emr.container_reserved | count | The number of reserved containers. |
aws.emr.container_pending | count | The number of unallocated containers in the queue. |
aws.emr.container_pending_ratio | count | The ratio of pending containers to allocated containers. |
aws.emr.apps_completed | count | The number of applications submitted to YARN that have completed. |
aws.emr.apps_failed | count | The number of applications submitted to YARN that failed to complete. |
aws.emr.apps_killed | count | The number of applications submitted to YARN that have been killed. |
aws.emr.apps_pending | count | The number of applications submitted to YARN in a pending state. |
aws.emr.apps_running | count | The number of applications submitted to YARN that are running. |
aws.emr.apps_submitted | count | The total number of applications submitted to YARN. |
aws.emr.core_nodes_running | count | The number of core nodes working. |
aws.emr.core_nodes_pending | count | The number of core nodes waiting to be assigned. |
aws.emr.live_data_nodes | percent | The percentage of data nodes that are receiving work from Hadoop. |
aws.emr.mr_total_nodes | count | The number of nodes presently available to MapReduce jobs. Equivalent to YARN metric mapred.resourcemanager.TotalNodes. |
aws.emr.mr_active_nodes | count | The number of nodes presently running MapReduce tasks or jobs. Equivalent to YARN metric mapred.resourcemanager.NoOfActiveNodes. |
aws.emr.mr_lost_nodes | count | The number of nodes allocated to MapReduce that have been marked in a LOST state. Equivalent to YARN metric mapred.resourcemanager.NoOfLostNodes. |
aws.emr.mr_unhealthy_nodes | count | The number of nodes available to MapReduce jobs marked in an UNHEALTHY state. Equivalent to YARN metric mapred.resourcemanager.NoOfUnhealthyNodes. |
aws.emr.mr_decommissioned_nodes | count | The number of nodes allocated to MapReduce applications that have been marked in a DECOMMISSIONED state. Equivalent to YARN metric mapred.resourcemanager.NoOfDecommissionedNodes. |
aws.emr.mr_rebooted_nodes | count | The number of nodes available to MapReduce that have been rebooted and marked in a REBOOTED state. Equivalent to YARN metric mapred.resourcemanager.NoOfRebootedNodes. |
aws.emr.multi_master_instance_group_nodes_running | count | The number of running master nodes. |
aws.emr.multi_master_instance_group_nodes_running_percentage | percent | The percentage of master nodes that are running, relative to the requested master node instance count. |
aws.emr.multi_master_instance_group_nodes_requested | count | The number of requested master nodes. |
aws.emr.s_3_bytes_written | count | The number of bytes written to Amazon S3. This metric aggregates MapReduce jobs only, and does not apply to other workloads on Amazon EMR. |
aws.emr.s_3_bytes_read | count | The number of bytes read from Amazon S3. This metric aggregates MapReduce jobs only, and does not apply to other workloads on Amazon EMR. |
aws.emr.hdfs_utilization | percent | The percentage of HDFS storage currently used. |
aws.emr.hdfs_bytes_read | count | The number of bytes read from HDFS. This metric aggregates MapReduce jobs only, and does not apply to other workloads on Amazon EMR. |
aws.emr.hdfs_bytes_written | count | The number of bytes written to HDFS. |
aws.emr.missing_blocks | count | The number of blocks in which HDFS has no replicas. |
aws.emr.total_load | count | The total current number of readers and writers reported by all DataNodes in a cluster. |
aws.emr.total_units_requested_total_nodes_requested_total_vcpu_requested | count | The target total number of units/nodes/vCPUs in a cluster as determined by managed scaling. |
aws.emr.total_units_running_total_nodes_running_total_vcpu_running | count | The number of units/nodes/vCPUs available in a running cluster. |
aws.emr.core_units_requested_core_nodes_requested_core_vcpu_requested | count | The target number of _core_ units/nodes/vCPUs in a cluster as determined by managed scaling. |
aws.emr.core_units_running_core_nodes_running_core_vcpu_running | count | The current number of _core_ units/nodes/vCPUs running in a cluster. |
aws.emr.task_units_requested_task_nodes_requested_task_vcpu_requested | count | The target number of _task_ units/nodes/vCPUs in a cluster as determined by managed scaling. |
aws.emr.task_units_running_task_nodes_running_task_vcpu_running | count | The current number of _task_ units/nodes/vCPUs running in a cluster. |
aws.emr.total_notebook_kernels | count | The total number of running and idle notebook kernels on the cluster. |
aws.emr.auto_termination_is_cluster_idle | count | Indicates whether the cluster is in use. A value of 0 means the cluster is actively used by YARN, HDFS, a notebook, or an on-cluster UI (e.g., Spark History Server). A value of 1 means the cluster is idle. |
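
Two of the metrics in the table are derived from other counts: `aws.emr.container_pending_ratio` and `aws.emr.multi_master_instance_group_nodes_running_percentage`. The Python sketch below recomputes them from raw values, which may help when sanity-checking a chart. The function names and sample numbers are hypothetical, and returning `None` for a zero denominator is an assumption made for this illustration, not documented behavior of Amazon EMR or Cloud Observability.

```python
from typing import Optional


def container_pending_ratio(pending: int, allocated: int) -> Optional[float]:
    """Ratio of pending containers to allocated containers.

    Mirrors the description of aws.emr.container_pending_ratio.
    """
    if allocated == 0:
        # Assumption: treat the ratio as undefined when nothing is allocated.
        return None
    return pending / allocated


def master_nodes_running_percentage(running: int, requested: int) -> Optional[float]:
    """Running master nodes as a percentage of requested master nodes.

    Mirrors the description of
    aws.emr.multi_master_instance_group_nodes_running_percentage.
    """
    if requested == 0:
        # Assumption: no requested master nodes means no meaningful percentage.
        return None
    return 100.0 * running / requested


# Hypothetical sample values, not taken from a real cluster.
ratio = container_pending_ratio(pending=6, allocated=24)
percentage = master_nodes_running_percentage(running=3, requested=3)
print(ratio, percentage)
```

A pending ratio that keeps growing suggests the cluster has fewer allocated containers than the queued work needs, while a master node percentage below 100 suggests that not all requested master nodes are running.
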
Updated Dec 9, 2022