Once you’ve integrated with AWS CloudWatch, you have access to all metrics for SageMaker Model Building Pipelines, a tool for building machine learning pipelines that take advantage of direct SageMaker integration.

See this table for all available AWS integrations.

To verify metrics are reporting, search for the metrics in the Metric details section of the Project Settings page.

The following table shows the SageMaker Model Building Pipelines metrics ingested by Lightstep.

| Metric Name | Unit | Description |
| --- | --- | --- |
| aws.sagemaker.invocation_4xx_errors | none | The number of InvokeEndpoint requests with a 4xx HTTP response code. Valid statistics: Average, Sum |
| aws.sagemaker.invocation_5xx_errors | none | The number of InvokeEndpoint requests with a 5xx HTTP response code. Valid statistics: Average, Sum |
| aws.sagemaker.invocations | none | The total number of InvokeEndpoint calls sent to a model endpoint. Valid statistics: Sum, Sample Count |
| aws.sagemaker.invocations_per_instance | none | The number of endpoint calls sent to a model, normalized by InstanceCount in each ProductionVariant. Valid statistics: Sum |
| aws.sagemaker.model_latency | microseconds | The amount of time the model or models took to create a response. Valid statistics: Average, Sum, Min, Max, Sample Count |
| aws.sagemaker.overhead_latency | microseconds | The overhead that SageMaker adds to the time taken to respond to a client request, measured from the time SageMaker receives the request until it returns a response to the client, minus the ModelLatency. Valid statistics: Average, Sum, Min, Max, Sample Count |
| aws.sagemaker.container_latency | microseconds | The time it took for an Inference Pipelines container to respond, as viewed from SageMaker. ContainerLatency includes the time it took to send the request, to fetch the response from the model's container, and to complete inference in the container. Valid statistics: Average, Sum, Min, Max, Sample Count |
| aws.sagemaker.cpu_utilization | percent | The percentage of CPU units used by the containers running on an instance. |
| aws.sagemaker.memory_utilization | percent | The percentage of memory used by the containers running on an instance. |
| aws.sagemaker.gpu_utilization | percent | The percentage of GPU units used by the containers running on an instance. |
| aws.sagemaker.gpu_memory_utilization | percent | The percentage of GPU memory used by the containers running on an instance. |
| aws.sagemaker.disk_utilization | percent | The percentage of disk space used by the containers running on an instance. |
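
The invocation and latency metrics are often combined into derived values such as an error rate or an end-to-end latency. The following Python sketch shows one way to do that arithmetic on values you have already retrieved. The `EndpointSample` class and both helper functions are hypothetical names used only for illustration; they are not part of any Lightstep or AWS API. The sketch assumes that the counts are `Sum` values and the latencies are `Average` values taken over the same time window.

```python
from dataclasses import dataclass


@dataclass
class EndpointSample:
    """Metric values for one time window (hypothetical container, not an API type)."""
    invocations: float          # Sum of aws.sagemaker.invocations
    errors_4xx: float           # Sum of aws.sagemaker.invocation_4xx_errors
    errors_5xx: float           # Sum of aws.sagemaker.invocation_5xx_errors
    model_latency_us: float     # Average of aws.sagemaker.model_latency (microseconds)
    overhead_latency_us: float  # Average of aws.sagemaker.overhead_latency (microseconds)


def error_rate(sample: EndpointSample) -> float:
    """Fraction of InvokeEndpoint calls that returned a 4xx or 5xx response."""
    if sample.invocations == 0:
        return 0.0  # No traffic in the window, so report no errors.
    return (sample.errors_4xx + sample.errors_5xx) / sample.invocations


def total_latency_ms(sample: EndpointSample) -> float:
    """Approximate request-to-response time in milliseconds.

    Overhead latency is defined as the full SageMaker response time minus
    the model latency, so adding the two recovers the full response time.
    """
    return (sample.model_latency_us + sample.overhead_latency_us) / 1000.0
```

For example, a window with 200 invocations, 6 4xx errors, 4 5xx errors, a model latency of 45,000 microseconds, and an overhead latency of 5,000 microseconds yields an error rate of 5% and a total latency of 50 ms.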