Once you’ve integrated with AWS CloudWatch, you have access to all metrics for SageMaker Model Building Pipelines, a tool for building machine learning pipelines that take advantage of direct SageMaker integration.
See all AWS integrations.
To verify metrics are reporting, search for the metrics in the Metric details section of the Project Settings page.
The following table shows the SageMaker Model Building Pipelines metrics ingested by Cloud Observability.
| Metric name | Unit | Description |
|---|---|---|
| aws.sagemaker.invocation_4xx_errors | none | The number of InvokeEndpoint requests with a 4xx HTTP response code. Valid statistics: Average, Sum |
| aws.sagemaker.invocation_5xx_errors | none | The number of InvokeEndpoint requests with a 5xx HTTP response code. Valid statistics: Average, Sum |
| aws.sagemaker.invocations | none | The total number of InvokeEndpoint calls sent to a model endpoint. Valid statistics: Sum, Sample Count |
| aws.sagemaker.invocations_per_instance | none | The number of endpoint calls sent to a model, normalized by |
| aws.sagemaker.model_latency | microsecond | The amount of time the model or models took to respond. Valid statistics: Average, Sum, Min, Max, Sample Count |
| aws.sagemaker.overhead_latency | microsecond | The overhead that SageMaker adds to the time taken to respond to a client request. It is measured from the time SageMaker receives the request until it returns a response to the client, minus the ModelLatency. Valid statistics: Average, Sum, Min, Max, Sample Count |
| aws.sagemaker.container_latency | microsecond | The time it took for an Inference Pipelines container to respond as viewed from SageMaker. ContainerLatency includes the time it took to send the request, to fetch the response from the model's container, and to complete inference in the container. Valid statistics: Average, Sum, Min, Max, Sample Count |
| aws.sagemaker.cpu_utilization | percent | The percentage of CPU units that are used by the containers running on an instance. |
| aws.sagemaker.memory_utilization | percent | The percentage of memory that is used by the containers running on an instance. |
| aws.sagemaker.gpu_utilization | percent | The percentage of GPU units that are used by the containers running on an instance. |
| aws.sagemaker.gpu_memory_utilization | percent | The percentage of GPU memory used by the containers running on an instance. |
| aws.sagemaker.disk_utilization | percent | The percentage of disk space used by the containers running on an instance. |
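
The latency and error metrics in the table can be combined into derived values. The following Python sketch is illustrative only: the `EndpointSample` class and both helper functions are hypothetical names, not part of any Cloud Observability or AWS API. It assumes that all values come from the same aggregation window, that 4xx and 5xx responses are counted within `aws.sagemaker.invocations`, and that latencies are averages in microseconds. Because overhead latency is defined as the total response time minus ModelLatency, adding the two gives an approximate end-to-end latency.

```python
from dataclasses import dataclass


@dataclass
class EndpointSample:
    """One aggregation window of endpoint metrics (hypothetical container)."""
    invocations: float            # Sum of aws.sagemaker.invocations
    invocation_4xx_errors: float  # Sum of aws.sagemaker.invocation_4xx_errors
    invocation_5xx_errors: float  # Sum of aws.sagemaker.invocation_5xx_errors
    model_latency_us: float       # Average of aws.sagemaker.model_latency
    overhead_latency_us: float    # Average of aws.sagemaker.overhead_latency


def error_rate(sample: EndpointSample) -> float:
    """Fraction of invocations that returned a 4xx or 5xx response.

    Assumes error responses are included in the invocation count.
    """
    if sample.invocations == 0:
        return 0.0  # No traffic in this window, so no errors to report.
    errors = sample.invocation_4xx_errors + sample.invocation_5xx_errors
    return errors / sample.invocations


def total_latency_ms(sample: EndpointSample) -> float:
    """Approximate end-to-end latency in milliseconds.

    Overhead latency is total response time minus model latency,
    so the sum of the two recovers the total.
    """
    total_us = sample.model_latency_us + sample.overhead_latency_us
    return total_us / 1000.0  # Convert microseconds to milliseconds.
```

For example, a window with 200 invocations, 6 4xx errors, 4 5xx errors, a 120,000 microsecond model latency, and a 30,000 microsecond overhead latency yields an error rate of 0.05 and a total latency of 150 ms.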
Updated Dec 19, 2022