Description
Hi all,
I am new to SageMaker Studio and was wondering whether there is a way to monitor Studio usage, for example how many machines are in use and how much RAM and CPU the users are consuming. I came across the notebook instance lifecycle configuration samples (https://github.com/aws-samples/amazon-sagemaker-notebook-instance-lifecycle-config-samples) and found a very interesting lifecycle configuration there: publish-instance-metrics.
I tried to port this notebook instance lifecycle configuration to a Studio lifecycle configuration, but without success. Here is my Studio lifecycle configuration:
#!/bin/bash
set -e
# OVERVIEW
# This script publishes the system-level metrics from the Notebook instance to Cloudwatch.
#
# Note that this script will fail if either condition is not met:
# 1. Ensure the Notebook Instance has internet connectivity to fetch the example config
# 2. Ensure the Notebook Instance execution role has the cloudwatch:PutMetricData permission to publish the system-level metrics
#
# https://aws.amazon.com/cloudwatch/pricing/
apt-get update
apt-get -y install jq
# PARAMETERS
NOTEBOOK_INSTANCE_NAME=$(jq '.ResourceName' /opt/ml/metadata/resource-metadata.json --raw-output)
echo "Fetching the CloudWatch agent configuration file."
wget https://raw.githubusercontent.com/aws-samples/amazon-sagemaker-notebook-instance-lifecycle-config-samples/master/scripts/publish-instance-metrics/amazon-cloudwatch-agent.json
sed -i -- "s/MyNotebookInstance/$NOTEBOOK_INSTANCE_NAME/g" amazon-cloudwatch-agent.json
echo "Starting the CloudWatch agent on the Notebook Instance."
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file://$(pwd)/amazon-cloudwatch-agent.json -s
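Since the script depends on jq, wget, sed and the resource metadata file, a small pre-flight check might help narrow down which step breaks inside a Studio container. This is only a sketch: missing_prereqs is a hypothetical helper name (not something from the sample repo), and the path and command names are simply the ones used in the script above.

```shell
#!/bin/bash
# Hypothetical helper (not part of the sample repo): prints one line per
# missing prerequisite. The first argument is a file that must exist; the
# remaining arguments are commands that must be available on PATH.
missing_prereqs() {
    local required_file="$1"
    shift
    if [ ! -f "$required_file" ]; then
        echo "file:$required_file"
    fi
    local cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "command:$cmd"
        fi
    done
}

# Check what the lifecycle script above relies on.
missing=$(missing_prereqs /opt/ml/metadata/resource-metadata.json jq wget sed)
if [ -n "$missing" ]; then
    echo "Missing prerequisites:" >&2
    echo "$missing" >&2
fi
```

If nothing is reported, the prerequisites are in place and the failure is more likely in the final amazon-cloudwatch-agent-ctl step.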
To understand what was going wrong, I opened a terminal tab in SageMaker Studio and ran the commands one by one. The last command produced the following output:
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl: 469: /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl: systemctl: not found
unknown init system
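The output suggests that the control script could not find an init system it knows how to use. A minimal sketch for checking this from the same terminal is below. It assumes that the presence of systemctl together with the /run/systemd/system directory indicates a running systemd, and that initctl indicates upstart. detect_init_system is a hypothetical helper name, and the checks are my own approximation, not necessarily what amazon-cloudwatch-agent-ctl does internally.

```shell
#!/bin/bash
# Hypothetical helper: report which init system, if any, appears to be
# available in the current environment. Prints systemd, upstart, or none.
detect_init_system() {
    if command -v systemctl >/dev/null 2>&1 && [ -d /run/systemd/system ]; then
        # /run/systemd/system exists only when systemd is the running init.
        echo "systemd"
    elif command -v initctl >/dev/null 2>&1; then
        echo "upstart"
    else
        echo "none"
    fi
}

init_system=$(detect_init_system)
echo "Detected init system: ${init_system}"
if [ "${init_system}" = "none" ]; then
    echo "No init system detected; service-based agent startup will likely fail here." >&2
fi
```

A result of none would be consistent with the "systemctl: not found" and "unknown init system" messages above.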
I don't know whether I'm missing something or whether this simply isn't supported by SageMaker Studio yet. Could you please help me with this issue?
P.S.: I'm using a Python 3 kernel with the Data Science Docker image.