Error when using CW Agent on KernelApp #9

Open
@cabral1888

Hi all,

I am new to SageMaker Studio and was wondering whether there is a way to monitor Studio usage: how many machines are in use, and how much RAM and CPU users are consuming. In the notebook-instance lifecycle configuration samples repository (https://github.com/aws-samples/amazon-sagemaker-notebook-instance-lifecycle-config-samples) I found a lifecycle configuration that looks like a good fit: publish-instance-metrics.

I tried to reproduce that notebook-instance lifecycle configuration as a Studio lifecycle configuration, but without success. Here is my Studio lifecycle configuration:

#!/bin/bash

set -e

# OVERVIEW
# This script publishes the system-level metrics from the Notebook instance to Cloudwatch.
#
# Note that this script will fail if either condition is not met
#   1. Ensure the Notebook Instance has internet connectivity to fetch the example config
#   2. Ensure the Notebook Instance execution role permissions to cloudwatch:PutMetricData to publish the system-level metrics
#
# https://aws.amazon.com/cloudwatch/pricing/
apt-get update
apt-get -y install jq

# PARAMETERS
NOTEBOOK_INSTANCE_NAME=$(jq '.ResourceName' /opt/ml/metadata/resource-metadata.json --raw-output)

echo "Fetching the CloudWatch agent configuration file."
wget https://raw.githubusercontent.com/aws-samples/amazon-sagemaker-notebook-instance-lifecycle-config-samples/master/scripts/publish-instance-metrics/amazon-cloudwatch-agent.json

sed -i -- "s/MyNotebookInstance/$NOTEBOOK_INSTANCE_NAME/g" amazon-cloudwatch-agent.json

echo "Starting the CloudWatch agent on the Notebook Instance."
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c file://$(pwd)/amazon-cloudwatch-agent.json -s

To understand what happened, I opened a terminal tab inside SageMaker Studio and ran the commands one by one. The last command gave me the following output:

/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl: 469: /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl: systemctl: not found
unknown init system
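
The error above suggests that the control script could not find `systemctl` in the Studio container. As a minimal sketch, a check like the following could be run before calling `amazon-cloudwatch-agent-ctl` to confirm that. The helper names `have_command` and `check_init_system` are hypothetical and not part of the CloudWatch agent. The check assumes that a missing `systemctl` is the only reason for the failure, which may not hold in every image.

```shell
#!/bin/bash

# Hypothetical helper (not part of the CloudWatch agent):
# succeeds only if the named command is available on PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: prints "found" or "missing" for the given
# init-system command (defaults to systemctl).
check_init_system() {
    if have_command "${1:-systemctl}"; then
        echo "found"
    else
        echo "missing"
    fi
}

if [ "$(check_init_system systemctl)" = "found" ]; then
    echo "systemctl is available; amazon-cloudwatch-agent-ctl may be able to start the agent."
else
    # Assumption: without systemctl the control script fails as in the output above.
    echo "systemctl not found; amazon-cloudwatch-agent-ctl will likely report an unknown init system." >&2
fi
```

If this prints the "not found" message in the Studio terminal, the container is most likely running without an init system that the control script recognizes, which would match the output above.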

I don't know whether I'm missing something or whether this isn't supported by SageMaker Studio yet. Can you please help me with this issue?

P.S.: I'm using a Python 3 kernel with the Data Science Docker image.
