-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Markdown Refinements & Installation Streamlining for a Smooth User Journey #405
base: main
Are you sure you want to change the base?
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for opening the PR. I left some comments. Overall it feels like just adding more complexity or just changing the wording.
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"If you are going to use Sagemaker in a local environment. You need access to an IAM Role with the required permissions for Sagemaker. You can find [here](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html) more about it.\n", | ||
"\n" | ||
"### Configure IAM\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a fan of to many headlines is not fitting the other styles
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remove the many headlines?
Hey Phil -- apologies for the complexity. Yeah, it's mainly just to prevent others from running into the same issue in SageMaker Studio Notebooks that I ran into. (Studio Notebooks is recommended in the blog post.) I refined some grammar, etc while I was at it. |
@@ -122,7 +165,7 @@ | |||
"\n", | |||
"model_id=\"bigscience/bloomz-7b1\"\n", | |||
"\n", | |||
"# Load tokenizer of BLOOMZ\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"To train our model, we need to convert our inputs (text) to token IDs. This is done by a 🤗 Transformers Tokenizer. If you are not sure what this means, check out **[chapter 6](https://huggingface.co/course/chapter6/1?fw=tf)** of the Hugging Face Course." | ||
"### Tokenize Dataset \n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a fan of that.
"import time\n", | ||
"# define Training Job Name \n", | ||
"job_name = f'huggingface-peft-{time.strftime(\"%Y-%m-%d-%H-%M-%S\", time.localtime())}'\n", | ||
"\n", | ||
"from sagemaker.huggingface import HuggingFace\n", | ||
"# Set up our training job name using a timestamp to ensure it's unique\n", | ||
"job_name = f'huggingface-peft-{time.strftime(\"%Y-%m-%d-%H-%M-%S\", time.localtime())}'\n", | ||
"\n", | ||
"# hyperparameters, which are passed into the training job\n", | ||
"from sagemaker.huggingface import HuggingFace" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did you change the order?
What does this PR do?
I had issues following the notebook until I realized I wasn't using the write SageMaker image and I did the
pip install
of the requirements.txt too late in the process.While I was at it, I edited the other explanations in the markdown, some error handling towards the beginning, and changed
!pip install
to%pip install
because this seems to be the recommended approach.@philschmid