Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Markdown Refinements & Installation Streamlining for a Smooth User Journey #405

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

ystoneman
Copy link

What does this PR do?

I had issues following the notebook until I realized I wasn't using the write SageMaker image and I did the pip install of the requirements.txt too late in the process.

While I was at it, I edited the other explanations in the markdown, some error handling towards the beginning, and changed !pip install to %pip install because this seems to be the recommended approach.

@philschmid

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link
Contributor

@philschmid philschmid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for opening the PR. I left some comments. Overall it feels like just adding more complexity or just changing the wording.

"cell_type": "markdown",
"metadata": {},
"source": [
"If you are going to use Sagemaker in a local environment. You need access to an IAM Role with the required permissions for Sagemaker. You can find [here](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html) more about it.\n",
"\n"
"### Configure IAM\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a fan of to many headlines is not fitting the other styles

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove the many headlines?

@ystoneman ystoneman requested a review from philschmid July 3, 2023 15:11
@ystoneman
Copy link
Author

Hey Phil -- apologies for the complexity. Yeah, it's mainly just to prevent others from running into the same issue in SageMaker Studio Notebooks that I ran into. (Studio Notebooks is recommended in the blog post.)

I refined some grammar, etc while I was at it.

@@ -122,7 +165,7 @@
"\n",
"model_id=\"bigscience/bloomz-7b1\"\n",
"\n",
"# Load tokenizer of BLOOMZ\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

"cell_type": "markdown",
"metadata": {},
"source": [
"To train our model, we need to convert our inputs (text) to token IDs. This is done by a 🤗 Transformers Tokenizer. If you are not sure what this means, check out **[chapter 6](https://huggingface.co/course/chapter6/1?fw=tf)** of the Hugging Face Course."
"### Tokenize Dataset \n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a fan of that.

Comment on lines 303 to +308
"import time\n",
"# define Training Job Name \n",
"job_name = f'huggingface-peft-{time.strftime(\"%Y-%m-%d-%H-%M-%S\", time.localtime())}'\n",
"\n",
"from sagemaker.huggingface import HuggingFace\n",
"# Set up our training job name using a timestamp to ensure it's unique\n",
"job_name = f'huggingface-peft-{time.strftime(\"%Y-%m-%d-%H-%M-%S\", time.localtime())}'\n",
"\n",
"# hyperparameters, which are passed into the training job\n",
"from sagemaker.huggingface import HuggingFace"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you change the order?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants