Follow steps in guide: https://huggingface.co/docs/transformers/training
- Login:
huggingface-cli login
If you see a warning like "Authenticated through git-credential store but this isn't the helper defined on your machine", follow the instructions it prints to fix it.
Tip: You can get your token from https://huggingface.co/settings/tokens; it must be a WRITE token.
- Run
python hugging-face/hf_fine_tune_hello_world.py
Manually upload data from the web UI or via the API.
To load the dataset afterwards:
from datasets import load_dataset
remote_dataset = load_dataset("noahgift/social-power-nba")
remote_dataset
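Once loaded, each split behaves like a sequence of row dicts. The offline stand-in below sketches typical exploration; the column names are invented for illustration and are NOT the real social-power-nba schema (with a real Hub dataset you would use `remote_dataset["train"].column_names` and `.filter(...)`):

```python
# Hedged, offline stand-in for exploring a loaded dataset.
# Columns here are illustrative, not the actual social-power-nba schema.
rows = [
    {"player": "A", "twitter_followers": 1_000_000, "salary": 25.0},
    {"player": "B", "twitter_followers": 50_000, "salary": 3.5},
    {"player": "C", "twitter_followers": 2_000_000, "salary": 30.0},
]

# Peek at the "schema" (datasets exposes this as .column_names)
columns = sorted(rows[0].keys())
print(columns)  # ['player', 'salary', 'twitter_followers']

# Filter, analogous to dataset.filter(lambda r: r["twitter_followers"] > 100_000)
popular = [r for r in rows if r["twitter_followers"] > 100_000]
print(len(popular))  # 2
```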
- Find a simple and small dataset: Kaggle, your own, or a sample dataset
- Go to the Hugging Face website and upload it
- Download and explore the dataset
- Enhance the dataset by filling out its metadata
- Build a demo for it
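Filling out dataset metadata means editing the YAML front matter at the top of the dataset's README.md on the Hub. A hedged example of the kind of fields involved (values are illustrative, not the actual metadata of noahgift/social-power-nba):

```yaml
# Illustrative dataset card front matter; values are examples only.
license: mit
language:
  - en
task_categories:
  - tabular-classification
tags:
  - nba
  - social-media
pretty_name: Social Power NBA
size_categories:
  - 1K<n<10K
```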
Use the huggingface-cli
(venv) @noahgift ➜ /workspaces/hugging-face-tutorials (GPU) $ huggingface-cli scan-cache
REPO ID REPO TYPE SIZE ON DISK NB FILES LAST_ACCESSED LAST_MODIFIED REFS LOCAL PATH
---------------------------- --------- ------------ -------- ------------- ------------- ---- ----------------------------------------------------------------------------
bert-base-cased model 436.4M 5 2 days ago 2 days ago main /home/codespace/.cache/huggingface/hub/models--bert-base-cased
bert-base-uncased model 441.2M 5 2 hours ago 2 hours ago main /home/codespace/.cache/huggingface/hub/models--bert-base-uncased
google/pegasus-cnn_dailymail model 1.9M 4 1 hour ago 1 hour ago main /home/codespace/.cache/huggingface/hub/models--google--pegasus-cnn_dailymail
gpt2 model 551.0M 5 2 days ago 2 days ago main /home/codespace/.cache/huggingface/hub/models--gpt2
gpt2-xl model 6.4G 5 1 hour ago 1 hour ago main /home/codespace/.cache/huggingface/hub/models--gpt2-xl
- Upload model to Hugging Face website
- Fill out model card
- Use model
Why transfer learning?
- One batch in PyTorch
- Using sacrebleu, which is precision-based. Per Wikipedia: "Precision (also called positive predictive value) is the fraction of relevant instances among the retrieved instances, while recall (also known as sensitivity) is the fraction of relevant instances that were retrieved."
- The ROUGE score was developed specifically for applications like summarization, where high recall is more important than precision alone.
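The Wikipedia definitions above reduce to simple ratios. A toy worked example (the counts are made up for illustration):

```python
# Suppose a search returns 8 documents and 6 of them are relevant,
# out of 10 relevant documents that exist in total (numbers are made up).
retrieved = 8
relevant_retrieved = 6
relevant_total = 10

precision = relevant_retrieved / retrieved    # fraction of retrieved that are relevant
recall = relevant_retrieved / relevant_total  # fraction of relevant that were retrieved

print(precision)  # 0.75
print(recall)     # 0.6
```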
from datasets import load_metric  # in newer versions of datasets, use evaluate.load instead
rouge_metric = load_metric("rouge")
bleu_metric = load_metric("sacrebleu")
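The precision-vs-recall distinction between these two metrics can be seen with a deliberately simplified unigram-only sketch; real BLEU (sacrebleu) uses 1-4 gram precision with a brevity penalty, and real ROUGE adds stemming and longest-common-subsequence variants:

```python
from collections import Counter

def unigram_overlap(candidate: str, reference: str):
    """Clipped unigram overlap between a candidate and a reference.

    Simplified sketch: BLEU-style scoring divides the overlap by the
    candidate length (precision); ROUGE-style divides by the reference
    length (recall).
    """
    cand = candidate.split()
    ref = reference.split()
    overlap = sum((Counter(cand) & Counter(ref)).values())
    precision = overlap / len(cand)  # BLEU-style: divide by candidate length
    recall = overlap / len(ref)      # ROUGE-style: divide by reference length
    return precision, recall

# A very short candidate scores perfect precision while missing most of the
# reference -- exactly why recall-oriented ROUGE suits summarization.
p, r = unigram_overlap("the cat", "the cat sat on the mat")
print(p, r)  # 1.0 0.3333...
```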
- Need a token; follow the guide
- Refer to the Hugging Face course
- Run huggingface-cli login
and pass in your token
The following examples test out the GPU
- run pytorch training test:
python utils/quickstart_pytorch.py
- run pytorch CUDA test:
python utils/verify_cuda_pytorch.py
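The CUDA test script isn't shown here; a minimal sketch of what such a check typically does (the real utils/verify_cuda_pytorch.py may differ, and the import is deferred so the function degrades gracefully on machines without torch installed):

```python
def cuda_status() -> str:
    """Report whether PyTorch can see a CUDA GPU.

    Hedged sketch of a verify-CUDA check; the actual
    utils/verify_cuda_pytorch.py may do more or differently.
    """
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():
        # Name of the first visible GPU, e.g. an NVIDIA device
        return f"CUDA OK: {torch.cuda.get_device_name(0)}"
    return "CUDA not available"

print(cuda_status())
```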
- run tensorflow training test:
python utils/quickstart_tf2.py
- run nvidia monitoring test:
nvidia-smi -l 1
it should show a GPU
- run whisper transcribe test:
./utils/transcribe-whisper.sh
and verify the GPU is working with nvidia-smi -l 1
Additionally, this workspace is set up to fine-tune Hugging Face models:
python hf_fine_tune_hello_world.py
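The hello-world fine-tune follows the training guide linked at the top. A hedged sketch of its likely shape, based on that guide (the model, dataset, and hyperparameters below are assumptions about what a "hello world" script does; the actual hf_fine_tune_hello_world.py may differ, and imports are deferred so the sketch loads without transformers installed):

```python
def build_trainer():
    """Sketch of a minimal Hugging Face fine-tune, per the training guide.

    bert-base-cased / yelp_review_full come from the linked guide, but this
    is an assumption about what hf_fine_tune_hello_world.py actually runs.
    Calling this downloads data and model weights, so it is only a sketch.
    """
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, Trainer, TrainingArguments)

    dataset = load_dataset("yelp_review_full")
    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

    def tokenize(batch):
        return tokenizer(batch["text"], padding="max_length", truncation=True)

    tokenized = dataset.map(tokenize, batched=True)
    # Shrink the train split so a smoke test finishes quickly
    small_train = tokenized["train"].shuffle(seed=42).select(range(100))

    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-cased", num_labels=5)
    args = TrainingArguments(output_dir="test_trainer")
    return Trainer(model=model, args=args, train_dataset=small_train)

# trainer = build_trainer(); trainer.train()  # downloads weights, needs GPU time
```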
Used as the base and customized in the following Duke MLOps and Applied Data Engineering Coursera Labs:
- MLOPs-C2-Lab1-CICD
- MLOps-C2-Lab2-PokerSimulator
- MLOps-C2-Final-HuggingFace
- Coursera-MLOps-C2-lab3-probability-simulations
- Coursera-MLOps-C2-lab4-greedy-optimization
- nlp-with-transformers / notebooks
- Natural Language Processing with Transformers, Revised Edition
- Building Cloud Computing Solutions at Scale Specialization
- Python, Bash and SQL Essentials for Data Engineering Specialization
- Implementing MLOps in the Enterprise
- Practical MLOps: Operationalizing Machine Learning Models
- Coursera-Dockerfile