[docs] PushToHubMixin (huggingface#4622)
* push to hub docs

* fix typo

* feedback

* make style
stevhliu authored Aug 16, 2023
1 parent 5049599 commit 4ff7264
Showing 5 changed files with 182 additions and 8 deletions.
2 changes: 2 additions & 0 deletions docs/source/en/_toctree.yml
@@ -32,6 +32,8 @@
title: Load safetensors
- local: using-diffusers/other-formats
title: Load different Stable Diffusion formats
+ - local: using-diffusers/push_to_hub
+ title: Push files to the Hub
title: Loading & Hub
- sections:
- local: using-diffusers/pipeline_overview
2 changes: 1 addition & 1 deletion docs/source/en/api/models/overview.md
@@ -11,6 +11,6 @@ All models are built from the base [`ModelMixin`] class which is a [`torch.nn.mo

[[autodoc]] FlaxModelMixin

- ## Pushing to the Hub
+ ## PushToHubMixin

[[autodoc]] utils.PushToHubMixin
171 changes: 171 additions & 0 deletions docs/source/en/using-diffusers/push_to_hub.md
@@ -0,0 +1,171 @@
# Push files to the Hub

[[open-in-colab]]

🤗 Diffusers provides a [`~diffusers.utils.PushToHubMixin`] for uploading your model, scheduler, or pipeline to the Hub. It is an easy way to store your files on the Hub, and also allows you to share your work with others. Under the hood, the [`~diffusers.utils.PushToHubMixin`]:

1. creates a repository on the Hub
2. saves your model, scheduler, or pipeline files so they can be reloaded later
3. uploads the folder containing these files to the Hub
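
The three steps above can be sketched with `huggingface_hub` calls. This is a minimal illustration under stated assumptions, not the mixin's actual implementation: `push_sketch` is a hypothetical helper name, `obj` is assumed to expose `save_pretrained`, and `api` is assumed to behave like `huggingface_hub.HfApi` (it is created lazily if you do not pass one).

```python
import tempfile


def push_sketch(obj, repo_id, api=None, private=False, commit_message="Upload files"):
    """Hypothetical helper mirroring the three steps; not the library's own code."""
    if api is None:
        # Imported lazily so the sketch can also be exercised with a stand-in `api`.
        from huggingface_hub import HfApi

        api = HfApi()

    # 1. Create the repository (a no-op if it already exists).
    full_repo_id = api.create_repo(repo_id, private=private, exist_ok=True).repo_id

    # 2. Save the files to a temporary folder so they can be reloaded later.
    with tempfile.TemporaryDirectory() as tmpdir:
        obj.save_pretrained(tmpdir)

        # 3. Upload the folder containing these files to the Hub.
        api.upload_folder(repo_id=full_repo_id, folder_path=tmpdir, commit_message=commit_message)

    return full_repo_id
```

In practice you would call `push_to_hub` rather than a helper like this; the sketch only makes the order of operations explicit.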

This guide will show you how to use the [`~diffusers.utils.PushToHubMixin`] to upload your files to the Hub.

You'll need to log in to your Hub account with your access [token](https://huggingface.co/settings/tokens) first:

```py
from huggingface_hub import notebook_login

notebook_login()
```

## Models

To push a model to the Hub, call [`~diffusers.utils.PushToHubMixin.push_to_hub`] and specify the repository id of the model to be stored on the Hub:

```py
from diffusers import ControlNetModel

controlnet = ControlNetModel(
    block_out_channels=(32, 64),
    layers_per_block=2,
    in_channels=4,
    down_block_types=("DownBlock2D", "CrossAttnDownBlock2D"),
    cross_attention_dim=32,
    conditioning_embedding_out_channels=(16, 32),
)
controlnet.push_to_hub("my-controlnet-model")
```

For models, you can also specify the [*variant*](loading#checkpoint-variants) of the weights to push to the Hub. For example, to push `fp16` weights:

```py
controlnet.push_to_hub("my-controlnet-model", variant="fp16")
```

The [`~diffusers.utils.PushToHubMixin.push_to_hub`] function saves the model's `config.json` file and automatically saves the weights in the `safetensors` format.
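
To see how the variant affects the uploaded file, the sketch below builds the weights file name you would expect in the repository. It assumes Diffusers' default model weight names (`diffusion_pytorch_model.safetensors` and `diffusion_pytorch_model.bin`) and that the variant is inserted before the file extension; `weights_filename` is a hypothetical helper written for this guide, not a library function.

```python
def weights_filename(variant=None, safe_serialization=True):
    """Illustrative helper: build the expected weights file name for a model."""
    # Assumed default names; check your installed Diffusers version if in doubt.
    name = "diffusion_pytorch_model.safetensors" if safe_serialization else "diffusion_pytorch_model.bin"
    if variant is not None:
        # Insert the variant between the stem and the extension.
        stem, extension = name.rsplit(".", 1)
        name = f"{stem}.{variant}.{extension}"
    return name


print(weights_filename())                # diffusion_pytorch_model.safetensors
print(weights_filename(variant="fp16"))  # diffusion_pytorch_model.fp16.safetensors
```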

Now you can reload the model from your repository on the Hub:

```py
model = ControlNetModel.from_pretrained("your-namespace/my-controlnet-model")
```
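
The `your-namespace` prefix is the username or organization that owns the repository. When you push with a bare name such as `"my-controlnet-model"`, the repository is created under your own account, so the full id used for reloading is `<username>/my-controlnet-model`. A small sketch of that resolution follows; `qualify_repo_id` is a hypothetical helper, and the fallback assumes you are already logged in so that `huggingface_hub.whoami` can report your username.

```python
def qualify_repo_id(repo_id, namespace=None):
    """Hypothetical helper: return `namespace/name` for a bare repository name."""
    if "/" in repo_id:
        # Already qualified, for example "my-org/my-controlnet-model".
        return repo_id
    if namespace is None:
        # Assumption: a prior login, so the Hub can tell us who we are.
        from huggingface_hub import whoami

        namespace = whoami()["name"]
    return f"{namespace}/{repo_id}"


print(qualify_repo_id("my-controlnet-model", namespace="your-namespace"))
```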

## Scheduler

To push a scheduler to the Hub, call [`~diffusers.utils.PushToHubMixin.push_to_hub`] and specify the repository id of the scheduler to be stored on the Hub:

```py
from diffusers import DDIMScheduler

scheduler = DDIMScheduler(
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    clip_sample=False,
    set_alpha_to_one=False,
)
scheduler.push_to_hub("my-controlnet-scheduler")
```

The [`~diffusers.utils.PushToHubMixin.push_to_hub`] function saves the scheduler's `scheduler_config.json` file to the specified repository.
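
As a rough picture of what that file holds, the sketch below writes the constructor arguments from the example to a `scheduler_config.json` with the standard `json` module and reads them back. It is only an illustration: the real file is produced by Diffusers, also stores metadata such as `_class_name` and `_diffusers_version`, and includes the default values of arguments you did not set.

```python
import json
import os
import tempfile

# Arguments passed to DDIMScheduler in the example above.
scheduler_kwargs = {
    "beta_start": 0.00085,
    "beta_end": 0.012,
    "beta_schedule": "scaled_linear",
    "clip_sample": False,
    "set_alpha_to_one": False,
}

with tempfile.TemporaryDirectory() as tmpdir:
    config_path = os.path.join(tmpdir, "scheduler_config.json")

    # "_class_name" records which scheduler class the config belongs to.
    with open(config_path, "w") as f:
        json.dump({"_class_name": "DDIMScheduler", **scheduler_kwargs}, f, indent=2)

    # Reloading is the reverse: read the JSON and pass the values to the class.
    with open(config_path) as f:
        reloaded = json.load(f)

print(reloaded["beta_schedule"])
```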

Now you can reload the scheduler from your repository on the Hub:

```py
scheduler = DDIMScheduler.from_pretrained("your-namespace/my-controlnet-scheduler")
```

## Pipeline

You can also push an entire pipeline with all its components to the Hub. For example, initialize the components of a [`StableDiffusionPipeline`] with the parameters you want:

```py
from diffusers import (
    UNet2DConditionModel,
    AutoencoderKL,
    DDIMScheduler,
    StableDiffusionPipeline,
)
from transformers import CLIPTextModel, CLIPTextConfig, CLIPTokenizer

unet = UNet2DConditionModel(
    block_out_channels=(32, 64),
    layers_per_block=2,
    sample_size=32,
    in_channels=4,
    out_channels=4,
    down_block_types=("DownBlock2D", "CrossAttnDownBlock2D"),
    up_block_types=("CrossAttnUpBlock2D", "UpBlock2D"),
    cross_attention_dim=32,
)

scheduler = DDIMScheduler(
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
    clip_sample=False,
    set_alpha_to_one=False,
)

vae = AutoencoderKL(
    block_out_channels=[32, 64],
    in_channels=3,
    out_channels=3,
    down_block_types=["DownEncoderBlock2D", "DownEncoderBlock2D"],
    up_block_types=["UpDecoderBlock2D", "UpDecoderBlock2D"],
    latent_channels=4,
)

text_encoder_config = CLIPTextConfig(
    bos_token_id=0,
    eos_token_id=2,
    hidden_size=32,
    intermediate_size=37,
    layer_norm_eps=1e-05,
    num_attention_heads=4,
    num_hidden_layers=5,
    pad_token_id=1,
    vocab_size=1000,
)
text_encoder = CLIPTextModel(text_encoder_config)
tokenizer = CLIPTokenizer.from_pretrained("hf-internal-testing/tiny-random-clip")
```

Pass all of the components to the [`StableDiffusionPipeline`] and call [`~diffusers.utils.PushToHubMixin.push_to_hub`] to push the pipeline to the Hub:

```py
components = {
    "unet": unet,
    "scheduler": scheduler,
    "vae": vae,
    "text_encoder": text_encoder,
    "tokenizer": tokenizer,
    "safety_checker": None,
    "feature_extractor": None,
}

pipeline = StableDiffusionPipeline(**components)
pipeline.push_to_hub("my-pipeline")
```

The [`~diffusers.utils.PushToHubMixin.push_to_hub`] function saves each component to a subfolder in the repository. Now you can reload the pipeline from your repository on the Hub:

```py
pipeline = StableDiffusionPipeline.from_pretrained("your-namespace/my-pipeline")
```
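
To make the repository layout concrete, the sketch below lists the top-level entries you would expect for the `components` dictionary used in this section. It assumes one subfolder per component that is not `None`, plus a top-level `model_index.json` describing the pipeline; `expected_repo_entries` is a hypothetical helper, not part of Diffusers, and the placeholder objects stand in for the real components.

```python
def expected_repo_entries(components):
    """Hypothetical helper: top-level entries expected for a pushed pipeline."""
    # Components set to None (here the safety checker and feature extractor)
    # have nothing to save, so they get no subfolder.
    subfolders = sorted(name for name, component in components.items() if component is not None)
    return ["model_index.json"] + subfolders


# Stand-in values; in the guide these are the real model, scheduler, and tokenizer objects.
example_components = {
    "unet": object(),
    "scheduler": object(),
    "vae": object(),
    "text_encoder": object(),
    "tokenizer": object(),
    "safety_checker": None,
    "feature_extractor": None,
}

print(expected_repo_entries(example_components))
# ['model_index.json', 'scheduler', 'text_encoder', 'tokenizer', 'unet', 'vae']
```

Because each component lives in its own subfolder, you can also load one on its own by passing the `subfolder` argument to its `from_pretrained` method, for example `subfolder="unet"` with `UNet2DConditionModel`.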

## Privacy

Set `private=True` in the [`~diffusers.utils.PushToHubMixin.push_to_hub`] function to keep your model, scheduler, or pipeline files private:

```py
controlnet.push_to_hub("my-controlnet-model", private=True)
```

Private repositories are only visible to you. Other users can't clone the repository, and it won't appear in search results. Even if a user has the URL to your private repository, they'll receive a `404 - Repo not found` error.

To load a model, scheduler, or pipeline from a private or gated repository, set `use_auth_token=True`:

```py
model = ControlNetModel.from_pretrained("your-namespace/my-controlnet-model", use_auth_token=True)
```
@@ -1370,7 +1370,7 @@ def get_sigmas(timesteps, n_dim=4, dtype=torch.float32):

# Get the target for loss depending on the prediction type
if noise_scheduler.config.prediction_type == "epsilon":
- target = latents # compute loss against the denoised latents
+ target = latents # compute loss against the denoised latents
elif noise_scheduler.config.prediction_type == "v_prediction":
target = noise_scheduler.get_velocity(latents, noise, timesteps)
else:
13 changes: 7 additions & 6 deletions src/diffusers/utils/hub_utils.py
@@ -410,23 +410,24 @@ def push_to_hub(
variant: Optional[str] = None,
) -> str:
"""
- Upload the {object_files} to the 🤗 Hugging Face Hub.
+ Upload model, scheduler, or pipeline files to the 🤗 Hugging Face Hub.
Parameters:
repo_id (`str`):
- The name of the repository you want to push your {object} to. It should contain your organization name
- when pushing to a given organization. `repo_id` can also be a path to a local directory.
+ The name of the repository you want to push your model, scheduler, or pipeline files to. It should
+ contain your organization name when pushing to an organization. `repo_id` can also be a path to a local
+ directory.
commit_message (`str`, *optional*):
- Message to commit while pushing. Will default to `"Upload {object}"`.
+ Message to commit while pushing. Default to `"Upload {object}"`.
private (`bool`, *optional*):
Whether or not the repository created should be private.
token (`str`, *optional*):
The token to use as HTTP bearer authorization for remote files. The token generated when running
`huggingface-cli login` (stored in `~/.huggingface`).
create_pr (`bool`, *optional*, defaults to `False`):
Whether or not to create a PR with the uploaded files or directly commit.
- safe_serialization (`bool`, *optional*, defaults to `False`):
- Whether or not to convert the model weights in safetensors format for safer serialization.
+ safe_serialization (`bool`, *optional*, defaults to `True`):
+ Whether or not to convert the model weights to the `safetensors` format.
variant (`str`, *optional*):
If specified, weights are saved in the format `pytorch_model.<variant>.bin`.