[Docs] Many docs improvements #2170 (address Ilya's feedback in #2171)
peterschmidt85 committed Jan 8, 2025
1 parent 2135175 commit a9356ba
Showing 10 changed files with 160 additions and 123 deletions.
4 changes: 2 additions & 2 deletions docs/docs/concepts/backends.md
@@ -1,15 +1,15 @@
# Backends

To use `dstack` with cloud providers, configure backends
via the `~/.dstack/server/config.yml` file.
via the [`~/.dstack/server/config.yml`](../reference/server/config.yml.md) file.
The server loads this file on startup.

Alternatively, you can configure backends on the [project settings page](../guides/administration.md#backends) via UI.

> For using `dstack` with on-prem servers, no backend configuration is required.
> Use [SSH fleets](../concepts/fleets.md#ssh) instead.
Below are examples of how to configure backends via `~/.dstack/server/config.yml`.
Below are examples of how to configure backends via [`~/.dstack/server/config.yml`](../reference/server/config.yml.md).

## Cloud providers

22 changes: 14 additions & 8 deletions docs/docs/concepts/fleets.md
@@ -92,12 +92,12 @@ and their quantity. Examples: `nvidia` (one NVIDIA GPU), `A100` (one A100), `A10
To use TPUs, specify the TPU architecture via the `gpu` property.

```yaml
type: dev-environment
type: fleet
# The name is optional, if not specified, generated randomly
name: vscode
ide: vscode
name: my-fleet
nodes: 2
resources:
gpu: v2-8
```
@@ -106,9 +106,12 @@ and their quantity. Examples: `nvidia` (one NVIDIA GPU), `A100` (one A100), `A10

#### Idle duration

By default, fleet instances remain active until the fleet is explicitly deleted via `dstack fleet delete`.
By default, fleet instances stay `idle` for 3 days and can be reused within that time.
If the fleet is not reused within this period, it is automatically terminated.

To automatically terminate `idle` instances after a certain period, configure `idle_duration`.
To change the default idle duration, set
[`idle_duration`](../reference/dstack.yml/fleet.md#idle_duration) in the run configuration (e.g., `0s`, `1m`, or `off` for
unlimited).

<div editor-title="examples/misc/fleets/.dstack.yml">

@@ -131,14 +134,14 @@ To automatically terminate `idle` instances after a certain period, configure `i
#### Spot policy

By default, `dstack` uses on-demand instances. However, you can change that
via the [`spot_policy`](../reference/dstack.yml/dev-environment.md#spot_policy) property. It accepts `spot`, `on-demand`, and `auto`.
via the [`spot_policy`](../reference/dstack.yml/fleet.md#spot_policy) property. It accepts `spot`, `on-demand`, and `auto`.

#### Retry policy

By default, if `dstack` fails to provision an instance or an instance is interrupted, no retry is attempted.

If you'd like `dstack` to do it, configure the
[retry](../reference/dstack.yml/dev-environment.md#retry) property accordingly:
[retry](../reference/dstack.yml/fleet.md#retry) property accordingly:

<div editor-title=".dstack.yml">

@@ -309,6 +312,9 @@ $ dstack fleet

</div>

When you apply this configuration, `dstack` will connect to the specified hosts using the provided SSH credentials,
install the dependencies, and configure these servers as a fleet.

Once the status of instances changes to `idle`, they can be used by dev environments, tasks, and services.

#### Troubleshooting
46 changes: 39 additions & 7 deletions docs/docs/concepts/services.md
@@ -116,7 +116,7 @@ port: 8000

If the service is running a chat model with an OpenAI-compatible interface,
set the [`model`](#model) property to make the model accessible via `dstack`'s
global the OpenAI-compatible endpoint, and also accessible via `dstack`'s UI.
global OpenAI-compatible endpoint, and also accessible via `dstack`'s UI.

### Resources

@@ -128,7 +128,7 @@ range (e.g. `24GB..`, or `24GB..80GB`, or `..80GB`).
```yaml
type: service
# The name is optional, if not specified, generated randomly
name: http-server-service
name: llama31-service
python: "3.10"
@@ -157,6 +157,31 @@ and their quantity. Examples: `nvidia` (one NVIDIA GPU), `A100` (one A100), `A10
`A100:80GB` (one A100 of 80GB), `A100:2` (two A100), `24GB..40GB:2` (two GPUs between 24GB and 40GB),
`A100:40GB:2` (two A100 GPUs of 40GB).
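To make the spec format above concrete, here is a minimal Python sketch that splits such a string into its parts. It is an illustration based only on the examples listed, not `dstack`'s own parser; the function name `parse_gpu_spec` and the returned field names are hypothetical.

```python
def parse_gpu_spec(spec: str) -> dict:
    """Split a GPU spec such as 'A100:40GB:2' into GPU, memory, and count.

    Hypothetical helper for illustration; it only covers the forms shown
    in the examples above and is not dstack's implementation.
    """
    result = {"gpu": None, "memory": None, "count": 1}
    for token in spec.split(":"):
        if token.isdigit():
            # A bare integer is the number of GPUs.
            result["count"] = int(token)
        elif "GB" in token:
            # A memory size or range, e.g. '80GB' or '24GB..40GB'.
            result["memory"] = token
        else:
            # Anything else is treated as the GPU name or vendor.
            result["gpu"] = token
    return result


specs = ["nvidia", "A100:80GB", "24GB..40GB:2", "A100:40GB:2"]
parsed = {spec: parse_gpu_spec(spec) for spec in specs}
```

When a part is omitted, the sketch leaves it as `None` (or a count of 1), which matches how the examples read: `A100:2` means two A100 GPUs of any memory size.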

??? info "Google Cloud TPU"
To use TPUs, specify the TPU architecture via the `gpu` property.

```yaml
type: service
name: llama31-service-optimum-tpu
image: dstackai/optimum-tpu:llama31
env:
- HF_TOKEN
- MODEL_ID=meta-llama/Meta-Llama-3.1-8B-Instruct
- MAX_TOTAL_TOKENS=4096
- MAX_BATCH_PREFILL_TOKENS=4095
commands:
- text-generation-launcher --port 8000
port: 8000
# Register the model
model: meta-llama/Meta-Llama-3.1-8B-Instruct
resources:
gpu: v5litepod-4
```

Currently, only 8 TPU cores can be specified, supporting single TPU device workloads. Multi-TPU support is coming soon.

??? info "Shared memory"
If you are using parallel communicating processes (e.g., dataloaders in PyTorch), you may need to configure
`shm_size`, e.g. set it to `16GB`.
@@ -321,7 +346,7 @@ Running services doesn't require [gateways](gateways.md) unless you need to enab
use HTTPS and map it to your domain.

!!! info "Websockets and base path"
A [gateways](gateways.md) may also be required if the service needs Websockets or cannot be used with
A [gateway](gateways.md) may also be required if the service needs Websockets or cannot be used with
a base path.

> If you're using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"},
@@ -358,7 +383,15 @@ Model meta-llama/Meta-Llama-3.1-8B-Instruct is published at:
`dstack apply` automatically provisions instances, uploads the contents of the repo (incl. your local uncommitted changes),
and runs the service.

### Service endpoint
### Retry policy

By default, if `dstack` can't find capacity, the service exits with an error, or the instance is interrupted,
the run will fail.

If you'd like `dstack` to automatically retry, configure the
[retry](../reference/dstack.yml/service.md#retry) property accordingly:

## Access the endpoint

If a [gateway](gateways.md) is not configured, the service’s endpoint will be accessible at
`<dstack server URL>/proxy/services/<project name>/<run name>/`.
@@ -393,9 +426,8 @@ or via `dstack` UI.
If the service defines the `model` property, the model will be available via the global OpenAI-compatible endpoint
at `https://gateway.<gateway domain>/`.

[//]: # (By default, the service endpoint requires the `Authorization` header with `Bearer <dstack token>`.)
[//]: # (Authorization can be disabled by setting [`auth`]&#40;../reference/dstack.yml/service.md#authorization&#41; to `false` in the)
[//]: # (service configuration file.)
If [authorization](#authorization) is not disabled, the service endpoint requires the `Authorization` header with
`Bearer <dstack token>`.
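
As a rough illustration of the two points above (the proxy URL pattern used when no gateway is configured, and the `Authorization` header), the following Python sketch builds such a request with the standard library. The helper name `build_service_request` and all values passed to it (server URL, project, run name, token) are placeholders, not values from the docs.

```python
from urllib.parse import quote
from urllib.request import Request


def build_service_request(server_url: str, project: str, run_name: str, token: str) -> Request:
    """Build a request for a service reached through the server proxy (no gateway).

    Hypothetical helper for illustration only.
    """
    # URL pattern from the text: <dstack server URL>/proxy/services/<project name>/<run name>/
    url = f"{server_url.rstrip('/')}/proxy/services/{quote(project)}/{quote(run_name)}/"
    # Unless authorization is disabled, the token is sent as a Bearer token.
    return Request(url, headers={"Authorization": f"Bearer {token}"})


# Placeholder values; substitute your own server URL, project, run name, and token.
req = build_service_request("http://localhost:3000", "main", "llama31-service", "my-dstack-token")
```

Passing `req` to `urllib.request.urlopen` would send the request; the sketch stops short of that so it can be read without a running server.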

!!! info "What's next?"
1. Read about [dev environments](dev-environments.md), [tasks](tasks.md), and [repos](repos.md)
7 changes: 3 additions & 4 deletions docs/docs/concepts/tasks.md
@@ -386,7 +386,6 @@ retry:
--8<-- "docs/concepts/snippets/manage-runs.ext"
!!! info "What's next?"
1. Read about [dev environments](dev-environments.md), [services](services.md), and [repos](repos.md)
2. Learn how to manage [fleets](fleets.md)
3. Check the [Axolotl](/examples/fine-tuning/axolotl) example
1. Read about [dev environments](dev-environments.md), [services](services.md), and [repos](repos.md)
2. Learn how to manage [fleets](fleets.md)
3. Check the [Axolotl](/examples/fine-tuning/axolotl) example
2 changes: 1 addition & 1 deletion docs/docs/concepts/volumes.md
@@ -39,7 +39,7 @@ If you use this configuration, `dstack` will create a new volume based on the sp

??? info "Register existing volumes"
If you prefer not to create a new volume but to reuse an existing one (e.g., created manually), you can
[specify its ID via `volume_id`](../reference/dstack.yml/volume.md#existing-volume). In this case, `dstack` will register the specified volume so that you can use it with dev environments, tasks, and services.
specify its ID via [`volume_id`](../reference/dstack.yml/volume.md#volume_id). In this case, `dstack` will register the specified volume so that you can use it with dev environments, tasks, and services.

<div editor-title="volume.dstack.yml">

99 changes: 95 additions & 4 deletions docs/docs/guides/server-deployment.md
@@ -165,11 +165,102 @@ The log group must be created beforehand, `dstack` won't try to create it.
}
```

## Enabling encryption
## Encryption

By default, `dstack` stores data in plaintext.
If you want backend credentials and user tokens to be encrypted, set up encryption keys via
[`~/.dstack/server/config.yml`](../reference/server/config.yml.md#encryption).
By default, `dstack` stores data in plaintext. To enforce encryption,
specify one or more encryption keys.

`dstack` currently supports AES and identity (plaintext) encryption keys.
Support for external providers like HashiCorp Vault and AWS KMS is planned.

=== "AES"
The `aes` encryption key encrypts data using [AES-256](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard) in GCM mode.
To configure the `aes` encryption, generate a random 32-byte key:

<div class="termy">

```shell
$ head -c 32 /dev/urandom | base64

opmx+r5xGJNVZeErnR0+n+ElF9ajzde37uggELxL
```

</div>

And specify it as `secret`:

```yaml
# ...

encryption:
keys:
- type: aes
name: key1
secret: opmx+r5xGJNVZeErnR0+n+ElF9ajzde37uggELxL
```

=== "Identity"
The `identity` encryption performs no encryption and stores data in plaintext.
You can specify an `identity` encryption key explicitly if you want to decrypt the data:

<div editor-title="~/.dstack/server/config.yml">

```yaml
# ...

encryption:
keys:
- type: identity
- type: aes
name: key1
secret: opmx+r5xGJNVZeErnR0+n+ElF9ajzde37uggELxL
```

</div>

With this configuration, the `aes` key will still be used to decrypt the old data,
but new writes will store the data in plaintext.

??? info "Key rotation"
If multiple keys are specified, the first is used for encryption, and all are tried for decryption. This enables key
rotation by specifying a new encryption key.

<div editor-title="~/.dstack/server/config.yml">

```yaml
# ...

encryption:
keys:
- type: aes
name: key2
secret: cR2r1JmkPyL6edBQeHKz6ZBjCfS2oWk87Gc2G3wHVoA=

- type: aes
name: key1
secret: E5yzN6V3XvBq/f085ISWFCdgnOGED0kuFaAkASlmmO4=
```

</div>

Old keys may be deleted once all existing records have been updated to re-encrypt sensitive data.
Encrypted values are prefixed with key names, allowing DB admins to identify the keys used for encryption.
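
The rotation rule described above (first key encrypts, every key is tried for decryption) can be sketched in Python as follows. This is a toy model under stated assumptions, not `dstack`'s implementation: `ToyKey` is a hypothetical stand-in that uses an HMAC-checked XOR mask instead of AES-GCM so that the example needs only the standard library, and the `name:` prefix format is assumed for illustration.

```python
import base64
import hashlib
import hmac
from dataclasses import dataclass


@dataclass
class ToyKey:
    """Hypothetical stand-in for an encryption key. NOT real encryption."""
    name: str
    secret: bytes

    def _mask(self, data: bytes) -> bytes:
        # XOR with a SHA-256-derived stream; a toy placeholder for AES-GCM.
        stream = b""
        counter = 0
        while len(stream) < len(data):
            stream += hashlib.sha256(self.secret + counter.to_bytes(4, "big")).digest()
            counter += 1
        return bytes(a ^ b for a, b in zip(data, stream))

    def encrypt(self, plaintext: str) -> str:
        body = self._mask(plaintext.encode())
        tag = hmac.new(self.secret, body, hashlib.sha256).digest()
        return base64.b64encode(tag + body).decode()

    def decrypt(self, payload: str) -> str:
        raw = base64.b64decode(payload)
        tag, body = raw[:32], raw[32:]
        expected = hmac.new(self.secret, body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError(f"key {self.name!r} cannot decrypt this value")
        return self._mask(body).decode()


def encrypt(keys: list[ToyKey], plaintext: str) -> str:
    # The first key in the list is always used for new writes.
    key = keys[0]
    # Assumed storage format: the key name as a prefix, then the payload.
    return f"{key.name}:{key.encrypt(plaintext)}"


def decrypt(keys: list[ToyKey], value: str) -> str:
    # Every configured key is tried in order until one succeeds.
    _, _, payload = value.partition(":")
    for key in keys:
        try:
            return key.decrypt(payload)
        except ValueError:
            continue
    raise ValueError("no configured key can decrypt this value")


key1 = ToyKey("key1", b"old-secret")
key2 = ToyKey("key2", b"new-secret")

old_value = encrypt([key1], "backend-credentials")  # written before rotation
rotated = [key2, key1]                               # new key first, old key kept
new_value = encrypt(rotated, "user-token")           # new writes use key2
```

With `rotated`, values written under `key1` remain readable while new values carry the `key2` prefix. Dropping `key1` from the list before old records are re-encrypted would make those records unreadable, which is why old keys should be removed only after re-encryption.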

## Default permissions

By default, all users can create and manage their own projects. You can set `allow_non_admins_create_projects`
to `false` under `default_permissions` so that only global admins can create and manage projects:

<div editor-title="~/.dstack/server/config.yml">

```yaml
# ...

default_permissions:
allow_non_admins_create_projects: false
```
</div>
## Backward compatibility
2 changes: 1 addition & 1 deletion docs/docs/guides/troubleshooting.md
@@ -94,7 +94,7 @@ pointing to the gateway's hostname is configured.

#### Cause 1: Bad Authorization

If the service endpoint returns a 403 error, it is likely because the [`Authorization`](../concepts/services.md#service-endpoint)
If the service endpoint returns a 403 error, it is likely because the [`Authorization`](../concepts/services.md#access-the-endpoint)
header with the correct `dstack` token was not provided.

[//]: # (#### Other)
5 changes: 2 additions & 3 deletions docs/docs/quickstart.md
@@ -101,8 +101,7 @@ $ dstack init
</div>

By default, tasks run on a single instance. To run a distributed task, specify
[`nodes`](reference/dstack.yml/task.md#distributed-tasks),
and `dstack` will run it on a cluster.
[`nodes`](concepts/tasks.md#distributed-tasks), and `dstack` will run it on a cluster.

Run the configuration via [`dstack apply`](reference/cli/dstack/apply.md):

@@ -192,7 +191,7 @@ $ dstack init
</div>

!!! info "Gateway"
To enable [auto-scaling](reference/dstack.yml/service.md#auto-scaling), or use a custom domain with HTTPS,
To enable [auto-scaling](concepts/services.md#replicas-and-scaling), or use a custom domain with HTTPS,
set up a [gateway](concepts/gateways.md) before running the service.
If you're using [dstack Sky :material-arrow-top-right-thin:{ .external }](https://sky.dstack.ai){:target="_blank"},
a gateway is pre-configured for you.
5 changes: 2 additions & 3 deletions docs/docs/reference/dstack.yml/service.md
@@ -58,14 +58,13 @@ The `service` configuration type allows running [services](../../concepts/servic
eos_token: "</s>"
```

##### Limitations

Please note that model mapping is an experimental feature with the following limitations:

1. Doesn't work if your `chat_template` uses `bos_token`. As a workaround, replace `bos_token` inside `chat_template` with the token content itself.
2. Doesn't work if `eos_token` is defined in the model repository as a dictionary. As a workaround, set `eos_token` manually, as shown in the example above (see Chat template).

If you encounter any other issues, please make sure to file a [GitHub issue](https://github.com/dstackai/dstack/issues/new/choose).
If you encounter any other issues, please make sure to file a
[GitHub issue :material-arrow-top-right-thin:{ .external }](https://github.com/dstackai/dstack/issues/new/choose){:target="_blank"}.

### `scaling`
