assign workers keys (MystenLabs/narwhal#909)
arun-koshy authored Sep 8, 2022
1 parent 36bba8c commit 58a0f9b
Showing 29 changed files with 436 additions and 198 deletions.
118 changes: 68 additions & 50 deletions narwhal/Docker/README.md
Original file line number Diff line number Diff line change
# Narwhal Cluster Startup

## Introduction

This directory contains the configuration information needed to
quickly set up and spin up a small Narwhal cluster via [Docker Compose](https://docs.docker.com/compose/).

In this directory, you will find:

- The `Dockerfile` definition for a Narwhal node
- A `docker-compose.yml` file to allow you to quickly create a Narwhal cluster

## Quick start

First, you must install:

- [Docker](https://docs.docker.com/get-docker/)
- [Docker-compose](https://docs.docker.com/compose/install/)

Afterward, you can start the Narwhal cluster.

First, **make sure that you are in the `Docker` folder**. In the rest of this
document, we'll assume that we are in this folder:

```
$ cd Docker # Change to Docker directory
$ pwd # Print the current directory
narwhal/Docker
```

Then bring up the cluster via the following command:

```
$ docker-compose -f docker-compose.yml up
```

The first time this runs, `docker-compose` will build the Narwhal docker image. (This can take a few minutes
since the narwhal node binary needs to be built from the source code.) Then it will spin up
a cluster of _four nodes_ by doing the necessary setup for `primary` and `worker` nodes. Each
`primary` node will be connected to _one worker_ node.

The logs from the `primary` and `worker` nodes are available via

```
docker-compose logs primary_<num>
docker-compose logs worker_<num>
```

By default, the production (release) version of the Narwhal node will be compiled when the Docker image is being built.

To build the Docker image with the development version of Narwhal, which leads to shorter compile times and
a smaller binary (and image) size, run the `docker-compose` command as:

```
$ docker-compose build --build-arg BUILD_MODE=debug
```
## Build Docker image without docker-compose

To build the Narwhal node image without `docker-compose`, run:

```
$ docker build -f Dockerfile ../ --tag narwhal-node:latest
```
machine port and the corresponding container's port (ex. for someone to use a gRPC client on their local
computer to hit a primary's node container gRPC server). The [docker-compose.yml](docker-compose.yml) file
exports the gRPC port for each primary node so they can be accessible from the host machine.

For the default setup of _four primary_ nodes, the gRPC servers are listening to the following
local (machine) ports:

- `primary_0`: 8000
- `primary_1`: 8001
- `primary_2`: 8002
- `primary_3`: 8003

For example, to send a gRPC request to the `primary_1` node, use the URL: `127.0.0.1:8001`
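The index-to-port pattern above can be captured in a small helper. A minimal sketch in Python, assuming the default four-node compose setup with gRPC ports starting at 8000:

```python
def primary_grpc_addr(index: int, host: str = "127.0.0.1") -> str:
    """Return the host-side gRPC address for primary_<index> (default compose setup)."""
    if not 0 <= index <= 3:
        raise ValueError("the default cluster only has primary_0 .. primary_3")
    # primary_0 maps to 8000, primary_1 to 8001, and so on.
    return f"{host}:{8000 + index}"

print(primary_grpc_addr(1))  # 127.0.0.1:8001
```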

Just as you access the public gRPC endpoints on a primary node, you can
similarly **feed transactions** to the Narwhal cluster via the `worker` nodes, whose gRPC servers
are bound to local machine ports. To send transactions, the following local
ports can be used:

- `worker_0`: 7001
- `worker_1`: 7002
- `worker_2`: 7003
- `worker_3`: 7004

For example, to send a transaction to the `worker_2` node via gRPC, use the URL `127.0.0.1:7003`.
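The worker transaction ports follow the same pattern, offset by one (`worker_0` maps to 7001). A minimal sketch, assuming the default compose port assignments above:

```python
def worker_tx_addr(index: int, host: str = "127.0.0.1") -> str:
    """Return the host-side transactions address for worker_<index> (default compose setup)."""
    if not 0 <= index <= 3:
        raise ValueError("the default cluster only has worker_0 .. worker_3")
    # worker_0 maps to 7001, worker_1 to 7002, and so on.
    return f"{host}:{7001 + index}"

print(worker_tx_addr(2))  # 127.0.0.1:7003
```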

Here is the Docker folder structure:

```
├── README.md
├── validators
│   ├── validator-0
│   │   ├── primary-key.json
│   │   └── worker-key.json
│   ├── validator-1
│   │   ├── primary-key.json
│   │   └── worker-key.json
│   ├── validator-2
│   │   ├── primary-key.json
│   │   └── worker-key.json
│   ├── validator-3
│   │   ├── primary-key.json
│   │   └── worker-key.json
│   ├── committee.json
│   ├── workers.json
│   └── parameters.json
├── docker-compose.yml
└── entry.sh
```

Under the `validators` folder, you will find the independent configuration
folder for each validator node. (Remember: each `validator` consists of
one `primary` node and several `worker` nodes.)

The `primary-key.json` file contains the private `key` for the corresponding primary that
is associated with this node only.

The `worker-key.json` file contains the private `key` for the corresponding worker that
is associated with this node only.

The [parameters.json](validators/parameters.json) file is shared across all the nodes and contains
the core parameters for a node.

The [committee.json](validators/committee.json) file is shared across all the nodes and contains
the information about the validators (primary nodes), like the public keys, addresses,
ports available, etc.

The [workers.json](validators/workers.json) file is shared across all the nodes and contains
the information about the validators (worker nodes), like the public keys, addresses,
ports available, etc.
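Each `workers.json` entry is keyed by the primary's public key, then by worker ID. A sketch of reading a worker's transactions address in Python, using a fragment mirroring the `workers.json` shape from this commit (the helper name is illustrative):

```python
import json

# Illustrative fragment mirroring the workers.json shape in this commit.
workers_json = json.loads("""
{
  "workers": {
    "Zy82aSpF8QghKE4wWvyIoTWyLetCuUSfk2gxHEtwdbg=": {
      "0": {
        "name": "hgeabM6XVJdIJQJjas05IiZAm2VFhtxwqIz0gDMYyjw=",
        "primary_to_worker": "/dns/worker_0/tcp/4000/http",
        "transactions": "/dns/worker_0/tcp/4001/http",
        "worker_to_worker": "/dns/worker_0/tcp/4002/http"
      }
    }
  }
}
""")

def transactions_addr(cfg: dict, primary_key: str, worker_id: str) -> str:
    """Look up the multiaddr a client uses to send transactions to a worker."""
    return cfg["workers"][primary_key][worker_id]["transactions"]

print(transactions_addr(workers_json, "Zy82aSpF8QghKE4wWvyIoTWyLetCuUSfk2gxHEtwdbg=", "0"))
# /dns/worker_0/tcp/4001/http
```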

Note the current `docker-compose` setup is mounting the [Docker/validators](validators)
folder into the node containers, so one can edit the
configuration without having to rebuild the Docker image.

The following environment variables are available to be used for each service in the
[docker-compose.yml](docker-compose.yml) file configuration:
- `NODE_TYPE` with values `primary|worker`. Defines the node type to bootstrap.
- `AUTHORITY_ID`, a decimal number (for the current setup, the available values are `0..3`). Defines the
  ID of the validator that the node/service corresponds to, which determines which
  configuration to use under the `validators` folder.
- `LOG_LEVEL` is the logging level for the node, defined as the number of `v` parameters (e.g. `-vvv`). The
  levels map to the number of "v"s provided: `0 | 1 => "error", 2 => "warn", 3 => "info", 4 => "debug", 5 => "trace"`.
- `CONSENSUS_DISABLED` disables consensus (`Tusk`) for a primary node and enables the
  `gRPC` server. The corresponding argument is `--consensus-disabled`.
- `WORKER_ID` is the integer ID for the service when it runs as a worker.
- `CLEANUP_DISABLED`, when set to `true`, disables the clean-up of the database and log data
  in the validator folder. This is useful to preserve state between multiple Docker Compose runs.
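The `LOG_LEVEL`-to-verbosity mapping above can be sketched as a small lookup (Python; clamping values above 5 to `trace` is an assumption, not documented behavior):

```python
def log_level_name(num_v: int) -> str:
    """Map the number of -v flags to the resulting log level (per the table above)."""
    levels = {0: "error", 1: "error", 2: "warn", 3: "info", 4: "debug", 5: "trace"}
    # Values above 5 are clamped to "trace" here (an assumption).
    return levels[min(num_v, 5)]

print(log_level_name(3))  # info
```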

## How to run more than the default 4 nodes with docker compose

### Prerequisites

- python3
- You must build the narwhal `node` binary at the top level:

  `cargo build --release --features "benchmark"`

That binary is necessary for generating the keys for the validators and the `committee.json` seed file.

### Running the `gen.validators.sh #` script to generate a larger cluster

```
# arguments for script are {num_primary} & {num_worker_per_primary} in that order
```

The `parameters.json` is (so far) a static template and just dropped into that directory.
Also note that the primaries are created with only 1 worker node currently.
When multiple workers are needed, we'll add that feature.
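For a generated cluster, the per-validator files follow the folder structure shown earlier. A sketch of the expected layout, assuming the script mirrors the default `validators` tree (the helper name is hypothetical):

```python
def expected_validator_files(num_primary: int, root: str = "validators") -> list:
    """List the per-validator key files plus the shared config files (per the folder layout above)."""
    files = []
    for i in range(num_primary):
        # Each validator gets a primary key and (currently) one worker key.
        files.append(f"{root}/validator-{i}/primary-key.json")
        files.append(f"{root}/validator-{i}/worker-key.json")
    # Shared across all validators.
    for shared in ("committee.json", "workers.json", "parameters.json"):
        files.append(f"{root}/{shared}")
    return files

for path in expected_validator_files(2):
    print(path)
```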


## Grafana, Prometheus and Loki

The Grafana instance is exposed at http://localhost:3000/

The default user/pass is admin/admin. You can 'skip' changing that since it's always regenerated.

Grafana is the frontend dashboard and metrics explorer, as well as a means
for setting up alerts.

- https://grafana.com/oss/grafana/
- https://grafana.com/grafana/dashboards/ (published dashboards, a good place to start building)

Prometheus is the de facto standard for pulling metrics from targets and
storing them for use via Grafana and other services (alertmanager, scripts).

- https://prometheus.io/docs/introduction/overview/

Loki is a log collector and processor. It is exposed as a datasource
in Grafana and makes the logs easily searchable.

- https://grafana.com/oss/loki/

Currently there are no Loki dashboards defined; however, you can
browse the logs via the "Explorer", selecting the Loki datasource.


## Troubleshooting

#### 1. Compile errors when building Docker image

If you encounter errors while the Docker image is being built, for example:

```
error: could not compile `tonic`
#9 373.3
```
9 changes: 6 additions & 3 deletions narwhal/Docker/entry.sh
fi

# Environment variables to use on the script
NODE_BIN="./bin/node"
PRIMARY_KEYS_PATH=${PRIMARY_KEYS_PATH:="/validators/validator-$VALIDATOR_ID/primary-key.json"}
WORKER_KEYS_PATH=${WORKER_KEYS_PATH:="/validators/validator-$VALIDATOR_ID/worker-key.json"}
COMMITTEE_PATH=${COMMITTEE_PATH:="/validators/committee.json"}
WORKERS_PATH=${WORKERS_PATH:="/validators/workers.json"}
PARAMETERS_PATH=${PARAMETERS_PATH:="/validators/parameters.json"}
if [[ "$NODE_TYPE" = "primary" ]]; then
echo "Bootstrapping primary node"

$NODE_BIN $LOG_LEVEL run \
--primary-keys $PRIMARY_KEYS_PATH \
--worker-keys $WORKER_KEYS_PATH \
--committee $COMMITTEE_PATH \
--workers $WORKERS_PATH \
--store "${DATA_PATH}/validator-$VALIDATOR_ID/db-primary" \
elif [[ "$NODE_TYPE" = "worker" ]]; then
echo "Bootstrapping new worker node with id $WORKER_ID"

$NODE_BIN $LOG_LEVEL run \
--primary-keys $PRIMARY_KEYS_PATH \
--worker-keys $WORKER_KEYS_PATH \
--committee $COMMITTEE_PATH \
--workers $WORKERS_PATH \
--store "${DATA_PATH}/validator-$VALIDATOR_ID/db-worker-$WORKER_ID" \
5 changes: 5 additions & 0 deletions narwhal/Docker/validators/validator-0/worker-key.json
{
"type": "Ed25519KeyPair",
"name": "hgeabM6XVJdIJQJjas05IiZAm2VFhtxwqIz0gDMYyjw=",
"secret": "odAwN+GmVs9BMtBDiWDGT5fcpMmMzzUMtE/szmVpEVk="
}
5 changes: 5 additions & 0 deletions narwhal/Docker/validators/validator-1/worker-key.json
{
"type": "Ed25519KeyPair",
"name": "7o+0AxuS4muiCYxLOMT9pzdLug2LmrAzB7yLE7TrxwA=",
"secret": "a5IKgyaSAVRPtmzhvjGobSe5/gbFUi/2sF78DlLI/EE="
}
5 changes: 5 additions & 0 deletions narwhal/Docker/validators/validator-2/worker-key.json
{
"type": "Ed25519KeyPair",
"name": "63HCaWWXpqB8fhrGgBSNy7XfbSpakrILoXA5DaaYx1o=",
"secret": "M6HIPZPJQbjLAY9BVCjKsGTWooW6a+HKgtmuFGDaJTo="
}
5 changes: 5 additions & 0 deletions narwhal/Docker/validators/validator-3/worker-key.json
{
"type": "Ed25519KeyPair",
"name": "cJbSCbB7f3l/Tmeix4Ts3Th8xS3TTh4ohnakHKhObmc=",
"secret": "X/l+9RaoDB7N8Efn1kQ02T6nvW9kQ1aP5SktR/4GZT4="
}
4 changes: 4 additions & 0 deletions narwhal/Docker/validators/workers.json
"workers": {
"Zy82aSpF8QghKE4wWvyIoTWyLetCuUSfk2gxHEtwdbg=": {
"0": {
"name": "hgeabM6XVJdIJQJjas05IiZAm2VFhtxwqIz0gDMYyjw=",
"primary_to_worker": "/dns/worker_0/tcp/4000/http",
"transactions": "/dns/worker_0/tcp/4001/http",
"worker_to_worker": "/dns/worker_0/tcp/4002/http"
}
},
"fbhvgLnet2HdE0NUITUpekQxdRRWKxbZczM6Qg55sP8=": {
"0": {
"name": "7o+0AxuS4muiCYxLOMT9pzdLug2LmrAzB7yLE7TrxwA=",
"primary_to_worker": "/dns/worker_1/tcp/4000/http",
"transactions": "/dns/worker_1/tcp/4001/http",
"worker_to_worker": "/dns/worker_1/tcp/4002/http"
}
},
"noDjBFfXGqQioHTf6jEIPYthhUWCMsC12ZJ9DMh7Ujk=": {
"0": {
"name": "63HCaWWXpqB8fhrGgBSNy7XfbSpakrILoXA5DaaYx1o=",
"primary_to_worker": "/dns/worker_2/tcp/4000/http",
"transactions": "/dns/worker_2/tcp/4001/http",
"worker_to_worker": "/dns/worker_2/tcp/4002/http"
}
},
"Z+K3OEI/eldyTTdp27mQFDdBPqjkss9wOkN6RceDTuM=": {
"0": {
"name": "cJbSCbB7f3l/Tmeix4Ts3Th8xS3TTh4ohnakHKhObmc=",
"primary_to_worker": "/dns/worker_3/tcp/4000/http",
"transactions": "/dns/worker_3/tcp/4001/http",
"worker_to_worker": "/dns/worker_3/tcp/4002/http"
33 changes: 18 additions & 15 deletions narwhal/benchmark/benchmark/commands.py
    def generate_key(filename):
        return f'./node generate_keys --filename {filename}'

    @staticmethod
    def run_primary(primary_keys, worker_keys, committee, workers, store, parameters, debug=False):
        assert isinstance(primary_keys, str)
        assert isinstance(worker_keys, str)
        assert isinstance(committee, str)
        assert isinstance(workers, str)
        assert isinstance(parameters, str)
        assert isinstance(debug, bool)
        v = '-vvv' if debug else '-vv'
        return (f'./node {v} run --primary-keys {primary_keys} --worker-keys {worker_keys} '
                f'--committee {committee} --workers {workers} --store {store} '
                f'--parameters {parameters} primary')

    @staticmethod
    def run_no_consensus_primary(primary_keys, worker_keys, committee, workers, store, parameters, debug=False):
        assert isinstance(primary_keys, str)
        assert isinstance(worker_keys, str)
        assert isinstance(committee, str)
        assert isinstance(workers, str)
        assert isinstance(parameters, str)
        assert isinstance(debug, bool)
        v = '-vvv' if debug else '-vv'
        return (f'./node {v} run --primary-keys {primary_keys} --worker-keys {worker_keys} '
                f'--committee {committee} --workers {workers} --store {store} '
                f'--parameters {parameters} primary --consensus-disabled')

    @staticmethod
    def run_worker(primary_keys, worker_keys, committee, workers, store, parameters, id, debug=False):
        assert isinstance(primary_keys, str)
        assert isinstance(worker_keys, str)
        assert isinstance(committee, str)
        assert isinstance(workers, str)
        assert isinstance(parameters, str)
        assert isinstance(debug, bool)
        v = '-vvv' if debug else '-vv'
        return (f'./node {v} run --primary-keys {primary_keys} --worker-keys {worker_keys} '
                f'--committee {committee} --workers {workers} --store {store} '
                f'--parameters {parameters} worker --id {id}')

    @staticmethod
    def run_client(address, size, rate, nodes):

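For reference, the command string produced by the updated `run_primary` renders as follows. A standalone sketch mirroring the f-string logic above (the real method lives on the benchmark's command-builder class, which this diff shows only in part; the argument values are illustrative):

```python
def run_primary(primary_keys, worker_keys, committee, workers, store, parameters, debug=False):
    # Mirrors the updated benchmark/commands.py logic from this commit.
    v = '-vvv' if debug else '-vv'
    return (f'./node {v} run --primary-keys {primary_keys} --worker-keys {worker_keys} '
            f'--committee {committee} --workers {workers} --store {store} '
            f'--parameters {parameters} primary')

# Illustrative file names, not taken from the repository.
print(run_primary('.primary-0-key.json', '.worker-0-key.json',
                  '.committee.json', '.workers.json', '.db-0', '.parameters.json'))
```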