Reth Book - Stages framework/draft (paradigmxyz#360)
* Added framework and start to draft for stages, stopping here to get feedback on approach before pushing forward

* Update README.md

Fixing some wording / grammar.

* Fixing grammar/wording.

* Added note about non-exhaustive stages list, fixed grammar, fixed State execution function name, updated language to reflect that the stream yields a SealedHeader and added language to describe that the initial header validation is only a basic validation.

* updated stages chapter, added bodies, senders, execution, next chapter prelude

* typo

* Added line numbers to code snippets for the stages chapter of the reth book

* address reverse header download + other nits

* add note about book hosting

* tweaked wording, formatting

* Address typo "staring"

* Address typo "HeadderDownloader"

* consolidated book.toml

* updating snippets in stages chapter to ignore errors

* template & removed empty fields from book.toml

* addressed build issues, added templating for source code

* only deploy on push to main

* using single quotes in github action if expression

Co-authored-by: Andrew Kirillov <[email protected]>
Co-authored-by: Andrew Kirillov <[email protected]>
3 people authored Dec 14, 2022
1 parent cbbdac2 commit df9d141
Showing 14 changed files with 141 additions and 22 deletions.
14 changes: 14 additions & 0 deletions .github/workflows/book.yml
@@ -25,6 +25,12 @@ jobs:
          curl -sSL https://github.com/rust-lang/mdBook/releases/download/v0.4.14/mdbook-v0.4.14-x86_64-unknown-linux-gnu.tar.gz | tar -xz --directory=./mdbook
          echo `pwd`/mdbook >> $GITHUB_PATH
      - name: Install mdbook-template
        run: |
          mkdir mdbook-template
          curl -sSL https://github.com/sgoudham/mdbook-template/releases/latest/download/mdbook-template-x86_64-unknown-linux-gnu.tar.gz | tar -xz --directory=./mdbook-template
          echo `pwd`/mdbook-template >> $GITHUB_PATH
      - name: Run tests
        run: mdbook test

@@ -41,6 +47,12 @@ jobs:
          curl -sSL https://github.com/rust-lang/mdBook/releases/download/v0.4.14/mdbook-v0.4.14-x86_64-unknown-linux-gnu.tar.gz | tar -xz --directory=./mdbook
          echo `pwd`/mdbook >> $GITHUB_PATH
      - name: Install mdbook-template
        run: |
          mkdir mdbook-template
          curl -sSL https://github.com/sgoudham/mdbook-template/releases/latest/download/mdbook-template-x86_64-unknown-linux-gnu.tar.gz | tar -xz --directory=./mdbook-template
          echo `pwd`/mdbook-template >> $GITHUB_PATH
      - name: Build
        run: mdbook build

@@ -50,6 +62,8 @@ jobs:
          path: target/book

  deploy:
    # Only deploy if a push to main
    if: github.ref_name == 'main' && github.event_name == 'push'
    runs-on: ubuntu-latest
    needs: [test, build]

19 changes: 18 additions & 1 deletion book.toml
@@ -4,6 +4,23 @@ language = "en"
multilingual = false
src = "book"
title = "reth Book"
description = "A book on all things Reth"

[output.html]
git-repository-url = "https://github.com/paradigmxyz/reth"
default-theme = "ayu"
no-section-label = true

[output.html.fold]
enable = true
level = 1

[build]
build-dir = "target/book"

[preprocessor.template]
before = [ "links" ]

[preprocessor.index]

[preprocessor.links]
2 changes: 2 additions & 0 deletions book/README.md
@@ -5,6 +5,8 @@
<!-- Add a quick description about Reth, what it is, the goals of the build, and any other quick overview information -->


The book is continuously rendered [here](https://paradigmxyz.github.io/reth/)!

> 📖 **Contributing**
>
> You can contribute to this book on [GitHub][gh-book].
2 changes: 1 addition & 1 deletion book/SUMMARY.md
@@ -32,7 +32,7 @@
- [consensus]()
- [transaction-pool]()
- [Staged Sync]()
- [stages]()
- [stages](./stages/README.md)
- [Primitives]()
- [primitives]()
- [rlp]()
19 changes: 0 additions & 19 deletions book/book.toml

This file was deleted.

87 changes: 87 additions & 0 deletions book/stages/README.md
@@ -0,0 +1,87 @@
# Stages

The `stages` crate plays a central role in syncing the node, maintaining state, updating the database and more. The stages involved in the Reth pipeline are the `HeaderStage`, `BodyStage`, `SendersStage`, and `ExecutionStage` (note that this list is non-exhaustive, and more pipeline stages will be added in the near future). Each of these stages is queued up and stored within the Reth pipeline.

{{#template ../templates/source_and_github.md path_to_root=../../ path=crates/stages/src/pipeline.rs anchor=struct-Pipeline}}


When the node is first started, a new `Pipeline` is initialized and all of the stages are added into `Pipeline.stages`. Then, the `Pipeline::run` function is called, which starts the pipeline, executing all of the stages continuously in an infinite loop. This process syncs the chain, keeping everything up to date with the chain tip.

Each stage within the pipeline implements the `Stage` trait, which provides function interfaces to get the stage ID, execute the stage, and unwind the changes to the database if there was an issue during the stage execution.


{{#template ../templates/source_and_github.md path_to_root=../../ path=crates/stages/src/stage.rs anchor=trait-Stage}}
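
To make the execute/unwind contract more concrete, here is a minimal, synchronous sketch of a pipeline that runs its stages in order and unwinds a stage when it fails. It is not the Reth implementation: `ToyStage`, `StageError`, `run_once`, the simplified `ExecOutput` and the `Vec<String>` "database" are all hypothetical stand-ins chosen for illustration, and the real trait is async and generic over a database.

```rust
/// Hypothetical, simplified stand-in for the value a stage returns on success.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ExecOutput {
    stage_progress: u64,
    done: bool,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
struct StageError(String);

/// Synchronous toy version of a stage; the "database" is just a list of log entries.
trait Stage {
    fn id(&self) -> &'static str;
    fn execute(&mut self, db: &mut Vec<String>, target: u64) -> Result<ExecOutput, StageError>;
    fn unwind(&mut self, db: &mut Vec<String>, checkpoint: usize);
}

struct ToyStage {
    name: &'static str,
    fail: bool,
}

impl Stage for ToyStage {
    fn id(&self) -> &'static str {
        self.name
    }

    fn execute(&mut self, db: &mut Vec<String>, target: u64) -> Result<ExecOutput, StageError> {
        // Write first, then possibly fail, so that unwinding has something to undo.
        db.push(format!("{}@{}", self.name, target));
        if self.fail {
            return Err(StageError(format!("{} failed", self.name)));
        }
        Ok(ExecOutput { stage_progress: target, done: true })
    }

    fn unwind(&mut self, db: &mut Vec<String>, checkpoint: usize) {
        // Drop everything written after the checkpoint.
        db.truncate(checkpoint);
    }
}

#[derive(Default)]
struct Pipeline {
    stages: Vec<Box<dyn Stage>>,
}

impl Pipeline {
    fn push(mut self, stage: impl Stage + 'static) -> Self {
        self.stages.push(Box::new(stage));
        self
    }

    /// Runs every stage once, in insertion order; a failing stage is unwound
    /// to the state it started from and the error is returned to the caller.
    fn run_once(
        &mut self,
        db: &mut Vec<String>,
        target: u64,
    ) -> Result<Vec<(&'static str, ExecOutput)>, StageError> {
        let mut outputs = Vec::new();
        for stage in self.stages.iter_mut() {
            let checkpoint = db.len();
            match stage.execute(db, target) {
                Ok(output) => outputs.push((stage.id(), output)),
                Err(err) => {
                    stage.unwind(db, checkpoint);
                    return Err(err);
                }
            }
        }
        Ok(outputs)
    }
}

fn main() {
    let mut db = Vec::new();
    let mut pipeline = Pipeline::default()
        .push(ToyStage { name: "headers", fail: false })
        .push(ToyStage { name: "bodies", fail: false });
    let outputs = pipeline.run_once(&mut db, 10).expect("both toy stages succeed");
    println!("{outputs:?} {db:?}");
}
```

A real pipeline would call something like `run_once` repeatedly to keep following the chain tip; the sketch only shows a single pass.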

To get a better idea of what is happening at each part of the pipeline, let's walk through what is going on under the hood within the `execute()` function at each stage, starting with `HeaderStage`.

<br>

## HeaderStage

<!-- TODO: Cross-link to eth/65 chapter when it's written -->
The `HeaderStage` is responsible for syncing the block headers, validating the header integrity and writing the headers to the database. When the `execute()` function is called, the local head of the chain is updated to the most recent block height previously executed by the stage. At this point, the node status is also updated with that block's height, hash and total difficulty. These values are used during any new eth/65 handshakes. After updating the head, a stream is established with other peers in the network to sync the missing chain headers between the most recent state stored in the database and the chain tip. The `HeaderStage` contains a `downloader` attribute, which is a type that implements the `HeaderDownloader` trait. The `stream()` method from this trait is used to fetch headers from the network.

{{#template ../templates/source_and_github.md path_to_root=../../ path=crates/interfaces/src/p2p/headers/downloader.rs anchor=trait-HeaderDownloader}}

The `HeaderStage` relies on the downloader stream to return the headers in descending order starting from the chain tip down to the latest block in the database. While other stages in the `Pipeline` start from the most recent block in the database up to the chain tip, the `HeaderStage` works in reverse to avoid [long-range attacks](https://messari.io/report/long-range-attack). When a node downloads headers in ascending order, it will not know if it is being subjected to a long-range attack until it reaches the most recent blocks. To combat this, the `HeaderStage` starts by getting the chain tip from the Consensus Layer, verifies the tip, and then walks backwards by the parent hash. Each value yielded from the stream is a `SealedHeader`.

{{#template ../templates/source_and_github.md path_to_root=../../ path=crates/primitives/src/header.rs anchor=struct-SealedHeader}}

Each `SealedHeader` is then validated to ensure that it has the proper parent. Note that this is only a basic response validation: the `HeaderDownloader` calls its `validate` method while streaming, so each header has already been validated according to the consensus specification before it is yielded from the stream. After this, each header is written to the database. If a header is not valid or the stream encounters any other error, the error is propagated up through the stage execution, the changes to the database are unwound and the stage is resumed from the most recent valid state.
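
The parent check on a descending stream can be sketched as follows. This is an illustration under simplifying assumptions rather than the code the stage runs: `ToyHeader` and `validate_descending` are hypothetical names, and hashes are plain `u64` values instead of 32-byte Keccak hashes.

```rust
/// Hypothetical, stripped-down header; real headers carry 32-byte hashes and many more fields.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ToyHeader {
    number: u64,
    hash: u64,
    parent_hash: u64,
}

/// Checks headers received in descending order, starting from a trusted tip hash.
/// On failure, returns the number of the first header that does not link up.
fn validate_descending(trusted_tip_hash: u64, headers: &[ToyHeader]) -> Result<(), u64> {
    let mut expected_hash = trusted_tip_hash;
    let mut child_number: Option<u64> = None;
    for header in headers {
        // The header must be exactly one block below the header seen before it.
        let number_ok = match child_number {
            None => true,
            Some(child) => header.number.checked_add(1) == Some(child),
        };
        // Its hash must match the parent hash recorded in its child (or the trusted tip).
        if header.hash != expected_hash || !number_ok {
            return Err(header.number);
        }
        expected_hash = header.parent_hash;
        child_number = Some(header.number);
    }
    Ok(())
}

fn main() {
    let chain = vec![
        ToyHeader { number: 3, hash: 103, parent_hash: 102 },
        ToyHeader { number: 2, hash: 102, parent_hash: 101 },
        ToyHeader { number: 1, hash: 101, parent_hash: 100 },
    ];
    println!("{:?}", validate_descending(103, &chain));
}
```

Because every header is tied to its child by hash, a peer cannot substitute an alternative history without breaking the link to the tip that was verified first.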

This process continues until all of the headers have been downloaded and written to the database. Finally, the total difficulty of the chain's head is updated and the function returns `Ok(ExecOutput { stage_progress: current_progress, reached_tip: true, done: true })`, signaling that the header sync has completed successfully.

<br>

## BodyStage

Once the `HeaderStage` completes successfully, the `BodyStage` will start execution. The body stage downloads block bodies for all of the new block headers that were stored locally in the database. The `BodyStage` first determines which block bodies to download by checking whether the ommers hash and transactions root in each block header are empty.

An ommers hash is the Keccak-256 hash of the ommers list portion of the block. If you are unfamiliar with ommer blocks, you can [click here to learn more](https://ethereum.org/en/glossary/#ommer). Note that while ommer blocks were important for new blocks created during Ethereum's proof-of-work chain, Ethereum's proof-of-stake chain selects exactly one block proposer at a time, so ommer blocks are not needed in post-merge Ethereum.

The transactions root is a value that is calculated based on the transactions included in the block. To derive the transactions root, a [Merkle tree](https://blog.ethereum.org/2015/11/15/merkling-in-ethereum) is created from the block's transactions list. The transactions root is then derived by taking the Keccak-256 hash of the root node of the Merkle tree.

When the `BodyStage` is looking at the headers to determine which block bodies to download, it will skip the blocks where the `header.ommers_hash` and the `header.transactions_root` are empty (that is, equal to the hash of an empty ommers list and the root of an empty transactions trie), denoting that the block body is empty as well.
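
As a rough sketch of that filter, the snippet below compares each header against the two well-known "empty" values: the Keccak-256 hash of an RLP-encoded empty list and the root of an empty trie. The `ToyHeader` struct, the helper names and the use of hex strings instead of 32-byte hash types are assumptions made for illustration only.

```rust
/// Keccak-256 of the RLP encoding of an empty list (the ommers hash of a block without ommers).
const EMPTY_LIST_HASH: &str =
    "0x1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347";
/// Root of an empty Merkle Patricia trie (the transactions root of a block without transactions).
const EMPTY_ROOT: &str =
    "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421";

/// Hypothetical header with only the fields this check needs.
struct ToyHeader {
    number: u64,
    ommers_hash: &'static str,
    transactions_root: &'static str,
}

/// A block has an empty body when it has neither ommers nor transactions.
fn is_empty_block(header: &ToyHeader) -> bool {
    header.ommers_hash == EMPTY_LIST_HASH && header.transactions_root == EMPTY_ROOT
}

/// Returns the block numbers whose bodies still need to be requested from peers.
fn bodies_to_download(headers: &[ToyHeader]) -> Vec<u64> {
    headers
        .iter()
        .filter(|header| !is_empty_block(header))
        .map(|header| header.number)
        .collect()
}

fn main() {
    let headers = [
        ToyHeader { number: 1, ommers_hash: EMPTY_LIST_HASH, transactions_root: EMPTY_ROOT },
        // "0x01" is a made-up placeholder for a non-empty transactions root.
        ToyHeader { number: 2, ommers_hash: EMPTY_LIST_HASH, transactions_root: "0x01" },
    ];
    println!("{:?}", bodies_to_download(&headers));
}
```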

Once the `BodyStage` determines which block bodies to fetch, a new `bodies_stream` is created which downloads all of the bodies from the `starting_block` up to the specified `target_block`. Each time the `bodies_stream` yields a value, a `BlockLocked` is created using the block header, the ommers and the newly downloaded block body.

{{#template ../templates/source_and_github.md path_to_root=../../ path=crates/primitives/src/block.rs anchor=struct-BlockLocked}}

The new block is then pre-validated, checking that the ommers hash and transactions root in the block header match the values computed from the block body. Following a successful pre-validation, the `BodyStage` loops through each transaction in the `block.body`, adding the transaction to the database. This process is repeated for every downloaded block body, with the `BodyStage` returning `Ok(ExecOutput { stage_progress: highest_block, reached_tip: true, done })`, signaling that it completed successfully.

<br>

## SendersStage

Following a successful `BodyStage`, the `SendersStage` starts to execute. The `SendersStage` is responsible for recovering the transaction sender for each of the transactions newly added to the database. At the beginning of the execution function, all of the transactions are first retrieved from the database. Then the `SendersStage` goes through each transaction and recovers the signer from the transaction signature and hash. The transaction hash is derived by taking the Keccak-256 hash of the RLP-encoded transaction bytes. This hash is then passed into the `recover_signer` function.

{{#template ../templates/source_and_github.md path_to_root=../../ path=crates/primitives/src/transaction/signature.rs anchor=fn-recover_signer}}

In an [ECDSA (Elliptic Curve Digital Signature Algorithm) signature](https://wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm), the "r", "s", and "v" values are three pieces of data that are used to mathematically verify the authenticity of a digital signature. ECDSA is a widely used algorithm for generating and verifying digital signatures, and it is often used in cryptocurrencies like Ethereum.

The "r" value is the x-coordinate of a point on the elliptic curve that is calculated as part of the signing process. The "s" value is computed during signing from the private key, the hash of the message being signed, "r" and a per-signature nonce. Lastly, "v" is the recovery value, which identifies which of the candidate curve points was used, so that the signer's public key can be recovered from the signature and the message hash. Together, the "r", "s", and "v" values make up the transaction's signature, and they are used to verify the authenticity of the signed transaction.
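
To show how these three values are typically handled before recovery, here is a small sketch that decodes a legacy transaction's "v" into a y-parity bit (and, for EIP-155 transactions, a chain ID) and packs the result into a 65-byte `r || s || recovery id` buffer. The function names `decode_v` and `pack_signature` are hypothetical, and the byte layout is the common compact form, assumed here because the body of `recover_signer` is not shown in full.

```rust
/// Splits a legacy transaction `v` value into (y-parity, optional chain ID).
/// Pre-EIP-155 signatures use 27 or 28; EIP-155 uses `chain_id * 2 + 35 + parity`.
fn decode_v(v: u64) -> Option<(u8, Option<u64>)> {
    match v {
        27 | 28 => Some(((v - 27) as u8, None)),
        v if v >= 35 => Some((((v - 35) % 2) as u8, Some((v - 35) / 2))),
        _ => None,
    }
}

/// Packs `r`, `s` and the parity bit into a 65-byte buffer laid out as r || s || recovery id.
/// The layout is an assumption of this sketch, not taken from the Reth source.
fn pack_signature(r: [u8; 32], s: [u8; 32], parity: u8) -> [u8; 65] {
    let mut sig = [0u8; 65];
    sig[..32].copy_from_slice(&r);
    sig[32..64].copy_from_slice(&s);
    sig[64] = parity;
    sig
}

fn main() {
    // v = 38 on chain ID 1 decodes to parity 1.
    let (parity, chain_id) = decode_v(38).expect("38 is a valid EIP-155 v value");
    let sig = pack_signature([0x11; 32], [0x22; 32], parity);
    println!("parity={parity} chain_id={chain_id:?} last_byte={}", sig[64]);
}
```

The packed buffer, together with the transaction hash, is the kind of input a secp256k1 recovery routine needs in order to return the signer's public key, from which the address is derived.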

Once the transaction signer has been recovered, the signer is then added to the database. This process is repeated for every transaction that was retrieved, and similarly to previous stages, `Ok(ExecOutput { stage_progress: max_block_num, done: true, reached_tip: true })` is returned to signal a successful completion of the stage.

<br>

## ExecutionStage

Finally, after all headers, bodies and senders are added to the database, the `ExecutionStage` starts to execute. This stage is responsible for executing all of the transactions and updating the state stored in the database. For every new block header added to the database, the corresponding transactions have their signers attached to them and `reth_executor::executor::execute_and_verify_receipt()` is called, pushing the state changes resulting from the execution to a `Vec`.

{{#template ../templates/source_and_github.md path_to_root=../../ path=crates/stages/src/stages/execution.rs anchor=snippet-block_change_patches}}

After all headers and their corresponding transactions have been executed, all of the resulting state changes are applied to the database, updating account balances, account bytecode and other state changes. After applying all of the execution state changes, if there was a block reward, it is applied to the validator's account.
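
The two-phase shape described here (collect the changes for every block first, then apply them) can be sketched as below. Everything in it is a hypothetical simplification: `BlockChanges`, `apply_changes`, string account names and a `HashMap` of balances stand in for the real change sets and database tables.

```rust
use std::collections::HashMap;

/// Hypothetical result of executing one block: new balances plus an optional block reward.
struct BlockChanges {
    block_number: u64,
    new_balances: Vec<(&'static str, u128)>,
    block_reward: Option<(&'static str, u128)>,
}

/// Applies the collected changes in block order and returns the last block applied, if any.
fn apply_changes(
    state: &mut HashMap<&'static str, u128>,
    patches: &[BlockChanges],
) -> Option<u64> {
    let mut last_block = None;
    for patch in patches {
        for (account, balance) in &patch.new_balances {
            state.insert(*account, *balance);
        }
        // The reward, when present, is credited on top of the execution changes.
        if let Some((beneficiary, reward)) = patch.block_reward {
            *state.entry(beneficiary).or_insert(0) += reward;
        }
        last_block = Some(patch.block_number);
    }
    last_block
}

fn main() {
    let mut state = HashMap::from([("alice", 100u128)]);
    // Phase 1: "execute" each block and push its changes to a Vec.
    let mut patches = Vec::new();
    patches.push(BlockChanges {
        block_number: 1,
        new_balances: vec![("alice", 90), ("bob", 10)],
        block_reward: Some(("miner", 2)),
    });
    patches.push(BlockChanges {
        block_number: 2,
        new_balances: vec![("bob", 5), ("carol", 5)],
        block_reward: None,
    });
    // Phase 2: apply everything to the state in one pass.
    let last_block = apply_changes(&mut state, &patches);
    println!("{last_block:?} {state:?}");
}
```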

At the end of the `execute()` function, a familiar value is returned, `Ok(ExecOutput { done: is_done, reached_tip: true, stage_progress: last_block })` signaling a successful completion of the `ExecutionStage`.

<br>

# Next Chapter

Now that we have covered all of the stages that are currently included in the `Pipeline`, you know how the Reth client stays synced with the chain tip and updates the database with all of the new headers, bodies, senders and state changes. While this chapter provides an overview on how the pipeline stages work, the following chapters will dive deeper into the database, the networking stack and other exciting corners of the Reth codebase. Feel free to check out any parts of the codebase mentioned in this chapter, and when you are ready, the next chapter will dive into the `database`.

[Next Chapter]()



4 changes: 4 additions & 0 deletions book/templates/source_and_github.md
@@ -0,0 +1,4 @@
[File: [[ #path ]]](https://github.com/paradigmxyz/reth/blob/main/[[ #path ]])
```rust,no_run,noplayground
{{#include [[ #path_to_root ]][[ #path ]]:[[ #anchor ]]}}
```
2 changes: 2 additions & 0 deletions crates/interfaces/src/p2p/headers/downloader.rs
@@ -13,6 +13,7 @@ use reth_rpc_types::engine::ForkchoiceState;
///
/// A downloader represents a distinct strategy for submitting requests to download block headers,
/// while a [HeadersClient] represents a client capable of fulfilling these requests.
// ANCHOR: trait-HeaderDownloader
#[auto_impl::auto_impl(&, Arc, Box)]
pub trait HeaderDownloader: Downloader {
    /// Stream the headers
@@ -28,6 +29,7 @@ pub trait HeaderDownloader: Downloader {
        Ok(())
    }
}
// ANCHOR_END: trait-HeaderDownloader

/// Validate whether the header is valid in relation to it's parent
///
2 changes: 2 additions & 0 deletions crates/primitives/src/block.rs
@@ -22,6 +22,7 @@ impl Deref for Block {
}

/// Sealed Ethereum full block.
// ANCHOR: struct-BlockLocked
#[derive(Debug, Clone, PartialEq, Eq, Default, RlpEncodable, RlpDecodable)]
pub struct BlockLocked {
    /// Locked block header.
@@ -31,6 +32,7 @@ pub struct BlockLocked {
    /// Ommer/uncle headers
    pub ommers: Vec<SealedHeader>,
}
// ANCHOR_END: struct-BlockLocked

impl BlockLocked {
    /// Header hash.
2 changes: 2 additions & 0 deletions crates/primitives/src/header.rs
@@ -203,13 +203,15 @@ impl Decodable for Header {

/// A [`Header`] that is sealed at a precalculated hash, use [`SealedHeader::unseal()`] if you want
/// to modify header.
// ANCHOR: struct-SealedHeader
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct SealedHeader {
    /// Locked Header fields.
    header: Header,
    /// Locked Header hash.
    hash: BlockHash,
}
// ANCHOR_END: struct-SealedHeader

impl Default for SealedHeader {
    fn default() -> Self {
2 changes: 2 additions & 0 deletions crates/primitives/src/transaction/signature.rs
@@ -69,6 +69,7 @@ impl Signature {
    }

    /// Recover signature from hash.
    // ANCHOR: fn-recover_signer
    pub(crate) fn recover_signer(&self, hash: H256) -> Option<Address> {
        let mut sig: [u8; 65] = [0; 65];

@@ -80,4 +81,5 @@ impl Signature {
        // errors and we care only if recovery is passing or not.
        secp256k1::recover(&sig, hash.as_fixed_bytes()).ok()
    }
    // ANCHOR_END: fn-recover_signer
}
2 changes: 2 additions & 0 deletions crates/stages/src/pipeline.rs
@@ -73,11 +73,13 @@ use state::*;
///
/// The unwind priority is set with [Pipeline::push_with_unwind_priority]. Stages with higher unwind
/// priorities are unwound first.
// ANCHOR: struct-Pipeline
pub struct Pipeline<DB: Database> {
    stages: Vec<QueuedStage<DB>>,
    max_block: Option<BlockNumber>,
    events_sender: MaybeSender<PipelineEvent>,
}
// ANCHOR_END: struct-Pipeline

impl<DB: Database> Default for Pipeline<DB> {
    fn default() -> Self {
2 changes: 2 additions & 0 deletions crates/stages/src/stage.rs
@@ -60,6 +60,7 @@ pub struct UnwindOutput {
///
/// Stages receive [`StageDB`] which manages the lifecycle of a transaction,
/// such as when to commit / reopen a new one etc.
// ANCHOR: trait-Stage
#[async_trait]
pub trait Stage<DB: Database>: Send + Sync {
    /// Get the ID of the stage.
@@ -81,3 +82,4 @@ pub trait Stage<DB: Database>: Send + Sync {
        input: UnwindInput,
    ) -> Result<UnwindOutput, Box<dyn std::error::Error + Send + Sync>>;
}
// ANCHOR_END: trait-Stage
4 changes: 3 additions & 1 deletion crates/stages/src/stages/execution.rs
@@ -218,7 +218,8 @@ impl<DB: Database> Stage<DB> for ExecutionStage {
            let mut state_provider =
                SubState::new(State::new(StateProviderImplRefLatest::new(db_tx)));

            // executiong and store output to results
            // execute and store output to results
            // ANCHOR: snippet-block_change_patches
            block_change_patches.push((
                reth_executor::executor::execute_and_verify_receipt(
                    header,
@@ -230,6 +231,7 @@ impl<DB: Database> Stage<DB> for ExecutionStage {
                start_tx_index,
                block_reward_index,
            ));
            // ANCHOR_END: snippet-block_change_patches
        }

        // apply changes to plain database.