Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] SidecarBuilder fails to encode anything bigger than 1 blob of data #1388

Closed
segfault-magnet opened this issue Sep 27, 2024 · 2 comments · Fixed by #1818
Closed

[Bug] SidecarBuilder fails to encode anything bigger than 1 blob of data #1388

segfault-magnet opened this issue Sep 27, 2024 · 2 comments · Fixed by #1818
Labels
bug Something isn't working

Comments

@segfault-magnet
Copy link

segfault-magnet commented Sep 27, 2024

Component

consensus, eips, genesis

What version of Alloy are you on?

├── alloy v0.3.6 │ ├── alloy-consensus v0.3.6 │ │ ├── alloy-eips v0.3.6 │ │ │ ├── alloy-eip2930 v0.1.0 │ │ │ │ ├── alloy-primitives v0.8.5 │ │ │ │ │ ├── alloy-rlp v0.3.8 │ │ │ │ │ │ ├── alloy-rlp-derive v0.3.8 (proc-macro) │ │ │ │ │ │ ├── alloy-rlp v0.3.8 () │ │ │ │ ├── alloy-rlp v0.3.8 () │ │ │ ├── alloy-eip7702 v0.1.0 │ │ │ │ ├── alloy-primitives v0.8.5 () │ │ │ │ ├── alloy-rlp v0.3.8 () │ │ │ ├── alloy-primitives v0.8.5 () │ │ │ ├── alloy-rlp v0.3.8 () │ │ │ ├── alloy-serde v0.3.6 │ │ │ │ ├── alloy-primitives v0.8.5 () │ │ ├── alloy-primitives v0.8.5 () │ │ ├── alloy-rlp v0.3.8 () │ │ ├── alloy-serde v0.3.6 () │ ├── alloy-contract v0.3.6 │ │ ├── alloy-dyn-abi v0.8.5 │ │ │ ├── alloy-json-abi v0.8.5 │ │ │ │ ├── alloy-primitives v0.8.5 () │ │ │ │ ├── alloy-sol-type-parser v0.8.5 │ │ │ ├── alloy-primitives v0.8.5 () │ │ │ ├── alloy-sol-type-parser v0.8.5 () │ │ │ ├── alloy-sol-types v0.8.5 │ │ │ │ ├── alloy-json-abi v0.8.5 () │ │ │ │ ├── alloy-primitives v0.8.5 () │ │ │ │ ├── alloy-sol-macro v0.8.5 (proc-macro) │ │ │ │ │ ├── alloy-sol-macro-expander v0.8.5 │ │ │ │ │ │ ├── alloy-json-abi v0.8.5 │ │ │ │ │ │ │ ├── alloy-primitives v0.8.5 │ │ │ │ │ │ │ ├── alloy-sol-type-parser v0.8.5 │ │ │ │ │ │ ├── alloy-sol-macro-input v0.8.5 │ │ │ │ │ │ │ ├── alloy-json-abi v0.8.5 () │ │ │ │ │ ├── alloy-sol-macro-input v0.8.5 () │ │ ├── alloy-json-abi v0.8.5 () │ │ ├── alloy-network v0.3.6 │ │ │ ├── alloy-consensus v0.3.6 () │ │ │ ├── alloy-eips v0.3.6 () │ │ │ ├── alloy-json-rpc v0.3.6 │ │ │ │ ├── alloy-primitives v0.8.5 () │ │ │ │ ├── alloy-sol-types v0.8.5 () │ │ │ ├── alloy-network-primitives v0.3.6 │ │ │ │ ├── alloy-eips v0.3.6 () │ │ │ │ ├── alloy-primitives v0.8.5 () │ │ │ │ ├── alloy-serde v0.3.6 () │ │ │ ├── alloy-primitives v0.8.5 () │ │ │ ├── alloy-rpc-types-eth v0.3.6 │ │ │ │ ├── alloy-consensus v0.3.6 () │ │ │ │ ├── alloy-eips v0.3.6 () │ │ │ │ ├── alloy-network-primitives v0.3.6 () │ │ │ │ ├── alloy-primitives v0.8.5 () │ │ │ │ ├── alloy-rlp v0.3.8 () │ │ │ │ ├── alloy-serde v0.3.6 () │ │ │ │ ├── alloy-sol-types v0.8.5 () │ │ │ ├── alloy-serde v0.3.6 () │ │ │ ├── alloy-signer v0.3.6 │ │ │ │ ├── alloy-primitives v0.8.5 () │ │ │ ├── alloy-sol-types v0.8.5 () │ │ ├── alloy-network-primitives v0.3.6 () │ │ ├── alloy-primitives v0.8.5 () │ │ ├── alloy-provider v0.3.6 │ │ │ ├── alloy-chains v0.1.34 │ │ │ ├── alloy-consensus v0.3.6 () │ │ │ ├── alloy-eips v0.3.6 () │ │ │ ├── alloy-json-rpc v0.3.6 () │ │ │ ├── alloy-network v0.3.6 () │ │ │ ├── alloy-network-primitives v0.3.6 () │ │ │ ├── alloy-primitives v0.8.5 () │ │ │ ├── alloy-pubsub v0.3.6 │ │ │ │ ├── alloy-json-rpc v0.3.6 () │ │ │ │ ├── alloy-primitives v0.8.5 () │ │ │ │ ├── alloy-transport v0.3.6 │ │ │ │ │ ├── alloy-json-rpc v0.3.6 () │ │ │ ├── alloy-rpc-client v0.3.6 │ │ │ │ ├── alloy-json-rpc v0.3.6 () │ │ │ │ ├── alloy-primitives v0.8.5 () │ │ │ │ ├── alloy-pubsub v0.3.6 () │ │ │ │ ├── alloy-transport v0.3.6 () │ │ │ │ ├── alloy-transport-http v0.3.6 │ │ │ │ │ ├── alloy-json-rpc v0.3.6 () │ │ │ │ │ ├── alloy-transport v0.3.6 () │ │ │ │ ├── alloy-transport-ipc v0.3.6 │ │ │ │ │ ├── alloy-json-rpc v0.3.6 () │ │ │ │ │ ├── alloy-pubsub v0.3.6 () │ │ │ │ │ ├── alloy-transport v0.3.6 () │ │ │ │ ├── alloy-transport-ws v0.3.6 │ │ │ │ │ ├── alloy-pubsub v0.3.6 () │ │ │ │ │ ├── alloy-transport v0.3.6 () │ │ │ ├── alloy-rpc-types-eth v0.3.6 () │ │ │ ├── alloy-transport v0.3.6 () │ │ │ ├── alloy-transport-http v0.3.6 () │ │ │ ├── alloy-transport-ipc v0.3.6 () │ │ │ ├── alloy-transport-ws v0.3.6 () │ │ ├── alloy-pubsub v0.3.6 () │ │ ├── alloy-rpc-types-eth v0.3.6 () │ │ ├── alloy-sol-types v0.8.5 () │ │ ├── alloy-transport v0.3.6 () │ ├── alloy-core v0.8.5 │ │ ├── alloy-dyn-abi v0.8.5 () │ │ ├── alloy-json-abi v0.8.5 () │ │ ├── alloy-primitives v0.8.5 () │ │ └── alloy-sol-types v0.8.5 () │ ├── alloy-eips v0.3.6 () │ ├── alloy-network v0.3.6 () │ ├── alloy-provider v0.3.6 () │ ├── alloy-pubsub v0.3.6 () │ ├── alloy-rpc-client v0.3.6 () │ ├── alloy-rpc-types v0.3.6 │ │ ├── alloy-rpc-types-eth v0.3.6 () │ │ ├── alloy-serde v0.3.6 () │ ├── alloy-signer v0.3.6 () │ ├── alloy-signer-local v0.3.6 │ │ ├── alloy-consensus v0.3.6 () │ │ ├── alloy-network v0.3.6 () │ │ ├── alloy-primitives v0.8.5 () │ │ ├── alloy-signer v0.3.6 () │ ├── alloy-transport v0.3.6 () │ ├── alloy-transport-http v0.3.6 () │ ├── alloy-transport-ipc v0.3.6 () │ └── alloy-transport-ws v0.3.6 (*)

Operating System

Linux

Describe the bug

I've noticed that all of my blob transactions have data only in the last blob, while the preceding ones are filled with 0s.

Upon closer inspection it turns out that the SidecarBuilder behaves weirdly once you approach the 1 blob boundary.

Reproduction:

    #[test]
    fn reproduction() {
        use alloy::consensus::SidecarBuilder;
        use alloy::consensus::SidecarCoder;
        use alloy::{consensus::SimpleCoder, eips::eip4844};
        use rand::RngCore;
        use rand::SeedableRng;

        let mut rng = rand::rngs::SmallRng::from_seed([0; 32]);

        let mut random_data = |num_input_bytes| {
            let mut data = vec![0; num_input_bytes];
            rng.fill_bytes(&mut data);

            data
        };

        let encode_blobs = |data: Vec<u8>| {
            let builder = SidecarBuilder::from_coder_and_data(SimpleCoder::default(), &data);

            builder.build().unwrap().blobs
        };

        let decode_blobs = |blobs: Vec<eip4844::Blob>| {
            let chunks = SimpleCoder::default().decode_all(&blobs)?;

            Some(chunks.into_iter().flatten().collect::<Vec<u8>>())
        };

        // This works for data sizes in the range (0,126915)
        {
            let small_amount_of_data = random_data(50_000);
            let blobs = encode_blobs(small_amount_of_data.clone());

            let recreated_data = decode_blobs(blobs).unwrap();

            assert_eq!(small_amount_of_data, recreated_data);
        }

        // for the range [126915, 126946) we get back None when decoding the data
        {
            let around_the_blob_boundary = random_data(126915);
            let blobs = encode_blobs(around_the_blob_boundary.clone());
            assert_eq!(blobs.len(), 1);

            let recreated_data = decode_blobs(blobs);

            assert!(recreated_data.is_none());
        }

        // and anything in [126946, +Inf) we get back empty data
        {
            let bigger_than_a_blob_when_encoded = random_data(126946);
            let blobs = encode_blobs(bigger_than_a_blob_when_encoded.clone());
            assert_eq!(blobs.len(), 2);
            // every blob, except the last one, is filled with 0s
            assert!(blobs[0].iter().all(|byte| *byte == 0));

            let recreated_data = decode_blobs(blobs).unwrap();

            assert!(recreated_data.is_empty());
        }
    }

I think this might be happening due to the way SidecarBuilder is allocating and giving blobs to the underlying coder:
preallocation of empty blobs

    /// Create a new builder, preallocating room for `capacity` blobs, and push
    /// an empty blob to it.
    pub fn with_capacity(capacity: usize) -> Self {
        let mut blobs = Vec::with_capacity(capacity);
        blobs.push(Blob::new([0u8; BYTES_PER_BLOB]));
        Self { blobs, fe: 0 }
    }

this makes sure that for the last example there are two blobs which the coder can fill.

The issue (probably) is due to the coder asking for the current blob and that is always the last blob that the builder preallocated.

What ends up happening is that out of all of the preallocated blobs only the last one is filled.

Since the docstring says this should be used for large amounts of data;

/// Build a [`BlobTransactionSidecar`] from an arbitrary amount of data.
///
/// This is useful for creating a sidecar from a large amount of data,
/// which is then split into blobs. It delays KZG commitments and proofs
/// until all data is ready.
///
/// [`BlobTransactionSidecar`]: crate::eip4844::BlobTransactionSidecar

these issues caught me by surprise.

Giving the input data byte by byte to the builder solves the issues (but is inefficient, goes without saying)

        let encode_blobs = |data: Vec<u8>| {
            let mut builder = SidecarBuilder::<SimpleCoder>::new();
            for byte in &data {
                builder.ingest(std::slice::from_ref(byte));
            }

            builder.build().unwrap().blobs
        };

Nope, byte-by-byte doesn't solve it, just pushes the limit to 256000B where we again get None from decode_all.

@segfault-magnet segfault-magnet added the bug Something isn't working label Sep 27, 2024
@mattsse
Copy link
Member

mattsse commented Sep 27, 2024

@prestwich I'm not that familiar with this but the issue appears to be that ingest first pre allocates a second blob:

self.inner.alloc_fes(self.coder.required_fe(data));
self.coder.code(&mut self.inner, data);
}

but only ever fills the last blob:

self.blobs.last_mut().expect("never empty")

@StackOverflowExcept1on
Copy link
Contributor

@mattsse I ran into same problem. Here's a minimal example to reproduce and let you play around with the boundary payload sizes that are causing problems.

// [dependencies]
// alloy = { version = "0.8.3", features = ["consensus", "kzg"] }

use alloy::consensus::{SidecarBuilder, SidecarCoder, SimpleCoder};

fn main() {
    // let data1 = vec![0xff; 124 * 1024 - 30]; // x2 blobs err
    // let data1 = vec![0xff; 124 * 1024 - 31]; // x1 blobs err
    // let data1 = vec![0xff; 124 * 1024 - 60]; // x1 blobs err
    let data1 = vec![0xff; 124 * 1024 - 61]; // x1 blobs err
    // let data1 = vec![0xff; 124 * 1024 - 62]; // x1 blobs ok

    let blob_sidecar = SidecarBuilder::<SimpleCoder>::from_slice(&data1)
        .build()
        .expect("failed to encode blobs");

    let mut coder = SimpleCoder::default();
    let data2 = coder
        .decode_all(&blob_sidecar.blobs)
        .expect("failed to decode blobs")
        .concat();

    assert_eq!(data1, data2);
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants