Introduce `EbmlParsable` trait to make the EBML parsing more ergonomic #115

FreezyLemon · 2023-04-02T16:12:52Z

Okay, this should be the last time I do major changes to the EBML parsing code.

It has taken a while, but after this I feel good about going over the Matroska specification to fix the obvious issues (default values, optional vs. non-optional values). After that, I'd like to fix the remuxer and the info tool.

* Introduce `EbmlParsable` trait to parse most known EBML types
* Rename `ErrorKind` to `ParseError`
* Move Element ID from `ParseError` into the `Error` enum

* Remove `ebml_` prefix from parsing functions
* Move `complete` combinator into parsing functions
* Implement `EbmlParsable` for `Uuid`
* Rename `skip_void` to `void` (the combinator doesn't really skip the void, it returns it)
* Rename `eat_void` to `skip_void` (better description)

Luni-4

Thanks a lot! The code is much cleaner! Just a question from my side

src/ebml.rs

* `float_def` inserts a default value for 0-octet or missing Elements
* Change some struct fields to non-optional

lu-zero

Looks good. Maybe `float_with_default()` would be more descriptive.

FreezyLemon · 2023-04-03T15:03:07Z

The only thing I don't think we can handle well is quiet and signalling NaNs. EBML doesn't specify how to interpret the signalling bit, and not all platforms handle it the same.

I'll try to give a brief overview of how it works now, because it's a bit confusing.

When parsing a Float, there are three possible errors:

1. What I call a "nom Error", where the parsing fails on a basic level. The usual example of this would be the `VerifyError`, which means the ID doesn't match -> Element doesn't exist in the bitstream.
2. A `ParseError::EmptyFloat` -> Element exists but has a length of 0.
3. A `ParseError::FloatWidthIncorrect` -> Element exists but has a length that is not 0, 4 or 8.
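To make the two payload-level errors concrete, here is a minimal sketch of decoding a Float payload by its length. The `ParseError` enum and the `decode_float` function below are hypothetical stand-ins written for illustration, not the crate's actual definitions; the only assumptions are that the payload is big-endian IEEE 754 and that 4-octet values are widened to `f64`.

```rust
/// Hypothetical error type mirroring the two payload-level errors named above.
#[derive(Debug, PartialEq)]
enum ParseError {
    /// The Element exists but its payload is 0 octets long.
    EmptyFloat,
    /// The Element exists but its payload is not 0, 4 or 8 octets long.
    FloatWidthIncorrect,
}

/// Decode the payload of a Float Element (hypothetical helper).
/// A 4-octet payload is read as a big-endian f32 and widened to f64,
/// an 8-octet payload is read as a big-endian f64.
fn decode_float(payload: &[u8]) -> Result<f64, ParseError> {
    match payload.len() {
        0 => Err(ParseError::EmptyFloat),
        4 => {
            let mut bytes = [0u8; 4];
            bytes.copy_from_slice(payload);
            Ok(f64::from(f32::from_be_bytes(bytes)))
        }
        8 => {
            let mut bytes = [0u8; 8];
            bytes.copy_from_slice(payload);
            Ok(f64::from_be_bytes(bytes))
        }
        _ => Err(ParseError::FloatWidthIncorrect),
    }
}
```

The "nom Error" case is not modelled here because it happens before the payload is reached, when the Element ID fails to match.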

Now, with this information, we can define some behaviour for different situations. `Option<f64>` is used when we have a Float Element with a `minOccurs` of 0. `f64` is used with `minOccurs = maxOccurs = 1`, and here we distinguish between Elements with a default value and those without one.

| No. | Output type | VerifyError | EmptyFloat | FloatWidthIncorrect |
| --- | --- | --- | --- | --- |
| 1 | `Option<f64>` | `None` | `Some(0)` | `Err` |
| 2 | `f64`, no default | `Err` | `0` | `Err` |
| 3 | `f64`, default `d` | `d` | `d` | `Err` |

Basically, the `float` function exists to handle cases 1 and 2, while `float_default` handles case 3 (where the `VerifyError` also yields the default instead of an error).
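The three rows of the table can be sketched as three small mapping functions. Everything here is hypothetical and simplified for illustration: `Outcome` stands in for the result of trying to read the Element, and `optional`, `required` and `required_or` are made-up names, not the signatures used in the PR.

```rust
/// Hypothetical summary of what happened when trying to read a Float Element.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Outcome {
    /// The ID did not match (the "nom Error" / VerifyError column).
    Missing,
    /// The Element exists with a 0-octet payload (EmptyFloat column).
    Empty,
    /// The payload length is not 0, 4 or 8 (FloatWidthIncorrect column).
    WrongWidth,
    /// A well-formed 4- or 8-octet value.
    Value(f64),
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum Error {
    Missing,
    WrongWidth,
}

/// Row 1: `Option<f64>`, for Elements with a minOccurs of 0.
fn optional(outcome: Outcome) -> Result<Option<f64>, Error> {
    match outcome {
        Outcome::Missing => Ok(None),
        Outcome::Empty => Ok(Some(0.0)),
        Outcome::Value(v) => Ok(Some(v)),
        Outcome::WrongWidth => Err(Error::WrongWidth),
    }
}

/// Row 2: `f64` without a default; a missing Element is an error.
fn required(outcome: Outcome) -> Result<f64, Error> {
    match outcome {
        Outcome::Missing => Err(Error::Missing),
        Outcome::Empty => Ok(0.0),
        Outcome::Value(v) => Ok(v),
        Outcome::WrongWidth => Err(Error::WrongWidth),
    }
}

/// Row 3: `f64` with default `d`; missing and empty Elements both yield `d`.
fn required_or(outcome: Outcome, default: f64) -> Result<f64, Error> {
    match outcome {
        Outcome::Missing | Outcome::Empty => Ok(default),
        Outcome::Value(v) => Ok(v),
        Outcome::WrongWidth => Err(Error::WrongWidth),
    }
}
```

In all three rows a wrong payload width stays an error; only the handling of missing and empty Elements differs.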

FreezyLemon · 2023-04-03T15:09:30Z

I'd like to double-check that the code now does what I explained above. I'll mark the PR as ready for review again after that.

lu-zero · 2023-04-03T16:18:28Z

@robUx4 do you have suggestions regarding NaNs?

Luni-4 · 2023-04-04T07:37:44Z

Thinking again about this PR, we can leave FIXMEs, open a new issue, talk there and subsequently create another PR. That way we don't block this refactor. What do you think, @FreezyLemon? It is perhaps better to keep the problems separate.

FreezyLemon · 2023-04-04T10:11:36Z

@lu-zero I went with float_or(id, default) to make it similar to std's usual naming (unwrap_or, map_or, etc.)

@Luni-4 I don't mind. The float behaviour should in general be at least as good as before. So it should be fine to do those changes/fixes later in a separate PR.

Luni-4

Thanks a lot!

robUx4 · 2023-04-10T15:27:18Z

> @robUx4 do you have suggestions regarding NaNs?

Our floats use the IEEE 754 format. According to Wikipedia, NaN is properly defined:

> IEEE 754 NaNs are encoded with the exponent field filled with ones (like infinity values), and some non-zero number in the significand field (to make them distinct from infinity values);

There are multiple NaN values, but any of them is OK as EBML values.
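As a small illustration of the encoding quoted above, the following sketch tests a raw 64-bit pattern for NaN by checking that the exponent field is all ones and the significand field is non-zero. The function name `is_nan_bits` is made up for this example; the masks follow the binary64 layout (1 sign bit, 11 exponent bits, 52 significand bits).

```rust
/// Returns true if `bits` encodes an IEEE 754 binary64 NaN
/// (hypothetical helper, named for this example only).
fn is_nan_bits(bits: u64) -> bool {
    // 11 exponent bits, directly below the sign bit.
    const EXPONENT_MASK: u64 = 0x7FF0_0000_0000_0000;
    // 52 significand (fraction) bits.
    const SIGNIFICAND_MASK: u64 = 0x000F_FFFF_FFFF_FFFF;

    let exponent_all_ones = bits & EXPONENT_MASK == EXPONENT_MASK;
    let significand_non_zero = bits & SIGNIFICAND_MASK != 0;
    exponent_all_ones && significand_non_zero
}
```

Infinity has the same all-ones exponent but a zero significand, so it is rejected by the second check.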

FreezyLemon · 2023-04-10T15:31:35Z

> There are multiple NaN values, but any of them is OK as EBML values.

@robUx4 I guess EBML doesn't care about sNaN and qNaN? IEEE 754 only has a recommendation regarding the signalling bit for NaNs, not a hard-and-fast rule. This is because MIPS does it differently from x86 and ARM. I'd imagine EBML doesn't really care about this though?
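For reference, the bit in question is the most significant bit of the significand. The sketch below only inspects that bit in a raw 64-bit pattern; whether a set bit means "quiet" depends on the platform convention (set means quiet under the IEEE 754-2008 recommendation followed by x86 and ARM, while legacy MIPS uses the opposite meaning). The names `QUIET_BIT` and `top_significand_bit_set` are hypothetical.

```rust
/// Most significant bit of the binary64 significand field.
const QUIET_BIT: u64 = 0x0008_0000_0000_0000;

/// Returns true if the top significand bit is set in `bits`.
/// This does not decide quiet vs. signalling on its own: the meaning of the
/// bit is a platform convention, as discussed above.
fn top_significand_bit_set(bits: u64) -> bool {
    bits & QUIET_BIT != 0
}
```

A parser that wants to stay neutral on the question can keep the raw bits as read from the bitstream and leave the interpretation to the caller.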

robUx4 · 2023-04-25T19:32:37Z

> EBML doesn't really care about this though?

We never got this far, no. I think it's too late to add constraints now. We can add recommendations, though.

FreezyLemon added 3 commits April 2, 2023 17:23

Reimplement value parsing for sized EBML types

04d1a66

* Introduce `EbmlParsable` trait to parse most known EBML types
* Rename `ErrorKind` to `ParseError`
* Move Element ID from `ParseError` into the `Error` enum

cargo fmt

8c2be74

Luni-4 approved these changes Apr 3, 2023


Add float_def and some tests

171a6f0

* `float_def` inserts a default value for 0-octet or missing Elements
* Change some struct fields to non-optional

lu-zero approved these changes Apr 3, 2023


FreezyLemon marked this pull request as draft April 3, 2023 15:08

FreezyLemon added 2 commits April 4, 2023 12:09

Rename float_def to float_or

16cbb58

Add FIXME for outstanding issues

a4cdf22

FreezyLemon marked this pull request as ready for review April 4, 2023 10:11

FreezyLemon mentioned this pull request Apr 4, 2023

Define, implement & test handling of Float Elements #116

Open

Luni-4 approved these changes Apr 4, 2023


Luni-4 merged commit 8e6c586 into rust-av:master Apr 4, 2023

FreezyLemon deleted the rewrite-parsing branch April 4, 2023 14:44
