[#377] Enhance built-in types encoding
mikir committed Aug 8, 2023
1 parent d63813b commit a35514b
Showing 2 changed files with 71 additions and 14 deletions.
74 changes: 66 additions & 8 deletions doc/ZserioEncodingGuide.md
@@ -77,15 +77,14 @@ Byte position | Value | Value (hex) | Description

## Built-in Types

All [integer built-in types](ZserioLanguageOverview.md#integer-built-in-types),
[bit field types](ZserioLanguageOverview.md#bit-field-types),
[floating point types](ZserioLanguageOverview.md#floating-point-types),
[variable integer types](ZserioLanguageOverview.md#variable-integer-types) and
[boolean type](ZserioLanguageOverview.md#boolean-type) are encoded as they are using big endian byte order.
Thus, for multi-byte integers, the most significant byte comes first.
### Integer Built-in Types

All [integer built-in types](ZserioLanguageOverview.md#integer-built-in-types) are encoded as they are using
big endian byte order. Thus, for multi-byte integers, the most significant byte comes first.
Within each byte, the most significant bit comes first.

If the type size is not byte aligned, the exact number of bits is encoded (e.g. `bit:2` is encoded in two bits).
Negative values are represented in two's complement, i.e. the hex byte `FF` is `255` as `uint8` or
`-1` as `int8`.

**Example**

@@ -100,9 +99,63 @@ Byte position | Value | Value (hex) | Value (bit) | Description
------------- | ----- | ----------- | -------------------- | -------
0-1 | 513 | 02 01 | 0000 0010 0000 0001 | bit 0 is `1`, bit 15 is `0`

**Example**

The decimal value `-513` interpreted as `int16`:

```
Offset 00 01
00000000 FD FF
```

Byte position | Value | Value (hex) | Value (bit) | Description
------------- | ----- | ----------- | -------------------- | -------
0-1 | -513 | FD FF | 1111 1101 1111 1111 | The most significant bit (bit 15) is the first one
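
The two examples above can be reproduced with a minimal Python sketch. It is only an illustration of big endian, two's complement encoding using the standard library; the helper name `encode_int` is hypothetical and is not part of the Zserio runtime.

```python
def encode_int(value: int, num_bytes: int, signed: bool) -> bytes:
    """Encode an integer in big endian byte order (two's complement when signed)."""
    return value.to_bytes(num_bytes, byteorder="big", signed=signed)


# 513 as uint16 and -513 as int16, as in the examples above.
uint16_bytes = encode_int(513, 2, signed=False)
int16_bytes = encode_int(-513, 2, signed=True)

print(uint16_bytes.hex(" ").upper())  # 02 01
print(int16_bytes.hex(" ").upper())   # FD FF

# The hex byte FF read back as uint8 and as int8.
print(int.from_bytes(b"\xff", byteorder="big", signed=False))  # 255
print(int.from_bytes(b"\xff", byteorder="big", signed=True))   # -1
```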

### Bit Field Types

All [bit field types](ZserioLanguageOverview.md#bit-field-types) are encoded as they are using big endian byte
order. Thus, for multi-byte integers, the most significant byte comes first.
Within each byte, the most significant bit comes first.

If the type size is not byte aligned, the exact number of bits is encoded (e.g. `bit:2` is encoded in two bits).

**Example**

The decimal value `513` interpreted as `bit:12`:

```
Offset 00 01
00000000 20 10
```

Byte position | Value | Value (hex) | Value (bit) | Description
------------- | ----- | ----------- | --------------- | -------
0-1 | 513 | 20 10 | 0010 0000 0001 | Only the first 4 bits of the second byte are used.
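
As a rough sketch of how a non-byte-aligned bit field ends up in the stream, the following Python snippet writes values bit by bit, most significant bit first. The `BitWriter` class is a hypothetical helper written for this illustration, not the Zserio runtime API, and the zero padding of the last byte is an assumption made only so that the result can be printed as whole bytes.

```python
class BitWriter:
    """Hypothetical MSB-first bit writer (illustration only)."""

    def __init__(self):
        self._bits = []

    def write_bits(self, value, num_bits):
        """Append exactly num_bits bits of an unsigned value, MSB first."""
        if value < 0 or value >= (1 << num_bits):
            raise ValueError("value does not fit into the bit field")
        for shift in range(num_bits - 1, -1, -1):
            self._bits.append((value >> shift) & 1)

    def to_bytes(self):
        """Return the written bits as bytes, zero-padding the last byte."""
        padded = self._bits + [0] * (-len(self._bits) % 8)
        out = bytearray()
        for start in range(0, len(padded), 8):
            byte = 0
            for bit in padded[start:start + 8]:
                byte = (byte << 1) | bit
            out.append(byte)
        return bytes(out)


writer = BitWriter()
writer.write_bits(513, 12)  # 513 as bit:12
encoded = writer.to_bytes()
print(encoded.hex(" ").upper())  # 20 10
```

In a real stream the four remaining bits of the second byte would be available to the next field rather than padded.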

### Floating Point Types

All [floating point types](ZserioLanguageOverview.md#floating-point-types) are encoded as
16-bit, 32-bit or 64-bit integers that hold the bit pattern of the corresponding 16-bit, 32-bit or 64-bit
floating point format defined by the IEEE 754 specification.

**Example**

The floating point value `8.0` interpreted as `float16`:

```
Offset 00 01
00000000 48 00
```

Byte position | Value | Value (hex) | Value (bit) | Description
------------- | ----- | ----------- | -------------------- | -------
0-1 | 8.0 | 48 00 | 0100 1000 0000 0000 | The most significant bit (bit 15) is the first one
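
The example above can be checked with Python's standard `struct` module, whose `e` format character packs an IEEE 754 half precision value. This is only a sketch for verifying the bit pattern; the variable names are illustrative and nothing here is taken from the Zserio runtime.

```python
import struct

# Pack 8.0 as a big endian IEEE 754 half precision (16-bit) value.
float16_bytes = struct.pack(">e", 8.0)
print(float16_bytes.hex(" ").upper())  # 48 00

# Split the 16-bit pattern into its IEEE 754 fields.
bits = int.from_bytes(float16_bytes, byteorder="big")
sign = bits >> 15               # 1 bit
exponent = (bits >> 10) & 0x1F  # 5 bits, bias 15
mantissa = bits & 0x3FF         # 10 bits
print(sign, exponent, mantissa)  # 0 18 0

# The same value as 32-bit and 64-bit floats.
float32_bytes = struct.pack(">f", 8.0)
float64_bytes = struct.pack(">d", 8.0)
```

With exponent `18` and bias `15`, the value is `1.0 * 2^(18 - 15) = 8.0`, which matches the bit pattern in the table.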

### Variable Integer Types

The internal layout of the variable integer types is:
All [variable integer types](ZserioLanguageOverview.md#variable-integer-types) are encoded according to the
following table:

Data Type | Byte | Description
----------- | ---- | -----------
@@ -161,6 +214,11 @@ varsize | 0 | 1 bit has next byte, 7 bits value
> Minimum size is always 1 byte, the other bytes are present only when previous *has next byte* bit is set
> to `1`
### Boolean Type

The [boolean type](ZserioLanguageOverview.md#boolean-type) is encoded as a single bit: `true` is encoded
as `1` and `false` is encoded as `0`.
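
A minimal sketch of what single-bit booleans look like when several of them follow each other. It assumes that the most-significant-bit-first ordering described for integers also applies to consecutive `bool` fields; the function name `pack_bools` is hypothetical, and unused trailing bits are zeroed only for display.

```python
def pack_bools(flags):
    """Pack booleans one bit each, most significant bit first (assumed ordering)."""
    out = bytearray((len(flags) + 7) // 8)
    for index, flag in enumerate(flags):
        if flag:
            out[index // 8] |= 0x80 >> (index % 8)
    return bytes(out)


packed = pack_bools([True, False, True])
print(f"{packed[0]:08b}")  # 10100000
```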

### String Type

[String type](ZserioLanguageOverview.md#string-type) is encoded by a length field
11 changes: 5 additions & 6 deletions doc/ZserioLanguageOverview.md
@@ -3,6 +3,9 @@
This document contains a detailed specification of the zserio schema language. The Zserio Language Overview
document is intended for developers who write zserio schema definitions.

This document does not describe the details of how Zserio encodes data to the binary stream. Encoding details
are described in the [Zserio Encoding Guide](ZserioEncodingGuide.md#variable-integer-types).

Zserio is a serialization schema language for modeling binary data types, bitstreams or file formats. Based
on the zserio language it is possible to automatically generate encoders and decoders for a given schema
in various target languages (e.g. Java, C++, Python).
@@ -100,8 +103,7 @@ signed | `int8`, `int16`, `int32`, `int64`


These types correspond to unsigned or signed integers represented as sequences of 8, 16, 32 or 64 bits,
respectively. Negative values are represented in two's complement, i.e. the hex byte FF is 255 as `uint8`
or -1 as `int8`.
respectively.

### Bit Field Types

@@ -164,14 +166,11 @@ varsize | `0 to 2147483647` | `5`

>Note that `varint` and `varuint` can handle all `int64` and `uint64` values respectively.
For encoding details see [encoding guide](ZserioEncodingGuide.md#variable-integer-types).

> Note that `varsize` is available since `2.0.0`.
### Boolean Type

In zserio, booleans are denoted by `bool`. A boolean is stored in a single bit. Both `true` and `false` are
available as built-in keywords that are stored as a 1 or 0, respectively.
In zserio, booleans are denoted by `bool`. Both `true` and `false` are available as built-in keywords.

**Example**
```
