fix(docs): review and dedupe topics on bytes and strings #47

Open
wants to merge 3 commits into
base: stable
83 changes: 38 additions & 45 deletions gitlab-pages/docs/data-types/bytes.md
@@ -35,16 +35,13 @@ const zero_too = 0x00;

</Syntax>

Clearly, this means that literal bytes are always comprised of an even
number of hexadecimal digits (because one hexadecimal digit requires
up to four bits in binary, and eight are needed to make up a byte).
This means that a bytes literal always consists of an even number of hexadecimal digits, because one hexadecimal digit encodes four bits and a byte is eight bits.

## From numbers to bytes and back

Some other numerals can be converted to bytes by means of calling the
predefined function `bytes`, which is overloaded. The reverse
conversion is done by the predefined functions `int` and `nat`. For
instance, here how to create bytes from natural numbers and integers:
You can convert some other numerals to bytes by calling the predefined function `bytes`.
To convert bytes back to ints or nats, use the predefined functions `int` and `nat`.
For instance, here is how to create bytes from natural numbers and integers:

<Syntax syntax="cameligo">

@@ -72,21 +69,16 @@ const i: int = int(0x7B); // i == 123

</Syntax>

> Note: See
> [Two's complement](https://en.wikipedia.org/wiki/Two's_complement).
Note: See [Two's complement](https://en.wikipedia.org/wiki/Two's_complement).

## From strings

A string literal can be converted to bytes in two ways:
You can convert a string literal to bytes in two ways:

1. by interpreting the [ASCII](https://en.wikipedia.org/wiki/ASCII)
code of each character (which spans over two hexadecimal digits) as
one byte;
2. by interpreting directly each character as one hexadecimal digit.
- By interpreting the [ASCII](https://en.wikipedia.org/wiki/ASCII) code of each character (which spans two hexadecimal digits) as one byte
- By interpreting each character directly as one hexadecimal digit


In the former case, the syntax is somewhat odd -- as opposed to simply
calling the function `bytes`:
To interpret the ASCII code, use this syntax:

<Syntax syntax="cameligo">

@@ -104,7 +96,7 @@ const from_ascii: bytes = bytes`foo`; // Not a call

</Syntax>

The latter case is implemented as a type cast:
To interpret each character directly, use a type cast:

<Syntax syntax="cameligo">

@@ -113,12 +105,13 @@ The latter case is implemented as a type cast:
let raw : bytes = ("666f6f" : bytes)
```

> Note that both the `[%bytes ...]` and `(... : bytes)` syntaxes apply
> only to *string literals*, not general expressions of type
> `string`. In other words, the contents of the strings must be
> available in-place at compile-time. (This actually reveals that
> `("666f6f" : bytes)` is not really a cast, as casts are
> non-operations.)
:::note

Both cases apply only to string literals, not variables or other expressions of type `string`.
In other words, the contents of the strings must be available in-place at compile time.
(This reveals that `("666f6f" : bytes)` is not really a cast, as casts are non-operations.)

:::

</Syntax>

@@ -129,11 +122,13 @@ let raw : bytes = ("666f6f" : bytes)
const raw: bytes = ("666f6f" as bytes);
```

> Note that both syntaxes apply respectively only to *verbatim* string
> literals and general strings, not general expressions of type
> `string`. In other words, the contents of the strings must be
> available at compile-time. (This actually reveals that `("666f6f" as
> bytes)` is not really a cast, as casts are non-operations.)
:::note

Both cases apply only to string literals, not variables or other expressions of type `string`.
In other words, the contents of the strings must be available in-place at compile time.
(This reveals that `("666f6f" as bytes)` is not really a cast, as casts are non-operations.)

:::

</Syntax>
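
As a minimal sketch of how the two conversions relate, the CameLIGO snippet below builds the same value both ways. It assumes the `[%bytes ...]` and `(... : bytes)` forms mentioned in the note above; the names `from_ascii`, `from_hex`, and `same` are illustrative only.

```cameligo
// ASCII interpretation: 'f' is 0x66 and 'o' is 0x6f, so each character
// of the literal becomes one byte.
let from_ascii : bytes = [%bytes "foo"]

// Hexadecimal interpretation: each character is one hexadecimal digit,
// so each pair of characters becomes one byte.
let from_hex : bytes = ("666f6f" : bytes)

// Both values denote 0x666f6f, so this comparison is expected to hold.
let same : bool = (from_ascii = from_hex)
```

Note that the two literals differ in length (three characters versus six) even though the resulting bytes are the same.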

@@ -161,8 +156,7 @@ const three: bytes = Bytes.concats([0x70, 0xAA, 0xFF]);

## Sizing

In order to obtain the length of a sequence of bytes, use the
predefined function `Bytes.length` like so:
To obtain the length of a sequence of bytes, use the predefined function `Bytes.length` like so:

<Syntax syntax="cameligo">

@@ -182,10 +176,10 @@ const len: nat = Bytes.length(0x0AFF); // len == 2n

## Slicing

Bytes can be extracted using the predefined function `Bytes.sub`. The
first parameter is the start index and the second is the number of
bytes of the slice we want. Keep in mind that the first byte in a
sequence has index `0n`.
You can extract a subset from bytes with the `Bytes.sub` function.
It accepts a nat for the index of the start of the subset and a nat for the number of bytes in the subset.
The first byte has the index 0, and the byte at the start index is included in the subset.
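
A short sketch of the offset-and-length convention, assuming `Bytes.sub` takes the start index first and the number of bytes second (the names `large` and `slice` are illustrative):

```cameligo
let large : bytes = 0x12345678

// Start at index 1n (the second byte, 0x34) and take 2n bytes.
let slice : bytes = Bytes.sub 1n 2n large // 0x3456
```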

<Syntax syntax="cameligo">

@@ -253,18 +247,17 @@ const shift_right: bytes = 0x0006 >> 1n; // 0x0003

## Packing and unpacking

As Michelson provides the instructions `PACK` and `UNPACK` for data
serialisation, so does LIGO with `Bytes.pack` and `Bytes.unpack`. The
former serialises Michelson data structures into a binary format, and
the latter reverses that transformation. Unpacking may fail, so the
return type of `Byte.unpack` is an option that needs to be annotated.
LIGO provides the functions `Bytes.pack` and `Bytes.unpack` to serialize and deserialize data into a binary format.
These functions correspond to the Michelson instructions `PACK` and `UNPACK`.
Unpacking may fail, so the return type of `Bytes.unpack` is an option that needs a type annotation.

:::note

> Note: `PACK` and `UNPACK` are Michelson instructions that are
> intended to be used by people that really know what they are
> doing. There are several risks and failure cases, such as unpacking
> a lambda from an untrusted source or casting the result to the wrong
> type. Be careful.
Be careful packing and unpacking data.
These functions are intended for use by developers who are familiar with data serialization.
There are several risks and failure cases, such as unpacking a lambda from an untrusted source or casting the result to the wrong type.

:::
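
The round trip can be sketched as follows. This is an illustration under the assumption that `Bytes.pack` and `Bytes.unpack` behave as described above; the function names `round_trip` and `unpack_or_default` and the fallback value `"invalid"` are hypothetical.

```cameligo
// Serialize a string, then deserialize it again.
// The return type annotation tells Bytes.unpack which type to decode.
let round_trip (s : string) : string option =
  let packed : bytes = Bytes.pack s in
  Bytes.unpack packed

// Handle the failure case explicitly instead of assuming success.
// The fallback string is an arbitrary choice for this sketch.
let unpack_or_default (b : bytes) : string =
  match (Bytes.unpack b : string option) with
  | Some s -> s
  | None -> "invalid"
```

Matching on the option keeps the failure case visible at the call site, which matters when the bytes come from an untrusted source.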

<Syntax syntax="cameligo">

71 changes: 24 additions & 47 deletions gitlab-pages/docs/data-types/strings.md
@@ -4,18 +4,15 @@ title: Strings

import Syntax from '@theme/Syntax';

Strings are of the predefined type `string`. Literal strings are set
between double quotes.
Strings are of the predefined type `string`.
Literal strings are set between double quotes.

<Syntax syntax="cameligo">

```cameligo group=strings
let a : string = "Hello Alice"
```

Note: See the predefined
[module String](../reference/string-reference/?lang=cameligo)

</Syntax>

<Syntax syntax="jsligo">
@@ -24,14 +21,22 @@ Note: See the predefined
const a :string = "Hello Alice";
```

Note: See predefined [namespace String](../reference/string-reference/?lang=jsligo)
</Syntax>

<Syntax syntax="cameligo">

For reference, see the predefined [module String](../reference/string-reference/?lang=cameligo).

</Syntax>

### Casting
<Syntax syntax="jsligo">

For reference, see the predefined [namespace String](../reference/string-reference/?lang=jsligo).

</Syntax>

Strings can be used in contexts where a boolean is expected: an empty
string is then interpreted as `false`, and `true` otherwise.
string is interpreted as `false` and a non-empty string is interpreted as `true`.

<Syntax syntax="cameligo">

@@ -63,9 +68,6 @@ let greeting = "Hello"
let full_greeting = greeting ^ " " ^ name
```

Note: See the predefined
[module String](../reference/string-reference/?lang=cameligo)

</Syntax>

<Syntax syntax="jsligo">
@@ -79,24 +81,18 @@ const greeting = "Hello";
const full_greeting = greeting + " " + name;
```

Note: See predefined [namespace String](../reference/string-reference/?lang=jsligo)

</Syntax>

## Sizing

The length of a string can be obtain by calling the predefined
functions `String.length` or `String.size`:
To get the length of a string, use the function `String.length` or `String.size`:

<Syntax syntax="cameligo">

```cameligo group=length
let length : nat = String.size "Alice" // length = 5n
```

Note: See the predefined
[module String](../reference/string-reference/?lang=cameligo)

</Syntax>

<Syntax syntax="jsligo">
@@ -105,15 +101,14 @@ Note: See the predefined
const length : nat = String.size("Alice"); // length == 5n
```

Note: See predefined [namespace String](../reference/string-reference/?lang=jsligo)

</Syntax>

## Slicing

Substrings can be extracted using the predefined function
`String.sub`. The first character has index 0 and the interval of
indices for the substring has inclusive bounds.
You can extract a substring from a string with the `String.sub` function.
It accepts a nat for the index of the start of the substring and a nat for the number of characters.
The first character of a string has the index 0, and the character at the start index is included in the substring.

<Syntax syntax="cameligo">

@@ -122,57 +117,39 @@ let name = "Alice"
let slice = String.sub 0n 1n name // slice = "A"
```

Note: See the predefined
[module String](../reference/string-reference/?lang=cameligo)

</Syntax>

<Syntax syntax="jsligo">

The offset and length of the slice are natural number:

```jsligo group=slicing
const name = "Alice";
const slice = String.sub (0n, 1n, name); // slice == "A"
```

Note: See predefined [namespace String](../reference/string-reference/?lang=jsligo)

</Syntax>
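
As a further sketch, the CameLIGO lines below take a slice that does not start at index 0. They assume the second argument of `String.sub` is a length rather than an end index, which is consistent with the `"A"` result in the example above; the names `first` and `middle` are illustrative.

```cameligo
let name : string = "Alice"

// Offset 0n, length 1n: the first character only.
let first : string = String.sub 0n 1n name  // "A"

// Offset 1n, length 3n: three characters starting at the second one.
let middle : string = String.sub 1n 3n name // "lic"
```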

## Verbatim
## Verbatim strings

Strings can contain control characters, like `\n`. Sometimes we need
that each character in a string is interpreted on its own, for example
`\n` as two characters instead of a newline character. In that case,
either we escape the backslash character, or we use <em>verbatim
strings</em>. Those have the same type `string` as normal (that is,
interpreted) strings.
Strings can contain control characters, like `\n`.
To interpret each character on its own (such as treating `\n` as two characters), you can either escape the backslash character or use _verbatim strings_.
Verbatim strings have the same type as ordinary strings (that is, interpreted strings).

<Syntax syntax="cameligo">

Verbatim strings are given between the delimiters `{|` and `|}`,
instead of double quotes:
Verbatim strings are given between the delimiters `{|` and `|}` instead of double quotes:

```cameligo group=verbatim
let s : string = {|\n|} // String made of two characters
```

Note: See the predefined
[module String](../reference/string-reference/?lang=cameligo)

</Syntax>

<Syntax syntax="jsligo">

Verbatim strings are given between backquotes (a.k.a. backticks),
instead of double quotes:
Verbatim strings are given between backquotes (a.k.a. backticks), instead of double quotes:

```jsligo group=verbatim
const s : string = `\n` // String made of two characters
```

Note: See predefined [namespace String](../reference/string-reference/?lang=jsligo)

</Syntax>
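
One way to see the difference between the two kinds of literals is to compare their lengths. The sketch below uses `String.length` from the Sizing section; the names are illustrative.

```cameligo
// Interpreted literal: the escape sequence is a single newline character.
let interpreted : string = "\n"

// Verbatim literal: a backslash followed by the letter n.
let verbatim : string = {|\n|}

let interpreted_length : nat = String.length interpreted // 1n
let verbatim_length : nat = String.length verbatim       // 2n
```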

3 changes: 0 additions & 3 deletions gitlab-pages/docs/language-basics/src/strings-bytes/a.jsligo

This file was deleted.

3 changes: 0 additions & 3 deletions gitlab-pages/docs/language-basics/src/strings-bytes/a.mligo

This file was deleted.

2 changes: 0 additions & 2 deletions gitlab-pages/docs/language-basics/src/strings-bytes/b.jsligo

This file was deleted.

2 changes: 0 additions & 2 deletions gitlab-pages/docs/language-basics/src/strings-bytes/b.mligo

This file was deleted.

2 changes: 0 additions & 2 deletions gitlab-pages/docs/language-basics/src/strings-bytes/c.jsligo

This file was deleted.

2 changes: 0 additions & 2 deletions gitlab-pages/docs/language-basics/src/strings-bytes/c.mligo

This file was deleted.

3 changes: 0 additions & 3 deletions gitlab-pages/docs/language-basics/src/strings-bytes/d.jsligo

This file was deleted.

3 changes: 0 additions & 3 deletions gitlab-pages/docs/language-basics/src/strings-bytes/d.mligo

This file was deleted.

2 changes: 0 additions & 2 deletions gitlab-pages/docs/language-basics/src/strings-bytes/e.jsligo

This file was deleted.

2 changes: 0 additions & 2 deletions gitlab-pages/docs/language-basics/src/strings-bytes/e.mligo

This file was deleted.

2 changes: 0 additions & 2 deletions gitlab-pages/docs/language-basics/src/strings-bytes/f.jsligo

This file was deleted.

2 changes: 0 additions & 2 deletions gitlab-pages/docs/language-basics/src/strings-bytes/f.mligo

This file was deleted.

14 changes: 0 additions & 14 deletions gitlab-pages/docs/language-basics/src/strings-bytes/g.jsligo

This file was deleted.

14 changes: 0 additions & 14 deletions gitlab-pages/docs/language-basics/src/strings-bytes/g.mligo

This file was deleted.
