Commit ec4be0e

valsamakis authored and josevalim committed
Update binaries-strings-and-char-lists.markdown (elixir-lang#1051)
1 parent 6f4de01 commit ec4be0e

1 file changed, +1 -1 lines changed

getting-started/binaries-strings-and-char-lists.markdown

@@ -22,7 +22,7 @@ In this chapter, we will understand what binaries are, how they associate with s
 
 A string is a UTF-8 encoded binary. In order to understand exactly what we mean by that, we need to understand the difference between bytes and code points.
 
-The Unicode standard assigns code points to many of the characters we know. For example, the letter `a` has code point `97` while the letter `ł` has code point `322`. When writing the string `"hełło"` to disk, we need to convert this code point to bytes. If we adopted a rule that said one byte represents one code point, we wouldn't be able to write `"hełło"`, because it uses the code point `322` for `ł`, and one byte can only represent a number from `0` to `255`. But of course, given you can actually read `"hełło"` on your screen, it must be represented *somehow*. That's where encodings come in.
+The Unicode standard assigns code points to many of the characters we know. For example, the letter `a` has code point `97` while the letter `ł` has code point `322`. When writing the string `"hełło"` to disk, we need to convert this sequence of characters to bytes. If we adopted a rule that said one byte represents one code point, we wouldn't be able to write `"hełło"`, because it uses the code point `322` for `ł`, and one byte can only represent a number from `0` to `255`. But of course, given you can actually read `"hełło"` on your screen, it must be represented *somehow*. That's where encodings come in.
 
 When representing code points in bytes, we need to encode them somehow. Elixir chose the UTF-8 encoding as its main and default encoding. When we say a string is a UTF-8 encoded binary, we mean a string is a bunch of bytes organized in a way to represent certain code points, as specified by the UTF-8 encoding.
 
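The distinction the changed paragraph draws between code points and bytes can be checked directly. The following is a minimal sketch, not part of the commit, assuming a reasonably recent Elixir (`String.to_charlist/1` requires 1.3 or later) and a UTF-8 encoded source file or `iex` session; the variable names are illustrative only.

```elixir
string = "hełło"

# Code points: one integer per character, as assigned by Unicode.
# `ł` is 322, which does not fit in a single byte (0..255).
code_points = String.to_charlist(string)
IO.inspect(code_points, label: "code points", charlists: :as_lists)

# Bytes: the UTF-8 encoded binary that is actually stored.
# Each `ł` is encoded as the two bytes 197 and 130.
bytes = :binary.bin_to_list(string)
IO.inspect(bytes, label: "bytes")

# Five characters, but seven bytes.
IO.inspect(String.length(string), label: "characters")
IO.inspect(byte_size(string), label: "byte size")
```

The string has five code points but occupies seven bytes, because UTF-8 needs two bytes for each `ł`.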
