Update broken links to source code. #64

Merged 1 commit on Mar 7, 2023
13 changes: 5 additions & 8 deletions _posts/2020-03-12-thing-explainer-parser.markdown
@@ -5,11 +5,11 @@ date: 2020-04-02 11:34:01 -0400
redirect_from: /2020/04/02/thing-explainer-parser.html
---

-This post goes over the RustPython parser. You can see the source code at [RustPython/parser/](https://github.com/RustPython/RustPython/tree/master/parser).
+This post goes over the RustPython parser. You can see the source code in the [rustpython-parser](https://github.com/RustPython/RustPython/tree/main/compiler/parser) crate.

When you write code in Python and run it, an interpreter, such as the RustPython interpreter, acts as the translator between you and your machine.

-The interpreter has the job of turning your human code into bytecode that a Python virtual machine can run. Bytecode is an intermediate code between source code and machine code. This makes it portable across multiple hardware and operating systems. Bytecode "works" as long as you implement a virtual machine (vm) that can run it. There is a performance penalty for this flexibility. RustPython has a vm under [RustPython/vm/](https://github.com/RustPython/RustPython/tree/master/vm). Other posts will go into the details of that vm but now let's figure out how to turn code into bytecode.
+The interpreter has the job of turning your human code into bytecode that a Python virtual machine can run. Bytecode is an intermediate code between source code and machine code. This makes it portable across multiple hardware and operating systems. Bytecode "works" as long as you implement a virtual machine (vm) that can run it. There is a performance penalty for this flexibility. RustPython also [has a vm](https://github.com/RustPython/RustPython/tree/main/vm) that interprets the generated bytecode, other posts will go into the details of that vm but now let's figure out how to turn code into bytecode.


## What bytecode looks like
@@ -54,10 +54,7 @@ If you want to sound fancy:
- The parsing process is called "lexical analysis"
- The thing that does this is a "lexer"

-Here is the link to the RustPython lexer.
-
-**`RustPython/parser/lexer.rs`** >>
-[source code](https://github.com/RustPython/RustPython/blob/master/parser/src/lexer.rs)
+The code for the lexing stage lives in [lex.rs](https://github.com/RustPython/RustPython/blob/main/compiler/parser/src/lexer.rs) of the parser crate.


If you want to dive into the details of lexical analysis, check out [Python in a nutshell / Lexical structure](https://learning.oreilly.com/library/view/python-in-a/9781491913833/ch03.html#python_language-id00003)
@@ -76,7 +73,7 @@ As the presenter puts it, this is the spirit of the beast (Python) and it is onl

So, we have the rules or grammar of a programming language in a machine encoded format... now we need to write something that verifies that those rules were followed... This sounds like something that other people could use and like something that should exist as an open source project! 🤔

-Sure enough, there is a whole Rust framework called `LALRPOP`. It takes the tokens generated by the lexer, verifies the syntax and turns the tokens into an AST (Abstract Syntax Tree). More information and a tutorial can be found in the [LALRPOP book](https://lalrpop.github.io/lalrpop/README.html).
+Sure enough, there is a whole Rust framework called `LALRPOP`. It takes the tokens generated by the lexer, verifies the syntax and turns the tokens into an AST (Abstract Syntax Tree). More information and a tutorial can be found in the [LALRPOP book](https://lalrpop.github.io/lalrpop/index.html).

RustPython does one nice extra thing on top of `LALRPOP`. It masks the errors and provides you with safer, nicer errors. You can see the code for this in `RustPython/parser/src/error.rs`

@@ -101,4 +98,4 @@ As a recap, when you write a line of Python code and "run it", here is what the
⬇️ compile the AST into bytecode
**OUTPUT: bytecode** (in `__pycache__/file.pyc` or in memory)

-The compiler is under **`RustPython/compiler`**. Keep an eye on the blog for a future post about the details or the compiler. In the meantime, check out the parser source code in [RustPython/parser/](https://github.com/RustPython/RustPython/tree/master/parser).
+The compiler is located in the [rustpython-compiler](https://github.com/RustPython/RustPython/tree/main/compiler) crate. Keep an eye on the blog for a future post about the details or the compiler. In the meantime, check out the parser source code in [rustpython-parser](https://github.com/RustPython/RustPython/tree/main/compiler/parser).
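
For readers who want to see the stages from the recap (source code, tokens, AST, bytecode) in action, here is a minimal sketch. It uses CPython's standard library (`tokenize`, `ast`, `compile`, `dis`) as a stand-in, not RustPython's lexer, its LALRPOP grammar, or its compiler crate, and the one-line sample program is made up for illustration.

```python
import ast
import dis
import io
import tokenize

# Hypothetical one-line program used only for this illustration.
source = "x = 1 + 2\n"

# Stage 1: lexing. Split the source text into tokens.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
# Keep only tokens with visible text (drops NEWLINE and ENDMARKER).
token_strings = [tok.string for tok in tokens if tok.string.strip()]
print(token_strings)

# Stage 2: parsing. Check the syntax and build an AST.
tree = ast.parse(source)
print(ast.dump(tree))

# Stage 3: compiling. Turn the AST into a code object holding bytecode.
code = compile(tree, filename="<example>", mode="exec")
dis.dis(code)  # the exact instructions vary between Python versions

# The virtual machine then runs the bytecode.
namespace = {}
exec(code, namespace)
print(namespace["x"])
```

The printed token list and AST correspond to what a lexer and a parser produce; the `dis` output is the kind of bytecode a Python virtual machine consumes.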