CONTRIBUTING.md

Code Structure

I tried to make the code of the compiler and vlib as simple and readable as possible. One of V's goals is to be open to developers with different levels of experience in compiler development. Compilers don't need to be black boxes full of magic that only a few people understand.

The compiler itself is located in vlib/compiler/. It's a module that can be used by other applications.

The main files are:

  1. v.v and vlib/compiler/main.v. The entry point.
  • V figures out the build mode.
  • Constructs the compiler object (struct V).
  • Creates a list of .v files that need to be parsed.
  • Creates a parser object for each file and runs parse() on them (this should work concurrently in the future). The parser emits C or x64 code directly. For performance reasons, there are no intermediate steps (no AST or Assembly code generation).
  • If the parsing is successful, a single C file is generated by merging the output from the parsers and carefully arranging all definitions (C is a single-pass language, so declarations must come before their uses).
  • Finally, a C compiler is called to compile this C file and generate an executable or a library.
  2. parser.v The core of the compiler. This is the largest file (~3.5k loc). The parse() method asks the scanner to generate a list of tokens for the file it needs to parse, then goes through the tokens one by one.

    In V, objects can be used before declaration, so there are 2 passes. During the first pass, it only looks at declarations and skips function bodies. It memorizes all function signatures, types, consts, etc. During the second pass, it looks at function bodies and generates C (e.g. cgen('if ($expr) {')) or machine code (e.g. gen.mov(EDI, 1)).

    The formatter is embedded in the parser. Correctly formatted tokens are emitted as they are parsed. This allowed us to simplify the compiler and avoid duplication, but slowed it down a bit. In the future, this will be fixed with build flags and separate binaries for C generation, machine code generation, and formatting. This way there will be no unnecessary branching and function calls.

  3. scanner.v The scanner's job is to parse a list of characters and convert them to tokens. It also takes care of string interpolation, which is a mess at the moment.

  4. token.v This is simply a list of all tokens, their string values, and a couple of helper functions.

  5. table.v V creates one table object that is shared by all parsers. It contains all types, consts, and functions, as well as several helpers to search for objects by name, register new objects, modify types' fields, etc.

  6. cgen.v The small Cgen struct helps generate C code. It's also shared by all parsers. It has a couple of functions that make it possible to go back and set something that was previously unknown (for example, turning a := 0 into int a = 0;). Some of these functions are hacky and need improvements and simplifications.

  7. fn.v Handles declaring and calling normal and async functions and methods. This file is about 1000 lines of code and has some complex logic. It needs to be cleaned up and simplified a bit.

  8. json.v Defines the JSON code generation. This file will be removed once V supports comptime code generation, and it will be possible to do this using the language's tools.

  9. x64/ is the directory with all the machine code generation logic. It defines a set of functions that translate assembly instructions to machine code and build the binary from scratch, byte by byte. It manually builds all headers, segments, sections, the symbol table, relocations, etc. Right now it only has basic support for the x64 platform/ELF format.
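
As a rough illustration of what "translating assembly instructions to machine code" means, here is a minimal Python sketch. It is not the code in x64/, and the helper names mov_reg_imm32 and syscall are made up for this example. It relies only on the standard x86 encoding of mov r32, imm32: opcode 0xB8 plus the register number, followed by the 32-bit immediate in little-endian order.

```python
import struct

# Register numbers as used in x86/x64 instruction encoding.
EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI = range(8)


def mov_reg_imm32(reg: int, value: int) -> bytes:
    """Encode `mov r32, imm32` (hypothetical helper, not V's API).

    The opcode byte is 0xB8 plus the register number; the immediate
    follows as four little-endian bytes.
    """
    if not 0 <= reg <= 7:
        raise ValueError("only the eight legacy 32-bit registers are handled")
    return bytes([0xB8 + reg]) + struct.pack("<I", value & 0xFFFFFFFF)


def syscall() -> bytes:
    """Encode the two-byte `syscall` instruction (hypothetical helper)."""
    return b"\x0f\x05"


# mov edi, 1 ; mov eax, 60 ; syscall
# On Linux x64 this sequence calls exit(1): syscall number 60, status in edi.
code = mov_reg_imm32(EDI, 1) + mov_reg_imm32(EAX, 60) + syscall()
print(code.hex())
```

A real backend would then wrap bytes like these in ELF headers and sections before writing the executable; that part is left out of the sketch.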

The rest of the directories are vlib modules: builtin/ (strings, arrays, maps), time/, os/, etc. Their documentation is pretty clear.
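
The two-pass approach described for parser.v can be sketched in a few lines of Python. This is a toy model under loud assumptions, not V's implementation: the input format (a list of function names with the functions they call), the names first_pass and second_pass, and the generated C are all invented for illustration. It shows why collecting declarations first lets a function be called before it is declared, and why the collected table is needed to arrange the C output.

```python
# Hypothetical toy input: each entry is (function name, functions it calls).
# `start` calls `helper` before `helper` is declared.
SOURCE = [
    ("start", ["helper"]),
    ("helper", []),
]


def first_pass(source):
    """Record every declared function name; bodies are skipped."""
    table = set()
    for name, _body in source:
        table.add(name)
    return table


def second_pass(source, table):
    """Walk the bodies and emit C, resolving calls against the table."""
    out = []
    for name, body in source:
        out.append(f"void {name}() {{")
        for callee in body:
            if callee not in table:
                raise NameError(f"unknown function: {callee}")
            out.append(f"  {callee}();")
        out.append("}")
    return out


table = first_pass(SOURCE)
c_lines = second_pass(SOURCE, table)

# C requires a declaration before use, so prototypes built from the
# table are placed ahead of the function bodies.
prototypes = [f"void {name}();" for name in sorted(table)]
print("\n".join(prototypes + c_lines))
```

With a single pass, the call to helper inside start would be rejected because helper is not yet known when start's body is read.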

Example Workflow for Contributing

(provided by @spytheman)

(If you don't already have a GitHub account, please create one. Your GitHub username will be referred to later as 'YOUR_GITHUB_USERNAME'. Change it accordingly in the steps below.)

  1. Clone https://github.com/vlang/v in a folder, say nv (git clone https://github.com/vlang/v nv)
  2. cd nv
  3. git remote add pullrequest git@github.com:YOUR_GITHUB_USERNAME/v.git # (NOTE: this is your own forked repo of: https://github.com/vlang/v - After this, we just do normal git operations such as: git pull and so on.)
  4. When finished with a feature/bugfix, you can: git checkout -b fix_alabala
  5. git push pullrequest # (NOTE: the pullrequest remote was set up in step 3)
  6. On GitHub's web interface, go to: https://github.com/vlang/v/pulls. The UI shows a dialog with a button to make a new pull request based on the newly pushed branch. (Example dialog: https://url4e.com/gyazo/images/364edc04.png)
  7. After making your pull request (aka PR), you can continue to work on the branch... just do step #5 when you have more commits.
  8. If there are merge conflicts, or a branch lags too much behind V's master, you can do the following:
    1. git checkout master
    2. git pull
    3. git checkout fix_alabala
    4. git rebase master # solve conflicts and do git rebase --continue
    5. git push pullrequest -f

The point of the above steps is to never push directly to the main V repository, only to your own fork. Since your local master branch tracks the main V repository's master, git checkout master; git pull --rebase origin master works as expected (this is actually used by v up) and can always do so cleanly. Git is very flexible, so there may be simpler/easier ways to accomplish the same thing.