- Scanning
- Add script tool to generate AST.
- Use visitor pattern, implement Lisp-like pretty printing as example
- Add parsing for expressions
- Add evaluator, again using the Visitor pattern
- Add support for statements, including expression statements and "print expr ;" statements
- Add support for global vars
- Add support for assignment
- Add lexical scoping, (chain of) enclosing environments
- Add control flow (branching with if & logical and/or, and while & for loops)
- Add support for function decls and calls, including return values. -- Add a builtin func, clock(), which is a wrapper around Java's System.currentTimeMillis() -- throw runtime Lox exception from interpreter when user attempts to call something that isn't callable, like a String. (Instead of allowing a java exception to be seen) -- For now, we'll always set the function's body's parent env as the top-level global environment. So if an identifier isn't defined in the function itself, the interpreter will look outside the func and in the global scope. However, this means functions nested inside blocks or other functions, aka local functions, and closures are not yet supported.
- (Around about here, stop updating AstPrinter visitor)
- Add support for closures.
- Fix bug adding support for closures introduced, using Resolver - see grammar.md for more info
- Use resolver to detect errors redeclaring variables twice in same local scope
- Disallow returnStmt outside of function
- Add support for classes, including this keyword to access self within a class
- Add constructors & initializers, allow init() to be invoked directly, like instance.init(), disallow returning a value from init(). But allow empty return statement (which has no value) from init, but return this instead of nil
- Add inheritance, including of methods, and calling superclass methods with the
super
keyword.
- Testing in Java
- C-style block comments
- For more practice with visitor pattern, implement printer for Reverse Polish notation, where
(1 + 2) * (4 - 3)
becomes1 2 + 4 3 - *
(note there will be ambiguity with negation vs subtraction, so it'll be necessary to redefine~
as the negation symbol) - Add C-style ternary operator
- Add suppport for break statements (branch: break-stmts)
- Add support for anonymous functions (branch: anon-funcs)
- Add support for "static" class methods using metaclasses. A static class method can be added like so:
class Math {
class square(n) {
return n * n;
}
}
print Math.square(3) // 9
A separate metaclass exists for each class. It holds any static (class) methods of the class. (branch: class-methods)
- Created foundation for chunks. A chunk is a sequence of instructions, and we've built a dynamic array structure for a Chunk. The Chunk struct now holds an array of bytes, in which we can store OpCodes.
- Create a disassembler that takes a chunk and prints all the instructions in it. It'll be used by us, lox maintainers, but not lox users. See the debug files
- Store constants (literals) in a constant pool, ValueArray. It'll be a dynamic array of values. An instruction using a constant looks up the value by index there. Each chunk has its own ValueArray. The opcode OP_CONSTANT produces the literal value. Also add debug funcs so we can print the value of the OP_CONSTANT instruction
- Store line #'s in a separate field in the chunk. This is simple, and a bit inefficient, but a pro is that it's stored separately from the instructions themselves, which is good b/c we only access it when there's a runtime error.
- Add a VM
- Add infrastructure for a value stack, using stack-based (instead of register based)
- Setup REPL and runFile functions to read user input (source code)
- setup barebones scanner to take as input stream of source code
- Implement barebones Pratt parser that can handle grouping, unary, numbers, and some binary ops
- Add macros and structs for storing lox's dynamic types
- Support runtime errors and unary negation
- Support binary arithmetic operators
- Add nil and bools, not and falsiness
- add support for strings, as StringObj which "inherit" from Obj, add cleanup funcs that will clean up VM before it exits (although our running VM won't free memory until we write the GC)
- Add hash tables. Included in the implementation are a tombstone delete method:
- upon deletion store a tombstone entry (key=NULL, value=TrueValue)
- findEntry returns entry with key = NULL if no entry found with given key. This'll either be a totally empty entry or the first tombstone (key=NULL, value=TrueValue) encountered
- findEntry could go into infinite loop if no empty slots are encountered, so we need keep track of tombstone count when resizing, and ensure the array is never completely full -- for load factor, treat tombstones like full buckets to prevent infinite loop -- when deleting an entry, don't reduce the count. Count is not the # of entries in the hash table, but the number of entries + tombstones. -- So increment the count during insertion iff the new entry goes into a totally empty bucket, not a tombstone -- when resizing the hash's array, throw out all tombstones (b/c during resize we copy all entries to new table)
- Use a hash table to do string interning: create a collection of 'interned' or 'internal' strings. Strings in the collection are guaranteed to be textually distinct. When encountering a string, look for a matching string in the collection. If found - use the original one. Otherwise, add the string to the collection. (sidenote: this is how ruby symbols are implemented). This will speed up string equality, but also: method calls & instance fields, which are looked up by name at runtime (since Lox is dynamically typed), will be much faster
- Add support for global variables using a hash table,
globals
, on th vm. - Add support for local variables using a stack. The stack offsets will be operands for the bytecode instructions that read & store local vars.
- Control flow for ifs, elses, and, or, and while statements
- Add for statements, again by desugaring the for to a while with some extra stuff before it & at the end of the body.
- Support for calls and functions. To implement multiple chunks for representing the call stack, we wrap the top level program in an implicit main() function
- Support for Closures. In jlox we dynamically allocated memory for all local vars, but that'd be slower as clox uses a stack which is really fast. Most locals aren't captured by closures and do use stack semantics. Local variables in clox will be implemented with 1 of 2 different strategies. For locals that aren't captured in closures, keep them as they are on the stack. When a local is captured by a closure, we'll use another method that lifts them on to the heap where they can then live as long as needed.
- Use upvalue concept: a local variable in an enclosing function. Every closure maintains an array of upvalues, one for each surrounding local var that the closure uses. This allows us to have a closed-over variable live on the stack exactly like a normal local var until it is closed over.
- The upvalue will point back into the stack to where it'the var it captured lives. When the closure accesses a closed-over var, it goes through the corresponding upvalue to reach it
- Mark-Sweep GC.
- It does not free the following:
- roots (any object the VM can reach directly w/o going through another object's reference. Most roots are global vars or on the stack or upvalues)
- any object referred to from a reachable object
- When to run the GC? Use a self-adjusting heap. As the amount of live memory increases, collect less frequently to avoid sacrificing throughput by re-traversing the growing pile of objects. As the amount of live memory goes down, collect more frequently to not lose too much latency by waiting too long
- It does not free the following:
- Classes, including Instances of classes, which can have fields added dynamically added at runtime using a hash table, methods with 'this' bound to the instance the method was accessed from using 'Bound Methods'. ObjBoundMethod wraps the method closure and the receiver (this) together
- Add Optimization: remove frequent modulo operation, replace with fast bit manipulations instead
- Add Optimization: when appropriate, store values in IEEE 754 format
- Instead of storing line numbers in an array the length of codes, where every element is the line number that code is on, use a run-length encoding scheme. ** although this addition was reverted when chunks started being used for each function, instead of 1 chunk per file.
- Add testing framework (interpret_test.c)