Tags: carld/compiler-tutorial
Chapter 1.9 Heap Allocation

Note that the return type of the extern scheme_entry function prototype changed from int to long, so that 64-bit quad words are returned correctly.

    compt master % cat m64.c
    #include <stdio.h>

    int main() {
      printf("%zu\n", sizeof(long));
      printf("%zu\n", sizeof(int));
      printf("%zu\n", sizeof(void *));
      printf("%zu\n", sizeof(long *));
      return 0;
    }
    compt master % ./sizes
    8
    4
    8
    8
    compt master %

With the begin special form introduced, the language is not purely functional anymore...

gdb tip: to show memory (especially when debugging vectors), for example 8 words starting at an address:

    display /8xg 0x100068000

Registers marked as preserved are stored in the context structure and restored after the call to the Scheme entry point.

A pair is two adjacent quad words, and the pointer to it is tagged with 0x00000001. Untagging and dereferencing therefore fold into a single offset: the car is at pair-1 and the cdr is at pair+7.
Chapter 1.8 Iteration via Proper Tail Calls

Collapsing the stack does not consider how many locals are on it; it simply moves the current set of arguments so that they sit adjacent to the first cell, which contains the return address.

tests-1.8 appears to contain tests unrelated to tail calls, so these tests have been removed.

A lot of code can be refactored now.

It seems like everything aside from argument evaluation is in a tail position in this compiler?

passed all 419 tests
Chapter 1.7 Procedures

Reminder: the stack grows downwards in memory, meaning that decreasing the stack pointer uses up more stack space, not less.

From the tutorial (32-bit): the call instruction performs the following: (1) computes the return point (i.e. the address of the instruction following the call instruction), (2) decrements the value of %esp by 4, (3) saves the return point at 0(%esp), then (4) directs the execution to the target of the call.

The ret instruction performs the following: (1) it loads the return point address from 0(%esp), (2) increments the value of %esp by 4, then (3) directs the execution to the return point.

In 64-bit mode the same steps apply with rsp and a word size of 8. On entry to the called procedure, rsp points to the return address of the call.

Regarding the earlier confusion ("not sure about this line"):

    (emit " mov [rsp + 8], rsp") ; stack base argument

the tutorial has:

    movl 4(%esp), %esp

It looks like the tutorial code picks the stack_base argument directly off the stack: in the 32-bit calling convention arguments are passed on the stack, so it can be expected in the machine word just above the current stack pointer. In the 64-bit calling convention, stack_base is passed in a register, and as it is the only argument, that register is rdi. So this line has been changed to the following:

    (emit " mov rsp, rdi")

Using 64-bit cells, a Bus Error would occur in the deeply nested procedures tests because rsp ended up pointing to out-of-bounds memory. More stack memory is needed:

    int stack_size = (16 * 4096); /* holds 16K cells */

Increasing 16 to 32768 enables those tests to pass by allowing enough stack:

    5000000 (call depth in test) * 2 (lambdas) * 8 (wordsize) = 80,000,000
    32768 * 4096 = 134,217,728

Tip: to get gdb to show intel syntax,

    % cat ~/.gdbinit
    set disassembly-flavor intel

Tip: to see code in gdb,

    layout asm
    layout reg

The file tests-1.7-req.scm appeared to repeat the earlier binary primitives tests, so these have been removed.
Other changes:
- Add gcc flags: -g to include debugging symbols, -fomit-frame-pointer so gcc does not generate code that uses the rbp register, -Wall and -pedantic to provide C language warnings
- In this implementation, letrec cannot be nested and has to be the outermost expression.
- Remove reliance on ctype.h for isspace, iscntrl (down the track startup.c can be ported to assembly)
- Emit comments next to some assembly code to help debug the Bus Error (see above)

passed all 400 tests
Chapter 1.6 Local Variables

Variables are implemented by placing values on the stack. The compiler keeps an environment association list that maps the name of each variable to its index on the stack.
- emit-stack-save moves the value in rax to the stack at the current stack index
- emit-load-stack moves the value on the stack at the current index into rax

passed all 381 tests
Chapter 1.5 Binary Primitives

Notes on the System V calling convention:
- return value in rax, plus rdx if it is 128-bit
- parameters in rdi, rsi, rdx, rcx, r8, r9, then on the stack, right to left
- stack aligned to 16 bytes
- scratch registers: rax, rdi, rsi, rdx, rcx, r8, r9, r10, r11
- preserved registers: rbx, rsp, rbp, r12, r13, r14, r15
- call list rbp
- the call instruction pushes the address of the next instruction onto the stack and jumps
- the stack has a 128-byte red zone
- push, pop, call and ret instructions affect rsp
- in 64-bit the word size is a quad word, 8 bytes
- the stack grows down in memory, so the stack index starts at -8

not sure about this line:

    (emit " mov [rsp + 8], rsp") ; stack base argument

tutorial has:

    movl 4(%esp), %esp
Exercise 1.2 Immediate Constants, from the compiler tutorial.

Notes:
- compile-program is a chez scheme builtin proc, so use emit-program for the compiler procedure instead,
- formatted error messages use the errorf procedure in chez scheme,
- use a makefile to build the executable, and set the path to gcc and nasm in the makefile, outside tests-driver,
- tests-driver had duplicated functions: get-string, test-with-string-output, execute, build-program