forked from DoctorWkt/acwj
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Added proper global variable declarations.
- Loading branch information
Warren Toomey
committed
Oct 21, 2019
1 parent
0f04a49
commit e6c760f
Showing
53 changed files
with
2,807 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
SRCS= cg.c decl.c expr.c gen.c main.c misc.c scan.c stmt.c \ | ||
sym.c tree.c types.c | ||
|
||
ARMSRCS= cg_arm.c decl.c expr.c gen.c main.c misc.c scan.c stmt.c \ | ||
sym.c tree.c types.c | ||
|
||
comp1: $(SRCS) | ||
cc -o comp1 -g -Wall $(SRCS) | ||
|
||
comp1arm: $(ARMSRCS) | ||
cc -o comp1arm -g -Wall $(ARMSRCS) | ||
cp comp1arm comp1 | ||
|
||
clean: | ||
rm -f comp1 comp1arm *.o *.s out | ||
|
||
test: comp1 tests/runtests | ||
(cd tests; chmod +x runtests; ./runtests) | ||
|
||
armtest: comp1arm tests/runtests | ||
(cd tests; chmod +x runtests; ./runtests) | ||
|
||
test16: comp1 tests/input16.c lib/printint.c | ||
./comp1 tests/input16.c | ||
cc -o out out.s lib/printint.c | ||
./out | ||
|
||
armtest16: comp1arm tests/input16.c lib/printint.c | ||
./comp1 tests/input16.c | ||
cc -o out out.s lib/printint.c | ||
./out |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,205 @@ | ||
# Part 16: Declaring Global Variables Properly | ||
|
||
I did promise to look at the issue of adding offsets to pointers, but | ||
I need to do some thinking about that first. So I've decided to move | ||
global variable declarations out of function declarations. Actually, | ||
I've also left the parsing of variable declarations inside functions, because | ||
later on we will change them to be local variable declarations. | ||
|
||
I also wanted to extend our grammar so that we can declare multiple | ||
variables with the same type at the same time, e.g. | ||
|
||
``` | ||
int x, y, z; | ||
``` | ||
|
||
## The New BNF Grammar | ||
|
||
Here is the new BNF grammar for global declarations, both functions and | ||
variables: | ||
|
||
``` | ||
global_declarations : global_declarations | ||
| global_declaration global_declarations | ||
; | ||
global_declaration: function_declaration | var_declaration ; | ||
function_declaration: type identifier '(' ')' compound_statement ; | ||
var_declaration: type identifier_list ';' ; | ||
type: type_keyword opt_pointer ; | ||
type_keyword: 'void' | 'char' | 'int' | 'long' ; | ||
opt_pointer: <empty> | '*' opt_pointer ; | ||
identifier_list: identifier | identifier ',' identifier_list ; | ||
``` | ||
|
||
Both `function_declaration` and `global_declaration` start with a `type`. | ||
This is now a `type_keyword` followed by `opt_pointer` which is zero or more | ||
'*' tokens. After this, both `function_declaration` and `global_declaration` | ||
must be followed by one identifier. | ||
|
||
However, after the `type`, `var_declaration` is followed by an | ||
`identifier_list`, which is one or more `identifier`s separated by a ',' token. | ||
Also `var_declaration` must end with a ';' token but `function_declaration` | ||
ends with a `compound_statement` and no ';' token. | ||
|
||
## New Tokens | ||
|
||
We now have the T_COMMA token for the ',' character in `scan.c`. | ||
|
||
## Changes to `decl.c` | ||
|
||
We now convert the above BNF grammar into a set of recursive descent | ||
functions but, as we can do looping, we can turn some of the recursion | ||
into internal loops. | ||
|
||
### `global_declarations()` | ||
|
||
As there are one or more global declarations, we can loop parsing | ||
each one. When we run out of tokens, we can leave the loop. | ||
|
||
``` | ||
// Parse one or more global declarations, either | ||
// variables or functions | ||
void global_declarations(void) { | ||
struct ASTnode *tree; | ||
int type; | ||
while (1) { | ||
// We have to read past the type and identifier | ||
// to see either a '(' for a function declaration | ||
// or a ',' or ';' for a variable declaration. | ||
// Text is filled in by the ident() call. | ||
type = parse_type(); | ||
ident(); | ||
if (Token.token == T_LPAREN) { | ||
// Parse the function declaration and | ||
// generate the assembly code for it | ||
tree = function_declaration(type); | ||
genAST(tree, NOREG, 0); | ||
} else { | ||
// Parse the global variable declaration | ||
var_declaration(type); | ||
} | ||
// Stop when we have reached EOF | ||
if (Token.token == T_EOF) | ||
break; | ||
} | ||
} | ||
``` | ||
|
||
Knowing that, for now we only have global variables and functions, we | ||
can scan in the type here and the first identifier. Then, we look at | ||
the next token. If it's a '(', we call `function_declaration()`. If not, | ||
we can assume that it is a `var_declaration()`. We pass the `type` | ||
in to both functions. | ||
|
||
Now that we are receiving the AST `tree` from `function_declaration()` | ||
here, we can generate the code from the AST tree immediately. This code | ||
was in `main()` but has now been moved here. `main()` now only has to | ||
call `global_declarations()`: | ||
|
||
``` | ||
scan(&Token); // Get the first token from the input | ||
genpreamble(); // Output the preamble | ||
global_declarations(); // Parse the global declarations | ||
genpostamble(); // Output the postamble | ||
``` | ||
|
||
### `var_declaration()` | ||
|
||
The parsing of functions is much the same as before, except the code | ||
to scan the type and identifer are done elsewhere, and we receive the | ||
`type` as an argument. | ||
|
||
The parsing of variables also loses the type and identifier scanning code. | ||
We can add the identifier to the global symbol and generate the assembly | ||
code for it. But now we need to add in a loop. If there's a following ',', | ||
loop back to get the next identifier with the same type. And if there's | ||
a following ';', that's the end of the variable declarations. | ||
|
||
``` | ||
// Parse the declaration of a list of variables. | ||
// The identifier has been scanned & we have the type | ||
void var_declaration(int type) { | ||
int id; | ||
while (1) { | ||
// Text now has the identifier's name. | ||
// Add it as a known identifier | ||
// and generate its space in assembly | ||
id = addglob(Text, type, S_VARIABLE, 0); | ||
genglobsym(id); | ||
// If the next token is a semicolon, | ||
// skip it and return. | ||
if (Token.token == T_SEMI) { | ||
scan(&Token); | ||
return; | ||
} | ||
// If the next token is a comma, skip it, | ||
// get the identifier and loop back | ||
if (Token.token == T_COMMA) { | ||
scan(&Token); | ||
ident(); | ||
continue; | ||
} | ||
fatal("Missing , or ; after identifier"); | ||
} | ||
} | ||
``` | ||
|
||
## Not Quite Local Variables | ||
|
||
`var_declaration()` can now parse a list of variable declarations, but | ||
it requires the type and first identifier to be pre-scanned. | ||
|
||
Thus, I've left the call to `var_declaration()` in `single_statement()` | ||
in `stmt.c`. Later on, we will modify this to declarare local variables. | ||
But for now, all of the variables in this example program are globals: | ||
|
||
``` | ||
int d, f; | ||
int *e; | ||
int main() { | ||
int a, b, c; | ||
b= 3; c= 5; a= b + c * 10; | ||
printint(a); | ||
d= 12; printint(d); | ||
e= &d; f= *e; printint(f); | ||
return(0); | ||
} | ||
``` | ||
|
||
## Testing the Changes | ||
|
||
The above code is our `tests/input16.c`. As always, we can test it: | ||
|
||
``` | ||
$ make test16 | ||
cc -o comp1 -g -Wall cg.c decl.c expr.c gen.c main.c misc.c scan.c | ||
stmt.c sym.c tree.c types.c | ||
./comp1 tests/input16.c | ||
cc -o out out.s lib/printint.c | ||
./out | ||
53 | ||
12 | ||
12 | ||
``` | ||
|
||
|
||
## Conclusion and What's Next | ||
|
||
In the next part of our compiler writing journey, | ||
I promise to tackle the issue of adding offsets to pointers. |
Oops, something went wrong.