Skip to content

Latest commit

 

History

History

chapter03

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 

Chapter 3

In this chapter we extend the language grammar to allow variable declarations. This allows us to write:

int a=1; 
int b=2; 
int c=0; 
{ 
  int b=3; 
  c=a+b; 
} 
return c;

Here is the complete language grammar for this chapter.

Extensions to Intermediate Representation

To recap, our list of Nodes are:

Node Name Type Description Inputs Value
Start Control Start of function None None for now as we do not have function arguments yet
Return Control End of function Predecessor control node, Data node value Return value of the function
Constant Data Constants such as integer literals None, however Start node is set as input to enable graph walking Value of the constant
Add Data Add two values Two data nodes, values are added, order not important Result of the add operation
Sub Data Subtract a value from another Two data nodes, values are subtracted, order matters Result of the subtract
Mul Data Multiply two values Two data nodes, values are multiplied, order not important Result of the multiply
Div Data Divide a value by another Two data nodes, values are divided, order matters Result of the division
Minus Data Negate a value One data node, value is negated Result of the unary minus

In this chapter we introduce:

Node Name Type Description Inputs Value
Scope Symbol Table Represents scopes in the graph All nodes that define variables None

Symbol Tables

To support variable declarations and lexical scopes, we introduce symbol tables in the parser. A symbol table maps names to nodes. We maintain a stack of symbol tables. When a new scope is created, we push a new symbol table to the stack, when scope exits, the topmost symbol table is popped from the stack.

Declaring a name adds it to the current symbol table. If a name is assigned to, then its mapping in the most recent symbol table is updated. If a name is accessed, its mapping is looked up in the symbol tables. The lookup goes up the stack of symbol tables.

ScopeNode

We wrap the stack of symbol tables in a ScopeNode. When a name binding is added to a symbol table, we also make the newly added node an input in the ScopeNode. This means that the ScopeNode is a user of all nodes that define new names.

From an implementation point of view, the symbol table maps names to offsets in the ScopeNode's inputs.

When a scope is exited, we not only pop the symbol table representing the scope, but also remove the relevant input nodes.

In the visualization, we show the ScopeNode in a separate box; the stack of symbol tables is shown inside the box. We show the edge from each variable in the symbol table to the defining Node in the graph.

Demonstration

We show the output from following code.

01      int a=1; 
02      int b=2; 
03      int c=0; 
04      { 
05        int b=3; 
07        c=a+b; 
08      } 
09      return c;
  • When we start parsing above, we start with a symbol table (level 0). The variables a, b, and c are registered in this table.
  • On line 4, we start a nested scope. At this point, a new symbol table is pushed (level 1).
  • On line 5, a variable b is declared. b is registered in the symbol table at level 1. This then hides the variable with the same name on line 2.
  • On line 7, we update the variable c. Variable lookup will find it in the symbol table at level 0. The node graph is shown below (note that peephole opts are switched off here so that we can see the graph as its initially constructed).

Graph1

The diagram shows the ScopeNode and the symbol tables in the bottom cluster. Each table is annotated by its scope level, and shows the variables defined in that scope.

  • When we exit the nested scope on line 8, the symbol table at level 1 is popped.
  • At line 9, our graph looks like this:

Graph2

As we see, only 1 symbol table is present at this point.

Enabling peephole optimizations

Once we enable peephole optimizations, the parser is able to reduce all the expressions down to a constant. This is illustrated nicely below.

01      int x0=1; 
02      int y0=2; 
03      int x1=3; 
04      int y1=4; 
05      return (x0-x1)*(x0-x1) + (y0-y1)*(y0-y1);

The graph for above, is displayed below.

Graph2

The return node has the constant 8 as its data value!