*******************************************************************************
Chapter 7: Extending the Language: Mutable Variables / SSA construction
*******************************************************************************

Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Max
Shawabkeh <http://max99x.com>`_

Introduction
=======================

Welcome to Chapter 7 of the `Implementing a language with LLVM
<http://www.llvm.org/docs/tutorial/index.html>`_ tutorial. In chapters 1
through 6, we've built a very respectable, albeit simple, `functional
programming language <http://en.wikipedia.org/wiki/Functional_programming>`_.
In our journey, we learned some parsing techniques, how to build and represent
an AST, how to build LLVM IR, and how to optimize the resultant code as well as
JIT compile it.

While Kaleidoscope is interesting as a functional language, the fact that it is
functional makes it "too easy" to generate LLVM IR for it. In particular, a
functional language makes it very easy to build LLVM IR directly in `SSA form
<http://en.wikipedia.org/wiki/Static_single_assignment_form>`_. Since LLVM
requires that the input code be in SSA form, this is a very nice property and it
is often unclear to newcomers how to generate code for an imperative language
with mutable variables.

The short (and happy) summary of this chapter is that there is no need for your
front-end to build SSA form: LLVM provides highly tuned and well tested support
for this, though the way it works is a bit unexpected for some.

Why is this a hard problem?
====================================

To understand why mutable variables cause complexities in SSA construction,
consider this extremely simple C example:

.. code-block:: c

    int G, H;
    int test(_Bool Condition) {
      int X;
      if (Condition)
        X = G;
      else
        X = H;
      return X;
    }

In this case, we have the variable "X", whose value depends on the path executed
in the program.
Because there are two different possible values for X before the return
instruction, a PHI node is inserted to merge the two values. The LLVM IR that we
want for this example looks like this:

.. code-block:: llvm

    @G = weak global i32 0   ; type of @G is i32*
    @H = weak global i32 0   ; type of @H is i32*

    define i32 @test(i1 %Condition) {
    entry:
      br i1 %Condition, label %cond_true, label %cond_false

    cond_true:
      %X.0 = load i32* @G
      br label %cond_next

    cond_false:
      %X.1 = load i32* @H
      br label %cond_next

    cond_next:
      %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
      ret i32 %X.2
    }

In this example, the loads from the G and H global variables are explicit in the
LLVM IR, and they live in the then/else branches of the if statement
(cond\_true/cond\_false). In order to merge the incoming values, the X.2 phi
node in the cond\_next block selects the right value to use based on where
control flow is coming from: if control flow comes from the cond\_false block,
X.2 gets the value of X.1. Alternatively, if control flow comes from cond\_true,
it gets the value of X.0. The intent of this chapter is not to explain the
details of SSA form. For more information, see one of the many `online
references <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.

The question for this article is "who places the phi nodes when lowering
assignments to mutable variables?". The issue here is that LLVM *requires* that
its IR be in SSA form: there is no "non-ssa" mode for it. However, SSA
construction requires non-trivial algorithms and data structures, so it is
inconvenient and wasteful for every front-end to have to reproduce this logic.

Memory in LLVM
==========================

The 'trick' here is that while LLVM does require all register values to be in
SSA form, it does not require (or permit) memory objects to be in SSA form. In
the example above, note that the loads from G and H are direct accesses to G and
H: they are not renamed or versioned.
This differs from some other compiler systems, which do try to version memory
objects. In LLVM, instead of encoding dataflow analysis of memory into the LLVM
IR, it is handled with `Analysis Passes
<http://www.llvm.org/docs/WritingAnLLVMPass.html>`_ which are computed on
demand.

With this in mind, the high-level idea is that we want to make a stack variable
(which lives in memory, because it is on the stack) for each mutable object in a
function. To take advantage of this trick, we need to talk about how LLVM
represents stack variables.

In LLVM, all memory accesses are explicit with load/store instructions, and it
is carefully designed not to have (or need) an "address-of" operator. Notice how
the type of the @G/@H global variables is actually "i32\*" even though the
variable is defined as "i32". What this means is that @G defines *space* for an
i32 in the global data area, but its *name* actually refers to the address for
that space. Stack variables work the same way, except that instead of being
declared with global variable definitions, they are declared with the `LLVM
alloca instruction <http://www.llvm.org/docs/LangRef.html#i_alloca>`_:

.. code-block:: llvm

    define i32 @example() {
    entry:
      %X = alloca i32           ; type of %X is i32*
      ...
      %tmp = load i32* %X       ; load the stack value %X from the stack
      %tmp2 = add i32 %tmp, 1   ; increment it
      store i32 %tmp2, i32* %X  ; store it back
      ...

This code shows an example of how you can declare and manipulate a stack
variable in the LLVM IR. Stack memory allocated with the alloca instruction is
fully general: you can pass the address of the stack slot to functions, you can
store it in other variables, etc. In our example above, we could rewrite the
example to use the alloca technique to avoid using a PHI node:

.. code-block:: llvm

    @G = weak global i32 0   ; type of @G is i32*
    @H = weak global i32 0   ; type of @H is i32*

    define i32 @test(i1 %Condition) {
    entry:
      %X = alloca i32           ; type of %X is i32 *.
      br i1 %Condition, label %cond_true, label %cond_false

    cond_true:
      %X.0 = load i32* @G
      store i32 %X.0, i32* %X   ; Update X
      br label %cond_next

    cond_false:
      %X.1 = load i32* @H
      store i32 %X.1, i32* %X   ; Update X
      br label %cond_next

    cond_next:
      %X.2 = load i32* %X       ; Read X
      ret i32 %X.2
    }

With this, we have discovered a way to handle arbitrary mutable variables
without the need to create Phi nodes at all:

#. Each mutable variable becomes a stack allocation.
#. Each read of the variable becomes a load from the stack.
#. Each update of the variable becomes a store to the stack.
#. Taking the address of a variable just uses the stack address directly.

While this solution has solved our immediate problem, it introduced another one:
we have now apparently introduced a lot of stack traffic for very simple and
common operations, a major performance problem. Fortunately for us, the LLVM
optimizer has a highly-tuned optimization pass named "mem2reg" that handles this
case, promoting allocas like this into SSA registers, inserting Phi nodes as
appropriate. If you run this example through the pass, for example, you'll get:

.. code-block:: bash

    $ llvm-as < example.ll | opt -mem2reg | llvm-dis

.. code-block:: llvm

    @G = weak global i32 0
    @H = weak global i32 0

    define i32 @test(i1 %Condition) {
    entry:
      br i1 %Condition, label %cond_true, label %cond_false

    cond_true:
      %X.0 = load i32* @G
      br label %cond_next

    cond_false:
      %X.1 = load i32* @H
      br label %cond_next

    cond_next:
      %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
      ret i32 %X.01
    }

The mem2reg pass implements the standard "iterated dominance frontier" algorithm
for constructing SSA form and has a number of optimizations that speed up (very
common) degenerate cases. The mem2reg optimization pass is the answer to dealing
with mutable variables, and we highly recommend that you depend on it. Note that
mem2reg only works on variables in certain circumstances:

#. mem2reg is alloca-driven: it looks for allocas and if it can handle them, it
   promotes them. It does not apply to global variables or heap allocations.
#. mem2reg only looks for alloca instructions in the entry block of the
   function. Being in the entry block guarantees that the alloca is only
   executed once, which makes analysis simpler.
#. mem2reg only promotes allocas whose uses are direct loads and stores. If the
   address of the stack object is passed to a function, or if any funny pointer
   arithmetic is involved, the alloca will not be promoted.
#. mem2reg only works on allocas of `first class
   <http://www.llvm.org/docs/LangRef.html#t_classifications>`_ values (such as
   pointers, scalars and vectors), and only if the array size of the allocation
   is 1 (or missing in the .ll file). mem2reg is not capable of promoting
   structs or arrays to registers. Note that the "scalarrepl" pass is more
   powerful and can promote structs, "unions", and arrays in many cases.

All of these properties are easy to satisfy for most imperative languages, and
we'll illustrate it below with Kaleidoscope. The final question you may be
asking is: should I bother with this nonsense for my front-end? Wouldn't it be
better if I just did SSA construction directly, avoiding use of the mem2reg
optimization pass? In short, we strongly recommend that you use this technique
for building SSA form, unless there is an extremely good reason not to. Using
this technique is:

- Proven and well tested: llvm-gcc and clang both use this technique for local
  mutable variables. As such, the most common clients of LLVM are using this to
  handle a bulk of their variables. You can be sure that bugs are found fast and
  fixed early.
- Extremely Fast: mem2reg has a number of special cases that make it fast in
  common cases as well as fully general.
  For example, it has fast-paths for variables that are only used in a single
  block, variables that only have one assignment point, good heuristics to
  avoid insertion of unneeded phi nodes, etc.
- Needed for debug info generation: `Debug information in LLVM
  <http://www.llvm.org/docs/SourceLevelDebugging.html>`_ relies on having the
  address of the variable exposed so that debug info can be attached to it.
  This technique dovetails very naturally with this style of debug info.

If nothing else, this makes it much easier to get your front-end up and running,
and is very simple to implement. Let's extend Kaleidoscope with mutable
variables now!

--------------

Mutable Variables in Kaleidoscope
==============================================

Now that we know the sort of problem we want to tackle, let's see what this
looks like in the context of our little Kaleidoscope language. We're going to
add two features:

#. The ability to mutate variables with the '=' operator.
#. The ability to define new variables.

While the first item is really what this is about, we only have variables for
incoming arguments as well as for induction variables, and redefining those only
goes so far :). Also, the ability to define new variables is a useful thing
regardless of whether you will be mutating them. Here's a motivating example
that shows how we could use these:

.. code-block:: none

    # Define ':' for sequencing: as a low-precedence operator that ignores operands
    # and just returns the RHS.
    def binary : 1 (x y) y;

    # Recursive fib, we could do this before.
    def fib(x)
      if (x < 3) then
        1
      else
        fib(x-1) + fib(x-2)

    # Iterative fib.
    def fibi(x)
      var a = 1, b = 1, c in
      (for i = 3, i < x in
        c = a + b :
        a = b :
        b = c) :
      b

    # Call it.
    fibi(10)

In order to mutate variables, we have to change our existing variables to use
the "alloca trick". Once we have that, we'll add our new operator, then extend
Kaleidoscope to support new variable definitions.
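
Before touching the code generator, it may help to see the "alloca trick" in
miniature. The following is a toy model in plain Python, not llvm-py code:
``StackSlot``, ``define_variable``, ``read_variable`` and ``assign_variable``
are hypothetical names invented for this sketch, and the slot object merely
stands in for the pointer that an alloca instruction would return. It shows
only the bookkeeping idea, under the assumption that the symbol table maps
each name to a memory slot instead of to a value:

.. code-block:: python

    # Toy model of the "alloca trick": names map to memory slots, not values.
    # Every name in this sketch is hypothetical and exists only for illustration.

    class StackSlot(object):
      """Stands in for the pointer returned by an alloca instruction."""

      def __init__(self, name):
        self.name = name
        self.value = None

      def store(self, value):
        self.value = value

      def load(self):
        return self.value


    # Maps a variable name to its slot (the role g_named_values plays later on).
    named_values = {}


    def define_variable(name, initial_value):
      slot = StackSlot(name)        # each mutable variable becomes a slot
      slot.store(initial_value)
      named_values[name] = slot


    def read_variable(name):
      return named_values[name].load()   # each read becomes a load


    def assign_variable(name, value):
      named_values[name].store(value)    # each update becomes a store
      return value                       # the assignment yields the stored value


    define_variable('x', 123.0)
    before = read_variable('x')
    assign_variable('x', 4.0)
    after = read_variable('x')

Because a read always goes through the slot, the second read observes the
update without the symbol table entry for ``x`` ever being replaced. In the
real compiler the loads and stores are emitted as LLVM instructions rather than
executed directly.
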

--------------

Adjusting Existing Variables for Mutation
==========================================================

The symbol table in Kaleidoscope is managed at code generation time by the
``g_named_values`` map. This map currently keeps track of the LLVM "Value" that
holds the double value for the named variable. In order to support mutation, we
need to change this slightly, so that it holds the *memory location* of the
variable in question. Note that this change is a refactoring: it changes the
structure of the code, but does not (by itself) change the behavior of the
compiler. All of these changes are isolated in the Kaleidoscope code generator.

At this point in Kaleidoscope's development, it only supports variables for two
things: incoming arguments to functions and the induction variable of 'for'
loops. For consistency, we'll allow mutation of these variables in addition to
other user-defined variables. This means that these will both need memory
locations.

To start our transformation of Kaleidoscope, we will need to create the allocas
that we will store in ``g_named_values``. We'll use a helper function that
ensures that the allocas are created in the entry block of the function:

.. code-block:: python

    # Creates an alloca instruction in the entry block of the function. This is used
    # for mutable variables.
    def CreateEntryBlockAlloca(function, var_name):
      entry = function.get_entry_basic_block()
      builder = Builder.new(entry)
      builder.position_at_beginning(entry)
      return builder.alloca(Type.double(), var_name)

This code creates a temporary ``llvm.core.Builder`` that is pointing at the
first instruction of the entry block. It then creates an alloca with the
expected name and returns it. Because all values in Kaleidoscope are doubles,
there is no need to pass in a type to use.

With this in place, the first functionality change we want to make is to
variable references.
In our new scheme, variables live on the stack, so code generating a reference
to them actually needs to produce a load from the stack slot:

.. code-block:: python

    def CodeGen(self):
      if self.name in g_named_values:
        return g_llvm_builder.load(g_named_values[self.name], self.name)
      else:
        raise RuntimeError('Unknown variable name: ' + self.name)

As you can see, this is pretty straightforward. Now we need to update the things
that define the variables to set up the alloca. We'll start with
``ForExpressionNode.CodeGen`` (see the :ref:`full code listing <code>` for the
unabridged code):

.. code-block:: python

    def CodeGen(self):
      function = g_llvm_builder.basic_block.function

      # Create an alloca for the variable in the entry block.
      alloca = CreateEntryBlockAlloca(function, self.loop_variable)

      # Emit the start code first, without 'variable' in scope.
      start_value = self.start.CodeGen()

      # Store the value into the alloca.
      g_llvm_builder.store(start_value, alloca)
      ...
      # Compute the end condition.
      end_condition = self.end.CodeGen()

      # Reload, increment, and restore the alloca. This handles the case where
      # the body of the loop mutates the variable.
      cur_value = g_llvm_builder.load(alloca, self.loop_variable)
      next_value = g_llvm_builder.fadd(cur_value, step_value, 'nextvar')
      g_llvm_builder.store(next_value, alloca)

      # Convert condition to a bool by comparing not equal to 0.0.
      end_condition_bool = g_llvm_builder.fcmp(
          FCMP_ONE, end_condition, Constant.real(Type.double(), 0), 'loopcond')
      ...

This code is virtually identical to the code `before we allowed mutable
variables <PythonLangImpl5.html#forcodegen>`_. The big difference is that we no
longer have to construct a PHI node, and we use load/store to access the
variable as needed.

To support mutable argument variables, we need to also make allocas for them.
The code for this is also pretty simple:

.. code-block:: python

    class PrototypeNode(object):
      ...
      # Create an alloca for each argument and register the argument in the symbol
      # table so that references to it will succeed.
      def CreateArgumentAllocas(self, function):
        for arg_name, arg in zip(self.args, function.args):
          alloca = CreateEntryBlockAlloca(function, arg_name)
          g_llvm_builder.store(arg, alloca)
          g_named_values[arg_name] = alloca

For each argument, we make an alloca, store the input value to the function into
the alloca, and register the alloca as the memory location for the argument.
This method gets invoked by ``FunctionNode.CodeGen`` right after it sets up the
entry block for the function.

The final missing piece is adding the mem2reg pass, which allows us to get good
codegen once again:

.. code-block:: python

    from llvm.passes import (PASS_PROMOTE_MEMORY_TO_REGISTER,
                             PASS_INSTRUCTION_COMBINING,
                             PASS_REASSOCIATE,
                             PASS_GVN,
                             PASS_CFG_SIMPLIFICATION)
    ...
    def main():
      # Set up the optimizer pipeline. Start with registering info about how the
      # target lays out data structures.
      g_llvm_pass_manager.add(g_llvm_executor.target_data)
      # Promote allocas to registers.
      g_llvm_pass_manager.add(PASS_PROMOTE_MEMORY_TO_REGISTER)
      # Do simple "peephole" optimizations and bit-twiddling optzns.
      g_llvm_pass_manager.add(PASS_INSTRUCTION_COMBINING)
      # Reassociate expressions.
      g_llvm_pass_manager.add(PASS_REASSOCIATE)

It is interesting to see what the code looks like before and after the mem2reg
optimization runs. For example, this is the before/after code for our recursive
fib function. Before the optimization:

.. code-block:: llvm

    define double @fib(double %x) {
    entry:
      %x1 = alloca double
      store double %x, double* %x1
      %x2 = load double* %x1
      %cmptmp = fcmp ult double %x2, 3.000000e+00
      %booltmp = uitofp i1 %cmptmp to double
      %ifcond = fcmp one double %booltmp, 0.000000e+00
      br i1 %ifcond, label %then, label %else

    then:    ; preds = %entry
      br label %ifcont

    else:    ; preds = %entry
      %x3 = load double* %x1
      %subtmp = fsub double %x3, 1.000000e+00
      %calltmp = call double @fib(double %subtmp)
      %x4 = load double* %x1
      %subtmp5 = fsub double %x4, 2.000000e+00
      %calltmp6 = call double @fib(double %subtmp5)
      %addtmp = fadd double %calltmp, %calltmp6
      br label %ifcont

    ifcont:    ; preds = %else, %then
      %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
      ret double %iftmp
    }

Here there is only one variable (x, the input argument) but you can still see
the extremely simple-minded code generation strategy we are using. In the entry
block, an alloca is created, and the initial input value is stored into it. Each
reference to the variable does a reload from the stack. Also, note that we
didn't modify the if/then/else expression, so it still inserts a PHI node. While
we could make an alloca for it, it is actually easier to create a PHI node for
it, so we still just make the PHI.

Here is the code after the mem2reg pass runs:

.. code-block:: llvm

    define double @fib(double %x) {
    entry:
      %cmptmp = fcmp ult double %x, 3.000000e+00
      %booltmp = uitofp i1 %cmptmp to double
      %ifcond = fcmp one double %booltmp, 0.000000e+00
      br i1 %ifcond, label %then, label %else

    then:
      br label %ifcont

    else:
      %subtmp = fsub double %x, 1.000000e+00
      %calltmp = call double @fib(double %subtmp)
      %subtmp5 = fsub double %x, 2.000000e+00
      %calltmp6 = call double @fib(double %subtmp5)
      %addtmp = fadd double %calltmp, %calltmp6
      br label %ifcont

    ifcont:    ; preds = %else, %then
      %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
      ret double %iftmp
    }

This is a trivial case for mem2reg, since there are no redefinitions of the
variable.
The point of showing this is to calm your tension about inserting such blatant
inefficiencies :).

After the rest of the optimizers run, we get:

.. code-block:: llvm

    define double @fib(double %x) {
    entry:
      %cmptmp = fcmp ult double %x, 3.000000e+00
      %booltmp = uitofp i1 %cmptmp to double
      %ifcond = fcmp ueq double %booltmp, 0.000000e+00
      br i1 %ifcond, label %else, label %ifcont

    else:
      %subtmp = fsub double %x, 1.000000e+00
      %calltmp = call double @fib(double %subtmp)
      %subtmp5 = fsub double %x, 2.000000e+00
      %calltmp6 = call double @fib(double %subtmp5)
      %addtmp = fadd double %calltmp, %calltmp6
      ret double %addtmp

    ifcont:
      ret double 1.000000e+00
    }

Here we see that the simplifycfg pass decided to clone the return instruction
into the end of the 'else' block. This allowed it to eliminate some branches and
the PHI node.

Now that all symbol table references are updated to use stack variables, we'll
add the assignment operator.

--------------

New Assignment Operator
=======================================

With our current framework, adding a new assignment operator is really simple.
We will parse it just like any other binary operator, but handle it internally
(instead of allowing the user to define it). The first step is to set a
precedence:

.. code-block:: python

    def main():
      ...
      # Install standard binary operators.
      # 1 is lowest possible precedence. 40 is the highest.
      g_binop_precedence['='] = 2
      g_binop_precedence['<'] = 10
      g_binop_precedence['+'] = 20
      g_binop_precedence['-'] = 20

Now that the parser knows the precedence of the binary operator, it takes care
of all the parsing and AST generation. We just need to implement codegen for the
assignment operator. This looks like:

.. code-block:: python

    class BinaryOperatorExpressionNode(ExpressionNode):
      ...
      def CodeGen(self):
        # A special case for '=' because we don't want to emit the LHS as an
        # expression.
        if self.operator == '=':
          # Assignment requires the LHS to be an identifier.
          if not isinstance(self.left, VariableExpressionNode):
            raise RuntimeError('Destination of "=" must be a variable.')

Unlike the rest of the binary operators, our assignment operator doesn't follow
the "emit LHS, emit RHS, do computation" model. As such, it is handled as a
special case before the other binary operators are handled. The other strange
thing is that it requires the LHS to be a variable. It is invalid to have
``(x+1) = expr`` -- only things like ``x = expr`` are allowed.

.. code-block:: python

          # Codegen the RHS.
          value = self.right.CodeGen()

          # Look up the name.
          variable = g_named_values[self.left.name]

          # Store the value and return it.
          g_llvm_builder.store(value, variable)

          return value
        ...

Once we have the variable, CodeGening the assignment is straightforward: we emit
the RHS of the assignment, create a store, and return the computed value.
Returning a value allows for chained assignments like ``X = (Y = Z)``.

Now that we have an assignment operator, we can mutate loop variables and
arguments. For example, we can now run code like this:

.. code-block:: none

    # Function to print a double.
    extern printd(x)

    # Define ':' for sequencing: as a low-precedence operator that ignores operands
    # and just returns the RHS.
    def binary : 1 (x y) y

    def test(x)
      printd(x) :
      x = 4 :
      printd(x)

    test(123)

When run, this example prints "123" and then "4", showing that we did actually
mutate the value! Okay, we have now officially implemented our goal: getting
this to work requires SSA construction in the general case. However, to be
really useful, we want the ability to define our own local variables. Let's add
this next!

--------------

User-defined Local Variables
===========================================

Adding var/in is just like any other extension we made to Kaleidoscope: we
extend the lexer, the parser, the AST and the code generator. The first step for
adding our new 'var/in' construct is to extend the lexer.
As before, this is pretty trivial; the code looks like this:

.. code-block:: python

    ...
    class UnaryToken(object): pass
    class VarToken(object): pass
    ...
    def Tokenize(string):
      ...
          elif identifier == 'unary':
            yield UnaryToken()
          elif identifier == 'var':
            yield VarToken()
          else:
            yield IdentifierToken(identifier)

The next step is to define the AST node that we will construct. For var/in, it
looks like this:

.. code-block:: python

    # Expression class for var/in.
    class VarExpressionNode(ExpressionNode):

      def __init__(self, variables, body):
        self.variables = variables
        self.body = body

      def CodeGen(self):
        ...

var/in allows a list of names to be defined all at once, and each name can
optionally have an initializer value. As such, we capture this information in
the variables list. Also, var/in has a body, which is allowed to access the
variables defined by the var/in.

With this in place, we can define the parser pieces. The first thing we do is
add it as a primary expression:

.. code-block:: python

    # primary ::=
    #     identifierexpr | numberexpr | parenexpr | ifexpr | forexpr | varexpr
    def ParsePrimary(self):
      if isinstance(self.current, IdentifierToken):
        return self.ParseIdentifierExpr()
      elif isinstance(self.current, NumberToken):
        return self.ParseNumberExpr()
      elif isinstance(self.current, IfToken):
        return self.ParseIfExpr()
      elif isinstance(self.current, ForToken):
        return self.ParseForExpr()
      elif isinstance(self.current, VarToken):
        return self.ParseVarExpr()
      elif self.current == CharacterToken('('):
        return self.ParseParenExpr()
      else:
        raise RuntimeError('Unknown token when expecting an expression.')

Next we define ParseVarExpr:

.. code-block:: python

    # varexpr ::= 'var' (identifier ('=' expression)?)+ 'in' expression
    def ParseVarExpr(self):
      self.Next()  # eat 'var'.

      variables = {}

      # At least one variable name is required.
      if not isinstance(self.current, IdentifierToken):
        raise RuntimeError('Expected identifier after "var".')

      # The first part of this code parses the list of identifier/expr pairs
      # into the local variables list.
      while True:
        var_name = self.current.name
        self.Next()  # eat the identifier.

        # Read the optional initializer.
        if self.current == CharacterToken('='):
          self.Next()  # eat '='.
          variables[var_name] = self.ParseExpression()
        else:
          variables[var_name] = None

        # End of var list, exit loop.
        if self.current != CharacterToken(','):
          break
        self.Next()  # eat ','.

        if not isinstance(self.current, IdentifierToken):
          raise RuntimeError('Expected identifier after "," in a var expression.')

      # Once all the variables are parsed, we then parse the body and create the
      # AST node:

      # At this point, we have to have 'in'.
      if not isinstance(self.current, InToken):
        raise RuntimeError('Expected "in" keyword after "var".')
      self.Next()  # eat 'in'.

      body = self.ParseExpression()

      return VarExpressionNode(variables, body)

Now that we can parse and represent the code, we need to support emission of
LLVM IR for it. This code starts out with:

.. code-block:: python

    class VarExpressionNode(ExpressionNode):
      ...
      def CodeGen(self):
        old_bindings = {}
        function = g_llvm_builder.basic_block.function

        # Register all variables and emit their initializer.
        for var_name, var_expression in self.variables.iteritems():
          # Emit the initializer before adding the variable to scope, this prevents
          # the initializer from referencing the variable itself, and permits stuff
          # like this:
          #  var a = 1 in
          #    var a = a in ...   # refers to outer 'a'.
          if var_expression is not None:
            var_value = var_expression.CodeGen()
          else:
            var_value = Constant.real(Type.double(), 0)

          alloca = CreateEntryBlockAlloca(function, var_name)
          g_llvm_builder.store(var_value, alloca)

          # Remember the old variable binding so that we can restore the binding
          # when we unrecurse.
          old_bindings[var_name] = g_named_values.get(var_name, None)

          # Remember this binding.
          g_named_values[var_name] = alloca

Basically it loops over all the variables, installing them one at a time. For
each variable we put into the symbol table, we remember the previous value that
we replace in ``old_bindings``.

There are more comments here than code. The basic idea is that we emit the
initializer, create the alloca, then update the symbol table to point to it.
Once all the variables are installed in the symbol table, we evaluate the body
of the var/in expression:

.. code-block:: python

        # Codegen the body, now that all vars are in scope.
        body = self.body.CodeGen()

Finally, before returning, we restore the previous variable bindings:

.. code-block:: python

        # Pop all our variables from scope.
        for var_name in self.variables:
          if old_bindings[var_name] is not None:
            g_named_values[var_name] = old_bindings[var_name]
          else:
            del g_named_values[var_name]

        # Return the body computation.
        return body

The end result of all of this is that we get properly scoped variable
definitions, and we even (trivially) allow mutation of them :).

With this, we completed what we set out to do. Our nice iterative fib example
from the intro compiles and runs just fine. The mem2reg pass optimizes all of
our stack variables into SSA registers, inserting PHI nodes where needed, and
our front-end remains simple: no "iterated dominance frontier" computation
anywhere in sight.

--------------

.. _code:

Full Code Listing
===========================

Here is the complete code listing for our running example, enhanced with mutable
variables and var/in support:

.. code-block:: python

    #!/usr/bin/env python

    import re
    from llvm.core import Module, Constant, Type, Function, Builder
    from llvm.ee import ExecutionEngine, TargetData
    from llvm.passes import FunctionPassManager

    from llvm.core import FCMP_ULT, FCMP_ONE
    from llvm.passes import (PASS_PROMOTE_MEMORY_TO_REGISTER,
                             PASS_INSTRUCTION_COMBINING,
                             PASS_REASSOCIATE,
                             PASS_GVN,
                             PASS_CFG_SIMPLIFICATION)

Globals
-------

.. code-block:: python

    # The LLVM module, which holds all the IR code.
    g_llvm_module = Module.new('my cool jit')

    # The LLVM instruction builder. Created whenever a new function is entered.
    g_llvm_builder = None

    # A dictionary that keeps track of which values are defined in the current scope
    # and what their LLVM representation is.
    g_named_values = {}

    # The function optimization passes manager.
    g_llvm_pass_manager = FunctionPassManager.new(g_llvm_module)

    # The LLVM execution engine.
    g_llvm_executor = ExecutionEngine.new(g_llvm_module)

    # The binary operator precedence chart.
    g_binop_precedence = {}

    # Creates an alloca instruction in the entry block of the function. This is used
    # for mutable variables.
    def CreateEntryBlockAlloca(function, var_name):
      entry = function.get_entry_basic_block()
      builder = Builder.new(entry)
      builder.position_at_beginning(entry)
      return builder.alloca(Type.double(), var_name)

Lexer
-----

.. code-block:: python

    # The lexer yields one of these types for each token.
    class EOFToken(object): pass
    class DefToken(object): pass
    class ExternToken(object): pass
    class IfToken(object): pass
    class ThenToken(object): pass
    class ElseToken(object): pass
    class ForToken(object): pass
    class InToken(object): pass
    class BinaryToken(object): pass
    class UnaryToken(object): pass
    class VarToken(object): pass

    class IdentifierToken(object):
      def __init__(self, name):
        self.name = name

    class NumberToken(object):
      def __init__(self, value):
        self.value = value

    class CharacterToken(object):
      def __init__(self, char):
        self.char = char
      def __eq__(self, other):
        return isinstance(other, CharacterToken) and self.char == other.char
      def __ne__(self, other):
        return not self == other

    # Regular expressions that match tokens and comments of our language.
    REGEX_NUMBER = re.compile('[0-9]+(?:\.[0-9]+)?')
    REGEX_IDENTIFIER = re.compile('[a-zA-Z][a-zA-Z0-9]*')
    REGEX_COMMENT = re.compile('#.*')

    def Tokenize(string):
      while string:
        # Skip whitespace.
        if string[0].isspace():
          string = string[1:]
          continue

        # Run regexes.
        comment_match = REGEX_COMMENT.match(string)
        number_match = REGEX_NUMBER.match(string)
        identifier_match = REGEX_IDENTIFIER.match(string)

        # Check if any of the regexes matched and yield the appropriate result.
        if comment_match:
          comment = comment_match.group(0)
          string = string[len(comment):]
        elif number_match:
          number = number_match.group(0)
          yield NumberToken(float(number))
          string = string[len(number):]
        elif identifier_match:
          identifier = identifier_match.group(0)
          # Check if we matched a keyword.
          if identifier == 'def':
            yield DefToken()
          elif identifier == 'extern':
            yield ExternToken()
          elif identifier == 'if':
            yield IfToken()
          elif identifier == 'then':
            yield ThenToken()
          elif identifier == 'else':
            yield ElseToken()
          elif identifier == 'for':
            yield ForToken()
          elif identifier == 'in':
            yield InToken()
          elif identifier == 'binary':
            yield BinaryToken()
          elif identifier == 'unary':
            yield UnaryToken()
          elif identifier == 'var':
            yield VarToken()
          else:
            yield IdentifierToken(identifier)
          string = string[len(identifier):]
        else:
          # Yield the unknown character as a CharacterToken.
          yield CharacterToken(string[0])
          string = string[1:]

      yield EOFToken()

Abstract Syntax Tree (aka Parse Tree)
-------------------------------------

.. code-block:: python

    # Base class for all expression nodes.
    class ExpressionNode(object):
      pass

    # Expression class for numeric literals like "1.0".
    class NumberExpressionNode(ExpressionNode):

      def __init__(self, value):
        self.value = value

      def CodeGen(self):
        return Constant.real(Type.double(), self.value)

    # Expression class for referencing a variable, like "a".
    class VariableExpressionNode(ExpressionNode):

      def __init__(self, name):
        self.name = name

      def CodeGen(self):
        if self.name in g_named_values:
          return g_llvm_builder.load(g_named_values[self.name], self.name)
        else:
          raise RuntimeError('Unknown variable name: ' + self.name)

    # Expression class for a binary operator.
   class BinaryOperatorExpressionNode(ExpressionNode):

     def __init__(self, operator, left, right):
       self.operator = operator
       self.left = left
       self.right = right

     def CodeGen(self):
       # A special case for '=' because we don't want to emit the LHS as an
       # expression.
       if self.operator == '=':
         # Assignment requires the LHS to be an identifier.
         if not isinstance(self.left, VariableExpressionNode):
           raise RuntimeError('Destination of "=" must be a variable.')

         # Codegen the RHS.
         value = self.right.CodeGen()

         # Look up the name.
         variable = g_named_values[self.left.name]

         # Store the value and return it.
         g_llvm_builder.store(value, variable)

         return value

       left = self.left.CodeGen()
       right = self.right.CodeGen()

       if self.operator == '+':
         return g_llvm_builder.fadd(left, right, 'addtmp')
       elif self.operator == '-':
         return g_llvm_builder.fsub(left, right, 'subtmp')
       elif self.operator == '*':
         return g_llvm_builder.fmul(left, right, 'multmp')
       elif self.operator == '<':
         result = g_llvm_builder.fcmp(FCMP_ULT, left, right, 'cmptmp')
         # Convert bool 0 or 1 to double 0.0 or 1.0.
         return g_llvm_builder.uitofp(result, Type.double(), 'booltmp')
       else:
         function = g_llvm_module.get_function_named('binary' + self.operator)
         return g_llvm_builder.call(function, [left, right], 'binop')

   # Expression class for function calls.
   class CallExpressionNode(ExpressionNode):

     def __init__(self, callee, args):
       self.callee = callee
       self.args = args

     def CodeGen(self):
       # Look up the name in the global module table.
       callee = g_llvm_module.get_function_named(self.callee)

       # Check for argument mismatch error.
       if len(callee.args) != len(self.args):
         raise RuntimeError('Incorrect number of arguments passed.')

       arg_values = [i.CodeGen() for i in self.args]

       return g_llvm_builder.call(callee, arg_values, 'calltmp')

   # Expression class for if/then/else.
   class IfExpressionNode(ExpressionNode):

     def __init__(self, condition, then_branch, else_branch):
       self.condition = condition
       self.then_branch = then_branch
       self.else_branch = else_branch

     def CodeGen(self):
       condition = self.condition.CodeGen()

       # Convert condition to a bool by comparing non-equal to 0.0.
       condition_bool = g_llvm_builder.fcmp(
           FCMP_ONE, condition, Constant.real(Type.double(), 0), 'ifcond')

       function = g_llvm_builder.basic_block.function

       # Create blocks for the then and else cases. Insert the 'then' block at the
       # end of the function.
       then_block = function.append_basic_block('then')
       else_block = function.append_basic_block('else')
       merge_block = function.append_basic_block('ifcond')

       g_llvm_builder.cbranch(condition_bool, then_block, else_block)

       # Emit then value.
       g_llvm_builder.position_at_end(then_block)
       then_value = self.then_branch.CodeGen()
       g_llvm_builder.branch(merge_block)

       # Codegen of 'Then' can change the current block; update then_block for the
       # PHI node.
       then_block = g_llvm_builder.basic_block

       # Emit else block.
       g_llvm_builder.position_at_end(else_block)
       else_value = self.else_branch.CodeGen()
       g_llvm_builder.branch(merge_block)

       # Codegen of 'Else' can change the current block; update else_block for the
       # PHI node.
       else_block = g_llvm_builder.basic_block

       # Emit merge block.
       g_llvm_builder.position_at_end(merge_block)
       phi = g_llvm_builder.phi(Type.double(), 'iftmp')
       phi.add_incoming(then_value, then_block)
       phi.add_incoming(else_value, else_block)

       return phi

   # Expression class for for/in.
   class ForExpressionNode(ExpressionNode):

     def __init__(self, loop_variable, start, end, step, body):
       self.loop_variable = loop_variable
       self.start = start
       self.end = end
       self.step = step
       self.body = body

     def CodeGen(self):
       # Output this as:
       #   var = alloca double
       #   ...
       #   start = startexpr
       #   store start -> var
       #   goto loop
       # loop:
       #   ...
       #   bodyexpr
       #   ...
       # loopend:
       #   step = stepexpr
       #   endcond = endexpr
       #
       #   curvar = load var
       #   nextvar = curvar + step
       #   store nextvar -> var
       #   br endcond, loop, endloop
       # outloop:

       function = g_llvm_builder.basic_block.function

       # Create an alloca for the variable in the entry block.
       alloca = CreateEntryBlockAlloca(function, self.loop_variable)

       # Emit the start code first, without 'variable' in scope.
       start_value = self.start.CodeGen()

       # Store the value into the alloca.
       g_llvm_builder.store(start_value, alloca)

       # Make the new basic block for the loop, inserting after current block.
       loop_block = function.append_basic_block('loop')

       # Insert an explicit fall through from the current block to the loop_block.
       g_llvm_builder.branch(loop_block)

       # Start insertion in loop_block.
       g_llvm_builder.position_at_end(loop_block)

       # Within the loop, the variable is defined equal to the alloca. If it
       # shadows an existing variable, we have to restore it, so save it now.
       old_value = g_named_values.get(self.loop_variable, None)
       g_named_values[self.loop_variable] = alloca

       # Emit the body of the loop. This, like any other expr, can change the
       # current BB. Note that we ignore the value computed by the body.
       self.body.CodeGen()

       # Emit the step value.
       if self.step:
         step_value = self.step.CodeGen()
       else:
         # If not specified, use 1.0.
         step_value = Constant.real(Type.double(), 1)

       # Compute the end condition.
       end_condition = self.end.CodeGen()

       # Reload, increment, and restore the alloca. This handles the case where
       # the body of the loop mutates the variable.
       cur_value = g_llvm_builder.load(alloca, self.loop_variable)
       next_value = g_llvm_builder.fadd(cur_value, step_value, 'nextvar')
       g_llvm_builder.store(next_value, alloca)

       # Convert condition to a bool by comparing non-equal to 0.0.
       end_condition_bool = g_llvm_builder.fcmp(
           FCMP_ONE, end_condition, Constant.real(Type.double(), 0), 'loopcond')

       # Create the "after loop" block and insert it.
       after_block = function.append_basic_block('afterloop')

       # Insert the conditional branch into the end of loop_block.
       g_llvm_builder.cbranch(end_condition_bool, loop_block, after_block)

       # Any new code will be inserted in after_block.
       g_llvm_builder.position_at_end(after_block)

       # Restore the unshadowed variable.
       if old_value is not None:
         g_named_values[self.loop_variable] = old_value
       else:
         del g_named_values[self.loop_variable]

       # for expr always returns 0.0.
       return Constant.real(Type.double(), 0)

   # Expression class for a unary operator.
   class UnaryExpressionNode(ExpressionNode):

     def __init__(self, operator, operand):
       self.operator = operator
       self.operand = operand

     def CodeGen(self):
       operand = self.operand.CodeGen()
       function = g_llvm_module.get_function_named('unary' + self.operator)
       return g_llvm_builder.call(function, [operand], 'unop')

   # Expression class for var/in.
   class VarExpressionNode(ExpressionNode):

     def __init__(self, variables, body):
       self.variables = variables
       self.body = body

     def CodeGen(self):
       old_bindings = {}
       function = g_llvm_builder.basic_block.function

       # Register all variables and emit their initializer.
       for var_name, var_expression in self.variables.iteritems():
         # Emit the initializer before adding the variable to scope, this prevents
         # the initializer from referencing the variable itself, and permits stuff
         # like this:
         #  var a = 1 in
         #    var a = a in ...   # refers to outer 'a'.
         if var_expression is not None:
           var_value = var_expression.CodeGen()
         else:
           var_value = Constant.real(Type.double(), 0)

         alloca = CreateEntryBlockAlloca(function, var_name)
         g_llvm_builder.store(var_value, alloca)

         # Remember the old variable binding so that we can restore the binding
         # when we unrecurse.
         old_bindings[var_name] = g_named_values.get(var_name, None)

         # Remember this binding.
         g_named_values[var_name] = alloca

       # Codegen the body, now that all vars are in scope.
       body = self.body.CodeGen()

       # Pop all our variables from scope.
       for var_name in self.variables:
         if old_bindings[var_name] is not None:
           g_named_values[var_name] = old_bindings[var_name]
         else:
           del g_named_values[var_name]

       # Return the body computation.
       return body

   # This class represents the "prototype" for a function, which captures its name,
   # and its argument names (thus implicitly the number of arguments the function
   # takes), as well as if it is an operator.
   class PrototypeNode(object):

     def __init__(self, name, args, is_operator=False, precedence=0):
       self.name = name
       self.args = args
       self.is_operator = is_operator
       self.precedence = precedence

     def IsBinaryOp(self):
       return self.is_operator and len(self.args) == 2

     def GetOperatorName(self):
       assert self.is_operator
       return self.name[-1]

     def CodeGen(self):
       # Make the function type, eg. double(double,double).
       funct_type = Type.function(
           Type.double(), [Type.double()] * len(self.args), False)

       function = Function.new(g_llvm_module, funct_type, self.name)

       # If the name conflicted, there was already something with the same name.
       # If it has a body, don't allow redefinition or reextern.
       if function.name != self.name:
         function.delete()
         function = g_llvm_module.get_function_named(self.name)

         # If the function already has a body, reject this.
         if not function.is_declaration:
           raise RuntimeError('Redefinition of function.')

         # If the function took a different number of args, reject.
         if len(function.args) != len(self.args):
           raise RuntimeError('Redeclaration of a function with different number '
                              'of args.')

       # Set names for all arguments. They are added to the symbol table later,
       # in CreateArgumentAllocas.
       for arg, arg_name in zip(function.args, self.args):
         arg.name = arg_name

       return function

     # Create an alloca for each argument and register the argument in the symbol
     # table so that references to it will succeed.
     def CreateArgumentAllocas(self, function):
       for arg_name, arg in zip(self.args, function.args):
         alloca = CreateEntryBlockAlloca(function, arg_name)
         g_llvm_builder.store(arg, alloca)
         g_named_values[arg_name] = alloca

   # This class represents a function definition itself.
   class FunctionNode(object):

     def __init__(self, prototype, body):
       self.prototype = prototype
       self.body = body

     def CodeGen(self):
       # Clear scope.
       g_named_values.clear()

       # Create a function object.
       function = self.prototype.CodeGen()

       # If this is a binary operator, install its precedence.
       if self.prototype.IsBinaryOp():
         operator = self.prototype.GetOperatorName()
         g_binop_precedence[operator] = self.prototype.precedence

       # Create a new basic block to start insertion into.
       block = function.append_basic_block('entry')
       global g_llvm_builder
       g_llvm_builder = Builder.new(block)

       # Add all arguments to the symbol table and create their allocas.
       self.prototype.CreateArgumentAllocas(function)

       # Finish off the function.
       try:
         return_value = self.body.CodeGen()
         g_llvm_builder.ret(return_value)

         # Validate the generated code, checking for consistency.
         function.verify()

         # Optimize the function.
         g_llvm_pass_manager.run(function)
       except:
         function.delete()
         if self.prototype.IsBinaryOp():
           del g_binop_precedence[self.prototype.GetOperatorName()]
         raise

       return function

Parser
------

.. code-block:: python

   class Parser(object):

     def __init__(self, tokens):
       self.tokens = tokens
       self.Next()

     # Provide a simple token buffer. Parser.current is the current token the
     # parser is looking at. Parser.Next() reads another token from the lexer and
     # updates Parser.current with its results.
     def Next(self):
       self.current = self.tokens.next()

     # Gets the precedence of the current token, or -1 if the token is not a binary
     # operator.
     def GetCurrentTokenPrecedence(self):
       if isinstance(self.current, CharacterToken):
         return g_binop_precedence.get(self.current.char, -1)
       else:
         return -1

     # identifierexpr ::= identifier | identifier '(' expression* ')'
     def ParseIdentifierExpr(self):
       identifier_name = self.current.name
       self.Next()  # eat identifier.

       if self.current != CharacterToken('('):  # Simple variable reference.
         return VariableExpressionNode(identifier_name)

       # Call.
       self.Next()  # eat '('.
       args = []
       if self.current != CharacterToken(')'):
         while True:
           args.append(self.ParseExpression())
           if self.current == CharacterToken(')'):
             break
           elif self.current != CharacterToken(','):
             raise RuntimeError('Expected ")" or "," in argument list.')
           self.Next()

       self.Next()  # eat ')'.
       return CallExpressionNode(identifier_name, args)

     # numberexpr ::= number
     def ParseNumberExpr(self):
       result = NumberExpressionNode(self.current.value)
       self.Next()  # consume the number.
       return result

     # parenexpr ::= '(' expression ')'
     def ParseParenExpr(self):
       self.Next()  # eat '('.

       contents = self.ParseExpression()

       if self.current != CharacterToken(')'):
         raise RuntimeError('Expected ")".')
       self.Next()  # eat ')'.

       return contents

     # ifexpr ::= 'if' expression 'then' expression 'else' expression
     def ParseIfExpr(self):
       self.Next()  # eat the if.

       # condition.
       condition = self.ParseExpression()

       if not isinstance(self.current, ThenToken):
         raise RuntimeError('Expected "then".')
       self.Next()  # eat the then.

       then_branch = self.ParseExpression()

       if not isinstance(self.current, ElseToken):
         raise RuntimeError('Expected "else".')
       self.Next()  # eat the else.

       else_branch = self.ParseExpression()

       return IfExpressionNode(condition, then_branch, else_branch)

     # forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
     def ParseForExpr(self):
       self.Next()  # eat the for.

       if not isinstance(self.current, IdentifierToken):
         raise RuntimeError('Expected identifier after for.')

       loop_variable = self.current.name
       self.Next()  # eat the identifier.
       if self.current != CharacterToken('='):
         raise RuntimeError('Expected "=" after for variable.')
       self.Next()  # eat the '='.

       start = self.ParseExpression()

       if self.current != CharacterToken(','):
         raise RuntimeError('Expected "," after for start value.')
       self.Next()  # eat the ','.

       end = self.ParseExpression()

       # The step value is optional.
       if self.current == CharacterToken(','):
         self.Next()  # eat the ','.
         step = self.ParseExpression()
       else:
         step = None

       if not isinstance(self.current, InToken):
         raise RuntimeError('Expected "in" after for variable specification.')
       self.Next()  # eat 'in'.

       body = self.ParseExpression()

       return ForExpressionNode(loop_variable, start, end, step, body)

     # varexpr ::= 'var' (identifier ('=' expression)?)+ 'in' expression
     def ParseVarExpr(self):
       self.Next()  # eat 'var'.

       variables = {}

       # At least one variable name is required.
       if not isinstance(self.current, IdentifierToken):
         raise RuntimeError('Expected identifier after "var".')

       while True:
         var_name = self.current.name
         self.Next()  # eat the identifier.

         # Read the optional initializer.
         if self.current == CharacterToken('='):
           self.Next()  # eat '='.
           variables[var_name] = self.ParseExpression()
         else:
           variables[var_name] = None

         # End of var list, exit loop.
         if self.current != CharacterToken(','):
           break
         self.Next()  # eat ','.

         if not isinstance(self.current, IdentifierToken):
           raise RuntimeError('Expected identifier after "," in a var expression.')

       # At this point, we have to have 'in'.
       if not isinstance(self.current, InToken):
         raise RuntimeError('Expected "in" keyword after "var".')
       self.Next()  # eat 'in'.
       body = self.ParseExpression()

       return VarExpressionNode(variables, body)

     # primary ::=
     #   identifierexpr | numberexpr | parenexpr | ifexpr | forexpr | varexpr
     def ParsePrimary(self):
       if isinstance(self.current, IdentifierToken):
         return self.ParseIdentifierExpr()
       elif isinstance(self.current, NumberToken):
         return self.ParseNumberExpr()
       elif isinstance(self.current, IfToken):
         return self.ParseIfExpr()
       elif isinstance(self.current, ForToken):
         return self.ParseForExpr()
       elif isinstance(self.current, VarToken):
         return self.ParseVarExpr()
       elif self.current == CharacterToken('('):
         return self.ParseParenExpr()
       else:
         raise RuntimeError('Unknown token when expecting an expression.')

     # unary ::= primary | unary_operator unary
     def ParseUnary(self):
       # If the current token is not an operator, it must be a primary expression.
       if (not isinstance(self.current, CharacterToken) or
           self.current in [CharacterToken('('), CharacterToken(',')]):
         return self.ParsePrimary()

       # If this is a unary operator, read it.
       operator = self.current.char
       self.Next()  # eat the operator.
       return UnaryExpressionNode(operator, self.ParseUnary())

     # binoprhs ::= (binary_operator unary)*
     def ParseBinOpRHS(self, left, left_precedence):
       # If this is a binary operator, find its precedence.
       while True:
         precedence = self.GetCurrentTokenPrecedence()

         # If this is a binary operator that binds at least as tightly as the
         # current one, consume it; otherwise we are done.
         if precedence < left_precedence:
           return left

         binary_operator = self.current.char
         self.Next()  # eat the operator.

         # Parse the unary expression after the binary operator.
         right = self.ParseUnary()

         # If binary_operator binds less tightly with right than the operator after
         # right, let the pending operator take right as its left.
         next_precedence = self.GetCurrentTokenPrecedence()
         if precedence < next_precedence:
           right = self.ParseBinOpRHS(right, precedence + 1)

         # Merge left/right.
         left = BinaryOperatorExpressionNode(binary_operator, left, right)

     # expression ::= unary binoprhs
     def ParseExpression(self):
       left = self.ParseUnary()
       return self.ParseBinOpRHS(left, 0)

     # prototype
     #   ::= id '(' id* ')'
     #   ::= binary LETTER number? (id, id)
     #   ::= unary LETTER (id)
     def ParsePrototype(self):
       precedence = None
       if isinstance(self.current, IdentifierToken):
         kind = 'normal'
         function_name = self.current.name
         self.Next()  # eat function name.
       elif isinstance(self.current, UnaryToken):
         kind = 'unary'
         self.Next()  # eat 'unary'.
         if not isinstance(self.current, CharacterToken):
           raise RuntimeError('Expected an operator after "unary".')
         function_name = 'unary' + self.current.char
         self.Next()  # eat the operator.
       elif isinstance(self.current, BinaryToken):
         kind = 'binary'
         self.Next()  # eat 'binary'.
         if not isinstance(self.current, CharacterToken):
           raise RuntimeError('Expected an operator after "binary".')
         function_name = 'binary' + self.current.char
         self.Next()  # eat the operator.
         if isinstance(self.current, NumberToken):
           if not 1 <= self.current.value <= 100:
             raise RuntimeError('Invalid precedence: must be in range [1, 100].')
           precedence = self.current.value
           self.Next()  # eat the precedence.
       else:
         raise RuntimeError('Expected function name, "unary" or "binary" in '
                            'prototype.')

       if self.current != CharacterToken('('):
         raise RuntimeError('Expected "(" in prototype.')
       self.Next()  # eat '('.

       arg_names = []
       while isinstance(self.current, IdentifierToken):
         arg_names.append(self.current.name)
         self.Next()

       if self.current != CharacterToken(')'):
         raise RuntimeError('Expected ")" in prototype.')

       # Success.
       self.Next()  # eat ')'.
       if kind == 'unary' and len(arg_names) != 1:
         raise RuntimeError('Invalid number of arguments for a unary operator.')
       elif kind == 'binary' and len(arg_names) != 2:
         raise RuntimeError('Invalid number of arguments for a binary operator.')

       return PrototypeNode(function_name, arg_names, kind != 'normal', precedence)

     # definition ::= 'def' prototype expression
     def ParseDefinition(self):
       self.Next()  # eat def.
       proto = self.ParsePrototype()
       body = self.ParseExpression()
       return FunctionNode(proto, body)

     # toplevelexpr ::= expression
     def ParseTopLevelExpr(self):
       proto = PrototypeNode('', [])
       return FunctionNode(proto, self.ParseExpression())

     # external ::= 'extern' prototype
     def ParseExtern(self):
       self.Next()  # eat extern.
       return self.ParsePrototype()

     # Top-Level parsing
     def HandleDefinition(self):
       self.Handle(self.ParseDefinition, 'Read a function definition:')

     def HandleExtern(self):
       self.Handle(self.ParseExtern, 'Read an extern:')

     def HandleTopLevelExpression(self):
       try:
         function = self.ParseTopLevelExpr().CodeGen()
         result = g_llvm_executor.run_function(function, [])
         print 'Evaluated to:', result.as_real(Type.double())
       except Exception, e:
         # Report the error, then skip a token so the parser can recover.
         print 'Error:', e
         try:
           self.Next()  # Skip for error recovery.
         except:
           pass

     def Handle(self, function, message):
       try:
         print message, function().CodeGen()
       except Exception, e:
         # Report the error, then skip a token so the parser can recover.
         print 'Error:', e
         try:
           self.Next()  # Skip for error recovery.
         except:
           pass

Main driver code.
-----------------

.. code-block:: python

   def main():
     # Set up the optimizer pipeline. Start with registering info about how the
     # target lays out data structures.
     g_llvm_pass_manager.add(g_llvm_executor.target_data)
     # Promote allocas to registers.
     g_llvm_pass_manager.add(PASS_PROMOTE_MEMORY_TO_REGISTER)
     # Do simple "peephole" optimizations and bit-twiddling optzns.
     g_llvm_pass_manager.add(PASS_INSTRUCTION_COMBINING)
     # Reassociate expressions.
     g_llvm_pass_manager.add(PASS_REASSOCIATE)
     # Eliminate Common SubExpressions.
     g_llvm_pass_manager.add(PASS_GVN)
     # Simplify the control flow graph (deleting unreachable blocks, etc).
     g_llvm_pass_manager.add(PASS_CFG_SIMPLIFICATION)

     g_llvm_pass_manager.initialize()

     # Install standard binary operators.
     # 1 is lowest possible precedence. 40 is the highest.
     g_binop_precedence['='] = 2
     g_binop_precedence['<'] = 10
     g_binop_precedence['+'] = 20
     g_binop_precedence['-'] = 20
     g_binop_precedence['*'] = 40

     # Run the main "interpreter loop".
     while True:
       print 'ready<',
       try:
         raw = raw_input()
       except KeyboardInterrupt:
         break

       parser = Parser(Tokenize(raw))
       while True:
         # top ::= definition | external | expression | EOF
         if isinstance(parser.current, EOFToken):
           break
         if isinstance(parser.current, DefToken):
           parser.HandleDefinition()
         elif isinstance(parser.current, ExternToken):
           parser.HandleExtern()
         else:
           parser.HandleTopLevelExpression()

     # Print out all of the generated code.
     print '', g_llvm_module

   if __name__ == '__main__':
     main()
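
The ``var`` and ``for`` code generators in the listing share one scoping pattern: remember any binding that the new name shadows, install the new binding, generate the body, then restore the old state. The following is a minimal sketch of that pattern in isolation and is not part of the tutorial code. The names ``named_values`` and ``WithBinding`` are hypothetical, plain strings stand in for the alloca values, and the ``try``/``finally`` is an addition that the listing itself does not use.

.. code-block:: python

   # Hypothetical stand-in for g_named_values; strings replace alloca values.
   named_values = {}

   def WithBinding(name, value, body):
     # Save the binding that 'name' shadows, if any.
     old_value = named_values.get(name, None)
     named_values[name] = value
     try:
       # Run the body with the new binding in scope.
       return body()
     finally:
       # Restore the shadowed binding, or drop the name if there was none.
       if old_value is not None:
         named_values[name] = old_value
       else:
         del named_values[name]

   named_values['a'] = 'outer'
   seen_inside = WithBinding('a', 'inner', lambda: named_values['a'])
   seen_after = named_values['a']

Restoring in a ``finally`` clause keeps the table consistent even when the body raises. The listing restores only on the success path, which is sufficient there because ``FunctionNode.CodeGen`` clears ``g_named_values`` at the start of every function.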