SSA form Static Single Assignment is popular in compiler intermediate representations. Since SSA has no mutable variables, phi functions are an important feature.
Let’s take a look at an alternative to phi functions: basic block arguments.
Phi functions
Here’s a loop counting to 10 in a fictional SSA IR:
entry:
v1 = const 1
jump loop
loop:
v10 = phi(entry: v1, loop: v11)
v11 = add v10, 1
v12 = cmp le v11, 10
branch v12, loop, exit
exit:
return
The phi function binds the v10
value in a control flow dependent way: When coming from the entry
block, use v1
. When coming from the loop
block, use v11
. There must be a phi operand for each predecessor block so the value is always defined.
Basic block arguments
Here is the same loop in a fictional IR that uses basic block arguments instead of phi functions:
entry:
v1 = const 1
jump loop(v1)
loop(v10):
v11 = add v10, 1
v12 = cmp le v11, 10
branch v12, loop(v11), exit
exit:
return
The v10
value is no longer defined by a phi function. Instead it appears as an
argument to the loop
basic block. These block arguments work just like
function arguments: You must provide a value when jumping to the basic block.
The biggest advantage to using basic block arguments is that phi functions disappear, and it is no longer necessary to update phi operand lists when making changes to the control flow. Of course, compiler optimization passes must now make sure to always provide the correct arguments when branching to a basic block. While that is technically equivalent to updating phi operand lists, in practice it turns out to be a lot easier.
Corner cases
Block arguments are almost equivalent to phi functions, but there are some corner cases to consider. One case is terminator instructions that mention the same basic block more than once. Consider:
entry:
branch v12, block(v3), block(v4)
block(v20):
...
This IR has a perfectly well-defined meaning, but it can’t be represented with a phi function without introducing an extra basic block. At some point, code generation will also have to create a separate basic block to distinguish the two cases. Depending on how high-level or low-level the IR is, you can choose whether to allow this kind of thing.
Block arguments can also be used to handle function calls that can either return normally or throw an exception. Compare this to something like LLVM’s invoke
instruction:
entry:
invoke f() to normal unwind lpad
normal(v10):
# Normal return value in v10
lpad(v20):
# Exception v20 thrown
LLVM’s landingpad
instruction is a bit of a hack to deal with this.