Essentially we define recursive equations over sets attached to control-flow graph nodes, then iterate until a fixed point. We compute local sets per block, then propagate globally.
Reaching Definitions
Which assignments might arrive here? This is done via forward analysis, taking in all paths that count.
Local sets:
GEN[B]= definitions created in B.KILL[B]= all defs in the entire program that write to the same vars as Bโs defs whether or not they reach B
Equations:
OUT[B] = GEN[B] โช (IN[B] - KILL[B])IN[B] = โช OUT[p]for predecessorsp
This is used for dead code elimination and constant propagation, where if all reaching defs of a variable assign the same constant, we replace the use with that constant.
Dominators
What must be passed through to get here? Node A dominates node B if every path from entry to B goes through A. Again we use forward analysis, but the meet op is intersection, meaning that the property must hold on all paths.
Equation: dom(n) = โฉ{dom(m) | m โ pred(n)} โช {n}
This is used for building dominator trees, inserting SSA phi-functions, and scoping value numbering.
Available Expressions
Has this expression already been computed on all paths? Forward analysis, meet operator is intersection, meaning that it must be available on every incoming path.
Local sets:
- DEExpr(b) = expressions computed in b whose operands arenโt redefined after (downward exposed).
- ExprKill(b) = expressions killed by any assignment to their operands in b.
Equation: Avail(b) = โฉ_{x โ pred(b)} (DEExpr(x) โช (Avail(x) - ExprKill(x)))
- Initialized via
Avail(entry) = โ, and all others start at U (universal set) since weโre intersecting.
This is used for global common subexpression elimination. Basically, if a + b is available, we reuse the earlier result instead of recomputing.
Constant Propagation
Is this variable always the same constant? Forward analysis, meet is intersection (must agree on all paths).
Equations (same shape as reaching defs):
OUT[S] = GEN[S] โช (IN[S] - KILL[S])IN[S] = โฉ_{p โ pred(S)} OUT[p]
We build duild def-use chains (from reaching defs), propagate constants forward, evaluate constant expressions at compile time, and repeat until nothing changes.
Liveness Analysis
This is just backward analysis that flows from successors to predecessors, and the meet operator is union.
Local sets:
UEVAR(b)= variables used in b before being defined in b (upward exposed)VARKILL(b)= variables assigned/defined in b.
Equations:
LIVEOUT(b) = โช_{s โ succ(b)} LIVEIN(s)LIVEIN(b) = UEVAR(b) โช (LIVEOUT(b) - VARKILL(b))
This is mainly used for register allocation. two variables that are live at the same time interfere and canโt share a register.
Computation
Taking everything from above, the formalized steps are:
- Compute local sets (GEN/KILL or UEVAR/VARKILL or DEExpr/ExprKill) for each block
- Initialize IN/OUT sets (โ for entry/exit boundary, โ or U for others depending on meet operator โ โ for union-based, U for intersection-based)
- For each block, recompute IN then OUT (or OUT then IN for backward) using the equations
- Stop when nothing changes โ fixed point