# **NTNU** | Norwegian University of Science and Technology

#### **Compiler Construction**

Lecture 13: Intermediate representations and SSA

Michael Engel

### Overview

- More on intermediate representations
  - Efficient implementation
  - Translating an AST into linear IR
  - Static single assignment (SSA) form



### Three-address code again

• Most operations in three-address code (TAC) have the form

i = j op k

- one operator (op), two operands (j and k) and one result (i)
- some operators will need fewer arguments
  - e.g. immediate loads and jumps
- sometimes, an op with more than three addresses is needed
- Three-address code is reasonably compact
  - most ops consist of four items: an operation and three names
  - both the operation and the names are drawn from limited sets
  - operations typically require 1 or 2 bytes
  - names are typically represented by integers or table indices
    - in either case, 4 bytes is usually enough
- Data structure choices affect the costs of operations on IR
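As a rough illustration of the size estimate above, the sketch below packs one quadruple into a fixed-size record with Python's `struct` module. The opcode numbering and the use of small integers as name indices are assumptions made for this example, not something the lecture prescribes.

```python
import struct

# Hypothetical opcode numbers, chosen only for illustration.
OPCODES = {"load_imm": 0, "load": 1, "mul": 2, "sub": 3}

# One quadruple: a 2-byte opcode followed by three 4-byte names (table indices).
QUAD = struct.Struct("<HIII")

def encode(op, result, arg1, arg2):
    """Pack one three-address operation into a fixed-size record."""
    return QUAD.pack(OPCODES[op], result, arg1, arg2)

def decode(record):
    """Unpack a record into (opcode number, result, arg1, arg2)."""
    return QUAD.unpack(record)

record = encode("mul", 3, 1, 2)   # t3 = t1 * t2, with names stored as indices
print(QUAD.size)                  # 2 bytes for the opcode + 3 * 4 bytes for the names
print(decode(record))
```

With this layout each operation occupies 14 bytes, which is the kind of compactness the bullet points describe.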

# **TAC example**

- TAC resembles a RISC-like *register machine* 
  - Operands have to be loaded into registers
  - Operations (other than load/store) operate on register values
  - Results are delivered in registers
- Fewer constraints on naming and allocating registers than on real machines

TAC code for  $a - 2 \times b$ 

| Result |   | Operand | Operator | Operand |
|--------|---|---------|----------|---------|
| t1     | ← | 2       |          |         |
| t2     | ← | b       |          |         |
| t3     | ← | t1      | ×        | t2      |
| t4     | ← | a       |          |         |
| t5     | ← | t4      | -        | t3      |

ARM assembler code for  $a - 2 \times b$ 

| Instruction | Operands   | Comment           |
|-------------|------------|-------------------|
| MOV         | R1, #2     | // R1=2           |
| LDR         | R2, =b     |                   |
| LDR         | R2, [R2]   | // R2=b           |
| MUL         | R3, R1, R2 | // R3=2*b         |
| LDR         | R4, =a     |                   |
| LDR         | R4, [R4]   | // R4=a           |
| SUB         | R5, R4, R3 | // R5=R4-R3=a-2*b |
# **Representing Linear IRs**





- **Simple array**: the simplest form
  - a short array represents each basic block
  - often, the compiler writer places the array inside CFG nodes
- **Array of pointers**: groups quadruples into a block
  - the pointer array can be contained in a CFG node
- **Linked list**: links the quadruples together to form a list
  - requires less storage in the CFG node
  - at the cost of restricting accesses to sequential traversals

# Tradeoffs of different representations

- Use case: optimization of code
- Example: rearranging the code in this block
  - What are the costs incurred for each representation?
- Op 1 loads a constant into a register
  - on most machines this translates directly into an immediate load operation
- Ops 2 and 4 load values from memory
  - on most machines this might incur a multicycle delay (unless the values are already in the primary cache)
- To hide some of the delay, the instruction scheduler might move the loads of **b** and **a** in front of the immediate load of **2** 
  - What is the cost of doing this?
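One way to get a feel for the cost is to count the writes each representation needs when an operation is moved towards the front of the block. The sketch below is an illustration under stated assumptions: the simple array is modelled as a flat list of fields, the array of pointers as a list of references, and the linked list with a hypothetical `Node` class. The opcode spellings (`loadI`, `load`, `mult`, `sub`) are likewise assumed.

```python
FIELDS = 4  # fields per quadruple: op, result, arg1, arg2

def block():
    """The five operations for a - 2 * b as (op, result, arg1, arg2) tuples."""
    return [
        ("loadI", "t1", "2", ""),
        ("load", "t2", "b", ""),
        ("mult", "t3", "t1", "t2"),
        ("load", "t4", "a", ""),
        ("sub", "t5", "t4", "t3"),
    ]

def move_flat(fields, src, dst):
    """Simple array: quadruples are stored inline, so every record between
    dst and src is shifted field by field. Requires dst < src."""
    moved = fields[src * FIELDS:(src + 1) * FIELDS]
    writes = 0
    for i in range(src * FIELDS - 1, dst * FIELDS - 1, -1):
        fields[i + FIELDS] = fields[i]
        writes += 1
    fields[dst * FIELDS:(dst + 1) * FIELDS] = moved
    return writes + FIELDS

def move_pointers(quads, src, dst):
    """Array of pointers: only one reference per slot is shifted."""
    moved = quads[src]
    writes = 0
    for i in range(src, dst, -1):
        quads[i] = quads[i - 1]
        writes += 1
    quads[dst] = moved
    return writes + 1

class Node:
    def __init__(self, quad, next=None):
        self.quad, self.next = quad, next

def to_list(quads):
    """Build a singly linked list and return its head."""
    head = None
    for quad in reversed(quads):
        head = Node(quad, head)
    return head

def move_linked(head, src, dst):
    """Linked list: three pointer updates, but locating the nodes needs a
    sequential walk. Requires dst < src. Returns (new head, pointer writes)."""
    sentinel = Node(None, head)
    before_dst = sentinel
    for _ in range(dst):
        before_dst = before_dst.next
    before_src = sentinel
    for _ in range(src):
        before_src = before_src.next
    node = before_src.next
    before_src.next = node.next   # unlink the node
    node.next = before_dst.next   # point it at the old occupant of dst
    before_dst.next = node        # link it in
    return sentinel.next, 3

flat = [field for quad in block() for field in quad]
print(move_flat(flat, 1, 0))            # field writes for the simple array
print(move_pointers(block(), 1, 0))     # reference writes for the pointer array
print(move_linked(to_list(block()), 1, 0)[1])
```

The counts grow with the distance of the move for both array forms, while the linked list pays a constant number of pointer updates plus the traversal needed to find the nodes.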









#### Array of pointers: move 2 ahead of 1








#### Linked list: move 2 ahead of 1






# A closer look at TAC

- Most modern computers (still) try to look like a von Neumann machine (even though they are far more complex internally)
- A von Neumann machine has three main components:
  - Control unit
  - Data path + ALU
  - Unified memory for instructions and data
- A clock controls the execution of instructions
  - Instruction fetch (from memory, addressed by the PC)
  - Operand fetch (from memory addresses encoded in instr.)
  - Execute the instruction
  - Write back the results




### Instruction classes

#### We need

- Instructions for control unit
- Data for data unit/ALU
  - Instructions and data are in memory
    - we can use symbolic names for these instead of numeric addresses:
      - *Labels* for instructions
      - Names for variables
- We can categorize instructions:



### TAC is a low-level IR

"Three address" since each operation deals with at most three addresses in memory (+ the instruction itself):

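| Kind             | Form       | Notes                    |
|------------------|------------|--------------------------|
| Binary operation | a = b OP c | OP is ADD, MUL, SUB, ... |
| Unary operation  | a = OP b   | OP is NEG, MINUS, ...    |
| Copy             | a = b      |                          |
| Load/store       | x = &y     | address of y             |
| Load/store       | x = \*y    | value at address y       |
| Load/store       | x[i] = y   | address + offset         |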



# **Control flow in TAC**

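#### **Control flow is equally simple:**

| Kind               | Form             | Meaning                           |
|--------------------|------------------|-----------------------------------|
| Label              | L:               | named address of next instruction |
| Unconditional jump | jump L           | go to L and get next instruction  |
| Conditional jump   | if x goto L      | go to L if x is TRUE              |
| Conditional jump   | ifFALSE x goto L | go to L if x is FALSE             |
| Conditional jump   | if x<y goto L    | comparison operators              |
| Conditional jump   | if x>=y goto L   | comparison operators              |
| Conditional jump   | if x!=y goto L   | comparison operators              |
| Call and return    | param x          | x is parameter in next call       |
| Call and return    | call L           | similar to jump                   |
| Call and return    | return           | ...to where we came from          |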
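To make the jump semantics concrete, here is a minimal interpreter sketch for a subset of these operations (copy, binary operations, labels, and the three kinds of jump). The tuple encoding of instructions is an assumption made for this example, and `param`, `call` and `return` are left out to keep it short.

```python
import operator

BINOPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
RELOPS = {"<": operator.lt, ">=": operator.ge, "!=": operator.ne}

def run(program, env=None):
    """Execute a list of TAC tuples and return the final variable bindings."""
    env = dict(env or {})
    # A label names the position of the instruction that follows it.
    labels = {ins[1]: pc for pc, ins in enumerate(program) if ins[0] == "label"}

    def value(operand):
        # Strings are variable names, everything else is a constant.
        return env[operand] if isinstance(operand, str) else operand

    pc = 0
    while pc < len(program):
        ins = program[pc]
        kind = ins[0]
        pc += 1
        if kind == "copy":                      # ("copy", x, y)
            env[ins[1]] = value(ins[2])
        elif kind == "binop":                   # ("binop", x, op, y, z)
            _, dst, op, a, b = ins
            env[dst] = BINOPS[op](value(a), value(b))
        elif kind == "jump":                    # ("jump", L)
            pc = labels[ins[1]]
        elif kind == "if":                      # ("if", x, L)
            if value(ins[1]):
                pc = labels[ins[2]]
        elif kind == "ifFALSE":                 # ("ifFALSE", x, L)
            if not value(ins[1]):
                pc = labels[ins[2]]
        elif kind == "ifrel":                   # ("ifrel", x, rel, y, L)
            _, a, rel, b, target = ins
            if RELOPS[rel](value(a), value(b)):
                pc = labels[target]
        elif kind != "label":
            raise ValueError(f"unknown instruction: {ins!r}")
    return env

# Sum the integers 1 .. n-1 with a loop built from a label and two jumps.
program = [
    ("copy", "s", 0),
    ("copy", "i", 1),
    ("label", "Ltest"),
    ("ifrel", "i", ">=", "n", "Lend"),
    ("binop", "s", "+", "s", "i"),
    ("binop", "i", "+", "i", 1),
    ("jump", "Ltest"),
    ("label", "Lend"),
]
print(run(program, {"n": 5})["s"])
```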

# **Translating to TAC**



#### Translation of binary operators:

We make use of the recursive nature of our AST.

No matter how complex the expressions e1 and e2 are, **e1 OP e2** can be translated in three steps:

- first, (recursively) translate **e1** and store its result
- then, (recursively) translate **e2** and store its result
- finally, combine the two stored results using **OP**
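The three steps map directly onto a recursive function. The sketch below assumes a hypothetical AST encoding in which a leaf is a constant or a variable name and an inner node is a tuple `(op, e1, e2)`. Temporaries are numbered with a simple counter.

```python
import itertools

def translate(expr, code, temps):
    """Append TAC for expr to code and return the name holding its result."""
    if isinstance(expr, tuple):
        op, e1, e2 = expr
        left = translate(e1, code, temps)     # step 1: translate e1
        right = translate(e2, code, temps)    # step 2: translate e2
        result = f"t{next(temps)}"
        code.append(f"{result} = {left} {op} {right}")  # step 3: combine
        return result
    # Leaf: load the constant or variable into a fresh temporary.
    result = f"t{next(temps)}"
    code.append(f"{result} = {expr}")
    return result

code = []
translate(("*", ("+", 1, 3), 5), code, itertools.count(1))
print("\n".join(code))
```

Because the recursion finishes a subtree before its parent, the generated operations come out in depth-first order.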

# Linearizing the program



We traverse the AST in depth-first order:








- t1 = 1
- t2 = 3
- t3 = t1 + t2

Then we continue further up the tree:

- The result of the "+" operation is in t3
  - t4 = t3
  - t5 = 5
  - t6 = t4 \* t5
- The final result can be copied:
  - t = t6 ....





### **Nested expressions**



Combine the local parts which represent sub-trees:



### Statement sequences



Straightforward, since they are already sequenced:

```
T[ s1; s2; s3; ... ]
becomes
T[s1]
T[s2]
T[s3]
...
```

Simply translate one statement after the other and append their translations in order



### Assignments



#### Assignments require copying a value:

T[ v=e ]

#### requires us to

- obtain the result of e
- put the result into **v**

T[ v=e ] becomes

- t = T[e]
- v = t





# Array assignment

We need to calculate the index (address offset):

T[ v[e1]=e2 ]

requires us to

- compute the index expression e1
- compute the expression e2
- put the result into v[e1]

```
T[ v[e1]=e2 ]
becomes
t1 = T[e1]
t2 = T[e2]
v[t1] = t2
```
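The statement rules seen so far (sequences, assignments and array assignments) can be collected into one small translator. This is a sketch under assumptions: the tuple tags `seq`, `assign` and `store` are hypothetical names for the AST nodes, and expressions use the same `(op, e1, e2)` encoding as a simple recursive expression translator would.

```python
import itertools

def expr(e, code, temps):
    """Append TAC for an expression and return the name holding its result."""
    if isinstance(e, tuple):
        op, e1, e2 = e
        left = expr(e1, code, temps)
        right = expr(e2, code, temps)
        result = f"t{next(temps)}"
        code.append(f"{result} = {left} {op} {right}")
        return result
    result = f"t{next(temps)}"
    code.append(f"{result} = {e}")
    return result

def stmt(s, code, temps):
    """Append TAC for a statement to code."""
    kind = s[0]
    if kind == "seq":                 # ("seq", s1, s2, ...)
        for part in s[1:]:
            stmt(part, code, temps)
    elif kind == "assign":            # ("assign", v, e) for v = e
        _, v, e = s
        t = expr(e, code, temps)
        code.append(f"{v} = {t}")
    elif kind == "store":             # ("store", v, e1, e2) for v[e1] = e2
        _, v, e1, e2 = s
        t1 = expr(e1, code, temps)    # index first
        t2 = expr(e2, code, temps)    # then the value
        code.append(f"{v}[{t1}] = {t2}")
    else:
        raise ValueError(f"unknown statement: {s!r}")

code = []
program = ("seq", ("assign", "x", ("+", "y", 1)), ("store", "v", "x", 0))
stmt(program, code, itertools.count(1))
print("\n".join(code))
```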




# Conditionals

These require control flow:

T[ if(e) then s ]

becomes

```
t1 = T[e]
ifFALSE t1 goto Lend
T[s]
Lend:
```

(translation of next statement follows here)










#### **Conditionals + else**



Easy to derive:
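One plausible derivation adds a second label for the else branch and an unconditional jump that skips it. The sketch below emits the code for both forms of the conditional. The label names `Lelse` and `Lend`, the numbering parameter `n`, and the convention that branch bodies arrive as already translated TAC lines are assumptions made for this example.

```python
def translate_if(cond, then_code, else_code, n):
    """cond: name holding the value of the condition.
    then_code, else_code: TAC lines for the branches (else_code may be None).
    n: number that keeps this statement's labels unique."""
    if else_code is None:
        return [
            f"ifFALSE {cond} goto Lend{n}",
            *then_code,
            f"Lend{n}:",
        ]
    return [
        f"ifFALSE {cond} goto Lelse{n}",
        *then_code,
        f"jump Lend{n}",          # skip the else branch after the then branch
        f"Lelse{n}:",
        *else_code,
        f"Lend{n}:",
    ]

print("\n".join(translate_if("t1", ["x = 1"], ["x = 2"], 1)))
```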











# **Different kinds of loop**



For and repeat loops can be transformed into while loops:
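The transformation can be sketched as a rewrite on the AST. The node shapes used here (`for`, `repeat`, `while`, `seq`, `not`) are hypothetical, `repeat` is taken to mean "repeat body until cond", and `break` and `continue` are ignored, so this is an illustration rather than a complete lowering.

```python
def desugar(stmt):
    """Rewrite for and repeat loops into while loops, recursively."""
    kind = stmt[0]
    if kind == "for":                 # ("for", init, cond, step, body)
        _, init, cond, step, body = stmt
        loop_body = ("seq", desugar(body), step)
        return ("seq", init, ("while", cond, loop_body))
    if kind == "repeat":              # ("repeat", body, cond)
        _, body, cond = stmt
        body = desugar(body)
        # The body runs once, then again for as long as cond is false.
        return ("seq", body, ("while", ("not", cond), body))
    if kind == "while":               # ("while", cond, body)
        _, cond, body = stmt
        return ("while", cond, desugar(body))
    if kind == "seq":                 # ("seq", s1, s2, ...)
        return ("seq", *[desugar(s) for s in stmt[1:]])
    return stmt                       # any other statement is left alone

loop = ("for", ("assign", "i", 0), ("<", "i", "n"),
        ("assign", "i", ("+", "i", 1)), ("call", "f"))
print(desugar(loop))
```

After this rewrite, only the while loop needs its own translation into labels and jumps.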



### Switch



T[ switch(e) { case v1:s1; ... case vn:sn } ]
can become

```
t = T[e]
ifFALSE (t=v1) goto L1
T[s1]
L1: ifFALSE (t=v2) goto L2
T[s2]
L2: …
ifFALSE (t=vn) goto Lend
T[sn]
Lend:
```





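Alternatively, the compiler can provide a *jump table* that maps the case values v1, v2, ... vn to their respective labels Lv1, Lv2, ... Lvn, so that the matching case is reached with a single lookup instead of a chain of comparisons.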
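A jump table of this kind can be sketched as a mapping from case values to labels, built while the case bodies are laid out. The function names, the `jump Lend` emitted after each body (which assumes every case ends the switch) and the fallback to `Lend` for unmatched values are assumptions of this example.

```python
def translate_switch(cases):
    """cases: list of (value, TAC lines for the case body).
    Returns (jump table mapping each value to its label, labelled code)."""
    table, code = {}, []
    for index, (value, body) in enumerate(cases, start=1):
        label = f"Lv{index}"
        table[value] = label
        code.append(f"{label}:")
        code.extend(body)
        code.append("jump Lend")      # assumed: no fall-through between cases
    code.append("Lend:")
    return table, code

def dispatch(table, t):
    """Label that would be jumped to for the switch value t."""
    return table.get(t, "Lend")

table, code = translate_switch([(1, ["x = 10"]), (2, ["x = 20"])])
print(table)
print("\n".join(code))
print(dispatch(table, 2))
```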

# **Using labels**



#### Labels must be unique

• This can be handled by numbering the statements that generate them:

```
if (e1) then s1;
if (e2) then s2;
becomes
    t1 = T[e1]
    ifFALSE t1 goto LEnd1
    T[s1]
LEnd1:
    t2 = T[e2]
    ifFALSE t2 goto LEnd2
    T[s2]
LEnd2:
```



#### **Nested statements**



if (e1) then if (e2) then a=b requires a bit of care:





# Static Single-Assignment Form

- Static single-assignment form (SSA) is a naming discipline that many modern compilers use to encode information about both the flow of control and the flow of data values in the program
  - names correspond uniquely to specific definition points in the code
  - each name is defined by one operation
  - hence the name static single assignment
- SSA abstracts from processor registers
  - helps to name intermediate values during compilation
- Each use of a name as an argument in some operation encodes information about where the value originated
  - each textual name refers to a specific definition point



# Static Single-Assignment Form

- A program is in SSA form when it meets two constraints:
  (1) each definition has a distinct name; and
  (2) each use refers to a single definition
- Transforming an IR program to SSA form:
  - compiler inserts  $\phi$  functions at points where different control-flow paths merge
  - it then renames variables to make the single-assignment property hold
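For a single basic block no control-flow paths merge, so only the renaming step is needed. The sketch below shows that step in isolation. It assumes a simplified encoding of operations as `(target, expression string)` pairs, numbers definitions with a per-variable counter, and leaves names that are not defined in the block untouched.

```python
import re

def rename_block(code):
    """Give every definition in one basic block a fresh, numbered name and
    make every use refer to the latest definition seen so far."""
    version = {}
    renamed = []
    for target, expression in code:
        def latest(match):
            name = match.group()
            return f"{name}{version[name]}" if name in version else name
        # Rewrite the uses before registering the new definition.
        new_expression = re.sub(r"[A-Za-z_]\w*", latest, expression)
        version[target] = version.get(target, 0) + 1
        renamed.append((f"{target}{version[target]}", new_expression))
    return renamed

block = [("x", "1"), ("y", "x + 2"), ("x", "x + y")]
for target, expression in rename_block(block):
    print(f"{target} = {expression}")
```

After renaming, the second assignment to `x` defines a new name, so both constraints hold within the block.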







#### **Translation of code into SSA form**


#### **Unique Identifiers: Naive Approach**






#### **Problem with the Naive Approach**


#### Fixing the Variable Problem

#### *"Which k is the right one?" "It depends..."*

- Basic block B2 can receive values for k from B1 and B7
- Similar for variable j
- Fix: introduce a selector function φ (phi) that copies the correct value to a new intermediate variable depending on the control flow:


k2 = $\phi$(k1, k7)

j2 = $\phi$(j1, j7)
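The selector can be read as "pick the argument that belongs to the edge along which control arrived". The sketch below simulates this for k only. The contents of the blocks (k1 = 0 in B1, k7 = k2 + 1 in B7, and a loop exit test) are assumptions made for the example, since only the block names B1, B2 and B7 are given above.

```python
def phi(incoming, came_from):
    """incoming maps each predecessor block to the value flowing in from it."""
    return incoming[came_from]

def run_loop(limit):
    """Trace the values k2 takes while control loops through B2 and B7."""
    env = {"k1": 0}                 # B1: k1 = 0 (assumed)
    came_from = "B1"
    trace = []
    while True:
        # B2: k2 = phi(k1, k7)
        env["k2"] = phi({"B1": env.get("k1"), "B7": env.get("k7")}, came_from)
        trace.append(env["k2"])
        # B7: k7 = k2 + 1 (assumed; intermediate blocks omitted)
        env["k7"] = env["k2"] + 1
        came_from = "B7"
        if env["k7"] >= limit:      # assumed exit test
            return trace

print(run_loop(3))
```

On the first visit to B2 the value comes from B1, on every later visit from B7, which is the "it depends" of the question above.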




#### **Placement of Phi Functions**

The minimal number and placement of phi functions is more complex than in this simple example

- Generation of *minimal* SSA
- Use of *dominance frontiers* to determine the basic block defining the current value of a variable
- See [3] for details
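To give an idea of what computing dominance frontiers involves, the sketch below derives them for a small control-flow graph. It is a simplified illustration: dominators are found with a plain set-based fixed-point iteration rather than the faster algorithm of [3], the frontier loop uses the runner-based formulation described in [3], every block is assumed to be reachable from the entry, and the example graph is made up.

```python
def dominance_frontiers(succ, entry):
    """succ maps each block to its list of successors.
    Returns a mapping block -> set of blocks in its dominance frontier."""
    nodes = list(succ)
    preds = {n: [] for n in nodes}
    for n in nodes:
        for s in succ[n]:
            preds[s].append(n)

    # Dominator sets: dom[n] = {n} united with the intersection over all predecessors.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            new = set.intersection(*(dom[p] for p in preds[n])) | {n}
            if new != dom[n]:
                dom[n], changed = new, True

    # The immediate dominator is the strict dominator closest to n.
    idom = {entry: None}
    for n in nodes:
        if n != entry:
            idom[n] = next(d for d in dom[n]
                           if d != n and len(dom[d]) == len(dom[n]) - 1)

    # Walk up from each predecessor of a join point until its idom is reached.
    frontier = {n: set() for n in nodes}
    for b in nodes:
        if len(preds[b]) >= 2:
            for p in preds[b]:
                runner = p
                while runner != idom[b]:
                    frontier[runner].add(b)
                    runner = idom[runner]
    return frontier

# Hypothetical CFG: a diamond (B2 -> B3/B4 -> B5) inside a loop (B5 -> B2).
cfg = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B5"],
       "B4": ["B5"], "B5": ["B2", "B6"], "B6": []}
for name, df in dominance_frontiers(cfg, "B1").items():
    print(name, sorted(df))
```

A variable defined in some block is a candidate for a $\phi$ function in each block of that block's dominance frontier.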






# What's next?

• The procedure abstraction

#### References

- [1] Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman and F. K. Zadeck (1991). Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems. 13 (4): 451–490
- [2] Andrew W. Appel (1998). SSA is Functional Programming. ACM SIGPLAN Not. 33, 4 (April 1998), 17–20
- [3] Keith D. Cooper, Timothy J. Harvey and Ken Kennedy (2001). A Simple, Fast Dominance Algorithm. Softw. Pract. Exper. 2001; 4:1–10

