Code Generation & Optimization
The code produced by a compiler must be correct and of high quality. The source-to-target program
transformation should preserve semantics and make effective use of target machine resources. Because
generating optimal code is undecidable, heuristic techniques are used to generate good, though possibly
suboptimal, code.
Input to the code generator: The input to the code generator is the intermediate representation together
with the information in the symbol table. The intermediate representation may be postfix notation,
three-address code, a DAG or a tree.
Target Program: The output of the code generator may be absolute machine code (executable code),
relocatable machine code (object files for a linker), assembly language (facilitates debugging), or byte
code for interpreters (e.g. JVM).
Target Machine: Implementing code generation requires a thorough understanding of the target machine
architecture and its instruction set.
Instruction Selection: Efficient, low-cost instruction selection is important to obtain efficient code.
Choice of Evaluation Order: The order of computation affects the efficiency of the target code.
op source, destination
For more notes visit: https://collegenote.pythonanywhere.com/
The source and destination of an instruction are specified by combining registers and memory locations
with addressing modes. The addressing modes, together with their assembly forms and associated costs, are:
Addressing modes:
Instruction Costs
• The machine is assumed to be a simple, non-superscalar processor with fixed instruction costs
• Realistic machines have deep pipelines, I-cache, D-cache, etc.
• Define the cost of an instruction as
  cost = 1 + cost(source-mode) + cost(destination-mode)
Instruction          Operation                                                 Cost
MOV R0,R1            Store contents(R0) into register R1                       1
MOV R0,M             Store contents(R0) into memory location M                 2
MOV M,R0             Store contents(M) into register R0                        2
MOV 4(R0),M          Store contents(4+contents(R0)) into M                     3
MOV *4(R0),M         Store contents(contents(4+contents(R0))) into M           3
MOV #1,R0            Store 1 into R0                                           2
ADD 4(R0),*12(R1)    Add contents(4+contents(R0)) to the value at location     3
                     contents(12+contents(R1))
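The cost rule can be checked mechanically against the table. The sketch below is illustrative only: the per-mode added costs are not listed in the surviving text, so the values in MODE_COST are an assumption (the usual textbook values), chosen so that they reproduce the costs shown above. The names MODE_COST, mode_of and instruction_cost are hypothetical helpers, not part of any real assembler.

```python
import re

# Added cost per addressing mode. ASSUMPTION: these values are not given in
# the notes; they are the usual textbook values and match the table's costs.
MODE_COST = {
    "register": 0,           # R
    "indirect_register": 0,  # *R
    "absolute": 1,           # M
    "indexed": 1,            # c(R)
    "indirect_indexed": 1,   # *c(R)
    "literal": 1,            # #c
}


def mode_of(operand: str) -> str:
    """Classify an operand string by its addressing mode."""
    if operand.startswith("#"):
        return "literal"
    if re.fullmatch(r"\*\d+\(R\d+\)", operand):
        return "indirect_indexed"
    if re.fullmatch(r"\d+\(R\d+\)", operand):
        return "indexed"
    if re.fullmatch(r"\*R\d+", operand):
        return "indirect_register"
    if re.fullmatch(r"R\d+", operand):
        return "register"
    return "absolute"  # anything else is treated as a memory location name


def instruction_cost(instruction: str) -> int:
    """cost = 1 + cost(source-mode) + cost(destination-mode)."""
    _, operands = instruction.split(None, 1)
    source, destination = (part.strip() for part in operands.split(","))
    return 1 + MODE_COST[mode_of(source)] + MODE_COST[mode_of(destination)]


for text in ["MOV R0,R1", "MOV R0,M", "MOV 4(R0),M", "MOV #1,R0"]:
    print(text, instruction_cost(text))
```

Under these assumed mode costs, every row of the table is reproduced, including ADD 4(R0),*12(R1), where both operands add 1 to the base cost of 1.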
Instruction Selection
Instruction selection is important to obtain efficient code. Suppose we translate three-address code
A compiler obtains a block of storage from the operating system for the compiled program to
run in. This run-time storage might be subdivided to hold:
1. the generated target code,
2. data objects, and
3. a counterpart of the control stack to keep track of procedure activations.
• The size of the generated target code is fixed at compile time, so it can be placed in a
statically determined area at the low end of memory.
• The sizes of some data objects may also be known at compile time, so these too can be placed
in a statically determined area.
• The addresses of these data objects can be compiled into the target code.
• When a call occurs, execution of the current activation is interrupted, and information about
the status of the machine, such as the value of the program counter and the machine
registers, is saved on the stack until control returns from the call to the activation.
• Data objects whose lifetimes are contained in that of an activation can be allocated on the
stack along with other information associated with the activation.
• A separate area of run-time storage, called the heap, holds other information.
• The sizes of the stack and heap may change during execution.
• By convention, the stack grows down and the heap grows up.
[Figure: subdivision of run-time memory into Code, Static Data, Stack and Heap]
An activation record is a collection of the following fields:
Field                    Purpose
returned value           Value returned to the caller after execution
actual parameters        Used by the calling procedure to supply parameters to the called procedure
optional control link    Points to the activation record of the caller
optional access link     Refers to nonlocal data held in other activation records
saved machine state      State of the machine just before the procedure call
local data               Data that are local to an execution of the procedure
temporaries              Temporary values used in the evaluation of expressions
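As a rough illustration of these fields, an activation record can be modelled as a small data structure. This is a sketch, not a layout any particular compiler uses: the class name ActivationRecord, the extra procedure label, the helper call and the "return_address" key are all hypothetical, and a real record is a block of memory rather than a Python object.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ActivationRecord:
    procedure: str                                       # hypothetical label, for readability only
    returned_value: Any = None                           # value returned after execution
    actual_parameters: list = field(default_factory=list)
    control_link: Optional["ActivationRecord"] = None    # caller's record
    access_link: Optional["ActivationRecord"] = None     # record holding nonlocal data
    saved_machine_state: dict = field(default_factory=dict)
    local_data: dict = field(default_factory=dict)
    temporaries: dict = field(default_factory=dict)


def call(caller: ActivationRecord, procedure: str, args: list,
         return_address: int) -> ActivationRecord:
    """Create the callee's record, linking it back to the caller."""
    return ActivationRecord(
        procedure=procedure,
        actual_parameters=list(args),
        control_link=caller,
        saved_machine_state={"return_address": return_address},
    )


main = ActivationRecord("main")
p = call(main, "p", [1, 2], return_address=140)
print(p.control_link.procedure)  # the caller's label
```

Following control_link from any record leads back through the chain of callers, which is what makes the records behave like a control stack.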
Since run-time allocation and deallocation of activation records occur as part of the procedure
call and return sequences, the following three-address statements are in focus:
1. call
2. return
3. halt
4. action – a place holder for other statements
A call statement in the intermediate code is implemented by two target machine instructions:
MOV, which saves the return address, and GOTO, which transfers control to the called procedure.
• The code is constructed from procedures c and p above, placed at the arbitrary addresses 100
and 200.
Assume each ACTION occupies 20 bytes. The calling sequence (MOV and GOTO plus their 3 constants)
also occupies 20 bytes.
The target code for the input above will be:
/* Code for c */
100: ACTION1
120: MOV #140,364 /* saves return address 140 */
132: GOTO 200 /* calls p */
140: ACTION2
160: HLT
……
/* Code for p */
200: ACTION3
220: GOTO *364 /* returns to address saved in location 364 */
……
/* 300-363 hold activation record for c */
300: /* return address */
304: /* local data for c */
……
/* 364-451 hold activation record for p */
364: /* return address */
368: /* local data for p */
• The MOV instruction at address 120 saves the return address 140 in the machine-status field,
which is the first word in the activation record of p.
• The GOTO instruction at 132 transfers control to the first instruction of the target code of
the called procedure.
• When the GOTO instruction at address 220 is executed, *364 represents 140, so control
returns to 140.
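The call and return described above can be traced with a small simulation. This is a sketch under stated assumptions: the dictionary CODE and the function run are hypothetical, GOTO *364 is written as the made-up operation name GOTO_INDIRECT, and the entries at 132 (GOTO 200) and 140 (ACTION2) follow the description in the bullets above. Instruction sizes are taken from the address gaps in the listing (20 bytes per ACTION, 12 bytes for the MOV).

```python
# Target code keyed by address: (operation, operand).
CODE = {
    100: ("ACTION", "ACTION1"),
    120: ("MOV", (140, 364)),     # store literal 140 into location 364
    132: ("GOTO", 200),           # jump to the first instruction of p
    140: ("ACTION", "ACTION2"),
    160: ("HLT", None),
    200: ("ACTION", "ACTION3"),
    220: ("GOTO_INDIRECT", 364),  # GOTO *364: jump to the address stored at 364
}

# Sizes in bytes of the instructions that fall through to the next one.
SIZE = {"ACTION": 20, "MOV": 12}


def run(start=100):
    """Execute CODE from `start`; return the executed actions and data memory."""
    memory = {}
    trace = []
    pc = start
    while True:
        op, arg = CODE[pc]
        if op == "HLT":
            trace.append("HLT")
            return trace, memory
        if op == "ACTION":
            trace.append(arg)
            pc += SIZE[op]
        elif op == "MOV":
            value, address = arg
            memory[address] = value   # save the return address
            pc += SIZE[op]
        elif op == "GOTO":
            pc = arg                  # direct jump (the call)
        elif op == "GOTO_INDIRECT":
            pc = memory[arg]          # indirect jump (the return)


trace, memory = run()
print(trace)
print(memory)
```

The trace shows ACTION3 running between ACTION1 and ACTION2, and location 364 holding 140 when the return is taken.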