AT&CD Unit 5
UNIT-V
Code generation: Machine dependent code generation, object code forms, generic code
generation algorithm, Register allocation and assignment. Using DAG representation of Block.
Basic blocks play an important role in identifying variables that are used more than
once within a single basic block. If a variable is used more than once, the register
allocated to that variable need not be freed until the block finishes execution.
Loop Optimization
Most programs spend the bulk of their running time inside loops. It is therefore necessary
to optimize loops in order to save CPU cycles and memory. Loops can be optimized by the
following techniques:
Invariant code : A fragment of code that resides in the loop and computes the same value
at each iteration is called loop-invariant code. This code can be moved out of the loop
so that it is computed only once, rather than on every iteration.
Induction analysis : A variable is called an induction variable if its value is altered
within the loop by a loop-invariant value.
Strength reduction : Some expressions consume more CPU cycles, time, and memory than
others. These expressions should be replaced with cheaper expressions without
changing the result of the expression. For example, the multiplication (x * 2) costs more
CPU cycles than the shift (x << 1), yet yields the same result for an integer x.
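The three techniques above can be illustrated with a small before/after sketch. It is written in Python only for readability; a real compiler applies these transformations to intermediate code. The function names naive and optimized, and the variables scale and offset, are hypothetical and chosen for illustration.

```python
def naive(values, a, b):
    """Unoptimized loop: recomputes a * b and i * 4 on every iteration."""
    out = []
    for i in range(len(values)):
        scale = a * b      # loop-invariant: a and b never change inside the loop
        offset = i * 4     # i is an induction variable, so i * 4 grows by 4 per step
        out.append(values[i] * scale + offset)
    return out


def optimized(values, a, b):
    """Same result after invariant code motion and strength reduction."""
    out = []
    scale = a * b          # hoisted: computed once, before the loop
    offset = 0             # derived induction variable that replaces i * 4
    for i in range(len(values)):
        out.append(values[i] * scale + offset)
        offset += 4        # an addition replaces the multiplication
    return out
```

Both functions return the same list for the same arguments; only the amount of work done per iteration differs.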
Dead-code Elimination
Dead code consists of one or more code statements that are:
Either never executed (unreachable),
Or, if executed, produce output that is never used.
Thus, dead code plays no role in any program operation and can simply be
eliminated.
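As a minimal sketch of the second kind of dead code (values that are computed but never used), the function below scans a straight-line block backwards and keeps only assignments whose target is still needed. The tuple representation (target, operands) and the name eliminate_dead_code are assumptions made for this illustration, and the sketch assumes that statements have no side effects.

```python
def eliminate_dead_code(block, live_out):
    """Remove assignments whose target is not used afterwards.

    block    : list of (target, operands) tuples in execution order
    live_out : set of variable names that are needed after the block
    """
    live = set(live_out)
    kept = []
    for target, operands in reversed(block):
        if target in live:
            kept.append((target, operands))
            live.discard(target)    # this statement defines target ...
            live.update(operands)   # ... and needs its operands
        # otherwise the computed value is never used: drop the statement
    kept.reverse()
    return kept


block = [
    ("a", ("x", "y")),   # a := x * y, overwritten before any use
    ("b", ("x",)),       # b := x
    ("a", ("z",)),       # a := z
    ("c", ("a", "b")),   # c := a + b
]
result = eliminate_dead_code(block, live_out={"c"})
```

Here the first assignment to a is removed, because a is reassigned before its value is ever read.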
Partially dead code
There are some code statements whose computed values are used only under certain
circumstances, i.e., sometimes the values are used and sometimes they are not. Such code is
known as partially dead code.
The above control flow graph depicts a chunk of program where variable 'a' is assigned the
result of the expression 'x * y'. Let us assume that the value assigned to 'a' is never used inside the
loop. Immediately after control leaves the loop, 'a' is assigned the value of variable 'z',
which is used later in the program. We conclude that the value assigned to 'a' inside the loop is
never used anywhere, so that assignment is eligible to be eliminated.
Likewise, the picture above depicts a conditional statement that is always false, implying that
the code written in the true branch will never be executed and hence can be removed.
Partial Redundancy
A redundant expression is one that is recomputed, without any change in its operands,
after it has already been computed on every path leading to that point. A partially
redundant expression is recomputed, without any change in its operands, after it has
already been computed on some, but not all, of those paths.
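A small sketch may make the idea concrete. In the function before, the expression a + b is computed twice along the true path but only once along the false path, so its second computation is redundant on only one of the two paths. The function after shows one way to remove this: a computation is inserted on the path that lacked it, and the result is then reused. The function names and the temporary t are hypothetical.

```python
def before(cond, a, b):
    if cond:
        x = a + b      # a + b is computed on this path only
    else:
        x = 0
    y = a + b          # recomputation: redundant only when cond is true
    return x, y


def after(cond, a, b):
    if cond:
        t = a + b
        x = t
    else:
        x = 0
        t = a + b      # inserted so that a + b is available on both paths
    y = t              # the second computation is replaced by a reuse of t
    return x, y
```

After the transformation, a + b is evaluated exactly once on every path.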
In practice, however, when you compile a program you do not invoke the compiler and the
assembler separately. You simply give the program to the compiler, and the compiler gives you
directly executable code. This works because the compiler software is packaged together with
the assembler and the linker, so all of these modules are kept together in one tool. When
you call gcc, you are not just calling the compiler: you are calling the compiler,
then the assembler, then the linker. The loader is invoked later by the operating system, when
the program is run.
Once you call the compiler, the resulting object code is stored on the hard disk. This
object code contains various parts –
Header –
The header states which parts are present in the object code and points to
each of them. It records where the text segment starts, where the data segment starts,
and where the relocation information and symbol information are located.
It works like an index: just as the index page of a textbook lists the page number
of each topic, the header lists the
places at which each piece of information is present, so that other software can later
go directly to those segments.
Text segment –
It is nothing but the set of instructions.
Data segment –
The data segment contains whatever data you have used. For example, if you have used
some constants, they will be present in the data segment.
Relocation Information –
When writing a program, we generally use symbols (labels) to refer to locations. Let us
assume the program consists of instruction 1, instruction 2, instruction 3, instruction 4, ....
Now suppose some instruction says Goto L4 (even if you do not write a Goto statement in the
high-level language, the output of the compiler will contain one). When this code is converted
into object code, Goto L4 is replaced by Goto 4. Goto 4 works for the label L4 only
as long as the program is loaded starting at address 0. In most
cases, however, the initial part of the RAM is dedicated to the operating system. Even if
it is not, some other process may
already be occupying address 0. So when the program is loaded into
main memory, it might be placed
anywhere. If, say, 1000 is the new starting address, then all the addresses have to be
adjusted (Goto 4 becomes Goto 1004); this is known as relocation. The relocation information
records which addresses need this adjustment.
The original address is known as the relocatable address, and the final address, which we get
after loading the program into main memory, is known as the absolute address.
Symbol table –
It contains every symbol used in your program. For example, given the declaration int a, b, c,
the symbols are a, b, and c. It shows which variables your program contains.
Debugging information –
It helps a debugger track how the value of a variable changes during execution.
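The relocation step described under Relocation Information can be sketched as follows. The representation is an assumption made for illustration: the text segment is modelled as a list of [opcode, operand] pairs whose addresses assume loading at address 0, and the relocation information is modelled as the list of instruction indices whose operand is an address. The name relocate is hypothetical.

```python
def relocate(code, relocation_entries, load_address):
    """Return a copy of code with every listed address operand shifted.

    code               : list of [opcode, operand] pairs, addresses relative to 0
    relocation_entries : indices of instructions whose operand is an address
    load_address       : address at which the program was actually loaded
    """
    patched = [list(instr) for instr in code]
    for index in relocation_entries:
        patched[index][1] += load_address   # relocatable -> absolute address
    return patched


text_segment = [["LOADI", 7], ["GOTO", 4]]   # 7 is a constant, 4 is an address
loaded = relocate(text_segment, relocation_entries=[1], load_address=1000)
```

Only the operand of the GOTO instruction changes (4 becomes 1004); the constant 7 is left alone because it is not listed in the relocation information.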
Generic code generation algorithm
The algorithm takes a sequence of three-address statements as input. For each three-address
statement of the form x := y op z, it performs the following actions:
1. Invoke a function getreg to determine the location L where the result of the computation
y op z should be stored.
2. Consult the address descriptor for y to determine y', the current location of y. If the
value of y is currently in both memory and a register, prefer the register for y'. If the
value of y is not already in L, generate the instruction MOV y', L to place a copy of y in L.
3. Generate the instruction OP z', L, where z' is the current location of z. If z
is in both a register and memory, prefer the register. Update the address descriptor of x
to indicate that x is in location L. If L is a register, update its descriptor to indicate
that it contains the value of x, and remove x from all other register descriptors.
4. If the current values of y and/or z have no next uses, are not live on exit from the block,
and are in registers, alter the register descriptor to indicate that, after execution of
x := y op z, those registers will no longer contain y and/or z.
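The four steps above can be sketched as a small Python function. This is a simplified illustration under stated assumptions, not a definitive implementation: statements are tuples (x, y, op, z), op is already an instruction mnemonic such as ADD or SUB, a memory location is named after its variable, getreg always returns a register (spilling the first register if none is free), and live-on-exit values are stored back at the end of the block. All helper names are hypothetical.

```python
def generate(statements, registers, live_on_exit):
    """Emit two-address pseudo-assembly for statements of the form (x, y, op, z)."""
    reg_desc = {r: set() for r in registers}   # register -> variables it holds
    addr_desc = {}                             # variable -> set of its locations
    code = []

    def locations(v):
        # A variable not seen before lives only in its own memory cell.
        return addr_desc.setdefault(v, {v})

    def best_location(v):
        # Prefer a register copy over the memory copy.
        regs = sorted(loc for loc in locations(v) if loc in reg_desc)
        return regs[0] if regs else v

    def used_later(v, index):
        # True if v is read after statement index before being overwritten,
        # or if it reaches the end of the block and is live on exit.
        for x, y, _, z in statements[index + 1:]:
            if v in (y, z):
                return True
            if v == x:
                return False
        return v in live_on_exit

    def free(reg):
        # Store variables whose only copy is in reg, then empty the register.
        for v in sorted(reg_desc[reg]):
            if locations(v) == {reg}:
                code.append(f"MOV {reg}, {v}")
                locations(v).add(v)
            locations(v).discard(reg)
        reg_desc[reg].clear()

    def getreg(y, index):
        loc = best_location(y)
        if loc in reg_desc and reg_desc[loc] == {y} and not used_later(y, index):
            return loc                         # reuse the register holding y
        for r in registers:
            if not reg_desc[r]:
                return r                       # take an empty register
        free(registers[0])                     # otherwise spill a register
        return registers[0]

    for index, (x, y, op, z) in enumerate(statements):
        L = getreg(y, index)                                   # step 1
        if L not in locations(y):                              # step 2
            code.append(f"MOV {best_location(y)}, {L}")
        code.append(f"{op} {best_location(z)}, {L}")           # step 3
        for v in reg_desc[L]:
            addr_desc[v].discard(L)
        for r in registers:
            reg_desc[r].discard(x)
        reg_desc[L] = {x}
        addr_desc[x] = {L}
        for v in (y, z):                                       # step 4
            if v != x and not used_later(v, index):
                for r in registers:
                    if v in reg_desc[r]:
                        reg_desc[r].discard(v)
                        locations(v).discard(r)

    for v in sorted(live_on_exit):             # store live values at block end
        if v not in locations(v):
            code.append(f"MOV {best_location(v)}, {v}")
    return code


# d := (a - b) + (a - c) + (a - c), with temporaries t, u, v
block = [("t", "a", "SUB", "b"), ("u", "a", "SUB", "c"),
         ("v", "t", "ADD", "u"), ("d", "v", "ADD", "u")]
listing = generate(block, ["R0", "R1"], live_on_exit={"d"})
```

With two registers, the listing produced for this block is MOV a, R0; SUB b, R0; MOV a, R1; SUB c, R1; ADD R1, R0; ADD R1, R0; MOV R0, d.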
Nodes = program variables
Edges = connect variables that interfere with each other
Register allocation = graph coloring
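The correspondence above can be sketched with a simple greedy coloring, in which each register plays the role of a color. This is a heuristic illustration, not an optimal allocator: it colors variables in order of decreasing degree and marks a variable as spilled (None) when no register is left. The function name and the edge-list representation are assumptions made for this example.

```python
def color_interference_graph(edges, registers):
    """Greedy coloring: give each variable a register unused by its neighbours.

    Returns a dict variable -> register, with None for variables to be spilled.
    """
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    assignment = {}
    # Color the most constrained variables first (highest degree, ties by name).
    for var in sorted(neighbours, key=lambda n: (-len(neighbours[n]), n)):
        taken = {assignment[n] for n in neighbours[var] if assignment.get(n)}
        free = [r for r in registers if r not in taken]
        assignment[var] = free[0] if free else None
    return assignment


# a, b, c interfere pairwise; d interferes only with c
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
three = color_interference_graph(edges, ["R0", "R1", "R2"])
two = color_interference_graph(edges, ["R0", "R1"])
```

With three registers every variable receives one, and d can share a register with a because the two do not interfere. With only two registers, one of the three mutually interfering variables has to be spilled.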
5. S5 := S2 * S4
6. S6 := prod + S5
7. prod := S6
8. S7 := i + 1
9. i := S7
10. if i <= 20 goto (1)
Stages in DAG Construction: