LPUNIT1 ppt1
• Course Outcomes:
• On successful completion of the course, students will be able to:
1. Explain the role of the various phases of compilation, with an understanding of
the types of grammars and the design complexity of a compiler.
2. Design various types of parsers and perform operations like string
parsing and error handling.
3. Demonstrate syntax-directed translation schemes and their
implementation for different programming language constructs.
4. Implement different code optimization and code generation
techniques using standard data structures.
M.B.Chandak, CSE-RCOEM, NAGPUR
Course: Gradation
• Three Tests: T1, T2, T3 [students can attempt all three]
• Assignment types:
• 02 Marks : Class Assignment
• 02 Marks : Programming Assignment
• 02 Marks : Quiz/Programming Assignment
• 01 Mark : Class participation
• 03 Marks : Attendance
• Total : 30+10 = 40 Marks
• End-semester question paper: generally two questions with choice and four
questions with internal choice.
UNIT – I: Introduction [CO1]
Outcomes:
1. To understand the design complexity of a language processor.
2. To understand the functions of the various phases of compilation.
3. To understand allied concepts like cross compilation, bootstrapping, etc.
[Figure: language processing system. Source program → Preprocessor → Compiler → Target assembly program → Assembler → Linker. The figure also carries the label "Error message".]
[Figure: structure of a compiler, divided into Analysis and Synthesis.
Phases: Scanner, Parser, Semantic Routines, Intermediate Code Generator, Code Optimizer, Code Generator.
Data passed between phases: Source Program (Character Stream), Tokens, Syntactic Structure, Intermediate Representation.
Shared components: Error recovery; Symbol and Attribute Tables.]
Scanner (Lexical Analyzer, LA)
• The scanner begins the analysis of the source program by reading the input
character by character and grouping the characters into individual words and symbols (tokens).
• Related topics and tools:
• RE (Regular Expression)
• NFA (Non-deterministic Finite Automata)
• DFA (Deterministic Finite Automata)
• LEX
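The grouping of characters into tokens can be sketched in a few lines of Python. This is only an illustration: the token classes (NUMBER, IDENT, OP, SKIP) and the function name scan are assumptions chosen for a tiny expression language and do not come from the slides, and Python's re module stands in for a generated LEX scanner.

```python
import re

# Hypothetical token classes for a tiny expression language (not from the slides).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(source):
    """Group the character stream into (kind, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling scan("x = 10 + y1") yields one token each for x, =, 10, + and y1; a character that fits no token class is reported as an error instead of being skipped silently.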
Parser
• Given a formal syntax specification (typically a context-free grammar [CFG]),
the parser reads tokens and groups them into units as specified by the productions of the CFG being used.
• As syntactic structure is recognized, the parser either calls the
corresponding semantic routines directly or builds a syntax tree.
• Related topics and tools:
• CFG (Context-Free Grammar)
• BNF (Backus-Naur Form)
• GAA (Grammar Analysis Algorithms)
• LL, LR, SLR, LALR Parsers
• YACC
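As a rough illustration of how a parser groups tokens according to CFG productions and builds a syntax tree, here is a hand-written recursive-descent (LL-style) parser in Python. The grammar, the (kind, lexeme) token format and the nested-tuple tree shape are all assumptions made for this sketch; a YACC-generated LALR parser would work differently internally.

```python
def parse(tokens):
    """Parse a list of (kind, lexeme) tokens into a nested-tuple syntax tree.

    Assumed grammar (hypothetical, chosen for illustration):
        expr   -> term (('+' | '-') term)*
        term   -> factor (('*' | '/') factor)*
        factor -> NUMBER | IDENT | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        kind, lexeme = advance()
        if kind in ("NUMBER", "IDENT"):
            return (kind, lexeme)
        if lexeme == "(":
            node = expr()
            if advance()[1] != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {lexeme!r}")

    def term():
        node = factor()
        while peek()[1] in ("*", "/"):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek()[1] in ("+", "-"):
            op = advance()[1]
            node = (op, node, term())
        return node

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree
```

Each function corresponds to one nonterminal, so the call structure mirrors the productions. Because term is called from inside expr, multiplication binds tighter than addition in the resulting tree.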
Semantic Routines
• Perform two functions:
1. Check the static semantics of each construct
2. Do the actual translation
• The heart of a compiler
• Related topics:
• Syntax Directed Translation
• Semantic Processing Techniques
• IR (Intermediate Representation)
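The two functions of the semantic routines can be sketched together in one tree walk. The example below is an assumption-laden illustration, not the method from the slides: the syntax tree is a nested tuple, the symbol table is reduced to a set of declared names, the only static check is "declared before use", and the IR is a list of three-address tuples (dest, op, left, right).

```python
def translate(tree, symbols):
    """Check static semantics and translate a syntax tree to three-address IR.

    `tree` is a nested tuple such as ("+", ("IDENT", "a"), ("NUMBER", "2"));
    `symbols` is a set of declared names (a stand-in for a symbol table).
    Returns (code, place), where place names the value of the whole tree.
    """
    code = []
    temp_count = 0

    def visit(node):
        nonlocal temp_count
        if node[0] == "NUMBER":
            return node[1]
        if node[0] == "IDENT":
            # Static semantic check: every name must be declared before use.
            if node[1] not in symbols:
                raise NameError(f"undeclared identifier {node[1]!r}")
            return node[1]
        op, left, right = node
        left_place = visit(left)
        right_place = visit(right)
        temp_count += 1
        temp = f"t{temp_count}"
        # Translation: emit one IR instruction per operator node.
        code.append((temp, op, left_place, right_place))
        return temp

    place = visit(tree)
    return code, place
```

For the tree of a + 2 * b with a and b declared, the sketch emits t1 = 2 * b followed by t2 = a + t1 and reports t2 as the place holding the result.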
Optimizer
• The IR code generated by the semantic routines is analyzed and
transformed into functionally equivalent but improved IR code.
• This phase can be very complex and slow.
• Peephole optimization
• Loop optimization, register allocation, code scheduling
• Related topics:
• Local Optimization
• Register and Temporary Management
• Peephole Optimization
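To make "functionally equivalent but improved IR code" concrete, the sketch below applies two simple local optimizations, constant folding and algebraic simplification, to three-address instructions one at a time. The instruction format (dest, op, left, right) and the function name optimize are assumptions for illustration; operands are assumed to be strings holding either names or non-negative integer literals.

```python
import operator

# Operators that can be evaluated at compile time in this sketch.
FOLD = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def optimize(code):
    """Fold constants and simplify identities in three-address IR.

    Each instruction is (dest, op, left, right) with string operands;
    a plain copy is written as (dest, "=", source, None).
    """
    improved = []
    for dest, op, left, right in code:
        if op in FOLD and left.isdecimal() and right.isdecimal():
            # Constant folding: compute the result at compile time.
            value = FOLD[op](int(left), int(right))
            improved.append((dest, "=", str(value), None))
        elif (op == "+" and right == "0") or (op == "*" and right == "1"):
            # Algebraic identity: x + 0 and x * 1 are just x.
            improved.append((dest, "=", left, None))
        else:
            improved.append((dest, op, left, right))
    return improved
```

Every rewritten instruction computes the same value as the original, so the program's behaviour is unchanged while some run-time arithmetic disappears. Real optimizers add data-flow analysis on top of such local rules, which is where much of the complexity and compile time goes.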
Code Generator
[Figure: two block diagrams of the compilation phases.
Diagram 1 labels: Tokens → Parser [Syntax Analyzer] → Parse tree → Semantic Process [Semantic Analyzer]; Code Optimizer → Optimized Intermediate Code → Code Generation → Target machine code.
Diagram 2 labels, analysis side: Parser (syntax analysis), Parse tree, Semantic Analysis, Abstract syntax tree or other intermediate form.
Diagram 2 labels, back end (synthesis) side: Code Optimization, Modified intermediate form, Assembly or object code, Machine Specific Code Generation, Modified assembly or object code.]
Block diagram of compilation phases
Regular Expression
• A regular expression (RE) is a tool for expressing a language in the form of an expression.
• An RE uses primitive operators to express the language.
• The three operators used are: Union, Concatenation, Kleene Star.
• Examples:
(0 + 10*): L = { 0, 1, 10, 100, 1000, 10000, … }
(a + b)*abb: the set of strings of a's and b's ending with the string abb,
so L = { abb, aabb, babb, aaabb, ababb, … }
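The two example languages can be checked with Python's re module, as a small sketch. Note the notation difference: the slides write union as "+", while Python writes it as "|" (in Python, "+" means "one or more"). The names ZERO_OR_ONE_THEN_ZEROS, ENDS_WITH_ABB and in_language are made up for this illustration.

```python
import re

# The slides write union as "+"; Python's re module writes it as "|".
ZERO_OR_ONE_THEN_ZEROS = re.compile(r"0|10*")  # slide notation: (0 + 10*)
ENDS_WITH_ABB = re.compile(r"(a|b)*abb")       # slide notation: (a + b)*abb


def in_language(pattern, string):
    """Return True when the whole string belongs to the pattern's language."""
    return pattern.fullmatch(string) is not None
```

fullmatch is used rather than search or match so that the entire string has to be derived from the expression, which is what membership in L means.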