Lecture 1 Introduction


Lecture 1

Compilers
What
• Bringing things together – Compile
• Hope we shall make it.
Why compilers
• A 1% performance improvement can mean billions of dollars in
a data center.
• Code optimization at programming, deployment,
and configuration time requires
– Language skills
– Compilers
– Runtime implementation
• Some of the key engineers behind Google are compiler engineers
– Jeff Dean, Sanjay Ghemawat, Urs Hoelzle
• Jobs, better programming, becoming a better computer scientist
Why compilers

• How do we talk to different hardware?
• How are these various languages executed?
Where compiler engineers work
Companies developing compilers
• Intel – icc
• Mozilla – JaegerMonkey, IonMonkey, TraceMonkey
• Apple – LLVM
• Google – ART, V8
• Codeplay
• Coverity
• PathScale, etc.
Hardware companies need compilation technology
Motivation
To understand:
• Language processing: high-level language (HLL) to low-level
language (LLL) translation
• Appreciation of programming language features
• Implementation challenges
• The intersection between H/W and system S/W
Also for anyone interested in writing a new compiler
“The quality of the kitchen influences the quality of
the cooking too”
Other reasons for learning compiler design
• It is considered a topic that you should know in order
to be “well-cultured” in computer science.
• A good craftsman should know his tools, and
compilers are important tools for programmers and
computer scientists.
• The techniques used for constructing a compiler are
useful for other purposes as well.
• There is a good chance that a programmer or
computer scientist will need to write a compiler or
interpreter for a domain-specific language.
Compiler
• A given source language is either compiled or
interpreted for execution.
• A compiler is a program that translates a source
program (in an HLL such as C or Java) into target code:
relocatable machine code or assembly code.
– The generated machine code can later be executed
many times, against different data each time.
– The generated code is not portable to other systems.
– A compiler involves three languages: source, destination
(target) and implementation.
Interpreter
• An interpreter reads a source program written in an
HLL together with the data for this program, and it
runs the program against the data to produce the
program's output.
– One example is the Unix shell interpreter, which runs
operating system commands interactively.
– The source program is interpreted every time it is
executed (less efficient).
– Interpreted programs are portable since they are not
machine dependent.
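The difference between "interpreted every time" and "translated once, executed many times" can be sketched in a few lines. This is only an illustration under stated assumptions: the tuple-based expression tree, the names `interpret` and `compile_expr`, and the use of Python closures as the "target code" are all hypothetical choices made for this sketch, not how a real compiler emits machine code.

```python
import operator

# Assumed toy representation: a variable is a str, a constant is a number,
# and an operation is a tuple (op, left, right).
OPS = {"+": operator.add, "*": operator.mul}


def interpret(node, env):
    """Interpreter: walks the source tree again on every execution."""
    if isinstance(node, str):
        return env[node]
    if isinstance(node, (int, float)):
        return node
    op, left, right = node
    return OPS[op](interpret(left, env), interpret(right, env))


def compile_expr(node):
    """Compiler-like: translates the tree once into a reusable closure."""
    if isinstance(node, str):
        return lambda env: env[node]
    if isinstance(node, (int, float)):
        return lambda env: node
    op, left, right = node
    fn, lhs, rhs = OPS[op], compile_expr(left), compile_expr(right)
    return lambda env: fn(lhs(env), rhs(env))


expr = ("+", "B", ("*", "C", "D"))      # B + C * D

# Interpreting: the tree is traversed on each call.
first = interpret(expr, {"B": 1, "C": 2, "D": 3})

# Compiling: translate once, then run against different data each time.
run = compile_expr(expr)
second = run({"B": 1, "C": 2, "D": 3})
third = run({"B": 10, "C": 2, "D": 5})
```

Both paths compute the same values; the compiled form simply pays the translation cost once instead of on every run.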
Qualities/Requirements of a Compiler
1. Correctness: of both the compiler and its output
2. Efficiency: of both the compiler and its output
3. Interoperability: output code can run with library
code not produced by the compiler (interface)
4. Portability: the compiler must be retargetable from one
target language to another
5. Usability: it must print good diagnostics and error
messages
6. It must have consistent and predictable optimization.
Examples of compilers
• A Java compiler for the Apple Macintosh
• A COBOL compiler for the SUN
• A C++ compiler for the Apple Macintosh
• Example input, in C++: A = B + C * D;
• The output corresponding to this input might look
something like this:
    LOD R1,C // Load the value of C into reg 1
    MUL R1,D // Multiply reg 1 by the value of D
    STO R1,TEMP1 // Store the result in TEMP1
    LOD R1,B // Load the value of B into reg 1
    ADD R1,TEMP1 // Add the value of TEMP1 to reg 1
    STO R1,TEMP2 // Store the result in TEMP2
    MOV A,TEMP2 // Move TEMP2 to A, the final result
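A listing like the one above can be produced mechanically from an expression tree. The following is a minimal sketch, assuming a hypothetical tuple-based tree and a hypothetical `generate` function; it uses a single register and a fresh temporary per operation, which mirrors the slide's output but is far simpler than a real code generator.

```python
from itertools import count

# Mnemonics taken from the listing above; the mapping itself is an assumption.
MNEMONIC = {"+": "ADD", "*": "MUL"}


def generate(target, expr):
    """Emit single-register code for `target = expr`.

    `expr` is a variable name (str) or a tuple (op, left, right).
    """
    code = []
    temps = count(1)

    def gen(node):
        # A plain variable needs no code; its name is its location.
        if isinstance(node, str):
            return node
        op, left, right = node
        lhs = gen(left)       # operands are generated first ...
        rhs = gen(right)
        temp = f"TEMP{next(temps)}"
        code.append(f"LOD R1,{lhs}")              # ... then combined in R1
        code.append(f"{MNEMONIC[op]} R1,{rhs}")
        code.append(f"STO R1,{temp}")
        return temp

    result = gen(expr)
    code.append(f"MOV {target},{result}")
    return code


# A = B + C * D, with * binding tighter than +
listing = generate("A", ("+", "B", ("*", "C", "D")))
print("\n".join(listing))
```

Because the multiplication is nested inside the addition, its instructions are emitted first, which is why the listing starts by loading C.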
Portable Compiler
Structure of a Compiler

Source → Frontend → IR → Optimizer → IR → Backend → Machine code

IR: Intermediate representation

• Frontend
– Dependent on source language
– Lexical analysis
– Parsing
– Semantic analysis (e.g., type checking)
Structure cont’

Source → Frontend → IR → Optimizer → IR → Backend → Machine code

• Optimizer
– Independent of the target processor
– Aims to make the generated code more efficient
– Can be very computationally intensive
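One classic example of a target-independent optimization is constant folding: evaluating operations on constants at compile time. The sketch below assumes a hypothetical IR made of nested tuples `(op, left, right)` with variable names as strings and constants as integers; real optimizers work on richer IRs and apply many such passes.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


def fold(node):
    """Return an equivalent tree with constant subexpressions evaluated."""
    if not isinstance(node, tuple):
        return node                      # variable or constant: nothing to do
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)      # both operands known at compile time
    return (op, left, right)


before = ("+", "B", ("*", 2, 3))         # B + 2 * 3
after = fold(before)                     # B + 6
```

The pass never looks at registers or instructions, which is what makes it independent of the target processor.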
Structure cont’

Source → Frontend → IR → Optimizer → IR → Backend → Machine code

• Backend
– Dependent on target processor
– Code selection
– Code scheduling
– Register allocation
– Processor (target) dependent optimization
Thanks to Andy D. Pimentel
Thanks to Frank Pfenning
Front end
• Scanning: a scanner groups input characters into tokens;
• Parsing: a parser recognizes sequences of tokens
according to some grammar and generates Abstract
Syntax Trees (ASTs);
• Semantic analysis: performs type checking (i.e., checking
whether the variables, functions, etc. in the source
program are used consistently with their definitions and
with the language semantics) and translates ASTs into IRs;
• Optimization: optimizes IRs.
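The first two front-end steps can be sketched for a tiny assignment language. Everything here is an assumption made for illustration: the token kinds `ID`, `NUM` and `OP`, the grammar (`ID = expr ;` with `+` and `*` only), and the tuple-based AST. Semantic analysis and IR generation are not covered.

```python
import re

# Scanner: one regular expression with a named group per token kind.
TOKEN_RE = re.compile(r"\s*(?:(?P<ID>[A-Za-z_]\w*)|(?P<NUM>\d+)|(?P<OP>[=+*;]))")


def scan(text):
    """Group input characters into (kind, text) tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


def parse(tokens):
    """Recursive-descent parser for:  ID '=' expr ';'
       expr -> term ('+' term)*     term -> factor ('*' factor)*
       factor -> ID | NUM
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def expect(kind, value=None):
        nonlocal pos
        tok = peek()
        if tok[0] != kind or (value is not None and tok[1] != value):
            raise SyntaxError(f"expected {value or kind}, got {tok[1]!r}")
        pos += 1
        return tok[1]

    def factor():
        if peek()[0] == "NUM":
            return int(expect("NUM"))
        return expect("ID")

    def term():
        node = factor()
        while peek() == ("OP", "*"):
            expect("OP", "*")
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == ("OP", "+"):
            expect("OP", "+")
            node = ("+", node, term())
        return node

    target = expect("ID")
    expect("OP", "=")
    tree = expr()
    expect("OP", ";")
    return ("=", target, tree)


tokens = scan("A = B + C * D;")
ast = parse(tokens)
```

Splitting `expr` and `term` into separate functions is what gives `*` higher precedence than `+` in the resulting tree.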
Back end
• Instruction selection: maps IRs into assembly
code;
• Code optimization: optimizes the assembly
code using control-flow and data-flow
analyses, register allocation, etc.;
• Code emission: generates machine code from
assembly code.
Object file
• An object file is not executable by itself, since it may refer to
external symbols (such as system calls). A linker and the OS loader
are needed before the code can be executed:
• Linking: a linker takes several object files and libraries as input and
produces one executable object file. It retrieves from the input
files the code of all the referenced functions/procedures, puts it
together in the executable object file, and resolves all
external references to real addresses. The libraries include the
operating system libraries, the language-specific libraries, and,
possibly, user-created libraries.
• Loading: a loader loads an executable object file into memory,
initializes the registers, heap, data, etc., and starts the execution of
the program.
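The symbol-resolution part of linking can be sketched as follows. This is a deliberately simplified model: the dictionary layout of an "object file" (`name`, `defines`, `refs`), the function name `link`, and the flat address assignment are all hypothetical, and real object formats and linkers handle sections, relocations and much more.

```python
def link(objects):
    """Lay out all defined symbols and resolve every external reference.

    Each object is a dict with:
      'name'    - file name,
      'defines' - {symbol: size in bytes} defined in this file,
      'refs'    - symbols this file uses but may not define.
    Returns (symbol_table, resolved_refs_per_object).
    """
    symbol_table = {}
    address = 0
    # Pass 1: place every defined symbol at the next free address.
    for obj in objects:
        for symbol, size in obj["defines"].items():
            if symbol in symbol_table:
                raise ValueError(f"duplicate symbol: {symbol}")
            symbol_table[symbol] = address
            address += size

    # Pass 2: replace each external reference by a real address.
    resolved = {}
    for obj in objects:
        for symbol in obj["refs"]:
            if symbol not in symbol_table:
                raise ValueError(
                    f"undefined symbol: {symbol} (referenced by {obj['name']})"
                )
        resolved[obj["name"]] = {s: symbol_table[s] for s in obj["refs"]}
    return symbol_table, resolved


main_o = {"name": "main.o", "defines": {"main": 16}, "refs": ["printf"]}
libc = {"name": "libc", "defines": {"printf": 32}, "refs": []}
symbols, resolved = link([main_o, libc])
```

Two passes are used because a file may reference a symbol that is only defined in a later input file, as `main.o` does with `printf` here.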
Interesting Questions on Languages
• Why many programming languages?
• Why new programming languages?
• What is a good programming language?

Thanks to Alex Aiken


Why many programming languages?
• Different/conflicting application domains
– Scientific computing: floating-point (FP) management, array
management and parallelism (Fortran)
– Business applications: reporting, persistence, data
analysis (SQL)
– Systems programming: control of resources and real-time
constraints (C, C++)
• It is hard to design a single language for all of these domains
Why new programming languages?
• Programmer (user) training is the dominant cost
of a programming language
– Widely used languages are slow to change
– It is easy to start a new language (derived from an old one)
when the productivity gain is greater than the training cost
– Languages are adopted to fill a gap not addressed
What is a good programming language?
• There is no universally accepted metric for
language design
– A good language is one people use
• Which people?
• For what?
