Constraint Based Analysis: Seminar
Toni Suter
Contents
1 Introduction
1.1 Types of static program analysis
1.1.1 Intraprocedural analysis
1.1.2 Interprocedural analysis
3 Conclusion
1 Introduction
Constraint Based Analysis is a form of static program analysis that uses
systems of set constraints to describe certain aspects of a program. It
can be used for many different kinds of analyses (e.g., to perform type
checking in a statically typed programming language or to analyse the
control flow in a dynamic programming language where functions are
first-class objects). This paper gives an overview of the various kinds
of static program analysis and explores control flow analysis with the use
of constraints in more detail.
This optimization improves the speed of the program by storing the result
of the expression (c + d) in a temporary variable, thereby removing the
need to calculate the result twice.
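The rewrite described above can be illustrated with a minimal sketch. The
function names and the multipliers 2 and 3 are assumptions made for this
example; only the shared subexpression (c + d) is taken from the text.

```javascript
// Before the optimization: (c + d) is evaluated twice.
function before(c, d) {
    const a = (c + d) * 2;
    const b = (c + d) * 3;
    return [a, b];
}

// After the optimization: (c + d) is evaluated once and the result
// is reused through a temporary variable.
function after(c, d) {
    const tmp = c + d;
    const a = tmp * 2;
    const b = tmp * 3;
    return [a, b];
}
```

Both versions compute the same values. The second one trades a repeated
addition for one additional local variable.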
Live Variable Analysis, Reaching Definitions Analysis and Very Busy
Expressions Analysis are other examples of intraprocedural analyses [NNH99].
All of these techniques depend on the fact that it is relatively easy to
determine the control flow within a procedure. For example, if the
control flow reaches the condition of an if-else statement, there are exactly
two positions where it can continue: either in the if-block or in the else-block.
Since intraprocedural analysis does not analyse the interactions between
different functions, the control flow is more or less predictable at compile
time.
As in the JavaScript example in Listing 1.4, there is not just one
function body that can be executed as a result of calling p.description().
Depending on whether the argument that is passed to the function print()
is a Person object or an instance of a Person subclass, the results may
vary. The function body that is actually executed depends on the
dynamic type of the parameter p.
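The situation described above can be sketched in JavaScript as follows.
This is an illustrative rendering, not the original listing: the subclass
Student, the name field and the returned strings are assumptions, and
print() returns the description instead of printing it so that the result
can be inspected.

```javascript
class Person {
    constructor(name) {
        this.name = name;
    }
    description() {
        return "Person: " + this.name;
    }
}

// Hypothetical subclass that overrides description().
class Student extends Person {
    description() {
        return "Student: " + this.name;
    }
}

// Which description() body runs is decided at run time,
// based on the dynamic type of p.
function print(p) {
    return p.description();
}
```

The call site p.description() is the same in both cases, but
print(new Person("Ann")) and print(new Student("Bob")) execute
different function bodies.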
2 Control flow analysis
This section describes how to analyse the control flow of a program using
constraints. As described in section 1.1, control flow analysis is
particularly interesting in programming languages that support some form of
dynamic dispatch. In this paper, simple JavaScript examples are used to
show the process of Constraint Based Control Flow Analysis. Some of
the code is derived from the examples in the book Principles of Program
Analysis [NNH99], where the untyped lambda calculus [Chu32] with a
few extensions is used.
Listing 2.1: JavaScript example (the labels, written as ^n, are not part of JavaScript)
(function apply_two(f) {
    (return f^1(2);)^2
})^3

(function square(x) {
    (return x^4 * x^5;)^6
})^7

(function triple(y) {
    (return y^8 + 3;)^9
})^10

(apply_two(square^11);)^12
(apply_two(triple^13);)^14
Sections 2.3 and 2.4 show how one may get to such a result by
generating and resolving a set of constraints. Since this is a pure control
flow analysis, the result contains only function values. This is also the
reason why there are so many empty sets in the result. For example,
in this program the parameters x and y can only ever have the value 2.
Because this is a number and not a function value, it does not appear in
the abstract environment ρ̂.
The abstract cache for label 1 is the set { square, triple }. We therefore
know that the control flow at this point continues either in the function
square() or in the function triple(). As described in subsection 1.1.2,
this may enable the compiler to perform intraprocedural optimizations
on a whole-program level.
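The set for label 1 can be cross-checked by simply running the program.
The following sketch is Listing 2.1 without labels, extended with a
hypothetical log (calledAtLabel1) that records which functions are called
at the call site f(2). Note that this is a dynamic observation of one run,
not the static analysis itself, which has to cover all possible runs.

```javascript
// Names of the functions that are called at the call site f(2) (label 1).
const calledAtLabel1 = new Set();

function apply_two(f) {
    calledAtLabel1.add(f.name); // record the callee before the call
    return f(2);
}

function square(x) {
    return x * x;
}

function triple(y) {
    return y + 3;
}

apply_two(square);
apply_two(triple);

console.log([...calledAtLabel1]); // prints [ 'square', 'triple' ]
```

The observed callees agree with the abstract cache for label 1. In
general, an acceptable analysis result must contain at least the callees
that can be observed in any run.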
2.2 Acceptability
An analysis result is considered acceptable (valid) if it contains, for each
subexpression, at least all the functions that it may evaluate to. Thus, a result
that contains the set of all functions for each subexpression would be a
valid result, although not a very useful one. The goal is to narrow the
result down to the least solution because this increases the chances of
being able to perform additional optimizations [NNH99].
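The difference between a valid result and the least solution can be made
concrete with a small check. The sketch below assumes a hypothetical
program that applies an identity function id_x (body label 1, function
label 2) to an identity function id_y (body label 3, function label 4),
with label 5 for the whole application. The conditions are written by hand
as subset checks in the style of [NNH99]; all names are illustrative.

```javascript
// a ⊆ b for two sets of function names
const sub = (a, b) => [...a].every((v) => b.has(v));

// Hand-written conditions; R maps "C(l)" and "r(x)" to sets of names.
const constraints = [
    (R) => R["C(2)"].has("id_x"),
    (R) => sub(R["r(x)"], R["C(1)"]),
    (R) => R["C(4)"].has("id_y"),
    (R) => sub(R["r(y)"], R["C(3)"]),
    // conditional constraints of the application with label 5
    (R) => !R["C(2)"].has("id_x") || sub(R["C(4)"], R["r(x)"]),
    (R) => !R["C(2)"].has("id_x") || sub(R["C(1)"], R["C(5)"]),
    (R) => !R["C(2)"].has("id_y") || sub(R["C(4)"], R["r(y)"]),
    (R) => !R["C(2)"].has("id_y") || sub(R["C(3)"], R["C(5)"]),
];

const acceptable = (R) => constraints.every((c) => c(R));

const S = (...xs) => new Set(xs);
const least = {
    "C(1)": S("id_y"), "C(2)": S("id_x"), "C(3)": S(),
    "C(4)": S("id_y"), "C(5)": S("id_y"),
    "r(x)": S("id_y"), "r(y)": S(),
};
// Every set contains all functions: valid, but not useful.
const trivial = Object.fromEntries(
    Object.keys(least).map((k) => [k, S("id_x", "id_y")])
);
// One function value is missing: not valid.
const tooSmall = { ...least, "C(5)": S() };

console.log(acceptable(least), acceptable(trivial), acceptable(tooSmall));
// prints true true false
```

Both least and trivial pass the check, but only least is precise enough
to tell which function the overall expression evaluates to.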
There are a few simple rules that define for each type of expression (e.g.
function application, binary expression, etc.) the conditions that have to
be true in order for an analysis result to be valid. These rules are used to
check whether an analysis result is acceptable for a particular program.
Consider for example Listing 2.2:
Listing 2.2: Acceptability example
(function my_func(f) {
    (f^1();)^2
})^3

(my_func((function() {
    //brilliant code
})^4);)^5
The rule fn defines the constraints that are generated for function
definitions. First, the constraint {function(x) { return e0; }} ⊆ Ĉ(l) is
generated, which causes the function value to propagate to the abstract
cache of the surrounding expression with the label l. Additionally, the
constraints that are generated by the expression e0 are added to the set
as well.
Similar to the examples above, there are rules that define the set of
constraints that needs to be generated for every language construct (e.g.,
function application, binary expression, etc.).
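Constraint generation can be sketched as a recursive walk over the
program. The following is a minimal sketch under several assumptions: the
mini-AST, the function names generate and functionsIn, and the textual
output format are made up for illustration, and the rules for variables
and function application are assumed to follow [NNH99]. Ĉ(l) is printed as
C(l) and ρ̂(x) as r(x).

```javascript
// Terms of a hypothetical mini-AST (every term carries a label l):
//   { type: "fn", name, param, body, l }   function(param) { return body; }
//   { type: "var", name, l }
//   { type: "app", fn, arg, l }            fn(arg)
//   { type: "const", l }

// Collect all function definitions that occur in a term.
function functionsIn(e, acc = []) {
    if (e.type === "fn") {
        acc.push(e);
        functionsIn(e.body, acc);
    } else if (e.type === "app") {
        functionsIn(e.fn, acc);
        functionsIn(e.arg, acc);
    }
    return acc;
}

// Generate the constraints for a term as human-readable strings.
function generate(e, allFns, out = []) {
    switch (e.type) {
        case "fn": // rule fn: the function value flows to its own label
            out.push(`{${e.name}} ⊆ C(${e.l})`);
            generate(e.body, allFns, out);
            break;
        case "var": // the values bound to the variable flow to its label
            out.push(`r(${e.name}) ⊆ C(${e.l})`);
            break;
        case "app": // one pair of conditional constraints per function
            generate(e.fn, allFns, out);
            generate(e.arg, allFns, out);
            for (const f of allFns) {
                const cond = `{${f.name}} ⊆ C(${e.fn.l})`;
                out.push(`${cond} ⇒ C(${e.arg.l}) ⊆ r(${f.param})`);
                out.push(`${cond} ⇒ C(${f.body.l}) ⊆ C(${e.l})`);
            }
            break;
        // constants generate no constraints
    }
    return out;
}

// Hypothetical program: an identity function applied to another one.
const idX = { type: "fn", name: "id_x", param: "x", l: 2,
              body: { type: "var", name: "x", l: 1 } };
const idY = { type: "fn", name: "id_y", param: "y", l: 4,
              body: { type: "var", name: "y", l: 3 } };
const program = { type: "app", fn: idX, arg: idY, l: 5 };

const constraints = generate(program, functionsIn(program));
console.log(constraints.join("\n"));
```

For this program the sketch produces eight constraints: two per function
definition (rule fn plus the variable in the body) and two conditional
constraints per function for the single application.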
Step 6: W = []
The node Ĉ(5) doesn’t have any outgoing edges, so it can be removed
from the worklist. The node Ĉ(2) is the only remaining member of the
worklist. It can be removed too, because its outgoing edges are due to
constraints that have already been processed. This is the final result of
the analysis.
Final result
Tables 2.3 and 2.4 show the final result of the Graph Formulation in
tabular form:
This represents the least solution for the code in Listing 2.3. For example,
it shows that the overall expression with the label 5 can only evaluate to
the identity function id_y. Since all the possible function values for each
subexpression are now known, the analyser also knows which function
bodies may be executed as a result of calling a function (inter-flow).
In larger, more meaningful programs this knowledge may be useful to
perform intraprocedural optimizations on a whole-program level as shown
in subsection 1.1.2.
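The resolution of the constraints with a worklist can be sketched as
follows. This is a simplified sketch, not the exact procedure of the
walkthrough: it assumes a hypothetical program that applies an identity
function id_x (labels 1 and 2) to an identity function id_y (labels 3 and
4) with label 5 for the application, the constraints are written by hand,
and the worklist is processed as a queue, so the order of the steps may
differ. The constraint encoding and the name solve are illustrative.

```javascript
// Constraint forms:
//   { kind: "elem", t, to }               {t} ⊆ to
//   { kind: "sub", from, to }             from ⊆ to
//   { kind: "cond", t, test, from, to }   {t} ⊆ test ⇒ from ⊆ to
function solve(constraints) {
    const data = new Map();  // node -> set of function names
    const edges = new Map(); // node -> constraints to revisit if it grows
    const worklist = [];

    const init = (n) => {
        if (n !== undefined && !data.has(n)) {
            data.set(n, new Set());
            edges.set(n, []);
        }
    };
    // Add values to a node and put it on the worklist if it grew.
    const add = (n, values) => {
        let grew = false;
        for (const v of values) {
            if (!data.get(n).has(v)) {
                data.get(n).add(v);
                grew = true;
            }
        }
        if (grew) worklist.push(n);
    };

    for (const c of constraints) [c.to, c.from, c.test].forEach(init);
    for (const c of constraints) {
        if (c.kind === "elem") {
            add(c.to, [c.t]);
        } else {
            edges.get(c.from).push(c);
            if (c.kind === "cond") edges.get(c.test).push(c);
        }
    }

    while (worklist.length > 0) {
        const n = worklist.shift();
        for (const c of edges.get(n)) {
            // A conditional edge is only followed once its test holds.
            if (c.kind === "sub" || data.get(c.test).has(c.t)) {
                add(c.to, data.get(c.from));
            }
        }
    }
    return data;
}

const result = solve([
    { kind: "elem", t: "id_x", to: "C(2)" },
    { kind: "sub", from: "r(x)", to: "C(1)" },
    { kind: "elem", t: "id_y", to: "C(4)" },
    { kind: "sub", from: "r(y)", to: "C(3)" },
    { kind: "cond", t: "id_x", test: "C(2)", from: "C(4)", to: "r(x)" },
    { kind: "cond", t: "id_x", test: "C(2)", from: "C(1)", to: "C(5)" },
    { kind: "cond", t: "id_y", test: "C(2)", from: "C(4)", to: "r(y)" },
    { kind: "cond", t: "id_y", test: "C(2)", from: "C(3)", to: "C(5)" },
]);

console.log([...result.get("C(5)")]); // prints [ 'id_y' ]
```

For the assumed program the sketch yields the set { id_y } for label 5,
which agrees with the statement above, and it leaves the sets for label 3
and for the parameter y empty.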
3 Conclusion
My goal for this paper was to give a brief overview of different kinds
of static program analysis as well as a more detailed description of
Constraint Based Control Flow Analysis (CFA). After reading this paper, it
should be clear to the reader that Constraint Based CFA is a powerful
tool to analyse the control flow between procedures (inter-flow), which
can lead to further optimization possibilities.
It is worth noting that there exist more constraint based analysis
techniques than the one that was described in this paper. However, explaining
all of them in detail would be beyond the scope of this paper.
Bibliography
[Aik99] Alexander Aiken. Introduction to Set Constraint-Based Program
Analysis. 1999.
[ALSU06] Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D.
Ullman. Compilers - Principles, Techniques and Tools. 2006.
[AS96] Harold Abelson and Gerald Jay Sussman. Structure and
Interpretation of Computer Programs. 1996.
[Chu32] Alonzo Church. A Set of Postulates for the Foundation of
Logic. 1932.
[CT11] Keith Cooper and Linda Torczon. Engineering a Compiler.
2011.
[Lip11] Miran Lipovaca. Learn You a Haskell for Great Good! 2011.
[NNH99] Flemming Nielson, Hanne Riis Nielson, and Chris Hankin.
Principles of Program Analysis. 1999.
[Str14] Bjarne Stroustrup. Bjarne Stroustrup’s C++ Glossary,
December 2014. http://www.stroustrup.com/glossary.html.