Formulas¶
Formulas define logical relations between the free variables used in expressions.
Depending on the values assigned to those free variables, a formula can be true or false.
When a formula is true, we often say that the formula holds.
For example, the formula x = 4 + 5 holds if the value 9 is assigned to x, but it doesn’t hold for other assignments to x.
Some formulas don’t have any free variables. For example, 1 < 2 always holds, and 1 > 2 never holds.
You usually use formulas in the bodies of classes, predicates, and select clauses to constrain
the set of values that they refer to.
For example, you can define a class containing all integers i for which the formula i in [0 .. 9] holds.
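As a minimal sketch of such a class (the name Digit is an assumption made for illustration, not something defined by the QL libraries), the range formula goes in the characteristic predicate:

```ql
/** Hypothetical class: the integers from 0 to 9. */
class Digit extends int {
  // The formula in the body constrains which integers belong to the class.
  Digit() { this in [0 .. 9] }
}

from Digit d
select d
```

Run as a query, this should select each integer from 0 to 9.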
The following sections describe the kinds of formulas that are available in QL.
Comparisons¶
A comparison formula is of the form:
<expression> <operator> <expression>
See the tables below for an overview of the available comparison operators.
Order¶
To compare two expressions using one of these order operators, each expression must have a type and those types must be compatible and orderable.
Name | Symbol |
---|---|
Greater than | > |
Greater than or equal to | >= |
Less than | < |
Less than or equal to | <= |
For example, the formulas "Ann" < "Anne" and 5 + 6 >= 11 both hold.
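One way to check order comparisons like these is to pair each formula with a label and select the labels whose formula holds. The labels and the third comparison are assumptions added for illustration:

```ql
from string label
where
  label = "Ann < Anne" and "Ann" < "Anne"
  or
  label = "5 + 6 >= 11" and 5 + 6 >= 11
  or
  // Added for contrast: this comparison never holds.
  label = "3 > 4" and 3 > 4
select label
```

Only the first two labels should be selected.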
Equality¶
To compare two expressions using =, at least one of the expressions must have a type. If both expressions have a type, then their types must be compatible.
To compare two expressions using !=, both expressions must have a type. Those types must also be compatible.
Name | Symbol |
---|---|
Equal to | = |
Not equal to | != |
For example, x.sqrt() = 2 holds if x is 4, and 4 != 5 always holds.
For expressions A and B, the formula A = B holds if there is a pair of values (one from A and one from B) that are the same. In other words, A and B have at least one value in common. For example, [1 .. 2] = [2 .. 5] holds, since both expressions have the value 2.
As a consequence, A != B has a very different meaning to the negation not A = B [1]:
- A != B holds if there is a pair of values (one from A and one from B) that are different.
- not A = B holds if it is not the case that there is a pair of values that are the same. In other words, A and B have no values in common.
Examples
- If both expressions have a single value (for example 1 and 0), then comparison is straightforward:
  - 1 != 0 holds.
  - 1 = 0 doesn’t hold.
  - not 1 = 0 holds.
- Now compare 1 and [1 .. 2]:
  - 1 != [1 .. 2] holds, because 1 != 2.
  - 1 = [1 .. 2] holds, because 1 = 1.
  - not 1 = [1 .. 2] doesn’t hold, because there is a common value (1).
- Compare 1 and int empty() { none() } (a predicate defining the empty set of integers):
  - 1 != empty() doesn’t hold, because there are no values in empty(), so no values that are not equal to 1.
  - 1 = empty() also doesn’t hold, because there are no values in empty(), so no values that are equal to 1.
  - not 1 = empty() holds, because there are no common values.
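The multi-valued and empty cases above can be tried out with a small query. This is a sketch: the label strings are only there to identify each formula in the results.

```ql
/** The empty set of integers, as in the examples above. */
int empty() { none() }

from string label
where
  label = "1 != [1 .. 2]" and 1 != [1 .. 2]
  or
  label = "1 = [1 .. 2]" and 1 = [1 .. 2]
  or
  label = "not 1 = [1 .. 2]" and not 1 = [1 .. 2]
  or
  label = "1 != empty()" and 1 != empty()
  or
  label = "1 = empty()" and 1 = empty()
  or
  label = "not 1 = empty()" and not 1 = empty()
select label
```

Following the reasoning above, the selected labels should be 1 != [1 .. 2], 1 = [1 .. 2], and not 1 = empty().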
Type checks¶
A type check is a formula that looks like:
<expression> instanceof <type>
You can use a type check formula to check whether an expression has a certain type. For example, x instanceof Person holds if the variable x has type Person.
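A self-contained sketch of a type check, using the small OneTwoThree class that this page defines elsewhere in place of Person (which would come from a library):

```ql
class OneTwoThree extends int {
  OneTwoThree() { this = 1 or this = 2 or this = 3 }
}

from int x
where
  x in [0 .. 5] and
  // Keep only the values of x that belong to the class OneTwoThree.
  x instanceof OneTwoThree
select x
```

This should select 1, 2, and 3.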
Range checks¶
A range check is a formula that looks like:
<expression> in <range>
You can use a range check formula to check whether a numeric expression is in a given range. For example, x in [2.1 .. 10.5] holds if the variable x is between the values 2.1 and 10.5 (including 2.1 and 10.5 themselves).
Note that <expression> in <range> is equivalent to <expression> = <range>.
Both formulas hold if a value of <expression> is also one of the values denoted by <range>, in line with the meaning of = described under “Equality.”
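The following sketch compares the two spellings on an integer range. The variable form is an assumption added so that the results show which spelling produced each row:

```ql
from int x, string form
where
  x in [0 .. 12] and
  (
    form = "in" and x in [2 .. 10]
    or
    form = "=" and x = [2 .. 10]
  )
select x, form
```

Each integer from 2 to 10 should appear twice, once per form, which illustrates that the two formulas select the same values.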
Calls to predicates¶
A call is a formula or expression that consists of a reference to a predicate and a number of arguments.
For example, isThree(x) might be a call to a predicate that holds if the argument x is 3, and x.isEven() might be a call to a member predicate that holds if x is even.
A call to a predicate can also contain a closure operator, namely * or +. For example, a.isChildOf+(b) is a call to the transitive closure of isChildOf(), so it holds if a is a descendant of b.
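As a rough sketch, here are both kinds of call on non-member predicates. The definitions of isThree and isChildOf are hypothetical and work on a small chain of integers rather than a real parent relation:

```ql
/** Hypothetical predicate: holds if x is 3. */
predicate isThree(int x) { x = 3 }

/** Hypothetical parent relation on a small chain: 4 -> 3 -> 2 -> 1. */
predicate isChildOf(int child, int parent) {
  child in [2 .. 4] and parent = child - 1
}

from int a
where
  // Transitive closure: a is a child of 1, or a child of a child of 1, and so on.
  isChildOf+(a, 1) and
  not isThree(a)
select a
```

The descendants of 1 are 2, 3, and 4, so after excluding 3 the query should select 2 and 4.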
The predicate reference must resolve to exactly one predicate. For more information about how a predicate reference is resolved, see “Name resolution.”
If the call resolves to a predicate without result, then the call is a formula.
It is also possible to call a predicate with result. This kind of call is an expression in QL, instead of a formula. For more information, see “Calls to predicates (with result).”
Member predicates only apply to members of a particular class and calls to
member predicates have a receiver of a matching type. Syntactically, if a call
contains a dot, then the expression before the dot specifies the receiver of
the call. For instance, x is the receiver for the call x.isEven().
For calls to member predicates of the enclosing class on the member itself (i.e., the value of this), the receiver may be omitted syntactically. In this case we say the call has an implicit this receiver. For instance, in the following example the isEven() call in isOdd() is a member predicate call with an implicit this receiver and the call is equivalent to this.isEven():
class OneTwoThree extends int {
  OneTwoThree() { this = 1 or this = 2 or this = 3 }
  predicate isEven() { this = 2 }
  predicate isOdd() { not isEven() }
}
Use of implicit this receivers can make it harder to spot predicates that introduce cartesian products by failing to relate the implicit this variable with other variables, which can negatively affect query performance. For more information on cartesian products, see “Troubleshooting query performance.”
It is possible to enable warnings about implicit this receivers for CodeQL packs through the warnOnImplicitThis property.
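For comparison, here is a sketch of the same class with the receiver written out explicitly, which is the form that avoids the implicit this warning:

```ql
class OneTwoThree extends int {
  OneTwoThree() { this = 1 or this = 2 or this = 3 }

  predicate isEven() { this = 2 }

  // Same meaning as "not isEven()", but the receiver is spelled out.
  predicate isOdd() { not this.isEven() }
}

from OneTwoThree n
where n.isOdd()
select n
```

This should select 1 and 3.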
Parenthesized formulas¶
A parenthesized formula is any formula surrounded by parentheses, ( and ). This formula has exactly the same meaning as the enclosed formula. The parentheses often help to improve readability and group certain formulas together.
Quantified formulas¶
A quantified formula introduces temporary variables and uses them in formulas in its body. This is a way to create new formulas from existing ones.
Explicit quantifiers¶
The following explicit “quantifiers” are the same as the usual existential and universal quantifiers in mathematical logic.
exists¶
This quantifier has the following syntax:
exists(<variable declarations> | <formula>)
You can also write exists(<variable declarations> | <formula 1> | <formula 2>). This is equivalent to exists(<variable declarations> | <formula 1> and <formula 2>).
This quantified formula introduces some new variables. It holds if there is at least one set of values that the variables could take to make the formula in the body true.
For example, exists(int i | i instanceof OneTwoThree) introduces a temporary variable of type int and holds if any value of that variable has type OneTwoThree.
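A short sketch of exists inside a query, using the two-bar form. The relationship x = 2 * i is an assumption chosen only to give the quantified variable something to do:

```ql
class OneTwoThree extends int {
  OneTwoThree() { this = 1 or this = 2 or this = 3 }
}

from int x
where
  x in [1 .. 10] and
  // Holds if x is twice some value i that belongs to OneTwoThree.
  exists(int i | i instanceof OneTwoThree | x = 2 * i)
select x
```

The variable i is only visible inside the quantifier. The query should select 2, 4, and 6.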
forall¶
This quantifier has the following syntax:
forall(<variable declarations> | <formula 1> | <formula 2>)
forall introduces some new variables, and typically has two formulas in its body. It holds if <formula 2> holds for all values that <formula 1> holds for.
For example, forall(int i | i instanceof OneTwoThree | i < 5) holds if all integers that are in the class OneTwoThree are also less than 5. In other words, if there is a value in OneTwoThree that is greater than or equal to 5, then the formula doesn’t hold.
Note that forall(<vars> | <formula 1> | <formula 2>) is logically the same as not exists(<vars> | <formula 1> | not <formula 2>).
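The equivalence between forall and not exists can be sketched with labeled formulas. The labels and the third, failing formula are assumptions added for contrast:

```ql
class OneTwoThree extends int {
  OneTwoThree() { this = 1 or this = 2 or this = 3 }
}

from string label
where
  label = "forall: every OneTwoThree is below 5" and
  forall(int i | i instanceof OneTwoThree | i < 5)
  or
  label = "not exists: no OneTwoThree is 5 or more" and
  not exists(int i | i instanceof OneTwoThree | not i < 5)
  or
  // Doesn't hold, because 3 is in OneTwoThree but is not below 3.
  label = "forall: every OneTwoThree is below 3" and
  forall(int i | i instanceof OneTwoThree | i < 3)
select label
```

The first two labels should be selected together, and the third should not appear.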
forex¶
This quantifier has the following syntax:
forex(<variable declarations> | <formula 1> | <formula 2>)
This quantifier exists as a shorthand for:
forall(<vars> | <formula 1> | <formula 2>) and
exists(<vars> | <formula 1> | <formula 2>)
In other words, forex works in a similar way to forall, except that it ensures that there is at least one value for which <formula 1> holds.
To see why this is useful, note that the forall quantifier could hold trivially. For example, forall(int i | i = 1 and i = 2 | i = 3) holds: there are no integers i which are equal to both 1 and 2, so the second part of the body (i = 3) holds for every integer for which the first part holds.
Since this is often not the behavior that you want in a query, the forex quantifier is a useful shorthand.
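To illustrate the difference, this sketch evaluates the trivially true forall from the text next to the corresponding forex. The labels and the last formula are assumptions added for illustration:

```ql
from string label
where
  // Holds trivially: no integer is equal to both 1 and 2.
  label = "forall, no candidates" and
  forall(int i | i = 1 and i = 2 | i = 3)
  or
  // Doesn't hold: forex also requires at least one candidate.
  label = "forex, no candidates" and
  forex(int i | i = 1 and i = 2 | i = 3)
  or
  label = "forex, some candidates" and
  forex(int i | i in [1 .. 3] | i < 5)
select label
```

The first and third labels should be selected. The second should not, because no integer satisfies the first formula.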
Implicit quantifiers¶
Implicitly quantified variables can be introduced using “don’t care expressions.” These are used when you need to introduce a variable to use as an argument to a predicate call, but don’t care about its value. For further information, see “Don’t-care expressions.”
Logical connectives¶
You can use a number of logical connectives between formulas in QL. They allow you to combine existing formulas into longer, more complex ones.
To indicate which parts of the formula should take precedence, you can use parentheses. Otherwise, the order of precedence from highest to lowest is as follows:
- Negation (not)
- Conditional formula (if … then … else)
- Conjunction (and)
- Disjunction (or)
- Implication (implies)
For example, A and B implies C or D is equivalent to (A and B) implies (C or D).
Similarly, A and not if B then C else D is equivalent to A and (not (if B then C else D)).
Note that the parentheses in the above examples are not necessary, since they highlight the default precedence. You usually only add parentheses to override the default precedence, but you can also add them to make your code easier to read (even if they aren’t required).
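A small sketch of how precedence changes the meaning of a formula, using the built-in formulas any() (always holds) and none() (never holds). The labels are assumptions used to identify each formula:

```ql
from string label
where
  // Read as any() or (none() and none()), because "and" binds tighter than "or".
  label = "any() or none() and none()" and
  (any() or none() and none())
  or
  // Parentheses override the default precedence, so this formula doesn't hold.
  label = "(any() or none()) and none()" and
  ((any() or none()) and none())
select label
```

Only the first label should be selected.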
QL also has two nullary connectives indicating the always true formula, any(), and the always false formula, none().
The logical connectives in QL work similarly to Boolean connectives in other programming languages. Here is a brief overview:
any()¶
The built-in predicate any() is a formula that always holds.
Example
The following predicate defines the set of all expressions.
Expr allExpressions() {
  any()
}
none()¶
The built-in predicate none() is a formula that never holds.
Example
The following predicate defines the empty set of integers.
int emptySet() {
  none()
}
not¶
You can use the keyword not before a formula. The resulting formula is called a negation.
not A holds exactly when A doesn’t hold.
Example
The following query selects files that are not HTML files.
from File f
where not f.getFileType().isHtml()
select f
Note
You should be careful when using not in a recursive definition, as this could lead to non-monotonic recursion. For more information, see “Non-monotonic recursion.”
if ... then ... else¶
You can use these keywords to write a conditional formula. This is another way to simplify notation: if A then B else C is the same as writing (A and B) or ((not A) and C).
Example
With the following definition, visibility(c) returns "public" if c is a public class and returns "private" otherwise:
string visibility(Class c) {
  if c.isPublic()
  then result = "public"
  else result = "private"
}
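The expansion of if A then B else C into (A and B) or ((not A) and C) can be sketched on integers, so that no library is needed. The predicates parity and parityExpanded are hypothetical names:

```ql
/** Hypothetical predicate written with a conditional formula. */
string parity(int i) {
  i in [1 .. 4] and
  (if i % 2 = 0 then result = "even" else result = "odd")
}

/** The same predicate with the conditional expanded by hand. */
string parityExpanded(int i) {
  i in [1 .. 4] and
  (
    i % 2 = 0 and result = "even"
    or
    not i % 2 = 0 and result = "odd"
  )
}

from int i
where i in [1 .. 4]
select i, parity(i), parityExpanded(i)
```

For each i from 1 to 4, the two result columns should agree: "odd" for 1 and 3, "even" for 2 and 4.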
and¶
You can use the keyword and between two formulas. The resulting formula is called a conjunction.
A and B holds if, and only if, both A and B hold.
Example
The following query selects files that have the js extension and contain fewer than 200 lines of code:
from File f
where f.getExtension() = "js" and
  f.getNumberOfLinesOfCode() < 200
select f
or¶
You can use the keyword or between two formulas. The resulting formula is called a disjunction.
A or B holds if at least one of A or B holds.
Example
With the following definition, an integer is in the class OneTwoThree if it is equal to 1, 2, or 3:
class OneTwoThree extends int {
  OneTwoThree() {
    this = 1 or this = 2 or this = 3
  }
}
implies¶
You can use the keyword implies between two formulas. The resulting formula is called an implication. This is just a simplified notation: A implies B is the same as writing (not A) or B.
Example
The following query selects any SmallInt that is odd, or a multiple of 4.
class SmallInt extends int {
  SmallInt() { this = [1 .. 10] }
}
from SmallInt x
where x % 2 = 0 implies x % 4 = 0
select x
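For comparison, here is a sketch of the same query with the implication rewritten as (not A) or B:

```ql
class SmallInt extends int {
  SmallInt() { this = [1 .. 10] }
}

from SmallInt x
// Expanded form of: x % 2 = 0 implies x % 4 = 0
where not x % 2 = 0 or x % 4 = 0
select x
```

This should select the odd values 1, 3, 5, 7, and 9 together with the multiples of four, 4 and 8, matching the description of the implies version.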
Footnotes
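[1] The difference between A != B and not A = B comes from the underlying quantifiers. If you think of A and B as sets of values, then A != B means:
exists( a, b | a in A and b in B | a != b )
On the other hand, not A = B means:
not exists( a, b | a in A and b in B | a = b )
This is equivalent to forall( a, b | a in A and b in B | a != b ).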