Finding Insecure TrustManagers and Disabled Hostname Verification with CodeQL
In this post, I show how I found five vulnerabilities in the usage of Java's TrustManager and HostnameVerifier interfaces.
I start with a short section on what a certificate is and what CodeQL is, and then explain the queries I used to find the vulnerabilities.
What Are Certificates?
A certificate associates an identity (hostname, personal identity, …) with a public key and can either be signed by a Certificate Authority (CA) or be self-signed. A CA is a trusted third party that verifies the identity of the owner of the certificate and signs the certificate with its own private key. Both browsers and operating systems come with a set of CAs that they trust by default [1].
When a client connects to a server using TLS, the server sends its certificate to the client. The client then verifies the certificate by checking whether it is signed by a trusted CA and whether the hostname of the server matches the hostname in the certificate. If the certificate is valid, the client will establish a secure and encrypted connection with the server.
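In Java, both checks can be requested explicitly when working with a raw SSLEngine or SSLSocket. The following is a minimal sketch, not taken from any of the affected projects; the class name, method name, and host are my own placeholders. It only configures an engine and does not open a network connection.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLParameters;

final class EndpointIdentificationSketch {

    /**
     * Creates a client-mode SSLEngine configured for both checks described above:
     * chain validation against the JVM's default trust store, and hostname
     * matching against the certificate ("HTTPS" endpoint identification).
     */
    static SSLEngine strictClientEngine(String host, int port) throws Exception {
        // The default context only trusts the CAs in the JVM's trust store.
        SSLContext context = SSLContext.getDefault();
        SSLEngine engine = context.createSSLEngine(host, port);
        engine.setUseClientMode(true);

        // Raw SSLEngine/SSLSocket connections skip the hostname check unless asked.
        SSLParameters params = engine.getSSLParameters();
        params.setEndpointIdentificationAlgorithm("HTTPS");
        engine.setSSLParameters(params);
        return engine;
    }

    public static void main(String[] args) throws Exception {
        SSLEngine engine = strictClientEngine("example.org", 443);
        System.out.println(engine.getSSLParameters().getEndpointIdentificationAlgorithm());
    }
}
```

HttpsURLConnection performs the hostname check on its own, which is why the rest of this post looks at code that actively switches it off.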
So What’s the Problem?
The problem is that the client can be configured to trust certificates that are not signed by a trusted CA or that don’t match the hostname of the server. This is usually done for testing purposes, but it can also be done by mistake or just as an oversight.
Browsers usually get this right, but there have also been cases in the past where they incorrectly implemented hostname verification [2][3] or where they had other problems verifying a certificate [4].
In this post I’m going to focus on Java applications that use the TrustManager or HostnameVerifier interfaces unsafely.
CodeQL
CodeQL is a static analysis tool that was developed by Semmle, which is now part of GitHub.
It can be used both for (targeted) variant analysis and for (less targeted) analysis of entire bug classes like XSS, SSRF, and many more.
CodeQL has a simple but powerful, logical query language. If you want to learn more about CodeQL I recommend reading the CodeQL documentation.
Finding Insecure TrustManagers
So what is an insecure TrustManager?
A TrustManager is insecure if it accepts all certificates, regardless of whether they are signed by a trusted CA or not.
This is usually done by implementing the checkServerTrusted method of the X509TrustManager interface and never throwing an exception, therefore accepting all certificates.
In code this would look like this:
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import javax.net.ssl.X509TrustManager;

class InsecureTrustManager implements X509TrustManager {
    @Override
    public X509Certificate[] getAcceptedIssuers() {
        return null;
    }

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType) throws CertificateException {
        // BAD: Does not verify the certificate chain, allowing any certificate.
    }

    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType) throws CertificateException {
        // BAD: Accepts any client certificate as well.
    }
}
If we then use this TrustManager like so in our application:
SSLContext sslContext = SSLContext.getInstance("TLS");
sslContext.init(null, new TrustManager[] { new InsecureTrustManager() }, null);
HttpsURLConnection connection = (HttpsURLConnection) new URL("https://untrusted-root.badssl.com/").openConnection();
connection.setSSLSocketFactory(sslContext.getSocketFactory());
connection.connect();
We will happily connect to the server even though the certificate is not signed by a trusted CA.
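The difference to the JVM's default behaviour can be seen without a network connection. The sketch below is my own illustration (the names TrustComparison, TrustAll, platformDefault and acceptsEmptyChain are made up for this example): it hands an empty, obviously invalid certificate chain to a trust-all manager and to the platform's default X509TrustManager and reports whether each one objects.

```java
import java.security.KeyStore;
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import javax.net.ssl.TrustManager;
import javax.net.ssl.TrustManagerFactory;
import javax.net.ssl.X509TrustManager;

final class TrustComparison {

    /** Same shape as InsecureTrustManager above: the check methods never throw. */
    static final class TrustAll implements X509TrustManager {
        @Override public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
        @Override public void checkServerTrusted(X509Certificate[] chain, String authType) { }
        @Override public void checkClientTrusted(X509Certificate[] chain, String authType) { }
    }

    /** Returns the JVM's default X509TrustManager, backed by the platform trust store. */
    static X509TrustManager platformDefault() throws Exception {
        TrustManagerFactory tmf =
            TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init((KeyStore) null); // null selects the default trust store
        for (TrustManager tm : tmf.getTrustManagers()) {
            if (tm instanceof X509TrustManager) {
                return (X509TrustManager) tm;
            }
        }
        throw new IllegalStateException("no X509TrustManager available");
    }

    /** True if the manager raises no objection to an empty certificate chain. */
    static boolean acceptsEmptyChain(X509TrustManager tm) {
        try {
            tm.checkServerTrusted(new X509Certificate[0], "RSA");
            return true;
        } catch (CertificateException | RuntimeException rejected) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("trust-all accepts empty chain: " + acceptsEmptyChain(new TrustAll()));
        System.out.println("default accepts empty chain:   " + acceptsEmptyChain(platformDefault()));
    }
}
```

Passing the trust managers obtained from TrustManagerFactory (or simply null) as the second argument of SSLContext#init keeps certificate validation intact; if a self-signed certificate has to be trusted, loading it into a KeyStore and initializing the factory with that store is usually a safer route than a custom trust-all class.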
The Query – High-Level
When writing a query it’s very helpful to verbalize the query:
We want to find all cases where an insecure TrustManager is used to initialize an SSLContext.
This means that we have a data flow query and we “just” have to define the source and the sink!
We can directly translate this into a CodeQL from clause:
from InsecureTrustManagerFlow::PathNode source, InsecureTrustManagerFlow::PathNode sink
sources are all TrustManager instances that are insecure and sinks are all SSLContext instances that are initialized with an insecure TrustManager!
Our where clause then only has to ensure that the source is actually used at the sink, that is, we need flowPath to hold:
where InsecureTrustManagerFlow::flowPath(source, sink)
The select clause then adds a message at the location of the SSLContext#init method and also references where the trust manager has been defined:
select sink, source, sink, "This uses $@, which is defined in $@ and trusts any certificate.",
  source, "TrustManager",
  source.getNode().asExpr().(ClassInstanceExpr).getConstructedType() as type, type.nestedName()
The rest of the query contains a little bit of boilerplate to make the query better structured and reusable.
(The main query can be found here, support files are in InsecureTrustManager.qll and InsecureTrustManagerQuery.qll).
The Query – Low-Level
(Some parts of the query are shown simplified)
The InsecureTrustManagerSource class models all TrustManager instances that are insecure on the data flow level [5] by viewing the node as an expression and then checking whether its constructed type [6] is an InsecureX509TrustManager.
private class InsecureTrustManagerSource extends DataFlow::Node {
  InsecureTrustManagerSource() {
    this.asExpr().(ClassInstanceExpr).getConstructedType() instanceof InsecureX509TrustManager
  }
}
InsecureX509TrustManager is a class that models all classes deriving from X509TrustManager (#1) that have overridden the checkServerTrusted method (#2) and that never throw a CertificateException (#3).
private class InsecureX509TrustManager extends RefType {
  InsecureX509TrustManager() {
    this.getAnAncestor() instanceof X509TrustManager and // #1
    exists(Method m |
      m.getDeclaringType() = this and
      m.hasName("checkServerTrusted") and // #2
      not mayThrowCertificateException(m) // #3
    )
  }
}
Under what conditions can a method throw a CertificateException?
When it contains a throw statement that throws a CertificateException (#4), when it calls a method (#5) that may itself throw a CertificateException (#6), or when there is no source code available for the called method and that method declares CertificateException in its throws clause (#7).
private predicate mayThrowCertificateException(Method m) {
  exists(ThrowStmt throwStmt | // #4
    throwStmt.getThrownExceptionType().getAnAncestor() instanceof CertificateException // #4
  |
    throwStmt.getEnclosingCallable() = m // #4
  )
  or
  exists(Method otherMethod | m.polyCalls(otherMethod) | // #5
    mayThrowCertificateException(otherMethod) // #6
    or
    not otherMethod.fromSource() and // #7
    otherMethod.getAnException().getType().getAnAncestor() instanceof CertificateException // #7
  )
}
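To make the three conditions more concrete, here are two Java trust managers that, as far as I read the predicate above, would count as "may throw" and would therefore not be treated as insecure. This is an illustrative sketch; the class names are hypothetical and not taken from any real project.

```java
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;
import javax.net.ssl.X509TrustManager;

final class ThrowShapes {

    /** Shape #4: checkServerTrusted contains an explicit throw of CertificateException. */
    static final class RejectsEmptyChain implements X509TrustManager {
        @Override public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
        @Override public void checkClientTrusted(X509Certificate[] chain, String authType) { }

        @Override
        public void checkServerTrusted(X509Certificate[] chain, String authType)
                throws CertificateException {
            if (chain == null || chain.length == 0) {
                throw new CertificateException("empty certificate chain");
            }
            // Caveat: any non-empty chain is still accepted without validation.
        }
    }

    /** Shapes #5 to #7: the check is forwarded to another manager's checkServerTrusted. */
    static final class Delegating implements X509TrustManager {
        private final X509TrustManager delegate;

        Delegating(X509TrustManager delegate) { this.delegate = delegate; }

        @Override public X509Certificate[] getAcceptedIssuers() { return delegate.getAcceptedIssuers(); }

        @Override
        public void checkClientTrusted(X509Certificate[] chain, String authType)
                throws CertificateException {
            delegate.checkClientTrusted(chain, authType);
        }

        @Override
        public void checkServerTrusted(X509Certificate[] chain, String authType)
                throws CertificateException {
            delegate.checkServerTrusted(chain, authType);
        }
    }

    /** True if the manager rejects the chain with a CertificateException. */
    static boolean rejects(X509TrustManager tm, X509Certificate[] chain) {
        try {
            tm.checkServerTrusted(chain, "RSA");
            return false;
        } catch (CertificateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        X509TrustManager tm = new Delegating(new RejectsEmptyChain());
        System.out.println("rejects empty chain: " + rejects(tm, new X509Certificate[0]));
    }
}
```

RejectsEmptyChain also shows the limit of the heuristic: the presence of a throw statement says nothing about whether the chain is actually validated, so a class like this is still unsafe in practice even though the predicate holds for it.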
The InsecureTrustManagerSink class models all cases where any TrustManager (#8) is used to init (#9) an SSLContext (#10; SslContext is the CodeQL class that models javax.net.ssl.SSLContext).
private class InsecureTrustManagerSink extends DataFlow::Node {
  InsecureTrustManagerSink() {
    exists(MethodCall ma, Method m |
      m.hasName("init") and // #9
      m.getDeclaringType() instanceof SslContext and // #10
      ma.getMethod() = m
    |
      ma.getArgument(1) = this.asExpr() // #8
    )
  }
}
The InsecureTrustManagerConfig module then simply combines the source (#11) and the sink (#12) like this:
module InsecureTrustManagerConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) { source instanceof InsecureTrustManagerSource } // #11

  predicate isSink(DataFlow::Node sink) { sink instanceof InsecureTrustManagerSink } // #12
}
However, we have a slight problem: remember that we have a data flow query and not a taint tracking query. Recall the example from above:
SSLContext sslContext = SSLContext.getInstance("TLS");
sslContext.init(null,
    new TrustManager[] { // #14
        new InsecureTrustManager() // #13 #14
    } // #14
    , null);
We want to find flow from #13 to the second argument of init (index 1 in the definition of InsecureTrustManagerSink, because CodeQL argument indices are zero-based).
However, #13 is an array element and cannot flow to the array itself (#14), because CodeQL distinguishes between the array elements and the array itself. To fix this, we can allow implicit reads of array elements by overriding the allowImplicitRead predicate.
predicate allowImplicitRead(DataFlow::Node node, DataFlow::ContentSet c) {
  (isSink(node) or isAdditionalFlowStep(node, _)) and
  node.getType() instanceof Array and
  c instanceof DataFlow::ArrayContent
}
This predicate allows implicit reads of array elements when the array is used as a sink or as the input to an additional flow step. With implicit reads enabled at those nodes, CodeQL no longer distinguishes between data stored inside something (in a field, in an array as an element, in a map as a key or value, …) and the thing itself (the object the field belongs to, the array where the element is in, the map where the key/value is from, …) [7].
Finding Disabled Hostname Verification
So what is disabled hostname verification?
Hostname verification is disabled if we have a HostnameVerifier that always returns true in its verify method.
Always returning true means that we will accept any hostname, regardless of whether it matches the hostname in the certificate or not!
In code this would look like this:
HostnameVerifier verifier = new HostnameVerifier() {
    @Override
    public boolean verify(String hostname, SSLSession session) {
        return true; // BAD: accept even if the hostname doesn't match
    }
};
If we then use this HostnameVerifier like so in our application:
HttpsURLConnection connection = (HttpsURLConnection) new URL("https://wrong.host.badssl.com/").openConnection();
connection.setHostnameVerifier(verifier);
connection.connect();
We will happily connect to the server even though the certificate is not valid for the wrong.host.badssl.com domain [8].
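Why the certificate does not cover that name becomes clearer with a small sketch of wildcard matching. This is a deliberately simplified illustration written for this post (the class and method names are my own); real hostname verifiers also handle IP addresses, internationalized names, and further restrictions on where a wildcard may appear.

```java
import java.util.Locale;

final class WildcardMatch {

    /**
     * Simplified matching of a certificate name against a hostname.
     * A leading "*." stands for exactly one DNS label; everything else must match literally.
     */
    static boolean matches(String certName, String hostname) {
        String pattern = certName.toLowerCase(Locale.ROOT);
        String host = hostname.toLowerCase(Locale.ROOT);
        if (!pattern.startsWith("*.")) {
            return pattern.equals(host);
        }
        String suffix = pattern.substring(1); // e.g. ".badssl.com"
        if (!host.endsWith(suffix)) {
            return false;
        }
        String label = host.substring(0, host.length() - suffix.length());
        // The wildcard must cover one non-empty label, so it may not contain a dot.
        return !label.isEmpty() && label.indexOf('.') < 0;
    }

    public static void main(String[] args) {
        System.out.println(matches("*.badssl.com", "host.badssl.com"));
        System.out.println(matches("*.badssl.com", "wrong.host.badssl.com"));
    }
}
```

A verifier that unconditionally returns true skips this comparison entirely, which is exactly what the query below looks for.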
The Query – High-Level
Again, when writing a query it’s very helpful to verbalize the query:
We want to find all cases where an all-accepting HostnameVerifier is used in a HttpsURLConnection#set(Default)HostnameVerifier call.
This means that we again have a data flow query and we “just” have to define the source and the sink!
We can directly translate this into a CodeQL from clause:
from
  TrustAllHostnameVerifierFlow::PathNode source, TrustAllHostnameVerifierFlow::PathNode sink
sources are all HostnameVerifier instances that are all-accepting and sinks are all HttpsURLConnection#set(Default)HostnameVerifier calls!
Our where clause then only has to ensure that the source is actually used at the sink, that is, we need flowPath to hold:
where TrustAllHostnameVerifierFlow::flowPath(source, sink)
The select clause then adds a message at the location of the HttpsURLConnection#set(Default)HostnameVerifier method and also references where the all-accepting hostname verifier has been defined:
select sink, source, sink,
  "The $@ defined by $@ always accepts any certificate, even if the hostname does not match.",
  source, "hostname verifier",
  source.getNode().asExpr().(ClassInstanceExpr).getConstructedType() as verifier, "this type"
The rest of the query contains a little bit of boilerplate to make the query better structured and reusable.
(The main query can be found here, the support code is in UnsafeHostnameVerificationQuery.qll).
The Query – Low-Level
(Some parts of the query are shown simplified)
The TrustAllHostnameVerifier class models all HostnameVerifier instances that accept any hostname by checking whether the instance derives from HostnameVerifier (#1) and whether it overrides the verify method (#2) to always return true (#3).
class TrustAllHostnameVerifier extends RefType {
  TrustAllHostnameVerifier() {
    this.getAnAncestor() instanceof HostnameVerifier and // #1
    exists(HostnameVerifierVerify m |
      m.getDeclaringType() = this and // #2
      alwaysReturnsTrue(m) // #3
    )
  }
}
When does a method always return true?
When all return statements return true (#4). Note that this is a simplification: there could be methods that always return true in practice (at runtime), but we cannot determine this statically.
private predicate alwaysReturnsTrue(HostnameVerifierVerify m) {
  forex(ReturnStmt rs | rs.getEnclosingCallable() = m |
    rs.getResult().(CompileTimeConstantExpr).getBooleanValue() = true // #4
  )
}
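The following Java sketch shows three verifier shapes and how I would expect the predicate above to treat them, based purely on the "every return statement is the constant true" rule. The class names are hypothetical and exist only for this illustration.

```java
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLSession;

final class VerifierShapes {

    /** Every return statement returns the constant true: matches the rule. */
    static final class AcceptAll implements HostnameVerifier {
        @Override
        public boolean verify(String hostname, SSLSession session) {
            if (hostname == null) {
                return true;
            }
            return true;
        }
    }

    /** Returns a computed value, so the rule does not hold. */
    static final class LocalhostOnly implements HostnameVerifier {
        @Override
        public boolean verify(String hostname, SSLSession session) {
            return "localhost".equalsIgnoreCase(hostname);
        }
    }

    /** Accepts everything at runtime, but the return value is not a constant expression. */
    static final class AcceptAllInDisguise implements HostnameVerifier {
        @Override
        public boolean verify(String hostname, SSLSession session) {
            return hostname != null || hostname == null;
        }
    }

    public static void main(String[] args) {
        System.out.println(new AcceptAll().verify("example.org", null));
        System.out.println(new LocalhostOnly().verify("example.org", null));
        System.out.println(new AcceptAllInDisguise().verify("example.org", null));
    }
}
```

AcceptAllInDisguise is the kind of case the simplification misses: it behaves like AcceptAll, but a purely syntactic check on return statements cannot see that.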
The HostnameVerifierSink class models all cases where any HostnameVerifier is used in e.g. a HttpsURLConnection#setHostnameVerifier call.
private class HostnameVerifierSink extends DataFlow::Node {
  HostnameVerifierSink() { sinkNode(this, "hostname-verification") }
}
It does this by using the special sinkNode predicate that gets all nodes that are annotated with hostname-verification in a “Models-as-Data” (MaD) file.
The MaD models are stored as .yml files in the java/ql/lib/ext folder.
In our case, there are three definitions:
- ["javax.net.ssl", "HttpsURLConnection", True, "setDefaultHostnameVerifier", "", "", "Argument[0]", "hostname-verification", "manual"]
- ["javax.net.ssl", "HttpsURLConnection", True, "setHostnameVerifier", "", "", "Argument[0]", "hostname-verification", "manual"]
# from https://github.com/github/codeql/blob/257fe1ad6b5e8e596ece2306213dcfc340420e2c/java/ql/lib/ext/javax.net.ssl.model.yml#L6-L7
- ["org.apache.cxf.configuration.jsse", "TLSClientParameters", True, "setHostnameVerifier", "(HostnameVerifier)", "", "Argument[0]", "hostname-verification", "manual"]
# from https://github.com/github/codeql/blob/257fe1ad6b5e8e596ece2306213dcfc340420e2c/java/ql/lib/ext/org.apache.cxf.configuration.jsse.model.yml#L7
The first element is the package name ("javax.net.ssl") and the second element is the class name ("HttpsURLConnection").
The third element is a boolean that indicates whether the model also applies to arbitrary subtypes of that type (True). The fourth element is the method name ("setDefaultHostnameVerifier"); more generally, it selects a specific member (method, field, …) of the type.
The fifth element allows restriction based on the member signature ("" here, so no filtering is done); the sixth element is not relevant in our case.
The seventh element defines how data enters the sink ("Argument[0]" in our case) and the eighth element is the kind that is used to annotate the sink ("hostname-verification").
The ninth element is the origin of the model (in this case manual, because the model has been added manually and not generated by e.g. the model generator). For more information about MaD files have a look at this internal documentation.
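Since the rows are purely positional, it can help to see the nine columns with names attached. The class below is only a reading aid I wrote for this post; it is not part of CodeQL, and the field names are my own labels for the positions described above.

```java
import java.util.Arrays;
import java.util.List;

final class SinkModelRow {
    final String packageName;      // 1: package
    final String typeName;         // 2: class or interface
    final boolean includeSubtypes; // 3: also apply to subtypes
    final String memberName;       // 4: method (or other member)
    final String signature;        // 5: optional signature filter
    final String ext;              // 6: not relevant here
    final String input;            // 7: how data enters the sink
    final String kind;             // 8: sink kind
    final String provenance;       // 9: origin of the model

    private SinkModelRow(List<String> row) {
        packageName = row.get(0);
        typeName = row.get(1);
        includeSubtypes = Boolean.parseBoolean(row.get(2));
        memberName = row.get(3);
        signature = row.get(4);
        ext = row.get(5);
        input = row.get(6);
        kind = row.get(7);
        provenance = row.get(8);
    }

    /** Builds a row from its nine positional columns. */
    static SinkModelRow of(List<String> row) {
        if (row.size() != 9) {
            throw new IllegalArgumentException("expected 9 columns, got " + row.size());
        }
        return new SinkModelRow(row);
    }

    public static void main(String[] args) {
        SinkModelRow row = of(Arrays.asList(
            "javax.net.ssl", "HttpsURLConnection", "True", "setHostnameVerifier",
            "", "", "Argument[0]", "hostname-verification", "manual"));
        System.out.println(row.typeName + "#" + row.memberName + " " + row.input + " -> " + row.kind);
    }
}
```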
The TrustAllHostnameVerifierConfig module then simply combines the source (#5) and the sink (#6) like this:
module TrustAllHostnameVerifierConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) {
    source.asExpr().(ClassInstanceExpr).getConstructedType() instanceof TrustAllHostnameVerifier // #5
  }

  predicate isSink(DataFlow::Node sink) { sink instanceof HostnameVerifierSink } // #6
}
Because we want to reduce false-positives, we add an isBarrier predicate to the query.
This predicate ignores all nodes inside functions whose names suggest that they intentionally disable hostname verification.
predicate isBarrier(DataFlow::Node barrier) {
  // ignore nodes that are in functions that intentionally disable hostname verification
  barrier
      .getEnclosingCallable()
      .getName()
      /*
       * Regex: (_)* :
       * some methods have underscores.
       * Regex: (no|ignore|disable)(strictssl|ssl|verify|verification|hostname)
       * noStrictSSL ignoreSsl
       * Regex: (set)?(accept|trust|ignore|allow)(all|every|any)
       * acceptAll trustAll ignoreAll setTrustAnyHttps
       * Regex: (use|do|enable)insecure
       * useInsecureSSL
       * Regex: (set|do|use)?no.*(check|validation|verify|verification)
       * setNoCertificateCheck
       * Regex: disable
       * disableChecks
       */

      .regexpMatch("^(?i)(_)*((no|ignore|disable)(strictssl|ssl|verify|verification|hostname)" +
          "|(set)?(accept|trust|ignore|allow)(all|every|any)" +
          "|(use|do|enable)insecure|(set|do|use)?no.*(check|validation|verify|verification)|disable).*$")
}
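QL regular expressions follow the java.util.regex syntax, so the pattern can be tried out directly in Java. The snippet below copies the regular expression from the predicate above and checks a few method names against it; the wrapper class, the helper name, and the sample method names are made up for this illustration.

```java
import java.util.regex.Pattern;

final class BarrierNameCheck {

    // Copied verbatim from the isBarrier predicate above.
    static final Pattern INTENTIONALLY_INSECURE = Pattern.compile(
        "^(?i)(_)*((no|ignore|disable)(strictssl|ssl|verify|verification|hostname)"
            + "|(set)?(accept|trust|ignore|allow)(all|every|any)"
            + "|(use|do|enable)insecure|(set|do|use)?no.*(check|validation|verify|verification)|disable).*$");

    /** True if the whole method name matches the pattern. */
    static boolean looksIntentional(String methodName) {
        return INTENTIONALLY_INSECURE.matcher(methodName).matches();
    }

    public static void main(String[] args) {
        String[] names = {
            "trustAllHosts", "setNoCertificateCheck", "useInsecureSSL",
            "disableChecks", "createConnection", "notifyHealthCheck"
        };
        for (String name : names) {
            System.out.println(name + " -> " + looksIntentional(name));
        }
    }
}
```

Names such as notifyHealthCheck also match (through the "no … check" alternative), so the barrier trades a few missed results for fewer false positives.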
General Guards
To further reduce false-positives, we also extend the where clause with and not isNodeGuardedByFlag(sink.getNode()) to remove all sinks that are guarded by a flag indicating intentional disabling of hostname verification.
predicate isNodeGuardedByFlag(DataFlow::Node node) {
  exists(Guard g | g.controls(node.asExpr().getBasicBlock(), _) | // #7
    g = getASecurityFeatureFlagGuard() or g = getAnUnsafeHostnameVerifierFlagGuard() // #8
  )
}
A node is guarded when there is a Guard that controls (#7) [9] the node and that is either a security feature flag guard or an unsafe hostname verifier flag guard (#8).
A Guard controls another node when the execution of the controlled node is dependent on the condition specified by the guard.
For example, consider the following code:
if (isHostnameVerificationDisabled()) { // #9
    connection.setHostnameVerifier(new TrustAllHostnameVerifier()); // #10
}
Here, the connection.setHostnameVerifier (#10) call is guarded/controlled by the isHostnameVerificationDisabled (#9) method call.
Security Feature Flag Guards
The getASecurityFeatureFlagGuard predicate gets some pre-defined guards indicating intentional disabling of a security feature, while the getAnUnsafeHostnameVerifierFlagGuard predicate gets guards specific to hostname verification. For the latter, we extend the existing FlagKind class.
All we have to do is to override the getAFlagName predicate to get all strings that should be considered a flag.
private class UnsafeHostnameVerificationFlag extends FlagKind {
  UnsafeHostnameVerificationFlag() { this = "UnsafeHostnameVerificationFlag" }

  bindingset[result]
  override string getAFlagName() {
    result
        .regexpMatch("(?i).*(secure|disable|selfCert|selfSign|validat|verif|trust|ignore|nocertificatecheck).*") and
    result != "equalsIgnoreCase"
  }
}
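The flag-name pattern can be tried out in Java in the same way. This is a small sketch that mirrors the two conditions of getAFlagName; the wrapper class, the helper name, and the sample names are my own.

```java
import java.util.regex.Pattern;

final class FlagNameCheck {

    // Copied from getAFlagName above.
    static final Pattern FLAG_NAME = Pattern.compile(
        "(?i).*(secure|disable|selfCert|selfSign|validat|verif|trust|ignore|nocertificatecheck).*");

    /** Mirrors getAFlagName: the name matches the pattern and is not "equalsIgnoreCase". */
    static boolean isFlagName(String name) {
        return FLAG_NAME.matcher(name).matches() && !name.equals("equalsIgnoreCase");
    }

    public static void main(String[] args) {
        String[] names = { "isHostnameVerificationDisabled", "TRUST_ALL", "equalsIgnoreCase", "debugMode" };
        for (String name : names) {
            System.out.println(name + " -> " + isFlagName(name));
        }
    }
}
```

The explicit exclusion of equalsIgnoreCase is needed because that very common method name contains "ignore" and would otherwise be treated as a flag.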
By extending the FlagKind class, we get all of its functionality for free! Namely, we get the getAFlag predicate that gets all flags that are used to guard a node.
private Guard getAnUnsafeHostnameVerifierFlagGuard() {
  result = any(UnsafeHostnameVerificationFlag flag).getAFlag().asExpr()
}
This completes the implementation of isNodeGuardedByFlag and allows us to heavily reduce false-positives!
Conclusion
In this post I showed how to find multiple CVEs in the usage of the Java TrustManager and HostnameVerifier interfaces using CodeQL.
I did this by using data flow queries that find all cases where an insecure TrustManager or an all-accepting HostnameVerifier is used.
Many, if not most, problems can be viewed as data flow/taint tracking problems, and CodeQL is a great tool to solve these problems!
[1] These CAs can and will be removed when there are problems with them, see e.g. https://groups.google.com/a/mozilla.org/g/dev-security-policy/c/oxX69KFvsm4, https://wiki.mozilla.org/CA/Symantec_Issues, or https://www.techtarget.com/searchsecurity/news/252527914/Mozilla-Microsoft-drop-Trustcor-as-root-certificate-authority.
[5] There are multiple “levels” in CodeQL. The data flow level is the highest level and is partially shared across all languages supported by CodeQL, while the abstract syntax tree level is specific to each language and is the lowest level.
[6] A ClassInstanceExpr is for example new FooBar() and getConstructedType gets the type of the constructed object, in this case FooBar.
[7] For more information about implicit reads see this discussion.
[8] The certificate is only valid for *.badssl.com and badssl.com. Wildcard certificates, like *.badssl.com, only apply to one level of subdomains, so wrong.host.badssl.com is not covered by the certificate, but host.badssl.com or foobar.badssl.com would be.
[9] Technically, the Guard verifies that it controls the basic block that contains the node.