The OAuth 2.0 specification defines a delegation protocol that is useful for conveying authorization decisions across a network of web-enabled applications and APIs. OAuth is used in a wide variety of applications, including providing mechanisms for user authentication. This has led many developers and API providers to incorrectly conclude that OAuth is itself an authentication protocol and to mistakenly use it as such. Let's say that again, to be clear:
Much of the confusion comes from the fact that OAuth is used inside of authentication protocols, and developers will see the OAuth components and interact with the OAuth flow and assume that by simply using OAuth, they can accomplish user authentication. This turns out to be not only untrue, but also dangerous for service providers, developers, and end users.
This article is intended to help potential identity providers with the question of how to build an authentication and identity API using OAuth 2.0 as the base. Essentially, if you're saying "I have OAuth 2.0, and I need authentication and identity", then read on.
Authentication in the context of a user accessing an application tells an application who the current user is and whether or not they're present. A full authentication protocol will probably also tell you a number of attributes about this user, such as a unique identifier, an email address, and what to call them when the application says "Good Morning". Authentication is all about the user and their presence with the application, and an internet-scale authentication protocol needs to be able to do this across network and security boundaries.
However, OAuth tells the application none of that. OAuth says absolutely nothing about the user, nor does it say how the user proved their presence or even if they're still there. As far as an OAuth client is concerned, it asked for a token, got a token, and eventually used that token to access some API. It doesn't know anything about who authorized the application or if there was even a user there at all. In fact, much of the point of OAuth is about giving this delegated access for use in situations where the user is not present on the connection between the client and the resource being accessed. This is great for client authorization, but it's really bad for authentication where the whole point is figuring out if the user is there or not (and who they are).
As an additional confounder to our topic, an OAuth process does usually include several kinds of authentication in its process: the resource owner authenticates to the authorization server in the authorization step, the client authenticates to the authorization server in the token endpoint, and there may be others. The existence of these authentication events within the OAuth protocol does not translate to the Oauth protocol itself being able to reliably convey authentication.
As it turns out, though, there are a handful of things that can be used along with OAuth to create an authentication and identity protocol on top of this delegation and authorization protocol. In nearly all of these cases, the core functionality of OAuth remains intact, and what's happening is that the user is delegating access to their identity to the application they're trying to log in to. The client application then becomes a consumer of the identity API, thereby finding out who authorized the client in the first place. One major benefit of building authentication on top of authorization in this way is that it allows for management of end-user consent, which is very important in cross-domain identity federation at internet scale. Another important benefit is that the user can delegate access to other protected APIs along side their identity at the same time, making it much simpler for both application developers and end users to manage. With one call, an application can find out if a user is logged in, what the app should call the user, download photos for printing, and post updates to their message stream. This simplicity is very compelling, but by doing both at the same time, many developers conflate the two functions.
To help clear things up, it may be helpful to think of the problem in terms of a metaphor: chocolate vs. fudge. From the start, the nature of these two things is quite different: chocolate is an ingredient, fudge is a confection. Chocolate can be used to make many different things, and it can even be used on its own. Fudge can be made out of many different things, and one of those things might be chocolate, but it takes more than one ingredient to make fudge happen and it might not even involve chocolate. As such, it's incorrect to say that chocolate equals fudge, and it's certainly overreaching to say that chocolate equals chocolate fudge.
OAuth, in this metaphor, is chocolate. It's a versatile ingredient that is fundamental to a number of different things and can even be used on its own to great effect. Authentication is more like fudge. There are at least a few ingredients that must brought together in the right way to make it work, and OAuth can be one of these ingredients (perhaps the main ingredient) but it doesn't have to be involved at all. You need a recipe that says what to combine and how to combine them, and there are a large number of different recipes that say how that can be accomplished.
And in fact, there are a number of well-known recipes out there for doing this with specific providers, like Facebook Connect, Sign In With Twitter, and OpenID Connect (which powers Google's sign-in system, among others). These recipes each add a number of items, such as a common profile API, to OAuth to create an authentication protocol. Can you build an authentication protocol without OAuth? Of course, there are many kinds out there, just as there are many kinds of non-chocolate fudge to be had out there. But what we're here to talk about today is specifically authentication built on top of OAuth 2.0, what can go wrong, and how it can be made secure and delicious.
Even though it's very possible to use OAuth to build an authentication protocol, there are a number of things that tend to trip up those who do so, either on the side of the identity provider or on the side of the identity consumer. The practices described in this article are intended to inform potential identity providers of common risks as well as inform consumers of common mistakes that they can avoid when using an OAuth-based authentication system.
Since an authentication usually occurs ahead of the issuance of an access token, it is tempting to consider reception of an access token of any type proof that such an authentication has occurred. However, mere possession of an access token doesn't tell the client anything on its own. In OAuth, the token is designed to be opaque to the client, but in the context of a user authentication, the client needs to be able to derive some information from the token.
This problem stems from the fact that the client is not the intended audience of the OAuth access token. Instead, it is the authorized presenter of that token, and the audience is in fact the protected resource. The protected resource is not generally going to be in a position to tell if the user is still present by the token alone, since by the very nature and design of the OAuth protocol the user will not be available on the connection between the client and protected resource. To counter this, there needs to be an artifact that is directed at the client itself. This could be done by dual-purposing the access token, defining a format that the client could parse and understand. However, since general OAuth does not define a specific format or structure for the access token itself, protocols like OpenID Connect's ID Token and Facebook Connect's Signed Response provide a secondary token along side the access token that communicates the authentication information directly to the client. This allows the primary access token to remain opaque to the client, just like in regular OAuth.
Since the access token can be traded for a set of user attributes, it is tempting to think that posession of a valid access token is enough to prove that a user is authenticated. This assumption turns out to be true in some cases, where the token was freshly minted in the context of a user being authenticated at the authorization server. However, that's not the only way to get an access token in OAuth. Refresh tokens and assertions can be used to get access tokens without the user being present, and in some cases access grants can occur without the user having to authenticate at all.
Furthermore, the access token will generally be usable long after the user is no longer present. Remember, since OAuth is a delegation protocol, this is fundamental to its design. This means that if a client wants to make sure that an authentication is still valid, it's not sufficient to simply trade the token for the user's attributes again because the OAuth protected resource, the identity API, often has no way of telling if the user is there or not.
An additional (and very dangerous) threat occurs when clients accept access tokens from sources other than the return call from the token endpoint. This can occur for a client that uses the implicit flow (where the token is passed directly as a parameter in the URL hash) and don't properly use the OAuth state
parameter. This issue can also occur if different parts of an application pass the access token between components in order to "share" access among them. This is problematic because it opens up a place for access tokens to potentially be injected into an application by an outside party (and potentially leak outside of the application). If the client application does not validate the access token through some mechanism, it has no way of differentiating between a valid token and an attack token.
This can be mitigated by using the authorization code flow and only accepting tokens directly from the authorization server's token endpoint, and by using a state
value that is unguessable by an attacker.
Another problem with trading the access token for a set of attributes to get the current user is that most OAuth APIs do not provide any mechanism of audience restriction for the returned information. In other words, it is very possible to take a naive client, hand it the (valid) token from another client, and have the naive client treat this as a "log in" event. After all, the token is valid and the call to the API will return valid user information. The problem is of course that the user hasn't done anything to prove that they're present, and in this case they haven't even authorized the naive client.
This problem can be mitigated by communicating the authentication information to a client along with an identifier that the client can recognize and validate, allowing the client to differentiate between an authentication for itself versus an authentication for another application. It is also mitigated by passing the set of authentication information directly to the client during the OAuth process instead of through a secondary mechanism such as an OAuth protected API, preventing a client from having an unknown and untrusted set of information injected later in the process.
If an attacker is able to intercept or coopt one of the calls from the client, it could alter the content of the returned user information without the client being able to know anything was amiss. This would allow an attacker to impersonate a user at a naive client by simply swapping out a user identifier in the right call sequence. This can be mitigated by getting the authentication information directly from the identity provider during the authentication protocol process (such as along side the OAuth token) and by protecting the authentication information with a verifiable signature.
One of the biggest problems with OAuth-based identity APIs is that even when using a fully standards-compliant OAuth mechanism, different providers will inevitably implement the details of the actual identity API differently. For example, a user's identifier might be found in a user_id
field in one provider but in the subject
field in another provider. Even though these are semantically equivalent, they would require two separate code paths to process. In other words, while the authorization may happen the same way at each provider, the conveyance of the authentication information could be different. This problem can be mitigated by providers using a standard authentication protocol built on top of OAuth so that no matter where the identity information is coming from, it is transmitted in the same way.
This problem occurs because the mechanisms for conveying authentication information discussed here are explicitly left out of scope for OAuth. OAuth defines no specific token format, defines no common set of scopes for the access token, and does not at all address how a protected resource validates an access token.
OpenID Connect is an open standard published in early 2014 that defines an interoperable way to use OAuth 2.0 to perform user authentication. In essence, it is a widely published recipe for chocolate fudge that has been tried and tested by a wide number and variety of experts. Instead of building a different protocol to each potential identity provider, an application can speak one protocol to as many providers as they want to work with. Since it's an open standard, OpenID Connect can be implemented by anyone without restriction or intellectual property concerns.
OpenID Connect is built directly on OAuth 2.0 and in most cases is deployed right along with (or on top of) an OAuth infrastructure. OpenID Connect also uses the JSON Object Signing And Encryption (JOSE) suite of specifications for carrying signed and encrypted information around in different places. In fact, an OAuth 2.0 deployment with JOSE capabilities is already a long way to defining a fully compliant OpenID Connect system, and the delta between the two is relatively small. But that delta makes a big difference, and OpenID Connect manages to avoid many of the pitfalls discussed above by adding several key components to the OAuth base:
The OpenID Connect ID Token is a signed JSON Web Token (JWT) that is given to the client application along side the regular OAuth access token. The ID Token contains a set of claims about the authentication session, including an identifier for the user (sub
), the identifier for the identity provider who issued the token (iss
), and the identifier of the client for which this token was created (aud
). Additionally, the ID Token contains information about the token's valid (and usually short) lifetime as well as any information about the authentication context to be conveyed to the client, such as how long ago the user was presented with a primary authentication mechanism. Since the format of the ID Token is known by the client, it is able to parse the content of the token directly and obtain this information without relying on an external service to do so. Furthermore, it is issued in addition to (and not in lieu of) an access token, allowing the access token to remain opaque to the client as it is defined in regular OAuth. Finally, the token itself is signed by the identity provider's private key, adding an additional layer of protection to the claims inside of it in addition to the TLS transport protection that was used to get the token in the first place, preventing a class of impersonation attacks. By applying a few simple checks to this ID token, a client can protect itself from a large number of common attacks.
Since the ID Token is signed by the authorization server, it also provides a location to add detached signatures over the authorization code (c_hash
) and access token (at_hash
). These hashes can be validated by the client while still keeping the authorization code and access token content opaque to the client, preventing a whole class of injection attacks.
It should be noted that clients are not required to use the access token, since the ID Token contains all the necessary information for processing the authentication event. However, in order to provide compatibility with OAuth and match the general tendency for authorizing identity and other API access in parallel, OpenID Connect always issues the ID token along side an OAuth access token.
In addition to the claims in the ID Token, OpenID Connect defines a standard protected resource that contains claims about the current user. The claims here are not part of the authentication process, as discussed above, but instead provide bundled identity attributes that make the authentication protocol more valuable to application developers. After all, it's preferable to say "Good Morning, Jane Doe" instead of "Good Morning, 9XE3-JI34-00132A". OpenID Connect defines a set of standardized OAuth scopes that map to subsets of these attributes: profile
, email
, phone
, and address
, allowing plain OAuth authorization requests to carry the necessary information for a request. OpenID Connect defines a special openid
scope that switches on the issuance of the ID token as well as access to the UserInfo Endpoint by the access token. The OpenID Connect scopes can be used along side other non-OpenID-Connect OAuth scopes without conflict, and the access token issued can potentially be targeted at several different protected resources. This allows an OpenID Connect identity system to smoothly coexist with an OAuth authorization system.
OAuth 2.0 was written to allow a variety of different deployments, but by design does not specify how these deployments come to be set up or how the components know about each other. This is OK in the regular OAuth world where one authorization server protects a specific API, and the two are closely coupled. With OpenID Connect, a common protected API is deployed across a wide variety of clients and providers, all of which need to know about each other to operate. It would not be scalable for each client to have to know ahead of time about each provider, and it would be even less scalable to require each provider to know about each potential client.
To counteract this, OpenID Connect defines a discovery protocol that allows clients to easily fetch information on how to interact with a specific identity provider. On the other side of the transaction, OpenID Connect defines a client registration protocol that allows clients to be introduced to new identity providers. By using these two mechanisms and a common identity API, OpenID Connect can function at internet scale, where no parties have to know about each other ahead of time.
Even with all of this robust authentication capability, OpenID Connect is (by design) still compatible with plain OAuth 2.0, making it a very good choice to deploy on top of an OAuth system with minimal developer effort. In fact, if a service is already using OAuth and the JSON Object Signing and Encryption (JOSE) specifications (including JWT), that service is already well on its way to supporting OpenID Connect already.
To facilitate the building of good client applications, the OpenID Connect working group has published documents on building a basic OpenID Connect client using the authorization code flow as well as building an implicit OpenID Connect client. Both of these documents walk the developer through building a basic OAuth 2.0 client and adding the handful of components necessary for OpenID Connect.
While the core specification is fairly straightforward, not all use cases can be adequately addressed by the base mechanisms. To support advanced use cases including higher security deployments, OpenID Connect also defines a number of optional advanced capabilities beyond standard OAuth, including the following (among others):