I recently started investigating authentication and quickly realized that the space is a mire of acronyms, overlapping terms, and imprecise language making the whole area very confusing. In this article, I explain what I learned and walk through some of the terminology and concepts. However, before we get into any details about authentication, I will quickly touch on the difference between authentication and authorization.
Authentication vs. Authorization
In short, authentication is the process of identifying you, while authorization is the process of determining what you have access to. My favorite example of authentication and authorization is someone entering a bar. When you attempt to enter a bar, the bouncer will ask to see your ID. When the bouncer verifies that it is a valid ID and that you match the picture on your ID, this is authentication. The bouncer is determining who you are. Next, the bouncer will use your date of birth on your ID to determine if you are authorized to enter the bar. If the date is at least 21 years prior, the bouncer will let you in.
This is as much as I’m going to say about authorization here, so if you want to learn more, check out my other blog post on authorization.
For the purposes of this article, by authentication, I mean logging in to an application and all of the pieces involved. More specifically, we will focus on the use case where there might be an outside authentication authority (such as with SSO).
For this to make sense, I find it helpful to look at this from a historical perspective. When many (still-used) authentication systems were initially written and set up, they were designed for entirely on-prem use cases. Companies had employees (and schools had students and faculty). These users needed to access various servers and applications on the network but didn’t want to remember many different passwords. Meanwhile, the network administrator wanted to be able to revoke access when these users left the company or graduated. Ideally, this happens in a single location so that the network administrator doesn’t have to remove them from many systems. So far, this sounds a lot like modern use cases.
The solutions, however, were slightly different because of the single network. Early on, same-sign on was very common. Same-sign on (NOT the same as single sign-on) works in one of two ways¹. Either all passwords are synced between your systems so that users will have the same password everywhere, or all of your applications point to a single directory service. This second one starts to sound a lot more like SSO but is a little different. With same-sign on, each application still performs its own authentication, but they all just happen to point to the same directory service, which stores users and passwords. The most common implementation of this is through the LDAP protocol². These directory services are more than just password stores and include actual user directories. They were first introduced to tech by telecommunications (the folks responsible for phone books) in the 1980s.
¹ It’s worth noting that password vaulting (or using password managers) is also occasionally called same-sign on, but it is an entirely separate solution.
² Note that LDAP is a protocol, not a product. Many different products implement the LDAP protocol and work in basically the same way.
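To make the same-sign on idea concrete, here is a minimal sketch in Python. The `DirectoryService` and `Application` classes are hypothetical stand-ins (a real deployment would talk to a directory over LDAP rather than to an in-memory dictionary). The point is only that every application runs its own login, yet all of them check the password against one shared store.

```python
import hashlib
import hmac
import os


class DirectoryService:
    """Toy stand-in for a shared directory; not a real LDAP client."""

    def __init__(self):
        self._users = {}  # username -> (salt, password hash)

    def add_user(self, username, password):
        salt = os.urandom(16)
        self._users[username] = (salt, self._hash(password, salt))

    def remove_user(self, username):
        self._users.pop(username, None)

    def check_password(self, username, password):
        record = self._users.get(username)
        if record is None:
            return False
        salt, expected = record
        return hmac.compare_digest(expected, self._hash(password, salt))

    @staticmethod
    def _hash(password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


class Application:
    """Each application performs its own login against the shared directory."""

    def __init__(self, name, directory):
        self.name = name
        self.directory = directory

    def login(self, username, password):
        # The user types the password into *this* application every time.
        return self.directory.check_password(username, password)


directory = DirectoryService()
directory.add_user("alice", "correct horse")
mail = Application("mail", directory)
wiki = Application("wiki", directory)

print(mail.login("alice", "correct horse"))  # True
print(wiki.login("alice", "correct horse"))  # True
directory.remove_user("alice")               # one place to revoke access
print(mail.login("alice", "correct horse"))  # False
```

Notice that the user still enters a password in every application, which is exactly what separates this from single sign-on.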
As may be obvious, this entire scheme is not incredibly secure. However, this mattered a lot less because all of these applications and servers lived on the same network, were controlled by the same entity, and generally didn’t involve trusting random third parties. Around the same time in the 80s, MIT developed the Kerberos system, which was intended to solve these privacy and security concerns. Kerberos is an SSO system that allows distinct servers to identify and trust each other over an insecure network. With Kerberos, the user signs into an authenticator that provides the authentication information to other systems. It uses many layers of encryption to protect against things like man-in-the-middle attacks (which, while commonplace now, were largely theoretical at the time). While it could be used over the internet, much of the required secret sharing is impractical, to say the least, so it is almost exclusively used within a single domain. It’s worth noting that Active Directory (AD), as well as other systems, was built on top of Kerberos.
Once Software as a Service (SaaS) started to become a thing, authentication over the internet became a real problem. Specifically, companies still wanted a lot of the central control they were used to, even with these distributed 3rd party applications. At the same time, as we mentioned before, Kerberos and other existing systems were not well suited to authentication across multiple domains. As a result, Security Assertion Markup Language (SAML) was first developed in 2002, and SAML 2.0, which is still largely the standard today, was released in 2005. SAML defines how authentication and authorization information is exchanged between two parties. It includes both the messages and formats and the flow used to authorize and authenticate a user. SAML identifies three key components: the Identity Provider (IdP), the Service Provider, and the Principal. The principal is the entity (typically a user) that tries to gain access to the Service Provider. A Service Provider is whatever application the principal wants to use (Split is a service provider, as are Box, Slack, and many others). The Identity Provider is the authority on who the principal is and what they should have access to. For a more in-depth look at SAML and how SSO flows work, read this article (or one of the many other available explanations and articles about SSO).
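The three roles can be sketched in a few lines of Python. This is not real SAML: actual assertions are XML documents signed with the IdP's key pair, whereas this sketch swaps in JSON and an HMAC over a shared secret to stay short. All class and field names here are made up for illustration.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass


@dataclass
class Assertion:
    body: str       # serialized statement about the principal
    signature: str  # lets the service provider check who issued it


class IdentityProvider:
    def __init__(self, signing_key: bytes, users: dict):
        self._key = signing_key
        self._users = users  # principal name -> attributes

    def issue_assertion(self, principal: str, audience: str) -> Assertion:
        if principal not in self._users:
            raise PermissionError(f"unknown principal: {principal}")
        body = json.dumps(
            {"subject": principal, "audience": audience,
             "attributes": self._users[principal]},
            sort_keys=True,
        )
        signature = hmac.new(self._key, body.encode(), hashlib.sha256).hexdigest()
        return Assertion(body, signature)


class ServiceProvider:
    def __init__(self, name: str, trusted_idp_key: bytes):
        self.name = name
        self._key = trusted_idp_key

    def accept(self, assertion: Assertion) -> dict:
        expected = hmac.new(
            self._key, assertion.body.encode(), hashlib.sha256
        ).hexdigest()
        if not hmac.compare_digest(expected, assertion.signature):
            raise PermissionError("assertion was not issued by the trusted IdP")
        statement = json.loads(assertion.body)
        if statement["audience"] != self.name:
            raise PermissionError("assertion was issued for a different service")
        return statement


key = b"demo-only-shared-secret"  # assumption: stands in for the IdP's signing key
idp = IdentityProvider(key, {"alice": {"department": "engineering"}})
sp = ServiceProvider("example-app", key)

statement = sp.accept(idp.issue_assertion("alice", audience="example-app"))
print(statement["subject"])  # alice
```

The service provider never sees a password. It only decides whether it trusts the statement the IdP made about the principal.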
I want to mention one more protocol: OpenID Connect (OIDC). OIDC fills much the same role as SAML but is a separate, newer protocol built on top of OAuth 2.0. Developers typically find OIDC easier to implement because it uses REST and JSON (rather than XML and SOAP or basic HTTP). Based on that, one might conclude that the industry trend is to move toward OIDC and away from SAML, and OIDC has indeed gained a lot of traction with social, gaming, and mobile applications. However, nearly all government and enterprise applications still support primarily or exclusively SAML, so SAML isn’t going away soon.
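As a small illustration of the "JSON rather than XML" point: an OIDC ID token is a JSON Web Token, meaning three base64url segments (header, payload, signature) joined by dots. The sketch below only peeks at the first two segments of a hand-built example token. The helper names and the token contents are assumptions for illustration, and a real relying party must verify the signature before trusting anything in the payload.

```python
import base64
import json


def encode_segment(value: dict) -> str:
    """Serialize a dict as unpadded base64url JSON, as JWT segments are."""
    raw = json.dumps(value, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def decode_segment(segment: str) -> dict:
    """Reverse of encode_segment: restore padding, then parse the JSON."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def peek_id_token(token: str):
    """Return (header, payload) WITHOUT verifying the signature."""
    header_b64, payload_b64, _signature = token.split(".")
    return decode_segment(header_b64), decode_segment(payload_b64)


# Hand-built example token; the signature segment is a placeholder.
example_token = ".".join([
    encode_segment({"alg": "RS256", "typ": "JWT"}),
    encode_segment({
        "iss": "https://idp.example.com",  # who issued the token
        "sub": "user-123",                 # who the token is about
        "aud": "example-app",              # which application it is for
        "exp": 1700000000,                 # expiry as a Unix timestamp
    }),
    "placeholder-signature",
])

header, payload = peek_id_token(example_token)
print(header["alg"])   # RS256
print(payload["sub"])  # user-123
```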
Both OIDC and SAML use claims-based authentication. This means that when they authenticate, they also return information about the user known as claims. These claims are commonly used to make authorization decisions but are not directly authorization information. Going back to the bar and bouncer example, when I present my ID, my birthdate is a claim. The bouncer uses that claim to authorize me to enter, but the claim itself says nothing about whether I can enter this particular bar; it only gives information about me.
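The bouncer example translates almost directly into code. In this sketch (all names are hypothetical), the `Claims` object is what authentication hands back, and `bar_allows_entry` is a separate authorization rule that belongs to the bar, not to the ID.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claims:
    """Facts about the person, as asserted by whoever issued the ID."""
    name: str
    birthdate: date


def age_on(birthdate: date, today: date) -> int:
    years = today.year - birthdate.year
    # Subtract a year if this year's birthday has not happened yet.
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years


def bar_allows_entry(claims: Claims, today: date, minimum_age: int = 21) -> bool:
    """Authorization: the bar's own policy, applied to the claims."""
    return age_on(claims.birthdate, today) >= minimum_age


patron = Claims(name="Sam", birthdate=date(2000, 6, 15))
print(bar_allows_entry(patron, today=date(2021, 6, 14)))  # False
print(bar_allows_entry(patron, today=date(2021, 6, 15)))  # True
```

A different venue could apply a different `minimum_age` to exactly the same claims, which is why the claim is information about the person rather than a permission.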
Now that we’ve discussed a lot of the history and protocols, what does the architecture look like? I’m going to ignore the direct login case because typically that isn’t very complex. We will instead focus on the situations involving external identity providers. For better or worse, this is also rooted in history. As mentioned in the SAML explanation earlier, for SSO, we have a Service Provider and an Identity Provider. That sounds easy enough, so the architecture should look something like this:
This simplifies the SSO flow, but essentially, the service provider is asking the IdP for authentication. Realistically though, there are at least a handful of different IdPs depending on which organization the particular user belongs to. So that diagram probably looks a little more like:
For now, let’s assume all of these IdPs are cloud-based IdPs (sometimes called Identity as a service or IDaaS providers).
Some prominent cloud providers include Azure AD, Okta, and OneLogin (not to mention social logins like Google, Facebook, or LinkedIn).
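One common way for a service provider to pick among several IdPs is to look at the domain of the email address the user enters. That mechanism is an assumption of this sketch rather than something the protocols require, and the domains and URLs below are placeholders.

```python
from typing import Optional

# Hypothetical mapping from an organization's email domain to its IdP.
IDP_BY_DOMAIN = {
    "example.com": "https://idp.example.com/sso",
    "example.org": "https://login.example.org/saml",
}


def find_idp(email: str, idp_by_domain: dict = IDP_BY_DOMAIN) -> Optional[str]:
    """Return the IdP URL for this user's organization, or None if unknown."""
    _local, separator, domain = email.rpartition("@")
    if not separator or not domain:
        raise ValueError(f"not an email address: {email!r}")
    return idp_by_domain.get(domain.lower())


print(find_idp("alice@example.com"))  # https://idp.example.com/sso
print(find_idp("bob@unknown.test"))   # None
```

A `None` result would typically mean falling back to direct login.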
Now, let’s look at the Service Provider side. It’s entirely possible that the service provider implemented their own authentication framework and has implemented the service provider side of SAML. They may have even implemented the relying party portion of OIDC. However, SAML is a pain to deal with, so it’s becoming more and more common for service providers to instead integrate an authentication authority. Authentication authorities are also commonly referred to as Customer Identity and Access Management (CIAM) providers. This authentication authority is then responsible for maintaining integrations with all major identity providers and providing a unified interface for the service provider. This unified interface often involves an SDK focusing on ease of use. Common authentication authorities include Auth0, FusionAuth, and PingFederate. This authentication authority can either run in the service provider’s data centers, or it can run in the authentication authority’s data centers. Either way, it’s really part of the service provider’s offering. So now our diagram looks something like this:
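In code terms, the "unified interface" idea might be sketched as follows. Everything here is hypothetical: the connector classes fake the protocol work with a simple string check, and no real vendor SDK looks exactly like this. What it shows is that the service provider calls one `login` method and never touches SAML or OIDC details itself.

```python
from typing import Protocol


class IdpConnector(Protocol):
    def authenticate(self, credentials: dict) -> dict:
        """Return a normalized user profile, or raise PermissionError."""
        ...


class FakeSamlConnector:
    def authenticate(self, credentials: dict) -> dict:
        # A real connector would parse and verify a SAML response here.
        if credentials.get("saml_response") != "valid":
            raise PermissionError("SAML response rejected")
        return {"user_id": "alice@corp.example", "protocol": "saml"}


class FakeOidcConnector:
    def authenticate(self, credentials: dict) -> dict:
        # A real connector would verify an OIDC ID token here.
        if credentials.get("id_token") != "valid":
            raise PermissionError("ID token rejected")
        return {"user_id": "bob@startup.example", "protocol": "oidc"}


class AuthenticationAuthority:
    """Single entry point the service provider integrates with."""

    def __init__(self):
        self._connectors = {}

    def register(self, organization: str, connector: IdpConnector) -> None:
        self._connectors[organization] = connector

    def login(self, organization: str, credentials: dict) -> dict:
        connector = self._connectors.get(organization)
        if connector is None:
            raise LookupError(f"no identity provider configured for {organization}")
        return connector.authenticate(credentials)


authority = AuthenticationAuthority()
authority.register("corp", FakeSamlConnector())
authority.register("startup", FakeOidcConnector())

print(authority.login("corp", {"saml_response": "valid"})["user_id"])
print(authority.login("startup", {"id_token": "valid"})["protocol"])
```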
Now, the reality is that authentication is only half of the story. The other half is user management. Specifically, provisioning users, de-provisioning users, and maintaining security groups. While this is not technically authentication, it goes hand in hand, since the identity provider is the authority on user identity. Additionally, enterprise administrators use user management to control which users have access to various applications (authorization). Together, these two are referred to as Federated Identity. Federated identity is a somewhat purposely vague term but encompasses everything involved with linking electronic identities across Identity and Access Management systems (IAMs)¹.
¹ It’s worth noting that Identity and Access Management is sometimes abbreviated as IdAM and is also referred to as Identity Management (IdM). Furthermore, AWS IAM (a common usage of the term IAM) is actually the AWS-specific feature that implements IAM and is completely separate from everything else we’re discussing here.
With all of that in mind, the federated identity picture actually looks something like this:
Now, let’s focus on the Enterprise side. An organization may manage all of its identity directly through one of these cloud providers, or it may use the cloud provider for only its 3rd party SaaS applications and have a different login option for on-prem solutions. In either of those cases, this diagram is the whole story. However, the reality is that most enterprise companies, especially the older ones, started with an on-prem solution, later added the cloud provider, and want to keep those two systems connected. In this case, the organization needs to link the on-prem solution with the cloud solution. One way this is done is through a sync client of some sort:
For these setups, the sync client is responsible for sending all the directory updates from the on-prem solution to the cloud provider. This portion would actually be a same-sign on approach (making the entire picture a hybrid between same-sign on and SSO). We copy user information first, then the user actually authenticates with one location or the other (either the original on-prem solution or with the IDaaS provider). A few examples of these sync clients include Azure AD Connect (which connects on-prem AD with Azure AD), Okta’s AD integration (which connects AD with Okta), and Google Cloud Directory Sync (which connects AD or LDAP servers with Google Cloud).
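At its core, a sync client compares what the on-prem directory says with what the cloud provider currently has, then pushes the difference. The sketch below models both sides as plain dictionaries, which is an assumption for illustration; real sync clients such as the ones named above work against the vendors' own APIs and handle far more edge cases.

```python
def plan_sync(on_prem: dict, cloud: dict) -> dict:
    """Work out what must change in the cloud to match the on-prem directory."""
    return {
        "create": {u: on_prem[u] for u in on_prem.keys() - cloud.keys()},
        "update": {
            u: on_prem[u]
            for u in on_prem.keys() & cloud.keys()
            if on_prem[u] != cloud[u]
        },
        "delete": sorted(cloud.keys() - on_prem.keys()),
    }


def apply_sync(plan: dict, cloud: dict) -> dict:
    """Return the cloud directory as it looks after applying the plan."""
    updated = dict(cloud)
    updated.update(plan["create"])
    updated.update(plan["update"])
    for username in plan["delete"]:
        updated.pop(username, None)
    return updated


on_prem = {
    "alice": {"groups": ["engineering"]},
    "bob": {"groups": ["sales", "managers"]},
}
cloud = {
    "bob": {"groups": ["sales"]},    # stale group membership
    "carol": {"groups": ["legal"]},  # left the company
}

plan = plan_sync(on_prem, cloud)
print(sorted(plan["create"]))  # ['alice']
print(sorted(plan["update"]))  # ['bob']
print(plan["delete"])          # ['carol']
print(apply_sync(plan, cloud) == on_prem)  # True
```

The direction is one-way here (on-prem is the source of truth), which matches the description of the sync client sending directory updates to the cloud provider.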
Another way to link on-prem with an IDaaS provider is through a Security Token Service (STS). There are other examples, but the most common STS for this scenario is AD FS (Active Directory Federation Services). In this case, the AD FS server is run on-prem and communicates with the traditional AD for authentication, but then also communicates with the external server. So it looks something like this:
There are a few things worth noting here. Specifically, AD FS does not directly access any directory information and therefore cannot handle any user management. If that functionality is desired, a separate directory sync tool is necessary. It’s also worth noting that AD FS has a limited set of available protocols (older versions do not support OIDC, for example), so while it is possible to bypass the cloud provider portion of this diagram, it’s often still present. So you’d have your service provider talking to the cloud provider talking to AD FS talking to the on-prem AD deployment.
Another thing worth pointing out is that AD FS is not an IdP even though it’s often referred to that way. It does not directly contain any identity information and does not have access to it. Instead, it’s a Security Token Service (aka Secure Token Service or STS). This means that it’s responsible for issuing, validating, revoking, and renewing tokens, but it doesn’t contain any of the actual identity.
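The four responsibilities of an STS (issuing, validating, revoking, and renewing) can be shown with a toy in-memory class. This is a sketch under heavy assumptions: tokens are opaque random strings rather than signed documents, and the class and method names are invented. Note that the STS is handed a subject that something else (AD, in the AD FS case) has already authenticated; it stores no passwords or profile data of its own.

```python
import secrets
import time


class SecurityTokenService:
    """Toy STS: tracks tokens only, holds no identity data."""

    def __init__(self, lifetime_seconds=300, clock=time.time):
        self._lifetime = lifetime_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._tokens = {}    # token -> (subject, expires_at)

    def issue(self, subject):
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (subject, self._clock() + self._lifetime)
        return token

    def validate(self, token):
        """Return the subject for a live token, or None."""
        record = self._tokens.get(token)
        if record is None:
            return None
        subject, expires_at = record
        if self._clock() >= expires_at:
            del self._tokens[token]  # expired tokens are forgotten
            return None
        return subject

    def revoke(self, token):
        self._tokens.pop(token, None)

    def renew(self, token):
        """Swap a still-valid token for a fresh one."""
        subject = self.validate(token)
        if subject is None:
            raise ValueError("cannot renew an invalid or expired token")
        self.revoke(token)
        return self.issue(subject)


sts = SecurityTokenService(lifetime_seconds=300)
token = sts.issue("alice")     # "alice" was authenticated elsewhere
print(sts.validate(token))     # alice
fresh = sts.renew(token)
print(sts.validate(token))     # None (the old token was replaced)
sts.revoke(fresh)
print(sts.validate(fresh))     # None
```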
We already mentioned that the on-prem (yellow box) portion of this diagram is optional and that the authentication authority (pink box) is also optional. It turns out that the cloud provider (purple box) is also optional. It’s possible to directly hook up a service provider to AD FS for authentication. It’s also possible to use a sync client to sync either user or user and password information from the on-prem solution directly into a service provider. Skipping the cloud provider entirely isn’t super common, especially on the user management side, but it is possible.
Everything Comes Back to the IAM
Now that we have this rather complex (but complete) diagram, I want to point something out. The Authentication Authority and the cloud IAM (the pink and the purple boxes) can be, and often are, the exact same product. In fact, the entire on-prem solution, at least in the AD FS case, also represents basically the same functionality (and could be the same product apart from the historical nature of why there’s an on-prem piece at all). While discussing the pieces earlier, I listed different vendors because traditionally, different vendors have dominated in each section, but these solutions are now converging.
Traditionally, the authentication authorities focused on all types of authentication. They allow for direct login of several types and provide SSO integrations to nearly every other IdP available. They are now moving into the user management space, because service providers who want easy authentication solutions also want easy user management solutions and integrations to the same IAMs. The IDaaS providers traditionally focused on providing a single location for employees to log into all available 3rd party service providers. In addition to a way for users to log into them directly, they needed to handle user management to allow IT admins to add or remove users, as well as authorization capabilities to control which users had access to which service providers. They didn’t necessarily focus on being able to authenticate with upstream IdPs, but as enterprises came in with more and more complex setups, the capabilities were added. As you can see, both of these now handle direct login, user management (both directly and through integrations with external IdPs), and authentication using external IdPs. At the end of the day, these are both IAM solutions, so it makes sense that even though they started with different aspects of the problem, they would converge.
There’s a lot going on in the IAM space. A lot of where we are today is also deeply rooted in the history of the space, but things continue to change even now. Hopefully this blog was helpful in understanding the space today. It’ll be exciting to see where it evolves to next.