Tuesday, May 15, 2012

How Claims encoding works in SharePoint 2010

Copyright from: Wictor Wilén


If you have been using previous versions of SharePoint such as 2007, or have been working with .NET or just Windows, you should be familiar with (NETBIOS) user names formatted as DOMAIN\user (or provider:username for FBA in SharePoint). When SharePoint 2010 introduced the claims-based authentication model (CBA), these formats were not sufficient for all the different options needed. Therefore a new string format was invented to handle the different claims. The format might at first glance look a bit weird...

How does it work?

The claim encoding in SharePoint 2010 is an efficient and compact way to represent a claim type and claim value, compared to writing out all the qualified names for the claim types and values. I will illustrate how claims are encoded in SharePoint 2010, focusing on user names, but this claim encoding method could be used for basically any claim. Let's start with an illustrative drawing of the format and then walk through a couple of samples.

The format

The format is actually well defined in the SharePoint Protocol Specifications in the [MS-SPSTWS] document, read it if you want a dry and boring explanation, or continue to read this post...
The image below shows how claims are encoded in SharePoint 2010.
[Image: The SharePoint 2010 claim encoding format]
Let's start from the beginning. The first character must be an i for an identity claim; otherwise it has to be c. Note that the casing is important here. The second character must be a : and the third a 0. The third character is reserved for future use.
The interesting part starts at the fourth character. The fourth character tells us what type of claim it is, and the fifth what type of value. There are several possible claim types. The most common are: user logon name (#), e-mail (5), role (-), group SID (+) and farm ID (%). For the claim value type a string is normally used, and that is represented by a . character. The sixth character in the sequence represents the original issuer, and the format following the sixth character varies depending on the issuer. For Windows and Local STS the seventh character is a pipe character (|) followed by the claim value. The rest of the original issuers have two values separated by pipe characters: the name of the original issuer and then the claim value. Easy, huh?
Note: the f (Forms AuthN) original issuer is not documented in the protocol specs, and this is what SharePoint uses when dealing with membership providers (instead of m and r). For more info see SPOriginalIssuerType.
For a full reference of claim types and claim value types, look into the [MS-SPSTWS] documentation.
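To make the character positions concrete, here is a minimal Python sketch of a decoder for the layout described above. It is not SharePoint's own implementation. The function and class names are made up for illustration, and the issuer letters w (Windows) and s (Local STS) are assumptions taken from the format diagram rather than from the text, so check them against [MS-SPSTWS] before relying on them.

```python
from dataclasses import dataclass
from typing import Optional

# Claim type characters named in the post; the full table is in [MS-SPSTWS].
CLAIM_TYPES = {
    "#": "user logon name",
    "5": "e-mail",
    "-": "role",
    "+": "group SID",
    "%": "farm ID",
}

# Assumption: "w" (Windows) and "s" (Local STS) are the two issuers that are
# followed directly by the claim value, with no issuer name in between.
ISSUERS_WITHOUT_NAME = {"w", "s"}


@dataclass
class EncodedClaim:
    is_identity: bool           # True for "i", False for "c"
    claim_type: str             # 4th character
    value_type: str             # 5th character, "." for a string
    issuer_type: str            # 6th character
    issuer_name: Optional[str]  # None for Windows and Local STS
    value: str


def parse_claim(encoded: str) -> EncodedClaim:
    """Split an encoded claim string into its parts (hypothetical helper)."""
    if len(encoded) < 7 or encoded[0] not in ("i", "c"):
        raise ValueError("claim must start with a lowercase 'i' or 'c'")
    if encoded[1] != ":" or encoded[2] != "0":
        raise ValueError("second and third characters must be ':' and '0'")
    if encoded[6] != "|":
        raise ValueError("seventh character must be '|'")

    issuer_type = encoded[5]
    rest = encoded[7:]
    if issuer_type in ISSUERS_WITHOUT_NAME:
        issuer_name, value = None, rest
    else:
        issuer_name, separator, value = rest.partition("|")
        if not separator:
            raise ValueError("expected '<issuer name>|<claim value>'")

    return EncodedClaim(
        is_identity=(encoded[0] == "i"),
        claim_type=encoded[3],
        value_type=encoded[4],
        issuer_type=issuer_type,
        issuer_name=issuer_name,
        value=value,
    )


# Hypothetical user name, not taken from the post.
claim = parse_claim(r"i:0#.w|contoso\alice")
claim_type_name = CLAIM_TYPES.get(claim.claim_type, "unknown")
```

Claim type characters that are missing from the small lookup table simply come back as "unknown" here; the parser itself only cares about positions.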
(Added 2012-02-13) If you are creating custom claim providers or using a trusted provider (as original issuer), you will see some "undocumented" values in the claim type (4th) position (that is, they are not documented in the protocol specs). The most common character to see here is ǵ (0x01F5). If the claim encoding mechanism in SharePoint cannot find a claim type, it automatically creates a claim type encoding for that claim. It always starts with the value 500 and increments that value by 1, which results in 501. 501 is 01F5 in hex, which is the code point of that character. It continues to increase the value for each new (and to SharePoint not already defined) claim type. The important thing to remember here is that these claim types and their encodings are not the same across farms; it all depends on the order in which the new claim types are added/used. (All this is stored in a persisted object in the configuration database.)
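The arithmetic behind that odd character is easy to check. The small Python sketch below only reproduces the counting described above (start at 500, add 1 per new claim type); the function name is hypothetical and this is not how SharePoint stores the mapping.

```python
# Starting counter described in the post; the first custom claim type
# gets BASE + 1.
BASE = 500


def custom_claim_type_char(n: int) -> str:
    """Character assigned to the n-th custom claim type (n starts at 1)."""
    if n < 1:
        raise ValueError("n must be 1 or greater")
    return chr(BASE + n)


first = custom_claim_type_char(1)
first_hex = format(ord(first), "04X")  # code point as four hex digits
```

Since the counter simply increases, the second custom claim type registered in a farm gets code point 502, the third 503, and so on, which is why the same claim type can end up with different characters in different farms.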
Some notes: the total length must not exceed 255 characters and you need to HTML encode characters such as %, :, ; and | in the claim values.
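As a rough illustration of these two notes, here is a small Python sketch. The post does not show what the encoded characters look like, so the percent-style escapes used below (%25, %3A, %3B, %7C) are an assumption that should be verified against [MS-SPSTWS]. The helper names and the role name are hypothetical.

```python
MAX_ENCODED_LENGTH = 255

# Assumption: the four reserved characters are written as percent-style
# escapes. Verify the exact form against [MS-SPSTWS].
ESCAPES = {"%": "%25", ":": "%3A", ";": "%3B", "|": "%7C"}


def escape_claim_value(value: str) -> str:
    """Escape reserved characters one at a time, so '%' is never escaped twice."""
    return "".join(ESCAPES.get(ch, ch) for ch in value)


def check_encoded_length(encoded: str) -> str:
    """Return the encoded claim unchanged, or raise if it is too long."""
    if len(encoded) > MAX_ENCODED_LENGTH:
        raise ValueError("encoded claim exceeds 255 characters")
    return encoded


# Hypothetical role name containing a pipe character.
escaped = escape_claim_value("sales|emea")
encoded = check_encoded_length("c:0-.t|azure|" + escaped)
```

Escaping the value before assembling the string matters because an unescaped | inside a claim value would be indistinguishable from the separator between issuer name and value.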

Some samples

If this wasn't clear enough, let's look at a few samples.
Standard Windows claim.
[Image: Windows claim]
Another common claim. This time it's not an identity claim but an identity provider claim, and this is how NT AUTHORITY\Authenticated Users is represented.
[Image: Authenticated users claim]
This is how a Windows Security Group is represented as a claim. The value represents the SID of the group.
[Image: Security Group claim]
If we're using federated authentication (as in the Azure AuthN series I've written) we can see claims like this. It's an e-mail claim from a trusted issuer called Azure.
[Image: E-mail claim]
Here's how a claim can be encoded if we have a role called facebook in the trusted issuer with the name Azure.
[Image: Role claim]
This final example shows what the encoded claim for the Local Farm looks like. It's a Farm ID claim from the system Claim Provider and the claim value is the ID of the farm.
[Image: Farm claim]
This is what a forms authenticated user claim looks like.
[Image]
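Since the sample images are not reproduced here, the Python sketch below lists hypothetical strings that match the descriptions above. None of the user names, SIDs, GUIDs or provider names come from the original samples. The ! claim type character and the windows value for the authenticated users claim, as well as the issuer letters w, s, t and c, are assumptions not stated in the text, so treat the whole list as illustrative only.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins for the samples described above.
SAMPLES: List[Tuple[str, str]] = [
    ("Windows user", r"i:0#.w|contoso\alice"),
    # Assumption: "!" is the identity provider claim type character.
    ("Authenticated users", "c:0!.s|windows"),
    ("Windows security group", "c:0+.w|s-1-5-21-1111111111-2222222222-3333333333-1234"),
    ("E-mail from trusted issuer", "i:05.t|azure|alice@contoso.com"),
    ("Role from trusted issuer", "c:0-.t|azure|facebook"),
    ("Local farm", "c:0%.c|system|00000000-0000-0000-0000-000000000000"),
    ("Forms user", "i:0#.f|membership|alice"),
]


def breakdown(encoded: str) -> Dict[str, object]:
    """Pick out the fixed positions and split the rest on the pipe character."""
    return {
        "identity": encoded[0] == "i",
        "claim_type": encoded[3],
        "value_type": encoded[4],
        "issuer": encoded[5],
        "fields": encoded[7:].split("|"),
    }


for label, encoded in SAMPLES:
    parts = breakdown(encoded)
    print(f"{label}: {encoded} -> {parts['fields']}")
```

Note how the Windows and Local STS samples have a single field after the seventh character, while the others have an issuer name followed by the value.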


I hope this little post showed you all the magic behind the claims encoding in SharePoint. It's quite logical... yeah, really.


ChrisAlm said...

If you're doing content database log shipping to a remote zombie type farm at a DR site, and attaching those content databases and bringing the web application online in a DR scenario.....what happens with the security setting applied within the sites? Does the security get all messed up because that encoding hashtable is different at the DR site??

Paul Beck said...

I have referenced this post on my blog. I don't want to rehash your work, and it is useful for seeing how the claim is formatted.