About SharePoint OnPrem Multitenancy

As per https://docs.microsoft.com/en-us/sharepoint/what-s-new/what-s-deprecated-or-removed-from-sharepoint-server-2019 it is clear that multi tenancy is supported in SPS 2016 and deprecated with 2019. But…

On-prem multi-tenancy is a configuration that is extremely difficult to set up and maintain.  We’ve been strongly discouraging customers from using this configuration unless they already deployed it during 2010/2013.  And if they have, then upgrading to a new version of SharePoint should be looked at as an opportunity to get off this configuration.  We only offer “limited support” for this configurations both SPS2013 and SPS2016 because the out-of-box experience is incomplete, so customers end up having to write a lot of custom code to create a good end-to-end experience.

 

Advertisement

Site Retention Policies keep sending notification emails to end users even postponed sites

Problem definition:

You are using Site Retention Policies in standard SharePoint 2013/2016 teamsites. The feature works as expected although if site owners extend their teamsite, the site keeps sending notifications about “the site is about to expire and will be deleted”. As it displays already the new deletion date which is one year ahead.

Well, out of the box SharePoint behaviour has been designed that any postpone should be short term and that postpone duration SharePoint keep notify you, even if the site has been postponed.But there is a glich in this design, the site owners can postpone a site over years. Of course in that duration every week (Actually whenever the “Expiration” timer job runs) if you get notification, it will be very annoying.

Luckly we have some workaround to mitigate this.

Before going to workaround, I would like to give some information

How to configure Site Policies, please read following article.
https://support.office.com/en-us/article/use-policies-for-site-closure-and-deletion-a8280d82-27fd-48c5-9adf-8a5431208ba5
“Site Settings -> Site Policies”

Components:

– “Site Policy” Feature: Site Settings -> “Site Collection Features” -> “Site Policy”. It is the feature you should already enabled to able to use “Site Policies”.

– “Expiration Policy” Timer job: Each web application we have one “Expiration” timer job. That timer job has responsible to expire operations.
Enumerates list items and looks for those with an expiration date that has already occurred. For those items, runs disposition processing. Disposition processing most often results in deleting items. But it can perform other actions, such as processing disposition workflows.
https://docs.microsoft.com/en-us/sharepoint/technical-reference/timer-job-reference-for-sharepoint-2013
This timer job’s default schedule is weekly. So you would expect notification emails fires weekly.

-For every SPWeb object ( Root or subwebs) we have; Site Settings -> “Site Closure and Deletion” settings.We have assigning related policies here for a specific SPWeb Object (even root site web object)

– “Project Policy Item List” . This is the hidden list that your policy related items/configurations are stored here.When you assign a policy for a site, it stores here. Every site collection have one of this list instance if you enabled the “Site Policy” Feature. So you can find this list in Site Collection’s root web. (Not under sub webs).

-Also we have several policy system EventRecievers. It is important because we should disable them if we going to play around with policy internal settings.

Let’s give an example.
Assume we have created a new site policy:

“Deletion Event”: Site Created Data + 1 Year .
“Send an email notifications to site owners this far in advance of deletion:” -> 3 Months
“Send follow-up notifications every:” -> 7 Days
“Owners can postpone imminent deletion for”  1 years .

(The below picture say 14 day, consider it 7days pls)
PolicySettings

Current date of the server : 05/01/2017

So what this information tell us:
If we assume we have created a site 05/01/2017 , this site will be deleted on 05/01/2018
Before deletion of 3 months , we start to get notifications. According to our example starting by 05/10/2017. (You will get first notification exactly when the Expiration timerjobs run on that week)

So i have created a brand new subsite and assigned this policy to that subweb from “Site Closure and Deletion”

If we look in “Project Policy List Item” table. We learn more information. You can use below powershell script to get information about related item.
(You need to adjust the script for finding correct list item id, I will not do it here)

If ((Get-PSSnapIn -Name Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue) -eq $null )
{ Add-PSSnapIn -Name Microsoft.SharePoint.PowerShell }

$url = “<your site collection url>”
$rname = “Project Policy Item List”
$site = Get-SPSite $url
$rootweb = $site.OpenWeb()
$rList = $rootweb.Lists[$rname]
$item = $rList.Items[“<please locate the correct item id>”]
write-host “ExpireDate           :” $item[“_dlc_ExpireDate”]
write-host “ProjectExpirationDate:” $item[“ProjectExpirationDate”]
write-host “ProjectCreateDate    :” $item[“ProjectCreateDate”]
write-host “LastRun   :” $item.Properties[“_dlc_LastRun”]
write-host “PROPERTIES”
$item.Properties
write-host “XML”
$item.xml.Replace(“ows_”,[System.Environment]::NewLine + “ows_”)

The result is

ExpireDate           : 05/10/2017 2:27:27 PM  +9 Months.
-This is the value of date, next time “Expiration” timer job notice that should do something about it.
-It is not a real expiration date, it is a changable varible which “Expiration” timer job can calculate and change in time to time.
ProjectExpirationDate: 05/01/2018 2:27:27 PM  +1year.
-This value means when we delete the site according to our formula.
ProjectCreateDate    : 05/01/2017 2:27:27 PM  +0 time.
-This value when we have created the item policy. It is the beginning reference time.

So basically when we pass the date “05/10/2017”, Depends on “Expiration” timer job schedule in this week of the date, the timer job will send the email notification about our site will be deleted in 3 months, Lets assume timer job run on 07/10/2017 (2 days after “ExpireDate” value) your site owners will get first notification about site deletion. Afterwards timer job will update “ExpireDate” value by adding 1 week = 14/10/2017 (based on setting we defined “Send follow-up notifications every”). Also will update/add some other properties like “_dlc_LastRun” it will be 07/10/2017) .

On date 14/10/2017 is the next time “Expiration” timer jobs runs, it will send the second notification about site deletion.Afterwards timer job will update this date adding another 1 week. 21/10/2017. This goes until we reach “ProjectExpirationDate” on that date the object will be deleted. (or closed, depend on how you configure the policy)
Ok. So lets have a look some other important parameters.
ItemRetentionFormula -> It shows the formula of when we start first do something about this record -> For the first time,  “ExpireDate” calculated with this formula.But “ExpireDate” property will be changed afterwards when that date has come by “Expiration” timer job.

PS C:\Users\Administrator.CONTOSO> $item.Properties[“ItemRetentionFormula”]
<formula id=”Microsoft.Office.RecordsManagement.PolicyFeatures.Expiration.Formula.BuiltIn”><number>-3</number><property>ProjectExpirationDate</property><propertyId></propertyId><period>month
s</period></formula>

Some other important properties.
_dlc_policyId                  0x010085EC78BE64F9478AAE3ED069093B996300ACCF30C2E8DFDE4CB3D2D69F6C58E43C
ows_ProjectWebGuid='{06563951-AC75-4D8A-8835-79907AFE84BB}’
ows_ProjectWebUrl=’http://contososp:9090/sites/corpa/SharePointHub, /sites/corpa/SharePointHub’ -This is my subsite name

ows_ProjectCreateDate  -> We already explained this above
ows_ProjectExpirationDate -> We already explained this above
ows_ProjectIsClosed=’0′ -> If you selected the option close the site before deletion (meaning make it not reacable by users)
ows_ProjectNumberOfPostpone=’0′ -> It will show the number that you can understand any postpone happens.
ws__dlc_ExpireDate-> We already explained this above
ows_ContentType=’MyDeletePolicy’ -> this is same as the Policy name.Well the system works via content type structures behind the scene.

Lets return our problem. The problem begins when your site owner decide and postponed the deletion afterwards he/she got the first notification. Well according to our settings it will be postponed for 1 year ahead. But the problem the owners will continue to get emails for every week (we set “Send follow-up notifications every” it 7 days). I said 🙂 it is annoying.

What happens after you postpone in related item properties in “Project Policy List Item”
ows_ProjectNumberOfPostpone=’1′ -> changed “0” -> “1”
ows_ProjectCreateDate=’05/01/2017 2:27:27′  -> It is not changed.
ows_ProjectExpirationDate=’05/01/2019 2:27:27′ -> but this date increased as 1 year now in 2019 !.
We have new property named
_dlc_ItemStageId=1

 

There is only one workaround, can only applicable with Powershell. if we delete _dlc_LastRun and newly added property _dlc_ItemStageId and run “Expiration” timer job afterwards. Timer job will do a recalculation and correction. It will going to apply again the first time ItemRetentionFormula but this time the ProjectExpirationDate is in 2019.(Remember, it was updated when the site owner postponed). So correction will reset the “ExpireDate” value and you will not get anymore notification emails until  “05/01/2019 minus 3 months”, All Good 🙂

Here is the powershell to fix that issue:

If ((Get-PSSnapIn -Name Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue) -eq $null )
{ Add-PSSnapIn -Name Microsoft.SharePoint.PowerShell }

#Site Collection Level.
$site =get-spsite <site collection url>

# Open the RootWeb
$web = $site.OpenWeb()

#gather  Project Policy Item List hidden list
$list = $web.Lists[“Project Policy Item List”]

#There will be several subsites or different policy item in the list based on usage.
#Need to locate correct policy item in the list with related site or subsite.
#Print all items in that list to find out the relate (site or subsite object)
$list.Items
#Please add your logic based on your requirements.For example you can use ProjectWebGuid or ProjectWebUrl for filter out your related item.I will leave it to you.

#gather the item.
$item = $list.Items[<related item id>]

#-Update procedure for the ExpireDate
#SET EVENT FIRING DISABLED.
$assembly = [Reflection.Assembly]::LoadWithPartialName(“Microsoft.SharePoint”);
$type = $assembly.GetType(“Microsoft.SharePoint.SPEventManager”);
$prop = $type.GetProperty([string]”EventFiringDisabled”,[System.Reflection.BindingFlags] ([System.Reflection.BindingFlags]::NonPublic -bor [System.Reflection.BindingFlags]::Static));
$prop.SetValue($null, $true, $null);

#Update.
$item.Properties.Remove(“_dlc_LastRun”);
$item.Properties.Remove(“_dlc_ItemStageId”);
$item.SystemUpdate($false)

#SET EVENT FIRING ENABLED
$prop.SetValue($null, $false, $null);

+Then run “Expiration” Timer job for related web application.

I strongly suggest, please do some test and practice the script on your test environment before appling it to production!
If you do something incorrect, you can easly messed up your policy items.Or you may call Microsoft Support to help.

About SharePoint with 16+ Cores

Well, If you update .Net 4.7.2 or higer it is OK, otherwise it is a bad idea.

More speficially,
ReaderWriterLockSlim with reentrant have design limitation which can lead serious performance down for previous .Net versions. And Sharepoint pretty much depended on this thread syncronization object. Specially Blob Cache and ObjectCache wraps and using them. More CPU causes more thread contention, excesive locking thats brings slowness.

Example callstacks:
SPReaderWriterLock named [BlobCache] waited 43992 milliseconds to acquire lock. Call stack:
at Microsoft.Office.Server.Utilities.SPReaderWriterLock.AcquireLock(Boolean readerLock, Boolean upgradable, Boolean throwException)
at Microsoft.Office.Server.Utilities.SPReaderWriterLock.AcquireLock(Boolean readerLock, Boolean upgradable)

System_Core_ni!System.Threading.ReaderWriterLockSlim.EnterMyLockSpin()
System_Core_ni!System.Threading.ReaderWriterLockSlim.TryEnterWriteLock(Int32)
Microsoft_Office_Server!Microsoft.Office.Server.Utilities.SPReaderWriterLock.AcquireLock(Boolean, Boolean)
Microsoft_Office_Server!Microsoft.Office.Server.ObjectCache.SPCache+MossObjectCache.UpdateUsageMap(System.String, UInt32, UInt32)
Here are some threads about the problem.
https://github.com/dotnet/coreclr/pull/13243
https://github.com/dotnet/coreclr/pull/13495

These issues has been fixed with .Net Framework 4.7.2 version!.

Pls check .NET Framework 4.7.2 Release Notes
https://github.com/Microsoft/dotnet/blob/master/releases/net472/dotnet472-changes.md

Again don’t think about 64cpus makes more performance. It depends of the software boundaries/limitations and several other things..
My suggestion, scale up with multiple machines. It is more cheaper and stable. ! Not with excessive hardwares. (8 cores are fine 🙂 )

If you have that monster machines, don’t worry you can  use it with HyperV and scale up VMs.

SharePoint User A authenticated as User B

How could that happen ?

Let me give a some information about the problem:

Issue:
Intermittently Users authenticated with reverse proxy, impersonate incorrectly and User A may become User B . Based on user privileges , end users experiencing either access denied or having different user identity and permissions.

Receipts

  • 1 In the middle Device ,who redirects authentication or re-authenticate. (I am not meaning a hacker device , if you are using SSL, it is not possible easily but your system admins can). It is mostly a Reverse Proxy configured officially by your system admins.All most every modern proxy has a feature that uses Port Sharing or Re-use Session functionality. It is perfectly fine , provides high performance , prevent reauthentication and port exhaustion problem.It is works for well almost every scenario . Almost!!!
  • Session-Based Authentication , (Like NTLM or Certificate Authentication). In our scenario,We have one web application with two Authentication Provider , 1 for default CBA/NTLM and 1 for CBA/Form-Based Authentication. (Issue happens on CBA/NTLM part)
    • This configuration provides another authentication layer on SharePoint ,and force SharePoint to use Federated Authentication mechanizm , so you will see Fed Auth cookies are in use.

Some important information:

Federated Authentication over CBA/NTLM. Despite the fact that it is a token/cookie based architecture, it is depending on NTLM authentication under the cover. It means it is still a Session Based Authentication.

Why we have a problem:
Because Any middle device setting for “same TCP session re-use” is not suitable/unsupported for any “Session Based” authentication type.

The issue, It is not related directly SharePoint implementation , It is related that how the Session Based Authentication works.

Also same problem may occur ; faulty middle device software and other incorrect configurations.

How the issue happens,
Lets have a look in details for Federated Authentication over CBA/NTLM.

In TCP/Network Layer – We don’t have any authentication in this layer.
We have two end points . Point A (Reverse Proxy) to Point B (SharePoint WFE)
Before autentication a TCP connection estabilishes between that two points.

Example
Point A – IP 10.10.0.5 , Source Port : 45000  -> Point B – IP : 10.10.0.25 port 443
We are now calling this “A TCP Channel” or “A TCP Session” .

After TCP connection is estabilished then HTTP start to work in that channel.

HTTP Layer
User go to the server anymously first , the server provides authentication challanges it supported. And user provides its credential assets . This is called NTLM Handshake;

NTLMNTLM2

TCP Channel should not closed until NTLM handshake complete. It is a requirement for
NTLM authentication . Thats why , all modern web servers , use a standart/feature called HTTP Keep Alive which provides “Persistent Connection” not only NTLM handshake duration , also several requests are handled in same TCP Channel until Client or Server close the connection.

Keep-Alive-Sessions

After NTLM HandShake completed , IIS Stores the Session Information ,
TCP Channel  X  -> [Point A]  [IP], [Port] <=> Identity : Authenticated User A.  (Or Anonymous)

(If we don’t use , HTTP Keep Alive,  For every request we need to do re-authenticate . (it is not a session based authentication it is called Request Based Authentication . Well NTLM fails in that scenario. You will see flooding 401 responses on NTLM handshake.)

CBA/NTLM

Now SharePoint come in the scenario , It builds claims and tokens on IIS/NTLM Identity , creates the tokens via Security Token Service , caches the tokens with Distributed Cache or Local Cache.

SharePoint and IIS believe and trust , underlayer TCP Session belong to only one verified authenticated identity.

If you have more than one authentication provider , SharePoint also builds Federated Authentication Cookies , default 5 days duration .Cookie – Token pairs must be match for user verification.
Fed Auth Cookie sent to client with last response in NTLM handshake (Status 200 or 302).
and corresponding Token has been cached in server (default 10 hours).

All incomming requests are considered valid/authenticated based on Fed Auth Cookie – Security token pair.

What happens if SharePoint can’t find token in the cache:
If the cookie valids , It rebuilds the token  and cache the token again and updates the cookie.  (You will have another 5 days, “sliding” )
It do not re-authenticate.

What happens if the Cookie expires: Well you need to reauthenticate.

What happens if the TCP Channel closed : you need to reauthenticate. But sometimes it is not happen! Even client send close channel , in the middle devices can prevent it. Because It wants to reuse that TCP Channel.

So far so good,

But what if someone , some other user  use the same TCP Channel and send requests to the server . Well exactly what is happening if the reverse proxy does in session-reuse functionality.

SharePoint and IIS believe and trust , underlayer TCP Session belong to only one verified authenticated identity.Because how the bible says for Session-Based Authentication works.

Problematic Scenario:
Lets assume User A authenticated , have a valid cookie and continue communication without re-autheticate using Cookie-Token pairs for validation.

In mean while User B make a request (anonymous first) from same TCP Channel and requesting authentication.It is NTLM.  Well the TCP Channel is not closed , it is the same channel,  There is no possibility to understand it is the different user, Server reuthenticate the user.

After NTLM HandShake completed , IIS Stores the Session Information by overriding ,
TCP Channel  X  -> [Point A]  [IP], [Port] <=> Identity : Authenticated User B.  (Or Anonymous)
IP is the always Reverse proxy it isn’t changed.Also we have using the same TCP Channel so the port also same. It is absolute override.

Now the channel is belong to USER B instead of USER A.
Well User A still have no issues, he has a valid cookie and we have a valid token , so User A and User B communicates with the server without any problem in same tcp channel.

But the problem happen when the token is missed/expired (default 10 hours) .
What SharePoint does , verify the channel in that point ,ask to the IIS what was the Identity  ? IIS tells : it is UserB . SharePoint recreates the token and updates the cookie for UserB in a request which belong to UserA.
Now User A gets inccorect cookie and have inccorect token cached . It becomes User B by now.

Depends on permissions , either facing access denieds or have access UserB’s resources.

This problem is called “Session Hijacking“.
This is not a security hole in Sharepoint ,IIS or NTLM .

An anology :
Lets assume , you have a internet faced web site under authentication have must protected resources and if you enable anonymous access on that server in same resources .Is it considered security hole in authentication mechanizm or your code ? Of course not.

Well, there is still a security issue because of unsupported or incorrect configuration . Also it is not depended on Sharepoint, It may happen any ASP.NET application on any kind of Session-Based Authentication.

In conclusion,

Any “Session Based” authentication (like NTLM, Certificate auth etc.)  to work needs one TCP Channel per authenticated user.

SharePoint Federated Authentication for Claims Based Authentication over NTLM is depending on the underlying NTLM Authentication. NTLM is a session based authentication . Any middle device setting for “same TCP session re-use” is not suitable/unsupported for these type of authentications.

Suggestions:

  • Disabling “TCP Channel/Session re-use” feature in middle devices
  • Update in the middle devices firmware for any faulty software
  • Not using Session-Based Authentication with Reverse Proxy “Session re-use” feature or changing authentication type for any/suitable “Request-based” authentication or “token based” authentication (like ADFS) or Form based authentication.

Note: Don’t mess with “Session ” that is TCP level Session/Channel, not an application level session object or structure !! (OSI Layer 5)

Clean draft versions via PowerShell

SharePoint versionining is very cool feature, but in time or setting incorrect version limitations can cause performance issues. When this happen you may want to get rid off older draft versions.

Following script is cleans for older draft versions in a library while keeping all major versions. Also you can keep defined number of last major items drafts available.
Note: This script doesn’t delete any major item.

#USAGE:
# .\CleanDraftVersions.ps1 -siteUrl <Site Url> -webName <webName> -listName <ListName> -versionsToKeep <Number> -delete=<$true|$false>
#PARAMETERS
# -siteUrl : Url of the Site Collection
# -webName : Specific subsite (leave empty for root web)
# -listName : Specific ListName
# -versionsToKeep: Number of major item’s drafts will be preserved (including current version)
# -delete : $false (default) for reporting only ,$true for deleting.
# Highligts: Red will be deleted , Yellow will be preserved , Green current Item .

#EXAMPLE
# .\CleanDraftVersions.ps1 -siteUrl “http://contoso.com&#8221; -webName “” -listName “Pages” -versionsToKeep 10 -delete $false
Download:
https://github.com/bpostaci/CleanDraftVersions/blob/master/CleanDraftVersions/CleanDraftVersions.ps1