RBS BLOB data remains on the file system even after deleting files from SharePoint 2010

Consider the following scenario: you have SQL Server 2008 R2 with RBS enabled, and servers with SharePoint Server 2010 RBS installed. Some files in a SharePoint document library have their streams stored in RBS, and even after you delete these files from the document library, you notice that the BLOB data still remains on the file system.

Usually this is not a problem; it is by design. For reasons of data recovery, performance, data integrity, and safety, deleted files are not actually removed immediately. Many systems are designed this way, including SharePoint and RBS. To stay on the safe side, such systems only mark files as deleted and later run a background process that removes them once certain thresholds or limits are exceeded. To find out whether you have a real problem, you have to disable or bypass these mechanisms. If the BLOB files still remain on the file system after that, you can say that you have a real problem.

On the SharePoint side, the first thing you should check is the Recycle Bin feature.

“Recycle Bins are used to help users protect and recover data. Microsoft SharePoint Server 2010 supports two stages of Recycle Bins: the first-stage Recycle Bin and second-stage Recycle Bin. When a user deletes an item, the item is automatically sent to the first-stage Recycle Bin. By default, when an item is deleted from the first-stage Recycle Bin, the item is sent to the second-stage Recycle Bin. A site collection administrator can restore items from the second-stage Recycle Bin. You turn on and configure Recycle Bins at the Web application level. By default, Recycle Bins are turned on in all the site collections in a Web application. This article describes how to configure Recycle Bin settings for a Web application.”
http://technet.microsoft.com/en-us/library/cc263125(v=office.14).aspx

For more information and usage recommendations about SharePoint Server 2010 Recycle Bins, see Plan to protect content by using recycle bins and versioning (SharePoint Server 2010).

At this point you have two options to bypass this feature:
1) Disable the Recycle Bin entirely from the Central Administration site: CA -> Manage Web Applications -> select the web application for which you want to disable the Recycle Bin -> on the ribbon, select General Settings and set the “Recycle Bin” property to “Off”.
2) Or, after you delete a file, empty both the Site (first-stage) and Site Collection (second-stage) Recycle Bins.

On the SQL side, if you want to confirm that the file has been deleted from the content database, you can use the following SQL:
1) Open SQL Server Management Studio
2) Select the related content database and click “New Query”
3) Select * from AllDocs where ListID='<GUID>'
*** You can find the list GUID in the browser address bar when you open the Library Settings page of the document library.
Check the results to see whether the file still exists in that list. If you emptied the Recycle Bins correctly, the file should not appear in the results.

Even after you confirm that the file has been deleted from the content database, the BLOB data will still remain on the file system where the BLOBs are stored, because there is another mechanism on the SQL RBS side named “RBS Garbage Collector”.

“SharePoint Server 2010 automatically marks unreferenced or deleted BLOB data for removal. SharePoint Server 2010 counts references to BLOBs by looking at the list of BLOB IDs stored by SharePoint Server 2010 in its content databases at the time of removal. Any BLOB references that are present in the RBS store tables but absent in the content database are assumed to be deleted by SharePoint Server 2010 and will be marked for removal. BLOBs that are not present in the content database and were created before the orphan cleanup time window, described later in this article, are also assumed to be deleted by SharePoint Server 2010 and will be marked for removal.

Because SharePoint Server 2010 tabulates BLOB references from the RBS columns of the content database, every RBS column must have a valid index before it can be registered in RBS.

The SQL Server RBS Maintainer tool removes the items marked by SharePoint Server 2010 for removal. You should schedule the clean-up tasks to be run during off-peak hours to reduce the effect on regular database operations.

RBS garbage collection is performed in the following three steps:

  • Reference scan. (RS) The first step compares the contents of the RBS tables in the SharePoint Server 2010 content database with RBS’s own internal tables and determines which BLOBs are no longer referenced. Any unreferenced BLOBs are marked for deletion.
  • Delete propagation. (DP) The next step determines which BLOBs have been marked for deletion for a period of time longer than the garbage_collection_time_window value and deletes them from the BLOB store.
  • Orphan cleanup. (OC) The final step determines whether any BLOBs are present in the BLOB store but absent in the RBS tables. These orphaned BLOBs are then deleted.”

http://technet.microsoft.com/en-us/library/ff943565(v=office.14).aspx
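
The three phases quoted above can be pictured with a small Python model. This is only an illustrative sketch and not how RBS is implemented: the names content_db_refs, rbs_table, and blob_store, and the three functions, are hypothetical stand-ins for the RBS columns of the content database, the internal RBS tables, and the files in the BLOB store. The orphan cleanup time window mentioned in the quote is left out to keep the model short.

```python
from datetime import datetime, timedelta

def reference_scan(content_db_refs, rbs_table, now):
    """Phase 1: mark blobs that RBS tracks but the content DB no longer references."""
    for blob_id in list(rbs_table):
        if blob_id not in content_db_refs and rbs_table[blob_id] is None:
            rbs_table[blob_id] = now  # remember when the blob was marked

def delete_propagation(rbs_table, blob_store, now, time_window):
    """Phase 2: delete blobs that have been marked for longer than the time window."""
    expired = [blob_id for blob_id, marked_at in rbs_table.items()
               if marked_at is not None and now - marked_at > time_window]
    for blob_id in expired:
        blob_store.discard(blob_id)
        del rbs_table[blob_id]
    return expired

def orphan_cleanup(rbs_table, blob_store):
    """Phase 3: delete blobs present in the store but absent from the RBS tables."""
    orphans = blob_store - set(rbs_table)
    blob_store.difference_update(orphans)
    return orphans

# Hypothetical data: "a" is still referenced, "b" was deleted in SharePoint,
# "c" exists on disk but is unknown to the RBS tables.
window = timedelta(days=30)
t0 = datetime(2012, 1, 1)
content_db_refs = {"a"}
rbs_table = {"a": None, "b": None}
blob_store = {"a", "b", "c"}

reference_scan(content_db_refs, rbs_table, t0)
deleted_early = delete_propagation(rbs_table, blob_store, t0 + timedelta(days=1), window)
deleted_late = delete_propagation(rbs_table, blob_store, t0 + timedelta(days=31), window)
orphans = orphan_cleanup(rbs_table, blob_store)
```

In this model, blob "b" survives a run one day after it is marked and disappears only once the time window has passed, which matches the behavior described in the scenario at the top of this post.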

We have talked about thresholds. The RBS configuration has three important thresholds for clearing BLOB data:

delete_scan_period: Specifies the minimum amount of time that must pass between two reference scan garbage collection runs. The default value is 30 days.
orphan_scan_period: Specifies the minimum amount of time that must pass between two orphan cleanup garbage collection runs. The default value is 30 days.
garbage_collection_time_window: Specifies the minimum time that must pass between identifying a blob as having no references in the database and deleting the blob from the store. This guarantees the availability of BLOBs for the specified time in case a backup is restored. The default value is 30 days.

So with the default values, your BLOB files should be cleared after at least 30 days if they are no longer referenced by any content database record.
You can find more information about all RBS configuration thresholds in the following article:
http://msdn.microsoft.com/en-us/library/gg316763(v=sql.105).aspx
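
To get a feeling for how these thresholds interact, here is a rough Python sketch that estimates the earliest moment a deleted blob could leave the BLOB store. It is a simplified model under stated assumptions, not RBS logic: the function name earliest_removal is hypothetical, the Maintainer is assumed to run as soon as each period allows, and the blob is assumed to be marked by the first reference scan after the file is deleted.

```python
from datetime import datetime, timedelta

def earliest_removal(deleted_at, last_reference_scan,
                     delete_scan_period=timedelta(days=30),
                     garbage_collection_time_window=timedelta(days=30)):
    """Estimate when delete propagation could first remove a blob.

    Assumption: the next reference scan cannot start before
    last_reference_scan + delete_scan_period, and the blob must then stay
    marked for garbage_collection_time_window before it is deleted.
    """
    next_scan = max(deleted_at, last_reference_scan + delete_scan_period)
    return next_scan + garbage_collection_time_window

last_scan = datetime(2011, 1, 1)
deleted = datetime(2011, 1, 10)

# With the default values
with_defaults = earliest_removal(deleted, last_scan)

# With both thresholds set to zero, as in a test environment
zero = timedelta(0)
with_zero = earliest_removal(deleted, last_scan, zero, zero)
```

Under these assumptions the default values can keep a blob around for noticeably longer than 30 days after the file is deleted, while zeroed thresholds make it eligible as soon as the Maintainer runs.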

To test immediate deletion, we can change these thresholds from SQL Server Management Studio:
1) Open SQL Server Management Studio
2) Select the RBS-enabled content database and click “New Query”
3) Execute the following queries:
exec mssqlrbs.rbs_sp_set_config_value 'garbage_collection_time_window', 'time 00:00:00';
exec mssqlrbs.rbs_sp_set_config_value 'delete_scan_period', 'time 00:00:00';
exec mssqlrbs.rbs_sp_set_config_value 'orphan_scan_period', 'time 00:00:00';

Our job is not done yet:
The actual work of GC is done by the RBS Maintainer application. The maintainer is a console application that takes command line parameters such as the connection string to the database and the phases of GC to execute. This can be run from any machine that has access to the DB and the blob store(s). It can also be run from multiple machines simultaneously. You can schedule it using your favorite scheduler e.g. Windows Task Scheduler.

Maintainer also takes an optional parameter to limit the amount of time it is run.
http://blogs.msdn.com/b/sqlrbs/archive/2008/08/08/rbs-garbage-collection-settings-and-rationale.aspx

RBS requires you to define a connection string to each database that uses RBS before you run the RBS Maintainer. This string is stored in a configuration file in the <RBS installation path>\Microsoft SQL Remote Blob Storage 10.50\Maintainer folder that is ordinarily created during installation. The RBS Maintainer can be run manually by executing the Microsoft.Data.SqlRemoteBlobs.Maintainer.exe program together with the parameters that are listed in the following table.
When you run the Maintainer from a command prompt, you can follow the operation logs in the cmd window:

1) On the SQL Server machine, open a command prompt as Administrator and navigate to the path “C:\Program Files\Microsoft SQL Remote Blob Storage 10.50\Maintainer”

2) Execute the command:
Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection -Operation GarbageCollection ConsistencyCheck ConsistencyCheckForStores -GarbageCollectionPhases rdo -ConsistencyCheckMode r -TimeLimit 120

You can find more information about the Maintainer.exe parameters here:
http://blogs.msdn.com/b/sqlrbs/archive/2010/03/19/running-rbs-maintainer.aspx
To schedule an RBS Maintainer task, please read the following article:
http://technet.microsoft.com/en-us/library/ff943565(v=office.14).aspx

After you run the RBS Maintainer and the RS and DP phases complete, will the BLOB files be cleared? No, not yet 🙂. This operation takes about 2 or 3 minutes, depending on how much data you have.

[Image: RBSGC]
Image source: http://blogs.technet.com/b/pramodbalusu/archive/2011/07/09/rbs-and-sharepoint-2010.aspx

FILESTREAM GC runs as part of the database checkpoint process. This is what causes some confusion – an old FILESTREAM file will not be removed until after it is no longer needed AND a checkpoint runs. 
http://www.sqlskills.com/BLOGS/PAUL/post/FILESTREAM-garbage-collection.aspx

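In the Simple recovery model, you can run the following command:
CHECKPOINT;
In the Full recovery model, two transaction log backups, each followed by a CHECKPOINT, are needed.
Alternatively, you can force FILESTREAM garbage collection manually: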

“Forces the FILESTREAM garbage collector to run, deleting any unneeded FILESTREAM files. A FILESTREAM container cannot be removed until all the deleted files within it have been cleaned up by the garbage collector. The FILESTREAM garbage collector runs automatically. However, if you need to remove a container before the garbage collector has run, you can use sp_filestream_force_garbage_collection to run the garbage collector manually.”
http://msdn.microsoft.com/en-us/library/gg492195.aspx

USE <Content Database>;
GO
EXEC sp_filestream_force_garbage_collection @dbname = N'<Content Database>';

And finally, if your BLOB data is still not cleared, then you may open a case with Microsoft Support 🙂


Comparing SQL Server 2005 and 2008 for SharePoint 2010

You can use both SQL Server 2005 and SQL Server 2008 with SharePoint 2010, but what are the advantages and disadvantages?

I definitely recommend using SQL Server 2008 R2 for SharePoint 2010 products. Here is why I choose it:

Performance and Availability: SQL Server 2008 R2 Enterprise edition enables several dimensions on which it can scale, enabling even the most demanding SharePoint Server 2010 deployments. For example, rapidly growing Enterprise Content Management (ECM) or Web Content Management (WCM) workloads may often require compute resources beyond those associated with traditional collaboration scenarios. SQL Server 2008 R2 Enterprise can support this scenario through a number of improvements, by enabling greater processor scale, the ability to address more physical memory on 64-bit hardware, and hot-add hardware support.

Business Continuity Management is a combination of high availability and disaster recovery. High availability ensures a certain absolute degree of operational continuity in the event one or more components fail in an isolated location. Availability requirements are defined by Operating and Service Level Agreements. Disaster recovery ensures a certain absolute degree of operational continuity in the event that all systems fail in one or more locations. Disaster recovery requirements are defined by Operating Level Agreements, Recovery Point, and Recovery Time Objectives.

SQL Server 2008 R2 provides a number of native capabilities to enable the design and deployment of a highly available SharePoint Server 2010 deployment including database mirroring, failover clustering, and log shipping.

SQL Server 2005 mainstream support will end fairly soon (read more at http://support.microsoft.com/gp/lifesupsps).

Security:
Transparent Data Encryption (TDE) is a feature added in SQL Server 2008 Enterprise that performs real-time I/O encryption and decryption of data and log files without requiring unsupported modification of the underlying table schema or increasing the size of the database; that means no changes to SharePoint are needed to enable TDE.
SQL Server Audit enables tracking and logging of events that occur on the system, for example detecting changes or modifications to database objects/stored procedures, surfacing changes to server configuration settings, or detecting changes to audit configuration settings.

Better Reporting Services: SharePoint Server 2010 and SQL Server 2008 R2 Enterprise provide close business intelligence capabilities integration through SQL Server Reporting Services. Using SQL Server Reporting Services, administrators can configure reporting servers to enable real-time access to information and control who has access to that information. End users can benefit from this integration by publishing SQL Server reports directly to Document Libraries or by optionally embedding reports in pages hosted on one or more sites in a Microsoft SharePoint Server 2010 deployment.

To learn more about Reporting Services in SQL Server 2008 see SQL Server Reporting Services (http://msdn.microsoft.com/en-us/library/ms159106.aspx).

Using FILESTREAM Functionality with 2008: Remote Blob Storage integration with SharePoint Server 2010 in SQL Server 2008 R2 Enterprise enables an administrator to externalize SharePoint binary large object (BLOB) data by hosting it on less expensive, commodity hardware solutions and managing it with SQL Server through the same data management techniques they use today. Remote Blob Storage can be an integral component in the most stringent compliance scenarios by enabling third-party technologies to implement solutions such as expunge on the server(s) hosting the externalized BLOBs.
See: https://blog.bugrapostaci.com/2010/10/01/sharepoint-2010-server-with-filestream-rbs-provider/

 

SharePoint Server 2010 Functionality and SQL Server Edition Comparison

Resources:
http://technet.microsoft.com/en-us/library/cc990273.aspx
http://technet.microsoft.com/en-us/library/cc262749.aspx

Other:

http://sqlcat.com/top10lists/archive/2009/02/24/top-10-performance-and-productivity-reasons-to-use-sql-server-2008-for-your-business-intelligence-solutions.aspx



SharePoint 2010 Server with FILESTREAM RBS Provider

What is RBS?

Remote Blob Storage is a library API set that is incorporated as an add-on feature pack for Microsoft SQL Server. It can be run on the local server running Microsoft SQL Server 2008 R2, SQL Server 2008, SQL Server 2008 Express, or SQL Server 2008 R2 Express. To run RBS on a remote server, you must be running SQL Server 2008 R2 Enterprise edition. RBS is not supported for Microsoft SQL Server 2005.

What are the benefits of RBS?

RBS can provide the following benefits:

  • BLOB data can be stored on less expensive storage devices that are configured to handle simple storage.
  • The administration of the BLOB storage is controlled by a system that is designed specifically to work with BLOB data.
  • Database server resources are freed for database operations.

When should we consider using RBS?

  • The BLOB data files are larger than 256 kilobytes (KB).
  • The BLOB data files are at least 80 KB and the database server is a performance bottleneck. In this case, RBS reduces both the I/O and processing load on the database server.
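
The two criteria above can be written down as a small Python check. This is just a sketch of the rule of thumb from the list: the function name should_consider_rbs and its parameters are hypothetical, and a real decision should also take your own measurements into account.

```python
KB = 1024

def should_consider_rbs(blob_size_bytes, db_server_is_bottleneck):
    """Apply the two rule-of-thumb criteria from the list above."""
    # Criterion 1: BLOB files larger than 256 KB
    if blob_size_bytes > 256 * KB:
        return True
    # Criterion 2: BLOB files of at least 80 KB and the database server
    # is a performance bottleneck
    return blob_size_bytes >= 80 * KB and db_server_is_bottleneck

# Hypothetical examples
large_file = should_consider_rbs(2 * 1024 * KB, db_server_is_bottleneck=False)
medium_file_busy_db = should_consider_rbs(100 * KB, db_server_is_bottleneck=True)
medium_file_idle_db = should_consider_rbs(100 * KB, db_server_is_bottleneck=False)
small_file = should_consider_rbs(10 * KB, db_server_is_bottleneck=True)
```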

What is the difference between RBS with FILESTREAM and RBS without FILESTREAM?

This implementation of the FILESTREAM provider is known as the local FILESTREAM provider. You can conserve resources by using the local RBS FILESTREAM provider to place the extracted BLOB data on a different (cheaper) local disk such as RAID 5 instead of RAID 10. You cannot use RBS with the local FILESTREAM provider on remote storage devices, such as network attached storage (NAS). The FILESTREAM provider is supported when it is used on local hard disk drives only.

A remote RBS FILESTREAM provider that is available in SQL Server 2008 R2 Express can store BLOB data on remote commodity storage such as direct-attached storage (DAS) or NAS. However, SharePoint Server 2010 does not currently support the remote RBS FILESTREAM provider.

BLOBs can be kept on commodity storage such as direct-attached storage (DAS) or network attached storage (NAS), as supported by the provider. The FILESTREAM provider is supported by SharePoint Server 2010 when it is used on local hard disk drives only. You cannot use RBS with FILESTREAM on remote storage devices, such as NAS.

The following table summarizes FILESTREAM benefits and limitations.

Operational requirement | RBS with FILESTREAM | RBS without FILESTREAM
SQL Server integrated backup and recovery of the BLOB store | Yes | Yes
Scripted migration to BLOBs | Yes | Yes
Supports mirroring | No | No
Log shipping | Yes | Yes, with provider implementation
Database snapshots | No (1) | No (1)
Geo replication | Yes | No
Encryption | NTFS only | No
Network Attached Storage (NAS) | Not supported by SharePoint 2010 Products | Yes, with provider implementation

(1) If the RBS provider that you are using does not support snapshots, you cannot use snapshots for content deployment or backup. For example, the SQL FILESTREAM provider does not support snapshots.

If FILESTREAM is not a practical provider for your environment, you can purchase a supported third-party provider. In this case, you should evaluate the following criteria when shopping for a provider:

  • Backup and restore capability
  • Tested disaster recovery
  • Deployment and data migration
  • Performance impact
  • Long-term administrative costs

What are the prerequisites?

If you plan to store BLOB data in an RBS store that differs from your SharePoint Server 2010 content databases, you must run SQL Server 2008 with SP1 and Cumulative Update 2. This is true for all RBS providers.

Important Notes

  • RBS does not enable any kind of direct access to any files that are stored in Microsoft SharePoint 2010 Products. All access must occur by using SharePoint 2010 Products only.
  • If you are storing many small (less than 256 KB) files that are frequently accessed by many users, you might experience increased latency on sites that have many small files that are stored in RBS. Increased latency is one cost factor that you should consider when you evaluate RBS for your storage solution. However, it is unlikely to be the strongest consideration. The amount of increased latency is also related to the RBS provider that you use.
  • RBS can be run on the local server running Microsoft SQL Server 2008 R2, SQL Server 2008 or SQL Server 2008 R2 Express. To run RBS on a remote server, you must be running SQL Server 2008 R2 Enterprise edition. SharePoint Server 2010 requires you to use the version of RBS that is included with the SQL Server Remote BLOB Store installation package from the Feature Pack for Microsoft SQL Server 2008 R2. Earlier versions of RBS will not work with SharePoint Server 2010. In addition, RBS is not supported in SQL Server 2005.
  • For best performance, simplified troubleshooting, and as a general best practice, we recommend that you create the BLOB store on a volume that does not contain the operating system, paging files, database data, log files, or the tempdb file.
  • Microsoft SQL Server 2008 R2 Express supports databases up to 10 GB. If the installation includes content databases that are larger than 4 GB but smaller than 10 GB, you can upgrade to SQL Server 2008 R2 Express for your content database storage solution instead of implementing RBS. For more information, see Microsoft SQL Server 2008 R2 Express Edition.
  • WARNING: We do not recommend that you install RBS by running the RBS_X64.msi file and launching the Install SQL Remote BLOB Storage wizard. The wizard configures the RBS Maintainer to run a scheduled task every 30 days. This setting might not be optimal for your environment. For more information about the RBS Maintainer, see the SQL Server Help documentation that is included with the SQL Server Remote BLOB Store installation package from the Feature Pack for Microsoft SQL Server 2008 R2. (ref: Install and configure Remote BLOB Storage (RBS) with the FILESTREAM provider (SharePoint Server 2010))
    UPDATE 18.04.2011: With the new version of RBS.msi you can use the GUI to install RBS in your environment. You have to select “Show the optional advanced configuration options” while the wizard is in progress, and uncheck the Maintainer schedule configuration to disable the scheduled Maintainer task. You can download it from the Microsoft SQL Server 2008 R2 Feature Pack page:
    http://www.microsoft.com/downloads/en/details.aspx?displaylang=en&FamilyID=ceb4346f-657f-4d28-83f5-aae0c5c83d52
    This is the only provider version that SharePoint Server 2010 supports, with version 10.50.xxxx (R2). However, you can install this provider on a machine running SQL Server 2008 Express.


Tips and Tricks

  • You can allow only big files to be put into FILESTREAM. Since FILESTREAM performance is not as good as the database's when it deals with small files (for example, <1 MB), you can change this threshold. The following Windows PowerShell commands change the setting to 1 MB (1048576 bytes); files below 1 MB will be stored in the database.

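# Get the content database ("WSS_Content" is the example name used in this post)
$cdb = Get-SPContentDatabase "WSS_Content"
# Set the minimum BLOB size for RBS to 1 MB (1048576 bytes)
$cdb.RemoteBlobStorageSettings.MinimumBlobStorageSize = 1048576
# Persist the change
$cdb.Update()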

It would be good to test the performance based on your own storage and hardware.
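
To see what such a threshold means for a mixed library, here is a tiny Python sketch that sorts files by where they would end up. It only models the rule described in this tip (files below the threshold stay in the database); the function name storage_location is hypothetical, and treating a file of exactly the threshold size as going to the BLOB store is my assumption, not something verified against SharePoint.

```python
def storage_location(file_size_bytes, minimum_blob_storage_size=1048576):
    """Return where a file would be stored under the rule described above.

    Assumption: a file exactly at the threshold goes to the BLOB store.
    """
    if file_size_bytes < minimum_blob_storage_size:
        return "database"
    return "blob store"

# Hypothetical file sizes in bytes: 10 KB, 512 KB, 5 MB
sizes = [10 * 1024, 512 * 1024, 5 * 1024 * 1024]
locations = [storage_location(size) for size in sizes]
```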

This article is based on MSDN articles, blog articles, and other resources.

Happy Coding…