SharePoint 2010 RBS Benefits/Trade-offs

There is a lot of interest in the ability of SharePoint 2010 to use the new Remote Blob Storage (RBS) capabilities of SQL 2008 R2. Much of it good... some of it bad. The main issue is that many people are working from false assumptions about what RBS does and what its actual benefits/trade-offs might be.

First, RBS provides a few clear and potentially valuable direct benefits:

  • 1. Optimizes SQL disk I/O by moving BLOBs (Binary Large OBjects... aka... files) out of the database files themselves and into a secondary storage location. This means that when SQL is reading data from the database files, it no longer has to skip over the BLOBs in the data files that aren't essential to processing a query.
  • 2. Decreases storage costs by allowing BLOB data to be moved out of the expensive, ultra-high performance disk array (RAID 10) where the SQL data files are, and into a reasonably fast but higher density (RAID 5+) storage array or a less expensive disk system.
  • 3. Increases BLOB transfer speed between the SQL Server and the client (or vice-versa) because some of the actual file transfer can be handed off to the Windows Server OS rather than have to be processed and re-assembled in the SQL Server process.
  • 4. Faster and more efficient Move-SPSite operations through RBS "shallow copy" after deployment of SharePoint 2010 Service Pack 1, which will leave the BLOBs in place and simply move the references to the BLOBs along with the site move. Without shallow copy, the BLOBs are essentially duplicated as they're copied from one DB to another.

That's it. That's what RBS gets you. When discussing a system like SharePoint, which has a heavy focus on file (BLOB) management, these can be fairly significant benefits.

However, these benefits come at a significant price. Specifically:

  • Backup and Restore operations and management - RBS significantly complicates your backup and recovery processes. For example, backups must now be heavily coordinated so that the database files and the associated RBS store have a synchronized backup schedule. One cannot be backed up without the other, nor can one be restored without the other. Any tools used to automate or manage backups must be aware of this requirement and be able to achieve this synchronized backup, or an alternative (and likely unique) backup process must be used.
  • Additional infrastructure must now be managed and monitored - At minimum, there is now yet another file location that must be managed, defragmented, and monitored. A new exclusion must be created for anti-virus solutions (or special rules created such that infected files are identified but not removed). At worst, additional servers are now required to host the BLOB store, and their patching schedule must be directly and exactly aligned with that of the SQL Server itself. RBS also introduces its own performance counters that should have thresholds and alerts identified and configured in your monitoring solution.
  • Additional maintenance must now be scheduled - The RBS store is not automatically included in any of the typical SQL maintenance activities performed by SharePoint timer jobs or by the usual SQL maintenance job tasks. RBS maintenance tasks must be scheduled separately and managed to ensure timeliness, effectiveness, and stability.
  • Clustered environments still require shared storage - Generally, this means that you're still using your expensive SAN, or you have the added complexity of configuring iSCSI targets and ensuring they are properly configured within the Windows/SQL Cluster and meet the requirements of SharePoint and SQL.
  • Microsoft does not support SQL Mirroring and RBS on the same database - Simply put, any database that is going to use mirroring cannot use RBS and be supported by Microsoft. If you choose to follow another vendor's guidance in this area then you will be dependent on that vendor's ability to support your implementation. Note that you can (and always could) use both SQL Clustering and Mirroring in the same environment, and this may be recommended depending on your SLA requirements, but databases that have RBS enabled will not be able to benefit from mirroring.

So, should you use RBS? Maybe, maybe not. If you're clear about the problem you're trying to solve and you clearly understand how RBS fits into solving that problem, then RBS may be a good solution. For example, if all of the following items are true, then you might consider RBS:

  • Your SharePoint environment is heavily focused on Document Libraries.
  • The vast majority (>70%) of those files exceed 1 MB in size.
  • The SharePoint content databases holding these files are relatively large (>200 GB) now or in the immediately foreseeable future.
  • You have Disaster Recovery tools that are either directly RBS aware, or have processes designed to synchronize backups.
  • You have highly skilled, expert SQL and Windows administration staff that is already well trained in RBS use, administration, and troubleshooting, or that has the capacity and directive to become so.

If ANY of the above items are not true... ANY OF THEM... then you should NOT be considering RBS.
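
The all-or-nothing nature of that checklist can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not a Microsoft tool: the `Environment` class, its field names, and `should_consider_rbs` are hypothetical names invented for this example, and the thresholds are simply the ones from the list.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Hypothetical summary of a farm; field names are illustrative only."""
    document_library_focused: bool     # environment is heavily focused on Document Libraries
    fraction_of_files_over_1mb: float  # 0.0 to 1.0, share of files larger than 1 MB
    content_db_size_gb: float          # current or near-term projected size
    dr_tools_rbs_aware: bool           # DR tools are RBS aware or backups are synchronized
    expert_admin_staff: bool           # SQL/Windows staff trained (or trainable) in RBS


def should_consider_rbs(env: Environment) -> bool:
    """Return True only if every item on the checklist holds."""
    return all([
        env.document_library_focused,
        env.fraction_of_files_over_1mb > 0.70,
        env.content_db_size_gb > 200,
        env.dr_tools_rbs_aware,
        env.expert_admin_staff,
    ])


# Everything lines up except the backup tooling, so the answer is still "no".
candidate = Environment(True, 0.85, 350, False, True)
print(should_consider_rbs(candidate))
```

Note that a True result only means RBS is worth evaluating, not that it should be deployed.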

Now... the true/false, fact/myth part of the conversation...

  • "Enabling RBS in SharePoint will make my overall SharePoint experience faster (browsing pages, viewing lists, processing workflows)."
  • o Maybe, but not significantly. It is likely that the benefit for you in performing these operations will be minimal. RBS increases performance exclusively when it comes to disk I/O patterns. When trying to present a list of items, the size of the files is irrelevant... it is the number of rows in the database tables that impacts performance. Activating RBS so that your 100 2GB files can be pushed out of the database file is not going to change the fact that SharePoint is only processing 100 rows of data. Put simply, the effort required to process the relational, structured data that is used to browse and present the SharePoint UI (and related content) will NOT be significantly impacted/benefited by RBS. If you have large lists in SharePoint, you're still going to have large lists after RBS, and that is where SQL is spending most of its time... sorting through the rows of data necessary to present SharePoint content. It isn't until you click directly on the file link that RBS is really even doing anything. Though many of the items you may be hitting are stored as BLOBs (pages, images, etc), those artifacts will likely fall below the threshold configured to move them into RBS, so the benefit for day-to-day browsing is still minimal.
  • "Using RBS will make my databases smaller so I can hold more content"
  • o Maybe, but only if it is well managed. It is possible to use RBS to put the structured data files on your "fast" disks, and your RBS BLOB store on your "almost as fast" disks that hold more data. If you're doing this though and you don't have huge amounts of content, you're probably better off simply buying more storage capacity. Remember, too, that RBS does not change the sizing recommendations/requirements for SharePoint databases. When calculating the size of the DB, include both the DB and its respective RBS content (total dataset size), so this doesn't get you past our sizing recommendations.
  • "RBS will let me access the files without going through SharePoint"
  • o ABSOLUTELY FALSE. If activated, the RBS BLOB store has the same basic rules as the SharePoint databases themselves: Thou Shalt Not Touch. The RBS BLOB store should be considered an extended component of the SharePoint database, and any supportability rules and requirements apply to the files stored and managed there by SharePoint just as they do to the databases themselves. These files are owned, managed, and accessed exclusively by SQL and SharePoint and should not otherwise be touched or manipulated manually except as required to perform exact and consistent, synchronized backup and/or restore activities. Anything else will violate the supportability and potentially the integrity of your SharePoint content. The only exception to any of this is the required maintenance of the RBS store as listed on TechNet.
  • "RBS will make file downloads faster"
  • o Maybe, but not significantly. It may make the transfer between the SQL server and the SharePoint Web Front End faster, but your own download is still going to go through IIS. It's doubtful any difference in speed will be noticeable to you.
  • "Microsoft is recommending that all users deploy RBS with SharePoint 2010"
  • o Absolutely False. RBS is designed to support specific business needs, and should only be deployed when those needs exist within the environment and the business is prepared to compensate for the complexity that comes with the solution. In most environments the additional administrative costs would likely exceed any infrastructure cost or benefits.
  • "You're telling me no one should be using RBS"
  • o Absolutely not. I'm saying your reasons should be good, you should know what you're getting yourself into, you should be solving a current or imminent business need, and you should plan to manage it well. For example, one area that might benefit from RBS is an environment with a large, centralized Records Center. Since the centralized Records Center consists almost exclusively of documents, can only exist in a single site collection, should only exist once per farm, and since the content doesn't experience a significant amount of "churn" (the documents are stored and retrieved, but not in a highly collaborative way), a large Records Center site could benefit significantly.
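
The sizing point from the "smaller databases" myth above comes down to simple arithmetic: the number compared against the sizing guidance is the database plus its RBS store. A minimal sketch, assuming a placeholder limit that you would replace with the figure from the current guidance (the function names and the sample numbers are made up for illustration):

```python
def total_dataset_size_gb(db_size_gb: float, rbs_store_size_gb: float) -> float:
    """Size that counts toward sizing guidance: database files plus RBS BLOB store."""
    return db_size_gb + rbs_store_size_gb


def exceeds_sizing_limit(db_size_gb: float, rbs_store_size_gb: float,
                         limit_gb: float) -> bool:
    """limit_gb is a placeholder; use the value from the guidance you follow."""
    return total_dataset_size_gb(db_size_gb, rbs_store_size_gb) > limit_gb


# Hypothetical farm: the database file looks small once the BLOBs are externalized,
# but the total dataset is what gets compared against the limit.
print(total_dataset_size_gb(40, 180))       # 220
print(exceeds_sizing_limit(40, 180, 200))   # True
```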

Finally, if you're still considering RBS or just want more information about it, a recently released whitepaper provides the most exhaustive and informative description of RBS I've seen to date. I strongly suggest reading it.

Note: This article has been updated to reflect the latest guidance from the SharePoint Product Group. Some artifacts may remain that reflect information prior to this update. If so, please leave a note in the comments for investigation and response.

Comments

  • Anonymous
    July 09, 2011
    Another great post Chris, a keeper! Really clears up some misconceptions and shows you where RBS can be used and more importantly when not to use it.

  • Anonymous
    July 10, 2011
    Great post, I referenced this in my recent blog on the SP2010 SP1 sizing changes: www.benjaminathawes.com/.../Post.aspx

  • Anonymous
    July 10, 2011
    Thanks for the comments guys, and for the reference Ben. :)

  • Anonymous
    July 10, 2011
    Great post. Definitely makes the topic of RBS clear and clears up misconceptions on who should be using! Thanks!!

  • Anonymous
    July 14, 2011
    Great post Chris, thanks for the info. On a slightly unrelated note, in the last "fact/myth" response, you mention that a Records Center "should only be used once per farm". Could you expand on why this is the recommendation for the Records Center? Thanks again!

  • Anonymous
    July 16, 2011
    Hi Jason - Great question. The recommendation is primarily targeted at 2007, but continues to have benefit in 2010 as well. The technical reason is that SharePoint (both versions) offers a "Declare as record" function which will send a given document to the Records Center configured in Central Administration. As this is in Central Administration, and is only a single entry, this means that there should really only be one Records Center in the farm. See technet.microsoft.com/.../cc824902(office.12).aspx for more information on 2007. Also, regarding 2007 we have a note in the downloadable book "Records Management for Office SharePoint Server 2007" (go.microsoft.com/fwlink) which states: "A Office SharePoint Server 2007 farm can point to a single target Records Center site as the location to which to send records from sites in that farm. If a farm hosting active documents must point to multiple Records Center sites, you must use the Windows SharePoint Services 3.0 object model to implement a custom router in the target Records Center site to route incoming records to the appropriate destination Records Center site." In 2010, there should still only be a single default Records Center, but you can configure the routing rules of that Records Center to route a document to other sites, including other Records Centers. In this case, there can be more than one, but there should only be one "default" per farm. Of course, there is also the option of "in-place" records - but that's a different show. ;) From a business process perspective, the goal of the Records Center is to provide a single location for records to be managed and a single primary point of investigation. Having numerous Records Centers effectively eliminates this benefit since someone must still fish through a lot of different document repositories in order to do e-discovery. Creating multiple Records Centers is technically possible, but more than one is of diminished usefulness from both a technology and business process perspective. -Chris
