Category: RBS

The First SharePoint Backup, Recovery and BLOB Storage Solution

We like to make history at Metalogix. With the release of StoragePoint 4.2, we’ve done it again.

For the first time in the history of SharePoint, there’s now a fully integrated BLOB storage, SharePoint backup and granular restore solution for any version of SharePoint. With access available directly through SharePoint Central Administration, SharePoint administrators can now add the benefits of automated backups and granular recovery to their Remote BLOB Storage (RBS) solution.

With Metalogix StoragePoint 4.2, we have developed the first RBS solution that integrates backup and recovery capabilities into a single user interface in Central Administration. The release continues StoragePoint’s long history as the most intuitive RBS solution through its native integration with SharePoint. If you’re familiar with SharePoint, regardless of the version, you’ll be comfortable with StoragePoint 4.2.

To ensure your familiarity with the new release, here are some of the key capabilities we added to the product:

  • Automates the Out-of-Box (OOB) backup process and integrates StoragePoint’s existing continuous BLOB backup capability.
  • Includes granular recovery – all the way down to the document level – for all versions including SharePoint 2013.
  • Introduces support for backup to the cloud, which will reduce disk space for backups and your storage costs.
  • Provides the ability to restore an endpoint from backup.

A key concern I’ve regularly heard from customers is how they can quickly respond when someone in their organization has lost a document or folder. We addressed this for SharePoint admins and IT professionals by including new granular recovery capabilities in StoragePoint. Rather than rely on lengthy, time-consuming SharePoint backup processes with SQL Backup, cumbersome scripting to manage your backups, or a central backup/recovery team to respond to your requests, you’ll now be mere clicks away from recovering your data with StoragePoint 4.2.

Do you have compliance, retention and archiving policies for SharePoint content that are causing you concern? In StoragePoint 4.2, we added crucial capabilities for event-based retention support for EMC Centera. Additionally, there are new capabilities for synchronization between SharePoint Information Management Policies and external storage retention policies.

We’re excited to share the new features in StoragePoint 4.2 with you and invite you to schedule a demo today to see the product in action. See for yourself how the market’s only fully integrated SharePoint storage, backup and granular recovery solution works seamlessly within Central Administration.

You can also download StoragePoint Express for free with a 200GB license by clicking here.

Dispelling the Myths of Shredded Storage in SharePoint 2013: Part 2

Read Part 1: The Impact of Shredded Storage on SharePoint 2013

It is important to separate fact from fiction when it comes to shredded storage in SharePoint 2013, a topic that created a tremendous amount of buzz in the SharePoint community.

It is true that shredded storage, for collaboration scenarios, will reduce network and storage I/O associated with saving edits to an existing document. However, it completely misses the mark when it comes to performance related to file upload and download.

Thus, it is now time to dispel the myth that shredded storage serves as a replacement for Remote BLOB Storage (RBS) and show how you can optimize Metalogix StoragePoint to make the most of shredded storage in SharePoint 2013.

Remote BLOB Storage (RBS) provides plumbing to allow third-party providers to offload binary large objects (BLOBs) from SharePoint Content Databases to external storage locations. The primary benefit of RBS is to reduce the size of unstructured data (BLOBs) stored in SQL Server databases while providing support for commodity storage.  Third party providers like Metalogix StoragePoint have extended this basic BLOB offloading capability to provide a long list of enterprise capabilities including support for a wide variety of storage devices, compliant storage, archiving, and enhanced backup/restore capabilities.

Recently I have heard statements from within the SharePoint community and from other companies that RBS is no longer needed due to the introduction of shredded storage.  This couldn’t be further from the truth.

In part 1 of this series, we discussed the primary benefits of shredded storage. Microsoft’s goal in implementing shredded storage was to reduce the I/O associated with saving document changes. Rather than storing entire copies of files, SharePoint 2013 shreds files into smaller chunks, allowing incremental changes to documents to be saved to the SharePoint Content Database. As a result, network and storage I/O is greatly reduced, making the process of saving edits to a document very efficient. Additionally, SQL transaction logs associated with document edits are smaller, making log shipping more efficient (in fact, addressing log shipping challenges for Office 365 was one of the drivers for introducing Shredded Storage). But what about the impact on uploading new documents and downloading existing documents in SharePoint?

In my experience, most SharePoint farms have a much higher ratio of downloads and uploads versus edits to existing documents. The fact remains that while shredded storage greatly improves the I/O characteristics when saving incremental changes to SharePoint, it has a net negative impact on upload and download speeds.

With Shredded Storage in place, the core value that RBS provides still exists. Does Shredded Storage reduce the size of SharePoint Content Databases (SQL Server) by removing BLOBs from SQL Server databases? Does shredded storage allow you to leverage commodity or compliant storage devices? Does shredded storage address backup challenges with growing SharePoint environments? The answer to all of these questions is a resounding “NO!”

Optimizing RBS with Shredded Storage

In SharePoint Server 2013, Shredded Storage and RBS coexist without issue.  As previously discussed, the result of Shredded Storage is a single file broken down into smaller “chunks” and stored within the SharePoint Content Database.  With RBS in place the smaller “chunks” will be externalized rather than a single file.  Regardless of shredding the end result is the same: BLOBs are stored outside of SharePoint Content Databases.

There are, however, considerations when optimizing the performance of RBS with Shredded Storage. While Shredded Storage cannot be “turned off” in SharePoint Server 2013, it can be tuned or effectively disabled by changing the chunk size of the file shreds. The default chunk size is 64KB; however, you could set the chunk size to 2GB (the maximum allowable file size in SharePoint), effectively disabling Shredded Storage. When performance testing Metalogix StoragePoint with Shredded Storage, we found that setting the chunk size to 20MB yields the best upload and download performance. Changing the chunk size is quite simple and requires only a bit of PowerShell script.

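# Load the SharePoint server object model (run this on a SharePoint server)
[void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SharePoint")
$service = [Microsoft.SharePoint.Administration.SPWebService]::ContentService
# Chunk size in bytes; 20MB is the value that tested best with StoragePoint
$service.FileWriteChunkSize = 20MB
$service.Update()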

You will need to perform an IISRESET and restart the SP Timer Service on all machines in the farm.
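To get a feel for what the chunk size setting means in practice, here is a rough Python sketch that counts how many chunks a file would be split into at a given chunk size. It is only ceiling division under the assumption that a file is cut into fixed-size pieces; the function name `chunk_count` is hypothetical and the sketch does not model SharePoint's internal shredding logic.

```python
KB = 1024
MB = 1024 * KB


def chunk_count(file_size_bytes: int, chunk_size_bytes: int) -> int:
    """Number of fixed-size chunks needed to hold a file (ceiling division)."""
    if file_size_bytes < 0 or chunk_size_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division; an empty file still counts as one chunk here.
    return max(1, -((-file_size_bytes) // chunk_size_bytes))


if __name__ == "__main__":
    file_size = 100 * MB
    for label, size in (("64KB (default)", 64 * KB), ("20MB", 20 * MB)):
        print(f"{label}: {chunk_count(file_size, size)} chunks")
```

With RBS in place, each of those chunks is a separate externalized BLOB, which is one way to see why a larger chunk size can help upload and download performance.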

As you have seen over this two-part series, there is a lot of misinformation currently floating around about Shredded Storage and RBS in SharePoint 2013. The reality is that neither replaces the other. In fact, Shredded Storage and RBS complement each other. Shredded Storage reduces network and storage I/O when saving document edits. And RBS reduces Content Database size, improves upload and download speed, and accelerates backup/restore operations. Following the guidelines above will help you get the most out of RBS and Shredded Storage.


Continuous BLOB Backup with StoragePoint

By now, you are probably aware that StoragePoint v4.0 was released this week. With the release of StoragePoint v4.0, we are excited about new features that will greatly improve your SharePoint backup and restore experience. Not only does StoragePoint v4.0 improve SharePoint performance and reduce storage costs, it can now automatically back up BLOBs, allowing our customers to satisfy requirements for Recovery Point Objectives that were previously difficult to meet.

In addition to continuous BLOB backup, the latest version of StoragePoint provides granular, item-level restore capability that works with standard out-of-the-box tools.  Here’s how continuous BLOB backup and the granular restore capability can greatly improve your backup/restore experience with SharePoint.

SharePoint Remote BLOB Storage (RBS) providers like Metalogix StoragePoint allow you to externalize unstructured data normally stored within SharePoint content databases to cost effective, tiered storage locations.  This results in up to 95% reduction in your content database size, increases performance of your SharePoint server farm, and now significantly improves backup and restore performance.

As files are uploaded to SharePoint, StoragePoint externalizes BLOBs in real time, allowing metadata to continue to be stored in the SharePoint content database. By enabling StoragePoint continuous BLOB backup, a copy of each externalized BLOB will be written to a backup location. Since the majority of your content database is comprised of unstructured data (i.e. BLOBs), proactively backing up BLOBs leaves very little data to back up on a regular interval using out-of-the-box backup tools.

StoragePoint v4.0 BLOB Backup

SharePoint undoubtedly has become a mission-critical platform for many organizations that now rely on its robust Enterprise Content Management (ECM) capabilities. Due to the critical nature of SharePoint, organizations with growing environments now require very low Recovery Point Objectives (RPOs).

Traditional backup and restore tools often fail to meet the aggressive RPO requirements that organizations demand. This is in part due to the speed at which these tools can execute backups, often taking eight or more hours to back up 1TB of data. The result is a backup approach that will not meet RPO requirements. The combination of continuous BLOB backup and granular restore capability, along with the benefits of StoragePoint Remote BLOB Storage (RBS), makes for a solid platform for scaling your SharePoint environment. For more information about Metalogix StoragePoint visit the product page or request a demonstration.

Remote BLOB Storage – The Cost of Doing Nothing

The history of SharePoint is quite fascinating when you consider its roots as a departmental collaboration and document management tool.  In many organizations, SharePoint has grown up from literally an application running on a server under someone’s desk to an enterprise-wide, mission-critical system.  Even today, I am quite surprised at how often SharePoint growth is underestimated.

Because of this, organizations deploying SharePoint should consider Remote BLOB Storage (RBS) as a viable option for managing growth and reducing ongoing storage and management costs.  While there is a cost associated with deploying a third-party RBS solution, the cost of doing nothing could be much greater in the long run.

Before we explore the cost of doing nothing, let’s revisit the value that RBS brings to SharePoint.

  1. Enhances performance and scalability of SharePoint while reducing Total Cost of Ownership (TCO)
  2. Provides flexible storage options for storing unstructured SharePoint content.  Different storage options provide lower storage costs, lower overall TCO, and support for compliance scenarios that were previously not supported.
  3. Reduces management overhead by reducing the size of SharePoint content databases (SQL Server databases) and creating new options for backup, restore, disaster recovery, and high-availability scenarios.

Depending on the current state of your SharePoint deployment, RBS can provide immediate and often significant return on investment (ROI).  For organizations deploying new hardware in the form of storage infrastructure, the primary ROI will come from reduced up-front storage procurement costs. Consider the following scenario where we simply divert 90% of the storage from tier 1 (typically SAN or Direct Attached Storage) to tier 2 (primarily NAS storage).

Summary
Item                                     Value
StoragePoint Acquisition Cost            $29,990.00
Support & Maintenance                    $5,398.20
Total Storage (GB)                       3072
Total Investment without StoragePoint    $15,360.00
Total Investment with StoragePoint       $42,453.80
Monthly Cost without StoragePoint        $6,144.00
Monthly Cost with StoragePoint           $1,996.80
Monthly Savings                          $4,147.20
Return on Investment (Months)            6.5

Tier 1 Storage
Item                           Value
Acquisition Cost/GB            $5.00
Monthly Mgmt Cost/GB           $2.00
Percentage of Total Storage    10%
Total Tier Storage (GB)        307.2
Total Acquisition Cost         $1,536.00
Total Monthly Cost             $614.40

Tier 2 Storage
Item                           Value
Acquisition Cost/GB            $2.00
Monthly Mgmt Cost/GB           $0.50
Percentage of Total Storage    90%
Total Tier Storage (GB)        2764.8
Total Acquisition Cost         $5,529.60
Total Monthly Cost             $1,382.40

As you can see, if you are planning a capital expenditure for storage to support your new or growing SharePoint environment, complete ROI comes in only a few months and is often immediate. Factors including per-GB disk cost and ongoing management costs will affect ROI. Deferring the use of RBS could have immediate, and in some cases significant, impacts on cost.
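For readers who want to plug in their own numbers, the figures in the scenario above can be reproduced with a short Python sketch. The inputs are taken from the table; the function names (`tier_costs`, `roi_summary`) are hypothetical, and the payback formula (extra up-front investment divided by monthly savings) is my reading of how the 6.5-month figure was derived rather than a documented method.

```python
TOTAL_GB = 3072
STORAGEPOINT_LICENSE = 29_990.00   # acquisition cost from the table
SUPPORT_MAINTENANCE = 5_398.20     # support & maintenance from the table


def tier_costs(share_pct, acquisition_per_gb, monthly_per_gb, total_gb=TOTAL_GB):
    """Return (GB in tier, acquisition cost, monthly management cost)."""
    tier_gb = total_gb * share_pct / 100
    return tier_gb, tier_gb * acquisition_per_gb, tier_gb * monthly_per_gb


def roi_summary():
    # Without RBS: everything sits on tier 1 ($5.00/GB to buy, $2.00/GB/month).
    invest_without = TOTAL_GB * 5.00
    monthly_without = TOTAL_GB * 2.00

    # With RBS: 10% stays on tier 1, 90% moves to tier 2 ($2.00/GB, $0.50/GB/month).
    _, t1_acq, t1_monthly = tier_costs(10, 5.00, 2.00)
    _, t2_acq, t2_monthly = tier_costs(90, 2.00, 0.50)
    invest_with = STORAGEPOINT_LICENSE + SUPPORT_MAINTENANCE + t1_acq + t2_acq
    monthly_with = t1_monthly + t2_monthly

    savings = monthly_without - monthly_with
    # Assumed payback formula: months until savings cover the extra investment.
    payback_months = (invest_with - invest_without) / savings
    return {
        "invest_without": round(invest_without, 2),
        "invest_with": round(invest_with, 2),
        "monthly_without": round(monthly_without, 2),
        "monthly_with": round(monthly_with, 2),
        "monthly_savings": round(savings, 2),
        "payback_months": round(payback_months, 1),
    }


if __name__ == "__main__":
    for name, value in roi_summary().items():
        print(f"{name}: {value}")
```

Changing the tier split or the per-GB costs shows quickly how sensitive the payback period is to your own storage pricing.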

For organizations with expanding SharePoint environments, the cost of doing nothing can almost certainly have a negative impact.  With the understanding that the purchase of storage to support SharePoint may be a sunk cost for your organization, there are considerations for large SharePoint content databases.  Microsoft recently updated their guidance for large database support in SharePoint and I have previously posted a blog, Revisiting SharePoint Remote BLOB Storage, which provides detail on the new guidance.

The net of the new guidance is that supporting large content databases requires special consideration for disk IOPS (input/output operations per second). For organizations choosing to store all unstructured content in SQL Server databases, the cost of doing so can be extremely high. The reason is that the high number of IOPS required per GB (0.25 IOPS per GB minimum, with 2 IOPS per GB recommended) results in overprovisioned, underutilized disks. The number of disks required to support a large SharePoint content database can be very large and result in significant costs, both in procurement and ongoing maintenance, for storage. RBS alleviates this concern by storing unstructured content outside of the database, usually on a less expensive tier.
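As a quick illustration of what the per-GB guidance implies, the following Python sketch turns a database size into the minimum and recommended IOPS figures. It uses only the two ratios quoted above; the function name `required_iops` is hypothetical.

```python
MIN_IOPS_PER_GB = 0.25          # minimum from the guidance quoted above
RECOMMENDED_IOPS_PER_GB = 2.0   # recommended from the guidance quoted above


def required_iops(db_size_gb: float) -> tuple[float, float]:
    """Return (minimum IOPS, recommended IOPS) for a content database size in GB."""
    if db_size_gb < 0:
        raise ValueError("database size must be non-negative")
    return db_size_gb * MIN_IOPS_PER_GB, db_size_gb * RECOMMENDED_IOPS_PER_GB


if __name__ == "__main__":
    for size_gb in (1024, 4096):
        minimum, recommended = required_iops(size_gb)
        print(f"{size_gb} GB: minimum {minimum:.0f} IOPS, recommended {recommended:.0f} IOPS")
```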

Beyond storage and ongoing maintenance costs, there are also soft costs that are more difficult to measure. Consider that storing all SharePoint content in SQL Server databases will result in increased management costs for database maintenance, including backup/restore processes. Have you considered the costs for storing backups for the period of time dictated by your recovery point objectives (RPO)? How about the cost of data replication to support failover and nonproduction environments (Dev, Staging, QA, etc.)? RBS allows organizations to leverage storage devices that facilitate backup/recovery scenarios and provide better ways to back up and/or replicate the unstructured SharePoint content, which can comprise up to 95% of your storage utilization.

Planning for deployment and ongoing management of your SharePoint environment is critical to controlling costs in the short and long term. RBS can provide immediate cost savings in the short term, with significant cost savings over the long term as your SharePoint environment continues to grow. Often the fear of uncertainty that surrounds new technology will cause organizations to dismiss the technology altogether. While RBS may fall into this category, you should take a second look. The cost of doing nothing may be higher than you think.

SPC11 Fail-over Demo Reveals Importance of Remote BLOB Storage

After much fanfare, SPC11 has come and gone leaving us with lots of great information to digest.  For me, one of the more compelling parts of the show was the keynote address where Richard Riley, Director on the SharePoint team, demonstrated a high availability scenario with SharePoint 2010 and the CTP of SQL Denali.  The demonstration included a 40-second fail-over of a 14TB database while simulating over 7,500 concurrent user connections. It was certainly an impressive display of raw computing power with EMC VNX storage being at the center of it all.

If you missed the SPC11 keynote you can watch a recording here: http://www.mssharepointconference.com/pages/keynote.aspx.

While failing over a 14TB SharePoint database live on stage was both impressive and daring, I feel a few important details may have been lost in the overall demonstration. Let’s first revisit the high-level environment configuration. Microsoft has done us a favor by publishing the full documentation for the fail-over test. You can find it here: http://www.microsoft.com/download/en/details.aspx?id=27573

At a high level, the SharePoint Server farm contained the following components:

  • EMC VNX 5700 with 300TB of storage
  • NEC Express 5800/A1080a server with 1TB of RAM
  • SharePoint Server 2010
  • SQL Denali CTP

** Note that all servers in the farm were virtualized and ran on the NEC server.

The test simulated the following:

  • A single 14TB content database containing document archive content
  • 7,500 concurrent users
  • 108 Million Documents in an Archive Scenario (note that Microsoft’s supported limit is 60 million items per content database)
  • Fail-over of the SQL Server due to a network outage

What stands out most in this environment is the massive amount of storage required to support a 14TB document archive in SharePoint. This demonstration leaves many people asking the question, “This is impressive but can I afford that?” You may already be aware that Microsoft released updated database sizing guidelines in conjunction with Service Pack 1 for SharePoint 2010. As part of the updated guidance, SharePoint now supports content databases larger than 200GB in size. Going beyond 200GB for both collaboration and document archive scenarios requires special consideration for the disk subsystem. I wrote about these considerations in a previous post you can review here.

At a high level, databases larger than 200GB in size require at least 0.25 IOPS/GB, with 2 IOPS/GB being recommended. This is the very reason for overprovisioned storage in the case of the 14TB high-volume fail-over test demonstrated at SPC11. The result is 300TB of overprovisioned SAN storage required to support two (2) 14TB content databases (there are two copies of the database in the fail-over scenario). At a cost in the millions of dollars, this is hardly a cost-effective solution for managing large volumes of SharePoint content. So you have to ask, “Is there a better way?” The answer is most certainly “YES!”

Many organizations have dozens if not hundreds of TBs of content sitting on file shares and these organizations have a need/requirement to move much of this content into an environment that allows for true content lifecycle management.  SharePoint is the logical choice for much of this content due to the extensive features provided at a relatively low cost.

As you can see, when SharePoint grows in size the underlying infrastructure and storage costs can increase exponentially. Microsoft, in conjunction with partners like Metalogix, provides a better way to scale your SharePoint environment while reducing overall storage costs. SharePoint 2010 provides Remote BLOB Storage (RBS) interfaces that allow partners like Metalogix to provide the ability to externalize unstructured data (e.g. Office documents, PDF, TIF, JPG, etc.) from the SharePoint content databases.

Leveraging RBS and Metalogix StoragePoint will allow your organization to grow SharePoint without the need for expensive over provisioned storage. For more information on RBS and Metalogix StoragePoint

Revisiting RBS: SharePoint and Service Pack 1

Since the release of SharePoint 2010 there has been a lot of debate over the value and application of Remote BLOB Storage (RBS) with SharePoint 2010. That debate has been reinvigorated with the updated Plan for Software Boundaries guidance that was published with Service Pack 1 for SharePoint 2010. Frankly I am disturbed by the volume of blog posts that include inaccurate information about RBS and the updated Software Boundaries and Limits. It seems that some “experts’” influence far outweighs their competence on the subject of RBS. I have personally seen the confusion created in the ecosystem manifest itself in conversations I have with customers on a daily and weekly basis. For this reason I think it is pertinent to revisit the value of BLOB externalization, correct common misconceptions, and discuss how the updated guidance from Microsoft may impact your consideration to implement SharePoint with a Remote BLOB Storage solution.

A Brief History of BLOB Externalization

First a brief history lesson before we tackle RBS (a history lesson that many of you already know). BLOB externalization is not a new concept. In fact most legacy ECM systems store unstructured data (files) separate from metadata stored within a relational database. Microsoft originally developed SharePoint this way using the same Web Storage System that Exchange Server uses. With the release of WSS 2.0/SharePoint 2003, Microsoft moved to storing all data (structured and unstructured) within SQL Server databases. Many vendors attempted to address database growth through the use of archive tools that would pull BLOBs from the database in a post-processing batch job. While this worked to solve database bloat problems, it created compatibility issues with out-of-the-box and third-party SharePoint solutions, which had to be “aware” of the archive product in use and understand how to interpret the stub left behind. Definitely not the most elegant solution and it certainly didn’t address the core issue. It wasn’t until the release of Service Pack 1 with SharePoint 2007/WSS 3.0 that Microsoft introduced support for BLOB externalization via the EBS interface. Subsequently Microsoft introduced support for Remote BLOB Storage (RBS) along with continued support for EBS* with SharePoint 2010.

BLOB externalization isn’t about being able to leverage commodity disk but rather being able to leverage the “optimal” disk based on the content being managed/stored. The goal is to make sure that patient records, invoices, purchase orders, lunch menus, and vacation pictures land on the most optimal storage device. For obvious reasons not all content is created equal nor should it be treated as such. Subsequently there are scenarios that SharePoint simply cannot support out of the box or with the RBS FileStream provider. For example, take SEC Rule 17a-4 (Electronic Storage of Broker-Dealer Records) requirements for client/customer records. Once declared a record, any client-related document (an IRA account opening document, for example) has specific requirements for storage (it must be immutable and unalterable). Third-party RBS products like Metalogix StoragePoint facilitate this scenario through support of WORM (Write Once, Read Many) and CAS (Content Addressable Storage) devices.

In the process of optimizing the storage environment for SharePoint, BLOB externalization accomplishes some critical goals. It is no secret that relational databases (including SQL Server) are not the ideal place to store large pieces of unstructured data. No, this isn’t a dig at SQL Server but rather stating the obvious fact that a system optimized for the storage of highly relational, fine-grained transactional data is not an ideal place to store large pieces of unstructured data. The problem with SQL Server is that the performance cost related to storing BLOBs in the database is expensive. If you consider the RPC call between the web server and the database that contains a large payload (metadata plus the BLOB) and the IO, processor, and memory requirements for storing the BLOB, you have a very expensive process. Yes, it is true that Microsoft has optimized BLOB performance in the subsequent releases of SQL Server, but it is still more optimal to store BLOBs outside of the database when you consider a typical SharePoint farm under load or the process for executing a bulk operation such as a full crawl. The updated guidance from Microsoft would certainly support this assertion. Microsoft itself has documented this fact in many of its own publications and even alluded to the initial value of BLOB externalization as being a way to improve the performance of your SharePoint environment. Additionally SQL Server is very rigid in terms of the type of storage it can leverage and the methods by which you back up the environment. This brings me to my next point. What was the intent of providing BLOB externalization interfaces within the SharePoint product in the first place?

The original sizing guideline/limitation for content databases with WSS 3.0/SharePoint 2007 was 100GB (collaboration sites). With SharePoint 2010 Microsoft increased the size limit to 200GB and changed the limit yet again with Service Pack 1 for SharePoint 2010 (more on this later). These limits proved to be problematic for many looking to implement SharePoint pervasively throughout their organization. Not only is database growth a problem, there are challenges with segmentation of content to work around database size restrictions, along with SQL Server being a less than optimal place to store BLOBs. Additionally backup/restore would become a challenge as SharePoint environments continued to grow both in size and criticality.

Microsoft originally positioned BLOB Externalization as a way to reduce the size of your SharePoint content database. While there is some debate on this topic, it is generally agreed upon in the SharePoint community that the content database size limitations did NOT include externalized BLOBs (this changes with Service Pack 1 for SharePoint 2010). When the StoragePoint team released StoragePoint 2.0 we spent quite a bit of time creating and shaping the messaging for the product, which included the following benefits that hold true as the basis for BLOB externalization:

  • Reduce the size of your SharePoint content databases by externalizing BLOBs. Roughly 90-95% of your content database consists of BLOBs (this varies with auditing enabled). By externalizing the BLOBs we can reduce the size and number of databases required to support your environment.
  • Optimize performance by freeing SQL Server from the burden of managing unstructured pieces of data
  • Support the use of a variety of storage platforms based on business requirements (storage costs, compliance, performance, etc.) including SAN, NAS, and Cloud storage.
  • Create new opportunities to replicate and back up SharePoint content in a more efficient manner

If you consider that roughly 90-95% of a content database is comprised of BLOBs then you stand to have significant reduction in the database size and an increase in the size of the content you can manage per content database. One of the metrics that we often referred to with StoragePoint was the management of 2TB of content. If you reduce a 2TB content database by 95% you end up with a 102.4GB content database and 1945.6GB (1.9TB) of externalized BLOBs. This would be well within the database size limits for SharePoint 2010 and at the high end of the limit for SharePoint 2007. This sounds familiar doesn’t it? I think I have seen something like this in the SP1 limits for SharePoint 2010 … let’s take a look.
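The arithmetic behind the 2TB example is simple enough to sketch in Python. The function name `split_content` is hypothetical, and the BLOB share is passed as a whole-number percentage purely to keep the floating-point results tidy.

```python
def split_content(total_gb: float, blob_pct: int) -> tuple[float, float]:
    """Split total content into (database GB after externalization, externalized BLOB GB)."""
    if not 0 <= blob_pct <= 100:
        raise ValueError("blob_pct must be between 0 and 100")
    blob_gb = total_gb * blob_pct / 100
    database_gb = total_gb * (100 - blob_pct) / 100
    return database_gb, blob_gb


if __name__ == "__main__":
    total_gb = 2 * 1024  # 2TB of managed content
    for pct in (90, 95):
        db_gb, blob_gb = split_content(total_gb, pct)
        print(f"{pct}% BLOBs: {db_gb} GB database, {blob_gb} GB externalized")
```

At the lower end of the 90-95% range, the same 2TB of content leaves a database of roughly 205GB, which is why the exact BLOB ratio in your environment matters when planning against the size limits.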

Service Pack 1 Consideration

Prior to the release of Service Pack 1 for SharePoint 2010 the content database size limit did not include externalized BLOBs (yes, yes this is debatable but I can tell you that this is generally accepted based on the lack of clarity in the original Plan for Software Boundaries documentation). Microsoft revised this guidance along with the database size limits. For SharePoint 2010 the 200GB size limit is still in effect with a new option to expand a “collaboration” site to 4TB. Now for the fun part … in order to expand a content database beyond the 200GB limit you need an optimized SQL Server disk subsystem. Specifically Microsoft recommends that you have 2 IOPS (input/output operations per second) per GB of storage. Note that I am generalizing a bit on the SP1 guidelines and limits so you can read them for yourself here.

While the database “size” limitation includes the externalized BLOBs in the calculations, externalized BLOBs are not included in the IOPS requirement. In order to manage a 4TB database you must have a disk subsystem that supports 2 IOPS per GB. If you are not familiar with this concept I can tell you that this is an expensive disk configuration (more on this below). With StoragePoint in place you can have a 4TB “content database” that consists of approximately 200GB SQL Database and 3.8TB of externalized BLOBs. All without an expensive disk requirement. Sound familiar? This is the same messaging that the StoragePoint team advertised with the original release of StoragePoint 2.0 with SharePoint 2007/WSS 3.0 SP1. If you do believe that the IOPS requirement includes the externalized BLOBs then you have to discount Microsoft’s support for NAS storage (via iSCSI) with the RBS FileStream provider. Most NAS devices were not intended to support such a high level of IOPS. The new guidance is simply reaffirming what the StoragePoint team has asserted all along. Using a 95% reduction in the database (a typical database is comprised of 95% BLOBs) you would end up with a 200GB content database (within Microsoft’s original guidelines). If you decide to keep the BLOBs in the database then you need to have lots of expensive disk to maintain performance of your environment.

Let’s take a practical example following Microsoft’s database size limits and guidelines for disk performance and determine what a reasonable disk subsystem might look like. Remember that Microsoft requires 0.25 IOPS per GB for content databases over 200GB (2 IOPS per GB is highly recommended for optimal performance). Note that in order to keep things brief and to the point I am using some rough estimates to calculate IOPS. Disk performance is impacted by hard disk specs, RAID level, and controllers.

IOPS = 1000 / (average rotational latency in ms + average seek time in ms)
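As a rough illustration of this estimate, the Python sketch below derives average rotational latency from spindle speed (half a revolution) and adds an average seek time. Because both times are in milliseconds, the result is scaled by 1000 to get operations per second. The seek times used here are illustrative assumptions, not vendor specifications, and the function names are hypothetical.

```python
def rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency: the time for half a revolution, in milliseconds."""
    return 0.5 * 60_000 / rpm


def estimate_iops(rpm: int, avg_seek_ms: float) -> float:
    """Rough per-disk IOPS estimate from rotational latency and average seek time."""
    return 1000 / (rotational_latency_ms(rpm) + avg_seek_ms)


if __name__ == "__main__":
    # Seek times below are assumed, typical-looking values for illustration only.
    for label, rpm, seek_ms in (
        ("7200 RPM", 7200, 8.5),
        ("10000 RPM", 10_000, 4.5),
        ("15000 RPM", 15_000, 3.5),
    ):
        print(f"{label}: ~{estimate_iops(rpm, seek_ms):.0f} IOPS")
```

Real drives vary by model and workload, so treat the per-disk figures as planning estimates only.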

The following tables illustrate the number of disks required to achieve both 0.25 IOPS per GB (minimum requirement) and 2 IOPS per GB (recommended). Note that for this example we will assume that the IOPS requirement is for data beyond 200GB, leaving us with 3.8TB of data that requires optimal disk configuration (minimum IOPS = 972; recommended IOPS = 7792). Note the following assumptions used when calculating IOPS in the tables below.

  1. For each disk type IOPS estimates were used. IOPS will vary based on disk type and manufacturer
  2. RAID 5 and RAID 10 disk configurations were used as these tend to be the most common configurations for database servers (RAID 10 being the preferred configuration).
  3. The IOPS calculations make the assumption that 0.25 IOPS/GB and 2 IOPS/GB are required for databases above 200GB. The initial 200GB of data is not included in the minimum and recommended IOPS calculations. Additional disks would be required if it were, as including the 200GB in the calculations would require an additional 50 and 400 IOPS respectively.
  4. There is an IOPS penalty that varies based on the RAID configuration. For RAID 10 the IOPS penalty is calculated at .8 and for RAID 5 the IOPS penalty is calculated at .57.

Disk Configuration Sample for Minimum IOPS

Drive Type     | IOPS per Disk | RAID Level | Disk Capacity (GB) | # Disks | Usable Capacity (GB) | Max IOPS
---------------|---------------|------------|--------------------|---------|----------------------|---------
7200 RPM SATA  | 90            | RAID 10    | 1024               | 14      | 7168                 | 1008
10000 RPM SATA | 130           | RAID 10    | 1024               | 10      | 5120                 | 1040
10000 RPM SAS  | 140           | RAID 10    | 1024               | 10      | 5120                 | 1120
15000 RPM SAS  | 180           | RAID 10    | 1024               | 8       | 4096                 | 1152
7200 RPM SATA  | 90            | RAID 5     | 512                | 20      | 9216                 | 1026
10000 RPM SATA | 130           | RAID 5     | 512                | 14      | 6144                 | 1037.4
10000 RPM SAS  | 140           | RAID 5     | 512                | 14      | 6144                 | 1117.2
15000 RPM SAS  | 180           | RAID 5     | 512                | 10      | 4096                 | 1026

Disk Configuration Sample for Recommended IOPS

Drive Type     | IOPS per Disk | RAID Level | Disk Capacity (GB) | # Disks | Usable Capacity (GB) | Max IOPS
---------------|---------------|------------|--------------------|---------|----------------------|---------
7200 RPM SATA  | 90            | RAID 10    | 1024               | 110     | 56320                | 7920
10000 RPM SATA | 130           | RAID 10    | 1024               | 76      | 38912                | 7904
10000 RPM SAS  | 140           | RAID 10    | 1024               | 70      | 35840                | 7840
15000 RPM SAS  | 180           | RAID 10    | 1024               | 56      | 28672                | 8064
7200 RPM SATA  | 90            | RAID 5     | 512                | 152     | 38912                | 7797.6
10000 RPM SATA | 130           | RAID 5     | 512                | 106     | 27136                | 7854.6
10000 RPM SAS  | 140           | RAID 5     | 512                | 98      | 25088                | 7820.4
15000 RPM SAS  | 180           | RAID 5     | 512                | 76      | 19456                | 7797.6
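
For anyone who wants to check the arithmetic, here is a minimal sketch of how the disk counts and Max IOPS figures can be derived from the assumptions listed earlier. It takes 4TB as 4096GB, excludes the first 200GB, applies the .8 and .57 RAID factors, and rounds the disk count up to an even number. The even-number rounding and all function names are my own assumptions for illustration.

```python
import math

TOTAL_GB = 4096    # 4TB content database
EXEMPT_GB = 200    # first 200GB is excluded from the IOPS calculation
RAID_FACTOR = {"RAID 10": 0.8, "RAID 5": 0.57}


def required_iops(iops_per_gb: float) -> float:
    """IOPS needed for the data beyond the first 200GB."""
    return (TOTAL_GB - EXEMPT_GB) * iops_per_gb


def disks_needed(target_iops: float, iops_per_disk: int, raid_level: str) -> int:
    """Smallest even number of disks whose usable IOPS meet the target."""
    effective_per_disk = iops_per_disk * RAID_FACTOR[raid_level]
    count = math.ceil(target_iops / effective_per_disk)
    return count + (count % 2)  # assumption: disks are added in pairs


def max_iops(disks: int, iops_per_disk: int, raid_level: str) -> float:
    """Usable IOPS of an array after the RAID penalty."""
    return round(disks * iops_per_disk * RAID_FACTOR[raid_level], 1)


minimum = required_iops(0.25)
recommended = required_iops(2)
print(minimum, recommended)
print(disks_needed(minimum, 90, "RAID 10"), disks_needed(recommended, 180, "RAID 5"))
```

Swapping in your own per-disk IOPS estimates or RAID factors is the easiest way to see how sensitive the disk count is to those inputs.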

As you start calculating the IOPS requirements (both minimum and recommended) it quickly becomes apparent that achieving an “optimized” disk subsystem for your large database is going to be quite expensive and will most likely result in overprovisioned disks. When you begin to consider replication of environments for disaster recovery and nonproduction scenarios (e.g. moving production data into a nonproduction environment for testing), you will see a 2-5X multiplier on the disk subsystem required to support SQL Server. Obviously this is not the ideal scenario for most organizations deploying SharePoint on any reasonable scale. RBS and products like Metalogix StoragePoint allow organizations to store content on the appropriate storage without the need to meet an expensive IOPS requirement.

Why Not Just Use the RBS FileStream Provider?

Somehow the RBS FileStream provider has evolved into a solution that some would actually consider for a medium or large scale SharePoint environment. I think folks forget why this provider was created in the first place. WSS 3.0 with the WID (Windows Internal Database) option does not have a database size limit. In theory, and in practice, organizations can and have stuffed large volumes of content into this “at no additional charge” product. With the release of SharePoint Foundation 2010 and SQL Server 2008 Express Edition, Microsoft introduced database size limits. SQL Server 2008 R2 Express Edition has a 10GB limit per database (SQL Server 2008 Express Edition has a 4GB limit) … wait for it … now you see the problem. How can a customer upgrade without buying SQL Server licenses? Enter the RBS FileStream provider.

The problem with the RBS FileStream provider is that it lacks basic features required to call it an enterprise solution. There are obvious issues such as the lack of a user interface, the lack of support for “remote” storage, and the lack of a multithreaded garbage collection process (this issue plagues many StoragePoint competitors as they opt to use the OOB garbage collector with RBS). But more importantly it fails to address a very important challenge: RBS FileStream does not bypass SQL Server for the processing of BLOBs. RBS FileStream pulls the BLOB out of the initial RPC call and then redirects it right back to SQL Server using the FileStream column type. Again, for obvious reasons this is not an efficient process. I am not saying that the RBS FileStream provider is not a viable solution, but organizations considering this option should proceed with caution. Backing out of the RBS FileStream provider once you have amassed large volumes of content can prove cumbersome and time consuming.

Backup and Restore Considerations

Backup/restore and disaster recovery can be a complex topic, and for this reason I am not going to explore it in great detail in this post. Any RBS solution for SharePoint, including StoragePoint, will change the process for backing up and restoring SharePoint environments. What’s lost on most people is that this is not necessarily a negative aspect of RBS. Often the change is very positive and provides new ways of backing up SharePoint environments that weren’t previously possible.

Before we explore backup/restore processes it is important to first understand the anatomy of a BLOB when it is stored outside of SharePoint content databases. Externalized BLOBs are immutable, which means they will never change once they are written out to external storage. There is a one-to-one relationship between a BLOB and a given version of a file/document in SharePoint. This means that SharePoint will only create and delete BLOBs (StoragePoint actually deletes them as part of a garbage collection process). It may not be immediately apparent but this is actually a good thing. Traditionally you would back up SharePoint content databases using a simple or full recovery model. This means that you are taking full backups on a regular basis that contain objects that will never, ever change. This is less than efficient. By separating BLOBs from the database you can now back up (or replicate) a BLOB one time rather than capturing it in multiple backups. This approach reduces backup storage costs and makes DR scenarios (warm/hot failover) possible.
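
To put rough numbers on that, here is a small sketch comparing backup storage when BLOBs live in the database versus when they are externalized and captured once. The four retained full backups are an assumption of mine for illustration, and the model ignores growth, compression, and differential backups.

```python
def backup_storage_gb(db_gb: int, blob_gb: int, retained_fulls: int,
                      externalized: bool) -> int:
    """Storage consumed by the retained full backups, in GB.

    When BLOBs are externalized, each full backup holds only the database
    and the immutable BLOBs are stored a single time.
    """
    if externalized:
        return retained_fulls * db_gb + blob_gb
    return retained_fulls * (db_gb + blob_gb)


# 4TB of content: roughly 200GB of database plus 3896GB of BLOBs.
in_database = backup_storage_gb(200, 3896, retained_fulls=4, externalized=False)
with_rbs = backup_storage_gb(200, 3896, retained_fulls=4, externalized=True)
print(in_database, with_rbs)
```

The gap widens with every additional full backup you retain, since only the 200GB database is captured repeatedly.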

In general the backup process involves backing up the content database followed by the external BLOB store(s). A farm level restore would involve restoring your BLOB store followed by your content database(s). In many cases it isn’t necessary to back up the external BLOB store, as there are ways to replicate it to multiple locations. Item level restores tend to be the area of biggest concern when using an RBS solution like StoragePoint. Fortunately StoragePoint has some built-in features to make item level restore feasible. StoragePoint includes a feature called “Orphaned BLOB Retention Policies” that allows for the retention of BLOBs for which the corresponding database item has been deleted. These retention policies are used in conjunction with item level restore tools to guarantee that item level restore is available for a definable period of time.
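
The idea behind a retention policy can be sketched in a few lines. This is not StoragePoint’s implementation; the `BlobRecord` type, the `orphaned_at` field, and the `eligible_for_cleanup` function are hypothetical names I am using purely to illustrate the concept: an orphaned BLOB is only handed to garbage collection after the retention window has passed, so an item level restore within that window can still find it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class BlobRecord:
    blob_id: str
    # None while a database item still references the BLOB (hypothetical field).
    orphaned_at: Optional[datetime]


def eligible_for_cleanup(blobs: List[BlobRecord], retention_days: int,
                         now: datetime) -> List[str]:
    """Return ids of orphaned BLOBs whose retention window has expired."""
    cutoff = now - timedelta(days=retention_days)
    return [b.blob_id for b in blobs
            if b.orphaned_at is not None and b.orphaned_at <= cutoff]


blobs = [
    BlobRecord("a", None),                     # still referenced, always kept
    BlobRecord("b", datetime(2011, 1, 25)),    # orphaned 6 days ago, retained
    BlobRecord("c", datetime(2010, 12, 1)),    # orphaned 61 days ago, expired
]
print(eligible_for_cleanup(blobs, retention_days=30, now=datetime(2011, 1, 31)))
```

In this model the retention period should be at least as long as the oldest database backup you expect to restore items from.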

Conclusion

RBS is clearly a viable option for organizations considering SharePoint where growth of the environment will be consistent or exponential over a period of time. Microsoft’s updated guidelines and database size limits are a confirmation of sorts of the opportunity that RBS presents for SharePoint deployments. If you are deploying SharePoint in any capacity, you should consider RBS as an option for optimizing storage for both active and archive content in your SharePoint environment.

Metalogix Releases StoragePoint File Share Librarian with StoragePoint 3.1

The StoragePoint team announced the release of version 3.1 on 10/26/2010.  This is the latest release in the award-winning StoragePoint product line and contains the new File Share Librarian module (along with a list of performance updates and hot fixes).  The StoragePoint team created the File Share Librarian module in response to customer requests for a high-speed migration capability for file share data.  Traditional, full-fidelity migration tools provide an extensive set of features but sacrifice performance to provide those features.  You can expect to see somewhere in the neighborhood of 5GB per hour from a traditional file-share-to-SharePoint migration tool.  As you can see, we aren’t going to be setting any land speed records anytime soon with that level of performance.  In many file share migration scenarios customers want to move large quantities of data from the file system into SharePoint and then disable file system access to the content.  StoragePoint File Share Librarian is the answer.

By leveraging the RBS/EBS capabilities of StoragePoint, the File Share Librarian uses a “shallow” copy migration to catalogue file share content into SharePoint.  To understand this concept you first need to understand the workflow involved in inserting content into SharePoint with StoragePoint deployed.  Files uploaded to the SharePoint web front end server (either through the user interface or by some programmatic means) are handed off to StoragePoint for processing.  StoragePoint then stores the file on an external storage device (the most common scenario is a file share).  For migration scenarios, rather than deal with the overhead of physically moving the file into SharePoint only to have it immediately externalized by StoragePoint, the File Share Librarian module leaves the file in place and simply catalogues it to make it available in SharePoint.
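
To make the “shallow” copy idea concrete, here is a minimal sketch of cataloguing a folder tree without moving or reading any file contents. It is only an illustration of the concept, not how the Librarian is built; `CatalogEntry` and `catalog_share` are hypothetical names, and a real catalogue would create SharePoint list items rather than Python objects.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CatalogEntry:
    relative_path: str  # where the item would appear in the destination
    source_path: str    # the file stays here; only this pointer is recorded
    size_bytes: int


def catalog_share(root: str) -> List[CatalogEntry]:
    """Walk a file share and record one entry per file, leaving files in place."""
    entries = []
    for folder, _subdirs, files in os.walk(root):
        for name in sorted(files):
            full = os.path.join(folder, name)
            entries.append(CatalogEntry(
                relative_path=os.path.relpath(full, root).replace(os.sep, "/"),
                source_path=full,
                size_bytes=os.path.getsize(full),  # metadata only, no content read
            ))
    return entries


# Small demonstration against a throwaway directory.
with tempfile.TemporaryDirectory() as share:
    os.makedirs(os.path.join(share, "team"))
    with open(os.path.join(share, "team", "plan.txt"), "w") as fh:
        fh.write("draft")
    demo_catalog = catalog_share(share)
    print([(e.relative_path, e.size_bytes) for e in demo_catalog])
```

Because only metadata is touched, the running time of a pass like this depends on the number of files rather than on how many bytes they hold.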

The benefits of the File Share Librarian module are many, but the most notable is the ability to process millions of files in a short period of time.  The speed of your migration is no longer bound to the physical size of your file share.  Below you will find a list of features for the File Share Librarian along with links to resources where you can learn more about its capabilities.  Note that the File Share Librarian works with both SharePoint 2007 and SharePoint 2010 environments.

  • The File Share Librarian product page: http://www.storagepoint.com/product.aspx?tab=1
  • Overview Demo: http://www.storagepoint.com/Images2/File%20Share%20Librarian.wmv
  • Team Share to Team Site Demo: http://www.storagepoint.com/Images2/Team%20Shares-to-Team%20Sites.wmv
  • File Share to My Site Demo: http://www.storagepoint.com/Images2/File%20Shares-to-MySites.wmv
  • Request a 3.1 w/ File Share Librarian Trial: http://www.storagepoint.com/contact.aspx?action=trial
    • Single and Dual Access Modes.  Single Access Mode treats the cataloged file share 100% like a normal BLOB store, assuming that you are going to shut off end user access to the content once the file share is cataloged.  And as you’re probably already guessing, Dual Access Mode is there for that transitional period between end users accessing content via the file share and end users accessing content exclusively via SharePoint.  It would be nice to just say that on Monday the file share is gone and you need to go to this URL to get to your content.  You can send that email out until you’re blue in the face and 50% of your end users will open help desk tickets or call yelling and screaming Monday morning wanting to know where their content is.  Not to say that you can’t operate in Dual mode for some undetermined period of time, but why would you want to?  Let me put it this way…we didn’t put this feature in Librarian to encourage folks to operate this way…we put it in there to do our little part to keep the IT staff sane while end users become comfortable with change and/or get out of their own way.
    • Simulation Mode.  Run the Librarian in this mode if you want to detect and have an opportunity to cleanse invalid folder or file names…or shorten path and file names to get within SharePoint URI length limits.  You can let us do this stuff for you (…there is an option), but we’re going to be punitive about it (i.e. an ampersand goes bye-bye, it doesn’t get converted to “and”).
    • Scheduling Options.  Schedule the Librarian job to run on pretty much any interval you can come up with.  The initial run will catalog the entire file share, with subsequent runs cataloging only the changes (i.e. new files added, old files removed, file updates/changes, etc.).
    • Exclusions.  Exclude file share content by sub folders, filename or filespec, created date, last modified date, and last accessed date.
    • Dynamic Container Creation.  Librarian will create a dynamic container structure (i.e. site collections, sites, libraries, and folders) based on the starting Destination container you pick when creating the Librarian configuration.  If your Destination is a Web Application or Content Database, it will start creating site collections.  If you start with a Library it will start creating folders.  You get the idea.
    • Permissions Promotion.  As previously stated, the Librarian takes the effective permissions of the folders to be cataloged and maps those to Owner (Full Control), Member (Contributor), and Visitor (Reader) roles on the SharePoint containers it creates.
    • Picture and Asset Library Support.  If the file share content is being cataloged into Picture or Asset libraries, Librarian will create thumbnail and preview images for supported image types and make sure those are cataloged as well.  This removes the need to upload the images into SharePoint, where SharePoint would natively create the thumbnail and preview images itself.
    • My Site Support.  Pretty much everybody has that drive that’s mapped to them whenever they log in to Windows.  It’s their own personal networked “My Documents”.  We’ve talked to lots of companies that would like to transform those personal drives to personal sites in SharePoint, but many times we’re talking about 10’s of thousands of users and 10’s – 100’s of terabytes of data.  Not something you’re going to easily represent in SharePoint.  By simply pointing Librarian at the root of those personal drives and picking a My Site Host as the destination you would be well on your way to accomplishing that move with little effort.  You would get some use case-specific options to let you tell Librarian where to find the actual personal drive folders (…even if they are not immediately under the root of the parent file share), where to store the content in the personal site (i.e. Personal or Shared Documents or a custom structure), whether or not to create the User Profile and Personal Site if they don’t already exist, and so on.  As a test, we took 48,000 user folders and 48 million documents representing 6TB of content and represented them in SharePoint as 48,000 My Sites and 48 million list items in a single 150GB content database…in just over 6 days.  And no, this wasn’t on some kind of fantasy hardware.  The SQL box had dual quad core Xeon processors with 24GB of memory, and the app server the job ran on had a single quad core processor with 4GB of memory.  While the job was running (16 threads…you can control that and change it in-flight anytime you like) the app server was running at about 80% CPU utilization with the SQL box running under 10%, the vast majority of the time at 5-7% (…it was taking a nap).  I think the data speaks for itself.  We’re going to conduct some larger-scale tests in the coming months and publish a whitepaper with detailed results, including farm configuration.
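
For a sense of scale, the test figures above can be turned into rough rates and compared with the 5GB per hour quoted earlier for traditional migration tools. This is back-of-the-envelope arithmetic only: I am treating “just over 6 days” as exactly 6 days, so the rates are slight overestimates, and since Librarian catalogues files in place the GB per hour figure is content represented, not bytes moved.

```python
DOCUMENTS = 48_000_000
CONTENT_GB = 6 * 1024           # 6TB of file share content
ELAPSED_DAYS = 6                # assumption: "just over 6 days" rounded down
TRADITIONAL_GB_PER_HOUR = 5     # typical full fidelity migration tool

elapsed_hours = ELAPSED_DAYS * 24
docs_per_second = DOCUMENTS / (elapsed_hours * 3600)
catalogued_gb_per_hour = CONTENT_GB / elapsed_hours
traditional_days = CONTENT_GB / TRADITIONAL_GB_PER_HOUR / 24

print(round(docs_per_second, 1), "documents per second")
print(round(catalogued_gb_per_hour, 1), "GB represented per hour")
print(round(traditional_days, 1), "days at 5GB per hour")
```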

Remote BLOB Storage for SharePoint Foundation 2010

There has been a lot of misconception around remote BLOB storage (RBS) capabilities with SharePoint 2010.  For some time many thought that RBS would be an optional feature included in the core release of SharePoint 2010/SharePoint Foundation 2010.  We know now that this is simply not the case.  The StoragePoint team has previously published materials comparing the features of the RBS FileStream provider (an additional download that is part of a feature pack) with StoragePoint.  You can view that comparison here.  Microsoft has recently published a short white paper detailing RBS options with SharePoint Foundation 2010.  This is an excellent white paper that identifies the need for RBS (scalability and storage costs) and addresses misconceptions about the capabilities of the RBS FileStream provider.  For quite a while I have asserted that the RBS FileStream provider was purpose-built to support upgrades from WSS 3.0 with the WID (Windows Internal Database) option and that RBS FileStream is not an enterprise solution.  The recently published document confirms my previous assertions.  You can download a copy of the white paper here: http://www.microsoft.com/downloads/en/details.aspx?FamilyID=90ef5bf8-915e-41bd-806b-f915fbf5d353&displaylang=en