
The Impact of Shredded Storage on SharePoint 2013: Part 1

There is little doubt the SharePoint community is excited for SharePoint 2013. With 60% of respondents in a recent SharePoint survey saying they want to upgrade in the next year, anticipation is building.

One feature that has garnered a lot of buzz, and some confusion, is the new Shredded Storage feature and the impact it will have on binary large object (BLOB) storage. In this two-part series, we’ll delve deeper into Shredded Storage, explore its impact on SharePoint 2013 and address the myths about the new feature and Remote BLOB Storage.

Before we plunge into the details it makes sense to define what Shredded Storage is and is not. The term “shredding” has created a lot of confusion. Shredded Storage does not refer to file shredding, which is the secure deletion of files by overwriting the data multiple times. StoragePoint and many other storage-related products include this feature. Instead, SharePoint 2013’s Shredded Storage is an attempt by Microsoft to reduce the I/O impact when saving versions of a document or file by “shredding” it into smaller pieces and reassembling it when someone needs to access it.

Microsoft developed Shredded Storage to address how document edits and versions are stored in SharePoint and to reduce transaction log volume. The result: significantly improved I/O (network and disk) and reduced CPU overhead when storing incremental changes to documents. The SharePoint product team has done a fantastic job addressing what has been a long-standing complaint about the way files are stored in previous versions of SharePoint.

Unfortunately, by addressing one issue, Microsoft introduced another that results in a performance decrease for the uploading and downloading of files in SharePoint.

A Brief History Lesson

With the introduction of SharePoint Team Services and subsequently SharePoint Portal Server 2003, Microsoft moved away from using the Web Storage System and settled on using SQL Server exclusively for the storage of BLOBs. BLOBs are immutable objects when stored in SharePoint. This means that BLOBs are created and deleted but never updated. When editing existing documents in SharePoint, edits result in new BLOBs being created. These new BLOBs are full copies including the edits and not incremental changes.

This means that if you maintain 10 versions of a 1MB document, you end up with approximately 10MB in total storage, excluding metadata. While this wasn’t the most efficient use of storage, it was the simplest way to handle versions without introducing complexity. Unfortunately, SharePoint has a tendency to spin off new versions of documents even if only metadata has changed. With SharePoint 2010 and Office 2010, Microsoft optimized communication between the Office client and the SharePoint server by implementing a file synchronization protocol named Cobalt.
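The storage arithmetic above is easy to sketch. The following is an illustrative calculation only (the 50 KB per-version change size is a made-up assumption, and real storage also includes metadata and index overhead):

```python
# Illustrative comparison of full-copy versioning vs incremental (shredded) storage.
# Numbers are hypothetical; metadata and index overhead are ignored.

def full_copy_storage(file_size_mb, versions):
    """Pre-2013 model: every version is a complete copy of the file."""
    return file_size_mb * versions

def incremental_storage(file_size_mb, versions, avg_change_mb):
    """Shredded model: the first version is stored in full,
    later versions store only the changed pieces."""
    return file_size_mb + (versions - 1) * avg_change_mb

# 10 versions of a 1 MB document:
print(full_copy_storage(1, 10))          # ten full copies
print(incremental_storage(1, 10, 0.05))  # assuming ~50 KB of changes per version
```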

What Does Cobalt Mean?

I am not going to cover Cobalt in nauseating detail as the topic has been covered quite well by Bill Baer in his post about Shredded Storage. What you should know is that Cobalt allowed the Office client to send incremental changes rather than the entire file to the server when a document was saved. The incremental changes were then reassembled on the server and saved as a complete file. Shredded Storage in SharePoint 2013 extends the saving of incremental changes to the database where only file changes are stored rather than entire copies of the file.

The goal of Cobalt and subsequently Shredded Storage is to reduce the network bandwidth utilization (client changes sent to the SharePoint server) and the I/O operations (SharePoint Server sends file to database) that result from incremental changes to documents. In fact Shredded Storage significantly improves I/O operations while reducing CPU overhead when saving incremental changes to documents. What isn’t immediately apparent is the negative impact to I/O operations for the inserting of new files and downloading of existing files.
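To make the idea concrete, here is a toy sketch of shred-style storage: files are split into fixed-size chunks keyed by content hash, so a new version only adds the chunks that changed, and any version can be reassembled from its chunk list. This is purely illustrative of the general technique; SharePoint’s actual shredding works through the Cobalt protocol and its own chunking logic, not naive fixed-size splitting.

```python
import hashlib

CHUNK_SIZE = 4  # tiny chunk size for demonstration purposes only

def shred(data: bytes, store: dict) -> list:
    """Split data into chunks, store each chunk under its content hash,
    and return the manifest of hashes needed to reassemble it."""
    manifest = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        key = hashlib.sha256(chunk).hexdigest()
        store[key] = chunk          # identical chunks are stored only once
        manifest.append(key)
    return manifest

def reassemble(manifest: list, store: dict) -> bytes:
    """Rebuild a file version from its chunk manifest."""
    return b"".join(store[key] for key in manifest)

store = {}
v1 = shred(b"AAAABBBBCCCC", store)
v2 = shred(b"AAAABBBBDDDD", store)  # only the last chunk differs
print(len(store))                   # 4 unique chunks stored, not 6
```

Storing the second version costs only one new chunk; both versions remain fully reconstructible from their manifests.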

The Test Results

The table below compares SharePoint 2010 and SharePoint 2013 upload and download times on the same document set. Our lab testing confirms that SharePoint 2013 uploads and downloads are slower — and in some cases significantly slower — than SharePoint 2010. This is a direct result of Shredded Storage. The overhead involved in determining how to split a document into smaller pieces and store those smaller pieces definitely has an impact on the performance for uploads and downloads.

All times are in seconds; Difference = SP2010 minus SP2013, so a negative value means SP2013 was slower.

Scenario  File Name  File Type  File Size (KB)  SP2010 Upload (A)  SP2013 Upload (B)  Difference (A-B)  SP2010 Download (C)  SP2013 Download (D)  Difference (C-D)
1 AA_Small TIF TIF 60 0.58 0.25 0.33 0.02 0.03 -0.01
2 AB_PDF Sample PDF 625 0.11 0.39 -0.29 0.02 0.05 -0.03
3 AC_SharePoint Training PPTX 669 0.15 0.72 -0.57 0.02 0.12 -0.10
4 AD_Drawing1 VSD 759 0.16 0.47 -0.31 0.02 0.05 -0.03
5 AE_1 MB Word Doc 2010 DOCX 1,082 0.35 0.66 -0.30 0.03 0.10 -0.07
6 AF_LV111-01-10 DWG 1,208 0.15 0.55 -0.41 0.03 0.07 -0.03
7 AG_1 mb image JPG 1,210 0.20 0.64 -0.43 0.03 0.07 -0.04
8 AH_Drawing2 VSD 1,659 0.24 0.78 -0.54 0.04 0.09 -0.05
9 AI_Customer 2009 PPT 2,192 0.34 0.93 -0.59 0.05 0.10 -0.06
10 AJ_2mb TIF Image TIF 2,579 0.32 1.01 -0.69 0.05 0.12 -0.07
11 AK_2mb Image JPG 2,725 0.34 1.06 -0.72 0.06 0.14 -0.08
12 AL_LV111-02-10 DXF 2,783 0.33 1.10 -0.77 0.06 0.16 -0.10
13 AM_3_6mb PDF Sample PDF 3,690 0.49 1.47 -0.98 0.07 0.21 -0.13
14 AN_4 MB PDF PDF 4,078 0.50 1.60 -1.10 0.08 0.19 -0.11
15 AO_Corporate Presentation 2007 PPT 4,248 0.49 1.69 -1.21 0.08 0.20 -0.11
16 AP_Analyst Briefing – 2008 PPT 4,434 0.54 1.68 -1.15 0.08 0.18 -0.10
17 AQ_4_5 mb Video MOV 4,627 0.46 1.77 -1.31 0.10 0.19 -0.09
18 AR_4_5 mb wmv video WMV 4,680 0.51 1.81 -1.30 0.24 0.18 0.06
19 AS_Internet Safety Presentation PPT 4,839 0.42 1.84 -1.42 0.24 0.20 0.05
20 AT_5mb Image JPG 5,267 0.50 2.15 -1.66 0.21 0.23 -0.02
21 AU_LV111-01-10 DXF 5,425 0.42 2.02 -1.60 0.28 0.25 0.03
22 AV_5_3 JPG JPG 5,457 0.55 2.08 -1.53 0.19 0.23 -0.04
23 AW_LV111-02-FL DXF 5,866 0.48 2.11 -1.63 0.24 0.22 0.02
24 AX_Corporate Slide Deck_April 2009 PPT 5,936 0.53 2.29 -1.76 0.18 0.27 -0.09
25 AY_LV111-01-FL DXF 5,972 0.44 2.22 -1.78 0.13 0.27 -0.14
26 AZ_7mb Excel File XLSX 7,415 0.68 0.89 -0.22 0.33 0.28 0.05
27 BA_SPC14_348_WhatsNewDevs PPTX 8,935 0.76 1.55 -0.80 0.30 0.95 -0.65
28 BB_SPC 2009 PPT 9,255 0.89 3.33 -2.45 0.39 0.34 0.04
29 BC_11_7mb Excel File XLSX 11,974 0.92 1.25 -0.33 0.73 0.32 0.41
30 BD_14_5 MB PDF PDF 14,861 1.24 5.08 -3.85 0.73 1.10 -0.38
31 BE_26 MB XLSX XLSX 26,557 2.24 2.79 -0.54 1.31 2.29 -0.97
32 BF_28MB_txt_TestFile TXT 28,787 2.05 8.58 -6.53 0.82 2.32 -1.49
33 BG_33_1 MB WORD 2010 Doc DOCX 33,947 2.53 3.24 -0.71 0.62 3.02 -2.40
34 BH_50MB_txt_TestFile TXT 54,265 3.69 15.92 -12.23 0.83 3.73 -2.90
35 BI_55 MB XLSX XLSX 56,356 4.18 4.90 -0.72 2.29 5.48 -3.19
36 BJ_70 MB WORD 2010 Doc DOCX 71,694 5.39 5.99 -0.59 6.55 6.36 0.19
37 BK_100 MB XLSX XLSX 103,108 8.85 8.16 0.69 5.83 7.86 -2.03
38 BL_103 MB WORD 2010 Doc DOCX 105,411 7.78 7.91 -0.13 4.84 7.20 -2.36
39 BM_180MB_txt_TestFile TXT 184,288 13.52 10.53 2.98 13.55 5.82 7.72
40 BN_190 mb Word Doc 2003 DOC 195,899 14.65 12.01 2.63 14.30 8.22 6.08
41 BO_250 mb Movie MOV 255,454 20.30 15.70 4.60 20.35 10.22 10.14
42 BP_382 mb Word Doc 2003 DOC 391,739 29.25 26.70 2.55 12.22 16.76 -4.55
43 BQ_540MB_txt_TestFile TXT 552,862 45.21 42.21 3.00 31.71 17.36 14.35

One Size Does Not Fit All

The test results above led us to examine the configuration options for Shredded Storage to determine whether we could mitigate the negative impact on uploads and downloads. Unfortunately, your options are limited. Contrary to other blog posts on the topic, Shredded Storage cannot be disabled. The SharePoint 2013 beta did include an option to disable shredding, but it was removed in the RTM build. The only remaining option is changing the default shred, or “chunk,” size that files are split into when they are stored (exposed as the FileWriteChunkSize property on the SPWebService object).

For me, the decision to remove the ability to disable shredding is a bit nearsighted. Not all organizations use SharePoint for document collaboration where content is updated or edited in large quantities. I would even argue that while some organizations do have collaboration sites where lots of editing occurs, they almost certainly have other sites where documents are simply uploaded and downloaded without edits or new versions being created.

A common example is document imaging, where PDF/TIFF images are stored within SharePoint. Those images never change. Or how about a document center that contains tens of thousands of published documents that are read rather than updated? Shredded Storage provides little value in these scenarios. It is true that even with versioning disabled, the I/O between the client, SharePoint server, and database server will be optimized. However, you will not reduce overall storage requirements.

Unfortunately, you are relegated to living with Shredded Storage in hopes that Microsoft will provide, at a minimum, the ability to disable the feature. Even better would be an option to control Shredded Storage at the site or site collection level for added flexibility.

Solving one problem by introducing another significant problem is going to make for some unhappy campers who are already struggling to keep up with the explosive growth of their SharePoint content.

In part 2, we will address using RBS with Shredded Storage, including debunking myths, reviewing how RBS functions with Shredded Storage, and discussing best practices for optimizing RBS.

Remote BLOB Storage – The Cost of Doing Nothing

The history of SharePoint is quite fascinating when you consider its roots as a departmental collaboration and document management tool.  In many organizations, SharePoint has grown up from literally an application running on a server under someone’s desk to an enterprise-wide, mission-critical system.  Even today, I am quite surprised at how often SharePoint growth is underestimated.

Because of this, organizations deploying SharePoint should consider Remote BLOB Storage (RBS) as a viable option for managing growth and reducing ongoing storage and management costs.  While there is a cost associated with deploying a third-party RBS solution, the cost of doing nothing could be much greater in the long run.

Before we explore the cost of doing nothing, let’s revisit the value that RBS brings to SharePoint.

  1. Enhances performance and scalability of SharePoint while reducing Total Cost of Ownership (TCO)
  2. Provides flexible storage options for storing unstructured SharePoint content.  Different storage options provide lower storage costs, lower overall TCO, and support for compliance scenarios that were previously not supported.
  3. Reduces management overhead by reducing the size of SharePoint content databases (SQL Server databases) and creating new options for backup, restore, disaster recovery, and high-availability scenarios.

Depending on the current state of your SharePoint deployment, RBS can provide immediate and often significant return on investment (ROI).  For organizations deploying new hardware in the form of storage infrastructure, the primary ROI will come from reduced up-front storage procurement costs. Consider the following scenario where we simply divert 90% of the storage from tier 1 (typically SAN or Direct Attached Storage) to tier 2 (primarily NAS storage).

Tier 1 Storage
Acquisition Cost/GB: $5.00
Monthly Mgmt Cost/GB: $2.00
Percentage of Total Storage: 10%
Total Tier Storage (GB): 307.2
Total Acquisition Cost: $1,536.00
Total Monthly Cost: $614.40

Tier 2 Storage
Acquisition Cost/GB: $2.00
Monthly Mgmt Cost/GB: $0.50
Percentage of Total Storage: 90%
Total Tier Storage (GB): 2,764.8
Total Acquisition Cost: $5,529.60
Total Monthly Cost: $1,382.40

Summary
StoragePoint Acquisition Cost: $29,990.00
Support & Maintenance: $5,398.20
Total Storage (GB): 3,072
Total Investment without StoragePoint: $15,360.00
Total Investment with StoragePoint: $42,453.80
Monthly Cost without StoragePoint: $6,144.00
Monthly Cost with StoragePoint: $1,996.80
Monthly Savings: $4,147.20
Return on Investment (Months): 6.5
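The figures in the table follow from a few lines of arithmetic. The sketch below reproduces the scenario’s numbers; all per-GB costs are the scenario’s assumptions, not market rates.

```python
# Reproduce the tiered-storage ROI scenario above.
# All dollar figures are the scenario's assumptions, not market rates.

total_gb = 3072
tier1_gb, tier2_gb = total_gb * 0.10, total_gb * 0.90

# (acquisition $/GB, monthly management $/GB)
tier1_cost, tier2_cost = (5.00, 2.00), (2.00, 0.50)
storagepoint = 29_990.00 + 5_398.20   # license plus support & maintenance

without_sp = total_gb * tier1_cost[0]  # everything on tier 1
with_sp = tier1_gb * tier1_cost[0] + tier2_gb * tier2_cost[0] + storagepoint

monthly_without = total_gb * tier1_cost[1]
monthly_with = tier1_gb * tier1_cost[1] + tier2_gb * tier2_cost[1]
monthly_savings = monthly_without - monthly_with

# Months to recoup the extra up-front investment from the monthly savings:
roi_months = (with_sp - without_sp) / monthly_savings
print(f"${monthly_savings:,.2f}/month saved, ROI in {roi_months:.1f} months")
```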

As you can see, if you are planning a capital expenditure for storage to support a new or growing SharePoint environment, complete ROI comes within a matter of months; factors including per-GB disk cost and ongoing management costs will affect the exact timeline. Deferring the use of RBS can have an immediate, and in some cases significant, impact on cost.

For organizations with expanding SharePoint environments, the cost of doing nothing can almost certainly have a negative impact.  With the understanding that the purchase of storage to support SharePoint may be a sunk cost for your organization, there are considerations for large SharePoint content databases.  Microsoft recently updated their guidance for large database support in SharePoint and I have previously posted a blog, Revisiting SharePoint Remote BLOB Storage, which provides detail on the new guidance.

The net of the new guidance is that supporting large content databases requires special consideration for disk IOPS (input/output operations per second).  For organizations choosing to store all unstructured content in SQL Server databases, the cost of doing so can be extremely high.  The reason is that the high number of IOPS required per GB (0.25 IOPS per GB minimum, with 2 IOPS per GB recommended) results in overprovisioned, underutilized disks.  The number of disks required to support a large SharePoint content database can be very large and result in significant procurement and ongoing maintenance costs for storage.  RBS alleviates this concern by storing unstructured content outside the database, usually on a less expensive tier.

Beyond storage and ongoing maintenance costs, there are also soft costs that are more difficult to measure.  Consider that storing all SharePoint content in SQL Server databases will increase management costs for database maintenance, including backup/restore processes.  Have you considered the cost of storing backups for the period of time dictated by your recovery point objectives (RPO)? How about the cost of data replication to support failover and nonproduction environments (Dev, Staging, QA, etc.)? RBS allows organizations to leverage storage devices that facilitate backup/recovery scenarios and provide better ways to back up and/or replicate the unstructured SharePoint content, which can comprise up to 95% of your storage utilization.

Planning for the deployment and ongoing management of your SharePoint environment is critical to controlling costs in the short and long term.  RBS can provide immediate cost savings in the short term and significant savings over the long term as your SharePoint environment continues to grow.  Often the fear and uncertainty surrounding a new technology cause organizations to dismiss it altogether.  While RBS may fall into this category, you should take a second look. The cost of doing nothing may be higher than you think.

Metalogix Releases StoragePoint 3.2

On September 6th Metalogix released StoragePoint 3.2 along with some powerful new features that focus on performance, file share management, and tiered storage capabilities.  Here is a quick rundown of what you can expect to find in StoragePoint 3.2.

Master/worker Timer Job Configuration

Previously, StoragePoint leveraged a multithreaded timer job framework that provided better performance for batch jobs than a typical single-threaded timer job.  As you know, timer jobs execute on a single SharePoint server, and there is no out-of-the-box support for distributing work across multiple servers.  StoragePoint 3.2 contains a new timer job framework that allows for the configuration of a single master job that hands work to “worker” jobs.  This new approach provides significant performance improvements for batch processes like externalization jobs (the initial removal of existing BLOBs from content databases) or File Share Librarian jobs.

Broader Archive Rules

Archiving and tiered storage rules can now be configured at a broader scope.  Previous versions of StoragePoint restricted archive rules to a site collection.  StoragePoint 3.2 allows archive rules to be configured at the site collection, content database, and web application level.

File Share Librarian Enhancements

For those of you familiar with File Share Librarian, previous versions of StoragePoint allowed only a single endpoint.  With StoragePoint 3.2 you can now configure File Share Librarian on existing storage profiles, even with different endpoints.

Revisiting RBS: SharePoint and Service Pack 1

Since the release of SharePoint 2010 there has been a lot of debate over the value and application of Remote BLOB Storage (RBS) with SharePoint 2010. That debate has been reinvigorated by the updated Plan for Software Boundaries guidance published with Service Pack 1 for SharePoint 2010. Frankly, I am disturbed by the volume of blog posts that include inaccurate information about RBS and the updated software boundaries and limits. It seems that some “experts’” influence far outweighs their competence on the subject of RBS. I have personally seen the confusion created in the ecosystem manifest itself in conversations I have with customers on a weekly basis. For this reason I think it is pertinent to revisit the value of BLOB externalization, correct common misconceptions, and discuss how the updated guidance from Microsoft may affect your decision to implement SharePoint with a remote BLOB storage solution.

A Brief History of BLOB Externalization

First, a brief history lesson before we tackle RBS (one that many of you already know). BLOB externalization is not a new concept. In fact, most legacy ECM systems store unstructured data (files) separately from the metadata held in a relational database. Microsoft originally built SharePoint this way, using the same Web Storage System that Exchange Server uses. With the release of WSS 2.0/SharePoint 2003, Microsoft moved to storing all data (structured and unstructured) within SQL Server databases. Many vendors attempted to address database growth through archive tools that pull BLOBs from the database in a post-processing batch job. While this worked to solve database bloat, it created compatibility issues with out-of-the-box and third-party SharePoint solutions, which had to be “aware” of the archive product in use and understand how to interpret the stub left behind. Definitely not the most elegant solution, and it certainly didn’t address the core issue. It wasn’t until the release of Service Pack 1 for SharePoint 2007/WSS 3.0 that Microsoft introduced support for BLOB externalization via the EBS interface. Subsequently, Microsoft introduced support for Remote BLOB Storage (RBS), along with continued support for EBS, in SharePoint 2010.

BLOB externalization isn’t about being able to leverage commodity disk but rather about leveraging the optimal disk for the content being managed and stored. The goal is to make sure that patient records, invoices, purchase orders, lunch menus, and vacation pictures each land on the most appropriate storage device. For obvious reasons, not all content is created equal, nor should it be treated as such. Consequently, there are scenarios that SharePoint simply cannot support out of the box or with the RBS FileStream provider. For example, take SEC Rule 17a-4 (electronic storage of broker-dealer records) requirements for client/customer records. Once declared a record, any client-related document (an IRA account-opening document, for example) has specific storage requirements: it must be immutable and unalterable. Third-party RBS products like Metalogix StoragePoint facilitate this scenario through support for WORM (Write Once, Read Many) and CAS (Content Addressable Storage) devices.

In the process of optimizing the storage environment for SharePoint, BLOB externalization accomplishes some critical goals. It is no secret that relational databases (including SQL Server) are not the ideal place to store large pieces of unstructured data. No, this isn’t a dig at SQL Server; it is simply the obvious fact that a system optimized for highly relational, fine-grained transactional data is not an ideal place to store large unstructured objects. The performance cost of storing BLOBs in the database is high. Consider the RPC call between the web server and the database carrying a large payload (metadata plus the BLOB), along with the I/O, processor, and memory requirements for storing the BLOB: it is a very expensive process. Yes, Microsoft has optimized BLOB performance in subsequent releases of SQL Server, but it is still more efficient to store BLOBs outside the database when you consider a typical SharePoint farm under load or a bulk operation such as a full crawl. The updated guidance from Microsoft certainly supports this assertion; Microsoft has documented it in many of its own publications and even positioned the initial value of BLOB externalization as a way to improve the performance of your SharePoint environment. Additionally, SQL Server is very rigid in terms of the types of storage it can leverage and the methods by which you back up the environment. This brings me to my next point: what was the intent of providing BLOB externalization interfaces within the SharePoint product in the first place?

The original sizing guideline for content databases with WSS 3.0/SharePoint 2007 was 100GB (for collaboration sites). With SharePoint 2010, Microsoft increased the limit to 200GB and changed it yet again with Service Pack 1 for SharePoint 2010 (more on this later). These limits proved problematic for many looking to implement SharePoint pervasively throughout their organizations. Not only is database growth a problem; there are also challenges with segmenting content to work around database size restrictions, along with SQL Server being a less-than-optimal place to store BLOBs. Additionally, backup/restore becomes a challenge as SharePoint environments continue to grow in both size and criticality.

Microsoft originally positioned BLOB externalization as a way to reduce the size of your SharePoint content databases. While there is some debate on this topic, it is generally agreed in the SharePoint community that the content database size limitations did NOT include externalized BLOBs (this changes with Service Pack 1 for SharePoint 2010). When the StoragePoint team released StoragePoint 2.0, we spent quite a bit of time creating and shaping the messaging for the product, which included the following benefits that still hold true as the basis for BLOB externalization:

  • Reduce the size of your SharePoint content databases by externalizing BLOBs. Roughly 90-95% of your content database consists of BLOBs (this varies with auditing enabled). By externalizing the BLOBs we can reduce the size and number of databases required to support your environment.
  • Optimize performance by freeing SQL Server from the burden of managing unstructured pieces of data
  • Support the use of a variety of storage platforms based on business requirements (storage cost, compliance, performance, etc.), including SAN, NAS, and cloud storage.
  • Create new opportunities to replicate and back up SharePoint content in a more efficient manner

If you consider that roughly 90-95% of a content database is comprised of BLOBs, then you stand to see a significant reduction in database size and an increase in the amount of content you can manage per content database. One of the metrics we often referred to with StoragePoint was the management of 2TB of content. If you reduce a 2TB content database by 95% you end up with a 102.4GB content database and 1,945.6GB (1.9TB) of externalized BLOBs. This would be well within the database size limits for SharePoint 2010 and at the high end of the limit for SharePoint 2007. Sounds familiar, doesn’t it? I think I have seen something like this in the SP1 limits for SharePoint 2010 … let’s take a look.
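The arithmetic behind those figures, for reference (the 95% BLOB share is the typical figure cited above):

```python
total_gb = 2 * 1024                           # 2 TB of SharePoint content
blob_share = 0.95                             # typical BLOB fraction of a content database
db_gb = round(total_gb * (1 - blob_share), 1)
externalized_gb = round(total_gb * blob_share, 1)
print(db_gb, externalized_gb)                 # 102.4 GB database, 1945.6 GB external BLOBs
```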

Service Pack 1 Consideration

Prior to the release of Service Pack 1 for SharePoint 2010, the content database size limit did not include externalized BLOBs (yes, this is debatable, but I can tell you it was generally accepted, given the lack of clarity in the original Plan for Software Boundaries documentation). Microsoft revised this guidance along with the database size limits. For SharePoint 2010 the 200GB size limit is still in effect, with a new option to expand a “collaboration” content database to 4TB. Now for the fun part: in order to expand a content database beyond the 200GB limit, you need an optimized SQL Server disk subsystem. Specifically, Microsoft recommends 2 IOPS (input/output operations per second) per GB of storage. Note that I am generalizing a bit on the SP1 guidelines and limits, so you can read them for yourself here.

While the database “size” limitation includes externalized BLOBs in its calculation, externalized BLOBs are not included in the IOPS requirement. In order to manage a 4TB database you must have a disk subsystem that supports 2 IOPS per GB. If you are not familiar with this concept, I can tell you that this is an expensive disk configuration (more on this below). With StoragePoint in place, you can have a 4TB “content database” that consists of approximately a 200GB SQL database and 3.8TB of externalized BLOBs, all without an expensive disk requirement. Sound familiar? This is the same messaging that the StoragePoint team advertised with the original release of StoragePoint 2.0 for SharePoint 2007/WSS 3.0 SP1. If you believe that the IOPS requirement includes the externalized BLOBs, then you have to discount Microsoft’s support for NAS storage (via iSCSI) with the RBS FileStream provider; most NAS devices were not intended to support such a high level of IOPS. The new guidance simply reaffirms what the StoragePoint team has asserted all along. Using a 95% reduction in the database (a typical database is comprised of 95% BLOBs), you would end up with a 200GB content database (within Microsoft’s original guidelines). If you keep the BLOBs in the database, you need lots of expensive disk to maintain the performance of your environment.

Let’s take a practical example following Microsoft’s database size limits and disk performance guidelines to determine what a reasonable disk subsystem might look like. Remember that Microsoft requires 0.25 IOPS per GB for content databases over 200GB (2 IOPS per GB is highly recommended for optimal performance). Note that, to keep things brief and to the point, I am using rough estimates to calculate IOPS; disk performance is affected by hard disk specifications, RAID level, and controllers.

IOPS ≈ 1000 / (average latency in ms + average seek time in ms)

The following tables illustrate the number of disks required to achieve both 0.25 IOPS per GB (the minimum requirement) and 2 IOPS per GB (recommended). For this example we assume the IOPS requirement applies to data beyond 200GB, leaving 3.8TB of data that requires an optimal disk configuration (minimum IOPS = 972; recommended IOPS = 7,792). Note the following assumptions used when calculating IOPS in the tables below.

  1. Per-disk IOPS estimates were used for each disk type; actual IOPS vary by disk type and manufacturer.
  2. RAID 5 and RAID 10 disk configurations were used, as these tend to be the most common configurations for database servers (RAID 10 being the preferred configuration).
  3. The IOPS calculations assume that the 0.25 IOPS/GB and 2 IOPS/GB requirements apply to data above 200GB; the initial 200GB is not included in the minimum and recommended IOPS calculations. Including it would require an additional 50 and 400 IOPS respectively, and therefore additional disks.
  4. There is an IOPS penalty that varies with the RAID configuration: for RAID 10 the penalty factor used is 0.8, and for RAID 5 it is 0.57.
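Under those assumptions, the disk counts in the tables below can be reproduced with a short calculation. This is a sketch of the estimation method just described (note the rounding up to an even number of disks, which matches the tables); real sizing also depends on controllers, caching, and workload mix.

```python
import math

def disks_required(target_iops, iops_per_disk, raid_penalty):
    """Estimate disks needed: each disk's effective IOPS is reduced by the
    RAID penalty factor; disk counts are rounded up to an even number."""
    effective = iops_per_disk * raid_penalty
    n = math.ceil(target_iops / effective)
    return n + (n % 2)                  # round odd counts up to even

RAID10_PENALTY, RAID5_PENALTY = 0.8, 0.57

data_gb = 3.8 * 1024                    # data beyond the first 200 GB
minimum = 0.25 * data_gb                # ~972 IOPS at 0.25 IOPS/GB
recommended = 2 * data_gb               # ~7,782 IOPS at 2 IOPS/GB

# e.g. 15,000 RPM SAS disks (~180 IOPS each) in RAID 10:
print(disks_required(minimum, 180, RAID10_PENALTY))      # 8 disks
print(disks_required(recommended, 180, RAID10_PENALTY))  # 56 disks
```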

Disk Configuration Sample for Minimum IOPS

Drive Type IOPS per Disk RAID Level Disk Capacity (GB) # Disks Usable Capacity (GB) Max IOPS
7200 RPM SATA 90 RAID 10 1024 14 7168 1008
10000 RPM SATA 130 RAID 10 1024 10 5120 1040
10000 RPM SAS 140 RAID 10 1024 10 5120 1120
15000 RPM SAS 180 RAID 10 1024 8 4096 1152
7200 RPM SATA 90 RAID 5 512 20 9216 1026
10000 RPM SATA 130 RAID 5 512 14 6144 1037.4
10000 RPM SAS 140 RAID 5 512 14 6144 1117.2
15000 RPM SAS 180 RAID 5 512 10 4096 1026

Disk Configuration Sample for Recommended IOPS

Drive Type IOPS per Disk RAID Config Disk Capacity (GB) # Disks Usable Capacity (GB) Max IOPS
7200 RPM SATA 90 RAID 10 1024 110 56320 7920
10000 RPM SATA 130 RAID 10 1024 76 38912 7904
10000 RPM SAS 140 RAID 10 1024 70 35840 7840
15000 RPM SAS 180 RAID 10 1024 56 28672 8064
7200 RPM SATA 90 RAID 5 512 152 38912 7797.6
10000 RPM SATA 130 RAID 5 512 106 27136 7854.6
10000 RPM SAS 140 RAID 5 512 98 25088 7820.4
15000 RPM SAS 180 RAID 5 512 76 19456 7797.6

As you start calculating the IOPS requirements (both minimum and recommended), it quickly becomes apparent that achieving an “optimized” disk subsystem for your large database is going to be quite expensive and will most likely result in overprovisioned disks. When you begin to consider replication of environments for disaster recovery and nonproduction scenarios (i.e., moving production data into a nonproduction environment for testing), organizations can experience a 2-5X multiplier on the disk subsystem required to support SQL Server. Obviously this is not the ideal scenario for most organizations deploying SharePoint at any reasonable scale. RBS and products like Metalogix StoragePoint allow organizations to store content on the appropriate storage without meeting an expensive IOPS requirement.

Why Not Just Use the RBS FileStream Provider?

Somehow the RBS FileStream provider has evolved into a solution that some would actually consider for a medium- or large-scale SharePoint environment. I think folks forget why this provider was created in the first place. WSS 3.0 with the Windows Internal Database (WID) option does not have a database size limit. In theory, and in practice, organizations can and have stuffed large volumes of content into this “at no additional charge” product. With the release of SharePoint Foundation 2010 and SQL Server 2008 Express Edition, Microsoft introduced database size limits: SQL Server 2008 R2 Express Edition has a 10GB limit (SQL Server 2008 Express Edition has a 4GB limit) … wait for it … now you see the problem. How can a customer upgrade without buying SQL Server licenses? Enter the RBS FileStream provider.

The problem with the RBS FileStream provider is that it lacks basic features required to call it an enterprise solution. There are obvious issues such as the lack of a user interface, lack of support for truly “remote” storage, and lack of a multithreaded garbage collection process (an issue that plagues many StoragePoint competitors, as they opt to use the out-of-the-box garbage collector with RBS). More importantly, it fails to address a very important challenge: RBS FileStream does not bypass SQL Server for the processing of BLOBs. RBS FileStream pulls the BLOB out of the initial RPC call and then redirects it right back to SQL Server using the FILESTREAM column type. For obvious reasons, this is not an efficient process. I am not saying that the RBS FileStream provider is not a viable solution, but organizations considering this option should proceed with caution. Backing out of the RBS FileStream provider once you have amassed large volumes of content can prove cumbersome and time-consuming.

Backup and Restore Considerations

Backup/restore and disaster recovery can be a complex topic, and for this reason I am going to explore it in some detail in this post. Any RBS solution for SharePoint, including StoragePoint, will change the process for backing up and restoring SharePoint environments. What’s lost on most people is that this is not necessarily a negative aspect of RBS. Often the change is very positive and provides new ways of backing up SharePoint environments that weren’t previously possible.

Before we explore backup/restore processes it is important to first understand the anatomy of a BLOB when it is stored outside of SharePoint content databases. Externalized BLOBs are immutable, which means they will never change once they are written out to external storage. There is a one-to-one relationship between a BLOB and a given version of a file/document in SharePoint. This means that SharePoint will only create and delete BLOBs (StoragePoint actually deletes them as part of a garbage collection process). It may not be immediately apparent, but this is actually a good thing. Traditionally you would back up SharePoint content databases using a simple or full recovery model, which means taking full backups on a regular basis that contain objects that will never, ever change. This is less than efficient. By separating BLOBs from the database you can back up (or replicate) a BLOB one time rather than capturing it in multiple backups. This approach reduces backup storage costs and makes new DR scenarios (warm/hot failover) possible.
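A rough model shows why backing up immutable BLOBs once is so much cheaper than re-capturing them in every full backup. The sizes below (a 1 TB database that is 95% BLOB content) are illustrative assumptions:

```python
# Rough model of backup storage consumed over time: weekly full backups of a
# content database that contains the BLOBs, versus weekly fulls of a small
# metadata-only database plus a BLOB store backed up exactly once.
# All sizes are illustrative assumptions, not measurements.

def full_backup_storage(db_size_gb, weeks):
    """Weekly fulls re-capture every BLOB, changed or not."""
    return db_size_gb * weeks

def externalized_backup_storage(metadata_gb, blob_gb, weeks):
    """Weekly fulls of the (small) metadata database; BLOBs backed up once."""
    return metadata_gb * weeks + blob_gb

weeks = 12
traditional = full_backup_storage(1000, weeks)              # 1 TB database, all-in
externalized = externalized_backup_storage(50, 950, weeks)  # 95% of it is BLOBs
print(traditional, externalized)  # → 12000 1550
```

Over just one quarter the externalized approach consumes roughly an eighth of the backup storage in this scenario, and the gap widens the longer backups are retained.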

In general the backup process involves backing up the content database followed by the external BLOB store(s). A farm-level restore would involve restoring your BLOB store followed by your content database(s). In many cases it isn’t necessary to back up the external BLOB store, as there are ways to replicate it to multiple locations. Item-level restores tend to be the area of biggest concern when using an RBS solution like StoragePoint. Fortunately StoragePoint has some built-in features to make item-level restore feasible. StoragePoint includes a feature called “Orphaned BLOB Retention Policies” that allows for the retention of BLOBs whose corresponding database item has been deleted. These retention policies are used in conjunction with item-level restore tools to guarantee that item-level restore is available for a definable period of time.
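Conceptually, a retention policy of this kind just answers one question: has the deleted item’s BLOB aged out of its retention window yet? The sketch below is a minimal illustration of that idea; the class and field names are hypothetical, since StoragePoint’s actual implementation is not public:

```python
# Minimal sketch of an "orphaned BLOB retention policy": an orphaned BLOB
# (one whose database item was deleted) remains restorable until its
# retention window lapses; only then may garbage collection delete it.
# Names here are hypothetical, not StoragePoint's real API.
from datetime import date, timedelta

class OrphanedBlobPolicy:
    def __init__(self, retention_days):
        self.retention = timedelta(days=retention_days)

    def is_restorable(self, deleted_on, today):
        """True while the BLOB is still within its retention window."""
        return today - deleted_on <= self.retention

policy = OrphanedBlobPolicy(retention_days=30)
print(policy.is_restorable(date(2012, 9, 1), date(2012, 9, 15)))   # within window
print(policy.is_restorable(date(2012, 9, 1), date(2012, 10, 15)))  # window lapsed
```

The retention window is the knob an administrator tunes: longer windows keep more orphaned BLOBs on disk but guarantee a longer period in which item-level restore is possible.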

Conclusion

RBS is clearly a viable option for organizations leveraging SharePoint where growth of the environment will be steady or exponential over time. Microsoft’s updated guidelines and database size limits are a confirmation of sorts of the opportunity that RBS presents for SharePoint deployments. If you are deploying SharePoint at any reasonable scale, you should consider RBS as an option for optimizing storage for both active and archived content in your SharePoint environment.

SQL Server Management Studio “Unable to Cast COM of Type …”

Ugh! I am stuck in Windows Registry hell. Here is the situation. I have a Windows Server 2008 R2 Standard Edition install running SQL Server 2008 SP1 and Visual Studio.NET 2008 SP1. It seems that with this configuration I get the following error in SQL Server Management Studio (SSMS).


Unable to cast COM object of type ‘System.__ComObject’ to interface type ‘Microsoft.VisualStudio.OLE.Interop.IServiceProvider’. This operation failed because the QueryInterface call on the COM component for the interface with IID ‘{6D5140C1-7436-11CE-8034-00AA006009FA}’ failed due to the following error: No such interface supported (Exception from HRESULT: 0x80004002 (E_NOINTERFACE)). (Microsoft.VisualStudio.OLE.Interop)

After doing some searching it appears that this is the result of Windows registry corruption that can be solved by re-registering actxprxy.dll (open a command prompt running as administrator and run ‘regsvr32 actxprxy.dll’).  There are several blog posts that document this solution (below are a few links).  Unfortunately for me this did not fix the problem, and I haven’t been able to find a fix, which makes SSMS unusable.  The irony of this situation is that I have a laptop with the exact same problem running Windows 7 RC1 with SQL Server 2005 and Visual Studio.NET 2008, so the common denominator appears to be VS.NET 2008.  I am anxious to hear whether others are having this problem and whether the above workaround was successful for them.

http://www.thisispaulsmith.co.uk/BLOG/post/SQL-Server-Management-Studio-Error-Unable-to-cast-COM-objecthellip3b.aspx

http://support.microsoft.com/kb/922214

http://blog.newslacker.net/2008/02/sql-server-2005-unable-to-cast-com.html

 

Enabling Outlook Instant Search on Windows Server 2008

I have recently switched my laptop over to Windows Server 2008 from Windows 7 to take advantage of Hyper-V.  I wasn’t happy with Virtual PC 2007 on Windows 7, and VMware is too expensive when I have access to Windows Server 2008.  Overall the experience is very good, as Windows Server 2008 runs very quickly on my Lenovo T61 (I am not running the Desktop Experience feature as it consumes too many system resources).  Since I am running this machine as my core workstation, I am running Office 2007 and ran into a scenario where I wanted to enable Outlook Instant Search.  Here is a link to a good blog post that walks you through the process of enabling Outlook Instant Search on Windows Server 2008.

http://exchangepedia.com/blog/2009/04/using-outlooks-instant-search-feature.html

Attempted to read or write protected memory – Event IDs 6398, 6482, 7076

I have run into this issue several times in the past, where a SharePoint server’s event log fills up with event IDs 6398, 6482, and 7076.  These errors typically pop up in your event viewer for administrative timer jobs.  According to Microsoft there is an issue with multithreaded applications accessing IIS at the same time, and Microsoft has a hotfix to address it.  The knowledge base article is located here: http://support.microsoft.com/default.aspx?scid=kb;EN-US;946517

Configuring Live Writer for SubText

Windows Live Writer can be configured for use with SubText.  Here are the steps.

1). Navigate to the Weblog > Add Weblog Account menu

2). Select Another weblog service 


3). Enter the URL of your weblog (http://myblog.com/blogname)

4). Enter a username and password.


5). Under the type of weblog select Metaweblog API and enter the Remote Posting URL in the following format:
http://<someblogsite.com>/<username>/services/MetaBlogAPI.aspx


6). Click Next.  Enter a name for this blog setting and click Finish.

 

 

