The StoragePoint team announced the release of version 3.1 on 10/26/2010. This is the latest release in the award-winning StoragePoint product line and contains the new File Share Librarian module (along with a list of performance updates and hot fixes). The StoragePoint team created the File Share Librarian module in response to customer requests for a high-speed migration capability for file share data. Traditional, full-fidelity migration tools provide an extensive set of features but sacrifice performance to provide those features. You can expect to see somewhere in the neighborhood of 5GB per hour from a traditional file-share-to-SharePoint migration tool. As you can see, we aren’t going to be setting any land speed records anytime soon with that level of performance. In many file share migration scenarios, customers want to move large quantities of data from the file system into SharePoint and then disable file system access to the content. StoragePoint File Share Librarian is the answer.
By leveraging the RBS/EBS capabilities of StoragePoint, the File Share Librarian uses a “shallow” copy migration to catalog file share content into SharePoint. To understand this concept, you first need to understand the workflow involved in inserting content into SharePoint with StoragePoint deployed. Files uploaded to the SharePoint web front end server (either through the user interface or by some programmatic means) are handed off to StoragePoint for processing. StoragePoint then stores the file on some external storage device (the most common scenario is a file share). For migration scenarios, rather than deal with the overhead of physically moving the file into SharePoint only to have it immediately externalized by StoragePoint, the File Share Librarian module leaves the file in place and simply catalogs it to make it available in SharePoint.
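To make the “shallow” copy idea concrete, here is a minimal Python sketch. It is not StoragePoint code and does not use any StoragePoint or SharePoint API; the names `CatalogEntry` and `catalog_share` are hypothetical and exist only for illustration. The point it shows is that cataloging reads file metadata and records where each file already lives, without ever opening or copying the file contents.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical record standing in for a list item that points at an in-place file."""
    relative_path: str   # path below the share root, using "/" separators
    blob_location: str   # where the bytes already live; nothing is moved
    size_bytes: int
    modified: float      # last-modified time as a POSIX timestamp


def catalog_share(root):
    """Walk the share and build catalog entries from metadata only."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            info = os.stat(full_path)  # metadata lookup; file contents are never read
            relative = os.path.relpath(full_path, root).replace(os.sep, "/")
            entries.append(CatalogEntry(relative, full_path, info.st_size, info.st_mtime))
    return entries
```

Because the work per file is a metadata lookup rather than a byte-for-byte transfer, the cost of a run like this scales with the number of files rather than with their total size.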
The benefits of the File Share Librarian module are many, but the most notable is the ability to process millions of files in a short period of time. The speed of your migration is no longer bound to the physical size of your file share. Below you will find an additional list of features for the File Share Librarian along with links to resources where you can learn more about its capabilities. Note that the File Share Librarian will work with both SharePoint 2007 and SharePoint 2010 environments.
- Single and Dual Access Modes. Single Access Mode treats the cataloged file share 100% like a normal BLOB store, assuming that you are going to shut off end user access to the content once the file share is cataloged. And as you’re probably already guessing, Dual Access Mode is there for that transitional period between end users accessing content via the file share and end users accessing content exclusively via SharePoint. It would be nice to just say that on Monday the file share is gone and you need to go to this URL to get to your content. You can send that email out until you’re blue in the face and 50% of your end users will still open help desk tickets or call yelling and screaming Monday morning wanting to know where their content is. That’s not to say that you can’t operate in Dual mode for some undetermined period of time, but why would you want to? Let me put it this way…we didn’t put this feature in Librarian to encourage folks to operate this way…we put it in there to do our little part to keep the IT staff sane while end users become comfortable with change and/or get out of their own way.
- Simulation Mode. Run the Librarian in this mode if you want to detect and have an opportunity to cleanse invalid folder or file names…or shorten path and file names to get within SharePoint URI length limits. You can let us do this stuff for you (…there is an option), but we’re going to be punitive about it (i.e. an ampersand goes bye-bye, it doesn’t get converted to “and”).
- Scheduling Options. Schedule the Librarian job to run on pretty much any interval you can come up with. The initial run will catalog the entire file share, with subsequent runs cataloging only the changes (i.e. new files added, old files removed, file updates/changes, etc.).
- Exclusions. Exclude file share content by sub folders, filename or filespec, created date, last modified date, and last accessed date.
- Dynamic Container Creation. Librarian will create a dynamic container structure (i.e. site collections, sites, libraries, and folders) based on the starting Destination container you pick when creating the Librarian configuration. If your Destination is a Web Application or Content Database, it will start creating site collections. If you start with a Library it will start creating folders. You get the idea.
- Permissions Promotion. As previously stated, the Librarian takes the effective permissions of the folders to be cataloged and maps those to Owner (Full Control), Member (Contributor), and Visitor (Reader) roles on the SharePoint containers it creates.
- Picture and Asset Library Support. If the file share content is being cataloged into Picture or Asset libraries, Librarian will create thumbnail and preview images for supported image types and make sure those are cataloged as well. This removes the need to upload the images into SharePoint, where SharePoint would natively create the thumbnail and preview images itself.
- My Site Support. Pretty much everybody has that drive that’s mapped to them whenever they log in to Windows. It’s their own personal networked “My Documents”. We’ve talked to lots of companies that would like to transform those personal drives into personal sites in SharePoint, but many times we’re talking about tens of thousands of users and tens to hundreds of terabytes of data. Not something you’re going to easily represent in SharePoint. By simply pointing Librarian at the root of those personal drives and picking a My Site Host as the destination, you would be well on your way to accomplishing that move with little effort. You would get some use case-specific options to let you tell Librarian where to find the actual personal drive folders (…even if they are not immediately under the root of the parent file share), where to store the content in the personal site (i.e. Personal or Shared Documents or a custom structure), whether or not to create the User Profile and Personal Site if they don’t already exist, and so on. As a test, we took 48,000 user folders and 48 million documents representing 6TB of content and represented them in SharePoint as 48,000 My Sites and 48 million list items in a single 150GB content database…in just over 6 days. And no, this wasn’t on some kind of fantasy hardware. The SQL box had dual quad-core Xeon processors with 24GB of memory, and the app server the job ran on had a single quad-core processor with 4GB of memory. While the job was running (16 threads…you can control that and change it in-flight anytime you like), the app server was running at about 80% CPU utilization with the SQL box running under 10%, the vast majority of the time at 5-7% (…it was taking a nap). I think the data speaks for itself. We’re going to conduct some larger-scale tests in the coming months and publish a whitepaper with detailed results, including farm configuration.
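For a rough sense of scale, the My Site test figures can be set against the 5GB-per-hour ballpark quoted earlier for traditional tools. The short Python calculation below is only back-of-the-envelope arithmetic: it assumes 1TB = 1024GB and rounds “just over 6 days” down to exactly 6 days, so the Librarian rates it produces are slight overestimates. Keep in mind that the Librarian catalogs files in place, so its GB-per-hour figure is an equivalent rate and not bytes actually copied.

```python
GB_PER_TB = 1024            # assumption: binary terabytes
TOTAL_GB = 6 * GB_PER_TB    # 6TB of content from the test
TOTAL_FILES = 48_000_000    # 48 million documents from the test

# Traditional tool at roughly 5GB per hour
traditional_hours = TOTAL_GB / 5
traditional_days = traditional_hours / 24             # about 51 days

# Librarian test run, approximated as exactly 6 days
librarian_hours = 6 * 24
librarian_gb_per_hour = TOTAL_GB / librarian_hours    # about 42.7 GB/hour equivalent
files_per_second = TOTAL_FILES / (librarian_hours * 3600)  # about 92.6 files/second

print(f"Traditional tool: about {traditional_days:.1f} days")
print(f"Librarian: about {librarian_gb_per_hour:.1f} GB/hour equivalent, "
      f"{files_per_second:.1f} files/second")
```

Under those assumptions, the same 6TB would take a 5GB-per-hour tool roughly 51 days, against about 6 days in the test above.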
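The kind of check that Simulation Mode performs (see the list above) can be pictured with a small Python sketch. This is not the Librarian’s actual logic. The character set and the 128/260 length limits are the commonly cited restrictions for SharePoint 2007/2010 and are treated here as assumptions to verify against your own farm; `find_issues`, `strip_invalid`, and the `base_url` parameter are hypothetical names used only for illustration.

```python
# Assumptions: characters commonly documented as invalid in SharePoint 2007/2010
# file and folder names, plus commonly cited length limits. Verify for your farm.
INVALID_CHARS = set('~"#%&*:<>?/\\{|}')
MAX_NAME_LENGTH = 128   # assumed per-name limit
MAX_URL_LENGTH = 260    # assumed overall URL limit


def find_issues(relative_path, base_url=""):
    """Report problems with a "/"-separated path without changing anything."""
    issues = []
    for segment in relative_path.split("/"):
        bad = "".join(sorted(set(segment) & INVALID_CHARS))
        if bad:
            issues.append(f"{segment!r} contains invalid characters: {bad}")
        if len(segment) > MAX_NAME_LENGTH:
            issues.append(f"{segment!r} is longer than {MAX_NAME_LENGTH} characters")
    if len(base_url) + len(relative_path) > MAX_URL_LENGTH:
        issues.append(f"full URL is longer than {MAX_URL_LENGTH} characters")
    return issues


def strip_invalid(segment):
    """The 'punitive' fix: invalid characters are dropped, not translated."""
    return "".join(ch for ch in segment if ch not in INVALID_CHARS)


print(find_issues("Reports/R&D plan.docx"))
print(strip_invalid("R&D plan.docx"))
```

Running a report like `find_issues` first gives you the chance to rename things sensibly yourself; `strip_invalid` shows why you might prefer that, since the ampersand in “R&D” simply disappears.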