Optimal Designs for Managing Documents and Emails in SharePoint


This paper was originally published in 2012 and revised in 2013. It has proven to be very popular, so it has now been updated to incorporate observations about Office 365 SharePoint Online and other insights gained in the course of a large number of SharePoint document and email management projects. I trust that you find it useful as you look to design or redesign a SharePoint environment so that it is optimal for managing documents and emails.
Noel Williams, Founder & CEO, MacroView, February 2017

Introduction

As organizations attempt to use Microsoft SharePoint for document management and email management, a question that arises frequently is “What is the best Information Architecture for the SharePoint DM store?” In other words, what is the best way to arrange the various SharePoint ‘building blocks’ – site collections, sites, document libraries, document sets, folders and metadata columns – to come up with a design that is optimal in terms of volume handling, manageability and ease of use? That last objective – ease of use – is critical: if users do not find it easy to interact with the document store, they will likely continue to store their documents in file shares or local drives and their emails in Outlook folders, and the SharePoint DM project will not be successful.

Design Alternatives

In the early days of SharePoint a design we saw often was a large SharePoint document library with a tree of folders, mirroring the tree of folders in a previous Windows file share or Outlook public folders environment.

An alternative approach that also pops up quite often seeks to take advantage of SharePoint’s support for metadata – rather than have a tree of folders isn’t it better to have libraries with metadata columns, particularly Managed metadata columns?

Occasionally we also come across designers – particularly those with a background in traditional DM systems – who feel that the best approach is to store all the documents and emails in one or two big libraries and rely on search, rather than have a hierarchy of storage containers.

We’ll explore each of these alternative designs and quite a few variations, including:

  • Document sets are better than folders or libraries
  • You need to use a large number of Site Collections
  • You need to avoid breaks in permission inheritance
  • Etc.

Before we dive into the detail I would make the point that there is no single design that is right for every organization. To come up with an optimal SharePoint DM design you do need to take specific business requirements into account.

A second point is that size does matter – lots of designs work well enough if you have only a small volume of documents or users, but good design makes a major difference when there are hundreds of terabytes of documents and / or thousands of users. A related observation is that document stores tend to grow very quickly.  Even if you don’t have much volume stored in SharePoint today, you have only to look at how your file shares and Exchange stores have expanded in size over the past several years to realize that you should be designing your SharePoint DM store to cope with volume.

At MacroView we specialize in SharePoint-based solutions for document and email management. We sell our MacroView Document Management Framework (DMF) add-on to customers across the world. These customer organizations range from small to very large, so we encounter many different designs for SharePoint DM environments, including all those we will look at in this post.

Large Library with a Tree of Folders
This design has the advantage of being familiar to business users. And at first blush it seems like SharePoint is fine with this style of design – why else would SharePoint offer folders, and allow folders to be nested?

However there is a fundamental issue with this design, which is that the full SharePoint folder names become part of the URL for a document stored in SharePoint. If the folders are nested, and especially if their names are long and meaningful, there is a strong possibility that the URLs for documents will exceed their maximum permitted length of 255 characters.
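The effect of nesting on URL length can be illustrated with a minimal Python sketch. The site address, library name and folder names below are hypothetical, and the sketch simply counts characters in the unencoded URL against the 255-character figure quoted above; the exact limit and how it is measured should be checked for your SharePoint version.

```python
MAX_URL_LENGTH = 255  # limit quoted in this article; verify it for your SharePoint version


def document_url(site_url: str, library: str, folders: list[str], filename: str) -> str:
    """Build a document URL in which every folder name becomes a path segment."""
    segments = [site_url.rstrip("/"), library, *folders, filename]
    return "/".join(segments)


def exceeds_url_limit(url: str, limit: int = MAX_URL_LENGTH) -> bool:
    """Return True when the URL is longer than the permitted length."""
    return len(url) > limit


# Hypothetical site and folder names, chosen only to show the effect of nesting.
shallow = document_url("https://sp.example.com/sites/dm", "Documents", ["Clients"], "Lease.docx")
deep = document_url(
    "https://sp.example.com/sites/dm",
    "Documents",
    [f"Commercial Property Transactions {n}" for n in range(1, 8)],
    "Lease.docx",
)
print(len(shallow), exceeds_url_limit(shallow))
print(len(deep), exceeds_url_limit(deep))
```

Seven levels of descriptively named folders are enough to push the second URL past the limit, even though the file name itself is short.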

These designs also lead to very large numbers of documents being stored in a single document library, which in turn can cause poor performance when you view the lists of documents in those libraries.  In an attempt to prevent performance degradation, with SharePoint 2010 Microsoft introduced List View Throttling. Put simply, if generating a view of a document library requires processing more than 5,000 items in the underlying SQL Server database (items can be documents or folders or document sets) SharePoint will not display all the items (or depending on configuration will not display any items). This might avoid poor performance, but it is far from optimal from the user’s perspective.

Some organizations have tried raising the List View Threshold (LVT) from its default value of 5,000 for end users to a much higher number – e.g. greater than the number of items in the library. This might fool SharePoint into issuing queries that need to process more than 5,000 items, but on receiving those queries SQL Server will still apply a table lock that will lead to poor performance for other users (which is what the LVT was instituted to avoid).
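As a rough planning aid, the threshold rule can be expressed as a simple check. The sketch below is illustrative only: it uses the total item count of a container as a proxy for the number of items a view must process (a view that uses a suitable index may process fewer), and the library names and counts are hypothetical.

```python
LIST_VIEW_THRESHOLD = 5000  # default for end users, as cited in this article


def view_is_throttled(items_to_process: int, threshold: int = LIST_VIEW_THRESHOLD) -> bool:
    """A view is throttled when it must process more items than the threshold."""
    return items_to_process > threshold


def containers_at_risk(item_counts: dict[str, int], threshold: int = LIST_VIEW_THRESHOLD) -> list[str]:
    """Return the names of containers whose item count exceeds the threshold."""
    return sorted(
        name for name, count in item_counts.items() if view_is_throttled(count, threshold)
    )


# Hypothetical item counts (documents + folders + document sets per library).
counts = {"One Big Library": 180_000, "Project AA1234567": 1_200, "Project AA7654321": 4_950}
print(containers_at_risk(counts))
```

Spreading the same content across more containers keeps each one under the threshold, which is the design direction explored in the rest of this paper.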

Libraries with metadata columns
It’s true that SharePoint’s ability to record and use metadata is one of the key features of SharePoint, compared to Windows file shares and Outlook folders.  This leads some designers to add lots of metadata columns to their document libraries. Their argument is that the more metadata columns, the more flexibility you have for slicing and dicing or searching to find the particular documents / emails that you want to work with.

Unfortunately this ignores the fact that business users generally dislike being prompted for metadata every time they save a document or email. Those users make the valid point that with their previous File Shares and Outlook folders they simply saved to the appropriate folder and were not prompted for any other details. The most they might have had to do was to create a new folder.

If the metadata columns are not mandatory, the likely result is that users will skip them. Similarly if the metadata columns have default values, chances are that users will go with the defaults, even if they are not appropriate.

Metadata capture fatigue is one of the key reasons why SharePoint-based document and email management solutions fail to gain adoption by business users. The way forward is not to abandon metadata, but to make the recording of metadata as automatic as possible – more of this below.

One Big Library and Rely on Search
Architects who are migrating to SharePoint DM from a traditional DM system such as iManage Worksite or OpenText eDocs / Hummingbird DM, sometimes feel that a search-based approach is preferable to bothering with setting up and maintaining any container hierarchy in SharePoint. Often they also opt to use a very small number of libraries – because this is standard practice in a traditional DM system.

These designs might sound appealing on the basis that they are familiar to users moving to SharePoint from a traditional DM system. Indeed MacroView DMF and MacroView Message enable very convenient and intuitive searching for documents across a SharePoint store.

However these designs are decidedly sub-optimal, essentially because they are at odds with the inherent structure of SharePoint, which is a tree of storage containers. Furthermore, designs that rely on a small number of libraries will almost certainly run into volume handling issues (including issues related to List View Threshold).

Instead of attempting to hide the container tree, it is much better to design it so that it is intuitive to the user and efficient in operation. Hopefully the observations in this article will alert you to the best techniques for the design of that tree.

Automatic Metadata Capture
MacroView Message and MacroView DMF both ship with two capabilities that automate the recording of metadata as you save an Outlook email to SharePoint. The result is that emails can be saved to SharePoint with no prompting at all:

  1. Automatically select an Email content type (if defined in the destination library)
  2. Automatically record all non-personal attributes of the email in corresponding metadata columns in the destination library.

Non-personal attributes include To, CC, BCC, From, Subject, Conversation Topic, etc. The metadata columns in the destination library can be the ones that MacroView uses by default (their internal names are mvTo, mvCC, mvBCC, etc) or alternatively MacroView Message and MacroView DMF can be configured to record in metadata columns of your choosing.
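To make the idea of zero-prompt capture concrete, here is a minimal Python sketch that reads the attributes of an email and maps them to metadata columns. It is not MacroView's implementation. The column names mvTo, mvCC and mvBCC come from the text above; EmailFrom and EmailSubject are hypothetical names standing in for columns of your choosing, and the sample message is invented.

```python
from email import message_from_string
from email.utils import getaddresses

# Email header -> destination metadata column.
COLUMN_MAP = {
    "To": "mvTo",
    "Cc": "mvCC",
    "Bcc": "mvBCC",
    "From": "EmailFrom",        # hypothetical custom column name
    "Subject": "EmailSubject",  # hypothetical custom column name
}
ADDRESS_HEADERS = {"To", "Cc", "Bcc", "From"}


def email_metadata(raw_message: str) -> dict[str, str]:
    """Derive column values from the email itself, so the user is never prompted."""
    msg = message_from_string(raw_message)
    metadata = {}
    for header, column in COLUMN_MAP.items():
        values = msg.get_all(header)
        if not values:
            continue  # header absent: leave the column empty
        if header in ADDRESS_HEADERS:
            addresses = [address for _name, address in getaddresses(values)]
            metadata[column] = "; ".join(addresses)
        else:
            metadata[column] = values[0]
    return metadata


raw = (
    "From: Ann <ann@example.com>\n"
    "To: Bob <bob@example.com>, carol@example.com\n"
    "Subject: Draft agreement\n"
    "\n"
    "Please review the attached draft.\n"
)
print(email_metadata(raw))
```

Every value is taken from the message, so nothing needs to be typed at save time.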

Of course you could add additional metadata columns to the Email content type in your libraries, in which case MacroView Message and MacroView DMF would prompt for those columns, unless they can record them automatically (more of that below).  The general point to remember is that zero prompting as emails are saved is a big plus when it comes to user adoption.

Automatic recording of metadata columns is popular with business users when documents (not just emails) are being saved or uploaded to SharePoint. There are a variety of techniques for automatic recording of metadata, almost all of which are based on the user dragging and dropping (or otherwise saving) to a particular node in the SharePoint tree structure.

For example if the user saves to a folder called Agreements the Document Type metadata column can be pre-defined to be Agreement. Generally speaking, automatic metadata capture works best when you have more nodes in your SharePoint document store design – that way each node can have its own unique metadata pre-defined.
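The location-based technique can be sketched as a lookup from the node a user saves to onto pre-defined column values. The Python below is an illustration under stated assumptions, not a description of how SharePoint or MacroView stores defaults: the node paths and the Project Code value are hypothetical, and the rule that a node also picks up defaults from its ancestors is an assumption made for the example.

```python
def resolve_defaults(node_path: str, defaults_by_node: dict[str, dict[str, str]]) -> dict[str, str]:
    """Merge the defaults of a node and its ancestors; deeper nodes override shallower ones."""
    resolved: dict[str, str] = {}
    parts = node_path.strip("/").split("/")
    for depth in range(1, len(parts) + 1):
        prefix = "/".join(parts[:depth])
        resolved.update(defaults_by_node.get(prefix, {}))
    return resolved


# Hypothetical tree: a project node with an Agreements folder beneath it.
defaults = {
    "Projects/AA1234567": {"Project Code": "AA1234567"},
    "Projects/AA1234567/Agreements": {"Document Type": "Agreement"},
}
print(resolve_defaults("Projects/AA1234567/Agreements", defaults))
```

A document dropped on the Agreements node receives both values without the user being asked for either, which is why a design with more nodes gives more scope for automatic capture.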

We have already seen how having a large library with a deeply nested folder structure can lead to problems, so what design or designs are better for large scale automatic metadata capture?

Document Sets

A document set can be a handy alternative to a folder. Like folders, document sets can be created by Contribute level users – you don’t need Design or Administrator level permissions as you would if you were creating document libraries or sites. In other words, both folders and document sets are good when end-users need self-service capabilities – e.g. creating the new area for a new project or transaction or matter.

Document sets have 3 potential advantages over folders: 

  • The document set can have its own metadata values. E.g. a document set for a new project can have Project Code, Project Name, Managed By and Project Type. You can have a default view of the library display these attributes, thereby creating a mini Projects Register. See this MacroView blog post for more details.  Selected metadata values can be shared to the documents within the document set. By using folder-level defaults you can share metadata with documents stored in folders, but document sets make sharing much easier.
  • Document sets cannot be nested. This addresses one of the key weaknesses associated with designs that use libraries with folders, which is that deeply nested folders can lead to document URLs that are longer than 255 characters.
  • The document set can be treated as a single unit for the purpose of workflow processing.

 

Tree of Sites, Sub-sites and Libraries
If you are using the OOB SharePoint UI you tend to want your tree structure to be all within a library or libraries, using folders and document sets.  However by using the MacroView add-ons you open up the possibility of also using sites and sub-sites to build the overall tree structure, which can have some nice advantages. Let’s examine this in more detail…

Viewing and Navigating the Structure of a SharePoint Document Store
MacroView DMF and MacroView Message show you the complete tree structure of your SharePoint document store – whether it be SharePoint Server on-premises or Office 365 SharePoint Online. The ability to see and easily navigate the complete tree structure enables designs that are good at handling large volumes and which avoid the issues with a tree of folders.

On-premises SharePoint Server Deployments
When MacroView is deployed to an on-premises SharePoint Server environment you simply register your SharePoint web applications and MacroView DMF then automatically (and efficiently) displays all areas of the store that contain document content for which you have access permission, or where you have permission to save new documents and emails.

Unlike the OOB SharePoint web browser UI, MacroView DMF does not just show the tree in a particular site. Instead the DMF display starts at the web application level and shows site collections, sites and sub-sites; for each site you can see the document libraries, and in turn the document sets and the tree of folders and sub-folders contained in each of those document libraries. The MacroView DMF tree-view display extends down to the hierarchies of those metadata columns that have been defined as being available for Metadata Navigation.

Unlike the OOB SharePoint web browser UI, the MacroView DMF tree display does not stop when it encounters a break in permission inheritance. Instead the DMF display extends all the way down to show any node that contains content for which you have access, or in which you have permission to save – even if there are multiple breaks in permission inheritance on the way down to those nodes.

MacroView DMF displays the full SharePoint site and library tree, from web application level down to metadata navigation ‘virtual folders’.

SharePoint Online Deployments
When MacroView is deployed to a SharePoint Online environment you register your SharePoint Online tenancy and MacroView automatically discovers and displays all the site collections for which you have access permission. You can then expand those site collections to see the sub-sites, document libraries, folders and document sets that they contain. The difference compared to on-premises deployments is that the display stops as soon as it encounters a node for which you do not have any permission (i.e. just like the SharePoint web browser UI).
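The difference between the two display behaviours can be modelled as two tree traversals. The Python sketch below is a simplified illustration, not MacroView's code: the Node type, its accessible flag and the sample tree are hypothetical, and real permission checks are far richer than a single boolean.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    accessible: bool  # does the current user have any permission on this node?
    children: list["Node"] = field(default_factory=list)


def visible_stop_at_denied(node: Node):
    """Stop descending at the first node the user cannot access (browser-style display)."""
    if not node.accessible:
        return None
    kids = [v for child in node.children if (v := visible_stop_at_denied(child)) is not None]
    return (node.name, kids)


def visible_through_breaks(node: Node):
    """Keep any node that is accessible or that leads down to an accessible node."""
    kids = [v for child in node.children if (v := visible_through_breaks(child)) is not None]
    if node.accessible or kids:
        return (node.name, kids)
    return None


# Hypothetical store: the user can reach a library beneath a site they cannot open.
tree = Node("Web App", True, [Node("Clients", False, [Node("Acme Library", True)])])
print(visible_stop_at_denied(tree))
print(visible_through_breaks(tree))
```

In the first traversal the Acme Library never appears because its parent is hidden; in the second it remains reachable, which corresponds to the on-premises behaviour described earlier.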

In their tree displays MacroView DMF 365 and MacroView Message 365 automatically create a separate top-level node under which all the OneDrive for Business personal sites are shown. This is to avoid the extra navigation difficulty that can result if personal and non-personal site collections are mixed together.

Hybrid Deployments

MacroView DMF Hybrid and MacroView Message Hybrid enable the display in the same tree of a SharePoint Online tenancy AND one or more on-premises SharePoint web applications. They also support moving and copying documents and emails between the online and on-premises environments. This hybrid deployment can provide a convenient and easy-to-manage way to handle collaboration with external parties, while continuing to use an on-premises SharePoint Server to manage a large document store.

“Email integration is what first attracted me to MacroView, but I have come to realize that the unique value of MacroView is the way it enables users to manage documents and emails across the whole SharePoint farm. Providing users with the same ability to quickly find, filter and use their sites (especially while working in Outlook) with all security pre-applied would otherwise require massive customization. The volume handling and ease-of-use that MacroView provides is a key success factor for large scale SharePoint DM Projects.”
Kyle Connell, ECM Specialist, RKO Business Solutions

Efficient Display and Navigation of Large SharePoint Trees

We designed MacroView DMF and MacroView Message to cope with very large SharePoint document stores. By ‘large’ we mean SharePoint stores with a large tree of site collections, sites, libraries, folders etc as well as large numbers of documents. This is particularly the case when MacroView is deployed with on-premises SharePoint Server and the MacroView DMF custom full-trust web service is installed at the server.

A key aspect of the MacroView DMF tree display is that it never attempts to display more than a threshold number of sub-nodes when you click to expand any node in the tree. Consider a site that contains hundreds of document libraries. MacroView DMF will prompt you to enter some characters contained in the titles of those document libraries and then display only those libraries whose titles do contain the nominated characters.

Another example would be a document library that contains thousands of document sets, or thousands of folders. When you click to expand the document library node, MacroView DMF will prompt you to enter some characters contained in the names of the document sets or folders, and then display only those document sets or folders whose names do contain the nominated characters.

“Now that we have implemented MacroView DMF, our users are no longer contacting the Help Desk saying they don’t know where their documents are.”
Himanshu Pandya, Manager, PMO & Governance, AEGIS Insurance Services

This filtering has excellent performance and minimal bandwidth consumption, thanks to the custom MacroView DMF web service. The overall result is efficient, not just in terms of machine resources but also for the ‘real human’ user – who is able to quickly drill down to the library, document set or folder that they want to work with.  If the title for a Project library contains both the code number or ID for the Project and the name of the Project (e.g. ‘AA1234567 – Sale of Paradise Palms Condominiums’) then the user can quickly drill down to a wanted Project library using part of either the Project ID or the Project Name.

MacroView DMF prompting to filter a document library that contains 50,000 document sets.

As the screen shot above shows, this filtering works even when the number of document sets or folders present in the library exceeds the List View Threshold. It also works if you were to expand a folder that contained more than the List View Threshold number of sub-folders.
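The prompt-to-filter behaviour can be sketched in a few lines of Python. This is an illustration of the idea rather than MacroView's implementation: the threshold value of 100 is hypothetical (the paper does not state one), case-insensitive matching is an assumption, and in the real product the filtering runs server-side rather than over a list held in memory.

```python
DISPLAY_THRESHOLD = 100  # hypothetical value; the actual threshold is not stated here


def expand_node(child_titles, filter_text=None, threshold=DISPLAY_THRESHOLD):
    """Return (titles_to_display, needs_filter) for a node being expanded."""
    if filter_text:
        needle = filter_text.casefold()
        matches = [title for title in child_titles if needle in title.casefold()]
    else:
        matches = list(child_titles)
    if len(matches) > threshold:
        return [], True  # too many sub-nodes: prompt for (more) filter characters
    return matches, False


# Hypothetical library titles combining a Project ID and a Project Name.
titles = [f"AA{i:07d} – Project {i}" for i in range(500)]
titles.append("AA1234567 – Sale of Paradise Palms Condominiums")

print(expand_node(titles))              # no filter yet: the user is prompted
print(expand_node(titles, "paradise"))  # part of the Project Name
print(expand_node(titles, "1234567"))   # part of the Project ID
```

Because the title carries both the ID and the name, either fragment narrows the list to the wanted library.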

Familiar, Even Better User Experience
MacroView DMF makes the SharePoint document store look like a tree of containers, which is familiar to users based on their experience of viewing a tree of folders in Windows Explorer or Outlook. If anything, the experience is better than that in Windows Explorer or Outlook because MacroView DMF provides a number of ways of efficiently navigating around a large SharePoint tree to find the container that they want to work with – e.g. a specific library, folder or document set.

The filtering of sites, libraries and folders described above is one such way; another is the Search Site Tree feature of DMF, which enables direct navigation to a site or library by searching for the title (or part thereof) of that site or library. In our example from above, this approach would allow a user to go straight to the library for a particular Project if he / she can remember either the Project ID or part of the Project Name. There is no need to first navigate to the site that contains the library, and it does not matter where or how deeply the Project library is nested in the SharePoint tree. MacroView DMF uses the SharePoint search engine to perform an indexed search, so performance does not degrade as the SharePoint store grows.

“Thanks to MacroView DMF, user productivity was restored, even improved in several situations. That in turn has helped to make our new SharePoint–based document management a success.”
Mark Buttice, CIO, Mountain States Employers Council

Much Better than the OOB SharePoint Web Browser UI
Compared to using the Out-Of-the-Box SharePoint web browser UI, MacroView DMF provides a much better user experience for viewing and navigating the tree structure of a SharePoint document store. MacroView DMF respects the SharePoint security model and so it will not show you any content that you would not have been able to see by using the web browser UI (and many more keystrokes). However MacroView DMF has the following advantages over the web browser UI:

  • You can view and navigate the SharePoint document store while you work in familiar applications such as Microsoft Outlook, Word, Adobe Reader, etc.
  • Shows the complete SharePoint tree, but not ‘furniture nodes’ such as Pages, Images, Site Collection Pages, Site Collection Images, Style Library, etc.
  • Copes with breaks in permission inheritance.
  • Avoids issues with the List View Threshold.
  • Is not subject to the 2,000 sub-site limitation.
  • Facilitates efficient navigation around a large SharePoint tree.

MacroView DMF enables a number of designs for a SharePoint document store that can be optimal in terms of volume handling, manageability and ease of use, but which would not really be usable if all you have to work with is the OOB SharePoint web browser UI.

Large Number of Site Collections
This design can be relevant to an organization that needs to store a very large volume of documents, because each site collection can potentially be mapped to a separate Content Database. In other words having a large number of site collections maximizes flexibility in terms of managing SQL Server storage. This design is also relevant to organizations that have deployed Personal Sites (also known as MySites) because each Personal Site is implemented as a site collection. Best practice is for these Personal Site Collections to be stored in a separate web application.

As you click to expand a web application that contains a large number of site collections, MacroView DMF uses the same prompt-to-filter and server-side filtering logic as described above for sites, document libraries, folders and document sets. This filtering can be made even more efficient by using the server-side caching feature of MacroView DMF Hybrid and MacroView Message Hybrid.

The Filter Site Collections by Favorites command is another way in which MacroView DMF makes it quick and easy to navigate to the particular site collection that you want to work with.  This command shows only those site collections that contain one or more favorite nodes (where a favorite can be a site, document library, folder or document set).

Designs that use large numbers of site collections occur for good technical reasons; MacroView DMF makes such designs user-friendly because the site collections appear in a familiar and intuitive way as top-level nodes in the MacroView DMF display of the SharePoint tree structure.

Web Application with Personal Site Collections displayed in MacroView DMF Explorer. Note Colleagues’ sites shown as Read-only and libraries such as Style Library not displayed.

Managed Metadata – the Good News and the Bad News
First the good news: Managed metadata columns based on a Term Set are an excellent way to implement metadata in SharePoint, for the following reasons:

  • The term set can be managed centrally across all document libraries.
  • The term set can be hierarchical (i.e. tree structured), which reflects the inherent structure of a lot of classification schemes.
  • The term set can be Open, so that authorized users can easily add new values.
  • Managed metadata columns are indexed, which helps with efficiency.

But you should beware of pushing Managed metadata columns too far. An example of this is when you attempt to use a managed metadata hierarchy to handle a large Client/Project structure. The resulting Hierarchical Term Set has thousands of entries at the top level (to represent Clients), which causes issues with the display and navigation of the taxonomy.  It can also lead to List View Threshold issues as more and more documents are stored in a single large document library.

The optimal alternatives include having a site or library or document set for each Client or Project.  These approaches all work efficiently in MacroView DMF and have the further advantage of needing nodes in the tree corresponding to only active values (whereas a metadata navigation tree displays all possible values).

For more information

Contact MacroView Solutions.