The other day I was part of a conversation about storing a group of categorized documents in SharePoint. The gist of the question was whether it is best to create multiple document libraries (1 for each category), create 1 document library with multiple category folders within it, or use metadata rather than folders. This is an age-old conversation that has gone on through the years in the SharePoint world. 🙂
“Category” in this post means any high-level criterion or grouping you want to organize your documents by. Examples include project name, document type, etc.
As most people familiar with SharePoint will tell you, all of the options will work at times; however, one may be better suited than the others in any given situation. Your decision will ultimately be based on your unique business requirements and these six factors:
- Permissions
- Volume
- Retention
- Audience
- Requirement to change category
- Search
The 3 options we will consider while evaluating each factor are:
- Option 1: create 1 document library per category
- Option 2: create 1 document library with 1 folder for each category
- Option 3: use a required metadata column for each category
[UPDATE March 22, 2017] I neglected to mention document sets in this post. I usually describe these to users as “smart folders” since the folder can itself have metadata. It is an excellent choice for grouping documents together in a folder with shared metadata. Examples of this are contract files, project files, job competitions, etc.
Factor 1: Permissions
What are the security requirements for the content? If there are different permissions per category your options are:
- Option 1: Create 1 document library per category. Set the permission at the library level. Good choice.
- Option 2: Create 1 document library with multiple category folders within it. Break permission inheritance from the library level and set it at the folder level. Ok choice.
- Option 3: You cannot set different permissions for different metadata values, so this is not an option.
My opinion: Both options 1 and 2 will work; however, I lean toward setting permissions at the library level (the highest level) in SharePoint wherever possible. This is due to the negative performance hit whenever permission inheritance is broken, as well as the ease of permission administration.
Conclusion: Permissions alone will likely not dictate which option you choose. Your choice will often come down to other factors.
Factor 2: Volume
You should be aware of the list view threshold limit when working with lists and libraries in SharePoint. Know the rough estimate of the maximum number of documents that can be in any one category at any point in time. This will indicate how many documents you may want to display when a user is viewing the library. If you will exceed the threshold when displaying the documents this is a problem.
- Option 1: If the volume of documents in any category will exceed the limit, you can split them across separate libraries and also add a top-level folder structure to further break down the number of documents displayed at one time. This is a good choice.
- Option 2: If the volume of documents exceeds the threshold limit you may want to use a top-level folder structure for the category and an additional folder level under that to further break down the number of documents returned in any view.
- Option 3: If the volume of documents exceeds the threshold limit, then metadata alone is not sufficient. You may want to use some of the metadata values as folders instead in order to stay under the limit.
My opinion: If a category exceeds a couple of thousand documents, you will likely want at least a top-level folder structure to group the documents. End users will receive a threshold error if any view they use exceeds 5000 items (SharePoint Server 2016 can go beyond this limit). You will run into this, for example, if you try to view documents without folders and the number of items returned exceeds 5000. I would carefully design your library around the limit and also ensure indexes are created on the metadata columns you will be filtering your views on.
Note that in SharePoint Server 2016 and SharePoint Online, an auto-indexing feature will automatically create an index on a column once the list/library reaches 2500 items and that column is used to filter a view.
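As a rough planning aid, the check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the document counts are hypothetical, and nothing here calls SharePoint. It simply flags any folder (or folderless library) whose estimated item count would exceed the 5000-item threshold.

```python
# Hypothetical planning helper; this is not a SharePoint API.
LIST_VIEW_THRESHOLD = 5000  # per-view item limit discussed in this post


def folders_over_threshold(folder_counts, threshold=LIST_VIEW_THRESHOLD):
    """Return the names of folders whose item count alone exceeds the threshold."""
    return sorted(name for name, count in folder_counts.items() if count > threshold)


# Illustrative estimates only.
estimated = {"Contracts": 7200, "Invoices": 4100, "Proposals": 950}
print(folders_over_threshold(estimated))  # ['Contracts']
```

Any folder that shows up in the result is a candidate for an additional folder level or a separate library.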
For large libraries, it is usually not feasible for users to navigate the library to find their document particularly if you have a lot of folders. Instead, encourage users to leverage search. A good strategy is to have some metadata defined in the library and use them as refiners on a search page in the Enterprise Search Center.
You may also want to consider using the Content Organizer feature in SharePoint for large libraries. This can automatically create folders based on metadata and route documents into them. It can be configured to automatically create a new folder after, for example, every 2000 documents to keep performance in check. Be aware there are limitations to this feature, particularly for required metadata on a library: it will leave documents checked out if you submit multiple documents at a time to the drop-off library.
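To get a feel for what a rollover of 2000 documents per folder means for a given volume, here is a minimal arithmetic sketch. The function names are hypothetical and the code only does the math; it does not configure or mimic the Content Organizer itself.

```python
import math

# Hypothetical planning arithmetic; this does not call or configure SharePoint.
FOLDER_SIZE = 2000  # example rollover size used in this post


def folders_needed(total_documents, folder_size=FOLDER_SIZE):
    """Number of folders required when a new one starts every folder_size documents."""
    return math.ceil(total_documents / folder_size)


def folder_index(document_number, folder_size=FOLDER_SIZE):
    """1-based folder that the Nth document (1-based) would land in."""
    return (document_number - 1) // folder_size + 1


print(folders_needed(7200))  # 4
print(folder_index(2000))    # 1
print(folder_index(2001))    # 2
```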
Conclusion: Have a good handle on the expected volume of documents in your library. Based on this you may want to create separate libraries in order to spread out the volume and/or create a simple folder structure within it.
Factor 3: Retention
You can set retention at numerous levels (site, content type, library, folder).
- Option 1: You can set retention at the library level. All content within the library will fall under the same retention rules.
- Option 2: You can set retention at a specific folder level. All content within the folder will then fall under the same retention rules.
- Option 3: You cannot set retention based on metadata directly. Out-of-the-box a date field is required to set up a retention rule for either document, folder, or content type retention. You can use something other than date with custom code.
My opinion: One important limitation of folder-based retention is that if you allow end users to create folders, you will have to set retention for each new folder after it is created (it will default to the document library retention if you don’t). If you require more granular control over different retention options, you should use content types in your document library and set retention based on those. This allows the same retention to be applied across all libraries and folders using that content type, which is definitely a more consistent, maintainable approach.
Conclusion: It really depends on the complexity of your retention requirements. Wherever possible I prefer to set retention at the content type level. This allows you to have different types of content across a site collection/site/library/folder, each with different retention options. If you have simpler retention requirements (e.g. everything in the document library/folder has a retention of 10 years), then setting it at the library/folder level is an option and is sometimes sufficient.
Factor 4: Audience
In my opinion, this is likely one of the most important factors to consider.
Does this library have a few contributors and many readers? If this is the case, then you should only use folders for the benefit of the contributors. Let them come up with a simple folder structure (2 levels at most) they can all agree on and additionally tag the content with metadata. This will allow the flexibility of building search pages with metadata-based refiners. The readers, who are the majority, will then use the Enterprise Search Center to find their content.
Does this library have a lot of contributors and very few readers? If this is the case, you will want to be very careful with the use of folders, as there are likely as many ways to structure them as there are users contributing to the content. At times, folders are a good choice for a top-level organization strategy within a list or library. You may want to consider setting up folders ahead of time and then turning off folder creation at the library level in order to eliminate abuse of folders in a library (although you will likely get kick-back from the users on this). If you choose to use metadata, spend the time to decide how users will want to filter/find their content and only define the absolute minimum metadata required to accomplish this.
Conclusion: Too much of either folders or metadata can be a bad design decision. I have seen very effective solutions using a judicious combination of both. This sets you up for an effective search experience while still providing helpful views for contributors. This is where you may also want to use the column default values feature to automatically set metadata in a folder based on the folder name.
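To picture what folder-based default values give you, here is a small Python sketch of the outcome. It is an illustration only: the folder names, the "Category" column and the function are hypothetical, and the code does not show how the feature is configured in SharePoint. It just shows a document picking up defaults from the folder it is saved in, with anything the contributor enters explicitly taking precedence.

```python
# Hypothetical folder name -> default column values; names are illustrative only.
FOLDER_DEFAULTS = {
    "Contracts": {"Category": "Contract"},
    "Proposals": {"Category": "Proposal"},
}


def apply_folder_defaults(folder_name, metadata, defaults=FOLDER_DEFAULTS):
    """Fill in missing columns from the folder's defaults; explicit values win."""
    merged = dict(defaults.get(folder_name, {}))
    merged.update(metadata)
    return merged


print(apply_folder_defaults("Contracts", {"Title": "NDA"}))
# {'Category': 'Contract', 'Title': 'NDA'}
```

Contributors keep working in familiar folders, while the resulting metadata is what the search refiners can use.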
Factor 5: Requirement to Move Content from 1 Category to another
Often there is a requirement to move content from 1 category to another and there are currently several ways to do this. I have written a blog post describing 10 ways to move content from 1 library to another.
- Option 1: Moving between document libraries will cause you to lose version history. It is also difficult to do because the capability to do this through the UI is (currently) not there.
- Option 2: Moving between folders in a document library can be done through the SharePoint UI and version history is maintained. The new ‘Move to’ functionality in the modern document library view allows you to easily move from 1 folder to another. This is super useful.
- Option 3: Changing the metadata value is simple at any time.
My opinion: One big drawback of moving from one library to another is the loss of version history. If retaining version history is a requirement, then you should not store your content in separate libraries. The new ‘Move to’ capability introduced in the modern document library allows you to move between folders within a library but not between libraries. Hopefully this capability will come in the future, although it still may not resolve the version history issue.
Conclusion: If it is a requirement to retain version history then you should stay within one library for all of your categories and use either folders or metadata.
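Two of the factors so far act as hard constraints rather than preferences: per-category permissions (Factor 1) rule out metadata alone, and retaining version history when content changes category (Factor 5) rules out separate libraries. A minimal sketch of that elimination logic follows. The function and parameter names are hypothetical, and it deliberately ignores the softer factors (volume, retention, audience, search).

```python
# Hypothetical decision aid that encodes only the two hard constraints from this post.
OPTIONS = {
    1: "1 document library per category",
    2: "1 document library with 1 folder per category",
    3: "a required metadata column for the category",
}


def viable_options(per_category_permissions, keep_version_history_on_move):
    """Return the option numbers that survive the two hard constraints."""
    viable = set(OPTIONS)
    if per_category_permissions:
        viable.discard(3)  # Factor 1: permissions cannot differ by metadata value
    if keep_version_history_on_move:
        viable.discard(1)  # Factor 5: moving between libraries loses version history
    return sorted(viable)


print(viable_options(True, True))    # [2]
print(viable_options(False, False))  # [1, 2, 3]
```

When both constraints apply, folders within a single library are the only option left standing; otherwise the remaining factors decide.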
Factor 6: Search
Allowing users to use search to find their document is a value-add, particularly if there are a lot of documents in the library.
- Option 1: If documents are housed across multiple libraries, users will have to know which document library the document is in to find it, so it’s best for them to use the Enterprise Search Center instead.
- Option 2: If your documents are housed in multiple folders in one library, users will have the option of using the search box at the top of the library level to find it or going to the Enterprise Search Center to search from there.
- Option 3: If you are using metadata in a library, users will have the option of using the search box at the top of the library level or going to the Enterprise Search Center to search from there. Metadata is key for a great search experience!
Users can also use the Delve app in O365 to get back to recent documents they’ve worked on and to search for documents. It is a great productivity tool; however, it does not replace the need for a customized Enterprise Search Center in your environment with targeted content, customized refiners and display templates.
Conclusion: Users should use the Enterprise Search Center (or Delve) if they don’t know where their document is located. This means you should spend some time tagging your content with metadata to enhance the Enterprise Search experience. Conversely, if a user knows the library a document is located in, they can navigate directly to it and use the search box at the top of the library.
Like most things in SharePoint, there is no clear-cut answer to the question of whether you should choose multiple document libraries, 1 document library with multiple folders, metadata, or a combination of all three. Hopefully, if you consider the factors discussed in this post, you will be armed with the knowledge to make the right decision for your specific business scenario.
Thanks for reading.