Searching for unlabeled records in SharePoint


[Update January 2024] Thanks to a reader of my blog, the query I was using to find unlabeled items has been greatly simplified. The post has been updated to reflect the new query. Thank you Joel!!

Customers ask the best questions. Just last week, a customer asked if there was a way to show content on their site that DIDN’T have a retention label on it.

Sounds simple enough until you dig into the details.

Let’s not go happy path on the answer… let’s talk real world. Consider a site with multiple libraries, some with several levels of folders, where you haven’t set a default retention label, your auto-apply label policies don’t cover every edge case, you may be relying on end-users to apply a retention label manually, end-users have removed a non-record retention label you applied, or you have some combination of these.

As pointed out by a reader in the comments, if the SharePoint site had 1 library and you only wanted to view unlabeled content in 1 library, you could certainly create a library view filtering on the documents with a blank retention label and displaying all items without folders.

At some point, the roles in your organization involved with managing records (records managers, information stewards, etc.) will want to understand where retention labels are being applied and perhaps more importantly, where they’re not. Building a modern page on the site with some search web parts seems like an integrated and scalable way of presenting the information to them so they can take further action when required.

Content Explorer in the Data Classification feature will certainly show you where retention labels have been applied to content across all locations; however, it won’t show you where they haven’t been applied. The key benefit of Content Explorer is it extends beyond SharePoint and includes Exchange and OneDrive as well.

Building a search solution like the one in this post, showing where retention labels have and have not been applied, is a good approach in SharePoint because it addresses some limitations of Content Explorer:

  • it can take up to 2 days for content to show in Content Explorer; however, Search results show up much more quickly (~15 minutes)
  • an elevated level of permission is required to access the Content Explorer feature; however, for a search solution, records managers/information stewards only need access to the site(s) where the business records are stored and where the modern page is built

I’m using the PnP Modern Search web parts for this solution. I’m not going to go over how to install and configure them in this post. Refer to this link for a great explanation of that: Introduction – PnP Modern Search (v4) (microsoft-search.github.io)

The salient point of this post is defining the query to identify unlabeled documents. The PnP Search web parts are just one way of showing the results to the end-user.

I wanted to use Microsoft Search for the solution simply because of scale and the built-in permission model it provides. I struggled for a while trying to build the query for 2 reasons:

  1. I thought the ComplianceTag managed property (the property that holds the retention label name value) would be empty for unlabeled items and therefore not searchable in a query.
  2. I wanted to build the query in such a way that as new retention labels were added over time, the query would continue to work without updates. With the help of Twitter, Andrew Jolly provided me the query to detect retention labels in a scalable way. Thanks Andrew!

Note: this post assumes there are roles responsible for managing the business records on SharePoint site(s) that have an interest in knowing this information. This is typically the information steward role. The information steward could be in charge of a Hub of SharePoint sites (Finance Hub for example). This search solution will work for 1 site, many sites hubbed together, or even SharePoint tenant wide simply because of the scalability of Search.

For this post, I’ll build a modern page of “Unlabeled documents” on 1 SharePoint site. Let’s dig in.


Start by building a new Result Source on your SharePoint site by going to Site settings > Site Administration > Result source.

A result source is one way (there are others) of filtering the search results so you only get back what you’re interested in. For this post, that’s any file without a retention label applied. The most important part of a result source is the query you provide and test in the Query Builder. Using a lone asterisk as the retention label condition (compliancetag:*) and negating it with NOT is what allows this query to accommodate any new retention labels that may be added to the system in the future.

Below is the query I used to get back all file types listed (you would need to include all that you need) that do NOT have a retention label applied:
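The simplified query follows the shape Joel shares in the comments below: a list of file types, NOT (compliancetag:*), and a site scope. As a minimal sketch, a small Python helper can assemble that query text for whatever file types you need. The function name, its parameters, and the sample file-type list are hypothetical and only for illustration; they are not part of SharePoint or PnP, and the resulting text should still be tested in the Query Builder.

```python
def build_unlabeled_query(file_types, scope="SPSiteURL={SiteCollection.URL}*"):
    """Return KQL text matching files of the given types with no retention label.

    file_types: extensions such as "docx" or ".pdf" (hypothetical helper input).
    scope: extra KQL restriction; the default mirrors the site-collection scope
           used in the simplified query. Pass "" for no scope restriction.
    """
    if not file_types:
        raise ValueError("at least one file type is required")

    # One filetype=... condition per extension, OR-ed together.
    type_clause = " OR ".join(
        f"filetype={ext.lstrip('.').lower()}" for ext in file_types
    )

    # compliancetag:* matches any label value, so NOT (...) keeps unlabeled items.
    parts = [f"({type_clause})", "NOT (compliancetag:*)"]
    if scope:
        parts.append(scope)
    return " ".join(parts)


query = build_unlabeled_query(["docx", "xlsx", "pptx", "pdf"])
print(query)
```

Because the label condition is a wildcard rather than a list of label names, the only part of the query you would ever need to maintain is the file-type list.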

Below is the Query builder in the Result source configuration where you need to test the query:

 

Once you have the result source, you can start building your modern page with the PnP search web parts on it. Here are the PnP search web parts available:

I’ll start by adding the Search Results web part to a modern page and providing it the GUID of the ‘Unlabeled Documents’ result source built above:

There are many settings you can use to customize the page and your results for your specific requirements. Add some filters to the page with the Search Filters web part to help describe the document such as who last modified it and a link to the library it’s stored in for further follow-up. Since search is permission-trimmed, granting your information stewards access to the site where the content is stored will allow them to see all search results.
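The web parts are only one way to surface these results. If you wanted to pull the same list programmatically (for a report, say), the SharePoint Search REST endpoint (/_api/search/query) accepts the query text directly. The sketch below only builds the request URL; it does not send it, and authentication is out of scope. The site URL, the helper name, and the selected properties are assumptions for illustration. The site URL is written out literally instead of relying on the {SiteCollection.URL} query variable.

```python
from urllib.parse import quote


def search_request_url(site_url, kql, select=("Title", "Path", "ModifiedBy"), row_limit=50):
    """Build a SharePoint Search REST GET URL for the given KQL text (hypothetical helper)."""
    site = site_url.rstrip("/")
    # Inside the single-quoted REST parameter, a literal single quote is doubled.
    escaped = kql.replace("'", "''")
    return (
        f"{site}/_api/search/query"
        f"?querytext='{quote(escaped, safe='')}'"
        f"&selectproperties='{quote(','.join(select), safe='')}'"
        f"&rowlimit={int(row_limit)}"
    )


# Assumed example site; replace with the site that holds your business records.
site = "https://contoso.sharepoint.com/sites/finance"
kql = f"(filetype=docx OR filetype=pdf) NOT (compliancetag:*) SPSiteURL={site}*"
url = search_request_url(site, kql)
print(url)
# Send this with an authenticated GET and an Accept header such as
# "application/json;odata=nometadata" to receive JSON results.
```

Results returned this way are permission-trimmed in the same manner as the page, so the caller only sees items they already have access to.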

For example, here is what my page looks like:


Closing thoughts

Records managers and information stewards will be interested in this type of information since they ultimately have the business unit/corporate responsibility for ensuring records are being managed appropriately.

This search page certainly does not replace any automation you may have in place for applying retention labels, which is still a scalable and recommended way of applying them; however, there will always be use cases for seeing what isn’t being labeled, in case something hasn’t been detected and has fallen through the cracks.

Thanks for reading.

JCK

4 comments

  1. Hi Joanne,

    Great post. Another quick solution is to create a Library View, of course this works for a single library but if it fits the scenario why not? You will need these settings:
    New View, filter: Retention label is equal to (leave field empty) and content type=equal to Document (or use the default document content type you use in your organisation).
    In the folder section: select Show all items without folders
    In the item limit: select 100 for example.

    Regards,
    Juan

    1. Hi Juan, I 100% agree the library view option is good if you’re working within one library. I’ve done this same thing. I’ll call it out in the post for more clarity.
      Thanks for sharing!
      -Joanne

  2. Hi Joanne,

    Thanks for this post, exactly what I was looking for 🙂
    I did some testing on the search query to find items without a label and I think your query can be simplified to avoid searching for all the alphabet letters and instead just use * :

    (filetype=docx OR filetype=xlsx OR filetype=pptx OR filetype=pdf)
    NOT (compliancetag:*)
    SPSiteURL={SiteCollection.URL}*

    Perhaps this is a new query capability, don’t know for how long this is supported but seems to be working fine at my end

    1. Hi Joel! Thank you for sharing – I can’t recall if I tested that, but I feel like I would have. I just now tried it on a few of my sites too and happy to say it worked!! That is a much more elegant query – I’ll update the post to reflect. Much appreciated!!
