Preface
This is a joint post written with @AndrewJolly – a professional acquaintance of mine who shares my interest in Microsoft 365 Purview compliance and SharePoint. The importance of metadata in SharePoint is not a new concept; however, it becomes critical when being used to drive retention… the focus of this post. If you’re using metadata as a condition for an auto-apply retention label policy and the metadata isn’t filled in, the auto-apply simply won’t work. Although you can make the column required in the column settings, this still won’t guarantee the column will be filled in. Andrew has come up with a way of showing values that have not been filled in. This could be used by compliance roles (records managers, information stewards, etc.) in your organization to have oversight on content they’re managing.
In one of Joanne’s previous posts, a means of locating all items without a retention label was outlined using a search query (Link: Searching for unlabeled records in SharePoint). This is a great example of a means of producing a report of unlabeled content which isn’t provided as a standard report in Microsoft Purview.
These types of supervisory reports are great tools to have for Records and Information managers to stay informed on how staff are doing with assigning metadata on documents. The following post looks at using search queries once again to provide another report that can assist in supporting records management practices within the Microsoft 365 environment.
When using auto apply label policies one common way that policies are built up is using site columns, in particular managed metadata-based columns.
Users assign the appropriate metadata either via manual selection or where they set the default column value in the library to ensure that any documents added to the library get classified from the outset.
If we’re relying on these metadata selections to drive auto apply of retention labels, then it stands to reason that we would want to be able to locate all items without an assigned value for a column which is instrumental for our auto apply policies to work.
Surely the answer is search, which it is, but it’s not your typical query. First, we need to take a moment to understand how metadata is applied to documents.
If a document has not been assigned a value for a particular managed metadata-based column, then the item’s entry within the search index does not contain any information about that field. Therefore, it is not possible to ask search to return a list of items where the value for a given managed metadata based column is null.
Diving into the search results using the fabulous ‘Search Query Tool’ https://github.com/pnp/PnP-Tools/tree/master/Solutions/SharePoint.Search.QueryTool we see another way to achieve what we need. One of the standard properties of a search result item is ‘owsmetadatafacetinfo’ which is a standard retrievable property that maintains a data structure containing information about all terms on an item across all metadata fields.
For those interested, the managed property stores a string made up of a concatenation of any managed metadata-based columns and their values that have been assigned to the item. Each reference to a managed metadata column can be seen in the image above beginning with {SiteColumnInternalName}|{TermSetName} , the other attributes are references to the term store/term set ending with the ID of the term and the term value.
If an item hasn’t been assigned a managed metadata value for a given column, then that column doesn’t appear in the ‘owsmetadatafacetinfo’ property.
This means we’re able to construct a search query to provide all items where a certain managed metadata column isn’t present in the ‘owsmetadatafacetinfo’ property.
Such a query would look like this:
ContentTypeId:’0x0101*’ -owsmetadatafacetinfo:’RecordsClassification’ -FileType:aspx
Breaking the query down:
ContentTypeId:’0x0101*’ – This part of the query asks for Items that are of the ‘Document’ content type, we use the content type Id property with a wildcard asterisk to include all child content types as well
-owsmetadatafacetinfo:’RecordsClassification|’ – This part of the query asks to return items that do not contain the managed metadata column ‘RecordsClassification’ in the owsmetadatafacetinfo property (note the ‘-‘ operator at the start of the search term to indicate “not”)
-FileType:aspx – This part of the query asks not to include items if the file type .aspx which omits pages or item view pages etc.
An Example Walk-thru
To demonstrate Andrew’s query with a real example, in this document library, a Fiscal Year managed metadata column has been added and filled in for all but 4 of the documents:
In this tenant, the Fiscal Year column will be used to trigger retention to start at the end of each Fiscal year (using an event-based retention label), so it’s important for the column to always be filled in. By following the query syntax illustrated in the previous table, we can quickly see which documents have no Fiscal Year value.
Save the search in Content Search
This search query can also be set up as a saved ‘Content Search’ within Microsoft Purview and can be used by records managers to gain insight into how metadata is being applied or not applied consistently. Below is an example of the same KQL query targeting a SharePoint site inside Content Search which highlights the same 4 documents:
Closing thoughts
This is another tool in a records and information manager’s toolbelt to help ensure records are being managed appropriately. Managed metadata columns are a useful column type in SharePoint because of their tenant-wide scalability, their unique ability to implement your corporate-wide taxonomy (an important aspect of both information architecture and records management), and their ability to be used in retention auto-apply label policies… the main reason behind this post.
We hope you found this helpful.
-Andrew Jolly and Joanne Klein