Joanne C Klein

Structured Document Processing Model with Microsoft Syntex


As of March 2023, here are the types of custom Microsoft Syntex models you can build to classify documents in SharePoint (this post is about a structured document processing model):

Note: Microsoft’s pre-built models all use the Unstructured document processing teaching method.

Each Syntex model is suited to different types of formatted content and file types. This post does not cover those differences, since Microsoft has documented them well here: understand model differences.


I recently spent some time working with the Structured document processing model, which can accommodate table structures found in documents. As a SharePoint practitioner from a ways back, I was curious how this would show up inside a SharePoint site. Here are some of the obvious questions I had:

To see how it worked, I created an order form with some fields in the body of the document and some tabular information embedded within. Below is an example of the PDF that was used to train the model with the fields and table I wanted to extract circled in yellow:

I created 5 variations of this order form to train the model and added them into 1 collection since they all shared the same layout.

Microsoft has done a great job of explaining how to teach your AI Builder model to extract fields and table rows/columns from a document, so I won’t repeat their instructions. Link: Tag documents


What’s it look like in SharePoint?

During the publish process, you will be prompted to either create a new list or update an existing list with the table information extracted from the document. In my example, I created a new list. The list lives on the same SharePoint site as the library and links back to the library through the file ID property.
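To picture the library-to-list relationship, here is a minimal Python sketch of the idea, not of how Syntex implements it. The `OrderLine` class, the `Product` and `Quantity` column names, and the sample file IDs are all assumptions made up for illustration; only the "one list item per table row, linked by file ID" shape comes from what I saw.

```python
from dataclasses import dataclass


@dataclass
class OrderLine:
    """One list item: a single table row plus a link back to its source file."""
    file_id: int      # hypothetical name for the link-back column
    product: str      # hypothetical table column
    quantity: int     # hypothetical table column


def rows_to_list_items(file_id: int, table_rows: list[dict]) -> list[OrderLine]:
    """Turn the rows extracted from one document into separate list items."""
    return [
        OrderLine(file_id, row["Product"], int(row["Quantity"]))
        for row in table_rows
    ]


def items_for_file(items: list[OrderLine], file_id: int) -> list[OrderLine]:
    """Return only the list items that belong to one document."""
    return [item for item in items if item.file_id == file_id]


# Made-up extraction results, keyed by the file ID of each order form.
extracted = {
    101: [{"Product": "Widget", "Quantity": "2"},
          {"Product": "Gadget", "Quantity": "1"}],
    102: [{"Product": "Sprocket", "Quantity": "5"}],
}

# One shared list holds the rows from every document.
order_list = [
    item
    for file_id, rows in extracted.items()
    for item in rows_to_list_items(file_id, rows)
]
```

Calling `items_for_file(order_list, 101)` returns just the two rows that came from the first order form, which is the same relationship the file ID property provides between the library and the list.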

Continuing with the order example above, what does this look like in SharePoint?

I uploaded 6 order forms to the library:

Here’s what happens:

When I click the View orders link on the first document, it takes me to a filtered view of the associated list showing me only the items relating to that order:

Ah! Now it all makes sense. Using a structured document processing model is a dynamic way of retrieving 1:N rows of data out of a table and having them all stored as separate items in an associated SharePoint list.
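If you want to build a similar filtered link yourself, SharePoint list views accept `FilterField1` and `FilterValue1` query parameters in the URL. I have not confirmed that the View orders link is built this way, so treat the following Python sketch as one possible approach. The site URL, list name, and the `FileID` column name are placeholders, not values from my tenant.

```python
from urllib.parse import urlencode


def filtered_view_url(list_view_url: str, column: str, file_id: int) -> str:
    """Build a list-view URL that filters on a single column value."""
    query = urlencode({"FilterField1": column, "FilterValue1": str(file_id)})
    return f"{list_view_url}?{query}"


# Placeholder view URL and column name, for illustration only.
url = filtered_view_url(
    "https://contoso.sharepoint.com/sites/orders/Lists/OrderItems/AllItems.aspx",
    "FileID",
    12,
)
print(url)
```

The column passed as `FilterField1` needs to be the internal name of the column that stores the file ID in your list, which may differ from its display name.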

Other observations:

Thanks for reading!

-JCK
