[This article is part one of a two-part series.]

 

Recently, we had a situation where a customer was leveraging a SharePoint Document Library to store their utility bills. This particular customer does a great job of identifying critical metadata on their libraries, which lets them filter and sort the information and glean insights about the data without having to open and review each bill.

 

It's a great use case, but they spent a great deal of time each month uploading the bills and then, one by one, opening them and filling in the relevant metadata based on the data contained in the statement.


We talked through how they might streamline this process, and it quickly became apparent that AI Builder's Forms Processing was a great fit.

 

Why Forms Processing?

Forms Processing is a machine learning-based tool that enables the creation and training of a model to extract information from a form automatically. In our use case above, the scenario is pretty straightforward: upload the bill and identify the relevant information from various locations on the statement.


From there, we can leverage Power Automate to push that data over to the SharePoint Document Library.

 

In this two-part blog article, we'll explore how to get Forms Processing working and how we can take the outputs of the processing and append them to a Document Library.

 

Getting Started

 

First, you need the correct license in place. As licensing models can change over time, refer to Microsoft's AI Builder licensing page for the most up-to-date details.

 

Once you have the proper license in place, you can start building your model.

 

Setting Up the Model

To start, navigate to make.powerapps.com and select AI Builder on the left, then tap Build.

 

 

 

 

Select Forms Processing to get started. It will launch a start-up wizard asking you to name your model. Note that you'll need a minimum of five (5) documents with the same layout to train your model.

 

Next, it asks us to list the pieces of information to extract. We're given a blank slate and asked to create the relevant fields. At the time of writing, the following options appear:


Field (examples: date, name, category)

Single Page Table (example: a table on an invoice/bill that displays line items)

Checkbox (preview) (example: a yes/no box)
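
Before opening the wizard, it can help to write the field list down first. The Python sketch below is only a planning aid and does not call AI Builder in any way. The three field kinds come from the list above, while the field names (Billing Date, Account Number, and so on) are hypothetical examples for a utility bill.

```python
from dataclasses import dataclass
from enum import Enum


class FieldKind(Enum):
    # The three options the wizard offered at the time of writing.
    FIELD = "Field"
    SINGLE_PAGE_TABLE = "Single Page Table"
    CHECKBOX = "Checkbox (preview)"


@dataclass(frozen=True)
class PlannedField:
    name: str
    kind: FieldKind


def find_duplicate_names(fields):
    """Return names that appear more than once, compared case-insensitively."""
    seen = set()
    duplicates = []
    for planned in fields:
        key = planned.name.strip().lower()
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates


# Hypothetical field plan for a utility bill.
utility_bill_fields = [
    PlannedField("Billing Date", FieldKind.FIELD),
    PlannedField("Account Number", FieldKind.FIELD),
    PlannedField("Amount Due", FieldKind.FIELD),
    PlannedField("Line Items", FieldKind.SINGLE_PAGE_TABLE),
    PlannedField("Paperless Billing", FieldKind.CHECKBOX),
]
```

Running find_duplicate_names(utility_bill_fields) on the plan catches accidental repeats before you start typing the fields into the wizard.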

 

 

 

 

Create Collections

Collections are used to group documents of the same structure together. For example, a hydro bill will come in the same format every month, so all hydro bills form one collection. You might also have a water bill and a gas bill, and each requires its own collection.

 

Next, we upload the initial set of documents. Remember, you need at least five, but as with all machine learning, the more examples you train the model on, the better. That said, in our experience, five seems to do the trick just fine.
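
If the sample bills are already sitting in local folders, a quick check like the following can confirm that each collection has enough documents before you upload. This is a minimal sketch with two assumptions: one subfolder per bill layout (a hypothetical convention, not something AI Builder requires), and the five-document minimum applied to each layout.

```python
from pathlib import Path

MIN_DOCUMENTS = 5  # minimum stated in the article; more samples generally help


def group_by_collection(root):
    """Map each subfolder name (one per bill layout) to its sorted file names."""
    root = Path(root)
    return {
        folder.name: sorted(p.name for p in folder.iterdir() if p.is_file())
        for folder in sorted(root.iterdir())
        if folder.is_dir()
    }


def collections_needing_more(collections, minimum=MIN_DOCUMENTS):
    """Return {collection name: how many more documents it still needs}."""
    return {
        name: minimum - len(files)
        for name, files in collections.items()
        if len(files) < minimum
    }
```

For example, collections_needing_more(group_by_collection("bills")) would list any layout folder under a hypothetical bills directory that still falls short of five samples.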

 

 

 

 

Once you upload your documents, map the fields you created earlier to the corresponding data on the forms you've uploaded. This mapping teaches the model where to find the data it needs and is done by "drawing" a box around the data and then selecting the corresponding field. Repeat the mapping for each of the documents you uploaded.


 

 

Finally, click the Train button to start the training of the model. Here the system will examine the mapping you've provided and learn where to find the data from each document.


Depending on the number of fields and the number of files you've uploaded, it can take several minutes.


 

 

 

 

Publish your Model

After training the model, it's time to publish it; the model won't be live until you do. AI Builder also provides the option to perform a quick test to see the model in action. You can upload one of the documents and watch the model identify the fields in real time so you can validate that it's grabbing the right ones.
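
The quick test itself happens in the AI Builder screen, but if you copy down what the model extracted, you can compare it against the values you would have keyed in by hand. The sketch below assumes you have both sets of values as plain dictionaries; the field names and amounts are hypothetical.

```python
def normalize(value):
    """Collapse whitespace and ignore case so cosmetic differences don't count."""
    return " ".join(str(value).split()).casefold()


def find_mismatches(expected, extracted):
    """Return {field: (expected, extracted)} for fields that differ or are missing."""
    mismatches = {}
    for field, want in expected.items():
        got = extracted.get(field)
        if got is None or normalize(got) != normalize(want):
            mismatches[field] = (want, got)
    return mismatches


# Hypothetical values: what we read off the bill vs. what the model returned.
expected = {"Account Number": "12345", "Amount Due": "$84.10"}
extracted = {"Account Number": " 12345 ", "Amount Due": "$48.10"}
print(find_mismatches(expected, extracted))
```

An empty result means every field you checked matched; anything else points to a field worth re-mapping before you publish.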

 

 

 

 

And that's it! We've just trained a machine learning model to process our form, identify the correct data, and grab hold of that information.

 

In part two of our article, we'll take the information and append it to SharePoint Document Library columns as metadata. If successful, this approach will save us time each month that we can spend on other aspects of our job.


Looking for more guided articles? Check out more from our C5 Insight blog!