Accuracy reporting process

This guide outlines how administrators can use the accuracy reporting process to determine the accuracy that is returned by Decipher IDP for particular document types. The generated report is designed to show how well the document training is performing, and where any improvements can be made.

Until this feature is added to the core Decipher IDP product, SS&C | Blue Prism does not officially support this functionality. Every effort has been made to test these processes fully; however, there may be scenarios we cannot foresee. We therefore recommend that customers exercise due caution and test these processes fully before integrating them into their workflow.

Prerequisites

  • Decipher IDP 1.2 or later.

  • SS&C | Blue Prism 6.6 or later.

  • The Decipher Accuracy Reporting Process package downloaded from the SS&C | Blue Prism Portal.

  • A separate reporting database enabled when installing Decipher Server. For more information, see Install Decipher Server.

  • Credentials with read-only access to the Decipher Server and reporting database.

Package contents

  • Decipher Accuracy Reporting Process – A Blue Prism release containing:

    • Process: Decipher Accuracy Reporting

    • Object: Data - SQL Server

    • Object: Utility - Collection Manipulation

    • Object: Utility - File Management

    • Object: MS Excel VBO

    • Object: MS Excel VBO - Paste Fast

    • Credential: Blue Prism Reporting

  • Decipher Accuracy Report - Template.xlsm. This Excel template contains a macro that the process runs to copy out formulas and format the report. Ensure macros are enabled for the user running the process.

Setup

  1. Save the Decipher Accuracy Report - Template.xlsm into a folder accessible by the user or resource running the reporting process.

  2. Import the release file into SS&C | Blue Prism:

    1. Log into SS&C | Blue Prism and click File > Import > Release / Skill.

    2. Select Browse..., and choose the Decipher Accuracy Reporting Process release downloaded from the portal.

      The processes are added to the Processes\Default folder in Studio. There are no changes to the standard SS&C | Blue Prism utility objects, so these can be skipped during import if they’re already present in the environment.

Process

To run the Accuracy Reporting process:

  1. Log into SS&C | Blue Prism, and open the Decipher Accuracy Reporting process.

  2. On the Main Page, configure the following data items:

    • Batch Type Name – The name of the batch type as it appears in Decipher IDP. For example, Invoice. If this is not populated, you must provide a Document Type Name.

    • Document Type Name – The name of the document type as it appears in Decipher IDP. For example, Invoice. If this is not populated, you must provide a Batch Type Name.

    • Earliest Batch DateTime – The report will only contain batches that were processed after this time.

    • Latest Batch DateTime – The report will only contain batches that were processed before this time. This item is optional. If left blank, it is automatically populated with the current date and time.

    • Report Output Folder – The save destination for the newly created report. For example, C:\Users\Trainee\Documents\Output.

    • Report Template Path – The full path to the Decipher Accuracy Report - Template file, including the file name and extension. For example, C:\Users\Trainee\Documents\Decipher Accuracy Report - Template.xlsm.

    The above inputs, excluding Report Template Path, can also be configured as startup parameters from Control Room.

  3. On the Globals page, configure the following data items:

    • Database Credential Name – The name of the credentials used to connect to the database. Ensure the credentials can be accessed by the relevant processes and machines.

    • Server Address – The SQL server name containing the Decipher database. For example, .\SQLEXPRESS. This may also be the server's IP address. Check with your database administrator that no firewall rules will block the connection.

    • Decipher Reporting Database Name – The name of the Decipher reporting database. For example, DecipherServerDbReporting.

    • Decipher Main Database Name – The name of the Decipher Server database. For example, DecipherServerDb.

  4. Run the process.

The accuracy report is generated and saved to the configured output folder.
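The input rules from step 2 can be sketched as a small pre-flight check. This is an illustrative Python sketch only and is not part of the package: `ReportInputs` and `validate_inputs` are hypothetical names, and the checks on the date window and the .xlsm extension are assumptions inferred from the descriptions above rather than documented behavior of the process.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ReportInputs:
    """Hypothetical container mirroring the Main Page data items."""
    earliest_batch: datetime
    report_output_folder: str
    report_template_path: str
    batch_type_name: str = ""
    document_type_name: str = ""
    latest_batch: Optional[datetime] = None


def validate_inputs(inputs: ReportInputs, now: Optional[datetime] = None) -> ReportInputs:
    """Apply the input rules described in step 2 and return the completed inputs."""
    # At least one of Batch Type Name or Document Type Name must be populated.
    if not inputs.batch_type_name.strip() and not inputs.document_type_name.strip():
        raise ValueError("Provide a Batch Type Name or a Document Type Name.")
    # A blank Latest Batch DateTime defaults to the current date and time.
    if inputs.latest_batch is None:
        inputs.latest_batch = now or datetime.now()
    # Assumption: an empty date window is treated as a configuration error.
    if inputs.earliest_batch >= inputs.latest_batch:
        raise ValueError("Earliest Batch DateTime must precede Latest Batch DateTime.")
    # Assumption: the template path must end with the template's .xlsm extension.
    if not inputs.report_template_path.lower().endswith(".xlsm"):
        raise ValueError("Report Template Path must include the .xlsm file extension.")
    return inputs
```

Passing `now` explicitly keeps the check repeatable. When it is omitted, the sketch falls back to the system clock, which matches the described default for Latest Batch DateTime.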

Report

The report is split into the following four worksheets, each outlining the capture accuracy in a different area, enabling you to fully analyze the performance of the Decipher IDP document processing:

BatchSummary

The BatchSummary page contains all batches included in the report, listed alphabetically, with the processing date, batch ID, document count, and the percentage of fields that were captured correctly or incorrectly. The field capture data is organized into the following categories:

All Fields Correctly Captured

These documents would be fully automated. All fields were automatically identified and validated with high confidence. No action is required.

Captured - Low Confidence

This result is similar to All Fields Correctly Captured. However, Decipher IDP encountered a confidence issue, either in the OCR extraction or when assigning the region. This could be due to image quality, characters adjacent to the text, or additional training needs.

Field Not Captured - Manually Added

A field has been missed by Decipher IDP and a user has manually assigned a value. This mostly occurs before any training data is available for the document layout.

Incorrectly Captured - Manually Corrected

A result was returned but was either completely or partly incorrect. Completely incorrect is where the user selected or created an entirely new region. Partly incorrect is where Decipher IDP selected part of the correct text, but the region was resized to either include or exclude additional values.

This page gives you a high-level overview of your progress. The earliest batch is expected to show no documents in the high confidence (green) category because there is no training data yet. As more batches are processed in line with your training plan, you should see improvements in performance and an increase in data in the amber and green columns. For more information on implementing and creating a training plan, see Decipher IDP best practices.

It is useful to name your batches in a logical and recognizable way so you can efficiently track the document training progress. The batch name can be configured on upload, or you can amend it in the Batches tab in the Admin Panel. For more information, see Batches.

In addition, this page can be helpful to:

  • Identify batches containing documents that would require manual intervention during processing, and prioritize reviewing these documents. Batches with a high number of documents in the Incorrectly Captured - Manually Corrected (red) category could negatively impact future Decipher IDP processing.

  • Easily see which batches have achieved the targeted accuracy rate, and move these documents from development into a production or UAT environment.
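If the BatchSummary rows are exported for further analysis, both uses above can be sketched in a few lines. The sketch below is illustrative only: the `BatchRow` fields, the `triage_batches` function, the 90% target, and the 25% review threshold are all assumptions and are not values defined by the report.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BatchRow:
    """Hypothetical view of one BatchSummary row (percentages as 0-100)."""
    batch_name: str
    pct_all_correct: float         # All Fields Correctly Captured (green)
    pct_manually_corrected: float  # Incorrectly Captured - Manually Corrected (red)


def triage_batches(
    rows: List[BatchRow],
    target_pct: float = 90.0,   # assumed target accuracy rate
    review_pct: float = 25.0,   # assumed threshold for prioritizing review
) -> Tuple[List[str], List[str]]:
    """Return (batches at or above target, batches to review first)."""
    ready = [r.batch_name for r in rows if r.pct_all_correct >= target_pct]
    # Review the batches with the largest red share first.
    needs_review = sorted(
        (r for r in rows if r.pct_manually_corrected >= review_pct),
        key=lambda r: r.pct_manually_corrected,
        reverse=True,
    )
    return ready, [r.batch_name for r in needs_review]
```

For example, with three batches at 10%, 55%, and 92% green and 60%, 30%, and 3% red, only the third batch meets the assumed target, and the first two are returned for review in that order.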

DocumentsByBatch

The DocumentsByBatch page provides a breakdown of all documents in a batch, with field-level analysis for each document. Use the drop-down at the top of the page to select from the batches in the report. The page is then populated with accuracy data for each document. The table shows up to 200 documents, but it is recommended that batches contain fewer documents to maximize system performance and focus development analysis.

If the obfuscateFileName input flag is enabled in the Decipher VBO, the document names will be obfuscated in the report. For more information, see Using the Decipher VBO.

The accuracy data is provided in similar categories to the BatchSummary page, with additional detail on:

  • Total Fields – A count of all the fields captured in the document.

  • Data Validated – Indicates whether validation errors, from validation formulas, format expressions, or validation lists, were present in the document. If this field is shown as Yes, all fields with validation were correct.

  • Green/Amber/Red – Indicates the quantity of fields in each capture accuracy category. You can use this to identify which document layouts require further investigation or attention.

  • Correctly Captured - Populated/Correctly Captured - Unpopulated – Where Decipher IDP has automatically identified the correct region, this data is then separated by whether the region was or wasn't populated.

  • Incorrectly Captured - Manually Corrected/Incorrectly Captured - Manually Removed – Where Decipher IDP has automatically identified the incorrect region, this data is then separated by whether the user has selected a different region, or removed the assigned region and left the field blank.

DocumentDetail

The DocumentDetail page provides a breakdown of all fields in a document and you can view the capture outcome for each field. Use the Batch drop-down at the top of the page to select from batches in the report, and then use the Document Id drop-down to select a document from the batch.

The document ID is used because it identifies a document more reliably than the document name. The document ID must be copied and pasted from the DocumentID column in the DocumentsByBatch worksheet.

The page is populated with the field references from the document form definition (DFD) that was used and how they were captured. You can use this page to compare the document with the DFD configuration for any problematic fields and implement changes as necessary to improve the accuracy.

ByField

The ByField page shows the capture results for each field in a given document type. Use the Document Type drop-down at the top of the page to select from the document types in the report and the page is populated with accuracy data for each header and table field.

The results are summarized by the number of documents and the outcome for the corresponding fields. For example, if a field was correctly captured in eight out of ten documents, but had to be added manually in the other two, it is shown with 80% accuracy.
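The arithmetic behind this example can be written out as a short sketch. The outcome labels and the `field_accuracy` function are hypothetical, and the sketch assumes that only correctly captured outcomes count toward accuracy. How the report treats low-confidence captures is not stated here.

```python
from collections import Counter
from typing import Iterable

# Hypothetical outcome label; the report's own category names may differ.
CORRECT = "Correctly Captured"


def field_accuracy(outcomes: Iterable[str]) -> float:
    """Return the percentage of documents in which the field was captured correctly."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("No documents to evaluate for this field.")
    return 100.0 * counts[CORRECT] / total


# Eight correct captures and two manual additions across ten documents.
outcomes = [CORRECT] * 8 + ["Manually Added"] * 2
print(f"{field_accuracy(outcomes):.0f}%")
```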

By default, 20 rows are shown for the header fields, and 10 for the table fields. If your documents contain more fields, you can expand the tables by using the plus (+) icons on the left of the page.

This page helps users identify fields where additional action may be required to improve performance, or fields that are consistently below the target accuracy rate. If your success rate is high, but there are fields that must be captured accurately, you can implement further validation techniques for these fields.