Document types

A document type is a category of document, such as an invoice, purchase order, or loan application. The Document Types page enables you to create new document types and associate each one with a document form definition (DFD). Multiple document types can then be associated with a batch type, enabling you to process more than one document type in the same batch.

You can activate machine learning by editing an existing document type and associating it with a machine learning model.

Machine learning models are applied to document types, but they are also applied to the DFD associated with the document type. This means that any document types that use the same DFD also use the same machine learning model. As a best practice, give each document type its own DFD and consequently its own machine learning model. This prevents issues where changes to the DFD reset the machine learning model, or where the machine learning is not always relevant to the document type.
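
The relationship described above can be pictured with a minimal Python sketch. It is illustrative only and is not Decipher IDP code: the names DocumentType, models_by_dfd, assign_model, and model_for are hypothetical, and the sketch assumes that a model is stored against the DFD rather than against the document type itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentType:
    name: str
    dfd: str  # name of the associated document form definition


# Hypothetical store: one machine learning model per DFD (assumption).
models_by_dfd = {}


def assign_model(doc_type, model):
    """Associate a model with a document type; it is stored against the DFD."""
    models_by_dfd[doc_type.dfd] = model


def model_for(doc_type):
    """Return the model a document type uses, or None if none is assigned."""
    return models_by_dfd.get(doc_type.dfd)


invoice = DocumentType("Invoice", dfd="Finance DFD")
credit_note = DocumentType("Credit note", dfd="Finance DFD")  # shares a DFD
loan = DocumentType("Loan application", dfd="Loan DFD")       # has its own DFD

assign_model(invoice, "invoice-model")
print(model_for(credit_note))  # invoice-model
print(model_for(loan))         # None
```

In this sketch, the credit note picks up the invoice model purely because the two types share a DFD, which is the situation the best practice is intended to avoid.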

To manage document types, click Admin Panel > Document Types.

Three options are available to manage document types:

  1. Create – click Add document type and enter details for the document type.
  2. Edit – click the edit button to update a document type and activate machine learning.
  3. Delete – click the delete button to remove the document type from the database.

Document type details

Document types use the fields listed below.

The machine learning options are visible only in the Edit Document Type dialog, not in the Add Document Type dialog.

Type name

The name of your document type, for example, Invoice.

Document form definition

Select a Document Form Definition from the drop-down.

Type description

Enter an overview of the document type.

Classification confidence threshold

The confidence threshold for bypassing Class Verify. If bypassing Class Verify has been enabled and all the documents in a batch have been classified with high confidence, the batch will go directly to the next step in the workflow, without an operator having to manually verify the classification.
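
The bypass rule described above can be sketched as follows. This is a minimal illustration, not the product's implementation: the function name and parameters are hypothetical, and it assumes that "high confidence" means a score at or above the threshold, that scores and the threshold share the same scale, and that an empty batch never bypasses.

```python
def batch_bypasses_class_verify(confidences, threshold, bypass_enabled):
    """Return True if the batch can skip Class Verify.

    confidences    -- classification confidence of each document in the batch
    threshold      -- classification confidence threshold (same scale)
    bypass_enabled -- whether bypassing Class Verify has been enabled
    """
    if not bypass_enabled or not confidences:
        # Assumption: with bypass disabled, or with no documents,
        # the batch always goes to Class Verify.
        return False
    # Every document must meet the threshold; one low score is enough
    # to send the whole batch to an operator.
    return all(score >= threshold for score in confidences)


print(batch_bypasses_class_verify([92, 88, 95], 85, True))   # True
print(batch_bypasses_class_verify([92, 60, 95], 85, True))   # False
print(batch_bypasses_class_verify([92, 88, 95], 85, False))  # False
```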

Machine Learning

The default setting is Off. When machine learning is switched on, the machine learning options display.

Regardless of this setting, machine learning training is disabled by default in the SsiDataCaptureClient.exe.config file. See Enable machine learning training for details.

ML Model

Select the machine learning model that you want to associate with the selected document form definition, or click the Create new model link below the field to create a new machine learning model.

You can also use this dialog to upload an existing machine learning model (an MLD file). However, the model must relate directly to the selected document form definition; otherwise, the learning in the model is automatically reset.

After you have created a new model, you will need to select it from the ML Model drop-down.

Train Model

A machine learning model is first trained once 1000 documents have been verified. After the initial training, the model retrains automatically: by default, after each further 1000 verified documents, or you can specify any value between 50 and 5000.
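
The training schedule described above can be worked through with a short sketch. It is an illustration under stated assumptions, not Decipher IDP code: the function name and parameters are hypothetical, the initial training is assumed to be fixed at 1000 verified documents, and the 50 to 5000 range is assumed to apply to the retrain interval only.

```python
def training_points(total_verified, retrain_every=1000, initial=1000):
    """Return the verified-document counts at which training would run."""
    if not 50 <= retrain_every <= 5000:
        raise ValueError("retrain_every must be between 50 and 5000")
    # First training at `initial`, then once per `retrain_every` documents.
    return list(range(initial, total_verified + 1, retrain_every))


print(training_points(3200))                     # [1000, 2000, 3000]
print(training_points(1200, retrain_every=100))  # [1000, 1100, 1200]
print(training_points(900))                      # []
```

With the default interval, a document type that has 3200 verified documents would have trained three times; lowering the interval makes retraining more frequent after the first 1000 documents.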

Invisible document type

This option is not currently used by Decipher IDP.

Attachment document type

Select this option to mark this document type as an attachment.

An attachment is one or more pages that are part of the document but from which no data needs to be extracted. You can mark a page as an attachment manually, and Decipher IDP can also detect attachments automatically.

When the software detects this document type during classification, it automatically attaches the document to the preceding document.
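
The attachment behavior described above can be sketched as follows. This is a simplified illustration, not the product's logic: the function name and data shapes are hypothetical, and the handling of an attachment that appears first in a batch (kept as a standalone document here) is an assumption.

```python
def attach_to_preceding(classified, attachment_types):
    """Fold attachment documents into the document that precedes them.

    classified       -- list of (document_type, pages) in batch order
    attachment_types -- document types marked as attachment document types
    """
    documents = []
    for doc_type, pages in classified:
        if doc_type in attachment_types and documents:
            # Attachment type detected: attach its pages to the previous document.
            documents[-1]["attachments"].extend(pages)
        else:
            # Assumption: an attachment with no preceding document stays standalone.
            documents.append(
                {"type": doc_type, "pages": list(pages), "attachments": []}
            )
    return documents


batch = [("Invoice", [1, 2]), ("Terms", [3]), ("Invoice", [4])]
for doc in attach_to_preceding(batch, {"Terms"}):
    print(doc)
# {'type': 'Invoice', 'pages': [1, 2], 'attachments': [3]}
# {'type': 'Invoice', 'pages': [4], 'attachments': []}
```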

Exception reasons

Add exception reasons and descriptions for use during data verification.