Frequently asked questions
Decipher Intelligent Document Processing (IDP) provides a data extraction platform which extends the scope of what can be achieved by Blue Prism. Blue Prism digital workers are able to submit scanned or electronic documents to Decipher, which can then automatically classify the document before extracting and validating the key information. It provides a web-based user interface to allow business users to validate and easily annotate the extracted data before making it available for further processing by Blue Prism digital workers.
Decipher can be used with Blue Prism versions 6.6 and above, and only with Enterprise versions of Blue Prism.
The earliest Blue Prism version that Decipher IDP will work with is Blue Prism 6.6. Decipher IDP can only be used with Enterprise versions of Blue Prism.
New features include:
Active Directory authentication through SAML using AD FS is now supported.
Signatures can now be detected using a new signature format type in document form definitions.
Tesseract 5 OCR engine is now supported.
Additional blank page detection options have been added, giving users more control over how blank page detection is configured.
Table improvements to enhance the handling of complex tables.
A new India locale to support the processing of Indian characters and currency.
Improved recognition of checkboxes, with the RFT miscellaneous parameter now supported for checkmark groups.
New options to automatically skip class verification and data verification have been added to the batch type configuration.
Additional miscellaneous parameters including RegexMode, which determines how regular expression formulas match and search fields.
Combined vector and scanned data processing where a document containing both vector and scanned data can be processed with partial OCR.
See the Release notes for further details.
All configuration, templates and machine learning created in 2.1 will be retained when upgrading to 2.2. The upgrade is installed onto the existing 2.1 environment.
The Natural Language Processing (NLP) plugin to support processing of unstructured documents is an optional feature. It requires a Graphics Processing Unit (GPU) to be available on the server or virtual server with a minimum of 8 GB of RAM for production use. If a GPU is not found, the NLP plugin cannot be installed, but all other features in 2.2 remain available.
Yes. Support for Azure SQL was introduced with version 2.1; however, at that time a configuration script was required before Azure SQL databases could be used with Decipher IDP. This limitation was removed in Decipher IDP 2.2.
The Decipher Licensing server is not currently compatible with Oracle. Please note that, as per the Decipher installation instructions, Decipher only supports SQL Server 2012 onwards.
As announced at Blue Prism World on 18th May 2021, Decipher is available to all customers with Production or Business-Critical Support Agreements. Decipher is not available for purchase without either Production or Business-Critical support.
Decipher is able to support a wide range of business documents, including structured documents (for example, passports and driving licenses), semi-structured documents (for example, invoices, statements and purchase orders) and unstructured documents (for example, text-based legal documents such as contracts). Decipher IDP provides administrators with an easy-to-use interface to define the data items that are to be extracted and any validation that should be applied to them. The validation functionality can be used to check data formats, automatically check that calculations on the document are correct, and validate data items against other databases, for example, to confirm that a supplier exists on the company’s approved supplier list. In addition to increasing the automation of documents, Decipher can identify potential fraud and reduce transcription errors.
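As an illustration only, the kinds of checks described above (a calculation check and a lookup against an approved list) can be sketched in Python. This is not how validation is configured in Decipher IDP, where it is defined in the document form definition; the function name, field names and supplier names below are hypothetical.

```python
from decimal import Decimal


def check_invoice(net, tax, total, supplier, approved_suppliers):
    """Return a list of validation issues; an empty list means every check passed."""
    issues = []
    # Calculation check: the stated total should equal net plus tax.
    # Decimal avoids the rounding surprises of binary floating point.
    if Decimal(net) + Decimal(tax) != Decimal(total):
        issues.append("total does not equal net plus tax")
    # Lookup check: the supplier should appear on the approved list
    # (compared case-insensitively, ignoring surrounding whitespace).
    approved = {name.strip().lower() for name in approved_suppliers}
    if supplier.strip().lower() not in approved:
        issues.append("supplier is not on the approved supplier list")
    return issues


# Hypothetical sample data.
print(check_invoice("100.00", "20.00", "120.00", "Acme Ltd", ["Acme Ltd", "Globex"]))
print(check_invoice("100.00", "20.00", "125.00", "Initech", ["Acme Ltd", "Globex"]))
```

The first call returns an empty list because both checks pass. The second returns two issues, one for the mismatched total and one for the unknown supplier.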
Yes. Decipher will only function when there is a specific Decipher license key present, and when used in conjunction with a licensed Blue Prism Enterprise deployment.
More information about Decipher is available from the Decipher IDP Portal page.
Decipher IDP is currently subject to a fair use policy which entitles each production deployment to 4,000 pages processed per month per licensed Digital Worker (concurrent session), subject to Decipher environment configuration. Therefore, if a customer has 10 Digital Workers, they are entitled to process 40,000 pages per month, or 480,000 pages per year.
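The allowance arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the 4,000 pages per month figure quoted above applies uniformly to every licensed Digital Worker; the function name is hypothetical.

```python
PAGES_PER_WORKER_PER_MONTH = 4_000  # fair use figure quoted in this FAQ


def fair_use_allowance(digital_workers):
    """Return (pages per month, pages per year) for a number of licensed Digital Workers."""
    if digital_workers < 0:
        raise ValueError("digital_workers must be non-negative")
    monthly = digital_workers * PAGES_PER_WORKER_PER_MONTH
    return monthly, monthly * 12


monthly, yearly = fair_use_allowance(10)
print(monthly, yearly)  # 40000 480000
```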
Decipher can be downloaded from the Decipher IDP Portal page.
A valid license key will be required to use Decipher.
The user interface is available in English, French, German, and Spanish.
Decipher IDP supports 26 languages for OCR extraction. These are: English, Spanish, French, German, Italian, Bulgarian, Czech, Danish, Dutch, Estonian, Finnish, Greek, Hungarian, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Russian, Swedish, Turkish, Ukrainian, Latvian, Slovak, Croatian, and Afrikaans.
The language to use for data extraction is defined in Batch Types > General Settings.
Decipher IDP has a number of hardware and software prerequisites that must be provided. See the Install Guide on the portal for more information.
Ideas and suggestions can be raised and voted on via the ideas community.
Decipher IDP is released as a commercial software product that includes all relevant licensing within the main product. Refer to Decipher IDP Install and Configure for details. Version 2.2 requires RabbitMQ and Erlang to be installed on the server that will host the web client. Full details are included in the installation guide.
Yes, Blue Prism version 6.6 or above is required to use Decipher IDP.
Decipher IDP integrates with Blue Prism via processes using a Decipher VBO.
For evaluation, training or proof of concept use, Decipher can be installed on a virtual machine together with Blue Prism. This can be on a business laptop/desktop, on a local server or in a cloud environment such as Microsoft’s Azure.
For production use, Blue Prism recommends that Decipher is installed on separate hardware from your Blue Prism Core RPA solution to support modular scaling.
Firstly, on the PC of the business user that will use Decipher to make annotations, make sure that the necessary firewall rules are in place to the server on port 80 (http) or 443 (https).
Secondly, as with all Blue Prism installations, access to a SQL database is required. Decipher IDP introduces two more SQL databases, so ensure the server can access a SQL Server instance and that you have credentials with permissions to create databases on that instance. As this is a proof of concept, SQL Express or similar is sufficient.
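The first check above, that the business user's PC can reach the server on port 80 or 443, can be tested with a short Python script run from that PC. This is a minimal sketch: it only confirms that a TCP connection can be opened, not that Decipher is responding correctly, and the function names are hypothetical.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def report(host, ports=(80, 443)):
    """Map each port to whether it is reachable from this machine."""
    return {port: can_reach(host, port) for port in ports}
```

For example, calling `report("decipher-server.example.local")` (a placeholder host name) returns a dictionary such as `{80: False, 443: True}` when only HTTPS is allowed through the firewall.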
There are three primary roles for Decipher IDP:
System Administrator to install Decipher IDP – a typical systems administrator role.
Decipher Administrator to set up new document form definitions and define batch types.
Decipher User to manage the batches and content being validated.
Yes. While Decipher IDP is a separate product, it is designed to integrate with the Blue Prism RPA product.
Yes. Decipher IDP allows administrators or approved business users to easily define custom documents. This means that customers will be able to use Decipher IDP to process a wide range of documents, whether standardized such as passports and driving licenses or documents used in their organizations. These document definitions can also include validation data, such as checking date of birth format. Validation rules can also be applied to the data values, e.g. checking the age given is relevant to the application type.
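To make the two rule types above concrete, the following Python sketch checks a date of birth format and then checks that the resulting age suits the application type. It is illustrative only: in Decipher IDP these rules are defined in the document form definition, and the DD/MM/YYYY format, the minimum age and the function names here are assumptions.

```python
from datetime import date, datetime
from typing import List, Optional


def parse_date_of_birth(value: str, fmt: str = "%d/%m/%Y") -> Optional[date]:
    """Format check: return the parsed date, or None if the text does not match fmt."""
    try:
        return datetime.strptime(value, fmt).date()
    except ValueError:
        return None


def age_on(dob: date, today: date) -> int:
    """Whole years between dob and today."""
    years = today.year - dob.year
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1  # birthday has not yet occurred this year
    return years


def validate_applicant(dob_text: str, minimum_age: int, today: date) -> List[str]:
    """Return a list of validation issues; an empty list means the value passed."""
    dob = parse_date_of_birth(dob_text)
    if dob is None:
        return ["date of birth is not a valid DD/MM/YYYY date"]
    if age_on(dob, today) < minimum_age:
        return [f"applicant is under {minimum_age}"]
    return []


print(validate_applicant("15/06/2010", 18, date(2024, 1, 1)))
```

Passing `today` as a parameter, rather than reading the system clock inside the function, keeps the check repeatable.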
Supported file formats include: PDF, PNG, TIFF, GIF, JPEG, and BMP.
We recommend that scans are a minimum of 300 DPI. Native PDFs are read directly from the underlying data and generally perform better than images.
Decipher IDP does not currently support PDFs that have been digitally signed if, when loaded, the PDF initiates a connection to the online verification service. Ensure signed PDFs are saved in a form that does not initiate an online check, for example, using Microsoft Print to PDF from a browser.
An example invoice document form definition file is available and can be downloaded from the customer portal. Additional document definition files will be uploaded to Blue Prism’s Digital Exchange (DX).
Version 2.2 of Decipher IDP includes the ability to process unstructured documents, using the optional Natural Language Processing plugin. The plugin is available for installation from the portal and requires a GPU (graphics processing unit) to be installed in the machine that will be running Decipher. Full details of the NLP requirements are included in the installation guide.
Support for handwriting in Decipher is on the roadmap for inclusion in a future release of the product.
Any source that a Blue Prism Digital Worker has access to, that is, any file or directory accessible from a Windows machine. This allows administrators to easily integrate Decipher into their current processes.
Decipher is designed to optimize extraction and accuracy based on the specific attributes defined in each data form definition. This means that the optimum accuracy is obtained by training Decipher on the documents and data used in your organization. Decipher can be trained on a specific data set (with further learning turned off) or set to continually improve the learning as it is used (for example, automatically rebuilding the machine learning model every 300 documents).
It is recommended that administrators read the Training types overview to understand how Decipher training works, so that they can optimize this for their deployment.
The number of documents required for training Decipher varies depending on the variability of the data and the formatting in the document type being processed. The more standardized the content is (by type and location of data) the lower the number of documents required to train Decipher.
The default number of documents for creating the machine learning model is set to 1000 (300 for unstructured documents). However, high levels of accuracy can be achieved with significantly lower numbers, depending on the type of document being processed and the completeness of the validation defined in the document form definition.
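The relationship between the default figures and a configured value can be sketched as a simple threshold test. This is an assumption-laden illustration, not Decipher's implementation: how Decipher counts documents internally is not described here, and the function and parameter names are hypothetical.

```python
# Default figures quoted in this FAQ.
DEFAULT_TRAINING_DOCUMENTS = {"structured": 1000, "unstructured": 300}


def model_build_due(documents_since_last_build, document_kind="structured", configured=None):
    """Return True once enough documents have accumulated to build the model.

    `configured` stands in for a value changed in the settings; when it is
    None, the default for the document kind is used.
    """
    threshold = DEFAULT_TRAINING_DOCUMENTS[document_kind] if configured is None else configured
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return documents_since_last_build >= threshold


print(model_build_due(999))                  # False
print(model_build_due(300, "unstructured"))  # True
print(model_build_due(300, configured=300))  # True
```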
The Decipher web-based user interface includes easy-to-use annotation capabilities which allow business users to view, edit, or overwrite the data extracted from a document. Extracted data that doesn’t meet the defined validation rules will be clearly highlighted, helping the business user quickly identify which data items they need to check or review. The web-based interface is included with Decipher, with the user interface for each document being automatically generated from the document form definition.
Yes. Decipher includes several machine learning models, including:
The classification engine analyzes documents and sorts them into the document types.
The data extraction engine analyzes documents and extracts discrete data from them.
The ML capabilities in Decipher IDP enable the product to learn new document types and understand their discrete data. This learning process requires a default number of 1000 documents (300 for unstructured) from which to train the machine learning algorithms; this figure can be changed in the settings.
As the customer processes more documents, the machine learning algorithm will continue to learn and therefore improve its recognition accuracy.
It is possible to assign priorities to different batch types (e.g. invoices or application forms), allowing individual batch types to be prioritized. This enables administrators to ensure that documents with a higher priority, for example invoices, are processed before application forms, regardless of the volume of application forms.
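The effect of batch type priorities can be illustrated with a small priority queue in Python. This is a conceptual sketch only, not Decipher's scheduler: the class name, the convention that a lower number means a higher priority, and the sample file names are all assumptions.

```python
import heapq
from itertools import count


class BatchQueue:
    """Serve documents by batch type priority, first-in first-out within a priority."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker that preserves submission order

    def submit(self, batch_type, document, priority):
        # Assumption for this sketch: a lower number means a higher priority.
        heapq.heappush(self._heap, (priority, next(self._order), batch_type, document))

    def next_document(self):
        _, _, batch_type, document = heapq.heappop(self._heap)
        return batch_type, document

    def __len__(self):
        return len(self._heap)


queue = BatchQueue()
for i in range(3):
    queue.submit("application form", f"form-{i}.pdf", priority=5)
queue.submit("invoice", "invoice-1.pdf", priority=1)

print(queue.next_document())  # ('invoice', 'invoice-1.pdf')
```

Although the invoice was submitted last, it is served first, regardless of how many application forms are already waiting.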
Decipher can be installed on a single machine, or across four (or more) separate machines with each of the core components on a separate machine. Further information on sizing Decipher for high volume processing can be found in the Administration section of the help pages.
Decipher IDP is designed to provide customers with an on-premise solution that can run in a disconnected network. Decipher stores all data, machine learning and extracted data locally and doesn’t share any data or metadata externally. The machine learning models created within Decipher remain dedicated to that company, with no data being sent back to any other platform or service.
Decipher can be installed in a virtual machine, whether on-premise or on a cloud environment such as Azure.
Decipher 2.2 allows machine learning models to be easily moved between environments by exporting and importing the machine learning files. Full details of this are included in the Decipher help.
Training the machine learning model on a central processing unit (CPU) can be slow. Unlike the NLP module, which can only be installed on a machine with a GPU, the structured ML module can be installed on a CPU-only machine. If periodic training of the ML model is enabled in the user interface, the model may be locked for a long period of time while it trains. For this reason, enabling ML is an explicit choice that the administrator must make. To avoid locking the ML model, either import a model that has been trained on another machine, or run data capture on multiple machines where only one has a GPU with ML training enabled and the others perform capture on the CPU only.
Roles and permissions are currently separate to those in Blue Prism, with Decipher IDP having a separate identity solution. Our goal is to integrate these together using Blue Prism Authentication Server.
Decipher IDP does not make any database changes to the Blue Prism Core RPA product. The Decipher product is installed separately to the core Blue Prism product with connectivity between these managed via a Decipher VBO.
Decipher IDP works with Microsoft’s underlying database technology, Transparent Data Encryption (TDE). Therefore, Decipher IDP uses the default encryption method configured by the user’s database administrator. Encryption can be enabled during the Decipher IDP installation. For example, if 256-bit AES encryption is the default method implemented in the SQL Server instance, Decipher IDP will use this for data at rest and encrypt configuration settings (if selected when installing Decipher server). These settings are not selected by default.
In-flight data encryption between the Decipher Server and SQL Server is configured as part of the Windows infrastructure deployment, using Microsoft technology. Decipher IDP components utilize whatever encryption methods the environment has configured in Windows and SQL Server.
Roadmaps and future functionality
The Decipher Product Team has an ambitious long-term vision for Decipher that delivers significant innovation and ongoing value to Blue Prism customers.
This can be found at: https://portal.blueprism.com/product/product-portfolio-roadmap
The goal of Decipher IDP is to allow customers to easily process paper documents and forms as part of an electronic process and integrate this into a Blue Prism solution. Decipher IDP is built on a modular architecture that will, in future releases, allow customers to integrate partner products directly into Decipher to leverage their specialist capabilities and functions whilst maintaining a single platform.
The community is available for internal users and customers to request new features. Requests will be considered by the Product Manager and the engineering team. People who submit ideas and suggestions may be contacted by the Product Manager for further information or clarification. By using a community portal, Decipher IDP users can see the suggestions that have already been submitted, which avoids duplication, promotes discussion, and allows voting. The ideas community is available at: https://community.blueprism.com/participate/ideation-home
Active Directory authentication through SAML 2.0 using Microsoft® Active Directory Federation Services (AD FS) is now supported, enabling users to log in to Decipher IDP using their Active Directory credentials. This enhancement is currently limited to credential authentication only.
Active Directory authentication through SAML 2.0 using Microsoft® Active Directory Federation Services (AD FS) is the only identity provider (IdP) verified and supported for this release. However, this functionality is not limited to AD FS and could be applied to other SAML IdP providers.
Decipher IDP does not currently support high availability. This is planned for a future release.