Frequently asked questions (FAQ)

Document Automation and READ abilities

What does READ do?

SS&C Blue Prism Document Automation extracts printed text and handwritten data from faxed or scanned forms with high accuracy. Our customers use Document Automation to automate digital workflows, resulting in greater efficiency, improved customer experience, and faster turnaround times.

Our AI, READ, recognizes and transcribes text, numbers, and checkboxes at an image level. Setup is easy and requires no coding or machine learning training. Your blank forms are used as templates to crop and extract individual form fields (we call them ‘shreds’) to be sent for digitization.

Our competitive advantage is that our AI is trained to extract and digitize real-world, low-quality printed text and handwriting quickly and accurately.

What is the Document Automation platform?

The Document Automation platform is a complete solution that turns scanned pages into automation-ready data. It automatically identifies and classifies pages by matching them to your templates, and extracts printed and handwritten data from your defined form fields. An optional human-in-the-loop review interface lets you inspect the digitized results and make corrections if necessary. Data transformations can be applied to make the output consistent and ready for ingestion into a Robotic Process Automation (RPA) or other back-end system.

Sign up for a free trial of the Document Automation platform here. Watch a demo here.

How long does it take to get my results back?

We offer different turnaround time service levels to different types of customers. Our customers range from Fortune 500 companies to small and medium businesses. Please refer to the plan you purchased to find out.

What is the Document Automation accuracy rate?

Document Automation achieves over 90% accuracy for printed text and handwriting, across all data types. Our average accuracy for live enterprise customers is 96%.

Accuracy is highly dependent on how templates are set up. For customers who set up their own templates, we have a comprehensive knowledge base of all our best practices. If you are not achieving the accuracy you want, contact Document Automation Support.

How does READ digitize bad handwriting?

Our AI is a complex neural network that has been trained on a dataset of over one billion shreds. It has learned to transcribe even terrible handwriting. Document Automation has seen every type of doctor’s handwriting, blurry fax, and low-DPI scan, and can read them better than humans. For example:

How does Document Automation digitize multiple choice and checkbox fields?

Document Automation has the ability to digitize multiple choice and checkbox fields. We use the Support Set Builder to train our AI to recognize data as it would look in production.

How is Document Automation's AI different from traditional optical character recognition (OCR)?

OCR cannot identify handwriting as well as Document Automation, and it cannot handle low-quality print or poorly scanned pages.

How can I get started with an account?

Sign up using this link.

What kind of documents work well with Document Automation?

Document Automation excels at digitizing documents with handwritten and printed text, and multiple-choice questions. We can help with application forms, enrollment forms, claim forms, and even survey forms. In addition to our core verticals in healthcare, insurance, and finance, we can also automate digital mail rooms.

Document Automation's alignment algorithm works best with structured forms with fields in fixed locations. Here are some examples of structured forms for Document Automation digitization:

We’re currently researching document recognition of semi-structured forms. Please contact Support if you have semi-structured forms or if you are unsure if your form is structured or not. You may be surprised to find out that what you consider unstructured would be a structured form to Document Automation. For example, some customers consider death certificates as unstructured forms, but they are structured, and we extract data from over 350 formats of death certificates.

Document Automation is not yet the best fit for unstructured documents, such as diary pages, books, or newspapers. We're working hard to develop the capability to process unstructured text in the future.

Contact us if you are unsure if Document Automation is the right fit for your use case.

What languages does Document Automation support?

Document Automation currently supports any Latin-based language that uses the 26 basic letters from A to Z. We regularly process Spanish forms; however, the system cannot capture accents yet.

Can Document Automation translate different languages?

Document Automation cannot translate between languages. Although Document Automation’s READ digitization engine has been trained mostly on English, it produces exact transcriptions. Any roman-numeral data that’s captured in a shred of data will be digitized as transcribed by Document Automation.

Account information

I did not receive an account verification email. What should I do?

If you did not receive your account verification email, please first check that the email did not go to your spam folder. If you’ve exhausted all email inbox folders, please contact Support for further assistance.

I can't remember my password. What should I do?

If you can't remember your password (or entered it incorrectly), you can reset it on the SS&C Blue Prism Document Automation login page.

If this still doesn't work, please contact Support.

I signed up for a trial but cannot log in. What should I do?

If you have not yet confirmed your account through the account verification email, first check that the email did not go to your spam folder. If you have confirmed your account and still cannot log in, or you cannot find the verification email in any of your email folders, please contact Support for further assistance.

Can my trial account be converted or extended?

Yes. When your trial account expires, and you are ready to activate your production-level account, you have the option to use your trial account as your production account. All templates created and all progress will be saved. You also have the option to create a new account before going into production.

Can I add collaborators to my account?

If a Document Automation team member has enabled the organization feature for your account, members can be added as an admin or a collaborator. For instructions on adding a user to your organization as a collaborator, see How to add organization team members. For more general information on permissions, see Organization management.

How many members of my organization can be added to my Document Automation account?

As many as you need – see Organization management for more information about how to assign roles in your organization.

Document classification

Can Document Automation automatically classify my files to match my templates?

Yes. When you process your batch you can have Document Automation automatically pick the correct template to use. Document Automation will compare each form in your batch to all your active templates and automatically choose the best template to use.

Document Automation will automatically rotate, scale, and deskew your scanned filled-in forms to match your templates. A template is just a blank copy of the form, for example:

This template will match to the following pages:

What is sorting? How does it work?

Our sorting engine, VISION, is Document Automation’s document identification technology. Sorting is the first step in the overall digitization process, in which submitted files are ‘sorted’ against the active templates in your Document Automation account. The VISION engine uses alignment features on the submitted page, such as lines, tables, and logos, to identify the correct active template to digitize against. The engine is advanced enough to handle faxes, bad scans, and upside-down pages. For more information on sorting, see Document recognition.

Why do I have a rejected batch?

Files can become ‘rejected’ in the sorting process if the sorting engine determines that they do not match an active, in-scope template. Rejected forms are grouped into a sub-batch of your submitted batch, which you can inspect in the Document Recognition dashboard (found in the Sorting tab) or on the Inbox page.

You can view each rejected file image by using one of the following two methods:

  • Document Recognition Dashboard – Under the submitted batch, the rejected batch shares a name with the original submitted batch it came from, appended with "- Rejected". Click the orange button on the row of the batch. The batchgrid page shows all the forms that have been rejected, along with their sort scores.
  • Inbox – The rejected batches will display the message "Could not be processed". Click More actions in the batch dropdown menu and then view the batch to see the full size of each page that has been rejected.

Poor scan quality is a common reason for in-scope forms to fall below the sorting threshold and be grouped into the rejected batch.

Another reason a form may be rejected is that it does not match any of the active templates. To process those files, create an entirely new template, or add a new version if there is only a slight change in form structure, because your batch pages need to match the template exactly. Even slight differences, such as in the numbering or wording of questions, require a new version or template. Once the new template or version has been created, re-upload those files and submit them for digitization.

Data quality

Why does my data have impossible results?

There are a variety of reasons why impossible results may occur in your digitized data: illegible or ambiguous handwriting, values that spill over past the defined field area, and poor-quality scans.

Impossible outputs may also occur in your digitized data as a result of incorrectly defined templates. The following list gives the common reasons for these kinds of impossible outputs, each followed by its solution:

  • Incomplete support sets of Select One and Select Many fields – Complete the support set for fields with the readiness tags “Not Ready” and “Missing Categories”. Refer to the article on Multiple choice field setup to make sure that the support set for every field on each sheet in the template is built.
  • Field is too large for capture – Try the following, in order:
      1. Reduce the image size of your template. A large template image affects the size of the fields on submitted files. See Template File Size Requirements to learn how your template image quality can affect the field size for submitted files.
      2. If the above solution does not solve the issue, create another field to capture the additional text (for example, from prescription to prescription_1 and prescription_2).
  • Field has incorrect data type – Change the field’s data type to follow the Best practices for choosing a data type. Medium Text is the recommended default data type for most fields because it imposes minimal constraints.
  • Failed transformational constraints – These are only applied by Document Automation staff. Please contact Support to request any configuration adjustments.

Data can also be marked as impossible if the filled-in forms do not match the template. It is extremely important that your forms match the template you've previously uploaded. If there are variations, your data will not be captured well. You should create a new template or add additional versions to capture any variations in your forms.

It’s recommended that you contact Support if there are a large number of impossible outputs in your results.

Privacy and security

Is Document Automation HIPAA compliant?

Yes. Ensuring the privacy and security of your data is our top priority. Document Automation is committed to securing customer data, including Protected Health Information (PHI), and ensuring that all privacy and compliance measures are enforced. All data is 256-bit AES encrypted in transit and at rest. The Document Automation solution and cloud-based infrastructure are HIPAA compliant in accordance with the Code of Federal Regulations (45 CFR). Our policies, Corrective Action Plans (CAPs), and remediation requirements have been examined by an external audit firm, and quarterly audits are conducted to ensure adherence to all compliance requirements and policy updates. The Document Automation audit emphasizes strict adherence to the Administrative, Physical, and Technical Safeguards of the policy mandates in accordance with HIPAA Title II.

How does Document Automation ensure confidentiality?

Document Automation considers all customer data confidential. SS&C Blue Prism Document Automation complies with HIPAA, CCPA and GDPR privacy standards. This applies to data access, handling, retention, and destruction. SS&C Blue Prism Document Automation requires NDAs, Business Associate Agreements, standard clauses, and security/privacy standards with third parties.

Our privacy policy can be found at blueprism.com.

Technical integration

In what format does Document Automation provide results?

Document Automation's standard output is JSON or a comma-separated values (CSV) file. A CSV file can be opened in Excel or uploaded to a number of databases and websites.
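As an illustration of loading CSV results into a database, here is a minimal sketch that uses only the Python standard library. The column names (file_name, first_name, date_of_birth) and the sample row are hypothetical, since the actual columns depend on the fields defined in your templates. The function name load_csv_into_sqlite is also just for illustration and is not part of any Document Automation API.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(csv_text, table="results"):
    """Load CSV text into an in-memory SQLite table, one TEXT column per header."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, data = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)
    return conn


# Hypothetical export: these column names are assumptions, not a documented schema.
sample = "file_name,first_name,date_of_birth\nform_001.pdf,ANA,1980-01-31\n"
conn = load_csv_into_sqlite(sample)
print(conn.execute("SELECT first_name FROM results").fetchall())
```

Every column is stored as text here, so any type conversion (dates, numbers) would be a separate step in your own pipeline.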

Integration methods, with the data return format and documentation for each:

  • READ (Shred) API – Returns JSON. Documentation: https://app.swaggerhub.com/apis/SS&C Blue Prism DA/READ_Text/0.1.0 Learn more about the READ API here.
  • Platform (Batch/File) API – Returns JSON. Documentation: https://app.swaggerhub.com/apis/SS&C Blue Prism DA/Full_Page_Digitization_Connect/1.0
  • AWS S3 – Returns JSON or CSV. To explore other custom integrations, please contact Support.
  • SFTP – Returns JSON or CSV. To explore other custom integrations, please contact Support.
  • Google Drive – Returns CSV. You can connect your Google Drive to your Document Automation account and upload files directly from that platform.
  • Box Integration – Returns CSV. You can connect your Box account to your Document Automation account and upload files directly from that platform.
  • Drag and Drop – Returns CSV. You can upload files directly from your computer via drag and drop into your Inbox. For more information, see Batch submission.

Template setup

How do I build a template?

For information about templates and how to build them, see Template definition and field placement.

Does my source template need to be blank?

Yes. Document Automation recommends that you upload a blank copy of the template you’d like to have data digitized against when completing the template definition process.

How do I add fields to a template?

See Field placement.

When defining fields, how big should I draw my boxes?

How you place and draw your fields on the template will affect the quality of data you get back, as well as its accuracy. For more information, see Best practices for choosing a data type.

What if one field contains multiple pieces of information (for example, multiple choice and space for “Other”)?

Create a separate field for each type of data: draw a box around each one and define each individually. In this example, draw one field around the multiple choice options, including “Other” (using a “select-one” or “select-many” data type), and a second field around the space where the “Other” text is written (using a medium text data type).

When should I choose “text”, “number”, or “multiple choice” data types?

How you define a field will affect the type of data you get back, as well as its accuracy. For more information, see Best practices for choosing a data type.

What if I have multiple versions of my form?

SS&C Blue Prism Document Automation can easily handle multiple versions of forms. You just need to upload each version to the platform. Please refer to the Version Discovery section for more information.

How does Document Automation deal with multi-page forms?

When you create a template, you can add multiple pages of your form. Submitted pages are then grouped according to the template's page count and page order, with each set of pages representing one completed form.

Thanks to our sorting engine, VISION, single-page or multi-page documents will automatically be matched.

Please always make sure that the pages of each form are in order, to ensure that all data is digitized.

Batch submission

Does each form have to be saved as a separate file?

No. You can scan whole batches of forms together. Document Automation uses VISION, a technology that sorts your documents, to match forms to templates. Please always make sure that each form is in the correct page order, to ensure that the digitized data is returned in the correct order.

How can I upload large files?

Document Automation limits the maximum image file size to 50MB.

If you have a large number of big files, we recommend that you upload them through Google Drive or Box. This is usually faster than uploading directly from your computer.

How do I get the files to Document Automation?

You can upload them using our friendly web interface, or use Google Drive or Box accounts. For more information see Batch submission.

What kind of image files should I submit to give me the best digitized results?

This depends on many factors, such as how the template was set up, the transformations and validations applied to the data, and the quality of the form submitted for digitization. Use a scanner to capture images. Photographs of pages can be processed, but with lower accuracy, so they are not recommended.

When scanning, select an image resolution of 150 DPI or above (for example, black and white, 1-bit, with CCITT Group 4 compression). Save your files in PNG, TIFF, GIF, JPG, or PDF format. SS&C Blue Prism Document Automation can process multi-page TIFF and PDF files.
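The format list above and the 50MB file size limit can be checked locally before uploading. The following is a minimal sketch, not part of the Document Automation platform. The function name upload_problems is hypothetical, the extra ".tif" and ".jpeg" spellings are assumed to be accepted, and 50MB is interpreted conservatively as 50,000,000 bytes. It also assumes the size limit applies to every file type, which the FAQ does not state.

```python
from pathlib import Path

# Formats named in this FAQ; ".tif" and ".jpeg" are assumed spelling variants.
ALLOWED_SUFFIXES = {".png", ".tiff", ".tif", ".gif", ".jpg", ".jpeg", ".pdf"}
# 50MB limit from this FAQ, read conservatively as decimal megabytes (an assumption).
MAX_BYTES = 50 * 1000 * 1000


def upload_problems(file_name, size_bytes):
    """Return a list of reasons a file may be refused (empty if none are found)."""
    suffix = Path(file_name).suffix.lower()
    problems = []
    if suffix not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 50MB")
    return problems


print(upload_problems("claim_form.PDF", 1_000_000))
print(upload_problems("claim_form.bmp", 60_000_000))
```

For a file on disk, the size can be taken from Path(path).stat().st_size. The sketch does not inspect the scan resolution, which would require opening the image itself.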

How long will it take to upload all my forms?

Multiple factors affect file upload time: your internet connection, the number of files, their size, and the upload method.

Depending on the quantity, the size of your files, and your internet connection, it could take minutes or hours. If you have many thousands of pages, you may find it faster to upload from Box or Google Drive.

How many files can I upload at one time?

There are a variety of considerations used to determine the number of files to submit in a batch. These include your total volume and your workflow architecture. If you have files totaling more than 500 pages each day, you may want to consider these variables:

Volume

  • Daily Volume/Total Number of Files – This will help determine the number of batches (sets of files) planned for submission. Unless REVIEW is enabled (or we inform you otherwise), all files in the batch must be digitized before the data is returned. Smaller batches will typically generate a more continuous flow of data.
  • Total Page Count of each File – This is the total count of pages from all the files in the batch. If your files have 100 pages each, then 10-12 files might be the most you want to put in one batch. If your files all have two pages each, then you may find, in accordance with all the other factors mentioned here, that you can create batches with 200-400 files. For an operational workflow, you should ideally not exceed approximately 1200 pages in any single batch.
  • Template Page Count – This is the total number of template pages across all active templates in your account. The more template pages being used to sort the submitted files, the longer it will take to process the batch, thus slowing down the continuous flow of data.
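The page-count guidance above can be sketched as a simple grouping step. This is only an illustration under stated assumptions: the function name plan_batches is hypothetical, the greedy strategy is one possible choice rather than a platform feature, and the default of 1200 pages is the approximate ceiling mentioned above. It does not account for the other factors, such as template page count or REVIEW staffing.

```python
def plan_batches(files, max_pages=1200):
    """Greedily group (name, page_count) pairs into batches of at most max_pages pages.

    Files are kept in their original order. A single file with more than
    max_pages pages ends up in a batch of its own.
    """
    batches, current, current_pages = [], [], 0
    for name, pages in files:
        # Start a new batch when adding this file would exceed the page ceiling.
        if current and current_pages + pages > max_pages:
            batches.append(current)
            current, current_pages = [], 0
        current.append(name)
        current_pages += pages
    if current:
        batches.append(current)
    return batches


# 25 files of 100 pages each: 12 files fill a 1200-page batch.
files = [(f"claims_{i:02d}.pdf", 100) for i in range(25)]
print([len(batch) for batch in plan_batches(files)])  # [12, 12, 1]
```

With two-page files, the same ceiling would allow up to 600 files per batch, so the 200-400 range suggested above reflects the other factors rather than the page limit alone.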

Workflow architecture

  • Medium of Ingesting and Receiving Files – The medium (S3, SFTP, API) through which files are submitted and data is returned can affect turnaround time (TAT). If you choose to use S3/SFTP, larger and less frequent batches are preferred. For smaller, more frequent batches, the API is a logical option. If you have fewer than 500 pages each day, you can load them through the Inbox in the SS&C Blue Prism Document Automation platform via drag and drop.
  • Utilization of the REVIEW Interface – If this feature is enabled, files will be ingested as a batch but returned on a per file basis. The files must be accepted in the REVIEW interface by a user before the data is accessible. Users must plan to have the appropriate resources on hand for all the files to be manually reviewed within the expected TAT. You should design the flow of files into batches in accordance with both your resource availability and their ability to process the files in a timely manner.

If you have any questions about determining the right batch size for your workflow, please contact Support.

What happens if I upload duplicate files?

If you try to upload a duplicate file with the same name as a previously uploaded file, you will receive an error message listing the files that the system was unable to upload and the reason. However, if you upload the same page scanned twice with different file names in a single batch, the Document Automation system will not be able to recognize the file as a duplicate.
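Because duplicates with different file names are not detected, you may want to check for them locally before submitting a batch. The following is a minimal sketch that is not part of the Document Automation platform; the function name find_identical_files is hypothetical. It compares SHA-256 hashes of the file contents, so it only catches byte-identical copies. Two separate scans of the same page will normally differ at the byte level and will not be flagged.

```python
import hashlib
from collections import defaultdict


def find_identical_files(contents_by_name):
    """Group file names whose contents are byte-for-byte identical.

    contents_by_name maps a file name to its raw bytes.
    Returns one sorted list of names per group of duplicates.
    """
    names_by_digest = defaultdict(list)
    for name, data in contents_by_name.items():
        names_by_digest[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in names_by_digest.values() if len(names) > 1]


batch = {
    "intake_a.pdf": b"same bytes",
    "intake_copy.pdf": b"same bytes",
    "intake_b.pdf": b"different bytes",
}
print(find_identical_files(batch))
```

For files on disk, the bytes can be read with pathlib.Path(path).read_bytes().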

What if some of my filled-in forms are just a little bit different from the template?

To achieve the best digitization results, a clean blank form should be uploaded as a template and defined with fields. The fields on the template map to where the data should be pulled from on the submitted form.

If the fields on the form you wish to digitize are in different locations than on your template, the results will be poor because of inaccurate extraction or missing data.

Document Automation lets you add versions to a template so that you can extract exactly the data you need. Find a blank form (preferable), or white out a filled-in one (back-up option), and upload it as a new version of a template in the Document Automation platform. Document Automation, through VISION, can then sort the form to the specific template.