Optical character recognition (OCR)

Blue Prism provides several OCR capabilities for on-screen text:

Native character recognition via surface automation

Where the Blue Prism Application Modeller cannot identify application elements directly (for example, because the applications to be automated are not running on the same machine as Blue Prism), a technique called surface automation can be used to create an image of the application screen. Such applications can then be modelled using screen regions, image matching, and character recognition.

Native character recognition, which is based on font matching, is available through the Recognise Text action in a Read stage when used against a previously captured Application Modeller region. The action extracts text from the region and stores it in a Data item. Its input parameters are font, foreground color, and background color.

Native character recognition requires the font to be generated before it can be used. For more details, see Fonts.

Tesseract OCR

In some situations the native character recognition engine is not appropriate for interacting with on-screen text, for example, where smoothed text is enforced, or when working with scanned or otherwise restricted copies of electronic documents. In these cases, Blue Prism can use an embedded Tesseract OCR engine, which recognizes text using pattern matching and complex, language-based text recognition.

To maximize the effectiveness of the text recognition, a minimum of 300 dots per inch (dpi) is required. For images where the dpi is lower than this, such as on-screen text, a Scale parameter artificially increases the size of the captured region before it is passed to the engine. Generally, setting the scale factor to 4 or 5 provides successful results.
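The arithmetic behind the scale factor can be sketched as follows. This is an illustration only, not part of Blue Prism: the function names are hypothetical, and the 96 dpi screen value is an assumption about a typical display rather than something stated above.

```python
import math

TARGET_DPI = 300  # minimum resolution stated in the text above


def minimum_scale(source_dpi, target_dpi=TARGET_DPI):
    """Return the smallest whole-number scale that reaches target_dpi."""
    if source_dpi <= 0:
        raise ValueError("source_dpi must be positive")
    return max(1, math.ceil(target_dpi / source_dpi))


def effective_dpi(source_dpi, scale):
    """Resolution the engine effectively sees after scaling the region."""
    return source_dpi * scale


if __name__ == "__main__":
    screen_dpi = 96  # assumption: a common display setting
    scale = minimum_scale(screen_dpi)
    print(scale, effective_dpi(screen_dpi, scale))
```

Under these assumptions a 96 dpi capture needs a factor of 4 and a 72 dpi capture needs a factor of 5, which is consistent with the guidance to use 4 or 5.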

The Tesseract OCR engine is available through the Read Text with OCR action in a Read stage when used against a previously captured Application Modeller region, and includes options to read text, lists, and grids. It is also possible to output the pre-worked images to a specific diagnostics location, to verify that the scaling being applied is sufficient for the selected region.

Language packs

Language packs for use with Tesseract can be obtained from the internet. Blue Prism works with Tesseract version 4.0.0, and it is imperative that the correct major version of the language files is used with it. Currently, the version 4.0.0 language files can be downloaded from the Tesseract website.

To add support for another language, download the appropriate files and copy them to the Tesseract\tessdata folder (usually C:\Program Files\Blue Prism Limited\Blue Prism Automate\Tesseract\tessdata).

The language files are prefixed with a language code, for example, fra (French), deu (German), jpn (Japanese), chi_tra (Traditional Chinese). Once the files are installed on each of the required devices, this code can be specified in the Language parameter of the Read Text with OCR action within a Read stage to instruct the engine to use the required pack.
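Before specifying a code in the Language parameter, it can help to confirm that the matching pack is present on the device. The sketch below is a hedged illustration, not a Blue Prism feature: it assumes the language files follow Tesseract's usual naming of code.traineddata, and the helper names are hypothetical. The demonstration uses a temporary folder in place of the real tessdata folder.

```python
import tempfile
from pathlib import Path


def installed_languages(tessdata_dir):
    """Return the sorted language codes found in a tessdata folder."""
    folder = Path(tessdata_dir)
    # Assumption: each pack is a file named <code>.traineddata
    return sorted(p.stem for p in folder.glob("*.traineddata"))


def missing_languages(tessdata_dir, required):
    """Return the required codes that have no pack in the folder."""
    installed = set(installed_languages(tessdata_dir))
    return [code for code in required if code not in installed]


if __name__ == "__main__":
    # Stand-in for the real tessdata folder, populated with empty files.
    with tempfile.TemporaryDirectory() as folder:
        for name in ("fra.traineddata", "deu.traineddata"):
            (Path(folder) / name).write_bytes(b"")
        print(installed_languages(folder))
        print(missing_languages(folder, ["fra", "jpn"]))
```

In practice the folder argument would be the tessdata path given above, checked on each device that runs the process.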

Page segmentation mode

The Read Text with OCR action within a Read stage has an optional text parameter Page Segmentation Mode, allowing a Tesseract-defined value to be specified. The values which can be entered in this parameter are shown below, along with a brief description of their action.

If no value is entered for the Page Segmentation Mode, then the default value of Auto will be used.

Orientation and script detection (OSD) only.

Automatic page segmentation with OSD.

Automatic page segmentation, but no OSD or OCR.

Fully automatic page segmentation, but no OSD. (Default)

Assume a single column of text of variable sizes.

Assume a single uniform block of vertically aligned text.

Assume a single uniform block of text.

Treat the image as a single text line.

Treat the image as a single word.

Treat the image as a single word in a circle.

Treat the image as a single character.

Find as much text as possible in no particular order.

Sparse text with OSD.

Treat the image as a single text line, bypassing workarounds that are Tesseract-specific.
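For cross-referencing with the Tesseract documentation, the modes listed above correspond, in the same order, to Tesseract's numeric page segmentation modes 0 to 13. The sketch below records that mapping and shows how a mode and language code would be passed to the standalone tesseract command-line tool. It is an illustration only: it does not describe how Blue Prism calls its embedded engine, the numbers are Tesseract's identifiers and not necessarily the literal values the Page Segmentation Mode parameter accepts, and the function name and image file name are hypothetical.

```python
# Tesseract's numeric identifiers for the modes listed above.
PAGE_SEGMENTATION_MODES = {
    0: "Orientation and script detection (OSD) only",
    1: "Automatic page segmentation with OSD",
    2: "Automatic page segmentation, but no OSD or OCR",
    3: "Fully automatic page segmentation, but no OSD (default)",
    4: "Assume a single column of text of variable sizes",
    5: "Assume a single uniform block of vertically aligned text",
    6: "Assume a single uniform block of text",
    7: "Treat the image as a single text line",
    8: "Treat the image as a single word",
    9: "Treat the image as a single word in a circle",
    10: "Treat the image as a single character",
    11: "Find as much text as possible in no particular order",
    12: "Sparse text with OSD",
    13: "Treat the image as a single text line (raw line)",
}

DEFAULT_PSM = 3  # fully automatic page segmentation, no OSD


def tesseract_command(image_path, language="eng", psm=DEFAULT_PSM):
    """Build an argument list for the standalone tesseract CLI."""
    if psm not in PAGE_SEGMENTATION_MODES:
        raise ValueError(f"unknown page segmentation mode: {psm}")
    # "stdout" as the output base makes tesseract print the text.
    return ["tesseract", str(image_path), "stdout",
            "-l", language, "--psm", str(psm)]


if __name__ == "__main__":
    # Hypothetical example: a captured region holding one line of French.
    print(tesseract_command("region.png", language="fra", psm=7))
```

Running the same captured image through the command-line tool with different modes is one way to judge which mode suits a region before setting the parameter in the Read stage.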

For further information on segmentation modes, consult the official Tesseract documentation on the Tesseract website.