Collections

A collection groups several data items into a single item. Typically, a collection is used to retrieve a large number of different pieces of information from a business object in a single transaction. Once populated, the information in a collection can be accessed sequentially using a Loop stage.

It is convenient to think of the contents of a collection as a table containing rows and columns. The columns are referred to as fields. The names given to fields are supplied by the user, or they may be defined within the corresponding business object(s).

The fields are accessed within expressions using 'dot' syntax: the collection name, followed by a dot, followed by the field name. For example, for a field "Name" inside a collection defined in a stage named "Person", you can access the field in a Calculation stage or a Decision stage with the reference [Person.Name].

A field name cannot contain square brackets or dots, to ensure that there are no collisions between the structure of the collection being referenced and the names.
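
As a rough illustration outside Blue Prism, the dot syntax and the naming rule can be modelled in Python. This is only a sketch under stated assumptions: the helpers `resolve` and `is_valid_field_name` and the sample data are hypothetical, and each collection is simplified to its current row.

```python
# Hypothetical model: each collection is represented only by its current row,
# stored as a dict mapping field names to values.
current_rows = {"Person": {"Name": "Alice", "Age": 30}}


def is_valid_field_name(name):
    """Field names may not contain square brackets or dots."""
    return not any(ch in name for ch in "[].")


def resolve(reference, rows):
    """Resolve a single-level reference such as '[Person.Name]'."""
    if not (reference.startswith("[") and reference.endswith("]")):
        raise ValueError(f"not a bracketed reference: {reference!r}")
    # Split on the first dot: collection name before it, field name after it.
    collection, field = reference[1:-1].split(".", 1)
    return rows[collection][field]


name = resolve("[Person.Name]", current_rows)
```

Because the dot separates the collection name from the field name, a field called "First.Name" would be ambiguous in this scheme, which is why `is_valid_field_name` rejects it.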

Defined vs. Undefined

When the names and data types of the fields are defined explicitly in the collection stage properties form, the collection is said to be defined. This field definition may be the invention of the user, or it may have been predetermined by the designer of a business object with which the collection is used.

Conversely, if the fields are not defined in the properties form, the collection is said to be undefined. This situation arises frequently when interfacing with systems containing data whose structure can vary. An example is fetching the data contained in a spreadsheet (which is in tabular form): if nothing is known about the columns of the table in the spreadsheet, then no labels or data types can be attached to the columns (or fields) of the collection in advance. The fields of the collection will only be available at runtime.

Nested collections

A collection can contain collections within itself. Again, they can be defined within the collection stage or undefined until populated with data.

Nested collections can be referenced in expressions using further levels of the 'dot' syntax described above. For example, if a stage named 'Person' had a collection named 'Qualifications', which had three fields 'Type', 'Name' and 'Grade', the fields could be accessed using the references [Person.Qualifications.Type], [Person.Qualifications.Name] and [Person.Qualifications.Grade].

Likewise, a Loop Start stage can loop directly over nested collections using the same syntax. To loop over a person's qualifications in the above example, you would set the collection in the Loop Start stage to Person.Qualifications. This generally only makes sense inside a loop that is iterating over the Person collection.
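
The nested loop described above can be pictured with ordinary Python data structures. This is a sketch only: the sample data and the `list_grades` function are hypothetical, and a nested collection is modelled as a list of rows stored in a field of the outer row.

```python
# Hypothetical data: a 'Person' collection whose 'Qualifications' field
# holds a nested collection (a list of rows) for each person.
person = [
    {"Name": "Alice", "Qualifications": [
        {"Type": "Degree", "Name": "Physics", "Grade": "2:1"},
        {"Type": "A-Level", "Name": "Maths", "Grade": "A"},
    ]},
    {"Name": "Bob", "Qualifications": [
        {"Type": "Degree", "Name": "History", "Grade": "1st"},
    ]},
]


def list_grades(people):
    """Collect (person, qualification, grade) for every nested row."""
    results = []
    for person_row in people:                      # outer loop over Person
        for qual in person_row["Qualifications"]:  # inner loop over Person.Qualifications
            results.append((person_row["Name"], qual["Name"], qual["Grade"]))
    return results


grades = list_grades(person)
```

The inner loop depends on the outer loop's current row, which mirrors why looping over Person.Qualifications generally belongs inside a loop over Person.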

Note however that the internal collections business object does not support nested collections – they must first be moved into a collection stage before being referenced in an internal action.

The Current Row

The rows of a collection must be accessed using a loop stage. The loop stage automatically updates the current row of the collection, moving from one row to the next, in order. This change occurs each time the Loop End stage is encountered. Before you enter a loop stage the collection will be on the first row. The loop stage will continue to iterate through the rows until the last row in the collection has been reached. At this point the loop will exit and there will be no current row. Trying to access a collection when there is no current row results in an error. In order to access the data again, some action must be taken to cause the "current row" to be set once more. Possibilities include:

  • entering a new loop;
  • adding a new row to the collection (which then becomes the current row, ready to be populated with data).
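
The current-row behaviour described in this section can be approximated in Python as follows. This is a minimal sketch, not Blue Prism's implementation: the `Collection` class, its `loop` generator and `NoCurrentRowError` are hypothetical names chosen for illustration.

```python
class NoCurrentRowError(Exception):
    """Raised when a field is read while the collection has no current row."""


class Collection:
    def __init__(self, rows):
        self.rows = list(rows)
        # Before any loop, the collection sits on its first row (if it has one).
        self.current = 0 if self.rows else None

    def get(self, field):
        """Read a field from the current row, or fail if there is none."""
        if self.current is None:
            raise NoCurrentRowError("collection has no current row")
        return self.rows[self.current][field]

    def loop(self):
        """Mimic Loop Start / Loop End: yield once per row, advancing at the end."""
        self.current = 0 if self.rows else None  # Loop Start: first row
        while self.current is not None:
            yield self
            # Loop End: move to the next row, or clear the current row
            # once the last row has been visited.
            nxt = self.current + 1
            self.current = nxt if nxt < len(self.rows) else None


people = Collection([{"Name": "Alice"}, {"Name": "Bob"}])
names = [c.get("Name") for c in people.loop()]
```

After the loop finishes, `people.get("Name")` raises `NoCurrentRowError` in this model; calling `people.loop()` again sets the current row back to the first row.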

Manipulating Collections

Collections can be manipulated using the "Internal – Collections" business object. For more information about internal business objects see the Internal business objects overview. The following actions all require the name of the collection to be supplied as an input parameter. This name is of data type text, rather than of type collection. The collection named must be accessible (i.e. not located on another page and hidden) in order to be used.

  • Add Row – This action will add a new row to the collection. The new row will become the current row, ready to be populated with new data.
  • Remove Row – This action will remove the current row from the collection. After a row has been removed there will be no current row. When a row is removed in the middle of a loop iteration, the loop will continue to (what would have been) the next row when it reaches the loop end stage. If an attempt is made in the meantime to access the current row then an error will occur.
  • Remove All Rows – This action will remove all rows from the collection. After the rows have been removed, there will be no current row. When all rows are removed in the middle of a loop iteration, the loop will not continue to a new row when the loop end stage is reached. If an attempt is made in the meantime to access the current row then an error will occur.
  • Count rows – This action will get the number of rows within the collection. It differs from the other actions in that it has an additional output parameter called "count", which must be mapped to a data item to get the number of rows.
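
The four actions above, and their effect on the current row, can be sketched in Python. This is an illustrative model under assumptions, not the internal business object itself: the class, method names and the `_resume_at` bookkeeping are hypothetical, and the behaviour of adding rows in the middle of a loop is not modelled.

```python
class Collection:
    def __init__(self, rows=()):
        self.rows = [dict(r) for r in rows]
        self.current = 0 if self.rows else None
        self._resume_at = None  # where a loop resumes after a row is removed

    def add_row(self):
        """Add Row: the new (empty) row becomes the current row."""
        self.rows.append({})
        self.current = len(self.rows) - 1

    def remove_row(self):
        """Remove Row: delete the current row, leaving no current row."""
        if self.current is None:
            # Assumption: removing without a current row is treated as an error.
            raise RuntimeError("no current row")
        del self.rows[self.current]
        self._resume_at = self.current  # the following row slid into this index
        self.current = None

    def remove_all_rows(self):
        """Remove All Rows: empty the collection, leaving no current row."""
        self.rows.clear()
        self.current = None
        self._resume_at = None

    def count_rows(self):
        """Count rows: the value returned through the 'count' output."""
        return len(self.rows)

    def loop(self):
        self._resume_at = None
        self.current = 0 if self.rows else None
        while self.current is not None:
            yield self.rows[self.current]
            # Loop End: advance, or resume where a removed row used to be.
            if self.current is not None:
                nxt = self.current + 1
            else:
                nxt = self._resume_at
            self._resume_at = None
            in_range = nxt is not None and nxt < len(self.rows)
            self.current = nxt if in_range else None


staff = Collection([{"Name": "Alice"}, {"Name": "Bob"}, {"Name": "Carol"}])
seen = []
for row in staff.loop():
    seen.append(row["Name"])
    if row["Name"] == "Bob":
        staff.remove_row()  # the loop still goes on to Carol
```

In this model, removing "Bob" mid-loop leaves two rows and the loop still visits "Carol", matching the description of Remove Row; removing all rows mid-loop ends the loop at the next Loop End.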

Single Row Collections

A 'single row' collection always has one row – and that row is automatically set as the current row. The row manipulation actions in the Collections business object will raise an error if they are called on a single row collection, and Count rows will always return 1.

Examples

Undefined Collections

You wish to retrieve the entire contents of a Microsoft Excel worksheet.

  1. Preparation

    After ensuring that the CommonAutomation business object, CommonAutomation.clsExcel, has been installed, you open Process Studio and add the following stages:

    • A collection stage – this is the collection that we will populate.
    • Two action stages – only one of these is of interest to the discussion; the other is a practical necessity.
  2. Configuring the action to retrieve the data

    We must populate the collection using an action stage. Open the properties page of the second of the two action stages. For the business object choose 'Microsoft Office – Excel Actions' from the drop-down menu (if this does not appear then the business object has not been successfully installed); for the action choose 'Get WorkSheet as collection'. On the outputs tab, drag and drop the name of your collection stage onto the output named 'Col1'. This instructs the action to place the data it collects into the collection you chose.

  3. Handling Preconditions

    Before the process will function, we must fulfil all of the preconditions specified by the action stage. Observe that on the preconditions tab in the current action properties it states "A workbook must be active". This is the reason for the other action stage. Open the properties for that action stage and choose the action 'Open Workbook' from the business object 'Microsoft Office – Excel Actions'. Enter the path of an existing Microsoft Excel workbook.

  4. Finishing off

    Finally, join up the action stages using links (no need to join the collection stage to anything) and when your process runs, the collection will be populated once the second stage has been run.
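
To make the idea of fields that are only known at runtime concrete, here is a small Python sketch. It does not use the Excel business object: CSV text stands in for the worksheet, and the function name `worksheet_to_collection` and the sample data are hypothetical.

```python
import csv
import io


def worksheet_to_collection(text):
    """Read tabular text whose column names are not known in advance."""
    reader = csv.DictReader(io.StringIO(text))  # first line supplies the field names
    return [dict(row) for row in reader]


# Hypothetical worksheet contents; nothing about the columns is declared up front.
sheet = "Name,City\nAlice,Leeds\nBob,York\n"
rows = worksheet_to_collection(sheet)

# The field names become available only after the data has been loaded.
fields = list(rows[0].keys()) if rows else []
```

As with an undefined collection, no field names or data types are declared beforehand; every value arrives as text and the structure is discovered from the data.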

Defined Collections

You wish to retrieve the entire contents of a table which has field headers that do not change.

  1. Preparation

    After ensuring that a business object that returns a collection is available, you open Process Studio and add the following stages:

    • A collection stage – this is the collection that we will populate.
    • One action stage – this is the stage that will populate our collection.
  2. Defining the collection

    Open the collection stage properties. On the fields tab, click the add fields button to add fields that exactly match the fields of the collection returned by the action. Choose the data type for each field as well; these must also exactly match the returned collection.

  3. Configuring the action to retrieve the data

    We must populate the collection using an action stage. Open the properties page of the action stage. Choose the business object action that returns a collection, and on the outputs tab set its output to the collection stage named 'Coll1'.

  4. Finishing off

    Finally, join up the action stage using links (no need to join the collection stage to anything). When your process runs, the collection will be populated from the action stage onwards.
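
The requirement that field names and data types match exactly can be illustrated with a short Python check. This is a sketch only: `DEFINED_FIELDS` and `matches_definition` are hypothetical, and Python types stand in for Blue Prism data types.

```python
# Hypothetical field definition: field name -> Python type standing in
# for the data type chosen on the fields tab.
DEFINED_FIELDS = {"Name": str, "Age": int}


def matches_definition(rows, definition):
    """Return True only if every row has exactly the defined fields and types."""
    for row in rows:
        if set(row) != set(definition):
            return False  # a field is missing, extra, or named differently
        for name, expected_type in definition.items():
            if not isinstance(row[name], expected_type):
                return False  # the value does not have the defined type
    return True


good = [{"Name": "Alice", "Age": 30}]
wrong_type = [{"Name": "Alice", "Age": "30"}]
ok = matches_definition(good, DEFINED_FIELDS)
```

Here `wrong_type` fails the check because "Age" arrives as text rather than a number, which is the kind of mismatch a defined collection is meant to rule out.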