Aggregations

Overview

Aggregations let you gather information from databases and prepare combined data sets for data processing. In Process Intelligence, aggregations are useful when a project requires only a small subset of the information in a large data set. To use the data space in a project efficiently, you can upload data, create the required metric calculations on the current timelines, and then delete those timelines. This way, calculated metric data for larger data sets remains available without the underlying data always being kept in the project.

Aggregations are based on already created metrics. For details on how to create metrics, see Metrics.

Configuration

  1. Click the icon > Project configuration > Aggregations.

    The Aggregations editor displays.

  2. Click Add metrics for the calculation. Only existing metrics can be used. Multiple metrics can be selected at the same time.

    Some metrics cannot be aggregated due to their initial settings. These are derived metrics that use the following aggregator functions: standard deviation, 90th percentile, 75th percentile, 25th percentile, 10th percentile, and median.

  3. Click Add dimensions (Event and Attribute pairs) for the calculation and provide a name for it. Multiple dimensions can be selected at the same time.
  4. If required, remove any unused metrics or dimensions from the list by clicking the delete icon. You can delete multiple metrics and/or dimensions at the same time.
  5. Define the time resolution from hourly to monthly. This sets the timestamp step that is used to build the Aggregation Data table. For more information, see Aggregation data and examples.
  6. Click Save to save the configuration. The calculation then runs automatically during the next data upload to the project.
  7. To retrieve the calculated table of results, click Save and recalculate. A complex calculation can take a long time, so if necessary you can cancel it by clicking Abort.

    Aborting calculations may cause the loss of the already aggregated data. If so, a warning message displays.

  8. To display the calculation results, click See calculated data.

    The Aggregation Data window displays. For more information, see Aggregation data and examples.

  9. If you delete the uploaded timelines, the calculated data will be retained. Aggregated data from further uploads will be calculated automatically and added to an existing table. To see an updated table, upload data to the project, open the Aggregations editor, and click See calculated data.
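
The note in step 2 lists aggregator functions that cannot be aggregated. One plausible reason, stated here as an assumption and not as a description of how Process Intelligence works internally, is that sums, counts, minimums, and maximums can be combined from stored partial results, while medians and percentiles need the original values, which are gone once timelines are deleted. The sketch below uses hypothetical values and helper names (`summarize`, `merge`) to illustrate the difference:

```python
from statistics import median

# Hypothetical metric values from two separate batches of data.
first_batch = [1, 2, 3, 100]
second_batch = [4, 5, 6]

def summarize(values):
    """Reduce raw values to a small summary that can be stored on its own."""
    return {
        "sum": sum(values),
        "count": len(values),
        "min": min(values),
        "max": max(values),
    }

def merge(a, b):
    """Combine two stored summaries without access to the raw values."""
    return {
        "sum": a["sum"] + b["sum"],
        "count": a["count"] + b["count"],
        "min": min(a["min"], b["min"]),
        "max": max(a["max"], b["max"]),
    }

# Merging stored summaries gives the same result as summarizing all raw data.
merged = merge(summarize(first_batch), summarize(second_batch))
direct = summarize(first_batch + second_batch)
print(merged == direct)

# A median cannot be rebuilt this way: combining per-batch medians
# does not, in general, give the median of the combined data.
median_of_medians = median([median(first_batch), median(second_batch)])
true_median = median(first_batch + second_batch)
print(median_of_medians, true_median)
```

The same limitation applies to percentiles and, unless extra intermediate sums are stored, to standard deviation.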

Aggregation data and examples

The Aggregation Data window displays the calculation result in a table view. The table contains one default column with event timestamps, which depend on the selected time resolution. The other columns represent the metrics and dimensions added to the aggregation. For each timestamp, you can see the corresponding metric or attribute values.

The data is stored even after timelines are deleted, so new uploads expand the Aggregation Data table by adding values to the existing columns.
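
As a rough illustration of how a time resolution shapes such a table, the following sketch truncates event timestamps to the start of their bucket and computes a count and a sum per bucket and dimension value. It is a simplified model, not the product's implementation: the function names, the event tuples, and the choice of "hourly", "daily", and "monthly" as resolution labels are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

def bucket_start(ts, resolution):
    """Truncate a timestamp to the start of its bucket (hypothetical labels)."""
    if resolution == "hourly":
        return ts.replace(minute=0, second=0, microsecond=0)
    if resolution == "daily":
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if resolution == "monthly":
        return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported resolution: {resolution}")

def aggregate(events, resolution):
    """Group (timestamp, dimension, value) events into table rows.

    Each row is keyed by (bucket timestamp, dimension value) and holds
    two simple metrics: a count and a sum.
    """
    rows = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for ts, dimension, value in events:
        key = (bucket_start(ts, resolution), dimension)
        rows[key]["count"] += 1
        rows[key]["sum"] += value
    return dict(rows)

# Hypothetical events: timestamp, attribute value used as a dimension, amount.
events = [
    (datetime(2024, 1, 5, 9, 15), "Approved", 10.0),
    (datetime(2024, 1, 5, 9, 45), "Approved", 5.0),
    (datetime(2024, 1, 5, 14, 0), "Rejected", 2.0),
    (datetime(2024, 2, 1, 8, 30), "Approved", 7.0),
]

table = aggregate(events, "daily")
for (bucket, dimension), metrics in sorted(table.items()):
    print(bucket.isoformat(), dimension, metrics["count"], metrics["sum"])
```

A coarser resolution produces fewer rows and a smaller table, while a finer one keeps more detail for later charts.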

Example:

  1. Upload data collected during a month, for example, 1st - 31st January.
  2. Create some metrics.
  3. Add them to aggregations.
  4. Click Save and recalculate to retain the results.
  5. Delete all timelines from the project.
  6. Upload another month's worth of data, for example, 1st - 28th February.
  7. Aggregations are calculated automatically using the newly uploaded data for metrics. Aggregation Data will display values from the 1st of January to the 28th of February.

    You can continue the chain of monthly uploads to obtain a focused long-term analysis.

You can create charts from aggregations that cover two months' worth of data, because the required data from the first month is retained even though the original timelines are deleted. These charts can be displayed on a dashboard for long-term analysis.

Make sure to upload data for successive time periods. When new data is uploaded, any time periods that intersect with existing aggregated data are overwritten by the latest upload. If the new data set contains a record with a timestamp earlier than the earliest timestamp of the previous upload, all of the aggregated data is overwritten.
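
The upload rules above can be modeled with a small sketch. It assumes that aggregated data is a mapping from a time bucket to a value and that "intersecting" means sharing a bucket. The function `apply_upload` and its parameters are hypothetical and only mirror the behavior described in this section.

```python
from datetime import date

def apply_upload(existing, upload, previous_earliest):
    """Return the retained table after one upload (simplified model).

    existing, upload: dicts mapping a bucket date to an aggregated value.
    previous_earliest: earliest timestamp of the previous upload, or None
    if there was no previous upload.
    """
    if not upload:
        return dict(existing)
    if previous_earliest is not None and min(upload) < previous_earliest:
        # The new data starts before the previous upload:
        # all previously aggregated data is replaced.
        return dict(upload)
    merged = dict(existing)
    merged.update(upload)  # intersecting buckets take the latest values
    return merged

january = {date(2024, 1, 1): 10, date(2024, 1, 2): 12}
february = {date(2024, 2, 1): 7}
overlap = {date(2024, 2, 1): 9, date(2024, 2, 2): 3}
too_early = {date(2023, 12, 31): 1}

after_january = apply_upload({}, january, None)
after_february = apply_upload(after_january, february, min(january))
after_overlap = apply_upload(after_february, overlap, min(february))
after_early = apply_upload(after_overlap, too_early, min(overlap))

print(len(after_february))  # January and February rows are both kept
print(after_overlap[date(2024, 2, 1)])  # value from the latest upload
print(len(after_early))  # earlier aggregated data is gone
```

In this model, uploading months in chronological order only ever appends rows, which is why successive uploads are the safe pattern.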

Aggregations with the Alteryx connector

Aggregations can be used as the source of an Alteryx data export. Alteryx requires metric values in different breakdowns, and it is too resource-consuming to calculate data on every single export, so this data can be retained as an aggregation. For example:

  1. Upload data to a project from Alteryx.
  2. Add aggregation functions created in Process Intelligence.

    Alteryx exports only simple metrics: sum, min, max, and count.

  3. Saved aggregation metrics are exported from the project with the Alteryx connector for further analysis or display in third-party applications.

For more information, see Connect to Alteryx as a data source.