Derive field

The Derive field operation creates a new field that combines the values of other fields. The new string will be made up of substrings, which may be constants (strings of any characters), values from other columns, or new data added based on a rule. For example, you can use this operation to take a customer's first and last names from two different columns and place the combined full name into one column. For more information, see Operations.

Configuration

  1. In the New column name field, specify the column into which combined data should be written. You can either provide a name for a new column of specify the name of an existing column.

    If you specify the name of an existing column, the operation will overwrite its data.

  2. In the New column value field, specify the substrings to be copied into the column you specified above. The operation will pick out the specified substrings and place them into that column.

    To copy a substring, you must also specify its Type and Value (see the table below for details). To add or remove a substring, click the plus or minus icon.

    The Preview area will display the result of running the configured operation on the first row.

    Type Description
    Field

    Copies data from a field.

    Value: The name of a field from which to copy data.

    Constant

    Adds a constant.

    Value: Any characters (space, digits, letters, etc.).

    Case

    Adds data based on a rule specified for a field.

    Value: Select a column name and specify a rule for adding a new substring.

    Case is an if-elseif-else statement. For each condition, specify Contains, Starts with, or Ends with, and specify a new string after Then. This string will be placed into the new field if the original value meets the condition.

    Click Add condition to branch an if-elseif-else statement. See Using type Case for details.

  3. If required, use a filter and preview the results of the operation. For details, see Using filters with operations.

    To preview the results that you can obtain with the operation, click Run preview. The application will show you how the first 100 rows in your table will look like after the operation is executed. Please note that it may take some time to generate a preview.

  4. Click Save.

    The Edit Operation window will close and the new operation will appear in the Operations pane.

Configuration examples

Using types Field and Constant

Task: Combine fields with first and last names into a new column named Employee.

Name Surname

Employee

John

Smith

John Smith

Solution: Under New column value, specify the substrings to be copied: value Name of type Field, value space of type Constant, and value Surname of type Field (in that order). The space constant will separate the name from the surname, so that you don't get JohnSmith, for example.

Using type Case

Task 1: Add a new column that will contain the status of an insurance policy — valid or invalid. The Insurance column contains some code. If a code starts with 100 or 215, the insurance policy is invalid. Any codes starting with 349 are valid. Other codes require additional verification.

Solution: Specify the above conditions and for any insurance policies that do not start with 349, 215 or 100, write Verification required in the new column.

Result: The table below shows the original data and the results obtained by running the operation.

Insurance Status
3491534

Valid

44776

Verification required

100132

Invalid

Task 2: Add a new column named Client that will contain a de-identified string instead of a client's name.

Solution: Create a new operation and specify Client for the name of the column. Under New column value, select the value Name of type Case (this is the column that contains clients' names). Then click Add condition, specify a Contains . condition, and specify Personal information as the string to be written into the new column. Now if the Name field contains at least one character, its value will be changed to Personal information and this new value will be written into the new column.

In the Contains . condition, the "." stands for any character. For more information, see Operators.