Data Mapping cfxdm - dm:mergecolumns

Data merge from multiple columns to a single column

dm:mergecolumns: This cfxdm function selects multiple columns by regular expression, using the include and exclude options, and merges them into a single target column.

dm:mergecolumns syntax:

  • include (Mandatory): Regular expression that selects the column or columns to include.

  • exclude (Optional): Regular expression that selects the column or columns to exclude.

  • to (Mandatory): Name of the target column that receives the merged data from the selected columns.

Note: After the selected columns are merged into the target column, the source columns are removed from the final output.
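The behavior described above can be modeled with a short Python sketch. This is an illustration only, not the cfxdm implementation: the function name merge_columns, the "_" separator, the use of string concatenation to combine values, the re.search matching mode, and the flow.bytes column are all assumptions, since this page does not state how values are combined or how patterns are anchored.

```python
import re


def merge_columns(rows, include, to, exclude=None, sep="_"):
    """Illustrative model of include/exclude/to; not the cfxdm code."""
    inc = re.compile(include)
    exc = re.compile(exclude) if exclude else None
    merged_rows = []
    for row in rows:
        # Select columns whose names match `include` and do not match `exclude`.
        # ASSUMPTION: patterns are applied with re.search semantics.
        selected = [
            col for col in row
            if inc.search(col) and not (exc and exc.search(col))
        ]
        # Keep only the unselected columns: source columns are removed.
        out = {col: val for col, val in row.items() if col not in selected}
        # ASSUMPTION: values are combined by joining them as strings with `sep`.
        out[to] = sep.join(str(row[col]) for col in selected)
        merged_rows.append(out)
    return merged_rows


rows = [
    {
        "flow.client_addr": "10.95.133.42",
        "flow.server_addr": "10.95.133.40",
        "flow.service_port": "9092",
        "flow.bytes": "512",  # hypothetical extra column, kept via `exclude`
    },
]

result = merge_columns(
    rows, include=r"flow\..*", exclude=r"flow\.bytes", to="flow.unique_id"
)
print(result)
```

In this sketch the exclude pattern protects flow.bytes from the broad include pattern, so it survives in the output while the three merged source columns are dropped.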

Example 1:

Step 1: Create an empty pipeline named dm_mergecolumns_example_1 using AIOps studio, as shown in the screenshot below.

Step 2: Add the following pipeline code to the pipeline created in Step 1, as shown in the screenshot below:

You can copy the code below into your pipeline and execute it in your environment.

##### This pipeline creates a set of records/dataset details coming from a netflow
##### environment.
##### This dataset includes flow.client_addr, flow.server_addr, flow.service_port as
##### source columns. This pipeline uses dm mergecolumns to merge the source columns
##### to a single target column
@dm:empty
    --> @dm:addrow flow.client_addr = '10.95.133.42' & flow.server_addr = '10.95.133.40' & flow.service_port = '9092'
    --> @dm:addrow flow.client_addr = '10.95.122.170' & flow.server_addr = '10.95.122.169' & flow.service_port = '9092'
    --> @dm:addrow flow.client_addr = '10.95.122.170' & flow.server_addr = '10.95.122.169' & flow.service_port = '9092'
    --> @dm:addrow flow.client_addr = '10.95.122.170' & flow.server_addr = '10.95.122.169' & flow.service_port = '9092'
    --> @dm:addrow flow.client_addr = '10.95.122.205' & flow.server_addr = '10.95.122.212' & flow.service_port = '9300'
    --> @dm:addrow flow.client_addr = '10.95.117.35' & flow.server_addr = '10.95.117.37' & flow.service_port = '686'
    --> @dm:addrow flow.client_addr = '10.95.122.105' & flow.server_addr = '10.95.122.212' & flow.service_port = '443'
    --> @dm:addrow flow.client_addr = '10.95.122.105' & flow.server_addr = '10.95.122.212' & flow.service_port = '9300'
    --> @dm:addrow flow.client_addr = '10.95.122.121' & flow.server_addr = '10.95.122.212' & flow.service_port = '9092'
    --> @dm:addrow flow.client_addr = '10.95.122.109' & flow.server_addr = '10.95.122.212' & flow.service_port = '8443'
    --> @dm:mergecolumns include = 'flow.client_addr|flow.server_addr|flow.service_port' & to = 'flow.unique_id'
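Because include is a regular expression, the dots in the pattern above act as single-character wildcards rather than literal dots. The Python sketch below shows the difference using the standard re module. It assumes the pipeline's pattern matching behaves like Python's re.search, which this page does not confirm, and the column name flow_client_addr is a hypothetical added only to show the effect.

```python
import re

columns = [
    "flow.client_addr",
    "flow.server_addr",
    "flow.service_port",
    "flow_client_addr",  # hypothetical column, not part of the example dataset
]

# Pattern exactly as written in the pipeline: "." matches any character.
loose = re.compile(r"flow.client_addr|flow.server_addr|flow.service_port")
# Escaped variant: "\." matches only a literal dot.
strict = re.compile(r"flow\.(client_addr|server_addr|service_port)")

loose_hits = [col for col in columns if loose.search(col)]
strict_hits = [col for col in columns if strict.search(col)]

print(loose_hits)   # includes flow_client_addr
print(strict_hits)  # only the three dotted column names
```

For the example dataset both patterns select the same three columns, so escaping matters only when similarly named columns exist.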

Step 3: Click the Verify button to confirm that the pipeline syntax and code are correct (as shown below).

Step 4: Click the Execute button to run the pipeline. RDA executes the pipeline without any errors (as shown below).

Step 5: RDA uses the dm:mergecolumns function to merge the specified columns into the target column and prints the resulting output, as shown in the following screenshot.

dm:mergecolumns is useful when rows in a dataset need a unique key: combining a couple of columns into one column produces that key.
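As a rough illustration of the uniqueness use case, the Python sketch below builds a combined key from a few of the example flows and counts how often each key occurs. It is a sketch under assumptions, not RDA output: the "_" separator and plain string concatenation are hypothetical choices, since this page does not specify the format of the merged value.

```python
from collections import Counter

# A subset of the example rows: (client_addr, server_addr, service_port).
flows = [
    ("10.95.133.42", "10.95.133.40", "9092"),
    ("10.95.122.170", "10.95.122.169", "9092"),
    ("10.95.122.170", "10.95.122.169", "9092"),
    ("10.95.122.170", "10.95.122.169", "9092"),
    ("10.95.122.105", "10.95.122.212", "443"),
    ("10.95.122.105", "10.95.122.212", "9300"),
]

SEP = "_"  # ASSUMPTION: hypothetical separator for the combined key

# Combine the three columns of each row into one key.
keys = [SEP.join(flow) for flow in flows]

# Count occurrences per key; repeated flows collapse onto the same key.
key_counts = Counter(keys)
unique_keys = list(key_counts)

print(len(flows), "rows ->", len(unique_keys), "unique keys")
```

The two flows that share client and server addresses but differ in port stay distinct, which is why all three columns are combined into the key.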
