Data Mapping cfxdm - dm:mergecolumns

Data merge from multiple columns to a single column

dm:mergecolumns: This cfxdm function selects multiple columns by regular expression, using the include and exclude options, and merges them into a single target column.

dm:mergecolumns syntax:

  • include (Mandatory): Regular expression matching the column or columns to include in the merge.

  • exclude (Optional): Regular expression matching the column or columns to exclude from the merge.

  • to (Mandatory): Target column that receives the merged data from the selected columns.

Note: After the selected columns are merged into the target column, the source columns are removed from the final output.
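
As a minimal sketch of how include and exclude might be combined, the fragment below assumes a dataset with the same three flow.* columns used in Example 1. The include pattern 'flow.*' is intended to match all three columns, and exclude is intended to drop flow.service_port from that selection, so only the two address columns would be merged. The target column name addr_pair_key is hypothetical, and the exact regular expression matching rules are an assumption, so verify the result in your own environment.

```
##### Sketch only: the target column name addr_pair_key is hypothetical.
##### include is meant to select every flow.* column; exclude is meant to
##### remove flow.service_port from that selection.
@dm:empty
    --> @dm:addrow flow.client_addr = '10.95.133.42' & flow.server_addr = '10.95.133.40' & flow.service_port = '9092'
    --> @dm:mergecolumns include = 'flow.*' & exclude = 'flow.service_port' & to = 'addr_pair_key'
```

Because flow.service_port is excluded from the merge, it would be expected to stay in the output as its own column, while the two address columns are replaced by the target column.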

Example 1:

Step 1: Create an empty pipeline named dm_mergecolumns_example_1 using AIOps studio, as shown in the screenshot below.

Empty pipeline

Step 2: Add the following pipeline code to the pipeline created above, as shown in the screenshot below:

You can copy the code below into your pipeline and execute it in your environment.

##### This pipeline creates a set of records/dataset details coming from a netflow
##### environment.
##### This dataset includes flow.client_addr, flow.server_addr, flow.service_port as
##### source columns. This pipeline uses dm mergecolumns to merge the source columns
##### to a single target column
@dm:empty
    --> @dm:addrow flow.client_addr = '10.95.133.42' & flow.server_addr = '10.95.133.40' & flow.service_port = '9092'
    --> @dm:addrow flow.client_addr = '10.95.122.170' & flow.server_addr = '10.95.122.169' & flow.service_port = '9092'
    --> @dm:addrow flow.client_addr = '10.95.122.170' & flow.server_addr = '10.95.122.169' & flow.service_port = '9092'
    --> @dm:addrow flow.client_addr = '10.95.122.170' & flow.server_addr = '10.95.122.169' & flow.service_port = '9092'
    --> @dm:addrow flow.client_addr = '10.95.122.205' & flow.server_addr = '10.95.122.212' & flow.service_port = '9300'
    --> @dm:addrow flow.client_addr = '10.95.117.35' & flow.server_addr = '10.95.117.37' & flow.service_port = '686'
    --> @dm:addrow flow.client_addr = '10.95.122.105' & flow.server_addr = '10.95.122.212' & flow.service_port = '443'
    --> @dm:addrow flow.client_addr = '10.95.122.105' & flow.server_addr = '10.95.122.212' & flow.service_port = '9300'
    --> @dm:addrow flow.client_addr = '10.95.122.121' & flow.server_addr = '10.95.122.212' & flow.service_port = '9092'
    --> @dm:addrow flow.client_addr = '10.95.122.109' & flow.server_addr = '10.95.122.212' & flow.service_port = '8443'
    --> @dm:mergecolumns include = 'flow.client_addr|flow.server_addr|flow.service_port' & to = 'flow.unique_id'

Pipeline code added

Step 3: Click the Verify button to make sure the syntax and pipeline code are correct (as shown below).

Pipeline code is verified using 'Verify' button as shown above

Step 4: Click the Execute button to run the pipeline. RDA executes the pipeline without any errors (as shown below).

Screenshot - 1
Screenshot - 2 (successful execution of pipeline)

Step 5: RDA uses the dm:mergecolumns function to merge the specified columns into the target column and prints the resulting output, as shown in the following screenshot.

The three source columns are merged into a single target column, as shown in the above screenshot.

dm:mergecolumns is useful when records in a dataset need a unique key: combining a couple of columns into one column produces a single column that can serve as that key.
