Data Mapping cfxdm - dm:enrich
Enrich the data using dictionaries
dm:enrich: This cfxdm function enriches an existing dataset by looking up matching rows in another dataset or dictionary and bringing in the additional columns that the user selects from it.
dm:enrich syntax:
dict (mandatory): Name of the dictionary (a named dataset) that holds the additional enrichment data.
src_key_cols (mandatory): Key columns of the source dataset (the dataset being enriched), comma-separated.
dict_key_cols (mandatory): Key columns of the dictionary (named dataset), comma-separated.
enrich_cols (mandatory): Names of the columns to bring in from the dictionary (named dataset) selected under the 'dict' option, comma-separated.
The number of columns specified in the src_key_cols and dict_key_cols options must be the same.
For example, if two columns are specified in src_key_cols, two columns must also be specified in dict_key_cols.
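The column-count rule above can be illustrated with a short Python sketch. This is not RDA code: the helper names parse_cols and pair_key_cols are hypothetical, and both the whitespace trimming and the position-by-position pairing of key columns are assumptions made for illustration, not documented dm:enrich behavior.

```python
def parse_cols(option):
    """Split a comma-separated option value into column names.

    Trimming surrounding whitespace is an assumption of this sketch.
    """
    return [name.strip() for name in option.split(",") if name.strip()]


def pair_key_cols(src_key_cols, dict_key_cols):
    """Pair each source key column with the dictionary key column at the same position.

    Raises ValueError when the two options name a different number of columns.
    """
    src = parse_cols(src_key_cols)
    dct = parse_cols(dict_key_cols)
    if len(src) != len(dct):
        raise ValueError(
            "src_key_cols has %d column(s) but dict_key_cols has %d"
            % (len(src), len(dct))
        )
    return list(zip(src, dct))


# Same options as the example pipeline later in this page.
pairs = pair_key_cols("ip_address,pid", "ip_address,pid")
print(pairs)
```

Calling pair_key_cols("ip_address,pid", "ip_address") in this sketch raises a ValueError, which mirrors the requirement that both options list the same number of columns.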
Step 1: Create an empty pipeline named dm_enrich_example_1 using AIOps studio, as shown in the screenshot below.
Step 2: Add the following pipeline code to the pipeline created above, as shown in the screenshot below:
You can copy the code below into your pipeline and execute it in your environment.
###### This pipeline creates two datasets, app-processes-list and
###### app-services-list, simulating inventory data from a Windows environment.
###### It uses the dm:enrich function to enrich one dataset with the other; both are stored in RDA.
###### It also uses dm:recall to load the saved dataset that is enriched.
####### app-processes-list dataset
@dm:empty
--> @dm:addrow ip_address = '10.95.108.100' & process_name = 'System Idle Process' & pid = 0
--> @dm:addrow ip_address = '10.95.108.100' & process_name = 'svchost.exe' & pid = 1020
--> @dm:addrow ip_address = '10.95.108.100' & process_name = 'svchost.exe' & pid = 1020
--> @dm:addrow ip_address = '10.95.108.100' & process_name = 'silsvc.exe' & pid = 1064
--> @dm:addrow ip_address = '10.95.108.100' & process_name = 'smbhash.exe' & pid = 1152
--> @dm:addrow ip_address = '10.95.108.100' & process_name = 'svchost.exe' & pid = 1172
--> @dm:addrow ip_address = '10.95.108.100' & process_name = 'taskhostex.exe' & pid = 1432
--> @dm:addrow ip_address = '10.95.108.100' & process_name = 'Microsoft AD' & pid = 1500
--> @dm:addrow ip_address = '10.95.108.100' & process_name = 'svchost.exe' & pid = 1520
--> @dm:addrow ip_address = '10.95.108.100' & process_name = 'dwm.exe' & pid = 1524
--> @dm:addrow ip_address = '10.95.108.100' & process_name = 'dfsrs.exe' & pid = 1556
--> @dm:addrow ip_address = '10.95.108.100' & process_name = 'svchost.exe' & pid = 776
--> @dm:save name = 'app-processes-list'
####### app-services-list dataset
@dm:empty
--> @dm:addrow ip_address = '10.95.108.100' & service_name = 'ADWS' & State = 'Running' & pid = 1500
--> @dm:addrow ip_address = '10.95.108.100' & service_name = 'AppHostSvc' & State = 'Running' & pid = 1524
--> @dm:addrow ip_address = '10.95.108.100' & service_name = 'BFE' & State = 'Running' & pid = 1172
--> @dm:addrow ip_address = '10.95.108.100' & service_name = 'BITS' & State = 'Running' & pid = 1020
--> @dm:addrow ip_address = '10.95.108.100' & service_name = 'BrokerInfrastructure' & State = 'Running' & pid = 776
--> @dm:addrow ip_address = '10.95.108.100' & service_name = 'ComsysApp' & State = 'Running' & pid = 3584
--> @dm:addrow ip_address = '10.95.108.100' & service_name = 'CertPropsSvc' & State = 'Running' & pid = 1020
--> @dm:addrow ip_address = '10.95.108.100' & service_name = 'CryptSvc' & State = 'Running' & pid = 820
--> @dm:addrow ip_address = '10.95.108.100' & service_name = 'DFSR' & State = 'Running' & pid = 1556
--> @dm:addrow ip_address = '10.95.108.100' & service_name = 'DNS' & State = 'Running' & pid = 1624
--> @dm:addrow ip_address = '10.95.108.100' & service_name = 'DPS' & State = 'Running' & pid = 1172
--> @dm:save name = 'app-services-list'
--> @c:new-block
--> @dm:recall name = 'app-services-list'
--> @dm:enrich dict = 'app-processes-list' & src_key_cols = 'ip_address,pid' & dict_key_cols = 'ip_address,pid' & enrich_cols = 'process_name'
Step 3: Click the Verify button to verify the pipeline. RDA should verify the pipeline without any errors (as shown below).
Step 4: Click the Execute button to execute the pipeline. RDA should execute the pipeline without any errors (as shown below).
Step 5: Print the output and verify that it is enriched with the requested column names, as shown in the screenshot below.
As shown in the above screenshot, 'app-services-list' is enriched from the dictionary 'app-processes-list' using the key columns ip_address and pid, which are matched across both datasets, and the output gains the process_name column. In this example, only a single column was added from the process list, but the same enrichment process can be extended to additional columns and to other datasets that users want to enrich.
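The key-matching idea behind this example can be sketched in plain Python using three rows from each dataset. This is a conceptual illustration only, not the RDA implementation: the enrich function below is hypothetical, and it assumes that unmatched rows are kept with an empty (None) value and that the last dictionary row wins when a key repeats. The actual dm:enrich output for those cases may differ.

```python
def enrich(rows, dictionary, src_key_cols, dict_key_cols, enrich_cols):
    """Add enrich_cols from matching dictionary rows to each source row."""
    # Index the dictionary by its key columns.
    # Assumption: for a repeated key, the last dictionary row wins.
    index = {}
    for entry in dictionary:
        index[tuple(entry[col] for col in dict_key_cols)] = entry

    enriched = []
    for row in rows:
        key = tuple(row[col] for col in src_key_cols)
        match = index.get(key, {})
        # Assumption: unmatched rows are kept, with None in the enriched columns.
        extra = {col: match.get(col) for col in enrich_cols}
        enriched.append({**row, **extra})
    return enriched


# Subset of the 'app-processes-list' dataset (the dictionary).
processes = [
    {"ip_address": "10.95.108.100", "process_name": "Microsoft AD", "pid": 1500},
    {"ip_address": "10.95.108.100", "process_name": "dwm.exe", "pid": 1524},
    {"ip_address": "10.95.108.100", "process_name": "dfsrs.exe", "pid": 1556},
]

# Subset of the 'app-services-list' dataset (the source being enriched).
services = [
    {"ip_address": "10.95.108.100", "service_name": "ADWS", "State": "Running", "pid": 1500},
    {"ip_address": "10.95.108.100", "service_name": "DFSR", "State": "Running", "pid": 1556},
    {"ip_address": "10.95.108.100", "service_name": "DNS", "State": "Running", "pid": 1624},
]

result = enrich(
    services,
    processes,
    src_key_cols=["ip_address", "pid"],
    dict_key_cols=["ip_address", "pid"],
    enrich_cols=["process_name"],
)

for row in result:
    print(row["service_name"], row["pid"], row["process_name"])
```

In this sketch, ADWS and DFSR pick up 'Microsoft AD' and 'dfsrs.exe' because their ip_address and pid values match a process row, while DNS (pid 1624) has no matching process and is left without a process_name. Enriching more columns only requires listing them in enrich_cols.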