Data Mapping cfxdm - dm:groupby

Group rows by selected columns

dm:groupby: This cfxdm function groups rows of data by one or more selected columns and applies an aggregate function to each group.

dm:groupby syntax:

  • columns (mandatory): Select one or more columns for grouping the data.

  • agg (optional): Aggregate function applied to each group. One of:

    • count: Applied by default when 'agg' is not specified. Supported on any value type (numeric or non-numeric).

    • min: Supported on numeric values only.

    • max: Supported on numeric values only.

    • sum: Supported on numeric values only.
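The semantics of the four aggregates can be illustrated outside RDA with a minimal Python sketch. This is not cfxdm's implementation: the `group_rows` helper, the sample `rows`, and the `cpu` column are hypothetical and only mirror the behavior listed above (count on any value type; min, max, and sum on numeric values).

```python
from collections import defaultdict

# Hypothetical sample rows; 'cpu' is an invented numeric column.
rows = [
    {"Folder": "User VMs", "cpu": 2},
    {"Folder": "User VMs", "cpu": 4},
    {"Folder": "TestVMs", "cpu": 8},
]

def group_rows(rows, column, value_column=None, agg="count"):
    """Group rows by `column` and aggregate `value_column` per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    if agg == "count":
        # count works on any value type, so no value column is needed
        return {key: len(members) for key, members in groups.items()}
    funcs = {"min": min, "max": max, "sum": sum}
    if agg not in funcs:
        raise ValueError(f"unsupported agg: {agg}")
    return {
        key: funcs[agg](m[value_column] for m in members)
        for key, members in groups.items()
    }

print(group_rows(rows, "Folder"))                    # default: count
print(group_rows(rows, "Folder", "cpu", agg="sum"))  # numeric aggregate
```

The sketch takes an explicit `value_column` for simplicity, which is a choice made for this illustration and not a parameter of dm:groupby.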

Example 1:

Step 1: Create an empty pipeline named dm_groupby_example_1 using AIOps studio, as shown in the screenshot below.

Empty pipeline

Step 2: Add the following pipeline code to the pipeline created above, as shown in the screenshot below:

You can copy the code below into your pipeline and execute it in your environment.

##### This pipeline creates a set of records/dataset details coming from a vCenter
##### environment. This dataset includes datastore details, Folder name, guest_hostname,
##### guest_ip_address.
###### This pipeline uses dm groupby to view the data grouped by a selected column name
###### Example by Folder, datastore etc.
@dm:empty
    --> @dm:addrow datastore.1 = 'CFX-QA-Store-NFS-qnap' & Folder = '' & guest_full_name = 'CentOS 4/5/6/7 (64 bit)' & guest_hostname = 'test2.oia.cloudfabrix.com' & guest_ip_address = '10.95.134.17' & `VM Name` = 'test1'
    --> @dm:addrow datastore.1 = 'CFX-QA-Store-NFS-qnap' & Folder = 'User VMs' & guest_full_name = 'CentOS 4/5/6/7 (64 bit)' & guest_hostname = 'qa.oia.cloudfabrix.com' & guest_ip_address = '10.95.122.13' & `VM Name` = 'test2'
    --> @dm:addrow datastore.1 = 'ENG_ISOs,netapp-qa-nfs' & Folder = 'User VMs,Ravi-Pisupati,TestVMs' & guest_full_name = 'Ubuntu Linux (64 bit)' & guest_hostname = 'ubuntuqa' & guest_ip_address = '10.95.102.172' & `VM Name` = 'test3'
    --> @dm:addrow datastore.1 = 'datastore1 (6)' & Folder = 'Ravi-Pisupati' & guest_full_name = 'CentOS 4/5/6/7 (64 bit)' & guest_hostname = 'elkstack' & guest_ip_address = '10.95.121.218' & `VM Name` = 'test4'
    --> @dm:addrow datastore.1 = 'datastore1-198' & Folder = '' & guest_full_name = 'Microsoft Windows Server' & guest_hostname = 'spdnode.adwinstack' & guest_ip_address = '10.95.132.4' & `VM Name` = 'test5'
    --> @dm:addrow datastore.1 = 'netapp-dev-nfs-01' & Folder = 'User VMs,Ravi-Pisupati,TestVMs' & guest_full_name = 'CentOS 4/5/6/7 (64 bit)' & guest_hostname = 'localhost' & guest_ip_address = '10.95.103.115' & `VM Name` = 'test6'
    --> @dm:addrow datastore.1 = 'netapp-dev-nfs-01' & Folder = 'User VMs,Ravi-Pisupati,TestVMs' & guest_full_name = 'CentOS 4/5/6/7 (64 bit)' & guest_hostname = 'localhost' & guest_ip_address = '10.95.103.115' & `VM Name` = 'test7'
    --> @dm:addrow datastore.1 = 'netapp-dev-nfs-02' & Folder = 'CFX-Drama,Capri-Demo' & guest_full_name = 'CentOS 4/5/6/7 (64 bit)' & guest_hostname = 'democloudp' & guest_ip_address = '10.95.122.212' & `VM Name` = 'test8'
    --> @dm:addrow datastore.1 = 'netapp-dev-nfs-02' & Folder = 'CFX-Drama,Capri-Demo' & guest_full_name = 'CentOS 4/5/6/7 (64 bit)' & guest_hostname = 'ravip-v201-platform' & guest_ip_address = '10.95.122.216' & `VM Name` = 'test9'
    --> @dm:addrow datastore.1 = 'netapp-dev-nfs-02' & Folder = 'CFX-Drama,Capri-Demo' & guest_full_name = 'CentOS 4/5/6/7 (64 bit)' & guest_hostname = 'cfxautomatesvc' & guest_ip_address = '10.95.125.116' & `VM Name` = 'test10'
    --> @dm:groupby columns = 'Folder'

Pipeline code added to empty pipeline created

Step 3: Click the verify button to verify the pipeline. RDA should verify the pipeline without any errors (as shown below).

Pipeline code verified

Step 4: Click the execute button to execute the pipeline. RDA should execute the pipeline without any errors (as shown below).

Pipeline executed without any errors

Step 5: Verify that the output data is grouped by the selected column (here, the 'Folder' column of the dataset created above), as shown in the screenshot below.

RDA runs the pipeline with dm:groupby on the 'Folder' column, as shown above
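As a cross-check outside RDA, the default count per 'Folder' value for the ten rows added in Step 2 can be reproduced with a short Python sketch. It assumes the empty Folder value forms its own group; how RDA displays that group is not stated here, so treat the empty-string entry as an assumption.

```python
from collections import Counter

# 'Folder' values of the ten rows added in Step 2 (test1 .. test10).
folders = [
    "",
    "User VMs",
    "User VMs,Ravi-Pisupati,TestVMs",
    "Ravi-Pisupati",
    "",
    "User VMs,Ravi-Pisupati,TestVMs",
    "User VMs,Ravi-Pisupati,TestVMs",
    "CFX-Drama,Capri-Demo",
    "CFX-Drama,Capri-Demo",
    "CFX-Drama,Capri-Demo",
]

# Count how many rows fall into each distinct Folder value.
counts = Counter(folders)
for folder, count in sorted(counts.items()):
    print(repr(folder), count)
```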

This function is useful when users want to group data by one or more columns. It is also useful for aggregating numeric values with min, max, or sum.

When the agg function 'sum' is used, it is automatically applied to all eligible columns (those with numeric values).
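The idea of applying 'sum' to every eligible column can be sketched in Python as follows. This is an illustration under assumptions, not RDA's code: the `sum_numeric_by` helper and the `cpu` and `memory_gb` columns are hypothetical, and "eligible" is taken to mean any column whose values are numeric.

```python
from collections import defaultdict
from numbers import Number

# Hypothetical rows: 'cpu' and 'memory_gb' are invented numeric columns.
rows = [
    {"Folder": "User VMs", "guest_hostname": "qa", "cpu": 2, "memory_gb": 8},
    {"Folder": "User VMs", "guest_hostname": "dev", "cpu": 4, "memory_gb": 16},
    {"Folder": "TestVMs", "guest_hostname": "lab", "cpu": 8, "memory_gb": 32},
]

def sum_numeric_by(rows, column):
    """Sum every numeric column per group; non-numeric columns are skipped."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        for name, value in row.items():
            # bool is a Number subclass in Python, so exclude it explicitly
            is_numeric = isinstance(value, Number) and not isinstance(value, bool)
            if name != column and is_numeric:
                totals[row[column]][name] += value
    return {key: dict(sums) for key, sums in totals.items()}

print(sum_numeric_by(rows, "Folder"))
```

In this sketch the non-numeric `guest_hostname` column is left out of the result, which matches the statement that 'sum' is supported on numeric values only.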
