Saturday, August 3, 2019

DataWeave 2.0 Tricks: Sorting and Grouping

The Challenges

I have Accounts retrieved from Salesforce like the following:

[
    {
        "LastModifiedDate": "2015-12-09T21:29:01.000Z",
        "Id": "0016100000Kngh3AAB",
        "type": "Account",
        "Name": "AAA Inc."
    },
    {
        "LastModifiedDate": "2015-12-09T20:16:47.000Z",
        "Id": "0016100000KnXKhAAN",
        "type": "Account",
        "Name": "AAA Inc."
    },
    {
        "LastModifiedDate": "2015-12-12T02:06:48.000Z",
        "Id": "0016100000KqonvAAB",
        "type": "Account",
        "Name": "AAA Inc."
    },
...
]
The dataset contains many accounts that have the same name. Accounts with the same name are regarded as duplicates. Eventually I want to delete the duplicates and leave just one in SFDC. Before I delete the duplicates, I need to create an output for review like the following:
{
    "AAA Inc.": [
        "0016100000Kngh3AAB",
        "0016100000KnXKhAAN",
        "0016100000KqonvAAB",
        "0016100000KnggyAAB",
        "0016100000KngflAAB",
        "0016100000KqalVAAR",
        "0016100000Kngh8AAB",
        "0016100000KnVUKAA3",
        "0016100000Kngh5AAB",
        "0016100000KnVXdAAN",
        "0016100000KnVh4AAF",
        "0016100000KnVs6AAF",
        "0016100000KnggAAAR",
        "0016100000KnlokAAB",
        "0016100000KnggKAAR"
    ],
    "Adam Smith": [
        "0016100000L7sDjAAJ"
    ],
    "Alice John Smith": [
        "0016100000L7x29AAB"
    ],
    "Alice Smith.": [
        "0016100000L7sDiAAJ"
    ],
...

Solutions

I devised a two-stage solution. The first transform creates a LinkedHashMap whose keys are the account names and whose values are arrays of Account objects, sorted by key, as shown below:
%dw 2.0
output application/java
---
//payload groupBy $.Name orderBy $$
(payload groupBy (account) -> account.Name)  orderBy (item, key) -> key
The second transform extracts the account IDs from each group:
%dw 2.0
output application/java
---
payload mapObject (item, key, index) -> {  
 (key) : (item map (value) -> value.Id)  
}
Of course, I can combine the two DataWeave scripts into one:
%dw 2.0
output application/java
---
//payload groupBy $.Name orderBy $$
((payload groupBy (account) -> account.Name)  orderBy (item, key) -> key)
mapObject (item, key, index) -> {  
 (key) : (item map (value) -> value.Id)  
}
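
To try the combined script without a Salesforce connection, the payload can be replaced with inline data. The following is a minimal sketch: the accounts variable is a hypothetical stand-in for the payload, built from Ids that appear above, and the output is switched to application/json only so that the result is easy to read.

```dataweave
%dw 2.0
output application/json

// Hypothetical inline sample standing in for the Salesforce payload
var accounts = [
    { Id: "0016100000L7sDjAAJ", "type": "Account", Name: "Adam Smith" },
    { Id: "0016100000Kngh3AAB", "type": "Account", Name: "AAA Inc." },
    { Id: "0016100000KnXKhAAN", "type": "Account", Name: "AAA Inc." }
]
---
// 1. group by name, 2. sort the groups by key, 3. keep only the Ids
((accounts groupBy (account) -> account.Name) orderBy (item, key) -> key)
    mapObject (item, key, index) -> {
        (key): (item map (value) -> value.Id)
    }
```

With this input the result should be an object with the key AAA Inc. first, holding its two Ids in input order, followed by Adam Smith with its single Id.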

Key Learnings

The key concept in the above use case is to group the accounts with the same name and sort the groups in alphabetical order. The MuleSoft documentation for groupBy, orderBy, and the other DataWeave core functions is in the dw::Core module reference. groupBy and orderBy have similar signatures. The signatures of groupBy are:

1. groupBy(Array<T>, (item: T, index: Number) -> R): { (R): Array<T> }
2. groupBy({ (K)?: V }, (value: V, key: K) -> R): { (R): { (K)?: V } }
3. groupBy(Null, (Nothing, Nothing) -> Any): Null
The first signature shows that groupBy can take an array as input. It can be used as follows:
//payload groupBy $.Name
//payload groupBy (account, index) -> account.Name
payload groupBy (account) -> account.Name
The three lines above are equivalent. The first is the shorthand version. The second and third use the lambda style, with and without the optional index parameter. It is worth knowing both syntaxes.
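
As a quick check of that equivalence, the three forms can be run side by side. This is a sketch under assumptions: the accounts variable is hypothetical inline data, and the field names shorthand, withIndex, and itemOnly are made up for the comparison.

```dataweave
%dw 2.0
output application/json

// Hypothetical inline sample standing in for the payload
var accounts = [
    { Id: "0016100000Kngh3AAB", Name: "AAA Inc." },
    { Id: "0016100000L7sDjAAJ", Name: "Adam Smith" },
    { Id: "0016100000KnXKhAAN", Name: "AAA Inc." }
]
---
{
    // $ is the current item
    shorthand: (accounts groupBy $.Name),
    // explicit lambda with both parameters
    withIndex: (accounts groupBy ((account, index) -> account.Name)),
    // explicit lambda that ignores the index
    itemOnly: (accounts groupBy ((account) -> account.Name))
}
```

All three fields should contain the same grouped object.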

In the second stage of my solution, I use the mapObject function as follows:

payload mapObject (item, key, index) -> {  
 (key) : (item map (value) -> value.Id)  
}
This is because the payload is a LinkedHashMap and the value of each entry is an array. mapObject iterates over the entries, so I have to use the map function inside mapObject to iterate over each array.
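
The nesting is easier to see against a literal version of the first-stage result. In this sketch the grouped variable is a hypothetical, hand-written stand-in for that result, and JSON output is used only for readability.

```dataweave
%dw 2.0
output application/json

// Hypothetical stand-in for the output of the first transform:
// each key is an account name, each value is an array of accounts
var grouped = {
    "AAA Inc.": [
        { Id: "0016100000Kngh3AAB", Name: "AAA Inc." },
        { Id: "0016100000KnXKhAAN", Name: "AAA Inc." }
    ],
    "Adam Smith": [
        { Id: "0016100000L7sDjAAJ", Name: "Adam Smith" }
    ]
}
---
// mapObject visits each key/value pair of the object;
// the inner map visits each account in that pair's array
grouped mapObject (item, key, index) -> {
    (key): (item map (value) -> value.Id)
}
```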

In my work, I also needed to remove the type attribute from each Account object:

[
    {
        "LastModifiedDate": "2015-12-09T21:29:01.000Z",
        "Id": "0016100000Kngh3AAB",
        "type": "Account",
        "Name": "AAA Inc."
    },
...
]
Here is the transform to remove the type:
(payload orderBy (item) -> item.Name) map (account) -> {
 (account -- ['type'])
}
As you can see, I used the -- function, which removes the listed keys from an object.
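
Here is a self-contained sketch of the same removal, assuming a hypothetical accounts variable in place of the payload. It drops the extra object wrapper around the -- expression, which should not change the result because -- already returns an object.

```dataweave
%dw 2.0
output application/json

// Hypothetical inline sample standing in for the payload
var accounts = [
    {
        LastModifiedDate: "2015-12-09T21:29:01.000Z",
        Id: "0016100000Kngh3AAB",
        "type": "Account",
        Name: "AAA Inc."
    }
]
---
// Sort by name, then drop the "type" key from every account;
// all other keys are kept as they are
(accounts orderBy (item) -> item.Name) map (account) -> (account -- ["type"])
```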

Summary

The key to solving this kind of problem is knowing how to group, sort, and extract values from an Array or a LinkedHashMap. I used the following core DataWeave functions:
  • groupBy
  • orderBy
  • map
  • mapObject
Also, we should know both the shorthand and the lambda style for using the DataWeave functions. The shorthand uses the built-in variables $, $$, and $$$.
If the payload is an Array:
  • $ - item
  • $$ - index
If the payload is a LinkedHashMap:
  • $ - value
  • $$ - key
  • $$$ - index.
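
As an illustration of the shorthand, the combined grouping script can be written with only the built-in variables. This is a sketch, assuming a hypothetical accounts variable in place of the payload. The parentheses are there to make the scope of each $ explicit.

```dataweave
%dw 2.0
output application/json

// Hypothetical inline sample standing in for the payload
var accounts = [
    { Id: "0016100000L7sDjAAJ", Name: "Adam Smith" },
    { Id: "0016100000Kngh3AAB", Name: "AAA Inc." }
]
---
// groupBy:   $  is the current account
// orderBy:   $$ is the key of the current entry (the account name)
// mapObject: $$ is the key, $ is the array of accounts for that key
// inner map: $  is rebound to each account in that array
((accounts groupBy $.Name) orderBy $$) mapObject {
    ($$): ($ map $.Id)
}
```

Note that $ means something different at each level, which is one reason the lambda style with named parameters is often easier to read in nested expressions.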
The lambda style looks like the following. The first example works on an Array, the second on a LinkedHashMap:
(payload orderBy (item) -> item.Name) map (account, index) -> {
 (account -- ['type'])
}

payload mapObject (item, key, index) -> {
 (key) : (item map (value) -> value.Id)
}
