Introduction
This short article shows how to use regular expressions (regex) in DataWeave 2.0 to validate email addresses. Regex is used widely in DataWeave, MuleSoft's transformation language, mainly through two functions: matches(...) and match(...). A solid command of regular expressions is important for professional work on MuleSoft integration projects. Many references on regular expressions are available online.
Use Case
We expect the DataWeave transformation to produce output like the following, where the key of each entry depends on the validity of the email address:
[
  { ... "invalidEmail": "johndo@yahoo" ... },
  { ... "validEmail": "john.smith@google.com" ... },
  ...
]
Invalid Emails
The following types of email addresses are invalid:
- beginning with a dot: .gary.liu@google.com
- ending with a dot: gary.liu@google.com.
- double dots: gary.liu@google..com
- domain name contains underscore: gary.liu@att_rr.com
- domain name contains a space: gary.liu@att rr.com
- domain name contains any of the following: ,<>/[]
- no top-level domain (such as .com): gary@google
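As a quick check of the list above, the following sketch runs sample addresses through the pattern from the Solution section, written here with a plain ASCII apostrophe in the local-part character class. The samples variable and the email/valid output keys are hypothetical names chosen for illustration, and the comma example is an assumed instance of the "any of the following" case. The script has no input payload, so it should run as-is in a DataWeave 2.0 environment.

```dataweave
%dw 2.0
output application/json

// Pattern from the Solution section (plain ASCII apostrophe in the local-part class).
var regexEmail = /^[^.][a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9-](?!.*?\.\.)[^_ ; ,<>\/\\]+(?:\.[a-zA-Z0-9-]+)[^.]*$/

// Hypothetical sample list: the invalid addresses listed above plus one valid address.
var samples = [
    ".gary.liu@google.com",
    "gary.liu@google.com.",
    "gary.liu@google..com",
    "gary.liu@att_rr.com",
    "gary.liu@att rr.com",
    "gary.liu@att,rr.com",
    "gary@google",
    "john.smith@google.com"
]
---
// Pair every address with the result of matching it against the whole pattern.
samples map (email) -> {
    email: email,
    valid: email matches regexEmail
}
```

Under these assumptions, only john.smith@google.com should come back with valid set to true. Every other sample fails at the leading-dot check, the double-dot lookahead, the excluded-character class, or the required dot-plus-label at the end of the domain.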
Solution
%dw 2.0
output application/json
var regexEmail = /^[^.][a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9-](?!.*?\.\.)[^_ ; ,<>\/\\]+(?:\.[a-zA-Z0-9-]+)[^.]*$/
---
payload map using (email = $.email) {
    (validEmail: email) if (email matches regexEmail),
    (invalidEmail: email) if (not (email matches regexEmail))
}
The DataWeave script above is largely self-explanatory. A few notes may help if you are not very familiar with regular expressions:
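- negation: a bracket expression that starts with ^, such as [^_ ;,<>\/\\], matches any single character except the ones listed. Here it means the domain may not contain an underscore, space, semicolon, comma, angle bracket, slash, or backslash at that position
- outside brackets, ^ and $ anchor the match to the beginning and end of the input
- [a-zA-Z0-9] matches any single letter (A to Z, upper or lower case) or digit (0 to 9)
- + matches one or more occurrences of the preceding element
- * matches zero or more occurrences of the preceding element
- (?!...) is a negative lookahead: the match fails if the enclosed pattern matches at that position. (?!.*?\.\.) therefore rejects input with two consecutive dots (..) anywhere after that point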
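The constructs listed above can be tried in isolation. The sketch below is a minimal illustration, not part of the original solution: the output key names and the small test strings are assumptions made up for the example. It relies only on the matches operator, which tests the pattern against the entire input string.

```dataweave
%dw 2.0
output application/json
---
{
    // ^ and $ anchor the pattern to the start and end of the input.
    anchorsWholeString: "gary" matches /^[a-z]+$/,
    // + needs at least one character, while * also accepts none.
    plusOnEmpty: "" matches /[a-z]+/,
    starOnEmpty: "" matches /[a-z]*/,
    // Negated class: every character must be something other than _ ; or ,
    negatedClassClean: "att-rr" matches /[^_;,]+/,
    negatedClassUnderscore: "att_rr" matches /[^_;,]+/,
    // Negative lookahead: reject text that contains two consecutive dots.
    lookaheadSingleDots: "google.com" matches /(?!.*?\.\.).+/,
    lookaheadDoubleDots: "google..com" matches /(?!.*?\.\.).+/
}
```

With these inputs, anchorsWholeString, starOnEmpty, negatedClassClean, and lookaheadSingleDots should evaluate to true, and plusOnEmpty, negatedClassUnderscore, and lookaheadDoubleDots to false. Because matches already requires the whole string to fit, the explicit ^ and $ anchors mainly document intent in this setting.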