The XML itself is built in a fairly straightforward way: there is an email, and each email has a couple of attributes such as SUBJECT, SENDER and SENDDATE. An email consists of multiple PARTs; one PART might be the message body, another an attachment. Each PART has exactly one TEXT node with some more attributes.
From the TEXT level on, the structure depends on the type of the PART. If it is pure text, each line of text goes into its own LINE element. If the attachment is XML, the XML attribute will contain the entire string so it can be processed with the extract_from_xml() function introduced in DI 11.
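To make the EMAIL / PART / TEXT / LINE nesting concrete, here is a minimal Python sketch that walks such a document outside of DI. The sample XML is made up for illustration: the element names come from the description above, but the root element name EMAIL, the exact nesting and the SENDDATE format are assumptions, not the adapter's confirmed output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. Names follow the description above; the root element
# name, the nesting and the date format are assumptions.
SAMPLE = """
<EMAIL SUBJECT="Daily report" SENDER="alice@example.com" SENDDATE="2009.03.01 08:15:00">
  <PART>
    <TEXT>
      <LINE>Hello,</LINE>
      <LINE>the report is attached.</LINE>
    </TEXT>
  </PART>
</EMAIL>
""".strip()


def body_lines(xml_string):
    """Return the email attributes and all LINE texts found in its PARTs."""
    root = ET.fromstring(xml_string)
    lines = []
    for part in root.iter("PART"):
        text = part.find("TEXT")  # each PART has exactly one TEXT node
        if text is None:
            continue
        # An empty LINE element has text None; treat it as an empty string.
        lines.extend(line.text or "" for line in text.findall("LINE"))
    return root.attrib, lines


attrs, lines = body_lines(SAMPLE)
print(attrs["SUBJECT"], lines)
```

Inside a dataflow the same unnesting would be done with DI's own XML handling; the sketch only illustrates the shape of the data.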
If the attachment is a comma-separated file, each TEXT will contain CSVLINES and, underneath, up to 20 columns named COL0...COL19. This way a CSV file can be read directly without going through the hassle of decoding it with DI functions.
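As a rough illustration of the CSV layout, the following Python sketch turns such a PART back into rows. It assumes that there is one CSVLINES element per line of the file and that COL0...COL19 are child elements of it; the text above does not spell this out, so treat both as assumptions, and the sample data as invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one CSVLINES element per CSV row, columns as children.
SAMPLE_CSV_PART = """
<PART>
  <TEXT>
    <CSVLINES><COL0>id</COL0><COL1>name</COL1></CSVLINES>
    <CSVLINES><COL0>1</COL0><COL1>Smith</COL1></CSVLINES>
  </TEXT>
</PART>
""".strip()

MAX_COLS = 20  # COL0 .. COL19


def csv_rows(part_xml):
    """Collect the COLn values of every CSVLINES element as a list of rows."""
    part = ET.fromstring(part_xml)
    rows = []
    for csv_line in part.iter("CSVLINES"):
        row = []
        for i in range(MAX_COLS):
            col = csv_line.find(f"COL{i}")
            if col is None:
                break  # assume columns are filled from COL0 upwards without gaps
            row.append(col.text or "")
        rows.append(row)
    return rows


rows = csv_rows(SAMPLE_CSV_PART)
print(rows)
```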
But there is another processing option as well. If the document has a directory for attachments specified, all attachments are saved in that directory before the XML is returned to DI. This way, in the dataflow you have full access to the file and can read it, e.g. with the COBOL copybook reader, and join it back to the email. The format of the filename is: "attachment_" + to_char(getSentDate, 'YYYYMMDD_HH24MISS') + "_" + PART COUNTER + "(" + FILENAME + ")"
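To locate a saved attachment from a script, the filename pattern above can be rebuilt as in this Python sketch. The function name and its parameters are hypothetical, and it assumes the part counter is written as a plain number without padding, which the pattern itself does not specify.

```python
from datetime import datetime


def attachment_filename(sent_date, part_counter, filename):
    """Rebuild the attachment filename from sent date, part counter and name.

    %Y%m%d_%H%M%S corresponds to the 'YYYYMMDD_HH24MISS' format mask.
    The part counter is assumed to be unpadded.
    """
    stamp = sent_date.strftime("%Y%m%d_%H%M%S")
    return f"attachment_{stamp}_{part_counter}({filename})"


name = attachment_filename(datetime(2009, 3, 1, 8, 15, 0), 2, "orders.csv")
print(name)  # attachment_20090301_081500_2(orders.csv)
```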