Monday, June 13, 2016

CloverETL - XML files without an XSD

Have you ever come across an XML file that you need to process but that has no XSD? I am sure you have! We ran into this firsthand a few weeks ago. We received a number of XML files (more than 100k, for those of you keeping track) and had no idea whether the structure was the same throughout the entire data set. How can you make sense of each XML element without a proper XSD file? We built a clever little solution in CloverETL that reads every XML tag in the entire data set, records the level of each tag, and generates an XPath for each XML element that can then be used to process each XML file. Does this sound interesting yet?

We accomplished this with the XSLTransformer component.

Above is the main part of the graph that we used to read the XML tags. Below is the XSLT definition that we used within CloverETL to parse each XML tag.
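The same idea can also be illustrated outside CloverETL. The following is a minimal Python sketch, not the XSLT used in the graph: it walks every element of each document and records the tag name, its level, and an XPath built from the chain of parent tags. The function name `collect_xpaths` and the two sample documents are made up for illustration, and the sketch assumes no namespaces and ignores attributes.

```python
import xml.etree.ElementTree as ET


def collect_xpaths(xml_text, seen=None):
    """Add every distinct element path in xml_text to `seen`.

    `seen` maps an XPath string to a (tag, level) tuple, where the
    root element is level 1. Passing the same dict for many files
    accumulates the union of all structures in the data set.
    """
    seen = {} if seen is None else seen
    root = ET.fromstring(xml_text)
    # Explicit stack instead of recursion, so deeply nested files are fine.
    stack = [(root, "", 1)]
    while stack:
        elem, parent_path, level = stack.pop()
        path = f"{parent_path}/{elem.tag}"
        seen.setdefault(path, (elem.tag, level))
        for child in elem:
            stack.append((child, path, level + 1))
    return seen


# Two hypothetical files whose structures differ slightly.
sample_a = "<order><id>1</id><item><sku>A</sku></item></order>"
sample_b = (
    "<order><id>2</id><item><sku>B</sku><qty>3</qty></item>"
    "<note>x</note></order>"
)

found = {}
for doc in (sample_a, sample_b):
    collect_xpaths(doc, found)

for path in sorted(found):
    tag, level = found[path]
    print(level, tag, path)
```

Because the dictionary is shared across files, a path that appears in only one file (such as `/order/note` here) still ends up in the result, which is what makes structural differences across a large data set visible.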

Once the tags have been processed and an XPath has been created for each element, you can process your XML files with the CloverETL XMLReader component.
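To show what "processing with an XPath" amounts to, here is a small Python sketch that pulls the text values for one discovered path. It is only an analogy for what a reader component does with such a path and does not use or describe the XMLReader configuration. The helper `extract_values` and the sample document are hypothetical, and the sketch assumes simple absolute paths of the form `/root/child/...` without namespaces or predicates.

```python
import xml.etree.ElementTree as ET


def extract_values(xml_text, xpath):
    """Return the stripped text of every element matching an absolute path."""
    root = ET.fromstring(xml_text)
    steps = xpath.strip("/").split("/")
    # ElementTree searches relative to an element, so check the root
    # step by hand and search with the remaining steps.
    if steps[0] != root.tag:
        return []
    if len(steps) == 1:
        nodes = [root]
    else:
        nodes = root.findall("./" + "/".join(steps[1:]))
    return [(node.text or "").strip() for node in nodes]


sample = "<order><item><sku>A</sku></item><item><sku>B</sku></item></order>"
skus = extract_values(sample, "/order/item/sku")
print(skus)
```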
