Thursday, June 21, 2018

Flat files and newlines for different OS

CloverETL can read flat files as one of its many sources. Flat files are files without hierarchical structure that store data in a human-readable format. A simple example is the still-popular CSV.

CSV stands for comma-separated values, i.e. columns of data are separated by a ',' (comma) delimiter.

For such files CloverETL has a FlatFileReader component. It can read CSV files with other delimiters too ('|' and ';' are also popular), and it can read files not only from the local system but also from remote locations (FTP, SFTP, S3).
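To make the idea of a delimiter concrete outside of CloverETL, here is a minimal Python sketch using the standard csv module. The function name read_delimited and the sample data are made up for illustration; this is not how FlatFileReader is implemented.

```python
import csv
import io

def read_delimited(text, delimiter=","):
    """Parse delimited text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

# The same two records, once comma-separated and once pipe-separated.
comma_rows = read_delimited("id,name\n1,Alice\n")
pipe_rows = read_delimited("id|name\n1|Alice\n", delimiter="|")

print(comma_rows)
print(pipe_rows)
```

Both calls produce the same rows; only the delimiter passed to the parser differs.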

For each file you want to read with FlatFileReader, you need metadata. The reader provides an easy way to create metadata for an existing file via the Extract metadata functionality.

This option parses the file and produces metadata (a description) of the file, i.e. the list of fields, their data types, etc.
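As a rough illustration of what "extracting metadata" means, the Python sketch below reads a header row and guesses a type for each column from the sample data. It is an assumption-laden toy: the function names and the type labels ("integer", "number", "string") are chosen for this example, and CloverETL's actual Extract metadata logic may work quite differently.

```python
import csv
import io

def guess_type(values):
    """Guess a column type from sample values (illustrative labels only)."""
    def all_parse(cast):
        try:
            for value in values:
                cast(value)
            return True
        except ValueError:
            return False

    if values and all_parse(int):
        return "integer"
    if values and all_parse(float):
        return "number"
    return "string"

def extract_metadata(text, delimiter=","):
    """Return a list of (field name, guessed type) pairs for delimited text."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    # Transpose the data rows into columns; with no data, every column is empty.
    columns = list(zip(*data)) if data else [()] * len(header)
    return [(name, guess_type(column)) for name, column in zip(header, columns)]

sample = "id,price,name\n1,9.5,Alice\n2,10,Bob\n"
print(extract_metadata(sample))
```

Note that the guess depends entirely on the one sample file, which is exactly why metadata extracted from a single file can miss variations found in other files.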

One issue you might encounter in the real world is that you created your metadata from one version of a flat file, but in reality the files come from various sources running various operating systems. Operating systems use different newline delimiters (for example, \r\n on Windows and \n on Linux and macOS).
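The problem is easy to reproduce in a few lines of Python. The sketch below uses a deliberately naive splitter that only knows the \n delimiter; the helper name and sample data are hypothetical.

```python
# The same two records as written on Windows (\r\n) and on Linux (\n).
windows_text = "id,name\r\n1,Alice\r\n"
linux_text = "id,name\n1,Alice\n"

def split_on_lf(text):
    """Naive record splitting that only knows the LF delimiter."""
    return [record for record in text.split("\n") if record]

linux_records = split_on_lf(linux_text)
windows_records = split_on_lf(windows_text)

print(linux_records)
# repr() makes the stray carriage return at the end of each record visible.
print([repr(record) for record in windows_records])
```

With the Windows input, every record keeps a trailing carriage return, so the last field of each record no longer matches the value you expect.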

Extract metadata takes the newline from the one file you ran it on. But don't worry, there is a way to be prepared for files from different operating systems.

You just need to:

  1. edit the created metadata (double-click the edge with the metadata)
  2. click the first row, which contains the metadata name, to show its properties in the right-hand column
  3. select the last option in the Record delimiter field

This option allows you to read files from different operating systems without issues. (You can even type into that field and use delimiters that are not in the dropdown. Just give it a try!)
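The idea behind accepting several record delimiters at once can be sketched in Python with a regular expression that matches any of the three common newline conventions. This is only an analogy under my own assumptions (the names RECORD_DELIMITER and split_records are invented here), not a description of what CloverETL does internally.

```python
import re

# Try \r\n first so a Windows newline is not split into two delimiters.
RECORD_DELIMITER = re.compile(r"\r\n|\n|\r")

def split_records(text):
    """Split text into records, accepting Windows, Unix and old Mac newlines."""
    records = RECORD_DELIMITER.split(text)
    # A trailing delimiter produces one empty record at the end; drop it.
    if records and records[-1] == "":
        records.pop()
    return records

mixed = "id,name\r\n1,Alice\n2,Bob\r3,Carol\n"
print(split_records(mixed))
```

The order of alternatives matters: if \r were tried before \r\n, each Windows newline would produce an extra empty record.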
