The Production Release (Version 01.00.00) is available for download as file release 01.00.00, from our CVS repository (tagged v01-00-00), or as an InstallAnywhere® installer.
The Beta Release (Version 01.01.00) is available for download as file release 01.01.00, from our CVS repository (tagged v01-01-00), or as an InstallAnywhere® installer. This update incorporates Log4J, JUnit, and Cobertura.
FormatCheck is a Java-based application used to verify the format of flat files. The verification can be quite complex, including verification of header/child records within a file, checking that batch header and trailer values match, and requiring one of a set of fields within a record. It features a Swing-based GUI and creates printable reports. Development on FormatCheck began in late 2004, and it has been used on various projects since then.
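To make two of these checks concrete, here is a minimal Java sketch of a batch header/trailer match and a "one of a set of fields" rule on fixed-width records. It is not taken from the FormatCheck source: the class name RecordRuleSketch, its method names, and the field offsets are all hypothetical, and "one of a set" is read here as "at least one field is non-blank".

```java
// Hypothetical sketch: field offsets and names are illustrative only,
// not FormatCheck's actual classes or definition format.
final class RecordRuleSketch {

    // Extracts a fixed-width field (zero-based start), trimmed; returns ""
    // when the record is too short to contain the whole field.
    static String field(String record, int start, int length) {
        if (start < 0 || length < 0 || record.length() < start + length) {
            return "";
        }
        return record.substring(start, start + length).trim();
    }

    // "One of a set of fields", read here as: at least one is non-blank.
    // Each entry of fields is {start, length}.
    static boolean hasAtLeastOne(String record, int[][] fields) {
        for (int[] f : fields) {
            if (!field(record, f[0], f[1]).isEmpty()) {
                return true;
            }
        }
        return false;
    }

    // Batch header/trailer match: the same field must carry the same,
    // non-blank value in both records (for example, a batch number).
    static boolean batchValuesMatch(String header, String trailer, int start, int length) {
        String expected = field(header, start, length);
        return !expected.isEmpty() && expected.equals(field(trailer, start, length));
    }
}
```

For example, RecordRuleSketch.batchValuesMatch("H000123", "T000123", 1, 6) compares the six characters after the record type code in each record.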
We have found FormatCheck to be of value when creating file feed definitions for data marts and data warehousing. The ability of the tool to handle a diverse set of file formats, defined in an XML file, allows a set of definitions to be created and shared with the target data owners, who can then verify their data extracts before passing them on to the warehousing team. Since FormatCheck is written in Java, it can be used on various platforms.
The core of FormatCheck is a set of flat file format definitions that the data architect creates. These definitions express the rules for the file format, including record lengths, record sequence (e.g. a detail must follow either a header or another detail), and data within the records. Multiple formats can be shipped with the tool, and a drop-down lets the user select which file type to verify.
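As an illustration of the kind of rules such a definition expresses, the following Java sketch checks record length and record sequence. It is only a sketch under stated assumptions: the real definitions live in an XML file whose schema is not shown here, so the rules are hard-coded, and the class name SequenceCheckSketch, the one-character type codes, the fixed length of 10, and the '^' start-of-file marker are all invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: not FormatCheck's actual classes or definition format.
final class SequenceCheckSketch {

    // Assumed layout: the first character of each record is its type code
    // ('H' = header, 'D' = detail) and every record is RECORD_LENGTH long.
    static final int RECORD_LENGTH = 10;

    // For each record type, the types allowed immediately before it.
    // '^' is a made-up marker meaning "start of file".
    static final Map<Character, Set<Character>> ALLOWED_PREDECESSORS = Map.of(
            'H', Set.of('^', 'D'),
            'D', Set.of('H', 'D')); // a detail must follow a header or another detail

    // Returns one message per violation; an empty list means the file passed.
    static List<String> verify(List<String> records) {
        List<String> errors = new ArrayList<>();
        char previous = '^';
        for (int i = 0; i < records.size(); i++) {
            String record = records.get(i);
            int lineNo = i + 1;
            if (record.length() != RECORD_LENGTH) {
                errors.add("Line " + lineNo + ": expected length " + RECORD_LENGTH
                        + " but found " + record.length());
            }
            if (record.isEmpty()) {
                continue; // no type code to inspect
            }
            char type = record.charAt(0);
            Set<Character> allowed = ALLOWED_PREDECESSORS.get(type);
            if (allowed == null) {
                errors.add("Line " + lineNo + ": unknown record type '" + type + "'");
            } else if (!allowed.contains(previous)) {
                errors.add("Line " + lineNo + ": record type '" + type
                        + "' may not follow '" + previous + "'");
            }
            previous = type;
        }
        return errors;
    }
}
```

Keeping the sequence rule as a table of allowed predecessors, rather than as branching code, mirrors the idea that the rules are data the architect supplies and can differ per file format.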
To use the tool, the user selects the format type and data file and then presses the "Verify File" button. The data file is processed and the set of errors is reported in the FormatCheck window. The user may either copy the errors to the clipboard or print them. In addition, FormatCheck can print the data records (either all of them or only those with reported errors) in a structured format so that column positions are easily identified.
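The structured record printout can be pictured with a small Java sketch that breaks a fixed-width record into named fields and labels each with its column range. The actual report layout FormatCheck produces is not documented here, so the output format, the class name RecordLayoutSketch, and the field names in the example are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the real FormatCheck report layout may differ.
final class RecordLayoutSketch {

    // Produces one line per field, showing 1-based inclusive column positions
    // and the raw field value in brackets. starts[] holds zero-based offsets.
    static List<String> describe(String record, String[] names, int[] starts, int[] lengths) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            // Clamp to the record length so short records do not throw.
            int begin = Math.min(starts[i], record.length());
            int end = Math.min(starts[i] + lengths[i], record.length());
            String value = record.substring(begin, end);
            lines.add(String.format("%-8s cols %d-%d: [%s]",
                    names[i], starts[i] + 1, starts[i] + lengths[i], value));
        }
        return lines;
    }
}
```

Calling RecordLayoutSketch.describe("H000123", new String[] {"type", "batchId"}, new int[] {0, 1}, new int[] {1, 6}) yields one line for columns 1-1 and one for columns 2-7, so a reader can see at a glance which characters belong to which field.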
On the source code side, this release makes the program publicly available. The focus for the 01.02 release will be improvement of the code and addition of unit testing. The next set of betas will be focused on the following feature set:
We invite other developers to join our project and help advance this tool. The major area that needs focus is data architect documentation. The tool has proved itself very flexible, but explaining that flexibility and its use is a challenge.
|Last Update: 2007-09-06||Maintained by David Read|