Format Definition Overview
The format definition XSD defines three basic concepts: format, record, and field.
format
The format is the root element for describing a data file format. It is made up of one or more records.
record
The record element contains the definition for one record structure that is defined for the flat file.
A format may have one or more record definitions, meaning that a flat file may contain more than one record type.
The record will contain one or more fields that represent the data in the record. Constraints may be placed on a record such as its
maximum length, or for CSV files, the character used to separate the individual fields.
field
The field element represents a field within the record.
There are one or more fields defined for a record.
The field defines constraints used to validate the data. These
constraints may be data length, character types, numeric ranges, etc. The most flexible validation type FormatCheck provides
is the "RegEx" type, which allows the developer to provide a regular expression that the value must meet.
In addition to the field rules, a description may be placed on each validation to allow a more readable error message to be
produced.
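
To make the format, record, and field hierarchy concrete, here is a minimal Python sketch. It is not FormatCheck's implementation and does not mirror its XSD: the class names Field and Record, the validate_line helper, and the sample patterns are all hypothetical. It only illustrates how a regular expression and a description attached to a field can produce a readable error message for a CSV record. The naive split on the separator is an assumption of the sketch and does not handle quoted values.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Field:
    # Hypothetical model of a field definition; not FormatCheck's schema.
    name: str
    pattern: str       # regular expression the whole value must match
    description: str   # human-readable text used in error messages


@dataclass
class Record:
    # Hypothetical model of a record definition for a CSV file.
    name: str
    separator: str
    fields: list = field(default_factory=list)


def validate_line(record, line):
    """Return a list of readable error messages for one CSV line."""
    values = line.split(record.separator)
    if len(values) != len(record.fields):
        return [
            f"{record.name}: expected {len(record.fields)} fields, "
            f"got {len(values)}"
        ]
    errors = []
    for definition, value in zip(record.fields, values):
        # fullmatch requires the entire value to satisfy the pattern.
        if re.fullmatch(definition.pattern, value) is None:
            errors.append(
                f"{record.name}.{definition.name}: "
                f"{definition.description} (got {value!r})"
            )
    return errors


detail = Record(
    name="Detail",
    separator=",",
    fields=[
        Field("ItemCode", r"[A-Z]{3}\d{4}",
              "must be 3 uppercase letters followed by 4 digits"),
        Field("Quantity", r"\d{1,5}",
              "must be a whole number of 1 to 5 digits"),
    ],
)

print(validate_line(detail, "ABC1234,10"))  # valid line, no messages
print(validate_line(detail, "abc1234,10"))  # ItemCode fails its pattern
```

Keeping the description next to the pattern means the message can say what was expected in plain words rather than echoing the regular expression.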
Try It Out
Look at the included sample formats (in the Formats directory of the installation) and run the reports on
the sample data (in the Documentation/SampleFiles directory of the installation) to see the tool in action.
Features
This listing pertains to the version 01.00.00 release.
- Flexible XML Schema for Describing Flat File Formats
The XSD is well documented so that data architects can work with a developer to quickly produce
a definition of the file format(s) to be supported.
- Supports Character Separated Value files
FormatCheck supports CSV files and parses the fields based on the defined separator character.
- Supports Flat Files
Flat files, with fixed length fields, are supported. FormatCheck will calculate the record length from the sum of data fields within the record.
Optionally the maximum width can be made longer than the sum of the data field widths. This is useful when the data is formatted with a fixed
block size that pads the individual records.
- Allows Multiple Record Types per File
Some flat file formats contain multiple record formats. For instance a file of purchase orders may contain headers and details. Further, batch
files may contain multiple batches each surrounded by a batch header and trailer record. FormatCheck allows for each of the record types to be defined
for a given data file.
- Provides Regular Expression Support for Data Field Validation
Most of the time a simple regular expression can be used to determine the legal values for a field. The "RegEx" data type is used
for this purpose.
- Supports Value Lists
Sometimes a field must contain a value from a predefined list. That list can be associated with the field in FormatCheck's definition.
- Multiple Record Id Value Comparison
Some flat files contain multiple feeds, or batches, within them. These batches are typically surrounded by headers and trailers
which have an ID used to relate them. FormatCheck supports naming a data field so that its value is stored and later compared to
a value in another field and record.
- Record Predecessor Requirements
A file with multiple record types may have a requirement for the type of record that can precede another. For instance, in a PO file, a detail record
cannot occur before the first header record. This means that a detail record will have either a header or detail record as its predecessor.
- Required Field Grouping
In some instances at least one field out of a set is required for a given record to be considered valid. For instance, if a record contains contact information
it may be that there must be at least one of the contact methods supplied (e.g. email, phone or pager field). FormatCheck allows the data fields to be assigned a group number
and then a requirement test performed based on "Or" or "XOr" logic. With an "Or" relationship at least one of the values must be supplied.
With the "XOr" relationship, one and only one may be supplied.
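
Two of the rules in the list above, required field grouping and record predecessor requirements, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not FormatCheck's code: the function names group_satisfied and predecessor_errors are hypothetical, a value is assumed to count as "supplied" when it is non-empty after trimming whitespace, and the rule that a header may follow another header is invented for the example.

```python
def group_satisfied(values, logic):
    """Check a group of field values using "Or" or "XOr" logic.

    Assumption of this sketch: a value counts as supplied when it is
    non-empty after stripping whitespace.
    """
    supplied = sum(1 for value in values if value.strip())
    if logic == "Or":
        return supplied >= 1   # at least one value supplied
    if logic == "XOr":
        return supplied == 1   # one and only one value supplied
    raise ValueError(f"unknown group logic: {logic!r}")


def predecessor_errors(record_types, allowed):
    """Report records whose predecessor is not allowed.

    `allowed` maps a record type to the set of types that may come
    directly before it; None stands for the start of the file.
    """
    errors = []
    previous = None
    for line_number, current in enumerate(record_types, start=1):
        if previous not in allowed.get(current, set()):
            label = "start of file" if previous is None else previous
            errors.append(
                f"line {line_number}: {current} may not follow {label}"
            )
        previous = current
    return errors


# Contact group: email, phone, pager.
print(group_satisfied(["a@example.com", "", ""], "XOr"))          # True
print(group_satisfied(["a@example.com", "555-0100", ""], "XOr"))  # False
print(group_satisfied(["", "", ""], "Or"))                        # False

# Hypothetical PO file rules: a detail needs a header or detail before it.
po_rules = {
    "Header": {None, "Header", "Detail"},
    "Detail": {"Header", "Detail"},
}
print(predecessor_errors(["Header", "Detail", "Detail"], po_rules))  # []
print(predecessor_errors(["Detail", "Header"], po_rules))
```

In the last call the detail record on line 1 is reported because nothing precedes it, which matches the PO example in which a detail cannot occur before the first header.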