I'm looking for some quick data validation tools[1].


At issue are numerous flatfiles received from various data vendors. These generally have an allegedly prescribed data layout (either delimited, with a fixed number of fields per record, or columnar, with a fixed record length), and an alleged record count.


The careful reader will have noted the use of the word "allegedly" twice in the above paragraph. Review of the data shows that neither the record count nor the data layout may be reliably depended on. Fortunately the data layout appears to be constant, if nonconformant, in most cases. Record counts vary from those reported by a few records (typical) to hundreds of thousands (an apparent "missed digit" typo).


As part of the data uptake process, verifying the number of records, record lengths, delimiter counts, etc., would be of some utility. I'm wondering if there are any existing tools in the Windows environment to do this.[2]


--------------------

Notes:
  1. OK, truthfully, I'm not. I'm hoping they don't exist, and I'll be obliged to create same. Preferably in Perl. Along the lines of reporting: records, max record length, min record length, mean, std. dev., perhaps the ten most frequent lengths, and delimiters per record (if requested). Which would require installing Perl. And Cygwin. On several desktops. Muwahahahah!!!
  2. He's lying.
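For what it's worth, the reporting wished for in note 1 fits in a few dozen lines of core Perl. This is only a sketch of one possible layout (the `profile` subroutine and its output fields are my own invention, not any existing tool's interface):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(sum max min);

# profile($path, $delim) reads one flat file and returns a hashref of
# summary stats: record count, min/max/mean/std-dev record length, the
# ten most frequent lengths, and (if $delim is given) a histogram of
# delimiter counts per record.
sub profile {
    my ($file, $delim) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!\n";
    my (@lengths, %length_freq, %delim_freq);
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;              # tolerate CRLF and LF endings
        my $len = length $line;
        push @lengths, $len;
        $length_freq{$len}++;
        if (defined $delim) {
            my $count = () = $line =~ /\Q$delim\E/g;  # delimiters per record
            $delim_freq{$count}++;
        }
    }
    close $fh;
    my $n = @lengths or die "No records in $file\n";
    my $mean = sum(@lengths) / $n;
    my $var  = sum(map { ($_ - $mean) ** 2 } @lengths) / $n;
    # ten most frequent record lengths, most common first
    my @top = grep { defined }
              (sort { $length_freq{$b} <=> $length_freq{$a} || $a <=> $b }
               keys %length_freq)[0 .. 9];
    return {
        records     => $n,
        max         => max(@lengths),
        min         => min(@lengths),
        mean        => $mean,
        stddev      => sqrt($var),
        top_lengths => \@top,
        delims      => \%delim_freq,
    };
}

# Command-line use: profile.pl FILE [DELIMITER]
if (!caller) {
    my ($file, $delim) = @ARGV;
    die "Usage: $0 FILE [DELIMITER]\n" unless defined $file;
    my $r = profile($file, $delim);
    printf "records: %d  max: %d  min: %d  mean: %.2f  stddev: %.2f\n",
        @{$r}{qw(records max min mean stddev)};
    print "most frequent lengths: @{$r->{top_lengths}}\n";
}
```

Comparing `$r->{records}` against the vendor's alleged count, and checking that `$r->{delims}` has a single key (or that min and max length agree, for the columnar files), would cover the verification described above. No Cygwin required, alas.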