New Can someone do my job for me? :-)
To cut a long story short, we've received a corrupted text file. To keep it all anonymous, say it has fixed-length records of 100 bytes, but bytes 20 and 21 sometimes have a carriage return in them, so the program that processes the file runs into trouble.

It's on HP-UX.

My unix script-fu is weak - I could do it in COBOL, but, I don't have access to the compiler :(

If someone has a moment, can they make me feel like an idiot by showing me how straightforward it is to move spaces to two bytes in a fixed-record-length file?

Thanks, coz we're in a right panic at the mo'.
John.
Two out of three people wonder where the other one is.
New man tr


Peter
[link|http://www.ubuntulinux.org|Ubuntu Linux]
[link|http://www.kuro5hin.org|There is no K5 Cabal]
[link|http://guildenstern.dyndns.org|Home]
Use P2P for legitimate purposes!
New Only if there are no legitimate carriage returns
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
New Is there a record separator?
Let me assume a return.

Do you want error checks? Let me assume yes.

Here's an ugly solution. Save this to a file named, say, fixup and then "perl fixup in > out". In case it does not work, do NOT try to edit in place. (Right now you have one problem, you do not want two...)
#! /usr/bin/perl -w
use strict;

# Each record is 100 bytes plus a trailing newline, so a good line has length 101.
while (<>) {
  if (length($_) < 101) {
    # A stray return in byte 20 or 21 ends the line early, at length 20 or 21.
    if (length($_) == 20 or length($_) == 21) {
      # Join the next line on, then replace bytes 20, 21.
      $_ .= <>;
      substr($_, 19, 2, "  ");
    }
    else {
      print STDERR "Unexpected return at line $. not in bytes 20 or 21???\n";
    }
  }
  if (length($_) != 101) {
    print STDERR "Line $. is not of length 101?\n";
  }
  print $_;
}


Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
New Re: Can someone do my job for me? :-)
If it's truly fixed length, with no record separator:
#!/usr/bin/python

# Replace a stray newline in byte 20 or 21 of each fixed-length record.
badchar = b'\n'
reclen = 100

# Binary mode keeps byte offsets exact (no newline translation).
with open('/home/anderson/corrupt.txt', 'rb') as inf, \
     open('/home/anderson/fixed.txt', 'wb') as outf:
    while True:
        rec = inf.read(reclen)
        if not rec:
            break

        # Slices rather than indexes, so a short final record cannot raise.
        if rec[19:20] == badchar:
            rec = rec[:19] + b' ' + rec[20:]
        if rec[20:21] == badchar:
            rec = rec[:20] + b' ' + rec[21:]
        outf.write(rec)
Otherwise, change the 100 to 101 on UNIX or 102 on Windows to allow for the line terminator. Very rough, but it should get you there.
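For trying the idea on a small sample before touching the real file, the same repair can be sketched as a function over bytes. This is only a sketch: `fix_records` and its parameters are names made up for illustration, and it assumes the stray newline overwrote a byte (so every record is still exactly `reclen` bytes, terminator included) rather than being inserted.

```python
def fix_records(data, reclen=101, positions=(19, 20), badchar=b"\n"):
    """Blank out badchar at the given 0-based offsets of each fixed-length record.

    Hypothetical helper: assumes each record is exactly reclen bytes,
    terminator included, so the stray byte replaced data rather than adding to it.
    """
    out = bytearray(data)
    for start in range(0, len(out), reclen):
        for pos in positions:
            i = start + pos
            # Only touch the byte if it really is the stray newline.
            if i < len(out) and out[i:i + 1] == badchar:
                out[i:i + 1] = b" "
    return bytes(out)


if __name__ == "__main__":
    good = b"A" * 100 + b"\n"
    bad = b"B" * 19 + b"\n" + b"B" * 80 + b"\n"
    fixed = fix_records(good + bad)
    print(fixed == good + b"B" * 19 + b" " + b"B" * 80 + b"\n")
```

Because it only replaces a byte that actually matches `badchar`, legitimate data in those two positions is left alone.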
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
Edited by admin Feb. 28, 2005, 05:55:12 PM EST
New Might lose data
I'm a real naysayer today.

But anyways if bytes 20 and 21 sometimes have legitimate data, you'll overwrite them with spaces.
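One way to sidestep that, sketched here under assumptions: when a line comes up short, rejoin it with the next one and put a single space where the stray return was, leaving the neighbouring byte untouched. The function name `rejoin_split_records` and its parameters are invented for this example, and it assumes one stray return per record and lines already stripped of their trailing newline.

```python
def rejoin_split_records(lines, reclen=100, split_lengths=(19, 20)):
    """Rejoin records that a stray newline split in two.

    lines: list of bytes objects, trailing newlines already removed.
    A line of length 19 or 20 is treated as the front half of a record whose
    byte 20 or 21 was a newline. It is joined to the next line with a single
    space in place of that newline, but only if the result has length reclen.
    """
    fixed = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if len(line) in split_lengths and i + 1 < len(lines):
            candidate = line + b" " + lines[i + 1]
            if len(candidate) == reclen:
                fixed.append(candidate)
                i += 2  # consumed both halves
                continue
        # Good record, or something unexpected: pass it through unchanged.
        fixed.append(line)
        i += 1
    return fixed


if __name__ == "__main__":
    sample = [b"A" * 100, b"B" * 19, b"C" * 80, b"D" * 20, b"E" * 79]
    for rec in rejoin_split_records(sample):
        print(len(rec))
```

Anything that does not rejoin to exactly `reclen` bytes is passed through as-is, so it still shows up for a human to look at.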

Ben
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
New Overwriting is fine - it's a field we don't actually use.
(I know, I'm a typical user, putting in spec changes at the last minute...)
Two out of three people wonder where the other one is.
Edited by Meerkat Feb. 28, 2005, 05:33:50 PM EST
New I try not to assume about such things
I have come to believe that idealism without discipline is a quick road to disaster, while discipline without idealism is pointless. -- Aaron Ward (my brother)
New Well, I changed it to check regardless.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New You guys *all* rock!
I'd tried tr but was foiled because the file does have carriage returns at the end. I couldn't get dd to do what I wanted either in the short amount of time I spent trying.

It was easy to identify the records that were 'bad', and since there were only 64 of them, and people were jumping up and down, I just got the line numbers of the bad records and fixed it in vi. Very low-tech, I know.
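For next time, finding those line numbers could be scripted along these lines. This is a rough sketch, not what was actually run: `bad_line_numbers` is a made-up name, and it assumes good records are exactly 100 bytes followed by a newline, so any line of a different length is suspect.

```python
def bad_line_numbers(data, reclen=100):
    """Return the 1-based numbers of lines whose length is not reclen.

    Hypothetical helper: data is the whole file as bytes, with records
    separated by newlines. A record split by a stray newline shows up
    as two short lines, and both are reported.
    """
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final newline
    return [n for n, line in enumerate(lines, start=1) if len(line) != reclen]


if __name__ == "__main__":
    good = b"A" * 100 + b"\n"
    bad = b"B" * 19 + b"\n" + b"B" * 80 + b"\n"
    print(bad_line_numbers(good + bad + good))
```

The numbers it reports are the ones you would jump to in vi.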

Proof of the pudding should follow in a few minutes when the job in question re-cycles.

I've saved off your scripts for next time this happens - and for my own learning.

Thanks again all - as an old tv show used to (almost) say: 'The nature of ziwt was irrepressible!' :)


edit: Can't even spell ziwt. Shoot me now...
Edited by Meerkat Feb. 28, 2005, 05:35:40 PM EST
New The Proper Way
#!/usr/bin/pfy

while (!sorted) do {
  my $attitude = pester(pfy);
  if ($attitude = "bad") then {
    threaten(pfy);
    harangue(pfy);
  }
  drink ($coffee);
  send ($email);
  return $to_pub;
}


Peter
[link|http://www.ubuntulinux.org|Ubuntu Linux]
[link|http://www.kuro5hin.org|There is no K5 Cabal]
[link|http://guildenstern.dyndns.org|Home]
Use P2P for legitimate purposes!
New Perfect! :)
Two out of three people wonder where the other one is.
     Can someone do my job for me? :-) - (Meerkat) - (11)
         man tr -NT - (pwhysall) - (1)
             Only if there are no legitimate carriage returns -NT - (ben_tilly)
         Is there a record separator? - (ben_tilly)
         Re: Can someone do my job for me? :-) - (admin) - (4)
             Might lose data - (ben_tilly) - (3)
                 Overwriting is fine - it's a field we don't actually use. - (Meerkat) - (2)
                     I try not to assume about such things -NT - (ben_tilly)
                     Well, I changed it to check regardless. -NT - (admin)
         You guys *all* rock! - (Meerkat)
         The Proper Way - (pwhysall) - (1)
             Perfect! :) -NT - (Meerkat)