Oct 31, 2008, 9:57 AM
Post #1 of 15
string processing efficiency
Hi, I know there is a better way to do the following than how I have it:
Take a line from a file, get a substring at a specific position and a specific length. This is going to be a zip code (5 or 9 digits).
Take a broken zip code (you know how Excel kills leading zeroes...) and shift it $numberOfZeroesMissing places to the right.
Insert $numberOfZeroesMissing 0's on the left.
Replace that substring in the original line, write to file. Position, length, input, and output are ARGVs.
Example: MA1526 10312008 should become MA0152610312008 (position 3, length 5, one zero missing)
Example: PR1526521  10312008 should become PR00152652110312008 (position 3, length 9, two zeroes missing, so the field ends in two blanks)
The way I have it now, I'm using split, foreach loops, whiles nested in whiles, ifs, character arrays, etc., and it's over 90 lines of actual code. It just runs slowly. The files I work on are 200,000+ lines of text with fixed-width fields, sent by clients who probably build them in Excel in parts. Northeast and Puerto Rico zips start with 0 or 00.
I KNOW it's possible to cut it at least in half using regexes, but I can't figure out an efficient way. Can you help me? Thanks!
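For reference, here is a rough sketch of how the steps above might look with substr and a single regex. It is only an illustration under some assumptions: the position is 1-based, a broken zip is left-aligned digits followed by one blank per missing zero, and the names fix_zip and process_file are made up for this example, not taken from any existing code.

```perl
use strict;
use warnings;

# Hypothetical helper: repair the zip field that starts at 1-based
# column $pos and is $len characters wide. Assumes a broken zip shows
# up as digits followed by one trailing blank per missing zero.
sub fix_zip {
    my ($line, $pos, $len) = @_;

    # Leave lines that are too short to contain the field untouched.
    return $line if length($line) < $pos - 1 + $len;

    my $field = substr($line, $pos - 1, $len);

    # Digits then blanks, anchored with \A and \z so a newline inside
    # the field can never be swallowed by the match.
    if ($field =~ /\A(\d+)( +)\z/) {
        my $fixed = ('0' x length($2)) . $1;      # zeroes on the left, digits shifted right
        substr($line, $pos - 1, $len, $fixed);    # 4-argument substr replaces in place
    }
    return $line;
}

# Hypothetical driver: position, length, input file, output file.
sub process_file {
    my ($pos, $len, $in_path, $out_path) = @_;
    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    while (my $line = <$in>) {
        print {$out} fix_zip($line, $pos, $len);
    }
    close $in;
    close $out or die "Cannot close $out_path: $!";
}

# Only run the driver when all four arguments are supplied.
process_file(@ARGV) if @ARGV == 4;
```

With the two examples from the post, fix_zip("MA1526 10312008", 3, 5) gives MA0152610312008 and fix_zip("PR1526521  10312008", 3, 9) gives PR00152652110312008. Fields that are already full-width digits, or that contain anything other than digits and trailing blanks, are passed through unchanged, so each line costs one substr and one regex match.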