
randizzle
Novice
Oct 31, 2008, 9:57 AM
Post #1 of 15
string processing efficiency
Hi,

I know there is a better way to do the following than how I have it:

1. Take a line from a file and get a substring at a specific position with a specific length. This is going to be a zip code (5 or 9 digits).
2. Take a broken zip code (you know how Excel kills leading zeroes...) and shift it $numberOfZeroesMissing characters to the right.
3. Insert $numberOfZeroesMissing zeroes on the left.
4. Replace that substring in the original line and write the line to a file.

Position, length, input, and output are ARGVs.

Example: MA1526 10312008 should become MA0152610312008 (position 3, length 5, one zero missing)

Example: PR1526521  10312008 should become PR00152652110312008 (position 3, length 9, two zeroes missing)

The way I have it now, I'm using split, foreach loops, whiles nested in whiles, ifs, character arrays, etc., and it's over 90 lines of actual code. It just runs slowly. The files I work on are text files of 200,000+ lines with fixed-width fields, sent by clients who probably build them in Excel in parts. Northeast and Puerto Rico zips start with 0 or 00.

I KNOW it's possible to cut the code at least in half using regexes, but I can't figure out an efficient way. Can you help me?

Thanks!
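
For reference, the steps described above might be sketched roughly as follows. This is only a minimal sketch under several assumptions that are not stated in the post: the position is 1-based, a broken zip is left-aligned and padded with trailing spaces to the full field width (so the 9-digit example carries two spaces after the digits), and the arguments arrive in the order position, length, input, output. The name `fix_zip` is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: repair the zip field in one fixed-width line.
# $pos is the 1-based start column of the field, $len its width (5 or 9).
sub fix_zip {
    my ($line, $pos, $len) = @_;
    my $start = $pos - 1;                            # substr() counts from 0
    return $line if length($line) < $start + $len;   # line too short: leave as-is

    my $field = substr($line, $start, $len);
    # Only touch fields that are digits followed by one or more padding spaces.
    if ($field =~ /^(\d+) +\z/) {
        my $fixed = ('0' x ($len - length($1))) . $1;  # left-pad with zeroes
        substr($line, $start, $len, $fixed);           # 4-arg substr replaces in place
    }
    return $line;
}

# Assumed argument order: position, length, input file, output file.
if (@ARGV) {
    die "usage: $0 position length input output\n" unless @ARGV == 4;
    my ($pos, $len, $in_file, $out_file) = @ARGV;
    open my $in,  '<', $in_file  or die "Cannot read $in_file: $!";
    open my $out, '>', $out_file or die "Cannot write $out_file: $!";
    while (my $line = <$in>) {
        print {$out} fix_zip($line, $pos, $len);
    }
    close $out or die "Cannot close $out_file: $!";
}
```

Saved under a hypothetical name such as `fixzip.pl`, it would be run as `perl fixzip.pl 3 5 input.txt output.txt`. Each line costs one `substr`, one regex match, and at most one replacement, and fields that already fill their full width are passed through unchanged.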