
stuckinarut
User
Feb 27, 2014, 11:32 AM
Post #1 of 15
(5947 views)
|
Mindbending re-tweak challenge to compare files in reverse order & flag output
|
I need to re-tweak a script I've used to merge two files and flag common entries with a single trailing space and a letter (or letters) to indicate which list(s) each item occurs in. I am using Windows XP [version 5.1.2600] (I think that's right).

perl compare.pl listG.txt listL.txt >combolist.txt

Example:

listG.txt
AS-298
AS-375
AS-402
(etc., etc.)

and listL.txt
AS-402
AS-590
(etc., etc.)

and yields:

AS-298 G
AS-375 G
AS-402 G L
AS-590 L

It works great!

#!/usr/bin/perl
use strict;
use warnings;

# RUN AS: listcompare.pl listG.txt listL.txt

my @letters = qw(G L);
my $letter = shift @letters;
my %lines;

while (<>) {
    chomp;
    if (my $l = $lines{$_}) {
        my $ll = $l->{letters};
        next if ($ll->[-1] eq $letter);
        push @$ll, $letter;
    }
    else {
        $lines{$_} = {
            f => scalar(@letters),
            letters => [$letter],
        }
    }
    if (eof) {
        $letter = shift @letters;
    }
}

print $_, ' ', join('', @{$lines{$_}{letters}}), "\n" for sort keys %lines;

# END OF SCRIPT

The need is to re-tweak the script for a pressing task of comparing two different lists but NOT merging them this time. Instead, the output can ONLY be the listL.txt lines, with a trailing {space} and 'YES' flagging any/all of those lines that match a listM.txt entry, and sorted in alphabetical sequence A-Z on the first column. If there is no match, only the original line is output intact (without a 'YES'). Then comes the manual labor of reviewing each of the lines in the output ;-(

Each of the two new lists will have 3 columns (separated by a space). Both lists will have about 2,500 lines.

listL.txt (3 sample lines)
D9PM 40 L7MRQ
D9PM 40 A3WX
D9PM 80 Q5BAL
(etc., etc.)

listM.txt (3 sample lines)
A3WX 40 R5QRC
A3WX 40 D9PM
A3WX 80 L2AFT
(etc., etc.)

The mindbending challenge is that the 3 columns (space separated fields) in listL.txt must be matched to the 3 columns (space separated fields) in listM.txt, BUT IN REVERSE ORDER {SIGH}. In other words, the only match which would occur in the above examples would be for:

D9PM 40 A3WX

And the flagged output line from listL.txt would be:

D9PM 40 A3WX YES

It would really help, if there are any 'Duplicate' matches, to add maybe another space and the word 'DUPE' after 'YES', but I can try to identify these manually {MAJOR SIGH}.

Can this even be done? Hopefully I have explained things correctly. My eyeballs are rolling in all directions ;-(

Any assistance would be greatly appreciated.

Thanks very much!
-stuckinarut
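
For what it's worth, the reverse-order matching described above could be approached roughly as follows. This is only a minimal sketch and rests on assumptions that are not spelled out in the post: a 'Duplicate' is taken to mean that the reversed entry occurs more than once in listM.txt, ties on the first column are broken by comparing the whole line, and blank lines are skipped. The names flag_reversed and read_lines are made up for illustration. Run without arguments, it falls back to the three sample lines from each list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the listL lines, sorted, with ' YES' appended
# when the same fields occur in listM in reverse order. ' DUPE' is added when
# listM holds that reversed entry more than once (an assumed definition).
sub flag_reversed {
    my ($l_lines, $m_lines) = @_;

    # Count every listM line, keyed by its fields in REVERSED order.
    my %seen;
    for my $line (@$m_lines) {
        my @f = split ' ', $line;        # also drops trailing CR/LF
        next unless @f;                  # skip blank lines
        $seen{ join ' ', reverse @f }++;
    }

    my @out;
    for my $line (@$l_lines) {
        my @f = split ' ', $line;
        next unless @f;
        (my $orig = $line) =~ s/\s+\z//; # original line, minus line ending
        my $hits = $seen{ join ' ', @f } || 0;
        my $flag = $hits > 1 ? ' YES DUPE' : $hits ? ' YES' : '';
        push @out, [ $f[0], $orig . $flag ];
    }

    # Sort A-Z on the first column, then on the whole line as a tie-breaker.
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] or $a->[1] cmp $b->[1] } @out;
}

# Hypothetical helper: read a whole file into a list of lines.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    my @lines = <$fh>;
    close $fh;
    return @lines;
}

# With two file names (listL first, then listM) process the files;
# otherwise use the sample lines from the post.
my @list_l = @ARGV == 2 ? read_lines($ARGV[0])
           : ('D9PM 40 L7MRQ', 'D9PM 40 A3WX', 'D9PM 80 Q5BAL');
my @list_m = @ARGV == 2 ? read_lines($ARGV[1])
           : ('A3WX 40 R5QRC', 'A3WX 40 D9PM', 'A3WX 80 L2AFT');

print "$_\n" for flag_reversed(\@list_l, \@list_m);
```

Assuming it were saved under a name such as flagmatch.pl (a placeholder, not an existing script), it would be run in the same style as the earlier one: perl flagmatch.pl listL.txt listM.txt >flagged.txt. Building one lookup hash from listM.txt keeps this to a single pass over each list, which should be more than adequate for about 2,500 lines per file.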
|