Kenosis
User
Apr 10, 2013, 4:44 PM
Re: [jb60606] Comparing two CSV files
Hi jb60606! Try the following:
    use strict;
    use warnings;

    # Hold back the second file name so the first <> loop reads only the first file.
    my $inFile02 = pop;
    my %hash;

    # Pass 1: index each record of the first file by its key.
    while (<>) {
        chomp;
        my @fields = split /,/;
        my $key = $fields[1] =~ /^\d{6,}$/ ? $fields[1] : $fields[0] . $fields[1] . $fields[8];
        $hash{$key} = \@fields if @fields;
    }

    # Pass 2: read the second file and compare matching records field by field.
    push @ARGV, $inFile02;

    while (<>) {
        chomp;
        my @fields = split /,/;
        my $key = $fields[1] =~ /^\d{6,}$/ ? $fields[1] : $fields[0] . $fields[1] . $fields[8];
        next unless $hash{ $key };

        # Collect "index|old value|new value" for every field that differs.
        my @results;
        for my $i ( 0 .. @fields - 1 ) {
            push @results, "$i|$hash{ $key }[$i]|$fields[$i]" if $hash{ $key }[$i] ne $fields[$i];
        }

        print "$.," . ( join ',', @results ) . "\n" if @results;
    }

You'll note the addition of the following:

    my $key = $fields[1] =~ /^\d{6,}$/ ? $fields[1] : $fields[0] . $fields[1] . $fields[8];

This ternary expression tests the field at index 1 (SeqNum) for six or more digits, which your datasets seem to have. If it matches, that field is used as the key; otherwise the concatenation of the fields at indices 0, 1, and 8 serves as the unique key. (Of course, you'll have to choose fields to concatenate that you expect to be identical in the two records whose fields you're examining for differences.)

I tried this on a data set with the SeqNum removed from one record and it worked, although your mileage may vary, so some tweaking may be in order. Hope this helps!
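
Since the same key expression appears in both loops, it could be pulled out into one subroutine. What follows is only a sketch under a few assumptions: the name make_key is hypothetical (it is not in the script above), the sample rows are made up, and missing fields are treated as empty strings so that short rows do not trigger "uninitialized" warnings.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): build the lookup key
# for one record from its list of fields.
sub make_key {
    my @fields = @_;

    # Assumption: treat missing fields as empty strings so short rows
    # do not produce "uninitialized value" warnings.
    my ( $f0, $f1, $f8 ) = map { defined $_ ? $_ : '' } @fields[ 0, 1, 8 ];

    # Six or more digits in the field at index 1: use it directly as the key.
    return $f1 if $f1 =~ /^\d{6,}$/;

    # Otherwise fall back to concatenating the fields at indices 0, 1, and 8.
    return $f0 . $f1 . $f8;
}

# Made-up sample rows, for illustration only.
print make_key( split /,/, 'A,1234567,x,x,x,x,x,x,Z' ), "\n";    # prints 1234567
print make_key( split /,/, 'A,,x,x,x,x,x,x,Z' ), "\n";           # prints AZ
```

Each loop could then call make_key(@fields) in place of the inline ternary, so a change to the fallback fields only has to be made in one spot.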
(This post was edited by Kenosis on Apr 10, 2013, 4:59 PM)