


Kenosis
User

Apr 6, 2013, 8:47 PM


Re: [jb60606] Comparing two CSV files

Perhaps the following will be helpful:


Code
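use strict;
use warnings;

# Set the second file's name aside so the first <> loop reads only inFile01.
my $inFile02 = pop;
my %hash;

# Pass 1: index each inFile01 record by its seqnum (field 1).
while (<>) {
    chomp;
    my @fields = split /,/;
    $hash{ $fields[1] } = \@fields if @fields > 1;    # skip lines with no seqnum
}

# Pass 2: put inFile02 back on @ARGV so the next <> loop reads it.
push @ARGV, $inFile02;

while (<>) {
    chomp;
    my @fields = split /,/;
    next unless @fields > 1 and $hash{ $fields[1] };

    # Collect "fieldNum|inFile01Val|inFile02Val" for every differing field.
    my @results;
    for my $i ( 0 .. $#fields ) {
        push @results, "$i|$hash{ $fields[1] }[$i]|$fields[$i]"
            if $hash{ $fields[1] }[$i] ne $fields[$i];
    }

    print "$.," . ( join ',', @results ) . "\n" if @results;
}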


Usage: perl script.pl inFile01 inFile02 [>outFile]

The last, optional parameter directs output to a file.

Using your dataset for both files, I changed just a couple of fields. Here's the output:


Code
4,6|28.640000|28.740000 
6,6|28.640000|30.640000,9|203|205
10,3|14:32:35.695000|15:32:35.695000


The first column is the line number of the second file where a field mismatch occurred. The subsequent fields have three elements:


Code
fieldNum|inFile01Val|inFile02Val
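
If you want to feed this output into another script, each line can be split back into its parts. Here's a rough sketch of one way to do that; it is not part of the script above. The sub name parse_diff_line and the hash keys (field, old, new) are names I made up for illustration, and it assumes no field value contains a comma or a "|".

```perl
use strict;
use warnings;

# Hypothetical helper: turn one output line such as
# "6,6|28.640000|30.640000,9|203|205" into a line number plus
# a list of { field, old, new } records.
# Assumes no field value contains a comma or a "|".
sub parse_diff_line {
    my ($line) = @_;
    chomp $line;

    # The first comma-separated item is the line number; the rest are mismatches.
    my ( $lineNum, @entries ) = split /,/, $line;

    my @diffs = map {
        my ( $fieldNum, $file01Val, $file02Val ) = split /\|/, $_, 3;
        +{ field => $fieldNum, old => $file01Val, new => $file02Val };
    } @entries;

    return ( $lineNum, \@diffs );
}

my ( $lineNum, $diffs ) = parse_diff_line("6,6|28.640000|30.640000,9|203|205");
print "line $lineNum: field $_->{field} changed from $_->{old} to $_->{new}\n"
    for @$diffs;
```

The leading + on the anonymous hash just tells Perl it's a hash constructor and not a block.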


The script first pops the second file's name off @ARGV (for later use) and reads through the first file's lines. It splits each line on commas and stores it in a hash, using the seqnum (the second field, index 1) as the key and a reference to the array of fields as the value.
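
To make that data structure a bit more concrete, here's a small sketch that builds the same kind of hash from lines held in memory instead of a file. The sub name index_by_seqnum is hypothetical, and the sample records are invented, not taken from your dataset.

```perl
use strict;
use warnings;

# Hypothetical helper: index CSV lines by their seqnum (field 1),
# storing a reference to each record's array of fields.
sub index_by_seqnum {
    my @lines = @_;
    my %hash;
    for my $line (@lines) {
        chomp $line;
        my @fields = split /,/, $line;
        next unless @fields > 1;    # no seqnum on this line
        $hash{ $fields[1] } = \@fields;
    }
    return \%hash;
}

# Invented sample records, not the original poster's data.
my $index = index_by_seqnum(
    "ABC,1001,09:30:00,28.64",
    "ABC,1002,09:30:01,28.70",
);

# Look up field 3 of the record whose seqnum is 1002.
print "$index->{1002}[3]\n";
```

Because each value is an array reference, any field of a record can be reached directly as $index->{$seqnum}[$fieldNum], which is what the comparison loop relies on.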

Next, the second file's name is pushed back onto @ARGV, and its lines are read and split just like the first file's. If a line's seqnum isn't in the hash built from the first file, that line is skipped.

A for loop then iterates over the fields of the two records with matching seqnums. If a pair of fields doesn't match, the field number (0 to n-1) and the two mismatched values are joined with "|" and pushed onto a temporary array (@results). If that array has any elements, the second file's line number (Perl's $.) and the array's contents are printed, separated by commas.
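
The comparison step can also be pulled out into its own sub, which makes it easy to try on a pair of records without any files. This is only a sketch: diff_fields is a name I made up, the sample records are invented, and treating a field that's missing from the first record as an empty string is my own assumption (the script above doesn't do this).

```perl
use strict;
use warnings;

# Hypothetical helper: compare two records field by field and return
# one "fieldNum|oldVal|newVal" string per mismatch.
sub diff_fields {
    my ( $old, $new ) = @_;
    my @results;
    for my $i ( 0 .. $#$new ) {
        # Assumption: a field missing from the first record counts as ''.
        my $oldVal = defined $old->[$i] ? $old->[$i] : '';
        push @results, "$i|$oldVal|$new->[$i]" if $oldVal ne $new->[$i];
    }
    return @results;
}

# Invented sample records with the same seqnum and one changed field.
my @file01 = qw(ABC 1001 09:30:00 28.64);
my @file02 = qw(ABC 1001 09:30:00 28.74);

my @results = diff_fields( \@file01, \@file02 );
print join( ',', @results ), "\n" if @results;
```

Note that ne is a string comparison, so 28.64 and 28.640000 would be reported as a mismatch even though they're numerically equal.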

Hope this helps!


(This post was edited by Kenosis on Apr 6, 2013, 9:18 PM)



