Finding the differences between files

 



darokx
New User

Jan 26, 2014, 8:44 PM

Post #1 of 11 (2456 views)
Finding the differences between files

Hi,

I am new to Perl and am trying to write a program that does the following:
- Compare 2 files (old and new revisions of the same file).
- Print out the differences found in the files.
- File 1 will be the reference for the comparison.
- The fields are separated by whitespace.

File 1:
Item Name Qty
1 John 23
2 Pete 101
3 Allan 56
4 Jim 72

File 2:
Item Name Qty
1 John 24
2 Pete 101
3 Alex 56

Expected output would be:
Qty of Item 1 was changed from 23 to 24.
Name of Item 3 was changed from Allan to Alex.
Item 4 is missing in File2.

Thanks in advance.
Harry


Laurent_R
Veteran / Moderator

Jan 27, 2014, 12:03 AM

Post #2 of 11 (2452 views)
Re: [darokx] Finding the differences between files

The easiest way is probably to read file 1 and load its data into a hash (with the key probably being the name, if it is unique), and then read file 2 line by line and check for the differences.
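
A minimal sketch of that approach, assuming the item number in the first column is the unique key (rather than the name) and that the first line of each file is a header naming the columns. The sub name `diff_records` and the in-memory sample lines are illustrative and not from the thread; with real files you would read the lines from filehandles instead.

```perl
use strict;
use warnings;

# Compare two revisions of the same data, keyed on the first column.
# Takes two array refs of lines and returns a list of messages.
sub diff_records {
    my ($old_lines, $new_lines) = @_;

    # The header line names the columns, e.g. "Item Name Qty".
    my ($header, @old_rows) = @$old_lines;
    my @columns = split ' ', $header;

    # Load file 1 into a hash keyed on the first column.
    my %old;
    for my $line (@old_rows) {
        my @fields = split ' ', $line;
        next unless @fields;                 # skip blank lines
        $old{ $fields[0] } = \@fields;
    }

    my @messages;
    my (undef, @new_rows) = @$new_lines;     # skip the header of file 2
    for my $line (@new_rows) {
        my @new = split ' ', $line;
        next unless @new;
        my $old = delete $old{ $new[0] };    # remove matched keys as we go
        next unless $old;                    # key only in file 2: ignored here
        for my $i (1 .. $#columns) {
            next if $old->[$i] eq $new[$i];
            push @messages, "$columns[$i] of $columns[0] $new[0] "
                . "was changed from $old->[$i] to $new[$i].";
        }
    }

    # Whatever is left in the hash never appeared in file 2.
    push @messages, "$columns[0] $_ is missing in File2." for sort keys %old;
    return @messages;
}

my @file1 = ("Item Name Qty", "1 John 23", "2 Pete 101", "3 Allan 56", "4 Jim 72");
my @file2 = ("Item Name Qty", "1 John 24", "2 Pete 101", "3 Alex 56");
print "$_\n" for diff_records(\@file1, \@file2);
```

Deleting each key from the hash as it is matched means the leftover keys are exactly the items missing from file 2, so no second hash is needed.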


Tejas
User

Jan 27, 2014, 3:15 AM

Post #3 of 11 (2443 views)
Re: [darokx] Finding the differences between files

The easiest and fastest way is:

1. Open the first file.
2. Save all the values that have to be compared in a hash.
3. Open the second file and compare each value with the values in the hash.

A code snippet is below.

Code
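use strict;
use warnings;

# File names are taken from the command line here; adjust as needed.
my ($firstFile, $secondFile) = @ARGV;
my %hash;

open my $first, '<', $firstFile or die "could not open first file: $!\n";
while (my $line = <$first>) {
    my @elements = split ' ', $line;   # whitespace is the delimiter, as in your file
    next unless @elements;             # skip blank lines
    my $value = $elements[0];          # index of the key column; check it against your data
    $hash{$value} = 1;                 # store all the values in a hash
    # You can use a multi-level hash, e.g. $hash{$value}{xxx}{yyy}, to get your task done.
}
close $first;

# The output file names are placeholders.
open my $match,   '>', 'match.txt'   or die "could not open match.txt: $!\n";
open my $unmatch, '>', 'unmatch.txt' or die "could not open unmatch.txt: $!\n";

open my $second, '<', $secondFile or die "could not open second file: $!\n";
while (my $line = <$second>) {
    my @elements = split ' ', $line;
    next unless @elements;
    my $val = $elements[0];            # Perl arrays are zero-indexed
    if ($hash{$val}) {                 # every stored key maps to 1, so this is true when the key exists
        print {$match} $line;
    }
    else {
        print {$unmatch} $line;
    }
}
close $second;
close $match;
close $unmatch;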


Hope this helps
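
The multi-level hash mentioned in the code comment could look roughly like this: a hash of hashes keyed on the item number. The sub name `load_items` and the inner keys `name` and `qty` are chosen here for illustration, and the lines are passed in directly instead of being read from a file.

```perl
use strict;
use warnings;

# Build a hash of hashes: item number => { name => ..., qty => ... }.
sub load_items {
    my @lines = @_;
    my %items;
    for my $line (@lines) {
        my ($item, $name, $qty) = split ' ', $line;
        next unless defined $qty;            # skip blank or short lines
        $items{$item} = { name => $name, qty => $qty };
    }
    return \%items;
}

my $items = load_items("1 John 23", "2 Pete 101");
print "Item 2 is $items->{2}{name} with qty $items->{2}{qty}\n";
```

Storing the whole record, rather than just a 1, lets the second pass compare individual fields instead of only testing whether the key exists.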


(This post was edited by FishMonger on Jan 27, 2014, 9:08 AM)


teardrop
Novice


Jan 27, 2014, 8:25 AM

Post #4 of 11 (2433 views)
Re: [darokx] Finding the differences between files

You can also use Text::Diff:


Code
#!/usr/bin/perl 
use strict;
use warnings;
use Text::Diff;
my $diffs = diff 'file1' => 'file2';
print $diffs;



Laurent_R
Veteran / Moderator

Jan 27, 2014, 12:09 PM

Post #5 of 11 (2420 views)
Re: [teardrop] Finding the differences between files

Except that this module is not going to give you anything near the output expected in the OP. It does a good job of finding textual differences (just like the Unix diff utility), but not of finding finer differences in structured data.


(This post was edited by Laurent_R on Jan 27, 2014, 1:16 PM)


darokx
New User

Jan 28, 2014, 5:52 PM

Post #6 of 11 (2358 views)
Re: [Tejas] Finding the differences between files

Thanks for all the inputs. It helped me a lot with the script.

Can I use 2 hashes to compare 2 rows? If yes, what is the best way?

Thanks again.


Laurent_R
Veteran / Moderator

Jan 29, 2014, 10:56 AM

Post #7 of 11 (2326 views)
Re: [darokx] Finding the differences between files


In Reply To
Can I use 2 hashes to compare 2 rows? If yes, what is the best way?

You don't need to. You only need to store file 1 in a hash. Then read file 2 line by line, split each line the same way you split the lines of file 1, and look up the key found in file 2 in the hash built from file 1. Assuming the input files hold records with just two fields separated by a space, where the first field is unique and serves as the comparison key and the second one is the value, something like this:

Code
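use strict;
use warnings;

my %hash;
# open file 1 as $FILE1 (the file name is a placeholder)
open my $FILE1, '<', 'file1.txt' or die "Cannot open file1.txt: $!";
while (<$FILE1>) {
    my ($key, $value) = split;
    next unless defined $value;    # skip blank or incomplete lines
    $hash{$key} = $value;
}
close $FILE1;
# open file 2 as $FILE2 (the file name is a placeholder)
open my $FILE2, '<', 'file2.txt' or die "Cannot open file2.txt: $!";
while (<$FILE2>) {
    my ($key, $value) = split;
    next unless defined $value;
    if (defined $hash{$key}) {
        if ($value eq $hash{$key}) {    # eq compares as strings, so non-numeric values work too
            # print to the common file
        } else {
            # print to the difference file
        }
    } else {
        # print to the missing item file (if any)
    }
}
close $FILE2;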



darokx
New User

Jan 29, 2014, 7:04 PM

Post #8 of 11 (2318 views)
Re: [Laurent_R] Finding the differences between files

Thanks Laurent.

One more question: if the data has 3 fields separated by a space, how do you use this routine to compare the data in field 3, just like what you did with field 2?

I tried editing this code to do that, but I can't make it work :-(

Thanks for all the help. I am learning and enjoying Perl because of you all.


Laurent_R
Veteran / Moderator

Jan 29, 2014, 11:36 PM

Post #9 of 11 (2305 views)
Re: [darokx] Finding the differences between files

I took a two-field example because I could not figure out for sure, in your example, whether the comparison key should be field 1 or field 1 + field 2. Assuming it is the latter case:



Code
my @temp_array = split; 
my $key = $temp_array[0] . '-' . $temp_array[1];
my $value = $temp_array[2];
$hash{$key} = $value;


(It can be done with fewer lines of code, but I prefer to detail the steps for pedagogical purposes.)
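
For comparison, a more compact form of the same composite-key logic might look like this. The sub name `add_record` is hypothetical, and the line is passed as an argument here rather than read from `$_`.

```perl
use strict;
use warnings;

# Store the third field under a key built from the first two fields.
sub add_record {
    my ($hash, $line) = @_;
    my ($f1, $f2, $value) = split ' ', $line;
    $hash->{"$f1-$f2"} = $value;
    return;
}

my %hash;
add_record(\%hash, "1 John 23");
print "$_ => $hash{$_}\n" for sort keys %hash;
```

The `-` separator only keeps keys unambiguous if it cannot occur inside a field, which is an assumption about the data.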


(This post was edited by Laurent_R on Jan 29, 2014, 11:37 PM)


darokx
New User

Feb 3, 2014, 9:36 PM

Post #10 of 11 (1952 views)
Re: [Laurent_R] Finding the differences between files

Hi Laurent,

Here's an example of what I meant:

File 1
1 John White
2 Pete Blue
3 James Black

File 2
1 John Red
2 Andy Blue
3 James Green


I want to compare column 2 of both files, using column 1 as the comparison key.

Then I want to compare column 3 of both files, still using column 1 as the comparison key.

Can you show me how to do it with a hash?

Thanks for your help.


Laurent_R
Veteran / Moderator

Feb 4, 2014, 10:22 AM

Post #11 of 11 (1852 views)
Re: [darokx] Finding the differences between files

I showed you how in post #7 above; you just have to adapt the details (3 fields instead of two). Please try it yourself and then, if you have difficulties, show us the code you tried. Similarly, if there is something you don't understand in the code I posted, feel free to ask.
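
For readers following along, one possible adaptation is sketched below. It assumes field 1 is the unique key and stores the remaining fields of each file 1 line as an array reference, so fields 2 and 3 can be compared one by one. The sub name `compare_fields` and the message wording are illustrative, not from the thread.

```perl
use strict;
use warnings;

# Compare fields 2 and 3 of each record, using field 1 as the key.
# Takes two array refs of lines and returns a list of messages.
sub compare_fields {
    my ($file1_lines, $file2_lines) = @_;

    my %hash;
    for my $line (@$file1_lines) {
        my ($key, @values) = split ' ', $line;
        next unless defined $key;            # skip blank lines
        $hash{$key} = \@values;              # e.g. 1 => ['John', 'White']
    }

    my @report;
    for my $line (@$file2_lines) {
        my ($key, @values) = split ' ', $line;
        next unless defined $key;
        if (!exists $hash{$key}) {
            push @report, sprintf "%s: not in file 1", $key;
            next;
        }
        for my $i (0 .. $#values) {
            my $old = $hash{$key}[$i];
            next if defined $old && $old eq $values[$i];
            push @report, sprintf "%s: field %d changed from %s to %s",
                $key, $i + 2, (defined $old ? $old : '(none)'), $values[$i];
        }
    }
    return @report;
}

my @file1 = ("1 John White", "2 Pete Blue", "3 James Black");
my @file2 = ("1 John Red",   "2 Andy Blue", "3 James Green");
print "$_\n" for compare_fields(\@file1, \@file2);
```

The inner loop runs over however many value fields a line has, so the same code covers two-field, three-field, or wider records.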
