Search content of one file with the content of a second file

 



perlfree
Novice

Mar 25, 2011, 2:44 PM

Post #1 of 17 (2479 views)

Hi,

I have two files:

File1 (tab-delimited and two columns):
Ex_efxb 0.0023
MSeef 2.3000
F_ecjc 0.3338
MWEEI -0.111
DDAIij 17.777

File2:
MSeef 2.3000
F_ecjc 0.3338

I want to search the content of File1 using the content of File2 and then display the output as follows:

Date of search:
The following matches were found in File 1:

MSeef 2.3000
F_ecjc 0.3338


Here is my Perl script, which did not work for the above task:

Code
#!C:\bin\perl.exe  


my $REPORT_FILE = 'outFile.txt';

$F1 = 'File1.txt';
open(RF,"<$F1") || die "can't open $F1 $!";


$F2 = 'File2.txt';
open(RXNs,"<$F2") || die "can't open $F2 $!";
close F1;
close F2;


my $line = <RF>;
@f1 = split /\t/, $line;

my $line = <RXN>;
@f2 = $line;



open(DATA,"+>OutFile.txt") or die "Can't open data";

foreach $a (@f1){
$flag = 0;
foreach $b (@f2){
if ($a eq $b){
print DATA $line{1}."\t".$line{2}."\n" ;
$flag = 1;
last;
}
if ($flag==0){
print DATA "";
}
}

}
close DATA;


Any help getting the script to work would be appreciated.
Thanks


(This post was edited by perlfree on Mar 25, 2011, 3:05 PM)


FishMonger
Veteran / Moderator

Mar 25, 2011, 3:01 PM

Post #2 of 17 (2473 views)
Re: [perlfree] Search content of one file with the content of a second file

Start by adding these 2 lines near the top and see if you can fix the problems they point out. If you don't understand what the error messages mean, then post back with a more specific question.


Code
use warnings; 
use strict;



perlfree
Novice

Mar 25, 2011, 3:23 PM

Post #3 of 17 (2464 views)
Re: [FishMonger] Search content of one file with the content of a second file


Turning on the warnings as suggested gave the following errors:

Global symbol "$F1" requires explicit package name at testTwoFiles.pl line 7.
Global symbol "$F1" requires explicit package name at testTwoFiles.pl line 8.
Global symbol "$F1" requires explicit package name at testTwoFiles.pl line 8.
Global symbol "$F2" requires explicit package name at testTwoFiles.pl line 11.
Global symbol "$F2" requires explicit package name at testTwoFiles.pl line 12.
Global symbol "$F2" requires explicit package name at testTwoFiles.pl line 12.
Global symbol "@f1" requires explicit package name at testTwoFiles.pl line 18.
Global symbol "@f2" requires explicit package name at testTwoFiles.pl line 21.
Global symbol "@f1" requires explicit package name at testTwoFiles.pl line 27.
Global symbol "$flag" requires explicit package name at testTwoFiles.pl line 28.

Then I added 'my' to the declarations and was still left with the following errors:

Global symbol "$flag" requires explicit package name at testTwoFiles.pl line 28.

Global symbol "%line" requires explicit package name at testTwoFiles.pl line 31.

Global symbol "%line" requires explicit package name at testTwoFiles.pl line 31.

Global symbol "$flag" requires explicit package name at testTwoFiles.pl line 32.

Global symbol "$flag" requires explicit package name at testTwoFiles.pl line 35.

Execution of testTwoFiles.pl aborted due to compilation errors.


Thanks



FishMonger
Veteran / Moderator

Mar 25, 2011, 3:41 PM

Post #4 of 17 (2461 views)
Re: [perlfree] Search content of one file with the content of a second file

The second set of errors is telling you that you forgot to declare the $flag and %line vars with the my keyword.
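
As a rough illustration of what "declare with the my keyword" means under strict, here is a minimal sketch. The variable contents and the hash keys `name` and `value` are made up for the example and are not taken from the original script.

```perl
use strict;
use warnings;

# Under "use strict", every variable must be declared (here with "my")
# before its first use. The sigil decides what kind of variable it is.
my $flag   = 0;                                       # scalar: a single value
my @fields = ('MSeef', '2.3000');                     # array: an ordered list
my %line   = (name => 'MSeef', value => '2.3000');    # hash: key/value pairs

# Note the sigil when reading one element: $line{name} is an element of
# the hash %line, not something stored in a scalar called $line.
$flag = 1 if $fields[0] eq $line{name};
print "flag=$flag, value=$line{value}\n";
```

This is why strict complains about %line when the code only seems to mention $line: the braces after the name make it a hash element lookup.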


perlfree
Novice

Mar 25, 2011, 4:45 PM

Post #5 of 17 (2458 views)
Re: [FishMonger] Search content of one file with the content of a second file


Only %line is left now, but I'm not sure how to declare a global variable for it, since I can only see $line twice on that line.



(This post was edited by perlfree on Mar 25, 2011, 4:46 PM)


FishMonger
Veteran / Moderator

Mar 25, 2011, 5:02 PM

Post #6 of 17 (2452 views)
Re: [perlfree] Search content of one file with the content of a second file

This is the only line in your posted code where you're using the %line hash.

Code
print DATA $line{1}."\t".$line{2}."\n" ;


If you really meant to use that hash, then you need to declare and populate it prior to the foreach loop.

Other problems include:
1) You're opening 2 filehandles (RXNs and RF) then immediately close F1 and F2 which aren't filehandles.

2)

Code
my $line = <RF>;   
@f1 = split /\t/, $line;

Here you're reading in a single line, when you really should be reading in the entire file and loading a hash as you go.

3)

Code
my $line = <RXN>;

Here you're attempting to read from the RXN filehandle, but the handle you actually opened is RXNs.

There are numerous other problems, but start by fixing these.
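
The three points above (close the handle that was actually opened, close it only after reading, and read the whole file into a hash) might be sketched as follows. The helper name `load_pairs` and the in-memory sample data are assumptions made so the example runs on its own; a real script would open File1.txt instead.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line of a tab-delimited, two-column
# file from an already opened handle and return a hash of name => value.
sub load_pairs {
    my ($fh) = @_;
    my %pairs;
    while (my $line = <$fh>) {
        chomp $line;
        next unless length $line;                  # skip blank lines
        my ($name, $value) = split /\t/, $line;
        $pairs{$name} = $value;
    }
    return %pairs;
}

# Stand-in for File1.txt: an in-memory handle behaves like one opened
# on a real file.
my $sample = "Ex_efxb\t0.0023\nMSeef\t2.3000\nF_ecjc\t0.3338\n";
open my $rf_fh, '<', \$sample or die "can't open in-memory data: $!";
my %file1 = load_pairs($rf_fh);
close $rf_fh;                                      # close only after reading

print "$_ => $file1{$_}\n" for sort keys %file1;
```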


perlfree
Novice

Mar 25, 2011, 5:34 PM

Post #7 of 17 (2449 views)
Re: [FishMonger] Search content of one file with the content of a second file

This is what I have now:

Code
#!C:\bin\perl.exe 
use warnings;
use strict;

#$REPORT_FILE = 'outFile.txt';

my $F1 = 'File1.txt';
open(RF,"<$F1") || die "can't open $F1 $!";


my $F2 = 'File2.txt';
open(RXNs,"<$F2") || die "can't open $F2 $!";
close RF;
close RXNs;

#my $line ();
my %var1;
my %var2

while (my $line = <RF>{
$line = split('\t');
$var1{$1} = {2};
}
close(RF);

while (my $line = <RXNs>{
$line = split('\n');
$var1{$1};
}
close(RXNs);



open(DATA,"+>OutFile.txt") or die "Can't open data";

foreach $a (@f1){
my $flag = 0;
foreach $b (@f2){
if ($a eq $b){

print DATA $line{1}."\t".$line{2}."\n" ;
$flag = 1;
last;
}
if ($flag==0){
print DATA "";
}
}

}
close DATA;



FishMonger
Veteran / Moderator

Mar 25, 2011, 5:44 PM

Post #8 of 17 (2446 views)
Re: [perlfree] Search content of one file with the content of a second file

That script won't even compile, so start with the first error and fix it then rerun the script and fix the next error in line. Keep doing that until there are no more errors. If you don't know how to fix the error, post the updated code and the exact error message.


perlfree
Novice

Mar 25, 2011, 6:55 PM

Post #9 of 17 (2441 views)
Re: [FishMonger] Search content of one file with the content of a second file

I have modified the script a bit more according to the compilation errors.

Here's the latest script:

Code
#!C:\bin\perl.exe 
use warnings;
use strict;

my $REPORT_FILE = 'outFile.txt';

my $F1 = 'File1.txt';
open(RF,"<$F1") || die "can't open $F1 $!";


my $F2 = 'File2.txt';
open(RXNs,"<$F2") || die "can't open $F2 $!";
close RF;
close RXNs;

my %line;
my %var1;
my %var2;

while (my $line = <RF>){
$line = split('\t');
$var1{$1} = {2};
}
close(RF);

while (my $line = <RXNs>){
$line = split('\n');
$var1{$1}={1};
}
close(RXNs);



open(DATA,"+>OutFile.txt") or die "Can't open data";



if (exists $var1{$var2}){

print DATA $var1{1}."\t".$var2{2}."\n" ;

}
else {

print "$var2 not found in the file\n";
}

close DATA;



The remaining compilation error for the script is:
exists argument is not a HASH or ARRAY element or a subroutine at testTwoFiles.pl line 38.

I'm not sure how to fix the error.


FishMonger
Veteran / Moderator

Mar 25, 2011, 7:37 PM

Post #10 of 17 (2438 views)
Re: [perlfree] Search content of one file with the content of a second file

No, actually these are the compilation errors.

Quote
D:\perl>perl -c perlfree.pl
Global symbol "$var2" requires explicit package name at perlfree.pl line 38.
Global symbol "$var2" requires explicit package name at perlfree.pl line 45.
perlfree.pl had compilation errors.


Line 18 declares a %var2 hash, which you never assign anything to, and you never declare or assign the $var2 scalar variable.

Also, you've closed the RF and RXNs filehandles immediately after opening them, which means you won't be able to read in the file contents in the while loops.
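
To illustrate why declaring %var2 does not also declare $var2, here is a small sketch. The key and value used are invented for the example.

```perl
use strict;
use warnings;

my %var2 = (MSeef => 1);   # a hash named var2
my $var2 = 'MSeef';        # an unrelated scalar that happens to share the name

# $var2{MSeef} is an element of the hash %var2, so "exists" accepts it.
print "hash element exists\n" if exists $var2{MSeef};

# $var2 on its own is the scalar; here it serves as a key into the hash.
print "scalar used as key: $var2\n" if exists $var2{$var2};
```

Perl keeps scalars, arrays and hashes in separate namespaces, so each of the two has to be declared on its own.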


perlfree
Novice

Mar 26, 2011, 7:25 AM

Post #11 of 17 (2427 views)
Re: [FishMonger] Search content of one file with the content of a second file

I have done something about the var2 assignments; here is the latest script and the errors:

Code
#!C:\bin\perl.exe 
use warnings;
use strict;

my $REPORT_FILE = 'outFile.txt';

my $F1 = 'File1.txt';
open(RF,"<$F1") || die "can't open $F1 $!";


my $F2 = 'File2.txt';
open(RXNs,"<$F2") || die "can't open $F2 $!";


my %line;
my %var1;
my %var2;

while (my $line = <RF>){
$line = split('\t');
$var1{$1} = {2};
}
close(RF);

while (my $line = <RXNs>){
$line = split('\n');
$var1{$2}={1};
}
close(RXNs);



open(DATA,"+>OutFile.txt") or die "Can't open data";



if (exists $var1{$var2}){

print DATA $var1{1}."\t".$var2{2}."\n" ;

}
else {

print "$var2 not found in the file\n";
}

close DATA;

Errors:
Global symbol "$var2" requires explicit package name at testTwoFiles.pl line 37.

Global symbol "$var2" requires explicit package name at testTwoFiles.pl line 44.

testTwoFiles.pl had compilation errors.

Not sure how to deal with these errors.


FishMonger
Veteran / Moderator

Mar 26, 2011, 9:06 AM

Post #12 of 17 (2418 views)
Re: [perlfree] Search content of one file with the content of a second file

It appears that you don't understand the difference between a scalar and a hash or how to loop over the hash elements.

This is clearly a homework assignment so I can't simply fix your script for you. I can only give some guidance.

If you add this just prior to the if( ){ } block, it will get rid of the compilation error.

Code
my $var2;


However, there are many other problems in your script which I haven't pointed out that will prevent it from doing what it's supposed to do.

You probably should read over your class notes and the sections of your textbook which relate to what you're needing to do.
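
For reference, looping over the elements of a hash, as mentioned at the top of this post, usually looks something like the sketch below. The hash contents are sample values, not the output of the script under discussion.

```perl
use strict;
use warnings;

my %var1 = (MSeef => '2.3000', F_ecjc => '0.3338');

# keys returns the hash's keys in no particular order; sorting them
# gives stable output.
for my $name (sort keys %var1) {
    print "$name\t$var1{$name}\n";
}
```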


Karazam
User

Mar 26, 2011, 9:23 AM

Post #13 of 17 (2417 views)
Re: [perlfree] Search content of one file with the content of a second file


Code
while (my $line = <RF>){  
$line = split('\t');
$var1{$1} = {2};
}


There are three errors here. First, "$line = split('\t');" won't do anything useful in this context.
$line is one line of your input file and already has a value; "split" with no second argument works on $_,
which isn't set here, and assigning its result to a scalar gives a field count, not the fields.
Second, "{2}" builds an anonymous hash reference, not a plain value; to insert a value into a hash, the
syntax would be "$var1{$1} = 2;". Third, $1 is a regex capture variable, and it is undefined here because
no match has been performed. What is it you want this piece of code to do exactly?
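
As a hedged illustration of the split behaviour described above, the sketch below uses one sample line. It assumes Perl 5.12 or later, where split in scalar context simply returns the number of fields; the variable names are chosen for the example.

```perl
use strict;
use warnings;

my $line = "MSeef\t2.3000\n";
chomp $line;

# Assigning split to a scalar yields the number of fields, not the fields
# (assumes Perl 5.12+, where this no longer touches @_).
my $count = split /\t/, $line;

# Assigning to a list captures the fields themselves.
my ($name, $value) = split /\t/, $line;

my %var1;
$var1{$name} = $value;      # plain scalar on the right, no braces

print "$count fields; $name => $var1{$name}\n";
```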


Code
open(DATA,"+>OutFile.txt") or die "Can't open data";


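Don't use '+>'. It opens the file for both reading and writing and truncates it first, which is almost never what you want. Use the three-argument form with '>':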


Code
open(DATA, '>', 'OutFile.txt') or die "Can't open data";


Or even better, use lexical filehandles (that goes for your previous open's too):


Code
open my $data_fh, '>', 'OutFile.txt' or die "Can't open data";


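(And use 'or', not '||': without parentheses around open's arguments, '||' binds to the last argument instead of to the result of open.)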
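
Putting the three-argument open, the lexical filehandle and 'or' together, a minimal round trip might look like this. The temporary file from File::Temp is an assumption made so that the sketch does not overwrite a real OutFile.txt.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Temporary file so the sketch does not touch a real OutFile.txt.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# Three-argument open with a lexical filehandle. Without parentheses,
# "or" is needed here: "||" binds tighter than the comma, so
#   open my $fh, '>', $path || die ...
# would test $path rather than the result of open.
open my $data_fh, '>', $path or die "Can't open $path: $!";
print {$data_fh} "MSeef\t2.3000\n";
close $data_fh or die "Can't close $path: $!";

# Read the file back to confirm what was written.
open my $in_fh, '<', $path or die "Can't open $path: $!";
my $first = <$in_fh>;
close $in_fh;
print $first;
```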


Code
if (exists $var1{$var2}){ ...


The $var2 in this situation has nothing whatsoever to do with the hash %var2, but is a wholly
separate scalar value which you haven't defined anywhere. Therefore you get the error.

Now, if I understand your original post correctly, you want to check whether any of the lines in File2 also occur in
File1. So, read one of the files into a hash (let's take the smaller file to conserve memory):


Code
my %seen; 
while (my $line = <RXNs>) {
chomp $line;
$seen{$line} = 1;
}


Then just read the other file in a loop and check the %seen hash:


Code
while (my $line = <RF>) { 
chomp $line;
if ( exists $seen{$line} ) {
print "Found it! $line\n";
}
}


Maybe there's some reason why you try to use "split", but as your original question was phrased it seems
unnecessary.
Hope this helps.


FishMonger
Veteran / Moderator

Mar 26, 2011, 12:38 PM

Post #14 of 17 (2412 views)
Re: [Karazam] Search content of one file with the content of a second file

Since this is a homework assignment, I was trying to avoid giving away the solution, however since the "cat's out of the bag", I'll post one anyway, but I won't explain it. The OP should try to figure it out.


Code

#!/usr/bin/perl

use strict;
use warnings;

open my $rf_fh, '<', 'file1.txt' or die "failed to open file1.txt $!";
open my $rxn_fh, '<', 'file2.txt' or die "failed to open file2.txt $!";

my %seen = map { chomp; $_ => 1 } <$rf_fh>;
my @dups;
map { chomp; push @dups, $_ if exists $seen{$_} } <$rxn_fh>;

print "The following matches were found in File 1:\n\n",
join("\n", @dups);


(This post was edited by FishMonger on Mar 26, 2011, 12:38 PM)


miller
User

Mar 26, 2011, 1:33 PM

Post #15 of 17 (2404 views)
Re: [FishMonger] Search content of one file with the content of a second file


In Reply To

Code
my %seen = map { chomp; $_ => 1 } <$rf_fh>; 
my @dups;
map { chomp; push @dups, $_ if exists $seen{$_} } <$rxn_fh>;



Hi Fishmonger,

You don't need my help with code. However, I'd just like to point out that autovivification isn't really an issue here. Additionally, map in a void context is pretty messy in my opinion, better to use for, or even better grep. Nevertheless, you know all that stuff.

Following your example, I won't supply code of what I'm describing, just couldn't help but chime in.

To OP,

Just look at perlfaq4, search for "How can I remove duplicate elements from a list or array?", to see another example of the technique that Fishmonger is demonstrating.

Cheers,
- Miller


(This post was edited by miller on Mar 26, 2011, 1:58 PM)


perlfree
Novice

Mar 26, 2011, 1:54 PM

Post #16 of 17 (2398 views)
Re: [miller] Search content of one file with the content of a second file


Thanks everyone for suggestions and help.

I tried the code from Fishmonger and it works. However, I have to apologise for confusing everyone on the issue of File2. File2 should actually contain just the following (one column):

MSeef
F_ecjc

This now brings me to Karazam's question about what I was trying to achieve with the split:

My approach was to split each line of File1 on '\t' and then compare just the first part (File1{1}) with the contents of File2, which should be a single column. Once again, I apologise for this mistake.
Clearly, Fishmonger's code addresses what everyone thought I was trying to do (i.e. searching with two columns in File2, which should not be the case).
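
For the corrected requirement (File2 holds one name per line, File1 holds name and value separated by a tab), one possible sketch follows. The helper name `find_matches` and the in-memory stand-ins for the two files are assumptions made so that it runs on its own; it is not the code posted earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: given handles for File2 (one name per line) and
# File1 (name<TAB>value), return the File1 lines whose name is in File2.
sub find_matches {
    my ($keys_fh, $data_fh) = @_;

    my %wanted;
    while (my $key = <$keys_fh>) {
        chomp $key;
        $wanted{$key} = 1 if length $key;
    }

    my @matches;
    while (my $line = <$data_fh>) {
        chomp $line;
        my ($name) = split /\t/, $line;          # first column only
        push @matches, $line if defined $name && exists $wanted{$name};
    }
    return @matches;
}

# In-memory stand-ins for File2.txt and File1.txt.
my $file2 = "MSeef\nF_ecjc\n";
my $file1 = "Ex_efxb\t0.0023\nMSeef\t2.3000\nF_ecjc\t0.3338\nMWEEI\t-0.111\n";
open my $rxn_fh, '<', \$file2 or die "can't open File2 data: $!";
open my $rf_fh,  '<', \$file1 or die "can't open File1 data: $!";

my @found = find_matches($rxn_fh, $rf_fh);

print "Date of search: ", scalar(localtime), "\n";
print "The following matches were found in File 1:\n\n";
print "$_\n" for @found;
```

Matches come out in File1 order because File1 is the file being scanned; only the first column of each File1 line is used as the lookup key.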

Thanks for the link, Miller.

.


(This post was edited by perlfree on Mar 26, 2011, 2:07 PM)


FishMonger
Veteran / Moderator

Mar 26, 2011, 2:15 PM

Post #17 of 17 (2389 views)
Re: [miller] Search content of one file with the content of a second file


Quote
Additionally, map in a void context is pretty messy in my opinion, better to use for, or even better grep.


Yes, I agree, and I almost changed that line to not use map. When I first wrote it, I assigned the return value to $dups, which was to be used later to show the number of matches.

Although, I recently read some threads (I think on perlmonks) that p5p was reworking the map coding to give credence to cases where it would be appropriate to use it in void context. I don't recall the details of those threads.
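
For readers wondering what the grep or for alternative to map in void context could look like, here is a small sketch. It uses in-memory lists with sample data instead of the filehandles from the posted script, so the array names are assumptions.

```perl
use strict;
use warnings;

my @file1_lines = ("MSeef\t2.3000", "F_ecjc\t0.3338", "MWEEI\t-0.111");
my @file2_lines = ("MSeef\t2.3000", "F_ecjc\t0.3338");

my %seen = map { $_ => 1 } @file1_lines;

# grep returns the elements for which the block is true, so there is no
# need to push from inside a map whose return value is thrown away.
my @dups = grep { exists $seen{$_} } @file2_lines;

# Equivalent for loop, for comparison.
my @dups_loop;
for my $line (@file2_lines) {
    push @dups_loop, $line if exists $seen{$line};
}

print scalar(@dups), " matches\n";
```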

 
 

