Advanced hashes

 



amindlessbrain
New User

Dec 15, 2010, 2:01 AM

Post #1 of 3 (1205 views)

Hello all,

So my more recent forays into file parsing have led me to believe hashes are the way to go when comparing two files or when trying to find a unique list.

This is all well and good, and I got some code that I can run and that works (!), but it's a bit cryptic and I don't completely understand it. I was wondering if anyone could help me pull it apart piece by piece so I truly understand what's going on.

Finding unique values in an array @names

Code
my %seen = (); 
foreach (@names)
{
next if ($seen{$_});
$seen{$_} = 1;
push (@unique, $_);
}



This one is for finding the values that two files have in common. Each file is tab-delimited and read in as an array of arrays. The element $field[$x][4] is what is being compared to find overlap.


Code
my %in_second_file; 
$in_second_file{$rec->[4]}++ for $rec in @fields2;

for my $record (@fields1) {
my $gene = $record->[4];
print $out "$gene\n" if exists $in_second_file{$gene};
}

Also, say I have something like this, but I want to print $field[$x][1], $field[$x][2], and $field[$x][3] instead of just the one that is being compared?


rovf
Veteran

Dec 15, 2010, 4:45 AM

Post #2 of 3 (1194 views)
Re: [amindlessbrain] Advanced hashes

Your code snippets make sense to some extent, except that you have one syntax error (the word "in" is not permitted after "for"), so it is not clear to me what you would like to know about them. As for your question:


Quote
Also, say I have something like this, but I want to print $field[$x][1], $field[$x][2], and $field[$x][3] instead of just the one that is being compared


I would suggest the following:

Right now, $in_second_file{$gene} records how often $gene occurred in @fields2. You didn't say whether you need this count, or whether you just need to know that $gene occurred at least once. If the latter is the case, I suggest that instead of the count of occurrences, you store a list of all the $rec records that contain that gene, i.e. instead of


Code
$in_second_file{$rec->[4]}++ for $rec in @fields2;


you would have something like


Code
# Warning: Not tested 
# A statement-modifier foreach cannot name a loop variable, so use $_
push @{ $in_second_file{$_->[4]} }, $_ foreach @fields2;
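
Putting that suggestion together with the original loop, a minimal sketch of the whole lookup might look like the following. The sample rows and the helper name matching_rows are made up for illustration; it assumes both files have already been parsed into arrays of arrays with the gene in column 4, as described in the question.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two parsed files.
my @fields1 = (
    [ 'r1', 'chr1', 100, 200, 'geneA' ],
    [ 'r2', 'chr2', 300, 400, 'geneB' ],
    [ 'r3', 'chr3', 500, 600, 'geneC' ],
);
my @fields2 = (
    [ 's1', 'chr1', 110, 210, 'geneA' ],
    [ 's2', 'chr3', 510, 610, 'geneC' ],
    [ 's3', 'chr3', 520, 620, 'geneC' ],
);

# matching_rows is an illustrative helper, not part of the original code.
# It returns one tab-joined line (columns 1 to 3) for every record in the
# first list whose column 4 also appears in the second list.
sub matching_rows {
    my ($first, $second) = @_;

    # Map each gene to the list of records from the second file that carry it.
    my %in_second_file;
    push @{ $in_second_file{ $_->[4] } }, $_ for @$second;

    my @lines;
    for my $record (@$first) {
        next unless exists $in_second_file{ $record->[4] };
        # Array slice on the record reference: columns 1, 2 and 3.
        push @lines, join "\t", @{$record}[ 1 .. 3 ];
    }
    return @lines;
}

print "$_\n" for matching_rows(\@fields1, \@fields2);
```

Because each hash value is now a list of records rather than a count, the matching rows from the second file are also available as @{ $in_second_file{$gene} } if you need to print fields from both files.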



fulano
Novice

Jan 1, 2011, 9:03 PM

Post #3 of 3 (1036 views)
Re: [amindlessbrain] Advanced hashes

Well, I can pick apart the first one, the second is giving me a headache though :).

I'm going to go line-by-line, on the off chance a beginner wanders through here.


First you create a temporary hash to act as a lookup for the values you've seen once already.

Code
my %seen = ();



This is going to loop through each entry in the names array

Code
foreach (@names)  
{


'next' jumps to the next iteration; in this case, it skips the rest of the loop body and moves on to the next name in @names.
The condition is written after the statement (a statement modifier), which is just the standard if statement turned around, so this:

Code
    next if ($seen{$_});

is the same as this:

Code
if ($seen{$_})
{next;}


These two lines only execute if $seen{$_} isn't true, in other words, if we've never seen this name before. They set $seen{$_} to 1 (Perl creates the hash entry for us, as it doesn't exist at this point) and push the current name onto @unique.

Code
    $seen{$_} = 1;
    push (@unique, $_);
}
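
For anyone who wants to run the pieces above as one script, here is a minimal sketch with them put back together. The sample names are made up for illustration, and the declarations are there only so it runs under strict.

```perl
use strict;
use warnings;

# Sample input, made up for illustration.
my @names = qw(alice bob alice carol bob);

my %seen;      # names we have already met
my @unique;    # first occurrence of each name, in input order

foreach (@names) {
    next if $seen{$_};    # already seen: skip to the next name
    $seen{$_} = 1;        # remember this name
    push @unique, $_;     # keep the first occurrence
}

print "@unique\n";    # prints: alice bob carol
```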



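A slightly different way of doing it, if you don't care about keeping the names in their original order (keys returns them in no particular order), would be: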

Code
my %seen = ();  
foreach (@names)
{
next if ($seen{$_});
$seen{$_} = 1;
}
@unique = keys %seen;
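
If the original order does matter, a common Perl idiom does the same job in one line with grep. This is only a sketch with made-up sample names, not something from the original post; it also shows sorting the keys when a predictable order is wanted from the hash.

```perl
use strict;
use warnings;

# Sample input, made up for illustration.
my @names = qw(carol alice carol bob alice);

# $seen{$_}++ returns the old count (false the first time a name is met),
# so the negation is true only for the first occurrence of each name.
my %seen;
my @unique = grep { !$seen{$_}++ } @names;

print "@unique\n";    # prints: carol alice bob

# keys %seen holds the same names, but in no guaranteed order,
# so sort them if you need a stable result.
my @sorted = sort keys %seen;
print "@sorted\n";    # prints: alice bob carol
```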


 
 

