is it an array or a hash?

 



scooper
Deleted

Oct 9, 2000, 6:07 AM

Post #1 of 7 (838 views)
is it an array or a hash?

the search_database sub below outputs an array which references a
hash as the name of the array. Never mind the goofiness of its key --
that's what I'm using to differentiate the hits, and I want to use the regex later
in a different sub.

If I want to sort the output of report by the $num_keywordMatches, do I need
to sort the hash as I'm attempting to do (not working too well) or
should I sort the hash as an array and work back to the values of the hash?

<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


sub search_database {
%matchgroup = ();

@favorite_keywords = sort { $fav_keywords{$b} cmp $fav_keywords{$a}} (keys %fav_keywords);
print "<HR><B>SORTED AS:<BR>@favorite_keywords<BR><HR>keyword threshold = $threshold</B><BR><HR>\n";
open (UPDATESFILE, $updatesfile) or die "Error opening updatefile. $updatesfile: $!\n";
flock(UPDATESFILE, 1); # this is a SHARED LOCK
my @keywords = map qr{@{[ split ]}}, @favorite_keywords;
while (<UPDATESFILE> ){
chomp $_;
my ($first, $update_record) = (split(/\t/, $_))[0,11];
foreach my $searchre (@keywords) {
unless ($update_record =~ /$searchre/) {
#$include{$first} ||= 'false';
} else {
push( @{$matchgroup{$first}}, $searchre );
}
}
}
close(UPDATESFILE);
}


sub report {
$result_count = 0;
$barrier = eval($#favorite_keywords*$threshold);

foreach $first (keys %matchgroup) {
$num_keywordMatches = eval($#{$matchgroup{$first}}+1);
if (($num_keywordMatches > $barrier) && (!($userdata[2] =~ /$first/))) {
$result_count++;
}
}

$range = 4;
# if we actually have 4 results to display
if ($result_count >= $range) {
# open the updates file

print "$result_count is OK keyword threshold = $threshold</B><HR>\n";

open(UPDATESFILE, $updatesfile) or die "Error opening file: $!\n";
flock(UPDATESFILE, 1);
while (<UPDATESFILE> ){
@updatefields = split(/\t/, $_);





#%sorted = sort {$a->{$matchgroup{$first}} cmp $b->{$matchgroup{$first}}} (keys %matchgroup);
#%sorted = sort {$a->{$matchgroup{$first}} <=> $b->{$matchgroup{$first}}} (keys %matchgroup);
#%sorted = sort {{$a->{$matchgroup{$first}}} <=> {$b->{$matchgroup{$first}}}} (keys %matchgroup);
#%sorted = sort {{{$a}->{$#{$matchgroup{$first}}}} cmp {{$b}->{$#{$matchgroup{$first}}}}} (keys %matchgroup);

foreach $first (keys %sorted) {
$num_keywordMatches = eval($#{$matchgroup{$first}}+1);

if (($num_keywordMatches > $barrier) && (!($userdata[2] =~ /$first/)) && ($first eq $updatefields[0])) {

print "<PRE>$first -- $updatefields[1] -- $num_keywordMatches -- $barrier</PRE>\n";

}
}
}
flock(UPDATESFILE,8);
close(UPDATESFILE);

} else {
# we need more results lower the threshold and search again
print "$result_count is too low keyword threshold = $threshold<HR>\n";

$search_times-= .10;

if ($threshold <= 0) {&got_nuthin;}
&calculate_threshold;
&parse_favorites;
&search_database;
&report;
}
}
</pre><HR></BLOCKQUOTE>


rGeoffrey
User / Moderator

Oct 8, 2000, 10:30 PM

Post #2 of 7 (838 views)
Re: is it an array or a hash?

Two months ago we had a struggle with the subject of a 'sorted hash'.
http://www.perlguru.com/forum/Forum7/HTML/000286.shtml

The short answer is there is no such thing as a sorted hash. You can do neat things with the sorted keys of a hash, but hashes themselves are stored in a manner that perl thinks is most appropriate and that is not something to mess with.

One solution would be to convert these lines...

<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


%sorted = sort {{{$a}->{$#{$matchgroup{$first}}}} cmp {{$b}->{$#{$matchgroup{$first}}}}} (keys %matchgroup);

foreach $first (keys %sorted) {
</pre><HR></BLOCKQUOTE>

to look more like ...

<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


foreach $first (sort {{{$a}->{$#{$matchgroup{$first}}}} cmp {{$b}->{$#{$matchgroup{$first}}}}} (keys %matchgroup)) {
</pre><HR></BLOCKQUOTE>
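
To make the "sorted keys" idea concrete, here is a minimal sketch using a made-up %score hash (the names and values are assumptions, not part of the original code). The hash itself stays unordered; the ordering lives only in the list that sort returns.

```perl
use strict;
use warnings;

# Hypothetical data, only for illustration.
my %score = (apple => 3, pear => 7, plum => 5);

# sort returns a list of keys; the hash itself is left untouched.
# Inside the block, $a and $b are two keys being compared.
my @by_score = sort { $score{$a} <=> $score{$b} } keys %score;

for my $name (@by_score) {
    print "$name => $score{$name}\n";
}
```

Swapping $a and $b in the block would reverse the order.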

[This message has been edited by rGeoffrey (edited 10-09-2000).]


scooper
Deleted

Oct 9, 2000, 12:10 PM

Post #3 of 7 (838 views)
Re: is it an array or a hash?

Oh yeah -- I read that post (and a few of the others). I DO KNOW that
sorting a hash is a dicey kind of affair... what you put in doesn't come
out in insertion order because that's just the way perl deals with it;
furthermore, you need to define a sort order using sort() on output (if
that's what you choose to do).

My problem is that I wasn't sure if what I have is truly a 'hash',
because it's an 'array' constructed from 'hash' variables:

<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


push( @{$matchgroup{$first}}, $searchre );
...


foreach $first (keys %matchgroup) {
...
}
</pre><HR></BLOCKQUOTE>

I did try this (in addition to the other tries that were commented in the code):

<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


foreach $first (sort {{{$a}->{$#{$matchgroup{$first}}}} cmp {{$b}->{$#{$matchgroup{$first}}}}} keys %matchgroup) {
</pre><HR></BLOCKQUOTE>

The result was:

<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


10033969368621 -- M -- 9 -- 6.8

10022968912751 -- T -- 11 -- 6.8

10012968834508 -- W -- 8 -- 6.8

10011968753655 -- J -- 10 -- 6.8

10010968734922 -- T -- 8 -- 6.8
</pre><HR></BLOCKQUOTE>

But this is what I want...

<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


10022968912751 -- T -- 11 -- 6.8

10011968753655 -- J -- 10 -- 6.8

10033969368621 -- M -- 9 -- 6.8

10012968834508 -- W -- 8 -- 6.8

10010968734922 -- T -- 8 -- 6.8
</pre><HR></BLOCKQUOTE>


The idea is that the items with the highest number of matches are at the top
of the list.


Possible solutions??

A. Sort the array as suggested, but am I really getting to what needs to be
sorted?

B. I'm a little confused about what to do. If I treat it as an array a little
earlier (i.e. recode things), it looks like I need to dig way down deep into
some array (all the way back to some hash variables), 'map' that -- sort it,
then 'map' it again to have it stick. Schwartzian Transform??? huh??

C. Otherwise, if it's a TRUE HASH at this point, let the hash just spill
its guts and push the output to a mapped array with a counter on it.
Once the array is full, sort the array and then push out the sorted results.

None of these approaches is particularly refined, pretty or easy to do, and
I'm wondering which path is the right path -- possibly none of them!!!

Thanks rGeoffrey!!


[This message has been edited by scooper (edited 10-09-2000).]


japhy
Enthusiast

Oct 9, 2000, 1:50 PM

Post #4 of 7 (838 views)
Re: is it an array or a hash?

If you want to sort a hash (or any data structure), it might help to draw a picture of the data structure so you know what you want to sort by. If I read your code correctly, your hash is like:

<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


%matchgroup = (
keyword => [ qw( list of matches ) ],
# ...
)
</pre><HR></BLOCKQUOTE>

You want to sort by the number of matches found, right? Well, for a given key $k, that can be determined via scalar(@{ $matchgroup{$k} }). Transferring this to a sort:

<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


for (sort { @{ $matchgroup{$a} } <=> @{ $matchgroup{$b} } } keys %matchgroup) {
# ...
}
</pre><HR></BLOCKQUOTE>

I think you were getting confused as to what $a and $b were in the sort subroutine. They're just keys of the hash.
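
As a runnable sketch of the same idea, assuming a hash of array references shaped like the picture above: the ids and match counts below are copied from the sample output earlier in the thread, while the 'kw' placeholder entries and the tie-break on the key are assumptions for illustration. Putting $b before $a gives highest-first order.

```perl
use strict;
use warnings;

# Hypothetical hash of arrays: each id maps to a reference to its list of matches.
my %matchgroup = (
    '10033969368621' => [ ('kw') x 9 ],
    '10022968912751' => [ ('kw') x 11 ],
    '10012968834508' => [ ('kw') x 8 ],
    '10011968753655' => [ ('kw') x 10 ],
    '10010968734922' => [ ('kw') x 8 ],
);

# An array used with <=> is in numeric context, so it yields its element count.
# $b before $a sorts descending; equal counts fall back to comparing the keys.
my @ordered = sort {
    @{ $matchgroup{$b} } <=> @{ $matchgroup{$a} }
        or $b cmp $a
} keys %matchgroup;

for my $id (@ordered) {
    printf "%s -- %d\n", $id, scalar @{ $matchgroup{$id} };
}
```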

------------------
Jeff "japhy" Pinyan -- accomplished author, consultant, hacker, and teacher



scooper
Deleted

Oct 9, 2000, 2:48 PM

Post #5 of 7 (838 views)
Re: is it an array or a hash?

japhy,

Thanks for taking the time to post -- I think that you're pointing me in the right direction,
but now I'm really confused!!

if i do...

<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


push( @{$matchgroup{$first}}, $searchre );
</pre><HR></BLOCKQUOTE>


to get (basically) this...

<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


%matchgroup = (
@first => $searchre
@10022968912751 => (?-xism:foo) (?-xism:bar) (?-xism:bim) (?-xism:bot)
@10011968753655 => (?-xism:foo) (?-xism:bar) (?-xism:bim)
)
</pre><HR></BLOCKQUOTE>

and $#ARRAY returns the index of the last item in an array (one less than the number of items)

so I do ...


<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


foreach $first (keys %matchgroup)
{

$num_keywordMatches = eval($#{$matchgroup{$first}}+1);
print "$first $num_keywordMatches\n";

}
</pre><HR></BLOCKQUOTE>


and this tells me how many elements are in each array (how many times
we matched the string)
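
For comparison, a small sketch of the two ways to count the elements behind an array reference. The $matches list is a made-up example; $#{...} gives the last index, so it needs the +1, while scalar(@{...}) gives the count directly.

```perl
use strict;
use warnings;

my $matches = [ qw(foo bar bim bot) ];   # hypothetical list of four matches

my $last_index = $#{$matches};           # index of the last element
my $count      = scalar @{$matches};     # number of elements

print "last index: $last_index\n";
print "count:      $count\n";

# For an empty array the last index is -1, so "+1" still gives 0.
my @none;
my $empty_last = $#none;
print "empty last index: $empty_last\n";
```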


but if I do this...


<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


foreach $first (sort { eval($#{$matchgroup{$first}}+1) <=> eval($#{$matchgroup{$first}}+1) } keys %matchgroup) {
...
}

or


foreach $first (sort { @{ $matchgroup{$a} } <=> @{ $matchgroup{$b} }} (keys %matchgroup)) {
</pre><HR></BLOCKQUOTE>

I get this ...


<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


10033969368621 -- M -- 9 -- 6.8
10022968912751 -- T -- 11 -- 6.8
10012968834508 -- W -- 8 -- 6.8
10011968753655 -- J -- 10 -- 6.8
10010968734922 -- T -- 8 -- 6.8
</pre><HR></BLOCKQUOTE>

if I do this...
<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


foreach $first (sort { {$a}->eval($#{$matchgroup{$first}}+1) <=> {$b}->eval($#{$matchgroup{$first}}+1) } keys %matchgroup) {
...
}
</pre><HR></BLOCKQUOTE>

i get nothing!


if I do this

<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


foreach $first (sort { eval($#{$matchgroup{$first}}+1)={$a} <=> eval($#{$matchgroup{$first}}+1)={$b} } (keys %matchgroup)) {
...
}

or

foreach $first (sort { {$a}=eval($#{$matchgroup{$first}}+1) <=> {$b}=eval($#{$matchgroup{$first}}+1) } (keys %matchgroup)) {


</pre><HR></BLOCKQUOTE>

I get a "Can't modify spaceship operator in scalar assignment" ?!!?

BTW -- I searched the forum posts and got no match for the term 'spaceship operator'. Is that '<=>'? (Looks like one to me.)

So that leads me to believe that I'm either not seeing the really obvious, or I'm trying to
do something that's a little trickier than I thought.

Any other ideas??



[This message has been edited by scooper (edited 10-09-2000).]


dws
Deleted

Oct 9, 2000, 9:47 PM

Post #6 of 7 (838 views)
Re: is it an array or a hash?

Hint 1: What purpose does your use of eval() serve?

Hint 2: Might there be a precedence issue involved? If so, is there a simple way to resolve it?
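
One possible reading of these hints, sketched with a small made-up %matchgroup (the keys a, b, c are assumptions). First, eval() on an already-computed number does nothing useful. Second, a sort block that only mentions $first compares a value with itself, so it cannot change the order. As for precedence, = binds more loosely than <=>, so in "X = {$a} <=> Y = {$b}" the comparison is grouped first and Perl then tries to assign to its result, which would account for the "Can't modify spaceship operator" message.

```perl
use strict;
use warnings;

my %matchgroup = (
    a => [ 1 .. 3 ],
    b => [ 1 .. 5 ],
    c => [ 1 .. 4 ],
);

# The arithmetic is finished before eval() ever sees it, so both values agree.
my $with_eval    = eval( $#{ $matchgroup{b} } + 1 );
my $without_eval = $#{ $matchgroup{b} } + 1;

# A comparison that never mentions $a or $b compares a value with itself.
my $first = 'a';
my $always_zero = ( $#{ $matchgroup{$first} } + 1 ) <=> ( $#{ $matchgroup{$first} } + 1 );

# Using $a and $b, with parentheses to make the grouping explicit.
my @ordered = sort {
    ( $#{ $matchgroup{$b} } + 1 ) <=> ( $#{ $matchgroup{$a} } + 1 )
} keys %matchgroup;

print "with eval: $with_eval, without: $without_eval\n";
print "self-comparison: $always_zero\n";
print "ordered: @ordered\n";
```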



scooper
Deleted

Oct 10, 2000, 8:52 AM

Post #7 of 7 (838 views)
Re: is it an array or a hash?

I didn't have a chance to reply BEFORE you did dws, but japhy did explain some things
in his post that got me to thinking.

First off -- this is what I ended up with.

<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>


for $first (sort { @{$matchgroup{$a}} <=> @{$matchgroup{$b}} } keys %matchgroup) {

$num_keywordMatches = scalar(@{$matchgroup{$first}});
</pre><HR></BLOCKQUOTE>

I finally did get Jeff's code to work. Actually it was working all along, the problem
was I couldn't see the results because of this line:

<BLOCKQUOTE><font size="1" face="Arial,Helvetica,sans serif">code:</font><HR>



if (($num_keywordMatches > $barrier) && (!($userdata[2] =~ /$first/)) && ($first eq $updatefields[0])) {

</pre><HR></BLOCKQUOTE>

The way I have it set up, the records don't get displayed unless (until) the barrier value
has been met. So all of the sorting was happening behind the scenes; I just couldn't see it.

Rip that out and it's pretty obvious.

I didn't know about the scalar() function; that's a new one. The spaceship operator
error must have been caused because I was trying to make Perl EVALuate
the equation out of order, so I removed the eval().

The problem is that even though the values are sorted -- the records themselves are
not. The reason for this is that AS_RECORDS_ARE_READ, they go through the sub and are
reported.


What I really want to do is more like changing the order of the records read
in so the highest matchers are read first, or changing the order of records reported so
the highest matchers are reported first.
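
One way this could be approached, as a sketch only: read every record into a hash keyed by its first field, then walk the ids in sorted order and look each record up, instead of sorting inside the file-reading loop. The @lines array stands in for the updates file, and %record_for and @report are hypothetical names; the ids and counts come from the sample output earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the updates file and the search results.
my @lines = (
    "10033969368621\tM",
    "10022968912751\tT",
    "10011968753655\tJ",
);
my %matchgroup = (
    '10033969368621' => [ ('kw') x 9 ],
    '10022968912751' => [ ('kw') x 11 ],
    '10011968753655' => [ ('kw') x 10 ],
);

# Pass 1: index every record by its first field.
my %record_for;
for my $line (@lines) {
    my @fields = split /\t/, $line;
    $record_for{ $fields[0] } = \@fields;
}

# Pass 2: walk the ids from most matches to fewest and look each record up.
my @report;
for my $id ( sort { @{ $matchgroup{$b} } <=> @{ $matchgroup{$a} } } keys %matchgroup ) {
    next unless exists $record_for{$id};
    my $count = @{ $matchgroup{$id} };
    push @report, "$id -- $record_for{$id}[1] -- $count";
}

print "$_\n" for @report;
```

The trade-off is that the whole file is held in memory, which is fine for a small updates file but worth reconsidering for a large one.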


Thanks for your hints both of you -- you're making me think.

 
 

