Misbehaving Hash Sorting

 



TX_Jimmy
Novice

Nov 23, 2009, 9:27 AM

Post #1 of 11 (1309 views)
Misbehaving Hash Sorting

Hi everyone. I searched for this but could not find it. I am sorting a hash within a loop. The first time through the loop it gets sorted, but each time after that it does not sort. What is going on and how can I fix it?



Code
use strict; 
use warnings;



###################################################################
## set up the fake data in a similar way the script builds it
###################################################################




my @dates = qw /11082009 11092009 11102009 11112009 11122009/;


my @mon = qw /ap1;30;IPad1 ap2;45;IPad1 ap3;16;IPad1 ap4;11;IPad1 ap5;37;IPad1 ap6;55;IPad1 ap7;39;IPad1 ap8;23;IPad1 ap9;5;IPad1 ap10;28;IPad1/;
my @tue = qw /ap1;15;IPad2 ap2;88;IPad2 ap3;22;IPad2 ap4;7;IPad2 ap5;45;IPad2 ap6;41;IPad2 ap7;11;IPad2 ap8;97;IPad2 ap9;22;IPad2 ap10;21;IPad2/;
my @wed = qw /ap1;75;IPad3 ap2;31;IPad3 ap3;40;IPad3 ap4;29;IPad3 ap5;12;IPad3 ap6;62;IPad3 ap7;23;IPad3 ap8;6;IPad3 ap9;38;IPad3 ap10;17;IPad3/;
my @thr = qw /ap1;35;IPad4 ap2;56;IPad4 ap3;61;IPad4 ap4;82;IPad4 ap5;56;IPad4 ap6;13;IPad4 ap7;7;IPad4 ap8;84;IPad4 ap9;12;IPad4 ap10;33;IPad4/;
my @fri = qw /ap1;11 ap2;12 ap3;76 ap4;43 ap5;98 ap6;70 ap7;24 ap8;75 ap9;23 ap10;46/;


my %hash;



for (@mon) {
push (@{$hash{11082009}}, "$_");
}

for (@tue) {
push (@{$hash{11092009}}, "$_");
}

for (@wed) {
push (@{$hash{11102009}}, "$_");
}

for (@thr) {
push (@{$hash{11112009}}, "$_");
}

for (@fri) {
push (@{$hash{11122009}}, "$_");
}


#######################################################################################
### This is pretty much out of the real script
#######################################################################################




for my $day (@dates) {
    # defined(@array) is a fatal error since perl 5.22; test the key instead
    if (exists $hash{$day}) {
        my %nhash;
        for (@{ $hash{$day} }) {
            my @fields = split /;/, $_;
            #print "$day $fields[0] $fields[1]\n";
            $nhash{$fields[0]} = $fields[1];
        }

        my @sdata = sort by_count keys %nhash;
        sub by_count { $nhash{$b} <=> $nhash{$a} };

        for (@sdata) {
            print "$day $_ $nhash{$_}\n";
        }

    }
}



7stud
Enthusiast

Nov 23, 2009, 10:40 AM

Post #2 of 11 (1304 views)
Re: [TX_Jimmy] Misbehaving Hash Sorting

Hi,

For some reason, your sort function sees the same value for %nhash (the first value) every time through the loop. Therefore, it thinks the key/value pairs are:

ap1=30
ap2=45
ap3=16
etc.

and you end up sorting the keys by those values. As a result, the keys always get sorted in this order:

ap2
ap1
ap3

Subsequently, when you print out the values associated with each of the sorted keys, you get the values from the current %nhash. If you add the following to your sort function, you will see what I mean:


Code
sub by_count {
    print "$a=$nhash{$a}, $b=$nhash{$b}\n";
    $nhash{$b} <=> $nhash{$a}
};



I can't explain why that is happening, though.


(This post was edited by 7stud on Nov 23, 2009, 11:22 AM)


TX_Jimmy
Novice

Nov 23, 2009, 11:21 AM

Post #3 of 11 (1292 views)
Re: [7stud] Misbehaving Hash Sorting

You are right. Somehow it is keeping that order in memory. It has something to do with the subroutine. If you change:

my @sdata = sort by_count keys %nhash;
sub by_count { $nhash{$b} <=> $nhash{$a} };

to

my @sdata = sort { $nhash{$b} <=> $nhash{$a}} keys %nhash;

it works correctly. Weird, but I am a network guy and I am not really good at programming. I was looking at the example out of Learning Perl and they use a subroutine.
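For anyone who wants to try the inline-block fix in isolation, here is a minimal sketch. The `%counts_by_day` data and the `%order` hash are made up for illustration (they are not from the original script); only the `sort { ... } keys %nhash` line mirrors the fix above.

```perl
use strict;
use warnings;

# Hypothetical sample data: two days of "name => count" pairs.
my %counts_by_day = (
    11082009 => { ap1 => 30, ap2 => 45, ap3 => 16 },
    11092009 => { ap1 => 15, ap2 => 88, ap3 => 22 },
);

my %order;    # day => names, highest count first
for my $day (sort keys %counts_by_day) {
    my %nhash = %{ $counts_by_day{$day} };

    # The inline block belongs to this loop body, so it refers to the
    # %nhash of the current pass rather than to an earlier one.
    my @sdata = sort { $nhash{$b} <=> $nhash{$a} } keys %nhash;

    $order{$day} = "@sdata";
    print "$day: @sdata\n";
}
```

Each day should come out in its own order here, which is the behavior the named `by_count` sub failed to give after the first pass.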


7stud
Enthusiast

Nov 23, 2009, 11:58 AM

Post #4 of 11 (1287 views)
Re: [TX_Jimmy] Misbehaving Hash Sorting

Nice sleuthing. I never thought to try that.


Quote
I am not really good at programming.


Whaat? I thought this was really clever:


Code
my %hash; 
...
push (@{$hash{11082009}}, "$_");

I didn't know you could do that. That uses a key that doesn't exist yet, treats the undefined value it returns as an array reference, has perl create the array behind it, then pushes a value into the array. Wow.

Below is a simplified version of the problem if one of the experts wants to take a look. I think the explanation is going to involve closures. A closure means that when you define a function inside a block, the function gets a snapshot of all the variables that it can currently see--like the variables in the enclosing block and any global variables. Then when the function is executed, it refers to that snapshot for the values of those variables.

Your by_count() function is getting its snapshot on the first run through the for loop; then for subsequent loops, the function is not redefined, and as a result the function is stuck with that original snapshot, which contains the first hash. When you execute the function by calling sort, it uses the values in its snapshot.

Apparently, a block like you used to fix the problem does not create a closure, and therefore the block can see the current values in the loop.



Code
use strict; 
use warnings;
use 5.010;
use Data::Dumper;


my %h1 = (
'a' => 5,
'b' => 8,
'c' => 1
);

my %h2 = (
'a' => 200,
'b' => 150,
'c' => 100
);

my @AoH = (\%h1, \%h2);

for my $h (@AoH) {
    my %hash = %$h;

    my @sorted_keys = sort by_val keys %hash;

    sub by_val {
        say "$a=$hash{$a}, $b=$hash{$b}";
        $hash{$a} <=> $hash{$b}
    };

    say "$_ = $hash{$_}" for @sorted_keys;
    say "=" x 20;
}

--output:--
c=1, a=5
c=1, b=8
b=8, a=5
c = 1
a = 5
b = 8
====================
c=1, a=5
c=1, b=8 #comparisons here use the first hash's values--not 100, 200, 150
b=8, a=5
c = 100
a = 200
b = 150
====================
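If a separate comparison routine is still wanted inside the loop, one option that seems to sidestep the problem is an anonymous sub stored in a scalar, since `sort` accepts a variable holding a code reference. This is only a sketch reusing the two hashes from the example above; the `$by_val` and `@results` names are my own.

```perl
use strict;
use warnings;

my @AoH = (
    { a => 5,   b => 8,   c => 1 },
    { a => 200, b => 150, c => 100 },
);

my @results;
for my $h (@AoH) {
    my %hash = %$h;

    # An anonymous sub is created at run time on every pass,
    # so it captures this pass's %hash, not the first one.
    my $by_val = sub { $hash{$a} <=> $hash{$b} };

    my @sorted_keys = sort $by_val keys %hash;
    push @results, "@sorted_keys";
}

print "$_\n" for @results;
```

The second line of output should now follow the second hash's values (c, b, a) instead of repeating the first hash's order.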



(This post was edited by 7stud on Nov 23, 2009, 12:38 PM)


7stud
Enthusiast

Nov 23, 2009, 12:17 PM

Post #5 of 11 (1282 views)
Re: [TX_Jimmy] Misbehaving Hash Sorting

Check this out:


Code
use strict; 
use warnings;
use 5.010;

for my $num (1 .. 5) {
    my $a = $num;

    {print "in block: $a\n"};

    sub test {
        print "in function: $a\n";
    }

    test()
}

--output:--
in block: 1
in function: 1
in block: 2
in function: 1
in block: 3
in function: 1
in block: 4
in function: 1
in block: 5
in function: 1
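For contrast, a sketch of the conventional arrangement: the named sub is defined once at file level and is handed the value as an argument, so it never depends on a loop variable. The `describe` sub and `@lines` array are hypothetical names used only for this illustration.

```perl
use strict;
use warnings;

# Defined once, outside any loop; it sees only what it is passed.
sub describe {
    my ($value) = @_;
    return "in function: $value";
}

my @lines;
for my $num (1 .. 5) {
    push @lines, describe($num);
}

print "$_\n" for @lines;
```

With the value passed in explicitly, the function should report 1 through 5 rather than staying stuck on 1.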



(This post was edited by 7stud on Nov 23, 2009, 12:19 PM)


TX_Jimmy
Novice

Nov 23, 2009, 12:49 PM

Post #6 of 11 (1272 views)
Re: [7stud] Misbehaving Hash Sorting

Wow, I guess we should not use a subroutine in a loop?


TX_Jimmy
Novice

Nov 23, 2009, 1:05 PM

Post #7 of 11 (1271 views)
Re: [7stud] Misbehaving Hash Sorting


In Reply To

Whaat? I thought this was really clever:


Code
my %hash; 
...
push (@{$hash{11082009}}, "$_");

I didn't know you could do that. That uses a key that doesn't exist yet, treats the undefined value it returns as an array reference, has perl create the array behind it, then pushes a value into the array. Wow.


I got that out of Programming Perl, p. 276 ;) I had to learn about complex data structures, at least hashes of arrays, in order to get the data in a usable format.
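As a rough illustration of the hash-of-arrays idea, the five separate loops in the first post could probably be collapsed if the date travelled with each record. The `date;name;count` record layout and the `%by_date` name below are assumptions made for this sketch, not the format of the real script.

```perl
use strict;
use warnings;

# Hypothetical flat records in "date;name;count" form.
my @records = qw(11082009;ap1;30 11082009;ap2;45 11092009;ap1;15);

my %by_date;
for my $record (@records) {
    my ($date, $name, $count) = split /;/, $record;

    # $by_date{$date} becomes an array reference on the first push.
    push @{ $by_date{$date} }, "$name;$count";
}

for my $date (sort keys %by_date) {
    print "$date: @{ $by_date{$date} }\n";
}
```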


7stud
Enthusiast

Nov 23, 2009, 7:58 PM

Post #8 of 11 (1256 views)
Re: [TX_Jimmy] Misbehaving Hash Sorting

I was at the book store today, and I looked at Programming Perl. There is a section on Closures if my brief description didn't make any sense to you. Here is a simple, classic example:


Code
use strict; 
use warnings;
use 5.010;

sub make_func {
my $val = shift;
return sub {print $val, "\n"};
}

my $f = make_func(5);

#time passes...
#tectonic plates shift...

#make_func() has long since finished executing,
#and all variables created inside a function are
#destroyed once a function finishes executing.
#Therefore, $val doesn't exist anymore. Yet...

$f->();

--output:--
5

The "anonymous sub" got a snapshot of all the variables it could see when it was created, and it carried that snapshot with it. Then when the anonymous sub was executed, it referred to that snapshot to find the value of $val.

Unfortunately, the docs and that section in Programming Perl say that *anonymous subs* create closures. But in your case, you are using a named sub. I can't find any information about named subs creating closures.
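A slightly larger sketch of the anonymous-sub case may help here. It suggests that every call to the generator produces a closure with its own private variable, and that the variable stays live and can change between calls. `make_counter`, `$c1`, `$c2` and `@seen` are illustrative names, not anything from the book.

```perl
use strict;
use warnings;

# Each call returns a new anonymous sub bound to its own $count.
sub make_counter {
    my $count = shift;
    return sub { return $count++ };
}

my $c1 = make_counter(5);
my $c2 = make_counter(100);

# $c1 keeps counting from where it left off; $c2 is independent.
my @seen = ($c1->(), $c1->(), $c2->(), $c1->());
print "@seen\n";
```

A named sub has no equivalent of "each call to the generator": it is set up once, which seems to be the heart of the problem in this thread.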

Generally, I would never define a variable or a function inside a loop because that just unnecessarily creates variables and functions over and over again. So, I would not do this:


Code
for (1 .. 5) { 
my $x = 10 + $_;
}


I would do this instead:


Code
my $x = 0; 

for (1 .. 5) {
$x = 10 + $_;
}


The variable is defined once, and then it is assigned a new value every time through the loop. That is more efficient.
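One thing worth weighing against hoisting the declaration: a `my` inside the loop gives each pass its own variable, which matters as soon as a reference to it (or a closure over it) outlives the pass. The sketch below is only an illustration with made-up names (`@inside`, `@outside`), not a benchmark of either style.

```perl
use strict;
use warnings;

my (@inside, @outside);

for my $n (1 .. 3) {
    my $x = 10 + $n;      # a fresh $x on each pass
    push @inside, \$x;
}

my $y;
for my $n (1 .. 3) {
    $y = 10 + $n;         # the same $y on every pass
    push @outside, \$y;
}

my @in  = map { $$_ } @inside;
my @out = map { $$_ } @outside;

print "declared inside:  @in\n";
print "declared outside: @out\n";
```

The references taken inside the loop keep three distinct values, while the ones taken to the hoisted variable all point at the same scalar and show only its final value.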

But sort is kind of a special case. If the sort function were more than 20 characters long, and it wouldn't fit neatly into a one line block, then I would just naively create a sub like you did. That is, until I read your post. So, I guess if you have a longer sort function that you are using inside a loop, you should use syntax like this:

Code
@results = sort  { 
line1;
line2;
line3;
} @arr;


In your program, it appears that perl tries to be more efficient by NOT creating a sub over and over again in a loop, and that is what created your problem.
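To make the multi-line block form concrete, here is a small sketch that sorts by count (highest first) and breaks ties by name. The `%count` data and `@ranked` name are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical counts, with a tie between ap1 and ap3.
my %count = (ap1 => 30, ap2 => 45, ap3 => 30, ap4 => 11);

my @ranked = sort {
    $count{$b} <=> $count{$a}    # higher count first
        or
    $a cmp $b                    # ties broken alphabetically
} keys %count;

print "@ranked\n";
```

Because the block is part of the enclosing scope, the same code should behave correctly inside a loop where `%count` is rebuilt on every pass.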


(This post was edited by 7stud on Nov 23, 2009, 8:01 PM)


TX_Jimmy
Novice

Nov 24, 2009, 8:52 AM

Post #9 of 11 (1246 views)
Re: [7stud] Misbehaving Hash Sorting

I understand closures in some contexts but I am still lacking. I am going to give chapter 8 a good going through. Thanks for all your help with this.


TX_Jimmy
Novice

Nov 24, 2009, 9:06 AM

Post #10 of 11 (1245 views)
Re: [TX_Jimmy] Misbehaving Hash Sorting

I think I found it: chapter 8, Nested Subroutines.

"globally named subroutines don't nest" so what is happening is that the subroutine is only set up once. Although the book is talking about subroutines in subroutines, I bet it applies to for loops also. And this makes total sense of what is happening.

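The subroutine-in-subroutine case from the book can be sketched roughly as follows. The sub names (`outer_named`, `inner`, `outer_anon`) are made up for this illustration. With warnings fully enabled, perl reports `Variable "$x" will not stay shared` for the named version, which is silenced here only to keep the demo output clean.

```perl
use strict;
use warnings;
no warnings 'closure';    # otherwise: Variable "$x" will not stay shared

sub outer_named {
    my $x = shift;
    sub inner { return $x }          # named: tied to the first $x only
    return inner();
}

sub outer_anon {
    my $x = shift;
    my $inner = sub { return $x };   # anonymous: rebuilt on every call
    return $inner->();
}

my @named = map { outer_named($_) } 1 .. 3;
my @anon  = map { outer_anon($_) } 1 .. 3;

print "named inner sub:     @named\n";
print "anonymous inner sub: @anon\n";
```

The named inner sub keeps returning the value from the first call, much like `by_count` kept sorting by the first `%nhash`, while the anonymous one tracks each call.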

7stud
Enthusiast

Nov 24, 2009, 6:03 PM

Post #11 of 11 (1239 views)
Re: [TX_Jimmy] Misbehaving Hash Sorting

Yep, I think you are on to something. I did some research, too. I learned that:

1) all named subs are global--no matter where they are defined (which I think is irrelevant to your situation)

2) named subs only get compiled once

3) a sub's closure is formed at compile time, and it contains all the variables that preceded it in the code--as long as the variables aren't contained in their own block. For example:


Code
{ 
my $x = 10;
}

my $y = "hello";

sub test {say $x, $y}


the sub won't be able to see $x.

4) closures contain variables -- not values. Therefore, if the variables change, the sub will see those changes. For example:


Code
my $x = 10; 

sub test {say $x}

$x++;
test();

--output:--
11


5) Autovivification -- If you treat an undefined value as if it were a reference, perl will automatically create the reference for you:


Code
use strict; 
use warnings;
use 5.010;
use Data::Dumper;

my %hash;

$hash{'abc'}->[0] = 3;

say Dumper(%hash);

--output:--
$VAR1 = 'abc';
$VAR2 = [
3
];
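A short sketch of two sides of autovivification: the convenient one used in the first post (`push` through a missing key), and a less obvious one where merely testing a nested key creates the intermediate level. The key names here are arbitrary.

```perl
use strict;
use warnings;

my %hash;

# Writing through a missing key creates the array reference on demand.
push @{ $hash{list} }, 'first';

# Looking two levels deep creates the intermediate hash as a side effect,
# even though the inner key is not there.
my $found = exists $hash{outer}{inner};

my @keys = sort keys %hash;
print "keys: @keys\n";
print "inner key found: ", ($found ? 'yes' : 'no'), "\n";
```

So after the `exists` test, `%hash` has an `outer` key holding an empty hash reference, which can be surprising when the keys are later counted or printed.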



(This post was edited by 7stud on Nov 24, 2009, 6:12 PM)

 
 

