Flagging list data hybrid situation

 



stuckinarut
User

Aug 22, 2015, 11:28 PM

Post #1 of 10 (2199 views)

I wasn't sure exactly how to describe this in terms of a 'Subject' line.

This is a Step 1 and Step 2 situation.

STEP 1
=====
The goal is a list sorted by the number of times each entry occurs (using fruit names in this short list as an example):


Code
apple 
grape
grape
peach
peach
peach
peach
pear
pear
pear


This code *partially* solves the problem:


Code
use strict; 
use warnings;

my %map;
my @sorted;
chomp(my @words = <STDIN>);

foreach my $word (@words) {
    $map{$word} += 1;
}

@sorted = reverse sort { $a cmp $b } @words;

foreach my $key (sort keys %map) {
    print "$key $map{$key} \n";
}


I end up with this:


Code
apple 1  
grape 2
peach 4
pear 3


The desired output is for the data to be 'descending' numerical order, but I can't figure out the correct sorting code needed:


Code
peach 4 
pear 3
grape 2
apple 1


STEP 2
====

NOTE: Another objective is to avoid typing in all the data manually via <STDIN>, and instead to read the initial list from STEP 1 (with all its duplicate entries) from 'listL.txt' in the following code, which I am 're-purposing' from a previous task.


Code
use strict;   
use warnings;

my %D_list;
my %I_list;

open my $D_list, '<', 'listD.txt' or die "Cannot open listD.txt: $!";
while (my $line = <$D_list>) {
chomp $line;
$line =~ s/\r//g; # Removes Windows CR characters
$line =~ s/\s+$//; # Removes trailing white spaces
$D_list{$line} = 1
}
close $D_list;

my %I_list;
open my $I_list, '<', 'listI.txt' or die "Cannot open listI.txt: $!";
while (my $line = <$I_list>) {
$line =~ s/\r//g;
$line =~ s/\s+$//;
chomp $line;
$I_list{$line} = 1
}
close $I_list;

my ($L_count, $D_count, $I_count);
open my $L_list, '<', 'listL.txt' or die "Cannot open listL.txt: $!";
while (<$L_list>) {
chomp;
s/\r//;
s/\s+$//;
$L_count ++;
print;
$D_count ++ and print ' D' if exists $D_list{$_};
$I_count ++ and print ' I' if exists $I_list{$_};
print "\n";
}
print "L: $L_count; D: $D_count; I: $I_count; \n";


The 'listD.txt' sample entries are:


Code
grape 
pear


The 'listI.txt' sample entries are:


Code
apple 
grape
peach
pear


So the desired Final-Final output would look like this, with any 'listL.txt' entries FLAGGED if they also appear as entries on 'listD.txt' or 'listI.txt':


Code
peach 4 I 
pear 3 D I
grape 2 D I
apple 1 I


In order to match ONLY the fruit names on 'listL.txt' this REGEXP seems to make sense, but I got stuck trying to figure out exactly how to use it ;-(


Code
# Match ONLY to fruit name without the number 
if ($L_list =~ m/^(\w+)\s/) {

- not sure how to do this -

}
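
One way the regular expression could be applied is sketched below. It assumes each line of the consolidated list looks like 'peach 4' and that the D and I lists are already loaded into lookup hashes; here they are hard-coded stand-ins rather than read from the files, and flag_line is a hypothetical helper name. Note that the match has to run against the text of each line, not against the $L_list filehandle.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-ins for the contents of listD.txt and listI.txt.
my %D_list = map { $_ => 1 } qw(grape pear);
my %I_list = map { $_ => 1 } qw(apple grape peach pear);

# Append ' D' and/or ' I' to a "name count" line, matching on the name only.
sub flag_line {
    my ($line) = @_;
    # Capture the leading word; a line that does not match is returned unchanged.
    if ( my ($name) = $line =~ /^(\w+)\s/ ) {
        $line .= ' D' if exists $D_list{$name};
        $line .= ' I' if exists $I_list{$name};
    }
    return $line;
}

print "$_\n" for map { flag_line($_) } 'peach 4', 'pear 3', 'grape 2', 'apple 1';
```

The captured name is used only for the hash lookups, so the original line, including its count, is printed as-is with the flags appended.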


Thanks for any assistance!

-Stuckinarut


(This post was edited by stuckinarut on Aug 23, 2015, 1:12 PM)


BillKSmith
Veteran

Aug 23, 2015, 4:21 AM

Post #2 of 10 (2185 views)
Re: [stuckinarut] Flagging list data hybrid situation

STEP 1:

Code
foreach my $key (sort {$map{$b}<=>$map{$a}} keys %map) {


Note: Although it is legal, it is poor practice to use Perl keywords as variable names (for example %map, since map is a built-in function) because it can be confusing.

Sorry, I do not understand the requirement of STEP2.
Good Luck,
Bill
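
For context, here is a minimal self-contained sketch of how that sort line fits into the STEP 1 script. The sample words are hard-coded instead of read from <STDIN>, the hash is named %count rather than %map, count_lines is a hypothetical helper name, and the alphabetical tie-break is an added assumption that was not part of the original request.

```perl
use strict;
use warnings;

# Sample data from the original post, hard-coded so the sketch runs on its own.
my @words = qw(apple grape grape peach peach peach peach pear pear pear);

# Return "name count" lines, highest count first.
# The alphabetical tie-break is an assumption added for stable output.
sub count_lines {
    my @list = @_;
    my %count;
    $count{$_}++ for @list;
    my @names = sort { $count{$b} <=> $count{$a} or $a cmp $b } keys %count;
    return map { sprintf '%s %d', $_, $count{$_} } @names;
}

print "$_\n" for count_lines(@words);
```

Swapping $a and $b in the comparison is what turns the ascending numeric sort into a descending one.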


stuckinarut
User

Aug 23, 2015, 7:23 AM

Post #3 of 10 (2179 views)
Re: [BillKSmith] Flagging list data hybrid situation


In Reply To
STEP 1:

Code
foreach my $key (sort {$map{$b}<=>$map{$a}} keys %map) {


Note: Although it is legal, it is poor practice to use Perl keywords as variable names (for example %map, since map is a built-in function) because it can be confusing.

Sorry, I do not understand the requirement of STEP2.


Thanks, Bill. Maybe the Perl key words were 'throwing me for a loop' (PUN!) in trying to figure this part out :^) I would actually prefer to avoid poor practice, so will see how I can maybe revise things.

Regarding STEP 2, I'm not sure how better to explain the requirements but will try.

If I forget about trying to also integrate STEP 1 into STEP 2, maybe this will help.

Once the STEP 1 list is saved as 'listL.txt', everything is run as:


Code
perl step2.pl listD.txt listI.txt listL.txt >listscompared.txt


There are several 'Declaration' errors in the STEP 2 script to resolve, but I will try and explain better what needs to happen in this part:


Code
my ($L_count, $D_count, $I_count);  
open my $L_list, '<', 'listL.txt' or die "Cannot open listL.txt: $!";
while (<$L_list>) {
chomp;
s/\r//;
s/\s+$//;
$L_count ++;
print;
$D_count ++ and print ' D' if exists $D_list{$_};
$I_count ++ and print ' I' if exists $I_list{$_};
print "\n";
}
print "L: $L_count; D: $D_count; I: $I_count; \n";


In my original use of the script, if only the word 'peach' (WITHOUT ANY NUMBER) was in listL and in listD and listI, then the output would be:


Code
peach D I 
L: 1; D: 1; I: 1

NOTE:
L:1 indicates ONE list entry for 'peach'
D:1 indicates ONE match from list D
I:1 indicates ONE match from list I


But since list L actually now has 'peach 4' (a space and then number 4 after the word), then a REGEXP is needed to perform the matching on *only* the word - before the space and number.

Using the same above example if 'peach' is also on BOTH listD and listI, then the output (including the ACTUAL listL content) would be:


Code
peach 4 D I 
L: 1; D: 1; I: 1


Using ONLY the 3 different list contents in the original post where peach is on listI but NOT on listD, here would be the final output:


Code
peach 4 I 
L: 1; D: 0; I: 1


Perhaps this will make more sense?

Thanks again.

-Stuckinarut


(This post was edited by stuckinarut on Aug 23, 2015, 9:34 AM)


FishMonger
Veteran / Moderator

Aug 23, 2015, 1:18 PM

Post #4 of 10 (2144 views)
Re: [stuckinarut] Flagging list data hybrid situation

I'm having a lot of difficulty understanding your needs.

You've given sample input for lists D and I, but not for list L and then you've shown several variations on desired output, but it's unclear which of those variations you really want.

You can probably do this with a single hash, but due to your confusing description, it's hard to say how that hash should be constructed.


FishMonger
Veteran / Moderator

Aug 23, 2015, 2:01 PM

Post #5 of 10 (2139 views)
Re: [stuckinarut] Flagging list data hybrid situation

Here's one possible way to build the hash.

Code
#!/usr/bin/perl 

use strict;
use warnings;
use Data::Dumper;

my %fruit;
foreach my $list ( qw|D I L| ) {
    my $file = "list$list.txt";
    open my $fh, '<', $file or die "failed to open '$file' <$!>";
    while (my $fruit = <$fh>) {
        $fruit =~ s/\s+$//;
        $fruit{$fruit}{total_cnt}++;
        $fruit{$fruit}{$list}++;
        push @{ $fruit{$fruit}{list} }, $list;
    }
    close $fh;
}
print Dumper \%fruit;


Which outputs:

Code
$VAR1 = {
  'peach' => {
    'I' => 1,
    'total_cnt' => 2,
    'L' => 1,
    'list' => [
      'I',
      'L'
    ]
  },
  'apple' => {
    'I' => 1,
    'total_cnt' => 1,
    'list' => [
      'I'
    ]
  },
  'pear' => {
    'D' => 1,
    'I' => 1,
    'total_cnt' => 2,
    'list' => [
      'D',
      'I'
    ]
  },
  'grape' => {
    'D' => 1,
    'I' => 1,
    'total_cnt' => 2,
    'list' => [
      'D',
      'I'
    ]
  }
};


Since I can't determine which example output of yours you really want, I can't show you how to output the data in hash other than a simple dump to show the structure.
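
As an illustration only, the sketch below shows how a hash of this kind could be turned into the 'name count flags' layout shown in the first post. It is a simplified variant, not the script above: the three lists are hard-coded stand-ins for the files, listL is assumed to be the full ten-entry sample list, only the per-list counts are stored, and report is a hypothetical helper name.

```perl
use strict;
use warnings;

# In-memory stand-ins for the three files; L is assumed to be the full sample list.
my %lists = (
    D => [qw(grape pear)],
    I => [qw(apple grape peach pear)],
    L => [qw(apple grape grape peach peach peach peach pear pear pear)],
);

# $fruit{NAME}{D|I|L} = number of times NAME occurs in that list.
my %fruit;
for my $list ( sort keys %lists ) {
    $fruit{$_}{$list}++ for @{ $lists{$list} };
}

# One line per name that appears in L: "name count [D] [I]", highest count first.
sub report {
    my ($f) = @_;
    my @names = sort { $f->{$b}{L} <=> $f->{$a}{L} or $a cmp $b }
                grep { $f->{$_}{L} } keys %$f;
    my @lines;
    for my $name (@names) {
        my @flags = grep { $f->{$name}{$_} } qw(D I);
        push @lines, join ' ', $name, $f->{$name}{L}, @flags;
    }
    return @lines;
}

print "$_\n" for report(\%fruit);
```

Because the L count and the D and I flags live under the same key, counting, sorting, and flagging happen in one pass with no intermediate file.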


stuckinarut
User

Aug 23, 2015, 2:38 PM

Post #6 of 10 (2136 views)
Re: [FishMonger] Flagging list data hybrid situation

Hello, FishMonger:

Thank you for your reply.

I will try to explain this better, and sorry for any confusion ;-(

This time, I will start with STEP 2 (a re-purposed and re-tweaked script from another task).

I finally got rid of the Declaration Errors, but then discovered a new Mystery. So the STEP 2 script (to be further modified) *almost* works accurately using these 3 lists:


Code
listD.txt 
grape
pear


And...


Code
listI.txt 
apple
grape
peach
pear


And...


Code
listL.txt 
apple
grape
grape
peach
peach
peach
peach
pear
pear
pear


When used in the following script:


Code
use strict;   
use warnings;

my %D_list;
my %I_list;
my %L_list;

open my $D_list, '<', 'listD.txt' or die "Cannot open listD.txt: $!";
while (my $line = <$D_list>) {
    chomp $line;
    $line =~ s/\r//g; # removes windows CR characters
    $line =~ s/\s+$//; # removes trailing white spaces
    $D_list{$line} = 1;
}
close $D_list;

open my $I_list, '<', 'listI.txt' or die "Cannot open listI.txt: $!";
while (my $line = <$I_list>) {
    chomp $line;
    $line =~ s/\r//g;
    $line =~ s/\s+$//;
    $I_list{$line} = 1;
}
close $I_list;

my ($L_count, $D_count, $I_count);
open my $L_list, '<', 'listL.txt' or die "Cannot open listL.txt: $!";
while (<$L_list>) {
    chomp;
    s/\r//;
    s/\s+$//;
    $L_count ++;
    print;
    $D_count ++ and print ' D' if exists $D_list{$_};
    $I_count ++ and print ' I' if exists $I_list{$_};
    print "\n";
}

print "L: $L_count; D: $D_count; I: $I_count \n";


This is the output:


Code
apple 
grape I
grape D I
peach I
peach I
peach I
peach I
pear D I
pear D I
pear D I
L: 10; D: 5; I: 10


For some reason, the 'D' flag is missing from the first 'grape' line, and the 'I' flag is missing from the 'apple' line, but I cannot figure out why ;-(

The Total D and I counts at the bottom are correct based upon the actual D and I lists which are matched, but why the 2 entries are not flagged remains a Mystery.

Once this script framework is working 100%, then the OBJECTIVE is to 'integrate' the correct sorting code fixed with BillKSmith's solution and the CONSOLIDATED (SHORTENED) "L" list so that the output looks like this, with the D and I flags:


Code
peach 4 I 
pear 3 D I
grape 2 D I
apple 1 I


NOTE: Or to keep the STEP 1 script and processing separate, and then use that output as the listL.txt input. That's where I'm lost because of the 2 data columns involved instead of just one.

I am using fruit names as examples rather than various strange looking data codes which would only make things more confusing.

Does this help?

I appreciate your efforts to help!

-Stuckinarut


(This post was edited by stuckinarut on Aug 23, 2015, 2:41 PM)


BillKSmith
Veteran

Aug 23, 2015, 9:08 PM

Post #7 of 10 (2113 views)
Re: [stuckinarut] Flagging list data hybrid situation


Quote
For some reason, the 'D' flag is missing from the first 'grape' line, and the 'I' flag is missing from the 'apple' line


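The first 'D' and the first 'I' are missing because the post-increment returns the count as it was before incrementing. On the first match that is zero, and a zero count has a logical value of FALSE, so the print after 'and' is skipped. Use the pre-increment.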


Code
++$D_count and print ' D' if exists $D_list{$_};
++$I_count and print ' I' if exists $I_list{$_};

Good Luck,
Bill
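
The difference can be seen in isolation with a small sketch. The variable names here are illustrative, and the counters are initialised to 0 explicitly (in the script they start out undefined, which the post-increment also returns as 0 on first use).

```perl
use strict;
use warnings;

# Two counters, each incremented once, as on the first matching line.
my ($post, $pre) = (0, 0);

# Post-increment yields the OLD value (0, false), so the flag is skipped.
my $post_flag = ( $post++ ? 'D' : '' );

# Pre-increment yields the NEW value (1, true), so the flag is kept.
my $pre_flag = ( ++$pre ? 'D' : '' );

print "post-increment: count=$post flag='$post_flag'\n";
print "pre-increment:  count=$pre flag='$pre_flag'\n";
```

Both counters end up at 1, which is why the totals at the bottom of the output were correct even though the first flag of each kind was missing. An alternative that avoids relying on the value of the increment at all is to put the count and the print inside an ordinary if block.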


stuckinarut
User

Aug 23, 2015, 9:29 PM

Post #8 of 10 (2110 views)
Re: [BillKSmith] Flagging list data hybrid situation


In Reply To

Quote
For some reason, the 'D' flag is missing from the first 'grape' line, and the 'I' flag is missing from the 'apple' line


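The first 'D' and the first 'I' are missing because the post-increment returns the count as it was before incrementing. On the first match that is zero, and a zero count has a logical value of FALSE, so the print after 'and' is skipped. Use the pre-increment.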


Code
++$D_count and print ' D' if exists $D_list{$_};
++$I_count and print ' I' if exists $I_list{$_};



Ohhhh... THANK YOU, Bill. That SOLVED the problem, and I just learned something else new!!!

-Stuckinarut


stuckinarut
User

Aug 24, 2015, 6:32 AM

Post #9 of 10 (2088 views)
Re: [stuckinarut] Flagging list data hybrid situation

Well, talk about validation of the Perl Mantra (There's more than one way to do it) !!!

I finally managed to get a STEP 1 script working that reads a list from a file instead of <STDIN>, although with pretty ugly workarounds for all the Declaration errors I was getting. It also eliminates having to manually type in 2,059 entries:


Code
#!/usr/bin/perl 

use strict;
use warnings;

my %hash;
my %map;
my %word;
my $map;
my $word;
my @sorted;
my @words;


open FILE1, "listDI.txt" or die;

while (my $words=<FILE1>) {
chomp($words);
$words =~ s/\r//g; # removes windows CR characters
$words =~ s/\s+$//; # removes trailing white spaces

# foreach my $word (@words) {
$map{$words} += 1;
#}

}

# CHECK CONTENTS OF HASH
print "@{[%hash]}";


@sorted = reverse sort { $a cmp $b } @words;

foreach my $key (sort {$map{$b}<=>$map{$a}} keys %map) {
print "$key $map{$key} \n";
}


All of a sudden the idea came to work in REVERSE to reach the final objective:

1. Run STEP 2 to get a 'comboDI.txt' list

2. Run STEP1 with the new 'comboDI.txt' list to get the final desired output (although some additional massaging in Excel will now be required to swap the positions of the D and I flags with the occurrence number totals per entry):


Code
(FINAL WORKAROUND OUTPUT) 
peach I 4
pear D I 3
grape D I 2
apple I 1


Yup... 'There's more than one way to do it' (even if more cumbersome for us NON-Gurus). Less than desired, but at least I can get a first pass of the actual task done.
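
The column swap planned for Excel could also be done in Perl. The following is a rough sketch that assumes every line has the form 'name [D] [I] count' as in the workaround output above; swap_columns is a hypothetical helper name and the sample lines are hard-coded.

```perl
use strict;
use warnings;

# Move a trailing count in front of the flags: "pear D I 3" -> "pear 3 D I".
sub swap_columns {
    my ($line) = @_;
    # $1 = name, $2 = optional flags (D and/or I), $3 = count at end of line.
    $line =~ s/^(\S+)((?:\s+[DI])*)\s+(\d+)\s*$/$1 $3$2/;
    return $line;
}

print "$_\n" for map { swap_columns($_) }
    'peach I 4', 'pear D I 3', 'grape D I 2', 'apple I 1';
```

A line that does not fit the assumed pattern is left unchanged, so the substitution is safe to run over the whole file.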

Thanks to all for your assistance.

-Stuckinarut


BillKSmith
Veteran

Aug 24, 2015, 7:16 AM

Post #10 of 10 (2085 views)
Re: [stuckinarut] Flagging list data hybrid situation

Congratulations! I am glad that you have found a satisfactory solution largely on your own. I suspect that your attempts to explain the problem gave you the insight that you needed.

Now is a good time to stop and consider how you would have done the job if you knew what you know now. Did you save time by modifying existing code rather than starting from scratch? Learn from your answer. (No need to reply or even to write your answer)
Good Luck,
Bill

 
 

