Moving part of Multi-Dimensional Array to an Array

 



lilmark
New User

Dec 27, 2012, 1:14 PM

Post #1 of 10 (2533 views)

Hello, first-time poster and new to Perl, so I'd appreciate the help. :)



Initially I'm reading a text file.

All the records where the 4th column has a 1 value are pushed into @grp

Then, $md[1]=[@grp];

Next, it clears @grp and reads the file again and all records where the 4th column has a 2 value are pushed into @grp

Then, $md[2]=[@grp];

This repeats 50 times in order to create an organized multi-dimensional array of the data from the text file.

This part works as intended.





Now comes the part where I am trying to take all the records that had a value of 1 in the 4th column (collected in @grp and then stored in $md[1]) and combine them into a single output record. Will this work? I am also trying to sort the records to keep them in order. $a below should be a date (e.g. 201211).

@grp = $md[1];
for each $r (sort(@grp))
{
($a, $b, $c, $d, $e, $f, $g, $h) = split /\t/, $r;
print FH "$e\t$f\t$g\t$h\tcalculation\tcalculation\t";
}
print FH "/n time for segment 2";


7stud
Enthusiast

Dec 27, 2012, 4:13 PM

Post #2 of 10 (2526 views)
Re: [lilmark] Moving part of Multi-Dimensional Array to an Array


Code
 
use strict;
use warnings;
use 5.012;

my %results;

while ( my $line = <DATA> ) {
    chomp $line;

    my @pieces = split ' ', $line;    # splits on any group of whitespace
    my $key    = $pieces[3];

    push @{ $results{$key} }, $line;
}

use Data::Dumper;
say Dumper \%results;


for my $key (sort keys %results) {
    my @lines = @{ $results{$key} };

    for (sort @lines) {
        say;
    }
    say "*" x 20;
}


__DATA__
201211 b c 2 a
201203 y z 1 x
201201 bb cc 2 aa
201210 yy zz 1 xx


Code
--output:-- 
$VAR1 = {
'1' => [
'201203 y z 1 x',
'201210 yy zz 1 xx'
],
'2' => [
'201211 b c 2 a',
'201201 bb cc 2 aa'
]
};


201203 y z 1 x
201210 yy zz 1 xx
********************
201201 bb cc 2 aa
201211 b c 2 a
********************


By storing references to the already-split fields instead of the raw lines, you can avoid calling split a second time when you process each record (although at the cost of more complexity):


Code
use strict; 
use warnings;
use 5.012;

my %results;

while ( my $line = <DATA> ) {
    chomp $line;

    my @pieces = split ' ', $line;
    my $key    = $pieces[3];

    push @{ $results{$key} }, \@pieces;
}

use Data::Dumper;
say Dumper \%results;

for my $key (sort keys %results) {
    my $AoA = $results{$key};
    my @ordered_refs = sort { $a->[0] cmp $b->[0] } @$AoA;

    for my $ref (@ordered_refs) {
        say "@$ref";
    }
    say "*" x 20;
}

__DATA__
201211 b c 2 a
201203 y z 1 x
201201 bb cc 2 aa
201210 yy zz 1 xx



Code
--output:-- 
$VAR1 = {
'1' => [
[
'201203',
'y',
'z',
'1',
'x'
],
[
'201210',
'yy',
'zz',
'1',
'xx'
]
],
'2' => [
[
'201211',
'b',
'c',
'2',
'a'
],
[
'201201',
'bb',
'cc',
'2',
'aa'
]
]
};

201203 y z 1 x
201210 yy zz 1 xx
********************
201201 bb cc 2 aa
201211 b c 2 a
********************



(This post was edited by 7stud on Dec 27, 2012, 5:28 PM)


Laurent_R
Veteran / Moderator

Dec 27, 2012, 11:55 PM

Post #3 of 10 (2505 views)
Re: [lilmark] Moving part of Multi-Dimensional Array to an Array

Hi,

I might have missed something, but it seems to me that you are doing a lot of useless copying.

Once you have pushed all the records for a single value to @grp, you could process @grp directly rather than copying it to @md and then copying it back at a later point. And only then look for the records with the next value.

Or else you could write directly to @md and read from @md, but this is slightly more complex.


lilmark
New User

Dec 28, 2012, 7:21 AM

Post #4 of 10 (2496 views)
Re: [Laurent_R] Moving part of Multi-Dimensional Array to an Array

I got it to work.

@grp = @{$md[1]};
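
For readers following along, here is a minimal sketch of why the dereference matters. $md[1] holds a single array reference, so assigning it to an array yields a one-element array containing that reference, while @{ $md[1] } copies the stored rows. The sample rows and the names @wrong and @right are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample rows standing in for the tab-separated records.
my @md;
$md[1] = [ "201210\tyy", "201203\ty" ];

my @wrong = $md[1];         # one element: the array reference itself
my @right = @{ $md[1] };    # two elements: a copy of the stored rows

print scalar(@wrong), "\n";    # number of elements in @wrong
print scalar(@right), "\n";    # number of elements in @right
print "$_\n" for sort @right;  # rows in ascending date order
```

Sorting @wrong would only sort the stringified reference, which is why the loop in the first post never saw the individual records.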



I know it seems like extra copying back and forth, but it's needed to organize all the data in one location: there are 6 columns of data per month (4 years to report on continuously) and 50+ rows (categories) on the report I am automating. So while I do have to do some additional copying initially, I am using loops, which saves me time in the long run.

Otherwise, if I just print directly from @grp as you suggested, what happens if new categories are added or removed? I would then have to return to the code and make adjustments manually. By using the extra copying and loops, this report will sustain itself. I have 100+ reports that I have already automated and set up so they run without any intervention, aside from random issues that pop up sometimes.

Of course I didn't expect you to know that, since you're viewing the situation from another monitor. ;)


FishMonger
Veteran / Moderator

Dec 28, 2012, 8:13 AM

Post #5 of 10 (2492 views)
Re: [lilmark] Moving part of Multi-Dimensional Array to an Array

Copying the arrays back and forth tells me that you're using the wrong data structure. You should be using a multi-level hash structure instead of duplicating arrays.
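
As a rough illustration of what such a structure might look like, the sketch below keys the records first by the 4th column and then by the date in the 1st column. The sample rows are borrowed from 7stud's post; the name %by_group and the assumption of at most one record per group and date are mine, not part of the original script.

```perl
use strict;
use warnings;

# Sample rows from 7stud's post: date, two text columns, group, one more column.
my @rows = (
    '201211 b c 2 a',
    '201203 y z 1 x',
    '201201 bb cc 2 aa',
    '201210 yy zz 1 xx',
);

# %by_group is a hypothetical name: group (4th column) => date (1st column) => fields.
# Assumes at most one record per group/date pair; otherwise push onto an array here.
my %by_group;
for my $row (@rows) {
    my @fields = split ' ', $row;
    my ( $date, $group ) = @fields[ 0, 3 ];
    $by_group{$group}{$date} = \@fields;
}

# Walk groups in numeric order and dates in ascending order; nothing is copied.
for my $group ( sort { $a <=> $b } keys %by_group ) {
    for my $date ( sort keys %{ $by_group{$group} } ) {
        print "$group: @{ $by_group{$group}{$date} }\n";
    }
}
```

New groups appear as new keys automatically, so a report loop written this way does not need editing when categories are added or removed.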


BillKSmith
Veteran

Dec 28, 2012, 1:38 PM

Post #6 of 10 (2463 views)
Re: [lilmark] Moving part of Multi-Dimensional Array to an Array

I do not see how this is an issue with the solution 7stud already suggested, but then he did use another data structure. His basic idea will work with your array.

Code
use strict; 
use warnings;
my @md;
while (my $record = <DATA>) {
    chomp $record;
    my ($col4) = $record =~ /^.*\s+.*\s+.*\s+(\d\d?)/;
    push @{$md[$col4]}, $record;
}

@md = map {join ':', sort @$_} grep {defined} @md;
print join( "\n", @md), "\n";

__DATA__
201211 b c 2 a
201203 y z 1 x
201201 bb cc 2 aa
201210 yy zz 1 xx

Good Luck,
Bill


lilmark
New User

Dec 31, 2012, 11:36 AM

Post #7 of 10 (2414 views)
Re: [BillKSmith] Moving part of Multi-Dimensional Array to an Array

Thanks for the help guys. Sorry if I'm confusing. I tend to work backwards.



I like to start off simple and get it "working" first before making it efficient.

I started my code off by coding it very simple (300 lines).

Then I reviewed and compressed it (100 lines).



I'm still new to Perl, so I'm just not good enough to start coding it at the 100-line efficiency right off the bat.

7stud's code was useful. push @{ $results{$key} }, $line; should allow me to get rid of one of my loops that had to run through the file multiple times.


I probably could have made the code a little more compressed, but I wanted to calculate the total (_tot) separate from the data to make it easier on my head if I ever have to return to the code.

Here's what I ended up with....

#places the data from the text file into a multi-dimensional array
open FP, "$path/Analysis.txt";
while(chomp($r=<FP>))
{
$r=~s/\r//;
@arr=split /\t/, $r;
$dt_str = sprintf "%s %d", $mon[substr($arr[0],4)], substr($arr[0],0,4);
$dt_rng{$arr[0]}=$dt_str;
$nr = "$arr[5]\t$arr[6]\t$arr[7]\t$arr[8]";
push @{ $md[$arr[3]] }, $r;
push @{ $pp_tot{$arr[1]}{$arr[0]} }, $nr;
}
close FP;

$mon_cnt = scalar @{$md[$arr[3]]};

#calculates the total for each Group
foreach $lf_cd (sort keys %pp_tot)
{
$nr = "";
foreach $yrmo (reverse sort keys %{$pp_tot{$lf_cd}})
{
$lv_pp_tot = 0;
$lv_pplss_tot = 0;
$pp_dis_tot = 0;
$pplss_dis_tot = 0;
for ($i=0; $i < scalar @{$pp_tot{$lf_cd}{$yrmo}}; $i++)
{
($lv_pp, $lv_pplss, $pp_dis, $pplss_dis) = split /\t/, $pp_tot{$lf_cd}{$yrmo}[$i];
$lv_pp_tot = $lv_pp_tot + $lv_pp;
$lv_pplss_tot = $lv_pplss_tot + $lv_pplss;
$pp_dis_tot = $pp_dis_tot + $pp_dis;
$pplss_dis_tot = $pplss_dis_tot + $pplss_dis;
}
$nr .= sprintf "<b>$lv_pp_tot\t<b>$lv_pplss_tot\t<b>$pp_dis_tot\t<b>$pplss_dis_tot\t<right><b>%.1f%%\t<right><b>%.1f%%\t", 100*($pp_dis_tot/$lv_pp_tot), 100*($pplss_dis_tot/$lv_pplss_tot);
}
$pp_tot{$lf_cd} = $nr;
}

#prints all the data

open FH, ">$fn1" || die "Cannot open $fn1";
foreach $key (sort keys %pp_tot)
{
for ($i = 1; $i < (scalar @md) + 1; $i++)
{
$j=1;
@grp = @{$md[$i]};
foreach $r (reverse sort(@grp))
{
($yrmo, $lf_cd, $lf_nm, $cnx_cd, $cnx_nm, $lv_pp, $lv_pplss, $pp_dis, $pplss_dis) = split /\t/, $r;
if ($key eq $lf_cd && $i == $cnx_cd)
{
$lf_nm_tot = $lf_nm;
if ($j==1)
{
printf FH "\n<left>$lf_cd\t<left>$lf_nm\t<right>$cnx_cd\t<left>$cnx_nm\t<right>$lv_pp\t<right>$lv_pplss\t<right>$pp_dis\t<right>$pplss_dis\t<right>%.1f%%\t<right>%.1f%%\t", 100*($pp_dis/$lv_pp), 100*($pplss_dis/$lv_pplss);
}
else
{
printf FH "<right>$lv_pp\t<right>$lv_pplss\t<right>$pp_dis\t<right>$pplss_dis\t<right>%.1f%%\t<right>%.1f%%\t", 100*($pp_dis/$lv_pp), 100*($pplss_dis/$lv_pplss);
}
$j++
}
}
}
printf FH "\n\t<left><b>$lf_nm_tot Total\t\t\t%s\n", $pp_tot{$key};
}
print FH "\n";
close FH;


(This post was edited by lilmark on Dec 31, 2012, 11:50 AM)


FishMonger
Veteran / Moderator

Jan 1, 2013, 8:07 AM

Post #8 of 10 (2376 views)
Re: [lilmark] Moving part of Multi-Dimensional Array to an Array

Compressing your code down to fewer lines should not be a primary concern. It is more important that the code has a logical flow, is easy to read and understand, and has no obvious inefficiencies. Keeping those as the primary focus will help keep the code length down to a reasonable level.

Once the code is far enough along in development, you can profile it to find the inefficiencies and then benchmark different methods to improve those inefficiencies.
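
A minimal sketch of the benchmarking step, using the core Benchmark module. The workload (summing a list two ways) and the names sum_c_style and sum_foreach are hypothetical stand-ins, not code from the script under discussion. For the profiling step, a profiler such as the CPAN module Devel::NYTProf is one option.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical workload: sum 10,000 numbers two different ways.
my @values = ( 1 .. 10_000 );

sub sum_c_style {
    my $total = 0;
    for ( my $i = 0 ; $i < scalar @values ; $i++ ) {
        $total += $values[$i];
    }
    return $total;
}

sub sum_foreach {
    my $total = 0;
    $total += $_ for @values;
    return $total;
}

# A negative count means "run each candidate for at least that many CPU seconds".
cmpthese( -1, {
    'c_style'      => \&sum_c_style,
    'foreach_loop' => \&sum_foreach,
} );
```

cmpthese prints a table of rates and relative percentages, so the comparison rests on measurements from your own machine and data rather than on guesswork.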

Please put your code within the "code tags" whenever you post any code. The code tags will retain the indentation which will make it easier for us to read and follow your code logic.

Here are a few code comments for you to consider.

All Perl scripts you write should include the strict and warnings pragmas. Those pragmas will aid in pointing out lots of common coding mistakes which can be difficult to track down as the code expands. So, begin all scripts like this:

Code
#!/usr/bin/perl 

use strict;
use warnings;


The strict pragma requires you to declare your vars. That is normally done with the 'my' keyword, which declares a lexical var, or with the 'our' keyword, which declares a global (package) var if one is required. You very rarely need to use global vars.
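
A small sketch of the two kinds of declaration; the variable names are invented for illustration.

```perl
use strict;
use warnings;

our $report_name = 'Analysis';    # package (global) variable, hypothetical name
my $row_count    = 0;             # lexical variable, visible only in this file

for my $row ( 'a', 'b', 'c' ) {   # $row exists only inside the loop
    $row_count++;
}

print "$report_name: $row_count rows\n";

# Under strict, a misspelling such as $row_cnt++ would stop compilation with
# 'Global symbol "$row_cnt" requires explicit package name' instead of
# silently creating a new variable.
```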


Quote

Code
open FP, "$path/Analysis.txt";


That line has multiple problems.
1) You should ALWAYS check the return code of an open call to verify that it was successful and take proper action if it failed instead of blindly assuming that it succeeded.

2) It is better/safer to use the 3 arg form of open especially if any portion of it was provided by user input.

3) It is best practice to use a lexical var for the file handle instead of a bareword.

Code
open my $analysis_fh, '<', "$path/Analysis.txt" or die "failed to open '$path/Analysis.txt' <$!>";
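
A related pitfall shows up in the open FH, ">$fn1" || die ... line of the posted script: || binds more tightly than the arguments of open, so the die attaches to the filename string and can never fire. The sketch below contrasts the two forms; the path is hypothetical and assumed not to exist.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on the machine running this.
my $missing = '/no/such/dir/report.txt';

# Parsed as open($fh1, '<', ($missing || die ...)): the string is true,
# so die is never reached and the failed open goes unnoticed.
my $opened_with_pipes = open my $fh1, '<', $missing || die "never reached";

# 'or' has very low precedence, so it tests the return value of open itself.
my $error = '';
open my $fh2, '<', $missing or $error = "failed to open '$missing' <$!>";

print "|| form returned: ", ( $opened_with_pipes ? 'true' : 'false' ), "\n";
print "or form reported: $error\n";
```

In a real script the right-hand side of or would normally be die, as in the line above; the sketch records the message instead so that both forms can be shown in one run.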



Quote

Code
while(chomp($r=<FP>))


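That statement is not doing exactly what you think. The condition that it's testing is the return value of chomp (the number of characters removed), not the assignment to $r. When the read hits end of file, $r is undef, so chomp needlessly generates the following warning before the loop exits. And if the last line of the file does not contain a line terminator, chomp returns 0 for it and the loop ends without processing that line.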

Quote
Use of uninitialized value in chomp at ...
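
One common way to avoid both the warning and a dropped last line is to test the read itself and chomp inside the loop body. A minimal sketch, using an in-memory filehandle as a stand-in for the real data file (the sample text and the @records name are made up):

```perl
use strict;
use warnings;

# In-memory stand-in for the data file; note the last line has no newline.
my $text = "201211\tb\n201203\ty";
open my $fh, '<', \$text or die "failed to open in-memory file <$!>";

my @records;
while ( my $r = <$fh> ) {    # Perl implicitly tests defined($r = <$fh>)
    chomp $r;                # safe: $r is always defined here
    $r =~ s/\r$//;           # drop a trailing CR from Windows line endings
    push @records, $r;
}
close $fh;

print scalar(@records), " records read\n";
```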



Quote

Code
for ( $i = 0 ; $i < scalar @{ $pp_tot{$lf_cd}{$yrmo} } ; $i++ ) {


Perl's for loop syntax is cleaner and more efficient than the C style loop you're using.

Code
for my $i ( 0..$#{ $pp_tot{$lf_cd}{$yrmo} } ) {

I might even break it up slightly for additional clarity.

Code
my $last_index = $#{ $pp_tot{$lf_cd}{$yrmo} }; 
for my $i (0..$last_index) {



(This post was edited by FishMonger on Jan 1, 2013, 8:09 AM)


BillKSmith
Veteran

Jan 1, 2013, 3:20 PM

Post #9 of 10 (2357 views)
Re: [lilmark] Moving part of Multi-Dimensional Array to an Array

In most applications, "efficiency" is seldom an issue. It is far more important that readers (including yourself) can understand the code. Few of us have the discipline to rewrite "working" code for the benefit of someone else. You really only have one chance at the truly important decisions such as the data structures.


Code
while(chomp($r=<FP>))  
{

This line will not process your last line of data if it does not have a newline at the end. (Probably not an issue if you are controlling the data, but why take a chance?)
Good Luck,
Bill


Laurent_R
Veteran / Moderator

Jan 1, 2013, 4:14 PM

Post #10 of 10 (2355 views)
Re: [BillKSmith] Moving part of Multi-Dimensional Array to an Array


In Reply To
In most applications, "efficiency" is seldom an issue.


Well, yes and no...

I understand what you mean, but I work daily with gigabytes of data, and efficiency is often an issue...

I recently had an application that was going to take more than 60 days to execute. I rewrote it entirely in Perl, and managed to get it running in 13 hours. This makes a real business difference.

 
 

