Problem in processing multiple files

 



mamat
Novice

Dec 13, 2009, 4:10 AM

Post #1 of 14 (1522 views)
Problem in processing multiple files

Hi all,
My objective is to find the distance a particular node (e.g. Node 8) has travelled. I've built this in parts, but when I combine the pieces it does not work. My input data is as follows:

Quote
Node 8 (847.24, 357.59, 0.00).
Node 17 (570.01, 963.66, 0.00).
Node 20 (1367.53, 987.66, 0.00).
Node 24 (1034.31, 1359.30, 0.00).
Node 4 (1109.00, 242.11, 0.00).
Node 8 (847.40, 358.58, 0.00).


I use this script to extract just the co-ordinates of each node and write them to a corresponding file, one file per node. Note that this creates many files.

Code
#!/bin/bash 
file="./file"
echo -e "Hi, please type the file name: \c "
read word
for ((i=1; i<=25; i++)); do
grep -w "Node \<$i\>" $word | tr -d "()" | awk '{print $3, $4}' > node_$i.txt;
done


With the support of folks in this forum, I have this script that calculates the distance as follows:

Code
#!/usr/bin/perl
use strict;
use warnings;
#no warning 'uninitialized';

use Data::Dumper;

my @points = ();
my $total = 0;
open(IN, "some.txt") or die "$!";
while (my $line = <IN>) {
chomp($line);
my @array = (split (/\s+/, $line));
#print "@array\n";
push @points, [ @array ];
}
close(IN);

print '@points : ', Dumper \@points;
for my $i1 ( 0 .. $#points -1 ){
my ( $x1, $y1, $z1 ) = @{ $points[$i1] };
my ( $x2, $y2, $z2 ) = @{ $points[$i1 + 1 ] };
my $dist = sqrt( ($x2-$x1)**2 + ($y2-$y1)**2 + ($z2-$z1)**2 );
print "distance from ( $x1, $y1, $z1 ) to ( $x2, $y2, $z2 ) is $dist\n";
$total += $dist;
}
print "total distance is $total \n";


Note that I am using just a single file, some.txt, to calculate the distance. While that works, I have many files (see the shell script above) and it's not a good idea to type that many file names manually. So I have this code to read the file names and calculate the distance. This is the part that does not work:

Code
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my @points = ();
my $total = 0;

#http://www.daniweb.com/forums/thread75225.html
opendir (DIR, "/media/KINGSTON/test/") or die "$!";
my @files = grep {/node_.*?\.txt/} readdir DIR;
closedir DIR;
foreach my $file (@files) {
open(FH,"/media/KINGSTON/test/$file") or die "$!";
while (my $line = <FH>){
#read file line by line here
chomp($line);
my @array = (split (/\s+/, $line));
#print "@array\n";
push @points, [ @array ];

print '@points : ', Dumper \@points;

for my $i1 ( 0 .. $#points -1 ){
my ( $x1, $y1 ) = @{ $points[$i1] };
my ( $x2, $y2 ) = @{ $points[$i1 + 1 ] };
my $dist = sqrt( ($x2-$x1)**2 + ($y2-$y1)**2 );
print "distance from ( $x1, $y1 ) to ( $x2, $y2 ) is $dist\n";
$total += $dist;
}

}
close(FH);
}

#open(IN, "some.dat") or die "$!";
#while (my $line = <IN>) {
# chomp($line);
# my @array = (split (/\s+/, $line));
# #print "@array\n";
# push @points, [ @array ];
#}
#close(IN);

#print "total distance is $total \n";

What is the mistake here so that I can calculate the distance for multiple files in one go? I am trying to get the total distance for the co-ordinates contained in each file named node*.txt. For example, for node1.txt the distance is x, for node2.txt it's y, and so on.
Best Regards,
Mamat.


mamat
Novice

Dec 14, 2009, 6:23 PM

Post #2 of 14 (1499 views)
Re: [mamat] Problem in processing multiple files

Can someone please reply? I know there is a lot to read, but only the last snippet of code is relevant to:

Quote
What is the mistake here so that I can calculate the distance for multiple files in one go?


Many Thanks


FishMonger
Veteran / Moderator

Dec 14, 2009, 7:22 PM

Post #3 of 14 (1496 views)
Re: [mamat] Problem in processing multiple files

I haven't tested your code, but the main reason that you haven't had any response is most likely because you haven't given enough details. Saying it doesn't work isn't enough.

What is the script doing that it shouldn't?

What is the script not doing that it should?

Are you receiving any warnings/errors? If so what are they?


(This post was edited by FishMonger on Dec 14, 2009, 7:23 PM)


mamat
Novice

Dec 15, 2009, 2:18 AM

Post #4 of 14 (1484 views)
Re: [FishMonger] Problem in processing multiple files

More details and the errors: I am trying to read through multiple files with names of the form node*.txt and calculate the distance based on the co-ordinates each one contains. Each file represents a different node. The second code snippet posted above correctly calculates the distance for the co-ordinates in just a *single* file, but I want to do the same for multiple files contained in a directory. I am trying to do this using the code below:


Code
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my @points = ();
my $total = 0;

#http://www.daniweb.com/forums/thread75225.html
opendir (DIR, "NodeDist/") or die "$!";
my @files = grep {/node_.*?\.txt/} readdir DIR;
closedir DIR;
foreach my $file (@files) {
open(FH,"NodeDist/$file") or die "$!";
while (my $line = <FH>){
#read file line by line here
chomp($line);
my @array = (split (/\s+/, $line));
#print "@array\n";
push @points, [ @array ];

print '@points : ', Dumper \@points;

for my $i1 ( 0 .. $#points -1 ){
my ( $x1, $y1 ) = @{ $points[$i1] };
my ( $x2, $y2 ) = @{ $points[$i1 + 1 ] };
my $dist = sqrt( ($x2-$x1)**2 + ($y2-$y1)**2 );
print "distance from ( $x1, $y1 ) to ( $x2, $y2 ) is $dist\n";
$total += $dist;
}
print "total distance is $total \n";
}
close(FH);
}


I am getting many lines of this error:
Argument "370.12," isn't numeric in subtraction (-) at calc-dist.pl line 26, <FH> line 1.
I do not know what this error means or whether it hides some other problem.
Appreciate any help!


FishMonger
Veteran / Moderator

Dec 15, 2009, 5:38 AM

Post #5 of 14 (1477 views)
Re: [mamat] Problem in processing multiple files

The error message is telling you that "370.12," isn't numeric because of the trailing comma.

There are several ways to fix that. Try changing this line:

Code
my @array = (split (/\s+/, $line));


to this:

Code
my @array = split(/[,\s]+/, $line);


BTW, there's no need to use that shell script. Perl can very easily process the original data file without creating a bunch of unnecessary files.
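
To see what the changed pattern does, here is a minimal sketch. The sample line is an assumption, shaped like the value in the warning above rather than copied from the real data files.

```perl
use strict;
use warnings;

# Hypothetical sample line, shaped like one line of a node_N.txt file.
my $line = "370.12, 145.30,";

# Splitting on whitespace only leaves the commas attached to the numbers.
my @with_commas = split /\s+/, $line;

# Splitting on runs of commas and/or whitespace removes them.
my @clean = split /[,\s]+/, $line;

print "before: @with_commas\n";
print "after:  @clean\n";
```

split also drops the empty trailing field that the final comma would otherwise produce, so @clean holds exactly two values.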


mamat
Novice

Dec 15, 2009, 9:05 PM

Post #6 of 14 (1470 views)
Re: [FishMonger] Problem in processing multiple files

Thank you very much for your help. I wonder how I would have figured it out otherwise. Plus, I had not thought about processing my input in Perl itself. I used tr because I was a bit familiar with it. Your split code fixes the error.

But how do I stop reading once I reach the end of node1.txt, compute the distance, print it, and move on to the next file, i.e. node2.txt? A problem I notice is that the co-ordinates from every file are stored in the same array, so I am unable to tell whether a distance is for node 1 or node 5.

The output now:

Quote
node_1.txt total distance is 0
distance from ( 219.18, 275.61 ) to ( 220.00, 276.18 ) is 0.998649087517723
node_1.txt total distance is 0.998649087517723
distance from ( 219.18, 275.61 ) to ( 220.00, 276.18 ) is 0.998649087517723
distance from ( 220.00, 276.18 ) to ( 220.82, 276.76 ) is 1.00439036235916
node_1.txt total distance is 3.00168853739461
distance from ( 219.18, 275.61 ) to ( 220.00, 276.18 ) is 0.998649087517723
distance from ( 220.00, 276.18 ) to ( 220.82, 276.76 ) is 1.00439036235916
distance from ( 220.82, 276.76 ) to ( 221.64, 277.33 ) is 0.998649087517723
node_1.txt total distance is 6.00337707478922


The distance is now calculated incorrectly, irrespective of the node file names, and the totals are printed incrementally!

I want to have all the node1.txt co-ordinates read, calculate their distance, and then move on to the next node. What do I need to do to achieve this? If there is an example that demonstrates this, please let me know.
Thanks Again.


(This post was edited by mamat on Dec 15, 2009, 11:49 PM)
Attachments: simple-time2.txt (4.41 KB)


FishMonger
Veteran / Moderator

Dec 16, 2009, 11:41 AM

Post #7 of 14 (1444 views)
Re: [mamat] Problem in processing multiple files

Here's the base looping structure I'd use.


Code
#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

my %nodes;
my $nodes = 'simple-time2.txt';

open my $node_FH, '<', $nodes or die "can't open <$nodes> $!";
while( <$node_FH> ) {
    chomp;
    s/[()]//g;
    my (undef,$node,@points) = split /[,\s]+/;
    push @{$nodes{$node}}, [ @points ];
}
#print Dumper \%nodes;

foreach my $node ( keys %nodes ) {
    print "Node -> $node:\n";
    # print Dumper $nodes{$node};

    foreach my $points ( @{ $nodes{$node} } ) {
        print Dumper $points;
    }
}


Instead of storing the co-ordinates in an array, you could use a hash where the keys are the node numbers, like I've done with the source data.


mamat
Novice

Dec 18, 2009, 5:34 AM

Post #8 of 14 (1401 views)
Re: [FishMonger] Problem in processing multiple files

Given your code, I thought I could access the co-ordinates and compute the distance as earlier. But I can't.

What is the purpose of

Quote
foreach my $points ( @{ $nodes{$node} } )

The first foreach is used to print the values of the hash, but the second one?

While I could access the co-ordinates via my ( $x1, $y1 ) = @{$points}; how do I get to the next co-ordinate ($x2, $y2) so that I can use the distance formula? I used a for loop, which does not print the values:

Code
for my $i ( 0 .. $#{ $nodes{$node} } ) { 
print "$i = $nodes{$node}[$i] \n"; }



I think I am not understanding the hash despite reading about it for the last 2 days! The second foreach in particular has confused me. Please help me access the co-ordinates so that I can apply the formula.

Thank you for your patience.


FishMonger
Veteran / Moderator

Dec 18, 2009, 7:05 AM

Post #9 of 14 (1394 views)
Re: [mamat] Problem in processing multiple files

I'm not sure how to respond without providing a complete solution to your homework assignment.

Does this help?

Code
#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

my %nodes;
my $nodes = 'simple-time2.txt';

open my $node_FH, '<', $nodes or die "can't open <$nodes> $!";
while( <$node_FH> ) {
    chomp;
    s/[()]//g;
    my (undef,$node,@points) = split /[,\s]+/;
    push @{$nodes{$node}}, [ @points ];
}


foreach my $node ( sort {$a <=> $b} keys %nodes ) {

    last if $node == 3; # quit after 2 examples

    print "\nArray Node -> $node:\n";

    foreach my $points ( @{ $nodes{$node} } ) {
        print "array of points: @$points\n";

        foreach my $point ( @$points ) {
            print "\t point: $point\n";
        }
        print "\n";
    }
}

Outputs:

Code
C:\TEMP>mamat.pl

Array Node -> 1:
array of points: 219.18 275.61 0.00.
	 point: 219.18
	 point: 275.61
	 point: 0.00.

array of points: 220.00 276.18 0.00.
	 point: 220.00
	 point: 276.18
	 point: 0.00.

array of points: 220.82 276.76 0.00.
	 point: 220.82
	 point: 276.76
	 point: 0.00.

array of points: 221.64 277.33 0.00.
	 point: 221.64
	 point: 277.33
	 point: 0.00.


Array Node -> 2:
array of points: 575.58 203.87 0.00.
	 point: 575.58
	 point: 203.87
	 point: 0.00.

array of points: 576.30 204.56 0.00.
	 point: 576.30
	 point: 204.56
	 point: 0.00.

array of points: 577.02 205.26 0.00.
	 point: 577.02
	 point: 205.26
	 point: 0.00.

array of points: 577.74 205.95 0.00.
	 point: 577.74
	 point: 205.95
	 point: 0.00.

array of points: 578.46 206.64 0.00.
	 point: 578.46
	 point: 206.64
	 point: 0.00.

array of points: 579.18 207.33 0.00.
	 point: 579.18
	 point: 207.33
	 point: 0.00.


I looped over the array elements, but you'll need to loop over the array indices so that you can access 2 elements at the same time.
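
As an illustration of looping over indices rather than elements, here is a small sketch on a plain array. The data and the variable names are made up for the example and are not taken from the input file.

```perl
use strict;
use warnings;

# Hypothetical data, unrelated to the node co-ordinates.
my @values = ( 10, 13, 19, 20 );

# Stop one short of the last index so that $i + 1 is always a valid index.
my @gaps;
for my $i ( 0 .. $#values - 1 ) {
    push @gaps, $values[ $i + 1 ] - $values[$i];
}

print "gaps: @gaps\n";
```

With an index in hand you can reach element $i and element $i + 1 in the same pass, which a plain element-by-element foreach does not give you.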


FishMonger
Veteran / Moderator

Dec 18, 2009, 7:11 AM

Post #10 of 14 (1393 views)
Re: [FishMonger] Problem in processing multiple files

The 3rd number in each set of points is always 0.00 and it doesn't appear that you're using that in your code, so I'd not include it in the array.

You could change:

Code
push @{$nodes{$node}}, [ @points ];


To this:

Code
push @{$nodes{$node}}, [ @points[0,1] ];
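
The @points[0,1] part is an array slice. A minimal sketch of what it returns, using a made-up triple shaped like one parsed input line:

```perl
use strict;
use warnings;

# Hypothetical triple, shaped like the fields left after the split.
my @points = ( '219.18', '275.61', '0.00.' );

# An array slice returns the listed elements, in the order given.
my @xy = @points[ 0, 1 ];

print "kept: @xy\n";
```

Note the @ sigil on the slice: it yields a list, whereas $points[0] yields a single element.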



mamat
Novice

Dec 19, 2009, 9:18 PM

Post #11 of 14 (1368 views)
Re: [FishMonger] Problem in processing multiple files

Thank you for your time and the code. Below is the version of the code that does what I want. I was trying to understand what the second foreach loop does in the last posts, because I didn't know how to iterate through the hash like an array. And I am still not clear on the $_, $, #, @ combinations.

Code
#!/usr/bin/perl -w 

use strict;
use warnings;
use Data::Dumper;

my %nodes;
my $nodes = $ARGV[0];
my $time = $ARGV[1];
my $total = 0;
my $speed = 0;

open my $node_FH, '<', $nodes or die "can't open <$nodes> $!";
while( <$node_FH> ) {
chomp;
s/[()]//g;
my (undef,$node,@points) = split /[,\s]+/;
push @{$nodes{$node}}, [ @points[0,1] ];
}
#print Dumper \%nodes;

foreach my $node ( sort {$a <=> $b} keys %nodes ){
#print "$node\n";
foreach my $i ( 0 .. $#{$nodes{$node}} -1 ){
my ( $x1, $y1 ) = @{$nodes{$node}[$i]};
my ( $x2, $y2 ) = @{$nodes{$node}[$i+1]};
#print "x1: $x1 y1: $y1\n";
#print "x2: $x2 y2: $y2\n";
my $dist = sqrt( ($x2-$x1)**2 + ($y2-$y1)**2 );
$total += $dist;
$speed = $total/$time;
}
#print " node: $node \t distance is $total m\n";
print " node: $node \t speed: $speed m/s\n";
}

Your code really helped me finish a lot faster (this is not homework or a job).


FishMonger
Veteran / Moderator

Dec 20, 2009, 8:12 AM

Post #12 of 14 (1347 views)
Re: [mamat] Problem in processing multiple files

You're Welcome.

A couple comments.

Use either the -w (warnings) switch or the warnings pragma, not both.

The pragma is preferred because it is lexically scoped and can be turned on/off as needed.

The -w switch is globally scoped: it sets $^W for the whole program, so it also applies to files loaded with 'use' or 'require' that don't use the warnings pragma themselves. If those scripts/modules are not warnings safe, you'll get warnings that could be difficult to track down and fix.

You need to use proper indentation. At first glance your foreach loops appeared to be inside the while block. The closing brace for a while block should be at the same level of indentation as the start of the while keyword and the next section of code (i.e., your foreach block) should be at that same level of indentation.

It's generally not necessary to initialize a var to 0 unless not doing so could generate a warning or error.

It's best practice to test/verify that required user supplied data was actually provided and take proper action if it wasn't.
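
One way to do that check is sketched below. The sub name check_args, the messages, and the rule that the time must be a positive number are assumptions made for this example; adapt them to what your script actually needs.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a message describing the problem,
# or undef when both arguments look usable.
sub check_args {
    my @args = @_;
    return "usage: $0 <node-file> <time-in-seconds>" if @args != 2;

    my ( $file, $time ) = @args;
    return "time must be a positive number, got '$time'"
        unless $time =~ /^\d+(?:\.\d+)?$/ && $time > 0;

    return;    # nothing wrong
}

# Try a few hypothetical argument lists.
for my $try ( [ 'nodes.txt', '10' ], ['nodes.txt'], [ 'nodes.txt', '0' ] ) {
    my $problem = check_args(@$try);
    print defined $problem ? "rejected: $problem\n" : "accepted: @$try\n";
}
```

In the real script you would call check_args(@ARGV) before opening the file and die with the returned message. Rejecting a time of 0 also guards the division used for the speed.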


Quote
I didn't know how to iterate through the hash like an array.

You don't. The data structure we're using here is a HoA (Hash of Arrays). So, the first foreach loop is looping over the hash keys where the associated values are arrays. The second foreach loop is looping over the array elements.
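
Since the sigils were part of the question, here is a small sketch of the common ways to reach into a hash of arrays. The hash contents are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical hash of arrays: node number => list of [x, y] pairs.
my %nodes = (
    1 => [ [ 0, 0 ], [ 3, 4 ] ],
    2 => [ [ 1, 1 ] ],
);

my $aref  = $nodes{1};                # $ : one value, here a reference to node 1's array
my @pairs = @{ $nodes{1} };           # @{ } : dereference to get the array itself
my $last  = $#{ $nodes{1} };          # $#{ } : index of that array's last element
my $count = scalar @{ $nodes{1} };    # an array in scalar context gives its length
my ( $x, $y ) = @{ $nodes{1}[1] };    # unpack the second pair of node 1

print "node 1 has ", scalar @$aref, " pairs\n";
print "last index $last, count $count, second pair ($x, $y)\n";
```

The outer loop walks the keys of %nodes; everything inside it works on the one array that $nodes{$node} refers to.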


mamat
Novice

Dec 21, 2009, 5:46 AM

Post #13 of 14 (1328 views)
Re: [FishMonger] Problem in processing multiple files

The code above has a logic error. The line $total += $dist; is in the wrong place. The procedure is to sum the distances between the co-ordinates of Node 1, then compute its speed, and then move on to Node 2.

But because of the above line, Node 1's total distance is also included in Node 2's. As a result, Node n's distance is a huge number when in reality the distances of all the nodes should be roughly the same. Because of this error in the distance, the speed is incorrect as well.

If I put $dist outside the foreach loop, the distance is always 0. Where do I put $dist so that only Node 2's distance is calculated, without including any other node's distance?

Thanks again.


FishMonger
Veteran / Moderator

Dec 21, 2009, 7:21 AM

Post #14 of 14 (1323 views)
Re: [mamat] Problem in processing multiple files

These 2 vars are declared in the wrong scope/block.

Code
my $total = 0;  
my $speed = 0;


Lexical vars should be declared in the smallest scope/block that they require, not the largest. They need to be declared in the foreach loop; I'll leave it to you to figure out which one.
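
As a general illustration of the scoping point, here is a sketch on unrelated, made-up data. It is not the fix for the script above, only the principle: a lexical declared inside a loop body is a fresh variable on every iteration.

```perl
use strict;
use warnings;

# Hypothetical data: group name => list of numbers.
my %groups = ( a => [ 1, 2, 3 ], b => [ 10, 20 ] );

my %sum_for;
for my $name ( sort keys %groups ) {
    my $sum = 0;    # declared inside the loop, so each group starts from zero
    $sum += $_ for @{ $groups{$name} };
    $sum_for{$name} = $sum;
    print "$name: $sum\n";
}
```

Had $sum been declared before the loop, group b's result would have included group a's numbers as well.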

 
 

