Nested looping help

 



davidcassidy22
Novice

Oct 25, 2015, 11:55 AM

Post #1 of 6 (1377 views)
Nested looping help

Hello, I was wondering if anyone can lend me a hand with my program using nested loops. I have two files of data. For each line in the first file, I need to look through the second file and grab the whole line that has the closest values for two of the columns in the first file. My program right now is printing out a line from the second file that doesn't match anything from the first file and is repeatedly printing it. Both files are rather large; the first is 32000 lines and the second is 340000 lines.

A sample of the first file:


Code
262.0747 -34.9933 5
264.7288 -34.9913 5
262.1343 -34.9886 5
262.7769 -34.9886 5
257.7683 -34.9848 5
266.7092 -34.981 5
262.4659 -34.9788 5
257.0192 -34.9775 5
262.7218 -34.9773 5


A sample of the second file:


Code
068.22971 -55.84485 36.4 23.0 4.6
068.15411 -55.83293 46.5 28.5 27.3
068.20515 -55.83111 31.5 25.4 5.9
068.36025 -55.82560 38.5 33.8 0.0
068.35223 -55.82085 23.3 8.3 13.7
068.27966 -55.81310 35.3 35.2 12.2
068.24229 -55.80592 65.1 54.6 29.3
068.37510 -55.80102 31.6 25.4 6.8
068.25218 -55.80001 28.4 18.3 6.2
068.39895 -55.79662 29.5 28.2 22.9


I am trying to get, from the second file, the closest values to the first two columns of the first file. Even though the values in the samples don't look close, as I said, the second file has 340000 lines and covers a large range.

Here is my code:


Code
#!/usr/bin/perl -w 
use strict;
use warnings;

my$ir_path = "\/home\/master\/files";
my$intab = "$ir_path\/data\/herschel\/herschel\_test\/herschel\_unmatched\_data\_test\.txt";
my$intab2 = "$ir_path\/data\/herschel\/herschel\_test\/MSXlist\_preirsa\_edited\_test\.txt";
my$outtab = "$ir_path\/data\/herschel\/herschel\_test\/herschel\_matched\_data\_test\.txt";

die " FILE $intab NOT FOUND\!\n" if (! -f $intab) ;

unlink ("$outtab") if (-e $outtab);

open INT , "$intab" or die "Cannot open file $intab" ;
open INT2, "$intab2" or die "Cannot open file $intab2" ;
open OUTT, ">$outtab" or die "Cannot open file $outtab" ;

my $nn=0;
my @ra;
my @dec;
my @F250;
my @F350;
my @F500;
my @rarad;
my @decrad;

while (<INT>) {
    next if /^\s*$/;
    ($ra[$nn],$dec[$nn],$F250[$nn],$F350[$nn],$F500[$nn]) = (split)[0,1,2,3,4];
    $rarad[$nn] = ($ra[$nn]/180)*3.14159;
    $decrad[$nn] = ($dec[$nn]/180)*3.14159;
    $nn=$nn+1;
}

$nn=0;
my @ramsx;
my @decmsx;
my @ramsxrad;
my @decmsxrad;

while (<INT2>) {
    next if /^\s*$/;
    ($ramsx[$nn],$decmsx[$nn]) = (split)[0,1];
    $ramsxrad[$nn] = ($ramsx[$nn]/180)*3.14159;
    $decmsxrad[$nn] = ($decmsx[$nn]/180)*3.14159;
    $nn=$nn+1;
}

my @compare;
my $ramin;
my $decmin;
my $F250min;
my $F350min;
my $F500min;
#my $min_compare=1000;

for (my $ii = 0;$ii<32067; $ii++) {
    my $min_compare=1000;
    for (my $jj = 0;$jj<340968;$jj++) {
        $compare[$jj] = ((sin($decrad[$jj])*sin($decmsxrad[$ii]))+(cos($decrad[$jj])*cos($decmsxrad[$ii])*cos($rarad[$jj]-$ramsxrad[$ii])));
        if ($compare[$jj] < $min_compare) {
            $min_compare = $compare[$jj];
            $ramin = $ra[$jj];
            $decmin = $dec[$jj];
            $F250min = $F250[$jj];
            $F350min = $F350[$jj];
            $F500min = $F500[$jj];
            # print "$ramin\n";
        }
    }
    printf OUTT "%-10s %-10s %-10s %-10s %-10s\n",$ramin,$decmin,$F250min,$F350min,$F500min;
}


close OUTT;
close INT;
close INT2;
print "Done\, Herschel would be proud!\n";
exit(0);


I am still a perl noob and I'm sure my code looks rather ugly, but I'd appreciate any help. My attempt to solve this was as follows:

-Loop through the first file
-Set an arbitrary minimum compare value
-Loop through the second file
--Evaluate the equation whose smallest value I want to find
--If the equation gives something smaller than the current minimum compare value, set the minimum compare value to that result and set my variables to the values from the second file that produced it
--Continue for each line in the second file until the smallest compare value is found, then print out the variables
--Continue this process for each line in the first file

I'm sorry if I've worded this badly and I apologize for my ugly and inefficient code. Thanks in advance


BillKSmith
Veteran

Oct 25, 2015, 3:29 PM

Post #2 of 6 (1367 views)
Re: [davidcassidy22] Nested looping help

I do not see anything wrong with your perl. Better use of perl's complex data structures would improve the clarity of your code by greatly reducing the number of variables. I suspect that your problem is in your theory. Are you certain that the equation for compare is correct and that it is coded correctly? Can compare return a negative value? If so, are negative values handled correctly? Perhaps you want the smallest absolute value rather than the least value.

You have done a good job of asking your question in perl terms only. This is almost always a good idea. In this case, I may be able to help more if you do provide some of the underlying astronomy.
Good Luck,
Bill
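As a rough illustration of the data-structure suggestion above, each line can be stored as one hash reference in a single array instead of in several parallel arrays. This is only a sketch: the sub name read_catalog and the field names (ra, dec, flux, line) are made up for the example, and it reads from an in-memory sample rather than the real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse whitespace-separated lines into an array of hash references.
# The sub name and the field names are illustrative assumptions.
sub read_catalog {
    my ($fh) = @_;
    my @rows;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;              # skip blank lines
        my ($ra, $dec, @flux) = split ' ', $line;
        push @rows, {
            ra   => $ra,
            dec  => $dec,
            flux => [@flux],                   # remaining columns, if any
            line => $line,                     # original text, handy for output
        };
    }
    return \@rows;
}

# Demo on an in-memory file so the sketch runs without the real data.
my $sample = <<'END';
068.22971 -55.84485 36.4 23.0 4.6

068.15411 -55.83293 46.5 28.5 27.3
END
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my $catalog = read_catalog($fh);
close $fh;

printf "%d rows, first RA = %s, first flux = %s\n",
    scalar @$catalog, $catalog->[0]{ra}, $catalog->[0]{flux}[0];
```

Keeping the original line in each record means the matching loop can print the whole line later without tracking one variable per column.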


davidcassidy22
Novice

Oct 26, 2015, 7:59 AM

Post #3 of 6 (1359 views)
Re: [BillKSmith] Nested looping help

Hi Bill, I really appreciate you looking over my code. The equation is correct, but it can take on negative values; you're right, I should be looking for the lowest absolute value of $compare. The underlying astronomy is that my first file is a list of the sources I am interested in. The second file is a whole point source catalog from a Herschel infrared sky survey. I want to grab data from the sky survey for the sources in my first file, so I am trying to find the survey sources with the closest RA and Dec match. Instead of assuming a Euclidean distance and using the Pythagorean theorem, $compare uses the equation for angular separation (strictly, $compare is the cosine of the angular separation).

I will try to use an absolute value to see if this is going to help and if not, I will post what is being printed to the outfile.


BillKSmith
Veteran

Oct 26, 2015, 8:52 AM

Post #4 of 6 (1352 views)
Re: [davidcassidy22] Nested looping help

You stated that $compare is intended to be the cosine of the angular separation. You should verify that it is always in the range -1 <= $compare <= +1. If not, you have a problem either with your equation or with floating point round off. In either case, your program needs fixing.

Remember that cos(0)=1. The minimum angular separation corresponds to the maximum of $compare. Do not take absolute value. A negative $compare corresponds to a separation of more than 90 deg.
Good Luck,
Bill
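To make the point about cos(0)=1 concrete, here is a minimal sketch of a nearest-match search that keeps the largest cosine instead of the smallest, and clamps it to the range -1 to +1 before converting back to an angle. It is not the original poster's program: the sub names cos_separation and nearest and the tiny three-row catalog are assumptions for the example, with coordinates borrowed from the samples in the first post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $PI = 4 * atan2(1, 1);

# Cosine of the angular separation between two (RA, Dec) positions in degrees.
sub cos_separation {
    my ($ra1, $dec1, $ra2, $dec2) = map { $_ * $PI / 180 } @_;
    return sin($dec1) * sin($dec2)
         + cos($dec1) * cos($dec2) * cos($ra1 - $ra2);
}

# Find the catalog row nearest to ($ra, $dec).
# Returns (index, separation in degrees), or an empty list for an empty catalog.
sub nearest {
    my ($ra, $dec, $catalog) = @_;
    my ($best_idx, $best_cos) = (undef, -2);    # -2 is below any valid cosine
    for my $i (0 .. $#$catalog) {
        my $c = cos_separation($ra, $dec, @{ $catalog->[$i] }[0, 1]);
        ($best_idx, $best_cos) = ($i, $c) if $c > $best_cos;   # larger cosine = closer
    }
    return unless defined $best_idx;
    # Clamp against floating-point round-off before inverting the cosine.
    $best_cos =  1 if $best_cos >  1;
    $best_cos = -1 if $best_cos < -1;
    my $sep = atan2(sqrt(1 - $best_cos**2), $best_cos) * 180 / $PI;   # acos via atan2
    return ($best_idx, $sep);
}

# Hypothetical mini catalog of [RA, Dec] pairs in degrees.
my @catalog = (
    [264.7288,  -34.9913],
    [262.1343,  -34.9886],
    [68.22971, -55.84485],
);
my ($idx, $sep) = nearest(262.0747, -34.9933, \@catalog);
printf "nearest row: %d (%.4f deg away)\n", $idx, $sep;
```

Comparing cosines directly avoids one inverse trig call per pair; the angle is only computed once, for the winning row. Returning the separation also makes it easy to reject matches that are too far away to be the same source.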


davidcassidy22
Novice

Oct 29, 2015, 8:08 AM

Post #5 of 6 (1248 views)
Re: [BillKSmith] Nested looping help

You are correct: I needed to make sure that the acos of compare was smaller than the min compare. That ended up working. However, my second list isn't as complete as I thought it was and does not contain sources close enough to the ones in my list of interest.

Just thought I'd update, I really appreciate the help again.


BillKSmith
Veteran

Oct 29, 2015, 3:41 PM

Post #6 of 6 (1230 views)
Re: [davidcassidy22] Nested looping help

Sorry that your project did not work out as planned. In my work on that project, I came across a module that would have been very useful. Check out Math::Trig. Its great_circle_distance function is exactly what you needed for angular separation. Do not ignore the other functions and the constants.
Good Luck,
Bill
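For anyone finding this thread later, a short sketch of the Math::Trig approach might look like the following. The helper name to_sphere is made up for the example. Note that great_circle_distance expects its second angle of each pair measured from the pole rather than from the equator, so Dec is passed as 90 - Dec; with the radius argument left at its default of 1, the result is the separation in radians.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Math::Trig qw(great_circle_distance deg2rad rad2deg);

# Convert (RA, Dec) in degrees to the (theta, phi) pair Math::Trig expects.
# theta is the longitude-like angle; phi is measured from the pole, hence 90 - Dec.
# The helper name is an assumption for this sketch.
sub to_sphere {
    my ($ra, $dec) = @_;
    return (deg2rad($ra), deg2rad(90 - $dec));
}

# Two positions taken from the first sample file in the thread.
my @src = to_sphere(262.0747, -34.9933);
my @cat = to_sphere(262.1343, -34.9886);

# With the radius left at its default of 1, the distance is an angle in radians.
my $sep_deg = rad2deg(great_circle_distance(@src, @cat));
printf "separation: %.4f deg\n", $sep_deg;
```

The module also exports pi and the degree/radian converters used here, which avoids hand-typed constants such as 3.14159.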

 
 

