Average second column of a file for those with same value in first column

 



sessmurda
Novice

Dec 9, 2010, 4:43 PM

Post #1 of 5 (1714 views)
Average second column of a file for those with same value in first column

I have a file with the following format:

TC370218:Scaffold54496 99.5798319327731
TC370218:Scaffold511205 100
TC370218:Scaffold511205 97.9020979020979
TC370218:Scaffold511205 100
TC370218:Scaffold95465 98.6486486486486
TC370218:Scaffold407488 100
TC370218:Scaffold365965 91.044776119403
TC399120:Scaffold83115 99.2606284658041
TC399120:Scaffold39414 86.5800865800866
TC399120:Scaffold39414 92.8571428571429
TC399120:Scaffold39414 93.2203389830508
TC380518:Scaffold442599 99.4515539305302
TC380518:Scaffold442599 98.5915492957747
TC411939:Scaffold53981 98.4924623115578
TC411939:Scaffold53981 100
TC411939:Scaffold53981 95.4248366013072
TC411939:Scaffold53981 100
TC411939:Scaffold53981 100
TC411939:Scaffold475634 95.6204379562044
TC411939:Scaffold475634 98.4615384615385
TC411939:Scaffold475634 98.2456140350877
TC411939:Scaffold257493 89.5424836601307
TC411939:Scaffold532037 96.4285714285714
TC411939:Scaffold286517 96
TC411939:Scaffold130041 92.6470588235294
TC411939:Scaffold434155 97.8260869565217
TC411939:Scaffold306822 94.4444444444444
TC411939:Scaffold299507 97.8260869565217
TC411939:Scaffold67975 93.1034482758621
TC411939:Scaffold180522 94.3396226415094

For most of the lines this is not an issue, but for lines that share the same first column I need to average all the values in the second column and print the name and the resulting average once. I know there must be a way to do this with hashes, but I don't understand them well enough to do something this complicated. Any suggestions?


rovf
Veteran

Dec 10, 2010, 2:29 AM

Post #2 of 5 (1709 views)
Re: [sessmurda] Average second column of a file for those with same value in first column

You can use the value of the first column as the hash key. The hash value would be a reference to an array holding the values to be averaged. Read your file line by line, filling up the hash. Afterwards, go through the hash and calculate the average of the numbers in each array.

Ronald
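
The strategy above can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the helper names `group_values` and `average_by_key` are hypothetical, the sample lines are copied from the question, and the columns are assumed to be separated by whitespace (spaces or tabs).

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: collect second-column values under their first-column key.
sub group_values {
    my @lines = @_;
    my %values_for;
    for my $line (@lines) {
        my ($key, $value) = split ' ', $line;    # split on any run of whitespace
        next unless defined $value;              # skip blank or malformed lines
        push @{ $values_for{$key} }, $value;     # the array ref is created on first use
    }
    return \%values_for;
}

# Hypothetical helper: reduce each key's list of values to its mean.
sub average_by_key {
    my ($values_for) = @_;
    my %avg;
    for my $key (keys %$values_for) {
        my @values = @{ $values_for->{$key} };
        $avg{$key} = sum(@values) / @values;     # every list holds at least one value
    }
    return \%avg;
}

# Sample lines taken from the question; a real script would read them from a file.
my @sample = (
    'TC380518:Scaffold442599 99.4515539305302',
    'TC380518:Scaffold442599 98.5915492957747',
    'TC411939:Scaffold286517 96',
);

my $avg = average_by_key(group_values(@sample));
printf "%s %.4f\n", $_, $avg->{$_} for sort keys %$avg;
```

Keys that occur only once simply average to their single value, so no special case is needed for them.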


sessmurda
Novice

Dec 13, 2010, 3:54 PM

Post #3 of 5 (1680 views)
Re: [rovf] Average second column of a file for those with same value in first column

Thanks for the strategy. So I guess my main issue is I am unsure of how to calculate the average. My code below is telling me that it is dividing by 0 at line 3310867, which has a normal value like any other.


Code
use Data::Dumper; 
use List::Util qw/sum/;
sub avg { sum(@_) / @_ }

open (IN, "$ARGV[0]") || die "nope\n";
my %hash;

# Read data line by line
while (<IN>)
{
    chomp;
    my $line = $_;

    my $key = (split/\t/, $line)[0];
    push @{ $hash{$key} }, $line;
}
my $avg = avg @{ $hash{$key} };
print $avg,"\n";
close IN;


I used Data::Dumper before calculating the average to be sure that the hash loads correctly, which it does. Any suggestions?

Thanks


rovf
Veteran

Dec 14, 2010, 12:28 AM

Post #4 of 5 (1660 views)
Re: [sessmurda] Average second column of a file for those with same value in first column

"line 3310867"

Were it not for the odd line number, I would say: the only place where you divide is in sub avg, which means that the number of parameters passed to it is zero.

So, first, I would find out how Perl comes to that strange line number...
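
A likely explanation, offered here as an assumption rather than something confirmed in the thread: Perl appends `<HANDLE> line N` to runtime error messages, where N is the number of the last line read from that filehandle, so 3310867 may be the last line of the data file rather than a line of the script. In the posted code, `$key` is declared with `my` inside the loop, so the `avg` call after the loop sees an unrelated, undefined `$key`, passes an empty list, and divides by zero. The loop also pushes the whole line instead of the second column. A minimal sketch that avoids these problems, using an in-memory string as a stand-in for the input file:

```perl
use strict;      # would have flagged the out-of-scope $key at compile time
use warnings;
use List::Util qw(sum);

# Return undef instead of dividing by zero when called with no values.
sub avg { return @_ ? sum(@_) / @_ : undef }

# Made-up, tab-separated stand-in for the real input file.
my $data = "a\t1\na\t3\nb\t10\n";
open my $in, '<', \$data or die "Cannot open in-memory data: $!";

my %hash;
while (my $line = <$in>) {
    chomp $line;
    my ($key, $value) = split /\t/, $line;
    push @{ $hash{$key} }, $value;    # store the value, not the whole line
}
close $in;

# Average inside a loop over the keys, where each $key is in scope.
for my $key (sort keys %hash) {
    print "$key\t", avg(@{ $hash{$key} }), "\n";
}
```

With `use strict` enabled, the original `avg @{ $hash{$key} }` line outside the loop would fail to compile with a "Global symbol "$key" requires explicit package name" error, which points at the scoping problem directly.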


sessmurda
Novice

Dec 14, 2010, 1:15 PM

Post #5 of 5 (1613 views)
Re: [rovf] Average second column of a file for those with same value in first column

Yeah, I finally ended up fixing the script, only to find out that what I was doing was biologically incorrect, so I have to write a different script. Thanks for the reply and the help.

 
 

