Average over multiple files

 



venuu99
Novice

Feb 9, 2011, 3:55 AM

Post #1 of 2 (695 views)
Average over multiple files

Hi Gurus

I am trying to find the average of one value per row across multiple files, each of which has multiple rows, but I have not been able to get it to work. Example:

File1
chr1 150455032 150455092 4178 20288
chr17 42589293 42589353 3708 251844

File2
chr1 150455032 150455092 2704 20288
chr17 42589293 42589353 1701 251844

I have multiple files like this with multiple rows. The output I want is the average of the 4th element in each row across all files (e.g. (4178 + 2704) / 2, (3708 + 1701) / 2, etc.).

Here is what I tried. It prints the average only for the first row from all files. It probably has something to do with the loop and the printing. Can someone please help me find the problem?

use Statistics::Basic qw(:all);

@count = glob "*.count";

foreach $file (@count)
{
    open IN, $file;

    while ($line = <IN>)
    {
        @exomes = split("\t", $line);
        push(@probe_count, $exomes[3]);
        close IN;
    }

    $mean = mean(\@probe_count);
}

print "$mean\n";

Thank you very much.

V


Karazam
User

Feb 9, 2011, 5:05 AM

Post #2 of 2 (690 views)
Re: [venuu99] Average over multiple files

From your description I get the impression that you want two separate values:
the average of the fourth column of the lines that begin with chr1 in every
file, and ditto for the lines that begin with chr17.

However, what the code actually does is open one file, read its first line,
and then close the file handle (because "close IN;" is inside the while loop),
so the remaining rows of that file are never read. The fourth column of that
first row is pushed onto @probe_count, which is never emptied, and $mean is
recomputed after every file, overwriting the previous value (because
"$mean = mean(\@probe_count);" is within the foreach loop). So when all files
have been read, it prints a single number: the mean of the first row of every
file.
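
To see the effect of a close inside the read loop (as in the posted code), here is a small sketch. It reads from an in-memory string instead of a real file, and count_lines() is just a name made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Three "rows" held in memory so the sketch needs no files on disk.
my $data = "row1\nrow2\nrow3\n";

# count_lines() is a hypothetical helper: it returns how many lines the
# while loop manages to read, closing the handle inside the loop on request.
sub count_lines {
    my ($close_inside) = @_;
    no warnings 'closed';    # silence "readline() on closed filehandle"

    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
    my $n = 0;
    while ( defined( my $line = <$fh> ) ) {
        $n++;
        close $fh if $close_inside;    # the next read returns undef
    }
    close $fh unless $close_inside;
    return $n;
}

printf "close inside the loop: %d line(s) read\n", count_lines(1);
printf "close after the loop:  %d line(s) read\n", count_lines(0);
```

With the close inside the loop only one line is read; with the close after the loop all three are.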

I have rewritten the code to match the description (hoping I understood it
right). Some idioms I changed are:

- use the three-argument form of open
- the simpler while (<$fh>) { ... } assigns each line to the default variable $_
- split splits on whitespace by default (and operates on $_ by default)
- put split in parentheses to take a list slice of its result directly
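
The idioms in the list above can be seen together in a small sketch. The sub name fourth_column() is made up for this illustration, and the file names are whatever you pass on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# fourth_column() is a hypothetical helper: it returns the fourth
# whitespace-separated field of every non-blank line in one file.
sub fourth_column {
    my ($path) = @_;

    # three-argument open with a lexical file handle
    open my $fh, '<', $path or die "Cannot open $path: $!";

    my @values;
    while (<$fh>) {                  # each line lands in $_
        next unless /\S/;            # skip blank lines
        push @values, (split)[3];    # split $_ on whitespace, keep index 3
    }
    close $fh;
    return @values;
}

# Print the fourth column of every file named on the command line.
print join( ", ", fourth_column($_) ), "\n" for @ARGV;
```

The parentheses around split are what allow the [3] subscript: they turn the result into a list that can be sliced.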

Hope this helps.


Code
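#!/usr/bin/perl
use warnings;
use strict;
use Statistics::Basic qw(:all);

my ( @chr1, @chr17 );
my @count = glob "*.count";

foreach my $file ( @count ) {
    # three-argument open; stop with a message if the file cannot be read
    open my $fh, '<', $file or die "Cannot open $file: $!";
    while (<$fh>) {
        # collect the fourth column, grouped by the name in the first column
        push @chr1, (split)[3] if /^chr1\s+/;
        push @chr17, (split)[3] if /^chr17\s+/;
    }
    close $fh;
}

# one mean per chromosome, taken over all files
my $mean1 = mean( \@chr1 );
my $mean17 = mean( \@chr17 );
print "$mean1, $mean17\n";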
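
If the real files have many more rows than chr1 and chr17, one array per chromosome will not scale. Here is a sketch of a more general variant that keeps a running sum and count per row in hashes. It assumes that a row is identified by its first three columns and that the same row appears in every file; row_means() is a name made up for this illustration, and it uses plain arithmetic instead of Statistics::Basic.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# row_means() is a hypothetical helper: given a list of file names, it
# returns a hash ref mapping "chrom start end" to the mean of column 4.
sub row_means {
    my @files = @_;
    my ( %sum, %n );

    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while (<$fh>) {
            my @f = split;
            next unless @f >= 4;                  # skip blank or short lines
            my $key = join ' ', @f[ 0 .. 2 ];     # assumed row identifier
            $sum{$key} += $f[3];
            $n{$key}++;
        }
        close $fh;
    }

    my %mean;
    $mean{$_} = $sum{$_} / $n{$_} for keys %sum;
    return \%mean;
}

# Average over every *.count file in the current directory.
my $mean = row_means( glob '*.count' );
print "$_\t$mean->{$_}\n" for sort keys %$mean;
```

For the two example files in the question this gives 3441 for the chr1 row and 2704.5 for the chr17 row.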


 
 

