PERL code help

 



perl student
Novice

Jun 11, 2013, 3:23 AM

Post #1 of 10 (638 views)
PERL code help

Hello members/perl experts,

I am a Perl student and need some urgent help/suggestions. I need to write a script that would read some files, get some information, do some calculations and report the result....

Can anybody help please...

Thanks


BillKSmith
Veteran

Jun 11, 2013, 4:17 AM

Post #2 of 10 (632 views)
Re: [perl student] PERL code help

It is not possible to teach you Perl in the space of a forum reply. Read the first few chapters of your textbook and try to do the problems. If your solutions do not work, show them to us. We can probably explain what is wrong.
Good Luck,
Bill


Laurent_R
Veteran / Moderator

Jun 11, 2013, 12:04 PM

Post #3 of 10 (621 views)
Re: [perl student] PERL code help


In Reply To
I am a Perl student and need some urgent help/suggestions. I need to write a script that would read some files, get some information, do some calculations and report the result....


How can we help you with such a poor description of what you need?

I suggest the following code:


Code
#!/usr/bin/perl 

use strict;
use warnings;

read_some_file();
get_some_information();
do_some_calculations();
report_the_result();


And we are done! Oh, well, you still have to define the four subroutines, but that's a piece of cake.


perl student
Novice

Jun 12, 2013, 1:25 AM

Post #4 of 10 (616 views)
Re: [Laurent_R] PERL code help

Well I have 12 files....In each file there are entries such as

# rsid chromosome position genotype
rs3094315 1 742429 AA
rs12562034 1 758311 GG
rs3934834 1 995669 CC
rs3094315 1 742429 AA
rs12562034 1 758311 GG
rs3934834 1 995669 CC

I need to write a perl script that will read the 12 files and report the genotype and allele frequencies for each SNP (rs...).
I have 12 files like this and each snp (rs...) will be present in each file once. So the population size for each SNP is 12.

To see how the genotype and allele frequencies are calculated, please refer to the attached example file (Exmple.txt). In that file I have calculated the genotype and allele frequencies of the first SNP, rs3094315. Like all the other SNPs, it is present once in each of the 12 files...

Now I want to implement this in perl....

Read the 12 files, for each SNP calculate the genotype and allele frequency and present the results....

The image shows how my results should be printed... Once again, this is for the one SNP rs3094315.
Attachments: Exmple.txt (1.12 KB)
  My_output.png (28.4 KB)


perl student
Novice

Jun 12, 2013, 4:50 AM

Post #5 of 10 (613 views)
Re: [perl student] PERL code help

I did write a script but am not getting the desired output.... I guess I should not hard-code the genotypes but rather let the script find them under the genotype heading in the data files, because the genotypes will be different... But I am just starting with Perl and am not that proficient yet...

If you wish I can send you the script...


Laurent_R
Veteran / Moderator

Jun 12, 2013, 10:36 AM

Post #6 of 10 (604 views)
Re: [perl student] PERL code help

Yes, post your script (using the code tags); we will probably be able to tell you what is going wrong.

If I understood your example correctly, all you need to do is look at the last field of each line in your files, count the frequency of the pairs, and then calculate how many A, C, G and T (or whatever) you have as percentages. Are these calculations to be made on a file-by-file basis, or for the aggregate data of all 12 files?
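
The first step described above (take the last field of each line and group it by SNP) could be sketched roughly as follows. This is only an illustration, not the poster's script: the sub name collect_genotypes is hypothetical, the input is assumed to be whitespace-separated with the genotype in the fourth column as in the sample lines, and the demo uses two made-up in-memory "files" instead of the real 12.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect the genotype column for every SNP across several inputs.
# Takes a list of open filehandles and returns a reference to a
# hash of arrays: rsid => [ genotype, genotype, ... ].
# (collect_genotypes is a hypothetical name used for illustration.)
sub collect_genotypes {
    my @handles = @_;
    my %snp_data;
    for my $fh (@handles) {
        while ( my $line = <$fh> ) {
            next if $line =~ /^\s*(?:#|$)/;    # skip the "# rsid ..." header and blank lines
            # Fields are assumed to be: rsid chromosome position genotype
            my ( $rsid, $genotype ) = ( split ' ', $line )[ 0, 3 ];
            next unless defined $genotype;     # skip lines with fewer than 4 fields
            push @{ $snp_data{$rsid} }, $genotype;
        }
    }
    return \%snp_data;
}

# Demo on two in-memory "files" (made-up data standing in for the real files).
my @samples = (
    "# rsid chromosome position genotype\nrs3094315 1 742429 AA\nrs12562034 1 758311 GG\n",
    "# rsid chromosome position genotype\nrs3094315 1 742429 AG\nrs12562034 1 758311 GG\n",
);
my @handles;
for my $text (@samples) {
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    push @handles, $fh;
}
my $snp_data = collect_genotypes(@handles);
print "$_: @{ $snp_data->{$_} }\n" for sort keys %$snp_data;
```

With real data you would open each of the 12 files and pass the handles in the same way. Grouping by rsid first means the frequency calculations can then be done per SNP over the aggregate of all files.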


perl student
Novice

Jun 12, 2013, 11:44 PM

Post #7 of 10 (591 views)
Re: [Laurent_R] PERL code help

For each SNP (rs....), the script will look for the data in all 12 files... In my example I have shown the data of one SNP collected from all 12 files....

My script is attached.
Attachments: SNPFreqCalc_Complete.pl (3.18 KB)


BillKSmith
Veteran

Jun 13, 2013, 4:24 AM

Post #8 of 10 (583 views)
Re: [perl student] PERL code help

I tested your script by commenting out all your input and hand coding your sample data into the hash-of-arrays %snp_data.

Code
my %snp_data = (rs3094315 => [qw(AA AA AA AA AG AA AA AA AA AG AG AA)],);



The output file contained:

Code
SNP id	Population Size	allele1	allele2	homozygous (Allele 1) frequency	Heterozygous frequency	homozygous (Allele 2) frequency	allele1 frequency	allele2 frequency 
rs3094315 12 A G 75 0 25 0.875 0.125


It appears to be correct. The problem is that it seems to lack the generality to process full files. (i.e. 'A' and 'G' are hardcoded in several places.) Can you tell us how it has to be generalized?
Good Luck,
Bill
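
The figures in that output can be cross-checked by counting directly from the hand-coded genotype list. The sketch below is a minimal illustration under the assumption that every genotype is exactly two letters; the sub name snp_frequencies is hypothetical and is not taken from the attached script.

```perl
use strict;
use warnings;

# Count genotypes and derive allele frequencies for one SNP.
# Returns two hash references: genotype => count, allele => frequency.
# Assumes each genotype string is two single-letter alleles, e.g. "AG".
# (snp_frequencies is a hypothetical name used for illustration.)
sub snp_frequencies {
    my ($genotypes) = @_;
    my ( %geno_count, %allele_count );
    for my $genotype (@$genotypes) {
        $geno_count{$genotype}++;
        $allele_count{$_}++ for split //, $genotype;
    }
    my $total_alleles = 2 * @$genotypes;    # two alleles per sample
    my %allele_freq =
        map { $_ => $allele_count{$_} / $total_alleles } keys %allele_count;
    return ( \%geno_count, \%allele_freq );
}

my @sample = qw(AA AA AA AA AG AA AA AA AA AG AG AA);
my ( $geno_count, $allele_freq ) = snp_frequencies( \@sample );

printf "%s: %d of %d (%.1f%%)\n",
    $_, $geno_count->{$_}, scalar @sample, 100 * $geno_count->{$_} / @sample
    for sort keys %$geno_count;
printf "%s: %.3f\n", $_, $allele_freq->{$_} for sort keys %$allele_freq;
```

For this sample the counts are 9 AA and 3 AG out of 12 (75% and 25%, with no GG), and the allele frequencies are 21/24 = 0.875 for A and 3/24 = 0.125 for G. The allele frequencies match the output above; it may be worth double-checking which column the 25 ends up in.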


perl student
Novice

Jun 13, 2013, 5:52 AM

Post #9 of 10 (581 views)
Re: [BillKSmith] PERL code help

Yes, that is my problem.... I want it to be generic... In other words, I want the script to read the data present under the heading genotype in the data files for each SNP and work out the genotypes. So, whatever genotypes it finds for each SNP, it does the calculation accordingly...


BillKSmith
Veteran

Jun 13, 2013, 8:44 AM

Post #10 of 10 (573 views)
Re: [perl student] PERL code help

I think your input is correct. (I cannot test it without data.) Here is my first try at processing the hash. I do not know the biology, so my variable names may seem very strange.

Code
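my %snp_data = (rs3094315 => [qw(AA AA AA AA AG AA AA AA AA AG AG AA)],);

while (my ($snpID, $types) = each %snp_data) {
    my %allelefreq;    # allele   => fraction of all alleles
    my %genofreqs;     # genotype => percentage of samples
    foreach (@$types) {
        $genofreqs{$_} += (100 / @$types);
        # Each genotype holds two alleles; each counts as half a sample.
        my @alleles = split //, $_, 2;
        $allelefreq{$alleles[0]} += (.5 / @$types);
        $allelefreq{$alleles[1]} += (.5 / @$types);
    }
    do {
        local $, = ' ';    # separate the printed fields with a space
        # OUTPUT is assumed to be the filehandle your script already opened.
        print OUTPUT $snpID, scalar @$types, keys %allelefreq,
            values %genofreqs, values %allelefreq, "\n";
    };
}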


I know that some of the values are in the wrong order. I have to leave something for you to do. Hint: Sort the hashes. Refer to perldoc -q "sort a hash"
Good Luck,
Bill
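
One possible way to follow the "sort the hashes" hint is to take the keys in sorted order and look the values up by key, so that the columns always come out in the same order. The sketch below is only an illustration, not Bill's or the poster's code: the sub name report_line is hypothetical, genotypes are assumed to be two letters, and it counts first and divides afterwards instead of accumulating fractions.

```perl
use strict;
use warnings;

# Build one report line for a SNP with a fixed column order:
# id, population size, alleles (sorted), genotype percentages
# (sorted by genotype), allele frequencies (sorted by allele).
# (report_line is a hypothetical name used for illustration.)
sub report_line {
    my ( $snp_id, $types ) = @_;
    my ( %geno_count, %allele_count );
    for my $genotype (@$types) {
        my @pair = sort split //, $genotype;    # so "GA" and "AG" count together
        $geno_count{ join '', @pair }++;
        $allele_count{$_}++ for @pair;
    }
    my $n         = @$types;                    # population size
    my @alleles   = sort keys %allele_count;
    my @genotypes = sort keys %geno_count;
    return join ' ', $snp_id, $n, @alleles,
        ( map { $geno_count{$_} * 100 / $n } @genotypes ),
        ( map { $allele_count{$_} / ( 2 * $n ) } @alleles );
}

print report_line( 'rs3094315', [qw(AA AA AA AA AG AA AA AA AA AG AG AA)] ), "\n";
# prints: rs3094315 12 A G 75 25 0.875 0.125
```

Note that a genotype that never occurs (GG in this sample) produces no column at all, so a real report with a fixed header would still need to fill in zeros for the missing genotypes.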

 
 

