
lily3344
New User
Dec 15, 2008, 10:07 PM
Post #1 of 4
(1533 views)
Extracting fasta sequences?
OK, I need to extract some sequences from a FASTA-format file that looks like this:

    >FBpp0100000
    MKATSCRPIPFINFRIVRYFYRPRRLNKYYDIYRIAALVRDRACLFTCQQ
    ASQNLQQQQPRFYSAPGRRAGFFSQFFDNMKAEMDKNKEIKDNIRKFREE
    AQKLEESDALKSARQKFNIVESEAQKSSSMLKEQLGAIKERVGDVLEDAS
    KSHLAKKVTEELSKKARGVSDTISDTSGKLGQTSAFQAISNTTTTIKKEM
    DSASIENRVYRAPAKLRKRVQLVMSDSDRVVEPNTEATGMELHKDSKFYE
    SWENFKNNNTYVNKVLDWKVKYDESENPVIRASRLLTDKVSDVMGGLFSK
    TELSETMTELVKIDPSFDQKDFLRDCETDIIPNILESIVRGDLEILKDWC
    FESTFNIIANPIKEAKKAGVYLDSKILDIENIELAMGKVMEQGPVLIITF
    QAQQIMCVRDQKSQVVEGDPEKVMRVHYVWVLCRDRNELNPKAAWRLMEL
    SANSSEQFV
    >FBpp0100001
    MRRVPPTDAEMQPNRARFKKYNVWASALQEDALSENMRGCDVTRSGRDRN
    VENYDFSLRYRLNGENTLKRRLSNSSEDGGECSHPAHKRGRPSSRPITGN
    QQRGLVKSRTGHRSRRGTSSASGSSDFCEPRHILDLNEVGERDPSDVATE
    MASKLYEEKDELLVRVVEVLGIDVCLELYKETQRIEADGGMMIKNGIRRR
    TPGGVFLFLIKHHDNITQEQQKRIFSEDRQSLSKSRKQIETLMRDRKVEE
    LKKCLSKQVTELPTLNQRKEYYMQGDEQSEDKQPGSLSNPPPSPVGAEQE
    HDSPEYRTHEININLVDNAELPSTSKAAAAAQGAPLKDLISYDHDFLDVN
    CGDMDFF

Suppose I just want one of the sequences. I tried the following:

    sub readlines {
        my @line;
        my @results = ();
        my $proteins = @proteins;
        while (my $line = readline ($FILE)) {
            if ($line =~ /$proteins/) {
                next until $line =~ /^>/;
            }elsif ($line =~ tr/$proteins//cd) {
                my $lines .= $line;
            }
        }
        return ($lines);
    }
    }

This doesn't work, but it's not giving me any feedback or error. Is there a possibility that it actually works, and my computer is just slow? The FASTA file is about 11 MB. Can someone help me out?
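
For comparison, here is a minimal sketch of one way to pull a single record out of a FASTA file by its ID. It is not a fix of the subroutine above, just an illustration under some assumptions: the ID is taken to be the first whitespace-delimited token after `>`, the name `extract_sequence` is made up for this example, and the demo reads from a tiny in-memory string with invented IDs rather than the real file. Because it reads one line at a time and stops at the next header, a file of about 11 MB should not be a problem for this approach.

```perl
use strict;
use warnings;

# Return the sequence of the record whose header ID equals $wanted_id,
# or undef if no such record exists. $fh is an open read filehandle.
# (Hypothetical helper name; the ID is assumed to be the first token after '>'.)
sub extract_sequence {
    my ($fh, $wanted_id) = @_;
    my $in_record = 0;    # true while we are inside the wanted record
    my $sequence;         # stays undef until the wanted header is seen

    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\r$//;                 # tolerate Windows line endings
        if ($line =~ /^>(\S+)/) {
            last if $in_record;           # next header reached: record is complete
            if ($1 eq $wanted_id) {
                $in_record = 1;
                $sequence  = '';
            }
        }
        elsif ($in_record) {
            $line =~ s/\s+//g;            # drop any stray whitespace
            $sequence .= $line;           # join the wrapped sequence lines
        }
    }
    return $sequence;
}

# Demo on a small in-memory FASTA; these IDs and residues are invented.
my $fasta = ">seqA\nMKAT\nSCRP\n>seqB\nMRRV\nPPTD\n";
open my $fh, '<', \$fasta or die "Cannot open in-memory file: $!";
my $seq = extract_sequence($fh, 'seqB');
close $fh;
print defined $seq ? "$seq\n" : "not found\n";
```

To use it on a real file, open the file with a lexical filehandle (for example `open my $fh, '<', $path or die ...`, where `$path` is your own file name) and pass an ID such as `FBpp0100001`. Running with `use strict; use warnings;` also tends to surface problems that would otherwise pass silently, such as a variable declared with `my` inside a block and then used outside it.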