[HELP] find results in list

 



acpguedes
Novice

Oct 16, 2011, 8:15 AM

Post #1 of 19 (3991 views)

I have a list like posted below

Code
---------------------------------------------------------------- 
miRNA: hsa-miR-92b* #variable is here!

Hybrid:
mfe: -29.5 kcal/mol #variable is here!

position 29 #variable is here!
target 5' U A A U C 3' #variable is here!
GCUG ACUGC UUCC UCCC #variable is here!
UGAC UGGCG AGGG AGGG #variable is here!
miRNA 3' G G C C A 5' #variable is here!


Miranda:

Position: 29 #variable is here!

Query: 3' guGACGUGGCGCAGGGCAGGGa 5' #variable is here!
||| ||:|| |:|| |||| #variable is here!
Ref: 5' tgCTGAACTGCATTCCTTCCCc 3' #variable is here!

Energy: -24.110001 kCal/Mol #variable is here!


Rybrid vs miRanda: OK
----------------------------------------------------------------
miRNA: hsa-miR-4689 #variable is here!

Hybrid:
mfe: -34.0 kcal/mol #variable is here!

position 3 #variable is here!
target 5' U G ACCC U U 3' #variable is here!
GU CCCACCAUG UC UCCUCA #variable is here!
CG GGGUGGUAC AG AGGAGU #variable is here!
miRNA 3' C G U 5' #variable is here!

Miranda:

Position: 3 #variable is here!

Query: 3' ccggGGGUGGUAC----A-GAGGAGUu 5' #variable is here!
||||||||| | :|||||| #variable is here!
Ref: 5' tgtgCCCACCATGACCCTCTTCCTCAt 3' #variable is here!

Energy: -29.650000 kCal/Mol #variable is here!


Rybrid vs miRanda: OK


Each result always starts with "miRNA:" and ends with "Rybrid vs miRanda: OK".

The most important field is "mfe:", which I have already tried to match with a regexp.
"mfe" is a numerical value, like "mfe: -22.4 kcal/mol".
I want to check whether each "mfe" value in the input falls within a range that I set, e.g. between -28 and -30.
The output file should contain every result whose "mfe" falls within that range, and it must contain everything between "miRNA:" and "Rybrid vs miRanda: OK".


I tried to write a script, but it does not work: it prints only the matching "mfe" line to the output file.


Code
#!/usr/bin/perl 


use strict;

use warnings;

print "Write input file name\n\t";
my $input = <STDIN>;
print "Write max energy. Like: -28\n\t";
my $min = <STDIN>;
print "Write min energy. Like: -30\n\t";
my $max = <STDIN>;
print "Write output file name\n\t";
my $output = <STDIN>;


open IN, $input or die "cannot open input";
open OUT,">". $output or die usage ();


while(<IN>){
if($_ =~ m/(mfe:s+(-?d.*.?d.*?)s+kcal/mol)/gi){
if($2 <= $min && $2 >= $max){
print OUT " $1.$/ ";

}
}
}


If someone can give me a hint, an example, part of a script, or a rewrite of the script I have already written, I would be very grateful.


FishMonger
Veteran / Moderator

Oct 16, 2011, 9:24 AM

Post #2 of 19 (3987 views)
Re: [acpguedes] [HELP] find results in list

Start by using chomp on each of the input vars.

e.g.,

Code
chomp(my $input = <STDIN>);
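
To illustrate why chomp matters here: a line read from STDIN keeps its trailing newline, so a file name taken from it would end in "\n" and the open would most likely fail. The following is only a minimal sketch; the helper name clean_input is hypothetical and does not come from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: strip the trailing newline (and any carriage return
# left over from Windows-style input) from a line typed by the user.
sub clean_input {
    my ($line) = @_;
    chomp $line;          # removes the trailing "\n" (the current value of $/)
    $line =~ s/\r\z//;    # also drop a stray "\r", if there is one
    return $line;
}

my $raw   = "input.txt\n";       # what <STDIN> returns after typing input.txt
my $clean = clean_input($raw);

printf "raw length: %d, clean length: %d\n", length $raw, length $clean;
```

The same helper could be applied to all four values read from STDIN before they are used.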



acpguedes
Novice

Oct 16, 2011, 11:48 AM

Post #3 of 19 (3980 views)
Re: [acpguedes] [HELP] find results in list

Thanks for the tip, but my question was how to focus the text.
Do I need to read the FILEHANDLE into an array?
Should I use FOR instead of WHILE?
If so, how would the FOR loop be used?
Could you give me an example?


rovf
Veteran

Oct 17, 2011, 3:19 AM

Post #4 of 19 (3921 views)
Re: [acpguedes] [HELP] find results in list

What do you mean by "focusing the text"?

Anyway - although I don't quite understand your question, I noticed in your regexp


Code
m/(mfe:s+(-?d.*.?d.*?)s+kcal/mol)/gi


that you probably forgot the backslash in front of "s" and "d".


BillKSmith
Veteran

Oct 17, 2011, 5:18 AM

Post #5 of 19 (3918 views)
Re: [rovf] [HELP] find results in list

The '/' in the units also needs a backslash.
Good Luck,
Bill
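
As an illustration of the two fixes mentioned so far (backslashes on \s and \d, and the "/" in the units), here is a minimal sketch of a stricter pattern. The helper name extract_mfe is hypothetical, and the number format (optional minus sign, digits, optional decimal part) is an assumption based on the sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: return the numeric mfe value found on a line, or
# undef when the line is not an "mfe:" line.
sub extract_mfe {
    my ($line) = @_;
    # m{...} uses braces as delimiters, so the "/" in kcal/mol needs no escape.
    # \s and \d need their backslashes; without them the pattern would look
    # for the literal letters "s" and "d".
    if ( $line =~ m{mfe:\s+(-?\d+(?:\.\d+)?)\s+kcal/mol}i ) {
        return $1;
    }
    return;
}

for my $line ( "mfe: -29.5 kcal/mol", "position 29" ) {
    my $mfe = extract_mfe($line);
    print defined $mfe ? "matched $mfe\n" : "no match\n";
}
```

With the default m/.../ delimiters the same pattern works as long as the slash is written as \/.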


rovf
Veteran

Oct 17, 2011, 6:39 AM

Post #6 of 19 (3909 views)
Re: [BillKSmith] [HELP] find results in list

Ooops, you are right. But this means that the code posted by the OP did not even compile!


acpguedes
Novice

Oct 17, 2011, 5:44 PM

Post #7 of 19 (3872 views)
Re: [rovf] [HELP] find results in list

the code compiles, and regexp is alright, but it only handles the line that contains "mfe: blahblahblah".

What I want is to write all lines of a result to the output file, from "miRNA" through "Rybrid vs miRanda: OK".

But a result should be printed only if the blahblahblah in its "mfe: blahblahblah kcal/mol" line is between min and max.


rovf
Veteran

Oct 18, 2011, 1:05 AM

Post #8 of 19 (3847 views)
Re: [acpguedes] [HELP] find results in list


Quote
the code compiles, and regexp is alright


The code which you posted, doesn't compile - maybe you made an error when pasting the code to the forum, but the regexp is syntactically incorrect, as you can easily see:


Code
C:\>perl -lwe "m/(mfe:s+(-?d.*.?d.*?)s+kcal/mol)/gi" 
Unmatched ( in regex; marked by <-- HERE in m/( <-- HERE mfe:s+(-?d.*.?d.*?)s+kcal/ at -e line 1.



acpguedes
Novice

Oct 18, 2011, 5:48 AM

Post #9 of 19 (3843 views)
Re: [rovf] [HELP] find results in list

Which syntax would be correct?


acpguedes
Novice

Oct 18, 2011, 6:13 AM

Post #10 of 19 (3837 views)
Re: [acpguedes] [HELP] find results in list

OK, the correct syntax is

Code
/(mfe:\s+(\-?\d.*\.?\d.*?)\s+kcal\/mol)/


but that still has not solved the problem.

What I want is this:

I have a list of results; the results are separated by a line of dashes ("----------------------------------------------------------------").

A sample list is below:


Code
---------------------------------------------------------------- 
miRNA: hsa-miR-92b*

Hybrid:
mfe: -29.5 kcal/mol

position 29
target 5' U A A U C 3'
GCUG ACUGC UUCC UCCC
UGAC UGGCG AGGG AGGG
miRNA 3' G G C C A 5'


Miranda:

Position: 29

Query: 3' guGACGUGGCGCAGGGCAGGGa 5'
||| ||:|| |:|| ||||
Ref: 5' tgCTGAACTGCATTCCTTCCCc 3'

Energy: -24.110001 kCal/Mol


Rybrid vs miRanda: OK
----------------------------------------------------------------
miRNA: hsa-miR-4689

Hybrid:
mfe: -34.0 kcal/mol

position 3
target 5' U G ACCC U U 3'
GU CCCACCAUG UC UCCUCA
CG GGGUGGUAC AG AGGAGU
miRNA 3' C G U 5'


Miranda:

Position: 3

Query: 3' ccggGGGUGGUAC----A-GAGGAGUu 5'
||||||||| | :||||||
Ref: 5' tgtgCCCACCATGACCCTCTTCCTCAt 3'

Energy: -29.650000 kCal/Mol


Rybrid vs miRanda: OK


This list, for example, has 2 results.

So, I want a script that writes the results to an output file one by one,
but each result should be written only if the value of "mfe: blablabla kcal/mol" is within a range.

For example, if I set the range to between -28 and -30, it will write only the first result to the output file, which looks like this:


Code
miRNA:  hsa-miR-92b* 

Hybrid:
mfe: -29.5 kcal/mol

position 29
target 5' U A A U C 3'
GCUG ACUGC UUCC UCCC
UGAC UGGCG AGGG AGGG
miRNA 3' G G C C A 5'


Miranda:

Position: 29

Query: 3' guGACGUGGCGCAGGGCAGGGa 5'
||| ||:|| |:|| ||||
Ref: 5' tgCTGAACTGCATTCCTTCCCc 3'

Energy: -24.110001 kCal/Mol


Rybrid vs miRanda: OK


* Note that this is the only result in the list whose "mfe:" value is between -28 and -30.

I'm hoping for some help.

Thanks


(This post was edited by acpguedes on Oct 18, 2011, 6:14 AM)


BillKSmith
Veteran

Oct 18, 2011, 6:58 AM

Post #11 of 19 (3828 views)
Re: [acpguedes] [HELP] find results in list

Do I understand correctly? You want to copy each set of results to the output, omitting the mfe line if its value is not between min and max.

A 'set' begins with a miRNA line and ends with a Rybrid line. Lines between the sets (i.e. the dash lines) should not be copied.
Good Luck,
Bill


acpguedes
Novice

Oct 18, 2011, 7:37 AM

Post #12 of 19 (3827 views)
Re: [BillKSmith] [HELP] find results in list

You understand, with one correction:
if the value on the mfe line is not between min and max, the entire result should be omitted.

A 'set' begins with a miRNA line and ends with a Rybrid line.
But everything in the set should be copied!

How should I do it?


rovf
Veteran

Oct 18, 2011, 8:01 AM

Post #13 of 19 (3822 views)
Re: [acpguedes] [HELP] find results in list


Quote
each result will be written only if the value of "mfe: blablabla kcal / mol" is within a range


Hmmmm.... But this is how you have implemented it already: You extract the number from the regexp (stored in $2), and write it to the file only if it is between $min and $max. What else are you missing?


acpguedes
Novice

Oct 18, 2011, 8:05 AM

Post #14 of 19 (3821 views)
Re: [rovf] [HELP] find results in list

Actually I do not know, because if you test it you will see that the script is not working


FishMonger
Veteran / Moderator

Oct 18, 2011, 8:12 AM

Post #15 of 19 (3816 views)
Re: [acpguedes] [HELP] find results in list

If you need to output the entire record/block based on one of the lines in that block, then you should read-in the file by chunks, not line-by-line.

See my answer in your duplicate question.
http://perlguru.com/gforum.cgi?post=59043#59048
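
Reading by chunks can be sketched as follows. This is not the code from the linked answer, only a minimal illustration under stated assumptions: the separator is taken to be a line of 64 dashes (adjust it to match the real file), the helper name select_records is hypothetical, and $low/$high are used in their conventional sense (low is the more negative bound).

```perl
use strict;
use warnings;

my $SEPARATOR = '-' x 64;    # assumed length of the dashed line between results

# Hypothetical helper: read $fh record by record and return the records
# whose mfe value lies within [$low, $high].
sub select_records {
    my ( $fh, $low, $high ) = @_;
    local $/ = $SEPARATOR;    # a "line" now ends at the dashes, not at "\n"
    my @keep;
    while ( my $record = <$fh> ) {
        chomp $record;        # chomp removes the current $/, i.e. the dashes
        next unless $record =~ m{^mfe:\s+(-?\d+(?:\.\d+)?)\s+kcal/mol}m;
        push @keep, $record if $1 >= $low && $1 <= $high;
    }
    return @keep;
}

# Shortened sample data, built with the same separator.
my $data = join "\n",
    $SEPARATOR, "miRNA: hsa-miR-92b*", "mfe: -29.5 kcal/mol", "Rybrid vs miRanda: OK",
    $SEPARATOR, "miRNA: hsa-miR-4689", "mfe: -34.0 kcal/mol", "Rybrid vs miRanda: OK", "";

open my $fh, '<', \$data or die "cannot open in-memory data: $!";
my @hits = select_records( $fh, -30, -28 );
close $fh;

print scalar(@hits), " record(s) kept\n";
print $_ for @hits;
```

Using local inside the sub keeps the change to $/ from affecting any other reads in the program.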


(This post was edited by FishMonger on Oct 18, 2011, 8:13 AM)


rovf
Veteran

Oct 18, 2011, 8:13 AM

Post #16 of 19 (3814 views)
Re: [acpguedes] [HELP] find results in list


Quote
because if you test it you will see that the script is not working


Well, for one, you haven't posted the original script yet (the first script had the incorrect regexp, and who knows what else is wrong, so you should always post a complete script, if you want people to try it out). Best is to upload (as a zip file) the script, together with the data file.

Second, to say "it is not working" is too unspecific. What does it mean? Does it crash? Does it print out the first few sentences of the Bible instead of the desired result? Does it output the wrong records (and, if yes, which ones)?


BillKSmith
Veteran

Oct 18, 2011, 9:35 AM

Post #17 of 19 (3810 views)
Re: [acpguedes] [HELP] find results in list

Forget about sets. Copy all the lines except the bad ones.


Code
while (<IN>) {
    if ( $_ =~ m/mfe:\s+(-?\d*\.?\d*)\s+kcal\/mol/i ) {
        next unless ( $1 <= $min && $1 >= $max );
    }
    print OUT $_;
}



Note: You have reversed max and min from their conventional meanings. I have kept your notation.
Good Luck,
Bill


FishMonger
Veteran / Moderator

Oct 18, 2011, 9:47 AM

Post #18 of 19 (3806 views)
Re: [BillKSmith] [HELP] find results in list

But that outputs all records and only strips out the one mfe line when the min/max condition is not met. My understanding is that the entire record/block needs to be printed if the condition is met, and the entire block skipped otherwise.
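
For comparison, the "print the whole block or skip it" behaviour can also be had while still reading line by line, by buffering each result until its last line is seen. This is a minimal sketch, not anyone's posted solution: the helper name filter_by_lines is hypothetical, the start and end markers are assumed to be lines beginning with "miRNA:" and "Rybrid vs miRanda:", and $low/$high are used in their conventional sense.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the lines once, buffering each result from its
# "miRNA:" line to its "Rybrid vs miRanda:" line, and keep the buffer only
# when the mfe value seen inside it lies within [$low, $high].
sub filter_by_lines {
    my ( $lines, $low, $high ) = @_;
    my ( @out, @buffer, $in_range );
    my $inside = 0;
    for my $line (@$lines) {
        if ( $line =~ /^miRNA:/ ) {    # start of a new result
            @buffer   = ();
            $in_range = 0;
            $inside   = 1;
        }
        next unless $inside;           # skips the dashed lines between results
        push @buffer, $line;
        if ( $line =~ m{^mfe:\s+(-?\d+(?:\.\d+)?)\s+kcal/mol}i ) {
            $in_range = ( $1 >= $low && $1 <= $high );
        }
        if ( $line =~ /^Rybrid vs miRanda:/ ) {    # end of the result
            push @out, @buffer if $in_range;
            $inside = 0;
        }
    }
    return @out;
}

# Shortened sample data; the separator length (64 dashes) is an assumption.
my $sep   = ( '-' x 64 ) . "\n";
my @lines = (
    $sep, "miRNA: hsa-miR-92b*\n", "mfe: -29.5 kcal/mol\n",
    "miRNA 3' G G C C A 5'\n", "Rybrid vs miRanda: OK\n",
    $sep, "miRNA: hsa-miR-4689\n", "mfe: -34.0 kcal/mol\n",
    "Rybrid vs miRanda: OK\n",
);

print filter_by_lines( \@lines, -30, -28 );
```

The start pattern is anchored on "miRNA:" with the colon so that the alignment line beginning "miRNA 3'" inside a result does not restart the buffer.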


acpguedes
Novice

Oct 18, 2011, 12:42 PM

Post #19 of 19 (3795 views)
Re: [FishMonger] [HELP] find results in list

Thank you, guys...

Now I've got it:


Code
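#!/usr/bin/perl

use strict;
use warnings;

# Note: $min holds the upper bound and $max the lower bound, as in the
# earlier posts.
my $min  = -28;
my $max  = -30;
my $file = 'input.txt';

# Read one record at a time: the input record separator is the dashed line.
$/ = '----------------------------------------------------------------';

# To read from a file instead of the DATA section, open it here and
# replace <DATA> with <$fh>:
#open my $fh, '<', $file or die "can't open '$file': $!";

while ( my $record = <DATA> ) {
    # Capture the value on the "mfe:" line of this record.
    if ( $record =~ m~\nmfe:\s+(\S+)\s+kcal/mol~ ) {
        # Print the whole record only when the value is within the range.
        if ( $1 <= $min and $1 >= $max ) {
            print $record;
        }
    }
}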

__DATA__
----------------------------------------------------------------
miRNA: hsa-miR-92b*

Hybrid:
mfe: -29.5 kcal/mol

position 29
target 5' U A A U C 3'
GCUG ACUGC UUCC UCCC
UGAC UGGCG AGGG AGGG
miRNA 3' G G C C A 5'


Miranda:

Position: 29

Query: 3' guGACGUGGCGCAGGGCAGGGa 5'
||| ||:|| |:|| ||||
Ref: 5' tgCTGAACTGCATTCCTTCCCc 3'

Energy: -24.110001 kCal/Mol


Rybrid vs miRanda: OK
----------------------------------------------------------------
miRNA: hsa-miR-4689

Hybrid:
mfe: -34.0 kcal/mol

position 3
target 5' U G ACCC U U 3'
GU CCCACCAUG UC UCCUCA
CG GGGUGGUAC AG AGGAGU
miRNA 3' C G U 5'


Miranda:

Position: 3

Query: 3' ccggGGGUGGUAC----A-GAGGAGUu 5'
||||||||| | :||||||
Ref: 5' tgtgCCCACCATGACCCTCTTCCTCAt 3'

Energy: -29.650000 kCal/Mol


Rybrid vs miRanda: OK

