[HELP] regexp that covers the whole text



acpguedes
Novice

Oct 16, 2011, 10:20 AM


Views: 15347
[HELP] regexp that covers the whole text

I have a problem with this script:


Code
#!/usr/bin/perl  


use strict;

use warnings;

print "Write input arquive name\n\t";
my $input = <STDIN>;
print "Write max energy. Like: -28\n\t";
my $min = <STDIN>;
print "Write min energy. Like: -30\n\t";
my $max = <STDIN>;
print "Write output arquive name\n\t";
my $output = <STDIN>;


open IN, $input or die "cannot open input";
open OUT,">". $output or die usage ();


while(<IN>){
if($_ =~ m/(mfe:\s+(-?\d.*\.?\d.*?)\s+kcal\/mol)/gi){
if($2 <= $min && $2 >= $max){
print OUT " $1.$/ ";

}
}
}


I need a regex that covers the whole block of text below. The block is repeated several times, and I want to capture an entire block depending on its "mfe:" value.



Code
----------------------------------------------------------------  
miRNA: hsa-miR-92b* #variable is here! Start here

Hybrid:
mfe: -29.5 kcal/mol #variable is here! this is the important value

position 29 #variable is here!
target 5' U A A U C 3' #variable is here!
GCUG ACUGC UUCC UCCC #variable is here!
UGAC UGGCG AGGG AGGG #variable is here!
miRNA 3' G G C C A 5' #variable is here!


Miranda:

Position: 29 #variable is here!

Query: 3' guGACGUGGCGCAGGGCAGGGa 5' #variable is here!
||| ||:|| |:|| |||| #variable is here!
Ref: 5' tgCTGAACTGCATTCCTTCCCc 3' #variable is here!

Energy: -24.110001 kCal/Mol #variable is here!


Rybrid vs miRanda: OK #Finish here
----------------------------------------------------------------
miRNA: hsa-miR-4689 #variable is here!

Hybrid:
mfe: -34.0 kcal/mol #variable is here!

position 3 #variable is here!
target 5' U G ACCC U U 3' #variable is here!
GU CCCACCAUG UC UCCUCA #variable is here!
CG GGGUGGUAC AG AGGAGU #variable is here!
miRNA 3' C G U 5' #variable is here!

Miranda:

Position: 3 #variable is here!

Query: 3' ccggGGGUGGUAC----A-GAGGAGUu 5' #variable is here!
||||||||| | :|||||| #variable is here!
Ref: 5' tgtgCCCACCATGACCCTCTTCCTCAt 3' #variable is here!

Energy: -29.650000 kCal/Mol #variable is here!


Rybrid vs miRanda: OK


I wrote a regexp that finds the value of "mfe:", but I want the output file to contain the whole block of text.


Code
/(mfe:\s+(-?\d.*\.?\d.*?)\s+kcal\/mol)/


--------------------------------------
I'm looking for a single regexp that represents the entire block of text above, so that it says: "the text begins here, then come some lines of text, the value in this part must be within the requested range, and the text ends here".
Something like:
/(miRNA:)....(mfe:\s+(-?\d.*\.?\d.*?)\s+kcal\/mol)......((Rybrid)\s+(vs)\s+(miRanda)\s+(OK))/
^ this is just an example
--------------------------------------

I'm waiting for some help.

Thanks


(This post was edited by acpguedes on Oct 16, 2011, 10:41 AM)


wickedxter
User

Oct 18, 2011, 3:56 AM


Views: 15137
Re: [acpguedes] [HELP] regexp that covers the whole text

As I see it, you are going through the file one line at a time, so you'll have to set up multiple regexes for the different matches you're looking for.

You can just use if/elsif to match each line you're looking for and save the information you want to keep.
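
A minimal sketch of that line-by-line approach, assuming the results are delimited by a line of dashes as in the sample in the first post. The sub name `filter_by_mfe` and its parameters are hypothetical, the sample data is shortened and made up, and an in-memory filehandle stands in for the real input file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: reads $in_fh line by line and returns the records
# whose mfe value lies between $lower and $upper (inclusive).
sub filter_by_mfe {
    my ( $in_fh, $lower, $upper ) = @_;
    my ( @kept, @buffer, $mfe );

    # Close the current record: keep it only if its mfe is in range.
    my $flush = sub {
        if ( defined $mfe and $mfe >= $lower and $mfe <= $upper ) {
            push @kept, join '', @buffer;
        }
        @buffer = ();
        undef $mfe;
    };

    while ( my $line = <$in_fh> ) {
        if ( $line =~ /^-{10,}\s*$/ ) {    # separator line: the record ends
            $flush->();
        }
        elsif ( $line =~ m{^\s*mfe:\s+(-?\d+(?:\.\d+)?)\s+kcal/mol} ) {
            $mfe = $1;                     # remember the energy value
            push @buffer, $line;
        }
        else {
            push @buffer, $line;           # any other line of the record
        }
    }
    $flush->();    # the last record has no trailing separator
    return @kept;
}

# Usage with a shortened, made-up sample in place of the real file.
my $sample = <<'END';
----------------------------------------------------------------
miRNA: hsa-miR-92b*
mfe: -29.5 kcal/mol
Rybrid vs miRanda: OK
----------------------------------------------------------------
miRNA: hsa-miR-4689
mfe: -34.0 kcal/mol
Rybrid vs miRanda: OK
END

open my $in_fh, '<', \$sample or die "cannot open in-memory sample: $!";
my @records = filter_by_mfe( $in_fh, -30, -28 );
print @records;
```

The lines of a record are buffered until the separator is seen, because the decision to keep a record can only be made once its "mfe:" line has been read.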


(This post was edited by wickedxter on Oct 18, 2011, 3:57 AM)


acpguedes
Novice

Oct 18, 2011, 6:15 AM


Views: 15132
Re: [wickedxter] [HELP] regexp that covers the whole text

What I want is this:

I have a list of results, and the results are separated by "----------------------------------------------------------------".

A sample list is below:


Code
---------------------------------------------------------------- 
miRNA: hsa-miR-92b*

Hybrid:
mfe: -29.5 kcal/mol

position 29
target 5' U A A U C 3'
GCUG ACUGC UUCC UCCC
UGAC UGGCG AGGG AGGG
miRNA 3' G G C C A 5'


Miranda:

Position: 29

Query: 3' guGACGUGGCGCAGGGCAGGGa 5'
||| ||:|| |:|| ||||
Ref: 5' tgCTGAACTGCATTCCTTCCCc 3'

Energy: -24.110001 kCal/Mol


Rybrid vs miRanda: OK
----------------------------------------------------------------
miRNA: hsa-miR-4689

Hybrid:
mfe: -34.0 kcal/mol

position 3
target 5' U G ACCC U U 3'
GU CCCACCAUG UC UCCUCA
CG GGGUGGUAC AG AGGAGU
miRNA 3' C G U 5'


Miranda:

Position: 3

Query: 3' ccggGGGUGGUAC----A-GAGGAGUu 5'
||||||||| | :||||||
Ref: 5' tgtgCCCACCATGACCCTCTTCCTCAt 3'

Energy: -29.650000 kCal/Mol


Rybrid vs miRanda: OK


This list, for example, has 2 results.

So, I want a script that writes the results to an output file one by one,
but each result should be written only if its "mfe: ... kcal/mol" value is within a range.

For example, if I set the range between -28 and -30, it will write only the matching result to the output file, which looks like this:


Code
miRNA:  hsa-miR-92b* 

Hybrid:
mfe: -29.5 kcal/mol

position 29
target 5' U A A U C 3'
GCUG ACUGC UUCC UCCC
UGAC UGGCG AGGG AGGG
miRNA 3' G G C C A 5'


Miranda:

Position: 29

Query: 3' guGACGUGGCGCAGGGCAGGGa 5'
||| ||:|| |:|| ||||
Ref: 5' tgCTGAACTGCATTCCTTCCCc 3'

Energy: -24.110001 kCal/Mol


Rybrid vs miRanda: OK


* Note that this is the only result in the list whose "mfe:" is between -28 and -30.

I'm waiting for some help.

Thanks


FishMonger
Veteran / Moderator

Oct 18, 2011, 7:59 AM


Views: 15125
Re: [acpguedes] [HELP] regexp that covers the whole text

Is this what you're looking for?

Code
#!/usr/bin/perl 

use strict;
use warnings;

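# Following the original script, $min holds the upper bound and $max the lower bound.
my $min = -28;
my $max = -30;
my $file = 'input.txt';

# Read one whole record per <DATA>, using the dashed line as the record separator.
$/ = '----------------------------------------------------------------';

#open my $fh, '<', $file or die "can't open '$file' $!";

while ( my $record = <DATA> ) {
    if ( $record =~ m~\nmfe:\s+(\S+)\s+kcal/mol~ ) {
        if ( $1 <= $min and $1 >= $max ) {
            print $record;
        }
    }
}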

__DATA__
----------------------------------------------------------------
miRNA: hsa-miR-92b*

Hybrid:
mfe: -29.5 kcal/mol

position 29
target 5' U A A U C 3'
GCUG ACUGC UUCC UCCC
UGAC UGGCG AGGG AGGG
miRNA 3' G G C C A 5'


Miranda:

Position: 29

Query: 3' guGACGUGGCGCAGGGCAGGGa 5'
||| ||:|| |:|| ||||
Ref: 5' tgCTGAACTGCATTCCTTCCCc 3'

Energy: -24.110001 kCal/Mol


Rybrid vs miRanda: OK
----------------------------------------------------------------
miRNA: hsa-miR-4689

Hybrid:
mfe: -34.0 kcal/mol

position 3
target 5' U G ACCC U U 3'
GU CCCACCAUG UC UCCUCA
CG GGGUGGUAC AG AGGAGU
miRNA 3' C G U 5'


Miranda:

Position: 3

Query: 3' ccggGGGUGGUAC----A-GAGGAGUu 5'
||||||||| | :||||||
Ref: 5' tgtgCCCACCATGACCCTCTTCCTCAt 3'

Energy: -29.650000 kCal/Mol


Rybrid vs miRanda: OK



acpguedes
Novice

Oct 18, 2011, 8:21 AM


Views: 15116
Re: [FishMonger] [HELP] regexp that covers the whole text

Wow, man, you helped me a lot.
I'm not very good at Perl, so I need a bit more help.
In this script, I would like to specify the path of an input file that contains the list of results.

Something like:
my $input = <STDIN>;


FishMonger
Veteran / Moderator

Oct 18, 2011, 8:32 AM


Views: 15113
Re: [acpguedes] [HELP] regexp that covers the whole text

No problem, just use your original method when assigning the input vars, but remember to 'chomp' those vars before using them.


Code
print "Write input arquive name\n\t";  
chomp(my $input = <STDIN>);

print "Write max energy. Like: -28\n\t";
chomp(my $min = <STDIN>);

print "Write min energy. Like: -30\n\t";
chomp(my $max = <STDIN>);

print "Write output arquive name\n\t";
chomp(my $output = <STDIN>);
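
For illustration, here is a sketch of how the chomped values might be combined with the record-separator approach from the earlier reply. The sub name `write_matching_records` and the `$SEPARATOR` variable are hypothetical, the separator length is an assumption to adjust to the real file, and in-memory filehandles with made-up data stand in for the files whose names would be read from STDIN. The `$typed` line mimics what `<STDIN>` returns, to show that `chomp` removes the trailing newline from the file name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed separator: a line of dashes; adjust the length to match the real file.
my $SEPARATOR = '-' x 64;

# Hypothetical helper: copies every record whose mfe lies between
# $lower and $upper (inclusive) from $in_fh to $out_fh; returns the count.
sub write_matching_records {
    my ( $in_fh, $out_fh, $lower, $upper ) = @_;
    local $/ = $SEPARATOR;    # read one whole record per <$in_fh>
    my $count = 0;
    while ( my $record = <$in_fh> ) {
        chomp $record;        # with $/ set, chomp strips the separator
        next unless $record =~ m{\n\s*mfe:\s+(\S+)\s+kcal/mol};
        next unless $1 >= $lower and $1 <= $upper;
        print {$out_fh} $record;
        $count++;
    }
    return $count;
}

# What <STDIN> would return: the typed text plus a trailing newline.
my $typed = "input.txt\n";
chomp( my $input = $typed );    # the newline is removed from $input

# In-memory stand-ins for the input and output files (made-up, shortened data).
my $data = "$SEPARATOR\nmiRNA: a\nmfe: -29.5 kcal/mol\n"
         . "$SEPARATOR\nmiRNA: b\nmfe: -34.0 kcal/mol\n";
my $result = '';
open my $in_fh,  '<', \$data   or die "cannot open in-memory input: $!";
open my $out_fh, '>', \$result or die "cannot open in-memory output: $!";

my $written = write_matching_records( $in_fh, $out_fh, -30, -28 );
close $out_fh;

print "input file name: '$input'\n";
print "$written record(s) written:$result";
```

Using `local $/` inside the sub keeps the changed record separator from affecting other reads, such as the line-oriented `<STDIN>` prompts.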



acpguedes
Novice

Oct 18, 2011, 12:40 PM


Views: 15102
Re: [FishMonger] [HELP] regexp that covers the whole text

Thank you.
Now it works.