Trying to extract a block of text from a string



srhadden
Novice

Dec 20, 2011, 5:03 PM



Hi all,

I have a string that is like this:

"

line1

line2

BEGIN BLOCK:

stuff

END BLOCK:

line 3

line 4

"

I want to take such a string, and extract everything between and including the BEGIN BLOCK: and END BLOCK: words.

I am using this:

$str =~/(.*)BEGIN BLOCK:.*END BLOCK(.*)/s;

I had to use /s because I want .* to include newlines.

Then I put the pieces back together

$newstr = "$1$2";

Something isn't right, though: I don't always get the text after my block. I always get $1, but not always $2.

Hopefully I explained the basic problem and someone can suggest a better regex.

Thanks


rovf
Veteran

Dec 21, 2011, 3:09 AM


Re: [srhadden] Trying to extract a block of text from a string

Your code extracts everything outside the BEGIN/END block rather than the block itself, but that part works for me (i.e., contrary to your claim, $2 is not empty). I tried it from the bash command line like this:


Code
 $ perl -lwe '"X\nBEGIN BLOCK:\nY\nEND BLOCK:\nZ" =~ /(.*)BEGIN BLOCK:.*END BLOCK(.*)/s; print "$1 - $2"'


This prints:


Code
X
 - :
Z


BTW, you should add a colon after END BLOCK in the regex; without it, the colon ends up at the start of $2.
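
For what it's worth, here is a minimal sketch of both directions (pulling the block out, and keeping everything except the block), assuming the markers are exactly "BEGIN BLOCK:" and "END BLOCK:" and that only the first block matters. The sub names extract_block and remove_block are made up for illustration. It uses the non-greedy .*? so the match stops at the first END BLOCK: after BEGIN BLOCK:, which is a different choice from the greedy .* in the original regex.

```perl
use strict;
use warnings;

# Hypothetical helper names; they are not from the original post.

# Return the first block, markers included, or undef if no complete block exists.
sub extract_block {
    my ($str) = @_;
    # /s lets . match newlines; .*? stops at the first END BLOCK: that follows.
    return $str =~ /(BEGIN BLOCK:.*?END BLOCK:)/s ? $1 : undef;
}

# Return a copy of the string with the first block (markers included) removed.
sub remove_block {
    my ($str) = @_;
    $str =~ s/BEGIN BLOCK:.*?END BLOCK://s;
    return $str;
}

my $str = "X\nBEGIN BLOCK:\nY\nEND BLOCK:\nZ";
print "block:   [", extract_block($str), "]\n";
print "outside: [", remove_block($str), "]\n";
```

With s/// there is no need to glue $1 and $2 back together, and a failed match simply leaves the string unchanged instead of leaving stale values in $1 and $2.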


srhadden
Novice

Dec 21, 2011, 10:29 AM


Re: [rovf] Trying to extract a block of text from a string

Thank you very much, this does seem to work pretty well for me.

I had thought my routine was not stripping the block out properly. But I don't know all the ins and outs of this code base, and later yesterday I figured out that the string was also being sent to some buried web server, which updated the DB totally outside the Perl code. Live and learn.

Thanks again. It seems that I did discover a decent way to do it! Usually I come up with something and I get 10 better suggestions :).