I'm having trouble matching over more than one line

 



Jasmine
Administrator

Mar 15, 2001, 5:53 AM


I'm having trouble matching over more than one line. What's wrong?

Either you don't have more than one line in the string you're looking at (probably), or else you aren't using the correct modifier(s) on your pattern (possibly).

There are many ways to get multiline data into a string. If you want it to happen automatically while reading input, you'll want to set $/ (probably to '' for paragraphs or undef for the whole file) to allow you to read more than one line at a time.
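As a rough sketch of what $/ changes, the snippet below reads the same text in line mode, paragraph mode, and whole-file mode and counts the records each one yields. The helper name count_records and the sample text are invented for illustration, and the in-memory filehandle only stands in for a real file.

```perl
use strict;
use warnings;

# Hypothetical helper: count the records <$fh> returns for a given $/.
sub count_records {
    my ($text, $separator) = @_;
    local $/ = $separator;          # restored automatically when the sub returns
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my @records = <$fh>;            # list context: one element per record
    close $fh;
    return scalar @records;
}

my $text = "one\ntwo\n\nthree\nfour\n";

print count_records($text, "\n"),  "\n";   # default line mode: prints 5
print count_records($text, ''),    "\n";   # paragraph mode:    prints 2
print count_records($text, undef), "\n";   # whole-file mode:   prints 1
```

Using local keeps the change to $/ from leaking into other code that still expects line-at-a-time reads.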

Read the perlre manpage to help you decide which of /s and /m (or both) you might want to use: /s allows dot to include newline, and /m allows caret and dollar to match next to a newline, not just at the end of the string. You do need to make sure that you've actually got a multiline string in there.
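To make the difference between the two modifiers concrete, here is a small sketch that runs the same patterns with and without /s and /m against a two-line string. The variable names and the sample string are made up for illustration.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# Without /s, dot refuses to match the newline between the two lines.
my $dot_plain   = $text =~ /first.*second/  ? 1 : 0;
# With /s, dot matches newline too, so the pattern can span both lines.
my $dot_s       = $text =~ /first.*second/s ? 1 : 0;

# Without /m, ^ matches only at the very start of the string.
my $caret_plain = $text =~ /^second/  ? 1 : 0;
# With /m, ^ also matches just after an embedded newline.
my $caret_m     = $text =~ /^second/m ? 1 : 0;

print "dot without /s: $dot_plain, with /s: $dot_s\n";     # 0, then 1
print "^ without /m: $caret_plain, with /m: $caret_m\n";   # 0, then 1
```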

For example, this program detects duplicate words, even when they span line breaks (but not paragraph ones). For this example, we don't need /s because we aren't using dot in a regular expression that we want to cross line boundaries. Neither do we need /m because we aren't wanting caret or dollar to match at any point inside the record next to newlines. But it's imperative that $/ be set to something other than the default, or else we won't actually ever have a multiline record read in.


Code
    $/ = '';            # read in a whole paragraph, not just one line
    while ( <> ) {
        while ( /\b([\w'-]+)(\s+\1)+\b/gi ) {   # word starts alpha
            print "Duplicate $1 at paragraph $.\n";
        }
    }

Here's code that finds lines that begin with "From " (which would be mangled by many mailers):


Code
    $/ = '';            # read in a whole paragraph, not just one line
    while ( <> ) {
        while ( /^From /gm ) {   # /m makes ^ match next to \n
            print "leading from in paragraph $.\n";
        }
    }

Here's code that finds everything between START and END in a file:


Code
    undef $/;           # read in whole file, not just one line or paragraph
    while ( <> ) {
        while ( /START(.*?)END/gs ) {   # /s makes . cross line boundaries; /g moves on to the next match
            print "$1\n";
        }
    }


 
 

