matching in any order



Mendy
Deleted

Nov 16, 2000, 11:17 AM


Views: 6175
matching in any order

Is there a way to match a list of letters in any order, where all (or most) of the letters must be matched?
If you use /abcdef/, the letters have to appear in that order, and if you use /[abcdef]/, it only matches one letter.


japhy
Enthusiast / Moderator

Nov 16, 2000, 11:34 AM


Re: matching in any order

Ah, Randal Schwartz wrote a program that does something very much like this -- it builds a regex to match letters by a pattern. For example, it matches words that follow the pattern "ABCDCC", such as "obsess".
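To make the pattern idea concrete, here is a rough sketch of how such a regex could be built. This is not Randal Schwartz's actual program; the name pattern_to_regex and the choice to anchor the match to the whole word are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not Randal Schwartz's code): turn a letter pattern
# such as "ABCDCC" into a regex that matches whole words of the same shape.
sub pattern_to_regex {
    my ($pattern) = @_;
    my %group;      # pattern symbol => capture group number
    my $n  = 0;     # number of capture groups opened so far
    my $re = "";
    for my $sym (split //, $pattern) {
        if (exists $group{$sym}) {
            # Repeated symbol: must equal the letter captured earlier.
            $re .= "\\$group{$sym}";
        }
        else {
            # New symbol: must differ from every letter captured so far.
            $re .= join "", map "(?!\\$_)", 1 .. $n;
            $re .= "(.)";
            $group{$sym} = ++$n;
        }
    }
    # Anchoring to the whole string is an assumption of this sketch.
    return qr/\A$re\z/;
}

my $shape = pattern_to_regex("ABCDCC");
for my $word (qw(obsess banana)) {
    print "$word\n" if $word =~ $shape;
}
```

With the two sample words, only "obsess" has the ABCDCC shape, so it is the only one printed.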

We can do something similar in your case. I must warn you, though, the regex will not look pretty.

I ask that you give me a while to work on it -- I have to get to Physics right now, but I'll have a solution for you after 6:00 pm (eastern time).

------------------
Jeff "japhy" Pinyan -- accomplished author, consultant, hacker, and teacher



japhy
Enthusiast / Moderator

Nov 16, 2000, 12:17 PM


Re: matching in any order

Here is a solution:

code:

    # $REx = any_order("abcdef");
    # if ($str =~ $REx) { ... }

    sub any_order {
        my $c  = shift;
        my $re = "";
        for my $p (0 .. length($c) - 1) {
            # one negative look-ahead for each earlier capture group
            $re .= join "", map "(?!\\$_)", 1 .. $p;
            # then capture one letter from the set
            $re .= "([$c])";
        }
        return qr/$re/;
    }

The important part is the line with $re .= join ..., 1 .. $p. It adds one negative look-ahead for each earlier capture group, so for the letters "abc" the regex looks like this:

code:

    m{
        # match a, b, or c, and store in \1
        ([abc])

        # make sure \1 won't match
        # and then match a, b, or c, and store in \2
        (?!\1) ([abc])

        # make sure \1 and \2 won't match
        # and then match a, b, or c, and store in \3
        (?!\1) (?!\2) ([abc])
    }x

So in the string "ccba", the first 'c' is captured in \1, and then Perl tries to match a, b, or c, but NOT what \1 captured. That fails, because the next character is also 'c'. Perl then tries again, starting at the second 'c', and this time it works: \1 captures 'c', \2 captures 'b', and \3 captures 'a'.
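The "ccba" walk-through can be checked with a short script. This is only a sketch: it repeats the any_order sub so that it runs on its own, and the variable names $start and $order are made up for the demonstration.

```perl
use strict;
use warnings;

# Same construction as any_order() in the post, repeated so this runs alone.
sub any_order {
    my $c  = shift;
    my $re = "";
    for my $p (0 .. length($c) - 1) {
        $re .= join "", map "(?!\\$_)", 1 .. $p;
        $re .= "([$c])";
    }
    return qr/$re/;
}

my $re = any_order("abc");

if ("ccba" =~ $re) {
    my $start = $-[0];        # offset where the overall match begins
    my $order = "$1$2$3";     # the three captured letters, in match order
    print "match starts at offset $start, letters: $order\n";
}
```

The match starts at offset 1 (the second 'c') and the captured letters are "cba", which agrees with the explanation above.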




Mendy
Deleted

Nov 17, 2000, 6:08 AM


Re: matching in any order

Thank you very much!
It's a good solution.
I was interested in writing a sort of Scrabble 'AI', and I find that if I reduce the 'length($c)-1' to a smaller value, I can find shorter words that can be made from the letters in the set.
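One way to sketch that idea is to pass the desired word length in as a second argument. The name any_order_n and the $len parameter are hypothetical, not part of the original sub, and anchoring with \A and \z is an added assumption so that only whole words count.

```perl
use strict;
use warnings;

# Hypothetical variant of any_order(): $len (an assumed extra parameter)
# says how many letters from $c the match must contain.
sub any_order_n {
    my ($c, $len) = @_;
    $len = length $c unless defined $len;   # default: use every letter
    my $re = "";
    for my $p (0 .. $len - 1) {
        $re .= join "", map "(?!\\$_)", 1 .. $p;
        $re .= "([$c])";
    }
    return qr/$re/;
}

my $three = any_order_n("abcdef", 3);
for my $word (qw(cab fed add)) {
    # \A and \z force the whole word to match, not just a part of it.
    print "$word\n" if $word =~ /\A$three\z/;
}
```

Here "cab" and "fed" are printed, while "add" is rejected because it uses 'd' twice. Note that this approach treats every letter in the set as available only once, so a rack with repeated letters would need a different construction.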


AndrewG
stranger

Nov 20, 2001, 4:00 PM


Views: 6120
Re: matching in any order

Newbie question. I'm having a problem getting the subroutine to work. I want the script to read in a text file of words to examine.

open(IN,"$wordlist") or die $!;
while (<IN>) {
    # I'm missing something here
}


TIA
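A minimal sketch of how the loop might be filled in, assuming the word list has one word per line. The helper name words_matching is made up, and an in-memory string stands in for the file so that the example runs on its own; the pieces most likely to be missing are the chomp and the anchors around the regex.

```perl
use strict;
use warnings;

# Same construction as any_order() in the earlier post.
sub any_order {
    my $c  = shift;
    my $re = "";
    for my $p (0 .. length($c) - 1) {
        $re .= join "", map "(?!\\$_)", 1 .. $p;
        $re .= "([$c])";
    }
    return qr/$re/;
}

# Hypothetical helper: return the lines read from $fh that consist of
# the letters in $letters, each used exactly once, in any order.
sub words_matching {
    my ($fh, $letters) = @_;
    my $re = any_order($letters);
    my @found;
    while (my $word = <$fh>) {
        chomp $word;    # drop the trailing newline before matching
        push @found, $word if $word =~ /\A$re\z/;
    }
    return @found;
}

# An in-memory file stands in for the word list here. With a real file:
#   open(my $in, '<', $wordlist) or die "$wordlist: $!";
my $sample = "cab\nbac\ncabs\nabb\n";
open(my $in, '<', \$sample) or die "cannot open sample: $!";
print "$_\n" for words_matching($in, "abc");
close $in;
```

With the sample data this prints "cab" and "bac". Without the \A and \z anchors, "cabs" would also match, because the regex only needs to find the letters somewhere inside the line.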