
REGEX - screen strings, separate values, escaped chars

 



mungedout
Novice

Jul 2, 2005, 2:18 PM

Post #1 of 6 (4448 views)

I have a database with info (1=item) followed by OPTIONAL secondary info (2=size) as a single value:

@clothing = ("Fun-Stuff (MED)", "Skirts_Pants (LARGE)", "Ski Masks", "Blouse's (SMALL)", "Ties", "Jeans (10/12)",);

I need to screen and separate the values. The only "good" data will contain alphanumeric characters, the asterisk, apostrophe, hyphen, underscore, or period (*._-') in the first group, followed by an OPTIONAL single set of parentheses (). The parentheses can only occur at the end and nowhere else, must include both the left and the right parenthesis, and cannot be empty if they DO occur. Between them, only alphanumeric characters are allowed, or a forward slash separating alphanumeric characters (not a "lonely" forward slash).

After screening, I'd also like to put anything that was found in the parentheses into its own value, as in:

$str = "Jeans";
$str2 = "10/12";

if it occurs.

Do I need to break up each value into two separate strings, then join them; or do I need to screen first to find its existence?

I'm lost. I suck. I'm sorry.


rork
User

Jul 5, 2005, 1:56 AM

Post #2 of 6 (4424 views)
Re: [mungedout] REGEX - screen strings, separate values, escaped chars

I think this is the regexp you need:

Code
/^([\w\-'\.\*]*)(\([\da-zA-Z]+(?:\/\[\da-zA-Z]+)?\))?$/;


Explained:

([\w\-'\.\*]*) = The first group; matches alphanumeric characters and the underscore (via \w) plus - ' . and *; it can be grabbed by $1
(\([\da-zA-Z]+(?:\/\[\da-zA-Z]+)?\))? = The group in parentheses; optional, and can be grabbed by $2

More about regular expressions: perlreref

(unfortunately I wasn't able to test it)
--
Don't reinvent the wheel, use it, abuse it or hack it.


mungedout
Novice

Jul 5, 2005, 4:41 AM

Post #3 of 6 (4421 views)
Re: [rork] REGEX - screen strings, separate values, escaped chars

Thank you, Rork! I was beginning to think it was too hard. I had regex'd only to allow specific characters and parentheses, but couldn't figure out how to require specific characters between them if the parentheses were present.

I tested it on an item in the array with parentheses, though, and the regexp you illustrated allowed empty characters between the parentheses. I want to make sure each item in the array is "okay" - without garbage characters and not incomplete - then put each half (if there are two parts) into its own value.

Is it something really small to change?

Your help was/is greatly appreciated.


rork
User

Jul 5, 2005, 10:49 AM

Post #4 of 6 (4417 views)
Re: [mungedout] REGEX - screen strings, separate values, escaped chars

I have now tested it and found some problems:
- I forgot the space before the parentheses
- I had a \ before a left square bracket
- The second capture included the parentheses

So it should be:

Code
/^([\w\-'\.\*]*)\s?(?:\(([\da-zA-Z]+(?:\/[\da-zA-Z]+)?)\))?$/
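For reading along, the corrected pattern can also be spelled out with Perl's /x modifier, which ignores whitespace in the pattern and allows comments. This is only a sketch for illustration: the name $item_re and the sample strings are made up here and are not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical name; the pattern is the corrected one above, just spread out.
my $item_re = qr{
    ^
    ( [\w\-'.*]* )              # capture 1: item name (word chars, hyphen, apostrophe, period, asterisk)
    \s?                         # at most one whitespace character before the size
    (?:                         # optional wrapper, not captured
        \(                      # literal left parenthesis
        (                       # capture 2: the size, without the parentheses
            [\da-zA-Z]+         # one or more letters or digits, so it cannot be empty
            (?: / [\da-zA-Z]+ )?    # optionally one slash followed by more letters or digits
        )
        \)                      # literal right parenthesis
    )?
    $
}x;

# Illustrative inputs only: one with a size, one without, one with empty parentheses.
for my $piece ("Jeans (10/12)", "Ties", "Jeans ()") {
    if ($piece =~ $item_re) {
        printf "%s: item=%s size=%s\n", $piece, $1, defined $2 ? $2 : '(none)';
    }
    else {
        print "$piece: rejected\n";
    }
}
```

Using qr{...} instead of /.../ means the forward slash does not need a backslash, which makes the slash rule a little easier to spot.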



Code
use strict;
use warnings;

my @clothing = ("Fun-Stuff (MED)", "Skirts_Pants (LARGE)", "Ski Masks",
                "Blouse's (SMALL)", "Ties", "Jeans (10/12)", "Blouse's (SMA LL)");

foreach my $piece (@clothing) {
    print $piece . " - ";
    # Match once; $1 and $2 are only meaningful after a successful match.
    if ($piece =~ /^([\w\-'\.\*]*)\s?(?:\(([\da-zA-Z]+(?:\/[\da-zA-Z]+)?)\))?$/) {
        print "OK";
        print " - " . $1 if length $1;     # item name
        print " - " . $2 if defined $2;    # size, only when parentheses were present
    }
    print "<BR>\n";
}


Gives:
Fun-Stuff (MED) - OK - Fun-Stuff - MED
Skirts_Pants (LARGE) - OK - Skirts_Pants - LARGE
Ski Masks -
Blouse's (SMALL) - OK - Blouse's - SMALL
Ties - OK - Ties
Jeans (10/12) - OK - Jeans - 10/12
Blouse's (SMA LL) -

It doesn't return OK with a space between the parentheses.

If you want the space in Ski Masks to match too, you should add \s inside the first square brackets. (Note: this will also add the space before the parentheses to $1, if there is one.)
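A minimal sketch of that variation, assuming you want the stray space trimmed off $1 again afterwards. The helper name split_item is made up for this example; the pattern is the one above with \s added to the first bracketed group.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (item, size) for good data, an empty list otherwise.
# size is undef when the value has no parenthesized part.
sub split_item {
    my ($piece) = @_;
    return unless $piece =~ /^([\w\-'.*\s]*)\s?(?:\(([\da-zA-Z]+(?:\/[\da-zA-Z]+)?)\))?$/;
    my ($item, $size) = ($1, $2);
    # The first group is greedy and now accepts whitespace, so it swallows
    # the space in front of "(". Strip it from the item name.
    $item =~ s/\s+$//;
    return ($item, $size);
}

for my $piece ("Ski Masks", "Jeans (10/12)", "Blouse's (SMA LL)") {
    my @parts = split_item($piece);
    if (@parts) {
        my ($item, $size) = @parts;
        print "$piece - OK - [$item]", (defined $size ? " - [$size]" : ""), "\n";
    }
    else {
        print "$piece - rejected\n";
    }
}
```

Keep in mind that \s matches tabs and newlines as well as the plain space; put a literal space in the brackets instead if only that should be allowed.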
--
Don't reinvent the wheel, use it, abuse it or hack it.


mungedout
Novice

Jul 6, 2005, 5:39 AM

Post #5 of 6 (4410 views)
Re: [rork] REGEX - screen strings, separate values, escaped chars

Hi rork,

Well, I tried to "break" it by entering garbage characters and deliberately forsaking characters, as I always do when testing something, and it DIDN'T break. That's always the acid test, and that's a good sign :)

And, yes, I added the \s character to the first bracketed group. Worked like a champ, as far as I can tell.

Thanks for the reply and, more than that, the explanation. As I attempt to decipher it bit by bit - it makes sense why my ultra-simple regexp was worthless. Mucho Gracias.


KevinR
Veteran


Jul 6, 2005, 10:02 AM

Post #6 of 6 (4405 views)
Re: [mungedout] REGEX - screen strings, separate values, escaped chars

What you have is a perfect candidate for a hash instead of an array:

your array:

@clothing = ("Fun-Stuff (MED)", "Skirts_Pants (LARGE)", "Ski Masks", "Blouse's (SMALL)", "Ties", "Jeans (10/12)",);

as a hash:


Code
my %clothing = (
    'Fun-Stuff'  => 'M',    # must be quoted: => only auto-quotes plain identifiers
    Skirts_Pants => 'L',
    Ski_Masks    => '',
    Blouses      => 'S',
    Ties         => '',
    Jeans        => '10/12',
);
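If the data has to stay in its current "item (size)" form at the source, one possible way to get from the array to a hash like this is to run each value through the regexp from earlier in the thread. This is just a sketch: the names %size_for and @rejected are invented here, the item names and sizes are kept exactly as they appear in the data (not shortened to M/L/S), and one deliberately bad value is added to show the screening.

```perl
use strict;
use warnings;

my @clothing = ("Fun-Stuff (MED)", "Skirts_Pants (LARGE)", "Ski Masks",
                "Blouse's (SMALL)", "Ties", "Jeans (10/12)",
                "Blouse's (SMA LL)");    # last one is bad data on purpose

my %size_for;    # item name => size, '' when no size was given
my @rejected;    # values that failed the screening

for my $piece (@clothing) {
    # Pattern from the thread, with \s allowed in the item name.
    if ($piece =~ /^([\w\-'.*\s]*)\s?(?:\(([\da-zA-Z]+(?:\/[\da-zA-Z]+)?)\))?$/) {
        my ($item, $size) = ($1, $2);
        $item =~ s/\s+$//;                            # drop the space before "("
        $size_for{$item} = defined $size ? $size : '';
    }
    else {
        push @rejected, $piece;
    }
}

print "$_ => '$size_for{$_}'\n" for sort keys %size_for;
print "rejected: $_\n" for @rejected;
```

A hash only holds one value per key, so this assumes each item name appears once; if the same item can come in several sizes, the value would need to be an array reference instead.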



(This post was edited by KevinR on Jul 6, 2005, 10:03 AM)

 
 

