Post deleted by Gally

 



Gally
New User

Aug 7, 2002, 9:32 AM

Post #1 of 5 (2608 views)
Post deleted by Gally

 


NuclearClam
Novice

Aug 8, 2002, 3:21 AM

Post #2 of 5 (2592 views)
Re: [Gally] Can someone please explain what this code means?

i'm new to perl too, but i'll try my best. i don't know what pack does, and i might be wrong about some things :)


Code
      

$out->{$fields[$i]} =~ s/¿([A-F0-9]{2})/pack("C",hex($1))/egix;



i'm pretty sure that $out was created with a class. a class is a form of data more complex than a scalar or array. a class can hold scalars, arrays, any type of data, really, and functions too. you shouldn't need to worry about them for quite a while. the $out->{$fields[$i]} takes the element of the array "fields" whose number is equal to $i (if $i is 1, it gets whatever $fields[1] is, etc.) and uses that element as the key to look up a value inside $out.

the next part is a bit more complicated, so i'll take it piece by piece.


Code
      

$out->{$fields[$i]} =~ s/¿([A-F0-9]{2})/pack("C",hex($1))/egix;





if you haven't learned about the substitute operator already, its basic usage is (more-or-less): s/pattern/replacement/giemosx; where g, i, e, m, o, s, and x are the different modifiers you can put after s///. in this case, the part to be replaced is described by ([A-F0-9]{2}), which is a regex (or regular expression... there's a whole following/religion to these, and a whole lot to learn about them!). i'm not sure what the inverted question mark in front of it represents.

the square brackets, when used in a regex, represent a character class. it matches any one of the characters inside it. using a dash between 0 and 9 is just a shorter way of writing 0123456789; likewise, writing A-F is a shorter way of writing ABCDEF. the {2} tells it to match whatever comes right before it (here, the character class) exactly twice. no more, no less. if it was {0,2}, it would mean at most two. {2,} would mean at least two. {2,5} would mean two to five times.

putting this whole pattern into parentheses (called "capturing parentheses" when used in this way) saves what it matches in a special variable, often called a backreference. it's useful for when you want to rearrange things a little bit but keep what you matched originally in at least a part of it. each additional set of parentheses in the pattern creates an additional backreference, unless they're the kind of parentheses that make it so that no backreference is created, but we won't get into that. so (\w+)(\s+) would put what it matches as "at least one word character", meaning a letter, digit, or underscore (the + means "at least one of the previous"), into $1, and however many whitespace characters \s+ matches into $2.
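To make the character class, the {2} quantifier, and the capturing parentheses concrete, here is a minimal sketch. The sample strings are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical sample text, not from the thread.
my $text = "AB 1f xyz 7";

# [A-F0-9]{2} matches exactly two characters from the class; /i also allows a-f.
# In list context with /g, the match returns everything the parentheses captured.
my @pairs = $text =~ /([A-F0-9]{2})/gi;
print "pairs: @pairs\n";

# (\w+)(\s+): $1 receives the word, $2 the run of whitespace that follows it.
my ($word, $gap);
if ("hello   world" =~ /(\w+)(\s+)/) {
    ($word, $gap) = ($1, $2);
    printf "word='%s', whitespace length=%d\n", $word, length $gap;
}
```

Only "AB" and "1f" qualify as pairs here: "xyz" has no characters in the class, and the lone "7" is one character short of {2}.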

since i don't know what pack does, all i can tell you is that the hex($1) means "treat whatever the first set of parentheses matched as a hexadecimal number and give back its numeric value".

the parameters e, g, i, and x mean (respectively) "evaluate the replacement as a perl expression" (so it runs the function in the replacement and uses what it returns), "change all occurrences of the pattern" (it'll only do it on the first instance it finds if you don't include this), "ignore letter case in the pattern", and "ignore whitespace (and allow comments) in the pattern".
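A small sketch of what e and g change, using a made-up input string. The same replacement text is used twice, once as a plain string and once as code.

```perl
use strict;
use warnings;

my $s = "2 apples and 3 pears";   # hypothetical input

# Without /e the replacement is an ordinary string: $1 is interpolated,
# "*2" stays literal. Without /g only the first match is replaced.
(my $plain = $s) =~ s/(\d+)/$1*2/;

# With /e the replacement is run as Perl code; /g repeats it for every match.
(my $evald = $s) =~ s/(\d+)/$1*2/eg;

print "$plain\n";
print "$evald\n";
```

The first result keeps the text "2*2" as written, while the second contains the computed numbers 4 and 6.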

and about the =~ if you're wondering, it ties the string on its left to the operator on its right (sorry if that's imprecise, pros ;) ). you use it mostly with tr/// (translate), s/// (substitute), and m// ("match" more or less). i'm sure you'll learn about these soon enough. !~ is the opposite of =~. it's true when the string does *not* match the following regex. also, a regex is like a basic definition of what a certain match "success" should look like. there can be, and often will be, multiple matches for one regex.
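As a rough illustration of =~ and !~ with the three operators mentioned above (the input line is invented for the example):

```perl
use strict;
use warnings;

my $line = "error: disk full";   # hypothetical input

# =~ with m// is true when the pattern matches the string.
my $is_error = ($line =~ /^error/) ? 1 : 0;

# !~ is true when the pattern does NOT match the string.
my $not_warning = ($line !~ /^warning/) ? 1 : 0;

# =~ also binds a string to tr/// (or s///), which then modifies a copy here.
(my $shout = $line) =~ tr/a-z/A-Z/;

print "$is_error $not_warning $shout\n";
```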

and one more thing... $1, $2, etc. are used as backreferences mostly when it is *not* in the same pattern, e.g. in the replacement part of s///, or in code after the match. \1, \2, etc. are used when you want to refer to something in capturing parentheses in the same pattern. for instance, /(\w\w)\1/ matches the string "lala" (each \w matches one word character, and the \1 matches exactly what was matched by \w\w this time), but not "lalb".
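The lala/lalb case can be checked directly. This sketch also shows $1 and $2 used outside the pattern, in the replacement part of s///; the "hello world" string is just an illustration.

```perl
use strict;
use warnings;

# \1 inside the pattern must repeat exactly what the first group matched.
my $doubled = qr/(\w\w)\1/;

my %result;
for my $word (qw(lala lalb)) {
    $result{$word} = ($word =~ $doubled) ? "match, \$1 is '$1'" : "no match";
    print "$word: $result{$word}\n";
}

# $1 and $2 in the replacement refer to the groups of the pattern just matched.
(my $swapped = "hello world") =~ s/(\w+) (\w+)/$2 $1/;
print "$swapped\n";
```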

if you have any questions, feel free to ask. i tried not to make things too confusing :)


***edit***

gah i forgot to mention...

$string =~ s/someregex/replacement/;

looks for someregex in $string and replaces what it matched with replacement (which is an ordinary string, or code if you use e, not a regex). i worded things a little poorly in that part :P


(This post was edited by NuclearClam on Aug 8, 2002, 3:33 AM)


Gally
New User

Aug 8, 2002, 7:49 AM

Post #3 of 5 (2583 views)
Re: [NuclearClam] Can someone please explain what this code means?

Thank you so much for your reply, NuclearClam! It helped a lot... I understand exactly what you mean.


davorg
Thaumaturge / Moderator

Aug 9, 2002, 1:34 AM

Post #4 of 5 (2576 views)
Re: [Gally] Can someone please explain what this code means?


In Reply To

Code
$out->{$fields[$i]} =~ s/¿([A-F0-9]{2})/pack("C",hex($1))/egix;



Low level reply:

It takes whatever data is in $out->{$fields[$i]} and applies the given substitution to it. There seems to be an error in the substitution. Are you sure it should be a ¿ character? It's more usual to see a % there.

The substitution does this -

Look for ¿ (or %) followed by two occurrences of a character from the set 0-9 and A-F (these are the characters that make up valid hex numbers - so we're looking for a ¿ followed by a two-digit hex number). The hex number is captured (by the parentheses) into $1.

That whole piece of text is then replaced by what is on the right hand side of the substitution. This takes the hex number (in $1), converts that to a decimal number (using the "hex" function) and then passes the resulting value to "pack" using the template "C". This returns the character with that number as its ASCII code.

The /e option on the substitution means "execute the piece of perl code on the right hand side of the substitution and use what is returned as the replacement value". So, effectively that substitution will find a string like %20 and replace it with a space (as 20 in hex is 32 in decimal and 32 is the ASCII code for the space character).

As for the other options on the substitution operator - /g means do this globally (i.e. replace ALL occurrences in the string). /i means do a case-insensitive match (i.e. A-F also matches a-f). /x means allow whitespace and comments in the regex, and is therefore completely unnecessary in this regex.
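The steps described above can be tried out with a short sketch. It assumes % as the marker (the usual URL-encoding character) rather than the character in the posted code, and the helper name decode_percent is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: replaces each %XX with the character whose code is hex XX.
sub decode_percent {
    my ($str) = @_;
    # hex($1) turns the two captured hex digits into a number;
    # pack("C", ...) turns that number into the single character with that code.
    $str =~ s/%([A-F0-9]{2})/pack("C", hex($1))/egi;
    return $str;
}

my $code    = hex("20");                            # numeric value of hex 20
my $decoded = decode_percent("Hello%20World%21");   # made-up input

print "$code\n";
print "$decoded\n";
```

Here %20 becomes a space and %21 becomes an exclamation mark, so the decoded string reads "Hello World!".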


High level reply:

This expression takes URL-encoded strings and converts them to plain text. It is a bad idea to do this by hand in your code, as the "unescape" function from CGI.pm has been used by many people over many years and is therefore far less likely to contain bugs.

--
Dave Cross, Perl Hacker, Trainer and Writer
http://www.dave.org.uk/
Get more help at Perl Monks


davorg
Thaumaturge / Moderator

Aug 9, 2002, 6:08 AM

Post #5 of 5 (2572 views)
Re: [NuclearClam] Can someone please explain what this code means?


Quote
i'm pretty sure that $out was created with a class.

That's not strictly accurate. All we know about $out is that it contains a reference to a hash. That _might_ be an object, but it usually isn't.
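A minimal sketch of that point, with invented data: the variable names mirror the snippet under discussion, but the keys and values are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical data; only the variable names come from the thread's snippet.
my @fields = ('name', 'city');
my $out    = { name => 'Gally', city => 'Perl%20Town' };   # reference to a plain hash

my $i = 1;

# $fields[$i] is 'city', so this reads the value stored under the key 'city'.
my $value = $out->{ $fields[$i] };

# ref() returns 'HASH' for a plain hash reference; for an object it would
# return the name of the class the reference was blessed into.
my $kind = ref($out);

print "$value ($kind)\n";
```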

