What does "qr/\d+.+\.pdf$/" mean?

 



tot94
Novice

Sep 17, 2015, 11:17 AM

Post #1 of 4 (9696 views)

Hello,

I'm new to Perl! I'm trying to write a script to scrape my website, and I want to get better at URL regular expressions. I found this one:


Code
my @links = $mech->find_all_links( url_regex => qr/\d+.+\.pdf$/ );

What does this expression mean?
I've searched the Internet, but do you know a website where I can learn about URL regular expressions?

Thanks,


FishMonger
Veteran / Moderator

Sep 17, 2015, 11:29 AM

Post #2 of 4 (9693 views)


Code
The regular expression:

(?-imsx:\d+.+\.pdf$)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  \d+                    digits (0-9) (1 or more times (matching
                         the most amount possible))
----------------------------------------------------------------------
  .+                     any character except \n (1 or more times
                         (matching the most amount possible))
----------------------------------------------------------------------
  \.                     '.'
----------------------------------------------------------------------
  pdf                    'pdf'
----------------------------------------------------------------------
  $                      before an optional \n, and the end of the
                         string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
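
To see which part of a URL each node actually consumes, here is a minimal sketch. It assumes capture groups added around \d+ and .+ (they are not in the original pattern), and split_match is a hypothetical helper name used only for this illustration.

```perl
use strict;
use warnings;

# Same pattern as in the thread, with capture groups added (an assumption,
# for illustration only) so we can see what \d+ and .+ each consumed.
my $re = qr/(\d+)(.+)\.pdf$/;

# Hypothetical helper: returns (digits, rest) on a match, or an empty list.
sub split_match {
    my ($url) = @_;
    return $url =~ $re ? ($1, $2) : ();
}

for my $url ('docs/report_2015_final.pdf', '12.pdf', '7.pdf') {
    my @parts = split_match($url);
    if (@parts) {
        printf "%s: \\d+ took '%s', .+ took '%s'\n", $url, @parts;
    }
    else {
        printf "%s: no match\n", $url;
    }
}
```

Note that '12.pdf' only matches because \d+ gives back one digit so that .+ has a character to consume, and '7.pdf' does not match at all, since .+ needs at least one character between the digits and the literal dot.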



tot94
Novice

Sep 17, 2015, 11:47 AM

Post #3 of 4 (9691 views)

Thanks for your answer. So if I understand correctly, the delimiter is "/"?

So the expression is delimited like this: / \d+.+\.pdf$ / .

What is the second "\" for?

What is 'qr' for? Does it stand for "query"?

So this expression could match a file like 2a.pdf?


Laurent_R
Veteran / Moderator

Sep 18, 2015, 2:42 AM

Post #4 of 4 (9683 views)

qr/.../ is the quote-like operator for regular expressions. It basically says: I am not passing an ordinary string but a regex, and it tells the compiler to compile it as a regex.
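
A short sketch of what that means in practice. The variable names and the example.com URL are assumptions made up for the illustration; qr//, =~ and ref() are standard Perl.

```perl
use strict;
use warnings;

# qr// compiles the pattern and returns a regex object that can be stored
# in a variable, passed to a function, or embedded in another pattern.
my $pdf_re = qr/\d+.+\.pdf$/;

# Use it directly with the binding operator...
my $direct = 'files/2a.pdf' =~ $pdf_re ? 1 : 0;

# ...or interpolate it into a larger pattern (the "https?://" prefix is
# only an illustrative assumption).
my $embedded = 'http://example.com/2a.pdf' =~ m{^https?://.*$pdf_re} ? 1 : 0;

# ref() shows that this is a regex object, not a plain string.
print "direct=$direct embedded=$embedded type=", ref($pdf_re), "\n";
```

This is why the pattern can be handed to find_all_links as the value of url_regex: it is an ordinary value once compiled.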

In qr/\d+.+\.pdf$/, the first dot is a metacharacter that matches any character (except a newline), and the + quantifier that follows makes it match one or more times (as many times as possible). The second dot is preceded by a backslash (the escape character), so "\." matches a literal dot ".".
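
The difference the backslash makes can be checked with a small sketch. The variable names and the test strings are made up for the illustration.

```perl
use strict;
use warnings;

my $escaped   = qr/\d+.+\.pdf$/;   # the pattern from the thread: literal dot
my $unescaped = qr/\d+.+.pdf$/;    # backslash removed: any character

for my $name ('2015_notes.pdf', '2015_notes_pdf') {
    printf "%-16s escaped: %-3s unescaped: %s\n",
        $name,
        ($name =~ $escaped   ? 'yes' : 'no'),
        ($name =~ $unescaped ? 'yes' : 'no');
}
```

Without the backslash, a name ending in "_pdf" is accepted as well, because the unescaped dot matches the underscore.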

And yes, this expression would match a file like 2a.pdf, or 2aa.pdf, 2abc.pdf, 3abdcef.pdf, etc.
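
As a rough sketch of how the filter behaves on a list of names, without going through WWW::Mechanize (the list below is invented for the example):

```perl
use strict;
use warnings;

my $re = qr/\d+.+\.pdf$/;

# Invented sample names. The pattern is not anchored at the start, so the
# digits may appear anywhere before the final ".pdf".
my @urls = qw(
    2a.pdf
    3abdcef.pdf
    docs/manual_v2_draft.pdf
    manual.pdf
    2a.pdf.bak
);

# Keep only the names the pattern accepts.
my @hits = grep { $_ =~ $re } @urls;

print "$_\n" for @hits;
```

Here manual.pdf is rejected because it contains no digit, and 2a.pdf.bak is rejected because it does not end in ".pdf".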

 
 

