Need some help with regex.

 



Ennio
Novice

Oct 28, 2008, 6:13 PM

Post #1 of 2 (3677 views)
Need some help with regex.

I have an array with the following content, a list of URLs.

http://www.mysite.com/
http://www.mysite.com/index.html
http://www.mysite.com/contact.html
http://www.myblog.com/
http://www.myblog.com/default.aspx


I'm creating a script that will look through this array and remove all the self-referencing links, as well as every link that is not local to the base URL I'm looking for.

So if I'm looking for http://www.mysite.com/ (the base URL), I need to remove http://www.mysite.com/, http://www.mysite.com/index.html, http://www.myblog.com/, and http://www.myblog.com/default.aspx from the array.

My problem is with the regular expression that does this. I got it to remove all the non-local links, but I'm not sure how to remove the self-referencing links. Can I get some help completing the regular expression?

Thank you

Here is what I have.


Code
  

my $base = 'http://www.mysite.com/';
for (my $counter = 0; $counter <= $#links; $counter++) {
    if ($links[$counter] =~ m/($base)/) {
        # do something
    } else {
        # do something
    }
}


1arryb
User

Feb 26, 2009, 1:25 PM

Post #2 of 2 (2955 views)
Re: [Ennio] Need some help with regex.

Hi Ennio,

Maybe something like this?

Code
my $base = 'http://www.mysite.com/';
for (my $counter = 0; $counter <= $#links; $counter++) {
    # Shift parentheses to remember the relative url (if any), not the base.
    # \Q...\E makes the dots in $base match literally.
    if ($links[$counter] =~ m|^\Q$base\E(.*)|) {
        # Local link.
        # $1 is the "remembered" string matched above (empty for the base URL itself).
        my $rUrl = $1;
        if ( $rUrl eq '' or $rUrl =~ m/^index|^default/ ) {
            # Self referential link. Throw away.
        } else {
            # keep
        }
    } else {
        # Non-local link. Throw away.
    }
}
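
If you would rather build the filtered list directly, the same test can be written with grep and no index counter. This is only a rough sketch: it assumes "self-referencing" means the base URL itself or an index.* / default.* page directly under it, and the names @kept and $self_re are made up for the example.

```perl
use strict;
use warnings;

my $base  = 'http://www.mysite.com/';
my @links = (
    'http://www.mysite.com/',
    'http://www.mysite.com/index.html',
    'http://www.mysite.com/contact.html',
    'http://www.myblog.com/',
    'http://www.myblog.com/default.aspx',
);

# Assumption: a relative URL that is empty, or is an index.* / default.*
# page, counts as self-referencing.
my $self_re = qr/^(?:(?:index|default)\.\w+)?$/;

my @kept = grep {
    # In list context the match returns the captured relative URL,
    # or an empty list (so $rel stays undef) for non-local links.
    my ($rel) = m|^\Q$base\E(.*)|;
    defined $rel && $rel !~ $self_re;
} @links;

print "$_\n" for @kept;
```

With the five URLs from the original post, only http://www.mysite.com/contact.html is left in @kept. If your site uses other default page names, extend the alternation in $self_re.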


Cheers,

Larry

 
 

