Finding and Replacing URL in a text

 



Saya
Novice

Jan 13, 2004, 5:16 AM

Post #1 of 4 (2777 views)
Finding and Replacing URL in a text

Hi,

I have worked a little with regular expressions and Perl. I have now spent almost a day trying to solve this problem.

The problem: assume we have a string containing the following text: "If you are interested in reading news please visit <a href="http://www.bbc.com">BBC's website</a>. And if you are interested in learning about Perl visit <a href="http://perlguru.com/gforum.cgi?forum=regularexpression">Perl Guru</a>".

Now I want to find <a href="http://www.bbc.com"> and <a href="http://perlguru.com/gforum.cgi?forum=regularexpression"> with a regular expression and replace the <a> tag as follows:

<a href="http://www.bbc.com"> => <a href="/mypage.asp?URL=http://www.bbc.com">

<a href="http://perlguru.com/gforum.cgi?forum=regularexpression"> => <a href="/mypage.asp?URL=http://perlguru.com/gforum.cgi?forum=regularexpression">

Can anyone please help me with this?

Regards, Saya


Recall
Novice

Jan 13, 2004, 10:56 AM

Post #2 of 4 (2774 views)
Re: [Saya] Finding and Replacing URL in a text

See:

http://search.cpan.org/~gaas/HTML-Parser-3.35/lib/HTML/LinkExtor.pm

That should give you a head start.


davorg
Thaumaturge / Moderator

Jan 14, 2004, 3:40 AM

Post #3 of 4 (2771 views)
Re: [Saya] Finding and Replacing URL in a text

Recall is right that you don't want to be parsing HTML using regular expressions. You'll be far better off using a real HTML parser, like HTML::Parser or one of its subclasses.

Not sure that I'd use HTML::LinkExtor for this task tho' as you don't want to extract the links, you want to convert them in place. I'd be far more likely to use HTML::TreeBuilder for this.
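
For illustration, here is a minimal sketch of the in-place conversion with HTML::TreeBuilder. The helper name rewrite_links_tree and the sample text are assumptions made for this example, not code from the thread. Note that TreeBuilder treats the input as a whole document, so the serialized result comes back wrapped in a <body> element, and whitespace and entities may be normalized along the way.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: prefix every absolute http(s) link in $html with $prefix.
sub rewrite_links_tree {
    my ($html, $prefix) = @_;

    # Parse the fragment; TreeBuilder adds implicit <html>, <head>, <body>.
    my $tree = HTML::TreeBuilder->new_from_content($html);

    # Find <a> elements whose href starts with http:// or https://.
    for my $link ($tree->look_down(_tag => 'a', href => qr{^https?://}i)) {
        $link->attr(href => $prefix . $link->attr('href'));
    }

    # Serialize only the <body> subtree; {} asks for all end tags to be emitted.
    my $body = $tree->find_by_tag_name('body');
    my $out  = $body->as_HTML('<>&', undef, {});

    $tree->delete;    # free the tree explicitly
    return $out;
}

my $text = q{Please visit <a href="http://www.bbc.com">BBC's website</a>.};
print rewrite_links_tree($text, '/mypage.asp?URL='), "\n";
```

Links that do not start with http:// or https:// are left alone, so relative links and already rewritten links are not touched. If the target page expects an encoded parameter, the original URL would also need escaping (for example with uri_escape from URI::Escape), which the question did not ask for.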

--
Dave Cross, Perl Hacker, Trainer and Writer
http://www.dave.org.uk/
Get more help at Perl Monks


Saya
Novice

Jan 14, 2004, 4:18 AM

Post #4 of 4 (2767 views)
Re: [davorg] Finding and Replacing URL in a text

Hi davorg and Recall,

As I mentioned, I have a string, not a full HTML document. It is a CMS where the user may enter some data in a form, and before parsing the form data I need to replace the <a> tags that contain external links, e.g. http://mywebsite/test/mypage.asp.

So I still need help with the regular expression.

So far I have the following: m!(^|\s)(http://\S+)!gi;

which works on this string:

my $test = "this is a test to check http://www.nnit.dk link test and a bit more text as well :-) http://www.vahu.dk";

but I need it to work on the following kind of string:

#my $test = "this is a test to check <a href=\"http://www.nnit.dk\">test</a> link test and a bit more text as well :-) <a href=\"http://www.vahu.dk\">vahu</a>";

Any help or direction would be much appreciated :-)
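
The pattern m!(^|\s)(http://\S+)!gi only matches a URL that is preceded by whitespace or the start of the string. Inside a tag the URL is preceded by href=" instead, and \S+ would also run past the closing quote into the rest of the tag. A minimal sketch of a substitution anchored on the href attribute follows. The helper name rewrite_links is an assumption made for this example, and the pattern assumes double-quoted href values as in the strings shown above; it is not a general HTML parser.

```perl
use strict;
use warnings;

# Hypothetical helper: prefix absolute http(s) URLs in double-quoted href attributes.
sub rewrite_links {
    my ($text, $prefix) = @_;

    # $1: the tag up to and including the opening quote of href
    # $2: the URL itself (everything up to the closing quote)
    # $3: the closing quote
    $text =~ s{(<a\b[^>]*?\bhref\s*=\s*")(https?://[^"]+)(")}{$1$prefix$2$3}gi;

    return $text;
}

my $text = q{Please visit <a href="http://www.bbc.com">BBC's website</a> or }
         . q{<a href="http://perlguru.com/gforum.cgi?forum=regularexpression">Perl Guru</a>.};

print rewrite_links($text, '/mypage.asp?URL='), "\n";
```

Because the URL part must start with http:// or https://, running the substitution a second time leaves the already rewritten links unchanged. Single-quoted or unquoted href values would need a broader pattern or a real parser.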

 
 

