Feb 18, 2018, 12:24 AM
Re: [dilbert] since php-parser attempts failed i need to get a perl-approach
You didn't keep us updated on the progress of your Europa task; when we last heard, you were attempting to implement paging. The code I provided, with a change to the conf, would apply here too. I have since written a comprehensive scraper module with iteration capabilities to supersede Web::Scraper, though it needs a bit of a rework.
I haven't written PHP in years, so I couldn't tell you what's wrong with your code off the top of my head, but PHP is perfectly capable of this task.
I have never used XML::LibXML directly to parse HTML; I typically recommend HTML::TreeBuilder::XPath, because it simplifies executing XPath queries. It inherits from HTML::TreeBuilder, a full-featured HTML parser, which in turn inherits from HTML::Element, a full-featured HTML extractor/modifier. Together they make a very powerful HTML processing package that covers all you'd need and more.
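To give a rough idea of what that looks like in practice, here is a minimal sketch. The sample markup, the "results" id, the "item" class and the @links structure are all made up for illustration; a real scrape would fetch the page over HTTP and use XPath expressions that match the actual site.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical sample markup, standing in for a fetched page.
my $html = <<'HTML';
<html><body>
  <ul id="results">
    <li class="item"><a href="/page/1">First</a></li>
    <li class="item"><a href="/page/2">Second</a></li>
  </ul>
</body></html>
HTML

# Parse the document into a tree (the HTML::TreeBuilder part).
my $tree = HTML::TreeBuilder::XPath->new_from_content($html);

# Run an XPath query (the ::XPath part). Each match is an HTML::Element,
# so as_text and attr are available for extraction.
my @links = map { +{ text => $_->as_text, href => $_->attr('href') } }
    $tree->findnodes('//ul[@id="results"]/li[@class="item"]/a');

print "$_->{text} => $_->{href}\n" for @links;

# Free the tree explicitly once the data has been copied out.
$tree->delete;
```

The extracted values are copied into plain hashes before the tree is deleted, so they remain usable afterwards. For paging, the same query could be run in a loop over each page's content.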
If you are having trouble getting to grips with the basics of web scraping in Perl, I'd be happy to go through them with you. Every scrape is different; if you don't understand the process behind one, you will have difficulty writing another.
(This post was edited by Zhris on Feb 18, 2018, 12:26 AM)