
rkellerjr
Novice
Aug 28, 2013, 9:28 AM
Post #20 of 34
Re: [FishMonger] LWP Browser->Get Challenge
OK, doing it your way, it never stops. It never gives me a failure, so it just continues and continues. This is why I changed the code so I could read the content and verify whether the file had data. I have 12 pages that actually contain data, and I had to kill the program at page 21.

When called, their server serves up an XML file regardless of whether it contains actual data. So when I make the call to download a page of data, it always returns an XML page with header information, but blank below that if there is no data. The result is a legitimate XML file that has no data. So I created my routine to read the file after downloading, and if it doesn't contain actual data (I search for a certain data element), I'm done and move on to the next batch.

I had forgotten why I did that, so please, moving forward, don't assume I'm writing bad code. Let's tackle my problem of not getting all the data within a downloaded XML file, so I'm not wasting my company's time rewriting code I really don't need to rewrite. Mucho appreciated :)

The code below is what I changed per your request. Output was the same as it always has been: partially downloaded XML files.

$response = $browser->get($url,':content_file' => $file,);
if ($response->is_success) {
    print "Completed $element page \($page\) file \($filepage\) \($more\) ..\n";
    &get_page_data ($element, $output, $pricecode, $page);
}
else {
    #$test = $response->code;
    $test2 = $response->headers_as_string;
    #$test2 = $response->content_length;
    die "$test2\n";
}

Now, having said all that, I have included the above logic where it makes sense within the confines of what I need to achieve my goals. Here is the snippet of code after removing the above and adding the success check, so that the code is "more correct" and in line with what you'd like to see.

sub get_page_data {
    my ($element, $output, $pricecode) = @_;
    my ($response, $url, $page, $more);

    print "Downloading $element Info ...\n";
    `mkdir $output` unless (-d "$output");

    $more = "yes";
    $page = 0;
    while ($more) {
        $page++;
        $url = "https://[Server and path]/$element/HAY/?page=$page";
        $filepage = "0" x (3 - length($page)) . $page;
        $response = $browser->get($url,':content_file' => $tempxml,);
        if ($response->is_success) {
            if ($more = &check_xml) {
                $file = "$output\\$element" . "_" . $filepage . ".xml";
                $response = $browser->get($url,':content_file' => $file,); # Or I could change this to a copy statement which I might later.
            }
        }
        else {
            #$test = $response->code;
            $test2 = $response->headers_as_string;
            #$test2 = $response->content_length;
            die "$test2\n";
        }
        unlink ("temp.xml");
        print "Completed $element page \($page\) file \($filepage\) \($more\) ...\n";
    }
    print "\n";
}
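
For reference, here is a minimal sketch of the stop condition described above: keep requesting pages until one comes back without the data element. The names has_data_element and collect_pages, the Item tag, and the $fetch code ref are all hypothetical and only for illustration. $fetch stands in for the real $browser->get call so the loop logic can be tried without touching the server.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the XML text contains at least one
# opening <$tag> element (the "certain data element" being searched for).
sub has_data_element {
    my ($xml, $tag) = @_;
    return $xml =~ /<\Q$tag\E[\s>\/]/ ? 1 : 0;
}

# Walk pages 1, 2, 3, ... and stop at the first page with no data element.
# $fetch is a code ref (an assumption standing in for the download) that
# takes a page number and returns the XML text, or undef on failure.
# $max_pages is a safety cap so the loop can never run forever.
sub collect_pages {
    my ($fetch, $tag, $max_pages) = @_;
    my @pages;
    for my $page (1 .. $max_pages) {
        my $xml = $fetch->($page);
        die "Download failed on page $page\n" unless defined $xml;
        last unless has_data_element($xml, $tag);
        push @pages, sprintf('%03d', $page);    # zero-padded like $filepage
    }
    return @pages;
}

# Fake server: pages 1 and 2 hold data, page 3 is header-only XML.
my %fake = (
    1 => '<Result><Item>a</Item></Result>',
    2 => '<Result><Item>b</Item></Result>',
    3 => '<Result></Result>',
);
my @saved = collect_pages(sub { $fake{ $_[0] } }, 'Item', 50);
print "Pages with data: @saved\n";
```

In real use the code ref could wrap $browser->get($url) and return $response->decoded_content on success, which is an assumption about how the pieces would be wired together. Checking the content in memory this way would mean each page is requested once instead of twice.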