
gw1500se
New User
Oct 23, 2012, 5:54 AM
Post #1 of 2
(3218 views)
Understanding Parallel::ForkManager
I have a large number of sub-tasks that I want to run in parallel. Some of the tasks are quite short while others run for hours, so I set up a hash that holds the pid (from $pm->start) and a name for each task. When a task is started I set the values in the hash and then add that hash to an array, so in theory the array should contain the information I need for each running task. I also have a 'run_on_finish' callback defined that removes the finished task from the array. The problem is that things do not seem to happen the way I understood the documentation. Within my loop, I thought the child executes everything between 'start and next' and 'finish', while the parent meanwhile continues with everything after 'finish' to the end of the loop. Then the loop runs again, creating a new child, and the parent again executes to the end of the loop. Here is my code:
    foreach my $url (@urls) {
        unless ($seen{$url}++) {
            my $proc=process->new();
            $proc->pid($pm->start and next);
            chomp($url);
            trim($url);
            print "*************Processing $url*******************\n";
            system("$helper_utils/collect_user_metrics.pl", "$url",">$tempoutput/collect_user_metrics.txt");
            system("$helper_utils/get_docroot_info.pl", "$pool", "$url",">$tempoutput/get_docroot_info.txt");
            $pm->finish();
            print("Adding new child to array\n");
            $proc->name($url);
            push(@processed,$proc);
        }
    }

My debug print ('adding new child') is never output but the 'Processing' is. Can someone clear up how this works for me? TIA.
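For comparison, here is a minimal sketch of the control flow as I now read the Parallel::ForkManager documentation: in the parent, start returns the child's pid, so 'and next' jumps straight to the next iteration, while in the child start returns 0 and finish exits the process. That would mean nothing placed after finish runs in either process, and any bookkeeping has to happen in the parent branch right after start. The URL list, the limit of 4 children, and the %running and @finished names are placeholders for illustration, not part of the real script.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

my @urls = ('a.example', 'b.example', 'a.example');  # hypothetical input
my $pm   = Parallel::ForkManager->new(4);            # assumed limit: 4 children at once
my (%seen, %running, @finished);

# Called in the parent each time it reaps a finished child.
$pm->run_on_finish(sub {
    my ($pid, $exit_code, $ident) = @_;
    delete $running{$pid};        # task is no longer running
    push @finished, $ident;       # $ident is the value passed to start()
});

foreach my $url (@urls) {
    next if $seen{$url}++;

    my $pid = $pm->start($url);   # parent gets the child's pid, child gets 0
    if ($pid) {
        $running{$pid} = $url;    # parent-side bookkeeping goes here
        next;                     # parent moves on to the next URL
    }

    # Only the child reaches this point.
    print "Processing $url\n";
    $pm->finish(0);               # child exits here; code below never runs in it
}

$pm->wait_all_children;           # remaining run_on_finish callbacks fire here
```

With this layout the record for each task is created in the parent as soon as the fork succeeds, and removed by the run_on_finish callback, which also runs in the parent.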