CGI/Perl Guide | Learning Center | Forums | Advertise | Login
Site Search: in

  Main Index MAIN
Search Posts SEARCH
Who's Online WHO'S
Log in LOG

Home: Perl Programming Help: Beginner:
Recursion issue


New User

May 3, 2017, 11:59 PM

Post #1 of 6 (937 views)
Recursion issue Can't Post


I'm having some trouble with a recursive tree searching function. My issue is I'm trying to figure out a way to append the previous nodes name once I've reached a leaf node. Here is an example tree:

| | |
b c d
| | | |
e f g g
| | |
h h j

The reason I need to do this is that some nodes in the tree have the same name, but I need a unique identifier for each leaf node. In the above tree My goal is to have the following naming conventions once I reach a leaf node:


With my current code, I have no problem finding the name of the leaf node (in the example it would be nodes b, e ,h, f ,h,d,j). However, this results in a collision since in the example, there are two leaf nodes named "h". In my tree I also have the situations where the branch nodes have the same name. In the above tree, branch "g" appears twice however the leaf node is still unique. Appending the names is proving to be frustrating and not as trivial as I imagined.

Here is how I am trying to approach this (somewhat pseudo code):

my $node_name = "";

&parse_tree($root_node, $node_name);

sub parse_tree {
my ($root_node, $node_name) = $_;

#find sub nodes
my @sub_nodes = $root_node->findnodes;

#basically a check for if sub nodes exists, if they don't then the current node is a leaf node
if(scalar(@sub_nodes) > 0) {

# now go through each sub node
foreach my $sub_node (@sub_nodes) {

#grab the name of node, I don't actually have this written but this will return the name of the current sub node;
$node_name = &get_node_name ($sub_node);

#now do the recursion, here I also export the name of the node, so it can be used later.
&parse_tree($sub_node, $node_name);
#reading a leaf node
} else {

print "name: $node_name name\n";

With the above, the output becomes the names of the leaf nodes (assuming the node names are (a,b,c,d,e,f,g,h) from the above example):

name: b
name: e
name h
name: f
name: h
name: j
name: d

Is there a particular method I need to be using to parse this tree, to get it to output the way I want? I was thinking doing it 'length' wise where I traverse until I reach a leaf node. Once the leaf node is reached go back to the root node and continue parsing. Only issue with this is I would need to keep track of which nodes I have already read. An issue with marking nodes as read is that some nodes are identical (i.e. h, h or g, g) so once I mark one node as "read" it would skip the next one. Another possibility is I need to save the level of the tree I'm in, and always do a comparison with the previous level.

Anyways, writing this recursive function is making me lose my mind, I have spent a long time trying different methods but each method I figure out always has a drawback or case where it will fail. Maybe there is a very easy way of doing this that I am completely missing.


Veteran / Moderator

May 4, 2017, 7:20 AM

Post #2 of 6 (927 views)
Re: [rietchel] Recursion issue [In reply to] Can't Post

It is difficult to figure out exactly what you want because we don't know the data structure that you use and also we don't know about the methods (find_node, get_node_name) that you invoke.

Also the output that you seem to want does not entirely make sense: for example, why do you have a.c.e and a.c.e.h, but not a.c?

One idea would be to build your string during recursion. As an example, this does something close to what you seem to want:

use strict; 
use warnings;
use Data::Dumper;

my $tree = [ 'a', [ [ 'b'], [ 'c', [ ['e', 'h'], 'f', ['g', 'h'], ['g', 'j'] ]], [ 'd']]];

sub traverse_tree {
my ($node, $string) = @_;
if (ref $node eq 'ARRAY') {
for my $subnode (@$node) {
$string .= $subnode if ref $subnode ne 'ARRAY';
traverse_tree($subnode, $string);
} else {
print "$string\n";

# print Dumper $tree;
traverse_tree $tree, "";

And this is the output:

$ perl 


May 5, 2017, 8:59 PM

Post #3 of 6 (911 views)
Re: [rietchel] Recursion issue [In reply to] Can't Post

Recursion can be confusing. It helps to pay as much attention to the design of your data structure as you do to the code. Document its design before you attempt to code. I have included the design spec in the comments of the code. Neither Laurent nor I have reproduced your output exactly because you have not made it clear how to choose the nodes to print (They are all leaves except a.c.e). I print only the 'leaves' of the tree. (Move the print statement out of the if-block if you wish to print every node.)

use strict; 
use warnings;

# Data structure is defined recursively.
# TREE is a reference to a node
# NODE is a two element list
# 0: Value
# 1: Ref to an array (possibly empty) of trees.

my $sample_tree = [
[ [ 'b', [] ],
[ 'c',
[ [ 'e', [ [ 'h', [] ] ] ],
[ 'f', [] ],
[ 'g', [ [ 'h', [] ] ] ],
[ 'g', [ [ 'j', [] ] ] ]
[ 'd', [] ]


sub visit_all_nodes {
# Visit every node in the tree.
# If there are no sub-trees, print the "path" to the current node.
use 5.10.0;
use feature qw(state);
state @path;
my ($value, $ref_to_array_of_sub_trees) = @{shift()};
my $number_of_sub_trees = scalar @$ref_to_array_of_sub_trees;
push @path, $value;

if ( $number_of_sub_trees == 0 ) {
local $" = q(.); print "@{path}\n";
visit_all_nodes($_) foreach @$ref_to_array_of_sub_trees;

pop @path;


The code is formatted with perltidy to make the definition of the tree as clear as possible.

The path is not passed as an argument. The variable @path is a 'state' variable. There is only one copy of it, which is available to every instance of the subroutine, but nowhere else.

The version requirement is to make sure that the state feature is available.

The special variable $" is used to add the periods which you show in your sample output.
Good Luck,

(This post was edited by BillKSmith on May 6, 2017, 3:49 AM)

New User

May 7, 2017, 8:27 PM

Post #4 of 6 (884 views)
Re: [BillKSmith] Recursion issue [In reply to] Can't Post

Thanks for the response Laurent and Bill. Sorry I was not clear in my explanation of the scenario, but I am trying to achieve only paths to leaf nodes, as Bills output shows. My input data structure is XML files.

Anyways I've managed to solve the issue on my end using Bill's state keyword, and using an array to store the name of the current node. My original implementation was similar to Bill's however I was not using an array to store the data.

I did not end up needing the special local variable. Probably best since I still do not quite understand how it works (or the local keyword for that matter).

local $" = q(.); print "@{path}\n";

Again, thanks for the feedback and help!


May 8, 2017, 8:06 AM

Post #5 of 6 (873 views)
Re: [rietchel] Recursion issue [In reply to] Can't Post

The special variable $" contains the character (default <space>) which is placed between elements when an array is interpolated into a string. Your example required '.'.

You can think of 'local' as a way to make a copy of a global variable for use in the current scope (block). The rest of the program uses the default. For a more accurate description refer local.

This advice is probably to late for the current version of your project, but you almost certainly will write more versions. Parsing XML is really a job for a module. DIY can fail in ways you cannot even imagine. This issue comes up often. Search the forums for help in choosing which module to use.
Good Luck,

New User

May 8, 2017, 6:46 PM

Post #6 of 6 (851 views)
Re: [BillKSmith] Recursion issue [In reply to] Can't Post

Ya I am using XML::LibXML for the main parsing of elements and nodes, however I needed to parse multiple XML files which are in a tree structure themselves. Possibly there is already something built in for this, but its what I could do in as little time as possible.

Thanks again.


Search for (options) Powered by Gossamer Forum v.1.2.0

Web Applications & Managed Hosting Powered by Gossamer Threads
Visit our Mailing List Archives