CGI/Perl Guide | Learning Center | Forums | Advertise | Login
Site Search: in

  Main Index MAIN
INDEX
Search Posts SEARCH
POSTS
Who's Online WHO'S
ONLINE
Log in LOG
IN

Home: Perl Programming Help: Beginner: supertree construction in perl: Edit Log



zing
Novice

Oct 7, 2012, 1:19 PM


Views: 1359
supertree construction in perl

Hi all ,
I had this problem of creating a supertree by recursive Alfred Aho's algorithm.
So I divided the problem into its the key concepts and dealt with them one by one.
I'll first put up my code I built with help of people on this forum and then later on I'll explain my problem.So here's what I've build till now

Code
#This program read the triplets from file named "data" and returns the 
#supertree.
# ___DATA(triplets)____
#b c a
#a c d
#d e b
#### NOTE ::: SuperTree part hasnt been incorporated yet.
use strict;
use warnings;
use Data::Dumper;
use Graph;
use Data::Dump qw/ pp /;
####READ IN THE INPUT DATA ########
my @triplets; # Get all the triplets
while (<>) {
push @triplets, [ split ];
}

#Make a deep copy of @triplets
my @triplet_deep_copy = map { [@$_] } @triplets;


#####AUXILIARY GRAPH G(L) #######
# In order to generate the G(L) first of all extract first two columns of @triplets #into another matrix
my @auxiliary_edges=@triplets;
splice(@$_, 2, 1)
foreach @auxiliary_edges;
print "----EDGE LIST TO BUILD AUXILIARY GRAPH-----\n";
print Dumper \@auxiliary_edges;


##### CONNECTED COMPONENTS ##########
my $auxiliary_graph = Graph->new( undirected => 1 );

my @from;
my @to;
for (my $p = 0; $p <= 2; $p++) {
$from[$p]=$triplets[$p][0];
}

for (my $q = 0; $q <= 2; $q++) {
$to[$q]=$triplets[$q][1];
}

for (my $r = 0; $r <= 2; $r++) {
$auxiliary_graph->add_edge($from[$r], $to[$r]);
}

my @subgraphs = $auxiliary_graph->connected_components; # Subgraphs
my $V = $auxiliary_graph->vertices; # Number of taxa
my $connected_components=scalar @subgraphs; #Get the number of #connected components

###### FINDING THE TRIPLETS WHICH ARE SUBSET(OR INDUCED BY) OF #EACH OF THE CONNECTED COMPONENTS######
Main(@auxiliary_edges);
exit(0);
sub induced {
my $trip = shift;
my @matches;
for my $QT ( @_ ) {
for my $triplet ( @$trip ) {
my %seen; # my %Pie;
undef @seen{@$QT};
delete @seen{@$triplet};
if ( keys( %seen ) <= ( @$QT - @$triplet ) ) {
push @matches, $triplet;
}
} ## end for my $triplet ( @$trip )
} ## end for my $QT ( @_ )
return @matches;
}## end sub induced


sub Main {
my $tree = Graph->new( undirected => 1 );
my $dad='u';
$tree->add_vertex($dad);
for my $one (@subgraphs) {
my @matches = induced( \@triplet_deep_copy, $one );
print "\nTriplet induced by subgraph ", pp( $one => { MATCHES =>\@matches } ), "\n\n";
}
}


So this is what I have written till now. Now let me explain my problem.


Code
___INPUT(set of triplets)____ 
b c a
a c d
d e b

Set of species/vertices=a,b,c,d,e
Now build the auxiliary graph by selecting first two vertices of each of the triplets,i.e.

Code
b c 
a c
d e

The auxiliary graph thus generated will be

Code
 a-c-b  d-e

The number of connected components in this auxiliary graph (q)=2 (viz. a-c-b and d-e)
The algorithm I need to implement is this:-

Code
TreeConstruct(S) 
1. Let L be the set of species appear in S. Build G(L)
2. Let C1 , C2 , . . . , Cq be the set of connected components in G(L)
3. If q >1, then
For i = 1, 2, . . . , q, let Si be the set of triplets in S labeled by the set
of leaves in Ci .
Let Ti = TreeConstruct(Si )
Let T be a tree formed by connecting all Ti with the same parent node.
Return T.
4. If q=1 and C1 contains exactly one leaf, return the leaf; else return fail.


The progression will be like this:-

Code
1. Initially we have q=2 (a-c-b & d-e). So introduce an internal vertex (u) and make 
these connected components child of u.
u=> a-c-b;
d-e;
2. Select component 1 = a-c-b. Check all lines from INPUT which are a subset of
this component1.First line of INPUT i.e. "b c a" is a subset of component1.
3."b c a" now becomes the INPUT for the program and it is recursed again with
this INPUT(Now for input "b c a" the auxiliary graph will be "b-c" & "a",i.e. two
connected components,thus q=2 ...)

Final output(SUPERTREE) for the given input should be like this

Code
u  => u => d 
=> e
=> u => a
=> u => b
=> c

TRIPLETS(input) and SUPERTREE(output) look like these
http://ars.sciencedirect.com/content/image/1-s2.0-S0166218X10000983-gr1.jpg

The picture link above has the exact triplets for my problem and the exact supertree(output) expected


(This post was edited by zing on Oct 8, 2012, 10:16 AM)


Edit Log:
Post edited by zing (Novice) on Oct 7, 2012, 1:20 PM
Post edited by zing (Novice) on Oct 7, 2012, 1:21 PM
Post edited by zing (Novice) on Oct 7, 2012, 2:02 PM
Post edited by zing (Novice) on Oct 8, 2012, 1:28 AM: Code properly indented
Post edited by zing (Novice) on Oct 8, 2012, 1:29 AM
Post edited by zing (Novice) on Oct 8, 2012, 10:16 AM


Search for (options) Powered by Gossamer Forum v.1.2.0

Web Applications & Managed Hosting Powered by Gossamer Threads
Visit our Mailing List Archives