
BillKSmith
Veteran
Jul 11, 2013, 8:28 PM
Post #2 of 4
Re: [oxydeepu] Can find overlapping coordinates
The following code demonstrates my method for a single chromosome. It assumes that the exons are already sorted in ascending order of starting position. (We can sort them first if necessary.) If you provide a more realistic sample of your data, I can generalize the code to process it. Note: I assume that orientation should be ignored when forming ranges.

use strict;
use warnings;

my ($range_start, $range_end);
my ($chrom, $orientation);
my @ranges;
while (my $exon = <DATA>) {
    my ($start_position, $end_position);
    chomp $exon;
    ($chrom, $orientation, $start_position, $end_position) = split /\s/, $exon;

    # The first exon opens the first range.
    if (!defined $range_start) {
        ($range_start, $range_end) = ($start_position, $end_position);
        next;
    }

    # The exon starts past the current range: close that range and open a new one.
    if ($start_position > $range_end) {
        push @ranges, [$range_start, $range_end];
        ($range_start, $range_end) = ($start_position, $end_position);
    }
    # The exon overlaps the current range and reaches beyond it: extend the range.
    elsif ($start_position <= $range_end and $end_position > $range_end) {
        $range_end = $end_position;
    }
}

# Close the final range.
push @ranges, [$range_start, $range_end];
print "$chrom $orientation $_->[0] $_->[1]\n" foreach @ranges;
__DATA__
Contig0 + 127874 130761
Contig0 + 129936 129984
Contig0 + 130573 130607
Contig0 + 130630 130761
Contig0 + 130732 130767
Contig0 + 130784 130818
Contig0 + 130832 130866
Contig0 + 130832 130867
Contig0 + 130893 130928
Contig0 + 130970 131004
Contig0 + 130982 131017

OUTPUT:
Contig0 + 127874 130767
Contig0 + 130784 130818
Contig0 + 130832 130867
Contig0 + 130893 130928
Contig0 + 130970 131017

Good Luck,
Bill
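
P.S. In case the exons are not already sorted, or the file holds more than one chromosome, here is a rough sketch of how the same merge could be generalized. It is only a sketch under assumptions: merge_exons is a name I made up for illustration, the Contig1 lines are invented sample data, and orientation is still ignored (and therefore dropped from the output). It groups the exons by chromosome, sorts each group numerically by starting position, and then applies the same single-pass merge.

```perl
use strict;
use warnings;

# Hypothetical helper: merge overlapping exons per chromosome.
# Input lines look like "chrom orientation start end" and need not be sorted.
sub merge_exons {
    my @lines = @_;

    # Group [start, end] pairs by chromosome; orientation is ignored.
    my %exons_by_chrom;
    for my $line (@lines) {
        my ($chrom, $orientation, $start, $end) = split ' ', $line;
        next unless defined $end;    # skip blank or malformed lines
        push @{ $exons_by_chrom{$chrom} }, [$start, $end];
    }

    my @merged;
    for my $chrom (sort keys %exons_by_chrom) {
        # Sort numerically by start so the single-pass merge is valid.
        my @sorted = sort { $a->[0] <=> $b->[0] or $a->[1] <=> $b->[1] }
                     @{ $exons_by_chrom{$chrom} };

        my $first = shift @sorted;
        my ($range_start, $range_end) = @$first;
        for my $exon (@sorted) {
            my ($start, $end) = @$exon;
            if ($start > $range_end) {
                # Gap found: close the current range and open a new one.
                push @merged, [$chrom, $range_start, $range_end];
                ($range_start, $range_end) = ($start, $end);
            }
            elsif ($end > $range_end) {
                # Overlap that reaches further: extend the current range.
                $range_end = $end;
            }
        }
        push @merged, [$chrom, $range_start, $range_end];
    }
    return @merged;
}

# Invented, deliberately unsorted sample covering two chromosomes.
my @input = (
    "Contig1 + 500 600",
    "Contig0 + 130732 130767",
    "Contig0 + 127874 130761",
    "Contig1 - 550 700",
    "Contig0 + 130784 130818",
);
print join(' ', @$_), "\n" for merge_exons(@input);
```

Sorting on the numeric start position (with <=>) matters here, because the default string sort would put 1000 before 200. If strands must be kept apart, one option would be to group on chromosome plus orientation instead of chromosome alone.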