
BillKSmith
Veteran
Apr 17, 2017, 9:17 AM
Post #5 of 6
(5775 views)
Re: [vasumathi] Problem with printing continuous minimum and maximum polypurine stretches using sliding window
Your data revealed that my counting was not correct. The match operator does not do exactly what I thought. With that corrected, I can duplicate your expected output by setting $window = 28 and $step = 1. (It also finds several other matches.) I realize that this is just a shorter version of what you already had last week. I still do not understand your original question. Based on your new comments, I suspect that you should be testing $Gper rather than $AGper.
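As a minimal sketch of the counting idiom (not part of the original script), the snippet below counts bases in one 28-character window two ways. The join-based form relies on the match operator returning every capture in list context; the tr/// form is an alternative swapped in here for comparison. The variable names $by_match and $by_tr are made up for illustration, and the window string is simply the first line of the output shown in this post.

```perl
use strict;
use warnings;

# One 28-base window, copied from the first line of the posted output.
my $window = 'ATAGAGGGAAGAGGAGAGAGGAAGGGAA';

# List-context match: /([AG])/g returns every captured base, so the
# length of the joined captures equals the number of matches.
my $by_match = length join '', $window =~ /([AG])/g;

# Alternative: tr/// with an empty replacement list counts the listed
# characters and leaves the string unchanged.
my $by_tr   = ( $window =~ tr/AG// );
my $countCT = ( $window =~ tr/CT// );

printf "AG by match: %d, AG by tr: %d, CT: %d\n", $by_match, $by_tr, $countCT;
```

Both counts should agree for any window. For short windows like this either form is fine; tr/// just avoids building the intermediate list.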

use strict;
use warnings;
# post=8395l0
# Updated since posting
# $step was not handled correctly
my @data = (
    'ATGCGATAGAAGCGTAGACGATGGAAGGGAAGGAAGGAGGGAGGAAGCTATT',
    'CGTAGATGATTGATAGAGGGAAGAGGAGAGAGGAAGGGAAGGGAAGGGAGGA',
);
my $window = 28;
my $step   = 1;
foreach my $string (@data) {
    my $line = 'x' x $step . $string;    # Simplifies loop
    SUBSTRING:
    while (length($line = substr $line, $step) >= $window) {
        my $nucltde = $_ = substr $line, 0, $window;
        my $countG  = length join '', /(G)/g;
        my $countCT = length join '', /([CT])/g;
        my $countAG = length join '', /([AG])/g;
        my $Gper    = ( $countG / $window ) * 100;
        my $AGper   = ( $countAG / $window ) * 100;
        unless ( $countAG >= 15 and $AGper >= 46 and $countCT <= 1 ) {
            next SUBSTRING;
        }
        printf "%25s: %2d %3d%1s %2d\n",
            $nucltde, $countAG, $AGper, '%', $countCT;
    }
}

OUTPUT:
ATAGAGGGAAGAGGAGAGAGGAAGGGAA: 27  96%  1
AGAGGGAAGAGGAGAGAGGAAGGGAAGG: 28 100%  0
AGGGAAGAGGAGAGAGGAAGGGAAGGGA: 28 100%  0
GGAAGAGGAGAGAGGAAGGGAAGGGAAG: 28 100%  0
AAGAGGAGAGAGGAAGGGAAGGGAAGGG: 28 100%  0
GAGGAGAGAGGAAGGGAAGGGAAGGGAG: 28 100%  0
GGAGAGAGGAAGGGAAGGGAAGGGAGGA: 28 100%  0

Good Luck,
Bill
(This post was edited by BillKSmith on Apr 17, 2017, 9:33 AM)