Finding peaks from a series of numbers (histogram)

 



ptahotep
New User

Jul 26, 2013, 11:19 AM

Post #1 of 6 (384 views)

Hi, I have a set of numbers in a file with this format:

Quote
P 1 2 3 4 5 6
1 0 0 0 0 0 0
2 0 0 0 0 0 0
3 0 0 0 0 0 0
4 0 0 1 0 0 0
5 0 0 1 0 0 0
6 0 0 1 0 0 0
7 0 0 1 0 0 0
8 0 0 1 0 0 0
9 0 0 1 0 0 0
10 0 0 1 0 0 0
11 0 0 2 0 0 0
12 0 0 3 0 0 0
13 0 0 3 0 3 0
14 0 0 3 0 3 0
15 0 0 3 0 5 0
16 0 0 3 0 5 0
17 0 0 3 0 5 0
18 0 0 3 0 5 0
19 0 0 3 0 6 0
20 0 0 4 0 7 0
21 0 0 4 0 7 0
22 0 0 4 0 7 0
23 0 0 4 0 7 0
24 0 0 4 0 7 0
25 0 0 4 0 7 0
26 0 0 4 0 7 0
27 0 0 4 0 7 0
28 0 0 4 0 7 0
29 0 0 4 0 9 1
30 0 0 4 2 9 2
31 0 0 4 2 9 2
32 0 0 4 2 9 2
33 0 0 5 4 9 2
34 0 0 5 4 9 3
35 0 0 5 4 9 3
36 0 0 5 4 9 4
37 0 0 5 6 9 4
38 0 0 5 6 9 4


The first column is the index, and the other six columns are six different histograms. I would like to find the peaks in every histogram, i.e., the positions with the most significant values. The expected values come from thousands of other files similar to this one. At the moment I calculate a z-score, (value - mean) / standard_deviation, and use it to locate the peaks. But once I have a peak, I would also like to get its whole range, from the position where the peak starts (sometimes with value = 0) to the position where it finishes (also sometimes with value = 0).
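For reference, the z-score step could be sketched roughly like this. It is only an illustration: it takes the mean and the (population) standard deviation from the list itself, whereas the expected values here come from other files, and `z_scores` is a made-up name.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Return the z-score of every value in a list.
# Assumption: mean and standard deviation are computed from the list itself.
sub z_scores {
    my @values = @_;
    return unless @values;                     # nothing to score
    my $n    = @values;
    my $mean = sum(@values) / $n;
    my $var  = sum(map { ($_ - $mean) ** 2 } @values) / $n;
    my $sd   = sqrt($var);
    # A constant list has sd == 0; report 0 instead of dividing by zero
    return map { $sd == 0 ? 0 : ($_ - $mean) / $sd } @values;
}

my @z = z_scores(2, 4, 4, 4, 5, 5, 7, 9);      # mean 5, standard deviation 2
printf "%.2f ", $_ for @z;
print "\n";
```

Positions whose z-score exceeds some chosen cut-off would then be the candidate peaks.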

Do you know of any code for this problem? Thanks in advance,

Pta.


recruiter
User

Jul 26, 2013, 7:10 PM

Post #2 of 6 (373 views)
Re: [ptahotep] Finding peaks from a series of numbers (histogram)

So, for example, if the index is 18 and the row has the values 0 0 3 0 5 0, do you want to return 3 and 5, or find the positions of those values?


Laurent_R
Enthusiast / Moderator

Jul 27, 2013, 5:31 AM

Post #3 of 6 (364 views)
Re: [ptahotep] Finding peaks from a series of numbers (histogram)

If I understand correctly, all the values in the second column form a histogram that you want to go through to find some properties, and then the same for the third column, etc.

I think you should "transpose" your data into an array of 6 arrays, so that each inner array contains just the list of values of one histogram.

Assuming $line contains one data line from your file at a time, it could just be something like this (not tested):


Code
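chomp $line;
# (skip the "P 1 2 3 ..." header line before reaching this point)
my ($hist_index, @temp_array) = split ' ', $line;
for my $i (1..6) {
    # @temp_array is 0-based, so histogram $i sits at offset $i - 1
    $AoA[$i][$hist_index] = $temp_array[$i - 1];
}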


You end up with an array of six arrays, one per histogram, each containing the values of that histogram.

Please note that arrays usually start at subscript 0, but, given the look of your data, I made them start at subscript 1, as I think it might be easier for you to use them this way ($AoA[0] and element 0 of each inner array simply stay unused).
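To make the idea concrete, here is a self-contained sketch of the whole transposition, also untested against your real files. The sub name `read_histograms` and the in-memory sample are only for illustration, and it assumes the header line always starts with "P".

```perl
use strict;
use warnings;

# Build $AoA[$histogram][$position] from lines shaped like the sample file.
sub read_histograms {
    my ($fh) = @_;
    my @AoA;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*(?:P\b|$)/;      # skip the header and blank lines
        my ($hist_index, @temp_array) = split ' ', $line;
        for my $i (1 .. @temp_array) {
            # @temp_array is 0-based, the histograms are numbered from 1
            $AoA[$i][$hist_index] = $temp_array[$i - 1];
        }
    }
    return @AoA;
}

# A few lines copied from the sample data, read through an in-memory filehandle
my $sample = <<'END';
P 1 2 3 4 5 6
12 0 0 3 0 0 0
13 0 0 3 0 3 0
14 0 0 3 0 3 0
15 0 0 3 0 5 0
END
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @AoA = read_histograms($fh);
close $fh;

print "histogram 5, position 15: $AoA[5][15]\n";   # prints 5
```

With a real file you would open it by name instead of using `\$sample`; the rest stays the same.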


ptahotep
New User

Jul 27, 2013, 9:08 AM

Post #4 of 6 (358 views)
Re: [Laurent_R] Finding peaks from a series of numbers (histogram)

Thanks, the solution works well for gathering the 6 different columns.
Now, suppose I have, for example, this 'column':
0 0 0 1 3 4 6 8 10 12 9 7 7 5 3 2 2 4 5 4 2 1 1 2 3 7 9 12 14 11 9 7 3 2 1 0 0 0

I would like to find these 2 significant peaks:
1 3 4 6 8 10 12 9 7 7 5 3 2
1 2 3 7 9 12 14 11 9 7 3 2 1

In fact, I would like to get the start and end positions of these 2 peaks:
4-16
23-35


BillKSmith
Veteran

Jul 27, 2013, 10:13 AM

Post #5 of 6 (349 views)
Re: [ptahotep] Finding peaks from a series of numbers (histogram)

If you can give us an algorithm for finding peaks, we can help you code it in Perl (or perhaps find a module). If you have an unambiguous definition of a peak, we can probably also help with the algorithm. If you are making judgment calls about what is a peak and where it starts and stops, it is very unlikely that any computer program could do the same thing.
Good Luck,
Bill


Laurent_R
Enthusiast / Moderator

Jul 27, 2013, 12:07 PM

Post #6 of 6 (346 views)
Re: [ptahotep] Finding peaks from a series of numbers (histogram)

What you could do is something equivalent to computing the derivative of the function, as you would if it were a continuous mathematical function.

Basically, the idea is to make a new array in which each element is the difference between the current element and the previous element of your array. When the values in the new array change sign (go from positive to negative or vice versa), you have reached an extremum (maximum or minimum) in your original array. From there, you should be able to find what you are looking for.

I cannot say more, because you have not given enough information about what exactly you need.
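A minimal sketch of that idea, using the column you posted. The sub name `find_extrema` is made up, and one detail is my own choice: flat steps (difference 0) are ignored, so an extremum on a plateau is reported at the last point of the plateau.

```perl
use strict;
use warnings;

# Return the 1-based positions of the local maxima and minima, found by
# watching the sign of the difference between consecutive values.
sub find_extrema {
    my @v = @_;
    my (@maxima, @minima);
    my $prev_sign = 0;                      # sign of the last non-flat step
    for my $i (1 .. $#v) {
        my $sign = $v[$i] <=> $v[$i - 1];   # -1, 0 or 1
        next if $sign == 0;                 # flat step: direction unchanged
        # The turning point is 0-based index $i - 1, i.e. 1-based position $i
        push @maxima, $i if $prev_sign > 0 && $sign < 0;
        push @minima, $i if $prev_sign < 0 && $sign > 0;
        $prev_sign = $sign;
    }
    return (\@maxima, \@minima);
}

my @column = qw(0 0 0 1 3 4 6 8 10 12 9 7 7 5 3 2 2 4 5 4 2 1 1 2 3 7 9 12 14
                11 9 7 3 2 1 0 0 0);
my ($maxima, $minima) = find_extrema(@column);
print "maxima at positions: @$maxima\n";    # 10 19 29
print "minima at positions: @$minima\n";    # 17 23
```

Note that the small bump at position 19 is also reported as a maximum, so you still need some criterion (your z-score, for instance) to decide which maxima are significant.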
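Just to show how much has to be pinned down, here is one guessed set of rules that happens to reproduce 4-16 and 23-35 for the column you posted. Everything in it is an assumption on my part, not something you stated: the name `peak_ranges`, the fixed minimum height of 10 (you would probably use your z-score instead), stopping at zeros, and trimming flat runs at either end of a peak.

```perl
use strict;
use warnings;

# Hypothetical rules (assumptions, not taken from the original question):
#   1. a peak is a local maximum whose value is at least $min_height;
#   2. it widens left and right while the values stay above 0 and do not rise;
#   3. a flat run at either end keeps only the point nearest the maximum.
sub peak_ranges {
    my ($min_height, @v) = @_;
    my @ranges;
    my $prev_sign = 0;                       # sign of the last non-flat step
    for my $i (1 .. $#v) {
        my $sign = $v[$i] <=> $v[$i - 1];
        next if $sign == 0;                  # flat step: direction unchanged
        my $is_max = $prev_sign > 0 && $sign < 0;
        $prev_sign = $sign;
        my $top = $i - 1;                    # 0-based index of the maximum
        next unless $is_max && $v[$top] >= $min_height;

        # Rule 2: widen the range in both directions
        my ($start, $end) = ($top, $top);
        $start-- while $start > 0   && $v[$start - 1] > 0 && $v[$start - 1] <= $v[$start];
        $end++   while $end < $#v   && $v[$end + 1] > 0   && $v[$end + 1] <= $v[$end];
        # Rule 3: trim flat tails
        $start++ while $start < $top && $v[$start + 1] == $v[$start];
        $end--   while $end > $top   && $v[$end - 1] == $v[$end];

        push @ranges, [ $start + 1, $end + 1 ];   # report 1-based positions
    }
    return @ranges;
}

my @column = qw(0 0 0 1 3 4 6 8 10 12 9 7 7 5 3 2 2 4 5 4 2 1 1 2 3 7 9 12 14
                11 9 7 3 2 1 0 0 0);
my @ranges = peak_ranges(10, @column);
print join('-', @$_), "\n" for @ranges;      # prints 4-16 and 23-35
```

Different rules (for example, letting a peak start on a zero, as you say it sometimes does) would give different boundaries, which is exactly why a precise definition is needed.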

 
 

