A file parsing and 2D array/matrix problem.

 



rushadrena
Novice

Aug 25, 2012, 8:32 AM

Post #1 of 62 (8319 views)
A file parsing and 2D array/matrix problem. Can't Post

 I am stuck with this complicated problem. I have a list

** LIST**
[code]
substrate[s]: 3649
product[s]: 3419 3648
substrate[s]: 3645
product[s]: 3647
substrate[s]: 3659
product[s]: 3647
substrate[s]: 3675
product[s]: 3674
substrate[s]: 3674
product[s]: 3490 3489
substrate[s]: 3489
product[s]: 3490
substrate[s]: 3490
product[s]: 3485
substrate[s]: 3485
product[s]: 3486
substrate[s]: 3486
product[s]: 3488
substrate[s]: 3488
product[s]: 3487
substrate[s]: 3487
product[s]: 3877
substrate[s]: 3877
product[s]: 3419
substrate[s]: 3182
product[s]: 1875
substrate[s]: 2809
product[s]: 3182
substrate[s]: 3186
product[s]: 2809 [/code]


I also have two superlists, one of substrates and one of products:

**SUPERLIST_SUBSTRATE**
[code]
substrate[s]: 3649
substrate[s]: 3645
substrate[s]: 3659
substrate[s]: 3675
substrate[s]: 3674
substrate[s]: 3489
substrate[s]: 3490
substrate[s]: 3485
substrate[s]: 3486
substrate[s]: 3488
substrate[s]: 3487
substrate[s]: 3877
substrate[s]: 3182
substrate[s]: 2809
substrate[s]: 3186
substrate[s]: 3675
substrate[s]: 3492
substrate[s]: 3314
substrate[s]: 3006
substrate[s]: 3049[/code]


**SUPERLIST_PRODUCT**
[code]
product[s]: 3419
product[s]: 3648
product[s]: 3489
product[s]: 3647
product[s]: 3647
product[s]: 3674
product[s]: 3490
product[s]: 3490
product[s]: 3485
product[s]: 3486
product[s]: 3488
product[s]: 3487
product[s]: 3877
product[s]: 3419
product[s]: 1875
product[s]: 3182
product[s]: 2809
product[s]: 3492
product[s]: 3186
product[s]: 3492
product[s]: 1825
product[s]: 2543 [/code]


SUPERLIST_SUBSTRATE and SUPERLIST_PRODUCT cover every substrate and product that can occur in LIST, i.e. the substrates in LIST are a subset of SUPERLIST_SUBSTRATE, and likewise for the products. I want to create a SUPERARRAY with the superlist substrates as rows and the superlist products as columns, then parse LIST one substrate id at a time and insert a "1" for each of its product ids. For example, consider the first two lines of LIST:

substrates: 3649

products: 3419 3648
For substrate id 3649, the row with id 3649 is selected in SUPERARRAY and a "1" is inserted in the columns with ids 3419 and 3648, and so on for the entire LIST. In other words, SUPERARRAY is a matrix.
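
For illustration only, here is a minimal sketch of the construction described above, assuming the superlists have already been read into arrays with duplicates removed. The names `@sub_ids`, `@prod_ids`, `%row_of`, `%col_of` and the shortened sample data are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical, shortened versions of the two superlists (duplicates removed).
my @sub_ids  = (3649, 3645, 3659);
my @prod_ids = (3419, 3648, 3647);

# Map each id to its row or column position in the matrix.
my %row_of = map { $sub_ids[$_]  => $_ } 0 .. $#sub_ids;
my %col_of = map { $prod_ids[$_] => $_ } 0 .. $#prod_ids;

# Start with a matrix of blanks, then mark each pair found in the LIST.
my @superarray = map { [ ('') x @prod_ids ] } @sub_ids;
my @list = ( [ 3649 => 3419, 3648 ], [ 3645 => 3647 ], [ 3659 => 3647 ] );
for my $entry (@list) {
    my ($substrate, @products) = @$entry;
    $superarray[ $row_of{$substrate} ][ $col_of{$_} ] = 1 for @products;
}

# One tab-separated line per substrate: row id, then the cells.
print join("\t", $sub_ids[$_], @{ $superarray[$_] }), "\n" for 0 .. $#sub_ids;
```

The two index hashes are what turn an id such as 3649 into a row or column number, so the ids themselves never have to be used as array indices.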


(This post was edited by rushadrena on Aug 25, 2012, 8:34 AM)


Laurent_R
Veteran / Moderator

Aug 25, 2012, 10:57 AM

Post #2 of 62 (8309 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Hi,

if your list is smaller than the super-arrays, your matrix will be sparse. You probably want to use a hash of hashes and create only the entries that exist in the list (other entries will be undefined).

Something like this (untested):


Code
my %supermatrix;
open my $DATA, '<', $list_in or die "unable to open my list $list_in $!\n";
while (my $line = <$DATA>) {
    chomp $line;
    my ($substrate) = $line =~ /substrate\D*(\d+)$/;   # capture the whole id
    $line = <DATA>; # fetch next line
    my @products;
    (undef, @products) = split ' ', $line;   # drop the "product[s]:" label
    foreach my $prod (@products) {
        $supermatrix{$substrate}{$prod} = 1;
    }
}


At the end of the while loop, your hash of hashes is populated with 1's for each existing combination of substrate and product. Non-existing combinations will be undefined. When using this data structure you will need to check for the existence of a combination. For example, you may have later in your code:


Code
print "combination $substrate1 $product1 exists ! \n" if exists $supermatrix{$substrate1}{$product1};



rushadrena
Novice

Aug 25, 2012, 11:42 AM

Post #3 of 62 (8302 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Thanks a lot Laurent.
But the problem is that my list would at most be ~5 to 10% smaller than the superarrays. I also need the output in a format with blanks wherever there isn't a "1".
The quality of the output also depends on the number of blanks, because this output will then be compared to 20 other such outputs.
So the positions of the blanks and of the "1"s are equally important.
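
As a rough sketch of how the blanks could still be written out when the data itself is kept in a hash of hashes: walk the full lists of ids and print a placeholder wherever a pair is missing. The sub name `full_rows`, the "-" placeholder and the sample data are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical sample data; in practice these would come from the
# LIST and SUPERLIST files.
my @all_substrates = (3649, 3645, 3659);
my @all_products   = (3419, 3648, 3647);
my %supermatrix    = (
    3649 => { 3419 => 1, 3648 => 1 },
    3645 => { 3647 => 1 },
);

# Build one output line per substrate: "1" where the pair exists,
# a blank placeholder ("-") everywhere else.
sub full_rows {
    my ($matrix, $substrates, $products) = @_;
    my @rows;
    for my $s (@$substrates) {
        my $row   = $matrix->{$s} || {};    # avoids creating empty rows
        my @cells = map { $row->{$_} ? 1 : '-' } @$products;
        push @rows, join ' ', $s, @cells;
    }
    return @rows;
}

print join(' ', 'prod->', @all_products), "\n";
print "$_\n" for full_rows(\%supermatrix, \@all_substrates, \@all_products);
```

The position of every blank is fixed by the order of the two id lists, so several outputs built from the same lists can be compared cell by cell.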


FishMonger
Veteran / Moderator

Aug 25, 2012, 12:47 PM

Post #4 of 62 (8294 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem. [In reply to] Can't Post


Quote

Code
while (my $line = <$DATA>)  { 
...
...
$line = <DATA>; # fetch next line
...


Here's one good example of why you should not use uppercase vars, especially when the name conflicts with a built-in global.


Laurent_R
Veteran / Moderator

Aug 25, 2012, 1:22 PM

Post #5 of 62 (8288 views)
Re: [FishMonger] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Right, I originally thought to put the data in the program (in the data section at the end). I then changed my mind to offer the OP the possibility to read from a file, because I thought it was more convenient to give an example of file opening. And I forgot to change it the second time the file handle is used.

I think I said this was untested code. Quite easy to correct.


Laurent_R
Veteran / Moderator

Aug 25, 2012, 1:32 PM

Post #6 of 62 (8286 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Hmmm, you don't seem to realize that a Cartesian product is far more demanding in terms of space allocation than what you think.

If you have, say, 1,000 substrates in your list (and 1,000 products), you end up with one million possible substrate/product combinations. Most of these combinations are probably useless. This is why I suggest a sparse matrix modeled with a hash of hashes.


rushadrena
Novice

Aug 25, 2012, 2:32 PM

Post #7 of 62 (8280 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Laurent thanks a lot for all your valuable suggestions and time.
In my case the size of the supermatrix is 762 X 680.
And the LIST has 740 substrates and 600 products. A minimal representation would therefore mean a considerable loss of information, and the representation is very important for me. Computational resources aren't an issue.
In that respect, could you please suggest a suitable method (which also takes care of the blanks)?
[EDIT] I would like to save this matrix to a text file.


(This post was edited by rushadrena on Aug 25, 2012, 3:36 PM)


rushadrena
Novice

Aug 25, 2012, 11:16 PM

Post #8 of 62 (8263 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

I tried testing your code with some print checks inserted.

Code
 my %supermatrix;  
my $list = "LIST.txt";
open my $DATA, '<', $list or die "unable to open my list $list $!\n";
print "\nWORKING ON $list\n";
print `head $list`;
while (my $line = <$DATA>) {
chomp $line;

my $substrate = $1 if $line =~ /substrate.*(\d+)$/;

$line = <$DATA>; # fetch next line
print "PRODUCT====$line";
my @products;
(undef, @products) = split / /, $line;
foreach my $prod (@products) {
$supermatrix{$substrate}{$prod} = 1;
print "SUBSTRATE ===$substrate";
print "combination $substrate $prod exists ! \n" if exists $supermatrix{$substrate}{$prod};
}
}

I am still worried that it wouldn't help in my case of a complete matrix.


Laurent_R
Veteran / Moderator

Aug 26, 2012, 1:05 AM

Post #9 of 62 (8261 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

From what you described, you really don't need a complete matrix. In a complete matrix, perhaps 99% or more of the elements will be 0 and 1% or less will be 1. The 99% are just useless. You only need to know if you have a match or not. For that a sparse matrix is far better. For a specific combination of substrate/product, you only need to know if the element exists.


rushadrena
Novice

Aug 26, 2012, 7:17 AM

Post #10 of 62 (8245 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

I'm sorry, Laurent, for any misinterpretation. But in my case ~15% of the elements will be blank, because the size of the supermatrix is 762 X 680 (518160 elements).
And the LIST has 740 substrates and 600 products (444000 elements).
% blanks = 74160/518160 ≈ 14%.
So I can't afford to have a sparse representation. Moreover, I need to create this matrix representation for 10 more such cases (i.e. different LISTS) for the same supermatrix. And all these 10 LISTS have at most 16% of blanks for the supermatrix.
So please help me.
Thanks again.


Laurent_R
Veteran / Moderator

Aug 26, 2012, 8:29 AM

Post #11 of 62 (8240 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Not quite right, Rushadrena. From the data examples you gave, each substrate has 1 or 2 products associated with it, not more (and more often 1 than 2). So you don't get a Cartesian product of 740 substrates and 600 products (444000 elements) at all, but only the combinations that actually exist, i.e., assuming two products per substrate, at most 1480 elements. This is very far from the 518160 elements of a full matrix: less than 0.3%. It means that more than 99.7% of the full matrix would be unused or, I would rather say, totally useless.

The other point is that, from the way you described your problem, all you really need to know is whether a specific substrate/product combination exists (where to assign the value 1) or not. For that, the solution I suggested is entirely sufficient. The sparse matrix approach contains exactly as much useful information about your data as a full matrix that takes more than 300 times more space in memory (and far longer to load).

I'm ready to make one concession, though. You may want to have available a full list of all the possible substrates and a full list of all the possible products, not just those in the input list, so that you can say: although this specific substrate/product combination does not exist in the input list, it would still be a possible candidate, since both the product and the substrate exist. If you want that, then all you need is two other simple hashes, one with all 762 possible substrates and one with all 680 possible products. So you would end up with two simple hashes and one hash of hashes, keeping about 3,000 elements in memory, still far fewer than 500,000. These two hashes give you a virtual Cartesian product of all possibilities, but you never have to compute the actual Cartesian product.
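
A small sketch of that idea, under the assumption that the two superlists have been loaded into plain lookup hashes. The names `%is_substrate`, `%is_product` and `classify`, and the sample ids, are invented for the illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two superlists and the LIST.
my %is_substrate = map { $_ => 1 } (3649, 3645, 3492);
my %is_product   = map { $_ => 1 } (3419, 3648, 3647);
my %supermatrix  = ( 3649 => { 3419 => 1, 3648 => 1 } );

# Classify a substrate/product pair without ever building the full matrix.
sub classify {
    my ($s, $p) = @_;
    return 'unknown'  unless $is_substrate{$s} && $is_product{$p};
    return 'observed' if $supermatrix{$s} && $supermatrix{$s}{$p};
    return 'possible';    # both ids are known, but the pair is not in the list
}

print "$_->[0]/$_->[1]: ", classify(@$_), "\n"
    for [3649, 3419], [3492, 3647], [9999, 3419];
```

A "possible" answer corresponds to a blank cell of the full matrix, and an "observed" answer to a cell holding 1.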

But your description of the problem is an extremely strong indication that a sparse matrix is exactly what you need. And a hash of hashes is the ideal data structure to store it, because you need just one (pretty fast) line of code to retrieve the information you need (i.e. whether a given substrate/product combination exists in the input data).

I hope my explanations are clear. I work a lot on quite similar problems, and the one thing you want to avoid, especially when the volume of data grows, is the quadratic burden of a full Cartesian product (or, even worse, an exponential or factorial explosion of possibilities). Some of the problems I work on at my job can be solved within a few hours of computation with techniques similar to the sparse matrix approach described above, but would probably not finish before the final explosion of the sun and the end of the solar system if we tried to compute all the possibilities with a super-matrix approach.

One last example. My company has a database with about 35 million customers and about a million possible products and services. What is stored in the database is the list of services (usually 5 to 20) actually subscribed to by each customer, not a "super-matrix" of all possible customer-service combinations with 0 and 1 to record whether the customer has subscribed to the service. This "supermatrix" would have 35,000 billion elements, would take ages to query and would require disk space that I can't even imagine. What a standard business-oriented database (e.g., Oracle) does, in effect, is implement a slightly more complicated version of the sparse matrix approach I have described.


(This post was edited by Laurent_R on Aug 26, 2012, 8:32 AM)


Chris Charley
User

Aug 26, 2012, 8:57 AM

Post #12 of 62 (8236 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

I've been following this thread and some questions occurred to me. This would be such a large matrix that you would lose the header information if you had to scroll down a number of rows. Likewise, if you scrolled over to read the columns, you would lose the substrate in the first column on the left.

Here is a sparse table created from the sample LIST.txt file. It lists only the combinations seen, not the 'super' matrix you could create from the SUPERSUB and SUPERPROD lists.


Code
 prod-> 1875 2809 3182 3419 3485 3486 3487 3488 3489 3490 3647 3648 3674 3877 
2809 - - 1 - - - - - - - - - - -
3182 1 - - - - - - - - - - - - -
3186 - 1 - - - - - - - - - - - -
3485 - - - - - 1 - - - - - - - -
3486 - - - - - - - 1 - - - - - -
3487 - - - - - - - - - - - - - 1
3488 - - - - - - 1 - - - - - - -
3489 - - - - - - - - - 1 - - - -
3490 - - - - 1 - - - - - - - - -
3645 - - - - - - - - - - 1 - - -
3649 - - - 1 - - - - - - - 1 - -
3659 - - - - - - - - - - 1 - - -
3674 - - - - - - - - 1 1 - - - -
3675 - - - - - - - - - - - - 1 -
3877 - - - 1 - - - - - - - - - -

^
|
substrate


Would it be better to create a comma separated file, where it could be opened by a spreadsheet program like Excel? Those programs can 'freeze' the column/row headers so you can easily scroll and still keep them visible.


Laurent_R
Veteran / Moderator

Aug 26, 2012, 10:09 AM

Post #13 of 62 (8232 views)
Re: [Chris Charley] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Yes, you could easily generate a CSV file from the hash of hashes for importing the data into a spreadsheet. Or, better yet, you could use a CPAN module to write a spreadsheet file directly, for example Spreadsheet::WriteExcel, Spreadsheet::Write, Spreadsheet::SimpleExcel, etc.

I am happy that you presented the data in such a tabular form, Chris, as it shows Rushadrena graphically how sparse the data actually is. This example has 210 cells, and only 17 of them are really useful, already less than 10%. And the more data you add, the larger the ratio of empty places to actually useful elements becomes.
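
For the plain CSV route, a minimal sketch could look like the following. It assumes the ids are purely numeric, so no CSV quoting is needed; the sub name `matrix_to_csv` and the sample data are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical sample data; a real run would use the parsed LIST.
my %supermatrix = (
    3649 => { 3419 => 1, 3648 => 1 },
    3645 => { 3647 => 1 },
);

# Return the matrix as CSV lines: a header row of product ids, then one row
# per substrate with 1 for an existing pair and an empty field otherwise.
sub matrix_to_csv {
    my ($matrix) = @_;
    my %seen;
    my @products = sort { $a <=> $b }
                   grep { !$seen{$_}++ }
                   map  { keys %$_ } values %$matrix;
    my @lines = join ',', 'substrate', @products;
    for my $s (sort { $a <=> $b } keys %$matrix) {
        push @lines, join ',', $s, map { $matrix->{$s}{$_} ? 1 : '' } @products;
    }
    return @lines;
}

print "$_\n" for matrix_to_csv(\%supermatrix);
```

If the ids could ever contain commas or quotes, a dedicated CSV module would be the safer choice than joining fields by hand.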


rushadrena
Novice

Aug 27, 2012, 2:14 AM

Post #14 of 62 (8211 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Laurent and Chris, thanks a ton for sharing practical views and extensively exploring other realms of the problem space.
Yes, the supermatrix will be very, very sparse. But let me add the last element to the problem I posed here.
I need to create 10 such supermatrices and concatenate them two at a time. So far I am able to create a text file for each of these 10 supermatrices.
Now for the last piece of the puzzle. I have created 10 such matrices (obviously with the same number of rows and columns).
The problem is that I have to concatenate (logical OR operation) two such matrices:
INPUT = Two matrices A, B (each saved in a separate text file) with the same rows and columns
OUTPUT = A single matrix C ( C[i][j] = A[i][j] OR B[i][j] )

Code
==============INPUT======= 
MAT - A
1875 2809 3182 3419
2809 - 1 1 -
3182 1 - - -
3186 1 1 - -
3485 - - - -
3486 - - - -

MAT - B
1875 2809 3182 3419
2809 1 - - 1
3182 - - - -
3186 - 1 1 -
3485 - - - -
3486 - 1 - 1


========== OUTPUT===========
MAT - C
1875 2809 3182 3419
2809 1 1 1 1
3182 1 - - -
3186 1 1 1 -
3485 - - - -
3486 - 1 - 1


I.e. an element of matrix will be one if either of the corresponding element of A or B is one.
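
A minimal sketch of that element-wise OR, assuming each matrix uses the tabular text layout shown above (a header line of column ids, then one line per row starting with the row id, "-" for a blank). The sub names `parse_matrix` and `or_matrices` are invented, and the matrices are given as strings here instead of being read from files.

```perl
use strict;
use warnings;

# Parse a matrix in the tabular text layout: a header line of column ids,
# then one line per row starting with the row id.
sub parse_matrix {
    my ($text) = @_;
    my ($header, @rows) = grep { /\S/ } split /\n/, $text;
    my @cols = split ' ', $header;
    my %m;
    for my $row (@rows) {
        my ($id, @cells) = split ' ', $row;
        $m{$id}{ $cols[$_] } = $cells[$_] eq '1' ? 1 : 0 for 0 .. $#cols;
    }
    return (\@cols, \%m);
}

# Element-wise OR; assumes both matrices share the same rows and columns.
sub or_matrices {
    my ($cols, $ma, $mb) = @_;
    my %c;
    for my $r (keys %$ma) {
        $c{$r}{$_} = ($ma->{$r}{$_} || $mb->{$r}{$_}) ? 1 : 0 for @$cols;
    }
    return \%c;
}

my $text_a = "1875 2809 3182 3419\n2809 - 1 1 -\n3182 1 - - -\n";
my $text_b = "1875 2809 3182 3419\n2809 1 - - 1\n3182 - - - -\n";

my ($cols, $ma) = parse_matrix($text_a);
my (undef, $mb) = parse_matrix($text_b);
my $mc = or_matrices($cols, $ma, $mb);

for my $r (sort { $a <=> $b } keys %$mc) {
    print join(' ', $r, map { $mc->{$r}{$_} ? 1 : '-' } @$cols), "\n";
}
```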


(This post was edited by rushadrena on Aug 27, 2012, 2:20 AM)


Laurent_R
Veteran / Moderator

Aug 27, 2012, 7:33 AM

Post #15 of 62 (8197 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

It is still very easy to concatenate two sparse matrices.

You just need to copy each element of matrix A into matrix B, and matrix B will be the concatenation of the two matrices.

BTW, the matrices don't necessarily have to have the same size.


rushadrena
Novice

Aug 27, 2012, 12:58 PM

Post #16 of 62 (8186 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Is there a way I can read a matrix from a text file into Perl so as to access each element one by one? This is what I have tried. It reads the matrix from the file and prints it as it is, but I'm not able to access the elements one by one.

Code
use strict; 
use warnings;

open DATA, "matrix.txt" or die $!;
chomp( my @lines = <DATA> );
foreach (@lines) {
print "$_\n";
}

=====CONTENTS of matrix.txt======
1 2 3 4

1 5 6 8

1 7 8 0


Chris Charley
User

Aug 27, 2012, 6:25 PM

Post #17 of 62 (8174 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

You might not know, but to pair every one of the 10 matrices with each other will create another 45 matrices. That is because the number of unique pairings is a combination of 10 items, 2 at a time.

(10 * 9) / (2 * 1) = 45

So you will have a total of 55 matrices. And you will probably want each in its own file, (or perhaps not).
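
The count can be checked with two nested loops over the file list. This is only a sketch, and the file names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical file names for the 10 matrices.
my @files = map { "matrix$_.txt" } 1 .. 10;

# Every unordered pair (i < j), so no matrix is paired with itself
# and no pair is listed twice.
my @pairs;
for my $i (0 .. $#files) {
    for my $j ($i + 1 .. $#files) {
        push @pairs, [ $files[$i], $files[$j] ];
    }
}

printf "%d pairs, first: %s + %s\n", scalar @pairs, @{ $pairs[0] };
```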

Here is some output from what I worked on. It's OK for small tables, but I don't think you will be able to view a 400 column matrix that way. You should probably create a comma separated values file (.csv) to be read by a program like Excel.


Code
C:\Old_Data\perlp>perl t1.pl 
Processing file: junk.txt
prod-> 1875 2809 3182 3419 3485 3486 3487 3488 3489 3490 3647 3648 3674 3877
2809 - - 1 - - - - - - - - - - -
3182 1 - - - - - - - - - - - - -
3186 - 1 - - - - - - - - - - - -
3485 - - - - - 1 - - - - - - - -
3486 - - - - - - - 1 - - - - - -
3487 - - - - - - - - - - - - - 1
3488 - - - - - - 1 - - - - - - -
3489 - - - - - - - - - 1 - - - -
3490 - - - - 1 - - - - - - - - -
3645 - - - - - - - - - - 1 - - -
3649 - - - 1 - - - - - - - 1 - -
3659 - - - - - - - - - - 1 - - -
3674 - - - - - - - - 1 1 - - - -
3675 - - - - - - - - - - - - 1 -
3877 - - - 1 - - - - - - - - - -

^
|
substrate

Processing file: another.txt
prod-> 1875 2809 3182 3248 3374 3390 3419 3485 3486 3487 3488 3490 3641 3645 3877
2809 - - 1 - - - - - - - - - - - -
3182 1 - - - - - - - - - - - - - -
3186 - 1 - - - - - - - - - - - - -
3287 - - - - - - - - - - - - - - 1
3489 - - - - - - - - 1 - - - - - -
3490 - - - - - - - 1 - - - - - - -
3491 - - - - - - - - - - - 1 - - -
3499 - - - - - - - - - 1 - - - - -
3609 - - - - - - - - - - - - - 1 -
3645 - - - - - - - - - - - - 1 - -
3647 - - - 1 - - 1 - - - - - - - -
3674 - - - - - 1 - - - - - 1 - - -
3685 - - - - 1 - - - - - - - - - -
3877 - - - - - - 1 - - - - - - - -
3986 - - - - - - - - - - 1 - - - -

^
|
substrate

Combining junk.txt and another.txt
prod-> 1875 2809 3182 3248 3374 3390 3419 3485 3486 3487 3488 3489 3490 3641 3645 3647 3648 3674 3877
2809 - - 1 - - - - - - - - - - - - - - - -
3182 1 - - - - - - - - - - - - - - - - - -
3186 - 1 - - - - - - - - - - - - - - - - -
3287 - - - - - - - - - - - - - - - - - - 1
3485 - - - - - - - - 1 - - - - - - - - - -
3486 - - - - - - - - - - 1 - - - - - - - -
3487 - - - - - - - - - - - - - - - - - - 1
3488 - - - - - - - - - 1 - - - - - - - - -
3489 - - - - - - - - 1 - - - 1 - - - - - -
3490 - - - - - - - 1 - - - - - - - - - - -
3491 - - - - - - - - - - - - 1 - - - - - -
3499 - - - - - - - - - 1 - - - - - - - - -
3609 - - - - - - - - - - - - - - 1 - - - -
3645 - - - - - - - - - - - - - 1 - 1 - - -
3647 - - - 1 - - 1 - - - - - - - - - - - -
3649 - - - - - - 1 - - - - - - - - - 1 - -
3659 - - - - - - - - - - - - - - - 1 - - -
3674 - - - - - 1 - - - - - 1 1 - - - - - -
3675 - - - - - - - - - - - - - - - - - 1 -
3685 - - - - 1 - - - - - - - - - - - - - -
3877 - - - - - - 1 - - - - - - - - - - - -
3986 - - - - - - - - - - 1 - - - - - - - -

^
|
substrate



rushadrena
Novice

Aug 28, 2012, 1:10 AM

Post #18 of 62 (8165 views)
Re: [Chris Charley] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Dear Chris,
Actually I don't need to visualize the matrix. I just need to process it as it is, and for that purpose a text file is sufficient.
Chris, could you pass on the code (t1.pl) you have written to achieve the OR of two matrices? That would be really helpful.


Laurent_R
Veteran / Moderator

Aug 28, 2012, 4:40 AM

Post #19 of 62 (8152 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Hi,

to access the individual elements:

Code
use strict;
use warnings;

my $matrix = "matrix.txt";
open my $fh, "<", $matrix or die "unable to open $matrix $! \n";
chomp( my @lines = <$fh> );    # read from the lexical handle just opened
foreach my $line (@lines) {
    my @fields = split ' ', $line;    # one element per whitespace-separated field
    print "$_\n" foreach @fields;
}


With your data:


Code
1 2 3 4  
1 5 6 8
1 7 8 0


this prints:


Code
 perl matrix.pl  
1
2
3
4
1
5
6
8
1
7
8
0



(This post was edited by Laurent_R on Aug 28, 2012, 4:41 AM)


Chris Charley
User

Aug 28, 2012, 8:49 AM

Post #20 of 62 (8138 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Giving you a solution without you working through it will not help you learn. I suspect this is a school assignment and I won't help you cheat your teacher.

Even if it's not a school assignment, I'm not sure you would understand the code.

The problem is not difficult. The code you wrote(?) to create 'MAT -A' and 'MAT - B' should be part of the solution.

Post the code that created these 2 matrices, and then see if it can be modified to merge the 2 tables.


rushadrena
Novice

Aug 28, 2012, 10:25 AM

Post #21 of 62 (8134 views)
Re: [Chris Charley] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Hi there,
Guys it ain't any school assignment but related to my research work. Here's the code I have developed for creating the matrices

Code
use Modern::Perl; 
use File::Slurp qw/read_file/;
use Text::Table;
use Data::Dumper;

my ( %supermatrix, @titles, %seen, @rows );

my @list = read_file 'LIST.txt';

for ( my $i = 0 ; $i < $#list + 1 ; $i += 2 ) {
my ($substrateID) = $list[$i] =~ /(\d+)/g;
$supermatrix{$substrateID}{$1} = 1 while $list[ $i + 1 ] =~ /(\d+)/g;
}

for my $product ( read_file 'SUPERLIST_PRODUCT.txt' ) {
my ($productID) = $product =~ /(\d+)/g;
push @titles, $productID unless $seen{$productID}++;

for my $substrate ( read_file 'SUPERLIST_SUBSTRATE.txt' ) {
my ($substrateID) = $substrate =~ /(\d+)/g;
$supermatrix{$substrateID}{$productID} //= '.';
}
}

my $titles = join ',',
map "{title => 'p$_', align_title => 'center', align => 'center'}",
sort { $a <=> $b } @titles;

for my $y ( sort { $a <=> $b } keys %supermatrix ) { #rows
my ( $rowLable, @row );

for my $x ( sort { $a <=> $b } keys %{ $supermatrix{$y} } ) {
#columns
$rowLable = $y unless $rowLable;
push @row, $supermatrix{$y}{$x};
}
push @rows, [ "s$rowLable", @row ];
}

my $tb = Text::Table->new( ' ', eval $titles );
$tb->load(@rows);
say $tb;

say "\n", Dumper \%supermatrix;

Chris the OR combination code for matrices(the one you have written) is the last thing I need.


Chris Charley
User

Aug 29, 2012, 7:20 AM

Post #22 of 62 (8106 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

I wasn't able to configure the code you posted, so I am posting some solution I got,

Code
#!/usr/bin/perl 
use strict;
use warnings;

my @matrix;
my @file = qw/junk.txt another.txt/;

for my $file ( @file ) {
my %data;
my $sub;

open my $fh, "<", $file
or die "Unable to open $file for reading. $!";

while (<$fh>) {
if (/substrate\D+(\d+)$/) {
$sub = $1;
}
else { # get the product(s)
$data{$sub}{$_} = 1 for /\d+/g;
}
}
close $fh or die "Unable to close $file. $!";

print "Processing file: $file\n";
process(%data);

push @matrix, \%data;
}

for my $i (0 .. $#matrix) {
for my $j ($i+1 .. $#matrix) {
print "Combining $file[$i] and $file[$j]\n";
my %data = combine($matrix[$i], $matrix[$j]);
process(%data);
}
}

sub process {
my %data = @_;
my %seen;
my @product = sort {$a <=> $b}
grep ! $seen{$_}++,
map keys %$_, values %data;

printf "%7s" . "%5s" x @product . "\n", 'prod->', @product;

for my $substrate (sort {$a <=> $b} keys %data) {
printf "%7s", $substrate;
printf "%5s", $data{$substrate}{$_} || '-' for @product;
print "\n";
}
printf "\n%5s\n%5s\n%s\n\n", '^', '|', 'substrate';
}

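sub combine {
my ($matrix1, $matrix2) = @_;
# Copy the inner hashes too, so that merging does not modify $matrix1.
my %new_hash = map { $_ => { %{ $matrix1->{$_} } } } keys %$matrix1;

for my $substrate (keys %$matrix2) {
$new_hash{$substrate}{$_} = 1 for keys %{ $matrix2->{$substrate} };
}
return %new_hash;
}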



Laurent_R
Veteran / Moderator

Aug 29, 2012, 10:43 AM

Post #23 of 62 (8094 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem. [In reply to] Can't Post


Rushadrena,

your idea of storing the data in a file in tabular form is probably wrong.

It is far easier to store the hash of hashes structure directly using Data::Dumper. Then you only have to open the file, slurp its content and use eval on it to recreate the hash.

And combining two hashes the way you want is very easy: it only takes three lines of code (as shown in Chris Charley's code suggestion).

Final point: I can see from your code that you're still trying to build the complete matrix instead of a sparse one. This is simply the wrong approach: it takes more space, more time and more code.


FishMonger
Veteran / Moderator

Aug 29, 2012, 11:23 AM

Post #24 of 62 (8090 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem. [In reply to] Can't Post


Quote
It is far easier to store directly the hash of hashes structure using Data::Dumper. Then, you only have to open the file, slurp its content and use eval on it to recreate the hash.

I'd suggest using the Storable module for that step instead of Data::Dumper and an eval statement.
http://search.cpan.org/~ams/Storable-2.35/Storable.pm
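
A minimal sketch of such a round trip with Storable's `nstore` and `retrieve`. The sample hash is made up, and a temporary file is used only to keep the example self-contained.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempfile);

# Hypothetical sample data standing in for the real hash of hashes.
my %supermatrix = ( 3649 => { 3419 => 1, 3648 => 1 }, 3645 => { 3647 => 1 } );

# A temporary file keeps the sketch self-contained; a real script would
# use a fixed path of its own choosing.
my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;

nstore(\%supermatrix, $path);      # write the structure in portable byte order
my $restored = retrieve($path);    # returns a reference to a fresh copy

print "pair 3649/3648 survived the round trip\n" if $restored->{3649}{3648};
```

Unlike the Data::Dumper approach, nothing is passed to eval when the file is read back.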


Laurent_R
Veteran / Moderator

Aug 29, 2012, 11:42 AM

Post #25 of 62 (8086 views)
Re: [FishMonger] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Yes, right, the Storable module is probably even easier to use.


Laurent_R
Veteran / Moderator

Aug 30, 2012, 7:46 AM

Post #26 of 62 (5644 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

I made a quick attempt to code your problem yesterday, but did not have time to post it before I had to leave the place where I was then. And I'm back in front of the same computer only now.

Here it is:

Code
use strict;
use warnings;
use Data::Dumper;

my ($list_1, $list_2) = ("list.txt", "list2.txt");
my %supermatrix1 = read_file($list_1);
print "\n==============================\n";
my %supermatrix2 = read_file($list_2);
print "\n==============================\n";
my %concat_supermatrix = concat_matrices();
print Dumper %concat_supermatrix;

sub read_file {
    my $list_in = shift;
    my %supermatrix;
    open my $DATA, '<', $list_in or die "unable to open $list_in $!\n";
    while (my $line = <$DATA>) {
        chomp $line;
        my ($substrate) = $line =~ /substrate\D*(\d+)\s*$/ or next;  # with or without "[s]"
        $line = <$DATA>;    # the product line follows the substrate line
        my @products;
        (undef, @products) = split /\s+/, $line;    # drop the "product:" label
        $supermatrix{$substrate}{$_} = 1 foreach (@products);
    }
    print Dumper %supermatrix;
    return %supermatrix;
}

sub concat_matrices {
    my %supermatrix_cat = map { $_ => { %{ $supermatrix1{$_} } } } keys %supermatrix1;  # deep copy
    foreach my $subs (keys %supermatrix2) {
        $supermatrix_cat{$subs}{$_} = 1 for keys %{ $supermatrix2{$subs} };
    }
    return %supermatrix_cat;
}


As you can see, it is just 35 lines of code. It is quite similar to Chris Charley's version, because it is based on the same sparse data structure.

I ran it with the data of your original post for list_1 and the following data for list2:

Code
 substrate: 7182     
product: 1875
substrate: 2809
product: 3187
substrate: 3186
product: 2810


As you can see the matrices have a very different size.

This is the data dump of list2:

Code
   ==============================    
$VAR1 = '2809';
$VAR2 = {
'3187' => 1
};
$VAR3 = '3186';
$VAR4 = {
'2810' => 1
};
$VAR5 = '7182';
$VAR6 = {
'1875' => 1
};


And this is the data dump of the concatenated matrix:

Code
   ==============================    
$VAR1 = '3489';
$VAR2 = {
'3490' => 1
};
$VAR3 = '3877';
$VAR4 = {
'3419' => 1
};
$VAR5 = '3182';
$VAR6 = {
'1875' => 1
};
$VAR7 = '3649';
$VAR8 = {
'3419' => 1,
'3648' => 1
};
$VAR9 = '3674';
$VAR10 = {
'3489' => 1,
'3490' => 1
};
$VAR11 = '3490';
$VAR12 = {
'3485' => 1
};
$VAR13 = '3675';
$VAR14 = {
'3674' => 1
};
$VAR15 = '3645';
$VAR16 = {
'3647' => 1
};
$VAR17 = '3488';
$VAR18 = {
'3487' => 1
};
$VAR19 = '2809';
$VAR20 = {
'3187' => 1,
'3182' => 1
};
$VAR21 = '3485';
$VAR22 = {
'3486' => 1
};
$VAR23 = '3186';
$VAR24 = {
'2810' => 1,
'2809' => 1
};
$VAR25 = '3487';
$VAR26 = {
'3877' => 1
};
$VAR27 = '3486';
$VAR28 = {
'3488' => 1
};
$VAR29 = '3659';
$VAR30 = {
'3647' => 1
};
$VAR31 = '7182';
$VAR32 = {
'1875' => 1
};




(This post was edited by Laurent_R on Aug 30, 2012, 7:51 AM)


rushadrena
Novice

Sep 4, 2012, 4:27 AM

Post #27 of 62 (5621 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem. [In reply to] Can't Post

Hi all,
I'm facing a small problem. Yesterday, while parsing all the files, I first tested the matrix generation code and found that it's not giving complete output. Here's the problem:
LIST.txt
substrate: 1 2
product: 3
substrate: 6 9
product: 8 10
substrate: 3
product: 6
substrate: 9
product: 5
substrate: 5
product: 2
substrate: 3
product: 9
substrate: 8
product: 9
substrate: 8
product: 1
substrate: 7
product: 11
substrate: 19
product: 17
substrate: 14
product: 13
substrate: 14
product: 11
substrate: 18
product: 19
substrate: 7 14
product: 15
substrate: 7 16
product: 7 17
substrate: 5
product: 6
substrate: 18 15
product: 7
substrate: 7 8
product: 8 18
substrate: 6
product: 9
substrate: 11
product: 12




SUPERLIST_SUBSTRATE

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19


SUPERLIST_PRODUCT

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19


********OUTPUT********


Code
    p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 p14 p15 p16 p17 p18 p19 
s1 . . 1 . . . . . . . . . . . . . . . .
s2 . . . . . . . . . . . . . . . . . . .
s3 . . . . . 1 . . 1 . . . . . . . . . .
s4 . . . . . . . . . . . . . . . . . . .
s5 . 1 . . . 1 . . . . . . . . . . . . .
s6 . . . . . . . 1 1 1 . . . . . . . . .
s7 . . . . . . 1 1 . . 1 . . . 1 . 1 1 .
s8 1 . . . . . . . 1 . . . . . . . . . .
s9 . . . . 1 . . . . . . . . . . . . . .
s10 . . . . . . . . . . . . . . . . . . .
s11 . . . . . . . . . . . 1 . . . . . . .
s12 . . . . . . . . . . . . . . . . . . .
s13 . . . . . . . . . . . . . . . . . . .
s14 . . . . . . . . . . 1 . 1 . . . . . .
s15 . . . . . . . . . . . . . . . . . . .
s16 . . . . . . . . . . . . . . . . . . .
s17 . . . . . . . . . . . . . . . . . . .
s18 . . . . . . 1 . . . . . . . . . . . 1
s19 . . . . . . . . . . . . . . . . 1 . .


Consider first two lines of LIST.txt -
substrate: 1 2
product: 3
Then there is a "1" at s1-p3, but there isn't a "1" at s2-p3.
#####REQUIRED OUTPUT--Places marked with "X" need to be "1" #####

Code
    p1 p2 p3 p4 p5 p6 p7 p8 p9 p10 p11 p12 p13 p14 p15 p16 p17 p18 p19 
s1 . . 1 . . . . . . . . . . . . . . . .
s2 . . X . . . . . . . . . . . . . . . .
s3 . . . . . 1 . . 1 . . . . . . . . . .
s4 . . . . . . . . . . . . . . . . . . .
s5 . 1 . . . 1 . . . . . . . . . . . . .
s6 . . . . . . . 1 1 1 . . . . . . . . .
s7 . . . . . . 1 1 . . 1 . . . 1 . 1 1 .
s8 1 . . . . . . X 1 . . . . . . . . X .
s9 . . . . 1 . . X . X . . . . . . . . .
s10 . . . . . . . . . . . . . . . . . . .
s11 . . . . . . . . . . . 1 . . . . . . .
s12 . . . . . . . . . . . . . . . . . . .
s13 . . . . . . . . . . . . . . . . . . .
s14 . . . . . . . . . . 1 . 1 . X . . . .
s15 . . . . . . X . . . . . . . . . . . .
s16 . . . . . . X . . . . . . . . . X . .
s17 . . . . . . . . . . . . . . . . . . .
s18 . . . . . . 1 . . . . . . . . . . . 1
s19 . . . . . . . . . . . . . . . . 1 . .


THE CODE I USED

Code
use Modern::Perl; 
use File::Slurp qw/read_file/;
use Text::Table;
use Data::Dumper;

my ( %supermatrix, @titles, %seen, @rows );

my @list = read_file 'LIST.txt';

for ( my $i = 0 ; $i < $#list + 1 ; $i += 2 ) {
my ($substrateID) = $list[$i] =~ /(\d+)/g;
$supermatrix{$substrateID}{$1} = 1 while $list[ $i + 1 ] =~ /(\d+)/g;
}

for my $product ( read_file 'SUPER_PRO.txt' ) {
my ($productID) = $product =~ /(\d+)/g;
push @titles, $productID unless $seen{$productID}++;

for my $substrate ( read_file 'SUPER_SUB.txt' ) {
my ($substrateID) = $substrate =~ /(\d+)/g;
$supermatrix{$substrateID}{$productID} //= '.';
}
}

my $titles = join ',',
map "{title => 'p$_', align_title => 'center', align => 'center'}",
sort { $a <=> $b } @titles;

for my $y ( sort { $a <=> $b } keys %supermatrix ) { #rows
my ( $rowLable, @row );

for my $x ( sort { $a <=> $b } keys %{ $supermatrix{$y} } ) { #columns
$rowLable = $y unless $rowLable;
push @row, $supermatrix{$y}{$x};
}
push @rows, [ "s$rowLable", @row ];
}

my $tb = Text::Table->new( ' ', eval $titles );
$tb->load(@rows);
say $tb;



Laurent_R
Veteran / Moderator

Sep 4, 2012, 10:43 AM

Post #28 of 62 (5582 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

Hi,

you did not say that there could be several substrates on the same line. You said there could be several products on product lines, but you did not say it for substrates.

The code I provided and the one by Chris therefore do not look for possible other substrates on the substrate line. And your code, probably derived in part from what Chris and I have suggested, also doesn't check for other substrates on substrate lines.

You have to change the second and third lines of this part of your code to look for possible other substrates:


Code
for ( my $i = 0 ; $i < $#list + 1 ; $i += 2 ) {  
my ($substrateID) = $list[$i] =~ /(\d+)/g;
$supermatrix{$substrateID}{$1} = 1 while $list[ $i + 1 ] =~ /(\d+)/g;
}
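
The change Laurent describes can be sketched as follows. This is only an illustration, assuming LIST.txt strictly alternates one substrate line and one product line; `build_matrix` is a made-up name and not part of the posted code.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the lines of LIST.txt (substrate line, then
# product line, alternating) and returns a hash of hashes.
sub build_matrix {
    my @list = @_;
    my %supermatrix;

    for ( my $i = 0 ; $i < $#list ; $i += 2 ) {
        # A /g match in list context collects every ID on the line,
        # not only the first one.
        my @substrate_ids = $list[$i]       =~ /(\d+)/g;
        my @product_ids   = $list[ $i + 1 ] =~ /(\d+)/g;

        for my $substrate_id (@substrate_ids) {
            $supermatrix{$substrate_id}{$_} = 1 for @product_ids;
        }
    }
    return %supermatrix;
}

my %matrix = build_matrix( "substrate: 1 2\n", "product: 3\n" );
for my $substrate_id ( sort { $a <=> $b } keys %matrix ) {
    print "s${substrate_id}-p3 is set\n" if $matrix{$substrate_id}{3};
}
```

With the two sample lines, both s1-p3 and s2-p3 end up set, which is what the required output asks for.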



Chris Charley
User

Sep 4, 2012, 11:23 AM

Post #29 of 62 (5576 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem.

Yes, Laurent found the problem: you need to check for more than one substrate on its line. In addition, when I ran the code you posted, I found other errors, but I haven't been able to find the cause. Here is my code, modified from my original post. It checks for more than one substrate on a line.

Code
 #!/usr/bin/perl  
use strict;
use warnings;
use List::Util qw/ max /;

my @matrix;
my $path = '.'; # (current directory - '.') or path to data files

my @file = qw/list.txt one.txt another.txt/;

for my $file ( @file ) {
my %data;
my @substrate;

open my $fh, "<", "$path/$file"
or die "Unable to open $file for reading. $!";

while (<$fh>) {
if (/^substrate/) {
@substrate = /\d+/g;
}
elsif (/^product/) {
while (/(\d+)/g) {
for my $sub (@substrate) {
$data{$sub}{$1} = 1 ;
}
}
}
else {
die "Unknown format $file. $!";
}
}
close $fh or die "Unable to close $file. $!";

print "Processing file: $file\n";
process(%data);

push @matrix, \%data;
}

for my $i (0 .. $#matrix) {
for my $j ($i+1 .. $#matrix) {
print "Combining $file[$i] and $file[$j]\n";
my %data = combine($matrix[$i], $matrix[$j]);
process(%data);
}
}

sub process {
my %data = @_;
my %seen;

my @product = sort {$a <=> $b}
grep ! $seen{$_}++,
map keys %$_, values %data;

# to get column width for print
my $wid = 1 + max map length, @product;

printf "%7s" . "%${wid}s" x @product . "\n", 'prod->', @product;

for my $substrate (sort {$a <=> $b} keys %data) {
printf "%7s", $substrate;
printf "%${wid}s", $data{$substrate}{$_} || '-' for @product;
print "\n";
}
printf "\n%5s\n%5s\n%s\n\n", '^', '|', 'substrate';
}


sub combine {
my ($matrix1, $matrix2) = @_;
my %new_hash = %$matrix1;

for my $substrate (keys %$matrix2) {
$new_hash{$substrate}{$_} = 1 for keys %{ $matrix2->{$substrate} };
}
return %new_hash;
}
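
One detail of `combine` worth keeping in mind: `my %new_hash = %$matrix1;` copies only the top level, so the inner row hashes are still shared with the first matrix, and setting cells in `%new_hash` also sets them in that matrix. The following is a minimal sketch of a variant that copies the rows as well. The name `combine_copy` is hypothetical and not from the posted code.

```perl
use strict;
use warnings;

# Variant of combine() that copies the inner hashes too, so the first
# argument is left untouched. The name is made up for this sketch.
sub combine_copy {
    my ( $matrix1, $matrix2 ) = @_;

    # Copy each row hash instead of sharing the reference.
    my %new_hash;
    $new_hash{$_} = { %{ $matrix1->{$_} } } for keys %$matrix1;

    for my $substrate ( keys %$matrix2 ) {
        $new_hash{$substrate}{$_} = 1 for keys %{ $matrix2->{$substrate} };
    }
    return %new_hash;
}

my %first  = ( 1 => { 3 => 1 } );
my %second = ( 1 => { 4 => 1 }, 2 => { 5 => 1 } );
my %both   = combine_copy( \%first, \%second );

print join( ',', sort keys %{ $both{1} } ),  "\n";    # 3,4
print join( ',', sort keys %{ $first{1} } ), "\n";    # 3
```

Whether the sharing matters depends on whether a matrix is reused after it has been the first argument of a combination, as happens when all pairs are processed.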


I've attached 2 files. One has the three data files and the other has the output. And now a third file, using 'SUPERLIST_PRODUCT.txt' and 'SUPERLIST_SUBSTRATE.txt'.


(This post was edited by Chris Charley on Sep 4, 2012, 4:46 PM)
Attachments: FILE.txt (1.71 KB)
  matrices.txt (15.5 KB)
  t33.pl (2.12 KB)


rushadrena
Novice

Sep 7, 2012, 1:05 AM

Post #30 of 62 (5492 views)
Re: [Chris Charley] A file parsing and 2D array/matrix problem.

Hi Chris,
I have 10 directories and each of them has a file "list".
I modified your code to create pairwise concatenated matrices, but I'm getting this error: "No such file or directory at code2.pl line 6", whereas the file does exist.
Here's the code:

Code
#!/usr/bin/perl 
use strict;
use warnings;
use List::Util qw/ max /;

open my $fh, "<", 'SUPERLIST_PRODUCT' or die $!;
my $spr_prod = do {local $/; <$fh>};
close $fh or die $!;
my @spr_prod = $spr_prod =~ /\d+/g;

open $fh, "<", 'SUPERLIST_SUBSTRATE' or die $!;
my $spr_substr = do {local $/; <$fh>};
close $fh or die $!;
my @spr_substrate = $spr_substr =~ /\d+/g;
my $i=0;
my $j=0;
my @matrix;
my @file;
my @dirs = split /\n/,`find -maxdepth 1 -type d`;
for ($i=1;$i<=10;$i++)
{
for ($j=1;$j<=10;$j++)

{

my $path = '.'; # (current directory - '.') or path to data files
my @file = qw($dirs[$i]/list $dirs[$j]/list);

for my $file ( @file ) {
my %data;
my @substrate;

open my $fh, "<", "$path/$file"
or die "Unable to open $file for reading. $!";

while (<$fh>) {
if (/^substrate/) {
@substrate = /\d+/g;
}
elsif (/^product/) {
while (/(\d+)/g) {
for my $sub (@substrate) {
$data{$sub}{$1} = 1 ;
}
}
}
else {
die "Unknown format $file. $!";
}
}
close $fh or die "Unable to close $file. $!";

print "Processing file: $file\n";
process(\@spr_prod, \@spr_substrate, %data);

push @matrix, \%data;
}
}}
for my $i (0 .. $#matrix) {
for my $j ($i+1 .. $#matrix) {
print "Combining $file[$i] and $file[$j]\n";
my %data = combine($matrix[$i], $matrix[$j]);
process(\@spr_prod, \@spr_substrate, %data);
}
}

sub process {
my ($spr_prod, $spr_subst, %data) = @_;
my %seen;

my @product = sort {$a <=> $b}
grep ! $seen{$_}++,
@$spr_prod, map keys %$_, values %data;

# to get column width for print
my $wid = 1 + max map length, @product;

printf "%7s" . "%${wid}s" x @product . "\n", 'prod->', @product;

undef %seen;
my @substrate = sort {$a <=> $b}
grep ! $seen{$_}++,
@$spr_subst, keys %data;

for my $substrate (@substrate) {
printf "%7s", $substrate;
printf "%${wid}s", $data{$substrate}{$_} || '-' for @product;
print "\n";
}
printf "\n%5s\n%5s\n%s\n\n", '^', '|', 'substrate';
}


sub combine {
my ($matrix1, $matrix2) = @_;
my %new_hash = %$matrix1;

for my $substrate (keys %$matrix2) {
$new_hash{$substrate}{$_} = 1 for keys %{ $matrix2->{$substrate} };
}
return %new_hash;
}



(This post was edited by rushadrena on Sep 8, 2012, 3:24 AM)


Laurent_R
Veteran / Moderator

Sep 7, 2012, 4:43 AM

Post #31 of 62 (5484 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

Does the file exist in the directory from where you run your program? Or is it in a subdirectory?


rushadrena
Novice

Sep 7, 2012, 9:06 AM

Post #32 of 62 (5461 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem.

Yup Laurent. The SUPERLIST_PRODUCT and SUPERLIST_SUBSTRATE files exist in the present directory, which also has 10 subdirectories. Each of these subdirectories has a "list" file.


Laurent_R
Veteran / Moderator

Sep 7, 2012, 10:16 AM

Post #33 of 62 (5457 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

Can you post the result of a 'ls -al' command in your directory?


rushadrena
Novice

Sep 7, 2012, 12:03 PM

Post #34 of 62 (5450 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem.

Here it is :-
ls -al


Code
drwxr-xr-x 3 vanilla vanilla 4096 2012-09-05 22:35 acs/ 
-rw-r--r-- 1 vanilla vanilla 2215 2012-09-07 13:20 code2.pl
drwxr-xr-x 2 vanilla vanilla 4096 2012-09-05 20:32 dre/
drwxr-xr-x 2 vanilla vanilla 4096 2012-09-05 20:33 gga/
drwxr-xr-x 2 vanilla vanilla 4096 2012-09-05 20:33 mdo/
drwxr-xr-x 2 vanilla vanilla 4096 2012-09-05 20:34 mgp/
drwxr-xr-x 2 vanilla vanilla 4096 2012-09-05 20:34 oaa/
drwxr-xr-x 2 vanilla vanilla 4096 2012-09-05 20:38 spu/
drwxr-xr-x 2 vanilla vanilla 4096 2012-09-05 20:38 ssc/
-rw-r--r-- 1 vanilla vanilla 5884 2012-09-05 20:51 SUPERLIST_PRODUCT
-rw-r--r-- 1 vanilla vanilla 5884 2012-09-05 20:58 SUPERLIST_SUBSTRATE
drwxr-xr-x 2 vanilla vanilla 4096 2012-09-05 20:47 xla/
drwxr-xr-x 2 vanilla vanilla 4096 2012-09-05 20:46 xtr/



Laurent_R
Veteran / Moderator

Sep 8, 2012, 1:36 AM

Post #35 of 62 (5404 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

Assuming code2.pl is your program and that you are launching it from the directory, everything seems OK, the file permissions should enable you to open the 2 SUPERLIST files.

I do not understand why opening these files should fail. Can you view your SUPERLIST files with the cat or more commands? I'm asking because there is sometimes an invisible or hidden character in file names.


rushadrena
Novice

Sep 8, 2012, 3:27 AM

Post #36 of 62 (5395 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem.

Oh, my bad. There was actually a typo in the file name. I fixed it, and the code now gets through the first 32 lines, with this error at line 33:

Code
Unable to open $dir[$i]/list for reading. No such file or directory at code2.pl line 33.


Am I reading the files wrongly at line 33?


Laurent_R
Veteran / Moderator

Sep 8, 2012, 7:10 AM

Post #37 of 62 (5388 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

Hi,

I think that the command:


Code
my @file = qw($dirs[$i]/list $dirs[$j]/list);


probably does not do what you seem to think it does, although I am not entirely sure of what you are trying to do with it. Add a print on @file or use the debugger to figure out the contents of the @file array.

If you want a list of files in directory $dirs[$i], you may want to use the glob function, or you could do something like this:


Code
my $file_list = `ls $dirs[$i]`; 
my @files = split /\n/, $file_list;
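
The glob alternative mentioned above could look roughly like this. It is a sketch under the assumption that the directory name contains no whitespace or glob metacharacters; `files_in` is a hypothetical helper name. Note that `*` does not match hidden files.

```perl
use strict;
use warnings;

# Hypothetical helper: plain files directly inside $dir, found with glob.
# Assumes $dir contains no whitespace or glob metacharacters.
sub files_in {
    my ($dir) = @_;
    return grep { -f $_ } glob("$dir/*");
}

print "$_\n" for files_in('.');
```

Compared with backticks around `ls`, this avoids starting an external command and returns the names with the directory already prepended.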



Chris Charley
User

Sep 8, 2012, 9:27 AM

Post #38 of 62 (5368 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

The way you have set up the for loops is wrong (I'll show below) and does not interface with the rest of the program. I believe you want the pairings of each matrix (and not the matrix by itself). That seems to be the intention of your attempt to solve the problem.

You also have other issues, which you have been discussing with Laurent: how to find the files, etc.

To find all pairs, I set up a table from 1 to 5 (instead of 1 to 10), just to demonstrate combinations.


Code
 #!/usr/bin/perl  
use strict;
use warnings;

for (my $i=1;$i<=5;$i++) {
for (my $j=1;$j<=5;$j++) {
print "($i $j)\n";
}
}
print "\ncombinations\n";
for (my $i=1;$i<=5;$i++) {
for (my $j=$i+1;$j<=5;$j++) {
print "($i $j)\n";
}
}
__END__
C:\Old_Data\perlp>perl t7.pl
(1 1) duplicate
(1 2)
(1 3)
(1 4)
(1 5)
(2 1) done earlier (1 2)
(2 2) duplicate
(2 3)
(2 4)
(2 5)
(3 1) done earlier (1 3)
(3 2) done earlier (2 3)
(3 3) duplicate
(3 4)
(3 5)
(4 1) done earlier (1 4)
(4 2) done earlier (2 4)
(4 3) done earlier (3 4)
(4 4) duplicate
(4 5)
(5 1) done earlier (1 5)
(5 2) done earlier (2 5)
(5 3) done earlier (3 5)
(5 4) done earlier (4 5)
(5 5) duplicate

combinations
(1 2)
(1 3)
(1 4)
(1 5)
(2 3)
(2 4)
(2 5)
(3 4)
(3 5)
(4 5)

C:\Old_Data\perlp>


You can see in the first listing that some pairs have already been seen, just in reverse order: (1 2) and (2 1), for example.

Note: arrays in Perl begin with 0 and not 1.


(This post was edited by Chris Charley on Sep 8, 2012, 11:25 AM)


Laurent_R
Veteran / Moderator

Sep 8, 2012, 11:04 AM

Post #39 of 62 (5363 views)
Re: [Chris Charley] A file parsing and 2D array/matrix problem.

Right, Chris, I had not seen this error in the OP's code.

To have only the combinations, you could also do something like this:


Code
my @file_counters = 1..5; 
while (my $i = shift @file_counters){
print "$i $_ \n" foreach @file_counters;
}


which prints:
1 2
1 3
1 4
1 5
2 3
2 4
2 5
3 4
3 5
4 5
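
A small caveat on the loop above: `while (my $i = shift @file_counters)` stops as soon as the shifted value is false, which is fine for 1..5 but would end the loop immediately for zero-based indices starting at 0. The following sketch tests the array itself instead. The helper name `unordered_pairs` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return every unordered pair of the given items
# as array references, without (x x) duplicates or reversed repeats.
sub unordered_pairs {
    my @items = @_;
    my @pairs;
    while (@items) {                 # test the array, not the shifted value,
        my $first = shift @items;    # so an item of 0 does not end the loop
        push @pairs, [ $first, $_ ] for @items;
    }
    return @pairs;
}

print "($_->[0] $_->[1])\n" for unordered_pairs( 0 .. 3 );
```

For the four indices 0..3 this yields the six pairs from (0 1) to (2 3).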


rushadrena
Novice

Sep 8, 2012, 1:32 PM

Post #40 of 62 (5354 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem.

Hi Laurent. I don't want to list all the files in each of the ten directories. "list" is actually the name of the file, and it is the only file I'm concerned about (a file named "list" resides in each of the ten directories). The code should select the first directory, take its "list" file, generate a matrix, and concatenate it with the matrix generated from the "list" file of the second directory. Pairwise matrices, that is.

Code
1->1 
1->2
1->3
1->4
.....
2->1
2->2
......


The debugger outputs:

Code
main::(code2.pl:26):	my $path = '.'; # (current directory - '.') or path to data files 
DB<1> n
main::(code2.pl:27): my @file = qw($dirs[$i]/list $dirs[$j]/list);
DB<1> n
main::(code2.pl:29): for my $file ( @file ) {
DB<1> n
main::(code2.pl:30): my %data;
DB<1> n
main::(code2.pl:31): my @substrate;
DB<1> n
main::(code2.pl:33): open my $fh, "<", "$file"
main::(code2.pl:34): or die "Unable to open $file for reading. $!";
DB<1> print @dirs
../xla./mgp./oaa./acs./spu./mdo./ssc./dre./xtr./gga
DB<2> print $dirs[1]
./xla



(This post was edited by rushadrena on Sep 8, 2012, 1:33 PM)


Laurent_R
Veteran / Moderator

Sep 9, 2012, 12:36 AM

Post #41 of 62 (5316 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

All right, I misunderstood what you were doing.

I do not understand: your error message does not correspond to your code.

The error message:


Code
Unable to open $dir[$i]/list


In the previous code, your list of directories is @dirs, not @dir.

But the code you had submitted said the directory is $path, which seems to be the current path.

It seems from the debugging session you've just posted that you've corrected these errors. Do you still have an error message when opening the file?

On another point, I don't see the point of comparing 1 with 1, and probably not of comparing both 1 with 2 and 2 with 1 either.


rushadrena
Novice

Sep 9, 2012, 2:26 AM

Post #42 of 62 (5312 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem.

Laurent, the problem is that the values of @dirs are displayed (as shown by the debugger output), but the "list" file inside each of these directories is not being read. So I'm still getting this error:

Code
Unable to open $dirs[$i]/list for reading. No such file or directory at code2.pl line 33.


And yes, there is no point in comparing 1 with 1 (and so on); that was just to explain the problem scenario to you.


Laurent_R
Veteran / Moderator

Sep 9, 2012, 2:36 AM

Post #43 of 62 (5309 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

Don't use qw but qq to create your list of files.

qw/foo bar . . ./ is the same as ('foo','bar',. . .)

Variables are not interpolated in this type of construct.

With qq/$dirs[$i] .../, it should work, since variables will be interpolated (and escape sequences will be processed).


rushadrena
Novice

Sep 9, 2012, 4:17 AM

Post #44 of 62 (5303 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem.

Laurent, that was a good point. So here's the relevant snippet from my code.

Code
my $path = '.'; # (current directory - '.') or path to data files 
my @file = qq($dirs[$i]/list $dirs[$j]/list );

for my $file ( @file ) {
my %data;
my @substrate;

open my $fh, "<", "$file"
or die "Unable to open $file for reading. $!";

Now the file names are interpolated, but there's another error:

Code
 Unable to open ./xla/list ./xla/list  for reading. No such file or directory at code2.pl line 35.


Debugging portion :-

Code
main::(code2.pl:27):	my $path = '.'; # (current directory - '.') or path to data files 
DB<1> n
main::(code2.pl:29): my @file = qq($dirs[$i]/list $dirs[$j]/list );
DB<1> n
main::(code2.pl:31): for my $file ( @file ) {
DB<1> n
main::(code2.pl:32): my %data;
DB<1> n
main::(code2.pl:33): my @substrate;
DB<1> n
main::(code2.pl:35): open my $fh, "<", "$file"
main::(code2.pl:36): or die "Unable to open $file for reading. $!";
DB<1> n
Unable to open ./xla/list ./xla/list for reading. No such file or directory at code2.pl line 35.
Debugged program terminated. Use q to quit or R to restart,

DB<2> print @file
./xla/list ./xla/list
DB<3> print $file[0]
./xla/list ./xla/list
DB<4> print $file[1]


The @file should be holding ./xla/list as $file[0] and the second file ./xla/list as $file[1].


Laurent_R
Veteran / Moderator

Sep 9, 2012, 5:56 AM

Post #45 of 62 (5291 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

Sorry, my error. There is an improvement, since the variables are now interpolated, but the two file names get stored in the first element of the array, because qq// makes a single string, not a list of strings.

Try this instead:


Code
my @file = ($dirs[$i]/list, $dirs[$j]/list );


Actually, you could also directly open "$dirs[$i]/list" and "$dirs[$j]/list", but if you find it more convenient to create a temporary array with the two files, then the syntax above should work.


rushadrena
Novice

Sep 9, 2012, 8:13 AM

Post #46 of 62 (5285 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem.

Tried this

Code
my @file = ($dirs[$i]/list, $dirs[$j]/list );

But got this error

Code
Bareword "list" not allowed while "strict subs" in use at code2.pl line 29. 
Bareword "list" not allowed while "strict subs" in use at code2.pl line 29.
Execution of code2.pl aborted due to compilation errors.



Laurent_R
Veteran / Moderator

Sep 9, 2012, 9:13 AM

Post #47 of 62 (5276 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

Sorry, an error again. It should be:


Code
my @file = ("$dirs[$i]/list", "$dirs[$j]/list" );
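
To see the three quoting forms from this exchange side by side, here is a small sketch. The contents of `@dirs` are sample values chosen for illustration, not the real directory list.

```perl
use strict;
use warnings;

my @dirs = ( './xla', './mgp' );    # sample values for illustration only
my ( $i, $j ) = ( 0, 1 );

my @from_qw   = qw($dirs[$i]/list $dirs[$j]/list);         # two words, no interpolation
my @from_qq   = qq($dirs[$i]/list $dirs[$j]/list);         # one interpolated string
my @from_list = ( "$dirs[$i]/list", "$dirs[$j]/list" );    # two interpolated strings

print scalar(@from_qw),   " element(s), first: $from_qw[0]\n";
print scalar(@from_qq),   " element(s), first: $from_qq[0]\n";
print scalar(@from_list), " element(s), first: $from_list[0]\n";
```

Only the last form gives two elements that are usable as file names: qw keeps the literal text `$dirs[$i]/list`, and qq produces a single string holding both paths.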



rushadrena
Novice

Sep 10, 2012, 7:24 AM

Post #48 of 62 (5244 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem.

Hi Laurent.
The code is finally running fine. The output file is huge, but that's not an issue, as the format is what I wanted.
But a strange error keeps popping up after 10 minutes of running the code (a complete run for 10x10 would take around 20 minutes):

Code
$ perl code2.pl >> 10_X_10.txt 
Use of uninitialized value within @file in concatenation (.) or string at code2.pl line 63.
Use of uninitialized value in concatenation (.) or string at code2.pl line 63.



Laurent_R
Veteran / Moderator

Sep 10, 2012, 8:12 AM

Post #49 of 62 (5225 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

Please tell us what line 63 of your code is.


rushadrena
Novice

Sep 10, 2012, 8:17 AM

Post #50 of 62 (5223 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem.

Here it is

Code
61 for my $i (0 .. $#matrix) { 
62 for my $j ($i+1 .. $#matrix) {
63 print "Processing $file[$i] and $file[$j]\n";
64 my %data = combine($matrix[$i], $matrix[$j]);
65 process(\@spr_prod, \@spr_substrate, %data);
66 }
67 }



Chris Charley
User

Sep 10, 2012, 9:44 AM

Post #51 of 62 (2958 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.


Quote
../xla./mgp./oaa./acs./spu./mdo./ssc./dre./xtr./gga

Your @file array has 11 entries beginning with '.'. So, you either need to remove that entry or access your files like below.

Code
print "Processing $file[$i+1] and $file[$j+1]\n";

I think that's your problem.


Laurent_R
Veteran / Moderator

Sep 10, 2012, 10:02 AM

Post #52 of 62 (2956 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

Quite obviously, there are some values of $i and/or $j that you are using for which @file is not defined.

Check how you filled the @file array (is it from 0 to 9, or perhaps from 1 to 10)?

I don't want to check in the code you posted, because you posted it quite a while ago and I assume it has changed quite a bit since.

Edit: I had not seen the answer by Chris when I typed the above. Maybe it is 0 to 10 or 1 to 11, after all. You should probably filter out the extra entry as soon as you read your directory.


(This post was edited by Laurent_R on Sep 10, 2012, 10:18 AM)


rushadrena
Novice

Sep 10, 2012, 2:42 PM

Post #53 of 62 (2940 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem.

Hi Laurent and Chris, I didn't see that coming. I have fixed that error by introducing this:

Code
splice(@dirs, 0, 1);
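
Removing the first element works as long as `find` happens to print "." first. An alternative that does not rely on the output order is to read the directory from Perl and skip "." explicitly. This is only a sketch; `subdirs_of` is a hypothetical helper name, and it also returns hidden directories if any exist.

```perl
use strict;
use warnings;

# Hypothetical helper: names of the immediate subdirectories of $base,
# sorted, with "." and ".." left out.
sub subdirs_of {
    my ($base) = @_;
    opendir my $dh, $base or die "Unable to open directory $base: $!";
    my @names = grep { $_ ne '.' && $_ ne '..' && -d "$base/$_" } readdir $dh;
    closedir $dh;
    return sort @names;
}

my @dirs = map { "./$_" } subdirs_of('.');
print "$_\n" for @dirs;
```

Sorting the names also makes the order of the directories predictable from one run to the next.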

Now I have this huge file matrix.txt (I have trimmed it to contain only 2x5 matrices for a concise view), which I have to process further, splitting and processing the blocks one by one:

Code
Processing  
10011
01010
Processing
11110
01010
Processing
11100
00110
Processing
10010
11001
Processing
11110
00001
Processing
11010
00110
Processing
10101
01010
Processing
00111
11100
Processing
11010
11111


What I want is to go through this list in groups of three.

1. After the first occurrence of the pattern "Processing" I have 2 lines. Save these 2 lines in a separate file. Save the size of this file in variable "a".

2. After the 2nd occurrence of "Processing", I have 2 lines. Save these 2 lines in a separate file. Save the size of this file in variable "b".

3. After the 3rd occurrence of "Processing", I have 2 lines. Save these 2 lines in a separate file. Save the size of this file in variable "c".
Now I'd perform the operation $a+$b/$c >> RESULT.txt.

Then select the next triplet, i.e. the 4th, 5th and 6th occurrences of "Processing", save the calculation to RESULT.txt, and so on.

[EDIT:] I did try these two commands, but they don't cater to the triplet selection:

Code
 perl -ne 'BEGIN{ $/="Processing"; } if(/^\s*(\S+)/){ open(F,">$1.out")||warn"$1 write failed:$!\n";chomp;print F "gi", $_ }'



Code
awk -vRS="Processing" '{ print $0 > "file"t++".out" }' matrix.txt
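
The triplet selection described above could be approached along these lines. This is a sketch under several assumptions: the "size" of each block is taken as the length in characters of its two lines, computed in memory rather than by writing separate files, and `$a+$b/$c` is read with Perl's normal precedence as a + (b / c). Both helper names are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text blocks that follow each line
# starting with "Processing".
sub blocks_after_marker {
    my ($text) = @_;
    my @blocks = split /^Processing[^\n]*\n/m, $text;
    shift @blocks;    # drop whatever precedes the first marker (often empty)
    return @blocks;
}

# Hypothetical helper: take blocks three at a time and compute
# size_a + size_b / size_c for each complete triplet.
sub triplet_results {
    my @sizes = map { length $_ } @_;
    my @results;
    while ( @sizes >= 3 ) {
        my ( $size_a, $size_b, $size_c ) = splice @sizes, 0, 3;
        next if $size_c == 0;    # avoid dividing by zero on an empty block
        push @results, $size_a + $size_b / $size_c;
    }
    return @results;
}

my $text = join '', map { "Processing\n10011\n01010\n" } 1 .. 6;
print "$_\n" for triplet_results( blocks_after_marker($text) );
```

With real data the text would be read from matrix.txt and the results appended to RESULT.txt. A trailing group of fewer than three blocks is ignored in this sketch.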



(This post was edited by rushadrena on Sep 10, 2012, 2:58 PM)


rushadrena
Novice

Sep 11, 2012, 6:10 AM

Post #54 of 62 (2885 views)
Re: [Chris Charley] A file parsing and 2D array/matrix problem.

Hi all,

I'm still getting this irritating error after 10 minutes of running the code, and it keeps popping up continuously:

Code
$ perl code2.pl >> 10x10 
Use of uninitialized value within @file in concatenation (.) or string at code2.pl line 61.
Use of uninitialized value in concatenation (.) or string at code2.pl line 61.
Use of uninitialized value within @file in concatenation (.) or string at code2.pl line 61.
Use of uninitialized value in concatenation (.) or string at code2.pl line 61.


The relevant debug code

Code
  DB<1> print @dirs 
./acs./dre./gga./mdo./mgp./oaa./spu./ssc./xla./xtr
DB<2> print $#dirs
9

BTW, here's the code

Code
#!/usr/bin/perl 
use strict;
use warnings;
use List::Util qw/ max /;

open my $fh, "<", 'SUPERLIST_PRODUCT' or die $!;
my $spr_prod = do {local $/; <$fh>};
close $fh or die $!;
my @spr_prod = $spr_prod =~ /\d+/g;

open $fh, "<", 'SUPERLIST_SUBSTRATE' or die $!;
my $spr_substr = do {local $/; <$fh>};
close $fh or die $!;
my @spr_substrate = $spr_substr =~ /\d+/g;
my $xx=0;
my $zz=0;
my @matrix;
my @file;
my @dirs = split /\n/,`find -maxdepth 1 -type d`;
splice(@dirs, 0, 1);
for ($xx=0;$xx<=9;$xx++)
{
for ($zz=0;$zz<=9;$zz++)

{
my $path = '.'; # (current directory - '.') or path to data files
my @file = ("$dirs[$xx]/irrev_rev_revdup", "$dirs[$zz]/irrev_rev_revdup" );
for my $file ( @file ) {
my %data;
my @substrate;

print "#########################$file###################################";
open my $fh, "<", "$file"
or die "Unable to open $file for reading. $!";

while (<$fh>) {
if (/^substrate/) {
@substrate = /\d+/g;
}
elsif (/^product/) {
while (/(\d+)/g) {
for my $sub (@substrate) {
$data{$sub}{$1} = 1 ;
}
}
}
else {
die "Unknown format $file. $!";
}
}
close $fh or die "Unable to close $file. $!";

print "Processing file: $file\n";
process(\@spr_prod, \@spr_substrate, %data);

push @matrix, \%data;
}
}}
for my $i (0 .. $#matrix) {
for my $j ($i+1 .. $#matrix) {
print "Processing $file[$i] and $file[$j]\n";
my %data = combine($matrix[$i], $matrix[$j]);
process(\@spr_prod, \@spr_substrate, %data);
}
}

sub process {
my ($spr_prod, $spr_subst, %data) = @_;
my %seen;

my @product = sort {$a <=> $b}
grep ! $seen{$_}++,
@$spr_prod, map keys %$_, values %data;

# to get column width for print
my $wid = 1 + max map length, @product;

printf "%7s" . "%${wid}s" x @product . "\n", 'prod->', @product;

undef %seen;
my @substrate = sort {$a <=> $b}
grep ! $seen{$_}++,
@$spr_subst, keys %data;

for my $substrate (@substrate) {
printf "%7s", $substrate;
printf "%${wid}s", $data{$substrate}{$_} || '-' for @product;
print "\n";
}
printf "\n%5s\n%5s\n%s\n\n", '^', '|', 'substrate';
}


sub combine {
my ($matrix1, $matrix2) = @_;
my %new_hash = %$matrix1;

for my $substrate (keys %$matrix2) {
$new_hash{$substrate}{$_} = 1 for keys %{ $matrix2->{$substrate} };
}
return %new_hash;
}
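
One possible reading of the warning, offered as a guess from the posted code rather than a confirmed diagnosis: the script declares `my @file;` near the top and then `my @file = (...)` again inside the nested loops. The inner declaration creates a separate variable, so the outer `@file` that is read in the final loop never receives any entries. A minimal sketch of that effect, with made-up file names:

```perl
use strict;
use warnings;

my @file;    # outer array, as declared near the top of the script

for my $name (qw(a b)) {
    my @file = ("./$name/list");    # inner "my" creates a separate array
    print "inside loop: @file\n";
}

# The outer array was never filled.
print "after loop: ", scalar(@file), " element(s)\n";
```

If that is what happens here, any `$file[$i]` read after the loops is undefined, which would match the "uninitialized value within @file" message.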



rushadrena
Novice

Sep 11, 2012, 8:31 AM

Post #55 of 62 (2876 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem.

Laurent, there is another problem with the code. I tried to test the code on a smaller set of directories and files. For example, I created 10 directories a,b,c,d,e,f,g,h,i,j inside a directory named test. Each of these directories has a file named "irrev_rev_revdup".
The contents of each "irrev_rev_revdup" are the same (for checking purposes):

substrate[s]: 1 9
product[s]: 10
substrate[s]: 2 7
product[s]: 8 3
substrate[s]: 3
product[s]: 4
substrate[s]: 2
product[s]: 5
substrate[s]: 6
product[s]: 7
substrate[s]: 9
product[s]: 3 8
substrate[s]: 3 5
product[s]: 3 9
substrate[s]: 10
product[s]: 9 1
substrate[s]: 7
product[s]: 2 6
substrate[s]: 3
product[s]: 8 7

Now inside the test directory I created two files, SUPERLIST_PRODUCT and SUPERLIST_SUBSTRATE. The contents of both these files:

1
2
3
4
5
6
7
8
9
10

So the test directory has these files and directories:-
a b c d e f g h i j SUPERLIST_PRODUCT SUPERLIST_SUBSTRATE code2.pl

Now after running code2.pl, logically there should be 100 concatenated (combined) matrices because there are 10 directories. But I'm getting 19900 combined matrices.

Can you point out what the flaw could be?


Chris Charley
User

Sep 11, 2012, 8:49 AM

Post #56 of 62 (2875 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

Do you want a matrix of each individual file along with the combined pairs of matrices, OR just the combined pairs (without each individual file matrix)?


rushadrena
Novice

Sep 11, 2012, 9:08 AM

Post #57 of 62 (2872 views)
Re: [Chris Charley] A file parsing and 2D array/matrix problem.

Chris, yes, I want a matrix of each individual file along with the combined pairs of matrices.
Suppose I have 5 files; then I would like to have a matrix for each of these files (a total of 5, that is), and also for the combined pairs (thus 5x5 = a total of 25, that is).


Laurent_R
Veteran / Moderator

Sep 11, 2012, 10:27 AM

Post #58 of 62 (2865 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

Hi,

I think your problem is probably there:


Code
for my $i (0 .. $#matrix) {  
for my $j ($i+1 .. $#matrix) {


If I understand what you are doing, it should probably be:


Code
for my $i (0 .. $#dirs) {  
for my $j ($i+1 .. $#dirs) {
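
The figure of 19900 can be checked with a little arithmetic, assuming the posted script is unchanged: the 10 x 10 nested loop pushes two matrices per pass, so `@matrix` ends up with 200 entries, and the loop over `$#matrix` then visits every unordered pair of those 200. A short sketch of the count (`pair_count` is a made-up helper):

```perl
use strict;
use warnings;

# Number of unordered pairs among $n items: n * (n - 1) / 2.
sub pair_count {
    my ($n) = @_;
    return $n * ( $n - 1 ) / 2;
}

my $dirs   = 10;
my $pushes = $dirs * $dirs * 2;    # 10 x 10 passes, two files pushed per pass

print pair_count($pushes), "\n";   # 19900
print pair_count($dirs),   "\n";   # 45
```

Looping over `$#dirs` instead limits the count to the pairs among the ten directories, but it does not by itself change what `@matrix` contains at each index.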



rushadrena
Novice

Sep 11, 2012, 1:14 PM

Post #59 of 62 (2854 views)
Re: [Laurent_R] A file parsing and 2D array/matrix problem.

Laurent,
I implemented the suggested corrections, and now 45 combined matrices are generated (9+8+7+...+1).
But the combined matrices are not represented correctly:

Code
#########################./a/irrev_rev_revdup#######################Processing file: ./a/irrev_rev_revdup 
prod-> 1 2 3 4 5 6 7 8 9 10
1 - - - - - - - - - 1
2 - - 1 - 1 - - 1 - -
3 - - 1 1 - - 1 1 1 -
4 - - - - - - - - - -
5 - - 1 - - - - - 1 -
6 - - - - - - 1 - - -
7 - 1 1 - - 1 - 1 - -
8 - - - - - - - - - -
9 - - 1 - - - - 1 - 1
10 1 - - - - - - - 1 -

^
|
substrate

#########################./d/irrev_rev_revdup######################Processing file: ./d/irrev_rev_revdup
prod-> 1 2 3 4 5 6 7 8 9 10
1 - - - - 1 - - - - 1
2 - - 1 - 1 - - 1 - -
3 - - 1 1 1 - 1 1 1 -
4 - - - - - - - - - -
5 - - 1 - 1 - - - 1 -
6 - - - 1 1 - 1 1 - -
7 - 1 1 - - 1 - 1 - -
8 - - - - - - - - - -
9 - - 1 - - - - 1 - 1
10 1 - - - - - - - 1 -


Now the combined matrix of both is here, and it's incorrect (it should be displaying "Processing ./a and ./b" rather than "Processing ./a and ./d"):

Code
 Processing ./a and ./d 
prod-> 1 2 3 4 5 6 7 8 9 10
1 1 - 1 - - - - 1 - 1
2 1 1 1 - 1 - 1 1 - -
3 1 1 1 1 - - 1 1 1 -
4 - - - - - - - - - -
5 1 1 1 1 - - - - 1 -
6 - 1 - - - - 1 - - -
7 - 1 1 - - 1 1 1 - -
8 - - - - - - - - - -
9 1 - 1 - - - - 1 - 1
10 1 - - - - - - - 1 -

^
|
substrate


The above combined matrix should be for these two ("a" and "b" rather than "a" & "d"):-

Code
#########################./a/irrev_rev_revdup######################Processing file: ./a/irrev_rev_revdup 
prod-> 1 2 3 4 5 6 7 8 9 10
1 - - - - - - - - - 1
2 - - 1 - 1 - - 1 - -
3 - - 1 1 - - 1 1 1 -
4 - - - - - - - - - -
5 - - 1 - - - - - 1 -
6 - - - - - - 1 - - -
7 - 1 1 - - 1 - 1 - -
8 - - - - - - - - - -
9 - - 1 - - - - 1 - 1
10 1 - - - - - - - 1 -

^
|
substrate

#########################./b/irrev_rev_revdup#####################Processing file: ./b/irrev_rev_revdup
prod-> 1 2 3 4 5 6 7 8 9 10
1 1 - 1 - - - - 1 - -
2 1 1 - - 1 - 1 - - -
3 1 1 1 1 - - 1 1 1 -
4 - - - - - - - - - -
5 1 1 1 1 - - - - 1 -
6 - 1 - - - - 1 - - -
7 - 1 - - - 1 1 - - -
8 - - - - - - - - - -
9 1 - 1 - - - - 1 - -
10 1 - - - - - - - 1 -

Similarly, none of the 45 matrices is displayed with the proper name combination.
Relevant code for displaying combination names:

Code
for my $i (0 .. $#dirs) { 
for my $j ($i+1 .. $#dirs) {
print "Processing $dirs[$i] and $dirs[$j]\n";
my %data = combine($matrix[$i], $matrix[$j]);
process(\@spr_prod, \@spr_substrate, %data);
}
}
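
The mislabelling would be consistent with `@matrix` still being filled twice per pass of the 10 x 10 loop, so that `$matrix[$i]` no longer belongs to `$dirs[$i]`. A sketch of one way to keep the two arrays aligned is to parse each directory's file exactly once. It works on in-memory sample lines so that it runs without the data files, and the names `parse_lines` and `combine_fresh` are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: build one matrix (hash of hashes) from the lines of a file.
sub parse_lines {
    my @lines = @_;
    my ( %data, @substrate );
    for my $line (@lines) {
        if ( $line =~ /^substrate/ ) {
            @substrate = $line =~ /\d+/g;
        }
        elsif ( $line =~ /^product/ ) {
            for my $product ( $line =~ /\d+/g ) {
                $data{$_}{$product} = 1 for @substrate;
            }
        }
    }
    return \%data;
}

# Hypothetical helper: merge matrices into a new one without touching the inputs.
sub combine_fresh {
    my @matrices = @_;
    my %combined;
    for my $m (@matrices) {
        for my $substrate ( keys %$m ) {
            $combined{$substrate}{$_} = 1 for keys %{ $m->{$substrate} };
        }
    }
    return \%combined;
}

# Sample input standing in for the files in ./a, ./b and ./c.
my @dirs  = qw(./a ./b ./c);
my %lines = (
    './a' => [ "substrate: 1\n", "product: 2\n" ],
    './b' => [ "substrate: 1\n", "product: 3\n" ],
    './c' => [ "substrate: 2\n", "product: 1\n" ],
);

# One matrix per directory, stored in the same order as @dirs.
my @matrix;
push @matrix, parse_lines( @{ $lines{$_} } ) for @dirs;

for my $i ( 0 .. $#dirs ) {
    for my $j ( $i + 1 .. $#dirs ) {
        my $combined = combine_fresh( $matrix[$i], $matrix[$j] );
        my @cells;
        for my $s ( sort { $a <=> $b } keys %$combined ) {
            push @cells, "s${s}-p$_" for sort { $a <=> $b } keys %{ $combined->{$s} };
        }
        print "Combining $dirs[$i] and $dirs[$j]: @cells\n";
    }
}
```

In the real script, `parse_lines` would receive the lines read from each directory's file, and the printing would go through `process` as before. Printing the two individual matrices just before each combined one would then be a matter of calling `process` on `$matrix[$i]` and `$matrix[$j]` inside the inner loop.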


EDIT: It would be really helpful if each of the concatenated matrices were printed after the two matrices from which it has been combined.


(This post was edited by rushadrena on Sep 11, 2012, 10:13 PM)


rushadrena
Novice

Sep 12, 2012, 12:16 AM

Post #60 of 62 (2817 views)
Re: [Chris Charley] A file parsing and 2D array/matrix problem.

Hi Chris,
I modified your code to cater to my requirements (see parts in italics):

Code
#!/usr/bin/perl 
use strict;
use warnings;
use List::Util qw/ max /;

open my $fh, "<", 'SUPERLIST_PRODUCT' or die $!;
my $spr_prod = do {local $/; <$fh>};
close $fh or die $!;
my @spr_prod = $spr_prod =~ /\d+/g;

open $fh, "<", 'SUPERLIST_SUBSTRATE' or die $!;
my $spr_substr = do {local $/; <$fh>};
close $fh or die $!;
my @spr_substrate = $spr_substr =~ /\d+/g;
my $xx=0;
my $zz=0;
my $i=0;
my $j=0;

my @matrix;
my @file;
my @dirs = split /\n/,`find -maxdepth 1 -type d`;
splice(@dirs, 0, 1);
for ($xx=0;$xx<=9;$xx++)
{
    for ($zz=0;$zz<=9;$zz++)
    {
        my $path = '.'; # (current directory - '.') or path to data files
        my @file = ("$dirs[$xx]/irrev_rev_revdup", "$dirs[$zz]/irrev_rev_revdup" );
        for my $file ( @file ) {
            my %data;
            my @substrate;

            print "#########################$file###################################";
            open my $fh, "<", "$file"
                or die "Unable to open $file for reading. $!";

            while (<$fh>) {
                if (/^substrate/) {
                    @substrate = /\d+/g;
                }
                elsif (/^product/) {
                    while (/(\d+)/g) {
                        for my $sub (@substrate) {
                            $data{$sub}{$1} = 1 ;
                        }
                    }
                }
                else {
                    die "Unknown format $file. $!";
                }
            }
            close $fh or die "Unable to close $file. $!";

            print "Processing file: $file\n";
            process(\@spr_prod, \@spr_substrate, %data);

            push @matrix, \%data;

        }
        my %dat = combine($matrix[$i], $matrix[$j]);
        print "Combined matrix $file[$i] and $file[$j]";
        process(\@spr_prod, \@spr_substrate, %dat);

    }
}


sub process {
    my ($spr_prod, $spr_subst, %data) = @_;
    my %seen;

    my @product = sort {$a <=> $b}
                  grep ! $seen{$_}++,
                  @$spr_prod, map keys %$_, values %data;

    # to get column width for print
    my $wid = 1 + max map length, @product;

    printf "%7s" . "%${wid}s" x @product . "\n", 'prod->', @product;

    undef %seen;
    my @substrate = sort {$a <=> $b}
                    grep ! $seen{$_}++,
                    @$spr_subst, keys %data;

    for my $substrate (@substrate) {
        printf "%7s", $substrate;
        printf "%${wid}s", $data{$substrate}{$_} || '-' for @product;
        print "\n";
    }
    printf "\n%5s\n%5s\n%s\n\n", '^', '|', 'substrate';
}


sub combine {
    my ($matrix1, $matrix2) = @_;
    my %new_hash = %$matrix1;

    for my $substrate (keys %$matrix2) {
        $new_hash{$substrate}{$_} = 1 for keys %{ $matrix2->{$substrate} };
    }
    return %new_hash;
}
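
One detail of `combine` worth checking: `my %new_hash = %$matrix1;` copies only the top level, so each inner product hash is still shared with `$matrix1`, and the assignment `$new_hash{$substrate}{$_} = 1` can write into the first input matrix. The sketch below contrasts the posted sub with a variant that builds fresh inner hashes. The name `combine_copy` and the two tiny matrices are hypothetical, for illustration only.

```perl
use strict;
use warnings;

# combine() as posted: the top-level copy still shares the inner hashes
sub combine {
    my ($matrix1, $matrix2) = @_;
    my %new_hash = %$matrix1;
    for my $substrate (keys %$matrix2) {
        $new_hash{$substrate}{$_} = 1 for keys %{ $matrix2->{$substrate} };
    }
    return %new_hash;
}

# Hypothetical variant: builds fresh inner hashes, so the inputs stay untouched
sub combine_copy {
    my ($matrix1, $matrix2) = @_;
    my %new_hash;
    for my $matrix ($matrix1, $matrix2) {
        for my $substrate (keys %$matrix) {
            $new_hash{$substrate} ||= {};
            $new_hash{$substrate}{$_} = 1 for keys %{ $matrix->{$substrate} };
        }
    }
    return %new_hash;
}

# Made-up matrices for illustration
my %m1 = (1 => { 10 => 1 });
my %m2 = (1 => { 3 => 1 });

my %safe = combine_copy(\%m1, \%m2);
my $products_after_copy = join ' ', sort { $a <=> $b } keys %{ $m1{1} };

my %shared = combine(\%m1, \%m2);
my $products_after_posted = join ' ', sort { $a <=> $b } keys %{ $m1{1} };

print "m1 row 1 after combine_copy: $products_after_copy\n";
print "m1 row 1 after combine:      $products_after_posted\n";
```

If the stored matrices are reused for several combinations, entries added while combining one pair would then carry over into later pairs, which may be one reason a combined matrix shows more 1s than expected.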

And here's a snippet from the output :-

Code
#########################./a/irrev_rev_revdup######################Processing file: ./a/irrev_rev_revdup 
prod-> 1 2 3 4 5 6 7 8 9 10
1 - - - - - - - - - 1
2 - - 1 - 1 - - 1 - -
3 - - 1 1 - - 1 1 1 -
4 - - - - - - - - - -
5 - - 1 - - - - - 1 -
6 - - - - - - 1 - - -
7 - 1 1 - - 1 - 1 - -
8 - - - - - - - - - -
9 - - 1 - - - - 1 - 1
10 1 - - - - - - - 1 -

^
|
substrate

#########################./b/irrev_rev_revdup#####################Processing file: ./b/irrev_rev_revdup
prod-> 1 2 3 4 5 6 7 8 9 10
1 1 - 1 - - - - 1 - -
2 1 1 - - 1 - 1 - - -
3 1 1 1 1 - - 1 1 1 -
4 - - - - - - - - - -
5 1 1 1 1 - - - - 1 -
6 - 1 - - - - 1 - - -
7 - 1 - - - 1 1 - - -
8 - - - - - - - - - -
9 1 - 1 - - - - 1 - -
10 1 - - - - - - - 1 -

^
|
substrate
Combined matrix ./a/irrev_rev_revdup and ./a/irrev_rev_revdup
prod-> 1 2 3 4 5 6 7 8 9 10
1 - - - - - - - - - 1
2 - - 1 - 1 - - 1 - -
3 - - 1 1 - - 1 1 1 -
4 - - - - - - - - - -
5 - - 1 - - - - - 1 -
6 - - - - - - 1 - - -
7 - 1 1 - - 1 - 1 - -
8 - - - - - - - - - -
9 - - 1 - - - - 1 - 1
10 1 - - - - - - - 1 -

^
|
substrate

#########################./a/irrev_rev_revdup####################Processing file: ./a/irrev_rev_revdup
prod-> 1 2 3 4 5 6 7 8 9 10
1 - - - - - - - - - 1
2 - - 1 - 1 - - 1 - -
3 - - 1 1 - - 1 1 1 -
4 - - - - - - - - - -
5 - - 1 - - - - - 1 -
6 - - - - - - 1 - - -
7 - 1 1 - - 1 - 1 - -
8 - - - - - - - - - -
9 - - 1 - - - - 1 - 1
10 1 - - - - - - - 1 -

^
|
substrate

#########################./c/irrev_rev_revdup###################Processing file: ./c/irrev_rev_revdup
prod-> 1 2 3 4 5 6 7 8 9 10
1 - - - - - - - - - 1
2 - - 1 - 1 - - 1 - -
3 1 - 1 1 1 - 1 1 1 -
4 - - - - - - - - - -
5 1 - 1 - 1 - 1 1 1 -
6 - - - - - - 1 - - -
7 1 1 1 - 1 1 1 1 - -
8 - - - - - - - - - -
9 - - 1 - - - - 1 - 1
10 1 1 - 1 1 - - - 1 -

^
|
substrate
Combined matrix ./a/irrev_rev_revdup and ./a/irrev_rev_revdup
prod-> 1 2 3 4 5 6 7 8 9 10
1 - - - - - - - - - 1
2 - - 1 - 1 - - 1 - -
3 - - 1 1 - - 1 1 1 -
4 - - - - - - - - - -
5 - - 1 - - - - - 1 -
6 - - - - - - 1 - - -
7 - 1 1 - - 1 - 1 - -
8 - - - - - - - - - -
9 - - 1 - - - - 1 - 1
10 1 - - - - - - - 1 -

As you can see, the concatenated matrix isn't correct in any of the cases.
I hope this makes sense; the output should have looked like this:
Processing file: ./a/irrev_rev_revdup
Processing file: ./b/irrev_rev_revdup
Combined matrix ./a/irrev_rev_revdup and ./b/irrev_rev_revdup (along with the correct concatenation)
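
A note on why the label always reads "a and a" in the script as posted: `$i` and `$j` are set to 0 once and never changed, so `$file[$i]` and `$file[$j]` are both the first file of the pair, and `combine($matrix[$i], $matrix[$j])` always combines the very first matrix with itself, while `@matrix` keeps growing on every iteration. A minimal sketch of one possible fix follows: take the two matrices that were just pushed and label them with `$file[0]` and `$file[1]`. The hash with a `name` key is only a placeholder for the parsed `%data`, and the list of pairs is made up.

```perl
use strict;
use warnings;

my @matrix;    # grows across iterations, as in the posted script
my @labels;

for my $pair ([qw(a a)], [qw(a b)], [qw(a c)]) {
    my @file = map { "./$_/irrev_rev_revdup" } @$pair;

    # Placeholder for parsing: the real script pushes \%data here
    push @matrix, { name => $_ } for @file;

    # The two matrices just read are the last two elements
    my ($first, $second) = @matrix[-2, -1];

    push @labels, "Combined matrix $file[0] and $file[1]"
        . " (using $first->{name} and $second->{name})";
}
print "$_\n" for @labels;
```

The two nested `0 .. 9` loops also visit all 100 ordered pairs, including each directory paired with itself; starting the inner loop at `$xx + 1` would leave the 45 distinct pairs.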


Laurent_R
Veteran / Moderator

Sep 12, 2012, 4:33 AM

Post #61 of 62 (2807 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

Granted:


Code
 Processing ./a and ./d   
pro -> 1 2 3 4 5 6 7 8 9 10
1 1 - 1 - - - - 1 - 1
2 1 1 1 - 1 - 1 1 - -
3 1 1 1 1 - - 1 1 1 -
4 - - - - - - - - - -
5 1 1 1 1 - - - - 1 -
6 - 1 - - - - 1 - - -
7 - 1 1 - - 1 1 1 - -
8 - - - - - - - - - -
9 1 - 1 - - - - 1 - 1
10 1 - - - - - - - 1 -


is not a correct combination of the matrices a and b you have shown. But is it a proper combination of a and d?


Laurent_R
Veteran / Moderator

Sep 12, 2012, 1:46 PM

Post #62 of 62 (2799 views)
Re: [rushadrena] A file parsing and 2D array/matrix problem.

Hi Rushadrena,

I am taking data from your post #59. I took a bit more time to look at your data.

I looked at your matrices A and A combined with (B or D).



Code
#########################./a/irrev_rev_revdup#######################Processing file: ./a/irrev_rev_revdup  
prod-> 1 2 3 4 5 6 7 8 9 10
1 - - - - - - - - - 1
2 - - 1 - 1 - - 1 - -
3 - - 1 1 - - 1 1 1 -
4 - - - - - - - - - -
5 - - 1 - - - - - 1 -
6 - - - - - - 1 - - -
7 - 1 1 - - 1 - 1 - -
8 - - - - - - - - - -
9 - - 1 - - - - 1 - 1
10 1 - - - - - - - 1 -


Processing ./a and ./d
prod-> 1 2 3 4 5 6 7 8 9 10
1 1 - 1 - - - - 1 - 1
2 1 1 1 - 1 - 1 1 - -
3 1 1 1 1 - - 1 1 1 -
4 - - - - - - - - - -
5 1 1 1 1 - - - - 1 -
6 - 1 - - - - 1 - - -
7 - 1 1 - - 1 1 1 - -
8 - - - - - - - - - -
9 1 - 1 - - - - 1 - 1
10 1 - - - - - - - 1 -

^
|
substrate


The result looks consistent: any place where you have a 1 in matrix A, you also have a 1 in the matrix supposed to be a combination of A and (B or D).

In other words, your combination actually looks like a correct combination of Matrix A with some other matrix (possibly B or D).
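
The consistency check described here (every 1 in matrix A must also be a 1 in the combination) can be automated. A minimal sketch, assuming the same hash-of-hashes layout as in the script; `missing_cells` is a hypothetical helper, and the two small matrices are made up rather than taken from the posts.

```perl
use strict;
use warnings;

# Returns the "substrate/product" cells set in $part but absent from $whole
sub missing_cells {
    my ($part, $whole) = @_;
    my @missing;
    for my $s (sort { $a <=> $b } keys %$part) {
        for my $p (sort { $a <=> $b } keys %{ $part->{$s} }) {
            push @missing, "$s/$p" unless $whole->{$s} && $whole->{$s}{$p};
        }
    }
    return @missing;
}

# Made-up data: the combination lacks cell 2/5, which matrix A has
my %matrix_a = (1 => { 10 => 1 }, 2 => { 3 => 1, 5 => 1 });
my %combined = (1 => { 1 => 1, 10 => 1 }, 2 => { 3 => 1 });

my @missing = missing_cells(\%matrix_a, \%combined);
print @missing ? "missing: @missing\n" : "consistent\n";
```

An empty result only shows that A is contained in the combination. Confirming which second matrix was used would need the same check run with that matrix as well.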

If the caption says A and D, why do you claim it is in fact A and B and that the result is false? Maybe it is actually A and D, and maybe it is after all correct. What makes you think that this combination is actually A and B?

Since you haven't given us Matrix D, I can't go much further into the analysis, but I do not see at this point anything that can lead to the conclusion that the combination matrix is wrong.

Please provide matrix D from your test, so that I can figure out for myself whether this is right or not. For the time being, all I can say is that the combination matrix definitely looks like a combination of matrix A with another matrix.

 
 

