converting hash to AoH

 



zohman
Novice

Jul 24, 2017, 2:16 AM

Post #1 of 18 (4777 views)

Hi,

I have a scenario where I need to convert a hash to JSON format,
but to achieve that I first need to convert the hash I receive into an AoH (array of hashes).

My example hash (Dumper output of %hash):

Code
$VAR1 = { 
'user1' => {
'Type' => 'A',
'Data' => 'data',
'Comment' => 'text text text'
},
'user2' => {
'Type' => 'A',
'Data' => 'data',
'CustomField1' => 'unknown'
},
'user3' => {
'Type' => 'A',
'Data' => 'data',
'Comment' => 'text text text'
},
'user4' => {
'Type' => 'A',
'Data' => 'data'
}
};


The AoH that I need from the above hash is:


Code
$VAR1 = [ 
{
'Name' => 'user1',
'Type' => 'A',
'Data' => 'data',
'Comment' => 'text text text'
},
{
'Name' => 'user2',
'Type' => 'A',
'Data' => 'data',
'CustomField1' => 'unknown'
},
{
'Name' => 'user3',
'Type' => 'A',
'Data' => 'data',
'Comment' => 'text text text'
},
{
'Name' => 'user4',
'Type' => 'A',
'Data' => 'data'
}
];



So each top-level 'userN' key of the hash becomes the value of a 'Name' key inside the corresponding inner hash.

Is there an effective method to achieve that?

thanks.


zohman
Novice

Jul 24, 2017, 2:42 AM

Post #2 of 18 (4775 views)
Re: [zohman] converting hash to AoH

Okay, I did this, and it works great:


Code
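my @AoH;
foreach my $name ( keys %hash ) {
    # Collect the inner hash's key/value pairs as a flat list.
    my @t;
    foreach ( keys %{ $hash{$name} } ) {
        push @t, $_ => $hash{$name}{$_};
    }
    # 'Name' (capitalized) matches the key in the desired output.
    push @AoH, { Name => $name, @t };
}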


Regards,


BillKSmith
Veteran

Jul 24, 2017, 5:52 AM

Post #3 of 18 (4769 views)
Re: [zohman] converting hash to AoH

Your solution is certainly correct. However, I believe that the use of 'map' makes the intent much clearer. Also note that the use of 'sort' puts the array elements in the expected order.


Code
use strict; 
use warnings;
use Data::Dumper;
#post 84260
my %HoH = (
'user1' => {
'Type' => 'A',
'Data' => 'data',
'Comment' => 'text text text'
},
'user2' => {
'Type' => 'A',
'Data' => 'data',
'CustomField1' => 'unknown'
},
'user3' => {
'Type' => 'A',
'Data' => 'data',
'Comment' => 'text text text'
},
'user4' => {
'Type' => 'A',
'Data' => 'data'
}
);

# The unary plus marks the inner braces as an anonymous hash, not a block.
my @AoH = map { +{ Name => $_, %{ $HoH{$_} } } } sort keys %HoH;


print Dumper(\@AoH);

OUTPUT:
$VAR1 = [
{
'Comment' => 'text text text',
'Name' => 'user1',
'Type' => 'A',
'Data' => 'data'
},
{
'Name' => 'user2',
'CustomField1' => 'unknown',
'Data' => 'data',
'Type' => 'A'
},
{
'Type' => 'A',
'Data' => 'data',
'Comment' => 'text text text',
'Name' => 'user3'
},
{
'Name' => 'user4',
'Type' => 'A',
'Data' => 'data'
}
];
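
Since the original goal was JSON, here is a minimal sketch of that last step. It assumes the core JSON::PP module (the thread does not say which JSON module is in use) and uses a two-entry subset of the sample data; the variable names are only illustrative.

```perl
use strict;
use warnings;
use JSON::PP;    # assumption: core JSON::PP; the thread names no JSON module

# Two-entry subset of the sample data from this thread.
my %HoH = (
    user1 => { Type => 'A', Data => 'data', Comment      => 'text text text' },
    user2 => { Type => 'A', Data => 'data', CustomField1 => 'unknown' },
);

# Same idea as the map above: copy each inner hash and add the outer
# key under 'Name'.
my @AoH = map { +{ Name => $_, %{ $HoH{$_} } } } sort keys %HoH;

# canonical() sorts the hash keys, so the JSON text is reproducible.
my $json = JSON::PP->new->canonical->encode( \@AoH );
print "$json\n";
```

The encoder is given a reference to the array (\@AoH), which is why the result is a JSON array of objects.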

Good Luck,
Bill


zohman
Novice

Jul 24, 2017, 9:49 AM

Post #4 of 18 (4767 views)
Re: [BillKSmith] converting hash to AoH

Sorting is not a concern, but it looks much more elegant,
so I'll embrace that. Thanks.


Mr Keystrokes
Novice

Jul 25, 2017, 4:50 AM

Post #5 of 18 (4751 views)
Re: [BillKSmith] converting hash to AoH

This is slightly related...I wanted to confirm, is the following a hash of arrays?



Code
 
$VAR123 = 'hsa_circ_0024017|chr11:93463035-93463135+|NM_033395|KIAA1731 FORWARD';
$VAR124 = [
{
'energy' => '-4.3',
'spacer' => 'AGGCACC',
'end' => '97',
'start' => '81'
}
];
$VAR125 = 'hsa_circ_0067224|chr3:128345575-128345675-|NM_002950|RPN1 FORWARD';
$VAR126 = [
{
'energy' => '-4.4',
'spacer' => 'CAGT',
'end' => '17',
'start' => '6'
},
{
'energy' => '-4.1',
'spacer' => 'GTT',
'end' => '51',
'start' => '26'
},
{
'energy' => '-4.1',
'spacer' => 'TTG',
'end' => '53',
'start' => '28'
}
];



BillKSmith
Veteran

Jul 25, 2017, 7:05 AM

Post #6 of 18 (4728 views)
Re: [Mr Keystrokes] converting hash to AoH

The variables $VAR123 and $VAR125 are scalars. Each contains a single character string. These strings appear to contain pipe (|) delimited fields.

The variables $VAR124 and $VAR126 are each a reference to an array of hashes. The array which $VAR124 refers to contains only one element (a reference to a hash). The other array contains three elements.

Please refer to:

Code
>perldoc perldsc

for a tutorial on complex data structures. You probably should also read all the documents mentioned in its 'SEE ALSO' section.
Good Luck,
Bill


Chris Charley
User

Jul 25, 2017, 9:41 AM

Post #7 of 18 (4721 views)
Re: [Mr Keystrokes] converting hash to AoH

Yes, that is a hash of arrays of hashes but it doesn't display that way in your output because the hash was used for Dumper instead of a reference to that hash.


Code
print Dumper %some_hash


To see the structure, use a reference.


Code
print Dumper \%some_hash
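
To make the difference concrete, here is a small sketch with a made-up one-key hash (the key 'seq1' and its contents are hypothetical, not taken from the real data). It captures both forms of the Dumper call as strings so they can be compared.

```perl
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;    # stable key order in the dumps

# Hypothetical one-key hash of arrays of hashes.
my %some_hash = ( seq1 => [ { spacer => 'CAGT', energy => '-4.4' } ] );

# Passing the hash itself flattens it into a (key, value) list, so
# Dumper receives two arguments and prints them as $VAR1 and $VAR2.
my $flat = Dumper(%some_hash);

# Passing a reference gives Dumper one argument: a single nested $VAR1.
my $nested = Dumper( \%some_hash );

print "Flattened:\n$flat\nAs a reference:\n$nested";
```

With more keys, the flattened form keeps counting ($VAR3, $VAR4, ...), which is how the $VAR123 to $VAR126 numbering in the earlier post came about.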



(This post was edited by Chris Charley on Jul 25, 2017, 9:49 AM)


Mr Keystrokes
Novice

Jul 25, 2017, 1:14 PM

Post #8 of 18 (4710 views)
Re: [BillKSmith] converting hash to AoH

Thanks for the link, but now I'm unsure as to which type of 'multidimensional' data structure I should use.
I wish to be able to refer to specific hashes and compare them to equivalent hashes within the same array. And all within loops may I add.


Laurent_R
Veteran / Moderator

Jul 25, 2017, 1:34 PM

Post #9 of 18 (4708 views)
Re: [Mr Keystrokes] converting hash to AoH

I don't understand your question. You already have a hash of arrays of hashes, that's probably what you should use.

Update: well, sometimes you need to reorganize your data structure for the purposes of your processing, but there is nothing in what you say that would seem to make this necessary.


(This post was edited by Laurent_R on Jul 25, 2017, 1:36 PM)


Mr Keystrokes
Novice

Jul 26, 2017, 2:38 AM

Post #10 of 18 (4688 views)
Re: [Chris Charley] converting hash to AoH

Oh yeah, good point.


Mr Keystrokes
Novice

Jul 26, 2017, 12:37 PM

Post #11 of 18 (4681 views)
Re: [Mr Keystrokes] converting hash to AoH

In my script I have created a hash (1st level) of arrays (2nd level) of hashes (3rd level). Now, for each key of the 1st-level hash, I need to loop through its 2nd-level array and compare the 3rd-level hashes with each other, i.e. compare the spacers of all the hairpins stored under that particular key.


Code
#!/usr/bin/perl  
use strict;
use warnings;
use Data::Dumper;
use Regexp::Common qw /number/;

open my $hairpin_file, '<', "new_xt_spacer_results.hairpin" or die $!;

my %HoA_sequences;
my $curkey;

while (<$hairpin_file>){
chomp;
if (/^>(\w+\d+\|\w+:\d+-\d+[-|+]\|\w+\|\w+\s+\w+$)/){
$curkey = $1;
}elsif (my ($energy, $start, $end, $spacer) =
/^\s*($RE{num}{real})\s+(\d+)\s+\.\.\s+(\d+)\s+\w+\s*[\w+|-]?\s+(\w+)\s*[\w+|-]?\s*/ ) {
die "value seen before header: '$_'"
unless defined $curkey;
push @{ $HoA_sequences{$curkey}},
{ energy=>$energy, start=>$start, end=>$end, spacer=>$spacer};
}
else { die "don't know how to parse: '$_'" }
}


Data:

Code
>hsa_circ_0024017|chr11:93463035-93463135+|NM_033395|KIAA1731  FORWARD 
-4.3 81 .. 97 ACTACACAGGCTCTA AGGCACC AAA GGGGTCT AAGxxxxxxxxxxxx
>hsa_circ_0067224|chr3:128345575-128345675-|NM_002950|RPN1 FORWARD
-4.4 6 .. 17 xxxxxxxxxxGTGAC CAGT ATGC ACTG AAGATGAGGTTTGTG
-4.1 26 .. 51 TGCACTGAAGATGAG GTT TGTGGACCATGTGTTTGATG AAC AAGTGATAGATTCTC
-4.1 28 .. 53 CACTGAAGATGAGGT TTG TGGACCATGTGTTTGATGAA CAA GTGATAGATTCTCTG


I don't quite know how to refer to each (3rd)hash in order to compare it with the other (3rd)hashes within the same (2nd)array...

For example, in my data structure, I am trying to loop through and compare the spacer values with each other under each key of the (1st)hash:


Code
$VAR125 = 'hsa_circ_0067224|chr3:128345575-128345675-|NM_002950|RPN1  FORWARD';  
$VAR126 = [
{
'energy' => '-4.4',
'spacer' => 'CAGT',
'end' => '17',
'start' => '6'
},
{
'energy' => '-4.1',
'spacer' => 'GTT',
'end' => '51',
'start' => '26'
},
{
'energy' => '-4.1',
'spacer' => 'TTG',
'end' => '53',
'start' => '28'
}
];



BillKSmith
Veteran

Jul 26, 2017, 8:09 PM

Post #12 of 18 (4667 views)
Re: [Mr Keystrokes] converting hash to AoH

The following code builds the HoA. I do not know what you want to compare, so it just extracts (and prints) all four spacer values.

Code
#!/usr/bin/perl   
use strict;
use warnings;
use Data::Dumper;
use Regexp::Common qw /number/;

my $results = \do{
">hsa_circ_0024017|chr11:93463035-93463135+|NM_033395|KIAA1731 FORWARD\n"

."-4.3 81 .. 97 ACTACACAGGCTCTA AGGCACC "
."AAA GGGGTCT AAGxxxxxxxxxxxx\n"

.">hsa_circ_0067224|chr3:128345575-128345675-|NM_002950|RPN1 FORWARD\n"

."-4.4 6 .. 17 xxxxxxxxxxGTGAC CAGT "
."ATGC ACTG AAGATGAGGTTTGTG\n"

."-4.1 26 .. 51 TGCACTGAAGATGAG GTT "
."TGTGGACCATGTGTTTGATG AAC AAGTGATAGATTCTC\n"

."-4.1 28 .. 53 CACTGAAGATGAGGT TTG "
."TGGACCATGTGTTTGATGAA CAA GTGATAGATTCTCTG\n"
};


my $HEADER = qr/
^>(
\w+\d+ \|
\w+\d+ : \d+ \- \d+[-|+] \|
\w+ \|
\w+\s+\w+
)$
/x;
my $ENERG_DAT = qr/
^ \s*
($RE{num}{real}) \s+ # energy
($RE{num}{int}) \s+ # start
\.\. \s+
($RE{num}{int}) \s+ # end
\w+ \s*
[\w+|-]? \s+
(\w+) \s* # spacer
[\w+|-]? \s*
\w+ \s+
\w+ \s+
\w+ \s*
$
/x;

open my $hairpin_file, '<', $results or die $!;

my %HoA_sequences;
my $curkey;
while (<$hairpin_file>){
chomp;
if (/$HEADER/){
$curkey = $1;
}
elsif (my ($energy, $start, $end, $spacer) = /$ENERG_DAT/) {
die "value seen before header: '$_'" unless defined $curkey;
push @{ $HoA_sequences{$curkey}},
{ energy=>$energy, start=>$start, end=>$end, spacer=>$spacer};
}
else { die "don't know how to parse: '$_'" }
}
#print Dumper (\%HoA_sequences);

my @spacers;
foreach my $array (values %HoA_sequences) {
push @spacers, map( $_->{spacer}, @$array);
}
local $" = "\n";
print "@spacers\n\n";

OUTPUT:
CAGT
GTT
TTG
AGGCACC

Good Luck,
Bill


Mr Keystrokes
Novice

Jul 26, 2017, 11:12 PM

Post #13 of 18 (4660 views)
Re: [BillKSmith] converting hash to AoH

Wow, this is waay more elegant than I'm used to :)

What is the function of the @$array in:


Code
push @spacers, map( $_->{spacer}, @$array);

Is it to instruct the computer to pick the spacer value from each and every hash in each array, and not just the first one?

The script is supposed to compare the spacers, like CAGT, GTT and TTG, with each other and detect whether any are the same. If they are the same, it is supposed to choose the hairpin associated with the lowest energy score.
It would then, if possible, discard from %HoA_sequences the hashes which have the same spacer but higher energy scores.

After that, in phase 2, entitled 'may the best hairpin win', all the remaining hairpins are compared with each other to determine whether they overlap. If they do not overlap, they are kept in %HoA_sequences.
However, if any 2 or more hashes overlap then, as in phase 1, the hairpin with the lowest energy score is kept and the higher-scored ones are discarded, leaving %HoA_sequences containing only distinct hairpins per sequence.
For phase 2, I will need to use the range of each hairpin, i.e. its start and end, to determine overlaps.


(This post was edited by Mr Keystrokes on Jul 28, 2017, 2:02 AM)


BillKSmith
Veteran

Jul 27, 2017, 7:49 AM

Post #14 of 18 (4649 views)
Re: [Mr Keystrokes] converting hash to AoH

Here is an explanation of the extraction loop in post #12:

The values of the HoA are each a reference to an array. The 'values' function returns a list of these references. The 'for' loop assigns these, one-at-a-time, to $array. The term @$array dereferences these references (returns a list of hash references). The function 'map', one-at-a-time, assigns these hash references to $_, evaluates the term $_->{spacer} and returns the list of results. (Note that the arrow operator, in that term, dereferences the hash reference and returns the value corresponding to 'spacer'.) The list of spacers, corresponding to each value of %HoA, is pushed into the array @spacers.

The special value $LIST_SEPARATOR ($") controls how an array is interpolated into a string. Here it is used to print every element on a separate line.
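
The same steps can be spelled out one at a time. This is only an illustrative sketch: the hash %HoA below is a made-up miniature of %HoA_sequences with a single key, and the intermediate array @hashrefs exists just to make the dereference visible.

```perl
use strict;
use warnings;

# Hypothetical miniature of %HoA_sequences with one key.
my %HoA = (
    seqA => [
        { spacer => 'CAGT', energy => '-4.4' },
        { spacer => 'GTT',  energy => '-4.1' },
    ],
);

my @spacers;
foreach my $array ( values %HoA ) {    # $array is an array reference
    my @hashrefs = @$array;            # dereference: a list of hash references

    # map sets $_ to each hash reference in turn; the arrow
    # dereferences it and fetches the value stored under 'spacer'.
    push @spacers, map { $_->{spacer} } @hashrefs;
}

{
    local $" = "\n";                   # $LIST_SEPARATOR: join elements with newlines
    print "@spacers\n";                # one spacer per line
}
```

Wrapping the local in a bare block limits the changed separator to that one print statement.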


Should I assume that you want to process each array of the %HoA separately? If not, I still do not understand your requirements.
Good Luck,
Bill


Mr Keystrokes
Novice

Jul 28, 2017, 2:06 AM

Post #15 of 18 (4626 views)
Re: [BillKSmith] converting hash to AoH

Thanks for the explanation, I see that a lot of it is about dereferencing in order to process the actual value from the data structure.

By the way, I have corrected my previous post because I meant that the lowest scores are extracted, not the highest. My mistake.

'Should I assume that you want to process each array of the %HoA separately?'
Yeah, essentially each array of hashes contains information that is only relevant to its parent hash.


BillKSmith
Veteran

Jul 28, 2017, 7:25 PM

Post #16 of 18 (4606 views)
Re: [Mr Keystrokes] converting hash to AoH

You now have a reasonable specification for phase I. You have the code for building the data structure, and I already showed you how to dereference the data. Give it a try. Ask when you need help with a specific question. Hint: Use a temporary hash to hold the "best so far" for each value of 'spacer'.
Good Luck,
Bill


Mr Keystrokes
Novice

Jul 30, 2017, 5:38 AM

Post #17 of 18 (4568 views)
Re: [BillKSmith] converting hash to AoH


Code
#!/usr/bin/perl  
use strict;
use warnings;
use Data::Dumper;
use Regexp::Common qw /number/;

use List::Util qw (all reduce);
use feature 'say';


## Captures the Header of the circRNA sequence
my $SeqName = qr/
^>(
\w+\d+ \|
\w+\w+ : \d+ \- \d+[-|+] \|
\w+ \|
\w+\s+\w+
)$
/x;

## Captures the energy, start position, end pos and spacer of each hairpin

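## Exactly one capture group per field: doubled parentheses would add
## extra groups and shift the values assigned to $end and $spacer.
my $HairpinData = qr/^\s*($RE{num}{real})\s+ # energy
($RE{num}{int})\s+ # start
\.\.\s+
($RE{num}{int})\s+ # end
\w+\s*
[\w+|-]?\s+
(\w+)\s* # spacer
[\w+|-]?\s*
/x;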

open my $hairpin_file, '<', "new_xt_spacer_results.hairpin" or die $!;

my %HoA_sequences;
my $curkey;


## Assigning captured data to respectively named variables
## And pushing such info into a hash of array of hashes.

while (<$hairpin_file>){
chomp;
if (/$SeqName/){
$curkey = $1;
}
elsif (my ($energy, $start, $end, $spacer) = /$HairpinData/) {
die "value seen before header: '$_'" unless defined $curkey;
push @{ $HoA_sequences{$curkey}},
{ energy=>$energy, start=>$start, end=>$end, spacer=>$spacer};
}
else { die "don't know how to parse: '$_'" }
}


## Detecting identical spacers of hairpins & keeping the hairpin with
## the lowest energy score

for my $key (keys %HoA_sequences)
{
say "$key:";
my @spacers = map { $_->{spacer} } @{$HoA_sequences{$key}};
if ( all { $spacers[0] eq $_ } @spacers )
{
# reduce keeps whichever hash reference has the lower energy at each step
my $hr_min = reduce {
$a->{energy} < $b->{energy} ? $a : $b
} @{$HoA_sequences{$key}};

say "\thashref with lowest energy:";
say "\t\t$_ => $hr_min->{$_}" for keys %$hr_min;
}
}


So far I am able to compare the first spacer with all the rest and if they're all equal I can then find out which has the lowest energy and pick that one out. But I am struggling to come up with an algorithm that will detect when some (not all) of the spacers are the same and to then pick the one with the lowest energy out of those 'some'.
For example, if 2 out of 5 spacers are the same then there are essentially 4 spacers.

I think the solution lies in the use of the reduce function of List::Util https://perldoc.perl.org/List/Util.html#reduce


BillKSmith
Veteran

Jul 30, 2017, 7:23 AM

Post #18 of 18 (4564 views)
Re: [Mr Keystrokes] converting hash to AoH

Good, so far. I previously suggested using a temporary hash to solve this problem. Use the values of spacer as keys and references to the required hashes as values.



Code
my %new_HoA; 
while ( my ($header, $array) = each %HoA_sequences) {
my %best;
foreach my $hash (@$array) {
my $spacer = $hash->{spacer};
if (!exists $best{$spacer} or $hash->{energy}<$best{$spacer}{energy}){
$best{$spacer} = $hash;
}
}
$new_HoA{$header} = [values %best];
}
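
For anyone who wants to try the loop without the input file, here is a self-contained sketch of the same technique. The data is made up (the third hairpin, with energy -5.0, is hypothetical and only there so that two entries share the spacer 'GTT'), and the final sort by start is an addition to make the result order predictable.

```perl
use strict;
use warnings;

# Hypothetical data: two hairpins share the spacer 'GTT'.
my %HoA_sequences = (
    seq1 => [
        { energy => '-4.4', start => 6,  end => 17, spacer => 'CAGT' },
        { energy => '-4.1', start => 26, end => 51, spacer => 'GTT' },
        { energy => '-5.0', start => 60, end => 85, spacer => 'GTT' },
    ],
);

my %new_HoA;
for my $header ( keys %HoA_sequences ) {
    my %best;    # spacer => hash reference with the lowest energy so far
    for my $hash ( @{ $HoA_sequences{$header} } ) {
        my $spacer = $hash->{spacer};
        $best{$spacer} = $hash
            if !exists $best{$spacer}
            or $hash->{energy} < $best{$spacer}{energy};
    }

    # Sort by start so the order does not depend on hash ordering.
    $new_HoA{$header} = [ sort { $a->{start} <=> $b->{start} } values %best ];
}

for my $h ( @{ $new_HoA{seq1} } ) {
    print "$h->{spacer} $h->{energy} $h->{start}..$h->{end}\n";
}
```

Of the two 'GTT' hairpins, only the one with energy -5.0 survives; the 'CAGT' hairpin is unique and is kept as it is.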


I do not understand your phase 2 requirements and I am not sure that you do. I expect that the code will be very similar to phase 1 except for the criteria for choosing the best.
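
One possible reading of phase 2, offered only as a sketch under stated assumptions: two hairpins overlap when their start..end ranges share at least one position, and hairpins are considered from the lowest energy upward, each one being kept only if it overlaps nothing already kept. Neither rule has been confirmed in this thread, and the helper name overlaps is made up for illustration.

```perl
use strict;
use warnings;

# Hairpins for one sequence, using the values shown earlier in the thread.
my @hairpins = (
    { energy => '-4.4', start => 6,  end => 17, spacer => 'CAGT' },
    { energy => '-4.1', start => 26, end => 51, spacer => 'GTT' },
    { energy => '-4.1', start => 28, end => 53, spacer => 'TTG' },
);

# Hypothetical helper: two closed ranges overlap when each one
# starts no later than the other one ends.
sub overlaps {
    my ( $x, $y ) = @_;
    return $x->{start} <= $y->{end} && $y->{start} <= $x->{end};
}

# Assumed rule: lowest energy first (ties broken by start position).
my @by_energy = sort {
    $a->{energy} <=> $b->{energy} || $a->{start} <=> $b->{start}
} @hairpins;

# Keep a hairpin only if it overlaps none of those already kept.
my @kept;
for my $h (@by_energy) {
    push @kept, $h unless grep { overlaps( $h, $_ ) } @kept;
}

print "$_->{spacer} $_->{start}..$_->{end}\n" for @kept;
```

With this data the 'TTG' hairpin (28..53) overlaps 'GTT' (26..51) and ties on energy, so the tie-break on start decides which is kept. A different tie-break or a different definition of overlap would change the result.
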
Good Luck,
Bill

 
 

