Xml formatting



Stefanik
User

Jan 7, 2013, 1:42 PM


Xml formatting

Hi all.
I need to write an XML formatter.
I have a file containing just one "long" string, formatted as follows:


Code
<tag1><tag2>string1</tag2><tag3>string2</tag3></tag1>


I'm thinking of code along these lines:

Code
file -> split on ">"
$count = 0
while line in file
    if match "<string>"
    then
        $count = $count + 1
    else if match "</string>"
        $count = $count - 1
    end if
    "\n"
    "\t" for $count times
end while


What do you think?
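
The counting idea above can be sketched in runnable Perl roughly as follows. This is only a naive illustration, not a real XML parser: it assumes simple input with no comments, no CDATA sections and no ">" inside attribute values, and the name indent_xml is made up for the example. It avoids anything newer than Perl 5.8.

```perl
use strict;
use warnings;

# indent_xml() is a hypothetical helper for this sketch.
# It splits the string into tags and text, and tracks the nesting depth.
sub indent_xml {
    my ($xml) = @_;
    my $depth = 0;
    my @out;

    # Tokens are either a complete tag (<...>) or the text between two tags.
    my @tokens = $xml =~ /(<[^>]+>|[^<]+)/g;

    for my $tok (@tokens) {
        next if $tok =~ /^\s*$/;                    # skip whitespace between tags

        if ($tok =~ m{^</}) {                       # closing tag: step out first
            $depth-- if $depth > 0;
            push @out, ("\t" x $depth) . $tok;
        }
        elsif ($tok =~ m{^<[?!]} or $tok =~ m{/>\z}) {
            push @out, ("\t" x $depth) . $tok;      # declaration or empty element
        }
        elsif ($tok =~ /^</) {                      # opening tag: print, then step in
            push @out, ("\t" x $depth) . $tok;
            $depth++;
        }
        else {                                      # text content
            $tok =~ s/^\s+|\s+$//g;
            push @out, ("\t" x $depth) . $tok;
        }
    }
    return join("\n", @out) . "\n";
}

print indent_xml('<tag1><tag2>string1</tag2><tag3>string2</tag3></tag1>');
```

Each tag and each value ends up on its own line, indented with one tab per nesting level. The closing tag lowers the depth before it is printed, so it lines up with its opening tag.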


(This post was edited by Stefanik on Jan 7, 2013, 1:45 PM)


rovf
Veteran

Jan 8, 2013, 12:24 AM


Re: [Stefanik] Xml formatting

Does the XML code always look pretty much like your example, or can it be arbitrary XML?

Note that you are more flexible if you use something like XML::Parser for reading the file. No point in reinventing the wheel.


Stefanik
User

Jan 8, 2013, 12:48 PM


Re: [rovf] Xml formatting

That's fine, but is it already included in the "standard" Perl distribution?

I mean, I can't install any new packages on the PC, so I have to use what is already present.


rovf
Veteran

Jan 9, 2013, 12:09 AM


Re: [Stefanik] Xml formatting


Quote
is it already installed in "standard" perl package?


Just try it:


Code
perldoc XML::Parser
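
Besides perldoc, a script can check for a module itself by trying to load it. A minimal sketch, assuming a made-up helper named module_available() and an arbitrary list of XML modules to probe:

```perl
use strict;
use warnings;

# module_available() is a hypothetical helper for this sketch.
# It returns 1 if the named module can be loaded from @INC, 0 otherwise.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # XML::Parser -> XML/Parser.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# The module names below are just examples of XML modules to look for.
for my $module (qw(XML::Parser XML::Simple XML::LibXML)) {
    printf "%-12s %s\n", $module,
        module_available($module) ? 'installed' : 'not found';
}
```

Note that a module which is present but fails to compile is also reported as "not found" by this check.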




Quote
I can't install any new package on the pc


Why not?

If you write your own code, why can't you use code other people have written?


Stefanik
User

Jan 9, 2013, 2:43 AM


Re: [rovf] Xml formatting

It's already installed


Code
 NAME 
XML::Parser - A perl module for parsing XML documents
SYNOPSIS
use XML::Parser;


....



Let's try... :)


Stefanik
User

Jan 13, 2013, 1:37 PM


Re: [Stefanik] Xml formatting

Can someone suggest an example to help me understand XML::Parser?
I read the documentation on perldoc, but it's not clear to me.

Thanks


7stud
Enthusiast

Jan 13, 2013, 9:07 PM


Re: [Stefanik] Xml formatting


Code
 
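use strict;
use warnings;
use 5.012;    # enables say()

use XML::Parser;

my $xml = <<END_OF_XML;
<?xml version="1.0"?>
<note>
<to>Tove</to>
<from color="red" age="25">Jani</from>
<from color="red" age="22">Jose</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
END_OF_XML

# Flag that is true while we are inside an element whose text we want.
# Declared before parsing so the handlers see an initialized value.
my $found_target_tag = 0;

my $parser = XML::Parser->new(
    Handlers => {
        Start => \&start_of_tag,
        End   => \&end_of_tag,
        Char  => \&handle_text,
    },
);

$parser->parse($xml);

# Called for every opening tag with the element name and its attributes.
sub start_of_tag {
    my ($e, $element, %attrs) = @_;

    if ($element eq 'from') {
        if (defined $attrs{age} && $attrs{age} == 22) {
            $found_target_tag = 1;
        }
    } elsif ($element eq 'body') {
        $found_target_tag = 1;
    }
}

# Called for character data; prints it only inside a target element.
sub handle_text {
    if ($found_target_tag) {
        my ($e, $text) = @_;
        say $text;
    }
}

# Called for every closing tag; stop printing text.
sub end_of_tag {
    $found_target_tag = 0;
}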

--output:--
Jose
Don't forget me this weekend!

However, there are much better XML modules that you can use. Check to see if you have any other XML modules installed:

$ perldoc perllocal


(This post was edited by 7stud on Jan 14, 2013, 1:42 AM)


Stefanik
User

Jan 14, 2013, 5:56 AM


Re: [7stud] Xml formatting

Many thanks 7stud.
From your example I'm not sure how to use XML::Parser for my case...

What I have is a file containing something like this:


Code
<?xml version='1.0' encoding='ISO-8859-1' standalone='no'?> 
<Request MO="OSUB" O
peration="get"> <num>456</num></Request>
<?xml version='1.0' encoding='ISO-8859-1' standalone='no'?>
<Response>
<errorid>051</errorid>
</Response>


I need to indent the XML parts that are not already properly formatted:

Code
Right Output: 
<?xml version='1.0' encoding='ISO-8859-1' standalone='no'?>
<Request MO="OSUB" Operation="get">
<num>456</num>
</Request>

<?xml version='1.0' encoding='ISO-8859-1' standalone='no'?>
<Response>
<errorid>051</errorid>
</Response>


Is it possible with the parser?


7stud
Enthusiast

Jan 14, 2013, 1:56 PM


Re: [Stefanik] Xml formatting

Were there any other XML modules installed?


Code
$ perldoc perllocal



(This post was edited by 7stud on Jan 16, 2013, 2:08 AM)


Stefanik
User

Jan 14, 2013, 11:46 PM


Re: [7stud] Xml formatting

Can't list the modules:
No documentation found for "perllocal".

The OS is Solaris 10.


Stefanik
User

Jan 15, 2013, 5:53 AM


Re: [Stefanik] Xml formatting

I wrote some code to indent the XML tags:


Code
my $fin = "InputXML.log";
my $fout = "OutputXML.log";
my $countxml = 0;
my $qx = "x";
open (NSOFILE, "<", $fin) or die "No file!";
open (NOUTFILE, ">", $fout) or die "No file!";
while ($qx = <NSOFILE>) {
    print "inizio: "."$countxml\n";    # debug output ("inizio" means "start")
    if ($qx =~ (/^<[^\?|\/].*?>/m)) { ###Check StartTag
        $countxml++;
        my $cind = $countxml;
        while ($cind != 0) {
            $qx =~ s/</\t </g;
            $cind--;
        }
    } elsif ($qx =~ (/<\//m)) { ###Check EndTag
        my $cind = $countxml;
        while ($cind != 0) {
            $qx =~ s/</\t </g;
            $cind--;
        }
        $countxml--;
    } elsif ($qx =~ (/>\n.*\n<\//m)) { ###check Value between tag
        my $cind = $countxml;
        while ($cind != 0) {
            $qx =~ s/(>\n.*\n<\/)/(>\n\t.*\n<\)/g;
            $cind--;
        }
    }
    print NOUTFILE $qx;    # write the (possibly re-indented) line
}
close (NSOFILE);
close (NOUTFILE);



The input file contains something like this:


Code
 DELETE: 
TESTSUB:
NUM,458:
PARAMETERS,other; RESP:
0;



<?xmlversion='1.0'encoding='ISO-8859-1'standalone='no'?>
<Request>
<operation>
DeleteSubscriber
</operation>
<subscriberNumbertype="string">
458
</subscriberNumber>
<originNodeTypetype="string">
ADM
</originNodeType>
<originHostNametype="string">
lh
</originHostName>
<originTransactionIDtype="string">
4568768597657
</originTransactionID>

<Lev1>
<mystring>
here
</mystring>

</Lev1>
<originTimeStamptype="dateTime.iso8601">
20130109T11:18:22+0100
</originTimeStamp>



</barring>

</Request>



The output I produce is this:


Code
 DELETE: 
TESTSUB:
NUM,458:
PARAMETERS,other; RESP:
0;



<?xmlversion='1.0'encoding='ISO-8859-1'standalone='no'?>
<Request>
<operation>
DeleteSubscriber
</operation>
<subscriberNumbertype="string">
458
</subscriberNumber>
<originNodeTypetype="string">
ADM
</originNodeType>
<originHostNametype="string">
lh
</originHostName>
<originTransactionIDtype="string">
4568768597657
</originTransactionID>
<Lev1>
<mystring>
here
</mystring>



</Lev1>
<originTimeStamptype="dateTime.iso8601">
20130109T11:18:22+0100
</originTimeStamp>





</Request>



Stefanik
User

Jan 15, 2013, 5:55 AM


Re: [Stefanik] Xml formatting

I have two problems:

1) The branch of my code marked "###check Value between tag" doesn't work; maybe the regex doesn't match "\n"...

2) I need to remove the blank lines between tags (only the ones between tags).
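
On problem 1: the loop reads the file one line at a time, so $qx never holds more than one line, and a pattern that spans several lines (">\n.*\n</") cannot match. A minimal sketch of a line-oriented alternative that also covers problem 2 is shown here. It assumes, as in the sample input, that every tag and every value sits on its own line and that each XML part starts at its "<?xml" declaration. The name reindent_lines is made up for the example, and the code sticks to Perl 5.8 features.

```perl
use strict;
use warnings;

# reindent_lines() is a hypothetical helper for this sketch.
# Assumptions: one tag or one value per line; XML starts at "<?xml".
sub reindent_lines {
    my @lines = @_;
    my @out;
    my $depth  = 0;    # current nesting level
    my $in_xml = 0;    # true from "<?xml" until the root element is closed

    for my $line (@lines) {
        chomp(my $raw = $line);
        (my $text = $raw) =~ s/^\s+|\s+$//g;       # trimmed copy of the line

        if ($text =~ /^<\?xml/) {                  # declaration starts a document
            ($in_xml, $depth) = (1, 0);
            push @out, $text;
        }
        elsif (!$in_xml) {                         # log line outside XML: keep as is
            push @out, $raw;
        }
        elsif ($text eq '') {                      # blank line between tags: drop it
            next;
        }
        elsif ($text =~ m{^</}) {                  # closing tag: step out, then print
            $depth-- if $depth > 0;
            push @out, ("\t" x $depth) . $text;
            $in_xml = 0 if $depth == 0;
        }
        elsif ($text =~ m{^<[^?!/][^>]*>\z} && $text !~ m{/>\z}) {
            push @out, ("\t" x $depth) . $text;    # lone opening tag: print, step in
            $depth++;
        }
        else {                                     # value, or <a>value</a> on one line
            push @out, ("\t" x $depth) . $text;
        }
    }
    return @out;
}

my @sample = ("NUM,458:", "", "<?xml version='1.0'?>", "<Request>",
              "<operation>", "DeleteSubscriber", "</operation>", "", "</Request>");
print "$_\n" for reindent_lines(@sample);
```

Blank lines are dropped only while the depth counter says we are inside an XML document, so the blank lines in the surrounding log text are left alone. An unmatched closing tag in the input will throw the depth counter off, so this is not a substitute for a real parser.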


7stud
Enthusiast

Jan 16, 2013, 2:12 AM


Re: [Stefanik] Xml formatting


In Reply To
I have two problems:

Can you list your installed modules with this:


Code
$ instmodsh


If not, run this program:


Code
 
use strict;
use warnings;
use 5.012;

use ExtUtils::Installed;

my $inst = ExtUtils::Installed->new();
my @modules = $inst->modules();

say for @modules;



(This post was edited by 7stud on Jan 16, 2013, 2:27 AM)


Stefanik
User

Jan 16, 2013, 4:45 AM


Re: [7stud] Xml formatting

It seems impossible to get the package list.

"list_pack.pl" is the script you suggested:


Code
 bash-3.00# ./list_pack.pl 
Perl v5.12.0 required--this is only v5.8.4, stopped at ./list_pack.pl line 4.
BEGIN failed--compilation aborted at ./list_pack.pl line 4.
bash-3.00#
bash-3.00#
bash-3.00# /usr/perl5/5.8.4/bin/instmodsh
Available commands are:
l - List all installed modules
m <module> - Select a module
q - Quit the program
cmd? l
Installed modules are:
Perl
cmd? q
bash-3.00#



7stud
Enthusiast

Jan 16, 2013, 4:23 PM


Re: [Stefanik] Xml formatting

Get rid of the line:

use 5.012;

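Since Perl 5.8.4 does not have say (it arrived in 5.10), also change the last line to:

print "$_\n" for @modules;

and rerun the program.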
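
Put together, a version of the module-listing script that should run on Perl 5.8 could look like the sketch below. The wrapper sub installed_modules() is made up for the example; ExtUtils::Installed itself ships with Perl.

```perl
use strict;
use warnings;

use ExtUtils::Installed;

# installed_modules() is a hypothetical wrapper for this sketch.
# It returns the names of all modules that ExtUtils::Installed knows about.
sub installed_modules {
    my $inst = ExtUtils::Installed->new();
    return $inst->modules();
}

# print with an explicit newline instead of say, which needs Perl 5.10+
print "$_\n" for installed_modules();
```

ExtUtils::Installed finds modules through their .packlist files, so modules that were installed without one (for example by an OS vendor package) may not be listed. That may be why instmodsh reports only "Perl" here even though perldoc finds XML::Parser.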