Help with parsing text



Jul 8, 2006, 6:42 PM

Post #1 of 3 (9756 views)
Help with parsing text

I need to parse some text that consists essentially of key/value pairs, in the following format:

Key is:
Beginning of line, followed by known text, followed by a ':' character.

Value is:
Text. Could include multiple lines, blank lines, the ':' character, whatever.

A key/value pair ends when any other key is encountered, or the end of the file.

The program will have a list of the keys. Each key is optional, and the keys may appear in any order.

I'm having a hard time saying:
Match any key in this list, followed by text, up until the first instance of any other key in the list.

So far, the only reliable approach I have thought of is:

Match a key and store the text that follows it (the remainder).
Run through the list of keys, looking for matches in that remainder and storing the text before each match as well.
Select the shortest candidate.

Can anyone suggest a more efficient approach?
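
The rule described above ("match any key in this list, followed by text, up until the first instance of any other key") might be expressible as one pattern with a lookahead. The following is only a sketch under assumptions: `parse_pairs` is a hypothetical helper name, keys are taken to be literal strings that start a line, and a key that appears more than once simply keeps its last value.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref mapping each key found to its value.
sub parse_pairs {
    my ($text, @keys) = @_;

    # One alternation of all keys, longest first, with metacharacters quoted
    # so that each key is matched literally.
    my $alt = join '|', map { quotemeta } sort { length $b <=> length $a } @keys;

    my %pairs;
    # A key at the start of a line, then the shortest run of text that is
    # followed either by another key at the start of a line or by the end
    # of the text. /m lets ^ match at line starts, /s lets . match newlines.
    while ($text =~ /^($alt)(.*?)(?=^(?:$alt)|\z)/msg) {
        $pairs{$1} = $2;
    }
    return \%pairs;
}
```

A call would look like `my $pairs = parse_pairs($text, @keys);`. The value keeps whatever whitespace follows the colon, so it may need trimming afterwards.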



Jul 8, 2006, 7:09 PM

Post #2 of 3 (9755 views)
Re: [andrew_b] Help with parsing text

Let's see some of the text.


Jul 9, 2006, 8:47 AM

Post #3 of 3 (9752 views)
Re: [KevinR] Help with parsing text

Consider the following:

use warnings;
use strict;

# The text could come from a file, etc., but here it is read from DATA.
# Localizing $/ slurps the whole handle in one read.
my $text = do { local $/; <DATA> };

# The keys would be known, e.g.
my @keys = (
    'A key could be:',
    'A value could be:',
    'The order is:',
    'The keys are:',
);

# What I'd like to end up with is something like:
my %result = (
    'A key could be:'   => "Anything\n\n",
    'A value could be:' => "Any text.\n\nEven multiple lines.\n\nMight include a : or whatever\n\n",
    'The order is:'     => "\nunpredictable\n\n",
    'The keys are:'     => "optional\n",
);

# The code below works but seems inefficient:
my %mess;

# Pass 1: for each key present, store everything that follows it.
# \Q...\E matches the key literally; ^ with /m anchors it to a line start.
foreach my $key (@keys) {
    if ($text =~ /^\Q$key\E/m) {
        $mess{$key} = { 'remainder' => $', 'length' => length $' };
    }
}

# Pass 2: cut each remainder at the nearest following key.
while (my ($mess_key, $mess_val_ref) = each %mess) {
    # Default: no other key follows, so the value runs to the end of the text.
    $mess{$mess_key}{'value'} = $mess_val_ref->{'remainder'};
    foreach my $key (@keys) {
        if ($mess_val_ref->{'remainder'} =~ /^\Q$key\E/m) {
            my $length = length $`;
            if ($length < $mess{$mess_key}{'length'}) {
                $mess{$mess_key}{'length'} = $length;
                $mess{$mess_key}{'value'}  = $`;
            }
        }
    }
}

while (my ($key, $val) = each %mess) {
    print $key, $val->{'value'};
}

# Other ideas?

__DATA__
A key could be: Anything

A value could be: Any text.

Even multiple lines.

Might include a : or whatever

The order is:
unpredictable

The keys are: optional
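
As one possible alternative to the two-pass code above, `split` with a capturing group might do the same job in a single pass, since `split` also returns the captured separators. This is a rough sketch only: `split_pairs` is a hypothetical name, keys are assumed to be literal strings at the start of a line, and any text before the first key is discarded.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref mapping each key found to its value.
sub split_pairs {
    my ($text, @keys) = @_;

    # Alternation of all keys, longest first, quoted so they match literally.
    my $alt = join '|', map { quotemeta } sort { length $b <=> length $a } @keys;

    # With a capturing group, split returns the separators as well:
    # (text before the first key, key, value, key, value, ...).
    # A limit of -1 keeps a trailing empty value, so the list stays paired.
    my ($preamble, @fields) = split /^($alt)/m, $text, -1;

    my %pairs = @fields;
    return \%pairs;
}
```

A call would look like `my $pairs = split_pairs($text, @keys);`. As with the original code, each value starts immediately after the colon, so leading whitespace is preserved.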

