regex expression



beecha
Novice

Jan 4, 2018, 6:18 AM


Views: 3762
regex expression

Hi all,

I am working on a regex for the paragraph shown below:

war eligible, init =0*DEC0, ev = {1*BIN0,2*DEC3,6*BIN4,
& 5*BIN43,6*DEC32,
& 5*BIN43,7*DEC32}

The paragraph always starts with WAR. I need to extract only the values inside the { }, for example: 1*BIN0,2*DEC3,6*BIN4,5*BIN43,6*DEC32,5*BIN43,7*DEC32

Can anyone help me with this?


Laurent_R
Veteran / Moderator

Jan 5, 2018, 10:01 AM


Views: 3752
Re: [beecha] regex expression

You need to define very precisely and very accurately what you're looking for when designing a regex.

Please clarify:
- is the text you want to study in a string variable? In a file?
- does it contain only what you have shown or does it have a repetition of such 3-line paragraphs?
- are the { and } characters part of the string?

It would be good to quote your input data verbatim within a code block so that we can figure out exactly what it looks like.


BillKSmith
Veteran

Jan 6, 2018, 11:27 AM


Views: 3743
Re: [Laurent_R] regex expression

As Laurent has already said, we need more information. I have made a number of assumptions and proposed a solution which requires them. If this does not help in any other way, I hope it helps you to post a better statement of your problem.


Code
use strict; 
use warnings;
#post 84614


=assumptions
Text is in a string (or you can read it into a string yourself)

the '&' characters are continuation marks (not included in string)

It is not necessary to do the whole job with one regex.

You do not know how many values to expect
=cut

my $string = <<'END_STRING';
war eligible, init =0*DEC0, ev = {1*BIN0,2*DEC3,6*BIN4,
& 5*BIN43,6*DEC32,
& 5*BIN43,7*DEC32}
END_STRING


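# Join the continuation lines: drop each newline and any "& " mark after it.
$string =~ s/\n(?:& )?//g;

# Capture the text between the braces of the "war" line.
$string =~ m/war[^{]*\{([^}]*)\}/
    or die "No 'war ... {...}' block found\n";
my @values = split /,/, $1;

{ local $" = ','; print "@values\n"; }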



C:\Users\Bill\forums\guru>perl beecha1.pl
1*BIN0,2*DEC3,6*BIN4, 5*BIN43,6*DEC32, 5*BIN43,7*DEC32
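
The output above still carries the spaces that followed each continuation mark. Here is a minimal sketch of one way to drop them, assuming whitespace inside the braces is never significant. The names `$inner` and `@values` are only illustrative, and `$inner` stands in for the text already captured between the braces.

```perl
use strict;
use warnings;

# Assumed input: the text captured between the braces, as printed above.
my $inner = '1*BIN0,2*DEC3,6*BIN4, 5*BIN43,6*DEC32, 5*BIN43,7*DEC32';

# Trim leading and trailing whitespace first, so the first and last
# fields come out clean.
$inner =~ s/^\s+|\s+$//g;

# Split on commas, swallowing any whitespace around each comma.
my @values = split /\s*,\s*/, $inner;

# Drop empty fields, which would appear if the text began with a comma.
@values = grep { length } @values;

print join( ',', @values ), "\n";
```

Splitting on `/\s*,\s*/` rather than on a bare comma keeps the clean-up in one place, so no separate trimming pass over each value is needed.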

Good Luck,
Bill


beecha
Novice

Jan 6, 2018, 10:30 PM


Views: 3734
Re: [Laurent_R] regex expression

Hi,

Thanks for the reply.

- is the text you want to study in a string variable? In a file?
ANS: The text is in a file, which I am reading line by line.

- does it contain only what you have shown or does it have a repetition of such 3-line paragraphs?

ANS: The paragraphs repeat throughout the file. Only "WAR", "init" and "ev" are common to every paragraph. I want to get the string between the {} without the '&' characters:
"1*BIN0,2*DEC3,6*BIN4, 5*BIN43,6*DEC32, 5*BIN43,7*DEC32"

For example:
war eligible, init =0*DEC0, ev = {1*BIN0,2*DEC3,6*BIN4,
& 5*BIN43,6*DEC32,
& 5*BIN43,7*DEC32}
String to obtain:
"1*BIN0,2*DEC3,6*BIN4, 5*BIN43,6*DEC32, 5*BIN43,7*DEC32"

war wear, init =1*DEC1, ev = {
& 1*BIN0,2*DEC3,6*BIN4,
& 5*BIN43,6*DEC32,
& 5*BIN43,7*DEC32,6*DEC32,6*DEC3,
& 8*BIN3,6*DEC34
}
- are the { and } characters part of the string?
ANS: I want to get the string between the {} without the '&' character.

Thanks


beecha
Novice

Jan 6, 2018, 10:33 PM


Views: 3732
Re: [BillKSmith] regex expression

Thank you for your reply.

The paragraphs repeat throughout the file. Only "WAR", "init" and "ev" are common to every paragraph. I want to get the string between the {} without the '&' characters:
"1*BIN0,2*DEC3,6*BIN4, 5*BIN43,6*DEC32, 5*BIN43,7*DEC32"

For example:
war eligible, init =0*DEC0, ev = {1*BIN0,2*DEC3,6*BIN4,
& 5*BIN43,6*DEC32,
& 5*BIN43,7*DEC32}
String to obtain:
"1*BIN0,2*DEC3,6*BIN4, 5*BIN43,6*DEC32, 5*BIN43,7*DEC32"

war wear, init =1*DEC1, ev = {
& 1*BIN0,2*DEC3,6*BIN4,
& 5*BIN43,6*DEC32,
& 5*BIN43,7*DEC32,6*DEC32,6*DEC3,
& 8*BIN3,6*DEC34
}


Laurent_R
Veteran / Moderator

Jan 7, 2018, 5:54 AM


Views: 3718
Re: [beecha] regex expression

OK, here is one attempt, with some assumptions about your input:

Code
use strict; 
use warnings;

my $input = "war eligible, init =0*DEC0, ev = {1*BIN0,2*DEC3,6*BIN4,
& 5*BIN43,6*DEC32,
& 5*BIN43,7*DEC32}
war wear, init =1*DEC1, ev = {
& 1*BIN0,2*DEC3,6*BIN4,
& 5*BIN43,6*DEC32,
& 5*BIN43,7*DEC32,6*DEC32,6*DEC3,
& 8*BIN3,6*DEC34
} ";

my @strings = split /\nwar/, $input;
for (@strings) {
    s/\n&//g;                      # drop the continuation marks
    next unless /\{([^}]+)\}/;     # skip chunks without a {...} group
    print "match: $1 \n";
}


This produces the following output:

Code
match: 1*BIN0,2*DEC3,6*BIN4, 5*BIN43,6*DEC32, 5*BIN43,7*DEC32 
match: 1*BIN0,2*DEC3,6*BIN4, 5*BIN43,6*DEC32, 5*BIN43,7*DEC32,6*DEC32,6*DEC3, 8*BIN3,6*DEC34


I suppose this is (more or less) what you want.
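
For comparison, the same extraction can be sketched as a single pass with a global match instead of a split. This is only an illustration under assumptions of my own: each paragraph starts with `war` at the beginning of a line, and neither `&` nor whitespace ever occurs inside a value. The names `@matches` and `$inner` are made up for the example.

```perl
use strict;
use warnings;

# Sample input copied from the posts above.
my $input = <<'END_INPUT';
war eligible, init =0*DEC0, ev = {1*BIN0,2*DEC3,6*BIN4,
& 5*BIN43,6*DEC32,
& 5*BIN43,7*DEC32}

war wear, init =1*DEC1, ev = {
& 1*BIN0,2*DEC3,6*BIN4,
& 5*BIN43,6*DEC32,
& 5*BIN43,7*DEC32,6*DEC32,6*DEC3,
& 8*BIN3,6*DEC34
}
END_INPUT

my @matches;

# /m lets ^ match at the start of every line; /g walks through all blocks.
# [^}]* also matches newlines, so a block may span several lines.
while ( $input =~ /^war\b[^{]*\{([^}]*)\}/mg ) {
    my $inner = $1;
    $inner =~ s/[&\s]+//g;    # assumed: "&" and whitespace never belong to a value
    push @matches, $inner;
}

print "match: $_\n" for @matches;
```

Because the continuation marks are removed only from the captured text, the rest of the input is left untouched.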


BillKSmith
Veteran

Jan 7, 2018, 7:47 AM


Views: 3715
Re: [beecha] regex expression

Laurent and I have made exactly the same assumptions and we each proposed two-step solutions. I do not see anything in your new post that would prevent you from using either one.

It now appears that your data is in a file and you do not know how to read it into a string. Show us what you have done to read the file (Omit parsing. We can return to that later). Attach a realistic sample data file. Verify that we can run your code against your data and duplicate your error.

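Hint: In the data that you have posted, "paragraphs" are separated by a single blank line. If this is always true, you should use the special variable 'INPUT_RECORD_SEPARATOR' ($/). Setting it to the empty string makes Perl read the file one paragraph at a time instead of one line at a time.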

You can get the documentation for it with the utility perldoc.

Code
perldoc -v $/
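
As a rough sketch of that hint, the loop below reads one paragraph per iteration by setting `$/` to the empty string. To keep the example self-contained it reads from an in-memory filehandle over the sample data; that stand-in, and the names `$data`, `$fh`, `$para` and `@results`, are assumptions of this sketch. With a real file you would open it by name instead.

```perl
use strict;
use warnings;

# Stand-in for the real file: the sample data from the posts above.
my $data = <<'END_DATA';
war eligible, init =0*DEC0, ev = {1*BIN0,2*DEC3,6*BIN4,
& 5*BIN43,6*DEC32,
& 5*BIN43,7*DEC32}

war wear, init =1*DEC1, ev = {
& 1*BIN0,2*DEC3,6*BIN4,
& 5*BIN43,6*DEC32,
& 5*BIN43,7*DEC32,6*DEC32,6*DEC3,
& 8*BIN3,6*DEC34
}
END_DATA

# Open a filehandle on the string (assumption: a real file in practice).
open my $fh, '<', \$data or die "Cannot open in-memory data: $!";

my @results;
{
    local $/ = '';    # paragraph mode: a record ends at one or more blank lines
    while ( my $para = <$fh> ) {
        # Skip paragraphs that do not hold a "war ... {...}" block.
        next unless $para =~ /^war\b[^{]*\{([^}]*)\}/m;
        my $inner = $1;
        $inner =~ s/[&\s]+//g;    # assumed: "&" and whitespace never belong to a value
        push @results, $inner;
    }
}
close $fh;

print "$_\n" for @results;
```

Localizing `$/` inside a bare block restores line-by-line reading for any code that runs afterwards.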

Good Luck,
Bill


Laurent_R
Veteran / Moderator

Jan 9, 2018, 6:47 AM


Views: 3643
Re: [beecha] regex expression

Hi beecha,
I see that you have now posted the same question on the Perl Monks forum: http://www.perlmonks.org/?node_id=1206966

But you apparently did not care about answering Bill's and my posts here. Thank you very much for your gratitude.