Read the var values



Chupo_cro
Novice


Aug 19, 2012, 6:18 AM


Views: 21699
Read the var values

I'd like to parse the text file containing the lines similar to the following:

Code
1234 X200.18Y50.41Z42.78 
5678 Y=452.54 Z28.10 X=94.75
90Z=12.689 Y87.91 X42.58

I was trying to write a regexp that reads the X, Y and Z values into variables, but I couldn't do that in a single pass, so I had to use a WHILE loop. I tried this variation of the regexp, but all it changed is that the values now end up in different variables; I still need three passes per line to read the X, Y and Z values :-/

Can someone please help me construct the correct regexp? What I would like to do is:

Parse the text file row by row and read the X, Y and Z values into separate variables. The '=' after the variable name is optional. The line number at the beginning of the line is optional.

Another problem with my code is:

If I had the input in $1, $2 and $3 (e.g. $1 would contain 'X=12.34'), the next step would be to extract just the value, without the variable name and the '=' sign. If I used something like:


Code
$1 =~ m/(\d+\.?\d+)/;


Then I'd lose the contents of $2 and $3 :-/ So how can I modify $x without altering $(x+1), $(x+2), ... ?
Chupo_cro
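
One way around the clobbering problem is to avoid reusing `$1` at all: a match in list context returns its captures, so they can be copied into ordinary variables before any further match runs. The sketch below assumes three name/value fields per line; `strip_names` is a hypothetical helper name, not something from the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the three numeric values of a line in the
# order the fields appear, or an empty list if three fields are not found.
sub strip_names {
    my ($line) = @_;

    # List-context match: the three captures are copied into @fields, so a
    # later match cannot overwrite them the way it overwrites $1, $2, $3.
    my @fields = $line =~ /([XYZ]\s*=?\s*[\d.]+)\s*([XYZ]\s*=?\s*[\d.]+)\s*([XYZ]\s*=?\s*[\d.]+)/
        or return;

    # Each inner match works on one saved copy at a time.
    return map { /(\d+(?:\.\d+)?)/ ? $1 : () } @fields;
}

print join(', ', strip_names('5678 Y=452.54 Z28.10 X=94.75')), "\n";
```

The values come back in the order they appear on the line (Y, Z, X for this sample), so this only addresses the lost-captures question, not the ordering one.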


Chupo_cro
Novice


Aug 19, 2012, 6:53 AM


Views: 21693
Re: [Chupo_cro] Read the var values

OK, I've solved extracting only the value without the var name and the '='. The regexp is now:


Code
while ($data =~ s/X *=? *(\d+\.?\d+)|Y *=? *(\d+\.?\d+)|Z *=? *(\d+\.?\d+)//)


Here is the full code.

The remaining question is: what is the regexp to read the values into the vars in just one pass (without a WHILE loop)?
Chupo_cro
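
The three-way alternation means that after every pass only one of `$1`, `$2` and `$3` is defined. A possible simplification, sketched below, captures the axis letter itself and uses it as a hash key, and matches with `/g` instead of deleting text with `s///`. It also writes the number as `\d+(?:\.\d+)?`, because `\d+\.?\d+` needs at least two digits and would skip a value such as `5`. The name `read_axes` is made up for this example.

```perl
use strict;
use warnings;

sub read_axes {
    my ($data) = @_;
    my %value;

    # In scalar context, /g resumes after the previous match on each
    # iteration, so nothing has to be removed from $data.
    while ($data =~ /([XYZ])\s*=?\s*(\d+(?:\.\d+)?)/g) {
        $value{$1} = $2;    # $1 is the letter, $2 the number
    }
    return %value;
}

my %v = read_axes('90Z=12.689 Y87.91 X42.58');
print "X=$v{X} Y=$v{Y} Z=$v{Z}\n";
```

This still loops, but each iteration says which axis it found, so no test on the capture variables is needed.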


FishMonger
Veteran / Moderator

Aug 19, 2012, 7:43 AM


Views: 21690
Re: [Chupo_cro] Read the var values

Are you currently getting your desired output?

If not, how does it differ from what you want?

Instead of looping over the lines, you could join them into a single scalar and apply one or more regexes to extract the data.


Laurent_R
Veteran / Moderator

Aug 19, 2012, 9:00 AM


Views: 21686
Re: [Chupo_cro] Read the var values

Hello,

Am I correct in understanding that:


Code
1234 X200.18Y50.41Z42.78


means that X = 200.18, Y = 50.41 and Z = 42.78?


Given that X, Y and Z are not always in the same order, I would probably use three distinct regexes on each line of input.

Assuming your line is in $_, something like this:


Code
$X = $1 if /X=?(\d+(?:\.\d+)?)/;   # the optional group keeps the decimal part
$Y = $1 if /Y=?(\d+(?:\.\d+)?)/;
$Z = $1 if /Z=?(\d+(?:\.\d+)?)/;


And, BTW, it might seem costly, but using three simple regexes is not necessarily more expensive than a single complicated one, which is far more likely to require a lot of backtracking.

The regexes above take into account an optional '=' sign between the letter and the numbers.
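
For comparison, a single-pattern version is possible with three lookaheads anchored at the start of the line; in Perl, groups captured inside a positive lookahead keep their values after the match. This is only a sketch of that alternative, with `$NUM` and `parse_xyz` as names chosen for the example and unsigned numbers assumed, and it makes no claim to be faster than three separate matches.

```perl
use strict;
use warnings;

my $NUM = qr/\d+(?:\.\d+)?/;    # unsigned integer or decimal (assumption)

sub parse_xyz {
    my ($line) = @_;

    # Each lookahead scans from the start of the line, so the order of the
    # fields does not matter. The match fails unless all three are present.
    my ($x, $y, $z) = $line =~ /^(?=.*?X\s*=?\s*($NUM))
                                 (?=.*?Y\s*=?\s*($NUM))
                                 (?=.*?Z\s*=?\s*($NUM))/x
        or return;

    return ($x, $y, $z);
}

print join(' ', parse_xyz('5678 Y=452.54 Z28.10 X=94.75')), "\n";
```

The pattern is harder to read than three one-line matches, which is part of the trade-off described above.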

You could also decide to remove the equal signs before starting your matches:


Code
s/=//g; # removes the "=" characters 
$X = $1 if /X([0-9]+(?:\.[0-9]+)?)/;
$Y = $1 if /Y([0-9]+(?:\.[0-9]+)?)/;
# ...
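
Put together as a runnable script, the one-match-per-axis approach might look like the following sketch. The number pattern accepts a decimal part and optional spaces around `=`, which the sample lines need; `extract` is a hypothetical helper name.

```perl
use strict;
use warnings;

sub extract {
    my ($line, $axis) = @_;
    # One simple match per axis; returns undef when the axis is absent.
    return $line =~ /\Q$axis\E\s*=?\s*(\d+(?:\.\d+)?)/ ? $1 : undef;
}

for my $line ('1234 X200.18Y50.41Z42.78', '5678 Y=452.54 Z28.10 X=94.75') {
    my ($X, $Y, $Z) = map { extract($line, $_) } qw(X Y Z);
    print "X=$X Y=$Y Z=$Z\n";
}
```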



BillKSmith
Veteran

Aug 19, 2012, 1:42 PM


Views: 21679
Re: [Chupo_cro] Read the var values

Use a module to take care of the details of matching a number. Store your data in an array of hashes. The hash will keep the right number with the right letter.


Code
  

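use strict;
use warnings;
use Regexp::Common;
use Readonly;
use Data::Dumper;

# Pattern for a real number, supplied by Regexp::Common.
Readonly::Scalar my $NUMBER => $RE{num}{real};

my @points;
while (my $line = <DATA>) {
    # In list context, /g returns (letter, number) pairs, which fill the hash.
    my %parms = $line =~ /([XYZ])=?($NUMBER)/g;
    push @points, \%parms;
}
print Dumper \@points;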
__DATA__
1234 X200.18Y50.41Z42.78
5678 Y=452.54 Z28.10 X=94.75
90Z=12.689 Y87.91 X42.58

Good Luck,
Bill
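
If installing Regexp::Common and Readonly is not an option, roughly the same idea can be sketched with core Perl only. The hand-written `$NUMBER` below is an assumption that covers optionally signed decimals; it is narrower than what `$RE{num}{real}` accepts. It also allows spaces around the `=`. `parse_point` is a made-up name.

```perl
use strict;
use warnings;

# Optionally signed integer or decimal (a simplifying assumption).
my $NUMBER = qr/[-+]?\d+(?:\.\d+)?/;

sub parse_point {
    my ($line) = @_;
    # /g in list context returns (letter, number, letter, number, ...),
    # which assigns straight into a hash.
    my %parms = $line =~ /([XYZ])\s*=?\s*($NUMBER)/g;
    return \%parms;
}

my $p = parse_point('5678 Y=452.54 Z28.10 X=94.75');
print join(' ', map { "$_=$p->{$_}" } sort keys %$p), "\n";
```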


Chupo_cro
Novice


Aug 19, 2012, 7:23 PM


Views: 21666
Re: [Laurent_R] Read the var values


Quote
Hello,

Am I correct in understanding that:


Code
1234 X200.18Y50.41Z42.78


means that X = 200.18, Y = 50.41 and Z = 42.78?


Yes, that is correct. The '=' is optional, and optional spaces are allowed after the X, Y, or Z. The text is in fact G-code. However, different CNC machines may use different syntax, hence the optional '='. I deliberately wrote the test lines to cover as many of the allowed variations as possible.

Quote

Given that X, Y and Z are not always in the same order, I would probably use three distinct regexes on each line of input.

Assuming your line is in $_, something like this:


Code
$X = $1 if /X=?(\d+(?:\.\d+)?)/;   # the optional group keeps the decimal part
$Y = $1 if /Y=?(\d+(?:\.\d+)?)/;
$Z = $1 if /Z=?(\d+(?:\.\d+)?)/;


Thank you very much! I found this information very useful!!

Quote

And, BTW, it might seem costly, but using three simple regexes is not necessarily more expensive than a single complicated one, which is far more likely to require a lot of backtracking.

The regexes above take into account an optional '=' sign between the letter and the numbers.

You could also decide to remove the equal signs before starting your matches:


Code
s/=//g; # removes the "=" characters 
$X = $1 if /X([0-9]+(?:\.[0-9]+)?)/;
$Y = $1 if /Y([0-9]+(?:\.[0-9]+)?)/;
# ...


Thank you very much for your help!
Regards
Chupo_cro

(This post was edited by Chupo_cro on Aug 19, 2012, 7:48 PM)


Chupo_cro
Novice


Aug 19, 2012, 7:32 PM


Views: 21664
Re: [BillKSmith] Read the var values


Quote
Use a module to take care of the details of matching a number. Store your data in an array of hashes. The hash will keep the right number with the right letter.

Thank you very much for your reply. I first have to examine the code and read the docs; after that I will probably have some additional questions.

Regards
Chupo_cro


Chupo_cro
Novice


Aug 19, 2012, 7:45 PM


Views: 21657
Re: [Chupo_cro] Read the var values

I am sorry about the incorrect post timestamps. I thought the Time Offset parameter in the Display Profile Settings was relative to GMT, but it seems it isn't. I am in the GMT+01 timezone and it is 4:45 am here at the moment; I've set the Time Offset to 1, but that value is wrong :-/
Chupo_cro


BillKSmith
Veteran

Aug 19, 2012, 9:20 PM


Views: 21648
Re: [Chupo_cro] Read the var values

I believe that time stamps on this site are kept in PST (GMT-8). Your profile only changes the way they are displayed on your computer. Try setting your profile to +9.
Good Luck,
Bill


Chupo_cro
Novice


Aug 19, 2012, 9:28 PM


Views: 21648
Re: [FishMonger] Read the var values


Quote
Are you currently getting your desired output?

If not, how does it differ from what you want?

Well, with the regexp I wrote I can get the X, Y and Z values in three passes, but every time I have to check which one of $1, $2 and $3 contains the extracted data.

Quote
Instead of looping over the lines, you could join them into a single scalar and apply one or more regexes to extract the data.

The input text contains several thousand lines of data and I have to generate the transformed output line by line; that is why I am trying to process each line separately.

Thank you for the reply
Regards
Chupo_cro

(This post was edited by Chupo_cro on Aug 19, 2012, 9:29 PM)
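
Line-by-line processing fits a plain `while (<$fh>)` loop, which holds only one line in memory at a time. The sketch below reads from an in-memory string so that it runs as is, and the output format (`X=... Y=... Z=...` with three decimals) is purely a placeholder, since the thread does not say what the transformed output should be. `transform_line` is a hypothetical name.

```perl
use strict;
use warnings;

sub transform_line {
    my ($line) = @_;
    my %v = $line =~ /([XYZ])\s*=?\s*(\d+(?:\.\d+)?)/g;

    # Lines without all three axes are passed through unchanged (assumption).
    return $line unless 3 == grep { defined $v{$_} } qw(X Y Z);

    # Placeholder output format; the real transformation is not specified.
    return sprintf 'X=%.3f Y=%.3f Z=%.3f', @v{qw(X Y Z)};
}

# Stand-in for opening a real file; the string is read like a filehandle.
my $sample = "1234 X200.18Y50.41Z42.78\n90Z=12.689 Y87.91 X42.58\n";
open my $in, '<', \$sample or die "Cannot open sample: $!";

while (my $line = <$in>) {
    chomp $line;
    print transform_line($line), "\n";
}
close $in;
```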


Chupo_cro
Novice


Aug 19, 2012, 10:27 PM


Views: 21643
Re: [BillKSmith] Read the var values


Quote
I believe that time stamps on this site are kept in PST (GMT-8). Your profile only changes the way they are displayed on your computer. Try setting your profile to +9.

Aaahh :-)) It was very stupid of me to think the wrong parameter would cause the wrong time stamp at the server. However, that would be true for a news server when the client's clock is incorrect while sending the article; that problem can be seen very often with NNTP servers. Seems I spend too much time on the newsgroups :-)

The correct parameter for my timezone is +2 at the moment (because of the daylight saving).

Thank you
Regards
Chupo_cro