regular expressions aaarrrrrhhh

 



jeffersno1
Novice

Feb 23, 2012, 1:32 PM

Post #1 of 6 (1289 views)

 
Hi All,

I wondered if someone could help me! I am trying to use regular expressions to get the following output from my script:

1 = 53254
2 = 705
3 = 13185

There are hundreds of lines in the /tmp/users file. I've just listed a few.

What I'm struggling with is the syntax of the regular expression; I don't seem to be able to match it. I also think I shouldn't be using a while loop, as I always want true values and not empty values.

Can someone offer any assistance as to where I'm going wrong?

I've supplied some snippets from the file and the code below. As you can see, I've attempted several matches.

Thanks

Jeffers


Code
#!/usr/bin/perl 
#
$fup1 = '/tmp/users';

open (INFILE, "$fup1" || die "cant open $!\n");

while ($line = <INFILE>)
{
s/;//g;
chomp $line;
$fup = $line =~ m/VAL=(\d+)/;
#$value = $line =~ m/<\x20(\d+)\x20>/;
#####$value = $line =~ m/< (\d+) >/;
#($fup, $value) = $line =~ m/VAL=(\d+)< (\d+) >/;
print "fup = $fup\n";
#print "$fup,$value\n";
}

print "fup = $fup\n";

close (INFILE);


The file looks like this:


select COUNT (*) from PCUBE.TUNABLES where TUNABLE_INDEX=0 and VAL=1;
< 53254 >
1 row found.

select COUNT (*) from PCUBE.TUNABLES where TUNABLE_INDEX=0 and VAL=2;
< 705 >
1 row found.

select COUNT (*) from PCUBE.TUNABLES where TUNABLE_INDEX=0 and VAL=3;
< 13185 >
1 row found.


naven8
Novice

Feb 23, 2012, 8:07 PM

Post #2 of 6 (1283 views)
Re: [jeffersno1] regular expressions aaarrrrrhhh

I hope the following code will work:
open (FILE,"y");
while (my $line = <FILE>){
chomp($line);
if ($line =~ m/VAL=(\d+)/){
my $val = $1;
$line = <FILE>;
$line =~ s/^<(.*)>$/$1/;
print "Entered here $val=$line";
}

}


naven8
Novice

Feb 23, 2012, 8:16 PM

Post #3 of 6 (1282 views)
Re: [naven8] regular expressions aaarrrrrhhh

Sorry, the code above had some problems. Use the following:

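open (FILE, "y") or die "cannot open y: $!\n";
while (my $line = <FILE>) {
    chomp($line);
    # A query line ends in VAL=<n>; capture <n>.
    if ($line =~ m/VAL=(\d+)\;\s*$/) {
        my $val = $1;
        # The count is on the very next line, e.g. "< 53254 >".
        $line = <FILE>;
        chomp($line);
        $line =~ s/^<\s*(\d*)\s*>\s*$/$1/;
        print "Entered here $val=$line \n";
    }
}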


BillKSmith
Veteran

Feb 23, 2012, 9:22 PM

Post #4 of 6 (1282 views)
Re: [jeffersno1] regular expressions aaarrrrrhhh

Your last attempt is very close. You need .* (or .+) to match the characters between the two values you want. Note: you need the /s option so that the dot also matches newlines (refer: perldoc perlop); the code below uses /xms.

A bigger problem is that an input record consists of four "lines". Setting $INPUT_RECORD_SEPARATOR (refer: perldoc perlvar) to a null string tells Perl to treat blank lines, rather than newlines, as record separators. Each read then puts a full logical record into the string $line.


Code
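#!/usr/bin/perl
use strict;
use warnings;
use English;

my $fup1 = '/tmp/users';
# 'or' (not '||') so that die fires when open itself fails.
open my $INFILE, '<', $fup1 or die "cannot open $fup1: $!\n";
# Paragraph mode: each read returns one blank-line-separated record.
$INPUT_RECORD_SEPARATOR = q();
while ( my $line = <$INFILE> ) {
    chomp $line;
    my ($fup, $value) = $line =~ m/VAL=(\d+).+<\s(\d+)\s>/xms;
    next if !defined $fup;    # skip records that do not match
    print "$fup = $value\n";
}
close($INFILE);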



Your script would 'work' without the following changes, but good practice demands them.

Always use strict and use warnings.

Use lexical (my) variables for file handles.

Use the three argument form of open.
Good Luck,
Bill


jeffersno1
Novice

Feb 25, 2012, 2:30 PM

Post #5 of 6 (1251 views)
Re: [BillKSmith] regular expressions aaarrrrrhhh

Hi BillKSmith, naven8

Thanks very much for your replies

Just a couple of questions if you have time?

BillKSmith, can you explain the following?

Code
$INPUT_RECORD_SEPARATOR = q();

I thought changing the input record separator would be something like:
$"",";

I've looked into what /xms means, and I will start using "my" variables in my code now. I'm going to need them when I start writing subs...

naven8,

In your script, are you reading the file twice? Once for the VAL= and again for the < # >?
Is this needed? Is there any concern if the file is large?

Many thanks guys, still getting to grips with Perl... I won't give up :)

thanks again

Jeffers


BillKSmith
Veteran

Feb 25, 2012, 5:13 PM

Post #6 of 6 (1241 views)
Re: [jeffersno1] regular expressions aaarrrrrhhh

$INPUT_RECORD_SEPARATOR

The directive 'use English;' allows us to use the English long form of the Perl built-in variables. A null string can be specified by two consecutive single (or double) quotes. I find the q() notation easier to read. Compare it to the terse form:


Code
  

$/="";                            # terse form

$INPUT_RECORD_SEPARATOR = q();    # long form



Ugly?

Note: 'use English;' would not be required with the terse form.
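
As a rough sketch of the difference between a null-string separator and the "," you had in mind (the sample data and the read_records helper are made up for illustration; they are not part of the original script):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use English qw( -no_match_vars );

# Hypothetical sample data: two "paragraphs" separated by a blank line.
my $data = "a,b\n\nc,d\n";

# Read every record from an in-memory copy of $data using the given separator.
sub read_records {
    my ($separator) = @_;
    local $/ = $separator;    # terse name; the old value is restored on return
    open my $fh, '<', \$data or die "cannot open in-memory file: $!\n";
    my @records = <$fh>;
    close $fh;
    return @records;
}

my @paragraphs = read_records( q() );     # "" = paragraph mode, split at blank lines
my @chunks     = read_records( q(,) );    # "," = each record ends after a comma

print scalar(@paragraphs), " paragraphs, ", scalar(@chunks), " comma chunks\n";

# The long and terse names refer to the same variable.
$INPUT_RECORD_SEPARATOR = q();
my $same_variable = ( $/ eq q() ) ? 1 : 0;
print "same variable: $same_variable\n";
```

Using local inside the helper is my own choice here, so that the change to the separator does not leak into the rest of the program.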

The file is only read once!

The first time through the while loop, everything from the beginning of the file through the first blank line is read as a single string into the lexical scalar variable $line. The regular expression extracts the values of $fup and $value from that string. The print statement prints both values with an equal sign between them. We have now reached the end of the while block, so the variable $line goes out of scope and ceases to exist.

A new copy of the lexical variable $line is created. Everything up to the next blank line is read into that variable and it is processed as before.

This continues until there is nothing more to read. The while loop terminates and the input file is closed.
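
If you want to watch this happen without touching the real file, here is a minimal sketch, assuming two sample records held in a string and read through an in-memory file handle (the names $sample, @records and $newlines are illustrative only):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data standing in for /tmp/users.
my $sample = <<'END';
select COUNT (*) from PCUBE.TUNABLES where TUNABLE_INDEX=0 and VAL=1;
< 53254 >
1 row found.

select COUNT (*) from PCUBE.TUNABLES where TUNABLE_INDEX=0 and VAL=2;
< 705 >
1 row found.
END

# Open the string as an in-memory file so the sketch needs no real file.
open my $fh, '<', \$sample or die "cannot open in-memory file: $!\n";

my @records;
{
    local $/ = "";    # paragraph mode, restored at the end of this block
    while ( my $record = <$fh> ) {
        chomp $record;    # in paragraph mode chomp removes all trailing newlines
        push @records, $record;
    }
}
close $fh;

# Count the newlines that remain inside the first record.
my $newlines = () = $records[0] =~ /\n/g;

print scalar(@records), " records read\n";
print "first record still contains $newlines newlines\n";
```

Each pass through the loop yields one whole record, newlines included, which is why the regular expression has to be able to match across lines.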



The /s option on the regular expression is the only one that is actually required in this case. (Without it, the metacharacter (.) would not match the newlines in the string $line.)

The /m option changes the meaning of the anchors ^ and $ from start and end of string to start and end of line. These anchors are not used in this regular expression. I always use this option on multi-line matches.

The /x option tells the regular expression engine to ignore white space (and perl comments) within the regular expression. I always use it (even when it is not needed) in all but the simplest regular expressions.
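
A small sketch of what each option changes, using a shortened, made-up record (the variable names are only for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical shortened record with two embedded newlines.
my $record = "VAL=1;\n< 53254 >\n1 row found.";

# Without /s the dot stops at a newline, so a match across lines fails.
my $without_s = $record =~ m/VAL=(\d+).+<\s(\d+)\s>/  ? 1 : 0;
# With /s the dot also matches newlines, so the same pattern succeeds.
my $with_s    = $record =~ m/VAL=(\d+).+<\s(\d+)\s>/s ? 1 : 0;

# Without /m, ^ matches only at the very start of the string.
my $without_m = $record =~ m/^</  ? 1 : 0;
# With /m, ^ also matches just after each embedded newline.
my $with_m    = $record =~ m/^</m ? 1 : 0;

# With /x, whitespace and comments inside the pattern are ignored.
my $with_x = $record =~ m/ VAL= (\d+)   # digits after VAL=
                         /x ? 1 : 0;

print "/s: $without_s -> $with_s\n";
print "/m: $without_m -> $with_m\n";
print "/x: $with_x\n";
```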



Now, look at the regular expression itself. It matches everything (including newlines) from the 'V' of 'VAL' through the angle bracket (>) after the value. The two pairs of parentheses capture the two integer values and return them to $fup and $value.
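
The same expression can be spread out with comments, which is where /x earns its keep. This is only a sketch on one sample record; the m{...} delimiters and the $output variable are my additions here:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One logical record, as read in paragraph mode (sample data from the thread).
my $record = "select COUNT (*) from PCUBE.TUNABLES where TUNABLE_INDEX=0 and VAL=1;\n"
           . "< 53254 >\n"
           . "1 row found.";

my ( $fup, $value ) = $record =~ m{
    VAL= (\d+)         # first capture: the digits after VAL=
    .+                 # whatever lies between, newline included (needs the s option)
    < \s (\d+) \s >    # second capture: the digits inside the angle brackets
}xms;

# In list context a failed match returns an empty list, leaving both undefined.
my $output = defined $fup ? "$fup = $value" : "no match";
print "$output\n";
```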
Good Luck,
Bill

 
 

