need help on this

 



nitinp
Novice

Oct 9, 2006, 3:35 AM

Post #1 of 12 (9702 views)
need help on this

Hi,

I am new to Perl and need help with this.

This is the (alphanumeric) paragraph I have:

Line 1

aaaaaaaaaaaaaaaa

xxxxxxxxxxxxxxxxxx

yyyyyyyyyyyyyyyyy

Line 10

88 88 77 77 66 66 22 22 88 88 77 77 66 66 22 22

I need to figure out the following things. All of this is stored in a file.

1) I need to find the line number where the pattern ab ad ef gh first matches,

where ab ad ef gh are pairs of digits separated by a space. They could be any digits.

So when I run the search it should report line 10, because that's where the

pattern ab ad ef gh was first found.

I also need to print 4, because there are 4 such patterns on that line.

88 88 77 77 and 66 66 22 22 are separated by 3 spaces.

Could anyone suggest how I can go about this?

Thanks.

Regards.

Nit.


KevinR
Veteran


Oct 9, 2006, 12:08 PM

Post #2 of 12 (9695 views)
Re: [nitinp] need help on this

Have you tried writing any code yet? The special variable $. holds the current line number of the input file, so you would use that to get the line number, plus a regexp to find the pattern you are looking for.
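
For illustration only, here is a minimal sketch of that idea. It assumes the pattern is four two-digit groups separated by whitespace, and it reads from an in-memory string so that it runs on its own. The names first_match_line and $sample are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the line number of the first line in $text
# that contains four two-digit groups separated by whitespace, or undef.
sub first_match_line {
    my ($text) = @_;
    # Open the string as an in-memory file so <$fh> and $. behave as usual.
    open(my $fh, '<', \$text) or die "Can't open in-memory file: $!";
    my $line_number;
    while (my $line = <$fh>) {
        if ($line =~ /\b\d\d\s+\d\d\s+\d\d\s+\d\d\b/) {
            $line_number = $.;   # $. is the current input line number
            last;
        }
    }
    close($fh);
    return $line_number;
}

# Made-up sample data: the pattern first appears on the third line.
my $sample = "aaaa\nxxxx\n88 88 77 77   66 66 22 22\nyyyy\n";
print "First match at line ", first_match_line($sample), "\n";
```

With a real file you would open the file name instead of a reference to a string; the loop itself stays the same.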
-------------------------------------------------


nitinp
Novice

Oct 11, 2006, 12:09 AM

Post #3 of 12 (9687 views)
Re: [nitinp] need help on this

Hi,

I need some sample code showing how to write the regexp and how to figure out the line number.

Thanks.
Regards.

Nit


davorg
Thaumaturge / Moderator

Oct 11, 2006, 3:03 AM

Post #4 of 12 (9684 views)
Re: [nitinp] need help on this

As Kevin said, a variable called $. contains the current line number.

What do you have so far? Show us your code and we'll help you fix it. We won't write your code for you.

--
Dave Cross, Perl Hacker, Trainer and Writer
http://www.dave.org.uk/
Get more help at Perl Monks


nitinp
Novice

Oct 11, 2006, 3:32 AM

Post #5 of 12 (9682 views)
Re: [davorg] need help on this

Hi,

I have posted the code below. When I supply the input in input.txt, it is of this form:

Line 10

88 88 77 77 66 66 22 22 88 88 77 77 66 66 22 22 66 66 22 22


Line 11

99 99 99 99 88 88 88 88 88 88 88 88 88 88 88 88 66 66 22 22 44



(last line) Line 100

99

instead of this:

Line 1

sdksjdksjd

Line 2

kdsdksjd

Line 10


88 88 77 77 66 66 22 22 88 88 77 77 66 66 22 22 66 66 22 22


Line 100

99

Line 101

sasasjaksjasajs



I don't know how many lines will be present before Line 10, so assume that only by searching for this pattern can we know where to start reading the actual input. I also don't know how many lines can exist after Line 100. We have to stop there based on the same pattern. The patterns are separated by 3 spaces, which I use as the split mechanism.

I also assume that there are 5 such patterns on each line, but this could vary from paragraph to paragraph, so we need to generalize this too.

Every line from Line 10 and Line 11 onwards will have the same count of patterns.

The first two data lines (Line 10 and Line 11 above) both have 5 sets of patterns like 88 88 77 77, so this number should be 5.

Here is my code.

My output should have these patterns stored line after line, separated by one space.

# Delete the output file if it already exists.
$outputfile = 'output.txt';
if (-e $outputfile) {
    unlink($outputfile);
}

# Open the input file for reading and the output file in append mode.
open(IN, '<', 'input.txt') or die "Can't open input.txt: $!";
open(MYFILE, '>>', $outputfile) or die "Can't open $outputfile: $!";
@data = <IN>;

# Count the number of lines in the file.
seek(IN, 0, 0);
while ($line = <IN>) {
    $n++;
}

for ($no_of_lines = 0; $no_of_lines < $n; $no_of_lines++) {
    my @values = split(' ', $data[$no_of_lines]);

    # I assume there are 5 columns for the inner loop; there could be 6 too.
    # We need to actually figure out how many such columns exist based on the pattern.
    for ($i = 1; $i < 5; $i++) {
        $newvalues[$i] = $values[$i];
    }

    print MYFILE @newvalues;

    # Add a space after every pattern is printed.
    print MYFILE " ";
}

close(MYFILE);
close(IN);



The new code should ignore whatever comes before Line 10 and whatever comes after Line 100.

It's not always line 10 where the pattern starts, and not always line 100 where it ends.

The pattern block always ends with a line of just two characters.

The patterns use the digits 0-9 and the letters a to f, so we can say they are hex.

So we could also have a pattern like this

aa 77 88 22

and could end with

bb

Thanks.

Regards.

Nit


KevinR
Veteran


Oct 12, 2006, 11:58 AM

Post #6 of 12 (9673 views)
Re: [nitinp] need help on this

Your data is confusing. Are you saying that these lines:

Line 10
Line 100

are actual lines in the data file? And the lines that say "Line 10" and "Line 100" might not be the actual 10th and 100th lines of the file so you need to search for those two patterns in the file to know where to start and stop the search of the other patterns you are looking for?
-------------------------------------------------


nitinp
Novice

Oct 13, 2006, 1:22 AM

Post #7 of 12 (9671 views)
Re: [KevinR] need help on this

It could be any line. We need to search for the patterns. You are correct; I gave those as an example.

Regards.

Nit


KevinR
Veteran


Oct 13, 2006, 8:42 AM

Post #8 of 12 (9667 views)
Re: [nitinp] need help on this


In Reply To
It could be any line. We need to search for the patterns. You are correct; I gave those as an example.

Regards.

Nit


Post what the real data looks like. There is no sense trying to help using fake data.
-------------------------------------------------


nitinp
Novice

Oct 13, 2006, 10:30 AM

Post #9 of 12 (9665 views)
Re: [KevinR] need help on this

Hi,

What I have posted is real data. Just remove the "Line 1" and "Line 10" labels; those are line numbers I added. That's about it.



Regards.

Nit


KevinR
Veteran


Oct 13, 2006, 1:53 PM

Post #10 of 12 (9662 views)
Re: [nitinp] need help on this

OK, using some dummy data:


Code
fefae rdaefdew 
fdsafs fdsfdsfds
88 88 77 77 66 66 22 22 88 88 77 77 66 66 22 22
88 88 77 77 66 66 22 22 88 88 77 77 66 66 22 22 66 66 22 22
aadfsadsfasdf
88 88 77 77 66 66 22 22
sadasdasd
99
88 88 77 77 66 66 22 22 88 88 77 77 66 66 22 22
aadfsadsfasdf
88 88 77 77 66 66 22 22 88 88 77 77 66 66 22 22 66 66 22 22
sadasdasd


I came up with this code:


Code
open(IN, '<', 'data.txt') or die "Can't open data.txt: $!";
my @data = <IN>;
close(IN);
open(MYFILE, '>>', 'output.txt') or die "Can't open output.txt: $!";
for my $i (0..$#data) {
    # Stop at the end marker: a line holding just one two-digit pair.
    last if $data[$i] =~ /^\d\d$/;
    # Does the line contain at least one set of four hex pairs?
    if ($data[$i] =~ /[a-f0-9]{2}\s+[a-f0-9]{2}\s+[a-f0-9]{2}\s+[a-f0-9]{2}\s*/) {
        # s///g returns the number of substitutions, i.e. the number of sets.
        # Note that it also modifies the copy of the line held in @data.
        my $count = ($data[$i] =~ s/(([a-f0-9]{2}\s*){4})/$2/g);
        my $line_number = $i + 1;
        print MYFILE "Found $count sets of pattern 'NN NN NN NN' at line number $line_number\n";
    }
}
close(MYFILE);


which produces output of:


Code
Found 4 sets of pattern 'NN NN NN NN' at line number 3 
Found 5 sets of pattern 'NN NN NN NN' at line number 4
Found 2 sets of pattern 'NN NN NN NN' at line number 6



If the file is very large, it would be better, or maybe even necessary, to parse it line by line instead of reading the entire file into an array. The search pattern may also have to be refined if it returns false positives.
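
A rough sketch of the line-by-line approach, under a few assumptions of mine: the end marker may be any lone hex pair rather than just two digits, a set is four hex pairs separated by whitespace, and the input comes from an in-memory string so the example runs by itself. The name scan_groups is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: read a filehandle one line at a time and return
# [line_number, count] pairs for every line that holds sets of four hex pairs.
sub scan_groups {
    my ($fh) = @_;
    my $pair  = qr/[a-f0-9]{2}/;
    my $group = qr/\b$pair(?:\s+$pair){3}\b/;   # one 'NN NN NN NN' set
    my @found;
    while (my $line = <$fh>) {
        last if $line =~ /^\s*$pair\s*$/;        # a lone pair ends the block
        # Assigning the matches to an empty list and then to a scalar
        # counts them without modifying $line (unlike s///g).
        my $count = () = $line =~ /$group/g;
        push @found, [$., $count] if $count;
    }
    return @found;
}

# Demo on made-up data; a real script would open data.txt instead.
my $sample = "fefae rdaefdew\n88 88 77 77 66 66 22 22\nsadasdasd\n99\n88 88 77 77\n";
open(my $fh, '<', \$sample) or die "Can't open in-memory file: $!";
for my $hit (scan_groups($fh)) {
    print "Found $hit->[1] sets at line $hit->[0]\n";
}
close($fh);
```

Only one line is held in memory at a time, and because the count comes from a plain match rather than a substitution, the data is left untouched.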
-------------------------------------------------


nitinp
Novice

Oct 17, 2006, 5:51 AM

Post #11 of 12 (9649 views)
Re: [KevinR] need help on this

Hi,

It works great. Thanks a ton. I wanted to know how this actually works:

$count = ($data[$i] =~ s/(([a-f0-9]{2}\s*){4})/$2/g);

It seems to be changing the data itself, or rather the copy of the file contents that was read in.

The file contains info in symmetric columns, so it's always going to be 4. Yours works for any number of patterns on each line, so it's great.

Does \s* mean one space or more than that?

rgds.

Nit






KevinR
Veteran


Oct 17, 2006, 9:15 AM

Post #12 of 12 (9645 views)
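Re: [nitinp] need help on this

\s* means zero or more whitespace characters (spaces, tabs, newlines); \s+ would mean one or more. And yes, the s/// changes the copy of the line held in @data, but not the file on disk.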
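
A small sketch to try this out; the variable names and the sample strings are just for illustration:

```perl
use strict;
use warnings;

my $pair = qr/[a-f0-9]{2}/;

# \s* allows zero whitespace characters, so pairs that run together still match.
print "star matches '8888'\n"  if '8888'  =~ /^$pair\s*$pair$/;
# \s+ requires at least one whitespace character between the pairs.
print "plus rejects '8888'\n"  unless '8888' =~ /^$pair\s+$pair$/;
print "plus matches '88 88'\n" if '88 88' =~ /^$pair\s+$pair$/;

# s///g returns the number of substitutions it made, which is where the
# count comes from. It also rewrites the string: each set of four pairs is
# replaced by $2, the last pair (with its trailing whitespace) that the
# inner group matched.
my $line  = "88 88 77 77 66 66 22 22\n";
my $count = ($line =~ s/(([a-f0-9]{2}\s*){4})/$2/g);
print "count=$count, line is now: $line";
```

The last print shows both effects at once: the count is 2, and the line has been reduced to "77 22".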
-------------------------------------------------

 
 

