Read a file from a specific line and extract something to another file

 



sfo_sc
Novice

Jan 17, 2005, 11:43 AM

Post #1 of 33 (6413 views)
Read a file from a specific line and extract something to another file

Hi,

I am totally new to Perl programming; in fact, I just bought the Learning Perl book. I need to write a log monitor program for my web server. What I want to do is read the file in, search for "Exceptions", and save that text in a separate file. But since the log file can be really big and I am going to run the script every few hours or so, I wonder if there is a way to start reading at a specific line of a file in Perl. That way I don't have to always start reading at the beginning of the file. I can store the line number in a temp file and use that number as the starting point when I run the script next time. Can someone give me some help? Thanks.


KevinR
Veteran


Jan 17, 2005, 12:02 PM

Post #2 of 33 (6411 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file

Absent other suggestions, you can look into using Tie::File, which treats a text file as a regular Perl array. Each element of the array corresponds to one line of the file. So if the file had 10,000 lines on the first read, you could start the next read at line 10,001 (array element 10000).

It could still be a bit slow because, as you say, the log files can become large (many megabytes) even for small websites.

http://www.perldoc.com/perl5.8.4/lib/Tie/File.html
http://perl.plover.com/TieFile/


Tie::File is a standard Perl module, so most installations of Perl should have it available for use.
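
A minimal sketch of the Tie::File approach described above. The temporary file, its five lines, and the starting index `$start` are made up so the example runs on its own; a real script would tie the actual log and load the starting index from wherever the previous run saved it.

```perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use File::Temp 'tempfile';
use Tie::File;

# Build a small stand-in log file so the sketch is self-contained.
my ($fh, $logfile) = tempfile(UNLINK => 1);
print {$fh} "line $_\n" for 1 .. 5;
close $fh or die "Cannot close $logfile: $!";

# Tie the file to an array, read-only so the log is never modified.
tie(my @lines, 'Tie::File', $logfile, mode => O_RDONLY)
    or die "Cannot tie $logfile: $!";

my $start = 3;    # hypothetical: number of lines handled by the previous run
my @new_lines;
for my $i ($start .. $#lines) {
    # Tie::File strips the line ending from each element.
    push @new_lines, $lines[$i];
}
my $next_start = scalar @lines;    # where the next run should begin

untie @lines;

# Add the newline back when printing.
print "$_\n" for @new_lines;
print "next start: $next_start\n";
```

Because the elements come back without their line endings, each one needs a "\n" added when it is printed or written to another file.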
-------------------------------------------------


sfo_sc
Novice

Jan 17, 2005, 1:06 PM

Post #3 of 33 (6409 views)
Re: [KevinR] Read a file from a specific line and extract something to another file

Thanks for the suggestion, Kevin. Do you know if I could use the same thing to extract certain lines into a file? Or is there a smart way to do that? I am thinking of having my program search for the keyword "Exception" and, when it finds the word, extract every line that is related to that exception and save it to a different file. Is there any module or function for this kind of problem? Thanks.


KevinR
Veteran


Jan 17, 2005, 3:09 PM

Post #4 of 33 (6404 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file

There is no module I am aware of. This is a very basic task, so there is really no need for a module, just a small private script or function you code yourself.

To search for the term Exception, you can use a regexp very easily. If the word is found, you push the entire line into an array. After all the searching is completed, you open your output file and print the array to it. That's as simple as it gets; in reality you may need to do some more "massaging" of the data.
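
The search-push-print recipe above could be sketched roughly like this. The log lines are invented stand-ins (including the `com.example` class name), and the output goes to an in-memory filehandle only to keep the example self-contained; a real script would open an actual output file.

```perl
use strict;
use warnings;

# Stand-in log lines; a real script would read these from the log file.
my @log = (
    "INFO server started\n",
    "java.lang.NullPointerException\n",
    "INFO request served\n",
    "Caused by: com.example.DatabaseException: timeout\n",
);

# Keep every line that contains the string "Exception".
my @matches;
for my $line (@log) {
    push @matches, $line if $line =~ /Exception/;
}

# Write the collected lines out in one go. The in-memory filehandle is an
# assumption for the demo; open a real path such as 'exceptions.log' instead.
my $report = '';
open my $out, '>', \$report or die "Cannot open output: $!";
print {$out} @matches;
close $out or die "Cannot close output: $!";

print $report;
```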
-------------------------------------------------


sfo_sc
Novice

Jan 17, 2005, 3:20 PM

Post #5 of 33 (6403 views)
Re: [KevinR] Read a file from a specific line and extract something to another file

Thanks for the suggestion. Where can I find something to read about "regexp"? I am looking at my Perl book, and it doesn't seem to mention anything like that.


sfo_sc
Novice

Jan 17, 2005, 3:26 PM

Post #6 of 33 (6401 views)
Re: [KevinR] Read a file from a specific line and extract something to another file

By the way, when I use Tie::File, it seems to store each line in the array. But when I use foreach (@array) to print every single line out, some lines appear to be missing. I wonder what I did wrong.

Another thing: when I try to compare a line, for example $array[0] for the first line, how do I compare the first word in the line with "Exception"?


sfo_sc
Novice

Jan 17, 2005, 4:09 PM

Post #7 of 33 (6397 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file

OK, I found the answer to my first question. I need to add a \n after each line when printing so that every line in the file shows up separately.

I still want to know how to compare the first word in the first line, though. Thanks.


KevinR
Veteran


Jan 17, 2005, 11:00 PM

Post #8 of 33 (6386 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file


In Reply To
Thanks for the suggestion. Where can I find something to read about "regexp"? I am looking at my Perl book, and it doesn't seem to mention anything like that.


Sorry, regexp is the abbreviated term for Regular Expression, which is chapter 7 of the book you have. To match the first word of a line you will want to use the beginning-of-string anchor, ^, something like this:


Code
if ($string =~ m/^Exception/) {
    # do something
}


This will return true (find a match) if the line begins with the word Exception.
-------------------------------------------------


sfo_sc
Novice

Jan 18, 2005, 9:09 AM

Post #9 of 33 (6378 views)
Re: [KevinR] Read a file from a specific line and extract something to another file

Thanks again for your help. Do you have a link or a book where I can learn more about regexps? When I read other people's code, I often find it really hard to understand the /\/\ stuff they have in there.

So the ^ actually means the first word in the string? That's really convenient!


sfo_sc
Novice

Jan 18, 2005, 9:18 AM

Post #10 of 33 (6376 views)
Re: [KevinR] Read a file from a specific line and extract something to another file

Oh, and what does the "m" before /^Exception/ stand for? Thanks.


sfo_sc
Novice

Jan 18, 2005, 10:03 AM

Post #11 of 33 (6373 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file

Ah, I found out what that m thing means. But it seems like the result is the same with or without the m after =~. Is that true?


KevinR
Veteran


Jan 18, 2005, 10:09 AM

Post #12 of 33 (6373 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file


In Reply To
Oh, and what does the "m" before /^Exception/ stand for? Thanks.



Quote
...I just bought the Learning Perl book...


Chapter 7: Regular Expressions will explain all that stuff.

The lowercase m means "match". There are three basic regexps (regular expressions):

m (match)
s (substitution)
tr (trade)
-------------------------------------------------


sfo_sc
Novice

Jan 18, 2005, 10:54 AM

Post #13 of 33 (6368 views)
Re: [KevinR] Read a file from a specific line and extract something to another file

Thanks for the explanation.
One more thing: I told you I want to start reading the file at a certain line when I run the script, and now I can do that with Tie::File. So I will have to store the line number somewhere so I can use that number the next time I run the script. The question is, besides storing that number in a text file and reading it back in, is there a different or better way to do that? I noticed there is a __DATA__ section where you can store data at the end of the source file, but it can only be read, not written to. Are there any other ways to do this? Thanks.


KevinR
Veteran


Jan 18, 2005, 12:16 PM

Post #14 of 33 (6367 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file

Storing the line number in a text file on the server is a simple and good method. I'm sure you could be as elaborate as you want, maybe using MySQL or DBI or other database-type storage, but the text file solution is easy and will work fine.


As far as I know, you cannot write to the __DATA__ section of a script in any conventional manner. A script is just a text file after all, and you could open and write to the script if you really wanted to, but it's not exactly kosher, ifyaknowwhatImean. ;)
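
For the text-file approach, something along these lines might do. The state file name `logwatch.pos`, the temporary directory, and the helper names `load_position` and `save_position` are all hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use File::Temp 'tempdir';

# Hypothetical state file; a real script would use a fixed path.
my $state_file = tempdir(CLEANUP => 1) . '/logwatch.pos';

# Return the saved line number, or 0 if no previous run left one behind.
sub load_position {
    my ($path) = @_;
    open my $in, '<', $path or return 0;
    my $pos = <$in>;
    close $in;
    return 0 unless defined $pos && $pos =~ /^(\d+)/;
    return $1;
}

# Overwrite the state file with the new line number.
sub save_position {
    my ($path, $pos) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} "$pos\n";
    close $out or die "Cannot close $path: $!";
    return;
}

my $first_run = load_position($state_file);    # no file yet, so 0
save_position($state_file, 10_000);
my $second_run = load_position($state_file);

print "first run starts at $first_run, next run starts at $second_run\n";
```

Falling back to 0 when the file is missing or unreadable means the very first run simply processes the whole log.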
-------------------------------------------------


sfo_sc
Novice

Jan 18, 2005, 1:46 PM

Post #15 of 33 (6366 views)
Re: [KevinR] Read a file from a specific line and extract something to another file

Thanks for your help! =)


davorg
Thaumaturge / Moderator

Jan 19, 2005, 2:22 AM

Post #16 of 33 (6362 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file


In Reply To
What I want to do is read the file in, search for "Exceptions", and save that text in a separate file. But since the log file can be really big and I am going to run the script every few hours or so, I wonder if there is a way to start reading at a specific line of a file in Perl. That way I don't have to always start reading at the beginning of the file. I can store the line number in a temp file and use that number as the starting point when I run the script next time. Can someone give me some help? Thanks.


When you have finished processing the file, you can use the function tell to find the current position in the file (which you can store in your temp file). Then the next time you run your process, you can use seek to start processing from the correct place.
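
A rough sketch of the tell/seek idea, under the assumption that the log only ever grows between runs. The temporary file, its contents, and the helper name `read_from` are invented for the example; in a real script the offset would be saved to and loaded from the temp file between runs.

```perl
use strict;
use warnings;
use File::Temp 'tempfile';

# Stand-in log file so the sketch runs on its own.
my ($fh, $logfile) = tempfile(UNLINK => 1);
print {$fh} "old line 1\nold line 2\n";
close $fh or die "Cannot close $logfile: $!";

# Read from a byte offset to end of file; return the lines and the new offset.
sub read_from {
    my ($path, $offset) = @_;
    open my $in, '<', $path or die "Cannot open $path: $!";
    seek($in, $offset, 0) or die "Cannot seek in $path: $!";  # 0 = from file start
    my @lines = <$in>;
    my $new_offset = tell($in);
    close $in;
    return (\@lines, $new_offset);
}

# First run: start at offset 0 and remember where we stopped.
my ($first, $offset) = read_from($logfile, 0);

# The server appends more output between runs.
open my $append, '>>', $logfile or die "Cannot append to $logfile: $!";
print {$append} "new line 3\n";
close $append or die "Cannot close $logfile: $!";

# Second run: only the appended line is read.
my ($second, $offset2) = read_from($logfile, $offset);
print @$second;
```

Unlike counting lines, a byte offset lets the script jump straight to the new data without scanning what it has already seen. If the log is rotated or truncated, the saved offset may point past the end of the file, so a real script would probably want to compare it against the file size (`-s $path`) and start from 0 when the file has shrunk.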

--
Dave Cross, Perl Hacker, Trainer and Writer
http://www.dave.org.uk/
Get more help at Perl Monks


davorg
Thaumaturge / Moderator

Jan 19, 2005, 2:25 AM

Post #17 of 33 (6361 views)
Re: [KevinR] Read a file from a specific line and extract something to another file


In Reply To

Code
if ($string =~ m/^Exception/) {
    # do something
}


This will return true (find a match) if the line begins with the word Exception.


Putting my pedant head on, that code finds a match if the line begins with the _string_ "Exception". To match the _word_ "Exception" you would need:


Code
if ($string =~ m/^Exception\b/) {
    # do something
}


The difference is that Kevin's code would also match things like "Exceptional" whereas mine wouldn't.
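
A small demonstration of that difference; the two sample lines are made up.

```perl
use strict;
use warnings;

my @lines = ('Exception in thread "main"', 'Exceptional load on server');

my @results;
for my $line (@lines) {
    my $as_string = ($line =~ /^Exception/)   ? 1 : 0;  # any string starting with it
    my $as_word   = ($line =~ /^Exception\b/) ? 1 : 0;  # must end at a word boundary
    push @results, [$as_string, $as_word];
    print "$line => string: $as_string, word: $as_word\n";
}
```

The first line matches both patterns; the second matches only the version without \b, because the "a" that follows "Exception" is a word character and so there is no word boundary at that point.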

Of course, this may not matter to you :)



davorg
Thaumaturge / Moderator

Jan 19, 2005, 2:28 AM

Post #18 of 33 (6360 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file


In Reply To
So the ^ actually means the first word in the string? That's really convenient!


No, the ^ means "this part of the regex must match at the start of the string". It says nothing about words.

And if you use the /m option on your match or substitution operator, then the meaning is subtly changed to "this part of the regex must match at the start of the string or immediately after a newline in the string".
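
To illustrate, here is a short sketch with a made-up two-line string, as you might get by reading a whole file into one scalar.

```perl
use strict;
use warnings;

# One string holding two lines.
my $text = "INFO started\nException in thread\n";

my $without_m = ($text =~ /^Exception/)  ? 1 : 0;   # ^ only at start of string
my $with_m    = ($text =~ /^Exception/m) ? 1 : 0;   # ^ also after each newline

print "without /m: $without_m, with /m: $with_m\n";
```

When the file is processed one line at a time, each string holds a single line and the /m option makes no practical difference to ^.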



davorg
Thaumaturge / Moderator

Jan 19, 2005, 2:38 AM

Post #19 of 33 (6359 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file


In Reply To
Ah, I found out what that m thing means. But it seems like the result is the same with or without the m after =~. Is that true?


"That m thing" is the match operator. It is followed by a regular expression like this:


Code
m/some regex/


However, sometimes you might want your regex to contain the / character. In those cases you would need to escape the embedded / with a \, which makes your regex look ugly.


Code
m/some\/regex/


In these cases, Perl allows you to use a different delimiter for your regular expression, which makes it more readable.


Code
m|some/regex|


But if you are using the standard / as your regex delimiter, you don't need to use the m. Perl implies its existence.


Code
/some regex/


Similar rules apply to the q and qq operators. See the sections on "Regexp quote-like operators" and "Quote and quote-like operators" in perldoc perlop for more details.
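
As a quick check that these forms are interchangeable, the sketch below matches the same path four ways. The path is borrowed from the log excerpt later in this thread; the variable names are arbitrary.

```perl
use strict;
use warnings;

my $path = '/opt/tomcat5/webapps/fsm.war';

my $escaped   = ($path =~ m/^\/opt\/tomcat5\//) ? 1 : 0;  # slashes escaped
my $pipes     = ($path =~ m|^/opt/tomcat5/|)    ? 1 : 0;  # | as delimiter
my $braces    = ($path =~ m{^/opt/tomcat5/})    ? 1 : 0;  # {} as delimiter
my $implied_m = ($path =~ /^\/opt/)             ? 1 : 0;  # m omitted with /

print "$escaped $pipes $braces $implied_m\n";
```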



davorg
Thaumaturge / Moderator

Jan 19, 2005, 2:41 AM

Post #20 of 33 (6358 views)
Re: [KevinR] Read a file from a specific line and extract something to another file


In Reply To
There are three basic regexps (regular expressions)

m (match)
s (substitution)
tr (trade)


Careful. Your terminology is a little off here. These aren't regular expressions. These are operators that use regular expressions as one of their operands. Or, rather, two of them are. The tr (transliterate) operator doesn't use regular expressions at all.

And there are other places where you can use regular expressions. The first parameter to "split" is an obvious example.
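
A brief sketch of both points: split taking a regular expression, and tr working on plain character lists instead. The sample strings are invented.

```perl
use strict;
use warnings;

# split takes a regular expression as its first argument.
my @fields = split /\s*,\s*/, 'total = 390 ,  i = 12,message = none';

# tr works on a fixed list of characters, not a regex; here it uppercases
# the string and returns how many characters it changed.
my $word  = 'exception';
my $count = ($word =~ tr/a-z/A-Z/);

print join('|', @fields), "\n";
print "$word ($count characters changed)\n";
```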



sfo_sc
Novice

Jan 19, 2005, 3:14 PM

Post #21 of 33 (6348 views)
Re: [davorg] Read a file from a specific line and extract something to another file

Thank you very much for your detailed explanation. This is really helpful for a new Perl coder like myself.

Actually, I just ran into another problem while writing this script.

First let me show you a piece of the log file:
java.lang2.reflect.InvocationTargetException
at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:520)
at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:929)
at org.apache.coyote.tomcat5.CoyoteAdapter.service(CoyoteAdapter.java:160)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:799)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.processConnection(Http11Protocol.java:705)
at org.apache.tomcat.util.net.TcpWorkerThread.runIt(PoolTcpEndpoint.java:577)
at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:683)
at java.lang.Thread.run(Thread.java:534)
Caused by: com.sorrent.db.DatabaseException: Backend start-up failed: FATAL: Sorry, too many clients already

at com.sorrent.db.Database.getConnection(Database.java:113)
at com.sorrent.foxsports.api.Manager.getConnection(Manager.java:14)
at com.sorrent.foxsports.api.NewsManager.getNewsLite(NewsManager.java:39)
at com.sorrent.server.fsm.fsm.News.update(News.java:67)
at com.sorrent.server.fsm.ContentInst.updateInstance(ContentInst.java:49)
at com.sorrent.server.fsm.fsm.Sport.updateNews(Sport.java:48)
... 39 more

Notice there is an empty line after the "Caused by: ..." line. I want the code, whenever it sees the empty line, to push that line to the file or something. Using =~ m/\n/ didn't catch the empty line. I also tried <null>, which failed too. Can anyone tell me how to match an empty string?


KevinR
Veteran


Jan 19, 2005, 3:36 PM

Post #22 of 33 (6345 views)
Re: [davorg] Read a file from a specific line and extract something to another file

Thanks Dave, of course you are correct. I was really trying to get the thread starter to read the book they said they have, where they could have found all the information they were looking for, but I do appreciate your corrections and your more thorough explanations.
-------------------------------------------------


sfo_sc
Novice

Jan 19, 2005, 3:43 PM

Post #23 of 33 (6342 views)
Re: [KevinR] Read a file from a specific line and extract something to another file

You are right, man. It just happens that I need to write this script at work and I don't really have the time to read the whole book line by line. Thank you for pointing out which chapter or link I should look at for my problems.


KevinR
Veteran


Jan 19, 2005, 4:13 PM

Post #24 of 33 (6340 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file


In Reply To
Thank you very much for your detailed explanation. This is really helpful for a new Perl coder like myself.

Actually, I just ran into another problem while writing this script.


Notice there is an empty line after the "Caused by: ..." line. I want the code, whenever it sees the empty line, to push that line to the file or something. Using =~ m/\n/ didn't catch the empty line. I also tried <null>, which failed too. Can anyone tell me how to match an empty string?


I'm not sure why you would want to push an empty line into an array, but you could just skip empty lines. Normally when looping through lines of text you chomp them, which removes the newline character; then you can check if a line is empty with a regexp and use the next command to go to the next line:

next if ($string =~ /^$/);

As Dave said, you do not have to use the "m" operator for a match; it is implied, so my example does not have it. If you are using the implicit variable $_ rather than a named variable such as $string, you can just say:

next if (/^$/);

The "^" is the start-of-string anchor and the "$" is the end-of-string anchor, so ^$ with nothing between matches an empty line. If you think there might be lines containing only spaces, you could do this:

next if (/^\s*$/);

which matches a line with only zero or more whitespace characters in it.
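
Put together, skipping blank lines in a loop might look like the sketch below. The four input lines are made up to resemble the stack trace above.

```perl
use strict;
use warnings;

my @raw = (
    "Caused by: DatabaseException\n",
    "\n",
    "   \t\n",
    "\tat Database.getConnection\n",
);

my @kept;
for my $line (@raw) {
    chomp(my $copy = $line);        # drop the trailing newline first
    next if $copy =~ /^\s*$/;       # skip empty or whitespace-only lines
    push @kept, $copy;
}

print scalar(@kept), " non-blank lines\n";
```

Note that $ also matches just before a trailing newline, so /^$/ treats an unchomped "\n" as empty too.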

Pay particular attention to anything Dave has to say, he is not only an experienced perl programmer, he can also explain perl better than anyone I have ever come across on the internet, and he will not talk down to or be insulting to newbies like some people on other forums are. He is an invaluable resource, and I mean that sincerely.
-------------------------------------------------


sfo_sc
Novice

Jan 19, 2005, 4:30 PM

Post #25 of 33 (6338 views)
Re: [KevinR] Read a file from a specific line and extract something to another file

Thanks for the info. I found out I can use length($string) == 0 too. =)

By the way, when I look for lines that start with a tab I use m/^\t/, but when I look for lines that start with a space and use m/^\s/, the code goes into an infinite loop. Can someone tell me why? And how do I look for lines that start with a space?


(This post was edited by sfo_sc on Jan 19, 2005, 4:38 PM)


sfo_sc
Novice

Jan 19, 2005, 9:08 PM

Post #26 of 33 (2845 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file

What's wrong with this piece of code? I couldn't figure it out.

Code
while ($file[$i] =~ m/^\t/ || $file[$i] =~ m/^Caused/
    || length($file[$i]) == 0 || $file[$i] =~ m/^\s/) {
    push(@temp, $file[$i] . "\n");
    $i++;
}

I keep getting "Use of uninitialized value in pattern match (m//) at ./logwatch.pl line 53, <SOURCE> line 274." and it goes into an infinite loop. Please help.

It is complaining about the leading-whitespace condition I just added. If I remove that, the code works.

Line 53 is the $i++, which doesn't make sense at all.

Here is the case I want to capture to the file:

Code
java.lang.InternalError: jzentry == 0, 
jzfile = 135272768,
total = 390,
name = /opt/tomcat5/webapps/fsm.war,
i = 12,
message = invalid LOC header (bad signature)
at java.util.zip.ZipFile$2.nextElement(ZipFile.java:320)

I can capture the lines that start with a tab no problem, but I couldn't get the lines that start with a space.


(This post was edited by sfo_sc on Jan 19, 2005, 10:53 PM)


KevinR
Veteran


Jan 19, 2005, 10:35 PM

Post #27 of 33 (2838 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file


In Reply To
Thanks for the info. I found out I can use length($string) == 0 too. =)

but when I look for lines that start with a space and use m/^\s/, the code goes into an infinite loop.


Can you post an example of the code you are trying? Because


Code
if ($string =~ /^\s/) {
    # do something
}


should find lines that start with a space or a tab.
-------------------------------------------------


sfo_sc
Novice

Jan 19, 2005, 10:50 PM

Post #28 of 33 (2836 views)
Re: [KevinR] Read a file from a specific line and extract something to another file

Darn it! I figured out why. It is totally unrelated to the regular expression. It took me a whole day to figure out. Thank you for your help, KevinR.


(This post was edited by sfo_sc on Jan 19, 2005, 11:21 PM)


sfo_sc
Novice

Jan 19, 2005, 11:35 PM

Post #29 of 33 (2829 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file


In Reply To
What's wrong with this piece of code? I couldn't figure it out.

Code
while ($file[$i] =~ m/^\t/ || $file[$i] =~ m/^Caused/
    || length($file[$i]) == 0 || $file[$i] =~ m/^\s/) {
    push(@temp, $file[$i] . "\n");
    $i++;
}

I keep getting "Use of uninitialized value in pattern match (m//) at ./logwatch.pl line 53, <SOURCE> line 274." and it goes into an infinite loop. Please help.

It is complaining about the leading-whitespace condition I just added. If I remove that, the code works.

Line 53 is the $i++, which doesn't make sense at all.

Here is the case I want to capture to the file:

Code
java.lang.InternalError: jzentry == 0, 
jzfile = 135272768,
total = 390,
name = /opt/tomcat5/webapps/fsm.war,
i = 12,
message = invalid LOC header (bad signature)
at java.util.zip.ZipFile$2.nextElement(ZipFile.java:320)

I can capture the lines that start with a tab no problem, but I couldn't get the lines that start with a space.


My code magically works if I add another line at the end of the input file that doesn't match any of the cases. So if my input file ends with a line that matches one of the cases, my code goes into an infinite loop. Can someone see what's wrong with my code? What is causing the problem?


sfo_sc
Novice

Jan 19, 2005, 11:38 PM

Post #30 of 33 (2828 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file

I think I found the problem.
I'd better go get some sleep to clear my mind.


KevinR
Veteran


Jan 20, 2005, 12:10 AM

Post #31 of 33 (2827 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file

I think maybe you want to use "if" instead of "while" here:


Code
while ($file[$i] =~ m/^\t/ || $file[$i] =~ m/^Caused/
    || length($file[$i]) == 0 || $file[$i] =~ m/^\s/) {
    push(@temp, $file[$i] . "\n");
    $i++;
}


"while" is generally used for looping through lists or files while a condition is true. "if" is used to make a decision based on the evaluation of an expression.

What you have is something like this:


Code
$num = 1;
while ($num == 1) {
    print "AHHHH!!";
}


Since $num will always equal 1, the condition will always be true, and the loop will never exit. Now go get some sleep! ;)
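
Another possible explanation, offered as a guess that fits the reported warning and the observation that a non-matching final line hides the problem: $i may be running past the last element of @file. In that case $file[$i] is undef, the pattern matches emit "Use of uninitialized value", and length($file[$i]) == 0 evaluates as true, so the loop never stops. A sketch with a bounds check added follows; the three array elements are invented stand-ins for the tied file.

```perl
use strict;
use warnings;

# Stand-in for the tied @file array; the last element matches a condition,
# which is the situation described as triggering the endless loop.
my @file = ('Caused by: something', "\tat Foo.bar(Foo.java:1)", ' total = 390,');

my @temp;
my $i = 0;
while ($i < @file    # stop at the end of the array
    && (   $file[$i] =~ /^\s/        # leading space or tab (covers /^\t/)
        || $file[$i] =~ /^Caused/
        || length($file[$i]) == 0))
{
    push @temp, $file[$i] . "\n";
    $i++;
}

print scalar(@temp), " lines collected, stopped at index $i\n";
```

Since \s already matches a tab, the separate /^\t/ test is dropped here.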
-------------------------------------------------


(This post was edited by KevinR on Jan 20, 2005, 12:13 AM)


sfo_sc
Novice

Jan 20, 2005, 1:14 PM

Post #32 of 33 (2816 views)
Re: [KevinR] Read a file from a specific line and extract something to another file

Something I have learned since I was studying Computer Science: once my head really gets stuck, it is usually time to go to bed! :P


Jean
User


Jan 23, 2005, 10:15 AM

Post #33 of 33 (2803 views)
Re: [sfo_sc] Read a file from a specific line and extract something to another file

I'm not sure why the code would go into an infinite loop, but this may help you:

\s matches tabs as well as other whitespace, e.g. the space character (ASCII 32).


Jean Spector
SQA Engineer @ Exanet
jean.spector@softhome.net


There are only 10 types of people in the world -
Those who understand binary, and those who don't.

 
 

