File::Tail Problems

 



daves
Novice

May 16, 2013, 6:47 AM

Post #1 of 21 (1534 views)
File::Tail Problems

Guys,

I am having a problem tailing a log file. Eventually, my script writes non-printable characters (i.e. square boxes) to the log file I am creating.

At this point, I have basically stripped all of my code down to the following lines, in the hope that I could just compare the original log file to the one I created (they should be the same). But they never are. My script will take several lines in a row (20-30) and represent them with non-printable characters. When I open the original file, of course those lines contain regular characters. Another interesting fact is that the original file and my file are always the exact same size.



Any help would be appreciated.


Code
use File::Tail;
my $file = File::Tail->new("/home/logs/2013/05.15/users/Client.log");
$debug_log = "$ENV{HOME}/log_scrape_debug.log";
while (defined(my $line = $file->read)) {
    open(OUT, ">> $debug_log");
    print OUT $line;
    close(OUT);
}



FishMonger
Veteran / Moderator

May 16, 2013, 7:38 AM

Post #2 of 21 (1526 views)
Re: [daves] File::Tail Problems


Quote

Code
while (defined(my $line = $file->read)) {
    open(OUT, ">> $debug_log");
    print OUT $line;
    close(OUT);
}


That is extremely inefficient.

The open and close statements should not be inside the loop.

You should be using a lexical var for the filehandle and the 3-arg form of open.

Does your Client.log contain only ascii chars or does it also contain utf-8?


daves
Novice

May 16, 2013, 7:47 AM

Post #3 of 21 (1522 views)
Re: [FishMonger] File::Tail Problems

Hi,
The Client.log definitely contains utf-8. Does that affect this?

The open and close are in the loop so the file gets updated in real-time. Is there a way to update the file in real-time and not wait for the loop to end?

Thanks for your help.


FishMonger
Veteran / Moderator

May 16, 2013, 8:00 AM

Post #4 of 21 (1517 views)
Re: [daves] File::Tail Problems

I'm not sure if the issue you're having is due to the utf-8, but it's worth looking into.


Quote
Is there a way to update the file in real-time and not wait for the loop to end?

Unless you've implemented a timeout, the loop will never end because File::Tail blocks while waiting for more lines to be read.

Here's a better way to write that loop, without taking into account a timeout.

Code
my $debug_log = "$ENV{HOME}/log_scrape_debug.log"; 

open my $debug_fh, '>>', $debug_log or die "failed to open '$debug_log' $!";

while (defined(my $line = $file->read)) {
    print $debug_fh $line;
}

close $debug_fh;



daves
Novice

May 16, 2013, 8:06 AM

Post #5 of 21 (1515 views)
Re: [FishMonger] File::Tail Problems

I see, but that doesn't write to the file in real-time, correct?


FishMonger
Veteran / Moderator

May 16, 2013, 8:08 AM

Post #6 of 21 (1514 views)
Re: [daves] File::Tail Problems

What is the real problem that your script needs to resolve?

Based on what you've posted, File::Tail looks to be the wrong choice. Maybe you should be looking at File::Copy.


FishMonger
Veteran / Moderator

May 16, 2013, 8:11 AM

Post #7 of 21 (1512 views)
Re: [daves] File::Tail Problems

It writes to the file as soon as File::Tail reads a new line. If you want it more realtime than that, you need to decrease the time File::Tail waits before checking for new lines.


daves
Novice

May 16, 2013, 8:23 AM

Post #8 of 21 (1510 views)
Re: [FishMonger] File::Tail Problems

The real problem my script needs to resolve is the fact that I am missing events...

So, I wrote a script that monitors a log file, in real-time, for certain events. If I see one of those events in the log file, I do something.

The script was working; however, I was missing some events. I looked in the log file and saw that the events existed, so I wasn't sure why my script wasn't picking them up. It had to be one of two things:

1. My code to check for the events was wrong.
OR
2. File::Tail was not giving me the events.


In order to check whether File::Tail was giving me the events, I took my script and basically just tried to recreate the log file, to ensure it was exactly the same. Since it wasn't, I determined that for some reason File::Tail was printing the characters I see in the log file as non-printable squares. If I figure out why that is happening, I believe my real script will pick up all events.

So I really need to figure out why File::Tail is not printing out the entries as they happen. Is it because of utf-8?

Another interesting finding...

Let's assume the script has been running all day long and I kill it. If I set File::Tail to start from the beginning of the file, the script works perfectly... UNTIL... I get to the real-time writing. It appears this is when I start getting the non-printable characters.


FishMonger
Veteran / Moderator

May 16, 2013, 8:46 AM

Post #9 of 21 (1506 views)
Re: [daves] File::Tail Problems

Change the open statement to this:

Code
open my $debug_fh, '>:encoding(UTF-8)', $debug_log
    or die "failed to open '$debug_log' $!";


And let's try adjusting the File::Tail object.

Code
my $file = File::Tail->new(
    name        => "/home/logs/2013/05.15/users/Client.log",
    tail        => 0,    # set to -1 if you want to start at the beginning of the file instead of the end
    interval    => 2,    # start out checking for new data every 2 seconds
    maxinterval => 10,   # never wait more than 10 seconds between checks
    debug       => 1,    # turn on File::Tail's debugging output
);



daves
Novice

May 16, 2013, 9:29 AM

Post #10 of 21 (1501 views)
Re: [FishMonger] File::Tail Problems

Darn... no luck. It will work for the first 5,000 lines... then fail... then the next 8,000 will work... then fail... etc., etc.

Thanks for your help. If you have any other ideas, I'd love to hear them.


FishMonger
Veteran / Moderator

May 16, 2013, 9:40 AM

Post #11 of 21 (1498 views)
Re: [daves] File::Tail Problems

You could try reducing the intervals a little more.

Did the debug output give any clues?

How many lines per second are being added to the primary log file?

Can you be more descriptive on how it's failing?

Have you compared the failure lines in both files with a hex editor to see how they differ?

What kind of program is generating the log entries?


daves
Novice

May 16, 2013, 10:11 AM

Post #12 of 21 (1495 views)
Re: [FishMonger] File::Tail Problems


Quote
Did the debug output give any clues?


No, I didn't see any. Where would it go? To the screen? To the output file?



Quote
How many lines per second are being added to the primary log file?



Probably somewhere around 100.



Quote
Can you be more descriptive on how it's failing?


When I look at the output log I created from Perl, it has a few thousand lines that look like the original log file (perfect... this is the expected result).

Then I see a huge line of squares (probably 300-500 characters long on a single line) <-- this, to me, is a failure.




Quote
Have you compared the failure lines in both files with a hex editor to see how they differ?


No, I will try that.



Quote
What kind of program is generating the log entries?


A custom real-time application for financial traffic.





As I was saying before, if I switch to tail => -1... it will run through the whole file without any issues (no huge lines of squares).

A colleague of mine believes that it's really not a Perl problem, but rather a tail problem. I am going to try a straight tail -f > file.txt from the command prompt and see if that tails properly.

Thanks for your help. I really appreciate it.


daves
Novice

May 16, 2013, 10:40 AM

Post #13 of 21 (1491 views)
Re: [daves] File::Tail Problems

From the command line, tail -f > file.txt has the same problem. Bummer.

OK, so I need to find a new way of monitoring a log file that has 3.5 million rows and a file size of 200MB. Any suggestions? I was thinking of:

1. Keep a counter of how many rows I have read so far.
2. Run a wc -l on the file to get the total number of rows.
3. Subtract the rows read so far from the total number of rows.
4. Use File::ReadBackwards to read that many rows from the end of the file and put them into an array.

Any thoughts on how fast that will be or a better way to do it?
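
Roughly what I had in mind, in case it helps (untested sketch -- the path and the row counter are just placeholders):

Code
use strict;
use warnings;
use File::ReadBackwards;

my $log       = "/home/logs/2013/05.15/users/Client.log";
my $rows_read = 3_400_000;     # 1. counter of rows read so far (kept by the monitoring script)

# 2. total rows currently in the file
my $total = 0 + `wc -l < $log`;

# 3. rows that have appeared since the counter was taken
my $new_rows = $total - $rows_read;

# 4. read just those rows off the end of the file
my $bw = File::ReadBackwards->new($log)
    or die "can't read '$log': $!";

my @lines;
for (1 .. $new_rows) {
    my $line = $bw->readline;
    last unless defined $line;
    unshift @lines, $line;     # keep them in oldest-first order
}

# @lines now holds the rows added since the last pass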


FishMonger
Veteran / Moderator

May 16, 2013, 12:01 PM

Post #14 of 21 (1486 views)
Re: [daves] File::Tail Problems

Can you provide more details on the overall process/problem that you're trying to solve?

One possible approach, depending on what your real goal is, would be to open a standard filehandle and process the data as needed. When you get to the eof, use the tell() function to retrieve/store the current byte offset and then close the file. When you reopen the file, use the seek() function to move the file pointer to that offset and begin reading/parsing from that point.
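
Something along these lines (untested sketch; the state file name and the process() sub are placeholders for whatever you're actually doing with each line):

Code
use strict;
use warnings;

my $log   = "/home/logs/2013/05.15/users/Client.log";
my $state = "$ENV{HOME}/client_log.offset";   # remembers the byte offset between runs

# read the offset saved by the previous run (0 on the first run)
my $offset = 0;
if (open my $state_fh, '<', $state) {
    my $saved = <$state_fh>;
    close $state_fh;
    chomp($offset = $saved) if defined $saved;
}

open my $log_fh, '<', $log or die "failed to open '$log' $!";
seek $log_fh, $offset, 0;      # jump to where the previous run stopped

while (my $line = <$log_fh>) {
    process($line);            # your event checking goes here
}

$offset = tell $log_fh;        # how far we got this time
close $log_fh;

open my $state_fh, '>', $state or die "failed to open '$state' $!";
print $state_fh "$offset\n";
close $state_fh;

sub process { }                # placeholder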


daves
Novice

May 16, 2013, 12:23 PM

Post #15 of 21 (1484 views)
Re: [FishMonger] File::Tail Problems

Simply put... I am trying to tail a log file, monitoring for certain events in real-time.

My problem appears to be that tail -f (even from a command prompt) will return non-printable characters every so often.


FishMonger
Veteran / Moderator

May 16, 2013, 12:33 PM

Post #16 of 21 (1481 views)
Re: [daves] File::Tail Problems

Could it be that the editor (or shell) that you're using to view the data doesn't support the needed character set (utf-8 or maybe utf-16)?

Have you looked at it with a hex editor yet?

Can you post a sample of the data as an attachment?


daves
Novice

May 16, 2013, 12:40 PM

Post #17 of 21 (1479 views)
Re: [FishMonger] File::Tail Problems

Unfortunately, I can't post the data, as it contains sensitive financial transactions. So here is why I don't think it's utf-8 or utf-16:

If I run a simple Perl script to just tail the original log file and print to the debug log file, starting from the very beginning of the log file, all the data gets printed perfectly (and I am talking about more than 3/4 of a day of logged data)... Once the tail catches up to the real-time log entries, it starts printing the non-printable characters every so often: 5,000-10,000 rows work and then it starts to fail.

If I stop the Perl script and re-run it from the beginning of the log file, when the script hits those same rows that it processed as non-printable during 'real-time', it processes them fine...


FishMonger
Veteran / Moderator

May 16, 2013, 12:58 PM

Post #18 of 21 (1476 views)
Re: [daves] File::Tail Problems

Hmm, I'm running out of ideas.

Is it failing more often, less often, or about the same since we decreased the interval values?

How about trying sysopen and syswrite instead of open and print?

Or, you can try turning off I/O buffering (i.e., $| = 1;) and setting binmode on the filehandle.
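
i.e. something like this for the unbuffered/binmode version (untested):

Code
use IO::Handle;                # for autoflush()

open my $debug_fh, '>>', $debug_log
    or die "failed to open '$debug_log' $!";

binmode $debug_fh;             # pass the bytes through untouched, no encoding layer
$debug_fh->autoflush(1);       # per-handle equivalent of $| = 1

while (defined(my $line = $file->read)) {
    print $debug_fh $line;
}

close $debug_fh;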


daves
Novice

May 16, 2013, 1:00 PM

Post #19 of 21 (1474 views)
Re: [FishMonger] File::Tail Problems

That's OK... I appreciate your effort. I am working on a 'hack' that perhaps I will post once I prove that all entries are being properly written... I guess you could call it a custom tail program.


BillKSmith
Veteran

May 16, 2013, 2:05 PM

Post #20 of 21 (1467 views)
Re: [daves] File::Tail Problems

I would not expect your OS to allow it, but it sounds like Perl is reading data before it is physically written to the disk. Perhaps, as a workaround, you could try to redirect the log output to a Perl program and write both files with that program.
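
Something like this, with the logging program's output piped into it (untested; file names are placeholders):

Code
#!/usr/bin/perl
# usage (roughly): logging_program | perl tee_log.pl
use strict;
use warnings;
use IO::Handle;

open my $orig, '>>', '/home/logs/2013/05.15/users/Client.log'
    or die "can't open original log: $!";
open my $copy, '>>', "$ENV{HOME}/log_scrape_debug.log"
    or die "can't open copy: $!";

$orig->autoflush(1);           # write lines out as soon as they arrive
$copy->autoflush(1);

# write every incoming line to both files
while (my $line = <STDIN>) {
    print $orig $line;
    print $copy $line;
}

close $orig;
close $copy;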
Good Luck,
Bill


FishMonger
Veteran / Moderator

May 16, 2013, 2:23 PM

Post #21 of 21 (1464 views)
Re: [BillKSmith] File::Tail Problems

Earlier post from Dave

Quote
From the command line tail -f > file.txt has the same problem.


That tells me that the problem is probably at the OS level. However, we haven't been given any details on how/when he's viewing file.txt. Is it as it's being written to via another tail -f process in a separate terminal window? Or is the tail -f process being killed and the file then loaded into a text editor?

I often redirect the output of tail -f on large, fast-growing log files and have never come across his reported issue.


(This post was edited by FishMonger on May 16, 2013, 2:27 PM)

 
 

