Filtering Log files and Splitting by time.

 



x-plicit2009
Novice

Oct 19, 2009, 1:01 AM

Post #1 of 15 (2154 views)
Filtering Log files and Splitting by time.

Output of the file looks like:

<txt>16-OCT-2009 09:11:46 * 10.65.4.2
<txt>16-OCT-2009 09:11:47 * 10.65.4.24
<txt>16-OCT-2009 09:11:48 * 10.112.4.2

How can I filter this in order to get the date/time/IP address, then count the number of repeated IP addresses and separate the entries into small chunks of 1 hour?


ichi
User

Oct 19, 2009, 1:36 AM

Post #2 of 15 (2153 views)
Re: [AngeloSpinola] Filtering Log files and Splitting by time.

What have you tried?


x-plicit2009
Novice

Oct 19, 2009, 5:38 AM

Post #3 of 15 (2149 views)
Re: [ichi] Filtering Log files and Splitting by time.

Could you please give me some guidelines?


x-plicit2009
Novice

Oct 19, 2009, 7:32 AM

Post #4 of 15 (2143 views)
Re: [AngeloSpinola] Filtering Log files and Splitting by time.

I tried this, but I just got 1 occurrence of each. Why is this?




Code
$LOGFILE = "access.log";
open(LOGFILE) or die("Could not open log file.");

my @array = qw(10.0.2.149 10.0.2.23 10.0.2.32 10.0.4.1 10.0.4.102 10.1.1.30);
my %counts = ();
for (@array) {
    $counts{$_}++;
}
foreach my $keys (keys %counts) {
    print "$keys = $counts{$keys}\n";
}



x-plicit2009
Novice

Oct 19, 2009, 7:45 AM

Post #5 of 15 (2141 views)
Re: [AngeloSpinola] Filtering Log files and Splitting by time.

Well, I also tried out the code below, and it also gave me just 1 occurrence of each. Does anyone know why this happens?




Code
#!/usr/bin/perl
$LOGFILE = "access.log";
open(LOGFILE) or die("Could not open log file.");

use strict;
use warnings;
use Data::Dumper;

#my @array = qw(10.0.2.149 10.0.2.23 10.0.2.32);
#my %counts = ();
#for (@array) {
#    $counts{$_}++;
#}
#foreach my $keys (keys %counts) {
#    print "$keys = $counts{$keys}\n";
#}

my %hash;

while (<DATA>) {
    chomp;
    $hash{$_}++;
}
print Dumper \%hash;

__DATA__
10.0.2.149
10.0.2.23
10.0.2.32



FishMonger
Veteran / Moderator

Oct 19, 2009, 7:53 AM

Post #6 of 15 (2140 views)
Re: [AngeloSpinola] Filtering Log files and Splitting by time.


In Reply To
I tried this, but I just got 1 occurrence of each. Why is this?




Code
$LOGFILE = "access.log";
open(LOGFILE) or die("Could not open log file.");

my @array = qw(10.0.2.149 10.0.2.23 10.0.2.32 10.0.4.1 10.0.4.102 10.1.1.30);
my %counts = ();
for (@array) {
    $counts{$_}++;
}
foreach my $keys (keys %counts) {
    print "$keys = $counts{$keys}\n";
}


You only have 1 occurrence of each in your array, so why would you think you'd have more than that when you printed out the values?
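For contrast, a minimal self-contained sketch (with made-up addresses) where the array does contain a duplicate, so a hash count actually rises above 1:

```perl
use strict;
use warnings;

# Same counting idiom as above, but the list now contains a duplicated entry.
my @array = qw(10.0.2.149 10.0.2.23 10.0.2.149);
my %counts;
$counts{$_}++ for @array;

for my $ip (sort keys %counts) {
    print "$ip = $counts{$ip}\n";    # 10.0.2.149 = 2, 10.0.2.23 = 1
}
```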


x-plicit2009
Novice

Oct 19, 2009, 7:59 AM

Post #7 of 15 (2137 views)
Re: [FishMonger] Filtering Log files and Splitting by time.

Hmm, I see, but I am trying to get these IPs from a log file, where they have more than one occurrence. How could I populate this array with the data in the file?


FishMonger
Veteran / Moderator

Oct 19, 2009, 8:01 AM

Post #8 of 15 (2135 views)
Re: [AngeloSpinola] Filtering Log files and Splitting by time.

Show us your code that processes the log file.


x-plicit2009
Novice

Oct 19, 2009, 8:14 AM

Post #9 of 15 (2132 views)
Re: [FishMonger] Filtering Log files and Splitting by time.

At the moment it's just a grep procedure applied to an XML file, and its output is not the cleanest, as you can see above. I am not sure what the best format is...

What I do know is what I need afterwards.

That is:

Time
Date
IP

Statistical report:

Number of duplicates;
Number of connections per time span, with granularities such as hour/day/week

Could you please give me some advice? I am quite new to Perl.


x-plicit2009
Novice

Oct 19, 2009, 8:17 AM

Post #10 of 15 (2129 views)
Re: [AngeloSpinola] Filtering Log files and Splitting by time.

Matching an IP address could be done as follows:


Code
if (/\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b/) {
    print "$1\n";
}

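Building on that pattern, here is a hedged sketch that captures date, time, and IP in one pass, assuming lines shaped like the `<txt>16-OCT-2009 09:11:46 * 10.65.4.2` sample from the first post (the in-memory `$log` string stands in for the real log file handle):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data in the format shown in post #1; in practice, open the real log instead.
my $log = <<'END';
<txt>16-OCT-2009 09:11:46 * 10.65.4.2
<txt>16-OCT-2009 09:11:47 * 10.65.4.24
<txt>16-OCT-2009 09:11:48 * 10.112.4.2
END

open my $fh, '<', \$log or die "open: $!";

my %counts;
while ( my $line = <$fh> ) {
    # Capture date, time, and IP from: <txt>DD-MON-YYYY HH:MM:SS * a.b.c.d
    if ( $line =~ /^<txt>(\d{2}-[A-Z]{3}-\d{4}) (\d{2}:\d{2}:\d{2}) \* (\d{1,3}(?:\.\d{1,3}){3})/ ) {
        my ( $date, $time, $ip ) = ( $1, $2, $3 );
        $counts{$ip}++;
        print "$date $time $ip\n";
    }
}

print "$_ seen $counts{$_} time(s)\n" for sort keys %counts;
```

Counting every IP this way, rather than starting from a hard-coded array, is what populates `%counts` with real occurrence numbers.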


savo
User

Oct 20, 2009, 3:23 AM

Post #11 of 15 (2113 views)
Re: [AngeloSpinola] Filtering Log files and Splitting by time.

I did something similar a week or two ago; if you look at my posts there is more info (I think one is about hashes and the other about further problems).

Here is my code for finding IPs and counting how many times each has been seen.


Code
while (<MESSEGES>) {

    if (/Failed password/) {

        @list = split;
        if ( $list[12] =~ /^(\d+\.){3}\d+/ ) {
            $hash{ $list[12] } += 1;
        }
        elsif ( $list[10] =~ /^(\d+\.){3}\d+/ ) {
            $hash{ $list[10] } += 1;
        }
    }
}



x-plicit2009
Novice

Oct 20, 2009, 4:32 AM

Post #12 of 15 (2106 views)
Re: [savo] Filtering Log files and Splitting by time.

Thanks for the feedback,

The counting issue is sorted now. The problem now is analyzing the dates: getting the time and date from the log file and, with these, splitting the file into smaller chunks with week/day/hour granularity.
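One way to get the hour granularity, sketched under the same `<txt>` line-format assumption as the sample in the first post: use the date plus the hour portion of the timestamp as a bucket key, so every entry falls into a one-hour chunk (day or week keys could be built the same way):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative in-memory sample; in practice read the real log file.
my $log = <<'END';
<txt>16-OCT-2009 09:11:46 * 10.65.4.2
<txt>16-OCT-2009 09:59:02 * 10.65.4.24
<txt>16-OCT-2009 10:03:10 * 10.65.4.2
END

open my $fh, '<', \$log or die "open: $!";

my %per_hour;
while ( my $line = <$fh> ) {
    # Keep the date and only the HH part of HH:MM:SS, so entries group by hour.
    next unless $line =~ /^<txt>(\d{2}-[A-Z]{3}-\d{4}) (\d{2}):\d{2}:\d{2} \* (\S+)/;
    my ( $date, $hour, $ip ) = ( $1, $2, $3 );
    $per_hour{"$date $hour:00"}{$ip}++;
}

for my $bucket ( sort keys %per_hour ) {
    for my $ip ( sort keys %{ $per_hour{$bucket} } ) {
        print "$bucket  $ip  $per_hour{$bucket}{$ip}\n";
    }
}
```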


savo
User

Oct 20, 2009, 12:10 PM

Post #13 of 15 (2091 views)
Re: [AngeloSpinola] Filtering Log files and Splitting by time.

I have all day tomorrow to have a go at it, if I am not too hungover after tonight, but here is what I came up with this evening.

This code isn't working right yet, but it will ask for a date and print out the info for that date. Do you want to be prompted for dates and times to look up, or to save to different files based on date (and maybe time as well)? Also, one of the reasons this doesn't work is that the IPs will be duplicated at different times. Is that OK with a count next to each, or would you want just the first or last time they were seen?



Code
#!/usr/bin/perl
use warnings;
use strict;
use 5.010;

sub dateconversion {
    my $num = shift;
    my @month = qw(JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC);
    return $month[$num];
}

say "Please enter a date to lookup";
chomp( my $lookup = <STDIN> );

my @lookup = split /\D/, $lookup;
--$lookup[1];

$lookup[1] = dateconversion( $lookup[1] );

my $daytolookup = join "-", @lookup;

if ( !open TEST, "test" ) {
    die "didn't open? ($!)";
}

my %hash;

while (<TEST>) {

    my ( $date, $time, $dump, $ip ) = split / /, $_;
    $date =~ s/<txt>//;

    $hash{$date}{$time}{$ip} += 1;

}

foreach my $date ( keys %hash ) {
    for my $time ( keys %{ $hash{$date} } ) {
        for my $ip ( keys %{ $hash{$date}{$time} } ) {
            say "$date -- $time -- $ip" if $date eq $daytolookup;

        }
    }
}



x-plicit2009
Novice

Oct 20, 2009, 2:16 PM

Post #14 of 15 (2085 views)
Re: [savo] Filtering Log files and Splitting by time.

Hey, thanks!

The file could be split into smaller chunks, for example per hour/day/week, based on the date/time in the log file. That seems the broader solution. I think the most appropriate report (output) would be, e.g.: IP address -> number of times it showed up -> red/green alert (depending on the case).

I am not quite sure of the best approach, but I see two solutions:

1) One whole program: the user inputs the times and dates, the splitting is made based on the time in the log file, then it counts the IPs and writes to another file giving the IP and the count of duplicate IPs, and finally it emits an alert string if the count reaches a certain value, e.g. 100 duplicate IP counts (red/green alert).

2) Two programs: the first splits the log file into smaller files per hour/day/week, and the second does the log-tracing analysis on each smaller file, outputting the count of duplicates and throwing an alert if the maximum value is reached.

I look forward to your feedback.
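Option 2's first program (the splitter) could be sketched roughly like this, again assuming the `<txt>` line format from the first post; the `split-<date>.log` file names and the in-memory sample data are illustrative only:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative sample spanning two days; in practice read the real log file.
my $log = <<'END';
<txt>16-OCT-2009 09:11:46 * 10.65.4.2
<txt>17-OCT-2009 08:00:01 * 10.0.2.23
END

my %by_day;
open my $fh, '<', \$log or die "open: $!";
while ( my $line = <$fh> ) {
    # Group whole lines by their date field.
    next unless $line =~ /^<txt>(\d{2}-[A-Z]{3}-\d{4})/;
    push @{ $by_day{$1} }, $line;
}

# Write one output file per day.
for my $date ( sort keys %by_day ) {
    open my $out, '>', "split-$date.log" or die "write split-$date.log: $!";
    print {$out} @{ $by_day{$date} };
    close $out or die "close: $!";
}
```

The second program could then run the counting/alert analysis over each `split-*.log` file in turn.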


savo
User

Oct 21, 2009, 9:34 AM

Post #15 of 15 (2068 views)
Re: [AngeloSpinola] Filtering Log files and Splitting by time.

I did not get as much time as I wanted today (kids are ill), but I have worked on it a little.

When run by default it will write all the logs to files by day. You will either have to edit it or add a logs dir. It appends data to the end of each log file, so if it's run against the same log file more than once it will add duplicate data to the end. This isn't too hard to fix: either have it overwrite the old file or add some checking.

If you run it with -s you will get a date prompt; enter the date like 16/10/2009 (it is split on any non-digit, so 16*10^2009 would work as well).

If you run it with -v it will display all the data on screen. I did have it outputting better, but that broke when I put it in a subroutine.

Could you send me a sample log file so I can play with it some more when I have time?

EDIT

The sort by date is not working; I couldn't quite work it out, so I will start my own post about that and add it afterwards.


Code
#!/usr/bin/perl
use warnings;
use strict;
use 5.010;

my $lookup;
my %hash;
my %count;

sub dateconversion {
    my $num = shift;
    my @month = qw(JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC);
    return $month[$num];
}

sub search {
    my $date     = shift;
    my $time     = shift;
    my $ip       = shift;
    my $count    = shift;
    my $lastdate = shift;

    my @lookup = split /\D/, $lookup;
    --$lookup[1];

    $lookup[1] = dateconversion( $lookup[1] );

    my $daytolookup = join "-", @lookup;
    say "$ip -- $date -- $time -- $count" if $date eq $daytolookup;
}

sub outputtoscreen {
    my $date     = shift;
    my $time     = shift;
    my $ip       = shift;
    my $count    = shift;
    my $lastdate = shift;

    # say "---------------" unless $lastdate eq $date; # broken now it's passed to a sub
    say "$ip -- $date -- $time -- $count";
    $lastdate = $date;
}

sub outputtofile {
    my $date     = shift;
    my $time     = shift;
    my $ip       = shift;
    my $count    = shift;
    my $lastdate = shift;
    if ( !open OUTPUT, ">>logs/logger-$date" ) {
        die "didn't open? ($!)";
    }
    select OUTPUT;

    say "$ip -- $date -- $time -- $count{$ip}";
    $lastdate = $date;
    select STDOUT;
}

if ( @ARGV and $ARGV[0] =~ /-s/ ) {
    say "Please enter a date to lookup";
    chomp( $lookup = <STDIN> );
}

if ( !open TEST, "test" ) {
    die "didn't open? ($!)";
}

while (<TEST>) {
    chomp;
    my ( $date, $time, $dump, $ip ) = split / /, $_;
    $date =~ s/<txt>//;
    $hash{$date}{$time}{$ip} +=
        1;    # this needs fixing as there is no need to count; it will always be 1
    $count{$ip} += 1;
}

close TEST;

foreach my $date ( sort keys %hash ) {
    my $lastdate = 1;
    for my $time ( sort keys %{ $hash{$date} } ) {
        for my $ip ( sort keys %{ $hash{$date}{$time} } ) {
            if ( @ARGV and $ARGV[0] =~ /-v/ ) {
                outputtoscreen( $date, $time, $ip, $count{$ip}, $lastdate );
            }
            elsif ( @ARGV and $ARGV[0] =~ /-s/ ) {
                search( $date, $time, $ip, $count{$ip}, $lastdate );
            }
            else {
                outputtofile( $date, $time, $ip, $count{$ip}, $lastdate );
            }
        }
    }
}



(This post was edited by savo on Oct 21, 2009, 9:36 AM)
