How to skip 14 digits from a syslog file??

 



amanjosan2008
Novice

Jul 20, 2013, 12:51 PM

Post #1 of 14 (686 views)
How to skip 14 digits from a syslog file??

The Syslog file entry is as follows:-

Jul 21 01:08:34 hostname kernel: [15125.218403] [IPTABLES DROP] : IN=ppp0 OUT= MAC= SRC=78.99.104.112 DST=180.87.232.130 LEN=131 TOS=0x00 PREC=0x00 TTL=118 ID=5539 PROTO=UDP SPT=45670 DPT=56838 LEN=111

The code snippet to read the file is as follows:-


while (<LOG_FILE>) {
if (!/$log_tag/) { next; }
my(@entry_split)=split / +/;
my(%entry);

#year is not in syslog date format... try to guess it from the local time
# ***TODO*** what happen when the year change ?
my($year);
(undef,undef,undef,undef,undef,$year,undef,undef,undef) = localtime(time);
$year += 1900;

$entry{'date'}="$year-".$m{shift(@entry_split)}."-".shift(@entry_split)." ".shift(@entry_split);
$entry{'host'}=shift(@entry_split);
shift(@entry_split); # kernel:
shift(@entry_split); # [IPTABLES

my($chain_name)=shift(@entry_split); # DROP]
$chain_name=~s/\]//;

shift(@entry_split); # :
foreach (@entry_split) {
if (/(.*)=(.*)/) {
(my($field),my($value))=split /=/;
$entry{$field}=$value;
}
}


The output is:-

'kirat-1440','2013-07-21 01:08:34','[IPTABLES','ppp0','78.99.104.112','adsl-dyn112.78-99-104.t-com.sk','180.87.232.130','unknown','UDP','45670','56838'


I want to skip this word :- [15125.218403]
Sometimes it is [ 9125.218403]

The space between the bracket and the digits usually causes the issue.
Therefore I want to ignore 14 characters.

I need to insert some code between these two lines to do so:-

shift(@entry_split); # kernel:
shift(@entry_split); # [IPTABLES

Inserting another shift(@entry_split); did not help.


The value for the chain name is '[IPTABLES' but it should be 'DROP'.

Please guide.


Laurent_R
Enthusiast / Moderator

Jul 20, 2013, 1:08 PM

Post #2 of 14 (684 views)
Re: [amanjosan2008] How to skip 14 digits from a syslog file??

Just trying at this point to make your message more readable by using the proper tags. Please try to do it yourself next time you post.


The Syslog file entry is as follows:-


Code
Jul 21 01:08:34 hostname kernel: [15125.218403] [IPTABLES DROP] : IN=ppp0 OUT= MAC= SRC=78.99.104.112 DST=180.87.232.130 LEN=131 TOS=0x00 PREC=0x00 TTL=118 ID=5539 PROTO=UDP SPT=45670 DPT=56838 LEN=111

The code snippet to read the file is as follows:-



Code
while (<LOG_FILE>) { 
if (!/$log_tag/) { next; }
my(@entry_split)=split / +/;
my(%entry);

#year is not in syslog date format... try to guess it from the local time
# ***TODO*** what happen when the year change ?
my($year);
(undef,undef,undef,undef,undef,$year,undef,undef,undef) = localtime(time);
$year += 1900;

$entry{'date'}="$year-".$m{shift(@entry_split)}."-".shift(@entry_split)." ".shift(@entry_split);
$entry{'host'}=shift(@entry_split);
shift(@entry_split); # kernel:
shift(@entry_split); # [IPTABLES

my($chain_name)=shift(@entry_split); # DROP]
$chain_name=~s/\]//;

shift(@entry_split); # :
foreach (@entry_split) {
if (/(.*)=(.*)/) {
(my($field),my($value))=split /=/;
$entry{$field}=$value;
}
}


The output is:-


Code
'kirat-1440','2013-07-21 01:08:34','[IPTABLES','ppp0','78.99.104.112','adsl-dyn112.78-99-104.t-com.sk','180.87.232.130','unknown','UDP','45670','56838'


I want to skip this word :- [15125.218403]
Sometimes it is [ 9125.218403]

The space between the bracket and the digits usually causes the issue.
Therefore I want to ignore 14 characters.

I need to insert some code between these two lines to do so:-

shift(@entry_split); # kernel:
shift(@entry_split); # [IPTABLES

Inserting another shift(@entry_split); did not help.


The value for the chain name is '[IPTABLES' but it should be 'DROP'.

Please guide.


Laurent_R
Enthusiast / Moderator

Jul 20, 2013, 3:46 PM

Post #3 of 14 (681 views)
Re: [amanjosan2008] How to skip 14 digits from a syslog file??

I think the easiest is to remove these words from the line before you do the split.


Code
while (<LOG_FILE>) { 
next unless /$log_tag/;
s/\[[\d ]\d{4}\.\d{6}]//; # or, simpler, with less detailed control on the match: s/\[[\d .]{12}]//;
my @entry_split = split /\s+/;
my %entry;
my $year = (localtime time)[5] + 1900;
$entry{'date'} = "$year-$m{$entry_split[0]}-$entry_split[1] $entry_split[2]";
# ...


To show you the result of both regex substitutions, consider this session under the Perl debugger:


Code
  DB<25> $_  = "Jul 21 01:08:34 hostname kernel: [15125.218403] [IPTABLES DROP] : IN=ppp0 OUT= MAC= " 

DB<26> s/\[[\d ]\d{4}\.\d{6}]//;

DB<27> p $_
Jul 21 01:08:34 hostname kernel: [IPTABLES DROP] : IN=ppp0 OUT= MAC=
DB<28> $_ = "Jul 21 01:08:34 hostname kernel: [15125.218403] [IPTABLES DROP] : IN=ppp0 OUT= MAC= "

DB<29> s/\[[\d .]{12}]//;

DB<30> p $_
Jul 21 01:08:34 hostname kernel: [IPTABLES DROP] : IN=ppp0 OUT= MAC=
DB<31> @entry_split = split /\s+/;

DB<32> x @entry_split
0 'Jul'
1 21
2 '01:08:34'
3 'hostname'
4 'kernel:'
5 '[IPTABLES'
6 'DROP]'
7 ':'
8 'IN=ppp0'
9 'OUT='
10 'MAC='
DB<33>


I do not understand your requirement about the IPTABLES DROP thing. What exactly do you want to have? Is "IPTABLES DROP" a constant (i.e. is it always there just the way you've shown), or can it be different words? Please explain.

Some additional comments on your code: it seems that you have initialized somewhere outside the code you've shown a %m hash, which is used to convert the month "Jul" into "07". It would have been nice to show it to us.
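For readers following along, a lookup of this kind can be very small. The sketch below is only a guess at what %m contains (English month abbreviations mapped to two-digit month numbers); the name @month_names is made up for the illustration.

```perl
use strict;
use warnings;

# Assumed shape of %m: English month abbreviations, as they appear in the
# syslog line, mapped to zero-padded month numbers.
my @month_names = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my %m;
@m{@month_names} = map { sprintf '%02d', $_ } 1 .. 12;

print "Jul -> $m{Jul}\n";    # prints "Jul -> 07"
```

A hash slice fills all twelve entries in one assignment, so the table stays in one readable line.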

I frankly would not use successive shift calls to assign values to variables or to discard unnecessary fields.

Do you really need the %entry hash somewhere else? If you are not using it, I would drop it and do something different.

To explain it, I'll take a different somewhat simpler example. This string contains the sales for each month through the year, separated by a space: "45 44 67 54 34 56 76 45 51 62 71 40". Suppose that, for some reason, I want to report only the sales for March, April, September and December. Remember that an array starts with subscript 0, therefore, March will be 2, April 3, etc.

Solution 1: assigning the useful values to variables, discarding the others:

Code
my (undef, undef, $mar, $apr, undef, undef, undef, undef, $sep, undef, undef, $dec) = split /\s+/; 
print "sales for March, April, September and December: $mar $apr $sep $dec\n";

Of course, because we are extracting here only 4 months out of 12, there are a lot of noisy undefs, but in other cases, where only a couple of fields are useless, a couple of undefs simplifies the coding.

Solution 2: split the whole thing into a temporary array and then use only the useful fields:


Code
my @fields = split /\s+/; 
my @results = @fields[2,3,8,11];
print "sales for March, April, September and December: @results \n";
# the @results array is unnecessary. I could have written: print "sales for March, April, September and December: @fields[2,3,8,11] \n";


You see how easy it is to get rid of the useless values of the array?

A more concise solution:

Code
my @results = ( split /\s+/)[2,3,8,11]; 
print "sales for March, April, September and December: @results \n";


And finally, the one-line solution demonstrated under the Perl debugger:

Code
  DB<34> $_ = "45 44 67 54 34 56 76 45 51 62 71 40"; 

DB<35> print "Sales for March, April, September and December: ", join " ", ( split /\s+/)[2,3,8,11], "\n";
Sales for March, April, September and December: 67 54 51 40


This one-line solution is just there to show how powerful this way of doing things can be. It might not be usable in your case, since some fields (such as the month) need reprocessing.

Please note that I have tested only the code that I have shown under the debugger. I haven't tested the other code samples, so there may be a small stupid mistake somewhere that I haven't seen. The general idea should remain correct.
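Applied to the syslog line from the first post, the slice approach might look like the sketch below. It assumes the kernel timestamp has already been removed and that the fields always sit at the same positions; the variable names are illustrative and the sample line is shortened.

```perl
use strict;
use warnings;

# Sample line from the first post (shortened), kernel timestamp already removed.
my $line = 'Jul 21 01:08:34 hostname kernel: [IPTABLES DROP] : IN=ppp0 OUT= MAC= '
         . 'SRC=78.99.104.112 DST=180.87.232.130 PROTO=UDP SPT=45670 DPT=56838';

my @fields = split /\s+/, $line;

# Fixed positions: 0 = month, 1 = day, 2 = time, 3 = host, 6 = chain name.
my ($month, $day, $time, $host, $chain) = @fields[0, 1, 2, 3, 6];
$chain =~ s/\]$//;    # 'DROP]' becomes 'DROP'

# Everything from index 8 onwards is KEY=VALUE (or KEY= with an empty value).
my %entry;
for my $pair (@fields[8 .. $#fields]) {
    my ($key, $value) = split /=/, $pair, 2;
    $entry{$key} = $value;
}

print "$month $day $time $host $chain $entry{SRC} -> $entry{DST}\n";
# prints "Jul 21 01:08:34 hostname DROP 78.99.104.112 -> 180.87.232.130"
```

The limit of 2 on the inner split keeps an empty value (as in OUT=) as an empty string instead of leaving it undefined.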


amanjosan2008
Novice

Jul 21, 2013, 12:20 AM

Post #4 of 14 (676 views)
Re: [Laurent_R] How to skip 14 digits from a syslog file??

Hi Laurent_R,

Thank you for your update and your efforts.

The name of the project is IP Tables logger. This project was made to log packets dropped by the IPTables firewall.
It was made for Syslog, but I use Rsyslog.
The output of Rsyslog contains an extra timestamp field. I am unable to remove it by any means.
The timestamp value is causing errors in the Chain name field.

The values for timestamp are variable and change with time as follows:-

[ 814.206745]
[ 5436.257907]
[34215.938271]


If the timestamp value is like: [ 814.206745] The Output is like:


Chain       Date                 Host        Interf.  Proto.  Src IP   Dest IP         Dest. port
407.810038  2013-07-21 12:26:13  kirat-1440  ppp0     UDP     0.0.0.0  180.87.232.130  56838
401.953316  2013-07-21 12:26:07  kirat-1440  ppp0     UDP     0.0.0.0  180.87.232.130  56838

When the time stamp value increases with time to something like [34215.938271], the output becomes:-

Chain       Date                 Host        Interf.  Proto.  Src IP   Dest IP         Dest. port
[IPTABLES   2013-07-21 12:26:13  kirat-1440  ppp0     UDP     0.0.0.0  180.87.232.130  56838
[IPTABLES   2013-07-21 12:26:07  kirat-1440  ppp0     UDP     0.0.0.0  180.87.232.130  56838


I want the output to be as follows. Only the Chain value is wrong; the date and everything else are fine.

Chain       Date                 Host        Interf.  Proto.  Src IP   Dest IP         Dest. port
DROP        2013-07-21 12:26:13  kirat-1440  ppp0     UDP     0.0.0.0  180.87.232.130  56838
DROP        2013-07-21 12:26:07  kirat-1440  ppp0     UDP     0.0.0.0  180.87.232.130  56838

Therefore I want to skip this 14-character field. Please help. Should I use the shift operator with some regex?


Laurent_R
Enthusiast / Moderator

Jul 21, 2013, 1:43 AM

Post #5 of 14 (670 views)
Re: [amanjosan2008] How to skip 14 digits from a syslog file??

This really seems to be a rather easy task, but the input lines that you are showing change from one of your posts to the next.

I gave you a regular expression that would remove the time stamp from the first version of your description of the input line:


Code
s/\[[\d ]\d{4}\.\d{6}]//;

I even showed you where to put it (before the split).

You are now showing a very different input line. It seems that you are giving only part of the line (without the beginning). And furthermore, you don't put it within code tags as I requested, so the lines are reformatted.

There is no way to do data munging without a precise and exact knowledge of the data input format.

Please provide exact samples of the complete input line (or, at least, the complete beginning of the line) for each of the two cases, and enclose them within code tags. Actually, two or three lines for each case would be great, as it would enable me to figure out what is constant and what is changing from one line to the next in each of the two basic cases. Under these conditions, I will certainly be able to give you a solution.


(This post was edited by Laurent_R on Jul 21, 2013, 1:47 AM)


Laurent_R
Enthusiast / Moderator

Jul 21, 2013, 1:50 AM

Post #6 of 14 (667 views)
Re: [amanjosan2008] How to skip 14 digits from a syslog file??

Example of the regex used to remove the time stamps on the first version of your input data:


Code
  DB<25> $_  = "Jul 21 01:08:34 hostname kernel: [15125.218403] [IPTABLES DROP] : IN=ppp0 OUT= MAC= " 

DB<26> s/\[[\d ]\d{4}\.\d{6}]//;

DB<27> p $_
Jul 21 01:08:34 hostname kernel: [IPTABLES DROP] : IN=ppp0 OUT= MAC=
DB<28> $_ = "Jul 21 01:08:34 hostname kernel: [15125.218403] [IPTABLES DROP] : IN=ppp0 OUT= MAC= "

DB<29> s/\[[\d .]{12}]//;

DB<30> p $_
Jul 21 01:08:34 hostname kernel: [IPTABLES DROP] : IN=ppp0 OUT= MAC=


As you can see, both regexes do the job of removing the time stamp from the line being processed.
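To connect this back to the chain name, here is a minimal sketch of how the original shift-based parsing lines up once the time stamp is gone. It uses the second regex, and the padded sample line is an assumed example built from the time stamps quoted in this thread.

```perl
use strict;
use warnings;

# Two sample lines: one with a five-digit seconds field, one space-padded.
# The padded line is an assumption, not a line taken from a real log.
my @lines = (
    'Jul 21 01:08:34 hostname kernel: [15125.218403] [IPTABLES DROP] : IN=ppp0 OUT= MAC=',
    'Jul 21 01:08:34 hostname kernel: [ 9125.218403] [IPTABLES DROP] : IN=ppp0 OUT= MAC=',
);

my @chains;
for my $line (@lines) {
    (my $clean = $line) =~ s/\[[\d .]{12}\]//;    # drop the kernel time stamp
    my @entry_split = split /\s+/, $clean;
    splice @entry_split, 0, 4;                    # month, day, time, host
    shift @entry_split;                           # kernel:
    shift @entry_split;                           # [IPTABLES
    (my $chain_name = shift @entry_split) =~ s/\]//;    # DROP] becomes DROP
    push @chains, $chain_name;
}

print "@chains\n";    # prints "DROP DROP"
```

Because the time stamp is removed before the split, it no longer matters whether it would have produced one token or two.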


amanjosan2008
Novice

Jul 22, 2013, 2:19 AM

Post #7 of 14 (656 views)
Re: [Laurent_R] How to skip 14 digits from a syslog file??

Thank you for the reply.

This Regex will work with

Code
[15125.218403]

.


If there is a space like in

Code
 [   23.453629]

or

Code
[  284.271544]

then it won't work.

Please suggest a regex for a variable value in brackets: 12 characters including one dot, where the others may be digits or spaces. Example: [xxxxx.xxxxxx], where x may be a digit or a space.

I apologise for my poor knowledge of Perl.


(This post was edited by amanjosan2008 on Jul 22, 2013, 2:44 AM)


BillKSmith
Veteran

Jul 22, 2013, 6:51 AM

Post #8 of 14 (648 views)
Re: [amanjosan2008] How to skip 14 digits from a syslog file??

Your description of your variable is inconsistent due to misuse of the words "digit" and "number". I assume that you wish to match a 14 character field consisting of one period and 11 spaces and/or digits surrounded by square brackets. The period is always the seventh character.

Template: [xxxxx.xxxxxx] x = digit or space.


Laurent's regex on DB<29> above will always match such a field.

I have written a slightly more explicit one which will also always match. The following code tests each regex against each of ten examples. All are valid fields (by my interpretation). Some were intentionally chosen to look strange.


Code
use strict; 
use warnings;
my $REGEX_1 = qr/\[[\s\d.]{12}\]/; # Laurent
my $REGEX_2 = qr/\[[\s\d]{5}\.[\s\d]{6}\]/; # Bill
my @fields = (
# '[xxxxx.xxxxxx]', 'x' represents a digit or a space
'[15125.218403]',
'[   23.453629]',
'[  284.271544]',
'[ ]',
'[11111.111111]',
'[1 2 3. 4 5 6]',
'[ 2 ]',
'[ 1234.12345 ]',
'[01234.123450]',
'[01 3 .123 50]',
);
print "(1) All match\n" if !(scalar grep { !/$REGEX_1/ } @fields); # no field fails to match
print "(2) All match\n" if !(scalar grep { !/$REGEX_2/ } @fields);


OUTPUT:

Code
(1) All match 
(2) All match


Sorry, even my code tags did not properly display all of the fields. Please refer to the attached copy of the code.
Good Luck,
Bill

(This post was edited by BillKSmith on Jul 22, 2013, 10:06 AM)
Attachments: amanj.pl (0.57 KB)
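Both patterns above assume exactly five character positions before the period. If the seconds part can ever be wider than that (for example after a long uptime), a width-tolerant pattern may be safer. The sketch below is an assumed variant, not code from the posts above; the name $TIMESTAMP and the six-digit sample value are hypothetical.

```perl
use strict;
use warnings;

# Assumed variant: any amount of padding, any number of digits on either
# side of the period.
my $TIMESTAMP = qr/\[\s*\d+\.\d+\]/;

my @fields = (
    '[15125.218403]',
    '[   23.453629]',
    '[  284.271544]',
    '[123456.654321]',    # hypothetical wider value, 15 characters in total
);

# Collect the fields that do NOT match as a whole.
my @unmatched = grep { !/\A$TIMESTAMP\z/ } @fields;
print @unmatched ? "No match: @unmatched\n" : "All match\n";    # prints "All match"

# The pattern requires digits and a period, so '[IPTABLES DROP]' is left alone.
my $line = 'kernel: [  284.271544] [IPTABLES DROP] : IN=ppp0';
(my $clean = $line) =~ s/$TIMESTAMP\s*//;
print "$clean\n";    # prints "kernel: [IPTABLES DROP] : IN=ppp0"
```

The trade-off is less strict validation: this pattern would also accept fields that are shorter or longer than the 14-character template.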


amanjosan2008
Novice

Jul 23, 2013, 3:24 PM

Post #9 of 14 (624 views)
Re: [BillKSmith] How to skip 14 digits from a syslog file??

Thank you Laurent & Bill...

got the needed REGEX..

I am trying to put it in my file.
I will get back to you if needed.

Thanks for your support.


amanjosan2008
Novice

Jul 23, 2013, 4:03 PM

Post #10 of 14 (621 views)
Re: [BillKSmith] How to skip 14 digits from a syslog file??

Hi back..
I am getting Syntax error as follows:-


Code
./feed.pl 
Backslash found where operator expected at ./feed.pl line 99, near "my @entry_split = split /\"
(Might be a runaway multi-line // string starting on line 98)
(Do you need to predeclare my?)
syntax error at ./feed.pl line 97, near "/$log_tag/)"
Global symbol "@entry_split" requires explicit package name at ./feed.pl line 98.
syntax error at ./feed.pl line 99, near "my @entry_split = split /\"
Substitution replacement not terminated at ./feed.pl line 106.


Complete file is as follows:-


Code
#!/usr/bin/perl 

use strict;
use DBI;
use POSIX qw(strftime);

# IPTable log analyzer
# Copyright (C) 2002 Gerald GARCIA
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#
# Contact author : gege@gege.org

# $Id: feed_db.pl,v 1.8 2002/11/12 20:43:18 gege Exp $

use Socket;


######################################################################################################
################# C O N F I G S E C T I O N ###################################
######################################################################################################

my $dsn = 'DBI:mysql:iptables:localhost';
my $db_user_name = 'XXXXXXXXXX';
my $db_password = 'XXXXXXXXX';
my $log_file = '/var/log/syslog';
my $pid_file = "/var/run/iptablelog.pid";

######################################################################################################
################# E N D O F C O N F I G S E C T I O N ###########################
######################################################################################################


my $go_to_background=($ARGV[0] eq '--background');

$SIG{INT} = \&got_int;

my ($id, $password);
my $dbh = DBI->connect($dsn, $db_user_name, $db_password);


# get the short name of months according to the locale
# thanks to Bill Garrett <memesis at users.sourceforge.net>
my(%m);
my($month_nb);
for $month_nb (0..11) {
$m{strftime("%b", 0, 0, 0, 1, $month_nb, 96)}=sprintf("%02d",$month_nb+1);
}

#my($key);
#foreach $key (keys %m) {
# print $key,"-",$m{$key},"\n";
#}


my($log_tag)="IPTABLES";

open(LOG_FILE,"tail --follow=name --retry $log_file 2>/dev/null |") or die "Unable to open log file $log_file : $!\n";


#fork to background
if ($go_to_background) {
my($f) = fork();
if (!defined($f)) { die "Unable to fork : $!\n"; }
if($f) {
# parent
open(PID, ">$pid_file") or die "Unable to create pid file $pid_file: $!";
print PID "$f\n";
close PID;
exit(0);
} else {
# child
close STDIN;
open(STDIN, '</dev/null');
close STDOUT;
open(STDOUT, '>/dev/null');
close STDERR;
open(STDERR, '>/dev/null');
}
}



while (<LOG_FILE>) {
next unless /$log_tag/);
s/\[[\s\d.]{12}\]//;
my @entry_split = split /\s+/;
my(%entry);

#year is not in syslog date format... try to guess it from the local time
# ***TODO*** what happen when the year change ?
my($year);
(undef,undef,undef,undef,undef,$year,undef,undef,undef) = localtime(time);
$year += 1900;

$entry{'date'}="$year-".$m{shift(@entry_split)}."-".shift(@entry_split)." ".shift(@entry_split);
$entry{'host'}=shift(@entry_split);
shift(@entry_split); # kernel:
shift(@entry_split); # [IPTABLES

my($chain_name)=shift(@entry_split); # DROP]
$chain_name=~s/\]//;

shift(@entry_split); # :
foreach (@entry_split) {
if (/(.*)=(.*)/) {
(my($field),my($value))=split /=/;
$entry{$field}=$value;
}
}

my($iaddr) = inet_aton($entry{'SRC'});
my($host_name) = gethostbyaddr($iaddr, AF_INET);
if (defined($host_name)) { $entry{"SRC_NAME"}=$host_name; } else { $entry{"SRC_NAME"}="unknown"; }

# open(HOST,"host $entry{'SRC'} |");
# my($result)=<HOST>;
# if ($result=~s/Name: (.*)$/$1/) {
# $result=~s/\n//; $entry{"SRC_NAME"}=$result;
# } else {
# if ($result=~s/.* pointer (.*)\.$/$1/) {
# $result=~s/\n//; $entry{"SRC_NAME"}=$result;
# } else { $entry{"SRC_NAME"}="unknown"; }
# }
# close(HOST);

my($iaddr) = inet_aton($entry{'DST'});
my($host_name) = gethostbyaddr($iaddr, AF_INET);
if (defined($host_name)) { $entry{"DST_NAME"}=$host_name; } else { $entry{"DST_NAME"}="unknown"; }

# open(HOST,"host $entry{'DST'} |");
# my($result)=<HOST>;
# if ($result=~s/Name: (.*)$/$1/) {
# $result=~s/\n//; $entry{"DST_NAME"}=$result;
# } else {
# if ($result=~s/.* pointer (.*)\.$/$1/) {
# $result=~s/\n//; $entry{"DST_NAME"}=$result;
# } else { $entry{"DST_NAME"}="unknown"; }
# }

close(HOST);

my($dummy)="'".$entry{"host"}."','".$entry{"date"}."','".$chain_name."','".$entry{'IN'}."','".$entry{'SRC'}."','".$entry{'SRC_NAME'}."','".$entry{'DST'}."','".$entry{'DST_NAME'}."','".$entry{'PROTO'}."','".$entry{'SPT'}."','".$entry{'DPT'}."'";

print "$dummy\n";


$dbh->do("insert into logs
(host,date,chain,interface_in, ip_src, name_src, ip_dest, name_dest, proto, port_src, port_dest) values ($dummy)");


}
$dbh->disconnect();

sub got_int {
$SIG{INT} = \&got_int; # but not for SIGCHLD!
close(LOG_FILE);
}


There is a small syntax error but I am not able to find it out.
Please suggest.


Laurent_R
Enthusiast / Moderator

Jul 23, 2013, 11:16 PM

Post #11 of 14 (609 views)
Re: [amanjosan2008] How to skip 14 digits from a syslog file??

Please help us, which one is line 99?


amanjosan2008
Novice

Jul 23, 2013, 11:20 PM

Post #12 of 14 (608 views)
Re: [Laurent_R] How to skip 14 digits from a syslog file??

Line 99 is as follows:-


Code
  my @entry_split = split /\s+/;



BillKSmith
Veteran

Jul 24, 2013, 6:23 AM

Post #13 of 14 (595 views)
Re: [amanjosan2008] How to skip 14 digits from a syslog file??

The error is two lines before that:

Code
next unless /$log_tag/ ); #parens do not balance

Good Luck,
Bill


amanjosan2008
Novice

Jul 24, 2013, 10:25 AM

Post #14 of 14 (589 views)
Re: [BillKSmith] How to skip 14 digits from a syslog file??

Thank you Laurent & Bill...

My application is running perfectly now....

I appreciate your help..
& I am thankful to perlguru.com too....
bye-bye
tc..
best of luck...

 
 

