regexp to search through log files and convert into pipe delimited file?



laknar
New User

Nov 24, 2012, 4:55 PM



I'm trying to search a log file for a set of strings, find the record for each one, and extract fields from that record. For each search string, I need to read the line above it to get the date, and also pick up the OrderID field from the lines below it. The search strings are "Authenticate RequestXML", "Authenticate ResponseXML", "Authorize RequestXML", and "Authorize ResponseXML".

A sample log file is attached.

The expected output would look like this.

Column names:

Start_Date|Message|Request_Type|OrderNumber|OrderID

Nov 16, 2012 5:17:53 AM|INFO|Authenticate RequestXML||8585273913266697
Nov 16, 2012 5:17:53 AM|SEVERE|Authenticate ResponseXML|5207694|858527
Nov 16, 2012 5:18:48 AM|INFO|Authorize RequestXML||8585273913266697
Nov 16, 2012 5:18:52 AM|INFO|Authorize ResponseXML|5207694|85852739132

I need to use a Perl script to achieve this. Any help will be greatly appreciated.
Attachments: test (8.80 KB)


Laurent_R
Veteran / Moderator

Nov 25, 2012, 2:23 AM


Re: [laknar] regexp to search through log files and convert into pipe delimited file?

There are two possible approaches. The first, if your log file is not too large, is to "slurp" the whole file into an array of lines and then process each line. The advantage of this approach is that when you read a line containing one of your keywords, it is very easy to go back to the previous line to collect the date.
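A minimal sketch of this first approach follows. It rests on assumptions about the log layout, since the attachment is not shown here: the date is at the start of the line just before the keyword line, the keyword line starts with the severity followed by a colon, and the order ID appears later as an `<OrderID>` element. The helper name `extract_records` and the sample lines are made up for illustration, and the OrderNumber column is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the lines of a log and returns one
# "date|level|type|orderID" string per keyword line found.
sub extract_records {
    my @lines = @_;
    chomp @lines;
    my $date_regex = qr/^(\w{3} \d?\d, \d{4} \d?\d:\d\d:\d\d (?:AM|PM))/;
    my $type_regex = qr/((?:Authenticate|Authorize) Re(?:quest|sponse)XML)/;
    my @records;
    for my $i (0 .. $#lines) {
        next unless $lines[$i] =~ $type_regex;
        my $type = $1;
        # Assumed layout: the severity is the text before the first colon.
        my ($level) = split /:/, $lines[$i];
        # With everything in memory, the previous line is simply index $i - 1.
        my $date = '';
        if ($i > 0 and $lines[$i - 1] =~ $date_regex) {
            $date = $1;
        }
        # Scan forward from the keyword line for the first OrderID element.
        my $order_id = '';
        for my $j ($i .. $#lines) {
            if ($lines[$j] =~ m{<OrderID>(\d+)</OrderID>}) {
                $order_id = $1;
                last;
            }
        }
        push @records, join '|', $date, $level, $type, $order_id;
    }
    return @records;
}

# Invented sample lines; the real log layout may differ.
my @sample = (
    "Nov 16, 2012 5:17:53 AM com.example.Gateway send\n",
    "INFO: Authenticate RequestXML:\n",
    "<Request>\n",
    "<OrderID>8585273913266697</OrderID>\n",
    "</Request>\n",
);
print "$_\n" for extract_records(@sample);
# prints: Nov 16, 2012 5:17:53 AM|INFO|Authenticate RequestXML|8585273913266697
```

For a real file you would fill the array with something like `my @lines = <$INPUT>;` and pass it to the helper. Note that the forward scan does not stop at the next record, so a record without an OrderID would pick up the one from the following record.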

The other approach, which I usually prefer because I mostly deal with very large files that may exceed the memory available to my process, is to read the lines one at a time while always keeping the previous line, so that its date can be picked up when the current line contains one of the keywords.

Adopting the second approach, you could try something like this (untested, there may be a couple of typos, but you should get the basic idea):


Code
my $previous_line = '';
my $date_regex = qr/^(\w{3} \d?\d, \d{4} \d?\d:\d\d:\d\d (?:AM|PM))/;
while (my $line = <$INPUT>) {
    chomp $line;
    my ($type_msg) = $line =~ /((?:Authenticate|Authorize) Re(?:quest|sponse)XML)/;
    unless (defined $type_msg) {
        # Not a line of interest: remember it in case the next line needs its date.
        $previous_line = $line;
        next;
    }
    # Text before the first colon, presumably the level (INFO, SEVERE, ...).
    my ($diagnostic) = split /:/, $line;
    my ($date) = $previous_line =~ $date_regex;
    $date = '' unless defined $date;
    # We have found the date and diagnostic; now look for the order ID.
    # The OrderNumber column is not extracted here.
    while (my $next_line = <$INPUT>) {
        next unless $next_line =~ m{<OrderID>(\d+)</OrderID>};
        print join('|', $date, $diagnostic, $type_msg, $1), "\n";
        last;
    }
}
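Since the date pattern is the part most likely to need tuning, it may help to check it on its own first. The snippet below is only a sketch: `leading_date` is a hypothetical helper name, and the two test lines are invented, modelled on the timestamps in the expected output. Note that the non-capturing group is spelled `(?:AM|PM)`.

```perl
use strict;
use warnings;

# Date pattern with a capture group around the whole timestamp.
my $date_regex = qr/^(\w{3} \d?\d, \d{4} \d?\d:\d\d:\d\d (?:AM|PM))/;

# Hypothetical helper: returns the timestamp at the start of a line, or undef.
sub leading_date {
    my ($line) = @_;
    return $line =~ $date_regex ? $1 : undef;
}

# Invented test lines; the real log layout may differ.
for my $line ('Nov 16, 2012 5:17:53 AM com.example.Gateway send',
              'INFO: Authenticate RequestXML:') {
    my $date = leading_date($line);
    print defined $date ? "date: $date\n" : "no date: $line\n";
}
```

If a line from your real log comes back as "no date", the pattern needs adjusting before the main loop can work.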



(This post was edited by Laurent_R on Nov 25, 2012, 2:24 AM)