Problems with sorting a record

 



yellowman
Novice

Feb 28, 2006, 8:27 AM

Post #1 of 16 (3301 views)
Problems with sorting a record

I am trying to sort a file that has entries like this:

01/09 08:44 accept NN291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp
01/09 08:44 accept NN291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com http/tcp
01/09 09:30 accept ttr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp
01/09 13:05 accept NN277099.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp

I want to sort on the 4th item in the record (i.e. NN291580.gs.myclarris.net) and print the results to a file.

But my code only outputs the first two entries to the output file. It should sort the whole file, but it always gets stuck at whatever the first record is.

Below is the code:


Code
   

sub sort_file {
$count =0;
$count2 =0;
$ocount =0;
$scount =0;
$size = @master;

while($count <= $size){
$line = $master [$count];

if($line ne ""){
@temp = split(/ +/, $line);
}
$count++;

while ($ocount <= $size){
$line2 = $master [$count2];
if($line2 ne ""){
@temp2 = split(/ +/, $line2);
$count2++;
}
if($temp[3] eq $temp2[3]){;
@sorted[$scount] = $line2;
$scount++;
}
$ocount++;
}
$count2 = 0;
}

@master = @sorted;
&results;

}

The &results sub does nothing more than take what's in the @master array and print the results to a file. I know, global variables are bad, but the script is just for me ;-)

Thanks for the help...


KevinR
Veteran


Feb 28, 2006, 12:07 PM

Post #2 of 16 (3300 views)
Re: [yellowman] Problems with sorting a record

I don't understand what the sort criteria are. You say sort, but sort how? Sort what? Those are mixed strings with numeric, alphabetic, and non-word characters in them, so please explain how you want them sorted.
-------------------------------------------------


yellowman
Novice

Feb 28, 2006, 1:39 PM

Post #3 of 16 (3297 views)
Re: [KevinR] Problems with sorting a record

I guess it's not really a sort, it's more of a grouping thing. I want to group those strings based on the 4th word in the string.

For instance, in the string....

01/09 08:48 accept sc291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp

I want to take the 4th word, sc291580.gs.myclarris.net, and group all strings in the log with that word in them together, so that I get groups like this...

01/09 08:44 accept sc291580.gs.clarris.net sd-mscq2.gcsd.clarris.com echo-request/icmp
01/09 08:44 accept sc291580.gs.clarris.net sd-mscq2.gcsd.clarris.com microsoft-ds-tcp/tcp
01/09 08:44 accept sc291580.gs.clarris.net sd-mscq2.gcsd.clarris.com http/tcp
01/09 09:33 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp
01/09 09:33 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp
01/09 09:33 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp
01/10 18:32 accept dintsu01.northdrum.com sd-mscq2.gcsd.clarris.com https/tcp
01/10 18:32 accept dintsu01.northdrum.com sd-mscq2.gcsd.clarris.com https/tcp
01/10 18:32 accept dintsu01.northdrum.com sd-mscq2.gcsd.clarris.com https/tcp

Right now, these logs aren't grouped in any way and impossible to read. They look like this...

01/09 08:44 accept sc291580.gs.clarris.net sd-mscq2.gcsd.clarris.com http/tcp
01/09 09:33 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp
01/09 13:05 accept sc277099.gs.clarris.net sd-mscq2.gcsd.clarris.com https/tcp
01/10 10:38 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com echo-request/icmp
01/10 10:38 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp
01/13 15:37 accept sc277099.gs.clarris.net sd-mscq2.gcsd.clarris.com echo-request/icmp

These logs usually aren't too big (less than 5 MB). The log I am playing with is 94k. I amended my code above and it runs, but it turns my 94k file into a 27 MB file. If I comment out the outer while loop (as seen below) I can get the code to group based on the first string entry. But I need the while loop to increase my count.

I hope this is more clear...


Code
   

sub sort_file {
$count =0;
$count2 =0;
$ocount =0;
$scount =0;
$size = @master;

# while($count <= $size){
$line = $master [$count];

if($line ne ""){
@temp = split(/ +/, $line);
}
# $count++;
while ($ocount <= $size){
$line2 = $master [$count2];
if($line2 ne ""){
@temp2 = split(/ +/, $line2);
$count2++;
}
if($temp[3] eq $temp2[3]){;
@sorted[$scount] = $line2;
$scount++;
}
$ocount++;
}
$ocount =0;
$count2 =0;
# }

@master = @sorted;
&results;

}

sub results {
unless (open(OUT, ">Sort_Results.txt")) {
die ("Cannot open the output file. Check permissions \n");
}
print OUT (@master);
print ("\nThe Firewall Log has been parsed successfully.\n")
}



KevinR
Veteran


Feb 28, 2006, 1:57 PM

Post #4 of 16 (3294 views)
Re: [yellowman] Problems with sorting a record

OK, you want to group/collate the data, here is one way:


Code
my %collated = (); 
while(my $line = <DATA>) {
chomp $line;
my $key = (split(/\s+/,$line))[3];
push @{$collated{$key}},$line;
}
foreach my $data (keys %collated) {
print join("\n",@{$collated{$data}});
print "\n";
}
__DATA__
01/09 08:44 accept NN291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp
01/09 08:44 accept NN291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com http/tcp
01/09 09:30 accept ttr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp
01/09 13:05 accept NN277099.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp
01/09 08:44 accept NN291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp
01/09 08:44 accept NN291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com http/tcp
01/09 09:30 accept ttr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp
01/09 09:30 accept ttr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp
01/09 13:05 accept NN277099.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp
01/09 09:30 accept ttr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp
01/09 13:05 accept NN277099.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp
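
Note that keys %collated returns the keys in no particular order, so the groups can come out in a different order from run to run. A minimal sketch of one way to keep the groups in the order each key first appears, assuming whitespace-separated fields; the sub name group_by_field is made up for illustration and is not part of the code above:

```perl
use strict;
use warnings;

# Hypothetical helper: group lines by one whitespace-separated field
# (0-based index), keeping the groups in first-seen order.
sub group_by_field {
    my ($index, @lines) = @_;
    my (%group, @order);
    for my $line (@lines) {
        my $key = (split /\s+/, $line)[$index];
        next unless defined $key;                  # skip blank or short lines
        push @order, $key unless exists $group{$key};
        push @{ $group{$key} }, $line;
    }
    # Flatten the groups back into one list of lines.
    return map { @{ $group{$_} } } @order;
}

my @log = (
    "01/09 08:44 accept NN291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp",
    "01/09 09:30 accept ttr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp",
    "01/09 08:44 accept NN291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com http/tcp",
);
print "$_\n" for group_by_field(3, @log);
```

The @order array remembers each key the first time it is seen, which is what the hash alone cannot do.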

-------------------------------------------------


davorg
Thaumaturge / Moderator

Mar 1, 2006, 5:31 AM

Post #5 of 16 (3286 views)
Re: [yellowman] Problems with sorting a record

You really don't need Perl for something like this. Assuming you have access to a Unix box (or a Windows box with Cygwin installed) you can do this:


Code
sort -k 4 < input_file > output_file


But if you insist on Perl, then you can do something like this:


Code
print sort my_sort <STDIN>; 

sub my_sort {
my @a = split /\s+/, $a;
my @b = split /\s+/, $b;

return $a[3] cmp $b[3];
}


Like my first example, it's implemented as a filter. That is, it reads data from standard input and writes it to standard output. Assuming the code is in a file called "my_sort", you would call it like this:


Code
my_sort < input_file > output_file


You should read the documentation for sort.
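
The comparison sub above splits both lines every time two of them are compared. For larger logs, a common idiom is to split each line once up front and sort on the stored key. This is only a sketch: the sub name sort_by_fourth is made up, the tie-breaker on the whole line is my own choice, and the // operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: sort lines by the 4th whitespace-separated field,
# splitting each line only once (the "Schwartzian transform" idiom).
sub sort_by_fourth {
    my @lines = @_;
    return map  { $_->[1] }                                 # 3. unwrap the line
           sort { $a->[0] cmp $b->[0]                       # 2. compare keys,
                      or $a->[1] cmp $b->[1] }              #    then whole lines
           map  { [ (split /\s+/, $_)[3] // '', $_ ] }      # 1. pair key with line
           @lines;
}

my @sample = (
    "01/09 09:30 accept ttr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp\n",
    "01/09 13:05 accept NN277099.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp\n",
    "01/09 08:44 accept NN291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp\n",
);
print sort_by_fourth(@sample);
```

Replacing the last line with print sort_by_fourth(<STDIN>); turns it into the same kind of filter as the script above.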

--
Dave Cross, Perl Hacker, Trainer and Writer
http://www.dave.org.uk/
Get more help at Perl Monks


(This post was edited by davorg on Mar 1, 2006, 7:19 AM)


yellowman
Novice

Mar 1, 2006, 7:16 AM

Post #6 of 16 (3283 views)
Re: [davorg] Problems with sorting a record

OK, I tried to run the following code as seen above...


Code
   

sub play {

my %collated = ();
while(my $line = <FIRE>) {
chomp $line;
my $key = (split(/\s+/,$line))[3];
push @{$collated{$key}},$line;
}

foreach my $data (keys %collated) {
print join("\n",@{$collated{$data}});
print "\n";
}

# &results;

}



...but nothing gets printed to the screen. I need to use Perl because it's the standardized language around here. Plus I am trying to get better at it. I figure the more I use it, the better I will be at it.

Any ideas on version 2 of my original code? It seems to be the closest so far. Thanks.


KevinR
Veteran


Mar 1, 2006, 8:19 AM

Post #7 of 16 (3279 views)
Re: [yellowman] Problems with sorting a record

Try opening the file handle FIRE within the sub routine.
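
A bareword handle like FIRE is global, so if it has already been read to the end, or has not been opened yet when the loop runs, the while loop sees no lines and nothing is printed. A minimal sketch of opening, reading, and closing the file in one place; the sub name read_log and the file name in the comment are placeholders, not names from your script:

```perl
use strict;
use warnings;

# Hypothetical helper: open the log inside the sub, read every line,
# and hand the lines back to the caller.
sub read_log {
    my ($path) = @_;
    # Lexical handle and three-argument open; report why the open failed.
    open(my $fh, '<', $path) or die "Cannot open $path: $!\n";
    my @lines = <$fh>;        # list context reads the whole file
    close($fh);
    chomp @lines;             # strip the trailing newline from each line
    return @lines;
}

# Usage (file name is a placeholder):
# my @master = read_log('firewall.log');
```

Because the handle is a lexical variable, it cannot be left half-read by some other part of the script.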
-------------------------------------------------


yellowman
Novice

Mar 1, 2006, 8:43 AM

Post #8 of 16 (3276 views)
Re: [KevinR] Problems with sorting a record

WOW, that works. Why does opening the FIRE handle inside the subroutine make a difference? Now if I wanted to add the contents of the %collated hash to my @master array, can it be done with something like:

@master = %collated

...or is that not legal?

And if I open the file in one subroutine and group it in another, will it work, or will it print nothing like before? Right now everything is in the same sub and works fine, but if I could have a sub that just opens files it would make for cleaner code.


KevinR
Veteran


Mar 1, 2006, 1:02 PM

Post #9 of 16 (3270 views)
Re: [yellowman] Problems with sorting a record

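If @master already exists, you can append to it with push():

push @master, map { @$_ } values %collated;

which appends every line stored in the hash to the end of the array. Each value in %collated is a reference to an array of lines, so it has to be dereferenced before pushing. You don't want the keys, as they don't contain all the data from each line.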

Normally you pass data to subroutines through the argument array @_:

do_this(@master);

and get the list back out inside the subroutine by copying @_:


Code
sub do_this { 
my @new_array = @_;
...
}
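
A slightly fuller sketch of the same idea, showing a list going in through @_ and a new list coming back out through return. The sub name add_line_numbers and the sample data are invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical example: take a list in through @_, return a new list.
sub add_line_numbers {
    my @lines = @_;                          # copy the arguments
    my $n = 0;
    return map { ++$n . ": $_" } @lines;     # the return value is a list
}

my @master   = ("first", "second");
my @numbered = add_line_numbers(@master);    # @master itself is unchanged
print "$_\n" for @numbered;
```

Since the sub works on a copy, the caller's array is not modified, and no global variable is needed to get the result back.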

-------------------------------------------------


yellowman
Novice

Mar 1, 2006, 3:19 PM

Post #10 of 16 (3266 views)
Re: [KevinR] Problems with sorting a record

I tried using push(), but it whacked the order again.

I tried just setting it equal to a scalar like below:


Code
$count = 0;
foreach my $data (keys %collated) {
$temp =join("\n",@{$collated{$data}});
@master[$count] = $temp;
$count++;
}

...and again I can get $temp to loop through and print nicely to the screen, but the order gets whacked when I try to put it in a regular array. I can't be too far off in my logic, right?


KevinR
Veteran


Mar 1, 2006, 10:09 PM

Post #11 of 16 (3264 views)
Re: [yellowman] Problems with sorting a record

I think this will do the trick:


Code
sub play {   

my %collated = ();
while(my $line = <FIRE>) {
chomp $line;
my $key = (split(/\s+/,$line))[3];
push @{$collated{$key}},$line;
}

foreach my $data (keys %collated) {
print join("\n",@{$collated{$data}});
print "\n";
push @master,@{$collated{$data}};
}

results(@master);

}

sub results {
print "$_\n" for @_;
}


that is barebones so you need to adapt it to your needs.
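
One possible adaptation that sends the grouped lines to a file instead of the screen. This is a sketch under a few assumptions: the sub names collate_file and write_results are made up, the input file name in the comment is a placeholder, and the groups are written in sorted key order only to make the output repeatable. It returns a fresh list rather than pushing onto a global @master, so anything already sitting in that array is not written out a second time.

```perl
use strict;
use warnings;

# Hypothetical helper: read a log and return its lines grouped by the 4th field.
sub collate_file {
    my ($in_path) = @_;
    open(my $in, '<', $in_path) or die "Cannot open $in_path: $!\n";
    my %collated;
    while (my $line = <$in>) {
        chomp $line;
        my $key = (split /\s+/, $line)[3];
        next unless defined $key;            # skip blank or short lines
        push @{ $collated{$key} }, $line;
    }
    close($in);
    # Sorted keys give a repeatable group order.
    return map { @{ $collated{$_} } } sort keys %collated;
}

# Hypothetical helper: write one line per element to the named file.
sub write_results {
    my ($out_path, @lines) = @_;
    open(my $out, '>', $out_path) or die "Cannot open $out_path: $!\n";
    print {$out} "$_\n" for @lines;          # put back the newline chomp removed
    close($out) or die "Cannot close $out_path: $!\n";
}

# Usage (input file name is a placeholder):
# write_results('Sort_Results.txt', collate_file('firewall.log'));
```

Printing "$_\n" for each element matters here: the lines were chomped when they were read, so printing the array as-is would run them all together on one line.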
-------------------------------------------------


yellowman
Novice

Mar 2, 2006, 5:52 AM

Post #12 of 16 (3258 views)
Re: [KevinR] Problems with sorting a record

Well, it sort of works. It looks like it loops through and prints the data twice. I've tried it a couple of different ways. What's odd is that it prints the first set of data out of order, then it prints the second set of data grouped just fine but the formatting is messed up. I'm looking over the code right now (I'm a little slow, I need the book), but if anything hits you give me a hint. Below is what it all looks like....


Code
   

if (open(FIRE, "$path")){
my %collated = ();
while(my $line = <FIRE>) {
chomp $line;
my $key = (split(/\s+/,$line))[3];
push @{$collated{$key}},$line;
}

foreach my $data (keys %collated) {
join("\n",@{$collated{$data}});
push @master,@{$collated{$data}};
}
}
&results2(@master);
}

sub results2 {

# print "$_\n" for @_;

my @test = @_;
print (@test);
}



Here is what the output file looks like...

01/09 08:44 accept sc291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com
01/09 08:44 accept sc291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com echo-request/icmp
01/09 08:44 accept sc291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com http/tcp
01/09 08:44 accept sc291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com http/tcp
01/09 08:44 accept sc291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp
01/09 08:44 accept sc291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp
01/09 09:30 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com echo-request/icmp
01/09 09:30 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp
01/09 09:32 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp
01/09 13:05 accept sc277099.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp
01/09 13:05 accept sc277099.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp
01/10 14:22 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com echo-request/icmp
01/10 14:22 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp
01/11 15:55 accept 158.147.113.67 sd-mscq2.gcsd.clarris.com http/tcp
01/11 15:55 accept 158.147.113.67 sd-mscq2.gcsd.clarris.com https/tcp
01/11 17:35 accept dintsu01.northgrum.com sd-mscq2.gcsd.clarris.com https/tcp
01/11 17:35 accept dintsu01.northgrum.com sd-mscq2.gcsd.clarris.com https/tcp
01/13 15:37 accept sc277099.gs.myclarris.net sd-mscq2.gcsd.clarris.com http/tcp
01/13 15:37 accept sc277099.gs.myclarris.net sd-mscq2.gcsd.clarris.com nbsession/tcp
01/15 11:36 accept 158.147.113.67 sd-mscc2.gcsd.clarris.com nbname/udp
01/15 11:37 accept 158.147.113.67 sd-mscc2.gcsd.clarris.com http/tcp 01/11 17:35 accept dintsu01.northgrum.com sd-mscq2.gcsd.clarris.com https/tcp 01/11 17:35 accept dintsu01.northgrum.com sd-mscq2.gcsd.clarris.com https/tcp 01/09 08:44 accept sc291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp 01/09 08:44 accept sc291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com http/tcp 01/09 08:44 accept sc291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com echo-request/icmp01/09 08:44 accept sc291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com 01/09 08:44 accept sc291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp 01/09 08:44 accept sc291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com http/tcp 01/09 13:05 accept sc277099.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp 01/09 13:05 accept sc277099.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp 01/13 15:37 accept sc277099.gs.myclarris.net sd-mscq2.gcsd.clarris.com nbsession/tcp 01/13 15:37 accept sc277099.gs.myclarris.net sd-mscq2.gcsd.clarris.com http/tcp 01/09 09:30 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp 01/09 09:30 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com echo-request/icmp01/09 09:32 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp 01/10 14:22 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp 01/10 14:22 accept rtr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com echo-request/icmp01/11 15:55 accept 158.147.113.67 sd-mscq2.gcsd.clarris.com http/tcp 01/11 15:55 accept 158.147.113.67 sd-mscq2.gcsd.clarris.com https/tcp 01/15 11:36 accept 158.147.113.67 sd-mscc2.gcsd.clarris.com nbname/udp 01/15 11:37 accept 158.147.113.67 sd-mscc2.gcsd.clarris.com http/tcp


davorg
Thaumaturge / Moderator

Mar 2, 2006, 8:08 AM

Post #13 of 16 (3256 views)
Re: [yellowman] Problems with sorting a record

I can't help thinking that you're making this all far more complicated than it needs to be. Did you look at the code which I posted a couple of days ago?



yellowman
Novice

Mar 2, 2006, 8:22 AM

Post #14 of 16 (3254 views)
Re: [davorg] Problems with sorting a record

Yes I looked at the code. I did something similar in my original code and it turned a 94K file into a 26MB file. Since there were no suggestions on my original code, I have decided to try the route that KevinR has suggested. It works, except I can't figure out how to get the results to a file instead of printing to the screen.


davorg
Thaumaturge / Moderator

Mar 2, 2006, 8:28 AM

Post #15 of 16 (3253 views)
Re: [yellowman] Problems with sorting a record


In Reply To
Yes I looked at the code. I did something similar in my original code and it turned a 94K file into a 26MB file.


My code doesn't do that. It just prints out the original data, but in the order that you requested. There's no possible way that my code would inflate your data by that amount.
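
A quick way to convince yourself, sketched with a few sample lines held in an array rather than read from a file: sort only reorders its input, so the number of lines out always equals the number of lines in. The sub name by_fourth_field is made up for this example.

```perl
use strict;
use warnings;

# Comparison sub for sort: compare the 4th whitespace-separated field.
sub by_fourth_field {
    my @a = split /\s+/, $a;
    my @b = split /\s+/, $b;
    return $a[3] cmp $b[3];
}

my @input = (
    "01/09 09:30 accept ttr-mpls-1.net.clarris.com sd-mscq2.gcsd.clarris.com https/tcp\n",
    "01/09 08:44 accept NN291580.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp\n",
    "01/09 13:05 accept NN277099.gs.myclarris.net sd-mscq2.gcsd.clarris.com https/tcp\n",
);
my @output = sort by_fourth_field @input;

# Sorting reorders the lines but never adds or drops any.
printf "%d lines in, %d lines out\n", scalar @input, scalar @output;
print @output;
```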



yellowman
Novice

Mar 2, 2006, 9:55 AM

Post #16 of 16 (3246 views)
Re: [davorg] Problems with sorting a record

OK....let me play with it again and see if I can get it to work.

 
 

