
Jean
User

Feb 21, 2001, 7:48 AM
Post #2 of 3
It depends on how large your file is. If it's a small file, you can simply read all its lines into an array, sort them, and write them back to the file. As for removing duplicates, I'm attaching two sample scripts that do the duplicate elimination. The simpler one removes only sequential duplicates, i.e. it is only useful after sorting. The other one removes all duplicates regardless of order, though it is slower with larger files.
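For the small-file case, here is a minimal sketch of the read, sort, write-back step, ahead of the two duplicate-removal scripts. It is only an illustration: the helper name sort_file_in_place is made up for this example, it assumes the whole file fits comfortably in memory, and it uses Perl's default string comparison, so lines that should sort numerically would need a custom sort block.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read a small file into memory, sort its lines
# with the default string comparison, and write them back in place.
sub sort_file_in_place {
    my ($path) = @_;

    open(my $in, '<', $path) or die "Can't open $path: $!";
    my @lines = <$in>;
    close($in);

    # Give the last line a newline if it lacks one, so it does not
    # run into its neighbour after sorting.
    $lines[-1] .= "\n" if @lines && $lines[-1] !~ /\n\z/;

    my @sorted = sort @lines;

    open(my $out, '>', $path) or die "Can't write $path: $!";
    print {$out} @sorted;
    close($out) or die "Can't close $path: $!";

    return scalar @sorted;
}

# Only runs when a file name is given on the command line.
sort_file_in_place($ARGV[0]) if @ARGV;
```

Saved under a name of your choice and called with the file to sort as its only argument, this overwrites the file with its sorted contents, so keep a copy if the original order matters.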
# REMOVES SEQUENTIAL DUPLICATES ########
my $SourceFile = $ARGV[0] || "Server.log";
my $TargetFile = $SourceFile.".clean";
my $lastline = "";   # empty string, so the first comparison is not against undef

# Open the files ('>' overwrites, so re-running does not append a second copy)
open (SOURCEFILE, "<$SourceFile") or die "Can't open $SourceFile: $!";
open (TARGETFILE, ">$TargetFile") or die "Can't create $TargetFile: $!";

while (<SOURCEFILE>) {
    # Print a line only if it differs from the one just before it
    if ( $_ ne $lastline ) { print TARGETFILE $_; }
    $lastline = $_;
}

# Close the files.
close (SOURCEFILE);
close (TARGETFILE);


# REMOVES ALL DUPLICATES ########
my $SourceFile = $ARGV[0] || "Server.log";
my $TargetFile = $SourceFile.".NoDups";
my %lines;           # every line seen so far

# Open the files ('>' overwrites, so re-running does not append a second copy)
open (SOURCEFILE, "<$SourceFile") or die "Can't open $SourceFile: $!";
open (TARGETFILE, ">$TargetFile") or die "Can't create $TargetFile: $!";

while (<SOURCEFILE>) {
    chomp;
    # Print a line only the first time it is seen
    if ( !exists $lines{$_} ) {
        $lines{$_} = 1;
        print TARGETFILE "$_\n";
    }
}

# Close the files.
close (SOURCEFILE);
close (TARGETFILE);

Note from japhy -- use the <pre> and </pre> tags around your code block (except use [ and ] instead of < and >)

Jean
QA Engineer @ http://www.extent.com
mage@lycosmail.com
(This post was edited by japhy on Feb 21, 2001, 8:17 AM)