May 29, 2001, 6:04 AM
Post #1 of 10
I'm having an efficiency debate with myself that I thought I might ask the gurus about. I have a program that requires records to be sorted (as so many do) - as little as 10 to as many as several thousand or more. Now, those records will be split into pages of, say, 20 records, once they're sorted, and each page will be displayed on a person's browser. But only one page needs to be created at a time.
Sort with a twist
Just to simplify the situation, say @records holds all the records. At the moment, I sort the records:
@records = sort @records;
And then take a slice of the 20 records that are to be displayed on the particular page:
@records = @records[$first..$last];
(where $first and $last specify the numbers of the first and last records to be displayed).
Now that's fine, and it works, but I can think of another way to do it that may be better, and this is where it gets a little tricky. From what I've read, the Perl sort function uses QuickSort (although I also read that 5.6 uses MergeSort). Now, what I'm thinking is that when I use the above code, Perl sorts the entire array for me - but it doesn't need to. It only really needs to sort that part between $first and $last. Instead, I was thinking of writing a modified QuickSort algorithm that throws away records that it doesn't need to sort on each iteration. An example mightn't go astray:
Say there were 1000 records, and records 20 to 39 need to be returned. The QuickSort algorithm picks its first pivot element, and pivots around that. The pivot element ends up at position 300. Now, the order of records 301 to 1000 aren't important, because we know that in the final order, records 20 to 39 are now in the first 300 elements in the array, so all records above 300 can just be thrown away, and records 1 to 299 can be sorted further. Then, the pivot might end up at position 89 after the second iteration, and so records 90 and above can be thrown out. Basically, when QuickSort comes to forking into a sort of a particular sub-array, it decides whether it really needs to or not.
The question is this - if that will actually work (I haven't programmed it yet), will it be faster than the first, standard perl sort method? The reason I ask is that I also read that because the sort function is in binary form, it executes faster than a custom made sort algorithm, but how much faster?
I'm thinking that the standard Perl sort would be better for small array sizes, but the modified QuickSort better for large array sizes. Any ideas? Am I making sense?