
Laurent_R
Veteran
/ Moderator
Oct 5, 2012, 9:02 AM
Post #9 of 9
(16582 views)
|
Re: [Laurent_R] sort 10 million 7-digit phone numbers with ~1MB RAM
[In reply to]
|
Can't Post
|
|
Using the idea described in the above post of not storing the numbers in the array, but storing 1 in the N th element of the array if N exists in the input, I can now allocate ten million random numbers and "sort" them, using the same command line type of test as before:
$ time perl -e '$c[int rand 10000000]=1 foreach (1..10000000); foreach (1..10000000) {print "$_\n" if $c[$_] ==1}' > ten_million.txt real 0m16.359s user 0m15.015s sys 0m0.593s ~ $ head ten_million.txt 1 2 3 4 5 6 7 9 11 12 ~ $ tail ten_million.txt 9999986 9999987 9999988 9999989 9999990 9999991 9999992 9999993 9999994 9999996 Of course, I am using here far more than 1 megabyte of memory because I am not using a bit vector, but a simple array. But at least, it shows that the idea works pretty well. Basically, what is needed here is to build a routine to properly manage a bit vector of the adequate size. Second, since one megabyte will not be quite enough to hold 10 million bits (1 MB = 8,388,608 bits), though by a relatively small margin, the process should be built in two passes, for example one with numbers between 1 et 5,000,000 and a second one between 5,000,001 et 10,000,000 (you read twice the entire input file; in the first pass, you check only numbers between 0 and five million; in the second pass, you check numbers between five million and ten million).
|