How To Retrieve a Random Line From A File?

 



sleuth
Enthusiast

Jan 18, 2001, 1:53 PM

Post #1 of 4 (4129 views)
How To Retrieve a Random Line From A File?

To retrieve a random line from a file, you first need the total number of lines in the file, then a random integer within that range, and then you read the file up to that line. The version below uses wc -l to count the lines, so it works well on unix/linux boxes.

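$filename = "test.db";
# Count the lines. Reading via "<" keeps the file name out of wc's output,
# so digits in a name like data2.db can't end up in the count.
($total = `wc -l < $filename`) =~ s!\D+!!g;
# $. starts at 1, so pick a line number from 1 to $total (not 0 to $total-1).
$rand = int(rand($total)) + 1;
open(FILE, "<$filename") or die "Can't open $filename: $!";
while (<FILE>) {
    if ($. == $rand) {
        $myRandomLine = $_;
        last;
    }
}
close(FILE);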

Now $myRandomLine is the random line.

If you are on an NT box, where wc is usually not available, you could use this instead:

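$filename = 'data2.db';
open(FILE, "<$filename") or die "Can't open $filename: $!";
# Read every line into @all, counting as we go.
while (<FILE>) {
    push(@all, $_);
    $total++;
}
close(FILE);
# Array indexes start at 0, so 0 to $total-1 is the right range here.
$rand = int(rand($total));
$myRandomLine = $all[$rand];
print "$myRandomLine\n";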

Once again, $myRandomLine is the random line. Also, note that this way is less efficient because it stores the entire file in memory, whereas the first method never has more than one line of the file in memory at a time (even Japhy said so).
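
For what it's worth, the low-memory approach does not strictly depend on wc. Below is a minimal sketch that counts the lines in Perl with a first pass and then reads up to the chosen line in a second pass, so it should behave the same on either platform. The sub name random_line_two_pass is made up for this example and is not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): pick a random line
# by reading the file twice, so only one line is in memory at a time.
sub random_line_two_pass {
    my ($filename) = @_;

    # Pass 1: count the lines without storing them.
    open(my $fh, '<', $filename) or die "Can't open $filename: $!";
    my $total = 0;
    $total++ while <$fh>;
    close($fh);
    return unless $total;    # empty file: nothing to pick

    # Pass 2: $. is 1-based, so choose a line number from 1 to $total.
    my $target = int(rand($total)) + 1;
    open($fh, '<', $filename) or die "Can't open $filename: $!";
    my $line;
    while (my $candidate = <$fh>) {
        if ($. == $target) {
            $line = $candidate;
            last;
        }
    }
    close($fh);
    return $line;
}
```

Calling random_line_two_pass('test.db') returns one line, newline included, or nothing if the file is empty. The trade-off is two reads of the file in exchange for constant memory.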

TESTING: The first set of code was tested on Red Hat Linux running Apache, using perl 5.006.

The NT code was tested on my Windows 98 PC running ActiveState Perl 5.6 and PWS (Personal Web Server) by Microsoft.

Sleuth



japhy
Enthusiast

Jan 18, 2001, 3:15 PM

Post #2 of 4 (4127 views)
Re: How To Retrieve a Random Line From A File?

I hate to break it to you, sleuth, but the FAQ has a far more efficient method that is proven to be fair for each line:


Code
japhy% perldoc -q 'random line' 

How do I select a random line from a file?

Here's an algorithm from the Camel Book:

srand;
rand($.) < 1 && ($line = $_) while <>;

This has a significant advantage in space over
reading the whole file in. A simple proof by
induction is available upon request if you doubt
its correctness.
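
To make the FAQ one-liner easier to drop into a script, here is a rough sketch of the same one-pass idea wrapped in a sub that takes a filehandle. The sub name random_line and the in-memory sample data are assumptions for illustration only. The sketch also assumes the handle is freshly opened, so that $. starts counting at 1. The explicit srand is left out because perl 5.004 and later call it implicitly on the first use of rand.

```perl
use strict;
use warnings;

# Hypothetical wrapper (the sub name is ours, not the FAQ's) around the
# one-pass technique: line number $. replaces the current pick with
# probability 1/$., so only one candidate line is held at a time.
sub random_line {
    my ($fh) = @_;
    my $line;
    while (my $candidate = <$fh>) {
        $line = $candidate if rand($.) < 1;
    }
    return $line;    # undef if the handle had no lines
}

# Demo on an in-memory file so the sketch runs without a real test.db.
my $text = "alpha\nbeta\ngamma\n";
open(my $fh, '<', \$text) or die "Can't open in-memory file: $!";
my $pick = random_line($fh);
close($fh);
print $pick;
```

With a real file, the only change would be opening it by name instead of by reference to a string.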

Jeff "japhy" Pinyan -- accomplished hacker, teacher, lecturer, and author


japhy
Enthusiast

Jan 18, 2001, 3:48 PM

Post #3 of 4 (4125 views)
Re: How To Retrieve a Random Line From A File?

Here is the proof that:


Code
rand($.) < 1 and ($line = $_) while <FH>;

provides a fair and equal distribution for any number of lines.

Proof by Induction

The rand($bound) function returns some real number x such that 0 <= x < $bound. The $. variable contains the current line number of the filehandle being read from.

Thus, for the first line of the file, we have rand(1) < 1, which is true 100% of the time. Thus, after the first line of the file has been read, $line is the first line of the file.

When we get to the second line, rand(2) < 1 is true 1/2 of the time. This means that the second line has a 50% chance of being placed in $line, and the first line stays in $line the other 50% of the time (100% of 50% = 50%).

When we get to the third line, rand(3) < 1 is true 1/3 of the time. This means that the third line has a 1/3 (about 33%) chance of being in $line. The remaining 2/3 is shared by the first two lines: each of them was in $line with probability 1/2 beforehand, so each stays there with probability 1/2 of 2/3 = 1/3.

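So, assume as the induction hypothesis that after N lines have been read, each of them has a 1/N probability of being the one in $line.

Now, line N + 1 has a 1 / (N + 1) probability of replacing it. That leaves an N / (N + 1) probability that $line is unchanged, and by the hypothesis that probability is split evenly among the first N lines, which gives each of them (1 / N) * (N / (N + 1)) = 1 / (N + 1). So all N + 1 lines are equally likely, and by induction the same holds for any number of lines.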
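
Written out as a formula, the inductive step looks roughly like this. The symbols k, for any one of the first N lines, and P, for probability, are notation introduced here and are not part of the original post.

```latex
% Line k (1 <= k <= N) is in $line after N+1 lines have been read iff it was
% there after N lines (hypothesis: 1/N) and line N+1 did not replace it.
P(\text{line } k \text{ after } N+1 \text{ lines})
  = \frac{1}{N}\left(1 - \frac{1}{N+1}\right)
  = \frac{1}{N}\cdot\frac{N}{N+1}
  = \frac{1}{N+1},
\qquad
P(\text{line } N+1 \text{ after } N+1 \text{ lines}) = \frac{1}{N+1}.
```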

This proof can also be found in chapter 8 of the Perl Cookbook.
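
If a proof feels too abstract, the claim can also be checked empirically. The sketch below is not from the thread: it runs the same selection rule many times over a fixed list of lines and tallies how often each one wins. The sub name pick_counts, the sample words, and the trial count are all arbitrary choices made for illustration.

```perl
use strict;
use warnings;

# Hypothetical check (not from the thread): run the one-pass selection
# many times over the same lines and count how often each is picked.
sub pick_counts {
    my ($lines, $trials) = @_;
    my %count = map { $_ => 0 } @$lines;
    for my $trial (1 .. $trials) {
        my ($seen, $pick) = (0, undef);
        for my $candidate (@$lines) {
            $seen++;                                  # plays the role of $.
            $pick = $candidate if rand($seen) < 1;    # true 1/$seen of the time
        }
        $count{$pick}++;
    }
    return \%count;
}

my $counts = pick_counts([qw(one two three four)], 40_000);
printf "%-5s %d\n", $_, $counts->{$_} for sort keys %$counts;
```

With four lines and 40,000 trials, each tally should land close to 10,000. The exact numbers will differ from run to run.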

Jeff "japhy" Pinyan -- accomplished hacker, teacher, lecturer, and author


sleuth
Enthusiast

Jan 18, 2001, 6:38 PM

Post #4 of 4 (4122 views)
Re: How To Retrieve a Random Line From A File?

Speechless. Sorry, can't speak, code too good to speak ...


 
 

