Open file command with Sort and sub function call

 



C_PRASANNA
Novice

Mar 6, 2015, 4:38 AM

Post #1 of 17 (7069 views)

Hello,

Can someone explain clearly what exactly the attached Perl program does?

In particular, I need to understand what the following line does:

open $fOut,"| sort | perl -e \"use Extraction;Extraction::WriteFile('E:\OutputFolder\OutputFile.txt')\"";

As far as I can tell, the WriteFile function is called first, and control comes back out after the file is opened inside this function. But just before the file handle $fOut is closed, the flow goes into the WriteFile function again and creates the output file.

Normally, when we open a file it is created at the time of the open call if it does not already exist, but here the file is created only just before the handle $fOut is closed.

Regards,

Prasanna - India.
Attachments: Extraction.pl (0.48 KB)


BillKSmith
Veteran

Mar 6, 2015, 12:44 PM

Post #2 of 17 (7059 views)
Re: [C_PRASANNA] Open file command with Sort and sub function call

Output to the filehandle $fOut is piped to a sort command and then on to another perl program which uses the Extraction module to write the output to a file. The syntax probably only makes sense on unix systems. This statement really does not do anything because the filehandle is never referenced.
Good Luck,
Bill
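
For readers following along, here is a minimal sketch of the kind of pipe open described above. It is not the OP's code: the helper name sort_via_pipe and the temporary file are assumptions made for illustration, and it assumes a Unix-like system with an external sort command on the PATH.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: send lines to the external `sort` command through
# a pipe and let the shell redirect sort's output into $out_path.
sub sort_via_pipe {
    my ($lines, $out_path) = @_;

    # The '|-' mode opens a write-only handle connected to the command's STDIN.
    # (A real script should shell-quote $out_path before interpolating it.)
    open(my $pipe, '|-', "sort > $out_path")
        or die "Cannot start sort: $!";

    print {$pipe} $_ for @$lines;

    # close() signals end-of-file to sort and waits for the pipeline to exit.
    close($pipe) or die "sort pipeline failed: status $?";
    return;
}

my ($demo_fh, $demo_path) = tempfile(UNLINK => 1);
close($demo_fh);
sort_via_pipe(["pear\n", "apple\n", "fig\n"], $demo_path);
```

Everything printed to the handle becomes the standard input of the command on the other side of the pipe, which is the same mechanism the one-liner in the original open relies on.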


C_PRASANNA
Novice

Mar 6, 2015, 10:44 PM

Post #3 of 17 (7052 views)
Re: [BillKSmith] Open file command with Sort and sub function call

Thanks Bill.

But please explain the flow of the attached code.

In general, Perl opens a file with the open command, and all the data is committed to the opened file when the file handle is closed.

Please refer to the attached Extraction.pm module for the code flow steps below.

Step 1: The WriteFile function is called in the open command. The interpreter goes into the WriteFile function but does not execute anything inside it.

open $fOut,"| sort | perl -e \"use Extraction;Extraction::WriteFile('E:\OutputFolder\OutputFile.txt')\""; # Step 1

Step 2: The input file ($fIn) is opened in read mode.

Step 3: The $fIn data is printed to $fOut.

Step 4: On close($fOut), the call automatically goes to the WriteFile function, which opens the output file and executes the rest of the function.

Questions:

1. Why is the WriteFile function called twice: first at the time of the open command (where nothing inside the function is executed), and again at the time of close($fOut)?

2. STDIN is used inside the WriteFile function to receive the data collected in Step 3. How can STDIN be used like this? (STDIN is normally used to receive standard input from the user.)

Thanks,

Prasanna.
Attachments: Extraction.pm (0.77 KB)


BillKSmith
Veteran

Mar 7, 2015, 7:24 AM

Post #4 of 17 (7041 views)
Re: [C_PRASANNA] Open file command with Sort and sub function call

Your description of step 1 indicates that you do not understand the command that opens a filehandle to a pipe. With my limited understanding of unix, I am not sure that I do, but I will share my best guess.

Your open forks a child process consisting of a Unix sort command with its output piped to the perl interpreter running an anonymous script. All output to $fOut is piped to this process.

Your step 3 sends the data to the sort program. It cannot do anything until it has all the data. (I must assume that close $fOut sends an end-of-file to sort.) When sort completes, it pipes its output to perl. The anonymous script uses the Extraction module to open a real file, write the data to it and then close the file. Then perl exits and the child process ends. I assume that the parent process (the main perl program) has already terminated.

This should give you a better idea of the flow. Perhaps a unix guru can correct my terminology and supply more detail.
Good Luck,
Bill
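
On the STDIN question: in the child process, STDIN is simply the read end of the pipe, not the keyboard. The sketch below illustrates that under some assumptions: the child script, the helper name send_through_child and the "child saw:" prefix are invented for the example, and the list form of a pipe open is assumed to be available (it is on Unix-like systems).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Script run by the child perl. It reads whatever arrives on its STDIN
# (the read end of the pipe) and copies it to the file named in $ARGV[0].
my $child_script = <<'END_SCRIPT';
my $path = shift @ARGV;
open(my $out, '>', $path) or die "Cannot write $path: $!";
print {$out} "child saw: $_" while <STDIN>;
close($out) or die "Cannot close $path: $!";
END_SCRIPT

# Hypothetical helper: start a child perl and feed it lines through a pipe.
sub send_through_child {
    my ($lines, $out_path) = @_;

    # List form of a pipe open: no shell involved. $^X is the running perl.
    open(my $to_child, '|-', $^X, '-e', $child_script, $out_path)
        or die "Cannot start child perl: $!";

    print {$to_child} $_ for @$lines;

    # close() delivers end-of-file to the child and waits for it to exit.
    close($to_child) or die "Child failed: status $?";
    return;
}

my ($copy_fh, $copy_path) = tempfile(UNLINK => 1);
close($copy_fh);
send_through_child(["one\n", "two\n"], $copy_path);
```

The parent never calls anything in the child's script. It only writes to the pipe, and the child's while (<STDIN>) loop ends when the parent closes its end.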


FishMonger
Veteran / Moderator

Mar 7, 2015, 9:36 AM

Post #5 of 17 (7036 views)
Re: [C_PRASANNA] Open file command with Sort and sub function call

You could use perl's debugger to step through the script to see its flow, or you could run it under strace to get a lower-level view of the process flow.

This is clearly a very obfuscated and inefficient way to copy and sort a file.

I have not tested that horrible mess, but there are a couple parts of it that lead me to believe that it would fail with a compilation error. I'm referring to using an empty prototype definition which implies that the sub doesn't accept arguments, but one is being passed when called. I don't recall for sure if using the FQN when calling a sub bypasses prototype checking.
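
The prototype question can be checked with a small experiment. The following is a sketch, not the OP's module: the package Demo and the sub no_args are made-up names. It compiles a fully qualified call at run time with a string eval, so that a compile-time rejection can be caught, and then repeats the call with the & sigil, which is documented to skip prototype checking.

```perl
use strict;
use warnings;

package Demo;

# Empty prototype: declares that the sub takes no arguments.
sub no_args() { return scalar @_ }

package main;

# Compile the fully qualified call at run time so a compile error is trappable.
my $compiled = eval q{ Demo::no_args('x'); 1 };
print "Qualified call: ", ($compiled ? "compiled\n" : "rejected at compile time\n");

# Calling with the & sigil ignores the prototype.
my $count = &Demo::no_args('x');
print "&-call received $count argument(s)\n";
```

When the sub is already declared at the time the call is compiled (as it is after a use statement), the fully qualified call is still subject to the prototype, so the first line reports a rejection.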


(This post was edited by FishMonger on Mar 7, 2015, 9:47 AM)


Laurent_R
Veteran / Moderator

Mar 7, 2015, 3:25 PM

Post #6 of 17 (7025 views)
Re: [BillKSmith] Open file command with Sort and sub function call


In Reply To
Your step 3 sends the data to the sort program. It cannot do anything until it has all the data.


Hmm, I understand what you mean, but I think this is not really accurate, or not accurately phrased. The correct sentence would be, I think: "It cannot output anything until it has all the data."

I use the Unix sort utility together with Perl programs very regularly, with huge data sets (very often tens of GB). And, monitoring the temporary directory for work files, I know it starts to do a lot of things before having the whole data. But of course, you are absolutely right that a sort program cannot start spitting out data before it has seen the entire input.
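
A rough way to observe the "no output before end of input" behaviour is to look at the size of sort's output file before and after closing the pipe. This is only a sketch with assumed names (sizes_around_close, a temporary output file) and it assumes a Unix-like system with sort on the PATH.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Hypothetical helper: returns the output file size before and after close().
sub sizes_around_close {
    my ($lines, $out_path) = @_;

    open(my $pipe, '|-', "sort > $out_path")
        or die "Cannot start sort: $!";

    # Flush after every print so the data really reaches sort right away.
    $pipe->autoflush(1);
    print {$pipe} $_ for @$lines;

    # sort has the data but has not seen end-of-file yet.
    sleep 1;
    my $before = (-s $out_path) || 0;

    # close() delivers end-of-file and waits for the pipeline to finish.
    close($pipe) or die "sort pipeline failed: status $?";
    my $after = (-s $out_path) || 0;

    return ($before, $after);
}

my ($size_fh, $size_path) = tempfile(UNLINK => 1);
close($size_fh);
my ($before, $after) = sizes_around_close(["b\n", "a\n", "c\n"], $size_path);
print "bytes before close: $before, after close: $after\n";
```

The file exists as soon as the shell sets up the redirection, but it stays empty until sort has seen end-of-file.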


C_PRASANNA
Novice

Mar 9, 2015, 10:49 AM

Post #7 of 17 (6950 views)
Re: [Laurent_R] Open file command with Sort and sub function call

Thanks Bill/Laurent/FishMonger.

From all of your responses I understand that sort produces its output only once the complete data has been collected.

If so, how come the flow goes inside the WriteFile function first and then comes out without executing anything inside?

I have debugged the code and added print statements to the attached code for your reference. Please suggest.


BillKSmith
Veteran

Mar 9, 2015, 12:51 PM

Post #8 of 17 (6941 views)
Re: [C_PRASANNA] Open file command with Sort and sub function call

Remember that FishMonger has already told you that this is a very bad solution to the original problem. It does remain interesting, even if not very useful.

I doubt that the behavior you describe ever happens. Please post complete copies of both the main program and the module. Show us exactly what makes you think so. WriteFile is a function of the module (Extraction.pm). The module is loaded (use Extraction;) by a perl program running in the child process. Your main program does not even know that it exists. How can it call it??? Perhaps you have another function with the same name?
Good Luck,
Bill


C_PRASANNA
Novice

Mar 10, 2015, 9:48 PM

Post #9 of 17 (6870 views)
Re: [BillKSmith] Open file command with Sort and sub function call

Bill,

I have attached the main program (ExtractionMain.pl) and the module (Extraction.pm) here. Extraction.pm contains print statements to help follow the code flow. The strings printed by those statements are listed in the same module after __END__.

I hope this is useful.
Attachments: ExtractionMain.pl (0.15 KB)
  Extraction.pm (0.83 KB)


FishMonger
Veteran / Moderator

Mar 11, 2015, 5:42 AM

Post #10 of 17 (6846 views)
Re: [C_PRASANNA] Open file command with Sort and sub function call

Your code doesn't compile and the module is missing 2 important use statements (the warnings and strict pragmas) which should be in every script you write.

Fix those problems then come back with the updated code and a specific question.


C_PRASANNA
Novice

Mar 11, 2015, 6:34 AM

Post #11 of 17 (6836 views)
Re: [FishMonger] Open file command with Sort and sub function call

Bill,

The program that I attached is just a sample code snippet. We have similar code that has been running on a server for over 10 years.

But the actual question is: how come the WriteFile function is called twice? For this, please open the Extraction.pm module for reference and check the print statements that I added in the order of the code flow. What you should focus on is "Step 2 ...!" and "Step 7 ...!". After printing "Step 2 ...!" the flow comes out of the WriteFile function and goes back inside it at the time of "Step 7 ...!".

Given what you all explained earlier, please explain why the interpreter goes to the WriteFile function before all the data has been collected, receives the argument inside the function and then comes out.

Reference: the print output that I provided after __END__ in the module shows the flow of the code.


FishMonger
Veteran / Moderator

Mar 11, 2015, 7:15 AM

Post #12 of 17 (6832 views)
Re: [C_PRASANNA] Open file command with Sort and sub function call

If you want us to test your code and explain to you how it works (or doesn't work) then you should at least provide code that actually compiles.


Quote
after printing the "Step 2 ...!" flow comes out of writeFile function and again goes inside the WriteFile function at the time of "Step 7 ...!".

What makes you think it does that? It does not go out and back in to the subroutine.

Step through the script with the debugger to see its flow.
perl -d ExtractionMain.pl


(This post was edited by FishMonger on Mar 11, 2015, 7:51 AM)


FishMonger
Veteran / Moderator

Mar 11, 2015, 7:47 AM

Post #13 of 17 (6829 views)
Re: [C_PRASANNA] Open file command with Sort and sub function call

Here's a brief explanation of a portion of the script.

When you open $fOut, "| sort ... you open a pipe to the sort command.

After you close that pipe, the output of the sort command is piped to your perl one-liner. It's at this point that the WriteFile() function gets called and outputs the data which was piped from the sort command.

Here's a much cleaner and more efficient way to accomplish your goal, and it doesn't use a module.

Code
open my $fin,  '<', 'E:/InputFolder/Input.txt' or die("Unable to open the input file $!"); 
open my $fout, '>', 'E:/OutputFolder/OutputFile.txt' or die("Unable to open the output file $!");

print {$fout} sort <$fin>;



(This post was edited by FishMonger on Mar 11, 2015, 7:49 AM)


BillKSmith
Veteran

Mar 11, 2015, 10:41 AM

Post #14 of 17 (6812 views)
Re: [C_PRASANNA] Open file command with Sort and sub function call

You have two copies of the module (one in each process) writing buffered output to STDOUT. There is no guarantee that lines are printed in the same order that they actually occur. Autoflush might fix the "problem".
Good Luck,
Bill
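
For anyone who wants to try the autoflush suggestion, a minimal sketch follows. The trace helper and the pid prefix are assumptions added for illustration. The idea is only to make each process flush its trace lines immediately and label them, so that the order on the console is easier to interpret.

```perl
use strict;
use warnings;
use IO::Handle;

# Flush STDOUT and STDERR after every print, so trace lines leave this
# process as soon as they are printed instead of sitting in a buffer.
STDOUT->autoflush(1);
STDERR->autoflush(1);

# Hypothetical helper: prefix each trace line with the process id,
# so lines from the parent and the child can be told apart.
sub trace {
    my ($message) = @_;
    print "[pid $$] $message\n";
    return;
}

trace('Step 1 ..!');
```

With this at the top of the module, both the parent and the child perl would label their own lines, since each process loads its own copy.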


FishMonger
Veteran / Moderator

Mar 11, 2015, 10:56 AM

Post #15 of 17 (6810 views)
Re: [BillKSmith] Open file command with Sort and sub function call

Sorry Bill but that's not correct. Only 1 process is writing to the file.

Here's the "corrected version" of the OP's code.

First, the contents of the input file.

Quote
Step 5 ..!
Step 3 ..!
Step 4 ..!
Step 6 ..!
Step 1 ..!
Step 7 ..!
Step 2 ..!


"Corrected" module code. I adjusted it slightly to work on Linux because my Windows box just had a HD crash.

Code
package Extraction;

#use strict;
#use warnings;

sub ReceiveFile()
{
    print "Step 1 ..! \n";
    # The file name uses double quotes inside the shell's single quotes;
    # nested single quotes would reach the one-liner as a bareword.
    open $fOut, "| sort | perl -MExtraction -e 'Extraction::WriteFile(\"OutputFile.txt\")'";

    print "Step 3 ..! \n";
    open $fIn, "<Input.txt" or die("Unable to open the file"); #-Step2

    while (<$fIn>)
    {
        print "Step 4 ..! \n";
        print $fOut "$_"; #-Step3
    }
    print "Step 5 ..! \n";
    close($fIn);

    print "Step 6 ..! \n";
    close($fOut);
}

sub WriteFile
{
    my $out_file_name = shift;
    print "Step 2 ..! \n";

    print "Step 7 ..! \n";
    open OUT, ">$out_file_name" or die("$! $out_file_name");

    # STDIN here is the read end of the pipe coming from sort.
    while (<STDIN>)
    {
        print OUT "$_";
    }

    close(OUT);
}

1;


Contents of output file.

Quote
Step 1 ..!
Step 2 ..!
Step 3 ..!
Step 4 ..!
Step 5 ..!
Step 6 ..!
Step 7 ..!


EDIT: Here's the console output.

Quote
./ExtractionMain.pl is used to Extract the workfiles...
Step 1 ..!
Step 3 ..!
Step 4 ..!
Step 4 ..!
Step 4 ..!
Step 4 ..!
Step 4 ..!
Step 4 ..!
Step 4 ..!
Step 5 ..!
Step 6 ..!
Step 2 ..!
Step 7 ..!



(This post was edited by FishMonger on Mar 11, 2015, 11:02 AM)


BillKSmith
Veteran

Mar 11, 2015, 1:04 PM

Post #16 of 17 (6804 views)
Re: [FishMonger] Open file command with Sort and sub function call

Fishmonger,

Refer to attachments to post #9. I agree that only the child process writes to the file. Both processes load (use) the same module. Both print debug information to STDOUT. This debug info could be misleading.

You have clearly demonstrated that you have fixed this debug problem. I still do not understand how.
Good Luck,
Bill


FishMonger
Veteran / Moderator

Mar 11, 2015, 1:43 PM

Post #17 of 17 (6800 views)
Re: [BillKSmith] Open file command with Sort and sub function call

Yes, both processes print to STDOUT, but there's no overlap. If you step through the script with the debugger, you'll find that the second perl process doesn't start until after the $fOut file handle is closed. That is when the sort command outputs/pipes its buffered data to the perl process.
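
The timing can also be checked without the debugger. The sketch below is not the OP's code and makes several assumptions: a Unix-like system with sh and sort, a made-up helper name (child_started_before_close), and a marker file that the downstream perl creates as its very first action. The parent opens the pipeline and then waits, without writing or closing anything, to see whether the marker shows up. A shell normally starts all stages of a pipeline together, so if the marker appears before close, the downstream perl was already running and was only blocked waiting on its STDIN.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Time::HiRes qw(sleep);

# Hypothetical helper: true if the downstream perl was running before close().
sub child_started_before_close {
    my $dir    = tempdir(CLEANUP => 1);
    my $marker = "$dir/started";

    # Downstream script: create the marker at once, then drain STDIN.
    my $child = q{open(my $m, '>', shift @ARGV) or die $!; close($m); 1 while <STDIN>;};

    # sh receives the perl path, the script and the marker as $0, $1 and $2.
    open(my $pipe, '|-', 'sh', '-c', 'sort | "$0" -e "$1" "$2"',
         $^X, $child, $marker)
        or die "Cannot start pipeline: $!";

    # Poll for up to about five seconds before sending any data.
    my $seen = 0;
    for my $attempt (1 .. 50) {
        if (-e $marker) { $seen = 1; last }
        sleep 0.1;
    }

    print {$pipe} "data\n";
    close($pipe) or die "Pipeline failed: status $?";
    return $seen;
}

print child_started_before_close()
    ? "downstream perl was running before close\n"
    : "downstream perl was not seen before close\n";
```

If the first message is printed, a trace line emitted at the top of WriteFile can legitimately appear early, while anything printed after the STDIN loop can only appear once the handle has been closed.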

The main issue with the contents of OutputFile.txt was the additional \n char the OP added to the print statement, which caused blank lines at the beginning. I did not test this, but I suspect that if buffering was disabled on filehandles, then the blank lines would have been in-between each of the sorted lines instead of above them. After removing the unwanted \n char in the print statements, the proper/desired output was returned.
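
The effect of the extra \n is easy to reproduce with Perl's own sort, which compares strings the same way for this data. The sample lines below are made up for the illustration: every real line is followed by an empty line, which is what printing "$_\n" produces for input that already ends in a newline.

```perl
use strict;
use warnings;

# Each traced line is followed by a blank line, as happens when an
# extra "\n" is appended to lines that already end in a newline.
my @traced = ("Step 2 ..!\n", "\n", "Step 1 ..!\n", "\n");

# A newline character sorts before any letter, so blank lines come first.
my @sorted = sort @traced;

print @sorted;
```

The two blank lines end up at the top of the output, followed by the step lines in order.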

 
 

