Relatively simple data extraction

 



shlam16
New User

Oct 12, 2012, 4:58 AM

Post #1 of 5 (3324 views)

Hi guys, I am a complete beginner when it comes to Perl, and I have been set a task that is beyond me.

I have a whole bunch of files, each with 6 columns of data. I need to extract the first 2 columns from each of these files and print them all to a single output file.

The files are numbered sequentially, so the data needs to be extracted in that order. By sequential I mean names like "file.1.1", "file.3.1", "file.5.1", etc.

I feel like I could probably use cat to concatenate everything into one file and then use sed to delete the last 4 columns, but I was hoping someone could show me a better way in Perl?



Thanks, Shlam.


(This post was edited by shlam16 on Oct 12, 2012, 5:05 AM)


BillKSmith
Veteran

Oct 12, 2012, 7:46 AM

Post #2 of 5 (3316 views)
Re: [shlam16] Relatively simple data extraction

The straightforward way to do this in Perl requires somewhat more code than your shell script.

One advantage of the Perl script is that it can sort the file names numerically rather than as strings. I find that the easiest way to do this is to use the common idiom called the "Schwartzian Transform".


Code
     
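use strict;
use warnings;

# Collect the input files and sort them numerically on the middle field
# (Schwartzian Transform: decorate, sort, undecorate).
my @files = glob 'file.*.1';
my @sorted_files = map  { $_->[1] }
                   sort { $a->[0] <=> $b->[0] }
                   map  { /\.(\d+)\./ ? [ $1, $_ ] : () }
                   @files;

open my $OUT, '>', 'mergefile.txt' or die "Cannot open output file: $!";
foreach my $file (@sorted_files) {
    open my $FH, '<', $file or die "Cannot open $file: $!";
    while ( my $line = <$FH> ) {
        # Split the current line (not $_) on whitespace; keep the first two columns.
        print {$OUT} join( ' ', ( split ' ', $line )[ 0, 1 ] ), "\n";
    }
    close $FH;
}
close $OUT or die "Cannot close output file: $!";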

Note: code untested.
Good Luck,
Bill
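
For anyone who wants to check the column extraction on its own, here is a minimal sketch. The helper name first_two_columns is made up for illustration, and it assumes every line has at least two whitespace-separated fields (with fewer, Perl will warn about undefined values).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the first two
# whitespace-separated fields of a line, joined by a single space.
sub first_two_columns {
    my ($line) = @_;
    # split with the string ' ' skips leading whitespace and splits on
    # runs of whitespace, so the trailing newline is dropped as well.
    my @fields = split ' ', $line;
    return join ' ', @fields[ 0, 1 ];
}

print first_two_columns("  1.0   2.5  3  4  5  6\n"), "\n";
```

Passing the line explicitly matters here: a bare split works on $_, which is not necessarily the line that was just read.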


Laurent_R
Veteran / Moderator

Oct 12, 2012, 10:28 AM

Post #3 of 5 (3312 views)
Re: [BillKSmith] Relatively simple data extraction

Hi,

I do not think that sorting the files is necessary, as the glob function returns the file names in sorted order.


BillKSmith
Veteran

Oct 12, 2012, 10:33 AM

Post #4 of 5 (3310 views)
Re: [Laurent_R] Relatively simple data extraction

Numeric order??
Good Luck,
Bill


Laurent_R
Veteran / Moderator

Oct 13, 2012, 1:44 AM

Post #5 of 5 (3299 views)
Re: [BillKSmith] Relatively simple data extraction


In Reply To
Numeric order??


Yes, right. Because of the example given in the OP, I was thinking of only five files, but we don't know how many files there are, nor do we know their exact naming scheme.
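
To make the difference concrete, here is a small sketch comparing a plain string sort (the order glob typically returns) with a numeric sort on the middle field. The file names are invented, following the "file.N.1" pattern from the original post; the real names may differ.

```perl
use strict;
use warnings;

# Invented file names for illustration only.
my @names = map { "file.$_.1" } ( 1, 3, 5, 7, 9, 11, 13 );

# Plain string comparison: "file.11.1" sorts before "file.3.1".
my @lexical = sort @names;

# Numeric comparison on the number between the two dots.
my @numeric = map  { $_->[1] }
              sort { $a->[0] <=> $b->[0] }
              map  { [ ( /\.(\d+)\./ ? $1 : 0 ), $_ ] }
              @names;

print "lexical: @lexical\n";
print "numeric: @numeric\n";
```

With single-digit numbers the two orders agree, which is why the example in the original post did not show the problem; once the numbers reach two digits, they diverge.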

 
 

