CGI/Perl Guide | Learning Center | Forums | Advertise | Login
Site Search: in

  Main Index MAIN
INDEX
Search Posts SEARCH
POSTS
Who's Online WHO'S
ONLINE
Log in LOG
IN

Home: Perl Programming Help: Beginner:
Reading a File

 



chander1986
Novice

May 9, 2009, 10:49 AM

Post #1 of 21 (1963 views)
Reading a File Can't Post

I am trying to write a module that will read a csv file and will sort one of the columns in that file in an ascending order and then write back to the same file.
Is there a way to do this?


KevinR
Veteran


May 9, 2009, 11:34 AM

Post #2 of 21 (1962 views)
Re: [chander1986] Reading a File [In reply to] Can't Post

There are several ways you could go about it. What have you tried? Where are you stuck?
-------------------------------------------------


chander1986
Novice

May 9, 2009, 4:49 PM

Post #3 of 21 (1957 views)
Re: [KevinR] Reading a File [In reply to] Can't Post

What i am doing is currently reading the whole file and storing the particular column in an array. I am able to sort the column, but i need to arrange other columns according to this particular column. This is where i am actually stuck.

I think this particular way is a bit tedious. Is there a better way to do it?


KevinR
Veteran


May 9, 2009, 6:11 PM

Post #4 of 21 (1953 views)
Re: [chander1986] Reading a File [In reply to] Can't Post

If the particular column is always a unique value you could use a simple hash, where the column to be sorted is the hash keys and the values are the originals lines of the file. If the values to be sorted are not unique, use an array of arrays. Sort the array of arrays by the column of interest and print the sorted array of arrays back into the file.
-------------------------------------------------


(This post was edited by KevinR on May 9, 2009, 6:11 PM)


chander1986
Novice

May 10, 2009, 1:04 AM

Post #5 of 21 (1940 views)
Re: [KevinR] Reading a File [In reply to] Can't Post

The following is the code tht i have written to sort column 8 followed by column 9.

Code
 
use Text::ParseWords;
use IO::File;

open(FILE,"bank_id.csv");
@lines = <FILE>;

my $fh_out = IO::File->new('>>temp.csv');

@sorted = sort multisort @lines;

$length = $#lines + 1;

for($i = 0; $i<$length; $i++) {

$fh_out->print($lines[$i]);

}




sub multisort {

($field1a, $field2a, $field3a, $field4a,$field5a, $field6a, $field7a, $field8a,$field9a, $field10a, $field11a, $field12a)=split(/,/, $a);
($field1b, $field2b, $field3b, $field4b,$field5b, $field6b, $field7b, $field8b,$field9b, $field10b, $field11b, $field12b)=split(/,/, $b);

# now do the actual sort:
$field8a cmp $field8b
or
$field9a cmp $field9b;
}


But it is not working as intended. It just creates new file with the exactly same data. Where am i going wrong??


KevinR
Veteran


May 10, 2009, 10:20 AM

Post #6 of 21 (1933 views)
Re: [chander1986] Reading a File [In reply to] Can't Post

The code you posted prints @lines back to a file instead of @sorted. Try print @sorted to the file and see if it is working.
-------------------------------------------------


chander1986
Novice

May 10, 2009, 10:57 AM

Post #7 of 21 (1930 views)
Re: [KevinR] Reading a File [In reply to] Can't Post

No it is still not sorting it correctly.

I am attaching the sample input and output files. I actually want the input file to sorted on the year column and sub sorted on the month column.

Is the approach that i am using right? Or is there any other better approach than this??
Attachments: input.csv (0.49 KB)
  output.csv (0.49 KB)


KevinR
Veteran


May 10, 2009, 9:43 PM

Post #8 of 21 (1920 views)
Re: [chander1986] Reading a File [In reply to] Can't Post

This uses a cached key sort to sort the data. Add your output file stuff.


Code
use Text::ParseWords;  
use IO::File;
use strict;
use warnings;

open(FILE,"bank_id.csv") or die "$!";
<FILE>;#skip header line in file. If the header is not in the file remove this line
my @lines = <FILE>;
close FILE;
my @sorted = map{$_->[0]}
sort{$a->[8] <=> $b->[8] || $a->[9] <=> $b->[9]}
map{[$_,split(/,/)]} @lines;
print $_ for @sorted;

-------------------------------------------------


chander1986
Novice

May 11, 2009, 3:46 PM

Post #9 of 21 (1904 views)
Re: [KevinR] Reading a File [In reply to] Can't Post

Hi Kevin,

I modifed your code to include the output file thing. This is the modified version:


Code
use Text::ParseWords;   
use IO::File;
use strict;
use warnings;

my $fh_out = IO::File->new('>>temp.csv');

open(FILE,"bank_id.csv") or die "$!";
<FILE>;#skip header line in file. If the header is not in the file remove this line
my @lines = <FILE>;
close FILE;
my @sorted = map{$_->[0]}
sort{$a->[8] <=> $b->[8] || $a->[9] <=> $b->[9] }
map{[$_,split(/,/)]} @lines;
print $_ for @sorted;

my $length = $#sorted + 1;
my $i = 0;

for($i=0 ; $i<$length; $i++) {

$fh_out->print($sorted[$i]);

}


This code is working fine for the sample input file. But when i put some other input file, which has the same structure but more number of rows, this code doesnt work as desire. I am attaching the input file for which it is giving trouble.

The other thing is i also want to preserve the header.
Attachments: bank_id.csv (15.2 KB)


KevinR
Veteran


May 11, 2009, 4:52 PM

Post #10 of 21 (1901 views)
Re: [chander1986] Reading a File [In reply to] Can't Post

The data file has lines with embedded commas, for example:


Code
3373,6099,CONCORD EFS INC,"FUNCTIONS REL TO DEP BKE,NEC",,3373,740112,1993,1,,0,20 
5342,6099,MONEYGRAM INTERNATIONAL INC,"FUNCTIONS REL TO DEP BKE,NEC",MN,5342,1273931,1993,1,,0,20


So you need to use Text::ParseWords or other CSV parser to split the lines instead of using split(/,/).
-------------------------------------------------


KevinR
Veteran


May 11, 2009, 5:20 PM

Post #11 of 21 (1899 views)
Re: [chander1986] Reading a File [In reply to] Can't Post

OK, I had time to work on it, this seems to work with the data file you posted:


Code
use Text::ParseWords; 
use strict;
use warnings;

open(my $FILE,"bank_id.csv") or die "$!";
my $header = <$FILE>;
my @lines = <$FILE>;
close $FILE;
my @sorted = map{$_->[0]}
sort{$a->[1] <=> $b->[1] || $a->[2] <=> $b->[2] }
map{[ $_,(quotewords(',',0,$_))[7,8] ]} @lines;
print $_ for @sorted;
open (my $fh_out, '>', 'temp.csv') or die "$!";
print $fh_out $header;
print $fh_out @sorted;

-------------------------------------------------


chander1986
Novice

May 28, 2009, 11:22 AM

Post #12 of 21 (1864 views)
Re: [KevinR] Reading a File [In reply to] Can't Post

It is working, but i actually didnt understand what you did. can you provide me with some explanation?


KevinR
Veteran


May 28, 2009, 1:15 PM

Post #13 of 21 (1861 views)
Re: [chander1986] Reading a File [In reply to] Can't Post


In Reply To
It is working, but i actually didnt understand what you did. can you provide me with some explanation?


Explain code I wrote 17 days ago? If only my memory were still that good. Unsure

See if this helps to understand the rather funky looking cached key sort technique:

http://bytes.com/topic/perl/insights/775438-sorting-data-schwartzian-transform
-------------------------------------------------


(This post was edited by KevinR on May 28, 2009, 1:26 PM)


chander1986
Novice

May 28, 2009, 4:16 PM

Post #14 of 21 (1843 views)
Re: [KevinR] Reading a File [In reply to] Can't Post

I understood the code. The problem im facing is that it is working for all the input files but one. I am attaching the input file. When i try to sort this input file, it is giving me some weird output. I have modified the code somewhat:

Code
use Text::ParseWords;  
#use strict;
use warnings;

open(my $FILE,"issue_id.csv") or die "$!";
my $header = <$FILE>;
my @lines = <$FILE>;
close $FILE;
my @sorted = map{$_->[0]}
sort{$a->[1] <=> $b->[1] || $a->[2] <=> $b->[2] }
map{[ $_,(quotewords(',',0,$_))[3,4] ]} @lines;
print $_ for @sorted;
open (my $fh_out, '>', 'temp.csv') or die "$!";
print $fh_out $header;
print $fh_out @sorted;



Can u help me out with this?. I have spent almost 4 hrs on this but am not able to figure out?


KevinR
Veteran


May 29, 2009, 8:59 AM

Post #15 of 21 (1827 views)
Re: [chander1986] Reading a File [In reply to] Can't Post

Attach the file and I will see if I can help.
-------------------------------------------------


chander1986
Novice

May 29, 2009, 10:17 AM

Post #16 of 21 (1823 views)
Re: [KevinR] Reading a File [In reply to] Can't Post

I have attached the file.


KevinR
Veteran


May 29, 2009, 10:24 AM

Post #17 of 21 (1822 views)
Re: [chander1986] Reading a File [In reply to] Can't Post

Where is it attached?
-------------------------------------------------


chander1986
Novice

May 29, 2009, 10:25 AM

Post #18 of 21 (1821 views)
Re: [KevinR] Reading a File [In reply to] Can't Post

Oops...Sorry..now its attached
Attachments: issue_id.csv (62.3 KB)


KevinR
Veteran


May 29, 2009, 12:12 PM

Post #19 of 21 (1817 views)
Re: [chander1986] Reading a File [In reply to] Can't Post

OK, while you did not desrcibe what "weird output" is, I think I see the problem. Lines like this in the file:


Code
47,Whole Foods Market Inc,0000865436,1993,12,7,2967,Mrs Gooch's Natural Foods


The apostrophe at "h's" seems to be interpreted as a quoting mechanism like double-quotes that are used around fields that have embedded commas. Switching to Text::CSV_XS might eliminate that problem but you can also use a regexp to escape the apostrophe first before passing the line to the quotewords() function. LIke this:


Code
my @sorted = map{$_->[0]}     
sort{$a->[1] <=> $b->[1] || $a->[2] <=> $b->[2] }
map{s/'/\'/g;[ $_,(quotewords(',',0,$_))[3,4] ]} @lines;


That will add some overhead to the program but hopefully not very much.

The documentation for Text::ParseWords only mentions "quotes" but apparently that means single-quotes and double-quotes, so any single-quote being used as an apostrophe in a word like Mrs Gooch's will cause the splitting to break and not be parsed at all. The simple answer is escape single-quotes like I showed above. I don't see any single-quotes being used to quote fields in the file you attached that have embedded commas in them so hopefully using the regexp doesn't introduce any other problems.
-------------------------------------------------


chander1986
Novice

May 29, 2009, 12:42 PM

Post #20 of 21 (1814 views)
Re: [KevinR] Reading a File [In reply to] Can't Post

I tried running it. But it is still not working. I used the same input file i attached over here.


KevinR
Veteran


May 29, 2009, 2:34 PM

Post #21 of 21 (1811 views)
Re: [chander1986] Reading a File [In reply to] Can't Post

Saying its not working is not much help. Tell me what it does do. See if this works:


Code
use Text::ParseWords;   
use strict;
use warnings;
my @lines;
open(my $FILE,"c:/perl_test/issue_id.csv") or die "$!";
my $header = <$FILE>;
while(<$FILE>) {
chomp;
s/'/\\'/g;
push @lines, [(quotewords(',',0,$_))[3,4],$_];
}
my @sorted = sort {$a->[0] <=> $b->[0] || $a->[1] <=> $b->[1]} @lines;
print "@{$_}\n" for @sorted;

-------------------------------------------------

 
 


Search for (options) Powered by Gossamer Forum v.1.2.0

Web Applications & Managed Hosting Powered by Gossamer Threads
Visit our Mailing List Archives