CGI/Perl Guide | Learning Center | Forums | Advertise | Login
Site Search: in

  Main Index MAIN
INDEX
Search Posts SEARCH
POSTS
Who's Online WHO'S
ONLINE
Log in LOG
IN

Home: Perl Programming Help: Beginner:
Parsing CSV file - Add Concatenated field/sort

 



BigRedEO
Novice

Mar 21, 2016, 1:21 PM

Post #1 of 53 (40246 views)
Parsing CSV file - Add Concatenated field/sort Can't Post

I'm being thrown back into Perl for the first time in 9 years, so I'm considering myself a "beginner" again. I have a .csv file which I need to manipulate. Each line starts with three fields. I need to append a new field to the front of each line that is a concatenation of the first three fields, i.e. if each line starts with

$field1 $field2 $field3

I need to add CONCAT($field1, $field2 $field3) to the beginning of each line/record. To be honest, not a clue where to start with that.

Then I also need to sort the entire file in order of occurence by "init-tran-date", "update-tran-date" and "update-tran-time" which are at spread out locations in each 40 field line/record. Would it be easier to put the fields into an array and sort based on their index location?

I at least remember some of the basics from the old days! Hope it still holds true -


Code
#!/usr/bin/perl/ 

$filename = '/mypath/myfile.csv';
open(FILE, $filename) or die "Could not read from $filename, program halting.";
while (<FILE>)
{
chomp;
($field1, $field2, $field3, etc) = split(","); #or array?

#sort
print;
}

close (FILE);
exit;



FishMonger
Veteran / Moderator

Mar 21, 2016, 6:21 PM

Post #2 of 53 (40238 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Lets start at the beginning. Every perl script should begin by loading the strict and warnings pragmas.

Code
#!/usr/bin/perl 

use strict;
use warnings;

The strict pragma will require you to declare your vars, which is done with the my keyword.

Code
my $filename = '/mypath/myfile.csv';


For the reading/parsing/writing of the csv file, use the Text::CSV_XS module. http://search.cpan.org/~hmbrand/Text-CSV_XS-1.22/CSV_XS.pm


Quote
I need to add CONCAT($field1, $field2 $field3) to the beginning of each line/record. To be honest, not a clue where to start with that.

There several ways to concatenate the fields. You could simply make a double quoted string with the 3 fields, or you could use the join function. Then use the unshift function to add that string to the beginning of the array.

Your multi field sorting requirement can be done by using the sort function. The documentation on the sort function gives a couple examples and a simple google search will provide you with more extensive examples.


(This post was edited by FishMonger on Mar 21, 2016, 6:23 PM)


Laurent_R
Veteran / Moderator

Mar 22, 2016, 12:41 AM

Post #3 of 53 (40223 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

I agree with everything Fishmonger has said.

Just a small additional comment:


In Reply To

Code
 ($field1, $field2, $field3, etc) = split(","); #or array? 

}



You may do that to handle your line (for example for putting the three fields at the beginning), but in the end, for the sorting in accordance with those fields you need to use, you'll need each line to be in an array, and, therefore, the file to be an array or arrays. I would personally go for an array immediately.


BigRedEO
Novice

Mar 22, 2016, 5:51 AM

Post #4 of 53 (40220 views)
Re: [Laurent_R] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Thank you both for getting me started on this!


FishMonger
Veteran / Moderator

Mar 22, 2016, 6:59 AM

Post #5 of 53 (40216 views)
Re: [Laurent_R] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Your comment implies slurping the data into an array before doing anything with it. If the csv data is simple and doesn't have any complexities such as embedded comma's or some fields with quotes and some without , then I'd probably agree. However, if the fields aren't that simple, then I'd use the module loop over the data and build the array.


(This post was edited by FishMonger on Mar 22, 2016, 7:02 AM)


BigRedEO
Novice

Mar 22, 2016, 1:19 PM

Post #6 of 53 (40205 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

In looking at what I need to really do here (I'm parsing this file with over 2 million records), I've realized I actually need to add the concatenated field to the beginning of each record first (that will become a primary key for a MySQL table), then sort it by that new primary key field at the beginning of each record, THEN a subsort by init-date, update-date, update time, and write it all back to the file.

This file will then be used to load a MySQL table and the program that handles the load needs the records in that sorted order so that it deletes all but the most recent record for each key.

The date fields are in MM/DD/YYYY format so I guess I could use a date sort, but the last key needs to be based on the update-date/update-time combination.


FishMonger
Veteran / Moderator

Mar 22, 2016, 1:24 PM

Post #7 of 53 (40201 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Before you get too far in the process, you should take a look at the DBD::CSV module.
http://search.cpan.org/~hmbrand/DBD-CSV-0.48/lib/DBD/CSV.pm


FishMonger
Veteran / Moderator

Mar 22, 2016, 1:36 PM

Post #8 of 53 (40194 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post


Quote
I've realized I actually need to add the concatenated field to the beginning of each record first (that will become a primary key for a MySQL table)

That sounds like an odd design and it sounds like the DB isn't normalized.


Quote
The date fields are in MM/DD/YYYY format so I guess I could use a date sort, but the last key needs to be based on the update-date/update-time combination.

That's not the format used for a mysql date field. You'll want to reformat that before inserting into the DB.

There's no need to presort the data before inserting into the DB. The sorting should be done when you do your select statements.


(This post was edited by FishMonger on Mar 22, 2016, 1:44 PM)


BigRedEO
Novice

Mar 23, 2016, 6:05 AM

Post #9 of 53 (40179 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

A script that someone else has written will be used to load the table - it has a built-in "update" function that overwrites a record if the key already exists, hence it's need to be "pre-sorted." Only the most recent record should be loaded into the MySQL table.

I went to your above link, but most of the language there is a bit above me.


FishMonger
Veteran / Moderator

Mar 23, 2016, 6:42 AM

Post #10 of 53 (40173 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

The DBD::CSV module allows you to access the csv file via sql statements as if it was an actual database. Using an sql statement should allow you to do the 3 field concat and multi field sorting in one step.


BigRedEO
Novice

Mar 23, 2016, 1:24 PM

Post #11 of 53 (40168 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Trying to just get a simple start and test whether I can open/read a file. I put just the first four lines of the file into a test.csv, then wanted to make sure I could get Perl to open and read the file. I was expecting it to show me just the first field of each record. Instead, it's showing me the entire first line/record and that's it. What am I doing wrong here?


Code
#!/usr/bin/perl/ 

use strict;
use warnings;

my $filename = '/swpkg/shared/batch_processing/mistints/test.csv';
open(FILE, $filename) or die "Could not read from $filename, program halting.";

# Read the header line.
my $line = <FILE>;

# Display the header, just to check things are working.
print($line);

# Read the lines one by one.
while($line = <FILE>)

#split the fields, concatenate the first three fields, add to the beginning of each line in the file
{
chomp($line);

my @values = split(',', $line);

print ($values[0]);

}

close (FILE);
exit;



FishMonger
Veteran / Moderator

Mar 23, 2016, 2:11 PM

Post #12 of 53 (40164 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

That should work, but could stand some improvements.

Can you post (as an attachment) your test script so I can run a test?


Laurent_R
Veteran / Moderator

Mar 24, 2016, 3:14 AM

Post #13 of 53 (40149 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post


In Reply To
I was expecting it to show me just the first field of each record. Instead, it's showing me the entire first line/record and that's it. What am I doing wrong here?


In theory, your code should work and do what you want.

When you say "the entire first line/record", do you mean the header line which you are processing separately, or the next line?

Since the code seems correct to me (there may be some improvements, but it should work), please show your input file, so that we can try to figure out why: 1. it reads apparently only the first record; 2. it apparently does not split the record into fields.


BigRedEO
Novice

Mar 24, 2016, 6:36 AM

Post #14 of 53 (40136 views)
Re: [Laurent_R] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

All of a sudden this morning, it's working. Don't ask me why. My guess is the VI editor I'm having to use may be leaving in characters I'm not seeing. I usually try to edit the script in a simple text file, then copy it into the VI editor for the server I'm working on and I re-copied it this morning after deleting the script first, so my guess is there was an hidden character I wasn't seeing.

The one thing I wasn't expecting was how it displayed the fields from each record. I expected to see:

1334
2525
1350

But I get

133425251350

What do I need to add to show each record on it's own line? I am attaching my test CSV file.

This is at least helping me to get a better idea of how the data is manipulated and I'm hoping will get me closer to how to add the new field to the front of each record.

Thank you!
Attachments: test.csv (1.40 KB)


FishMonger
Veteran / Moderator

Mar 24, 2016, 7:01 AM

Post #15 of 53 (40131 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

The output was put all on 1 line because your print statement is missing the \n new line character.

Here's a slightly adjusted/improved version.

Code
#!/usr/bin/perl/  

use strict;
use warnings;

my $filename = '/swpkg/shared/batch_processing/mistints/test.csv';
$filename = 'test.csv';

open my $FH, $filename
or die "Could not read from $filename <$!>, program halting.";

# Read the header line.
my $line = <$FH>;

# Display the header, just to check things are working.
print $line, $/;

# Read the lines one by one.
while($line = <$FH>) {

# split the fields, concatenate the first three fields,
# and add it to the beginning of each line in the file
chomp($line);
my @fields = split(/,/, $line);
unshift @fields, join '_', @fields[0..2];
print $fields[0], $/;
}

close $FH;
exit;


Which outputs:

Quote
c:\test>BigRedEO.pl
STORE_NBR,CONTROL_NBR,LINE_NBR,SALES_NBR,QTY_MISTINT,REASON_CODE,MISTINT_COMM,SZ_CDE,TINTER_MODEL,TINTER_SERL_NBR,SPECTRO_MODEL,SPECTRO_SERL_NBR,EMP_NBR,TRAN_DATE,TRAN_TIME,CDS_ADL_FLD,PROD_NBR,PALETTE,COLOR_ID,INIT_TRAN_DATE,GALLONS_MISTINTED,UPDATE_EMP_NBR,UPDATE_TRAN_DATE,GALLONS,FORM_SOURCE,UPDATE_TRAN_TIME,SOURCE_IND,CANCEL_DATE,COLOR_TYPE,CANCEL_EMP_NBR,NEED_EXTRACTED,MISTINT_MQ_XTR,DATA_SOURCE,GUID,QUEUE_NAME,BROKER_NAME,MESSAGE_ID,PUT_TIME,CREATED_TS

1334_53927_1
2525_67087_1
1350_163689_1



(This post was edited by FishMonger on Mar 24, 2016, 7:05 AM)


BigRedEO
Novice

Mar 24, 2016, 7:13 AM

Post #16 of 53 (40126 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Thank you!

If I want it without the underscore, would it just be

Code
unshift @fields, join @fields[0..2];

And I will look up unshift on my own!


FishMonger
Veteran / Moderator

Mar 24, 2016, 7:32 AM

Post #17 of 53 (40122 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

The join function requires 2 arguments, so if you don't want the underscore use an empty string.

Code
unshift @fields, join '', @fields[0..2];



(This post was edited by FishMonger on Mar 24, 2016, 7:32 AM)


BigRedEO
Novice

Mar 24, 2016, 8:16 AM

Post #18 of 53 (40115 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

The boss has decided he wants the underscore, so I've put it back in and added these two lines -


Code
    $" = ","; 
print "@fields\n" ;


so I can see the completed records and with the comma back in as a separator.

Now I just have to figure out the sort on the "init-tran-date", "update-tran-date" and "update-tran-time" - and how to write it TO a file and I should be able to start loading my MySQL table!

Thank you again. (always fun to be thrown back into a language you haven't touched for 8 years and asked to try and have it done before you leave for a vacation)


FishMonger
Veteran / Moderator

Mar 24, 2016, 8:21 AM

Post #19 of 53 (40112 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

You'll want to use the Schwartzian transform sorting idiom.
https://en.wikipedia.org/wiki/Schwartzian_transform
https://www.google.com/search?q=perl+schwartzian+transform&spell=1&sa=X&ved=0ahUKEwi12_fGz9nLAhUH52MKHdcXDIIQvwUIGigA&biw=1920&bih=977


BigRedEO
Novice

Mar 24, 2016, 10:20 AM

Post #20 of 53 (40107 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

This sorting has my head spinning so much.

Much of the sorting language is over my head - I understand the cmp commands in the sorts, but the mapping has me lost. I need to sort on init-tran-date, update-tran-date and update-tran-time because the latter two fields could be empty (if the initial record never had an update to it). And the .csv file needs to be sorted because it will also be used with a sqlBatch60 Java program (over which I have no control - the data just needs to already be sorted to work with that already written sqlBatch60 program).

I think I'd rather load a MySQL table with what I have now, SELECT * with a sort on those three fields, and output that back into a new .csv file.


FishMonger
Veteran / Moderator

Mar 24, 2016, 1:31 PM

Post #21 of 53 (40098 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

See if this sorting test does what you want.

Code
#!/usr/bin/perl/  

use strict;
use warnings;
use Data::Dumper;

my $filename = '/swpkg/shared/batch_processing/mistints/test.csv';
$filename = 'test.csv';

open my $FH, $filename
or die "Could not read from $filename <$!>, program halting.";

# Read the header line.
chomp(my $line = <$FH>);
my @fields = split(/,/, $line);
print "Field Names:\n", Dumper(@fields), $/;

my @data;
# Read the lines one by one.
while($line = <$FH>) {

# split the fields, concatenate the first three fields,
# and add it to the beginning of each line in the file
chomp($line);
my @fields = split(/,/, $line);
unshift @fields, join '_', @fields[0..2];
push @data, \@fields;
}
close $FH;
print "Unsorted:\n", Dumper(@data), $/;

@data = sort {
$a->[0] cmp $b->[0] ||
$a->[20] cmp $b->[20] ||
$a->[23] cmp $b->[23] ||
$a->[26] cmp $b-> [26]
} @data;

print "Sorted:\n", Dumper(@data);

exit;



(This post was edited by FishMonger on Mar 24, 2016, 1:41 PM)


FishMonger
Veteran / Moderator

Mar 24, 2016, 2:48 PM

Post #22 of 53 (40092 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Here's an example using sql statement to read, parse and sort the csv file. I didn't include the concatenating of the 3 fields, but that can be added.

Code
#!/usr/bin/perl 

use strict;
use warnings FATAL => 'all';
use DBI;
use DBD::CSV;
use Text::CSV_XS;
use Data::Dumper;

my $dbh = DBI->connect("dbi:CSV:", undef, undef, {
RaiseError => 1,
f_dir => 'c:/test',
csv_eol => "\n",
csv_class => 'Text::CSV_XS',
csv_tables => {
test => { f_file => 'test.csv'}
},
});

my $sth = $dbh->prepare("SELECT *
FROM test
ORDER by INIT_TRAN_DATE,
UPDATE_TRAN_DATE,
UPDATE_TRAN_TIME");
$sth->execute;

while (my $row = $sth->fetch) {
print Dumper $row;
#print join(',', @$row), $/;
}
$dbh->disconnect;


Output:
c:\test>BigRedEO_v2.pl

Code
$VAR1 = [ 
'2525',
'67087',
'1',
'650462328',
'1',
'4',
'Tinted Wrong Product',
'14',
'IFC 8012NP',
'Standalone-5',
'',
'',
'11',
'10/23/2015',
'104314',
'',
'A91W00353',
'',
'',
'10/20/2015',
'0.25',
'0',
'',
'0.25',
'',
'',
'COMP',
'',
'CUSTOM MATCH',
'0',
'TRUE',
'TRUE',
'O',
'1AC5D8742D47435EA05343D57372AD32',
'POS.MISTINT.V0000.UP.Q',
'PROD_SMISC_BK',
'414D512050524F445F504F533235323531C2295605350020',
'10/23/2015 10:46',
'10/23/2015 10:47'
];
$VAR1 = [
'1334',
'53927',
'1',
'100551589',
'1',
'6',
'Bad Shercolor Match',
'16',
'IFC 8112NP',
'01DX8005513',
'',
'',
'77',
'10/23/2015',
'95816',
'',
'OV0020001',
'',
'MANUAL',
'10/21/2015',
'1',
'0',
'',
'1',
'MAN',
'',
'CUST',
'',
'CUSTOM MATCH',
'0',
'TRUE',
'TRUE',
'O',
'5394A0E67FFF4D01A0D9AD16FA29ABB1',
'POS.MISTINT.V0000.UP.Q',
'PROD_SMISC_BK',
'414D512050524F445F504F533133333464EB2956052C0020',
'10/23/2015 10:45',
'10/23/2015 10:45'
];
$VAR1 = [
'1350',
'163689',
'1',
'650462302',
'1',
'3',
'Tinted Wrong Color',
'14',
'IFC 8012NP',
'06DX8006805',
'',
'',
'1',
'10/23/2015',
'104907',
'',
'A91W00351',
'COLOR',
'6233',
'10/23/2015',
'0.25',
'0',
'',
'0.5',
'ENG',
'',
'SW',
'',
'PALETTE',
'0',
'TRUE',
'TRUE',
'O',
'F1A072BCC548412FA22052698B5B0C28',
'POS.MISTINT.V0000.UP.Q',
'PROD_SMISC_BK',
'414D512050524F445F504F53313335307BC12956053C0020',
'10/23/2015 10:52',
'10/23/2015 10:52'
];



BigRedEO
Novice

Mar 28, 2016, 5:32 AM

Post #23 of 53 (40028 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

I tested this Perl script (which is great, thank you), but ran into one weird little hiccup with the last variable - here was the output (just the first two results) -

Code
Field Names: 
$VAR1 = 'STORE_NBR';
$VAR2 = 'CONTROL_NBR';
$VAR3 = 'LINE_NBR';
$VAR4 = 'SALES_NBR';
$VAR5 = 'QTY_MISTINT';
$VAR6 = 'REASON_CODE';
$VAR7 = 'MISTINT_COMM';
$VAR8 = 'SZ_CDE';
$VAR9 = 'TINTER_MODEL';
$VAR10 = 'TINTER_SERL_NBR';
$VAR11 = 'SPECTRO_MODEL';
$VAR12 = 'SPECTRO_SERL_NBR';
$VAR13 = 'EMP_NBR';
$VAR14 = 'TRAN_DATE';
$VAR15 = 'TRAN_TIME';
$VAR16 = 'CDS_ADL_FLD';
$VAR17 = 'PROD_NBR';
$VAR18 = 'PALETTE';
$VAR19 = 'COLOR_ID';
$VAR20 = 'INIT_TRAN_DATE';
$VAR21 = 'GALLONS_MISTINTED';
$VAR22 = 'UPDATE_EMP_NBR';
$VAR23 = 'UPDATE_TRAN_DATE';
$VAR24 = 'GALLONS';
$VAR25 = 'FORM_SOURCE';
$VAR26 = 'UPDATE_TRAN_TIME';
$VAR27 = 'SOURCE_IND';
$VAR28 = 'CANCEL_DATE';
$VAR29 = 'COLOR_TYPE';
$VAR30 = 'CANCEL_EMP_NBR';
$VAR31 = 'NEED_EXTRACTED';
$VAR32 = 'MISTINT_MQ_XTR';
$VAR33 = 'DATA_SOURCE';
$VAR34 = 'GUID';
$VAR35 = 'QUEUE_NAME';
$VAR36 = 'BROKER_NAME';
$VAR37 = 'MESSAGE_ID';
$VAR38 = 'PUT_TIME';
';AR39 = 'CREATED_TS

Unsorted:
$VAR1 = [
'1334_53927_1',
'1334',
'53927',
'1',
'100551589',
'1',
'6',
'Bad Shercolor Match',
'16',
'IFC 8112NP',
'01DX8005513',
'',
'',
'77',
'10/23/2015',
'95816',
'',
'OV0020001',
'',
'MANUAL',
'10/21/2015',
'1',
'0',
'',
'1',
'MAN',
'',
'CUST',
'',
'CUSTOM MATCH',
'0',
'TRUE',
'TRUE',
'O',
'5394A0E67FFF4D01A0D9AD16FA29ABB1',
'POS.MISTINT.V0000.UP.Q',
'PROD_SMISC_BK',
'414D512050524F445F504F533133333464EB2956052C0020',
'10/23/2015 10:45',
' '10/23/2015 10:45
];



BigRedEO
Novice

Mar 28, 2016, 8:10 AM

Post #24 of 53 (40024 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

And just to make sure something didn't copy strangely with the VI editor, here is the code I used -

Code
#!/usr/bin/perl/ 

use strict;
use warnings;
use Data::Dumper;

my $filename = '/swpkg/shared/batch_processing/mistints/test.csv';
$filename = 'test.csv';

open my $FH, $filename
or die "Could not read from $filename <$!>, program halting.";

# Read the header line.
chomp(my $line = <$FH>);
my @fields = split(/,/, $line);
print "Field Names:\n", Dumper(@fields), $/;

my @data;
# Read the lines one by one.
while($line = <$FH>) {

# split the fields, concatenate the first three fields,
# and add it to the beginning of each line in the file
chomp($line);
my @fields = split(/,/, $line);
unshift @fields, join '_', @fields[0..2];
push @data, \@fields;
}
close $FH;
print "Unsorted:\n", Dumper(@data), $/;

@data = sort {
$a->[0] cmp $b->[0] ||
$a->[20] cmp $b->[20] ||
$a->[23] cmp $b->[23] ||
$a->[26] cmp $b-> [26]
} @data;

print "Sorted:\n", Dumper(@data);

exit;



FishMonger
Veteran / Moderator

Mar 28, 2016, 8:40 AM

Post #25 of 53 (40019 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

I'm unable to duplicate that problem using the sample data you provided. Maybe the test.csv file you tested is different from what you posted.

Have you tried the other version using DBI and DBD::CSV?

Here's the adjustment it needs to add the concatenated fields.

Code
my $sth = $dbh->prepare("SELECT * 
FROM test
ORDER by STORE_NBR,
CONTROL_NBR,
LINE_NBR,
INIT_TRAN_DATE,
UPDATE_TRAN_DATE,
UPDATE_TRAN_TIME");
$sth->execute;

while (my $row = $sth->fetch) {
my @row = @$row;
unshift @row, join '_', @row[0..2];
print Dumper \@row;
#print join(',', @row), $/;
}



(This post was edited by FishMonger on Mar 28, 2016, 8:48 AM)


BigRedEO
Novice

Mar 28, 2016, 8:42 AM

Post #26 of 53 (23352 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

I haven't tried DBI yet because I haven't created the MySQL table yet.


(This post was edited by BigRedEO on Mar 28, 2016, 8:53 AM)


BigRedEO
Novice

Mar 28, 2016, 8:46 AM

Post #27 of 53 (23348 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

I just downloaded the file I attached in this thread and FTPed it to my server and tried to run it - STILL got that weird hiccup on the last variable.


BigRedEO
Novice

Mar 28, 2016, 8:52 AM

Post #28 of 53 (23342 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

I just did a cat -vet on my Perl script and have this (I'm trying to add writing to an Outfile) -

Code
#!/usr/bin/perl/  $ 
$
use strict; $
use warnings; $
use Data::Dumper; $
$
my $filename = '/swpkg/shared/batch_processing/mistints/test.csv'; $
$filename = 'test.csv'; $
$
open my $FH, $filename $
or die "Could not read from $filename <$!>, program halting."; $
$
# Read the header line. $
chomp(my $line = <$FH>); $
my @fields = split(/,/, $line); $
#print "Field Names:\n", Dumper(@fields), $/; $
print Dumper(@fields), $/;$
$
my @data; $
# Read the lines one by one. $
while($line = <$FH>) { $
$
# split the fields, concatenate the first three fields, $
# and add it to the beginning of each line in the file $
chomp($line); $
my @fields = split(/,/, $line); $
unshift @fields, join '_', @fields[0..2]; $
push @data, \@fields; $
} $
close $FH; $
print "Unsorted:\n", Dumper(@data); #, $/; $
$
@data = sort { $
$a->[0] cmp $b->[0] || $
$a->[20] cmp $b->[20] || $
$a->[23] cmp $b->[23] || $
$a->[26] cmp $b-> [26] $
} @data; $
$
$
open my $OFH, '>', '/swpkg/shared/batch_processing/mistints/parsedTest.csv';$
print $OFH Dumper(@data);$
close $OFH;$
$
$
#print "Sorted:\n", Dumper(@data); $
#print "Sorted:", Dumper(@data);$
$
exit;$
$


Is there something in there I'm not seeing that is causing the weird hiccup with that last variable?

Also, how to I write it to the outfile as a single record, minus the single quotes around each field, rather than the entire record as on Variable each?


FishMonger
Veteran / Moderator

Mar 28, 2016, 8:56 AM

Post #29 of 53 (23337 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Run this command on the server and attach that file so I can review it.


Code
od -c test.csv > test.csv_od_dump



FishMonger
Veteran / Moderator

Mar 28, 2016, 9:00 AM

Post #30 of 53 (23334 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post


In Reply To
I haven't tried DBI yet because I haven't created the MySQL table yet.


You don't need mysql to run the test. The script uses the csv file, not the database.


BigRedEO
Novice

Mar 28, 2016, 9:03 AM

Post #31 of 53 (23328 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Attached is the od_dump file
Attachments: test.csv_od_dump (6.32 KB)


FishMonger
Veteran / Moderator

Mar 28, 2016, 9:05 AM

Post #32 of 53 (23324 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

That file is corrupt.


BigRedEO
Novice

Mar 28, 2016, 9:05 AM

Post #33 of 53 (23323 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Aha! I'll learn new things yet - never would have guess that's how it works.


BigRedEO
Novice

Mar 28, 2016, 9:07 AM

Post #34 of 53 (23319 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

I will recreate the test file - all I did was

Code
head -4 bigfile.csv . test.csv

So I'll try that again.


BigRedEO
Novice

Mar 28, 2016, 9:10 AM

Post #35 of 53 (23316 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Nope - STILL the same error. I hope that doesn't mean my main file is corrupted?


BigRedEO
Novice

Mar 28, 2016, 9:19 AM

Post #36 of 53 (23311 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Just attempted the DBI version and got this when I tried to run it -


Code
Can't locate DBI.pm in @INC (@INC contains: /usr/perl5/site_perl/5.12/sun4-solaris-64int /usr/perl5/site_perl/5.12 /usr/perl5/vendor_perl/5.12/sun4-solaris-64int /usr/perl5/vendor_perl/5.12 /usr/perl5/5.12/lib/sun4-solaris-64int /usr/perl5/5.12/lib .) at AlterDataNewDBI.pl line 5. 
BEGIN failed--compilation aborted at AlterDataNewDBI.pl line 5.



FishMonger
Veteran / Moderator

Mar 28, 2016, 9:24 AM

Post #37 of 53 (23307 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

The od dump file should look like this:

c:\test>od -c test.csv

Code
0000000   S   T   O   R   E   _   N   B   R   ,   C   O   N   T   R   O 
0000020 L _ N B R , L I N E _ N B R , S
0000040 A L E S _ N B R , Q T Y _ M I S
0000060 T I N T , R E A S O N _ C O D E
0000100 , M I S T I N T _ C O M M , S Z
0000120 _ C D E , T I N T E R _ M O D E
0000140 L , T I N T E R _ S E R L _ N B
0000160 R , S P E C T R O _ M O D E L ,
0000200 S P E C T R O _ S E R L _ N B R
0000220 , E M P _ N B R , T R A N _ D A
0000240 T E , T R A N _ T I M E , C D S
0000260 _ A D L _ F L D , P R O D _ N B
0000300 R , P A L E T T E , C O L O R _
0000320 I D , I N I T _ T R A N _ D A T
0000340 E , G A L L O N S _ M I S T I N
0000360 T E D , U P D A T E _ E M P _ N


Lets pull directly from the big csv file.


Code
head -4 bigfile.csv | od -c > od_dump

Attach that dump file.


FishMonger
Veteran / Moderator

Mar 28, 2016, 9:26 AM

Post #38 of 53 (23305 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

You need to install the DBI and DBD::CSV modules as well as Text::CSV_XS module.


BigRedEO
Novice

Mar 28, 2016, 9:28 AM

Post #39 of 53 (23303 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Attached is the od_dump from the big file
Attachments: od_dump (6.32 KB)


BigRedEO
Novice

Mar 28, 2016, 9:53 AM

Post #40 of 53 (23300 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

I tried this command for installing DBI - failed with "fatal error."

Code
perl -MCPAN -e 'install Bundle::DBI'



BigRedEO
Novice

Mar 28, 2016, 10:38 AM

Post #41 of 53 (23296 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

PROBLEM CORRECTED ON THE TEST FILE -

I used dos2unix on the file and it now reads correctly. Most likely a Ctrl character unseen in there somewhere.

Now - how to format it to write back to a .csv file the same format as the original?


(This post was edited by BigRedEO on Mar 28, 2016, 10:38 AM)


FishMonger
Veteran / Moderator

Mar 28, 2016, 11:33 AM

Post #42 of 53 (23285 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post


In Reply To
PROBLEM CORRECTED ON THE TEST FILE -

I used dos2unix on the file and it now reads correctly. Most likely a Ctrl character unseen in there somewhere.

Now - how to format it to write back to a .csv file the same format as the original?



Code
print join(',', @$_), $/ for @data;


Or use the Text::CSV module.


BigRedEO
Novice

Mar 28, 2016, 12:41 PM

Post #43 of 53 (23281 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

That has done it! I was able to sort the actual, very large file and all data appears to be correct.

Thank you. This gets me further along to where I need to be!


BigRedEO
Novice

Apr 12, 2016, 7:55 AM

Post #44 of 53 (23223 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Running into a new problem - I have to format the Date and DateTime fields in this .csv file from MM/DD/YYYY to YYYY-MM-DD. I added these two lines -

Code
my $in_date = $fields[14]; 
my $db_date = join '-', reverse split /\D/, $in_date;

right after the push @data, \@fields; line and the script ran without any errors, but it also didn't reformat the date field. And I actually need to do this on three fields - [14], [20], [23]. And I also will need to reformat the dates in the last two fields, both of which are DATETIME - how do I format just the Date in the DATETIME fields?


FishMonger
Veteran / Moderator

Apr 12, 2016, 8:50 AM

Post #45 of 53 (23214 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post


Code
my $db_date = join '-', (split m{/}, $in_date)[2,0,1];



BigRedEO
Novice

Apr 12, 2016, 8:57 AM

Post #46 of 53 (23212 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Thank you - but I guess I'm missing a line to put it back into the file.


Code
#!/usr/bin/perl/ 

use strict;
use warnings;
use Data::Dumper;

my $filename = 'tested.csv';

open my $FH, $filename
or die "Could not read from $filename <$!>, program halting.";

# Read the header line.
chomp(my $line = <$FH>);
my @fields = split(/,/, $line);
print Dumper(@fields), $/;

my @data;
# Read the lines one by one.
while($line = <$FH>) {

# split the fields, concatenate the first three fields,
# and add it to the beginning of each line in the file
chomp($line);
my @fields = split(/,/, $line);
unshift @fields, join '_', @fields[0..2];
push @data, \@fields;
my $in_date = $fields[14];
my $db_date = join '-', (split m{/}, $in_date)[2,0,1];
}
close $FH;
print "Unsorted:\n", Dumper(@data); #, $/;

@data = sort {
$a->[0] cmp $b->[0] ||
$a->[20] cmp $b->[20] ||
$a->[23] cmp $b->[23] ||
$a->[26] cmp $b-> [26]
} @data;


open my $OFH, '>', '/swpkg/shared/batch_processing/mistints/parsedMistints.csv';
print $OFH join(',', @$_), $/ for @data;
close $OFH;

exit;


What am I missing to put it back in with the rest of the data?


FishMonger
Veteran / Moderator

Apr 12, 2016, 9:11 AM

Post #47 of 53 (23207 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Fix the date formatting before the push statement and instead of copying those fields into new vars, just work with the @fields array directly.

Code
    my @fields = split(/,/, $line); 
unshift @fields, join '_', @fields[0..2];
$fields[14] = join '-', (split m{/}, $fields[14])[2,0,1];
# fix the other date fields in the same way then do the push
push @data, \@fields;



BigRedEO
Novice

Apr 12, 2016, 9:34 AM

Post #48 of 53 (23203 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

I've run into some errors here in trying that same line for each of the three date fields -

Useless use of a constant (2) in void context at AlterDataNew.pl line 28.
Useless use of a constant (2) in void context at AlterDataNew.pl line 29.
Useless use of a constant (2) in void context at AlterDataNew.pl line 30.
Can't use string ("10/23/2015") as an ARRAY ref while "strict refs" in use at AlterDataNew.pl line 28, <$FH> line 2.


FishMonger
Veteran / Moderator

Apr 12, 2016, 9:43 AM

Post #49 of 53 (23200 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

Post your updated script.

The change I suggested would not have given you those warnings. You must have made some additional changes.


FishMonger
Veteran / Moderator

Apr 12, 2016, 9:50 AM

Post #50 of 53 (23197 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

You should also post (as an attachment) a short 10 line sample input data file so I can try to duplicate your problem.


BigRedEO
Novice

Apr 12, 2016, 9:51 AM

Post #51 of 53 (7452 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

I just found a more concise way to get that work for those three date fields. Newest version below worked just fine. Now to figure out how to do that with the DATETIME fields -


Code
#!/usr/bin/perl/ 

use strict;
use warnings;
use Data::Dumper;

my $filename = 'tested.csv';

open my $FH, $filename
or die "Could not read from $filename <$!>, program halting.";

# Read the header line.
chomp(my $line = <$FH>);
my @fields = split(/,/, $line);
print Dumper(@fields), $/;

my @data;
# Read the lines one by one.
while($line = <$FH>) {

# split the fields, concatenate the first three fields,
# and add it to the beginning of each line in the file
chomp($line);
my @fields = split(/,/, $line);
unshift @fields, join '_', @fields[0..2];
$_ = join '-', (split /\//)[2,0,1] for $fields[14];
$_ = join '-', (split /\//)[2,0,1] for $fields[20];
$_ = join '-', (split /\//)[2,0,1] for $fields[23];
push @data, \@fields;
}
close $FH;

@data = sort {
$a->[0] cmp $b->[0] ||
$a->[20] cmp $b->[20] ||
$a->[23] cmp $b->[23] ||
$a->[26] cmp $b-> [26]
} @data;

open my $OFH, '>', '/swpkg/shared/batch_processing/mistints/parsedMistints.csv';
print $OFH join(',', @$_), $/ for @data;
close $OFH;

exit;


FishMonger
Veteran / Moderator

Apr 12, 2016, 9:59 AM

Post #52 of 53 (7449 views)
Re: [BigRedEO] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post


Quote
Now to figure out how to do that with the DATETIME fields


Write a subroutine which uses the Time::Piece module as Borodin showed you on your SO crosspost. That single sub could handle both of your date formats.

If you didn't want to go that route, then you could use a slightly modified version of this split/join approach.


(This post was edited by FishMonger on Apr 12, 2016, 10:02 AM)


BigRedEO
Novice

Apr 12, 2016, 10:00 AM

Post #53 of 53 (7447 views)
Re: [FishMonger] Parsing CSV file - Add Concatenated field/sort [In reply to] Can't Post

That's just what I'm attempting right now (and sorry for the cross posts - I posted both at the same time to see who might get back to me and you both replied around the same time).

 
 


Search for (options) Powered by Gossamer Forum v.1.2.0

Web Applications & Managed Hosting Powered by Gossamer Threads
Visit our Mailing List Archives