CGI/Perl Guide | Learning Center | Forums | Advertise | Login
Site Search: in

  Main Index MAIN
INDEX
Search Posts SEARCH
POSTS
Who's Online WHO'S
ONLINE
Log in LOG
IN

Home: Perl Programming Help: Beginner:
Parsing postcode/location CSV

 



Zhris
User

Jul 14, 2010, 3:17 PM

Post #1 of 7 (1139 views)
Parsing postcode/location CSV Can't Post

Hey,

>>>>> Background:
I am working on a GPS related project, which involves real time tracking of my clients boat as he travels across the UK. In order to do this I am using Insta Mapper (http://www.instamapper.com). However, my client requires additional features, one of which is to provide the nearest outcode (first half of a UK postcode) to the boats current position. To do this I have downloaded a CSV containing over 2000 UK outcodes and their latitude/longitude. However I have a couple of minor issues which are bothering me while parsing the CSV. As the CSV file is large I will only provide a link to it: http://test.massweb.co.uk/PostcodesAndPositions.csv (my copy) / http://www.freemaptools.com/download/postcodes/postcodes.csv (where I downloaded from).

>>>>> Example data (first 10 lines):

Code
"id","outcode","lat","lng" 
"1","AB10","57.131086","-2.122482"
"2","AB11","57.13121","-2.082261"
"3","AB12","57.098381","-2.172400"
"4","AB13","57.108","-2.237"
"5","AB14","57.101","-2.27"
"6","AB15","57.143396","-2.199525"
"7","AB16","57.161","-2.156"
"8","AB21","57.245343","-2.225550"
"9","AB22","57.187186","-2.131376"


>>>>> Isolated section of my script:

Code
open (INPUT, "<$postcodesandpositionspath") || die "Error while opening INPUT"; 
while (<INPUT>) {
next if ($. == 1);
my ($id, $outcode, $latitude, $longitude) = $_ =~ m/^\"([^\"]+)\"\,\"([^\"]+)\"\,\"([^\"]+)\"\,\"([^\"]+)\"/;
my $data = {
'ID' => $id,
'Outcode' => $outcode,
'Latitude' => $latitude,
'Longitude' => $longitude
};
$object->{'PostcodesAndPositions'}->{'ByOutcode'}->{$outcode} = $data;
$object->{'PostcodesAndPositions'}->{'ByPosition'}->{$latitude . $longitude} = $data;
}
close (INPUT) || die "Error while closing INPUT";
#
print Dumper ($object);


>>>>> Issues:
1) I am pulling out the id, outcode, latitude and longitude using the regular expression m/^\"([^\"]+)\"\,\"([^\"]+)\"\,\"([^\"]+)\"\,\"([^\"]+)\"/, as I couldn't think of a better way to do it. I tried using the split function but it is beyond me. Is there a better way I could do this (preferably in 1 line)?
2) Whilst building the hash ($object) I always get 1 group of undef data, i.e.:

Code
'ByPosition' => { 
'' => {
'Longitude' => undef,
'Outcode' => undef,
'ID' => undef,
'Latitude' => undef
},

This creates problems in later parts of my script, which I have temporarily solved in a very dirty way. I can't figure out why this is occurring. I have carefully checked over my code and the CSV but i'm unable to work it out. Maybe something to do with new line characters?

Thank you in advance for your help.

Chris


FishMonger
Veteran / Moderator

Jul 14, 2010, 5:47 PM

Post #2 of 7 (1131 views)
Re: [Zhris] Parsing postcode/location CSV [In reply to] Can't Post

It would be better to use either:

Text::CSV http://search.cpan.org/~makamaka/Text-CSV-1.18/lib/Text/CSV.pm
or
Text::CSV::Slurp http://search.cpan.org/~robbiebow/Text-CSV-Slurp-0.8/lib/Text/CSV/Slurp.pm

I also think you should refractor your hash to get rid of the duplication.

Here's an example of each.


Code
#!/usr/local/bin/perl -- 

use strict;
use warnings;
use Text::CSV;
use Text::CSV::Slurp;
use Data::Dumper;

my $csv = Text::CSV->new or die "Cannot use CSV: ".Text::CSV->error_diag();
my $csvfile = 'PostcodesAndPostions.csv';

# Process the csv file line-by-line
open my $csv_fh, '<', $csvfile or die "Can't open $csvfile $!";
<$csv_fh>; # skip over the header line

while ( my $row = $csv->getline( $csv_fh ) ) {
print Dumper $row; # $row is an array reference
}


# slup the csv file into a ref of array of hashes
my $data = Text::CSV::Slurp->load(file => $csvfile);
print Dumper $data;


Output using your sample data.

Code
D:\perl>perl-1.pl 
$VAR1 = [
'1',
'AB10',
'57.131086',
'-2.122482'
];
$VAR1 = [
'2',
'AB11',
'57.13121',
'-2.082261'
];
$VAR1 = [
'3',
'AB12',
'57.098381',
'-2.172400'
];
$VAR1 = [
'4',
'AB13',
'57.108',
'-2.237'
];
$VAR1 = [
'5',
'AB14',
'57.101',
'-2.27'
];
$VAR1 = [
'6',
'AB15',
'57.143396',
'-2.199525'
];
$VAR1 = [
'7',
'AB16',
'57.161',
'-2.156'
];
$VAR1 = [
'8',
'AB21',
'57.245343',
'-2.225550'
];
$VAR1 = [
'9',
'AB22',
'57.187186',
'-2.131376'
];


$VAR1 = [
{
'outcode' => 'AB10',
'lat' => '57.131086',
'id' => '1',
'lng' => '-2.122482'
},
{
'outcode' => 'AB11',
'lat' => '57.13121',
'id' => '2',
'lng' => '-2.082261'
},
{
'outcode' => 'AB12',
'lat' => '57.098381',
'id' => '3',
'lng' => '-2.172400'
},
{
'outcode' => 'AB13',
'lat' => '57.108',
'id' => '4',
'lng' => '-2.237'
},
{
'outcode' => 'AB14',
'lat' => '57.101',
'id' => '5',
'lng' => '-2.27'
},
{
'outcode' => 'AB15',
'lat' => '57.143396',
'id' => '6',
'lng' => '-2.199525'
},
{
'outcode' => 'AB16',
'lat' => '57.161',
'id' => '7',
'lng' => '-2.156'
},
{
'outcode' => 'AB21',
'lat' => '57.245343',
'id' => '8',
'lng' => '-2.225550'
},
{
'outcode' => 'AB22',
'lat' => '57.187186',
'id' => '9',
'lng' => '-2.131376'
}
];



Zhris
User

Jul 15, 2010, 9:50 AM

Post #3 of 7 (1123 views)
Re: [FishMonger] Parsing postcode/location CSV [In reply to] Can't Post

Hey,

Thank you for replying,

And, thank you for suggesting Text::CSV, a module I have seen before, but never used, and didn't cross my mind when parsing the CSV.

I'm still trying to figure this one out, but i'm recieving an error *Can't locate object method "getline" via package "IO::Handle"*. Is there any obvious reason why this error is occurring?

You are correct by saying that I should get rid of the hash duplication, as it is pointless (it was just a way I could simply access the hash directly in 1 line whether I had an outcode or a position).

I see that you only tested the sample data, and i'm looking forward to getting this working and testing the whole data, and hopefully resolve the "undef" issues I had.

Thanks,

Chris


FishMonger
Veteran / Moderator

Jul 15, 2010, 10:33 AM

Post #4 of 7 (1121 views)
Re: [Zhris] Parsing postcode/location CSV [In reply to] Can't Post

The "undef" issue is caused by a blank line in the file. The module will handle that for you, or if you wanted to continue with your existing approach, you could simply skip over blank lines.

Code
next if /^\s*$/;



Zhris
User

Jul 15, 2010, 12:16 PM

Post #5 of 7 (1116 views)
Re: [FishMonger] Parsing postcode/location CSV [In reply to] Can't Post

Hey,

I noticed a blank line at the bottom of the CSV which I had removed, and had no luck, unless there is another blank line I am unaware of. However, thank you for suggesting "next if /^\s*$/", which I will try, although I am attempting to go the Text::CSV route.

I am currently stumped as to why I am recieving the Text::CSV error I described in my previous post. Any suggestions why this error could be occurring would be helpful (note that IO::Handle is installed properly)?

Thanks again,

Chris


FishMonger
Veteran / Moderator

Jul 15, 2010, 12:33 PM

Post #6 of 7 (1111 views)
Re: [Zhris] Parsing postcode/location CSV [In reply to] Can't Post

Sounds like one or more of your modules are out of date.

What version of perl and IO::Handle are you using?


Zhris
User

Jul 15, 2010, 1:04 PM

Post #7 of 7 (1108 views)
Re: [FishMonger] Parsing postcode/location CSV [In reply to] Can't Post

Perl version: 5.008008.

Not sure how to check the IO::Handle version. However http://links.1and1faqs.com/perldiver.cgi provides everything I need to know about perl and installed modules.

I am stuck using a shared host plan at 1and1, and I know that it limits what i'm able to do. I've been loosely programming in perl for about 6 years using 1and1 to test/run scripts, which hasn't helped in my "quest to become a perl expert" (i've only ever ran a script in the browser). Once I get a proper PC I plan to get strawberry perl, then to change my web host in the future.

Chris


(This post was edited by Zhris on Jul 15, 2010, 1:06 PM)

 
 


Search for (options) Powered by Gossamer Forum v.1.2.0

Web Applications & Managed Hosting Powered by Gossamer Threads
Visit our Mailing List Archives