In place decompression

 



kencl
User

Feb 22, 2002, 8:02 AM

Post #1 of 8 (778 views)
In place decompression

Hi Folks,

Interesting project here. I have to parse a huge (2 GB+) file and import it into a database. If it were just an ASCII text file, I know that I could read it in place with:

Code
open(FH, "<filename.txt") or die "Can't open filename.txt: $!";
while (defined(my $line = <FH>)) {
    # process 1 line at a time
}
close FH;

However, the file is a gzip archive, so my questions are: a) how do I decompress it, and b) is it possible to decompress it in place, one line at a time? With a file of this size I obviously need to conserve system resources, including hard drive space.

Any suggestions or strategies for dealing with this are greatly appreciated!

>> If you can't control it, improve it, correlate it or disseminate it with PERL, it doesn't exist!


Paul
Enthusiast

Feb 22, 2002, 8:48 AM

Post #2 of 8 (776 views)
Re: [kencl] In place decompression

I believe you can do something like...

[perl]
open FH, "/bin/gzip -c -d file.gz |" or die "Unable to open `/bin/gzip -c -d file.gz |': $!";
[/perl]
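
A minimal sketch of how the pipe approach above might be fleshed out into a line-by-line reader. It assumes a `gzip` binary on the PATH and Perl 5.8 or later (for the list form of a piped `open`); the sub name `count_gz_lines_via_pipe` is made up for illustration. With a piped `open`, a failure inside gzip itself (for example a corrupt archive) only shows up when the handle is closed, so the `close` is checked as well.

```perl
use strict;
use warnings;

# Hypothetical helper: stream a .gz file through an external gzip process
# and count its lines without ever writing the uncompressed data to disk.
sub count_gz_lines_via_pipe {
    my ($path) = @_;

    # List-form pipe open bypasses the shell, so unusual file names are safe.
    open(my $fh, "-|", "gzip", "-dc", $path)
        or die "Cannot run gzip on '$path': $!";

    my $count = 0;
    while (defined(my $line = <$fh>)) {
        # process 1 line at a time; here we only count them
        $count++;
    }

    # close() on a pipe returns false if gzip exited with a non-zero status.
    close($fh)
        or die($! ? "Error closing gzip pipe: $!"
                  : "gzip exited with status " . ($? >> 8));

    return $count;
}

# Usage: perl script.pl file.gz
if (@ARGV) {
    print count_gz_lines_via_pipe($ARGV[0]), "\n";
}
```

Only one line is held in memory at a time, which is the property that matters for a multi-gigabyte input.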


(This post was edited by RedRum on Feb 22, 2002, 8:49 AM)


Jasmine
Administrator

Feb 22, 2002, 9:08 AM

Post #3 of 8 (768 views)
Re: [kencl] In place decompression

Perhaps the [url=http://search.cpan.org/search?dist=PerlIO-gzip]PerlIO::gzip[/url] module may be helpful. Its description: "Perl extension to provide a PerlIO layer to gzip/gunzip"

Sample from the [url=http://search.cpan.org/doc/NWCLARK/PerlIO-gzip-0.11/gzip.pm]docs[/url]:


Code
use PerlIO::gzip;
open FOO, "<:gzip", "file.gz" or die $!;
print while <FOO>; # And it will be uncompressed...

binmode FOO, ":gzip(none)"; # Starts reading deflate stream from here on



mhx
Enthusiast / Moderator

Feb 22, 2002, 10:06 AM

Post #4 of 8 (761 views)
Re: [Jasmine] In place decompression

The module looked so interesting to me (it definitely is very interesting) that I downloaded it. After having a look at the [url=http://search.cpan.org/doc/NWCLARK/PerlIO-gzip-0.11/README]README[/url] file, I thought the following might be of interest:


Code
Alpha at the moment, does now (well, since 0.06, oops) gzip and gunzip. 
**DON'T** trust it with your data.

YOU NEED PERL later than 5.7.2
yes. this only works on UNSTABLE DEVELOPMENT PERL


Unless you have a bleeding-edge Perl on your machine and are really in the mood to play, you probably won't want to download the module.

-- mhx

Hmmm, I think I'm going to install it now and have a look at the source. I'm really interested in how it works...

At last with an effort he spoke, and wondered to hear his own words, as if some other will was using his small voice. "I will take the Ring," he said, "though I do not know the way."

-- Frodo



mhx
Enthusiast / Moderator

Feb 22, 2002, 10:13 AM

Post #5 of 8 (759 views)
Re: [kencl] In place decompression

There's a whole bunch of I/O filters, including bzip and gzip de/compression, available in the [url=http://search.cpan.org/search?dist=IO-Filter]IO::Filter[/url] distribution. Perhaps you want to have a look at that one. Unfortunately, at least in the current version, these filters call the external programs rather than the corresponding libraries.

-- mhx




rGeoffrey
User / Moderator

Feb 22, 2002, 10:18 AM

Post #6 of 8 (756 views)
Re: [RedRum] In place decompression

And as a variation on the theme you can also use:


Code
open (FH, "zcat $fname |") or die "can't open '$fname', $!";



kencl
User

Feb 22, 2002, 10:36 AM

Post #7 of 8 (752 views)
Re: [rGeoffrey] In place decompression

Actually, I've been looking at IO::Zlib, though after mhx pointed out the alpha status of PerlIO::gzip I remembered to check the CPAN test results for it, only to find that it hasn't been tested on FreeBSD, which unfortunately is my target OS.

Thanks for all the input, folks!


(This post was edited by kencl on Feb 22, 2002, 10:46 AM)
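
For reference, a minimal sketch of reading a gzip file one line at a time with IO::Zlib, the module mentioned above. It assumes IO::Zlib (and the Compress::Zlib it relies on) is installed; the sub name `each_gz_line` and the callback interface are made up for illustration and are not part of the module.

```perl
use strict;
use warnings;
use IO::Zlib;

# Hypothetical helper: call $callback once per line of a gzip file.
# Decompression happens in-process as the data is read.
sub each_gz_line {
    my ($path, $callback) = @_;

    # new() with a file name and mode opens the file; it returns undef on failure.
    my $fh = IO::Zlib->new($path, "rb")
        or die "Cannot open '$path' for reading";

    my $count = 0;
    while (defined(my $line = $fh->getline)) {
        $callback->($line);    # process 1 line at a time
        $count++;
    }
    $fh->close;

    return $count;             # number of lines handed to the callback
}

# Usage: perl script.pl file.gz
if (@ARGV) {
    each_gz_line($ARGV[0], sub { print $_[0] });
}
```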


Kanji
User

Feb 24, 2002, 12:29 AM

Post #8 of 8 (737 views)
Re: [kencl] In place decompression

I've used Compress::Zlib with great success on FreeBSD. It can also be installed via the ports system (or as a package) for added convenience and assured compatibility.

--k.


(This post was edited by Kanji on Feb 24, 2002, 12:29 AM)
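
A minimal sketch of the Compress::Zlib route, assuming the module is installed. It uses the module's `gzopen` / `gzreadline` / `gzclose` interface; the sub name `process_gz_lines` and the callback interface are made up for illustration.

```perl
use strict;
use warnings;
use Compress::Zlib;    # exports gzopen, $gzerrno, Z_STREAM_END

# Hypothetical helper: decompress a gzip file in-process and hand each
# line to $callback, holding only one line in memory at a time.
sub process_gz_lines {
    my ($path, $callback) = @_;

    my $gz = gzopen($path, "rb")
        or die "Cannot open '$path': $gzerrno\n";

    my $line;
    my $count = 0;
    # gzreadline returns the number of bytes read, 0 at EOF, -1 on error.
    while ($gz->gzreadline($line) > 0) {
        $callback->($line);
        $count++;
    }

    # A clean end of the compressed stream leaves $gzerrno at Z_STREAM_END.
    die "Error reading '$path': $gzerrno\n" if $gzerrno != Z_STREAM_END;
    $gz->gzclose();

    return $count;
}

# Usage: perl script.pl file.gz
if (@ARGV) {
    process_gz_lines($ARGV[0], sub { print $_[0] });
}
```

Unlike the piped `gzip`/`zcat` variants earlier in the thread, this needs no external program.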

 
 

