global matching in big files

 



Cameron
Deleted

Aug 19, 2000, 9:11 AM

Post #1 of 3 (2877 views)
global matching in big files

I have some rather large text files, each consisting of a single line of text. The files are 40-120 MB.

To improve performance, and to stop my machine from hanging, I have been reading the files in 100,000-character chunks.

I want to find all occurrences of a pattern within the file.

Does m//g find all occurrences despite processing the file one chunk at a time?

Or do I need to split the file so that I don't lose any matches through bad luck with the arbitrary chunk size?


Cameron


TheGame+
Deleted

Aug 21, 2000, 3:31 AM

Post #2 of 3 (2877 views)
Re: global matching in big files

Depending on what pattern you're trying to find (matches of a few hundred chars, or possibly 100,000+ chars), you might use overlapping 'chunks':
- read in chars 0-99,999 and match
- keep the last 1000 (?) chars, append chars 100,000-199,999 to them and match again
- and so on

The overlap has to be at least as long as the longest match you expect, of course, so it's definitely not a good solution for ALL cases.
And you'll have to be careful not to count matches that fall in the overlapping part twice.
Anyway, it's just an idea. I'm not sure how this compares to gobbling up the whole file into memory, but it's definitely going to take more processing. sleep()ing between loops or lowering the priority of that task might solve your problem too :)
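
For what it's worth, the overlapping-chunk idea above could be sketched roughly like this. It is only a sketch under some assumptions, not a tested production routine: the name find_all_chunked and its parameters are made up for illustration, no single match may be longer than $overlap, and the pattern must never match the empty string. Instead of filtering out duplicates afterwards, it leaves any match that starts inside the overlap for the next pass, which also keeps a match from being cut short at the end of a chunk.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): returns a list of
# [absolute offset, matched text] pairs for every match of $re in $path.
# Assumes: no match is longer than $overlap, and $re never matches ''.
sub find_all_chunked {
    my ($path, $re, $chunk_size, $overlap) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";

    my $buf     = '';  # tail kept from the previous pass plus the newest chunk
    my $buf_off = 0;   # absolute file offset of the first character of $buf
    my $resume  = 0;   # absolute offset where the next match may start
    my @hits;

    while (1) {
        my $n = read($fh, my $chunk, $chunk_size);
        die "Read failed on $path: $!" unless defined $n;
        last if $n == 0;
        $buf .= $chunk;

        # Unless this is the last chunk, matches that start in the final
        # $overlap characters are deferred to the next pass.
        my $limit = eof($fh) ? length($buf) : length($buf) - $overlap;

        # Continue exactly where the previous pass stopped.
        pos($buf) = $resume - $buf_off;
        while ($buf =~ /$re/g) {
            last if $-[0] >= $limit;
            push @hits, [ $buf_off + $-[0], substr($buf, $-[0], $+[0] - $-[0]) ];
            $resume = $buf_off + $+[0];
        }

        # Everything that starts before $limit has now been examined.
        $resume = $buf_off + $limit if $resume < $buf_off + $limit;

        # Drop all but the last $overlap characters of the buffer.
        my $drop = length($buf) - $overlap;
        if ($drop > 0) {
            substr($buf, 0, $drop, '');
            $buf_off += $drop;
        }
    }
    close $fh;
    return @hits;
}
```

You would call it as something like find_all_chunked('big.txt', qr/your pattern/, 100_000, 1000), where the file name, the pattern and the two sizes are placeholders to adjust for your own data. Memory use stays at roughly $chunk_size + $overlap characters no matter how big the file is.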


Cameron
Deleted

Aug 22, 2000, 8:41 AM

Post #3 of 3 (2877 views)
Re: global matching in big files

Thank you, TheGame+

That sufficiently answers my question.

I appreciate your help very much,
cameron

 
 

