

Aug 30, 2016, 3:49 AM

How to speed up search on array?

I have a program that has to search a huge array of 3,000,000 items, @prkeys, and return every matching item. Each run of the program calls this search routine 3000+ times, and the whole program can take 90+ minutes to finish. Here's the code that is slowing me down.

# Regex will be: MODEL.+OLDPRICE 
# Find all models that start with key and end with price.
@k=grep(/^$k/, @prkeys);

The intention of this code is to make a fuzzy match, looking for strings that begin with a model number and end with a price. This is my last resort for finding a match on a model, and it must be a fuzzy match of some type. I could split the search into 2 parts, but it seems both parts would be a regex, and splitting it would slow me down even more.
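For what it's worth, the two parts described above do not have to be regexes. The following is only a minimal sketch, assuming the model and the price are available as separate strings and that the match is meant to be anchored at both ends, i.e. /^MODEL.+PRICE$/ (the posted line only anchors the start). The sub name match_model_price and the sample data are made up for illustration. Plain string comparison also avoids a side effect of interpolating $k into a regex without \Q...\E, where a "." in a price matches any character.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real @prkeys holds 3,000,000+ strings.
my @prkeys = (
    'AB123-X-19.99', 'AB123-Y-24.50', 'AB1234-19.99',
    'ZZ900-X-19.99', 'AB12319.99',
);

# Return every key that starts with $model, ends with $price, and has at
# least one character in between (the ".+" in /^MODEL.+PRICE$/).
# Assumes $model and $price are both non-empty.
sub match_model_price {
    my ($model, $price, $keys) = @_;
    my $mlen    = length $model;
    my $plen    = length $price;
    my $min_len = $mlen + $plen + 1;
    return grep {
        length($_) >= $min_len
            && substr($_, 0, $mlen) eq $model
            && substr($_, -$plen) eq $price
    } @$keys;
}

my @found = match_model_price('AB123', '19.99', \@prkeys);
print "$_\n" for @found;
```

This still walks all of @prkeys on every call, so it only trims the per-string cost; whether it beats the regex on the real data would have to be measured.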

  1. @prkeys is an array that contains 3,000,000+ items.
  2. @prkeys is all in memory.
  3. Each string in @prkeys is 5-30 characters long.
  4. I must return every item that matches in @prkeys.
  5. Each time I search for $k, $k starts with a model number and ends with a price. So the regex looks like: /MODEL.+PRICE/
  6. Because of the bad data the customer gives us, I do have to search all 3,000,000 strings this way.
  7. This OS is a virtual machine and there are other VMs on this physical server, and I suspect the other VMs are also slowing me down. I cannot move the VM to another physical server, so I must address speed in the code itself.
  8. I have a test program to test read speed but I have no other ideas how to speed this up. Speed normally is not an issue for me.


  1. How can I speed this up? Each time this one line runs it takes about 2 seconds. At 3000 runs, that's 6000 seconds for this one line alone, not counting the overhead and processing in the rest of the program.
  2. Will I have to use another data structure to search for all this data?
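
On question 2, one possible direction is to sort the array once and binary-search for the model prefix, since all strings sharing a prefix sit next to each other in sorted order. This is a sketch under assumptions, not a tested solution for the real data: it assumes every wanted string really does start with the model (as the ^ anchor implies), that the price must be at the very end, and that the order of the returned items does not matter. The names lower_bound and find_model_price and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the 3,000,000+ entries.
my @prkeys = (
    'ZZ900-X-19.99', 'AB123-X-19.99', 'AB1234-19.99',
    'AB123-Y-24.50', 'AA001-5.00',
);

# Sort once, up front. Keys that share a prefix are then adjacent.
my @sorted = sort @prkeys;

# Binary search: index of the first element that is ge $target.
sub lower_bound {
    my ($keys, $target) = @_;
    my ($lo, $hi) = (0, scalar @$keys);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($keys->[$mid] lt $target) { $lo = $mid + 1 }
        else                          { $hi = $mid }
    }
    return $lo;
}

# Scan only the run of keys that start with $model, keeping those that
# also end with $price and have at least one character in between.
sub find_model_price {
    my ($keys, $model, $price) = @_;
    my $mlen    = length $model;
    my $plen    = length $price;
    my $min_len = $mlen + $plen + 1;
    my @hits;
    for (my $i = lower_bound($keys, $model); $i < @$keys; $i++) {
        my $s = $keys->[$i];
        last if substr($s, 0, $mlen) ne $model;   # left the prefix run
        push @hits, $s
            if length($s) >= $min_len && substr($s, -$plen) eq $price;
    }
    return @hits;
}

my @found = find_model_price(\@sorted, 'AB123', '19.99');
print "$_\n" for @found;
```

The sort is paid once per program run; each of the 3000+ lookups then touches only the strings that share the model prefix instead of all 3,000,000. If the model can appear somewhere other than the start of the string, this approach does not apply as written.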

Thank you for your help. I normally don't have to do such searching on a huge dataset.

I will post a link to the huge file and a test program for you to use shortly.

Here's the link to the data file and test program. It's about 800 MB.

(This post was edited by bulrush on Aug 30, 2016, 10:07 AM)

Edit Log:
Post edited by bulrush (User) on Aug 30, 2016, 4:06 AM
Post edited by bulrush (User) on Aug 30, 2016, 4:08 AM
Post edited by bulrush (User) on Aug 30, 2016, 4:33 AM
Post edited by bulrush (User) on Aug 30, 2016, 4:35 AM
Post edited by bulrush (User) on Aug 30, 2016, 5:27 AM
Post edited by bulrush (User) on Aug 30, 2016, 10:07 AM
