I don't know the price range for programs, but if anybody is interested, just make me your best offer. Thanks.
Using a tool called lynx, I successfully dumped the contents of a webpage into a neat, well-arranged text file (see the attachment). As you can see in the dump, it groups the abstracts at the top and the references (HTML links) below.
I want a program that will search the text file for a word (let's say "galaxy") and find the abstracts that contain it. It should then write the link to each matching abstract into another file.
For example, I have pasted some parts of logfile.txt below. There, the program should recognize that the first abstract contains "galaxy". It should then grep the number out of that line, which begins with "astro-ph", match that number with the corresponding entry in the references, and send the HTML link to an output file or by email.
Anybody who wants to tackle this and needs further explanation, please feel free to email me at email@example.com. I will match any price out there, and pay more if the work is done before the end of July. Thanks.
astro-ph/0606414 [abs, ps, pdf, other] : Comments: 10 pages, an invited talk presented in the GC2006 workshop, high-resolution version available at this http URL About 2 million seconds of Chandra observing time have been devoted to the Galaxy center
astro-ph/0606415 [abs, ps, pdf, other] : Comments: 4 pages, 1 figure, submitted to ApJ Letters We investigate the orientation of the axes and angular momentum of dark matter halos with respect to their neighboring voids using high resolution N-body cosmological simulations.
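
For anyone picking this up, here is a rough Python sketch of the matching step described above. It is only an illustration under several assumptions, not a finished program: it assumes each entry sits on a single line beginning with "astro-ph/" (as in the pasted sample), that the untrimmed dump marks the abstract link as "[N]abs", and that the dump ends with a "References" section of lines like "12. http://...". The names `parse_references` and `find_links` are made up for this sketch.

```python
import re

# Assumed layout of a reference line in the dump: "  12. http://..."
REF_LINE = re.compile(r"^\s*(\d+)\.\s+(\S+)")
# Assumed marker for the abstract link inside an entry line: "[12]abs"
ABS_MARK = re.compile(r"\[(\d+)\]abs")


def parse_references(lines):
    """Map each reference number to its URL, reading lines after 'References'."""
    refs = {}
    in_refs = False
    for line in lines:
        if line.strip() == "References":
            in_refs = True
            continue
        if in_refs:
            match = REF_LINE.match(line)
            if match:
                refs[int(match.group(1))] = match.group(2)
    return refs


def find_links(text, keyword):
    """Return the abstract URLs of entries whose line contains the keyword."""
    lines = text.splitlines()
    refs = parse_references(lines)
    wanted = keyword.lower()
    links = []
    for line in lines:
        # Only entry lines are searched; they are assumed to start with "astro-ph/".
        if not line.lstrip().startswith("astro-ph/"):
            continue
        # Case-insensitive search, so "galaxy" also matches "Galaxy".
        if wanted not in line.lower():
            continue
        match = ABS_MARK.search(line)
        if match and int(match.group(1)) in refs:
            links.append(refs[int(match.group(1))])
    return links
```

Reading logfile.txt into a string, calling `find_links(text, "galaxy")`, and writing each returned link on its own line of an output file would cover the file part of the request. Emailing the links would need a separate step.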