CGI/Perl Guide

mixxed
New User

Nov 29, 2009, 5:06 AM

Post #1 of 4 (12456 views)
crawl & analyze generic web pages. Quote | Reply | Private Reply

Hi, I'm new to this forum, so bear with me a bit :)

I'm looking for a script/app that would perform the following actions in the first stage of the project:

Start from: a database of links and priorities.

1. read the list and determine the highest-priority link to crawl
2. crawl the page and store it locally
3. record the crawled links in a database
4. analyze the page to determine:
- primary content areas (excluding header/footer, banners, features)
- type of page (blog, forum, news, etc.)
- when the page was last modified, commented on, etc.
- various page-related or site-related parameters
- other scores (based on formulas I provide)
5. analyze the primary content for:
- similarity to previously known content (a trained Bayesian classifier or something similar)
- recency of the content
- the main keywords
6. store the scores and values from #4 and #5 in a database.
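To make the spec concrete, here is a rough Perl sketch of the kind of logic I mean for steps 1, 4, and 5 (keyword extraction). Everything here is illustrative only: the in-memory %queue hash stands in for the links/priorities database, the sample HTML is made up, and a real version would use DBI for storage and LWP::UserAgent to fetch pages.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(reduce);

# Hypothetical in-memory stand-in for the links/priorities database (step 1).
my %queue = (
    'http://example.com/a' => 3,
    'http://example.com/b' => 7,   # highest priority, so crawled first
    'http://example.com/c' => 5,
);

# Step 1: pick the highest-priority link.
sub next_link {
    my ($q) = @_;
    return reduce { $q->{$a} >= $q->{$b} ? $a : $b } keys %$q;
}

# Step 4 (very rough): strip obvious boilerplate, keep body text.
sub primary_text {
    my ($html) = @_;
    $html =~ s{<(script|style|header|footer|nav)\b.*?</\1>}{}gis;
    $html =~ s{<[^>]+>}{ }g;       # drop remaining tags
    $html =~ s{\s+}{ }g;           # collapse whitespace
    $html =~ s{^\s+|\s+$}{}g;
    return $html;
}

# Step 5 (keywords): crude extraction by word frequency.
sub top_keywords {
    my ($text, $n) = @_;
    my %freq;
    $freq{lc $_}++ for grep { length > 3 } split /\W+/, $text;
    my @sorted = sort { $freq{$b} <=> $freq{$a} || $a cmp $b } keys %freq;
    return @sorted[0 .. $n - 1];
}

my $url  = next_link(\%queue);
my $html = '<html><header>menu</header><body>Perl crawler crawler '
         . 'analysis</body><footer>c</footer></html>';
my $text = primary_text($html);
print "crawl: $url\n";
print "text:  $text\n";
print "keys:  @{[ top_keywords($text, 2) ]}\n";
```

The page-type and scoring logic (step 4) would plug in where primary_text is called, once the scoring formulas are supplied.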


I'm expecting this to be a 1-2 month project.
Is there anyone here who could get this done and provide a quote?

M.
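For the similarity scoring in step 5, here is the kind of minimal multinomial naive Bayes I have in mind. The classes, training strings, and helper names are toy examples for illustration; a real version would train on stored pages from the database.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Minimal multinomial naive Bayes over word counts (step 5 sketch).
my %count;   # $count{$class}{$word} = occurrences
my %total;   # total words seen per class
my %docs;    # training documents per class
my $ndocs = 0;

sub train {
    my ($class, $text) = @_;
    for my $w (grep { length > 2 } split /\W+/, lc $text) {
        $count{$class}{$w}++;
        $total{$class}++;
    }
    $docs{$class}++;
    $ndocs++;
}

# Log-probability of $text under $class, with add-one smoothing.
sub score {
    my ($class, $text) = @_;
    my @words = grep { length > 2 } split /\W+/, lc $text;
    my $vocab = keys %{ { map { %$_ } values %count } };  # vocabulary size
    my $s = log($docs{$class} / $ndocs);                  # class prior
    for my $w (@words) {
        my $c = $count{$class}{$w} // 0;
        $s += log( ($c + 1) / ($total{$class} + $vocab) );
    }
    return $s;
}

sub classify {
    my ($text) = @_;
    my ($best) = sort { score($b, $text) <=> score($a, $text) } keys %docs;
    return $best;
}

train(blog  => 'posted by admin comments rss feed new post');
train(forum => 'thread reply quote member joined posts');
print classify('please reply to this thread and quote the post'), "\n";
```

The same score() values could be stored directly as the similarity scores in step 6.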


Anonymous
Anonymous
ashishgupta10@gmail.com

Dec 3, 2009, 8:47 AM

Post #2 of 4 (12372 views)
Re: [mixxed] crawl & analyze generic web pages. [In reply to] Quote | Reply | Private Reply

contact me at ashishgupta10@gmail.com


I will do the needful for you.


lifesaver
Novice

Dec 12, 2009, 10:10 AM

Post #3 of 4 (11977 views)
Re: [mixxed] crawl & analyze generic web pages. [In reply to] Quote | Reply | Private Reply


In Reply To
Hi, I'm new to this forum, so bear with me a bit :)

I'm looking for a script/app that would perform the following actions in the first stage of the project:

Start from: a database of links and priorities.

1. read the list and determine the highest-priority link to crawl
2. crawl the page and store it locally
3. record the crawled links in a database
4. analyze the page to determine:
- primary content areas (excluding header/footer, banners, features)
- type of page (blog, forum, news, etc.)
- when the page was last modified, commented on, etc.
- various page-related or site-related parameters
- other scores (based on formulas I provide)
5. analyze the primary content for:
- similarity to previously known content (a trained Bayesian classifier or something similar)
- recency of the content
- the main keywords
6. store the scores and values from #4 and #5 in a database.


I'm expecting this to be a 1-2 month project.
Is there anyone here who could get this done and provide a quote?

M.


We can do it. If you're interested, contact our designer live or post your requirement on our message board at http://www.livefreelancer.net/


Anonymous
Anonymous
bean.gh@gmail.com

Feb 9, 2010, 1:24 PM

Post #4 of 4 (11546 views)
Re: [Anonymous] crawl & analyze generic web pages. [In reply to] Quote | Reply | Private Reply

Hello, I have 8+ years of experience in PHP, Perl, Python, MySQL, CSS, Linux, Mason, Catalyst, mod_perl, Apache, Ajax, JavaScript, HTML toolkit, Java, C/C++, MarkLogic (4 yrs), and XQuery. I have done projects ranging from website development and socket programming to crawler development and FTP/LWP programs. I have very good knowledge of website development and can do the job. For more details of my work, please see www.123greetings.com and http://www.bharatmatrimony.com (I was an employee of both companies).

 
 


Powered by Gossamer Forum v.1.2.0
Web Applications & Managed Hosting Powered by Gossamer Threads