We want to develop a system for scraping data from various websites specialized in classified ads for second-hand products.
The data must be captured periodically so that we know which ads are new, which have been updated and which have been removed.
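To illustrate the new/updated/removed requirement, here is a minimal sketch that compares two scrape runs. It assumes each run yields a dictionary mapping an ad ID to its scraped fields; the names `fingerprint` and `diff_snapshots` and the field names are hypothetical and not taken from any existing crawler.

```python
import hashlib
import json


def fingerprint(ad: dict) -> str:
    """Return a stable hash of an ad's fields (key order does not matter)."""
    payload = json.dumps(ad, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_snapshots(previous: dict, current: dict) -> dict:
    """Classify ad IDs as new, updated or removed between two runs.

    Both arguments map an ad ID to a dict of scraped fields (assumed shape).
    """
    prev_ids, curr_ids = set(previous), set(current)
    updated = sorted(
        ad_id
        for ad_id in prev_ids & curr_ids
        if fingerprint(previous[ad_id]) != fingerprint(current[ad_id])
    )
    return {
        "new": sorted(curr_ids - prev_ids),
        "updated": updated,
        "removed": sorted(prev_ids - curr_ids),
    }


if __name__ == "__main__":
    yesterday = {"a1": {"title": "Bike", "price": 100}, "a2": {"title": "Sofa", "price": 50}}
    today = {"a1": {"title": "Bike", "price": 90}, "a3": {"title": "Lamp", "price": 10}}
    print(diff_snapshots(yesterday, today))
```

Hashing the whole record keeps the comparison independent of which of the 10 to 15 fields changed; comparing selected fields only would be an equally valid choice.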
With the data from all the websites, finally unified in a single database, we want to be able to analyze the evolution of the market data.
Therefore, we need to be able to go through certain categories (not all) of a total of five different websites. We need to be able to scrape about 10 to 15 key fields from all those ads (each website has the same page structure in all of its categories).
Preferably we would like the system to be developed in Python (we already have a crawler for one of those websites in Python, and it works fine).
We want a stable system that runs as autonomously as possible (as long as the format of the target websites does not change). We also want the system to send email alerts to notify us when a failure occurs (a service goes down, a website blocks us, the format of a website has changed and we can no longer extract the data, etc.). We want proxy rotation support (to prevent IP blocking).
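As a rough illustration of the proxy and alert requirements, the sketch below rotates through a list of proxies and builds an email message when every attempt fails. It uses only the standard library; the function names, the `fetch` callable (which would wrap whatever HTTP client the crawler uses) and all addresses are assumptions made for this example.

```python
from email.message import EmailMessage
from itertools import cycle
from typing import Callable, Sequence


class AllProxiesFailed(Exception):
    """Raised when no attempt through any proxy succeeded."""


def fetch_with_rotation(
    url: str,
    proxies: Sequence[str],
    fetch: Callable[[str, str], str],
    max_attempts: int = 3,
) -> str:
    """Try `fetch(url, proxy)` with a different proxy on each attempt."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    pool = cycle(proxies)
    errors = []
    for _ in range(max_attempts):
        proxy = next(pool)
        try:
            return fetch(url, proxy)
        except Exception as exc:  # any failure moves on to the next proxy
            errors.append(f"{proxy}: {exc}")
    raise AllProxiesFailed("; ".join(errors))


def build_alert(site: str, problem: str, sender: str, recipient: str) -> EmailMessage:
    """Build (but do not send) a plain-text failure alert."""
    msg = EmailMessage()
    msg["Subject"] = f"[scraper] {site}: failure"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(problem)
    return msg
```

The message could then be sent with `smtplib.SMTP(host).send_message(msg)`, where the SMTP host and credentials depend on the mail provider chosen. The same alert could be raised when a parser returns none of the expected fields, which usually signals a format change on the target website.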
We are open to suggestions regarding the most professional architecture for maintaining the system (Python -> file -> MySQL; Python -> PostgreSQL -> MySQL; Python -> MySQL; ...; self-hosted server, specialized crawler hosting, etc.).
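For the simplest of these options (Python writing directly to the database), a minimal storage sketch could look like the following. It uses `sqlite3` from the standard library purely as a stand-in so the example runs without a server; the table layout and column names are assumptions, and on MySQL the equivalent statement would be `INSERT ... ON DUPLICATE KEY UPDATE`.

```python
import sqlite3

# Hypothetical unified table: one row per ad per website.
SCHEMA = """
CREATE TABLE IF NOT EXISTS ads (
    site       TEXT NOT NULL,
    ad_id      TEXT NOT NULL,
    title      TEXT,
    price      REAL,
    first_seen TEXT NOT NULL,
    last_seen  TEXT NOT NULL,
    PRIMARY KEY (site, ad_id)
)
"""


def upsert_ad(conn, site, ad_id, title, price, seen_at):
    """Insert a new ad, or refresh its fields and last_seen if it exists."""
    conn.execute(
        """
        INSERT INTO ads (site, ad_id, title, price, first_seen, last_seen)
        VALUES (?, ?, ?, ?, ?, ?)
        ON CONFLICT(site, ad_id) DO UPDATE SET
            title = excluded.title,
            price = excluded.price,
            last_seen = excluded.last_seen
        """,
        (site, ad_id, title, price, seen_at, seen_at),
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    upsert_ad(conn, "site-a", "a1", "Bike", 100.0, "2024-01-01")
    upsert_ad(conn, "site-a", "a1", "Bike", 90.0, "2024-01-02")
    print(conn.execute("SELECT * FROM ads").fetchall())
```

Keeping `first_seen` and `last_seen` on each row is one way to tell removed ads apart later: an ad whose `last_seen` is older than the latest run was not found in that run.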
The five websites to be crawled (for now) are:
[login to view URL]
[login to view URL]
[login to view URL]
[login to view URL]
[login to view URL]
We will probably need maintenance services after the end of the project.
Hi, I am interested in your project to scrape 5 websites on an ongoing basis. I will provide you with a complete solution including the script, documentation, data storage, a proxy provider recommendation, scheduling, etc.
My price is 80 EUR per website, so it will be 400 EUR for all 5 websites.
Hello!
I am a Python developer.
I looked at your project and it seems interesting.
I have all necessary skills required for this project.
Ping me to discuss in detail.
I have a lot of experience with web scraping and automation using proxies and VPNs, including bypassing the new Google reCAPTCHA. I also have experience creating APIs that connect to this kind of automation and keep its behaviour under control. If you need more information about how I would approach the work, please contact me.
Hi, I'm a Python programmer with experience in web scraping using Python, BeautifulSoup and Selenium. I can build you a script to scrape the data you need from these websites and export it to CSV, Excel, TXT or wherever you want :)
If you have any questions, please feel free to contact me.