Project ID:
1458602
Project Type:
Hourly
Hours of work:
Unspecified
Project Duration:
1 - 4 weeks
Budget:
$15-$25 CAD / hour
(Approx. €11-€20 EUR)
Project Description:
We have an existing application that collects, monitors and analyzes public health inspection report data for large restaurant chains.
The source data lives on public health inspection websites. Typically these websites are at the city or province/state level. Here are two examples:
http://www.peelregion.ca/health/foodcheck/inspection.htm
https://www.myfloridalicense.com/insptermsofUse.asp?SID=
We scrape this data with Python scripts that use BeautifulSoup (or PDFMiner if the data is in PDF format) and store it in a database via the Django ORM.
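For applicants who want a feel for the general shape of this work before seeing the repository, here is a rough sketch of the parse step. It is not our framework or our example scraper. To keep it runnable without any installs, it uses the standard library's html.parser in place of BeautifulSoup and returns plain Python objects instead of saving through the Django ORM. The Inspection record, its field names, the parse_inspections function and the sample HTML are all made up for illustration.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class Inspection:
    # Hypothetical record shape; the real Django model is not shown here.
    restaurant: str
    date: str
    result: str


class TableRowParser(HTMLParser):
    """Collects the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []        # completed rows, each a list of cell strings
        self._row = None      # cells of the row currently being read
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            # Header rows contain only <th> cells, so they end up empty
            # and are skipped.
            if self._row:
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def parse_inspections(html):
    """Turn a three-column results table into Inspection records."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return [Inspection(*row) for row in parser.rows if len(row) == 3]


# Made-up sample page; real inspection sites each have their own layout.
SAMPLE_HTML = """
<table>
  <tr><th>Restaurant</th><th>Date</th><th>Result</th></tr>
  <tr><td>Example Diner</td><td>2013-05-01</td><td>Pass</td></tr>
  <tr><td>Sample Grill</td><td>2013-05-03</td><td>Conditional Pass</td></tr>
</table>
"""

records = parse_inspections(SAMPLE_HTML)
```

In a real scraper the HTML would come from the inspection website, and each record would be saved to the database rather than kept in a list.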
We're in the process of expanding the number of these websites we track and are looking for someone who will write between 50 and 100 additional scrapers for us.
If you're interested in this project, please:
1. Private message me and I'll provide you with access to a repository on bitbucket.org where you can review the framework/base classes we use for scraping and an example scraper. We need new scrapers to follow a certain format because we already have a lot of them and we want them to be consistent.
2. After you've had a chance to look at the example code, provide us with an *approximate* effort estimate in hours to code a typical scraper like the example we provided. The scrapers vary in complexity. We understand that and aren't looking for bulletproof estimates. We just want to see whether your effort estimate for an individual scraper is in line with our experience of creating a typical scraper like the example.
3. Provide us your hourly rate.
If we're on the same page with rate and effort estimate per scraper, we'll do a smaller paid project with you to do a few of the scrapers. If everything goes well, we'll do a larger project with you for the remainder of the scrapers.
In a perfect world you'd work on this full-time until all of the scrapers are complete. For the right candidate, that's negotiable.
We care about cost, but we also care a lot about high-quality work and getting the work done in a timely fashion. You don't need to be the lowest bidder to win the work, but you have to be in the ballpark.
We also have some non-scraper work on this Django-based app that we would make available to the right candidate if the scraping goes well.
Skills required:
Django,
Python,
Web Scraping