Closed

Instagram / Twitter / Flickr Python Data Crawler

This project received 11 bids from talented freelancers, with an average bid of $859 USD.

Project budget
$250 - $750 USD
Number of bids
11
Project description

Description

We need to crawl 10M geotagged records from Flickr / Instagram / Twitter to build a data visualization on a map. The goal is something like

[url removed, login to view]

The freelancer will need to deliver the following.

tasks:

1. Register Flickr / Instagram / Twitter developer accounts.

2. Research their APIs and write a crawler that grabs the data within a geofence bounding box, e.g. the San Francisco bounding box: [url removed, login to view], [url removed, login to view], [url removed, login to view], 37.8324.

3.
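The geofence crawl in task 2 could be organized roughly as below. This is only a sketch under assumptions: `BoundingBox`, `split_into_tiles`, and the example coordinates are hypothetical names and placeholder values, not the San Francisco box from the posting, and the idea of splitting the box into smaller tiles is one possible way to keep each API query small, not something the posting prescribes.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned geofence in decimal degrees (hypothetical helper type)."""
    west: float   # minimum longitude
    south: float  # minimum latitude
    east: float   # maximum longitude
    north: float  # maximum latitude

    def contains(self, lat: float, lon: float) -> bool:
        """Return True if the point lies inside or on the edge of the box."""
        return self.south <= lat <= self.north and self.west <= lon <= self.east


def split_into_tiles(box: BoundingBox, rows: int, cols: int) -> Iterator[BoundingBox]:
    """Yield rows * cols equal sub-boxes so each query covers a small area."""
    lat_step = (box.north - box.south) / rows
    lon_step = (box.east - box.west) / cols
    for r in range(rows):
        for c in range(cols):
            yield BoundingBox(
                west=box.west + c * lon_step,
                south=box.south + r * lat_step,
                east=box.west + (c + 1) * lon_step,
                north=box.south + (r + 1) * lat_step,
            )


# Placeholder coordinates, NOT the San Francisco values from the posting.
EXAMPLE_BOX = BoundingBox(west=-10.0, south=40.0, east=-8.0, north=42.0)
TILES = list(split_into_tiles(EXAMPLE_BOX, rows=2, cols=2))
```

Each tile would then be handed to a per-service fetch function, and `contains` can double-check results, since an API may return points slightly outside the requested area. The sketch does not handle boxes that cross the antimeridian.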

deliverables:

1. Three daemon/service-like Python programs that crawl the geotagged data from Flickr, Instagram, and Twitter and store it in the NoSQL database MongoDB.

2. The programs should be stable enough to crawl data 24/7.

3. They should crawl 1 million geotagged records per week, even given the rate limits of the APIs.

4. The programs must be scalable and support multithreaded workers, for example through a queue library such as Celery in Python.
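The worker model asked for in the deliverables can be illustrated without any external service. The sketch below deliberately swaps in Python's standard `queue` and `threading` modules as a stand-in for Celery, so it shows the pattern (a shared job queue drained by several worker threads) rather than Celery's own API. `run_workers` and `fake_crawl` are hypothetical names, and `fake_crawl` only pretends to call an API.

```python
import queue
import threading
from typing import Callable, List


def run_workers(jobs: List[str], handler: Callable[[str], int], num_workers: int = 4) -> int:
    """Process jobs on worker threads fed by a queue; return the summed results."""
    pending: queue.Queue = queue.Queue()
    for job in jobs:
        pending.put(job)

    total = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal total
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return  # no work left, let this thread exit
            count = handler(job)
            with lock:  # protect the shared counter
                total += count

    threads = [threading.Thread(target=worker, daemon=True) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def fake_crawl(tile_id: str) -> int:
    """Hypothetical stand-in for one API call; pretends each tile yields 10 records."""
    return 10


if __name__ == "__main__":
    print(run_workers(["tile-0", "tile-1", "tile-2"], fake_crawl, num_workers=2))
```

A real 24/7 service would also need retries and error handling around `handler`, which this sketch leaves out. With a queue library such as Celery, the job queue would live in a broker (the posting lists Redis) so that workers can run in separate processes or on separate machines.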

GEOTAG is a must! We don't need data without GPS information.
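The geotag requirement could be enforced just before records are written to MongoDB. The sketch below assumes a hypothetical, already normalized input with `lat` and `lon` keys; each real API nests its coordinates differently and would need its own adapter. The function names and the document layout (`source`, `location`, `raw`) are illustrative choices, not part of the posting.

```python
from typing import Any, Dict, Iterable, List, Optional


def _is_number(value: Any) -> bool:
    """True for int/float values, excluding booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def to_geo_document(source: str, raw: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return a MongoDB-ready document, or None if the record has no usable GPS fix."""
    lat, lon = raw.get("lat"), raw.get("lon")
    if not (_is_number(lat) and _is_number(lon)):
        return None
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    return {
        "source": source,
        # GeoJSON order is [longitude, latitude], as MongoDB 2dsphere indexes expect.
        "location": {"type": "Point", "coordinates": [float(lon), float(lat)]},
        "raw": raw,  # keep the original payload for later analysis
    }


def keep_geotagged(source: str, records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop every record without valid coordinates."""
    docs = (to_geo_document(source, r) for r in records)
    return [d for d in docs if d is not None]
```

The resulting list could be passed to pymongo's `collection.insert_many(docs)`. Storing the point as GeoJSON means a `2dsphere` index can serve the map queries later.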

Qualities needed to be successful

Python experience writing service/daemon-like programs

MongoDB, Redis, Celery

Twitter / Instagram / Flickr API experience

Other skills: Data Science, Data Scraping, MongoDB, Python, Redis, Web Crawler

You will be asked to answer the following questions when submitting a proposal:

(1) Have you written a Python crawler that uses the Twitter / Instagram / Flickr APIs before?

(2) Have you used a queue library (e.g. Celery) with multithreaded workers in Python to write a daemon/service-like program?

(3) Have you used a NoSQL database such as MongoDB to store data before?

(4) How much time do you estimate you will need for the whole project?

(5) We want to set up a small interview milestone as a test: simply use the APIs to grab 10+ raw JSON records from Instagram, Flickr, and Twitter with GEOTAG (latitude and longitude).

(6) How would you deal with rate limits while crawling data? Multiple IPs / accounts?
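One possible answer to the rate-limit question is client-side throttling, so the crawler never exceeds whatever quota a service publishes. The token bucket below is a minimal sketch: `TokenBucket` and `required_rate` are hypothetical names, and the rate and capacity in the usage note are placeholders, not the real limits of any of the three APIs. The small helper also works out the throughput target from the deliverables: 1 million records per week is about 1.65 records per second on average.

```python
import time
from typing import Callable

SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800


def required_rate(records: int, seconds: float) -> float:
    """Average records per second needed to hit a target within a time window."""
    return records / seconds


class TokenBucket:
    """Allow at most `rate_per_sec` requests on average, with bursts up to `capacity`."""

    def __init__(self, rate_per_sec: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.clock = clock      # injectable so the limiter can be tested without sleeping
        self.last = clock()

    def _refill(self) -> None:
        """Add tokens for the time elapsed since the last check, capped at capacity."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self) -> bool:
        """Consume one token if available; return False if the caller must wait."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def wait_time(self) -> float:
        """Seconds until the next token becomes available (0.0 if one is ready)."""
        self._refill()
        return max(0.0, (1.0 - self.tokens) / self.rate)
```

A worker would call `try_acquire()` before each request and `time.sleep(bucket.wait_time())` when it returns `False`. Keeping one bucket per API key lets each credential be throttled on its own; whether using several accounts is permitted depends on each service's terms.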
