Save all content of webpages (including FRAMES) - 20871

$200-300 USD

Closed
Posted about 12 years ago


Paid on delivery
The project is to create a piece of software that saves all content of webpages (including FRAMES) for any given list of URLs. Basically, the software should do the following:

• After execution, it should ask the user to paste a list of URLs from Excel
• For each URL, it should save the full contents (including the content of all FRAMES) of the page located at that URL into a separate folder on the hard drive

Now the full details:

• The software has to be Windows-based
• It can be written in any programming language
• The most important requirement is the ability to save ALL contents, in particular the content of FRAMES. It should save all files separately (the HTML file and the related CSS files, images and JavaScript files). It should basically save the page “faithfully”, just as browsers see it (please see the following note)
• For this reason, it might be easier (or it might not be; we don’t know the best way and this is just an option to consider) to create this software in the form of a Google Chrome extension or a Mozilla Firefox add-on, because both Chrome and Firefox can save all contents of pages as they display them, with frames, images, etc. (Chrome’s default “Save As” does that, while Firefox uses another add-on, “Mozilla Archive Format”, to save pages “faithfully”). However, we are not sure whether Chrome and Firefox have any disk write APIs, so this might not work. For your own testing purposes, it might be a good idea to compare the results with the way Chrome saves pages.
• The software must have the following adjustable parameters:
  o Minimum pause before processing the next URL (in seconds): MIN_WAIT
  o Maximum pause before processing the next URL (in seconds): MAX_WAIT
  o Download folder (folder on the hard drive)
• This is how the software should work:
  o User starts the software
  o The software asks for a list of URLs
  o It should be capable of accepting lists of up to 10,000 URLs
  o We need to make input easy. We produce links in Excel, so we should simply select a range of cells with URLs (in one column), copy them and paste them into the software.
  o Then we should be able to set two pause parameters, MIN_WAIT and MAX_WAIT: the minimum and maximum pause between finishing processing one URL and moving on to the next one. For example, MIN_WAIT = 2 sec, MAX_WAIT = 10 sec. Then for each URL that the software is about to load, it should wait a random number of seconds between MIN_WAIT and MAX_WAIT before attempting to open and save it.
  o Then we should be able to select the download folder. By default, the software should remember the previous choice.
  o Then we should hit a “start” button and for each URL the software should do the following:
    ▪ Create a new folder for the contents of this URL within the Download folder. The individual folder’s name should follow this format: “YYYY-MM-DD-HH-MM-SS”, which is basically the time of creation.
    ▪ Save all contents of this URL into this individual folder.
    ▪ Add a line to the program log (see below).
    ▪ Generate a random number of seconds between MIN_WAIT and MAX_WAIT and wait that number of seconds before moving on to the next URL
  o Logging. The software should maintain a log file (text file) of all URLs that have been processed. For each URL it should save one line of text using the following format: “YYYY-MM-DD-HH-MM-SS: URL”. The timestamp should be the same as the timestamp in the folder name for that URL
  o The software must be able to work “quietly”, either in the tray or (if part of a browser) in the taskbar. Basically, it shouldn’t pop up for each URL or anything like that; the user should be able to use the PC for other tasks while the software is running.
  o Finally, the software should have a line of progress text to show, for example, “120 of 1500 URLs processed”.
  o There should also be a button to stop processing URLs. On click, the software should cancel processing the current URL and stop.
• The code must be comprehensively commented: if not every line, then every few lines, to explain what they do
• The deliverables are:
  o Uncompiled code with full instructions on how to compile and run it (including all external libraries or modules that will be used to support this software)
  o Compiled version of the software
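To make the input, naming, logging and pause rules above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a required design: the function names, the `log.txt` file name and the `save_page` callback are all hypothetical, and the hard part of the project (saving a page faithfully, including FRAMES) is deliberately left to that callback. The sketch also only checks the stop request between URLs, whereas the real software should cancel the URL in progress.

```python
import random
import time
from datetime import datetime
from pathlib import Path


def parse_pasted_urls(pasted):
    """Split text pasted from one Excel column into URLs (one cell per line)."""
    return [line.strip() for line in pasted.splitlines() if line.strip()]


def timestamp_name(moment):
    """Format a datetime as YYYY-MM-DD-HH-MM-SS for folder names and log lines."""
    return moment.strftime("%Y-%m-%d-%H-%M-%S")


def process_urls(urls, download_dir, min_wait, max_wait, save_page,
                 should_stop=lambda: False, sleep=time.sleep, now=datetime.now):
    """Process each URL in turn; returns the number of URLs processed.

    save_page(url, folder) is a hypothetical callback that must do the
    actual faithful save (HTML, frames, CSS, images, JavaScript).
    """
    download_dir = Path(download_dir)
    log_path = download_dir / "log.txt"  # assumed log file name
    processed = 0
    for url in urls:
        if should_stop():  # stop button pressed (checked between URLs only)
            break
        # One timestamp is shared by the folder name and the log line.
        stamp = timestamp_name(now())
        folder = download_dir / stamp
        folder.mkdir(parents=True, exist_ok=True)
        save_page(url, folder)
        with log_path.open("a", encoding="utf-8") as log:
            log.write(f"{stamp}: {url}\n")
        processed += 1
        print(f"{processed} of {len(urls)} URLs processed")
        # Random pause between MIN_WAIT and MAX_WAIT before the next URL.
        sleep(random.uniform(min_wait, max_wait))
    return processed
```

Because folder names have one-second resolution, two URLs processed within the same second would share a folder in this sketch; with MIN_WAIT of at least 1 second that cannot happen.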
Project ID: 16886305

About the project

6 proposals
Remote project
Active 12 years ago


About the client

Sunbury, United Kingdom
Member since March 18, 2012
