Hello everybody,
I need a small script that crawls given directories, e.g. a Google search results page or a directory like this: [login to view URL]
The crawler should open each website it finds and check whether the page contains a link with the word "Impressum" or part of that word (case-insensitive). If so, it should open that link and write the content of the linked page (without HTML tags and without line breaks) into a CSV file in the following format:
"uncategorized", "CONTENT_OF_PAGE", "URL"
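
To make the requested steps concrete, here is a rough Python sketch using only the standard library. It is an illustration under assumptions, not a finished crawler: the function names (`fetch`, `find_impressum_link`, `html_to_text`, `write_row`) are hypothetical, "parts of the word" is interpreted as the substring "impress" in the link text or href, and the CSV row is written without spaces after the commas. Collecting the site URLs from a search page or directory is not covered.

```python
import csv
import re
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


def fetch(url):
    """Download a page and return it as text (no retries or robots.txt handling)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


class _LinkFinder(HTMLParser):
    """Collects (href, link text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None


def find_impressum_link(html, base_url, needle="impress"):
    """Return the absolute URL of the first link whose text or href
    contains `needle` (case-insensitive), or None if there is none.
    The default needle is an assumption about "parts of the word"."""
    finder = _LinkFinder()
    finder.feed(html)
    finder.close()
    for href, text in finder.links:
        if needle in text.lower() or needle in href.lower():
            return urljoin(base_url, href)
    return None


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    """Strip tags and collapse all whitespace, including line breaks."""
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    return re.sub(r"\s+", " ", " ".join(extractor.parts)).strip()


def write_row(out, text, url):
    """Append one fully quoted row: "uncategorized","<text>","<url>"."""
    csv.writer(out, quoting=csv.QUOTE_ALL).writerow(["uncategorized", text, url])
```

A possible way to wire this together for one site, assuming `site_url` is a URL already collected from the directory: call `find_impressum_link(fetch(site_url), site_url)`, and if it returns a link, pass `html_to_text(fetch(link))` and the link to `write_row` with an open CSV file (opened with `newline=""`).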