August 2015

Dead simple {for devs} Python crawler (script) for extracting structured data from any website into CSV

Posted on August 16, 2015 by

In my previous post I wrote about a very basic web crawler that can randomly scour the web and mirror/download websites. Today I want to share a very simple script that can extract structured data from almost any website. Use the following script to extract specific information from a website (e.g. prices, IDs, titles,

Continue reading

Posted in Technology
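
To give a rough idea of what "structured data into CSV" can look like, here is a minimal sketch using only the Python standard library. It is not the script from the post above. It assumes the fields you want sit in elements tagged with a CSS class, and the class names ("title", "price") as well as the names `FieldExtractor` and `html_to_csv` are made up for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class FieldExtractor(HTMLParser):
    """Collect the text of elements whose class attribute names a wanted field."""

    def __init__(self, fields):
        super().__init__()
        self.fields = list(fields)  # class names to capture (hypothetical)
        self.current = None         # field whose text we expect next, if any
        self.record = {}            # the record being assembled
        self.rows = []              # completed records, one dict per row

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.fields:
                self.current = name

    def handle_data(self, data):
        text = data.strip()
        if self.current and text:
            self.record[self.current] = text
            self.current = None
            # Assumption: a record is complete once every field has been seen.
            if set(self.fields).issubset(self.record):
                self.rows.append(self.record)
                self.record = {}


def html_to_csv(html, fields):
    """Return CSV text with one column per field and one row per record."""
    parser = FieldExtractor(fields)
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(fields))
    writer.writeheader()
    writer.writerows(parser.rows)
    return out.getvalue()
```

To use it on a live page you would first download the HTML (for example with `urllib.request.urlopen(url).read().decode()`) and then call `html_to_csv(html, ["title", "price"])`. Real pages are messier than this assumes, so treat it as a starting point.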

Tiny basic multi-threaded web crawler in Python

Posted on August 12, 2015 by

If you need a simple web crawler that will scour the web for a while and download content from random sites, this code is for you.

Usage:

$ python tinyDirtyIffyGoodEnoughWebCrawler.py http://cnn.com

where http://cnn.com is your seed site. It can be any site that contains content and links to other sites. My colleagues described this piece of code I wrote

Continue reading

Posted in Technology
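
For readers who want a feel for the seed-site idea in the crawler post above, here is a minimal sketch of a multi-threaded crawl using only the standard library. It is not the tinyDirtyIffyGoodEnoughWebCrawler.py script itself. The function names (`crawl`, `fetch`, `extract_links`) and the `max_pages` and `workers` parameters are assumptions chosen for illustration.

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(base_url, html):
    """Return absolute URLs for all links found in html."""
    parser = LinkParser()
    parser.feed(html)
    parser.close()
    return [urljoin(base_url, href) for href in parser.links]


def fetch(url):
    """Download a page and return its text (network access happens here)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(seed, fetch_page=fetch, max_pages=20, workers=4):
    """Crawl outward from seed, fetching each batch of URLs in parallel."""
    seen, frontier, pages = {seed}, [seed], {}

    def try_fetch(url):
        # A failed download should not stop the whole crawl.
        try:
            return fetch_page(url)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier and len(pages) < max_pages:
            batch = frontier[: max_pages - len(pages)]
            frontier = frontier[len(batch):]
            # pool.map returns results in the same order as batch.
            for url, html in zip(batch, pool.map(try_fetch, batch)):
                if html is None:
                    continue
                pages[url] = html
                for link in extract_links(url, html):
                    if link.startswith("http") and link not in seen:
                        seen.add(link)
                        frontier.append(link)
    return pages
```

Calling `crawl("http://cnn.com")` would start from that seed and return a dict mapping each fetched URL to its HTML. Passing your own `fetch_page` function makes it easy to try the logic without touching the network.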