The Blog

5 Ways to Measure the Impact of Crawled Web Data on Your Business

Posted on July 27, 2016

The analysis you provide is only as good as the raw data you start with. Although data from the open web is often perceived as a commodity, not all crawled data is created equal. Whether you’re relying on a proprietary crawling technology, tapping into a vendor’s firehose, or implementing a combination of both strategies –

Posted in Big Data, Technology

How to Keep Your Restaurant Sentiment Analysis Well-Fed

Posted on April 6, 2016

When the team from London-based data analysis service GetSentiment developed a bleeding-edge system to measure the emotional baggage found in free text, they were missing just one thing: relevant data. “We were looking for a data provider that would be able to give access to sufficiently large amounts of frequently updated mentions of brands,” recalls

Posted in Uncategorized

Webhose.io helps Observify expand their coverage and add a new angle to their already rich offering.

Posted on March 17, 2016

We had the pleasure of speaking to Karl from Observify to understand a bit more about them, and also why and how they use Webhose.io. A bit about Observify: “Observify is a fast growing company on a mission to relieve their clients of their analytical headaches. We’re shaking up the social and web listening

Posted in Uncategorized

How to Create a Custom RSS Feed for Content Monitoring

Posted on March 3, 2016

Imagine that you had the ability to track what’s being said, felt and published about a given topic, industry or brand. Whether you’re in marketing, sales, search engine optimization, management or just a curious person, there are some major benefits to staying on top of the latest discussions, trends, issues and developments happening in your

Posted in Uncategorized

How Crawled Data Gave One News Outlet the Edge in the Israeli Election

Posted on February 18, 2016

In the spring of 2015, as Israel prepared for general elections, virtually all of the mainstream media analysts believed that change was in the air. Conventional wisdom at that time had it that the Israeli populace was ready to turn its back on Prime Minister Benjamin Netanyahu and the government led by his Likud Party

Posted in Uncategorized

goPRit and Webhose.io

Posted on February 9, 2016

Startups, small businesses, and even enterprise-level organizations all need publicity not only to survive, but to thrive. Why? Because the truth is … without an audience, even the best product won’t win. Owners and founders know this, which is why they hire PR firms to help them get the word out and cultivate valuable relationships

Posted in Uncategorized

The 15 Data Experts You Should be Following on Twitter

Posted on January 14, 2016

Twitter is a phenomenal place not only to connect with peers in the analytics industry but also to follow and learn from its leading authorities. Unfortunately, the Twitter marketplace is crowded, and trying to wade through and research exactly who’s who on your own is overwhelming. Even worse is making your Twitter decisions based on

Posted in Big Data, Technology

Five Reasons a News Crawler Is Essential to Your Business

Posted on January 5, 2016

“Originality is the art of remembering something but forgetting where you heard it.” Case in point, I don’t remember where I heard that. Nonetheless, it’s absolutely true, especially when it comes to running an online business. Why? Because in today’s online marketplace, sales, brand management, and genuine engagement are all practices that shouldn’t begin with

Posted in API, Big Data

Extracting Data from Forums: 3 Sources to Discover What Your Market Really Thinks

Posted on December 29, 2015

Robert Collier, the great ad man of the early 20th century, once summarized the secret of all effective marketing as entering “the conversation already taking place in the customer’s mind.” That’s powerful advice … and difficult. Why? Because most of the sources we normally turn to for market research are woefully incomplete. For example, surveys

Posted in Big Data

How to Extract Data from a Website: 5 Steps to Transform Unstructured Data into Business Insights

Posted on December 8, 2015

Big data is big business. And for good reason. As Harvard Business Review recently reported, an exhaustive study of 330 North American companies led by the MIT Center for Digital Business in conjunction with McKinsey’s Business Technology Office revealed that the use of data in business decisions like product development, hiring and firing, as well

Posted in Big Data, Technology

Social Media Analytics: Insights from Structured versus Unstructured Data

Posted on December 1, 2015

Let’s be honest … social media is a challenge. Not only is staying current, active, and “topped off” a chore, but crafting full-scale campaigns that contribute to your business’ and brand’s actual goals can be bewildering. At the same time, the market for social media continues to grow. According to recent data from eMarketer, “Social Network

Posted in API, Big Data

Dead simple {for devs} python crawler (script) for extracting structured data from any website into CSV

Posted on August 16, 2015

In my previous post I wrote about a very basic web crawler that can randomly scour the web and mirror/download websites. Today I want to share with you a very simple script that can extract structured data from (almost) any website. Use the following script to extract specific information from any website (e.g., prices, ids, titles,

Posted in Technology
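To give a rough idea of what such an extraction script involves, here is a minimal sketch, not the script from the post itself. It assumes the page markup is regular enough to match with a single regular expression; the sample HTML, the `ITEM_PATTERN` pattern, and the function names are all hypothetical and exist only for illustration.

```python
import csv
import io
import re

# Hypothetical sample markup; a real page would be fetched over HTTP first.
SAMPLE_HTML = """
<div class="item"><span class="title">Blue Widget</span><span class="price">$9.99</span></div>
<div class="item"><span class="title">Red Widget</span><span class="price">$12.50</span></div>
"""

# One named group per CSV column (an assumed pattern for the sample above).
ITEM_PATTERN = re.compile(
    r'<span class="title">(?P<title>.*?)</span>'
    r'<span class="price">\$(?P<price>[\d.]+)</span>',
    re.S,
)


def extract_rows(html):
    """Return one dict per matched item, keyed by the pattern's group names."""
    return [match.groupdict() for match in ITEM_PATTERN.finditer(html)]


def rows_to_csv(rows, fieldnames):
    """Serialize the extracted rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract_rows(SAMPLE_HTML)
csv_text = rows_to_csv(rows, ["title", "price"])
print(csv_text)
```

Regular expressions are brittle against markup changes, so for anything beyond a quick one-off a proper HTML parser is usually the safer choice.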

Tiny basic multi-threaded web crawler in Python

Posted on August 12, 2015

If you need a simple web crawler that will scour the web for a while to download random sites’ content – this code is for you. Usage: $ python tinyDirtyIffyGoodEnoughWebCrawler.py http://cnn.com Where http://cnn.com is your seed site. It could be any site that contains content and links to other sites. My colleagues described this piece of code I wrote

Posted in Technology
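For readers who want a feel for the idea before opening the post, here is a minimal sketch of a multi-threaded crawler using only the standard library. It is not the code from the post: the names `crawl`, `extract_links`, and `fetch`, and the `max_pages` and `workers` parameters, are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def extract_links(base_url, html):
    parser = LinkParser(base_url)
    parser.feed(html)
    return parser.links


def fetch(url):
    """Download a page as text (default fetcher; replaceable for testing)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def _try_fetch(fetch_page, url):
    # A failed download should not stop the whole crawl.
    try:
        return fetch_page(url)
    except Exception:
        return None


def crawl(seed, fetch_page=fetch, max_pages=20, workers=4):
    """Breadth-first crawl from seed, fetching each batch of URLs in parallel."""
    seen = {seed}
    frontier = [seed]
    pages = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier and len(pages) < max_pages:
            batch = frontier[: max_pages - len(pages)]
            frontier = frontier[len(batch):]
            results = pool.map(lambda url: _try_fetch(fetch_page, url), batch)
            for url, html in zip(batch, results):
                if html is None:
                    continue
                pages[url] = html
                for link in extract_links(url, html):
                    if link.startswith("http") and link not in seen:
                        seen.add(link)
                        frontier.append(link)
    return pages
```

Calling `crawl("http://cnn.com")` would start from the same seed site as the usage line above. The sketch ignores robots.txt and rate limiting, both of which a crawler run against real sites should respect.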

How we quadrupled the performance of Elasticsearch

Posted on July 19, 2015

Well, that’s a misleading title. We actually quadrupled the performance of our brand monitoring alert system that uses Elasticsearch’s Percolator, but that would have been a much longer title. Some background: Buzzilla has two main products. The first is Webhose.io, which provides businesses worldwide access to structured data from the open web, and the second

Posted in Technology

Webhose.io Tips & Tricks: Search for Reviews

Posted on December 10, 2014

Are you looking to focus your data search specifically on consumer generated reviews? Here are a couple of simple Webhose.io tricks that might help. Limit your query to specific sites: you can limit your search to specific “review sites” like amazon.com, bestbuy.com, newegg.com, cnet.com, engadget.com, pcmag.com, etc. Here is an example of how you should

Posted in Technology
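As a rough illustration of the site-limiting trick, the sketch below assembles a query string that restricts a keyword search to a list of review sites. The `site:` filter syntax and the `OR` grouping are assumptions about the query language, and the helper names are hypothetical; check the Webhose.io documentation for the exact syntax before relying on this.

```python
from urllib.parse import quote

# Review sites named in the post above.
REVIEW_SITES = [
    "amazon.com",
    "bestbuy.com",
    "newegg.com",
    "cnet.com",
    "engadget.com",
    "pcmag.com",
]


def build_site_filter(sites):
    """Join sites into a grouped filter, e.g. (site:a.com OR site:b.com).

    The site: operator is an assumed query syntax, not confirmed by the post.
    """
    return "(" + " OR ".join(f"site:{site}" for site in sites) + ")"


def build_review_query(keyword, sites):
    """Combine a quoted keyword with the site filter."""
    return f'"{keyword}" {build_site_filter(sites)}'


query = build_review_query("wireless headphones", REVIEW_SITES[:2])
encoded = quote(query)  # percent-encode for use as a URL query parameter value
print(query)
```

The encoded string can then be passed as the query parameter of whatever search endpoint you are using.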