All of us know about search engine crawlers, which crawl websites and collect data, primarily for organic search. There has been a surge of tools and portals that use this crawling capability to offer "compare" features, or, in an open market, expose the data via APIs or RSS feeds; one good example is http://www.mysupermarket.co.uk.
While a host of crawlers exist (http://scraping.pro/software-for-web-scraping) that let you configure an interface to read data from a specific site (i.e. specific URLs), there is a serious lack of tools that can go out across the whole web to get the data. In other words, how about a portal/tool that gives an end user an interface where they can simply say, "I want to buy a gold coloured iPhone 6"? I know a lot of people will say Google Shopping can do this. Yes, you are right, but it cannot do it for everything. Moreover, it is not particularly business friendly: it does not let one competitor see what another competitor is doing with prices, content, and so on.
With the crawlers available today, it is painful to configure each and every term you need to monitor; the sketch below shows how site-specific that configuration gets. Beyond that, if a site has enabled a CAPTCHA these crawlers will fail, and if the webmaster notices a traffic spike from a given IP, that IP gets blocked. Some providers claim they rotate through dynamic IPs and stay anonymous, but that is not foolproof either.
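To make the configuration pain concrete, here is a minimal sketch of what a per-site scraper typically looks like, in Python with requests and BeautifulSoup. Every site URL, CSS selector, and search term below is my own hypothetical example, not a real retailer; the point is that each site needs its own hand-maintained entry, and a CAPTCHA page or an IP block breaks it silently.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical per-site configuration: every retailer needs its own
# URL template and CSS selectors, maintained by hand. A site redesign
# breaks the selectors; a new monitored term means more configuration.
SITE_CONFIGS = {
    "retailer-a.example.com": {
        "search_url": "https://retailer-a.example.com/search?q={term}",
        "title_selector": "h2.product-title",
        "price_selector": "span.product-price",
    },
    "retailer-b.example.com": {
        "search_url": "https://retailer-b.example.com/find?query={term}",
        "title_selector": "a.item-name",
        "price_selector": "div.price > strong",
    },
}

def scrape_prices(site: str, term: str) -> list[tuple[str, str]]:
    """Fetch search results for one term from one pre-configured site."""
    cfg = SITE_CONFIGS[site]
    resp = requests.get(cfg["search_url"].format(term=term), timeout=10)
    resp.raise_for_status()  # a CAPTCHA page or IP block often surfaces here
    soup = BeautifulSoup(resp.text, "html.parser")
    titles = soup.select(cfg["title_selector"])
    prices = soup.select(cfg["price_selector"])
    return [(t.get_text(strip=True), p.get_text(strip=True))
            for t, p in zip(titles, prices)]

# One term, one site -- multiply this by every term and every retailer
# you want to monitor, and the maintenance burden becomes obvious.
print(scrape_prices("retailer-a.example.com", "gold iPhone 6"))
```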
Another massive drawback of the crawler approach is that it cannot take into account any effect on pricing from promotions or coupon codes.
I reckon there is a need to build such a system, if only to be transparent, competitive, and open to the user community: a system that accesses data through a fair, transparent, open-API-based approach. That would help not only end consumers but also bring vendors onto a competitive "fair trade" platform.
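As a sketch of what that open-API approach could look like from the consumer side, here is one possible shape for such a service. The endpoint, parameters, and JSON fields are all my own invention, not an existing API; the idea is that vendors publish their own data (promotions and coupon codes included) to a shared, transparent interface instead of being scraped.

```python
import requests

# Hypothetical open pricing API: one transparent endpoint that all
# participating vendors publish to, queried instead of scraping HTML.
OPEN_API_BASE = "https://api.open-prices.example.org/v1"

def find_offers(product: str, include_promotions: bool = True) -> list[dict]:
    """Query all participating vendors for a product in a single call.

    Because vendors push their own data, the scraping problems above --
    CAPTCHAs, IP blocks, stale selectors, invisible coupon discounts --
    disappear by design.
    """
    resp = requests.get(
        f"{OPEN_API_BASE}/offers",
        params={"q": product, "promotions": str(include_promotions).lower()},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["offers"]

# Compare every vendor's effective price, coupon codes included.
for offer in find_offers("gold iPhone 6"):
    print(offer["vendor"], offer["price"], offer.get("coupon_code", "-"))
```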