blekko - ReadWriteWeb http://www.readwriteweb.com/feeds/tag/blekko en Copyright 2009 Richard MacManus readwriteweb@gmail.com Mon, 23 Nov 2009 21:12:49 -0800 http://www.sixapart.com/movabletype/?v=4.23-en http://blogs.law.harvard.edu/tech/rss

After Cuil, Blekko Will Be More Careful - But Does It Matter?

My first post for ReadWriteWeb, just over 1 year ago, started with the premise that search was “game over”, that Google had won and the only space left was (re)search - what users do after the basic search.

None of the search start-ups since then has made me change my mind. None of the cool new user interface features or ways of expressing your search intentions matter one iota, if the core search proposition is not better from day one. Well, enter the latest contender: Blekko.

When Google launched, 10 years ago in 1998, there was no “new paradigm” or whizzy features - just a search box that worked better than the competition. The search competition bar is now way, way higher than it was back then.

Yet new search start-ups continue to get funded, even in what is a less frothy funding environment. Cuil(l) raised $33m. Looks like they blew it. In contrast, Blekko raised only $5m in two rounds. It is still in stealth mode, and one assumes they will play the hype game a bit more cautiously after the Cuil debacle.

The proposition that launched countless search start-ups was “if we can get just 1% of the search market we will have a very valuable business”. That may be true, but getting 1% has proved elusive. The reality is you either win big or fail totally in this game. There are no hedged positions in search. It is a really “non-trivial” technical problem.

Assuming the game is defined as the search infrastructure game, I think that game has been over for some time. The barriers to entry are just too high. An entrepreneur pitching a VC now has to answer the question “how do you avoid the Cuil problem?” Yahoo BOSS is the perfect play in the new game, with search infrastructure players offering their platform to developers. Hundreds of start-ups can make a decent business within less than 1% of the search market if the infrastructure is provided by somebody else. You don’t build operating systems, do you? You don’t build search infrastructure, do you?

You don’t, unless you believe that you really have disruptive technology. Blekko is one of the few remaining plays building search infrastructure. They must think they have that disruptive technology.

Blekko seem to understand the complexity of the challenge, judging from comments on their founder’s Blog (as their site says nothing, the Blog is the best source of insight into what they are up to):

“Search is an absolutely fascinating problem to work on for a bunch of reasons. For one thing you have to scale the thing before getting the first user. You can’t just start with a server or two and add more when the users come. Step 1 is to copy the internet onto your cluster. Step 2 is to analyze it..

The componentry is remarkably deep.

Search is like 7 hard problems wrapped into a stack. Distributed systems, html analytics, text analytics/semantics, anti-spam, AI/ML, frontend/UI. And scale…”

Skrenta’s Blog is well worth a thorough read if you are in the search game or just like hard technical problems. (As a historical footnote, his notes on Cuil, written well before the launch, make interesting reading.)

Later on he says:

“you don’t need a million servers and half of the phd’s in the field to build a search app. It takes 20 people and $5M of hardware…if you know what you’re doing.”

I totally buy the “It takes 20 people” bit. All my experience in software has confirmed that Frederick Brooks was totally right in The Mythical Man-Month that small teams always outperform large teams. I cannot imagine what more than 20 people would do other than get in each other’s way.

It’s the “you don’t need a million servers” bit that I am less certain about. Google invests billions of dollars in server farms. To compete with that, you have to have something fundamentally and totally disruptive. P2P enabled Skype to take on AT&T and Verizon. That was fundamentally and totally disruptive technology, and it enabled such a compelling value proposition that Skype got millions of consumers using it. That is why I was excited to see Faroo attempt this with P2P, but I can see that they fail the critical “has to be better than Google the day it launches” test.

Purely incremental improvements to the economics of crawling + indexing will not enable a new consumer search play. Saying “we only need $1 billion in infrastructure cost to compete out of the gate with Google and Google spent $3 billion” does not cut it with investors. Nobody will fund that $1 billion. However, incremental improvement is a great pitch to the big infrastructure players. If you can say “I can take 20% out of your infrastructure costs with my patented technology”, you will get your phone calls returned by Google, Yahoo and Microsoft. And one of them may offer to buy you for a big fat premium to prevent their rivals getting access to the same technology.

That is very, very different from launching a new consumer search engine.

In summary, I see 3 possible plays in search today:

1. Build search applications on top of Yahoo BOSS or equivalent offerings from Google or Microsoft. There is room for hundreds of niche, vertical start-ups, using search as a feature, not as the only proposition. I think Yahoo has a great shot at this as Google will suffer from cannibalization fears, so they won’t open up as much as Yahoo. Microsoft will undoubtedly play here as well; they are best at technology for developers.

2. Hard core search infrastructure technology sold to Google, Yahoo and Microsoft. That’s tough to get right as the technology has to be really, really good, the patent has to be rock solid and you have to be good at playing poker with the big guys.

3. The totally disruptive Skype style venture that nobody has heard about.

http://www.readwriteweb.com/archives/after_cuil_blekko_will_be_more_careful.php Analysis Wed, 30 Jul 2008 22:30:00 -0800 Bernard Lunn
11 Search Trends That May Disrupt Google

My first post for ReadWriteWeb (nearly a year ago) started with the premise that search was "game over", that Google had won and the only opportunity left was (re)search - i.e. what one does after the basic search. Unfortunately, none of the search start-ups since then has made a dent in Google's relentless march towards search market dominance. In this article, we outline 11 search trends that may change that.

The proposition that launched countless search start-ups was: "If we can get just 1% of the search market, we will have a very valuable business". That may be true, but getting 1% has proved elusive. It has been an all or nothing game.

That may be about to change.

It is possible that Google will not be beaten by one big competitor. It is possible that they will be pecked at by thousands of tiny start-ups using a new outsourced infrastructure.

But before getting to that punchline, here is my 11 point recap of the search market:

1. Disambiguation is (still) not enough motivation to switch. All those learned PhDs with backgrounds in natural language search and AI explaining that the words "paris" and "apple" have multiple meanings that Google cannot parse from a single search, massively miss the point. The average user has figured that out and either enters multiple words or refines the search based on the first search. Using natural language search - which is complex to code and expensive to process - is a classic "hammer to crack a nut" solution.

2. Webmaster push-back and basic economics will accelerate the trend towards an outsourced crawler market. Webmasters won't accept a proliferation of crawlers, as some of them may be malicious and all of them impact performance to some degree. Google, Yahoo and Microsoft (GYM) will always be accepted as they drive enough SEO, but marginal crawlers will struggle. Basic economics mean that only a very small number of players will be able to afford the giant server farms needed to index the whole Web. The YM parts of GYM (as well as Amazon) will increasingly offer their infrastructure to anybody who can build value on top.

3. Yahoo SearchMonkey may have arisen from desperation, but we may also be witnessing a "Linus moment". SearchMonkey is the most well-defined entry into the outsourced crawler market. It comes from Yahoo's recognition that it is too late to beat Google in a head-to-head battle, so it could be dismissed as a sign of desperation. However, I prefer to see it as a "Linus moment", that point in time when Linus Torvalds simply said "here is what I have done so far, anybody who can take it to the next step is welcome to try". To be truly disruptive, Yahoo may need to open this up even more than they have to date.

4. There will be many more attempts to monetize Wikipedia. Well-funded search ventures such as Powerset have retreated to the much narrower goal of searching Wikipedia. Freebase also uses Wikipedia as their core data. Walking around the RPI Web Science Research Initiative, I could see many interesting R&D experiments coming out of academia, all of which used Wikipedia as a base. Wikipedia has just enough structure and normalization to be useful. Above all, the History feature makes "data provenance" possible and that is critical for trust.

5. Core search is still getting funded. This is not what one would expect in what is by any definition a consolidated market with one mighty big gorilla sitting on top. Look at Blekko getting $2m without even a prototype to show the world. Are the investors nuts? Possibly, but they include some pretty smart guys like Marc Andreessen, and the founder, Rich Skrenta, is clearly a smart guy (his Blog is a good read). Or look at Cuill, which got $25m as recently as April. Maybe they are idealists tilting at windmills. Maybe they know something that the rest of us don't. Only time will tell. These new entrants will eschew any hype, which they know adds nothing to adoption.

6. Image search is another "hammer to crack a nut". Searching images, video and audio is one of those "non-trivial" computer science projects that great engineers love to tackle. However, great investors should steer clear. It is hard to code and incredibly expensive to process. The competition is tagging (see next point), which is a classic "just good enough and improving all the time at virtually no cost" solution that is impossible to beat.

7. Tagging is quietly but massively disruptive. The fact that thousands of webmasters and bloggers tag their content so that they can be found by Google is Google's secret weapon. But it could get turned against them. A small incentive to be found by other search engines will change tagging behavior. This is likely to play out in lots of vertical niches, where a small change in tagging behavior can make a huge difference in findability and that can make a big difference to both buyers and sellers. Whether people use RDF or Microformats or some other de facto vertical standard will continue to be the subject of much debate, but the format itself is not the issue. The human drive to tag (to order one's world) is deep and strong and has financial motivations as well.

8. A whitelist is a good way to kill spam. Spam is the big problem for search as well as email, and whitelists work well for both. In search this is done by a site that uses something like Google Custom Search Engine (or SearchMonkey) to define which sites to search within a defined domain. Even if that means defining 1,000 sites and adding new ones every day, that is well within the range that a single human curator can handle within a single market domain. The human curator deletes any spam sites manually.

9. P2P search could still be a long-term disrupter and Microsoft's route back to relevance. The only way to do search without putting all the Web's pages into one server farm is via P2P. I have written about Faroo's attempt here. It relies on .NET and this may be Microsoft's card to play, but only if Vista gets real traction. This is a real long shot, but an intriguing one.

10. There is tons of great data inside relational databases that is quite easy to search. It is the HTML layer that is getting in the way. As more sites learn how to expose their structured, relational databases as Web Services APIs, a lot more data will be available that does not rely on word search on HTML pages.

11. It's the AdWords, stupid! All the search wizardry doesn't matter a hoot if the monetization is not done right. There is plenty of motivation out there. Sellers want cheaper search words to buy. Publishers want a bigger piece of the cake. Buyers/searchers may even want cash back (we will see if Microsoft's crude tactic, lambasted in the Blogosphere, makes it in the real world).
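
To make point 8 concrete, here is a minimal sketch of how a curated whitelist might filter search results for one vertical. It is an illustration only, not how Google Custom Search Engine or SearchMonkey work internally; the site names, the `WHITELIST` set and all function names are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical curated list for one vertical; a human curator maintains it.
WHITELIST = {"example-cameras.com", "photo-reviews.example.org"}


def host_of(url):
    """Return the lower-cased hostname of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_whitelisted(url, whitelist):
    """True if the URL's host is a whitelisted site or a subdomain of one."""
    host = host_of(url)
    return any(host == site or host.endswith("." + site) for site in whitelist)


def filter_results(urls, whitelist):
    """Keep only results from whitelisted sites, preserving their order."""
    return [u for u in urls if is_whitelisted(u, whitelist)]


def remove_site(whitelist, site):
    """The curator's manual spam deletion: return the whitelist minus one site."""
    return whitelist - {site}
```

The point of the sketch is that the curator's work is set maintenance, not ranking: spam never has to be detected algorithmically because a site that is not on the list is never searched.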

Conclusion

Most of these trends point in the direction of search as infrastructure feeding thousands of innovators in niche markets - a long tail approach, in other words. Google will play in this infrastructure game - they already do with Google Custom Search - but it is vendors such as Yahoo, Microsoft and Amazon, with equally deep pockets and much more to lose from total Google dominance, who will be the disrupting innovators in this next phase of the search market.

Image credit: davemc500hats

http://www.readwriteweb.com/archives/11_search_trends.php Search Services Mon, 16 Jun 2008 14:45:57 -0800 Bernard Lunn