<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" 
      xmlns:thr="http://purl.org/syndication/thread/1.0">
  <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php" />
  <link rel="self" type="application/atom+xml" href="http://www.readwriteweb.com/atom.xml" />
  <id>tag:,2008:/1/tag:72.47.210.69,2007://1.5245-</id>
  <updated>2008-08-22T19:03:06Z</updated>
  <title>Comments for Overview of Clustering and Clusty Search Engine</title>
  
  <generator uri="http://www.sixapart.com/movabletype/">Movable Type 4.1</generator>
  <entry>
    <id>tag:72.47.210.69,2007://1.5245</id>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.readwriteweb.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=5245" title="Overview of Clustering and Clusty Search Engine" />
    <published>2007-01-05T18:56:29Z</published>
    <updated>2007-12-16T23:16:42Z</updated>
    <title>Overview of Clustering and Clusty Search Engine</title>
    <summary>Written by Alex Iskold Earlier this week we wrote about The Race to beat Google. In that article we discussed various approaches that startups are taking trying to unseat the web giant. In this post we are going to zoom in on one of the companies - Clusty and their search clustering technology. Before looking...</summary>
    <author>
      <name>Alex Iskold</name>
      <uri>http://www.adaptiveblue.com</uri>
    </author>
    
    <category term="Search Services" />
    
    <content type="html" xml:lang="en" xml:base="http://www.readwriteweb.com/">
      <![CDATA[<p><em>Written by Alex Iskold</em></p>
<p>
<img src="http://www.readwriteweb.com/images/clusty_logo.jpg" vspace="5" hspace="5" border="0" align="left"/> Earlier this week we wrote about <a href="http://www.readwriteweb.com/archives/the_race_to_beat_google.php">The Race to beat Google</a>. In that article we discussed various approaches that startups are taking trying to unseat the web giant. In this post we are going to zoom in on one of the companies - <a href="http://www.clusty.com">Clusty</a> and their search clustering technology. Before looking at the specifics of Clusty, we will discuss the issues with search at large and will give an overview of clustering.
</p>
<p><strong>What is perfect search?</strong></p>
<p>
It is interesting to ask: <strong>What do we expect when we enter a term into a search box?</strong> Ideally, we'd like to get the perfect answer right away. Often, we have an idea of what that perfect answer should be, and when the computer does not produce it we are disappointed. But are we being reasonable? Can we expect the "perfect" answer all the time?</p>
<p>Consider, for example, our interactions with an information clerk at the mall. When we ask for the location of a store, she may or may not give us the "perfect" answer. She might not know where the store is, she might not understand us, or we might not understand what she said. So, for many reasons, we may not get the "perfect" answer right away.
</p>]]>
      <![CDATA[<p>What is qualitatively different between our experience with the information clerk and our experience with a search engine is that
with the clerk we have a dialog. When she does not understand what we asked, she has a chance
to say <strong>Excuse me, what do you mean?</strong> Google does not do that; it just gives us the results. If we do not like the answer, we have to start from scratch.
</p>
<p>
The problem is that human interactions are fundamentally iterative, while our interactions with computers are
mostly stateless. Perhaps we could get to the perfect search results if we could have a dialog with the computer? Clustering technologies, particularly the one offered by Clusty, give the computer a chance to clarify:
<em>Excuse me, when you searched for Alex Iskold, did you mean to look for Read/Write Web or AdaptiveBlue, or perhaps you were looking for static analysis tools that Alex worked on while at IBM?</em>
</p>
<p><strong>What is clustering?</strong></p>
<p>
Clusters are a very common phenomenon, both in nature and in human society. Examples of clusters include
cities, galaxies, families and, of course, web sites focused on a similar topic. At its core, clustering is simply
grouping by similarity. A good visual way to think about clustering web sites is to picture a network, like the one shown below.
</p>
<p><img src="http://www.readwriteweb.com/images/clusty_p1.gif" vspace="5" hspace="5" border="0"/></p>
<p>The image above is from <a href="http://www.caida.org/Tools/Plankton">Bradley Huffaker Research</a></p>
<p>
There are many clustering techniques, and the exact ones used by Clusty or other search engines are certainly a secret. Here, however, is a simplistic view of how clustering works. Each web page is run through a statistical frequency analyzer that outputs a list of the most commonly occurring words and phrases. Each word and phrase then becomes a node in the network.
</p>
<p>
When two words occur in the same document, a link between them is formed. If the two words co-occur again, the weight of the link between them is increased. This process is repeated iteratively with billions of web pages.
The result is a network or, more mathematically speaking, a weighted graph. Since some words gravitate to each other more than others, this weighted graph will be clustered.
</p>
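<p>
The simplistic view described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Clusty's actual (secret) method: the toy documents, the function names <em>build_graph</em> and <em>clusters</em>, and the <em>min_weight</em> threshold are all hypothetical. Links are counted once per document, and a "cluster" is taken to be any group of words connected by links at or above the threshold.
</p>

```python
from collections import Counter
from itertools import combinations

# Toy documents standing in for web pages (hypothetical data).
DOCS = [
    "adaptiveblue founder startup browser",
    "adaptiveblue startup browser founder",
    "readwriteweb blog contributor article",
    "readwriteweb contributor article blog",
    "startup article",
]

def build_graph(docs):
    """Return a Counter mapping (word_a, word_b) pairs to co-occurrence counts."""
    weights = Counter()
    for doc in docs:
        # Each distinct word is a node; every pair in the same document is a link.
        words = sorted(set(doc.split()))
        for pair in combinations(words, 2):
            weights[pair] += 1  # a repeated co-occurrence strengthens the link
    return weights

def clusters(weights, min_weight):
    """Group words joined by links of at least min_weight (connected components)."""
    neighbours = {}
    for (a, b), w in weights.items():
        if w >= min_weight:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    seen, result = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk over strong links collects one cluster.
        group, stack = set(), [start]
        while stack:
            word = stack.pop()
            if word in group:
                continue
            group.add(word)
            stack.extend(neighbours[word] - group)
        seen |= group
        result.append(sorted(group))
    return result

graph = build_graph(DOCS)
word_clusters = clusters(graph, min_weight=2)
print(word_clusters)
```

<p>
With these toy documents, the weak link between "startup" and "article" (one co-occurrence) falls below the threshold, so the AdaptiveBlue words and the Read/WriteWeb words end up in separate groups. Lowering <em>min_weight</em> to 1 merges everything into a single cluster, which is one way to see why the link weights matter.
</p>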
<p><strong>The Basic Web Search with Clusty</strong></p>
<p>
It is remarkable that the clusters formed in this way capture meaning. For example, pages where Alex Iskold
is the founder of AdaptiveBlue will be distinct from the pages where Alex Iskold is described as a Read/WriteWeb contributor. Clusty takes advantage of this and uses clusters to refine the search. Every time we perform a search, Clusty pulls together the data from other engines like Ask, MSN and Wisenut. It then organizes the search results in a way that helps us navigate away from ambiguity and towards a specific cluster of results:
</p>
<p><img src="http://www.readwriteweb.com/images/clusty_p2.png" vspace="5" hspace="5" border="0"/></p>
<p>
The clusters appear in the left navigation bar, while the main results are shown in the center section. The clustering performed by Clusty is hierarchical, so within each cluster there are sub-clusters that the user can drill into. This is a good idea because it allows the user to further refine the results. When the user clicks on a cluster link, the results in the main section reload. All of this is positive for Clusty, but there are also things that need to be improved.
</p>
<p>
First, the names of the clusters need to be normalized. For example, when I drill into AdaptiveBlue, under the
results for Alex Iskold, I get:
</p>
<ul>
  <li>Demo</li>
  <li>Interview, CTO</li>
  <li>Alex Iskold is the founder and CEO of AdaptiveBlue</li>
  <li>Tagged, Technorati</li>
  <li>etc.</li>
</ul>

<p>
which is not intuitive. This may not be an easy thing to fix, but as is, it's just very difficult to understand. Another issue is rather cosmetic, but it also has a negative impact on the user experience: pages reload every time the user clicks on a cluster link. Using Ajax here would make the experience much more pleasant.</p>
<p><strong>Beyond the basic search</strong></p>
<p>
Clusty's technology is generic: it also allows the user to perform vertical searches for Blogs, Images, News, Jobs and Wikipedia. In addition to the same representation of results, they all have a feature called <em>Find in a cluster</em>. This is essentially a secondary search, which allows the user to slice the results by another criterion.
I particularly like the implementation here, which highlights the matching clusters:
</p>
<p><img src="http://www.readwriteweb.com/images/clusty_p3.png" vspace="5" hspace="5" border="0"/></p>
<p>
Another thing that we found interesting is the application of clustering to building tag clouds. In 2006 we saw a lot of sites offering tag clouds to help users navigate through popular topics. Clusty applied its technology to generating a cloud that can be used on any site:
</p>
<p><img src="http://www.readwriteweb.com/images/clusty_p4.png" vspace="5" hspace="5" border="0"/></p>
<p><strong>Does Clusty have traction?</strong></p>
<p>
Clusty's technology is certainly interesting, but is it popular? The company has been around for a while, but has not really been able to sway people away from Google. Clusty's Alexa rank is slightly above 5,000 now, but a quick comparison with Snap over the last year does not point to a bright future:
</p>
<p><img src="http://www.readwriteweb.com/images/clusty_p5.png" vspace="5" hspace="5" border="0"/></p>
<p><strong>Conclusion</strong></p>
<p>
So what do we make of this company and the clustering approach overall? We think that the approach has potential if done well, and that Clusty is on the right track. Being able to "have a dialog" with a computer by drilling into a subset of results is a good idea. However, the current implementation of Clusty needs to be perfected and polished before people will be willing to spend more time with it. So in principle this can work, but <a href="http://vivisimo.com/">Vivisimo</a>, the company behind Clusty, needs to figure out how to make it flawless.
</p>]]>
    </content>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41827</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41827" />
    <title>Comment from Adam Jusko on 2007-01-05</title>
    <author>
        <name>Adam Jusko</name>
        <uri>http://www.bessed.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.bessed.com">
        <![CDATA[<p>I don't use Clusty, but here's a superficial comment in their favor--they have an awesome logo.</p>]]>
    </content>
    <published>2007-01-05T19:36:24Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41828</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41828" />
    <title>Comment from Charles Knight on 2007-01-05</title>
    <author>
        <name>Charles Knight</name>
        <uri>http://www.CharlesKnightSEO.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.CharlesKnightSEO.com">
        <![CDATA[<p>I prefer Quintura.  Their UI is superior to Clusty's IMHO, and so I placed Quintura in the Top 10 of my Top 100 Search Engines. See www.quintura.com For a list of the entire Top 100 Search Engines for 2006, including the Top 10 and the #1 Search Engine of the Year; plus the Top 10 to watch in 2007 and more, email me at Charles@CharlesKnightSEO.com.  Thanks!</p>]]>
    </content>
    <published>2007-01-05T20:15:45Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41829</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41829" />
    <title>Comment from Steve Fleckenstein on 2007-01-05</title>
    <author>
        <name>Steve Fleckenstein</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>Thanks for the analysis.  You might take a look at Firstgov.gov, the US govt search engine.  It uses Clusty on top of MSN/Live search.  The "private label" approach might be the best direction for Vivisimo; rather than try to compete by offering Clusty as a general search site it could position it as a way to improve search results on other large scale sites.  Of course there's lots of competition in the corporate search space too...</p>]]>
    </content>
    <published>2007-01-05T21:53:22Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41830</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41830" />
    <title>Comment from rickdog on 2007-01-05</title>
    <author>
        <name>rickdog</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>The most visually advanced cluster engine is Grokker.  The Java-driven Zoomable Map View blows my mind.  It's worth a look-see: <a href="http://live.grokker.com/grokker.html." rel="nofollow">http://live.grokker.com/grokker.html.</a></p>

<p>Don't forget these:</p>

<p>Infocious - <a href="http://search.infocious.com" rel="nofollow">http://search.infocious.com</a></p>

<p>QueryServer - <a href="http://www.queryserver.com/QServerExe/QServer.exe/web.ini" rel="nofollow">http://www.queryserver.com/QServerExe/QServer.exe/web.ini</a></p>

<p>DumbFind - <a href="http://www.dumbfind.com/" rel="nofollow">http://www.dumbfind.com/</a></p>

<p>Mooter - <a href="http://www.mooter.com" rel="nofollow">http://www.mooter.com</a></p>]]>
    </content>
    <published>2007-01-06T00:37:33Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41831</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41831" />
    <title>Comment from NitinK on 2007-01-05</title>
    <author>
        <name>NitinK</name>
        <uri>http://blog.softwareabstractions.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://blog.softwareabstractions.com">
        <![CDATA[<p>I agree with Charles Knight above - Quintura uses the same paradigm of clustering results, but has a supercool UI that allows you to dynamically move between the various clusters.</p>

<p>Metamojo addresses this problem of richer specification of search criteria in a <a href="http://blog.softwareabstractions.com/the_software_abstractions/2006/10/metamojo_vertic.html" rel="nofollow">slightly different way</a> - the user can specify a "Category" to qualify the search results, and the engine also provides a rich results set (video, reference lists, blogs etc.).</p>]]>
    </content>
    <published>2007-01-06T02:00:34Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41832</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41832" />
    <title>Comment from John Smythe on 2007-01-05</title>
    <author>
        <name>John Smythe</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>I would not draw any conclusions from Alexa data...  If you google (clusty?) the latest research you'll find that the alexa rank can be easily 'gamed'/influenced.  </p>

<p>That said: I like clusty, and sure hope it will survive.</p>]]>
    </content>
    <published>2007-01-06T05:53:10Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41833</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41833" />
    <title>Comment from Emre Sokullu on 2007-01-06</title>
    <author>
        <name>Emre Sokullu</name>
        <uri>http://emresokullu.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://emresokullu.com">
        <![CDATA[<p>I agree, Clusty really works, I like it.</p>]]>
    </content>
    <published>2007-01-06T08:20:36Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41834</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41834" />
    <title>Comment from Willie on 2007-01-06</title>
    <author>
        <name>Willie</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>Hi, could you explain the following sentence a little?<br />
"Since some words gravitate to each other more - this weighted graph will be clustered."</p>

<p>How is it clustered? I thought by your explanation that the space of the above diagram is the words in the set of documents and each word will be a point in the space. Lines are drawn from one point to another if the corresponding words are in the same document. The more frequently the words occur together the bolder the line between them, so that the boldness of the lines will indicate strong relationships or clusters of related words. But visually this kind of map does not really produce a "cluster". I could see that if you move words closer to each other rather than increasing the weight of the line that this would produce clusters, but not as was described above.<br />
Thanks.</p>]]>
    </content>
    <published>2007-01-06T10:24:56Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41835</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41835" />
    <title>Comment from Adrian keys on 2007-01-06</title>
    <author>
        <name>Adrian keys</name>
        <uri>http://www.jollyjo.org</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.jollyjo.org">
        <![CDATA[<p>Seriously, I do believe that Google has too much of a big jump...too much cash...and way too much branding. In this particular niche of search...some of these companies would do well to heed the advice. It's simply a case of picking your battles carefully....</p>]]>
    </content>
    <published>2007-01-06T11:22:10Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41836</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41836" />
    <title>Comment from Hashim on 2007-01-06</title>
    <author>
        <name>Hashim</name>
        <uri>http://www.hiphop-blogs.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.hiphop-blogs.com">
        <![CDATA[<p>One thing I like about clustering is that even if you don't use the clusters it still gives you a ton of info about your subject at a glance.</p>

<p>Alex, I'm surprised you didn't mention Google's use of labels when you perform certain searches, like for video games. It seems that Google is willing to adopt the clustering mindset, yet they are using human intelligence to do so.</p>]]>
    </content>
    <published>2007-01-06T12:13:30Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41837</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41837" />
    <title>Comment from Alex Iskold on 2007-01-06</title>
    <author>
        <name>Alex Iskold</name>
        <uri>http://www.adaptiveblue.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.adaptiveblue.com">
        <![CDATA[<p>@Willie </p>

<p>Imagine that the nodes in the graph are physical tennis balls attached by springs. The boldness of the line equals the strength of the spring.</p>

<p>Alex</p>]]>
    </content>
    <published>2007-01-06T13:15:58Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41838</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41838" />
    <title>Comment from Adeelnkhan on 2007-01-06</title>
    <author>
        <name>Adeelnkhan</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>when it comes to search, the future is visual</p>

<p>google just acquired a visual recognition, face/objection company</p>

<p>the future of the search should be --> www.liveplasma.com</p>]]>
    </content>
    <published>2007-01-06T15:00:49Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41839</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41839" />
    <title>Comment from nightcleaner on 2007-01-06</title>
    <author>
        <name>nightcleaner</name>
        <uri>http://nightcleaner.blogspot.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://nightcleaner.blogspot.com">
        <![CDATA[<p>Another thing about Quintura. In addition to presenting keywords in a word/label cloud, it is possible to add words/labels directly to the cloud. This changes both the cloud and the result set. See my blog entry on the subject at <a href="http://nightcleaner.blogspot.com/2007/01/google-is-boring.html" rel="nofollow">http://nightcleaner.blogspot.com/2007/01/google-is-boring.html</a></p>

<p>Interactive clustering is perhaps more interesting than Clusty clustering.</p>

<p>Also, it may be more satisfying to work with Quintura clouds than with Clusty clusters because they are visual. Also, the clouds will become 3-d before too long which will be very cool.</p>]]>
    </content>
    <published>2007-01-06T15:16:13Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41840</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41840" />
    <title>Comment from John Doe on 2007-01-06</title>
    <author>
        <name>John Doe</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>I like Clusty very much, but their management is very slow to implement. Since the dawn of Clusty/Vivisimo, the entire transition has been negative.</p>

<p>Clusty was originally supposed to be used for boasting their search product, instead they are focusing on monetizing Clusty through ads, big mistake, it has now become a second hand engine with no consistency to the search standards.</p>

<p>Great technology but bad marketing = great engineers, but bad management.</p>

<p>....</p>]]>
    </content>
    <published>2007-01-06T22:10:49Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41841</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41841" />
    <title>Comment from Franciov on 2007-01-06</title>
    <author>
        <name>Franciov</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>I think clustering could help people to start obtaining information from web search engines, and not just webpages.</p>

<p>[...] For more details I refer to this Overview of Clustering and Clusty Search Engine on Read/Write Web. [...]</p>]]>
    </content>
    <published>2007-01-07T02:20:24Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41842</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41842" />
    <title>Comment from Franciov on 2007-01-06</title>
    <author>
        <name>Franciov</name>
        <uri>http://franciov.altervista.org/blog/2007/01/07/clusty-and-clustering/</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://franciov.altervista.org/blog/2007/01/07/clusty-and-clustering/">
        <![CDATA[<p>[...] For more details I refer to this Overview of Clustering and Clusty Search Engine on Read/Write Web. [...]<br />
:P</p>]]>
    </content>
    <published>2007-01-07T04:25:11Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41843</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41843" />
    <title>Comment from Stanislaw Osinski on 2007-01-07</title>
    <author>
        <name>Stanislaw Osinski</name>
        <uri>http://stanislaw.osinski.name</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://stanislaw.osinski.name">
        <![CDATA[<p>You may also want to take a look at the <a href="http://www.carrot2.org" rel="nofollow">Open Source clustering engine</a> at:</p>

<p><a href="http://www.carrot2.org" rel="nofollow">http://www.carrot2.org</a></p>

<p>For more details about the Carrot2 project and the source code, please see the project website at:</p>

<p><a href="http://project.carrot2.org" rel="nofollow">http://project.carrot2.org</a></p>]]>
    </content>
    <published>2007-01-07T10:27:20Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41844</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41844" />
    <title>Comment from nightcleaner on 2007-01-07</title>
    <author>
        <name>nightcleaner</name>
        <uri>http://nightcleaner.blogspot.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://nightcleaner.blogspot.com">
        <![CDATA[<p>Clusty segments a result set. Pick a cluster and the result set is narrowed/refined.</p>

<p>Quintura grows/flows a result set. Hover over a label in the current cloud and you preview a new cloud/result set.</p>

<p>Select the second label to go with the original keyword, and you "AND" the two clouds and their result sets.</p>

<p>This raises the question whether Clusty, in fact, adds knowledge to search. Or, alternatively, if it is simply boring.</p>]]>
    </content>
    <published>2007-01-07T12:12:31Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41845</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41845" />
    <title>Comment from PyramidView on 2007-01-07</title>
    <author>
        <name>PyramidView</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>Clusty is my preferred search engine.  It works wonderfully with single-word searches.  The clusters allow me to look at only the subsets that interest me.  However, results become fractured, and rather muddled, when given multiple term searches and terrible when using quotation marks.</p>]]>
    </content>
    <published>2007-01-07T12:55:21Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41846</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41846" />
    <title>Comment from jason on 2007-01-07</title>
    <author>
        <name>jason</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>A search on the term "exchange" seems to be too hard for Quintura to cluster.  Clusty handles this ok.  Of course, this is just a single data point.</p>]]>
    </content>
    <published>2007-01-07T22:50:58Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41847</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41847" />
    <title>Comment from Ravi Mhatre on 2007-01-07</title>
    <author>
        <name>Ravi Mhatre</name>
        <uri>http://lsvp.wordpress.com/2007/01/05/is-google-unassailable-if-so-why-are-vcs-chasing-search/</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://lsvp.wordpress.com/2007/01/05/is-google-unassailable-if-so-why-are-vcs-chasing-search/">
        <![CDATA[<p>Nice write up.  I couldn't agree more that "stateless" query results, while an important tool in addressing a user's navigational search needs, are not very good at addressing "discovery oriented searches" where a computer must simulate a more iterative or interactive dialog with the user to provide a meaningful response.  Clustering can help in solving this problem.  I've provided some additional examples of the problem and possible implications for Google in a recent post on the Lightspeed Blog (see link above).</p>]]>
    </content>
    <published>2007-01-07T23:56:32Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41848</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41848" />
    <title>Comment from Raul on 2007-01-08</title>
    <author>
        <name>Raul</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>Vivisimo's (and Clusty's) clustering is secret, but it doesn't depend on offline clustering of words taken from web pages.  Instead, it clusters search results based on the overall similarity of one search result (title and snippet) to another.  That's why it works regardless of the language (English, Japanese at Clusty.jp, etc.) and why good results are obtainable on content that's never been seen before (as on corporate intranets).</p>

<p>I'm curious: why are the results under AdaptiveBlue unintuitive?</p>]]>
    </content>
    <published>2007-01-08T22:36:48Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41849</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41849" />
    <title>Comment from Alex Iskold on 2007-01-08</title>
    <author>
        <name>Alex Iskold</name>
        <uri>http://www.adaptiveblue.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.adaptiveblue.com">
        <![CDATA[<p>@Raul,</p>

<p>The set that I got was:</p>

<p># Demo<br />
# Interview, CTO<br />
# Alex Iskold is the founder and CEO of AdaptiveBlue<br />
# Tagged, Technorati</p>

<p>I do not see this as a list that I can easily comprehend.</p>

<p>Alex</p>]]>
    </content>
    <published>2007-01-09T02:26:17Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41850</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41850" />
    <title>Comment from Phill Midwinter on 2007-01-11</title>
    <author>
        <name>Phill Midwinter</name>
        <uri>http://phillmidwinter.wordpress.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://phillmidwinter.wordpress.com">
        <![CDATA[<p>Clustered search engines are very old, they're just a directory displayed in a different fashion. Sorry clusty, but no luck.</p>]]>
    </content>
    <published>2007-01-11T14:41:54Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.5245-comment:41851</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.5245" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/overview_of_clu.php#c41851" />
    <title>Comment from eas on 2007-01-12</title>
    <author>
        <name>eas</name>
        <uri>http://geekfun.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://geekfun.com">
        <![CDATA[<p>Phill, it's not how they are displayed, it's how they are built.  I suppose you could call a clustered search engine a "directory" with a different interface, but directories are typically built with direct human involvement.  Clustering is done automatically, and is centered around the pages matching an arbitrary search term, rather than some editor's top-down view of how the world should be ordered.</p>]]>
    </content>
    <published>2007-01-12T22:16:07Z</published>
  </entry>

</feed>