<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" 
      xmlns:thr="http://purl.org/syndication/thread/1.0">
  <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php" />
  <link rel="self" type="application/atom+xml" href="http://www.readwriteweb.com/atom.xml" />
  <id>tag:www.readwriteweb.com,2011:/1/tag:www.readwriteweb.com,2009://1.15873-</id>
  <updated>2011-08-16T16:54:05Z</updated>
  <title>Comments for A New Commercial Ontology from Hakia</title>
  
  <generator uri="http://www.sixapart.com/movabletype/">Movable Type 4.35-en</generator>
  <entry>
    <id>tag:www.readwriteweb.com,2009://1.15873</id>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.readwriteweb.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=15873" title="A New Commercial Ontology from Hakia" />
    <published>2009-07-30T12:00:17Z</published>
    <updated>2009-07-29T16:31:40Z</updated>
    <title>A New Commercial Ontology from Hakia</title>
    <summary>Editor&apos;s note: we offer our long-term sponsors the opportunity to write &apos;Sponsor Posts&apos; and tell their story. These posts are clearly marked as written by sponsors, but we also want them to be useful and interesting to our readers. We hope you like the posts and we encourage you to support our sponsors by trying...</summary>
    <author>
      <name>RWW Sponsor</name>
      
    </author>
    
    <category term="Sponsors" />
    
    <content type="html" xml:lang="en" xml:base="http://www.readwriteweb.com/">
      <![CDATA[<p><a href="http://hakia.com/"><img src="http://www.readwriteweb.com/images/hakia_logo.png" width="150" height="67" /></a><em><strong>Editor's note:</strong> we offer our long-term sponsors the opportunity to write 'Sponsor Posts' and tell their story. These posts are clearly marked as written by sponsors, but we also want them to be <strong>useful and interesting</strong> to our readers. We hope you like the posts and we encourage you to support our sponsors by trying out their products.</em></p>

<p>We at <a href="http://hakia.com/">Hakia</a> are proud to announce our upcoming commercial ontology, perhaps the world's first. What is a commercial ontology? If you're asking this question you have just touched on an important distinction: fantasy versus reality. In the context of the Web, a commercial ontology is a realistic version of an ontology, as we explain below.</p>]]>
      <![CDATA[<h2>Realities of the Web</h2>

<p>Hakia has accomplished two important innovations in building its commercial ontology (CO). The first is the development of concepts and lexicons that follow strict guidelines based on the realities of Web operations. What are these realities? Most search queries on the Web reflect a single dimension of intent, almost exclusively relevant to commercial topics. "Commercial topics" here must be taken in the broadest sense possible. For example, if you were looking for "the benefits of foot massage" or "the director of the movie Last Emperor," your queries would fall into a commercial pattern. One particular distinction of commercial-pattern queries is that they come in short packages, including a name (onomasticon) or referring to something sold, bought, watched, heard, etc.</p>

<p>In contrast, many (if not all) ontologies that have been built to date (or claimed to exist) are focused on the use of language in the general sense, but not in the sense of commercial patterns on the Web. Therefore, their usefulness when tackling Web search queries is greatly compromised, sometimes to the point of absolute failure. If such an ontology could disambiguate a dozen different senses of the word "kill," it would be sad news if the last 100,000 queries in the search logs did not include a single occurrence of the word "kill." Like drowning in two-inch-deep water, such ontologies do not use their disambiguation capacities for nearly 80% of queries because the queries include nothing but onomasticons or are too short (under-articulated).</p>

<h2>The Sequence Approach</h2>

<p>The second innovation in the CO is the use of sequences instead of single words. A single word, like "kill," is information in its most ambiguous state and is rarely used in human communication without a strong implied context. As a result, building natural-language processing (NLP) systems that take individual words as units of computation is an invitation to disaster.</p>

<p>In contrast, word sequences (two or more words) are inherently safe and highly descriptive. Take "road kill," for example. This sequence describes the corpse of an animal killed on the road by a passing vehicle. If a language processing system takes the sequence of words as a unit of computation, 99% of the ambiguity problem vanishes. There is no need to process the words "kill" and "road" separately, trace their senses, and locate convergence to identify the meaning of "road kill" if you can just take the sequence "road kill" itself as your unit of computation for mapping. This is depicted below:</p>

<p><img src="http://www.readwriteweb.com/images/hakia_sponsor_jul09a.jpg" width="395" height="339" /></p>

<p>Note the number of traces required in a conventional ontology approach compared to the sequence approach. The sequence approach requires a lot of data storage space (which is dirt cheap), whereas the conventional ontology approach requires a lot of CPU time for a simple mapping task (which is expensive). But the bad news does not stop there. The trace routes in a conventional ontology require manual work (impossible to automate), whereas a sequence-based ontology can be easily built via automation.</p>
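
<p>The contrast above can be made concrete with a minimal sketch, assuming a tiny hand-made lexicon keyed by word sequences. The names <code>SEQUENCE_LEXICON</code> and <code>map_sequences</code>, and every entry in the table, are hypothetical illustrations rather than Hakia's implementation. The sketch only shows that a stored sequence can be resolved with a single lookup instead of tracing the senses of each word.</p>

<pre>
```python
# Hypothetical sequence lexicon: every name and entry here is an
# illustrative assumption, not Hakia's data or code.
SEQUENCE_LEXICON = {
    ("road", "kill"): "animal killed by a vehicle",
    ("kill", "switch"): "emergency shutoff mechanism",
    ("beat", "generation"): "literature",
}
MAX_SEQUENCE_LENGTH = max(len(key) for key in SEQUENCE_LEXICON)


def map_sequences(query):
    """Greedily map the longest known word sequences in a query to concepts."""
    words = query.lower().split()
    concepts = []
    position = 0
    while position != len(words):
        matched = False
        # Try the longest window first, down to two words.
        for length in range(MAX_SEQUENCE_LENGTH, 1, -1):
            window = tuple(words[position:position + length])
            if window in SEQUENCE_LEXICON:
                concepts.append((" ".join(window), SEQUENCE_LEXICON[window]))
                position += length
                matched = True
                break
        if not matched:
            position += 1  # single words are skipped, not disambiguated
    return concepts
```
</pre>

<p>Under these assumptions, <code>map_sequences("road kill on highway")</code> yields one concept for "road kill", while a lone "kill" yields an empty list because the sketch never guesses a sense for a single word.</p>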

<p>Perhaps not everyone will understand the second point above. Nevertheless, the scalability and performance of the end product will speak for themselves when Hakia puts the testing platform online.</p>

<h2>Usage of the Commercial Ontology</h2>

<p>The immediate use of the CO is for search queries, or document characterizations, that conventional systems do not tie to any advertising. This unrecognized domain of search queries and characterizations means loss of revenue. Hakia's CO is designed to fill this gap. For example, if the search query or page characterization is "beat generation," the CO can map it to "literature" on the fly. As a result, systems using the CO will have a much deeper understanding of the incoming terms, and thus will be able to recognize the underlying intent beyond the face value of the words. The same capability can be used in a number of places other than advertising with the same effect.</p>
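
<p>As a rough sketch of this usage, assume a small concept table and an ad inventory keyed by concept. Both tables and the function name <code>ads_for_query</code> are hypothetical and exist only for illustration. A query with no advertiser keyword of its own can still reach an ad through the concept it maps to.</p>

<pre>
```python
# Illustrative data only: the concept table and the ad inventory are
# assumptions made for this sketch, not part of Hakia's product.
CONCEPT_OF_SEQUENCE = {
    "beat generation": "literature",
    "digital camera": "consumer electronics",
}
ADS_BY_CONCEPT = {
    "literature": ["Online bookstore"],
    "consumer electronics": ["Camera retailer"],
}


def ads_for_query(query):
    """Return ads for a query through its mapped concept, or an empty list."""
    concept = CONCEPT_OF_SEQUENCE.get(query.strip().lower())
    if concept is None:
        return []  # unrecognized query: no concept, so no ads
    return ADS_BY_CONCEPT.get(concept, [])
```
</pre>

<p>Here "beat generation" reaches the bookstore ad through "literature", while a query missing from the table returns an empty list.</p>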

<p>Stay tuned for the release of the first version of <a href="http://hakia.com/">Hakia</a>'s commercial ontology.</p>]]>
    </content>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2009://1.15873-comment:171345</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2009://1.15873" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php#c171345" />
    <title>Comment from lionblue003 on 2009-11-29</title>
    <author>
        <name>lionblue003</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>I like it. Very much. And I can see how it has grown organically from where you were yesterday (and from before that too), which is cool.</p>]]>
    </content>
    <published>2009-11-30T01:25:57Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2009://1.15873-comment:158489</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2009://1.15873" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php#c158489" />
    <title>Comment from firewire on 2009-09-19</title>
    <author>
        <name>firewire</name>
        <uri>http://www.zoombits.co.uk/cables/</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.zoombits.co.uk/cables/">
        <![CDATA[<p>Hi Tom...<br />
I went through the whole article. It sounds interesting to me and I am trying to understand it fully.<br />
Please stay connected. Thanks for the post.<br />
</p>]]>
    </content>
    <published>2009-09-19T07:23:42Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2009://1.15873-comment:151333</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2009://1.15873" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php#c151333" />
    <title>Comment from Valentin on 2009-08-09</title>
    <author>
        <name>Valentin</name>
        <uri>http://vzach.de/blog</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://vzach.de/blog">
        <![CDATA[<p>Completely agree with Tom (#1), and you shouldn't be so fast to dismiss him. It's not an issue of "Twitter window attention span" but one of clear and precise language. </p>

<p>And btw, multi-word identifiers for concepts are not really a new idea; we did this in 2002 as part of the KAON system and I'm quite sure we weren't the first. If your sequence approach is something different from multiple (possibly multi-word) synonyms, I did not see it in the explanation. </p>

<p>Anyway - looking forward to a chance to actually try this out. </p>]]>
    </content>
    <published>2009-08-09T20:44:30Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2009://1.15873-comment:149804</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2009://1.15873" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php#c149804" />
    <title>Comment from Riza on 2009-07-30</title>
    <author>
        <name>Riza</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>Responding to #3 above.. <br />
Yes, you are right. However, it is not a dictionary; it is a concept-based ontology. Disambiguation is handled both at the data level (via sequences) and at the ontology level (where senses of sequences converge).</p>

<p>Responding to #2 above...<br />
The ontology is built based on the commercial value of the concepts. The concept of digital camera may be more important than the concept of German Opera in the commercial world, thus the former gets more refinement and detail in its ontological definition and lexicon space. </p>

<p>Responding to #1 above...<br />
Thanks for the constructive criticism. We will consider the readers with the attention span of a Twitter window next time. You've got a point.</p>]]>
    </content>
    <published>2009-07-30T21:27:28Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2009://1.15873-comment:149758</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2009://1.15873" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php#c149758" />
    <title>Comment from Elad Kehat on 2009-07-30</title>
    <author>
        <name>Elad Kehat</name>
        <uri>http://philobuster.wordpress.com/</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://philobuster.wordpress.com/">
        <![CDATA[<p>Sounds very cool. Can you tell a little more?<br />
My understanding is that you built a large dictionary of possibly multi-word terms, categorized into various meanings, with some disambiguation mechanism. Am I right?</p>]]>
    </content>
    <published>2009-07-30T16:43:09Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2009://1.15873-comment:149741</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2009://1.15873" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php#c149741" />
    <title>Comment from chris.spizzirri.myopenid.com on 2009-07-30</title>
    <author>
        <name>chris.spizzirri.myopenid.com</name>
        <uri>http://www.delawareediscovery.com/</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.delawareediscovery.com/">
        <![CDATA[<p>I agree with Tom, except I did slog through the whole article.  The technology sounds vaguely interesting, but I have no idea what it actually is or how it can be used.  It sounds like you've got an advanced, concept search method, or an improved method for analyzing search term trends for marketing purposes.  Am I close?</p>]]>
    </content>
    <published>2009-07-30T14:39:43Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2009://1.15873-comment:149734</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2009://1.15873" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/new_commercial_ontology_from_hakia.php#c149734" />
    <title>Comment from Tom_Fishman on 2009-07-30</title>
    <author>
        <name>Tom_Fishman</name>
        <uri>http://www.twitter.com/tom_fishman</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.twitter.com/tom_fishman">
        <![CDATA[<p>Just some constructive criticism: if you're going to use an unconventional term like 'Commercial Ontology' in a post pumping one of your products, you need to explain, CLEARLY, what that means right away.  "In the context of the Web, a commercial ontology is a realistic version of an ontology, as we explain below" is not an explanation--you used the word itself in the definition!  I stopped reading this post about six sentences in because at that point I realized I had gained no substantive information from what sounds like fluffy, pretentious double-speak.  I'd yank this post and re-write it if I were you.  Don't mean to be an a-hole, but I suspect I won't be alone on this.    </p>]]>
    </content>
    <published>2009-07-30T13:41:18Z</published>
  </entry>

</feed>
