ReadWriteWeb

Cognition Announces "World's Largest Semantic Map"

Written by Richard MacManus / September 16, 2008 9:55 AM / 8 Comments

Cognition Technologies, a Semantic Web company that specialises in Natural Language Processing (NLP) search, is today announcing the release of what it claims is "the largest commercially available Semantic Map of the English language." We interviewed Cognition CEO Scott Janus to find out what this means.

We also discovered that Cognition, which currently licenses its technology to other organizations, is planning to build a general consumer search engine - which will compete with Google and others.

What is a Semantic Map?

A Semantic Map is somewhat like a dictionary, in that it represents what Cognition's technology is able to define. Cognition claims that its Semantic Map has over 10 million semantic connections; over 4 million semantic contexts (word meanings that create contexts for specific meanings of other related words); over 536,000 word senses (word and phrase meanings); 75,000 concept classes (or synonym classes of word meanings); 7,500 nodes in the technology's ontology or classification scheme; and 506,000 word stems (roots of words) for the English language.

Image from Cognition

The company says that its Semantic Map "is more than double the size of any other computational linguistic dictionary for English".

Cognition Technologies has been working on its technology for 24 years, with a lot of input from lexicographers and linguists over that time. Because it has used a mix of algorithms and human input, Cognition has been able to discern relevancy, meaning and synonymy. Scott Janus told us that one of Cognition's strengths is that it can disambiguate words and phrases, which Janus says differentiates it from the keyword and pattern matching algorithms of Google, Yahoo and others.

For example, Janus told us that Cognition's technology can find results even if the exact words are not used - which he says Google can't do.
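To make that claim a little more concrete, here is a minimal sketch of the general idea of matching on synonym classes rather than on literal keywords. It is a toy illustration only: the concept classes, documents and function names below are all made up for this example, and nothing here reflects Cognition's actual data or algorithms. It also covers only synonymy, not the disambiguation Janus describes.

```python
# Toy illustration only: hand-made synonym classes, NOT Cognition's data or method.
CONCEPT_CLASSES = {
    "legal_professional": {"lawyer", "attorney", "counsel"},
    "physician": {"doctor", "physician", "medic"},
}

# Invert the classes into a word -> concept lookup table.
WORD_TO_CONCEPT = {
    word: concept
    for concept, words in CONCEPT_CLASSES.items()
    for word in words
}


def concepts_in(text):
    """Return the set of concept classes mentioned in a piece of text."""
    tokens = text.lower().split()
    return {WORD_TO_CONCEPT[t] for t in tokens if t in WORD_TO_CONCEPT}


def keyword_search(query, docs):
    """Return documents that share at least one literal word with the query."""
    query_words = set(query.lower().split())
    return [d for d in docs if query_words & set(d.lower().split())]


def concept_search(query, docs):
    """Return documents that share at least one concept class with the query."""
    query_concepts = concepts_in(query)
    return [d for d in docs if query_concepts & concepts_in(d)]


# Hypothetical two-document collection.
DOCS = [
    "the attorney filed a motion",
    "the physician reviewed the chart",
]

print(keyword_search("lawyer", DOCS))  # no document contains the word "lawyer"
print(concept_search("lawyer", DOCS))  # matches via the shared synonym class
```

In this sketch the keyword search for "lawyer" comes back empty, while the concept search returns the document that mentions "attorney", because both words sit in the same hypothetical synonym class.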

Cognition Plans General Search Engine

The comparisons to Google led us to ask the obvious question: does Cognition's semantic technology have a more general application? In other words, does Cognition plan to take on Google by creating a search engine for consumers? CEO Scott Janus replied that yes, the company does plan to "one day offer search on the general web". However, he said that it needs more capital funding to index the entire Web, put infrastructure in place, etc.

For now, Cognition will continue to license its semantic technology to verticals like law and health. Janus told us that Cognition is "good for complex content where lot of synonyms are used", so right now it is aiming at data-intensive industries.

Cognition's current applications include legal (e.g. LexisNexis Concordance's case management), health (e.g. MEDLINE), and a semantically charged version of Wikipedia.

Image from Cognition

Cognition vs Powerset and Hakia

Two other Semantic search engines we've been tracking closely on ReadWriteWeb are Powerset and Hakia. We asked CEO Scott Janus what makes Cognition different from those two products.

In a nutshell, Janus says that Cognition's Semantic Map is bigger and better.

Specifically, he said that Powerset is actually "not so similar" to Cognition. According to Janus, Powerset does "parsing" - which it licensed from Xerox PARC. That is 20-25% of the solution, said Janus, but Powerset "doesn't have a good semantic map". Cognition went so far as to write a white paper (pdf) explaining why it thinks Powerset "misses the point".

As for Hakia, Janus said that as far as he can see Hakia is focused on "ontological classifications" - classifying words and concepts together. But he says Hakia doesn't have as full a semantic map as Cognition, so he thinks Cognition has "a better understanding" compared to Hakia.

In summary, Janus told us that semantic search companies "must include a comprehensive semantic map" to be successful. We're sure that Powerset and Hakia will have different opinions on what makes a successful semantic search company, but it does make for a good differentiator for Cognition.

Open Question

Tell us in the comments what you think of Cognition and whether you think it can compete with Google in the long run.




Comments


  1. In summary, Janus told us that semantic search companies "must include a comprehensive semantic map" to be successful. We're sure Powerset and Hakia would disagree with that...

    That there may be more than one way to actually wield a semantic map in a software environment is clear. It may even be that there is such a thing as "comprehensive." But semantic search, by definition, requires such a map. I.e. no map = no semantic search. This is NOT a matter of opinion, and to suggest that Powerset and Hakia would "disagree" is deceptive and leads to more "truthiness."

    The world is confusing and difficult enough to understand without more of such false debate. Please let words and concepts stand for what they mean, especially in an article on semantics. Perhaps actually checking with Powerset and Hakia might lead to a deeper analysis in this vitally important new field?

    Posted by: Pieter B. Ruiter | September 16, 2008 11:10 AM



  2. They are just experimenting with what seems impractical. Do they think that Google cannot invest in natural-language search, with more than 450,000 servers in multiple countries around the world? Google could index the entire web but doesn't, as most visitors don't need everything. Google indexes those pages which it finds have practical importance and are searched for by millions. So optimizing resources means serving not everybody, but most of them.

    Anyway, I wish them success with their project.

    Posted by: Janet | September 16, 2008 12:04 PM



  3. Pieter, I did ask Hakia for an opinion, but haven't heard back. What I was trying to suggest, and perhaps I worded it a bit ambiguously (!), is that Powerset and Hakia may have differing opinions than Cognition about what makes a successful semantic search company. Cognition obviously thinks its large Semantic Map makes them the best. But is it only about, ahem, the size of your Semantic Map?

    What do *you* think Pieter?

    Posted by: Richard MacManus | September 16, 2008 12:16 PM



  4. With all due respect, Cognition should not make such a comment unless they are hacking our servers and comparing what we have with what they have. Competition is good and useful, however such remarks contribute to information pollution, something we are all trying to avoid at all cost.

    hakia is deploying Ontological Semantics (OntoSem), which is well documented at labs.hakia.com. We even have an interactive section to display its complex structure. OntoSem is not a classification method. It is a network of concepts reflecting ontology. The OntoSem parser produces a Text-meaning-representation (TMR) that outlines which concepts exist, how they are related, and what senses of the words were used in a given text. We cover more than 100,000 senses (not words) in English, not including onomasticons (name lists). Including onomasticons, our coverage is over a million words in English.

    The concept map of OntoSem is language independent and can be populated by any lexicon (currently English only).

    If Cognition was truly interested in how hakia solves the same problem, I would expect much better understanding by reading our technology pages, and a more scientific comment other than "we understand better."

    There is no silver bullet for a semantic solution that will succeed. Different approaches can be successful as long as the system developed is scalable, and imposes minimum reliance on "words". The sheer size of the collection of words or concepts does not represent, by any means, the capability of the system.

    From what I see in Cognition's drawing, it looks like they have meshed up morphology, syntax, and ontology into one structure, which lacks the required modularity and independence.

    Officer + responsible + cover up

    The entire sequence must be analyzed as a whole to find what knowledge it represents in the ontology, then the word senses of each must be extracted accordingly. Our drawing for this example would have one box below to denote this process, then would have 3 individual boxes to show their interpreted senses. Morphological treatment is a default operation and we would not even mention it in such a drawing. This is how we do it.

    Nevertheless, we cannot claim which approach is better. We will not comment that far.

    Good luck to Cognition. And let's avoid unnecessary interpretations to keep this newly formed audience well-informed.

    Best Regards

    Posted by: Riza C. Berkan | September 17, 2008 1:14 AM



  5. It takes time to beat Google. WordNet also has a similar semantic map, but using it doesn't mean better search results on the WWW.

    Obviously a large map covers more queries, and thus attracts a larger group of searchers. I think a new solution (maybe an innovation) is required to get better results using the Cognition semantic map.

    Krishna

    Posted by: Krishna Sapkota | September 17, 2008 1:59 AM



  6. Thanks Riza for your comments. We're also seeking feedback from Powerset, so a follow up post will be published soon.

    Posted by: Richard MacManus | September 17, 2008 1:49 PM



  7. I'm excited to really see this in action; as someone who used to actually think it was fun to spend a couple of hours with the OED just looking back at how words had evolved, it seems to me that the combination of algorithms *and* human insight seems to make the most sense.

    Thanks for writing this up -- can't wait.

    Posted by: Merredith | September 18, 2008 12:22 PM



  8. Human effort + BOSS + reasonable amount of intelligence/semantics in concept mapping may solve 80% of the search problem.

    Posted by: Sumeet | September 20, 2008 4:46 AM


