ontology - ReadWriteWeb http://www.readwriteweb.com/feeds/tag/ontology en Copyright 2009 Richard MacManus readwriteweb@gmail.com Tue, 24 Nov 2009 12:40:23 -0800

Twine Could Soon Surpass Delicious, Prepares Ontology Authoring Tool

Nova Spivack's semantic web company Twine is developing a free service to write and host semantic ontologies: the classification trees that enable machines to put concepts in topical context. Ready to play Aristotle and create an ontology of cheese, model airplanes, global anti-hunger organizations or any other topic?

What blogging was to publishing (a simple tool that let far more people participate), Twine's new ontology writing and hosting service could be to the act of teaching machines about new topics.

The company wouldn't let us publish the new service's name but says it is aiming for a launch date this year, as soon as a go-to-market strategy and appropriate partnerships are lined up. The ontologies created won't only work on Twine; they will be referenceable by semantic apps anywhere around the web.

Twine Could Surpass Delicious in a Matter of Months

Twine's public product lets people bookmark items like web pages and videos into topical collections. The service then analyzes the contents of all the bookmarks to identify the key concepts, people, places and other information automatically. It's like tagging in Delicious but automated and, in theory, more thorough than any human being would be in assigning tags.

Compete.com says Delicious gets about 2 million unique visitors a month and has stopped growing. Twine just passed 1 million uniques and is growing fast. Spivack said that 40% of that traffic comes from Google, and sure enough those Twine pages look awfully juicy from a spider's perspective. Spivack expects Twine to hit 2 million uniques in a matter of months and that looks like a credible claim to us.

The number of saved items is far greater in Delicious than in Twine: about 150 million vs. 3 million. Spivack says, though, that the company will soon switch back on the system that crawls all the links on bookmarked pages. Those linked-to pages will be automatically bookmarked and analyzed too, quickly expanding Twine's total archives.

So by this summer, Twine could be bigger and more visited than Delicious. We wrote a scathing review of the Twine user experience when the long-awaited service began to launch last year. The site has changed a lot since then and we're excited about the company's plans for the future. We are still concerned about the company's ability to make its interfaces really usable -- but if they can, then look out, internet.

Twine and the Semantic Web

The semantic web is a paradigm that adds standardized, structured markup to web content so that savvy applications can comprehend the key topics of any web page. Publishers can do that when they publish, or services like Twine can create the semantic markup from the outside. The automatic tagging Twine does is actually semantic markup.

For example, you can't ask Google today to show you all the book reviews around the web that were written by friends of yours who live in New York - but semantic search engines could make such a query trivial and use that information as the ground level for building more sophisticated features on top. It's a form of standardized metadata. It turns free text into data that can be mashed up.
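To make the book-review query concrete, here is a minimal sketch in Python. It assumes the relevant facts have already been published as simple subject, predicate, object statements; the people, predicates and reviews are all hypothetical, and a real semantic search engine would work over far larger and messier data.

```python
# Hypothetical metadata as (subject, predicate, object) statements.
TRIPLES = [
    ("me", "friend_of", "alice"),
    ("me", "friend_of", "bob"),
    ("alice", "lives_in", "New York"),
    ("bob", "lives_in", "Boston"),
    ("review1", "type", "BookReview"),
    ("review1", "author", "alice"),
    ("review2", "type", "BookReview"),
    ("review2", "author", "bob"),
    ("post3", "type", "BlogPost"),
    ("post3", "author", "alice"),
]


def objects(subject, predicate):
    """Everything that `subject` points at through `predicate`."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects(predicate, obj):
    """Everything that points at `obj` through `predicate`."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def reviews_by_friends_in(person, city):
    """Book reviews written by friends of `person` who live in `city`."""
    friends_in_city = objects(person, "friend_of") & subjects("lives_in", city)
    reviews = subjects("type", "BookReview")
    return sorted(r for r in reviews if objects(r, "author") & friends_in_city)


print(reviews_by_friends_in("me", "New York"))
```

Once the facts are structured, the query reduces to a couple of set intersections, which is the sense in which free text becomes data that can be mashed up.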

Semantics Plus Ontology Equals Meaning

Spivack says that his existing product, Twine, is just one of a number of applications that only extract key concepts (people, places, key terms) out of a web page. Placing those concepts in context is the next step.

Twine can tell you that a web page is about goat cheese, for example, but it doesn't yet know how to infer that the page is also about a dairy product - the larger category that is not explicitly stated in the article. An ontology is that context, be it a dairy ontology, a cheese ontology or a new node in the existing accepted ontology of food.
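The goat cheese inference can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the ontology is reduced to a hypothetical table of "is a kind of" links, and it is not a description of how Twine stores or reasons over ontologies.

```python
# Hypothetical "is a kind of" links; a real ontology would be far richer.
IS_A = {
    "goat cheese": "cheese",
    "cheese": "dairy product",
    "dairy product": "food",
}


def broader_topics(concept):
    """Walk up the hierarchy and return every broader category, nearest first."""
    topics = []
    seen = {concept}
    while concept in IS_A:
        concept = IS_A[concept]
        if concept in seen:  # guard against accidental cycles in the table
            break
        seen.add(concept)
        topics.append(concept)
    return topics


def is_about(extracted_concept, category):
    """True if a page about `extracted_concept` is also about `category`."""
    return category == extracted_concept or category in broader_topics(extracted_concept)


print(broader_topics("goat cheese"))
print(is_about("goat cheese", "dairy product"))
```

The page never mentions dairy, yet the lookup places it under that category because the ontology supplies the missing links.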

Those new ontologies can be created using Spivack's simple, open source authoring tool and then hosted on his open source community site for ontologies. It's open source authoring like WordPress, plus code hosting and discussion like SourceForge.

Either Twine or a third party will then combine the extracted "entities" (people, places, key terms) with an appropriate ontology and that company's "inference engine" to build a full picture of what a web page is about and where it stands in relation to everything else.

Busting Out of the Tech Ghetto

The few ontologies authored to date are largely centered on technology topics. An easy ontology authoring tool could change that radically. A standardized, accessible ontology can shine a light on a whole new part of the world. Once that topic has been illuminated for the eyes of a semantics-reading machine, web developers can build services that intelligently make use of the new information.

Spivack says that heavy-duty ontologies that require computationally intensive logic navigation will still need to be built using heavy-duty desktop apps. But web applications that just need data served up smartly will work well with the kinds of ontologies that can be written with Spivack's new authoring tool.

Ready for the whole, diverse internet to be contextually understandable by web applications? Ready to contribute to the creation of those contextual explanations yourself? Keep your eye on Nova Spivack because that's what he's aiming to make happen.

http://www.readwriteweb.com/archives/twine_could_soon_surpass_delicious_prepares_ontolo.php Authoring Tools Mon, 16 Mar 2009 16:48:12 -0800 Marshall Kirkpatrick

Semantic Travel Search Engine UpTake Launches

According to a comScore study done last year, booking travel over the Internet has become something of a nightmare for people. It's not that using any of the booking engines is difficult; it's just that there is so much information out there that planning a vacation is overwhelming. According to the comScore study, the average online vacation plan comes together through 12 travel-related searches and visits to 22 different web sites over the course of 29 days. Semantic search startup UpTake (formerly Kango) aims to make that process easier.

UpTake is a vertical search engine that has assembled what it says is the largest database of US hotels and activities -- over 400,000 of them -- from more than 1,000 different travel sites. Using a top-down approach, UpTake looks at its database of over 20 million reviews, opinions, and descriptions of hotels and activities in the US and semantically extracts information about those destinations. You can think of it as Metacritic for the travel vertical, but rather than just arriving at an aggregate rating (which it does), UpTake also attempts to figure out some basic concepts about a hotel or activity based on what it learns from the information it reads. Things such as, is the hotel family friendly, would it be good for a romantic getaway, is it eco friendly, etc.

"UpTake matches a traveler with the most useful reviews, photos, etc. for the most relevant hotels and activities through attribute and sentiment analysis of reviews and other text, the analysis is guided by our travel ontology to extract weighted meta-tags," said President Yen Lee, who was co-founder of the CitySearch San Francisco office and a former GM of Travel at Yahoo!

What UpTake isn't is a booking engine like Expedia, a meta price search engine like Kayak, or a travel community. UpTake is strictly about aggregation of reviews and semantic analysis and doesn't actually do any booking. According to the company, only 14% of travel searches start at a booking engine, which indicates that people are generally more interested in doing research about a destination before trying to locate the best prices. Many listings on the site have a "Check Rates" button, however, which gets hotel rates from third party partner sites -- that's actually how UpTake plans to make money.

UpTake works by applying its specially created travel ontology, which contains concepts, relationships between those concepts, and rules about how they fit together, to the 20 million reviews in its database. The ontology allows UpTake to extract meaning from structured or semi-structured data by telling its search engine things like "a pool is a type of hotel amenity and kids like pools." That means hotels with pools score some points when evaluating if a hotel is "kid friendly." The ontology also knows, though, that a nude pool might be inappropriate for kids, and thus that would take points away when evaluating for kid friendliness.
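A rough sketch of that kind of rule-based scoring, in Python, might look like the following. The concept names, the point values and the scoring function are all hypothetical; UpTake has not published its ontology or its weights, so this only illustrates the idea of inherited points and overriding penalties.

```python
# Hypothetical concept hierarchy: each concept maps to its broader concept.
IS_A = {
    "pool": "hotel amenity",
    "nude pool": "pool",
    "playground": "hotel amenity",
}

# Hypothetical rules for the "kid friendly" theme: points per concept.
KID_FRIENDLY_POINTS = {
    "pool": 2,        # kids like pools
    "playground": 3,
    "nude pool": -5,  # inappropriate for kids, outweighs the pool bonus
}


def lineage(concept):
    """The concept itself followed by each broader concept above it."""
    chain = [concept]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def kid_friendly_score(features):
    """Sum the points for every feature and for everything it is a kind of."""
    score = 0
    for feature in features:
        for concept in lineage(feature):
            score += KID_FRIENDLY_POINTS.get(concept, 0)
    return score


print(kid_friendly_score(["pool", "playground"]))
print(kid_friendly_score(["nude pool"]))
```

In this sketch a nude pool still inherits the ordinary pool bonus because it is a kind of pool, but its own penalty is larger, so the net effect is to take points away.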

A simplified example ontology is depicted below.

In addition to figuring out where destinations fit into vacation themes -- like romantic getaway, family vacation, girls getaway, or outdoor -- the site also does sentiment matching to determine if users liked a particular hotel or activity. The search engine looks for sentiment words such as "like," "love," "hate," "cramped," or "good view," and knows what they mean and how they relate to the theme of the hotel and how people felt about it. It figures that information into the score it assigns each destination.
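As a very naive illustration of sentiment matching, the Python sketch below counts the sentiment words quoted above in a review and adds up their weights. The weights are invented for the example, and plain phrase matching is an assumption made here for brevity; it ignores negation, word forms and context, all of which a production system would need to handle.

```python
import re

# Hypothetical weights for the sentiment words mentioned in the article.
SENTIMENT = {
    "like": 1,
    "love": 2,
    "hate": -2,
    "cramped": -1,
    "good view": 1,
}


def sentiment_score(review):
    """Add up the weights of every sentiment word or phrase found in the review."""
    text = review.lower()
    score = 0
    for phrase, weight in SENTIMENT.items():
        # Word boundaries keep "like" from matching inside "unlikely".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        score += weight * len(re.findall(pattern, text))
    return score


print(sentiment_score("We love the good view, but the room felt cramped."))
```

A per-review number like this could then be folded into a destination's overall score alongside the theme points.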

Conclusion

Yesterday, we looked at semantic, natural language processing search engine Powerset and found in some quick early testing that the results weren't that much different from Google's. "If Google remains 'good enough,' Powerset will have a hard time convincing people to switch," we wrote. But while semantic search may feel rather clunky for the broader global web, it makes a lot of sense in specific verticals. The ontology is a lot more focused and the site also isn't trying to answer specific questions, but rather attempting to semantically determine general concepts, such as romanticness or overall quality. The upshot is that the results are tangible and useful.

I asked Yen Lee what UpTake thought about the top-down vs. the traditional bottom-up approach. Lee told me that he thinks the top-down approach is a great way to lead into the bottom-up Semantic Web. Lee thinks that top-down efforts to derive meaning from unstructured and semi-structured data, as well as efforts such as Yahoo!'s move to index semantic markup, will provide an incentive for content publishers to start using semantic markup on their data. Lee said that many of UpTake's partners have already begun to ask how to make it easier for the site to read and understand their content.

Vertical search engines like UpTake might also provide the consumer face for the Semantic Web that can help sell it to consumers. Being able to search millions of reviews and opinions and have a computer understand how they relate to the type of vacation you want to take is the sort of palpable evidence needed to sell the Semantic Web idea. As these technologies get better, and data becomes more structured, then we might see NLP search engines like Powerset start to come up with better results than Google (though don't think for a minute that Google would sit idly by and let that happen...).

What do you think of UpTake? Let us know in the comments below.

http://www.readwriteweb.com/archives/semantic_travel_search_uptake.php Products Wed, 14 May 2008 06:00:00 -0800 Josh Catone