ReadWriteWeb

And Nerds Became Kings: Yahoo! to Announce Semantic Web Support

Written by Marshall Kirkpatrick / March 13, 2008 9:35 AM / 20 Comments

TechCrunch and Search Engine Land are reporting this morning that Yahoo! will now be indexing Semantic Web and Microformats markup from around the web and will use that information to display more structured search results. Here is the Yahoo! post about the news.

We asked last month how vulnerable Google is in search, and leveraging standards-based structured data may be the most obvious approach to improving on the search industry's current best practices. As Tim Berners-Lee said just weeks ago, the time for the semantic web is now.

What Does This Mean?

Here's one example of what that could mean: Today, a web service might work very hard to scour the internet to discover all the book reviews written on various sites, by friends of mine, who live in Europe. That would be so hard that no one would probably try it. The suite of technologies Yahoo! is moving to support will make such searches trivial. Once publishers start including things like hReview, FOAF and GeoRSS in their content, Yahoo!, and other sites leveraging Yahoo! search results, will be able to easily ask what it is we want to do with those book reviews. Say hello to a new level of innovation.
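To make the book review example concrete, here is a minimal Python sketch of what such a query could look like once the markup has already been parsed. It is not Yahoo!'s implementation. The Review record, its field names, the example.org URLs and the rough bounding box for Europe are all assumptions made up for illustration; the only idea taken from the formats themselves is that hReview says what is being reviewed and by whom, FOAF says who my friends are, and GeoRSS says where someone is.

```python
from dataclasses import dataclass

# Hypothetical records, as a crawler might produce them after parsing
# hReview, FOAF and GeoRSS markup. Field names are illustrative only.
@dataclass
class Review:
    reviewer: str    # URL identifying the reviewer (from hReview)
    item_type: str   # kind of thing reviewed, e.g. "book" or "film"
    item_name: str
    lat: float       # location attached to the review (from GeoRSS)
    lon: float

# Very rough bounding box for Europe; an assumption for this sketch.
EUROPE = (35.0, 71.0, -25.0, 45.0)  # min_lat, max_lat, min_lon, max_lon

def in_europe(lat, lon):
    min_lat, max_lat, min_lon, max_lon = EUROPE
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon

def book_reviews_by_european_friends(reviews, friends):
    """friends is a set of URLs, e.g. collected from my FOAF 'knows' list."""
    return [
        r for r in reviews
        if r.item_type == "book"
        and r.reviewer in friends
        and in_europe(r.lat, r.lon)
    ]

reviews = [
    Review("http://example.org/anna", "book", "Dune", 52.5, 13.4),     # Berlin
    Review("http://example.org/bob", "book", "Emma", 40.7, -74.0),     # New York
    Review("http://example.org/carla", "film", "Alien", 41.9, 12.5),   # Rome
    Review("http://example.org/dave", "book", "Ulysses", 53.3, -6.3),  # Dublin
]
friends = {
    "http://example.org/anna",
    "http://example.org/bob",
    "http://example.org/carla",
}

matches = book_reviews_by_european_friends(reviews, friends)
print([r.item_name for r in matches])  # prints ['Dune']
```

The filter itself is three lines. The hard part today is producing the records, and that is the part publisher-supplied markup would take over.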

This has been really geeky stuff for a long time, with little market traction and a whole lot of promises from academic research and outlying innovators. That will now change.

The basic idea behind Semantic Web technology is that by signaling what kind of content you are publishing on an item-by-item or field-by-field basis, publishers can help make the meaning of their text readable by machines. If machines are able to determine the meaning of the content on a page, then our human brains don't have to waste time determining, for example, which search results go beyond containing our keywords and actually mean what we are looking for.

Publishers will now be able to clearly designate content on a page as related to other particular content, as business card type information, as a calendar event, a review or as many other types of content. It will make Yahoo! a lot smarter and should shake up the world of Search Engine Optimization and web publishing, a lot.
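As a rough illustration of "business card type information", here is a small Python sketch that pulls an hCard out of a page using only the standard library. The class names vcard, fn, org and url come from the hCard microformat; the person, the organization and the example.org URL are invented, and the HCardParser class is a hypothetical helper, not part of any real library or of Yahoo!'s indexer. It handles only simple, well-nested markup.

```python
from html.parser import HTMLParser

# A tiny hCard snippet. The contact details are made up for the example.
HTML = """
<div class="vcard">
  <span class="fn">Jane Doe</span>,
  <span class="org">Example Press</span>
  <a class="url" href="http://example.org/jane">home page</a>
</div>
"""

class HCardParser(HTMLParser):
    """Collects a few hCard properties from elements inside class="vcard".

    Simplification: assumes every start tag has a matching end tag, so
    void elements such as <br> inside a card would confuse the depth count.
    """
    TEXT_PROPS = {"fn", "org"}

    def __init__(self):
        super().__init__()
        self.cards = []
        self._depth = 0     # nesting depth inside the current vcard, 0 = outside
        self._prop = None   # text property currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._depth:
            self._depth += 1
            card = self.cards[-1]
            if "url" in classes and attrs.get("href"):
                card["url"] = attrs["href"]
            for name in classes:
                if name in self.TEXT_PROPS:
                    self._prop = name
                    card[name] = ""
        elif "vcard" in classes:
            self._depth = 1
            self.cards.append({})

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            self._prop = None

    def handle_data(self, data):
        if self._depth and self._prop:
            self.cards[-1][self._prop] += data.strip()

parser = HCardParser()
parser.feed(HTML)
print(parser.cards)
```

For the snippet above this prints a single card with the fn, org and url values. The point is that the machine never has to guess which string is the name and which is the employer, because the publisher has already said so.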

Who Does the Markup?

Many observers of the Semantic Web, including us at times, have argued that it's unrealistic to expect web publishers to mark up their own content and that a more realistic path to market for technologies based on semantics is to build applications that can parse the semantics out of other people's content from outside.

In my interview with Mark Zuckerberg last week, for example, the Facebook CEO expressed a lack of interest in participating in the Semantic Web. I didn't publish it in the interview, but he indicated that such a move, if it was going to happen at all, would be up to a third party site organizing information via the Facebook Platform. He will probably change his tune now, as adding hCard support to Facebook public profiles will now be a no-brainer. Other publishers will be faced with similar questions.

Semantic web markup will quickly become standard practice, though, for all CMS/publishing systems, and we'll wonder what we ever did without it or why it seemed so hard.

Google Will Soon Follow

This move by Yahoo! will likely be followed up by Google; it's just too much opportunity for any search engine to pass up. Semantic markup is like a content-level site map, and site maps are something all the search engines have already agreed on a standard for. Semantic web technology is next. There will be big job opportunities, more than there are for SEO in the short term, for people who can help publishers implement Semantic Web markup retroactively and into the future.

The Semantic Web was one of a handful of topics that we identified as key themes for the coming year in our RWW Toolkit for 2008. Check that toolkit out for resources you can use to follow this important topic as it unfolds.

Comments


  • This is a phenomenal way to kick-start widespread semantic web adoption. Every online marketing department should be thinking about ways they can take advantage of this -- it's SEO+!

    Posted by: Scott Brinker | March 13, 2008 10:53 AM


  • I'm pretty sure within two weeks Google will make a similar announcement.
    I'm so very surprised they haven't done it much, much earlier already - is Google so big now that they lost their innovation edge?
    This and other recent moves by Yahoo clearly show me how desperately they want to regain the trust of the market, of the users and the Internet sphere - to prove they are still alive, to desperately strive for independence from the sharks to which they are about to fall prey... whatever it is, all this will be great for the future Web 3.0 - the semantic Web - which I didn't think would come so soon, but backed by Reuters and now by Yahoo it is inevitably coming and this is for good, I reckon, no?...

    Posted by: Esdee | March 13, 2008 11:40 AM


  • Outstanding! And they did it right, to boot! They could have gone only one of two ways... either RDF or Microformats, but they went ahead and support both "perspectives"... including rdfa and erdf. Wow. I'm impressed and anxious.

    Thanks Yahoo! for giving me yet another argument to push the adoption of uf's and rdf.

    Posted by: André Luís | March 13, 2008 11:41 AM


  • "And Nerds Became Kings" Best post title I have seen in my feed reader on the subject.

    Suggestion: Interview Tantek, Chris Messina and Chris Saad and get their reactions. They have been crusading for these standards for years. Would be very interested in reading an interview with them, post the Yahoo and Google adoption of semantic web standards.

    Posted by: Todd | March 13, 2008 11:45 AM


  • Todd, that's a great idea.

    Posted by: Marshall Kirkpatrick | March 13, 2008 11:46 AM


  • So when I saw this announcement - and mulled over the various unprocessed microformat related stuff I've scanned over in the last week or two - the obvious question that occurred to me was:

    - so will Yahoo start to index IE8 webslices, and add Yahoo widget/webslice subscription result items to their listings?

    hmmm...

    Has anyone built a week 0 webslice indexing engine, and started looking at universal widget front ends for them?

    Posted by: Tony Hirst | March 13, 2008 11:51 AM


  • The key is still the inclusion of semantic data in the websites that are grokked by the search engines. If the originating websites do not include the data, then a model might emerge from user-driven content.

    Consider the considerable amount of energy that goes into commenting on the content of a web site posted on a site such as Digg. In general most of those comments are useless and do not add substantively to the value of the article.

    If that willingness to comment could be harnessed and used to add semantic information about the articles, such as requiring the users to fill in the author's name, the article name and subtitle, date of publication, etc. BEFORE the users could leave a comment, then a semantic concordance could slowly be established, with the added benefit that requiring the extra effort before allowing a comment might serve as a useful "twit filter" and perhaps greatly enhance the value and content of comments with respect to the article.

    True, this builds a semantic database separate from the original article, but it would greatly enhance the search capabilities of a site such as Digg. The nominal efforts by search sites to query users with questions such as "Was this information useful to you?" do not really do much to enhance the predictive value of search, because what is useful to any three people can vary widely. Semantics deals with specifics, and specifics are more predictive than generalities.

    Posted by: Rohn Wood | March 13, 2008 11:55 AM


  • I've posted the Twine perspective on this, here:

    http://novaspivack.typepad.com/nova_spivacks_weblog/2008/03/twine-perspecti.html

    Posted by: Nova Spivack | March 13, 2008 12:22 PM


  • I wonder if years from now this announcement will be considered as one of the most important milestones in the evolution of the internet? If Google follows, it might be just so.

    Posted by: Timo Paloheimo | March 13, 2008 12:28 PM


  • Timo,

    I think you are spot on in your musing. It has taken a while to get to this point but it is refreshing to see that after a lot of hard work by the W3C and others that the "majors" have decided to support rdf and microformats.

    Indeed, 5 years from now we may very well be sitting around and remembering the pre semantic web days and wondering how we ever got by on 1 dimensional search.

    Posted by: A Taylor | March 13, 2008 12:38 PM


  • Google is already indexing XFN and FOAF data (see http://code.google.com/apis/socialgraph/). I don't know to what (if any) extent it is being used by them thus far but they are providing access to it for others to use.

    Posted by: Geoffrey Foster | March 13, 2008 12:50 PM


  • I'd be curious to hear people's thoughts on which companies have the most to fear from the semantic web.

    Which services - online or offline - would be rendered obsolete by a semantic world?

    It seems to me that semantic technology would stand in exact opposition to the type of 3.0 world envisaged by projects like Mahalo.

    "Today, a web service might work very hard to scour the internet to discover all the book reviews written on various sites, by friends of mine, who live in Europe. That would be so hard that no one would probably try it. The suite of technologies Yahoo! is moving to support will make such searches trivial."

    Mahalo seems designed to fill in the gap for exactly these types of searches, but if I'm understanding the vision for semantic technology outlined in this post, then there would be no need for helper monkeys to create lists of stuff.

    What types of aggregation are doomed by the spread of semantics?

    Which companies should go to bed tonight with a paralyzing sense of fear?

    [Regarding Mahalo, I'm referring mostly to their original vision. It seems at times like they're moving more towards keyword-optimized journalism, perhaps in recognition of the threat posed by semantics? See http://mahalo.com/How_to_Buy_Land_on_the_Moon.]

    Posted by: Gabe | March 13, 2008 1:19 PM


  • I'm a big fan of the semantic web, the benefits are just too many to ignore.

    However, as for Google getting caught short by Yahoo!, well that's just wishful thinking.

    The only way that's going to happen is if the entire workforce of Google simply doesn't turn up for work, for the next 3 years.

    And that's no disrespect to Yahoo! either.

    A misconception I often read when people are trying to articulate the concepts of the semantic web is that it will do away with the search engine.

    This is of course nonsense.

    As I've discussed before, we're going to see things like the Found Engine, but that's another topic.

    The key strength of the semantic web is in how it plays to the specific searches — where you know precisely what you're looking for.

    This is where the "long tail" becomes the final frontier for search marketing and monetization.

    But the fact of the matter is, the moment you tap in the words "iPod" or "Paris Hilton" into Yahoo! Semantic, you're still going to get a zillion entries.

    The difference is, there'll be some proximity filtering in place, in addition to their own version of PageRank...

    Posted by: Wayne Smallman | March 13, 2008 2:17 PM


  • Wow, that is funny that you say that Mark Z doesn't want to support the Semantic Web but surely hCard will be the new go...

    That is pretty lame coming from the Z man. The Semantic Web is not just another bolt-on tech that adds value like AJAX. It is a fundamental way of thinking about data and inter-relations. The graph! Wow, why didn't I think of that.

    Oh, wait a minute, Mark did. The social graph. Maybe that is why he is so convinced that the Semantic Web is no good, he has already re-invented it and gave it a new name. Well done Mark.

    Back to the subject, RDFa [1] is set to become a W3C Recommendation in the middle of the year. This is THE cornerstone of what will become the Semantic Web/Web of Data.

    [1] http://rdfa.info/wiki/RDFa_Wiki

    Posted by: David Peterson | March 13, 2008 3:35 PM


  • This is certainly exciting... more new toys from old players. The web will be all the more interesting than any time before!

    Posted by: YDrive | March 13, 2008 10:38 PM


  • >> I'd be curious to hear people's thoughts on which companies have the most to fear from the semantic web.

    I posted some of my own initial thoughts on my blog.

    I think winners are the search engines themselves plus markup-tools companies, plus those suppliers that are closest to the data-supply source.

    And losers, in general, I think, may be meta-search companies, classified-ad-style vertical sites, and niche search sites.

    Posted by: Steve Murch | March 14, 2008 6:36 PM


  • The semantic web as expressed so far is very simplistic. However, semantics is everything, so we need to make semantics run enterprise, social networks, etc etc. The current structures are just the first in evolution.

    To answer 16. The companies that should fear the most this development are Microsoft, IBM, Oracle. When the next level of development of semantic web occurs enterprises will be driven differently making current technologies obsolete.

    Posted by: pawel lubczonok | March 15, 2008 12:15 PM


  • and the winners will be??

    i would love to see the moment a scientist realised a ground breaking theory when two or more (what might appear to be) irrelevant pieces of information come together through a semantic web process.

    any thoughts as to other applications?

    Posted by: David Craig | March 17, 2008 4:08 AM


  • I think nonprofits could totally use this but like most businesses they'll need to see a clear reference application that also shows how it would enhance SEO and improve their Google AdWords results. Oh wait -- this is Yahoo! Oh well..

    On the blogging side, this will hopefully spur adoption of the Wordpress Sandbox theme as it already supports hAtom. I already use it on my blog and it would be fantastic if all that work supporting the format would actually come to some sort of fruition.

    Posted by: Allan Benamer | March 20, 2008 10:38 PM


  • I worry a little about this. I still don't understand microformats, but the idea of having the tags telling us what the stuff inside actually is, that's great. I just don't want the standards to be so dang blog-MySpace-FOAF-oriented. What, will we get a social-oriented microformat and one for the rest of the web? If a website has calendars but no events (other information instead), what would the author use? hCalendar, no.
    Mostly, we in business don't want to spend our time and money on whoever becomes the Betamax, while we of course would love to have videorecording.

    Posted by: Mallory | March 21, 2008 4:01 AM



