ReadWriteWeb

Microcontent Design, Part 1

Written by Richard MacManus / March 21, 2006 1:16 PM / 25 Comments

This is the first post in a series in which I will explore microcontent design.

"...content will be more important than its container in this next phase.

That's a big shift for old media to come to grips with. Killer apps, such as search, RSS and video-capture software such as Tivo -- to name just a few -- have begun to unlock content from any vessel we try to put it in.

Who needs to bookmark and surf a bunch of Web sites anymore, when you can search or monitor several RSS "feeds" much more efficiently?"

When Associated Press CEO Tom Curley spoke those words in a November 2004 keynote speech to the Online News Association Conference, he also struck at the heart of a paradigm shift in web design - from designing for the page to designing for microcontent. Put another way: when a Web ‘site’, or 'container' to use Curley's lexicon, is no longer necessarily how users will experience your content – what does that mean for web designers? It essentially means taking a microcontent view of design.

Photo: venegas

As I’ll outline in this series, microcontent design involves: microchunking your content, taking advantage of open standards, employing microformats, letting users subscribe to all kinds of RSS feeds, freeing your content via APIs and other means, designing for re-use of information, monetizing it, and more.

Data sources and formats

"The Semantic Web is just the application of weblike design to data; it will be many more decades before we will be able to say we have really implemented the Web idea in the full, if ever we can."
Tim Berners-Lee, October 2004

While Sir Tim Berners-Lee was referring to the grand notion of the Semantic Web in the above quote, in many ways his vision of applying “weblike design to data” is already being implemented in the form of technologies like RSS, APIs and XML.

XML has largely lived up to its promise of being the data format of choice for the Web 2.0 era. And by far the most widely deployed format is RSS 2.0, which is a loosely structured XML dialect. Sir Tim Berners-Lee would probably prefer that RDF, a much more rigorously structured form of XML, were used instead. But that’s another story! 

Microsoft bullish for RSS, Google for Atom

Microsoft and Yahoo are two big Internet companies putting their weight behind RSS 2.0, as I've documented at length over the last couple of years. But there are also a lot of advocates for Atom, an alternative RSS format that is said to be more extensible. Indeed at the Microsoft Mix '06 event yesterday, Google employee Patrick Chanezon (an Adwords evangelist) said in an interview that Google is "very bullish" on Atom. Patrick said:

"Instead of taking Atom as the rich content model for feeds at the implementation layer, you [Microsoft] took RSS 2.0 - which obliges you to do all kind of translations.  [...] I really think this [Atom] is the future of syndication. At Google we're very bullish for Atom. [...] As Gates said in his speech, feeds usage will skyrocket in the next few years - but Atom is a much more solid format for that kind of growth."

The Microsoft interviewer retorted that RSS has the same "good enough" attribute that drove the adoption of MP3.

Pic: kathy kawasaki

Either way you look at it, RSS (including Atom) and XML are the de-facto formats for data in the Web 2.0 world. If you release your data in those formats, that’s step one in the data design process.
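
To make "loosely structured" a little more concrete, here is a minimal sketch of reading an RSS 2.0 feed with Python's standard library. The sample feed, its titles and its example.com URLs are invented for illustration, and read_items is a hypothetical helper name rather than part of any real tool. Note that the second item has no description, and the reader has to cope with that.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content; titles and URLs are made up for illustration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Weblog</title>
    <link>http://example.com/</link>
    <description>A sample feed</description>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <description>Hello, microcontent.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>"""


def read_items(rss_text):
    """Return one dict per <item>, with None for any sub-element that is missing."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "description": item.findtext("description"),
        })
    return items


if __name__ == "__main__":
    for entry in read_items(SAMPLE_RSS):
        print(entry)
```

Each item is a self-contained chunk of content that can be lifted out of the feed and displayed somewhere else entirely, which is the point of the "container" argument above.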

Representing data and designing for re-use of information

Step two is standard ways of representing data, to enable people (and machines) to find and consume it. In an era where a veritable glut of media is available online, from both professional and amateur content sources, it’s become very important to make sure your data is easy to find and use.

Structured Blogging and microformats are two relatively geeky topics at this point in their evolution, but they are significant developments in terms of representing data.

Structured Blogging

Structured Blogging is an initiative launched in December 2005 by small RSS-driven companies PubSub and Broadband Mechanics (disclaimer: I work for the latter). Structured Blogging is a set of formats and plugins that enable blogs to publish different kinds of information - like events, reviews and classified ads - in a 'structured' format, so that aggregators can pick up the data from all over the Web.

It’s that ‘re-use’ of blog content via aggregation that will be where the real value lies in Structured Blogging. As of writing there are no Structured Blogging aggregators available, but a hint at the value that it could provide in future is the independent company edgeio – which was launched in February 2006. Sellers can get their data listed on edgeio’s website, simply by posting an item to sell on their own weblog and tagging it “listings”. Buyers are able to search and find goods and services at the edgeio website. How it works is that edgeio aggregates goods and services data by scanning over 25 million RSS feeds, looking for the tag "listings".

This is a great example of how data you publish on your own weblog, or at a specialist service such as the jobs listing site SimplyHired.com, can be ‘re-used’ by a service like edgeio – simply because of the way the data is marked up. Whether you use Structured Blogging markup, or simple tagging that edgeio will recognize and pick up, either way you are in a very real sense designing your microcontent for re-use.
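
As a rough illustration of tag-based aggregation, the sketch below picks out feed items tagged "listings". It is not edgeio's actual implementation, which I haven't seen. It assumes the tag is carried in an RSS 2.0 category element, and the sample feed and the find_tagged_items name are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical seller's feed; titles and URLs are made up for illustration.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Seller's Weblog</title>
    <link>http://example.com/</link>
    <description>Posts, some of them items for sale</description>
    <item>
      <title>Oak desk for sale</title>
      <link>http://example.com/desk</link>
      <category>listings</category>
      <category>furniture</category>
    </item>
    <item>
      <title>Thoughts on feed readers</title>
      <link>http://example.com/readers</link>
      <category>rss</category>
    </item>
  </channel>
</rss>"""


def find_tagged_items(rss_text, tag="listings"):
    """Return (title, link) pairs for items carrying the given tag.

    Assumption: tags appear as <category> elements on each <item>.
    """
    root = ET.fromstring(rss_text)
    wanted = tag.lower()
    matches = []
    for item in root.iter("item"):
        categories = {
            (c.text or "").strip().lower() for c in item.findall("category")
        }
        if wanted in categories:
            matches.append((item.findtext("title"), item.findtext("link")))
    return matches


if __name__ == "__main__":
    print(find_tagged_items(SAMPLE_FEED))
```

A real aggregator would run something like this across millions of fetched feeds and would need to deal with spam, duplicates and malformed XML, none of which is attempted here.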

Microformats

Microformats is the generic name given to any format that builds on (X)HTML to provide additional metadata about web objects. This is the definition on the microformats.org website:

"Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards."

A good example of a microformat is hReview, a format that provides a common markup for reviews (of products, services, etc). Check out Phil Pearson's NZ Coffee Review site for an example of hReview in action. Also it's great to see Microsoft embracing microformats, as announced at Mix '06.

It’s important to note that microformats and the Structured Blogging initiative are both open standards and complement one another. The Structured Blogging toolset outputs reviews in the hReview format, for example. So essentially Structured Blogging provides tools for publishing structured content: they format it nicely for users and mark it up with microformats.
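
For a feel of what hReview markup looks like and how a machine might read it, here is a small sketch using only Python's standard library. The class names (hreview, item, fn, rating, summary, description, reviewer) come from the hReview format; the sample review, the HReviewParser class and the extract_hreviews function are invented for illustration. This is a simplified reader, not a complete microformats parser: it only collects the text inside a handful of fields.

```python
from html.parser import HTMLParser

# hReview property class names this sketch looks for.
HREVIEW_FIELDS = {"item", "rating", "summary", "description", "reviewer"}
# Elements with no closing tag; skipped so the open-tag stack stays balanced.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

# Hypothetical review markup, made up for illustration.
SAMPLE_HTML = """
<div class="hreview">
  <span class="item"><span class="fn">Cafe Example</span></span>
  <span class="rating">4</span>
  <span class="summary">Great flat white</span>
  <span class="reviewer vcard"><span class="fn">Pat Reviewer</span></span>
</div>
"""


class HReviewParser(HTMLParser):
    """Collects the text of selected hReview fields (a simplified sketch)."""

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._current = None  # fields of the review being read, if any
        self._stack = []      # one set of field names per open tag in the review

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        if self._current is None:
            if "hreview" in classes:
                self._current = {}
                self._stack = [set()]
            return
        self._stack.append(classes & HREVIEW_FIELDS)

    def handle_endtag(self, tag):
        if self._current is None or tag in VOID_TAGS:
            return
        self._stack.pop()
        if not self._stack:  # the hreview root element just closed
            cleaned = {k: " ".join(v.split()) for k, v in self._current.items()}
            self.reviews.append(cleaned)
            self._current = None

    def handle_data(self, data):
        if self._current is None:
            return
        # Text belongs to every field whose element is currently open.
        for fields in self._stack:
            for name in fields:
                self._current[name] = self._current.get(name, "") + data


def extract_hreviews(html_text):
    parser = HReviewParser()
    parser.feed(html_text)
    parser.close()
    return parser.reviews


if __name__ == "__main__":
    print(extract_hreviews(SAMPLE_HTML))
```

The same page that a person reads as an ordinary review is, to this parser, a small structured record. That is the "humans first, machines second" idea in practice.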

To be continued...

Update: microformats actually build on (X)HTML, not XML as I originally wrote in the first sentence of the last section. Thanks Ryan King for correcting me and Phil Pearson for confirming it.




Comments


  1. I was trying to hold back, but ah well.... Atom isn't an acronym, so why capitalize the whole thing?

    Anyhow, I believe MS is into using Atom as well as RSS, at least to some extent. Their reader will certainly understand both.

    Posted by: Michael Fagan | March 21, 2006 1:59 PM



  2. Good point Michael, I've de-capitalized Atom now :-)

    Posted by: Richard MacManus | March 21, 2006 2:04 PM



  3. My concern with this is that the value of content might go down. We haven't really seen this yet, but in time, it is inevitable. For example, one might post a listing for furniture in their blog, and edgeio will list it, thinking it is a valuable listing, and a link in the post will bring the visitor to their furniture affiliate link.

    This is a very "general" example, but the concern is there and with more companies and web sites on the path to doing this I just hope they have some sort of filter in place. For their sake and my own :)

    Posted by: D. Berube | March 21, 2006 2:05 PM



  4. Keep up the good work!

    Posted by: Andrew Parker | March 21, 2006 2:15 PM



  5. I have really liked the idea of MicroContent and StructuredBlogging for a long time. What I don't yet understand is how to provide an interface for using these.

    I can envision the interface for consuming these new nuggets of content (indeed it is backward compatible); I just have a really hard time seeing how this will not make content creation much more difficult.

    Is there going to be a part in this series where you talk about creating MicroFormatted content? One of the reasons there is so much blog content is that it is so easy. (The same can be said for TiVo, whose success seems to have a lot to do with the remote control.) StructuredBlogging and MicroFormats just seem to have a much higher barrier to entry.

    Posted by: Jackson | March 21, 2006 2:16 PM



  6. I think it will still take some time before people really grok stuff like microcontent, but then it will be a blast.

    now gimme that free book :)

    Posted by: Nico Lumma | March 21, 2006 2:21 PM



  7. I think the key element to the proliferation of microcontent is that technology companies and ventures need to realize that microcontent design and the various formats are nothing more than technical details. As you noted:

    "Either way you look at it, RSS (including Atom) and XML are the de-facto formats for data in the Web 2.0 world. If you release your data in those formats, that’s step one in the data design process."

    This is, of course, the crux to the process of pushing microcontent to the masses, because the normal user has absolutely zero interest in understanding Atom vs. RSS.

    The particular standard that is set is not nearly as important as SETTING a standard that is sufficient. Then people can move forward in building technologies and services that obscure the details of the standard, opening these technologies for actual mass adoption. Otherwise, all of us early adopters will just sit in a circle talking amongst ourselves.

    Posted by: Jack Chou | March 21, 2006 2:43 PM



  8. Microformats have loads of potential, my worry is how many formats people will create. A format is only useful if it's used across the board - the last thing we want is to have 101 similar formats when 1 would have sufficed.

    I think as long as developers/designers are aware of what is out there and how to successfully implement the formats then they could offer a great way of formatting content to be used everywhere.

    By not limiting data to restricted formats (such as displaying on a html page) but having the ability to output your data to any meta language you instantly open your content to the world. Whether your presentation layer is aimed at mobile or digital tv or web.

    Posted by: Steve C | March 21, 2006 2:45 PM



  9. Microformats definitely have a long way to go. Frankly this is what everyone has needed for ages. The whole idea of exchanging data (er information) using non-proprietary standards is revolutionary to some extent.

    This is similar to getting people to speak or express in the same language. However, an important distinction is that this "language" would have evolved after a democratic process and will be open. But this process will not be as seamless as it may seem. Since there will always be conflicting standards for exchanging same type of data. No two people think alike, and then no two groups either. So it will be hard to agree on a common set of standards for exchange of a particular type of data. However as its popularity grows, we will definitely see some patterns for improving the entire process.

    It will definitely be a great experiment to watch out for.

    Posted by: Prashant Chaudhary | March 21, 2006 2:52 PM



  10. A lot of resources here about microformats: http://developers.technorati.com/wiki/MicroFormatsPresentations

    Hope I could read the 37signals book ;)

    Posted by: henri | March 21, 2006 2:56 PM



  11. While microcontent is certainly more convenient, I wish there was more structure. There's too much noise. Quality content has become easier to find, but so has crappy content. And don't even get me started on RSS spam and splogs.

    Posted by: Fred Simmons | March 21, 2006 2:58 PM



  12. Wow, that was quick! Here are the 5 winners of the 37Signals book. These are the first 5 who posted *informative* comments (1-liners and generic statements don't count, sorry):

    Comment #1: Michael Fagan
    Comment #3: David Berube
    Comment #5: Jackson
    Comment #7: Jack Chou
    Comment #8: Steve C

    Thanks all for commenting and apologies if you missed out this time.

    Even though the books are taken now, please feel free to comment more! :-)

    Posted by: Richard MacManus | March 21, 2006 3:21 PM



  13. Has anyone put together a directory of existing microformats? It will be interesting to see how differences across formats are resolved when more than one group defines a schema for a given type of information. Hopefully we won't end up in a situation where we have many competing formats, similar to the various versions of RSS and Atom today.

    Posted by: Hans | March 21, 2006 3:22 PM



  14. @ Hans, take a look at microformats.org where much of this work is going on.

    Also, I think that the microformats process actually dispels much of the me-too phenomenon of reinventing a rounder wheel every time someone wants to semantically mark up content... we aim at 80% solutions that are good enough to get adopted -- not the .00002% edge cases that only the geekiest of geeks care about.

    The future of microformats is assured -- and yes, they will be coming to a browser near you sooner than you might think.

    Posted by: Chris Messina | March 21, 2006 3:58 PM



  15. I tend to think of microformats as patterns and many of them, for instance hReview, are a natural fit translated as RSS/Atom feeds. Then your "machine" is easy, it's just a feed aggregator/spider. My list of resources for Web developers is one big categorized list of hReviews, with feeds for each category and an OPML of the whole thing.

    Posted by: Douglas Clifton | March 21, 2006 8:42 PM



  16. Hey great post Richard - Manual Trackback!
    http://www.touchstonegadget.com/blog/2006/03/rss-has-good-enough-quality-that-made.html

    Posted by: Chris - Touchstone Gadget | March 22, 2006 4:24 AM



  17. Having shouted for years about nested structures good, loose markup bad, how much more like Animal Farm can it be!

    If you turn out to be right - there'll be
    some red faces around after the volte face.

    Since much of the content we deal with needs structure applying to basic markup, I suspect your analysis could be broadly accurate!

    Posted by: Dave Pawson | March 22, 2006 4:27 AM



  18. This discussion needs to extend a bit further to include REALLY structured blogging via enclosures containing not just audio and video, but images, maps and detailed tabular structured data as well. Consider, for example, a sales catalogue with pictures, maps, audio-visual sales pitches and very detailed,tabular product specifications. Product reviews are nice, but catalogues are the most common structured data, and the data that really counts at the point of purchase. Enclosing an audio or video file in an RSS/Atom catalogue feed (pod/vid casting) will get some of the above sales messages across, but can we realistically do the structured data and multiple images part with extended industry/product-specific tagging schemes any time soon? XML schema exist for only a few industries, mostly financial products that don't need pictures or maps or much explanation. But even if a complete schema exists, is there really a compelling business reason for putting complete sets of micro-formatted/tagged data on the open web?

    IF you want to make your data available to anyone for free, and you don't care what other people do with it, including drawing ad-supported traffic away from your site/product to their harvester/aggregator site advertising alternatives and making phony comparisons, then by all means seek out some kind of tagging standard and publish every single attribute you can find a tag for.

    But let's face it, really structured data is not reviews and calendars...which is why current (mis-named) initiatives focus on simple attribute text reviews (rating=1 star), Joe Bloggs' breathless blog text, and trendy tech buzzword tags only a few people use. Type any 15 digit product identifier into Technorati and see what you get...also try 'Java'...there was a great post (sorry no link) about the different meanings of the tag 'java' across the leading social tagging sites.

    REAL business involves VERY structured data with no room for ambiguity...detailed product specifications, prices, serial numbers for instruction manuals, etc. Businesses will prefer to ENCLOSE this kind of carefully scrubbed, completely structured data - replete with images, detailed feature sets, explanatory text/audio clips and non-memorable 15 character alphanumeric codes.

    REAL structured blogging is about tagging ENCLOSURES with standard metadata tags that enable anyone to find and subscribe to an easy to visualise and navigate downloadable enclosure file containing structured data,text,images,maps, and linked audio/video clips, with shopping basket and web services menu attached. We at Iokio call this client-side container architecture for structured blogging 'datacasting' and we have developed a general purpose 'datacasting' container we call Omniscope.

    Container design is far from dead, but it HAS moved away from the server-side web page as container. However, adding more detailed tags to the web page/feed is not the answer. We at Iokio are addressing the container design challenge for structured data by trying to develop the best type of portable web feed enclosure for downloading and presenting structured data, text, images, and maps with links to audio and video in navigable, searchable form. Iokio believe that Excel is not the ideal container for 'datacasting', nor Acrobat either...database files are non-standard, bloated and require SQL. Iokio Omniscope is much closer to the container of the future for 'structured blogging' in the client-side 'datacasting' implementation.

    In conclusion the server-side web page is indeed no longer the only container, and it must certainly become a means of posting and delivering some searchable metadata on the open web. But forget the idea of tagging all the relevant data in the feed...the standards will never be detailed enough, and content owners will soon have second thoughts about wanting to give it all away for harvesting and re-use by anyone for any purpose for free.

    Rather than master ever-more detailed and fragile structured data schema, it will be simpler to start by using your web pages/feeds to effectively tag and deliver structured data bearing enclosures, and such structured data enclosures for rich media 'datacasting' already exist.

    Posted by: Thomas Bate | March 22, 2006 4:57 AM



  19. Damned I hoped to win a book from 37signals ;)

    Posted by: henri | March 22, 2006 1:41 PM



  20. Two unrelated thoughts:

    Democracy Player is another interesting channel for audio and video content.

    The default format for RSS isn't going to be RSS 2.0 or Atom. It's going to be Feedburner.

    Posted by: Ed Kohler | March 22, 2006 3:09 PM



  21. I am one of this blog's readers from China. I read your blog every day, but I have never posted a comment before. In fact, you have many fans : )

    Posted by: Kenvol | March 22, 2006 9:03 PM



  22. "As of writing there are no Structured Blogging aggregators available"

    We aggregate hReviews from around two hundred blogs at postgenomic.com, and it works quite well.

    The only problem is with uptake - it's perhaps easy to forget that outside of tech blogs most authors are only concerned with the content - what's HTML, X or otherwise?

    Structured blogging plugins should help with that, though.

    Posted by: Stew | March 25, 2006 2:59 AM



  23. What do you think of Flash and the web 2.0 development? For example, Flash interactive infographics like the ones we see in online news at The Guardian, El Mundo, etc. This phenomenon is very “non-web 2.0”, but is also predicted to be the future of online communication, and the way we interact with our digital world in the future… Where do you think these two things will converge?

    Posted by: matslykke | March 25, 2006 1:05 PM



  24. RSS has much broader uses than it is being put to today... but many people still think it's just a blogging tool...

    Posted by: Akram | March 27, 2006 12:01 PM



  25. Richard, a great post and I reckon you're looking in the right areas. But there's a big difference between what Tim Berners-Lee is talking about and RSS 2.0.

    RSS 2.0 is a thin wrapper around human-readable content; it's pretty awful as a format for delivery of arbitrary data because of its ambiguity. Atom, especially with the forthcoming Publishing Protocol, does have the potential for general data exchange (because it's well-defined) but with only a couple of exceptions I know of, that too is currently only being used for human-readable content.

    Structured Blogging and Microformats have more in common with the SemWeb ideas - they can carry directly machine-processable data like your iCal material or address book or geographical info etc, which all happens to be cleanly expressible in HTML.

    Using microformats as a syndication payload offers a big step forward, but to be able to usefully integrate information from diverse sources, a shared data model such as RDF (or something very like it) is necessary. Formats by themselves are not enough.

    Posted by: Danny | April 1, 2006 3:00 PM


