This week ReadWriteWeb will run a series of posts detailing what we think are the 5 biggest, most cutting-edge Web trends to come out of 2009. We'll be posting one trend analysis per day. Then at the end of the week we'll publish a major update to our standard presentation about web technology trends.
The first major Web trend we're looking at is Structured Data. In prior presentations, this has sometimes been referred to under the umbrella term of 'Semantic Web'. However, given the way 2009 has panned out so far, it's become clear that this trend is much more than the Semantic Web. In this post, we'll analyze the developments in Structured Data this year and provide you with 3 product examples: OpenCalais, Google, and Wolfram|Alpha.
Tim Berners-Lee said in February this year that we're now in a Web of Data, rather than a Web of Documents. The organization that Berners-Lee heads, the W3C, has heavily promoted two key initiatives that are helping to build this Web of Data: the Semantic Web and more recently Linked Data.
However, over the past few years, we've seen that there are many other ways to structure data and enable others to build off it. The best current example is surely Twitter, whose API has historically been responsible for around 90% of Twitter's activity - via third party apps.
The basic principle of the Web of Data is still the same as what Alex Iskold articulated on ReadWriteWeb back in March 2007: "unstructured information will give way to structured information - paving the road to more intelligent computing."

Our first example product, OpenCalais, is probably the best current example of Linked Data (which is a type of structured data endorsed by W3C). Thomson Reuters, the international business and financial news giant, launched an API called OpenCalais in Feb '08. In a nutshell, OpenCalais turns unstructured HTML into semantically marked up data. It orders data into groups such as 'people,' 'places,' 'companies' and more. This way, third party applications and sites can build interesting new things from that data - one of the defining principles of Linked Data.
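To make the idea of ordering unstructured HTML into groups more concrete, here is a minimal toy sketch in Python. It is not the OpenCalais API and does not call any real service: the lookup table `KNOWN_ENTITIES`, the helper names `strip_tags` and `group_entities`, and the sample sentence are all assumptions invented for illustration. A real service would use far more sophisticated natural language processing than a dictionary lookup.

```python
import re

# Hypothetical lookup table standing in for a real entity-recognition service.
KNOWN_ENTITIES = {
    "Tim Berners-Lee": "people",
    "Thomson Reuters": "companies",
    "Google": "companies",
    "London": "places",
}


def strip_tags(html):
    """Crudely replace HTML tags with spaces so only the visible text is left."""
    return re.sub(r"<[^>]+>", " ", html)


def group_entities(html):
    """Return the known entities found in the HTML, grouped by category."""
    text = strip_tags(html)
    groups = {"people": [], "places": [], "companies": []}
    for name, category in KNOWN_ENTITIES.items():
        if name in text:
            groups[category].append(name)
    return groups


# Illustrative input only; the sentence is made up.
sample = "<p><b>Thomson Reuters</b> opened a London office, Tim Berners-Lee said.</p>"
print(group_entities(sample))
```

The point of the sketch is the shape of the output: once text has been sorted into named groups, a third party application can work with 'people' or 'companies' directly instead of re-reading the prose.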
For a full explanation of Linked Data, read Alexander Korth's technical introduction The Web of Data: Creating Machine-Accessible Information from April 2009. I also explained the background and benefits of Linked Data in a May '09 post entitled Linked Data is Blooming: Why You Should Care.

In May this year, Google added structured data to its core search, in the form of a feature called 'Rich snippets.' Essentially this feature extracts and shows useful information from web pages, by way of structured data open standards such as microformats and RDFa. On launch in May, Google invited publishers to mark up their HTML. While it will take a while for this markup to become widespread, the fact that a huge company like Google implemented it shows the increasing importance of structured data on the Web.
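As a rough illustration of what marking up HTML with RDFa makes possible, the Python sketch below reads a small annotated fragment and pulls out the labelled values. The sample markup, the class name `PropertyExtractor` and the function `extract_properties` are assumptions made up for this example. This is not how Google's own extraction works, only a sketch of why `property` attributes make a page easier for a machine to read.

```python
from html.parser import HTMLParser

# Illustrative review markup; the business, reviewer and rating are invented.
SAMPLE_HTML = """
<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Review">
  <span property="v:itemreviewed">Example Pizza Place</span>
  reviewed by <span property="v:reviewer">Jane Doe</span>.
  Rating: <span property="v:rating">4.5</span>
</div>
"""


class PropertyExtractor(HTMLParser):
    """Collect the text inside every element that has a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # property name of the element we are inside, if any

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop:
            self._current = prop
            self.properties[prop] = ""

    def handle_data(self, data):
        if self._current:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: assumes annotated elements are never nested.
        self._current = None


def extract_properties(html):
    """Return a dict mapping each property name to its stripped text value."""
    parser = PropertyExtractor()
    parser.feed(html)
    return {name: value.strip() for name, value in parser.properties.items()}


print(extract_properties(SAMPLE_HTML))
```

Without the annotations, a program would have to guess which part of the sentence is the rating and which is the reviewer. With them, each value arrives already labelled.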

Other big companies are also heading in this direction - in particular, Yahoo was an early leader.
Ever since Wolfram|Alpha's much hyped launch in May, we've been tracking this innovative product closely. It's a self-described "computational knowledge engine" and while it's not quite the Google killer some predicted, it has many potential uses.
Wolfram|Alpha has a search engine-like interface, allowing you to type natural language statements into it. But the main part of the product is the computations you can do on data. The product is premised on using and computing data. If Web 2.0 was about creating data (a.k.a. user generated content), then the next generation of the Web is all about using that data.
We can see from the above three examples that structured data is rapidly becoming a feature of today's Web. Companies like Thomson Reuters and Google are enabling data to be structured, and new types of products (like Wolfram|Alpha) will make use of structured data in ways we perhaps can't imagine right now.
Comments
How come we all seem to be o.k. with the internet of Data being sold by companies that have as their core values profit over public good?
For me this is hypocritical. In one era of the internet we say that it is "evil" for a company to try to own the OS space, and now it is o.k. for a group of companies to control the data of the internet. What a crock !!!!!
Once again open standards taken over by Companies.
Are the terms "structured data" and "linked data" used interchangeably, and if not, what is the difference?
@Michael "linked data" is a type of "structured data".
No, these terms are far far apart.
"Linked data" is a formal set of strictures on how to use URIs in the context of the semantic web, another formal and well-defined term. Go to the W3C site and read the details if you are interested.
Structured data is a vague term that just means what it says, and has no real formal definition outside this article. Any database or XML document or tag-value table offers structured data, so the term is pretty loosey-goosey.
If you ever use one term as a synonym for the other, you are making a mistake.
Miramon: Thanks for answering my question with such clarity!
I really don't understand how any of the companies you have mentioned are part of some new emergent "real-time web". The web has always been real-time, or as close to it as technology would allow. Over time, technology improves, so we should expect latency times to decrease. There isn't some new media format evolving, it's just that tech is getting better, and people are becoming more comfortable using tech. Ken Fromm's definition of the "real-time web" is overblown and misguided. Can we please stop focusing on marketing jargon as if it represents legitimate entities?
Can see Alex Iskold's comment on 'intelligent computing' coming into play here - good point. Specifically that layers of data from Web services (e.g. Google snippets) are making it more like real-time informed computing through peripheral data.
However, it'll be interesting to see if/when we approach our threshold of how much real-time, embedded data is too much.
achat pc @ 6:
I've come to realize that these tech blogs thrive more on buzzwords and Gartner-promulgated hype terms than on technology. There's not too much that can be done about it -- hype sells, and journalists generally don't understand technology very well, so they tend to feed on themselves.
Still they do occasionally post something worthwhile....
Miramon: Would like to follow you. What is your Twitter user name?
Michael: I do not tweet, so following my account will only find a very rare reply.
Structured Data is indeed an exciting trend! New tools are appearing, creating a whole eco-system of cloud-based services for extracting and otherwise dealing with structured data.
Orchestr8 today launched a new structured data mining capability called 'Visual Constraints'. Constraints enable extraction of structured data (product information, pricing, descriptions, etc.) from web pages, using simple "natural language" queries, such as: "all links after product details"
http://www.reuters.com/article/pressRelease/idUS82425+10-Sep-2009+PRN20090910
Wow, Louis Vuitton must be scraping the bottom of the barrel if they are resorting to random posts to random blogs and forums. Very sad.
But I guess proceeds from the sale of one bag can pay for a thousand scum-sucking viral marketers for a year. Nice work if you can get it.
Hmmm. "Structured data" sounds very much like the way that librarians have been constructing, formatting, and compiling data about documents, books, media, etc. for a century or so.
I see this every day, as I am working on a semantic web database platform that uses structured data.
http://www.webepags.com
Structured data is very useful to help define & locate things quickly... but is that really the next evolution?
I think that we should be formulating an executable structure or code of knowledge...
I think this site has the best idea I've seen in a long, long time...
website home page:
www.knowgenes.com/home.aspx?nid=58
website example:
www.knowgenes.com/home.aspx?kgid=1144&nid=58
It actually structures knowledge using the 'code of knowledge'. It is machine executable & easily understood by humans as it centers around the natural learning questions of the brain: WHAT/HOW/WHY
The more you structure your data in standard formats, the more value you give away to the intermediary, which will display it all in their search results without giving you much value. It will also make it easier for well funded competitors to steal your work - without attribution, of course. Rather than giving away tons of raw data, it makes sense to put it in a format that is both branded and harder to copy without giving attribution - like an image with your logo on it.
Have a nice day!
Structured data has been around for quite a while. I think the question is what will come out of it.
@legitimate - what will come out of it? More products! Wouldn't you want your hands on tons and tons of structured data? I know I would...
I guess APIs and AJAX had their share in this. JSON is hot like fire, and it is the first thing that pops into my mind when you say structured data.
Structured data should be bottom-up, not top-down. If a company is going to try to "own" the structured data space, they should at a minimum attempt to do some public good with the revenue that they generate.
The other thing that I think all of these companies mentioned have wrong is that they seem to view linked data as something that only a computer or an expert can create. For the next phase of the Semantic Web to come about, tools that give non-technical users the ability to easily create linked data need to be created and adopted.
I am surprised your first mention is not Twitter.