Written by Alex Iskold and edited by Richard MacManus. In this post Alex tests out and explores the emergent world of Yahoo! Pipes. He sees some interesting parallels with Relational Databases in the 90's, concluding that with pipes, the Web essentially becomes a giant database that can be queried and remixed in any number of ways.
One of the central concepts in Complex Systems is Emergence. It is this automagical process through which elements of a system give rise to a higher order system. Emergence is how physics becomes chemistry and chemistry becomes biology. It is how web 1.0 evolved into web 2.0, and how that, in turn, will become the next web.
While the exact mechanics of emergence are complicated and far from completely understood, scientists know that a new system emerges from the combination of its elements and their interactions. In other words, complex systems are really networks - where elements interact with each other and give rise to a new system.
Perhaps today we are witnessing one of the most vivid examples of emergence - the remixing of the world wide web. The parts of the new web have crystallized - blogs, photos, video, audio, maps, RSS, social network profiles and even plain old HTML pages have formed an impressive network that can now be mined and remixed. Mashups are really nothing new; the web has been a programmable oyster for at least a few years now.
What is new, though, is the recent systematic thinking about the web as a database. A few companies, including Dapper, have been working on the problem. But with the recent launch of Yahoo! Pipes, we are beginning to see the real power of remixing.
The Web is just a vast database of information. Every day, we interact with it without thinking about that too much. We simply take our best query tool, usually called Google, and fire away. Yet decades before the web arrived, a different kind of database revolutionized our lives. The Relational Database qualifies as one of our best computer science inventions. Lesser known to the non-techie crowd, it nowadays quietly stores terabytes of information behind most familiar ecommerce and corporate sites.

Microsoft Access Circa 1999
But Relational Databases are remarkably simple. They are collections of tables (structured data) that can be joined (mixed) together via keys to produce a new set of results. For example, the table of sales can be joined with the table of employees to produce a report of who sold what. By combining the tables in various ways, programmers are able to bring seemingly hidden information into the spotlight (think emergence). For example, by combining the sales information with employee records and their geographical locations, one can determine the best salespeople in each country.
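To make the join idea concrete, here is a minimal sketch in Python using the built-in sqlite3 module. The table names, columns and sample rows are invented for illustration; they are assumptions, not data from any real system.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, made-up tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
        CREATE TABLE sales (employee_id INTEGER, amount REAL);
        INSERT INTO employees VALUES
            (1, 'Ann', 'US'), (2, 'Bob', 'US'), (3, 'Chloe', 'France');
        INSERT INTO sales VALUES
            (1, 100), (1, 250), (2, 300), (3, 80), (3, 40);
    """)
    return conn


def sales_by_employee(conn):
    """Join sales to employees via the employee key: who sold how much."""
    rows = conn.execute("""
        SELECT e.name, e.country, SUM(s.amount) AS total
        FROM sales AS s
        JOIN employees AS e ON e.id = s.employee_id
        GROUP BY e.id, e.name, e.country
        ORDER BY total DESC, e.name
    """)
    return rows.fetchall()


def best_seller_per_country(conn):
    """Rows arrive sorted by total, so the first one seen per country wins."""
    best = {}
    for name, country, total in sales_by_employee(conn):
        best.setdefault(country, (name, total))
    return best
```

Calling best_seller_per_country(build_demo_db()) gives one entry per country, each derived from two tables that say nothing about "best" on their own.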
Another thing that Relational Databases are famous for is visual query and UI tools. Because databases are so simple, and the data is well structured, people have created GUI builders like Visual Basic or PowerBuilder to automate the UI for fetching and exploring the data. We got so good at mapping databases to the UI that it has been quite a boring thing to do since about 1997.
Well, now Yahoo! is making this whole business cool again, by changing the rules of the game - the Web is now the new database.

Yahoo! Pipes Circa 2007
Yahoo! Pipes is a remarkable offering that was announced last week. It is the first GUI builder for the biggest database in the world, the Web itself. When compared to Visual Basic and PowerBuilder, Yahoo! Pipes comes out as more inventive and no less rigorous than its predecessors. It empowers developers to remix the building blocks of the web in a whole new way. And it does it with remarkable simplicity.
In Yahoo! Pipes, what used to be a table in the relational database is now a web page, an RSS feed, etc. The current list of sources includes Yahoo! Search, Yahoo! Local, Fetch (RSS feeds), Google Base and Flickr. Each source can be searched or queried using either pre-defined or user-defined parameters. For example, there can be a search of all French restaurants in Chicago via Yahoo! Local. The data sources and the searches can be mixed together (think emergence), using a rich set of operators. Among them are the iterator (which lets the user loop through the results), a counter and many other functions that facilitate cleaning, manipulating and recombining the information.
By bringing together many sources and operators, the user can build sophisticated queries that fetch interesting, non-obvious information from the web. For example, one can build a pipe that extracts the listings of all French restaurants in Chicago, along with their Flickr photos. Since the underlying data is virtually limitless and the set of operators is quite powerful, the number of interesting possible pipes is vast. And for this reason, unlike its predecessor the Relational Database, Yahoo! Pipes will never get boring.
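The way sources and operators compose can be sketched in a few lines of Python. This is not how Yahoo! Pipes is implemented, and it calls no real Yahoo! or Flickr API: the two source functions return hard-coded sample data, and every name here (local_search, photo_search, attach_photos) is hypothetical.

```python
def local_search(cuisine, city):
    """Stand-in for a local-listings source; returns made-up sample listings."""
    listings = [
        {"name": "Chez Exemple", "cuisine": "french", "city": "Chicago"},
        {"name": "Sample Diner", "cuisine": "american", "city": "Chicago"},
        {"name": "Bistro Demo", "cuisine": "french", "city": "Chicago"},
    ]
    return (r for r in listings if r["cuisine"] == cuisine and r["city"] == city)


def photo_search(text):
    """Stand-in for a photo source; matches sample photos by title."""
    photos = [
        {"title": "Dinner at Chez Exemple", "url": "http://example.com/1.jpg"},
        {"title": "Bistro Demo patio", "url": "http://example.com/2.jpg"},
        {"title": "Chez Exemple dessert", "url": "http://example.com/3.jpg"},
    ]
    return [p for p in photos if text.lower() in p["title"].lower()]


def attach_photos(items):
    """Iterator-style operator: run a photo search for each item flowing by."""
    for item in items:
        yield {**item, "photos": photo_search(item["name"])}


def french_restaurants_with_photos(city):
    """The whole pipe: one source feeding one operator."""
    return list(attach_photos(local_search("french", city)))
```

Matching photos by title is deliberately naive here, and that is exactly the kind of step where imprecise results creep in.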
Yahoo! Pipes are cool, but they have room to evolve. The biggest issue is that, unlike in Relational Databases, the data is neither structured nor clean. For example, how can we ensure that Flickr pictures of restaurants in Chicago will be the right ones? We really cannot. The same problem will exist in all pipes, simply because the underlying data online is not as precise and polished as data usually is in a Relational Database. What are the consequences of this? Well, users currently forgive some imprecision in tags on Flickr and del.icio.us, yet they expect near perfect answers from Google. So having precise instruments to clean the data in the pipes would go a long way.
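One possible shape for such a cleaning instrument is a filter that normalizes tags before comparing them. The sketch below is only an illustration in Python: the photo records, their "tags" field and the function names are assumptions, not the format any real service returns.

```python
# Made-up sample records; the "tags" field is an assumed format.
SAMPLE_PHOTOS = [
    {"id": 1, "tags": ["Chez Exemple", "Chicago", "dinner"]},
    {"id": 2, "tags": ["chez-exemple", "paris"]},
    {"id": 3, "tags": ["chezexemple", "CHICAGO"]},
]


def normalize(tag):
    """Lowercase a tag and keep only letters and digits."""
    return "".join(ch for ch in tag.lower() if ch.isalnum())


def likely_matches(photos, restaurant, city):
    """Keep photos whose normalized tags include both restaurant and city."""
    wanted = {normalize(restaurant), normalize(city)}
    kept = []
    for photo in photos:
        tags = {normalize(t) for t in photo["tags"]}
        if wanted <= tags:  # every wanted tag is present
            kept.append(photo)
    return kept
```

A filter like this narrows the results but cannot guarantee them: a photo tagged with the right name and city may still show something else entirely.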
Another, very different, axis for the evolution of the pipes is to make them usable by a less technical crowd. As it stands right now, like Relational Databases, the pipes require a techie brain to be used efficiently. Yet, it seems like there is a possibility, particularly from the user interface and operator simplification point of view, to make this tool usable by moms and pops. But even if not, judging again from the Relational Database, getting wide adoption in the technical community would be just fine.
So what is the catch - why did Yahoo do it? The answer is the same old: search and ads. The majority of the current data sources are from Yahoo! and so that means Yahoo! will get the ad revenue when the pipes are run. So empowering thousands of enthusiastic techies to remix the web using Yahoo's data is a great idea.
Will this work? Will developers start using pipes? At the time of this writing there are over 5,000 pipes, which is an impressive number given that the application is not even a week old. But we should check in a month or so to see how things unfold. Certainly the key to its success will be polishing the UI and adding new operators and data sources. Since Yahoo! is known for its good design and focus on the user experience, it is likely that we will see the pipes improving in that regard over time.
Please give the pipes a try if you have not done so yet, and let us know what you think is going to happen to them over time.
Comments
"So what is the catch - why did Yahoo do it? The answer is the same old: search and ads. "
I don't think that's the whole reason. There are a couple of other reasons which spring to mind:
1) Attention data: which feeds are popular, who is interested in what topics. That's valuable data.
2) Platform Strategy 2.0: By creating a developer platform Yahoo can build platform lock-in. That in turn is going to drive traffic, revenue and provide ways to see exactly who is doing what all over the web.
Posted by: Nick Lothian | February 14, 2007 3:41 AM
Thank you for this post. Finally I understand what the pipes are all about. I believe that taking advantage of the web as the source of all content, or as the DB as you mention, will be the next step in the evolution of the internet.
Posted by: Lars Teigen | February 14, 2007 4:23 AM
Agree with Lars - fantastic post that captures the big picture vision behind the admittedly early beta.
Posted by: Bradley Horowitz | February 14, 2007 6:55 AM
I've given it a try, and written 5 pipes to show the capabilities of this great service. You can find it over at my site.
By the way, I'm willing to bet that the first new feature Yahoo will add to the Pipes is the ability to look at the actual code. The GUI is cool at first, but it gets really annoying when you try to do some complex pipes.
Also, the ability to do a thorough analysis of the feed content is sorely lacking. If anyone needs help with building a pipe, don't hesitate to contact me, I've tested it pretty extensively and I know what can and cannot be done.
Posted by: franticindustries | February 14, 2007 10:25 AM
Alex, I agree entirely. However there is already something very similar to the relational model, designed specifically with the web in mind : RDF. This meshes well with the pipes idea (blogged). There's still the problem of user-friendly interfaces, but they're on their way ;-)
Posted by: Danny | February 14, 2007 11:26 AM
Completely off-topic minor quibble: Your Access relationship diagram is Access 2000 (which is obvious from the icon in the upper left corner), so that would be circa 1999.
Of course, the relationship window in Access is basically unchanged from at least Access 2 (released in 1994).
Posted by: David W. Fenton | February 14, 2007 11:55 AM
@Danny,
This is an interesting point. I actually think that despite the fact that RDF is more canonical, it would be difficult to move the entire web to use it. How do you envision this happening?
Also, in a sense RDF is a graph, the data in feeds and pages are reacher, it does not necessarily lend itself onto a graph.
Alex
Posted by: Alex Iskold | February 14, 2007 12:30 PM
Short answer: the entire web doesn't need to use it to get significant benefits, and a lot of web data is already interpretable as RDF - e.g. this page itself is rich with data - explicit stuff, loads of links, there's even a feed, it's all mappable to RDF. Long answer: hmm, where to start?? - ok, I just put these two docs online, they give the Web 2.0 kind of angle (IMHO): The Shortest Path to the Future Web and From Here to There (PDFs)
Not sure I get your last sentence - reacher=richer? Anyhow the web is a graph, feeds & (XML/HTML) docs are tree structures which can be mapped onto a graph (RSS 0.9 and 1.0 were both RDF, only 0.91 & 2.0 watered things down).
Posted by: Danny | February 14, 2007 2:15 PM
Nice article. I would like to know, how many people will really use this Yahoo Pipes?
Posted by: PohEe.com | February 14, 2007 5:18 PM
@9 - that remains to be seen. We should check in a few months.
Alex
Posted by: Alex Iskold | February 14, 2007 6:20 PM
Yahoo Pipes cannot replace relational databases. It looks like a hosted aggregator service that lets you mix and match information from already existing sources.
What if you want to become a source? Then you will probably have to design a database of some kind.
I like the GUI tool, it's kewl.
Posted by: Adnan Rana | February 14, 2007 6:21 PM
So the internet is, in fact, a series of tubes.
Posted by: Adrian Pike | February 14, 2007 6:30 PM
Don't forget you can write your own operators and functions!
Because it can fetch from any URL, it's really not hard at all to extend pipes to do whatever you want with feeds, should you have a web server.
For example, I needed some non-RSS data transformed, well I just wrote my own fetching PHP script with a query string parameter to pass and voila, now I have the widget I want for my pipe. Using this, it would be possible to add things like feed list intersections, your own data normalizations, etc, etc.
You can also use services like Dapps (www.dappit.com) and others to help transform existing website data into something useful for pipes to consume.
I really want to see people writing their own sites to plug into pipes to make them more powerful. That's where a huge amount of power comes in-- above and beyond what's already out there formatted for RSS readers.
Posted by: Mattie | February 14, 2007 7:24 PM
As an old mainframe database programmer I can see the relationship of the web to a database. Pipes is an innovative project but it does need to become more user friendly before the masses will even try to use it.
Posted by: Cathy | February 14, 2007 8:06 PM
It has been a while since I've seen anything innovative from Yahoo.
This is the kind of thing I would expect Google to do.
This has a lot of potential, let's hope Yahoo doesn't ruin it.
Although I don't mind the complexity, if Yahoo ever intends for the masses to use it, it needs to become more user friendly.
Posted by: Emmanuel | February 15, 2007 12:18 AM
The GUI is so slick!
The best addon I liked was this webpage to rss pipe http://pipes.yahoo.com/pipes/NlNQKdO62xGAq1ZgZoQMOQ
If only Yahoo plays their cards right, before Google catches up ;)
Posted by: James Oister | February 15, 2007 6:31 AM
Great tool put up by Yahoo..
but can be pretty complicated for a normal user to use...
Posted by: Sunny Tan | February 17, 2007 2:43 AM
feedGod has had a simplified version of Yahoo's "Pipes" available for a while now. Not as many options as Yahoo's but it's much easier to use and includes thousands of built in feeds, whereas with Pipes you have to find all of your own feeds to "Pipe".
Posted by: Matt | February 20, 2007 8:44 AM
please provide a printer-friendly version of articles.
firefox 2 on windows xp prints three pages for this article:
1. almost blank page with R/WW top banner
2. article text, until "since about 1997"
3. almost blank page with copyright notice
Posted by: OFF TOPIC - PROVIDE A PRINTER FRIENDLY VERSION | February 22, 2007 4:42 AM