dapper - ReadWriteWeb http://www.readwriteweb.com/feeds/tag/dapper en Copyright 2009 Richard MacManus readwriteweb@gmail.com Tue, 24 Nov 2009 10:13:22 -0800 http://www.sixapart.com/movabletype/?v=4.23-en http://blogs.law.harvard.edu/tech/rss

Semantic Web Patterns: A Guide to Semantic Technologies

In this article, we'll analyze the trends and technologies that power the Semantic Web. We'll identify patterns that are beginning to emerge, classify the different trends, and peek into what the future holds.

In a recent interview Tim Berners-Lee pointed out that the infrastructure to power the Semantic Web is already here. ReadWriteWeb's founder, Richard MacManus, even picked it to be the number one trend in 2008. And rightly so. Not only are the bits of infrastructure now in place, but we are also seeing startups and larger corporations working hard to deliver end user value on top of this sophisticated set of technologies.

Editor's note: Looking back over 2008, there were some posts on ReadWriteWeb that did not get the attention we felt they deserved - whether because of timing, competing news stories, etc. So in this end-of-year series, called Redux, we're resurrecting some of those hidden gems. This is one of them; we hope you enjoy (re)reading it!

The Semantic Web means many things to different people, because there are a lot of pieces to it. To some, the Semantic Web is the web of data, where information is represented in RDF and OWL. Some people replace RDF with Microformats. Others think that the Semantic Web is about web services, while for many it is about artificial intelligence - computer programs solving complex optimization problems that are out of our reach. And business people always redefine the problem in terms of end user value, saying that whatever it is, it needs to have simple and tangible applications for consumers and enterprises.

The disagreement is not accidental, because the technology and concepts are broad. Much is possible and much is to be imagined.

1. Bottom-Up and Top-Down

We have written a lot about the different approaches to the Semantic Web - the classic bottom-up approach and the new top-down one. The bottom-up approach is focused on annotating information in pages, using RDF, so that it is machine readable. The top-down approach is focused on leveraging information in existing web pages, as is, to derive meaning automatically. Both approaches are making good progress.

A big win for the bottom-up approach was the recent announcement from Yahoo! that their search engine is going to support RDF and microformats. This is a win-win-win for publishers, for Yahoo!, and for customers - publishers now have an incentive to annotate information because Yahoo! Search will be taking advantage of it, and users will then see better, more precise results.

Another recent win for the bottom-up approach was the announcement of the Semantify web service from Dapper (previous coverage). This offering will enable publishers to add semantic annotations to existing web pages. The more tools like Semantify that pop up, the easier it will be for publishers to annotate pages. Automatic annotation tools combined with the incentive to annotate the pages are going to make the bottom-up approach more compelling.

But even if the tools and incentive exist, making the bottom-up approach widespread is difficult. Today, the magic of Google is that it can understand information as is, without asking people to fully comply with W3C standards or SEO optimization techniques. Similarly, top-down semantic tools are focused on dealing with imperfections in existing information. Among them are the natural language processing tools that do entity extraction - such as the Calais and TextWise APIs that recognize people, companies, places, etc. in documents; vertical search engines, like ZoomInfo and Spock, which mine the web for people; technologies like Dapper and BlueOrganizer, which recognize objects in web pages; and Yahoo! Shortcuts, Snap and SmartLinks, which recognize objects in text and links.

[Disclosure: Alex Iskold is founder and CEO of AdaptiveBlue, which makes BlueOrganizer and SmartLinks.]

Top-down technologies are racing forward despite imperfect information. And, of course, they benefit from the bottom-up annotations as well. The more annotations there are, the more precise top-down technologies will get - because they will be able to take advantage of structured information as well.

2. Annotation Technologies: RDF, Microformats, and Meta Headers

Within the bottom-up approach to annotation of data, there are several choices for annotation. They are not equally powerful, and in fact each approach is a trade off between simplicity and completeness. The most comprehensive approach is RDF - a powerful, graph-based language for declaring things, and attributes and relationships between things. In a simplistic way, one can think of RDF as the language that allows expressing truths like: Alex IS human (type expression), Alex HAS a brain (attribute expression), and Alex IS the father of Alice, Lilly, and Sofia (relationship expression). RDF is powerful, but because it is highly recursive, precise, and mathematically sound, it is also complex.
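To make the three kinds of expression concrete, here is a minimal sketch in plain Python. It is not real RDF syntax and uses no RDF library; the predicate names (`type`, `has`, `fatherOf`) and the `match` helper are our own illustrative assumptions. It only shows the underlying idea that every statement is a subject-predicate-object triple.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# The predicate names are illustrative assumptions, not RDF vocabulary terms.
triples = {
    ("Alex", "type", "Human"),      # type expression
    ("Alex", "has", "Brain"),       # attribute expression
    ("Alex", "fatherOf", "Alice"),  # relationship expressions
    ("Alex", "fatherOf", "Lilly"),
    ("Alex", "fatherOf", "Sofia"),
}


def match(subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


children = [o for _, _, o in match("Alex", "fatherOf")]
print(children)
```

Because every fact has the same shape, one generic pattern match can answer questions about types, attributes, and relationships alike.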

At present, most use of RDF is for interoperability. For example, the medical community uses RDF to describe genomic databases. Because the information is normalized, the databases that were previously silos can now be queried together and correlated. In general, in addition to semantic soundness, the major benefit of RDF is interoperability and standardization, particularly for enterprises, as we will discuss below.

Microformats offer a simpler approach by adding semantics to existing HTML documents using specific CSS class names. The metadata is compact and is embedded inside the actual HTML. Popular microformats are hCard, which describes personal and company contact information, hReview, which adds meta information to review pages, and hCalendar, which is used to describe events.
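As a rough illustration of how a program might read an hCard, the sketch below uses only Python's standard `html.parser`. The sample markup, the `HCardParser` class, and the small set of properties it looks for are assumptions made for this example; a real microformats parser handles nesting and many more properties.

```python
from html.parser import HTMLParser

# A few hCard property class names; real hCards define many more.
HCARD_PROPERTIES = {"fn", "org", "tel", "url"}


class HCardParser(HTMLParser):
    """Collects the text of elements whose class names an hCard property."""

    def __init__(self):
        super().__init__()
        self.card = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in HCARD_PROPERTIES:
                self._current = name

    def handle_data(self, data):
        if self._current and data.strip():
            self.card[self._current] = data.strip()

    def handle_endtag(self, tag):
        # Simplification: assumes properties are flat, not nested.
        self._current = None


def parse_hcard(html):
    parser = HCardParser()
    parser.feed(html)
    return parser.card


sample = (
    '<div class="vcard">'
    '<span class="fn">Alex Iskold</span>, '
    '<span class="org">AdaptiveBlue</span>'
    "</div>"
)
card = parse_hcard(sample)
print(card)
```

The page still renders as ordinary HTML for people, while the class names give software something predictable to look for.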

Microformats are gaining popularity because of their simplicity, but they are still quite limiting. There is no way to describe type hierarchies, which the classic semantic community would say is critical. The other issue is that microformats are somewhat cryptic, because the focus is to keep the annotations to a minimum. This, in turn, brings up another question of whether embedding metadata into the view (HTML) is a good idea. The question is: what happens if the underlying data changes after someone makes a copy of the HTML document? Nevertheless, despite these issues, adoption keeps growing. Microformats are currently used by Flickr, Eventful, and LinkedIn; and many other companies are looking to adopt microformats, particularly because of the recent Yahoo! announcement.

An even simpler approach is to put metadata into the meta headers. This approach has been around for a while and it is a shame that it has not been widely adopted. As an example, the New York Times recently launched extended annotations for its news pages. The benefit of this approach is that it works great for pages that are focused on a topic or a thing. For example, a news page can be described with a set of keywords, geo location, date, time, people, and categories. Another example would be for book pages. O'Reilly.com has been putting book information into the meta headers, describing the author, ISBN, and category of the book.
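Reading meta headers takes very little code, which is part of their appeal. The sketch below is an assumption-laden example: the page markup, the meta names (`author`, `isbn`, `category`), and their values are made up for illustration and are not taken from any real site.

```python
from html.parser import HTMLParser


class MetaHeaderParser(HTMLParser):
    """Collects <meta name="..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content:
            self.meta[name.lower()] = content


def read_meta(html):
    parser = MetaHeaderParser()
    parser.feed(html)
    return parser.meta


# Hypothetical book page; names and values are invented for this example.
page = """
<html><head>
<title>Example Book</title>
<meta name="author" content="Jane Doe">
<meta name="isbn" content="0000000000">
<meta name="category" content="Programming">
</head><body></body></html>
"""
meta = read_meta(page)
print(meta)
```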

Despite the fact that all these approaches are different, they are also somewhat complementary; and each of them is helpful. The more annotations there are in web pages, the more standards are implemented, and the more discoverable and powerful the information becomes.

3. Consumer and Enterprise

Yet another dimension of the conversation about the Semantic Web is the focus on consumer and enterprise applications. In the consumer arena we have been looking for a Killer App - something that delivers tangible and simple consumer value. People simply do not care that a product is built on the Semantic Web; all they are looking for is utility and usefulness.

Up until recently, the challenge has been that the Semantic Web focused on rather academic issues - like annotating information to make it machine-readable. The promise was that once the information is annotated and the web becomes one big giant RDF database, then exciting consumer applications would come. The skeptics, however, have been pointing out that first there needs to be a compelling use case.

Some consumer applications based on the Semantic Web: generic and vertical search, contextual shortcuts and previews, personal information management systems, semantic browsing tools. All of these applications are in their early days and have a long way to go before being truly compelling for the average web user. Still, even if these applications succeed, consumers will not be interested in knowing about the underlying technology - so there is really no marketing play for the Semantic Web in the consumer space.

Enterprises are a different story for a couple of reasons. First, enterprises are much more used to techno speak. To them utilizing semantic technologies translates into being intelligent and that, in turn, is good marketing. 'Our products are better and smarter because we use the Semantic Web' sounds like a good value proposition for the enterprise.

But even above the marketing speak, RDF solves a problem of data interoperability and standards. This "Tower of Babel" situation has been in existence since the early days of software. Forget semantics; just a standard protocol, a standard way to pass around information between two programs, is hugely valuable in the enterprise.

RDF offers a way to communicate using an XML-based language that, on top of that, has sound mathematical elements to enable semantics. This sounds great, and even the complexity of RDF is not going to stop enterprises from using it. However, there is another problem that might stop it - scalability. Unlike relational databases, which have been around for ages and have been optimized and tuned, XML-based databases are still not widespread. In general, the problem is in the scale and querying capabilities. Like object-oriented database technologies of the late '90s, XML-based databases hold a lot of promise, but we have yet to see them in action in a big way.

4. Semantic APIs

With the rise of Semantic Web applications, we are also seeing the rise of Semantic APIs. In general, these web services take as an input unstructured information and find entities and relationships. One way to think of these services is mini natural language processing tools, which are only concerned with a subset of the language.

The first example is the Open Calais API from Reuters that we have covered in two articles here and here. This service accepts raw text and returns information about people, places, and companies found in the document. The output not only returns the list of found matches, but also specifies places in the document where the information is found. Behind Calais is a powerful natural language processing technology developed by ClearForest (now owned by Reuters), which relies on algorithms and databases to extract entities out of text. According to Reuters, Calais is extensible, and it is just a matter of time before new entities are added.
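To give a feel for the shape of that output (matches plus their positions in the text), here is a toy extractor. It is emphatically not the Calais API or its algorithm: it is a simple dictionary lookup, and the `GAZETTEER` table, the `extract_entities` function, and the result fields are all assumptions made for this sketch.

```python
import re

# Tiny hand-made lookup table; a real service relies on far richer
# language processing and databases than this.
GAZETTEER = {
    "Tim Berners-Lee": "Person",
    "France": "Place",
    "Reuters": "Company",
}


def extract_entities(text):
    """Return known entities with their type and character offsets."""
    results = []
    for name, kind in GAZETTEER.items():
        for found in re.finditer(re.escape(name), text):
            results.append({
                "entity": name,
                "type": kind,
                "start": found.start(),
                "end": found.end(),
            })
    return sorted(results, key=lambda r: r["start"])


text = "Tim Berners-Lee spoke in France about Reuters."
entities = extract_entities(text)
for e in entities:
    print(e["type"], e["entity"], e["start"], e["end"])
```

The offsets are what let a client application highlight or link the entity in place rather than just list it.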

Another example is the SemanticHacker API from TextWise, which is offering a one million dollar prize for the best commercial semantic web application developed on top of it. This API classifies information in documents into categories called semantic signatures. Given a document, it outputs entities or topics that the document is about. It is kind of like Calais, but also delivers a topical hierarchy, where the actual objects are leaves.

Another semantic API is offered by Dapper - a web service which facilitates the extraction of structure from unstructured HTML pages. Dapper works by enabling users to define attributes of an object based on the bits of the page. For example, a book publisher might define where the information about author, ISBN and number of pages is on a typical book page and the Dapper application would then create a recognizer for any page on the publisher site and enable access to it via REST API.

While this seems backwards from an engineering point of view, Dapper's technology is remarkably useful in the real world. In a typical scenario, for websites that do not have clean APIs to access their information, even non-technical people can build an API in minutes with Dapper. This is a powerful way of quickly turning websites into web services.
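The book-page scenario can be sketched in a few lines. We do not know how Dapper builds its recognizers internally, so the following is only an assumed stand-in: a hand-written map from field names to regular expressions, applied to any page that shares the same layout. The markup, the `FIELD_PATTERNS` table, and the `recognize` function are hypothetical.

```python
import re

# One pattern per field, standing in for the user's point-and-click choices.
FIELD_PATTERNS = {
    "author": re.compile(r'<span class="author">(.*?)</span>'),
    "isbn": re.compile(r'<span class="isbn">(.*?)</span>'),
    "pages": re.compile(r'<span class="pages">(\d+)</span>'),
}


def recognize(html):
    """Turn one page with the expected layout into a structured record."""
    record = {}
    for field, pattern in FIELD_PATTERNS.items():
        found = pattern.search(html)
        record[field] = found.group(1) if found else None
    return record


# Hypothetical book page; all values are invented for this example.
book_page = (
    '<h1>Example Book</h1>'
    '<span class="author">Jane Doe</span>'
    '<span class="isbn">0000000000</span>'
    '<span class="pages">320</span>'
)
record = recognize(book_page)
print(record)
```

Wrapping a function like `recognize` behind a URL is what turns a site without an API into something other programs can call.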

5. Search Technologies

Perhaps the first significant blow to the Semantic Web has been the inability thus far to improve search. The premise that a semantic understanding of pages leads to vastly better search has yet to be validated. The two main contenders, Hakia and PowerSet, have made some progress, but not enough. The problem is that Google's algorithm, which is based on statistical analysis, deals just fine with semantic entities like people, cities, and companies. When asked "What is the capital of France?", Google returns a good enough answer.

There is a growing realization that marginal improvement in search might not be enough to beat Google or to declare search the killer app for the Semantic Web. Likely, understanding semantics is helpful but not sufficient to build a better search engine. A combination of semantics, innovative presentation, and memory of who the user is, will be necessary to power the next generation search experience.

Alternative approaches also attempt to overlay semantics on top of the search results. Even Google ventures into verticals by partitioning the results into different categories. The consumer can then decide which type of answer they are interested in.

Yet search is a game that is far from won and a lot of semantic companies are really trying to raise the bar. There may be another twist to the whole search play - contextual technologies, as well as semantic databases, could lead to qualitatively better results. And so we turn to these next.

6. Contextual Technologies

We are seeing an increasing number of contextual tools entering the consumer market. Contextual navigation does not just improve search, but rather shortcuts it. Applications like Snap, Yahoo! Shortcuts, and SmartLinks "understand" the objects inside text and links and bring relevant information right into the user's context. The result is that the user does not need to search at all.

Thinking about this more deeply, one realizes that contextual tools leverage semantics in a much more interesting way. Instead of trying to parse what a user types into the search box, contextual technologies rely on analyzing the content. So the meaning is derived in a much more precise way - or rather, there is less guessing. The contextual tools then offer the users relevant choices, each of which leads to a correct result. This is fundamentally different from trying to pull the right results from a myriad of possible choices resulting from a web search.

We are also seeing an increasing number of contextual technologies make their way into the browser. Top-down semantic technologies need to work without publishers doing anything; and so to infer context, contextual technologies integrate into the browser. Firefox's recommended extensions page features a number of contextual browsing solutions - Interclue, ThumbStrips, Cooliris, and BlueOrganizer (from my own company).

The common theme among these tools is the recognition of information and the creation of specific micro contexts for the users to interact with that information.

7. Semantic Databases

Semantic databases are another breed of semantic applications focused on annotating web information to be more structured. Twine, a product of Radar Networks and currently in private beta, focuses on building a personal knowledge base. Twine works by absorbing unstructured content in various forms and building a personal database of people, companies, things, locations, etc. The content is sent to Twine via a bookmarklet, via email, or manually. The technology needs to evolve more, but one can see how such databases can be useful once the kinks are worked out. One of the very powerful applications that could be built on top of Twine, for example, is personalized search - a way to filter the results of any search engine based on a particular individual.

It is worth noting that Radar Networks has spent a lot of time getting the infrastructure right. The underlying representation is RDF and is ready to be consumed by other semantic web services. But a big chunk of the core algorithms, the ones that are dealing with entity extraction, are being commoditized by Semantic Web APIs. Reuters offers this as an API call, for example, and so moving forward, Twine won't need to be concerned with how to do that.

Another big player in the semantic databases space is a company called Metaweb, which created Freebase. In its present form, Freebase is just a fancier and more structured version of Wikipedia - with RDF inside and less information in total. The overall goal of Freebase, however, is to build a Wikipedia equivalent of the world's information. Such a database would be enormously powerful because it could be queried exactly - much like relational databases. So once again the promise is to build much better search.
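What "queried exactly" means is easiest to see with a small relational example. The sketch below uses Python's built-in `sqlite3` module with an in-memory table of facts; the table layout and the `lookup` function are our own assumptions and say nothing about how Freebase actually stores or exposes its data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("France", "capital", "Paris"),
        ("France", "type", "Country"),
        ("Paris", "type", "City"),
    ],
)


def lookup(subject, predicate):
    """Return every object stored for the given subject and predicate."""
    rows = conn.execute(
        "SELECT object FROM facts WHERE subject = ? AND predicate = ?",
        (subject, predicate),
    ).fetchall()
    return [row[0] for row in rows]


print(lookup("France", "capital"))
```

Unlike a keyword search, the query either finds the stored fact or returns nothing; there is no ranking and no guessing.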

But the problem is, how can Freebase keep up with the world? Google indexes the Internet daily and grows together with the web. Freebase currently allows editing of information by individuals and has bootstrapped by taking in parts of Wikipedia and other databases, but in order to scale this approach, it needs to perfect the art of continuously taking in unstructured information from the world, parsing it, and updating its database.

The problem of keeping up with the world is common to all database approaches, which are effectively silos. In the case of Twine, there needs to be continuous influx of user data, and in the case of Freebase there needs to be influx of data from the web. These problems are far from trivial and need to be solved successfully in order for the databases to be useful.

Conclusion

With any new technology it is important to define and classify things. The Semantic Web is offering an exciting promise: improved information discoverability, automation of complex searches, and innovative web browsing. Yet the Semantic Web means different things to different people. Indeed, its definitions in the enterprise and consumer spaces are different, and there are different means to a common end - top-down vs. bottom-up and microformats vs. RDF. In addition to these patterns, we are observing the rise of semantic APIs and contextual browsing tools. All of these are in their early days but hold a big promise to fundamentally change the way we interact with information on the web.

What do you think about Semantic Web Patterns? What trends are you seeing and which applications are you waiting for? And if you work with semantic technologies in the enterprise, please share your experiences with us in the comments below.

http://www.readwriteweb.com/archives/semantic_web_patterns_a_guide_redux.php Trends Fri, 26 Dec 2008 09:00:00 -0800 Alex Iskold

Top 10 RSS and Syndication Products of 2008

RSS and syndication are the veins that the new social web flows through. Countless products and services have been built on top of RSS in the past few years but there are always a few that stand above the rest.

As part of this year's Top 10 Products series, we offer below the Top 10 RSS and Syndication Products of 2008. These are the feed tools we and the people we know use day in and day out - we love them, we hate them, we wouldn't want to work without them.

This is the fourth in our series of top products of 2008:

  1. Top 10 Semantic Web Products of 2008
  2. Top 10 International Products of 2008
  3. Top 10 Consumer Web Apps of 2008

About the Selections

These aren't all new products from 2008. They are the products in the RSS and syndication world that we think made the biggest impact or were the most useful.

To be honest, this was not a particularly good year for innovation in the RSS space. Too many of the products listed below are incumbents, several of which drove us crazy this year. They remain on the list, however, because they are incredibly useful and nothing topped them.

Some honorable mentions are deserved as well. We talked to many people who like RSS magazine-style start page Feedly, though we found it overly constrictive and don't feel that it's made a big market splash yet. We also found the Associated Press's AP Member Marketplace very interesting. Had we gotten a chance to get to know it better, it could very well have been on this list. Finally, we love African social media aggregator Afrigator - it's a great way to learn about what's happening all over the continent and it's a great use of RSS. We named it one of the Top 10 International Products of 2008 but we think it deserves an honorable mention in this category as well.

And Now the RWW Top 10 RSS and Syndication Products of 2008

Postrank

Formerly known as AideRSS, Postrank is simply the most useful RSS-related application we've seen in a long time. Plug in any RSS feed and Postrank will rate each item in the feed on a scale of 1 to 10, by number of comments, inbound links, saves in Delicious, etc. You can then subscribe to a filtered feed of just the 10% most popular items in that feed.

We use Postrank all the time, in all kinds of contexts: from monitoring break-out stories in niche markets we don't follow closely, to finding out about the bread and butter of new blogs we discover, to running search feeds through Postrank to surface hot conversations on any topic.
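The rate-then-filter idea is simple enough to sketch. Postrank's real scoring formula is not public in anything we've described here, so everything below is an assumption: the equal weighting of the three signals, the scaling to a 1 to 10 range, and the function names are ours.

```python
import math


def engagement(item):
    # Assumed equal weighting of the three signals.
    return item["comments"] + item["inbound_links"] + item["saves"]


def rate(items):
    """Attach a 1-10 score to each item, scaled against the busiest item."""
    top = max(engagement(i) for i in items) or 1
    return [
        {**i, "score": 1 + round(9 * engagement(i) / top)}
        for i in items
    ]


def top_fraction(rated, fraction=0.10):
    """Keep the highest-scoring share of items (at least one)."""
    keep = max(1, math.ceil(len(rated) * fraction))
    return sorted(rated, key=lambda i: i["score"], reverse=True)[:keep]


# Made-up feed of ten posts with increasing comment counts.
feed = [
    {"title": f"post {n}", "comments": n, "inbound_links": 0, "saves": 0}
    for n in range(10)
]
best = top_fraction(rate(feed))
print([(i["title"], i["score"]) for i in best])
```

Scoring relative to the feed's own busiest item is one way to make a niche blog's hits comparable to a high-traffic site's.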

Postrank has been around for about a year and a half, but we write about it over and over again.

This year Postrank opened an API, made a bunch of deals with other companies, improved its service, raised a round of funding and just generally rocked.

FriendFeed

Social "life streaming" service FriendFeed is making syndication a more social activity than anything else has yet. The service aggregates your activity data from all around the web, lets your friends comment on it and shows you the activities of all your friends' friends when someone you know comments on something and exposes it to their network.

If RSS readers will change your life and work through their awesome usefulness, FriendFeed is a service that makes syndication fun. It's one of the first places we go on the web every morning.

We interviewed the ex-Googlers who founded FriendFeed last February and that interview is still the best place to learn how the service works under the hood.

If you'd like to connect with the ReadWriteWeb crew on FriendFeed (and we hope you will) we've posted a tour of our FriendFeed profile pages here. Please join us also in the ReadWriteWeb FriendFeed Room.

Gnip

Gnip is a social media ping server, a service that other services ask for user data updates from all around the web. There's nothing here for users, but almost every developer we talk to these days who is aggregating content in order to add value to it (and that is the name of the game) has Gnip on its radar. The company aims to make aggregation more timely, scalable and efficient than it is today.

We wrote about Gnip at length when the service launched in July.

Snackr

Snackr is a simple little RSS ticker built in Adobe AIR. Its frenetic, nonstop delivery of news is too much for many people, but the rest of us love it. It's where our eyes wander during page loads and other down times. Many of the stories you read here at ReadWriteWeb were based on things we first caught wind of through Snackr.

Snackr was built in-house at Adobe by Flex team member Narciso Jaramillo. We reviewed it in May and have been using it ever since.

Google Reader

Google Reader is the market leader in full-featured RSS readers, having pulled ahead of the troubled Bloglines in recent months. This year Google Reader made its sharing feature much more transparent, added the ability to translate any feed into a number of different languages and was recently redesigned.

It hasn't been a super exciting year for the product, and there are still basic problems like very infrequent caching of rare feeds, but Google Reader's incredible dominance in the field makes it a required part of this list.

Google Reader RSS Subscriber Count Greasemonkey Script

One of the simplest little changes we've made to our browsers lately is the addition of this Greasemonkey script that shows the number of readers in Google Reader that any page's RSS feed has. You can usually multiply that number by 2 to 4 for an estimate of how many total readers a feed has across all readers, but either way it's a great little indication of a site's popularity.

The script was written by an anonymous user named "uncv" and we'd like to thank them. We love what they've done! This was one of the 7 coolest browser tweaks from the last month that we wrote about earlier this week. It's already won a permanent place in our hearts!
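The 2-to-4 rule of thumb mentioned above is just a range calculation. Here is a small sketch of it; the function name and the sample subscriber count are made up for illustration, and the multipliers are our rough estimate rather than a measured figure.

```python
def estimate_total_readers(google_reader_subscribers, low=2, high=4):
    """Return a (low, high) estimate of readers across all feed readers."""
    return (google_reader_subscribers * low, google_reader_subscribers * high)


low_estimate, high_estimate = estimate_total_readers(1500)
print(f"Somewhere between {low_estimate} and {high_estimate} readers")
```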

Dapper

Dapper.net is a point and click interface for data extraction - a nice way of saying it scrapes web pages into an RSS feed. We continue to depend on Dapper for all kinds of research, and we're always finding new ways to use it around here. We love it.

Unfortunately, some sites (Facebook, for example) don't like us having links back to them available in our RSS readers, and that really upsets us. In many cases those feeds that we created ourselves are the only way we'd be drawn back to a site, so it's their loss as much as ours.

Dapper has been around since 2006, but they recently launched a semantic ad platform that we included in our list of the top 10 semantic web products of 2008.

Twitterfeed

Love it or hate it, Twitterfeed has made a big impact on the web in 2008. It's the service people use to publish an RSS feed right into Twitter.

Some people argue that Twitter is all about conversation and that publishing an RSS feed there is grating and inappropriate. We like getting our local newspaper story links on Twitter, though, and everything from disaster monitoring to traffic conditions is now available via Twitterfeed.

Feedburner

Google's RSS publishing service Feedburner hurt our ability to break news first, can't be used in many corporate environments because it gets blocked in China and only made 6 posts all year to its company blog, none since May. That's compared to 28 posts in 2007. Apparently once you get your Google money there's not much point in communicating with the people who depend on you every day.

Why would we call Feedburner one of the top 10 RSS products of the year then? Because despite how frustrating it can be, the service is still so incredibly useful that we don't know what we'd do without it. Not just for publishing and analytics for ReadWriteWeb feeds - from numbers to email delivery to FeedFlare links, Feedburner will work magic easily on any feed you work with. I've got 68 different feeds in my account and I'll probably publish several more before the year is up.

Pipes

Yahoo! Pipes is another RSS-based service that is really frustrating and hasn't innovated substantially in the last year - but it is still so powerfully useful that it deserves a spot as one of the top products in this market.

Splicing and filtering RSS feeds is the simplest thing to do with Pipes, but there's much more you can do with it as well. It's great for us pseudo-geeks; we can work all kinds of magic with it. We've used Pipes throughout the year to do things that we (ok I) don't have the technical chops to do otherwise. For that I thank the Pipes team a whole lot.
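For readers curious what splicing and filtering amounts to under the hood, here is a minimal sketch using only Python's standard library. It is not how Pipes itself works; the two inline feeds, their example.com links, and the function names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Two tiny made-up RSS 2.0 feeds.
FEED_A = """<rss version="2.0"><channel><title>Feed A</title>
<item><title>Semantic web roundup</title><link>http://example.com/a1</link></item>
<item><title>Weekend links</title><link>http://example.com/a2</link></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Feed B</title>
<item><title>New semantic search startup</title><link>http://example.com/b1</link></item>
</channel></rss>"""


def items(feed_xml):
    """Parse one RSS document into a list of title/link dictionaries."""
    root = ET.fromstring(feed_xml)
    return [
        {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        }
        for item in root.iter("item")
    ]


def splice(*feeds):
    """Concatenate the items of several feeds into one list."""
    merged = []
    for feed in feeds:
        merged.extend(items(feed))
    return merged


def filter_by_keyword(entries, keyword):
    """Keep only entries whose title mentions the keyword."""
    return [e for e in entries if keyword.lower() in e["title"].lower()]


combined = splice(FEED_A, FEED_B)
semantic_only = filter_by_keyword(combined, "semantic")
print([e["title"] for e in semantic_only])
```

What Pipes adds on top of this is the visual editor and the hosting, which is exactly the part those of us without the technical chops appreciate.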

Those Were Our Favorites This Year - How About You?

Did we miss anyone you think should have been on this list? We hope you'll share your favorites in comments below. What RSS and syndication products impacted you the most in 2008?

http://www.readwriteweb.com/archives/top_10_rsssyndication_products_of_2008.php 2008 in Review Thu, 11 Dec 2008 15:30:30 -0800 Marshall Kirkpatrick

Some Web Apps Work Better Together

How many new websites can you fit in a Volkswagen Beetle? Sometimes it feels like that's what we're trying to do these days - but all these new applications and services don't have to be crammed into our heads and lives as separate things to try out and remember.

Many new technologies work best in concert; the functionality of one application can be vastly improved by using it together with another one. Here are some of our favorite examples of apps that work best together, followed by some favorite workflows from friends of ReadWriteWeb. We hope you'll share your favorite combos in comments, too, so we can all learn some new things.

Some of Our Favorites

AideRSS plus Snackr

RSS news ticker Snackr was an app that people either loved or hated when we first wrote about it here. The attractive Adobe AIR interface is even more compelling now that you can sync it with your Google Reader account (as of last week). One of the best uses we've found for this ever-flowing stream of news though has been to fill it up with "best of" feeds from AideRSS. AideRSS is an app we've written about over and over again here because it's just so darned useful and cool.

Put the two together though and you've got a stream of just the breakout hits from high traffic feeds. We enjoy and recommend reading the top stories on topics like the semantic web, mobile and recommendation technology through Snackr - but we're sure you can build your own easily.

Ma.gnolia (or Del.icio.us) plus Feed.Informer

You can do a whole lot of different things with social bookmarking tools like Ma.gnolia and Del.icio.us, probably including some things most readers here aren't familiar with. One of our favorite things though is to pick a particular tag from your account and run the RSS feed from that tag through a handy little service called Feed.informer.

You can display any amount of the feed on a web page with just a few lines of embed code, including the "notes" field for your tag as editorial or summary information. The result is a little news section for your website, powered by your social bookmarking tool. It's a great way to continue sharing found items online that don't warrant an entire blog post.

FriendFeed and MuxTape plus FluidApp

We wrote here earlier this year about a fabulous mashup of mixtape service Muxtape and single-app browser creation tool for Mac called FluidApp, but it's also really useful to combine FriendFeed and Fluid.

Most of the other standalone FriendFeed apps are hard to use (excluding the wonderful mobile app FFtoGo) but putting your friends' feeds and conversation in a standalone browser makes it easy to follow along without losing the FF tab in your browser. FriendFeed's auto-updating keeps the dedicated browser up to date and the FF favicon looks great in your dock.

Single app browsers fall into the "seems stupid until you try it" category, but put the right app in there and you'll enjoy it.

Windows users can check out Bubbles, a service that was reviewed and discussed recently at Download Squad.

Facebook plus Dapper

The RSS extraction tool Dapper is really powerful, once you figure out how and why to use it. Here's a 4 minute screencast we recorded about how to use Dapper but the sky's the limit with what you can do with this free tool.

One of the things we've done with it lately is scrape birthday notifications out of Facebook. Not everyone logs into Facebook every day, but people tend to put their real birthdays into their profiles there. It's really nice to get those birthday notifications by RSS in another setting that you spend time in more regularly. Step by step instructions for doing so are available here.

Friends of RWW

We asked around and got some input from friends about what apps they like to use together. The responses ranged from combinations aimed to increase productivity to making the most of music listening. Here are some of our favorites.

Local Portland tech blogger Rick Turoczy says he likes to use Twitter search (formerly Summize), combined with Yahoo! Pipes and RSS to SMS service Pingie. We're not sure what he does with those apps together, but the magic results in his getting a lot of industry news before mainstream media outlets do.

MicroISV consultant Bob Walsh makes the most of his fleeting thoughts by sending voice recordings through Jott over to "memory extender" EverNote and "thence to various programs on my Mac." That's the kind of thing many of us have probably envisioned doing; we're glad it's working for Bob.

Susan Kirkpatrick (no relation) is a prolific multi-media blogger. How does she do it? [I] "send a blog post with a picture attachment via email to Utterz; it posts to Flickr, WordPress, Pownce and Twitter." We haven't used it a lot ourselves, but Utterz is pretty impressive and we hear rumors that there are even more sophisticated developments being worked on behind the scenes there, too.

Virginie De Bel Air says she likes Last.fm + SonicLiving, a service that tracks your favorites on iTunes, Last.fm or Pandora and notifies you when those bands are coming to perform in your area. Utilitarian and rock and roll! We hadn't seen SonicLiving before.

Security and IT exec Greg Hughes likes to let his hair down and shout Shazam! sometimes. Specifically, Hughes says he finds himself using the Shazam music identification app to identify a song he hears and then Pandora to discover more that's related. All on the iPhone, too.

What About You?

What are your favorite apps to use together? There are so many new apps that launch every day, we can't imagine the infinite permutations that users could come up with. Putting together multiple apps usually implies though that you're fairly comfortable with one or both of them, that they are equipped to live as something other than a walled garden and that each has stood enough of a test for users to believe they are stable enough to smoosh together.

Productivity? Fun? A combination of both, perhaps? We'd love to know what your favorite apps are to run together.

Photo: "Web 2.0 Crawl Yahoo Brickhouse: Nate Westheimer of BricaBox, Dave McClure, Gabe Rivera of Techmeme" by Brian Solis. Just imagine how great it would be if these app guys worked together!

http://www.readwriteweb.com/archives/some_web_apps_work_better_together.php http://www.readwriteweb.com/archives/some_web_apps_work_better_together.php Mashups Wed, 30 Jul 2008 17:11:09 -0800 Marshall Kirkpatrick
Semantic Web Patterns: A Guide to Semantic Technologies In this article, we'll analyze the trends and technologies that power the Semantic Web. We'll identify patterns that are beginning to emerge, classify the different trends, and peek into what the future holds.

In a recent interview Tim Berners-Lee pointed out that the infrastructure to power the Semantic Web is already here. ReadWriteWeb's founder, Richard MacManus, even picked it to be the number one trend in 2008. And rightly so. Not only are the bits of infrastructure now in place, but we are also seeing startups and larger corporations working hard to deliver end user value on top of this sophisticated set of technologies.

The Semantic Web means many things to different people, because there are a lot of pieces to it. To some, the Semantic Web is the web of data, where information is represented in RDF and OWL. Some people replace RDF with Microformats. Others think that the Semantic Web is about web services, while for many it is about artificial intelligence - computer programs solving complex optimization problems that are out of our reach. And business people always redefine the problem in terms of end user value, saying that whatever it is, it needs to have simple and tangible applications for consumers and enterprises.

The disagreement is not accidental, because the technology and concepts are broad. Much is possible and much is to be imagined.

1. Bottom-Up and Top-Down

We have written a lot about the different approaches to the Semantic Web - the classic bottom-up approach and the new top-down one. The bottom-up approach is focused on annotating information in pages, using RDF, so that it is machine readable. The top-down approach is focused on leveraging information in existing web pages, as-is, to derive meaning automatically. Both approaches are making good progress.

A big win for the bottom-up approach was the recent announcement from Yahoo! that their search engine is going to support RDF and microformats. This is a win-win-win for publishers, for Yahoo!, and for customers - publishers now have an incentive to annotate information because Yahoo! Search will be taking advantage of it, and users will then see better, more precise results.

Another recent win for the bottom-up approach was the announcement of the Semantify web service from Dapper (previous coverage). This offering will enable publishers to add semantic annotations to existing web pages. The more tools like Semantify that pop up, the easier it will be for publishers to annotate pages. Automatic annotation tools combined with the incentive to annotate the pages are going to make the bottom-up approach more compelling.

But even if the tools and incentives exist, making the bottom-up approach widespread is difficult. Today, the magic of Google is that it can understand information as is, without asking people to fully comply with W3C standards or SEO optimization techniques. Similarly, top-down semantic tools are focused on dealing with imperfections in existing information. Among them are the natural language processing tools that do entity extraction - such as the Calais and TextWise APIs that recognize people, companies, places, etc. in documents; vertical search engines, like ZoomInfo and Spock, which mine the web for people; technologies like Dapper and BlueOrganizer, which recognize objects in web pages; and Yahoo! Shortcuts, Snap and SmartLinks, which recognize objects in text and links.

[Disclosure: Alex Iskold is founder and CEO of AdaptiveBlue, which makes BlueOrganizer and SmartLinks.]

Top-down technologies are racing forward despite imperfect information. And, of course, they benefit from the bottom-up annotations as well. The more annotations there are, the more precise top-down technologies will get - because they will be able to take advantage of structured information as well.

2. Annotation Technologies: RDF, Microformats, and Meta Headers

Within the bottom-up approach to annotation of data, there are several choices for annotation. They are not equally powerful, and in fact each approach is a tradeoff between simplicity and completeness. The most comprehensive approach is RDF - a powerful, graph-based language for declaring things, and attributes and relationships between things. In a simplistic way, one can think of RDF as the language that allows expressing truths like: Alex IS human (type expression), Alex HAS a brain (attribute expression), and Alex IS the father of Alice, Lilly, and Sofia (relationship expression). RDF is powerful, but because it is highly recursive, precise, and mathematically sound, it is also complex.
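
To make the three kinds of expression concrete, here is a rough sketch that stores them as subject-predicate-object triples, the shape RDF statements take. It is a toy: real RDF identifies things with URIs rather than plain strings, and the predicate names and the `objects` helper are invented for illustration.

```python
# Each statement is a (subject, predicate, object) triple; names are illustrative.
triples = [
    ("Alex", "is_a", "Human"),          # type expression
    ("Alex", "has", "Brain"),           # attribute expression
    ("Alex", "father_of", "Alice"),     # relationship expressions
    ("Alex", "father_of", "Lilly"),
    ("Alex", "father_of", "Sofia"),
]


def objects(graph, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in graph if s == subject and p == predicate]


print(objects(triples, "Alex", "father_of"))
```

Because every fact has the same three-part shape, one small function can answer questions about types, attributes and relationships alike.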

At present, most use of RDF is for interoperability. For example, the medical community uses RDF to describe genomic databases. Because the information is normalized, the databases that were previously silos can now be queried together and correlated. In general, in addition to semantic soundness, the major benefit of RDF is interoperability and standardization, particularly for enterprises, as we will discuss below.

Microformats offer a simpler approach by adding semantics to existing HTML documents using agreed-upon class names, the same class attribute that CSS styles hook into. The metadata is compact and is embedded inside the actual HTML. Popular microformats are hCard, which describes personal and company contact information, hReview, which adds meta information to review pages, and hCalendar, which is used to describe events.
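
As a sketch of how little machinery a consumer needs, the snippet below reads an hCard out of a page fragment using Python's standard HTML parser. The class names `vcard`, `fn` and `org` come from the hCard microformat; the fragment, the `HCardParser` class and the short property list are assumptions made for the example, and the parser ignores complications such as nested properties and void tags.

```python
from html.parser import HTMLParser

# Hypothetical page fragment marked up with hCard class names.
SAMPLE_HTML = """
<div class="vcard">
  <span class="fn">Alex Iskold</span>,
  <span class="org">AdaptiveBlue</span>
</div>
"""

HCARD_PROPERTIES = {"fn", "org", "tel", "email"}


class HCardParser(HTMLParser):
    """Collect the text of elements whose class matches an hCard property."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one entry per open tag: a property name or None
        self.card = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        prop = next((c for c in classes if c in HCARD_PROPERTIES), None)
        self.stack.append(prop)

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Only text sitting directly inside a property element is collected.
        prop = self.stack[-1] if self.stack else None
        if prop and data.strip():
            self.card[prop] = self.card.get(prop, "") + data.strip()


parser = HCardParser()
parser.feed(SAMPLE_HTML)
print(parser.card)
```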

Microformats are gaining popularity because of their simplicity, but they are still quite limiting. There is no way to describe type hierarchies, which the classic semantic community would say is critical. The other issue is that microformats are somewhat cryptic, because the focus is to keep the annotations to a minimum. This, in turn, brings up another question of whether embedding metadata into the view (HTML) is a good idea. The question is: what happens if the underlying data changes after someone makes a copy of the HTML document? Nevertheless, despite these issues, microformats are gaining popularity because they are simple. Microformats are currently used by Flickr, Eventful, and LinkedIn; and many other companies are looking to adopt microformats, particularly because of the recent Yahoo! announcement.

An even simpler approach is to put metadata into the meta headers. This approach has been around for a while and it is a shame that it has not been widely adopted. As an example, the New York Times recently launched extended annotations for its news pages. The benefit of this approach is that it works great for pages that are focused on a topic or a thing. For example, a news page can be described with a set of keywords, geo location, date, time, people, and categories. Another example would be for book pages. O'Reilly.com has been putting book information into the meta headers, describing the author, ISBN, and category of the book.
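
Reading such headers back takes only a few lines. The sketch below is a hedged illustration: the page, the meta names (`author`, `isbn`, `category`) and the placeholder values are invented, and they are not the actual schema used by O'Reilly or the New York Times.

```python
from html.parser import HTMLParser

# Hypothetical book page; the meta names are illustrative, not a real site's schema.
SAMPLE_PAGE = """
<html><head>
  <title>Example Book</title>
  <meta name="author" content="Jane Doe">
  <meta name="isbn" content="0000000000">
  <meta name="category" content="Programming">
</head><body>...</body></html>
"""


class MetaReader(HTMLParser):
    """Gather name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name, content = attrs.get("name"), attrs.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


reader = MetaReader()
reader.feed(SAMPLE_PAGE)
print(reader.meta["author"], reader.meta["isbn"])
```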

Despite the fact that all these approaches are different, they are also somewhat complementary; and each of them is helpful. The more annotations there are in web pages, the more standards are implemented, and the more discoverable and powerful the information becomes.

3. Consumer and Enterprise

Yet another dimension of the conversation about the Semantic Web is the focus on consumer and enterprise applications. In the consumer arena we have been looking for a Killer App - something that delivers tangible and simple consumer value. People simply do not care that a product is built on the Semantic Web; all they are looking for is utility and usefulness.

Up until recently, the challenge has been that the Semantic Web is focused on rather academic issues - like annotating information to make it machine readable. The promise was that once the information is annotated and the web becomes one big giant RDF database, then exciting consumer applications will come. The skeptics, however, have been pointing out that first there needs to be a compelling use case.

Some consumer applications based on the Semantic Web are generic and vertical search, contextual shortcuts and previews, personal information management systems, and semantic browsing tools. All of these applications are in their early days and have a long way to go before being truly compelling for the average web user. Still, even if these applications succeed, consumers will not be interested in knowing about the underlying technology - so there is really no marketing play for the Semantic Web in the consumer space.

Enterprises are a different story for a couple of reasons. First, enterprises are much more used to techno speak. To them utilizing semantic technologies translates into being intelligent and that, in turn, is good marketing. 'Our products are better and smarter because we use the Semantic Web' sounds like a good value proposition for the enterprise.

But even above the marketing speak, RDF solves a problem of data interoperability and standards. This "Tower of Babel" situation has been in existence since the early days of software. Forget semantics; just a standard protocol, a standard way to pass around information between two programs, is hugely valuable in the enterprise.

RDF offers a way to communicate using an XML-based language, which on top of it has sound mathematical elements to enable semantics. This sounds great, and even the complexity of RDF is not going to stop enterprises from using it. However, there is another problem that might stop it - scalability. Unlike relational databases, which have been around for ages and have been optimized and tuned, XML-based databases are still not widespread. In general, the problem is in the scale and querying capabilities. Like object-oriented database technologies of the late nineties, XML-based databases hold a lot of promise, but we have yet to see them in action in a big way.

4. Semantic APIs

With the rise of Semantic Web applications, we are also seeing the rise of Semantic APIs. In general, these web services take as an input unstructured information and find entities and relationships. One way to think of these services is mini natural language processing tools, which are only concerned with a subset of the language.

The first example is the Open Calais API from Reuters that we have covered in two articles here and here. This service accepts raw text and returns information about people, places, and companies found in the document. The output not only returns the list of found matches, but also specifies places in the document where the information is found. Behind Calais is a powerful natural language processing technology developed by Clear Forest (now owned by Reuters), which relies on algorithms and databases to extract entities out of text. According to Reuters, Calais is extensible, and it is just a matter of time before new entities will be added.
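
To illustrate the shape of that output (matches plus their positions in the document), here is a deliberately naive sketch. It is not the Calais API and makes no claim about how Calais works internally; the `GAZETTEER` lookup table and the `extract_entities` function are assumptions for the example, standing in for far more sophisticated language processing.

```python
import re

# Tiny hand-made lookup table; a real service relies on far richer models and data.
GAZETTEER = {
    "Tim Berners-Lee": "Person",
    "Yahoo!": "Company",
    "Reuters": "Company",
    "San Francisco": "Place",
}


def extract_entities(text):
    """Return (entity, type, start, end) tuples sorted by position in the text."""
    found = []
    for name, kind in GAZETTEER.items():
        for match in re.finditer(re.escape(name), text):
            found.append((name, kind, match.start(), match.end()))
    return sorted(found, key=lambda hit: hit[2])


sample = "Reuters opened an office in San Francisco."
for hit in extract_entities(sample):
    print(hit)
```

The start and end offsets are what let a client highlight or link each entity in place.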

Another example is the SemanticHacker API from TextWise, which is offering a one million dollar prize for the best commercial semantic web application developed on top of it. This API classifies information in documents into categories called semantic signatures. Given a document, it outputs entities or topics that the document is about. It is kind of like Calais, but also delivers a topical hierarchy, where the actual objects are leaves.

Another semantic API is offered by Dapper - a web service which facilitates the extraction of structure from unstructured HTML pages. Dapper works by enabling users to define attributes of an object based on the bits of the page. For example, a book publisher might define where the information about author, ISBN and number of pages is on a typical book page and the Dapper application would then create a recognizer for any page on the publisher site and enable access to it via a REST API.

While this seems backwards from an engineering point of view, Dapper's technology is remarkably useful in the real world. In a typical scenario, for web sites that do not have clean APIs to access their information, even non-technical people can build an API in minutes with Dapper. This is a powerful way of quickly turning web sites into web services.
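
The book publisher example can be sketched in a few lines. This is only an approximation of the idea, not Dapper's implementation: Dapper builds its recognizers from point-and-click examples, whereas the hypothetical `BOOK_FIELDS` patterns and `recognize` function below are written by hand against made-up markup.

```python
import re

# Hypothetical field definitions for one publisher's book pages: each field
# is tied to the bit of markup that surrounds it on a typical page.
BOOK_FIELDS = {
    "author": r'<span id="author">(.*?)</span>',
    "isbn": r'<span id="isbn">(.*?)</span>',
    "pages": r'<span id="pages">(\d+)</span>',
}


def recognize(page_html, fields=BOOK_FIELDS):
    """Apply each field pattern to the page and return the values found."""
    record = {}
    for name, pattern in fields.items():
        match = re.search(pattern, page_html, re.DOTALL)
        if match:
            record[name] = match.group(1).strip()
    return record


page = """
<h1>Example Book</h1>
<span id="author">Jane Doe</span>
<span id="isbn">0000000000</span>
<span id="pages">320</span>
"""
print(recognize(page))
```

Once the field definitions exist, the same `recognize` call works on any page that follows the template, which is what makes a site behave like a web service.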

5. Search Technologies

Perhaps the first significant blow to the Semantic Web has been the inability thus far to improve search. The premise that semantic understanding of pages leads to vastly better search has yet to be validated. The two main contenders, Hakia and PowerSet, have made some progress, but not enough. The problem is that Google's algorithm, which is based on statistical analysis, deals just fine with semantic entities like people, cities, and companies. When asked "What is the capital of France?" Google returns a good enough answer.

There is a growing realization that marginal improvement in search might not be enough to beat Google, and to declare search the killer app for the Semantic Web. Likely, understanding semantics is helpful but not sufficient to build a better search engine. A combination of semantics, innovative presentation, and memory of who the user is, will be necessary to power the next generation search experience.

Alternative approaches also attempt to overlay semantics on top of the search results. Even Google ventures into verticals by partitioning the results into different categories. The consumer can then decide which type of answer they are interested in.

Yet search is a game that is far from won and a lot of semantic companies are really trying to raise the bar. There may be another twist to the whole search play - contextual technologies, as well as semantic databases, could lead to qualitatively better results. And so we turn to these next.

6. Contextual Technologies

We are seeing an increasing number of contextual tools entering the consumer market. Contextual navigation does not just improve search, but rather shortcuts it. Applications like Snap or Yahoo! Shortcuts or SmartLinks "understand" the objects inside text and links and bring relevant information right into the user's context. The result is that the user does not need to search at all.

Thinking about this more deeply, one realizes that contextual tools leverage semantics in a much more interesting way. Instead of trying to parse what a user types into the search box, contextual technologies rely on analyzing the content. So the meaning is derived in a much more precise way - or rather, there is less guessing. The contextual tools then offer the users relevant choices, each of which leads to a correct result. This is fundamentally different from trying to pull the right results from a myriad of possible choices resulting from a web search.

We are also seeing an increasing number of contextual technologies make their way into the browser. Top-down semantic technologies need to work without publishers doing anything; and so to infer context, contextual technologies integrate into the browser. Firefox's recommended extensions page features a number of contextual browsing solutions - Interclue, ThumbStrips, Cooliris, and BlueOrganizer (from my own company).

The common theme among these tools is the recognition of information and the creation of specific micro contexts for the users to interact with that information.

7. Semantic Databases

Semantic databases are another breed of semantic applications focused on annotating web information to be more structured. Twine, a product of Radar Networks and currently in private beta, focuses on building a personal knowledge base. Twine works by absorbing unstructured content in various forms and building a personal database of people, companies, things, locations, etc. The content is sent to Twine via bookmarklet or via email or manually. The technology needs to evolve more, but one can see how such databases can be useful once the kinks are worked out. One of the very powerful applications that could be built on top of Twine, for example, is personalized search - a way to filter the results of any search engine based on a particular individual.

It is worth noting that Radar Networks has spent a lot of time getting the infrastructure right. The underlying representation is RDF and is ready to be consumed by other semantic web services. But a big chunk of the core algorithms, the ones that are dealing with entity extraction, are being commoditized by Semantic Web APIs. Reuters offers this as an API call, for example, and so moving forward, Twine won't need to be concerned with how to do that.

Another big player in the semantic databases space is a company called Metaweb, which created Freebase. In its present form, Freebase is just a fancier and more structured version of Wikipedia - with RDF inside and less information in total. The overall goal of Freebase, however, is to build a Wikipedia equivalent of the world's information. Such a database would be enormously powerful because it could be queried exactly - much like relational databases. So once again the promise is to build much better search.
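
Here is a small sketch of what "queried exactly" means, using an in-memory SQLite table of facts. It says nothing about how Freebase stores or queries its data; the `facts` table, the predicate names and the `capital_of` helper are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Paris", "capital_of", "France"),
        ("Paris", "type", "City"),
        ("Lyon", "type", "City"),
        ("Lyon", "located_in", "France"),
    ],
)


def capital_of(country):
    """Answer 'What is the capital of X?' with an exact lookup, not a text match."""
    row = conn.execute(
        "SELECT subject FROM facts WHERE predicate = 'capital_of' AND object = ?",
        (country,),
    ).fetchone()
    return row[0] if row else None


print(capital_of("France"))
```

A keyword search would return pages that mention both words; the structured query returns the answer itself, or nothing when the fact is missing.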

But the problem is, how can Freebase keep up with the world? Google indexes the Internet daily and grows together with the web. Freebase currently allows editing of information by individuals and has bootstrapped by taking in parts of Wikipedia and other databases, but in order to scale this approach, it needs to perfect the art of continuously taking in unstructured information from the world, parsing it, and updating its database.

The problem of keeping up with the world is common to all database approaches, which are effectively silos. In the case of Twine, there needs to be continuous influx of user data, and in the case of Freebase there needs to be influx of data from the web. These problems are far from trivial and need to be solved successfully in order for the databases to be useful.

Conclusion

With any new technology it is important to define and classify things. The Semantic Web is offering an exciting promise: improved information discoverability, automation of complex searches, and innovative web browsing. Yet the Semantic Web means different things to different people. Indeed, its definition in the enterprise and consumer spaces is different, and there are different means to a common end - top-down vs. bottom-up and microformats vs. RDF. In addition to these patterns, we are observing the rise of semantic APIs and contextual browsing tools. All of these are in their early days, but hold great promise to fundamentally change the way we interact with information on the web.

What do you think about Semantic Web Patterns? What trends are you seeing and which applications are you waiting for? And if you work with semantic technologies in the enterprise, please share your experiences with us in the comments below.

http://www.readwriteweb.com/archives/semantic_web_patterns.php http://www.readwriteweb.com/archives/semantic_web_patterns.php Trends Tue, 25 Mar 2008 15:20:45 -0800 Alex Iskold
Semantify - Automate Your Semantic Web SEO in Five Minutes The timing couldn't be better for the release of Semantify, a new service from Israel/San Francisco's Dapper.net. One week after Yahoo! announced that it will begin indexing the semantic markup and meaning of content on the web, Semantify offers a remarkably simple way to get your website marked up semantically. Automatically, forever.

Once you learn how to use Dapper's basic interface, it can take less than five minutes to set up the Semantify service. Hello SEO, 3.0.

Just a Few Steps

Here's what it takes:

1. Identify your website and show Dapper a few different pages on it.

2. Point and click to identify particular fields on your pages, like the titles, dates and authors of articles. Sometimes this requires a few extra clicks to exclude false positives in the previewed results.

3. Name those fields according to any number of Semantic Web naming protocols. In my test of Semantify, for my personal site marshallk.com, I used the Dublin Core namespaces "title," "date," "description" and "creator" to name my fields in Dapper. I could have designated fields as the names of my friends or as particular locations. There are simple descriptions of other namespace conventions linked to from the Semantify page and this part is pretty intuitive.

4. Once you've gotten this far, in the standard method of using Dapper you'd grab an RSS feed that would deliver changes that get made to the fields you're monitoring. With Semantify, though, you get a few lines of PHP code to paste into the header of your website. See the screenshot at the bottom of this post.

And then you're done.

Dapper GUI + Semantic Web vocab list + PHP embed code = automated Semantic Web markup for your site. It's like a point and click sitemap creator on the element-by-element level. It's a perpetual standards-based SEO machine. That's the incentive for publishers. For the rest of us, once the meaning of content is machine readable - there's a world of sophisticated information processing we'll be able to automate and leverage.
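
For a feel of what the end result might look like, here is a rough sketch that turns extracted field values into Dublin Core meta tags. It is not Semantify's embed code (which is PHP and not reproduced in this post); it follows the common convention of `DC.`-prefixed meta names plus a `schema.DC` link, and the `dublin_core_meta` function and sample values are hypothetical.

```python
import html

DC_SCHEMA = '<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">'


def dublin_core_meta(fields):
    """Turn extracted field values into Dublin Core <meta> tags for the page head."""
    lines = [DC_SCHEMA]
    for name, value in fields.items():
        content = html.escape(value, quote=True)
        lines.append(f'<meta name="DC.{name}" content="{content}">')
    return "\n".join(lines)


# Hypothetical values, as if extracted from one article page.
extracted = {
    "title": "An Example Post",
    "date": "2008-03-20",
    "description": 'Notes on "semantic" markup',
    "creator": "Example Author",
}
print(dublin_core_meta(extracted))
```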

It's The Early Days

It's as simple as that, or at least it will be once all the little kinks are worked out. At launch the embed code is only available in PHP but the company says more options are right around the corner. The company rushed to get this service out the door and that's a little obvious right now. It's also clear that the problems are small ones that they'll be able to solve quickly. There are more sophisticated options coming (more granular control over namespaces, for example) and the user interface could always be improved over there. Nonetheless, this service could end up being very, very big.

You can go through those steps above today (I have), and whenever the Yahoo! spider hits your webpage, it will be shown a semantically marked up version of whatever content is live on your pages at the time. It will come from your domain and everyone will be happy. Wash, rinse and repeat for all your domains. Then, thank Dapper for making it so damn easy.

Historical Context

Many people have questioned the viability of the Semantic Web vision, asking who will do the markup. Yahoo! has stepped in and provided the incentive for every publisher to do so; now Dapper's Semantify is hoping to provide the service that will make it easy, too.

Once it's just a matter of course for publishers to publish semantic markup with their content, look out world. My favorite example, from our coverage of the Yahoo! announcement, is this: show me all the movie reviews written by a user's friends who live in Europe. Today, that would be hard to do. Once semantic markup is widely published and indexed - then such queries will be trivial and the only question will be what we want to do with that information.
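
Here is a toy version of that query, assuming the markup has already been published and indexed into simple records. Everything in it is invented for illustration: the people, the friend list, the reviews, the deliberately incomplete `EUROPE` set and the function name.

```python
# Hypothetical, hand-written records standing in for indexed semantic markup.
people = {
    "ana": {"lives_in": "Portugal"},
    "ben": {"lives_in": "Canada"},
    "cleo": {"lives_in": "Greece"},
}
friends_of = {"user": {"ana", "ben"}}
reviews = [
    {"author": "ana", "kind": "movie", "item": "Film A"},
    {"author": "ben", "kind": "movie", "item": "Film B"},
    {"author": "cleo", "kind": "movie", "item": "Film C"},
    {"author": "ana", "kind": "book", "item": "Book D"},
]
EUROPE = {"Portugal", "Greece", "France"}  # deliberately incomplete sample set


def friend_movie_reviews_in_europe(user):
    """Movie reviews written by `user`'s friends who live in Europe."""
    return [
        r for r in reviews
        if r["kind"] == "movie"
        and r["author"] in friends_of.get(user, set())
        and people[r["author"]]["lives_in"] in EUROPE
    ]


print(friend_movie_reviews_in_europe("user"))
```

The filter itself is three conditions; the hard part, as the post argues, is getting the review type, the friendship and the location published in machine-readable form in the first place.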

The Semantic Web could change the world. The only things missing have been incentive, which Yahoo! now provides, and ease of use, which Semantify began offering today.

Picture 2.png
http://www.readwriteweb.com/archives/semantify_automate_your_semantic_web_seo_in_five_minutes.php http://www.readwriteweb.com/archives/semantify_automate_your_semantic_web_seo_in_five_minutes.php Products Thu, 20 Mar 2008 17:07:35 -0800 Marshall Kirkpatrick
Funding the Semantic Web: Dapper's Ad Network Plan The founders of the data extraction and API creation service Dapper announced this week that their aim is to leverage Dapper in the service of ad networks and derive a semantic index of pages around the web from that activity. They will launch their ad-powering product at Ad:Tech in April. Essentially, it will perform ad-funded indexing of the semantic web.

Here's how it will work: Dapper lets users identify and tag particular fields on any page. It then extracts the value in that field and makes it available in XML. As a result of this advertising activity, Dapper believes a substantial quantity of pages around the web could have fields of interest delineated and tagged with relevant terms. Relationships between pages and fields and terms and tags can all be extracted and analyzed from this aggregated activity.

The company has already built a demonstration semantic search engine based on Dapper activity, and its ability to parse search results by semantic meaning and detail is quite sophisticated. The potential applications of a semantic index built by Dapper are really exciting to consider.

Dapper currently has 35,000 extraction functions (Dapps) created, but they are betting that a clear profit motive will incentivize advertisers to create many, many more. Advertisers will pay to have web content delineated by field and categorized.

The company argues that advertisers see substantially increased relevance and click-through if ads can be served based on very specific fields of content on a page. Early prototypes run on top music site Pitchfork and book summary site Shvoong saw 100 to 500% increases in CTR.

While Dapper's approach would likely leave the vast majority of fields on a page unindexed, it could also rack up a whole lot of semantic knowledge by riding the profit motive to discover the semantic meaning of the most monetizable fields on a far greater number of pages than would likely be analyzed otherwise. What better way to analyze the web than to ride along with ad networks? I can't think of any better way.

I think Dapper has a shot at helping fund the semantic analysis of much of the web. What will they do with the data other than use it to contextualize ads? That's another question, but an interesting one to consider.

Dappercamp was a great event this week and the tool itself is one I highly recommend. It's in startup mode and I'll be frank - many of the output formats simply don't work and there are a number of errors throughout the site. Nonetheless, I derive significant value for my work every time I engage with it. Here's a screencast tutorial I recorded on the service. Several Dapps, Dapper-created data extractions, have become daily go-to sources of information for me - but I also recognize that only so many people are going to be as excited about this technology for research purposes. For the rest of the world, for the viability of the company, and for the potentially gigantic secondary benefit of widespread semantic indexing - I think putting Dapper in service of ad networks is a plan of simple brilliance.

http://www.readwriteweb.com/archives/dapper_funding_the_semantic_web.php http://www.readwriteweb.com/archives/dapper_funding_the_semantic_web.php Products Wed, 06 Feb 2008 09:15:24 -0800 Marshall Kirkpatrick
MindTouch Powers-Up DekiWiki with Dapper Open source wiki vendor MindTouch is releasing a series of major new features Monday and some of them are quite interesting. People used to talk about MindTouch for its outlandish stunts - like working with nutball John Gotts on the short-lived Wiki.com platform and hiring a Bono impersonator to walk the exhibit floor at DEMO. Those days seem like the distant past as now the MindTouch software gets attention on its own.

Today the company's product, called Dekiwiki, gets an application platform based on its own simple language called dekiscript and a new execution engine. Additionally, a newly organized infrastructure will now allow thousands of wikis to be run with a single multi-tenant install. This should make management of multiple wikis in one organization far easier than ever before.

Also new is easy integration of outside data scraped by Dapper.net, and the display of that data using the new Google Charts API wrapped in Dekiscript, for easier mashup creation. I really like Dapper a lot - see our most recent coverage of this paradigm-changing tool here. This is really what motivates me to write about this release. An open source wiki integrating the screen scraping power of Dapper and displaying the data using the Google Charts API is just plain cool.

The company claims it sees 30k free downloads each month and most public discussion of the product is very positive. This past month, the company added open source industry journalist Matt Asay to its board of advisors and released versions of Dekiwiki translated into 9 different languages.

Below is a video from the company showing off all the new features in today's release. There's a lot going on for MindTouch - the company's outlook seems to be getting brighter all the time.

http://www.readwriteweb.com/archives/mindtouch_powersup_dekiwiki.php http://www.readwriteweb.com/archives/mindtouch_powersup_dekiwiki.php Startups Mon, 07 Jan 2008 07:52:52 -0800 Marshall Kirkpatrick
The Glory, Bliss and How-to of Screen Scraping for RSS Wired has an awesome top story today on the world of startups utilizing scraped data from big companies to offer new layers of value for their own users. It's a roughly objective piece that I highly recommend reading but it was also inspiration for me to finally record a screencast on the subject (see below).

I love RSS, probably more than anything on the web. If you're not familiar with the concept, see my very old definition of RSS and my almost-as-old post on teaching people about RSS.

Not every page on the web publishes an RSS feed, though. Thus the need for these wonderful screen scraping tools. I've written about a variety of tools you can use to create a feed for a site or page that doesn't have one. Sometimes, though, you've got to pull out the big guns. In those cases, it's time for Dapper.

Dapper, a company founded in Israel and now venture backed, was named in the aforementioned Wired article. It is the sweetness.

Dapper will let you pull data from almost any web page and get it in a wide variety of outputs, including RSS, email, iCal, a Google Gadget, CSV and Google Maps. Is that incredible or what?
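
To show what the RSS end of that pipeline involves, here is a minimal sketch that wraps already-scraped items in an RSS 2.0 document. It is not Dapper's code; the `build_rss` function and the example.com headlines are made up, and the scraping step itself is assumed to have happened elsewhere.

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, items):
    """Build a minimal RSS 2.0 document from (title, link) pairs."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = f"Scraped from {link}"
    for item_title, item_link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
    return ET.tostring(rss, encoding="unicode")


# Hypothetical scraped results; in practice these would come from the page.
scraped = [
    ("First headline", "http://example.com/1"),
    ("Second headline", "http://example.com/2"),
]
print(build_rss("Example page", "http://example.com/", scraped))
```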

Let's let the video do the talking. I have an awful cold (it's almost better, Mom!) so please excuse the very rough voice. I made the following screencast using JingProject, setting up an RSS feed of search results in Del.icio.us for articles tagged from ReadWriteWeb.

Clicking on the image below will open up another window so you can view the 4 minute video full screen.

If you're as excited about Dapper as I am, you should check out DapperCamp, a free two-day conference all about Dapper coming up in early February in San Francisco. IBM and Mindtouch are sponsoring the event and Mitch Kapor is keynoting it. It looks like it's going to be a lot of fun.

Take that, Wired Mag ambivalence! Really, though, you should read that Wired article - it's a good one that discusses some issues that are going to be very big once more people figure out how exciting data portability is.

http://www.readwriteweb.com/archives/screen-scraping.php http://www.readwriteweb.com/archives/screen-scraping.php Mon, 31 Dec 2007 20:57:24 -0800 Marshall Kirkpatrick