Each night this week while making dinner, I've listened to a different podcast interview with a semantic technology professional. It's been a fascinating experience and the man I have to thank for the education is Dr. Paul Miller, technology evangelist for semantic web platform vendor Talis. Miller is producing an informative, enjoyable and prolific series of hour-long conversations with some people whose work is simply amazing.
Semantic web tools are no longer trapped in the lab; after years of research they are becoming products and entering the public market. There is semantic technology work underway at Skype, Joost and the BBC, just to name a few brand names. Last month we called Twine possibly the first mainstream semantic web app.
I think of it like this: once our software can derive semantic meaning from the web pages it reads on our behalf, a whole lot of work will already be done for us, freeing our human, creative minds to save time and reach new heights.
Semantic technology is a subject of great interest to many of our readers here and elsewhere; we're proud that our post from yesterday highlighting 10 semantic companies to watch (including Miller's Talis) hit the front page of Digg this morning, taking this look into the future to an even larger audience.
I didn't know Richard MacManus was writing that post last night, but coincidentally I was already working on this interview with Dr. Miller. I'm very thankful that he took the time to help us dive deeper into the topic.
Marshall Kirkpatrick: What's the elevator pitch on what the semantic web is and what promise it holds?
Dr. Paul Miller: I hope it's quite a tall building we're riding up in this elevator, as the Semantic Web offers a wide range of opportunities moving forward. Talis Platform Advisory Group member Mills Davis of Project 10X is about to release his 'Semantic Wave 2008' report, and that does a great job of illustrating just how broad the potential for semantic technologies could be.
Reaching right back to that famous Scientific American article in 2001, there's been a tendency to paint a grand vision for The Semantic Web that encompasses a plethora of devices calling upon powerful reasoning and data mining capabilities to deliver a seamless, intelligent and unobtrusive end-to-end service to the end user. That vision is an interesting one, but still some way off.
One of the highlights of October's Web 2.0 Summit in San Francisco was the emergence of 'Semantic Apps' as a force. Note that we're not necessarily talking about the Semantic Web, which is the W3C initiative led by Tim Berners-Lee that touts technologies like RDF, OWL and other standards for metadata. Semantic Apps may use those technologies, but not necessarily. This was a point made by the founder of one of the Semantic Apps listed below, Danny Hillis of Freebase (who is as much a tech legend as Berners-Lee).
The purpose of this post is to highlight 10 Semantic Apps. We're not touting this as a 'Top 10', because there is no way to rank these apps at this point - many are still non-public apps, e.g. in private beta. It reflects the nascent status of this sector, even though people like Hillis and Spivack have been working on their apps for years now.
Firstly let's define "Semantic App". A key element is that the apps below all try to determine the meaning of text and other data, and then create connections for users. Another of the founders mentioned below, Nova Spivack of Twine, noted at the Summit that data portability and connectibility are keys to these new semantic apps - i.e. using the Web as platform.
In September Alex Iskold wrote a great primer on this topic, called Top-Down: A New Approach to the Semantic Web. In that post, Alex explained that there are two main approaches to Semantic Apps:
1) Bottom Up - involves embedding semantical annotations (meta-data) right into the data.
2) Top down - relies on analyzing existing information; the ultimate top-down solution would be a fully blown natural language processor, which is able to understand text like people do.
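To make the two approaches concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `data-type` attribute is a made-up annotation convention (real bottom-up efforts use formats such as RDFa or microformats), and the capitalized-word regex is a very crude stand-in for the natural language processing a real top-down system would need.

```python
import re
from html.parser import HTMLParser

# Bottom-up: the publisher embeds annotations right into the markup.
# "data-type" is a hypothetical convention invented for this sketch.
ANNOTATED = (
    '<p><span data-type="person">Nova Spivack</span> founded '
    '<span data-type="company">Twine</span>.</p>'
)

# Top-down: only plain text is available, so meaning has to be inferred.
PLAIN = "Nova Spivack founded Twine."


class AnnotationReader(HTMLParser):
    """Collects (type, text) pairs from elements carrying data-type."""

    def __init__(self):
        super().__init__()
        self.current_type = None
        self.found = []

    def handle_starttag(self, tag, attrs):
        self.current_type = dict(attrs).get("data-type")

    def handle_data(self, data):
        if self.current_type:
            self.found.append((self.current_type, data))

    def handle_endtag(self, tag):
        self.current_type = None


def bottom_up(html):
    """Read the meaning the author already wrote into the page."""
    reader = AnnotationReader()
    reader.feed(html)
    return reader.found


def top_down(text):
    """Guess at entities: runs of capitalized words (a naive heuristic)."""
    return re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*", text)


print(bottom_up(ANNOTATED))
print(top_down(PLAIN))
```

Note the trade-off the sketch exposes: the bottom-up reader knows that Twine is a company because someone said so in the markup, while the top-down guesser finds the same names but has no idea what kind of thing each one is.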
Now that we know what Semantic Apps are, let's take a look at some of the current leading (or promising) products...
Tim Berners-Lee, inventor of the World Wide Web, today published a blog post about what he terms the Graph, which is similar (if not identical) to his Semantic Web vision. Referencing both Brad Fitzpatrick's influential post earlier this year on Social Graph, and our own Alex Iskold's analysis of Social Graph concepts, Berners-Lee went on to position the Graph as the third main "level" of computer networks. First there was the Internet, then the Web, and now the Graph - which Sir Tim labeled (somewhat tongue in cheek) the Giant Global Graph!
Note that Berners-Lee wasn't specifically talking about the Social Graph, which is the term Facebook has been heavily promoting, but something more general. In a nutshell, this is how Berners-Lee envisions the 3 levels (a.k.a. layers of abstraction):
1. The Internet: links computers
2. Web: links documents
3. Graph: links relationships between people and/or documents -- "the things documents are about" as Berners-Lee put it.
The Graph is all about connections and re-use of data. Berners-Lee wrote that Semantic Web technologies will enable this:
"So, if only we could express these relationships, such as my social graph, in a way that is above the level of documents, then we would get re-use. That's just what the graph does for us. We have the technology -- it is Semantic Web technology, starting with RDF OWL and SPARQL. Not magic bullets, but the tools which allow us to break free of the document layer."
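RDF expresses relationships as subject-predicate-object triples, and that shape is easy to sketch. The snippet below is a rough illustration using plain Python tuples rather than a real RDF library; every identifier in it (`alice`, `knows`, `post-1` and so on) is invented for the example.

```python
# Each fact is a (subject, predicate, object) triple, the shape RDF uses.
# All names below are hypothetical and exist only for illustration.
triples = {
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "knows", "carol"),
    ("alice", "wrote", "post-1"),
    ("post-1", "about", "semantic-web"),
}


def match(pattern, facts):
    """Return the facts matching a pattern; None acts as a wildcard."""
    return sorted(
        fact
        for fact in facts
        if all(p is None or p == f for p, f in zip(pattern, fact))
    )


def friends_of(person, facts):
    """Anyone the given person 'knows', read straight from the graph."""
    return [obj for _, _, obj in match((person, "knows", None), facts)]


print(friends_of("alice", triples))
print(match((None, "about", None), triples))
```

Because the relationships live in the data rather than inside any one site's pages, a second application could run `friends_of` against the same set of triples without re-entering anything, which is the kind of re-use the quote describes.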
Sir Tim also notes that as we go up each level, we lose more control but gain more benefits: "...at each layer --- Net, Web, or Graph --- we have ceded some control for greater benefits." The benefits are what happens when documents and data are connected - for example being able to re-use our personal and friends' data across multiple social networks, which is what Google's OpenSocial aims to achieve.
Venture-funded UK semantic search engine TrueKnowledge is unveiling a demo of its private beta today and looks like an interesting site to watch. One cannot help but think of the still-unlaunched Powerset, but it's also reminiscent of the very real Ask.com "smart answers".
Though the video the company published this morning speaks quite well for itself, the gist of what's happening is this. TrueKnowledge combines natural language analysis, an internal knowledge base and external databases to offer immediate answers to various questions. Instead of just pointing you to web pages where the search engine believes it can find your answer, it will offer you an explicit answer and explain the reasoning path by which that answer was arrived at. There's also an interesting looking API at the center of the product. "Direct answers to humans and machine questions" is the company's tagline.
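Since we haven't been given access, we can't say how TrueKnowledge actually does this. Still, the general idea of returning an answer together with the chain of facts behind it can be sketched in a few lines of Python. The tiny knowledge base and the `is_located_in` function below are assumptions made up for illustration, not the company's method or API.

```python
# A toy knowledge base of "X is located in Y" facts (hypothetical data).
FACTS = {
    "Cambridge": "England",
    "England": "United Kingdom",
    "United Kingdom": "Europe",
}


def is_located_in(place, region):
    """Follow located-in facts upward; return (answer, steps examined)."""
    steps = []
    seen = set()
    current = place
    # The seen set guards against looping forever on circular facts.
    while current in FACTS and current not in seen:
        seen.add(current)
        parent = FACTS[current]
        steps.append(f"{current} is located in {parent}")
        if parent == region:
            return True, steps
        current = parent
    return False, steps


answer, reasoning = is_located_in("Cambridge", "United Kingdom")
print("Yes" if answer else "No")
for step in reasoning:
    print(" -", step)
```

The point is the second return value: the answer arrives with the facts that were chained together to reach it, so a person (or another program) can check the reasoning rather than take the answer on faith.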
It sounds very interesting and I'd love to get my hands on it. Unfortunately, the company isn't allowing general access to the site and hasn't given me a login yet either. I hope it's real and really performs as advertised. It takes a very special technology to get coverage of a screencast and coverage again of an actual product release later. This might be one of those technologies. With the sense of self-importance that's implied by the act of unveiling your private beta to the world, one hopes there will be some meat here.
Founder William Tunstall-Pedoe says he's been working on the software for the past 10 years, really putting time into it since coming into initial funding in early 2005. Hopefully there won't be a Powerset style wait for the actual product. Keep an eye on our network blog AltSearchEngines for coverage of TrueKnowledge and the rest of the search engine world as soon as information emerges. See also Alex Iskold's excellent write up on a top-down approach to the semantic web and our coverage of semantic app Twine.
Dapper, the Israeli-founded super-tool that "creates an API for any website", is looking to make the transition from useful service to viable business with the release of DapperAds, the company's new ad serving technology.
Dapper's basic service, found at Dapper.net, is essentially a screen scraper with a point and click interface. You identify a field that's common across multiple pages on a website and Dapper will deliver whatever values are in that field by RSS, iCal, KML (mapping language) or in a variety of other formats. It's a very cool way to create mashups on the fly. The company recently released an interface for making Facebook apps quickly and easily.
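For readers who want a feel for what "pull one field from many pages and hand it back as a feed" involves, here is a small Python sketch using only the standard library. It is not Dapper's implementation: the sample pages, the `headline` class name and the `FieldExtractor`, `extract_field` and `to_rss` helpers are all invented for this example, and the feed is a bare skeleton that leaves out the channel `link` and `description` elements a complete RSS 2.0 feed would carry.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Two hypothetical pages that share a common field (class="headline").
PAGES = [
    '<html><body><h1 class="headline">First story</h1><p>Body</p></body></html>',
    '<html><body><h1 class="headline">Second story</h1><p>Body</p></body></html>',
]


class FieldExtractor(HTMLParser):
    """Collects the text of every element whose class equals field_class."""

    def __init__(self, field_class):
        super().__init__()
        self.field_class = field_class
        self.capturing = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        if dict(attrs).get("class") == self.field_class:
            self.capturing = True
            self.values.append("")

    def handle_data(self, data):
        if self.capturing:
            self.values[-1] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the capture.
        self.capturing = False


def extract_field(pages, field_class):
    """Pull the same field out of each page, in page order."""
    values = []
    for html in pages:
        parser = FieldExtractor(field_class)
        parser.feed(html)
        values.extend(v.strip() for v in parser.values)
    return values


def to_rss(title, values):
    """Wrap the extracted values as items in a skeletal RSS document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    for value in values:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = value
    return ET.tostring(rss, encoding="unicode")


headlines = extract_field(PAGES, "headline")
feed = to_rss("Headlines", headlines)
print(feed)
```

Swapping `to_rss` for a function that writes iCal or KML would give the other output formats mentioned above; the extraction step stays the same.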
Dapper is really a head turner, enough so that it has raised VC backing - something only a few consumer level data manipulation services like this can say. They've got a long term vision of creating an API marketplace that could work if website owners get on board. In the meantime, though, they've come up with something simpler and easily monetizable.
The new DapperAds service takes Dapper's core technology and puts it to work for site owners themselves. It's pretty simple.
Today hakia added a hakia highlighter to their “meaning-based” search engine, producing a highlighted sentence inside a search result. The bigger announcement is tomorrow, when hakia will launch a scoop button - a browser plug-in that not only highlights text, but also scrolls automatically to the highlighted passage when you click on a result page, enables you to save data to your computer, and offers more customization features that we'll discuss below.
Both of these new tools allow for faster, more relevant result selection and additional utility for users.
By Alex Iskold
We've been writing recently about the rise of semantic web and how in 2007 we'll see many interesting semantic technologies. The fundamental problem that all these technologies need to solve is explaining the meaning of things to computers. There are several approaches to this, all of which in principle can work.
There are companies and technologies that are doing it bottom up - by embedding semantical annotations (meta-data) right into the data. The opposite camp is exploring the top-down approach, which relies on analyzing existing information. The ultimate top-down solution would be a fully blown natural language processor, which is able to understand text like people do.
In this post, we are going to look at ClearForest - one of the companies in the top-down camp. At first glance, you might not think much of the company's web site, but a deeper dive reveals that ClearForest is restructuring to apply its core natural language processing technology to next generation semantic applications. The fact that ClearForest has released both a Web Service and a Firefox extension that leverages an API to deliver the end-user application says that the company gets what the next generation web is all about.