RWW: An interesting new product was launched this year called Wolfram|Alpha, described as a 'computational knowledge engine.' It's kind of a mix between Google (search) and Wikipedia (knowledge), and its key attribute is that it enables you to compute something. The founders think that 'computing' things on the fly is something we're going to see a lot of in future. What's your take on Wolfram|Alpha?
TBL: There are two parts to that sort of technology. One of them is a sort of stilted natural language interface. We've seen those sorts of natural language queries for years. Boris Katz [from W3C] created a system called START [a software system designed to answer questions that are posed to it in natural language]. I think with the Semantic Web out there, those sorts of interfaces are going to become important, very valuable, because people will be able to ask more complicated things. The search engine has traditionally been limited to just a phrase, but some of the search engines are now starting to realize that if they put data behind them and have computation engines, then you can ask things like 'what's this many pounds in dollars?' and so on. So yes, those interfaces will become important.
Conversational interfaces have always been a really interesting avenue. We've had voice browser work in W3C, and that has been an interesting alternative avenue. It's possible that as compute power goes up, we'll see a proliferation of machines capable of doing voice. It'll move from the mainframe to being able to run on a laptop or your phone. As that happens, we'll get actual voice recognition and pattern natural language at the front end. That will perhaps be an important part of the Semantic Web.

We talked before about what a great challenge the Semantic Web is going to be from a user interface point of view. Conversational interfaces are going to be part of [solving] that. Of course it's also going to be really valuable to have compositional interfaces - for the visually impaired and so on.
Wolfram|Alpha is also a large curated database of data sets. Obviously I'm interested in the big data set which is out there, which is Linked Data. This everybody can connect to. I don't really know a lot about the internals of Wolfram|Alpha's data set. I don't know whether they're likely to put any of it out on the web as Linked Data - that might be an interesting addition. I imagine that quite a lot of it may have come from the web of Linked Data.
RWW: There have been reports recently that both Google and Yahoo will be supporting the GoodRelations ontology and Linked Data for e-commerce. Companies such as Best Buy are already putting out product information in RDFa. What would be your advice to e-commerce vendors right now, to help them transition to this world of structured data on the Web? The same question could be asked across many verticals, but e-commerce seems like one area which has some momentum right now. Would you advise them just to put out their data as Linked Data?
TBL: Yup! Certainly this year is the year to do it. I've been advising governments to do it and when you look at an enterprise, you find that a lot of the issues are the same. But when you put your data from government or enterprise out there, make sure you don't disturb existing ecosystems. Don't threaten those systems, because you've spent years building them up.
Maybe there's an analogy with when the Web first started and the first bookshops went online. They were more or less a flyer, saying 'hey we have a great bookshop at 23 Main St, come on down!'. Let's say that a person named Joe owned one of these early online bookshops. If somebody had suggested to Joe that he should put his catalog online, Joe would've felt that that was very proprietary data. And he'd be worried that other bookshops would see where he was weak, so they'd be able to advertise themselves as filling that niche he's weak in.
But when his competitors Fred and Albert put their catalogs online, then Joe can check which books people are browsing at Fred and Albert's websites. So Joe would [finally] be persuaded to put his book catalog up online. But he doesn't put up the prices... until Albert and/or Fred do. And even if catalog and pricing are up there, nobody puts their stock levels online. And there was a period of time when nobody [i.e. online booksellers] had their stock levels up. But people got fed up with ordering stuff that wasn't in stock. So the first book shop to actually tell you about stock levels was suddenly unbelievably attractive to its customers.
So there's this syndrome of progressive competitive disclosure. This happens when people realize that if you're going to do business with somebody, if you're going to have your partners up and down the supply chain, really it's useful to check the data web - and life moves much more quickly and openly.
Best Buy may be what starts the ball rolling [among e-commerce vendors]. Now if I want to look out for what [products are] available, I can write a program to see what there is. If somebody wants to compete with Best Buy, to my program they'll be invisible unless they can get their data up in RDF. It doesn't matter whether they use RDFa or RDF/XML; as long as it maps in a standard fashion to the RDF model, they will be visible.
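[Editor's sketch: the kind of program described above might look roughly like the following. This is a minimal illustration, not anything from the interview. It assumes each vendor's page has already been parsed, from whatever syntax it uses, into plain (subject, predicate, object) triples. The `EX` vocabulary, the vendor URLs, and the `find_offers` helper are all hypothetical; a real program would use an RDF parser and a shared vocabulary such as GoodRelations.]

```python
from typing import Iterable, List, NamedTuple, Tuple


class Triple(NamedTuple):
    """One statement in the RDF model: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical vocabulary used only for this illustration.
EX = "http://example.org/vocab#"


def find_offers(triples: Iterable[Triple], product_name: str) -> List[Tuple[str, float]]:
    """Return (offer URI, price) pairs for a product name, cheapest first."""
    triples = list(triples)
    # Offers whose name matches the product we are looking for.
    matching = {
        t.subject
        for t in triples
        if t.predicate == EX + "name" and t.obj == product_name
    }
    # Prices attached to those offers.
    priced = [
        (t.subject, float(t.obj))
        for t in triples
        if t.predicate == EX + "price" and t.subject in matching
    ]
    return sorted(priced, key=lambda pair: pair[1])


# Vendor A published RDFa and vendor B published RDF/XML (both hypothetical).
# Once parsed, the triples look the same, so the program treats them alike.
vendor_a = [
    Triple("http://vendor-a.example/offer/1", EX + "name", "USB cable"),
    Triple("http://vendor-a.example/offer/1", EX + "price", "5.00"),
]
vendor_b = [
    Triple("http://vendor-b.example/offer/7", EX + "name", "USB cable"),
    Triple("http://vendor-b.example/offer/7", EX + "price", "4.50"),
]
# Vendor C published only an HTML page, so it contributes no triples
# and never shows up in the results.
vendor_c: List[Triple] = []

offers = find_offers(vendor_a + vendor_b + vendor_c, "USB cable")
for uri, price in offers:
    print(f"{uri}: {price:.2f}")
```

[The serialization never appears in the query logic: only vendors whose data maps to triples can be found, which is the sense in which a competitor without RDF is invisible to the program.]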
Comments
The problem of interfaces is a very important one, not only for displaying linked data, but also for motivating users to enter it and interact with it. We've been working for over two years on testing different interfaces for ThisIsLike.Com, seeing what works best to get people excited about entering their data into the system and linking it together.
> "keywords haven't proven up to the task of finding stuff on the Web. One of the reasons is that people lie, the other is that they can't be bothered to enter keywords"
TBL said it perfectly. In fact, with meta-data, users are ENCOURAGED to lie because they know the data is going to be used to organize or prioritize content.
If users want their 15 minutes of fame for writing customer or peer reviews, then of course they tag their review with popular search terms and rate the video (or blog posting) a 10 or a 0.
I'm interested in this issue because I've been working with a team to see if we can rate, cluster, summarize, and compare objects using algorithms that rely solely on linguistic and semantic analysis of text reviews.
Thanks for the great interview!
I really like this interview and re-read it a couple of times. It contains a lot of the essential grounding elements for the Semantic Web and its evolution.
URIs and RDFized content will become mainstream.