sematic web - ReadWriteWeb
http://www.readwriteweb.com/feeds/tag/sematic web
Copyright 2012 Richard MacManus readwriteweb@gmail.com
Wed, 15 Feb 2012 10:45:03 -0800

Google Adds Semantic Web, Facebook Support for Video Search

Google announced today support for enhanced markup for video search. This will allow webmasters to include important information, such as titles and descriptions, in machine-readable HTML along with the JavaScript or Flash videos themselves.

In a blog post, video search project manager Michael Cohen wrote, "We wanted to offer webmasters an additional tool, so today we're taking a page from the rich snippets playbook and announcing support for Facebook Share and Yahoo! SearchMonkey RDFa."

Google's "rich snippets," as we previously reported, use structured data open standards such as microformats and RDFa to give users more detailed previews of the information contained on a web page.

Both the Facebook Share and RDFa markup formats will enable webmasters to give Google - and video-searching users - vital details, including video titles and descriptions. Like other semantic web technologies, these details allow our search engines to become smarter and our results richer and more relevant. And by allowing webmasters to specify the content type as video content, users' searches for video will yield more results with greater relevancy.

"While we've become smarter at discovering this information on our own," Cohen writes, "we'd certainly appreciate some hints directly from webmasters."

Yahoo! SearchMonkey, a semantic search technology which we've covered extensively in the past, gives webmasters the opportunity to create descriptions about content - in this case, online video. With these machine-readable descriptions, the search engine extracts structured data about videos and renders that data as enhanced search results.

The Facebook Share markup format also allows for the inclusion of metadata with video content.
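As a rough illustration of the kind of machine-readable hints described above, here is a minimal Python sketch that generates Facebook Share-style tags for a page's head section. The tag names (title, description, image_src, video_src, video_width, video_height, video_type) are written from memory of the Facebook Share conventions of the time and are not confirmed by the announcement quoted here, so treat them, and the function name facebook_share_video_tags, as assumptions.

```python
from html import escape


def facebook_share_video_tags(title, description, thumbnail_url, video_url,
                              width, height,
                              video_type="application/x-shockwave-flash"):
    """Return head tags describing a video.

    Tag names are assumptions based on Facebook Share conventions;
    check the current documentation before relying on them.
    """
    metas = [
        ("title", title),
        ("description", description),
        ("video_width", str(width)),
        ("video_height", str(height)),
        ("video_type", video_type),
    ]
    links = [
        ("image_src", thumbnail_url),
        ("video_src", video_url),
    ]
    # Escape attribute values so titles containing & or quotes stay valid HTML.
    lines = [
        '<meta name="{}" content="{}" />'.format(name, escape(value, quote=True))
        for name, value in metas
    ]
    lines += [
        '<link rel="{}" href="{}" />'.format(rel, escape(href, quote=True))
        for rel, href in links
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical video page; the example.com URLs are placeholders.
    print(facebook_share_video_tags(
        "Cats & Dogs", "A short clip",
        "http://example.com/thumb.jpg", "http://example.com/clip.swf",
        640, 480,
    ))
```

The point of the sketch is only that the title, description, and media type travel with the page as plain attributes a crawler can read without executing the Flash or JavaScript player.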

http://www.readwriteweb.com/archives/google_adds_semantic_web_facebook_support_for_vide.php Google Mon, 14 Sep 2009 17:00:04 -0800 Jolie O'Dell
How Does the Web Feel? Evri's New Sentiment API Tells You

Semantic search engine Evri can now understand how the web feels with the launch of their new sentiment web API. While busy scouring the net for people, places, and things and determining the relationships between them, the search engine is now able to understand the feelings associated with these entities, too, be they positive or negative. Using the API, developers can build applications for things like market intelligence, market research, sports and entertainment, brand management, product reviews and more.

Not Just Good or Bad, but Who, What, and Why, Too

At first we thought Evri's API would simply rank things as positive or negative, much like the Twitter tracker twendz does today, highlighting positive, negative, and neutral items. However, the sentiment API does so much more, allowing you deeper insight into the "who," "what," and "why" behind a particular expression or feeling.

To be more specific, according to the announcement, Evri lets you:

  • Find the percentage of positive and negative expressions of sentiment made by an entity, or about an entity. For example, find out what percentage of things being written about the iPhone are positive and what percentage are negative.
  • Discover who is criticizing and who is praising a particular person, place or thing. For example, see who is criticizing and praising Microsoft right now.
  • Read what praisers and critics are saying about an entity. For example, see what the GOP are saying about the Democrats.
  • Discover who or what your favorite entity is bashing and why. For example, see who Lance Armstrong is complaining about.
  • Discover who or what your favorite entity is praising and why. For example, see who the World Health Organization is commending and why.
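To make the first two capabilities in the list above concrete, here is a minimal Python sketch of the underlying arithmetic. It does not call Evri's API; the record layout (source, target, polarity), the function names, and the sample data are all hypothetical and stand in for whatever the real service returns.

```python
from collections import Counter


def sentiment_breakdown(expressions, entity):
    """Percentage of positive and negative expressions made about `entity`."""
    counts = Counter(
        e["polarity"] for e in expressions
        if e["target"] == entity and e["polarity"] in ("positive", "negative")
    )
    total = counts["positive"] + counts["negative"]
    if total == 0:
        # Nothing has been said about this entity; avoid dividing by zero.
        return {"positive": 0.0, "negative": 0.0}
    return {
        "positive": 100.0 * counts["positive"] / total,
        "negative": 100.0 * counts["negative"] / total,
    }


def critics_and_praisers(expressions, entity):
    """Who is praising and who is criticizing `entity`."""
    praisers = {e["source"] for e in expressions
                if e["target"] == entity and e["polarity"] == "positive"}
    critics = {e["source"] for e in expressions
               if e["target"] == entity and e["polarity"] == "negative"}
    return {"praisers": sorted(praisers), "critics": sorted(critics)}


# Hypothetical sample records, not real Evri output.
sample = [
    {"source": "Blogger A", "target": "iPhone", "polarity": "positive"},
    {"source": "Blogger B", "target": "iPhone", "polarity": "negative"},
    {"source": "Blogger C", "target": "iPhone", "polarity": "positive"},
    {"source": "Blogger C", "target": "Microsoft", "polarity": "negative"},
]

if __name__ == "__main__":
    print(sentiment_breakdown(sample, "iPhone"))
    print(critics_and_praisers(sample, "iPhone"))
```

The hard part, which this sketch skips entirely, is producing the records in the first place: extracting who said what about whom from free text.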

When unleashed upon the web as a whole, this could unearth a veritable goldmine of information. Just thinking of how many different ways it could be used is enough to blow anyone's mind. Of course, marketers will be the first to jump on board, looking for practical ways to track the feelings about their companies, clients, and brands and why they're changing, but an engine that understands sentiment could do so much more than just this. It can literally take the pulse of the web the way we take the pulse of Twitter using apps like the above-mentioned twendz to rank trends as positive or negative.

Demo: The "Vibology Meter"

To demonstrate what Evri can do, the company created a widget called the "Vibology Meter." (Sadly, no link is provided.) The widget not only ranks the good or bad "vibes" about a particular entity (in the example, Barack Obama), but also explores topics associated with that entity and whether or not the primary entity feels positively or negatively towards them. For example, the widget shows Obama is negative towards the GOP and Rush Limbaugh but feels positive about Michelle Obama. (Well, that's good!)

When you click on any one of the associated topics (or click on "anything" to see all topics of either positive or negative slant), you're then presented with a sidebar of information. Here, snippets from articles found on the web display along with a title, link, and timestamp.

Of course, this is just a simple example of the Evri API in action. We're sure the developers out there can think up even better ideas than this.

Challenges Ahead

The challenge now for Evri is to keep expanding its index in order to track more sources to rank. At the moment, the engine doesn't track a large slice of the web the way a typical search engine like Google does - in fact they don't even claim to be a search engine...despite what that "Go to" box on their homepage would have you believe. Instead, Evri looks specifically at the people, places, and things on the web and maps the connections between them.

To determine these connections - and now, the associated sentiments as well - Evri pulls from a limited number of "highly regarded" sources. That means you'll definitely see a site like CNN used to rank a person like Obama, but the myriad of tiny politico blogs will be ignored. That's actually a shame, since delving into this "long tail" of the web could give a better overall picture of how all people really feel, not just the sentiments expressed on high-profile sites written by top bloggers and journalists. Still, we know indexing and parsing this long tail is something that's much easier said than done.

In the end, what Evri's doing, even on this smaller scale, is definitely interesting. We hope to see the new API put to good use in the near future.

http://www.readwriteweb.com/archives/how_does_the_web_feel_evri_tells_you.php Semantic Web Fri, 14 Aug 2009 07:37:20 -0800 Sarah Perez
Faviki's Social Bookmarking Tool Makes Semantic Tagging Even Easier

When we first looked at Faviki, a social bookmarking application which made its debut last year, we were intrigued by their idea of "semantic tagging." What makes Faviki different from its competitors, services like del.icio.us, Diigo, and the now-defunct Ma.gnolia, is the way the service suggests tags to its users. The suggestions don't come from the community of Faviki users and their tagging history - they come from structured info extracted from the Wikipedia database.

Today, Faviki is releasing an upgrade to their service which will give you even better control over the tagging process, making bookmarking even easier than before. They're also announcing support for OpenID.

A Better Tagging Interface

The biggest upgrade today is Faviki's enhanced tagging interface. In the past, Faviki struggled with some of the tag suggestions pulled out of Wikipedia because they were too long and too hard to enter for practical use. Plus, users wanted to use tags of their own creation, not the tag suggestions.

For example, if someone is tagging an article about the soccer player "Filippo Inzaghi," they may want to tag it by the player's nickname "Pippo." Before, this was not possible. But now, if Faviki doesn't understand a tag, it will pull in possible matches and ask you "What exactly do you mean by ______?" After you pick your selection, Faviki will remember your choice.

This is an important change for the service because it means users can tag web pages any which way they want, but they're still linked to the structured data on the back-end. That way, when someone searches through Faviki's community tags, all the web pages for that particular item or concept will appear, even if people tagged them using their own personal keywords.
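The idea of personal tags that still resolve to shared structured data can be sketched in a few lines of Python. This is not Faviki's implementation; the TagIndex class, its method names, and the example users and URLs are hypothetical, and the sketch assumes the user has already answered the "What exactly do you mean by ______?" prompt.

```python
from collections import defaultdict


class TagIndex:
    """Maps each user's personal tags onto shared canonical concepts."""

    def __init__(self):
        self.aliases = {}               # (user, personal tag) -> concept
        self.pages = defaultdict(set)   # concept -> bookmarked URLs

    def remember(self, user, tag, concept):
        """Store the user's answer to the disambiguation prompt."""
        self.aliases[(user, tag.lower())] = concept

    def resolve(self, user, tag):
        """Return the concept behind a personal tag, or None if unknown."""
        return self.aliases.get((user, tag.lower()))

    def bookmark(self, user, url, tag):
        """File a URL under the concept that the user's tag stands for."""
        concept = self.resolve(user, tag)
        if concept is None:
            raise KeyError("unknown tag %r: ask the user what it means" % tag)
        self.pages[concept].add(url)
        return concept

    def search(self, concept):
        """All URLs for a concept, whatever keywords people used."""
        return sorted(self.pages.get(concept, ()))


if __name__ == "__main__":
    index = TagIndex()
    index.remember("alice", "Pippo", "Filippo Inzaghi")
    index.remember("bob", "Inzaghi", "Filippo Inzaghi")
    index.bookmark("alice", "http://example.com/a", "pippo")
    index.bookmark("bob", "http://example.com/b", "Inzaghi")
    print(index.search("Filippo Inzaghi"))
```

Keying aliases by user keeps one person's nickname from silently redefining a tag for everyone else, while the shared concept index is what lets a community search find both bookmarks.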

Beyond Wikipedia

Another change in Faviki's service is the ability to define new tags. Prior to today, the service was limited to searching Wikipedia for tag suggestions, but now it has the whole web at its fingertips. If a tag is entered which doesn't match anything from Wikipedia, Faviki will search Google for relevant URLs and then ask if the links presented represent the same tag. As multiple users go through this process, Faviki learns what URLs best represent that concept and adds the new tags created by the users to its database.

API, OpenID, and More

Faviki has also just launched a Save/Edit API that provides a way to save and edit bookmarks from other applications. In addition, they've introduced support for OpenID. Other new features arriving today include a smarter autocomplete list, the ability to convert tags, spam control, the ability to export/backup your bookmarks, and a new tag description tooltip.

The only issue we have with Faviki is the same one we had before: there's still no import function available. That means you'll have to leave your extensive bookmark collection behind if you want to use this service. We suppose that it could be difficult to properly tag and match all of our old bookmarks, but without this feature, Faviki doesn't have the best shot at attracting the heaviest users of social bookmarking services.

http://www.readwriteweb.com/archives/favikis_social_bookmarking_tool_makes_semantic_tagging_easier.php Product Reviews Thu, 02 Jul 2009 06:04:01 -0800 Sarah Perez
Tags as Far as the Eye Can See: New York Times to Publish Index as Linked Data

Today, at the Semantic Technology Conference, Rob Larson and Evan Sandhaus of the New York Times announced together that the Times will soon be publishing its copious index as Linked Data.

The Times' data will join content from Project Gutenberg, a vast online library of text from public domain books, data from the U.S. census, and information from many other formative and vital entities in the semantic web space. Larson and his team intend to make available hundreds of thousands of tags for content dating back to 1851. This will give developers an invaluable, automatically navigable roadmap for the publication's vast directory of knowledge and will link that data to existing pages, people, and content around the web.

In his keynote address, Larson emphasized "How deeply we [at the Times] care about metadata."

"It's been fundamental to what we do for a long time. We feel we're good at it, but our content is an island... we want to announce our intention to publish our thesaurus to the community under a license that will allow you to use it and contribute your improvements... The results of this effort will in time take the shape of the Times entering this Linked Data cloud. This is wholly consistent with our open strategy... to facilitate access to slices of our data for those who want to include it in their applications."

Larson likened the Times corpus to a quarry of data. He said that the newspaper's API provided the picks and shovels to mine data, and the Linked Data initiative would be the map.

The timing, licensing, format, and other factors of the project are yet to be determined.

This announcement comes on the heels of CNET's partnership with Reuters to publish data to the Linked Data cloud. Moreover, exactly one month ago, we wrote that Linked Data was a concept "whose time has come" and gave a thorough overview of the concepts and standards it entails, for curious readers who would like to drill deeper on the subject.

In another recent interview, Sandhaus detailed the tagging process for the Times' corpus, both for print and online articles:

"There are two types of tagging that go on at the Times... Every day, indexers take the paper and go article by article and associate each article with subject keywords. Then they manually summarize it. It's like a Google list, but in dead tree form.

Another type of tagging we do is... when an article goes from the newsroom to the web, it's put there by a producer who will augment the article with any number of rich features like images, multimedia... and subject keywords. Unlike the indexers, who do this completely by hand, the producers are assisted in their tagging by an automated classification system which suggests tags to be applied to the data and which are ultimately approved by the producer."

An official announcement is expected at the Times' Open blog tomorrow, with details on the project to follow.

http://www.readwriteweb.com/archives/nytimes_linked_data.php Semantic Web Thu, 18 Jun 2009 12:43:03 -0800 Jolie O'Dell
Why the Web 3.0 Conference Was a Success

The Web 3.0 Conference in New York last week was a visible success. Attendance was good, and so it seems that the organizers are making money. That is significant in a recession, when many conferences that were announced have had to be suddenly canceled due to lack of interest. At a more qualitative level, the Web 3.0 Conference had a good mix of different types of people. It was not an echo chamber. Personally, I found the conversations more stimulating than average for a conference.

Who Was There?

This is a personal impression based on actual conversations, not on the attendance list.

  • Serial entrepreneurs seeking their next big venture. I spoke to two of them. What was interesting was that both were very successful, knew very little about the semantic Web (they were there to learn), and were extremely open to seeing where the opportunities lie. In other words, they were at the formative stages of their ventures.
  • Semantic Web pioneers. Conference organizers made it very clear they did not want an echo chamber of SemWeb experts talking to SemWeb experts. They wanted SemWeb experts to connect with business people who had problems that needed solving. That seemed to be happening.
  • Connectors, money guys, promoters. There were quite a few of these, usually a sign that something is either happening or about to happen.
  • Publishers. Well, the conference was in New York, so you would expect publishers, of all types, both big and small.
  • Semantic Web ventures that are already getting traction. Most of these appearances took the form of speakers and conference sponsors.

Where Is the Value in this Next Phase of the Web?

This is what the serial entrepreneurs were asking. Here is my view after a few days of reflection. Three big market opportunities will see semantic Web technology used in different ways in the near term:

  • Scientific/technical/medical (STM) publishing,
  • Market research information created from random social media chatter,
  • Improved advertising relevance.

Each of these deserves closer inspection.

Scientific/Technical/Medical Publishing

Open-source data will disrupt traditional data publishing -- in particular and immediately STM publishing -- similar to how open-source software disrupted the software industry. STM publishing is a market worth more than $10 billion, so this is significant. Similar forces will play out in financial, legal, and other data-rich industries, but STM is likely to be in the vanguard for the following reasons:

  • Everybody in the eco-system wants this to happen except the current publishers. Governments and institutions that fund research want it to be freely available. The authors are not like book authors; they don't get paid per book sold. They want wide distribution and peer recognition.
  • There are huge benefits to the raw data being machine-readable, not the least of which is that the data can be used for further analysis, rather than be squeezed into the artificial format designed for print journal distribution.
  • Scientists and researchers will use the semantic Web tools that consumers and business people consider too complex (until some great UI designers take on this challenge).

As in any market transition, there will be winners and losers.

Winners:

  • Scientists and researchers,
  • (Indirectly) everyone who benefits from the products created by scientists and researchers,
  • New publishers (or some other entity) that add enough value to free source data that they are still able to charge for it.

Losers:

  • Traditional STM publishers who cling too tightly to their current cash cow and so cannot effectively ride the next wave.

After it goes through the STM sector, this wave will crash through other data-rich publishing markets, such as:

  • Finance
  • Law

Market Research Information Created from Social Media Chatter

The Web 2.0 era has unleashed an enormous amount of social media chatter. These conversations are inconsequential to all except the participants... until, that is, they are aggregated, structured, and analyzed. This is not simple to do, as security and intelligence agencies have long understood. When you can record any conversation you like, you quickly find that discovering something useful is really hard. Historically, only intelligence agencies have had access to this volume of chatter. And the public has only had access to conversations between "important" people about important subjects. Multiply the chat you and I had about what we had for breakfast a few million times, and someone might get interested, specifically someone in the market research industry.

Market research is a large industry. Obtaining explicit data about people by getting them to fill in surveys is becoming increasingly hard and expensive. Perhaps gathering data about what people are actually talking about and deriving something useful from that would be easier.

This is not likely to be that elusive native revenue model for social media. But it could be a useful add-on revenue stream. Semantic Web ventures that can pay social media sites for raw data, extract that data, add meaning, and sell it to marketers could do very well. That won't be easy to do well, though.

Improved Advertising Relevance

AdWords represented a massive advance in advertising relevance. It changed the advertising and media industries beyond recognition and made Google the most powerful technology company on the planet.

But is this as far as we can go with advertising relevance? Almost certainly not. Whether Google or another venture leverages the semantic Web, there is little doubt that semantic Web technology will improve advertising relevance. Quite how to do this is the subject of another post.

Disclosure: Web 3.0 was a sponsor of ReadWriteWeb, but we have no other financial interest in the event.

http://www.readwriteweb.com/archives/why_web_30_conference_was_a_success.php Events Guide Tue, 26 May 2009 11:40:39 -0800 Bernard Lunn