ReadWriteWeb

Will Social Bookmarking Pay Dividends with Search Result Augmentation?

Written by Josh Catone / February 21, 2008 1:20 PM / 9 Comments

Last May we asked the question, "are social bookmarking sites better at search than Google?" Though some readers questioned our specific methods, our conclusion was that "while social bookmarking and ranking sites don't make great search engines on their own, they offer a wealth of user-vetted data that could be used to augment search results in a positive way." Recently, Yahoo! began testing including del.icio.us data in search results. While it is unclear whether the del.icio.us data is affecting search rankings, the more important question is: would it even matter?

A group of researchers at Stanford recently presented a paper in which they answered that very question: can social bookmarking augment traditional search results? Marshall Kirkpatrick made brief mention of the paper in a post earlier today.

The paper, entitled "Can Social Bookmarking Improve Web Search?", was presented at the First ACM International Conference on Web Search and Data Mining (WSDM'08). It includes eleven experiments designed to evaluate "different aspects of social bookmarking and their impact on web search," using del.icio.us bookmarking data, Yahoo! and AOL search data, and ODP data gathered between May and June of 2007.

Overall, the group concluded that the relatively small size of the social bookmarking community (the paper's authors estimate that only about 1/1000th of the web has been bookmarked and tagged in del.icio.us) means that it is not yet ready to make a significant impact on search, but that there are still ways in which social bookmarking data can be used to improve how search engines work. That's a similar conclusion to one we made in December when we noted that del.icio.us is mostly being used to bookmark stories related to a few narrow fields, which means that its usefulness in augmenting search rankings is limited.

One of the commenters on that post theorized that social bookmarking isn't used as much for cataloging things like celebrity gossip or sports news because it is used mainly for storing information that is not time sensitive. "Delicious does not do good at celebrities, sports news & so on because it's meant for what you want to keep over time. Not the latest updates. It's as simple as that to me." (NatC)

That sounds logical. However, the Stanford group found that del.icio.us users tend to post pages "that are actively updated or have been recently created" and recommended that search engines use social bookmarking data to augment or improve their crawl schedules. In fact, about 25% of URLs entered into del.icio.us are not seen in search engines for another 4 weeks to 6 months, which indicates, say the researchers, that social bookmarking could be used "as a (small) data source for new web pages and to help crawl ordering."
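To make the crawl-ordering idea concrete, here is a minimal sketch of how a crawler might prioritize URLs it has not yet indexed based on bookmarking activity. It is not the method from the paper: the function name `crawl_order`, the `(url, timestamp)` feed format, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def crawl_order(bookmark_events, indexed_urls):
    """Return not-yet-indexed URLs, most frequently bookmarked first.

    bookmark_events: iterable of (url, unix_timestamp) pairs
                     (a hypothetical feed format, not a real API).
    indexed_urls:    set of URLs the search engine already knows about.
    """
    counts = Counter()   # how many times each unseen URL was bookmarked
    latest = {}          # most recent bookmark time per unseen URL
    for url, ts in bookmark_events:
        if url in indexed_urls:
            continue     # already crawled; bookmarks add nothing new here
        counts[url] += 1
        latest[url] = max(ts, latest.get(url, ts))
    # Sort by bookmark count, then by recency, both descending.
    return sorted(counts, key=lambda u: (-counts[u], -latest[u]))


events = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("d", 5)]
print(crawl_order(events, indexed_urls={"c"}))
```

A real crawler would fold this signal into its existing scheduling rather than replace it, since the bookmarked portion of the web is small.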

What we're most interested in, though, is how social bookmarking can be used to affect search engine rankings. The Stanford team found that tagging at social bookmarking sites would probably not be very helpful because 80% of tags are found in the page text or the surrounding text, so those pages would likely be found by search engines anyway. The team also found, though, that del.icio.us has a high level of redundancy for about 20% of URLs, and that there is generally an adequate level of overlap between top search results and bookmarked pages. While the paper doesn't draw any conclusions regarding the relevancy of URLs that are tagged more than once, it has long been our contention that the number of times a URL has been saved, in conjunction with tag data for that URL, could be used by search engines to augment ranking algorithms -- i.e., URLs that are saved more (or more often) are likely to be more useful.
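As a rough illustration of that contention, the sketch below boosts a search result's score when the URL has been bookmarked with a tag that matches a query term. Everything here is hypothetical: the `rerank` function, the `weight` parameter, the data layout, and the choice of a logarithm to dampen large save counts are assumptions for illustration, not anything described in the paper or used by a real search engine.

```python
import math


def rerank(results, bookmarks, query, weight=0.1):
    """Re-rank results using save counts for URLs whose tags match the query.

    results:   list of (url, base_score) pairs from the search engine.
    bookmarks: dict mapping url -> {"saves": int, "tags": set of str}.
    query:     the user's search string.
    weight:    how strongly bookmarking data influences the score (assumed).
    """
    terms = set(query.lower().split())
    rescored = []
    for url, base in results:
        info = bookmarks.get(url)
        boost = 0.0
        # Only boost when at least one tag matches a query term.
        if info and terms & info["tags"]:
            # log1p gives diminishing returns for very large save counts.
            boost = weight * math.log1p(info["saves"])
        rescored.append((url, base + boost))
    return sorted(rescored, key=lambda r: r[1], reverse=True)


results = [("x", 1.0), ("y", 0.9), ("z", 0.8)]
bookmarks = {
    "y": {"saves": 100, "tags": {"python"}},
    "z": {"saves": 1000, "tags": {"cooking"}},
}
print([url for url, _ in rerank(results, bookmarks, "python tutorial")])
```

In this example, "z" has the most saves but none of its tags match the query, so only "y" is promoted.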

Certainly there are problems with relying too heavily on how many times a URL has been saved to social bookmarking sites when determining its search result position. For example, that number may be easily gamed or influenced via blackhat techniques. But even without that use case, the researchers at Stanford outlined a number of ways in which social bookmarking sites could be used to theoretically improve search engines, if not yet on a grand scale.

What do you think? Will social bookmarking data ever be used to enhance search engines? Should it? Let us know in the comments below.

Comments

  • "it has long been our contention that the number of times a URL has been saved, in conjunction with tag data for that URL, could be used by search engines to augment ranking algorithms -- i.e., URLs that are saved more (or more often) are likely to be more useful."

    Google and Yahoo already have a comparable metric in some ways, in that they can tell how many times a search result gets clicked (and then how long a visitor stays, depending on whether the user has Google Toolbar installed or the site uses Google AdSense, Analytics, etc.). Granted, that can also become a self-fulfilling statistic, as a site that ranks highly is obviously going to get clicked more than a site that shows up on page 3. But no doubt, those variances could be somewhat marginalized and weighted to an extent to get an idea of what's performing better.

    But I generally agree, the number of times a URL is bookmarked is often a good general view of its usefulness. I know for many of my more geek-oriented searches, I often find myself turning to Del.icio.us before I try the search on Google. For that subset of searches, Del.icio.us is an amazingly useful search tool.

    Posted by: RS | February 21, 2008 2:23 PM


  • Great article, Josh!

    This is an interesting question that was discussed indirectly at last week's SDForum meet on: Search & the Social Graph.

    One of the panelists at the session pointed out that "with the explosion of self-publishing and user-generated content on the web, the type of data getting created on the web is changing, and the classic search algorithms are becoming less effective.".

    As social behavior on the web permeates more into the mainstream, users are increasingly interested in what their social or professional peers found valuable; data from social bookmarks can thus be used to increase the relevance of search results for these users.

    On the other hand, as you point out, user penetration of services is still small at present, relative to the overall population of online users - but this will change with time.

    Posted by: NitinK | February 21, 2008 3:25 PM


  • This is great food for thought Josh!

    My quick take is this: it's hard to beat or even improve Google with anything that's a single shot. The next innovation in search needs to be in UI and shortcuts, because just as semantics does not produce qualitatively better results than frequency analysis, neither does using a corpus like del.icio.us.

    Alex

    Posted by: Alex Iskold | February 21, 2008 6:49 PM


  • i think it will. it would be nice if it was more implicit, though. for example, how readburner uses the amount people share stories to surface stories. on the onion, too, the most e-mailed stories are good ones.

    Posted by: Coleman Foley | February 21, 2008 7:44 PM


    Can social bookmarking improve search? The answer is 'Yes'. In fact, every search engine should give more weight to visited or bookmarked results when deriving page rank. In the case of Google, the importance given to visited links is not so meaningful, as a page visit never conveys the like or dislike of the person who visited the page.

    So, search engines should give people an option to bookmark results on search engine result pages (SERPs) so that those results can be delivered to other searchers looking for the same keyword or keyword combinations, maybe as a set of 'Bookmarked Results' on the right side of the SERPs.

    Search giants like Google will not prefer this way because they don't want people to leave search engine from first SERP itself[because of the nature of their revenue model], with the right link/page that is bookmarked by users. But 'Bookmarked Results' concept can help people find most relevant and useful[verified by a human] link for any keyword search.

    We are in the process of building a meta search engine [a prototype of the engine is live at www.rygoo.com] where people can bookmark search results so that those bookmarked results can be displayed for other searchers who are looking for the same information. Search made more efficient and social.

    Posted by: Abdul Jaleel KK | February 22, 2008 7:53 AM


  • There is a lot of evidence that we have more confidence in information from users than from the media, corporations or the government. A logical extension of that argument is that we have more confidence in the organization of information by users rather than by companies or by algorithms. Therefore I think search will move away from Google type approaches toward "user organized information" search such as social bookmarking.

    On the point about tags being limited in number and subject area, that I agree with but I think you miss another important trend which you frequently reference on this site. Specialized search engines are becoming increasingly popular and a tag on delicious is nothing more than a specialized search facility waiting to happen. For example, the venture_capital tag on Delicious has great information.

    More thoughts on the future of search are at http://sophisticatedfinance.typepad.com/sophisticated_finance/2008/02/web-30.html

    Posted by: rhhfla | February 22, 2008 9:49 AM


  • thank you wery mach

    Posted by: Güzel Sözler | February 22, 2008 12:33 PM


  • Researchers at Standford?

    Posted by: Proof Reader | February 27, 2008 9:11 AM


  • This is very interesting! Great article about social bookmarking. I have enjoyed reading this very insightful post. Very engaging and informative. Thanks for sharing. :)

    Posted by: Aurelius Tjin | March 19, 2008 9:09 PM



