ReadWriteWeb

Filter Geeks Try to Solve Info Overload at the Real-Time Web Summit

Written by Jolie O'Dell / October 15, 2009 1:34 PM / 7 Comments

How do you create filters for the real-time web? From spam filtration to relevant discovery, the "filter geeks" at the Real-Time Web Summit today are all about creating simple, rich user experiences.

Hashtags for Twitter are a great start, but how do the startups moving and shaking the real-time web plan to give users filters that make ever-increasing volumes of information more usable? From Thing Labs and Twingly to PostRank and SocialText, read on for the problems these companies and their users have encountered and how they plan to solve information overload through clever curation and cooperation.

The session was led by Twingly CEO Martin Kallstrom, who opened with a discussion about hashtags. But one of the best things about both the unconference format and the intellectual cachet of Silicon Valley is illustrated by what happened next.

Thing Labs' CEO, Jason Shellen, interrupted to insist that we broaden the discussion to include the entire real-time web and all possible examples of filtration systems, not just Twitter and not just hashtags. From there, the conversation exploded into an executive-level goulash of how to make the real-time web useful.

The overall poverty of the user experience was generally deplored. "We hear from our users about what they want," said Shellen. "People say, 'Just show me the important stuff.'" The current state of real-time UX leaves a lot of opportunity - the opportunity to make this iteration of the Internet simple for new users as well as appropriately complex for power users, unlike what we've all seen with RSS, which remains an underused geekcore feature.

The spectrum of data and metadata was brought up several times, as well. Keywords (e.g., hashtags) are a good start, but richer metadata would allow for filtration by sentiment or location. For example, a user might want to see blog posts about Obama's winning the Nobel Peace Prize from right-leaning sources only. Or I might want to see pictures posted by people within 100 feet of me while I'm at the Real-Time Web Summit.

Overall, having the author, location, time, sentiment, and keywords automatically applied to user-generated data could lead to much richer streams with built-in filtering opportunities, both for filtering content out and for discovering new content and sources.
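
To make the idea concrete, here is a minimal sketch of what filtering on that kind of metadata might look like. It is an illustration only, not something shown at the session: the `Item` fields, the `filter_stream` function, and the sentiment score in the range -1 to 1 are all hypothetical names and conventions, and the default radius of 30.48 meters is simply 100 feet.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt
from typing import Optional


@dataclass
class Item:
    """One stream item plus the metadata discussed above (hypothetical schema)."""
    author: str
    text: str
    posted_at: datetime
    lat: Optional[float] = None
    lon: Optional[float] = None
    sentiment: Optional[float] = None  # assumed score in [-1, 1]
    keywords: frozenset = frozenset()


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters, using the haversine formula."""
    earth_radius_m = 6371000.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


def filter_stream(items, keyword=None, near=None, radius_m=30.48, min_sentiment=None):
    """Keep only the items that pass every filter the caller supplied.

    near is a (lat, lon) pair. Items that lack the metadata a filter
    needs are dropped rather than guessed at.
    """
    kept = []
    for item in items:
        if keyword is not None and keyword not in item.keywords:
            continue
        if min_sentiment is not None:
            if item.sentiment is None or item.sentiment < min_sentiment:
                continue
        if near is not None:
            if item.lat is None or item.lon is None:
                continue
            if distance_m(near[0], near[1], item.lat, item.lon) > radius_m:
                continue
        kept.append(item)
    return kept
```

Calling `filter_stream(items, keyword="rtws", near=(lat, lon))` would return only items tagged with that keyword and posted within roughly 100 feet of the given point. The sketch assumes the metadata is already attached to each item, which is exactly the part the session identified as missing today.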

Another major point of emphasis for this session was the fact that a critical mass of users generally leads to the best filtering: Large datasets create very specifically defined problems and finely tuned filtration. Unfortunately, the startups involved in the real-time web often have smaller user bases than they would like; there is simply not enough data generated by the users of the individual services. But what if all that user data were combined somehow?

"Right now," said Kallstrom, "people doing startups trying to combat information overload are mostly focused on finding high-quality signals. It's a very hard problem. The highest quality for the end user is achieved by moving from competing on gathering the signals to creating a great user experience through more open data."

One participant suggested publishing user activity to open-source the problem of how to filter real-time data. Many other participants agreed that the problem requires collaboration, data portability, and open standards between all the companies in the room and beyond. Such collaboration would make all real-time products better and lead to better experiences for users.

Then again, better filtration could be a real-time holy grail, a solution worth selling. And when the question of money comes up, will these startups be willing to sacrifice a theoretical goldmine to collaborate on a user-friendly solution?


Comments


  1. These are some really insightful ideas. The information overload problem is a very real one within our social media space.

    With http://viewpointapp.com, we are tackling this problem with a combination of real-time filtering and a great user experience.

    Posted by: Jay | October 15, 2009 2:10 PM



  2. The key to solving the information overload problem is to follow a bottom-up strategy as described here:

    http://nick.iss.im/2007/12/20/bottom-up-information-sharing-with-iss/

    Posted by: Nick Vidal | October 15, 2009 6:30 PM



  3. I was unable to attend the very interesting summit today. I agree with Jolie that Real-time web is a goldmine of opportunity for all of us actively involved in building it today.

    Like in any new, exciting area, a thousand different approaches will be tried but only a few will succeed. I want to bring to your attention the first and only truly semantic search engine that currently works on Twitter data, TipTop, now available in a beta version at http://FeelTipTop.com

    TipTop’s powerful engine understands each and every message on Twitter just like a human being would. As a result, it can discover from within the data the very best tweets organized nicely along a variety of categories and concepts learned dynamically. In fact, the entire platform learns from data as data flows through the engine. You can now see in real time the sentiment associated with anything in the world that people are talking about. Please give it a try.

    Posted by: Shyam Kapur | October 15, 2009 8:01 PM



  4. Jolie,

    Great post! (and great meeting you yesterday) Thanks for documenting the discussion we had in that session.

    The conversation started off with "filtering", which I took quite specifically (vs. broadly). To me, filtering means filtering out unwanted information, which naturally led to discussing anti-spam and how we might apply lessons learned there. I also think that filtering is a specific solution approach aimed to help meet the larger problem/goal of delivering relevant information to users.

    Once we opened up the discussion to that broader user experience goal, we had a good discussion on how you can/should determine the relevance of content for a given user, which is where I think the real challenge (and opportunity) lies.

    I moderated a later session at 3 pm on Real-Time Discovery where we continued the discussion of that topic and related ideas.

    I founded my real-time discovery engine startup YourVersion to help solve this problem for users. YourVersion continuously discovers new, relevant content tailored to your interests and makes it easy to bookmark and share that content. In addition to our website, we also have a free iPhone app and Firefox toolbar. We launched a month ago at TC50 where we were the People's Choice Winner.

    I'd encourage ReadWriteWeb readers to give YourVersion a try to see how we've approached solving some of the issues that came up in the great discussions at the Real-Time Web Summit (kudos to RWW for an awesome event!).

    Cheers,

    Dan Olsen
    CEO & Founder, YourVersion

    Posted by: Dan Olsen | October 16, 2009 11:25 AM



  5. Devita Saraf posted an article on information overload in the Wall Street Journal yesterday, in which she commented:

    "I don't see a situation of data reducing in this information overloaded age, so the only solution is to use the most personal and effective filter I know of -- our own brain."
    - http://online.wsj.com/article/SB125567521154489511.html

    The more crap we "share" on the web, the better filters we need.

    I think one of the reasons Twitter is often discussed is that it is one of the best examples of "sharing crap" because it is VERY limited both with regards to information quantity and quality.

    Hashtags are a "poor man's" solution to this - but we need both better options for describing the information AND better options for filtering it out.

    No filter can help you with poor data - only your brain is powerful enough in that case. However, improving the data with better meta-information will enable "intelligent agents" to assist you with filtering out the noise... but we're not even close yet!

    Posted by: Atle Iversen | October 17, 2009 2:06 AM



  6. "One participant" was me, BTW. Great session. My company is also working on this. Hopefully we'll have some demos to show off pretty soon.

    Posted by: Gabe Ortiz | October 17, 2009 8:16 PM



  7. I see this space evolving as an ecosystem where several services can each enrich an item/activity with extra metadata and re-submit it to others to do the same. We are looking to do this at the geodata level, others are analyzing sentiment, and others are adding statistics (shares, likes).

    The problem we have now is that there is no way to weave all this together as linked data for a resource (let's say a tweet). We need better ways to find the "canonical" resource and defined rules for attaching extra metadata to it.

    Posted by: Sylvain Carle | October 19, 2009 10:02 AM


