ReadWriteWeb

Open Data Workshop 2007

Written by Alex Iskold / March 14, 2007 1:48 AM / 3 Comments

The Open Data Workshop 2007 took place yesterday in New York City. The gathering was organized by AttentionTrust and sponsored by Reuters. The workshop brought together over sixty influencers from startups, web giants and financial firms, along with venture capitalists and journalists, all interested in the attention space. It was a refreshing, open and candid dialog about the different issues and facets of open data.

Open Data in the context of the workshop referred to three different things:

  • A public API for a service;
  • Release of anonymous consumer information for research;
  • The user's explicit and implicit attention information.

Opt-in or Opt-out

Probably the hottest topic revolved around tracking users' behavior. Websites (and even ISPs!) these days routinely track people's behavior without letting them know in an obvious way. Often there are disclaimers and fine print, but people are not warned upfront in plain English that their data is being captured. Is this reasonable?

The group voiced various opinions supporting both sides of this argument, splitting nearly 50-50. Some people thought that consumers have to be informed in big bold letters about what is going on. Others argued that businesses need targeted advertising to make money, so it is fair game.

People with opt-in experience explained that it is rather tough to make money this way, since only 10% of users sign up. So collecting data by default, with a somewhat obscure opt-out, is basically the only sure way to monetize attention today. Personally, I voted for the opt-in camp, since I think that consumers get angry when they find out their data is being collected without sufficient warning.

Here is an example of something that I find reasonable. Abdur Chowdhury's new venture, called Summize, has very clearly stated a simple and understandable privacy policy.

APIs

Dick Costolo, CEO of Feedburner, said that having an API is a competitive advantage. He explained that by having APIs, Feedburner encourages people to build other services on top of Feedburner - instead of building a fully competitive offering. As a result, user information flows through Feedburner, making it an information hub. Amazon is similar - its eCommerce API was one of the first APIs on the market. Amazon's leverage is that all product URLs point back to Amazon, so the traffic comes back to them.
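To make the link-back idea concrete, here is a minimal sketch of an API whose results always carry a URL on the provider's own domain. Everything in it is an assumption made up for illustration: the host `shop.example.com`, the tiny catalog, the `lookup` function and the `tag` parameter are hypothetical and do not describe Amazon's or Feedburner's actual APIs.

```python
from urllib.parse import urlencode, urlparse

# Hypothetical provider domain and catalog, invented for this sketch.
PROVIDER_HOST = "shop.example.com"
CATALOG = {
    "B001": "Paperback novel",
    "B002": "USB cable",
}


def lookup(item_id: str, partner_tag: str) -> dict:
    """Return item data whose detail link points back to the provider."""
    # The partner tag lets the provider see which third-party app sent the visitor.
    query = urlencode({"tag": partner_tag})
    return {
        "id": item_id,
        "title": CATALOG[item_id],
        "url": f"https://{PROVIDER_HOST}/item/{item_id}?{query}",
    }


# A third-party service built on the API still links to the provider's site.
result = lookup("B001", partner_tag="mashup-42")
print(result["url"])
print(urlparse(result["url"]).netloc == PROVIDER_HOST)
```

However creative the third-party service is, every item it displays carries a link to the provider, which is where the traffic ends up.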

Beyond competitive advantages and centralizing information, another value is creativity. APIs encourage people to come up with innovative ways to use information. This benefits everyone. For example, Upendra Shardanand, CEO of the DayLife news service, demonstrated several interesting and creative ways of using their API. Among them was a visualization that depicted news as a constellation of stars, which offered insightful navigation and allowed the user to drill in to get more details.

Minding Open Data

Most people agreed that having APIs is beneficial and that opening up information can lead to good things. The workshop participants also recognized that there may be bad, unintended consequences. Abdur Chowdhury, who was AOL's Chief Architect for Search & Navigation during the private data leak, warned businesses to ask three questions before opening up consumer information:

Why are you opening the data?
Hopefully you will be able to improve users' experience and give them something in return.

What are you going to do once you open the data?
How are you as a company going to manage the release? Will you be able to keep up with the volume of requests and interest in the data? How will you handle the public shock that the data is being collected in the first place?

Are you ready for the unexpected consequences?
Obviously AOL were not, but will you be?

Conclusion

Although participants did not agree on several points, the day felt very productive and remarkably open and candid. Ideas were exchanged without agendas, partly because the topics were very interesting and partly because the moderators - Gerry Campbell from Reuters and Seth Goldstein from AttentionTrust - kept everybody on track. The day ended with a bullet list summary of the discussion:

  • You think people know that data is being collected, but they do not. Even sophisticated users do not. How do we increase consumer awareness of these issues?
  • It is important to understand what information is being collected/shared and for what purpose. How do we segment the information into private and less private?
  • Attention information has worth. How do we go about monetizing attention?

Please join the debate here and let us know what you think about these thought-provoking issues.



Comments


  • It's worth looking at Wesabe's 'Data Bill of Rights' (http://blog.wesabe.com/index.php/2006/11/10/open-data-at-web-20-and-our-data-bill-of-rights/) and (http://www.wesabe.com/page/security).

    I think Open Data is less about attention and data mining and perhaps more importantly about data portability...the ability to let users own the data inside web applications, protect it and also take it with them.

    Posted by: Imran Ali | March 14, 2007 4:37 AM



  • I run websites.

    The data that is collected about people's viewing habits is such a trivial 'invasion of privacy' that it's irrelevant. you generally can't identify individual people, and even if you could, the data that you collect is pretty uninteresting. and that data is generally used to make the user's experience a better one anyhow

    security cameras filming you when you enter a retail store is a far bigger 'invasion of privacy' yet no-one complains about this

    why do people get whingy about this? i think it's because we tend to massively overstate our own importance i.e. that our browsing habits are somehow interesting and stand out from the other half a billion internet users

    meh. beer! ;)

    Posted by: Steve Boyd | March 15, 2007 4:49 AM



  • It's true that the data collected by a website is not much. But when things like AdSense and other 3rd party cookie systems are spread so widely across the internet, these systems start to get a very intimate look at one's browsing (and searching) habits.

    With more of this information out there, it gets easier and easier to have a mosaic effect where lots of little bits of information add up to a complete profile. This point was made by many at the conference. That's what we saw in the AOL data release scandal. Surely we'll see more of this in the future.

    -- Thanks for the excellent writeup Alex. I enjoyed your presentation of Adaptive Blue.

    (As an aside, there is lots of other information that can also be tracked, e.g. see the experiment I set up recording any selecting of text on my blog: http://wanderingstan.com/selections )

    Posted by: Stan James | March 15, 2007 1:46 PM



