ReadWriteWeb

Data.gov Now Live; Looks Nice But Short on Data

Written by Marshall Kirkpatrick / May 21, 2009 9:02 AM / 13 Comments

The long-awaited catalog of public data from the US government launched this morning at Data.gov. Developers, watchdogs and data nerds around the world rejoiced - but the initial offering is a bit of a letdown.

New federal CIO Vivek Kundra is in charge of the site, which will act as a central repository for government data, including XML, CSV, KML files and more. At launch a mere 47 data sets are included, and they appear to lean towards the least controversial matters. Nonetheless, it's exciting to see the effort happening. Hopefully some awesome mashups are on the way!


There are many, many sets of data available from the federal government, but the Data.gov site says it was selective about quality and standards when choosing what to include. It's hard not to compare it with other sources of government data and feel disappointed, though. The privately built USGovXML.com contains far more data and was built by one independent developer over four months. That site lists ten Department of the Interior XML feeds, for example, none of which appear on Data.gov. You can find a feed of food recalls there, but not on Data.gov.

Twenty-six government agencies are represented in the catalog, though not all are offering raw data. The FBI is listed as a source but only offers a widget that can be placed on websites, not access to raw data.

New York Times data wonk Derek Willis pointed out that the initial offerings are non-controversial. "Most are from USGS, EPA and National Weather Service," Willis observed this morning. "No [data from] Department of Homeland Security, State or DOJ."

Likewise, searches of the data sets for keywords like food, prisons and drug all bring up zero results. Those are particularly important topics because they are matters of justice and injustice - shedding light into dark corners where injustices are being perpetrated is one of the most important things that government data and the subsequent computer-assisted reporting can accomplish.
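For readers who want to repeat that kind of keyword check against a catalog listing of their own, here is a minimal sketch. It assumes the catalog has been saved as a CSV with title, agency and format columns; that layout, the sample rows and the count_matches helper are all made up for illustration and are not how Data.gov itself exposes its catalog.

```python
import csv
import io

# Hypothetical catalog export: the titles and agencies below are invented
# for illustration and are not the real Data.gov listing.
SAMPLE_CATALOG = """title,agency,format
Sample earthquake feed,USGS,XML
Sample toxics inventory,EPA,CSV
Sample forecast grid,National Weather Service,KML
"""


def count_matches(catalog_csv, keywords):
    """Return {keyword: number of rows whose title or agency contains it}."""
    rows = list(csv.DictReader(io.StringIO(catalog_csv)))
    counts = {}
    for keyword in keywords:
        needle = keyword.lower()
        counts[keyword] = sum(
            1
            for row in rows
            if needle in row["title"].lower() or needle in row["agency"].lower()
        )
    return counts


if __name__ == "__main__":
    # Keywords from the article plus one that the sample data does contain.
    print(count_matches(SAMPLE_CATALOG, ["food", "prisons", "drug", "earthquake"]))
```

A plain case-insensitive substring match is the simplest possible choice here; a real check would probably also look at descriptions and tags.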

There are no RSS feeds available for the whole catalog or search queries, something that would be very useful for tracking additions of new data. We expect that will change soon.
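Until feeds exist, one workaround is to save a list of the catalog's data set titles from time to time and compare snapshots. The sketch below shows only the comparison step; the new_entries helper and the sample titles are hypothetical, and how you collect each snapshot is left open.

```python
def new_entries(previous, current):
    """Return titles present in `current` but not in `previous`, in current order."""
    seen = set(previous)
    return [title for title in current if title not in seen]


if __name__ == "__main__":
    # Invented snapshots standing in for two saved copies of a catalog listing.
    yesterday = ["Sample dataset A", "Sample dataset B"]
    today = ["Sample dataset A", "Sample dataset C", "Sample dataset B"]
    print(new_entries(yesterday, today))
```

This reports additions only; removed or renamed data sets would need the reverse comparison.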

People will no doubt argue that some data is much better than no data, and that's true. But for a new federal office to engage with such an important topic, with the weight of history and the whole administration behind it, and then come up with something this limited is disappointing.

API and mashup watcher John Musser of ProgrammableWeb was more generous than we are about the initial offerings:

"They're off to an excellent start. It's a big step in accessibility of government data. As we've been seeing with other v1 gov-data efforts, like the recently available data on senate votes: step one is give people structured data like xml, step two (or later) is to make it available via an API. They have a healthy amount of metadata. The number of data sets is not that large, but of course it's just the beginning."

It is just the beginning, and we applaud the launch of this effort. We hope that the initial launch will pale in comparison to the long-term value of this collection of data.

The folks at Sunlight Labs, Google, O'Reilly/TechWeb and Craig Newmark just launched a new part of their Apps for America contest to build the best mashups and data visualization tools for data in the new Data.gov site. Check it out!

See also the newly launched Whitehouse.gov/open - launches today just keep popping up.


Comments


  1. This will be very useful and important if expanded on. It would be great if it sets a precedent for future administrations to follow. The notion that government data should be public if it's not classified is I think very valuable.

    Of course many agencies already ship CDs worth of public data of many types on request, but collecting all the data online in a single index is a very helpful thing to do.

    Posted by: Miramon | May 21, 2009 10:13 AM



  2. Miramon, beyond making the data available as is possible on CDs, this makes it dynamically available online for machine reading and integration into other applications.

     Posted by: Marshall Kirkpatrick | May 21, 2009 10:27 AM



  3. Indeed, it's vastly preferable to have the data online. You really have to know all about the data in advance to order a disk, and then you have to wait for it -- and quite likely you had to pay something, too.

    But free, online, you can acquire it in a more timely manner, and you can discover it interactively if you want it, without having to make a big investment of time and effort. Just reducing the investment in time needed to allow access to the data is a big deal.

    Right now a data-minded journalist or even a blogger can hack out a precis or a mashup of this data for an article with relatively little effort, but it would be much less likely for the same person to even discover some random CD exists or to order it.

    Posted by: Miramon | May 21, 2009 12:18 PM



  4. I definitely think this is a great development and there is for sure a buzz about it here in Washington, DC. Sometimes I wonder what the net net of it all will be. It's great to make data available and talk about how transparent we are, but that doesn't mean that all of a sudden you're going to get an engaged citizenry. Even with all the mashups and apps, it's great, but tools are just tools unless you build a community around them. Someone needs to build a movement around creating a culture that wants to participate in the political process. I just don't see it. There is just FAR too much apathy.

    My 2 cents.

     Posted by: Justin | May 22, 2009 4:27 AM



  5. Socrata - http://www.socrata.com - you can discover existing government data, upload new datasets and interact with the community on this. The datasets are viewable, searchable, filterable all online through the Socrata site.

    Posted by: digibomber | May 29, 2009 2:10 AM



  6. It's a great thing. I definitely think that it must have created buzz around the city and I hope it supports a lot. Thanks for such a fantastic article.

    Posted by: Portable Storage | June 29, 2009 1:42 PM






  8. It's a very good site for creating or finding any database, and it also helps us interact with other people, so it's very useful for us.

    Posted by: alex01 | October 27, 2009 5:18 AM



  9. I will remember your blog place. Because I love you more ideas.
    After this I will read all your posts thankful.

    Posted by: penpen | November 29, 2009 5:41 PM






  11. awesome! that is what i needed. it is useful. thanks

    Posted by: aumar | February 9, 2010 12:58 AM



  12. Can I reach all governments' data sheets?

    Posted by: akilopanta | February 9, 2010 1:54 AM





