ReadWriteHack

The Stanford Visualization Group Debuts Visual Tool for Cleaning Up Data

Today at the Strata conference, the Stanford Visualization Group debuted DataWrangler, a Web-based visual tool for cleaning up messy data. According to its website, "Wrangler allows interactive transformation of messy, real-world data into the data tables analysis tools expect." Data can be exported as CSV, TSV or JSON.

Data wranglers can use the tool with the group's data visualization tool Protovis, or with tools such as Excel, R and Tableau.

The group has released a paper explaining how the tool works. Joseph M. Hellerstein explains the origins of the project in a blog post:

Another thing I often hear is that a large fraction of the time spent by analysts -- some say the majority of time -- involves data preparation and cleaning: transforming formats, rearranging nesting structures, removing outliers, and so on. (If you think this is easy, you've never had a stack of ad hoc Excel spreadsheets to load into a stat package or database!)

Putting these together, something is very wrong: high-powered people are wasting most of their time doing low-function work. And the challenge of improving this state of affairs has fallen in the cracks between the analysts and computer scientists.

DataWrangler will compete with Google Refine, which we covered previously.

