It's a valid question: "Why has all the data the government has been collecting turned out to be too big to handle?" The results of a U.S. and state government IT survey released this week by the public sector IT community MeriTalk shed a bright, halogen spotlight on the answer: It's because it's being collected in an unfiltered format and is waiting for someone - anyone - to claim it and write viable applications for it.
Having data available electronically is not the same thing as the data being useful. Campaign finance disclosures provided electronically by the Federal Election Commission (FEC) are a good example of that. The New York Times's Fech (not "fetch") is a RubyGem - a packaged Ruby library - designed to help journalists and public interest organizations access and make sense of FEC filings.

Haven't we seen this before: IBM making a big acquisition of a “V” company in the big data space? Indeed, last April 13, IBM purchased analytics software maker Varicent for an undisclosed amount. And then this morning, IBM announced its acquisition of enterprise search facilitator Vivisimo.
IBM usually doesn’t acquire applications software makers unless it has something very specific in mind for them. IBM is clearly piecing together a “big data” platform - a comprehensive package for storing, accessing and analyzing unstructured data.
Oracle CEO Larry Ellison likes to walk into a presentation with raw numbers in hand. When he touts Oracle technology as faster, he prefers to say how much... sometimes inflating the number as he goes along. "A factor of five. A factor of 10! A factor of 15!"
What would Ellison have given for the opportunity to tout a factor of 400,000? Evidently not enough. The notion that a database manager would run orders of magnitude faster if it resided in DRAM rather than on a hard drive is decades old, just waiting for an Oracle or a Microsoft or a Dell to make it happen. Instead, SAP is the company delivering the dramatic database performance boost.
The Power View tool is meant to be run remotely, including from a service linked to a SQL Azure cloud-based database, so this is indeed a Web application. Technically, it may run from any browser that supports Silverlight. As with charting in Excel, you point Power View to your source and adapt the visualizations to best suit how that data may be explained to a viewer, and you do so without impacting the data itself.
One new tool called the slicer, added to the project during its public preview phase (proof that Microsoft can indeed incorporate good suggestions during a public preview), lets the user select a segment of data in a table to pull out and either highlight in the context of the bigger chart, or demonstrate within a separate graph. Then by creating what Power View calls cards, you can take a record about one of the items depicted in a chart - for example, one of the factors that the chart is comparing, like HP's share price to Dell's or Angus cattle prices compared to Hereford - and generate what Metro would call a "tile" for that item.
It's been the case for every SQL database in practical use since E. F. Codd first came up with the concept: Records either exist or they don't. When you run a SELECT statement, you're querying the current state of the data. A state is either true or false.
As far back as 1993, efforts to incorporate some type of temporal query into SQL - some way of saying, "Tell me whether this event will be true three hours from now" - have proven successful only with add-ons and attachments. IBM's new "Time Travel" aims to make this capability generally available.
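To make the distinction concrete, here is a minimal sketch of how an "as of" question is often bolted onto an ordinary SQL table by hand, using Python's built-in sqlite3 module. This is not IBM's Time Travel syntax; the `policy` table, its `valid_from` and `valid_to` columns, and the `premium_as_of` helper are all hypothetical names chosen for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table in which every row carries a validity period.

    The schema is an assumption made for this example, not any vendor's design.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE policy (
               policy_id  INTEGER,
               premium    INTEGER,
               valid_from TEXT,   -- inclusive start, ISO 8601 date
               valid_to   TEXT    -- exclusive end, ISO 8601 date
           )"""
    )
    conn.executemany(
        "INSERT INTO policy VALUES (?, ?, ?, ?)",
        [
            (1, 100, "2012-01-01", "2012-06-01"),
            (1, 120, "2012-06-01", "9999-12-31"),  # open-ended current row
        ],
    )
    return conn


def premium_as_of(conn, policy_id, when):
    """Return the premium in effect on the date `when`, or None if no row applies.

    ISO 8601 date strings sort chronologically, so plain string comparison works.
    """
    row = conn.execute(
        "SELECT premium FROM policy "
        "WHERE policy_id = ? AND valid_from <= ? AND ? < valid_to",
        (policy_id, when, when),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = build_demo_db()
    print(premium_as_of(conn, 1, "2012-03-15"))  # a date in the first period
    print(premium_as_of(conn, 1, "2013-01-01"))  # a date in the open-ended period
```

The point of the sketch is the burden it places on the application: every query has to carry the date predicate itself, which is the kind of add-on work that built-in temporal support is meant to remove.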
MapR today announced a comprehensive set of data connection options for Hadoop, enabling a wide range of data import and export and extending its ability to connect to data warehouses and applications through tools such as Talend Open Studio, Pentaho Kettle and an ODBC driver. A summary of the announcement from MapR can be found here.
If you are looking for large content repositories, you probably can't get much larger than the article archive of the Associated Press. Today the AP announced the launch of a content analysis tool that searches the millions of articles in its archives to create custom archive products for its customers. Users can query for particular keywords, and the AP can use the search query traffic to see trending topics and deliver article collections to particular B2B customers. For example, it could create references on a particular subject or moment in time. The project makes use of a solution from MarkLogic, a major Big Data enabler used for this purpose by many different kinds of publishers, such as LexisNexis.
NASA is inviting all citizens of planet Earth to take part in a two-day coding marathon next month. Called the International Space Apps Challenge, the idea is to develop software for various purposes to support NASA's mission. It is open to just about anyone interested, including "engineers, technologists, scientists, designers, artists, educators, students, and entrepreneurs." The challenge will take place in several cities on all continents around the globe, including San Francisco, Santo Domingo, Sao Paulo, Nairobi, Tokyo and even on Antarctica at McMurdo Station.
There's a public relations brochure template someplace that reads, "________ is changing the way the world does business." If this were a Mad-Lib, you could insert the proper noun of your choice. Historically, evolutionary changes in both business and the economy that supports it have mandated subsequent changes in technology. There are certain very notable exceptions (thank you, Tim Cook), but let's be honest and admit that databases didn't spring up from gardens like daisies and change the landscape of business from winter into spring. There was a need for relational databases that went far beyond keeping up with the competition.
So when companies say that big data will change the way you work... really? Is that the best value proposition that vendors can come up with - "It's coming like a thunderstorm, so you'd better be prepared?" In the final part of ReadWriteWeb's conversation with IBM Vice President for Big Data Anjul Bhambhri, which continues from part 2, I told her a true story about a customer on a vendor webcast that was set in its ways and resisted the change that the PR folks were saying was inevitable.