Looking for innovative ways to use R, the Big Data open source analytics language? Then take a gander at the two top winners of the first of a series of contests that R's corporate caretaker Revolution Analytics has produced. The winners, announced today, receive prizes that range from $1,000 to $10,000 for their submissions. It is an interesting collection and shows off the power of the language itself.
The U.S. Central Intelligence Agency has a crack group of analysts tracking the Internet, including tweets and Facebook messages, that takes the pulse of the world. Located in McLean, Virginia, analysts at the CIA Open Source Center are known as the "vengeful librarians," according to a report from the Associated Press. These librarians are tracking up to five million tweets a day from places like China, Pakistan and Egypt.
It is sometimes disconcerting to know what the U.S. intelligence complex is doing, right in your backyard. McLean is a beltway city in Northern Virginia that is best known for Tysons Corner, one of the shopping hubs of the East Coast. On the outskirts of the city there is also the George H.W. Bush CIA complex, one of the agency's main hubs in the D.C. region.
Sticking with her original deadline announced last year, European Commission Vice President Neelie Kroes told a European interoperability standards forum yesterday that a public portal for access to government and public data from across the continent is on track to go online in Spring 2012. Following that, the next stage in Comm. Kroes' agenda includes an ambitious project to launch a community-built, crowd-sourced public data platform for all of Europe.
Kroes told the OpenData Forum in Brussels she expects a pan-European forum for public data mining to go live no later than 2013. "Will she really be able to pull off all that?" the commissioner asked rhetorically, referring to herself.
It's hard to keep track of all the database-related terms you hear these days. What constitutes "big data"? What is NoSQL, and why are your developers so interested in it? And now "NewSQL"? Where do in-memory databases fit into all of this? In this series, we'll untangle the mess of terms and tell you what you need to know.
In Part One we covered data, big data, databases, relational databases and other foundational issues. In this section we'll talk about data warehouses, ACID compliance, distributed databases and more. In part three, we'll cover non-relational databases, NoSQL and related concepts.
Rapid-I announced this week that it will offer a marketplace for extensions to its open source data mining tool, RapidMiner. "Over the years, many of you have been developing new RapidMiner Extensions dedicated to a broad set of topics," the company's announcement states. "Whereas these extensions are easy to install in RapidMiner - just download and place them in the plugins folder - the hard part is to find them in the vastness that is the Internet." You can visit the beta version of the extension marketplace here.
It doesn't appear that there's a mechanism for offering paid extensions, yet. But Decision Stats blogger Ajay Ohri hopes to see this turn into an app store for algorithms.
Google Labs has come out with a new tool that it describes as "like Google Trends in reverse." Google Correlate allows users to enter a data series and get back queries that follow a similar pattern. Correlate is based on the technology that Google used to create Google Flu Trends.
When you enter a data set into Correlate, it uses the Pearson correlation coefficient - a statistical measure of how strongly two data series are linearly related - to find the search queries whose pattern most closely matches the data you entered. Data can be entered in a spreadsheet-style form or uploaded as a CSV file. Correlate also has pre-existing data sets broken down by location, such as U.S. states.
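To make the idea concrete, here is a minimal sketch in Python of ranking candidate series by their Pearson correlation with a target series. It is an illustration only, not Google's implementation: the function names `pearson` and `rank_queries`, the query labels and all of the numbers are made up for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def rank_queries(target, candidates):
    """Return (name, r) pairs sorted from most to least correlated with target."""
    scores = {name: pearson(target, series) for name, series in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical weekly values; none of this is real search data.
    target = [1, 2, 3, 4, 5]
    candidates = {
        "query a": [2, 4, 6, 8, 10],  # rises in step with the target
        "query b": [5, 4, 3, 2, 1],   # moves opposite to the target
        "query c": [1, 3, 2, 5, 4],   # loosely follows the target
    }
    for name, r in rank_queries(target, candidates):
        print(f"{name}: r = {r:.2f}")
```

A coefficient near 1 means the two series rise and fall together, a value near -1 means they move in opposite directions, and a value near 0 means there is little linear relationship.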
Today Revolution Analytics announced a partnership with Kaggle to provide Revolution R Enterprise software for free to participants in Kaggle's data contests. Competitors can download the software here.
Revolution is a company that provides commercial support and tools for the statistical programming language R (see our previous coverage). Kaggle hosts data analysis competitions for organizations such as Deloitte, NASA, Wikipedia and The Heritage Health Network.
R, the statistical programming language, continues to grow in popularity. A recent poll at KDnuggets found that 34% of respondents do at least half of their data mining in R. Although it's a domain specific language, it's versatile. Here are three different presentations, each on a different aspect of R.
The Data Science Toolkit is a collection of data tools and open APIs curated by our own Pete Warden. You can use it to extract text from a document, learn the political leanings of a particular neighborhood, find all the names of people mentioned in a text and more. He unveiled it today at GigaOM Structure Big Data in New York City.
It's available as a Web service, or you can download a virtual machine and host it on your own server.
Last week we told you that enterprises are investing more in business intelligence and analytics initiatives. This week there's more good news for professionals in this area: according to KDNuggets, salaries are rising for analytics and data mining professionals.
Based on a poll with approximately 250 respondents, KDNuggets found that salaries are up from its 2010 poll in North America, Western Europe, Asia and Latin America. (There is no mention of Eastern Europe, Africa or Antarctica.)
It's a good time to be a geek, particularly one with a background in statistics, analytics and data mining. But a bad time to be almost any other type of worker.