Last month, veteran IDC analyst Dan Vesset predicted that while Hadoop will become a standard component of the modern data center, by 2015 the market around Hadoop will have matured at such a rate that the major players we recognize today will probably no longer exist. MapR - a commercial Hadoop provider whose name was inspired by the MapReduce programming model for Hadoop - was one of the companies on Vesset's target list for acquisition, and perhaps a ceremonial asterisk for history once Wikipedia emerges from blackout.
So you might expect that the predictions of MapR CEO John Schroeder for the year 2012 would not include obscurity for his own company. But Schroeder makes at least an arguable case: The difference between the database market of 2012 and that of 1992, he says, is the customer's preference for avoiding vendor lock-in, and that customer's newfound ability to guard against it.
Some of our ReadWriteWeb staff in Oregon may have noticed their lights slightly dimmer this morning than yesterday. Okay, we're kidding. Amazon has thrown the switch on its second set of Western U.S. cloud computing clusters, alleviating some of the burden on its large and growing cache of customers in Silicon Valley.
The new Oregon cluster is Amazon's latest, fully operational example of its incredible cloud-in-a-box - quite literally a set of refurbished shipping containers retrofitted with compact cooling equipment.
A select group of developers is being invited by Hortonworks, the commercial caretaker of Apache Hadoop, to join a limited technology preview of the company's own forthcoming cloud data platform based on Hadoop. If you haven't heard of Hadoop yet, this is either your first time on ReadWriteWeb, or you've been living in a desert with no elephants. It's the open source framework for storing and processing big data that was born from a Yahoo project, and now Hortonworks wants businesses to be able to use it as a platform without having to install it in their own data centers.
Hortonworks Data Platform, as the company's CEO tells RWW, will enter a public preview phase later this year. At that time, availability and ease of deployment will no longer be adequate excuses for businesses not wanting to try moving their big data to a scalable management platform.
It's hard to keep track of all the database-related terms you hear these days. What constitutes "big data"? What is NoSQL, and why are your developers so interested in it? And now "NewSQL"? Where do in-memory databases fit into all of this? In this series, we'll untangle the mess of terms and tell you what you need to know.
In Part One we covered data, big data, databases, relational databases and other foundational issues. In Part Two we talked about data warehouses, ACID compliance, distributed databases and more. Now we'll cover non-relational databases, NoSQL and related concepts.
This week Microsoft Research released Project Daytona MapReduce Runtime, a developer preview of a new product designed for working with large distributed data sets. Microsoft also has a big data analytics platform that uses LINQ instead of MapReduce called LINQ to HPC. Notably, LINQ to HPC is used in production at Microsoft Bing.
But Microsoft is entering an increasingly crowded market. There's the open source Apache Hadoop, which is now being sold in different flavors by companies such as Cloudera, DataStax, EMC, IBM and soon a spin-off of Yahoo. And then there's HPCC, which will be open-sourced by LexisNexis.
Microsoft's products are currently in early, experimental stages and the company may never step up the development and marketing of these to be serious Hadoop and HPCC competitors. But could Microsoft be competitive here if it wants to?
Today Microsoft Research announced the availability of a free technology preview of Project Daytona MapReduce Runtime for Windows Azure. Using a set of tools for working with big data based on Google's MapReduce paper, it provides an alternative to Apache Hadoop.
Daytona was created by the eXtreme Computing Group at Microsoft Research. It's designed to help scientists take advantage of Azure for working with large, unstructured data sets. Daytona is also being used to power a data-analytics-as-a-service offering the team calls Excel DataScope.
This week Basho, the company behind the open source NoSQL database Riak, released a beta of Riak Pipe. Basho community manager Mark Philips describes Pipe as "more or less a rewrite of our existing MapReduce framework. It builds on the lessons we learned in the initial (and still in use) version of MapReduce." You can find the code in GitHub.
If you work with Hadoop, or want to, check out Antonio Piccolboni's overview of eight MapReduce languages. Piccolboni explores each one in search of a language that provides both concise syntax and the power to run both the "'inside' of map reduce, that is the code for the mapper and the reducer, as well as the 'outside', the logic that decides which map reduce jobs to run." He also looked at whether he could write MapReduce programs that require multiple MapReduce jobs, "including the case of a data dependent number and type of jobs."
He decided Rhipe, which integrates R with Hadoop, was the closest to what he was looking for. However, one notable absence from his overview is Wukong, which brings Ruby to Hadoop. (Though I'm not sure whether this would meet his requirements.)
Which language do you prefer for creating MapReduce jobs, and why?
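The "inside" and "outside" that Piccolboni distinguishes can be sketched in a few lines of plain Python. This is only an in-memory illustration, not Hadoop and not any of the languages in his overview; every name here (`run_job`, `count_mapper`, `word_frequency_report` and so on) is a hypothetical chosen for the example.

```python
from collections import defaultdict


def run_job(records, mapper, reducer):
    """Run one in-memory map-reduce job: map, group by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce each key's values, in sorted key order for predictable output.
    return [reducer(key, values) for key, values in sorted(groups.items())]


# The "inside": mapper and reducer for a word count job.
def count_mapper(line):
    for word in line.split():
        yield word.lower(), 1


def sum_reducer(word, counts):
    return word, sum(counts)


# The "inside" of a second job: group words by how often they occur.
def invert_mapper(pair):
    word, count = pair
    yield count, word


def collect_reducer(count, words):
    return count, sorted(words)


# The "outside": logic that decides which jobs to run.
def word_frequency_report(lines):
    counts = run_job(lines, count_mapper, sum_reducer)
    # Data-dependent choice: run the second job only if some word repeats.
    if any(count > 1 for _, count in counts):
        return run_job(counts, invert_mapper, collect_reducer)
    return counts
```

Calling `word_frequency_report(["a b a", "b c"])` runs both jobs, because at least one word repeats, while an input with no repeated words stops after the first. That is a small case of the "data dependent number and type of jobs" he mentions.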
Riak is a NoSQL database influenced by Dynamo and written in Erlang. We explored some of its uses here. It's sponsored by a company called Basho, which hired NoSQL expert Mathias Meyer last month.
In this video, Meyer and fellow developer advocate Sean Cribbs talk about using Node.js with Riak. It's not an introductory talk on Riak - some experience with the database is assumed.
Facebook is working on a new dashboard that will give developers better insights into their Facebook applications, it was revealed during a Tech Talk at the company's Seattle office this week. The old analytics dashboard often contains data that is at least 48 hours old. The new analytics dashboard will be real time. The data will be anonymous - people won't be able to find out WHO is looking at what, just how popular different items are.
Facebook is building the solution with HBase, the distributed database built on top of Hadoop. The Tech Talk goes into more technical detail about how the solution was built and scaled.