ReadWriteHack

Yahoo Is Discontinuing Its Hadoop Distribution to Focus on Apache Hadoop

Yesterday Yahoo announced that it will discontinue The Yahoo Distribution of Hadoop and refocus its development efforts on the main Apache Hadoop distribution. In a blog post, Yahoo VP of Software Engineering Eric Baldeschwieler writes that Yahoo will work more closely with Apache in the future. This leaves only two major distributions of Hadoop: Apache Hadoop and Cloudera's enterprise-focused Hadoop.

"Unfortunately, Apache is no longer the obvious place to go for Hadoop releases," writes Baldeschwieler. "The Yahoo! team wants to return to a world where anyone can download and directly use releases of Hadoop from Apache."

The announcement begins a long process of contributing Yahoo's Hadoop code back to the main project. Baldeschwieler writes that Yahoo is contributing much of its work to the branch hadoop/common/branches/branch-0.20-security, and that Yahoo has proposed that this branch be released as Apache Hadoop 20.100.

Yahoo has also created a branch called Hadoop-future to hold its proposed new features, including:

  • HDFS-1052 - Federation, the ability to support much more storage per Hadoop cluster
  • HADOOP-6728 - A new metrics framework
  • MAPREDUCE-1220 - Optimizations for small jobs

Cloudera has remained an active contributor to Apache Hadoop. We haven't heard back from Cloudera yet, but we don't expect Yahoo's announcement to change anything there.

Also of interest is Baldeschwieler's history of Yahoo's involvement in Hadoop.

