As explained in this blog post, Foursquare needed a way for its business staff to run reports based on its data without slowing down production servers and without learning technologies such as Scala and MongoDB. The company decided to make its data available to business staff through a Hadoop cluster hosted by Amazon Web Services. Foursquare's data miners could then query it using Hive, which provides a SQL-like query language for Hadoop.
As a proof of concept, the company has produced a report on the rudest cities in the world, based on the number of tips that contain profanity. That's pretty cool (apart from the assumption that profanity use = rudeness). But it makes me realize just how underutilized geolocation APIs are.
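To make the idea concrete, here is a minimal Python sketch of that kind of report: count the tips per city that contain a word from a profanity list, then rank the cities. This is not Foursquare's code (the post says its data miners used Hive), and the word list, the sample tips and the function names are all made up for illustration.

```python
from collections import Counter

# Hypothetical word list; a real report would need a much larger vocabulary.
PROFANITY = {"darn", "heck"}


def contains_profanity(text, vocabulary=PROFANITY):
    """Return True if any word in the tip matches the vocabulary."""
    words = (w.strip(".,!?\"'").lower() for w in text.split())
    return any(w in vocabulary for w in words)


def profane_tips_by_city(tips):
    """Count profane tips per city. `tips` is an iterable of (city, text) pairs."""
    counts = Counter()
    for city, text in tips:
        if contains_profanity(text):
            counts[city] += 1
    return counts


# Made-up sample data, not real Foursquare tips.
sample_tips = [
    ("Springfield", "Darn good coffee!"),
    ("Springfield", "Heck of a line."),
    ("Shelbyville", "Quiet spot."),
    ("Shelbyville", "What the heck?"),
]

# Cities ordered from most to fewest profane tips.
ranking = profane_tips_by_city(sample_tips).most_common()
```

A raw count favors cities with more tips overall, so dividing by each city's total number of tips might give a fairer ranking.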
Pattern is a collection of open source (BSD license) web mining modules for Python from the Computational Linguistics and Psycholinguistics Research Center. It contains tools for data retrieval, text analysis and data visualization and comes with over 30 sample scripts.
What We Pay For is a simple website that lets you enter your income and filing status and find out how your tax dollars are being spent. It breaks down the amount you're likely to pay in taxes by various spending categories, such as Social Security, national defense and Medicare.
The developers behind What We Pay For have released an API for the service, and Google and the non-profit organization Eyebeam are sponsoring a contest for visualizations based on the site's data.
Revolution Analytics is a company that provides commercial support for the open source statistical programming language R. Its flagship product is Revolution R for Enterprise, a distribution of R that competes with other commercial statistical products such as SAS and SPSS. Revolution CEO Norman H. Nie was the co-inventor of SPSS.
A job posting from Apple reveals the company is using or will use Hadoop for its iAds system. The job listing is for a "Senior Software Engineer - Hadoop" with experience in MapReduce, Hive and either HBase or Cassandra. Oozie and Flume are also mentioned. The ad was first spotted by The Register, and has since been removed from Apple's website. However, searching through Apple's job listings reveals other places that Hadoop may be in use, including improving the iOS experience.
The question I always come back to when I hear the term Enterprise 2.0 is one that I think my buddy Dennis Howlett would ask. I mean, who gives a flying trombone? That's not really how Dennis would say it. I will let him express himself in his own words about why anyone in their right mind would pay for anything with a high price tag that has a big fat social label on it.
Here's what gets me. We get so wrapped up in collaboration concepts and the nuances of social. In this upside-down world, social is a term that is more commonly used to describe enterprise architecture than to describe sharing a beer with your mates.
This week, instead of a single API we're spotlighting ReadWriteWeb contributor Pete Warden's new e-book Data Source Handbook, which was just released today. Pete covers a slew of data sources including, of course, many APIs.
"These are hand-picked services that I've actually spent time using during my own work," Pete writes. "And I chose them because they add insights and information to data you're already likely to be dealing with."
He's made a list of services and a couple of excerpts available here.
This week, Boing Boing posted its entire 11-year archive (63,999 posts) in XML format. But Nicholas H. Tollervey from FluidDB wanted more. "XML is good, but having a searchable database of posts is better," he writes on the FluidDB blog. So he ported Boing Boing's XML archive into FluidDB.
"Because of FluidDB's open nature anyone can now make use of boingboing's data via a few simple and easy to construct RESTful calls to FluidDB," he writes. In other words, FluidDB is hosting a Boing Boing API. For free.
The cool thing - apart from being able to use FluidDB to mine BB for interesting data - is that you can do this yourself with your own blog.
Yesterday SMART@znmeb (SMART stands for "social media analytics research toolkit"), a SUSE Linux appliance created by Ed Borasky, added sentiment analysis to its set of features. The toolkit now includes textir, a sentiment analysis package written in the statistical programming language R. SMART@znmeb also includes other open source tools for data mining, dashboarding and data visualization.
Borasky says textir is the first open source sentiment analysis library he's found that he thinks may actually work. "Most of the vendors sell a sentiment analysis tool of some kind or another, and the customers that have tested multiple tools spend a lot of time trying to figure out why they give different answers," he says. He also cautions that sentiment analysis is vulnerable to spam and other gaming tactics and requires a large investment in hardware.
Programmer Ted Dziuba suggests an alternative to traditional programming that he calls "Taco Bell Programming." The Taco Bell chain creates multiple menu items from about eight different ingredients. Dziuba wants to be able to create many applications from combinations of about eight different shell commands.
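The same compose-a-few-simple-parts idea can be sketched in Python. This is only an analogy, not Dziuba's code: the helper functions below are hypothetical stand-ins loosely modeled on shell tools such as grep, cut, sort, uniq and head, and the log lines are invented.

```python
from collections import Counter


def grep(lines, needle):
    """Keep only the lines that contain `needle`."""
    return (line for line in lines if needle in line)


def cut(lines, field, sep=" "):
    """Pick one separator-delimited field from each line."""
    return (line.split(sep)[field] for line in lines)


def count_sorted(items):
    """Count duplicates, most frequent first (roughly sort | uniq -c | sort -rn)."""
    return Counter(items).most_common()


def head(rows, n):
    """Keep the first n rows."""
    return list(rows)[:n]


# Invented access-log lines in the form "METHOD PATH STATUS".
log = [
    "GET /index.html 200",
    "GET /missing 404",
    "GET /index.html 200",
    "POST /login 200",
]

# Most requested paths among successful requests, built from the four small parts above.
top_ok_paths = head(count_sorted(cut(grep(log, " 200"), 1)), 2)
```

Each function does one small job, and a different "application" is just a different way of chaining them together.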