mechanical turk - ReadWriteWeb http://www.readwriteweb.com/feeds/tag/mechanical turk en Copyright 2012 Richard MacManus readwriteweb@gmail.com Tue, 14 Feb 2012 18:04:00 -0800 http://www.sixapart.com/movabletype/?v=4.35-en http://blogs.law.harvard.edu/tech/rss

40% of New Mechanical Turkers' Work Requests Are Now For Spamming - What a Lost Opportunity!

mechanicalturkspamless.jpg

What if you were given incredible powers but had such a limited imagination that you only used them to pollute the internet with spam? That's what's happening to the powerful distributed labor marketplace of Amazon's Mechanical Turk, where requesters pay small sums of money for people around the world to perform small tasks that only a human can do. You can use Mechanical Turk to do incredible things - but it turns out that most people don't. Many just use it to hire an army of spammers.

A new study by NYU academics Professor Panos Ipeirotis, Dahn Tamir and Priya Kanth examined all the new Mechanical Turk requester accounts created over the last two months and found that more than 40% of their requests were for the workers to commit acts of spam. The team used Mechanical Turk itself to evaluate the tasks submitted, but they had to take extra steps after their own requests for work came back filled with spammy, random input from workers who didn't care. The whole situation is a tragic loss of opportunity - because there are some really fantastic things you can use this service for.

The Good News

What's the screenshot at the top of this post? That's the end result of a Mechanical Turk project I did a few months ago. I was blogging about a high-powered technology conference called Techonomy and in preparation for the event, I submitted all of the attendees' names to Mechanical Turk. I asked the workers to find the Twitter username, web home page URL, US or International designation and gender for each person's name. I had 3 people look at each name (for quality control) and paid 15 cents for each task.
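The post does not say how the three answers per name were reconciled. One common approach is a simple majority vote, sketched below under that assumption; the function name `majority_answer` and the sample usernames are hypothetical and not taken from the original project.

```python
from collections import Counter


def majority_answer(answers):
    """Return the answer given by more than half of the workers, or None."""
    # Normalize so that "@ExampleUser" and "@exampleuser " count as the same answer.
    normalized = [a.strip().lower() for a in answers]
    if not normalized:
        return None
    value, count = Counter(normalized).most_common(1)[0]
    # Require a strict majority; otherwise flag the name for manual review.
    return value if count * 2 > len(normalized) else None


# Hypothetical responses from three workers for one attendee's Twitter username.
responses = ["@ExampleUser", "@exampleuser ", "@someone_else"]
print(majority_answer(responses))
```

Names for which this returns `None` (all three workers disagree) are the ones worth checking by hand.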

The people in the system processed hundreds of names in just a few hours for about $50, much faster than I could have done myself. Meanwhile, I was able to spend my time doing other things.

Once they were done, I created all kinds of resources with the information they procured, including a Twitter list of the women attending the Techonomy conference. I then imported that Twitter list into the fabulous service Flipboard, which creates a personalized iPad magazine from the links shared by any list of Twitter users. Thus I now have a daily, dynamic, multi-touch personalized magazine on my iPad made up of the content shared by women who attended the Techonomy conference. That's pretty awesome. Thanks, Mechanical Turk!

The Bad News

A whole lot of the work being done on Mechanical Turk is downright spammy. That's a problem for me as a Mechanical Turk work requester because it creates an atmosphere of low quality work and indifference. It's a problem for me as a web user because I have to deal with the spam - in my search results, in my blog comments and in my email.

The NYU team used the following criteria to determine whether a task was spam:


  • SEO: Asks me to give a fake rating, vote, review, comment, or "like" on Facebook, YouTube, DIGG, etc., or to create fake mail or website accounts.

  • Fake accounts: Asks me to create an account on Twitter, Facebook, and then perform a likely spam action.

  • Lead Gen: Asks me to go to a website and sign up for a trial, complete an offer, fill out a form requesting information, "test" a data-entry form, etc.

  • Fake clicks: Asks me to go to a website and click on ads.

  • Fake ads: Asks me to post an ad to Craigslist or other marketplace.

  • Personal Info: Asks me for my real name, phone number, full mailing address or email.

  • You can also use your intuition to classify the HIT


After studying 5842 task groups submitted by 1733 new users, the group drew the following conclusions:

  • 40% of the HITs from new requesters are spam.

  • 30% of the new requesters are clear spammers.

  • The spam HITs have bigger value than the legitimate ones.

That's bad news, but it could be worse. We wrote about the proliferation of social media spam requests on Mechanical Turk two and a half years ago and though the types of spam have changed (hello, Facebook "Likes" and R.I.P. Delicious) it's not clear that it's gotten any worse.

We do have a baseline now, so hopefully these same academics can study the marketplace again in a year and see whether things have grown more or less spammy. They say Amazon is indifferent to the reality of how much hired spam is coming out of Mechanical Turk - and that is in truth a bad thing, for all of us. The culture of work on the site could really be improved, and thus the end product would improve.

It would be a terrible shame if Mechanical Turk were shut down - but it's also tragic that this is how so many people are using it. Come on everybody, Mechanical Turk is made of people - let's do more interesting things with it!

http://www.readwriteweb.com/archives/study_40_of_new_mechanical_turkers_work_requests_a.php http://www.readwriteweb.com/archives/study_40_of_new_mechanical_turkers_work_requests_a.php Crowdsourcing Fri, 17 Dec 2010 00:33:19 -0800 Marshall Kirkpatrick
How to Use Mechanical Turk to Rock Conference Blogging

Let's say you are going to, or hosting, a conference and you want to make a good impression with the attendees and organizers. One way to do that is to create useful and thoughtful original content and resources regarding the event.

Thanks to tools like Mechanical Turk, Google Custom Search and of course Twitter, you can now do incredible things around conferences that would have been very inefficient to do before.

Earlier this month I went to the Techonomy conference in Lake Tahoe and wrote both here on ReadWriteWeb and on the conference blog. The event brought technologists together to talk about tackling the world's big problems, like water and food shortages. It was a very impressive group of people. Before the event began, I used a few online tools to create some resources that proved very helpful in creating high-quality coverage of the event. I thought I would share what I did so that readers could make use of these same or similar methods.

Resources Created

Specifically, here's what I set up:

  • A Twitter list of all the conference attendees who use Twitter. This was very useful for keeping track of what people were saying during the event, even if they weren't using the official hash tag. It's also a really impressive list of people to keep in touch with in the future, and now when I'm viewing their Tweets in a list, I'll always know the context that I discovered them in.

  • A Twitter list of women participating in the event. I also did the research to make it easy for me to create a list of people from outside the United States who were there. It's good to create a special view into the conversations of some groups of people who can get lost in the noise of known industry leaders, in this case disproportionately men from Silicon Valley. Those lists are good not just for tracking particular perspectives during the event; I've also subscribed to the same lists in the beautiful iPad app Flipboard, so now I've got a personalized magazine made up of all the links shared on Twitter by international attendees of the Techonomy conference. That's nice to have.

  • Most important for the blogging at the conference, I created a Google Custom Search Engine that searches the archives of all the websites of the organizations the conference attendees work for. This proved invaluable, as I could reference the previous work and research of the people present in writing about their discussions that weekend. It made for much better-informed blogging than I would have been capable of without the tool.

How To Get the Info

I created all of the above in one night, for $50.

Here's how I did it.

First, and this is probably the least accessible part, though it's not that hard: I asked ReadWriteWeb's technical guy Tyler Gillies to scrape me a CSV (comma-separated values) file of all the names and descriptions on the page listing the participants. "I used an html parser and searched for div tags with the class that matched participant," he says. It was a Ruby parser he wrote himself and luckily the CSS of the page put participants in a nice div. That took a few minutes at most.
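Tyler's Ruby parser isn't shown, so the following is only a rough Python sketch of the same idea: walk the HTML, collect the text inside each div whose class is `participant`, and write the results out as CSV. The sample markup, the assumption that the first text chunk is the name, and the helper names are all hypothetical; the real page structure is not described beyond the div class.

```python
import csv
import io
from html.parser import HTMLParser


class ParticipantParser(HTMLParser):
    """Collect (name, description) pairs from <div class="participant"> blocks."""

    def __init__(self):
        super().__init__()
        self.depth = 0          # div nesting depth inside a participant block (0 = outside)
        self.current = []       # text chunks seen in the current block
        self.participants = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1     # nested div inside a participant block
        elif "participant" in classes:
            self.depth = 1
            self.current = []

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0 and self.current:
                # Assumption: the first text chunk is the name, the rest the description.
                name, *rest = self.current
                self.participants.append((name, " ".join(rest)))

    def handle_data(self, data):
        text = data.strip()
        if self.depth > 0 and text:
            self.current.append(text)


def participants_to_csv(html):
    """Parse the HTML and return the participants as CSV text."""
    parser = ParticipantParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "description"])
    writer.writerows(parser.participants)
    return out.getvalue()


# Hypothetical markup standing in for the conference's participant page.
SAMPLE_HTML = """
<div class="participant"><h3>Jane Doe</h3><p>Founder, Example Org</p></div>
<div class="participant"><h3>John Roe</h3><p>Researcher, Example Lab</p></div>
<div class="sidebar">Not a participant</div>
"""
print(participants_to_csv(SAMPLE_HTML))
```

The standard-library parser keeps the sketch dependency-free; a dedicated HTML library would cope better with messy real-world markup.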

Next, I loaded up Amazon's Mechanical Turk web application. Mechanical Turk is a system that allows you to break down projects into very small tasks and pay a human being a small sum for each task they complete.

I created a template in Mechanical Turk that basically said: "Look at this person's name and description. Find their Twitter username, their organization's blog or website, tell me if they appear to be Male or Female and whether they appear to work inside or outside the US. For each name you do this for, I will pay you 20 cents." The end result was that I got all that info about 256 people very quickly, mostly in a few hours but 100% complete overnight, for just over $50. It was very much worth it.
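As a quick sanity check on the numbers in that paragraph, the worker rewards alone can be worked out as below. This is a back-of-the-envelope sketch: it covers only the per-name reward quoted in the template, and leaves out any fee Amazon charged on top, since the post doesn't state one.

```python
NAMES = 256          # people processed, as stated in the post
REWARD_CENTS = 20    # reward per name, as quoted in the template text

# Work in whole cents to avoid floating-point rounding.
reward_total_cents = NAMES * REWARD_CENTS
print(f"Worker rewards: ${reward_total_cents / 100:.2f}")
```

That comes to $51.20, which lines up with "just over $50".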

Mechanical Turk takes a little bit of time to figure out how to use, but it's not that hard. Basically, you build a template, then you upload a file to populate that template for each worker. It's remarkably efficient.
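The template-plus-file idea can be illustrated locally without touching Mechanical Turk at all. In this sketch Python's `string.Template` stands in for Mechanical Turk's own template mechanism, whose exact syntax the post doesn't describe; the column names `name` and `description` and the sample rows are assumptions.

```python
import csv
import io
from string import Template

# One task description with placeholders, filled in once per CSV row.
TASK_TEMPLATE = Template(
    "Look at this person's name and description.\n"
    "Name: ${name}\n"
    "Description: ${description}\n"
    "Find their Twitter username and their organization's blog or website."
)

# Hypothetical input file; the real one held the scraped attendee list.
CSV_DATA = """name,description
Jane Doe,"Founder, Example Org"
John Roe,"Researcher, Example Lab"
"""


def build_tasks(csv_text):
    """Return one filled-in task description per row of the CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [TASK_TEMPLATE.substitute(row) for row in reader]


tasks = build_tasks(CSV_DATA)
print(len(tasks))
print(tasks[0])
```

Each row of the file becomes one small task, which is why a single upload can fan out to hundreds of workers.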

At that point it wasn't hard to click through all the Twitter usernames and add people to one or more lists. It wasn't hard to copy and paste the website URLs into Google Custom Search Engine, create a collection, test it and then create a second one without the more prolific news organization sites included.

I visited each of my two new search engines, performed a search for the word "love," added the advanced search operator "more:recent4" to put results in roughly reverse chronological order instead of pagerank, and then I bookmarked that search results page.

So I shared these resources on my personal blog before the conference, then hit the venue with laptop in hand. I attended conference sessions, watched what people were saying about them in the backchannel via my Twitter lists, then found pertinent details about what people were saying, from their organizations themselves, with great efficiency using my custom search engines. It made for some very informative blog posts, quickly put up but with good supplemental details.

That's how I did it, and that's how I'll do it again in the future. Other tactics I might have employed include subscribing to blogsearch feeds for the conference name through an IM interface so I can keep track of what everyone else is saying in real time, doing the same with Twitter and perhaps another trick or two that I'm just now considering but will test out before talking about. :)

Thoughts? Suggestions?

Illustration titled "Blogging Au Plein Air, Jean-Baptiste-Camille Corot" by Flickr user Mike Licht

http://www.readwriteweb.com/archives/how_to_use_mechanical_turk_to_rock_conference_blogging.php http://www.readwriteweb.com/archives/how_to_use_mechanical_turk_to_rock_conference_blogging.php Blogging Tue, 17 Aug 2010 14:43:35 -0800 Marshall Kirkpatrick
Project Management + Mechanical Turk? Smartsheet Looks Awesome

Why didn't we think of this? Project management startup Smartsheet released a new core feature this week - integration of Amazon's outsourcing service Mechanical Turk. The Smartsheet interface will now let you set up Turk research jobs that thousands of anonymous workers around the world will split up and perform quickly for a very low price.

In the example the company provides on its product page, the user publishes a series of small work orders for research on the names and profiles of top CEOs around the country. That kind of drudgery would take hours to perform, but with Mechanical Turk it can be done on the cheap, quickly.

The Smartsheet interface for Mechanical Turk looks good to us, though we must admit that we're not regular users of the service. This sounds like a great idea, though, and we'll be excited to see if it works well for people. Far too much of the work on Mechanical Turk is for publishing spam - so putting that energy to legitimate business uses is a great idea. There's a whole lot of untapped potential there. The possible applications of bulk human labor in information work are many and are just starting to be explored.

How should a person feel about the Turks though? We admit that we were a little concerned at first; the last thing we want is to use some creepy neo-colonial crap like ODesk.com, which sends customers hourly webcam photos and screenshots of their contracted overseas labor in action, usually with their eyes flared with surprise at the intrusion.

Mechanical Turk seems different though. There appears to be a real art to using it well, and that is one thing that concerns us about the viability of Smartsheet's product.

We get our Mechanical Turk advice from Andy Baio's Waxy.org. Baio recently paid Turkers 50 cents each to upload a picture of themselves with a paper sign explaining "why you Turk." The responses were incredibly humanizing and more than a little amusing.

Would you want a project management app that let you leverage those people's time at a low price? That sounds like a pretty intriguing idea to us.

http://www.readwriteweb.com/archives/project_management_mechanical_turk_smartsheet_looks_awesome.php http://www.readwriteweb.com/archives/project_management_mechanical_turk_smartsheet_looks_awesome.php Product Reviews Thu, 12 Feb 2009 11:15:33 -0800 Marshall Kirkpatrick
Twenty 9th Graders from Georgia Take On Google

The Digiteen Dream Team, a group of passionate ninth graders who have been using Google's Lively as part of the Digiteen Project, are planning to protest this Wednesday against Google's decision to close down its virtual world environment, Lively, at the end of this year.

In their shutdown announcement, Google suggested Lively users capture their work by taking videos and screenshots, and thanked their users adding: "We've learned a lot about how users interact in rich social environments." Is this all Lively was about? An experiment in user behavior?

Soon after Google's announcement, here at ReadWriteWeb we speculated that the reason for the kill was that Lively didn't offer Google any relevant data. Today, we have to question why any company would discontinue a service without providing alternatives to their customers - paying or not.

Image: Digiteen Action Project

Teacher Vicki Davis, in a blog post on the Dream Team site said that the class had contacted Livelyzens (other Lively users) and found that there are classrooms around the world using the tool. "On a Skype call between my class and some Livelyzen's yesterday, we learned that one Livelyzen has built a translator for multiple languages to allow avatars to communicate and speak in their native language! So cool!"

The American Education System Needs Your Help

In an attempt to have their voices heard, the Digiteen Dream Team created a blog where they have been listing their goals, along with suggestions on how Google could turn Lively around. You have to commend them on their efforts.

The student-led protest is planned for this Wednesday 2.15 - 3.00 p.m. (EST). These are the ways you can help:

  1. Create an account on Lively
  2. Create a room and host a protest. Let the Dream Team know, and they'll post about it
  3. Visit the protest room on Wednesday and show your support
  4. Sign the Lively petition
  5. Write a letter to Google about the use of Lively in education
  6. Pass the word on; promote the protest

With students around the world counting on virtual worlds, the economy in the sorry state it is in, and schools across the United States working on minimal funding, we need to find a way to provide a safe online environment for students to work in.

"My students have a dream to create 3D worlds for teaching digital citizenship - they are going to pursue this dream and I'm going to help them. We will not stop - but if we have to start over we want it to be the right place that is accessible to as many students as possible," Davis said recently on her post.

With the holiday season fast approaching, let's hope Santa has a few goodies in his bag - or at the very least, a trick up his sleeve.

http://www.readwriteweb.com/archives/twenty_9th_graders_from_georgi.php http://www.readwriteweb.com/archives/twenty_9th_graders_from_georgi.php Google Mon, 08 Dec 2008 00:22:20 -0800 Lidija Davis
Amazon's Mechanical Turk Used for Fraudulent Activities

Amazon's Mechanical Turk has fallen prey to social media spammers and it is now full of requests to spam bookmarking services for pennies per link. Although these HITs may stop short of being "fraud" in the legal sense of the word, they are certainly dishonest and unsavory. In addition to these spam bookmarking requests, we're also seeing HITs for Diggs, Stumbles, Slashdots, etc. of spammers' web pages and web sites.

In case you're unfamiliar, Amazon's Mechanical Turk is a crowdsourced marketplace for tasks. A person needing work done can set up a HIT (human intelligence task) - the small job they need done. Others come along to perform the HITs, earning micro payments along the way. In this way, businesses, developers, and other individuals have access to an affordable, scalable workforce.

The Dark Side to Mechanical Turk

Unfortunately, it appears that the convenience of the Turk marketplace has some appeal to social media spammers, who are now using the site to earn Diggs, bookmarks, and other social recommendations they do not deserve. Here's an example:

Photo courtesy of Brynn Evans

Anyone who uses Amazon's Mechanical Turk has no doubt come across similar HITs posted by spammers. For example, this guy is requesting someone create 29 social bookmark accounts from 29 sites:

A search for "bookmark" on MT today displays 48 results (at the time of writing) where spammers are requesting social bookmarking of their web site. Search for "digg" and you'll find people paying for Diggs.

Of course, whenever there is a system in place (like social media) that can help drive traffic to a web site, there will be those people who use it to generate traffic for their spam sites. But why are they able to use Amazon Mechanical Turk to do so? Shouldn't Amazon police the Turk to shut down these spam accounts?

Mechanical Turk Still Has Promise, Despite Spammers

However, this doesn't mean that Mechanical Turk doesn't hold any value - it's still an innovative and useful tool for many. In fact, members of the HCI community (Human Computer Interaction) have begun to use Turk for user research studies with great success. This work has inspired others like open source advocate, Chris Messina, to do the same. He plans to use Turk for usability studies on OpenID and OAuth. Since the HITs are spread out among many, the cost of performing these studies is greatly reduced. Being able to crowdsource research is a great way that MT can be used today, and one that will have a big impact on the future, too.

Thanks to Brynn Evans, a graduate student in the Department of Cognitive Science at University of California, San Diego for discovering this and thanks to open source advocate Chris Messina for sending it along to us.

http://www.readwriteweb.com/archives/amazons_mechanical_turk_used_for_fraud.php http://www.readwriteweb.com/archives/amazons_mechanical_turk_used_for_fraud.php Trends Fri, 29 Aug 2008 08:36:17 -0800 Sarah Perez
Amazon Announces New Payment Services and Updates to Mechanical Turk

In a quick succession of announcements, Amazon released a set of hosted e-commerce payment services, as well as an update to its Mechanical Turk service. The payment service, Checkout by Amazon, will allow online retailers to use Amazon's one-click checkout system, calculate shipping costs and tax, as well as allow their customers to track shipments. The updates to the Mechanical Turk are mostly meant to streamline the creation of new tasks by guiding businesses through the process more efficiently.

Checkout by Amazon

Of the two announcements, the payment services are the most interesting. Amazon gives its customers two options: Checkout by Amazon or Amazon Simple Pay. Simple Pay is basically a stripped-down version of the full Checkout package and doesn't include the one-click checkout and most of the order management features such as calculating sales tax and shipping rates, creating packing slips, or collecting buyer feedback. Simple Pay, on the other hand, allows sellers to use more payment options, including credit cards and bank accounts. Checkout by Amazon can only accept credit cards.

These services are basically an extension of Amazon's "Flexible Payment Service." This service (which has been in beta for quite a while now) gives developers a set of APIs that hook into Amazon's payment services. One area that Amazon is especially targeting with this is micro-payments.

With these new services, Amazon is going up against Google Checkout, as well as most credit card merchant accounts. However, with Amazon's already established reach among consumers, as well as the level of trust that most consumers have when it comes to working with Amazon, both Checkout and Simple Pay have a distinct advantage over their competition. For merchants, Amazon's Checkout service also offers a wider range of services than most credit card processors or Google Checkout currently offer. Google Checkout, however, is generally cheaper than Amazon's offerings - though it also offers fewer services.

Streamlined Mechanical Turk

Amazon's Mechanical Turk is basically a way to outsource menial tasks that would be too computing-intensive or that simply need human intelligence to be completed (Amazon calls them "Human Intelligence Tasks"). Applications range from tagging photos to rewriting trivia questions or digging a particular story.

With this latest update, 'requesters', as Amazon calls them, can look forward to a simpler user interface that will guide them through the process more effectively. Amazon has also created a set of more efficient tools to track and monitor the work that is being done.

http://www.readwriteweb.com/archives/amazon_payment_services_mechanical_turk.php http://www.readwriteweb.com/archives/amazon_payment_services_mechanical_turk.php News Wed, 30 Jul 2008 09:43:32 -0800 Frederic Lardinois
10,000 Cents Buys You $100: Awesome Crowdsourced Art Project

"Ten Thousand Cents" is a crowdsourced art project that led 10,000 artists, each paid one penny for their contribution, to recreate a US $100 bill one tiny section at a time. The brainchild of San Francisco artists Aaron Koblin and Takashi Kawashima, "Ten Thousand Cents" utilized Amazon's Mechanical Turk service and a bit of custom Flash software to lead 10,000 web workers in a coordinated, crowdsourced art project. The result is a rather impressive rendering of a US one hundred dollar bill drawn by an army of contributors.

Koblin and Kawashima first divided a high resolution scan of the $100 bill into 10,000 equal parts. Each part was then delivered to a turker who was paid a penny to duplicate it using a simple Flash-based drawing tool. Contributors didn't have any idea what they were working on while they were working on it.
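The article doesn't say how the scan was cut up. One simple way to get 10,000 equal parts is a 100 x 100 grid, which is the assumption behind the sketch below; the 6000 x 2500 pixel scan size and the function name `tile_boxes` are likewise made up for illustration.

```python
def tile_boxes(width, height, cols, rows):
    """Yield (left, top, right, bottom) pixel boxes for a cols x rows grid of equal tiles."""
    if width % cols or height % rows:
        raise ValueError("image dimensions must divide evenly into the grid")
    tile_w, tile_h = width // cols, height // rows
    for row in range(rows):
        for col in range(cols):
            left, top = col * tile_w, row * tile_h
            yield (left, top, left + tile_w, top + tile_h)


# Hypothetical scan size; a 100 x 100 grid gives 10,000 tiles, one per penny task.
boxes = list(tile_boxes(width=6000, height=2500, cols=100, rows=100))
print(len(boxes))
print(boxes[0])
print(boxes[-1])
```

Each box could then be cropped out and handed to a worker as a separate one-cent task, with the box coordinates recording where the finished drawing belongs in the final image.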

The project took 5 months to complete and involved contributions from 51 different countries. Because some turkers participated more than once, there weren't truly 10,000 different artists contributing to the project, but it appears that most countries had unique visitor rates of above 60%. The end result was a reproduction of a $100 bill that cost $100 to create.

"The project explores the circumstances we live in, a new and uncharted combination of digital labor markets, 'crowdsourcing,' 'virtual economies,' and digital reproduction," according to Koblin and Kawashima on the project web site.

The completed artwork is being displayed on the "Ten Thousand Cents" web site as an interactive video depicting all 10,000 pieces of the bill being drawn at once. A limited edition signed print (presumably signed by Koblin and Kawashima, not thousands of random turkers), is also available on the site for $100, with all proceeds going to the One Laptop Per Child project.

A video about "Ten Thousand Cents" is below.

http://www.readwriteweb.com/archives/ten_thousand_cents.php http://www.readwriteweb.com/archives/ten_thousand_cents.php User Generated Content Wed, 16 Apr 2008 11:39:51 -0800 Josh Catone
Amazon's Other Service: A Virtual Sweatshop? Actually, No

Amazon's web services get a ton of press, but mostly in the context of the Elastic Compute Cloud (EC2), the Simple Storage Service (S3), SimpleDB or one of the company's other developer-centric offerings. One that doesn't get much coverage in the tech media these days is the Mechanical Turk service, which Amazon refers to as the "on-demand workforce." When it does get coverage, it is sometimes to level accusations that Amazon is offering workers at sweatshop wages. But are those concerns really valid? Just who are these workers?

What is Mechanical Turk?

The Mechanical Turk service, which Amazon released in November of 2005, is a web service that allows companies to outsource simple, generally repetitive tasks to human workers for small sums of money. It now has 100,000 workers in 100 countries and counts corporations such as comparison search engine PriceGrabber among its users.

The service got mainstream attention when Amazon used it to help organize a virtual search for missing Microsoft researcher Jim Gray last year.

A quick survey of open assignments on the Mechanical Turk site reveals a handful that pay up to $15.00, but the vast majority pay out under $1, and many pay only a few cents. There are over 31,000 so-called Human Intelligence Tasks available on the site right now, but scanning through them by price makes it easy to imagine that collectively they're still probably not worth as much as what some of Amazon's executives can find in their couch cushions. So it's not hard to see how some people could accuse Amazon of creating a virtual sweatshop labor force.

Who Are These People?

A recent demographic survey done through Mechanical Turk sheds some surprising light on just who these "Turkers" are, however. Contrary to the pictures painted by some media outlets that Amazon has assembled a third world workforce of people willing to work for pennies, most Turkers are actually from the United States. According to the survey, 76.25% are from the US, with just over 8% from India.

Further, the vast majority of Mechanical Turk participants are under 40 years of age, and over 50% of them have bachelor's degrees. About half also make over $25,000 per year -- and a surprising percentage make over $40,000 per year.

So why participate in the Mechanical Turk program -- one which nets most people under $600 per year -- if you're well educated, already have a paying job, and there are so many other ways to make money? One answer is that people find the tasks offered on Mechanical Turk fun. The Amazon service provides people with time wasters that also pay a little money and for younger users, especially, the service offers an easy way to make a little pocket change.

A recent New York Times article relates a number of anecdotal stories about why people participate, as does an earlier post on Panos Ipeirotis's blog (Ipeirotis is responsible for the demographic survey referenced in this post). Clearly, very few people participate in Mechanical Turk solely to make money. Most people do it out of boredom, to make a little pocket change, or because they are limited in the type of work they can do due to disability or some other reason.

Have you ever used Mechanical Turk to outsource a task? Have you ever participated as a worker? What was the result? Let us know in the comments.

http://www.readwriteweb.com/archives/amazon_mechanical_turk_demographics.php http://www.readwriteweb.com/archives/amazon_mechanical_turk_demographics.php Trends Fri, 28 Mar 2008 12:23:32 -0800 Josh Catone