spam filtering - ReadWriteWeb http://www.readwriteweb.com/feeds/tag/spam filtering en Copyright 2012 Richard MacManus readwriteweb@gmail.com Wed, 15 Feb 2012 10:45:03 -0800 http://www.sixapart.com/movabletype/?v=4.35-en http://blogs.law.harvard.edu/tech/rss

Spammers Newest Tactic: YouTube Video Spam

Researchers at Kaspersky Lab have recorded a mass mailing of spam emails containing a link to a video advertisement on YouTube. Although in the past, spammers have attempted to lure people into clicking links by claiming the link would display a YouTube video, this is the first case in which the link actually does point to YouTube. In this particular incident, the video in question is a Russian ad promoting industrial real estate.

Two years ago, Kaspersky Lab predicted that YouTube would eventually become a vector for disseminating spam due to its worldwide popularity. However, this is the first time the video-sharing site has been used in this way as far as the researchers can tell.

Says Darya Gudkova, Head of Content Analysis & Research at Kaspersky Lab, "naturally, this type of advertising is more interesting and gets more hits." That's bad news for YouTube because when something works, spammers keep at it... with a vengeance. Once word gets around that video spam is more successful than traditional methods, there's no doubt that it will only increase.

How Would YouTube Handle Video Spam?

So what will YouTube do if video spam becomes a real problem on its network? We would like to think that it would take the offending content down, but that could be easier said than done. After all, this isn't like the copyrighted content that their Content Identification tool can easily identify and remove. That tool works by comparing unique signatures somewhat like a digital "fingerprint" from a content owner's copyrighted file to user uploads across the site. Then, if a match occurs, the copyright holder has the option to have the video taken down.
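As a rough illustration of the signature-matching idea described above, the sketch below hashes fixed-size chunks of a reference file and reports what fraction of those chunks reappear in an upload. This is not YouTube's actual tool, whose internals are not described here: the function names and the chunk size are hypothetical, and exact hashing is an assumption made for brevity (a real fingerprint would need to survive re-encoding, which exact hashes do not).

```python
import hashlib


def fingerprint(data: bytes, chunk_size: int = 4) -> set[str]:
    """Return the SHA-256 digests of consecutive fixed-size chunks (hypothetical scheme)."""
    return {
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    }


def match_ratio(reference: bytes, upload: bytes, chunk_size: int = 4) -> float:
    """Fraction of the reference's chunk digests that also occur in the upload."""
    ref = fingerprint(reference, chunk_size)
    if not ref:
        return 0.0  # an empty reference cannot match anything
    return len(ref & fingerprint(upload, chunk_size)) / len(ref)
```

A site could flag an upload for the content owner's review when `match_ratio` exceeds some threshold. No comparable reference file exists for a spammer's original video, which is why the same approach does not carry over to video spam.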

Identifying a spammer's video would be much harder. Just because someone is using YouTube to sell something, that doesn't necessarily mean it's video "spam." That moniker should only be reserved for videos which are truly undesirable messages where fraudulent activities are underway. The question is, how would YouTube know?

Assuming that video spam takes off, the best thing the site could do to police online content is to include a "report spam" button for videos themselves, as it now has for video comments only. 

Of course, for potential victims of video spam, the best thing is not to get duped into visiting YouTube in the first place. Spam filters will simply have to adapt to this new technique. Unfortunately, that will be yet another challenge for Google, which, in addition to owning YouTube, also offers a feature in its webmail product Gmail that automatically embeds any YouTube videos referenced in the email directly in the message itself. That makes it even more convenient for video spammers, who wouldn't have to convince their victims to leave their inbox and launch a new browser window: just click a button on the video embedded below.

http://www.readwriteweb.com/archives/spammers_newest_tactic_youtube_video_spam.php Video Services Fri, 09 Oct 2009 06:02:29 -0800 Sarah Perez
Mollom Blocks 100 Millionth Spam Message

Editor's note: we offer our long-term sponsors the opportunity to write 'Sponsor Posts' and tell their story. These posts are clearly marked as written by sponsors, but we also want them to be useful and interesting to our readers. We hope you like the posts and we encourage you to support our sponsors by trying out their products.

Mollom, the spam-filtering startup that eliminates comment and post spam on popular content management systems, just reached two important milestones: it processed 100,000,000 messages and is now actively protecting over 10,000 websites.

It was only about three months ago that the startup, founded by Dries Buytaert and Benjamin Schrauwen, celebrated its 50 million message milestone, and only two months before that when the company reached 25 million. Mollom is still a young company, but these milestones are coming fast because so many websites are getting on the bandwagon with the aim of increasing the quality of their website interaction by blocking spam.

Even more impressive is that these statistics are for Mollom's public servers only and don't include message processing on private servers operated for large-volume clients, such as Netlog, an online social portal for European youth.

Mollom set up dedicated servers in Netlog's data center to provide automated around-the-clock monitoring and custom-trained content classifiers. Mollom's servers analyze more than 50 messages per second for Netlog, adding up to an additional 4 million messages per day that are not counted in the latest milestone.
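The daily figure follows directly from the per-second rate. A quick check, using a hypothetical helper name and assuming the rate holds steady around the clock:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def messages_per_day(messages_per_second: float) -> int:
    """Convert a sustained per-second rate into a daily total."""
    return int(messages_per_second * SECONDS_PER_DAY)


print(messages_per_day(50))  # 4320000
```

At exactly 50 messages per second the total is 4.32 million per day, consistent with the roughly 4 million quoted above.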

Large sites such as Netlog are turning increasingly to Mollom for its ability to filter spam in near real-time. Another site, popular citizen journalism hub NowPublic, had been receiving almost 25,000 spam posts per day before implementing Mollom's service. After NowPublic installed Mollom, the number of legitimate comments by users jumped 180%, while spam comments fell to nearly zero.

Taking into account the traffic from the 10,000 websites that Mollom protects, Mollom currently processes up to 150 million messages a month, making it one of the largest website spam filtering services available today.

But Mollom is not content to rest on its past achievements. The company is currently changing the architecture of its back-end, which will make the software learn faster and make its actions easier to debug, analyze, and oversee.

Mollom offers its services in tiers, with products targeted at small blogs, mid-sized sites, and large enterprise-level Web properties. Mollom Free, designed for small blogs and sites with small posting volumes, is provided free of charge to the Web community, while Mollom Plus and Mollom Premium are commercial services designed for sites with higher volumes and reliability requirements. More information about its service plans is available on Mollom's website.

http://www.readwriteweb.com/archives/mollom_blocks_100_millionth_spam_message.php Sponsors Thu, 23 Jul 2009 05:00:26 -0800 RWW Sponsor
Maybe Twitter Trends Shouldn't Be Entirely Automated?

Over the weekend, Twitter's trending topics were once again the target of an attack, this time implemented by the members of the infamous image board 4chan, the site known for its internet memes and pranks. As with previous attempts to pollute the trends with nonsense, the hashtag pushed into the leaderboard was yet another inappropriate term. Last time this happened, we saw Twitter pull the offensive tags from the trends section, a move which prompted us to cheer: Twitter censoring trending topics? Isn't it about time? Again, it seems the company has pulled the same move. By the time tech blogs picked up the story, the term had disappeared completely from the trends section.

But maybe "trends" like this have no business ever making "trend" status at all. We have to wonder if censorship after the fact is going to be good enough for Twitter going forward. As Twitter continues to grow, more and more people will want to get their keyword or hashtag featured in this popular section of the Twitter Search site. Perhaps Twitter should consider putting a human editor in charge of weeding through the supposed trends before they get posted.

Twitter Censoring Trends: Is it Enough?

At the end of the day, we agree with Twitter's decision to pull the obviously forced hashtag from the trends section just as they did the last time a bunch of folks thought they would have some fun by tweeting other offensive words and phrases. But these incidents have made us wonder: has Twitter trends outlived its ability to function properly as an entirely algorithm-based service? Given how many people rely on Twitter trends to track hot topics and breaking news, the section will be under constant attack from those who want to use the algorithm for their own purposes...and not necessarily good ones.

In some cases, like the latest 4chan move, the term-made-trend will be a somewhat offensive, but ultimately harmless prank. In other cases, the trends will be courtesy of some marketer pushing their hashtag up through the ranks thanks to their latest "tweet-to-win" contest. But does either of these cases represent an organic news-based trend that deserves the spotlight? Perhaps not.

Although censorship isn't something that most people would normally support, in these cases it would feel less like censorship than like a simple act of filtering. It's easy to see that "trends" like these aren't really the sort of trends that the section was meant to highlight. However, by letting the algorithm do all the work, everyone with an evil plan to get their hashtag into the leaderboard has a shot at 15 minutes of fame. And on the real-time web, that's an eternity.

If, on the other hand, Twitter started pre-filtering the trends for relevance, there would no longer be a reason for hoaxsters, pranksters, and other trend-hogging marketers to attempt to game the system. Just by putting a human editor in charge of Twitter trends, "fake trends" like these could easily be avoided. Even if the company didn't want to go with full-on censorship, they could at the very least move the "other" trends off the main page by adding a link that said "More..."

Filtered Trends Could Delay Breaking News

But the drawback to a human-filtered trends section could be a delay in seeing breaking news make trend status - and that would be a disaster for a service that's all about immediacy. For some people, even the threat of a delay such as this would probably have them saying, "forget censorship and filtering - I want real-time trends, legit or not!"

But to those people, we ask: what about when Twitter becomes so uber-popular that the real-time trends section you crave becomes filled with junk trends thanks to internet memes and marketers' messages? Will you still prefer it then?

We're not sure if a human editor is the right solution for Twitter, but one day soon, something will have to be done. One commenter on a previous post mentioned some other ideas for filtering trends and hashtag spam, including having users tweet "#spam=hashtag" and suggesting that Twitter add a feature which would let us block hashtags from our streams. Another commenter suggested Outlook-like rules for hiding certain hashtags. If you have any ideas of your own about what Twitter should do, feel free to share them in the comments.
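To make the commenters' suggestions concrete, here is a minimal sketch of how "#spam=hashtag" reports and a per-user block list might work together. It is purely illustrative: Twitter offers no such feature, and the function names, the report threshold, and the plain-list representation of a stream are all assumptions.

```python
import re

HASHTAG = re.compile(r"#(\w+)")          # ordinary hashtags, e.g. #contest
SPAM_REPORT = re.compile(r"#spam=(\w+)")  # user reports, e.g. #spam=contest


def reported_hashtags(tweets, min_reports=2):
    """Collect hashtags reported as spam in at least `min_reports` tweets."""
    counts = {}
    for text in tweets:
        # Count each reported tag at most once per tweet.
        for tag in set(SPAM_REPORT.findall(text.lower())):
            counts[tag] = counts.get(tag, 0) + 1
    return {tag for tag, n in counts.items() if n >= min_reports}


def filter_stream(tweets, blocked):
    """Hide every tweet that carries one of the blocked hashtags."""
    blocked = {tag.lower() for tag in blocked}
    return [
        text for text in tweets
        if not blocked & set(HASHTAG.findall(text.lower()))
    ]
```

Requiring more than one report before a tag is hidden is a design choice meant to keep a single user from burying a legitimate trend. A real system would also need to guard against coordinated false reports, which is the same gaming problem in another form.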

http://www.readwriteweb.com/archives/maybe_twitter_trends_shouldnt_be_entirely_automated.php Twitter Mon, 06 Jul 2009 07:49:26 -0800 Sarah Perez
When it Comes to Spam, Everything Old is New Again

Google released some interesting data about the volume and types of attacks its spam detection software identified over the last quarter. According to Google, overall spam levels in the second quarter of 2009 were 53% higher than during the first quarter, and 6% higher than a year ago. Even though the total volume of spam dropped by 70% after the takedown of the infamous McColo ISP, it only took four months for spam levels to get back to normal. Last month, 3FN, another large ISP spam source, was also shut down, but spam volume only dropped by about 30%, and chances are that the spam market will simply rebound within a few months, as new spammers get into the market.

The Return of Image Spam

Interestingly, Google also notes that image spam, which is generally filtered out quite well by modern spam detection software, has seen a major resurgence. Amanda Kleha, a member of Google's message security and archiving team, theorizes that this might be due to new spammers getting into the market after the shutdown of McColo and 3FN, and these new players are starting out with well established methods, even if they are not very effective. Kleha also notes that spammers might just be testing how well the current generation of spam filters handles these messages in order to perform statistical analysis based on which subject lines and content make it into users' inboxes.

Google also notes that one of the largest spam attacks in the last quarter was based on an old school "newsletter" template (with malevolent links and images thrown in there for good measure). This attack unleashed about 50% of an average day's spam volume in only 2 hours. So while it might not have been highly sophisticated, there was surely a massive network behind it that was able to send out this huge amount of spam in such a short time.
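To put that burst in perspective, the figures above imply a sending rate several times the daily average while the attack lasted. A back-of-the-envelope calculation, using a hypothetical helper name and assuming spam is otherwise spread evenly across 24 hours:

```python
def burst_multiplier(fraction_of_daily_volume: float, burst_hours: float) -> float:
    """How many times the average hourly rate was sustained during the burst."""
    # The average hour carries 1/24 of a day's volume, so scale by 24.
    return fraction_of_daily_volume * 24 / burst_hours


print(burst_multiplier(0.5, 2))  # 6.0
```

Half a day's volume in two hours works out to about six times the normal rate, on top of whatever other spam was being sent at the time.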


http://www.readwriteweb.com/archives/when_it_comes_to_spam_everything_old_is_new_again.php News Wed, 01 Jul 2009 09:13:08 -0800 Frederic Lardinois
Mollom's Spam Filtering Helps Fast-Growing NowPublic

Editor's note: we offer our long-term sponsors the opportunity to write 'Sponsor Posts' and tell their story. These posts are clearly marked as written by sponsors, but we also want them to be useful and interesting to our readers. We hope you like the posts and we encourage you to support our sponsors by trying out their products.

The Web is changing. In today's world, user participation can make or break a site. Allowing users to react, participate, and contribute while keeping your site under control can be a huge challenge. If poor-quality content or spam hits your website, it can undermine your site's search engine listing, damage your brand and reputation, and degrade your visitors' experience. Good user-contributed content, meanwhile, can add a lot of value to your site, which translates into more activity, improved stickiness, and more and better monetization opportunities. As the Web continues to become more social, more websites will need a strategy to deal with spam and unwanted content.

Given the state of today's publishing world and the decrease in print media revenue, many publishers are looking to their online presence to increase revenue and readership. To engage with new readers and encourage them to contribute comments and content, media houses and content sites are adding social features.

The addition of these social features has brought the problem of spam. Two major challenges arise from trying to control website spam. First, visitors may lose their motivation to comment or contribute content because they are so often required to register to prove that they are human and not spammers. This erodes participation.

Second, whether visitors are asked to register or not, site moderation becomes more time-consuming and expensive. Website moderators have to scan comments and other content to find spam instead of interacting with the community. And publishing companies have to pay for more site moderators to deal with all the spam on their sites.

NowPublic is a Vancouver-based news network that mobilizes an army of reporters to cover events around the world. During Hurricane Katrina, NowPublic had more reporters in affected areas than most news organizations have on their entire staff. NowPublic was up against as many as 25,000 spam attempts a day, so it needed a solution that would allow the site to grow faster and more effectively without being slowed by comment spam.

A year ago, NowPublic implemented Mollom, a Web service that protects blogs, social networks, and communities against spam and other unwanted content. Within 12 months, the company had become one of the fastest-growing news organizations in the world, with thousands of reporters in more than 140 countries. In addition to this growth in reporters, NowPublic has seen a 180% increase in the average number of comments posted per month by users since implementing Mollom's spam-filtering service.

"Integrating Mollom in NowPublic's systems was quick and easy," says Michael Meyers, co-founder and CTO of NowPublic. "It took only a few hours, and the API service has been fast and 100% reliable. By the end of the first month, we saved more in-person hours alone than Mollom cost us for the year."

Mollom has prevented more than one million spam attempts since it started protecting NowPublic. But NowPublic uses Mollom for more than just comment spam. It uses it to identify bogus profiles, vet new account sign-ups, and protect forums.

Mollom, in effect, removed a major barrier to visitor participation for NowPublic, allowing readers to comment anonymously. "Mollom has been a critical ingredient in our success," adds Michael Tippett, co-founder and CMO. "It has allowed us to open our comments to anonymous users while limiting the ability of spammers to vandalize our site. This has helped us grow our page views and truly tap into the wisdom of crowds."

Mollom also allows NowPublic's website maintainers and editors to focus on providing content instead of removing spam. "Since NowPublic began using Mollom," says Jordan Yerman, NowPublic's Contributor Support Manager, "I've saved at least an hour per day dealing with spam in stories, profiles, comments, etc. Thanks to Mollom, I can be more pro-active than reactive. I have more time to engage and interact with our users."

Other major publishers using Mollom to protect their websites from spam are Sony Music, Warner Bros Records, Netlog, The Economist, Fox Interactive, and the New York Observer.

Visit mollom.com to download Mollom's spam filtering service for your website.

http://www.readwriteweb.com/archives/mollom_spam_filtering_helps_nowpublic.php Sponsors Thu, 18 Jun 2009 05:00:08 -0800 RWW Sponsor