code - ReadWriteWeb http://www.readwriteweb.com/feeds/tag/code en Copyright 2012 Richard MacManus readwriteweb@gmail.com Wed, 15 Feb 2012 07:00:00 -0800 http://www.sixapart.com/movabletype/?v=4.35-en http://blogs.law.harvard.edu/tech/rss

Developer Creates Tool to Bring RSS Back to Twitter

Earlier this month, entrepreneur and blogger Jesse Stay noticed that both Facebook and Twitter had completely removed support for RSS from their websites. After much outcry from the tech community, Facebook relented and restored an RSS link to Facebook Pages. Twitter, however, did nothing.

But now, one developer has taken it upon himself to build a tool that uses Twitter's API (application programming interface) to create RSS feeds. The code, called "Twitter API 2 RSS," is now available on GitHub here.

Twitter Kills RSS

According to Stay's earlier post, Twitter has been moving away from RSS for some time. Last year, Twitter developer Isaac Hepworth told Stay that only hyperlinks to RSS feeds were being removed from Twitter profile pages, but links to the RSS in the Twitter metadata would remain. Their temporary removal was "accidental," Hepworth had said, and would be fixed soon.

But Stay says the problem was never fixed, and he could not find any evidence of RSS in the HTML source, either. This led him to conclude that Twitter had indeed killed off all support for the technology. An article in Twitter's Help section confirmed this, saying: "we no longer directly support RSS feeds on Twitter."

As Stay noted at the time, developers could access RSS through Twitter's API, which may be the last recourse for getting an RSS feed from Twitter's website, outside of third-party services.

Twitter API 2 RSS

Now another developer, Shawn McCollum, has done just that. Twitter API 2 RSS, available as a code snippet on GitHub (a "gist"), is now ready for testing, he says. He originally wrote the code for personal use, when he wanted to build his own better-looking and more functional RSS feeds for Twitter profiles.
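To make the idea concrete, here is a minimal sketch of the conversion step: turning tweets that have already been fetched into an RSS 2.0 document. It is not McCollum's code. The function name, the shape of the tweet dictionaries ("id", "text", "created_at") and the status URL pattern are all assumptions made for illustration, and no Twitter API call is made.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def tweets_to_rss(screen_name, tweets):
    """Render a list of tweet dicts as an RSS 2.0 document (returned as a string).

    Each tweet dict is assumed to carry "id", "text" and "created_at"
    (a timezone-aware datetime). This shape is hypothetical.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = f"Tweets from @{screen_name}"
    ET.SubElement(channel, "link").text = f"https://twitter.com/{screen_name}"
    ET.SubElement(channel, "description").text = f"Timeline of @{screen_name}"

    for tweet in tweets:
        item = ET.SubElement(channel, "item")
        # Assumed permalink pattern for a single tweet.
        url = f"https://twitter.com/{screen_name}/status/{tweet['id']}"
        ET.SubElement(item, "title").text = tweet["text"]
        ET.SubElement(item, "link").text = url
        ET.SubElement(item, "guid").text = url
        # pubDate uses the RFC 822 style date format that RSS expects.
        ET.SubElement(item, "pubDate").text = format_datetime(tweet["created_at"])

    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    sample = [{
        "id": 1,
        "text": "hello & goodbye",
        "created_at": datetime(2011, 5, 27, 8, 58, 16, tzinfo=timezone.utc),
    }]
    print(tweets_to_rss("example", sample))
```

ElementTree escapes characters such as "&" on output, so tweet text can be passed in as-is.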

After McCollum heard that Twitter was removing RSS support, he realized that the same code could be retooled for use by others. The only problem now is that he does not know how to get past Twitter's API limit of 150 calls per hour from a single IP address. He's looking for ideas to help with that, if you want to pitch in.
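One common way to stretch a per-IP limit like this is to cache each profile's tweets for a few minutes, so that repeated requests for the same feed do not each cost an API call. The sketch below is only one possible idea, not something McCollum has described. The class name, the `fetch` callable and the 300-second lifetime are assumptions. With a five-minute lifetime, each profile costs at most 12 calls per hour, so about 12 profiles fit under the 150-call limit.

```python
import time


class TTLCache:
    """Cache fetched timelines so repeat feed requests skip the API."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch      # hypothetical function: screen_name -> tweets
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the cache can be tested
        self._entries = {}       # screen_name -> (expires_at, tweets)

    def get(self, screen_name):
        now = self._clock()
        entry = self._entries.get(screen_name)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: no API call is made
        # Missing or expired: spend one API call and remember the result.
        tweets = self._fetch(screen_name)
        self._entries[screen_name] = (now + self._ttl, tweets)
        return tweets
```

The trade-off is freshness: a subscriber may see tweets up to five minutes late, which is usually acceptable for a feed that readers poll on their own schedule.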

In the meantime, technical users can host their own copy of Twitter API 2 RSS and then subscribe to the resulting feeds in Google Reader or any other RSS reader application. However, the code is not yet available as a service for end users. Details on how to use the code are available here on McCollum's blog.

Here's what it looks like, in action:

[Image: Twitter2rss]

http://www.readwriteweb.com/archives/Developer_creates_tool_to_bring_RSS_back_to_Twitter.php RSS & Feeds Fri, 27 May 2011 08:58:16 -0800 Sarah Perez
Is It Time For a Web Crawling Code of Conduct?

Earlier this week, The Wall Street Journal posted an article entitled "'Scrapers' Dig Deep for Data on Web". While the article highlights some important issues surrounding the murky and potentially shady business of Web crawling, it fails to provide a comprehensive story on the uses of Web crawling. In other words, by focusing on one or two companies with spotty business practices, it casts the entire practice of data collection from the Web as something to be feared.

Guest author Shion Deysarkar (@shiondev) is responsible for overall business development at 80legs. In a previous life, he founded and ran a predictive modeling firm. He enjoys playing poker and soccer, but is only good at one of them.

Why Web Crawling Is Good

There have certainly been cases where Web crawling has gone too far. The PatientsLikeMe.com case highlighted in the article is a great example. However, I would argue that there are far more cases where Web crawling and data collection from the Web have generated real value - not only for companies, but for individuals as well.

For instance, aggregate data from the Web helps companies learn what people think about their products. Companies that can listen better can meet the needs of their customers better. Another interesting use-case is discovering and analyzing potential ad channels. Ad networks crawl millions of Web pages to find content relevant to their ad inventory. Crawling also allows companies like Infochimps and Factual to build better, more structured data sets with anything from property data to sports data. Rather than having this data scattered around the Web, it's now centralized for easy consumption and analysis.

A Web Crawling Code of Conduct

Unfortunately, and somewhat understandably, it's easier to focus on the murky underbelly of Web crawling. People gravitate more to stories about organizations doing the wrong thing than stories about companies just running their businesses the right way. 80legs and other companies involved in legitimate Web data collection need to make sure we are not grouped in with the other organizations.

I think a great first step toward this is establishing a "Web Crawling Code of Conduct". The rules and laws surrounding Web crawling have been hazy at best and show no signs of being clarified. This is not surprising, considering that law tends to play catch-up with technology. However, after some experience in this industry, I feel that the following two rules embody the minimum necessary guidelines for proper Web crawling:

1. Only publicly-available sources may be crawled. This means bots cannot log into websites, unless explicitly allowed by the website.

2. Do not overwhelm a website with crawling requests. Crawling requests should not significantly increase the amount of bandwidth needed by the server.
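As an illustration of the second rule, a crawler can enforce a minimum delay between requests to the same host. The sketch below is a simplified example, not a description of how 80legs or any other crawler works. The class name, the two-second default and the injectable `clock` and `sleep` parameters are assumptions.

```python
import time
from urllib.parse import urlparse


class PoliteScheduler:
    """Enforce a minimum delay between requests to the same host."""

    def __init__(self, min_delay_seconds=2.0, clock=time.monotonic, sleep=time.sleep):
        self._min_delay = min_delay_seconds
        self._clock = clock
        self._sleep = sleep
        self._last_request = {}  # host -> time of the previous request

    def wait_turn(self, url):
        """Block until `url` may be requested; return the seconds waited."""
        host = urlparse(url).netloc
        now = self._clock()
        last = self._last_request.get(host)
        waited = 0.0
        if last is not None:
            # Sleep only for whatever part of the delay has not yet elapsed.
            waited = max(0.0, self._min_delay - (now - last))
            if waited > 0:
                self._sleep(waited)
        self._last_request[host] = now + waited
        return waited
```

A crawler would call `wait_turn(url)` immediately before each request. Requests to different hosts do not delay one another, so overall throughput stays high while any single server sees a bounded request rate.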

Some readers may feel I've left out certain aspects that should be included in proper Web crawling, such as following robots.txt and other practices. While I recognize the value that those practices have, my personal opinion is that Web data sources and Web data collectors should work together to maximize the value of Web data, and that some common practices hamper that unnecessarily. Further discussion is welcome and eagerly anticipated.
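For readers unfamiliar with the robots.txt practice mentioned above, the following sketch shows what checking it involves, using Python's standard `urllib.robotparser` module. The rules, the bot name "examplebot" and the URLs are made up for the example, and the rules are parsed from an in-memory string rather than fetched from a real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every bot is asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allowed("examplebot", "http://example.com/public/page"))
    print(allowed("examplebot", "http://example.com/private/data"))
```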

Perhaps while we wait for proper regulations to help distinguish those socially aware crawling services acting with best practices in mind from the more dubious companies with other interests, we should move toward creating a more formal, independent board that can certify, whether officially or unofficially, those crawling companies adhering to such a code and operating legitimate services.

Photo by homyox

http://www.readwriteweb.com/archives/is_it_time_for_a_web_crawling_code_of_conduct.php Security Fri, 15 Oct 2010 11:30:00 -0800 Guest Author
Snipt.org Extends Service with API, AIR Application

The folks at Snipt.org haven't been sitting still since we first told you about their code snippet-sharing utility in January. Today they released a number of tools to extend their service. First, they have a new API that allows applications to be built that talk to Snipt directly. That API made possible three more releases, all of which use it: an Adobe AIR-based client application called Cloud Coder, JavaScript embed code for integrating Snipt on a blog, and a WordPress plugin.

Details of the new offerings can be found on the Snipt.org blog. The Snipt API enables all the tools released today, and paves the way for third-party development of new tools or support for Snipt as a de facto repository for code snippets. The Snipt API uses a REST service model and mirrors the Snipt web page by using the user's Twitter username and password for authentication. They have simultaneously released an open source ActionScript library for use in other Adobe Flash/Flex/AIR projects.

The AIR application, called Cloud Coder, is fairly straightforward. After the user provides Twitter credentials, it displays all the snipts in the user's SniptBox (basically, saved bits of code), where they can be retrieved, edited and copied to the clipboard. There is a single option to enable 'auto snipt', which watches the clipboard and catalogs new code segments automatically. The SniptBox view also has a 'tweet this' button and shows the URL of the selected snipt for sharing on other networks. Finally, the application can be docked when not in use.

The WordPress plugin and JavaScript code both enable Snipt integration on blogs and web sites. There are other plugins that allow application code to be displayed in blog posts, but the advantage of Snipt's offering is that these code snippets can be selected directly from the Snipt database without additional cutting, pasting, or formatting work.

http://www.readwriteweb.com/archives/sniptorg_releases_api_air_app_and_wordpress_plugin.php News Wed, 18 Mar 2009 22:00:00 -0800 Phil Glockner