<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" 
      xmlns:thr="http://purl.org/syndication/thread/1.0">
  <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_now_scanning_rss_atom_feeds.php" />
  <link rel="self" type="application/atom+xml" href="http://www.readwriteweb.com/atom.xml" />
  <id>tag:,2010:/1/tag:www.readwriteweb.com,2009://1.16953-</id>
  <updated>2010-03-01T16:48:26Z</updated>
  <title>Comments for Google Now Scanning RSS, Atom Feeds, May Experiment with Real-Time Protocols in Future</title>
  
  <generator uri="http://www.sixapart.com/movabletype/">Movable Type 4.23-en</generator>
  <entry>
    <id>tag:www.readwriteweb.com,2009://1.16953</id>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_now_scanning_rss_atom_feeds.php" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.readwriteweb.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=16953" title="Google Now Scanning RSS, Atom Feeds, May Experiment with Real-Time Protocols in Future" />
    <published>2009-10-30T13:44:01Z</published>
    <updated>2009-10-30T15:49:08Z</updated>
    <title>Google Now Scanning RSS, Atom Feeds, May Experiment with Real-Time Protocols in Future</title>
    <summary>According to a post on Google&apos;s Webmaster Central blog, Google is now discovering web sites by automatically scanning RSS and Atom feeds. This new process will help Google more quickly identify web pages and will allow users to find new content in search results as soon as it goes live. While not exactly &quot;real-time,&quot; using...</summary>
    <author>
      <name>Sarah Perez</name>
      <uri>http://www.sarahintampa.com</uri>
    </author>
    
    <category term="Google" />
    
    <category term="NYT" />
    
    <category term="News" />
    
    <category term="Real-Time Web" />
    
    <category term="Search" />
    
    <content type="html" xml:lang="en" xml:base="http://www.readwriteweb.com/">
      <![CDATA[<p><img src="http://www.readwriteweb.com/images/google_logo.gif" />According to a post on <a href="http://googlewebmastercentral.blogspot.com/2009/10/using-rssatom-feeds-to-discover-new.html">Google's Webmaster Central blog</a>, Google is now discovering web sites by automatically scanning RSS and Atom feeds. This new process will help Google more quickly identify web pages and will allow users to find new content in search results as soon as it goes live. While not exactly "real-time," using feeds to identify updates to websites is an arguably faster method than the traditional crawling techniques Google has used in the past. And Google may get <em>even faster</em> in the near future - the post also notes that the company may soon explore using mechanisms like the real-time protocol <a href="http://code.google.com/p/pubsubhubbub/">PubSubHubbub</a> to identify updated items going forward. </p>]]>
      <![CDATA[
<p>The blog post doesn't say whether or not RSS and Atom discovery is displacing traditional web crawling for sites that are feed-enabled, but it's likely that, if given the choice, Google will opt for the faster method if available. As Vanessa Fox notes on the <a href="http://searchengineland.com/googles-additional-discovery-method-rss-and-atom-feeds-28828">SearchEngineLand blog</a>, since it's unknown at this time whether Google is using the feeds in place of traditional web crawling, it may make sense to use full feeds rather than partial ones in order to get your content indexed faster by Google's search engine. </p>

<h2>Real-Time Web Crawling in the Future?</h2>

<p>Although it was only briefly mentioned in the post, Google hinted that it may begin looking into other mechanisms such as <a href="http://code.google.com/p/pubsubhubbub/">PubSubHubbub</a>, an open protocol that provides near-instant notifications of updates. No further details were provided beyond the one sentence, but the announcement clearly shows that Google has seen the writing on the wall and knows that the real-time web is the future. This is one trend the company isn't planning to ignore.</p>

<p>The real-time web, heavily influenced by the speed of Twitter and other rapid-fire social networking updates, has created a desire among internet users for faster access to information. This desire has, in turn, led to the creation of new real-time protocols such as the above-mentioned <a href="http://code.google.com/p/pubsubhubbub/">PubSubHubbub</a> and its counterpart <a href="http://rsscloud.org/">RSSCloud</a>. If Google began to use these technologies for scanning the web, its search results wouldn't just be updated <em>faster</em> - they would be updated in real time. That means information would become available in the search results listings as soon as it was published to the web. </p>

<a href="http://www.readwriteweb.com/reports/real-time-web.php"><img src="http://www.readwriteweb.com/images/300x100rtwreportad.png" align="right" hspace="5px" vspace="5px"></a><p>That, of course, would lead to a whole new series of challenges for the search engine - most notably, how to rank the real-time results? Given that Google's search algorithm has been built on top of the concept of PageRank, a way to determine the relevance of a website by what other sites link to it, ranking search results that are so fresh that there is an absence of links could prove a difficult feat. However, Google is already doing this to some extent now. Over time, the PageRank algorithm has evolved and can now reward sites with fresher, more fitting content and rank them higher than sites with more links on some occasions. And if anyone can figure out the proper algorithm for mixing in real-time content and ranking it appropriately along with static pages, it's got to be Google. In fact, we'll probably soon see exactly how they plan on addressing this issue, when they <a href="http://www.readwriteweb.com/archives/google_indexes_twitter.php">incorporate Twitter search results</a> into their index, as announced last week. </p>

<h2>...But Until Then, Google Delivering Faster, Fresher Results Instead</h2>

<p>Although the PubSubHubbub mention may have been the most exciting part of the announcement, real-time search results aren't here just yet. In the meantime, we have to be content with <em>sped-up</em> results instead. The post advises website owners who are blocking Googlebot, Google's crawler, from crawling their RSS/Atom feeds to unblock it via their robots.txt file. If unsure, webmasters can test their feed URLs with the <a href="http://www.google.com/support/webmasters/bin/answer.py?answer=156449">robots.txt tester in Google Webmaster Tools</a>, as the post recommends. </p>]]>
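      <![CDATA[
<p>For a quick local check before turning to Google's own tester, the sketch below uses Python's standard <code>urllib.robotparser</code> module to see whether a given set of robots.txt rules would let Googlebot fetch a feed URL. It is only an approximation of how a crawler reads the file, not Google's tool. The function name <code>feed_allowed</code>, the two sample rule sets and the <code>example.com</code> feed URL are all made up for illustration.</p>

```python
from urllib.robotparser import RobotFileParser


def feed_allowed(robots_txt: str, feed_url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `feed_url`.

    Hypothetical helper: it parses rules passed in as a string, so no
    network request is made.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, feed_url)


# Assumed sample rules: this set blocks Googlebot from the feed itself.
BLOCKING_RULES = """\
User-agent: Googlebot
Disallow: /atom.xml
"""

# Assumed sample rules: this set blocks only an unrelated directory.
OPEN_RULES = """\
User-agent: Googlebot
Disallow: /private/
"""

# Placeholder feed URL, not a real site's feed.
FEED_URL = "http://www.example.com/atom.xml"

if __name__ == "__main__":
    print("blocking rules allow feed:", feed_allowed(BLOCKING_RULES, FEED_URL))
    print("open rules allow feed:", feed_allowed(OPEN_RULES, FEED_URL))
```

<p>If the first kind of rule shows up in your own robots.txt, removing or narrowing the <code>Disallow</code> line that covers the feed path is the sort of unblocking the post describes. Google's tester remains the authoritative check.</p>]]>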
    </content>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2009://1.16953-comment:165937</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2009://1.16953" type="text/html" href="http://www.readwriteweb.com/archives/google_now_scanning_rss_atom_feeds.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_now_scanning_rss_atom_feeds.php#c165937" />
    <title>Comment from Louis Moynihan on 2009-10-30</title>
    <author>
        <name>Louis Moynihan</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>RSS getting faster has put it back in vogue... RWW</p>]]>
    </content>
    <published>2009-10-30T16:36:49Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2009://1.16953-comment:165960</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2009://1.16953" type="text/html" href="http://www.readwriteweb.com/archives/google_now_scanning_rss_atom_feeds.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_now_scanning_rss_atom_feeds.php#c165960" />
    <title>Comment from Arnaldo Queiroz R .F on 2009-10-30</title>
    <author>
        <name>Arnaldo Queiroz R .F</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>Good to know that Google is always innovating their products!</p>]]>
    </content>
    <published>2009-10-30T18:47:58Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2009://1.16953-comment:165968</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2009://1.16953" type="text/html" href="http://www.readwriteweb.com/archives/google_now_scanning_rss_atom_feeds.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_now_scanning_rss_atom_feeds.php#c165968" />
    <title>Comment from Realtime Top News on 2009-10-30</title>
    <author>
        <name>Realtime Top News</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>Great feature as always from google, always innovating.</p>]]>
    </content>
    <published>2009-10-30T19:15:36Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2009://1.16953-comment:166190</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2009://1.16953" type="text/html" href="http://www.readwriteweb.com/archives/google_now_scanning_rss_atom_feeds.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_now_scanning_rss_atom_feeds.php#c166190" />
    <title>Comment from Lawrence @ CRB on 2009-10-31</title>
    <author>
        <name>Lawrence @ CRB</name>
        <uri>http://www.creditrestorationbureau.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.creditrestorationbureau.com">
        <![CDATA[<p>Google never ceases to amaze me. </p>]]>
    </content>
    <published>2009-10-31T21:26:57Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2009://1.16953-comment:166291</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2009://1.16953" type="text/html" href="http://www.readwriteweb.com/archives/google_now_scanning_rss_atom_feeds.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_now_scanning_rss_atom_feeds.php#c166291" />
    <title>Comment from Gabriella on 2009-11-01</title>
    <author>
        <name>Gabriella</name>
        <uri>http://level343.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://level343.com">
        <![CDATA[<p>Always nice to see Google Daddy pushing the envelope. Now if I can only get their algorithms figured out ;)</p>]]>
    </content>
    <published>2009-11-01T18:02:30Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2009://1.16953-comment:173732</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2009://1.16953" type="text/html" href="http://www.readwriteweb.com/archives/google_now_scanning_rss_atom_feeds.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_now_scanning_rss_atom_feeds.php#c173732" />
    <title>Comment from kartę r4i on 2009-12-09</title>
    <author>
        <name>kartę r4i</name>
        <uri>http://www.r4-nintendo-ds.pl</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.r4-nintendo-ds.pl">
        <![CDATA[<p>Google has such a great team working and a pool of talented people to make new inventions for them. Google is a giant company and can never fail to amaze people.</p>]]>
    </content>
    <published>2009-12-10T04:40:52Z</published>
  </entry>

</feed>