<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" 
      xmlns:thr="http://purl.org/syndication/thread/1.0">
  <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php" />
  <link rel="self" type="application/atom+xml" href="http://www.readwriteweb.com/atom.xml" />
  <id>tag:,2009:/1/tag:www.readwriteweb.com,2008://1.6880-</id>
  <updated>2009-11-23T18:57:10Z</updated>
  <title>Comments for Google Now Knows About 1 Trillion Pages</title>
  
  <generator uri="http://www.sixapart.com/movabletype/">Movable Type 4.23-en</generator>
  <entry>
    <id>tag:www.readwriteweb.com,2008://1.6880</id>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.readwriteweb.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=6880" title="Google Now Knows About 1 Trillion Pages" />
    <published>2008-07-25T23:31:43Z</published>
    <updated>2008-07-25T23:43:41Z</updated>
    <title>Google Now Knows About 1 Trillion Pages</title>
    <summary>Google today announced that it is now indexing the amazing amount of 1 trillion unique URLs. Google&apos;s first index in 1998 only had 26 million pages and by 2000 that number had jumped to 1 billion. Today, the Google index is growing by several billion pages per day alone. Not too long ago, Google used...</summary>
    <author>
      <name>Frederic Lardinois</name>
      
    </author>
    
    <category term="News" />
    
    <content type="html" xml:lang="en" xml:base="http://www.readwriteweb.com/">
      <![CDATA[<p><img alt="google150.jpg" src="http://www.readwriteweb.com/images/google150.jpg"  /><a href="http://googleblog.blogspot.com/2008/07/we-knew-web-was-big.html">Google today announced</a> that it is now indexing the amazing amount of 1 trillion unique URLs. Google's first index in 1998 only had 26 million pages and by 2000 that number had jumped to 1 billion. Today, the Google index is growing by several billion pages per day alone. Not too long ago, Google used to have a counter on the front page of its search engine, displaying the number of sites in the index, but they dropped this information from the site around 2005. </p>]]>
      <![CDATA[<p><img alt="google-questions.png" align="right" src="http://www.readwriteweb.com/images/google-questions.png"  />When there was still real <a href="http://searchengineland.com/080725-161058.php">competition</a> between search engines in the late '90s, the size of a search engine's index was one of the main ways to compare search providers. Today, the number of pages in any given search engine's index has dropped out of our collective consciousness, and that might be a good thing, as the focus has shifted from returning the most results to returning the most relevant ones. </p>

<p>That, after all, was the real advance that Google brought to the search engine market. Early search engines like Altavista, Excite, or HotBot simply weren't able to return the most useful results to users.</p>

<p>However, we are also approaching a new crossroads, where even Google's results are often polluted by spam. Yet, at the same time, the great <a href="http://www.readwriteweb.com/archives/semantic_search_the_myth_and_reality.php">promise of semantic search engines</a> is still just a promise for now. Given the <a href="http://www.readwriteweb.com/archives/google_70_percent_market_share.php">latest data</a> about the search engine market and the <a href="http://www.readwriteweb.com/archives/battle_is_over_icahn_will_join.php">end</a> of the Microsoft/Yahoo negotiations over acquiring the Yahoo search business, Google is pretty much becoming the de facto standard search engine for most people. </p>

<p>Chances are that anybody who wants to enter this market and compete with Google is simply going to be bought by the search giant, so if anything, Google's strong position is going to get even stronger in the foreseeable future.</p>

<p>For now, though, the real question about Google's search index is when it will reach a size of 1 googol...</p>

<p><em>Photo by Flickr user <a href="http://www.flickr.com/photos/myklroventine/">Mykl Roventine</a>.</em></p>]]>
    </content>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2008://1.6880-comment:61673</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2008://1.6880" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php#c61673" />
    <title>Comment from Nick Stamoulis on 2008-07-25</title>
    <author>
        <name>Nick Stamoulis</name>
        <uri>http://www.searchengineoptimizationjournal.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.searchengineoptimizationjournal.com">
        <![CDATA[<p>Wow..that's phenomenal! Google = Headed for world domination. </p>]]>
    </content>
    <published>2008-07-26T00:16:48Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2008://1.6880-comment:61679</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2008://1.6880" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php#c61679" />
    <title>Comment from Yasser on 2008-07-25</title>
    <author>
        <name>Yasser</name>
        <uri>http://yasser.hastalent.net</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://yasser.hastalent.net">
        <![CDATA[<p>I wonder how many of these pages are fakes/spam.</p>]]>
    </content>
    <published>2008-07-26T01:46:07Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2008://1.6880-comment:61682</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2008://1.6880" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php#c61682" />
    <title>Comment from Dan Grossman on 2008-07-25</title>
    <author>
        <name>Dan Grossman</name>
        <uri>http://www.dangrossman.info</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.dangrossman.info">
        <![CDATA[<p>Google's index is in the tens of billions of pages, nowhere near a trillion. They only announced that they've identified a trillion unique URLs; they index only a tiny fraction of those. There are rumors that this announcement was purposely put out ahead of some major news coming next week from one of their competitors...</p>]]>
    </content>
    <published>2008-07-26T02:06:34Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2008://1.6880-comment:61689</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2008://1.6880" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php#c61689" />
    <title>Comment from Tinus Guichelaar on 2008-07-25</title>
    <author>
        <name>Tinus Guichelaar</name>
        <uri>http://www.tractorfan.eu</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.tractorfan.eu">
        <![CDATA[<p>I have about 120,000 pages on site A. Yahoo has indexed about 90%. Google has indexed about 50%. MSN Live has indexed just 59 pages!</p>

<p>I'm using Google Sitemaps, Yahoo Site Explorer and Live Webmaster Central, all sites are authenticated and XML sitemaps are in place, but MSN is seriously lacking in discovering all pages. </p>

<p>I really think that they would love to add more pages, but their infrastructure can't handle it. I think MSN could use some Hadoop?</p>]]>
    </content>
    <published>2008-07-26T06:19:47Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2008://1.6880-comment:61692</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2008://1.6880" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php#c61692" />
    <title>Comment from Website Designer Perth on 2008-07-26</title>
    <author>
        <name>Website Designer Perth</name>
        <uri>http://www.igenerator.com.au</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.igenerator.com.au">
        <![CDATA[<p>I agree with Dan - smacks of sensationalism and spoiling tactics. If it turns out to be the case I for one will be disappointed with Google - they are so dominant now as to not have to behave like a bully-corporation. They'll get much more cred' positioning themselves 'for the people' as opposed to 'competing' like the likes of Coke and Pepsi.</p>]]>
    </content>
    <published>2008-07-26T07:56:50Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2008://1.6880-comment:61694</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2008://1.6880" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php#c61694" />
    <title>Comment from gregorylent on 2008-07-26</title>
    <author>
        <name>gregorylent</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>one's most cherished insights .. mere data</p>

<p>lot of info, and no wisdom ... google will never replace awareness<br />
</p>]]>
    </content>
    <published>2008-07-26T08:30:02Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2008://1.6880-comment:61711</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2008://1.6880" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php#c61711" />
    <title>Comment from panos on 2008-07-26</title>
    <author>
        <name>panos</name>
        <uri>http://www.wi-not.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.wi-not.com">
        <![CDATA[<p>Come on guys!<br />
This was a velvet article.<br />
Arrington was a lot more teasing!<br />
Is the search market changing in one week?<br />
I think not, but everybody can see that a lot of things are running in the background.</p>]]>
    </content>
    <published>2008-07-26T16:43:04Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2008://1.6880-comment:61715</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2008://1.6880" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php#c61715" />
    <title>Comment from promosyon şapka on 2008-07-26</title>
    <author>
        <name>promosyon şapka</name>
        <uri>http://www.sapkaci.biz</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.sapkaci.biz">
        <![CDATA[<p>I wonder how many of these pages are fakes/spam</p>]]>
    </content>
    <published>2008-07-26T18:54:05Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2008://1.6880-comment:61751</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2008://1.6880" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php#c61751" />
    <title>Comment from Amit Aviv on 2008-07-27</title>
    <author>
        <name>Amit Aviv</name>
        <uri>http://kaalga.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://kaalga.com">
        <![CDATA[<p>Innovation in search is stifled by the huge costs of crawling the web.<br />
I would like to see a company that opens up their crawling infrastructure, allowing developers to write small modules that will run on every page the crawler passes, and add the generated data to the main index.<br />
A developer will be able to define which kinds of pages the module runs on and at what processing rate, and will be charged according to the module's CPU consumption.<br />
There should also be a choice whether the new data is public or private, and choosing public will give a significant discount to the developer.<br />
This way it will be possible for small companies (and even individuals) to run significant-scale experiments in indexing, and to create search engines that integrate all the innovative ideas on how data should be extracted from the web.</p>]]>
    </content>
    <published>2008-07-27T12:15:39Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2008://1.6880-comment:61797</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2008://1.6880" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php#c61797" />
    <title>Comment from Dan Grossman on 2008-07-27</title>
    <author>
        <name>Dan Grossman</name>
        <uri>http://www.dangrossman.info</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.dangrossman.info">
        <![CDATA[<p>And here's the news they were trying to wash out. <a href="http://www.cuil.com" rel="nofollow">Cuil</a> launched today with an index of over 120 billion webpages, which is about 3 times larger than Google's.</p>]]>
    </content>
    <published>2008-07-28T05:44:42Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2008://1.6880-comment:61809</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2008://1.6880" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php#c61809" />
    <title>Comment from Harish Agrawal on 2008-07-28</title>
    <author>
        <name>Harish Agrawal</name>
        <uri>http://www.vedainformatics.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.vedainformatics.com">
        <![CDATA[<p>Google is a habit now. It is one of those habits that are hard to give up as well. Can it be done? Let's just Google it and find out. </p>]]>
    </content>
    <published>2008-07-28T10:27:17Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2008://1.6880-comment:61835</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2008://1.6880" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php#c61835" />
    <title>Comment from free movies on 2008-07-28</title>
    <author>
        <name>free movies</name>
        <uri>http://www.80millionmoviesfree.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.80millionmoviesfree.com">
        <![CDATA[<p>Cool that's phenomenal</p>]]>
    </content>
    <published>2008-07-28T16:25:20Z</published>
  </entry>

  <entry>
    <id>tag:www.readwriteweb.com,2008://1.6880-comment:61896</id>
    <thr:in-reply-to ref="tag:www.readwriteweb.com,2008://1.6880" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/google_hits_one_trillion_pages.php#c61896" />
    <title>Comment from Luis Pereira on 2008-07-28</title>
    <author>
        <name>Luis Pereira</name>
        <uri>http://www.stumpedia.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.stumpedia.com">
        <![CDATA[<p>If Cuil's advantage over Google is in the size of their index, then they are in big trouble.</p>]]>
    </content>
    <published>2008-07-28T23:33:25Z</published>
  </entry>

</feed>