<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" 
      xmlns:thr="http://purl.org/syndication/thread/1.0">
  <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php" />
  <link rel="self" type="application/atom+xml" href="http://www.readwriteweb.com/atom.xml" />
  <id>tag:,2008:/1/tag:72.47.210.69,2007://1.3936-</id>
  <updated>2008-05-09T18:13:34Z</updated>
  <title>Comments for Google&apos;s Udi Manber - Search is a Hard Problem</title>
  
  <generator uri="http://www.sixapart.com/movabletype/">Movable Type 4.1</generator>
  <entry>
    <id>tag:72.47.210.69,2007://1.3936</id>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php" />
    <link rel="service.edit" type="application/atom+xml" href="http://www.readwriteweb.com/cgi-bin/mt/mt-atom.cgi/weblog/blog_id=1/entry_id=3936" title="Google's Udi Manber - Search is a Hard Problem" />
    <published>2007-06-21T22:40:19Z</published>
    <updated>2007-12-16T23:11:42Z</updated>
    <title>Google&apos;s Udi Manber - Search is a Hard Problem</title>
    <summary>Udi Manber, Google&apos;s VP of Engineering, gave a brief 15-minute presentation at Supernova today entitled Search is a Hard Problem. He explained that with an audience like Supernova, he imagines we understand to some extent how difficult a problem it is, but it&apos;s probably a harder problem than we even appreciate. He laid out...</summary>
    <author>
      <name>Sean Ammirati</name>
      
    </author>
    
    <category term="Google" />
    
    <category term="Search Services" />
    
    <category term="Supernova 2007" />
    
    <content type="html" xml:lang="en" xml:base="http://www.readwriteweb.com/">
      <![CDATA[<p><img src="http://www.readwriteweb.com/images/google_logo.gif" align="right" hspace="5"
vspace="5" /><a href="http://en.wikipedia.org/wiki/Udi_Manber">Udi Manber,</a> Google's
VP of Engineering, gave a brief 15-minute presentation at Supernova today entitled
<i>Search is a Hard Problem</i>. He explained that with an audience like Supernova, he
imagines we understand to some extent how difficult a problem it is, but it's probably a
harder problem than we even appreciate. He laid out three reasons why this is the
case:</p>

<ul>
<li>Scale and diversity are almost beyond comprehension</li>

<li>Expectations and needs will continue to grow</li>

<li>20 to 25% of the queries we see today, we have never seen before</li>
</ul>

<p>I found the third point quite amazing. I would think with the number of queries that
Google processes, they would have seen a much higher percentage of the queries
before.</p>]]>
      <![CDATA[<h2>A Deeper Understanding</h2>

<p>Next, Udi explained that there are three levels involved in trying to deliver relevant
information back to users:</p>

<ul>
<li>Users and Queries</li>

<li>Models</li>

<li>Languages</li>
</ul>

<h3>Users and Queries</h3>

<p>Udi gave some examples of Google's ability to understand the difference between two
very similar queries. For example, Google understands that 'GM' stands for 'General
Motors', while 'GM foods' is actually 'genetically modified.' If you search for 'B&amp;B
AB', Google knows that is 'bed and breakfast in Alberta', while 'Ramstein AB' is
'Ramstein Airbase'.</p>

<p>Google also will recommend queries that may deliver better results. For example, if
you query 'Types of dogs' it will give results, but also suggests 'breeds of dogs' as a
better search.</p>

<p>He then explained that they still can't find all the answers. As a fun example, he
said the query "Why Search is Hard" is actually a very difficult query for Google to
parse.</p>

<h3>Models</h3>

<p>Next, Udi reviewed some new Google search functionality which, while not live yet,
will be soon. Apparently, Google is going to start trying additional queries based on
certain user queries. For example, the query "How much does it cost for an exhaust
system" will pull up results from "cost of an exhaust system." Beyond just removing
certain general words, they are also <b>interpreting the question</b> as part of the
model; for example, consider the following two rewrites:</p>

<ul>
<li>'overhead view of bellagio pool' to 'bellagio pool pictures'</li>

<li>'fedora 5 losing network connections' to 'fedora 5 network
configuration'</li>
</ul>

<h3>Different Queries for Different Locations</h3>

<p>Finally, Udi talked about how results need to be different when the query is conducted
in different locations. For example, the query 'government' needs to return results about
your country's government. I haven't tried this in other countries, but here in San
Francisco the first result is for the US Government.</p>

<p>He also reviewed a tool at <a href="http://www.google.com.eg/">Google.com.eg</a>,
which actually takes a query in another language, translates it to English, runs the
query, and then returns the results in that language. You can actually view the page in
that language. There is a whole suite of language tools Google seems to be leveraging
at: <a
href="http://www.google.com/language_tools">http://www.google.com/language_tools</a></p>

<p>I'm surprised there aren't more copyright issues here, but I'm not a lawyer. For an
example, here is <a
href="http://translate.google.com/translate?hl=ar&amp;sl=en&amp;u=http://readwriteweb.com/&amp;sa=X&amp;oi=translate&amp;resnum=1&amp;ct=result&amp;prev=/search%3Fq%3Dsxsw%26hl%3Dar%26sa%3DG">
the Read/WriteWeb homepage translated into Arabic</a>. Apparently, when Udi was demoing
this for Larry Page, he asked why the images weren't translating. Obviously, there is
still work to be done, but it is quite amazing.</p>

<p><img src="http://www.readwriteweb.com/images/RWWArabic.jpg" /></p>

<h2>Conclusion</h2>

<p>After listening to Udi's talk, I must agree that while I thought search was complex, I
probably underestimated some of the areas of real difficulty. It is amazing to step back
and think about how conceptually complex this is. It sheds new light on many of our <a
href="http://www.readwriteweb.com/archives/retrospective_day_without_google_lexxe_powerset.php">
challenging experiences</a> around the <a
href="http://altsearchengines.com/2007/06/10/a-day-without-google/">AltSearchEngine's Day
without Google</a>.</p>]]>
    </content>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.3936-comment:33897</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.3936" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php#c33897" />
    <title>Comment from Josh Catone on 2007-06-21</title>
    <author>
        <name>Josh Catone</name>
        <uri>http://www.readwriteweb.com/</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.readwriteweb.com/">
        <![CDATA[<p>Regarding the 20-25% new queries stat, do you know if that is a percentage of total queries (that seems high, as you said) or a percentage of the number of unique queries (i.e., 'football' searched 40,000 times counts as one query, the same as 'skippy peanut butter recipes with pictures' searched once)? The latter would be less shocking. :)</p>]]>
    </content>
    <published>2007-06-21T22:56:42Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.3936-comment:33898</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.3936" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php#c33898" />
    <title>Comment from Jacinta on 2007-06-21</title>
    <author>
        <name>Jacinta</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>I'm based in Australia - Google's first result for me typing 'government' is www.australia.gov.au, the Australian government portal</p>]]>
    </content>
    <published>2007-06-22T00:48:11Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.3936-comment:33899</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.3936" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php#c33899" />
    <title>Comment from Sean Ammirati on 2007-06-21</title>
    <author>
        <name>Sean Ammirati</name>
        <uri>http://www.profitablesignals.com/blog/</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.profitablesignals.com/blog/">
        <![CDATA[<p>Josh, </p>

<p>Good question. I'm not sure, but I would imagine it is unique queries, which makes much more sense. This also helps me reconcile some of the things Jason C has said about Mahalo regarding the 'head' of search.</p>

<p>- Sean</p>]]>
    </content>
    <published>2007-06-22T00:58:30Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.3936-comment:33900</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.3936" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php#c33900" />
    <title>Comment from Manoj on 2007-06-21</title>
    <author>
        <name>Manoj</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>Sean, This looks like a repeat of Udi's presentation during Google's universal search PR event a month or so back.</p>]]>
    </content>
    <published>2007-06-22T01:22:00Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.3936-comment:33901</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.3936" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php#c33901" />
    <title>Comment from Billy on 2007-06-21</title>
    <author>
        <name>Billy</name>
        <uri>http://www.fatinfo.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.fatinfo.com">
        <![CDATA[<p>The Arabic translation looks neat. Were you able to find someone who understands Arabic to confirm its accuracy?</p>

<p>Also, I can't imagine Google being able to translate images for a very long time (GIF and JPG images are pixels, and have no text values). Think about how long ago it was first suggested that Flash SWF files were going to be indexed, yet search results for SWF files today are close to nil.</p>]]>
    </content>
    <published>2007-06-22T03:35:52Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.3936-comment:33902</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.3936" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php#c33902" />
    <title>Comment from Paul on 2007-06-21</title>
    <author>
        <name>Paul</name>
        <uri>http://wizag.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://wizag.com">
        <![CDATA[<p>The gm cars vs. gm food is nothing more than a synonym look-up. See the gm vs. genetically modified search examples in this post: <a href="http://stack-up.com/stack/2007/05/22/searchology-no-magic-behind-googles-magic/" rel="nofollow">http://stack-up.com/stack/2007/05/22/searchology-no-magic-behind-googles-magic/</a></p>]]>
    </content>
    <published>2007-06-22T06:13:47Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.3936-comment:33903</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.3936" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php#c33903" />
    <title>Comment from Oli on 2007-06-22</title>
    <author>
        <name>Oli</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>Re the automatic translation, you can also take an Arabic web page, e.g. from <a href="http://news.bbc.co.uk/hi/arabic/news/" rel="nofollow">BBC's Arabic News Service</a> and translate it <a href="http://translate.google.com/translate?u=http%3A%2F%2Fnews.bbc.co.uk%2Fhi%2Farabic%2Fnews%2F&langpair=ar%7Cen&hl=en&ie=UTF-8&oe=UTF-8&prev=%2Flanguage_tools" rel="nofollow">into English</a>. Makes it easier to assess than in the other direction ;-)</p>]]>
    </content>
    <published>2007-06-22T12:52:23Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.3936-comment:33904</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.3936" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php#c33904" />
    <title>Comment from troma on 2007-06-22</title>
    <author>
        <name>troma</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>GM food<br />
in GM food was served in paper plates..<br />
produced by GM food containers leaked..<br />
at GM food is the favorite pass-time..</p>

<p>Statistics/popularity does not always work! This is not understanding! Among thousands of employees, don't they have one linguist to correct these statements?</p>]]>
    </content>
    <published>2007-06-23T00:30:18Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.3936-comment:33905</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.3936" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php#c33905" />
    <title>Comment from kaz on 2007-06-24</title>
    <author>
        <name>kaz</name>
        <uri></uri>
    </author>
    <content type="html" xml:lang="en" xml:base="">
        <![CDATA[<p>Good Article :)</p>]]>
    </content>
    <published>2007-06-24T20:15:24Z</published>
  </entry>

  <entry>
    <id>tag:72.47.210.69,2007://1.3936-comment:33906</id>
    <thr:in-reply-to ref="tag:72.47.210.69,2007://1.3936" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php"/>
    <link rel="alternate" type="text/html" href="http://www.readwriteweb.com/archives/udi_manber_search_is_a_hard_problem.php#c33906" />
    <title>Comment from Zip Drugs legal discount online pharmacy on 2007-06-26</title>
    <author>
        <name>Zip Drugs legal discount online pharmacy</name>
        <uri>http://www.zipdrugs.com</uri>
    </author>
    <content type="html" xml:lang="en" xml:base="http://www.zipdrugs.com">
        <![CDATA[<p>It is an interesting attempt to deliver better search results, however when Google tries to "guess" what you are searching for it will not deliver what you are actually searching for.</p>]]>
    </content>
    <published>2007-06-26T14:41:14Z</published>
  </entry>

</feed>