search engine - ReadWriteWeb http://www.readwriteweb.com/feeds/tag/search engine en Copyright 2009 Richard MacManus readwriteweb@gmail.com Sat, 21 Nov 2009 05:00:00 -0800 http://www.sixapart.com/movabletype/?v=4.23-en http://blogs.law.harvard.edu/tech/rss

80Legs: A Web Crawler as a Service

80Legs is a web crawling and online content analysis service that first impressed us back in April at the Web 2.0 Expo. At that time, the company was launching into a private beta, but today, at the DEMOfall 09 conference, it is going live. Since that initial debut, the company has been working on scaling out the performance and power of its service while also preparing to launch a new feature that should appeal to developers and non-developers alike: an "app store." This feature allows 80Legs users to write applications that run on top of the 80Legs service and gives them the ability to share those apps with others.

With 80Legs, anyone can have their very own search engine to command and control, and now thanks to the apps, they can have it do anything they like with just a click of a button.

What 80Legs does is no easy feat. It gives its users access to 50,000 computers that can crawl up to 2 billion web pages per day. Yes, it's like having your own little search engine that you can rent for a small fee. How small? 80Legs says it is about 50% less expensive than any competing service out there.
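For a rough sense of scale, the figures quoted above can be turned into a per-machine load. This is only a back-of-the-envelope sketch: it assumes the crawl is spread evenly across all 50,000 computers, which the article does not confirm, and the function name is purely illustrative.

```python
# Figures quoted in the article. Even distribution across machines is an
# assumption made for illustration, not something 80Legs has stated.
COMPUTERS = 50_000
PAGES_PER_DAY = 2_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60


def per_machine_load(pages_per_day: int, computers: int) -> tuple[float, float]:
    """Return (pages per machine per day, pages per machine per second)."""
    per_day = pages_per_day / computers
    per_second = per_day / SECONDS_PER_DAY
    return per_day, per_second


per_day, per_second = per_machine_load(PAGES_PER_DAY, COMPUTERS)
print(f"{per_day:,.0f} pages per machine per day")
print(f"{per_second:.2f} pages per machine per second")
```

Under that assumption each machine handles a modest trickle of pages, which suggests that the headline number comes from the size of the pool rather than from any single fast crawler.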

While consumers may not have a use for a service such as this, there is an extensive market that does. 80Legs aims to attract customers from a wide variety of disciplines, including alternative search engine developers, market researchers, IP protection services (those who go after copyright infringements, article theft, etc.), competitive intelligence services, and ad networks looking to audit their own ads and see where other ad networks are placing theirs.

80Legs App Store

In addition to the big news about the service's public launch, 80Legs is also revealing their development program. With this new feature, developers can write their own applications that run on top of the 80Legs service and then make them available to others through a soon-to-launch "app store." Here, other customers can browse and purchase apps that suit their needs whether it's something for media analysis, market research, sentiment analysis, or whatever else the developer comes up with. The developers get to set their own CPR price for the apps and get to keep 100% of the revenue earned, too.

The API for app building was actually made available to beta users a couple of months prior to DEMO, but the company plans to soon support multiple languages which will include Java, .NET, Perl, and Python, so developers can work in whichever they feel most comfortable with.

While there are other ways to crawl the web, 80Legs wants to make sure that there's nowhere else you can do it for such a small fee. If you're interested in trying 80Legs for yourself, you can do so as of today by signing up at 80Legs.com. Use the code "RWW" to receive an additional 50% credit on top of the amount you put in (first 50 users only).

http://www.readwriteweb.com/archives/80legs_a_web_crawler_as_a_service.php Products Wed, 23 Sep 2009 11:50:00 -0800 Sarah Perez
Tweetmi: Another Twitter Search Engine with a Twist

There are, of course, already numerous Twitter search engines at this point and every new one will have to offer users a very good reason to switch from their current favorite. Tweetmi is jumping into the fray with a Twitter search engine that focuses on presenting users with a more personalized view. While the service also works well as a regular real-time Twitter search engine, users who sign in to Tweetmi will also see the most active users in their Twitter stream and the top stories from the people they already follow.

In addition, Tweetmi allows users who are signed in through Twitter's OAuth login mechanism to quickly reply to and retweet any story. In this respect, Tweetmi is quite similar to Twazzup, which also gives users the ability to interact with Twitter directly. Unlike Twazzup, though, Tweetmi doesn't offer the ability to save searches.

tweetmi_large_1.png

Become a Fan

One feature that makes Tweetmi unique is that it gives users the option to become 'fans' of a certain topic. While this is definitely an interesting concept, users actually have to send out a tweet about the fact that they are now fans of 'RWW' or 'Follow Friday,' which somewhat limits the usefulness of this feature.

Another feature we liked is that the application can show you a list of all the Twitter users who tweeted a popular link. Like most Twitter search engines, Tweetmi displays a list of the most popular links about a topic in a sidebar.

Given that there are already numerous Twitter search engines and more comprehensive real-time search engines like OneRiot on the market, Tweetmi will probably have a bit of a struggle to attract a dedicated user base. It is, however, a perfectly capable Twitter search engine that offers all the typical features you would expect, and it is definitely worth a try.

http://www.readwriteweb.com/archives/tweetmi_another_twitter_search_engine_with_a_twist.php News Fri, 11 Sep 2009 13:30:52 -0800 Frederic Lardinois
Caffeine: Google Tests New Search Infrastructure

Just as Facebook announces internal search for public notes, Google counters with an effort to improve on its existing services. In today's blog post, the company unveiled its new Caffeine search infrastructure to web developers. The question is, will Caffeine enhance performance or lead to user anxiety?

caffeine_google_aug09c.jpg

"It's the first step in a process that will let us push the envelope on size, indexing speed, accuracy, comprehensiveness and other dimensions," says a joint post from engineers Matt Cutts and Sitaram Iyer. "Web developers and power searchers might notice a few differences, so we're opening up a web developer preview to collect feedback."

In other words, the company aims to crawl the web, index it well, and deliver fast, relevant results - something it's always aimed to do. After a number of searches, it's clear that Caffeine returns more results in a shorter timeframe; however, how results appear remains a mystery.

From a consumer standpoint, Caffeine is identical to regular Google search save for shaving off a few precious half seconds. But for search engine marketers, the new indexing system adds another hurdle to their jobs. While it's likely that Google will have at least a month's head start in outrunning them, in the past savvy keyword specialists have always managed to make their mark.

To help improve Caffeine, search in the sandbox and click the "Dissatisfied? Help us improve" link to offer feedback.

http://www.readwriteweb.com/archives/caffeine_google_tests_new_search_infrastructure.php Google Mon, 10 Aug 2009 18:14:46 -0800 Dana Oshiro
Even Social Search Needs an Algorithm: Arguing Against Data Entry As Search Engine

With advance apologies to the hard-working PR folks and startup companies who have pitched us their social search engines this week, there is a rising menace in new media: A cluster of sites that call themselves user-powered search engines.

Much in the vein of the failed Wikia Search (the abandoned brainchild of Wikipedia founder Jimmy Wales), these engines purport to "crowdsource" intelligence about URLs and search terms by allowing users to create profiles and submit, submit, submit content. Stumpedia and Gurutoy are two products in this category. Each offers the excitement of multimedia, semantic, "neue search" capabilities, and each delivers astonishingly dysfunctional results.

Exhibit A: Stumpedia

Stumpedia calls itself "the human-powered search engine... a personalized social & real-time collaborative search engine that relies on human participation to index, organize, and review the world wide web. Stumpedia does not depend on bots, algorithms, or company insiders to make decisions on the relevance and ranking of search results."

Because god knows those algorithms have done nothing for search in the past. As for the "company insiders" part, we're drawing a blank on precisely what that means (Megan McCarthy, was this aimed at you?) and defer to the wisdom of the all-knowing RWW commenters to fill us in.

Stumpedia currently boasts around 28,000 URLs and 75,000 search terms in its digital lexicon - hardly enough to allow for a good or interesting browsing experience. By way of comparison, Wikia Search had indexed about 30 million websites before Jimmy Wales could say with a straight face that the product didn't suck. Just because we know he likes the attention, we ran a search on Robert Scoble:

As you can see, the single returned result was entirely irrelevant to the search term; Scoble's name was nowhere to be found on the linked-to page.

And sadly, for all the talk about insiders not gaming the system, the most relevant results in many searches we tried came from the Stumpedia founder/CEO. Here's a look at his profile and submissions:

We wanted to run a search for irony, but apparently the CEO hasn't submitted anything ironic lately.

Exhibit B: Gurutoy

Gurutoy recently appealed to us for coverage, styling itself "a visual search engine run completely by you." According to its homepage, Gurutoy asks users to "tell us what is cool and interesting in the worldwide web, and it'll be posted up in Gurutoy for others to see. Search Gurutoy using keywords and phrases and you'll see an array of websites uploaded by you and other users."

Assuming that the 99 percent of Internet users who are not tech bloggers use search engines because they need to find accurate, relevant results, the bar of expectations rests rather high.

For example, if a user searches for "orange juice," he might not expect to see this:

As can be seen by mousing over the thumbnails, the two results returned for that search term were both uploaded by a Los Angeles haberdasher. The results were tagged with relevant ("plaid," "headware") as well as damn perplexing ("brad suzuki," Gurutoy's CEO) terms, and we're still not sure how this cap was returned as a result for "orange juice."

Distressingly, a recommended search for "action figures" returned dismally irrelevant results:

Two of the 13 featured results had information on action figures, and none of the images contained action figures.

The Problem with Reliance on UGC

When thinking about building a "visual search engine," entrepreneurs must consider the relevance of the images as well as the URLs. They are faced with the reality of competing with Flickr and Google Images, both of which have powerful tech backed up and fed by a critical mass of user-generated information in the form of tags. They also must compete with Google, Yahoo!, and Microsoft Live search engines on the relevance of results' content.

Expecting that users will do the kind of data entry necessary to create a competitive product in this arena is ludicrous. The Internet already has a Wikipedia, so the kind of people with the knowledge and skill sets and the sheer time to invest have likely already picked their hobby and are eyeball-deep in barnstars.

However, Suzuki sees it differently: "The goal of Gurutoy is to become a visual directory of websites (any subject) on the net. But in a cool way, with the pictures." He compares the site to YouTube and has every faith in the power of user-submitted content.

"Gurutoy does not use any spiders to search the web for content. What we're counting on is for the masses to catch on with Gurutoy and to grow the content to make it relevant."

I asked SproutBox cofounder and venture tech/capital expert Mike Trotzke what he thought of algorithm-free social search engines.

"Oh, you mean a purely spam search engine with no users? Yeah, they suck.

"If you are going to try to introduce UGC into search engines, you've got to have some indexing first. It has to have some value out of the gate or no one will care. Not even Jimmy Wales could pull that off."

Trotzke went on to say that if any company would be able to incorporate valuable user-generated information into search, it would be Google. And he doesn't imagine that the search giant would be interested in buying a smaller company for its data or technology.

"[Google has] the vote-up technology already ready in waiting. They just need to tweak and start giving weight to all the data they have been collecting in SearchWiki notes for months already."

The Spam Question

In Social Media 101, we learn that where there is user-generated content (i.e., where anyone is allowed to tag and submit unreviewed content at no charge), there is spam.

Right now, most of the "users" interested in submitting content to these sites are retailers, enterprise sites, or others with a vested fiscal interest in driving traffic to their URLs. As you can see in this screenshot, MyJewelersPlace.com is spamming the heck out of Stumpedia:

Any site that permits user-submitted links is going to suffer the predictable, lamentable onslaught of black-hat, link-stuffed atrocities, especially for competitive verticals (I personally dare you to search any of these sites for iPods or Viagra). Especially when adoption rates are low to begin with, UGC search engines are at high risk for being overrun by this kind of spam. This begins a circular process wherein potential users are scared or bored away from the site when search results are irrelevant, desperate pleas for clickthrus and credit card information.

For generic, noncommercial queries, few or no results will be returned. For more consumer-minded searches, results will be skewed and often uninformative. Allowing the community to police itself by flagging suspicious content is a necessary feature for any UGC site. However, when the amount of spam already outnumbers the amount of useful content on a relatively new search platform, what users are going to stick around long enough to register an account, let alone slog through the spam, planting flags left and right?

So, with more apologies to the startups named above, social search still needs to amass and index content using traditional search algorithms if results are to be useful to the end user. Then again, you could just let Google have this one and wait for your next big idea.

http://www.readwriteweb.com/archives/social-search-needs-an-algorithm.php Search Services Thu, 21 May 2009 06:00:00 -0800 Jolie O'Dell
Wolfram Alpha Launch Starts Tonight at 5pm Pacific: Here is What You Need to Know

Wolfram Alpha, the new "computational knowledge engine" from the makers of Mathematica, is scheduled to officially launch on Monday next week, but Alpha will 'soft launch' tonight, starting with a live webcast of the launch preparations. After that, Alpha will gradually open its doors to everybody throughout the weekend. We have had a chance to test a preview version of Alpha for the last seven days, and we are quite impressed with what we have seen so far. Here are some resources for getting up to speed with Alpha, as well as some recommendations for getting started with this powerful, but sometimes frustrating, new tool.

Update: Alpha is now up and running, though the team might take it down at any point during the weekend to fix any problems it discovers during its tests.

Wolfram and his team will chronicle the launch in a live webcast on justin.tv, which will start at 5pm Pacific/8pm Eastern tonight. We are not quite sure how Wolfram will manage the gradual launch over the weekend, though we assume that if you are on the preview waiting list, you will get first dibs.

Some Things to Keep in Mind

Here are a few things to keep in mind as you start experimenting with Alpha tonight or over the weekend:

  • Wolfram Alpha is not a general purpose search engine - it does not directly compete with Google and if you treat it like Google, you will inevitably be disappointed.
  • Check out the copious examples from the home page - they will give you a good idea of the type of queries that Alpha can handle best.
  • Here is one thing we can almost guarantee: you will be disappointed at first (especially if you were expecting a Google killer).
    Alpha is a great tool, but it takes some time to learn its limits and strengths. Unlike with Google, some searches simply don't return any results at all.

Using Alpha

Once you get access to Alpha, here are some tips on how to structure your searches, along with some searches you should try:

  • If Alpha doesn't give you the results you are looking for, try a different way of phrasing the query - sometimes even capitalization can make a difference!
  • Try to search for anything that can be packed into data snippets (height of a mountain, chemical formulas, population stats, stars, planets, etc.).
  • Try combining two searches. Alpha usually does a great job with these kinds of queries.
  • Feed it some math problems. The fact that Alpha is based on Mathematica really shines through here.
  • Do some test searches for food items or drugs.
  • Let it solve some word puzzles for you. Just head to the "Words & Linguistics" section for some good examples.
  • If you're a sports fan, look up some baseball or football stats: "passing touchdowns Dallas Cowboys, Denver Broncos."

Our Wolfram Alpha resources

Screenshots: See Wolfram Alpha in Action

Our Preview: Wolfram|Alpha: Our First Impressions

Wolfram|Alpha will be an amazing product, but it's quite different from Google and other search engines. Indeed, maybe it is actually wrong to call it a search engine at all (and Wolfram prefers to call it a "computational knowledge engine"). If you wanted to know what sights to see on your next trip to New York City, for example, Alpha, from what we've seen so far, will not be able to help you.

Our Review: Mixed Emotions: Our First Hands-On Test Of Wolfram|Alpha

At the end of the day, Wolfram Alpha is a tool; and once you take some time to learn its ways, it can become a very powerful tool. While a lot of media outlets have compared Alpha to Google, we think that this is a moot question. Alpha simply doesn't want to be a Google killer and, in its current form, won't take market share away from Google. As we reported in our first look at Alpha a few weeks ago, Alpha will take away some users from Wikipedia (but it's no Wikipedia killer either), as it can give those users quick and easy access to a wide range of data.

For now, we expect Alpha to remain a niche player. It will be a highly valuable tool for a small subset of potential users. Though, hopefully, over time the team will add more and better databases to draw information from so that Alpha will become more useful to a mainstream audience as well.

Videos

Stephen Wolfram's screencast demo of Alpha.

First Public Demo of Wolfram Alpha at the Berkman Center:

Stephen Wolfram and colleagues discuss the launch preparations:

Setting up the Wolfram Alpha data center (one of five W|A data centers):

Other Wolfram Alpha Reviews

Technology Review (compares Alpha to Google):

Generally, I did not use search terms that clearly had no computable answer (and therefore would have stumped Wolfram). But I also didn't throw any softballs in areas close to the heart of its makers: physics, chemistry, engineering, and genomics. On hard-core scientific questions, it gives you tons of symbols and graphics and other information that would be useful to a researcher but obscure to most people. But on many common questions for which there is no obvious data element, you will not get much help. In any event, if its plans hold, you should be able to test it out yourself in two or three weeks.

Search Engine Land (very in-depth look):

Wolfram Alpha's edge may be that it's a unique repository of general knowledge that imitates a search engine (unlike Wikipedia, which has no search engine feel). Of course, the killer combination would be for Wolfram Alpha to be partnered with a major search engine. It's something Wolfram said is being considered, though there are no formal discussions at the moment. The focus is really getting the service opened to the public and seeing how the initial reaction goes.

Telegraph:

How many times have you used the internet to calculate the answer to a simple mathematical problem, for help with calculus, or for information on the GDP of Gibraltar? If the answer's, "not often", then it's going to be quite some time before Wolfram Alpha crops up as your search engine of choice.

What Will You Ask?

If you are looking forward to the launch of Wolfram Alpha, let us know what questions you want to ask in the comments. We'll try to answer the most interesting questions (try to give us specific queries!) with links to screenshots from the Wolfram Alpha preview in the comments until about 3pm PST today.

http://www.readwriteweb.com/archives/wolfram_alpha_launch_starts_tonight.php News Fri, 15 May 2009 12:20:27 -0800 Frederic Lardinois
Mixed Emotions: Our First Hands-On Test Of Wolfram|Alpha

Wolfram Alpha, the hyped "Google killer," will officially launch on May 18, but we already got preview access to it today and had a chance to put it through its paces.

Let's get this out of the way quickly: Wolfram Alpha is not (yet?) geared towards mainstream Internet users, who, for the most part, are still better served by Google. Of course, comparing Alpha to Google isn't even fair, but most users will treat it like Google, and will most likely come away sorely disappointed. Instead, Alpha, for now, is going to be a great tool for students, engineers, and academics - and anybody who needs data quickly and knows how to interpret it. It takes some time to learn how to best use Alpha, and it still has its rough patches, but overall we have come away quite impressed, though at times we were also frustrated.

As we expected, the areas where Alpha excels are Mathematics, Engineering, Chemistry, Physics, and the Life Sciences. When it comes to the Humanities, however, Alpha isn't that interesting. When you type in the names of authors, for example, you will get basic biographical data, but not a list of the books they wrote.

One thing to keep in mind about Alpha is that it will give you data - but it will not supply meaning. Users have to interpret the data themselves.

Results: Great in Some Areas - Very Limited in Others

Sometimes, Alpha's data set can also be uneven. You can, for example, get unemployment data for states, but if you want to drill down to specific cities, Alpha has to pass. Alpha can also answer odd trivia questions like "wingspan B-29 Superfortress" (141' 2.882"), how many pharmacists there are in the U.S. (and their median wage), or how much money "The Wrath of Khan" made at the box office (including, oddly enough, the conversion of those $78.91 million into Japanese yen, British pounds, and euros).

It can also do impressive calculations (though some of our more complex queries timed out), draw a Sierpinski gasket for you, or tell you what a safe heart rate for exercising is when you are 25 years old. But while it knows who the German president was in 1984, it refused to tell us who the German chancellor was in that year. The only info we got about World War I or II was basic dates, yet Alpha can tell us how many people die per minute in Germany today (1.698) and compare that to current birth rates. Alpha can also give you nutritional information, but we weren't able to figure out how to scale this data to different weights.

Some Humor

Sometimes, some humor also shines through in the search results. When you look for "5 kilo," for example, the results will give you basic conversions, but Alpha will also tell you that 5 kilo is roughly equivalent to the weight of 2 copies of Stephen Wolfram's A New Kind of Science.

Limits

Clearly, there are holes in Alpha's data set. Most of these holes are in non-technical areas, which, in many ways, is understandable, as it would be harder to make that kind of information parsable for a system like Wolfram Alpha (though Alpha is great at solving word puzzles and anagrams). Thankfully, Alpha adds links to relevant Wikipedia articles to every results page.

Lack of Interactivity

At times, there is also a lack of interactivity that can quickly become frustrating. All the images on Alpha are just static images, for example, which means that you can't zoom in or out of a map. Or, when you search for biographical data, none of the information is linked, so that you can't just click on a person's birthplace to get more information. This means that really drilling down into a subject can be hard as you constantly have to type in new queries.

Capitalization Matters

Alpha can also be extremely finicky. When we typed in 'pdx,' it didn't know what to do with it, but when the query was capitalized it returned both information for Pursuit Dynamics, which uses PDX as its trade symbol, and the option to get info about Portland International Airport, which was the information we were actually looking for. While Google is completely agnostic when it comes to capitalization, Alpha obviously cares (maybe that is also the legacy of a complex tool like Mathematica that is the foundation of Alpha).

Alpha provides new users with an extensive set of sample queries (all of which, of course, work great). To get the most out of Alpha, it really helps to look at those to see how to best formulate your queries.

Alpha For Developers

Alpha is going to have an extensive API for third-party developers. We only had a quick look at the documents that are aimed at developers, but from what we can see, developers will pretty much get full access to Wolfram Alpha's datasets. It should be interesting to see how the developer community manages to mash this data up with other sources to even out some of the areas where Alpha doesn't quite shine yet.

A Great Tool - But Not for Everybody

At the end of the day, Wolfram Alpha is a tool, and once you take some time to learn its ways, it can become a very powerful tool. While a lot of media outlets have compared Alpha to Google, we think that this is a moot question. Alpha simply doesn't want to be a Google killer and, in its current form, won't take market share away from Google. As we reported in our first look at Alpha a few weeks ago, Alpha will take away some users from Wikipedia (but it's no Wikipedia killer either), as it can give its users quick and easy access to a wide range of data.

Alpha's biggest problem, right now, is interpreting search queries. Too often, a minor change in a query can mean the difference between no result and finding exactly what you are looking for.

We also hope that Wolfram will find a way to link more of the data and search results together. It is rather frustrating when you find something interesting in your search results, only to have to type yet another query, even though a simple click should suffice.

Great for Engineers - Not For the Mainstream

For now, we expect Alpha to remain a niche player. It will be a highly valuable tool for a small subset of potential users. Though, hopefully, over time the team will add more and better databases to draw information from so that Alpha will become more useful for a mainstream audience as well.

Note: If you would like to see more screenshots of Wolfram Alpha in action, you can find them here.

http://www.readwriteweb.com/archives/hands-on_with_wolfram_alpha.php Products Fri, 08 May 2009 14:17:58 -0800 Frederic Lardinois
See Wolfram Alpha in Action: Our Screenshots

Last weekend, we attended a web demo of Wolfram Alpha, a new "computational knowledge engine" based on the work of Stephen Wolfram. Some have dubbed Alpha a "Google killer," but, in reality, it is very different from the standard search engines that we are all familiar with today.

When we got the demo, Wolfram asked us to refrain from publishing any screenshots. Today, however, the Berkman Center posted a video of the public demo Wolfram gave earlier this week, so we think it's only fair that we share our own screenshots with our readers at this point.

Homepage

alpha_homepage_shot.png

Query #1: internet users in Europe

wolfram_alpha_3.png

Query #2: weather oakland

wolfram_alpha_2.png

Query #3: oakland

oakland_alpha.png

Query #4: uncle's uncle's brother's son

wolfram_alpha_1.png

Query #5: water 550C 3 atm

alpha_water.png

Query #6: integrate x^3 sin^2 x dx

alpha_integrate.png

Query #7: bob

alpha_bob.png

Example of a copy and paste dialog:

alpha_copy_paste.png

Embedding Search Results:

alpha_embed.png

Here is the video of the public demo at the Berkman Center. It is a bit blurry, but it does show Wolfram Alpha in action:

And if you really want a look behind the scenes, here is one from the Wolfram Alpha datacenter:

http://www.readwriteweb.com/archives/see_wolfram_alpha_in_action_-_video_and_screenshots.php News Thu, 30 Apr 2009 21:42:59 -0800 Frederic Lardinois
Duck Duck Go: Silly Name, Interesting Search Engine

The search engine market is obviously dominated by a small number of big players, but that doesn't mean that small companies with interesting ideas can't still get at least a small slice of this market. One of these services is Duck Duck Go, which has a rather silly name, but turns out to be a pretty interesting search engine. Duck Duck Go aims to get its users to their desired destinations in as few clicks as possible. Instead of long lists of results, Duck Duck Go simply tries to return the most relevant links about a given topic.

Features

Whenever you do a search on Duck Duck Go, the service will try to bring up the most 'official' page first, and if the search term has a Wikipedia page, it will also include a short blurb from Wikipedia, as well as related search terms in a box at the top of the page.

For some topics, Duck Duck Go features special category pages, and it can also recognize calculations, phone numbers, zip codes, ISBN numbers, and product codes, as well as street and IP addresses.
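Duck Duck Go has not said how it recognizes these query types, so the following is only a hypothetical sketch of one way such detection could work: a handful of pattern checks tried in order. The function name `classify_query`, the US-style phone and zip patterns, and the check order are all assumptions made for illustration. ISBN check digits are not validated, and street addresses and product codes are left out.

```python
import ipaddress
import re


def classify_query(query: str) -> str:
    """Guess the type of a search query (hypothetical helper, not Duck Duck Go's code)."""
    q = query.strip()

    # IP addresses: let the standard library decide what is valid.
    try:
        ipaddress.ip_address(q)
        return "ip_address"
    except ValueError:
        pass

    # US zip code: five digits, optionally followed by a four-digit extension.
    if re.fullmatch(r"\d{5}(-\d{4})?", q):
        return "zip_code"

    # US-style phone number such as (503) 555-1234 or 503-555-1234.
    # A bare run of ten digits is ambiguous; this sketch treats it as a phone number.
    if re.fullmatch(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}", q):
        return "phone_number"

    # ISBN-10 or ISBN-13 once spaces and hyphens are removed (checksum not verified).
    digits = re.sub(r"[\s-]", "", q)
    if re.fullmatch(r"(\d{9}[\dXx]|\d{13})", digits):
        return "isbn"

    # Simple arithmetic: only digits, whitespace, and operators, with at least one of each.
    if (
        re.fullmatch(r"[\d\s.+\-*/()]+", q)
        and re.search(r"[+\-*/]", q)
        and re.search(r"\d", q)
    ):
        return "calculation"

    return "text"


for sample in ["192.168.0.1", "97201", "(503) 555-1234", "0-306-40615-2", "2 + 2 * 3", "duck duck go"]:
    print(f"{sample!r} -> {classify_query(sample)}")
```

The order of the checks matters because the patterns overlap: a hyphenated phone number also looks like subtraction, so the more specific patterns are tried before the arithmetic one.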

Judging from the results we have seen, it seems like Duck Duck Go actually gets a lot of its information from Wikipedia, though it also uses Yahoo's BOSS service to provide users with standard search results when the service can't find better information on Wikipedia.

Duck Duck Go also does a great job at providing users with options for disambiguation, which also look like they are based on Wikipedia's disambiguation pages. If you search for "Berlin," for example, Duck Duck Go will ask you if you are looking for the German capital, an album from Lou Reed, or a town in Connecticut.

Firefox Toolbar and iPhone App

Duck Duck Go also has a Firefox toolbar, which just came out of beta today, and the company boasts that this toolbar can prevent users from going to over 44 million spam or parked domains (based on a list maintained by the Parked Domains Project).

The company also provides an iPhone app, as well as a number of blog widgets that are not directly related to its core business.

We like the simplicity of the service, and the company's focus on getting users to results quickly by mashing up data from Yahoo and Wikipedia works well for most search terms. In many ways, it actually feels a bit like an automated version of Mahalo. Of course, Duck Duck Go's name might not exactly help it gain mainstream traction, but other search engines before it also had seemingly silly names and they did quite well in the marketplace.

http://www.readwriteweb.com/archives/duck_duck_go_silly_name_interesting_search_engine.php Products Thu, 30 Apr 2009 11:03:46 -0800 Frederic Lardinois
SearchMonkey Keeps Getting Smarter: Now Embeds Videos, Games, and Documents

Yahoo today announced a new feature for SearchMonkey that makes it very easy for site owners to embed Flash videos, games, and documents directly on the Yahoo Search results page. The first sites to make use of this new feature are Hulu, Metacafe, and YouTube. Whenever a video from these sites appears in your search results, you can now watch it immediately in an embedded player right on the search results page.

SearchMonkey supports a number of popular video players, including Hulu, YouTube, and Metacafe, as well as documents from Scribd and Slideshare, and Playcrafter games. Embedding these documents in the search results is relatively easy, and Yahoo provides content owners with an extensive set of helpful documents to get them started. To embed a video, for example, a developer only needs to add two lines of code. Videos are already appearing in Yahoo's search results now, and documents and games will become available in the next month or so.

Google, of course, also shows thumbnails for YouTube clips in its search results, but clicking on these will take you to YouTube and won't open the video player right on the page.

Yahoo says that it wants to make it easier for developers to make use of SearchMonkey. SearchMonkey is an extremely powerful tool, but it can also be very hard to use for somebody who doesn't have the technical knowledge required to create a SearchMonkey app. Thanks to this new feature, even novice webmasters will now be able to embed some of the most popular forms of content on Yahoo's search results page.

As we have said before, Yahoo continues to develop new and innovative ways to enhance its search, but so far, this hasn't made too much of a dent in Google's market share. Breaking Google's momentum will be very hard for any player in the search engine market, but if anything, Yahoo is clearly showing that it is not willing to throw in the towel just yet.

http://www.readwriteweb.com/archives/searchmonkey_embeds_videos_documents_and_flash_games.php News Thu, 12 Mar 2009 12:18:07 -0800 Frederic Lardinois
Hitwise: Search Queries are Getting Longer

According to Hitwise, search queries on all the major search engines are starting to get longer and longer (PDF). While the average search query is still around two words long, queries that are longer than four words have become increasingly popular over the last twelve months.

Hitwise's latest data also confirms that Google's market share in the search business is continuing to grow at a steady clip (9% year-over-year). Year-over-year, all of Google's larger competitors lost ground, though at least between December and January, both Yahoo and Ask.com saw a very minor increase in their market share.

Longer Search Queries

Year-over-year, one- and two-word search engine queries became slightly less popular, while the number of three-word queries remained flat. Instead, a growing number of users are now opting to use longer queries. Overall, longer search queries have increased ten percent over the last year.
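If you want to run the same kind of breakdown on your own query logs, a few lines of Python will do. This is only a sketch of the general idea and not Hitwise's methodology: the function name `query_length_shares` and the choice to pool everything of five words or more into one bucket are our assumptions.

```python
from collections import Counter


def query_length_shares(queries, max_bucket=5):
    """Return the share of queries per word count.

    Queries with max_bucket or more words are pooled into the max_bucket
    key; empty queries are skipped.
    """
    counts = Counter()
    for query in queries:
        words = len(query.split())
        if words == 0:
            continue
        counts[min(words, max_bucket)] += 1
    total = sum(counts.values())
    return {length: counts[length] / total for length in sorted(counts)}
```

Running this once per month over a log and comparing the resulting shares would show whether the longer buckets are gaining ground over time.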

This is an interesting trend, and it could be interpreted in a variety of ways. This could mean that a growing number of users are finding less value in the search results they get from relatively unspecific, short queries. It could also indicate that users are becoming more sophisticated in how they structure their queries when they are looking for very specific answers.

Do you have a theory why more users are turning to longer search queries? Feel free to let us know in the comments.

http://www.readwriteweb.com/archives/hitwise_search_queries_are_getting_longer.php News Tue, 24 Feb 2009 14:03:31 -0800 Frederic Lardinois
FriendDeck: A FriendFeed Search Tool

FriendDeck is a new web-based interface designed for performing searches across the social web aggregation service, FriendFeed. Having obviously taken inspiration from the popular Twitter desktop application, TweetDeck, FriendDeck displays information in columns that spread across your screen, allowing you to track multiple search terms within the same window. As the individual items appear, you have the option of clicking "like" or commenting inline on the postings.

Why Search FriendFeed?

FriendFeed, the service that allows people to aggregate their activities across the social web, is a great place to find what sorts of things people are talking about and what they are saying. In some ways, FriendFeed is better for "real-time" web searches than Twitter because a FriendFeed search will not only return Twitter posts, but will also include shared RSS feeds, Facebook status updates, items posted natively in FriendFeed itself, stories being promoted on social news web sites like Digg.com, and much more. However, FriendFeed's user population is smaller than Twitter's and tends to consist of people who are more technology-focused, so the results will be somewhat skewed in that direction.

Although useful, searching FriendFeed today still leaves a lot to be desired. That's where FriendDeck can help. After authenticating with your FriendFeed username and remote key, you can kick off searches from the box at the top of the FriendDeck window. Each search term will then display in its own column within FriendDeck. The end result is a web app that very much resembles TweetDeck's desktop application, which also lets you display search terms in columns. However, unlike FriendDeck, TweetDeck additionally lets you organize your Twitter friends into groups in order to follow and track different sets of users along with your search queries.

What FriendDeck Won't Do

Unfortunately, FriendDeck only allows for monitoring searches, not groups. Perhaps because FriendFeed already includes a "lists" feature, FriendDeck's creator didn't think to add the ability to simultaneously track different groups of people. That's disappointing to say the least, since tracking lists (groups) on FriendFeed means having to constantly switch between them to see the latest news from each group. What we wouldn't give for a TweetDeck-inspired FriendFeed app that let us track lists, rooms, and search terms like this!

That said, there are still a couple of tricks you can do with FriendDeck in order to see more than just traditional searches. You can also:

  • See a user's likes - type in the query likes:{username} (Ex: likes:sarahintampa)
  • See a user's comments - type in the query comments:{username}
  • See a user's friends - type in the query friends:{username}
  • See a list of posts relating to a URL - type in the query url:{url.com}
  • See a list of posts about a domain - type in the query domain:{domain}

Although those custom queries are certainly handy, we would love to see FriendDeck do more. If you also have suggestions for what you would like to see in FriendDeck, you can join their FriendFeed room (http://friendfeed.com/rooms/frienddeck) or you can email the developer Paul Kinlan at paul@frienddeck.com.

http://www.readwriteweb.com/archives/frienddeck_a_friendfeed_search_tool.php Products Mon, 19 Jan 2009 07:06:01 -0800 Sarah Perez
A Productive Application of Semantic Search

Noesis is a new semantic web search engine that helps scientists studying the environment access and retrieve the research data they need. Developed at the University of Alabama in Huntsville, the new engine has the potential to enable scientists and researchers everywhere to perform more productive and focused searches thanks to the semantic technology Noesis uses.

About Noesis

The Noesis search engine (PDF) differs from regular search engines in that it uses semantics to help its users better shape their search queries. This leads to better, more accurate, and more complete sets of search results. Those results can then be refined even further by Noesis' end users if necessary.

The goal of the Noesis project is to provide scientists working in the field of Atmospheric Science a way to better search through the "hidden web" of scientific catalogs that traditional search engines cannot reach. Because these catalogs are built using a standard vocabulary, the most efficient searches on the catalogs involve using specific terminology.

To create Noesis, researchers simply annotated those specific vocabulary terms with ontologies - the machine-readable definitions for the words that help computers understand the concept of the term and its relationship to other terms. Of course, annotations alone do not make a semantic web search engine. The ontologies must be coupled with a tool that's capable of searching through them. To that end, Noesis employs something they call the Ontology Interface Service (OIS), a SOAP-based web service interface to an inference engine. When a user performs a search, the OIS is also immediately searched for associated concepts. The Specializations and Generalizations discovered are returned in a tree structure which the user can navigate further. Synonyms and related terms are also shown, and, using checkboxes, they can be appended to the original query to refine it further.
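The query-refinement flow described above can be sketched in a few lines of Python. To be clear, this is a toy model and not Noesis or its OIS web service: the `Concept` class, the `related_concepts` and `refine_query` helpers, and the two-entry ontology are all made up for illustration, and a real system would query an inference engine rather than a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One ontology entry with its related terms (hypothetical structure)."""
    name: str
    synonyms: list = field(default_factory=list)
    generalizations: list = field(default_factory=list)
    specializations: list = field(default_factory=list)


# A made-up, two-entry ontology standing in for the real vocabulary.
TOY_ONTOLOGY = {
    "cyclone": Concept(
        "cyclone",
        synonyms=["low-pressure system"],
        generalizations=["storm"],
        specializations=["hurricane", "typhoon"],
    ),
    "hurricane": Concept("hurricane", generalizations=["cyclone"]),
}


def related_concepts(term, ontology=TOY_ONTOLOGY):
    """Look up the terms a user could add to a query about `term`."""
    concept = ontology.get(term.strip().lower())
    if concept is None:
        return {"synonyms": [], "generalizations": [], "specializations": []}
    return {
        "synonyms": list(concept.synonyms),
        "generalizations": list(concept.generalizations),
        "specializations": list(concept.specializations),
    }


def refine_query(query, selected_terms):
    """Append the terms the user ticked, quoting multi-word terms."""
    parts = [query.strip()]
    for term in selected_terms:
        quoted = f'"{term}"' if " " in term else term
        if quoted not in parts:
            parts.append(quoted)
    return " ".join(parts)
```

Here, `related_concepts` plays the role of the lookup that runs alongside the search, and `refine_query` mimics the checkboxes: ticking "hurricane" and "low-pressure system" for the query "cyclone" would yield cyclone hurricane "low-pressure system".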

Although the project was designed for use in one select area of science, its framework could easily be replicated in other scientific fields of study.

The Semantic Web: Better in Niches?

The main problem with the semantic web today is that there's no solid way to automate the assignment of those above-mentioned ontologies - the pieces of code that allow machines to grasp meanings that humans innately understand. At the present time, no automatic or semi-automatic process for doing so has been achieved...at least, not to the point that a true vision of a new, intelligent web can be realized.

Most of the time, annotating web resources must be done using manually inserted bits of code placed into various web pages. Obviously, that's a challenge when you consider the size of the internet - it would be impossible to manually annotate this ever-growing resource. Unfortunately, without automated methodologies, a true semantic web will remain an unrealized dream.

However, in smaller communities, the semantic web can easily become a reality. Scientific data catalogs only represent small portions of the web as a whole. Because of their limited size, manually annotating the resources they contain is a manageable feat. This is the case with Noesis. It shows there is promise for the semantic web after all - if only in small niches.

Image credit: rule100

http://www.readwriteweb.com/archives/a_productive_application_of_semantic_search.php Semantic Web Wed, 14 Jan 2009 08:01:38 -0800 Sarah Perez