vertical search - ReadWriteWeb http://www.readwriteweb.com/feeds/tag/vertical search en Copyright 2012 Richard MacManus readwriteweb@gmail.com Tue, 14 Feb 2012 18:04:00 -0800 http://www.sixapart.com/movabletype/?v=4.35-en http://blogs.law.harvard.edu/tech/rss

Spock To Offer Public Record Search Subscription Service

Remember Spock? Over a year ago there was a lot of buzz around this vertical search engine for people, but now that excitement has worn off. Instead of searching for people on Spock or other similar people search engines, most users simply turn to old standbys like Facebook or LinkedIn. But don't count Spock out just yet. Their new service, scheduled for launch in a couple of months, will transform them from a simple people search engine to a full-on public record search tool for only $1.99 per month.

According to MediaPost, in mid-January Spock plans to debut a new subscriber-only service that gives users access to data mined from public records found in government databases as well as info found on social network pages from sites like MySpace, LinkedIn, and Facebook. This new tool aims to complement other existing services they plan to offer, including the one launched earlier this year which scans Gmail accounts to help you find your friends.

Spock currently offers a public records search option to logged-in users. In addition to the pictures, news, and web results Spock returns, public record data is pulled from USSearch.com. However, to get the details of those records, you still have to pay a one-time fee.

Privacy Concerns

Since its launch, many people have raised concerns about how Spock operates. As noted earlier this year on Skiptease, there are several reasons to be wary of Spock, including the following:

1. Spock allows anyone to create and edit your personal information on the site, which raises numerous privacy concerns as well as concerns about the reasons people may have for editing your information on the site.

2. Editing or deleting information added about you does not guarantee that the changes will be made on Spock.com.

3. If you aren't informed that a profile or personal information has been added about you on the site, you might not discover the information until it shows up in a search engine query.

4. Even when your Spock profile is claimed by you, you still have little control over the information published on it. You can't personally get rid of any information and you have to request that the page be removed by the Spock search.

5. Turning a people search and social networking site like Spock.com into a wiki format where anyone can add and edit a profile on you allows people with a malicious intent to hijack your online identity and reputation.

6. The Spock people search allows users to flag inaccurate information. However, if you don't know that you are in their search database, there is no way to handle the information that has been published about you on the site.

On SEOMoz, after experiencing issues editing her own info, Jane Copland asked if a search engine that allows strangers to edit personal information about other people and then doesn't offer those people a quick way to remove the information they don't appreciate was "just a part of the internet we have to get used to" or if Spock goes a bit too far.

We found editing our profile information on Spock easy, though...that is, until an error message appeared which prevented us from continuing with the deletions we made. Afterward, we returned to our profile only to find that everything we had previously removed was still present.

With the upcoming public record subscription service, Spock has the potential to become even more of a privacy concern than before. By tying together your photos, web search results, social network profile info, and public record information, which invariably contains things like age and address history, Spock starts to cross the line from being simply a useful tool to being one that creeps us out a little.

Do you agree? Let us know in the comments.

Note: we tried to contact Spock for more information on the service, but emails sent to the company bounced. Yikes.

http://www.readwriteweb.com/archives/people_search_engine_spock_to_offer_public_record_subscription_service.php Product Reviews Fri, 07 Nov 2008 06:03:40 -0800 Sarah Perez
Semantic Travel Search Engine UpTake Launches

According to a comScore study done last year, booking travel over the Internet has become something of a nightmare for people. It's not that using any of the booking engines is difficult, it's just that there is so much information out there that planning a vacation is overwhelming. According to the comScore study, the average online vacation plan comes together through 12 travel-related searches and visits to 22 different web sites over the course of 29 days. Semantic search startup UpTake (formerly Kango) aims to make that process easier.

UpTake is a vertical search engine that has assembled what it says is the largest database of US hotels and activities -- over 400,000 of them -- from more than 1,000 different travel sites. Using a top-down approach, UpTake looks at its database of over 20 million reviews, opinions, and descriptions of hotels and activities in the US and semantically extracts information about those destinations. You can think of it as Metacritic for the travel vertical, but rather than just arriving at an aggregate rating (which it does), UpTake also attempts to figure out some basic concepts about a hotel or activity based on what it learns from the information it reads. Things such as, is the hotel family friendly, would it be good for a romantic getaway, is it eco friendly, etc.

"UpTake matches a traveler with the most useful reviews, photos, etc. for the most relevant hotels and activities through attribute and sentiment analysis of reviews and other text, the analysis is guided by our travel ontology to extract weighted meta-tags," said President Yen Lee, who was co-founder of the CitySearch San Francisco office and a former GM of Travel at Yahoo!

What UpTake isn't, is a booking engine like Expedia, a meta price search engine like Kayak, or a travel community. UpTake is strictly about aggregation of reviews and semantic analysis and doesn't actually do any booking. According to the company only 14% of travel searches start at a booking engine, which indicates that people are generally more interested in doing research about a destination before trying to locate the best prices. Many listings on the site have a "Check Rates" button, however, which gets hotel rates from third party partner sites -- that's actually how UpTake plans to make money.

The way UpTake works is by applying its specially created travel ontology, which contains concepts, relationships between those concepts, and rules about how they fit together, to the 20 million reviews in its database. The ontology allows UpTake to extract meaning from structured or semi-structured data by telling their search engine things like "a pool is a type of hotel amenity and kids like pools." That means hotels with pools score some points when evaluating if a hotel is "kid friendly." The ontology also knows, though, that a nude pool might be inappropriate for kids, and thus that would take points away when evaluating for kid friendliness.
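To make the pool example concrete, here is a minimal Python sketch of how an ontology of this kind could feed a theme score. It is not UpTake's implementation: the `ONTOLOGY` table, the weights, the "playground" concept, and the function names are all assumptions made up for illustration. Each concept carries per-theme weights and an "is a" link to a parent, and a concept inherits its parent's weights.

```python
# Hypothetical mini-ontology. Concept names, weights, and structure are
# illustrative assumptions, not UpTake's actual travel ontology.
ONTOLOGY = {
    "pool": {"is_a": "hotel amenity", "weights": {"kid friendly": 2}},
    "nude pool": {"is_a": "pool", "weights": {"kid friendly": -5}},
    "playground": {"is_a": "hotel amenity", "weights": {"kid friendly": 3}},
}


def concept_weight(concept, theme):
    """Add up a concept's weight for a theme, including inherited weights."""
    weight = 0
    seen = set()  # guards against accidental cycles in the is_a links
    while concept in ONTOLOGY and concept not in seen:
        seen.add(concept)
        weight += ONTOLOGY[concept]["weights"].get(theme, 0)
        concept = ONTOLOGY[concept]["is_a"]
    return weight


def theme_score(concepts, theme):
    """Score a hotel for a theme from the concepts found in its reviews."""
    return sum(concept_weight(c, theme) for c in concepts)


family_hotel = theme_score(["pool", "playground"], "kid friendly")
adult_resort = theme_score(["nude pool"], "kid friendly")
```

With these made-up weights, a hotel with a pool and a playground scores positively for "kid friendly", while a nude pool inherits the pool's points but ends up negative overall.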

A simplified example ontology is depicted below.

In addition to figuring out where destinations fit into vacation themes -- like romantic getaway, family vacation, girls getaway, or outdoor -- the site also does sentiment matching to determine if users liked a particular hotel or activity. The search engine looks for sentiment words such as "like," "love," "hate," "cramped," or "good view," and knows what they mean and how they relate to the theme of the hotel and how people felt about it. It figures that information into the score it assigns each destination.
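A very rough way to picture the sentiment-matching step is a lookup of sentiment words in review text. The sketch below is an assumption-laden toy, not UpTake's method: the `SENTIMENT` values are invented, and real sentiment analysis would also need to handle negation, word forms such as "loved", and context.

```python
import re

# Hypothetical sentiment lexicon. The words come from the article's
# examples, but the numeric values are invented for illustration.
SENTIMENT = {"love": 2, "like": 1, "good view": 1, "cramped": -1, "hate": -2}


def sentiment_score(review):
    """Sum lexicon values for each whole-word phrase match in a review."""
    text = review.lower()
    score = 0
    for phrase, value in SENTIMENT.items():
        # \b keeps "like" from matching inside longer words such as "unlikely";
        # it also means inflected forms such as "liked" are not counted.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        score += value * len(re.findall(pattern, text))
    return score


mixed = sentiment_score("We love the good view, but the room was cramped.")
negative = sentiment_score("I hate this place.")
```

A per-review score like this could then be combined with the theme weights when ranking a destination.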

Conclusion

Yesterday, we looked at semantic, natural language processing search engine Powerset and found in some quick early testing that the results weren't that much different than Google. "If Google remains 'good enough,' Powerset will have a hard time convincing people to switch," we wrote. But while semantic search may feel rather clunky for the broader global web, it makes a lot of sense in specific verticals. The ontology is a lot more focused and the site also isn't trying to answer specific questions, but rather attempting to semantically determine general concepts, such as romanticness or overall quality. The upshot is that the results are tangible and useful.

I asked Yen Lee what UpTake thought about the top-down vs. the traditional bottom-up approach. Lee told me that he thinks the top-down approach is a great way to lead into the bottom-up Semantic Web. Lee thinks that top-down efforts to derive meaning from unstructured and semi-structured data, as well as efforts such as Yahoo!'s move to index semantic markup, will provide an incentive for content publishers to start using semantic markup on their data. Lee said that many of UpTake's partners have already begun to ask how to make it easier for the site to read and understand their content.

Vertical search engines like UpTake might also provide the consumer face for the Semantic Web that can help sell it to consumers. Being able to search millions of reviews and opinions and have a computer understand how they relate to the type of vacation you want to take is the sort of palpable evidence needed to sell the Semantic Web idea. As these technologies get better, and data becomes more structured, then we might see NLP search engines like Powerset start to come up with better results than Google (though don't think for a minute that Google would sit idly by and let that happen...).

What do you think of UpTake? Let us know in the comments below.

http://www.readwriteweb.com/archives/semantic_travel_search_uptake.php Product Reviews Wed, 14 May 2008 06:00:00 -0800 Josh Catone
Beyond Vertical Search to Business Networks

Vertical Search is one of those confusing terms that means many different things, depending on where you are coming from. To most RWW readers, Vertical Search tends to mean “the search space that Google has not yet grabbed and that does not require a major technology breakthrough such as natural language search”. That’s a good enough definition from a start-up perspective. For traditional media, Vertical Search is also about creating a space that Google cannot simply steamroll over. Traditional media may call it Rich Data or Information Services or Data Products, but the end goal is the same.

For the sake of consistency I will continue to call this Vertical Search, although the opportunity is deeper than simply search. In fact, in order to build sustainable advantage against the Google steamroller, it has to be deeper than just search.

As Alex Iskold explained recently, Google Custom Search is setting the bar for Vertical Search Engines. This looks like a smart play by Google and they will grab the big low hanging fruit very well. For example, search for something related to healthcare and you can see that Google already understands concepts such as Symptoms and Treatment.

That still leaves a lot of opportunities within niches that may look small at first glance (and which won’t justify any focused effort by Google). You can dismiss these niches as “picking up peanuts in front of a steamroller” but when the speed and direction of the steamroller is fairly clear, there are lots of opportunities if you are agile and quick on your toes :-)

You may also be surprised by how much money can be made in small niches. My favorite unglamorous niche is ASI, the Advertising Specialty Institute. Their business is “promotional products” or, as they put it on their site: “think of all those freebies like mugs and T-shirts you see at trade shows”. It is an $18.6bn industry, which is pretty small (compared to the big markets targeted by vertical search start-ups). Yet a few years ago ASI’s revenues were well north of $50m and they were highly profitable and growing at a reasonable clip. This type of niche won’t get you on the front cover of Fortune but it could make you rich.

ASI is more than vertical search, a lot more, but the core of the business is data about their industry. That data enables them to take small slices of lots of interactions within their market. They have become a “business network”. More on that later.

First, let's look at who the players are in this market:

  1. Start-ups. Most Vertical Search start-ups (at least the visible ones) operate in broad, large scale markets that are primarily consumer-centric. Many of these have been covered before in Read Write Web, specifically this post in September ‘06. The big sectors are travel, finance, consumer electronics and health. These markets are the big low hanging fruit that pass the addressable market size hurdle set by Venture Capital; this also makes them the most vulnerable to Google.
  2. Traditional B2B and specialty enthusiast publishers. This market comprises lots of small niches, micro niches if you like; think “Waste Haulage” or “Pig and Poultry” (those are real titles of B2B magazines). Think Trainspotters Monthly as an example of a specialty enthusiast publisher (I made that one up). This is the long tail of business publishing. Traditional B2B publishers have two big advantages. First, their long established brand (and the trust that it represents) and second, decades of domain expertise. They also have two big hurdles. The first is that they have traditionally been weak on technology; that is relatively easy to fix by partnering with technology vendors. The second hurdle is that Vertical Search is one of many lines of business (and it is currently a small % of their total revenues) and their core lines of business are under threat; Vertical Search therefore does not usually get the senior management attention and investment capital that is needed to win.
  3. The mass market online players (Google, Yahoo, Microsoft). They are grabbing the large low hanging niche markets and butting heads with those VC funded Vertical Search start-ups; but there is no Google team going after the Pig and Poultry market (AFAIK).
  4. Vendors including:

      * Technology. This includes search engines, scrapers, tagging tools, XML databases, research workflow, subscriptions/billing and more.

      * Data research services. Many sites rely entirely on aggregating, filtering and displaying open content from other sites. Other sites like to add proprietary data that needs to be collected by traditional market research techniques.

There are also a few small niche consulting firms providing strategic advice, usually through some mix of research reports, seminars and workshops. Financing is often provided or arranged through a number of specialist “boutique” investment bankers and Private Equity investors specializing in this market.

Vertical Search sites typically provide one of three main types of data:

  1. Lead Generation (contact details with enough data to target effectively for selling or recruiting). The value proposition is obvious. The danger is that these databases simply become tools for spammers. Every publisher does “list rental”, which means providing the list of subscribers to marketers; that can easily degrade into spamming. One intelligent approach is something like LinkedIn, where the person has control over what data is provided. Another intelligent approach is to provide enough depth of data to allow highly targeted marketing. One example is Computing Market Intelligence in the UK. People usually don’t mind a pitch if it is highly relevant and targeted to their current situation.
  2. Pricing Guides in opaque markets. The best examples create lock-in by becoming the authoritative source for the industry. There is plenty of news, comment and analysis in most markets that are free to the audience (i.e. paid by advertising); but it is still difficult to get hard data that answers the question “how much would it cost me?” There are many markets where pricing is quite transparent; however there are many other markets where there is major information asymmetry that is maintained deliberately by vendors. A service that opens up these markets will save money for buyers; B2B buyers are willing to pay for that.
  3. Financial market information for investors. The base data about stocks and bonds is increasingly free and available on multiple sites. What is hugely valuable is unique insight backed by hard facts that are not commonly available (and that do not breach insider trading rules). One example is tracking a new hot consumer electronics product, figuring out early how the product is shifting at a retail level, working out which publicly traded companies make components for that product; that leads to the right mix of legal, proprietary and timely data that institutional investors will pay big $$$ to subscribe to.

There are also three basic techniques:

  1. Search engine on top of unstructured data. Google Custom Search engine serves this purpose and sets the bar for competing engines. Many blogs use Eurekster and traditional B2B publishers often use Convera.
  2. Apply structure to unstructured data. Structured data (fundamentally data in a relational database) enables more effective parametric searching. Structure can be added through manual or automated techniques such as tagging and scraping; increasingly it is the combination of automation and manual methods that wins.
  3. Proprietary new data. This is the world of traditional market research, with a focus on facts and numbers not news/commentary/analysis. Proprietary data is the yeast that makes the aggregated non-proprietary dough rise. The cost to acquire this “yeast” is therefore critical.
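The second technique, parametric searching over structured data, can be illustrated with a small Python sketch using the standard library's sqlite3 module. The `listings` table, its columns, and the sample rows are hypothetical and exist only to show the idea: once scraped or tagged content sits in relational form, a search can filter on exact fields rather than on free text.

```python
import sqlite3


def build_db(rows):
    """Load (name, category, unit_price) rows into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE listings (name TEXT, category TEXT, unit_price REAL)"
    )
    conn.executemany("INSERT INTO listings VALUES (?, ?, ?)", rows)
    return conn


def parametric_search(conn, category, max_price):
    """Return listing names in a category at or below a price, cheapest first."""
    cur = conn.execute(
        "SELECT name FROM listings "
        "WHERE category = ? AND unit_price <= ? "
        "ORDER BY unit_price",
        (category, max_price),
    )
    return [row[0] for row in cur.fetchall()]


# Made-up sample data in the spirit of the promotional products niche.
conn = build_db([
    ("Logo mug", "mugs", 3.5),
    ("Travel mug", "mugs", 7.0),
    ("Basic tee", "t-shirts", 5.0),
])
cheap_mugs = parametric_search(conn, "mugs", 5.0)
```

A keyword engine over unstructured pages cannot reliably answer "mugs under $5"; the structured version answers it with a single query.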

Current monetization strategies are usually either advertising or subscriptions.

If you can show ROI by saving money you can charge subscriptions. Subscriptions enable predictable, high margin businesses; subscriptions are also much more recession proof than advertising. The basic rules for getting people to pay for data are:

  1. It would cost them way more than the subscription cost to research this data for themselves. In other words, they cannot easily use Google to find a free equivalent or call 2 or 3 buddies.
  2. The data can be used to make a case, either internally for a budget or externally for new capital, to spend some serious money. This means that the data must be verifiable and the source must be trusted and can be referenced.

There are lots of pricing options for paid subscriptions - by registered user, by concurrent user, by # of records, by time period, by product (print vs online). Print On Demand technology has rejuvenated print by making very small print runs cost effective.

In recent years the trend has been to move increasingly towards making information free, letting advertisers pay the bills. Google has clearly changed the advertising landscape with Cost Per Click, as this is closer to a model with proven Return On Investment (ROI).

The future may move to transaction models. There will always be a big place for brand advertising (aka “faith based advertising”) but the logic of ROI tends to drive from Cost Per Click to Cost Per Action to Cost Per Product (i.e. a % of the product price). Given a choice, most sellers prefer to simply pay a % of the sell price; in an ideal business world all costs are variable costs.

These transactional business networks rely on two key planks:

  1. Data. If all the market participants are accessing a common set of data that is trusted and verifiable they are more likely to trade with each other through the network. So a vertical search database is the key enabler for transactions.
  2. Trust. This comes from having these niche communities bound together through content sharing and ratings. This assumes a registered user community using real names.

Google is not the only steamroller heading this way. LinkedIn has the other piece and they are on a roll. The strategic imperative for media firms is to use the base techniques of search and social networking to build their own value space. The technical pieces for both are easy to assemble. It is simply a land grab game.

Business Networks work best in industries with lots of buyers and lots of sellers. This is known as a “fat butterfly” industry structure. If a few buyers or a few sellers dominate the industry, there is less opportunity for an information intermediary. This may sound familiar to people who remember B2B 1.0 circa 1999 with over-hyped companies such as VerticalNet. The end goals are the same in B2B 2.0, but a lot has changed since then:

  • B2B users are now online way more than they were in 1999.
  • Vertical Search is better at creating the common pools of trusted data.
  • Social networks and ratings have accustomed people to make judgments about who they might be interacting with online.
  • The cost to develop and deploy these types of systems has come down dramatically.
  • Entrepreneurs have figured out how to enable and co-opt intermediaries in the supply chain, rather than naively assuming that they serve no purpose and will quietly disappear.

Vertical Search companies that rely on a few jazzy features or a “we try harder” approach will have their competitive advantage inexorably eroded by Google. Companies that use search as one tool to build niche online business networks based on a transactional model can create sustainable competitive advantage.

http://www.readwriteweb.com/archives/beyond_vertical_search_to_business_networking.php Enterprise Tue, 29 Jan 2008 21:57:21 -0800 Bernard Lunn