news search - ReadWriteWeb
http://www.readwriteweb.com/feeds/tag/news search
Copyright 2009 Richard MacManus, readwriteweb@gmail.com
Sun, 22 Nov 2009 12:00:55 -0800

Google News Archive Expands: See Articles in Context

Google just announced an interesting update to its Google News Archive, which, starting today, will not only feature the electronic text of a lot more historical newspapers, but also a scanned copy of the actual paper. While having access to the text itself is already great for researchers, having access to an article in the context of the whole paper is even more useful. For now, however, only a select group of newspapers is available in this form and a lot of the historical material is still stuck behind pay walls.

Articles in Context

Being able to browse through an article in the context of the paper version is a great step forward for researchers. Before, you would either have to dust off old copies in an archive or operate an antiquated microfilm machine (without the ability to do a full-text search, of course). Now, at least in theory, you can just type in a search query and Google will not only do a full-text search of the historical archives, but also show you all the advertising and related articles in the paper itself.

The interface is very similar to the Google Books UI, with the ability to zoom in and out, jump to a specific page, view the page in full screen, and so on. One difference, though, is that the advertising is a lot more prominent in this version of the interface.

Hard to Find

All of this sounds great in theory. However, while we were trying out these new functions, we were barely able to find any material that was presented in this new way. As Google itself points out, not every search will trigger this new content, but if it does, the link will say "Google News Archive." Also, it is important to note that a lot of the historical material has been licensed from ProQuest and Heritage, which charge for access to their archives (unless you are, for example, on a college campus that subscribes to these services).

Over time, Google is planning to integrate these newspaper results into the main search results on Google.com, but for now Google is keeping the index of the News Archive separate from the main index.

http://www.readwriteweb.com/archives/google_news_archive_expands_se.php Products Mon, 08 Sep 2008 11:20:33 -0800 Frederic Lardinois

Orglex: Semantic News, Blog and Job Search for Industry Verticals

Orglex is a new semantic web-powered news, blog and job search engine with a social networking component and industry vertical focus. It's an interesting service that brings together a number of different approaches we've seen elsewhere to build something relatively new.

Semantic analysis of content makes topic-focused search smarter than otherwise possible, and wrapping it in other value-adds like blog and job search is a smart, solid play.

The first step Orglex users take is to select from any of 30 industry "hubs," collections of news feeds and resources organized around topics ranging from pharmaceuticals to social networking to management consulting.

The news section of each hub displays recent stories vetted by topical relevance via an industry-specific ontology, combined with relative weighting of top sources according to how often they write about a particular sector (again determined by the industry-specific ontology). It's an interesting approach to news, a combination I don't think I've seen before.

A news feed made up of all the hubs you select is then displayed on your Orglex page and is exportable in feed format. The company has a white-label version of its Venture News feed available on the leading blog VentureBeat, though this automated aggregation of links off-site doesn't get very prominent billing there. No surprise and no knock on either company for that.

The feeds published by several of the hubs look like something worth subscribing to already. The most recent items in the "social networking" feed are on the left; judge for yourself.

In addition to news, Orglex also aggregates industry-specific job listings from sites around the web and pages for people in each industry. The people section of the site seems inoperable right now, and for a job aggregation site to try to wring cachet out of big brand icons as "featured employers" seems questionable.

One of the most interesting parts of the site is the leaderboard set up for each hub. Top sources are presumably indexed manually but ranked by the frequency with which they write about that hub's topic, according to the ontology. I'm always looking for new ways to discover top sources in new niches and Orglex could be a good tool to put in that toolbox.
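
Orglex hasn't published how its ontology or ranking works, so the following is only a rough sketch of the general idea of ranking sources by how often they write on a topic. It assumes a flat set of topic terms as a stand-in for a real ontology; the term list, the `min_hits` threshold, the function names and the sample headlines are all hypothetical.

```python
from collections import Counter

# Hypothetical stand-in for an industry ontology: a flat set of topic terms.
SOCIAL_NETWORKING_TERMS = {"facebook", "myspace", "social", "network", "friends"}


def is_on_topic(text, terms, min_hits=2):
    """Treat a story as on-topic if it mentions at least min_hits distinct terms."""
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    return len(words & terms) >= min_hits


def rank_sources(articles, terms):
    """Rank sources by how many of their stories are on-topic, most first."""
    counts = Counter(
        source for source, text in articles if is_on_topic(text, terms)
    )
    return counts.most_common()


# Made-up sample data: (source, headline) pairs.
articles = [
    ("BlogA", "Facebook opens its social graph to developers"),
    ("BlogA", "MySpace adds new tools for friends"),
    ("BlogB", "Social network traffic keeps growing"),
    ("BlogB", "Quarterly chip sales fall"),
    ("BlogC", "New phone released today"),
]

leaderboard = rank_sources(articles, SOCIAL_NETWORKING_TERMS)
for source, on_topic_count in leaderboard:
    print(source, on_topic_count)
```

A real system would presumably use concept relationships rather than plain keyword overlap, and might normalize by each source's total output; this sketch only counts on-topic stories per source.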

The whole site is a work in progress and that's probably why Orglex hasn't gotten any media coverage to date except for the Amazon Web Services blog post I discovered it through. Nonetheless, it's an interesting service to watch.

Readers interested in semantic web developments should check out the resources we've compiled on that and four other emerging key topics in the ReadWriteWeb Toolkit for 2008.

http://www.readwriteweb.com/archives/orglex_semantic_search.php Products Tue, 26 Feb 2008 13:42:17 -0800 Marshall Kirkpatrick