scalability - ReadWriteWeb http://www.readwriteweb.com/feeds/tag/scalability en Copyright 2009 Richard MacManus readwriteweb@gmail.com Mon, 23 Nov 2009 16:43:23 -0800 http://www.sixapart.com/movabletype/?v=4.23-en http://blogs.law.harvard.edu/tech/rss

Everything You Wanted to Know About Semantic Technology, But Were Afraid to Ask (at SemTech 09)

Editor's note: we offer our long-term sponsors the opportunity to write 'Sponsor Posts' and tell their story. These posts are clearly marked as written by sponsors, but we also want them to be useful and interesting to our readers. We hope you like the posts and we encourage you to support our sponsors by trying out their products. This one is by Hakia, one of the participants in the recent 2009 Semantic Technology Conference.

Participants in the 2009 Semantic Technology Conference walked away considering fundamental questions about what is and isn't semantic technology. The relevance of this post's title will hopefully become clear by the end to those of you mischievous readers who may have stumbled upon it with other ideas. The conference was a great and well-organized affair in San Jose, California. One of the highlights was the Semantic Search Keynote panel, with all of the major players on stage (Ask, Bing, Google, Hakia, TrueKnowledge, and Yahoo!), as seen in the picture below.

Bear in mind that semantic technology can be as heavy and stifling for any audience as stem-cell research can be to high-school students. But Carla Thompson of Guidewire did a terrific job of coming up with discussion topics and moderating the panel. Everyone survived the ordeal without any sign of dozing.

Despite the positive outcome, some responses from the panelists made me wonder if we should go back to the basic question of, "What is semantic search?" Or, better yet, what isn't semantic search? Here is my list:

Structured Data

Folks, semantic technology is not structured data. A database that can, given the query "social drinking," pull up a list of beer brands, their manufacturers, and their contact information has nothing to do with semantics. Some people seem to have the impression that a search engine somehow uses semantic technology if it retrieves structured data for its results. It is a trick as old as the ancient Egyptians who used beads to organize harvesting information. Organized information is not semantic information.

Morphology

If a search engine is robust and returns the same results for the query "top ten" as it does for "top 10" (i.e. it recognizes that "ten" means "10"), calling the search engine semantic would be a stretch. Anyone could come up with a substitution list like this without a drop of linguistic knowledge. Similarly, distinguishing the name "Fisher" from the noun "fisher" by detecting the capitalization of the first letter does not go beyond the application of simple linguistic rules. These capabilities are not semantic search capabilities.
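
To make the point concrete, here is a minimal sketch of the kind of substitution list and capitalization check described above. The table entries and the function names (`normalize_query`, `looks_like_name`) are illustrative assumptions, not taken from any real search engine; the sketch only shows that this sort of matching is plain lookup with no understanding of meaning involved.

```python
# Hypothetical substitution table: spelled-out numbers mapped to digits.
# The entries are illustrative; a real list would be far longer.
NUMBER_WORDS = {
    "one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
    "six": "6", "seven": "7", "eight": "8", "nine": "9", "ten": "10",
}


def normalize_query(query: str) -> str:
    """Replace spelled-out numbers with digits using plain table lookup."""
    tokens = query.split()
    return " ".join(NUMBER_WORDS.get(tok.lower(), tok) for tok in tokens)


def looks_like_name(token: str) -> bool:
    """Guess that a token is a proper name if its first letter is uppercase."""
    return token[:1].isupper()


if __name__ == "__main__":
    print(normalize_query("top ten"))  # top 10
    print(looks_like_name("Fisher"))   # True
    print(looks_like_name("fisher"))   # False
```

Both functions are a few lines of string handling, which is why neither qualifies as semantic search on its own.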

Syntax

A certain amount of semantic information can be salvaged from syntax. Unfortunately, if syntax were enough for us to detect the meaning of text, then an 8-year-old with perfect reading ability (i.e. who is able to syntactically parse strings of English-language letters) could be expected to understand the meaning of Shakespeare's works. The difference between reading and understanding is the difference between syntax and semantics. The former requires the skill to parse things out, while the latter requires a vast amount of associative knowledge.

Statistics

An infinite number of monkeys typing on an infinite number of keyboards would eventually come up with the complete text of the Declaration of Independence. This is a scientific statement; it is not a joke. However, if a search engine is expected to be semantically relevant using statistical algorithms, one would have to wait until the monkeys finished their job. Statistics have no place in semantic technology. A simple test would reveal that. For example, your brain is able to understand a unique sequence of words that you have never seen before, such as "Polar bears don't eat alligator eggs before dawn." If semantics were built on statistics, computers and algorithms would not understand this and billions of other sentences.

Scalability

Scalability is the narrow bridge between science and technology. What you can carry from science to technology over this bridge determines the level of capabilities in the real world. The science of semantics is huge and stems from the roots of philosophy. But Web search is a very particular problem with stringent constraints (a narrow bridge). Designing semantic algorithms to drive a Web search engine is like walking on eggshells and requires a completely new approach. Thus, a semantic search algorithm could be very sophisticated but still not suitable for the Web.

These five areas cover what isn't semantic search and should help readers understand the questions that emerged from the Semantic Technology Conference. Structured data, morphology, syntax, statistics, and scalability are key areas to discuss moving forward. Of course, contrary to the title of this post, no one was actually afraid of asking these questions. But if you caught the reference in the title, that was your semantic brain in action, one last example of what semantic technology is.

http://www.readwriteweb.com/archives/everything_to_know_about_semantic_technology_at_semtech_09.php Sponsors Fri, 26 Jun 2009 05:00:18 -0800 RWW Sponsor
What's the Biggest Rails App? It Doesn't Matter

Once upon a time, whenever anyone asked, "But are there any big applications built on Rails?" the answer was usually 43Things, anything from 37Signals, or Odeo. But over the past year, there's no doubt that if there is a poster child for Rails, it is now Twitter. With such notorious bouts of downtime, a worse poster child Rails could not possibly hope for. But is Twitter even the largest application out there running on Rails? Does it even matter?

"Twitter is almost certainly the largest site running on Rails, so fans of the framework and its developers have been quick to deflect the criticism and point it back at the engineers at Twitter [to explain downtime]," wrote Nik Cubrilovic in a recent post on TechCrunch calling out Rails as a poor choice for large-scale app development. The debates over what causes Twitter's frequent outages (we think it's a database issue) and whether Rails is good for large apps aside, Twitter might not actually be the biggest Rails-based app out there anymore.

Some back-of-the-napkin math by noted Rails developer Evan Weaver (who recently went to work for Twitter) finds that while Twitter might be huge in terms of monthly pageviews, the Facebook app Friends for Sale may still be bigger. And Yellopages and Scribd are similarly massive.

Ignoring the oddities in Weaver's computation (like, for example, that even though he works at Twitter he only guesses how much traffic the API is fielding), which he admits result in "wildly inaccurate values," he makes one very good final point: It doesn't matter!

"It is important to keep in mind how useless this information is. It doesn't even make sense to say 'Rails site' or 'PHP site,'" says Weaver. "Livejournal uses Perl, Memcached, and MySQL, among other things. Does that make it a Perl site, a MySQL site, or a C site? I don't know what Scribd uses, but it's pretty likely that their document pre-renderer is Java or C, not Ruby. Friends for Sale uses Nginx, Rails, Memcached, MySQL, and Linux. Ruby is really just a little piece of the pie."

http://www.readwriteweb.com/archives/whats_the_biggest_rails_app.php Twitter Tue, 27 May 2008 08:40:59 -0800 Josh Catone
Assetbar Aims to Bring Scalability to Social Web Apps - RSS Reading is First Up

Prelaunched social RSS reader Assetbar calls itself the first application built on the company's new “Media Participation Platform” and has a number of remarkable features already that you'll want to check out if you can get in. (invite code below)

The experienced team of entrepreneurial engineers behind the application says its goal is "to open the platform to other developers around the world so they can create new apps with features that wouldn't be sane with traditional stacks."

In the meantime, the RSS reader has all kinds of social features that are a lot of fun. It offers inline commenting, quick reviews, notification of which of your friends has viewed an item first (if it's you, you "win!") and a cool bookmarklet for sharing images off-site into your Assetbar stream. It does move pretty fast, though only a limited number of people are using the service so far.

I've been following Assetbar so far via the red hot blogger Louis Gray, whose invite code I'll rip off and repost here (it's "2friendly") in exchange for saying that if you haven't subscribed to Louis's blog yet, you really should.

In addition to Gray's close coverage of the app, you can also check out an official tour of Assetbar - which unfortunately is one of many examples of the team's biggest need, a major User Experience overhaul. Part of the TOS required that I "not cry" about prelaunch limitations of the service, though, so I'll leave it at saying this: I'm not about to use Assetbar in its current state but the concepts here are fascinating.

Scalability

Scalability is probably the number one issue faced by the exploding world of web apps these days. When you take into account all the calls to other servers that data portability requires in lifestreaming apps and their variations, it will be a wonder if web apps work at all in 18 months.

The Assetbar site says the company is made up of folks from a previous enterprise venture called Redline Networks, which was the subject of a large acquisition by Juniper Networks in 2005. Redline was all about application scalability, so it seems the team is now aiming to bring the next level of scalability technology to the consumer market. Again, this social feed reader is just the first app they are building in house - the Assetbar site says look out for the API and contact them if you are interested in getting involved. There's no contact info on the site but founder Israel L'Heureux's email is available via WhoIs.

From the site:

We founded Redline Networks in 2000, where we introduced a single threaded event-driven web server design which provided such low latency and high scale that we productized it as a "next generation load balancer". In addition to load balancing, our product also performed I/O offload, TCP connection management, HTTP Compression, SSL, HTTP security, logging, and more. Our "E|X 3250" product earned the 9.5 out of 10, the highest score among 229 enterprise products reviewed by InfoWorld Magazine's Test Center during 2003.
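
For readers unfamiliar with the term, a single-threaded event-driven server handles many connections from one thread by reacting to I/O events instead of dedicating a thread to each client. The sketch below is only a generic illustration using Python's standard `asyncio` module; it is not Redline's design, and the names `handle` and `demo` are made up for this example.

```python
import asyncio


async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Serve one connection: read a line and echo it back with a prefix."""
    line = await reader.readline()   # yields to the event loop while waiting
    writer.write(b"echo: " + line)
    await writer.drain()
    writer.close()
    await writer.wait_closed()


async def demo(n_clients: int = 3) -> list:
    """Start the server on a free local port and talk to it from n clients."""
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    async def client(i: int) -> str:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(f"hello {i}\n".encode())
        await writer.drain()
        reply = await reader.readline()
        writer.close()
        await writer.wait_closed()
        return reply.decode().strip()

    # All clients and all server-side handlers interleave on a single thread.
    replies = await asyncio.gather(*(client(i) for i in range(n_clients)))
    server.close()
    await server.wait_closed()
    return list(replies)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

No connection blocks the others here because each `await` hands control back to the event loop, which is the general idea behind this style of server.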

The site also includes links to five scalability-related patents developed by the team and now owned by Juniper. The point is, these guys are hot stuff. Joining the small crowd waiting in the shadows to slit Twitter's throat is probably somewhere in their minds.

As for the RSS reader, it's cool in its formative stages. Scalability, sharing, time-sensitive metadata, super simple reviews and off-site integration with my feed reader are all part of my "dream come true feed reader" vision. Better handling of OPML files, offline access and item storage, standards for exporting my attention data (RSS doesn't count as an export option) and a mobile option are all important to me too, though, and Assetbar doesn't offer any of that. It's worth a good look nonetheless, so here are some screenshots below of the primary page first and then the off-site asset sharing view after I clicked on the bookmarklet from a Flickr search results page. Enjoy.

My favorite little feature, the offsite asset sharing bookmarklet in action.

http://www.readwriteweb.com/archives/assetbar_on_scalability.php Products Thu, 14 Feb 2008 08:40:50 -0800 Marshall Kirkpatrick