About 18 months ago, we wrote about an obscure search startup from Germany called FAROO. We believed that its radical alternative, using peer-to-peer (P2P) technology, had a shot at being a real disruptive force. Today, it has made some progress, has raised some money and is getting out into the market. (Disclosure: FAROO is currently a ReadWriteWeb sponsor.)
FAROO is wisely underplaying P2P in its marketing, preferring more fashionable terms such as "real-time search" and "social discovery." But the P2P technology drives it.
So, we decided to invite someone who understands P2P at a technical level to interview Wolf Garbe, FAROO's founder. Our tech expert, Kiril Pertsev, of Agily Networks, has already written about P2P for us in the past.
Kiril: Why .NET? Did you already have development resources or did you make this choice because you consider it a better option for networked desktop applications? Would you make this choice again? And if you're not satisfied with .NET, what would your platform of choice be, given all of your experience over the past few years?
Wolf: I come from Delphi (Object Pascal), so the choice of C#/.NET was a deliberate decision for a new platform, not one driven by legacy. When I started work on the first prototype in 2004, Delphi was moving towards .NET. I preferred to go with the original, especially because the development of C# was led by Anders Hejlsberg, the designer of Borland's Turbo Pascal (from which Delphi is derived).
Of course, I also looked into Java, which I found quite similar, both from the language perspective (C# vs. Java) and the JIT Runtime environment (Java Virtual Machine vs. .NET Runtime). The decision for C# was based on the dominating desktop market share of Windows and the assumption that embedding the .NET framework into the OS would ensure fast penetration of .NET. This only partially came true, partly due to the limited success of Vista, which was the first Windows version with .NET pre-installed.
Kiril: Doesn't this choice hinder your ability to move to Mac and Linux platforms?
Wolf: We were betting on Mono for platform compatibility. Unfortunately, Mac OS X still has no Mono application launcher; the only option is starting from the terminal, which is not feasible for a mass market. With the increasing importance of the Mac OS X platform, I expect this to change. Silverlight is already natively available for Mac today.
For ultimate platform independence, we are also continually observing the diverse RIA developments (AJAX, AIR, Silverlight, Mozilla Prism, HTML 5 persistent web storage, Mozilla's DOM storage, Google Gears and Flash persistent storage), which could one day allow us to remove the download and installation step for P2P. But so far, no solution meets all of the requirements: out-of-browser capability, permanent background operation, an auto-start option, tray icon support, cross-domain connection support, persistent storage, accepting incoming connections and receiving data, and NAT traversal.
Kiril: If you become dissatisfied with .NET, what would be your next platform of choice?
Wolf: Although not everything went as expected, I still believe that .NET is a very powerful platform, and C# as a language is evolving at a much faster and broader pace than Java.
Today, we have a good .NET penetration rate in the US and Europe. With Windows 7, I expect that to increase in Asia as well.
Kiril: I see that you're using a pretty simple P2P communication technology instead of sophisticated Hamachi-like NAT traversal using UDP hole punching.
Wolf: I suppose you are referring to the transport layer, which is HTTP over TCP/IP. The real P2P overlay protocol on top of that is not that simple anymore.
Because our distributed search engine architecture breaks with almost all legacy paradigms, we thought it would be a good idea to base it, wherever possible, on proven and widely used standards. There are several reasons for this:
NAT traversal is the most critical issue for every P2P application. It's really a shame that although the Internet is built on a distributed foundation, end-to-end connectivity between users in a decentralized way is completely broken. We are using several NAT traversal techniques: manual port forwarding, automatic port forwarding via UPnP, and Teredo. Teredo is an IPv6 tunneling technology, standardized in RFC 4380.
Teredo is part of Windows XP, Vista, and Windows 7; with Miredo, there is also an open-source implementation for Linux and Mac OS X available. Microsoft reports that with Teredo, the chance of a connection between two peers increases from 15% to 84% (PDF link). Our observations are somewhere between 60% and 70%.
Teredo is quite sophisticated technology and is a more universal approach. It provides connectivity at the OS level, in contrast to having several applications in use, where each uses its own proprietary traversal technology.
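To make the Teredo discussion a little more concrete, here is a minimal sketch of how a Teredo address encodes what a peer needs for NAT traversal. It is not FAROO's code (the client is written in C#); it only follows the address layout defined in RFC 4380, where the address falls under the 2001::/32 prefix and carries the Teredo server's IPv4 address plus the client's external IPv4 address and UDP port, both stored with their bits inverted. The function name `decode_teredo` and the returned dictionary keys are our own, chosen for illustration.

```python
import ipaddress

# Teredo addresses live under this prefix (RFC 4380).
TEREDO_PREFIX = ipaddress.IPv6Network("2001::/32")


def decode_teredo(addr: str):
    """Return the fields embedded in a Teredo address, or None if it is not one.

    Layout (most significant bits first):
      32 bits prefix | 32 bits server IPv4 | 16 bits flags |
      16 bits client UDP port (inverted) | 32 bits client IPv4 (inverted)
    """
    ip = ipaddress.IPv6Address(addr)
    if ip not in TEREDO_PREFIX:
        return None

    raw = int(ip)
    server = ipaddress.IPv4Address((raw >> 64) & 0xFFFFFFFF)
    flags = (raw >> 48) & 0xFFFF
    # Port and client address are stored XORed with all ones.
    port = ((raw >> 32) & 0xFFFF) ^ 0xFFFF
    client = ipaddress.IPv4Address((raw & 0xFFFFFFFF) ^ 0xFFFFFFFF)

    return {
        "server": str(server),
        "flags": flags,
        "client": str(client),
        "client_port": port,
    }


if __name__ == "__main__":
    # A commonly cited example address, not one taken from the FAROO network.
    print(decode_teredo("2001:0000:4136:e378:8000:63bf:3fff:fdd2"))
```

For the example address, the decoded server is 65.54.227.120 and the client's external endpoint is 192.0.2.45 on UDP port 40000. Because this mapping is handled at the OS level, an application only has to listen on its IPv6 address, which is the "more universal approach" Wolf describes.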
Kiril: Could you please elaborate on your choice of network technology, now that you have a substantial number of users and are collecting usage statistics? Do you know how many active and passive peers you have at any given time? What is the ratio?
Wolf: We have solid insight into the state of our P2P network. We know the number of active and passive peers on any given day (using the log from our update server). The active peer ratio is between 60 and 70%.
We are also currently working on an improved distributed intraday statistic. (The distributed statistic currently built into the P2P client is not valid anymore for the increased network size. For scalability, every peer has only a limited view of the whole network, which requires more advanced methods for calculating the actual network size.)
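Wolf does not say which method FAROO uses to estimate network size from a limited view. Purely as an illustration of the problem, the sketch below applies one well-known technique, the capture-recapture (Lincoln-Petersen) estimator: draw two independent samples of peer IDs and infer the total population from how much they overlap. The assumption that a peer can obtain roughly uniform random samples of other peers is ours, as are the names `estimate_network_size` and `simulate`.

```python
import random


def estimate_network_size(sample_a, sample_b):
    """Lincoln-Petersen estimate: N is roughly |A| * |B| / |A intersect B|.

    Assumes both samples are independent, uniform draws of peer IDs.
    """
    a, b = set(sample_a), set(sample_b)
    overlap = len(a & b)
    if overlap == 0:
        raise ValueError("samples do not overlap; draw larger samples")
    return len(a) * len(b) / overlap


def simulate(true_size=10_000, sample_size=1_000, seed=1):
    """Simulate a network of `true_size` peers and estimate its size
    from two samples that each cover only a fraction of it."""
    rng = random.Random(seed)
    peers = range(true_size)
    first = rng.sample(peers, sample_size)
    second = rng.sample(peers, sample_size)
    return estimate_network_size(first, second)


if __name__ == "__main__":
    print(f"estimated size: {simulate():.0f} (true size: 10000)")
```

The point of the sketch is that no peer needs to see the whole network: each sample here covers only 10% of the peers, yet the overlap between two such samples is enough for a rough estimate. The estimate becomes noisy when the overlap is small, which is one reason larger networks call for more careful methods.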