Techmeme is on fire this morning with discussion of Rafe Needleman's CNet post about Twitter's supposed plans to index the content of links shared over the microblogging service. Ex-Googler turned Twitter exec Santosh Jayaram said as much last night, and also mentioned plans to rank search results by the reputation of the author.
It is really strange that none of the coverage we've seen today mentions yesterday's news that Twitter has picked Bit.ly as its new default URL shortener. Bit.ly indexes the content of links and gathers a whole lot more data. Below are three reasons we're betting that Twitter will not index the content of links itself; it will rely on Bit.ly to do it. Twitter will probably acquire Bit.ly as a result, in exchange for Twitter stock. If not Bit.ly, it will be one of a handful of other third-party companies currently working behind the scenes with Twitter on this kind of search. Twitter is not going to do it all on its own; we're willing to bet on that. Update: After publishing this post we have been sent additional information illustrating just how close the Twitter/Bit.ly relationship already is.
Many people asked yesterday why Twitter was choosing an outside party at all to shorten its links. Why not do that in-house? The most obvious answer would be that it's very hard work to reliably redirect millions upon millions of links every day. Why should Twitter do it? Bit.ly is redirecting 50 million clicks a week right now, up from only 15 million per week just five weeks ago. Now that the relationship with Twitter is as close as it is (Twitter was the source of only about 50% of the traffic through Bit.ly before), we can expect that number to grow even faster. We hear that last month Bit.ly moved off Amazon servers and is quickly adding servers of its own. Update: One trusted industry source speaking on the condition of anonymity now tells us that Bit.ly servers "were moved into Twitter's racks months ago in preparation for this change."
Five weeks ago we wrote about Bit.ly receiving venture funding from some of the hottest investors in the web business: investors like Ron Conway, an early Google investor, Mitch Kapor (the founder of Lotus) and rock star startup investor Jeff Clavier. Bit.ly's former parent company, Betaworks, got money from O'Reilly AlphaTech Ventures, the venture firm associated with Tim O'Reilly, one of the fathers of Web 2.0. They took this money in part because they needed it. It's not easy to do what Bit.ly does. There's good reason for Twitter not to reproduce that work.
If it makes sense to have a specialist team redirect the links, it makes even more sense to have someone else index the content of those linked pages. Remember, Twitter was started as a group SMS service. They keep it as simple as possible over there and let others do the most complicated work. If there was ever a startup that does not suffer from "Not Built Here" disease (meaning an inability to integrate other people's work), it is Twitter.
We believe this is going to be a business deal more than a technology play. Even Santosh Jayaram, the Googler turned Twitter exec who started this whole discussion last night, has an MBA and a resume full of business development jobs. He's being described as "Operations at Twitter" but his LinkedIn profile lists his current position as VP of Business Operations.
Twitter may be processing its own trends data on site now, but that's analysis of very short bursts of text and it's using technology built by a startup the company acquired - Summize. We expect them to do the same thing with linked-page indexing and analysis.
One of the reasons that Twitter is so interesting is that with a 140 character limit, every word counts. When it comes to the pages people share links to on Twitter, that's not necessarily the case. Enter semantic analysis of those pages, something Bit.ly is currently doing with the help of the Reuters Calais system. Bit.ly serves up links to Calais and gets back a list of the keywords and concepts that the linked-to pages are actually about. Think of it as machine-performed auto tagging with subject keywords. This structured data is much more interesting than the mere presence of search terms in a full text search.
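The difference between concept tagging and plain full-text matching can be sketched with a toy example. This is a minimal Python sketch, not the Calais API or Bit.ly's actual pipeline: the `CONCEPTS` lexicon, the `tag_concepts` and `full_text_match` functions, and the sample sentence are all hypothetical, and simple phrase lookup stands in for real semantic analysis.

```python
# Hypothetical concept lexicon: concept label -> phrases that signal it.
# A real semantic service relies on far richer language analysis than this.
CONCEPTS = {
    "Company/Twitter": ["twitter", "tweet"],
    "Topic/URL shortening": ["url shortener", "short link", "bit.ly"],
    "Topic/Search": ["search engine", "index"],
}


def tag_concepts(text):
    """Return the sorted concept labels whose trigger phrases occur in text."""
    lowered = text.lower()
    return sorted(
        concept
        for concept, phrases in CONCEPTS.items()
        if any(phrase in lowered for phrase in phrases)
    )


def full_text_match(text, term):
    """Plain full-text search: is the literal term present in the text?"""
    return term.lower() in text.lower()


# Hypothetical page text, invented for illustration only.
PAGE = "Bit.ly indexes the pages behind every short link people tweet."

if __name__ == "__main__":
    print(tag_concepts(PAGE))
    print(full_text_match(PAGE, "Twitter"))
```

The sample page never contains the word "Twitter," so a full-text search for that term misses it, while the tagger still files the page under a Twitter concept. That is the sense in which structured tags can say what a page is about rather than which words it happens to contain.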
Bit.ly has had semantics in its sights since day one. It's also very strong in real-time statistical analysis. It's reminiscent of the Twitter-acquired search engine Summize, which was grounded in a background of sentiment analysis and brought real-time to the game as well. We wouldn't be surprised to see sentiment analysis and semantics, both of which are very hard to do, become a part of the API that Twitter offers outside developers in the future.
Several people have mentioned in the last 24 hours that Bit.ly and Twitter have common investors. Bit.ly came from a small New York incubator called Betaworks, which is also an investor in the most popular Twitter client, Tweetdeck. Betaworks was also an investor in Summize, the search engine that Twitter acquired. That means Betaworks owns some stock in Twitter as well. That stake is probably relatively small, not enough to make a deal happen but more than enough to facilitate introductions between friends.
The actual story behind the scenes is no doubt much more complicated than this. Bit.ly's John Borthwick told us this morning that Bit.ly is working on part of this development but Twitter is too. Several other companies are testing some kind of API program already, so it may not be Bit.ly, or just Bit.ly, that becomes the center of this story long term. We've heard in the last week from more than one company working on something like this with Twitter. Given the relatively small venture financing it has taken, Bit.ly would be far and away the cheapest of the candidates for Twitter to pick up.
OneRiot is one of those other companies. They will unveil a related but broader technology early next week (watch this space) and they too have an investor in common with Twitter (Spark Capital).
Competition over the deep real time search space has got to be heating up for all the players, though. "Four search companies have approached us over the past eight weeks," Bit.ly's Borthwick tells us.
Disclosure: Reuters Calais, the company doing semantic analysis for bit.ly and other companies, is an RWW sponsor.
Comments
All of the above is encouraging to someone like me, who fears @ev will swoon at a billion dollars cash being waved in his face, and sell.
Sounds like all the hard-boiled business going on is in the back end (branded channels for the different carriers, buying Tweetdeck and Bit.ly like he did Summize, etc.), leaving all us long-suffering end users immune from evil.
I hope some lawyers chime in here and comment on how it is possible for Twitter to cut all sorts of B2B deals and not inflict any misery on us end users.
because twitter is a little bitch.
Posted by: Sean Canton | May 7, 2009 10:09 AM
Great article. One problem is filtering what is reputable.
Posted by: Guias Local | May 7, 2009 10:10 AM
What's interesting to me is that Bit.ly covers much more than Twitter. So, even if Twitter has amazing tech, it still doesn't appeal to me as much as a more comprehensive crawl of social media.
Justin, that's a good point. Bit.ly could be Twitter Search's ticket out of the Tweet-o-sphere alone. The same could be said for the other companies working on this functionality, but I hadn't thought of it that way at all. The news could be not just that Twitter allows search of linked pages on Twitter, but that Twitter goes real-time search of the whole wide web. Hotness!
If you haven't tried Mark Carey's Twitter Real-time On Google Greasemonkey script, by the way, I'd highly recommend it. It puts Twitter search results for your query at the top of every Google search results page and it's changed my use of Google more than anything has since I discovered Custom Search Engines.
Great, I'm glad they're moving to Bit.ly for the default shortener. I've been using Bit.ly myself for a while because of the additional information they give you about the use of the links.
We saw Twitter acquire Summize, so it's not out of character for them to acquire Bit.ly as well. Great; let them focus on their work and leverage the good work others have done.
Now, can we start talking about a Tweetdeck acquisition???
Andrea, a Tweetdeck acquisition would seem pretty sweet to me, but imagine how it would make Seesmic, Twhirl, Twitterrific, etc. feel. That would probably do a lot of harm to the developer ecosystem, as people wouldn't want to create competing businesses if one of them was going to get brought in house. Or maybe people would take that gamble and hope it was them! Next up, Tweetmeme?
Hi Marshall,
I disagree with your point that crawling content does not matter.
Twitter wants to make money, and for that they need highly organized content.
By crawling links, they are indexing data which might be useful for running contextual advertising banners on Twitter.
Also, spammy content can get buried.
Are my points valid?
Varun, that's a good point. I find structured data about key concepts in content more interesting, and probably just as monetizable if not more so though.
OK,
I'm giving OpenCalais a shot :-)
Bit.ly isn't grabbing metadata for a lot of my links. Any idea why? Do I need to turn something on to get metadata?
@dexin it will, which will lead to spammers overwhelming Twitter, just as happened with e-mail.
Brenden, there are probably some scaling issues being dealt with right now. Hopefully in time your pages will be indexed. We'll see!
I recently launched http://tweetlinx.com, which does index shared links and examine semantic content: see http://tweetlinx.com/markhuetsch for links that are in my friends' status stream, for example. I also built a page for 'ReadWriteWeb fan': http://tweetlinx.com/readwritewebfan - Hope you'll check the site out.
With new users defecting at 60% per month and WordPress coming out with a Twitter-like extension, Twitter needs constant hype. Santosh Jayaram is just feeding that so that social media marketers and SEO guys have another reason to visit Twitter.
What the Heck? Spending so much time on Twitter and they are not even Indexing those links.
His idea is just ridiculous and that thing about ranking system OMFG! It's probably going to be another Selfish promotional tool of Twitter which is picked by Biz and Ev and the rest of the group. See if you were Sarah Lucy and give BJ to Ev you would be on that "suggested users" list which gives you about 10K followers per day.
Twitter is like freaking Apple community. It sucks but we brag about it.
Google is starting to index tweets
Thanks for the pro-Bit.ly comments.
Just to make sure all you Bit.ly users know - we do have a community Uservoice forum - and we do listen! It's located here: http://bit.ly/ARUYX
The suggestion area is how we come up with many of our improvements to the service. We are growing by leaps and bounds, so while our smart engineers have thought of a lot of items, there are ideas that come straight from the above Uservoice forums.
This is the place (Uservoice forum) where you, the Bit.ly community, get to vote and let our product manager know what is truly important to you and what should be a priority.
Thanks for the great analysis Marshall and thank you Bit.ly users for spreading the love!
Rex
Bit.ly Community Mgr.
Marshall,
Very insightful analysis and connecting of the dots. But Google is the sleeping gorilla that may not make Twitter et al.'s supremacy so easy. Other URL shorteners are also doing their bit, although the Bit.ly user interface and the information they provide from link data are impressive.
Second point, on semanticizing posts: social media is still quite noisy, and when you semanticize out of context, you end up with a mess. Noise on top of mess yields more noise. Then we have to re-filter some other way.
William, all good points. Thanks.
Thanks Marshall for mentioning Mark Carey's tool. It's useful.
Have you tried Search Cloudlet (a Firefox add-on)? It builds auto-clouds for Twitter and Google searches.
There is an abundant and ever increasing amount of spam on Twitter.
Thus the world will get a spam indexation and retrieval system?
Maybe Twitter should also be outsourcing its avatar serving, as it has been broken for weeks if not months. If they can't reliably get avatars right then URL shortening is definitely beyond them.
Does anyone know if Google is likely to see the links posted on Twitter and take them into account for a website's PageRank? If so, would this apply both to sites posted as a direct link and to sites posted through a URL shortener?
I have been using Twitter for a few months now (username: greensurrey) and would like to know if it's worth putting up a few more links to my site. I see many companies do it, so I assume so, but I'm not sure.
adam