Archive for RSS

Thoughts on Twitter architecture and pricing

Om Malik wrote an interesting post about Twitter pricing yesterday, but I think he’s a little off. I don’t blame him, considering his background is not computer science. And besides, it started a really interesting conversation. Before we start talking about Twitter pricing plans, we need to agree on what is technically hurting Twitter. Ideally, scaling issues should be orthogonal to your business plan; if you are successful, lots of people use your product, and that’s a good problem to have. Generally, you don’t want to tax your best users.

So on to the technology. Here’s the clue that we’ll start with:

Twitter is, fundamentally, a messaging system. Twitter was not architected as a messaging system, however. For expediency’s sake, Twitter was built with technologies and practices that are more appropriate to a content management system.

From Twitter’s post on architecture and the problems they are facing 

When I read “content management system”, I’m thinking “blogging platform”. My guess is that Twitter is built to be a massively multi-user blogging and blog reading system - every user gets a blog to publish posts with and a blog reader to aggregate the posts of their friends. Considering Evan Williams co-founded Blogger, I think that’s a pretty reasonable guess.

So if you think of it that way, then the obvious way to architect the system is publishing via RSS and aggregating via RSS. When you write a new tweet, your message gets stored in the database. (Yes, shoving all of that data into a database is a really difficult engineering problem in itself. Assuredly they will partition across multiple databases if they don’t already.) The massive pain comes in when pulling in your friends’ tweets. Let’s talk through how it works. Your Twitter homepage is acting like an RSS reader, so first it will look up all of the feeds it needs to check - one for each person you follow. Then, for every person you follow, an RSS feed will be read or generated. The resulting set of RSS feeds will be merged back together and sorted chronologically. The result is your Twitter homepage.
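Here is a minimal sketch of that read path, assuming the guess above is right. It is not Twitter’s actual code: the names FOLLOWING, FEEDS and build_homepage are made up for illustration, and plain dictionaries stand in for the database.

```python
import heapq
from itertools import islice

# Hypothetical in-memory stand-ins for the database; all names are illustrative.
FOLLOWING = {"alice": ["bob", "carol"]}

# One feed per user: (timestamp, author, text) tuples, newest first.
FEEDS = {
    "bob": [(105, "bob", "lunch"), (101, "bob", "hello")],
    "carol": [(103, "carol", "shipping today")],
}


def build_homepage(user, limit=20):
    """Pull model: read one feed per followed user, then merge by time."""
    # Step 1: look up everyone this user follows and fetch each feed.
    feeds = [FEEDS.get(followed, []) for followed in FOLLOWING.get(user, [])]
    # Step 2: merge the feeds. heapq.merge expects every input to be sorted
    # already; reverse=True matches the newest-first order of each feed.
    merged = heapq.merge(*feeds, key=lambda tweet: tweet[0], reverse=True)
    # Step 3: keep only the newest `limit` tweets for the page.
    return list(islice(merged, limit))


if __name__ == "__main__":
    for timestamp, author, text in build_homepage("alice"):
        print(timestamp, author, text)
```

The thing to notice is that the work in step 1 grows with the number of people you follow, and it is repeated on every page load.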

Notice that this is what is called a “pull” or “poll” model - you check for new posts whether or not there are any. This can generate a ton of unnecessary load on servers and databases, not to mention network traffic costs. Third-party Twitter applications make it worse: they constantly poll Twitter to see if there is anything new to display. Ping, ping, ping. All to see if there is something new afoot.
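To get a feel for how much of that polling is wasted, here is a toy simulation. The function name, the one-minute interval and the tweet times are all assumptions picked for illustration, not measurements of any real client.

```python
def count_polls(tweet_times, poll_interval, duration):
    """Return (total_polls, empty_polls) for a client polling on a fixed interval.

    tweet_times, poll_interval and duration are all in seconds.
    """
    total = 0
    empty = 0
    last_seen = 0
    for now in range(poll_interval, duration + 1, poll_interval):
        total += 1
        # Tweets that arrived since the previous poll.
        new = [t for t in tweet_times if last_seen < t <= now]
        if not new:
            empty += 1  # the server did the work, the client learned nothing
        last_seen = now
    return total, empty


if __name__ == "__main__":
    # Two tweets in ten minutes, polled once a minute.
    total, empty = count_polls([70, 500], poll_interval=60, duration=600)
    print(f"{empty} of {total} polls found nothing new")
```

With these made-up numbers, eight of the ten polls come back empty.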

Which brings us around to pricing. It is not, as Om suggested, Scoble’s fault for having 25,000 people following him. The cost is not sending one of his messages 25,000 times. No, actually it’s Scoble’s fault for following 21,000 people and constantly checking for new tweets from those people. It’s also the fault of power users like him using applications that aggressively use the Twitter API to check for new tweets - most likely the same people who use those applications are following large numbers of people.
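A back-of-the-envelope comparison makes the point concrete. The follower and following counts are the ones quoted above; the tweet rate and refresh rate are pure assumptions for illustration, as are the function names.

```python
def push_deliveries(followers, tweets):
    """Deliveries if each tweet were sent once to every follower."""
    return followers * tweets


def pull_feed_reads(following, refreshes):
    """Feed reads if every refresh checks one feed per person followed."""
    return following * refreshes


if __name__ == "__main__":
    # Assumed rates for one hour: 5 tweets written, and an application
    # refreshing the timeline once a minute.
    print("push deliveries:", push_deliveries(followers=25_000, tweets=5))
    print("pull feed reads:", pull_feed_reads(following=21_000, refreshes=60))
```

Under those assumptions the reading side costs ten times as much as the writing side, and most of those reads find nothing new.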

As with all scaling problems, the first idea is “cache more!”. And sure, you can cache the heavy Twitter producers. But Scoble isn’t following just the big Twitter users - he’s following everyone he can, because that’s how he believes he can get an edge on news and trends. Can the long tail be cached? Doubtful - there are too many users who fall into that category. Can you charge those who follow more than, say, 1,000 people? Maybe $10 a month for every thousand people you follow, with the first 1,000 free? That could work, but it’s risky. Would Scoble, in the face of paying $200 a month, permanently switch to Pownce? Or Friendfeed if they built a Twitter clone? How many would follow?
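Taken literally, that pricing idea is a small formula. In the sketch below the function name is made up, and rounding a partial thousand up to a full block is my assumption, since the proposal doesn’t say. With the first 1,000 free, someone following 21,000 people pays for 20 billable thousands, or $200 a month; without the free tier it would be $210.

```python
def monthly_fee(following, free=1_000, block=1_000, price_per_block=10):
    """Monthly fee in dollars for a given number of people followed."""
    billable = max(0, following - free)
    # Assumption: a partial block is billed as a full one (ceiling division).
    blocks = -(-billable // block)
    return blocks * price_per_block


if __name__ == "__main__":
    for following in (500, 1_000, 1_001, 21_000):
        print(following, "followed ->", monthly_fee(following), "dollars a month")
```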

The solution, of course, is to do exactly what Twitter says they are doing - switch to a different model and scale horizontally (“throw more machines at it”). I’m interested to see how it turns out for them.

Comments (3)

Regarding the Opaque Value problem.

First, read this.

The Opaque Value Problem (or, Why do people use Twitter?)

Thanks. This is important, and most people over the age of 25 don’t understand this. (Uh oh, I’m not bringing up the age question again, am I?)

Let’s start from a simple statement.

How compelling you find content is directly proportional to how relevant it is to you. The more relevant to you, the more you care.

OK, how about one more simple statement.

The people in your social network are more relevant to you than those who are outside your social network. For more on that, read this.

Let’s mash the last two statements together.

Given that your social network is relevant to you, content generated from your social network is going to be compelling to you. The more content generated from your social network you get, the better.

It’s going to be boring nonsense to everyone else. So what.

Sites need to realize that if they want customers to visit at least once a day, there needs to be a lot of content available for consumption generated from their social network. This is what Facebook does. This is what Twitter does.

How well does your site integrate with my life?

Comments (1)

Google Reader - David’s shared items

Since I’m subscribed to 30-some feeds and read hundreds of articles every day, I’ve decided to start sharing the articles that I think are worth reading with all of you. You can read the web (HTML) version of it by clicking the link above, or subscribe to my RSS feed here. It’s all about the kinds of topics my socialstartups.com blog is about - technology and community and startups. Enjoy!

Comments

I’m late to the Google Reader party

Wow, never going back now. I used the Google Personalized home page forever (or, since it launched), but now that I’ve got so many blogs to keep up with, it was time to give Google Reader a chance. And you know what? Keyboard shortcuts are severely underrated. You just don’t know how great they are until you try them. Makes reading so much faster.

Comments