andrewducker | Distributed Twitter

Can anyone see a way that a distributed Twitter could work?

Obviously you could use RSS to aggregate tweeters you like into one place - but that only covers very basic functionality. For instance, I can't see a way to both decentralise it _and_ allow for hashtag filtering over the whole database to find the tweets you like.

Any thoughts?

From:

andlosers.livejournal.com

Have you seen Identi.ca and Laconi.ca?

Also see OpenMicroBlogging.

From:

andrewducker

I knew there were similar things to Twitter out there, but I can't see how they solve the problem.

As I said in my post, individual aggregation isn't a problem - the problem is that with Twitter you can watch (for instance) #andrewducker to see all Twitter posts about me. If they're spread across 500 microsites then you can't do that.

From:

red-phil.livejournal.com

Yes you can, you just have to search 500 sites.

The problem is finding what sites exist at any one time, especially if they are hosted on people's home PCs and are liable to drop off the net at random.

From:

andrewducker

And keep track of those sites, and add new ones in as they appear, losing them as they disappear.

Doing all of that in a seamless manner is not going to be easy. NNTP did a pretty good job though.

From:

drplokta

It's a tricky one, because the data's not much larger than the metadata, so there's not much point in maintaining complete copies of the metadata at each site while sharding the data.

In theory, your searches could start on one server and then traverse a tree of other known servers, and the server where your search begins could consolidate the results and hand them to you, but it would make searching a bit slow and unreliable. It would also make them generate a lot of network traffic.
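A minimal sketch of that tree-walking search, assuming a toy in-memory network. The `SERVERS` table, the `.example` hostnames and the `max_hops` limit are all made up for illustration; a real version would make a network call per server, which is where the slowness and traffic come from.

```python
from collections import deque

# Hypothetical stand-in for a network of microblog servers.
# Each server holds its own posts and knows the names of a few peers.
SERVERS = {
    "a.example": {"posts": ["lunch #food", "hello #andrewducker"],
                  "peers": ["b.example", "c.example"]},
    "b.example": {"posts": ["#andrewducker is posting again"],
                  "peers": ["c.example", "d.example"]},
    "c.example": {"posts": ["nothing to see"], "peers": ["a.example"]},
    "d.example": {"posts": ["late reply #andrewducker"], "peers": []},
}

def federated_search(start, term, max_hops=3):
    """Breadth-first walk over known peers; the start server consolidates matches."""
    seen = {start}
    queue = deque([(start, 0)])
    results = []
    queries = 0  # how many servers had to be asked
    while queue:
        name, hops = queue.popleft()
        server = SERVERS.get(name)
        if server is None:
            continue  # server has dropped off the net
        queries += 1
        results.extend((name, post) for post in server["posts"]
                       if term in post.split())
        if hops < max_hops:
            for peer in server["peers"]:
                if peer not in seen:
                    seen.add(peer)
                    queue.append((peer, hops + 1))
    return results, queries

results, queries = federated_search("a.example", "#andrewducker")
```

Every search touches every reachable server, so the cost of one query grows with the size of the network.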

Alternatively, each post could go to multiple sites -- one for the user, one for each other user mentioned with an @user, and one for each hashtag used. Then it would be possible to search by user, @user or hashtag because you'd know which server to send the search to.
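As a rough sketch of that routing, assuming a post is just an author name plus text: the function below works out which index keys a post would be copied to. The `user:`, `mention:` and `tag:` prefixes and the `routing_keys` name are invented for the example, and which server owns each key is left open.

```python
import re

def routing_keys(author, text):
    """Return the keys a post would be filed under: its author, each @mention, each #hashtag."""
    keys = ["user:" + author.lower()]
    # (?<!\w) stops things like email addresses being read as mentions.
    for kind, name in re.findall(r"(?<!\w)([@#])(\w+)", text):
        prefix = "mention:" if kind == "@" else "tag:"
        key = prefix + name.lower()
        if key not in keys:
            keys.append(key)
    return keys

keys = routing_keys("Fred", "@andrewducker have you seen #Laconica? #laconica")
```

A search by user, @user or hashtag then only has to go to whichever server is responsible for that one key.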

From:

andrewducker

Who would run the hashtag servers? A hashtag is entirely arbitrary after all.

From:

nameandnature

It's Usenet: flood fill between servers, hashtags are groups and "all tweets from Fred" is a group, so hashtagged posts from Fred are crossposts to Fred's group and the hashtag group.
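A toy sketch of the flood-fill idea, not real NNTP: each server remembers the message IDs it has seen, files a new message under every group it was posted to, and passes it on to its peers. The `Server` class, the group names and the message ID format are assumptions made up for the example.

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.peers = []
        self.seen_ids = set()
        self.groups = {}    # group name -> list of (message id, text)
        self.offers = 0     # how many times this server was offered a message

    def peer_with(self, other):
        """Set up a two-way peering arrangement."""
        self.peers.append(other)
        other.peers.append(self)

    def accept(self, msg_id, groups, text, sender=None):
        self.offers += 1
        if msg_id in self.seen_ids:
            return  # duplicate: already have it, so the flood stops here
        self.seen_ids.add(msg_id)
        for group in groups:  # a crosspost is one message filed under several groups
            self.groups.setdefault(group, []).append((msg_id, text))
        for peer in self.peers:
            if peer is not sender:
                peer.accept(msg_id, groups, text, sender=self)

a, b, c = Server("a"), Server("b"), Server("c")
a.peer_with(b)
b.peer_with(c)
c.peer_with(a)
# Fred's hashtagged post is a crosspost to his own group and the hashtag group.
a.accept("<1@a.example>", ["user.fred", "tag.iranelection"], "Hello #iranelection")
```

Every server ends up with one copy of the post in each group, however many routes it could have arrived by.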

This isn't completely distributed as there are still servers and clients.

From:

andrewducker

But that's fine, because anyone can run a local server for NNTP.

Not a bad idea. I wonder how high the traffic is.

From:

nameandnature

Rather less than the real Usenet, I'd've thought. More thoughts:

You probably don't actually want NNTP and the Usenet message format, you just want to steal their ideas. Bonus points for making it use Google Wave.

You'd want to fix Usenet's current forgery and cancel problem with magical crypto fairy dust.

Anyone can run a leaf node, but you need a bunch of peers which flood to each other. Arranging peering is a manual step in NNTP. It'd be nice if this weren't the case in the distributed system, but sorting that out is a hard problem.

You don't want special clients for it, because people don't care enough to install one. Your local server's job is to present the messages it knows about as web pages, Atom feeds and so on.

From:

andrewducker

Preventing forgery is a problem - but one that's solved so long as site owners want to keep spam off of their own site (which they do) and people are willing to pass data along.

The Usenet model, where I pass data to Bob (who trusts me), who passes it to Charles (who trusts him), who passes it to Dave (who trusts him) works pretty well, so long as everyone along the chain has a vested interest in their own reputation - and thus the reputation of their members.

I'm also in favour of manual peering - and you'll end up with some central points and lots of branches and leaves off of that.

From:

nameandnature

I'm guessing servers won't be run by individuals who trust each other but by service providers like Twitter (or ISPs if such a thing became popular, like email is and news used to be). Providers probably take money from users (or show them ads) but they don't trust them. If I can inject a tweet saying "andrewducker is wearing baggy pants today" at my local server and have it propagate everywhere, you might get a bit annoyed. Hence magical crypto fairy dust.

From:

call-waiting.livejournal.com

Sounds like you need a Distributed Hashtag Table.

Just kidding. Not really.</dodgeball>
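For what it's worth, the key-to-server part of a hash table like that can be sketched with consistent hashing: hash both the server names and the hashtag onto a ring, and the hashtag belongs to the next server round the ring. This is only a sketch with an assumed fixed list of nodes (the `HashRing` class and `.example` names are made up); a real DHT also has to find nodes without a global list, which this skips.

```python
import bisect
import hashlib

def _h(key):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes):
        self._ring = sorted((_h(node), node) for node in nodes)
        self._points = [point for point, _ in self._ring]

    def node_for(self, key):
        """Return the first node at or after the key's position, wrapping round."""
        i = bisect.bisect_right(self._points, _h(key)) % len(self._ring)
        return self._ring[i][1]

nodes = ["n1.example", "n2.example", "n3.example"]
ring = HashRing(nodes)
owner = ring.node_for("#andrewducker")
```

Nobody has to volunteer to run "the #andrewducker server": whichever node the hash lands on holds that tag, and if some other node leaves, the tag stays where it was.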

From:

matgb

Hashtags were created independently of Twitter and initially you had to go to places like hashtags.com to search for them anyway.

I already have a live bookmark with an RSS of incoming link referrals and mentions of my name that comes from blog pinging servers.

All you need is a microblogging ping server and a decent search setup for complete aggregation—search for @andrewducker or whatever and it'll show all entries on all sources that are pinging your engine of choice.

Automattic already run a free pinging service for blogs (that I keep meaning to link to). A similar service would mean your host only has to ping one repository, which then pings all the search setups, etc.

Easy.

Completely beyond my technical skills, but the technology is already there.

From:

andrewducker

I can't see that scaling well beyond a few services. The number of pings grows with the square of the number of services - 5 services pinging each other works, 500 doesn't work so well.
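To put rough numbers on that, here is the arithmetic as a sketch. It counts one-way links, assuming every service pings every other in the all-to-all case, and assuming each server keeps a small fixed number of peers in the peered case (3 is an arbitrary figure picked for the example).

```python
def full_mesh_links(n):
    """Every service pings every other service: n * (n - 1) one-way links."""
    return n * (n - 1)

def peered_links(n, peers_per_server):
    """Each server only talks to a fixed handful of peers: n * k one-way links."""
    return n * peers_per_server

for n in (5, 500):
    print(n, full_mesh_links(n), peered_links(n, 3))
# 5 services:   20 links all-to-all, 15 peered
# 500 services: 249500 links all-to-all, 1500 peered
```

Each post still has to reach every server either way; the difference is in how many relationships each site has to set up and keep track of.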

I like the idea of making it NNTP-like more - where data sloshes around the system.

From:

matgb

"I can't see that scaling well beyond a few services."

But it already does work for blogging. How many blogging platforms are there out there? I can track incoming links from all over the shop very easily, and search for specific terms effectively.

They pick up links, terms, tags, etc. You only actually need to ping the search engines anyway. If you're searching on just your site then you only get stuff from there, but if you search at, say, hashtags, you get referrals, tags, etc. from everywhere that's pinging a service that pings hashtags.

From:

andrewducker

"I can track incoming links from all over the shop very easily, and search for specific terms effectively."

Ok, wander over to twitterfall and add "Tehran" to the list, and watch the posts appear in real time.

Now, find me every blog post in the world tagged "Iran" posted in the last five minutes.

From:

matgb

http://www.icerocket.com/search?tab=blog&fr=h&q=iran&x=0&y=0

It won't get everything, for the simple reason that sites like, say, LJ don't automatically ping stuff, but it'll get stuff that wants to be found.

And something like http://hashtags.org is already set up for this sort of thing.

From:

andrewducker

If it doesn't work for everything then it's not really fulfilling the same objective, is it?

Especially as it just leaves you dependent on more centralised solutions - in this case the numerous search engines that need to be pinged.

From:

matgb

Well no, it can't find everything. But neither does every search on Twitter where they're only interrogating their own database.

For a post to be found, it has to want to be found. LJ is notoriously bad at helping people find stuff on the site; DW is at least looking at a pinging service for more than just the very crappy weblogs.com.

The advantage of a distributed model is competition, redundancy and a distributed system making things less prone to fall over.

The disadvantage is that not all the sites competing will talk to each other (without consumer pressure) and you still need to have some people, somewhere, tracking stuff.

If you want your search box to be built into whatever service you're using, then that adds extra weight to the spec: search requires a central repository of some sort. Otherwise your server will need to track and remember everything tweeted on every service globally, just in case you want to search for that.

Which I suspect will actually happen at some point, but not for a while.

From:

andrewducker

I just don't think that pinging gives you everything you need.

I'd much rather have a Usenet model of various sites sloshing data around than one that requires every site to ping every other site.

The former scales linearly with the number of sites, the latter with its square.

From:

matgb

But you don't need every site to ping every other site.

You need to have a couple of sites that aggregate pings to other search engines and possibly feed portals. Your site would be grabbing feeds from your subscriptions and would pick up on everything in there, and everything else would come from a search engine, which releases feeds of its results, so you'd get your @replies anyway.

You choose your search engine(s), they sort out the pinging APIs between them.

From:

andrewducker

You need every producing site to ping every search site. Which also means they all have to keep track of every search site out there. Or have an automated method for search sites to register interest. Hmm, that might work.

This means that the search sites end up with copies of everything, but that's not the end of the world - it does mean the producers don't need to have copies of everything.

Ok, I can see that working. What we need is a protocol for making data available and registering to be informed of updates.

Of course, if site A has to provide feeds to 500 search sites then that might also be overloading it - so cache-and-forward might be a better strategy. But that's an implementation detail.
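A minimal sketch of what "registering to be informed of updates" could look like, with plain method calls standing in for network requests. The `Producer` and `SearchSite` classes and the `site-a.example` name are assumptions for illustration, not any existing protocol.

```python
class Producer:
    """A producing site that lets search sites register interest in its updates."""
    def __init__(self, name):
        self.name = name
        self.posts = []
        self.subscribers = []  # callbacks to invoke on each new post

    def register(self, callback):
        if callback not in self.subscribers:  # registering twice is harmless
            self.subscribers.append(callback)

    def publish(self, text):
        self.posts.append(text)
        for notify in self.subscribers:
            notify(self.name, text)

class SearchSite:
    """A search site that indexes every update it is told about, by hashtag."""
    def __init__(self):
        self.index = {}  # lowercased hashtag -> list of (site, text)

    def on_update(self, site, text):
        for word in text.split():
            if word.startswith("#"):
                self.index.setdefault(word.lower(), []).append((site, text))

    def search(self, tag):
        return self.index.get(tag.lower(), [])

site_a = Producer("site-a.example")
search = SearchSite()
site_a.register(search.on_update)
site_a.publish("Protests in #Tehran")
hits = search.search("#tehran")
```

The producer only keeps its own posts and a list of who asked to be told; the search site ends up with the copies. A cache-and-forward relay would be something that registers with producers on one side and accepts registrations on the other.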

From:

matgb

"You need every producing site to ping every search site."

No, you need to have a couple of sites providing a pinging service, like Automattic does with Ping-O-Matic. Then you, the user, choose which pinging service to use.

"Which also means they all have to keep track of every search site out there."

No, the pinging service does that for you.

And most blog platforms already provide feeds for a huge number of search services, sort of—every RSS reader in the world is effectively acting as a search service to an extent.

Actually, given that Twitter already has a huge pile of published APIs, and a lot of people are used to using clients already, the impetus for a distributed model is pretty much already there. All it'd need is for Twitter to have an outage and people'll step in.

From:

andrewducker

Ok, so DuckerTwitter.Com pings the ping site of its choice, and then the search sites use that as a trigger to come get data from DuckerTwitter.com? Or do I send the complete data to the ping site?
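The two options in that question can be sketched side by side. Everything here is hypothetical (the `PingService` and `Indexer` classes, and `fetch` as a stand-in for downloading a site's feed): a ping with no payload is just a trigger and the search site comes to get the data, while a ping with a payload carries the posts itself.

```python
class PingService:
    """Relays pings from producing sites to every registered search site."""
    def __init__(self):
        self.listeners = []

    def register(self, listener):
        self.listeners.append(listener)

    def ping(self, site, posts=None):
        # posts=None: "something changed, come and look".
        # posts=[...]: the complete data travels with the ping.
        for listener in self.listeners:
            listener.notified(site, posts)

class Indexer:
    """A search site; `fetch` is an assumed function that pulls a site's posts."""
    def __init__(self, fetch):
        self.fetch = fetch
        self.posts = []
        self.fetch_count = 0

    def notified(self, site, posts=None):
        if posts is None:
            self.fetch_count += 1
            posts = self.fetch(site)  # trigger only, so go and get the data
        for post in posts:
            if (site, post) not in self.posts:
                self.posts.append((site, post))

site_data = {"duckertwitter.com": ["first post"]}
indexer = Indexer(lambda site: site_data[site])
service = PingService()
service.register(indexer)
service.ping("duckertwitter.com")                   # trigger: indexer fetches
service.ping("duckertwitter.com", ["second post"])  # data sent with the ping
```

With the trigger-only version the producing site gets hit once per search site per update; sending the data along moves that load onto the ping service.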
