Top-level
Christine Lemmer-Webber

At any rate, there's a bit of conflation here. "It scales down" by saying "you can have an isolated community/use case that's oblivious to the rest of the system" is categorically distinct from "it scales down" in terms of "a small node can meaningfully participate with the larger system"

Christine Lemmer-Webber replied to Christine

At any rate, the problem with "scaling down" is much clearer when it comes to the problem of "scaling wide".

Or let me put it a different way: ATProto *explodes in complexity* when you try to scale it towards meaningful decentralization

Christine Lemmer-Webber replied to Christine

Yes that's right, we're getting to the spicy part of this conversation. We did the warm-up, now it's time to talk about the real thing: whether decentralization, in the sense I believe people *think* that term means, is reasonably possible with ATProto as it's currently designed

Christine Lemmer-Webber replied to Christine

But before we do that, I need to stretch and run to the bathroom

So for those of you following along, if you found this, Secret Goblin #3, let me know: "👺"

Oops wait actually we gotta talk about that one for a sec there's a reason I left it in scare quotes

Christine Lemmer-Webber replied to Christine

Why on earth is the textual descriptor for Unicode U+1F47A "JAPANESE GOBLIN", does anyone know?

It's a Tengu, right?

Despite being the only actually named "goblin" emoji, I feel awkward about this one because is it correct to call it a "JAPANESE GOBLIN" instead of just "TENGU"?!?!

I don't know!

Christine Lemmer-Webber replied to Christine

If you have knowledge or OPINIONS about "👺", its name choice in unicode, or, for that matter, a white person just dropping it in the middle of a group chat WITHOUT putting it in quotes (I did tho), feel free to derail the comment thread

Otherwise it's time for a

=== STRETCH BREAK ===

Christine Lemmer-Webber replied to Christine

I'm back. It's time to talk about it: does Bluesky/ATProto suffer a "quadratic explosion" as we move from centralization towards *meaningful* decentralization?

I claimed it did, but I was challenged on this. What did I mean? Am I right or wrong?

It's time to find out!

Christine Lemmer-Webber replied to Christine

In the previous blogpost I said the following:

> If this sounds infeasible to do in our metaphorical domestic environment, that's because it is. A world of full self-hosting is not possible with Bluesky.

(cont'd)

Christine Lemmer-Webber replied to Christine

"Decentralized ATProto is quadratic" quote, cont'd:

> In fact, it is worse than the storage requirements, because the message delivery requirements become quadratic at the scale of full decentralization: to send a message to one user is to send a message to all. Rather than writing one letter, a copy of that letter must be made and delivered to every person on earth.

Christine Lemmer-Webber replied to Christine

This was probably the thing I got the hardest pushback on from a member of the Bluesky team, who argued that it is not quadratic as we scale towards decentralization.

Truth be told, I don't have a degree in CS. Most of what I know I learned from studying independently and community resources. Was I wrong?

Debate between myself and why.bsky.team about whether or not ATProto is quadratic as we decentralize it.

We both agree that agency is the most important thing anyway, more on that later.

Christine Lemmer-Webber replied to Christine

Just as a quick aside, regarding that comment about "agency", maximizing the agency of everyone (and more importantly, minimizing subjection!) sits at the heart of my ethical framework fossandcrafts.org/episodes/11-

So I don't disagree on that part, but that's an aside!

Christine Lemmer-Webber replied to Christine

Now, I said I won't read replies until I am done summarizing things, and that's true, so maybe someone has gone out of their way and proven that I am wrong, that the claims in my article are factually incorrect and so on and so forth. I wouldn't know yet.

But... I don't think I'm wrong.

Christine Lemmer-Webber replied to Christine

As I said, I'm very self-conscious about these things because I *don't* have formal CS training. But I do a lot of research, so I've tried to become knowledgeable about these things, and this *seemed* like the correct analysis to me

Because of that, I turned to people who actually knew more than me

Christine Lemmer-Webber replied to Christine

For one thing I derailed the entire Spritely morning standup by walking everyone through the scenario. I gave the story example, which I'll detail later.

But @dthompson didn't find the story helpful, too much narrative detail. "I need to work through this example independently." So he did.

Christine Lemmer-Webber replied to Christine

@dthompson came back and laid it out in more formal terms and said I was right.

But I was still nervous, so I called up one of my old MIT AI Lab type friends and rambled about it to them on a call. What did they think?

Christine Lemmer-Webber replied to Christine

"I think it's pretty clear immediately that it's quadratic. This is basic engineering considerations, the first thing you do when you start designing a system," they said.

Well that's a relief, why isn't it clear to everyone else, I asked?

So they suggested I lay it out to you as I did to them.

Christine Lemmer-Webber replied to Christine

Let's start with the following:

- ATProto has positioned itself as "no compromises on centralized use cases". Well, in that case, let's say it can't do *worse* than eg ActivityPub. This includes with replies. You can't do *worse* than ActivityPub on replies and mentioning someone, etc.

Christine Lemmer-Webber replied to Christine

- We will interpret the most centralized system as one where there's only one provider for storage and distribution of all messages: the least user participation
- The other end of the spectrum, maximum decentralization, has the *most* participation: every user self-hosts.

Christine Lemmer-Webber replied to Christine

- Just as blogging is decentralized but Google (and Google Reader) are not, it is not enough to have just PDS'es in Bluesky be self-hosted. When we say self-hosted, we really mean self-hosted: users are participating in the distribution of their content.

Christine Lemmer-Webber replied to Christine

- We will consider this a gradient. We can analyze the system from the greatest extreme of centralization which can "scale towards" the greatest degree of decentralization.

Christine Lemmer-Webber replied to Christine

- Finally, we will analyze both in terms of the load of a single participant on the network but also in terms of the amount of network traffic as a whole.

Okay. That is the structure we will use for our analysis. Let's compare "message passing" vs ATProto-style "global public shared heap".

Christine Lemmer-Webber replied to Christine

So okay. Let's get the CS notation out of the way:

"Message passing" at full decentralization:
- O(1) from a single node's perspective
- O(n) from a whole-network zoom-out perspective (inherent: add a user, it's one more user)

Okay, that's reasonable and what you'd expect

Christine Lemmer-Webber replied to Christine

"Public global no-missed-messages (or not worse than AP) shared-heap" ATProto style at full decentralization:
- O(n) from a single user's perspective (!)
- O(n^2) from a whole-network perspective (!!!!!!)
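
To make the notation concrete, here is a minimal sketch of both cost models. It assumes the simplified setup used in this thread (every user self-hosts, and each user sends one message per day with one intended recipient). The function names are hypothetical, made up for illustration, and are not part of ATProto or ActivityPub.

```python
def message_passing_load(n: int) -> tuple[int, int]:
    """Return (messages received per node, messages delivered network-wide) per day."""
    per_node = 1            # a node only receives the message addressed to it
    network = n * per_node  # one delivery per user: O(n)
    return per_node, network


def shared_heap_load(n: int) -> tuple[int, int]:
    """Same pair of numbers when every node must ingest every message."""
    per_node = n            # a node receives everything posted that day: O(n)
    network = n * per_node  # n nodes each receiving n messages: O(n^2)
    return per_node, network


for n in (10, 100, 1000):
    print(n, message_passing_load(n), shared_heap_load(n))
```

At n = 1000 the message-passing network delivers 1,000 messages a day while the shared-heap network delivers 1,000,000.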

Oof I'd better back this up because that ain't good!

Christine Lemmer-Webber replied to Christine

In other words, as our systems get more decentralized, message passing handles things fine. Individual nodes can participate in the network no matter how big it gets. The zoom-out cost for the network as a whole grows only linearly as we add more users, and it doesn't get worse as we move more users towards self hosting.

Christine Lemmer-Webber replied to Christine

Things are NOT good, if I'm correct above, as we make things more decentralized in the atproto-public-shared-heap model. The more self-hosting there is, and indeed the more "full nodes" join, the more expensive it gets for each of the nodes, and the network EXPLODES!

Truly self-hosted atproto is NOT POSSIBLE!

Christine Lemmer-Webber replied to Christine

And there is no solution to this without adding directed message passing. Another way to say this is: to fix a system like ATProto to allow for self-hosting, you have to ultimately fundamentally change it to be a lot more like a system like ActivityPub!

Christine Lemmer-Webber replied to Christine

Now I left more of the precise analytical explanation in my blogpost. But social media isn't great for that, so go check out my blogpost if you want to go through all that (eg if you're more like @dthompson and less like me, I'm a narrative person) dustycloud.org/blog/re-re-blue

Christine Lemmer-Webber replied to Christine

Here's our story:
- We have 26 users: [Alice, Bob, Carol, ... Zack].
- Each user sends one message per day, which is intended to have one recipient. (This may sound unrealistic, but it's fine for modeling.)
- Each user sends a message in a ring: Alice => Bob, Bob => Carol, ... Zack => Alice

Christine Lemmer-Webber replied to Christine

Now just before you say "wait but ATProto isn't for DMs", yes, but one way this could happen is that eg Bob follows Alice, Carol follows Bob, etc.

What I'm saying is, messages can have an "intended audience". That's what we're using here.

Christine Lemmer-Webber replied to Christine

Before we get into this, remember, the main difference between "message passing" and the "shared heap" is the former has directed and delivered messages, the latter does not. See prev blogpost for explainer.

So, what happens in a day for both systems? Because that's what we really want to find out.

Christine Lemmer-Webber replied to Christine

Under message passing, Alice sends her message to Bob. Only Bob need *receive* the message. So on and so forth.

- For an individual self-hosted node, messages passed per day: 1.
- Per the decentralized network, total messages passed zooming out: 26.

That's about what we'd expect.

Christine Lemmer-Webber replied to Christine

Under the public-god's-eye-view-shared-heap model, each user must know of all messages to know what may be relevant. Each user must *receive* all messages.
- For an individual self-hosted server, messages that must be received per day: 26.
- Zooming out on the whole decentralized network: 26 * 26 = 676!
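
The day in the story can be replayed in a few lines. This is a rough sketch under the story's assumptions only: the 26 users are stood in for by the letters A to Z, and the two `deliver_*` helpers are hypothetical names, not real protocol APIs.

```python
import string

users = list(string.ascii_uppercase)  # 26 stand-ins for Alice .. Zack

# Each user posts one message per day, intended for the next user in the ring.
messages = [(sender, users[(i + 1) % len(users)]) for i, sender in enumerate(users)]


def deliver_message_passing(messages, users):
    """Directed delivery: only the intended recipient receives each message."""
    inbox = {u: [] for u in users}
    for sender, recipient in messages:
        inbox[recipient].append((sender, recipient))
    return inbox


def deliver_shared_heap(messages, users):
    """Shared heap: every self-hosted node ingests every message, then filters locally."""
    inbox = {u: [] for u in users}
    for msg in messages:
        for u in users:
            inbox[u].append(msg)
    return inbox


def totals(inbox):
    """Return (most messages received by any one node, messages received network-wide)."""
    per_node = max(len(received) for received in inbox.values())
    network = sum(len(received) for received in inbox.values())
    return per_node, network


print(totals(deliver_message_passing(messages, users)))  # (1, 26)
print(totals(deliver_shared_heap(messages, users)))      # (26, 676)
```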

Christine Lemmer-Webber replied to Christine

Sounds survivable with 26 users though, right?
Let's try just adding 5 more users.

Message passing:
- Per node per day: no change.
- Per the network: 5 more messages.

Public god's-eye-view-shared-heap model:
- Per node per day: 5 more messages
- Per the network: (31 * 31) - (26 * 26) = 285 more messages!

Christine Lemmer-Webber replied to Christine

Now, could we handle a million self hosted users? Is it possible? No problem in message passing. EXPLOSIVE with atproto.

What if we had a million users and added just 5 more? How many more messages must the network bear?

5 new messages in message passing.
*10,000,025* new messages sent in atproto!
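
If you want to check that arithmetic yourself, here is a small sketch. It assumes the same model as the story (one message per user per day), and the function names are hypothetical. Going from n to n + k users in the shared heap adds (n + k)^2 - n^2 = 2nk + k^2 deliveries, so the cost of each new user grows with the size of the network it joins.

```python
def added_deliveries_message_passing(n: int, k: int) -> int:
    """Extra daily deliveries network-wide when k users join n existing ones."""
    return (n + k) - n  # always just k


def added_deliveries_shared_heap(n: int, k: int) -> int:
    """Same question when every node must receive every message."""
    return (n + k) ** 2 - n ** 2  # expands to 2*n*k + k*k


print(added_deliveries_message_passing(26, 5))         # 5
print(added_deliveries_shared_heap(26, 5))             # 285
print(added_deliveries_message_passing(1_000_000, 5))  # 5
print(added_deliveries_shared_heap(1_000_000, 5))      # 10000025
```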

Christine Lemmer-Webber replied to Christine

"Christine that's ridiculous, we're not expecting a million self-hosted users"

Well I think it would be nice!

But regardless, ActivityPub has 27,000 servers on it, all meaningfully participating in the network.

ATProto, in its current design, would be crushed to DEATH

Christine Lemmer-Webber replied to Christine

"But Christine", you may say, "I heard gossip might fix this!"

No. It cannot.

In fact, I was being more generous than a gossip network, and assumed you only *received* a message once.

With gossip you might *receive* more than once.

But you need to receive a message to know it.
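
Here is a toy illustration of that point. It is a naive flooding sketch, not any real gossip protocol, and the topology (a ring where each node also knows the node two steps ahead) is an assumption made up for the example. Each node forwards a message to its peers the first time it sees it, and we count every receive, duplicates included.

```python
from collections import deque


def flood(neighbors, origin):
    """Flood one message from origin; return (nodes reached, total receive events)."""
    seen = {origin}
    receives = 0
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for peer in neighbors[node]:
            receives += 1           # the peer receives a copy, duplicate or not
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)  # a node forwards only on first receipt
    return len(seen), receives


n = 26
# Hypothetical topology: each node gossips to the next node and the one after it.
neighbors = {i: [(i + 1) % n, (i + 2) % n] for i in range(n)}

reached, receives = flood(neighbors, 0)
print(reached, receives)  # 26 52
```

One message reached all 26 nodes but cost 52 receives. Smarter gossip can trim the duplicates, but it can't go below one receive per node that needs the message, which is the number I used above.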

Christine Lemmer-Webber replied to Christine

ATProto was designed for a "big world" view. That's fine! But I'm trying to show seriously what happens if it were actually, really decentralized.

*Every* fully participating node added to the network makes the network explosively more expensive.

ATProto doesn't scale towards decentralization.
