This goes back to the questions I asked before. On the fediverse, we'd isolate the bulk of them from the greater network by suspending their host servers. Does #Bluesky provide tools that allow admins or individual accounts to completely exclude everything from a specific server? Will the indexing servers defederate from known nazi havens? Will the algorithms serve posts from nazi accounts so long as they don't trip the ML flag for hate speech? So many unanswered questions.
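
For comparison, server-level suspension on the fediverse roughly amounts to dropping everything that originates from a listed host. Here is a minimal Python sketch of that idea. The names (`Post`, `SUSPENDED_HOSTS`, `visible_posts`), the handle format, and the domains are all hypothetical, and this is not Mastodon's or Bluesky's actual code:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    author: str  # full handle in an assumed "user@host" format
    text: str


# Hypothetical admin-maintained list of suspended host servers.
SUSPENDED_HOSTS = {"badhost.example"}


def host_of(handle: str) -> str:
    """Return the server part of a user@host handle, lowercased."""
    return handle.rsplit("@", 1)[-1].lower()


def visible_posts(posts, suspended=SUSPENDED_HOSTS):
    """Drop every post whose author lives on a suspended host."""
    return [p for p in posts if host_of(p.author) not in suspended]
```

The open question in the post is whether any layer of Bluesky's architecture exposes an equivalent host-level switch to admins or to individual accounts.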

2 May 2023 at 16:21 | Open on merveilles.town

13 comments

L. Rhodes

If #Twitter does merge into the network, my guess is it will (a) happen fast — like, within the next year, and (b) put an end to the social media reshuffling phase we've been going through since October. #Bluesky instantly reached critical mass; people will get to use "Twitter" while credibly maintaining they're not on Twitter. No fediverse project will be able to count on growth by matching Twitter features. Better to start focusing on what differentiates fediverse services from Bluesky.

2 May 2023 at 17:42 | Open on merveilles.town

L. Rhodes

One hugely important question that I don't see being asked (perhaps because not many people have worked through the implications) is: Who gets to be a #Bluesky indexer? Theoretically, any entity that can maintain one can run an indexing server, but are there controls at the account server layer to allow or deny indexers access to the data coming through that server? What's to stop, say, a Cambridge Analytica or the CCP from mining all of the data on the network for political use?

3 May 2023 at 13:12 | Open on merveilles.town

L. Rhodes replied to L.

Partial answer here: https://blueskyweb.xyz/blog/5-5-2023-federation-architecture

"The federation architecture allows anyone to host a BGS, though it's a fairly resource-demanding service."

BGS = Big Graph Service, #Bluesky's new jargon for network-crawling data indexers.

This is pretty much the answer I expected: No real guardrails on who gets to index the network and provide feeds. State actors aren't likely to be daunted by the resource demands.

6 May 2023 at 18:38 | Open on merveilles.town

L. Rhodes replied to L.

That blog post also outlines a piece that wasn't obvious to me from the protocol spec: "An App View is the piece that actually assembles your feed and all the other data you see in the app, and is generally expected to be downstream from a BGS’s firehose of data."

So algorithmic filtering presumably happens on a smaller server independent of both the BGS and PDS, which makes the #Bluesky network even more complex than I visualized in the OP. And presumably, anyone can run an App View, too.

6 May 2023 at 18:56 | Open on merveilles.town

L. Rhodes replied to L.

Here's #Bluesky's diagram of the federated network.

Labelers are independent services where posts coming from the BGS are tagged to make them easier to filter. Accounts (and maybe admins) can then hook into a labeler to outsource some of the moderation load.

Feed generators sort and filter posts. This is where the custom algorithms live.

App Views also do some sorting, but they're mostly (I believe) for sorting out post types for app types, e.g. photoblogging, microblogging, etc.

A diagram showing two devices connecting to two different Personal Data Servers, each providing data to a Big Graph Service, which pushes data back out to a Labeler, an App View and a Feed Gen, each of which also passes data back to one PDS or to the next item in the group. Sorry, I'm sure that's confusing, but as I said, it's a more complicated picture than those I provided in my earlier post.
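
The flow described above can be sketched as a chain of small functions. This is only an illustration of how I read the diagram: the order of the stages, the function names, and the word-matching "labeler" are all assumptions, and none of this is the AT Protocol's real API.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    author: str
    text: str
    labels: set = field(default_factory=set)


def bgs_firehose(pds_list):
    """BGS stand-in: crawl every PDS and emit one combined stream of posts."""
    for pds in pds_list:
        yield from pds


def labeler(posts, flagged_words):
    """Labeler stand-in: tag posts so downstream services can filter on them."""
    for post in posts:
        if any(word in post.text.lower() for word in flagged_words):
            post.labels.add("flagged")
        yield post


def feed_generator(posts, hidden_labels):
    """Feed generator stand-in: drop posts carrying labels the user hides."""
    return [p for p in posts if not (p.labels & hidden_labels)]


def app_view(feed):
    """App View stand-in: assemble what the client finally displays."""
    return [f"{p.author}: {p.text}" for p in feed]
```

Calling `app_view(feed_generator(labeler(bgs_firehose([pds_a, pds_b]), {"spam"}), {"flagged"}))`, where `pds_a` and `pds_b` are plain lists of `Post` objects, returns only the posts that no stage dropped. Each function here could be run by a different operator, so every stage gets a say in what reaches the timeline.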

6 May 2023 at 19:18 | Open on merveilles.town

L. Rhodes replied to L.

This is much more complicated than I initially understood, both technically and socially. Since Labeler, App View and Feed Gen are all separate services that can be run independently of both the BGS and PDS, taking full advantage of the network means implicitly trusting four different entities on top of your local host. And even if you don't trust them, they get a say in how (or if) your posts are received by others on the network. I guess that's what #Bluesky means by the "reach layer."

6 May 2023 at 19:31 | Open on merveilles.town

L. Rhodes replied to L.

So let's say I'm the CCP, and I want to control what Chinese citizens can do on #Bluesky. I could start by configuring the national network to only allow traffic to and from PDS that connect to approved Feed Generators. I could run the BGS that crawls those PDS to exclude servers from other countries. I could run the Labelers that feed into those Generators and flag posts as "seditious" so that they get filtered. And, of course, I could investigate anyone who gets flagged by the ML software.

6 May 2023 at 19:36 | Open on merveilles.town

L. Rhodes replied to L.

But how about a non-state actor example? Let's say I'm just an ML enthusiast and decide to run my own #Bluesky Labeler. Maybe I run a pretty good service, and lots of people on the network start relying on it for moderation. And maybe I'm also a big ol' TERF, and adjust the Labeler so that accounts by people whose profiles identify them as trans get mislabeled into frequently moderated categories. AFAICT, the only safeguard against this is… marketplace dynamics.

6 May 2023 at 19:46 | Open on merveilles.town

L. Rhodes replied to L.

#Bluesky is pretty clear about the market-oriented direction of how they're structuring the network: https://blueskyweb.xyz/blog/3-30-2023-algorithmic-choice In effect, they're offloading a lot of responsibility onto "consumer choice." But these are consumer choices about services that are, to a large extent, black boxes. Unless the BGS, Labelers, App Views and Feed Gens are radically transparent about how they handle data, you have to choose based on little more than how you feel about your timeline.

6 May 2023 at 19:55 | Open on merveilles.town

L. Rhodes replied to L.

This, from "Federation Architecture Overview," is also pretty wild: "For example, the BGS might crawl to grab data such as a certain post’s likes and reposts, and the app view will output the count of those metrics."

#Bluesky's solution to the out-of-sync metrics people sometimes complain about on #Mastodon is to separately crawl for likes and reposts, and pass that info through the App View. Which seems like an opportunity to inject false metrics, particularly if anyone can run an App View.
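
A rough Python sketch of why that worries me. It assumes a client normally sees only the number an App View reports, and every name and record shape here (`honest_app_view`, `inflating_app_view`, `metrics_match`, the `{"post": ...}` dicts) is hypothetical rather than anything from the Bluesky codebase:

```python
def count_likes(like_records, post_id):
    """Recount likes for one post from raw like records (as a crawl might collect them)."""
    return sum(1 for record in like_records if record["post"] == post_id)


def honest_app_view(like_records, post_id):
    """Report the count exactly as derived from the crawled records."""
    return {"post": post_id, "likes": count_likes(like_records, post_id)}


def inflating_app_view(like_records, post_id, bonus=1000):
    """Hypothetical dishonest App View: pads the real count by `bonus`."""
    return {"post": post_id, "likes": count_likes(like_records, post_id) + bonus}


def metrics_match(reported, like_records):
    """True only if the reported count equals an independent recount."""
    return reported["likes"] == count_likes(like_records, reported["post"])
```

The padded number is only detectable by someone who holds the raw like records and recounts them, which is precisely the work most clients would be delegating to the App View.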

6 May 2023 at 20:06 | Open on merveilles.town

L. Rhodes replied to L.

The post says that #Bluesky is experimenting with breaking Labeling and Feed Gen out from App View, so this more complicated infrastructure isn't necessarily the form the network will ultimately take. Hopefully, they'll walk back that plan, because it seems to me that the flexibility it adds mostly opens up new vectors for bad actors who want to reshape traffic on the network to their own ends.

6 May 2023 at 20:10 | Open on merveilles.town

Daniel Schildt replied to L.

@lrhodes Thank you for the deep technical overview about the current status of the system.

6 May 2023 at 20:16 | Open on mastodon.social

L. Rhodes replied to Daniel

@autiomaa You're welcome, but it's really not all that deep. I'm really just going off of the spec as written, along with a few blog posts. The only thing that makes my analysis stand out is the fact that so few of the people covering Bluesky are offering any sort of structural analysis at all.

6 May 2023 at 20:18 | Open on merveilles.town
