Digital infrastructure 4 a cooperative internet. social/technological systems & systems neuro as a side gig. writin bout the surveillance state n makin some p2p. #UAW4811 rank and file agitator
information is political, science is labor.
science/work-oriented alt of @jonny
This is a public account, quotes/boosts/links are always ok <3.
Bluesky [is] designed to foster a new ecosystem of applications. [...] It is interoperable with existing internet protocols and blockchain-based systems, opening the door for a more connected, less siloed social experience. Since its launch in April 2023, over 100 clients have been built on the AT Protocol, and users have created more than 50,000 custom feeds. And the best part of it all? By building on top of the AT protocol, these developers have access to Bluesky's 13M users worldwide.
The VC firm sees bsky and their ownership of the relay as being a potentially very lucrative chokepoint, where the users of bluesky are the asset to rent to platform developers who want "access" to them. I've written before how atproto's decentralization is effectively meaningless with the relay system, where it's decentralized in the same sense as google alerts is decentralized - sure you can host your own PDS, but it's only useful because the main relay crawls it, and then either bsky or someone else who (inevitably) pays for access can send it back to you.
@jonny it'll be interesting to see how fast this will all unravel. I was kind of expecting them to be a bit more stealthy for a while longer before showing their hand, but maybe not.
just took a look at about a month of atproto firehose i have just been accumulating, and it looks like it's time for an update to the ol "is it becoming a communication medium yet" and the answer is even more no than before.
1% of accounts receive 72% of interactions (up from 44% last december when the network was a fraction of the size),
1% of posts receive 56% of all interactions, and
almost 90% of posts receive 0 interactions.
the distribution is steep too in the high end of that tail. Scrolling through the default feeds rn on a secondary account following zero people and with zero interactions, posts are averaging in the ~hundreds up to tens of thousands of interactions. on my actual account where i have interacted with people, i receive the fixed proportion of low-interaction mixins from my network which is like 30-40%. Think about how common seeing a post with hundreds of interactions is tho in the default feeds - 0.01% of posts receive 470 likes, and 0.0001% receive 6300. That's how much the algorithmic amplification makes a monoculture.
I have been taking samples of fedi while developing fetch all replies and backfilling, and the distribution on AP fedi is... not like that... but i haven't taken a systematic sample.
Edit: to be clear, this is a month sample of all likes and all accounts that were active in that month. So not all accounts from all time
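For anyone wanting to run the same kind of concentration check on their own sample, here is a minimal sketch. It assumes you have already reduced the firehose to one interaction count per account (or per post); the function names `topShare` and `zeroFraction` are made up for illustration, and the numbers in the example are toy data, not the sample described above.

```typescript
/** Share of all interactions received by the top `fraction` of entries. */
function topShare(counts: number[], fraction: number): number {
  const total = counts.reduce((sum, c) => sum + c, 0);
  if (counts.length === 0 || total === 0) return 0;
  // Sort a copy in descending order so the heaviest entries come first.
  const sorted = counts.slice().sort((a, b) => b - a);
  // Always include at least one entry, even for tiny samples.
  const k = Math.max(1, Math.ceil(sorted.length * fraction));
  const top = sorted.slice(0, k).reduce((sum, c) => sum + c, 0);
  return top / total;
}

/** Fraction of entries that received no interactions at all. */
function zeroFraction(counts: number[]): number {
  if (counts.length === 0) return 0;
  return counts.filter((c) => c === 0).length / counts.length;
}

// Toy data (hypothetical): four accounts, one of which gets almost everything.
const toyCounts = [0, 0, 10, 90];
console.log(topShare(toyCounts, 0.25)); // 0.9
console.log(zeroFraction(toyCounts)); // 0.5
```

With real data, `topShare(counts, 0.01)` would give the "1% of accounts receive X% of interactions" style of figure, and `zeroFraction` the share of posts with no interactions.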
@jonny "Telegram says it will now share IP addresses and phone numbers to authorities in response to valid orders" oopsie. I guess this makes a lot of sense coming soon after the CEO's arrest
alright, after like a year of halfheartedly trying on and off, #FetchAllReplies is pretty much finished - the problem of not being able to see all replies to a post is one of the largest complaints that people have with mastodon in particular but also the fedi in general. It is an especially potent problem for smaller servers, making them feel lonely, and making the whole fedi seem quiet. It is also a large contributor to the 'reply guy' problem where a moderately popular post will get the same replies over and over again and people won't even know they're doing it.
This patch recursively fetches replies using activitypub collections. it does it respectfully, only when someone is explicitly looking at a post (rather than fetching all replies for everything all the time) with some debounce, and spaces out the recursive calls to the other servers in deep threads.
the only thing left is to make the posts get inserted into the web client as they are received, currently you need to refresh to see them.
trying it locally now and it is a game changer.
i'm not "good at ruby" so if you ever wanna see this upstream, kindly spare a code review?
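The general shape of recursively fetching replies through ActivityPub collections can be sketched as follows. This is not the Mastodon patch (which is Ruby); it is a hedged illustration that assumes each post exposes a `replies` collection with an optional `first` page, `items` or `orderedItems`, and a `next` link. All function names, the `fetchJson` callback, and the default limits are hypothetical.

```typescript
type Json = { [key: string]: unknown };
type FetchJson = (url: string) => Promise<Json>;

const sleep = (ms: number) => new Promise<void>((done) => setTimeout(done, ms));

// An item may be a bare URL or an embedded object carrying an `id`.
function idOf(ref: unknown): string | null {
  if (typeof ref === "string") return ref;
  if (ref && typeof ref === "object" && typeof (ref as Json).id === "string") {
    return (ref as Json).id as string;
  }
  return null;
}

// Embedded objects are used as-is; URLs are fetched.
async function deref(ref: unknown, fetchJson: FetchJson): Promise<Json | null> {
  if (typeof ref === "string") return fetchJson(ref);
  return ref && typeof ref === "object" ? (ref as Json) : null;
}

/** Walk one post's `replies` collection page by page and return the reply ids. */
async function replyIds(post: Json, fetchJson: FetchJson, maxPages = 50): Promise<string[]> {
  const ids: string[] = [];
  const collection = await deref(post.replies, fetchJson);
  // Use the `first` page if there is one, otherwise treat the collection as a single page.
  let page = collection ? await deref(collection.first ?? collection, fetchJson) : null;
  for (let n = 0; page && n < maxPages; n++) {
    const raw = page.items ?? page.orderedItems;
    for (const item of Array.isArray(raw) ? raw : []) {
      const id = idOf(item);
      if (id) ids.push(id);
    }
    page = page.next ? await deref(page.next, fetchJson) : null;
  }
  return ids;
}

/** Debounce: skip a thread that was already fetched less than `debounceMs` ago. */
function shouldFetch(lastFetchedAt: number | undefined, now: number, debounceMs: number): boolean {
  return lastFetchedAt === undefined || now - lastFetchedAt >= debounceMs;
}

/** Depth-first walk of a thread, pausing `delayMs` before each recursive fetch. */
async function fetchAllReplies(
  rootUrl: string,
  fetchJson: FetchJson,
  delayMs = 500,
  maxDepth = 10,
): Promise<string[]> {
  const seen: { [url: string]: boolean } = {};
  seen[rootUrl] = true;
  const order: string[] = [];
  async function visit(url: string, depth: number): Promise<void> {
    const post = await fetchJson(url);
    for (const id of await replyIds(post, fetchJson)) {
      if (seen[id]) continue;
      seen[id] = true;
      order.push(id);
      if (depth >= maxDepth) continue;
      await sleep(delayMs); // space out calls to other servers in deep threads
      await visit(id, depth + 1);
    }
  }
  await visit(rootUrl, 0);
  return order;
}
```

In this sketch the caller would check `shouldFetch` when someone opens a post and only then call `fetchAllReplies`, passing a `fetchJson` that requests ActivityPub JSON from the remote server.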
Mozilla holding everyone for ransom and saying they will post everyones bookmarks and browser history in public unless we all paid $50 to fund Firefox development forever would genuinely be a better business decision than giving in to adtech. Literally the only thing anyone who uses it wants from them is an uncompromised browser. Value goes to zero otherwise
@jonny There's no such thing as an uncompromised browser. And PPA actually will allow advertisers to get specific click through information -without- tracking users, which they are currently doing, with or without the browser's help.
In other words suggesting this is 'giving in to adtech' is incorrect. This is offering advertisers a capability to get some meaningful data without having to track users.
You wont believe the number of stormtroopers theyre deploying against unarmed students unless you see it. This is just one side: at least 7 police departments with at least two layers at every point of egress, with several layers in back for rear control and rotation. They've got the army out against your kids for having the audacity to do whatever they can to stop a genocide
The paper figure is a lot cuter, but by linearizing it and presenting it as two parallel tracks they have obscured the most salient feature of the network: the big relay in the middle. Beyond "centralization bad," that pins down most of the undesirable and dangerous features of the protocol, and makes it seem like theres a lot more choice than there is.
Since the design purposefully hides the architecture: you dont know where your feed generators are drawing from, or those used by your friends. So you cant know what the effect of choosing a different relay would be, aka the main relay is always indispensable. Importantly the relays subscribe to you, you dont push to the relay, and since you arent really supposed to operate your own data store, you can be dropped from the network without knowing - the relay serves as an unaccountable point of moderation.
They describe another real weakness in the protocol on page 4 that also makes the single relay indispensable: fedi has backfilling problems, but its possible to solve them because you can at least know who does have the complete picture - the OP server knows of all interactions (that it wants to). Since there are no backlinks, and PDSes are not dereferenceable by username, the only way the whole thing works is if someone has a relatively complete picture of the whole network - otherwise eg. you would have no idea who to deliver a post to.
@jonny that's over my head, but all I need to know about bluesky is it is designed as wrongly as is possible on purpose. which I suspect is another way of saying what u said.
Helping someone debug something, said they asked chatgpt about what a series of bit shift operations were doing. He thought it was actually evaluating the code, yno like it presents itself as doing. Instead its example was a) not the code he put in, with b) incorrect annotations, and c) even more incorrect sample outputs. Has been doing this all day and had just started considering maybe chatGPT was wrong.
I was like first of all never do that again, and explained how chatGPT wasnt doing anything like what he thought it was doing. We spent 2 minutes isolating that code, printing out the bit string after each operation, and he immediately understood what was going on.
I fucking hate these LLMs. Empowerment is learning how to figure things out, how to make tools for yourself and how to debug problems. These things are worse than disempowering, teaching people to be dependent on something that teaches them bullshit.
Edit: too many ppl reading this as "this person bad at programming" - not what I meant. Criticism is of deceptive presentation of LLMs.
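The debugging step described above, printing the bit string after each operation, can be sketched like this. The actual code being debugged is not shown in the post, so the three operations here are made up for illustration, and `toBits` and `traceBits` are hypothetical helper names.

```typescript
/** Render the low `width` bits of x as a zero-padded binary string. */
function toBits(x: number, width = 8): string {
  let bits = (x >>> 0).toString(2); // >>> 0 reads the value as unsigned 32-bit
  while (bits.length < width) bits = "0" + bits;
  return bits.slice(-width);
}

type Step = { label: string; apply: (x: number) => number };

/** Apply each step in order and record the bit string after every one. */
function traceBits(start: number, steps: Step[], width = 8): string[] {
  const lines = [`start: ${toBits(start, width)}`];
  let value = start;
  for (const step of steps) {
    value = step.apply(value);
    lines.push(`${step.label}: ${toBits(value, width)}`);
  }
  return lines;
}

// Made-up operations; substitute the shifts you are actually debugging.
const trace = traceBits(5, [
  { label: "x << 2", apply: (x) => x << 2 },
  { label: "x | 1", apply: (x) => x | 1 },
  { label: "x >> 1", apply: (x) => x >> 1 },
]);
console.log(trace.join("\n"));
// start: 00000101
// x << 2: 00010100
// x | 1: 00010101
// x >> 1: 00001010
```

The point of the exercise is that the program itself evaluates each step, so every line of output is what the machine actually computed.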
@jonny I hate these LLMs, and I hate the in-industry hypemonkeys that lend their clout to the absurd misconception that LLMs are any kind of intelligence, or that they are actually doing anything other than autocorrect applied at scale.
@jonny My evaluation has ultimately been that the fundamental problem with these LLMs, at least in terms of the output they give, is that they are designed to give a satisfying answer to whatever is posed to them, even if they can't. So rather than say "I can't answer that" it will instead just invent something that sounds good. Because it may not know the answer, but it damn well knows what an answer *looks like*, and appearing to answer is preferable to giving a disappointing result.
Hey any journalists on here plz turn your public post indexing on, because most of you haven't and thats why people looking for public information cant find you.
Go to settings > public profile > privacy and reach, select "include public posts in search results"
Not all the fedi wants to be a public space, and thats fine, but some parts should be right now.
Unfortunately, discoverability w #Mastodon servers is so bad that you can click on a users busy profile and it may appear empty "for reasons". This happens a lot.
And there's nothing to suggest new people to follow based on existing activity and follows.
This paradigm hasn't even figured out a decent way to provide URL links to content. I still end up on other M. websites (where I don't reside) when I click on links to toots. How does a news website even cope with that, when they consider adding "share on fedi" icons to their own pages?
I think in the future if I am ever writing a code paper I am just going to take the list of contributors and copy paste that into the authors list with links to a git blame (with consent). If we're going to have a credit assignment system as broken as authorship, we can at least err on the "include everyone" side of the brokenness - I want the person who submits a PR to fix a typo in the docs to get credit for helping. People being incentivized to make lots of little contributions is good, actually.
It should be the same way with regular papers too - put your lab techs and undergrads on the paper! Put on the grad student/postdoc who isnt explicitly assigned to this project but ends up helping out anyway. Its literally free! Authorship inflation is a made up problem thats not even a problem!
@jonny I've seen a couple of people do this previously, if I remember C. Titus Brown did so, and took a lot of flack from some corners for it at the time. I'd hope that attitudes have changed, especially around software. But yeah, in general we need to be more open about offering authorship for any kind of contribution to the project.
@jonny 100% to the general sentiment, but like every lazy system, this incentivizes unwanted behavior, i.e. to maximize coauthorship through mundane contributions that perfectly follow protocol.
Academics: stop being coy about #SciHub and start treating it like basic research infrastructure. If you dont include it in your syllabus already as a normal way to access research, you should start. No more winks and nods, just link directly to it and accept no criticism for doing so from the researchers that necessitate its continued existence by their publishing practices
@jonny
Would you say that the category "researchers that necessitate its continued existence by their publishing practices" includes
- people who publish paywalled papers in journals that don't allow green OA, on the reasoning that everyone can just use SciHub to read them
- people who published in paywalled journals that allow self-archiving, but don't bother to do it
tip for new fedis: the way the fediverse works is there is a chipmunk that comes by and puts all the posts in his mouth and goes and stashes them in his tree and only some of them hatch but that is the cost of federation
The #LLMs aren't just weird text generators, and when these companies talk to investors they don't talk about whether they're sentient or not. They talk about "understanding intent" as a synonym for matching search queries to ads. They're parsing your email and calendar and docs and matching them to entities in their knowledge graph to predict your likelihood of clicking an ad. They don't talk about generated text as thought, it's to optimize ad content and give better clickthrough rates to advertisers who pay to embed in the answers of "LLM-type experiences" https://abc.xyz/investor/static/pdf/2022_Q4_Earnings_Transcript.pdf
I'm not saying LLMs are magic and can do all the things they promise to investors, I'm saying these companies don't care about whether the bots can think. they won't work and that's worse: what they certainly will do is deepen the logic of surveillance that drives their application in advertising and provide a lot of flimsy, bias ridden, nonfunctional LLMs as platforms to data consumers like governments, cops, and insurance companies to make use of surveillance data under the cloak of LLM datawashing.
apparently you can check where the browser window is relative to the screen that it's on, so I had this very cursed idea and that is to make a webpage that has a fixed position on your screen (rather than in the browser window) and the browser window is like a magnifying glass that you have to decrease the size of to bring the page in focus, and then you have to move the window around to find the different sections of the page like a point and click adventure.
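A rough sketch of the geometry behind that idea: if the page should stay fixed at some point on the screen, its position inside the viewport is that point minus the viewport's own position on the screen. The type and function names here are hypothetical, and `chromeTop` (the height of tabs and toolbars above the viewport) is an assumption that can only be approximated from `outerHeight - innerHeight`.

```typescript
type WindowGeometry = {
  screenX: number; // left edge of the browser window on the screen
  screenY: number; // top edge of the browser window on the screen
  chromeTop: number; // assumed height of tabs/toolbar above the viewport
};
type Point = { x: number; y: number };

/** Where to place the page inside the viewport so it stays at `anchor` on the screen. */
function pagePosition(win: WindowGeometry, anchor: Point): Point {
  return {
    x: anchor.x - win.screenX,
    y: anchor.y - (win.screenY + win.chromeTop),
  };
}

// In a browser this could be wired up roughly as follows. As far as I know there is
// no standard "window moved" event, so the sketch polls every animation frame.
// `page` is a hypothetical element holding the content.
//
//   const tick = () => {
//     const pos = pagePosition(
//       {
//         screenX: window.screenX,
//         screenY: window.screenY,
//         chromeTop: window.outerHeight - window.innerHeight,
//       },
//       { x: 300, y: 200 },
//     );
//     page.style.transform = `translate(${pos.x}px, ${pos.y}px)`;
//     requestAnimationFrame(tick);
//   };
//   requestAnimationFrame(tick);
```

Moving the window right by 10 pixels shifts the page left by 10 pixels inside it, which is what makes the window behave like a lens over a fixed page.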
@jonny can people also host their own relay or what would be the Mastodon equivalent?
one prior post, i'll find the other later:
https://neuromatch.social/@jonny/111656139481866077