Most powerful would probably be: both :)
Also, we need to mention that of the 850 languages, only 400 are really “spoken”.
It was driven by the desire to give a voice to minor languages, so after the reports from Papua New Guinea the sourcing involved some tinkering: as far as I remember, about 400 were trained from Wikibase and about 100 from a mix of specific dictionaries and web sources.
We could derive the “language variants” via BCP-47, blowing it up to maybe 700.
The rest (about 150, mostly minor languages) was driven by the desire to find anything you can on this English-dominated internet, even in papers about the language.
Anyway: My belief is that any fedi onboarding [just in: https://www.smashingmagazine.com/2023/04/design-effective-user-onboarding-flow/]
should include the user saying
“This is my native language (100%) and I speak these at 10%-99% …”
Then detection gets precise, because it is limited to that selection, where the languages should differ enough from each other.
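Roughly what I mean, as a minimal sketch and not how the detector actually works: take whatever per-language scores the detector gives, keep only the languages the user declared at onboarding, and renormalize. The names `restrict_to_declared` and `best_language` are made up for illustration, and using the declared percentage as a weight is my own assumption.

```python
def restrict_to_declared(scores, declared):
    """Keep only user-declared languages and renormalize.

    scores:   dict of language code -> raw detector score (hypothetical input)
    declared: dict of language code -> self-reported proficiency in (0, 1]
    Assumption: proficiency is used as a simple multiplicative weight.
    """
    weighted = {
        lang: scores[lang] * prof
        for lang, prof in declared.items()
        if lang in scores
    }
    total = sum(weighted.values())
    if total == 0:
        return {}  # nothing the user declared was detected at all
    return {lang: w / total for lang, w in weighted.items()}


def best_language(scores, declared):
    """Return the most likely declared language, or None if there is no match."""
    restricted = restrict_to_declared(scores, declared)
    return max(restricted, key=restricted.get) if restricted else None


# Close call between three related languages; the user declared only two.
raw = {"de": 0.30, "nl": 0.35, "af": 0.35}
user = {"de": 1.0, "nl": 0.4}
print(best_language(raw, user))
```

With the full list, `nl` and `af` tie ahead of `de`; once the selection is limited to what the user actually speaks, the ambiguity is gone.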
🧵 1/2
@maxlath
Otherwise the languages are limited to the script used (I think; it didn't cover mixed scripts, something to think about).
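One possible way to handle mixed scripts, purely as a sketch under assumptions: Python's standard library has no script property, so this guesses the script from the first word of each character's Unicode name (a rough heuristic), and `SCRIPT_LANGS` is a hypothetical toy table, not the real language list. Every script above a share threshold contributes its candidate languages.

```python
import unicodedata
from collections import Counter

# Hypothetical script -> candidate languages table, for illustration only.
SCRIPT_LANGS = {
    "LATIN": {"en", "de", "tpi"},
    "CYRILLIC": {"ru", "uk"},
    "GREEK": {"el"},
}


def script_counts(text):
    """Count letters per script, guessed from the Unicode character name."""
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                # e.g. "CYRILLIC SMALL LETTER PE" -> "CYRILLIC"
                counts[name.split()[0]] += 1
    return counts


def candidates_by_script(text, min_share=0.2):
    """Union of candidate languages for every script with enough letters."""
    counts = script_counts(text)
    total = sum(counts.values())
    langs = set()
    for script, n in counts.items():
        if total and n / total >= min_share:
            langs |= SCRIPT_LANGS.get(script, set())
    return langs


print(sorted(candidates_by_script("hello привет")))
```

The `min_share` threshold is arbitrary; it only keeps a single stray character from pulling in a whole script's worth of languages.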
Looking into yours now.