Gregory

#Smithereen update everyone asked for: quick search!

- You can now quickly search for people and groups.
- People you follow and groups you're a member of are displayed higher.
- Alphabet doesn't matter. If you search for "олег", you'll find people who called themselves Олег and Oleg. It should work for other scripts too, though it probably works terribly for Arabic, Hebrew, and some Asian languages.
- You can also paste links to external objects to load them.

#activitypub #mastodev @activitypub

21 comments
Gregory

Though VK's version also works correctly if you type your query in a wrong keyboard layout. As in, "jktu" when you meant "олег". VK can do this trivially since 99% of its target audience has two known keyboard layouts — English QWERTY and Russian ЙЦУКЕН. I can't do it this easily... I'll be especially clueless if you set your language to English, which some Russians actually do.
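The wrong-layout correction described above can be sketched as a plain character remap between keys at the same physical position. This is only an illustration, not VK's or Smithereen's code; the names `fix_layout` and `query_variants` are hypothetical, and only lowercase letter keys plus a few punctuation keys are covered.

```python
# Keys in the same physical positions on English QWERTY and Russian ЙЦУКЕН.
QWERTY = "qwertyuiop[]asdfghjkl;'zxcvbnm,."
JCUKEN = "йцукенгшщзхъфывапролджэячсмитьбю"

EN_TO_RU = str.maketrans(QWERTY, JCUKEN)
RU_TO_EN = str.maketrans(JCUKEN, QWERTY)

def fix_layout(query, table=EN_TO_RU):
    """Re-type a query as if the other keyboard layout had been active."""
    return query.lower().translate(table)

def query_variants(query):
    """Return the query plus its layout-swapped forms, without duplicates."""
    variants = [
        query.lower(),
        fix_layout(query, EN_TO_RU),
        fix_layout(query, RU_TO_EN),
    ]
    return list(dict.fromkeys(variants))  # dict keeps first-seen order

print(fix_layout("jktu"))            # олег
print(fix_layout("щдуп", RU_TO_EN))  # oleg
```

With only two known layouts, a search can simply try every variant. With many possible layouts the number of variants grows, which is the difficulty raised in the comment.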

Gregory

Yep it still does that after all these years.

Gregory

One more Smithereen-specific thing I forgot to mention. When you load a post like that, it'll actually fetch the entire thread for you. I made a parallel comment thread downloader.
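One possible shape for such a downloader, purely as a sketch: walk the thread level by level and fetch all comments on a level concurrently. Nothing here is taken from Smithereen's source. `FAKE_REPLIES`, `fetch_replies`, and `download_thread` are hypothetical, and the dictionary stands in for the network requests a real server would make.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for remote servers: object ID -> IDs of its replies.
FAKE_REPLIES = {
    "post": ["c1", "c2"],
    "c1": ["c1a", "c1b"],
    "c2": [],
    "c1a": [],
    "c1b": ["c1b1"],
    "c1b1": [],
}

def fetch_replies(object_id):
    # A real implementation would perform an HTTP request here.
    return FAKE_REPLIES.get(object_id, [])

def download_thread(root, max_workers=4):
    """Fetch a whole thread breadth-first, one level at a time in parallel."""
    seen = {root}
    tree = {}
    frontier = [root]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while frontier:
            # All objects on the current level are fetched concurrently.
            results = list(pool.map(fetch_replies, frontier))
            next_frontier = []
            for object_id, replies in zip(frontier, results):
                tree[object_id] = replies
                for reply in replies:
                    if reply not in seen:  # guard against loops and duplicates
                        seen.add(reply)
                        next_frontier.append(reply)
            frontier = next_frontier
    return tree

thread = download_thread("post")
print(len(thread))  # number of objects fetched
```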

a1ba-nyan
@grishka haha when I asked about that, someone said "no, why do you want to download the whole fediverse recursively?" to me
a1ba-nyan
@grishka :D

I definitely should try Smithereen again.
Gregory

@a1batross when comment threads are incomplete, it's a terrible user experience. And I'm not the kind of developer who would settle for a terrible user experience due to some technical peculiarities that the users couldn't care less about.

wakest ⁂

@grishka wow this is great! is this the first fediverse software to implement this (to your knowledge)?

/dev/urandom (aka jan Lentan)

@grishka sounds like it could be a useful setting nonetheless.

maybe with a switch that instead checks the ukrainian, belarusian, greek and other relatively static non-latin kbd layouts

Gregory

@devurandom that's certainly an option, but would you really convert every string you receive into 20 different keyboard layouts just in case? That could also lead to some mysterious false-positive search results.

a1ba-nyan
@grishka @devurandom maybe ask the user to list the languages they know in settings and then convert? That information could be useful in the future.

I'm not insisting on that feature though, just a random idea.
/dev/urandom (aka jan Lentan)

@grishka not asking for all searches to be run with every possible layout, but just to add a separate user setting that goes

[✓] Search keyboard layout correction: [ Russian ↓ ]

and then you get to pick whichever layout you commonly use.

this could also be helpful in that searches mistakenly typed in that layout could be converted to english.

Vftdan

@grishka
Maybe it is a good idea to define distance matrices between alphabets' letters and between keyboard layouts' keys, merge them into something like a least-cost matrix, and use it to modify the Levenshtein distance algorithm?
@devurandom
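The idea above can be sketched as a Levenshtein distance with a pluggable substitution cost. This is a minimal illustration under assumed costs: the tables `SAME_SOUND` and `SAME_KEY` and the weights 0.1 and 0.2 are made up, and real tables would have to cover whole alphabets and layouts.

```python
def weighted_levenshtein(a, b, sub_cost):
    """Edit distance where sub_cost(x, y) prices replacing x with y."""
    prev = [float(j) for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        cur = [float(i)]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1.0,                   # delete ca
                cur[j - 1] + 1.0,                # insert cb
                prev[j - 1] + sub_cost(ca, cb),  # substitute ca -> cb
            ))
        prev = cur
    return prev[-1]

# Hypothetical cost tables covering only the letters of one example name.
SAME_SOUND = set(zip("oleg", "олег"))  # transliteration pairs
SAME_KEY = set(zip("jktu", "олег"))    # same physical key, different layout

def sub_cost(x, y):
    if x == y:
        return 0.0
    if (x, y) in SAME_SOUND or (y, x) in SAME_SOUND:
        return 0.1  # cheap: same letter in another alphabet
    if (x, y) in SAME_KEY or (y, x) in SAME_KEY:
        return 0.2  # cheap: same key in another layout
    return 1.0      # unrelated characters cost a full edit

print(weighted_levenshtein("oleg", "олег", sub_cost))
print(weighted_levenshtein("jktu", "олег", sub_cost))
print(weighted_levenshtein("oleg", "olga", sub_cost))
```

Taking the minimum of the two tables per character pair is one way to read "least-cost matrix". The first two distances stay well below 1, while the unrelated pair costs 2 full edits.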

fperson :SarianFlagRounded:

@grishka Hmm, you could handle only QWERTY and ЙЦУКЕН. I think it'll be nice UX for Russian users, and other users would kind of expect it, so nothing too bad. Facebook, for example, doesn't even handle ЙЦУКЕН

Sebastian Lasse, redaktor.me

@grishka

Yay.
redaktor got that too. I am interested in “Alphabet doesn't matter” - can you give a pointer to code?
What I did a while ago is in
github.com/redaktor/widgets-pr
and worked on phonetic search …

PS, just in case anyone needs to detect languages, this covers 800+ … github.com/redaktor/languages

Gregory

@sl007 github.com/grishka/Smithereen/

I use github.com/jirutka/unidecode to transliterate everything into the Latin alphabet and store these strings in a separate table (qsearch_index). So, basically, search only ever operates on Latin strings.
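A rough sketch of that approach, not the actual Smithereen code: names and queries both go through the same transliteration step, so matching only ever compares Latin strings. The tiny `CYRILLIC` table, `to_latin`, `build_index`, and `search` are hypothetical stand-ins; unidecode covers far more scripts, and the real index lives in a database table rather than a Python list.

```python
import unicodedata

# Tiny hypothetical Cyrillic table; a real transliterator covers much more.
CYRILLIC = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "и": "i", "к": "k", "л": "l", "м": "m", "н": "n", "о": "o",
    "р": "r", "с": "s", "т": "t", "у": "u", "я": "ya",
}

def to_latin(text):
    # Decompose accented letters (á becomes a + combining acute) and drop marks.
    decomposed = unicodedata.normalize("NFKD", text.lower())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "".join(CYRILLIC.get(ch, ch) for ch in stripped)

def build_index(names):
    """Store the Latin form next to each original name."""
    return [(to_latin(name), name) for name in names]

def search(index, query):
    """Prefix-match the transliterated query against each word of a name."""
    q = to_latin(query)
    return [name for key, name in index
            if any(word.startswith(q) for word in key.split())]

index = build_index(["Олег Иванов", "Oleg Smith", "Sebastián Lasse", "Sebastian Lasse"])
print(search(index, "олег"))
print(search(index, "себастиан"))
```

Because accents are stripped in the same step, a query typed in Cyrillic finds both the accented and the unaccented Latin spelling.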

Gregory

@sl007 so because:
- I convert names into Latin when creating and updating the search index
- I convert search queries into Latin when searching
I could type your name in Cyrillic and still find you:

Gregory

@sl007 also notice that there's a Sebastián with an accented á and it still found him

:crackCat: FelineDisrespectFromBehind :crackCat: :verified:
@grishka Ok now I'm curious to see how it handles non-Russian Cyrillic characters like љ, њ, ђ, џ or ћ