DevWouter

@ujay68 @simevidas

Thanks to your link I did some digging and came to the same conclusion. It even says that JavaScript strings are UTF-16. However, a quick check in JavaScript on both Firefox and Safari shows the implementation is the same in both.

Kinda weird that the HTML5 spec suggests UTF-8. (Also, Mastodon counts 👩‍👩‍👧‍👧 as a single character.)
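
For reference, a minimal sketch of the two counts being compared (written as TypeScript, though it runs as plain JavaScript in any modern browser or Node; the variable name is just for illustration): .length is defined over UTF-16 code units, while spreading the string iterates Unicode code points.

```typescript
// How JavaScript "counts" the family emoji.
const family = "👩‍👩‍👧‍👧"; // woman + ZWJ + woman + ZWJ + girl + ZWJ + girl

console.log(family.length);      // 11 -- UTF-16 code units (4 surrogate pairs + 3 ZWJs)
console.log([...family].length); // 7  -- Unicode code points
```

Neither number matches what Mastodon shows, which counts the whole sequence as one user-perceived character.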

John Ulrik

@DevWouter @simevidas Yes, JavaScript strings have been UTF-16 since the beginning of time. I think that’s where many of the compatibility issues come from. The Go language, e.g., has a more modern approach combining UTF-8 byte sequences and code points for characters (“runes”).
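
A hedged sketch of the two views of the same string in JavaScript (TypeScript syntax, but plain JS apart from the non-null assertion): charCodeAt() exposes raw UTF-16 code units, including surrogate halves for characters outside the BMP, while for...of walks full code points, roughly what Go's range over a string does with runes.

```typescript
const s = "a👩"; // 'a' (U+0061) + woman emoji (U+1F469, outside the BMP)

console.log(s.length); // 3 -- 'a' plus a surrogate pair

for (let i = 0; i < s.length; i++) {
  console.log(s.charCodeAt(i).toString(16)); // 61, d83d, dc69 -- UTF-16 code units
}

for (const ch of s) {
  console.log(ch.codePointAt(0)!.toString(16)); // 61, 1f469 -- code points
}
```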

John Ulrik

@DevWouter @simevidas From an end-user point of view, the only concept that would make sense as a measure of length IMO is what Unicode calls a “grapheme cluster”, i.e., a sequence of code points that displays or prints as ONE visible symbol, ONE (possibly complex composite) emoji or ONE (possibly multiply accented) character.
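
Counting those user-perceived characters is possible from JavaScript via Intl.Segmenter with grapheme granularity; a small sketch, assuming a runtime where that API is available (modern browsers, Node 16+ with full ICU):

```typescript
const seg = new Intl.Segmenter("en", { granularity: "grapheme" });
const graphemes = (s: string) => [...seg.segment(s)].length;

console.log(graphemes("👩‍👩‍👧‍👧"));    // 1  -- the whole ZWJ family sequence is one cluster
console.log(graphemes("e\u0301")); // 1  -- 'e' + combining acute accent renders as one character
console.log("👩‍👩‍👧‍👧".length);         // 11 -- UTF-16 code units, for comparison
```

This is the count that matches what Mastodon reports for the family emoji.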
