Top-level
Jonah

@nikitonsky I'd note that the advice to use grapheme clusters as the atomic unit of strings is dangerous: first, because the clusters are ill-defined (as you note), and second, because using them to parse a message, for example, can cause security issues!

For example, imagine a JSON string:
"string"
and put a combining character right after the opening quote:
"ˇstring"
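(Typing on mobile, so the háček shown here is not actually a combining character.)
To a parser that works on grapheme clusters, this is not a string anymore: the opening quote and the combining mark form a single cluster, so the quote is never seen as a delimiter. Here that is just an error, but it's possible to imagine an injection attack built the same way.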

2 comments
Niki Tonsky

@vjon how is this an injection attack? It’s just an invalid string

Jonah

@nikitonsky Sorry, message length limits.

The security issue is a potential one. In JSON, it's not possible to construct an attack this way, at least not with a strict parser. In another format, it might turn into an actual injection attack.

But a syntax error reported on trusted data is a problem in its own right, because the string is actually valid JSON: a parser based on grapheme clusters would reject it incorrectly. That means grapheme clusters cannot be used to parse JSON, or any other textual format.
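
A minimal sketch of the point, in Python. Note that `naive_clusters` is a hypothetical helper and only a simplified stand-in for real grapheme segmentation: it attaches each combining mark to the preceding code point and does not implement the full UAX #29 rules. The payload uses U+030C COMBINING CARON as the combining character.

```python
import json
import unicodedata


def naive_clusters(text):
    """Simplified stand-in for grapheme segmentation (NOT full UAX #29):
    every combining mark is glued onto the preceding code point."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# Opening quote, COMBINING CARON, the letters of "string", closing quote.
payload = '"\u030Cstring"'

# A code-point-based parser accepts this: it is a valid JSON string whose
# first character happens to be a combining mark.
parsed = json.loads(payload)

# A cluster-based view merges the opening quote with the combining mark,
# so the first "unit" is no longer a bare quote delimiter.
clusters = naive_clusters(payload)
first_is_quote = clusters[0] == '"'
```

With this input, `json.loads` succeeds, while `first_is_quote` is false: a parser that looks for `"` as the first unit of the text would not find it and would reject valid input.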
