@johannes @jkt @simevidas I thought it would be the other way around. The same grouping of bytes could represent different codepoints, based on the encoding.
Top-level
@johannes @jkt @simevidas I thought it would be the other way around. The same grouping of bytes could represent different codepoints, based on the encoding. 1 comment
|
@ramsey @jkt @simevidas yes, but working on bytes means that the encoding has to be carried thorough the different layers and might cut utf-8 sequences apart (assuming utf-8 being the default encoding)
With either codepoints or grapheme clusters you at least get some valid (while not always sensible) result.