@a1ba @oss It's sort of doable on normal large pieces of text, but most encoding detection libraries (and there are a ton of them) are completely stupid; output like "Ã©" instead of "é" should just never happen.
And I think I'd rather have that kind of stuff, along with iconv, banished to a corner for purely historical purposes: taking old text files and transforming them to UTF-8.
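(A minimal Python sketch of both halves of that, not from the thread itself: the "Ã©" style of garbage is just UTF-8 bytes run through the wrong 8-bit table, and the one legitimate iconv-style job is a plain decode-and-re-encode; the file name and source encoding in the helper are hypothetical.)

```python
# Why "Ã©" appears, and the "historical purposes" conversion job.

text = "é"

# "é" stored as UTF-8 is the byte pair C3 A9; a detector that guesses
# Latin-1/CP1252 decodes those as two separate characters.
utf8_bytes = text.encode("utf-8")        # b'\xc3\xa9'
mojibake = utf8_bytes.decode("latin-1")  # 'Ã©', the thing that should never happen
print(mojibake)

def convert_to_utf8(path: str, src_encoding: str) -> None:
    """Re-save a legacy text file as UTF-8 when its encoding is already known."""
    with open(path, encoding=src_encoding) as src:
        data = src.read()
    with open(path + ".utf8", "w", encoding="utf-8") as dst:
        dst.write(data)

# convert_to_utf8("old_notes.txt", "koi8-r")  # hypothetical file and encoding
```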
More like it was created before the wide adoption of Unicode, so all the characters got messed up.
It doesn't help that some programmers mindlessly open such old files on Windows, which will probably interpret the unknown symbols as CP1251, then save the file as KOI, then open it in a different editor that detects it as Unicode and saves it back to CP1251, and... it may never stop, with more and more information getting lost forever, since all of this also predates the widespread use of VCS.
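(A toy Python reconstruction of that death spiral; the sample text and the exact sequence of wrong guesses are made up for illustration, but the mechanism is real: once an editor substitutes '?' or U+FFFD for bytes it can't map, that information is unrecoverable.)

```python
original = "Ещё один тест"          # what the old file was meant to say
on_disk = original.encode("koi8-r")  # the bytes actually sitting on disk

# Windows guesses CP1251 and shows garbage; the user saves it "as KOI",
# and anything with no KOI8-R slot (here the byte that now reads as 'Ј')
# silently becomes '?': the first bit of information is gone for good.
misread = on_disk.decode("cp1251")
on_disk = misread.encode("koi8-r", errors="replace")

# The next editor "detects Unicode" (UTF-8): the KOI8-R high bytes are not
# valid UTF-8, so they collapse into U+FFFD, and writing the result back
# as CP1251 flattens those into '?' as well.
misread = on_disk.decode("utf-8", errors="replace")
on_disk = misread.encode("cp1251", errors="replace")

print(on_disk.decode("cp1251"))      # mostly '?', nothing left to recover
```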