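It's important to remember that not all languages are written left-to-right and top-to-bottom: Arabic and Hebrew run right-to-left, and some East Asian scripts are traditionally set in vertical columns. Most also need far more than alphanumeric ASCII, so the recommended practice is a Unicode encoding such as UTF-8 rather than a single-byte one like ISO-8859-1 (Latin-1). If you must deal with UTF-16, remember that the BMP is not all you need: characters outside it, including most emoji and the rarer CJK ideographs, are stored as surrogate pairs. Normalise the text you receive deliberately rather than aggressively, since the compatibility forms (NFKC and NFKD) discard distinctions that may matter, and remember that you can't truncate wherever you like: cutting in the middle of a multi-byte sequence, a surrogate pair, or a grapheme cluster corrupts the text. It's also important to know that domains can contain non-ASCII letters. Internationalised domain names are written in their native scripts and converted to an ASCII "xn--" (Punycode) form for DNS. Speaking of which, not all languages have the concept of capitalisation, and where it exists it doesn't always work as it does in English: Turkish distinguishes dotted and dotless i, and the German ß traditionally uppercases to "SS".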
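
Most of these points about direction, encodings, normalisation, truncation, domains and case can be checked directly. Here's a minimal sketch in Python using only the standard library. The helper names (`utf16_code_units`, `fits_latin1`) are made up for illustration, `bücher.example` is just a placeholder domain, and the built-in `idna` codec implements the older IDNA 2003 rules, so treat this as a demonstration rather than production handling.

```python
import unicodedata


def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units needed to store text in UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def fits_latin1(text: str) -> bool:
    """True if every character is representable in ISO-8859-1."""
    try:
        text.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


# Direction: Hebrew alef has the bidirectional class "R" (right-to-left).
alef_direction = unicodedata.bidirectional("\u05d0")

# Encoding: Latin-1 covers "café" but not Greek (or most other scripts).
latin1_ok = fits_latin1("caf\u00e9")
latin1_greek = fits_latin1("\u03b1\u03b2\u03b3")

# BMP: U+1F600 (an emoji) lies outside it and takes a surrogate pair.
emoji_units = utf16_code_units("\U0001F600")

# Normalisation: "é" can be one code point (NFC) or two (NFD).
nfd = unicodedata.normalize("NFD", "\u00e9")  # "e" + U+0301 combining acute
nfc = unicodedata.normalize("NFC", nfd)       # back to the single U+00E9

# Truncation: slicing the NFD form by code point silently drops the accent.
truncated = nfd[:1]

# Domains: the stdlib "idna" codec maps a Unicode name to its ASCII form.
ascii_domain = "b\u00fccher.example".encode("idna")

# Case: mappings can change length, lose information, or not exist at all.
sharp_s_upper = "\u00df".upper()               # German sharp s
dotless_round_trip = "\u0131".upper().lower()  # Turkish dotless i
han_upper = "\u6f22".upper()                   # Han character, has no case

print(alef_direction, latin1_ok, latin1_greek, emoji_units)
print(len(nfd), len(nfc), truncated)
print(ascii_domain, sharp_s_upper, dotless_round_trip, han_upper)
```

Note that the slice in the truncation step works on code points, not on what a reader would see as one character, which is why the accent disappears without any error. Python's `str.upper()` and `str.lower()` are also locale-independent, so the Turkish round trip comes back as a plain "i" rather than the dotless letter it started from.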