Hector Martin

Speaking of Unicode identifiers being a stupid idea: I have not seen a single Unicode/punycode URL in my almost 10 years in Japan, in real life.

Not. Once. Not in the hostname portion, not in the path portion. Never.

Nobody wants that nonsense here. Seriously. It's a silly novelty and only creates practical problems (and security issues).

You know how Japanese ads and billboards direct people to complex pages/URLs? They give you a search term to plug into Google.

(To clarify, you do get Unicode terms in path fields for things like wikis, but never as part of URLs people are expected to type out, and I've seriously never seen punycode domains.)

17 comments
Nicolás Alvarez

@marcan In Argentina, .com.ar domains started allowing áéíóúüñ in domain names, and gave registration priority to people who already had the un-accented version registered.

I don't think I have *ever* seen any legitimate domain name using them. People will just assume "links don't have accents" and type it without so you'd need to register both anyway.

Nicolás Alvarez

@marcan I think switching from .gov.ar (from blindly adopting the global TLDs) to .gob.ar was a good move though, despite causing similar issues during the transition (users "knowing" domains spell it gov so websites needed to be available in both, etc).

rin
@marcan maybe true in case of cjk, but having lived in russia I have definitely seen cyrillic domains used in ads, albeit much rarer than latin ones
Gaelan Steele

@marcan the search terms on billboards thing is common in the UK too. always seems terrifying - so many ways it can go wrong!

Meriel :leafeon:

@marcan I've seen them once in Germany, and it was because I had to fix an ancient tool for managing DNS zones to correctly use UTF-8 and punycode behind the scenes. lol

bex :neocat_flag_nb_256:

@marcan I’ve actually seen like.. two here in Sweden that use åäö, everyone else just uses a and o as before..

ダイハツ!ムーヴコンテ

@marcan I've seen a few punycode domains, but most often on PRC sites.

lj·rk

@marcan I think it's a lot bigger in the Arabic world? I've seen a few domains that used their characters, and punycode integration into browsers makes it quite painless. It's a pretty ugly hack, but I also don't see the issue, tbqh.

HEROBRINE7GAMER

@ljrk @marcan Usually big websites use ASCII characters for domain names only because Arabic is RTL, but for some reason use Arabic for sub-categories or article names, which renders in percent-encoding and makes typing the URL impossible. For example, Firefox on Android does not support punycode integration and displays only percent-encoding. That is painful. (afaik Firefox for desktop requires Arabic support packages for punycode integration)

Nicolás Alvarez

@herobrine7gamer @marcan @ljrk punycode is only a thing in domain names; for the rest of the URL it's just Unicode encoded in UTF-8 (and then percent-encoded)
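
To make that distinction concrete, here is a minimal Python sketch using only the standard library. It takes the logroño.es hostname mentioned elsewhere in this thread plus a made-up path (`/año` is purely an assumption for illustration) and encodes each part the way it would go over the wire. Note that Python's built-in `idna` codec implements the older IDNA 2003 rules, so treat this as a rough illustration rather than what any particular browser does.

```python
from urllib.parse import quote, unquote

host = "logroño.es"   # hostname from this thread
path = "/año"         # hypothetical path, for illustration only

# Hostname: each non-ASCII label becomes an "xn--" punycode label.
ascii_host = host.encode("idna").decode("ascii")

# Path: characters are encoded as UTF-8 bytes, then percent-encoded.
ascii_path = quote(path)

print(ascii_host)
print(ascii_path)

# Both encodings are reversible.
print(ascii_host.encode("ascii").decode("idna"))
print(unquote(ascii_path))
```

The two halves of the URL go through entirely different mechanisms, which is why a client can handle one and not the other.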

jaseg

@marcan I very occasionally see ones involving umlaut characters (äöü) in DE. They are an integral part of German and are in lots of common German names, and while there exists a standard romanization that everyone knows about, it looks kind of ugly.

scarcraft

@marcan
Here in Spain we have free .es domains for the city councils. I assume that it is nearly a mandatory rule to obtain certain funds. As an example, my city has logroño.es

Hector Martin

@scarcraft ñ is basically the only one I'd bother with in Spanish, it feels so wrong to write that as "n". But it feels like the exception that proves the rule.

Case in point: it's aragon.es, not aragón.es :)

Landon Epps

@marcan I do see Unicode domains in search results. They tend to be single purpose websites geared toward SEO. I bet it’s because Google prioritizes a result if the URL matches the search term exactly (e.g. wimax比較.com).

Interestingly [kanji].com is often used in logos, but the actual URL is ASCII. I see 価格.com the most, but there are plenty of examples. The Unicode domain in their logo doesn’t exist, not even to redirect. The actual URL is kakaku.com.

Hector Martin

@landonepps Oh, it never really hit me but kakaku.com is indeed a very interesting case.

As for the other one... Yeah, I mean, of course such domains exist. I just mean I've never seen one used by any notable company, advertiser, service, etc., and in particular I don't think I've ever seen one used on paper/offline with the intention of being typed in.

Landon Epps

@marcan Yeah, I've also never seen a notable company use a Unicode domain. I'm pretty sure it's just an SEO hack where they make their domain a common search term.

Григорий Клюшников

Cyrillic IDN domains in the .рф TLD are a thing in Russia. They aren't very popular, but they do exist, including on ads.
