Issue with Japanese trackback
I have been testing the TrackBack module and noticed that Japanese trackbacks were not shown correctly. It appeared that the encoding was not being recognized properly.
In my case, there were several problems. First, the source blog that pings the trackback does not send its encoding. The TrackBack module uses the encoding supplied by the source, and when none is supplied, it relies on its internal algorithm to determine the encoding.
The second issue was a bug in that algorithm. TrackBack checks the locale of the site, but the locale is not set (it is empty!). Without a locale, it falls back to English (actually ISO-8859-1, which is standard Latin). Thus, the trackback request from the Japanese blog was never interpreted with the right encoding.
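To make the fallback order concrete, here is a minimal Python sketch of the behavior described above. It is not the module's actual PHP code: the function name `candidate_encodings` and the locale table are hypothetical, and only the three Japanese encodings and the ISO-8859-1 fallback come from what I observed.

```python
from typing import List, Optional

# Hypothetical locale table. The three Japanese entries are the ones
# the module lists; everything else about this table is an assumption.
LOCALE_ENCODINGS = {
    "ja": ["iso2022_jp", "euc_jp", "shift_jis"],
}


def candidate_encodings(declared: Optional[str], locale: str) -> List[str]:
    """Return the encodings to try, in order of preference."""
    # 1. An encoding declared by the pinging blog always wins.
    if declared:
        return [declared]
    # 2. Otherwise use the site locale. An empty or unknown locale
    #    falls through to ISO-8859-1 (Latin-1).
    return LOCALE_ENCODINGS.get(locale, ["iso-8859-1"])


if __name__ == "__main__":
    print(candidate_encodings(None, ""))    # the buggy case: empty locale
    print(candidate_encodings(None, "ja"))  # locale forced to Japanese
```

With an empty locale the Japanese candidates are never even considered, which matches the symptom: the ping is treated as Latin-1 no matter what it actually contains.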
I then modified the code to set the locale to Japanese, yet the trackback was still not handled correctly. TrackBack lists three Japanese encodings, ISO-2022-JP, EUC-JP, and SJIS, as the candidates for Japanese. All three are popular and should work. PHP's mb_detect_encoding() function is supposed to detect which encoding a given string uses. For some unknown reason, it always returned SJIS even though the trackback string was in ISO-2022-JP. The function did not detect it correctly.
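One possible reason, which this post does not confirm, is that ISO-2022-JP is a 7-bit encoding: it uses only ASCII-range bytes and switches character sets with escape sequences. A detector that only asks "are these bytes valid in encoding X?" may therefore accept the same bytes under more than one candidate. The Python sketch below illustrates that idea with a hypothetical `detect_japanese` helper that looks for the escape byte first. It is an illustration of the ambiguity, not a description of how mb_detect_encoding() works internally.

```python
from typing import Optional

ESC = b"\x1b"  # ISO-2022-JP switches character sets with ESC sequences


def decodes_cleanly(data: bytes, codec: str) -> bool:
    """True if the bytes decode under the given codec without an error."""
    try:
        data.decode(codec)
        return True
    except UnicodeDecodeError:
        return False


def detect_japanese(data: bytes) -> Optional[str]:
    """Guess among the three candidate encodings (hypothetical helper)."""
    # Check ISO-2022-JP first: its bytes are all 7-bit, so a validity
    # check alone cannot tell them apart from plain ASCII text.
    if ESC in data and decodes_cleanly(data, "iso2022_jp"):
        return "iso2022_jp"
    for codec in ("euc_jp", "shift_jis"):
        if decodes_cleanly(data, codec):
            return codec
    return None


if __name__ == "__main__":
    text = "日本語"
    for codec in ("iso2022_jp", "euc_jp", "shift_jis"):
        print(codec, "->", detect_japanese(text.encode(codec)))
```

Note that plain ASCII input contains no escape byte and is reported as the first 8-bit candidate, so the order of the candidate list still matters in this sketch.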
The solution was to specify the 'auto' encoding. I don't know why, but instead of passing the three Japanese encodings, simply asking mb_detect_encoding() to detect the encoding on its own seems to work.
Now, with those modifications, trackbacks from Japanese blogs are shown correctly in the right sidebar, though you might see squares instead of Japanese characters if you don't have Japanese fonts installed.
References:
http://cl.pocari.org/2005-07-10-1.html (Japanese)
http://labs.gmo-media.jp/archive/21 (Japanese)
http://je-pu-pu.jp/blog/archives/2005/02/mb_detect_encod.html (Japanese)
