A little bit about encoding

Well, it was a tremendous test for me. Yesterday I got stuck on a problem related to encoding. Believe me when I say it was one hell of a night. I tried so many things to solve it that my fingers started to hurt. Google received a lot of queries from me 😀 Now we are best mates.

So the problem was with how my records were displayed in the database. I was trying to build a multi-language application in PHP and MySQL. When I added, for example, a new user with a Russian nickname, it came out as gibberish. I had never actually thought about encoding. Never had to 🙂 Now I was facing a large-scale problem. When I reached that point, I started to communicate with Google. I read a large stack of articles about encoding. One article in particular was a big help: "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" by Joel Spolsky. I recommend it to everyone who runs into this problem. It explains quite a lot.
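To show where that kind of gibberish can come from, here is a minimal PHP sketch. It rests on assumptions: the nickname is hypothetical, the mbstring extension is enabled, and the cause is UTF-8 bytes being read as a single-byte encoding such as ISO-8859-1 somewhere between the application and the database. It only approximates what a mismatched connection charset does.

```php
<?php
// Assumption: this file is saved as UTF-8 and the mbstring extension is enabled.
$nickname = "Иван"; // hypothetical Russian nickname: 4 characters, 8 bytes in UTF-8

// Misread the UTF-8 bytes as ISO-8859-1 and re-encode them as UTF-8.
// Every byte becomes its own character, so the text turns into gibberish.
$garbled = mb_convert_encoding($nickname, 'UTF-8', 'ISO-8859-1');

// Undoing the wrong conversion gives the original bytes back.
$restored = mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');

echo mb_strlen($nickname, 'UTF-8'), " characters before\n";
echo mb_strlen($garbled, 'UTF-8'), " characters after the wrong conversion\n";
var_dump($restored === $nickname);
```

The point of the sketch is that the data is not lost, only misinterpreted, which is why making every layer agree on one charset fixes the display.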

Now, some steps toward a working multi-language website.

  • First, your files have to be saved as "UTF-8 without BOM".
  • Second, every web page must contain at least this meta tag in its head:
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  • Next comes the database. Choose either utf8_general_ci or utf8_unicode_ci as the database collation. I prefer utf8_unicode_ci: it is slower, but more accurate.
  • And this is the step I didn't know about, which is why I had my problem. After connecting to the database, you need to execute the query mysql_query("SET NAMES utf8");
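
The last step can also be sketched with the mysqli extension, since the old mysql_* functions were deprecated in PHP 5.5 and removed in PHP 7.0. This is only a sketch under assumptions: the function name connectUtf8 is made up for illustration, and the host, credentials and database name in the usage line are hypothetical.

```php
<?php
// Sketch only: connectUtf8 is a hypothetical helper, not part of PHP or MySQL.
function connectUtf8(
    string $host,
    string $user,
    string $password,
    string $database,
    string $charset = 'utf8' // the charset used in the steps above
): mysqli {
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $db = new mysqli($host, $user, $password, $database);

    // Sets the connection charset like "SET NAMES utf8" does, and also tells
    // the client library, so real_escape_string() uses the right charset.
    $db->set_charset($charset);

    return $db;
}
```

Usage would look like $db = connectUtf8('localhost', 'app_user', 'secret', 'app_db'); with your own values. If you need characters outside the Basic Multilingual Plane, such as emoji, pass 'utf8mb4' as the charset, because MySQL's utf8 stores at most three bytes per character.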

IMHO, by now you have probably done everything in your power.