I'm not much of a linguist. So when I got a request to figure out how to store and display Russian characters with MySQL and PHP, I didn't know much about it. So began my research.
First of all I was told there was some table that had Russian data in it. When I did a select on it I would get something like:
mysql> select * from polls; ... | 11 | ??? ???????? ?????? ?????????? ????????????? | NULL |
Ok, that's reasonable. The mysql client must not know how to display it. But how can I be sure those are actually Russian characters and not actually question marks? If the user never got Russian working, can we be sure there is actually Russian data in there? I wanted to see the true values in there, and came across the ascii() mysql function which gives the ascii value of the first character of a string.
mysql> select ascii(question) from polls; ... | 63 |
Hmm, 63 is the ascii value for a question mark, so what is stored really is question marks and not Russian. I cannot depend on this data. Next task is creating a table and populating it with Russian data. After some searching, I found that I needed to specify the column character set.
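One plausible way to end up with a stored 63, sketched in Python purely as an illustration: if the old table or connection used a single-byte character set such as latin1 (an assumption, nothing here confirms how that table was set up), Cyrillic has no representation in it and each character is replaced with a literal question mark on the way in. The sample string is made up.

```python
# Illustration only: assumes the old table/connection used latin1,
# which is not confirmed anywhere in these notes.
sample = "Привет"  # hypothetical sample text, not the real poll data

# Encoding to latin-1 with replacement mimics a lossy conversion:
# every character latin-1 cannot represent becomes "?".
stored = sample.encode("latin-1", errors="replace")

print(stored)     # the bytes that would end up in the table
print(stored[0])  # code of the first byte, comparable to ascii()
```

If that is what happened, the original text cannot be recovered from the table, which matches what ascii() reported.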
create table polls2 ( id int(11) not null auto_increment, question text character set utf8, lang varchar(10) default null, primary key (id));
Ok great, now how do I enter data? I don't know Russian, nor how to type it. Luckily, I did have some Russian spam on hand, and succeeded in cutting and pasting it. But before I could insert, I had to change my mysql command line connection character set:
set names 'utf8';
insert into polls2 (question) values ('some pasted russian stuff');
That looked like it populated the table ok, and in my OS X terminal I can even see the characters (but they look ugly). I wrote (or rather, stole) some PHP code to query the DB and display it, but was just ending up with junk in the browser. After much searching, I found that a meta tag must be added to the head of the HTML (note that HTML spells the charset utf-8, unlike MySQL's utf8):
... <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Ok, but it still didn't work. I'm starting to dislike these Russians. I then went to Gmail with the Russian spam, and looked at the HTML source of the email displayed in the browser. I saved it, and then copied the HTML file to the server I was working on. I went to retrieve the file from my server, and whaddaya know, it's showing junk characters again. So clearly the problem wasn't bad data in MySQL or the way I was displaying it with PHP. There must be something else going on. It's as if the META HTTP-EQUIV tag specifying the character set is being ignored.
So I ran wget to show me the headers:
wget -q --save-headers -O - my.site.com/russian.spam.html
And then I noticed that the server was sending:
Content-Type: text/html; charset=iso-8859-1
That was clearly overriding whatever I put in the HTML, since the charset in the HTTP header takes precedence over the meta tag. I then found there was an AddDefaultCharset in the Apache config, which was the culprit. I didn't really want to change that, and instead created an .htaccess with:
AddDefaultCharset Off
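To see why that header mattered, here is a rough Python sketch of what the browser was effectively doing: it reads the charset from the Content-Type value the server sent and decodes the page's UTF-8 bytes with it. The sample text and the helper name charset_of are made up for illustration, and real browsers treat iso-8859-1 as windows-1252, so the exact junk on screen will differ.

```python
from email.message import Message


def charset_of(content_type):
    """Return the charset parameter of a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


# Hypothetical page content: Russian text stored and served as UTF-8.
page_bytes = "Привет".encode("utf-8")

# The header value the server was sending before the .htaccess change.
declared = charset_of("text/html; charset=iso-8859-1")

garbled = page_bytes.decode(declared)  # what the wrong charset yields
correct = page_bytes.decode("utf-8")   # what the page really contains

print(declared)
print(len(correct), len(garbled))  # each Cyrillic letter turns into two junk characters
```

With the default charset switched off, the server stops declaring iso-8859-1 and the meta tag in the page gets a say again.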
Now, after doing all this, I can finally see the Russian characters in the browser, retrieved from MySQL, and displayed with PHP.

Maybe these notes will be helpful to some poor soul.