Fixing double-encoded UTF-8 data in MySQL

I recently came across an old website of ours that I wanted to migrate to a new server. When I finished migration, I found the special characters typical to european languages broken. It took me a while to find the problem. The data was wrong in the Database but on the old system it was displayed correctly, the new didn’t. I finally figured out that my database suffered from double encoded UTF-8 characters.

It seams, that double-encoded UTF-8 texts are a fairly common problem when dealing with MySQL. This may be due to the fact that the default character set of the connection to the server is Latin-1, but that is not relevant once the data is already corrupt.

Here is how to fix it – export the database using latin1 charset an re-import with utf-8:

mysqldump -h DB_HOST -u DB_USER -pDB_PASSWORD –opt –quote-names –skip-set-charset –default-character-set=latin1 DB_NAME > dump.sql

mysql -h DB_HOST -u DB_USER -p DB_PASSWORD –default-character-set=utf8 DB_NAME < dump.sql

Of course, you should first replace DB_HOST, DB_USER, DB_PASSWORD and DB_NAME with values, corresponding to your personal setup.

source: Fixing double-encoded UTF-8 data in MySQL