I just ran into an interesting encoding mess on a column
containing non-ASCII (Russian) text. The solution was not
immediately obvious, so I decided it was worth sharing.
The column (actually the whole table) was created with DEFAULT
CHARSET cp1251. Most of the data was indeed in proper cp1251
encoding. However, because the web application failed to set the
encoding properly, some of the rows were actually stored as
UTF-8. That needed to be fixed.
Simply using CONVERT(column USING xxx) did not work, because MySQL
treated the source data as if it were in cp1251. One obvious
solution would be to write a throwaway PHP script that would SET
NAMES cp1251, pull the offending rows (they'd come out in UTF-8),
iconv() them to proper cp1251, and UPDATE them with new values.
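
To make the byte-level idea behind such a script concrete, here is a minimal sketch in Python rather than PHP. It only shows the conversion step on raw bytes, with no database access. The function name repair_row and the "valid UTF-8 means it was a bad row" heuristic are my own assumptions for illustration; a short cp1251 string can occasionally also be valid UTF-8, so check the matched rows before running any UPDATE.

```python
def repair_row(raw: bytes) -> bytes:
    """Return cp1251 bytes for a value that may have been stored as UTF-8.

    Hypothetical helper: treats any value that decodes cleanly as UTF-8
    (and fits into cp1251) as a mis-stored row; everything else is kept.
    """
    try:
        text = raw.decode("utf-8")      # succeeds for UTF-8 (and plain ASCII) rows
        return text.encode("cp1251")    # the iconv() step: UTF-8 -> cp1251
    except (UnicodeDecodeError, UnicodeEncodeError):
        return raw                      # assume the row was already proper cp1251


good = "привет".encode("cp1251")   # a correctly stored row
bad = "привет".encode("utf-8")     # a row written by the misconfigured application

print(repair_row(bad) == good)     # is the UTF-8 row converted?
print(repair_row(good) == good)    # is the cp1251 row left alone?
```

Plain ASCII values pass through unchanged, since their bytes are identical in both encodings.
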
However, it's possible to fix the issue within MySQL itself. The
trick is to tell it to treat the string coming from the table as binary,
and then do charset …