When I wrote this post about invalid utf8 characters, I needed a way
to turn the MySQL message into a real row identifier that I could
take a look at. Below is a quick and dirty script to find the bad
rows.
I wrote the script below in about two minutes. It's really basic:
it CROSS JOINs two versions of the table, one in utf8 and the other
in latin1, and reports back which string column is not correct. I
haven't cleaned the script up; it's ugly, and for my purposes it
will only exist for a short period of time.
test.$TABLE = the original table (latin1, for instance)
- load it with mysqlimport, setting the character set to binary;
this keeps the data from being converted from utf8 to latin1
(i.e. do not convert)
$DB.$TABLE = the new utf8 table.
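As a rough sketch of the kind of comparison described above (not the actual script), a query along these lines could flag rows whose stored bytes differ between the two copies. The names `mydb`, `mytable`, `id` and `name` are made-up placeholders, and comparing with CAST(... AS BINARY) is an assumption about how the check might be done; the real script may compare the columns differently.

```sql
-- Sketch only: all identifiers below are hypothetical placeholders.
--   test.mytable  = original copy (latin1 column holding unconverted bytes)
--   mydb.mytable  = new utf8 copy
--   id            = assumed primary key shared by both copies
--   name          = one string column to check
SELECT o.id,
       HEX(o.name) AS original_bytes,   -- raw bytes in the original copy
       HEX(n.name) AS converted_bytes   -- raw bytes in the utf8 copy
FROM `test`.`mytable` AS o
CROSS JOIN `mydb`.`mytable` AS n
WHERE o.id = n.id
  -- <=> is MySQL's NULL-safe equality, so a NULL on only one side is reported
  AND NOT (CAST(o.name AS BINARY) <=> CAST(n.name AS BINARY));
```

The same query would be repeated for each string column. In MySQL, CROSS JOIN with a WHERE condition on the key behaves like an ordinary inner join, so only matching rows are compared.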
#!/usr/bin/php …