Showing entries 1 to 3
Displaying posts with tag: utf8mb4 (reset)
Troubleshooting Issues with MySQL Character Sets Q & A

In this blog, I will provide answers to the Q & A for the Troubleshooting Issues with MySQL Character Sets webinar.

First, I want to thank everybody for attending the March 9 MySQL character sets troubleshooting webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: We’ve had some issues converting tables from

utf8

  to

utf8mb4

. Our issue was that the collation we wanted to use –

[Read more]
MySQL 5.7, utf8mb4 and the load data infile

In this post, I’ll discuss how MySQL 5.7 handles UTF8MB4 and the load data infile.

Many of my clients have told me that they do not like using the LOAD DATA INFILE statement and prefer to manually parse and load the data. The main reason they do it is issues with the character sets, specifically UTF8MB4 and the load data infile. This was surprising to me as nowadays everyone uses UTF8. MySQL 5.7 (as well as 5.6) has full support for UTF8MB4, which should fix any remaining issues (i.e., you can now load new emoji, like ).

Last week I was investigating an interesting case where we were loading data and got the following error:

mysql -e 'select version()'
+-----------+
| version() |
+-----------+
| 5.7.12    |
+-----------+
$ mysql -vvv testdb < load_data.sql
ERROR 1300 (HY000) at line 1: Invalid …
[Read more]
Importing the Unicode Character Database in MySQL

In Python it is easily possible to findout the name of a Unicode character and findout some properties about that character. The module which does that is called unicodedata.

An example:

>>> import unicodedata
>>> unicodedata.name('☺')
'WHITE SMILING FACE'

This module uses the data as released in the UnicodeData.txt file from the unicode.org website.

So if UnicodeData.txt is a 'database', then we should be able to import it into MySQL and use it!

I wrote a small Python script to automate this. The basic steps are:

  • Download UnicodeData.txt
  • Create a unicodedata.ucd table
  • Use LOAD DATA LOCAL INFILE to load the data

This isn't difficult especially because the file doesn't have the actual characters …

[Read more]
Showing entries 1 to 3