Update: WordPress 4.2 has full UTF-8 support! There’s no need to upgrade manually any more.
For many years, MySQL supported only a small part of UTF-8: plane 0, commonly referred to as the “Basic Multilingual Plane”, or the BMP. Unicode is divided into “planes”, and plane 0 contains the most commonly used characters. For a long time, this was reasonably sufficient for MySQL’s purposes, and WordPress made do with this limitation.
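To make the plane distinction concrete, here is a minimal Python sketch. The helper names `in_bmp` and `utf8_length` are made up for illustration and are not part of WordPress or MySQL. It relies on two facts: plane 0 covers code points up to U+FFFF, and UTF-8 encodes those in at most three bytes, while characters in higher planes (most emoji, for example) need four.

```python
def in_bmp(ch: str) -> bool:
    """Return True if the character's code point lies in plane 0 (U+0000..U+FFFF)."""
    return ord(ch) <= 0xFFFF


def utf8_length(ch: str) -> int:
    """Return the number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# A few sample characters: ASCII, accented Latin, the euro sign, and an emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X}  in BMP: {in_bmp(ch)}  UTF-8 bytes: {utf8_length(ch)}")
```

Any character for which `in_bmp` returns False falls outside the range described above, so it is the kind of character that a BMP-only character set cannot store.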
It has always been possible to store all UTF-8 characters in the latin1 character set, though latin1 has shortcomings. While it recognises the connection between upper and lower case characters in Latin alphabets (such as English, French and German), it doesn’t recognise the same connection for other alphabets. For example, it doesn’t know …