The UTF-8 spec says that a character can take up to 4 bytes, but
MySQL's utf8 character set currently supports only up to 3 bytes
per character. So if your application allows 255 characters to be
inserted into a field, then in utf8 land (i.e. a utf8 column)
those 255 characters can take up to 765 bytes (3 bytes for each
of the 255 characters).
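
The arithmetic can be checked with a small Python sketch. It runs
outside MySQL and only assumes the 3-byte-per-character limit
described above; the names MAX_CHARS and MAX_BYTES_PER_CHAR are
made up for the illustration.

```python
# Worst-case storage for a 255-character field, assuming at most
# 3 bytes per character (the limit described in the text).
MAX_CHARS = 255            # hypothetical field length in characters
MAX_BYTES_PER_CHAR = 3     # assumed per-character limit

worst_case_bytes = MAX_CHARS * MAX_BYTES_PER_CHAR

# U+4E2D is a CJK ideograph that needs three bytes in UTF-8, so a
# string of 255 of them reaches the worst case.
sample = "\u4e2d" * MAX_CHARS
encoded_len = len(sample.encode("utf-8"))

print(worst_case_bytes, encoded_len)  # prints "765 765"
```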
Here is a breakdown from dev.mysql.com:
- Basic Latin letters, digits, and punctuation signs use one
byte.
- Most European and Middle East script letters fit into a
two-byte sequence: extended Latin letters (with tilde, macron,
acute, grave and other accents), Cyrillic, Greek, Armenian,
Hebrew, Arabic, Syriac, and others.
- Korean, Chinese, and Japanese ideographs use three-byte
sequences.
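
To see these size classes in practice, here is a rough Python
sketch that measures the UTF-8 length of one sample character per
class. The helper name utf8_len and the sample characters are my
own choices, not something from the MySQL docs. The last sample is
a 4-byte character, the kind that falls outside the 3-byte limit
mentioned earlier.

```python
def utf8_len(ch: str) -> int:
    """Return how many bytes a single character takes in UTF-8."""
    return len(ch.encode("utf-8"))

# (description, sample character) pairs; escapes are used so the
# file itself stays plain ASCII.
samples = [
    ("Basic Latin 'A'", "A"),
    ("Latin n with tilde", "\u00f1"),
    ("Cyrillic Zhe", "\u0416"),
    ("Hebrew Alef", "\u05d0"),
    ("Chinese ideograph", "\u4e2d"),
    ("Korean syllable", "\ud55c"),
    ("Emoji (4-byte case)", "\U0001F600"),
]

# Map each description to its measured byte length.
lengths = {name: utf8_len(ch) for name, ch in samples}

for name, size in lengths.items():
    print(f"{name}: {size} byte(s)")
```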