In Python it is easy to find out the name of a Unicode
character and to look up some of its properties. The
module that does this is called unicodedata.
An example:
>>> import unicodedata
>>> unicodedata.name('☺')
'WHITE SMILING FACE'
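Besides the name, the same module exposes a few other properties. The following is a minimal sketch that uses only standard-library calls; the variable names smiley and info are just illustrative:

```python
import unicodedata

smiley = "\u263a"  # the WHITE SMILING FACE character used above

# Collect a few properties that unicodedata exposes for one character.
info = {
    "code_point": f"U+{ord(smiley):04X}",
    "name": unicodedata.name(smiley),
    "category": unicodedata.category(smiley),            # general category
    "bidirectional": unicodedata.bidirectional(smiley),  # bidi class
}

for key, value in info.items():
    print(f"{key}: {value}")

# The reverse direction, from a name to a character, also works.
assert unicodedata.lookup("WHITE SMILING FACE") == smiley
```

The version of the Unicode database that the module was built from is available as unicodedata.unidata_version.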
This module uses the data as released in the UnicodeData.txt file
from the unicode.org website.
So if UnicodeData.txt is a 'database', then we should be able to
import it into MySQL and use it!
I wrote a small Python script to automate this. The basic steps
are:
- Download UnicodeData.txt
- Create a unicodedata.ucd table
- Use LOAD DATA LOCAL INFILE to load the data
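The steps above can be sketched roughly as follows. This is not the original script, only an illustration under stated assumptions: the column names, the VARCHAR(255) types and the helper function names are hypothetical, and the SQL is only built as strings so that it can be passed to whichever MySQL client or connector is in use. UnicodeData.txt has 15 semicolon-separated fields per line, which is what the column list mirrors.

```python
import urllib.request

# Location of the current UnicodeData.txt on unicode.org.
UCD_URL = "https://www.unicode.org/Public/UNIDATA/UnicodeData.txt"

# Hypothetical column names, one per field of UnicodeData.txt (15 fields).
COLUMNS = [
    "code_point", "name", "general_category", "combining_class",
    "bidi_class", "decomposition", "decimal_value", "digit_value",
    "numeric_value", "bidi_mirrored", "unicode_1_name", "iso_comment",
    "uppercase", "lowercase", "titlecase",
]


def download(path="UnicodeData.txt"):
    """Step 1: fetch UnicodeData.txt and store it locally."""
    urllib.request.urlretrieve(UCD_URL, path)
    return path


def parse_line(line):
    """Split one line of UnicodeData.txt into a dict (handy for checks)."""
    fields = line.rstrip("\n").split(";")
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    return dict(zip(COLUMNS, fields))


def create_table_sql():
    """Step 2: a CREATE TABLE statement for unicodedata.ucd.

    Every column is VARCHAR(255) to keep the sketch simple (assumption).
    """
    cols = ",\n  ".join(f"{name} VARCHAR(255)" for name in COLUMNS)
    return f"CREATE TABLE unicodedata.ucd (\n  {cols}\n)"


def load_data_sql(path):
    """Step 3: a LOAD DATA LOCAL INFILE statement for the downloaded file."""
    return (
        f"LOAD DATA LOCAL INFILE '{path}' INTO TABLE unicodedata.ucd "
        "FIELDS TERMINATED BY ';' LINES TERMINATED BY '\\n'"
    )
```

To try it, call download(), then execute create_table_sql() and load_data_sql("UnicodeData.txt") through a MySQL connection. The unicodedata schema has to exist first, and LOAD DATA LOCAL has to be permitted on both the client and the server.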
This isn't difficult, especially because the file doesn't contain
the actual characters. It is …