The client du jour has a table with a column that keeps phone
numbers. This is normal.
However, the column is a varchar(255) charset UTF8.
This is not okay.
In MySQL, that will take up 765 bytes per row in memory, plus
will be doing UTF8 to to 3byte expansion for every row read or
scanned. And it has to use the wchar string ops to do
comparisons.
The ISO standard for PSTN phone numbers, E.164, states that phone
numbers are a max of 15 digits long, and that includes the
country code.
Make phone numbers a varchar(16) charset ASCII. Have your
application mash out all the spaces and hypens.
Then marshalling and expansion is free, and modern processors can
do an equality test against 16 bytes in a single operation.
16 bytes vs 765, that's a 50x savings.
Oh, and stop assuming that all phone numbers are in the US and
Canada, and thus are of the form (AAA) EEE-TTTT. There are
acceptably good library routines in your language of choice that
will accept a string of digits, look at the country code, know
that country's dialing plan, and output a "pretty formated" phone
number. Don't try to keep pretty formated phone numbers in your
database, and don't try to make the database server do the pretty
formatting.
Oct
01
2008