One of the most common patterns I see in my consulting work is
identifiers that are generated by MD5() or UUID(). Many times
this is done in an application framework or something similar —
not software the client has written. From the application
programmer’s point of view, it’s just an incredibly handy idiom:
generate a unique value and use it, you’re done.

Those values tend to appear in session identifiers, but that’s
not the only place; I especially notice them in apps that use
Java’s Hibernate interfaces, whether session IDs are involved or
not. They propagate themselves all around the other tables, where
they become secondary indexes and even get combined with other
columns to make even bigger keys.

What’s wrong with this? There are two major things that hurt
performance in such cases: larger data and indexes, and
non-sequential values. I’ll ignore the latter in this article,
since whether an identifier is …
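
To make the "larger data" point concrete, here is a minimal sketch in Python (not from the original article) that compares the text form of these identifiers with their raw binary form. The input string is a hypothetical placeholder, and the lengths are counted in characters and bytes of the values themselves; actual on-disk size in a database will also depend on column type and character set.

```python
import hashlib
import uuid

# Hypothetical input; any bytes produce digests of the same length.
digest = hashlib.md5(b"example-session")

hex_form = digest.hexdigest()  # text form, as MD5() returns it: 32 hex characters
raw_form = digest.digest()     # the same value as raw bytes: 16 bytes

u = uuid.uuid4()
uuid_text = str(u)             # text form with hyphens: 36 characters
uuid_raw = u.bytes             # the same value as raw bytes: 16 bytes

print(len(hex_form), len(raw_form))
print(len(uuid_text), len(uuid_raw))
```

The text forms are at least twice as long as the underlying 128-bit values, and that overhead is repeated in every secondary index and composite key the identifier ends up in.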