Twitter's new tweet store:
When you tweet, it's stored in an internal system called T-bird,
which is built on top of Gizzard. Secondary indexes are stored in
a separate system called T-flock, which is also Gizzard-based.
Unique IDs for each tweet are generated by Snowflake; these IDs can
be sharded more evenly across a cluster. FlockDB, which also uses
Gizzard, handles ID-to-ID mapping, storing the relationships
between IDs.
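To make the ID generation step concrete, here is a minimal sketch of a Snowflake-style generator. The text above does not describe the ID layout, so the bit widths (41-bit millisecond timestamp, 10-bit worker ID, 12-bit sequence) and the epoch constant are assumptions taken from Snowflake's public description, and the class and method names are hypothetical. It is not Twitter's implementation.

```python
import threading
import time

# Assumed layout: timestamp | worker ID | per-millisecond sequence.
EPOCH_MS = 1288834974657  # assumed custom epoch, illustrative only
WORKER_BITS = 10
SEQUENCE_BITS = 12
MAX_WORKER_ID = (1 << WORKER_BITS) - 1
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1


class SnowflakeStyleGenerator:
    """Hypothetical generator of roughly time-ordered 64-bit IDs."""

    def __init__(self, worker_id, clock=None):
        if not 0 <= worker_id <= MAX_WORKER_ID:
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        # The clock is injectable so the generator can be tested.
        self._clock = clock or (lambda: int(time.time() * 1000))
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                # Same millisecond: bump the sequence counter.
                self._sequence = (self._sequence + 1) & SEQUENCE_MASK
                if self._sequence == 0:
                    # Sequence exhausted: wait for the next millisecond.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - EPOCH_MS
            return (
                (timestamp << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self._sequence
            )
```

Because each worker embeds its own ID in the result, workers in this sketch can generate IDs independently without coordinating with each other.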
Gizzard is Twitter's distributed data storage framework built on
top of MySQL (InnoDB).
InnoDB was chosen because it doesn't corrupt data. Gizzard is
just a datastore: data is fed in and you get it back out again.
To get higher performance on individual nodes, a lot of MySQL
features like binary logs and replication are turned off. Gizzard
handles sharding, replicating N copies of the data, and job scheduling.
Gizzard is used as a building block for other storage systems at
Twitter. …
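The division of labour described above (the framework routes keys to shards and keeps N copies, while the backing store only stores data) can be illustrated with a toy sketch. Everything here is hypothetical: the class names, the equal-range hash partitioning, and the in-memory dicts standing in for MySQL nodes are assumptions for illustration, not Gizzard's actual API. Job scheduling is left out.

```python
import bisect
import hashlib


class ReplicatedShard:
    """Hypothetical shard that keeps N in-memory copies of its data."""

    def __init__(self, name, copies):
        self.name = name
        # Each dict stands in for one backing database node.
        self.replicas = [{} for _ in range(copies)]

    def put(self, key, value):
        # Writes fan out to every replica.
        for replica in self.replicas:
            replica[key] = value

    def get(self, key):
        # Reads are served by the first replica that has the key.
        for replica in self.replicas:
            if key in replica:
                return replica[key]
        raise KeyError(key)


class ForwardingTable:
    """Hypothetical table mapping ranges of hashed keys to shards."""

    def __init__(self, shards):
        # Split the 64-bit hash space into equal ranges, one per shard.
        step = (1 << 64) // len(shards)
        self.lower_bounds = [i * step for i in range(len(shards))]
        self.shards = shards

    @staticmethod
    def _hash(key):
        digest = hashlib.sha1(str(key).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def shard_for(self, key):
        # Find the last range whose lower bound is <= the hashed key.
        index = bisect.bisect_right(self.lower_bounds, self._hash(key)) - 1
        return self.shards[index]


shards = [ReplicatedShard(f"shard{i}", copies=2) for i in range(4)]
table = ForwardingTable(shards)
table.shard_for(12345).put(12345, "hello")
print(table.shard_for(12345).get(12345))
```

In this sketch the caller only ever talks to the forwarding table, which is the sense in which a framework like this can serve as a building block for other storage systems.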