Brian
Aker writes in his "PostgreSQL to scale to 1 billion users"
post:
Backup is irrelevant for those of you who care about this
discussion. LVM/ZFS snapshots are the rule of the land.
While I agree with most of Brian's statements in the article, I
respectfully disagree with the statement above, especially the
bolded part. Copy-on-write snapshots are EVIL for very large
databases operating in a high I/O environment and backup, by no
means, is entirely irrelevant. Please correct me if I am wrong
but it is my understanding that both LVM and ZFS implement
copy-on-write snapshots. Backup may be irrelevant for most sites
but not for us.
If, however, by "irrelevant" Brian meant "not important in
choosing one database over another," I can agree with that. Why?
Because no one benchmarks …
I spent my Sunday working on the three presentations that I will be giving
at the upcoming MySQL Conference. About two hours ago, as I was
reviewing my stuff, I told my lovely wife that I may talk in my
sessions about how replication for read scalability no longer makes
sense in high traffic environments. I told her, I am probably
going to vote in favor of investing in memcached vs read slaves
for scaling reads.
Believe it or not, she hammered me with all sorts of questions.
I spent some time answering her questions. I scanned my brain to
gather more evidence to support myself, including that at work we
are moving away from replication and staying away from it as much as
possible.
Then, I got busy writing the post about Facebook using MySQL
replication to update Memcached. After publishing the post, I
checked Planet MySQL and found …
A number of months ago, possibly a year ago, I wrote an internal
letter to the MySQL internal discuss list with the title of "The
Death of Read Replication". Ever since then I've been getting
pinged internally to publish my thoughts on this
externally.
Here goes :)
Read replication is going to be in use for many years into the
future. There are plenty of reasons to use it, and plenty of
setups where it will make sense.
All of the scripting, management daemons, and ease of use
scenarios will not solve its problems though, and I am finding
that users have either moved away from it, or more often, have
reduced their need for it.
A few reasons:
Latency is painful to manage.
Lots of servers means more head count (both in disks and in the number
of people to manage them)
In web usage, the rule of thumb is to keep your query number
under 7; for this reason you try to make more out of …
Faced with the challenge "to figure out a way for memcached servers to replicate data concurrently with
the MySQL databases," across the country, Facebook came up
with a clever solution of "embedding extra information in to the
MySQL replication stream that allows [Facebook] to properly
update memcached [servers] in Virginia."
This is very smart! I am curious about how they implemented this.
I wonder if by "replication stream" they are just referring to
binary logs. The article didn't mention whether they hacked MySQL
to do synchronous replication as well, like Google. That would be
really neat: synchronous replication that updates memcached.
Synchronous or not, the idea is still uber cool and I would love
to see more …
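The article does not say how Facebook implemented this, so the following is only a guess at the general shape of the idea, not their code. It is a minimal sketch that assumes each replicated change carries a list of cache keys it makes stale, and that the remote site deletes those keys after applying the change. `ReplicationEvent`, `ReplicaSite`, `dirty_cache_keys` and the `user:<id>:name` key format are all hypothetical names, and plain dicts stand in for both the MySQL replica and memcached.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicationEvent:
    """Hypothetical replicated change: a row update plus the cache keys it affects."""
    table: str
    row_id: int
    new_values: dict
    dirty_cache_keys: list = field(default_factory=list)  # the "extra information"


class ReplicaSite:
    """Stand-in for a remote data center: a local database copy and a local cache."""

    def __init__(self):
        self.database = {}  # (table, row_id) -> row values; stands in for the replica
        self.cache = {}     # cache key -> cached value; stands in for memcached

    def apply(self, event):
        # Apply the row change to the local database copy first ...
        self.database[(event.table, event.row_id)] = dict(event.new_values)
        # ... then drop every cache entry the event marks as stale.
        for key in event.dirty_cache_keys:
            self.cache.pop(key, None)

    def read_user_name(self, user_id):
        # Cache-aside read: serve from the cache, fall back to the local database.
        key = f"user:{user_id}:name"
        if key not in self.cache:
            self.cache[key] = self.database[("users", user_id)]["name"]
        return self.cache[key]
```

In this sketch an event that arrives without its key list leaves the old value in the remote cache, which is the stale-read problem the quoted solution is meant to avoid.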
For reference:
http://highscalability.com/skype-plans-postgresql-scale-1-billion-users
Here are some observations by me on the state of database usage
in Web 2.0:
All major web 2.0 sites now use object caching (of one type or
another)
Sharding and now Proxy style solutions are becoming commodity.
They are everywhere.
What does this mean?
Replication is dead except for replicating for "application"
needs.
Good News :)
For MySQL it encourages multiple engines. For Postgres I suspect
their flexible index design will be useful. The "I replicated
over here for a backup, or to run reports..." is still happening
a lot. Multi-master replication is one scenario to achieve high
availability (DRBD on the low end... you will go broke trying to
deploy …
I’ve managed my first week (well, 4 days) at my new employer, PrimeBase Technologies. Another open source company, a different open source company, my second in succession. A company that actually has its parent roots in the commercial sector and is now branching into Open Source and supporting the community.
My departure from MySQL was really no surprise to friends within the MySQL community, just some shock at the accelerated timing. Actually, it should not have been a shock to my employer either, as I’d expressed these intentions clearly in two reviews in the past twelve months. My April Fool’s managed to shock and catch out many to continue this saga.
Our focus has been the …
Here are two basic tips for proper indexing ...
Don't mess with datatypes: too often people refer to an attribute defining it as one datatype in a table and as another in different tables; this actually prevents index usage in joins (forget about FKs for this time ;)). See an example here. You could declare a function based index as a workaround, but why don't we all try to make it right?
Put
There was just a thread on the Freenode #mysql IRC channel with
someone wanting to switch off and delete their binary logs. Why?
Because they were short on disk space.
Mind you, this was a production system, so generally it's rather
a bad idea to disable binary logging there, unless you really
don't value your data - but in that case you might as well just
close down your shop now ;-)
This is not about MySQL reliability, but hardware can and will
fail, and all kinds of other things can and will go wrong over
time.
While I appreciate the jam this person was in today (choosing
between not being online at all and disabling the binlog for
now), it's so much better to prevent this. I actually hear
about database servers running out of disk space quite often, so
this is a common event!
It's something to keep a close eye on, for instance using
Nagios, and
have an alert …
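As a rough illustration of the kind of check meant here, this is a minimal sketch of a disk space check that follows the Nagios plugin convention of exit codes 0 (OK), 1 (WARNING) and 2 (CRITICAL). The 80% and 90% thresholds, the function names and the default path are assumptions for the example, not values from the post; point it at whatever filesystem holds your data directory and binary logs.

```python
import shutil

OK, WARNING, CRITICAL = 0, 1, 2  # Nagios plugin exit codes


def classify_disk_usage(used_fraction, warn_at=0.80, crit_at=0.90):
    """Map a used fraction (0.0 to 1.0) to a status; thresholds are assumed defaults."""
    if used_fraction >= crit_at:
        return CRITICAL
    if used_fraction >= warn_at:
        return WARNING
    return OK


def check_path(path, warn_at=0.80, crit_at=0.90):
    """Return (status, one-line message) for the filesystem that contains path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    used_fraction = usage.used / usage.total
    status = classify_disk_usage(used_fraction, warn_at, crit_at)
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[status]
    return status, f"DISK {label} - {path} is {used_fraction:.0%} full"


def main(argv):
    """Print the status line and return the exit code for the monitoring system."""
    path = argv[1] if len(argv) > 1 else "/"  # assumed default; pass the real path
    status, message = check_path(path)
    print(message)
    return status
```

To use it as a plugin, end the script with `import sys` and `sys.exit(main(sys.argv))` so the monitoring system sees the exit code. Warning well before the disk is full is what gives you time to purge or archive old binary logs instead of switching binary logging off in a panic.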
A few months ago, I wrote about the issues you will face with installing MySQL on OS X 10.5, Leopard. I am pleased to inform everyone that this problem has been fixed!
The bug in question, mysql#28854, clearly stated that the problem was with the PrefPane. On Valentine’s Day 2008, Alfredo Kojima (of Workbench fame) fixed the problem and uploaded a new PrefPane to ftp://ftp.mysql.com/pub/mysql/download/gui-tools/MySQL.prefPane-leopardfix.zip.
This fixes an incompatibility with the default shipped PrefPane. The new PrefPane also detects if the MySQL data directory (/usr/local/mysql/data) has the incorrect permissions (and if so, one should …
Plenty of people have been excited by the prospect of Amazon EC2 and the ability to scale out your databases as load increases from your original configuration. I noticed Morgan Tocker and Carl Mercier are going to be presenting on this topic at the upcoming MySQL Conference. However, almost immediately people are worried about the lack of persistence of data across instance terminations. In a sense