Today Virident announced a set of servers, called
GreenCloud, aimed at increasing performance for MySQL and
memcached servers. Last week I got a chance to talk to Vijay
Karamcheti and Shridar Subramanian at Virident about their
technology and get a preview of what they are up to.
The technology Virident uses to improve performance is a third
level of memory storage based on Flash. But it goes way beyond
just adding SSD disks. To put things in perspective, look at how
the resources in an average server have developed in the last 20
years or so. We now have something like 1000 times more memory
and 1000 times more CPU performance, but disk performance has
increased very little, maybe 5 times, and that is an optimistic
number. Note that this is about disk performance; available
disk storage has also increased 1000 times or so.
What does this mean then? Well, disk I/O is an issue, probably the
main issue for database performance. Now, databases have still
gotten a lot faster, as we have more memory and can hence
cache A LOT more data, which speeds things up enormously. That
performance comes from the fact that we can avoid disk I/O.
There are a couple of issues here though:
- For writes, I still need to go to the disk. Independent of how much RAM I have, a disk I/O will still need to happen, to the database or to a logfile. The reason is simple: if I put my written and committed transaction only in a log buffer in memory, my transaction will not be persisted.
- Caching of databases only helps so much. Once you have cached, say, 20 % of the data in the database, further caching will not improve performance as much. The reason is of course that data access patterns are skewed; they are not evenly spread across the total size of the database.
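The diminishing return from caching under skewed access can be sketched with a small simulation. This is only an illustration under assumptions that are not from Virident or MySQL: a Zipf-like popularity distribution, an LRU replacement policy, and made-up values for `n_pages`, `n_accesses` and `skew`.

```python
import random
from collections import OrderedDict
from itertools import accumulate


def hit_rate(cache_fraction, n_pages=10_000, n_accesses=100_000, skew=1.0, seed=42):
    """Simulate an LRU page cache; return the fraction of accesses served from it."""
    rng = random.Random(seed)
    # Assumed Zipf-like popularity: page i gets weight 1 / (i + 1) ** skew.
    cum_weights = list(accumulate(1.0 / (i + 1) ** skew for i in range(n_pages)))
    accesses = rng.choices(range(n_pages), cum_weights=cum_weights, k=n_accesses)

    capacity = max(1, int(n_pages * cache_fraction))
    cache = OrderedDict()  # page ids, ordered from least to most recently used
    hits = 0
    for page in accesses:
        if page in cache:
            hits += 1
            cache.move_to_end(page)  # mark as most recently used
        else:
            cache[page] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used page
    return hits / n_accesses


if __name__ == "__main__":
    for fraction in (0.05, 0.10, 0.20, 0.40, 0.80):
        print(f"cache = {fraction:4.0%} of pages -> hit rate {hit_rate(fraction):.1%}")
```

With a skewed workload like this, each additional slice of cache buys a smaller improvement in hit rate than the previous one; the exact numbers depend entirely on the assumed distribution.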
So, then, this means that when you have some percentage of the
database in the cache, and this is a block cache mostly (the
Falcon storage engine has some interesting ways of dealing with
this though), then we are still locked into the I/O performance
of the disks.
But if we go back 20 years in time again, when we were
compensating for slow disks by caching data in RAM, there were
compromises being made: fast, random RAM access as opposed
to slow, block-level disk access. What has happened since is
that the gap between the size of RAM and the performance of disks
has become even bigger. So can we not fill that gap?
Looking at the attributes of the two types of memory we have
discussed so far, RAM:
- Is fast and randomly accessed.
- But RAM is also not persistent. It is this point that makes disks still so important. Having the whole database in RAM is actually possible in many cases these days, but this is not useful on its own, as that data will not be persistent.
Disk storage on the other hand:
- Is persistent and has higher capacity.
- But disks are also slow and use block-level I/O.
Looking at this, if we had a third memory level that was faster
than disk and persistent, but possibly not as large as the disk
system, that would solve a LOT of problems. A database that needs
to persist something in a transaction log or a database today
needs to write to disk. The key word here is persist. And that
is exactly what Virident provides: a Flash memory based system
with most of the attributes of RAM, although a fair bit slower
(but still WAY faster than disk), which is randomly accessed, just
like RAM (here is a significant difference from SSD disks), and is
persistent.
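To make the cost of "persist" concrete, here is a minimal sketch of what a commit to a transaction log looks like on an ordinary disk-backed system. It is not how InnoDB or Virident implement logging; the file name `txn.log` and the record contents are hypothetical. The point is only that a commit cannot return before the data has been forced to stable storage, which is the step a faster persistent memory level would speed up.

```python
import os
import tempfile


def commit(log_fd, record):
    """Append a record to the log and return only once it is on stable storage."""
    os.write(log_fd, record + b"\n")
    # Without this call the record may sit in OS buffers in RAM and be lost
    # on a crash; with it, every commit waits for the storage device.
    os.fsync(log_fd)


def demo():
    """Write two hypothetical transactions and read the log back."""
    path = os.path.join(tempfile.mkdtemp(), "txn.log")  # hypothetical log file
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        commit(fd, b"BEGIN; UPDATE t SET x = 1; COMMIT")
        commit(fd, b"BEGIN; UPDATE t SET x = 2; COMMIT")
    finally:
        os.close(fd)
    with open(path, "rb") as f:
        return f.read().splitlines()


if __name__ == "__main__":
    for line in demo():
        print(line.decode())
```

The time spent in `os.fsync` is bounded by the device, not by how much RAM the server has.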
I want to note that there are other ways of solving this problem.
One is to do what MySQL Cluster is doing, which is “semi
persisting” RAM by synchronous replication between nodes.
As anyone can realize, applications really need to be aware of
this “third storage media” that Virident provides to work
properly. Virident has a special version of the InnoDB Plug-in to
handle this. And the known scalability issues with InnoDB are not
really present here either, or at least to a much lesser extent
than in “normal” InnoDB, as this is the InnoDB Plug-in with a lot
of fixes for this same problem.
And it doesn’t end there. As I wrote above, for the developer
this Flash memory has similar attributes to RAM, i.e. it is not a
block-level device but random access, and there is no context
switching needed! These are the two features that make this
technology stand apart from just plugging SSD disks into any
server!
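The difference between block-level and random access can be illustrated by analogy with an ordinary file. This is not Virident's interface, and a memory-mapped file still goes through the operating system's page cache, so treat it purely as a sketch of the two programming models. `BLOCK_SIZE`, the file name and the offsets are assumptions made for the example.

```python
import mmap
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def update_via_blocks(path, offset, value):
    """Block-style update: read a whole block, patch one byte, write it back."""
    block_start = (offset // BLOCK_SIZE) * BLOCK_SIZE
    with open(path, "r+b") as f:
        f.seek(block_start)
        block = bytearray(f.read(BLOCK_SIZE))
        block[offset - block_start] = value
        f.seek(block_start)
        f.write(block)


def update_via_mapping(path, offset, value):
    """Memory-style update: map the file and store to a single byte address."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mem:
            mem[offset] = value  # looks like a plain memory store
            mem.flush()


def demo():
    """Change one byte with each style and return both resulting values."""
    path = os.path.join(tempfile.mkdtemp(), "data.bin")  # hypothetical data file
    with open(path, "wb") as f:
        f.write(bytes(4 * BLOCK_SIZE))
    update_via_blocks(path, 5000, 0xAB)
    update_via_mapping(path, 9000, 0xCD)
    with open(path, "rb") as f:
        data = f.read()
    return data[5000], data[9000]


if __name__ == "__main__":
    print(demo())
```

In the block style, changing one byte means moving a whole block through explicit read and write calls; in the memory style, the program simply addresses the byte it wants.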
All in all, I’m excited about this; there is a lot of performance
potential to gain from this setup. Being able to scale
write performance on a single server to a new, higher level means
that technologies that are in and of themselves good, like sharding,
might not be needed as much anymore. Also, any distributed technology
to solve this problem, like MySQL Cluster, has limitations, such as
cache invalidation and distributed locking, neither of which makes
for high scalability. Maybe Virident technology will be a standard
component in any high-end MySQL server eventually?
/Karlsson