Yesterday, while at the MySQL Conference, I was interviewed by
Robert Scoble about my employer, Gear6, and our product, an
enterprise memcached distribution.
While waiting in line for a breakfast table, I found Reggie
Burnett, who is still with MySQL, now Oracle. We shared a table
and talked about Android and the future of handhelds.
I missed the keynotes by Edward Screven and by Tim O'Reilly.
Instead I had scheduled interviews with The 451 Group and then
with Robert Scoble. Both went really well. I later learned that
the Screven speech did not go so well; watching it would have
been amusing, but not a good use of time.
The rest of the day, so far, has consisted of meeting people,
spending time at the Memcached.org booth and the Gear6 booth, and
doing more scheduled tech press interviews. Sarah Novotny showed
up during the nosh and free beer, right before the BOF sessions.
…
I keep seeing "Memcached is not a key value store. It is a cache.
Hence the name." This is strongly reinforced by statements made
in the memcached mailing list itself.
This is shortsighted.
Memcached is a number of things. It is an idea (fast key value
store, with distributed hash function scaling), it is a network
protocol (two of them, in fact), it is a selection of client
libraries and APIs (most based on libmemcached), and it is a
server implementation. In fact, it is now a number of server
implementations, because a number of different things now
implement the memcached protocol. Only one of them is the open
source community edition of the memcached server, version 1.4,
downloadable from
http://memcached.org/
Despite what you may get told, especially on the memcached
mailing list, you can in fact use memcached as a store, not just
as a cache.
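The "distributed hash function scaling" idea mentioned above can be sketched in a few lines. This is only an illustration, not how any particular client library does it: the server addresses are placeholders, `server_for` is a made-up name, and simple modulo distribution is assumed here for brevity.

```python
import hashlib

# Hypothetical server list; the addresses are placeholders.
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]


def server_for(key, servers=SERVERS):
    """Map a key to one server by hashing it (simple modulo distribution)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Interpret the first 4 bytes of the digest as an unsigned integer.
    bucket = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[bucket]
```

Every client that runs the same function over the same server list picks the same server for a given key, so the servers never need to talk to each other. The drawback of plain modulo is that changing the number of servers remaps most keys, which is why consistent hashing is often preferred.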
…
(originally posted at the Gear6 corporate blog: MySQL+Memcached is still the
workhorse. Please comment there.)
Because I'm becoming known as someone who knows something about
"this whole NoSQL thing", people have started asking me to take a
look at some of their systems or ideas, and tell them which NoSQL
technology they should use.
To be fair, it is a confusing space right now. There are a LOT of
NoSQL technologies showing up, and there is a lot of buzz from
the tech press, in blogs, and on Twitter. Most of that
buzz is, frankly, ignorant and uninformed, and is being written
by people who do not have enough experience running live
systems.
A couple of times already, someone has described an application
or concept to me, and asked "So, should I use Cassandra, or
CouchDB, or what?"
And I look …
From the Changelog:
C++ interface for libhashkit.
Modified memcached_set_memory_allocators() so that it requires a
context pointer.
memcached_clone() now runs 5 times faster.
Functions used for callbacks are now given const
memcached_st.
Added MEMCACHED_BEHAVIOR_CORK.
memslap now creates a configuration file at ~/.memslap.cnf
memcached_purge() now calls any callbacks registered during get
execution.
Many fixes to memslap.
Updates for memcapable.
Compile fixes for OpenBSD.
Fix for possible recursive descent on IO failure.
Possibly the most exciting piece is the performance win for
memcached_clone(). In a lot of situations developers use
libmemcached with Apache. Each time an Apache process has to be
created, a clone() call is made (in some PHP architectures this
happens with each request). In local testing this went from
around 300ms for me, down to …
The slides for my presentation at FOSDEM 2010 are now available online at SlideShare. In this presentation I describe a successful client implementation that resulted in 10x performance improvements. My presentation covers monitoring, reviewing and analyzing SQL, the art of indexes, improving SQL, storage engines and caching.
The end result was a page load improvement from 700+ms load time to a consistent 60ms.
10x Performance Improvements – A Case Study, a presentation by Ronald Bradford.
At dealnews we have three tiers of servers. First are our
development servers, then staging, and finally production. The
complexity of the environment increases at each level. On a
development server, everything runs on localhost: MySQL,
memcached, etc. At the staging level, there is a dedicated MySQL
server. In production, it gets quite wild with redundant services
and two data centers.
One of the challenges of this is where and how to store the
connection information for all these services. We have done
several things in the past. The most common thing is to store
this information in a PHP file. It may be per server or there
could be one big file like:
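<?php
// Pick the database host based on the environment.
// DEV is assumed to be a boolean constant defined elsewhere.
if (DEV) {
    $server = "localhost";
} else {
    $server = "10.1.1.25";
}
?>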
This gets messy quickly. Option two is to …
One of the problems I have with Memcache is that the cache is passive: it only stores cached data. This means an application using Memcache has to have special logic to handle misses from the cache, and has to be careful when updating the cache, since multiple data modifications may be happening at the same time. Finally, you pay with increased latency when reconstructing items that have expired from the cache, while they could have been refreshed in the background. I think all of these problems could be solved with the concept of an active cache.
The idea with Active Cache is very simple: for any data retrieval operation the cache would actually know how to construct the object, so you would never get a miss from the cache unless there is an error. Among existing tools, this probably maps best onto registering the jobs with Gearman.
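As a rough illustration of the active cache idea, here is a minimal in-process sketch. Everything in it is hypothetical: the `ActiveCache` class, its method names, and the use of a plain dictionary in place of memcached. The builders are called inline, where the post suggests registering them as Gearman jobs.

```python
class ActiveCache:
    """Toy in-process cache that knows how to build every object it serves."""

    def __init__(self):
        self._builders = {}  # key prefix -> function that constructs the value
        self._data = {}      # stands in for the memcached servers

    def register(self, prefix, builder):
        """Tell the cache how to construct objects of one kind."""
        self._builders[prefix] = builder

    def get(self, prefix, ident):
        key = f"{prefix}:{ident}"
        if key not in self._data:
            # The cache, not the caller, constructs the missing object.
            self._data[key] = self._builders[prefix](ident)
        return self._data[key]

    def refresh(self, prefix, ident):
        """Rebuild an entry, e.g. after an update or from a background job."""
        key = f"{prefix}:{ident}"
        self._data[key] = self._builders[prefix](ident)
        return self._data[key]
```

A caller registers a builder once, for example `cache.register("user", load_user)` with a hypothetical `load_user` function, and after that only ever calls `cache.get("user", 7)`. The miss-handling logic lives in one place instead of being repeated across the application.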
The updates of the data in this case should go through the same system so you can get serialization (or other logic) for …
I had an opportunity to catch up with Mark Atwood last week to
discuss his new role at Gear6 and some of the interesting
developments currently going on around memcached, including
Gearman integration and its suitability for cloud computing
environments.
NorthScale's own Patrick Galbraith has, for many years now, authored and maintained the MySQL, and now Drizzle, UDFs for memcached. Last week, Patrick took this one step further with the latest release, version 1.1, which now includes support for "check and set" (a.k.a. CAS) operations.
User Defined Functions are available for a number of different databases. They allow a stored procedure language or triggers to execute external code imported into the database. In the case of the memcached UDF, this means giving stored procedures the ability to call memcached operations.
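For readers unfamiliar with "check and set", the semantics can be modelled in a few lines. This is an in-memory sketch only, not the UDF interface or the memcached server: `TinyStore` and its methods are hypothetical names that loosely mirror the protocol's `gets` and `cas` commands.

```python
import itertools


class TinyStore:
    """In-memory model of memcached-style check-and-set semantics."""

    def __init__(self):
        self._items = {}                  # key -> (value, cas_token)
        self._tokens = itertools.count(1)

    def set(self, key, value):
        # Every write gives the item a fresh token.
        self._items[key] = (value, next(self._tokens))

    def gets(self, key):
        """Return (value, cas_token), or None if the key is absent."""
        return self._items.get(key)

    def cas(self, key, value, token):
        """Store value only if the item is unchanged since token was read."""
        current = self._items.get(key)
        if current is None or current[1] != token:
            return False  # someone else wrote in between; caller should retry
        self.set(key, value)
        return True
```

A typical caller reads with `gets`, computes a new value, and writes with `cas`. If `cas` returns `False`, another writer got there first, and the caller re-reads and tries again rather than overwriting the newer value.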
The general idea here is pretty simple. Most applications start with a database, though it's always possible to use web services or flat files. Regardless of where the data is persisted, to keep the cache always up to date with the System of …