Redis, Memcached, Tokyo Tyrant and MySQL comparison

I wanted to compare the following DBs, NoSQL stores and caching solutions for speed and connections. I tested the following:

My test had the following criteria:

  • 2 client boxes
  • All clients connecting to the server using Python
  • Used Python's threads to create concurrency
  • Each thread made 10,000 open-close connections to the server
  • The server …

Storage Miniconf Deadline Extended!

The organisers have given all miniconfs an additional few weeks to spruik for more proposal submissions, huzzah!

So if you didn’t submit a proposal because you weren’t sure whether you’d be able to attend LCA2010, you now have until October 23 to convince your boss to send you and get your proposal in.

libmemcached packages

Ronald Bradford last week posted about memcached not being multi-threaded on Ubuntu, something he discovered via some small utilities that are bundled with libmemcached, written by Brian Aker.

When I noticed there were no Ubuntu packages for libmemcached (or the CLI tools) I decided to create some.

For your enjoyment: (Source debs are included)

The repository also contains a memcached that has been re-compiled with multithreading enabled.

The rise of the GLAMMP stack

First there was LAMP.  But are you using GLAMMP?  You have probably not heard of it because we just coined the term while chatting at work.  You know LAMP (Linux, Apache, MySQL and PHP or Perl and sometimes Python). So, what are the extra letters for?

The G is for Gearman - Gearman is a system to farm out work to other machines, dispatching function calls to machines that are better suited to do work, to do work in parallel, to load balance lots of function calls, or to call functions between languages.

The extra M is for Memcached - memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web …

PHP, Python Consistent Hashing

I found out that the hashing algorithm used in PHP-Memcache is different from the one in Python-Memcache. Because Python and PHP computed different hashes, the same key was sent to different servers.

I posted a question on the memcache groups and was lucky to find this wonderful reply.

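import binascii

import memcache

# The server addresses are missing from the post; put 'host:port' strings here.
m = memcache.Client(['', '', ''])


def php_hash(key):
    # Hash value PHP-Memcache derives from the key's CRC32, per the reply.
    # On Python 3, crc32 needs bytes, so the key is encoded first.
    return (binascii.crc32(key.encode('utf-8')) >> 16) & 0x7fff


for i in range(30):
    key = 'key' + str(i)
    # A (hash, key) tuple makes the client use our hash to pick the server.
    a = m.get((php_hash(key), key))
    print(i, a)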

This is the only change needed on Python's end: the way the hash is calculated. The code on the PHP end stays the same. Anyone using PHP for a web-based front-end with MySQL and Python for back-end scripts should find this helpful.
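To see why the hash matters, here is a minimal sketch without any memcache client. It assumes a client picks a server by taking the hash modulo the number of equally weighted servers; that is an assumption for illustration, not something checked against either client's source. The server names and the pick_server helper are hypothetical.

```python
import binascii


def php_hash(key: bytes) -> int:
    # Formula from the post: middle bits of the key's CRC32, 15 bits wide.
    return (binascii.crc32(key) >> 16) & 0x7fff


def pick_server(key: str, servers: list) -> str:
    # Assumed distribution rule: hash modulo the number of servers.
    return servers[php_hash(key.encode("utf-8")) % len(servers)]


# Hypothetical server list; both clients would need the same order.
servers = ["cache-a:11211", "cache-b:11211", "cache-c:11211"]

if __name__ == "__main__":
    for i in range(5):
        key = "key" + str(i)
        print(key, "->", pick_server(key, servers))
```

Under this rule, two clients agree on a key's server only if they compute the same hash and list the servers in the same order.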

Thanks Brian Rue.

Reference: …

Advanced Squid Caching for Rails Applications: Preface

Since the day I joined Scribd, I have been thinking about the fact that 90+% of our traffic goes to the document view pages, which are served by a single action in our documents controller. I wondered how we could improve this action's responsiveness and make our users happier.

A few times I created git branches and hacked on this action, trying to implement some sort of page-level caching to make things faster. But the results were never as good as I’d have liked, so the branches sat there waiting for a better idea.

A few months ago a good friend of mine joined Scribd and we started thinking about this problem together. As a result of our brainstorming, we managed to figure out which problems were preventing us from caching efficiently: …

How Facebook serves pictures

I caught Facebook - Needle in a Haystack: Efficient Storage of Billions of Photos on Flowgram. First up, I’m not a big fan of Flowgrams - the format (slides and voice) is sensible, even excellent, but the delivery in a web browser isn’t optimal… make downloadable videos!

The talk, however, was excellent. Do watch it, and learn a bit more about Facebook’s infrastructure. Anyway, some notes I took from the talk:

  • “We’re one of the largest MySQL installations in the world”
  • Use memcache - “We have memcache because databases aren’t fast” (later on in the questions)
  • Separate team focusing on APE (Apache, PHP and Extensions that they work on)
  • 6.5 billion total images, 4-5 sizes stored for each, so 30 billion files, of about 540TB total… During peak? 475,000 images served per second, and growing by …

Dog-pile Effect and How to Avoid it with Ruby on Rails memcache-client Patch

We had been using memcache in our application for a long time, and it helped a lot to reduce the load that some huge queries put on our DB servers. But there was a problem (sometimes called the “dog-pile effect”): when a cached value expired while we had heavy traffic, too many threads in our application would sometimes try to calculate the new value at once in order to cache it.

For example, if you have some simple but really bad query like

SELECT COUNT(*) FROM some_table WHERE some_flag = X

which could be really slow on a huge table, and your cache expires, then ALL your clients calling a page with this counter will end up waiting for this counter to be updated. Sometimes there could be tens or even hundreds of such queries running on your DB, killing your server and breaking the entire application (number of …
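One common way to limit this pile-up is sketched below. It is not the memcache-client patch this post is about, and it is written in Python rather than Ruby. The idea is to keep the cached value longer than it is considered fresh, and to let only the caller that wins an atomic add on a lock key run the slow query while everyone else is served the stale value. FakeCache is a hypothetical in-memory stand-in for a memcached client, and fetch and its parameters are illustrative names.

```python
import threading
import time


class FakeCache:
    """In-memory stand-in for a memcached client (hypothetical, illustration only)."""

    def __init__(self):
        self._data = {}
        self._mutex = threading.Lock()

    def _live(self, key):
        # Caller must hold the mutex. Returns the value, or None if absent or expired.
        entry = self._data.get(key)
        if entry is None or entry[1] <= time.monotonic():
            self._data.pop(key, None)
            return None
        return entry[0]

    def get(self, key):
        with self._mutex:
            return self._live(key)

    def set(self, key, value, ttl):
        with self._mutex:
            self._data[key] = (value, time.monotonic() + ttl)

    def add(self, key, value, ttl):
        # Store only if the key is absent; True means this caller won.
        with self._mutex:
            if self._live(key) is not None:
                return False
            self._data[key] = (value, time.monotonic() + ttl)
            return True

    def delete(self, key):
        with self._mutex:
            self._data.pop(key, None)


def fetch(cache, key, compute, fresh_for=60, keep_for=300, lock_for=30):
    """Return the cached value; when it is stale, let a single caller recompute it."""

    def fresh(entry):
        return entry is not None and entry[1] > time.monotonic()

    entry = cache.get(key)  # (value, fresh_until) or None
    if fresh(entry):
        return entry[0]

    lock_key = key + ":lock"
    if cache.add(lock_key, 1, lock_for):
        try:
            # Re-check: another caller may have refreshed the value just before us.
            entry = cache.get(key)
            if fresh(entry):
                return entry[0]
            value = compute()
            cache.set(key, (value, time.monotonic() + fresh_for), keep_for)
            return value
        finally:
            cache.delete(lock_key)

    if entry is not None:
        return entry[0]  # Someone else is recomputing: serve the stale value.

    # Cold cache and no lock: wait for the lock holder to publish a value.
    deadline = time.monotonic() + lock_for
    while time.monotonic() < deadline:
        time.sleep(0.01)
        entry = cache.get(key)
        if entry is not None:
            return entry[0]
    return compute()  # The lock holder never delivered; compute without caching.
```

With this shape, a slow COUNT(*) runs once per refresh instead of once per waiting client. The trade-off is that some clients briefly see a value older than fresh_for.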

Useful Cacti Templates to Monitor Your Servers

Recently I had a consulting customer who, aside from MySQL optimization etc., asked me to install and set up Cacti to monitor their pretty generic LAMP application. I started setting all this up and never thought it could be so painful… lots of different templates for the same tasks, all of them incompatible with recent Cacti releases, etc. So this post is basically a list of the templates I used, with the fixes I made to get them working on a recent Cacti release.


