on nuodb and falcon

Warning: this is a mixture of historical content, biases, stupid marketing and unknown/proprietary/closed source technologies. Proceed with caution.

NuoDB marketing was sending out this message, encouraging me to blog (they were looking for bloggers too):

And while Facebook sharded MySQL 4000 times, even they call it a “fate worse than death.”

We’ve seen this phrase before, and it did not come from us. For whatever reason the NewSQL echo chamber keeps repeating it with less and less truth in it. In various whitepapers (all behind registration walls) they cite some analyst estimates and draw a parallel between the operating costs of large companies and what a new developer would face, as if everyone lives under the same constraints.

I don’t know whether NuoDB is good technology for the customers they’re targeting: all their diagrams essentially say “we have blocks of magic and we multiply them”, and if you approach them at a conference, their “tech guy is away at the moment”. Still, the message going around is that what they do is the Holy Grail of databases and we should believe in it.

It is still an open question whether NuoDB solves the problems of massive-scale web deployments. They seem to dismiss existing operational environments with “thousands of servers and storage are required”, and I’m not sure what the cost of their alternative is.

We’ve revealed some aggregate numbers for our MySQL-based data platform before – there are tens of millions of queries (billions at the cache level) and millions of IOPS, and it is somewhat difficult to squeeze that into fewer than “thousands of servers”.

There are more than a billion users to serve, and the sheer amount of data in the social graph is not something you can put on a few thumb drives. If only any of these software vendors could tell us how much their platform would cost in such a case.

I am not an expert on the optimistic concurrency control that seems to be in there – I have yet to see a large-scale system using it. Optimistic concurrency control (the use of “control” here sounds like an oxymoron) means that if users talking to different servers perform the same operation, one of them will get an error on commit (instead of waiting for the lock to be released and doing his job on top). It also cannot hide any latencies if consistency is required. Unfortunately, retries in higher-latency environments are even more expensive, and writing software for such a model becomes more complicated than writing software for sharded datasets.
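To make the retry cost concrete, here is a minimal sketch of the commit-or-fail pattern, using a hypothetical accounts table and sqlite3 purely as a stand-in – none of this reflects NuoDB’s actual internals. Every conflict throws away the work and forces a full re-read, which is exactly what hurts when round trips are slow:

```python
import sqlite3

def add_credit(conn: sqlite3.Connection, account_id: int, delta: int,
               max_retries: int = 3) -> None:
    """Read, compute, then commit only if nobody changed the row meanwhile."""
    for _ in range(max_retries):
        balance, version = conn.execute(
            "SELECT balance, version FROM accounts WHERE id = ?",
            (account_id,)).fetchone()
        # The UPDATE matches only if the version we read is still current.
        cur = conn.execute(
            "UPDATE accounts SET balance = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (balance + delta, account_id, version))
        if cur.rowcount == 1:
            conn.commit()
            return
        conn.rollback()  # someone committed first: discard work, redo the read
    raise RuntimeError("too much contention, giving up")
```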

Software that does not have to be aware of sharding and the underlying partitioning is easier to implement. Unfortunately, writing efficient software on top of distributed datasets is not that easy. Fan-out in distributed systems is expensive (especially at the thousands-of-machines level) and is not an operation that should be done a lot in web-facing environments.

Usually developers will already have abstractions that let them do efficient data retrieval without thinking about shards, yet force them to think twice before doing something expensive (the sketch below shows the shape of such a layer). The cost is always there, visible or invisible, and someone has to deal with it.
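A hypothetical sketch of such a data-access layer – the names and the modulo placement scheme are made up for illustration. Single-key reads route to exactly one shard transparently, while the cross-shard fan-out gets a deliberately scary name so its cost stays visible to whoever calls it:

```python
from typing import Any

N_SHARDS = 4096  # order of magnitude of "sharded 4000 times"

def query_shard(shard: int, sql: str, params: tuple) -> Any:
    """Placeholder for a real connection-pool call to one shard."""
    raise NotImplementedError

def shard_for(user_id: int) -> int:
    # Hypothetical placement; a real system would use directories or ranges.
    return user_id % N_SHARDS

def get_user(user_id: int) -> Any:
    # Cheap: one shard, one round trip; the caller never sees sharding.
    return query_shard(shard_for(user_id),
                       "SELECT * FROM users WHERE id = %s", (user_id,))

def count_users_expensive_fanout() -> int:
    # Expensive by construction: touches every shard, and at this scale
    # the slowest of 4096 responses dictates the overall latency.
    return sum(query_shard(s, "SELECT COUNT(*) FROM users", ())
               for s in range(N_SHARDS))
```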

At least we know who is behind this. Meet Jim Starkey, the database luminary behind it and a few other database management systems. In the MySQL world he had his fame during the rocky period of Oracle’s InnoDB acquisition. MySQL AB had to do something and acquired Netfrastructure – a Java application-server/database hybrid. Some of that technology was rewritten in C++ and used as a storage engine for MySQL. This whole new development had two names:

MySQL 6 and Falcon

Jim Starkey captivated crowds by dissing the status quo (I saw famous people leaving the room), yet he was not all that correct himself. His storage engine was supposed to be architected for the web. Well, it wasn’t. At that time I was doing some work on Wikipedia and had various ideas about what works and doesn’t work in web-facing databases. Jim Starkey had different ideas.

One of them was that RAM would be cheap, so Falcon was designed to be memory-hungry in various operations. Sure, RAM got cheaper, but data volumes got larger even faster. The trend I observed later was that the amount of RAM per byte stored was decreasing rather than increasing.

Another Falcon bet was that CPUs were going to be much faster, so instead of storing and reading data in any ordered fashion, one was supposed to read unsorted data and then sort it in memory (as RAM is cheap too). Again, a major web pattern (open-ended range reads – ORDER BY … LIMIT) got missed. Lots of web-facing range queries would become very expensive, so in order to be web scale one would have to rewrite their applications to fit the new model.
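For the sake of illustration, here is the query shape in question, against a hypothetical stories table. An engine that keeps an ordered index on (user_id, created_at) reads the first 20 matching entries and stops; a read-then-sort-in-memory design has to fetch every matching row before it can return anything:

```python
# Hypothetical feed query, the bread and butter of web-facing databases.
# With an ordered index a B-tree engine stops after 20 index entries;
# a sort-in-memory engine must first read all of the user's rows.
FEED_QUERY = """
    SELECT story_id, created_at
    FROM stories
    WHERE user_id = %s
    ORDER BY created_at DESC
    LIMIT 20
"""
```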

Random disk access was supposed to go away – even if an index lookup pointed at sparsely scattered data, Starkey believed that doing the disk reads in physical order would give better performance. That assumption did not hold up under concurrent workloads, and the design also missed a few other important optimizations, such as covering index reads. We did not see much flash around at that time, and I’m not sure how the Falcon design would’ve worked on flash anyway.
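For reference, a covering index read is the optimization in question: if the index itself carries every column a query needs, the engine answers from the index alone and never touches the base rows at all. Continuing the hypothetical stories example:

```python
# If the index carries all referenced columns, the feed query above becomes
# an index-only read: no row fetches, no random I/O into the base table.
COVERING_INDEX = """
    CREATE INDEX ix_stories_user_created
    ON stories (user_id, created_at, story_id)
"""
```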

I wrote some of these observations down and sent them to MySQL engineering. Jim Starkey did not reply; someone else close to him did, with “let’s wait for benchmarks, then talk”. Unfortunately, decent benchmarks never arrived. I was not the only one who had questions.

There were various performance issues. For a while the line was that one should not implement low-level concurrency primitives and should instead use OS-provided methods (e.g. pthreads) everywhere. Apparently, when Falcon tried implementing internal spinlocks, they did not work that well. Mark Callaghan pointed out that the spinlock implementation inside Falcon was not actually spinning (the compiler had optimized the loop away) and was just using OS mutexes.

There were some interesting properties of the engine that could have been valuable – it had a row cache at its core, kept transactional overhead in memory (you were not supposed to run long transactions), etc.

Eventually the Falcon leadership changed, and the remaining team of engineers tried to salvage the product for quite a while (success of the project was measured in how many minutes it could stay up without crashing), but it all became moot once the InnoDB and MySQL teams were reunited under Oracle.

And no, I will not make another “fate worse than death” joke; those expired long ago. Though I don’t think Falcon’s record has expired yet, so I will take NuoDB’s claims with a grain of salt. I don’t know exactly what problems they are solving, so it is difficult to come up with good analysis and comparisons. I won’t, and neither should they.