Technical conferences are flooded with visual
[mis]representations of a particular product's performance,
compression, cost effectiveness, micro-transactions per
flux-capacitor, or whatever two-axis comparison someone dreams
up. Let's be honest, benchmarketers like to believe we all suffer
from innumeracy.
The Merriam-Webster dictionary defines innumeracy as
follows:
innumeracy (noun): marked by an ignorance of mathematics and
the scientific approach
Mark Callaghan has been a long-time advocate of explaining benchmark results, but that's not the
point of the bar chart. Oh no, the bar chart only exists to catch
your eye and …
There are generally three components to any benchmark
project:
- Create the benchmark application
- Execute it
- Publish your results
I assume many people think they want to run more benchmarks but
give up since step 2 is extremely time-consuming as you expand the
number of different configurations/scenarios.
I'm hoping that this blog post will encourage more people to
dive in and participate, as I'll be sharing the bash script I
used to test the various compression options coming in the MongoDB
3.0 storage engines. It enabled me to run a few different
tests against 8 different configurations, recording insertion
speed and size-on-disk for each one.
If you're into this sort of thing, please read on and provide any
feedback or improvements you can think of. …
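To make the "execute it" step less daunting, here is a minimal sketch of the kind of loop such a script automates: run the same insert workload against each configuration and record insertion speed and size-on-disk. It is not the bash script from the post. It is written in Python, and the standard-library compressors (`zlib`, `bz2`, `lzma`) are assumed stand-ins for storage engine compression options; the names `CONFIGS`, `make_documents`, `run_one`, and `run_matrix` are hypothetical.

```python
import bz2
import lzma
import os
import tempfile
import time
import zlib

# Hypothetical stand-ins for storage engine compression settings.
# A real run would start a server with each configuration instead.
CONFIGS = {
    "none": lambda data: data,
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def make_documents(count):
    """Build repetitive JSON-like records so compression has work to do."""
    return [
        f'{{"id": {i}, "name": "user{i % 100}", "status": "active"}}\n'.encode()
        for i in range(count)
    ]


def run_one(name, compress, docs, batch_size=1000):
    """Write docs in batches to a temp file; return (inserts/sec, bytes on disk)."""
    fd, path = tempfile.mkstemp(prefix=f"bench-{name}-")
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as out:
            for i in range(0, len(docs), batch_size):
                out.write(compress(b"".join(docs[i:i + batch_size])))
        elapsed = time.perf_counter() - start
        size = os.path.getsize(path)
    finally:
        os.remove(path)
    return len(docs) / elapsed, size


def run_matrix(doc_count=20000):
    """Run the same workload against every configuration."""
    docs = make_documents(doc_count)
    return {name: run_one(name, fn, docs) for name, fn in CONFIGS.items()}


if __name__ == "__main__":
    for name, (rate, size) in run_matrix().items():
        print(f"{name:>5}: {rate:12.0f} inserts/sec  {size:10d} bytes on disk")
```

The point of the structure is that adding a configuration is one more entry in `CONFIGS`, so the cost of step 2 grows with machine time rather than with your time.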
The next release of MongoDB includes the
ability to select a storage engine, the goal being that different
storage engines will have different capabilities/advantages, and
users can select the one most beneficial to their particular
use-case. Storage engines are cool. MySQL has
offered them for quite a while. One very big difference between
the MySQL and MongoDB implementations is that in MySQL the user
gets to select a particular storage engine for each table,
whereas in MongoDB it's a choice made at server startup. You get
a single storage engine for everything on the particular mongod
instance. I see pros and cons to each decision, but that's a blog
for another day.
In MongoDB 3.0 …
Today is my last day at Tokutek. On Monday I'm starting a new opportunity
as VP/Technology at CrunchTime!. If you are a web developer, database
developer, or quality assurance engineer in the Boston area and
looking for a new opportunity please contact me or visit the
CrunchTime! career page.
I've really enjoyed my time at VoltDB and Tokutek. Working for
Mike Stonebraker (at VoltDB) was on my career
"bucket list" and in these past 3.5 years at Tokutek I've
experienced the awesomeness of the MySQL ecosystem and the
surging NoSQL database …
A few days ago I wrote about MySQL performance implications of InnoDB isolation modes and I touched briefly upon the bizarre performance regression I found with InnoDB handling a large number of versions for a single row. Today I wanted to look a bit deeper into the problem, which I also filed as a bug.
First I validated under which conditions the problem happens. It seems to happen only in REPEATABLE-READ isolation mode and only when there are some hot rows which accumulate many row versions during a benchmark run. For example, the problem does NOT happen if I run sysbench with “uniform” distribution.
In terms of concurrent selects it also seems to require some very special conditions – you need to have the connection to let some …
Last month I wrote a blog about the closing of MongoDB ticket SERVER-1240, which brings Collection Level Locking (CLL) to the MMAPV1 storage engine in MongoDB 2.8. In MongoDB 2.6 there is a writer lock at the database level, so each database only allows one writer at a time. In concurrent write workloads, this means that all writers essentially form a single line and do their writes one at a time. In MongoDB 2.8 this lock has been moved to the collection level. Better yet is document level locking, but even though this feature was shown at MongoDB World 2014 it's not going to ship. But it did make for one amazing demo by …
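The difference between the two lock granularities can be illustrated with a toy model. This is only a sketch under stated assumptions, not MongoDB's implementation: the `Database` class and `timed_run` helper are hypothetical, each writer thread targets its own collection, and a short `time.sleep` stands in for the real write work.

```python
import threading
import time


class Database:
    """Toy model: a write holds one database-wide lock or a per-collection lock."""

    def __init__(self, collections, collection_level_locking):
        self.data = {name: [] for name in collections}
        if collection_level_locking:
            # One lock per collection: writers to different collections overlap.
            self.locks = {name: threading.Lock() for name in collections}
        else:
            # One shared lock: every writer in the database waits in a single line.
            shared = threading.Lock()
            self.locks = {name: shared for name in collections}

    def insert(self, collection, doc, write_seconds=0.005):
        with self.locks[collection]:
            time.sleep(write_seconds)  # stand-in for the actual write work
            self.data[collection].append(doc)


def timed_run(collection_level_locking, writers=4, inserts_per_writer=20):
    """Run one writer thread per collection; return (elapsed seconds, db)."""
    names = [f"coll{i}" for i in range(writers)]
    db = Database(names, collection_level_locking)

    def worker(name):
        for i in range(inserts_per_writer):
            db.insert(name, {"n": i})

    threads = [threading.Thread(target=worker, args=(name,)) for name in names]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start, db


if __name__ == "__main__":
    db_level, _ = timed_run(collection_level_locking=False)
    coll_level, _ = timed_run(collection_level_locking=True)
    print(f"database-level lock:   {db_level:.3f}s")
    print(f"collection-level lock: {coll_level:.3f}s")
```

With the shared lock the total time grows with the number of writers, while per-collection locks let writers to different collections proceed side by side. Writers hitting the same collection would still serialize in both cases.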
"Should vegetarians open steakhouse restaurants?"
Though someone will probably give me several examples of why they
should, I'll argue that they absolutely should not. How can
someone who doesn't eat steak convince others to eat at their
"steak-only" restaurant?
But this is something a "professional technology benchmarker"
(PTB) struggles with on a regular basis. Hello, I'm Tim
Callaghan, and I'm a PTB.
professional technology benchmarker, or PTB (noun): One
who compares two technologies as part of their job. One of these
technologies is usually the product of the PTB's employer, the
other is almost always not.
In a past experience I was tasked
with comparing the performance of a fully in-memory database with
Oracle and MySQL on a "TPC-C like" workload. At the time I was an
Oracle expert and working for the in-memory database company, but
had never started a single MySQL server in my life. At …
I'm starting off 2015 with the following New Year's resolution:
to improve the state of benchmarking. About a month ago I
noticed the following tweet:
Hey @tokutek, please look at this: http://stssoft.com/products/stsdb-4-0/benchmark ….
Are the benchmarks rigged or correctly done? I'm curious to know!
While I've never met Ian Campbell (@iamic), he
certainly knew how to call me to action. I immediately checked
out the STSsoft
website, the benchmark results page, and the benchmark code itself. My first reaction …
While MongoDB 2.8 introduces a formal storage engine API and brings with it the new
WiredTiger storage engine, it also adds collection
level locking to the existing memory mapped engine (MMAPV1) which
will remain the default engine until MongoDB 3.0, so says Eliot.
The MongoDB community has been waiting a long time for collection
level locking; the Jira
ticket was created on June 15, 2010. When I saw the following
Facebook post I got excited to give it a spin, but unfortunately
the results were extremely poor using MongoDB …
I just pushed the new Java based iiBench for MySQL (and Percona
Server and MariaDB), the code and documentation are available now
in the iibench-mysql GitHub repo. Pull requests are
welcome!
The history of iiBench goes back to the early days of
Tokutek.
Since "indexed insertion" is a strength of Fractal Tree indexes, the first iiBench was
created by Tokutek in C++ back in 2008. Mark
Callaghan rewrote iiBench in Python, adding several features
along the way. His version of iiBench is available in …