One of our customers sometimes observed lots of simple insertions taking far longer than expected to complete. Usually these insertions completed in milliseconds, but the insertions sometimes were taking hundreds of seconds. These stalls indicated the existence of a serialization bug in the Fractal Tree index software, so the hunt was on. We found that these stalls occurred when a big transaction was committing and the Fractal Tree index software was taking a checkpoint. This problem was fixed in both TokuDB 7.0.3 and TokuMX 1.0.3. Please read on as we describe some details about this bug and how we fixed it. We describe some of the relevant Fractal Tree index algorithms first.
What is a Big Transaction?
Each transaction builds a rollback log as it performs Fractal Tree index operations. The rollback log is maintained in memory until it gets too big and is spilled to the TokuDB rollback file. We define a small …
[Read more]