Tailrank has a few large tables in our MySQL database that could benefit from a bit of truncation for (hopefully) some performance gain. The theory goes that if you have less data you'll have a faster system. This isn't always a one-to-one comparison of course because if you delete enough data MySQL will eventually just buffer the whole data in memory and you'll get an order of magnitude more performance.
All things being equal though what kind of performance boost would you get for SELECTs if you were to take a 20M node table and truncate it by 80%.
The answer is that you'd only receive a 9.3% performance boost. Since btrees are log(N) this means that in order to compute the boost you'd can just use the equation:
(log(N) - log(N_after) / log(N)) * 100
This boost obviously isn't worth the hassle and we're just going to migrate to a new table schema …
[Read more]