Sorry for the delay, now on to range queries and lenient
updates. Let’s call them queries and updates, for
short. So far, I’ve shown that B-trees (and any of a number
of other data structures) are very far from the “tight bound.”
I’ll say a bound is a tight if it’s a lower bound and you can
come up with data structure that matches it.
So how do we match the bandwidth bound for queries and
updates? I already mentioned in passing how to do this, but
let’s look more closely.
Fast Updates
The way to get fast updates is to log them. You can easily
saturate disk bandwidth by writing out the insertion, deletion
and update requests with no index.
A query now will typically start by sorting the data. Even
a point query requires looking at all the data, but a range query
requires looking at all the data log times (in order to sort it),
or using a large amount of extra …
[Read more]