It’s common wisdom that large-scale database systems require distributing the data across machines. But what seems to be missing in a lot of discussions is distributing the query processing too. By this I mean the actual computation that’s performed on the data.
I just had a conversation with Peter Zaitsev yesterday that helped make concrete some thoughts I’ve been having about Cassandra for a while. Because Cassandra doesn’t allow you to really do any computation in the data (aggregating, evaluating expressions, and so on), if you’re going to use it for truly Big …[Read more...]