When moving from two data nodes to a bigger Cluster it is not
necessarily true that you will have better performance. In fact
you can get worse.
Here are some things to think about:
- Database Load (traffic to Cluster) - if you can handle query
load X on a two node cluster and move the same load X to a four
data node cluster you will likely get new_load_X=0.8X, i.e., a
performance degradation. This has to do with 1) buffers are not
filled up fast enough so the data nodes will do "timeout" based
sending or 2) that the access patterns aren't scaling. To correct
1) you need to increase the load on the cluster so that internal
communication buffers fill up faster.
Access pattern related "problems":
- For primary key operations (reads, updates, deletes) you will always go to the correct node to fetch the data with PK operations, no matter how many nodes you have. So no …