I recently presented a webinar explaining how you can enjoy the key benefits of NoSQL data stores without giving up all of the great features provided by a mature RDBMS.
In case you weren’t able to attend (or wanted to refresh your memory) then the webinar replay and charts are now available.
There’s often a lot of excitement around NoSQL Data Stores with the promise of simple access patterns, flexible schemas, scalability and High Availability. The downside can come in the form of losing ACID transactions, consistency, flexible queries and data integrity checks. What if you could have the best of both worlds?
These webinars are always a good opportunity to get your questions answered; here’s a catch up of the Q&A from this session:
- Would you suggest using mysql cluster to store graph data (150k writes second)? For graph data, you always have to choose between a specialized graph database and a more general-purpose database. If it’s “almost” relational with a few graph-like connections, your decision might be different than if it’s purely graph-like. In any case, your write load of 150K writes per second can certainly be managed in MySQL Cluster. It only requires a little care to get an appropriate cluster configuration as far as number of data nodes, number of API nodes, memory, disk, and networking. Also, the total eventual size of the data is an important factor in the decision about whether to use Cluster, since indexes must always fit in the total distributed memory of the data nodes.
- Can you please explain the RAM requirements for MySQL Cluster, for example if my database is 10GBs in disc space, will it require 10GBs of RAM in MySQL Cluster? There is additional overhead in addition to the raw data. It’s tricky to try to summarize, but there is fixed overhead per row plus space for re-do logs and indexes. Details are in online documents. All indexed columns must be in memory but other columns can be on disk if you choose. Remember that each row has to be stored on 2 data nodes and so you need to figure out your total memory requirement, double it and then divide by the number of data nodes to find how much memory would be needed for each data node. MySQL Cluster Evaluation Guide – Designing, Evaluating and Benchmarking MySQL Cluster is a good white paper to refer to in order to decide if MySQL Cluster is the right database for your application as well as what you’ll need and what you should do to get the best results.
- Is there a wizard to migrate innoDB to MySQL Cluster? There’s not a “wizard” per se, but “ALTER TABLE x ENGINE=ndb” will convert a particular table. (It’s only tricky if you have foreign keys which might have to be dropped at the beginning and reenabled at the end of the process).
- Can this be deployed on EC2 instances, or is this for bare metal? MySQL Cluster has been successfully deployed (e.g. by PayPal)
- How difficult is it to do a hardware upgrade? Do you have to do it all at once or can you do each machine in turn? Both hardware and software upgrades are online operations. You can add nodes to a running cluster, and upgrade the software on nodes individually. If you use the MySQL Cluster Manager, many of the upgrade operations can be automated. You won’t be able to exploit some upgrades (e.g. extra hardware on a data node) until you’ve upgraded.
- Does MySQL Cluster store all data in memory? What scenarios available for swaping data to disk? Can we differentiate which tables/columns are stored on memory/disk? All indexes are in memory. A table can be all in-memory, or it can have non-indexed columns stored on disk. That’s a per-column choice.
- Can mysql be subscribed/notified when some data is changed/updated? There is a notification API. It is currently only supported in C NDB API (this is the “Event API”), not in MySQL server or others. There are plans to also support it in Node.JS, but no actual support at this time. If using SQL then triggers can be defined in the MySQL Server – just like for InnoDB tables.