Database clusters are pretty sophisticated distributed systems with complex dependencies between nodes. The failure of a node will generally impact the overall cluster, as the remaining nodes need to reconfigure themselves to continue to operate without the failed node. Since re-introducing a node will also affect the existing cluster, the timing could therefore be dependent on the state of the other nodes in the cluster. Repair and restarts often needs to be performed[Read more...]
Use your MySQL expertise to analyze the strengths and weaknesses of MongoDB.
SPEAKER: Tim Callaghan, VP of Engineering at Tokutek
DATE: Tuesday, December 17th
TIME: 1pm ET
MongoDB is a popular NoSQL DBMS that shares the ease-of-use and quick setup that made MySQL famous. But is MongoDB really up to the job? Is it right for your applications? If you understand MySQL well, you know how database systems work.
Join Tim Callaghan, VP/Engineering at Tokutek as he recaps his and CEO of Continuent, Robert Hodges, session from 2013′s Percona Live London. Learn how to lean on your knowledge of topics like schema design, query optimization, indexing, sharding, and high availability to analyze the strengths and weaknesses of MongoDB. System design is all about asking the right[Read more...]
The Severalnines team is pleased to announce the release of ClusterControl 1.2.4. This release contains key new features along with performance improvements and bug fixes.
We have outlined some of the key features below. For additional details about the release:[Read more...]
Dan McKinley (Etsy) wrote an [IMHO] insightful article Why MongoDB Never Worked at Etsy.
First off, it’s important to realise that it’s not a snipe at MongoDB – it’s a fine tool.
The lessons are related to mixing multiple databases in a deployment (administration and monitoring overhead) and the acknowledgement that issues of schema design, scalability and maintenance need attention regardless of which brand or technology you pick for your database. That comes back to the old insight that migrations are rarely worth it (regardless of what you migrate to what).
I think these are indeed important considerations as they have a major impact on the ongoing costs of your entire environment (production as well as development and testing) – these days we[Read more...]
We’re particularly excited about this year’s Percona Live London MySQL Conference. The line-up of speakers & topics looks excellent and it’s good to see speakers from Oracle, Percona, the MariaDB Foundation (amongst others) scheduled at the same event. It demonstrates not just the diversity of the ever broadening MySQL ecosystem, but also the fact that there really is room for everyone to contribute, participate in and advance MySQL in manifold directions while still retaining a certain amount of uniformity.
And this is how we will be contributing to the event ...[Read more...]
Since our initial release last summer, TokuMX has supported fully ACID and MVCC multi-statement transactions. I’d like to take this post to explain exactly what we’ve done and what features are now available to the user.
But before beginning, an important note: we have implemented this for non-sharded clusters only. We do not support distributed transactions across different shards.
At a high level, what have we done?
We have taken MongoDB’s basic transactional behavior, and extended it. MongoDB is transactional with respect to one, and only one, document. MongoDB guarantees single document atomicity. Journaling provides durability[Read more...]
The upcoming Percona Live London conference, November 11-12, features quite a number of talks about the latest MySQL features and related technologies. There will be a lots of talks about the new MySQL 5.6 features:
We already discussed one to one relations in MongoDB, and the main conclusion was that you should design your collections according to the most frequent access pattern. With one to many relations, this is still valid, but other factors may come into play.
Let’s look at a simple problem: we are a shop and we want to store customers’ information as well as their orders. Each customer can make several orders, this is a one to many relation. With MySQL or any relational database system, we would create 2 tables:
CREATE TABLE customer ( customer_id int(11) NOT NULL AUTO_INCREMENT, name varchar(50) NOT NULL DEFAULT '', zipcode varchar(10) DEFAULT NULL, PRIMARY KEY (customer_id) ) ENGINE=InnoDB; CREATE TABLE orders ([Read more...]
Since introducing TokuMX, we’ve discussed benefits that TokuMX has for existing MongoDB applications that require no changes. In this post, I introduce an extension we’ve made to the indexing API: clustering indexes, a tool that can tremendously improve query performance. If I were to speak to someone about clustering indexes, I think the conversation could go something like this…
What is a Clustering Index?
A clustering index is an index that stores the entire document, not just the defined key.
A common example is[Read more...]
Database vendors regularly issue critical patch updates to address software bugs or known vulnerabilities, but for a variety of reasons, organizations are often unable to install them in a timely manner, if at all. Evidence suggests that companies are actually getting worse at patching databases, with an increased number violating compliance standards and governance policies1.
Patching that require database downtime would be of extreme concern in a 24*7 environment, however most cluster upgrades can be performed online.[Read more...]
I had to laugh (just a bit) at this on the exhibitor floor at Oracle Open World 2013. There was a large MongoDB presence at the Slot 301. There are a few reasons.
First, the identity crisis remains. There is no MongoDB in the list of exhibitors, it’s 10gen, but where is the 10gen representation in the sign. 99.99% of attendees would not know this.
Second, the first and only slide I saw (as shown below), tries to directly compare implementing a solution to Oracle. The speaker made some comment but I really zoned out quickly. Having worked with MongoDB, even on one of my own projects, contemplated the ROI of being proficient in this for consulting, even discussing at length with the CEO and CTO, and hearing only issues with MongoDB with existing MySQL clients, I have come
For those of you who know Severalnines and maybe use some of our tools & products, you’ll know that we provide our users with a monthly summary of all the resources & tools that we’re publishing. Since this is publicly available material, we thought it’d be useful also for the broader open source database community.
In the past month, we’ve made the following resources & tools available:
In talking to existing MongoDB users and TokuMX evaluators, I’ve often heard that the performance of MongoDB is very good as long as your working data set fits in RAM. The story continues that if your working data set grows to be larger than the RAM on your server, the built-in sharding capabilities of MongoDB allow you to scale horizontally.[Read more...]
We’ve been hard at work on TokuMX since it’s initial release just over 2 months ago. Today we released TokuMX v1.2 which includes Hot Backup in the Enterprise Edition.
Hot Backup allows users to create a backup of a running TokuMX primary or secondary server in a replica set, with no blocking of writes for clients. We will be blogging more about the Hot Backup technology in the coming weeks. This same technology is used for Hot Backup in TokuDB.
Also worth noting are the features we’ve added since the initial TokuMX release:
UPDATE 2013-08-30: Tungsten 2.1.2 was released.
UPDATE 2013-08-23: We have found a few problems that happen when replicating with RBR and temporal columns. We will have to publish an updated bugfix release quite soon.Tungsten Replicator 2.1.1 is out. Key features in this release are:
Recently, we’ve seen a few people ask us about building TokuMX from scratch. While it’s best if you just use the binaries you can get from us (they have all the right optimizations, we’ve tested them, and we can interpret coredumps they generate), we recognize there are other reasons you might need to do a custom build.
Since we actually build six distinct products all using the Fractal Tree indexing® library (community and enterprise versions of TokuDB for MySQL, TokuDB for MariaDB, and TokuMX), our build process is pretty complicated, compared to software packages that might, for example, just involve one source repository and link against a few standard libraries. Our TokuMX builds involve four git repositories, three[Read more...]
On Wednesday night, the Boston MongoDB User group was kind enough to have me speak about TokuMX Internals. I spoke about Fractal Tree® indexes and the technical reasons behind the benefits they provide to MongoDB applications. Although the talk mostly references TokuMX and MongoDB, all the theory applies to TokuDB and MySQL as well.
My slides are on our technology overview page, along with other great content.
Opportunities to present technical material to an engaged audience asking tough questions is rare, and much appreciated. So thank you to the Boston MongoDB User group for having me present.
For people used to relational databases, using NoSQL solutions such as MongoDB brings interesting challenges. One of them is schema design: while in the relational world, normalization is a good way to start, how should we design our collections when creating a new MongoDB application?
Let’s see with a simple example how we would create a data structure for MySQL (or any relational database) and for MongoDB. We will assume in this post that we want to store people information (their name) and the details from their passport (country and validity date).
In the relational world, the basic idea is to try to stick to the 3rd normal form and create two tables (I’ll omit indexes and foreign keys for clarity – MongoDB supports indexes but not foreign keys):
mysql> select * from people;[Read more...]
One of my favorite MySQL tools ever is pt-query-digest. It's the tool you use to generate a report of your slow query log (or some other supported sources), and is similar to for example the Query Analyzer in MySQL Enterprise Monitor. Especially in the kind of job I am, where I often just land out of nowhere on a server at the customer site, and need to quickly get an overview of what is going on, this tool is always the first one I run. I will show some examples below.
A lot is said about the differences in the data between MySQL (http://www.mysql.com/) and MongoDB. Things such as “MongoDB is document based”, “MySQL is relational”, “InnoDB has a clustering key”, etc.. Some may wonder how TokuDB, our MySQL storage engine, and TokuMX, our MongoDB product, fit in with these data layouts. I could not find anything describing the differences with a simple google search, so I figured I’d write a post explaining how things compare.
So who are the players here? With MySQL, users are likely familiar with two storage engines: MyISAM, the original default up until MySQL 5.5, and[Read more...]
Before creating a unique index in TokuMX or TokuDB, ask yourself, “does my application really depend on the database enforcing uniqueness of this key?” If the answer is ANYTHING other than yes, do not declare the index to be unique. Why? Because unique indexes may kill your write performance. In this post, I’ll explain why.
Unique indexes are a strange beast: they have no impact on standard databases that use B-Trees, such as MongoDB and MySQL, but may be horribly painful for databases that use write optimized data structures, like TokuMX’s Fractal Tree(R) indexes. How? They[Read more...]
In my last post, I showed what a Fractal Tree® index is at a high level. Once again, the Fractal Tree index is the data structure inside TokuMX and TokuDB, our MongoDB and MySQL products. One of its strengths is the ability to get high levels of compression on the stored data. In this post, I’ll explain why that is.
At a high level, one can argue that there isn’t anything special about our compression algorithms. We basically do this: we take large chunks of data, use known compression methods (e.g. zlib,[Read more...]
With our recent release of TokuMX 1.0, we’ve made some bold claims about how fast TokuMX can run MongoDB workloads. In this post, I want to dig into one of the big areas of improvement, write performance and reduced I/O.
One of the innovations of TokuMX is that it eliminates a long-held rule of databases: to get good write performance, the working set of your indexes should fit in memory. The standard reasoning goes along the lines of: if your indexes’ working set does not fit in memory, then your writes will induce I/O, you will become I/O bound, and performance will suffer. So, either make sure[Read more...]
I am actually quite excited about Tokutek’s release of TokuMX. I think it is going to change the landscape of database systems and it is finally something that made me looking into NoSQL.
Why is TokuMX interesting? A few reasons:
Tokutek is known for its full-featured fast-indexing technology. MongoDB is known for its great document-based data model and ease of use. TokuMX, version 1.0, combines the best of both worlds.
Quite frequently, especially with large-scale or complicated applications, we use MySQL alongside other technologies for certain tasks of reporting, caching as well as main data-store for portions of application.
What technologies for data storage and processing do you use alongside MySQL in your environment? Please feel free to elaborate in the comments about your use case and experiences!Note: There is a poll embedded within this post, please visit the site to participate in this post's poll.
The post[Read more...]