Previous 30 Newer Entries Showing entries 31 to 60 of 182 Next 30 Older Entries

Displaying posts with tag: mongodb

MongoDB, TokuMX and InnoDB for disk IO-bound, read-only point queries
This repeats a test that was done on pure-flash servers. The goal is to determine whether the DBMS can efficiently use the IO capacity of a pure-disk server. The primary metrics are the QPS that the DBMS can sustain and the number of disk reads per query. The summary is that a clustered primary key index makes TokuMX and InnoDB much more IO efficient for PK lookups on IO-bound workloads.

TokuMX and InnoDB get much more QPS than MongoDB from the same IO capacity for this workload. TokuMX and InnoDB have a clustered primary index. There is at most 1 disk read per query assuming all non-leaf nodes from the index are in memory and all leaf nodes are not in memory. With MongoDB the primary key index is not clustered so there can be a disk read for the leaf node of

  [Read more...]
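The arithmetic behind the excerpt above can be sketched roughly as follows. This is a minimal model and not code from the post: the function names and the 1000 IOPS figure are hypothetical, and the non-clustered case assumes one read for the index leaf plus one read for the document itself.

```python
def reads_per_query(clustered: bool, leaf_cached: bool = False) -> int:
    """Disk reads for one primary-key lookup, assuming non-leaf index nodes are in memory."""
    leaf_read = 0 if leaf_cached else 1
    if clustered:
        # The row is stored in the index leaf, so the leaf read also fetches the row.
        return leaf_read
    # Assumption: a non-clustered index needs the leaf read plus a read for the document.
    return leaf_read + 1


def max_qps(device_iops: float, reads: int) -> float:
    """Upper bound on QPS when each query costs `reads` disk reads."""
    return float("inf") if reads == 0 else device_iops / reads


if __name__ == "__main__":
    iops = 1000.0  # hypothetical IO capacity of the disk array
    for name, clustered in (("clustered", True), ("non-clustered", False)):
        reads = reads_per_query(clustered)
        print(f"{name}: {reads} read(s)/query, at most {max_qps(iops, reads):.0f} QPS")
```

Under these assumptions the same IO capacity supports about twice the QPS with a clustered index, which is the direction of the result the post reports.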
Notes on the storage stack
If you want high performance and quality of service from a DBMS then you need the same from the OS. The MySQL/Postgres/MongoDB crowd doesn't always speak with the Linux crowd. On the bright side there is a good collection of experts from the Linux side of things at my employer and we have begun speaking. There were several long threads on the PG hackers lists about PG+Linux and this led to a meeting at the LSFMM summit. I am very happy these groups met. We have a lot to learn from each other. DBMS people can explain our IO patterns and get motivated to write DBMS workload simulators (like innosim,   [Read more...]
Insert benchmark on disks, part 2
I ran more insert benchmark tests for InnoDB on pure disk servers. The previous results with a lot more detail are here. My goal in this case was to use better configuration options for InnoDB on disk and to understand the impact of innodb_flush_neighbors. With the better settings InnoDB sustains a much higher insert rate.

The first problem in the previous tests was that I used a few settings that are better for flash than disk with InnoDB so I increased innodb_write_io_threads from 4 to 32, reduced

  [Read more...]
Insert benchmark for flash, part 2
I repeated a few of the long running tests for the insert benchmark using flash storage. My goals were to test at least one new configuration, repeat a few tests to confirm the configuration was what I claimed it was and to confirm the impact of doing fsync-on-commit during this test. In this test the write operation adds 1000 small documents and the redo log write is not small. From casual observation I did not see a big impact from doing fsync-on-commit (or a big benefit from not doing it) but that is the point of this post. From the results here the impact from fsync-on-commit still appears to be minor. But remember that this is specific to one workload for which many KB of data are written to the redo log or journal per commit and for which the eventual bottleneck is random  [Read more...]
TokuMX, MongoDB and InnoDB on IO-bound point queries
I used sysbench to understand whether TokuMX, MongoDB and InnoDB can use most of the IOPs provided by a fast flash device and whether their use is efficient. The workload query fetches one document/row by primary key. This is a very simple workload but helps me to understand how disk read requests are processed. The primary metric is the QPS that can be sustained when the database is much larger than RAM. A secondary metric is the number of disk reads per query during the test. Efficiency, not doing too many reads per query, matters when you want to support many concurrent users.

tl;dr - new database engines are usually worse on multi-core than old database engines. I know there are exceptions to this rule for both new engines, like WiredTiger and RethinkDB, and old engines that won't be

  [Read more...]
on io scheduling again

Most database engines have to deal with the underlying layers – operating systems, device drivers, firmware and physical devices, although different camps choose different methods.
In the MySQL world people believe that InnoDB should handle all memory management and physical storage operations – maximized buffer pool space, adaptive/fuzzy flushing, faster crash recovery, etc. That can result in lots of efficiency wins, as managing everything with the data problem in mind allows tuning for efficiency and performance.

Other storage systems (though I hear it from engineers on different types of problems too) like PostgreSQL or MongoDB consider the OS to be much smarter and let it do caching or buffering. This means that in top Postgres expert presentations you will hear much more about operating systems than in MySQL talks. This results in OS knowledge attrition in the MySQL world (all you


  [Read more...]
TokuMX, MongoDB and InnoDB versus the insert benchmark with disks
I used the insert benchmark on servers that use disks in my quest to learn more about MongoDB internals. The insert benchmark is interesting for a few reasons. First, while inserting a lot of data isn't something I do all of the time, it is something for which performance matters some of the time. Second, it subjects secondary indexes to fragmentation, and excessive fragmentation leads to wasted IO and wasted disk space. Finally, it allows for useful optimizations including write-optimized algorithms (fractal tree via TokuMX, LSM via RocksDB and WiredTiger) or the InnoDB insert buffer. Hopefully I can move onto other workloads after this week.  [Read more...]
Redo logs in MongoDB and InnoDB
Both MongoDB and InnoDB support ACID. For MongoDB this is limited to single document changes while InnoDB extends that to multi-statement and possibly long-lived transactions. My goal in this post is to explain how the MongoDB journal is implemented and used to support ACID. Hopefully this will help to understand performance. I include comparisons to InnoDB.

What is ACID?

There are a few interesting constraints on the support for ACID with MongoDB. It uses a per-database reader-writer lock. When a write is in progress all other uses of that database (writes & reads) are blocked. Reads can be done concurrent with other reads but block writes. The manual states that the lock is
  [Read more...]
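The reader-writer semantics described in the excerpt above (reads run concurrently with other reads, a write excludes everything else) can be illustrated with a toy lock. This is a sketch for illustration only and not MongoDB's implementation: the class and method names are hypothetical, and it uses non-blocking "try" calls so the behaviour is easy to observe.

```python
import threading


class ReaderWriterLock:
    """Toy reader-writer lock: many concurrent readers or one writer, never both."""

    def __init__(self) -> None:
        self._mutex = threading.Lock()  # protects the two counters below
        self._readers = 0
        self._writing = False

    def try_acquire_read(self) -> bool:
        with self._mutex:
            if self._writing:
                return False  # a write in progress blocks reads
            self._readers += 1
            return True

    def release_read(self) -> None:
        with self._mutex:
            self._readers -= 1

    def try_acquire_write(self) -> bool:
        with self._mutex:
            if self._writing or self._readers > 0:
                return False  # active readers or a writer block writes
            self._writing = True
            return True

    def release_write(self) -> None:
        with self._mutex:
            self._writing = False


if __name__ == "__main__":
    db_lock = ReaderWriterLock()  # one lock per database in this model
    print(db_lock.try_acquire_read(), db_lock.try_acquire_read())  # readers share
    print(db_lock.try_acquire_write())  # writer must wait for readers
```

A real lock would block and queue waiters rather than return immediately; the sketch deliberately ignores fairness and writer starvation.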
A few comments on MongoDB and InnoDB replication
MongoDB replication has something like InnoDB fake changes built in. It prefetches all documents to be changed while holding a read lock before trying to apply any changes. I don't know whether the read prefetch extends to indexes. That question has now been added to my TODO list. Using fake changes to prefetch on MySQL replicas for InnoDB worked better than everything that came before it because it prefetched any index pages that were needed for index maintenance. Then we made it even better by making sure to prefetch sibling pages in b-tree leaf pages when pessimistic changes (changes not limited to a single page) might be done (thanks Domas).  Hopefully InnoDB fake changes can be retired with the arrival of  [Read more...]
My Favorite MongoDB Replication Feature: Crash Safety

At an extremely high level, replication in MongoDB and MySQL are similar. Both databases have exactly one machine, the primary (or master), that accepts writes from clients. With a single transaction (or atomic operation, in MongoDB’s case), the tables and oplog (or binary log in MySQL) are modified to reflect the change. The log captures what the change is so other secondaries (or slaves) can read the changes and process them, making the slaves identical to the master. (Note that I am NOT talking about multi-master replication.)

Underneath the covers, their implementations are quite different. And in peeking underneath the covers while developing TokuMX, I learned

  [Read more...]
We're hiring!
Continuent, a leading provider of database clustering and replication software, has five (5) new positions open:

  • Build/Test Engineer
  • Senior Database Availability and Clustering Engineer
  • Senior Database Replication Engineer
  • Data Replication Sales Engineer
  • Clustering and Replication Test Development Engineer

If you want to get in on the ground floor of a growing company in a challenging field
Insert benchmark for InnoDB, MongoDB and TokuMX and flash storage
This work is introduced with a few disclaimers in an earlier post. For these tests I ran the insert benchmark client in two steps: first to load 100M documents/rows into an empty database and then to load another 400M documents/rows. The test used 1 client thread, 1 collection/table and the query threads were disabled. The replication log (oplog/binlog) was disabled for all tests. I assume that all of the products would suffer with that enabled.

Note that the database in this test was fully cached by InnoDB and TokuMX and almost fully cached for MongoDB. This means there will be no disk reads for InnoDB during index maintenance, no disk reads for TokuMX during compaction and few disk reads for MongoDB  during index maintenance. Future posts have results for databases much

  [Read more...]
Bigger data
Database benchmarks are hard but not useless. I use them to validate performance models and to find behavior that can be improved. It is easy to misunderstand results produced by others and they are often misused for marketing (benchmarketing). It is also easy to report incorrect results and I have done that a few times for InnoDB. A benchmark report is much more useful when it includes an explanation. Only one of these is an explanation: A is faster than B, A is faster than B because it uses less random IO. It isn't easy to explain results. That takes time and expertise in the DBMS and the rest of the hardware and software stack used during the test. The trust I have in benchmark reports is inversely related to the number of different products that have been tested.

This is an introduction for a sequence of blog posts that compare MongoDB, TokuMX and

  [Read more...]
How to use the ClusterControl REST API to automate your Database Cluster
March 11, 2014 By Severalnines

For ops folks with multiple environments and instances to manage, a fully programmable infrastructure is the basis for automation. ClusterControl exposes all functionality through a REST API. The web UI also interacts with the REST API to retrieve monitoring data (cluster load, alarms, backup status, etc.) or to send management commands (add/remove nodes, run backups, upgrade a cluster, add/remove load balancer, etc.). The API is written in PHP and runs under Apache. The diagram below illustrates the architecture of ClusterControl.

Figure: ClusterControl

  [Read more...]
Resources for HA Database Clusters: New ClusterControl Release, Galera Migration Webinar & Blog Resources
March 6, 2014 By Severalnines

Check Out Our Latest Resources for MySQL, MariaDB & MongoDB Clusters

Here is a summary of resources & tools that we’ve made available to you in the past weeks. If you have any questions on these, feel free to contact us!

ClusterControl 1.2.5 released

We are pleased to announce the release of ClusterControl 1.2.5, which now supports MySQL 5.6 and Global Transaction IDs to enable cross-datacenter and cloud replication over high latency networks. Galera users are now able to assign nodes to their

  [Read more...]
ClusterControl 1.2.5 Released
March 5, 2014 By Severalnines

The Severalnines team is pleased to announce the release of ClusterControl 1.2.5. This release contains key new features along with performance improvements and bug fixes. We have outlined some of the key features below. 

For additional details about the release:

  [Read more...]
MongoDB and Hadoop - Stockholm MongoDB User Group Meetup - Monday, March 3, 2014
February 27, 2014 By Severalnines

Stockholm MongoDB User Group Meetup: “MongoDB and Hadoop”

Monday, March 3, 2014 starting @ 5:00 PM

Join us next Monday as we host the Stockholm MongoDB User Group Meetup in Kista, or the Wireless Valley as it is also referred to.

Our very own Vinay Joosery will be speaking about how to best automate the management & deployment of database clusters, specifically MongoDB clusters though the same principles apply for MySQL, MariaDB and Percona XtraDB based clusters. Henrik Ingo of MongoDB will be talking about Analytics with MongoDB & Hadoop. And Jim Dowling, a Senior Researcher at the Swedish Institute of Computer Science, will talk

  [Read more...]
Everything is awesome
My kids watched the new Lego movie today and spent the rest of the day repeating "Everything is amazing". I spent a few hours reading MongoDB documentation to help a friend who uses it. Everything wasn't awesome for all of us. I try to be a pessimist when reading database documentation. If you spend any time near production then you spend a lot of time debugging things that fail. Being less than optimistic is a good way to predict failure.

One source of pessimism is database limits. MongoDB has a great page to describe limits. It limits index keys to less than 1025 bytes. But this is a great example that shows the value of pessimism. The documentation states that values (MongoDB documents) are not added to the index when the index key is too large. An optimist might assume that an insert or update statement fails when

  [Read more...]
How TokuMX was Born

With TokuMX 1.4 coming out soon, with (teaser) wonderful improvements made to sharding and updates (and plenty of other goodies), I’ve recently reminisced about how we got TokuMX to this point. We (actually, really John) started dabbling with integrating Fractal Tree® indexes into MongoDB in the summer of 2012, where we (really, he) prototyped using Fractal Tree indexes only for secondary indexes. As cool as that prototype was, it

  [Read more...]
The Effects of Database Heap Storage Choices in MongoDB

William Zola over at MongoDB gave a great talk called “The (Only) Three Reasons for Slow MongoDB Performance”. It reminded me of an interesting characteristic of updates in MongoDB. Because MongoDB’s main data store is a flat file and secondary indexes store offsets into the flat file (as I explain here), if the location of a document changes, corresponding entries in secondary indexes must also change. So, an update to an unindexed field that causes the document to move also causes modifications to every secondary index, which, as William points out, can be expensive. If a document has indexed an array, this

  [Read more...]
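The cost described in the excerpt above can be made concrete with a toy model of heap storage. This is a minimal sketch under stated assumptions, not MongoDB code: `HeapCollection`, its fields and the integer "offsets" are all hypothetical, and each index is a plain dictionary from field value to document offset.

```python
class HeapCollection:
    """Toy flat-file store whose secondary indexes point at document offsets."""

    def __init__(self, indexed_fields):
        self.heap = {}  # offset -> document
        self.indexes = {field: {} for field in indexed_fields}  # field -> {value: offset}
        self.next_offset = 0
        self.index_writes = 0  # counts index entries written

    def _place(self, doc) -> int:
        """Store the document at a fresh offset and point every index at it."""
        offset = self.next_offset
        self.next_offset += 1
        self.heap[offset] = doc
        for field, index in self.indexes.items():
            index[doc[field]] = offset
            self.index_writes += 1
        return offset

    def insert(self, doc) -> int:
        return self._place(doc)

    def grow(self, offset, unindexed_field, value) -> int:
        """Update an unindexed field; assume the larger document no longer fits and must move."""
        doc = self.heap.pop(offset)
        doc[unindexed_field] = value
        return self._place(doc)  # every secondary index entry is rewritten


if __name__ == "__main__":
    coll = HeapCollection(["email", "city"])
    off = coll.insert({"email": "a@example.com", "city": "Oslo", "bio": ""})
    coll.grow(off, "bio", "x" * 1000)
    print(coll.index_writes)  # index writes for the insert plus the move
```

In this model an update to the unindexed `bio` field still costs one write per secondary index, because the indexes store the document's location rather than its primary key.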
Replicate from Oracle to Oracle, Oracle to MySQL, and Oracle to Analytics
Oracle is the most powerful DBMS in the world. However, Oracle's expensive and complex replication makes it difficult to build highly available applications or move data in real-time to data warehouses and popular databases like MySQL. In this webinar-on-demand you will learn how Continuent Tungsten solves problems with Oracle replication at a fraction of the cost of other solutions and with less
Webinar Replay & Slides: Repair & Recovery for Your MySQL, MariaDB & MongoDB / TokuMX Clusters
January 23, 2014 By Severalnines

Thanks to everyone who attended this week’s webinar; if you missed the sessions or would like to watch the webinar again and browse through the slides, they are now available online.

Special thanks to Seppo Jaakola from Codership, the creators of Galera Cluster, for walking us through the various scenarios of Galera recovery.

Webinar topics discussed: 

  • Redundancy models for Galera, NDB and MongoDB / TokuMX
  • Failover & Recovery (Automatic vs Manual)
  [Read more...]
Why I Love Open Source
Anders Karlsson wrote about "Some myths on Open Source, the way I see it" a few days ago. Anders' article is mostly focused on exploding the idea that open source magically creates high quality code. It is sad to say you do not have to look very far to see how true this is.

While I largely agree with Anders' points, there is far more that could be said on this subject, especially on the benefits of open source. I love working on open source software. Here are three reasons that are especially important to me.

1.) Open source is a great way to disseminate technology to users.  In the best cases, it is this easy to get open source products up and running:

$ sudo apt-get install software-i-want-to-use

A lot
  [Read more...]
MongoDB in 2013 -- A Year in Review
It's again that time of the year. Analysts are spending oceans of words to predict the future, companies are making plans for the next year and people are resting and enjoying the break with their families. To me, this is the perfect time to reflect on my choices, the direction I'm headed to and consider if I still love what I do.

At the beginning of the year I decided to join MongoDB (formerly 10gen). The more I think about it, the more I realize I've been wrong. Yes, it's been the worst decision in my life not to join MongoDB when I was first offered the opportunity years ago. At that time an

  [Read more...]
New Webinar: Repair and Recovery for your MySQL, MariaDB and MongoDB/TokuMX Clusters
December 19, 2013 By Severalnines


Database clusters are pretty sophisticated distributed systems with complex dependencies between nodes. The failure of a node will generally impact the overall cluster, as the remaining nodes need to reconfigure themselves to continue to operate without the failed node. Since re-introducing a node will also affect the existing cluster, the timing could therefore be dependent on the state of the other nodes in the cluster. Repair and restarts often need to be performed

  [Read more...]
December 17 Webinar: Use Your MySQL Knowledge to Become a MongoDB Guru

Use your MySQL expertise to analyze the strengths and weaknesses of MongoDB.

SPEAKER: Tim Callaghan, VP of Engineering at Tokutek
DATE: Tuesday, December 17th
TIME: 1pm ET

MongoDB is a popular NoSQL DBMS that shares the ease-of-use and quick setup that made MySQL famous. But is MongoDB really up to the job? Is it right for your applications? If you understand MySQL well, you know how database systems work.

Join Tim Callaghan, VP/Engineering at Tokutek, as he recaps the session he presented with Robert Hodges, CEO of Continuent, at Percona Live London 2013. Learn how to lean on your knowledge of topics like schema design, query optimization, indexing, sharding, and high availability to analyze the strengths and weaknesses of MongoDB. System design is all about asking the right

  [Read more...]
ClusterControl 1.2.4 Released
November 19, 2013 By Severalnines

The Severalnines team is pleased to announce the release of ClusterControl 1.2.4. This release contains key new features along with performance improvements and bug fixes.

We have outlined some of the key features below. For additional details about the release:

  [Read more...]
Mixing databases usually not optimal

Dan McKinley (Etsy) wrote an [IMHO] insightful article, "Why MongoDB Never Worked at Etsy".

First off, it’s important to realise that it’s not a snipe at MongoDB – it’s a fine tool.

The lessons are related to mixing multiple databases in a deployment (administration and monitoring overhead) and the acknowledgement that issues of schema design, scalability and maintenance need attention regardless of which brand or technology you pick for your database. That comes back to the old insight that migrations are rarely worth it (regardless of what you migrate to what).

I think these are indeed important considerations as they have a major impact on the ongoing costs of your entire environment (production as well as development and testing) – these days we

  [Read more...]
Severalnines at Percona Live London 2013: MySQL Cluster Performance Tuning, exhibitor space with live demos, discount code...
November 4, 2013 By Severalnines

Percona Live London MySQL Conference - 11-12th November, 2013

We’re particularly excited about this year’s Percona Live London MySQL Conference. The line-up of speakers & topics looks excellent and it’s good to see speakers from Oracle, Percona, the MariaDB Foundation (amongst others) scheduled at the same event. It demonstrates not just the diversity of the ever broadening MySQL ecosystem, but also the fact that there really is room for everyone to contribute, participate in and advance MySQL in manifold directions while still retaining a certain amount of uniformity.

And this is how we will be contributing to the event ...

  [Read more...]

Planet MySQL © 1995, 2014, Oracle Corporation and/or its affiliates   Legal Policies | Your Privacy Rights | Terms of Use

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.