Showing entries 31 to 40 of 283
« 10 Newer Entries | 10 Older Entries »
Displaying posts with tag: mongodb
Sometimes a Variety of Databases is THE Database You Need

We were just leafing through the 2015 edition of The DZone Guide to Database and Persistence Management, and we noticed some interesting stats in the guide's included survey, about which we'd like to share some observations. The survey is one of the ebook's central features, and it includes feedback from over 800 IT Professionals, with 63% of those respondents coming from companies with over 100 employees and 69% with over 10 years of experience -- they represent a significant and important cross-section of our industry.

These kinds of reports can be enlightening, as they offer the opportunity to take some of our principles and pin them to the hard facts and numbers of actual database activity in the field.

In a section titled "One Type of Database is Usually Not Enough," the report reveals that it's standard …

[Read more]
RocksDB vs the world for loading Linkbench tables

I like RocksDB, MyRocks and MongoRocks because they are IO efficient thanks to excellent compression and low write-amplification. It is a bonus when I get better throughput with the RocksDB family. Here I show that RocksDB has excellent load performance. To avoid benchmarketing, I note the context: loading Linkbench tables when all data fits in cache. This workload should be CPU bound and can be mutex bound. I compare engines for MySQL (RocksDB, TokuDB, InnoDB) and MongoDB (RocksDB, WiredTiger) using the same hardware.

I used …

[Read more]
Setup a MongoDB replica/sharding set in seconds

In the MySQL world, we’re used to playing in the MySQL Sandbox. It allows us to deploy a testing replication environment in seconds, without a great deal of effort or navigating multiple virtual machines. It is a tool that we couldn’t live without in Support.

In this post I am going to walk through the different ways we have to deploy a test MongoDB replica set or sharded cluster in a similar way. It is important to mention that this is not intended for production, but to be used for troubleshooting, learning or just playing around with replication.

Replica Set regression test’s diagnostic commands

MongoDB includes a .js file that allows us to deploy a replica set from the MongoDB shell. Just run the following:

# mongo …
[Read more]
Peter Zaitsev webinar January 27th: Compression In Open Source Databases

Percona invites you to attend a webinar Wednesday, January 27th, with CEO Peter Zaitsev: Compression In Open Source Databases. Register now!

Data growth has been tremendous in the last decade and shows no signs of stopping. To deal with this trend, database technologies have implemented a number of approaches, and data compression is by far the most common and important. Compression in open source databases is complicated, and there are a lot of different approaches, each with its own implications.

In this talk we will perform a survey of compression in some of the most popular open source database engines including: InnoDB, TokuDB, MongoDB, WiredTiger, RocksDB, and PostgreSQL.

Important information: …

[Read more]
Percona Live Data Performance Conference 2016: news you need to know!

The Percona Live Data Performance Conference 2016 is rapidly approaching, and we’re looking forward to providing an outstanding experience April 18-21 for all who attend.

Percona Live is the premier event for the rich and diverse open source community and businesses that thrive in the MySQL and NoSQL marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, and CEOs representing organizations from industry giants such as Oracle to start-ups. Vendors increasingly rely on the conference as a major opportunity to connect with potential high-value customers from around the world.

Below are some highlights for the upcoming conference regarding the conference schedule, Tutorial sessions, Birds of a Feather talks, and Lightning talks.

Conference Schedule

Percona Live is packed with engaging sessions, helpful …

[Read more]
The advantages of an LSM vs a B-Tree

The log structured merge tree (LSM) is an interesting algorithm. It was designed for disks yet has been shown to be effective on SSD. Not all algorithms grow better with age. A long time ago I met one of the LSM co-inventors, Patrick O'Neil, at the first job I had after graduate school. He was advising my team on bitmap indexes. He did early and interesting work on both topics. I went on to maintain bitmap index code in the Oracle RDBMS for a few years. Patrick O'Neil made my career more interesting.

Performance evaluations are hard. It took me a long time to get expertise in InnoDB, then I repeated that for RocksDB. Along the way I made many mistakes. Advice on …

[Read more]
Percona Server for MongoDB storage engines in iiBench insert workload

We recently released the GA version of Percona Server for MongoDB, which comes with a variety of storage engines: RocksDB, PerconaFT and WiredTiger.

Both RocksDB and PerconaFT are write-optimized engines, so I wanted to compare all engines in a workload oriented toward data ingestion.

For a benchmark I used iiBench-mongo, and I inserted one billion (bln) rows into a collection with three indexes. Inserts were done in ten parallel threads.

For memory limits, I used 10GB as the cache size, with a total limit of 20GB available for the mongod process, …

[Read more]
Read, write & space amplification - B-Tree vs LSM

This post compares a B-Tree and LSM for read, write and space amplification. The comparison is done in theory and practice so expect some handwaving mixed with data from iostat and vmstat collected while running the Linkbench workload. For the LSM I consider leveled compaction rather than size-tiered compaction. For the B-Tree I consider a clustered index like InnoDB.

The comparison in practice provides values for read, write and space amplification on real workloads. The comparison in theory attempts to explain those values.
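
As a rough illustration of how such values can be derived from counters, here is a minimal sketch in Python. It assumes the commonly used definitions: write amplification as bytes written to storage per byte written by the application, space amplification as bytes on disk per byte of logical data, and read amplification as storage reads per query. The function names and the sample numbers are hypothetical and are not taken from the post or from Linkbench.

```python
def write_amplification(storage_bytes_written, app_bytes_written):
    """Bytes written to storage for each byte the application wrote."""
    return storage_bytes_written / app_bytes_written


def space_amplification(bytes_on_disk, logical_bytes):
    """Bytes used on disk for each byte of logical (user) data."""
    return bytes_on_disk / logical_bytes


def read_amplification(storage_reads, queries):
    """Storage reads performed for each query."""
    return storage_reads / queries


# Hypothetical counters, for illustration only (not measured results).
write_amp = write_amplification(storage_bytes_written=300, app_bytes_written=100)
space_amp = space_amplification(bytes_on_disk=150, logical_bytes=100)
read_amp = read_amplification(storage_reads=20, queries=10)
print(write_amp, space_amp, read_amp)
```

In practice the numerators would come from tools such as iostat and the denominators from the workload generator or the database's own counters.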

B-Tree vs LSM in …

[Read more]
Read, write & space amplification - pick 2

Good things come in threes, then reality bites and you must choose at most two. This choice is well known in distributed systems with CAP, PACELC and FIT. There is a similar choice for database engines. An algorithm can optimize for at most two from read, write and space amplification. These are metrics for efficiency …

[Read more]
Define better for a small-data DBMS

There are many dimensions by which a DBMS can be better for small data workloads: performance, efficiency, manageability, usability and availability. By small data I mean OLTP. Performance gets too much attention from both industry and academia while the other dimensions are at least as important in the success of a product. Note that this discussion is about which DBMS is likely to get the majority of new workloads. The decision to migrate a solved problem is much more complex.

  • Performance - makes marketing happy
  • Efficiency - makes management happy
  • Manageability - makes operations happy
  • Usability - makes database-backed application developers happy
  • Availability - makes users happy

Performance makes marketing happy because they can publish a whitepaper to show their product …

[Read more]