|Previous 30 Newer Entries||Showing entries 61 to 90 of 90|
At FROSCON I’ll be talking about fast data structures for maintaining indexes. The talk will share some content with my upcoming MySQL Connect talk.
At VLDB, Dzejla Medjedovic will be presenting a talk on our paper on SSD-friendly Bloom-filter-like data structures. The paper is
Michael A. Bender, Martin Farach-Colton, Rob Johnson, Russell Kraner, Bradley C. Kuszmaul, Dzejla Medjedovic, Pablo Montes, Pradeep Shetty, Richard P. Spillane, and Erez Zadok.
Don’t Thrash: How to Cache Your Hash on Flash. PVLDB 5(11):1627-1637, 2012.
An earlier version of the paper appeared at[Read more...]
A few weeks ago Bradley Kuszmaul and I attended the Dagstuhl Seminar on Database Workload Management.
The Dagstuhl computer science research center is (remotely) located in the countryside in Saarland, Germany. The actual building is an 18th Century Manor House, first retooled as an old-age home, and then a computer science research center. Workshop participants typically spend the whole week talking and working together.
Dagstuhl Computer Science Center[Read more...]
This past week I attended OSCon, the annual conference for open source’s true believers. And there was a religious fervor in the air, particularly from the point of view of someone more accustomed to Oracle conferences.
And if open source is the religion, proprietary closed-source companies are the devil. That having been said, I was surprised how virtually all large companies were demonized. Even long-time defenders of open source like IBM were ignored at best. That didn’t prevent them from coming, though: Microsoft and HP in particular had high-profile sponsorships and PR offensives that didn’t seem to have much influence with the crowd.
The companies generating buzz were the small companies built around development of their own open source products. There are a surprising number of them out[Read more...]
Three rules on making indexes around queries to provide good performance
Application performance often depends on how fast a query can respond and query performance almost always depends on good indexing. So one of the quickest and least expensive ways to increase application performance is to optimize the indexes. This talk presents three simple and effective rules on how to construct indexes around queries that result in good performance.
Time: 2PM EDT / 11AM PDT
This webinar is a general discussion applicable to all databases using indexes and is not specific to any particular MySQL® storage engine[Read more...]
"Why the days are numbered for Hadoop as we know it" I know GigaOM likes to provoke scandals sometimes; we all remember some other unforgettable piece, but there is something behind it...
Solving the Challenges of Big Databases with MySQL
When you’re using MySQL for big data (more than ten times as large as main memory), these challenges often arise: loading data fast; maintaining indexes under insertions, deletions, and updates; adding and removing columns online; adding indexes online; preventing slave lag; and compressing data effectively.
This session shows why some of these challenges are difficult to solve with storage engines based on B-trees, how Fractal Tree® data structures work, and why they can help solve these problems. Tokutek sells a transaction-safe Fractal Tree storage engine for MySQL, but the presentation is primarily about the underlying technology. It includes a discussion of both the theoretical and practical aspects of Fractal Tree indexes.
I have the privilege of being able to give[Read more...]
Table optimization is a necessary evil; tables sometimes need to be optimized to reclaim space or to improve query performance. Unfortunately, MySQL blocks writes to a table while it is being optimized. Because optimization time is proportional to the table size, writes can be blocked for a long time. Fractal Tree indexes support online optimization; however, the MySQL metadata lock gets in the way of writing while optimizing. We will describe a simple patch to MySQL that enables online optimization of TokuDB tables.
Why do tables need to be optimized? Here are some reasons.
Fast indexing requires the leaves of a Fractal Tree® Index to be big. But some queries require the leaves to be small in order to get any reasonable performance. Basement nodes are our way to achieve these conflicting goals, and here I’ll explain how.
On many occasions, we at Tokutek have pointed out that TokuDB is write optimized, which means TokuDB indexes data much faster than a B-tree solution such as InnoDB. As with any write-optimized data structure, Fractal Tree indexes need to bundle up lots of small writes into a few big writes. Otherwise, there’d be no way to beat a B-tree. So the question is, how big do the writes have to be?
Consider how long it takes to write k bytes to a disk. First, there is the seek time s, which we can assume to be independent of k.[Read more...]
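The excerpt stops right after introducing the seek time s. As a rough sketch of the kind of cost model this sets up, assume that a write of k bytes costs one seek plus a transfer at a fixed sequential bandwidth. The bandwidth parameter, the function names, and the numbers below are illustrative assumptions, not figures from the post.

```python
def write_time(k_bytes, seek_s, bandwidth_bytes_per_s):
    """Seconds to write k bytes: one seek plus the transfer time.

    Assumed model: the seek time does not depend on k, and the
    transfer time is k divided by a fixed sequential bandwidth.
    """
    return seek_s + k_bytes / bandwidth_bytes_per_s


def bandwidth_fraction(k_bytes, seek_s, bandwidth_bytes_per_s):
    """Fraction of the write time spent transferring data (not seeking)."""
    transfer_s = k_bytes / bandwidth_bytes_per_s
    return transfer_s / (seek_s + transfer_s)


if __name__ == "__main__":
    # Hypothetical disk: 10 ms seek, 100 MB/s sequential bandwidth.
    seek_s = 0.01
    bandwidth = 100e6
    for k in (4 * 1024, 64 * 1024, 1_000_000, 4_000_000):
        frac = bandwidth_fraction(k, seek_s, bandwidth)
        print(f"{k:>9} bytes: {frac:.1%} of the time is spent transferring")
```

Under this model, small writes are dominated by the seek, and the write size has to reach roughly the seek time multiplied by the bandwidth before half of the disk's bandwidth is put to use.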
The signal-to-noise ratio in the NoSQL world has made it hard to figure out what’s going on, or even who has something new. For all the talk of performance in the NoSQL world, much of the most exciting part of what’s new is really not about performance at all.
Take, for example, MongoDB, which has a really great data model, and MapReduce, which has a very handy scripting language. These are genuine and probably long-lasting contributions. Their innovation is all about finding a new language to use for interacting with data. They are about NoSQL.
The confusion comes, for me, when we get to the performance side of the equation. I explore this in detail in an article I did for Datanami recently – http://www.datanami.com/datanami/2012-05-22/the_sound_and_the_nosql_fury.html.
Martin Farach-Colton and I ran a Tutorial on Algorithms for Memory Sensitive Computing on May 18th at the 44th ACM Symposium on Theory of Computing (STOC) at NYU. Here is the program for the tutorial.
Erik Demaine (MIT) spoke on the History of I/O Models. Throughout the years, a remarkable variety of computational models have been proposed to explain the effects of caching, data locality, prefetching, and single-and multi-level memory hierarchies. Erik traced the intellectual history and connections between these models. Most approaches[Read more...]
"Dell announced a prototype low-power server with ARM processors, following a growing demand by Web companies for custom-built servers that can scale performance while reducing financial overhead on data centers" In short, ARM (see Wikipedia definition here) is an architecture standard for processors. ARM processors are slower compared to good old x86 processors from Intel and AMD, but have power-efficiency, density and price attributes that intrigue [Read more...]
TokuDB® is a proven solution that scales MySQL® and MariaDB® from GBs to TBs with unmatched insert and query speed, compression, and online schema flexibility.
Tokutek’s recently launched TokuDB v6 delivers all of these features and more, with the introduction of high performance replication for MySQL and MariaDB. TokuDB v6 eliminates the common and persistent problem of “slave lag” in which a replication server is unable to keep up with the query load borne by the master server. TokuDB v6 solves this by offering high ingestion rates at the slave.[Read more...]
Many database management tasks become difficult as you move from millions of rows and gigabytes of data to billions of rows and terabytes of data. Such tasks include ingesting data while maintaining indexes; changing schemas without downtime; and supporting connections, replication, and backup. For some scaling problems (connections and replication), MySQL® is better than most of the competition. For others, such as indexing, schema changes, and backup, MySQL has typically been harder to use. Fortunately, the tasks MySQL does well are in its core, whereas the tasks that are more difficult can be solved with storage engine plug-ins.
I recently gave a talk at[Read more...]
Tackling machine data on the ground to ensure successful operations for NASA in space
The Company: Southwest Research Institute (SwRI) is an independent, nonprofit applied research and development organization. The staff of more than 3,000 specializes in the creation and transfer of technology in engineering and the physical sciences. Currently, SwRI is part of an international team working on the NASA[Read more...]
In April, I got to give a talk at Percona Live, about why The Right Read Optimization is Actually Write Optimization. It was my first industry talk, so I was delighted when someone in the audience said “I feel like I just earned a college credit.”[Read more...]
MySQL storage engine provider joins forces with leading database consultants to deliver support for growing number of MySQL and MariaDB customers
Lexington, MA – (May 2, 2012) – Tokutek, the leader in high-performance and agile database storage engines, today announced a strategic partnership with PalominoDB, a premier database operations and engineering consultancy, to provide database services and support to joint customers. Tokutek’s storage engine will be complemented with PalominoDB’s operational excellence, 24×7 on-call support and access to the company’s skilled team of[Read more...]
I’m happy to announce that TokuDB v6.0 is now generally available and can be downloaded here.
I wanted to take this time to talk about one more under-the-hood goody we’ve added to v6.0. In[Read more...]
Challenges of Big Databases with MySQL
Many database management tasks become difficult as you move from millions of rows and gigabytes of data to billions of rows and terabytes of data. Such tasks include ingesting data while maintaining indexes; changing schemas without downtime; and supporting connections, replication, and backup. For some scaling problems (connections and replication), MySQL is better than most of the competition. For others, such as indexing, schema changes, and backup, MySQL has typically been harder to use. Fortunately, the tasks MySQL does well are in its core, whereas the tasks that are more difficult can be solved with storage engine[Read more...]
A key feature of our new TokuDB v6.0 release, which I have been blogging about this week, is compression. Compression is always on in TokuDB, and the compression we’ve achieved in the past has been quite good. See a previous post on the 18x compression achieved by TokuDB v5.0 on one benchmark. In our latest release, we’ve updated the way compression works and got a 50% improvement in compression.
I decided to present numbers on the same set of data as the old post, so see that post for experimental details.
But first, what are the changes? TokuDB compresses large blocks[Read more...]
Master/slave replication is an important tool that gets used in many ways: distributing read loads among many slaves for performance, using a slave for backups so the master can handle live load, geographically distributed disaster recovery, etc. The Achilles’ heel of slave performance is that slave workloads are single-threaded. The master can have many clients inserting, updating, querying, whereas the slave has only one insertion client: the master. InnoDB single-client performance is much slower than its multi-client performance, which means that the bottleneck in a master/slave system is often the rate at which a slave can keep up.
If the master has an average transactions per second (tps) that is higher than what the slave can handle, the slave will fall further and further behind. If the slaves are being used to distribute read workload, for example, the[Read more...]
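To make the "further and further behind" point concrete, here is a minimal sketch of how a backlog grows when the master's average rate exceeds what the slave can apply. It assumes constant average rates; the function names and the numbers are hypothetical and are not measurements from the post.

```python
def backlog_transactions(master_tps, slave_tps, seconds):
    """Transactions the slave still has to apply after `seconds`.

    Assumes constant average rates. A slave that is at least as fast
    as the master keeps up, so the backlog is zero.
    """
    return max(0.0, master_tps - slave_tps) * seconds


def catch_up_seconds(master_tps, slave_tps, seconds):
    """Time the slave needs to drain the backlog if the master goes idle."""
    return backlog_transactions(master_tps, slave_tps, seconds) / slave_tps


if __name__ == "__main__":
    # Hypothetical rates: master commits 1200 tps, slave applies 1000 tps.
    for hours in (1, 6, 24):
        lag = catch_up_seconds(1200, 1000, hours * 3600)
        print(f"after {hours:>2} h the slave is about {lag / 60:.0f} minutes behind")
```

The backlog grows linearly with time for any sustained gap between the two rates, which is why the single-client insertion rate of the slave matters so much.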
We are excited to announce TokuDB® v6.0, the latest version of Tokutek’s flagship storage engine for MySQL and MariaDB.
This version offers feature and performance enhancements over previous releases, support for XA (two-phase transactional commits), better compression, and reduced performance variability associated with checkpointing. This release also brings TokuDB support up to date on MySQL v5.1, MySQL v5.5 and MariaDB v5.2. There’s a lot of great technical stuff under the hood in this release and I’ll be reviewing the improvements one-by-one over the course of this week.
I’ll be posting more details about the new features and performance, so here’s an overview of what’s in store.
Replication Slave Lag
One of the things TokuDB does well is single-threaded insertions, which translates directly into [Read more...]
On Monday, I took a break from planning for the upcoming Percona Live MySQL Conference (where we have a session, lightning talk, booth, and other misc activities planned) to go attend the UK-Massachusetts Innovation Economies Conference at the MIT Media Lab. The event featured Gov. Deval Patrick, MIT Media Lab Director Joi Ito, industry experts such as[Read more...]
Now that the snow is melting and spring is in the air, the SkySQL Team is hitting the road and making the rounds of key industry events, trade shows, and meetups around the globe. Come meet the team, pick up a few tips and tricks for using the MySQL database, network with your peers, and learn more about SkySQL’s products and services. Here are some of the events we’ll be at this spring:
BIG Data, A New Horizon for Data Analysis
March 20 - 21, 2012
Cité Internationale Universitaire de Paris, Paris, France
March 28-29, 2012
Columbia Metropolitan Convention Center, Columbia, South Carolina
We had the privilege this past week to be invited to be part of the 2012 O’Reilly Strata “Making Data Work” Conference. Some of our photos from the event are here. At the event, we were excited to have Tokutek described in front of the approximately 2,500 attendees during the keynote sessions.
Overall, the diversity of topics discussed at the conference was impressive, spanning databases, developer tools, data visualization techniques, customer stories, and business implications. The full agenda is[Read more...]