Showing entries 1 to 28

Displaying posts with tag: B-Tree

Interview with John Partridge, President & CEO of Tokutek, Inc.

“As the database gets used, shards can grow at an uneven rate and one shard might carry a majority of the load. MongoDB corrects this by balancing shards, but because of MongoDB’s lack of concurrency this operation can stall the database unacceptably.”–John Partridge.

I have interviewed John Partridge, President & CEO of Tokutek, Inc.

RVZ

Q1. Tokutek recently announced that it has eliminated the performance issues of MongoDB sharding. What was the problem?

John Partridge: The problem occurs after a shard is created. As the database gets used, shards can grow at an uneven rate and one shard might carry a majority of the load. MongoDB corrects this by balancing

Thoughts on Small Datum – Part 1

A little background…

When I ventured into sales and marketing (I’m an engineer by education) I learned I would often have to interpret and simply summarize the business value that is sometimes hidden in benchmarks. Simply put, the people who approve the purchase of products like TokuDB® and TokuMX™ appreciate the executive summary.

Therefore, I plan to publish a multipart series here on TokuView where I will share my simple summaries and thoughts on business value for the benchmarks Mark Callaghan (@markcallaghan), a former Google and now Facebook database guru, is publishing on his blog, Small Datum.

I’m going to start with his first benchmark post and work my way forward to

Slides from Boston MongoDB User Group Meetup on 7/31/13

On Wednesday night, the Boston MongoDB User group was kind enough to have me speak about TokuMX Internals. I spoke about Fractal Tree® indexes and the technical reasons behind the benefits they provide to MongoDB applications. Although the talk mostly references TokuMX and MongoDB, all the theory applies to TokuDB and MySQL as well.

My slides are on our technology overview page, along with other great content.

Opportunities to present technical material to an engaged audience asking tough questions are rare, and much appreciated. So thank you to the Boston MongoDB User group for having me present.

Why Unique Indexes are Bad

Before creating a unique index in TokuMX or TokuDB, ask yourself, “does my application really depend on the database enforcing uniqueness of this key?” If the answer is ANYTHING other than yes, do not declare the index to be unique. Why? Because unique indexes may kill your write performance. In this post, I’ll explain why.

Unique indexes are a strange beast: they have no impact on standard databases that use B-Trees, such as MongoDB and MySQL, but may be horribly painful for databases that use write optimized data structures, like TokuMX’s Fractal Tree® indexes. How? They

Percona Live Slides and Video Available: The Right Read Optimization is Actually Write Optimization

In April, I got to give a talk at Percona Live, about why The Right Read Optimization is Actually Write Optimization. It was my first industry talk, so I was delighted when someone in the audience said “I feel like I just earned a college credit.”

Box offered to host everyone’s slides from the conference here (mine is here). A big thanks from me to Sheeri Cabral, for

TokuDB v6.0: Download Available

TokuDB v6.0 is full of great improvements, like getting rid of slave lag, better compression, improved checkpointing, and support for XA.

I’m happy to announce that TokuDB v6.0 is now generally available and can be downloaded here.

Sysbench Performance

I wanted to take this time to talk about one more under-the-hood goody we’ve added to v6.0. In

OLTP and OLAP – Have Your Cake and Eat it Too!

Looks like we’ll be having some more fun at the Percona Live MySQL Conference! In addition to our booth and my colleague Tim’s talk, my lightning talk was accepted. The title is “OLTP and OLAP – Have Your Cake and Eat it Too!” The lightning talks, given in a TBD order, will start Wednesday evening (April 11th) at around 6:30 pm.

Below is the abstract I submitted.

FictionPress Selects TokuDB for Consistent Performance and Fast Disaster Recovery

FictionPress

Issues addressed:

  • Support complex and efficient indexes at 100+ million rows.
  • Predictable and consistent performance regardless of data size growth.
  • Fast recovery.

Ensuring Predictable Performance at Scale

The Company:  FictionPress operates both FictionPress.com and FanFiction.net and is home to over 6 million works of fiction, with millions of writers/readers

Top Ten for 2011

It’s almost the end of the year – that means holiday cards, shopping, cooking, parties, and the inevitable year-end top lists (including gems like this one).

In the spirit of end of year list making, we fed our 60+ blogs this year through Google Analytics to find out what our own top ten blogs were (outside of product announcements). So if you missed an episode of the View (TokuView that is) we’ve got a Tokutek Top Ten for you (spoiler alert – they are mostly technical):

10. Cage Match: OldSQL, NoSQL and NewSQL – References to

Slides of my talk on B+Tree Indexes and InnoDB
The slides of my talk on B+Tree Indexes and InnoDB are now available for download. These slides were presented at Percona Live London 2011. You can download the slides from here. Many other interesting and informative talks were presented at Percona Live London 2011, and you should definitely check them out if you haven't already. They are available here.
Fractal Tree Indexes – MySQL Meetup

At next month’s Boston MySQL Meetup, I will give a talk: “Fractal Tree Indexes – Theoretical Overview and Customer Use Cases.” The meetup is 7 pm Monday, January 9th, 2012, and will be held at MIT Building E51 Room 337e (corner of Ames & Amherst St, Cambridge, MA). Thanks to host Sheeri Cabral for the invitation.

Most databases employ B-trees to achieve a good tradeoff between the ability to update data quickly and to search it quickly. It

It Actually is Easy Being Green

(Fractal) Tree Frog

Fractal Tree™ indexes are green. They have the potential to be greener still. Here’s why:

Remarkably, data centers consume 1-3 percent of all the US electricity. A majority of this power is used to drive servers and storage systems. Significant energy savings remain on the table.

Here’s why Fractal Tree indexing enables more energy-efficient storage: Data centers typically use many small-capacity disks rather than a few large-capacity disks. Why? One reason is to harness more spindles to obtain more I/Os per second. In some high-performance applications, users go so

Indexing: The Director’s Cut

Thanks again to Erin O’Neill and Mike Tougeron for having me at the SF MySQL Meetup last month for the talk on “Understanding Indexing.” The crowd was very interactive, and I appreciated that over 100 people signed up for the event and left some very positive comments and reviews.

Thanks to Mike, a video of the talk is now available:

As a brief overview – Application performance often depends on how fast a query can respond and query performance almost always depends on good indexing. So one of the quickest and least expensive ways to increase application performance is to optimize the indexes. This talk presents three simple and effective rules on how to construct

Don’t Thrash: How to Cache your Hash on Flash

Last week I gave a talk entitled “Don’t Thrash: How to Cache your Hash.” The talk took place at the Workshop on Algorithms and Data Structures (ADS) in a medieval castle turned conference center in Bertinoro, Italy. An earlier version of this work (with the same title) appeared at the HotStorage conference in Portland, OR. Tokutek co-founders Bradley, Martin, and I are coauthors on the work, along with students and other faculty at Stony Brook University.

The talk title is colorful and doggerel-y. Here’s what the title means. “Cache your hash”—the so-called Bloom Filter type data structure. A Bloom filter acts like a negative

Percona Live, NYC

Yesterday, Percona held Percona Live NYC, which they describe as an “intensive one-day MySQL summit.” They meant it. It was like drinking from a firehose. There was too much for me to give a complete report, so I’d like to highlight two sessions that stuck out for me.

Why SQL Wins

Sergei Tsarev (Clustrix) gave a great overview of the last 50 years of database development. He talked about the early days, in which what we now think of as database functionality had to be implemented in each application. Programmer productivity was therefore low.

As modern SQL databases emerged, productivity shot up since databases bundled up common functionality with an easy-to-code interface. This now seems like a golden age of databases, in which transactional semantics were hashed out.

Fast forward to today.

OldSQL Tricks or NewSQL Treats

Why do B-trees need “Tricks” to work?

Marko Mäkelä recently posted a couple of “tips and tricks” you can use to improve InnoDB performance. Tips and tricks. A general purpose relational database like MySQL shouldn’t need “tips and tricks” to perform well, and I lay the blame on design choices that were made in the early ’70s: the B-tree data structure underlying all OldSQL databases. B-trees were designed for machines that had very different performance characteristics than the machines of today. Hardware has changed, but B-trees are the same. Tips and Tricks are an attempt to make up the difference.

So B-tree implementers — InnoDB, Oracle, MS SQL Server — are fighting an uphill battle; they’re fighting the future. B-trees

Understanding InnoDB clustered indexes
Some people probably don't know this, but there is a difference between how indexes work in MyISAM and how they work in InnoDB, particularly from the point of view of performance. Now that InnoDB is becoming widely used, it is important to understand how indexing works in InnoDB. Hence this post!
On “Replace Into”, “Insert Ignore”, Triggers, and Row Based Replication

In posts on June 30 and July 6, I explained how implementing the commands “replace into” and “insert ignore” with TokuDB’s fractal tree data structures can be two orders of magnitude faster than implementing them with B-trees. Towards the end of each post, I hinted that there are some caveats that complicate the story a little. On July 21st I explained one caveat, secondary keys, and on August 3rd, Rich explained another caveat. In this

TokuDB speeds up “replace” and “insert ignore” operations by relaxing the affected rows constraint

In posts on June 30 and July 6, we explained how implementing the commands “replace into” and “insert ignore” with TokuDB’s fractal tree data structures can be two orders of magnitude faster than implementing them with B-trees. Towards the end of each post, we hinted that there are some caveats that complicate the story a little. In this post, we explain one of the complications: the calculation of affected rows.

MySQL returns the number of rows affected by a “replace” or “insert” statement to the client. For the “replace” statement, the number of affected rows is defined to be

On “Replace Into”, “Insert Ignore”, and Secondary Keys

In posts on June 30 and July 6, I explained how implementing the commands “replace into” and “insert ignore” with TokuDB’s fractal tree data structures can be two orders of magnitude faster than implementing them with B-trees. Towards the end of each post, I hinted that there are some caveats that complicate the story a little. In this post, I explain one of the complications: secondary indexes.

Secondary indexes act the same way in TokuDB as they do in InnoDB. They store the defined secondary key, and the primary key as a pointer to the rest of the row. So, say

Why “insert … on duplicate key update” May Be Slow, by Incurring Disk Seeks

In my post on June 18th, I explained why the semantics of normal ad-hoc insertions with a primary key are expensive because they require disk seeks on large data sets. I previously explained why it would be better to use “replace into” or to use “insert ignore” over normal inserts. In this post, I explain why another alternative to normal inserts, “insert … on duplicate key update” is no better in MySQL, because the command incurs disk seeks.

The reason “insert ignore” and “replace into” can be made fast with

Making “Insert Ignore” Fast, by Avoiding Disk Seeks

In my post from three weeks ago, I explained why the semantics of normal ad-hoc insertions with a primary key are expensive because they require disk seeks on large data sets. Towards the end of the post, I claimed that it would be better to use “replace into” or “insert ignore” over normal inserts, because the semantics of these statements do NOT require disk seeks. In my post last week, I explained how the command “replace into” can be fast with TokuDB’s fractal trees. Today, I explain how “insert ignore” can be fast, using a strategy that is very similar to what we do

Making “Replace Into” Fast, by Avoiding Disk Seeks

In this post two weeks ago, I explained why the semantics of normal ad-hoc insertions with a primary key are expensive because they require disk seeks on large data sets. Towards the end of the post, I claimed that it would be better to use “replace into” or “insert ignore” over normal inserts, because the semantics of these statements do NOT require disk seeks. In this post, I explain how the command “replace into” can be fast with fractal trees.

The semantics of “replace into” are as follows:


  • if the primary (or unique) key does not exist, insert the new row
  • if the primary (or unique) key does exist, overwrite the existing row with the new row
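
The two cases above can be seen in a minimal sketch. It assumes SQLite (via Python's built-in sqlite3 module) as a convenient stand-in for MySQL, since SQLite also accepts “replace into”; the table foo and its values are made up for illustration and say nothing about how any storage engine implements the statement internally.

```python
import sqlite3

# SQLite is only a stand-in for MySQL here; the table and values are
# hypothetical and chosen purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table foo (id integer primary key, a int, b int)")

# Key 1 does not exist yet, so "replace into" behaves like a plain insert.
conn.execute("replace into foo (id, a, b) values (1, 10, 20)")
rows_after_insert = conn.execute("select id, a, b from foo").fetchall()

# Key 1 now exists, so the new row overwrites the existing row.
conn.execute("replace into foo (id, a, b) values (1, 11, 21)")
rows_after_replace = conn.execute("select id, a, b from foo").fetchall()

print(rows_after_insert)   # [(1, 10, 20)]
print(rows_after_replace)  # [(1, 11, 21)]
```

Note that neither statement needs to report whether the key existed beforehand, which is the property the post relies on.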

The slow, expensive way

Making Updates Fast, by Avoiding Disk Seeks

The analysis that shows how to make deletions really fast by using clustering keys and TokuDB’s fractal tree based engine also applies to make updates really fast. (I left it out of the last post to keep the story simple). As a quick example, let’s look at the following statement:

update foo set price=price+1 where product='toy';

Executing this statement has two steps:

  • a query to find the rows where product='toy'
  • a combination of insertions and deletions to change old rows to new rows
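
To make the two steps concrete, here is a rough sketch that spells them out by hand at the SQL level. It assumes SQLite (Python's sqlite3 module) as a stand-in for MySQL, and the columns of foo and the sample rows are hypothetical. A real storage engine performs the equivalent work internally rather than through separate statements.

```python
import sqlite3

# SQLite stands in for MySQL; the schema and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("create table foo (id integer primary key, product text, price int)")
conn.executemany(
    "insert into foo (id, product, price) values (?, ?, ?)",
    [(1, "toy", 5), (2, "book", 9), (3, "toy", 7)],
)

# Step 1: a query to find the rows where product='toy'.
old_rows = conn.execute(
    "select id, product, price from foo where product = 'toy'"
).fetchall()

# Step 2: delete each old row and insert its replacement with price + 1.
for row_id, product, price in old_rows:
    conn.execute("delete from foo where id = ?", (row_id,))
    conn.execute(
        "insert into foo (id, product, price) values (?, ?, ?)",
        (row_id, product, price + 1),
    )

result = conn.execute("select id, product, price from foo order by id").fetchall()
print(result)  # [(1, 'toy', 6), (2, 'book', 9), (3, 'toy', 8)]
```

The end state matches what the single update statement would produce, which is why the cost of an update can be analyzed as a query plus insertions and deletions.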

The analysis is identical to that for deletions. Just like for

Disk seeks are evil, so let’s avoid them, pt. 4

Continuing in the theme from previous posts, I’d like to examine another case where we can eliminate all disk seeks from a MySQL operation and therefore get two orders-of-magnitude speedup. The general outline of these posts is:


  • B-trees do insertion disk seeks. While they’re at it, they piggyback some other work on the disk seeks. This piggyback work requires disk seeks regardless.
  • TokuDB’s Fractal Tree indexes don’t do insertion disk seeks. If we also get rid of the piggyback work, we end up with no disk seeks, and a two order of magnitude improvement.

So it’s all about finding out which piggyback work is important (important enough to pay a huge performance penalty for), and which isn’t.

Making Deletions Fast, by Avoiding Disk Seeks

In my last post, I discussed how fractal tree data structures can be up to two orders of magnitude faster on deletions over B-trees. I focused on the deletions where the row entry is known (the storage engine API handler::delete_row), but I did not fully analyze how MySQL delete statements can be fast. In this post, I do. Here I show how one can use TokuDB, a storage engine that uses fractal tree data structures, to make MySQL deletions run fast.

Let’s take a step back and analyze the work needed to be done to execute a MySQL delete statement. Suppose we have the table:

create table foo (
	id int auto_increment,
	a int,
	b int,
	primary key (id)
);

Say we wish to perform the following operation that deletes 100,000 rows:

delete from

Disk seeks are evil, so let’s avoid them, pt. 3 (Deletions)

As mentioned in parts 1 and 2, disk seeks are bad because they slow down performance. Fractal tree data structures minimize disk seeks on ad-hoc insertions, whereas B-trees practically guarantee that disk seeks are performed on ad-hoc insertions. As a result, fractal tree data structures can insert data up to two orders of magnitude faster than B-trees can.

In this post, let’s examine deletions, and get an intuitive understanding of why fractal tree data structures can perform deletions up to two orders of magnitude faster than B-trees, just as they do for insertions. In MySQL 5.1, this advantage is really eye-popping for TokuDB vs. InnoDB, because InnoDB does not use its insert buffer for deletions.

Tokutek’s Fractal Tree Indexes

Tokutek’s Bradley did a session on their Fractal Tree Index technology at the MySQL Conference (and an OpenSQL Camp before that – but I wasn’t at that one), and my first thought was: great, now we get to see what and where the magic is. On second thought, I realised you may not want to know.

I know I’m going to be a party pooper here, but I do feel it’s important for people to be aware of the consequences of looking at this stuff (there’s slide PDFs online as well as video), and software patents in general. I reckon Tokutek has done some cool things, but the patents are a serious problem.

Tokutek’s technology has patents pending, and is thus patent encumbered. What does this mean for you? It means that if you look at their “how they did it” info and you happen to code something that later ends up in a related patent lawsuit,
