Showing entries 1 to 5
Displaying posts with tag: shard
Interview with John Partridge, President & CEO of Tokutek, Inc.

“As the database gets used, shards can grow at an uneven rate and one shard might carry a majority of the load. MongoDB corrects this by balancing shards, but because of MongoDB’s lack of concurrency this operation can stall the database unacceptably.”–John Partridge.

I have interviewed John Partridge, President & CEO of Tokutek, Inc.

RVZ

Q1. Tokutek recently announced that it has eliminated performance issues with MongoDB sharding. What was the problem?

John Partridge: The problem occurs after a shard is created. As the database gets used, shards can grow at an uneven rate and one shard might carry a majority of the load. MongoDB corrects this by balancing shards, but because of MongoDB’s lack of concurrency this operation can stall the database unacceptably (see the …

“Shard early, shard often”

I wrote a post a while back that said why you don't want to shard.  In that post I tried to explain that hardware advances, such as 128G of RAM being so cheap, are changing the point at which you need to shard, and that the (often omitted) operational issues created by sharding can be painful.

What I didn't mention was whether, once you've established that you will eventually need to shard, it is better to just get it out of the way early.  My answer is almost always no. That is to say, I disagree with a statement I've been hearing recently: "shard early, shard often".  Here's why:

  • There's an order of magnitude better performance that can be gained by focusing on query/index/schema optimization.  The gains from sharding are usually much lower.
  • If you shard …
Speaking at MySQL Conf 2009: Architecture and Technology, Cloud Computing, LAMP, Replication and Scale-Out

I'll be going into detail on what sharding is, how to shard, the pitfalls of sharding, performance/throughput gains, shard roles, and performance scaling in general. I hope to make this the most comprehensive talk to date on the subject in 45 min.

The topic is called Scaling a Widget Company. I'll detail how I set up the data layer for Rockyou: how many transactions per second Rockyou is at, what the infrastructure is composed of, and how 99.999% uptime is achieved. I also hope to get into BCP, although I probably will not have time to go over it.

If you want me to focus on specific aspects of sharding, let me know and I will :).

Slides from Proxy talk

I’ve reposted the slides from my Spockproxy talk (Spockproxy is a sharding-only version of the MySQL proxy).  Since I’ve had to move this web site, it’s been some effort to move all of the files into their new homes.

These slides are in a variety of formats and have loads of great information if you’re considering a sharding solution – even if that is not Spockproxy.  Of course once you see how easy it is you’ll put it on your short list.

The slides are available at http://www.frankf.us/projects/spockproxy/ and if you want to hear the talk, you’ll have to invite me.

Capacity Planning, Architecture, Scaling, Response time, Throughput

First of all, let me start off by saying that I learned a lot about Capacity Planning from two people: Jozo Dujmovic and John Allspaw, who by the way is coming out with a book.

Capacity != Performance. You may have the capacity to do a bubble sort, but a bubble sort is still a bubble sort.

Really, to scale you need to know when your application will break. I have a tool set that determines which application is producing which SQL, and I use that to figure out which SQL is producing the most load on the system. One common trick is to put the execution path into each query automatically as a SQL comment, then sample the FULL Processlist to build a graph of which application, function, and SQL pattern is the top load.
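The comment-tagging and sampling trick can be sketched roughly as follows. This is a minimal illustration, not the author's actual tool set: the `path=` comment format and the names `tag_query`, `sql_pattern`, and `top_load` are all hypothetical, and the samples are assumed to be the statement text (the Info column) collected from repeated SHOW FULL PROCESSLIST runs.

```python
import re
from collections import Counter

# Hypothetical tag format: /* path=app.module.function */
TAG_RE = re.compile(r"/\*\s*path=(?P<path>[^*]*?)\s*\*/")

def tag_query(sql, path):
    """Prefix a query with its execution path as a SQL comment."""
    safe = path.replace("*/", "")  # keep the path from closing the comment early
    return f"/* path={safe} */ {sql}"

def sql_pattern(sql):
    """Collapse literals so queries that differ only in values group together."""
    sql = TAG_RE.sub("", sql)
    sql = re.sub(r"'(?:[^'\\]|\\.)*'", "?", sql)  # quoted strings
    sql = re.sub(r"\b\d+\b", "?", sql)            # bare numbers
    return " ".join(sql.split())

def top_load(samples):
    """Count how often each (path, pattern) pair shows up across samples."""
    counts = Counter()
    for info in samples:
        if not info:  # idle connections have no statement text
            continue
        match = TAG_RE.search(info)
        path = match.group("path") if match else "untagged"
        counts[(path, sql_pattern(info))] += 1
    return counts.most_common()

# Example: statement text as it might appear in sampled processlist rows.
samples = [
    tag_query("SELECT * FROM users WHERE id = 1", "profile.load"),
    tag_query("SELECT * FROM users WHERE id = 2", "profile.load"),
    tag_query("UPDATE widgets SET name = 'a' WHERE id = 7", "widget.save"),
    None,
]
ranking = top_load(samples)
```

A statement that appears in many samples is either running often or running for a long time, so the sample count works as a rough proxy for load. The ranking can then be graphed per application, function, or SQL pattern.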

On top of that I use Ganglia to trend the use of each mysql …
