Showing entries 231 to 240 of 242
« 10 Newer Entries | 2 Older Entries »
Displaying posts with tag: Insight for Developers (reset)
Distributed Set Processing with Shard-Query

Can Shard-Query scale to 20 nodes?

Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance:

I will shortly release another blog post focusing on ICE performance at 20 nodes, but before that, I want to give a quick preview, then explain exactly how Shard-Query works.

Yes, Shard-Query scales very well at 20 nodes
Distributed set processing (theory) What is SQL?

As you probably know, SQL stands for “structured query language”. It isn’t so much the language that is structured, but …

[Read more]
Percona Live gets bigger: two more speaker tracks!

We’ve just rented more rooms, and published an additional two tracks of speakers for Percona Live in New York on May 26th. The schedule is here. There is a long queue of speaker submissions we’re finalizing and will be adding to the schedule, to fill the few empty slots in those new rooms.

My favorite not-yet-confirmed session is from a company who has built their business in the Amazon cloud, and has seen just about every angle of running a large database in the cloud. This isn’t an extraordinary database, all things considered — as they told me, “it’s not a science fiction use case. It’s just science fiction to run it in the cloud.” That is precisely why this is such an interesting story to hear. There is a lot of wisdom to be gleaned from people who’ve done such things.

Tickets are selling fast, and we still expect to …

[Read more]
Drop table performance

There have been recent discussions about DROP TABLE performance in InnoDB. (You can refer to Peter’s post http://www.mysqlperformanceblog.com/2011/02/03/performance-problem-with-innodb-and-drop-table/ and these bug reports: http://bugs.mysql.com/bug.php?id=51325 and http://bugs.mysql.com/bug.php?id=56332.) It may not sound that serious, but if your workload often uses DROP TABLE and you have a big buffer pool, it may be a significant issue. This can get especially painful, as during this operation InnoDB holds the LOCK_open mutex, which prevents other queries from executing. So, this is a problem for a server with a large amount of memory, like the one we have in our lab: a …

[Read more]
Innodb row size limitation

I recently worked on a customer case where at seemingly random times, inserts would fail with Innodb error 139. This is a rather simple problem, but due to it’s nature, it may only affect you after you already have a system running in production for a while.

Suppose you have the following table structure:

CREATE TABLE example (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
fname TEXT NOT NULL,
fcomment TEXT,
ftitle TEXT NOT NULL,
fsubtitle TEXT NOT NULL,
fcontent TEXT NOT NULL,
fheader TEXT,
ffooter TEXT,
fdisclaimer TEXT,
fcopyright TEXT,
fstylesheet TEXT,
fterms TEXT,
PRIMARY KEY (id)
) Engine=InnoDB;

Now you insert some test data into it:
mysql> INSERT INTO example
-> VALUES (
->   NULL,
->   'First example',
->   'First comment',
->   'First title',
->   'First subtitle',
->   'First …

[Read more]
MySQL caching methods and tips

“The least expensive query is the query you never run.”

Data access is expensive for your application. It often requires CPU, network and disk access, all of which can take a lot of time. Using less computing resources, particularly in the cloud, results in decreased overall operational costs, so caches provide real value by avoiding using those resources. You need an efficient and reliable cache in order to achieve the desired result. Your end users also care about response times because this affects their work productivity or their enjoyment of your service. This post describes some of the most common cache methods for MySQL.

Popular cache methods

The MySQL query cache

When the query cache is enabled, MySQL examines each query to see if the contents have been stored in the query cache. If the results have been cached they are used instead of actually running the query.. This improves the response time …

[Read more]
Flexviews – part 3 – improving query performance using materialized views

Combating “data drift”

In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually similar to a replication slave that is behind. Until it catches up, the view of the data on the slave is “behind” the changes on the master. An important difference is that each MV could have drifted by a different length of time.

A view which has drifted out of sync must be refreshed. Since an MV drifts over time from the “base tables” (those tables on which the view was built) there must be a process to bring them up-to-date. …

[Read more]
InnoDB Flushing: Theory and solutions

I mentioned problems with InnoDB flushing in a previous post. Before getting to ideas on a solution, let’s define some terms and take a look into theory.

The two most important parameters for InnoDB performance are innodb_buffer_pool_size and innodb_log_file_size. InnoDB works with data in memory, and all changes to data are performed in memory. In order to survive a crash or system failure, InnoDB is logging changes into InnoDB transaction logs. The size of the InnoDB transaction log defines how many changed blocks we can have in memory for a given period of time. The obvious question is: Why can’t we simply have a gigantic InnoDB transaction log? The answer is that the size of the transaction log affects recovery time after a crash. The bigger the log, the longer the recovery time. …

[Read more]
Using Flexviews – part two, change data capture

In my previous post I introduced materialized view concepts. This post begins with an introduction to change data capture technology and describes some of the ways in which it can be leveraged for your benefit. This is followed by a description of FlexCDC, the change data capture tool included with Flexviews. It continues with an overview of how to install and run FlexCDC, and concludes with a demonstration of the utility.

As a reminder, the first post covered the following topics:

  1. What is a materialized view(MV)?
  2. It explained that an MV can pre-compute joins and may aggregate and summarize data.
  3. Using the aggregated data can significantly improve query response times compared to accessing the non-aggregated data.
  4. Keeping MVs up-to-date (refreshing) is …
[Read more]
Using Flexviews – part one, introduction to materialized views

If you know me, then you probably have heard of Flexviews. If not, then it might not be familiar to you. I’m giving a talk on it at the MySQL 2011 CE, and I figured I should blog about it before then. For those unfamiliar, Flexviews enables you to create and maintain incrementally refreshable materialized views.

You might be asking yourself “what is an incrementally refreshable materialized view?”. If so, then keep reading. This is the first in a multi-part series describing Flexviews.

edit:
You can find part 2 of the series here: http://www.mysqlperformanceblog.com/2011/03/25/using-flexviews-part-two-change-data-capture/


The output of …

[Read more]
Moving Subtrees in Closure Table Hierarchies

Many software developers find they need to store hierarchical data, such as threaded comments, personnel org charts, or nested bill-of-materials. Sometimes it’s tricky to do this in SQL and still run efficient queries against the data. I’ll be presenting a webinar for Percona on February 28 at 9am PST. I’ll describe several solutions for storing and querying trees in an SQL database, including the design I call Closure Table.

In Closure Table, we store every path in a tree, not only direct parent-child references, but also grandparent-grandchild, and every other path, no matter how long. We even store paths of length zero, which means a node is its own parent. So if A is a parent of B, and B is a parent of C and C is a parent of D, we need to store the following paths: A-A, A-B, A-C, A-D, B-B, B-C, B-D, C-C, …

[Read more]
Showing entries 231 to 240 of 242
« 10 Newer Entries | 2 Older Entries »