Planet MySQL

Displaying posts with tag: Insight for Developers

May

2011

Posted by Justin Swanhart of MySQL Performance Blog on Thu 19 May 2011 06:06 UTC
Tags:

Insight for Developers, MySQL

Here is the problem: http://en.wikipedia.org/wiki/Partition_problem

Any weighted set can be partitioned. A weighted set can be represented sparsely, which saves both space and time. Duplicates are also removed, making it easy to at least partially partition any problem by hash.

For example, you can distribute the subset sum problem over multiple machines when you reduce the set to unique values and hash on the value (weighted set). This allows you to distribute checks, reducing complexity.
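As a rough illustration of the idea above, the following sketch collapses a list of integers into a weighted set (value mapped to count) and assigns each unique value to a partition with a simple modulo hash. The function names `to_weighted_set` and `partition_by_hash`, the sample data, and the choice of modulo as the hash are assumptions made for this example; they are not taken from the original post or from Shard-Query.

```python
from collections import Counter


def to_weighted_set(items):
    """Collapse a multiset into {value: count}; duplicates become weights."""
    return Counter(items)


def partition_by_hash(weighted, n_partitions):
    """Assign each unique value to one partition using value % n_partitions.

    The modulo hash is an assumption for illustration; any stable hash works.
    """
    parts = [dict() for _ in range(n_partitions)]
    for value, weight in weighted.items():
        parts[value % n_partitions][value] = weight
    return parts


# Hypothetical sample data.
items = [3, 3, 5, 8, 8, 8, -2]
weighted = to_weighted_set(items)
parts = partition_by_hash(weighted, 2)

# Each partition could be summed on a different machine; the partial
# results combine into the same total as a single pass over the list.
partial_sums = [sum(v * w for v, w in p.items()) for p in parts]
total = sum(partial_sums)
```

With two partitions the modulo hash happens to separate even values from odd ones, and the partial sums add up to the same total as summing the original list directly.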

You can partition the problem into even and odd items, the absolute value of the items, etc, and add many simple checks which can be tested during insertion instead of after a huge list has been created inside the database. Using distributed computation of combinatorial algebra (sum) you can aggregate the data with your load, allowing you to answer questions on the data while it is loaded. The check will fire as soon as an …

[Read more]

May

2011

Using any general purpose computer as a special purpose SIMD computer

Posted by Justin Swanhart of MySQL Performance Blog on Mon 16 May 2011 13:33 UTC
Tags:

Insight for Developers, MySQL

Often, from a computing perspective, one must run a function on a large amount of input. When the same function must be run on many pieces of input, the process is very expensive unless the work can be done in parallel.

Shard-Query introduces set based processing, which on the surface appears to be similar to other technologies on the market today. However, the scaling features of Shard-Query are just a side effect of the fact that it operates on sets in parallel. Any set can be operated on to any arbitrary degree of parallelism up to, and including, the cardinality of the set.
Given that:

It is often possible to arbitrarily transform one type of expression into a different, but compatible type for computational purposes as long as the conversion is bidirectional
A range operation over a set of integers or dates can be transformed into one or more discrete sub-ranges
Any …
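The point about range operations can be sketched in a few lines. The helper below splits an inclusive integer range into contiguous sub-ranges that could each be processed independently. The name `split_range` and its parameters are hypothetical and only illustrate the idea; this is not how Shard-Query itself is implemented.

```python
def split_range(lo, hi, n_chunks):
    """Split the inclusive integer range [lo, hi] into contiguous sub-ranges.

    Returns at most n_chunks (start, end) pairs, never more than the number
    of values in the range, so parallelism is capped at the set's cardinality.
    """
    total = hi - lo + 1
    if total <= 0 or n_chunks <= 0:
        return []
    n_chunks = min(n_chunks, total)
    base, extra = divmod(total, n_chunks)
    ranges = []
    start = lo
    for i in range(n_chunks):
        # The first `extra` chunks take one additional value each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


chunks = split_range(1, 10, 3)
# Summing each sub-range separately gives the same result as one pass.
chunk_sums = [sum(range(a, b + 1)) for a, b in chunks]
```

Asking for more chunks than there are values, for example `split_range(1, 3, 5)`, yields one sub-range per value.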

[Read more]

May

2011

Distributed Set Processing with Shard-Query

Posted by Justin Swanhart of MySQL Performance Blog on Sat 14 May 2011 13:43 UTC
Tags:

Insight for Developers, MySQL

Can Shard-Query scale to 20 nodes?

Peter asked this question in comments to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could do due to EC2 and time limits. I think the results at 20 nodes are very useful for understanding the performance:

I will shortly release another blog post focusing on ICE performance at 20 nodes, but before that, I want to give a quick preview, then explain exactly how Shard-Query works.

Yes, Shard-Query scales very well at 20 nodes
Distributed set processing (theory)

What is SQL?

As you probably know, SQL stands for “structured query language”. It isn’t so much the language that is structured, but …

[Read more]

May

2011

Percona Live gets bigger: two more speaker tracks!

Posted by MySQL Performance Blog on Tue 03 May 2011 21:34 UTC
Tags:

Events and Announcements, Insight for DBAs, Insight for Developers, Cloud and NoSQL, MySQL

We’ve just rented more rooms, and published an additional two tracks of speakers for Percona Live in New York on May 26th. The schedule is here. There is a long queue of speaker submissions we’re finalizing and will be adding to the schedule, to fill the few empty slots in those new rooms.

My favorite not-yet-confirmed session is from a company who has built their business in the Amazon cloud, and has seen just about every angle of running a large database in the cloud. This isn’t an extraordinary database, all things considered — as they told me, “it’s not a science fiction use case. It’s just science fiction to run it in the cloud.” That is precisely why this is such an interesting story to hear. There is a lot of wisdom to be gleaned from people who’ve done such things.

Tickets are selling fast, and we still expect to …

[Read more]

Apr

2011

Drop table performance

Posted by MySQL Performance Blog on Thu 21 Apr 2011 03:49 UTC
Tags:

Benchmarks, Insight for DBAs, Insight for Developers, Percona Software, MySQL

There have been recent discussions about DROP TABLE performance in InnoDB. (You can refer to Peter’s post http://www.mysqlperformanceblog.com/2011/02/03/performance-problem-with-innodb-and-drop-table/ and these bug reports: http://bugs.mysql.com/bug.php?id=51325 and http://bugs.mysql.com/bug.php?id=56332.) It may not sound that serious, but if your workload often uses DROP TABLE and you have a big buffer pool, it may be a significant issue. This can get especially painful, as during this operation InnoDB holds the LOCK_open mutex, which prevents other queries from executing. So, this is a problem for a server with a large amount of memory, like the one we have in our lab: a …

[Read more]

Apr

2011

Innodb row size limitation

Posted by Fernando Ipar of MySQL Performance Blog on Thu 07 Apr 2011 21:04 UTC
Tags:

tips, innodb, storage engine, Insight for Developers, MySQL

I recently worked on a customer case where at seemingly random times, inserts would fail with InnoDB error 139. This is a rather simple problem, but due to its nature, it may only affect you after you already have a system running in production for a while.

Suppose you have the following table structure:

CREATE TABLE example (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
fname TEXT NOT NULL,
fcomment TEXT,
ftitle TEXT NOT NULL,
fsubtitle TEXT NOT NULL,
fcontent TEXT NOT NULL,
fheader TEXT,
ffooter TEXT,
fdisclaimer TEXT,
fcopyright TEXT,
fstylesheet TEXT,
fterms TEXT,
PRIMARY KEY (id)
) Engine=InnoDB;

Now you insert some test data into it:
mysql> INSERT INTO example
    -> VALUES (
    -> NULL,
    -> 'First example',
    -> 'First comment',
    -> 'First title',
    -> 'First subtitle',
    -> 'First …

[Read more]

Apr

2011

MySQL caching methods and tips

Posted by Justin Swanhart of MySQL Performance Blog on Tue 05 Apr 2011 04:39 UTC
Tags:

tips, tuning, memcached, query cache, Insight for Developers, Cloud and NoSQL, MySQL, Performance

“The least expensive query is the query you never run.”

Data access is expensive for your application. It often requires CPU, network and disk access, all of which can take a lot of time. Using fewer computing resources, particularly in the cloud, results in decreased overall operational costs, so caches provide real value by avoiding the use of those resources. You need an efficient and reliable cache in order to achieve the desired result. Your end users also care about response times because this affects their work productivity or their enjoyment of your service. This post describes some of the most common cache methods for MySQL.

Popular cache methods

The MySQL query cache

When the query cache is enabled, MySQL examines each query to see if the contents have been stored in the query cache. If the results have been cached they are used instead of actually running the query. This improves the response time …

[Read more]

Apr

2011

Flexviews – part 3 – improving query performance using materialized views

Posted by Justin Swanhart of MySQL Performance Blog on Mon 04 Apr 2011 20:36 UTC
Tags:

Insight for DBAs, Insight for Developers, MySQL

Combating “data drift”

In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually similar to a replication slave that is behind. Until it catches up, the view of the data on the slave is “behind” the changes on the master. An important difference is that each MV could have drifted by a different length of time.

A view which has drifted out of sync must be refreshed. Since an MV drifts over time from the “base tables” (those tables on which the view was built) there must be a process to bring them up-to-date. …

[Read more]

Apr

2011

InnoDB Flushing: Theory and solutions

Posted by MySQL Performance Blog on Mon 04 Apr 2011 15:17 UTC
Tags:

Benchmarks, Insight for DBAs, Insight for Developers, MySQL

I mentioned problems with InnoDB flushing in a previous post. Before getting to ideas on a solution, let’s define some terms and take a look into theory.

The two most important parameters for InnoDB performance are innodb_buffer_pool_size and innodb_log_file_size. InnoDB works with data in memory, and all changes to data are performed in memory. In order to survive a crash or system failure, InnoDB logs changes to the InnoDB transaction logs. The size of the InnoDB transaction log defines how many changed blocks we can have in memory for a given period of time. The obvious question is: Why can’t we simply have a gigantic InnoDB transaction log? The answer is that the size of the transaction log affects recovery time after a crash. The bigger the log, the longer the recovery time. …

[Read more]

Mar

2011

Using Flexviews – part two, change data capture

Posted by Justin Swanhart of MySQL Performance Blog on Fri 25 Mar 2011 23:13 UTC
Tags:

Insight for DBAs, Insight for Developers, MySQL

In my previous post I introduced materialized view concepts. This post begins with an introduction to change data capture technology and describes some of the ways in which it can be leveraged for your benefit. This is followed by a description of FlexCDC, the change data capture tool included with Flexviews. It continues with an overview of how to install and run FlexCDC, and concludes with a demonstration of the utility.

As a reminder, the first post covered the following topics:

What is a materialized view (MV)?
It explained that an MV can pre-compute joins and may aggregate and summarize data.
Using the aggregated data can significantly improve query response times compared to accessing the non-aggregated data.
Keeping MVs up-to-date (refreshing) is …

[Read more]
