|Previous 30 Newer Entries||Showing entries 31 to 60 of 60|
I’m evidencing an increase in the planning, confidence & execution for MyISAM to InnoDB migration.
How much can a single consultant observe? I agree Oracle should not go to PR based on my experience. But I find that:
This is the way I see it. No doubt, the Oracle/Sun deal made its impact. The fact that InnoDB is no longer a 3rd party; the fact Oracle invests in InnoDB and no other engine (Falcon is down, no real development on MyISAM); the fact InnoDB is to be the default engine: all these put companies at ease with[Read more...]
So this is about a SELECT COUNT(*) FROM tblname without a WHERE clause. MyISAM has an optimisation for that since it maintains a rowcount for each table. InnoDB and PBXT can’t do that (at least not easily) because of their multi-versioned nature… different transactions may see a different number of rows for the table table!
So, it’s kinda known but nevertheless often ignored that this operation on InnoDB is costly in terms of time; what InnoDB has to do to figure out the exact number of rows is scan the primary key and just tally. Of course it’s faster if it doesn’t have to read a lot of the blocks from disk (i.e. smaller dataset or a large enough buffer pool).
I was curious about PBXT’s performance on this, and behold it appears to be quite a bit faster! For a table with 50 million rows, PBXT[Read more...]
Some time ago I attended the "Optimisation by Design" course from Open Query¹. In it, Arjen teaches how writing better queries and schemas can make your database access much faster (and more reliable). One such way of optimising things is by adding appropriate query hints or flags. These hints are magic strings that control how a server executes a query or how it returns results.
An example of such a hint is SQL_CALC_FOUND_ROWS. You use it in a select query with a LIMIT clause. It instructs the server to select a limited numbers of rows, but also to calculate the total number of rows that would have been returned without the limit clause in place. That total number of rows is stored in a session variable, which can be retrieved via SELECT FOUND_ROWS(); That simply reads the variable and clears it on the server, it doesn't actually have to look at any table or[Read more...]
Actually this is (almost) all I wanted to say. This is intentionally posted with all related keywords in title, in the hope that a related search on Google will result with this post on first page.
I’m just still encountering companies who use MyISAM as their storage engine and are unaware that their nightly backup actually blocks their application, basically rendering their product unavailable for long minutes to hours on a nightly basis.
So this is posted as a warning for those who were not aware of this fact.
There is no hot (non blocking) backup for MyISAM. Closest would be file system snapshot, but even this requires flushing of tables, which may take a while to complete. If you must have a hot backup, then either use replication – and take the risk of the slave not being in complete sync with the master – or use another storage engine, i.e. InnoDB.
A few weeks ago, we had a query optimization request from one of our customer.
The query was very simple like:PLAIN TEXT CODE:
This column in the table is looks like this:PLAIN TEXT CODE:
The table have 549252 rows and of course, there is an index on the col1. MySQL estimated the cardinality of that index as 87, though what was of course misleading as index cardinality in this case can't be over 9, as there is only 8(+ NULL) different possible values for this column.PLAIN TEXT CODE: [Read more...]
I have not caused a fist fight in a while, so it's time to take off the gloves. I claim that somewhere around of 99% of advice about tuning MySQL's key cache hit ratio is wrong, even when you hear it from experts. There are two major problems with the key buffer hit ratio, and a host of smaller ones. If you make some assumptions that are very hard to prove, there actually is a very limited use for the statistics from which the ratio is derived (but not the ratio itself, which you should ignore). Read on for the details.
In this article, I will use key buffer, key_buffer, and key cache interchangeably. However, I will be careful about the difference between "rate" and "ratio". In this article, the key cache miss rate is defined as the number of misses per unit of time, with the units of operations per second. The key cache miss[Read more...]
I've had MySQL on my Windows 7 laptop for a bit (as part of wampserver), mostly for local offline WordPress development.
However, even though MySQL is relatively stable, I've been observing a vast quantity of intermittent MySQL errors, as reported by WordPress in the PHP error log (C:\wamp\logs\php_error.log). Here are some examples:
[05-Jan-2010 09:47:51] WordPress database error Error on delete of 'C:\Windows\TEMP\#sql17e0_1a2_6.MYD' (Errcode: 13) for query SELECT t.*, tt.* FROM wp_terms AS t INNER JOIN wp_term_taxonomy AS tt ON tt.term_id = t.term_id INNER JOIN wp_term_relationships AS tr ON tr.term_taxonomy_id = tt.term_taxonomy_id WHERE tt.taxonomy IN[Read more...]
Drizzle originally inherited the FRM file from MySQL (which inherited it from UNIREG). The FRM file stores metadata about a table; what columns it has, what type those columns are, what indexes, any default values, comments etc are all stored in the FRM. In the days of MyISAM, this worked relatively well. The row data was stored in table.MYD, indexes on top of it in table.MYI and information about the format of the row was
in table.FRM. Since MyISAM itself wasn’t crash safe, it didn’t really matter if creating/deleting the FRM file along with the table was either.
As more sophisticated engines were introduced (e.g. InnoDB) that had their own data dictionary, there started to be more of a problem. There were now two places storing information about a table: the FRM file and the data dictionary specific to the engine. Even if the data[Read more...]
I had an interesting case recently. The customer dealing with large MySQL data warehouse had the table which was had data merged into it with INSERT ON DUPLICATE KEY UPDATE statements. The performance was extremely slow. I turned out it is caused by hundreds of daily partitions created for this table. What is the most interesting (and surprising) not every statement is affected equally as you can see from the benchmarks above:
I got the following test table created:PLAIN TEXT SQL:
In my previous post I looked into how large table_cache actually can decrease performance. The "miss" path is getting more expensive very quickly as table cache growths so if you're going to have high miss ratio anyway you're better off with small table cache.
What I have not checked though is how does table_cache (or table_open_cache in newer version) size affects the hit path.
I started with the same test as last time - created 100000 tables and read all of them at once to make sure all table cache entries are populated. When I tried repeatedly reading 1000 empty tables with table_cache of 20000 and 2000. With Table Cache of 2000 I got about 16400 selects/sec with Table Cache of 20000 13500 selects/sec. So there is some slow[Read more...]
I turned on the skip-name-resolve parameter in the my.cnf and the Throughput of MySQL grew considerably, almost more than double.
Since I started doing the occasional consulting job for Open Query, I've seen a lot of MySQL servers that have been installed once and then forgotten about. This gave me the idea to do a short presentation about some basic MySQL server configuration. The first go was at DrupalCampMelbourne and I recently tried (and failed) to cram it into a three minute lightning talk slot at the LUV September meeting.
The title of the talk is (now) MySQL > YourSQL. I chose this not because I think that MySQL is better than the $other_database you use or because I may or may not run a newer version of MySQL on better hardware, but because I use InnoDB and not MyISAM as the default table[Read more...]
Recently I was doing a little work for a client who has MyISAM tables with many columns (the same one Peter wrote about recently). The client's performance is suffering in part because of the number of columns, which is over 200. The queries are generally pretty simple (sums of columns), but they're ad-hoc (can access any columns) and it seems tailor-made for a column-oriented database.
I decided it was time to actually give Infobright a try. They have an open-source community edition, which is crippled but not enough to matter for this test. The "Knowledge Grid" architecture seems ideal for the types of queries the client runs. But hey, why not also try[Read more...]
MySQL backup soon becomes an important matter when the database is used in production. The pain-point comes from the fact that while backuping the database is not available to respond to client requests anymore. With mysqldump - the standard tool for performing MySQL backups - and a large database the operation can go over many tenth of minutes if not hours. If I am running my business on line this is simply not acceptable.
classical approach to workaround this problem
is to take advantage of MySQL replication. I set up a master/slave
configuration where the slave acts as copy of the master. Then, when
needed, I run mysqldump on the
slave without any service interruption on the master.
But ZFS snapshosts bring a new straightforward approach that avoids the pain and the complexity of a master/slave replication.
It is pretty understood the tables which have long rows tend to be slower than tables with short rows. I was interested to check if the row length is the only thing what matters or if number of columns we have to work with also have an important role. I was interested in peak row processing speed so I looked at full table scan in case data fits in OS cache completely. I created 3 tables - First containing single tinyint column which is almost shortest type possible (CHAR(0) could be taking less space), table with 1 tinyint column and char(99) column and table with 100 tinyint columns. The former two tables have the same row length but have number of column different 50 times. Finally I have created 4th table which is also 100 columns but one of them is VARCHAR causes raw format to be dynamic.
More specially:[Read more...]
I spent Friday examining the systems for a more traditional consulting gig (in case you did not know, Pythian has offered more traditional consulting and emergency services in addition to our remote DBA offering for our entire 12 year history). It is a familiar story to many people — the database performance was unacceptable.
The company had a few log tables that had huge amounts of inserts and foreign keys, so they used InnoDB. Unfortunately, they also used a GUID as a primary key (which is getting more and more common these days, but long primary keys with InnoDB slow down
DELETE commands a great deal) —
varchar(32) and utf8.
That’s right — their primary key for many of these tables was 96 bytes long (32 characters * 3 bytes per[Read more...]
It is rather typical for systems to start as MyISAM but as system growths to move to Innodb. The reason of the move could be just desire for better data consistency guaranty or being bitten repairing multiple GB MyISAM table few times, though Table Locks is probably the most important issue - with modern multi core servers not only the fact you can't well mix SELECTs and UPDATEs but also the fact only one update can be happening at the time can be the problem, not to mention Key Cache which often becomes serious contention issue.
The problem we often run into during migration is Full Text Search indexes which are not[Read more...]
Suppose you have a MyISAM table containing a column with a full text index. This table starts to grow to a significant size (millions of rows) and gets updated fairly frequently. Chances are that you’ll start to see some bottlenecks when accessing this table, since without row level locking, the reading and writing operations will be blocking each other.
A solution that many people would suggest right away is to use the master for writes and a slave for reads, but this only masks the problem, and it won’t take long before enough read traffic on the slave starts causing slave lags.
The main difference between the Sphinx search engine and other alternatives is its close integration with MySQL. For example, it can be used as a storage[Read more...]
One of the ways to import data into MySQL is using the LOAD DATA INFILE. It is a faster method than recovering from a dump, as it’s raw data instead of SQL sentences.
The import time depends on the table engine, for example, MyISAM can be 40 times faster than Innodb. Let’s benchmark this:
I’m gonna make some benchmarking using MySQL 5.1.36 (64 bits MacOS X). I’ll need a big table, so I’ll take City from the World Database and create a huge table called “city_huge”:
CREATE TABLE city_huge LIKE CITY; INSERT INTO city_huge SELECT NULL, name, CountryCode, District, Population FROM city; # Run this sentence 100 times, # so city_huge table will be 100 times bigger than city. # Tip: use a script, temporary table,[Read more...]
Seen in #maatkit on Freenode:
I never realized just how terrible recovering MyISAM from a crash can be
Sad but true — it can be pretty painful. This is one of the reasons I pretty much recommend InnoDB (okay, okay, XtraDB) for most data unless it’s read-only.
Related posts brought to you by Yet Another Related Posts Plugin.
As I've wrote few times using multiple key caches is a great way to get CPU scalability if you're using MyISAM. It is however very annoying - this feature really looks half baked to me.
The problem with multiple key caches and mapping of tables to the different files is - there is no way to see the existing key caches, their mapping and stats. The only thing you can access is key cache parameters - via structured variables
In particular I would like to:
See the list of created caches Right now I can create key caches with random names causing invisible resource consumption. It is possible to make an error in key[Read more...]
Assume you have a table with about 300 000 rows, and an indexed column ‘col1′ with only 9 distinct values. Now you got a query like ’select * from t1 where col1 = const’. The questions are
- when the index is faster to full table scan and vice versa?
- does MySQL use the optimal plan by default?
These questions became very relevant now that QOT got server access and is able to gather various table metrics including selectivity. Besides index selectivity the threshold value obviously depends on the storage engine used, so for me it is also interesting to see how our PBXT engine compares to others in this aspect. Namely to InnoDB - an engine with similar transactional properties and MyISAM - a very fast engine for read-only scenarios.
For the test I took[Read more...]
InnoDB and MyISAM use B+ and B trees for indexes (InnoDB also has internal hash index).
In both these structures, the depth of the index is an important factor. When looking for an indexed row, a search is made on the index, from root to leaves.
Assuming the index is not in memory, the depth of the index represents the minimal cost (in I/O operation) for an index based lookup. Of course, most of the time we expect large portions of the indexes to be cached in memory. Even so, the depth of the index is an important factor. The deeper the index is, the worse it performs: there are simply more lookups on index nodes.
What affects the depth of an index?
There are quite a few structural issues, but it boils down to two important factors:
I get to hear that a lot: that LOCK TABLES with MyISAM is some sort of replacement for transactions; some model we can work with which gives us ‘transactional flavor’.
It isn’t, and here’s why.
When we speak of a transactional database/engine, we check out its ACID compliance. Let’s break out the ACID and see what LOCK TABLES provides us with:
This is a short note that the MySQL User Group Meetings in Israel are established (well, re-established after a very long period).
Thanks to Eddy Resnick from Sun Microsystems Israel who has set up the meetings. So far, we’ve had 2 successful meetings, and we intend to have more! First one was in Sun’s offices in Herzlia; second one, held last week, was at Interbit (a MySQL training center) in Ramat Gan. We hope to hold these meetings on a monthly basis, and the next ones are expected to be held at Interbit.
A new (blessed) law in Israel forbids us from sending invitations for these meetings via email without prior consent of the recepient (this law has passed as means of stopping spam). We do realize there are many users out there who would be interested in these meeting. For those users: please stay tuned to[Read more...]
|Previous 30 Newer Entries||Showing entries 31 to 60 of 60|