I just read SQL: Ranking without self join, in which Shlomi Noach shares a nice MySQL-specific trick based on user-defined variables to compute rankings.
Shlomi's trick reminds me somewhat of the trick I came across a little over a year ago to calculate percentiles. At that time, several people pointed out to me too that using user-defined variables in this way can be unreliable.
The problem with user-defined variables
So what is the problem exactly? Well, whenever a query assigns to a variable, and that same variable is read in another part of the query, you're on thin ice. That's because the …
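For reference, the user-variable ranking trick looks something like this; the scores table and its columns are made up for illustration:

```sql
-- Hypothetical table: scores(player VARCHAR(50), score INT)
SET @rank := 0;

-- Increment the variable once per returned row
SELECT @rank := @rank + 1 AS rank,
       player,
       score
FROM scores
ORDER BY score DESC;
```

The thin ice appears as soon as the same variable is both assigned and read within one statement, for example comparing a @prev variable against the current score to handle ties: MySQL does not guarantee the order in which select-list expressions are evaluated.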
Before I even start this post, I am going to repeat our view that Oracle is well aware that it has little to gain from killing off MySQL, and that we expect MySQL to become the scale-out database for non-transactional web applications and to compete with SQL Server in departmental deployments.
That said, there has been some interesting discussion on Twitter this week, in response to the European Commission’s investigation of Oracle-Sun, about whether Oracle could - in theory - kill off MySQL. Here’s a Q+A explaining my view as to how Oracle could kill MySQL but probably won’t, and why MySQL AB’s choice of dual licensing and the GPL has come back to haunt Monty Widenius.
Q. Oracle can’t kill MySQL even if it wants to, because it’s open source. Right?
A. …
[Read more]
Looking at Bug#47310, a feature request that I hear frequently when I talk about partitions, I wrote a comment suggesting triggers as a way to work around the limitation.
The reason for the limitation is that allowing arbitrary
functions for partitioning was too complex and it was provoking
crashes and other unpleasant side effects (see the discussion
under bug#18198).
But if you use a trigger, the resulting column is a plain
integer, and many of the side effects disappear. The drawback is
that you need to add a column to your table, and you need
to use that column when searching for data. With that in mind,
you can implement the workaround quite easily.
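A minimal sketch of that trigger workaround, with made-up table and column names, might look like this: a BEFORE INSERT trigger keeps a derived integer column filled, and partitioning only ever sees that plain integer.

```sql
-- Assumes the table is empty or part_key is backfilled first
ALTER TABLE orders ADD COLUMN part_key INT NOT NULL DEFAULT 0;

DELIMITER //
CREATE TRIGGER orders_bi BEFORE INSERT ON orders
FOR EACH ROW
BEGIN
  -- The "arbitrary function" lives here, not in the partitioning clause
  SET NEW.part_key = MONTH(NEW.created_at);
END//
DELIMITER ;

-- Partitioning then operates on a plain integer column
ALTER TABLE orders PARTITION BY LIST (part_key) (
  PARTITION p_h1 VALUES IN (1, 2, 3, 4, 5, 6),
  PARTITION p_h2 VALUES IN (7, 8, 9, 10, 11, 12)
);
```

As the post notes, queries must then filter on part_key to benefit from partition pruning; also keep in mind that MySQL requires the partitioning column to be part of every unique key on the table.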
[Read more]
USE test;
DROP TABLE IF EXISTS users;
CREATE TABLE users (
  user_id int(10) NOT NULL,
…
Since DTrace was released for Solaris I have been missing it on Linux systems... It can't be included in Linux for the same reason ZFS can't be: licensing. Both ZFS and DTrace are under the CDDL, which is incompatible with the GPL. So you can see DTrace and ZFS on Solaris, FreeBSD, and MacOS, but not on Linux.
However, I have been following the SystemTap project for a couple of years (it was started in 2005), and it is supposed to provide functionality similar to DTrace's.
The reason I am interested in this tool is that there is no simple way under Linux to profile a load that is not CPU-bound (for CPU-bound loads there is OProfile, see for example http://mysqlinsights.blogspot.com/2009/08/oprofile-for-io-bound-apps.html). That is, for I/O-bound loads or for mutex contention problems, OProfile is not that useful. …
I used to develop with MySQL, and those were the golden days. These days I don’t write queries so much. But yesterday I needed to answer this question: are there any issues in our issue-tracking system that meet the following criteria?
- The last two or more emails are from the customer
- These emails were separated by at least two hours (i.e. it wasn’t a single train of thought)
I could do it with all kinds of correlated subqueries and so on, but maybe I could also just do it without them, no? Can this be done with plain old JOINs and GROUP BY? I’m sure you know the answer.
Here’s my approach: group emails by issue, and concatenate the dates they were sent in reverse order. If an email was sent from Percona to the customer, replace the date with a magical OUTBOUND constant. The result might look like this: “2009-09-11 13:17:34,OUTBOUND,…”. I’ll change this to create a good sample …
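Under an assumed schema of emails(issue_id, sent_at, direction), the grouping step described above might be sketched as:

```sql
-- Hypothetical schema: emails(issue_id INT, sent_at DATETIME,
--                              direction ENUM('inbound','outbound'))
SELECT issue_id,
       GROUP_CONCAT(
         IF(direction = 'outbound', 'OUTBOUND', sent_at)
         ORDER BY sent_at DESC
         SEPARATOR ','
       ) AS history
FROM emails
GROUP BY issue_id
-- The two most recent entries contain no OUTBOUND marker,
-- i.e. the last two emails are both from the customer:
HAVING SUBSTRING_INDEX(history, ',', 2) NOT LIKE '%OUTBOUND%';
```

This only covers the first criterion; the two-hour-separation check would still have to be applied to the concatenated dates.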
[Read more]
The EXPLAIN command is an important tool for reviewing how a SQL query is executed, including which indexes are used.
After adding a covering index, I ended up with the following EXPLAIN plan that I was unable to explain. The end result was a boost in server performance, which was the ultimate goal.
mysql> explain select max(md) from e_r where email = 'xxxx@gmail.com' and id = '36981';
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                        |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Select tables …
[Read more]
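For context, a covering index for this query, built only from the columns that appear in the statement (the index name is made up), could be created like so:

```sql
-- Index covers the WHERE columns plus the aggregated column,
-- so the query can be answered from the index alone
ALTER TABLE e_r ADD INDEX idx_email_id_md (email, id, md);
```

With (email, id) leading the index and md as its last part, MAX(md) for a given email/id pair can be read straight off the index, which is what allows the optimizer to skip the table access entirely.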
Recently I was working on a project that used Infobright as the database. The version tested was 3.1.1, on both OpenSolaris and Solaris 10. Infobright is a column-oriented database engine for MySQL, primarily targeted at data warehouse and data mining deployments.
While everything was working as expected, one thing we did notice was that as the number of concurrent connections querying the database grew, performance deteriorated fast, in the sense that not many parallel benefits were being squeezed from the machine. Now this sucks! (Apparently "sucks" is now a technical term.) It sucks because the server definitely has many cores, and typically each Infobright query can at most peg a single core. So the expectation would typically be to at least handle a number of concurrent queries close to the number of cores (figuratively speaking though …
[Read more]
So I was watching the Discovery Channel, about the maintenance of the Golden Gate Bridge. I was surprised to find out that it takes two years to paint the whole thing; then, when they are "finished", the job starts all over again. Basically, this is a never-ending project! I started to think about the larger, several-thousand-server MySQL installations I've worked on and how to improve their performance. Oddly enough, The Project Formerly Known as Golden Gate was born! Thought I would change the name given the recent threat from Big Brother!
-- The forethought
Take an environment with several hundred or, in this case, several thousand servers and figure out a way to increase performance by 30% - 80% or more.
-- Back to basics
De-fragmentation of the data set is a simple way to make sure that your throughput is at its best. Over time, social networking sites, or any site that runs heavy writes with …
What is wrong here (this excerpt from SHOW INNODB STATUS)?
--------------
ROW OPERATIONS
--------------
8 queries inside InnoDB, 9 queries in queue
100 read views open inside InnoDB
The problem is the relationship between active queries (queries inside InnoDB plus queries in the queue, 17 in total) and "read views open inside InnoDB", which is a fancy way of saying "active transactions": here, 100.
Typically you would want this ratio to be close to 1, maybe 2, which would correspond to all active transactions doing active work and spending little time waiting on the application. Waiting transactions are bad because they hold locks as well as other resources, and they prevent the InnoDB purge operation from proceeding.
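To dig into those open read views, you can pull the transaction list directly; the INNODB_TRX table below assumes the InnoDB plugin is in use (on stock builds, fall back to the full status output):

```sql
-- The full status output includes the per-transaction list:
SHOW ENGINE INNODB STATUS\G

-- With the InnoDB plugin, the same list is queryable, oldest first:
SELECT trx_id, trx_state, trx_started, trx_mysql_thread_id
FROM information_schema.INNODB_TRX
ORDER BY trx_started;
```

The oldest transactions at the top of that list are the ones blocking purge; trx_mysql_thread_id ties each back to a connection in SHOW PROCESSLIST.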
If you see something like this in your environment, it often makes sense to check the list of transactions too. Chances are you would see something like:
---TRANSACTION 0 1044183147, …
[Read more]