Showing entries 1 to 10
Displaying posts with tag: GROUP_CONCAT (reset)
GROUP_CONCAT is very slow. So I used FastBit for the table instead! 43 seconds down to .16!

MySQL always tries to make toast of my good ideas. This is the only time is succeeds in making toast (see bug #2). |

This time I'm working on star schema optimzation (which will be my next blog post). MySQL takes about .24 seconds to project out a list of parts that match a filter, but it takes 43 (not .43, FORTY-THREE) seconds to turn it into a CSV list using GROUP_CONCAT.

  select P_PartKey from part where where P_MFGR = 'MFGR#1' or P_MFGR='MFGR#2';
  399597 rows in set (0.24 sec)

  mysql> select group_concat(P_PartKey) into @pkey from part where P_MFGR = 'MFGR …

[Read more]
GROUP_CONCAT is very slow. So I used FastBit for the table instead! 43 seconds down to .16!

MySQL always tries to make toast of my good ideas. This is the only time is succeeds in making toast (see bug #2). |

This time I'm working on star schema optimzation (which will be my next blog post). MySQL takes about .24 seconds to project out a list of parts that match a filter, but it takes 43 (not .43, FORTY-THREE) seconds to turn it into a CSV list using GROUP_CONCAT.

  select P_PartKey from part where where P_MFGR = 'MFGR#1' or P_MFGR='MFGR#2';
  399597 rows in set (0.24 sec)

  mysql> select group_concat(P_PartKey) into @pkey from part where P_MFGR = 'MFGR#1' or   P_MFGR='MFGR#2';
  Query OK, 1 row affected (43.25 sec)

So I decided to just stick the part table into FastBit instead (using my FastBit_UDF tools):
fb_create('/var/lib/fastbit/part', …

[Read more]
The power of MySQL’s GROUP_CONCAT

In the very early days of Percona Vadim wrote very nice post about GROUP_CONCAT.

But I want to show you a bit more about it.

When is GROUP_CONCAT useful? Usually while working with Support customers I recommend it when you have aggregation of many-to-many info. It makes the view simpler and more beautiful and it doesn’t need much effort to make it work.

Some simple examples:

This is a test table:

CREATE TABLE `group_c` (
`parent_id` int(11) DEFAULT NULL,
`child_id` int(11) DEFAULT NULL
) ENGINE=InnoDB;
INSERT INTO …
[Read more]
Removing all databases from a MySQL instance

Many out there will have different ideas about this, some using procs, some using a function, others using a shell script. Well I didn’t want to spend much time on it so decided a group_concat(concat would be enough.
There is no genius, rather laziness :) but what if you have a hundred databases and you want to drop them all?

mysql Thu Mar  3 13:50:06 2011 > pager sed 's/,/ /g'
PAGER set to 'sed 's/,/ /g''
mysql Thu Mar  3 13:50:32 2011 > select group_concat(concat('drop database ',SCHEMA_NAME,';')) from information_schema.schemata where SCHEMA_NAME !='mysql' and SCHEMA_NAME !='information_schema';
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| group_concat(concat('drop database ' SCHEMA_NAME ';')) …
[Read more]
Greatest N per group: top 3 with GROUP_CONCAT()

In my opinion, one of the best things that happened to Planet MySQL lately, is Explain Extended, a blog by Alex Bolenok (also known as Quassnoi on Stackoverflow).

I never had the pleasure of meeting Alex in person, but his articles are always interesting and of high quality, and the SQL wizardry he pulls off is downright inspiring. I really feel humbled by the creativity of some of his solutions and his apparent experience with multiple RDBMS products.

Alex' most recent post is about aggregation, and finding a top 3 based on the aggregate:

In …

[Read more]
MySQL: Another Ranking trick

I just read SQL: Ranking without self join, in which Shlomi Noach shares a nice MySQL-specific trick based on user-defined variables to compute rankings.

Shlomi's trick reminds me somewhat of the trick I came across little over a year ago to caclulate percentiles. At that time, several people pointed out to me too that using user-defined variables in this way can be unreliable.The problem with user-defined variablesSo what is the problem exaclty? Well, whenever a query assigns to a variable, and that same variable is read in another part of the query, you're on thin ice. That's because the …

[Read more]
A Faster MySQL Database Size Google Chart

Abstract - As described by Walter Heck, MySQL database size can be visualized using Google Charts. With a minor code improvement the URL for the chart can be obtained twice as fast. With some more modification, the number of lines can be cut down resulting in a function that is half as long.

Hi!It's been a while since I posted - I admit I'm struggling for a bit to balance time and attention to the day job, writing a book, preparing my talks for the MySQL user's conference and of course family life.

A month ago or so I …

[Read more]
A fast, single pass method to calculate the median in MySQL

After stepping off of the GROUP_CONCAT() solution for calculating quantiles I figured it would be nice to find a better way to calculate the median too.
Solution
I previously wrote on how to calculate the median using GROUP_CONCAT(), but I think that this is a better way:


SELECT AVG(length) AS median -- take the average of left and right median
, MIN(length) AS left_median --
, MAX(length) AS right_median --
, @l AS left_median_position --
, @r AS right_median_position --
FROM (
SELECT @n, length -- @n is just here to …
[Read more]
Calculating Percentiles with MySQL, Round 2

My previous post on calculating percentiles with MySQL generated some comments and good discussion. In particular, I got some very interesting comments from Vladimir.

Basically, Vladimir was doubtful whether the GROUP_CONCAT() solution would be optimal in comparison to a JOIN. His proposal is to solve it like this:


SELECT SUM(g1.r) sr
, g2.length l
, SUM(g1.r)/(SELECT COUNT(*) FROM film) p
FROM (SELECT COUNT(*) r, length FROM film GROUP BY length) g1
JOIN (SELECT COUNT(*) r, length FROM film GROUP BY length) g2
ON g1.length < g2.length
GROUP BY g2.length
HAVING p > 0.9
ORDER BY p
LIMIT 1


First, this query sets up two identical subqueries in …

[Read more]
Calculating the Nth percentile in MySQL

Yesterday, I was on the freenode ##pentaho irc channel when Andres Chaves asked me how to calculate the Nth percentile in MySQL. He saw a solution somewhere using subqueries, but wasn't too happy about it.

A while ago I wrote about calulating the median in MySQL, and it turns out the Nth percentile can be calculated using a similar, single-pass approach, not relying on subqueries, UDFs, or user-defined variables.
The percentile....
So, what is a percentile exactly? Here's what the wikipedia says:

A percentile is the value of a variable below which a certain percent of observations fall. So the 20th percentile is the value (or score) below which 20 …

[Read more]
Showing entries 1 to 10