Showing entries 1 to 6
Displaying posts with tag: Percentile (reset)

In my opinion, one of the best things that happened to Planet MySQL lately, is Explain Extended, a blog by Alex Bolenok (also known as Quassnoi on Stackoverflow).

I never had the pleasure of meeting Alex in person, but his articles are always interesting and of high quality, and the SQL wizardry he pulls off is downright inspiring. I really feel humbled by the creativity of some of his solutions and his apparent experience with multiple RDBMS products.

Alex' most recent post is about aggregation, and finding a top 3 based on the aggregate:

In …

I just read SQL: Ranking without self join, in which Shlomi Noach shares a nice MySQL-specific trick based on user-defined variables to compute rankings.

Shlomi's trick reminds me somewhat of the trick I came across little over a year ago to caclulate percentiles. At that time, several people pointed out to me too that using user-defined variables in this way can be unreliable.The problem with user-defined variablesSo what is the problem exaclty? Well, whenever a query assigns to a variable, and that same variable is read in another part of the query, you're on thin ice. That's because the …

After stepping off of the `GROUP_CONCAT()` solution for calculating quantiles I figured it would be nice to find a better way to calculate the median too.
Solution
I previously wrote on how to calculate the median using `GROUP_CONCAT()`, but I think that this is a better way:

`SELECT     AVG(length) AS median                 -- take the average of left and right median,          MIN(length) AS left_median            -- ,          MAX(length) AS right_median           --,          @l          AS left_median_position   --,          @r          AS right_median_position  --FROM (     SELECT     @n, length                       -- @n is just here to …`

Are you getting fed up yet with my posts on calculating percentiles? Well, I'm sorry but I'm not quite finished.

Here's a simple, fast method to calculate the specified number of quantiles:

`-- set the number of quantiles, for exmple:--   quartiles: 4--   deciles: 10--   percentiles: 100SET @quantiles:=4;  -- select quartiles-- calculate all quantiles-- --SELECT     amount                     AS metric,          @n DIV (@c DIV @quantiles) AS quantile,          @n                         AS NFROM       sakila.paymentCROSS JOIN (            SELECT @n:=0                           -- rownumber            ,      @c:=COUNT(*) …`

My previous post on calculating percentiles with MySQL generated some comments and good discussion. In particular, I got some very interesting comments from Vladimir.

Basically, Vladimir was doubtful whether the `GROUP_CONCAT()` solution would be optimal in comparison to a `JOIN`. His proposal is to solve it like this:

```SELECT   SUM(g1.r) sr,        g2.length l,        SUM(g1.r)/(SELECT COUNT(*) FROM film) pFROM    (SELECT COUNT(*) r, length FROM film GROUP BY length) g1JOIN    (SELECT COUNT(*) r, length FROM film GROUP BY length) g2ON       g1.length < g2.lengthGROUP BY g2.lengthHAVING p > 0.9ORDER BY pLIMIT 1
```

First, this query sets up two identical subqueries …