I just read SQL: Ranking without self join, in which Shlomi Noach shares a nice MySQL-specific trick based on user-defined variables to compute rankings. Shlomi's trick reminds me somewhat of the trick I came across a little over a year ago to calculate percentiles. At that time, several people also pointed out to me that using user-defined variables in this way can be unreliable.
The problem with user-defined variables
So what is the problem exactly? Well, whenever a query assigns to a variable, and that same variable is read in another part of the query, you're on thin ice. That's because the …
I used to develop with MySQL, and those were the golden days. These days I don’t write queries so much. But yesterday I needed to answer this question: are there any issues in our issue-tracking system that meet the following criteria?
- The last two or more emails are from the customer
- These emails were separated by at least two hours (i.e. it wasn’t a single train of thought)
I could do it with all kinds of correlated subqueries and so on, but maybe I could also do it without them. Can this be done with plain old JOINs and GROUP BY? I’m sure you know the answer.
Here’s my approach: group emails by issue, and concatenate the dates they were sent in reverse order. If an email was sent from Percona to the customer, replace the date with a magical OUTBOUND constant. The result might look like this: “2009-09-11 13:17:34,OUTBOUND,…”. I’ll change this to create a good sample …
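The grouping step described above can be sketched outside the database. This is a minimal Python illustration, not the author's query: the sample rows, the `inbound`/`outbound` labels, and the function names `build_trails` and `needs_attention` are all hypothetical. In MySQL the same string would presumably come from `GROUP_CONCAT` with an `ORDER BY ... DESC` clause. The final check is one possible reading of the two criteria (the two newest emails are both from the customer and at least two hours apart), and is an assumption rather than the continuation of the post.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical sample rows: (issue_id, sent_at, direction).
EMAILS = [
    (1, "2009-09-11 09:02:10", "inbound"),
    (1, "2009-09-11 10:15:00", "outbound"),
    (1, "2009-09-11 13:17:34", "inbound"),
    (2, "2009-09-12 08:00:00", "outbound"),
    (2, "2009-09-12 09:00:00", "inbound"),
    (2, "2009-09-12 12:30:00", "inbound"),
    (3, "2009-09-13 09:00:00", "outbound"),
    (3, "2009-09-13 10:00:00", "inbound"),
    (3, "2009-09-13 10:20:00", "inbound"),
]


def build_trails(emails):
    """Group emails by issue and join their dates newest-first,
    replacing the date of every outbound email with 'OUTBOUND'."""
    by_issue = defaultdict(list)
    for issue_id, sent_at, direction in emails:
        by_issue[issue_id].append((sent_at, direction))

    trails = {}
    for issue_id, rows in by_issue.items():
        # 'YYYY-MM-DD HH:MM:SS' strings sort chronologically as text.
        rows.sort(key=lambda row: row[0], reverse=True)
        trails[issue_id] = ",".join(
            "OUTBOUND" if direction == "outbound" else sent_at
            for sent_at, direction in rows
        )
    return trails


def needs_attention(trail, min_gap=timedelta(hours=2)):
    """Assumed reading of the criteria: the two newest entries are both
    customer emails and are at least min_gap apart."""
    parts = trail.split(",")
    if len(parts) < 2 or "OUTBOUND" in parts[:2]:
        return False
    newest, previous = (
        datetime.strptime(p, "%Y-%m-%d %H:%M:%S") for p in parts[:2]
    )
    return newest - previous >= min_gap


trails = build_trails(EMAILS)
flagged = [issue for issue, trail in sorted(trails.items()) if needs_attention(trail)]
```

With this sample data, issue 1 produces a trail that starts with a customer date followed by `OUTBOUND`, so it is not flagged; only issue 2 has two trailing customer emails more than two hours apart.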