Showing entries 1 to 2
Displaying posts with tag: SUBSTRING_INDEX (reset)
A fun use of SUBSTRING_INDEX and friends in MySQL

I used to develop with MySQL, and those were the golden days. These days I don’t write queries so much. But yesterday I needed to answer this question: are there any issues in our issue-tracking system that meet the following criteria?

  • The last two or more emails are from the customer
  • These emails were separated by at least two hours (i.e. it wasn’t a single train of thought)

I could do it with all kinds of correlated subqueries and so on — but maybe I could also just do it without them, no? Can this be done with plain old JOINS and GROUP BY? I’m sure you know the answer.

Here’s my approach: group emails by issue, and concatenate the dates they were sent in reverse order. If an email was sent from Percona to the customer, replace the date with a magical OUTBOUND constant. The result might look like this: “2009-09-11 13:17:34,OUTBOUND,…”. I’ll change this to create a good sample …

[Read more]
Calculating the Nth percentile in MySQL

Yesterday, I was on the freenode ##pentaho irc channel when Andres Chaves asked me how to calculate the Nth percentile in MySQL. He saw a solution somewhere using subqueries, but wasn't too happy about it.

A while ago I wrote about calulating the median in MySQL, and it turns out the Nth percentile can be calculated using a similar, single-pass approach, not relying on subqueries, UDFs, or user-defined variables.
The percentile....
So, what is a percentile exactly? Here's what the wikipedia says:

A percentile is the value of a variable below which a certain percent of observations fall. So the 20th percentile is the value (or score) below which 20 …

[Read more]
Showing entries 1 to 2