MySQL’s TO_DAYS function

I’ll try to clear up possible confusion about dates and MySQL’s TO_DAYS() function. TO_DAYS() works, correctly. TO_DAYS() assumes a year 0, correctly. There are indeed bugs, but only if you use MySQL extensions with partitions.

First: TO_DAYS() works, correctly. The MySQL Reference Manual says we use a proleptic Gregorian calendar, and that’s all explained with terms anyone can follow, in section “11.7. What Calendar Is Used By MySQL?”
http://dev.mysql.com/doc/refman/5.1/en/mysql-calendar.html
So we follow the Gregorian rules, not the Julian ones, without concerning ourselves about what happened in 1582 — just like DB2, just like the standard, not like Oracle. To make sure it’s okay, I wrote and ran this stored procedure:

DELIMITER //
SET @@sql_mode=ansi//
DROP procedure IF EXISTS p//
CREATE PROCEDURE p ()
BEGIN
  DECLARE days_in_month INT DEFAULT 31;
  DECLARE months_in_year INT DEFAULT 12;
  DECLARE expected_to_days INT;
  DECLARE pyear INT DEFAULT 1;
  DECLARE pmonth INT DEFAULT 1;
  DECLARE pday INT DEFAULT 1;
  DECLARE cyear VARCHAR(4);
  DECLARE year_month_day CHAR(10);

  /* Start with DATE string = '0001-01-01', the minimum legal value */
  SET year_month_day = pyear || '-' || pmonth || '-' || pday;
  /* Start with INT = 366 i.e. 366 days since '0000-01-01' */
  SET expected_to_days = 366;
  WHILE year_month_day < '9999-12-31' DO
    IF TO_DAYS(year_month_day) <> expected_to_days THEN
      SELECT 'error', year_month_day,TO_DAYS(year_month_day), expected_to_days;
      SET pyear = 9999; SET pmonth = 12; SET pday = 30;
      END IF;
    SET pday = pday + 1;
    IF pday > days_in_month THEN
      /* Overflowed days in month. Go forward 1 month. */
      SET pday = 1;
      SET pmonth = pmonth + 1;
      IF pmonth > months_in_year THEN
        /* Overflowed months in year. Go forward 1 year. */
        SET pmonth = 1;
        SET pyear = pyear + 1;
        END IF;
      IF pmonth = 1 THEN SET days_in_month = 31; END IF; /* January */
      IF pmonth = 2 THEN
        IF pyear MOD 4 = 0 AND (pyear MOD 100 <> 0 OR pyear MOD 400 = 0) THEN
          SET days_in_month = 29;
        ELSE
          SET days_in_month = 28;
          END IF;
        END IF;                                          /* February */
      IF pmonth = 3 THEN SET days_in_month = 31; END IF; /* March */
      IF pmonth = 4 THEN SET days_in_month = 30; END IF; /* April */
      IF pmonth = 5 THEN SET days_in_month = 31; END IF; /* May */
      IF pmonth = 6 THEN SET days_in_month = 30; END IF; /* June */
      IF pmonth = 7 THEN SET days_in_month = 31; END IF; /* July */
      IF pmonth = 8 THEN SET days_in_month = 31; END IF; /* August */
      IF pmonth = 9 THEN SET days_in_month = 30; END IF; /* September */
      IF pmonth =10 THEN SET days_in_month = 31; END IF; /* October */
      IF pmonth =11 THEN SET days_in_month = 30; END IF; /* November */
      IF pmonth =12 THEN SET days_in_month = 31; END IF; /* December */
      END IF;
    SET cyear = pyear;
    WHILE LENGTH(cyear) < 4 DO SET cyear = '0' || cyear; END WHILE; /* pad */
    SET year_month_day = cyear || '-' || pmonth || '-' || pday;
    SET expected_to_days = expected_to_days + 1;
    END WHILE;
  END//
CALL p()//

The procedure ranges over all valid dates between ‘0001-01-01′ and ‘9999-12-30′, while incrementing an integer that begins at 366. Each time the day goes up, the procedure calls TO_DAYS() and compares to the integer that we’re incrementing. If there’s an error, it prints ‘error’.

It doesn’t print ‘error’. Therefore there’s no error. QED.

This isn’t as comprehensive as what MySQL’s test suite does, but I like to do straightforward procedures so it’s obvious there’s nothing up my sleeve.

Second: TO_DAYS() assumes a year 0, correctly.

Astute readers may have noticed that I started with the assumption that TO_DAYS(’0001-01-01′ = 366. That’s correct because there was a year 0. I should explain that 0000 is 100% valid according to astronomical year numbering. Read your wikipedia: http://en.wikipedia.org/wiki/ISO_8601. Yes, Pope Gregory XIII didn’t use it, and yes, MySQL’s TO_DAYS(’0000-02-29′) etc. won’t use it, but we didn’t define those as valid dates for TO_DAYS(), so its only job is to be correct starting with 0001-01-01.

So, given that the MySQL manual says the start was really in year 0, the value of TO_DAYS(’0001-01-01′) should be the number of days in year 0. And how many is that according to Gregory’s leap-year rules? Well, 0 is divisible by 4, it’s divisible by 100, it’s divisible by 400. So it’s a leap year. So 366 is the right start.

Third: TO_DAYS() has associated bugs, but only if you use MySQL extensions.

Although this is basically a happy story, I do have to say that TO_DAYS works rather ungregorianly for dates which are outside the range 0001-01-01 to 9999-12-31, or dates which have MySQL’s extension that they can have day = 0, e.g. ‘2008-10-00′. This does result in the only legitimate bugs concerning TO_DAYS in the bugs.mysql.com
database:
Bug#20577 Partitions: use of to_days() function leads to selection failures
Bug#40972 Partition pruning can lead to crash for bad dates
We probably won’t fix Bug#20577 during the lifetime of MySQL version 5.1, but Bug#40972 (which is more serious) already has a fix available now. There also feature requests in bugs.mysql.com for extending the date range (Bug#8488, Bug#17902). I myself would love to see it go down to -4713-11-24 i.e. Julian day 0, but the 6.x worklog tasks don’t include such a feature.

I recommend the following for anyone who wants to know about date handling with various DBMSs:

“The Oracle Calendar” by Peter Gulutzan and Trudy Pelzer
http://www.orafaq.net/papers/dates_o.doc

“Dates in SQL” by Peter Gulutzan and Trudy Pelzer
http://www.informit.com/articles/article.aspx?p=30939