You often see databases with huge dynamic text fields, such as VARCHAR(255), TEXT, or as I recently was allowed to see the blanket use of LONGTEXT (max 4GiB) in order to be invulnerable from all contingencies. Things getting even worse when an index is used over such columns, because hey, there is an index. It makes things fast :-) Okay, jokes aside. Often you can save a lot of space and time, MySQL spends traversing the index when using a proper column type and index size.
Even in times of a growing market of specialized NoSQL databases, the relevance of traditional RDBMS doesn't decline. Especially when it comes to the calculation of aggregates based on complex data sets that can not be processed as a batch like Map&Reduce. MySQL is already bringing in a handful of aggregate functions that can be useful for a statistical analysis. The best known of this type are certainly:
Since I use MySQL for the statistical analysis on a project, I wanted to optimize the database queries and learned a lot about stuff like number theory, set theory and partial sums. I took my MySQL UDF, I've published two years ago, for this purpose and added new functions for a deeper statistical analysis. The project is around for a while, so it's time to share things with the public to start a discussion of how things could be further optimized. The source and a small documentation can be found on Github: