Most analytical and BI databases have date dimension table(s). One frequently needs to generate and populate such data. I present a solution below for such data generation, written in Python. Please use different database drivers/modules to connect to your specific database server (MySQL, SQL Server, Oracle, etc.) for data population.
1. It takes 2 parameters, start date and end date, in YYYYMMDD format, inclusive. Extensive error checking is built in, but let me know if you have comments/suggestions;
2. The script produce a Python dictionary (associated array) and print out its content;
3. The output includes dayNumber: a day’s position in a year. For example, 2011-02-01 is the 32ed day in 2011, therefore its dayNumber is 32;
4. The output includes weekNumber: a week’s position in a year. The week number in year is based[Read more...]
When I was designing web sites, for a long time I wrote my HTML code the “hard” way – by opening a text editor and manually typing in the code (this was before I purchased Adobe DreamWeaver).
During that time of manual HTML writing, I had a project that required forms on a web page, and I needed a place to store the information. After talking with a few tech friends, I decided to use MySQL as my web site database, and Perl as my scripting language.
I had written complex Bourne shell scripts before, but Perl was something entirely new. With a little help from a buddy of mine, after a few hours I was off and running. I was amazed at how easy it was to connect to a MySQL database with Perl.
This example will show you how to[Read more...]
I’ve done some benchmark testing of 2 Python modules for MySQL data retrieval: MySQLdb and pyhs. MySQLdb uses MySQL’s client libraries, whereas pyhs uses HandlerSocket that bypasses MySQL’s client layer and interfaces Innodb storage engine’s files directly. In my testing, HandlerSocket results in 82% improvement over mysql client libraries based on number of rows retrieved. The tests were conducted under different conditions: right after a start when cache is cold, a warmed up cache after running SELECT * FROM customer, and alternating the execution order of those 2 Python files.[Read more...]
1. The script prints out elapsed time since transaction started, MySQL thread id, and the kill statement for transactions running longer than a defined threshold value, in seconds. Just copy, paster, and then execute the kill statement if you want to terminate the long transaction(s);
2. Adjust shellCmd variable;
3. Adjust longRunningThreshold value as needed. It is measured in seconds;
4. No special libraries/modules needed, as long as there is a working mysql client;
5. re module is used for regex processing. Good place to find examples of regular expression search and grouping. A status variable is used to assist locating MySQL thread id once a transaction running longer than the defined threshold is found.
import re, shlex, subprocess def runCmd(cmd): proc = subprocess.Popen(shlex.split(cmd),[Read more...]
Quick update to the framework that was released yesterday; I’ve added automatic graph generation. I chose DyGraphs due to the quick ability to enable support – the HTML is very quick and simply loads the CSV data. It has the same zooming features of Highcharts without the JS overhead.
Now when you run a load test you will get (in the output directory) a mixture of files: the main cumulative CSV and HTML file for the hostname that was tested, and then one CSV and HTML per report variable that was tested. This means you don’t have to drag the main CSV file into an alternate program or spend time parsing out certain variables one at a time to generate specific graphs. I’ve also added support for limiting output of SNMP variables (LOAD,CPU,MEM). Head over here and download the update:[Read more...]
It seems that everyone loves load testing these days. Problem is that everyone is using their own quick scripts, simple or complex, to drive their tests without the ability for other DBAs to duplicate those tests. Let’s say I write a great test and share my results and graphs on the blog – you want to run the same tests to see how your new DB servers compare in performance: this framework allows you to do that without duplicating any work or writing code. This is a basic release that will get the ball rolling. I’ve included some sample tests in the README file, so give them a try.
This codebase offers a user friendly framework for creating and visualizing MySQL database load test jobs. It is based around Sysbench, which is generally considered the industry standard load test application. The framework allows you to do the following:
So I’ve been doing a fair number of automated load tests these past six months. Primarily with Sysbench, which is a fine, fine tool. First I started using some simple bash based loop controls to automate my overnight testing, but as usually happens with shell scripts they grew unwieldy and I rewrote them in python. Now I have some flexible and easily configurable code for sysbench based MySQL benchmarking to offer the community. I’ve always been a fan of giving back to such a helpful group of people – you’ll never hear me complain about “my time isn’t free”. So, let me know what you want in an ideal testing environment (from a load testing framework automation standpoint) and I’ll integrate it into my existing framework and then release it via the BSD license. The main goal here is to have a standardized modular framework, based on sysbench,[Read more...]
I tried Python out a while ago, but stopped trying it to learn it after some major frustrations. Maybe I didn’t dig deep enough into it. I found the documentation hard to read, and the module layout seemed a little random at times. For some reason I found executing an external process and getting the results to be a little convoluted. (Since then I’ve learned to use popen(..).communicate())
I ended up messing with other languages to try to find one that suits my tastes, like Erlang and[Read more...]
This is a reply to the blog post Poor man’s MySQL replication monitoring. Haidong Ji had a few problems using MySQLdb (could use the ‘dict’ cursor) and apparently he doesn’t want to much dependencies. I agree that using the mysql client tool is a nice alternative if you don’t want to use any 3rd party Python modules. And the MySQL client tools are usually and should be installed with the server.
However, since MySQL Connector/Python only needs itself and Python, dependencies are reduced to a minimum.[Read more...]