<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">

<channel>
  <title>Planet MySQL</title>
  <link>http://www.planetmysql.org/</link>
  <pubDate>Sat, 07 Nov 2009 12:30:02 +0000</pubDate>
  <language>en</language>
  <description>Planet MySQL - http://www.planetmysql.org/</description>

  <item>
    <title>New script speeds up Kontrollbase login by 50%</title>
    <guid isPermaLink="false">http://kontrollsoft.com/?p=516</guid>
    <link>http://feedproxy.google.com/~r/Kontrollsoft/~3/Z5zEml-cN2c/516</link>
    <description>There’s a new script in Kontrollbase that speeds up the login process by up to 50%. I highly recommend that every user upgrade to the latest version of the svn release. Otherwise you can grab the file here: http://code.google.com/p/kontrollbase/source/browse/trunk/bin/kontroll-query_cache_preload.pl
Read more about the script and how it works here: http://kontrollsoft.com/forum/kontrollbase-issues-and-solutions/query-cache-preload-script-speeds-up-login-by-50#p22
</description>
    <content:encoded><![CDATA[There&#8217;s a new script in Kontrollbase that speeds up the login process by up to 50%. I highly recommend that every user upgrade to the latest version of the svn release. Otherwise you can grab the file here: http://code.google.com/p/kontrollbase/source/browse/trunk/bin/kontroll-query_cache_preload.pl
Read more about the script and how it works here: http://kontrollsoft.com/forum/kontrollbase-issues-and-solutions/query-cache-preload-script-speeds-up-login-by-50#p22
]]></content:encoded>
    <pubDate>Sat, 07 Nov 2009 01:13:00 +0000</pubDate>
    <dc:creator>Matt Reid</dc:creator>
    <category>announcement</category>
  </item>

  <item>
    <title>My MySQL Tool Chest</title>
    <guid isPermaLink="false">tag:blogger.com,1999:blog-8007802080401497299.post-545169657910002013</guid>
    <link>http://mmatemate.blogspot.com/2009/11/my-mysql-tool-chest.html</link>
    <description>Every time I need to install or reconfigure a new workstation, I review the set of tools I use. It's an opportunity to refresh the list, reconsider the usefulness of old tools and review new ones. During my first week at Open Market I got one of these opportunities. Here is my short list of free (as in 'beer') OSS tools and why they have a place in my tool chest.

Testing Environments

Virtual Box
Of all the Virtual Machines out there, I consider Virtual Box to be the easiest to use. Since I first looked into it while I was still working at Sun/MySQL, this package has been improved constantly. It's a must-have to stage High Availability scenarios or run tools that are not available in your OS of choice.

MySQL Sandbox
Did you compile MySQL from source and want to test it without affecting your current installation? Will replication break when you try a new feature? Will the upgrade work as expected? There is no easy way to test this other than MySQL Sandbox. It's a must-have for anyone working with MySQL regularly.

Backup

ZRM for MySQL
Many people have asked me why I always suggest going this way when using (insert tool of preference) gets the job done. ZRM for MySQL has plenty of features that go beyond taking the actual backup, making it a breeze to actually manage the backup sets. In most cases if you use (insert tool of preference), you are still left with the additional tasks around the backups (i.e., scheduling, rotation, copying backups off site, backup reports, etc.). Why reinvent the wheel?

Tuning
These are simple scripts that can quickly give you an overview of the current status of any MySQL server and assist you in making proper adjustments to improve efficiency.

mysqlsla
I like to call mysqlsla the Slow Query Reality Check. I found that many times developers and DBAs start scanning the slow query log to find the slowest queries and start optimizing them to increase overall performance. Many fail to recognize that quick queries that are run hundreds or thousands of times in a short period of time can have a much greater impact on performance than a dozen complex long-running ones. mysqlsla can scan the query log, slow or general, and rank the queries based on accumulated run and lock times (among other values). This way it's easy to identify the queries that will really impact overall performance. It might be a &quot;SELECT COUNT(*) FROM table WHERE status = ?&quot; instead of a query with a 5-table JOIN.

mysqltuner
Running mysqltuner is like taking a physical exam of a MySQL server. Do it the first time you get into a server and again after any changes to its configuration and/or environment. The script will very quickly point to the low-hanging fruit in terms of configuration parameters. The most common issue I've caught with it is memory over-allocation. This is a nasty situation that, by its very nature, will show up at the very worst moment if undetected: under heavy load.

mytop
mytop will show you in real time what is going on in the server. Doing load tests? Trying to catch deadlocks? Fire up your test case while keeping an eye on mytop's screen.

Other

MySQL Workbench
At this point, I haven't been able to find any tool, other than MySQL Workbench, to get proper DB diagrams for MySQL schemas. The ideal situation would be that every DBA has these diagrams accessible, but the truth is, they rarely exist.

sar-sql
I know, this is beating my own drum, but it works and combined with some other tools, it can provide a great deal of information with negligible overhead. I wish I had more time to write about more use cases.

Wildcard

myterm
I just read about myterm in a recent blog post. I am really intrigued by it, but haven't had any time to test it. If it works as advertised, it will be a great companion to sar-sql.</description>
    <content:encoded><![CDATA[Every time I need to install or reconfigure a new workstation, I review the set of tools I use. It's an opportunity to <i>refresh</i> the list, reconsider the usefulness of old tools and review new ones. During my first week at <a href="http://www.openmarket.com/">Open Market</a> I got one of these opportunities. Here is my short list of free (as in 'beer') OSS tools and why they have a place in my tool chest.<br /><h3>Testing Environments</h3><h4>Virtual Box</h4><br />Of all the Virtual Machines out there, I consider <a href="http://www.virtualbox.org/" target="_blank" title="Virtual Box">Virtual Box</a> to be the easiest to use. Since I first looked into it while I was still working at Sun/MySQL, this package has been improved constantly. It's a must-have to stage High Availability scenarios or run tools that are not available in your OS of choice.<br /><h4>MySQL Sandbox</h4>Did you compile MySQL from source and want to test it without affecting your current installation? Will replication break when you try a new feature? Will the upgrade work as expected? There is no easy way to test this other than <a href="http://www.mysqlsandbox.net/" target="_blank" title="MySQL Sandbox">MySQL Sandbox</a>. It's a must-have for anyone working with MySQL regularly.<br /><br /><h3>Backup</h3><h4>ZRM for MySQL</h4>Many people have asked me why I always suggest going this way when using (insert tool of preference) gets the job done. <a href="http://www.zmanda.com/backup-mysql.html" target="_blank" title="ZRM for MySQL">ZRM for MySQL</a> has plenty of features that go beyond taking the actual backup, making it a breeze to actually manage the backup sets. In most cases if you use (insert tool of preference), you are still left with the additional tasks around the backups (i.e., scheduling, rotation, copying backups off site, backup reports, etc.). 
Why reinvent the wheel?<br /><br /><h3>Tuning</h3><br />These are simple scripts that can quickly give you an overview of the current status of any MySQL server and assist you in making proper adjustments to improve efficiency.<br /><br /><h4>mysqlsla</h4><br />I like to call <a href="http://hackmysql.com/mysqlsla" target="_blank" title="mysqlsla">mysqlsla</a> the <i>Slow Query Reality Check</i>. I found that many times developers and DBAs start scanning the slow query log to find the slowest queries and start optimizing them to increase overall performance. Many fail to recognize that quick queries that are run hundreds or thousands of times in a short period of time can have a much greater impact on performance than a dozen complex long-running ones. <b>mysqlsla</b> can scan the query log, slow or general, and rank the queries based on accumulated run and lock times (among other values). This way it's easy to identify the queries that will really impact overall performance. It might be a "SELECT COUNT(*) FROM table WHERE status = ?" instead of a query with a 5-table JOIN.<br /><h4>mysqltuner</h4>Running <a href="http://blog.mysqltuner.com/" target="_blank" title="mysqltuner">mysqltuner</a> is like taking a <i>physical exam</i> of a MySQL server. Do it the first time you get into a server and again after any changes to its configuration and/or environment. The script will very quickly point to the <i>low-hanging fruit</i> in terms of configuration parameters. The most common issue I've caught with it is memory over-allocation. This is a nasty situation that, by its very nature, will show up at the very worst moment if undetected: under heavy load.<br /><h4>mytop</h4><a href="http://jeremy.zawodny.com/mysql/mytop/" target="_blank" title="mytop">mytop</a> will show you in real time what is going on in the server. Doing load tests? Trying to catch deadlocks? 
Fire up your test case while keeping an eye on mytop's screen.<br /><br /><h3>Other</h3><h4>MySQL Workbench</h4>At this point, I haven't been able to find any tool, other than <a href="http://dev.mysql.com/doc/workbench/en/index.html" target="_blank" title="MySQL Workbench">MySQL Workbench</a>, to get proper DB diagrams for MySQL schemas. The ideal situation would be that every DBA has these diagrams accessible, but the truth is, they rarely exist.<br /><h4>sar-sql</h4>I know, this is beating my own drum, but it works and combined with some other tools, it can provide a great deal of information with negligible overhead. I wish I had more time to write about more use cases.<br /><h3>Wildcard</h3><h4>myterm</h4>I just read about <b>myterm</b> in a <a href="http://www.jetprofiler.com/blog/8/myterm---extensible-mysql-command-line-client/" target="_blank" title="recent blog">recent blog post</a>. I am really intrigued by it, but haven't had any time to test it. If it works as advertised, it will be a great companion to <b>sar-sql</b>.]]></content:encoded>
    <pubDate>Fri, 06 Nov 2009 23:34:00 +0000</pubDate>
    <dc:creator>Gerardo Narvaja</dc:creator>
    <category>tuning</category>
    <category>scripts</category>
    <category>tools</category>
    <category>mysql</category>
    <category>backup</category>
    <category>dba</category>
  </item>

  <item>
    <title>FLIFO scheduling for InnoDB</title>
    <guid isPermaLink="false">http://www.facebook.com/note.php?note_id=175800920932</guid>
    <link>http://www.facebook.com/note.php?note_id=175800920932</link>
    <description>At the end of this note I describe how InnoDB can be much faster (2.5X) for high-concurrency workloads. However, what we really did is improve the code to not get 2.5X slower.

InnoDB uses innodb_thread_concurrency to limit the number of threads that run concurrently. Enforcement of this limit is imprecise to improve performance. By imprecise, I mean that there are usually fewer than innodb_thread_concurrency threads running within InnoDB even when there are many threads ready, willing and able to run there.

In what follows assume that there is a 1:1 mapping between thread, session and connection.

Enforcement is implemented using tickets. Each time a thread enters InnoDB it gives up 1 ticket. When a thread has 0 tickets it gets more tickets if the number of threads with non-zero ticket counts is less than innodb_thread_concurrency. Otherwise the thread is put at the end of the queue of waiting threads and the thread at the head of the queue is scheduled to run. This queue provides FIFO scheduling. A thread has zero tickets at the start of statement execution.

There is also the notion of exiting InnoDB. A thread exits InnoDB at the end of a statement, during a row lock wait and when performing long-running work outside of InnoDB. Long-running work includes sending results back to a client and performing a file sort. The cases in which a thread exits InnoDB were expanded with fixes for bug 32149 (thanks to Ben for finding these). Note that a thread does not exit InnoDB while blocked on IO. I will experiment with that in the future.

So far, this sounds reasonable, with the exception that threads don't exit InnoDB when blocked on IO. There will be at most 8 pending read requests when innodb_thread_concurrency=8 and this might not utilize a server's IO capacity. An IO bound server needs a higher value for innodb_thread_concurrency than a CPU bound server. As some servers are IO bound part of the time and CPU bound part of the time, it is difficult to find the best value for innodb_thread_concurrency.

But I encountered a new problem. I noticed a few servers with 1000+ concurrent queries and when I looked at the thread stacks all threads were blocked on the per-thread condition variables used to schedule them to be run. This server used innodb_thread_concurrency=8 and eventually I confirmed that 8 threads had been scheduled to run. The problem is that the threads took too long to begin running.

At this point I made the joke that while the scheduler in Linux may be O(1), there is a high constant factor. And then someone told me that we weren't using that yet.

We don't want to modify libc, NPTL or the kernel, so we need to fix this in InnoDB. InnoDB uses FIFO to schedule the threads. We changed it to use a hybrid of FIFO+LIFO, thus the name FLIFO scheduling. This makes performance much better on high-concurrency workloads. The behavior is enabled by the my.cnf parameter innodb_thread_lifo. 

Two variables were added as part of the change:

fifo-pending - the number of threads that have been selected to run via a pthread condition variable broadcast but have yet to begin running. This is incremented by threads that exit InnoDB when they schedule another thread to run and is decremented by the thread as soon as it begins to run.
lifo-running - the number of threads that are currently running courtesy of the LIFO policy.

When a thread would not be allowed to enter InnoDB because of the FIFO policy and the innodb_thread_concurrency limit, it is allowed to enter by the LIFO policy when:

    lifo-running &lt; fifo-pending


This allows 2 * innodb_thread_concurrency threads to run concurrently in some cases, so the value should be set with that in mind.

The results are impressive. See the results on a chart. The numbers are the TPS from sysbench readonly for MySQL 5.0.84 using the new FLIFO policy versus the original FIFO policy on an 8-core x86 server. The difference is that throughput stays roughly constant with FLIFO at high concurrency while it degrades significantly for the original FIFO.

concurrent-users 1      2      4      8      16     32     64     128    256    512    1024
5084-flifo       186.22 342.65 570.79 707.82 718.84 665.60 574.96 590.33 611.15 612.69 637.37
5084-original    180.95 335.37 516.97 645.86 625.14 648.88 658.71 579.59 515.26 431.97 255.04

sysbench command line for the test:

sysbench --test=oltp --mysql-db=test --oltp-table-size=2000000 --max-time=180 \
--max-requests=0 --mysql-table-engine=innodb --db-ps-mode=disable \
--mysql-engine-trx=yes --oltp-read-only --oltp-skip-trx --oltp-dist-type=uniform \
--oltp-range-size=1000 --num-threads=1 --seed-rng=1 run


my.cnf for FLIFO:

innodb_buffer_pool_size=2000M
innodb_log_file_size=100M
innodb_flush_log_at_trx_commit=2
innodb_doublewrite=0
innodb_flush_method=O_DIRECT
innodb_thread_concurrency=16
max_connections=2000
innodb_max_dirty_pages_pct=80
table_cache=2000

innodb_thread_sleep_delay=0
innodb_concurrency_tickets=500
innodb_thread_lifo=1

my.cnf for original FIFO:

innodb_buffer_pool_size=2000M
innodb_log_file_size=100M
innodb_doublewrite=0
innodb_flush_method=O_DIRECT
innodb_thread_concurrency=32
max_connections=2000
innodb_max_dirty_pages_pct=80
innodb_flush_log_at_trx_commit=2
table_cache=2000


This is vmstat output during the test. Note that the context switch rate (cs) for InnoDB+FLIFO is less than 1/3 of the rate for original InnoDB (66694 versus 238890), and FLIFO spends more CPU time in user mode (us) and less in system mode (sy) and idle (id).

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
# Typical output from vmstat for InnoDB+FLIFO
40  0      0 56524420 1255528 5451320    0    0     0    81 1035 66694 81 11  8  0  0

# and from original InnoDB
53  0      0 56494980 1255616 5460592    0    0     0    95 1037 238890 58 23 19  0  0
</description>
    <content:encoded><![CDATA[At the end of this note I describe how InnoDB can be much faster (2.5X) for high-concurrency workloads. However, what we really did is improve the code to not get 2.5X slower.

InnoDB uses <a href="http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_thread_concurrency">innodb_thread_concurrency</a> to limit the number of threads that run concurrently. Enforcement of this limit is imprecise to improve performance. By imprecise, I mean that there are usually fewer than innodb_thread_concurrency threads running within InnoDB even when there are many threads ready, willing and able to run there.

In what follows assume that there is a 1:1 mapping between thread, session and connection.

Enforcement is implemented using tickets. Each time a thread enters InnoDB it gives up 1 ticket. When a thread has 0 tickets it gets more tickets if the number of threads with non-zero ticket counts is less than innodb_thread_concurrency. Otherwise the thread is put at the end of the queue of waiting threads and the thread at the head of the queue is scheduled to run. This queue provides FIFO scheduling. A thread has zero tickets at the start of statement execution.

There is also the notion of <b>exiting InnoDB</b>. A thread exits InnoDB at the end of a statement, during a row lock wait and when performing long-running work outside of InnoDB. Long-running work includes sending results back to a client and performing a file sort. The cases in which a thread exits InnoDB were expanded with fixes for <a href="http://bugs.mysql.com/bug.php?id=32149">bug 32149</a> (thanks to Ben for finding these). Note that a thread <b>does not</b> exit InnoDB while blocked on IO. I will experiment with that in the future.

So far, this sounds reasonable, with the exception that threads don't exit InnoDB when blocked on IO. There will be at most 8 pending read requests when innodb_thread_concurrency=8 and this might not utilize a server's IO capacity. An IO bound server needs a higher value for innodb_thread_concurrency than a CPU bound server. As some servers are IO bound part of the time and CPU bound part of the time, it is difficult to find the best value for innodb_thread_concurrency.

But I encountered a new problem. I noticed a few servers with 1000+ concurrent queries and when I looked at the thread stacks <b>all</b> threads were blocked on the per-thread condition variables used to schedule them to be run. This server used innodb_thread_concurrency=8 and eventually I confirmed that 8 threads had been scheduled to run. The problem is that the threads took too long to begin running.

At this point I made the joke that while the scheduler in Linux may be O(1), there is a high constant factor. And then someone told me that we weren't using that yet.

We don't want to modify libc, NPTL or the kernel, so we need to fix this in InnoDB. InnoDB uses FIFO to schedule the threads. We changed it to use a hybrid of FIFO+LIFO, thus the name <b>FLIFO</b> scheduling. This makes performance much better on high-concurrency workloads. The behavior is enabled by the my.cnf parameter <b>innodb_thread_lifo</b>. 

Two variables were added as part of the change:
<ul>
<li><b>fifo-pending</b> - the number of threads that have been selected to run via a pthread condition variable broadcast but have yet to begin running. This is incremented by threads that exit InnoDB when they schedule another thread to run and is decremented by the thread as soon as it begins to run.</li>
<li><b>lifo-running</b> - the number of threads that are currently running courtesy of the LIFO policy.</li>
</ul>
When a thread would not be allowed to enter InnoDB because of the FIFO policy and the innodb_thread_concurrency limit, it is allowed to enter by the LIFO policy when:
<pre>
    lifo-running < fifo-pending
</pre>

This allows 2 * innodb_thread_concurrency threads to run concurrently in some cases, so the value should be set with that in mind.
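
The admission rule above can be sketched as a small model. This is only an illustration and not the InnoDB patch: the class name, the fifo_slots_used counter and the try_enter function are hypothetical, and the sketch assumes that threads selected to run but not yet running (fifo-pending) still hold a slot under the FIFO limit.

```python
from dataclasses import dataclass


@dataclass
class FlifoState:
    """Toy model of the counters described above (all names are hypothetical)."""
    thread_concurrency: int      # stands in for innodb_thread_concurrency
    fifo_slots_used: int = 0     # assumed: slots held under the FIFO limit, pending threads included
    fifo_pending: int = 0        # selected to run by FIFO but not yet running
    lifo_running: int = 0        # currently running courtesy of the LIFO policy


def try_enter(state: FlifoState) -> str:
    """Decide how an arriving thread is handled: 'fifo', 'lifo' or 'wait'."""
    if state.fifo_slots_used < state.thread_concurrency:
        # Normal case: there is room under the concurrency limit.
        state.fifo_slots_used += 1
        return "fifo"
    if state.lifo_running < state.fifo_pending:
        # Limit reached, but some scheduled threads have not started yet,
        # so the arriving thread may run in their place for now.
        state.lifo_running += 1
        return "lifo"
    return "wait"


# All 8 FIFO slots are taken, but 3 of those threads have not begun running.
state = FlifoState(thread_concurrency=8, fifo_slots_used=8, fifo_pending=3)
decisions = [try_enter(state) for _ in range(4)]
print(decisions)  # ['lifo', 'lifo', 'lifo', 'wait']
```

In this model fifo-pending can never exceed the concurrency limit, so lifo-running is bounded by it as well, which is where the 2 * innodb_thread_concurrency worst case comes from.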

The results are impressive. <a href="http://chart.apis.google.com/chart?chs=400x200&amp;cht=lc&amp;chxt=x,y,x,y&amp;chd=t:186,343,571,708,719,666,575,590,611,613,637%7C181,335,517,646,625,649,659,580,515,432,255&amp;chdl=5084-flifo%7C5084-original&amp;chtt=&amp;chco=FF0000,00FF00&amp;chds=0,719&amp;chxr=1,0,719&amp;chxl=0:%7C1%7C2%7C4%7C8%7C16%7C32%7C64%7C128%7C256%7C512%7C1024%7C2:%7CConcurrent%20users%7C%7C3:%7CTPS%7C&amp;chxp=2,50%7C3,50&amp;chg=10,10">See the results on a chart</a>. The numbers are the TPS from sysbench readonly for MySQL 5.0.84 using the new FLIFO policy versus the original FIFO policy on an 8-core x86 server. The difference is that throughput stays roughly constant with FLIFO at high concurrency while it degrades significantly for the original FIFO.
<pre>
concurrent-users 1      2      4      8      16     32     64     128    256    512    1024
5084-flifo       186.22 342.65 570.79 707.82 718.84 665.60 574.96 590.33 611.15 612.69 637.37
5084-original    180.95 335.37 516.97 645.86 625.14 648.88 658.71 579.59 515.26 431.97 255.04
</pre>
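
To see where the 2.5X figure comes from, the two rows above can be compared point by point. This is a quick sketch, assuming the eleven values line up with the concurrent-user labels on the chart's x-axis (1 through 1024); the variable names are made up for the example.

```python
# Concurrent-user counts, assumed to match the x-axis labels in the chart URL.
users = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]

# TPS values copied from the table above.
flifo = [186.22, 342.65, 570.79, 707.82, 718.84, 665.60,
         574.96, 590.33, 611.15, 612.69, 637.37]
original = [180.95, 335.37, 516.97, 645.86, 625.14, 648.88,
            658.71, 579.59, 515.26, 431.97, 255.04]

# Ratio of FLIFO throughput to original FIFO throughput at each level.
ratios = [new / old for new, old in zip(flifo, original)]

for n, ratio in zip(users, ratios):
    print(f"{n:5d} users: FLIFO/original = {ratio:.2f}")
```

The last row is the headline number: 637.37 / 255.04 is about 2.5. FLIFO is not ahead at every point in these numbers. At 64 users the original policy is faster, and the gap only opens up from 256 users onward.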
sysbench command line for the test:
<pre>
sysbench --test=oltp --mysql-db=test --oltp-table-size=2000000 --max-time=180 \
--max-requests=0 --mysql-table-engine=innodb --db-ps-mode=disable \
--mysql-engine-trx=yes --oltp-read-only --oltp-skip-trx --oltp-dist-type=uniform \
--oltp-range-size=1000 --num-threads=1 --seed-rng=1 run
</pre>

my.cnf for FLIFO:
<pre>
innodb_buffer_pool_size=2000M
innodb_log_file_size=100M
innodb_flush_log_at_trx_commit=2
innodb_doublewrite=0
innodb_flush_method=O_DIRECT
innodb_thread_concurrency=16
max_connections=2000
innodb_max_dirty_pages_pct=80
table_cache=2000

innodb_thread_sleep_delay=0
innodb_concurrency_tickets=500
innodb_thread_lifo=1
</pre>
my.cnf for original FIFO:
<pre>
innodb_buffer_pool_size=2000M
innodb_log_file_size=100M
innodb_doublewrite=0
innodb_flush_method=O_DIRECT
innodb_thread_concurrency=32
max_connections=2000
innodb_max_dirty_pages_pct=80
innodb_flush_log_at_trx_commit=2
table_cache=2000
</pre>

This is vmstat output during the test. Note that the context switch rate (cs) for InnoDB+FLIFO is less than 1/3 of the rate for original InnoDB (66694 versus 238890), and FLIFO spends more CPU time in user mode (us) and less in system mode (sy) and idle (id).
<pre>
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
# Typical output from vmstat for InnoDB+FLIFO
40  0      0 56524420 1255528 5451320    0    0     0    81 1035 66694 81 11  8  0  0

# and from original InnoDB
53  0      0 56494980 1255616 5460592    0    0     0    95 1037 238890 58 23 19  0  0
</pre>]]></content:encoded>
    <pubDate>Fri, 06 Nov 2009 21:19:23 +0000</pubDate>
    <dc:creator>Mark Callaghan</dc:creator>
  </item>

  <item>
    <title>Oracle Express Edition first steps for MySQL DBAs</title>
    <guid isPermaLink="false">tag:blogger.com,1999:blog-8575059197193667898.post-3843304511236504599</guid>
    <link>http://dave-stokes.blogspot.com/2009/11/oracle-express-edition-first-steps-for.html</link>
    <description>I have had a few MySQL DBAs ask about how to get started learning Oracle. I will admit that it has been on my to-do list for quite a while [1]. It never hurts to know more than one database system and a great many DBA help wanted ads mention Oracle. Someone once said that you must make sure your capabilities exceed your limitations [2] and recently I have been feeling limited when others have started to talk about Oracle capabilities.

So what does it take for a MySQL DBA to get their hands on their own Oracle instance? I used my Ubuntu box to go to Oracle's web site to get the free Oracle XE software.

1. Download and feed to package manager
2. Add my account to dba group
3. As root, /etc/init.d/oracle-xe configure to set passwords and ports

I pointed my web browser to http://127.0.0.1/apex and got the page you see in the image with this blog post. Now I need to find my copy of Hands-On Oracle Database 10g Express Edition for Linux.

[1] I also have a stack of DB2 and SQL Server books that were picked up at Half Price Books to read through. I am always looking for better ways to express database concepts for MySQL exams. Writing up something similar to this entry for DB2 and SQL Server is also on the list.

[2] This quote has been attributed to Bruce Lee and several others. And it always got a groan from my kids when I used it on them.</description>
    <content:encoded><![CDATA[<p align=left>I have had a few MySQL DBAs ask about how to get started learning Oracle.<a href="http://1.bp.blogspot.com/_T60cuxcUiVI/SvR-mhpEo4I/AAAAAAAABos/tk1f0Mw2JX0/s1600-h/Oshot.png"><img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_T60cuxcUiVI/SvR-mhpEo4I/AAAAAAAABos/tk1f0Mw2JX0/s200/Oshot.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5401081053454836610" /></a>  I will admit that it has been on my to-do list for quite a while<sup>1</sup>. It never hurts to know more than one database system and a great many DBA help wanted ads mention Oracle. Someone once said that you must make sure your capabilities exceed your limitations<sup>2</sup> and recently I have been feeling limited when others have started to talk about Oracle capabilities.<br /><br />So what does it take for a MySQL DBA to get their hands on their own Oracle instance? I used my Ubuntu box to go to Oracle's web site to get the free Oracle XE software.<br /> <br /><ol><br /><li> Download and feed to package manager</li><br /><li> Add my account to dba group</li><br /><li> As root, <span>/etc/init.d/oracle-xe configure</span> to set passwords and ports</li><br /></ol><br /><br />I pointed my web browser to http://127.0.0.1/apex and got the page you see in the image with this blog post.  Now I need to find my copy of <a href="http://www.amazon.com/Hands-Oracle-Database-Express-Linux/dp/B000MAHC8Q/ref=sr_1_5?ie=UTF8&amp;s=books&amp;qid=1257539506&amp;sr=8-5">Hands-On Oracle Database 10g Express Edition for Linux</a>.<br /><br />1. I also have a stack of DB2 and SQL Server books that were picked up at Half Price Books to read through. I am always looking for <i>better</i> ways to express database concepts for MySQL exams. Writing up something similar to this entry for DB2 and SQL Server is also on the list.  <br /><br />2. This quote has been attributed to Bruce Lee and several others. 
And it always got a groan from my kids when I used it on them.]]></content:encoded>
    <pubDate>Fri, 06 Nov 2009 19:01:00 +0000</pubDate>
    <dc:creator>Dave Stokes</dc:creator>
    <category>MySQL DBA</category>
    <category>Oracle</category>
  </item>

  <item>
    <title>Log Buffer #168: a Carnival of the Vanities for DBAs</title>
    <guid isPermaLink="false">http://www.pythian.com/news/?p=4969</guid>
    <link>http://www.pythian.com/news/4969/log-buffer-168-a-carnival-of-the-vanities-for-dbas</link>
    <description>This is the 168th edition of Log Buffer, the weekly review of database blogs.  Let’s give the wheel a spin and see who comes first . . .
MySQL
Brian “Krow” Aker has something to say about Drizzle, InfiniDB, and column-oriented storage: “I have been asked a number of times ‘do you think there is a need for a column oriented database in the open source world?’ The answer has been yes! . . . I was very happy to see Calpont do their release of Infinidb last week.”
Vadim of the MySQL Performance Blog said, “As Calpont announced availability of InfiniDB I surely couldn’t miss a chance to compare it with previously tested databases in the same environment.”  And he didn’t, as his post Air traffic queries in InfiniDB: early alpha shows.  Bob Dempsey and Jim Tommaney of InfiniDB are in on the discussion.
Back to Drizzle for a moment, and Jay Pipes’ item, The Great Escape.  “This week, I am working on putting together test cases which validate the Drizzle transaction log’s handling of BLOB columns. . . . I ran into an interesting set of problems and am wondering how to go about handling them. Perhaps the LazyWeb will have some solutions. . . . The problem, in short, is inconsistency in the way that the NUL character is escaped (or not escaped) in both the MySQL/Drizzle protocol and the MySQL/Drizzle client tools.”
Baron Schwartz has been catching erroneous queries, without MySQL proxy, having been inspired by Chris Calender’s post, Capturing Erroneous Queries with MySQL Proxy.
Nick Goodman promises instant relief from slow MySQL reporting queries using dynamoDB.  And no gooey applicator!
Robert Hodges of the Scale-Out Blog looks at replicating from MySQL to Drizzle and beyond.  “I am . . . delighted that Marcus Erikkson has published a patch to Tungsten that allows replication from MySQL to Drizzle. He’s also working on implementing Drizzle-to-Drizzle support, which will be very exciting. . . . This brings up a question–what about replicating from MySQL to PostgreSQL? What about other databases?”
And back to xaprb, where Baron confesses, I’m a Postgres user, as it turns out.
PostgreSQL
Bruce Momjian has published a new security talk, Securing PostgreSQL From External Attack.
Andrew Dunstan delves into recursion in Recursion, n. See recursion. “Never,” he says, “underestimate the usefulness of silly demos (this is written for a talk next week) to teach things worth knowing.”
Bernd Helmle shares a walk-through of cloning Slony nodes. “The new stable branch 2.0 of Slony-I is out for a while now. Time to blog about one of my favorite new features there, cloning an existing node without doing an initial SUBSCRIBE command.”
SQL Server
It was PASS Summit this week.  On Home of the Scary DBA, Grant Fritchey covers the event with several good posts, including PASS Summit 2009 Key Note 3.  (Grant is also Geek of the Week! Congratulations, Grant!  I guess.  Quote: &amp;#8220;I think most DBA&amp;#8217;s have adminhood thrust upon them. I think the ‘accidental’ DBA is the most prevalent path into becoming a DBA. I became a full time Admin by opening my mouth once too often.&amp;#8221;)
Aaron Bertrand also has his summary Blogging from the PASS Keynote: 2009-11-03.  (Grant and Aaron both have to specify which keynote they mean, because there&amp;#8217;s more than one keynote. This, I guess, is &amp;#8220;keynote redundancy&amp;#8221;, but I still think PASS needs to normalize.)
Greg Low announces the launch at the PASS Summit of a new book, SQL Server MVP Deep Dives. &amp;#8220;This is no ordinary book,&amp;#8221; he writes. &amp;#8220;Paul Nielsen took up Steve Ballmer&amp;#8217;s challenge at a recent MVP summit to do something notable to give back to the community. He organised a large group of SQL Server MVPs to create a unique book and worked with Manning to get it published. The money made on the book was to go directly to a charity and the charity chosen was WarChild.&amp;#8221;
Ben Nevarez asks, Are You Using Scalable Shared Databases? &amp;#8220;Did you know that you can share read-only databases between several instances of SQL Server?  &amp;nbsp;.&amp;nbsp;.&amp;nbsp;.&amp;nbsp; Scalable Shared Databases is a very interesting SQL Server feature that many of us seem to almost have forgotten about&amp;nbsp;.&amp;nbsp;.&amp;nbsp;.&amp;nbsp;&amp;#8221;
Here&amp;#8217;s Roman Rehak reporting an issue with restoring 2000 backups on 2008.  He writes, &amp;#8220;Recently we&amp;#8217;ve been experiencing a lot of headaches with SQL Server 2008 crashing while restoring a backup taken on a SQL Server 2000 production server. The crash resulted in a stack dump but SQL Server would continue running, although less stable, and sooner or later needed a reboot.&amp;#8221;
Meanwhile, Adam Machanic reports on SQL Server 2008: lock escalation, INSERTs, and a potential bug.  Adam says, &amp;#8220;Lock escalation is a funny thing. I&amp;#8217;ve found myself on numerous occasions waging war against its concurrency-sapping existence, and rarely have I found myself wishing that it would work more aggressively. But there is a time and place for everything, and yesterday I discovered that a major change has occurred with regard to lock escalation in SQL Server 2008.&amp;#8221;
Oracle
Mohammed Mawla on the Pythian Blog bridges the gap with his item on running the same query against multiple SQL Server AND Oracle instances.
Surachart Opun shares his HOWTO on using DUPLICATE without a connection to target database: &amp;#8220;&amp;nbsp;.&amp;nbsp;.&amp;nbsp;.&amp;nbsp;that&amp;#8217;s a 11gR2 Feature. DUPLICATE can be performed without connecting to a target database. This requires connecting to a catalog and auxiliary database.&amp;#8221;
Here&amp;#8217;s another HOWTO, this one from the great grandson of Husnu Sensoy: How to Install Oracle 11g Release 2 on OEL 5.4 on VirtualBox: Installing Grid Infrastructure. He begins, &amp;#8220;In Oracle 11g Release 2 you will find that things have changed even for single instance database installation. I will try to illustrate in this series of posts how to install a single instance Oracle 11g Release 2 database to your Linux machines.&amp;#8221;
But let&amp;#8217;s step back a bit. Ronny Egners says, Oracle on linux – yes of course – but what linux? &amp;#8220;There is a discussion from December 2008 what Linux (SLES vs. Red Hat vs. Oracle Enterprise Linux) to use for running oracle on Linux by Yann Neuhaus. &amp;nbsp;.&amp;nbsp;.&amp;nbsp;.&amp;nbsp; After nearly one year i wanted to catch up the article and check if the pros and cons are still valid or if there changed anything.&amp;#8221;
Chen Shapira offers The Senile DBA Guide to Troubleshooting Sudden Growth in Redo Generation, which begins, &amp;#8220;I just troubleshooted a server where the amounts of redo generated suddenly exploded to the point of running out of disk space. &amp;nbsp;.&amp;nbsp;.&amp;nbsp;.&amp;nbsp; The problem was found and the storage manager pacified, I decided to save the queries I used.  &amp;nbsp;.&amp;nbsp;.&amp;nbsp;.&amp;nbsp; It was very embarrassing to discover that I actually have 4 similar but not identical scripts&amp;nbsp;.&amp;nbsp;.&amp;nbsp;.&amp;nbsp; Now I have 5.&amp;#8221;
Embarrassing, Chen! But do you have guilty feelings like Martin Widlake does?  He makes a guilty confession.  The sin?  &amp;#8220;I use the Buffer Cache Hit Ratio.&amp;#8221;
Last, Tyler Muth introduces Logger, A PL/SQL Logging and Debugging Utility.
That is all for now.  Please let&amp;#8217;s hear your favourite database blogs in the comments.  Until next time!</description>
    <content:encoded><![CDATA[<p>This is the 168<sup>th</sup> edition of <a href="http://www.pythian.com/news/about-log-buffer"><em>Log Buffer</em></a>, the weekly review of database blogs.  Let&#8217;s give the wheel a spin and see who comes first&nbsp;.&nbsp;.&nbsp;.&nbsp;</p>
<h3>MySQL</h3>
<p><a href="http://krow.livejournal.com"><strong>Brian &#8220;Krow&#8221; Aker</strong></a> has something to say about <a href="http://krow.livejournal.com/675706.html">Drizzle, InfiniDB, and column-oriented storage</a>: &#8220;I have been asked a number of times &#8216;do you think there is a need for a column oriented database in the open source world?&#8217; The answer has been yes! &nbsp;.&nbsp;.&nbsp;.&nbsp; I was very happy to see Calpont do their release of Infinidb last week.&#8221;</p>
<p><strong>Vadim</strong> of the <a href="http://www.mysqlperformanceblog.com">MySQL Performance Blog</a> said, &#8220;As Calpont announced availability of InfiniDB I surely couldn&#8217;t miss a chance to compare it with previously tested databases in the same environment.&#8221;  And he didn&#8217;t, as his post <a href="http://www.mysqlperformanceblog.com/2009/11/02/air-traffic-queries-in-infinidb-early-alpha">Air traffic queries in InfiniDB: early alpha</a> shows.  Bob Dempsey and Jim Tommaney of InfiniDB are in on the discussion.</p>
<p>Back to Drizzle for a moment, and <a href="http://www.jpipes.com"><strong>Jay Pipes&#8217;</strong></a> item, <a href="http://www.jpipes.com/index.php?/archives/309-The-Great-Escape.html">The Great Escape</a>.  &#8220;This week, I am working on putting together test cases which validate the Drizzle transaction log&#8217;s handling of BLOB columns. &nbsp;.&nbsp;.&nbsp;.&nbsp; I ran into an interesting set of problems and am wondering how to go about handling them. Perhaps the LazyWeb will have some solutions. &nbsp;.&nbsp;.&nbsp;.&nbsp; The problem, in short, is inconsistency in the way that the NUL character is escaped (or not escaped) in both the MySQL/Drizzle protocol and the MySQL/Drizzle client tools.&#8221;</p>
<p><a href="http://www.xaprb.com/blog"><strong>Baron Schwartz</strong></a> has been <a href="http://www.xaprb.com/blog/2009/11/01/catching-erroneous-queries-without-mysql-proxy">catching erroneous queries, without MySQL proxy</a>, having been inspired by Chris Calender&#8217;s post, <a href="http://www.chriscalender.com/?p=66">Capturing Erroneous Queries with MySQL Proxy</a>.</p>
<p><a href="http://www.nicholasgoodman.com/bt/blog"><strong>Nick Goodman</strong></a> promises <a href="http://www.nicholasgoodman.com/bt/blog/2009/11/02/instant-relief-from-slow-mysql-reporting-queries-using-dynamodb/">instant relief from slow MySQL reporting queries using dynamoDB</a>.  And no gooey applicator!</p>
<p><strong>Robert Hodges</strong> of <a href="http://scale-out-blog.blogspot.com">the Scale-Out Blog</a> looks at <a href="http://scale-out-blog.blogspot.com/2009/10/replicating-from-mysql-to-drizzle-and.html">replicating from MySQL to Drizzle and beyond</a>.  &#8220;I am&nbsp;.&nbsp;.&nbsp;.&nbsp;delighted that Marcus Erikkson has published a patch to Tungsten that allows replication from MySQL to Drizzle. He&#8217;s also working on implementing Drizzle-to-Drizzle support, which will be very exciting. &nbsp;.&nbsp;.&nbsp;.&nbsp; This brings up a question&#8211;what about replicating from MySQL to <strong>PostgreSQL</strong>? What about other databases?&#8221;</p>
<p>And back to <a href="http://www.xaprb.com/blog">xaprb</a>, where Baron confesses, <a href="http://www.xaprb.com/blog/2009/11/03/im-a-postgres-user-as-it-turns-out">I’m a Postgres user, as it turns out</a>.</p>
<h3>PostgreSQL</h3>
<p><a href="http://momjian.us/main/blogs"><strong>Bruce Momjian</strong></a> has published a <a href="http://momjian.us/main/blogs/pgblog.html#October_30_2009">new security talk</a>, <em>Securing PostgreSQL From External Attack</em>.</p>
<p><a href="http://people.planetpostgresql.org/andrew">Andrew Dunstan</a> delves into recursion in <a href="http://people.planetpostgresql.org/andrew/index.php?/archives/46-Recursion,-n.-See-recursion..html">Recursion, n. See recursion</a>. &#8220;Never,&#8221; he says, &#8220;underestimate the usefulness of silly demos (this is written for a talk next week) to teach things worth knowing.&#8221;</p>
<p><a href="http://psoos.blogspot.com"><strong>Bernd Helmle</strong></a> shares a walk-through of <a href="http://psoos.blogspot.com/2009/09/cloning-slony-nodes.html">cloning Slony nodes</a>. &#8220;The new stable branch 2.0 of Slony-I is out for a while now. Time to blog about one of my favorite new features there, cloning an existing node without doing an initial SUBSCRIBE command.&#8221;</p>
<h3>SQL Server</h3>
<p>It was PASS Summit this week.  On <a href="http://scarydba.wordpress.com">Home of the Scary DBA</a>, <strong>Grant Fritchey</strong> covers the event with several good posts, including <a href="http://scarydba.wordpress.com/2009/11/05/pass-summit-2009-key-note-3">PASS Summit 2009 Key Note 3</a>.  (Grant is also <a href="http://www.simple-talk.com/opinion/geek-of-the-week/interview-with-the-scary-dba-%E2%80%93-grant-fritchey/"><strong>Geek of the Week!</strong></a> Congratulations, Grant!  I guess.  Quote: &#8220;I think most DBA&#8217;s have adminhood thrust upon them. I think the ‘accidental’ DBA is the most prevalent path into becoming a DBA. I became a full time Admin by opening my mouth once too often.&#8221;)</p>
<p><a href="http://sqlblog.com/blogs/aaron_bertrand"><strong>Aaron Bertrand</strong></a> also has his summary <a href="http://sqlblog.com/blogs/aaron_bertrand/archive/2009/11/03/blogging-from-the-pass-keynote-2009-11-03.aspx">Blogging from the PASS Keynote: 2009-11-03</a>.  (Grant and Aaron both have to specify which keynote they mean, because there&#8217;s <em>more than one keynote</em>. This, I guess, is &#8220;keynote redundancy&#8221;, but I still think PASS needs to normalize.)</p>
<p><a href="http://sqlblog.com/blogs/greg_low"><strong>Greg Low</strong></a> announces the <a href="http://sqlblog.com/blogs/greg_low/archive/2009/10/31/book-sql-server-mvp-deep-dives-launch-at-pass-summit-usa.aspx">launch at the PASS Summit of a new book, <em>SQL Server MVP Deep Dives</em></a>. &#8220;This is no ordinary book,&#8221; he writes. &#8220;Paul Nielsen took up Steve Ballmer&#8217;s challenge at a recent MVP summit to do something notable to give back to the community. He organised a large group of SQL Server MVPs to create a unique book and worked with Manning to get it published. The money made on the book was to go directly to a charity and the charity chosen was WarChild.&#8221;</p>
<p><a href="http://sqlblog.com/blogs/ben_nevarez"><strong>Ben Nevarez</strong></a> asks, <a href="http://sqlblog.com/blogs/ben_nevarez/archive/2009/10/30/are-you-using-scalable-shared-databases.aspx">Are You Using Scalable Shared Databases?</a> &#8220;Did you know that you can share read-only databases between several instances of SQL Server?  &nbsp;.&nbsp;.&nbsp;.&nbsp; Scalable Shared Databases is a very interesting SQL Server feature that many of us seem to almost have forgotten about&nbsp;.&nbsp;.&nbsp;.&nbsp;&#8221;</p>
<p>Here&#8217;s <a href="http://sqlblog.com/blogs/roman_rehak"><strong>Roman Rehak</strong></a> reporting an issue with <a href="http://sqlblog.com/blogs/roman_rehak/archive/2009/11/02/issue-with-restoring-2000-backups-on-2008.aspx">restoring 2000 backups on 2008</a>.  He writes, &#8220;Recently we&#8217;ve been experiencing a lot of headaches with SQL Server 2008 crashing while restoring a backup taken on a SQL Server 2000 production server. The crash resulted in a stack dump but SQL Server would continue running, although less stable, and sooner or later needed a reboot.&#8221;</p>
<p>Meanwhile, <a href="http://sqlblog.com/blogs/adam_machanic"><strong>Adam Machanic</strong></a> reports on <a href="http://sqlblog.com/blogs/adam_machanic/archive/2009/10/30/sql-server-2008-lock-escalation-inserts-and-a-potential-bug.aspx">SQL Server 2008: lock escalation, INSERTs, and a potential bug</a>.  Adam says, &#8220;Lock escalation is a funny thing. I&#8217;ve found myself on numerous occasions waging war against its concurrency-sapping existence, and rarely have I found myself wishing that it would work more aggressively. But there is a time and place for everything, and yesterday I discovered that a major change has occurred with regard to lock escalation in SQL Server 2008.&#8221;</p>
<h3>Oracle</h3>
<p><a href="http://www.pythian.com/news/author/mawla"><strong>Mohammed Mawla</strong></a> on the Pythian Blog bridges the gap with his item on <a href="http://www.pythian.com/news/4683/run-the-same-query-against-multiple-sql-server-and-oracle-instances">running the same query against multiple SQL Server AND Oracle instances</a>.</p>
<p><a href="http://surachartopun.com"><strong>Surachart Opun</strong></a> shares his HOWTO on using <a href="http://surachartopun.com/2009/11/duplicate-without-connection-to-target.html">DUPLICATE without a connection to target database</a>: &#8220;&nbsp;.&nbsp;.&nbsp;.&nbsp;that&#8217;s a 11gR2 Feature. DUPLICATE can be performed without connecting to a target database. This requires connecting to a catalog and auxiliary database.&#8221;</p>
<p>Here&#8217;s another HOWTO, this one from <a href="http://husnusensoy.wordpress.com">the great grandson of Husnu Sensoy</a>: <a href="http://husnusensoy.wordpress.com/2009/10/30/how-to-install-oracle-11g-release-2-on-oel-5-4-on-virtualbox-installing-grid-infrastructure">How to Install Oracle 11g Release 2 on OEL 5.4 on VirtualBox: Installing Grid Infrastructure</a>. He begins, &#8220;In Oracle 11g Release 2 you will find that things have changed even for single instance database installation. I will try to illustrate in this series of posts how to install a single instance Oracle 11g Release 2 database to your Linux machines.&#8221;</p>
<p>But let&#8217;s step back a bit. <a href="http://blog.ronnyegner-consulting.de"><strong>Ronny Egners</strong></a> says, <a href="http://blog.ronnyegner-consulting.de/2009/10/19/oracle-on-linux-yes-of-course-but-what-linux/">Oracle on linux – yes of course – but what linux?</a> &#8220;There is a discussion from December 2008 what Linux (SLES vs. Red Hat vs. Oracle Enterprise Linux) to use for running oracle on Linux by Yann Neuhaus. &nbsp;.&nbsp;.&nbsp;.&nbsp; After nearly one year i wanted to catch up the article and check if the pros and cons are still valid or if there changed anything.&#8221;</p>
<p><a href="http://prodlife.wordpress.com"><strong>Chen Shapira</strong></a> offers <a href="http://prodlife.wordpress.com/2009/11/04/the-senile-dba-guide-to-troubleshooting-sudden-growth-in-redo-generation">The Senile DBA Guide to Troubleshooting Sudden Growth in Redo Generation</a>, which begins, &#8220;I just troubleshooted a server where the amounts of redo generated suddenly exploded to the point of running out of disk space. &nbsp;.&nbsp;.&nbsp;.&nbsp; The problem was found and the storage manager pacified, I decided to save the queries I used.  &nbsp;.&nbsp;.&nbsp;.&nbsp; It was very embarrassing to discover that I actually have 4 similar but not identical scripts&nbsp;.&nbsp;.&nbsp;.&nbsp; Now I have 5.&#8221;</p>
<p>Embarrassing, Chen! But do you have guilty feelings like <a href="http://mwidlake.wordpress.com"><strong>Martin Widlake</strong></a> does?  He makes <a href="http://mwidlake.wordpress.com/2009/11/01/buffer-cache-hit-ratio-my-guilty-confession">a guilty confession</a>.  The sin?  &#8220;I use the Buffer Cache Hit Ratio.&#8221;</p>
<p>Last, <a href="http://tylermuth.wordpress.com"><strong>Tyler Muth</strong></a> introduces <a href="http://tylermuth.wordpress.com/2009/11/03/logger-a-plsql-logging-and-debugging-utility">Logger, A PL/SQL Logging and Debugging Utility</a>.</p>
<p>That is all for now.  Please let&#8217;s hear your favourite database blogs in the comments.  Until next time!</p><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22135&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22135&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Fri, 06 Nov 2009 17:47:34 +0000</pubDate>
    <dc:creator>Dave Edwards</dc:creator>
    <category>Log Buffer</category>
    <category>MySQL</category>
    <category>Oracle</category>
    <category>PostgreSQL</category>
    <category>SQL Server</category>
    <category>Technical Blog</category>
  </item>

  <item>
    <title>451 CAOS Links 2009.11.06</title>
    <guid isPermaLink="false">http://blogs.the451group.com/opensource/?p=1273</guid>
    <link>http://feedproxy.google.com/~r/451opensource/~3/hA82okI_pOE/</link>
    <description>Funambol acquires Zapatec. Open source gains Closure. And more.
Follow 451 CAOS Links live @caostheory on Twitter and Identi.ca
&amp;#8220;Tracking the open source news wires, so you don&amp;#8217;t have to.&amp;#8221;
For the latest on Oracle&amp;#8217;s acquisition of MySQL via Sun, see Everything you always wanted to know about MySQL but were afraid to ask
# Funambol acquired Zapatec, an AJAX web 2.0 frameworks vendor. 
# The top ten issues facing open source users, according to Mark Radcliffe. 
# Google open sourced its Closure JavaScript tools. 
# Sam Ramji explained why open source still matters in the cloud. 
# Likewise Software launched Enterprise Starter Packs for integration with Active Directory. 
# Eucalyptus Systems updated its open source cloud computing software. 
# RiverMuse hired a new CEO, released Enterprise Subscription version. 
# Jaspersoft joined the Open Source Software Institute. 
# Hitachi joined Red Hat&amp;#8217;s Advanced Mission-Critical Program. 
# SourceForge became GeekNet, reported third quarter results. 
# Mark Fidelman supplied six laws for expanding an open source business strategy into Europe. 
# Eric Raymond launched ForgePlucker, for extracting project-state from forges for backup and offline analysis. 
# Versant relicensed its db4o database from GPLv2 to the GPLv3. 
# Hippo and Sonatype announced a distribution and technology partnership. 
# Mandriva released Mandriva Linux 2010. 
# Mary Jo Foley reported on Microsoft&amp;#8217;s Orchard open source content management app. 
# Zend Technologies is working with Oracle to deliver an integrated Unbreakable Linux and PHP offering. 
# Subversion was submitted to become an Apache incubator project. 
# Pentaho released Pentaho Analyzer and unveiled an “Agile BI” initiative. 
# Red Hat CEO said open source needs champions - such as Red Hat and Google. 
# eWeek reported that Microsoft has recommitted another $100K to Apache. 
# Stephen O&amp;#8217;Grady speculated on Amazon, RDS and the Future of MySQL.
</description>
    <content:encoded><![CDATA[<p>Funambol acquires Zapatec. Open source gains Closure. And more.</p>
<p>Follow 451 CAOS Links live @caostheory on <a href="http://twitter.com/caostheory">Twitter</a> and <a href="http://identi.ca/caostheory">Identi.ca</a><br />
<em>&#8220;Tracking the open source news wires, so you don&#8217;t have to.&#8221;</em></p>
<p>For the latest on Oracle&#8217;s acquisition of MySQL via Sun, see <a href="http://blogs.the451group.com/opensource/2009/10/26/everything-you-always-wanted-to-know-about-mysql-but-were-afraid-to-ask/">Everything you always wanted to know about MySQL but were afraid to ask</a></p>
<p># Funambol <a href="http://bit.ly/6lCLJ">acquired</a> Zapatec, an AJAX web 2.0 frameworks vendor. </p>
<p># The top ten issues facing open source users, <a href="http://bit.ly/1FKHeu">according to</a> Mark Radcliffe. </p>
<p># Google <a href="http://bit.ly/1a6HJ4">open sourced</a> its Closure JavaScript tools. </p>
<p># Sam Ramji <a href="http://bit.ly/4wejOV">explained</a> why open source still matters in the cloud. </p>
<p># Likewise Software <a href="http://bit.ly/sAMEA">launched</a> Enterprise Starter Packs for integration with Active Directory. </p>
<p># Eucalyptus Systems <a href="http://bit.ly/2wGprV">updated</a> its open source cloud computing software. </p>
<p># RiverMuse <a href="http://bit.ly/1xzHqd">hired</a> a new CEO, <a href="http://bit.ly/41wx5A">released</a> Enterprise Subscription version. </p>
<p># Jaspersoft <a href="http://bit.ly/dYrH0">joined</a> the Open Source Software Institute. </p>
<p># Hitachi <a href="http://bit.ly/3eoruG">joined</a> Red Hat&#8217;s Advanced Mission-Critical Program. </p>
<p># SourceForge <a href="http://bit.ly/2pqaEL">became</a> GeekNet, <a href="http://bit.ly/1NEwcf">reported</a> third quarter results. </p>
<p># Mark Fidelman <a href="http://bit.ly/2r4OkF">supplied</a> six laws for expanding an open source business strategy into Europe. </p>
<p># Eric Raymond <a href="http://bit.ly/MCbdF">launched</a> ForgePlucker, for extracting project-state from forges for backup and offline analysis. </p>
<p># Versant <a href="http://bit.ly/4u5BsA">relicensed</a> its db4o database from GPLv2 to the GPLv3. </p>
<p># Hippo and Sonatype <a href="http://bit.ly/avpK7">announced</a> a distribution and technology partnership. </p>
<p># Mandriva <a href="http://bit.ly/3p3FYj">released</a> Mandriva Linux 2010. </p>
<p># Mary Jo Foley <a href="http://bit.ly/4FchLD">reported</a> on Microsoft&#8217;s Orchard open source content management app. </p>
<p># Zend Technologies is <a href="http://bit.ly/1pUIGv">working with</a> Oracle to deliver an integrated Unbreakable Linux and PHP offering. </p>
<p># Subversion was <a href="http://bit.ly/3aQgQ6">submitted</a> to become an Apache incubator project. </p>
<p># Pentaho <a href="http://bit.ly/iXrtU">released</a> Pentaho Analyzer and unveiled an “Agile BI” initiative. </p>
<p># Red Hat CEO <a href="http://bit.ly/4xXdiJ">said</a> open source needs champions - such as Red Hat and Google. </p>
<p># eWeek <a href="http://bit.ly/mIjuD">reported</a> that Microsoft has recommitted another $100K to Apache. </p>
<p># Stephen O&#8217;Grady <a href="http://bit.ly/2guc94">speculated</a> on Amazon, RDS and the Future of MySQL.</p>
<img src="http://feeds.feedburner.com/~r/451opensource/~4/hA82okI_pOE" height="1" width="1" /><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22133&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22133&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Fri, 06 Nov 2009 16:24:22 +0000</pubDate>
    <dc:creator>The 451 Group</dc:creator>
    <category>Links</category>
    <category>451 group</category>
    <category>451caostheory</category>
    <category>451group</category>
    <category>amazon</category>
    <category>apache</category>
    <category>caostheory</category>
    <category>closure</category>
    <category>db4o</category>
    <category>eric raymond</category>
    <category>Eucalyptus</category>
    <category>forgeplucker</category>
    <category>funambol</category>
    <category>google</category>
    <category>hippo</category>
    <category>hitachi</category>
    <category>jaspersoft</category>
    <category>likewise</category>
    <category>Linux</category>
    <category>mandriva</category>
    <category>mark fidelman</category>
    <category>Mark Radcliffe</category>
    <category>matt aslett</category>
    <category>mattaslett</category>
    <category>matthew aslett</category>
    <category>matthewas</category>
  </item>

  <item>
    <title>Amazon RDS – The Beginner’s Guide</title>
    <guid isPermaLink="false">http://www.webyog.com/blog/?p=1085</guid>
    <link>http://www.webyog.com/blog/2009/11/06/amazon-rds-the-beginners-guide/</link>
    <description>On the eve of Microsoft’s announcement of the public release of SQL Azure Database, Amazon decides to release RDS. And that, too, after having resisted users’ demands for a relational database service for a very long time. Preemptive action, perhaps? Whatever it may be, I believe that such healthy competition can do the Cloud marketplace much good.
RDS brings with it the promise of MySQL on a Cloud. Having been a MySQL fan for quite some time now, I was itching to get my hands on an AWS account and check out what the hype was all about. Imagine my confusion when I signed up for Amazon RDS and all the AWS Management Console showed me was the EC2 dashboard! It was time I downloaded the Getting Started Guide and went through the rigmarole of studying it.
Setting up the Command Line Interface Tools
Apparently, there is no GUI yet for RDS. The only way to go about using it is through the CLI tools. Setting up the tools, however, can be quite a pain: There are no installers per se; you have to download the archive, extract it, and set up the environment manually. Here’s how I did it on my Windows XP box:

Prerequisites: The CLI tools are written in Java.  So you need to have either JDK or JRE installed on your system to be able to run them. I had JRE 6 installed.
 The Environment: There are a couple of environment variables that need to be set, manually:

The JAVA_HOME variable, containing the path of the Java runtime installed on the system.
The AWS_RDS_HOME variable, containing the path to the folder containing the CLI tools.



C:\&amp;gt;set JAVA_HOME=E:\Java\jre6
C:\&amp;gt;set AWS_RDS_HOME=C:\Amazon RDS\CLI


The Credential File: The archive containing the CLI tools also had a file named credential-file-path.template. I copied my AWS Access and Secret Keys into the place holders in the file, and then had to set yet another environment variable:

C:\&amp;gt;set AWS_CREDENTIAL_FILE=C:\Amazon RDS\CLI\credential-file-path.template

One Last Thing: Finally, I was almost done (setting up the CLI tools, that is)! All I had to do was add the path of the CLI tools to the PATH variable:

C:\&amp;gt;set PATH=%PATH%;C:\Amazon RDS\CLI
Creating a Database Instance
I went on to create an Extra Large database instance, with an allocated storage of 5GB:
C:\&amp;gt;rds-create-db-instance --engine MySQL5.1 --master-username root --master-user-password mypass
--db-name WebyogTestData --db-instance-identifier webyogtestinstance --allocated-storage 5
--db-instance-class db.m1.xlarge --header
DBINSTANCE  DBInstanceId        Class         Engine    Storage  Master Username
Status    Backup Retention
DBINSTANCE  webyogtestinstance  db.m1.xlarge  mysql5.1  5        root
creating  1
SECGROUP  Name     Status
SECGROUP  default  active
PARAMGRP  Group Name        Apply Status
PARAMGRP  default.mysql5.1  in-sync
C:\&amp;gt;

The rds-describe-db-instances command displays all the running instances:
C:\&amp;gt;rds-describe-db-instances
DBINSTANCE  webyogtestinstance  2009-11-06T08:40:52.571Z  db.m1.xlarge  mysql5.1
5   root  available  webyogtestinstance.clc2ed76md1v.us-east-1.rds.amazonaws.com 3306  us-east-1d  1
SECGROUP  default  active
PARAMGRP  default.mysql5.1  in-sync
C:\&amp;gt;
Voila! My database is up and running in no time.
Setting up SQLyog/MONyog to Connect to Amazon RDS
Here’s the best part about Amazon RDS: It has native MySQL 5.1 support. (Well, at this time, it supports no other RDBMS, but maybe it will in the future.) This means that I can use my favorite MySQL GUI tool to connect to the Amazon RDS database instance. Or at least that’s what Amazon claims.




SQLyog settings for RDS


I filled in the master user name and password that I had used to create the DB instance. For the host address, I used webyogtestinstance.clc2ed76md1v.us-east-1.rds.amazonaws.com (I noticed it in the output of the rds-describe-db-instances command).
Apprehensively, I clicked on the Test Connection button and without a hitch it connected successfully. Notice that SQLyog reports that it has connected to MySQL 5.1.38-log.




Success connecting to Amazon RDS


Setting up MONyog was as simple.




MONyog displaying Amazon RDS DB stats









MONyog displaying the InnoDB Cache stats



In Conclusion
RDS is but a MySQL 5.1 instance running on an EC2 platform, bringing with it all the advantages of EC2. You can scale your server to use up to 68GB of memory, 26 ECUs, and 1TB of persistent storage. On the flip-side, Amazon RDS does not support replication yet.
Periodically, the Amazon RDS system performs some maintenance of the database instance. This ensures that your server is running smoothly. This also translates into a 4-hour down time period on a weekly basis.
There is a pattern emerging here, if you look close enough. Back in 2008, at about the same time, when it became clear that Microsoft would announce a Windows-based Cloud, Amazon jumped in and announced support for Windows-based EC2 instances. And Amazon has managed to do it again with RDS. Microsoft, it seems, drives Amazon harder than users do.
For us at Webyog this is an exciting development. We believe that our products (SQLyog and MONyog) are very well &amp;#8216;fit for the Cloud&amp;#8217;.  Much more fit than the console-based tools that most advanced users still seem to use.  We will now start checking our programs in detail with this. Till now we&amp;#8217;ve found no issues.

Want to Know More?
Read more about Amazon RDS from their website.
Amazon RDS Functionality.
Pricing plans for Amazon RDS.
Sign up for Amazon RDS.</description>
    <content:encoded><![CDATA[<p>On the eve of Microsoft’s announcement of the public release of SQL Azure Database, Amazon decides to release RDS. And that, too, after having resisted users’ demands for a relational database service for a very long time. Preemptive action, perhaps? Whatever it may be, I believe that such healthy competition can do the Cloud marketplace much good.</p>
<p>RDS brings with it the promise of MySQL on a Cloud. Having been a MySQL fan for quite some time now, I was itching to get my hands on an AWS account and check out what the hype was all about. Imagine my confusion when I signed up for Amazon RDS and all the AWS Management Console showed me was the EC2 dashboard! It was time I downloaded the Getting Started Guide and went through the rigmarole of studying it.</p>
<h2>Setting up the Command Line Interface Tools</h2>
<p>Apparently, there is no GUI yet for RDS. The only way to go about using it is through the CLI tools. Setting up the tools, however, can be quite a pain: There are no installers per se; you have to download the archive, extract it, and set up the environment manually. Here’s how I did it on my Windows XP box:</p>
<ul type="disc">
<li>Prerequisites: The CLI tools are written in Java.  So you need to have either JDK or JRE installed on your system to be able to run them. I had JRE 6 installed.</li>
<li> The Environment: There are a couple of environment variables that need to be set, manually:
<ul type="circle">
<li>The JAVA_HOME variable, containing the path of the Java runtime installed on the system.</li>
<li>The AWS_RDS_HOME variable, containing the path to the folder containing the CLI tools.</li>
</ul>
</li>
</ul>
<pre>C:\&gt;set JAVA_HOME=E:\Java\jre6</pre>
<pre>C:\&gt;set AWS_RDS_HOME=C:\Amazon RDS\CLI</pre>
<ul type="disc">
<li>The Credential File: The archive containing the CLI tools also had a file named credential-file-path.template. I copied my AWS Access and Secret Keys into the place holders in the file, and then had to set yet another environment variable:</li>
</ul>
<pre>C:\&gt;set AWS_CREDENTIAL_FILE=C:\Amazon RDS\CLI\credential-file-path.template</pre>
<ul type="disc">
<li>One Last Thing: Finally, I was almost done (setting up the CLI tools, that is)! All I had to do was add the path of the CLI tools to the PATH variable:</li>
</ul>
<pre>C:\&gt;set PATH=%PATH%;C:\Amazon RDS\CLI</pre>
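<p>The settings above can be sanity-checked before running any command. What follows is a minimal Python sketch, not part of the Amazon tools: the function name check_rds_cli_env is hypothetical, and the only assumption is that each of the three variables named in this section should point at an existing path.</p>

```python
import os

# Variable names come from the setup steps above; the check itself is an illustrative assumption.
REQUIRED_VARS = ("JAVA_HOME", "AWS_RDS_HOME", "AWS_CREDENTIAL_FILE")


def check_rds_cli_env(env=None):
    """Return a list of human-readable problems; an empty list means the setup looks sane."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        if not value:
            problems.append(name + " is not set")
        elif not os.path.exists(value):
            problems.append(name + " points to a missing path: " + value)
    return problems


if __name__ == "__main__":
    # Print one line per problem; print nothing when all three variables resolve.
    for problem in check_rds_cli_env():
        print(problem)
```

<p>Passing a dictionary instead of the real environment makes the check easy to try out without touching the system settings.</p>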
<h2>Creating a Database Instance</h2>
<p>I went on to create an Extra Large database instance, with an allocated storage of 5GB:</p>
<pre>C:\&gt;rds-create-db-instance --engine MySQL5.1 --master-username <strong>root</strong> --master-user-password<strong> mypass</strong>
--db-name <strong>WebyogTestData</strong> --db-instance-identifier webyogtestinstance --allocated-storage 5
--db-instance-class db.m1.xlarge --header
DBINSTANCE  DBInstanceId        Class         Engine    Storage  Master Username
Status    Backup Retention
DBINSTANCE  webyogtestinstance  db.m1.xlarge  mysql5.1  5        root
<strong>creating</strong>  1
SECGROUP  Name     Status
SECGROUP  default  active
PARAMGRP  Group Name        Apply Status
PARAMGRP  default.mysql5.1  in-sync
C:\&gt;</pre>
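<p>For scripted setups, the same call can be assembled programmatically. The Python sketch below only builds the argument list, using the flags from the command shown above; the helper name build_create_command and its default values are assumptions made for illustration, not part of the RDS tools.</p>

```python
def build_create_command(instance_id, db_name, user, password,
                         storage_gb=5, instance_class="db.m1.xlarge",
                         engine="MySQL5.1"):
    """Assemble the rds-create-db-instance call as an argument list.

    Only flags that appear in the command above are used; the defaults
    mirror the values from that example.
    """
    return [
        "rds-create-db-instance",
        "--engine", engine,
        "--master-username", user,
        "--master-user-password", password,
        "--db-name", db_name,
        "--db-instance-identifier", instance_id,
        "--allocated-storage", str(storage_gb),
        "--db-instance-class", instance_class,
        "--header",
    ]


if __name__ == "__main__":
    cmd = build_create_command("webyogtestinstance", "WebyogTestData", "root", "mypass")
    # Print the command instead of running it, so nothing is created by accident.
    print(" ".join(cmd))
```

<p>Keeping the arguments in a list avoids quoting problems with values that contain spaces. Note that a password passed on the command line may be visible to other users of the same machine.</p>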
<p>
<p>The rds-describe-db-instances command lists the database instances along with their current status:</p>
<pre>C:\&gt;rds-describe-db-instances
DBINSTANCE  webyogtestinstance  2009-11-06T08:40:52.571Z  db.m1.xlarge  mysql5.1
5   root  available <strong> webyogtestinstance.clc2ed76md1v.us-east-1.rds.amazonaws.com 3306</strong>  us-east-1d  1
SECGROUP  default  active
PARAMGRP  default.mysql5.1  in-sync
C:\&gt;</pre>
<p>Voila! My database is up and running in no time.</p>
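<p>The host name and port that a MySQL client needs are buried in that output. Here is a small Python sketch that pulls them out. It assumes, based only on the sample above, that the endpoint is the token ending in .rds.amazonaws.com and that the port follows it; the function name find_endpoint is hypothetical.</p>

```python
def find_endpoint(describe_output):
    """Return (host, port) from rds-describe-db-instances text, or None if absent."""
    tokens = describe_output.split()
    for i, token in enumerate(tokens):
        # Assumption: the endpoint is the only token with this suffix,
        # and the port is the numeric token right after it.
        if token.endswith(".rds.amazonaws.com") and i + 1 < len(tokens):
            following = tokens[i + 1]
            if following.isdigit():
                return token, int(following)
    return None


# Sample text copied from the output shown above.
SAMPLE = """DBINSTANCE  webyogtestinstance  2009-11-06T08:40:52.571Z  db.m1.xlarge  mysql5.1
5   root  available  webyogtestinstance.clc2ed76md1v.us-east-1.rds.amazonaws.com 3306  us-east-1d  1
SECGROUP  default  active
PARAMGRP  default.mysql5.1  in-sync"""

if __name__ == "__main__":
    print(find_endpoint(SAMPLE))
```

<p>An instance that is still in the creating state has no endpoint yet, in which case the function returns None.</p>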
<h2>Setting up SQLyog/MONyog to Connect to Amazon RDS</h2>
<p>Here’s the best part about Amazon RDS: It has native MySQL 5.1 support. (Well, at this time, it supports no other RDBMS, but maybe it will in the future.) This means that I can use my favorite <a href="http://webyog.com/" target="_blank">MySQL GUI tool</a> to connect to the Amazon RDS database instance. Or at least that’s what Amazon claims.</p>
<p>
<div>
<dl>
<dt><img src="http://webyog.com/blog/wp-content/uploads/2009/11/sqlyogsettings.JPG" alt="SQLyog Settings for RDS" width="801" height="523" /></dt>
<dd><strong>SQLyog settings for RDS</strong></dd>
</dl>
</div>
<p>I filled in the master user name and password that I had used to create the DB instance. For the host address, I used webyogtestinstance.clc2ed76md1v.us-east-1.rds.amazonaws.com (I noticed it in the output of the rds-describe-db-instances command).</p>
<p>Apprehensively, I clicked the Test Connection button, and it connected without a hitch. Notice that SQLyog reports a connection to MySQL 5.1.38-log.</p>
<p>
<div>
<dl>
<dt><img src="http://webyog.com/blog/wp-content/uploads/2009/11/sqlyogsuccess.JPG" alt="Success connecting to Amazon RDS" width="807" height="603" /></dt>
<dd><strong>Success connecting to Amazon RDS</strong></dd>
</dl>
</div>
<p>Setting up <a href="http://webyog.com/" target="_blank">MONyog</a> was just as simple.</p>
<p>
<div>
<dl>
<dt><img src="http://webyog.com/blog/wp-content/uploads/2009/11/MONrds.JPG" alt="MONyog displaying Amazon RDS DB stats" width="811" height="522" /></dt>
<dd><strong>MONyog displaying Amazon RDS DB stats</strong></dd>
</dl>
</div>
<h4>
<dl>
<dt><img src="http://webyog.com/blog/wp-content/uploads/2009/11/MONrds1.JPG" alt="MONyog displaying the InnoDB Cache stats" width="813" height="523" /></dt>
<dd>
<h4><strong>MONyog displaying the InnoDB Cache stats</strong></h4>
</dd>
</dl>
</h4>
<h2>In Conclusion</h2>
<p>RDS is but a MySQL 5.1 instance running on an EC2 platform, bringing with it all the advantages of EC2. You can scale your server to use up to 68GB of memory, 26 ECUs, and 1TB of persistent storage. On the flip-side, Amazon RDS does not support replication yet.</p>
<p>Periodically, the Amazon RDS system performs maintenance on the database instance to keep your server running smoothly. This also translates into a 4-hour downtime window every week.</p>
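<p>To put the weekly window into perspective, here is a back-of-the-envelope calculation in Python. It assumes the entire 4-hour window is spent offline every week and that there is no other downtime, so it is an illustration rather than a figure published by Amazon.</p>

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours


def weekly_availability(maintenance_hours):
    """Fraction of the week the instance is reachable if the whole window is spent offline."""
    return 1 - maintenance_hours / HOURS_PER_WEEK


if __name__ == "__main__":
    print(round(weekly_availability(4) * 100, 2))  # 97.62 (percent)
    print(4 * 52)  # 208 hours of potential downtime per year
```

<p>In practice a maintenance run may well take less than the full window, so the real figure would be higher.</p>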
<p>There is a pattern emerging here, if you look closely enough. Back in 2008, at about the same time, when it became clear that Microsoft would announce a Windows-based Cloud, Amazon jumped in and announced support for Windows-based EC2 instances. And Amazon has managed to do it again with RDS. Microsoft, it seems, drives Amazon harder than users do.</p>
<p>For us at Webyog this is an exciting development. We believe that our products (SQLyog and MONyog) are very much &#8216;fit for the Cloud&#8217;, much more so than the console-based tools that most advanced users still seem to use. We will now start testing our programs against RDS in detail. So far we&#8217;ve found no issues.</p>
<p>
<h2>Want to Know More?</h2>
<p><a href="http://aws.amazon.com/rds/">Read more about Amazon RDS on their website</a>.</p>
<p><a href="http://aws.amazon.com/rds/#functionality">Amazon RDS Functionality</a>.</p>
<p><a href="http://aws.amazon.com/rds/#pricing">Pricing plans for Amazon RDS</a>.</p>
<p><a href="https://www.amazon.com/ap/signin?openid.ns=http://specs.openid.net/auth/2.0&amp;authCookies=1&amp;openid.mode=checkid_setup&amp;openid.identity=http://specs.openid.net/auth/2.0/identifier_select&amp;openid.claimed_id=http://specs.openid.net/auth/2.0/identifier_select&amp;openid.pape.max_auth_age=600&amp;openid.return_to=https://www.amazon.com/gp/aws/ssop/handlers/auth-portal.html?ie=UTF8&amp;wreply=https%253A%252F%252Faws-portal.amazon.com%252Fgp%252Faws%252Fdeveloper%252Fsubscription%252Findex.html&amp;awsrequestchallenge=false&amp;wtrealm=urn%253Aaws%253AawsAccessKeyId%253A1QQFCEAYKJXP0J7S2T02&amp;wctx=productCodepRmAmazonRDSpRm&amp;wa=wsignin1.0&amp;awsrequesttfa=true&amp;openid.assoc_handle=ssop&amp;openid.pape.preferred_auth_policies=http://schemas.openid.net/pape/policies/2007/06/multi-factor-physical&amp;openid.ns.pape=http://specs.openid.net/extensions/pape/1.0&amp;siteState=awsMode::signUp::productName::Amazon%20Relational%20Database%20Service::&amp;">Sign up for Amazon RDS</a>.</p><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22134&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22134&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Fri, 06 Nov 2009 15:57:33 +0000</pubDate>
    <dc:creator>Webyog</dc:creator>
    <category>MySQL</category>
    <category>News</category>
  </item>

  <item>
    <title>PBMS backup and S3 storage</title>
    <guid isPermaLink="false">tag:blogger.com,1999:blog-5313622847232042654.post-383176410971588446</guid>
    <link>http://bpbdev.blogspot.com/2009/11/pbms-backup-and-s3-storage.html</link>
    <description>Hi,So I had to make some changes to the way that backup worked with PBMS in order that it would behave as expected when the BLOB data is stored remotely on an S3 server.I will go over how the PBMS repository is backed up first and then explain how it works with S3 storage.PBMS provides a pbms_backup table in its 'pbms' database that records all the backups that have been performed on any of the PBMS repositories on the server. The create statement for this table looks like this:CREATE TABLE pbms.pbms_backup (id                INT NOT NULL AUTO_INCREMENT,Database_Name     VARCHAR(64) NOT NULL,Started                      TIMESTAMP,Completed                TIMESTAMP,IsRunning                 BOOL,IsDump                     BOOL,Location                    VARCHAR(1024),Cloud_Ref                INT,Cloud_backup_no  INT,PRIMARY KEY (id))There are 2 ways to backup a PBMS repository:MySQLDump: To backup the repository using 'mysqldump' you simply dump the database's 'pbms_dump' table. This table is a one column table of type 'long blob'. Any select from this table is assumed to be a repository backup and any insert into the table is assumed to be a repository recovery.  Selecting the data from the 'pbms_dump' table results in a new record being added to the 'pbms.pbms_backup' table. To recover the repository the dumped data is inserted back into the 'pbms_dump' table. During the recovery it is important to set the pbms variable &quot;Restoring-Dump&quot; in the database's 'pbms_variable' table to &quot;TRUE&quot;. This tells the PBMS engine that the database is being recovered and that insertion of BLOB references should not increment the BLOB reference count.Engine level backup: To perform an engine level backup all you do is insert a record into the 'pbms.pbms_backup' table providing the name of the database to be backed up and the location into which the backup should be placed. This starts an asynchronous backup operation. 
To recover the repository from the backup all you do is drag and drop the backup into your recovered database.So how does this work when the actual BLOB data is stored on an S3 server?What happens is the backup process makes a copy of the BLOB data on the S3 server. In the case of an engine level backup the user can specify the S3 Server that the BLOBs should be backed up to which may be a different server entirely. BLOB data is stored on the S3 server using generated names matching the pattern:&amp;#60;database_id&amp;#62;/&amp;#60;backup_no&amp;#62;/&amp;#60;cloud_ref&amp;#62;.&amp;#60;time_stamp&amp;#62;.&amp;#60;sequence_count&amp;#62;Example: 1257393091/0/87.1257393139.626database_id is the id of the database as provided by the PBMS engine.backup_no is the backup number of the BLOB. For BLOBs that are not backup copies this number is always zero. The backup number is just a sequence counter that ensures that the backup BLOBs have a unique name. All backup BLOBs from the same backup will have the same backup number. This backup number is stored in the 'Cloud_backup_no' column of the 'pbms.pbms_backup' table.cloud_ref is the reference into the 'pbms.pbms_cloud' table that refers to the S3 server and bucket in which the BLOB is stored.time_stamp is the time in seconds  at which the BLOB was created.sequence_count is just a counter to ensure that no 2 BLOBs get the same name.When a backup is started a check is done to find an unused backup number. Then as each BLOB repository record is backed up the S3 server is sent a request to copy the BLOB to its new name with the new backup number which may also be in a different bucket or on a different server.The first step of the database recovery is to delete any BLOBs from the original database. 
This is done by performing the S3 equivalent of 'rm -r &amp;#60;database_id&amp;#62;/0/* ' .Recovery from an sqldump backup is just the reverse of the backup, as each BLOB repository record is written back to the repository the BLOB is copied back to its original name and location with the backup number set to zero. The 'cloud_ref' number is used to look up the S3 server location from which the original BLOB came.Recovery from an engine level backup is a bit different because the repository recovery is just a drag and drop operation. The first time the database is accessed the PBMS engines sees that the repository has just been recovered and starts the S3 BLOB recovery process. To recover the S3 BLOBs a list of all the BLOBs in the backup is made and then using the 'cloud_ref' number from the BLOB name the original location of the BLOB is found and the backed up BLOB is copied back to it.The nice thing about using S3 storage for the BLOBs is that the database BLOB repository can be backed up quite nicely just using mysqldump.When deleting old backups it is important to remember that there may be BLOBs on an S3 server that also need to be cleaned up. This is where you can make use of the 'pbms.pbms_backup' table to find the location of the backed up BLOBs and using a tool, such as the &quot;S3 FireFox Organizer&quot;, delete them. After that the backup record can be deleted from the 'pbms.pbms_backup' table. I could have had the PBMS engine delete all the backed up BLOBs for a backup when the record was deleted from the 'pbms.pbms_backup' table but I thought that that could lead to lost data if the user did not realize the side effects of deleting a backup record.Barry</description>
    <content:encoded><![CDATA[Hi,<br /><br />So I had to make some changes to the way that backup worked with PBMS in order that it would behave as expected when the BLOB data is stored remotely on an S3 server.<br /><br />I will go over how the PBMS repository is backed up first and then explain how it works with S3 storage.<br /><br />PBMS provides a pbms_backup table in its 'pbms' database that records all the backups that have been performed on any of the PBMS repositories on the server. The create statement for this table looks like this:<br /><blockquote><br />CREATE TABLE <span>pbms.pbms_backup</span> (<br /><span>id</span>                INT NOT NULL AUTO_INCREMENT,<br /><span>Database_Name</span>     VARCHAR(64) NOT NULL,<br /><span>Started </span>                     TIMESTAMP,<br /><span>Completed</span>                TIMESTAMP,<br /><span>IsRunning</span>                 BOOL,<br /><span>IsDump</span>                     BOOL,<br /><span>Location</span>                    VARCHAR(1024),<br /><span>Cloud_Ref </span>               INT,<br /><span>Cloud_backup_no</span>  INT,<br />PRIMARY KEY (id)<br />)<br /><br /></blockquote>There are 2 ways to backup a PBMS repository:<br /><br /><ul><li><span>MySQLDump</span>: To backup the repository using 'mysqldump' you simply dump the database's 'pbms_dump' table. This table is a one column table of type 'long blob'. Any select from this table is assumed to be a repository backup and any insert into the table is assumed to be a repository recovery.  Selecting the data from the 'pbms_dump' table results in a new record being added to the 'pbms.pbms_backup' table. To recover the repository the dumped data is inserted back into the 'pbms_dump' table. During the recovery it is important to set the pbms variable "Restoring-Dump" in the database's 'pbms_variable' table to "TRUE". 
This tells the PBMS engine that the database is being recovered and that insertion of BLOB references should not increment the BLOB reference count.</li><li><span>Engine level backup</span>: To perform an engine level backup all you do is insert a record into the 'pbms.pbms_backup' table providing the name of the database to be backed up and the location into which the backup should be placed. This starts an asynchronous backup operation. To recover the repository from the backup all you do is drag and drop the backup into your recovered database.</li></ul>So how does this work when the actual BLOB data is stored on an S3 server?<br /><br />What happens is the backup process makes a copy of the BLOB data on the S3 server. In the case of an engine level backup the user can specify the S3 Server that the BLOBs should be backed up to which may be a different server entirely. BLOB data is stored on the S3 server using generated names matching the pattern:<br /><br />&#60;database_id&#62;/&#60;backup_no&#62;/&#60;cloud_ref&#62;.&#60;time_stamp&#62;.&#60;sequence_count&#62;<br /><span>Example</span>: 1257393091/0/87.1257393139.626<br /><br /><ul><li><span>database_id</span> is the id of the database as provided by the PBMS engine.</li><li><span>backup_no</span> is the backup number of the BLOB. For BLOBs that are not backup copies this number is always zero. The backup number is just a sequence counter that ensures that the backup BLOBs have a unique name. All backup BLOBs from the same backup will have the same backup number. 
This backup number is stored in the 'Cloud_backup_no' column of the 'pbms.pbms_backup' table.</li><li><span>cloud_ref</span> is the reference into the 'pbms.pbms_cloud' table that refers to the S3 server and bucket in which the BLOB is stored.</li><li><span>time_stamp</span> is the time in seconds at which the BLOB was created.</li><li><span>sequence_count</span> is just a counter to ensure that no two BLOBs get the same name.</li></ul>When a backup is started a check is done to find an unused backup number. Then, as each BLOB repository record is backed up, the S3 server is sent a request to copy the BLOB to its new name with the new backup number, which may also be in a different bucket or on a different server.<br /><br />The first step of the database recovery is to delete any BLOBs from the original database. This is done by performing the S3 equivalent of '<span>rm -r &#60;database_id&#62;/0/*</span>'.<br /><br />Recovery from a mysqldump backup is just the reverse of the backup: as each BLOB repository record is written back to the repository, the BLOB is copied back to its original name and location with the backup number set to zero. The 'cloud_ref' number is used to look up the S3 server location from which the original BLOB came.<br /><br />Recovery from an engine level backup is a bit different because the repository recovery is just a drag and drop operation. The first time the database is accessed the PBMS engine sees that the repository has just been recovered and starts the S3 BLOB recovery process. 
To recover the S3 BLOBs a list of all the BLOBs in the backup is made and then using the 'cloud_ref' number from the BLOB name the original location of the BLOB is found and the backed up BLOB is copied back to it.<br /><br />The nice thing about using S3 storage for the BLOBs is that the database BLOB repository can be backed up quite nicely just using mysqldump.<br /><br />When deleting old backups it is important to remember that there may be BLOBs on an S3 server that also need to be cleaned up. This is where you can make use of the 'pbms.pbms_backup' table to find the location of the backed up BLOBs and using a tool, such as the "S3 FireFox Organizer", delete them. After that the backup record can be deleted from the 'pbms.pbms_backup' table. I could have had the PBMS engine delete all the backed up BLOBs for a backup when the record was deleted from the 'pbms.pbms_backup' table but I thought that that could lead to lost data if the user did not realize the side effects of deleting a backup record.<br /><br />Barry<div><img width="1" height="1" src="https://blogger.googleusercontent.com/tracker/5313622847232042654-383176410971588446?l=bpbdev.blogspot.com" /></div><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22138&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22138&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Fri, 06 Nov 2009 15:30:00 +0000</pubDate>
    <dc:creator>Barry Leslie</dc:creator>
  </item>

  <item>
    <title>Database Workbench 3.4.2 released</title>
    <guid isPermaLink="false">97D2B3EF-1AF0-4031-BDF2-AB91F655CB38</guid>
    <link>http://www.upscene.com/displaynews.php?item=20091106</link>
    <description>Ladies, gentlemen,

Upscene Productions is proud to announce the next 
version of the popular database development tool:

&quot; Database Workbench 3.4.2 &quot;

Version 3.4.2 includes bugfixes in addition to the
new features in version 3.4.0

Change Highlights in 3.4 and previous versions
---------------------------------------------------
- NexusDB v3 support
- Many diagramming improvements
- Better Stored Routine Debugger
- Ability to cancel import/export processes
- Monitoring GUI for Firebird 2.1
- Ability to cancel running queries in NexusDB, SQL Anywhere and Firebird 2.1
- Stored Routine debugger for SQL Anywhere
- Many new features, enhancements and bug fixes...
... and much more ...

Download a trial at: http://www.upscene.com/downloads.php
Full list of features and fixes: http://customer.upscene.com/script/mantisgateway.exe/fixed?fixedin=3.4.2&amp;projectid=1


Database Workbench supports:
- Borland InterBase ( 4.x - 8.x )
- Firebird ( 1.x, 2.x )
- MS SQL Server/MSDE ( v6.5, 7, 2000, 2005, 2008, MSDE 1 &amp; 2, SQL Express )
- MySQL 4.x, 5.x
- Oracle Database ( 8i, 9i, 10g, 11g )
- Sybase SQL Anywhere ( 9, 10 and 11 )
- NexusDB ( 2.08.x, 3.0.x )</description>
    <content:encoded><![CDATA[Ladies, gentlemen,<br />
<br />
Upscene Productions is proud to announce the next <br />
version of the popular database development tool:<br />
<br />
" Database Workbench 3.4.2 "<br />
<br /><br />
Version 3.4.2 includes bugfixes in addition to the
new features in version 3.4.0<br />
<br />
Change Highlights in 3.4 and previous versions<br />
---------------------------------------------------<br />
- NexusDB v3 support<br />
- Many diagramming improvements<br />
- Better Stored Routine Debugger<br />
- Ability to cancel import/export processes<br />
- Monitoring GUI for Firebird 2.1<br />
- Ability to cancel running queries in NexusDB, SQL Anywhere and Firebird 2.1<br />
- Stored Routine debugger for SQL Anywhere<br />
- Many new features, enhancements and bug fixes...<br />
... and much more ...<br />
<br />
Download a trial at: http://www.upscene.com/downloads.php<br />
Full list of features and fixes: http://customer.upscene.com/script/mantisgateway.exe/fixed?fixedin=3.4.2&projectid=1<br />
<br />
<br />
Database Workbench supports:<br />
- Borland InterBase ( 4.x - 8.x )<br />
- Firebird ( 1.x, 2.x )<br />
- MS SQL Server/MSDE ( v6.5, 7, 2000, 2005, 2008, MSDE 1 & 2, SQL Express )<br />
- MySQL 4.x, 5.x<br />
- Oracle Database ( 8i, 9i, 10g, 11g )<br />
- Sybase SQL Anywhere ( 9, 10 and 11 )<br />
- NexusDB ( 2.08.x, 3.0.x )<br /><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22132&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22132&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Fri, 06 Nov 2009 13:43:58 +0000</pubDate>
    <dc:creator>Martijn Tonies</dc:creator>
  </item>

  <item>
    <title>Amazon RDS, MySQL, Hmm?</title>
    <guid isPermaLink="false">http://krow.livejournal.com/676287.html</guid>
    <link>http://krow.livejournal.com/676287.html</link>
    <description>When a name like Amazon gets into the business, everyone acts like everything is new and shiny again :)Why, I can remember like it was, well... almost a decade ago when I first saw a hosting vendor get into the business of support MySQL databases.The first vendor I remember? Rackspace. Why? Because way back when, they were the main sponsor for the second OpenSource Database Conference.Looking through the options that Amazon provides, they do look pretty sharp. Database backups can be a pain, and providing snapshot based ones is an excellent idea. There was some talk of snapshotting when I was at the MySQL User Group this week talking about Drizze. The general consensus was that this was the superior way to do backups.The other backup methods, like using a physical backup tool, certainly work (Innodb has sold one for years, and Percona has one that they provide). If logical backups are your thing you can always use mysqldump with Innodb to do an online backup (If you look through past blogs you can see how I use it and the distributed revision control system Mercurial to backup my databases and provide point in time recovery/selective restores).Backup services to S3 are not new at all. Zmanda started to provide a product for backing it up years ago. Googling on &quot;S3 mysql backup&quot;, I found several links for HOWTO to do it. Looking at the API, I believe native driver support is available. I would be curious to find out if they are supporting just the ASCII protocol, or if they have enabled the binary one as well (which is an awesome way of crashing your database). I am going to assume these databases are all in a self contained instance. If not? Should be simple enough to crash you and your neighbor. It would be interesting to find out how many changes they had to make to the database (or still have to make... all of this reminds me that they were heavily recruiting at the MySQL User's Conference last year). 
It was pretty obvious back then that they were going to work on this service. There is no mention of SSL support. The SSL stuff built into MySQL would be a pain to make work, but I suspect they could with some hacking.I wonder if libdrizzle's mysql bits will work? I suspect we will have to try that out. If so, anyone need sharding in their driver?BTW if you follow their &quot;use mysqldump&quot; model for pushing data into the service, remember that --single-transaction will allow you to do a hot backup. There is no need to lock up your current database. I am still appalled at how few people know that.  Years ago we should have renamed &quot;mysqldump&quot; to &quot;mysqlbackup&quot; and defaulted the settings for Innodb. Outside of licensing, I suspect &quot;how to backup&quot; my database was the most often requested question.Oh, that and &quot;are you worried now that Oracle has acquired Innodb?&quot;Renaming our dump tool to backup would have made for pretty slick marketing :)Amazon's sizing numbers look good. They have hit the sweet spot for most users.Monitoring is nice, but also, not that hard. If you have everything already in AWS I can see where the &quot;one stop shopping&quot; would be nice (or is that one-click?).There is more you can find in an article by Jeff. Rightscale has an article up as well. I'm really curious to find out what they have disabled in 5.1, I have a hard time believing they can support all of its features. Coming up with an upgrade path should be interesting to watch. I've seen no sign of any patches coming back from them, so I would assume they have rolled their own. This means either they will be on 5.1 for a long time, or will be spending a lot of engineering time maintaining their own version.  (Most of the bigger MySQL shops do, so it is not all that surprising.) The number of folks on the planet that can really maintain custom versions well is pretty small. I don't believe Amazon has any of them on staff. 
I am sure though that they can find people who can figure it out. I have seen mixed results when companies keep their own custom versions of MySQL. It can be pretty easy to code yourself into a corner when you don't play an active part in the community. 5.1 should provide many years worth of use for them though.Kudos to Amazon.  Having them run a database service won't provide the &quot;high end&quot; sort of usage that keeps the folks who tune databases in business, but there certainly are a lot of users for whom this type of service will work just fine.At the very least this service will certainly up the ante for others in the MySQL hosting business.Providing services like RDS and application is the future for cloud companies. It will be interesting to see what they will come up with next.I'm surprised Memcached hasn't been done as a service yet, but perhaps that is why I have those SASL patches sitting in my inbox. We still have a little while before Gearman will be announced :)</description>
    <content:encoded><![CDATA[When a name like <a href="http://aws.amazon.com/rds/">Amazon</a> gets into the business, everyone acts like everything is new and shiny again :)<br /><br />Why, I can remember like it was, well... almost a decade ago when I first saw a hosting vendor get into the business of supporting MySQL databases.<br /><br />The first vendor I remember? <a href="http://www.rackspace.com/solutions/managed_hosting/services/database/mysql.php">Rackspace</a>. Why? Because way back when, they were the main sponsor for the second OpenSource Database Conference.<br /><br />Looking through the options that Amazon provides, they do look pretty sharp. Database backups can be a pain, and providing snapshot-based ones is an excellent idea. There was some talk of snapshotting when I was at the MySQL User Group this week talking about <a href="http://drizzle.org">Drizzle</a>. The general consensus was that this was the superior way to do backups.<br /><br />The other backup methods, like using a physical backup tool, certainly work (Innodb has <a href="http://www.innodb.com/products/hot-backup/">sold one for years</a>, and Percona has <a href="https://launchpad.net/percona-xtrabackup">one that they provide</a>). If logical backups are your thing you can always use mysqldump with Innodb to do an online backup (if you look through past blogs you can see how I use it and the distributed revision control system <a href="http://mercurial.selenic.com/">Mercurial</a> to back up my databases and provide point in time recovery/selective restores).<br /><br />Backup services to S3 are not new at all. <a href="http://www.zmanda.com/zrm-mysql-enterprise.html">Zmanda</a> started to provide a product for backing it up years ago. Googling on "S3 mysql backup", I found several HOWTO links for doing it. <br /><br />Looking at the API, I believe native driver support is available. 
I would be curious to find out if they are supporting just the ASCII protocol, or if they have enabled the binary one as well (which is an awesome way of crashing your database). I am going to assume these databases are all in a self contained instance. If not? Should be simple enough to crash you and your neighbor. It would be interesting to find out how many changes they had to make to the database (or still have to make... all of this reminds me that they were heavily recruiting at the MySQL User's Conference last year). It was pretty obvious back then that they were going to work on this service. There is no mention of SSL support. The SSL stuff built into MySQL would be a pain to make work, but I suspect they could with some hacking.<br /><br />I wonder if libdrizzle's mysql bits will work? I suspect we will have to try that out. If so, anyone need sharding in their driver?<br /><br />BTW if you follow their "use mysqldump" model for pushing data into the service, remember that --single-transaction will allow you to do a hot backup. There is no need to lock up your current database. I am still appalled at how few people know that.  Years ago we should have renamed "mysqldump" to "mysqlbackup" and defaulted the settings for Innodb. Outside of licensing, I suspect "how to backup" my database was the most often requested question.<br /><br />Oh, that and "are you worried now that Oracle has acquired Innodb?"<br /><br />Renaming our dump tool to backup would have made for pretty slick marketing :)<br /><br />Amazon's sizing numbers look good. They have hit the sweet spot for most users.<br /><br />Monitoring is nice, but also, not that hard. If you have everything already in AWS I can see where the "one stop shopping" would be nice (or is that one-click?).<br /><br />There is more you can find in an <a href="http://aws.typepad.com/aws/2009/10/introducing-rds-the-amazon-relational-database-service-.html">article by Jeff</a>. 
<a href="http://blog.rightscale.com/2009/10/26/amazon-relational-database-service/">Rightscale</a> has an article up as well. <br /><br />I'm really curious to find out what they have disabled in 5.1, I have a hard time believing they can support all of its features. Coming up with an upgrade path should be interesting to watch. I've seen no sign of any patches coming back from them, so I would assume they have rolled their own. This means either they will be on 5.1 for a long time, or will be spending a lot of engineering time maintaining their own version.  (Most of the bigger MySQL shops do, so it is not all that surprising.) The number of folks on the planet that can really maintain custom versions well is pretty small. I don't believe Amazon has any of them on staff. I am sure though that they can find people who can figure it out. <br /><br />I have seen mixed results when companies keep their own custom versions of MySQL. It can be pretty easy to code yourself into a corner when you don't play an active part in the community. 5.1 should provide many years worth of use for them though.<br /><br />Kudos to Amazon.  Having them run a database service won't provide the "high end" sort of usage that keeps the folks who tune databases in business, but there certainly are a lot of users for whom this type of service will work just fine.<br /><br />At the very least this service will certainly up the ante for others in the MySQL hosting business.<br /><br />Providing services like RDS and application is the future for cloud companies. It will be interesting to see what they will come up with next.<br /><br />I'm surprised <a href="http://code.google.com/p/memcached/">Memcached</a> hasn't been done as a service yet, but perhaps that is why I have those SASL patches sitting in my inbox. <br /><br />We still have a little while before <a href="http://gearman.org/">Gearman</a> will be announced :)<br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22130&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22130&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Fri, 06 Nov 2009 08:18:19 +0000</pubDate>
    <dc:creator>Brian Aker</dc:creator>
  </item>

  <item>
    <title>Air traffic queries in MyISAM and Tokutek (TokuDB)</title>
    <guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1641</guid>
    <link>http://www.mysqlperformanceblog.com/2009/11/05/air-traffic-queries-in-myisam-and-tokutek-tokudb/</link>
    <description>This is the next post in the series:
Analyzing air traffic performance with InfoBright and MonetDB
Air traffic queries in LucidDB
Air traffic queries in InfiniDB: early alpha
Let me explain why I chose these engines. After the first three posts I was often asked, &quot;What is the baseline? Can we compare the results with standard MySQL engines?&quot; So here is MyISAM, as a base point for seeing how much better the column-oriented analytic engines do on this workload.
However, keep in mind that MyISAM needs proper indexes to execute these queries efficiently, and indexes bring pain of their own: loading data gets slower, and designing the right indexes takes extra research, especially when the MySQL optimizer is not smart about picking the best one.
The really nice thing about MonetDB, InfoBright and InfiniDB is that they do not need indexes, so you do not have to worry about maintaining them or picking the best one. I am not sure about LucidDB: I was told indexes are needed, but creating a new index was really fast even on the full database, so I guess they are not B-Tree indexes. These reflections on indexes turned me in the direction of TokuDB.
What is so special about TokuDB? Two things. First, its indexes have a special structure and are &quot;cheap&quot;; by &quot;cheap&quot; I mean that the maintenance cost is constant and independent of data size. With regular B-Tree indexes the cost grows quickly with data size (Bradley Kuszmaul from Tokutek will correct me if I am wrong in this statement). Second, TokuDB uses compression, so I expect the loaded data to be smaller and queries to need fewer IO operations.
So which indexes do we need for the queries? As a reminder, the schema is available in this post
http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/, and
the queries are on the &quot;Queries&quot; sheet of my summary Spreadsheet.
With Bradley's help we chose the following indexes:
KEY `Year` (`Year`,`Month`),
KEY `Year_2` (`Year`,`DayOfWeek`),
KEY `DayOfWeek` (`DayOfWeek`,`Year`,`DepDelay`),
KEY `DestCityName` (`DestCityName`,`OriginCityName`,`Year`),
KEY `Year_3` (`Year`,`DestCityName`,`OriginCityName`),
KEY `Year_4` (`Year`,`Carrier`,`DepDelay`),
KEY `Origin` (`Origin`,`Year`,`DepDelay`)
I then measured the load time for both MyISAM and TokuDB, loading into an empty table with the indexes already created.
Load time for MyISAM: 16608 sec
For TokuDB: 19131 sec
Data size (including indexes):
MyISAM: 36.7GB
TokuDB: 6.7GB
I am a bit surprised that TokuDB is slower at loading data; my guess is that this is related to compression, and I expect that with a bigger amount of data TokuDB will be faster than MyISAM.
Now to the queries. Bradley pointed out to me that query Q5 SELECT t.carrier, c, c2, c*1000/c2 as c3 FROM (SELECT carrier,
count(*) AS c FROM ontime WHERE DepDelay&gt;10 AND Year=2007 GROUP BY
carrier) t JOIN (SELECT carrier, count(*) AS c2 FROM ontime WHERE
Year=2007 GROUP BY carrier) t2 ON (t.Carrier=t2.Carrier) ORDER BY c3 can be rewritten as
SELECT carrier,totalflights,ndelayed,ndelayed*1000/totalflights as c3 FROM (SELECT carrier,count(*) as totalflights,sum(if(depdelay&gt;10,1,0)) as ndelayed from ontime where year=2007 group by carrier) t order by c3 desc; (I call it query Q5i.)
The summary table with query execution times (in seconds; lower is better):


Query | MyISAM | TokuDB
Q0 | 72.84 | 50.25
Q1 | 61.03 | 55.01
Q2 | 98.12 | 58.36
Q3 | 123.04 | 66.87
Q4 | 6.92 | 6.91
Q5 | 13.61 | 11.86
Q5i | 7.68 | 6.96
Q6 | 123.84 | 69.03
Q7 | 187.22 | 159.62
Q8 (1y) | 8.75 | 7.59
Q8 (2y) | 102.17 | 64.95
Q8 (3y) | 104.7 | 69.76
Q8 (4y) | 107.05 | 70.46
Q8 (10y) | 119.54 | 84.64
Q9 | 69.05 | 47.67
For reference, I used 5.1.36-Tokutek-2.1.0 for both the MyISAM and TokuDB tests.
And if you are interested in comparing MyISAM with the engines from the previous posts:


Query | MyISAM | MonetDB | InfoBright | LucidDB | InfiniDB
Q0 | 72.84 | 29.9 | 4.19 | 103.21 | NA
Q1 | 61.03 | 7.9 | 12.13 | 49.17 | 6.79
Q2 | 98.12 | 0.9 | 6.73 | 27.13 | 4.59
Q3 | 123.04 | 1.7 | 7.29 | 27.66 | 4.96
Q4 | 6.92 | 0.27 | 0.99 | 2.34 | 0.75
Q5 | 13.61 | 0.5 | 2.92 | 7.35 | NA
Q6 | 123.84 | 12.5 | 21.83 | 78.42 | NA
Q7 | 187.22 | 27.9 | 8.59 | 106.37 | NA
Q8 (1y) | 8.75 | 0.55 | 1.74 | 6.76 | 8.13
Q8 (2y) | 102.17 | 1.1 | 3.68 | 28.82 | 16.54
Q8 (3y) | 104.7 | 1.69 | 5.44 | 35.37 | 24.46
Q8 (4y) | 107.05 | 2.12 | 7.22 | 41.66 | 32.49
Q8 (10y) | 119.54 | 29.14 | 17.42 | 72.67 | 70.35
Q9 | 69.05 | 6.3 | 0.31 | 76.12 | 9.54
All results are available in the summary Spreadsheet.
I deliberately did not put TokuDB in the same table as the analytics-oriented databases, to highlight that TokuDB is a general-purpose OLTP engine.
As you can see, it does better than MyISAM on every query.
    
    Entry posted by Vadim</description>
    <content:encoded><![CDATA[<p>This is the next post in the series:<br />
<a href="http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/">Analyzing air traffic performance with InfoBright and MonetDB</a><br />
<a href="http://www.mysqlperformanceblog.com/2009/10/26/air-traffic-queries-in-luciddb/">Air traffic queries in LucidDB</a><br />
<a href="http://www.mysqlperformanceblog.com/2009/11/02/air-traffic-queries-in-infinidb-early-alpha/">Air traffic queries in InfiniDB: early alpha</a></p>
<p>Let me explain why I chose these engines. After the first three posts I was often asked, "What is the baseline? Can we compare the results with standard MySQL engines?" So here is MyISAM, as a base point for seeing how much better the column-oriented analytic engines do on this workload.</p>
<p>However, keep in mind that MyISAM needs proper indexes to execute these queries efficiently, and indexes bring pain of their own: loading data gets slower, and designing the right indexes takes extra research, especially when the MySQL optimizer is not smart about picking the best one.</p>
<p>The really nice thing about MonetDB, InfoBright and InfiniDB is that they do not need indexes, so you do not have to worry about maintaining them or picking the best one. I am not sure about LucidDB: I was told indexes are needed, but creating a new index was really fast even on the full database, so I guess they are not B-Tree indexes. These reflections on indexes turned me in the direction of TokuDB.</p>
<p>What is so special about TokuDB? Two things. First, its indexes have a special structure and are "cheap"; by "cheap" I mean that the maintenance cost is constant and independent of data size. With regular B-Tree indexes the cost grows quickly with data size (Bradley Kuszmaul from Tokutek will correct me if I am wrong in this statement). Second, TokuDB uses compression, so I expect the loaded data to be smaller and queries to need fewer IO operations.</p>
<p>So which indexes do we need for the queries? As a reminder, the schema is available in this post<br />
<a href="http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/">http://www.mysqlperformanceblog.com/2009/10/02/analyzing-air-traffic-performance-with-infobright-and-monetdb/</a>, and<br />
the queries are on the "Queries" sheet of my summary <a href="https://spreadsheets.google.com/a/percona.com/ccc?key=0AjsVX7AnrCYwdERIZFVqakRrcXplM0g0UktaUkRwenc&amp;hl=en">Spreadsheet</a>.</p>
<p>With Bradley's help we chose the following indexes:</p>
<div><span>CODE:</span>
<div>
<div>
<ol>
<li>
<div>KEY `Year` <span>&#40;</span>`Year`,`Month`<span>&#41;</span>,</div>
</li>
<li>
<div>&nbsp; KEY `Year_2` <span>&#40;</span>`Year`,`DayOfWeek`<span>&#41;</span>,</div>
</li>
<li>
<div>&nbsp; KEY `DayOfWeek` <span>&#40;</span>`DayOfWeek`,`Year`,`DepDelay`<span>&#41;</span>,</div>
</li>
<li>
<div>&nbsp; KEY `DestCityName` <span>&#40;</span>`DestCityName`,`OriginCityName`,`Year`<span>&#41;</span>,</div>
</li>
<li>
<div>&nbsp; KEY `Year_3` <span>&#40;</span>`Year`,`DestCityName`,`OriginCityName`<span>&#41;</span>,</div>
</li>
<li>
<div>&nbsp; KEY `Year_4` <span>&#40;</span>`Year`,`Carrier`,`DepDelay`<span>&#41;</span>,</div>
</li>
<li>
<div>&nbsp; KEY `Origin` <span>&#40;</span>`Origin`,`Year`,`DepDelay`<span>&#41;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>I then measured the load time for both MyISAM and TokuDB, loading into an empty table with the indexes already created.</p>
<p>Load time for MyISAM: <strong>16608 sec</strong><br />
For TokuDB: <strong>19131 sec</strong></p>
<p>Datasize (including indexes)</p>
<p>MyISAM: <strong>36.7GB</strong><br />
TokuDB: <strong>6.7GB</strong></p>
<p>I am a bit surprised that TokuDB is slower at loading data; my guess is that this is related to compression, and I expect that with a bigger amount of data TokuDB will be faster than MyISAM.</p>
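<p>To put the load figures above in proportion, here is a quick arithmetic sketch in Python. It only restates the numbers quoted in this post; the variable names are illustrative and not part of any tool used in the benchmark.</p>

```python
# Figures quoted in the post: load time in seconds, data size (with indexes) in GB.
load_sec = {"MyISAM": 16608, "TokuDB": 19131}
size_gb = {"MyISAM": 36.7, "TokuDB": 6.7}

# How much longer the TokuDB load took, relative to MyISAM.
load_ratio = load_sec["TokuDB"] / load_sec["MyISAM"]

# How many times smaller the TokuDB data set is on disk.
size_ratio = size_gb["MyISAM"] / size_gb["TokuDB"]

print(f"TokuDB load time: {load_ratio:.2f}x MyISAM")
print(f"TokuDB data size: {size_ratio:.1f}x smaller than MyISAM")
```

<p>On this data set the TokuDB load was roughly 15% slower, while the data on disk came out about five and a half times smaller.</p>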
<p>Now to the queries. Bradley pointed out to me that query Q5 <code>SELECT t.carrier, c, c2, c*1000/c2 as c3 FROM (SELECT carrier,<br />
count(*) AS c FROM ontime WHERE DepDelay>10 AND Year=2007 GROUP BY<br />
carrier) t JOIN (SELECT carrier, count(*) AS c2 FROM ontime WHERE<br />
Year=2007 GROUP BY carrier) t2 ON (t.Carrier=t2.Carrier) ORDER BY c3</code> can be rewritten as<br />
<code>SELECT carrier,totalflights,ndelayed,ndelayed*1000/totalflights as c3 FROM (SELECT carrier,count(*) as totalflights,sum(if(depdelay>10,1,0)) as ndelayed from ontime where year=2007 group by carrier) t order by c3 desc;</code> (I call it query Q5i.)</p>
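<p>The rewrite of Q5 into Q5i can be checked on a toy data set. The following is a minimal sketch, not the benchmark itself: it assumes SQLite (through Python's <code>sqlite3</code> module) as a stand-in for MySQL, uses a handful of made-up rows, and replaces MySQL's <code>IF()</code> with a portable <code>CASE</code> expression. SQLite also does integer division where MySQL would return a decimal, so the <code>c3</code> values are truncated here.</p>

```python
import sqlite3

# Made-up sample rows (Carrier, Year, DepDelay); not the real air traffic data.
rows = [
    ("AA", 2007, 25), ("AA", 2007, 5), ("AA", 2007, 40),
    ("UA", 2007, 0), ("UA", 2007, 15),
    ("WN", 2007, 3),   # carrier with no flight delayed by more than 10 minutes
    ("AA", 2006, 90),  # different year, must be ignored by both queries
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ontime (Carrier TEXT, Year INTEGER, DepDelay INTEGER)")
conn.executemany("INSERT INTO ontime VALUES (?, ?, ?)", rows)

# Q5 as posted: two aggregations over the table, joined on carrier.
q5 = """
SELECT t.Carrier, c, c2, c*1000/c2 AS c3
FROM (SELECT Carrier, count(*) AS c FROM ontime
      WHERE DepDelay > 10 AND Year = 2007 GROUP BY Carrier) t
JOIN (SELECT Carrier, count(*) AS c2 FROM ontime
      WHERE Year = 2007 GROUP BY Carrier) t2 ON (t.Carrier = t2.Carrier)
ORDER BY c3
"""

# Q5i: a single aggregation with a conditional sum (CASE stands in for IF()).
q5i = """
SELECT Carrier, totalflights, ndelayed, ndelayed*1000/totalflights AS c3
FROM (SELECT Carrier, count(*) AS totalflights,
             sum(CASE WHEN DepDelay > 10 THEN 1 ELSE 0 END) AS ndelayed
      FROM ontime WHERE Year = 2007 GROUP BY Carrier) t
ORDER BY c3 DESC
"""

# Normalise both results to {carrier: (delayed, total, rate)} for comparison.
q5_result = {carrier: (delayed, total, rate)
             for carrier, delayed, total, rate in conn.execute(q5)}
q5i_result = {carrier: (delayed, total, rate)
              for carrier, total, delayed, rate in conn.execute(q5i)}

print("Q5 :", q5_result)
print("Q5i:", q5i_result)
```

<p>Two differences are worth keeping in mind when comparing timings. The inner join in Q5 drops carriers that have no delayed flights at all, whereas Q5i reports them with a rate of 0, and Q5 sorts ascending while Q5i sorts descending. For carriers with at least one delayed flight, both forms return the same counts and the same rate.</p>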
<p>The summary table with query execution times (in seconds; lower is better):</p>
<table border=1>
<tr>
<td>Query</td>
<td>MyISAM</td>
<td>TokuDB</td>
</tr>
<tr>
<td>Q0</td>
<td>72.84</td>
<td>50.25</td>
</tr>
<tr>
<td>Q1</td>
<td>61.03</td>
<td>55.01</td>
</tr>
<tr>
<td>Q2</td>
<td>98.12</td>
<td>58.36</td>
</tr>
<tr>
<td>Q3</td>
<td>123.04</td>
<td>66.87</td>
</tr>
<tr>
<td>Q4</td>
<td>6.92</td>
<td>6.91</td>
</tr>
<tr>
<td>Q5</td>
<td>13.61</td>
<td>11.86</td>
</tr>
<tr>
<td>Q5i</td>
<td>7.68</td>
<td>6.96</td>
</tr>
<tr>
<td>Q6</td>
<td>123.84</td>
<td>69.03</td>
</tr>
<tr>
<td>Q7</td>
<td>187.22</td>
<td>159.62</td>
</tr>
<tr>
<td>Q8 (1y)</td>
<td>8.75</td>
<td>7.59</td>
</tr>
<tr>
<td>Q8 (2y)</td>
<td>102.17</td>
<td>64.95</td>
</tr>
<tr>
<td>Q8 (3y)</td>
<td>104.7</td>
<td>69.76</td>
</tr>
<tr>
<td>Q8 (4y)</td>
<td>107.05</td>
<td>70.46</td>
</tr>
<tr>
<td>Q8 (10y)</td>
<td>119.54</td>
<td>84.64</td>
</tr>
<tr>
<td>Q9</td>
<td>69.05</td>
<td>47.67</td>
</tr>
</table>
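<p>The per-query gain is easier to see as a ratio. This short Python sketch copies the execution times from the table above and divides the MyISAM time by the TokuDB time; the dictionary layout and names are illustrative only.</p>

```python
# Execution times in seconds, copied from the table above: (MyISAM, TokuDB).
times = {
    "Q0": (72.84, 50.25),
    "Q1": (61.03, 55.01),
    "Q2": (98.12, 58.36),
    "Q3": (123.04, 66.87),
    "Q4": (6.92, 6.91),
    "Q5": (13.61, 11.86),
    "Q5i": (7.68, 6.96),
    "Q6": (123.84, 69.03),
    "Q7": (187.22, 159.62),
    "Q8 (1y)": (8.75, 7.59),
    "Q8 (2y)": (102.17, 64.95),
    "Q8 (3y)": (104.7, 69.76),
    "Q8 (4y)": (107.05, 70.46),
    "Q8 (10y)": (119.54, 84.64),
    "Q9": (69.05, 47.67),
}

# A speedup above 1 means TokuDB finished faster than MyISAM on that query.
speedup = {query: myisam / tokudb for query, (myisam, tokudb) in times.items()}

for query, ratio in sorted(speedup.items(), key=lambda item: item[1], reverse=True):
    print(f"{query:9s} {ratio:.2f}x")

best = max(speedup, key=speedup.get)
worst = min(speedup, key=speedup.get)
```

<p>On these numbers the gain ranges from essentially nothing on Q4 to about 1.8x on Q3.</p>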
<p>For reference, I used 5.1.36-Tokutek-2.1.0 for both the MyISAM and TokuDB tests.</p>
<p>And if you are interested in comparing MyISAM with the engines from the previous posts:</p>
<table border=1>
<tr>
<td>Query</td>
<td>MyISAM</td>
<td>MonetDB</td>
<td>InfoBright</td>
<td>LucidDB</td>
<td>InfiniDB</td>
</tr>
<tr>
<td>Q0</td>
<td>72.84</td>
<td>29.9</td>
<td>4.19</td>
<td>103.21</td>
<td>NA</td>
</tr>
<tr>
<td>Q1</td>
<td>61.03</td>
<td>7.9</td>
<td>12.13</td>
<td>49.17</td>
<td>6.79</td>
</tr>
<tr>
<td>Q2</td>
<td>98.12</td>
<td>0.9</td>
<td>6.73</td>
<td>27.13</td>
<td>4.59</td>
</tr>
<tr>
<td>Q3</td>
<td>123.04</td>
<td>1.7</td>
<td>7.29</td>
<td>27.66</td>
<td>4.96</td>
</tr>
<tr>
<td>Q4</td>
<td>6.92</td>
<td>0.27</td>
<td>0.99</td>
<td>2.34</td>
<td>0.75</td>
</tr>
<tr>
<td>Q5</td>
<td>13.61</td>
<td>0.5</td>
<td>2.92</td>
<td>7.35</td>
<td>NA</td>
</tr>
<tr>
<td>Q6</td>
<td>123.84</td>
<td>12.5</td>
<td>21.83</td>
<td>78.42</td>
<td>NA</td>
</tr>
<tr>
<td>Q7</td>
<td>187.22</td>
<td>27.9</td>
<td>8.59</td>
<td>106.37</td>
<td>NA</td>
</tr>
<tr>
<td>Q8 (1y)</td>
<td>8.75</td>
<td>0.55</td>
<td>1.74</td>
<td>6.76</td>
<td>8.13</td>
</tr>
<tr>
<td>Q8 (2y)</td>
<td>102.17</td>
<td>1.1</td>
<td>3.68</td>
<td>28.82</td>
<td>16.54</td>
</tr>
<tr>
<td>Q8 (3y)</td>
<td>104.7</td>
<td>1.69</td>
<td>5.44</td>
<td>35.37</td>
<td>24.46</td>
</tr>
<tr>
<td>Q8 (4y)</td>
<td>107.05</td>
<td>2.12</td>
<td>7.22</td>
<td>41.66</td>
<td>32.49</td>
</tr>
<tr>
<td>Q8 (10y)</td>
<td>119.54</td>
<td>29.14</td>
<td>17.42</td>
<td>72.67</td>
<td>70.35</td>
</tr>
<tr>
<td>Q9</td>
<td>69.05</td>
<td>6.3</td>
<td>0.31</td>
<td>76.12</td>
<td>9.54</td>
</tr>
</table>
<p>All results are available in the <a href="https://spreadsheets.google.com/a/percona.com/ccc?key=0AjsVX7AnrCYwdERIZFVqakRrcXplM0g0UktaUkRwenc&amp;hl=en">summary Spreadsheet</a>.</p>
<p>I deliberately did not put TokuDB in the same table as the analytics-oriented databases, to highlight that TokuDB is a general-purpose OLTP engine.<br />
As you can see, it does better than MyISAM on every query.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by Vadim |
      <a href="http://www.mysqlperformanceblog.com/2009/11/05/air-traffic-queries-in-myisam-and-tokutek-tokudb/#comments">One comment</a></p>
<br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22128&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22128&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Fri, 06 Nov 2009 06:21:03 +0000</pubDate>
    <dc:creator>MySQL Performance Blog</dc:creator>
    <category>OLAP</category>
    <category>benchmarks</category>
    <category>dw</category>
    <category>mysql</category>
  </item>

  <item>
    <title>Back from Hiatus - Summary Update 2</title>
    <guid isPermaLink="false">http://blog.tonybain.com/tony_bain/2009/11/back-from-hiatus-summary-update-2.html</guid>
    <link>http://blog.tonybain.com/tony_bain/2009/11/back-from-hiatus-summary-update-2.html</link>
    <description>Back from Hiatus - Summary Update 1
GoodData
GoodData has launched and is providing a cloud-based analytics platform for integration with online apps. They are starting with an initial focus on Salesforce data, but are working hard on expanding the list of ISVs who choose to provide their customers with analytics via GoodData.
GoodData was started by “good guy” Czech serial entrepreneur Roman Stanek (NetBeans) and has just raised funds from Andreessen Horowitz and appointed Tim O’Reilly to the board. GoodData is interesting because it is simple, accessible and available on demand. It is still early days, but I think Roman is on to another winner here. I would certainly recommend that any ISV building cloud-based apps look at their platform.
Mark Logic
I was keen to learn more about Mark Logic as I didn’t understand their products in any detail. David and Ron were more than obliging and I sat down with them last week for a run-through.
In short, I am impressed by the technology of Mark Logic. It is a database that uses XML as the schema data model and XQuery as the primary query language. But it is far more than an XML extension bolted on top of a traditional db engine (such as some of the XML capabilities offered by the more traditional RDBMS vendors). Internally Mark Logic has all the important DBMS components, but they are designed and optimized around the XML schema (query processor, indexing etc) from the ground up. I also understand they have distributed multi-node capability, something which is still quite rare in the general-purpose RDBMS world.
Mark Logic has a history in the content publishing market, as you would expect, because much “published” data is naturally represented in XML. I did sense the team at Mark Logic is keen to break away from this niche a little (while at the same time respecting that this will likely remain their primary market). Exactly how they go about this isn’t entirely clear to me, as the world has kind of moved on from the “XML for everything” excitement that existed in the early 2000s. There will be plenty of case-by-case requirements, but it is hard to drive business development in a piecemeal market. But publishing remains a clear staple and I am sure they can leverage this into a few more markets.
I did get somewhat excited when we were talking about serializing JSON in and out of Mark Logic. This is very topical in the web app market as we see a push towards client-based web applications and web services dishing up JSON. But this is not necessarily a money spinner as there are “free” offerings servicing this need already (CouchDB, MongoDB etc). I understand Mark Logic is under a proprietary license, so it might be hard to gain traction here.
Kognitio
I spoke briefly with Kognitio a couple of weeks back. I hear very little about Kognitio so I was keen to speak to them about their progress. Kognitio is a UK-based company and provides a data warehouse appliance; while they only launched in the US last year, they have a much longer history in the UK.
Kognitio seems to be taking a different approach to achieving growth from the one many of the US vendors are using. While most of the US companies are venture backed and are pushing hard to gain market share, Kognitio is privately backed and seems to be taking a slower and more methodical approach. This has obviously served them well in the UK, but it will be interesting to see how that plays out in the highly crowded, highly competitive US data warehousing scene. It may turn out to be a true test of who really does win out of the tortoise and the hare.
Infobright
The big news at Infobright is that Miriam is no longer CEO and she has been replaced by a temporary CEO, board member Mark Burton. I spoke with Mark a couple of days ago and the reasons cited were around future direction and the next stage in the company’s lifecycle etc. They are still sorting this all out and expect to be ready to start discussing their new direction in a few weeks. In saying that, when we spoke I got the feeling their positioning will still be very tied to the MySQL customer base, something I tend to disagree with. But it would be premature to speculate, so I will instead wait until further information is available.</description>
    <content:encoded><![CDATA[<p><em><a href="http://blog.tonybain.com/tony_bain/2009/11/back-from-hiatus-summary-update-1.html">Back from Hiatus - Summary Update 1</a><br /></em></p><h2>GoodData</h2><a href="http://www.gooddata.com/" target="_blank">GoodData</a> has launched and is providing a cloud-based analytics platform for integration with online apps. They are starting with an initial focus on Salesforce data, but are working hard on expanding the list of ISVs who choose to provide their customers with analytics via GoodData.<br /><br />GoodData was started by “good guy” Czech serial entrepreneur <a href="http://twitter.com/RomanStanek" target="_blank">Roman Stanek</a> (NetBeans) and has just raised funds from Andreessen Horowitz and appointed Tim O’Reilly to the board. GoodData is interesting because it is simple, accessible and available on demand. It is still early days, but I think Roman is on to another winner here. I would certainly recommend that any ISV building cloud-based apps look at their platform.<br /><br /><h2>Mark Logic</h2>I was keen to learn more about <a href="http://www.marklogic.com/" target="_blank">Mark Logic</a> as I didn’t understand their products in any detail. David and Ron were more than obliging and I sat down with them last week for a run-through.<br /><br />In short, I am impressed by the technology of Mark Logic. It is a database that uses XML as the schema data model and XQuery as the primary query language. But it is far more than an XML extension bolted on top of a traditional db engine (such as some of the XML capabilities offered by the more traditional RDBMS vendors). Internally Mark Logic has all the important DBMS components, but they are designed and optimized around the XML schema (query processor, indexing etc) from the ground up. I also understand they have distributed multi-node capability, something which is still quite rare in the general-purpose RDBMS world.
<br /><br />Mark Logic has a history in the content publishing market, as you would expect, because much “published” data is naturally represented in XML. I did sense the team at Mark Logic is keen to break away from this niche a little (while at the same time respecting that this will likely remain their primary market). Exactly how they go about this isn’t entirely clear to me, as the world has kind of moved on from the “XML for everything” excitement that existed in the early 2000s. There will be plenty of case-by-case requirements, but it is hard to drive business development in a piecemeal market. But publishing remains a clear staple and I am sure they can leverage this into a few more markets.<br /><br />I did get somewhat excited when we were talking about serializing JSON in and out of Mark Logic. This is very topical in the web app market as we see a push towards client-based web applications and web services dishing up JSON. But this is not necessarily a money spinner as there are “free” offerings servicing this need already (CouchDB, MongoDB etc). I understand Mark Logic is under a proprietary license, so it might be hard to gain traction here.<br /><br /><h2>Kognitio</h2>I spoke briefly with <a href="http://www.kognitio.com/" target="_blank">Kognitio</a> a couple of weeks back. I hear very little about Kognitio so I was keen to speak to them about their progress. Kognitio is a UK-based company and provides a data warehouse appliance; while they only launched in the US last year, they have a much longer history in the UK.<br /><br />Kognitio seems to be taking a different approach to achieving growth from the one many of the US vendors are using. While most of the US companies are venture backed and are pushing hard to gain market share, Kognitio is privately backed and seems to be taking a slower and more methodical approach. This has obviously served them well in the UK, but it will be interesting to see how that plays out in the highly crowded, highly competitive US data warehousing scene. It may turn out to be a true test of who really does win out of the tortoise and the hare.
<br /><br /><h2>Infobright</h2>The big news at <a href="http://www.infobright.com" target="_blank">Infobright</a> is that Miriam is no longer CEO and she has been replaced by a temporary CEO, board member Mark Burton. I spoke with Mark a couple of days ago and the reasons cited were around future direction and the next stage in the company’s lifecycle etc. They are still sorting this all out and expect to be ready to start discussing their new direction in a few weeks. In saying that, when we spoke I got the feeling their positioning will still be very tied to the MySQL customer base, something I tend to disagree with. But it would be premature to speculate, so I will instead wait until further information is available.<br /><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22127&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22127&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Fri, 06 Nov 2009 04:03:22 +0000</pubDate>
    <dc:creator>Tony Bain</dc:creator>
    <category>Business Intelligence</category>
    <category>Cloud Databases</category>
    <category>Data Integration</category>
    <category>Database Management</category>
    <category>Microsoft SQL Server</category>
    <category>MySQL</category>
    <category>NoSQL</category>
    <category>Relational DB</category>
    <category>Web 2.0</category>
    <category>Web/Tech</category>
  </item>

  <item>
    <title>OQGRAPH session on MySQL University – recording now available</title>
    <guid isPermaLink="false">http://openquery.com/blog/?p=1123</guid>
    <link>http://openquery.com/blog/oqgraph-session-mysql-university-recording</link>
    <description>It was fun doing the MySQL University session on OQGRAPH yesterday. Now also available: slides (PDF) and audio/video recording (FLV download, if anyone can convert to a more open format, that&amp;#8217;d be great).</description>
    <content:encoded><![CDATA[<p>It was fun doing the <a href="http://openquery.com/blog/oqgraph-engine-mysql-university-5-nov-2009-1000-utc" target="_blank">MySQL University session on OQGRAPH</a> yesterday. Now also available: <a href="http://openquery.com/files/oqgraph-mysqluni-2009-11-05.pdf" target="_blank">slides</a> (PDF) and <a href="http://openquery.com/files/oqgraph-mysqluni-2009-11-05.flv">audio/video recording</a> (FLV download, if anyone can convert to a more open format, that&#8217;d be great).</p><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22124&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22124&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Fri, 06 Nov 2009 02:17:19 +0000</pubDate>
    <dc:creator>Open Query</dc:creator>
    <category>GRAPH engine</category>
    <category>graphengine</category>
    <category>mysql</category>
    <category>MySQL University</category>
    <category>Open Query</category>
    <category>OQGRAPH</category>
    <category>presentation</category>
    <category>recording</category>
    <category>session</category>
    <category>slides</category>
  </item>

  <item>
    <title>New developers training course is almost ready</title>
    <guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1630</guid>
    <link>http://www.mysqlperformanceblog.com/2009/11/05/new-developers-training-course-is-almost-ready/</link>
    <description>We've been busy expanding our training curriculum to include training for developers building applications with MySQL.  We have reached the point where we're ready for a pilot teach - and it brings me great pleasure to announce that we're opening it up for blog readers to attend, free of charge.
The details:
 San Francisco
4th December
9:30AM - 5PM
Spaces are limited, so to give everyone a fair chance we're delaying registration to open at noon tomorrow (Friday) Pacific Time. It's strictly first in first served, so be quick!  The registration link is here.
    
    Entry posted by Morgan Tocker</description>
    <content:encoded><![CDATA[<p>We've been busy expanding our training curriculum to include training for developers building applications with MySQL.  We have reached the point where we're ready for a pilot teach - and it brings me great pleasure to announce that we're opening it up for blog readers to attend, <em>free of charge</em>.</p>
<p><strong>The details:<br />
</strong> San Francisco<br />
4th December<br />
9:30AM - 5PM</p>
<p>Spaces are limited, so to give everyone a fair chance <strong>we're delaying registration to open at noon tomorrow</strong> (Friday) Pacific Time. It's strictly first in first served, so be quick!  The <a href="http://percona-ca-sfo-dev.eventbrite.com/">registration link is here</a>.</p>
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by Morgan Tocker |
      <a href="http://www.mysqlperformanceblog.com/2009/11/05/new-developers-training-course-is-almost-ready/#comments">One comment</a></p>
<br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22125&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22125&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Fri, 06 Nov 2009 01:56:25 +0000</pubDate>
    <dc:creator>Morgan Tocker</dc:creator>
    <category>announce</category>
    <category>community</category>
    <category>mysql</category>
    <category>percona</category>
  </item>

  <item>
    <title>PBMS Cloud storage is back!</title>
    <guid isPermaLink="false">tag:blogger.com,1999:blog-5313622847232042654.post-2555970065377414593</guid>
    <link>http://bpbdev.blogspot.com/2009/11/pbms-cloud-storage-is-back.html</link>
    <description>Hi, Support for S3 BLOB storage has now been fully integrated into the PBMS engine. It works in much the same way that I described in an earlier post, but with some important changes, so I will explain it all again here. When using S3 BLOB storage with PBMS, the BLOB reference tracking and metadata are handled the same as before in that they are stored in the BLOB record in the repository, but the actual BLOB is stored on an S3 server. To set up S3 storage you need to add an S3 cloud reference record to the pbms.pbms_cloud table provided by PBMS. For example: INSERT INTO pbms.pbms_cloud(ID, Server, bucket, PublicKey, PrivateKey) VALUES(16, &quot;S3.amazonaws.com&quot;, &quot;PBMS-Test&quot;, &quot;abc123&quot;, &quot;amjr15vWq&quot;); Then you need to tell PBMS which database should use S3 cloud storage for its BLOBs. This is done by updating a couple of records in the pbms_variable table that PBMS provides for each user database. For example, to set up the database &quot;myDB&quot; for S3 cloud storage you would do the following: UPDATE myDB.pbms_variable set value = &quot;16&quot; where name = &quot;S3-Cloud-Ref&quot;; UPDATE myDB.pbms_variable set value = &quot;CLOUD&quot; where name = &quot;Storage-type&quot;; The database &quot;myDB&quot; is now set up for cloud storage. All BLOB data will now be stored in the bucket &quot;PBMS-Test&quot; on the S3 server &quot;S3.amazonaws.com&quot;. This diagram shows the steps taken when the PBMS client library uploads a BLOB to the PBMS repository using S3 cloud storage. All of these steps are performed by one call to the PBMS client lib and the client application knows nothing about the type of BLOB storage being used: Step 1: The BLOB metadata is sent to the PBMS engine. Step 2: A repository record is created containing the BLOB metadata. Step 3: A reply is sent back to the client containing the BLOB reference which is passed back up to the client application to be inserted into the user's table in place of the BLOB. 
An S3 authorization signature is also returned to the client. The authorization signature is generated by the PBMS engine using the Public/Private keys for the S3 server to sign the BLOB upload request. Step 4: The PBMS client library uses the authorization signature to upload the BLOB to the S3 server. Alternatively the BLOB can be inserted directly into the user's table, in which case the PBMS engine will upload it to the S3 server. This is not the preferred way of doing things because it forces the MySQL server to handle the BLOB data, which is what we are trying to avoid by using the PBMS engine. Here is how the BLOB is accessed. Keep in mind that all the client application does is provide the PBMS lib with a PBMS BLOB reference, as it currently does, and receive the BLOB in return. It knows nothing about the cloud. Step 1: A BLOB request containing a PBMS BLOB reference is sent to the PBMS engine’s HTTP server. Step 2: The BLOB’s repository record is read from local storage. Step 3: An HTTP redirect reply is sent back to the client redirecting the request to the BLOB stored in the cloud. The metadata associated with the BLOB is returned to the client in the reply’s HTTP headers. The redirect URL is an authenticated query string that gives the client time-limited access to the BLOB data. Use of an authenticated query string allows the data in the cloud to have access protection without requiring the client applications to know the private key normally required to get access. 
Step 4: The redirect is followed to the cloud and the BLOB data is retrieved. The beauty of this system is that the client applications need never know how or where the actual BLOB data is stored, and since the BLOB transfer is all directly between the client and the S3 server, the BLOB data doesn't need to be handled at all by the machine on which the MySQL server is running. Because the BLOB repository records indirectly refer back to the S3 server via the pbms_cloud table, the administrator is free to move the BLOB data between S3 servers and/or buckets without having to do anything more than updating the record in the pbms_cloud table. For example, given the following setup: all that would be required to relocate the BLOB data from the Amazon S3 server to the Sun S3 server would be to: Copy the BLOB data from the Amazon server to the Sun server. Update the pbms_cloud table as: UPDATE pbms.pbms_cloud set server = &quot;S3.sunaws.com&quot;, Bucket = &quot;B3&quot;, PublicKey = &quot;zft123&quot;, PrivateKey = &quot;abc123&quot; where id = 17; Delete the BLOBs on the Amazon server. resulting in: I will explain how the new BLOB repository backup system works with cloud storage in another blog entry. Barry</description>
    <content:encoded><![CDATA[Hi,<br /><br />Support for S3 BLOB storage has now been fully integrated into the PBMS engine. It works in much the same way that I described in an earlier post, but with some important changes, so I will explain it all again here.<br /><br />When using S3 BLOB storage with PBMS, the BLOB reference tracking and metadata are handled the same as before in that they are stored in the BLOB record in the repository, but the actual BLOB is stored on an S3 server.<br /><br />To set up S3 storage you need to add an S3 cloud reference record to the pbms.pbms_cloud table provided by PBMS. For example:<br /><br /><span>INSERT INTO pbms.pbms_cloud(ID, Server, bucket, PublicKey, PrivateKey) VALUES(16, "S3.amazonaws.com", "PBMS-Test", "abc123", "amjr15vWq");</span><br /><br />Then you need to tell PBMS which database should use S3 cloud storage for its BLOBs. This is done by updating a couple of records in the pbms_variable table that PBMS provides for each user database. For example, to set up the database "myDB" for S3 cloud storage you would do the following:<br /><br /><span>UPDATE myDB.pbms_variable set value = "16" where name = "S3-Cloud-Ref";</span><br /><span>UPDATE myDB.pbms_variable set value = "CLOUD" where name = "Storage-type";</span><br /><br />The database "myDB" is now set up for cloud storage. All BLOB data will now be stored in the bucket "PBMS-Test" on the S3 server "S3.amazonaws.com".<br /><br />This diagram shows the steps taken when the PBMS client library uploads a BLOB to the PBMS repository using S3 cloud storage. 
All of these steps are performed by one call to the PBMS client lib and the client application knows nothing about the type of BLOB storage being used:<br /><br /><a href="http://1.bp.blogspot.com/_ESrUDY1Mluc/SvOCT_YAsZI/AAAAAAAAACE/xVUZjnTyv9I/s1600-h/BLOB_Insert.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 300px;" src="http://1.bp.blogspot.com/_ESrUDY1Mluc/SvOCT_YAsZI/AAAAAAAAACE/xVUZjnTyv9I/s400/BLOB_Insert.jpg" alt="" id="BLOGGER_PHOTO_ID_5400803658088624530" border="1" /></a><br /><br /><br /><ul><li><span>Step 1</span>: The BLOB metadata is sent to the PBMS engine.</li><li><span>Step 2</span>: A repository record is created containing the BLOB metadata.</li><li><span>Step 3</span>: A reply is sent back to the client containing the BLOB reference which is passed back up to the client application to be inserted into the user's table in place of the BLOB. An S3 authorization signature is also returned to the client. The authorization signature is generated by the PBMS engine using the Public/Private keys for the S3 server to sign the BLOB upload request.<br /></li><li><span>Step 4</span>: The PBMS client library uses the authorization signature to upload the BLOB to the S3 server.</li></ul>Alternatively the BLOB can be inserted directly into the user's table, in which case the PBMS engine will upload it to the S3 server. This is not the preferred way of doing things because it forces the MySQL server to handle the BLOB data, which is what we are trying to avoid by using the PBMS engine.<br /><br /><br />Here is how the BLOB is accessed. Keep in mind that all the client application does is provide the PBMS lib with a PBMS BLOB reference, as it currently does, and receive the BLOB in return. 
It knows nothing about the cloud.<br /><br /><a href="http://1.bp.blogspot.com/_ESrUDY1Mluc/SvODkQXamWI/AAAAAAAAACM/IRGD2q0t1BQ/s1600-h/BLOB_Retrieval.jpg"><img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 320px; height: 240px;" src="http://1.bp.blogspot.com/_ESrUDY1Mluc/SvODkQXamWI/AAAAAAAAACM/IRGD2q0t1BQ/s320/BLOB_Retrieval.jpg" alt="" id="BLOGGER_PHOTO_ID_5400805037039065442" border="1" /></a><br /><br /><br /><ul><li><span>Step 1</span>: A BLOB request containing a PBMS BLOB reference is sent to the PBMS engine’s HTTP server.</li><li><span>Step 2</span>: The BLOB’s repository record is read from local storage.</li><li><span>Step 3</span>: An HTTP redirect reply is sent back to the client redirecting the request to the BLOB stored in the cloud. The metadata associated with the BLOB is returned to the client in the reply’s HTTP headers. The redirect URL is an authenticated query string that gives the client time-limited access to the BLOB data. Use of an authenticated query string allows the data in the cloud to have access protection without requiring the client applications to know the private key normally required to get access. </li><li><span>Step 4</span>: The redirect is followed to the cloud and the BLOB data is retrieved.</li></ul>The beauty of this system is that the client applications need never know how or where the actual BLOB data is stored, and since the BLOB transfer is all directly between the client and the S3 server, the BLOB data doesn't need to be handled at all by the machine on which the MySQL server is running.<br /><br />Because the BLOB repository records indirectly refer back to the S3 server via the pbms_cloud table, the administrator is free to move the BLOB data between S3 servers and/or buckets without having to do anything more than updating the record in the pbms_cloud table. 
For example, given the following setup:<br /><br /><a href="http://1.bp.blogspot.com/_ESrUDY1Mluc/SvOEUv6hTpI/AAAAAAAAACU/PcO6F39tYFk/s1600-h/BLOB_location1.jpg"><img style="margin: 1px auto 10px; display: block; text-align: left; cursor: pointer; width: 320px; height: 240px;" src="http://1.bp.blogspot.com/_ESrUDY1Mluc/SvOEUv6hTpI/AAAAAAAAACU/PcO6F39tYFk/s320/BLOB_location1.jpg" alt="" id="BLOGGER_PHOTO_ID_5400805870141525650" border="1" /></a><br /><br /><br />all that would be required to relocate the BLOB data from the Amazon S3 server to the Sun S3 server would be to:<br /><br /><ul><li>Copy the BLOB data from the Amazon server to the Sun server.</li><li>Update the pbms_cloud table as: <span>UPDATE pbms.pbms_cloud set server = "S3.sunaws.com", Bucket = "B3", PublicKey = "zft123", PrivateKey = "abc123" where id = 17;</span></li><li>Delete the BLOBs on the Amazon server.</li></ul>resulting in:<br /><br /><a href="http://2.bp.blogspot.com/_ESrUDY1Mluc/SvOEz4stJbI/AAAAAAAAACc/--S6Fo3O-qI/s1600-h/BLOB_location2.jpg"><img style="margin: 0px auto 10px; display: block; text-align: right; cursor: pointer; width: 320px; height: 240px;" src="http://2.bp.blogspot.com/_ESrUDY1Mluc/SvOEz4stJbI/AAAAAAAAACc/--S6Fo3O-qI/s320/BLOB_location2.jpg" alt="" id="BLOGGER_PHOTO_ID_5400806405075445170" border="1" /></a><br /><br />I will explain how the new BLOB repository backup system works with cloud storage in another blog entry.<br /><br />Barry<div><img width="1" height="1" src="https://blogger.googleusercontent.com/tracker/5313622847232042654-2555970065377414593?l=bpbdev.blogspot.com" /></div><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22126&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22126&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Thu, 05 Nov 2009 23:31:00 +0000</pubDate>
    <dc:creator>Barry Leslie</dc:creator>
    <category>MySQL</category>
    <category>S3</category>
    <category>Cloud</category>
    <category>PBMS</category>
    <category>Drizzle</category>
  </item>

  <item>
    <title>The Danger of blocking the Oracle/Sun deal</title>
    <guid isPermaLink="false">http://blog.tonybain.com/tony_bain/2009/11/image-via-wikipediafyi---the-thoughts-here-have-been-gathered-from-conversations-with-several-individuals-including-an-inter.html</guid>
    <link>http://blog.tonybain.com/tony_bain/2009/11/image-via-wikipediafyi---the-thoughts-here-have-been-gathered-from-conversations-with-several-individuals-including-an-inter.html</link>
    <description>Image via Wikipedia. FYI - the thoughts here have been gathered from conversations with several individuals, including an interesting conversation yesterday.&amp;#0160; As these conversations were off the record I won’t name names here but thanks to those people. I love open source software and I am a big supporter of many companies that produce open source offerings.&amp;#0160; Here I am not going to debate if Oracle acquiring MySQL will be better for MySQL or not as that has been done to death.&amp;#0160; But I do think it is relevant to discuss the dangers of blocking a commercial vendor from acquiring a potentially competitive open source vendor. Many open source software initiatives are purely community backed and are constructed in an informal, ad-hoc manner.&amp;#0160; Many other initiatives are started within large new generation companies to serve their own needs, and then made available for the benefit of the community at large as a side benefit. This way of building software however limits the audience to users who have the necessary technical capability to build, deploy and support that software internally without a dependency on another supporting body (other than the community of course).&amp;#0160; To open the software up as an option to a much wider customer base a more formal structure is required.&amp;#0160; Typically a company is formed to develop, document, support, promote and rally the community and allow the software to be used in a much greater capacity. But without large license sales bootstrapping an open source company is very difficult.&amp;#0160; Instead open source companies often use venture finance as a means of resourcing the company, to grow both the company and the software to some form of critical mass.&amp;#0160; In the database world off the top of my head I can think of a bunch of open source companies who have taken this approach (MySQL prior to Sun, 10gen, Infobright, EnterpriseDB and so on).&amp;#0160; Venture 
finance firms invest to help create significant value, and then create an exit.&amp;#0160; IPOs are unlikely in the database industry in general right now, and less likely for open source database companies.&amp;#0160; So the most likely exit is through acquisition.&amp;#0160; The number of companies who have the size, ability, direction &amp;amp; motivation to acquire a highly successful venture-backed database startup (assuming good exit conditions) I could probably count on my fingers &amp;amp; toes. As dramatic as it sounds, blocking a commercial vendor from acquiring an open source vendor because of product overlaps could have a significant future impact.&amp;#0160; Startups seeking funding to build any killer app that overlaps with commercial software (of course, not just DB) may find some resistance from investors due to the potential exit issues.&amp;#0160; Of course protecting the consumer from anti-competitive behavior is a necessary evil, but we also have to ensure that the system that allows companies like MySQL to come into existence is also protected.

</description>
    <content:encoded><![CDATA[<p><a href="http://en.wikipedia.org/wiki/Image%3AOracle_Corporation_HQ.jpg"><img alt="Oracle headquarters" height="200" src="http://upload.wikimedia.org/wikipedia/en/thumb/9/9d/Oracle_Corporation_HQ.jpg/300px-Oracle_Corporation_HQ.jpg" style="border: medium none ; display: block;" width="300" /></a><span>Image via <a href="http://en.wikipedia.org/wiki/Image%3AOracle_Corporation_HQ.jpg">Wikipedia</a></span></p><em><span>FYI - the thoughts here have been gathered from conversations with several individuals, including an interesting conversation yesterday.&#0160; As these conversations were off the record I won’t name names here but thanks to those people.</span></em><br /><br />I love open source software and I am a big supporter of many companies that produce open source offerings.&#0160; Here I am not going to debate if Oracle acquiring MySQL will be better for MySQL or not as that has been done to death.&#0160; But I do think it is relevant to discuss the dangers of blocking a commercial vendor from acquiring a potentially competitive open source vendor.<br /><br />Many open source software initiatives are purely community backed and are constructed in an informal, ad-hoc manner.&#0160; Many other initiatives are started within large new generation companies to serve their own needs, and then made available for the benefit of the community at large as a side benefit.<br /><br />This way of building software however limits the audience to users who have the necessary technical capability to build, deploy and support that software internally without a dependency on another supporting body (other than the community of course).&#0160; To open the software up as an option to a much wider customer base a more formal structure is required.&#0160; Typically a company is formed to develop, document, support, promote and rally the community and allow the software to be used in a much greater capacity.<br /><br />But without large license sales 
bootstrapping an open source company is very difficult.&#0160; Instead open source companies often use venture finance as a means of resourcing the company, to grow both the company and the software to some form of critical mass.&#0160; In the database world off the top of my head I can think of a bunch of open source companies who have taken this approach (MySQL prior to Sun, <a href="http://www.10gen.com/" rel="homepage" title="10gen">10gen</a>, <a href="http://www.infobright.com" rel="homepage" title="Infobright">Infobright</a>, <a href="http://www.enterprisedb.com/" rel="homepage" title="EnterpriseDB">EnterpriseDB</a> and so on).&#0160; <br /><br />Venture finance firms invest to help create significant value, and then create an exit.&#0160; IPOs are unlikely in the database industry in general right now, and less likely for open source database companies.&#0160; So the most likely exit is through acquisition.&#0160; The number of companies who have the size, ability, direction &amp; motivation to acquire a highly successful venture-backed database startup (assuming good exit conditions) I could probably count on my fingers &amp; toes. <br /><br />As dramatic as it sounds, blocking a commercial vendor from acquiring an open source vendor because of product overlaps could have a significant future impact.&#0160; Startups seeking funding to build any killer app that overlaps with commercial software (of course, not just DB) may find some resistance from investors due to the potential exit issues.&#0160; Of course protecting the consumer from anti-competitive behavior is a necessary evil, but we also have to ensure that the system that allows companies like MySQL to come into existence is also protected.<br /><br />

<br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22122&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22122&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Thu, 05 Nov 2009 23:29:52 +0000</pubDate>
    <dc:creator>Tony Bain</dc:creator>
    <category>Business Intelligence</category>
    <category>Cloud Databases</category>
    <category>Database Management</category>
    <category>Microsoft SQL Server</category>
    <category>MySQL</category>
    <category>NoSQL</category>
    <category>Oracle</category>
    <category>Relational DB</category>
    <category>Web 2.0</category>
    <category>Web/Tech</category>
  </item>

  <item>
    <title>The impact from InnoDB readahead</title>
    <guid isPermaLink="false">http://www.facebook.com/note.php?note_id=175047440932</guid>
    <link>http://www.facebook.com/note.php?note_id=175047440932</link>
    <description>InnoDB uses background threads to handle readahead (prefetch) requests. Requests are generated when it detects sequential or random access to most blocks in an extent. This is described in the 5.0 MySQL manual with new behavior in the plugin.

The Facebook MySQL patch has new my.cnf variables to disable readahead: innodb_readahead_random and innodb_readahead_sequential. I don't know whether readahead is useful for my workload. This is hard to determine because SHOW STATUS counters for InnoDB readahead display the number of times the readahead generation functions are called rather than the number of readahead requests issued (fixed in the Facebook patch, fix pending in official InnoDB). There are also no counters for the number of pages fetched by readahead that are eventually used (not yet fixed in the Facebook patch).

I ran a mirror of the production workload on two test servers and disabled readahead on one of them. Based on the results below, I think we should disable readahead.

Results

The server with readahead enabled did 6% more InnoDB page read requests and 3.8% (264544) fewer synchronous page read requests. Based on these results, readahead appears to be effective as it reduces the number of synchronous page reads. Note that the server with readahead read 7444152 pages and (255016 + 428633) were from readahead.

I estimate that 39% of the readahead requests were useful. The server with readahead did (255016 + 428633) readahead requests and 264544 fewer synchronous page read requests and (264544) / (255016 + 428633) ~= 0.39.

From the data on the number of IO requests above, I think that readahead is somewhat effective. But my opinion changes when I consider the IO latency statistics. While it did fewer synchronous read requests the server with readahead enabled used 18.6% more time for them (106827 versus 90021 seconds) and the average latency to service a synchronous read was 23% higher (15.87 milliseconds versus 12.87). It also spent much more time for asynchronous read requests but that is expected given that prefetch requests use async reads.

Alas, I did not get iostat results.

The Facebook patch for MySQL displays much more data in SHOW INNODB STATUS. Some of it is included below from the test servers.

Data from SHOW INNODB STATUS for the server with readahead enabled:

Pages read 7444152, created 94756, written 2758813
Read ahead: 255016 random, 428633 sequential
Sync reads: 6730101 requests, 0 old, 16384.30 bytes/r, svc: 106827.43 secs, 15.87 msecs/r, 807.76 max msecs, wait: 106827.57 secs 15.87 msecs/r, 807.76 max msecs
Async reads: 374094 requests, 0 old, 31269.07 bytes/r, svc: 3037.55 secs, 8.12 msecs/r, 473.97 max msecs, wait: 24448.33 secs 65.35 msecs/r, 473.97 max msecs

Data from SHOW INNODB STATUS for the server with readahead disabled:

Pages read 7025361, created 89550, written 2778099
Read ahead: 0 random, 0 sequential
Sync reads: 6994645 requests, 0 old, 16384.29 bytes/r, svc: 90021.08 secs, 12.87 msecs/r, 1289.22 max msecs, wait: 90021.21 secs 12.87 msecs/r, 1289.22 max msecs
Async reads: 20932 requests, 0 old, 24078.19 bytes/r, svc: 267.37 secs, 12.77 msecs/r, 326.25 max msecs, wait: 347.28 secs 16.59 msecs/r, 326.25 max msecs
</description>
    <content:encoded><![CDATA[InnoDB uses background threads to handle readahead (prefetch) requests. Requests are generated when it detects sequential or random access to most blocks in an extent. This is described in the <a href="http://dev.mysql.com/doc/refman/5.0/en/innodb-disk-io.html">5.0 MySQL manual</a> with <a href="http://www.innodb.com/doc/innodb_plugin-1.0/innodb-performance.html#innodb-performance-read_ahead">new behavior in the plugin</a>.

The <a href="http://www.facebook.com/MySQLatFacebook">Facebook MySQL patch</a> has new my.cnf variables to disable readahead: innodb_readahead_random and innodb_readahead_sequential. I don't know whether readahead is useful for my workload. This is hard to determine because SHOW STATUS counters for InnoDB readahead display the number of times the readahead generation functions are called rather than the number of readahead requests issued (fixed in the Facebook patch, fix pending in official InnoDB). There are also no counters for the number of pages fetched by readahead that are eventually used (not yet fixed in the Facebook patch).

I ran a mirror of the production workload on two test servers and disabled readahead on one of them. Based on the results below, I think we should disable readahead.

<h1>Results</h1>

The server with readahead enabled did 6% more InnoDB page read requests and 3.8% (264544) fewer synchronous page read requests. Based on these results, readahead appears to be effective as it reduces the number of synchronous page reads. Note that the server with readahead read 7444152 pages and (255016 + 428633) were from readahead.

I estimate that 39% of the readahead requests were useful. The server with readahead did (255016 + 428633) readahead requests and 264544 fewer synchronous page read requests and (264544) / (255016 + 428633) ~= 0.39.
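
The estimate can be reproduced from the counters in the SHOW INNODB STATUS output in this post. What follows is a minimal sketch in Python, not part of the Facebook patch: the variable names are illustrative, and it assumes, as the estimate does, that every synchronous read saved is due to a readahead request.

```python
# Counters copied from the SHOW INNODB STATUS output in this post.
# The variable names are illustrative, not names used by InnoDB.
readahead_random = 255016
readahead_sequential = 428633
sync_reads_with_readahead = 6730101
sync_reads_without_readahead = 6994645

readahead_requests = readahead_random + readahead_sequential
sync_reads_saved = sync_reads_without_readahead - sync_reads_with_readahead

# Fraction of readahead requests that replaced a synchronous read,
# under the assumption stated above.
useful_fraction = sync_reads_saved / readahead_requests

print(readahead_requests)         # 683649
print(sync_reads_saved)           # 264544
print(round(useful_fraction, 2))  # 0.39
```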

From the data on the number of IO requests above, I think that readahead is somewhat effective. But my opinion changes when I consider the IO latency statistics. While it did fewer synchronous read requests the server with readahead enabled used 18.6% more time for them (106827 versus 90021 seconds) and the average latency to service a synchronous read was 23% higher (15.87 milliseconds versus 12.87). It also spent much more time for asynchronous read requests but that is expected given that prefetch requests use async reads.
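
The latency figures can be checked the same way. This is a rough sketch in Python with illustrative names, using only the "Sync reads" request counts and service times from the SHOW INNODB STATUS output in this post. With these inputs the extra service time comes out a little closer to 18.7% than to the 18.6% quoted, which looks like truncation rather than rounding.

```python
# Sync read totals copied from the SHOW INNODB STATUS output in this post.
# Each tuple is (requests, service seconds); the names are illustrative.
with_readahead = (6730101, 106827.43)
without_readahead = (6994645, 90021.08)


def avg_latency_ms(requests, service_secs):
    """Average service time per synchronous read, in milliseconds."""
    return service_secs * 1000.0 / requests


lat_on = avg_latency_ms(*with_readahead)
lat_off = avg_latency_ms(*without_readahead)

# Relative increase with readahead enabled, in percent.
extra_time_pct = (with_readahead[1] / without_readahead[1] - 1) * 100
extra_latency_pct = (lat_on / lat_off - 1) * 100

print(round(lat_on, 2), round(lat_off, 2))  # 15.87 12.87
print(round(extra_time_pct, 1))             # total sync read service time
print(round(extra_latency_pct, 1))          # average sync read latency
```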

Alas, I did not get iostat results.

The Facebook patch for MySQL displays much more data in SHOW INNODB STATUS. Some of it is included below from the test servers.

Data from SHOW INNODB STATUS for the server with readahead enabled:
<pre>
Pages read 7444152, created 94756, written 2758813
Read ahead: 255016 random, 428633 sequential
Sync reads: 6730101 requests, 0 old, 16384.30 bytes/r, svc: 106827.43 secs, 15.87 msecs/r, 807.76 max msecs, wait: 106827.57 secs 15.87 msecs/r, 807.76 max msecs
Async reads: 374094 requests, 0 old, 31269.07 bytes/r, svc: 3037.55 secs, 8.12 msecs/r, 473.97 max msecs, wait: 24448.33 secs 65.35 msecs/r, 473.97 max msecs
</pre>
Data from SHOW INNODB STATUS for the server with readahead disabled:
<pre>
Pages read 7025361, created 89550, written 2778099
Read ahead: 0 random, 0 sequential
Sync reads: 6994645 requests, 0 old, 16384.29 bytes/r, svc: 90021.08 secs, 12.87 msecs/r, 1289.22 max msecs, wait: 90021.21 secs 12.87 msecs/r, 1289.22 max msecs
Async reads: 20932 requests, 0 old, 24078.19 bytes/r, svc: 267.37 secs, 12.77 msecs/r, 326.25 max msecs, wait: 347.28 secs 16.59 msecs/r, 326.25 max msecs
</pre><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22123&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22123&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Thu, 05 Nov 2009 23:12:27 +0000</pubDate>
    <dc:creator>Mark Callaghan</dc:creator>
  </item>

  <item>
    <title>Aspects and benefits of distributed version control systems (DVCS)</title>
    <guid isPermaLink="false">http://www.lenzg.net/archives/285-guid.html</guid>
    <link>http://www.lenzg.net/archives/285-Aspects-and-benefits-of-distributed-version-control-systems-DVCS.html</link>
    <description>This blog post is a by-product of my preparation work for an upcoming talk titled &amp;quot;Why you should be using a distributed version control system (DVCS) for your project&amp;quot; at SAPO Codebits in Lisbon (December 3-5, 2009). Publishing these thoughts prior to the conference serves two purposes: getting some peer review on my findings and acting as a teaser for the actual talk. So please let me know &amp;mdash; did I cover the relevant aspects or did I miss anything? What's your take on DVCS vs. the centralized approach? Why do you prefer one over the other? I'm looking forward to your comments!
Even though several distributed alternatives have been available for some years now (with Bazaar, git and Mercurial being the most prominent representatives here), many large and popular Open Source projects still use centralized systems like Subversion or even CVS to maintain their source code. While Subversion has eased some of the pains of CVS (e.g. better remote access, renaming/moving of files and directories, easy branching), the centralized approach by itself poses some disadvantages compared to distributed systems. So what are these? Let me give you a few examples of the limitations that a centralized system like Subversion has and how these affect the possible workflows and development practices. Continue reading &quot;Aspects and benefits of distributed version control systems (DVCS)&quot;</description>
    <content:encoded><![CDATA[<p>This blog post is a by-product of my preparation work for an upcoming talk titled &quot;<a target="_blank" href="http://codebits.eu/s/blog/45921f2141d0d5cae4925cd863153d1d">Why you should be using a distributed version control system (DVCS) for your project</a>&quot; at <a target="_blank" href="http://codebits.eu/">SAPO Codebits</a> in Lisbon (December 3-5, 2009). Publishing these thoughts prior to the conference serves two purposes: getting some peer review on my findings and acting as a teaser for the actual talk. So please let me know &mdash; did I cover the relevant aspects or did I miss anything? What's your take on DVCS vs. the centralized approach? Why do you prefer one over the other? I'm looking forward to your comments!</p>
<p>Even though several distributed alternatives have been available for some years now (with <a target="_blank" href="http://bazaar-vcs.org/">Bazaar</a>, <a href="http://git-scm.com/">git</a> and <a href="http://mercurial.selenic.com/">Mercurial</a> being the most prominent representatives here), many large and popular Open Source projects still use centralized systems like <a href="http://subversion.tigris.org/">Subversion</a> or even <a href="http://www.nongnu.org/cvs/">CVS</a> to maintain their source code. While Subversion has eased some of the pains of CVS (e.g. better remote access, renaming/moving of files and directories, easy branching), the centralized approach by itself poses some disadvantages compared to distributed systems. So what are these? Let me give you a few examples of the limitations that a centralized system like Subversion has and how these affect the possible workflows and development practices.</p> <br /><a href="http://www.lenzg.net/archives/285-Aspects-and-benefits-of-distributed-version-control-systems-DVCS.html#extended">Continue reading "Aspects and benefits of distributed version control systems (DVCS)"</a><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22120&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22120&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Thu, 05 Nov 2009 21:49:19 +0000</pubDate>
    <dc:creator>Lenz Grimmer</dc:creator>
    <category>Linux</category>
    <category>mylvmbackup</category>
    <category>MySQL</category>
    <category>OSS</category>
    <category>bzr</category>
    <category>code</category>
    <category>codebits</category>
    <category>collaborating</category>
    <category>community</category>
    <category>contributing</category>
    <category>development</category>
    <category>drupal</category>
    <category>git</category>
    <category>mercurial</category>
    <category>mysql</category>
    <category>oss</category>
    <category>programming</category>
    <category>scm</category>
    <category>social</category>
    <category>subversion</category>
  </item>

  <item>
    <title>Back from Hiatus - Summary Update 1</title>
    <guid isPermaLink="false">http://blog.tonybain.com/tony_bain/2009/11/back-from-hiatus-summary-update-1.html</guid>
    <link>http://blog.tonybain.com/tony_bain/2009/11/back-from-hiatus-summary-update-1.html</link>
    <description>Here is a summary of the key discussions I have had over the last month. Keep in mind, I’m no analyst. This is largely opinion based on various conversations I have had with the relevant companies (for analyst insight see Curt Monash).

Kickfire
I think Kickfire has been doing it a little tough lately. The difficulties of a startup launching a hardware appliance (and the associated logistics), combined with being too focused on the MySQL customer base, have impacted the growth of this interesting startup. But they aren’t taking it lying down: they have adjusted the strategy and added a new appliance to the range. Kickfire now seems to have a stronger focus on the enterprise and has released a larger version of its appliance to provide a growth path. As I have said all along, the MySQL aspect of their product is interesting, but the solution as a whole is much more interesting and has much broader appeal than just the current MySQL customer base.

Flipping hardware appliances is a much tougher play than software-only solutions, partly because it is much more difficult for customers to get their hands on your stuff and have a play before they buy. Hopefully Kickfire has mitigated most of these issues now through their online, on-demand evaluation host. I haven’t yet played with this but it is on my list of things to do over the coming month.

Kickfire’s enterprise strategy is just one of many that will be reinforced by an Oracle acquisition of Sun.

Greenplum
Greenplum has addressed a perceived chink in its armour with the release of its column store capability. Greenplum has taken the popular hybrid approach, which means that on a case-by-case basis you can decide if a particular table should be row or column orientated. But as Daniel points out, it is a storage-level-only solution. The storage-only approach brings just part of the benefit of columnar stores; to achieve the full benefit, the query execution engine needs to be aware of this layout (so features such as lightweight compression can be effectively used). But I am sure this is an area where Greenplum will make further improvements in the future.

Groovy
Groovy has been working hard carving out its niche in the real-time web data market. If you don’t recall, Groovy makes an in-memory RDBMS that has been extended to provide real-time data streaming capabilities. Groovy has been positioning this with the large web properties that are working on creating new large-scale, real-time applications for their user base.

Aster Data
Aster has put out a number of announcements over the last month and I am trying to keep up. Firstly they announced their tight integration with Hadoop. This integration with Hadoop is map-reduce on the outside of the Aster Data platform (which apparently they didn’t have already, although I think everyone assumed they did given their strong in-database map-reduce message). Aster has been banging the map-reduce drum for some time and it is clearly the point of difference they are focusing on.

Aster also released version 4.0 of their platform a couple of days ago, and I was a bit surprised to see an email from them referring to their platform as “the World&amp;#39;s First Massively Parallel Data-Application Server”. This seems to be a new name for the in-database map-reduce stuff; maybe, in an effort to differentiate themselves from the myriad of competitors in this space, they are trying to carve out a new category all for themselves. For me, the external map-reduce stuff makes sense, as I can see this being useful for data preparation on the way in to Aster and dissemination of data on its way out of Aster. But I still don’t have clear examples in my head of when their in-database map-reduce stuff is useful. I am sure it is, but I have a feeling it is valuable on a case-by-case basis, which is difficult to articulate, especially as a point-of-difference message. But I missed Curt’s map-reduce webinar (at the last minute) so maybe that would have shed some light. Anyway, they are running a webinar on this which you can register for here.

To me, Aster is more aggressively driving their platform into green fields, trying to leverage their technology to find new customers and new markets. Greenplum on the other hand is more ‘steady as she goes’, focusing on a more traditional and conservative enterprise data warehousing market (while still innovating ahead of the general-purpose behemoths). The risks are on both sides. When trying to define a new market you risk not finding one, or finding one that is too small or “niche” to support your business. With the conservative approach you risk being lumped in with everyone else, and in data warehousing ‘everyone else’ is now quite a long list. &amp;gt;&amp;gt; part 2

</description>
    <content:encoded><![CDATA[Here is a summary of the key discussions I have had over the last month. Keep in mind, I’m no analyst. This is largely opinion based on various conversations I have had with the relevant companies (for analyst insight see <a href="http://www.dbms2.com/" target="_blank">Curt Monash</a>).<br /><br />
<h2>Kickfire</h2>I think Kickfire has been doing it a little tough lately. The difficulties of a startup launching a hardware appliance (and the associated logistics), combined with being too focused on the MySQL customer base, have impacted the growth of this interesting startup. But they aren’t taking it lying down: they have adjusted the strategy and <a href="http://www.kickfire.com/images/press_releases/Kickfire_3000_Series_Product_Release.pdf" target="_blank">added a new appliance to the range</a>. Kickfire now seems to have a stronger focus on <a href="http://www.kickfire.com/blog/?p=469">the enterprise</a> and has released a larger version of its appliance to provide a growth path. As I have said all along, the MySQL aspect of their product is interesting, but the solution as a whole is much more interesting and has much broader appeal than just the current MySQL customer base.<br /><br />
Flipping hardware appliances is a much tougher play than software-only solutions, partly because it is much more difficult for customers to get their hands on your stuff and have a play before they buy. Hopefully Kickfire has mitigated most of these issues now through their online, on-demand evaluation host. I haven’t yet played with this but it is on my list of things to do over the coming month.<br /><br />
Kickfire’s enterprise strategy is just one of many that will be reinforced by an Oracle acquisition of Sun.<br /><br />
<h2>Greenplum</h2>Greenplum has addressed a perceived chink in its armour with the release of its <a href="http://www.greenplum.com/news/248/231/Beyond-Rows-and-Columns-Greenplum-s-Polymorphic-Data-Storage----Part-1/d,blog/" target="_blank">column store capability</a>. Greenplum has taken the popular hybrid approach, which means that on a case-by-case basis you can decide if a particular table should be row or column orientated. But as <a href="http://dbmsmusings.blogspot.com/2009/10/greenplum-announces-column-oriented.html" target="_blank">Daniel points out</a>, it is a storage-level-only solution. The storage-only approach brings just part of the benefit of columnar stores; to achieve the full benefit, the query execution engine needs to be aware of this layout (so features such as lightweight compression can be effectively used). But I am sure this is an area where Greenplum will make further improvements in the future.<br /><br />
<h2>Groovy</h2>Groovy has been working hard carving out its niche in the real-time web data market. If you don’t recall, Groovy makes an in-memory RDBMS that has been extended to provide real-time data streaming capabilities. Groovy has been positioning this with the large web properties that are working on creating new large-scale, real-time applications for their user base.<br /><br />
<h2>Aster Data</h2>Aster has put out a number of announcements over the last month and I am trying to keep up. Firstly they announced their tight integration with Hadoop. This integration with Hadoop is map-reduce on the outside of the Aster Data platform (which apparently they didn’t have already, although I think everyone assumed they did given their strong in-database map-reduce message). Aster has been banging the map-reduce drum for some time and it is clearly the point of difference they are focusing on.<br /><br />
Aster also released version 4.0 of their platform a couple of days ago, and I was a bit surprised to see an email from them referring to their platform as “<a href="http://www.asterdata.com/news/091102-Aster-Data-4.0-Massively-Parallel-Data-Application-Server.php" target="_blank">the World&#39;s First Massively Parallel Data-Application Server</a>”. This seems to be a new name for the in-database map-reduce stuff; maybe, in an effort to differentiate themselves from the myriad of competitors in this space, they are trying to carve out a new category all for themselves. For me, the external map-reduce stuff makes sense, as I can see this being useful for data preparation on the way in to Aster and dissemination of data on its way out of Aster. But I still don’t have clear examples in my head of when their in-database map-reduce stuff is useful. I am sure it is, but I have a feeling it is valuable on a case-by-case basis, which is difficult to articulate, especially as a point-of-difference message. But I missed Curt’s map-reduce webinar (at the last minute) so maybe that would have shed some light. Anyway, they are running a webinar on this which you can <a href="http://www.asterdata.com/wp_Aster_Data_4.0_Applications_Within/" target="_blank">register for here</a>.<br /><br />
<p>To me, Aster is more aggressively driving their platform into green fields, trying to leverage their technology to find new customers and new markets. Greenplum on the other hand is more ‘steady as she goes’, focusing on a more traditional and conservative enterprise data warehousing market (while still innovating ahead of the general-purpose behemoths). The risks are on both sides. When trying to define a new market you risk not finding one, or finding one that is too small or “niche” to support your business. With the conservative approach you risk being lumped in with everyone else, and in data warehousing ‘everyone else’ is now quite a long list.</p><p><a href="http://blog.tonybain.com/tony_bain/2009/11/back-from-hiatus-summary-update-2.html">&gt;&gt; part 2</a></p>

<br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22119&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22119&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Thu, 05 Nov 2009 19:58:21 +0000</pubDate>
    <dc:creator>Tony Bain</dc:creator>
    <category>Business Intelligence</category>
    <category>Cloud Databases</category>
    <category>Data Integration</category>
    <category>Database Management</category>
    <category>Microsoft SQL Server</category>
    <category>MySQL</category>
    <category>Oracle</category>
    <category>Relational DB</category>
    <category>Web 2.0</category>
  </item>

  <item>
    <title>InfiniDB parallel processing of airline on-time data.</title>
    <guid isPermaLink="false">http://infinidb.org/infinidb-blog/infinidb-parallel-processing-of-airline-on-time-data.html</guid>
    <link>http://infinidb.org/infinidb-blog/infinidb-parallel-processing-of-airline-on-time-data.html</link>
    <description>With many thanks to Vadim at Percona for his analysis of the capabilities of different columnar DBMSs. Definitely good information, and well documented. Queries and results available at: http://www.mysqlperformanceblog.com/2009/11/02/air-traffic-queries-in-infinidb-early-alpha/ Of course, InfiniDB does offer multi-threaded processing for all offerings and distributed prRead More...</description>
    <content:encoded><![CDATA[<p>With many thanks to Vadim at Percona for his analysis of the capabilities of different columnar DBMSs. Definitely good information, and well documented. Queries and results available at:</p><br/><p>http://www.mysqlperformanceblog.com/2009/11/02/air-traffic-queries-in-infinidb-early-alpha/</p><br/><p>Of course, InfiniDB does offer multi-threaded processing for all offerings and distributed prRead More...</p><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22121&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22121&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Thu, 05 Nov 2009 19:55:24 +0000</pubDate>
    <dc:creator>Jim Tommaney</dc:creator>
    <category>scalability</category>
    <category>Performance</category>
    <category>multi-threaded</category>
  </item>

  <item>
    <title>InnoDB: look after fragmentation</title>
    <guid isPermaLink="false">http://www.mysqlperformanceblog.com/?p=1616</guid>
    <link>http://www.mysqlperformanceblog.com/2009/11/05/innodb-look-after-fragmentation/</link>
    <description>One problem puzzled me for a couple of hours, but it was really interesting to figure out what was going on.
So let me introduce the problem first. The table is:

CREATE TABLE `c` (
  `tracker_id` int(10) unsigned NOT NULL,
  `username` char(20) character set latin1 collate latin1_bin NOT NULL,
  `time_id` date NOT NULL,
  `block_id` int(10) unsigned default NULL,
  PRIMARY KEY  (`tracker_id`,`username`,`time_id`),
  KEY `block_id` (`block_id`)
) ENGINE=InnoDB

The table has 11,864,696 rows and its Data_length is 698,351,616 bytes on disk.
The problem is that after restoring the table from mysqldump, the query that scans data by primary key was slow. How slow? Let me show.
The query in question is (Q1):
SELECT count(distinct username) FROM tracker where TIME_ID &gt;= '2009-07-20 00:00:00' AND TIME_ID &lt;= '2009-10-21 00:00:00' AND (tracker_id=437)</description>
    <content:encoded><![CDATA[<p>One problem puzzled me for a couple of hours, but it was really interesting to figure out what was going on.</p>
<p>So let me introduce the problem first. The table is:</p>
<div><span>CODE:</span>
<div>
<div>
<ol>
<li>
<div>CREATE TABLE `c` <span>&#40;</span></div>
</li>
<li>
<div>&nbsp; `tracker_id` int<span>&#40;</span><span>10</span><span>&#41;</span> unsigned NOT NULL,</div>
</li>
<li>
<div>&nbsp; `username` char<span>&#40;</span><span>20</span><span>&#41;</span> character set latin1 collate latin1_bin NOT NULL,</div>
</li>
<li>
<div>&nbsp; `time_id` date NOT NULL,</div>
</li>
<li>
<div>&nbsp; `block_id` int<span>&#40;</span><span>10</span><span>&#41;</span> unsigned default NULL,</div>
</li>
<li>
<div>&nbsp; PRIMARY KEY&nbsp; <span>&#40;</span>`tracker_id`,`username`,`time_id`<span>&#41;</span>,</div>
</li>
<li>
<div>&nbsp; KEY `block_id` <span>&#40;</span>`block_id`<span>&#41;</span></div>
</li>
<li>
<div><span>&#41;</span> ENGINE=InnoDB </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>The table has 11,864,696 rows and its Data_length is 698,351,616 bytes on disk.</p>
<p>The problem is that after restoring the table from mysqldump, the query that scans data by primary key was slow. How slow? Let me show.</p>
<p>The query in question is (Q1):</p>
<p><code>SELECT count(distinct username) FROM tracker where TIME_ID &gt;= '2009-07-20 00:00:00' AND TIME_ID &lt;= '2009-10-21 00:00:00' AND (tracker_id=437)<br />
</code></p>
<p>On a cold buffer_pool, it took:</p>
<div><span>CODE:</span>
<div>
<div>
<ol>
<li>
<div>+---------------------------+</div>
</li>
<li>
<div>| count<span>&#40;</span>distinct username<span>&#41;</span> |</div>
</li>
<li>
<div>+---------------------------+</div>
</li>
<li>
<div>|&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span>5856156</span> | </div>
</li>
<li>
<div>+---------------------------+</div>
</li>
<li>
<div><span>1</span> row in set <span>&#40;</span><span>4</span> min <span>13</span>.<span>61</span> sec<span>&#41;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>However, this query (Q2), again on a cold buffer_pool:</p>
<p><code>SELECT count(distinct username) FROM tracker where TIME_ID &gt;= '2009-07-20 00:00:00' AND TIME_ID &lt;= '2009-10-21 00:00:00'<br />
</code></p>
<div><span>CODE:</span>
<div>
<div>
<ol>
<li>
<div>+---------------------------+</div>
</li>
<li>
<div>| count<span>&#40;</span>distinct username<span>&#41;</span> |</div>
</li>
<li>
<div>+---------------------------+</div>
</li>
<li>
<div>|&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span>5903053</span> | </div>
</li>
<li>
<div>+---------------------------+</div>
</li>
<li>
<div><span>1</span> row in set <span>&#40;</span><span>18</span>.<span>81</span> sec<span>&#41;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>The difference is impressive: <strong>4 min 13.61 sec</strong> vs <strong>18.81 sec</strong>.</p>
<p>If you want the EXPLAIN plans, here they are.</p>
<p>For Q1:</p>
<div><span>CODE:</span>
<div>
<div>
<ol>
<li>
<div>+----+-------------+-------------------------+------+---------------+---------+---------+-------+---------+--------------------------+</div>
</li>
<li>
<div>| id | select_type | table&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;| type | possible_keys | key&nbsp; &nbsp; &nbsp;| key_len | ref&nbsp; &nbsp;| rows&nbsp; &nbsp; | Extra&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; |</div>
</li>
<li>
<div>+----+-------------+-------------------------+------+---------------+---------+---------+-------+---------+--------------------------+</div>
</li>
<li>
<div>|&nbsp; <span>1</span> | SIMPLE&nbsp; &nbsp; &nbsp; | tracker&nbsp; | ref&nbsp; | PRIMARY&nbsp; &nbsp; &nbsp; &nbsp;| PRIMARY | <span>4</span>&nbsp; &nbsp; &nbsp; &nbsp;| const | <span>6880241</span> | Using where; Using index | </div>
</li>
<li>
<div>+----+-------------+-------------------------+------+---------------+---------+---------+-------+---------+--------------------------+</div>
</li>
<li>
<div><span>1</span> row in set <span>&#40;</span><span>0</span>.<span>02</span> sec<span>&#41;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>For Q2:</p>
<div><span>CODE:</span>
<div>
<div>
<ol>
<li>
<div>+----+-------------+-------------------------+-------+---------------+-------------------------------------+---------+------+----------+--------------------------+</div>
</li>
<li>
<div>| id | select_type | table&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;| type&nbsp; | possible_keys | key&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;| key_len | ref&nbsp; | rows&nbsp; &nbsp; &nbsp;| Extra&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; |</div>
</li>
<li>
<div>+----+-------------+-------------------------+-------+---------------+-------------------------------------+---------+------+----------+--------------------------+</div>
</li>
<li>
<div>|&nbsp; <span>1</span> | SIMPLE&nbsp; &nbsp; &nbsp; | tracker | index | NULL&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; | block_id | <span>5</span>&nbsp; &nbsp; &nbsp; &nbsp;| NULL | <span>13760483</span> | Using where; Using index | </div>
</li>
<li>
<div>+----+-------------+-------------------------+-------+---------------+-------------------------------------+---------+------+----------+--------------------------+ </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>Query Q1 is executed using the primary key, and query Q2 uses the block_id key.</p>
<p>To get more details, I ran both queries with our extended stats in slow.log (available in 5.0-percona releases).</p>
<p>For query Q1:</p>
<div><span>CODE:</span>
<div>
<div>
<ol>
<li>
<div># Query_time: <span>253</span>.<span>643162</span>&nbsp; Lock_time: <span>0</span>.<span>000137</span>&nbsp; Rows_sent: <span>1</span>&nbsp; Rows_examined: <span>11569733</span>&nbsp; Rows_affected: <span>0</span>&nbsp; Rows_read: <span>11569733</span></div>
</li>
<li>
<div>#&nbsp; &nbsp;InnoDB_IO_r_ops: <span>73916</span>&nbsp; InnoDB_IO_r_bytes: <span>1211039744</span>&nbsp; InnoDB_IO_r_wait: <span>236</span>.<span>149003</span></div>
</li>
<li>
<div>#&nbsp; &nbsp;InnoDB_rec_lock_wait: <span>0</span>.<span>000000</span>&nbsp; InnoDB_queue_wait: <span>0</span>.<span>000000</span></div>
</li>
<li>
<div>#&nbsp; &nbsp;InnoDB_pages_distinct: <span>54838</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>And for query Q2:</p>
<div><span>CODE:</span>
<div>
<div>
<ol>
<li>
<div># Query_time: <span>18</span>.<span>846855</span>&nbsp; Lock_time: <span>0</span>.<span>000123</span>&nbsp; Rows_sent: <span>1</span>&nbsp; Rows_examined: <span>11864696</span>&nbsp; Rows_affected: <span>0</span>&nbsp; Rows_read: <span>11864696</span></div>
</li>
<li>
<div>#&nbsp; &nbsp;InnoDB_IO_r_ops: <span>27510</span>&nbsp; InnoDB_IO_r_bytes: <span>450723840</span>&nbsp; InnoDB_IO_r_wait: <span>0</span>.<span>165124</span></div>
</li>
<li>
<div>#&nbsp; &nbsp;InnoDB_rec_lock_wait: <span>0</span>.<span>000000</span>&nbsp; InnoDB_queue_wait: <span>0</span>.<span>000000</span></div>
</li>
<li>
<div>#&nbsp; &nbsp;InnoDB_pages_distinct: <span>24687</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>As you can see, for Q1 the IO reads took <strong>236.149003 sec</strong> vs <strong>0.165124 sec</strong> for Q2. But Q1 is a scan by primary key, which is supposed to be sequential!</p>
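<p>To put those two wait figures on a per-read basis, here is a small back-of-the-envelope sketch in Python 3. It is not part of the original analysis, and the helper name <code>per_read_stats</code> is made up for illustration. It only divides the slow.log numbers quoted above, and it assumes that every <code>InnoDB_IO_r_ops</code> entry is a single read request:</p>

```python
def per_read_stats(r_ops, r_bytes, r_wait_sec):
    """Return (bytes per read, average wait per read in milliseconds)."""
    bytes_per_read = r_bytes / r_ops
    avg_wait_ms = r_wait_sec / r_ops * 1000.0
    return bytes_per_read, avg_wait_ms

# Figures copied from the extended slow.log output above.
q1_bytes, q1_wait_ms = per_read_stats(73916, 1211039744, 236.149003)
q2_bytes, q2_wait_ms = per_read_stats(27510, 450723840, 0.165124)

print("Q1: %.0f bytes/read, %.3f ms/read" % (q1_bytes, q1_wait_ms))
print("Q2: %.0f bytes/read, %.3f ms/read" % (q2_bytes, q2_wait_ms))
```

<p>Both queries come out at 16384 bytes per read, i.e. one 16 KB InnoDB page at a time. What differs is the wait: roughly 3.2 ms per read for Q1, against about 0.006 ms for Q2.</p>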
<p>Let's look at another statistic, which is available in the innodb_check_fragmentation patch.</p>
<p>For Q1:</p>
<div><span>CODE:</span>
<div>
<div>
<ol>
<li>
<div>SHOW STATUS LIKE <span>'Innodb_scan_pages%'</span>;</div>
</li>
<li>
<div>+------------------------------+-------+</div>
</li>
<li>
<div>| Variable_name&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; | Value |</div>
</li>
<li>
<div>+------------------------------+-------+</div>
</li>
<li>
<div>| Innodb_scan_pages_contiguous | <span>88</span>&nbsp; &nbsp; | </div>
</li>
<li>
<div>| Innodb_scan_pages_jumpy&nbsp; &nbsp; &nbsp; | <span>73789</span> | </div>
</li>
<li>
<div>+------------------------------+-------+</div>
</li>
<li>
<div><span>2</span> rows in set <span>&#40;</span><span>0</span>.<span>00</span> sec<span>&#41;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>and for Q2:</p>
<div><span>CODE:</span>
<div>
<div>
<ol>
<li>
<div>mysql&gt; SHOW STATUS LIKE <span>'Innodb_scan_pages%'</span>;&nbsp; &nbsp; &nbsp; &nbsp; </div>
</li>
<li>
<div>+------------------------------+-------+</div>
</li>
<li>
<div>| Variable_name&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; | Value |</div>
</li>
<li>
<div>+------------------------------+-------+</div>
</li>
<li>
<div>| Innodb_scan_pages_contiguous | <span>26959</span> | </div>
</li>
<li>
<div>| Innodb_scan_pages_jumpy&nbsp; &nbsp; &nbsp; | <span>442</span>&nbsp; &nbsp;| </div>
</li>
<li>
<div>+------------------------------+-------+</div>
</li>
<li>
<div><span>2</span> rows in set <span>&#40;</span><span>0</span>.<span>00</span> sec<span>&#41;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>So you can see that for Q1 it was not a sequential scan, even though it goes by primary key, while it is sequential for Q2.</p>
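<p>One way to condense the two counters into a single number is the share of scanned pages that were "jumpy". The following Python 3 sketch is only an illustration: the function name <code>jumpy_share</code> is hypothetical and is not something the patch provides. It uses the counter values shown above:</p>

```python
def jumpy_share(contiguous, jumpy):
    """Fraction of scanned pages that were reached by a non-contiguous jump."""
    total = contiguous + jumpy
    if total == 0:
        return 0.0
    return jumpy / total

# Counter values from the SHOW STATUS output above.
q1_share = jumpy_share(88, 73789)     # Q1, scan by primary key
q2_share = jumpy_share(26959, 442)    # Q2, scan by block_id key

print("Q1 jumpy share: %.1f%%" % (q1_share * 100))
print("Q2 jumpy share: %.1f%%" % (q2_share * 100))
```

<p>With these numbers, about 99.9% of the pages in the Q1 scan were jumpy, compared with about 1.6% for Q2.</p>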
<p>So what's the answer? It's <strong>fragmentation</strong> of the primary key (and of the whole data table, since in InnoDB the data is stored in the primary key). But how could that happen to the primary key after a mysqldump? The answer becomes clear if we look at:</p>
<p>EXPLAIN SELECT * FROM tracker;</p>
<div><span>CODE:</span>
<div>
<div>
<ol>
<li>
<div>+----+-------------+-------------------------+-------+---------------+-------------------------------------+---------+------+----------+-------------+</div>
</li>
<li>
<div>| id | select_type | table&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;| type&nbsp; | possible_keys | key&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;| key_len | ref&nbsp; | rows&nbsp; &nbsp; &nbsp;| Extra&nbsp; &nbsp; &nbsp; &nbsp;|</div>
</li>
<li>
<div>+----+-------------+-------------------------+-------+---------------+-------------------------------------+---------+------+----------+-------------+</div>
</li>
<li>
<div>|&nbsp; <span>1</span> | SIMPLE&nbsp; &nbsp; &nbsp; | tracker | index | NULL&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; | block_id | <span>5</span>&nbsp; &nbsp; &nbsp; &nbsp;| NULL | <span>13760483</span> | Using index | </div>
</li>
<li>
<div>+----+-------------+-------------------------+-------+---------------+-------------------------------------+---------+------+----------+-------------+</div>
</li>
<li>
<div><span>1</span> row in set <span>&#40;</span><span>0</span>.<span>00</span> sec<span>&#41;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>We see that the dump is taken in key "block_id" order, not in primary key order. Later, when we load this table, INSERTs into the primary key happen in random order, and that gives us the fragmentation we see here.</p>
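<p>The effect can be reproduced with a toy model. The Python 3 sketch below is a simplification and not how mysqldump or InnoDB work internally: plain integers stand in for primary key values, each row gets a random, unique <code>block_id</code>, and we count how often the next row in the "dump" has a larger primary key than the previous one. All names are made up for illustration.</p>

```python
import random

def ascending_share(pk_sequence):
    """Share of adjacent pairs whose primary key increases."""
    pairs = list(zip(pk_sequence, pk_sequence[1:]))
    ascending = sum(1 for prev, cur in pairs if cur > prev)
    return ascending / len(pairs)

rng = random.Random(0)                      # fixed seed so the run is repeatable
n = 1000
pks = list(range(n))                        # stand-in primary key values
block_ids = rng.sample(range(10 * n), n)    # unique random secondary key values

# Row order of a dump taken in primary key order vs. in block_id order.
pk_order = sorted(pks)
block_order = sorted(pks, key=lambda pk: block_ids[pk])

print("primary key order:", ascending_share(pk_order))
print("block_id order:   ", round(ascending_share(block_order), 2))
```

<p>In primary key order every row lands right after the previous one, so the share is 1.0. In block_id order only about half of the rows do, which means the inserts hit the primary key at effectively random positions.</p>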
<p>How do we fix it in our case? It's easy: <code>ALTER TABLE tracker ENGINE=InnoDB</code> will force InnoDB to rebuild the table in primary key order.</p>
<p>After that, Q1:</p>
<div><span>CODE:</span>
<div>
<div>
<ol>
<li>
<div>+---------------------------+</div>
</li>
<li>
<div>| count<span>&#40;</span>distinct username<span>&#41;</span> |</div>
</li>
<li>
<div>+---------------------------+</div>
</li>
<li>
<div>|&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span>5856156</span> | </div>
</li>
<li>
<div>+---------------------------+</div>
</li>
<li>
<div><span>1</span> row in set <span>&#40;</span><span>17</span>.<span>72</span> sec<span>&#41;</span></div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div>mysql&gt; SHOW STATUS LIKE <span>'Innodb_scan_pages%'</span>;</div>
</li>
<li>
<div>+------------------------------+-------+</div>
</li>
<li>
<div>| Variable_name&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; | Value |</div>
</li>
<li>
<div>+------------------------------+-------+</div>
</li>
<li>
<div>| Innodb_scan_pages_contiguous | <span>37864</span> | </div>
</li>
<li>
<div>| Innodb_scan_pages_jumpy&nbsp; &nbsp; &nbsp; | <span>574</span>&nbsp; &nbsp;| </div>
</li>
<li>
<div>+------------------------------+-------+</div>
</li>
<li>
<div><span>2</span> rows in set <span>&#40;</span><span>0</span>.<span>00</span> sec<span>&#41;</span></div>
</li>
<li>
<div>&nbsp;</div>
</li>
<li>
<div>and extended stats:</div>
</li>
<li>
<div># Query_time: <span>17</span>.<span>765369</span>&nbsp; Lock_time: <span>0</span>.<span>000137</span>&nbsp; Rows_sent: <span>1</span>&nbsp; Rows_examined: <span>11569733</span>&nbsp; Rows_affected: <span>0</span>&nbsp; Rows_read: <span>11569733</span></div>
</li>
<li>
<div>#&nbsp; &nbsp;InnoDB_IO_r_ops: <span>38530</span>&nbsp; InnoDB_IO_r_bytes: <span>631275520</span>&nbsp; InnoDB_IO_r_wait: <span>0</span>.<span>204893</span></div>
</li>
<li>
<div>#&nbsp; &nbsp;InnoDB_rec_lock_wait: <span>0</span>.<span>000000</span>&nbsp; InnoDB_queue_wait: <span>0</span>.<span>000000</span></div>
</li>
<li>
<div>#&nbsp; &nbsp;InnoDB_pages_distinct: <span>35584</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>You can see that the time dropped back to a reasonable <strong>17.72 sec</strong>.</p>
<p>You may ask what happens now with Q2. Yes, it gets slow, because the rebuild inserted the "block_id" key entries out of order.</p>
<div><span>CODE:</span>
<div>
<div>
<ol>
<li>
<div>+---------------------------+</div>
</li>
<li>
<div>| count<span>&#40;</span>distinct username<span>&#41;</span> |</div>
</li>
<li>
<div>+---------------------------+</div>
</li>
<li>
<div>|&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span>5903053</span> |</div>
</li>
<li>
<div>+---------------------------+</div>
</li>
<li>
<div><span>1</span> row in set <span>&#40;</span><span>2</span> min <span>8</span>.<span>92</span> sec<span>&#41;</span></div>
</li>
<li>
<div>mysql&gt; SHOW STATUS LIKE <span>'Innodb_scan_pages%'</span>;</div>
</li>
<li>
<div>+------------------------------+-------+</div>
</li>
<li>
<div>| Variable_name&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; | Value |</div>
</li>
<li>
<div>+------------------------------+-------+</div>
</li>
<li>
<div>| Innodb_scan_pages_contiguous | <span>45</span>&nbsp; &nbsp; | </div>
</li>
<li>
<div>| Innodb_scan_pages_jumpy&nbsp; &nbsp; &nbsp; | <span>35904</span> | </div>
</li>
<li>
<div>+------------------------------+-------+</div>
</li>
<li>
<div><span>2</span> rows in set <span>&#40;</span><span>0</span>.<span>00</span> sec<span>&#41;</span> </div>
</li>
</ol>
</div>
</div>
</div>
<p></p>
<p>As for mysqldump, you can use the <code>--order-by-primary</code> option to force the dump in primary key order.</p>
<p>So, notes to <strong>highlight</strong>:</p>
<ul>
<li>InnoDB fragmentation may hurt your query significantly, especially when the data is not in the buffer_pool and execution has to read from disk.</li>
<li>Fragmentation by secondary key is much more likely than by primary key, and you cannot really control it (though it is possible in XtraDB / InnoDB-plugin with FAST INDEX creation), so be careful with queries that scan many records by secondary key.</li>
<li>To check whether your query is affected by fragmentation, you can use the <code>Innodb_scan_pages_contiguous ; Innodb_scan_pages_jumpy</code> statistics in 5.0-percona releases (coming to 5.1-XtraDB soon).</li>
</ul>
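<p>As a rough illustration of the last point, the two counters can be folded into a single "jumpy fraction". This is only a sketch in Python, not part of any Percona tooling: the helper name <code>jumpy_fraction</code> and the row format (name/value pairs, as a client library might return them for <code>SHOW STATUS LIKE 'Innodb_scan_pages%'</code>) are assumptions, and the sample values are the ones shown for Q2.</p>

```python
def jumpy_fraction(status_rows):
    """Return the share of scanned pages that were read out of order.

    status_rows: iterable of (variable_name, value) pairs, e.g. the rows
    returned by SHOW STATUS LIKE 'Innodb_scan_pages%'.
    Hypothetical helper, written for illustration only.
    """
    counters = {name: int(value) for name, value in status_rows}
    contiguous = counters.get("Innodb_scan_pages_contiguous", 0)
    jumpy = counters.get("Innodb_scan_pages_jumpy", 0)
    total = contiguous + jumpy
    if total == 0:
        return 0.0  # nothing scanned yet, so nothing to report
    return jumpy / total


# Counter values taken from the Q2 run in this post.
rows = [("Innodb_scan_pages_contiguous", "45"),
        ("Innodb_scan_pages_jumpy", "35904")]
print(f"jumpy fraction: {jumpy_fraction(rows):.3f}")
```

<p>A fraction close to 1 suggests the scan is mostly jumping between non-adjacent pages; what threshold is worth acting on depends on your workload.</p>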
    <hr noshade style="margin:0;height:1px" />
    <p>Entry posted by Vadim |
      <a href="http://www.mysqlperformanceblog.com/2009/11/05/innodb-look-after-fragmentation/#comments">3 comments</a></p>
    <br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22118&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22118&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Thu, 05 Nov 2009 19:01:54 +0000</pubDate>
    <dc:creator>MySQL Performance Blog</dc:creator>
    <category>Innodb</category>
    <category>mysql</category>
  </item>

  <item>
    <title>Enterprise2.0 Conference</title>
    <guid isPermaLink="false">http://blogs.sun.com/shanti/entry/enterprise2_0_conference</guid>
    <link>http://blogs.sun.com/shanti/entry/enterprise2_0_conference</link>
    <description>Notes from Enterprise2.0 Conference</description>
    <content:encoded><![CDATA[<a href="http://perfwork.wordpress.com/2009/11/04/86/" title="Notes from Enterprise2.0 Conference">Notes from Enterprise2.0 Conference<br /></a><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22117&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22117&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Thu, 05 Nov 2009 18:00:20 +0000</pubDate>
    <dc:creator>Shanti Subramanyam</dc:creator>
    <category>Personal</category>
    <category>e2.0</category>
    <category>e2conf</category>
  </item>

  <item>
    <title>InnoDB plugin gets better again</title>
    <guid isPermaLink="false">tag:blogger.com,1999:blog-5915567578707286635.post-6617923741002050480</guid>
    <link>http://mysqlha.blogspot.com/2009/11/innodb-plugin-gets-better-again.html</link>
    <description>Forgive me for being a shill, but InnoDB appears to have added a feature for the next release of the InnoDB plugin that prevents the buffer pool from getting wiped out by a full table scan. Many people have requested this. The documentation is excellent. I have tested it and not only did it work as advertised, but it didn't degrade performance on OLTP workloads. This fixes bug 45015 and is a nice feature to have when you occasionally use mysqldump to copy a table from a busy OLTP server. Now is a good time to evaluate MySQL 5.1 with the InnoDB plugin.</description>
    <content:encoded><![CDATA[Forgive me for being a shill, but InnoDB <a href="http://dev.mysql.com/doc/refman/5.1/en/innodb-buffer-pool.html">appears to have added a feature</a> for the next release of the InnoDB plugin that prevents the buffer pool from getting wiped out by a full table scan. Many people <a href="http://www.mysqlperformanceblog.com/2007/10/26/heikki-tuuri-innodb-answers-part-i/">have requested</a> this. <a href="http://dev.mysql.com/doc/refman/5.1/en/innodb-buffer-pool.html">The documentation</a> is excellent. I have tested it and not only did it work as advertised, but it didn't degrade performance on OLTP workloads. This fixes <a href="http://bugs.mysql.com/bug.php?id=45015">bug 45015</a> and is a nice feature to have when you occasionally use mysqldump to copy a table from a busy OLTP server. Now is a good time to evaluate MySQL 5.1 with the InnoDB plugin.<br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22115&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22115&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Thu, 05 Nov 2009 17:25:00 +0000</pubDate>
    <dc:creator>Mark Callaghan</dc:creator>
  </item>

  <item>
    <title>myterm - extensible mysql command line client</title>
    <guid isPermaLink="false">http://www.jetprofiler.com/blog/8/myterm---extensible-mysql-command-line-client/</guid>
    <link>http://www.jetprofiler.com/blog/8/myterm---extensible-mysql-command-line-client/</link>
    <description>What if I type this:


myterm&amp;gt; SELECT engine, count(*) AS count FROM information_schema.tables GROUP BY 1 ORDER BY 2 DESC | chart pie | browser

and Firefox says:

[Image: Myterm pie]

What&amp;#039;s that?

I just launched an open-source project on Launchpad called myterm. Myterm is a crossover between the standard mysql command line client and the concept of pipes and filters in bash. You can use it to run queries and filter the produced result set in various ways using pipe chaining. This lends itself to quite a lot of different use cases, for instance graphical charts, md5 checksums and different presentation forms, to name a few. It has browser integration using shell exec, which means it can render HTML result sets or charts in your browser. And since most stuff is written using plugins, it will work well as a hub for hooking in more and more tools for data transfer, dumping, backup, simplified monitoring and so on. Sort of like inverted bash scripting: you start inside the db and go out.

The model is based on commands, filters, presenters and dests. The COMMAND (usually a query) produces a result set which is sent off to a chain of filters. Each FILTER can process the result set and reformat the data. A PRESENTER takes a result set and renders it in some form, e.g. plain text, tab separated values, html table or chart. Finally, a DEST is simply the destination of the rendered output, such as standard out, a file or the browser. A full command chain:

COMMAND --resultset--&amp;gt; FILTER --resultset--&amp;gt; PRESENTER --mime--&amp;gt; DEST

Some examples

First some standard output:

myterm&amp;gt; SHOW PROCESSLIST
+------+------+-----------------+------+---------+------+-------+------------------+
| Id   | User | Host            | db   | Command | Time | State | Info             |
+------+------+-----------------+------+---------+------+-------+------------------+
| 4789 | root | 127.0.0.1:59047 | test | Query   |    0 |       | SHOW PROCESSLIST |
+------+------+-----------------+------+---------+------+-------+------------------+
1 row in set (0.000 sec)

Then, let&amp;#039;s use the cols filter:


myterm&amp;gt; SHOW PROCESSLIST | cols 2-3
+------+-----------------+
| User | Host            |
+------+-----------------+
| root | 127.0.0.1:59047 |
+------+-----------------+
1 row in set (0.000 sec)

which filters out some columns. Similarly, a basic grep filter exists. Let&amp;#039;s hope people won&amp;#039;t stop using indexes just because of this:


myterm&amp;gt; SHOW DATABASES | grep info
+--------------------+
| Database           |
+--------------------+
| information_schema |
+--------------------+
1 row in set (0.001 sec)

For chart rendering, myterm uses libchart by Jean-Marc Tremeaux. Take a look at the biggest tables:


myterm&amp;gt; SELECT CONCAT(table_schema, &amp;#039;.&amp;#039;, table_name) AS &amp;#039;Table&amp;#039;, data_length + index_length AS Bytes FROM information_schema.tables ORDER BY 2 DESC | other 7 | chart hbar

The other filter collapses all rows after the sixth into a single total row named Other. Resulting bar chart:

[Image: Myterm hbar]

The insertify plugin will reverse engineer a result set into a create statement and an insert statement. This is not as rock-solid or performant as CREATE...SELECT (it won&amp;#039;t pick up indexes), but works for creating temporary snapshots or test data:


myterm&amp;gt; SHOW PROCESSLIST|insertify|tsv -N -E
CREATE TABLE &amp;#039;some_table&amp;#039; (
    &amp;#039;Id&amp;#039; bigint(11) NOT NULL DEFAULT &amp;#039;&amp;#039;,
    &amp;#039;User&amp;#039; varchar(16) NOT NULL DEFAULT &amp;#039;&amp;#039;,
    &amp;#039;Host&amp;#039; varchar(64) NOT NULL DEFAULT &amp;#039;&amp;#039;,
    &amp;#039;db&amp;#039; varchar(64) DEFAULT NULL,
    &amp;#039;Command&amp;#039; varchar(16) NOT NULL DEFAULT &amp;#039;&amp;#039;,
    &amp;#039;Time&amp;#039; int(7) UNSIGNED NOT NULL DEFAULT &amp;#039;&amp;#039;,
    &amp;#039;State&amp;#039; varchar(30) DEFAULT NULL,
    &amp;#039;Info&amp;#039; varchar(100) DEFAULT NULL
) ENGINE=InnoDB;
INSERT INTO some_table (Id, User, Host, db, Command, Time, State, Info) VALUES (&amp;#039;4789&amp;#039;, &amp;#039;root&amp;#039;, &amp;#039;127.0.0.1:59047&amp;#039;, &amp;#039;test&amp;#039;, &amp;#039;Query&amp;#039;, &amp;#039;0&amp;#039;, NULL, &amp;#039;SHOW PROCESSLIST&amp;#039;);
2 rows in set (0.000 sec)

Here are the plugins so far:


myterm&amp;gt; plugins list
15 plugins loaded:

Filters:
  cols        Filters columns and column ranges.
  grep        Filters lines containing the specified text in any column.
  insertify   Creates insert statements based on a result set.
  other       Reduces a result to max N rows, collapsing any extra rows to a row titled Other at the end.

Presenters:
  chart       Renders a chart using php Libchart.
  html        Formats the data to a html table.
  md5         Calculates an md5 checksum of all rows and column values.
  plain       Formats the data to the default plain text table grid.
  tsv         Formats the data to tab-separated-values plain text.
  vert        Presents the data in a vertical plain-text fashion, similar to mysql \G output format.
  vhtml       Formats the data to a record by record, vertical html table.

Dests:
  browser     Sends the output to the browser.
  file        Sends output to a file with the given filename.
  mailto      Sends output to the registered email application using a mailto: link
  std         Sends output to standard out.

Current status

It&amp;#039;s currently written in PHP, which is kind of bad because of PHP&amp;#039;s limited console integration, signal handling (ctrl-c) and threading (think asynchronous / multi-threaded queries). Maybe a rewrite in Python would fix these issues. Eric Day has initiated a similar project at Portland State University which is about to start... and they&amp;#039;re thinking of Python! So I&amp;#039;ve contacted him about possibly combining these projects in some way later on, once both projects have gotten further.

Download

To download it, you need bazaar and the php 5.2+ command line with the mysqli extension. I&amp;#039;ve only tested it on Ubuntu Linux. Type:


bzr branch lp:myterm

then follow the README file.

Contribute

For now, myterm&amp;#039;s parsing and option handling are limited (don&amp;#039;t do multi-line queries, comments or too much quoting), but most basic stuff works. There are probably tons of bugs in it and things that don&amp;#039;t work, I know :) So... if you&amp;#039;d like to contribute, join the project and mailing list on Launchpad. There, you can also look at the blueprints, which are some ideas for additional features. 

Feel free to leave comments and feature suggestions!
</description>
    <content:encoded><![CDATA[What if I type this:<br />
<br />
<ol>
<li>myterm&gt; SELECT engine, count(*) AS count FROM information_schema.tables GROUP BY 1 ORDER BY 2 DESC | chart pie | browser</li></ol>
<br />
and Firefox says:<img style="display: block; margin: 15px 0 15px 0; border: 0;" src="http://www.jetprofiler.com/img/blog/myterm_pie.png" alt="Myterm pie" /><h2>What&#039;s that?</h2>I just launched an open-source project on Launchpad called <a rel="nofollow" target="_new" href="https://launchpad.net/myterm">myterm</a>. Myterm is a crossover between the standard mysql command line client and the concept of pipes and filters in bash. You can use it to run queries and filter the produced result set in various ways using pipe chaining. This lends itself to quite a lot of different use cases, for instance graphical charts, md5 checksums and different presentation forms, to name a few. It has browser integration using shell exec, which means it can render HTML result sets or charts in your browser. And since most stuff is written using plugins, it will work well as a hub for hooking in more and more tools for data transfer, dumping, backup, simplified monitoring and so on. Sort of like inverted bash scripting: you start inside the db and go out.<br />
<br />
The model is based on commands, filters, presenters and dests. The COMMAND (usually a query) produces a result set which is sent off to a chain of filters. Each FILTER can process the result set and reformat the data. A PRESENTER takes a result set and renders it in some form, e.g. plain text, tab separated values, html table or chart. Finally, a DEST is simply the destination of the rendered output, such as standard out, a file or the browser. A full command chain:<br />
<br />
<b>COMMAND --resultset--&gt; FILTER --resultset--&gt; PRESENTER --mime--&gt; DEST</b><h2>Some examples</h2>First some standard output:<br />
<ol>
<li>myterm&gt; SHOW PROCESSLIST</li><li>+------+------+-----------------+------+---------+------+-------+------------------+</li><li>| Id   | User | Host            | db   | Command | Time | State | Info             |</li><li>+------+------+-----------------+------+---------+------+-------+------------------+</li><li>| 4789 | root | 127.0.0.1:59047 | test | Query   |    0 |       | SHOW PROCESSLIST |</li><li>+------+------+-----------------+------+---------+------+-------+------------------+</li><li>1 row in set (0.000 sec)</li></ol>
<br />
Then, let&#039;s use the <b>cols</b> filter:<br />
<br />
<ol>
<li>myterm&gt; SHOW PROCESSLIST | cols 2-3</li><li>+------+-----------------+</li><li>| User | Host            |</li><li>+------+-----------------+</li><li>| root | 127.0.0.1:59047 |</li><li>+------+-----------------+</li><li>1 row in set (0.000 sec)</li></ol>
<br />
which filters out some columns. Similarly, a basic <b>grep</b> filter exists. Let&#039;s hope people won&#039;t stop using indexes just because of this:<br />
<br />
<ol>
<li>myterm&gt; SHOW DATABASES | grep info</li><li>+--------------------+</li><li>| Database           |</li><li>+--------------------+</li><li>| information_schema |</li><li>+--------------------+</li><li>1 row in set (0.001 sec)</li></ol>
<br />
For chart rendering, myterm uses <a rel="nofollow" target="_new" href="http://naku.dohcrew.com/libchart/pages/introduction/">libchart</a> by Jean-Marc Tremeaux. Take a look at the biggest tables:<br />
<br />
<ol>
<li>myterm&gt; SELECT CONCAT(table_schema, &#039;.&#039;, table_name) AS &#039;Table&#039;, data_length + index_length AS Bytes FROM information_schema.tables ORDER BY 2 DESC | other 7 | chart hbar</li></ol>
<br />
The <b>other</b> filter collapses all rows after the sixth into a single total row named Other. Resulting bar chart:<img style="display: block; margin: 15px 0 15px 0; border: 0;" src="http://www.jetprofiler.com/img/blog/myterm_hbar.png" alt="Myterm hbar" /><br />
<br />
The <b>insertify</b> plugin will reverse engineer a result set into a create statement and an insert statement. This is not as rock-solid or performant as CREATE...SELECT (it won&#039;t pick up indexes), but works for creating temporary snapshots or test data:<br />
<br />
<ol>
<li>myterm&gt; SHOW PROCESSLIST|insertify|tsv -N -E</li><li>CREATE TABLE &#039;some_table&#039; (</li><li>    &#039;Id&#039; bigint(11) NOT NULL DEFAULT &#039;&#039;,</li><li>    &#039;User&#039; varchar(16) NOT NULL DEFAULT &#039;&#039;,</li><li>    &#039;Host&#039; varchar(64) NOT NULL DEFAULT &#039;&#039;,</li><li>    &#039;db&#039; varchar(64) DEFAULT NULL,</li><li>    &#039;Command&#039; varchar(16) NOT NULL DEFAULT &#039;&#039;,</li><li>    &#039;Time&#039; int(7) UNSIGNED NOT NULL DEFAULT &#039;&#039;,</li><li>    &#039;State&#039; varchar(30) DEFAULT NULL,</li><li>    &#039;Info&#039; varchar(100) DEFAULT NULL</li><li>) ENGINE=InnoDB;</li><li>INSERT INTO some_table (Id, User, Host, db, Command, Time, State, Info) VALUES (&#039;4789&#039;, &#039;root&#039;, &#039;127.0.0.1:59047&#039;, &#039;test&#039;, &#039;Query&#039;, &#039;0&#039;, NULL, &#039;SHOW PROCESSLIST&#039;);</li><li>2 rows in set (0.000 sec)</li></ol>
<br />
Here are the plugins so far:<br />
<br />
<ol>
<li>myterm&gt; plugins list</li><li>15 plugins loaded:</li><li></li><li>Filters:</li><li>  cols        Filters columns and column ranges.</li><li>  grep        Filters lines containing the specified text in any column.</li><li>  insertify   Creates insert statements based on a result set.</li><li>  other       Reduces a result to max N rows, collapsing any extra rows to a row titled Other at the end.</li><li></li><li>Presenters:</li><li>  chart       Renders a chart using php Libchart.</li><li>  html        Formats the data to a html table.</li><li>  md5         Calculates an md5 checksum of all rows and column values.</li><li>  plain       Formats the data to the default plain text table grid.</li><li>  tsv         Formats the data to tab-separated-values plain text.</li><li>  vert        Presents the data in a vertical plain-text fashion, similar to mysql \G output format.</li><li>  vhtml       Formats the data to a record by record, vertical html table.</li><li></li><li>Dests:</li><li>  browser     Sends the output to the browser.</li><li>  file        Sends output to a file with the given filename.</li><li>  mailto      Sends output to the registered email application using a mailto: link</li><li>  std         Sends output to standard out.</li></ol>
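<br />
To make the COMMAND, FILTER, PRESENTER, DEST chain more concrete, here is a minimal sketch in Python. It is not myterm&#039;s code (myterm is written in PHP): the function names and the list-of-dicts result set are assumptions made for illustration, and only the idea of the cols and grep filters, the tsv presenter and the std dest comes from the plugin list above.<br />

```python
import sys

# A result set is modelled as a list of dicts (column name -> value).
# This representation is an assumption, not myterm's internal format.


def cols(rows, first, last):
    """FILTER: keep columns first..last (1-based, inclusive)."""
    return [dict(list(row.items())[first - 1:last]) for row in rows]


def grep(rows, text):
    """FILTER: keep rows where any column value contains text."""
    return [row for row in rows
            if any(text in str(value) for value in row.values())]


def tsv(rows):
    """PRESENTER: render the result set as tab-separated text."""
    if not rows:
        return ""
    header = "\t".join(rows[0].keys())
    lines = ["\t".join(str(v) for v in row.values()) for row in rows]
    return "\n".join([header] + lines) + "\n"


def std(text, stream=sys.stdout):
    """DEST: send the rendered output to standard out."""
    stream.write(text)


# COMMAND: stands in for the result of SHOW PROCESSLIST.
processlist = [
    {"Id": 4789, "User": "root", "Host": "127.0.0.1:59047", "db": "test"},
]

# Equivalent in spirit to: SHOW PROCESSLIST | grep root | cols 2-3 | tsv
std(tsv(cols(grep(processlist, "root"), 2, 3)))
```

Each stage only takes a result set and hands on a result set (or rendered text), which is what lets filters be chained in any order before a single presenter and dest.<br />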
<h2>Current status</h2>It&#039;s currently written in PHP, which is kind of bad because of PHP&#039;s limited console integration, signal handling (ctrl-c) and threading (think asynchronous / multi-threaded queries). Maybe a rewrite in Python would fix these issues. Eric Day has initiated a <a rel="nofollow" target="_new" href="http://oddments.org/?p=206">similar project</a> at Portland State University which is about to start... and they&#039;re thinking of Python! So I&#039;ve contacted him about possibly combining these projects in some way later on, once both projects have gotten further. <h2>Download</h2>To download it, you need bazaar and the php 5.2+ command line with the mysqli extension. I&#039;ve only tested it on Ubuntu Linux. Type:<br />
<br />
<ol>
<li>bzr branch lp:myterm</li></ol>
<br />
then follow the README file.<h2>Contribute</h2>For now, myterm&#039;s parsing and option handling are limited (don&#039;t do multi-line queries, comments or too much quoting), but most basic stuff works. There are probably tons of bugs in it and things that don&#039;t work, I know :) So... if you&#039;d like to contribute, <a rel="nofollow" target="_new" href="https://launchpad.net/~myterm-team">join the project</a> and mailing list on Launchpad. There, you can also look at the <a rel="nofollow" target="_new" href="https://blueprints.launchpad.net/myterm">blueprints</a>, which are some ideas for additional features. <br />
<br />
Feel free to leave comments and feature suggestions!<br />
<br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22116&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22116&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Thu, 05 Nov 2009 16:47:22 +0000</pubDate>
    <dc:creator>Bj&amp;ouml;rn Melinder</dc:creator>
    <category>myterm</category>
    <category>mysql</category>
  </item>

  <item>
    <title>The New York Times on Oracle-Sun merger</title>
    <guid isPermaLink="false">http://blog.thinkphp.de/archives/453-guid.html</guid>
    <link>http://blog.thinkphp.de/archives/453-The-New-York-Times-on-Oracle-Sun-merger.html</link>
    <description>
The New York Times posted an article called &quot;Decision on Oracle a Test for Kroes&quot; in which they cite my suggestion for what should be done if the merger is allowed: splitting off MySQL from Sun/Oracle.

&amp;#160;

I posted some tweets about the situation and briefly chatted with Florian Müller, whom everybody should know from the 2004 EU campaign against software patents and who is acting as a formal advisor in the current EU review of the deal. I believe in Open Source being &quot;free as in freedom&quot;. In recent months I haven't heard anything from Oracle itself regarding its future plans for MySQL. And having read this article (&quot;Oracle plans aggressive fight [...]&quot;) from PC World magazine, I'm not sure it would be a good thing for Oracle to own MySQL in the future. Does Oracle have experience in growing and maintaining an Open Source community that accepts the dual-license nature of the software, and will it invest in MySQL?

&amp;#160;

If MySQL is split off (i.e. sold to another business), it has to be assured that there is a company behind MySQL that drives its further adoption, especially in enterprise business. That's why, over the past decade, MySQL grew from a small database into a great database server that suits a lot of enterprise needs for the web. I don't want that industry to fall back - the future is the web and the database for the web is MySQL.

&amp;#160;

I'm pretty sure that EU Commissioner Kroes will make the right decision by the end of January 2010.
</description>
    <content:encoded><![CDATA[<p>
The New York Times posted an article called "<a href="http://www.nytimes.com/2009/11/05/technology/companies/05iht-oracle.html?_r=1">Decision on Oracle a Test for Kroes</a>" in which they cite my suggestion for what should be done if the merger is allowed: splitting off MySQL from Sun/Oracle.
</p>
<p>&#160;</p>
<p>
I posted some <a href="http://www.twitter.com/BjoernSchotte">tweets</a> about the situation and briefly chatted with Florian Müller, whom everybody should know from the 2004 EU campaign against software patents and who is acting as a formal advisor in the current EU review of the deal. I believe in Open Source being "free as in freedom". In recent months I haven't heard anything from Oracle itself regarding its future plans for MySQL. And having read <a href="http://www.pcworld.com/article/181406/oracle_plans_aggressive_fight_with_eu_over_sun_takeover.html">this article</a> ("Oracle plans aggressive fight [...]") from PC World magazine, I'm not sure it would be a good thing for Oracle to own MySQL in the future. Does Oracle have experience in growing and maintaining an Open Source community that accepts the dual-license nature of the software, and will it invest in MySQL?
</p>
<p>&#160;</p>
<p>
If MySQL is split off (i.e. sold to another business), it has to be assured that there is a company behind MySQL that drives its further adoption, especially in enterprise business. That's why, over the past decade, MySQL grew from a small database into a great database server that suits a lot of enterprise needs for the web. I don't want that industry to fall back - the future is the web and the database for the web is MySQL.
</p>
<p>&#160;</p>
<p>
I'm pretty sure that EU Commissioner Kroes will make the right decision by the end of January 2010.
</p><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22089&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22089&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Thu, 05 Nov 2009 12:49:14 +0000</pubDate>
    <category>PHP</category>
  </item>

  <item>
    <title>MySQL Cluster 7.0.9a source release now available – replaces 7.0.9</title>
    <guid isPermaLink="false">http://www.clusterdb.com/?p=653</guid>
    <link>http://www.clusterdb.com/uncategorized/mysql-cluster-7-0-9a-source-release-now-available-replaces-7-0-9/</link>
    <description>The source version for MySQL Cluster 7.0.9a has now been made available at ftp://ftp.mysql.com/pub/mysql/download/cluster_telco/mysql-5.1.39-ndb-7.0.9a/
This replaces MySQL Cluster 7.0.9.
You can either wait for the binaries to be released or, if you&amp;#8217;re in a rush, you can find instructions on building the binaries yourself in the earlier article: &amp;#8220;MySQL Cluster 7.0.7 source released&amp;#8221;.
A description of all of the changes (fixes) that have gone into MySQL Cluster 7.0.9a (compared to 7.0.8a) can be found in the MySQL Cluster 7.0.9a Change Log.</description>
    <content:encoded><![CDATA[<p>The source version for MySQL Cluster 7.0.9a has now been made available at <a href="ftp://ftp.mysql.com/pub/mysql/download/cluster_telco/mysql-5.1.39-ndb-7.0.9a/">ftp://ftp.mysql.com/pub/mysql/download/cluster_telco/mysql-5.1.39-ndb-7.0.9a/</a></p>
<p>This replaces MySQL Cluster 7.0.9.</p>
<p>You can either wait for the binaries to be released or, if you&#8217;re in a rush, you can find instructions on building the binaries yourself in the earlier article: &#8220;<a href="http://www.clusterdb.com/mysql-cluster/mysql-cluster-7-0-7-source-released/">MySQL Cluster 7.0.7 source released</a>&#8221;.</p>
<p>A description of all of the changes (fixes) that have gone into MySQL Cluster 7.0.9a (compared to 7.0.8a) can be found in the <a href="http://www.clusterdb.com/wp-content/uploads/2009/11/Cluster_7_0_9a_ChangeLog.txt">MySQL Cluster 7.0.9a Change Log</a>.</p><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22088&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22088&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Thu, 05 Nov 2009 08:57:54 +0000</pubDate>
    <dc:creator>Andrew Morgan</dc:creator>
    <category>Uncategorized</category>
  </item>

  <item>
    <title>MySQL Cluster 6.3.28a source release now available – replaces 6.3.28</title>
    <guid isPermaLink="false">http://www.clusterdb.com/?p=647</guid>
    <link>http://www.clusterdb.com/mysql-cluster/mysql-cluster-6-3-28a-source-release-now-available-replaces-6-3-28/</link>
    <description>The source version for MySQL Cluster 6.3.28a has now been made available at ftp://ftp.mysql.com/pub/mysql/download/cluster_telco/mysql-5.1.39-ndb-6.3.28a/
This replaces MySQL Cluster 6.3.28, which has been withdrawn.
You can either wait for the binaries to be released or, if you&amp;#8217;re in a rush, you can find instructions on building the binaries yourself in the earlier article: &amp;#8220;MySQL Cluster 7.0.7 source released&amp;#8221;.
A description of all of the changes (fixes) that have gone into MySQL Cluster 6.3.28a (compared to 6.3.27) can be found in the MySQL Cluster 6.3.28a Change Log.</description>
    <content:encoded><![CDATA[<p>The source version for MySQL Cluster 6.3.28a has now been made available at <a href="ftp://ftp.mysql.com/pub/mysql/download/cluster_telco/mysql-5.1.39-ndb-6.3.28a/">ftp://ftp.mysql.com/pub/mysql/download/cluster_telco/mysql-5.1.39-ndb-6.3.28a/</a></p>
<p>This replaces MySQL Cluster 6.3.28, which has been withdrawn.</p>
<p>You can either wait for the binaries to be released or, if you&#8217;re in a rush, you can find instructions on building the binaries yourself in the earlier article: &#8220;<a href="http://www.clusterdb.com/mysql-cluster/mysql-cluster-7-0-7-source-released/">MySQL Cluster 7.0.7 source released</a>&#8221;.</p>
<p>A description of all of the changes (fixes) that have gone into MySQL Cluster 6.3.28a (compared to 6.3.27) can be found in the <a href="http://www.clusterdb.com/wp-content/uploads/2009/11/Cluster_6_3_28a_ChangeLog.txt">MySQL Cluster 6.3.28a Change Log</a>.</p><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22087&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22087&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Thu, 05 Nov 2009 08:50:04 +0000</pubDate>
    <dc:creator>Andrew Morgan</dc:creator>
    <category>MySQL Cluster</category>
    <category>MySQL</category>
  </item>

  <item>
    <title>Dreamhost Uptime Numbers are TERRIBLE!</title>
    <guid isPermaLink="false">http://www.nicholasgoodman.com/bt/blog/2009/11/04/dreamhost-uptime-numbers-are-terrible/</guid>
    <link>http://www.nicholasgoodman.com/bt/blog/2009/11/04/dreamhost-uptime-numbers-are-terrible/</link>
    <description>I don&amp;#8217;t care what their marketing stats say, I have my own independent verification.  I&amp;#8217;ve been using Wormly for quite a while to monitor some of my demo sites, and other services that are part of Bayon and part of Dynamo.  Since I was already paying for it, I figured I&amp;#8217;d turn it loose on this blog (nicholasgoodman.com) and see what the uptime was like.
I always thought Dreamhost was a little skittish, and my inbox gets approximately one failure email per day, but I figured they were small, single-request failures.  Nope.  The independently measured uptime of this blog is a CRUDDY, CRAPPY 97.6%.

That&amp;#8217;s pathetic!  My blog is nothing special: an out-of-the-box WordPress installation backed by their MySQL.  I haven&amp;#8217;t done any installations or customizations of my own (except for a theme), and yet my blog uptime is awful.  I&amp;#8217;ve liked the Dreamhost panel; it gives the &amp;#8220;technical but uninterested in actually administering their own server&amp;#8221; user a lot of power, and I&amp;#8217;d be willing to tolerate a little downtime (truthfully, anything above 99.5% is OK with me).  But 97% uptime?  Shyeah&amp;#8230; Time to start looking.
Anyone have any suggestions for good WordPress / PHP / MySQL hosts?  Willing to pay top dollar, and I&amp;#8217;ll bring with me registrations for about 25 domains.</description>
    <content:encoded><![CDATA[<p>I don&#8217;t care what their marketing stats say, I have my own independent verification.  I&#8217;ve been using Wormly for quite a while to monitor some of my demo sites, and other services that are part of Bayon and part of Dynamo.  Since I was already paying for it, I figured I&#8217;d turn it loose on this blog (nicholasgoodman.com) and see what the uptime was like.</p>
<p>I always thought Dreamhost was a little skittish, and my inbox gets approximately one failure email per day, but I figured they were small, single-request failures.  Nope.  The independently measured uptime of this blog is a CRUDDY, CRAPPY 97.6%.</p>
<p><a href="http://www.nicholasgoodman.com/bt/blog/wp-content/uploads/2009/11/200911042116.jpg"><img src="http://www.nicholasgoodman.com/bt/blog/wp-content/uploads/2009/11/200911042116-tm.jpg" height="100" width="289" border="1" hspace="4" vspace="4" alt="200911042116" /></a><br />
That&#8217;s pathetic!  My blog is nothing special, an out of the box Wordpress installation backed by their MySQL.  I haven&#8217;t done any of my own installations or customizations (except for a theme) and yet my blog uptime is awful.  I&#8217;ve liked the Dreamhost panel; it gives the &#8220;technical but uninterested in actually administering their own server&#8221; user a lot of power and I&#8217;d be willing to tolerate a little downtime (truthfully, anything above 99.5% is OK with me).  But 97% uptime?  Shyeah&#8230; Time to start looking.</p>
<p>Anyone have any suggestions for good Wordpress / PHP / MySQL hosts?  Willing to pay top dollar and I&#8217;ll bring with me registrations for about 25 domains.</p>]]></content:encoded>
    <pubDate>Thu, 05 Nov 2009 05:20:12 +0000</pubDate>
    <dc:creator>Nicholas Goodman</dc:creator>
    <category>Technology Industry</category>
  </item>

  <item>
    <title>Do you want to be an OurDelta mirror?</title>
    <guid isPermaLink="false">http://ourdelta.org/?p=564</guid>
    <link>http://ourdelta.org/do-you-want-to-be-an-ourdelta-mirror</link>
    <description>Then please contact us:  i n f o (at) o u r d e l t a (dot) o r g
What are the requirements? Having a server with HTTP access and no hassles with low traffic limits. At this stage you&amp;#8217;ll need about 5GB disk space, and you&amp;#8217;ll use rsync to sync from the master servers (we&amp;#8217;ll provide you with a script to help with that). Thanks!
With the new releases the traffic is up (not surprising) and while our existing mirrors appear to be doing OK so far, it&amp;#8217;ll be good to have more available before we run into capacity or speed problems. We also haven&amp;#8217;t yet split by geographic location; that too becomes a possibility with more mirrors.</description>
    <content:encoded><![CDATA[<p>Then please contact us:  i n f o (at) o u r d e l t a (dot) o r g</p>
<p>What are the requirements? Having a server with HTTP access and no hassles with low traffic limits. At this stage you&#8217;ll need about 5GB disk space, and you&#8217;ll use rsync to sync from the master servers (we&#8217;ll provide you with a script to help with that). Thanks!</p>
<p>With the new releases the traffic is up (not surprising) and while our existing mirrors appear to be doing OK so far, it&#8217;ll be good to have more available <em>before</em> we run into capacity or speed problems. We also haven&#8217;t yet split by geographic location; that too becomes a possibility with more mirrors.</p>]]></content:encoded>
    <pubDate>Wed, 04 Nov 2009 23:03:03 +0000</pubDate>
    <dc:creator>OurDelta - Builds for MySQL</dc:creator>
    <category>General</category>
    <category>mariadb</category>
    <category>mirrors</category>
    <category>mysql</category>
    <category>ourdelta</category>
  </item>

  <item>
    <title>We've launched: open source alpha available now at infinidb.org</title>
    <guid isPermaLink="false">urn:lj:livejournal.com:atom1:jtommaney:3195</guid>
    <link>http://jtommaney.livejournal.com/3195.html</link>
    <description>We've launched: open source alpha available now at infinidb.org&amp;nbsp;&amp;nbsp; See www.infinidb.org for details. The intention will be to post information specific to InfiniDB&amp;nbsp;performance, scalability, or&amp;nbsp;features on that site.&amp;nbsp;Topics here may be more general in nature, e.g. comparisons of column versus row for different use cases that may apply to any column architecture DBMS. </description>
    <content:encoded><![CDATA[We've launched: open source alpha available now at infinidb.org<br /><br />&nbsp;&nbsp; See <a href="http://www.infinidb.org">www.infinidb.org</a> for details.<br /><br />The intention will be to post information specific to InfiniDB&nbsp;performance, scalability, or&nbsp;features on that site.&nbsp;Topics here may be more general in nature, e.g. comparisons of column versus row for different use cases that may apply to any column architecture DBMS. <br />]]></content:encoded>
    <pubDate>Wed, 04 Nov 2009 22:39:25 +0000</pubDate>
    <dc:creator>Jim Tommaney</dc:creator>
  </item>

  <item>
    <title>The search for MySQL 5.5</title>
    <guid isPermaLink="false">http://openquery.com/blog/?p=1116</guid>
    <link>http://openquery.com/blog/search-mysql-55</link>
    <description>So, MySQL 6.0 was ditched, and a few weeks ago 5.4 was also &amp;#8211; its features to be added in other (earlier) versions (I&amp;#8217;m told 5.2 but not sure). I reckon that&amp;#8217;s good news, regardless of the version number. There was also an announcement about a change in the release mechanism at Sun/MySQL.
Now for practicals. If I look on Launchpad, the 5.1 branch is the only active one (next to 5.0 fixes, of course). 5.4 was last updated 15 weeks ago. There is no 5.2 on there that I can find. Wasn&amp;#8217;t looking for it really, just happened to notice its absence while I was trying to find 5.5. And the reason for that was that Miguel closed a bug I was following, noting it was no longer reproducible in 5.5. He pastes some code that reports mysql as 5.5, so it&amp;#8217;s not a typo.
So, in addition to the above list of abandonment (5.4, 6.0), we have 5.2 which I&amp;#8217;m told should exist but doesn&amp;#8217;t at Launchpad, and 5.5 which appears to exist and is news to me yet doesn&amp;#8217;t appear to be out there either. Are you confused? I am.
The particular bug was found during a training session and occurs on Windows. Now the bug is closed, but we can&amp;#8217;t see code and have no indication when it or binaries will be available. So what do I tell a user asking about the bug and its apparent fix? (I have to say apparent because Miguel&amp;#8217;s response indicates that it&amp;#8217;s merely not reproducible on the later version; there&amp;#8217;s no specific fix.)
Updates

Vladislav Vainroub notes there&amp;#8217;s a mysql-next-mr branch on Launchpad which is in fact version 5.5 inside. It appears to be mirrored, last sync 5 minutes ago but last changeset 39 hours ago. So this seems like a publishing branch, not a development branch (otherwise we&amp;#8217;d see more activity).
Paul DuBois says that the mysql-server trunk on Launchpad is now 5.5. Last activity is from a week ago, so I presume that like the above-mentioned mysql-next-mr branch it&amp;#8217;s synced and not actually a live development branch. Pity.
</description>
    <content:encoded><![CDATA[<p>So, MySQL 6.0 was ditched, and a few weeks ago 5.4 was also &#8211; its features to be added in other (earlier) versions (I&#8217;m told 5.2 but not sure). I reckon that&#8217;s good news, regardless of the version number. There was also an announcement about a change in the release mechanism at Sun/MySQL.</p>
<p>Now for practicals. If I look on <a href="https://code.launchpad.net/mysql-server" target="_blank">Launchpad</a>, the 5.1 branch is the only active one (next to 5.0 fixes, of course). 5.4 was last updated 15 weeks ago. There is no 5.2 on there that I can find. Wasn&#8217;t looking for it really, just happened to notice its absence while I was trying to find 5.5. And the reason for that was that Miguel <a href="http://bugs.mysql.com/27594" target="_blank">closed a bug</a> I was following, noting it was no longer reproducible in 5.5. He pastes some code that reports mysql as 5.5, so it&#8217;s not a typo.</p>
<p>So, in addition to the above list of abandonment (5.4, 6.0), we have 5.2 which I&#8217;m told should exist but doesn&#8217;t at Launchpad, and 5.5 which appears to exist and is news to me yet doesn&#8217;t appear to be out there either. <em>Are you confused? I am.</em></p>
<p>The particular bug was found during a training session and occurs on Windows. Now the bug is closed, but we can&#8217;t see code and have no indication when it or binaries will be available. So what do I tell a user asking about the bug and its apparent fix? (I have to say <em>apparent</em> because Miguel&#8217;s response indicates that it&#8217;s merely not reproducible on the later version; there&#8217;s no specific fix.)</p>
<p><strong>Updates</strong></p>
<ul>
<li>Vladislav Vainroub notes there&#8217;s a <a href="https://code.launchpad.net/~mysql/mysql-server/mysql-next-mr" target="_blank">mysql-next-mr</a> branch on Launchpad which is in fact version 5.5 inside. It appears to be mirrored, last sync 5 minutes ago but last changeset 39 hours ago. So this seems like a publishing branch, not a development branch (otherwise we&#8217;d see more activity).</li>
<li>Paul DuBois says that the mysql-server trunk on Launchpad is now 5.5. Last activity is from a week ago, so I presume that like the above-mentioned mysql-next-mr branch it&#8217;s synced and not actually a live development branch. Pity.</li>
</ul>]]></content:encoded>
    <pubDate>Wed, 04 Nov 2009 22:30:35 +0000</pubDate>
    <dc:creator>Open Query</dc:creator>
    <category>Uncategorized</category>
    <category>5.1</category>
    <category>5.2</category>
    <category>5.4</category>
    <category>5.5</category>
    <category>6.0</category>
    <category>mysql</category>
    <category>open query</category>
  </item>

  <item>
    <title>Increasing innodb_log_file_size</title>
    <guid isPermaLink="false">http://www.facebook.com/note.php?note_id=174491010932</guid>
    <link>http://www.facebook.com/note.php?note_id=174491010932</link>
    <description>We have servers that run with innodb_log_file_size=256M and some of these servers do a lot of disk writes per second. I wanted to know whether performance would improve with a larger value for innodb_log_file_size, so I set up two test servers that used 256M and 512M for it and then ran a mirror of the production workload on them.

The results are interesting. The benefit varies from significant to not much depending on how you measure. With this change the write rate was reduced:

4.5% as measured by iostat w/s
13% as measured by iostat wsec/s
18% as measured by InnoDB pages written.

I then added the my.cnf variable innodb_flush_neighbors_on_checkpoint to MySQL. There are several conditions under which InnoDB writes dirty pages. One reason for pages to be flushed is page preflush, which is done to enforce the fuzzy checkpoint constraint: the oldest LSN for a dirty page must not be too close to the start of the current group of log files. InnoDB submits async write requests to enforce this. The Facebook MySQL patch adds statistics to SHOW INNODB STATUS that report on the source of page writes, and for my servers page preflush is the common cause. Other causes are too many dirty pages and moving pages from the LRU to the free list.

When a dirty page is to be written, InnoDB submits async write requests for that page and all other dirty pages from the same extent. This is done to reduce the disk seek overhead. This will also increase the rate at which pages are written to disk. In the case of page preflush, the impact can be significant. When there are too many dirty pages or not enough pages on the free list, InnoDB submits a fixed number of write requests. But when page preflush is done, InnoDB must submit async write requests for all pages with a modified LSN that is too small. So the extra writes done for pages in the same extent can lead to a large number of write requests.

With a modified binary I set skip_innodb_flush_neighbors_on_checkpoint for the server using innodb_log_file_size=512M. With this change the write rate was reduced:

2% as measured by iostat w/s
18% as measured by iostat wsec/s
26% as measured by InnoDB pages written.

I don't know why the iostat rate for w/s was not reduced more when using skip_innodb_flush_neighbors_on_checkpoint. 

The setting skip_innodb_flush_neighbors_on_checkpoint is also likely to be useful when flash is used, but that is another discussion.</description>
    <content:encoded><![CDATA[We have servers that run with <a href="http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_log_file_size">innodb_log_file_size=256M</a> and some of these servers do a lot of disk writes per second. I wanted to know whether performance would improve with a larger value for innodb_log_file_size, so I set up two test servers that used 256M and 512M for it and then ran a mirror of the production workload on them.

The results are interesting. The benefit varies from significant to not much depending on how you measure. With this change the write rate was reduced:
<ul>
<li>4.5% as measured by iostat w/s
<li>13% as measured by iostat wsec/s
<li>18% as measured by InnoDB pages written.
</ul>
I then added the my.cnf variable <b>innodb_flush_neighbors_on_checkpoint</b> to MySQL. There are several conditions under which InnoDB writes dirty pages. One reason for pages to be flushed is page preflush, which is done to enforce the fuzzy checkpoint constraint: the oldest LSN for a dirty page must not be too close to the start of the current group of log files. InnoDB submits async write requests to enforce this. The Facebook MySQL patch adds statistics to SHOW INNODB STATUS that report on the source of page writes, and for my servers page preflush is the common cause. Other causes are too many dirty pages and moving pages from the LRU to the free list.

When a dirty page is to be written, InnoDB submits async write requests for that page and all other dirty pages from the same extent. This is done to reduce the disk seek overhead. This will also increase the rate at which pages are written to disk. In the case of page preflush, the impact can be significant. When there are too many dirty pages or not enough pages on the free list, InnoDB submits a fixed number of write requests. But when page preflush is done, InnoDB must submit async write requests for all pages with a modified LSN that is too small. So the extra writes done for pages in the same extent can lead to a large number of write requests.

With a modified binary I set <b>skip_innodb_flush_neighbors_on_checkpoint</b> for the server using innodb_log_file_size=512M. With this change the write rate was reduced:
<ul>
<li>2% as measured by iostat w/s
<li>18% as measured by iostat wsec/s
<li>26% as measured by InnoDB pages written.
</ul>
I don't know why the iostat rate for w/s was not reduced more when using <b>skip_innodb_flush_neighbors_on_checkpoint</b>. 
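
For reference, the my.cnf for the second test server might look roughly like the fragment below. This is only a sketch: placing the options under [mysqld] is an assumption, and the skip_ option exists only in the modified binary described above, so a stock server will not recognize it.
<pre>
[mysqld]
# Larger redo log. On stock 5.0/5.1 servers, changing this needs a clean
# shutdown and the old ib_logfile* files moved aside before the restart.
innodb_log_file_size = 512M

# Patched builds only: skip writing the other dirty pages from the same
# extent when pages are flushed for the checkpoint constraint.
skip_innodb_flush_neighbors_on_checkpoint
</pre>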

The setting <b>skip_innodb_flush_neighbors_on_checkpoint</b> is also likely to be useful when flash is used, but that is another discussion.]]></content:encoded>
    <pubDate>Wed, 04 Nov 2009 22:06:46 +0000</pubDate>
    <dc:creator>Mark Callaghan</dc:creator>
  </item>

  <item>
    <title>IRC is the best support method and Open Source rules</title>
    <guid isPermaLink="false">http://themattreid.com/wordpress/?p=246</guid>
    <link>http://feedproxy.google.com/~r/Themattreid/~3/5WIstKGeKDk/</link>
    <description>I was working on a server today that was not hooked up to our usual monitoring systems for one reason or another and I needed to generate a database tuning report. Typically I use Matthew Montgomery&amp;#8217;s &amp;#8216;tuning-primer.sh&amp;#8217; script for this since it&amp;#8217;s command line based, simple to use, and generates a number of useful items for tuning recommendations. It&amp;#8217;s a great starting point before delving into the deeper aspects of MySQL and the OS.
I ran into an issue with it on this server that was running the MySQL 5.0.77-percona-highperf-b13-log x86_64 build. The error was:
./tuning-primer.sh.1: line 517: 5.000000: syntax error in expression (error token is &quot;.000000&quot;)
There were three options to fix this issue:

Dive into the code and modify it cowboy style
Use our typical monitoring against the client&amp;#8217;s wishes
Contact the developer to get a fix

I hopped on irc.freenode.net to the #mysql channel and found Matthew online. Long story short, he narrowed it down to the Percona build of MySQL reporting the variable in question in a different format than stock MySQL does. After discussing the matter he released a new build to address this change and I was on my way to generating reports again. That is quality support that you won&amp;#8217;t see from the likes of Microsoft or Apple or any big closed-source player.
If there&amp;#8217;s one thing to take away from this story, it&amp;#8217;s that Open Source Software will ALWAYS be better than closed source because of the hands-on attitude and direct contact you can get with the developers or, at the minimum, with the large user community that is willing and able to help troubleshoot. Why are people able to help? Because the code is open and free &amp;#8211; and people like free software that they can improve and fix themselves &amp;#8211; and those people like to help others because we&amp;#8217;ve all needed help at some point or another, no matter how much of an expert you are.
If there are two things to take away, the second is that IRC is a wealth of useful information and support for Open Source applications. You can find me on the #mysql, ##php, #perl, #extjs, and #codeigniter channels under various usernames &amp;#8211; or idling in the #kontrollbase channel for supporting my own application.
Closed source apps are the old way to do business, the rotting steel skeletons from the industrial age of computing&amp;#8230; Open Source is the brainy kid down the street that doesn&amp;#8217;t want to rip you off for something that some nameless big corporation designed overseas for pennies on the dollar just to turn a profit and sell you something with crappy support and non-auditable code. Long live OSS!
</description>
    <content:encoded><![CDATA[<p>I was working on a server today that was not hooked up to our usual monitoring systems for one reason or another and I needed to generate a database tuning report. Typically I use Matthew Montgomery&#8217;s &#8216;tuning-primer.sh&#8217; script for this since it&#8217;s command line based, simple to use, and generates a number of useful items for tuning recommendations. It&#8217;s a great starting point before delving into the deeper aspects of MySQL and the OS.</p>
<p>I ran into an issue with it on this server that was running the MySQL 5.0.77-percona-highperf-b13-log x86_64 build. The error was:<br />
<code>./tuning-primer.sh.1: line 517: 5.000000: syntax error in expression (error token is ".000000")</code></p>
<p>There were three options to fix this issue:</p>
<ol>
<li>Dive into the code and modify it cowboy style</li>
<li>Use our typical monitoring against the client&#8217;s wishes</li>
<li>Contact the developer to get a fix</li>
</ol>
<p>I hopped on irc.freenode.net to the #mysql channel and found Matthew online. Long story short, he narrowed it down to the Percona build of MySQL reporting the variable in question in a different format than stock MySQL does. After discussing the matter he released a new build to address this change and I was on my way to generating reports again. That is quality support that you won&#8217;t see from the likes of Microsoft or Apple or any big closed-source player.</p>
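<p>For anyone who hits the same message: bash arithmetic only handles integers, so a value with a decimal point trips it up. Here is a minimal sketch of the failure and of one possible workaround. The variable names and the sample value are made up for illustration, and this is not necessarily how Matthew fixed the script.</p>
<pre><code>#!/bin/bash
# Hypothetical value, formatted with a decimal point like the one in the error above
value="5.000000"

# Bash arithmetic is integer-only, so $(( value * 2 )) would abort here
# with an arithmetic syntax error pointing at ".000000".

# One possible workaround: strip everything from the first dot onward
int_value="${value%%.*}"

echo $(( int_value * 2 ))
</code></pre>
<p>Truncating is only reasonable when the fractional part carries no information, as with 5.000000; a script that needs real decimals would have to hand the math to something like bc or awk instead.</p>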
<p>If there&#8217;s one thing to take away from this story, it&#8217;s that Open Source Software will ALWAYS be better than closed source because of the hands-on attitude and direct contact you can get with the developers or, at the minimum, with the large user community that is willing and able to help troubleshoot. Why are people able to help? Because the code is open and free &#8211; and people like free software that they can improve and fix themselves &#8211; and those people like to help others because we&#8217;ve all needed help at some point or another, no matter how much of an expert you are.</p>
<p>If there are two things to take away, the second is that IRC is a wealth of useful information and support for Open Source applications. You can find me on the #mysql, ##php, #perl, #extjs, and #codeigniter channels under various usernames &#8211; or idling in the #kontrollbase channel for supporting my own application.</p>
<p>Closed source apps are the old way to do business, the rotting steel skeletons from the industrial age of computing&#8230; Open Source is the brainy kid down the street that doesn&#8217;t want to rip you off for something that some nameless big corporation designed overseas for pennies on the dollar just to turn a profit and sell you something with crappy support and non-auditable code. Long live OSS!</p>
]]></content:encoded>
    <pubDate>Wed, 04 Nov 2009 20:56:45 +0000</pubDate>
    <dc:creator>Matt Reid</dc:creator>
    <category>irc</category>
    <category>open source</category>
    <category>mysql</category>
    <category>support</category>
  </item>

  <item>
    <title>Vote for Kontrollbase on MySQL Forge</title>
    <guid isPermaLink="false">http://themattreid.com/wordpress/?p=244</guid>
    <link>http://feedproxy.google.com/~r/Themattreid/~3/U2H5Q84oRX0/</link>
    <description>Just a reminder to all of those users that are enjoying Kontrollbase &amp;#8211; if you get a minute in your day please go to the MySQL Forge site and put your vote in on Kontrollbase. It&amp;#8217;s a simple star based vote on the right side of the page located here: http://forge.mysql.com/projects/project.php?id=318
</description>
    <content:encoded><![CDATA[<p>Just a reminder to all of those users that are enjoying Kontrollbase &#8211; if you get a minute in your day please go to the MySQL Forge site and put your vote in on Kontrollbase. It&#8217;s a simple star based vote on the right side of the page located here: <a href="http://forge.mysql.com/projects/project.php?id=318">http://forge.mysql.com/projects/project.php?id=318</a></p>
]]></content:encoded>
    <pubDate>Wed, 04 Nov 2009 19:49:12 +0000</pubDate>
    <dc:creator>Matt Reid</dc:creator>
    <category>Kontrollbase</category>
    <category>mysql forge</category>
    <category>vote</category>
  </item>

  <item>
    <title>Amazon's move mocks EU's fear of Oracle</title>
    <guid isPermaLink="false">http://news.cnet.com/8301-13505_3-10390467-16.html</guid>
    <link>http://news.cnet.com/8301-13505_3-10390467-16.html?part=rss&amp;tag=feed&amp;subj=TheOpenRoad</link>
    <description>Amazon.com's fork of the MySQL database suggests that competition is alive and well, regardless of Oracle's desire to buy Sun or of the European Commission.</description>
    <content:encoded><![CDATA[Amazon.com's fork of the MySQL database suggests that competition is alive and well, regardless of Oracle's desire to buy Sun or of the European Commission.]]></content:encoded>
    <pubDate>Wed, 04 Nov 2009 18:50:14 +0000</pubDate>
    <dc:creator>Matt Asay</dc:creator>
  </item>

  <item>
    <title>Trying to Find a Usable C++ IDE for Linux</title>
    <guid isPermaLink="false">http://inaugust.com/post/67</guid>
    <link>http://inaugust.com/post/67</link>
    <description>Dear LazyWeb,
I&amp;#39;m looking for a usable C++ IDE for Linux and I&amp;#39;m wondering if you&amp;#39;ve seen one. Before you start giving the normal suggestions (Eclipse, NetBeans, just-use-vi) let me start off by saying that I&amp;#39;ve tried Eclipse, NetBeans, Code::Blocks and KDevelop several times, and that I normally hack in some combination of vi and emacs. (yes yes, I know I&amp;#39;m supposed to religiously pick one and be rude to the other... consider me a postmodern hacker)
For it to be usable by me, it must be able to:

Handle the fact that my build is run with autoconf/automake.
Properly rename a method and have that show up throughout the codebase.
Properly encapsulate a variable with getter/setter methods.
Correctly answer the question &amp;quot;where is this method being used&amp;quot;
Run without consuming all of my RAM and CPU resources.
Quickly and easily open a new project/branch (I have 93 different branches of Drizzle in my source dir right now. Going through a 10 step process to open any given branch in the IDE == FAIL)

Bonus points given for:

Allowing me to deal with one or more bzr branches in a sane manner
Supporting an option like emacs where the tab key NEVER inserts a tab character and instead ALWAYS indents the line it&amp;#39;s on.
Figuring out by the existence of a configure.ac file that perhaps a Makefile will appear if it runs &amp;quot;autoreconf -f -i; ./configure&amp;quot;

If you think you have the answer, try this as a test:

bzr branch lp:drizzle
Open the drizzle directory as a &amp;quot;project&amp;quot;
Build
Find some method on the Session object in drizzled/session.h.&amp;nbsp; Rename it using the IDE. Build again.
Find the method errmsg_printf in drizzled/errmsg_print.h. Find out every place that uses it. See if that matches what grep -r &amp;#39;\berrmsg_printf\b&amp;#39; would tell you.

Anybody? If you have an IDE and it can actually deal with my daily Drizzle development, I will happily blog both that it can and how to get started. </description>
    <content:encoded><![CDATA[<p>Dear LazyWeb,</p><p>I&#39;m looking for a usable C++ IDE for Linux and I&#39;m wondering if you&#39;ve seen one. Before you start giving the normal suggestions (Eclipse, NetBeans, just-use-vi) let me start off by saying that I&#39;ve tried Eclipse, NetBeans, Code::Blocks and KDevelop several times, and that I normally hack in some combination of vi and emacs. (yes yes, I know I&#39;m supposed to religiously pick one and be rude to the other... consider me a postmodern hacker)</p><p>For it to be usable by me, it must be able to:</p><ol><li>Handle the fact that my build is run with autoconf/automake. </li><li>Properly rename a method and have that show up throughout the codebase.</li><li>Properly encapsulate a variable with getter/setter methods.</li><li>Correctly answer the question &quot;where is this method being used&quot;</li><li>Run without consuming all of my RAM and CPU resources. </li><li>Quickly and easily open a new project/branch (I have 93 different branches of Drizzle in my source dir right now. Going through a 10 step process to open any given branch in the IDE == FAIL)</li></ol><p>Bonus points given for:</p><ol><li>Allowing me to deal with one or more bzr branches in a sane manner</li><li>Supporting an option like emacs where the tab key NEVER inserts a tab character and instead ALWAYS indents the line it&#39;s on. </li><li>Figuring out by the existence of a configure.ac file that perhaps a Makefile will appear if it runs &quot;autoreconf -f -i; ./configure&quot;</li></ol><p>If you think you have the answer, try this as a test:</p><ul><li>bzr branch lp:drizzle</li><li>Open the drizzle directory as a &quot;project&quot;</li><li>Build</li><li>Find some method on the Session object in drizzled/session.h.&nbsp; Rename it using the IDE. Build again.</li><li>Find the method errmsg_printf in drizzled/errmsg_print.h. Find out every place that uses it. See if that matches what grep -r &#39;\berrmsg_printf\b&#39; would tell you. 
</li></ul><p>Anybody? If you have an IDE and it can actually deal with my daily Drizzle development, I will happily blog both that it can and how to get started. </p>]]></content:encoded>
    <pubDate>Wed, 04 Nov 2009 18:26:02 +0000</pubDate>
    <dc:creator>Monty Taylor</dc:creator>
    <category>tools</category>
    <category>mysql</category>
    <category>drizzle</category>
  </item>

  <item>
    <title>How to analyze SQL performance in InfiniDB</title>
    <guid isPermaLink="false">http://infinidb.org/infinidb-blog/how-to-analyze-sql-performance-in-infinidb.html</guid>
    <link>http://infinidb.org/infinidb-blog/how-to-analyze-sql-performance-in-infinidb.html</link>
    <description>One of the things I found missing when I came to MySQL from other databases was a good SQL tracing utility that helped me understand exactly what a long-running SQL statement was doing. The inclusion of the SQL Profiler in post-5.0 versions of MySQL helped, but I always felt more could be done. With InfiniDB, you have some new SQL diagnostic and tracing tools that you can use to get more performance data from SQL statements that don&amp;rsquo;t seem to be running well.&amp;nbsp; Let me give yoRead More...</description>
    <content:encoded><![CDATA[<p>One of the things I found missing when I came to MySQL from other databases was a good SQL tracing utility that helped me understand exactly what a long-running SQL statement was doing. The inclusion of the SQL Profiler in post-5.0 versions of MySQL helped, but I always felt more could be done.</p><br/><p>With InfiniDB, you have some new SQL diagnostic and tracing tools that you can use to get more performance data from SQL statements that don&rsquo;t seem to be running well.&nbsp; Let me give yoRead More...]]></content:encoded>
    <pubDate>Wed, 04 Nov 2009 17:14:06 +0000</pubDate>
    <dc:creator>Robin Schumacher</dc:creator>
  </item>

  <item>
    <title>The Great Escape</title>
    <guid isPermaLink="false">http://jpipes.com/index.php?/archives/309-guid.html</guid>
    <link>http://jpipes.com/index.php?/archives/309-The-Great-Escape.html</link>
    <description>
This week, I am working on putting together test cases which validate the Drizzle transaction log's handling of BLOB columns.


I ran into an interesting set of problems and am wondering how to go about handling them.  Perhaps the LazyWeb will have some solutions. 


The problem, in short, is inconsistency in the way that the NUL character is escaped (or not escaped) in both the MySQL/Drizzle protocol and the MySQL/Drizzle client tools.  And, by client tools, I mean not only everyone's favourite little mysql command-line client, but also the mysqltest client, which provides infrastructure and runtime services for the MySQL and Drizzle test suites.


Even within the server and client protocol, there appears to be some inconsistency in how and when things are escaped.  Take a look at this interesting output from the drizzle client program (FYI, output is identical for mysql client, I checked...)


drizzle&gt; select 'test\0me';
+---------+
| test    |
+---------+
| test me | 
+---------+
1 row in set (0 sec)


You'll notice that in the first SELECT statement, the column header is cut off &amp;mdash; i.e. the column header is not escaping the \0 NUL character in the string 'test\0me'.  However, the result data does not truncate the string but replaces the NUL character with a space character.  So, I came to the conclusion that the drizzle client does not escape column headers but does do some sort of escaping for the result data. Given this conclusion, you will understand my raised eyebrow when the following SELECT statement was displayed:


drizzle&gt; select 'test\0me' = 'test me';
+------------------------+
| 'test\0me' = 'test me' |
+------------------------+
|                      0 | 
+------------------------+
1 row in set (0 sec)


Hmmm...so maybe column headers are being escaped by the MySQL/Drizzle client?  Clearly, the NUL character was escaped as the characters '\\' followed by the character '0' in the column header above.  Indeed, quite puzzling.


OK, so the above anomaly needs to be investigated.  However, a similar issue exists for the mysqltest/drizzletest client program.  To see the problem, check the following out.  I create a simple test case with the following in it:


--disable_warnings
DROP TABLE IF EXISTS t1;
--enable_warnings

SELECT 'test\0me';

CREATE TABLE t1 (fld BLOB NULL);
INSERT INTO t1 VALUES ('test\0me');
SELECT COUNT(*) FROM t1;
DROP TABLE t1;


Now, what you would expect to see for the output of the above &amp;mdash; at least if you expect results similar to the MySQL/Drizzle client output &amp;mdash; is the following:


DROP TABLE IF EXISTS t1;
SELECT 'test\0me';
test
test me
CREATE TABLE t1 (fld BLOB NULL);
INSERT INTO t1 VALUES ('test\0me');
SELECT COUNT(*) FROM t1;
COUNT(*)
1
DROP TABLE t1;


That is what you would expect to see in the output of course... Here is what you actually get in the output:


DROP TABLE IF EXISTS t1;
SELECT 'test\0me';
test
test


So, the mysqltest/drizzletest client apparently does not escape the NUL character for the result data at all.  It looks like it does do some escaping/replacing for the NUL character in the column header, though; otherwise the second &quot;test&quot; line would not appear.  This leads to the result file being truncated as soon as a NUL character is included in any output to the mysqltest/drizzletest client.  That makes the mysqltest/drizzletest client all but useless for testing and validating BLOB data.

Possible Solutions?

I think the cleanest solution would be to create a shared library of code that would be responsible for uniformly and consistently escaping data, then link the various clients (and server) with this library and remove all of the various escaping functions currently in the server.  This would, of course, take some time, but would be the most future-proof solution.  Anyone else have ideas on solving the problem of being able to test and validate binary data via the test suite?  Cheers!

</description>
    <content:encoded><![CDATA[<p>
This week, I am working on putting together test cases which validate the <a href="http://jpipes.com/index.php?/archives/299-Drizzle-Replication-The-Transaction-Log.html" title="Drizzle transaction log">Drizzle transaction log</a>'s handling of BLOB columns.
</p>
<p>
I ran into an interesting set of problems and am wondering how to go about handling them.  Perhaps the LazyWeb will have some solutions. <img src="http://jpipes.com/templates/default/img/emoticons/smile.png" alt=":-)" style="display: inline; vertical-align: bottom;" class="emoticon" />
</p>
<p>
The problem, in short, is inconsistency in the way that the NUL character is escaped (or not escaped) in both the MySQL/Drizzle protocol and the MySQL/Drizzle client tools.  And by client tools, I mean both everyone's favourite little mysql command-line client and the mysqltest client, which provides infrastructure and runtime services for the MySQL and Drizzle test suites.
</p>
<p>
Even within the server and client protocol, there appears to be some inconsistency in how and when things are escaped.  Take a look at this interesting output from the drizzle client program (FYI, output is identical for mysql client, I checked...)
</p>
<pre>
drizzle> select 'test\0me';
+---------+
| test    |
+---------+
| test me | 
+---------+
1 row in set (0 sec)
</pre>
<p>
You'll notice that in the first <tt>SELECT</tt> statement, the column header is cut off &mdash; i.e. the column header is not escaping the <tt>\0</tt> NUL character in the string <tt>'test\0me'</tt>.  However, the result data <strong><em>does not</em></strong> truncate the string but <em>replaces</em> the NUL character with a space character.  So, I came to the conclusion that the drizzle client does not escape column headers but does do some sort of escaping for the result data. Given this conclusion, you will understand my raised eyebrow when the following <tt>SELECT</tt> statement was displayed:
</p>
<pre>
drizzle> select 'test\0me' = 'test me';
+------------------------+
| 'test\0me' = 'test me' |
+------------------------+
|                      0 | 
+------------------------+
1 row in set (0 sec)
</pre>
<p>
Hmmm...so maybe column headers <em>are</em> being escaped by the MySQL/Drizzle client?  Clearly, the NUL character was escaped as the characters '\\' followed by the character '0' in the column header above.  Indeed, quite puzzling.
</p>
<p>
OK, so the above anomaly needs to be investigated.  However, a similar issue exists for the mysqltest/drizzletest client program.  To see the problem, check the following out.  I create a simple test case with the following in it:
</p>
<pre>
--disable_warnings
DROP TABLE IF EXISTS t1;
--enable_warnings

SELECT 'test\0me';

CREATE TABLE t1 (fld BLOB NULL);
INSERT INTO t1 VALUES ('test\0me');
SELECT COUNT(*) FROM t1;
DROP TABLE t1;
</pre>
<p>
Now, what you <strong>would expect to see</strong> for the output of the above &mdash; at least if you expect results similar to the MySQL/Drizzle client output &mdash; is the following:
</p>
<pre>
DROP TABLE IF EXISTS t1;
SELECT 'test\0me';
test
test me
CREATE TABLE t1 (fld BLOB NULL);
INSERT INTO t1 VALUES ('test\0me');
SELECT COUNT(*) FROM t1;
COUNT(*)
1
DROP TABLE t1;
</pre>
<p>
That is what you would <em>expect</em> to see in the output of course... Here is what you <em>actually</em> get in the output:
</p>
<pre>
DROP TABLE IF EXISTS t1;
SELECT 'test\0me';
test
test
</pre>
<p>
So, the mysqltest/drizzletest client apparently does not escape the NUL character for the <strong>result data</strong> at all.  It looks like it does do some escaping/replacing for the NUL character in the column header, though; otherwise the second "test" line would not appear.  This leads to the result file being truncated as soon as a NUL character is included in any output to the mysqltest/drizzletest client.  That makes the mysqltest/drizzletest client all but useless for testing and validating BLOB data.
</p>
<h2>Possible Solutions?</h2>
<p>
I think the cleanest solution would be to create a shared library of code that would be responsible for uniformly and consistently escaping data, then link the various clients (and server) with this library and remove all of the various escaping functions currently in the server.  This would, of course, take some time, but would be the most future-proof solution.  Anyone else have ideas on solving the problem of being able to test and validate binary data via the test suite?  Cheers!
</p>
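<p>
One possible stopgap for the test suite, offered as a sketch rather than something taken from the test run above: have the test case select a hex encoding of the binary value, so that the result file only ever contains printable characters.  This assumes the server under test provides a <tt>HEX()</tt> function, as MySQL does; whether a given Drizzle build exposes one is an assumption to verify.
</p>
<pre>
# Hypothetical workaround: compare a hex dump instead of the raw bytes.
CREATE TABLE t1 (fld BLOB NULL);
INSERT INTO t1 VALUES ('test\0me');
# The NUL byte should show up as 00 between the byte codes of 'test' and 'me'.
SELECT HEX(fld) FROM t1;
DROP TABLE t1;
</pre>
<p>
With MySQL's default handling of backslash escapes, the value returned for this row is <tt>74657374006D65</tt>, which survives the client untruncated.  This sidesteps the escaping inconsistency for test purposes only; it does not fix it.
</p>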
<br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22075&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22075&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Wed, 04 Nov 2009 16:11:00 +0000</pubDate>
    <dc:creator>Jay Pipes</dc:creator>
    <category>MySQL</category>
    <category>Drizzle</category>
    <category>C/C++</category>
  </item>

  <item>
    <title>Improving the Performance of your Java-based MySQL Applications (Giving a Webinar on Nov 5th at 10:00 AM Pacific/1:00 PM Eastern)</title>
    <guid isPermaLink="false">http://www.jroller.com/mmatthews/entry/improving_the_performance_of_your</guid>
    <link>http://www.jroller.com/mmatthews/entry/improving_the_performance_of_your</link>
    <description>Tomorrow (November 5th) at 10:00 AM Pacific, I&amp;#8217;ll be giving a webinar version of a well-received presentation from last year&amp;#8217;s Users Conference. It covers how to configure MySQL Connector/J to deliver the best performance for your Java application running on MySQL. The session is interactive, and we&amp;#8217;ll be answering questions, so bring those along!

	Come spend your morning/afternoon break with me and learn a few new tricks for making your applications fly. The information to sign up is at https://www.mysql.com/news-and-events/web-seminars/display-460.html. Hope to see you there!</description>
    <content:encoded><![CDATA[<p>Tomorrow (November 5th) at 10:00 AM Pacific, I&#8217;ll be giving a webinar version of a well-received presentation from last year&#8217;s Users Conference. It covers how to configure MySQL Connector/J to deliver the best performance for your Java application running on MySQL. The session is interactive, and we&#8217;ll be answering questions, so bring those along!</p>

	<p>Come spend your morning/afternoon break with me and learn a few new tricks for making your applications fly. The information to sign up is at <a href="https://www.mysql.com/news-and-events/web-seminars/display-460.html">https://www.mysql.com/news-and-events/web-seminars/display-460.html</a>. Hope to see you there!</p><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22074&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22074&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Wed, 04 Nov 2009 14:17:58 +0000</pubDate>
    <dc:creator>Mark Matthews</dc:creator>
    <category>MySQL</category>
    <category>java</category>
    <category>mysql</category>
    <category>performance</category>
  </item>

  <item>
    <title>Four short links: 4 November 2009</title>
    <guid isPermaLink="false">tag:radar.oreilly.com,2009://57.38408</guid>
    <link>http://feedproxy.google.com/~r/oreilly/radar/atom/~3/0IQP5FMA_uk/four-short-links-4-november-20.html</link>
    <description>
ChipHacker -- collaborative FAQ site for electronics hacking.  Based on the same StackExchange software as RedMonk's FOSS FAQ for open source software.
Democracy Live -- BBC launch searchable coverage of parliamentary discussion, using speech-to-text.  One aspect we're particularly proud of is that we've managed to deliver good results for speech-to-text in Welsh, which, we're told, is unique. I think of this as the start of a They Work For You for video coverage.  I'd love to be able to scale this to local government coverage, which is disappearing as local newspapers turn into delivery mechanisms for real estate advertisements.
InfiniDB: Open Source Column Database -- hooks into MySQL, uses MySQL for SQL parsing, security, etc.  The commercial enterprise version has multi-server support (parallel scale-out).  (via Brian Aker)
Massive Online Analysis -- MOA is a framework for data stream mining. Includes tools for evaluation and a collection of machine learning algorithms. Related to the WEKA project, also written in Java, while scaling to more demanding problems. (via joshua on Delicious)



   
</description>
    <content:encoded><![CDATA[<p><ol>
<li><a href="http://chiphacker.com/">ChipHacker</a> -- collaborative FAQ site for electronics hacking.  Based on the same <a href="http://stackexchange.com/">StackExchange</a> software as RedMonk's <a href="http://fossfaq.com/">FOSS FAQ</a> for open source software.</li>
<li><a href="http://www.bbc.co.uk/blogs/aboutthebbc/2009/11/democracy-live.shtml">Democracy Live</a> -- BBC launch searchable coverage of parliamentary discussion, using speech-to-text.  <i>One aspect we're particularly proud of is that we've managed to deliver good results for speech-to-text in Welsh, which, we're told, is unique.</i> I think of this as the start of a <a href="http://theyworkforyou.com">They Work For You</a> for video coverage.  I'd love to be able to scale this to local government coverage, which is disappearing as local newspapers turn into delivery mechanisms for real estate advertisements.</li>
<li><a href="http://www.infinidb.org/resources/tech-articles/69-introducing-infinidb-from-calpont">InfiniDB: Open Source Column Database</a> -- hooks into MySQL, uses MySQL for SQL parsing, security, etc.  The commercial enterprise version has multi-server support (parallel scale-out).  (via <a href="http://krow.livejournal.com/675706.html">Brian Aker</a>)</li>
<li><a href="http://www.cs.waikato.ac.nz/~abifet/MOA/">Massive Online Analysis</a> -- <i>MOA is a framework for data stream mining. Includes tools for evaluation and a collection of machine learning algorithms. Related to the WEKA project, also written in Java, while scaling to more demanding problems.</i> (via <a href="http://delicious.com/joshua">joshua on Delicious</a>)</li>
</ol></p>

<br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22071&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22071&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Wed, 04 Nov 2009 11:00:00 +0000</pubDate>
    <dc:creator>Nat Torkington</dc:creator>
    <category>big data</category>
    <category>collective intelligence</category>
    <category>databases</category>
    <category>democracy</category>
    <category>gov2.0</category>
    <category>hardware</category>
    <category>maker</category>
    <category>open source</category>
  </item>

  <item>
    <title>Instant Relief from MySQL Reporting Queries: Incremental Updates</title>
    <guid isPermaLink="false">http://www.nicholasgoodman.com/bt/blog/2009/11/03/instant-relief-from-mysql-reporting-queries-incremental-updates/</guid>
    <link>http://www.nicholasgoodman.com/bt/blog/2009/11/03/instant-relief-from-mysql-reporting-queries-incremental-updates/</link>
    <description>Yesterday, I covered how you can do an initial &amp;#8220;replication&amp;#8221; of data from MySQL to DynamoDB and how this can improve performance and save storage space.  The follow-on question becomes:  
That&amp;#8217;s great, Nick.  But how do I keep my data up to date?
We&amp;#8217;ve got data in our Airline Performance dataset through 31-DEC-2007.  I loaded one year, all of 2007, for the previous example.  What happens when the FAA publishes their January 2008 results, and we&amp;#8217;ve loaded the new month&amp;#8217;s worth of data into MySQL?
MySQL:
select count(*) from otp.ontime; 8061223
select count(*) from ontime where FlightDate &amp;gt; '2007-12-31'; 605765
select count(*) from ontime where FlightDate &amp;lt;= '2007-12-31'; 7455458
DynamoDB:
select count(*) from FASTER."ontime"; 7455458 
So, we&amp;#8217;ve added approximately 600k new records to our source system that we don&amp;#8217;t have in our reporting system.  How do we incrementally insert these records and get just the 600k new rows into our DynamoDB reporting instance?
Easy Easy Easy.
We&amp;#8217;ve already done all the work; all we have to do is get the records we haven&amp;#8217;t processed yet!  It should take just a few minutes to get our current table &amp;#8220;up to date&amp;#8221; with the one over in MySQL.
DynamoDB:
select max("FlightDate") from FASTER."ontime";  2007-12-31
insert into FASTER."ontime" select * from MYSQL_SOURCE."ontime" where "FlightDate" &amp;gt; DATE '2007-12-31'; 605765
In other words, let&amp;#8217;s select from MySQL any records whose date is beyond what we have currently (2007-12-31).
select count(*) from FASTER."ontime";  8061223
select count(*) from FASTER."ontime" where "FlightDate" &amp;gt; DATE '2007-12-31';  605765
MySQL:
While the DynamoDB INSERT statement was running, the following SQL was being run on MySQL.
show processlist shows a SQL session with the following SQL:
SELECT * FROM `ontime` WHERE `FlightDate` &amp;gt; DATE '2007-12-31';
A single SQL statement (insert into select * from table where date &amp;gt; last time) has you up to date for reporting!  Long term, we may look to work with Tungsten to keep our data up to date using replication bin log records, but for now this simple pull-based approach does the job.</description>
    <content:encoded><![CDATA[<p>Yesterday, I covered how you can do an <a href="http://www.nicholasgoodman.com/bt/blog/2009/11/02/instant-relief-from-slow-mysql-reporting-queries-using-dynamodb/">initial &#8220;replication&#8221; of data from MySQL to DynamoDB</a> and how this can improve performance and save storage space.  The follow-on question becomes:  </p>
<p><strong>That&#8217;s great, Nick.  But how do I keep my data up to date?</strong></p>
<p>We&#8217;ve got data in our Airline Performance dataset through 31-DEC-2007.  I loaded one year, all of 2007, for the previous example.  What happens when the FAA publishes their January 2008 results, and we&#8217;ve loaded the new month&#8217;s worth of data into MySQL?</p>
<p>MySQL:</p>
<blockquote><p>select count(*) from otp.ontime; <strong>8061223</strong><br />
select count(*) from ontime where FlightDate &gt; '2007-12-31'; <strong>605765</strong><br />
select count(*) from ontime where FlightDate &lt;= '2007-12-31'; <strong>7455458</strong></p></blockquote>
<p>DynamoDB:</p>
<blockquote><p>select count(*) from FASTER."ontime"; <strong>7455458</strong> </p></blockquote>
<p>So, we&#8217;ve added approximately 600k new records to our source system that we don&#8217;t have in our reporting system.  How do we incrementally insert these records and get just the 600k new rows into our DynamoDB reporting instance?</p>
<p>Easy Easy Easy.</p>
<p>We&#8217;ve already done all the work; all we have to do is get the records we haven&#8217;t processed yet!  It should take just a few minutes to get our current table &#8220;up to date&#8221; with the one over in MySQL.</p>
<p>DynamoDB:</p>
<blockquote><p>select max("FlightDate") from FASTER."ontime";  <strong>2007-12-31</strong><br />
insert into FASTER."ontime" select * from MYSQL_SOURCE."ontime" where "FlightDate" &gt; DATE '2007-12-31'; <strong>605765</strong></p></blockquote>
<p>In other words, let&#8217;s select from MySQL any records whose date is beyond what we have currently (2007-12-31).</p>
<blockquote><p>select count(*) from FASTER."ontime";  <strong>8061223</strong><br />
select count(*) from FASTER."ontime" where "FlightDate" &gt; DATE '2007-12-31';  <strong>605765</strong></p></blockquote>
<p>MySQL:<br />
While the DynamoDB <strong>INSERT</strong> statement was running, the following SQL was being run on MySQL.</p>
<blockquote><p>show processlist shows a SQL session with the following SQL:<br />
SELECT * FROM `ontime` WHERE `FlightDate` &gt; DATE '2007-12-31';</p></blockquote>
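<p>The hard-coded date can also be replaced with a lookup of the current high-water mark, so the same statement can be rerun after each monthly load.  This is a sketch, not something run against the dataset above: it assumes DynamoDB accepts a scalar subquery in the WHERE clause, that the reporting table already holds the initial load (on an empty table the subquery yields NULL and nothing is inserted), and that no source rows arrive later carrying a FlightDate that has already been loaded, since those rows would be skipped.</p>
<blockquote><p>insert into FASTER."ontime"<br />
select * from MYSQL_SOURCE."ontime"<br />
where "FlightDate" &gt; (select max("FlightDate") from FASTER."ontime");</p></blockquote>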
<p>A single SQL statement (<strong>insert into select * from table where date &gt; last time</strong>) has you up to date for reporting!  Long term, we may look to work with Tungsten to keep our data up to date using replication bin log records, but for now this simple pull-based approach does the job.</p><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22068&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22068&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Wed, 04 Nov 2009 05:37:54 +0000</pubDate>
    <dc:creator>Nicholas Goodman</dc:creator>
    <category>DynamoBI</category>
  </item>

  <item>
    <title>[MySQL][Spider]Spider-2.8 released</title>
    <guid isPermaLink="false">tag:blogger.com,1999:blog-7870178081855084823.post-5365074278985071386</guid>
    <link>http://wild-growth.blogspot.com/2009/11/mysqlspiderspider-28-released.html</link>
    <description>I'm pleased to announce the release of Spider storage engine version 2.8 (beta). Spider is a storage engine for database sharding. http://spiderformysql.com/ The main changes in this version are as follows. - Added the table parameter &quot;link_status&quot;. You can change link_status using an &quot;alter table&quot; statement. Spider's link fault management is at the table level. Please see &quot;99_change_logs.txt&quot; in the download documents for other changes. Enjoy!</description>
    <content:encoded><![CDATA[I'm pleased to announce the release of Spider storage engine version 2.8 (beta).<br />Spider is a storage engine for database sharding.<br /><a href="http://spiderformysql.com/">http://spiderformysql.com/</a><br /><br />The main changes in this version are as follows.<br />- Added the table parameter "link_status".<br />&nbsp;&nbsp;You can change link_status using an "alter table" statement.<br />&nbsp;&nbsp;Spider's link fault management is at the table level.<br /><br />Please see "99_change_logs.txt" in the download documents for other changes.<br /><br />Enjoy!<br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22065&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22065&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Tue, 03 Nov 2009 22:13:00 +0000</pubDate>
    <dc:creator>Kentoku SHIBA</dc:creator>
    <category>MySQL</category>
    <category>Spider</category>
  </item>

  <item>
    <title>Hello!</title>
    <guid isPermaLink="false">http://www.facebook.com/note.php?note_id=173781870932</guid>
    <link>http://www.facebook.com/note.php?note_id=173781870932</link>
    <description>I am Domas Mituzas, and I have just joined the Facebook MySQL team. I previously worked at MySQL Support, and did Wikipedia data and performance engineering in my free time - and I've blogged about that a bit. 

Here at Facebook I've started working on large-scale database deployment introspection -  we want to know that everything churns happily, and knowing what is wrong and why it is wrong is the first step to improving the mental and emotional state of our servers.

Now, fetch me a few thousand scalpels, before I grab my axe... ;-)</description>
    <content:encoded><![CDATA[I am Domas Mituzas, and I have just joined the Facebook MySQL team. I previously worked at MySQL Support, and did Wikipedia data and performance engineering in my free time - and <a href="http://mituzas.lt/">I've blogged about</a> that a bit. 

Here at Facebook I've started working on large-scale database deployment introspection -  we want to know that everything churns happily, and knowing what is wrong and why it is wrong is the first step to improving the mental and emotional state of our servers.

Now, fetch me a few thousand scalpels, before I grab my axe... ;-)<br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22063&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22063&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Tue, 03 Nov 2009 21:39:33 +0000</pubDate>
    <dc:creator>Mark Callaghan</dc:creator>
  </item>

  <item>
    <title>I’m a Postgres user, as it turns out</title>
    <guid isPermaLink="false">http://www.xaprb.com/blog/?p=1409</guid>
    <link>http://www.xaprb.com/blog/2009/11/03/im-a-postgres-user-as-it-turns-out/</link>
    <description>Someone recently posted this to an email list as a sample of an interesting SHOW INNODB STATUS output:

mysql&amp;gt; SHOW ENGINE INNODB STATUS\G
          _______  _______
|\     /|(  ____ \(  ____ \
| )   ( || (    \/| (    \/
| |   | || (_____ | (__
| |   | |(_____  )|  __)
| |   | |      ) || (
| (___) |/\____) || (____/\
(_______)\_______)(_______/

 _______  _______  _______ _________ _______  _______  _______  _______
(  ____ )(  ___  )(  ____ \\__   __/(  ____ \(  ____ )(  ____ \(  ____ \
| (    )|| (   ) || (    \/   ) (   | (    \/| (    )|| (    \/| (    \/
| (____)|| |   | || (_____    | |   | |      | (____)|| (__    | (_____
|  _____)| |   | |(_____  )   | |   | | ____ |     __)|  __)   (_____  )
| (      | |   | |      ) |   | |   | | \_  )| (\ (   | (            ) |
| )      | (___) |/\____) |   | |   | (___) || ) \ \__| (____/\/\____) |
|/       (_______)\_______)   )_(   (_______)|/   \__/(_______/\_______) 

I thought it was worth trying out, so I gave it a shot:

mysql&amp;gt; use postgres
ERROR 1049 (42000): Unknown database 'postgres'


Clearly I just need to create the database.  Short work:

mysql&amp;gt; create database postgres;
Query OK, 1 row affected (0.00 sec)

mysql&amp;gt; use postgres
Database changed


So now I&amp;#8217;m using Postgres.  I still feel like I&amp;#8217;m missing something, though.  It feels a lot like reading XKCD comics.  Where&amp;#8217;s the tooltip?

</description>
    <content:encoded><![CDATA[<p>Someone recently posted this to an email list as a sample of an interesting SHOW INNODB STATUS output:</p>

<code><pre title="use mariadb? use drizzle? drop database oracle?">mysql&gt; SHOW ENGINE INNODB STATUS\G
          _______  _______
|\     /|(  ____ \(  ____ \
| )   ( || (    \/| (    \/
| |   | || (_____ | (__
| |   | |(_____  )|  __)
| |   | |      ) || (
| (___) |/\____) || (____/\
(_______)\_______)(_______/

 _______  _______  _______ _________ _______  _______  _______  _______
(  ____ )(  ___  )(  ____ \\__   __/(  ____ \(  ____ )(  ____ \(  ____ \
| (    )|| (   ) || (    \/   ) (   | (    \/| (    )|| (    \/| (    \/
| (____)|| |   | || (_____    | |   | |      | (____)|| (__    | (_____
|  _____)| |   | |(_____  )   | |   | | ____ |     __)|  __)   (_____  )
| (      | |   | |      ) |   | |   | | \_  )| (\ (   | (            ) |
| )      | (___) |/\____) |   | |   | (___) || ) \ \__| (____/\/\____) |
|/       (_______)\_______)   )_(   (_______)|/   \__/(_______/\_______) </pre></code>

<p>I thought it was worth trying out, so I gave it a shot:</p>

<code><pre>mysql&gt; use postgres
ERROR 1049 (42000): Unknown database 'postgres'
</pre></code>

<p>Clearly I just need to create the database.  Short work:</p>

<code><pre>mysql&gt; create database postgres;
Query OK, 1 row affected (0.00 sec)

mysql&gt; use postgres
Database changed
</pre></code>

<p>So now I&#8217;m using Postgres.  I still feel like I&#8217;m missing something, though.  It feels a lot like reading <a href="http://xkcd.com/">XKCD</a> comics.  Where&#8217;s the tooltip?</p>

<br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22061&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22061&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Tue, 03 Nov 2009 20:05:47 +0000</pubDate>
    <dc:creator>Baron Schwartz (xaprb)</dc:creator>
    <category>PostgreSQL</category>
    <category>SQL</category>
    <category>humor</category>
    <category>innodb</category>
    <category>mysql</category>
  </item>

  <item>
    <title>451 CAOS Links 2009.11.03</title>
    <guid isPermaLink="false">http://blogs.the451group.com/opensource/?p=1271</guid>
    <link>http://feedproxy.google.com/~r/451opensource/~3/kgDORtzhhlk/</link>
    <description>Yahoo! Open! Sources! Traffic! Server! Funding for 10gen. And more.
Follow 451 CAOS Links live @caostheory on Twitter and Identi.ca
&amp;#8220;Tracking the open source news wires, so you don&amp;#8217;t have to.&amp;#8221;
For the latest on Oracle&amp;#8217;s acquisition of MySQL via Sun, see Everything you always wanted to know about MySQL but were afraid to ask
# Yahoo! Open! Sourced! Traffic! Server! 
# Red Hat launched Enterprise Virtualization for Servers for managing Linux and Microsoft Windows servers. 
# 10gen, the company behind MongoDB, has raised $3.4m in a second round of funding. 
# Talend updated its Open Profiler and Data Quality products. 
# Tata Communications partnered with SugarCRM to provide on-demand CRM to customers in India. 
# eZ Systems has appointed former IBM and BEA executive Christoph Rau as CEO. 
# Bob Sutor offered his preliminary thoughts on starting an open source business.
# REvolution R Enterprise 3.0 is now available featuring R productivity environment for Windows. 
# Skype confirmed it is creating an open-source Linux client. 
# Monty Widenius claimed that MySQL&amp;#8217;s main rival has always been Oracle.
# Scalix has delivered Scalix Starter Packs, providing email and calendaring for home and small office use. 
# Open source ad server vendor OpenX announced a partnership with Microsoft. 
# Glyn Moody reported that the EU wants to re-define “closed” as “nearly open”.
# Munich&amp;#8217;s use of Mozilla software surpassed Mozilla&amp;#8217;s expectations. 
# Matt Asay reported on why open source, like football/soccer, is about big value, not big money. 
# Mike Hogan examined how to protect FOSS-related revenues in an era of cloud services, focusing on Amazon RDS/MySQL.
</description>
    <content:encoded><![CDATA[<p>Yahoo! Open! Sources! Traffic! Server! Funding for 10gen. And more.</p>
<p>Follow 451 CAOS Links live @caostheory on <a href="http://twitter.com/caostheory">Twitter</a> and <a href="http://identi.ca/caostheory">Identi.ca</a><br />
<em>&#8220;Tracking the open source news wires, so you don&#8217;t have to.&#8221;</em></p>
<p>For the latest on Oracle&#8217;s acquisition of MySQL via Sun, see <a href="http://blogs.the451group.com/opensource/2009/10/26/everything-you-always-wanted-to-know-about-mysql-but-were-afraid-to-ask/">Everything you always wanted to know about MySQL but were afraid to ask</a></p>
<p># Yahoo! <a href="http://bit.ly/3cOAGC">Open! Sourced!</a> Traffic! Server! </p>
<p># Red Hat <a href="http://bit.ly/uYgrq">launched</a> Enterprise Virtualization for Servers for managing Linux and Microsoft Windows servers. </p>
<p># 10gen, the company behind MongoDB, has <a href="http://bit.ly/3uPCLF">raised</a> $3.4m in a second round of funding. </p>
<p># Talend <a href="http://bit.ly/W6K3R">updated</a> its Open Profiler and Data Quality products. </p>
<p># Tata Communications <a href="http://bit.ly/SO9vB">partnered</a> with SugarCRM to provide on-demand CRM to customers in India. </p>
<p># eZ Systems has <a href="http://bit.ly/31RMEh">appointed</a> former IBM and BEA executive Christoph Rau as CEO. </p>
<p># Bob Sutor <a href="http://bit.ly/4gdypg">offered</a> his preliminary thoughts on starting an open source business.</p>
<p># REvolution R Enterprise 3.0 is <a href="http://bit.ly/ALNZC">now available</a> featuring R productivity environment for Windows. </p>
<p># Skype <a href="http://bit.ly/NKVtY">confirmed</a> it is creating an open-source Linux client. </p>
<p># Monty Widenius <a href="http://bit.ly/1DMAAl">claimed</a> that MySQL&#8217;s main rival has always been Oracle.</p>
<p># Scalix has <a href="http://bit.ly/337pUe">delivered</a> Scalix Starter Packs, providing email and calendaring for home and small office use. </p>
<p># Open source ad server vendor OpenX <a href="http://bit.ly/2t3k27">announced</a> a partnership with Microsoft. </p>
<p># Glyn Moody <a href="http://bit.ly/SEqKN">reported</a> that the EU wants to re-define “closed” as “nearly open”.</p>
<p># Munich&#8217;s use of Mozilla software <a href="http://bit.ly/MHq8E1">surpassed</a> Mozilla&#8217;s expectations. </p>
<p># Matt Asay <a href="http://bit.ly/uWzpy">reported</a> on why open source, like football/soccer, is about big value, not big money. </p>
<p># Mike Hogan <a href="http://bit.ly/4wKB2l">examined</a> how to protect FOSS-related revenues in an era of cloud services, focusing on Amazon RDS/MySQL.</p>
<img src="http://feeds.feedburner.com/~r/451opensource/~4/kgDORtzhhlk" height="1" width="1" /><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22062&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22062&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Tue, 03 Nov 2009 19:52:28 +0000</pubDate>
    <dc:creator>The 451 Group</dc:creator>
    <category>Links</category>
    <category>Software</category>
    <category>10gen</category>
    <category>451 group</category>
    <category>451caostheory</category>
    <category>451group</category>
    <category>amazon relational database service</category>
    <category>bob sutor</category>
    <category>caostheory</category>
    <category>Christoph Rau</category>
    <category>enterprise virtualization for servers</category>
    <category>eZ Systems</category>
    <category>Linux</category>
    <category>matt aslett</category>
    <category>mattaslett</category>
    <category>matthew aslett</category>
    <category>matthewaslett</category>
    <category>mongodb</category>
    <category>monty widenius</category>
    <category>m</category>
  </item>

  <item>
    <title>Drizzle, InfiniDB, Column Oriented Storage</title>
    <guid isPermaLink="false">http://krow.livejournal.com/675706.html</guid>
    <link>http://krow.livejournal.com/675706.html</link>
    <description>I have been asked a number of times &quot;do you think there is a need for a column oriented database in the open source world?&quot; The answer has been yes! Users and vendors have asked me this question a number of times. The problem has been that most of the vendors were interested in creating closed source solutions around either Drizzle/MySQL, or did their work in a way that made serious modifications to the backend (aka... made poor use of the storage engine interface). For these reasons I have not really found myself all that thrilled to work with what has been out there. Also, I would often find that the commitment to open source was either lukewarm or &quot;we will do it, once we have some traction...&quot;. My response to that? &quot;Tell me more when you open source it. I'll see if it will work.&quot; For this reason I was very happy to see Calpont do their release of InfiniDB last week. So as of this weekend? We have a project to use their engine with Drizzle. InfiniDB makes use of the storage engine interface I worked on for MySQL, which is a subset of the interface we have built for Drizzle. We have had several engines ported already, but this will be the first column oriented engine we will have ported to Drizzle. Building in different engines beyond the basic transactional engines is fun, because we get to see how the design stretches to fit additional needs. The core of Drizzle stays the same, but the micro-kernel nature of our design allows for others to expand the reach of where Drizzle can be used. Padraig started working on the engine on Friday and had it loading by the end of the weekend. It should be fun to see what additional enhancements we can do out of the box with Infini engine :)</description>
    <content:encoded><![CDATA[I have been asked a number of times "do you think there is a need for a column oriented database in the open source world?"<br /><br />The answer has been yes!<br /><br />Users and vendors have asked me this question a number of times. The problem has been that most of the vendors were interested in creating closed source solutions around either Drizzle/MySQL, or did their work in a way that made serious modifications to the backend (aka... made poor use of the storage engine interface).<br /><br />For these reasons I have not really found myself all that thrilled to work with what has been out there. Also, I would often find that the commitment to open source was either lukewarm or "we will do it, once we have some traction...".<br /><br />My response to that? "Tell me more when you open source it. I'll see if it will work."<br /><br />For this reason I was very happy to see Calpont do their release of <a href="http://www.infinidb.org/resources/tech-articles/69-introducing-infinidb-from-calpont">InfiniDB</a> last week.<br /><br />So as of this weekend?<br /><br />We have a <a href="http://drizzle.org/wiki/Calpont_InfiniDB_in_Drizzle">project</a> to use their engine with <a href="http://drizzle.org/">Drizzle</a>. InfiniDB makes use of the storage engine interface I worked on for MySQL, which is a subset of the interface we have built for Drizzle. We have had several engines ported already, but this will be the first column oriented engine we will have ported to Drizzle.<br /><br />Building in different engines beyond the basic transactional engines is fun, because we get to see how the design stretches to fit additional needs. The core of Drizzle stays the same, but the micro-kernel nature of our design allows for others to expand the reach of where Drizzle can be used. 
<a href="http://posulliv.com/">Padraig</a> started working on the engine on Friday and had it loading by the end of the weekend.<br /><br />It should be fun to see what additional enhancements we can do out of the box with Infini engine :)<br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22060&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22060&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Tue, 03 Nov 2009 17:44:13 +0000</pubDate>
    <dc:creator>Brian Aker</dc:creator>
  </item>

  <item>
    <title>Choosing the right page size, part 2</title>
    <guid isPermaLink="false">http://www.facebook.com/note.php?note_id=173535925932</guid>
    <link>http://www.facebook.com/note.php?note_id=173535925932</link>
    <description>InnoDB uses a 16kb page by default. I want to know whether performance improves with an 8kb database page for my workload. Two servers were set up to run a mirror of the production database workload. One used 8kb InnoDB pages and the other used 16kb.

It isn't clear that my performance will improve with 8kb pages. But my results are a function of my workload and data. I expect that 8kb pages will be much better for others. For example, when your data access pattern is uniform and most data is fetched by primary key, then smaller pages can be better as less space is wasted in the buffer pool per active row. However, a smaller page size will also waste more space from fragmentation for LOB columns. Peak IOPs on spinning disk are similar for 8kb pages and 16kb pages with random IO bound workloads. But when using flash, peak IOPs for 8kb pages are much higher than 16kb pages. So you, or your consultant, have interesting work to do if you want to evaluate this.

From the results below:

Pages created is ~3X higher with 8kb pages. I think there is much more fragmentation from LOB columns.
Transfer rates (rsec/s and wsec/s) are much higher for 16kb pages.
IOPs (r/s and w/s) are slightly lower for 16kb pages.


iostat data

These are average values from iostat over 2 hours:

page size    r/s   rsec/s  w/s    wsec/s
8kb          300    5445   543     8807
16kb         279    9288   516    11372
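
The iostat averages can be turned into an average request size, which makes the transfer-rate gap easier to read. This is a minimal sketch, assuming that iostat reports rsec/s and wsec/s in 512-byte sectors; the names SECTOR_BYTES, IOSTAT and avg_request_kb are made up for the illustration.

```ruby
# Assumption: iostat reports rsec/s and wsec/s in 512-byte sectors.
SECTOR_BYTES = 512

# Averages copied from the iostat table above.
IOSTAT = {
  '8kb'  => { :r => 300, :rsec => 5445, :w => 543, :wsec => 8807 },
  '16kb' => { :r => 279, :rsec => 9288, :w => 516, :wsec => 11372 }
}

# Average KB moved per request: sectors per second times the sector size,
# divided by requests per second.
def avg_request_kb(sectors_per_sec, requests_per_sec)
  (sectors_per_sec * SECTOR_BYTES / requests_per_sec.to_f / 1024).round(1)
end

IOSTAT.each do |page_size, s|
  printf("%-5s read %.1f KB/request, write %.1f KB/request\n",
         page_size,
         avg_request_kb(s[:rsec], s[:r]),
         avg_request_kb(s[:wsec], s[:w]))
end
```

Under that assumption the average read is about 9.1 KB with 8kb pages and about 16.6 KB with 16kb pages, so each read moves close to one page, and the 16kb server moves nearly twice the bytes for a similar number of reads.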


SHOW INNODB STATUS

This is data from SHOW INNODB STATUS over more than 24 hours:

page size   Pages read   Pages created   Pages written 
8kb         46679348      1736637        37016281
16kb        38837234       603852        27099815 
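
The ratios behind the summary above can be checked directly from these counters. This is a rough sketch; the names PAGES, page_ratio and volume_gb are made up for the illustration, and the volume figure simply multiplies a page counter by the page size.

```ruby
# Counters copied from the SHOW INNODB STATUS table above, keyed by page size in KB.
PAGES = {
  8  => { :read => 46_679_348, :created => 1_736_637, :written => 37_016_281 },
  16 => { :read => 38_837_234, :created => 603_852,   :written => 27_099_815 }
}

# Ratio of a counter on the 8kb server to the same counter on the 16kb server.
def page_ratio(counter)
  (PAGES[8][counter] / PAGES[16][counter].to_f).round(2)
end

# Data volume implied by a page counter, in GB (pages times page size in KB).
def volume_gb(page_kb, counter)
  (PAGES[page_kb][counter] * page_kb / (1024.0 * 1024)).round(1)
end

[:read, :created, :written].each do |counter|
  printf("%-8s 8kb/16kb page ratio: %.2f\n", counter, page_ratio(counter))
end
printf("read volume: 8kb %.1f GB, 16kb %.1f GB\n",
       volume_gb(8, :read), volume_gb(16, :read))
```

The pages created ratio comes out at about 2.88, which is the ~3X quoted above. The 8kb server reads about 1.2 times as many pages, yet the implied read volume is lower (about 356 GB against about 593 GB).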
</description>
    <content:encoded><![CDATA[InnoDB uses a 16kb page by default. I want to know whether performance improves with an 8kb database page for my workload. Two servers were set up to run a mirror of the production database workload. One used 8kb InnoDB pages and the other used 16kb.

It isn't clear that my performance will improve with 8kb pages. But my results are a function of my workload and data. I expect that 8kb pages will be much better for others. For example, when your data access pattern is uniform and most data is fetched by primary key, then smaller pages can be better as less space is wasted in the buffer pool per active row. However, a smaller page size will also waste more space from fragmentation for LOB columns. Peak IOPs on spinning disk are similar for 8kb pages and 16kb pages with random IO bound workloads. But when using flash, peak IOPs for 8kb pages are much higher than 16kb pages. So you, or your consultant, have interesting work to do if you want to evaluate this.

From the results below:
<ul>
<li>Pages created is ~3X higher with 8kb pages. I think there is much more fragmentation from LOB columns.
<li>Transfer rates (rsec/s and wsec/s) are much higher for 16kb pages.
<li>IOPs (r/s and w/s) are slightly lower for 16kb pages.
</ul>

<h1>iostat data</h1>

These are average values from iostat over 2 hours:
<pre>
page size    r/s   rsec/s  w/s    wsec/s
8kb          300    5445   543     8807
16kb         279    9288   516    11372
</pre>

<h1>SHOW INNODB STATUS</h1>

This is data from <b>SHOW INNODB STATUS</b> over more than 24 hours:
<pre>
page size   Pages read   Pages created   Pages written 
8kb         46679348      1736637        37016281
16kb        38837234       603852        27099815 
</pre><br/>PlanetMySQL Voting:
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22059&vote=1&apivote=1">Vote UP</a> /
	 <a href="http://planet.mysql.com/entry/vote/?entry_id=22059&vote=-1&apivote=1">Vote DOWN</a>]]></content:encoded>
    <pubDate>Tue, 03 Nov 2009 15:48:39 +0000</pubDate>
    <dc:creator>Mark Callaghan</dc:creator>
  </item>

  <item>
    <title>DB Charmer – ActiveRecord Connection Magic Plugin</title>
    <guid isPermaLink="false">http://kovyrin.net/?p=324</guid>
    <link>http://feedproxy.google.com/~r/Homo-Adminus/~3/_bkc3hWZQr0/</link>
    <description>Today I&amp;#8217;m proud to announce the first public release of our ActiveRecord database connection magic plugin: DbCharmer. 

DB Charmer &amp;#8211; ActiveRecord Connection Magic Plugin
DbCharmer is a simple yet powerful plugin for ActiveRecord that does a few things:

Allows you to easily manage AR models&amp;#8217; connections (switch_connection_to method)
Allows you to switch AR models&amp;#8217; default connections to separate servers/databases
Allows you to easily choose where your query should go (on_* methods family)
Allows you to automatically send read queries to your slaves while the masters handle all the updates
Adds multi-database migrations to ActiveRecord


Installation
There are two options when approaching db-charmer installation:

using the gem (recommended)
install as a Rails plugin

To install as a gem, add this to your environment.rb:
config.gem 'db-charmer', :lib =&amp;gt; 'db_charmer', 
&amp;nbsp; &amp;nbsp; :source =&amp;gt; 'http://gemcutter.org'
And then run the command:
sudo rake gems:install
To install db-charmer as a Rails plugin use this:
script/plugin install git://github.com/kovyrin/db-charmer.git
Easy ActiveRecord Connection Management
As a part of this plugin we&amp;#8217;ve added a switch_connection_to method that accepts many different kinds of db connection specifications and uses them on a model. We support:

Strings and symbols as the names of connection configuration blocks in database.yml.
ActiveRecord models (we&amp;#8217;d use connection currently set up on a model).
Database connections (Model.connection)
Nil values to reset model to default connection.

Sample code:
class Foo &amp;lt; ActiveRecord::Base; end

Foo.switch_connection_to&amp;#40;:blah&amp;#41;
Foo.switch_connection_to&amp;#40;'foo'&amp;#41;
Foo.switch_connection_to&amp;#40;Bar&amp;#41;
Foo.switch_connection_to&amp;#40;Baz.connection&amp;#41;
Foo.switch_connection_to&amp;#40;nil&amp;#41;
The switch_connection_to method has an optional second parameter should_exist which is true by default. This parameter is used when the method is called with a string or a symbol connection name and there is no such connection configuration in the database.yml file. If this parameter is true, an exception would be raised, otherwise, the error would be ignored and no connection change would happen.
This is really useful when, in development mode or in tests, you do not want to create many different databases on your local machine and just want to put all your tables in a single database.
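The should_exist rule can be pictured with a few lines of plain Ruby. This is only an illustrative sketch, not the plugin's code: KNOWN_CONNECTIONS stands in for the contents of database.yml, and resolve_connection is a hypothetical helper name.

```ruby
# Illustrative sketch only. KNOWN_CONNECTIONS and resolve_connection are
# hypothetical names standing in for database.yml and the lookup step;
# this is not DbCharmer's actual implementation.
KNOWN_CONNECTIONS = { 'blah' => { 'adapter' => 'mysql', 'database' => 'blah_db' } }

def resolve_connection(name, should_exist = true)
  config = KNOWN_CONNECTIONS[name.to_s]
  return config if config
  raise ArgumentError, "unknown connection: #{name}" if should_exist
  nil # missing block tolerated: the caller keeps its current connection
end

resolve_connection(:blah)         # returns the 'blah' configuration hash
resolve_connection(:olap, false)  # returns nil, so nothing is switched
```

With the default of true, a typo in a connection name fails loudly. Passing false lets a development machine with a single database skip the missing entries.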
Warning: All the connection switching calls switch the connection only for the classes the method is called on. You can&amp;#8217;t call the switch_connection_to method and switch the connection for a base class in some hierarchy (for example, you can&amp;#8217;t switch the AR::Base connection and see all your models switched to the new connection; use the classic establish_connection instead).
Multiple DB Migrations
In every application that works with many databases, there is a need for a convenient schema migration mechanism.
All Rails users already have this mechanism &amp;#8211; Rails migrations. So in DbCharmer, we&amp;#8217;ve made it possible to seamlessly use multiple databases in Rails migrations.
There are two methods available in migrations to operate on more than one database:
1. Global connection change method &amp;#8211; used to switch whole migration to a non-default database.
2. Block-level connection change method &amp;#8211; could be used to do only a part of a migration on a non-default db.
Migration class example (global connection rewrite):
class MultiDbTest &amp;lt; ActiveRecord::Migration
&amp;nbsp; &amp;nbsp;db_magic :connection =&amp;gt; :second_db

&amp;nbsp; &amp;nbsp;def self.up
&amp;nbsp; &amp;nbsp; &amp;nbsp;create_table :test_table, :force =&amp;gt; true do |t|
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;t.string :test_string
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;t.timestamps
&amp;nbsp; &amp;nbsp; &amp;nbsp;end
&amp;nbsp; &amp;nbsp;end

&amp;nbsp; &amp;nbsp;def self.down
&amp;nbsp; &amp;nbsp; &amp;nbsp;drop_table :test_table
&amp;nbsp; &amp;nbsp;end
&amp;nbsp;end
Migration class example (block-level connection rewrite):
class MultiDbTest &amp;lt; ActiveRecord::Migration
&amp;nbsp; def self.up
&amp;nbsp; &amp;nbsp; on_db :second_db do
&amp;nbsp; &amp;nbsp; &amp;nbsp; create_table :test_table, :force =&amp;gt; true do |t|
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; t.string :test_string
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; t.timestamps
&amp;nbsp; &amp;nbsp; &amp;nbsp; end
&amp;nbsp; &amp;nbsp; end
&amp;nbsp; end

&amp;nbsp; def self.down
&amp;nbsp; &amp;nbsp; on_db :second_db &amp;#123; drop_table :test_table &amp;#125;
&amp;nbsp; end
end
By default, in development and test environments you can omit the :second_db connection from your database.yml files and Rails will create the tables in your single database, but in production you&amp;#8217;d specify it and get the table created on a separate server and/or in a separate database.
This behaviour is controlled by the DbCharmer.migration_connections_should_exist configuration attribute, which can be set from a Rails initializer.
Using Models in Master-Slave Environments
Master-slave replication is the most popular scale-out technique in medium-sized and large database-centric applications today. There are some Rails plugins out there that help developers use slave servers in their models, but none of them were flexible enough for us to start using them in the huge application we work on.
So, we&amp;#8217;ve been using the ActsAsReadonlyable plugin for a long time and have made tons of changes to its code over time. But since that plugin has been abandoned by its authors, we&amp;#8217;ve decided to collect all of our master-slave code in one plugin and release it for Rails 2.2+. DbCharmer adds the following features to Rails models:
Auto-Switching all Reads to the Slave(s)
When you create a model, you can use the db_magic :slave =&amp;gt; :blah or db_magic :slaves =&amp;gt; &amp;#91; :foo, :bar &amp;#93; commands in your model to set up read redirection, so that all your find/count/exist/etc methods read data from your slave (or from a bunch of slaves in a round-robin manner). Here is an example:
class Foo &amp;lt; ActiveRecord::Base
&amp;nbsp; db_magic :slave =&amp;gt; :slave01
end

class Bar &amp;lt; ActiveRecord::Base
&amp;nbsp; db_magic :slaves =&amp;gt; &amp;#91; :slave01, :slave02 &amp;#93;
end
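The round-robin behaviour mentioned above can be pictured with a small stand-alone class. This is a sketch under assumptions, not the plugin's implementation: SlaveRotation and next_slave are hypothetical names.

```ruby
# Illustrative sketch only: SlaveRotation is a hypothetical class, not part
# of DbCharmer, showing one way to hand out slaves in round-robin order.
class SlaveRotation
  def initialize(slaves)
    @slaves = slaves
    @position = 0
  end

  # Return the next slave name, wrapping around at the end of the list.
  def next_slave
    slave = @slaves[@position % @slaves.size]
    @position += 1
    slave
  end
end

rotation = SlaveRotation.new([:slave01, :slave02])
reads = Array.new(4) { rotation.next_slave }
# reads is [:slave01, :slave02, :slave01, :slave02]
```

Each read takes the next name in the list, so consecutive reads alternate between the two slaves.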
Default Connection Switching
If you have more than one master-slave cluster (or simply more than one database) in your database environment, then you might want to change the default database connection of some of your models. You can do that with a db_magic :connection =&amp;gt; :foo call in your models. Example:
class Foo &amp;lt; ActiveRecord::Base
&amp;nbsp; db_magic :connection =&amp;gt; :foo
end
Sample model on a separate master-slave cluster (so, separate main connection + a slave connection):
class Bar &amp;lt; ActiveRecord::Base
&amp;nbsp; db_magic :connection =&amp;gt; :bar, :slave =&amp;gt; :bar_slave
end
Per-Query Connection Management
Sometimes you have select queries that you know you want to run on the master. This can happen, for example, when you have just added some data and need to read it back but are not sure whether it has made it all the way to the slave yet. For this situation and a few others there is a set of methods we&amp;#8217;ve added to ActiveRecord models:
1) on_master &amp;#8211; this method can be used in two forms: block form and proxy form. In the block form you can force a connection switch for a block of code:
User.on_master do
&amp;nbsp; user = User.find_by_login&amp;#40;'foo'&amp;#41;
&amp;nbsp; user.update_attributes!&amp;#40;:activated =&amp;gt; true&amp;#41;
end
In the proxy form this method could be used to force one query to be performed on the master database server:
Comment.on_master.last&amp;#40;:limit =&amp;gt; 5&amp;#41;
User.on_master.find_by_activation_code&amp;#40;code&amp;#41;
User.on_master.exists?&amp;#40;:login =&amp;gt; login, :password =&amp;gt; password&amp;#41;
2) on_slave &amp;#8211; this method is used to force a query to be run on a slave even in situations when it&amp;#8217;s been previously forced to use the master. If there is more than one slave, one would be selected randomly. This method has two forms as well: block and proxy.
3) on_db&amp;#40;connection&amp;#41; &amp;#8211; this method is what makes the two previous methods possible. It is used to switch a model&amp;#8217;s connection to some db for a short block of code or even for one statement (two forms). It accepts the same range of values as the switch_connection_to method does. Example:
Comment.on_db&amp;#40;:olap&amp;#41;.count
Post.on_db&amp;#40;:foo&amp;#41;.find&amp;#40;:first&amp;#41;
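The block and proxy forms described above can be sketched in plain Ruby. FakeModel and OnDbProxy are hypothetical stand-ins written for this illustration; they track a connection name only and are not how DbCharmer itself is implemented.

```ruby
# Illustrative sketch only: FakeModel and OnDbProxy are hypothetical
# stand-ins, not DbCharmer classes.
class OnDbProxy
  def initialize(model, connection)
    @model = model
    @connection = connection
  end

  # Forward any call to the model while the connection is switched.
  def method_missing(name, *args)
    @model.on_db(@connection) { @model.public_send(name, *args) }
  end

  def respond_to_missing?(name, include_private = false)
    @model.respond_to?(name, include_private) || super
  end
end

class FakeModel
  @connection = :default

  def self.connection_name
    @connection
  end

  # Block form: switch for the duration of the block, then restore.
  # Proxy form (no block): return an object that switches around one call.
  def self.on_db(connection)
    return OnDbProxy.new(self, connection) unless block_given?
    previous = @connection
    @connection = connection
    begin
      yield
    ensure
      @connection = previous
    end
  end
end

FakeModel.on_db(:olap) { FakeModel.connection_name }  # block form
FakeModel.on_db(:olap).connection_name                # proxy form
```

Both calls see the :olap connection, and the model is back on :default afterwards because the ensure clause restores the previous value.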
Associations Connection Management
ActiveRecord models can have associations with each other, and since every model has its own database connection, it becomes pretty hard to manage connections in chained calls like User.posts.count. With class-only connection switching methods, this call would look like the following if we wanted to count posts on a separate database:
Post.on_db&amp;#40;:olap&amp;#41; &amp;#123; User.posts.count &amp;#125;
Clearly this is not the best way to write the code, so we&amp;#8217;ve implemented the on_* methods on associations as well, so you can do things like this:
@user.posts.on_db&amp;#40;:olap&amp;#41;.count
@user.posts.on_slave.find&amp;#40;:first, :conditions =&amp;gt; &amp;#123; :title =&amp;gt; 'Hello, world!' &amp;#125;&amp;#41;
Notice: Since ActiveRecord associations are implemented as proxies for the resulting objects/collections, it is possible to use our connection switching methods even without chained methods:
@post.user.on_slave # would return post's author
@photo.owner.on_slave # would return photo's owner
Starting with DbCharmer release 1.4 it is possible to use prefix notation for has_many and HABTM associations connection switching:
@user.on_db&amp;#40;:foo&amp;#41;.posts
@user.on_slave.posts
Named Scopes Support
To make it easier for DbCharmer users to use the connection switching methods with named scopes, we&amp;#8217;ve added on_* methods support on the scopes as well. All the following scope chains behave in exactly the same way (the query is executed on the :foo database connection):
Post.on_db&amp;#40;:foo&amp;#41;.published.with_comments.spam_marked.count
Post.published.on_db&amp;#40;:foo&amp;#41;.with_comments.spam_marked.count
Post.published.with_comments.on_db&amp;#40;:foo&amp;#41;.spam_marked.count
Post.published.with_comments.spam_marked.on_db&amp;#40;:foo&amp;#41;.count
And now, add this feature to our associations support and here is what we could do:
@user.on_db&amp;#40;:archive&amp;#41;.posts.published.all
@user.posts.on_db&amp;#40;:olap&amp;#41;.published.count
@user.posts.published.on_db&amp;#40;:foo&amp;#41;.first
Documentation
For more information on the plugin internals, please check out the source code. All the plugin&amp;#8217;s code is ~100% covered with tests that were placed in a separate staging Rails project located at github. The project has unit tests for all, or at least most, parts of the plugin&amp;#8217;s code.
What Ruby and Rails implementations does it work for?
We have continuous integration setups for this plugin on MRI 1.8.6 with Rails 2.2 and 2.3. We use the plugin in production on Scribd.com with MRI (rubyee) 1.8.6 and Rails 2.2.
Who are the authors?
This plugin was created at Scribd.com for our internal use and then the sources were opened for other people to use. All the code in this package has been developed by Alexey Kovyrin for Scribd.com and is released under the MIT license. For more details, see the LICENSE file.

If you have any comments on this project, feel free to contact me here in comments or by email. And, of course, patches are welcome (only when covered with tests).



  
</description>
    <content:encoded><![CDATA[<p>Today I&#8217;m proud to announce the first public release of <a href="http://www.scribd.com">our</a> ActiveRecord database connection magic plugin: <a href="http://github.com/kovyrin/db-charmer">DbCharmer</a>. </p>
<hr/>
<h2>DB Charmer &#8211; ActiveRecord Connection Magic Plugin</h2>
<p><tt>DbCharmer</tt> is a simple yet powerful plugin for ActiveRecord that does a few things:</p>
<ol>
<li>Allows you to easily manage AR models&#8217; connections (<code><span>switch_connection_to</span></code> method)
<li>Allows you to switch AR models&#8217; default connections to separate servers/databases
<li>Allows you to easily choose where your query should go (<code><span>on_<span>*</span></span></code> methods family)
<li>Allows you to automatically send read queries to your slaves while masters would handle all the updates.
<li>Adds multiple databases migrations to ActiveRecord
</ol>
<p><span></span></p>
<h2>Installation</h2>
<p>There are two options when approaching db-charmer installation:</p>
<ul>
<li>using the <a href="http://gemcutter.org/gems/db-charmer">gem</a> (recommended)
<li>install as a Rails plugin
</ul>
<p>To install as a <a href="http://gemcutter.org/gems/db-charmer">gem</a>, add this to your environment.rb:</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div>1<br />2<br /></div></td><td><div>config.<span>gem</span> <span>'db-charmer'</span>, <span>:lib</span> <span>=&gt;</span> <span>'db_charmer'</span>, <br />
&nbsp; &nbsp; <span>:source</span> <span>=&gt;</span> <span>'http://gemcutter.org'</span></div></td></tr></tbody></table></div>
<p>And then run the command:</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div>1<br /></div></td><td><div><span>sudo</span> rake gems:<span>install</span></div></td></tr></tbody></table></div>
<p>To install db-charmer as a Rails plugin use this:</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div>1<br /></div></td><td><div>script<span>/</span>plugin <span>install</span> git:<span>//</span>github.com<span>/</span>kovyrin<span>/</span>db-charmer.git</div></td></tr></tbody></table></div>
<h2>Easy ActiveRecord Connection Management</h2>
<p>As a part of this plugin we&#8217;ve added a <code><span>switch_connection_to</span></code> method that accepts many different kinds of db connection specifications and uses them on a model. We support:</p>
<ol>
<li>Strings and symbols as the names of connection configuration blocks in database.yml.
<li>ActiveRecord models (we&#8217;d use connection currently set up on a model).
<li>Database connections (<code><span>Model.<span>connection</span></span></code>)
<li>Nil values to reset model to default connection.
</ol>
<p>Sample code:</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div>1<br />2<br />3<br />4<br />5<br />6<br />7<br /></div></td><td><div><span>class</span> Foo <span>&lt;</span> <span>ActiveRecord::Base</span>; <span>end</span><br />
<br />
Foo.<span>switch_connection_to</span><span>&#40;</span><span>:blah</span><span>&#41;</span><br />
Foo.<span>switch_connection_to</span><span>&#40;</span><span>'foo'</span><span>&#41;</span><br />
Foo.<span>switch_connection_to</span><span>&#40;</span>Bar<span>&#41;</span><br />
Foo.<span>switch_connection_to</span><span>&#40;</span>Baz.<span>connection</span><span>&#41;</span><br />
Foo.<span>switch_connection_to</span><span>&#40;</span><span>nil</span><span>&#41;</span></div></td></tr></tbody></table></div>
<p>The <code><span>switch_connection_to</span></code> method has an optional second parameter <code><span>should_exist</span></code> which is true by default. This parameter is used when the method is called with a string or a symbol connection name and there is no such connection configuration in the database.yml file. If this parameter is <code><span><span>true</span></span></code>, an exception would be raised, otherwise, the error would be ignored and no connection change would happen.</p>
<p>This is really useful when, in development mode or in tests, you do not want to create many different databases on your local machine and just want to put all your tables in a single database.</p>
<p><b>Warning:</b> All the connection switching calls switch the connection <b>only</b> for the classes the method is called on. You can&#8217;t call the <code><span>switch_connection_to</span></code> method and switch the connection for a base class in some hierarchy (for example, you can&#8217;t switch the AR::Base connection and see all your models switched to the new connection; use the classic <code><span>establish_connection</span></code> instead).</p>
<h2>Multiple DB Migrations</h2>
<p>In every application that works with many databases, there is a need for a convenient schema migration mechanism.</p>
<p>All Rails users already have this mechanism &#8211; Rails migrations. So in <tt>DbCharmer</tt>, we&#8217;ve made it possible to seamlessly use multiple databases in Rails migrations.</p>
<p>There are two methods available in migrations to operate on more than one database:</p>
<p>1. Global connection change method &#8211; used to switch whole migration to a non-default database.<br />
2. Block-level connection change method &#8211; could be used to do only a part of a migration on a non-default db.</p>
<p>Migration class example (global connection rewrite):</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div>1<br />2<br />3<br />4<br />5<br />6<br />7<br />8<br />9<br />10<br />11<br />12<br />13<br />14<br /></div></td><td><div><span>class</span> MultiDbTest <span>&lt;</span> <span>ActiveRecord::Migration</span><br />
&nbsp; &nbsp;db_magic <span>:connection</span> <span>=&gt;</span> <span>:second_db</span><br />
<br />
&nbsp; &nbsp;<span>def</span> <span>self</span>.<span>up</span><br />
&nbsp; &nbsp; &nbsp;create_table <span>:test_table</span>, <span>:force</span> <span>=&gt;</span> <span>true</span> <span>do</span> <span>|</span>t<span>|</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;t.<span>string</span> <span>:test_string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp;t.<span>timestamps</span><br />
&nbsp; &nbsp; &nbsp;<span>end</span><br />
&nbsp; &nbsp;<span>end</span><br />
<br />
&nbsp; &nbsp;<span>def</span> <span>self</span>.<span>down</span><br />
&nbsp; &nbsp; &nbsp;drop_table <span>:test_table</span><br />
&nbsp; &nbsp;<span>end</span><br />
&nbsp;<span>end</span></div></td></tr></tbody></table></div>
<p>Migration class example (block-level connection rewrite):</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div>1<br />2<br />3<br />4<br />5<br />6<br />7<br />8<br />9<br />10<br />11<br />12<br />13<br />14<br /></div></td><td><div><span>class</span> MultiDbTest <span>&lt;</span> <span>ActiveRecord::Migration</span><br />
&nbsp; <span>def</span> <span>self</span>.<span>up</span><br />
&nbsp; &nbsp; on_db <span>:second_db</span> <span>do</span><br />
&nbsp; &nbsp; &nbsp; create_table <span>:test_table</span>, <span>:force</span> <span>=&gt;</span> <span>true</span> <span>do</span> <span>|</span>t<span>|</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; t.<span>string</span> <span>:test_string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; t.<span>timestamps</span><br />
&nbsp; &nbsp; &nbsp; <span>end</span><br />
&nbsp; &nbsp; <span>end</span><br />
&nbsp; <span>end</span><br />
<br />
&nbsp; <span>def</span> <span>self</span>.<span>down</span><br />
&nbsp; &nbsp; on_db <span>:second_db</span> <span>&#123;</span> drop_table <span>:test_table</span> <span>&#125;</span><br />
&nbsp; <span>end</span><br />
<span>end</span></div></td></tr></tbody></table></div>
<p>By default, in development and test environments you can omit the <code><span><span>:second_db</span></span></code> connection from your database.yml file and Rails will create the tables in your single default database. In production you would define the connection and get the table created on a separate server and/or in a separate database.</p>
<p>This behaviour is controlled by the <code><span>DbCharmer.<span>migration_connections_should_exist</span></span></code> configuration attribute, which can be set from a Rails initializer.</p>
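<p>The fallback described above can be sketched in plain Ruby. This is only an illustration and not the plugin&#8217;s actual code: the <code>migration_target</code> helper, the <code>must_exist</code> flag and the two hashes standing in for database.yml are all hypothetical names made up for this example.</p>

```ruby
# Hypothetical stand-ins for the connections defined in database.yml.
DEV_CONNECTIONS  = { :default => 'app_development' }
PROD_CONNECTIONS = { :default => 'app_production', :second_db => 'app_second' }

# Picks the database a migration should run against.
# If the named connection is not configured, either fail loudly
# (must_exist = true) or quietly fall back to the default connection.
def migration_target(connections, name, must_exist)
  return connections[name] if connections.key?(name)
  raise ArgumentError, "connection #{name.inspect} is not configured" if must_exist
  connections[:default]
end

dev_target  = migration_target(DEV_CONNECTIONS, :second_db, false)  # falls back to the default database
prod_target = migration_target(PROD_CONNECTIONS, :second_db, true)  # uses the separate database
```

<p>With the strict flag turned on, a missing connection raises an error instead of silently creating the table in the wrong database, which is probably the safer choice for production.</p>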
<h2>Using Models in Master-Slave Environments</h2>
<p>Master-slave replication is the most popular scale-out technique in medium-sized and large database-centric applications today. There are some Rails plugins out there that help developers use slave servers in their models, but none of them were flexible enough for us to start using them in the huge application we work on.</p>
<p>So, we&#8217;ve been using the ActsAsReadonlyable plugin for a long time and have made tons of changes to its code over time. But since that plugin has been abandoned by its authors, we&#8217;ve decided to collect all of our master-slave code in one plugin and release it for Rails 2.2+. <tt>DbCharmer</tt> adds the following features to Rails models:</p>
<h3>Auto-Switching all Reads to the Slave(s)</h3>
<p>When you create a model, you can use the <code><span>db_magic <span>:slave</span> <span>=&gt;</span> <span>:blah</span></span></code> or <code><span>db_magic <span>:slaves</span> <span>=&gt;</span> <span>&#91;</span> <span>:foo</span>, <span>:bar</span> <span>&#93;</span></span></code> commands in it to set up read redirection mode, in which all your find/count/exists?/etc. methods read data from your slave (or from a bunch of slaves in a round-robin manner). Here is an example:</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div><span>class</span> Foo <span>&lt;</span> <span>ActiveRecord::Base</span><br />
&nbsp; db_magic <span>:slave</span> <span>=&gt;</span> <span>:slave01</span><br />
<span>end</span><br />
<br />
<span>class</span> Bar <span>&lt;</span> <span>ActiveRecord::Base</span><br />
&nbsp; db_magic <span>:slaves</span> <span>=&gt;</span> <span>&#91;</span> <span>:slave01</span>, <span>:slave02</span> <span>&#93;</span><br />
<span>end</span></div></td></tr></tbody></table></div>
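<p>To make the round-robin idea concrete, here is a minimal standalone sketch. It does not use the plugin at all: the <code>SlaveRotation</code> class is a hypothetical helper invented for this example, and the real selection logic inside <tt>DbCharmer</tt> may well differ.</p>

```ruby
# Hypothetical helper: hands out slave names in round-robin order.
class SlaveRotation
  def initialize(slaves)
    raise ArgumentError, 'at least one slave is required' if slaves.empty?
    @slaves = slaves
    @next_index = 0
  end

  # Returns the next slave name and advances the cursor, wrapping around
  # to the first slave after the last one has been used.
  def next_slave
    slave = @slaves[@next_index]
    @next_index = (@next_index + 1) % @slaves.size
    slave
  end
end

rotation = SlaveRotation.new([:slave01, :slave02])
# Four consecutive reads alternate between the two slaves.
reads = Array.new(4) { rotation.next_slave }
```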
<h3>Default Connection Switching</h3>
<p>If you have more than one master-slave cluster (or simply more than one database) in your database environment, you might want to change the default database connection of some of your models. You can do that with a <code><span>db_magic <span>:connection</span> <span>=&gt;</span> <span>:foo</span></span></code> call in your models. Example:</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div><span>class</span> Foo <span>&lt;</span> <span>ActiveRecord::Base</span><br />
&nbsp; db_magic <span>:connection</span> <span>=&gt;</span> <span>:foo</span><br />
<span>end</span></div></td></tr></tbody></table></div>
<p>Sample model on a separate master-slave cluster (so, separate main connection + a slave connection):</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div><span>class</span> Bar <span>&lt;</span> <span>ActiveRecord::Base</span><br />
&nbsp; db_magic <span>:connection</span> <span>=&gt;</span> <span>:bar</span>, <span>:slave</span> <span>=&gt;</span> <span>:bar_slave</span><br />
<span>end</span></div></td></tr></tbody></table></div>
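<p>How the two options interact can be shown with a small standalone sketch: reads prefer the slave, everything else stays on the main connection. The <code>connection_for</code> helper and the <code>READ_METHODS</code> list are assumptions made for this illustration only; they are not part of the plugin&#8217;s API.</p>

```ruby
# Hypothetical list of methods treated as reads in this sketch.
READ_METHODS = [:find, :count, :exists?].freeze

# Chooses a connection name for a model call, given db_magic-style options.
# Writes always use the main connection; reads use the slave when one is set.
def connection_for(method_name, options = {})
  master = options.fetch(:connection, :default)
  return master unless READ_METHODS.include?(method_name)
  options.fetch(:slave, master)
end

bar_options = { :connection => :bar, :slave => :bar_slave }
read_target  = connection_for(:find, bar_options)    # read: goes to the slave
write_target = connection_for(:update, bar_options)  # write: stays on the main connection
```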
<h3>Per-Query Connection Management</h3>
<p>Sometimes you have select queries that you know you want to run on the master. This can happen, for example, when you have just added some data and need to read it back, but are not sure whether it has made it all the way to the slave yet. For this situation and a few others, there is a set of methods we&#8217;ve added to ActiveRecord models:</p>
<p>1) <code><span>on_master</span></code> &#8211; this method can be used in two forms: block form and proxy form. In the block form you can force a connection switch for a block of code:</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div>User.<span>on_master</span> <span>do</span><br />
&nbsp; user = User.<span>find_by_login</span><span>&#40;</span><span>'foo'</span><span>&#41;</span><br />
&nbsp; user.<span>update_attributes</span>!<span>&#40;</span><span>:activated</span> <span>=&gt;</span> <span>true</span><span>&#41;</span><br />
<span>end</span></div></td></tr></tbody></table></div>
<p>In the proxy form this method could be used to force one query to be performed on the master database server:</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div>Comment.<span>on_master</span>.<span>last</span><span>&#40;</span><span>:limit</span> <span>=&gt;</span> 5<span>&#41;</span><br />
User.<span>on_master</span>.<span>find_by_activation_code</span><span>&#40;</span>code<span>&#41;</span><br />
User.<span>on_master</span>.<span>exists</span>?<span>&#40;</span><span>:login</span> <span>=&gt;</span> login, <span>:password</span> <span>=&gt;</span> password<span>&#41;</span></div></td></tr></tbody></table></div>
<p>2) <code><span>on_slave</span></code> &#8211; this method is used to force a query to be run on a slave even in situations when it&#8217;s been previously forced to use the master. If there is more than one slave, one would be selected randomly. This method has two forms as well: block and proxy.</p>
<p>3) <code><span>on_db<span>&#40;</span>connection<span>&#41;</span></span></code> &#8211; this method is what makes the two previous methods possible. It is used to switch a model&#8217;s connection to some db for a short block of code (block form) or even for one statement (proxy form). It accepts the same range of values as the <code><span>switch_connection_to</span></code> method does. Example:</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div>Comment.<span>on_db</span><span>&#40;</span><span>:olap</span><span>&#41;</span>.<span>count</span><br />
Post.<span>on_db</span><span>&#40;</span><span>:foo</span><span>&#41;</span>.<span>find</span><span>&#40;</span><span>:first</span><span>&#41;</span></div></td></tr></tbody></table></div>
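<p>For readers curious how one method can offer both a block form and a proxy form, here is a toy sketch in plain Ruby. It is not the plugin&#8217;s implementation: <code>ToyModel</code> and <code>ConnectionProxy</code> are hypothetical classes, and the &#8220;connection&#8221; is just a symbol rather than a real database handle.</p>

```ruby
# Toy model class: tracks a connection name instead of a real connection.
class ToyModel
  class << self
    attr_writer :connection_name

    def connection_name
      @connection_name ||= :default
    end

    # Block form: switch, run the block, then always restore the old value.
    # Proxy form (no block given): return an object that wraps the next call.
    def on_db(name, &block)
      return ConnectionProxy.new(self, name) unless block
      previous = connection_name
      self.connection_name = name
      begin
        yield
      ensure
        self.connection_name = previous
      end
    end

    # Stand-in for a query: reports which connection it would run on.
    def count
      "count on #{connection_name}"
    end
  end
end

# Forwards any call to the model inside an on_db block.
class ConnectionProxy
  def initialize(model, name)
    @model = model
    @name = name
  end

  def method_missing(method_name, *args, &block)
    @model.on_db(@name) { @model.public_send(method_name, *args, &block) }
  end

  def respond_to_missing?(method_name, include_private = false)
    @model.respond_to?(method_name, include_private)
  end
end

block_result = ToyModel.on_db(:olap) { ToyModel.count }  # block form
proxy_result = ToyModel.on_db(:olap).count               # proxy form
after_result = ToyModel.count                            # back on :default
```

<p>The <code>ensure</code> clause is the important part of the design: the previous connection is restored even if the block raises.</p>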
<h3>Associations Connection Management</h3>
<p>ActiveRecord models can have associations with each other, and since every model has its own database connection, it becomes pretty hard to manage connections in chained calls like <code><span><span>@user</span>.<span>posts</span>.<span>count</span></span></code>. With class-only connection switching methods, this call would look like the following if we wanted to count posts on a separate database:</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div>Post.<span>on_db</span><span>&#40;</span><span>:olap</span><span>&#41;</span> <span>&#123;</span> <span>@user</span>.<span>posts</span>.<span>count</span> <span>&#125;</span></div></td></tr></tbody></table></div>
<p>Clearly, this is not the best way to write the code, so we&#8217;ve implemented <code><span>on_<span>*</span></span></code> methods on associations as well, so you can do things like this:</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div><span>@user</span>.<span>posts</span>.<span>on_db</span><span>&#40;</span><span>:olap</span><span>&#41;</span>.<span>count</span><br />
<span>@user</span>.<span>posts</span>.<span>on_slave</span>.<span>find</span><span>&#40;</span><span>:first</span>, <span>:conditions</span> <span>=&gt;</span> <span>&#123;</span> <span>:title</span> <span>=&gt;</span> <span>'Hello, world!'</span> <span>&#125;</span><span>&#41;</span></div></td></tr></tbody></table></div>
<p>Note: since ActiveRecord associations are implemented as proxies for the resulting objects/collections, it is possible to use our connection switching methods even without chained methods:</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div><span>@post</span>.<span>user</span>.<span>on_slave</span> <span># would return post's author</span><br />
<span>@photo</span>.<span>owner</span>.<span>on_slave</span> <span># would return photo's owner</span></div></td></tr></tbody></table></div>
<p>Starting with <tt>DbCharmer</tt> release 1.4 it is possible to use prefix notation for has_many and HABTM associations connection switching:</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div><span>@user</span>.<span>on_db</span><span>&#40;</span><span>:foo</span><span>&#41;</span>.<span>posts</span><br />
<span>@user</span>.<span>on_slave</span>.<span>posts</span></div></td></tr></tbody></table></div>
<h3>Named Scopes Support</h3>
<p>To make it easier for <tt>DbCharmer</tt> users to use connection switching methods with named scopes, we&#8217;ve added support for <code><span>on_<span>*</span></span></code> methods on scopes as well. All of the following scope chains do exactly the same thing (the query is executed on the <code><span><span>:foo</span></span></code> database connection):</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div>Post.<span>on_db</span><span>&#40;</span><span>:foo</span><span>&#41;</span>.<span>published</span>.<span>with_comments</span>.<span>spam_marked</span>.<span>count</span><br />
Post.<span>published</span>.<span>on_db</span><span>&#40;</span><span>:foo</span><span>&#41;</span>.<span>with_comments</span>.<span>spam_marked</span>.<span>count</span><br />
Post.<span>published</span>.<span>with_comments</span>.<span>on_db</span><span>&#40;</span><span>:foo</span><span>&#41;</span>.<span>spam_marked</span>.<span>count</span><br />
Post.<span>published</span>.<span>with_comments</span>.<span>spam_marked</span>.<span>on_db</span><span>&#40;</span><span>:foo</span><span>&#41;</span>.<span>count</span></div></td></tr></tbody></table></div>
<p>Combine this feature with our associations support, and here is what we can do:</p>
<div><table cellspacing="0" cellpadding="0"><tbody><tr><td><div><span>@user</span>.<span>on_db</span><span>&#40;</span><span>:archive</span><span>&#41;</span>.<span>posts</span>.<span>published</span>.<span>all</span><br />
<span>@user</span>.<span>posts</span>.<span>on_db</span><span>&#40;</span><span>:olap</span><span>&#41;</span>.<span>published</span>.<span>count</span><br />
<span>@user</span>.<span>posts</span>.<span>published</span>.<span>on_db</span><span>&#40;</span><span>:foo</span><span>&#41;</span>.<span>first</span></div></td></tr></tbody></table></div>
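<p>One way to see why the position of <code>on_db</code> inside a scope chain does not matter: a scope chain is evaluated lazily, so each link only records something and the query runs at the very end. The sketch below imitates that in plain Ruby. <code>ToyScope</code> and its <code>where</code> method are hypothetical names for this illustration and say nothing about how the plugin hooks into real named scopes.</p>

```ruby
# Toy scope: collects conditions and a connection name; nothing "runs"
# until #count is called at the end of the chain.
class ToyScope
  attr_reader :conditions, :connection

  def initialize(conditions = [], connection = :default)
    @conditions = conditions
    @connection = connection
  end

  # Adds a condition and returns a new scope, keeping the connection.
  def where(condition)
    ToyScope.new(conditions + [condition], connection)
  end

  # Records the connection and returns a new scope, keeping the conditions.
  def on_db(name)
    ToyScope.new(conditions, name)
  end

  # Stand-in for executing the query.
  def count
    "COUNT WHERE #{conditions.join(' AND ')} ON #{connection}"
  end
end

first_in_chain = ToyScope.new.on_db(:foo).where('published').where('spam_marked').count
last_in_chain  = ToyScope.new.where('published').where('spam_marked').on_db(:foo).count
```

<p>Both chains end up describing the same query on the same connection, which mirrors the behaviour of the scope examples above.</p>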
<h2>Documentation</h2>
<p>For more information on the plugin internals, please check out the source code. All the plugin&#8217;s code is ~100% covered with tests, which live in a separate staging Rails project located at <a href="http://github.com/kovyrin/db-charmer-sandbox">github</a>. The project has unit tests for all, or at least most, parts of the plugin&#8217;s code.</p>
<h2>What Ruby and Rails implementations does it work for?</h2>
<p>We have continuous integration setups for this plugin on MRI 1.8.6 with Rails 2.2 and 2.3. We use the plugin in production on <a href="http://www.scribd.com">Scribd.com</a> with MRI (rubyee) 1.8.6 and Rails 2.2.</p>
<h2>Who are the authors?</h2>
<p>This plugin was created at Scribd.com for our internal use, and then the sources were opened for other people to use. All the code in this package has been developed by <a href="http://kovyrin.net">Alexey Kovyrin</a> for Scribd.com and is released under the MIT license. For more details, see the LICENSE file.</p>
<hr/>
<p>If you have any comments on this project, feel free to contact me here in comments or by email. And, of course, patches are welcome (only when covered with tests).</p>

]]></content:encoded>
    <pubDate>Tue, 03 Nov 2009 15:36:26 +0000</pubDate>
    <dc:creator>Alexey Kovyrin</dc:creator>
    <category>Databases</category>
    <category>Development</category>
    <category>My Projects</category>
    <category>ActiveRecord</category>
    <category>MySQL</category>
    <category>Ruby</category>
    <category>Ruby On Rails</category>
    <category>scalability</category>
    <category>scribd</category>
  </item>

  <item>
    <title>Cardinality</title>
    <guid isPermaLink="false">tag:blogger.com,1999:blog-8575059197193667898.post-6848959801824222288</guid>
    <link>http://dave-stokes.blogspot.com/2009/11/cardinality.html</link>
    <description>Last night I was asked about index cardinality. One of the members of the North Texas MySQL Users Group was using phpMyAdmin and noticed an element marked 'cardinality' and asked me what it meant. And I will admit I was stumped. The manual says: ANALYZE TABLE determines index cardinality (as displayed in the Cardinality column of SHOW INDEX output) by doing ten random dives to each of the index trees and updating index cardinality estimates accordingly. Because these are only estimates, repeated runs of ANALYZE TABLE may produce different numbers. This makes ANALYZE TABLE fast on InnoDB tables but not 100% accurate because it does not take all rows into account. MySQL uses index cardinality estimates only in join optimization. If some join is not optimized in the right way, you can try using ANALYZE TABLE. In the few cases that ANALYZE TABLE does not produce values good enough for your particular tables, you can use FORCE INDEX with your queries to force the use of a particular index, or set the max_seeks_for_key system variable to ensure that MySQL prefers index lookups over table scans. See Section 5.1.3, “Server System Variables”, and Section B.5.6, “Optimizer-Related Issues”. Clear, huh? Well, not 100% for me, so I went back to searching. Other finds on the net say that cardinality is a measure of how many distinct values an index contains and that a UNIQUE index would have the highest cardinality. So the more unique the index entries are, the higher the cardinality. Which brings up another question: How often do you maintain your indexes? What clues tell you to do maintenance? Please share your recommendations! BTW we will have pizza at the next meeting of the North Texas Users Group, so see you December 7 at 7PM at the Sun offices, 16000 Dallas Tollway in suite 700!</description>
    <content:encoded><![CDATA[Last night I was asked about index cardinality. One of the members of the <a href="http://northtexasmysql.org">North Texas MySQL Users Group</a> was using <a href="http://www.phpmyadmin.net/home_page/index.php">phpMyAdmin</a> and noticed an element marked 'cardinality' and asked me what it meant. And I will admit I was stumped.<br /><br />The <a href="http://dev.mysql.com/doc/refman/5.0/en/innodb-restrictions.html">manual</a> says:<br /><br /><i>ANALYZE TABLE determines index cardinality (as displayed in the Cardinality column of SHOW INDEX output) by doing ten random dives to each of the index trees and updating index cardinality estimates accordingly. Because these are only estimates, repeated runs of ANALYZE TABLE may produce different numbers. This makes ANALYZE TABLE fast on InnoDB tables but not 100% accurate because it does not take all rows into account.<br /><br />MySQL uses index cardinality estimates only in join optimization. If some join is not optimized in the right way, you can try using ANALYZE TABLE. In the few cases that ANALYZE TABLE does not produce values good enough for your particular tables, you can use FORCE INDEX with your queries to force the use of a particular index, or set the max_seeks_for_key system variable to ensure that MySQL prefers index lookups over table scans. See Section 5.1.3, “Server System Variables”, and Section B.5.6, “Optimizer-Related Issues”.</i><br /><br />Clear, huh? Well, not 100% for me, so I went back to searching. Other finds on the net say that cardinality is a measure of how many distinct values an index contains and that a UNIQUE index would have the highest cardinality. So the more unique the index entries are, the higher the cardinality.<br /><br />Which brings up another question: How often do you maintain your indexes? What clues tell you to do maintenance?  
Please share your recommendations!<br /><hr align=center width=50%><br />BTW we will have pizza at the next meeting of the North Texas Users Group, so see you December 7 at 7PM at the Sun offices, 16000 Dallas Tollway in suite 700!]]></content:encoded>
    <pubDate>Tue, 03 Nov 2009 14:54:00 +0000</pubDate>
    <dc:creator>Dave Stokes</dc:creator>
    <category>cardinality</category>
  </item>

  <item>
    <title>Make MySQL refuse connections until data nodes are started</title>
    <guid isPermaLink="false">tag:blogger.com,1999:blog-5702936365231918674.post-4085539666587977620</guid>
    <link>http://blog.some-abstract-type.com/2009/11/make-mysql-refuse-connections-until.html</link>
    <description>MySQL Cluster 6.3.28 and 7.0.9 introduce the MySQL server option --ndb-wait-setup. This makes sure that clients cannot connect to the SQL Node while no Data Nodes are available, for up to 15 seconds by default. When the timeout is reached and no Data Nodes are available, the NDB storage engine will be marked as unavailable. The following will appear in the MySQL server error log when --ndb-wait-setup=30 has been set: [Note] NDB: NodeID is 10, management server 'ndbsup-priv-1:1406' [Note] NDB[0]: NodeID: 10, no storage nodes connected (timed out) [Note] Starting Cluster Binlog Thread [Note] Event Scheduler: Loaded 0 events [Note] NDB Binlog: Ndb tables initially read only. .. 30 seconds later.. [Warning] NDB : Tables not available after 30 seconds.     Consider increasing --ndb-wait-setup value [Note] /data1/mysql/5.1.39_6.3.28/libexec/mysqld: ready for connections. Use case: an installation in which you start Data and SQL Nodes in quick succession. Normally, services which connect to a MySQL server (which is connected to a MySQL Cluster) will have failures because NDB tables are not yet available. With the --ndb-wait-setup option set, they will not even be able to connect. It could help in some automated install scenarios where you want to make sure clients can't do anything until Data Nodes are available. Stay tuned for the binaries due first half of November (2009). Source is already available for MySQL Cluster 7.0.9 and 6.3.28.</description>
    <content:encoded><![CDATA[<p><a href="http://www.mysql.com/products/database/cluster/">MySQL Cluster</a> <a href="http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-news-6-3.html#mysql-cluster-news-5-1-39-ndb-6-3-28">6.3.28</a> and <a href="http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-news-7-0.html#mysql-cluster-news-5-1-39-ndb-7-0-9">7.0.9</a> introduce the MySQL server option <a href="http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-system-variables.html#sysvar_ndb_wait_setup"><tt>--ndb-wait-setup</tt></a>. This makes sure that clients cannot connect to the SQL Node while no Data Nodes are available, for up to 15 seconds by default. When the timeout is reached and no Data Nodes are available, the NDB storage engine will be marked as unavailable.</p><p>The following will appear in the MySQL server error log when <tt>--ndb-wait-setup=30</tt> has been set:</p><pre><br /> [Note] NDB: NodeID is 10, management server 'ndbsup-priv-1:1406'<br /> [Note] NDB[0]: NodeID: 10, <strong>no storage nodes connected (timed out)</strong><br /> [Note] Starting Cluster Binlog Thread<br /> [Note] Event Scheduler: Loaded 0 events<br /> [Note] NDB Binlog: Ndb tables initially read only.<br /> .. 30 seconds later..<br /> <strong>[Warning] NDB : Tables not available after 30 seconds. <br />    Consider increasing --ndb-wait-setup value</strong><br /> [Note] /data1/mysql/5.1.39_6.3.28/libexec/mysqld: ready for connections.<br /></pre><p>Use case: an installation in which you start Data and SQL Nodes in quick succession. Normally, services which connect to a MySQL server (which is connected to a MySQL Cluster) will have failures because NDB tables are not yet available. With the <tt>--ndb-wait-setup</tt> option set, they will not even be able to connect. It could help in some automated install scenarios where you want <i>to make sure</i> clients can't do anything until Data Nodes are available.</p><p>Stay tuned for the binaries due first half of November (2009). 
Source is already available for MySQL Cluster <a href="ftp://ftp.mysql.com/pub/mysql/download/cluster_telco/mysql-5.1.39-ndb-7.0.9/">7.0.9</a> and <a href="ftp://ftp.mysql.com/pub/mysql/download/cluster_telco/mysql-5.1.39-ndb-6.3.28/">6.3.28</a>.</p>]]></content:encoded>
    <pubDate>Tue, 03 Nov 2009 13:23:24 +0000</pubDate>
    <dc:creator>Geert Vanderkelen</dc:creator>
    <category>mysql</category>
    <category>cluster</category>
  </item>

</channel>
</rss>
