Showing entries 27793 to 27802 of 44921
« 10 Newer Entries | 10 Older Entries »
Postgres on OpenSolaris using 2x Quad Cores: Use FX Scheduler

During my PGCon 2009 presentation, there was a question about the sawtooth nature of the workload results on the high end of the benchmark runs, to which Matthew Wilcox (from Intel) commented that it could be scheduler related. I did not give it much thought at the time, until today when I was trying to do some iGen runs for the JDBC Binary Transfer patch (more on that in another blog post) as well as Simon's read-only scalability runs. Then I realized that I was not following one of my own pieces of tuning advice for running Postgres on OpenSolaris: use the FX scheduling class instead of the default TS class on OpenSolaris. More details on the various scheduler classes can be found on docs.sun.com.
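As a sketch, moving Postgres processes into the FX class can be done with priocntl(1) on OpenSolaris; the uid-based form below assumes the server runs as a `postgres` user, and the pid is a made-up example:

```shell
# Put every process owned by the postgres user into the FX class
# (run as root; assumes the server runs as user "postgres")
priocntl -s -c FX -i uid "$(id -u postgres)"

# Or move a single backend by pid (12345 is a placeholder)
priocntl -s -c FX -i pid 12345

# Verify the scheduling class of the postmaster and backends
ps -e -o pid,class,comm | grep postgres
```

Note that newly forked backends inherit the class from the postmaster, so setting it once on the postmaster before taking load is usually enough.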

Now how many times have I forgotten to do …

[Read more]
New Mac Dev Box Software Checklist

When I get a new Mac, I go through the same steps every time. Yes, I use a Mac as a dev box.

Software

Xcode
Adium
Launchbar
iTerm – change defaults to black background. (edit bookmarks, default, background)
Apache HTTP Server (source)
MySQL (source)
PHP (source)
Eclipse PDT
Omnigraffle

[Read more]
Introducing Multiple Clustering Indexes

In this posting I’ll describe TokuDB’s multiple clustering index feature.  (This posting is by Zardosht.)

In general (not just for TokuDB), a clustered index or clustering index is an index that stores all of the data for the rows.  Quoting the MySQL 5.1 reference manual:

Accessing a row through the clustered index is fast because the row data is on the same page where the index search leads. If a table is large, the clustered index architecture often saves a disk I/O operation when compared to storage organizations that store row data using a different page from the index record.

Most storage engines allow at most one clustered index per table. For example, MyISAM does not support clustered indexes at all, whereas InnoDB allows only the primary key to be a clustered index.
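As a hedged sketch of what multiple clustering indexes look like in TokuDB's SQL extension (the table `foo` and index names here are hypothetical), a secondary index declared with the `CLUSTERING` keyword stores full row data alongside its keys:

```sql
-- Hypothetical table: the clustering index on b keeps a full copy
-- of each row, so range scans on b never have to jump back to the
-- primary index for the other columns.
CREATE TABLE foo (
  a INT,
  b INT,
  c INT,
  PRIMARY KEY (a),
  CLUSTERING KEY b_index (b)
) ENGINE = TokuDB;

-- Answered entirely from the clustering index, no primary-key lookups:
SELECT a, b, c FROM foo WHERE b BETWEEN 100 AND 200;
```

The trade-off, of course, is extra storage and write amplification for each clustering index maintained.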

[Read more]
6 Tips for a Smooth Zimbra Server Install

It may sound odd to offer more Zimbra installation advice, since there is already a lot on the subject in other blogs, our documentation, the wiki, and the forums. In fact, some quick research surfaced over 1.4 million hits for "Zimbra server install" on the web and 36,000 on the Zimbra site alone.

But we are also fortunate to have more new Zimbra users than ever, and after helping some trial customers recently, it …

[Read more]
Cache Line Sizes and Concurrency

We’ve been looking at high-concurrency issues with Drizzle and MySQL. Jay pointed me to this article on the concurrency problems caused by shared cache lines, and I decided to run some tests of my own. The results were dramatic, and anyone who is writing multi-threaded code needs to be aware of current CPU cache line sizes and how to optimize around them.

I ran my tests on two 16-core Intel machines, one with a 64-byte cache line and one with a 128-byte cache line. First off, how did I find these values?

one:~$ cat /proc/cpuinfo | grep cache_alignment
cache_alignment : 64
...

two:~$ cat /proc/cpuinfo | grep cache_alignment
cache_alignment : 128
...

You will see one line for each CPU. If you are not familiar with /proc/cpuinfo, take a closer look at the full output. It’s a nice quick reference of other things like L2 …

[Read more]
The Big ALTER TABLE Test

As previously noted, I've been playing with XtraDB a bit at work. Over a week ago I decided to test compression on one of our larger tables and it took a bit longer than I expected.

(root@db_server) [db_name]> ALTER TABLE table_name \
    ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4;
Query OK, 825994826 rows affected (8 days 14 hours 23 min 47.08 sec)
Records: 825994826  Duplicates: 0  Warnings: 0

Zoiks!

It's too bad we couldn't use all the cores on the machine for the ALTER TABLE, huh?

On the plus side, the file sizes aren't too bad.

Before:

-rw-rw---- 1 mysql mysql 1638056067072 2009-05-24 09:23 table_name.ibd

After:

-rw-rw---- 1 mysql mysql  587617796096 2009-05-27 07:14 table_name.ibd
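For the record, the two sizes above work out to roughly a 2.8x reduction; a quick check (sizes copied from the listings):

```python
before = 1638056067072  # table_name.ibd before compression
after = 587617796096    # after ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4

ratio = after / before
print(f"compressed to {ratio:.1%} of original "
      f"({before / after:.1f}x smaller)")
# -> compressed to 35.9% of original (2.8x smaller)
```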

I'll have more to say about XtraDB …

[Read more]
What we're looking for in a data integration tool

As our data warehousing process grows and the workflows get more complex, we've revisited the question of what tools to use in this process. Out of curiosity, I had a look at basing such a process on Hadoop/Hive for scalability reasons, but the lack of mature tools and the sacrifices in efficiency that would entail mean we're better off using something else unless a distributed processing platform is the only thing that can get the job done. I'm also curious about the transition to continuous integration, a model I noticed showing up a couple of years ago and now getting some air under its wings as CEP, IBM's …

[Read more]
GPL Licensing and MySQL Storage Engines

The spirit and intent of the Free Software Foundation (FSF) and the GPL license are right on target. However, we must be careful to ensure that the GPL license is interpreted in a manner that fulfills the spirit and intent behind its framing. Richard Stallman and associates set out to draft a license agreement that ensures that free software remains free. They didn’t want to see open source become corrupted by the insertion of proprietary code that would eat away at the freedoms they envisioned.

To protect the eternal purity of open source software, they created constraints on how proprietary code can interact with GPL code. Their one weapon in this battle is the automatic and forced expansion of the GPL license to any code that integrates with the base GPLed code. I often refer to this process as acting like a virus. I don’t use this term to imply nefarious intent any more than "viral marketing" implies nefarious intent. On the …

[Read more]
Overwriting is much faster than appending

Writing small volumes of data (bytes to MBs) with sync (fsync()/fdatasync()/O_SYNC/O_DSYNC) is very common for an RDBMS and is needed to guarantee durability. For transaction log files, a sync happens per commit. For data files, a sync happens at checkpoint, etc. Typically an RDBMS syncs data very frequently. In this case, overwriting is much faster than appending on most filesystems/storage. Overwriting does not change the file size, while appending does. Increasing the file size incurs a lot of overhead, such as allocating space within the filesystem and updating & flushing metadata. This really matters when you write data with fsync() very frequently. The following are simple benchmarking results on ext3 on RHEL 5.3.


1. creating an empty file, then writing 8KB 128*1024 times with fdatasync()
fdatasync per second: 3085.94321
(Emulating current binlog (sync-binlog=1) behavior)

2. creating a 1GB data file, then …
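A scaled-down sketch of this comparison (my own illustration, not the original benchmark, and far smaller than the 8KB x 128*1024 runs above) can be written in a few lines of Python on any POSIX system — preallocating with ftruncate() makes every synced write an overwrite instead of an append:

```python
import os
import tempfile
import time

def synced_writes(path, preallocate, count=100, block=8192):
    """Write `count` blocks of `block` bytes, fdatasync() after each.
    With preallocate=True the file is sized up front, so every write
    overwrites already-allocated space; otherwise each write appends
    and grows the file (forcing metadata updates on every sync)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        if preallocate:
            os.ftruncate(fd, count * block)  # fix the file size up front
        buf = b"x" * block
        start = time.monotonic()
        for _ in range(count):
            os.write(fd, buf)
            os.fdatasync(fd)  # durable per write, like sync-binlog=1
        return time.monotonic() - start
    finally:
        os.close(fd)

with tempfile.TemporaryDirectory() as d:
    t_append = synced_writes(os.path.join(d, "append"), preallocate=False)
    t_over = synced_writes(os.path.join(d, "overwrite"), preallocate=True)
    print(f"append:    {t_append:.3f}s")
    print(f"overwrite: {t_over:.3f}s")
```

The exact gap depends heavily on the filesystem and storage, which is the point of the ext3 numbers above.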

[Read more]
A 10x Performance Increase for Batch INSERTs With MySQL Connector/J Is On The Way....

Connector/J has a feature where the driver can take prepared statements of the form "INSERT INTO foo VALUES (...)" and, if configured with "rewriteBatchedStatements=true", re-write batches of them into the form "INSERT INTO foo VALUES (...), (...), (...)". This is a performance win on a few fronts: reduced latency (remember, MySQL in general doesn't have a "batch" form of prepared statement parameter bindings, so each parameter set usually needs to be sent as a separate INSERT), and the server's own optimizations for handling "multi-value" INSERTs.
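The transformation the driver performs can be illustrated roughly as follows (a simplified Python sketch, not the driver's actual Java implementation, which also has to handle quoting, ON DUPLICATE KEY clauses, and so on):

```python
def rewrite_batch(sql, batches):
    """Illustrative only: collapse N single-row INSERTs into one
    multi-value INSERT, the way rewriteBatchedStatements=true does.
    `sql` is e.g. "INSERT INTO foo VALUES (?, ?)" and `batches` is a
    list of parameter tuples, one per addBatch() call."""
    head, _, values = sql.partition(" VALUES ")
    placeholders = values.strip()  # e.g. "(?, ?)"
    return head + " VALUES " + ", ".join([placeholders] * len(batches))

print(rewrite_batch("INSERT INTO foo VALUES (?, ?)",
                    [(1, "a"), (2, "b"), (3, "c")]))
# -> INSERT INTO foo VALUES (?, ?), (?, ?), (?, ?)
```

The rewritten statement carries all parameter sets in one round trip instead of one network exchange per row.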

Prior to the code now sitting at the head of Connector/J 5.1, which fixes Bug#41532 and Bug#40440, "larger" batches (over 500 batched parameter sets or so) experienced extreme performance degradation due to a bunch of extra parsing the driver did while …

[Read more]