13 件中 1 - 10 件を表示
次の 3 件 »
Displaying posts with tag: in English (reset)
Compressing URLs in your Webapp, for size and speed

Last year I had a chance to talk about the internals of our service: Pathtraq at Percona Performance Conference (slides), in which I described the methods we use to compress the URLs in our database to below 40% of the original size, however had not released the source code since then.  I am sorry for the delay, but have finally uploaded the code to github.com/kazuho/url_compress.

It is generally considered difficult to achieve high ratio for compressing short texts.  This is due to the fact that most compression algorithms are adaptive, i.e., short texts reach their end before the compressors learn how to encode them …

[さらに読む]
Q4M 0.9.4 released

I have just uploaded Q4M (Queue for MySQL) 0.9.4 to q4m.31tools.com.  There has been no bug fixes since 0.9.3, the only change is the newly added function queue_compact(table_name) that can be used to trigger table compaction manually.

If you were looking for a way to control queue compaction timing, it would be worth considering upgrading to 0.9.4.

For more information of what compaction is, please refer to my last entry on Q4M describing concurrent compaction.

MySQL and the XFS stack overflow problem

I have heard that there have been talks at MySQL UC on running MySQL (InnoDB) on top of XFS for better write concurrency than ext3.

Last weekend I had a chance to discuss on Twitter the xfs stack overflow problem (that became apparent this month, on x86_64 systems with 8k stack) and how it would affect (if any) MySQL servers.  If I understand correctly, the problem is that stack overflow might occur when a dirty page needs to be paged-out to xfs.

The conclusion was that for stability of MySQL running on xfs:

  1. xfs should not be used on top of LVM or any other MD (i.e. software RAID, etc.)
  2. xfs volumes should only contain ibdata files (that are accessed using O_DIRECT) so that the files on xfs would never exist as dirty pages within the OS

References:

[さらに読む]
Q4M 0.9.2 prerelease avaiable fixing data corruption on 32bit systems

Thanks to a user of Q4M, I have found a bug that would likely lead to data corruption on 32bit versions of Q4M.  64bit versions are unaffected.

Q4M by default uses mmap(2) to read from data files.  On 32bit systems, it tries to map max. 1GB per each table into memory using mmap.  When mmap fails to map memory due to low memory, Q4M falls back to file I/O to read the data.

However there was a bug in handling the response from mmap, that led to reading corrupt data from database files when mmap(2) failed after the size of the underlying file was grown / shrunk by Q4M.  And since Q4M writes back the corrupt data into the database file when rows are being consumed, the bug will likely destroy the database files.

I have fixed the bug and have uploaded Q4M 0.9.2, into the prerelease directory at …

[さらに読む]
Building a highly configurable, easy-to-maintain backup solution for LVM-based VMs and MySQL databases

Motives and the Features

For the servers running in our new network, I was in need for a highly configurable, but easy-to-use backup solution that can take online backups of VMs and MySQL databases running multiple storage engines.

Since my colleagues are all researchers or programmers but there are no dedicated engineers for managing our system, I decided to write a set of command line scripts to accomplish the task instead of using an existing, highly-configurable but time-taking-to-learn backup solutions, like Amanda.

And what I have come up with now is a backup solution with following characteristics, let me introduce them.

  • a central backup server able to take backup of other servers over SSH using public-key authentication
  • no need to install backup agents into each server
  • LVM snapshot-based online, …
[さらに読む]
Comparing InnoDB performance on HDD, SSD, in-memory

The chart shows benchmark results taken using sysbench.  Rough understanding would be that (for this scenario) the performance ratio is HDD:SSD:in-memory = 1:10:50.

transactions/sec. read/write reqs./sec.
buffer_pool=8M, HDD 19.93 378.59
buffer_pool=8M, SSD (Intel X25-M) 207.70 3946.29
buffer_pool=2048M, HDD 998.82 18977.51

Details:

The benchmark was taken using MySQL 5.1.41 using innodb_plugin running on linux 2.6.31/x86_64 (Ubuntu 9.10 server).  Options passed to sysbench were: --test=oltp --db-driver=mysql --mysql-table-engine=innodb …

[さらに読む]
Uploading an autotools-based distribution onto CPAN

Background

It is a pain to create binary packages.  But installing a program from source tarball is a tedious task.  You need to run ./configure & make && make install.  Sometimes you need to resolve the dependencies by hand as well.  That's where source-code-based package distribution systems come in, and the largest system is, IMHO, CPAN.  If you could upload a autotools-based distribution onto CPAN, then the users of the software can install them with the cpan command (or cpanp or cpanf or whatever), with the dependencies automatically resolved.

And for my case, it was considered especially benefitial, since the program I am now working on (it's called incline, a replicator for RDB shards using MySQL or PostgreSQL), uses perl scripts for running tests.  By …

[さらに読む]
A Clever way to scale-out a web application (YAPC::Asia 2009 Presentation)

For couple of months I have been writing middlewares for database shards, and today I made a presentation covering them.  It includes the following.

  • Incline - a trigger and queue based distributed materialized view manager
  • Pacific - a set of perl scripts to manage MySQL shards, a MySQL shard can be split into two in less than 10 seconds of write blocking (and no read blocks)
  • DBIx::ShardManager - a client API for accessing database shards using Incline and Pacific

With these middlewares I think it is no more difficult to write web applications that runs on database shards.  In fact IMHO it is as easy as writing a webapp that runs on a standalone database.

The presentation slides are available from slideshare.  If you have any question or suggestions, please leave a comment.  Thank you.

[さらに読む]
Picoev: a tiny event loop for network applications, faster than libevent or libev

I am sure many programmers writing network applications have their own abstracting layers hiding the differences between various I/O multiplex APIs, like select(2), poll(2), epoll(2), ... And of course, I am one among them.  While writing mycached (see Mycached: memcached protocol support for MySQL for more information), I was at first considering of using libev for multiplexing socket I/Os.  Libevent was not an option since it does not (yet) provide multithreading support.

But it was a great pain for me to learn how to use libev.  I do not mean that its is an ugly product.  In fact, I think that it is a very well written, excellent library.  However, for me it was too much a …

[さらに読む]
Mycached: memcached protocol support for MySQL

It is a well-known fact that the bottlenecks of MySQL does not exist in its storage engines, but rather in the core, for example, its parser and execution planner.  Last weekend I started to wonder how fast MySQL could be if those bottlenecks were skipped.  Not being able to stop my curiousity, I started adding memcached proctol support to MySQL as a UDF.  And that is Mycached.

From what I understand, there are two advantages of using mycached (or the memcached protocol, in general) over using SQL.  One is faster access.  The QPS (queries per second) of mycached is roughly 2x compared to using SQL.  The other is higher concurrency.  As can be seen in the chart below, mycached can handle thousands of connections simultaneously.

[さらに読む]
13 件中 1 - 10 件を表示
次の 3 件 »