Innodb Caching (part 2)

A few weeks ago I wrote about Innodb Caching, with the main idea that you might need more cache than you think you do, because Innodb caches data in pages, not rows, and so the whole page needs to be in memory even if you need only one row from it. I created a simple benchmark which shows a worst-case scenario by picking a random set of primary key values from a sysbench table and reading them over and over again.
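
In rough terms, the benchmark loop looks like the sketch below. This is a reconstruction, not the actual harness: the table and column names (sbtest, id, c) are sysbench defaults, and the starting N is arbitrary.

# Pick a fixed random set of N primary keys, then re-read the same rows
# repeatedly so that only those N rows are "hot".
N=10000
mysql -N -e "SELECT id FROM sbtest ORDER BY RAND() LIMIT $N" sbtest > keys.txt
ids=$(paste -sd, keys.txt)
for run in 1 2 3; do
    time mysql -N -e "SELECT SUM(LENGTH(c)) FROM sbtest WHERE id IN ($ids)" sbtest
done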

This time I decided to “zoom in” on the point where the performance drop happens – a 2x increase in the number of rows per step hides a lot of detail, so I’m starting with a number of rows at which everything was still in cache for all runs and increasing the number of rows being tested by 20% per step. I’m trying the standard Innodb page size, a 4KB page size, and the 16K page size compressed to 4K. The data in this case compresses perfectly (all pages …

[Read more]
Running Spotlight from your Mac terminal window

A colleague just showed me this most excellent little command you can add to your .profile on the Mac to do Spotlight-indexed searches from the command line. Very nice.

function slocate() {
    # The 'wc' after the quoted value are Spotlight query modifiers:
    # w = match on word boundaries, c = case-insensitive.
    mdfind "kMDItemDisplayName == '$@'wc";
}

config[master]% time slocate my.cnf
/private/etc/my.cnf
/opt/local/var/macports/sources/rsync.macports.org/release/ports/databases/mysql4/files/my.cnf
real 0m0.018s
user 0m0.006s
sys 0m0.006s

Replication Issues: Never purge logs before the slave catches up!!

A few days ago one of our customers contacted us to report a problem with one of their replication servers.

The server was reporting the following error:

Last_Error: Could not parse relay log event entry. The possible reasons are: the master’s binary log is corrupted (you can check this by running ‘mysqlbinlog’ on the binary log), the slave’s relay log is corrupted (you can check this by running ‘mysqlbinlog’ on the relay log), a network problem, or a bug in the master’s or slave’s MySQL code. If you want to check the master’s binary log or slave’s relay log, you will be able to know their names by issuing ‘SHOW SLAVE STATUS’ on this slave.

After a brief investigation we found that the customer had deleted some binary logs from the master and relay logs from the slave to free up space, since they were running low on disk.
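
For context: when the master still has the binary logs the slave needs, the usual first-line fix for a corrupted relay log is to discard the relay logs and re-point the slave at the last master position it actually executed. A sketch (the log file name and position are hypothetical):

# Find the last executed master coordinates:
mysql -e "STOP SLAVE; SHOW SLAVE STATUS\G" | grep -E 'Relay_Master_Log_File|Exec_Master_Log_Pos'
# CHANGE MASTER TO discards the relay logs and re-fetches from the master:
mysql -e "CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000123', MASTER_LOG_POS=4567; START SLAVE;"

Here that escape hatch was gone, since the binlogs had been purged from the master before the slave had caught up.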

The customer asked us to get the slave working again without affecting the production …

[Read more]
451 CAOS Links 2011.05.10

EMC launches Greenplum HD. DataStax releases Brisk. And more.

# EMC launched its Greenplum HD Hadoop distribution, with the support of Jaspersoft, Pentaho, and SnapLogic, among others.

# DataStax …

[Read more]
DRBD and Semi-sync shootout on a large server

DRBD and semi-sync benchmarks on a 2x8 132 GB server

I recently had the opportunity to run some benchmarks against a relatively large server, to learn how it was behaving in its specific configuration. I got some interesting results that I'll share here.
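
For readers who haven’t used it, “semi-sync” here is MySQL 5.5’s semi-synchronous replication. A minimal sketch of enabling it, using the stock plugin names shipped with 5.5:

# On the master:
mysql -e "INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
          SET GLOBAL rpl_semi_sync_master_enabled = 1;"
# On each slave (restart the IO thread so it registers as semi-sync):
mysql -e "INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
          SET GLOBAL rpl_semi_sync_slave_enabled = 1;
          STOP SLAVE IO_THREAD; START SLAVE IO_THREAD;"

Benchmarking DRBD against this setup compares block-level synchronous replication below MySQL with MySQL-level acknowledgment from a slave.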

read more

New Maatkit tool: mk-table-usage

This month’s Maatkit release includes a new tool that’s kind of an old tool at the same time. We wrote it a couple years ago for a client who has a very large set of tables and many queries and developers, and wants the database’s schema and queries to self-document for data-flow analysis purposes. At the time, it was called mk-table-access and was rather limited — just a few lines of code wrapped around some existing modules, with an output format that wasn’t generic enough to be broadly useful. Thus we didn’t release it with Maatkit. We recently changed the name to mk-table-usage (to match mk-index-usage), included it in the Maatkit suite of tools, and enhanced the functionality a lot.
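
A hedged example of invoking it; this assumes the same interface as its later Percona Toolkit counterpart, and the log path is hypothetical:

# Report which tables each query in a slow log reads and writes:
mk-table-usage /var/log/mysql/mysql-slow.log
# Or analyze a single statement given on the command line:
mk-table-usage --query 'INSERT INTO t SELECT * FROM s'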

What’s this tool good for? Well, imagine that you’re a big MySQL user and you hire a new developer. Now you need to bring the new person up to speed with your environment. Or, you want to …

[Read more]
On database write workload profiling

I always have difficulties with complex analysis schemes, so I fall back to something that is somewhat easier. Or much easier. Here I will explain the super-powerful method of database write workload analysis.

Doing any analysis on master servers is already too complicated: instead of analyzing write costs one can get too obsessed with locking, and there’s sometimes an uncontrollable amount of workload hitting the server besides writes. Fortunately, slaves are much better targets, not only because writes there are single-threaded, thus exposing every costly I/O as a time component, but also because one can drain traffic from slaves, or send more in order to create a more natural workload.
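
One concrete, low-tech way to see what a slave is actually replaying (not necessarily the method this post goes on to describe) is to digest the binlog stream it executes. This assumes your mk-query-digest build supports binlog input, and the file name is hypothetical:

# Decode a binlog and aggregate the write statements in it:
mysqlbinlog mysql-bin.000123 > /tmp/binlog.sql
mk-query-digest --type binlog /tmp/binlog.sql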

Also, there can be multiple states of slave load:

  • Healthy, always at 0-1s lag, write statements are always immediate
  • Spiky, usually at 0s lag, but with jumps due to occasionally occurring slow statements
  • Lagging, because of …
[Read more]
Star schema benchmark on MySQL Cluster 7.2

I decided to try the star schema benchmark on our latest 7.2 release (link). The star schema benchmark is an analytics-oriented benchmark, and MySQL Cluster has not been developed to address this kind of workload. Nevertheless I couldn't resist trying...
Setup

  • 2 data-nodes, each running on a 4-way Xeon E7420 @ 2.13GHz (16 cores total) with 256GB RAM
  • The mysqld was co-located with one of the data-nodes
  • I used memory tables

Results
Queries: link

Query    sf10    sf100
Q1.1     5       62
Q1.2     0.4     …
[Read more]
What’s a good benchmark?

Vadim has taught me that valid benchmarks are both simple and complex. Simple, because the basic principles are few; complex, because the devil is in the details and it’s a lot of work to satisfy the basic requirements. I’ll give the simple version here.

  • Benchmarks must be appropriate. The workload, sample dataset, distribution of work and data, and so on must be relevant and meaningful for the intended purpose. Running the wrong benchmark rarely teaches anything.
  • Benchmarks must be fully documented. Another researcher must be able to determine exactly how you ran your benchmark: on what hardware, under what workload, with what operating system and kernel version, with what MySQL tuning parameters, and so on.
  • Benchmarks must be repeatable. Another researcher must be able to reproduce your results. Documentation is part of this, but you need to ensure that you can reproduce your own results. If you can’t, no one else …
[Read more]
So you want to run MySQL on SSDs?

Here’s why I do: it’s time for me to build a new master database server. Our current main slave is too underpowered to handle our entire load in an emergency, which means that our failover situation isn’t that great. I’ll replace the master with something new and shiny, make some performance improvements while I’m at it, and the old master will work just fine in an emergency.

For IO intensive servers, I conserve space and electricity by using 1U machines with 6 or 8 2.5″ drives.

I’d normally buy 8 Seagate Savvio 15K SAS drives and set them up as a RAID 10 array. This would run me about $1850.

We’re pretty frugal when it comes to our technology budget and I can’t really stomach spending that kind of money to effectively get 550 GB of redundant, fast magnetic disk storage. SATA MLC SSDs that blow traditional drives out of the water are currently under $2 / GB.
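
The back-of-the-envelope math, using the numbers above:

echo "scale=2; 1850 / 550" | bc   # ~3.36 $/GB for the magnetic RAID 10 array,
                                  # versus under 2 $/GB for SATA MLC SSD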

Disclaimer

[Read more]