Showing entries 23161 to 23170 of 44965
« 10 Newer Entries | 10 Older Entries »
MySQL Cluster on Windows - NDB API part 4 - Finishing it up

So we have come to the fourth and last part of this small series on how to get started with NDB MGM API on Windows. I am planning some more code using NDB API specifically, but that will be a separate series.
If you haven't followed this series before, the parts before this one were:

[Read more]
The case against using rpm packaging for MySQL

In some environments using a distro package management system may* provide benefits including handling dependencies and providing a simpler approach when there are no dedicated DBA or SA resources.

However, incorrect use can cause pain and, in this instance, production downtime. Even with dedicated resources at an unnamed premium managed hosting provider, a simple mistaken assumption resulted in over 30 minutes of unplanned downtime during peak time.

One of the disadvantages of using a system such as rpm is the lack of control in managing the starting and stopping of your MySQL instance, and the second is unanticipated package dependency upgrades.

So what happened with this client? When attempting to use the MySQL client on the production server, I got the following error.

$ mysql -uxxx -p
error while loading shared libraries: libmysqlclient.so.10: cannot open shared object file: No such file or …
[Read more]
Redis -- key pair and replication

DBAs seem to be getting hit over the head with the NoSQL message while trying to keep their SQL systems going. SQL does have its place in the world, and much of the NoSQL push seems to be a way to get past some of SQL's limitations. But simply moving from a proven technology and infrastructure to something new with fad overtones is not going to make life easier for Joe Average DBA. Redis is one of those projects that will get noticed by a lot of DBAs looking for a very fast key-value datastore.

Redis keeps the dataset in memory but writes it to disk asynchronously, and the data is reloaded when Redis is restarted. Alternatively, the data can be saved each time a command is issued, or on a schedule, to minimize data loss.

Redis also has master-slave replication, and setup consists of a 'slaveof x.x.x.x' line (followed by the master's port) in the slave's config file. And that is the only trivial thing about Redis.
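
As a rough illustration of the persistence and replication settings described above, a slave's config file might look something like the fragment below. This is only a sketch: the master address 192.0.2.10 and port 6379 are placeholders, and the snapshot interval is an arbitrary example value, not a recommendation from the original post. Newer Redis releases spell the replication directive `replicaof`.

```
# redis.conf on the slave (sketch; address and port are placeholders)
slaveof 192.0.2.10 6379

# Snapshot the dataset to disk if at least 1 key changed within 900 seconds
save 900 1

# Optional: log every write to an append-only file and fsync after each
# command, trading speed for less data loss
appendonly yes
appendfsync always
```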

[Read more]
Connecting Pentaho Data Integration to hive / hadoop

My latest data integration challenge has been with a new node in my data landscape: a hadoop/hive installation. Since PDI has become my favorite hammer for many different tasks, I thought it would be handy to get connected to the hive database via jdbc. With that ability, I can enhance hive output by including lookups and joins with operational (MySQL) databases.

Unfortunately, I didn't have much luck using standard connections with jdbc and table input steps. I suppose this is because the hive jdbc driver is still in the embryonic stage.

The turning point for my effort was the discovery of the new User Defined Java Class in Pentaho 4.0 GA. I struggled a bit before getting this to work, but I now have a simple working example that returns the result of a hive query to the stream. There was quite a bit of late night thrashing, so excuse the un-refined code.

In summary, the keys to getting the udjc to work …

[Read more]
Translation of "Chapter 10. Lost connection to MySQL server during query." of "Methods for searching errors in SQL application" just published.

This chapter covers possible causes of the "Lost connection to MySQL server" error that were not discussed in the previous chapter.



Chapter 10. Lost connection to MySQL server during query


You can see the error "Lost connection to MySQL server" not only because
connect_timeout is too small, but for other reasons too. In this
chapter we discuss these reasons.




$ php phpconf2009_4.php

string(44) "Lost connection to MySQL server during query"


Most likely, the error log will show what happened:


...


Rest of the chapter is here


On “Replace Into”, “Insert Ignore”, Triggers, and Row Based Replication

In posts on June 30 and July 6, I explained how implementing the commands “replace into” and “insert ignore” with TokuDB’s fractal tree data structures can be two orders of magnitude faster than implementing them with B-trees. Towards the end of each post, I hinted that there are some caveats that complicate the story a little. On July 21st I explained one caveat, secondary keys, and on August 3rd, Rich explained another caveat. In this post, I explain the other …

[Read more]
Comparing ScaleDB’s Shared Cache Tier vs. NFS and CFS

Prior posts addressed the performance benefits of a shared cache tier (ScaleDB CAS) and also the storage flexibility it enables. This post compares the ScaleDB CAS purpose-built file storage sharing system against off-the-shelf solutions like NFS and various cluster file systems (CFS).

When using a clustered database, like ScaleDB, each node has full access to all of the data in the database. This means that the file system (SAN, NAS, Cloud, etc.) must allow multiple nodes to share the data in the file system.

Options include:
1. Network File System (NFS)
2. Cluster File System (CFS)
3. Purpose-built file storage interface

Locking Granularity:
I won’t get deeply …

[Read more]
Back to blogging....

It has been a while since I posted on my blog - in fact, I believe this is the first time ever that more than one month passed between posts since I started blogging. There are a couple of reasons for the lag:


  • Matt Casters, Jos van Dongen and I have spent a lot of time finalizing our forthcoming book, Pentaho Kettle Solutions (Wiley, ISBN: 978-0-470-63517-9). The book is currently being produced, and should be available according to schedule in early September 2010. If you're interested, you might like to read …
[Read more]
Why message queues and offline processing are so important

If you read Percona's whitepaper on Goal-Driven Performance Optimization, you will notice that we define performance using the combination of three separate terms. You really want to read the paper, but let me summarize it here:

  1. Response Time - This is the time required to complete a desired task.
  2. Throughput - Throughput is measured in tasks completed per unit of time.
  3. Capacity - The system's capacity is the point where load cannot be increased without degrading response time below acceptable levels.

Setting and meeting your response time goal should always be your primary focus, but the closer throughput is to capacity, the worse response time can be. It's a trade-off!
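
To make the trade-off concrete, here is a minimal sketch that assumes a textbook single-server (M/M/1) queue. That model is an assumption made for illustration, not something taken from the whitepaper, and the function name `response_time` and the 10 ms service time are hypothetical. In this simplified model the maximum service rate stands in for capacity, and mean response time grows sharply as throughput approaches it.

```python
def response_time(service_time, throughput):
    """Mean response time in seconds under an assumed M/M/1 queue.

    service_time: seconds needed to complete one task on an idle system.
    throughput:   tasks arriving (and completing) per second.
    """
    # Utilization is the fraction of the server's time spent busy.
    utilization = throughput * service_time
    if not 0 <= utilization < 1:
        raise ValueError("throughput must stay below the maximum service rate")
    # M/M/1 result: R = S / (1 - utilization)
    return service_time / (1.0 - utilization)


if __name__ == "__main__":
    service_time = 0.010  # hypothetical: 10 ms per task, so at most 100 tasks/s
    for tasks_per_second in (10, 50, 90, 99):
        r = response_time(service_time, tasks_per_second)
        print(f"{tasks_per_second:>3} tasks/s -> {r * 1000:.1f} ms")
```

Under these assumptions, response time only doubles at half of the maximum rate but climbs steeply in the last few percent before it, which is one argument for moving work that does not need an immediate answer into a queue for offline processing.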

[Read more]