Showing entries 1 to 7
Displaying posts with tag: slow (reset)
The Fast Way to Import CSV Data Into a Tungsten Cluster

The Question Recently, a customer asked us:

After importing a new section of user data into our Tungsten cluster, we are seeing perpetually rising replication lag. We are sitting at 8.5hrs estimated convergence time after importing around 50 million rows and this lag is climbing continuously. We are currently migrating some of our users from a NoSQL database into our Tungsten cluster. We have a procedure to write out a bunch of CSV files after translating our old data into columns and then we recursively send them to the write master using the mysql client. Specifically our import SQL is doing LOAD DATA LOCAL INFILE and the reading in a large CSV file to do the import. We have 20k records per CSV file and we have 12 workers which insert them in parallel.

Simple Overview The Skinny

In cases like this, the slaves are having trouble with the database unable to keep up with the apply stage …

[Read more]
GROUP_CONCAT is very slow. So I used FastBit for the table instead! 43 seconds down to .16!

MySQL always tries to make toast of my good ideas. This is the only time is succeeds in making toast (see bug #2). |

This time I'm working on star schema optimzation (which will be my next blog post). MySQL takes about .24 seconds to project out a list of parts that match a filter, but it takes 43 (not .43, FORTY-THREE) seconds to turn it into a CSV list using GROUP_CONCAT.

  select P_PartKey from part where where P_MFGR = 'MFGR#1' or P_MFGR='MFGR#2';
  399597 rows in set (0.24 sec)

  mysql> select group_concat(P_PartKey) into @pkey from part where P_MFGR = 'MFGR …

[Read more]
GROUP_CONCAT is very slow. So I used FastBit for the table instead! 43 seconds down to .16!

MySQL always tries to make toast of my good ideas. This is the only time is succeeds in making toast (see bug #2). |

This time I'm working on star schema optimzation (which will be my next blog post). MySQL takes about .24 seconds to project out a list of parts that match a filter, but it takes 43 (not .43, FORTY-THREE) seconds to turn it into a CSV list using GROUP_CONCAT.

  select P_PartKey from part where where P_MFGR = 'MFGR#1' or P_MFGR='MFGR#2';
  399597 rows in set (0.24 sec)

  mysql> select group_concat(P_PartKey) into @pkey from part where P_MFGR = 'MFGR#1' or   P_MFGR='MFGR#2';
  Query OK, 1 row affected (43.25 sec)

So I decided to just stick the part table into FastBit instead (using my FastBit_UDF tools):
fb_create('/var/lib/fastbit/part', …

[Read more]
The Myth About Slow SQL JOIN Operations

In my recent SQL work for a large Swiss bank, I have maintained nested database view monsters whose unnested SQL code amounted up to 5k lines of code, joining the same table over and over again in separate subselects combined via UNION operations. This monster performed in way under 50ms, no matter how we queried … Continue reading The Myth About Slow SQL JOIN Operations →

Stock Centos6 Ruby and a MySQL database far away

I wanted to share a recent issue with stock Ruby on Centos6.

We run a Rails application that connects to a MySQL datastore. We're in a datacenter transformation and we deployed the application to our new datacenter though the MySQL datastore is at the other datacenter. As you can see there is about a 35ms distance between the application and the MySQL instance.



[root@host]# ping 172.x.y.19
PING 172.x.y.19 (172.x.y.19) 56(84) bytes of data.
64 bytes from 172.x.y.19: icmp_seq=1 ttl=253 time=32.6 ms
64 bytes from 172.x.y.19: icmp_seq=2 ttl=253 time=38.1 ms
64 bytes from 172.x.y.19: icmp_seq=3 ttl=253 time=36.0 ms
^C
--- 172.x.y.19 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2751ms
rtt min/avg/max/mdev = 32.656/35.600/38.101/2.250 ms


We started the Rails application up though it just was not …

[Read more]
Canada’s Version of the Ant and Grasshopper

Background Knowledge

It appears this depiction of the Canadian political left and right originated as far as I can tell by johnnyslow [at] gmail [period] com around the end of the 2005. I recently heard the below of what I’m reposting on Charles Adler radio show.

I was so impressed by this analogy I had to post it for others that may have not read it yet.

Classic Version

The ant works hard in the withering heat all summer long, building his house and laying up supplies for the winter. The grasshopper thinks he’s a fool, and laughs and dances and plays the summer away.

Come winter, the ant is warm and well fed. The shivering grasshopper has no food or
shelter, so he dies out in the cold.

THE END

The Canadian Version

The ant works hard in the withering heat all summer long, building his house and laying …

[Read more]
Q&A on MySQL 5.1

Listening to Sheeri's presentation on MySQL 5.1, I saw that there are a few questions left unanswered. I am listing here some of the questions that I found interesting, plus a few from an early webinar on the same topic.

Q: does Partitioning physically split data?
A: No. Some engines (MyISAM, Archive) do a physical split, but this is not necessary, as you see if you apply partitioning to a InnoDB table. Partitioning is a logical split of data, for easy retrieval. It is completely transparent to the user.
Q: Can you set partitions to different servers?
A: No. Partitions are logical parts of one table within one server. Partitioning through the Federated engine is not supported.
Q: How efficient are Row-Based …
[Read more]
Showing entries 1 to 7