Showing entries 36491 to 36500 of 44734
« 10 Newer Entries | 10 Older Entries »
Being lazy

Dear Kettle fan,
Since our code is open, we have to be honest: in the past, the performance of Kettle was less than stellar in the “Text File” department. It’s true that we did offer some workarounds with respect to database loading, but there are cases when people don’t want to touch any database at all. Let’s take a closer look at that specific problem…

Reading and writing text files…
Let’s take a look at this delimited (CSV) file (28MB). Unzipped, the file is around 89MB in size.

Suppose you read this file using version 2.5.1 (soon to be out) with a single “Text File Input” step. On my machine, that process consumes most of the available CPU power it can take and takes around 57 seconds to complete. (1M rows/minute or 60M rows/hour)

When we analyze what’s eating the CPU …

[Read more]
scale-out: notes on sharding, unique keys, foreign keys?

adapt.com uses sharding to scale out. In my talk on moving to MySQL, I mentioned a few design issues we had to become aware of before we rolled out a sharded database environment. I promised to give more details, so here they are…

We wanted some keys to be globally unique (eg, userid and accountid). How would we do this?

auto_increment_offset and auto_increment_increment

We had been using autoincremented values. So we considered using MySQL’s auto_increment_offset and auto_increment_increment to keep autoincrement values unique across shards. (This basically tells each shard to start at a certain number (auto_increment_offset), eg 1 for one shard, 2 for the next, etc, then to skip auto_increment_increment between values. So, eg, if we set that to 1000, our next keys would be 1001 for shard one, 1002 for shard 2, etc. Then 2001 for shard one, 2002 for shard 2, and so on. …

[Read more]
MySQL Proxy: Query Stats

In one of the last commits I added a SQL Tokenizer which understands the basic tokens of (My)SQL:

  • Strings
  • Literals
  • Numbers
  • Comments
  • Operators

With this basic understanding we can normalize Queries and build statistics over similar queries.

The idea is simple and already implemented in mysqldumpslow:

/* login.php:37 */SELECT * FROM tbl WHERE id = 1243 AND name = "jan"

is turned into

  SELECT * FROM `tbl` WHERE `id` = ? AND `name` = ?

The queries look like prepared statements now and can be used the characterize queries of the same kind.

  • comments are removed
  • whitespaces are stripped
  • literals are quoted
  • constants are replaced with ?

Taking the famous world-db and executing some simple queries like:

root@127.0.0.1:4040 …
[Read more]
Tips From Work: Emacs Regex Replace and the Star Quantifier

I came across this issue sometime last year, got the solution from the resident Emacs wizard and forgot about it. Last week, it happened again. I'm recording it now so I don't forget about it.

Emacs lets you search and replace in a buffer with regular expressions. So far so cool. I needed to replace a part of a lot of HTML links in a file.


<a href="/path/to/file" title="some title">Go!</a>

was to become


<a href="/path/to/file" title="other title">Go!</a>

with varying values for the title attribute obviously. So I searched for


    <a href="/path/to/file" title="([^\"]*)">Go!</a>

and Emacs complained about some invalid regex. Darn. I double and triple checked all necessary escape sequences (which I omitted here for clarity) and finally asked the wizard.

Easy, Emacs doesn't konw the * quantifier. Do …

[Read more]
InnoDB: Number of pending reads 128, pending pread calls 0

InnoDB: Number of pending reads 128, pending pread calls 0
InnoDB: Error: InnoDB has waited for 50 seconds for pending
InnoDB: reads to the buffer pool to be finished.

What does this mean? Well searching yahoo search I couldn't find anything other then pointers to source code where this message generated. I assumed it was a disk problem because performance was super crappy.

My disk monitoring software which probes MegaRaidCli didn't report a failed disk. Linux didn't report a failed disk or any errors. So, what is happening? Was I wrong?

Well, in fact it was a failed disk, but not in a state that could be classified as failed by the controller or any other component including the OS. With the help of my bud we found that a disk in the array was in a critical state-the disk itself just had a bunch of sector errors, yet the controller, os, everything would not mark it as dead - so the end result …

[Read more]
MySQL Camp II - Introductions

We have started MySQL Camp II. The first session is Introductions.

I didn’t catch all the employers, but here is part of the list of attendees at the Introduction section. Great to see multiple people from many places including ESPN, priceline.com, Proven Scaling, Solid Tech - sponsors of Dorsal Source, ForSaleByOwner. fontshop.com, 9Mmedia, CafeMom, JP Morgan, …

[Read more]
Heading to FrOSCon

I mentioned this event a few weeks ago, but it's worth a reminder.

The Free and Open Source Software Conference is held in Sankt Augustin, a cheerful place near Bonn.


Some reasons to attend:

  • the organizers are friendly and efficient
  • the program is exciting
  • the food is good,
  • it's full of open source enthusiasts
  • I am meeting with a bunch of friends and colleagues

Do you need more? Ah, yes, my session :)
I will present a topic for database administrators, Monitoring and logging a database server.
But there is also Lenz …

[Read more]
MySQL Camp II - It begins


Well readers, your either here or your not. MySQL Camp II starts today in Brooklyn, New York, at Polytechnic University. Last night’s pre drinks meetup in NYC went well, but today it’s brass tacks time. View Larger Map

For those of you not able to make it, IRC@Freenode #mysql-camp will be the place to hang out to hear what’s happening. If your not at the camp, please identify yourself. Be sure to also check out the Camp Web Site MySQL Camp II for the plans for today and tomorrow.

For those of you not here, MySQL Camp III is already in planning.

Clustering & partitioning

Let’s have a quick look at some clustering examples in the new 3.0 engine:

This example runs all steps in the transformation in a clustered mode. That means that there are 4 slave transformations that run in parallel with each other.
The interesting part is that first of all the “Fixed Input” step is running in parallel, each copy reading a certain part of a file.
The second thing to mention about it is that we now allow you to run multiple copies of a single step on a cluster. In this example, we run 3 copies of a step per slave transformation. In total there are 12 copies of the sort step in action in parallel.

IMPORTANT: this is a test-transformation, a real world sorting exercise would also include a “Sorted Merge” step to keep the data sorted. I was too lazy to redo the screenshots though

The clustered transformations now also support logging to …

[Read more]
Speaking at the FrOSCon conference this weekend

Hmm, long time no post! But I have the perfect excuse: I've actually been on vacation the past two weeks... Now I am back at work and busy trying to make a dent in the mail pile that has accumulated while I was gone. In addition to that, I am assisting with the organization of our MySQL Developer Meeting in Heidelberg, which will take place in September. See Kaj's blog post for more details about it. I am in charge of our community guests and really look forward to meeting most of our internal developers as well as community guests there! This is going to be a fun event.

In addition to that, I am applying the final touches to my upcoming presentation about our ongoing community activities: "Opening the doors (and windows) of the Cathedral", which will be held at the …

[Read more]
Showing entries 36491 to 36500 of 44734
« 10 Newer Entries | 10 Older Entries »