Thanks to everyone who came out to the San Francisco PHP and MySQL meetup! Also, thanks to Michael for organizing such a great event, and Percona for sponsoring the food. I put the slides from the talk up on my wiki for reference or in case you missed it. I believe that there will be a video up at some point as well. While down there I also had a chance to stop by Digg and talk to them about Gearman (they’ve been using it for a while). It was interesting to see how they were using it in a large-scale deployment. I was able to get some valuable feedback for future development, and a cool t-shirt. :) Thanks Digg!
I’ll be talking about Gearman this Thursday (October 1st, 2009) at the San Francisco PHP and MySQL Meetup groups (these are two separate groups, but sometimes share the topic). A few other folks involved in the Gearman community should also be there to help out, including James Luedke (the PHP extension main author), Eric Lambert (the Java API author), Dormando, and Hachi (Perl version maintainers at SixApart). You can sign up at either the MySQL or PHP meetup groups. We’ll be discussing the basics for those of you who don’t even know what Gearman is, common use cases, new features, advanced topics for folks already using Gearman, and of course Q&A throughout. Hope to see you there!
Sign up today for OpenSQL Camp 2009! The space is confirmed so go ahead and make your travel arrangements. The event is free and will be taking place in Portland, OR on November 14-15th, 2009. If you are interested in leading a session or presenting a talk, be sure to add it to the session ideas page. Also, we are still looking for sponsors! Please visit the sponsors page if you or your company/organization might be interested. All donations are tax deductible.
We should have representatives from many open source database (and database-related) projects: PostgreSQL, MySQL, Drizzle, memcached, and Gearman, just to name a few. If you work closely with one of[Read more...]
As I outlined in Part 1, MySQL Proxy can be one tool for performing SQL analysis. The catch with any monitoring is that the act of monitoring affects the results, in this case the performance. I don’t recommend enabling this level of detailed monitoring in production; these techniques are designed for development, testing, and possibly stress testing.
This leads to the question: how do I monitor SQL in production? The simple answer is sampling. Take a representative sample of your production system. The implementation of this depends on many factors, including your programming technology stack and your MySQL topology.
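As a rough illustration of the sampling idea, the sketch below records only a small random fraction of the statements it sees. The `QuerySampler` class, its `rate` parameter, and the example query are hypothetical names made up for this sketch; it is not tied to MySQL Proxy or to any particular driver.

```python
import random


class QuerySampler:
    """Keeps roughly `rate` (0.0 to 1.0) of the statements passed to observe()."""

    def __init__(self, rate, seed=None):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rate must be between 0 and 1")
        self.rate = rate
        self._rng = random.Random(seed)  # private generator so runs can be repeated
        self.seen = 0
        self.samples = []

    def observe(self, sql):
        """Count the statement and keep it only if it falls inside the sample."""
        self.seen += 1
        if self._rng.random() < self.rate:
            self.samples.append(sql)
            return True
        return False


# Hypothetical workload: keep about 1% of 10,000 statements.
sampler = QuerySampler(rate=0.01, seed=42)
for i in range(10000):
    sampler.observe("SELECT * FROM orders WHERE id = %d" % i)
print("captured %d of %d statements" % (len(sampler.samples), sampler.seen))
```

A random fraction keeps the overhead roughly proportional to the sampling rate. Sampling a fixed time window or a single application server are alternative choices, depending on the topology.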
If for example you are using PHP, then defining MySQL proxy on a production system, and executing firewall rules to[Read more...]
This blog post is about things I did on my own free time, not endorsed by my employer.
As Brian mentioned, a number of us traveled up to Seattle last week to discuss the road map for Drizzle, Gearman, and memcached. Thanks to everyone who was able to make it! It was great to see folks again (Northscale guys, Robert Hodges, Padraig), and meet a couple new people like Nathan, one of the Google Summer of Code students for Drizzle. I thought I’d take a moment to mention some of the discussions related to the tasks I’m working on.
For Drizzle, we talked about the new configuration and plugin system I’ve been digging into lately. Monty Taylor has been doing a great job refactoring the plugin loading, but there are still some steps to be taken to get things where we want. One of the big goals with all this is to have the plugin and config system not specific to Drizzle at all so we can use this in other projects as well (one being[Read more...]
Linux vs FreeBSD, vi vs emacs, MySQL vs PostgreSQL, your habit or favorite technology vs another’s. At the end of the day there is no winner, just a matter of preference for the task at hand. I learned C++ 13 years ago, I forgot most of my C++ knowledge 10 years ago, I discouraged the use of C++ in this period in between, and in the past year I’ve been re-learning C++ (mostly due to Drizzle). So what did I use after unlearning C++ 10 years ago? I wrote everything in C (and by everything I mean this was my performance programming language of choice). This worked quite well, but it’s an interesting evolution that I think is now coming full circle.
When I first started programming C, it was a bit clumsy, and I look back at my old code and cringe. I began to develop a certain programming style that can best be described as object-oriented C programming due to the conventions used. The structs,[Read more...]
There has been a lot of talk lately about the “Open Source Cloud.” What will it look like, who will be behind it, and can I use it now? These were hot topics at OSCON, and Stephen O’Grady had two excellent posts on them recently as well (one, two). As many other folks do, I see a few major drivers of this: the need for private clouds, the prevention of vendor lock-in (proprietary services), and open source hackers just wanting to be able to extend and fix the source code. Some layers of this open source cloud stack are further along than others, and I’m going to attempt to outline what I’ve found so far along with what I think is missing.
Virtual Machine[Read more...]
I’m going to be giving a talk at the PHP user group here in Portland, OR on August 11th. Details can be found here. If you’re in the Portland area please join the group and come check it out! This will be similar to the Boston MySQL Meetup group talk I gave earlier this month with Patrick, but with more focus on PHP.
There is a lot happening with Gearman right now. Last week at OSCON we received quite a bit of good feedback on the project, which will certainly help direct our priorities moving forward. In the past couple weeks we’ve had the following Gearman releases:
At the July MySQL User Group, Eric Day and Patrick Galbraith spoke about Drizzle, a lightweight, microkernel, open source database for high-performance scale-out applications, and Gearman, an open source, distributed job queuing system.
The slides can be downloaded from http://www.oddments.org/notes/DrizzleGearmanBoston2009.pdf.
The first hour of video, where Eric and Patrick talk about Drizzle, is at http://www.youtube.com/watch?v=hi4cGzFlcuU, and below:
The second part, about 1.5 hours, where Eric and Patrick talk about Gearman, and then illustrate Gearman and Drizzle working together in a custom search[Read more...]
Gearman is an open source generic framework for distributed processing. At OSCON 2009 I attended the Gearman: Build Your Own Distributed Platform in 3 Hours tutorial.
While it’s very easy to install Gearman and follow the first example, if you missed the all-important additional PHP steps listed on just one slide, you may be left with the “Class ‘GearmanClient’ not found” error.
The following are detailed instructions for the installation and configuration of Gearman and PHP on Ubuntu 9.04 Jaunty.
Add the Drizzle PPA to get pre-packaged versions of Gearman.
cp /etc/apt/sources.list /etc/apt/sources.list.orig
echo "deb http://ppa.launchpad.net/drizzle-developers/ppa/ubuntu intrepid main deb-src[Read more...]
Similar to the MySQL and Drizzle user defined functions for Gearman, we now have PostgreSQL functions too! These allow you to submit jobs to the job server from within your SQL query, trigger, or stored procedure. Here’s a snippet of how this looks in PostgreSQL:
shell$ psql test
test=# SELECT gman_do('reverse', 'Hello World!');
   gman_do
--------------
 !dlroW olleH
(1 row)

test=#
Special thanks to Selena Deckelmann for helping get these working!
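For readers new to Gearman, the `gman_do('reverse', ...)` call above hands a payload to whichever worker has registered the function name `reverse`, and returns that worker's result. The sketch below is a toy, in-process model of that hand-off only. `ToyJobServer`, `register`, and `do` are hypothetical names for illustration and are not the Gearman API; a real deployment has the client, job server, and workers as separate processes talking over the network.

```python
class ToyJobServer:
    """In-process stand-in for a job server: maps function names to workers."""

    def __init__(self):
        self._workers = {}

    def register(self, name, func):
        """A worker announces that it can run jobs for `name`."""
        self._workers[name] = func

    def do(self, name, payload):
        """Run a foreground job and hand the worker's result back to the caller."""
        try:
            func = self._workers[name]
        except KeyError:
            raise LookupError("no worker registered for %r" % name)
        return func(payload)


def reverse(payload):
    """The job used in the SQL example: reverse the payload."""
    return payload[::-1]


server = ToyJobServer()
server.register("reverse", reverse)
print(server.do("reverse", "Hello World!"))  # prints "!dlroW olleH"
```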
I’ll be talking about the latest features in Gearman along with covering a few Gearman-powered applications tomorrow, July 14th, at 10AM Pacific time. Follow this link for details on how to sign up: Gearman: New Features and Applications for Distributed Job Queuing (http://mysql.com/news-and-events/web-seminars/display-387.html)
Brian Aker and I will also be talking about Gearman at the PostgreSQL pgDay San Jose on Sunday, July 19th, and at OSCON the week after. See the Gearman presentations page for more information.
Version 0.8 of the Gearman C Server and Library has been released. This includes basic HTTP protocol support, build system improvements, and bug fixes.
Version 0.4.0 of the Gearman PHP Extension has also been released.
If you want to learn more about Gearman, be sure to check out the upcoming Boston MySQL Meetup, MySQL Webinar (http://mysql.com/news-and-events/web-seminars/display-387.html), or one of the events at OSCON (tutorial, session, and BoF).
I’ll be heading back to my home state (Maine) this week for a visit, and while I’m back there Patrick Galbraith and I will be talking at the Boston MySQL Meetup Group on Monday night about Drizzle, Gearman, and how to combine the two with projects like Narada. If you are in the Boston area, be sure to check it out!
I’ve been following the excellent work that Jan, Kay, and others have been doing with MySQL Proxy; it has really matured into a great piece of software. I talked to Jan at the MySQL UC and toyed with the idea of integrating libdrizzle into MySQL Proxy. I’ve also been asked by a number of folks when a Drizzle Proxy project will be started and if it will be as feature-rich as MySQL Proxy. For a while I just said “Someday, I just don’t have the time.” Lately, though, I am hoping we never have a Drizzle Proxy project.
Let me explain.
One of the fundamental ideas in software engineering is code reuse through libraries or modules. Rather than create a Drizzle Proxy[Read more...]
The July meeting of the Boston MySQL User Group will feature Eric Day, a prominent Drizzle developer, talking about Drizzle and Gearman:
In this talk we will discuss two growing technologies: Drizzle and Gearman.
We will explain what the Drizzle project is, what we aim to accomplish, and an overview of where we are at. We will also be introducing the fundamentals of how to leverage Gearman, an open-source, distributed job queuing system. Gearman’s generic design allows it to be used as a building block for almost any use - from speeding up your website to building your own Map/Reduce cluster. We will tie Drizzle and Gearman together and demonstrate how they work in a custom Search Engine application.
Here is the URL for MIT’s Map with the location of this[Read more...]
I just finished adding pluggable protocol support to the Gearman job server. This will enable even more methods of submitting jobs into Gearman. If all the various Gearman APIs, MySQL UDFs, and Drizzle UDFs are not enough, it’s now fairly easy to write a module that takes over the socket I/O and parsing hooks to map any protocol into the job server. As an example module, I added basic HTTP protocol support:
> gearmand -r http &
 29911
> ./examples/reverse_worker > /dev/null &
 29928
> nc localhost 8080
POST /reverse HTTP/1.1
Content-Length: 12

Hello World!
HTTP/1.0 200 OK
X-Gearman-Job-Handle: H:lap:1
Content-Length: 12
Server: Gearman/0.8

!dlroW olleH
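The exchange above is plain HTTP, so a client only needs to frame the request and pick apart the reply. The following sketch, assuming nothing beyond what the session shows, builds the same request bytes and parses a response shaped like the one printed by the server. `build_request` and `parse_response` are hypothetical helper names, and the sample response is copied from the session rather than read from a live server.

```python
def build_request(function, payload):
    """Frame a job as an HTTP POST: the path names the function, the body is the payload."""
    head = "POST /%s HTTP/1.1\r\nContent-Length: %d\r\n\r\n" % (function, len(payload))
    return head.encode("ascii") + payload


def parse_response(raw):
    """Split a raw HTTP response into status line, lower-cased headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    status = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Trust Content-Length when present so trailing bytes are ignored.
    length = int(headers.get("content-length", len(body)))
    return status, headers, body[:length]


request = build_request("reverse", b"Hello World!")

# Response text taken from the session above, with CRLF line endings restored.
sample = (b"HTTP/1.0 200 OK\r\n"
          b"X-Gearman-Job-Handle: H:lap:1\r\n"
          b"Content-Length: 12\r\n"
          b"Server: Gearman/0.8\r\n"
          b"\r\n"
          b"!dlroW olleH")
status, headers, body = parse_response(sample)
print(status, headers["x-gearman-job-handle"], body.decode("ascii"))
```

The request bytes could be written to a socket connected to the job server's HTTP port (8080 in the session above) in place of typing them into `nc`.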
I’ve added a few headers for setting things like background, priority, and unique key. For example, if you want to run the above job in the background:
POST /reverse HTTP/1.1 Content-Length: 12[Read more...]
I’ve been working with Patrick Galbraith for the past couple weeks on a new project that started as an example in his upcoming book. It is a search engine built using Gearman, Sphinx, Drizzle or MySQL, and memcached. Patrick wrote the first implementation in Perl to tie all these pieces together, but there is also a Java version underway being written by Trond Norbye and Eric Lambert that will be shown at the CommunityOne and JavaOne conferences next week. I’ve been helping get the system set up on a new cluster and with the port to Drizzle.
Narada provides interfaces that allow you to submit URLs to be[Read more...]
I’m pleased to announce version 0.6 of the Gearman C server and library. The major new feature of this release is a pluggable persistent queue for the job server. It comes bundled with a libdrizzle module (so your queue can live in Drizzle or MySQL), but Brian has already written a libmemcached module and there is a flat-file module in the works as well. The persistent queue allows background jobs to be stored via the pluggable module, so if the job server crashes or is shut down, the queue module can repopulate the job server with any jobs that were not yet complete. This is just the first version of the queue support, so expect more modules and features in the future!
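To make the replay behavior concrete, here is a toy model of a job server backed by a pluggable store. It is only a sketch of the idea described above: `DictQueueStore`, `ToyJobServer`, and their methods are hypothetical names, the store is an in-memory dict standing in for Drizzle, MySQL, or memcached, and none of this reflects the actual queue module interface in the C server.

```python
class DictQueueStore:
    """Stand-in for a queue module; a real one would write to durable storage."""

    def __init__(self):
        self.rows = {}

    def add(self, handle, function, payload):
        self.rows[handle] = (function, payload)

    def done(self, handle):
        self.rows.pop(handle, None)

    def replay(self):
        """Return every job that was stored but never marked done."""
        return sorted(self.rows.items())


class ToyJobServer:
    def __init__(self, store):
        self.store = store
        self.pending = {}
        # On startup, repopulate the in-memory queue from the store.
        for handle, job in store.replay():
            self.pending[handle] = job

    def submit_background(self, handle, function, payload):
        # Persist first, so a crash after this point cannot lose the job.
        self.store.add(handle, function, payload)
        self.pending[handle] = (function, payload)

    def complete(self, handle):
        del self.pending[handle]
        self.store.done(handle)


store = DictQueueStore()
server = ToyJobServer(store)
server.submit_background("H:1", "resize", b"a.png")
server.submit_background("H:2", "resize", b"b.png")
server.complete("H:1")

# Simulate a crash and restart: a new server instance sharing the same store.
restarted = ToyJobServer(store)
print(sorted(restarted.pending))  # prints ['H:2']
```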
On a related note, James Luedke has also released version 0.3 of the[Read more...]
If you’ve pulled the latest Drizzle code from lp:drizzle, you may have noticed a new plugin/gearman_udf directory in there. This is a new UDF that acts as a Gearman client. This is mostly a port of the Gearman MySQL UDF I wrote, but I did it the proper C++ way to fit in better with Drizzle. It also uses the new plugin system Monty Taylor has been working on, which makes it much easier. :)
To use it, just make sure you have the Gearman C library installed and Drizzle will pick it up and build it for you. No extra configuration required!
The following example assumes you have a Gearman job server and a reverse worker running (see examples/reverse_worker in the C library package).
drizzle> SELECT[Read more...]
I read Jeremy’s post last week on jobs that matter and was also reminded of Tim O’Reilly’s post about the same topic from earlier this year (which is a great read by the way). This got me thinking, and I realized it has been about one year since I made the commitment to myself to get more involved with open source. I had been a passive member for over ten years, but I decided it was time for a change. After looking at various GNU projects, the Linux kernel, MySQL, and a few others, I decided to focus on the MySQL community. After attending OSCON I was even more inspired with the announcement of Drizzle, along with learning about other projects such as[Read more...]
Last week at the MySQL Conference & Expo I made a release right before my first Gearman talk. I didn’t get a chance to blog about it, so here it is now. This is a fairly big release since it includes a major refactoring of the job server to now be threaded. Doing some simple tests on it with a 16-core Intel machine:
The Y axis is total jobs/second, and the X axis is number of clients/workers (8×8 means 8 clients and 8 workers). The clients and workers are the ‘blobslap’ utilities included in the source, and they are shoving as many jobs/second through the server as they can. Job size was random between 0 and 1024 bytes. The job server peaked out at about 43,000 jobs/second, most likely due to the machine being so busy (running[Read more...]