Effective April 1, I will join Percona full-time as a consultant. I’ll be helping people build high-performance applications with MySQL, but I’ll also be continuing to develop and improve tools such as Maatkit. This career change has been a long time in progress. I’m really looking forward to it, but at the same time it’s hard to leave my current employer, The Rimm-Kaufman Group (RKG). Working with them has been the best job I’ve ever had.
The details can be found here:
http://en.oreilly.com/mysql2008/public/schedule/detail/588
I've since moved on from Flickr to a new job, but Flickr is still
allowing me to give this talk. Flickr is so cool!
The talk encompasses capacity planning and scaling for a heavy
concurrent write and read environment, and when it makes sense to
split resources out to a single application.
There was a proposed project of particular interest to me in the
MySQL list for Google's Summer of Code, an
Obfuscator. This tool would take a schema/dataset
and obfuscate it in such a way that it can be posted on forums or
submitted to MySQL in a bug or support request, without divulging
any sensitive information.
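To make the idea concrete, here is a minimal sketch of one small piece of such a tool: renaming identifiers in a schema. It is an illustration only, not the design of the proposed Summer of Code project. The function name `obfuscate_schema`, the `ident_N` naming scheme, and the sample table are all hypothetical, and a real tool would also need to parse SQL properly and obfuscate the data itself.

```python
import re


def obfuscate_schema(ddl: str, identifiers: list[str]) -> tuple[str, dict[str, str]]:
    """Replace each listed identifier with an opaque placeholder (hypothetical helper).

    Returns the rewritten DDL and the mapping, so the owner can translate
    any answer they receive back to the real names.
    """
    if not identifiers:
        return ddl, {}
    # Assumption: identifiers are plain words; quoting and SQL parsing are ignored.
    mapping = {name: f"ident_{i}" for i, name in enumerate(identifiers, start=1)}
    # Word boundaries keep "ssn" from matching inside "idx_ssn".
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in mapping) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], ddl), mapping


# Made-up example schema, not taken from any real system.
ddl = "CREATE TABLE customers (ssn CHAR(11), salary INT, KEY idx_ssn (ssn));"
obfuscated, mapping = obfuscate_schema(ddl, ["customers", "ssn", "salary", "idx_ssn"])
print(obfuscated)
```

Keeping the mapping is the important design choice here: the structure of the schema survives, so a bug can still be reproduced, while the names that might reveal something are gone.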
If you're currently a student and want to work on this project
and thus be an active part of the Google Summer of Code 2008,
apply quickly (before March 31st) through the GSoC student application form and also join the
MySQL SoC
mailing list!
The Obfuscator …
Just before the Easter holidays I posted this challenge for a MySQL schema. For lack of
submissions so far, I'll leave it open for a little bit
longer.
Perhaps you reckon the challenge sucks ;-) In that case please
comment and tell me why! That'd be good feedback. Otherwise, do take
a stab at it. If you did and got stuck, comment about this too.
Then others can help and move it forward.
Last week I played with queries from the TPC-H benchmark, particularly comparing MySQL 6.0.4-alpha with 5.1. MySQL 6.0 is interesting here, as there are a lot of new changes in the optimizer, which should affect the execution plans of TPC-H queries. In reality only two queries (of 22) have significantly better execution times (more about them in later posts), but what I want to write about is queries that execute slower in the new MySQL 6.0 version.
The query is pretty simple:
SELECT sum(l_extendedprice * l_discount) AS revenue
FROM lineitem WHERE l_shipdate >= date '1995-01-01'
AND l_shipdate < date '1995-01-01' + interval '1' year
AND l_discount BETWEEN 0.09 - 0.01 AND 0.09 + 0.01
AND l_quantity < 24;
with the following execution plan (in 5.1):
- …
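To make the filter in the query above concrete, the following sketch evaluates the same predicates in plain Python over a handful of made-up rows. It says nothing about how MySQL executes the query or about the execution plan; the `revenue` function and the sample rows are hypothetical and exist only to show which rows qualify.

```python
from datetime import date
from decimal import Decimal

# Bounds implied by the WHERE clause of the query.
SHIP_START = date(1995, 1, 1)                       # inclusive
SHIP_END = date(1996, 1, 1)                         # '1995-01-01' + 1 year, exclusive
DISCOUNT_LOW = Decimal("0.09") - Decimal("0.01")    # 0.08, inclusive (BETWEEN)
DISCOUNT_HIGH = Decimal("0.09") + Decimal("0.01")   # 0.10, inclusive (BETWEEN)
MAX_QUANTITY = 24                                   # exclusive


def revenue(rows):
    """Sum l_extendedprice * l_discount over rows that pass every predicate.

    Unlike SQL's SUM, which yields NULL when no row matches, this returns 0.
    """
    total = Decimal("0")
    for shipdate, discount, quantity, extendedprice in rows:
        if (SHIP_START <= shipdate < SHIP_END
                and DISCOUNT_LOW <= discount <= DISCOUNT_HIGH
                and quantity < MAX_QUANTITY):
            total += extendedprice * discount
    return total


# Made-up rows: (l_shipdate, l_discount, l_quantity, l_extendedprice)
rows = [
    (date(1995, 3, 15), Decimal("0.09"), 10, Decimal("1000.00")),   # qualifies
    (date(1996, 1, 1), Decimal("0.09"), 10, Decimal("1000.00")),    # ship date too late
    (date(1995, 6, 1), Decimal("0.05"), 10, Decimal("1000.00")),    # discount too low
    (date(1995, 6, 1), Decimal("0.10"), 24, Decimal("1000.00")),    # quantity not < 24
    (date(1995, 12, 31), Decimal("0.10"), 23, Decimal("500.00")),   # qualifies
]
print(revenue(rows))
```

`Decimal` is used instead of floats so that `0.09 - 0.01` is exactly `0.08` and the boundary comparisons behave as the SQL reads.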
We made a very significant announcement last week, of a collaboration with one of the most (if not the most) security sensitive institutions on earth, the United States government's National Security Agency. They've joined the burgeoning OpenSolaris community, to collaborate with Sun and other community members on the future of ultra-secure operating systems.
To put this in context, community engagement has always been one of the most important ways Sun innovates in the marketplace - we partner with those that have extreme demands (whether it's the world's largest supercomputing facility, or the world's most paranoid security professionals (no offense intended), or the world's largest archival storage facilities), and then we leverage that expertise to create products for the mass market. We let extreme customers teach …
(Credit: Matt Asay)
I've known Steve Pearson for a year or two, and have always been blown away by how aggressive his company, CBS Interactive, has been with adopting open source. MySQL, Linux, Spring, Lucene, etc. etc. The list of open-source projects that CBS Interactive deploys is long.
Why? Why does CBS Interactive use open source? According to Steve:
- Speed of development (rapid prototyping);
- Ease of access (Access to the code as well as documentation);
- Expandability (Ability to contribute back to the core product);
- Cost.
Steve went on to describe three projects that it has moved to open source. It turns out that the company's adoption of open source has evolved over time, based on bad experiences with proprietary software (and its vendors). CBS Interactive replaced and revamped its content management system with open source. It runs its David Letterman …
The FederatedX Pluggable Storage Engine for MySQL, version 0.3, has been released. This release contains a fix for bug #21583 which will allow FederatedX to better handle UTF8 (thanks to a patch from Tetsuro Ikeda). I also had to add logic to "real_query" so that it does not attempt to use values that are not yet defined until get_share is called (such as table->s), since real_query is now used in "ha_federatedx::create" to check whether a foreign data source exists in the first place. I also added a "support_files" directory which will contain useful files such as configuration files for running two servers (testing) and any good information which can be added to help users have an easier time running FederatedX. Another thing I've also started is some sort of test framework for running tests on pluggable storage engines. I always found MySQL's test framework with "mysql-test-run.pl" and miscellaneous …
I recently got this email from a reader on my site and since I haven't posted for a while, I thought it might be a good discussion:
Jay,
I've been reading your site and had a question that you might have good insight for.
I'm working on a high-traffic website that includes forums. All the code is custom PHP, including the forums, simply because the PHP BB software out there doesn't scale as large and as well as I wanted. I've implemented heavy memcached usage, distributed databases, etc. to handle any level of growth we may hit.