We choose our joys and sorrows long before we experience them
--Kahlil Gibran
MPP is hardware growth (scalability) by adding small 2- to 4-CPU
servers, preferably cheap, to the existing infrastructure. SMP is
hardware growth by replacing the current box with a bigger, badder,
meaner one.
ETL
While Ab Initio does provide hash partition based
parallelism capabilities that can be used in MPP environments,
the effective use of this approach is a non-trivial undertaking.
MPP is rarely used for ETL processing. Contemporary update:
Hadoop and Hive are coming up fast to provide a viable solution
in this space.
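To make the hash partitioning idea above concrete, here is a minimal sketch in Python. It is not Ab Initio code and does not reflect its API; the function name `hash_partition`, the `customer_id` column and the partition count are all assumptions made for illustration. Each row is routed to a partition by hashing its key, so rows with the same key always land on the same node.

```python
import zlib
from collections import defaultdict


def hash_partition(rows, key, num_partitions):
    """Group rows into partitions by hashing one key column (hypothetical helper)."""
    partitions = defaultdict(list)
    for row in rows:
        # crc32 is stable across processes, unlike Python's built-in hash() for strings
        bucket = zlib.crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[bucket].append(row)
    return dict(partitions)


# Illustrative data only: eight rows spread over three hypothetical nodes
rows = [{"customer_id": i, "amount": i * 10} for i in range(8)]
parts = hash_partition(rows, "customer_id", 3)
for bucket in sorted(parts):
    print(bucket, [r["customer_id"] for r in parts[bucket]])
```

The hard part in practice is not the hashing itself but picking a key that spreads the load evenly and keeps related rows together, which is one reason the approach is non-trivial to use well.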
Reporting
It is easy to split reporting load into smaller chunks …
A few months ago I heard about some initial work on the MySQL Proxy software by Jan Kneschke, and I thought about implementing some kind of MySQL Replication Aggregator based on it. The idea was to create a piece of software that could take in many replication streams, merge them and feed the result to a MySQL slave. Such software could be used for backups and many other interesting things. But back then distribution of MySQL Proxy was suspended (afaik, by MySQL AB because of some legal issues).
Today, at last, the MySQL Proxy project has been released to the public, and it has become much more flexible, so I think we need to take a look at it and try to implement such replication aggregator patches for it.
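The merging step of such an aggregator could look roughly like the sketch below. This is plain Python, not MySQL Proxy code (MySQL Proxy is scripted differently), and it assumes things the post does not state: that every replication event carries a timestamp, that each incoming stream is already ordered by it, and that merging by timestamp is the desired policy. The `Event` type and its fields are hypothetical.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    timestamp: int   # assumption: commit time on the originating master
    source: str      # assumption: a label for the originating master
    statement: str   # the replicated SQL statement


def merge_streams(*streams: Iterable[Event]) -> Iterator[Event]:
    """Lazily merge several already-sorted event streams into one."""
    # Ties on timestamp are broken by source name to keep the order deterministic
    return heapq.merge(*streams, key=lambda e: (e.timestamp, e.source))


master_a = [Event(1, "a", "INSERT INTO t1 VALUES (1)"),
            Event(4, "a", "UPDATE t1 SET id = 2")]
master_b = [Event(2, "b", "INSERT INTO t2 VALUES (1)"),
            Event(3, "b", "DELETE FROM t2")]

merged = list(merge_streams(master_a, master_b))
for event in merged:
    print(event.timestamp, event.source, event.statement)
```

A real aggregator would also have to deal with conflicting writes to the same tables and with streams that stall, neither of which this sketch attempts.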
I just want to point you readers to Planet X-Tend, where a bunch of colleagues are posting their musings and rants on different Open Source and related thingies, such as Xen, MySQL, OpenStreetMap, KVM, Fon and of course Linux in general.
Please note this late-breaking news related to this story. Then still please sign our letter! 15 years ago, with the release of Oracle 7.0.12, Oracle gave the world (or at least its customers) something really great: the Oracle Wait Interface (OWI). The OWI is one of the reasons that Oracle's database product and its customer base are what they [...]
Building on the momentum created by OSS Camp-Mumbai, and having received a very positive response from the OSS community, a second Indian OSS Camp has been scheduled for September 8-9, 2007 in Delhi.
The Free Software Foundation today announced the finalized version of the long awaited GPL3 open software license. This was pretty long in the making, more than 16 years since the prior release, but the process, led in large part by Eben Moglen, was deliberate with input from a broad range of users, vendors, developers and lawyers. Pretty much anyone who wanted to comment on the GPL was able to. MySQL participated in the process, both through David Axmark's early discussions and feedback as well as Kaj Arno's chairing of a subcommittee. I'm also glad that the FSF took the extra time to address issues …
The other day I was looking for an open source, feature-rich, high-performance ETL tool to use in an enterprise environment. I was disappointed that nothing really seemed to match my requirements. Have I overlooked something, or is this really a niche where there aren’t any viable projects? After looking in the usual places like sourceforge.net and doing a bunch of Google searches, I could not find any products that fit the bill. Here are (some of) my criteria:
- Fast. The candidate tool has to be able to move huge amounts of information between the source and target databases quickly.
- Flexible error handling. Data errors occur all the time, and when errors are encountered, we should be able to stop processing or log the error to a file or push the record into a violations table for subsequent processing. There are probably other popular strategies for handling errors, such as changing the offending data and trying to insert it …
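The three error-handling strategies named in the criteria above (stop, log, or divert to a violations table) can be sketched as a single policy switch. This is an illustration only, not taken from any existing ETL tool: `OnError`, `load_rows` and the `insert` callback are hypothetical names, a plain list stands in for the violations table, and data errors are assumed to surface as `ValueError`.

```python
import logging
from enum import Enum


class OnError(Enum):
    STOP = "stop"              # abort the load on the first bad record
    LOG = "log"                # write the error to a log and keep going
    VIOLATIONS = "violations"  # set the record aside for later processing


def load_rows(rows, insert, on_error, logger=None):
    """Insert rows one by one; return (loaded_count, rejected_rows)."""
    logger = logger or logging.getLogger("etl")
    loaded, rejected = 0, []
    for row in rows:
        try:
            insert(row)
            loaded += 1
        except ValueError as exc:  # assumption: data errors raise ValueError
            if on_error is OnError.STOP:
                raise
            if on_error is OnError.LOG:
                logger.error("rejected row %r: %s", row, exc)
            else:
                rejected.append((row, str(exc)))
    return loaded, rejected


target = []


def insert(row):
    # Stand-in for a real database insert with one validation rule
    if row["amount"] < 0:
        raise ValueError("negative amount")
    target.append(row)


rows = [{"amount": 5}, {"amount": -1}, {"amount": 7}]
loaded, rejected = load_rows(rows, insert, OnError.VIOLATIONS)
print(loaded, rejected)
```

Making the policy a parameter rather than hard-coding it is what lets the same load job be strict in one run and tolerant in another.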
About a week ago Marten sent me an email pointing to his article published on Jay's blog (come on Marten, it is time for you to get your own blog). I should have replied much earlier but only found the time now. So here is my list:
1. Be Pluggable
Unlike many Open Source projects, MySQL was a single chunk of code, and for years the only official way to extend it was through UDFs, which were very limited. Compare this with other Open Source projects such as PostgreSQL (pluggable indexes etc.), Apache, PHP or the Linux kernel. Yes, in MySQL 5.1 the situation is changing: there are now pluggable storage engines (something even PostgreSQL does not have) as well as Full Text Search parsers, but there is a very long way to go before you could do any significant functionality outside of storage engines as …
We congratulate the Free Software Foundation on the release of GPLv3 and offer our thanks to the many individuals in the open source community who participated in the process of drafting the license.
It’s good to see overall improvements in GPLv3 over GPLv2, when it comes to compatibility with other Free/Open Source Software licenses, to the compatibility with other legislations than the US legal system, and to strengthened incompatibility with Software Patents. I am also happy if the work of the Committee B ends up contributing to a better adoption of GPLv3. I am in awe as to the patience and skillful diplomacy with which Eben Moglen could tame the group consisting of everything from techies from comparatively small companies (like Trolltech and ourselves) to the seniormost lawyers from the biggest Fortune 500 companies.
MySQL will continue to monitor the industry’s reaction and adoption of the new license, …
We spent a lot of time this month trying to fix Bug#21074 "Large
query_cache freezes mysql server sporadically under heavy
load".
In a nutshell, invalidation of a table can be dead slow (seconds)
when there are tens of thousands of cached queries associated
with this table, and, moreover, invalidation
freezes the entire server when it happens.
It's so funny, this thing happens under two singleton mutexes
(one instance of the mutex exists in the entire server) both of
which are required for every single query that the server
gets.
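The shape of the problem can be shown with a toy model. This is a deliberately simplified Python sketch, not the server's actual C++ implementation: it uses one global lock where the real code has two mutexes, and the class and method names are made up. What it does preserve is the key property: invalidating a table walks every cached query tied to it while holding a lock that every lookup also needs.

```python
import threading
from collections import defaultdict


class ToyQueryCache:
    """Toy model: one server-wide lock guards every operation."""

    def __init__(self):
        self._lock = threading.Lock()       # single instance for the whole "server"
        self._results = {}                  # query text -> cached result
        self._by_table = defaultdict(set)   # table name -> query texts using it

    def get(self, query):
        with self._lock:                    # every single lookup takes the lock
            return self._results.get(query)

    def put(self, query, tables, result):
        with self._lock:
            self._results[query] = result
            for table in tables:
                self._by_table[table].add(query)

    def invalidate(self, table):
        with self._lock:                    # lookups are blocked for the whole loop
            queries = self._by_table.pop(table, set())
            for query in queries:           # cost grows with the number of cached queries
                self._results.pop(query, None)
            return len(queries)


cache = ToyQueryCache()
for i in range(10_000):
    cache.put(f"SELECT * FROM t WHERE id = {i}", ["t"], [i])
dropped = cache.invalidate("t")
print(dropped)
```

While `invalidate` runs, any thread calling `get` simply waits, which is the freeze described above in miniature.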
Invalidation is indeed somewhat slow, but making it a bit faster
will only shift the threshold when the query cache becomes
unusable from tens of thousands of cached results, to, say,
hundreds of thousands. So we thought it'll only change the depth
of the hole in which people will discover they've shot
themselves in the foot.
Besides, any change of that sort requires quite …