Currently I need to move a bit of data around. I like to use
Kettle
for this type of work rather than writing custom scripts for a
number of reasons (which I won't discuss here).
Anyway, here is a quick tip I want to share with whomever it may
concern. It is not rocket science, and many people may go "duh!"
but I hope it will still be useful to others.
Quite often, you need a batch task, like truncating a set of
tables, deleting data, dropping constraints, etc. In Kettle, you
might model this as a job. In this case, each separate
action can be modelled as a step of the job.
The following screenshot illustrates this approach:
So here, each step is just an SQL statement that performs exactly
the task you want and the steps are connected in order to …
I just wanted to point everybody at a recent blog post by Konstantin. In the post he discusses a solution for dealing with cache invalidation issues of very large caches under heavy load. He points out that cache invalidation can severely bog down the system. The general solution he proposes is to simply deactivate the query cache entirely during invalidation. I think this is an important caveat to be aware of, and he is actually asking for feedback on whether this "solution" is acceptable. I think it's awesome that MySQL engineers are giving us the opportunity to provide feedback on such changes. Maybe there should be a dedicated "pipeline" where such requests could be found?
We next go "In the Trenches" with Kevin Henrikson of Zimbra. Zimbra wasn't the first to build a slick email system with a strong AJAX feel, but it has clearly taken the lead among its peers. The backbone of that position is its engineering team, with Kevin at the heart of the organization.
As it turns out, regardless of all the "sex appeal" that Zimbra has in the market (and it has plenty), Kevin's comments reveal that it's community feedback that makes the company tick. Community feedback and an active engineering team that solicits and acts on that feedback, often in real-time. This is the heart of a successful open source business, and Kevin shows us how it's done.
Name, company, title, and what you actually do
Kevin Henrikson, director of Engineering, Zimbra. I currently manage our client engineering team which develops the Zimbra Advanced Client (AJAX based) and Standard Client (JSP/HTML based), the latter …
It has become obvious that there are just too many people to meet up with, and too many locations to travel to, with so little time to do them all. So setting up a temporary office seems to make the most sense! Those who have emailed me have also received the following in their email.
Where?
Lobby Lounge Restaurant/Cafe
Grand Copthorne Waterfront Hotel
392, Havelock Rd
Singapore
When?
Thursday, July 5 2.30pm - 6pm
Friday, July 6 8am - 11am
What to do if I’m not there?
Just drop me an SMS or a quick call to +6-012-204-3201.
This is in addition to the meetup we’re having. Depending on how my meetings on Friday go, there might be yet another afternoon session available.
MySQL Toolkit distribution 620 updates documentation and test suites, includes some major bug fixes and functionality changes, and adds one new tool to the toolkit. This article is mostly a changelog, with some added notes. Many of the tools have matured and I just needed to make the documentation top-notch, but there’s still a lot to be done on the crucial checksumming and syncing tools. Time is in short supply for me right now, though.
I stumbled across this article in the International Herald Tribune today and was shocked by how off such an otherwise reputable publication could be. The general tone of the article was that open source is struggling to grow. I'm not sure how 100 percent year-over-year growth for the prominent commercial open-source start-ups connotes "struggling," but....
On one hand, open-source developers are continuing to struggle to find ways to make money from open-source software, most of which is given away.
But the only way to do so is to work closely with their biggest rivals--proprietary software makers like International Business Machines, Microsoft, SAP, Cisco and Oracle--which also have an interest in limiting erosion to their own sales.
Since when? We have a host of open-source companies jockeying to be first out the …
The only man I know who behaves sensibly is my tailor; he takes
my measurements anew each time he sees me. The rest go on with
their old measurements and expect me to fit them.
--George Bernard Shaw
In an ideal world the operational source would provide a
mechanism for identifying changes made to the data since the last
extract, also known as change data capture (CDC). The source may
provide an update date, a database online log scrubbing mechanism,
audit logs, etc. for this purpose. In real life, many sources
will dump the complete data into a file, and the responsibility
for identifying the changes to the data will fall on the data
warehouse processes. Even if the source provides one of the CDC
mechanisms, it may not be reliable enough.
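When the source only delivers a full dump, one common way for the warehouse side to find the changes is to compare the new extract with the previous one, key by key. The following is a minimal sketch of that idea, not a definitive implementation; the function names (`row_hash`, `detect_changes`) and the sample rows are made up for illustration, and it assumes every row has a stable primary key and that both extracts fit in memory.

```python
import hashlib


def row_hash(row):
    """Hash the non-key column values of a row so rows can be compared cheaply."""
    # The unit-separator character keeps ("ab", "c") distinct from ("a", "bc").
    joined = "\x1f".join(str(value) for value in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def detect_changes(previous, current):
    """Compare two full extracts (hypothetical helper, for illustration only).

    Both arguments map primary key -> tuple of non-key column values.
    Returns (inserted, updated, deleted) as sorted lists of primary keys.
    """
    prev_hashes = {key: row_hash(row) for key, row in previous.items()}
    curr_hashes = {key: row_hash(row) for key, row in current.items()}

    inserted = sorted(k for k in curr_hashes if k not in prev_hashes)
    deleted = sorted(k for k in prev_hashes if k not in curr_hashes)
    updated = sorted(
        k for k in curr_hashes
        if k in prev_hashes and curr_hashes[k] != prev_hashes[k]
    )
    return inserted, updated, deleted


# Made-up sample extracts: yesterday's dump and today's dump.
yesterday = {1: ("Alice", "Boston"), 2: ("Bob", "Denver"), 3: ("Carol", "Austin")}
today = {1: ("Alice", "Boston"), 2: ("Bob", "Seattle"), 4: ("Dave", "Miami")}

inserted, updated, deleted = detect_changes(yesterday, today)
print(inserted, updated, deleted)  # [4] [2] [3]
```

In a real warehouse the previous extract's hashes would typically be stored in a staging table rather than recomputed, so only the new dump has to be read in full.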
The CDC process is straightforward for transactional data, for
example, sales transactions. Since the transactions always come
with effective dates, the new transactions are …
As mentioned before, since FooCamp I've been having ideas around
queue services:
http://krow.livejournal.com/531369.html
http://krow.livejournal.com/530752.html
I've been thinking about this a bit more, and instead of working
on the concept of a straight queue mechanism (like what Oracle
has), I've been thinking more about how web services handle this,
in particular services like Amazon's.
Instead of a flat queue structure, shoot for a temporal queue.
A range select should force rows to go away for a set period of
time, until the timer runs out. This gives the processing
application time to deal with the row, and if it doesn't make it
back in time, the row should reappear to go back in the …
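The temporal queue idea described above can be sketched in a few lines. This is only an in-memory illustration under my own assumptions, not the design from the linked posts and not any real service's API: the class `TemporalQueue`, its methods `put`, `select`, and `ack`, the `hide_seconds` parameter, and the `FakeClock` helper are all hypothetical names. The clock is injectable so the timeout can be demonstrated without actually waiting.

```python
import time


class TemporalQueue:
    """In-memory sketch: selected rows are hidden for a while, then reappear."""

    def __init__(self, hide_seconds, clock=time.monotonic):
        self.hide_seconds = hide_seconds
        self.clock = clock
        self._rows = {}          # row id -> payload
        self._hidden_until = {}  # row id -> time at which the row is visible again
        self._next_id = 0

    def put(self, payload):
        """Add a row; it is visible immediately."""
        row_id = self._next_id
        self._next_id += 1
        self._rows[row_id] = payload
        self._hidden_until[row_id] = float("-inf")
        return row_id

    def select(self, limit):
        """Return up to `limit` visible rows and hide them for hide_seconds."""
        now = self.clock()
        visible = [rid for rid in sorted(self._rows)
                   if self._hidden_until[rid] <= now]
        claimed = visible[:limit]
        for rid in claimed:
            self._hidden_until[rid] = now + self.hide_seconds
        return [(rid, self._rows[rid]) for rid in claimed]

    def ack(self, row_id):
        """The consumer finished in time: remove the row for good."""
        self._rows.pop(row_id, None)
        self._hidden_until.pop(row_id, None)


class FakeClock:
    """Manually advanced clock, used here only to demonstrate the timeout."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
queue = TemporalQueue(hide_seconds=30, clock=clock)
queue.put("job-a")
queue.put("job-b")

first = queue.select(limit=2)       # both rows are handed out and hidden
queue.ack(0)                        # job-a is processed in time
hidden = queue.select(limit=2)      # job-b is still hidden
clock.now = 31.0                    # the timer runs out
reappeared = queue.select(limit=2)  # job-b is back in the queue
print(first, hidden, reappeared)
```

The point of the design is that a crashed consumer needs no cleanup: a row it never acknowledged simply becomes visible again once the timer expires.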
I’ve read through the Top 5 (or more) wishes posted by a number of
MySQL employees as well as by a lot of community members.
It was great to see such wide coverage, as people with different
backgrounds wish for different things: developers have some wishes
to ease the development process, while MySQL DBAs would like stuff
related to operations, like Hot Backup. People actively working on
performance problems, like Kevin, Jeremy, or me, have a bunch of
performance and scaling related wishes.
There is also a good overlap among people's wishes, which shows there are some things that are badly needed.
The great question now is what will happen next. Will any action be taken to target the wishes which make the most sense? I heard Jay is collecting and summarizing these wishes so they will not just be lost, but will any real action be taken?
What I would like to see is MySQL taking the time to discuss and prioritize these internally and …
In my last post I asked for any help on my project. So, thanks
to Jay Pipes for his tip.
One of the biggest problems for developers is documentation. A
good solution first appeared in Java with javadoc. Nowadays many
languages have their own javadoc-like tools. I did a little
research and decided to use jsdoc and
phpdoc in my
project. I am still open to any suggestions