I believe in automating as much as possible, and I'm always working to make the day-to-day tasks of operations as smooth as possible. I also try not to be afraid to take good tools and make them better.
Here in Database Ops at Box, we use pt-kill running as a service to constantly monitor our servers and help protect against long-running queries. But our thresholds are pretty generous, and in some cases unforeseen circumstances can send enough queries storming into the database that we have problems before any of them hit the "busy time" threshold. Ditto for idle connections.
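To make the gap concrete, here is a minimal sketch in Python of how a purely time-based threshold can miss a storm. It is a simplified model and not pt-kill's actual implementation: the `Proc` class, the `victims` function, and all of the numbers are hypothetical, chosen only to loosely mirror the "busy time" and idle thresholds described above.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    """One row of a processlist-style snapshot (hypothetical model)."""
    id: int
    command: str  # "Query" for a running query, "Sleep" for an idle connection
    time: int     # seconds spent in the current state


def victims(procs, busy_time, idle_time):
    """Return the ids of connections that exceed either threshold."""
    out = []
    for p in procs:
        if p.command == "Query" and p.time > busy_time:
            out.append(p.id)
        elif p.command == "Sleep" and p.time > idle_time:
            out.append(p.id)
    return out


# Assumed scenario: 500 queries arrive at once, each only 5 seconds old.
storm = [Proc(i, "Query", 5) for i in range(500)]

# Generous thresholds (assumed values): nothing is old enough to match.
print(victims(storm, busy_time=60, idle_time=600))      # []

# Much lower thresholds (assumed values): every query in the storm matches.
print(len(victims(storm, busy_time=3, idle_time=600)))  # 500
```

With the generous settings the server can be saturated while the list of matches stays empty, because no single query has run long enough to cross the line.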
The response is that someone has to be available to manually run another copy of pt-kill with much lower thresholds to clear out these thundering herds. But what if we could let pt-kill handle both the "normal" mode and still protect [Read more...]