We receive many requests for help with server stalls. They come under various names: lockup, freeze, sudden slowdown. When a stall happens only once or twice a day, it can be difficult to catch in action. Unfortunately, this often leads to trial-and-error approaches, which can drag on for days (or even months) and cause a lot of harm due to the “error” part of “trial-and-error.” At Percona we have become skilled at diagnosing these types of problems, and we can solve many of them quickly and conclusively, with no guesswork. The key is to use a logical approach and good tools.
The process is straightforward:
1. Determine what conditions are observably abnormal when the problem occurs.
2. Gather diagnostic data when the conditions occur.
3. Analyze the diagnostic data. The answer will usually be obvious.
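The first two steps above can be sketched as a small watchdog loop. This is only an illustration under stated assumptions, not the tooling the post goes on to describe. The names `watch`, `sample`, and `collect` are hypothetical. `sample` stands for whatever measurement turns out to be observably abnormal during a stall, and `collect` stands for whatever gathers the diagnostic data. Requiring several consecutive abnormal readings is an assumed design choice to avoid firing on a single brief spike.

```python
import time
from typing import Callable, Optional


def watch(
    sample: Callable[[], float],
    collect: Callable[[float], None],
    threshold: float,
    consecutive: int = 3,
    interval: float = 1.0,
    max_samples: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll sample() and call collect(value) each time the reading stays
    above threshold for `consecutive` polls in a row.

    All names here are hypothetical. Returns how many times collect ran.
    """
    streak = 0      # abnormal readings seen in a row
    triggered = 0   # number of collections performed
    taken = 0       # number of samples taken so far
    while max_samples is None or taken < max_samples:
        value = sample()
        taken += 1
        # Step 1: decide whether this reading counts as abnormal.
        streak = streak + 1 if value > threshold else 0
        if streak == consecutive:
            # Step 2: gather diagnostic data while the condition holds.
            collect(value)
            triggered += 1
            streak = 0  # re-arm so a long stall triggers again later
        sleep(interval)
    return triggered
```

In use, `sample` might wrap a query or a system counter and `collect` might launch a script that saves its output with a timestamp. Both are left abstract here because the right choice depends on what is actually abnormal on the server in question.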
Step 1 is usually pretty simple, but it’s the most important to get right. …