I am currently finishing some features to make a program highly resilient to occasional crashing bugs. A particular function was found to crash on queries of the form
WHERE x IN(NULL), and that crashed the entire program. Now we have a framework for intelligently recovering from arbitrary crashes. I will write more on this in the future, because I think it’s a very interesting thing to share.
In this episode, I want to focus on a related topic: how do you test a program that is supposed to be resilient to bugs you can’t predict? Many new problems are caused by writing clever code that is supposed to detect, avoid, or recover from problems, even known problems. Unknown problems are even riskier.
The approach that has given me a great deal of[Read more...]