I’m a big fan of Aesop’s fables. I use them with my kids to remind them to always tell the truth (The Boy Who Cried Wolf) and to appreciate the things that they have (The Dog and the Shadow). I thought it might be interesting to use some of the hundreds of fables attributed to Aesop to discuss maintenance and reliability principles. This is the first post in what I hope will be a long series.
For those not familiar with the fable of “Belling the Cat”, here it is.
Long ago, the mice had a general council to consider what measures they could take to outwit their common enemy, the Cat. Some said this, and some said that; but at last a young mouse got up and said he had a proposal to make, which he thought would meet the case. “You will all agree,” said he, “that our chief danger consists in the sly and treacherous manner in which the enemy approaches us. Now, if we could receive some signal of her approach, we could easily escape from her. I venture, therefore, to propose that a small bell be procured, and attached by a ribbon round the neck of the Cat. By this means we should always know when she was about, and could easily retire while she was in the neighbourhood.” This proposal met with general applause, until an old mouse got up and said: “That is all very well, but who is to bell the Cat?” The mice looked at one another and nobody spoke. Then the old mouse said: ‘It is easy to propose impossible remedies.’
When I read this, I imagined the general council as a group getting together after an incident. Without the benefit of a formal Root Cause Failure Analysis (RCFA), one mouse quickly threw out an idea that addressed only a symptom (the Cat’s stealthy approach) rather than the underlying problem of a mouse-eating cat running loose around the house. Most of the other mice thought it was a great idea; after all, no one else was suggesting anything. Then the old mouse (in my mind he is the Reliability Engineer) got up and challenged the “solution”. Even if it might have helped in the short term, how was this “solution” going to be implemented?
I know it’s just a fable, but consider how many times you’ve reacted quickly to a situation without thinking about the implementation strategy or whether the “solution” would even eliminate the problem. Resist the urge to rush to a resolution just for the sake of completing a checklist. The “solution” may never be completed, and even if it is, the problem will likely come back because the root causes were not addressed.
A good RCFA process will follow these six steps:
- Notification: Identify the problem and the extent of its consequences
- Clarification: Based on the degree of impact, decide whether an RCFA is necessary and, if so, which tool would work best
- Root Cause Analysis: Perform the analysis and look for latent root causes
- Corrective Action Evaluation: Develop meaningful solutions that can be acted upon
- Verification: Double-check that the solution mitigated the original issue and didn’t create new ones
- Documentation: Write down and share your findings and solutions with others in your organization
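For teams that track incidents in software, the six steps above can be sketched as an ordered checklist that refuses to jump ahead. This is only a minimal illustration in Python: the `Stage` and `Incident` names are hypothetical and not part of any RCFA standard or tool, and a real system would capture far more than a free-text note per step.

```python
from enum import IntEnum


class Stage(IntEnum):
    """The six RCFA steps from the list above, in order."""
    NOTIFICATION = 1
    CLARIFICATION = 2
    ROOT_CAUSE_ANALYSIS = 3
    CORRECTIVE_ACTION_EVALUATION = 4
    VERIFICATION = 5
    DOCUMENTATION = 6


class Incident:
    """Hypothetical incident record that only lets steps be completed in order."""

    def __init__(self, title):
        self.title = title
        self.notes = {}  # maps each completed Stage to its free-text note

    def next_stage(self):
        """Return the first step not yet completed, or None when all are done."""
        for stage in Stage:
            if stage not in self.notes:
                return stage
        return None

    def complete(self, stage, note):
        """Record a step, refusing to skip past an unfinished earlier step."""
        expected = self.next_stage()
        if expected is None:
            raise ValueError("all six steps are already complete")
        if stage != expected:
            raise ValueError(
                f"cannot record {stage.name}; next step is {expected.name}"
            )
        self.notes[stage] = note

    def is_closed(self):
        """An incident is closed only after every step has been recorded."""
        return self.next_stage() is None


# The young mouse jumps straight to a corrective action.
incident = Incident("Cat loose in the house")
incident.complete(Stage.NOTIFICATION, "Mice are being caught by the Cat")
try:
    incident.complete(Stage.CORRECTIVE_ACTION_EVALUATION, "Bell the Cat")
except ValueError as err:
    print(err)
```

In this sketch the old mouse’s objection is built into the record: a corrective action cannot be logged until the clarification and root cause analysis steps are done, and the incident does not count as closed until verification and documentation are recorded too.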
Does your organization have a formalized RCFA process? Do you think having one would avoid situations like the one in the fable? Please let me know your thoughts in the comments below.