Danny’s Law says that the most resilient systems aren’t those engineered to be flawless but rather those that are flexible, observable, and designed with the expectation of inevitable failures. The focus is on early detection and swift mitigation of problems rather than attempting to preemptively bulletproof every possible scenario. This Law also applies to life. As the old saying goes, “Smooth seas do not make skilled sailors.” Resilience is built by facing adversity, not by avoiding it.

The software for my friend Danny’s financial firms was built from scratch by him and his team. Most computer systems are created by trying to imagine all possible scenarios up front, planning them in detail, and then building to that plan. His focus was different. He built rapidly without extensive upfront planning. He expected problems, and the team focused not on avoiding all problems but on being able to detect and handle them as quickly as possible.

Some would call this “planning to fail,” with the implicit assumption that if anything goes wrong one must not have done a good job in planning. I would argue that in complex modern systems, failure is inevitable, and detecting it early, before it becomes catastrophic, then resolving it quickly, is paramount. A close second is making sure that the failure of any piece has as little impact as possible on all the other pieces.

Danny recounted how, in a previous project, his team had spent tens of millions of dollars building a financial system. This included massive stress testing to ensure it could handle the volumes. The system performed marvelously for months, until March 16th, 2020, when COVID-19 caused the largest single-day point drop in the Dow Jones Industrial Average on record. This sent a flood of related financial transactions through his system, and the system crashed, creating millions of dollars in losses. The culprit? A new error detection system had been added to the code to find and alert people about errors. Although it had handled greater volume in stress testing, it failed in production. The system designed to protect against errors instead caused a massive crash.

From that point onward, Danny built systems with the expectation that they would fail; therefore, his focus was on making sure the failures were detected quickly and dealt with. He designed the system such that a failure in any area would have minimal impact on the larger system. This philosophy paid off handsomely and he is known for having built some of the most resilient systems in the industry. Danny is one of the most successful people I know, and it’s because he was always prepared for the system to fail instead of trying to make it not fail.
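To make the idea of containing failures concrete, here is a minimal sketch in Python. It is not Danny’s code; the component names (`pricing`, `alerting`, `settlement`) and the `run_isolated` helper are hypothetical, invented purely for illustration. Each piece runs inside its own boundary, so a failure is detected and logged where it happens, and the remaining pieces keep working.

```python
import logging
from typing import Callable, Dict

logger = logging.getLogger("resilience_sketch")


def run_isolated(components: Dict[str, Callable[[], str]]) -> Dict[str, str]:
    """Run every component; a failure in one is recorded, not propagated."""
    results: Dict[str, str] = {}
    for name, component in components.items():
        try:
            results[name] = component()
        except Exception as exc:
            # Detect the failure at the boundary and report it right away,
            # instead of letting it take down the other components.
            logger.error("component %s failed: %s", name, exc)
            results[name] = f"FAILED: {exc}"
    return results


# Hypothetical components, for illustration only.
def pricing() -> str:
    return "ok"


def alerting() -> str:
    raise RuntimeError("alert queue overloaded")


def settlement() -> str:
    return "ok"


report = run_isolated(
    {"pricing": pricing, "alerting": alerting, "settlement": settlement}
)
```

In this sketch the broken `alerting` piece shows up in `report` as a failure, while `pricing` and `settlement` still complete. A real system would isolate pieces with separate processes, queues, or services, but the principle is the same.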

Even today, in what some people call the “Age of Agile,” there is pushback against this idea, which critics dismiss as “planning to fail.” In most companies, there is still a strong emphasis on extremely large project plans that attempt to cover every eventuality. I have seen many a project manager be extremely proud of their lists with thousands of to-do items and dates on them.

As I mentioned before, the problem is that there are ALWAYS unexpected things. As the saying goes, “Life is what happens to us while we are making other plans.” Of course, systems should be built robustly, so that they resist failure and contain the impact of the failures that inevitably occur. They should be “chaos tested,” meaning that various pieces are taken offline or crashed on purpose to see what happens when they fail. Reactions to failures should be swift. Just don’t think that you can ever eliminate failures.
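As a rough illustration of chaos testing, the Python sketch below takes each piece of a tiny system offline in turn and checks whether a request still gets handled. Everything here is an assumption made for the example: the `Service` class, the failover function, and the request name are hypothetical, and real chaos testing is done against running infrastructure rather than in-memory objects.

```python
from typing import Dict, List


class Service:
    """A stand-in for one piece of a larger system (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.online = True

    def handle(self, request: str) -> str:
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        return f"{self.name} handled {request}"


def handle_with_failover(services: List[Service], request: str) -> str:
    """Try each service in order and return the first successful answer."""
    for service in services:
        try:
            return service.handle(request)
        except ConnectionError:
            continue  # this piece is down, so try the next one
    raise RuntimeError("all services are offline")


def chaos_test(services: List[Service], request: str = "trade-1") -> Dict[str, bool]:
    """Take each service offline in turn; record whether the system survived."""
    survived: Dict[str, bool] = {}
    for victim in services:
        victim.online = False  # break this piece on purpose
        try:
            handle_with_failover(services, request)
            survived[victim.name] = True
        except RuntimeError:
            survived[victim.name] = False
        finally:
            victim.online = True  # restore it before the next experiment
    return survived


outcome = chaos_test([Service("primary"), Service("backup")])
fragile = chaos_test([Service("only")])
```

With a primary and a backup, the system survives the loss of either one. With a single service, the same test exposes the weak point right away, which is exactly the kind of thing you want to learn before production teaches it to you.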

This is equally true in life. As much as we wish to protect our children, or even ourselves, from harm, it is impossible to do so. We can’t even protect them from the day-to-day struggles of dealing with friends and relationships, let alone larger issues such as wars, natural disasters, and economic shocks. While they clearly need protection from debilitating bodily injuries, our job is to make them competent, capable of handling whatever life throws at them, instead of trying to hide them away.

This raises the question of whether Danny’s approach is right for all things, or just a certain class of things. While a staunch advocate of Danny’s way, I long felt that there were exceptions. I would say, “Danny’s Law doesn’t work on everything. You can’t build a satellite iteratively; it must work right the first time.”

And then SpaceX came along and proved that I wasn’t just wrong but wildly wrong. As of this writing, SpaceX, which develops its rockets iteratively, has the largest share of launches in the world, at the lowest cost of any launch provider. Not only did they build these rockets iteratively, doing so made them so efficient that they are now the world market leader!

In Professor Zemel’s Operations class at NYU, which I have mentioned several times before, he talked a lot about quality. One of the principles is that the sooner a problem is dealt with after it is detected, the better the quality of the system or product in the long run. This supports Danny’s Law. Flexibility, observability, and the anticipation of failures are core components of resilient and efficient systems.

I am not saying that planning and striving for excellence are unnecessary. Indeed, they are key to any good product or system. I am saying that we must acknowledge that we cannot anticipate every failure no matter how hard we try and that it is often better to prepare for the inevitable failure than to strive for the perfect system.

In business and in life, the real key to resilience doesn’t lie in creating the unbreakable, but in fostering adaptability, keen awareness, and rapid response. Robust systems and lives are those sculpted through the crucible of challenges, not shielded from them. As Danny showed, a focus on handling failure well is more effective than trying for perfection. We could all learn from Danny’s Law, embracing the inevitable challenges as opportunities for innovation, learning, and enduring strength.



