A Fragile Test is a test that works - most of the time.
A test that suddenly fails even though the tested code is still correct is called a Fragile Test. Fragile Tests can undermine everything you worked so hard for when writing your test suite.
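To make the definition concrete, here is a minimal sketch in Python. The function greeting and both tests are hypothetical and exist only for illustration. The code under test is correct in both cases. The first test reads the real clock, so whether it passes depends on when the build happens to run. The second pins its input, so it fails only if the code itself is wrong.

```python
from datetime import datetime


def greeting(now: datetime) -> str:
    """Hypothetical code under test: correct for every time of day."""
    return "Good morning" if now.hour < 12 else "Good afternoon"


def fragile_test() -> bool:
    # Reads the real clock, so the outcome depends on when the test runs,
    # not on whether greeting() is correct.
    return greeting(datetime.now()) == "Good morning"


def stable_test() -> bool:
    # Pins the input to 09:00, so the outcome depends only on greeting().
    return greeting(datetime(2024, 1, 15, 9, 0)) == "Good morning"
```

A failure of the fragile version in an afternoon build cannot be reproduced the next morning, which is one way a test can end up working only most of the time.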
One goal in unit testing or Test Driven Design is to shorten the feedback loop. The shorter the time between making a mistake and discovering that mistake, the simpler, faster, and most importantly cheaper it is to fix.
There are other advantages in testing, particularly in Test Driven Design, but the shortened feedback loop is one of the biggest advantages of unit testing, if not the biggest. A short feedback loop gives us trust in the functionality of our software and, even more importantly, allows us to introduce change confidently.
To get the benefit of the shortened feedback loop, two things have to happen: we have to execute our tests frequently, and we have to be able to trust their results.
The longer the time between test case executions, the longer, naturally, it takes for us to receive that feedback. Therefore, only if we regularly and frequently execute all our test cases will we be able to keep that feedback loop short.
That we also need to trust the results might seem obvious, but let me dive a little deeper into that and show you how fragile tests can undermine that trust.
I once worked with a client who had made an odd observation. Their developers were avid unit testers, and they had implemented a Continuous Integration (CI) loop that ran all tests automatically every time someone checked in a code change to source control.
With that process in place, their code base was clean and mostly defect-free. However, occasionally a defect would slip through the cracks, and the frequency of that happening had increased lately. They were aware that even the best unit test suite would not guarantee freedom from defects. But what struck them as odd was that for most of these defects there actually was a unit test that should have caught the problem, but that test had been disabled.
To get to the bottom of this, I first looked at the tests that had been disabled. I had seen this behavior before and had a hunch. I quickly determined that quite a few of the disabled tests were fragile. What had happened was that the developers kept getting emails about CI failures. When they went to investigate, they could not replicate the problem, so after a while they just started to disable these tests. That led to a mindset where the first reaction to an unexpected test case failure was the assumption that the test was broken.
If the developers were not able to see immediately what the problem was, they just assumed it to be "one of those flukes" and disabled the test so it would not stop them from doing their work. The intention certainly was to go back later and investigate what the problem really was, and some of those disabled tests even made it onto the backlog, but there was always more important work to be done.
Because of fragile tests, the developers had developed a habit of disabling failing tests. That habit led to more and more holes in their first line of defense. These holes then allowed more and more defects to slip through.
What we can all learn from this experience is that fragile tests tend to undermine the effort put into unit testing, so they need to be avoided at all costs.
Several root causes can lead to fragile tests. This post is the first in a series that will discuss ways to identify and prevent them. Below is a list of the posts already published.