This post, by John Micco, talks about how the Software Developers at Google deal with and minimize the damage of “Flaky Tests”. Micco defines a flaky test “as a test that exhibits both a passing and a failing result with the same code.” He goes on to state that this causes major problems because how difficult it is to find the cause of the flaky tests and the frequency of which these kinds of tests appear. There is also the issues of a flaky tests being dismissed only to find out later that it was a real failure which sometimes results in passing through broken code.
There are several strategies that the developers at Google use to minimize the damage and confusion that is caused by these flaky tests. One way is to only report a true failure if the test fails three times in a row. This, however, can be costly if the test that is being run takes a large amount of time to complete. Another way is that they implement a tool that monitors the flakiness of a test. If the tool determines that the flakiness for a test is too high, it removes the test from the critical path. The last strategy they have is to use another tool that monitors flakiness levels. When the levels change, the tool tries to find the reason for the test’s change in flakiness.