Category Archives: tradecraft

Flaky test Kessler syndrome

I have been working within a monorepo (that is, a gigundo Git repository with lots of separate projects in it) and participating in maintaining its test health. This includes unit tests.

  • After every commit to the repository, all of the unit tests across the whole thing are run by our CI system. The idea is that the entire monorepo should always build successfully.
  • Also, for every pull request (proposed change), all of the unit tests across the whole thing are run by our CI system. Same idea as above.

It isn’t feasible for individual contributors to do this themselves, as the monorepo is just too big and varied. I don’t even try to compile the entire thing myself, let alone run all of its unit tests. All I do, and expect from others, is work within the projects that are relevant to the task at hand (and their dependencies), including making sure that existing and new tests pass.

We really need existing tests to be reliable. A flaky test that fails randomly once in a while crops up in random test runs in the CI system. For a PR test, it’s annoying because that test failure seldom has anything to do with the proposed change. For a post-commit test, it’s just noise. In both circumstances, someone tasked to investigate test failures has to spend time discovering that the flaky test is at fault – a false alarm. Additionally, someone proposing a PR has to re-run tests so that, hopefully, that flaky test doesn’t flake out again. This all produces friction in the development process.

Say you have multiple flaky tests. Now, a test run may fail if one or more of them fails. If there are only a few flaky tests, the effect isn’t much worse than having one. But, as the number of flaky tests grows, probability starts to be your enemy.
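As a rough sketch of how fast this turns, assume each flaky test fails independently with the same probability p. Then a run with n flaky tests fails with probability 1 - (1 - p)^n. The function name and the 1% failure rate below are illustrative assumptions, not measurements from any real repository.

```shell
#!/bin/sh
# Probability that a test run hits at least one flaky failure,
# assuming n independent flaky tests that each fail with probability p.
# Usage: run_failure_probability <p> <n>
run_failure_probability() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.3f\n", 1 - (1 - p) ^ n }'
}

# Illustration: each flaky test is assumed to fail 1% of the time.
for n in 1 5 20 100; do
    printf '%3d flaky tests: %s\n' "$n" "$(run_failure_probability 0.01 "$n")"
done
```

Under those assumptions, a handful of flaky tests barely registers, while a hundred of them make a failed run more likely than a clean one.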


Software commitments guarantee poor quality

You’ve heard of the iron triangle, right? Of course you have, I think they teach it in coder kindergarten now. Since I couldn’t find any freely-licensed images of it in less than a minute, I drew my own.

As a refresher, you can only pick two of the three points of this triangle. The third one is therefore chosen for you. Well, that’s how it’s supposed to work.


Three Rules of Leadership

These rules are my own idea for how a good leader does their job. I don’t claim they are “The” Three Rules, just a set of them that I like.

Because I am a nerd, they are structured like Asimov’s Three Laws of Robotics. You can break later rules in order to follow an earlier one.

Rule 1. Don’t be stupid.

Above all else, don’t make moronic decisions or take dumb actions. This is essentially unforgivable. A leader has to be intelligent or, some might say, “smart”.

Rule 2. Be fair, unless that would be stupid.

Even if you have to displease somebody, if it’s a fair decision, they should eventually at least understand and accept it. Everyone really cares about fairness, even the very young.

However, it’s a mistake to do stupid things in the name of fairness. See Harrison Bergeron. It helps to remember that fair does not mean equal.

Rule 3. Don’t be a jerk, unless that’s needed to be fair and not stupid.

For you to be effective, those you lead must at least not hate you. When a leader acts, they should be compassionate and understanding, or at least not hostile.

However, if push comes to shove, and you can’t effect change in a fair and intelligent way otherwise, then you’ll have to stop being nice, and get the job done. Can’t please everybody.


I don’t want to expound on these ideas, because, well, that could take a while, and detract from their nice, simple form. I’ve considered them for quite some time, and I think they hold up well when considering what real-world leaders do, and whether they are doing a good job or not.

In praise of incremental development

A good traveler has no fixed plans and is not intent on arriving. – Laozi

The other day I was asked about whether an encryption feature in a product I work on supports user-supplied keys. I had to look back in Jira, years into the past, for the answer, which was: nope. It supports default keys, but we hadn’t had time for supporting user-supplied ones.

While this is a little disappointing, it still was the right thing for us to do. That’s the paradox of incremental development, and why I think it’s hard for developers to fully embrace it.


Hierarchy of “Is it supported?”

As tech lead of Cloudera Altus Director, I’ve often been asked whether some product feature, or use of a particular version of some external software, or use of some external service – some “it” – is supported. I found that it’s really hard to answer these questions. There are different factors involved, like how much testing we’ve done or continue to do, or what we recommend to customers, or say we’ll let them at least get away with. The answer is rarely a binary, absolute yes or no.

So I spent some mental cycles thinking about the question of whether something is supported. Thinking about the different angles, I came up with a hierarchy of support levels. It’s certainly a “1.0” sort of artifact. But, I think it’s a good starting point, and it already helps me answer “Is it supported?” questions today.


Don’t build your tests when you run your tests

You’ve got a Jenkins job set up to run your tests automatically, like you should! Good job. 5 points to Ravenclaw.

git clone git@github.com:joebob/awesome-test-framework.git
cd awesome-test-framework
mvn -DskipTests install
cd ${WORKSPACE}
java -jar awesome-test-framework/awesome-test.jar -d where-my-tests-at run

Wait, what have you done. I’m very disappointed. 10 points from Ravenclaw.

The purpose of your test is to … run your test, not build the tools needed to run the test. You can do that anytime. When you build testing tools as part of the test:

  • you waste time – the build of the test tool will come out the same every time
  • you add an extra source of failure – if building the test tool fails, the test fails, but not because of anything wrong in what you want to test
  • you make the test more difficult to run – because the test run also needs the requirements for building the test tool
  • you make the test longer
  • you reduce test stability – if you build from a branch subject to change, like, say, master
  • kittens die

What should you do instead? You should treat the tool like the project it is, and that means creating builds for it, tagging those builds as releases, and publishing them for download. Then, tests can simply install the tool and use it. This runs quicker, has fewer prerequisites, and is more stable.

curl -O http://repo/awesome-test-1.2.3.jar # BOOM
java -jar awesome-test-1.2.3.jar -d where-my-tests-at run

The obvious counter to this is that it’s an awful lot of work to devote to a test tool that isn’t going to be released to anybody; by always building from source, you always get the latest version of the tool! I’m sorry, but this doesn’t hold up under scrutiny.

  • If you build from source every time, how do you know what the tool is doing? If someone inadvertently breaks the tool, your test will fail on you and it won’t be your fault. If someone changes how the tool is used, your test will fail on you and it won’t be your fault.
  • By now, we all know refactoring is a vital part of software development. If every test is building a tool, that work must be refactored into one build pass, which means it should just be built ahead of any test. DRY.
  • You work in an organization with lots of other people, and when they use the tool, they are the customers of those who wrote the tool. Don’t those customers, your colleagues, deserve the same respect and devotion of effort that you’d give to others? Wouldn’t you love it if someone gave you a one-line installation of a cool tool?
  • Calculate the time that it takes to build a tool. Say it’s one minute. If you run a test hourly that uses the tool but builds it itself, that’s 8,760 builds a year, which wastes about six days of compute time per year that you could use for actually running tests.
  • Test tools need to be rock solid, or else you can’t be sure that your test results are valid. Did everything pass legitimately, or is the tool busted and passing when it shouldn’t? You need to work from known good states of the tool, and that’s what releases are for.

Don’t give in to laziness. Set up binary releases for test tools and install them during tests, just as you would Maven, Git, or any other utility you need to run a test.

Protecting the master branch from you

An ounce of prevention is worth a pound of cure. – Benjamin Franklin

Eighteen new commits on master with questionable commit messages. Yay.

One of the most critical things you do when working with remote Git repositories is pushing your changes to master. Once that is done, everyone else can see what you’ve done. And, if you mess that up, everyone sees your mistakes, and fixing it swiftly becomes a trial requiring team-wide coordination and the ugly push --force command. A couple of months ago I finally messed that up one too many times, so I decided to use Git’s pre-push hook to protect the master branch from my own mistakes.

Essentially, I wanted Git to put up roadblocks to me pushing to master. The pre-push hook can perform checks and reject a push attempt purely on the client side, so it’s a perfect spot for implementing those roadblocks. I took a hook from this useful page and augmented it. While the original hook merely asks if you’re sure you want to push, I added two more checks, neither of which can be overridden interactively.

  1. A file named “ok-to-push” must exist. If it doesn’t, the hook rejects the push. This forces me to stop for a moment and not sweep right into pushing.
  2. No more than one commit can be pushed. Pushing multiple commits at once is a dead giveaway that I merged in a work-in-progress branch, instead of the polished and tested final branch that should be distributed. There is no way around this check except to do one commit at a time.

I posted the hook as a gist.
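For readers without the gist at hand, the two checks described above could be sketched roughly as follows. This is not the hook from the gist. The function names, the location of the ok-to-push file (the directory the hook runs in, normally the top of the working tree), and the use of git rev-list --count to count commits are all assumptions made for illustration, and the interactive "are you sure?" prompt from the original hook is left out.

```shell
#!/bin/sh
# Sketch of a client-side pre-push hook; an illustration, not the author's gist.
# Git passes the remote name and URL as arguments and writes one line per
# pushed ref to stdin: <local ref> <local sha> <remote ref> <remote sha>

PROTECTED_REF="refs/heads/master"
MARKER_FILE="ok-to-push"   # assumed to sit in the directory the hook runs in

# True when the sha is all zeros, which Git uses to mean "no such object".
is_zero_sha() {
    case "$1" in
        *[!0]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Decide whether one ref update may go out.
# Arguments: <remote ref> <number of commits being pushed>
# Returns 1 (reject) with a reason on stderr, 0 otherwise.
check_ref() {
    [ "$1" = "$PROTECTED_REF" ] || return 0
    if [ ! -e "$MARKER_FILE" ]; then
        echo "pre-push: $MARKER_FILE not found, refusing to push to master" >&2
        return 1
    fi
    if [ "$2" -gt 1 ]; then
        echo "pre-push: $2 commits in one push, at most 1 allowed" >&2
        return 1
    fi
    return 0
}

# Count the commits a ref update would add to the remote.
# Arguments: <local sha> <remote sha>
count_commits() {
    if is_zero_sha "$2"; then
        git rev-list --count "$1"          # the branch is new on the remote
    else
        git rev-list --count "$2..$1"
    fi
}

main() {
    while read -r local_ref local_sha remote_ref remote_sha; do
        if is_zero_sha "$local_sha"; then
            count=0                        # a deletion pushes no commits
        else
            count=$(count_commits "$local_sha" "$remote_sha") || exit 1
        fi
        check_ref "$remote_ref" "$count" || exit 1
    done
    exit 0
}

# Only read stdin when this file is installed and invoked as the hook itself.
case "$0" in
    */pre-push|pre-push) main ;;
esac
```

To try it, save the script as .git/hooks/pre-push in a clone and make it executable. Git aborts the push whenever the hook exits with a non-zero status.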

Everyone makes mistakes, but this is one mistake I’m happy to make less often. The hook is enforcing a higher level of discipline, and hopefully, in the end, will become obsolete as I adopt better habits.