What we weren’t taught

I attended many classes in college: Pascal Programming, Object-Oriented Programming, Functional Programming; Algorithms, Neural Networks, Databases; Project Management, Security, Data Protection; even Algebra and Physics, for some reason. However, there is one thing I do every day that no class ever covered: the proper way to search for a bug.

It sounds silly, but lots of people have no idea how to search for a bug. They get a Jira ticket and start rooting and poking randomly around the source code to see what pops up. Sometimes, their Brownian motion across the functions that make up the program leads them to the bug; most of the time, however, they end up giving up, and the ticket remains open per saecula saeculorum.

There were many times when I sat next to a teammate in the middle of their random walk, and I was itching to take away their keyboard and find the bug myself. Still, I couldn’t do it because I’m a senior engineer and must set a good example and help them grow professionally. And since I want to help you grow professionally, I will now tell you the right way to search for bugs.

It is indispensable to reproduce the error in your development environment. If you don’t do this, how will you even be able to find it or know whether you fixed it? I shouldn’t even need to say it, but sometimes it is tough to resist the temptation to forge ahead when you cannot reproduce the bug. If you cannot reproduce it, check for any differences between your local environment and the environment where the bug appears, then test every single difference to see if that’s the one that makes the bug show up. If you are lucky, that difference will also turn out to be the cause of the bug, and you will be done.
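Listing those differences by hand is tedious, so here is a minimal sketch of how you might do it in Python. It assumes you have already dumped each environment's settings into a dictionary; the function name, the keys and the values are all made up for illustration.

```python
MISSING = "<missing>"  # placeholder shown when a setting exists on one side only


def diff_environments(local: dict, remote: dict) -> dict:
    """Return {setting: (local_value, remote_value)} for every setting that differs."""
    all_keys = sorted(set(local) | set(remote))
    return {
        key: (local.get(key, MISSING), remote.get(key, MISSING))
        for key in all_keys
        if local.get(key, MISSING) != remote.get(key, MISSING)
    }


# Hypothetical settings, invented for this example.
local_env = {"python": "3.12", "TZ": "UTC", "db": "sqlite"}
production_env = {"python": "3.12", "TZ": "Europe/Madrid", "db": "postgres", "cache": "redis"}

differences = diff_environments(local_env, production_env)
for setting, (mine, theirs) in differences.items():
    print(f"{setting}: local={mine} production={theirs}")
```

Each line it prints is a candidate to test: change that one setting locally, try to reproduce the bug again, and move on to the next if nothing happens.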

Now that you can reproduce the bug, you need to find its cause. Since we are software engineers, we will use the scientific method, which works way better than the superstitious method.

A sorcerer, surrounded by mystical symbols and mythological creatures, looks disgusted at his laptop screen. “Another goblin?! I rue the day I programmed my computer to summon demons!”

The first step is to come up with a hypothesis: that is, a provisional explanation of the cause of the bug. The program is reading the wrong file; two people tried to update the same record at once; … The hypothesis must be a plausible explanation of the bug: if you find that an animal broke into your coop and ate all the chickens, your hypothesis should involve a fox, not a giraffe.

The next step is to test the hypothesis. Most people try to look for examples that confirm it, but that’s not what you should do, as you can almost always find many examples in favor of a wrong hypothesis. For example, if you think that all even numbers are greater than 10, there are infinitely many examples “confirming” it. You need to find examples that refute the hypothesis, as just one is enough to prove that it’s not true.
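If you want to see this with your own eyes, here is a small Python sketch of the even-numbers example. The range of numbers checked is an arbitrary choice of mine; any range that starts at zero would do.

```python
def hypothesis(n: int) -> bool:
    """The (wrong) claim from the text: every even number is greater than 10."""
    return n > 10


evens = range(0, 1000, 2)  # arbitrary sample of even numbers

# Looking for confirmation: plenty of examples agree with the wrong claim.
confirming = [n for n in evens if hypothesis(n)]

# Looking for refutation: a single counterexample settles the matter.
counterexample = next((n for n in evens if not hypothesis(n)), None)

print(f"{len(confirming)} examples 'confirm' the hypothesis")
print(f"but {counterexample} refutes it")
```

Hundreds of confirming examples tell you nothing here, while the first counterexample ends the discussion.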

Therefore, think of the inevitable consequences your hypothesis would have if it were true, and see if any of them are not occurring. For example, if you lose all electric power at home and you think it may be an area outage (your hypothesis), this would mean that your neighbors wouldn’t have electricity either (a consequence of the hypothesis), so you can go out and see if any of them have their lights on (an example that refutes it).

After falsifying your hypothesis, come up with a new hypothesis and repeat the procedure until you have one you cannot refute. Then create and test new hypotheses to refine your knowledge until you understand the cause of the bug perfectly and until you know what to do to fix it without introducing two new bugs.
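The whole loop can be sketched in a few lines of Python, using the power outage example. This is only an illustration under my own assumptions: the `Hypothesis` class, the `first_unrefuted` function and the tripped-breaker hypothesis are not part of any real tool; I made them up to show the shape of the procedure.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical observations; "breaker_tripped" is an assumption added for illustration.
observations = {"neighbors_have_power": True, "breaker_tripped": True}


@dataclass
class Hypothesis:
    description: str
    # Each check returns True when a consequence the hypothesis demands is actually observed.
    consequences: list[Callable[[], bool]]


def first_unrefuted(hypotheses: list[Hypothesis]) -> Optional[Hypothesis]:
    """Return the first hypothesis whose consequences all hold, or None if all are refuted."""
    for h in hypotheses:
        if all(check() for check in h.consequences):
            return h
        print(f"Refuted: {h.description}")
    return None


candidates = [
    # An area outage demands that the neighbors have no power either.
    Hypothesis("area outage", [lambda: not observations["neighbors_have_power"]]),
    # A tripped breaker demands that the breaker is actually in the off position.
    Hypothesis("a breaker tripped in my home", [lambda: observations["breaker_tripped"]]),
]

survivor = first_unrefuted(candidates)
print("Still standing:", survivor.description if survivor else "nothing, think again")
```

Note that the surviving hypothesis is not proven, only not yet refuted, which is why you keep adding consequences to check until you understand the bug well enough to fix it.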

This description makes the method sound more complex and slower than just poking around in the code, but I can assure you that it isn’t. With practice, this procedure will become automatic in your mind, and so quick that people will think you work miracles.

The illustration for this Coding Sheet is based on an engraving published in the Dictionnaire Infernal, 1867.