Identifying Software Bugs - Part 2
In the last article, "Various Types of Software Bugs - Vol1", we talked about some basic types of software bugs and how to find them. Every software project runs into a bug at some point in its development, and every software developer has had to debug a project at some point in their career.
Some of these bugs are so common that fixing them is a routine part of every software development process. But not all bugs fall into this category; some are elusive and can take weeks to find and finally fix. They don’t necessarily crop up in every project, but when they hit, they hit hard.
This article will shed light on the dark world of bohrbugs, heisenbugs, mandelbugs, schroedinbugs, and hindenbugs.
Sometimes when you’re going through code, you encounter an error so bizarre that it feels like looking into a dark hole. Some bugs are so complex, it takes days or weeks to understand what caused them. Those are the bugs we’re going to explore today.
Bohrbug
The bohrbug takes its name from physicist Niels Bohr’s atomic model, which depicts the state of an atom and its subatomic particles as predictable.
A bohrbug is a well-defined bug that can be reproduced in a debugging environment with a given set of conditions. Basically, every simple bug is, by this definition, a bohrbug. You can usually fix such a bug with simple debugging techniques.
Perhaps the least exciting bug on our list, the bohrbug is still included for the sake of being thorough.
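Heisenbug
Once again, the name comes from physics, specifically Heisenberg’s uncertainty principle in quantum mechanics. The idea popularly attached to the principle is that observing a system changes it: if you measure an experiment at the beginning and the end, you’ll get different results than if you only measured the outcome. Strictly speaking, that is the observer effect; the uncertainty principle itself limits how precisely certain pairs of properties, such as position and momentum, can be known at the same time.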
The heisenbug is a counterpart to the bohrbug. Far from routine, the heisenbug is named for its tendency to change under observation. If your system has a bug, as soon as you try to debug it, the bug goes away. Which sounds great at first. But if it’s a true heisenbug, it’ll come back in production when you think your work is done.
The simplest reason for this is that the development environment in which the system is debugged doesn’t match the production environment. Different databases, environment variables, or other inputs can lead to different outcomes that don’t trigger the failure.
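To make the environment mismatch concrete, here is a minimal sketch of a failure that only one environment can trigger. The function name chunk_count and the BATCH_SIZE setting are hypothetical and chosen purely for illustration; the environment is passed in as a plain dictionary so the example stays deterministic.

```python
def chunk_count(total_items: int, env: dict) -> int:
    """Return how many batches are needed to process total_items.

    BATCH_SIZE is a hypothetical setting read from the environment.
    """
    batch_size = int(env.get("BATCH_SIZE", "50"))
    # Ceiling division; raises ZeroDivisionError when batch_size is 0.
    return -(-total_items // batch_size)


dev_env = {}                      # variable unset locally, so the default applies
prod_env = {"BATCH_SIZE": "0"}    # misconfigured value that exists only in production

print(chunk_count(120, dev_env))  # 3
try:
    chunk_count(120, prod_env)
except ZeroDivisionError:
    print("fails only with the production setting")
```

On a developer machine the default hides the defect completely, which is why the bug seems to vanish as soon as someone tries to debug it locally.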
Mandelbug
The mandelbug is named after Benoît Mandelbrot, a mathematician who became known for his work on fractal geometry. Fractals are subsets of Euclidean space whose dimensions don’t have to be integers but can be fractional, like 1.843. This makes Mandelbrot’s work rather confusing for many people.
A mandelbug is a bug that appears to be a heisenbug, but is actually a very complex bohrbug. A mandelbug usually crops up in systems that were never designed as a whole. Instead, they have grown over many years, and the number of permutations that define their state is so vast that it’s virtually impossible to recreate the production state in a debugging environment.
Schroedinbug
The name of this bug type, again, comes from quantum physics. The source is a thought experiment proposed by physicist Erwin Schrödinger. In this experiment, a cat is put into a box and a quantum process determines if poisonous gas is released into the box. The idea is that quantum processes can have multiple states at once until they are observed, so if nobody opens the box and looks at the outcome of the quantum process, the cat should also have multiple states at once: dead and alive.
A schroedinbug is a failure that has manifested in production, possibly multiple times, but is only recognized when someone reads the source code. It’s one of those situations where a developer has to change some old code, either to refactor it or to add new features, and discovers code that never should have worked.
These bugs don’t always have important consequences, but they can. An overwhelming amount of software today is used to calculate things humans couldn’t calculate on their own. People in charge then use the numbers produced by such software to make decisions. If the numbers are wrong, this can lead to problems for those who live with these decisions.
Hindenbug
This bug is named after the 1937 disaster of the Hindenburg zeppelin, in which the hydrogen-filled airship caught fire. The disaster led to the end of the zeppelin era.
As you may have guessed, the special thing about a hindenbug is that it leads to a huge disaster. It’s the bug that ends your company, or worse, all companies like yours. It could be a crazy heisenbug or a simple bohrbug, but what matters is that you’re done for.
An example of a hindenbug is the 1996 explosion of an Ariane 5 rocket on its maiden flight. An unhandled overflow during the conversion of a 64-bit floating-point value to a 16-bit signed integer led the rocket to self-destruct about 37 seconds after its main engine ignited. The rocket and its payload of four research satellites were lost.
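The class of defect behind that failure, a narrowing conversion whose input no longer fits the target type, is easy to sketch. The code below is not the flight software; the function names are made up for illustration, and it only contrasts a conversion that silently wraps around with one that refuses out-of-range values and forces the caller to handle them.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_unchecked(value: float) -> int:
    """Mimic a raw conversion that keeps only the low 16 bits (wraps around)."""
    raw = int(value) & 0xFFFF
    return raw - 0x10000 if raw > INT16_MAX else raw


def to_int16_checked(value: float) -> int:
    """Convert to a 16-bit signed integer, refusing values that do not fit."""
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value} does not fit into a 16-bit signed integer")
    return truncated


print(to_int16_unchecked(1234.9))   # 1234, fits fine
print(to_int16_unchecked(40000.0))  # -25536, silently wrong
try:
    to_int16_checked(40000.0)
except OverflowError as exc:
    print("rejected:", exc)
```

Neither variant is safe on its own: the wrapped value is garbage, and the raised error is only useful if something upstream actually handles it.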
How to Exterminate an Exotic Bug
Bugs like these are often hard to find. Existing tools like the Python debugger, the Node.js debugger, or Java debuggers may not be able to help when the failure only shows up in production. Sometimes, it takes users and developers ages just to notice these bugs. So, assuming you do find one, let’s look at methods that can help you get rid of it.
Know Your Application
These strange bug types have many ways to manifest, so your best insurance against them is understanding what’s happening in your application. For massive codebases, you probably won’t be able to find one developer who knows everything, but collectively, everyone on your development team should be able to form the full picture.
Perform Static Analysis
Static analysis tools scan your source code and mark problematic code lines. These tools can find many bugs. They are quick to set up and filled with knowledge about best practices and problematic coding styles. They can find bugs you didn’t even know could happen.
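As a small illustration of what such a tool catches, consider Python’s mutable default argument, a pattern that linters commonly report (Pylint, for example, has a dangerous-default-value check). The function names below are invented for this sketch.

```python
def add_tag_buggy(tag, tags=[]):
    # The default list is created once, at definition time, and shared by all calls.
    tags.append(tag)
    return tags


def add_tag_fixed(tag, tags=None):
    # A fresh list is created on every call that omits `tags`.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags


first = add_tag_buggy("a")
second = add_tag_buggy("b")
print(second is first)      # True: both calls returned the same shared list
print(add_tag_fixed("a"))   # ['a']
print(add_tag_fixed("b"))   # ['b']
```

The buggy version passes any test that calls it only once, which is exactly the kind of defect that is cheaper to catch by scanning the source than by waiting for it to surface.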
Consider Environmental Factors
In the era of cloud computing, it’s often not possible to fully replicate the production environment locally anymore. Though, honestly, “It works on my machine,” was never a good response to a bug anyway.
It's key to get your debugging environment in the cloud as close to production as possible. This helps reduce issues that arise when a project is ready to launch.
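One cheap way to see how far apart two environments are is to diff their settings. The helper below is a minimal sketch; config_drift and the example keys are hypothetical, and in practice the two dictionaries would come from wherever your settings actually live.

```python
def config_drift(dev: dict, prod: dict) -> dict:
    """Return settings whose values differ, or that exist on one side only."""
    missing = object()  # sentinel that cannot collide with a real value
    drift = {}
    for key in sorted(dev.keys() | prod.keys()):
        dev_value = dev.get(key, missing)
        prod_value = prod.get(key, missing)
        if dev_value != prod_value:
            drift[key] = (
                "<unset>" if dev_value is missing else dev_value,
                "<unset>" if prod_value is missing else prod_value,
            )
    return drift


dev = {"DB_ENGINE": "sqlite", "TZ": "Europe/Berlin", "CACHE": "off"}
prod = {"DB_ENGINE": "postgres", "TZ": "UTC", "CACHE": "off", "REGION": "eu-west-1"}

for key, (dev_value, prod_value) in config_drift(dev, prod).items():
    print(f"{key}: debug={dev_value} production={prod_value}")
```

Every line this prints is a candidate explanation for a bug that refuses to show up while you are debugging.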
Use the Right Tools
Tracing and remote debuggers can give you invaluable insights into your application. This is especially true for cloud software that runs in a data center far away. Instrumenting your serverless functions and containers will ensure you always know what’s happening and when.
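At its simplest, instrumentation means recording what was called, with which inputs, what came back, and how long it took. The decorator below is a bare-bones sketch built only on the Python standard library; it is not how any particular tracing product works, and traced and apply_discount are illustrative names.

```python
import functools
import logging
import time

logger = logging.getLogger("trace")


def traced(func):
    """Log arguments, outcome, and duration of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception:
            elapsed_ms = (time.perf_counter() - start) * 1000
            # logger.exception also records the traceback.
            logger.exception("%s args=%r kwargs=%r failed after %.2f ms",
                             func.__name__, args, kwargs, elapsed_ms)
            raise
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info("%s args=%r kwargs=%r -> %r in %.2f ms",
                    func.__name__, args, kwargs, result, elapsed_ms)
        return result
    return wrapper


@traced
def apply_discount(price: float, percent: float) -> float:
    return round(price * (1 - percent / 100), 2)


logging.basicConfig(level=logging.INFO)
apply_discount(200.0, 25)
```

Real tracing tools add correlation IDs, sampling, and a central place to ship the data, but the information they collect per call is much the same.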
Reproduce Your Bugs
By extracting the knowledge you gained from tracing your app in the cloud, you can reproduce the actions that led to your bug locally. Sometimes it’s enough to simply look at a debugging log or a trace of your production environment and see what the problem is. But when a heisenbug hits, you have to nail the state down to reproduce the bug in your development environment.
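If the production trace contains the inputs in order, nailing the state down can be as simple as replaying them against a fresh instance. The sketch below assumes a trace of JSON lines and a hypothetical handle_event function standing in for your application logic.

```python
import json


def handle_event(state: dict, event: dict) -> dict:
    """Hypothetical handler: applies one event to the application state."""
    if event["type"] == "add":
        state["items"] = state.get("items", 0) + event["count"]
    elif event["type"] == "remove":
        state["items"] = state.get("items", 0) - event["count"]
        if state["items"] < 0:
            raise ValueError("inventory went negative")
    return state


def replay(trace_lines):
    """Replay recorded events in order; return (index, error) of the first failure."""
    state = {}
    for index, line in enumerate(trace_lines):
        event = json.loads(line)
        try:
            state = handle_event(state, event)
        except Exception as exc:
            return index, exc
    return None, None


recorded_trace = [
    '{"type": "add", "count": 2}',
    '{"type": "remove", "count": 1}',
    '{"type": "remove", "count": 2}',
]
index, error = replay(recorded_trace)
print(index, error)  # 2 inventory went negative
```

The failing index tells you exactly how much history is needed to reach the broken state, which is the part a debugger alone cannot give you.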
Write Test Cases
When you’ve finally done enough research on a defect and have it reproduced outside of the production environment, you should cast it into a test case that can be replayed at will. This allows you to see if your fixing efforts have any merit.
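A reproduced defect translates almost directly into a regression test. The example below uses Python’s built-in unittest module; average_latency and the incident it refers to are hypothetical and stand in for whatever function your own bug lived in.

```python
import unittest


def average_latency(samples):
    """Fixed version: an empty sample window yields 0.0 instead of crashing."""
    if not samples:
        return 0.0
    return sum(samples) / len(samples)


class AverageLatencyRegressionTest(unittest.TestCase):
    def test_empty_window_from_production_trace(self):
        # Input reproduced from the (hypothetical) incident: no samples arrived.
        self.assertEqual(average_latency([]), 0.0)

    def test_regular_window_still_works(self):
        self.assertAlmostEqual(average_latency([10.0, 20.0, 30.0]), 20.0)
```

Saved in a test module, this runs with python -m unittest. The first test fails against the old code and passes against the fix, which is the evidence you want before shipping.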
Test and Confirm
Don’t just run your test cases once. The heisenbug is flaky by nature, and you need to test it under different conditions. Run your tests often, in random order, and even in parallel. A single green run proves little; many runs under varied conditions give you far more confidence that the issue is actually fixed.
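Most test runners have plugins for repetition and random ordering, but the idea fits in a few lines. The helper below is a rough sketch using only the standard library; run_shuffled is an illustrative name, and threads are just one of several ways to get concurrency.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_shuffled(checks, rounds=20, seed=None, workers=4):
    """Run every check `rounds` times in shuffled order on a thread pool.

    Returns a list of (round_number, check_name, exception), one per failure.
    """
    rng = random.Random(seed)  # log the seed so a failing order can be repeated
    failures = []

    def attempt(check):
        try:
            check()
            return None
        except Exception as exc:  # AssertionError is included here
            return exc

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for round_number in range(rounds):
            order = list(checks)
            rng.shuffle(order)
            # pool.map yields results in the same order as `order`.
            for check, outcome in zip(order, pool.map(attempt, order)):
                if outcome is not None:
                    failures.append((round_number, check.__name__, outcome))
    return failures


def check_sorting():
    assert sorted([3, 1, 2]) == [1, 2, 3]


def check_rounding():
    assert round(2.675, 1) == 2.7


print(run_shuffled([check_sorting, check_rounding], rounds=10, seed=42))
```

An empty list after many rounds is no proof that the bug is gone, but a single entry is proof that it is not.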
Isn’t There a Tool for This?
If only there were a tool that could record and replay your application executions. Recording the endless possible conditions needed to reproduce a mandelbug could save you quite some time, and turning them into test cases afterward would be child’s play.
That’s why we created Sidekick for cloud applications and application performance monitoring. Sidekick tracks how your application behaves in production and lets you reproduce this behavior in your development environment with just a few clicks.
If you have yet to take your first step into the world of Sidekick, you can get started with Sidekick for free.