Git Backtracking Debugging Technique

Your git history

There are 2 great pieces of advice when you're doing firmware development:

  1. Use Git and GitHub
  2. Make commits as small as practically possible

Commits are free, so commit as often as you can. Doing so makes tracking code changes much easier.

Why Should I Track My Code?

Because over the course of a firmware project, you will make hundreds, if not thousands, of changes to the code. You will introduce new features, try out a new architecture, refactor code smells here and there, pay down technical debt, and do hundreds of other things to the code.

But because we're human, we can make errors.

The errors range from minor typos to revenue-affecting critical bugs. At some point in our embedded careers we will all make mistakes; even senior engineers still do. That is why we back up our work: we want to be able to restore order when something terrible happens. In other words, a backup is insurance.

Code tracking is a form of backup.

It offers an unlimited and easy way to back up every step we make. This article explains how to use Git effectively to help your debugging.

On April 6th, 2021, a university researcher submitted a patch to the Linux kernel maintainers. The maintainers accepted the patch and merged it into the kernel code.

It turned out that the researcher had injected malicious code into the patch as part of his university research, treating the Linux kernel community as his lab guinea pig.

Maintainers were not happy with this and instantly ripped out his code from the kernel.

Without code tracking (in this case, Git), the maintainers would have had a tedious day hunting down the malicious code and deleting it line by line.

It is another example of how Git makes your life easier in many situations.

The researcher and the entire university have been banned from submitting future patches. Very sad.
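
Ripping out an unwanted change like this is what git revert is for. Here is a minimal sketch in a throwaway repository. The file names and commit messages are made up for illustration, and this is not a reconstruction of what the kernel maintainers actually ran.

```shell
set -eu

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Three small commits; the second one is the unwanted patch (hypothetical).
echo "driver code" > driver.c
git add driver.c
git commit -q -m "Add driver"

echo "suspicious code" > backdoor.c
git add backdoor.c
git commit -q -m "Add suspicious patch"
bad_commit=$(git rev-parse HEAD)

echo "more driver code" > driver2.c
git add driver2.c
git commit -q -m "Add second driver"

# Undo only the unwanted commit. History is kept: revert adds a new
# commit that applies the inverse of the bad one.
git revert --no-edit "$bad_commit" > /dev/null

commit_count=$(git rev-list --count HEAD)
git log --oneline
```

The later commit survives untouched, and the log still records both the bad patch and its removal. A clean revert like this is only easy because the unwanted change lives in its own small commit.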

How to Pinpoint The Bug with Git

Let's say you have a project with commits like this:

Each circle represents a commit. You have 5 days' worth of commits.

On April 5, you make a commit and test the code. Suddenly, a wild bug appears and crashes the whole system. You reset the device, but it still crashes.

You're confused because you just added a minor feature. It shouldn't crash.

In this situation, you decide to go back to the Apr 4 commit:

git checkout commit-apr4-hash

You rebuild the project and poof, the bug is gone!

At this point, you can reasonably assume that commit-apr5 is causing the problem.
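
The same step can be repeated further back through the history. Below is a rough sketch of that loop in a throwaway repository. The firmware_ok function is a hypothetical stand-in for "rebuild, flash, and see whether it crashes", and the commit names are made up.

```shell
set -eu

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Five small commits; the last one introduces the (simulated) bug.
for day in 1 2 3 4 5; do
    echo "change from Apr $day" >> firmware.txt
    if [ "$day" -eq 5 ]; then echo "BUG" >> firmware.txt; fi
    git add firmware.txt
    git commit -q -m "Apr $day"
done

# Hypothetical check: succeeds when the simulated bug is absent.
firmware_ok() {
    ! grep -q BUG firmware.txt
}

branch=$(git symbolic-ref --short HEAD)   # remember where we started
last_good=""

# git rev-list prints commits newest first, so this walks backwards.
for commit in $(git rev-list HEAD); do
    git checkout -q "$commit"
    if firmware_ok; then
        last_good=$(git log -1 --format=%s)
        break
    fi
done

git checkout -q "$branch"                 # leave the detached HEAD state
echo "last good commit: $last_good"
```

Checking out a commit hash puts the repository in a detached HEAD state, so the sketch records the branch name first and switches back to it at the end.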

If you have hundreds of commits, this process can be tedious. Use git bisect to help you automate the pinpointing process.
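
As a minimal sketch, assuming the bug can be detected by a command that exits 0 on a good commit and 1 on a bad one, git bisect run can drive the whole search. The repository, the commit messages, and the grep-based check below are all hypothetical; a real firmware check would build and test the code instead.

```shell
set -eu

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Ten small commits; commit 7 introduces the (simulated) bug.
bad_commit=""
for i in 1 2 3 4 5 6 7 8 9 10; do
    echo "feature $i" >> firmware.txt
    if [ "$i" -eq 7 ]; then echo "BUG" >> firmware.txt; fi
    git add firmware.txt
    git commit -q -m "commit $i"
    if [ "$i" -eq 7 ]; then bad_commit=$(git rev-parse HEAD); fi
done
first_commit=$(git rev-list --max-parents=0 HEAD)

# Tell bisect the known-bad end (HEAD) and the known-good end.
git bisect start HEAD "$first_commit" > /dev/null

# Bisect checks out commits in between and runs the check on each one.
git bisect run sh -c '! grep -q BUG firmware.txt' > /dev/null

# While the bisect session is open, this ref names the first bad commit.
first_bad=$(git rev-parse refs/bisect/bad)
git bisect reset > /dev/null 2>&1

echo "first bad commit: $(git log -1 --format=%s "$first_bad")"
```

Because bisect halves the range at every step, it needs only a handful of checks even for hundreds of commits. When the check cannot be scripted, the same search works by hand with git bisect good and git bisect bad after each manual test.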

Find the Root Cause

Use git diff to find the code changes that caused the issue. Put the older commit first so the diff reads as what the newer commit changed:

git diff commit-apr4-hash commit-apr5-hash
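
When a diff touches several files, it can help to start with a summary and then narrow it down to one path. This is a small sketch in a throwaway repository; the file contents and commit messages are invented. The older commit goes first here, so lines marked with + are what the newer commit added.

```shell
set -eu

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Known-good commit (hypothetical content).
printf 'int buf[4];\nfor (i = 0; i < 4; i++)\n' > main.c
echo "notes" > README.txt
git add .
git commit -q -m "Apr 4: working"
good=$(git rev-parse HEAD)

# Suspect commit: changes two files, one of them with an off-by-one.
printf 'int buf[4];\nfor (i = 0; i <= 4; i++)\n' > main.c
echo "more notes" >> README.txt
git add .
git commit -q -m "Apr 5: minor feature"
bad=$(git rev-parse HEAD)

# Step 1: which files changed, and by how much?
summary=$(git diff --stat "$good" "$bad")
echo "$summary"

# Step 2: full diff, limited to the file that looks suspicious.
changes=$(git diff "$good" "$bad" -- main.c)
echo "$changes"
```

The summary lists both files, while the path-limited diff shows only the loop-bound change in main.c.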

You might find something silly, like an off-by-one or use-after-free error. Or you might find something more serious, like a race condition, a deadlock, or task starvation. Now it's your job to figure out how to solve it.

You might ask: what if the git diff output is too big, say 1000 lines?

Well, glad you asked. That's why I said at the start to make commits as small as practically possible. This prevents your git diff from becoming a mammoth.

What if you already made a big diff? Well, good luck finding the root cause haha 😁 

Update: A Small Example

You won't believe it. I wrote this article on Apr 18th, 2024, and one day after publishing it I hit a bug that I solved using this technique. I guess I have to walk the talk. Let's go.

I was working on my employer's firmware code. I had this commit history:

4 commits in a morning. Make sure to keep them small.

The problem was that after the latest commit (Add WiFi button), the firmware crashed without a clear explanation:

I use an ESP32, and this is the system crash log.

I applied the backtracking debugging technique by checking out the earlier commits one by one. Suddenly, the bug vanished at commit 10dfaaea.

Checking the diff, I saw that I had only touched a small part of the files. Nothing weird.

Upon further inspection, I noticed that the bug appeared when I added a new FreeRTOS task. The problem was that I didn't properly terminate the task using vTaskDelete(). A FreeRTOS task function must never simply return; a task that is finished has to delete itself.

I can't show the code because it's super secret. I later added vTaskDelete(NULL), which deletes the calling task, and the problem was gone.

Happy day! Sometimes life forces you to apply what you preach. 😀
