Efficient Bug Hunting with git bisect: A Step-by-Step Guide

September 4, 2024

git
git-bisect
bug-hunting

One of the biggest challenges when working with version control systems like Git is finding the specific commit that introduces a bug or an unwanted behavior in the code. Manually checking out each commit until you identify the issue is a slow and tedious process. Fortunately, Git provides a powerful tool called git bisect to make this process easier and faster.

In this article, we'll explore how git bisect works, why it's more efficient than manually searching through commits, and finally, we'll see a practical example with a fictitious commit history.

What is git bisect?

git bisect is a tool that uses binary search to locate the specific commit that introduces a bug. Instead of reviewing each commit sequentially, you repeatedly split the range of suspect commits in half, which drastically reduces the number of commits you need to check by hand.

The basic process follows these steps:

  1. You indicate a known commit where the code worked correctly (git bisect good).
  2. You indicate a commit where the code presents the bug (git bisect bad).
  3. Git divides the commits between these two points and takes you to one in the middle for you to check.
  4. Depending on whether that commit is good or bad, you tell Git (git bisect good or git bisect bad), and it will continue to divide the search until you find the exact commit that introduces the issue.

Why and When Should You Use git bisect?

The main benefit of git bisect is efficiency, which comes from binary search. Instead of testing every commit, each step lets you rule out roughly half of the remaining commits between the good and bad points, so the number of tests grows with the logarithm of the range rather than with its length. For example, a range of about 100 commits takes around 7 tests instead of up to 100. This is especially useful when many commits separate the last state you know worked from the point where the bug appears.
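To make the saving concrete, here is a small shell sketch that counts how many halvings it takes to narrow a range of candidate commits down to one. The function name bisect_steps is made up for illustration, and the count is only an approximation for a linear history; it is not something Git itself provides.

```shell
#!/bin/sh
# bisect_steps N: approximate number of tests needed to narrow
# N candidate commits down to a single one (ceil(log2(N))).
bisect_steps() {
    remaining=$1
    steps=0
    while [ "$remaining" -gt 1 ]; do
        # Worst case: the culprit is in the larger half.
        remaining=$(( (remaining + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

for count in 10 100 1000; do
    echo "$count commits: about $(bisect_steps "$count") tests"
done
```

Going from 100 to 1000 commits only adds a few extra tests, which is why bisecting scales so well on long histories.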

When should you use it?

  1. Large commit history: If you have a project with dozens or hundreds of commits between the last known good state and the commit where the error is introduced, manually searching is impractical. git bisect greatly reduces the number of commits you have to check.

  2. Hard-to-locate bugs: If the bug in question is not easily traceable or only appears under certain conditions (e.g., a bug that only occurs in production), it can be difficult to find the problematic commit by reviewing one at a time.

  3. When you need precision: In situations where it's crucial to know exactly which commit introduced a bug, such as in a production project or when working with a team, git bisect allows you to pinpoint the responsible change with accuracy.

How Does git bisect Choose the Commits?

When you run git bisect, Git selects the commit to review by taking the middle point between the commit marked as "good" and the commit marked as "bad". In each iteration, the tool takes you to the commit that splits the remaining commits in two. This allows the range of problematic commits to be quickly reduced.

Each time you mark a commit with git bisect good or git bisect bad, Git rules out roughly half of the remaining candidates and adjusts the range so that the next commit you review is again close to its center. This process repeats until a single commit is left: the one that introduces the bug.
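The narrowing can be sketched with a small simulation. It assumes a strictly linear history whose commits are numbered from oldest to newest, which is a simplification: real Git also handles branches and merges, and chooses the commit to test from the shape of the history. The function simulate_bisect and its arguments are hypothetical.

```shell
#!/bin/sh
# Simulate bisecting a linear history. Commits are numbered 0..N,
# oldest first; commit 0 is known good and commit N is known bad.
# Usage: simulate_bisect N FIRST_BAD
simulate_bisect() {
    good=0
    bad=$1
    first_bad=$2   # the commit that introduced the bug (unknown in real life)
    tests=0
    while [ $(( bad - good )) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        tests=$(( tests + 1 ))
        if [ "$mid" -ge "$first_bad" ]; then
            echo "test commit $mid: bad"
            bad=$mid      # the culprit is mid or something older
        else
            echo "test commit $mid: good"
            good=$mid     # the culprit is newer than mid
        fi
    done
    echo "first bad commit: $bad (found in $tests tests)"
}

simulate_bisect 16 11
```

The loop stops when the good and bad markers sit on adjacent commits, at which point the bad one must be the commit that introduced the bug.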

Practical Example

Suppose you have a commit history where a bug appears at some point. Let’s use git bisect to find the defective commit.

Step 1: Start the process

First, we start the tool with:

git bisect start

Step 2: Define the bad commit

We know that the bug is present in the latest commit on the branch, so we mark this as bad:

git bisect bad

Step 3: Define a good commit

We know that in commit abc1234, everything was working correctly. Therefore, we mark it as good:

git bisect good abc1234

Step 4: Review the commits

Now, Git takes the range of commits between abc1234 (good) and the most recent commit (bad) and will take us to the commit in the middle:

Bisecting: 7 revisions left to test after this (roughly 3 steps)
[def5678] Refactor user authentication logic

At this point, we test the code to see if the bug is present.
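If the check is a command rather than a manual test, its exit status can decide the verdict. The sketch below maps an exit status to the matching git bisect subcommand, following the convention documented for git bisect run: 0 means good, 125 means the commit cannot be tested and should be skipped, and any other status from 1 to 127 means bad. The function name verdict_for and the script ./run_tests.sh are hypothetical.

```shell
#!/bin/sh
# verdict_for CODE: print the git bisect subcommand (good, bad or skip)
# that matches a test command's exit status. Returns non-zero for
# statuses above 127, which do not map to a verdict.
verdict_for() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo good
    elif [ "$code" -eq 125 ]; then
        echo skip
    elif [ "$code" -ge 1 ] && [ "$code" -le 127 ]; then
        echo bad
    else
        echo "exit status $code has no bisect verdict" >&2
        return 1
    fi
}

for result in 0 1 125; do
    echo "exit status $result -> git bisect $(verdict_for "$result")"
done

# Possible use inside a bisect session (./run_tests.sh is hypothetical):
#   ./run_tests.sh; result=$?
#   verdict=$(verdict_for "$result") && git bisect "$verdict"
```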

Step 5: Continue the process

After each test, tell Git the result with git bisect good or git bisect bad. Git keeps halving the range until only one commit remains, and it reports that commit as the first bad commit.

Example Flow:

Let’s imagine this is our commit history (the most recent is at the top):

commit f0e1234 - Feature: add new payment gateway
commit e9f5678 - Fix: minor UI bug
commit d2a6789 - Bug: introduces the crash (this commit is the culprit)
commit c7b678a - Feature: refactor login system
commit abc1234 - Fix: user profile bug (known good commit)

When you start the git bisect process, Git first checks out a commit near the middle of the range; in this history, that is d2a6789. You test it, find that the bug is present, and run git bisect bad, which rules out the newer commits e9f5678 and f0e1234 as the origin of the bug. Git then takes you to c7b678a. Everything works fine there, so you run git bisect good, which rules out c7b678a and everything before it. Only d2a6789 remains, and Git reports it as the first bad commit, the one that introduced the issue.

Step 6: End the process

Once you've found the commit that causes the bug, end the git bisect session with:

git bisect reset

This will return you to the state you were in before starting the process.
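To try the whole flow end to end, the sketch below rebuilds a history like the one in the example inside a throwaway repository and lets Git drive the search. Two things here are assumptions that go beyond the steps above: the "bug" is simply the word crash appearing in a file called app.txt, and the manual testing is replaced by git bisect run, which runs a command at every step and treats exit status 0 as good. The helper make_commit and the file name are made up for the demo.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# make_commit CONTENT MESSAGE: overwrite app.txt and commit it.
make_commit() {
    echo "$1" > app.txt
    git add app.txt
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "$2"
}

make_commit "v1" "Fix: user profile bug"
known_good=$(git rev-parse HEAD)
make_commit "v2" "Feature: refactor login system"
make_commit "v3 crash" "Bug: introduces the crash"
make_commit "v4 crash" "Fix: minor UI bug"
make_commit "v5 crash" "Feature: add new payment gateway"

# Start with HEAD as the bad commit and the first commit as the good one.
git bisect start HEAD "$known_good" >/dev/null

# The test command exits 1 (bad) when app.txt contains "crash", 0 (good) otherwise.
git bisect run sh -c '! grep -q crash app.txt' >/dev/null

# While the session is active, refs/bisect/bad points at the first bad commit.
culprit=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $culprit"

# Return to the commit that was checked out before the session.
git bisect reset >/dev/null 2>&1
```

In a real project the grep would be replaced by whatever command reproduces the bug, such as a failing test.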

Conclusion

git bisect is an incredibly useful tool for tracking down the introduction of bugs in code. By using binary search, you can drastically reduce the number of commits you need to review, saving a lot of time compared to manual searching. If you haven’t used this tool yet, I highly recommend incorporating it into your workflow the next time you need to find the commit that introduced a bug.


Thanks for reading 😊