commit 4e8297ef49117693250871473e0dd690e00baecb (HEAD)
    Add another argument
 Dockerfile  | 1 +
 README.md   | 1 +
 install.cmd | 2 +-

commit 06088938b51d2546fa668ff0d635a4464baa6d17
    Installation script: fix typo in new argument
 install.cmd | 2 +-

commit 788efd908a74834e23c6ee435553bf556d29835f
    Add argument to installation script
 install.cmd | 3 ++-
Before you continue reading, consider whether you would want to change anything about these commits.
What are the problems with this series of commits?
There are three problems with this commit series:
- Commit amendments as individual commits: The second commit in the series does not implement new functionality but merely amends the first commit by fixing a typo.
- Incoherent commits: The last commit also updates a Dockerfile, which does not have anything to do with the modification to the installation script.
- Commit message does not match the content: The last commit does not mention the Dockerfile change in its message.
How to improve this series of commits?
We can improve the commit series by adjusting the commits in the following way:
- Create one commit that introduces the two new install script arguments. This commit shall also contain the update to the README.
- Create another commit for the adjustment of the Dockerfile.
Both of these tasks can be achieved using interactive rebases. Let’s start by creating a commit that adds the new install script arguments.
Since we want to rewrite the last three commits, we can use
HEAD~3 as the reference for the rebase.
We can start the interactive rebase session by issuing:
git rebase -i HEAD~3
This will automatically open a text editor that displays the following text:
pick 788efd9 Add argument to installation script
pick 0608893 Installation script: fix typo in new argument
pick 4e8297e Add another argument
Note that the commits are listed here in the reverse order of the output from
git log, which means that the newest commit in the series is shown at the bottom.
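The ordering difference can be checked without opening an editor. The following is a minimal sketch that builds a throwaway repository (the file name file.txt and the commit messages are made up for illustration) and prints the same three commits once in the default git log order and once with --reverse, which matches the order of the rebase to-do list:

```shell
set -e
# Scratch repository; file name and commit messages are purely illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
for n in 1 2 3 4; do
    echo "$n" >> file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

# Newest first, as git log prints by default.
log_order=$(git log --format=%s HEAD~3..HEAD)
# Oldest first, matching the order of the rebase to-do list.
todo_order=$(git log --reverse --format=%s HEAD~3..HEAD)
printf 'git log order:\n%s\n\nto-do list order:\n%s\n' "$log_order" "$todo_order"
```

The range HEAD~3..HEAD selects exactly the commits that git rebase -i HEAD~3 would offer for editing.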
To fix our commit series, we will perform the following steps:
- Edit the last commit in order to extract the Dockerfile into a new commit.
- Squash the previous commits involving the addition of installation script arguments.
Editing previous commits
To edit the last commit in the series, we replace
pick with
e, which indicates that we want to edit that commit:
pick 788efd9 Add argument to installation script
pick 0608893 Installation script: fix typo in new argument
e 4e8297e Add another argument
Next, we save the rebase file and close the editor. Git then displays the following message:
Stopped at 4e8297e... Add another argument
This means that we’re ready to modify the commit. To create a new commit containing only the Dockerfile, we first undo the commit by resetting to the previous commit, which keeps the changes in the working tree:
git reset HEAD~
After the reset, all of the commit’s changes have become unstaged:
Unstaged changes after reset:
M Dockerfile
M README.md
M install.cmd
We can now split the files into two commits:
- A commit for the install script containing README.md and install.cmd
- A commit for the Dockerfile containing only Dockerfile
To create a commit containing README.md and
install.cmd, we enter:
git add README.md install.cmd
git commit -m "Add second argument and update README"
To create a commit containing only
Dockerfile, we enter:
git add Dockerfile
git commit -m "Update dockerfile"
Since we’re finished with this commit, we can continue to rebase:
git rebase --continue
This finishes the rebase and we can go on to verify the effectiveness of our changes using
git log --stat:
commit 08c4be384bcd3f360c1c0998f54990116e1c1818 (HEAD)
    Update dockerfile
 Dockerfile | 1 +

commit 80d323410a05d1a6339477d60cd0a529b47eee2e
    Add second argument and update README
 README.md   | 1 +
 install.cmd | 2 +-

commit 06088938b51d2546fa668ff0d635a4464baa6d17
    Installation script: fix typo in new argument
 install.cmd | 2 +-

commit 788efd908a74834e23c6ee435553bf556d29835f
    Add argument to installation script
 install.cmd | 3 ++-
Things already look cleaner now. However, there are now three commits that are all involved in adding arguments to the install script. Since these are all very small changes, we can improve coherence by squashing the commits that involve the install script.
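To try the splitting steps in isolation, the sketch below replays them in a throwaway repository. The file contents are invented for illustration, and instead of editing the to-do list by hand it sets GIT_SEQUENCE_EDITOR to a sed command that changes pick to edit, so treat it as one possible way to script the workflow rather than the required one:

```shell
set -e
# Scratch repository that mimics the situation above; file contents are made up.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > install.cmd
echo "docs" > README.md
echo "FROM scratch" > Dockerfile
git add install.cmd README.md Dockerfile
git commit -q -m "Initial commit"

# One commit that mixes the install script, the README and the Dockerfile.
echo "arg2" >> install.cmd
echo "arg2 docs" >> README.md
echo "# extra" >> Dockerfile
git commit -q -a -m "Add another argument"

# Mark the first to-do entry as "edit" instead of "pick" without opening an editor.
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/edit/'" git rebase -i HEAD~1

# Undo the commit but keep its changes in the working tree.
git reset -q HEAD~

# Re-commit the changes as two separate, coherent commits.
git add README.md install.cmd
git commit -q -m "Add second argument and update README"
git add Dockerfile
git commit -q -m "Update dockerfile"
git rebase --continue

subjects=$(git log --format=%s)
head_files=$(git diff-tree --no-commit-id --name-only -r HEAD)
printf '%s\n' "$subjects"
```

After the script finishes, the newest commit touches only the Dockerfile, and the commit below it contains the install script and README changes.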
Squashing previous commits
Again, we will use rebase. This time, we have to go back 4 commits in history:
git rebase -i HEAD~4
This gives the following output:
pick 788efd9 Add argument to installation script
pick 0608893 Installation script: fix typo in new argument
pick 80d3234 Add second argument and update README
pick 08c4be3 Update dockerfile
To squash the oldest three commits together, we mark the second and third commit with the
s marker and save the file:
pick 788efd9 Add argument to installation script
s 0608893 Installation script: fix typo in new argument
s 80d3234 Add second argument and update README
pick 08c4be3 Update dockerfile
Now, a new file with the following text appears:
# This is a combination of 3 commits.
# This is the 1st commit message:

Add argument to installation script

# This is the commit message #2:

Installation script: fix typo in new argument

# This is the commit message #3:

Add second argument and update README
We can now write a new, improved commit message for the three commits:
Add two new arguments to the installation script

- Argument 1 is used to set the path for library X
- Argument 2 is used to set the path for library Y
- README was updated
After confirming the changes, we can contentedly look at our new and improved series of commits:
commit b4de4805ba528c5124d3842cbe04efb3f0021af5 (HEAD)
    Update dockerfile

commit fa50568a6f1e96094eb4999ad10ea73f5ffb0c55
    Add two new arguments to the installation script
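The squash can also be replayed in a scratch repository. This is a rough sketch under a few assumptions: the file contents are invented, the to-do list is rewritten with sed through GIT_SEQUENCE_EDITOR instead of by hand, and GIT_EDITOR=true simply accepts the combined default message, which is then replaced with git commit --amend:

```shell
set -e
# Scratch repository; file contents are made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > install.cmd
git add install.cmd
git commit -q -m "Initial commit"

# Three small commits that all touch the install script.
for msg in "Add argument to installation script" \
           "Installation script: fix typo in new argument" \
           "Add second argument and update README"; do
    echo "$msg" >> install.cmd
    git commit -q -a -m "$msg"
done

# Keep "pick" on the first entry and turn entries 2 and 3 into "squash".
# GIT_EDITOR=true accepts the combined default commit message unchanged.
GIT_SEQUENCE_EDITOR="sed -i.bak '2,3s/^pick/squash/'" GIT_EDITOR=true \
    git rebase -i HEAD~3

# Replace the combined message with a single descriptive one.
git commit -q --amend -m "Add two new arguments to the installation script"

count=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
echo "$count commits, HEAD: $subject"
```

The three small commits collapse into one, so the history consists of the initial commit plus a single commit for the install script changes.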
Now that we’re finished, let’s consider a situation in which you should never use rebase.
When not to use rebase
The major caveat of rebasing is that you are replacing existing commits with new commits, which can have serious consequences when you’re working in a team. Therefore, the golden rule is that you should never use rebase when your commits are used by other developers.
Let’s consider the following example where we have two developers,
Dev A and
Dev B, whose merge base is the shared commit 2S:
commit 3A                      commit 3B
    |                              |
commit 2S-------------------------------
    |
commit 1S
    |
  master
Dev A is not happy with commits 2S and
1S, so he makes some modifications and squashes the commits, thereby creating a new commit, commit R:
commit 3A                      commit 3B
    |                              |
commit R                       commit 2S
    |                              |
    |                          commit 1S
    |                              |
  master----------------------------------
Dev A force-pushes his work to the master, and
Dev B then merges his commits into it:
merge_commit----commit 3B
     |              |
commit 3A       commit 2S
     |              |
commit R        commit 1S
     |              |
old_master-----------
The problem here is that Dev B reintroduces the changes that Dev A explicitly didn’t want to have in the master, namely commits 1S and 2S. If Dev B had been aware that Dev A had performed a rebase, his best course of action would have been to cherry-pick his new commit 3B onto the master rather than merging all of his commits. Most importantly, the whole problem would not have arisen if Dev A had never performed the rebase.
In this artificial example, the problem does not look too bad. However, in a real project, multiple developers would be affected and it would be extremely time-consuming to prevent a corruption of the master branch. So, as a general rule, just refrain from using rebases on commits that are used by others.
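One way to stay on the safe side is to check which commits exist only locally before rebasing. The sketch below is an illustration under assumptions, not a complete safeguard: a bare repository named remote.git stands in for the shared remote, and the file name and commit messages are made up. It lists the commits that are ahead of the upstream branch, which are the ones nobody else can have based work on through that branch:

```shell
set -e
work=$(mktemp -d)
cd "$work"
# A bare repository stands in for the shared remote (hypothetical setup).
git init -q --bare remote.git
git init -q local
cd local
git config user.name "Example"
git config user.email "example@example.com"
git remote add origin "$work/remote.git"

# This commit is pushed, so other developers may already build on it.
echo "one" > file.txt
git add file.txt
git commit -q -m "shared commit"
git push -q -u origin HEAD

# This commit exists only locally and is still safe to rewrite.
echo "two" >> file.txt
git commit -q -a -m "local commit"

# Commits reachable from HEAD but not from the upstream branch.
unpushed=$(git log --format=%s '@{upstream}..HEAD')
echo "Safe to rewrite: $unpushed"
```

Starting the rebase with git rebase -i '@{upstream}' limits the to-do list to exactly these unpushed commits.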
Why do we want to have well-structured commits?
One question you may ask is why we have to go through all of this work just to restructure our commits. The three most important reasons are:
- Performing code reviews becomes highly cumbersome with bloated commits because the reviewer will have a hard time understanding the intentions of the changes.
- Reverting individual changes becomes impossible: when a large commit leads to a problem, it has to be reverted as a whole.
- With a clean commit history, developers can quickly scan the git log to find out about recent developments.
Let me know if you know other reasons why you’d want to have a clean commit history.